Chapel (not actually the title)

Speaker presented this as a "whirlwind introduction to Chapel".

Themes for Chapel:
  1. general parallel programming
  2. global-view abstractions (data structures as a whole, not what each thread or processor owns)
  3. multi-resolution design (get as close to the hardware as necessary, when necessary)
  4. control of locality/affinity
  5. reduce the gap between parallel languages and mainstream languages

Chapel is not yet performant. The designers are still working on completeness and correctness. The language is still in flux; some of the speaker's examples use syntax that is not yet fully defined. Chapel is currently slower than MPI. (Interesting to note the keynote speaker's contention: she seems to say that this means the language will be unsuccessful, at least until this is solved.) The SourceForge page lists the status as "pre-alpha".

Chapel is not only for use on Cray computers; it is intended to work even on multicore desktops. It supports data parallel programming but also task parallel programming. It even supports GPU programming.
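To make the data-parallel versus task-parallel distinction concrete, here is a minimal sketch of the task-parallel side. It was not shown in the talk and assumes the syntax of a recent Chapel release; the procedure names `producer` and `consumer` and the constant `numTasks` are made up for illustration.

```chapel
// Two hypothetical pieces of work; the names are invented for this sketch.
proc producer() { writeln("producer running"); }
proc consumer() { writeln("consumer running"); }

// cobegin starts one task per statement in the block and waits for all of them.
cobegin {
  producer();
  consumer();
}

// coforall creates one task per iteration and joins them all at the end of the loop.
config const numTasks = 4;   // assumed task count; can be overridden on the command line
coforall tid in 1..numTasks do
  writeln("hello from task ", tid, " of ", numTasks);
```

The order of the printed lines is not deterministic, since the tasks run concurrently.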

Data-parallel calculation: C = A + alpha * B, where A, B and C are arrays.

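    // m (problem size) and alpha (scalar) are assumed to be declared elsewhere
    const ProblemSpace: domain(1, int(64))
        = [1..m];
    var A,B,C: [ProblemSpace] real;
    // zippered iteration: a, b, c refer to corresponding elements of A, B, C
    forall (a,b,c) in (A,B,C) do
        c = a + alpha * b;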

This is an example of data-parallel code in Chapel, and will run using multiple cores on a single machine.
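The snippet above uses the syntax of the 2009 release. As a sketch only, the same calculation might look as follows in a recent Chapel release, where domain literals are written with braces and zippered iteration is spelled `zip(...)`. The values of `m` and `alpha` and the input data are assumptions added so that the program is self-contained.

```chapel
config const m = 8;          // problem size (assumed value)
config const alpha = 3.0;    // scalar multiplier (assumed value)

const ProblemSpace = {1..m}; // 1-D domain; recent releases use braces for domain literals
var A, B, C: [ProblemSpace] real;

A = 1.0;                     // whole-array assignment of made-up inputs
B = 2.0;

// zip() iterates the three arrays in lockstep; a, b, c refer to corresponding elements
forall (a, b, c) in zip(A, B, C) do
  c = a + alpha * b;

writeln(C);                  // with these inputs every element of C is 1.0 + 3.0 * 2.0
```

Because `m` and `alpha` are declared `config const`, they can be changed at run time (for example `--m=1000000`) without recompiling.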

The language has built-in facilities for reductions, for data-parallel operations, and for whole-array manipulations. It has some powerful facilities for accessing slices and subdomains, for high-level manipulations. It supports parallel I/O as well.
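A small sketch of what these facilities look like, assuming the syntax of a recent Chapel release and not taken from the talk. The names `n`, `D`, and `Interior` and the data are invented for illustration.

```chapel
config const n = 10;              // array size (assumed value)
const D = {1..n};                 // 1-D domain describing the index set
var A, B: [D] real;

// Fill A with 1.0, 2.0, ..., n by iterating the array and its domain together.
forall (a, i) in zip(A, D) do
  a = i;

const total = + reduce A;         // sum of all elements
const biggest = max reduce A;     // largest element

B = 2.0 * A;                      // whole-array arithmetic, applied element by element

const Interior = D[2..n-1];       // subdomain obtained by slicing the domain
B[Interior] = 0.0;                // assign to an array slice through the subdomain

writeln(total, " ", biggest);     // with n = 10 the sum is 55 and the maximum is 10
writeln(B);                       // only the first and last elements remain nonzero
```

The reductions and the whole-array statement are expressed without explicit loops, which leaves the compiler free to decide how to parallelize them.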

It seems that many of the nice performance features the speaker commented on are "still coming": the current implementation (he reiterates) is not performant. "Good implementations" will do things that the current implementation does not do.

I have built Chapel (current release as of Nov 16, 2009; version 1.02) on oink. I built it in my own directory, because I don't know how to build it to go with the setup of UPS stuff that Ron has established.

GPU programming support is "good", because Chapel supports task parallelism to "fire kernels off to the accelerator". Supporting all the necessary features in one language is considered to be critical. "Distributions" defined to be GPU distributions will use GPU resources for calculations. The Chapel version of a benchmark using this matches the performance of CUDA. But the example done is a "stream" calculation, which is a silly thing to do on a GPU (why?). "Better to do GPU programming in a single language"; much simpler than CUDA.

How does this compare with OpenCL?

What are some interesting calculations to try in Chapel? It would seem that some of the accelerator modeling things might be workable.