1 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
4 <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
5 <title>The GHC Commentary - Parallel Arrays</title>
8 <body BGCOLOR="FFFFFF">
9 <h1>The GHC Commentary - Parallel Arrays</h1>
11 This section describes an experimental extension by high-performance
12 arrays, which comprises special syntax for array types and array
13 comprehensions, a set of optimising program transformations, and a set
14 of special purpose libraries. The extension is currently only partially
15 implemented, but the development will be tracked here.
17 Parallel arrays originally got their name from the aim to provide an
18 architecture-independent programming model for a range of parallel
19 computers. However, since experiments showed that the approach is also
20 worthwhile for sequential array code, the emphasis has shifted to their
21 parallel evaluation semantics: As soon as any element in a parallel
22 array is demanded, all the other elements are evaluated, too. This
23 makes parallel arrays more strict than <a
24 href="http://haskell.org/onlinelibrary/array.html">standard Haskell 98
25 arrays</a>, but also opens the door for a loop-based implementation
26 strategy that leads to significantly more efficient code.
28 The programming model as well as the use of the <em>flattening
29 transformation</em>, which is central to the approach, has its origin in
30 the programming language <a
31 href="http://www.cs.cmu.edu/~scandal/nesl.html">Nesl</a>.
33 <h2>More Sugar: Special Syntax for Array Comprehensions</h2>
35 The option <code>-XParr</code>, which is a dynamic hsc option that can
36 be reversed with <code>-XNoParr</code>, enables special syntax for
37 parallel arrays, which is not essential to using parallel arrays, but
38 makes for significantly more concise programs. The switch works by
39 making the lexical analyser (located in <a
40 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Lex.lhs"><code>Lex.lhs</code></a>)
41 recognise the tokens <code>[:</code> and <code>:]</code>. Given that
42 the additional productions in the parser (located in <a
43 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/parser/Parser.y"><code>Parser.y</code></a>)
44 cannot be triggered without the lexer generating the necessary tokens,
45 there is no need to alter the behaviour of the parser.
47 The following additional syntax is accepted (the non-terminals are those
48 from the <a href="http://haskell.org/onlinereport/">Haskell 98 language
52 atype -> '[:' type ':] (parallel array type)
54 aexp -> '[:' exp1 ',' ... ',' expk ':]' (explicit array, k >= 0)
55 | '[:' exp1 [',' exp2] '..' exp3 ':]' (arithmetic array sequence)
56 | '[:' exp '|' quals1 '|' ... '|' qualsn ':]' (array comprehension, n >= 1)
58 quals -> qual1 ',' ... ',' qualn (qualifier list, n >= 1)
60 apat -> '[:' pat1 ',' ... ',' patk ':]' (array pattern, k >= 0)
64 Moreover, the extended comprehension syntax that allows for <em>parallel
65 qualifiers</em> (i.e., qualifiers separated by "<code>|</code>") is also
66 supported in list comprehensions. In general, the similarity to the
67 special syntax for list is obvious. The two main differences are that
68 (a) arithmetic array sequences are always finite and (b)
69 <code>[::]</code> is not treated as a constructor in expressions and
70 patterns, but rather as a special case of the explicit array syntax.
71 The former is a simple consequence of the parallel evaluation semantics
72 of parallel arrays and the latter is due to arrays not being a
73 constructor-based data type.
75 As a naming convention, types and functions that are concerned with
76 parallel arrays usually contain the string <code>parr</code> or
77 <code>PArr</code> (often as a prefix), and where corresponding types or
78 functions for handling lists exist, the name is identical, except for
79 containing the substring <code>parr</code> instead of <code>list</code>
82 The following implications are worth noting explicitly:
84 <li>As the value and pattern <code>[::]</code> is an empty explicit
85 parallel array (i.e., something of the form <code>ExplicitPArr ty
86 []</code> in the AST). This is in contrast to lists, which use the
87 nil-constructor instead. In the case of parallel arrays, using a
88 constructor would be rather awkward, as it is not a constructor-based
89 type. (This becomes rather clear in the desugarer.)
90 <li>As a consequence, array patterns have the general form <code>[:p1,
91 p2, ..., pn:]</code>, where <code>n</code> >= 0. Thus, two array
92 patterns overlap iff they have the same length -- an important property
93 for the pattern matching compiler.
96 <h2>Prelude Support for Parallel Arrays</h2>
99 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelPArr.lhs"><code>PrelPArr</code></a>
100 defines the standard operations and their types on parallel arrays and
101 provides a basic implementation based on boxed arrays. The interface of
102 <code>PrelPArr</code> is oriented by H98's <code>PrelList</code>, but
103 leaving out all functions that require infinite structures and adding
104 frequently needed array operations, such as permutations. Parallel
105 arrays are quite unqiue in that they use an entirely different
106 representation as soon as the flattening transformation is activated,
107 which is described further below. In particular, <code>PrelPArr</code>
108 defines the type <code>[::]</code> and operations to create, process,
109 and inspect parallel arrays. The type as well as the names of some of
110 the operations are also hardwired in <a
111 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a>
112 (see the definition of <code>parrTyCon</code> in this module) and <a
113 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/PrelNames.lhs"><code>PrelNames</code></a>.
114 This is again very much like the case of lists, where the type is
116 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase</code></a>
117 and similarly wired in; however, for lists the entirely
118 constructor-based definition is exposed to user programs, which is not
119 the case for parallel arrays.
121 <h2>Desugaring Parallel Arrays</h2>
123 The parallel array extension requires the desugarer to replace all
124 occurrences of (1) explicit parallel arrays, (2) array patterns, and (3)
125 array comprehensions by a suitable combination of invocations of
126 operations defined in <code>PrelPArr</code>.
128 <h4>Explicit Parallel Arrays</h4>
130 These are by far the simplest to remove. We simply replace every
131 occurrence of <code>[:<i>e<sub>1</sub></i>, ...,
132 <i>e<sub>n</sub></i>:]</code> by
135 toP [<i>e<sub>1</sub></i>, ..., <i>e<sub>n</sub></i>]
139 i.e., we build a list of the array elements, which is, then, converted
140 into a parallel array.
142 <h4>Parallel Array Patterns</h4>
144 Array patterns are much more tricky as element positions may contain
145 further patterns and the <a
146 href="../the-beast/desugar.html#patmat">pattern matching compiler</a>
147 requires us to flatten all those out. But before we turn to the gory
148 details, here first the basic idea: A flat array pattern matches exactly
149 iff it's length corresponds to the length of the matched array. Hence,
150 if we have a set of flat array patterns matching an array value
151 <code>a</code>, it suffices to generate a Core <code>case</code>
152 expression that scrutinises <code>lengthP a</code> and has one
153 alternative for every length of array occuring in one of the patterns.
154 Moreover, there needs to be a default case catching all other array
155 lengths. In each alternative, array indexing (i.e., the functions
156 <code>!:</code>) is used to bind array elements to the corresponding
157 pattern variables. This sounds easy enough and is essentially what the
158 parallel array equation of the function <a
159 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsUtils.lhs"><code>DsUtils</code></a><code>.mkCoAlgCaseMatchResult</code>
162 Unfortunately, however, the pattern matching compiler expects that it
163 can turn (almost) any pattern into variable patterns, literals, or
164 constructor applications by way of the functions <a
165 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Match.lhs"><code>Match</code></a><code>.tidy1</code>.
166 And to make matters worse, some weird machinery in the module <a
167 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/Check.lhs"><code>Check</code></a>
168 insists on being able to reverse the process (essentially to pretty
169 print patterns in warnings for incomplete or overlapping patterns).
171 The solution to this is an (unlimited) set of <em>fake</em> constructors
172 for parallel arrays, courtesy of <a
173 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/TysWiredIn.lhs"><code>TysWiredIn</code></a><code>.parrFakeCon</code>.
174 In other words, any pattern of the form <code>[:<i>p<sub>1</sub></i>,
175 ..., <i>p<sub>n</sub></i>:]</code> is transformed into
178 MkPArray<i>n</i> <i>p<sub>1</sub></i> ... <i>p<sub>n</sub></i>
182 by <code>Match.tidy1</code>, then, run through the rest of the pattern
183 matching compiler, and finally, picked up by
184 <code>DsUtils.mkCoAlgCaseMatchResult</code>, which converts it into a
185 <code>case</code> expression as outlined above.
187 As an example consider the source expression
195 <code>Match.tidy1</code> converts it into a form that is equivalent to
199 MkPArr2 x2 x3 x4 -> e2;
204 which <code>DsUtils.mkCoAlgCaseMatchResult</code> turns into the
211 1 -> let x1 = v!:0 in e1
212 3 -> let x2 = v!:0; x2 = v!:1; x3 = v!:2 in e2
216 <h4>Parallel Array Comprehensions</h4>
218 The most challenging construct of the three are array comprehensions.
219 In principle, it would be possible to transform them in essentially the
220 same way as list comprehensions, but this would lead to abysmally slow
221 code as desugaring of list comprehensions generates code that is
222 optimised for sequential, constructor-based structures. In contrast,
223 array comprehensions need to be transformed into code that solely relies
224 on collective operations and avoids the creation of many small
227 The transformation is implemented by the function <a
228 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/deSugar/DsListComp.lhs"><code>DsListComp</code></a><code>.dsPArrComp</code>.
229 In the following, we denote this transformation function by the form
230 <code><<<i>e</i>>> pa ea</code>, where <code><i>e</i></code>
231 is the comprehension to be compiled and the arguments <code>pa</code>
232 and <code>ea</code> denote a pattern and the currently processed array
233 expression, respectively. The invariant constraining these two
234 arguments is that all elements in the array produced by <code>ea</code>
235 will <em>successfully</em> match against <code>pa</code>.
237 Given a source-level comprehensions <code>[:e | qss:]</code>, we compile
238 it with <code><<[:e | qss:]>> () [:():]</code> using the
241 <<[:e' | :]>> pa ea = mapP (\pa -> e') ea
242 <<[:e' | b , qs:]>> pa ea = <<[:e' | qs:]>> pa (filterP (\pa -> b) ea)
243 <<[:e' | p <- e, qs:]>> pa ea =
244 let ef = filterP (\x -> case x of {p -> True; _ -> False}) e
246 <<[:e' | qs:]>> (pa, p) (crossP ea ef)
247 <<[:e' | let ds, qs:]>> pa ea =
248 <<[:e' | qs:]>> (pa, (x_1, ..., x_n))
249 (mapP (\v@pa -> (v, let ds in (x_1, ..., x_n))) ea)
251 {x_1, ..., x_n} = DV (ds) -- Defined Variables
252 <<[:e' | qs | qss:]>> pa ea =
253 <<[:e' | qss:]>> (pa, (x_1, ..., x_n))
254 (zipP ea <<[:(x_1, ..., x_n) | qs:]>>)
256 {x_1, ..., x_n} = DV (qs)</pre>
259 We assume the denotation of <code>crossP</code> to be given by
261 crossP :: [:a:] -> [:b:] -> [:(a, b):]
265 x1 = concatP $ mapP (replicateP len2) a1
266 x2 = concatP $ replicateP len1 a2
271 For a more efficient implementation of <code>crossP</code>, see
273 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelPArr.lhs"><code>PrelPArr</code></a>.
275 Moreover, the following optimisations are important:
277 <li>In the <code>p <- e</code> rule, if <code>pa == ()</code>, drop
278 it and simplify the <code>crossP ea e</code> to <code>e</code>.
279 <li>We assume that fusion will optimise sequences of array processing
281 <li>FIXME: Do we want to have the following function?
283 mapFilterP :: (a -> Maybe b) -> [:a:] -> [:b:]</pre>
286 Even with fusion <code>(mapP (\p -> e) . filterP (\p ->
287 b))</code> may still result in redundant pattern matching
288 operations. (Let's wait with this until we have seen what the
289 Simplifier does to the generated code.)
292 <h2>Doing Away With Nested Arrays: The Flattening Transformation</h2>
294 On the quest towards an entirely unboxed representation of parallel
295 arrays, the flattening transformation is the essential ingredient. GHC
297 href="http://www.cse.unsw.edu.au/~chak/papers/CK00.html">substantially
298 improved version</a> of the transformation whose original form was
299 described by Blelloch & Sabot. The flattening transformation
300 replaces values of type <code>[:a:]</code> as well as functions
301 operating on these values by alternative, more efficient data structures
304 The flattening machinery is activated by the option
305 <code>-fflatten</code>, which is a static hsc option. It consists of
306 two steps: function vectorisation and array specialisation. Only the
307 first of those is implemented so far. If selected, the transformation
308 is applied to a module in Core form immediately after the <a
309 href="../the-beast/desugar.html">desugarer,</a> before the <a
310 href="../the-beast/simplifier.html">Mighty Simplifier</a> gets to do its
311 job. After vectorisation, the Core program can be dumped using the
312 option <code>-ddump-vect</code>. The is a good reason for us to perform
313 flattening immediately after the desugarer. In <a
314 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/main/HscMain.lhs"><code>HscMain</code></a><code>.hscRecomp</code>
315 the so-called <em>persistent compiler state</em> is maintained, which
316 contains all the information about imported interface files needed to
317 lookup the details of imported names (e.g., during renaming and type
318 checking). However, all this information is zapped before the
319 simplifier is invoked (supposedly to reduce the space-consumption of
320 GHC). As flattening has to get at all kinds of identifiers from Prelude
321 modules, we need to do it before the relevant information in the
322 persistent compiler state is gone.
325 As flattening generally requires all libraries to be compiled for
326 flattening (just like profiling does), there is a <em>compiler way</em>
327 <code>"ndp"</code>, which can be selected using the way option code
328 <code>-ndp</code>. This option will automagically select
329 <code>-XParr</code> and <code>-fflatten</code>.
331 <h4><code>FlattenMonad</code></h4>
334 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/ndpFlatten/FlattenMonad.lhs"><code>FlattenMonad</code></a>
335 implements the monad <code>Flatten</code> that is used during
336 vectorisation to keep track of various sets of bound variables and a
337 variable substitution map; moreover, it provides a supply of new uniques
338 and allows us to look up names in the persistent compiler state (i.e.,
339 imported identifiers).
341 In order to be able to look up imported identifiers in the persistent
342 compiler state, it is important that these identifies are included into
343 the free variable lists computed by the renamer. More precisely, all
344 names needed by flattening are included in the names produced by
345 <code>RnEnv.getImplicitModuleFVs</code>. To avoid putting
346 flattening-dependent lists of names into the renamer, the module
347 <code>FlattenInfo</code> exports <code>namesNeededForFlattening</code>.
349 [FIXME: It might be worthwhile to document in the non-Flattening part of
350 the Commentary that the persistent compiler state is zapped after
351 desugaring and how the free variables determined by the renamer imply
352 which names are imported.]
356 Last modified: Tue Feb 12 01:44:21 EST 2002