From: chak Date: Mon, 11 Feb 2002 14:42:54 +0000 (+0000) Subject: [project @ 2002-02-11 14:42:53 by chak] X-Git-Tag: Approximately_9120_patches~109 X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=commitdiff_plain;h=24dc6d31d5f4827dadc1b4bc1efa081717effe5e [project @ 2002-02-11 14:42:53 by chak] Documentation for the parallel array extension merged earlier. (This documents Milestone 1 as well as some of the basics of flattening.) --- diff --git a/ghc/docs/comm/exts/ndp.html b/ghc/docs/comm/exts/ndp.html new file mode 100644 index 0000000..e6053f2 --- /dev/null +++ b/ghc/docs/comm/exts/ndp.html @@ -0,0 +1,360 @@ + + + + + The GHC Commentary - Parallel Arrays + + + +

The GHC Commentary - Parallel Arrays

+

+ This section describes an experimental extension for high-performance + arrays, which comprises special syntax for array types and array + comprehensions, a set of optimising program transformations, and a set + of special-purpose libraries. The extension is currently only partially + implemented, but its development will be tracked here. +

+ Parallel arrays originally got their name from the aim to provide an + architecture-independent programming model for a range of parallel + computers. However, since experiments showed that the approach is also + worthwhile for sequential array code, the emphasis has shifted to their + parallel evaluation semantics: As soon as any element in a parallel + array is demanded, all the other elements are evaluated, too. This + makes parallel arrays more strict than standard Haskell 98 + arrays, but also opens the door for a loop-based implementation + strategy that leads to significantly more efficient code. +

+ The programming model as well as the use of the flattening + transformation, which is central to the approach, has its origin in + the programming language Nesl. + +

More Sugar: Special Syntax for Array Comprehensions

+

+ The option -fparr, which is a dynamic hsc option that can + be reversed with -fno-parr, enables special syntax for + parallel arrays, which is not essential to using parallel arrays, but + makes for significantly more concise programs. The switch works by + making the lexical analyser (located in Lex.lhs) + recognise the tokens [: and :]. Given that + the additional productions in the parser (located in Parser.y) + cannot be triggered without the lexer generating the necessary tokens, + there is no need to alter the behaviour of the parser. +

+ The following additional syntax is accepted (the non-terminals are those + from the Haskell 98 language + definition): +

+

+atype -> '[:' type ':]				     (parallel array type)
+
+aexp  -> '[:' exp1 ',' ... ',' expk ':]'             (explicit array, k >= 0)
+      |  '[:' exp1 [',' exp2] '..' exp3 ':]'	     (arithmetic array sequence)
+      |  '[:' exp '|' quals1 '|' ... '|' qualsn ':]' (array comprehension, n >= 1)
+
+quals -> qual1 ',' ... ',' qualn	             (qualifier list, n >= 1)
+
+apat  -> '[:' pat1 ',' ... ',' patk ':]'	     (array pattern, k >= 0)
+
+
+

+ Moreover, the extended comprehension syntax that allows for parallel + qualifiers (i.e., qualifiers separated by "|") is also + supported in list comprehensions. In general, the similarity to the + special syntax for lists is obvious. The two main differences are that + (a) arithmetic array sequences are always finite and (b) + [::] is not treated as a constructor in expressions and + patterns, but rather as a special case of the explicit array syntax. + The former is a simple consequence of the parallel evaluation semantics + of parallel arrays and the latter is due to arrays not being a + constructor-based data type. +
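To make the correspondence concrete, here is a sketch in plain Haskell that uses the standard list forms as analogues of the special syntax; the parallel-array forms from the grammar above are shown in the comments and would require -fparr (and a GHC of this era) to compile.

```haskell
-- Plain-Haskell analogues of the special array syntax; the [: ... :]
-- forms in the comments need -fparr and are shown for comparison only.

-- [:1, 2, 3:]                (explicit array; here: explicit list)
explicitArr :: [Int]
explicitArr = [1, 2, 3]

-- [:1 .. 5:]                 (arithmetic sequence; for arrays always finite)
seqArr :: [Int]
seqArr = [1 .. 5]

-- [:x + y | x <- [:1..3:] | y <- [:4..6:]:]
-- (parallel qualifiers step in lockstep, like zip)
parQuals :: [Int]
parQuals = [x + y | (x, y) <- zip [1 .. 3] [4 .. 6]]
```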

+ As a naming convention, types and functions that are concerned with + parallel arrays usually contain the string parr or + PArr (often as a prefix), and where corresponding types or + functions for handling lists exist, the name is identical, except for + containing the substring parr instead of list + (possibly in caps). +

+ The following implications are worth noting explicitly: +

+ +

Prelude Support for Parallel Arrays

+

+ The Prelude module PrelPArr + defines the standard operations and their types on parallel arrays and + provides a basic implementation based on boxed arrays. The interface of + PrelPArr is modelled after H98's PrelList, but + leaves out all functions that require infinite structures and adds + frequently needed array operations, such as permutations. Parallel + arrays are quite unique in that they use an entirely different + representation as soon as the flattening transformation is activated, + which is described further below. In particular, PrelPArr + defines the type [::] and operations to create, process, + and inspect parallel arrays. The type as well as the names of some of + the operations are also hardwired in TysWiredIn + (see the definition of parrTyCon in this module) and PrelNames. + This is again very much like the case of lists, where the type is + defined in PrelBase + and similarly wired in; however, for lists the entirely + constructor-based definition is exposed to user programs, which is not + the case for parallel arrays. +

Desugaring Parallel Arrays

+

+ The parallel array extension requires the desugarer to replace all + occurrences of (1) explicit parallel arrays, (2) array patterns, and (3) + array comprehensions by a suitable combination of invocations of + operations defined in PrelPArr. + +

Explicit Parallel Arrays

+

+ These are by far the simplest to remove. We simply replace every + occurrence of [:e1, ..., + en:] by +

+ + toP [e1, ..., en] + +
+

+ i.e., we build a list of the array elements, which is then converted + into a parallel array. +
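A minimal model of this desugaring target can be written over plain lists; the PArr representation below is an assumption standing in for the boxed implementation in PrelPArr (which changes entirely once flattening is on), but the names toP, fromP follow the document's conventions.

```haskell
-- A list-backed stand-in for the PrelPArr representation (an assumption
-- for illustration; the flattened representation is very different).
newtype PArr a = PArr [a] deriving (Eq, Show)

-- Convert a list into a parallel array: the desugaring target for
-- explicit arrays.
toP :: [a] -> PArr a
toP = PArr

fromP :: PArr a -> [a]
fromP (PArr xs) = xs

-- [:e1, e2, e3:]  desugars to  toP [e1, e2, e3]:
example :: PArr Int
example = toP [1, 2, 3]
```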

Parallel Array Patterns

+

+ Array patterns are much more tricky as element positions may contain + further patterns and the pattern matching compiler + requires us to flatten all those out. But before we turn to the gory + details, here, first, is the basic idea: A flat array pattern matches exactly + iff its length equals the length of the matched array. Hence, + if we have a set of flat array patterns matching an array value + a, it suffices to generate a Core case + expression that scrutinises lengthP a and has one + alternative for every length of array occurring in one of the patterns. + Moreover, there needs to be a default case catching all other array + lengths. In each alternative, array indexing (i.e., the function + !:) is used to bind array elements to the corresponding + pattern variables. This sounds easy enough and is essentially what the + parallel array equation of the function DsUtils.mkCoAlgCaseMatchResult + does. +

+ Unfortunately, however, the pattern matching compiler expects that it + can turn (almost) any pattern into variable patterns, literals, or + constructor applications by way of the function Match.tidy1. + And to make matters worse, some weird machinery in the module Check + insists on being able to reverse the process (essentially to pretty + print patterns in warnings for incomplete or overlapping patterns). +

+ The solution to this is an (unlimited) set of fake constructors + for parallel arrays, courtesy of TysWiredIn.parrFakeCon. + In other words, any pattern of the form [:p1, + ..., pn:] is transformed into +

+ + MkPArrayn p1 ... pn + +
+

+ by Match.tidy1, then run through the rest of the pattern + matching compiler, and finally picked up by + DsUtils.mkCoAlgCaseMatchResult, which converts it into a + case expression as outlined above. +

+ As an example consider the source expression +

+case v of
+  [:x1:]         -> e1
+  [:x2, x3, x4:] -> e2
+  _		 -> e3
+
+

+ Match.tidy1 converts it into a form that is equivalent to +

+case v of {
+  MkPArray1 x1       -> e1;
+  MkPArray2 x2 x3 x4 -> e2;
+  _	           -> e3;
+}
+
+

+ which DsUtils.mkCoAlgCaseMatchResult turns into the + following Core code: +

+      case lengthP v of
+        Int# i# -> 
+	  case i# of l {
+	    DFT ->					  e3
+	    1   -> let x1 = v!:0                       in e1
+	    3   -> let x2 = v!:0; x3 = v!:1; x4 = v!:2 in e2
+	  }
+
+ +
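The shape of the generated Core can be mimicked in source Haskell over a list-backed PArr (an assumption for illustration; the real code scrutinises an unboxed Int# and uses the wired-in operations). The point is the single dispatch on lengthP followed by indexing with !::

```haskell
-- List-backed stand-ins for the wired-in array operations
-- (an assumption; real arrays are not list-based).
newtype PArr a = PArr [a]

lengthP :: PArr a -> Int
lengthP (PArr xs) = length xs

(!:) :: PArr a -> Int -> a
PArr xs !: i = xs !! i

-- Source-level rendering of the Core above: one scrutinee, one
-- alternative per pattern length, plus a default.
match :: PArr Int -> String
match v = case lengthP v of
  1 -> let x1 = v !: 0                           in "e1 " ++ show x1
  3 -> let x2 = v !: 0; x3 = v !: 1; x4 = v !: 2 in "e2 " ++ show (x2 + x3 + x4)
  _ ->                                              "e3"
```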

Parallel Array Comprehensions

+

+ The most challenging of the three constructs are array comprehensions. + In principle, it would be possible to transform them in essentially the + same way as list comprehensions, but this would lead to abysmally slow + code as desugaring of list comprehensions generates code that is + optimised for sequential, constructor-based structures. In contrast, + array comprehensions need to be transformed into code that solely relies + on collective operations and avoids the creation of many small + intermediate arrays. +

+ The transformation is implemented by the function DsListComp.dsPArrComp. + In the following, we denote this transformation function by the form + <<e>> pa ea, where e + is the comprehension to be compiled and the arguments pa + and ea denote a pattern and the currently processed array + expression, respectively. The invariant constraining these two + arguments is that all elements in the array produced by ea + will successfully match against pa. +

+ Given a source-level comprehension [:e | qss:], we compile + it with <<[:e | qss:]>> () [:():] using the + rules +

+<<[:e' |           :]>> pa ea = mapP (\pa -> e') ea
+<<[:e' | b     , qs:]>> pa ea = <<[:e' | qs:]>> pa (filterP (\pa -> b) ea)
+<<[:e' | p <- e, qs:]>> pa ea = 
+  let ef = filterP (\x -> case x of {p -> True; _ -> False}) e
+  in
+  <<[:e' | qs:]>> (pa, p) (crossP ea ef)
+<<[:e' | let ds, qs:]>> pa ea = 
+  <<[:e' | qs:]>> (pa, (x_1, ..., x_n)) 
+    	      (mapP (\v@pa -> (v, let ds in (x_1, ..., x_n))) ea)
+where
+  {x_1, ..., x_n} = DV (ds)		-- Defined Variables
+<<[:e' | qs | qss:]>>   pa ea = 
+  <<[:e' | qss:]>> (pa, (x_1, ..., x_n)) 
+    	       (zipP ea <<[:(x_1, ..., x_n) | qs:]>>)
+where
+  {x_1, ..., x_n} = DV (qs)
+
+

+ We assume the denotation of crossP to be given by +

+crossP       :: [:a:] -> [:b:] -> [:(a, b):]
+crossP a1 a2  = let
+  		len1 = lengthP a1
+  		len2 = lengthP a2
+  		x1   = concatP $ mapP (replicateP len2) a1
+  		x2   = concatP $ replicateP len1 a2
+  	      in
+  	      zipP x1 x2
+
+

+ For a more efficient implementation of crossP, see + PrelPArr. +
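The denotation of crossP above can be run directly over plain lists (standing in for [: :]; the list representation is an assumption for illustration). Applying the comprehension rules to a simple two-generator comprehension, with the pa == () simplification applied, then yields roughly the second definition:

```haskell
-- The denotation of crossP from the text, rendered over plain lists
-- (lengthP, concatP, mapP, replicateP, zipP become their list versions).
crossP :: [a] -> [b] -> [(a, b)]
crossP a1 a2 =
  let len1 = length a1
      len2 = length a2
      x1   = concat $ map (replicate len2) a1
      x2   = concat $ replicate len1 a2
  in  zip x1 x2

-- Compiling [:x * y | x <- xs, y <- ys:] by the rules, after dropping
-- the vacuous () component, gives essentially:
desugared :: [Int] -> [Int] -> [Int]
desugared xs ys = map (\(x, y) -> x * y) (crossP xs ys)
```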

+ Moreover, the following optimisations are important: +

    +
  • In the p <- e rule, if pa == (), drop + it and simplify the crossP ea e to e. +
  • We assume that fusion will optimise sequences of array processing + combinators. +
  • FIXME: Do we want to have the following function? +
    +mapFilterP :: (a -> Maybe b) -> [:a:] -> [:b:]
    +
    +

+ Even with fusion, mapP (\p -> e) . filterP (\p -> + b) may still result in redundant pattern matching + operations. (Let's hold off on this until we have seen what the + Simplifier does to the generated code.) +

+ +
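The mapFilterP contemplated above would perform the pattern match only once per element; over plain lists (standing in for [:a:], an assumption for illustration) it is exactly the standard mapMaybe:

```haskell
import Data.Maybe (mapMaybe)

-- List-based sketch of the proposed combinator: a single traversal
-- matches each element once, instead of once in filterP and again in
-- mapP.
mapFilterP :: (a -> Maybe b) -> [a] -> [b]
mapFilterP = mapMaybe
```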

Doing Away With Nested Arrays: The Flattening Transformation

+

+ On the quest towards an entirely unboxed representation of parallel + arrays, the flattening transformation is the essential ingredient. GHC + uses a substantially + improved version of the transformation whose original form was + described by Blelloch & Sabot. The flattening transformation + replaces values of type [:a:] as well as functions + operating on these values by alternative, more efficient data structures + and functions. +
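To illustrate what flattening buys, the usual concrete device in the Blelloch & Sabot line of work is to represent a nested array by a flat data vector plus a segment descriptor. The types below are an assumption for illustration, not GHC's actual data layout:

```haskell
-- Flattened representation of a nested array [: [:a:] :]
-- (sketch only; GHC's real representation differs).
data SegArr a = SegArr
  { segd :: [Int]  -- length of each subarray (the segment descriptor)
  , flatd :: [a]   -- all elements, concatenated into one flat vector
  } deriving (Eq, Show)

flatten :: [[a]] -> SegArr a
flatten xss = SegArr (map length xss) (concat xss)

unflatten :: SegArr a -> [[a]]
unflatten (SegArr [] _)      = []
unflatten (SegArr (n:ns) xs) =
  let (ys, zs) = splitAt n xs
  in  ys : unflatten (SegArr ns zs)
```

Collective operations can then run over the flat vector in one loop, regardless of nesting; the segment descriptor is only consulted where subarray boundaries matter.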

+ The flattening machinery is activated by the option + -fflatten, which is a static hsc option. It consists of + two steps: function vectorisation and array specialisation. Only the + first of those is implemented so far. If selected, the transformation + is applied to a module in Core form immediately after the desugarer, before the Mighty Simplifier gets to do its + job. After vectorisation, the Core program can be dumped using the + option -ddump-vect. There is a good reason for us to perform + flattening immediately after the desugarer. In HscMain.hscRecomp + the so-called persistent compiler state is maintained, which + contains all the information about imported interface files needed to + look up the details of imported names (e.g., during renaming and type + checking). However, all this information is zapped before the + simplifier is invoked (supposedly to reduce the space consumption of + GHC). As flattening has to get at all kinds of identifiers from Prelude + modules, we need to do it before the relevant information in the + persistent compiler state is gone. + 

+ As flattening generally requires all libraries to be compiled for + flattening (just like profiling does), there is a compiler way + "ndp", which can be selected using the way option + -ndp. This option will automagically select + -fparr and -fflatten. + 

FlattenMonad

+

+ The module FlattenMonad + implements the monad Flatten that is used during + vectorisation to keep track of various sets of bound variables and a + variable substitution map; moreover, it provides a supply of new uniques + and allows us to look up names in the persistent compiler state (i.e., + imported identifiers). +

+ In order to be able to look up imported identifiers in the persistent + compiler state, it is important that these identifiers are included in + the free-variable lists computed by the renamer. More precisely, all + names needed by flattening are included in the names produced by + RnEnv.getImplicitModuleFVs. To avoid putting + flattening-dependent lists of names into the renamer, the module + FlattenInfo exports namesNeededForFlattening. + + [FIXME: It might be worthwhile to document in the non-Flattening part of + the Commentary that the persistent compiler state is zapped after + desugaring and how the free variables determined by the renamer imply + which names are imported.] + 

+ +Last modified: Tue Feb 12 01:44:21 EST 2002 + + + + diff --git a/ghc/docs/comm/index.html b/ghc/docs/comm/index.html index ee12a62..ac2379e 100644 --- a/ghc/docs/comm/index.html +++ b/ghc/docs/comm/index.html @@ -6,7 +6,7 @@ -

The Glasgow Haskell Compiler (GHC) Commentary [v0.10]

+

The Glasgow Haskell Compiler (GHC) Commentary [v0.11]

Array Libraries - [not available yet] -

  • On why we have ForeignPtr +
  • On why we have + ForeignPtr + +

    Extensions, or Making a Complicated System More Complicated

    +

    +

    + +

    The Source

    The online master copy of the Commentary is at

    @@ -86,7 +100,7 @@

    -Last modified: Wed Feb 6 22:21:52 EST 2002 +Last modified: Mon Feb 11 20:18:35 EST 2002 diff --git a/ghc/docs/comm/the-beast/desugar.html b/ghc/docs/comm/the-beast/desugar.html index 9d92828..a667402 100644 --- a/ghc/docs/comm/the-beast/desugar.html +++ b/ghc/docs/comm/the-beast/desugar.html @@ -60,7 +60,7 @@ obtained. This is supported by the function DsMonad.dsLookupGlobalValue :: Name -> DsM Id. -

    Pattern Matching

    +

    Pattern Matching

    Nested pattern matching with guards and everything is translated into the simple, flat case expressions of Core by the following modules: @@ -149,7 +149,7 @@


    -Last modified: Wed Jan 16 23:40:16 EST 2002 +Last modified: Mon Feb 11 22:35:25 EST 2002