From: chak Date: Mon, 11 Feb 2002 14:42:54 +0000 (+0000) Subject: [project @ 2002-02-11 14:42:53 by chak] X-Git-Tag: Approximately_9120_patches~109 X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=commitdiff_plain;h=24dc6d31d5f4827dadc1b4bc1efa081717effe5e [project @ 2002-02-11 14:42:53 by chak] Documentation for the parallel array extension merged earlier. (This documents Milestone 1 as well as some of the basics of flattening.) --- diff --git a/ghc/docs/comm/exts/ndp.html b/ghc/docs/comm/exts/ndp.html new file mode 100644 index 0000000..e6053f2 --- /dev/null +++ b/ghc/docs/comm/exts/ndp.html @@ -0,0 +1,360 @@ + + + + + The GHC Commentary - Parallel Arrays + + + +

The GHC Commentary - Parallel Arrays

+

+ This section describes an experimental extension for high-performance + arrays, which comprises special syntax for array types and array + comprehensions, a set of optimising program transformations, and a set + of special-purpose libraries. The extension is currently only partially + implemented, but its development will be tracked here. +

+ Parallel arrays originally got their name from the aim to provide an + architecture-independent programming model for a range of parallel + computers. However, since experiments showed that the approach is also + worthwhile for sequential array code, the emphasis has shifted to their + parallel evaluation semantics: As soon as any element in a parallel + array is demanded, all the other elements are evaluated, too. This + makes parallel arrays more strict than standard Haskell 98 + arrays, but also opens the door for a loop-based implementation + strategy that leads to significantly more efficient code. +

+ The programming model as well as the use of the flattening + transformation, which is central to the approach, has its origin in + the programming language Nesl. + +

More Sugar: Special Syntax for Array Comprehensions

+

+ The option -fparr, which is a dynamic hsc option that can + be reversed with -fno-parr, enables special syntax for + parallel arrays, which is not essential to using parallel arrays, but + makes for significantly more concise programs. The switch works by + making the lexical analyser (located in Lex.lhs) + recognise the tokens [: and :]. Given that + the additional productions in the parser (located in Parser.y) + cannot be triggered without the lexer generating the necessary tokens, + there is no need to alter the behaviour of the parser. +

+ The following additional syntax is accepted (the non-terminals are those + from the Haskell 98 language + definition): +

+

+atype -> '[:' type ':]				     (parallel array type)
+
+aexp  -> '[:' exp1 ',' ... ',' expk ':]'             (explicit array, k >= 0)
+      |  '[:' exp1 [',' exp2] '..' exp3 ':]'	     (arithmetic array sequence)
+      |  '[:' exp '|' quals1 '|' ... '|' qualsn ':]' (array comprehension, n >= 1)
+
+quals -> qual1 ',' ... ',' qualn	             (qualifier list, n >= 1)
+
+apat  -> '[:' pat1 ',' ... ',' patk ':]'	     (array pattern, k >= 0)
+
+
+

+ Moreover, the extended comprehension syntax that allows for parallel + qualifiers (i.e., qualifiers separated by "|") is also + supported in list comprehensions. In general, the similarity to the + special syntax for lists is obvious. The two main differences are that + (a) arithmetic array sequences are always finite and (b) + [::] is not treated as a constructor in expressions and + patterns, but rather as a special case of the explicit array syntax. + The former is a simple consequence of the parallel evaluation semantics + of parallel arrays and the latter is due to arrays not being a + constructor-based data type. +
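To make the correspondence concrete, here is a sketch in plain Haskell that uses the standard list forms as analogues of the special syntax; the parallel-array forms from the grammar above are shown in the comments and would require -fparr (and a GHC of this era) to compile.

```haskell
-- Plain-Haskell analogues of the special array syntax; the [: ... :]
-- forms in the comments need -fparr and are shown for comparison only.

-- [:1, 2, 3:]                (explicit array; here: explicit list)
explicitArr :: [Int]
explicitArr = [1, 2, 3]

-- [:1 .. 5:]                 (arithmetic sequence; for arrays always finite)
seqArr :: [Int]
seqArr = [1 .. 5]

-- [:x + y | x <- [:1..3:] | y <- [:4..6:]:]
-- (parallel qualifiers step in lockstep, like zip)
parQuals :: [Int]
parQuals = [x + y | (x, y) <- zip [1 .. 3] [4 .. 6]]
```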

+ As a naming convention, types and functions that are concerned with + parallel arrays usually contain the string parr or + PArr (often as a prefix), and where corresponding types or + functions for handling lists exist, the name is identical, except for + containing the substring parr instead of list + (possibly in caps). +

+ The following implications are worth noting explicitly: +

+ +

Prelude Support for Parallel Arrays

+

+ The Prelude module PrelPArr + defines the standard operations and their types on parallel arrays and + provides a basic implementation based on boxed arrays. The interface of + PrelPArr is modelled after H98's PrelList, but + leaves out all functions that require infinite structures and adds + frequently needed array operations, such as permutations. Parallel + arrays are quite unique in that they use an entirely different + representation as soon as the flattening transformation is activated, + which is described further below. In particular, PrelPArr + defines the type [::] and operations to create, process, + and inspect parallel arrays. The type as well as the names of some of + the operations are also hardwired in TysWiredIn + (see the definition of parrTyCon in this module) and PrelNames. + This is again very much like the case of lists, where the type is + defined in PrelBase + and similarly wired in; however, for lists the entirely + constructor-based definition is exposed to user programs, which is not + the case for parallel arrays. +

Desugaring Parallel Arrays

+

+ The parallel array extension requires the desugarer to replace all + occurrences of (1) explicit parallel arrays, (2) array patterns, and (3) + array comprehensions by a suitable combination of invocations of + operations defined in PrelPArr. + +

Explicit Parallel Arrays

+

+ These are by far the simplest to remove. We simply replace every + occurrence of [:e1, ..., + en:] by +

+ + toP [e1, ..., en] + +
+

+ i.e., we build a list of the array elements, which is then converted + into a parallel array. +
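A minimal model of this desugaring target can be written over plain lists; the PArr representation below is an assumption standing in for the boxed implementation in PrelPArr (which changes entirely once flattening is on), but the names toP, fromP follow the document's conventions.

```haskell
-- A list-backed stand-in for the PrelPArr representation (an assumption
-- for illustration; the flattened representation is very different).
newtype PArr a = PArr [a] deriving (Eq, Show)

-- Convert a list into a parallel array: the desugaring target for
-- explicit arrays.
toP :: [a] -> PArr a
toP = PArr

fromP :: PArr a -> [a]
fromP (PArr xs) = xs

-- [:e1, e2, e3:]  desugars to  toP [e1, e2, e3]:
example :: PArr Int
example = toP [1, 2, 3]
```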

Parallel Array Patterns

+

+ Array patterns are much more tricky as element positions may contain + further patterns and the pattern matching compiler + requires us to flatten all those out. But before we turn to the gory + details, here, first, is the basic idea: A flat array pattern matches exactly + iff its length equals the length of the matched array. Hence, + if we have a set of flat array patterns matching an array value + a, it suffices to generate a Core case + expression that scrutinises lengthP a and has one + alternative for every length of array occurring in one of the patterns. + Moreover, there needs to be a default case catching all other array + lengths. In each alternative, array indexing (i.e., the function + !:) is used to bind array elements to the corresponding + pattern variables. This sounds easy enough and is essentially what the + parallel array equation of the function DsUtils.mkCoAlgCaseMatchResult + does. +

+ Unfortunately, however, the pattern matching compiler expects that it + can turn (almost) any pattern into variable patterns, literals, or + constructor applications by way of the function Match.tidy1. + And to make matters worse, some weird machinery in the module Check + insists on being able to reverse the process (essentially to pretty + print patterns in warnings for incomplete or overlapping patterns). +

+ The solution to this is an (unlimited) set of fake constructors + for parallel arrays, courtesy of TysWiredIn.parrFakeCon. + In other words, any pattern of the form [:p1, + ..., pn:] is transformed into +

+ + MkPArrayn p1 ... pn + +
+

+ by Match.tidy1, then run through the rest of the pattern + matching compiler, and finally picked up by + DsUtils.mkCoAlgCaseMatchResult, which converts it into a + case expression as outlined above. +

+ As an example consider the source expression +

+case v of
+  [:x1:]         -> e1
+  [:x2, x3, x4:] -> e2
+  _		 -> e3
+
+

+ Match.tidy1 converts it into a form that is equivalent to +

+case v of {
+  MkPArray1 x1       -> e1;
+  MkPArray2 x2 x3 x4 -> e2;
+  _	           -> e3;
+}
+
+

+ which DsUtils.mkCoAlgCaseMatchResult turns into the + following Core code: +

+      case lengthP v of
+        Int# i# -> 
+	  case i# of l {
+	    DFT ->					  e3
+	    1   -> let x1 = v!:0                       in e1
+	    3   -> let x2 = v!:0; x3 = v!:1; x4 = v!:2 in e2
+	  }
+
+ +
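The shape of the generated Core can be mimicked in source Haskell over a list-backed PArr (an assumption for illustration; the real code scrutinises an unboxed Int# and uses the wired-in operations). The point is the single dispatch on lengthP followed by indexing with !::

```haskell
-- List-backed stand-ins for the wired-in array operations
-- (an assumption; real arrays are not list-based).
newtype PArr a = PArr [a]

lengthP :: PArr a -> Int
lengthP (PArr xs) = length xs

(!:) :: PArr a -> Int -> a
PArr xs !: i = xs !! i

-- Source-level rendering of the Core above: one scrutinee, one
-- alternative per pattern length, plus a default.
match :: PArr Int -> String
match v = case lengthP v of
  1 -> let x1 = v !: 0                           in "e1 " ++ show x1
  3 -> let x2 = v !: 0; x3 = v !: 1; x4 = v !: 2 in "e2 " ++ show (x2 + x3 + x4)
  _ ->                                              "e3"
```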

Parallel Array Comprehensions

+

+ The most challenging of the three constructs are array comprehensions. + In principle, it would be possible to transform them in essentially the + same way as list comprehensions, but this would lead to abysmally slow + code as desugaring of list comprehensions generates code that is + optimised for sequential, constructor-based structures. In contrast, + array comprehensions need to be transformed into code that solely relies + on collective operations and avoids the creation of many small + intermediate arrays. +

+ The transformation is implemented by the function DsListComp.dsPArrComp. + In the following, we denote this transformation function by the form + <<e>> pa ea, where e + is the comprehension to be compiled and the arguments pa + and ea denote a pattern and the currently processed array + expression, respectively. The invariant constraining these two + arguments is that all elements in the array produced by ea + will successfully match against pa. +

+ Given a source-level comprehension [:e | qss:], we compile + it with <<[:e | qss:]>> () [:():] using the + rules +

+<<[:e' |           :]>> pa ea = mapP (\pa -> e') ea
+<<[:e' | b     , qs:]>> pa ea = <<[:e' | qs:]>> pa (filterP (\pa -> b) ea)
+<<[:e' | p <- e, qs:]>> pa ea = 
+  let ef = filterP (\x -> case x of {p -> True; _ -> False}) e
+  in
+  <<[:e' | qs:]>> (pa, p) (crossP ea ef)
+<<[:e' | let ds, qs:]>> pa ea = 
+  <<[:e' | qs:]>> (pa, (x_1, ..., x_n)) 
+    	      (mapP (\v@pa -> (v, let ds in (x_1, ..., x_n))) ea)
+where
+  {x_1, ..., x_n} = DV (ds)		-- Defined Variables
+<<[:e' | qs | qss:]>>   pa ea = 
+  <<[:e' | qss:]>> (pa, (x_1, ..., x_n)) 
+    	       (zipP ea <<[:(x_1, ..., x_n) | qs:]>>)
+where
+  {x_1, ..., x_n} = DV (qs)
+
+

+ We assume the denotation of crossP to be given by +

+crossP       :: [:a:] -> [:b:] -> [:(a, b):]
+crossP a1 a2  = let
+  		len1 = lengthP a1
+  		len2 = lengthP a2
+  		x1   = concatP $ mapP (replicateP len2) a1
+  		x2   = concatP $ replicateP len1 a2
+  	      in
+  	      zipP x1 x2
+
+

+ For a more efficient implementation of crossP, see + PrelPArr. +
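The denotation of crossP above can be run directly over plain lists (standing in for [: :]; the list representation is an assumption for illustration). Applying the comprehension rules to a simple two-generator comprehension, with the pa == () simplification applied, then yields roughly the second definition:

```haskell
-- The denotation of crossP from the text, rendered over plain lists
-- (lengthP, concatP, mapP, replicateP, zipP become their list versions).
crossP :: [a] -> [b] -> [(a, b)]
crossP a1 a2 =
  let len1 = length a1
      len2 = length a2
      x1   = concat $ map (replicate len2) a1
      x2   = concat $ replicate len1 a2
  in  zip x1 x2

-- Compiling [:x * y | x <- xs, y <- ys:] by the rules, after dropping
-- the vacuous () component, gives essentially:
desugared :: [Int] -> [Int] -> [Int]
desugared xs ys = map (\(x, y) -> x * y) (crossP xs ys)
```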

+ Moreover, the following optimisations are important: +

    +
  • In the p <- e rule, if pa == (), drop + it and simplify the crossP ea e to e. +
  • We assume that fusion will optimise sequences of array processing + combinators. +
  • FIXME: Do we want to have the following function? +
    +mapFilterP :: (a -> Maybe b) -> [:a:] -> [:b:]
    +
    +

+ Even with fusion, mapP (\p -> e) . filterP (\p -> + b) may still result in redundant pattern matching + operations. (Let's hold off on this until we have seen what the + Simplifier does to the generated code.) +

+ +
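The mapFilterP contemplated above would perform the pattern match only once per element; over plain lists (standing in for [:a:], an assumption for illustration) it is exactly the standard mapMaybe:

```haskell
import Data.Maybe (mapMaybe)

-- List-based sketch of the proposed combinator: a single traversal
-- matches each element once, instead of once in filterP and again in
-- mapP.
mapFilterP :: (a -> Maybe b) -> [a] -> [b]
mapFilterP = mapMaybe
```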

Doing Away With Nested Arrays: The Flattening Transformation

+

+ On the quest towards an entirely unboxed representation of parallel + arrays, the flattening transformation is the essential ingredient. GHC + uses a substantially + improved version of the transformation whose original form was + described by Blelloch & Sabot. The flattening transformation + replaces values of type [:a:] as well as functions + operating on these values by alternative, more efficient data structures + and functions. +
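To illustrate what flattening buys, the usual concrete device in the Blelloch & Sabot line of work is to represent a nested array by a flat data vector plus a segment descriptor. The types below are an assumption for illustration, not GHC's actual data layout:

```haskell
-- Flattened representation of a nested array [: [:a:] :]
-- (sketch only; GHC's real representation differs).
data SegArr a = SegArr
  { segd :: [Int]  -- length of each subarray (the segment descriptor)
  , flatd :: [a]   -- all elements, concatenated into one flat vector
  } deriving (Eq, Show)

flatten :: [[a]] -> SegArr a
flatten xss = SegArr (map length xss) (concat xss)

unflatten :: SegArr a -> [[a]]
unflatten (SegArr [] _)      = []
unflatten (SegArr (n:ns) xs) =
  let (ys, zs) = splitAt n xs
  in  ys : unflatten (SegArr ns zs)
```

Collective operations can then run over the flat vector in one loop, regardless of nesting; the segment descriptor is only consulted where subarray boundaries matter.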

+ The flattening machinery is activated by the option + -fflatten, which is a static hsc option. It consists of + two steps: function vectorisation and array specialisation. Only the + first of those is implemented so far. If selected, the transformation + is applied to a module in Core form immediately after the desugarer, before the Mighty Simplifier gets to do its + job. After vectorisation, the Core program can be dumped using the + option -ddump-vect. There is a good reason for us to perform + flattening immediately after the desugarer. In HscMain.hscRecomp + the so-called persistent compiler state is maintained, which + contains all the information about imported interface files needed to + look up the details of imported names (e.g., during renaming and type + checking). However, all this information is zapped before the + simplifier is invoked (supposedly to reduce the space consumption of + GHC). As flattening has to get at all kinds of identifiers from Prelude + modules, we need to do it before the relevant information in the + persistent compiler state is gone. + 

+ As flattening generally requires all libraries to be compiled for + flattening (just like profiling does), there is a compiler way + "ndp", which can be selected using the way option + -ndp. This option will automagically select + -fparr and -fflatten. + 

FlattenMonad

+

+ The module FlattenMonad + implements the monad Flatten that is used during + vectorisation to keep track of various sets of bound variables and a + variable substitution map; moreover, it provides a supply of new uniques + and allows us to look up names in the persistent compiler state (i.e., + imported identifiers). +

+ In order to be able to look up imported identifiers in the persistent + compiler state, it is important that these identifiers are included in + the free-variable lists computed by the renamer. More precisely, all + names needed by flattening are included in the names produced by + RnEnv.getImplicitModuleFVs. To avoid putting + flattening-dependent lists of names into the renamer, the module + FlattenInfo exports namesNeededForFlattening. + + [FIXME: It might be worthwhile to document in the non-Flattening part of + the Commentary that the persistent compiler state is zapped after + desugaring and how the free variables determined by the renamer imply + which names are imported.] + 

+ +Last modified: Tue Feb 12 01:44:21 EST 2002 + + + + diff --git a/ghc/docs/comm/index.html b/ghc/docs/comm/index.html index ee12a62..ac2379e 100644 --- a/ghc/docs/comm/index.html +++ b/ghc/docs/comm/index.html @@ -6,7 +6,7 @@ -

The Glasgow Haskell Compiler (GHC) Commentary [v0.10]

+

The Glasgow Haskell Compiler (GHC) Commentary [v0.11]

Array Libraries - [not available yet] -

  • On why we have ForeignPtr +
  • On why we have + ForeignPtr + +

    Extensions, or Making a Complicated System More Complicated

    +

    +

    + +

    The Source

    The online master copy of the Commentary is at

    @@ -86,7 +100,7 @@

    -Last modified: Wed Feb 6 22:21:52 EST 2002 +Last modified: Mon Feb 11 20:18:35 EST 2002 diff --git a/ghc/docs/comm/the-beast/desugar.html b/ghc/docs/comm/the-beast/desugar.html index 9d92828..a667402 100644 --- a/ghc/docs/comm/the-beast/desugar.html +++ b/ghc/docs/comm/the-beast/desugar.html @@ -60,7 +60,7 @@ obtained. This is supported by the function DsMonad.dsLookupGlobalValue :: Name -> DsM Id. -

    Pattern Matching

    +

    Pattern Matching

    Nested pattern matching with guards and everything is translated into the simple, flat case expressions of Core by the following modules: @@ -149,7 +149,7 @@


    -Last modified: Wed Jan 16 23:40:16 EST 2002 +Last modified: Mon Feb 11 22:35:25 EST 2002