X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=blobdiff_plain;f=docs%2Fcomm%2Fthe-beast%2Fdesugar.html;fp=docs%2Fcomm%2Fthe-beast%2Fdesugar.html;h=a66740259bfff392ed1d7945119947aa7d3429e9;hp=0000000000000000000000000000000000000000;hb=0065d5ab628975892cea1ec7303f968c3338cbe1;hpb=28a464a75e14cece5db40f2765a29348273ff2d2 diff --git a/docs/comm/the-beast/desugar.html b/docs/comm/the-beast/desugar.html new file mode 100644 index 0000000..a667402 --- /dev/null +++ b/docs/comm/the-beast/desugar.html @@ -0,0 +1,156 @@ + + + + + The GHC Commentary - Sugar Free: From Haskell To Core + + + +

The GHC Commentary - Sugar Free: From Haskell To Core

+

+ Up until after type checking, GHC keeps the source program in an + abstract representation of Haskell source without removing any of the + syntactic sugar (such as, list comprehensions) that could easily be + represented by more primitive Haskell. This complicates part of the + front-end considerably as the abstract syntax of Haskell (as exported by + the module HsSyn) + is much more complex than a simplified representation close to, say, the + Haskell + Kernel would be. However, having a representation that is as close + as possible to the surface syntax simplifies the generation of clear + error messages. As GHC (quite in contrast to "conventional" compilers) + prints code fragments as part of error messages, the choice of + representation is especially important. +

+ Nonetheless, as soon as the input has passed all static checks, it is + transformed into GHC's principal intermediate language that goes by the + name of Core and whose representation is exported by the + module CoreSyn. + All following compiler phases, except code generation operate on Core. + Due to Andrew Tolmach's effort, there is also an external + representation for Core. +

+ The conversion of the compiled module from HsSyn into that + of CoreSyn is performed by a phase called the + desugarer, which is located in + fptools/ghc/compiler/deSugar/. + It's operation is detailed in the following. +

+ +

Auxilliary Functions

+

+ The modules DsMonad + defines the desugarer monad (of type DsM) which maintains + the environment needed for desugaring. In particular, it encapsulates a + unique supply for generating new variables, a map to lookup standard + names (such as functions from the prelude), a source location for error + messages, and a pool to collect warning messages generated during + desugaring. Initialisation of the environment happens in the function Desugar.desugar, + which is also the main entry point into the desugarer. +

+ The generation of Core code often involves the use of standard functions + for which proper identifiers (i.e., values of type Id that + actually refer to the definition in the right Prelude) need to be + obtained. This is supported by the function + DsMonad.dsLookupGlobalValue :: Name -> DsM Id. + +

Pattern Matching

+

+ Nested pattern matching with guards and everything is translated into + the simple, flat case expressions of Core by the following modules: +

+
Match: +
This modules contains the main pattern-matching compiler in the form + of a function called match. There is some documentation + as to how match works contained in the module itself. + Generally, the implemented algorithm is similar to the one described + in Phil Wadler's Chapter ? of Simon Peyton Jones' The + Implementation of Functional Programming Languages. + Match exports a couple of functions with not really + intuitive names. In particular, it exports match, + matchWrapper, matchExport, and + matchSimply. The function match, which is + the main work horse, is only used by the other matching modules. The + function matchExport - despite it's name - is merely used + internally in Match and handles warning messages (see + below for more details). The actual interface to the outside is + matchWrapper, which converts the output of the type + checker into the form needed by the pattern matching compiler (i.e., a + list of EquationInfo). Similar in function to + matchWrapper is matchSimply, which provides + an interface for the case where a single expression is to be matched + against a single pattern (as, for example, is the case in bindings in + a do expression). +
MatchCon: +
This module generates code for a set of alternative constructor + patterns that belong to a single type by means of the routine + matchConFamily. More precisely, the routine gets a set + of equations where the left-most pattern of each equation is a + constructor pattern with a head symbol from the same type as that of + all the other equations. A Core case expression is generated that + distinguihes between all these constructors. The routine is clever + enough to generate a sparse case expression and to add a catch-all + default case only when needed (i.e., if the case expression isn't + exhaustive already). There is also an explanation at the start of the + modules. +
MatchLit: +
Generates code for a set of alternative literal patterns by means of + the routine matchLiterals. The principle is similar to + that of matchConFamily, but all left-most patterns are + literals of the same type. +
DsUtils: +
This module provides a set of auxilliary definitions as well as the + data types EquationInfo and MatchResult that + form the input and output, respectively, of the pattern matching + compiler. +
Check: +
This module does not really contribute the compiling pattern + matching, but it inspects sets of equations to find whether there are + any overlapping patterns or non-exhaustive pattern sets. This task is + implemented by the function check, which returns a list of + patterns that are part of a non-exhaustive case distinction as well as a + set of equation labels that can be reached during execution of the code; + thus, the remaining equations are shadowed due to overlapping patterns. + The function check is invoked and its result converted into + suitable warning messages by the function Match.matchExport + (which is a wrapper for Match.match). +
+

+ The central function match, given a set of equations, + proceeds in a number of steps: +

    +
  1. It starts by desugaring the left-most pattern of each equation using + the function tidy1 (indirectly via + tidyEqnInfo). During this process, non-elementary + pattern (e.g., those using explicit list syntax [x, y, ..., + z]) are converted to a standard constructor pattern and also + irrefutable pattern are removed. +
  2. Then, a process called unmixing clusters the equations into + blocks (without re-ordering them), such that the left-most pattern of + all equations in a block are either all variables, all literals, or + all constructors. +
  3. Each block is, then, compiled by matchUnmixedEqns, + which forwards the handling of literal pattern blocks to + MatchLit.matchLiterals, of constructor pattern blocks to + MatchCon.matchConFamily, and hands variable pattern + blocks back to match. +
+ +


+ +Last modified: Mon Feb 11 22:35:25 EST 2002 + + + +