Up until after type checking, GHC keeps the source program in an
abstract representation of Haskell source without removing any of the
syntactic sugar (such as, list comprehensions) that could easily be
represented by more primitive Haskell. This complicates part of the
front-end considerably as the abstract syntax of Haskell (as exported by
the module HsSyn
)
is much more complex than a simplified representation close to, say, the
Haskell
Kernel would be. However, having a representation that is as close
as possible to the surface syntax simplifies the generation of clear
error messages. As GHC (quite in contrast to "conventional" compilers)
prints code fragments as part of error messages, the choice of
representation is especially important.
Nonetheless, as soon as the input has passed all static checks, it is
transformed into GHC's principal intermediate language that goes by the
name of Core and whose representation is exported by the
module CoreSyn
.
All following compiler phases, except code generation operate on Core.
Due to Andrew Tolmach's effort, there is also an external
representation for Core.
The conversion of the compiled module from HsSyn
into that
of CoreSyn
is performed by a phase called the
desugarer, which is located in
fptools/ghc/compiler/deSugar/
.
It's operation is detailed in the following.
The modules DsMonad
defines the desugarer monad (of type DsM
) which maintains
the environment needed for desugaring. In particular, it encapsulates a
unique supply for generating new variables, a map to lookup standard
names (such as functions from the prelude), a source location for error
messages, and a pool to collect warning messages generated during
desugaring. Initialisation of the environment happens in the function Desugar
.desugar
,
which is also the main entry point into the desugarer.
The generation of Core code often involves the use of standard functions
for which proper identifiers (i.e., values of type Id
that
actually refer to the definition in the right Prelude) need to be
obtained. This is supported by the function
DsMonad.dsLookupGlobalValue :: Name -> DsM Id
.
Nested pattern matching with guards and everything is translated into the simple, flat case expressions of Core by the following modules:
Match
:
match
. There is some documentation
as to how match
works contained in the module itself.
Generally, the implemented algorithm is similar to the one described
in Phil Wadler's Chapter ? of Simon Peyton Jones' The
Implementation of Functional Programming Languages.
Match
exports a couple of functions with not really
intuitive names. In particular, it exports match
,
matchWrapper
, matchExport
, and
matchSimply
. The function match
, which is
the main work horse, is only used by the other matching modules. The
function matchExport
- despite it's name - is merely used
internally in Match
and handles warning messages (see
below for more details). The actual interface to the outside is
matchWrapper
, which converts the output of the type
checker into the form needed by the pattern matching compiler (i.e., a
list of EquationInfo
). Similar in function to
matchWrapper
is matchSimply
, which provides
an interface for the case where a single expression is to be matched
against a single pattern (as, for example, is the case in bindings in
a do
expression).
MatchCon
:
matchConFamily
. More precisely, the routine gets a set
of equations where the left-most pattern of each equation is a
constructor pattern with a head symbol from the same type as that of
all the other equations. A Core case expression is generated that
distinguihes between all these constructors. The routine is clever
enough to generate a sparse case expression and to add a catch-all
default case only when needed (i.e., if the case expression isn't
exhaustive already). There is also an explanation at the start of the
modules.
MatchLit
:
matchLiterals
. The principle is similar to
that of matchConFamily
, but all left-most patterns are
literals of the same type.
DsUtils
:
EquationInfo
and MatchResult
that
form the input and output, respectively, of the pattern matching
compiler.
Check
:
check
, which returns a list of
patterns that are part of a non-exhaustive case distinction as well as a
set of equation labels that can be reached during execution of the code;
thus, the remaining equations are shadowed due to overlapping patterns.
The function check
is invoked and its result converted into
suitable warning messages by the function Match.matchExport
(which is a wrapper for Match.match
).
The central function match
, given a set of equations,
proceeds in a number of steps:
tidy1
(indirectly via
tidyEqnInfo
). During this process, non-elementary
pattern (e.g., those using explicit list syntax [x, y, ...,
z]
) are converted to a standard constructor pattern and also
irrefutable pattern are removed.
matchUnmixedEqns
,
which forwards the handling of literal pattern blocks to
MatchLit.matchLiterals
, of constructor pattern blocks to
MatchCon.matchConFamily
, and hands variable pattern
blocks back to match
.