--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Feedback</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>Feedback</h1>
+ <p>
+ <a href="mailto:chak@cse.unsw.edu.au">I</a> welcome any feedback on the
+ material and in particular would appreciate comments on which parts of
+ the document are incomprehensible or lack explanation -- e.g., due to
+ the use of GHC speak that is explained nowhere (words such as
+ infotable). Moreover, I would be interested to know which areas of GHC you
+ would like to see covered here.
+ <p>
+ For the moment, it is probably best if feedback is directed to
+ <p>
+ <blockquote>
+ <a
+ href="mailto:chak@cse.unsw.edu.au"><code>chak@cse.unsw.edu.au</code></a>
+ </blockquote>
+ <p>
+ However, if there is sufficient interest, we might consider setting up a
+ mailing list.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 00:11:42 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Outline of the Genesis</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Outline of the Genesis</h1>
+ <p>
+ Building GHC happens in two stages: First you have to prepare the tree
+ with <code>make boot</code>; and second, you build the compiler and
+ associated libraries with <code>make all</code>. The <code>boot</code>
+ stage builds some tools used during the main build process, generates
+ parsers and other pre-computed source, and finally computes dependency
+ information. There is considerable detail on the build process in GHC's
+ <a
+ href="http://haskell.cs.yale.edu/ghc/docs/latest/building/building-guide.html">Building Guide.</a>
+
+ <h4>Debugging the Beast</h4>
+ <p>
+ If you are hacking the compiler or like to play with unstable
+ development versions, chances are that the compiler will someday just
+ crash on you. Then, it is a good idea to load the <code>core</code>
+ file into <code>gdb</code> as usual, but unfortunately this usually
+ does not yield much useful information.
+ <p>
+ The next step, then, is somewhat tedious. You should build a compiler
+ producing programs with a runtime system that has debugging turned on
+ and use that to build the crashing compiler. There are many sanity
+ checks in the RTS, which may detect inconsistencies before they lead to a
+ crash, and you may include more debugging information, which helps
+ <code>gdb.</code> For an RTS with debugging turned on, add the following
+ to <code>build.mk</code> (see also the comment in
+ <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/mk/config.mk.in"><code>config.mk.in</code></a> that you find when searching for
+ <code>GhcRtsHcOpts</code>):
+<blockquote><pre>
+GhcRtsHcOpts+=-optc-DDEBUG
+GhcRtsCcOpts+=-optc-g
+EXTRA_LD_OPTS=-lbfd -liberty</pre></blockquote>
+ <p>
+ Then go into <code>fptools/ghc/rts</code> and <code>make clean boot &amp;&amp;
+ make all</code>. With the resulting runtime system, you have to re-link
+ the compiler. Go into <code>fptools/ghc/compiler</code>, delete the
+ file <code>hsc</code> (up to version 4.08) or
+ <code>ghc-&lt;version&gt;</code>, and execute <code>make all</code>.
+ <p>
+ The <code>EXTRA_LD_OPTS</code> are necessary as some of the debugging
+ code uses the BFD library, which in turn requires <code>liberty</code>.
+ I would also recommend (in 4.11 and from 5.0 upwards) adding these linker
+ options to the files <code>package.conf</code> and
+ <code>package.conf.inplace</code> in the directory
+ <code>fptools/ghc/driver/</code> to the <code>extra_ld_opts</code> entry
+ of the package <code>RTS</code>. Otherwise, you have to supply them
+ whenever you compile and link a program with a compiler that uses the
+ debugging RTS for the programs it produces.
+ <p>
+ To run GHC up to version 4.08 in <code>gdb</code>, first invoke the
+ compiler as usual, but pass it the option <code>-v</code>. This will
+ show you the exact invocation of the compiler proper <code>hsc</code>.
+ Run <code>hsc</code> with these options in <code>gdb</code>. The
+ development version 4.11 and stable releases from 5.0 on no longer
+ use the Perl driver; so, you can run them directly in <code>gdb</code>.
+ <p>
+ <strong>Debugging a compiler during building from HC files.</strong>
+ If you are bootstrapping the compiler on a new platform from HC files and
+ it crashes somewhere during the build (e.g., when compiling the
+ libraries), do as explained above, but you may have to re-configure the
+ build system with <code>--enable-hc-boot</code> before re-making the
+ code in <code>fptools/ghc/driver/</code>.
+ If you do this with a compiler up to version 4.08, run the build process
+ with <code>make EXTRA_HC_OPTS=-v</code> to get the exact arguments with
+ which you have to invoke <code>hsc</code> in <code>gdb</code>.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:18:54 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Mindboggling Makefiles</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Mindboggling Makefiles</h1>
+ <p>
+ The size and structure of GHC's makefiles make it quite easy to scream
+ out loud - in pain - during the process of tracking down problems in the
+ make system or when attempting to alter it. GHC's <a
+ href="http://haskell.cs.yale.edu/ghc/docs/latest/building/building-guide.html">Building
+ Guide</a> has valuable information on <a
+ href="http://haskell.cs.yale.edu/ghc/docs/latest/building/sec-makefile-arch.html">the
+ makefile architecture.</a>
+
+ <h4>A maze of twisty little passages, all alike</h4>
+ <p>
+ The <code>fptools/</code> toplevel and the various project directories
+ contain not only a <code>Makefile</code> each, but there are
+ subdirectories of name <code>mk/</code> at various levels that contain
+ rules, targets, and so on specific to a project - or, in the case of the
+ toplevel, the default rules for the whole system. Each <code>mk/</code>
+ directory contains a file <code>boilerplate.mk</code> that ties the
+ various other makefiles together. Files called <code>target.mk</code>,
+ <code>paths.mk</code>, and <code>suffix.mk</code> contain make targets,
+ definitions of variables containing paths, and suffix rules,
+ respectively.
+ <p>
+ One particularly nasty trick used in this hierarchy of makefiles is the
+ way in which the variable <code>$(TOP)</code> is used. AFAIK,
+ <code>$(TOP)</code> always points to a directory containing an
+ <code>mk/</code> subdirectory; however, it does not necessarily point to the
+ toplevel <code>fptools/</code> directory. For example, within the GHC
+ subtree, <code>$(TOP)</code> points to <code>fptools/ghc/</code>.
+ However, some of the makefiles in <code>fptools/ghc/mk/</code> will then
+ <em>temporarily</em> redefine <code>$(TOP)</code> to point a level
+ higher (i.e., to <code>fptools/</code>) while they are including the
+ toplevel boilerplate. After that, <code>$(TOP)</code> is reset to
+ whatever value it had before including makefiles from higher up in the
+ hierarchy.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:19:54 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - The Beast Explained</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The Glasgow Haskell Compiler (GHC) Commentary [v0.1]</h1>
+ <p>
+ <!-- Contributors: Whoever makes substantial additions or changes to the
+ document, please add your name and keep the order alphabetic. Moreover,
+ please bump the version number for any substantial modification that you
+ check into CVS.
+ -->
+ <strong>Manuel M. T. Chakravarty</strong><br>
+ <br>
+ <p>
+ This document started as a collection of notes describing what <a
+ href="mailto:chak@cse.unsw.edu.au">I</a> learnt when poking around in
+ the <a href="http://haskell.org/ghc/">GHC</a> sources. During the
+ <i>Haskell Implementers Workshop</i> in January 2001 it was decided to
+ put the commentary into GHC's CVS repository to allow the whole
+ developer community to add their wizardly insight to the document.
+ <p>
+ <strong>The document is still in its infancy - help it grow!</strong>
+
+ <h2>Before the Show Begins</h2>
+ <p>
+ <ul>
+ <li><a href="feedback.html">Feedback</a>
+ <li><a href="others.html">Other Sources of Wisdom</a>
+ </ul>
+
+ <h2>Genesis</h2>
+ <p>
+ <ul>
+ <li><a href="genesis/genesis.html">Outline of the Genesis</a>
+ <li><a href="genesis/makefiles.html">Mindboggling Makefiles</a>
+ </ul>
+
+ <h2>The Beast Dissected</h2>
+ <p>
+ <ul>
+ <li><a href="the-beast/driver.html">The Glorious Driver</a>
+ <li><a href="the-beast/basicTypes.html">The Basics</a>
+ <li><a href="the-beast/typecheck.html">Checking Types</a>
+ <li><a href="the-beast/simplifier.html">The Mighty Simplifier</a>
+ <li><a href="the-beast/mangler.html">The Evil Mangler</a>
+ </ul>
+
+ <h2>RTS &amp; Libraries</h2>
+ <p>
+ <ul>
+ <li><a href="rts-libs/stgc.html">Spineless Tagless C</a>
+ <li><a href="rts-libs/primitives.html">Primitives</a>
+ <li><a href="rts-libs/prelfound.html">Prelude Foundations</a>
+ <li><a href="rts-libs/prelude.html">Cunning Prelude Code</a>
+ <li><!-- <a href="rts-libs/arrays.html"> -->Array Libraries<!-- </a> -->
+ <small>[not available yet]</small>
+ </ul>
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 00:11:49 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Other Sources of Wisdom</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>Other Sources of Wisdom</h1>
+ <p>
+ Believe it or not, but there are other people besides you who are
+ masochistic enough to study the innards of the beast. Some of them have
+ been kind (or cruel?) enough to share their insights with us. Here is a
+ probably incomplete list:
+ <p>
+ <ul>
+ <li>The <a
+ href="http://www.cee.hw.ac.uk/~dsg/gph/docs/StgSurvival.ps.gz">STG
+ Survival Sheet</a> has -- according to its header -- been written by
+ `a poor wee soul',<sup><a href="#footnote1">1</a></sup> which
+ probably has been pushed into the torments of madness by the very
+ act of contemplating the inner workings of the STG runtime system.
+ This document discusses GHC's runtime system with a focus on
+ support for parallel processing (aka GUM).
+ <li>Instructions on <a
+ href="http://www-users.cs.york.ac.uk/~olaf/PUBLICATIONS/extendGHC.html">Adding
+ an Optimisation Pass to the Glasgow Haskell Compiler</a>
+ have been compiled by <a
+ href="http://www-users.cs.york.ac.uk/~olaf/">Olaf Chitil</a>.
+ Unfortunately, this document is already a little aged.
+
+ <!-- Add references to other background texts listed on the GHC docu
+ page
+ -->
+
+ </ul>
+
+ <p><hr><p>
+ <sup><a name="footnote1">1</a></sup>Usually reliable sources have it that
+ the poor soul in question is none other than GUM hardcore hacker <a
+ href="http://www.cee.hw.ac.uk/~hwloidl/">Hans-Wolfgang Loidl</a>.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 00:47:05 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Prelude Foundations</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Prelude Foundations</h1>
+ <p>
+ The standard Haskell Prelude as well as GHC's Prelude extensions are
+ constructed from GHC's <a href="primitives.html">primitives</a> in a
+ couple of layers.
+
+ <h4><code>PrelBase.lhs</code></h4>
+ <p>
+ Some of the most elementary Prelude definitions are collected in <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a>.
+ In particular, it defines the boxed versions of Haskell primitive types
+ - for example, <code>Int</code> is defined as
+ <blockquote><pre>
+data Int = I# Int#</pre>
+ </blockquote>
+ <p>
+ This says that a boxed integer <code>Int</code> is formed by applying the
+ data constructor <code>I#</code> to an <em>unboxed</em> integer of type
+ <code>Int#</code>. Unboxed types are hardcoded in the compiler and
+ exported together with the <a href="primitives.html">primitive
+ operations</a> understood by GHC.
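+ <p>
+ As a minimal sketch of what this boxing amounts to, the following
+ program wraps <code>Int#</code> in a user-defined type in the same way.
+ It assumes a recent GHC with the <code>MagicHash</code> extension, and
+ the names <code>MyInt</code>, <code>MkInt</code>,
+ <code>plusMyInt</code>, and <code>toInt</code> are hypothetical, chosen
+ only to avoid clashing with the real definitions.
+ <blockquote><pre>
+{-# LANGUAGE MagicHash #-}
+import GHC.Exts (Int#, Int (I#), (+#))
+
+-- Hypothetical boxed integer, mirroring  data Int = I# Int#
+data MyInt = MkInt Int#
+
+-- Unbox both arguments, add the raw machine integers, box the result.
+plusMyInt :: MyInt -> MyInt -> MyInt
+plusMyInt (MkInt x) (MkInt y) = MkInt (x +# y)
+
+-- Convert to the ordinary boxed Int so that the result can be shown.
+toInt :: MyInt -> Int
+toInt (MkInt x) = I# x
+
+main :: IO ()
+main = print (toInt (plusMyInt (MkInt 2#) (MkInt 40#)))  -- prints 42</pre>
+ </blockquote>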
+ <p>
+ <code>PrelBase.lhs</code> similarly defines basic types, such as
+ boolean values
+ <blockquote><pre>
+data Bool = False | True deriving (Eq, Ord)</pre>
+ </blockquote>
+ <p>
+ the unit type
+ <blockquote><pre>
+data () = ()</pre>
+ </blockquote>
+ <p>
+ and lists
+ <blockquote><pre>
+data [] a = [] | a : [a]</pre>
+ </blockquote>
+ <p>
+ It also contains instance declarations for these types. In addition,
+ <code>PrelBase.lhs</code> contains some <a href="prelude.html">tricky
+ machinery</a> for efficient list handling.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:30:18 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Cunning Prelude Code</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Cunning Prelude Code</h1>
+ <p>
+ GHC uses many optimisations and GHC-specific techniques (unboxed
+ values, RULES pragmas, and so on) to make the heavily used Prelude code
+ as fast as possible.
+
+ <h4>fold/build</h4>
+ <p>
+ There is a lot of magic in <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelBase.lhs"><code>PrelBase.lhs</code></a> -
+ among other things, the <a
+ href="http://haskell.cs.yale.edu/ghc/docs/latest/set/rewrite-rules.html">RULES
+ pragmas</a> implementing the <a
+ href="http://research.microsoft.com/Users/simonpj/Papers/deforestation-short-cut.ps.Z">fold/build</a>
+ optimisation. The code for <code>map</code> is
+ a good example of how it all works. In the prelude code for version
+ 4.08.1 it reads as follows:
+ <blockquote><pre>
+map :: (a -> b) -> [a] -> [b]
+map = mapList
+
+-- Note eta expanded
+mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
+mapFB c f x ys = c (f x) ys
+
+mapList :: (a -> b) -> [a] -> [b]
+mapList _ [] = []
+mapList f (x:xs) = f x : mapList f xs
+
+{-# RULES
+"map" forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs)
+"mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g)
+"mapList" forall f. foldr (mapFB (:) f) [] = mapList f
+ #-}</pre>
+ </blockquote>
+ <p>
+ This code is structured as it is, because the "map" rule first
+ <em>breaks</em> the map <em>open,</em> which exposes it to the various
+ foldr/build rules, and if no foldr/build rule matches, the "mapList"
+ rule <em>closes</em> it again in a later phase of optimisation - after
+ build was inlined. As a consequence, the whole thing depends a bit on
+ the timing of the various optimisations (the map might be closed again
+ before any of the foldr/build rules fires). To make the timing
+ deterministic, <code>build</code> gets a <code>{-# INLINE 2 build
+ #-}</code> pragma, which delays <code>build</code>'s inlining, and thus,
+ the closing of the map.
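+ <p>
+ To make the opening step more concrete, here is a small sketch that
+ performs the rewriting by hand for <code>map f (map g xs)</code>. It
+ relies on the standard foldr/build rule <code>foldr k z (build g) = g k
+ z</code> together with the "mapFB" rule quoted above. The names
+ <code>build'</code>, <code>openMap</code>, and <code>fusedMap</code>
+ are hypothetical stand-ins; the real rewriting is done by the
+ simplifier, not by user code.
+ <blockquote><pre>
+{-# LANGUAGE RankNTypes #-}
+
+-- Local stand-in for the Prelude's build (an assumption for illustration).
+build' :: (forall b. (a -> b -> b) -> b -> b) -> [a]
+build' g = g (:) []
+
+mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
+mapFB c f x ys = c (f x) ys
+
+-- map in its "opened" form, i.e., the right-hand side of the "map" rule.
+openMap :: (a -> b) -> [a] -> [b]
+openMap f xs = build' (\c n -> foldr (mapFB c f) n xs)
+
+-- map f (map g xs) after opening both maps, applying foldr/build, and
+-- then applying the "mapFB" rule: the list is traversed only once.
+fusedMap :: (b -> c) -> (a -> b) -> [a] -> [c]
+fusedMap f g xs = build' (\c n -> foldr (mapFB c (f . g)) n xs)
+
+main :: IO ()
+main = do
+  print (openMap (+ 1) [1, 2, 3 :: Int])         -- prints [2,3,4]
+  print (fusedMap (* 2) (+ 1) [1, 2, 3 :: Int])  -- prints [4,6,8]</pre>
+ </blockquote>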
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:31:18 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Primitives</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Primitives</h1>
+ <p>
+ Most user-level Haskell types and functions provided by GHC (in
+ particular those from the Prelude and GHC's Prelude extensions) are
+ internally constructed from even more elementary types and functions.
+ Most notably, GHC understands a notion of <em>unboxed types,</em> which
+ are the Haskell representation of primitive bit-level integer, float,
+ etc. types (as opposed to their boxed, heap allocated counterparts) -
+ cf. <a
+ href="http://research.microsoft.com/Users/simonpj/Papers/unboxed-values.ps.Z">"Unboxed
+ Values as First Class Citizens."</a>
+
+ <h4>The Ultimate Source of Primitives</h4>
+ <p>
+ The hardwired types of GHC are brought into scope by the module
+ <code>PrelGHC</code>. This module only exists in the form of a
+ handwritten interface file <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/lib/std/PrelGHC.hi-boot"><code>PrelGHC.hi-boot</code>,</a>
+ which lists the type and function names, as well as instance
+ declarations. The actual types of these names as well as their
+ implementation are hardwired into GHC. Note that the names in this file
+ are z-encoded, and in particular, identifiers ending in <code>zh</code>
+ denote user-level identifiers ending in a hash mark (<code>#</code>),
+ which is used to flag unboxed values or functions operating on unboxed
+ values. For example, we have <code>Char#</code>, <code>ord#</code>, and
+ so on.
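+ <p>
+ The following is a deliberately partial sketch of such an encoding,
+ not the compiler's own code. It handles only the hash mark and, as far
+ as I know the scheme, the escaping of the letters <code>z</code> and
+ <code>Z</code> themselves; the real encoding covers many more symbol
+ characters. The function names <code>zEncodeChar</code> and
+ <code>zEncode</code> are hypothetical.
+ <blockquote><pre>
+-- Partial z-encoder: only '#', 'z', and 'Z' are treated specially here.
+zEncodeChar :: Char -> String
+zEncodeChar '#' = "zh"
+zEncodeChar 'z' = "zz"
+zEncodeChar 'Z' = "ZZ"
+zEncodeChar c   = [c]
+
+-- Encode a whole identifier character by character.
+zEncode :: String -> String
+zEncode = concatMap zEncodeChar
+
+main :: IO ()
+main = mapM_ (putStrLn . zEncode) ["Char#", "ord#"]
+-- prints Charzh and ordzh, one per line</pre>
+ </blockquote>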
+
+ <h4>The New Primitive Definition Scheme</h4>
+ <p>
+ As of (about) the development version 4.11, the types and various
+ properties of primitive operations are defined in the file <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/prelude/primops.txt"><code>primops.txt</code></a>
+ (Personally, I don't think that the <code>.txt</code> suffix is really
+ appropriate, as the file is used for automatic code generation).
+ <p>
+ The utility <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/utils/genprimopcode/"><code>genprimopcode</code></a>
+ generates a series of Haskell files from <code>primops.txt</code>, which
+ encode the types and various properties of the primitive operations as
+ compiler internal data structures. These Haskell files are not complete
+ modules, but program fragments, which are included into compiler modules
+ during the GHC build process. The generated include files can be found
+ in the directory <code>fptools/ghc/compiler/</code> and carry names
+ matching the pattern <code>primop-*.hs-incl</code>. They are generated
+ during the execution of the <code>boot</code> target in the
+ <code>fptools/ghc/</code> directory. This scheme significantly
+ simplifies the maintenance of primitive operations.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:29:12 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Spineless Tagless C</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Spineless Tagless C</h1>
+ <p>
+ The C code generated by GHC doesn't use higher-level features of C to be
+ able to control as precisely as possible what code is generated.
+ Moreover, it uses special features of gcc (such as, first class labels)
+ to produce more efficient code.
+ <p>
+ STG C makes ample use of C's macro language to define idioms, which also
+ reduces the size of the generated C code (thus, reducing I/O times).
+ These macros are defined in the C headers located in GHC's <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/"><code>includes</code></a>
+ directory.
+
+ <h4><code>TailCalls.h</code></h4>
+ <p>
+ <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/includes/TailCalls.h"><code>TailCalls.h</code></a>
+ defines how tail calls are implemented - and in particular - optimised
+ in GHC generated code. The default case, for an architecture for which
+ GHC is not optimised, is to use the mini interpreter described in the <a
+ href="http://research.microsoft.com/copyright/accept.asp?path=/users/simonpj/papers/spineless-tagless-gmachine.ps.gz&pub=34">STG paper.</a>
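+ <p>
+ The idea behind the mini interpreter is that a code block does not
+ call its successor directly; instead, it returns the next block to a
+ small driver loop, which then runs it. The sketch below illustrates
+ this as a trampoline in Haskell. It is not the C code from
+ <code>TailCalls.h</code>, and the names <code>Code</code>,
+ <code>miniInterpreter</code>, <code>isEven</code>, and
+ <code>isOdd</code> are hypothetical.
+ <blockquote><pre>
+-- A code block either finishes with a result or hands back the next
+-- block to run, instead of calling that block itself.
+data Code r = Halt r | Jump (Code r)
+
+-- The driver loop: keep running whatever block was returned last.
+miniInterpreter :: Code r -> r
+miniInterpreter (Halt r)    = r
+miniInterpreter (Jump next) = miniInterpreter next
+
+-- Two hypothetical blocks that tail call each other (for n >= 0).
+isEven, isOdd :: Int -> Code Bool
+isEven 0 = Halt True
+isEven n = Jump (isOdd (n - 1))
+isOdd 0 = Halt False
+isOdd n = Jump (isEven (n - 1))
+
+main :: IO ()
+main = print (miniInterpreter (isEven 10))  -- prints True</pre>
+ </blockquote>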
+ <p>
+ For supported architectures, various tricks are used to generate
+ assembler implementing proper tail calls. On i386, gcc's first class
+ labels are used to directly jump to a function pointer. Furthermore,
+ markers of the form <code>--- BEGIN ---</code> and <code>--- END
+ ---</code> are added to the assembly right after the function prologue
+ and before the epilogue. These markers are used by <a
+ href="../the-beast/mangler.html">the Evil Mangler.</a>
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:28:29 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - The Basics</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - The Basics</h1>
+ <p>
+ The directory <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/"><code>fptools/ghc/compiler/basicTypes/</code></a>
+ contains modules that define some of the essential type definitions for
+ the compiler - such as, identifiers, variables, modules, and unique
+ names.
+
+ <h4><code>Id</code>s</h4>
+ <p>
+ An <code>Id</code> (defined in <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/Id.lhs"><code>Id.lhs</code></a>)
+ essentially records information about value and data constructor
+ identifiers -- to be precise, in the case of data constructors, two
+ <code>Id</code>s are used to represent the worker and wrapper functions
+ for the data constructor, respectively. The information maintained in
+ the <code>Id</code> abstraction includes among other items strictness,
+ occurrence, specialisation, and unfolding information.
+ <p>
+ Due to the way <code>Id</code>s are used for data constructors,
+ all <code>Id</code>s are represented as variables, which contain a
+ <code>varInfo</code> field of abstract type <code><a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/IdInfo.lhs">IdInfo</a>.IdInfo</code>.
+ This is where the information about <code>Id</code>s is really stored.
+ The following is a (currently, partial) list of the various items in an
+ <code>IdInfo</code>:
+ <p>
+ <dl>
+ <dt><a name="occInfo">Occurrence information</a>
+ <dd>The <code>OccInfo</code> data type is defined in the module <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/basicTypes/BasicTypes.lhs"><code>BasicTypes.lhs</code></a>.
+ Apart from the trivial <code>NoOccInfo</code>, it distinguishes
+ between variables that do not occur at all (<code>IAmDead</code>),
+ occur just once (<code>OneOcc</code>), or are <a
+ href="simplifier.html#loopBreaker">loop breakers</a>
+ (<code>IAmALoopBreaker</code>).
+ </dl>
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:23:01 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - The Glorious Driver</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - The Glorious Driver</h1>
+ <p>
+ The Glorious Driver (GD) is the part of GHC that orchestrates the
+ interaction of all the other pieces that make up GHC. It supersedes the
+ <em>Evil Driver (ED),</em> which was a Perl script that served the same
+ purpose and was in use until version 4.08.1 of GHC. Simon Marlow
+ eventually slayed the ED and instated the GD.
+ <p>
+ Since version 5.00, the GD has been substantially extended for GHCi,
+ i.e., the interactive variant of GHC that integrates the compiler with
+ a (meta-circular) interpreter.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:22:14 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - The Evil Mangler</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - The Evil Mangler</h1>
+ <p>
+ The Evil Mangler (EM) is a Perl script invoked by the <a
+ href="driver.html">Glorious Driver</a> after the C compiler (gcc) has
+ translated the GHC-produced C code into assembly. Consequently, it is
+ only of interest if <code>-fvia-C</code> is in effect (either explicitly
+ or implicitly).
+
+ <h4>Its purpose</h4>
+ <p>
+ The EM reads the assembly produced by gcc and re-arranges code blocks as
+ well as nukes instructions that it considers <em>non-essential.</em> It
+ derives its evilness from its utterly ad hoc, machine, compiler, and
+ whatnot dependent design and implementation. More precisely, the EM
+ performs the following tasks:
+ <ul>
+ <li>The code executed when a closure is entered is moved adjacent to
+ that closure's infotable. Moreover, the order of the info table
+ entries is reversed.
+ <li>Function prologue and epilogue code is removed. (GHC generated code
+ manages its own stack and uses the system stack only for return
+ addresses and during calls to C code.)
+ <li>Certain code patterns are replaced by simpler code (e.g., loads of
+ fast entry points followed by indirect jumps are replaced by direct
+ jumps to the fast entry point).
+ </ul>
+
+ <h4>Implementation</h4>
+ <p>
+ The EM is located in the Perl script <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/driver/mangler/ghc-asm.lprl"><code>ghc-asm.lprl</code></a>.
+ The script reads the <code>.s</code> file and chops it up into
+ <em>chunks</em> (that's how they are actually called in the script) that
+ roughly correspond to basic blocks. Each chunk is annotated with an
+ educated guess about what kind of code it contains (e.g., infotable,
+ fast entry point, slow entry point, etc.). The annotations also contain
+ the symbol introducing the chunk of assembly and whether that chunk has
+ already been processed or not.
+ <p>
+ The parsing of the input into chunks as well as recognising assembly
+ instructions that are to be removed or altered is based on a large
+ number of Perl regular expressions sprinkled over the whole code. These
+ expressions are rather fragile as they heavily rely on the structure of
+ the generated code - in fact, they even rely on the right amount of white
+ space and thus on the formatting of the assembly.
+ <p>
+ Afterwards, the chunks are reordered, some of them purged, and some
+ stripped of some useless instructions. Moreover, some instructions are
+ manipulated (e.g., loads of fast entry points followed by indirect jumps
+ are replaced by direct jumps to the fast entry point).
+ <p>
+ The EM knows which part of the code belongs to function prologues and
+ epilogues as <a href="../rts-libs/stgc.html">STG C</a> adds tags of the
+ form <code>--- BEGIN ---</code> and <code>--- END ---</code> to the
+ assembler just before and after the code proper of a function.
+ It adds these tags using gcc's <code>__asm__</code> feature.
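+ <p>
+ How such tags allow prologue and epilogue code to be cut away can be
+ sketched as follows. This is an illustration in Haskell under
+ simplifying assumptions, not a transcription of the Perl script: it
+ looks at the lines of a single function only, and the function name
+ <code>stripPrologueEpilogue</code> as well as the sample input are
+ made up.
+ <blockquote><pre>
+import Data.List (isInfixOf)
+
+-- Keep only the lines between the BEGIN and END tags of one function.
+-- Without a BEGIN tag the lines are returned unchanged; without an END
+-- tag everything after BEGIN is kept.
+stripPrologueEpilogue :: [String] -> [String]
+stripPrologueEpilogue ls =
+  case dropWhile (not . isBegin) ls of
+    []         -> ls
+    (_ : rest) -> takeWhile (not . isEnd) rest
+  where
+    isBegin = isInfixOf "--- BEGIN ---"
+    isEnd   = isInfixOf "--- END ---"
+
+main :: IO ()
+main = mapM_ putStrLn (stripPrologueEpilogue sample)
+  where
+    -- Made-up placeholder lines, not real gcc output.
+    sample =
+      [ "prologue instruction"
+      , "--- BEGIN ---"
+      , "body instruction 1"
+      , "body instruction 2"
+      , "--- END ---"
+      , "epilogue instruction"
+      ]</pre>
+ </blockquote>
+ <p>
+ Run on the sample, this prints only the two body lines.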
+ <p>
+ <strong>Update:</strong> Gcc 2.96 upwards performs more aggressive basic
+ block re-ordering and dead code elimination. This seems to make the
+ whole <code>--- END ---</code> tag business redundant -- in fact, if
+ proper code is generated, no <code>--- END ---</code> tags survive gcc's
+ optimiser.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:27:22 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - The Mighty Simplifier</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - The Mighty Simplifier</h1>
+ <p>
+ Most of the optimising program transformations applied by GHC are
+ performed on an intermediate language called <em>Core,</em> which
+ essentially is a compiler-friendly formulation of rank-2 polymorphic
+ lambda terms defined in the module <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/coreSyn/CoreSyn.lhs/"><code>CoreSyn.lhs</code>.</a>
+ The transformation engine optimising Core programs is called the
+ <em>Simplifier</em> and composed from a couple of modules located in the
+ directory <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/"><code>fptools/ghc/compiler/simplCore/</code>.</a>
+ The main engine of the simplifier is contained in <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/Simplify.lhs"><code>Simplify.lhs</code></a>,
+ and its driver is the routine <code>core2core</code> in <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/SimplCore.lhs"><code>SimplCore.lhs</code>.</a>
+ <p>
+ The program that the simplifier has produced after applying its various
+ optimisations can be obtained by passing the option
+ <code>-ddump-simpl</code> to GHC. Moreover, the various intermediate
+ stages of the optimisation process are printed when passing
+ <code>-dverbose-core2core</code>.
+
+ <h4><a name="loopBreaker">Recursive Definitions</a></h4>
+ <p>
+ The simplification process has to take special care when handling
+ recursive binding groups; otherwise, the compiler might loop.
+ Therefore, the routine <code>reOrderRec</code> in <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/OccurAnal.lhs"><code>OccurAnal.lhs</code></a>
+ computes a set of <em>loop breakers</em> - a set of definitions that
+ together cut any possible loop in the binding group. It marks the
+ identifiers bound by these definitions as loop breakers by enriching
+ their <a href="basicTypes.html#occInfo">occurrence information.</a> Loop
+ breakers will <em>never</em> be inlined by the simplifier; thus,
+ guaranteeing termination of the simplification procedure. (This is not
+ entirely accurate -- see <a href="#rules">rewrite rules</a> below.)
+
+ <p>
+ The process of finding loop breakers works as follows: First, the
+ strongly connected components (SCCs) of the graph representing all
+ function dependencies are computed. Then, each SCC is inspected in turn.
+ If it contains only a single binding (self-recursive function), this is
+ the loop breaker. In case of multiple recursive bindings, the function
+ attempts to select bindings where the decision not to inline them
+ causes the least harm - in the sense of inhibiting optimisations in the
+ code. This is achieved by considering each binding in turn and awarding
+ a <em>score</em> between 0 and 4, where a lower score means that the
+ function is less useful for inlining - and thus, a better loop breaker.
+ The evaluation of bindings is performed by the function
+ <code>score</code> locally defined in <code>OccurAnal</code>.
+
+ <p>
+ Note that, because Core programs represent function definitions as
+ <em>one</em> binding choosing between the possibly many equations in the
+ source program with a <code>case</code> construct, a loop breaker cannot
+ inline any of its possibly many alternatives (not even the non-recursive
+ alternatives).
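+ <p>
+ The selection of loop breakers described above can be sketched with
+ <code>Data.Graph</code> from the <code>containers</code> package. This
+ is a simplified illustration and not the code in
+ <code>OccurAnal.lhs</code>: the <code>Binding</code> record, its
+ fields, and the scores in the example are hypothetical, and the way the
+ remainder of a cycle is re-analysed after picking a breaker is my own
+ assumption about how to cut every loop.
+ <blockquote><pre>
+import Data.Graph (SCC (..), stronglyConnComp)
+import Data.List (minimumBy, sort)
+import Data.Ord (comparing)
+
+-- Hypothetical binding summary: name, score (lower means a better loop
+-- breaker), and the names of the bindings it refers to.
+data Binding = Binding
+  { bName  :: String
+  , bScore :: Int
+  , bDeps  :: [String]
+  }
+
+-- Pick loop breakers until no cycle is left in the binding group.
+loopBreakers :: [Binding] -> [String]
+loopBreakers binds = concatMap breakSCC (stronglyConnComp nodes)
+  where
+    nodes = map (\b -> (b, bName b, bDeps b)) binds
+    -- A binding outside any cycle needs no loop breaker.
+    breakSCC (AcyclicSCC _) = []
+    -- In a cycle, mark the lowest-scoring binding and re-analyse the rest.
+    breakSCC (CyclicSCC bs) =
+      let breaker = minimumBy (comparing bScore) bs
+          rest    = filter (\b -> bName b /= bName breaker) bs
+      in  bName breaker : loopBreakers rest
+
+main :: IO ()
+main = print (sort (loopBreakers bindings))  -- prints ["g","h"]
+  where
+    bindings =
+      [ Binding "f" 3 ["g"]   -- f and g are mutually recursive
+      , Binding "g" 1 ["f"]
+      , Binding "h" 2 ["h"]   -- h is self-recursive
+      , Binding "k" 4 ["f"]   -- k is not recursive
+      ]</pre>
+ </blockquote>
+ <p>
+ In the example, <code>g</code> is preferred over <code>f</code> because
+ of its lower score, and <code>h</code> is the only candidate in its
+ component.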
+
+ <h4><a name="rules">Rewrite Rules</a></h4>
+ <p>
+ The application of rewrite rules is controlled in the module <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/simplCore/Simplify.lhs"><code>Simplify.lhs</code></a>
+ by the function <code>completeCall</code>. This function first checks
+ whether it should inline the function applied at the currently inspected
+ call site, then simplifies the arguments, and finally, checks whether
+ any rewrite rule can be applied (and also whether there is a matching
+ specialised version of the applied function). The actual check for rule
+ application is performed by the function <code><a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/specialise/Rules.lhs">Rules</a>.lookupRule</code>.
+ <p>
+ It should be noted that the application of rewrite rules is not subject
+ to the loop breaker check - i.e., rules of loop breakers will be applied
+ regardless of whether this may cause the simplifier to diverge.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:25:33 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>
--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - Checking Types</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - Checking Types</h1>
+ <p>
+ Probably the most important phase in the frontend is the type checker,
+ which is located at <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/"><code>fptools/ghc/compiler/typecheck/</code>.</a>
+
+ <h4>Type Checking Environment</h4>
+ <p>
+ During type checking, GHC maintains a <em>type environment</em> whose
+ details are fixed in <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcEnv.lhs"><code>TcEnv.lhs</code>.</a>
+ Among other things, the environment contains all imported and local
+ instances as well as a list of <em>global</em> entities (imported and
+ local types and classes together with imported identifiers) and
+ <em>local</em> entities (locally defined identifiers). This environment
+ is threaded through the type checking monad.
+
+ <h4>Expressions</h4>
+ <p>
+ Expressions are type checked by <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcExpr.lhs"><code>TcExpr.lhs</code>.</a>
+ <p>
+ Usage occurrences of identifiers are processed by the function
+ <code>tcId</code> whose main purpose is to <a href="#inst">instantiate
+ overloaded identifiers.</a> It essentially calls
+ <code>TcInst.instOverloadedFun</code> once for each universally
+ quantified set of type constraints. It should be noted that overloaded
+ identifiers are replaced by new names that are first defined in the LIE
+ (Local Instance Environment?) and later promoted into top-level
+ bindings.
+
+ <h4><a name="inst">Handling of Dictionaries and Method Instances</a></h4>
+ <p>
+ GHC implements overloading using so-called <em>dictionaries.</em> A
+ dictionary is a tuple of functions -- one function for each method in
+ the class of which the dictionary implements an instance. During type
+ checking, GHC replaces each type constraint of a function with one
+ additional argument. At runtime, the extended function gets passed a
+ matching class dictionary by way of these additional arguments.
+ Whenever the function needs to call a method of such a class, it simply
+ extracts it from the dictionary.
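+ <p>
+ The translation can be mimicked by hand in plain Haskell. The following
+ is only a sketch of the idea, not of the data structures used inside
+ GHC; the names <code>EqDict</code>, <code>eqMethod</code>,
+ <code>eqDictInt</code>, and <code>member</code> are invented for this
+ example.
+ <pre>
+-- Stands for a class with a single method
+--   class Eq' a where eq :: a -> a -> Bool
+-- The dictionary has one field per class method.
+data EqDict a = EqDict { eqMethod :: a -> a -> Bool }
+
+-- An instance declaration becomes a dictionary value.
+eqDictInt :: EqDict Int
+eqDictInt = EqDict { eqMethod = (==) }
+
+-- Stands for   member :: Eq' a => a -> [a] -> Bool
+-- The type constraint has turned into one additional argument, from
+-- which the method is extracted when needed.
+member :: EqDict a -> a -> [a] -> Bool
+member d x = any (eqMethod d x)
+
+main :: IO ()
+main = print (member eqDictInt 3 [1, 2, 3])   -- prints True
+</pre>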
+ <p>
+ This sounds simple enough; however, the actual implementation is a bit
+ more tricky as it wants to keep track of all the instances at which
+ overloaded functions are used in a module. This information is useful
+ to optimise the code. The implementation is in the module <a
+ href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/Inst.lhs"><code>Inst.lhs</code>.</a>
+ <p>
+ The function <code>instOverloadedFun</code> is invoked for each
+ overloaded usage occurrence of an identifier, where overloaded means that
+ the type of the identifier contains a non-trivial type constraint. It
+ proceeds in two steps: (1) Allocation of a method instance
+ (<code>newMethodWithGivenTy</code>) and (2) instantiation of functional
+ dependencies. The former implies allocating a new unique identifier,
+ which replaces the original (overloaded) identifier at the currently
+ type-checked usage occurrence.
+ <p>
+ The new identifier (after being threaded through the LIE) eventually
+ will be bound by a top-level binding whose rhs contains a partial
+ application of the original overloaded identifier. This partial
+ application applies the overloaded function to the dictionaries needed for the current
+ instance. In GHC lingo, this is called a <em>method.</em> Before
+ becoming a top-level binding, the method is first represented as a value
+ of type <code>Inst.Inst</code>, which makes it easy to fold multiple
+ instances of the same identifier at the same types into one global
+ definition. (And probably other things, too, which I haven't
+ investigated yet.)
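+ <p>
+ In terms of a hand-written dictionary translation, a method looks
+ roughly like <code>describeInt</code> below. All names in this sketch
+ (<code>ShowDict</code>, <code>describe</code>,
+ <code>describeInt</code>, and so on) are hypothetical; the names that
+ GHC generates look different.
+ <pre>
+-- Dictionary for a class with a single method.
+data ShowDict a = ShowDict { showMethod :: a -> String }
+
+showDictInt :: ShowDict Int
+showDictInt = ShowDict { showMethod = show }
+
+-- The overloaded function after the dictionary translation.
+describe :: ShowDict a -> a -> String
+describe d x = "value: " ++ showMethod d x
+
+-- The "method": a top-level binding whose rhs is a partial application
+-- of the overloaded function to the dictionary for this instance.
+describeInt :: Int -> String
+describeInt = describe showDictInt
+
+-- Both usage occurrences at type Int share the one definition above.
+main :: IO ()
+main = mapM_ (putStrLn . describeInt) [1, 2]
+</pre>
+ <p>
+ Note that the usage occurrences now mention <code>describeInt</code>
+ and no longer <code>describe</code> itself.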
+
+ <p>
+ <strong>Note:</strong> As of 13 January 2001 (with respect to the code in the
+ CVS HEAD), the above mechanism interferes badly with RULES pragmas
+ defined over overloaded functions. During instantiation, a new name is
+ created for an overloaded function partially applied to the dictionaries
+ needed in a usage position of that function. As the rewrite rule,
+ however, mentions the original overloaded name, it won't fire anymore
+ -- unless later phases remove the intermediate definition again. The
+ latest CVS version of GHC has an option
+ <code>-fno-method-sharing</code>, which avoids sharing instantiation
+ stubs. This is usually/often/sometimes sufficient to make the rules
+ fire again.
+
+ <p><small>
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:24:09 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>