From c737f0f3c09a0fcf379a60d36b3f44f4b93dfae3 Mon Sep 17 00:00:00 2001 From: chak Date: Wed, 8 Aug 2001 09:48:59 +0000 Subject: [PATCH] [project @ 2001-08-08 09:48:58 by chak] As discussed at HIM in Cambridge, the GHC Commentary is now located in the main repository. The idea is that any developer who wants to share his wisdom with the world-at-large can alter the document. Obviously, it would be nice to keep the structure and style of the document as I have written it so far (admittedly not terribly much). In particular, please add links to the files in CVS where appropriate. Moreover, I have now included a version number in the title, which should be pumped whenever there are substantial changes; check the file index.html. I am planing to set up a cron job that updates the version of the Commentary on my Web page with the CVS version every night (this is also the version linked from GHC's documentation page). --- ghc/docs/comm/feedback.html | 34 +++++++++++ ghc/docs/comm/genesis/genesis.html | 82 ++++++++++++++++++++++++++ ghc/docs/comm/genesis/makefiles.html | 51 ++++++++++++++++ ghc/docs/comm/index.html | 69 ++++++++++++++++++++++ ghc/docs/comm/others.html | 49 ++++++++++++++++ ghc/docs/comm/rts-libs/prelfound.html | 57 ++++++++++++++++++ ghc/docs/comm/rts-libs/prelude.html | 62 ++++++++++++++++++++ ghc/docs/comm/rts-libs/primitives.html | 63 ++++++++++++++++++++ ghc/docs/comm/rts-libs/stgc.html | 45 +++++++++++++++ ghc/docs/comm/the-beast/basicTypes.html | 53 +++++++++++++++++ ghc/docs/comm/the-beast/driver.html | 27 +++++++++ ghc/docs/comm/the-beast/mangler.html | 78 +++++++++++++++++++++++++ ghc/docs/comm/the-beast/simplifier.html | 86 +++++++++++++++++++++++++++ ghc/docs/comm/the-beast/typecheck.html | 96 +++++++++++++++++++++++++++++++ 14 files changed, 852 insertions(+) create mode 100644 ghc/docs/comm/feedback.html create mode 100644 ghc/docs/comm/genesis/genesis.html create mode 100644 ghc/docs/comm/genesis/makefiles.html create mode 100644 ghc/docs/comm/index.html create mode 100644 ghc/docs/comm/others.html create mode 100644 ghc/docs/comm/rts-libs/prelfound.html create mode 100644 ghc/docs/comm/rts-libs/prelude.html create mode 100644 ghc/docs/comm/rts-libs/primitives.html create mode 100644 ghc/docs/comm/rts-libs/stgc.html create mode 100644 ghc/docs/comm/the-beast/basicTypes.html create mode 100644 ghc/docs/comm/the-beast/driver.html create mode 100644 ghc/docs/comm/the-beast/mangler.html create mode 100644 ghc/docs/comm/the-beast/simplifier.html create mode 100644 ghc/docs/comm/the-beast/typecheck.html diff --git a/ghc/docs/comm/feedback.html b/ghc/docs/comm/feedback.html new file mode 100644 index 0000000..1da8b10 --- /dev/null +++ b/ghc/docs/comm/feedback.html @@ -0,0 +1,34 @@ + + + + + The GHC Commentary - Feedback + + + +

Feedback

+

+ I welcome any feedback on the + material and in particular would appreciated comments on which parts of + the document are incomprehensible or miss explanation -- e.g., due to + the use of GHC speak that is explained nowhere (words like infotable or + so). Moreover, I would be interested to know which areas of GHC you + would like to see covered here. +

+ For the moment is probably best if feedback is directed to +

+

+ chak@cse.unsw.edu.au +
+

+ However, if there is sufficient interest, we might consider setting up a + mailing list. + +

+ +Last modified: Wed Aug 8 00:11:42 EST 2001 + + + + diff --git a/ghc/docs/comm/genesis/genesis.html b/ghc/docs/comm/genesis/genesis.html new file mode 100644 index 0000000..28a7add --- /dev/null +++ b/ghc/docs/comm/genesis/genesis.html @@ -0,0 +1,82 @@ + + + + + The GHC Commentary - Outline of the Genesis + + + +

The GHC Commentary - Outline of the Genesis

+

+ Building GHC happens in two stages: First you have to prepare the tree + with make boot; and second, you build the compiler and + associated libraries with make all. The boot + stage builds some tools used during the main build process, generates + parsers and other pre-computed source, and finally computes dependency + information. There is considerable detail on the build process in GHC's + Building Guide. + +

Debugging the Beast

+

+ If you are hacking the compiler or like to play with unstable + development versions, chances are that the compiler someday just crashes + on you. Then, it is a good idea to load the core into + gdb as usual, but unfortunately there is usually not too + much useful information. +

+ The next step, then, is somewhat tedious. You should build a compiler + producing programs with a runtime system that has debugging turned on + and use that to build the crashing compiler. There are many sanity + checks in the RTS, which may detect inconsistency before they lead to a + crash and you may include more debugging information, which helps + gdb. For a RTS with debugging turned on, add the following + to build.mk (see also the comment in + config.mk.in that you find when searching for + GhcRtsHcOpts): +

+GhcRtsHcOpts+=-optc-DDEBUG
+GhcRtsCcOpts+=-optc-g
+EXTRA_LD_OPTS=-lbfd -liberty
+

+ Then go into fptools/ghc/rts and make clean boot && + make all. With the resulting runtime system, you have to re-link + the compiler. Go into fptools/ghc/compiler, delete the + file hsc (up to version 4.08) or + ghc-<version>, and execute make all. +

+ The EXTRA_LD_OPTS are necessary as some of the debugging + code uses the BFD library, which in turn requires liberty. + I would also recommend (in 4.11 and from 5.0 upwards) adding these linker + options to the files package.conf and + package.conf.inplace in the directory + fptools/ghc/driver/ to the extra_ld_opts entry + of the package RTS. Otherwise, you have to supply them + whenever you compile and link a program with a compiler that uses the + debugging RTS for the programs it produces. +

+ To run GHC up to version 4.08 in gdb, first invoke the + compiler as usual, but pass it the option -v. This will + show you the exact invocation of the compiler proper hsc. + Run hsc with these options in gdb. The + development version 4.11 and stable releases from 5.0 on do no longer + use the Perl driver; so, you can run them directly with gdb. +

+ Debugging a compiler during building from HC files. + If you are boot strapping the compiler on new platform from HC files and + it crashes somewhere during the build (e.g., when compiling the + libraries), do as explained above, but you may have to re-configure the + build system with --enable-hc-boot before re-making the + code in fptools/ghc/driver/. + If you do this with a compiler up to version 4.08, run the build process + with make EXTRA_HC_OPTS=-v to get the exact arguments with + which you have to invoke hsc in gdb. + +

+ +Last modified: Wed Aug 8 19:18:54 EST 2001 + + + + diff --git a/ghc/docs/comm/genesis/makefiles.html b/ghc/docs/comm/genesis/makefiles.html new file mode 100644 index 0000000..0caa27a --- /dev/null +++ b/ghc/docs/comm/genesis/makefiles.html @@ -0,0 +1,51 @@ + + + + + The GHC Commentary - Mindboggling Makefiles + + + +

The GHC Commentary - Mindboggling Makefiles

+

+ The size and structure of GHC's makefiles makes it quite easy to scream + out loud - in pain - during the process of tracking down problems in the + make system or when attempting to alter it. GHC's Building + Guide has valuable information on the + makefile architecture. + +

A maze of twisty little passages, all alike

+

+ The fptools/ toplevel and the various project directories + contain not only a Makefile each, but there are + subdirectories of name mk/ at various levels that contain + rules, targets, and so on specific to a project - or, in the case of the + toplevel, the default rules for the whole system. Each mk/ + directory contains a file boilerplate.mk that ties the + various other makefiles together. Files called target.mk, + paths.mk, and suffix.mk contain make targets, + definitions of variables containing paths, and suffix rules, + respectively. +

+ One particularly nasty trick used in this hierarchy of makesfiles is the + way in which the variable $(TOP) is used. AFAIK, + $(TOP) always points to a directory containing an + mk/ subdirectory; however, it not necessarily points to the + toplevel fptools/ directory. For example, within the GHC + subtree, $(TOP) points to fptools/ghc/. + However, some of the makefiles in fptools/ghc/mk/ will then + temporarily redefine $(TOP) to point a level + higher (i.e., to fptools/) while they are including the + toplevel boilerplate. After that $(TOP) is redefined to + whatever value it had before including makfiles from higher up in the + hierarchy. + +

+ +Last modified: Wed Aug 8 19:19:54 EST 2001 + + + + diff --git a/ghc/docs/comm/index.html b/ghc/docs/comm/index.html new file mode 100644 index 0000000..c4601d6 --- /dev/null +++ b/ghc/docs/comm/index.html @@ -0,0 +1,69 @@ + + + + + The GHC Commentary - The Beast Explained + + + +

The Glasgow Haskell Compiler (GHC) Commentary [v0.1]

+

+ + Manuel M. T. Chakravarty
+  
+

+ This document started as a collection of notes describing what I learnt when poking around in + the GHC sources. During the + Haskell Implementers Workshop in January 2001 it was decided to + put the commentary into GHC's CVS repository to allow the whole + developer community to add their wizardly insight to the document. +

+ The document is still in its infancy - help it grow! + +

Before the Show Begins

+

+

+ +

Genesis

+

+

+ +

The Beast Dissected

+

+

+ +

RTS & Libraries

+

+

+ +

+ +Last modified: Wed Aug 8 00:11:49 EST 2001 + + + + diff --git a/ghc/docs/comm/others.html b/ghc/docs/comm/others.html new file mode 100644 index 0000000..a34e633 --- /dev/null +++ b/ghc/docs/comm/others.html @@ -0,0 +1,49 @@ + + + + + The GHC Commentary - Other Sources of Wisdom + + + +

Other Sources of Wisdom

+

+ Believe it or not, but there are other people besides you who are + masochistic enough to study the innards of the beast. Some of the have + been kind (or cruel?) enough to share their insights with us. Here is a + probably incomplete list: +

+

+ +


+ 1Usually reliable sources have it that + the poor soul in question is no one less than GUM hardcore hacker Hans-Wolfgang Loidl. + +

+ +Last modified: Wed Aug 8 00:47:05 EST 2001 + + + + diff --git a/ghc/docs/comm/rts-libs/prelfound.html b/ghc/docs/comm/rts-libs/prelfound.html new file mode 100644 index 0000000..25407ee --- /dev/null +++ b/ghc/docs/comm/rts-libs/prelfound.html @@ -0,0 +1,57 @@ + + + + + The GHC Commentary - Prelude Foundations + + + +

The GHC Commentary - Prelude Foundations

+

+ The standard Haskell Prelude as well as GHC's Prelude extensions are + constructed from GHC's primitives in a + couple of layers. + +

PrelBase.lhs

+

+ Some most elementary Prelude definitions are collected in PrelBase.lhs. + In particular, it defines the boxed versions of Haskell primitive types + - for example, Int is defined as +

+data Int = I# Int#
+
+

+ Saying that a boxed integer Int is formed by applying the + data constructor I# to an unboxed integer of type + Int#. Unboxed types are hardcoded in the compiler and + exported together with the primitive + operations understood by GHC. +

+ PrelBase.lhs similarly defines basic types, such as, + boolean values +

+data  Bool  =  False | True  deriving (Eq, Ord)
+
+

+ the unit type +

+data  ()  =  ()
+
+

+ and lists +

+data [] a = [] | a : [a]
+
+

+ It also contains instance delarations for these types. In addition, + PrelBase.lhs contains some tricky + machinery for efficient list handling. + +

+ +Last modified: Wed Aug 8 19:30:18 EST 2001 + + + + diff --git a/ghc/docs/comm/rts-libs/prelude.html b/ghc/docs/comm/rts-libs/prelude.html new file mode 100644 index 0000000..3f12bb1 --- /dev/null +++ b/ghc/docs/comm/rts-libs/prelude.html @@ -0,0 +1,62 @@ + + + + + The GHC Commentary - Cunning Prelude Code + + + +

The GHC Commentary - Cunning Prelude Code

+

+ GHC's uses a many optimsations and GHC specific techniques (unboxed + values, RULES pragmas, and so on) to make the heavily used Prelude code + as fast as possible. + +

fold/build

+

+ There is a lot of magic in PrelBase.lhs - + among other things, the RULES + pragmas implementing the fold/build + optimisation. The code for map is + a good example for how it all works. In the prelude code for version + 4.08.1 it reads as follows: +

+map :: (a -> b) -> [a] -> [b]
+map = mapList
+
+-- Note eta expanded
+mapFB ::  (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst
+mapFB c f x ys = c (f x) ys
+
+mapList :: (a -> b) -> [a] -> [b]
+mapList _ []     = []
+mapList f (x:xs) = f x : mapList f xs
+
+{-# RULES
+"map"	  forall f xs.  map f xs	       = build (\c n -> foldr (mapFB c f) n xs)
+"mapFB"	  forall c f g. mapFB (mapFB c f) g    = mapFB c (f.g) 
+"mapList" forall f.	foldr (mapFB (:) f) [] = mapList f
+ #-}
+
+

+ This code is structured as it is, because the "map" rule first + breaks the map open, which exposes it to the various + foldr/build rules, and if no foldr/build rule matches, the "mapList" + rule closes it again in a later phase of optimisation - after + build was inlined. As a consequence, the whole thing depends a bit on + the timing of the various optimsations (the map might be closed again + before any of the foldr/build rules fires). To make the timing + deterministic, build gets a {-# INLINE 2 build + #-} pragma, which delays build's inlining, and thus, + the closing of the map. + +

+ +Last modified: Wed Aug 8 19:31:18 EST 2001 + + + + diff --git a/ghc/docs/comm/rts-libs/primitives.html b/ghc/docs/comm/rts-libs/primitives.html new file mode 100644 index 0000000..19095b4 --- /dev/null +++ b/ghc/docs/comm/rts-libs/primitives.html @@ -0,0 +1,63 @@ + + + + + The GHC Commentary - Primitives + + + +

The GHC Commentary - Primitives

+

+ Most user-level Haskell types and functions provided by GHC (in + particular those from the Prelude and GHC's Prelude extensions) are + internally constructed from even more elementary types and functions. + Most notably, GHC understands a notion of unboxed types, which + are the Haskell representation of primitive bit-level integer, float, + etc. types (as opposed to their boxed, heap allocated counterparts) - + cf. "Unboxed + Values as First Class Citizens." + +

The Ultimate Source of Primitives

+

+ The hardwired types of GHC are brought into scope by the module + PrelGHC. This modules only exists in the form of a + handwritten interface file PrelGHC.hi-boot, + which lists the type and function names, as well as instance + declarations. The actually types of these names as well as their + implementation is hardwired into GHC. Note that the names in this file + are z-encoded, and in particular, identifiers ending on zh + denote user-level identifiers ending in a hash mark (#), + which is used to flag unboxed values or functions operating on unboxed + values. For example, we have Char#, ord#, and + so on. + +

The New Primitive Definition Scheme

+

+ As of (about) the development version 4.11, the types and various + properties of primitive operations are defined in the file primops.txt + (Personally, I don't think that the .txt suffix is really + appropriate, as the file is used for automatic code generation). +

+ The utility genprimopcode + generates a series of Haskell files from primops.txt, which + encode the types and various properties of the primitive operations as + compiler internal data structures. These Haskell files are not complete + modules, but program fragments, which are included into compiler modules + during the GHC build process. The generated include files can be found + in the directory fptools/ghc/compiler/ and carry names + matching the pattern primop-*.hs-incl. They are generate + during the execution of the boot target in the + fptools/ghc/ directory. This scheme significantly + simplifies the maintenance of primitive operations. + +

+ +Last modified: Wed Aug 8 19:29:12 EST 2001 + + + + diff --git a/ghc/docs/comm/rts-libs/stgc.html b/ghc/docs/comm/rts-libs/stgc.html new file mode 100644 index 0000000..196ec91 --- /dev/null +++ b/ghc/docs/comm/rts-libs/stgc.html @@ -0,0 +1,45 @@ + + + + + The GHC Commentary - Spineless Tagless C + + + +

The GHC Commentary - Spineless Tagless C

+

+ The C code generated by GHC doesn't use higher-level features of C to be + able to control as precisely as possible what code is generated. + Moreover, it uses special features of gcc (such as, first class labels) + to produce more efficient code. +

+ STG C makes ample use of C's macro language to define idioms, which also + reduces the size of the generated C code (thus, reducing I/O times). + These macros are defined in the C headers located in GHC's includes + directory. + +

TailCalls.h

+

+ TailCalls.h + defines how tail calls are implemented - and in particular - optimised + in GHC generated code. The default case, for an architecture for which + GHC is not optimised, is to use the mini interpreter described in the STG paper. +

+ For supported architectures, various tricks are used to generate + assembler implementing proper tail calls. On i386, gcc's first class + labels are used to directly jump to a function pointer. Furthermore, + markers of the form --- BEGIN --- and --- END + --- are added to the assembly right after the function prologue + and before the epilogue. These markers are used by the Evil Mangler. + +

+ +Last modified: Wed Aug 8 19:28:29 EST 2001 + + + + diff --git a/ghc/docs/comm/the-beast/basicTypes.html b/ghc/docs/comm/the-beast/basicTypes.html new file mode 100644 index 0000000..2295db5 --- /dev/null +++ b/ghc/docs/comm/the-beast/basicTypes.html @@ -0,0 +1,53 @@ + + + + + The GHC Commentary - The Basics + + + +

The GHC Commentary - The Basics

+

+ The directory fptools/ghc/compiler/basicTypes/ + contains modules that define some of the essential types definition for + the compiler - such as, identifiers, variables, modules, and unique + names. + +

Ids

+

+ An Id (defined in Id.lhs + essentially records information about value and data constructor + identifiers -- to be precise, in the case of data constructors, two + Ids are used to represent the worker and wrapper functions + for the data constructor, respectively. The information maintained in + the Id abstraction includes among other items strictness, + occurrence, specialisation, and unfolding information. +

+ Due to the way Ids are used for data constructors, + all Ids are represented as variables, which contain a + varInfo field of abstract type IdInfo.IdInfo. + This is where the information about Ids is really stored. + The following is a (currently, partial) list of the various items in an + IdInfo: +

+

+
Occurence information +
The OccInfo data type is defined in the module BasicTypes.lhs. + Apart from the trivial NoOccInfo, it distinguishes + between variables that do not occur at all (IAmDead), + occur just once (OneOcc), or a loop breakers + (IAmALoopBreaker). +
+ +

+ +Last modified: Wed Aug 8 19:23:01 EST 2001 + + + + diff --git a/ghc/docs/comm/the-beast/driver.html b/ghc/docs/comm/the-beast/driver.html new file mode 100644 index 0000000..2c56182 --- /dev/null +++ b/ghc/docs/comm/the-beast/driver.html @@ -0,0 +1,27 @@ + + + + + The GHC Commentary - The Glorious Driver + + + +

The GHC Commentary - The Glorious Driver

+

+ The Glorious Driver (GD) is the part of GHC that orchestrates the + interaction of all the other pieces that make up GHC. It supersedes the + Evil Driver (ED), which was a Perl script that served the same + purpose and was in use until version 4.08.1 of GHC. Simon Marlow + eventually slayed the ED and instated the GD. +

+ The GD has been substantially extended for GHCi, i.e., the interactive + variant of GHC that integrates the compiler with a (meta-circular) + interpreter since version 5.00. + +

+ +Last modified: Wed Aug 8 19:22:14 EST 2001 + + + + diff --git a/ghc/docs/comm/the-beast/mangler.html b/ghc/docs/comm/the-beast/mangler.html new file mode 100644 index 0000000..d6d3a08 --- /dev/null +++ b/ghc/docs/comm/the-beast/mangler.html @@ -0,0 +1,78 @@ + + + + + The GHC Commentary - The Evil Mangler + + + +

The GHC Commentary - The Evil Mangler

+

+ The Evil Mangler (EM) is a Perl script invoked by the Glorious Driver after the C compiler (gcc) has + translated the GHC-produced C code into assembly. Consequently, it is + only of interest if -fvia-C is in effect (either explicitly + or implicitly). + +

Its purpose

+

+ The EM reads the assembly produced by gcc and re-arranges code blocks as + well as nukes instructions that it considers non-essential. It + derives it evilness from its utterly ad hoc, machine, compiler, and + whatnot dependent design and implementation. More precisely, the EM + performs the following tasks: +

+ +

Implementation

+

+ The EM is located in the Perl script ghc-asm.lprl. + The script reads the .s file and chops it up into + chunks (that's how they are actually called in the script) that + roughly correspond to basic blocks. Each chunk is annotated with an + educated guess about what kind of code it contains (e.g., infotable, + fast entry point, slow entry point, etc.). The annotations also contain + the symbol introducing the chunk of assembly and whether that chunk has + already been processed or not. +

+ The parsing of the input into chunks as well as recognising assembly + instructions that are to be removed or altered is based on a large + number of Perl regular expressions sprinkled over the whole code. These + expressions are rather fragile as they heavily rely on the structure of + the generated code - in fact, they even rely on the right amount of wide + space and thus on the formatting of the assembly. +

+ Afterwards, the chunks are reordered, some of them purged, and some + stripped of some useless instructions. Moreover, some instructions are + manipulated (eg, loads of fast entry points followed by indirect jumps + are replaced by direct jumps to the fast entry point). +

+ The EM knows which part of the code belongs to function prologues and + epilogues as STG C adds tags of the + form --- BEGIN --- and --- END --- the + assembler just before and after the code proper of a function starts. + It adds these tags using gcc's __asm__ feature. +

+ Update: Gcc 2.96 upwards performs more aggressive basic + block re-ordering and dead code elimination. This seems to make the + whole --- END --- tag business redundant -- in fact, if + proper code is generated, no --- END --- tags survive gcc + optimiser. + +

+ +Last modified: Wed Aug 8 19:27:22 EST 2001 + + + + diff --git a/ghc/docs/comm/the-beast/simplifier.html b/ghc/docs/comm/the-beast/simplifier.html new file mode 100644 index 0000000..40cf7cf --- /dev/null +++ b/ghc/docs/comm/the-beast/simplifier.html @@ -0,0 +1,86 @@ + + + + + The GHC Commentary - The Mighty Simplifier + + + +

The GHC Commentary - The Mighty Simplifier

+

+ Most of the optimising program transformations applied by GHC are + performed on an intermediate language called Core, which + essentially is a compiler-friendly formulation of rank-2 polymorphic + lambda terms defined in the module CoreSyn.lhs. + The transformation engine optimising Core programs is called the + Simplifier and composed from a couple of modules located in the + directory fptools/ghc/compiler/simplCore/. + The main engine of the simplifier is contained in Simplify.lhs. + and its driver is the routine core2core in SimplCore.lhs. +

+ The program that the simplifier has produced after applying its various + optimisations can be obtained by passing the option + -ddump-simpl to GHC. Moreover, the various intermediate + stages of the optimisation process is printed when passing + -dverbose-core2core. + +

Recursive Definitions

+

+ The simplification process has to take special care when handling + recursive binding groups; otherwise, the compiler might loop. + Therefore, the routine reOrderRec in OccurAnal.lhs + computes a set of loop breakers - a set of definitions that + together cut any possible loop in the binding group. It marks the + identifiers bound by these definitions as loop breakers by enriching + their occurence information. Loop + breakers will never be inlined by the simplifier; thus, + guaranteeing termination of the simplification procedure. (This is not + entirely accurate -- see rewrite rules below.) + + The processes finding loop breakers works as follows: First, the + strongly connected components (SCC) of the graph representing all + function dependencies is computed. Then, each SCC is inspected in turn. + If it contains only a single binding (self-recursive function), this is + the loop breaker. In case of multiple recursive bindings, the function + attempts to select bindings where the decision not to inline them does + cause the least harm - in the sense of inhibiting optimisations in the + code. This is achieved by considering each binding in turn and awarding + a score between 0 and 4, where a lower score means that the + function is less useful for inlining - and thus, a better loop breaker. + The evaluation of bingings is performed by the function + score locally defined in OccurAnal. + + Note that, because core programs represent function definitions as + one binding choosing between the possibly many equations in the + source program with a case construct, a loop breaker cannot + inline any of its possibly many alternatives (not even the non-recursive + alternatives). + +

Rewrite Rules

+

+ The application of rewrite rules is controlled in the module Simplify.lhs + by the function completeCall. This function first checks + whether it should inline the function applied at the currently inspected + call site, then simplifies the arguments, and finally, checks whether + any rewrite rule can be applied (and also whether there is a matching + specialised version of the applied function). The actual check for rule + application is performed by the function Rules.lookupRule. +

+ It should be note that the application of rewrite rules is not subject + to the loop breaker check - i.e., rules of loop breakers will be applied + regardless of whether this may cause the simplifier to diverge. + +

+ +Last modified: Wed Aug 8 19:25:33 EST 2001 + + + + diff --git a/ghc/docs/comm/the-beast/typecheck.html b/ghc/docs/comm/the-beast/typecheck.html new file mode 100644 index 0000000..0cbd5bf --- /dev/null +++ b/ghc/docs/comm/the-beast/typecheck.html @@ -0,0 +1,96 @@ + + + + + The GHC Commentary - Checking Types + + + +

The GHC Commentary - Checking Types

+

+ Probably the most important phase in the frontend is the type checker, + which is located at fptools/ghc/compiler/typecheck/. + +

Type Checking Environment

+

+ During type checking, GHC maintains a type environment whose + details are fixed in TcEnv.lhs. + Among other things, the environment contains all imported and local + instances as well as a list of global entities (imported and + local types and classes together with imported identifiers) and + local entities (locally defined identifiers). This environment + is threaded through the type checking monad. + +

Expressions

+

+ Expressions are type checked by TcExpr.lhs. +

+ Usage occurences of identifiers are processed by the function + tcId whose main purpose is to instantiate + overloaded identifiers. It essentially calls + TcInst.instOverloadedFun once for each universally + quantified set of type constraints. It should be noted that overloaded + identifiers are replaced by new names that are first defined in the LIE + (Local Instance Environment?) and later promoted into top-level + bindings. + +

Handling of Dictionaries and Method Instances

+

+ GHC implements overloading using so-called dictionaries. A + dictionary is a tuple of functions -- one function for each method in + the class of which the dictionary implements an instance. During type + checking, GHC replaces each type constraint of a function with one + additional argument. At runtime, the extended function gets passed a + matching class dictionary by way of these additional arguments. + Whenever the function needs to call a method of such a class, it simply + extracts it from the dictionary. +

+ This sounds simple enough; however, the actual implementation is a bit + more tricky as it wants to keep track of all the instances at which + overloaded functions are used in a module. This information is useful + to optimise the code. The implementation is the module Inst.lhs. +

+ The function instOverloadedFun is invoked for each + overloaded usage occurence of an identifier, where overloaded means that + the type of the idendifier contains a non-trivial type constraint. It + proceeds in two steps: (1) Allocation of a method instance + (newMethodWithGivenTy) and (2) instantiation of functional + dependencies. The former implies allocating a new unique identifier, + which replaces the original (overloaded) identifier at the currently + type-checked usage occurrence. +

+ The new identifier (after being threaded through the LIE) eventually + will be bound by a top-level binding whose rhs contains a partial + application of the original overloaded identifier. This papp applies + the overloaded function to the dictionaries needed for the current + instance. In GHC lingo, this is called a method. Before + becoming a top-level binding, the method is first represented as a value + of type Inst.Inst, which makes it easy to fold multiple + instances of the same identifier at the same types into one global + definition. (And probably other things, too, which I haven't + investigated yet.) + +

+ Note: As of 13 January 2001 (wrt. to the code in the + CVS HEAD), the above mechanism interferes badly with RULES pragmas + defined over overloaded functions. During instantiation, a new name is + created for an overloaded function partially applied to the dictionaries + needed in a usage position of that function. As the rewrite rule, + however, mentions the original overloaded name, it won't fire anymore + -- unless later phases remove the intermediate definition again. The + latest CVS version of GHC has an option + -fno-method-sharing, which avoids sharing instantiation + stubs. This is usually/often/sometimes sufficient to make the rules + fire again. + +

+ +Last modified: Wed Aug 8 19:24:09 EST 2001 + + + + -- 1.7.10.4