1 <!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
4 <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
5 <title>The GHC Commentary - Checking Types</title>
8 <body BGCOLOR="FFFFFF">
9 <h1>The GHC Commentary - Checking Types</h1>
11 Probably the most important phase in the frontend is the type checker,
12 which is located at <a
13 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/"><code>fptools/ghc/compiler/typecheck/</code>.</a>
14 GHC type checks programs in their original Haskell form before the
15 desugarer converts them into Core code. This complicates the type
16 checker as it has to handle the much more verbose Haskell AST, but it
17 improves error messages, as those message are based on the same
18 structure that the user sees.
21 GHC defines the abstract syntax of Haskell programs in <a
22 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/hsSyn/HsSyn.lhs"><code>HsSyn</code></a>
23 using a structure that abstracts over the concrete representation of
24 bound occurences of identifiers and patterns. The module <a
25 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcHsSyn.lhs"><code>TcHsSyn</code></a>
26 defines a number of helper function required by the type checker. Note
28 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnTypes.lhs"><code>TcRnTypes</code></a>.<code>TcId</code>
29 used to represent identifiers in some signatures during type checking
30 is, in fact, nothing but a synonym for a <a href="vars.html">plain
34 <h2>The Overall Flow of Things</h2>
36 <h4>Entry Points Into the Type Checker</h4>
38 The interface of the type checker (and <a
39 href="renamer.html">renamer</a>) to the rest of the compiler is provided
41 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnDriver.lhs"><code>TcRnDriver</code></a>.
42 Entire modules are processed by calling <code>tcRnModule</code> and GHCi
43 uses <code>tcRnStmt</code>, <code>tcRnExpr</code>, and
44 <code>tcRnType</code> to typecheck statements and expressions, and to
45 kind check types, respectively. Moreover, <code>tcRnExtCore</code> is
46 provided to typecheck external Core code. Moreover,
47 <code>tcTopSrcDecls</code> is used by Template Haskell - more
48 specifically by <code>TcSplice.tc_bracket</code>
49 - to type check the contents of declaration brackets.
52 <h4>Renaming and Type Checking a Module</h4>
54 The function <code>tcRnModule</code> controls the complete static
55 analysis of a Haskell module. It sets up the combined renamer and type
56 checker monad, resolves all import statements, initiates the actual
57 renaming and type checking process, and finally, wraps off by processing
61 The actual type checking and renaming process is initiated via
62 <code>TcRnDriver.tcRnSrcDecls</code>, which uses a helper called
63 <code>tc_rn_src_decls</code> to implement the iterative renaming and
64 type checking process required by <a href="../exts/th.html">Template
65 Haskell</a>. However, before it invokes <code>tc_rn_src_decls</code>,
66 it takes care of hi-boot files; afterwards, it simplifies type
67 constraints and zonking (see below regarding the later).
70 The function <code>tc_rn_src_decls</code> partitions static analysis of
71 a whole module into multiple rounds, where the initial round is followed
72 by an additional one for each toplevel splice. It collects all
73 declarations up to the next splice into an <code>HsDecl.HsGroup</code>
74 to rename and type check that <em>declaration group</em> by calling
75 <code>TcRnDriver.tcRnGroup</code>. Afterwards, it executes the
76 splice (if there are any left) and proceeds to the next group, which
77 includes the declarations produced by the splice.
80 The function <code>tcRnGroup</code>, finally, gets down to invoke the
81 actual renaming and type checking via
82 <code>TcRnDriver.rnTopSrcDecls</code> and
83 <code>TcRnDriver.tcTopSrcDecls</code>, respectively. The renamer, apart
84 from renaming, computes the global type checking environment, of type
85 <code>TcRnTypes.TcGblEnv</code>, which is stored in the type checking
86 monad before type checking commences.
89 <h2>Type Checking a Declaration Group</h2>
91 The type checking of a declaration group, performed by
92 <code>tcTopSrcDecls</code> starts by processing of the type and class
93 declarations of the current module, using the function
94 <code>TcTyClsDecls.tcTyAndClassDecls</code>. This is followed by a
95 first round over instance declarations using
96 <code>TcInstDcls.tcInstDecls1</code>, which in particular generates all
97 additional bindings due to the deriving process. Then come foreign
98 import declarations (<code>TcForeign.tcForeignImports</code>) and
99 default declarations (<code>TcDefaults.tcDefaults</code>).
102 Now, finally, toplevel value declarations (including derived ones) are
103 type checked using <code>TcBinds.tcTopBinds</code>. Afterwards,
104 <code>TcInstDcls.tcInstDecls2</code> traverses instances for the second
105 time. Type checking concludes with processing foreign exports
106 (<code>TcForeign.tcForeignExports</code>) and rewrite rules
107 (<code>TcRules.tcRules</code>). Finally, the global environment is
108 extended with the new bindings.
111 <h2>Type checking Type and Class Declarations</h2>
113 Type and class declarations are type checked in a couple of phases that
114 contain recursive dependencies - aka <em>knots.</em> The first knot
115 encompasses almost the whole type checking of these declarations and
116 forms the main piece of <code>TcTyClsDecls.tcTyAndClassDecls</code>.
119 Inside this big knot, the first main operation is kind checking, which
120 again involves a knot. It is implemented by <code>kcTyClDecls</code>,
121 which performs kind checking of potentially recursively-dependent type
122 and class declarations using kind variables for initially unknown kinds.
123 During processing the individual declarations some of these variables
124 will be instantiated depending on the context; the rest gets by default
125 kind <code>*</code> (during <em>zonking</em> of the kind signatures).
126 Type synonyms are treated specially in this process, because they can
127 have an unboxed type, but they cannot be recursive. Hence, their kinds
128 are inferred in dependency order. Moreover, in contrast to class
129 declarations and other type declarations, synonyms are not entered into
130 the global environment as a global <code>TyThing</code>.
131 (<code>TypeRep.TyThing</code> is a sum type that combines the various
132 flavours of typish entities, such that they can be stuck into type
133 environments and similar.)
136 <h2>More Details</h2>
138 <h4>Types Variables and Zonking</h4>
140 During type checking type variables are represented by mutable variables
141 - cf. the <a href="vars.html#TyVar">variable story.</a> Consequently,
142 unification can instantiate type variables by updating those mutable
143 variables. This process of instantiation is (for reasons that elude me)
145 href="http://www.dictionary.com/cgi-bin/dict.pl?term=zonk&db=*">zonking</a>
146 in GHC's sources. The zonking routines for the various forms of Haskell
147 constructs are responsible for most of the code in the module <a
148 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcHsSyn.lhs"><code>TcHsSyn</code>,</a>
149 whereas the routines that actually operate on mutable types are defined
151 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcMType.lhs"><code>TcMType</code></a>;
152 this includes the zonking of type variables and type terms, routines to
153 create mutable structures and update them as well as routines that check
154 constraints, such as that type variables in function signatures have not
155 been instantiated during type checking. The actual type unification
156 routine is <code>uTys</code> in the module <a
157 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcUnify.lhs"><code>TcUnify</code></a>.
160 All type variables that may be instantiated (those in signatures
161 may not), but haven't been instantiated during type checking, are zonked
162 to <code>()</code>, so that after type checking all mutable variables
163 have been eliminated.
166 <h4>Type Representation</h4>
168 The representation of types is fixed in the module <a
169 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRep.lhs"><code>TcRep</code></a>
170 and exported as the data type <code>Type</code>. As explained in <a
171 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcType.lhs"><code>TcType</code></a>,
172 GHC supports rank-N types, but, in the type checker, maintains the
173 restriction that type variables cannot be instantiated to quantified
174 types (i.e., the type system is predicative). The type checker floats
175 universal quantifiers outside and maintains types in prenex form.
176 (However, quantifiers can, of course, not float out of negative
177 positions.) Overall, we have
181 sigma -> forall tyvars. phi
186 | tycon tau_1 .. tau_n
188 | tau_1 -> tau_2</pre>
191 where <code>sigma</code> is in prenex form; i.e., there is never a
192 forall to the right of an arrow in a <code>phi</code> type. Moreover, a
193 type of the form <code>tau</code> never contains a quantifier (which
194 includes arguments to type constructors).
197 Of particular interest are the variants <code>SourceTy</code> and
198 <code>NoteTy</code> of <a
199 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TypeRep.lhs"><code>TypeRep</code></a>.<code>Type</code>.
200 The constructor <code>SourceTy :: SourceType -> Type</code> represents a
201 type constraint; that is, a predicate over types represented by a
202 dictionary. The type checker treats a <code>SourceTy</code> as opaque,
203 but during the translation to core it will be expanded into its concrete
204 representation (i.e., a dictionary type) by the function <a
205 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/types/Type.lhs"><code>Type</code></a>.<code>sourceTypeRep</code>.
206 Note that newtypes are not covered by <code>SourceType</code>s anymore,
207 even if some comments in GHC still suggest this. Instead, all newtype
208 applications are initially represented as a <code>NewTcApp</code>, until
209 they are eliminated by calls to <a
210 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/types/Type.lhs"><code>Type</code></a>.<code>newTypeRep</code>.
213 The <code>NoteTy</code> constructor is used to add non-essential
214 information to a type term. Such information has the type
215 <code>TypeRep.TyNote</code> and is either the set of free type variables
216 of the annotated expression or the unexpanded version of a type synonym.
217 Free variables sets are cached as notes to save the overhead of
218 repeatedly computing the same set for a given term. Unexpanded type
219 synonyms are useful for generating comprehensible error messages, but
220 have no influence on the process of type checking.
223 <h4>Type Checking Environment</h4>
225 During type checking, GHC maintains a <em>type environment</em> whose
226 type definitions are fixed in the module <a
227 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnTypes.lhs"><code>TcRnTypes</code></a> with the operations defined in
229 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcEnv.lhs"><code>TcEnv</code></a>.
230 Among other things, the environment contains all imported and local
231 instances as well as a list of <em>global</em> entities (imported and
232 local types and classes together with imported identifiers) and
233 <em>local</em> entities (locally defined identifiers). This environment
234 is threaded through the type checking monad, whose support functions
235 including initialisation can be found in the module <a
236 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcRnMonad.lhs"><code>TcRnMonad</code>.</a>
240 Expressions are type checked by <a
241 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/TcExpr.lhs"><code>TcExpr</code>.</a>
243 Usage occurences of identifiers are processed by the function
244 <code>tcId</code> whose main purpose is to <a href="#inst">instantiate
245 overloaded identifiers.</a> It essentially calls
246 <code>TcInst.instOverloadedFun</code> once for each universally
247 quantified set of type constraints. It should be noted that overloaded
248 identifiers are replaced by new names that are first defined in the LIE
249 (Local Instance Environment?) and later promoted into top-level
252 <h4><a name="inst">Handling of Dictionaries and Method Instances</a></h4>
254 GHC implements overloading using so-called <em>dictionaries.</em> A
255 dictionary is a tuple of functions -- one function for each method in
256 the class of which the dictionary implements an instance. During type
257 checking, GHC replaces each type constraint of a function with one
258 additional argument. At runtime, the extended function gets passed a
259 matching class dictionary by way of these additional arguments.
260 Whenever the function needs to call a method of such a class, it simply
261 extracts it from the dictionary.
263 This sounds simple enough; however, the actual implementation is a bit
264 more tricky as it wants to keep track of all the instances at which
265 overloaded functions are used in a module. This information is useful
266 to optimise the code. The implementation is the module <a
267 href="http://cvs.haskell.org/cgi-bin/cvsweb.cgi/fptools/ghc/compiler/typecheck/Inst.lhs"><code>Inst.lhs</code>.</a>
269 The function <code>instOverloadedFun</code> is invoked for each
270 overloaded usage occurence of an identifier, where overloaded means that
271 the type of the idendifier contains a non-trivial type constraint. It
272 proceeds in two steps: (1) Allocation of a method instance
273 (<code>newMethodWithGivenTy</code>) and (2) instantiation of functional
274 dependencies. The former implies allocating a new unique identifier,
275 which replaces the original (overloaded) identifier at the currently
276 type-checked usage occurrence.
278 The new identifier (after being threaded through the LIE) eventually
279 will be bound by a top-level binding whose rhs contains a partial
280 application of the original overloaded identifier. This papp applies
281 the overloaded function to the dictionaries needed for the current
282 instance. In GHC lingo, this is called a <em>method.</em> Before
283 becoming a top-level binding, the method is first represented as a value
284 of type <code>Inst.Inst</code>, which makes it easy to fold multiple
285 instances of the same identifier at the same types into one global
286 definition. (And probably other things, too, which I haven't
290 <strong>Note:</strong> As of 13 January 2001 (wrt. to the code in the
291 CVS HEAD), the above mechanism interferes badly with RULES pragmas
292 defined over overloaded functions. During instantiation, a new name is
293 created for an overloaded function partially applied to the dictionaries
294 needed in a usage position of that function. As the rewrite rule,
295 however, mentions the original overloaded name, it won't fire anymore
296 -- unless later phases remove the intermediate definition again. The
297 latest CVS version of GHC has an option
298 <code>-fno-method-sharing</code>, which avoids sharing instantiation
299 stubs. This is usually/often/sometimes sufficient to make the rules
304 Last modified: Mon May 9 11:02:20 EST 2005