X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=blobdiff_plain;f=docs%2Fcomm%2Fthe-beast%2Ftypecheck.html;fp=docs%2Fcomm%2Fthe-beast%2Ftypecheck.html;h=8d22784b8a6b0e9eeaffd6f39cc56f9af482cb1b;hp=0000000000000000000000000000000000000000;hb=0065d5ab628975892cea1ec7303f968c3338cbe1;hpb=28a464a75e14cece5db40f2765a29348273ff2d2 diff --git a/docs/comm/the-beast/typecheck.html b/docs/comm/the-beast/typecheck.html new file mode 100644 index 0000000..8d22784 --- /dev/null +++ b/docs/comm/the-beast/typecheck.html @@ -0,0 +1,316 @@ + + +
+ +
+ Probably the most important phase in the frontend is the type checker,
+ which is located at fptools/ghc/compiler/typecheck/
.
+ GHC type checks programs in their original Haskell form before the
+ desugarer converts them into Core code. This complicates the type
+ checker as it has to handle the much more verbose Haskell AST, but it
+ improves error messages, as those message are based on the same
+ structure that the user sees.
+
+ GHC defines the abstract syntax of Haskell programs in HsSyn
+ using a structure that abstracts over the concrete representation of
+ bound occurences of identifiers and patterns. The module TcHsSyn
+ defines a number of helper function required by the type checker. Note
+ that the type TcRnTypes
.TcId
+ used to represent identifiers in some signatures during type checking
+ is, in fact, nothing but a synonym for a plain
+ Id
.
+
+ It is also noteworthy, that the representations of types changes during
+ type checking from HsType
to TypeRep.Type
.
+ The latter is a hybrid type representation that
+ is used to type Core, but still contains sufficient information to
+ recover source types. In particular, the type checker maintains and
+ compares types in their Type
form.
+
+ The interface of the type checker (and renamer) to the rest of the compiler is provided
+ by TcRnDriver
.
+ Entire modules are processed by calling tcRnModule
and GHCi
+ uses tcRnStmt
, tcRnExpr
, and
+ tcRnType
to typecheck statements and expressions, and to
+ kind check types, respectively. Moreover, tcRnExtCore
is
+ provided to typecheck external Core code. Moreover,
+ tcTopSrcDecls
is used by Template Haskell - more
+ specifically by TcSplice.tc_bracket
+ - to type check the contents of declaration brackets.
+
+ The function tcRnModule
controls the complete static
+ analysis of a Haskell module. It sets up the combined renamer and type
+ checker monad, resolves all import statements, initiates the actual
+ renaming and type checking process, and finally, wraps off by processing
+ the export list.
+
+ The actual type checking and renaming process is initiated via
+ TcRnDriver.tcRnSrcDecls
, which uses a helper called
+ tc_rn_src_decls
to implement the iterative renaming and
+ type checking process required by Template
+ Haskell. However, before it invokes tc_rn_src_decls
,
+ it takes care of hi-boot files; afterwards, it simplifies type
+ constraints and zonking (see below regarding the later).
+
+ The function tc_rn_src_decls
partitions static analysis of
+ a whole module into multiple rounds, where the initial round is followed
+ by an additional one for each toplevel splice. It collects all
+ declarations up to the next splice into an HsDecl.HsGroup
+ to rename and type check that declaration group by calling
+ TcRnDriver.tcRnGroup
. Afterwards, it executes the
+ splice (if there are any left) and proceeds to the next group, which
+ includes the declarations produced by the splice.
+
+ The function tcRnGroup
, finally, gets down to invoke the
+ actual renaming and type checking via
+ TcRnDriver.rnTopSrcDecls
and
+ TcRnDriver.tcTopSrcDecls
, respectively. The renamer, apart
+ from renaming, computes the global type checking environment, of type
+ TcRnTypes.TcGblEnv
, which is stored in the type checking
+ monad before type checking commences.
+
+ The type checking of a declaration group, performed by
+ tcTopSrcDecls
starts by processing of the type and class
+ declarations of the current module, using the function
+ TcTyClsDecls.tcTyAndClassDecls
. This is followed by a
+ first round over instance declarations using
+ TcInstDcls.tcInstDecls1
, which in particular generates all
+ additional bindings due to the deriving process. Then come foreign
+ import declarations (TcForeign.tcForeignImports
) and
+ default declarations (TcDefaults.tcDefaults
).
+
+ Now, finally, toplevel value declarations (including derived ones) are
+ type checked using TcBinds.tcTopBinds
. Afterwards,
+ TcInstDcls.tcInstDecls2
traverses instances for the second
+ time. Type checking concludes with processing foreign exports
+ (TcForeign.tcForeignExports
) and rewrite rules
+ (TcRules.tcRules
). Finally, the global environment is
+ extended with the new bindings.
+
+ Type and class declarations are type checked in a couple of phases that
+ contain recursive dependencies - aka knots. The first knot
+ encompasses almost the whole type checking of these declarations and
+ forms the main piece of TcTyClsDecls.tcTyAndClassDecls
.
+
+ Inside this big knot, the first main operation is kind checking, which
+ again involves a knot. It is implemented by kcTyClDecls
,
+ which performs kind checking of potentially recursively-dependent type
+ and class declarations using kind variables for initially unknown kinds.
+ During processing the individual declarations some of these variables
+ will be instantiated depending on the context; the rest gets by default
+ kind *
(during zonking of the kind signatures).
+ Type synonyms are treated specially in this process, because they can
+ have an unboxed type, but they cannot be recursive. Hence, their kinds
+ are inferred in dependency order. Moreover, in contrast to class
+ declarations and other type declarations, synonyms are not entered into
+ the global environment as a global TyThing
.
+ (TypeRep.TyThing
is a sum type that combines the various
+ flavours of typish entities, such that they can be stuck into type
+ environments and similar.)
+
+ During type checking type variables are represented by mutable variables
+ - cf. the variable story. Consequently,
+ unification can instantiate type variables by updating those mutable
+ variables. This process of instantiation is (for reasons that elude me)
+ called zonking
+ in GHC's sources. The zonking routines for the various forms of Haskell
+ constructs are responsible for most of the code in the module TcHsSyn
,
+ whereas the routines that actually operate on mutable types are defined
+ in TcMType
;
+ this includes the zonking of type variables and type terms, routines to
+ create mutable structures and update them as well as routines that check
+ constraints, such as that type variables in function signatures have not
+ been instantiated during type checking. The actual type unification
+ routine is uTys
in the module TcUnify
.
+
+ All type variables that may be instantiated (those in signatures
+ may not), but haven't been instantiated during type checking, are zonked
+ to ()
, so that after type checking all mutable variables
+ have been eliminated.
+
+ The representation of types is fixed in the module TcRep
+ and exported as the data type Type
. As explained in TcType
,
+ GHC supports rank-N types, but, in the type checker, maintains the
+ restriction that type variables cannot be instantiated to quantified
+ types (i.e., the type system is predicative). The type checker floats
+ universal quantifiers outside and maintains types in prenex form.
+ (However, quantifiers can, of course, not float out of negative
+ positions.) Overall, we have
+
+++sigma -> forall tyvars. phi +phi -> theta => rho +rho -> sigma -> rho + | tau +tau -> tyvar + | tycon tau_1 .. tau_n + | tau_1 tau_2 + | tau_1 -> tau_2+
+ where sigma
is in prenex form; i.e., there is never a
+ forall to the right of an arrow in a phi
type. Moreover, a
+ type of the form tau
never contains a quantifier (which
+ includes arguments to type constructors).
+
+ Of particular interest are the variants SourceTy
and
+ NoteTy
of TypeRep
.Type
.
+ The constructor SourceTy :: SourceType -> Type
represents a
+ type constraint; that is, a predicate over types represented by a
+ dictionary. The type checker treats a SourceTy
as opaque,
+ but during the translation to core it will be expanded into its concrete
+ representation (i.e., a dictionary type) by the function Type
.sourceTypeRep
.
+ Note that newtypes are not covered by SourceType
s anymore,
+ even if some comments in GHC still suggest this. Instead, all newtype
+ applications are initially represented as a NewTcApp
, until
+ they are eliminated by calls to Type
.newTypeRep
.
+
+ The NoteTy
constructor is used to add non-essential
+ information to a type term. Such information has the type
+ TypeRep.TyNote
and is either the set of free type variables
+ of the annotated expression or the unexpanded version of a type synonym.
+ Free variables sets are cached as notes to save the overhead of
+ repeatedly computing the same set for a given term. Unexpanded type
+ synonyms are useful for generating comprehensible error messages, but
+ have no influence on the process of type checking.
+
+ During type checking, GHC maintains a type environment whose
+ type definitions are fixed in the module TcRnTypes
with the operations defined in
+TcEnv
.
+ Among other things, the environment contains all imported and local
+ instances as well as a list of global entities (imported and
+ local types and classes together with imported identifiers) and
+ local entities (locally defined identifiers). This environment
+ is threaded through the type checking monad, whose support functions
+ including initialisation can be found in the module TcRnMonad
.
+
+
+ Expressions are type checked by TcExpr
.
+
+ Usage occurences of identifiers are processed by the function
+ tcId
whose main purpose is to instantiate
+ overloaded identifiers. It essentially calls
+ TcInst.instOverloadedFun
once for each universally
+ quantified set of type constraints. It should be noted that overloaded
+ identifiers are replaced by new names that are first defined in the LIE
+ (Local Instance Environment?) and later promoted into top-level
+ bindings.
+
+
+ GHC implements overloading using so-called dictionaries. A + dictionary is a tuple of functions -- one function for each method in + the class of which the dictionary implements an instance. During type + checking, GHC replaces each type constraint of a function with one + additional argument. At runtime, the extended function gets passed a + matching class dictionary by way of these additional arguments. + Whenever the function needs to call a method of such a class, it simply + extracts it from the dictionary. +
+ This sounds simple enough; however, the actual implementation is a bit
+ more tricky as it wants to keep track of all the instances at which
+ overloaded functions are used in a module. This information is useful
+ to optimise the code. The implementation is the module Inst.lhs
.
+
+ The function instOverloadedFun
is invoked for each
+ overloaded usage occurence of an identifier, where overloaded means that
+ the type of the idendifier contains a non-trivial type constraint. It
+ proceeds in two steps: (1) Allocation of a method instance
+ (newMethodWithGivenTy
) and (2) instantiation of functional
+ dependencies. The former implies allocating a new unique identifier,
+ which replaces the original (overloaded) identifier at the currently
+ type-checked usage occurrence.
+
+ The new identifier (after being threaded through the LIE) eventually
+ will be bound by a top-level binding whose rhs contains a partial
+ application of the original overloaded identifier. This papp applies
+ the overloaded function to the dictionaries needed for the current
+ instance. In GHC lingo, this is called a method. Before
+ becoming a top-level binding, the method is first represented as a value
+ of type Inst.Inst
, which makes it easy to fold multiple
+ instances of the same identifier at the same types into one global
+ definition. (And probably other things, too, which I haven't
+ investigated yet.)
+
+
+ Note: As of 13 January 2001 (wrt. to the code in the
+ CVS HEAD), the above mechanism interferes badly with RULES pragmas
+ defined over overloaded functions. During instantiation, a new name is
+ created for an overloaded function partially applied to the dictionaries
+ needed in a usage position of that function. As the rewrite rule,
+ however, mentions the original overloaded name, it won't fire anymore
+ -- unless later phases remove the intermediate definition again. The
+ latest CVS version of GHC has an option
+ -fno-method-sharing
, which avoids sharing instantiation
+ stubs. This is usually/often/sometimes sufficient to make the rules
+ fire again.
+
+
+ +Last modified: Thu May 12 22:52:46 EST 2005 + + + +