<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
<title>The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like</title>
<body BGCOLOR="FFFFFF">
<h1>The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like</h1>
The <code>Var</code> type, defined in <code>basicTypes/Var.lhs</code>,
represents variables, both term variables and type variables:
<pre>
  data Var
    = Var {
        varName    :: Name,
        realUnique :: FastInt,
        varType    :: Type,
        varDetails :: VarDetails,
        varInfo    :: IdInfo
      }
</pre>
<ul>
<li> The <code>varName</code> field contains the identity of the variable:
its unique number and its print-name. See "<a href="names.html">The truth about names</a>".
<p><li> The <code>realUnique</code> field caches the unique number in the
<code>varName</code> field, just to make comparison of <code>Var</code>s a little faster.
<p><li> The <code>varType</code> field gives the type of a term variable, or the kind of a
type variable. (Types and kinds are both represented by a <code>Type</code>.)
<p><li> The <code>varDetails</code> field distinguishes term variables from type variables,
and makes some further distinctions (see below).
<p><li> For term variables (only) the <code>varInfo</code> field contains lots of useful
information: strictness, unfolding, etc. However, this information is all optional;
you can always throw away the <code>IdInfo</code>. In contrast, you can't safely throw away
the <code>VarDetails</code> of a <code>Var</code>.
</ul>
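<p>
To make the roles of <code>realUnique</code> and the optional <code>IdInfo</code> concrete, here is a small self-contained sketch. It is not GHC's code: <code>Name</code>, <code>IdInfo</code> and <code>VarDetails</code> are cut-down stand-ins, <code>Int</code> replaces <code>FastInt</code>, the <code>varType</code> field is left out, and <code>mkVar</code> and <code>dropInfo</code> are hypothetical helpers invented for the illustration.

```haskell
-- Simplified stand-ins; none of these are GHC's real definitions.
data Name = Name { nameUnique :: Int, nameString :: String }

data VarDetails = IsTermVar | IsTypeVar
  deriving (Eq, Show)

newtype IdInfo = IdInfo { unfoldingHint :: Maybe String }

data Var = Var
  { varName    :: Name
  , realUnique :: Int          -- cached copy of (nameUnique . varName)
  , varDetails :: VarDetails
  , varInfo    :: IdInfo
  }

-- Hypothetical smart constructor: keeps the cache consistent with the name.
mkVar :: Name -> VarDetails -> IdInfo -> Var
mkVar n d i =
  Var { varName = n, realUnique = nameUnique n, varDetails = d, varInfo = i }

-- Comparison looks only at the cached unique.
instance Eq Var where
  a == b = realUnique a == realUnique b

instance Ord Var where
  compare a b = compare (realUnique a) (realUnique b)

-- Hypothetical helper: discard the optional information.
-- The identity and the details of the variable are untouched.
dropInfo :: Var -> Var
dropInfo v = v { varInfo = IdInfo Nothing }
```

In this sketch <code>dropInfo x == x</code> holds for every <code>x</code>, which is one way to read "you can always throw away the <code>IdInfo</code>": nothing that identifies or classifies the variable lives there.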
<p>
It's often fantastically convenient to have term variables and type variables
share a single data type. For example,
<pre>
  exprFreeVars :: CoreExpr -> VarSet
</pre>
If there were two types, we'd need to return two sets. Similarly, big lambdas and
little lambdas use the same constructor in Core, which is extremely convenient.
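<p>
The following toy version shows why one variable type keeps a free-variable finder simple. Everything in it is an assumption made for illustration: the <code>Expr</code> and <code>Type</code> types are tiny stand-ins for Core, <code>VarKind</code> replaces <code>VarDetails</code>, and a plain list stands in for <code>VarSet</code>.

```haskell
import Data.List (union)

data VarKind = TermVar | TypeVar
  deriving (Eq, Show)

-- One type for both sorts of variable.
data Var = Var { varKind :: VarKind, varStr :: String }
  deriving (Eq, Show)

data Type = TyVarTy Var | FunTy Type Type

data Expr
  = V Var
  | App Expr Expr
  | TyApp Expr Type   -- application to a type argument (simplified)
  | Lam Var Expr      -- one constructor for big and little lambdas

tyFreeVars :: Type -> [Var]
tyFreeVars (TyVarTy a) = [a]
tyFreeVars (FunTy s t) = tyFreeVars s `union` tyFreeVars t

-- A single result collection holds term and type variables alike.
exprFreeVars :: Expr -> [Var]
exprFreeVars (V x)       = [x]
exprFreeVars (App f a)   = exprFreeVars f `union` exprFreeVars a
exprFreeVars (TyApp e t) = exprFreeVars e `union` tyFreeVars t
exprFreeVars (Lam b e)   = filter (/= b) (exprFreeVars e)
```

The <code>Lam</code> case does not need to know whether it is removing a type variable or a term variable, which is the convenience the text describes.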
<p>
We define a couple of type synonyms:
<pre>
  type Id    = Var  -- Term variables
  type TyVar = Var  -- Type variables
</pre>
just to help us document the occasions when we are expecting only term variables,
or only type variables.
<h2> The <code>VarDetails</code> field </h2>
The <code>VarDetails</code> field tells what kind of variable this is:
<pre>
data VarDetails
  = LocalId                          -- Used for locally-defined Ids (see NOTE below)
        LocalIdDetails
  | GlobalId                         -- Used for imported Ids, dict selectors etc
        GlobalIdDetails
  | TyVar
  | MutTyVar (IORef (Maybe Type))    -- Used during unification;
             Bool                    -- True <=> this is a type signature variable, which
                                     --          should not be unified with a non-tyvar type
</pre>
<h2>Type variables (<code>TyVar</code>)</h2>
The <code>TyVar</code> case is self-explanatory. The
<code>MutTyVar</code> case is used only during type checking. Then a
type variable can be unified, using an imperative update, with a type,
and that is what the <code>IORef</code> is for. The <code>Bool</code>
field records whether the type variable arose from a type signature,
in which case it should not be unified with a non-variable type (only with another
type variable).
<p>
For a long time I tried to keep mutable Vars statically type-distinct
from immutable Vars, but I've finally given up. It's just too painful.
After type checking there are no <code>MutTyVar</code>s left, but there's no static check
that enforces this.
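<p>
Here is a rough sketch of the imperative update that the <code>IORef</code> makes possible, and of how the <code>Bool</code> flag blocks it. This is an assumed, heavily simplified model, not the type checker's unifier: <code>Type</code> has only two forms, <code>newMutTyVar</code> and <code>bindTyVar</code> are hypothetical names, and there is no occurs check and no following of variables that are already filled in.

```haskell
import Data.IORef (IORef, newIORef, readIORef, writeIORef)

data Type
  = TyConTy String            -- a type constant such as Int
  | TyVarTy TyVar

data TyVar = MutTyVar
  { tvName  :: String
  , tvRef   :: IORef (Maybe Type)  -- Nothing: not yet unified
  , tvIsSig :: Bool                -- True: arose from a type signature
  }

newMutTyVar :: String -> Bool -> IO TyVar
newMutTyVar n sig = do
  ref <- newIORef Nothing
  pure (MutTyVar n ref sig)

-- Try to bind a type variable to a type by updating its IORef.
-- Returns False when the binding is refused.
bindTyVar :: TyVar -> Type -> IO Bool
bindTyVar tv ty = do
  current <- readIORef (tvRef tv)
  case (current, ty) of
    (Just _, _)          -> pure False   -- already filled in (not handled here)
    (Nothing, TyVarTy _) -> fill         -- variable-to-variable is always allowed
    (Nothing, _)
      | tvIsSig tv       -> pure False   -- signature variable meets a non-variable type
      | otherwise        -> fill
  where
    fill = writeIORef (tvRef tv) (Just ty) >> pure True
```

With a signature variable <code>a</code> and an ordinary variable <code>b</code>, binding <code>a</code> to <code>Int</code> is refused, while binding <code>b</code> to <code>Int</code> or <code>a</code> to <code>b</code> succeeds.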
<h2>Term variables (<code>Id</code>)</h2>
A term variable (of type <code>Id</code>) is represented either by a
<code>LocalId</code> or a <code>GlobalId</code>.
<p>
A <code>GlobalId</code>:
<ul>
<li> Is always bound at top level.
<li> Always has an <code>ExternalName</code>, and hence has
     a <code>Unique</code> that is globally unique across the whole
     GHC invocation (a single invocation may compile multiple modules).
<li> Has <code>IdInfo</code> that is absolutely fixed, forever.
</ul>
<p>
A <code>LocalId</code>:
<ul>
<li> Is always bound in the module being compiled:
     <ul>
     <li> <em>either</em> bound within an expression (lambda, case, local let(rec))
     <li> <em>or</em> defined at top level in the module being compiled.
     </ul>
<li> Has <code>IdInfo</code> that changes as the simplifier bashes repeatedly on it.
</ul>
<p>
The key thing about <code>LocalId</code>s is that the free-variable finder
typically treats them as candidate free variables. That is, it ignores
<code>GlobalId</code>s such as imported constants, data constructors, etc.
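<p>
A minimal sketch of that behaviour, under invented types: <code>IdScope</code> stands in for the <code>LocalId</code>/<code>GlobalId</code> distinction held in <code>VarDetails</code>, <code>Expr</code> is a three-constructor toy, and <code>freeLocals</code> is a hypothetical name.

```haskell
import Data.List (union)

data IdScope = LocalId | GlobalId
  deriving (Eq, Show)

data Id = Id { idScope :: IdScope, idName :: String }
  deriving (Eq, Show)

data Expr
  = Var Id
  | App Expr Expr
  | Lam Id Expr

-- Free variables, with only LocalIds considered as candidates.
freeLocals :: Expr -> [Id]
freeLocals (Var v)
  | idScope v == LocalId = [v]
  | otherwise            = []    -- GlobalIds are never reported
freeLocals (App f a) = freeLocals f `union` freeLocals a
freeLocals (Lam b e) = filter (/= b) (freeLocals e)
```

For an expression shaped like <code>\x -> plus x y</code>, where <code>plus</code> is a <code>GlobalId</code> and <code>y</code> a <code>LocalId</code>, this reports only <code>y</code>.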
<p>
An important invariant is this: <em>All the bindings in the module
being compiled (whether top level or not) are <code>LocalId</code>s
until the CoreTidy phase.</em> In the CoreTidy phase, all
externally-visible top-level bindings are made into <code>GlobalId</code>s. This
is the point when a <code>LocalId</code> becomes "frozen" and becomes
a fixed, immutable <code>GlobalId</code>.
<p>
(A binding is <em>"externally-visible"</em> if it is exported, or
mentioned in the unfolding of an externally-visible Id. An
externally-visible Id might have no unfolding, either because it is
too big, or because it is the loop-breaker of a recursive group.)
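<p>
The definition of "externally-visible" is recursive, so it can be read as a fixed point. The sketch below computes it over a made-up <code>Binding</code> record in which Ids are plain strings; <code>externallyVisible</code> and the field names are assumptions for illustration and do not describe how CoreTidy is actually written.

```haskell
import Data.List (union)

data Binding = Binding
  { bndrName  :: String
  , exported  :: Bool
  , unfolding :: Maybe [String]  -- Ids mentioned in the unfolding, if there is one
  }

-- Start from the exported binders and keep adding binders of this module
-- that are mentioned in the unfolding of something already visible.
externallyVisible :: [Binding] -> [String]
externallyVisible binds = go [bndrName b | b <- binds, exported b]
  where
    localNames = map bndrName binds
    go visible
      | length visible' == length visible = visible
      | otherwise                         = go visible'
      where
        mentioned = concat [ names
                           | b <- binds
                           , bndrName b `elem` visible
                           , Just names <- [unfolding b] ]
        visible'  = visible `union` filter (`elem` localNames) mentioned
```

A binding with no unfolding (the <code>Nothing</code> case) exposes nothing further, however many Ids its right-hand side mentions.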
<h3>Global Ids and implicit Ids</h3>
<code>GlobalId</code>s are further categorised by their <code>GlobalIdDetails</code>.
This type is defined in <code>basicTypes/IdInfo</code>, because it mentions other
structured types like <code>DataCon</code>. Unfortunately it is <em>used</em> in <code>Var.lhs</code>
so there's a <code>hi-boot</code> knot to get it there. Anyway, here's the declaration:
<pre>
data GlobalIdDetails
  = NotGlobalId                 -- Used as a convenient extra return value
                                -- from globalIdDetails
  | VanillaGlobal               -- Imported from elsewhere
  | PrimOpId PrimOp             -- The Id for a primitive operator
  | FCallId ForeignCall         -- The Id for a foreign call
  -- These next ones are all "implicit Ids"
  | RecordSelId FieldLabel      -- The Id for a record selector
  | DataConId DataCon           -- The Id for a data constructor *worker*
  | DataConWrapId DataCon       -- The Id for a data constructor *wrapper*
                                -- [the only reasons we need to know are so that
                                --    a) we can suppress printing a definition in the interface file
                                --    b) when typechecking a pattern we can get from the
                                --       Id back to the data con]
</pre>
The <code>GlobalIdDetails</code> allows us to go from the <code>Id</code> for
a record selector, say, to its field name; or the <code>Id</code> for a primitive
operator to the <code>PrimOp</code> itself.
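<p>
As a sketch of that lookup, here is a cut-down model with two hypothetical accessors, <code>primOpOf</code> and <code>fieldLabelOf</code>. The <code>PrimOp</code> type, the use of <code>String</code> for field labels, and the <code>Id</code> record are all simplifications assumed for the example.

```haskell
data PrimOp = IntAddOp | IntSubOp
  deriving (Eq, Show)

data GlobalIdDetails
  = NotGlobalId
  | VanillaGlobal
  | PrimOpId PrimOp
  | RecordSelId String      -- the field label, simplified to a String
  deriving (Eq, Show)

data Id = Id { idName :: String, idDetails :: GlobalIdDetails }

-- From the Id of a primitive operator back to the operator itself.
primOpOf :: Id -> Maybe PrimOp
primOpOf i = case idDetails i of
  PrimOpId op -> Just op
  _           -> Nothing

-- From the Id of a record selector back to its field name.
fieldLabelOf :: Id -> Maybe String
fieldLabelOf i = case idDetails i of
  RecordSelId lbl -> Just lbl
  _               -> Nothing
```

Both functions return <code>Nothing</code> for a <code>VanillaGlobal</code>, since an ordinary imported Id carries no such extra information.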
<p>
Certain <code>GlobalId</code>s are called <em>"implicit"</em> Ids. An implicit
Id is derived by implication from some other declaration. So a record selector is
derived from its data type declaration, for example. An implicit Id is always
a <code>GlobalId</code>. For most of the compilation, the implicit Ids are just
that: implicit. If you do <code>-ddump-simpl</code> you won't see their definitions. (That's
why it's true to say that until CoreTidy all Ids in this compilation unit are
<code>LocalId</code>s.) But at CorePrep, a binding is added for each implicit Id defined in
this module, so that the code generator will generate code for the (curried) function.
<p>
Implicit Ids carry their unfolding inside them, of course, so they may well have
been inlined much earlier; but we generate the curried top-level definition just in
case it's ever needed.
<p>
The <code>LocalIdDetails</code> gives more info about a <code>LocalId</code>:
<pre>
data LocalIdDetails
  = NotExported   -- Not exported
  | Exported      -- Exported
  | SpecPragma    -- Not exported, but not to be discarded either
                  -- It's unclean that this is so deeply built in
</pre>
From this we can tell whether the <code>LocalId</code> is exported, and that
tells us whether we can drop an unused binding as dead code.
<p>
The <code>SpecPragma</code> thing is a HACK. Suppose you write a SPECIALIZE pragma:
<pre>
  foo :: Num a => a -> a
  {-# SPECIALIZE foo :: Int -> Int #-}
</pre>
The type checker generates a dummy call to <code>foo</code> at the right types:
<pre>
  $dummy = foo Int dNumInt
</pre>
The Id <code>$dummy</code> is marked <code>SpecPragma</code>. Its role is to hang
onto that call to <code>foo</code> so that the specialiser can see it, but there
are no calls to <code>$dummy</code>.
The simplifier is careful not to discard <code>SpecPragma</code> Ids, so that the
dummy binding reaches the specialiser. The specialiser processes the right hand side of a <code>SpecPragma</code> Id
to find calls to overloaded functions, <em>and then discards the <code>SpecPragma</code> Id</em>.
So <code>SpecPragma</code> behaves like <code>Exported</code>, at least until the specialiser.
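<p>
The dead-code rule this implies can be written as a one-line predicate. The version below is a hedged sketch: <code>canDiscard</code> is a hypothetical name, and the occurrence count is assumed to be supplied by some earlier analysis.

```haskell
data LocalIdDetails = NotExported | Exported | SpecPragma
  deriving (Eq, Show)

-- A binding may be dropped only if nothing refers to it and it is
-- neither exported nor being kept alive for the specialiser.
canDiscard :: LocalIdDetails -> Int -> Bool
canDiscard details occurrences =
  occurrences == 0 && details == NotExported
```

So a binding like <code>$dummy</code>, with no occurrences but marked <code>SpecPragma</code>, gives <code>canDiscard SpecPragma 0 == False</code>, just as an <code>Exported</code> binding would.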
<h3> ExternalNames and InternalNames </h3>
Notice that whether an Id is a <code>LocalId</code> or <code>GlobalId</code> is
not the same as whether the Id has an <code>ExternalName</code> or an <code>InternalName</code>
(see "<a href="names.html#sort">The truth about Names</a>"):
<ul>
<li> Every <code>GlobalId</code> has an <code>ExternalName</code>.
<li> A <code>LocalId</code> might have either kind of <code>Name</code>.
</ul>
<p>
Last modified: Tue Nov 13 14:11:35 EST 2001