--- /dev/null
+<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML//EN">
+<html>
+ <head>
+ <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=ISO-8859-1">
+ <title>The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like</title>
+ </head>
+
+ <body BGCOLOR="FFFFFF">
+ <h1>The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like</h1>
+ <p>
+
+
+<h2>Variables</h2>
+
+The <code>Var</code> type, defined in <code>basicTypes/Var.lhs</code>,
+represents variables, both term variables and type variables:
+<pre>
+ data Var
+ = Var {
+ varName :: Name,
+ realUnique :: FastInt,
+ varType :: Type,
+ varDetails :: VarDetails,
+ varInfo :: IdInfo
+ }
+</pre>
+<ul>
+<li> The <code>varName</code> field contains the identity of the variable:
+its unique number, and its print-name. The unique number is cached in the
+<code>realUnique</code> field, just to make comparison of <code>Var</code>s a little faster.
+
+<p><li> The <code>varType</code> field gives the type of a term variable, or the kind of a
+type variable. (Types and kinds are both represented by a <code>Type</code>.)
+
+<p><li> The <code>varDetails</code> field distinguishes term variables from type variables,
+and makes some further distinctions (see below).
+
+<p><li> For term variables (only) the <code>varInfo</code> field contains lots of useful
+information: strictness, unfolding, etc. However, this information is all optional;
+you can always throw away the <code>IdInfo</code>. In contrast, you can't safely throw away
+the <code>VarDetails</code> of a <code>Var</code>.
+</ul>
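+<p>
+To make the point about <code>realUnique</code> concrete, here is a minimal sketch
+(not the code in <code>Var.lhs</code>) of comparing <code>Var</code>s through the cached unique.
+The cut-down <code>Name</code>, the <code>String</code> standing in for <code>Type</code>,
+the plain <code>Int</code> standing in for <code>FastInt</code>, and the helper
+<code>mkVar</code> are all assumptions made for illustration:
+<pre>
+-- Toy stand-ins: GHC's real Name and Type carry far more information.
+data Name = Name
+  { nameUnique :: Int      -- the unique number
+  , nameOcc    :: String   -- the print-name
+  }
+
+data Var = Var
+  { varName    :: Name
+  , realUnique :: Int      -- cached copy of the unique (a FastInt in GHC)
+  , varType    :: String   -- placeholder for Type
+  }
+
+-- Hypothetical smart constructor: fills in the cache from the Name.
+mkVar :: Name -> String -> Var
+mkVar name ty = Var { varName = name, realUnique = nameUnique name, varType = ty }
+
+-- Comparison looks only at the cached unique, never inside the Name.
+instance Eq Var where
+  a == b = realUnique a == realUnique b
+
+instance Ord Var where
+  compare a b = compare (realUnique a) (realUnique b)
+</pre>
+With these instances two <code>Var</code>s are equal exactly when their uniques are equal;
+the print-name and the type are never inspected.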
+<p>
+It's often fantastically convenient to have term variables and type variables
+share a single data type. For example,
+<pre>
+ exprFreeVars :: CoreExpr -> VarSet
+</pre>
+If there were two types, we'd need to return two sets. Similarly, big lambdas and
+little lambdas use the same constructor in Core, which is extremely convenient.
+<p>
+We define a couple of type synonyms:
+<pre>
+ type Id = Var -- Term variables
+ type TyVar = Var -- Type variables
+</pre>
+just to help us document the occasions when we are expecting only term variables,
+or only type variables.
+
+<h2> The <code>VarDetails</code> field </h2>
+
+The <code>VarDetails</code> field tells what kind of variable this is:
+<pre>
+data VarDetails
+ = LocalId -- Used for locally-defined Ids (see NOTE below)
+ LocalIdDetails
+
+ | GlobalId -- Used for imported Ids, dict selectors etc
+ GlobalIdDetails
+
+ | TyVar
+ | MutTyVar (IORef (Maybe Type)) -- Used during unification;
+ Bool -- True <=> this is a type signature variable, which
+ -- should not be unified with a non-tyvar type
+</pre>
+
+<h2>Type variables (<code>TyVar</code>)</h2>
+
+The <code>TyVar</code> case is self-explanatory. The
+<code>MutTyVar</code> case is used only during type checking. Then a
+type variable can be unified, using an imperative update, with a type,
+and that is what the <code>IORef</code> is for. The <code>Bool</code>
+field records whether the type variable arose from a type signature,
+in which case it should not be unified with a type (only with another
+type variable).
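+<p>
+The imperative update can be sketched as follows. This is a toy model under stated
+assumptions, not the type checker's real unifier: the two-constructor <code>Type</code>,
+the record field names, and the functions <code>newMutTyVar</code>, <code>isTyVarTy</code>
+and <code>bindTyVar</code> are invented for illustration, and a variable that is already
+filled in is simply refused rather than unified recursively.
+<pre>
+import Data.IORef (IORef, newIORef, readIORef, writeIORef)
+
+-- Toy type language; GHC's Type is much richer.
+data Type
+  = TyVarTy TyVar
+  | TyConApp String [Type]
+
+-- A mutable type variable. Nothing in the IORef means "not yet unified".
+data TyVar = MutTyVar
+  { tvName  :: String
+  , tvRef   :: IORef (Maybe Type)
+  , tvIsSig :: Bool   -- True means it arose from a type signature
+  }
+
+newMutTyVar :: String -> Bool -> IO TyVar
+newMutTyVar name isSig = do
+  ref <- newIORef Nothing
+  return (MutTyVar name ref isSig)
+
+isTyVarTy :: Type -> Bool
+isTyVarTy (TyVarTy _) = True
+isTyVarTy _           = False
+
+-- Try to bind a variable to a type by updating its IORef in place.
+-- Returns False if the variable is already filled in, or if it is a
+-- signature variable and the type is not itself a type variable.
+bindTyVar :: TyVar -> Type -> IO Bool
+bindTyVar tv ty = do
+  contents <- readIORef (tvRef tv)
+  case contents of
+    Just _ -> return False
+    Nothing
+      | tvIsSig tv, not (isTyVarTy ty) -> return False
+      | otherwise -> do
+          writeIORef (tvRef tv) (Just ty)
+          return True
+</pre>
+So an ordinary variable can be bound to <code>Int</code>, whereas a signature variable
+refuses <code>Int</code> but accepts another type variable.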
+<p>
+For a long time I tried to keep mutable Vars statically type-distinct
+from immutable Vars, but I've finally given up. It's just too painful.
+After type checking there are no MutTyVars left, but there's no static check
+of that fact.
+
+<h2>Term variables (<code>Id</code>)</h2>
+
+A term variable (of type <code>Id</code>) is represented either by a
+<code>LocalId</code> or a <code>GlobalId</code>:
+<p>
+A <code>GlobalId</code> is:
+<ul>
+<li> Always bound at top-level.
+<li> Always has a <code>GlobalName</code>, and hence has
+ a <code>Unique</code> that is globally unique across the whole
+ GHC invocation (a single invocation may compile multiple modules).
+<li> Has <code>IdInfo</code> that is absolutely fixed, forever.
+</ul>
+
+<p>
+A <code>LocalId</code> is:
+<ul>
+<li> Always bound in the module being compiled:
+<ul>
+<li> <em>either</em> bound within an expression (lambda, case, local let(rec))
+<li> <em>or</em> defined at top level in the module being compiled.
+</ul>
+<li> Has IdInfo that changes as the simplifier bashes repeatedly on it.
+</ul>
+<p>
+The key thing about <code>LocalId</code>s is that the free-variable finder
+typically treats them as candidate free variables. That is, it ignores
+<code>GlobalId</code>s such as imported constants, data constructors, etc.
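+<p>
+Here is a rough sketch of such a free-variable finder. It is not GHC's
+<code>exprFreeVars</code>: the tiny <code>Expr</code> type, the field names, the list
+result (instead of a <code>VarSet</code>) and the names <code>isCandidate</code> and
+<code>freeVarsSketch</code> are assumptions. It also shows one <code>Lam</code>
+constructor binding both term and type variables.
+<pre>
+import Data.List (nub)
+
+data VarDetails = LocalId | GlobalId | TyVar
+  deriving Eq
+
+data Var = Var
+  { varUnique  :: Int
+  , varOcc     :: String
+  , varDetails :: VarDetails
+  }
+
+instance Eq Var where
+  a == b = varUnique a == varUnique b
+
+-- Toy Core: Lam binds a term variable or a type variable alike.
+data Expr
+  = V Var          -- occurrence of a term variable
+  | TyArg Var      -- a type variable used as a type argument
+  | App Expr Expr
+  | Lam Var Expr
+
+-- GlobalIds are never candidate free variables.
+isCandidate :: Var -> Bool
+isCandidate v = varDetails v /= GlobalId
+
+-- One result covers both term and type variables.
+freeVarsSketch :: Expr -> [Var]
+freeVarsSketch = nub . go
+  where
+    go (V v)
+      | isCandidate v = [v]
+      | otherwise     = []
+    go (TyArg tv)   = [tv]
+    go (App f a)    = go f ++ go a
+    go (Lam b body) = filter (/= b) (go body)
+</pre>
+Because everything is a <code>Var</code>, a single result collection is enough for both
+sorts of variable.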
+<p>
+An important invariant is this: <em>All the bindings in the module
+being compiled (whether top level or not) are <code>LocalId</code>s
+until the CoreTidy phase.</em> In the CoreTidy phase, all
+externally-visible top-level bindings are made into GlobalIds. This
+is the point when a <code>LocalId</code> becomes "frozen" and becomes
+a fixed, immutable <code>GlobalId</code>.
+<p>
+(A binding is <em>"externally-visible"</em> if it is exported, or
+mentioned in the unfolding of an externally-visible Id. An
+externally-visible Id may not have an unfolding, either because it is
+too big, or because it is the loop-breaker of a recursive group.)
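+<p>
+The definition of "externally-visible" is a reachability computation, which can be
+sketched like this. It is a toy model, not what CoreTidy really does: a top-level binding
+is reduced to its name, an exported flag, and the names its unfolding mentions, and
+<code>Binding</code>, its fields and <code>externallyVisible</code> are all hypothetical.
+<pre>
+data Binding = Binding
+  { bndrName     :: String
+  , isExported   :: Bool
+  , unfoldingFVs :: Maybe [String]  -- Nothing: no unfolding is exposed
+  }
+
+-- Start from the exported binders and follow unfoldings transitively.
+externallyVisible :: [Binding] -> [String]
+externallyVisible binds = go [bndrName b | b <- binds, isExported b] []
+  where
+    topLevel = map bndrName binds
+
+    go [] seen = reverse seen
+    go (n : todo) seen
+      | n `elem` seen = go todo seen
+      | otherwise     = go (mentionedBy n ++ todo) (n : seen)
+
+    -- Top-level binders of this module mentioned in n's unfolding.
+    mentionedBy n =
+      [ fv
+      | b <- binds, bndrName b == n
+      , Just fvs <- [unfoldingFVs b]
+      , fv <- fvs, fv `elem` topLevel
+      ]
+</pre>
+The binders this returns are the ones that would be frozen into <code>GlobalId</code>s;
+a binder that is reachable but has no unfolding of its own is still included.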
+
+<h3>Global Ids and implicit Ids</h3>
+
+<code>GlobalId</code>s are further categorised by their <code>GlobalIdDetails</code>.
+This type is defined in <code>basicTypes/IdInfo</code>, because it mentions other
+structured types like <code>DataCon</code>. Unfortunately it is <em>used</em> in <code>Var.lhs</code>
+so there's a <code>hi-boot</code> knot to get it there. Anyway, here's the declaration:
+<pre>
+data GlobalIdDetails
+ = NotGlobalId -- Used as a convenient extra return value
+ -- from globalIdDetails
+
+ | VanillaGlobal -- Imported from elsewhere
+
+ | PrimOpId PrimOp -- The Id for a primitive operator
+ | FCallId ForeignCall -- The Id for a foreign call
+
+ -- These next ones are all "implicit Ids"
+ | RecordSelId FieldLabel -- The Id for a record selector
+ | DataConId DataCon -- The Id for a data constructor *worker*
+ | DataConWrapId DataCon -- The Id for a data constructor *wrapper*
+ -- [the only reasons we need to know is so that
+ -- a) we can suppress printing a definition in the interface file
+ -- b) when typechecking a pattern we can get from the
+ -- Id back to the data con]
+</pre>
+The <code>GlobalIdDetails</code> allows us to go from the <code>Id</code> for
+a record selector, say, to its field name; or the <code>Id</code> for a primitive
+operator to the <code>PrimOp</code> itself.
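+<p>
+A minimal sketch of that lookup follows, using cut-down stand-ins for the real types.
+The body of <code>globalIdDetails</code> is a guess based on the comment on
+<code>NotGlobalId</code> above, and <code>fieldOfSelector</code> and <code>primOpOf</code>
+are hypothetical helper names, not GHC functions:
+<pre>
+-- Placeholders for the real FieldLabel and PrimOp types.
+newtype FieldLabel = FieldLabel String deriving (Eq, Show)
+newtype PrimOp     = PrimOp String     deriving (Eq, Show)
+
+data GlobalIdDetails
+  = NotGlobalId
+  | VanillaGlobal
+  | PrimOpId PrimOp
+  | RecordSelId FieldLabel
+  deriving (Eq, Show)
+
+data VarDetails
+  = LocalId
+  | GlobalId GlobalIdDetails
+  | TyVar
+
+data Var = Var { varOcc :: String, varDetails :: VarDetails }
+type Id = Var
+
+-- NotGlobalId is the "convenient extra return value" for non-global Vars.
+globalIdDetails :: Var -> GlobalIdDetails
+globalIdDetails v = case varDetails v of
+  GlobalId details -> details
+  _                -> NotGlobalId
+
+-- From a record selector's Id back to its field name.
+fieldOfSelector :: Id -> Maybe FieldLabel
+fieldOfSelector v = case globalIdDetails v of
+  RecordSelId lbl -> Just lbl
+  _               -> Nothing
+
+-- From a primitive operator's Id back to the PrimOp itself.
+primOpOf :: Id -> Maybe PrimOp
+primOpOf v = case globalIdDetails v of
+  PrimOpId op -> Just op
+  _           -> Nothing
+</pre>
+Both helpers return <code>Nothing</code> for a <code>LocalId</code> or a type variable,
+since <code>globalIdDetails</code> maps those to <code>NotGlobalId</code>.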
+<p>
+Certain <code>GlobalId</code>s are called <em>"implicit"</em> Ids. An implicit
+Id is derived by implication from some other declaration. So a record selector is
+derived from its data type declaration, for example. An implicit Id is always
+a <code>GlobalId</code>. For most of the compilation, the implicit Ids are just
+that: implicit. If you do -ddump-simpl you won't see their definition. (That's
+why it's true to say that until CoreTidy all Ids in this compilation unit are
+LocalIds.) But at CorePrep, a binding is added for each implicit Id defined in
+this module, so that the code generator will generate code for the (curried) function.
+<p>
+Implicit Ids carry their unfolding inside them, of course, so they may well have
+been inlined much earlier; but we generate the curried top-level defn just in
+case it's ever needed.
+
+<h3>LocalIds</h3>
+
+The <code>LocalIdDetails</code> gives more info about a <code>LocalId</code>:
+<pre>
+data LocalIdDetails
+ = NotExported -- Not exported
+ | Exported -- Exported
+ | SpecPragma -- Not exported, but not to be discarded either
+ -- It's unclean that this is so deeply built in
+</pre>
+From this we can tell whether the <code>LocalId</code> is exported, and that
+tells us whether we can drop an unused binding as dead code.
+<p>
+The <code>SpecPragma</code> thing is a HACK. Suppose you write a SPECIALIZE pragma:
+<pre>
+ foo :: Num a => a -> a
+ {-# SPECIALIZE foo :: Int -> Int #-}
+ foo = ...
+</pre>
+The type checker generates a dummy call to <code>foo</code> at the right types:
+<pre>
+ $dummy = foo Int dNumInt
+</pre>
+The Id <code>$dummy</code> is marked <code>SpecPragma</code>. Its role is to hang
+onto that call to <code>foo</code> so that the specialiser can see it, but there
+are no calls to <code>$dummy</code>.
+The simplifier is careful not to discard <code>SpecPragma</code> Ids, so that they
+reach the specialiser. The specialiser processes the right hand side of a <code>SpecPragma</code> Id
+to find calls to overloaded functions, <em>and then discards the <code>SpecPragma</code> Id</em>.
+So <code>SpecPragma</code> behaves like <code>Exported</code>, at least until the specialiser.
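+<p>
+The dead-code decision can be sketched as below. This is only an illustration of the
+rule, not the simplifier's code: the <code>Binding</code> record with its occurrence
+count, and the functions <code>mustKeep</code> and <code>dropDeadBindings</code>, are
+assumptions.
+<pre>
+data LocalIdDetails
+  = NotExported
+  | Exported
+  | SpecPragma
+  deriving (Eq, Show)
+
+-- Must the binding survive even if nothing in the module uses it?
+mustKeep :: LocalIdDetails -> Bool
+mustKeep NotExported = False
+mustKeep Exported    = True
+mustKeep SpecPragma  = True   -- kept so the specialiser can see it
+
+-- Hypothetical summary of a binding: name, details, number of uses.
+data Binding = Binding
+  { bndrName    :: String
+  , bndrDetails :: LocalIdDetails
+  , occurrences :: Int
+  }
+
+-- Drop bindings that are unused and not otherwise required.
+dropDeadBindings :: [Binding] -> [Binding]
+dropDeadBindings = filter keep
+  where
+    keep b = occurrences b > 0 || mustKeep (bndrDetails b)
+</pre>
+An unused <code>NotExported</code> binding is dropped, while an unused
+<code>$dummy</code> marked <code>SpecPragma</code> is kept.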
+
+
+<h3>Global and Local <code>Name</code>s</h3>
+
+Notice that whether an Id is a <code>LocalId</code> or <code>GlobalId</code> is
+not the same as whether the Id has a <code>Local</code> or <code>Global</code> <code>Name</code>:
+<ul>
+<li> Every <code>GlobalId</code> has a <code>Global</code> <code>Name</code>.
+<li> A <code>LocalId</code> might have either kind of <code>Name</code>.
+</ul>
+The significance of Global vs Local names is this:
+<ul>
+<li> A <code>Global</code> Name has a module and occurrence name; a <code>Local</code>
+has only an occurrence name.
+<p> <li> A <code>Global</code> Name has a unique that never changes. It is never
+cloned. This is important, because the simplifier invents new names pretty freely,
+but we don't want to lose the connection with the type environment (constructed earlier).
+A <code>Local</code> name can be cloned freely.
+</ul>
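+<p>
+As a small illustration of the cloning rule, here is a sketch with a two-constructor
+<code>Name</code>. The representation, the field names and <code>cloneName</code> are
+assumptions made for this example, not the definitions GHC uses:
+<pre>
+data Name
+  = Global { nameUnique :: Int, nameModule :: String, nameOcc :: String }
+  | Local  { nameUnique :: Int, nameOcc :: String }
+
+-- Give a Name a fresh unique, but only if it is Local.
+cloneName :: Int -> Name -> Name
+cloneName _     n@(Global {}) = n                -- never cloned
+cloneName fresh (Local _ occ) = Local fresh occ  -- cloned freely
+</pre>
+A <code>Global</code> name comes back unchanged, so a lookup in the type environment
+keyed on its unique still succeeds after the simplifier has renamed things.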
+
+
+<!-- hhmts start -->
+Last modified: Wed Aug 8 19:23:01 EST 2001
+<!-- hhmts end -->
+ </small>
+ </body>
+</html>