X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=blobdiff_plain;f=docs%2Fcomm%2Fthe-beast%2Fvars.html;fp=docs%2Fcomm%2Fthe-beast%2Fvars.html;h=9bbd310c605a66c2b5ed0db6ca7f1a71018c009d;hp=0000000000000000000000000000000000000000;hb=0065d5ab628975892cea1ec7303f968c3338cbe1;hpb=28a464a75e14cece5db40f2765a29348273ff2d2 diff --git a/docs/comm/the-beast/vars.html b/docs/comm/the-beast/vars.html new file mode 100644 index 0000000..9bbd310 --- /dev/null +++ b/docs/comm/the-beast/vars.html @@ -0,0 +1,235 @@ + + + + + The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like + + + +

The GHC Commentary - The Real Story about Variables, Ids, TyVars, and the like

+ + +

Variables

+ +The Var type, defined in basicTypes/Var.lhs, +represents variables, both term variables and type variables: +

+    data Var
+      = Var {
+	    varName    :: Name,
+	    realUnique :: FastInt,
+	    varType    :: Type,
+	    varDetails :: VarDetails,
+	    varInfo    :: IdInfo		
+	}
+

The varName field contains the identity of the variable: +its unique number, and its print-name. See "The truth about names". + +
The realUnique field caches the unique number in the +varName field, just to make comparison of Vars a little faster. + +
The varType field gives the type of a term variable, or the kind of a +type variable. (Types and kinds are both represented by a Type.) + +
The varDetails field distinguishes term variables from type variables, +and makes some further distinctions (see below). + +
For term variables (only) the varInfo field contains lots of useful +information: strictness, unfolding, etc. However, this information is all optional; +you can always throw away the IdInfo. In contrast, you can't safely throw away +the VarDetails of a Var +

+It's often fantastically convenient to have term variables and type variables +share a single data type. For example, +

+  exprFreeVars :: CoreExpr -> VarSet
+

+If there were two types, we'd need to return two sets. Simiarly, big lambdas and +little lambdas use the same constructor in Core, which is extremely convenient. +

+We define a couple of type synonyms: +

+  type Id    = Var  -- Term variables
+  type TyVar = Var  -- Type variables
+

+just to help us document the occasions when we are expecting only term variables, +or only type variables. + + +

The `VarDetails` field

+ +The VarDetails field tells what kind of variable this is: +

+data VarDetails
+  = LocalId 		-- Used for locally-defined Ids (see NOTE below)
+	LocalIdDetails
+
+  | GlobalId 		-- Used for imported Ids, dict selectors etc
+	GlobalIdDetails
+
+  | TyVar
+  | MutTyVar (IORef (Maybe Type)) 	-- Used during unification;
+	     TyVarDetails
+

+ + +

Type variables (`TyVar`)

+ +

+The TyVar case is self-explanatory. The MutTyVar +case is used only during type checking. Then a type variable can be unified, +using an imperative update, with a type, and that is what the +IORef is for. The TcType.TyVarDetails field records +the sort of type variable we are dealing with. It is defined as +

+data TyVarDetails = SigTv | ClsTv | InstTv | VanillaTv
+

+SigTv marks type variables that were introduced when +instantiating a type signature prior to matching it against the inferred type +of a definition. The variants ClsTv and InstTv mark +scoped type variables introduced by class and instance heads, respectively. +These first three sorts of type variables are skolem variables (tested by the +predicate isSkolemTyVar); i.e., they must not be +instantiated. All other type variables are marked as VanillaTv. +

+For a long time I tried to keep mutable Vars statically type-distinct +from immutable Vars, but I've finally given up. It's just too painful. +After type checking there are no MutTyVars left, but there's no static check +of that fact. + +

Term variables (`Id`)

+ +A term variable (of type Id) is represented either by a +LocalId or a GlobalId: +

+A GlobalId is +

Always bound at top-level. +
Always has a GlobalName, and hence has + a Unique that is globally unique across the whole + GHC invocation (a single invocation may compile multiple modules). +
Has IdInfo that is absolutely fixed, forever. +

+ +

+A LocalId is: +

Always bound in the module being compiled: +
- either bound within an expression (lambda, case, local let(rec)) +
- or defined at top level in the module being compiled. +
+
Has IdInfo that changes as the simpifier bashes repeatedly on it. +

+The key thing about LocalIds is that the free-variable finder +typically treats them as candidate free variables. That is, it ignores +GlobalIds such as imported constants, data contructors, etc. +

+An important invariant is this: All the bindings in the module +being compiled (whether top level or not) are LocalIds +until the CoreTidy phase. In the CoreTidy phase, all +externally-visible top-level bindings are made into GlobalIds. This +is the point when a LocalId becomes "frozen" and becomes +a fixed, immutable GlobalId. +

+(A binding is "externally-visible" if it is exported, or +mentioned in the unfolding of an externally-visible Id. An +externally-visible Id may not have an unfolding, either because it is +too big, or because it is the loop-breaker of a recursive group.) + +

Global Ids and implicit Ids

+ +GlobalIds are further categorised by their GlobalIdDetails. +This type is defined in basicTypes/IdInfo, because it mentions other +structured types like DataCon. Unfortunately it is *used* in Var.lhs +so there's a hi-boot knot to get it there. Anyway, here's the declaration: +

+data GlobalIdDetails
+  = NotGlobalId			-- Used as a convenient extra return value 
+                                -- from globalIdDetails
+
+  | VanillaGlobal		-- Imported from elsewhere
+
+  | PrimOpId PrimOp		-- The Id for a primitive operator
+  | FCallId ForeignCall		-- The Id for a foreign call
+
+  -- These next ones are all "implicit Ids"
+  | RecordSelId FieldLabel	-- The Id for a record selector
+  | DataConId DataCon		-- The Id for a data constructor *worker*
+  | DataConWrapId DataCon	-- The Id for a data constructor *wrapper*
+				-- [the only reasons we need to know is so that
+				--  a) we can  suppress printing a definition in the interface file
+				--  b) when typechecking a pattern we can get from the
+				--     Id back to the data con]
+

+The GlobalIdDetails allows us to go from the Id for +a record selector, say, to its field name; or the Id for a primitive +operator to the PrimOp itself. +

+Certain GlobalIds are called "implicit" Ids. An implicit +Id is derived by implication from some other declaration. So a record selector is +derived from its data type declaration, for example. An implicit Ids is always +a GlobalId. For most of the compilation, the implicit Ids are just +that: implicit. If you do -ddump-simpl you won't see their definition. (That's +why it's true to say that until CoreTidy all Ids in this compilation unit are +LocalIds.) But at CorePrep, a binding is added for each implicit Id defined in +this module, so that the code generator will generate code for the (curried) function. +

+Implicit Ids carry their unfolding inside them, of course, so they may well have +been inlined much earlier; but we generate the curried top-level defn just in +case its ever needed. + +

LocalIds

+ +The LocalIdDetails gives more info about a LocalId: +

+data LocalIdDetails 
+  = NotExported	-- Not exported
+  | Exported	-- Exported
+  | SpecPragma	-- Not exported, but not to be discarded either
+		-- It's unclean that this is so deeply built in
+

+From this we can tell whether the LocalId is exported, and that +tells us whether we can drop an unused binding as dead code. +

+The SpecPragma thing is a HACK. Suppose you write a SPECIALIZE pragma: +

+   foo :: Num a => a -> a
+   {-# SPECIALIZE foo :: Int -> Int #-}
+   foo = ...
+

+The type checker generates a dummy call to foo at the right types: +

+   $dummy = foo Int dNumInt
+

+The Id $dummy is marked SpecPragma. Its role is to hang +onto that call to foo so that the specialiser can see it, but there +are no calls to $dummy. +The simplifier is careful not to discard SpecPragma Ids, so that it +reaches the specialiser. The specialiser processes the right hand side of a SpecPragma Id +to find calls to overloaded functions, and then discards the SpecPragma Id. +So SpecPragma behaves a like Exported, at least until the specialiser. + + +

ExternalNames and InternalNames

+ +Notice that whether an Id is a LocalId or GlobalId is +not the same as whether the Id has an ExternaName or an InternalName +(see "The truth about Names"): +

Every GlobalId has an ExternalName. +
A LocalId might have either kind of Name. +

+ + +Last modified: Fri Sep 12 15:17:18 BST 2003 + + + + +