From e4d057ecd70b0e777f8bd56c3a079608268900c3 Mon Sep 17 00:00:00 2001
From: chak
-Last modified: Sat Mar 5 19:52:33 EST 2005
+Last modified: Wed May 4 11:48:54 EST 2005
diff --git a/ghc/docs/comm/the-beast/names.html b/ghc/docs/comm/the-beast/names.html
index 1d7ce18..061fae3 100644
--- a/ghc/docs/comm/the-beast/names.html
+++ b/ghc/docs/comm/the-beast/names.html
@@ -8,133 +8,160 @@
-
-
-Every entity (type constructor, class, identifier, type variable) has
-a
-The name spaces are:
-The Glasgow Haskell Compiler (GHC) Commentary [v0.15]
+ The Glasgow Haskell Compiler (GHC) Commentary [v0.16]
The GHC Commentary - The truth about names: OccNames, and Names
Name
. The Name
type is pervasive in GHC,
-and is defined in basicTypes/Name.lhs
. Here is what a Name looks like,
-though it is private to the Name module.
-
- data Name = Name {
- n_sort :: NameSort, -- What sort of name it is
- n_occ :: !OccName, -- Its occurrence name
- n_uniq :: Unique, -- Its identity
- n_loc :: !SrcLoc -- Definition site
- }
-
-
-
-
-
-n_sort
field says what sort of name this is: see
-NameSort below.
-n_occ
field gives the "occurrence name" of the Name; see
-OccName below.
-n_uniq
field allows fast tests for equality of Names.
-n_loc
field gives some indication of where the name was bound.
-The
-
-There are three flavours of NameSort
of a Name
Name
:
-
- data NameSort
- = External Module
- | Internal
- | System
-
-
-
-
-
-
-
-
-
-ExternalName
has a globally-unique
-(module name,occurrence name) pair, namely the
-original name of the entity,
-describing where the thing was originally defined. So for example,
-if we have
-
- module M where
- f = e1
- g = e2
-
- module A where
- import qualified M as Q
- import M
- a = Q.f + g
-
-then the RdrNames for "a", "Q.f" and "g" get replaced (by the Renamer)
-by the Names "A.a", "M.f", and "M.g" respectively.
-
-InternalName
-has only an occurrence name. Distinct InternalNames may have the same occurrence
-name; use the Unique to distinguish them.
-
-ExternalName
has a unique that never changes. It is never
-cloned. This is important, because the simplifier invents new names pretty freely,
-but we don't want to lose the connnection with the type environment (constructed earlier).
-An InternalName
name can be cloned freely.
-
-ExternalNames
, whereas extra
-top-level bindings generated (say) by the type checker get InternalNames
.
-This distinction is occasionally useful for filtering diagnostic output; e.g.
-for -ddump-types.
-
-ExternalName
will generate symbols that
-appear as external symbols in the object file. An Id with an InternalName
-cannot be referenced from outside the module, and so generates a local symbol in
-the object file. The CoreTidy pass makes the decision about which names should
-be External and which Internal.
-
-System
name is for the most part the same as an
-Internal
. Indeed, the differences are purely cosmetic:
-
-
- Occurrence names:
-
-An OccName
OccName
is more-or-less just a string, like "foo" or "Tree",
-giving the (unqualified) name of an entity.
-
-Well, not quite just a string, because in Haskell a name like "C" could mean a type
-constructor or data constructor, depending on context. So GHC defines a type
-OccName (defined in basicTypes/OccName.lhs) that is a pair of
-a FastString and a NameSpace indicating which name space the
-name is drawn from:
-
- data OccName = OccName NameSpace EncodedFS
-
-The EncodedFS is a synonym for FastString indicating that the
-string is Z-encoded. (Details in OccName.lhs.) Z-encoding encodes
-funny characters like '%' and '$' into alphabetic characters, like "zp" and "zd",
-so that they can be used in object-file symbol tables without confusing linkers
-and suchlike.
-
-
-
-
+ Every entity (type constructor, class, identifier, type variable) has a
+ Name
. The Name
type is pervasive in GHC, and
+ is defined in basicTypes/Name.lhs
. Here is what a Name
+ looks like, though it is private to the Name module.
+
+++data Name = Name { + n_sort :: NameSort, -- What sort of name it is + n_occ :: !OccName, -- Its occurrence name + n_uniq :: Unique, -- Its identity + n_loc :: !SrcLoc -- Definition site + }+
n_sort
field says what sort of name this is: see
+ NameSort below.
+ n_occ
field gives the "occurrence name" of the
+ Name; see
+ OccName below.
+ n_uniq
field allows fast tests for equality of
+ Names.
+ n_loc
field gives some indication of where the
+ name was bound.
+ NameSort
of a Name
+ There are four flavours of Name
:
+
+++data NameSort + = External Module (Maybe Name) + -- (Just parent) => this Name is a subordinate name of 'parent' + -- e.g. data constructor of a data type, method of a class + -- Nothing => not a subordinate + + | WiredIn Module (Maybe Name) TyThing BuiltInSyntax + -- A variant of External, for wired-in things + + | Internal -- A user-defined Id or TyVar + -- defined in the module being compiled + + | System -- A system-defined Id or TyVar. Typically the + -- OccName is very uninformative (like 's')+
External
name has a globally-unique
+ (module name, occurrence name) pair, namely the
+ original name of the entity,
+ describing where the thing was originally defined. So for example,
+ if we have
+ +++module M where + f = e1 + g = e2 + +module A where + import qualified M as Q + import M + a = Q.f + g+
+ then the RdrNames for "a", "Q.f" and "g" get replaced (by the + Renamer) by the Names "A.a", "M.f", and "M.g" respectively. +
+InternalName
+ has only an occurrence name. Distinct InternalNames may have the same
+ occurrence name; use the Unique to distinguish them.
+ ExternalName
has a unique that never changes. It
+ is never cloned. This is important, because the simplifier invents
+ new names pretty freely, but we don't want to lose the connnection
+ with the type environment (constructed earlier). An
+ InternalName
name can be cloned freely.
+ ExternalNames
,
+ whereas extra top-level bindings generated (say) by the type checker
+ get InternalNames
. q This distinction is occasionally
+ useful for filtering diagnostic output; e.g. for -ddump-types.
+ ExternalName
will generate symbols that
+ appear as external symbols in the object file. An Id with an
+ InternalName
cannot be referenced from outside the
+ module, and so generates a local symbol in the object file. The
+ CoreTidy pass makes the decision about which names should be External
+ and which Internal.
+ System
name is for the most part the same as an
+ Internal
. Indeed, the differences are purely cosmetic:
+ OccName
+ An OccName
is more-or-less just a string, like "foo" or
+ "Tree", giving the (unqualified) name of an entity.
+
+ Well, not quite just a string, because in Haskell a name like "C" could + mean a type constructor or data constructor, depending on context. So + GHC defines a type OccName (defined in + basicTypes/OccName.lhs) that is a pair of a FastString + and a NameSpace indicating which name space the name is drawn + from: +
+++data OccName = OccName NameSpace EncodedFS+
+ The EncodedFS is a synonym for FastString indicating + that the string is Z-encoded. (Details in OccName.lhs.) + Z-encoding encodes funny characters like '%' and '$' into alphabetic + characters, like "zp" and "zd", so that they can be used in object-file + symbol tables without confusing linkers and suchlike. +
++ The name spaces are: +
++ The renamer sits between the parser and the typechecker. + However, its operation is quite tightly interwoven with the + typechecker. This is partially due to support for Template Haskell, + where spliced code has to be renamed and type checked. In particular, + top-level splices lead to multiple rounds of renaming and type + checking. +
+
+ The main externally used functions of the renamer are provided by the
+ module rename/RnSource.lhs
. In particular, we have
+
+++rnSrcDecls :: HsGroup RdrName -> RnM (TcGblEnv, HsGroup Name) +rnTyClDecls :: [LTyClDecl RdrName] -> RnM [LTyClDecl Name] +rnSplice :: HsSplice RdrName -> RnM (HsSplice Name, FreeVars)+
+ All of which execute in the renamer monad RnM
. The first
+ function, rnSrcDecls
renames a binding group; the second,
+ rnTyClDecls
renames a list of (toplevel) type and class
+ declarations; and the third, rnSplice
renames a Template
+ Haskell splice. As the types indicate, the main task of the renamer is
+ to convert converts all the RdrNames to Names, which includes a number of
+ well-formedness checks (no duplicate declarations, all names are in
+ scope, and so on). In addition, the renamer performs other, not
+ strictly name-related, well-formedness checks, which includes checking
+ that the appropriate flags have been supplied whenever language
+ extensions are used in the source.
+
+ A RdrName.RdrName is pretty much just a string (for an + unqualified name like "f") or a pair of strings (for a + qualified name like "M.f"): +
++++data RdrName + = Unqual OccName + -- Used for ordinary, unqualified occurrences -(This section is, like most of the Commentary, rather incomplete.) -+-The renamer sits between the parser and the typechecker. -Roughly speaking, It has the type: -
- HsModule RdrName -> HsModule Name --That is, it converts all the RdrNames to Names. + | Qual Module OccName + -- A qualified name written by the user in + -- *source* code. The module isn't necessarily + -- the module where the thing is defined; + -- just the one from which it is imported -RdrNames
+ | Orig Module OccName + -- An original name; the module is the *defining* module. + -- This is used when GHC generates code that will be fed + -- into the renamer (e.g. from deriving clauses), but where + -- we want to say "Use Prelude.map dammit". + + | Exact Name + -- We know exactly the Name. This is used + -- (a) when the parser parses built-in syntax like "[]" + -- and "(,)", but wants a RdrName from it + -- (b) when converting names to the RdrNames in IfaceTypes + -- Here an Exact RdrName always contains an External Name + -- (Internal Names are converted to simple Unquals) + -- (c) by Template Haskell, when TH has generated a unique name
+ The OccName type is described in The + truth about names. +
-A RdrNames is pretty much just a string (for an unqualified name -like "f") or a pair of strings (for a qualified name like "M.f"): -- data RdrName = RdrName Qual OccName - - data Qual = Unqual - - | Qual ModuleName -- A qualified name written by the user in source code - -- The module isn't necessarily the module where - -- the thing is defined; just the one from which it - -- is imported - - | Orig ModuleName -- This is an *original* name; the module is the place - -- where the thing was defined --The OccName type is described in "The truth about names". -
-The OrigName variant is used internally; it allows GHC to speak of RdrNames -that refer to the original name of the thing. +
+ Due to the tight integration of the renamer with the typechecker, both + use the same monad in recent versions of GHC. So, we have +
++++type RnM a = TcRn a -- Historical +type TcM a = TcRn a -- Historical+
+ with the combined monad defined as +
++++type TcRn a = TcRnIf TcGblEnv TcLclEnv a +type TcRnIf a b c = IOEnv (Env a b) c +data Env gbl lcl -- Changes as we move into an expression + = Env { + env_top :: HscEnv, -- Top-level stuff that never changes + -- Includes all info about imported things -Rebindable syntax
+ env_us :: TcRef UniqSupply, -- Unique supply for local varibles -In Haskell when one writes "3" one gets "fromInteger 3", where -"fromInteger" comes from the Prelude (regardless of whether the -Prelude is in scope). If you want to completely redefine numbers, -that becomes inconvenient. So GHC lets you say -"-fno-implicit-prelude"; in that case, the "fromInteger" comes from -whatever is in scope. (This is documented in the User Guide.) --This feature is implemented as follows (I always forget). -
-
+ env_gbl :: gbl, -- Info about things defined at the top level + -- of the module being compiled + + env_lcl :: lcl -- Nested stuff; changes as we go into + -- an expression + } +- Four HsSyn constructs (NegApp, NPlusKPat, HsIntegral, HsFractional) -contain a Name (i.e. it is not parameterised). -
- When the parser builds these constructs, it puts in the built-in Prelude -Name (e.g. PrelNum.fromInteger). -
- When the renamer encounters these constructs, it calls RnEnv.lookupSyntaxName. -This checks for -fno-implicit-prelude; if not, it just returns the same Name; -otherwise it takes the occurrence name of the Name, turns it into an unqualified RdrName, and looks -it up in the environment. The returned name is plugged back into the construct. -
- The typechecker uses the Name to generate the appropriate typing constraints. -
+ the details of the global environment type TcGblEnv
and
+ local environment type TcLclEnv
are also defined in the
+ module typecheck/TcRnTypes.lhs
. The monad
+ IOEnv
is defined in utils/IOEnv.hs
and extends
+ the vanilla IO
monad with an additional state parameter
+ env
that is treated as in a reader monad. (Side effecting
+ operations, such as updating the unique supply, are done with
+ TcRef
s, which are simply a synonym for IORef
s.)
+
+ As anticipated by the variants Orig
and Exact
+ of RdrName
some names should not change during renaming,
+ whereas others need to be turned into unique names. In this context,
+ the two functions RnEnv.newTopSrcBinder
and
+ RnEnv.newLocals
are important:
+
+++newTopSrcBinder :: Module -> Maybe Name -> Located RdrName -> RnM Name +newLocalsRn :: [Located RdrName] -> RnM [Name]+
+ The two functions introduces new toplevel and new local names,
+ respectively, where the first two arguments to
+ newTopSrcBinder
determine the currently compiled module and
+ the parent construct of the newly defined name. Both functions create
+ new names only for RdrName
s that are neither exact nor
+ original.
+
+ A global RdrName
environment
+ RdrName.GlobalRdrEnv
is a map from OccName
s to
+ lists of qualified names. More precisely, the latter are
+ Name
s with an associated Provenance
:
+
+++data Provenance + = LocalDef -- Defined locally + Module + + | Imported -- Imported + [ImportSpec] -- INVARIANT: non-empty + Bool -- True iff the thing was named *explicitly* + -- in *any* of the import specs rather than being + -- imported as part of a group; + -- e.g. + -- import B + -- import C( T(..) ) + -- Here, everything imported by B, and the constructors of T + -- are not named explicitly; only T is named explicitly. + -- This info is used when warning of unused names.+
+ The part of the global RdrName
environment for a module
+ that contains the local definitions is created by the function
+ RnNames.importsFromLocalDecls
, which also computes a data
+ structure recording all imported declarations in the form of a value of
+ type TcRnTypes.ImportAvails
.
+
+ The function importsFromLocalDecls
, in turn, makes use of
+ RnNames.getLocalDeclBinders :: Module -> HsGroup RdrName -> RnM
+ [AvailInfo]
to extract all declared names from a binding group,
+ where HscTypes.AvailInfo
is essentially a collection of
+ Name
s; i.e., getLocalDeclBinders
, on the fly,
+ generates Name
s from the RdrName
s of all
+ top-level binders of the module represented by the HsGroup
+ RdrName
argument.
+
+ It is important to note that all this happens before the renamer
+ actually descends into the toplevel bindings of a module. In other
+ words, before TcRnDriver.rnTopSrcDecls
performs the
+ renaming of a module by way of RnSource.rnSrcDecls
, it uses
+ importsFromLocalDecls
to set up the global
+ RdrName
environment, which contains Name
s for
+ all imported and all locally defined toplevel binders. Hence,
+ when the helpers of rnSrcDecls
come across the
+ defining occurences of a toplevel RdrName
, they
+ don't rename it by generating a new name, but they simply look up its
+ name in the global RdrName
environment.
+
+ In Haskell when one writes "3" one gets "fromInteger 3", where + "fromInteger" comes from the Prelude (regardless of whether the + Prelude is in scope). If you want to completely redefine numbers, + that becomes inconvenient. So GHC lets you say + "-fno-implicit-prelude"; in that case, the "fromInteger" comes from + whatever is in scope. (This is documented in the User Guide.) +
++ This feature is implemented as follows (I always forget). +
HsExpr.SyntaxExpr
. Moreover, the association list
+ HsExpr.SyntaxTable
is set up by the renamer to map
+ rebindable names to the value they are bound to.
+ HsExpr.NegApp
, HsPat.NPat
,
+ HsPat.NPlusKPat
, HsLit.HsIntegral
, and
+ HsLit.HsFractional
) and
+ two constructs related to code>do expressions
+ (HsExpr.BindStmt
and
+ HsExpr.ExprStmt
) have rebindable syntax.
+ -Last modified: Tue Nov 13 14:11:35 EST 2001 +Last modified: Wed May 4 17:16:15 EST 2005 + -- 1.7.10.4