2 % (c) The GRASP/AQUA Project, Glasgow University, 1998
4 \section[DataCon]{@DataCon@: Data Constructors}
8 DataCon, DataConIds(..),
11 dataConRepType, dataConSig, dataConName, dataConTag, dataConTyCon,
12 dataConTyVars, dataConStupidTheta,
13 dataConArgTys, dataConOrigArgTys, dataConResTy,
14 dataConInstOrigArgTys, dataConRepArgTys,
15 dataConFieldLabels, dataConFieldType,
16 dataConStrictMarks, dataConExStricts,
17 dataConSourceArity, dataConRepArity,
19 dataConWorkId, dataConWrapId, dataConWrapId_maybe, dataConImplicitIds,
21 isNullarySrcDataCon, isNullaryRepDataCon, isTupleCon, isUnboxedTupleCon,
22 isVanillaDataCon, classDataCon,
24 splitProductType_maybe, splitProductType,
27 #include "HsVersions.h"
29 import Type ( Type, ThetaType, substTyWith, substTy, zipTopTvSubst,
30 mkForAllTys, mkFunTys, mkTyConApp,
32 mkPredTys, isStrictPred, pprType
import TyCon ( TyCon, FieldLabel, tyConDataCons,
35 isProductTyCon, isTupleTyCon, isUnboxedTupleTyCon )
36 import Class ( Class, classTyCon )
37 import Name ( Name, NamedThing(..), nameUnique )
38 import Var ( TyVar, Id )
39 import BasicTypes ( Arity, StrictnessMark(..) )
41 import Unique ( Unique, Uniquable(..) )
42 import ListSetOps ( assoc )
43 import Util ( zipEqual, zipWithEqual )
44 import Maybes ( expectJust )
48 Data constructor representation
49 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
50 Consider the following Haskell data type declaration
52 data T = T !Int ![Int]
54 Using the strictness annotations, GHC will represent this as
58 That is, the Int has been unboxed. Furthermore, the Haskell source construction
68 That is, the first argument is unboxed, and the second is evaluated. Finally,
69 pattern matching is translated too:
71 case e of { T a b -> ... }
75 case e of { T a' b -> let a = I# a' in ... }
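
The translation described above can be sketched in ordinary Haskell. This is only an illustration under stated assumptions: TRep, wrapT and sumT are hypothetical names (GHC does not generate them), and the sketch relies on the MagicHash extension to write Int# and I#.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#)

-- Hypothetical stand-in for the representation of:  data T = T !Int ![Int]
-- The first field is stored unboxed; the second stays an ordinary list.
data TRep = TRep Int# [Int]

-- Plays the role of the wrapper: takes the source-level arguments,
-- unboxes the first one and evaluates the second before building the value.
wrapT :: Int -> [Int] -> TRep
wrapT x y = case x of
  I# x' -> y `seq` TRep x' y

-- Pattern matching on the representation re-boxes the first field,
-- mirroring  case e of { T a' b -> let a = I# a' in ... }
sumT :: TRep -> Int
sumT (TRep a' b) = let a = I# a' in a + sum b
```

For example, sumT (wrapT 1 [2, 3]) adds the re-boxed first field to the sum of the list.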
77 To keep ourselves sane, we name the different versions of the data constructor
78 differently, as follows.
81 Note [Data Constructor Naming]
82 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
83 Each data constructor C has two, and possibly three, Names associated with it:
                              OccName   Name space   Used for
  ---------------------------------------------------------------------------
  * The "source data con"     C         DataName     The DataCon itself
  * The "real data con"       C         VarName      Its worker Id
  * The "wrapper data con"    $WC       VarName      Wrapper Id (optional)
91 Each of these three has a distinct Unique. The "source data con" name
92 appears in the output of the renamer, and names the Haskell-source
93 data constructor. The type checker translates it into either the wrapper Id
94 (if it exists) or worker Id (otherwise).
96 The data con has one or two Ids associated with it:
  The "worker Id" is the actual data constructor.
99 Its type may be different to the Haskell source constructor
101 - useless dict args are dropped
102 - strict args may be flattened
103 The worker is very like a primop, in that it has no binding.
105 Newtypes have no worker Id
108 The "wrapper Id", $WC, whose type is exactly what it looks like
109 in the source program. It is an ordinary function,
110 and it gets a top-level binding like any other function.
112 The wrapper Id isn't generated for a data type if the worker
113 and wrapper are identical. It's always generated for a newtype.
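
The worker/wrapper rules in this note can be modelled with a small sum type. The sketch below is a simplified model and not the compiler's own definition: IdName (a String) is an assumed stand-in for a real Id, and wrapperOf and workerOf are hypothetical helper names.

```haskell
-- Stand-in for GHC's Id; using a String is an assumption for illustration.
type IdName = String

data ConIds
  = NewDC IdName                 -- newtype: a wrapper only, no worker
  | AlgDC (Maybe IdName) IdName  -- data type: optional wrapper, then worker

-- The Id that looks like the Haskell-source constructor.
-- When no separate wrapper exists, the worker plays that role.
wrapperOf :: ConIds -> IdName
wrapperOf (NewDC wrap)          = wrap
wrapperOf (AlgDC (Just wrap) _) = wrap
wrapperOf (AlgDC Nothing work)  = work

-- Newtypes have no worker, so the result is optional here.
workerOf :: ConIds -> Maybe IdName
workerOf (NewDC _)      = Nothing
workerOf (AlgDC _ work) = Just work
```

Returning the worker when the wrapper is absent reflects the rule that the wrapper is only generated when it differs from the worker.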
117 A note about the stupid context
118 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
119 Data types can have a context:
121 data (Eq a, Ord b) => T a b = T1 a b | T2 a
123 and that makes the constructors have a context too
124 (notice that T2's context is "thinned"):
126 T1 :: (Eq a, Ord b) => a -> b -> T a b
127 T2 :: (Eq a) => a -> T a b
129 Furthermore, this context pops up when pattern matching
(GHC did not previously implement this, although it is in H98;
I've fixed GHC so that it now does):
135 f :: Eq a => T a b -> a
137 I say the context is "stupid" because the dictionaries passed
138 are immediately discarded -- they do nothing and have no benefit.
139 It's a flaw in the language.
141 Up to now [March 2002] I have put this stupid context into the
type of the "wrapper" constructor functions, T1 and T2, but
143 that turned out to be jolly inconvenient for generics, and
144 record update, and other functions that build values of type T
145 (because they don't have suitable dictionaries available).
147 So now I've taken the stupid context out. I simply deal with
148 it separately in the type checker on occurrences of a
149 constructor, either in an expression or in a pattern.
[May 2003: actually I think this decision could easily be
reversed now, and probably should be. Generics could be
disabled for types with a stupid context; record updates now
(H98) need the context too; etc. It's an unforced change, so
155 I'm leaving it for now --- but it does seem odd that the
156 wrapper doesn't include the stupid context.]
158 [July 04] With the advent of generalised data types, it's less obvious
159 what the "stupid context" is. Consider
160 C :: forall a. Ord a => a -> a -> T (Foo a)
161 Does the C constructor in Core contain the Ord dictionary? Yes, it must:
166 C a (d:Ord a) (p:a) (q:a) -> compare d p q
168 Note that (Foo a) might not be an instance of Ord.
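
The point about the Ord dictionary can be seen at the source level with a GADT. This is a minimal sketch, assuming the GADTs extension; Foo is a hypothetical type with no Ord instance, and cmp is a hypothetical function name.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical type index; deliberately given no Ord instance.
newtype Foo a = Foo a

data T b where
  C :: Ord a => a -> a -> T (Foo a)

-- Matching on C brings the stored Ord dictionary into scope,
-- so 'compare' can be used even though the signature has no Ord constraint.
cmp :: T b -> Ordering
cmp (C p q) = compare p q
```

Since cmp has no constraint of its own, the dictionary used by compare can only have come from the constructor.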
170 %************************************************************************
172 \subsection{Data constructors}
174 %************************************************************************
179 dcName :: Name, -- This is the name of the *source data con*
180 -- (see "Note [Data Constructor Naming]" above)
181 dcUnique :: Unique, -- Cached from Name
186 -- data Eq a => T a = forall b. Ord b => MkT a [b]
188 -- The next six fields express the type of the constructor, in pieces
192 -- dcStupidTheta = [Eq a]
194 -- dcOrigArgTys = [a,List b]
198 dcVanilla :: Bool, -- True <=> This is a vanilla Haskell 98 data constructor
199 -- Its type is of form
200 -- forall a1..an . t1 -> ... tm -> T a1..an
201 -- No existentials, no GADTs, nothing.
203 dcTyVars :: [TyVar], -- Universally-quantified type vars
204 -- for the data constructor.
205 -- dcVanilla = True <=> The [TyVar] are identical to those of the parent tycon
206 -- False <=> The [TyVar] are NOT NECESSARILY THE SAME AS THE TYVARS
207 -- FOR THE PARENT TyCon. (With GADTs the data
208 -- con might not even have the same number of
211 dcStupidTheta :: ThetaType, -- This is a "thinned" version of
212 -- the context of the data decl.
213 -- "Thinned", because the Report says
214 -- to eliminate any constraints that don't mention
215 -- tyvars free in the arg types for this constructor
217 -- "Stupid", because the dictionaries aren't used for anything.
219 -- Indeed, [as of March 02] they are no
220 -- longer in the type of the wrapper Id, because
221 -- that makes it harder to use the wrap-id to rebuild
222 -- values after record selection or in generics.
224 dcTheta :: ThetaType, -- The existentially quantified stuff
226 dcOrigArgTys :: [Type], -- Original argument types
227 -- (before unboxing and flattening of
230 -- Result type of constructor is T t1..tn
231 dcTyCon :: TyCon, -- Result tycon, T
232 dcResTys :: [Type], -- Result type args, t1..tn
234 -- Now the strictness annotations and field labels of the constructor
235 dcStrictMarks :: [StrictnessMark],
236 -- Strictness annotations as decided by the compiler.
237 -- Does *not* include the existential dictionaries
238 -- length = dataConSourceArity dataCon
240 dcFields :: [FieldLabel],
241 -- Field labels for this constructor, in the
242 -- same order as the argument types;
243 -- length = 0 (if not a record) or dataConSourceArity.
245 -- Constructor representation
246 dcRepArgTys :: [Type], -- Final, representation argument types,
247 -- after unboxing and flattening,
248 -- and *including* existential dictionaries
250 dcRepStrictness :: [StrictnessMark], -- One for each *representation* argument
252 dcRepType :: Type, -- Type of the constructor
253 -- forall a b . Ord b => a -> [b] -> MkT a
254 -- (this is *not* of the constructor wrapper Id:
255 -- see notes after this data type declaration)
257 -- Notice that the existential type parameters come *second*.
258 -- Reason: in a case expression we may find:
259 -- case (e :: T t) of { MkT b (d:Ord b) (x:t) (xs:[b]) -> ... }
260 -- It's convenient to apply the rep-type of MkT to 't', to get
261 -- forall b. Ord b => ...
262 -- and use that to check the pattern. Mind you, this is really only
266 -- Finally, the curried worker function that corresponds to the constructor
267 -- It doesn't have an unfolding; the code generator saturates these Ids
268 -- and allocates a real constructor when it finds one.
270 -- An entirely separate wrapper function is built in TcTyDecls
273 dcInfix :: Bool -- True <=> declared infix
274 -- Used for Template Haskell and 'deriving' only
275 -- The actual fixity is stored elsewhere
279 = NewDC Id -- Newtypes have only a wrapper, but no worker
280 | AlgDC (Maybe Id) Id -- Algebraic data types always have a worker, and
281 -- may or may not have a wrapper, depending on whether
282 -- the wrapper does anything.
  -- _Neither_ the worker _nor_ the wrapper takes the dcStupidTheta dicts as arguments
286 -- The wrapper takes dcOrigArgTys as its arguments
287 -- The worker takes dcRepArgTys as its arguments
288 -- If the worker is absent, dcRepArgTys is the same as dcOrigArgTys
290 -- The 'Nothing' case of AlgDC is important
291 -- Not only is this efficient,
292 -- but it also ensures that the wrapper is replaced
  -- by the worker (because it *is* the worker)
294 -- even when there are no args. E.g. in
296 -- the (:) *is* the worker.
297 -- This is really important in rule matching,
298 -- (We could match on the wrappers,
299 -- but that makes it less likely that rules will match
300 -- when we bring bits of unfoldings together.)
305 fIRST_TAG = 1 -- Tags allocated from here for real constructors
The dcRepType field contains the type of the representation of a constructor.
This may differ from the type of the constructor *Id* (built
310 by MkId.mkDataConId) for two reasons:
311 a) the constructor Id may be overloaded, but the dictionary isn't stored
312 e.g. data Eq a => T a = MkT a a
314 b) the constructor may store an unboxed version of a strict field.
316 Here's an example illustrating both:
	data Ord a => T a = MkT !Int a
319 T :: Ord a => Int -> a -> T a
321 Trep :: Int# -> a -> T a
322 Actually, the unboxed part isn't implemented yet!
325 %************************************************************************
327 \subsection{Instances}
329 %************************************************************************
332 instance Eq DataCon where
333 a == b = getUnique a == getUnique b
334 a /= b = getUnique a /= getUnique b
336 instance Ord DataCon where
337 a <= b = getUnique a <= getUnique b
338 a < b = getUnique a < getUnique b
339 a >= b = getUnique a >= getUnique b
340 a > b = getUnique a > getUnique b
341 compare a b = getUnique a `compare` getUnique b
343 instance Uniquable DataCon where
346 instance NamedThing DataCon where
349 instance Outputable DataCon where
350 ppr con = ppr (dataConName con)
352 instance Show DataCon where
353 showsPrec p con = showsPrecSDoc p (ppr con)
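
The Eq and Ord instances above compare data constructors purely by their unique key. A minimal standalone sketch of the same idea follows; Con, conUnique and conLabel are hypothetical names, and an Int is assumed as a stand-in for Unique.

```haskell
import Data.Function (on)
import Data.Ord (comparing)

-- Hypothetical record: an Int key stands in for GHC's Unique.
data Con = Con { conUnique :: Int, conLabel :: String }

-- Equality looks only at the key, never at the other fields.
instance Eq Con where
  (==) = (==) `on` conUnique

-- Ordering is likewise defined by the key alone; the remaining
-- Ord methods are derived from 'compare' by the class defaults.
instance Ord Con where
  compare = comparing conUnique
```

Two values with the same key are equal here even if their labels differ, which is the intended behaviour when the key is known to be unique.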
357 %************************************************************************
359 \subsection{Construction}
361 %************************************************************************
365 -> Bool -- Declared infix
366 -> Bool -- Vanilla (see notes with dcVanilla)
367 -> [StrictnessMark] -> [FieldLabel]
368 -> [TyVar] -> ThetaType -> ThetaType
369 -> [Type] -> TyCon -> [Type]
372 -- Can get the tag from the TyCon
374 mkDataCon name declared_infix vanilla
375 arg_stricts -- Must match orig_arg_tys 1-1
377 tyvars stupid_theta theta orig_arg_tys tycon res_tys
381 con = MkData {dcName = name,
382 dcUnique = nameUnique name, dcVanilla = vanilla,
383 dcTyVars = tyvars, dcStupidTheta = stupid_theta, dcTheta = theta,
384 dcOrigArgTys = orig_arg_tys, dcTyCon = tycon, dcResTys = res_tys,
385 dcRepArgTys = rep_arg_tys,
386 dcStrictMarks = arg_stricts, dcRepStrictness = rep_arg_stricts,
387 dcFields = fields, dcTag = tag, dcRepType = ty,
388 dcIds = ids, dcInfix = declared_infix}
390 -- Strictness marks for source-args
391 -- *after unboxing choices*,
392 -- but *including existential dictionaries*
394 -- The 'arg_stricts' passed to mkDataCon are simply those for the
395 -- source-language arguments. We add extra ones for the
396 -- dictionary arguments right here.
397 dict_tys = mkPredTys theta
398 real_arg_tys = dict_tys ++ orig_arg_tys
399 real_stricts = map mk_dict_strict_mark theta ++ arg_stricts
401 -- Representation arguments and demands
402 (rep_arg_stricts, rep_arg_tys) = computeRep real_stricts real_arg_tys
404 tag = assoc "mkDataCon" (tyConDataCons tycon `zip` [fIRST_TAG..]) con
405 ty = mkForAllTys tyvars (mkFunTys rep_arg_tys result_ty)
406 -- NB: the existential dict args are already in rep_arg_tys
408 result_ty = mkTyConApp tycon res_tys
410 mk_dict_strict_mark pred | isStrictPred pred = MarkedStrict
411 | otherwise = NotMarkedStrict
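
The tag computation in mkDataCon pairs the constructors of the type with consecutive integers starting at fIRST_TAG and looks the constructor up in that list. A simplified sketch, assuming constructors are represented by any type with equality; firstTag and tagOf are hypothetical names, and unlike assoc this version returns Nothing instead of panicking on a missing entry.

```haskell
-- Tags for real constructors are allocated from here (mirrors fIRST_TAG = 1).
firstTag :: Int
firstTag = 1

-- Look up the tag of a constructor among all constructors of its type.
-- Returns Nothing when the constructor is not in the list.
tagOf :: Eq a => [a] -> a -> Maybe Int
tagOf allCons con = lookup con (zip allCons [firstTag ..])
```

With a two-constructor type, the first constructor gets tag 1 and the second gets tag 2.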
415 dataConName :: DataCon -> Name
418 dataConTag :: DataCon -> ConTag
421 dataConTyCon :: DataCon -> TyCon
422 dataConTyCon = dcTyCon
424 dataConRepType :: DataCon -> Type
425 dataConRepType = dcRepType
427 dataConIsInfix :: DataCon -> Bool
428 dataConIsInfix = dcInfix
430 dataConTyVars :: DataCon -> [TyVar]
431 dataConTyVars = dcTyVars
433 dataConWorkId :: DataCon -> Id
434 dataConWorkId dc = case dcIds dc of
435 AlgDC _ wrk_id -> wrk_id
436 NewDC _ -> pprPanic "dataConWorkId" (ppr dc)
438 dataConWrapId_maybe :: DataCon -> Maybe Id
439 dataConWrapId_maybe dc = case dcIds dc of
440 AlgDC mb_wrap _ -> mb_wrap
441 NewDC wrap -> Just wrap
443 dataConWrapId :: DataCon -> Id
444 -- Returns an Id which looks like the Haskell-source constructor
445 dataConWrapId dc = case dcIds dc of
446 AlgDC (Just wrap) _ -> wrap
447 AlgDC Nothing wrk -> wrk -- worker=wrapper
450 dataConImplicitIds :: DataCon -> [Id]
451 dataConImplicitIds dc = case dcIds dc of
452 AlgDC (Just wrap) work -> [wrap,work]
453 AlgDC Nothing work -> [work]
456 dataConFieldLabels :: DataCon -> [FieldLabel]
457 dataConFieldLabels = dcFields
459 dataConFieldType :: DataCon -> FieldLabel -> Type
460 dataConFieldType con label = expectJust "unexpected label" $
461 lookup label (dcFields con `zip` dcOrigArgTys con)
463 dataConStrictMarks :: DataCon -> [StrictnessMark]
464 dataConStrictMarks = dcStrictMarks
466 dataConExStricts :: DataCon -> [StrictnessMark]
467 -- Strictness of *existential* arguments only
468 -- Usually empty, so we don't bother to cache this
469 dataConExStricts dc = map mk_dict_strict_mark (dcTheta dc)
471 dataConSourceArity :: DataCon -> Arity
472 -- Source-level arity of the data constructor
473 dataConSourceArity dc = length (dcOrigArgTys dc)
475 -- dataConRepArity gives the number of actual fields in the
476 -- {\em representation} of the data constructor. This may be more than appear
477 -- in the source code; the extra ones are the existentially quantified
479 dataConRepArity (MkData {dcRepArgTys = arg_tys}) = length arg_tys
481 isNullarySrcDataCon, isNullaryRepDataCon :: DataCon -> Bool
482 isNullarySrcDataCon dc = null (dcOrigArgTys dc)
483 isNullaryRepDataCon dc = null (dcRepArgTys dc)
485 dataConRepStrictness :: DataCon -> [StrictnessMark]
486 -- Give the demands on the arguments of a
487 -- Core constructor application (Con dc args)
488 dataConRepStrictness dc = dcRepStrictness dc
490 dataConSig :: DataCon -> ([TyVar], ThetaType,
491 [Type], TyCon, [Type])
493 dataConSig (MkData {dcTyVars = tyvars, dcTheta = theta,
494 dcOrigArgTys = arg_tys, dcTyCon = tycon, dcResTys = res_tys})
495 = (tyvars, theta, arg_tys, tycon, res_tys)
497 dataConArgTys :: DataCon
498 -> [Type] -- Instantiated at these types
499 -- NB: these INCLUDE the existentially quantified arg types
500 -> [Type] -- Needs arguments of these types
501 -- NB: these INCLUDE the existentially quantified dict args
502 -- but EXCLUDE the data-decl context which is discarded
503 -- It's all post-flattening etc; this is a representation type
504 dataConArgTys (MkData {dcRepArgTys = arg_tys, dcTyVars = tyvars}) inst_tys
505 = ASSERT( length tyvars == length inst_tys )
506 map (substTyWith tyvars inst_tys) arg_tys
508 dataConResTy :: DataCon -> [Type] -> Type
509 dataConResTy (MkData {dcTyVars = tyvars, dcTyCon = tc, dcResTys = res_tys}) inst_tys
510 = ASSERT( length tyvars == length inst_tys )
511 substTy (zipTopTvSubst tyvars inst_tys) (mkTyConApp tc res_tys)
512 -- zipTopTvSubst because the res_tys can't contain any foralls
514 -- And the same deal for the original arg tys
515 -- This one only works for vanilla DataCons
516 dataConInstOrigArgTys :: DataCon -> [Type] -> [Type]
517 dataConInstOrigArgTys (MkData {dcOrigArgTys = arg_tys, dcTyVars = tyvars, dcVanilla = is_vanilla}) inst_tys
518 = ASSERT( is_vanilla )
519 ASSERT( length tyvars == length inst_tys )
520 map (substTyWith tyvars inst_tys) arg_tys
522 dataConStupidTheta :: DataCon -> ThetaType
523 dataConStupidTheta dc = dcStupidTheta dc
526 These two functions get the real argument types of the constructor,
527 without substituting for any type variables.
529 dataConOrigArgTys returns the arg types of the wrapper, excluding all dictionary args.
dataConRepArgTys returns the arg types of the worker, including all dictionaries, and
532 after any flattening has been done.
535 dataConOrigArgTys :: DataCon -> [Type]
536 dataConOrigArgTys dc = dcOrigArgTys dc
538 dataConRepArgTys :: DataCon -> [Type]
539 dataConRepArgTys dc = dcRepArgTys dc
544 isTupleCon :: DataCon -> Bool
545 isTupleCon (MkData {dcTyCon = tc}) = isTupleTyCon tc
547 isUnboxedTupleCon :: DataCon -> Bool
548 isUnboxedTupleCon (MkData {dcTyCon = tc}) = isUnboxedTupleTyCon tc
550 isVanillaDataCon :: DataCon -> Bool
551 isVanillaDataCon dc = dcVanilla dc
556 classDataCon :: Class -> DataCon
557 classDataCon clas = case tyConDataCons (classTyCon clas) of
558 (dict_constr:no_more) -> ASSERT( null no_more ) dict_constr
561 %************************************************************************
563 \subsection{Splitting products}
565 %************************************************************************
568 splitProductType_maybe
569 :: Type -- A product type, perhaps
570 -> Maybe (TyCon, -- The type constructor
571 [Type], -- Type args of the tycon
572 DataCon, -- The data constructor
573 [Type]) -- Its *representation* arg types
575 -- Returns (Just ...) for any
576 -- concrete (i.e. constructors visible)
577 -- single-constructor
578 -- not existentially quantified
579 -- type whether a data type or a new type
	-- Rejecting existentials is conservative.  Maybe some things
582 -- could be made to work with them, but I'm not going to sweat
583 -- it through till someone finds it's important.
585 splitProductType_maybe ty
586 = case splitTyConApp_maybe ty of
588 | isProductTyCon tycon -- Includes check for non-existential,
589 -- and for constructors visible
590 -> Just (tycon, ty_args, data_con, dataConArgTys data_con ty_args)
592 data_con = head (tyConDataCons tycon)
595 splitProductType str ty
596 = case splitProductType_maybe ty of
598 Nothing -> pprPanic (str ++ ": not a product") (pprType ty)
601 computeRep :: [StrictnessMark] -- Original arg strictness
602 -> [Type] -- and types
603 -> ([StrictnessMark], -- Representation arg strictness
606 computeRep stricts tys
607 = unzip $ concat $ zipWithEqual "computeRep" unbox stricts tys
609 unbox NotMarkedStrict ty = [(NotMarkedStrict, ty)]
610 unbox MarkedStrict ty = [(MarkedStrict, ty)]
611 unbox MarkedUnboxed ty = zipEqual "computeRep" (dataConRepStrictness arg_dc) arg_tys
613 (_, _, arg_dc, arg_tys) = splitProductType "unbox_strict_arg_ty" ty
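
The flattening performed by computeRep can be illustrated on a toy model of types. This is a sketch under several assumptions: Mark and Ty are hypothetical simplifications, a Product value is assumed to carry the already computed representation fields of its single constructor, and zipWith is used where the real code uses zipWithEqual (so a length mismatch is silently truncated here instead of reported).

```haskell
data Mark = NotMarkedStrict | MarkedStrict | MarkedUnboxed
  deriving (Eq, Show)

-- Toy types: a primitive named type, or a single-constructor product
-- carrying the strictness and type of each representation field.
data Ty = Prim String | Product [(Mark, Ty)]
  deriving (Eq, Show)

computeRepSketch :: [Mark] -> [Ty] -> ([Mark], [Ty])
computeRepSketch marks tys = unzip (concat (zipWith unbox marks tys))
  where
    -- An unboxed product argument is replaced by its fields.
    unbox MarkedUnboxed (Product fields) = fields
    -- Everything else is kept as a single argument. (The real code
    -- panics if a MarkedUnboxed argument is not a product type.)
    unbox mark ty = [(mark, ty)]
```

For instance, an unboxed argument whose product has one strict Int# field contributes exactly that field to the representation, while a merely strict argument is passed through unchanged.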