%
+% (c) The University of Glasgow 2006
% (c) The GRASP/AQUA Project, Glasgow University, 1998
%
\section[DataCon]{@DataCon@: Data Constructors}
DataCon, DataConIds(..),
ConTag, fIRST_TAG,
mkDataCon,
- dataConRepType, dataConSig, dataConName, dataConTag, dataConTyCon,
- dataConTyVars, dataConResTys,
- dataConStupidTheta,
- dataConInstArgTys, dataConOrigArgTys, dataConInstResTy,
+ dataConRepType, dataConSig, dataConFullSig,
+ dataConName, dataConTag, dataConTyCon, dataConUserType,
+ dataConUnivTyVars, dataConExTyVars, dataConAllTyVars, dataConResTys,
+ dataConEqSpec, eqSpecPreds, dataConTheta, dataConStupidTheta,
+ dataConInstArgTys, dataConOrigArgTys,
dataConInstOrigArgTys, dataConRepArgTys,
dataConFieldLabels, dataConFieldType,
dataConStrictMarks, dataConExStricts,
isNullarySrcDataCon, isNullaryRepDataCon, isTupleCon, isUnboxedTupleCon,
isVanillaDataCon, classDataCon,
- splitProductType_maybe, splitProductType,
+ splitProductType_maybe, splitProductType, deepSplitProductType,
+ deepSplitProductType_maybe
) where
#include "HsVersions.h"
-import Type ( Type, ThetaType, substTyWith, substTy, zipOpenTvSubst,
- mkForAllTys, mkFunTys, mkTyConApp,
- splitTyConApp_maybe,
- mkPredTys, isStrictPred, pprType
- )
-import TyCon ( TyCon, FieldLabel, tyConDataCons,
- isProductTyCon, isTupleTyCon, isUnboxedTupleTyCon )
-import Class ( Class, classTyCon )
-import Name ( Name, NamedThing(..), nameUnique )
-import Var ( TyVar, Id )
-import BasicTypes ( Arity, StrictnessMark(..) )
+import Type
+import Coercion
+import TyCon
+import Class
+import Name
+import Var
+import BasicTypes
import Outputable
-import Unique ( Unique, Uniquable(..) )
-import ListSetOps ( assoc )
-import Util ( zipEqual, zipWithEqual )
-import Maybes ( expectJust )
+import Unique
+import ListSetOps
+import Util
+import Maybes
+import FastString
\end{code}
The data con has one or two Ids associated with it:
- The "worker Id", is the actual data constructor.
- Its type may be different to the Haskell source constructor
- because:
- - useless dict args are dropped
- - strict args may be flattened
- The worker is very like a primop, in that it has no binding.
+The "worker Id" is the actual data constructor.
+* Every data constructor (newtype or data type) has a worker
- Newtypes have no worker Id
+* The worker is very like a primop, in that it has no binding.
+* For a *data* type, the worker *is* the data constructor;
+ it has no unfolding
- The "wrapper Id", $WC, whose type is exactly what it looks like
- in the source program. It is an ordinary function,
- and it gets a top-level binding like any other function.
+* For a *newtype*, the worker has a compulsory unfolding which
+ does a cast, e.g.
+ newtype T = MkT Int
+ The worker for MkT has unfolding
+ \(x:Int). x `cast` sym CoT
+ Here CoT is the type constructor, witnessing the FC axiom
+ axiom CoT : T = Int
- The wrapper Id isn't generated for a data type if the worker
- and wrapper are identical. It's always generated for a newtype.
+The "wrapper Id", $WC, goes as follows
+
+* Its type is exactly what it looks like in the source program.
+
+* It is an ordinary function, and it gets a top-level binding
+ like any other function.
+
+* The wrapper Id isn't generated for a data type if there is
+ nothing for the wrapper to do. That is, if its defn would be
+ $wC = C
+
+Why might the wrapper have anything to do? Two reasons:
+
+* Unboxing strict fields (with -funbox-strict-fields)
+ data T = MkT !(Int,Int)
+ $wMkT :: (Int,Int) -> T
+ $wMkT (x,y) = MkT x y
+ Notice that the worker has two fields where the wrapper has
+ just one. That is, the worker has type
+ MkT :: Int -> Int -> T
+
+* Equality constraints for GADTs
+ data T a where { MkT :: a -> T [a] }
+
+ The worker gets a type with explicit equality
+ constraints, thus:
+ MkT :: forall a b. (a=[b]) => b -> T a
+
+ The wrapper has the programmer-specified type:
+ $wMkT :: a -> T [a]
+ $wMkT a x = MkT [a] a [a] x
+ The third argument is a coercion
+ [a] :: [a]:=:[a]
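
The unboxing case described above can be imitated in ordinary Haskell. This is only a toy sketch, not GHC's generated code: `T`, `MkT` and `wrapMkT` are hypothetical stand-ins, with `MkT` playing the role of the two-field worker and `wrapMkT` playing the role of the wrapper that keeps the source-level type.

```haskell
-- Toy stand-in for the worker: after unboxing the strict pair,
-- the constructor carries two separate Int fields.
data T = MkT Int Int
  deriving (Eq, Show)

-- Toy stand-in for the wrapper $wMkT: it has the type the programmer
-- wrote and only unpacks the pair before calling the worker.
wrapMkT :: (Int, Int) -> T
wrapMkT (x, y) = MkT x y
```

If the wrapper had nothing to unpack (that is, if its definition were just the worker), there would be no reason to generate it, which is the situation the third wrapper bullet describes.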
-- Running example:
--
- -- data Eq a => T a = forall b. Ord b => MkT a [b]
+ -- *** As declared by the user
+ -- data T a where
+ -- MkT :: forall x y. (Ord x) => x -> y -> T (x,y)
+ -- *** As represented internally
+ -- data T a where
+ -- MkT :: forall a. forall x y. (a:=:(x,y), Ord x) => x -> y -> T a
+ --
-- The next six fields express the type of the constructor, in pieces
-- e.g.
--
- -- dcTyVars = [a,b]
- -- dcStupidTheta = [Eq a]
- -- dcTheta = [Ord b]
+ -- dcUnivTyVars = [a]
+ -- dcExTyVars = [x,y]
+ -- dcEqSpec = [a:=:(x,y)]
+ -- dcTheta = [Ord x]
-- dcOrigArgTys = [a,List b]
-- dcTyCon = T
- -- dcTyArgs = [a,b]
dcVanilla :: Bool, -- True <=> This is a vanilla Haskell 98 data constructor
-- Its type is of form
-- forall a1..an . t1 -> ... tm -> T a1..an
- -- No existentials, no GADTs, nothing.
- --
- -- NB1: the order of the forall'd variables does matter;
- -- for a vanilla constructor, we assume that if the result
- -- type is (T t1 ... tn) then we can instantiate the constr
- -- at types [t1, ..., tn]
- --
- -- NB2: a vanilla constructor can still be declared in GADT-style
- -- syntax, provided its type looks like the above.
-
- dcTyVars :: [TyVar], -- Universally-quantified type vars
- -- for the data constructor.
- -- See NB1 on dcVanilla for the connection between dcTyVars and dcResTys
- --
- -- In general, the dcTyVars are NOT NECESSARILY THE SAME AS THE TYVARS
+ -- No existentials, no coercions, nothing.
+ -- That is: dcExTyVars = dcEqSpec = dcTheta = []
+ -- NB 1: newtypes always have a vanilla data con
+ -- NB 2: a vanilla constructor can still be declared in GADT-style
+ -- syntax, provided its type looks like the above.
+ -- The declaration format is held in the TyCon (algTcGadtSyntax)
+
+ dcUnivTyVars :: [TyVar], -- Universally-quantified type vars
+ dcExTyVars :: [TyVar], -- Existentially-quantified type vars
+ -- In general, the dcUnivTyVars are NOT NECESSARILY THE SAME AS THE TYVARS
-- FOR THE PARENT TyCon. With GADTs the data con might not even have
-- the same number of type variables.
-- [This is a change (Oct05): previously, vanilla datacons guaranteed to
-- have the same type variables as their parent TyCon, but that seems ugly.]
- dcStupidTheta :: ThetaType, -- This is a "thinned" version of
- -- the context of the data decl.
+ -- INVARIANT: the UnivTyVars and ExTyVars all have distinct OccNames
+ -- Reason: less confusing, and easier to generate IfaceSyn
+
+ dcEqSpec :: [(TyVar,Type)], -- Equalities derived from the result type,
+ -- *as written by the programmer*
+ -- This field allows us to move conveniently between the two ways
+ -- of representing a GADT constructor's type:
+ -- MkT :: forall a b. (a :=: [b]) => b -> T a
+ -- MkT :: forall b. b -> T [b]
+ -- Each equality is of the form (a :=: ty), where 'a' is one of
+ -- the universally quantified type variables
+
+ dcTheta :: ThetaType, -- The context of the constructor
+ -- In GADT form, this is *exactly* what the programmer writes, even if
+ -- the context constrains only universally quantified variables
+ -- MkT :: forall a. Eq a => a -> T a
+ -- It may contain user-written equality predicates too
+
+ dcStupidTheta :: ThetaType, -- The context of the data type declaration
+ -- data Eq a => T a = ...
+ -- or, rather, a "thinned" version thereof
-- "Thinned", because the Report says
-- to eliminate any constraints that don't mention
-- tyvars free in the arg types for this constructor
--
- -- "Stupid", because the dictionaries aren't used for anything.
+ -- INVARIANT: the free tyvars of dcStupidTheta are a subset of dcUnivTyVars
+ -- Reason: dcStupidTheta is obtained by thinning the stupid theta from the tycon
--
- -- Indeed, [as of March 02] they are no
- -- longer in the type of the wrapper Id, because
- -- that makes it harder to use the wrap-id to rebuild
- -- values after record selection or in generics.
- --
- -- Fact: the free tyvars of dcStupidTheta are a subset of
- -- the free tyvars of dcResTys
- -- Reason: dcStupidTeta is gotten by instantiating the
- -- stupid theta from the tycon (see BuildTyCl.mkDataConStupidTheta)
+ -- "Stupid", because the dictionaries aren't used for anything.
+ -- Indeed, [as of March 02] they are no longer in the type of
+ -- the wrapper Id, because that makes it harder to use the wrap-id
+ -- to rebuild values after record selection or in generics.
- dcTheta :: ThetaType, -- The existentially quantified stuff
-
dcOrigArgTys :: [Type], -- Original argument types
- -- (before unboxing and flattening of
- -- strict fields)
+ -- (before unboxing and flattening of strict fields)
-- Result type of constructor is T t1..tn
dcTyCon :: TyCon, -- Result tycon, T
- dcResTys :: [Type], -- Result type args, t1..tn
-- Now the strictness annotations and field labels of the constructor
dcStrictMarks :: [StrictnessMark],
-- and *including* existential dictionaries
dcRepStrictness :: [StrictnessMark], -- One for each *representation* argument
+ -- See also Note [Data-con worker strictness] in MkId.lhs
dcRepType :: Type, -- Type of the constructor
- -- forall a b . Ord b => a -> [b] -> MkT a
+ -- forall a x y. (a:=:(x,y), Ord x) => x -> y -> MkT a
-- (this is *not* of the constructor wrapper Id:
- -- see notes after this data type declaration)
- --
+ -- see Note [Data con representation] below)
-- Notice that the existential type parameters come *second*.
-- Reason: in a case expression we may find:
-- case (e :: T t) of { MkT b (d:Ord b) (x:t) (xs:[b]) -> ... }
}
data DataConIds
- = NewDC Id -- Newtypes have only a wrapper, but no worker
- | AlgDC (Maybe Id) Id -- Algebraic data types always have a worker, and
+ = DCIds (Maybe Id) Id -- Algebraic data types always have a worker, and
-- may or may not have a wrapper, depending on whether
- -- the wrapper does anything.
+ -- the wrapper does anything. Newtypes just have a worker
-- _Neither_ the worker _nor_ the wrapper take the dcStupidTheta dicts as arguments
-- The worker takes dcRepArgTys as its arguments
-- If the worker is absent, dcRepArgTys is the same as dcOrigArgTys
- -- The 'Nothing' case of AlgDC is important
+ -- The 'Nothing' case of DCIds is important
-- Not only is this efficient,
-- but it also ensures that the wrapper is replaced
-- by the worker (because it *is* the worker)
fIRST_TAG = 1 -- Tags allocated from here for real constructors
\end{code}
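
The shape of `DCIds` (an optional wrapper plus a mandatory worker) can be sketched in isolation. This is a simplified model, assuming a hypothetical `Id` that is just a `String`. The helper names `wrapIdOf` and `implicitIdsOf` are illustrative and are not the module's exported functions.

```haskell
-- Hypothetical stand-in for GHC's Id; a real Id carries far more than a name.
type Id = String

-- Optional wrapper first, mandatory worker second.
data DataConIds = DCIds (Maybe Id) Id

-- The Id that looks like the source constructor: the wrapper if one
-- exists, otherwise the worker itself (worker = wrapper).
wrapIdOf :: DataConIds -> Id
wrapIdOf (DCIds (Just wrap) _) = wrap
wrapIdOf (DCIds Nothing work)  = work

-- Every Id implied by the constructor, wrapper first when present.
implicitIdsOf :: DataConIds -> [Id]
implicitIdsOf (DCIds (Just wrap) work) = [wrap, work]
implicitIdsOf (DCIds Nothing work)     = [work]
```

With a single constructor there is no case in which the worker is missing, so a lookup of the worker cannot fail in this model.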
+Note [Data con representation]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The dcRepType field contains the type of the representation of a constructor.
This may differ from the type of the constructor *Id* (built
by MkId.mkDataConId) for two reasons:
\begin{code}
mkDataCon :: Name
-> Bool -- Declared infix
- -> Bool -- Vanilla (see notes with dcVanilla)
-> [StrictnessMark] -> [FieldLabel]
- -> [TyVar] -> ThetaType -> ThetaType
- -> [Type] -> TyCon -> [Type]
- -> DataConIds
+ -> [TyVar] -> [TyVar]
+ -> [(TyVar,Type)] -> ThetaType
+ -> [Type] -> TyCon
+ -> ThetaType -> DataConIds
-> DataCon
-- Can get the tag from the TyCon
-mkDataCon name declared_infix vanilla
+mkDataCon name declared_infix
arg_stricts -- Must match orig_arg_tys 1-1
fields
- tyvars stupid_theta theta orig_arg_tys tycon res_tys
- ids
- = con
+ univ_tvs ex_tvs
+ eq_spec theta
+ orig_arg_tys tycon
+ stupid_theta ids
+-- Warning: mkDataCon is not a good place to check invariants.
+-- If the programmer writes the wrong result type in the decl, thus:
+-- data T a where { MkT :: S }
+-- then it's possible that the univ_tvs may hit an assertion failure
+-- if you pull on univ_tvs. This case is checked by checkValidDataCon,
+-- so the error is detected properly... it's just that assertions here
+-- are a little dodgy.
+
+ = ASSERT( not (any isEqPred theta) )
+ -- We don't currently allow any equality predicates on
+ -- a data constructor (apart from the GADT ones in eq_spec)
+ con
where
- con = MkData {dcName = name,
- dcUnique = nameUnique name, dcVanilla = vanilla,
- dcTyVars = tyvars, dcStupidTheta = stupid_theta, dcTheta = theta,
- dcOrigArgTys = orig_arg_tys, dcTyCon = tycon, dcResTys = res_tys,
+ is_vanilla = null ex_tvs && null eq_spec && null theta
+ con = MkData {dcName = name, dcUnique = nameUnique name,
+ dcVanilla = is_vanilla, dcInfix = declared_infix,
+ dcUnivTyVars = univ_tvs, dcExTyVars = ex_tvs,
+ dcEqSpec = eq_spec,
+ dcStupidTheta = stupid_theta, dcTheta = theta,
+ dcOrigArgTys = orig_arg_tys, dcTyCon = tycon,
dcRepArgTys = rep_arg_tys,
- dcStrictMarks = arg_stricts, dcRepStrictness = rep_arg_stricts,
+ dcStrictMarks = arg_stricts,
+ dcRepStrictness = rep_arg_stricts,
dcFields = fields, dcTag = tag, dcRepType = ty,
- dcIds = ids, dcInfix = declared_infix}
+ dcIds = ids }
-- Strictness marks for source-args
-- *after unboxing choices*,
real_stricts = map mk_dict_strict_mark theta ++ arg_stricts
-- Representation arguments and demands
+ -- To do: eliminate duplication with MkId
(rep_arg_stricts, rep_arg_tys) = computeRep real_stricts real_arg_tys
tag = assoc "mkDataCon" (tyConDataCons tycon `zip` [fIRST_TAG..]) con
- ty = mkForAllTys tyvars (mkFunTys rep_arg_tys result_ty)
- -- NB: the existential dict args are already in rep_arg_tys
+ ty = mkForAllTys univ_tvs $ mkForAllTys ex_tvs $
+ mkFunTys (mkPredTys (eqSpecPreds eq_spec)) $
+ -- NB: the dict args are already in rep_arg_tys
+ -- because they might be flattened..
+ -- but the equality predicates are not
+ mkFunTys rep_arg_tys $
+ mkTyConApp tycon (mkTyVarTys univ_tvs)
- result_ty = mkTyConApp tycon res_tys
+eqSpecPreds :: [(TyVar,Type)] -> ThetaType
+eqSpecPreds spec = [ mkEqPred (mkTyVarTy tv, ty) | (tv,ty) <- spec ]
mk_dict_strict_mark pred | isStrictPred pred = MarkedStrict
| otherwise = NotMarkedStrict
dataConIsInfix :: DataCon -> Bool
dataConIsInfix = dcInfix
-dataConTyVars :: DataCon -> [TyVar]
-dataConTyVars = dcTyVars
+dataConUnivTyVars :: DataCon -> [TyVar]
+dataConUnivTyVars = dcUnivTyVars
+
+dataConExTyVars :: DataCon -> [TyVar]
+dataConExTyVars = dcExTyVars
+
+dataConAllTyVars :: DataCon -> [TyVar]
+dataConAllTyVars (MkData { dcUnivTyVars = univ_tvs, dcExTyVars = ex_tvs })
+ = univ_tvs ++ ex_tvs
+
+dataConEqSpec :: DataCon -> [(TyVar,Type)]
+dataConEqSpec = dcEqSpec
+
+dataConTheta :: DataCon -> ThetaType
+dataConTheta = dcTheta
dataConWorkId :: DataCon -> Id
dataConWorkId dc = case dcIds dc of
- AlgDC _ wrk_id -> wrk_id
- NewDC _ -> pprPanic "dataConWorkId" (ppr dc)
+ DCIds _ wrk_id -> wrk_id
dataConWrapId_maybe :: DataCon -> Maybe Id
+-- Returns Nothing if there is no wrapper for an algebraic data con
+-- and also for a newtype (whose constructor is inlined compulsorily)
dataConWrapId_maybe dc = case dcIds dc of
- AlgDC mb_wrap _ -> mb_wrap
- NewDC wrap -> Just wrap
+ DCIds mb_wrap _ -> mb_wrap
dataConWrapId :: DataCon -> Id
-- Returns an Id which looks like the Haskell-source constructor
dataConWrapId dc = case dcIds dc of
- AlgDC (Just wrap) _ -> wrap
- AlgDC Nothing wrk -> wrk -- worker=wrapper
- NewDC wrap -> wrap
+ DCIds (Just wrap) _ -> wrap
+ DCIds Nothing wrk -> wrk -- worker=wrapper
dataConImplicitIds :: DataCon -> [Id]
dataConImplicitIds dc = case dcIds dc of
- AlgDC (Just wrap) work -> [wrap,work]
- AlgDC Nothing work -> [work]
- NewDC wrap -> [wrap]
+ DCIds (Just wrap) work -> [wrap,work]
+ DCIds Nothing work -> [work]
dataConFieldLabels :: DataCon -> [FieldLabel]
dataConFieldLabels = dcFields
-- Core constructor application (Con dc args)
dataConRepStrictness dc = dcRepStrictness dc
-dataConSig :: DataCon -> ([TyVar], ThetaType,
- [Type], TyCon, [Type])
+dataConSig :: DataCon -> ([TyVar], ThetaType, [Type])
+dataConSig (MkData {dcUnivTyVars = univ_tvs, dcExTyVars = ex_tvs, dcEqSpec = eq_spec,
+ dcTheta = theta, dcOrigArgTys = arg_tys, dcTyCon = tycon})
+ = (univ_tvs ++ ex_tvs, eqSpecPreds eq_spec ++ theta, arg_tys)
-dataConSig (MkData {dcTyVars = tyvars, dcTheta = theta,
- dcOrigArgTys = arg_tys, dcTyCon = tycon, dcResTys = res_tys})
- = (tyvars, theta, arg_tys, tycon, res_tys)
+dataConFullSig :: DataCon
+ -> ([TyVar], [TyVar], [(TyVar,Type)], ThetaType, [Type])
+dataConFullSig (MkData {dcUnivTyVars = univ_tvs, dcExTyVars = ex_tvs, dcEqSpec = eq_spec,
+ dcTheta = theta, dcOrigArgTys = arg_tys, dcTyCon = tycon})
+ = (univ_tvs, ex_tvs, eq_spec, theta, arg_tys)
dataConStupidTheta :: DataCon -> ThetaType
dataConStupidTheta dc = dcStupidTheta dc
dataConResTys :: DataCon -> [Type]
-dataConResTys dc = dcResTys dc
+dataConResTys dc = [substTyVar env tv | tv <- dcUnivTyVars dc]
+ where
+ env = mkTopTvSubst (dcEqSpec dc)
+
+dataConUserType :: DataCon -> Type
+-- The user-declared type of the data constructor
+-- in the nice-to-read form
+-- T :: forall a. a -> T [a]
+-- rather than
+-- T :: forall b. forall a. (a=[b]) => a -> T b
+-- NB: If the constructor is part of a data instance, the result type
+-- mentions the family tycon, not the internal one.
+dataConUserType (MkData { dcUnivTyVars = univ_tvs,
+ dcExTyVars = ex_tvs, dcEqSpec = eq_spec,
+ dcTheta = theta, dcOrigArgTys = arg_tys,
+ dcTyCon = tycon })
+ = mkForAllTys ((univ_tvs `minusList` map fst eq_spec) ++ ex_tvs) $
+ mkFunTys (mkPredTys theta) $
+ mkFunTys arg_tys $
+ case tyConFamInst_maybe tycon of
+ Nothing -> mkTyConApp tycon (map (substTyVar subst) univ_tvs)
+ Just (ftc, insttys) -> mkTyConApp ftc insttys -- data instance
+ where
+ subst = mkTopTvSubst eq_spec
dataConInstArgTys :: DataCon
-> [Type] -- Instantiated at these types
-- NB: these INCLUDE the existentially quantified dict args
-- but EXCLUDE the data-decl context which is discarded
-- It's all post-flattening etc; this is a representation type
-dataConInstArgTys (MkData {dcRepArgTys = arg_tys, dcTyVars = tyvars}) inst_tys
+dataConInstArgTys (MkData {dcRepArgTys = arg_tys,
+ dcUnivTyVars = univ_tvs,
+ dcExTyVars = ex_tvs}) inst_tys
= ASSERT( length tyvars == length inst_tys )
map (substTyWith tyvars inst_tys) arg_tys
+ where
+ tyvars = univ_tvs ++ ex_tvs
-dataConInstResTy :: DataCon -> [Type] -> Type
-dataConInstResTy (MkData {dcTyVars = tyvars, dcTyCon = tc, dcResTys = res_tys}) inst_tys
- = ASSERT( length tyvars == length inst_tys )
- substTy (zipOpenTvSubst tyvars inst_tys) (mkTyConApp tc res_tys)
- -- res_tys can't currently contain any foralls,
- -- but might in future; hence zipOpenTvSubst
-- And the same deal for the original arg tys
dataConInstOrigArgTys :: DataCon -> [Type] -> [Type]
-dataConInstOrigArgTys (MkData {dcOrigArgTys = arg_tys, dcTyVars = tyvars}) inst_tys
- = ASSERT( length tyvars == length inst_tys )
+dataConInstOrigArgTys dc@(MkData {dcOrigArgTys = arg_tys,
+ dcUnivTyVars = univ_tvs,
+ dcExTyVars = ex_tvs}) inst_tys
+ = ASSERT2( length tyvars == length inst_tys, ptext SLIT("dataConInstOrigArgTys") <+> ppr dc <+> ppr inst_tys )
map (substTyWith tyvars inst_tys) arg_tys
+ where
+ tyvars = univ_tvs ++ ex_tvs
\end{code}
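
Since the result type arguments are no longer stored, they are recovered by applying the equality spec to the universal type variables. The following is a minimal sketch of that idea over a toy type representation. `TyVar`, `Type` and `resTys` here are hypothetical simplifications, not GHC's own definitions.

```haskell
import Data.Maybe (fromMaybe)

-- Toy type representation (an assumption for illustration only).
type TyVar = String

data Type
  = TyVarTy TyVar          -- a type variable
  | TyConApp String [Type] -- a type constructor applied to arguments
  deriving (Eq, Show)

-- Replace each universal variable by the type its equality gives it,
-- leaving variables without an equality untouched.
resTys :: [TyVar] -> [(TyVar, Type)] -> [Type]
resTys univTvs eqSpec =
  [ fromMaybe (TyVarTy tv) (lookup tv eqSpec) | tv <- univTvs ]
```

For the running example, universal variables `[a]` with the equality `a :=: (x,y)` give back the single result argument `(x,y)`. A vanilla constructor has an empty equality spec, so its result arguments are just its universal variables.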
These two functions get the real argument types of the constructor,
Nothing -> pprPanic (str ++ ": not a product") (pprType ty)
+deepSplitProductType_maybe ty
+ = do { (res@(tycon, tycon_args, _, _)) <- splitProductType_maybe ty
+ ; let {result
+ | isClosedNewTyCon tycon && not (isRecursiveTyCon tycon)
+ = deepSplitProductType_maybe (newTyConInstRhs tycon tycon_args)
+ | isNewTyCon tycon = Nothing -- cannot unbox through recursive
+ -- newtypes nor through families
+ | otherwise = Just res}
+ ; result
+ }
+
+deepSplitProductType str ty
+ = case deepSplitProductType_maybe ty of
+ Just stuff -> stuff
+ Nothing -> pprPanic (str ++ ": not a product") (pprType ty)
+
computeRep :: [StrictnessMark] -- Original arg strictness
-> [Type] -- and types
-> ([StrictnessMark], -- Representation arg strictness
unbox NotMarkedStrict ty = [(NotMarkedStrict, ty)]
unbox MarkedStrict ty = [(MarkedStrict, ty)]
unbox MarkedUnboxed ty = zipEqual "computeRep" (dataConRepStrictness arg_dc) arg_tys
- where
- (_, _, arg_dc, arg_tys) = splitProductType "unbox_strict_arg_ty" ty
+ where
+ (_tycon, _tycon_args, arg_dc, arg_tys)
+ = deepSplitProductType "unbox_strict_arg_ty" ty
\end{code}
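
The "deep" split looks through non-recursive newtypes until it reaches a real product type, and gives up on recursive ones. The sketch below shows that recursion on a toy model. The `Ty` type and `deepSplit` are hypothetical, and the model ignores type families and type arguments, which the real function has to handle.

```haskell
-- Toy description of a type (an assumption for illustration only).
data Ty
  = Product String [Ty] -- single-constructor data type and its field types
  | Newtype Bool Ty     -- newtype: "is recursive?" flag and its right-hand side
  | Other String        -- anything that is not a product (sum type, function, ...)
  deriving (Eq, Show)

-- Find the underlying product, looking through non-recursive newtypes.
deepSplit :: Ty -> Maybe (String, [Ty])
deepSplit (Product name fields) = Just (name, fields)
deepSplit (Newtype recursive rhs)
  | recursive = Nothing        -- cannot unbox through a recursive newtype
  | otherwise = deepSplit rhs  -- keep looking at the representation type
deepSplit (Other _) = Nothing
```

Refusing recursive newtypes is what keeps the search finite: without the check, repeatedly unfolding the right-hand side might never reach a data type.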