\begin{code}
module Specialise (
- specProgram
+ specProgram,
+ idSpecVars
) where
#include "HsVersions.h"
-import Id ( Id, DictVar, idType, mkUserLocal,
+import MkId ( mkUserLocal )
+import Id ( Id, DictVar, idType, mkTemplateLocals,
- getIdSpecialisation, addIdSpecialisation, isSpecPragmaId,
+ getIdSpecialisation, setIdSpecialisation, isSpecPragmaId,
IdSet, mkIdSet, addOneToIdSet, intersectIdSets, isEmptyIdSet,
emptyIdSet, unionIdSets, minusIdSet, unitIdSet, elementOfIdSet,
- IdEnv, mkIdEnv, lookupIdEnv, addOneToIdEnv
+ IdEnv, mkIdEnv, lookupIdEnv, addOneToIdEnv, delOneFromIdEnv
)
import Type ( Type, mkTyVarTy, splitSigmaTy, instantiateTy, isDictTy,
tyVarsOfType, tyVarsOfTypes, applyTys, mkForAllTys
)
import TyCon ( TyCon )
-import TyVar ( TyVar,
+import TyVar ( TyVar, alphaTyVars,
TyVarSet, mkTyVarSet, isEmptyTyVarSet, intersectTyVarSets,
elementOfTyVarSet, unionTyVarSets, emptyTyVarSet,
- TyVarEnv, mkTyVarEnv
+ TyVarEnv, mkTyVarEnv, delFromTyVarEnv
)
-import CoreSyn
-import OccurAnal ( occurAnalyseGlobalExpr )
+import CoreSyn
+import PprCore () -- Instances
import Name ( NamedThing(..), getSrcLoc )
-import SpecEnv ( addToSpecEnv )
+import SpecEnv ( addToSpecEnv, lookupSpecEnv, specEnvValues )
import UniqSupply ( UniqSupply,
UniqSM, initUs, thenUs, returnUs, getUnique, mapUs
)
import FiniteMap
-import Maybes ( MaybeErr(..) )
+import Maybes ( MaybeErr(..), maybeToBool )
import Bag
import List ( partition )
import Util ( zipEqual )
%************************************************************************
These notes describe how we implement specialisation to eliminate
-overloading, and optionally to eliminate unboxed polymorphism, and
-full polymorphism.
+overloading.
-The specialisation pass is a partial evaluator which works on Core
+The specialisation pass works on Core
syntax, complete with all the explicit dictionary application,
abstraction and construction as added by the type checker. The
existing type checker remains largely as it is.
f@t1/t2 = <f_rhs> t1 t2 d1 d2
-(f_rhs presumably has some big lambdas and dictionary lambdas, so lots
-of simplification will now result.) Then we should recursively do
-everything again.
-
-The new id has its own unique, but its print-name (if exported) has
-an explicit representation of the instance types t1/t2.
+f_rhs presumably has some big lambdas and dictionary lambdas, so lots
+of simplification will now result. However we don't actually *do* that
+simplification. Rather, we leave it for the simplifier to do. If we
+*did* do it, though, we'd get more call instances from the specialised
+RHS. We can work out what they are by instantiating the call-instance
+set from f's RHS with the types t1, t2.
Add this new id to f's IdInfo, to record that f has a specialised version.
in
fl
-We still have recusion for non-overloadd functions which we
-speciailise, but the recursive call should get speciailised to the
+We still have recusion for non-overloaded functions which we
+speciailise, but the recursive call should get specialised to the
same recursive version.
f@t1/ = /\b -> <f_rhs> t1 b d1 d2
-This seems pretty simple, and a Good Thing.
+We do this.
-Polymorphism 3 -- Unboxed
-~~~~~~~~~~~~~~
-If we are speciailising at unboxed types we must speciailise
-regardless of the overloading constraint. In the exaple above it is
-worth speciailising at types Int/Int#, Int/Bool# and a/Int#, Int#/Int#
-etc.
+Dictionary floating
+~~~~~~~~~~~~~~~~~~~
+Consider this
-Note that specialising an overloaded type at an uboxed type requires
-an unboxed instance -- we cannot default to an unspecialised version!
+ f a (d::Num a) = let g = ...
+ in
+ ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
+Here, g is only called at one type, but the dictionary isn't in scope at the
+definition point for g. Usually the type checker would build a
+definition for d1 which enclosed g, but the transformation system
+might have moved d1's defn inward. Solution: float dictionary bindings
+outwards along with call instances.
-Dictionary floating
-~~~~~~~~~~~~~~~~~~~
Consider
f x = let g p q = p==q
to widen its scope. Notice that this floating can't be done in advance -- it only
shows up when specialisation is done.
-DELICATE MATTER: the way we tell a dictionary binding is by looking to
-see if it has a Dict type. If the type has been "undictify'd", so that
-it looks like a tuple, then the dictionary binding won't be floated, and
-an opportunity to specialise might be lost.
-
User SPECIALIZE pragmas
~~~~~~~~~~~~~~~~~~~~~~~
Specialisation pragmas can be digested by the type checker, and implemented
The information about what instance of the dfun exist gets added to
the dfun's IdInfo in the same way as a user-defined function too.
-In fact, matters are a little bit more complicated than this.
-When we make one of these specialised instances, we are defining
-a constant dictionary, and so we want immediate access to its constant
-methods and superclasses. Indeed, these constant methods and superclasses
-must be in the IdInfo for the class selectors! We need help from the
-typechecker to sort this out, perhaps by generating a separate IdInfo
-for each.
Automatic instance decl specialisation?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
new call-instances of the dfuns, and so on. This all arises because of
the unrestricted mutual recursion between instance decls and value decls.
+Still, there's no actual problem; it just means that we may not do all
+the specialisation we could theoretically do.
+
Furthermore, instance decls are usually exported and used non-locally,
so we'll want to compile enough to get those specialisations done.
back in as a pragma when next compiling the file. So for now,
we only specialise instance decls in response to pragmas.
-That means that even if an instance decl ain't otherwise exported it
-needs to be spat out as with a SPECIALIZE pragma. Furthermore, it needs
-something to say which module defined the instance, so the usage info
-can be fed into the right reqts info file. Blegh.
-
-
-SPECIAILISING DATA DECLARATIONS
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-
-With unboxed specialisation (or full specialisation) we also require
-data types (and their constructors) to be speciailised on unboxed
-type arguments.
-
-In addition to normal call instances we gather TyCon call instances at
-unboxed types, determine equivalence classes for the locally defined
-TyCons and build speciailised data constructor Ids for each TyCon and
-substitute these in the Con calls.
-
-We need the list of local TyCons to partition the TyCon instance info.
-We pass out a FiniteMap from local TyCons to Specialised Instances to
-give to the interface and code genertors.
-
-N.B. The specialised data constructors reference the original data
-constructor and type constructor which do not have the updated
-specialisation info attached. Any specialisation info must be
-extracted from the TyCon map returned.
-
SPITTING OUT USAGE INFORMATION
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is done at the top-level when all the call instances which escape
must be for imported functions and data types.
+*** Not currently done ***
+
Partial specialisation by pragmas
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In short, dfun Ids need IdInfo with a specialisation for each
constant instance of their instance declaration.
+All this uses a single mechanism: the SpecEnv inside an Id
+
What does the specialisation IdInfo look like?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
- SpecInfo
- [Maybe Type] -- Instance types
- Int -- No of dicts to eat
- Id -- Specialised version
+The SpecEnv of an Id maps a list of types (the template) to an expression
+
+ [Type] |-> Expr
For example, if f has this SpecInfo:
- SpecInfo [Just t1, Nothing, Just t3] 2 f'
+ [Int, a] -> \d:Ord Int. f' a
+
+it means that we can replace the call
-then
+ f Int t ===> (\d. f' t)
- f t1 t2 t3 d1 d2 ===> f t2
+This chucks one dictionary away and proceeds with the
+specialised version of f, namely f'.
-The "Nothings" identify type arguments in which the specialised
-version is polymorphic.
What can't be done this way?
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
overloading altogether anyway!
-Mutter mutter
-~~~~~~~~~~~~~
-What about types/classes mentioned in SPECIALIZE pragmas spat out,
-but not otherwise exported. Even if they are exported, what about
-their original names.
-
-Suggestion: use qualified names in pragmas, omitting module for
-prelude and "this module".
-
-
-Mutter mutter 2
-~~~~~~~~~~~~~~~
-Consider this
-
- f a (d::Num a) = let g = ...
- in
- ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
-
-Here, g is only called at one type, but the dictionary isn't in scope at the
-definition point for g. Usually the type checker would build a
-definition for d1 which enclosed g, but the transformation system
-might have moved d1's defn inward.
-
-
-Unboxed bindings
-~~~~~~~~~~~~~~~~
-
-What should we do when a value is specialised to a *strict* unboxed value?
-
- map_*_* f (x:xs) = let h = f x
- t = map f xs
- in h:t
-
-Could convert let to case:
-
- map_*_Int# f (x:xs) = case f x of h# ->
- let t = map f xs
- in h#:t
-
-This may be undesirable since it forces evaluation here, but the value
-may not be used in all branches of the body. In the general case this
-transformation is impossible since the mutual recursion in a letrec
-cannot be expressed as a case.
-
-There is also a problem with top-level unboxed values, since our
-implementation cannot handle unboxed values at the top level.
-
-Solution: Lift the binding of the unboxed value and extract it when it
-is used:
-
- map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
- t = map f xs
- in case h of
- _Lift h# -> h#:t
-
-Now give it to the simplifier and the _Lifting will be optimised away.
-
-The benfit is that we have given the specialised "unboxed" values a
-very simplep lifted semantics and then leave it up to the simplifier to
-optimise it --- knowing that the overheads will be removed in nearly
-all cases.
-
-In particular, the value will only be evaluted in the branches of the
-program which use it, rather than being forced at the point where the
-value is bound. For example:
-
- filtermap_*_* p f (x:xs)
- = let h = f x
- t = ...
- in case p x of
- True -> h:t
- False -> t
- ==>
- filtermap_*_Int# p f (x:xs)
- = let h = case (f x) of h# -> _Lift h#
- t = ...
- in case p x of
- True -> case h of _Lift h#
- -> h#:t
- False -> t
-
-The binding for h can still be inlined in the one branch and the
-_Lifting eliminated.
-
-
-Question: When won't the _Lifting be eliminated?
-
-Answer: When they at the top-level (where it is necessary) or when
-inlining would duplicate work (or possibly code depending on
-options). However, the _Lifting will still be eliminated if the
-strictness analyser deems the lifted binding strict.
-
A note about non-tyvar dictionaries
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
\begin{code}
specProgram :: UniqSupply -> [CoreBinding] -> [CoreBinding]
specProgram us binds
- = initSM us (go binds `thenSM` \ (binds', _) ->
- returnSM binds'
+ = initSM us (go binds `thenSM` \ (binds', uds') ->
+ returnSM (dumpAllDictBinds uds' binds')
)
where
go [] = returnSM ([], emptyUDs)
specExpr e@(Con _ _) = returnSM (e, emptyUDs)
specExpr e@(Prim _ _) = returnSM (e, emptyUDs)
-specExpr (Coerce co ty body)
- = specExpr body `thenSM` \ (body', uds) ->
- returnSM (Coerce co ty body', uds)
-
-specExpr (SCC cc body)
+specExpr (Note note body)
= specExpr body `thenSM` \ (body', uds) ->
- returnSM (SCC cc body', uds)
+ returnSM (Note note body', uds)
---------------- Applications might generate a call instance --------------------
returnSM ([], all_uds)
| isSpecPragmaId bndr
- -- SpecPragmaIds are there solely to generate specialisations
- -- Just drop the whole binding; keep only its usage details
= specExpr rhs `thenSM` \ (rhs', rhs_uds) ->
returnSM ([], rhs_uds `plusUDs` body_uds)
specDefn (calls body_uds) (bndr,rhs) `thenSM` \ ((bndr',rhs'), spec_defns, spec_uds) ->
let
(all_uds, (dict_binds, dump_calls))
- = splitUDs [ValBinder bndr'] (spec_uds `plusUDs` body_uds)
+ = splitUDs [ValBinder bndr] (spec_uds `plusUDs` body_uds)
+
+ -- If we make specialisations then we Rec the whole lot together
+ -- If not, leave it as a NonRec
+ new_bind | null spec_defns = NonRec bndr' rhs'
+ | otherwise = Rec ((bndr',rhs'):spec_defns)
in
- returnSM ( [NonRec bndr' rhs']
- ++ dict_binds
- ++ spec_defns,
- all_uds )
+ returnSM ( new_bind : dict_binds, all_uds )
specBind (Rec pairs) body_uds
= mapSM (specDefn (calls body_uds)) pairs `thenSM` \ stuff ->
spec_defns = concat spec_defns_s
spec_uds = plusUDList spec_uds_s
(all_uds, (dict_binds, dump_calls))
- = splitUDs (map (ValBinder . fst) pairs') (spec_uds `plusUDs` body_uds)
+ = splitUDs (map (ValBinder . fst) pairs) (spec_uds `plusUDs` body_uds)
+ new_bind = Rec (spec_defns ++ pairs')
in
- returnSM ( [Rec pairs']
- ++ dict_binds
- ++ spec_defns,
- all_uds )
+ returnSM ( new_bind : dict_binds, all_uds )
specDefn :: CallDetails -- Info on how it is used in its scope
-> (Id, CoreExpr) -- The thing being bound and its un-processed RHS
-> SpecM ((Id, CoreExpr), -- The thing and its processed RHS
-- the Id may now have specialisations attached
- [CoreBinding], -- Extra, specialised bindings
+ [(Id,CoreExpr)], -- Extra, specialised bindings
UsageDetails -- Stuff to fling upwards from the RHS and its
) -- specialised versions
returnSM ((fn, rhs'), [], rhs_uds)
where
- fn_type = idType fn
- (tyvars, theta, tau) = splitSigmaTy fn_type
- n_tyvars = length tyvars
- n_dicts = length theta
+ fn_type = idType fn
+ (tyvars, theta, tau) = splitSigmaTy fn_type
+ n_tyvars = length tyvars
+ n_dicts = length theta
+ mk_spec_tys call_ts = zipWith mk_spec_ty call_ts alphaTyVars
+ where
+ mk_spec_ty (Just ty) _ = ty
+ mk_spec_ty Nothing tyvar = mkTyVarTy tyvar
(rhs_tyvars, rhs_ids, rhs_body) = collectBinders rhs
rhs_dicts = take n_dicts rhs_ids
Nothing -> []
Just cs -> fmToList cs
+ ----------------------------------------------------------
-- Specialise to one particular call pattern
spec_call :: ProtoUsageDetails -- From the original body, captured by
-- the dictionary lambdas
-> ([Maybe Type], [DictVar]) -- Call instance
- -> SpecM (CoreBinding, -- Specialised definition
+ -> SpecM ((Id,CoreExpr), -- Specialised definition
UsageDetails, -- Usage details from specialised body
- ([Type], CoreExpr)) -- Info for the Id's SpecEnv
+ ([TyVar], [Type], CoreExpr)) -- Info for the Id's SpecEnv
spec_call bound_uds (call_ts, call_ds)
= ASSERT( length call_ts == n_tyvars && length call_ds == n_dicts )
-- Calls are only recorded for properly-saturated applications
-- f1 = /\ b d -> (..rhs of f..) t1 b t3 d d1 d2
-- and the type of this binder
let
- spec_tyvars = [tyvar | (tyvar, Nothing) <- tyvars `zip` call_ts]
- spec_tys = zipWith mk_spec_ty call_ts tyvars
+ spec_tyvars = [tyvar | (tyvar, Nothing) <- alphaTyVars `zip` call_ts]
+ spec_tys = mk_spec_tys call_ts
spec_rhs = mkTyLam spec_tyvars $
mkGenApp rhs (map TyArg spec_tys ++ map VarArg call_ds)
- spec_id_ty = mkForAllTys spec_tyvars (applyTys fn_type spec_tys)
-
- mk_spec_ty (Just ty) _ = ty
- mk_spec_ty Nothing tyvar = mkTyVarTy tyvar
+ spec_id_ty = mkForAllTys spec_tyvars (instantiateTy ty_env tau)
+ ty_env = mkTyVarEnv (zipEqual "spec_call" tyvars spec_tys)
in
+
newIdSM fn spec_id_ty `thenSM` \ spec_f ->
-- Construct the stuff for f's spec env
- -- [t1,b,t3,d] |-> \d1 d2 -> f1 b d
+ -- [b,d] [t1,b,t3,d] |-> \d1 d2 -> f1 b d
+ -- The only awkward bit is that d1,d2 might well be global
+ -- dictionaries, so it's tidier to make new local variables
+ -- for the lambdas in the RHS, rather than lambda-bind the
+ -- dictionaries themselves.
+ --
+ -- In fact we use the standard template locals, so that the
+ -- they don't need to be "tidied" before putting in interface files
let
- spec_env_rhs = mkValLam call_ds $
+ arg_ds = mkTemplateLocals (map idType call_ds)
+ spec_env_rhs = mkValLam arg_ds $
mkTyApp (Var spec_f) $
map mkTyVarTy spec_tyvars
- spec_env_info = (spec_tys, spec_env_rhs)
+ spec_env_info = (spec_tyvars, spec_tys, spec_env_rhs)
in
-- Specialise the UDs from f's RHS
let
- tv_env = [ (rhs_tyvar,ty)
+ -- Only the overloaded tyvars should be free in the uds
+ ty_env = [ (rhs_tyvar,ty)
| (rhs_tyvar, Just ty) <- zipEqual "specUDs1" rhs_tyvars call_ts
]
dict_env = zipEqual "specUDs2" rhs_dicts call_ds
in
- specUDs tv_env dict_env bound_uds `thenSM` \ spec_uds ->
+ specUDs ty_env dict_env bound_uds `thenSM` \ spec_uds ->
- returnSM (NonRec spec_f spec_rhs,
+ returnSM ((spec_f, spec_rhs),
spec_uds,
spec_env_info
)
addDictBind uds dict rhs = uds { dict_binds = mkDB dict rhs `consBag` dict_binds uds }
+dumpAllDictBinds (MkUD {dict_binds = dbs}) binds
+ = foldrBag add binds dbs
+ where
+ add (dict,rhs,_,_) binds = NonRec dict rhs : binds
+
dumpUDs :: [CoreBinder]
-> UsageDetails -> CoreExpr
-> (UsageDetails, CoreExpr)
instantiateDictRhs ty_env id_env rhs
= go rhs
where
- go (App e1 (VarArg a)) = App (go e1) (VarArg (lookupId id_env a))
- go (App e1 (TyArg t)) = App (go e1) (TyArg (instantiateTy ty_env t))
- go (Var v) = Var (lookupId id_env v)
- go (Lit l) = Lit l
+ go_arg (VarArg a) = VarArg (lookupId id_env a)
+ go_arg (TyArg t) = TyArg (instantiateTy ty_env t)
+
+ go (App e1 arg) = App (go e1) (go_arg arg)
+ go (Var v) = Var (lookupId id_env v)
+ go (Lit l) = Lit l
+ go (Con con args) = Con con (map go_arg args)
+ go (Note n e) = Note (go_note n) (go e)
+ go (Case e alts) = Case (go e) alts -- See comment below re alts
+ go other = pprPanic "instantiateDictRhs" (ppr rhs)
+
+ go_note (Coerce t1 t2) = Coerce (instantiateTy ty_env t1) (instantiateTy ty_env t2)
+ go_note note = note
dictRhsFVs :: CoreExpr -> IdSet
-- Cheapo function for simple RHSs
-dictRhsFVs (App e1 (VarArg a)) = dictRhsFVs e1 `addOneToIdSet` a
-dictRhsFVs (App e1 (TyArg t)) = dictRhsFVs e1
-dictRhsFVs (Var v) = unitIdSet v
-dictRhsFVs (Lit l) = emptyIdSet
+dictRhsFVs e
+ = go e
+ where
+ go (App e1 (VarArg a)) = go e1 `addOneToIdSet` a
+ go (App e1 (LitArg l)) = go e1
+ go (App e1 (TyArg t)) = go e1
+ go (Var v) = unitIdSet v
+ go (Lit l) = emptyIdSet
+ go (Con _ args) = mkIdSet [id | VarArg id <- args]
+ go (Note _ e) = go e
+
+ go (Case e _) = go e -- Claim: no free dictionaries in the alternatives
+ -- These case expressions are of the form
+ -- case d of { D a b c -> b }
+
+ go other = pprPanic "dictRhsFVs" (ppr e)
addIdSpecialisations id spec_stuff
pprTrace "Duplicate specialisations" (vcat (map ppr errs))
else \x -> x
)
- addIdSpecialisation id new_spec_env
+ setIdSpecialisation id new_spec_env
where
(new_spec_env, errs) = foldr add (getIdSpecialisation id, []) spec_stuff
- add (tys, template) (spec_env, errs)
- = case addToSpecEnv spec_env tys (occurAnalyseGlobalExpr template) of
+ add (tyvars, tys, template) (spec_env, errs)
+ = case addToSpecEnv True spec_env tyvars tys template of
Succeeded spec_env' -> (spec_env', errs)
Failed err -> (spec_env, err:errs)
+-- Given an Id, isSpecVars returns all its specialisations.
+-- We extract these from its SpecEnv.
+-- This is used by the occurrence analyser and free-var finder;
+-- we regard an Id's specialisations as free in the Id's definition.
+
+idSpecVars :: Id -> [Id]
+idSpecVars id
+ = map get_spec (specEnvValues (getIdSpecialisation id))
+ where
+ -- get_spec is another cheapo function like dictRhsFVs
+ -- It knows what these specialisation temlates look like,
+ -- and just goes for the jugular
+ get_spec (App f _) = get_spec f
+ get_spec (Lam _ b) = get_spec b
+ get_spec (Var v) = v
+
----------------------------------------
type SpecM a = UniqSM a
\end{code}
+ Old (but interesting) stuff about unboxed bindings
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+What should we do when a value is specialised to a *strict* unboxed value?
+
+ map_*_* f (x:xs) = let h = f x
+ t = map f xs
+ in h:t
+
+Could convert let to case:
+
+ map_*_Int# f (x:xs) = case f x of h# ->
+ let t = map f xs
+ in h#:t
+
+This may be undesirable since it forces evaluation here, but the value
+may not be used in all branches of the body. In the general case this
+transformation is impossible since the mutual recursion in a letrec
+cannot be expressed as a case.
+
+There is also a problem with top-level unboxed values, since our
+implementation cannot handle unboxed values at the top level.
+
+Solution: Lift the binding of the unboxed value and extract it when it
+is used:
+
+ map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
+ t = map f xs
+ in case h of
+ _Lift h# -> h#:t
+
+Now give it to the simplifier and the _Lifting will be optimised away.
+
+The benfit is that we have given the specialised "unboxed" values a
+very simplep lifted semantics and then leave it up to the simplifier to
+optimise it --- knowing that the overheads will be removed in nearly
+all cases.
+
+In particular, the value will only be evaluted in the branches of the
+program which use it, rather than being forced at the point where the
+value is bound. For example:
+
+ filtermap_*_* p f (x:xs)
+ = let h = f x
+ t = ...
+ in case p x of
+ True -> h:t
+ False -> t
+ ==>
+ filtermap_*_Int# p f (x:xs)
+ = let h = case (f x) of h# -> _Lift h#
+ t = ...
+ in case p x of
+ True -> case h of _Lift h#
+ -> h#:t
+ False -> t
+
+The binding for h can still be inlined in the one branch and the
+_Lifting eliminated.
+
+
+Question: When won't the _Lifting be eliminated?
+
+Answer: When they at the top-level (where it is necessary) or when
+inlining would duplicate work (or possibly code depending on
+options). However, the _Lifting will still be eliminated if the
+strictness analyser deems the lifted binding strict.
+