X-Git-Url: http://git.megacz.com/?a=blobdiff_plain;f=compiler%2Fspecialise%2FSpecConstr.lhs;h=918585c22abeb85af6eae9e4c7dfbc9a91eb4b88;hb=969baa167e4afa382b2558a3648d57862c4401eb;hp=921dc04674af6b5cbb66874047734b0051961b11;hpb=64f00b23e172aae40609b5deca87f83aa6f5447a;p=ghc-hetmet.git diff --git a/compiler/specialise/SpecConstr.lhs b/compiler/specialise/SpecConstr.lhs index 921dc04..918585c 100644 --- a/compiler/specialise/SpecConstr.lhs +++ b/compiler/specialise/SpecConstr.lhs @@ -22,7 +22,7 @@ import DataCon ( dataConRepArity, isVanillaDataCon ) import Type ( tyConAppArgs, tyVarsOfTypes ) import Unify ( coreRefineTys ) import Id ( Id, idName, idType, isDataConWorkId_maybe, - mkUserLocal, mkSysLocal ) + mkUserLocal, mkSysLocal, idUnfolding ) import Var ( Var ) import VarEnv import VarSet @@ -93,11 +93,40 @@ In Core, by the time we've w/wd (f is strict in i) we get At the call to f, we see that the argument, n is know to be (I# n#), and n is evaluated elsewhere in the body of f, so we can play the same -trick as above. However we don't want to do that if the boxed version -of n is needed (else we'd avoid the eval but pay more for re-boxing n). -So in this case we want that the *only* uses of n are in case statements. +trick as above. +Note [Reboxing] +~~~~~~~~~~~~~~~ +We must be careful not to allocate the same constructor twice. Consider + f p = (...(case p of (a,b) -> e)...p..., + ...let t = (r,s) in ...t...(f t)...) +At the recursive call to f, we can see that t is a pair. But we do NOT want +to make a specialised copy: + f' a b = let p = (a,b) in (..., ...) +because now t is allocated by the caller, then r and s are passed to the +recursive call, which allocates the (r,s) pair again. + +This happens if + (a) the argument p is used in other than a case-scrutinsation way. + (b) the argument to the call is not a 'fresh' tuple; you have to + look into its unfolding to see that it's a tuple + +Hence the "OR" part of Note [Good arguments] below. + +ALTERNATIVE: pass both boxed and unboxed versions. This no longer saves +allocation, but does perhaps save evals. In the RULE we'd have +something like + + f (I# x#) = f' (I# x#) x# + +If at the call site the (I# x) was an unfolding, then we'd have to +rely on CSE to eliminate the duplicate allocation.... This alternative +doesn't look attractive enough to pursue. + + +Note [Good arguments] +~~~~~~~~~~~~~~~~~~~~~ So we look for * A self-recursive function. Ignore mutual recursion for now, @@ -119,9 +148,11 @@ So we look for That same parameter is scrutinised by a case somewhere in the RHS of the function AND - Those are the only uses of the parameter + Those are the only uses of the parameter (see Note [Reboxing]) +What to abstract over +~~~~~~~~~~~~~~~~~~~~~ There's a bit of a complication with type arguments. If the call site looks like @@ -157,7 +188,7 @@ So the grand plan is: * Find the free variables of the abstracted pattern * Pass these variables, less any that are in scope at - the fn defn. + the fn defn. But see Note [Shadowing] below. NOTICE that we only abstract over variables that are not in scope, @@ -165,6 +196,185 @@ so we're in no danger of shadowing variables used in "higher up" in f_spec's RHS. +Note [Shadowing] +~~~~~~~~~~~~~~~~ +In this pass we gather up usage information that may mention variables +that are bound between the usage site and the definition site; or (more +seriously) may be bound to something different at the definition site. +For example: + + f x = letrec g y v = let x = ... + in ...(g (a,b) x)... + +Since 'x' is in scope at the call site, we may make a rewrite rule that +looks like + RULE forall a,b. g (a,b) x = ... +But this rule will never match, because it's really a different 'x' at +the call site -- and that difference will be manifest by the time the +simplifier gets to it. [A worry: the simplifier doesn't *guarantee* +no-shadowing, so perhaps it may not be distinct?] + +Anyway, the rule isn't actually wrong, it's just not useful. One possibility +is to run deShadowBinds before running SpecConstr, but instead we run the +simplifier. That gives the simplest possible program for SpecConstr to +chew on; and it virtually guarantees no shadowing. + +----------------------------------------------------- + Stuff not yet handled +----------------------------------------------------- + +Here are notes arising from Roman's work that I don't want to lose. + +Specialising for constant parameters +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +This one is about specialising on a *constant* (but not necessarily +constructor) argument + + foo :: Int -> (Int -> Int) -> Int + foo 0 f = 0 + foo m f = foo (f m) (+1) + +It produces + + lvl_rmV :: GHC.Base.Int -> GHC.Base.Int + lvl_rmV = + \ (ds_dlk :: GHC.Base.Int) -> + case ds_dlk of wild_alH { GHC.Base.I# x_alG -> + GHC.Base.I# (GHC.Prim.+# x_alG 1) + + T.$wfoo :: GHC.Prim.Int# -> (GHC.Base.Int -> GHC.Base.Int) -> + GHC.Prim.Int# + T.$wfoo = + \ (ww_sme :: GHC.Prim.Int#) (w_smg :: GHC.Base.Int -> GHC.Base.Int) -> + case ww_sme of ds_Xlw { + __DEFAULT -> + case w_smg (GHC.Base.I# ds_Xlw) of w1_Xmo { GHC.Base.I# ww1_Xmz -> + T.$wfoo ww1_Xmz lvl_rmV + }; + 0 -> 0 + } + +The recursive call has lvl_rmV as its argument, so we could create a specialised copy +with that argument baked in; that is, not passed at all. Now it can perhaps be inlined. + +When is this worth it? Call the constant 'lvl' +- If 'lvl' has an unfolding that is a constructor, see if the corresponding + parameter is scrutinised anywhere in the body. + +- If 'lvl' has an unfolding that is a inlinable function, see if the corresponding + parameter is applied (...to enough arguments...?) + + Also do this is if the function has RULES? + +Also + +Specialising for lambdas +~~~~~~~~~~~~~~~~~~~~~~~~ + foo :: Int -> (Int -> Int) -> Int + foo 0 f = 0 + foo m f = foo (f m) (\n -> n-m) + +This is subtly different from the previous one in that we get an +explicit lambda as the argument: + + T.$wfoo :: GHC.Prim.Int# -> (GHC.Base.Int -> GHC.Base.Int) -> + GHC.Prim.Int# + T.$wfoo = + \ (ww_sm8 :: GHC.Prim.Int#) (w_sma :: GHC.Base.Int -> GHC.Base.Int) -> + case ww_sm8 of ds_Xlr { + __DEFAULT -> + case w_sma (GHC.Base.I# ds_Xlr) of w1_Xmf { GHC.Base.I# ww1_Xmq -> + T.$wfoo + ww1_Xmq + (\ (n_ad3 :: GHC.Base.Int) -> + case n_ad3 of wild_alB { GHC.Base.I# x_alA -> + GHC.Base.I# (GHC.Prim.-# x_alA ds_Xlr) + }) + }; + 0 -> 0 + } + +I wonder if SpecConstr couldn't be extended to handle this? After all, +lambda is a sort of constructor for functions and perhaps it already +has most of the necessary machinery? + +Furthermore, there's an immediate win, because you don't need to allocate the lamda +at the call site; and if perchance it's called in the recursive call, then you +may avoid allocating it altogether. Just like for constructors. + +Looks cool, but probably rare...but it might be easy to implement. + +Example 1 +~~~~~~~~~ + data T a = T !a + + foo :: Int -> T Int -> Int + foo 0 t = 0 + foo x t | even x = case t of { T n -> foo (x-n) t } + | otherwise = foo (x-1) t + +SpecConstr does no specialisation, because the second recursive call +looks like a boxed use of the argument. A pity. + + $wfoo_sFw :: GHC.Prim.Int# -> T.T GHC.Base.Int -> GHC.Prim.Int# + $wfoo_sFw = + \ (ww_sFo [Just L] :: GHC.Prim.Int#) (w_sFq [Just L] :: T.T GHC.Base.Int) -> + case ww_sFo of ds_Xw6 [Just L] { + __DEFAULT -> + case GHC.Prim.remInt# ds_Xw6 2 of wild1_aEF [Dead Just A] { + __DEFAULT -> $wfoo_sFw (GHC.Prim.-# ds_Xw6 1) w_sFq; + 0 -> + case w_sFq of wild_Xy [Just L] { T.T n_ad5 [Just U(L)] -> + case n_ad5 of wild1_aET [Just A] { GHC.Base.I# y_aES [Just L] -> + $wfoo_sFw (GHC.Prim.-# ds_Xw6 y_aES) wild_Xy + } } }; + 0 -> 0 + +Example 2 +~~~~~~~~~ + data a :*: b = !a :*: !b + data T a = T !a + + foo :: (Int :*: T Int) -> Int + foo (0 :*: t) = 0 + foo (x :*: t) | even x = case t of { T n -> foo ((x-n) :*: t) } + | otherwise = foo ((x-1) :*: t) + +Very similar to the previous one, except that the parameters are now in +a strict tuple. Before SpecConstr, we have + + $wfoo_sG3 :: GHC.Prim.Int# -> T.T GHC.Base.Int -> GHC.Prim.Int# + $wfoo_sG3 = + \ (ww_sFU [Just L] :: GHC.Prim.Int#) (ww_sFW [Just L] :: T.T + GHC.Base.Int) -> + case ww_sFU of ds_Xws [Just L] { + __DEFAULT -> + case GHC.Prim.remInt# ds_Xws 2 of wild1_aEZ [Dead Just A] { + __DEFAULT -> + case ww_sFW of tpl_B2 [Just L] { T.T a_sFo [Just A] -> + $wfoo_sG3 (GHC.Prim.-# ds_Xws 1) tpl_B2 -- $wfoo1 + }; + 0 -> + case ww_sFW of wild_XB [Just A] { T.T n_ad7 [Just S(L)] -> + case n_ad7 of wild1_aFd [Just L] { GHC.Base.I# y_aFc [Just L] -> + $wfoo_sG3 (GHC.Prim.-# ds_Xws y_aFc) wild_XB -- $wfoo2 + } } }; + 0 -> 0 } + +We get two specialisations: +"SC:$wfoo1" [0] __forall {a_sFB :: GHC.Base.Int sc_sGC :: GHC.Prim.Int#} + Foo.$wfoo sc_sGC (Foo.T @ GHC.Base.Int a_sFB) + = Foo.$s$wfoo1 a_sFB sc_sGC ; +"SC:$wfoo2" [0] __forall {y_aFp :: GHC.Prim.Int# sc_sGC :: GHC.Prim.Int#} + Foo.$wfoo sc_sGC (Foo.T @ GHC.Base.Int (GHC.Base.I# y_aFp)) + = Foo.$s$wfoo y_aFp sc_sGC ; + +But perhaps the first one isn't good. After all, we know that tpl_B2 is +a T (I# x) really, because T is strict and Int has one constructor. (We can't +unbox the strict fields, becuase T is polymorphic!) + + + %************************************************************************ %* * \subsection{Top level wrapper stuff} @@ -415,7 +625,10 @@ scBind :: ScEnv -> CoreBind -> UniqSM (ScEnv, ScUsage, CoreBind) scBind env (Rec [(fn,rhs)]) | notNull val_bndrs = scExpr env_fn_body body `thenUs` \ (usg, body') -> - specialise env fn bndrs body usg `thenUs` \ (rules, spec_prs) -> + specialise env fn bndrs body' usg `thenUs` \ (rules, spec_prs) -> + -- Note body': the specialised copies should be based on the + -- optimised version of the body, in case there were + -- nested functions inside. let SCU { calls = calls, occs = occs } = usg in @@ -484,6 +697,7 @@ specialise env fn bndrs body (SCU {calls=calls, occs=occs}) --------------------- good_arg :: ConstrEnv -> IdEnv ArgOcc -> (CoreBndr, CoreArg) -> Bool +-- See Note [Good arguments] above good_arg con_env arg_occs (bndr, arg) = case is_con_app_maybe con_env arg of Just _ -> bndr_usg_ok arg_occs bndr arg @@ -535,6 +749,8 @@ spec_one env fn rhs (pats, rule_number) spec_occ = mkSpecOcc (nameOccName fn_name) pat_fvs = varSetElems (exprsFreeVars pats) vars_to_bind = filter not_avail pat_fvs + -- See Note [Shadowing] at the top + not_avail v = not (v `elemVarEnv` scope env) -- Put the type variables first; the type of a term -- variable may mention a type variable @@ -610,10 +826,20 @@ argsToPats env us args = mapAccumL (argToPat env) us args \begin{code} is_con_app_maybe :: ConstrEnv -> CoreExpr -> Maybe ConValue is_con_app_maybe env (Var v) - = lookupVarEnv env v - -- You might think we could look in the idUnfolding here - -- but that doesn't take account of which branch of a - -- case we are in, which is the whole point + = case lookupVarEnv env v of + Just stuff -> Just stuff + -- You might think we could look in the idUnfolding here + -- but that doesn't take account of which branch of a + -- case we are in, which is the whole point + + Nothing | isCheapUnfolding unf + -> is_con_app_maybe env (unfoldingTemplate unf) + where + unf = idUnfolding v + -- However we do want to consult the unfolding as well, + -- for let-bound constructors! + + other -> Nothing is_con_app_maybe env (Lit lit) = Just (CV (LitAlt lit) [])