the scrutinee of the case, and we can inline it.
\begin{code}
+{-# OPTIONS -w #-}
+-- The above warning supression flag is a temporary kludge.
+-- While working on this module you are encouraged to remove it and fix
+-- any warnings in the module. See
+-- http://hackage.haskell.org/trac/ghc/wiki/Commentary/CodingStyle#Warnings
+-- for details
+
module SetLevels (
setLevels,
import CoreSyn
import DynFlags ( FloatOutSwitches(..) )
-import CoreUtils ( exprType, exprIsTrivial, exprIsCheap, mkPiTypes )
+import CoreUtils ( exprType, exprIsTrivial, mkPiTypes )
import CoreFVs -- all of it
import CoreSubst ( Subst, emptySubst, extendInScope, extendIdSubst,
cloneIdBndr, cloneRecIdBndrs )
idSpecialisation, idWorkerInfo, setIdInfo
)
import IdInfo ( workerExists, vanillaIdInfo, isEmptySpecInfo )
-import Var ( Var )
+import Var
import VarSet
import VarEnv
import Name ( getOccName )
context @Level 0 0@.
-InlineCtxt
-~~~~~~~~~~
+Note [FloatOut inside INLINE]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
@InlineCtxt@ very similar to @Level 0 0@, but is used for one purpose:
to say "don't float anything out of here". That's exactly what we
want for the body of an INLINE, where we don't want to float anything
-- that if we'll escape a value lambda, or will go to the top level.
good_destination
| dest_lvl `ltMajLvl` ctxt_lvl -- Escapes a value lambda
- = not (exprIsCheap expr) || isTopLvl dest_lvl
- -- Even if it escapes a value lambda, we only
- -- float if it's not cheap (unless it'll get all the
- -- way to the top). I've seen cases where we
- -- float dozens of tiny free expressions, which cost
- -- more to allocate than to evaluate.
- -- NB: exprIsCheap is also true of bottom expressions, which
- -- is good; we don't want to share them
- --
- -- It's only Really Bad to float a cheap expression out of a
- -- strict context, because that builds a thunk that otherwise
- -- would never be built. So another alternative would be to
- -- add
- -- || (strict_ctxt && not (exprIsBottom expr))
- -- to the condition above. We should really try this out.
+ = True
+ -- OLD CODE: not (exprIsCheap expr) || isTopLvl dest_lvl
+ -- see Note [Escaping a value lambda]
| otherwise -- Does not escape a value lambda
= isTopLvl dest_lvl -- Only float if we are going to the top level
-- which is pretty stupid. Hence the strict_ctxt test
\end{code}
+Note [Escaping a value lambda]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We want to float even cheap expressions out of value lambdas,
+because that saves allocation. Consider
+ f = \x. .. (\y.e) ...
+Then we'd like to avoid allocating the (\y.e) every time we call f,
+(assuming e does not mention x).
+
+An example where this really makes a difference is simplrun009.
+
+Another reason it's good is because it makes SpecContr fire on functions.
+Consider
+ f = \x. ....(f (\y.e))....
+After floating we get
+ lvl = \y.e
+ f = \x. ....(f lvl)...
+and that is much easier for SpecConstr to generate a robust specialisation for.
+
+The OLD CODE (given where this Note is referred to) prevents floating
+of the example above, so I just don't understand the old code. I
+don't understand the old comment either (which appears below). I
+measured the effect on nofib of changing OLD CODE to 'True', and got
+zeros everywhere, but a 4% win for 'puzzle'. Very small 0.5% loss for
+'cse'; turns out to be because our arity analysis isn't good enough
+yet (mentioned in Simon-nofib-notes).
+
+OLD comment was:
+ Even if it escapes a value lambda, we only
+ float if it's not cheap (unless it'll get all the
+ way to the top). I've seen cases where we
+ float dozens of tiny free expressions, which cost
+ more to allocate than to evaluate.
+ NB: exprIsCheap is also true of bottom expressions, which
+ is good; we don't want to share them
+
+ It's only Really Bad to float a cheap expression out of a
+ strict context, because that builds a thunk that otherwise
+ would never be built. So another alternative would be to
+ add
+ || (strict_ctxt && not (exprIsBottom expr))
+ to the condition above. We should really try this out.
+
%************************************************************************
%* *
-- We also use these envs when making a variable polymorphic
-- because we want to float it out past a big lambda.
--
- -- The SubstEnv and IdEnv always implement the same mapping, but the
- -- SubstEnv maps to CoreExpr and the IdEnv to LevelledExpr
+ -- The Subst and IdEnv always implement the same mapping, but the
+ -- Subst maps to CoreExpr and the IdEnv to LevelledExpr
-- Since the range is always a variable or type application,
-- there is never any difference between the two, but sadly
-- the types differ. The SubstEnv is used when substituting in
-- Find the variables in fvs, free vars of the target expresion,
-- whose level is greater than the destination level
-- These are the ones we are going to abstract out
-abstractVars dest_lvl env fvs
- = uniq (sortLe le [var | fv <- varSetElems fvs, var <- absVarsOf dest_lvl env fv])
+abstractVars dest_lvl (_, lvl_env, _, id_env) fvs
+ = map zap $ uniq $ sortLe le
+ [var | fv <- varSetElems fvs
+ , var <- absVarsOf id_env fv
+ , abstract_me var ]
+ -- NB: it's important to call abstract_me only on the OutIds the
+ -- come from absVarsOf (not on fv, which is an InId)
where
- -- Sort the variables so we don't get
- -- mixed-up tyvars and Ids; it's just messy
- v1 `le` v2 = case (isId v1, isId v2) of
- (True, False) -> False
- (False, True) -> True
+ -- Sort the variables so the true type variables come first;
+ -- the tyvars scope over Ids and coercion vars
+ v1 `le` v2 = case (is_tv v1, is_tv v2) of
+ (True, False) -> True
+ (False, True) -> False
other -> v1 <= v2 -- Same family
+ is_tv v = isTyVar v && not (isCoVar v)
+
uniq :: [Var] -> [Var]
-- Remove adjacent duplicates; the sort will have brought them together
uniq (v1:v2:vs) | v1 == v2 = uniq (v2:vs)
| otherwise = v1 : uniq (v2:vs)
uniq vs = vs
-absVarsOf :: Level -> LevelEnv -> Var -> [Var]
- -- If f is free in the expression, and f maps to poly_f a b c in the
- -- current substitution, then we must report a b c as candidate type
- -- variables
-absVarsOf dest_lvl (_, lvl_env, _, id_env) v
- | isId v
- = [zap av2 | av1 <- lookup_avs v, av2 <- add_tyvars av1, abstract_me av2]
-
- | otherwise
- = if abstract_me v then [v] else []
-
- where
abstract_me v = case lookupVarEnv lvl_env v of
Just lvl -> dest_lvl `ltLvl` lvl
Nothing -> False
- lookup_avs v = case lookupVarEnv id_env v of
- Just (abs_vars, _) -> abs_vars
- Nothing -> [v]
-
- add_tyvars v = v : varSetElems (varTypeTyVars v)
-
-- We are going to lambda-abstract, so nuke any IdInfo,
-- and add the tyvars of the Id (if necessary)
zap v | isId v = WARN( workerExists (idWorkerInfo v) ||
text "absVarsOf: discarding info on" <+> ppr v )
setIdInfo v vanillaIdInfo
| otherwise = v
+
+absVarsOf :: IdEnv ([Var], LevelledExpr) -> Var -> [Var]
+ -- If f is free in the expression, and f maps to poly_f a b c in the
+ -- current substitution, then we must report a b c as candidate type
+ -- variables
+ --
+ -- Also, if x::a is an abstracted variable, then so is a; that is,
+ -- we must look in x's type
+ -- And similarly if x is a coercion variable.
+absVarsOf id_env v
+ | isId v = [av2 | av1 <- lookup_avs v
+ , av2 <- add_tyvars av1]
+ | isCoVar v = add_tyvars v
+ | otherwise = [v]
+
+ where
+ lookup_avs v = case lookupVarEnv id_env v of
+ Just (abs_vars, _) -> abs_vars
+ Nothing -> [v]
+
+ add_tyvars v = v : varSetElems (varTypeTyVars v)
\end{code}
\begin{code}
returnUs (env', vs2)
-- VERY IMPORTANT: we must zap the demand info
- -- if the thing is going to float out past a lambda
+ -- if the thing is going to float out past a lambda,
+ -- or if it's going to top level (where things can't be strict)
zap_demand dest_lvl ctxt_lvl id
- | ctxt_lvl == dest_lvl = id -- Stays put
- | otherwise = zapDemandIdInfo id -- Floats out
+ | ctxt_lvl == dest_lvl,
+ not (isTopLvl dest_lvl) = id -- Stays, and not going to top level
+ | otherwise = zapDemandIdInfo id -- Floats out
\end{code}