From: simonpj@microsoft.com Date: Fri, 11 Dec 2009 16:19:28 +0000 (+0000) Subject: Bottom extraction: float out bottoming expressions to top level X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=commitdiff_plain;h=b84ba676034763b3082bbd9405794a4fde499d14 Bottom extraction: float out bottoming expressions to top level The idea is to float out bottoming expressions to top level, abstracting them over any variables they mention, if necessary. This is good because it makes functions smaller (and more likely to inline), by keeping error code out of line. See Note [Bottoming floats] in SetLevels. On the way, this fixes the HPC failures for cg059 and friends. I've been meaning to do this for some time. See Maessen's paper 1999 "Bottom extraction: factoring error handling out of functional programs" (unpublished I think). Here are the nofib results: Program Size Allocs Runtime Elapsed -------------------------------------------------------------------------------- Min +0.1% -7.8% -14.4% -32.5% Max +0.5% +0.2% +1.6% +13.8% Geometric Mean +0.4% -0.2% -4.9% -6.7% Module sizes -1 s.d. ----- -2.6% +1 s.d. ----- +2.3% Average ----- -0.2% Compile times: -1 s.d. ----- -11.4% +1 s.d. ----- +4.3% Average ----- -3.8% I'm think program sizes have crept up because the base library is bigger -- module sizes in nofib decrease very slightly. In turn I think that may be because the floating generates a call where there was no call before. Anyway I think it's acceptable. The main changes are: * SetLevels floats out things that exprBotStrictness_maybe identifies as bottom. Make sure to pin on the right strictness info to the newly created Ids, so that the info ends up in interface files. Since FloatOut is run twice, we have to be careful that we don't treat the function created by the first float-out as a candidate for the second; this is what worthFloating does. See SetLevels Note [Bottoming floats] Note [Bottoming floats: eta expansion] * Be careful not to inline top-level bottoming functions; this would just undo what the floating transformation achieves. See CoreUnfold Note [Do not inline top-level bottoming functions Ensuring this requires a bit of extra plumbing, but nothing drastic.. * Similarly pre/postInlineUnconditionally should be careful not to re-inline top-level bottoming things! See SimplUtils Note [Top-level botomming Ids] Note [Top level and postInlineUnconditionally] --- diff --git a/compiler/coreSyn/CoreUnfold.lhs b/compiler/coreSyn/CoreUnfold.lhs index 0510e90..f374c00 100644 --- a/compiler/coreSyn/CoreUnfold.lhs +++ b/compiler/coreSyn/CoreUnfold.lhs @@ -1,4 +1,4 @@ -calcU% +% % (c) The University of Glasgow 2006 % (c) The AQUA Project, Glasgow University, 1994-1998 % @@ -72,12 +72,13 @@ import Outputable %************************************************************************ \begin{code} -mkTopUnfolding :: CoreExpr -> Unfolding -mkTopUnfolding expr = mkUnfolding True {- Top level -} expr +mkTopUnfolding :: Bool -> CoreExpr -> Unfolding +mkTopUnfolding is_bottoming expr + = mkUnfolding True {- Top level -} is_bottoming expr mkImplicitUnfolding :: CoreExpr -> Unfolding -- For implicit Ids, do a tiny bit of optimising first -mkImplicitUnfolding expr = mkTopUnfolding (simpleOptExpr expr) +mkImplicitUnfolding expr = mkTopUnfolding False (simpleOptExpr expr) -- Note [Top-level flag on inline rules] -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -85,8 +86,8 @@ mkImplicitUnfolding expr = mkTopUnfolding (simpleOptExpr expr) -- top-level flag to True. It gets set more accurately by the simplifier -- Simplify.simplUnfolding. -mkUnfolding :: Bool -> CoreExpr -> Unfolding -mkUnfolding top_lvl expr +mkUnfolding :: Bool -> Bool -> CoreExpr -> Unfolding +mkUnfolding top_lvl is_bottoming expr = CoreUnfolding { uf_tmpl = occurAnalyseExpr expr, uf_src = InlineRhs, uf_arity = arity, @@ -98,7 +99,8 @@ mkUnfolding top_lvl expr uf_guidance = guidance } where is_cheap = exprIsCheap expr - (arity, guidance) = calcUnfoldingGuidance is_cheap opt_UF_CreationThreshold expr + (arity, guidance) = calcUnfoldingGuidance is_cheap (top_lvl && is_bottoming) + opt_UF_CreationThreshold expr -- Sometimes during simplification, there's a large let-bound thing -- which has been substituted, and so is now dead; so 'expr' contains -- two copies of the thing while the occurrence-analysed expression doesn't @@ -146,6 +148,7 @@ mkInlineRule unsat_ok expr arity where expr' = simpleOptExpr expr boring_ok = case calcUnfoldingGuidance True -- Treat as cheap + False -- But not bottoming (arity+1) expr' of (_, UnfWhen _ boring_ok) -> boring_ok _other -> boringCxtNotOk @@ -163,10 +166,12 @@ mkInlineRule unsat_ok expr arity calcUnfoldingGuidance :: Bool -- True <=> the rhs is cheap, or we want to treat it -- as cheap (INLINE things) + -> Bool -- True <=> this is a top-level unfolding for a + -- diverging function; don't inline this -> Int -- Bomb out if size gets bigger than this -> CoreExpr -- Expression to look at -> (Arity, UnfoldingGuidance) -calcUnfoldingGuidance expr_is_cheap bOMB_OUT_SIZE expr +calcUnfoldingGuidance expr_is_cheap top_bot bOMB_OUT_SIZE expr = case collectBinders expr of { (bndrs, body) -> let val_bndrs = filter isId bndrs @@ -179,6 +184,9 @@ calcUnfoldingGuidance expr_is_cheap bOMB_OUT_SIZE expr | uncondInline n_val_bndrs (iBox size) && expr_is_cheap -> UnfWhen needSaturated boringCxtOk + | top_bot -- See Note [Do not inline top-level bottoming functions] + -> UnfNever + | otherwise -> UnfIfGoodArgs { ug_args = map (discount cased_bndrs) val_bndrs , ug_size = iBox size @@ -222,6 +230,15 @@ Notice that 'x' counts 0, while (f x) counts 2. That's deliberate: there's a function call to account for. Notice also that constructor applications are very cheap, because exposing them to a caller is so valuable. + +Note [Do not inline top-level bottoming functions] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The FloatOut pass has gone to some trouble to float out calls to 'error' +and similar friends. See Note [Bottoming floats] in SetLevels. +Do not re-inline them! But we *do* still inline if they are very small +(the uncondInline stuff). + + Note [Unconditional inlining] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ We inline *unconditionally* if inlined thing is smaller (using sizeExpr) @@ -566,7 +583,7 @@ actual arguments. \begin{code} couldBeSmallEnoughToInline :: Int -> CoreExpr -> Bool couldBeSmallEnoughToInline threshold rhs - = case calcUnfoldingGuidance False threshold rhs of + = case calcUnfoldingGuidance False False threshold rhs of (_, UnfNever) -> False _ -> True diff --git a/compiler/iface/MkIface.lhs b/compiler/iface/MkIface.lhs index 897c138..1c34edc 100644 --- a/compiler/iface/MkIface.lhs +++ b/compiler/iface/MkIface.lhs @@ -1472,6 +1472,8 @@ toIfaceIdInfo :: IdInfo -> [IfaceInfoItem] toIfaceIdInfo id_info = catMaybes [arity_hsinfo, caf_hsinfo, strict_hsinfo, inline_hsinfo, unfold_hsinfo] + -- NB: strictness must be before unfolding + -- See TcIface.tcUnfolding where ------------ Arity -------------- arity_info = arityInfo id_info diff --git a/compiler/iface/TcIface.lhs b/compiler/iface/TcIface.lhs index 2ec9de9..c9c33db 100644 --- a/compiler/iface/TcIface.lhs +++ b/compiler/iface/TcIface.lhs @@ -46,6 +46,7 @@ import VarEnv import Name import NameEnv import OccurAnal ( occurAnalyseExpr ) +import Demand ( isBottomingSig ) import Module import LazyUniqFM import UniqSupply @@ -1003,11 +1004,16 @@ tcIdInfo ignore_prags name ty info \begin{code} tcUnfolding :: Name -> Type -> IdInfo -> IfaceUnfolding -> IfL Unfolding -tcUnfolding name _ _ (IfCoreUnfold if_expr) +tcUnfolding name _ info (IfCoreUnfold if_expr) = do { mb_expr <- tcPragExpr name if_expr ; return (case mb_expr of Nothing -> NoUnfolding - Just expr -> mkTopUnfolding expr) } + Just expr -> mkTopUnfolding is_bottoming expr) } + where + -- Strictness should occur before unfolding! + is_bottoming = case strictnessInfo info of + Just sig -> isBottomingSig sig + Nothing -> False tcUnfolding name _ _ (IfInlineRule arity unsat_ok if_expr) = do { mb_expr <- tcPragExpr name if_expr @@ -1029,8 +1035,8 @@ tcUnfolding name ty info (IfWrapper arity wkr) (initUs_ us (mkWrapper ty strict_sig) wkr_id) arity - -- We are relying here on strictness info always appearing - -- before worker info, fingers crossed .... + -- Again we rely here on strictness info always appearing + -- before unfolding strict_sig = case strictnessInfo info of Just sig -> sig Nothing -> pprPanic "Worker info but no strictness for" (ppr wkr) diff --git a/compiler/main/TidyPgm.lhs b/compiler/main/TidyPgm.lhs index eefdd2d..41d9234 100644 --- a/compiler/main/TidyPgm.lhs +++ b/compiler/main/TidyPgm.lhs @@ -983,21 +983,24 @@ tidyTopPair show_unfold rhs_tidy_env caf_info name' (bndr, rhs) -- the RHS is bottom, it should jolly well be exposed _bottom_exposed = case exprBotStrictness_maybe rhs of Nothing -> True - Just (arity, _) -> appIsBottom str arity + Just (arity, _) -> appIsBottom str_sig arity where - str = strictnessInfo idinfo `orElse` topSig - - bndr1 = mkGlobalId details name' ty' idinfo' - details = idDetails bndr -- Preserve the IdDetails - ty' = tidyTopType (idType bndr) - rhs1 = tidyExpr rhs_tidy_env rhs - idinfo = idInfo bndr - idinfo' = tidyTopIdInfo (isExternalName name') + + + bndr1 = mkGlobalId details name' ty' idinfo' + details = idDetails bndr -- Preserve the IdDetails + ty' = tidyTopType (idType bndr) + rhs1 = tidyExpr rhs_tidy_env rhs + idinfo = idInfo bndr + unf_info = unfoldingInfo idinfo + str_sig = strictnessInfo idinfo `orElse` topSig + is_bot = isBottomingSig str_sig + idinfo' = tidyTopIdInfo (isExternalName name') idinfo unfold_info arity caf_info (occInfo idinfo) - unfold_info | show_unfold = tidyUnfolding rhs_tidy_env rhs1 (unfoldingInfo idinfo) + unfold_info | show_unfold = tidyUnfolding rhs_tidy_env rhs1 is_bot unf_info | otherwise = noUnfolding -- NB: do *not* expose the worker if show_unfold is off, -- because that means this thing is a loop breaker or @@ -1065,16 +1068,17 @@ tidyTopIdInfo is_external idinfo unfold_info arity caf_info occ_info ------------ Unfolding -------------- -tidyUnfolding :: TidyEnv -> CoreExpr -> Unfolding -> Unfolding -tidyUnfolding tidy_env _ (DFunUnfolding con ids) +tidyUnfolding :: TidyEnv -> CoreExpr -> Bool -> Unfolding -> Unfolding +tidyUnfolding tidy_env _ _ (DFunUnfolding con ids) = DFunUnfolding con (map (tidyExpr tidy_env) ids) -tidyUnfolding tidy_env tidy_rhs unf@(CoreUnfolding { uf_tmpl = unf_rhs, uf_src = src }) +tidyUnfolding tidy_env tidy_rhs is_bottoming + unf@(CoreUnfolding { uf_tmpl = unf_rhs, uf_src = src }) | isInlineRuleSource src = unf { uf_tmpl = tidyExpr tidy_env unf_rhs, -- Preserves OccInfo uf_src = tidyInl tidy_env src } | otherwise - = mkTopUnfolding tidy_rhs -tidyUnfolding _ _ unf = unf + = mkTopUnfolding is_bottoming tidy_rhs +tidyUnfolding _ _ _ unf = unf tidyInl :: TidyEnv -> UnfoldingSource -> UnfoldingSource tidyInl tidy_env (InlineWrapper w) = InlineWrapper (tidyVarOcc tidy_env w) diff --git a/compiler/simplCore/FloatOut.lhs b/compiler/simplCore/FloatOut.lhs index d65f7bd..f5f8946 100644 --- a/compiler/simplCore/FloatOut.lhs +++ b/compiler/simplCore/FloatOut.lhs @@ -10,11 +10,12 @@ module FloatOut ( floatOutwards ) where import CoreSyn import CoreUtils +import CoreArity ( etaExpand ) import DynFlags ( DynFlags, DynFlag(..), FloatOutSwitches(..) ) import ErrUtils ( dumpIfSet_dyn ) import CostCentre ( dupifyCC, CostCentre ) -import Id ( Id, idType ) +import Id ( Id, idType, idArity, isBottomingId ) import Type ( isUnLiftedType ) import SetLevels ( Level(..), LevelledExpr, LevelledBind, setLevels, isTopLvl, tOP_LEVEL ) @@ -144,13 +145,18 @@ floatTopBind bind %* * %************************************************************************ - \begin{code} floatBind :: LevelledBind -> (FloatStats, FloatBinds) -floatBind (NonRec (TB name level) rhs) +floatBind (NonRec (TB var level) rhs) = case (floatRhs level rhs) of { (fs, rhs_floats, rhs') -> - (fs, rhs_floats `plusFloats` unitFloat level (NonRec name rhs')) } + + -- A tiresome hack: + -- see Note [Bottoming floats: eta expansion] in SetLevels + let rhs'' | isBottomingId var = etaExpand (idArity var) rhs' + | otherwise = rhs' + + in (fs, rhs_floats `plusFloats` unitFloat level (NonRec var rhs'')) } floatBind bind@(Rec pairs) = case (unzip3 (map do_pair pairs)) of { (fss, rhss_floats, new_pairs) -> @@ -297,8 +303,8 @@ floatExpr lvl (Cast expr co) (fs, floating_defns, Cast expr' co) } floatExpr lvl (Let (NonRec (TB bndr bndr_lvl) rhs) body) - | isUnLiftedType (idType bndr) -- Treat unlifted lets just like a case - -- I.e. floatExpr for rhs, floatCaseAlt for body + | isUnLiftedType (idType bndr) -- Treat unlifted lets just like a case + -- I.e. floatExpr for rhs, floatCaseAlt for body = case floatExpr lvl rhs of { (_, rhs_floats, rhs') -> case floatCaseAlt bndr_lvl body of { (fs, body_floats, body') -> (fs, rhs_floats `plusFloats` body_floats, Let (NonRec bndr rhs') body') }} diff --git a/compiler/simplCore/SetLevels.lhs b/compiler/simplCore/SetLevels.lhs index ed42089..d0914c9 100644 --- a/compiler/simplCore/SetLevels.lhs +++ b/compiler/simplCore/SetLevels.lhs @@ -56,12 +56,12 @@ module SetLevels ( import CoreSyn import DynFlags ( FloatOutSwitches(..) ) -import CoreUtils ( exprType, exprIsTrivial, mkPiTypes ) +import CoreUtils ( exprType, mkPiTypes ) import CoreArity ( exprBotStrictness_maybe ) import CoreFVs -- all of it import CoreSubst ( Subst, emptySubst, extendInScope, extendInScopeList, extendIdSubst, cloneIdBndr, cloneRecIdBndrs ) -import Id ( idType, mkSysLocal, isOneShotLambda, +import Id ( idType, mkLocalIdWithInfo, mkSysLocal, isOneShotLambda, zapDemandIdInfo, transferPolyIdInfo, idSpecialisation, idUnfolding, setIdInfo, setIdStrictness, setIdArity @@ -70,10 +70,11 @@ import IdInfo import Var import VarSet import VarEnv -import Name ( getOccName ) +import Demand ( StrictSig, increaseStrictSigArity ) +import Name ( getOccName, mkSystemVarName ) import OccName ( occNameString ) import Type ( isUnLiftedType, Type ) -import BasicTypes ( TopLevelFlag(..) ) +import BasicTypes ( TopLevelFlag(..), Arity ) import UniqSupply import Util ( sortLe, isSingleton, count ) import Outputable @@ -340,10 +341,25 @@ If we see we'd like to float the call to error, to get lvl = error "urk" f = \x. g lvl -But, it's very helpful for lvl to get a strictness signature, so that, -for example, its unfolding is not exposed in interface files (unnecessary). -But this float-out might occur after strictness analysis. So we use the -cheap-and-cheerful exprBotStrictness_maybe function. +Furthermore, we want to float a bottoming expression even if it has free +variables: + f = \x. g (let v = h x in error ("urk" ++ v)) +Then we'd like to abstact over 'x' can float the whole arg of g: + lvl = \x. let v = h x in error ("urk" ++ v) + f = \x. g (lvl x) +See Maessen's paper 1999 "Bottom extraction: factoring error handling out +of functional programs" (unpublished I think). + +When we do this, we set the strictness and arity of the new bottoming +Id, so that it's properly exposed as such in the interface file, even if +this is all happening after strictness analysis. + +Note [Bottoming floats: eta expansion] c.f Note [Bottoming floats] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Tiresomely, though, the simplifier has an invariant that the manifest +arity of the RHS should be the same as the arity; but we can't call +etaExpand during SetLevels because it works over a decorated form of +CoreExpr. So we do the eta expansion later, in FloatOut. Note [Case MFEs] ~~~~~~~~~~~~~~~~ @@ -381,25 +397,21 @@ lvlMFE True ctxt_lvl env e@(_, AnnCase {}) lvlMFE strict_ctxt ctxt_lvl env ann_expr@(fvs, _) | isUnLiftedType ty -- Can't let-bind it; see Note [Unlifted MFEs] - || exprIsTrivial expr -- Never float if it's trivial + || notWorthFloating ann_expr abs_vars || not good_destination = -- Don't float it out lvlExpr ctxt_lvl env ann_expr | otherwise -- Float it out! = do expr' <- lvlFloatRhs abs_vars dest_lvl env ann_expr - var <- newLvlVar "lvl" abs_vars ty - -- Note [Bottoming floats] - let var_w_str = case exprBotStrictness_maybe expr of - Just (arity,str) -> var `setIdArity` arity - `setIdStrictness` str - Nothing -> var - return (Let (NonRec (TB var_w_str dest_lvl) expr') - (mkVarApps (Var var_w_str) abs_vars)) + var <- newLvlVar abs_vars ty mb_bot + return (Let (NonRec (TB var dest_lvl) expr') + (mkVarApps (Var var) abs_vars)) where expr = deAnnotate ann_expr ty = exprType expr - dest_lvl = destLevel env fvs (isFunction ann_expr) + mb_bot = exprBotStrictness_maybe expr + dest_lvl = destLevel env fvs (isFunction ann_expr) mb_bot abs_vars = abstractVars dest_lvl env fvs -- A decision to float entails let-binding this thing, and we only do @@ -426,6 +438,42 @@ lvlMFE strict_ctxt ctxt_lvl env ann_expr@(fvs, _) -- concat = /\ a -> lvl a -- lvl = /\ a -> foldr ..a.. (++) [] -- which is pretty stupid. Hence the strict_ctxt test + +annotateBotStr :: Id -> Maybe (Arity, StrictSig) -> Id +annotateBotStr id Nothing = id +annotateBotStr id (Just (arity,sig)) = id `setIdArity` arity + `setIdStrictness` sig + +notWorthFloating :: CoreExprWithFVs -> [Var] -> Bool +-- Returns True if the expression would be replaced by +-- something bigger than it is now. For example: +-- abs_vars = tvars only: return True if e is trivial, +-- but False for anything bigger +-- abs_vars = [x] (an Id): return True for trivial, or an application (f x) +-- but False for (f x x) +-- +-- One big goal is that floating should be idempotent. Eg if +-- we replace e with (lvl79 x y) and then run FloatOut again, don't want +-- to replace (lvl79 x y) with (lvl83 x y)! + +notWorthFloating e abs_vars + = go e (count isId abs_vars) + where + go (_, AnnVar {}) n = n == 0 + go (_, AnnLit {}) n = n == 0 + go (_, AnnCast e _) n = go e n + go (_, AnnApp e arg) n + | (_, AnnType {}) <- arg = go e n + | n==0 = False + | is_triv arg = go e (n-1) + | otherwise = False + go _ _ = False + + is_triv (_, AnnLit {}) = True -- Treat all literals as trivial + is_triv (_, AnnVar {}) = True -- (ie not worth floating) + is_triv (_, AnnCast e _) = is_triv e + is_triv (_, AnnApp e (_, AnnType {})) = is_triv e + is_triv _ = False \end{code} Note [Escaping a value lambda] @@ -502,13 +550,15 @@ lvlBind top_lvl ctxt_lvl env (AnnNonRec bndr rhs@(rhs_fvs,_)) | otherwise = do -- Yes, type abstraction; create a new binder, extend substitution, etc rhs' <- lvlFloatRhs abs_vars dest_lvl env rhs - (env', [bndr']) <- newPolyBndrs dest_lvl env abs_vars [bndr] + (env', [bndr']) <- newPolyBndrs dest_lvl env abs_vars [bndr_w_str] return (NonRec (TB bndr' dest_lvl) rhs', env') where - bind_fvs = rhs_fvs `unionVarSet` idFreeVars bndr - abs_vars = abstractVars dest_lvl env bind_fvs - dest_lvl = destLevel env bind_fvs (isFunction rhs) + bind_fvs = rhs_fvs `unionVarSet` idFreeVars bndr + abs_vars = abstractVars dest_lvl env bind_fvs + dest_lvl = destLevel env bind_fvs (isFunction rhs) mb_bot + mb_bot = exprBotStrictness_maybe (deAnnotate rhs) + bndr_w_str = annotateBotStr bndr mb_bot \end{code} @@ -562,7 +612,7 @@ lvlBind top_lvl ctxt_lvl env (AnnRec pairs) `minusVarSet` mkVarSet bndrs - dest_lvl = destLevel env bind_fvs (all isFunction rhss) + dest_lvl = destLevel env bind_fvs (all isFunction rhss) Nothing abs_vars = abstractVars dest_lvl env bind_fvs ---------------------------------------------------- @@ -619,12 +669,14 @@ lvlLamBndrs lvl bndrs \begin{code} -- Destintion level is the max Id level of the expression -- (We'll abstract the type variables, if any.) -destLevel :: LevelEnv -> VarSet -> Bool -> Level -destLevel env fvs is_function +destLevel :: LevelEnv -> VarSet -> Bool -> Maybe (Arity, StrictSig) -> Level +destLevel env fvs is_function mb_bot + | Just {} <- mb_bot = tOP_LEVEL -- Send bottoming bindings to the top + -- regardless; see Note [Bottoming floats] | floatLams env - && is_function = tOP_LEVEL -- Send functions to top level; see + && is_function = tOP_LEVEL -- Send functions to top level; see -- the comments with isFunction - | otherwise = maxIdLevel env fvs + | otherwise = maxIdLevel env fvs isFunction :: CoreExprWithFVs -> Bool -- The idea here is that we want to float *functions* to @@ -857,12 +909,20 @@ newPolyBndrs dest_lvl env abs_vars bndrs = do str = "poly_" ++ occNameString (getOccName bndr) poly_ty = mkPiTypes abs_vars (idType bndr) -newLvlVar :: String - -> [CoreBndr] -> Type -- Abstract wrt these bndrs +newLvlVar :: [CoreBndr] -> Type -- Abstract wrt these bndrs + -> Maybe (Arity, StrictSig) -- Note [Bottoming floats] -> LvlM Id -newLvlVar str vars body_ty = do - uniq <- getUniqueM - return (mkSysLocal (mkFastString str) uniq (mkPiTypes vars body_ty)) +newLvlVar vars body_ty mb_bot + = do { uniq <- getUniqueM + ; return (mkLocalIdWithInfo (mk_name uniq) (mkPiTypes vars body_ty) info) } + where + mk_name uniq = mkSystemVarName uniq (mkFastString "lvl") + arity = count isId vars + info = case mb_bot of + Nothing -> vanillaIdInfo + Just (bot_arity, sig) -> vanillaIdInfo + `setArityInfo` (arity + bot_arity) + `setStrictnessInfo` Just (increaseStrictSigArity arity sig) -- The deeply tiresome thing is that we have to apply the substitution -- to the rules inside each Id. Grr. But it matters. diff --git a/compiler/simplCore/SimplUtils.lhs b/compiler/simplCore/SimplUtils.lhs index 7a5b96b..56d2795 100644 --- a/compiler/simplCore/SimplUtils.lhs +++ b/compiler/simplCore/SimplUtils.lhs @@ -635,11 +635,18 @@ let-float if you inline windowToViewport However, as usual for Gentle mode, do not inline things that are inactive in the intial stages. See Note [Gentle mode]. +Note [Top-level botomming Ids] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +Don't inline top-level Ids that are bottoming, even if they are used just +once, because FloatOut has gone to some trouble to extract them out. +Inlining them won't make the program run faster! + \begin{code} preInlineUnconditionally :: SimplEnv -> TopLevelFlag -> InId -> InExpr -> Bool preInlineUnconditionally env top_lvl bndr rhs - | not active = False - | opt_SimplNoPreInlining = False + | not active = False + | isTopLevel top_lvl && isBottomingId bndr = False -- Note [Top-level bottoming Ids] + | opt_SimplNoPreInlining = False | otherwise = case idOccInfo bndr of IAmDead -> True -- Happens in ((\x.1) v) OneOcc in_lam True int_cxt -> try_once in_lam int_cxt @@ -651,12 +658,11 @@ preInlineUnconditionally env top_lvl bndr rhs -- See Note [pre/postInlineUnconditionally in gentle mode] SimplPhase n _ -> isActive n act act = idInlineActivation bndr - try_once in_lam int_cxt -- There's one textual occurrence | not in_lam = isNotTopLevel top_lvl || early_phase | otherwise = int_cxt && canInlineInLam rhs --- Be very careful before inlining inside a lambda, becuase (a) we must not +-- Be very careful before inlining inside a lambda, because (a) we must not -- invalidate occurrence information, and (b) we want to avoid pushing a -- single allocation (here) into multiple allocations (inside lambda). -- Inlining a *function* with a single *saturated* call would be ok, mind you. @@ -745,6 +751,7 @@ postInlineUnconditionally env top_lvl bndr occ_info rhs unfolding | isExportedId bndr = False | isStableUnfolding unfolding = False -- Note [InlineRule and postInlineUnconditionally] | exprIsTrivial rhs = True + | isTopLevel top_lvl = False -- Note [Top level and postInlineUnconditionally] | otherwise = case occ_info of -- The point of examining occ_info here is that for *non-values* @@ -771,8 +778,8 @@ postInlineUnconditionally env top_lvl bndr occ_info rhs unfolding -- PRINCIPLE: when we've already simplified an expression once, -- make sure that we only inline it if it's reasonably small. - && ((isNotTopLevel top_lvl && not in_lam) || - -- But outside a lambda, we want to be reasonably aggressive + && (not in_lam || + -- Outside a lambda, we want to be reasonably aggressive -- about inlining into multiple branches of case -- e.g. let x = -- in case y of { C1 -> ..x..; C2 -> ..x..; C3 -> ... } @@ -875,6 +882,14 @@ activeRule dflags env SimplPhase n _ -> Just (isActive n) \end{code} +Note [Top level and postInlineUnconditionally] +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +We don't do postInlineUnconditionally for top-level things (except +ones that are trivial). There is no point, because the main goal is +to get rid of local bindings used in multiple case branches. And +doing so risks replacing a single global allocation with local allocations. + + Note [InlineRule and postInlineUnconditionally] ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Do not do postInlineUnconditionally if the Id has an InlineRule, otherwise diff --git a/compiler/simplCore/Simplify.lhs b/compiler/simplCore/Simplify.lhs index 37fa798..f6e8569 100644 --- a/compiler/simplCore/Simplify.lhs +++ b/compiler/simplCore/Simplify.lhs @@ -662,7 +662,7 @@ addNonRecWithUnf env new_bndr new_rhs new_unfolding ------------------------------ simplUnfolding :: SimplEnv-> TopLevelFlag - -> Id -- Debug output only + -> Id -> OccInfo -> OutExpr -> Unfolding -> SimplM Unfolding -- Note [Setting the new unfolding] @@ -681,8 +681,8 @@ simplUnfolding env top_lvl _ _ _ ; return (mkCoreUnfolding (isTopLevel top_lvl) src' expr' arity guide) } -- See Note [Top-level flag on inline rules] in CoreUnfold -simplUnfolding _ top_lvl _ _occ_info new_rhs _ - = return (mkUnfolding (isTopLevel top_lvl) new_rhs) +simplUnfolding _ top_lvl id _occ_info new_rhs _ + = return (mkUnfolding (isTopLevel top_lvl) (isBottomingId id) new_rhs) -- We make an unfolding *even for loop-breakers*. -- Reason: (a) It might be useful to know that they are WHNF -- (b) In TidyPgm we currently assume that, if we want to @@ -1724,7 +1724,7 @@ simplAlt env _ case_bndr' cont' (DataAlt con, vs, rhs) addBinderUnfolding :: SimplEnv -> Id -> CoreExpr -> SimplEnv addBinderUnfolding env bndr rhs - = modifyInScope env (bndr `setIdUnfolding` mkUnfolding False rhs) + = modifyInScope env (bndr `setIdUnfolding` mkUnfolding False False rhs) addBinderOtherCon :: SimplEnv -> Id -> [AltCon] -> SimplEnv addBinderOtherCon env bndr cons diff --git a/compiler/specialise/Specialise.lhs b/compiler/specialise/Specialise.lhs index d738565..ad641d4 100644 --- a/compiler/specialise/Specialise.lhs +++ b/compiler/specialise/Specialise.lhs @@ -939,7 +939,7 @@ bindAuxiliaryDicts subst triples = go subst [] triples -- No auxiliary binding necessary | otherwise = go subst_w_unf (NonRec dx_id dx : binds) pairs where - dx_id1 = dx_id `setIdUnfolding` mkUnfolding False dx + dx_id1 = dx_id `setIdUnfolding` mkUnfolding False False dx subst_w_unf = extendIdSubst subst d (Var dx_id1) -- Important! We're going to substitute dx_id1 for d -- and we want it to look "interesting", else we won't gather *any*