From 30b5ebe424ebae69b162ac3fc547eb14d898535f Mon Sep 17 00:00:00 2001 From: simonpj Date: Mon, 1 Nov 1999 17:10:57 +0000 Subject: [PATCH] [project @ 1999-11-01 17:09:54 by simonpj] A regrettably-gigantic commit that puts in place what Simon PJ has been up to for the last month or so, on and off. The basic idea was to restore unfoldings to *occurrences* of variables without introducing a space leak. I wanted to make sure things improved relative to 4.04, and that proved depressingly hard. On the way I discovered several quite serious bugs in the simplifier. Here's a summary of what's gone on. ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ * No commas between for-alls in RULES. This makes the for-alls have the same syntax as in types. * Arrange that simplConArgs works in one less pass than before. This exposed a bug: a bogus call to completeBeta. * Add a top-level flag in CoreUnfolding, used in callSiteInline * Extend w/w to use etaExpandArity, so it does eta/coerce expansion * Implement inline phases. The meaning of the inline pragmas is described in CoreUnfold.lhs. You can say things like {#- INLINE 2 build #-} to mean "inline build in phase 2" * Don't float anything out of an INLINE. Don't float things to top level unless they also escape a value lambda. [see comments with SetLevels.lvlMFE Without at least one of these changes, I found that {-# INLINE concat #-} concat = __inline (/\a -> foldr (++) []) was getting floated to concat = __inline( /\a -> lvl a ) lvl = ...inlined version of foldr... Subsequently I found that not floating constants out of an INLINE gave really bad code like __inline (let x = e in \y -> ...) so I now let things float out of INLINE * Implement the "reverse-mapping" idea for CSE; actually it turned out to be easier to implement it in SetLevels, and may benefit full laziness too. * It's a good idea to inline inRange. Consider index (l,h) i = case inRange (l,h) i of True -> l+i False -> error inRange itself isn't strict in h, but if it't inlined then 'index' *does* become strict in h. Interesting! * Big change to the way unfoldings and occurrence info is propagated in the simplifier The plan is described in Subst.lhs with the Subst type Occurrence info is now in a separate IdInfo field than user pragmas * I found that (coerce T (coerce S (\x.e))) y didn't simplify in one round. First we get to (\x.e) y and only then do the beta. Solution: cancel the coerces in the continuation * Amazingly, CoreUnfold wasn't counting the cost of a function an application. * Disable rules in initial simplifier run. Otherwise full laziness doesn't get a chance to lift out a MFE before a rule (e.g. fusion) zaps it. queens is a case in point * Improve float-out stuff significantly. The big change is that if we have \x -> ... /\a -> ...let p = ..a.. in let q = ...p... where p's rhs doesn't x, we abstract a from p, so that we can get p past x. (We did that before.) But we also substitute (p a) for p in q, and then we can do the same thing for q. (We didn't do that, so q got stuck.) This is much better. It involves doing a substitution "as we go" in SetLevels, though. --- ghc/compiler/basicTypes/DataCon.hi-boot | 3 +- ghc/compiler/basicTypes/DataCon.hi-boot-5 | 3 +- ghc/compiler/basicTypes/DataCon.lhs | 3 +- ghc/compiler/basicTypes/Id.lhs | 63 ++- ghc/compiler/basicTypes/IdInfo.hi-boot | 3 +- ghc/compiler/basicTypes/IdInfo.hi-boot-5 | 3 +- ghc/compiler/basicTypes/IdInfo.lhs | 356 ++++++++-------- ghc/compiler/basicTypes/MkId.lhs | 26 +- ghc/compiler/basicTypes/Name.lhs | 2 +- ghc/compiler/basicTypes/OccName.lhs | 2 +- ghc/compiler/basicTypes/Var.lhs | 7 +- ghc/compiler/basicTypes/VarEnv.lhs | 6 +- ghc/compiler/codeGen/CgCase.lhs | 3 +- ghc/compiler/codeGen/CgClosure.lhs | 17 +- ghc/compiler/codeGen/CgUsages.lhs | 17 +- ghc/compiler/coreSyn/CoreLint.lhs | 31 +- ghc/compiler/coreSyn/CoreSyn.lhs | 6 +- ghc/compiler/coreSyn/CoreTidy.lhs | 11 +- ghc/compiler/coreSyn/CoreUnfold.hi-boot | 3 +- ghc/compiler/coreSyn/CoreUnfold.hi-boot-5 | 3 +- ghc/compiler/coreSyn/CoreUnfold.lhs | 282 +++++++------ ghc/compiler/coreSyn/CoreUtils.lhs | 11 +- ghc/compiler/coreSyn/PprCore.lhs | 5 +- ghc/compiler/coreSyn/Subst.lhs | 164 +++++--- ghc/compiler/hsSyn/HsBinds.lhs | 27 +- ghc/compiler/hsSyn/HsCore.lhs | 2 +- ghc/compiler/main/CmdLineOpts.lhs | 12 +- ghc/compiler/main/Main.lhs | 3 +- ghc/compiler/main/MkIface.lhs | 20 +- ghc/compiler/parser/Parser.y | 14 +- ghc/compiler/rename/ParseIface.y | 10 +- ghc/compiler/rename/RnBinds.lhs | 28 +- ghc/compiler/rename/RnSource.lhs | 7 +- ghc/compiler/simplCore/BinderInfo.lhs | 45 +- ghc/compiler/simplCore/CSE.lhs | 54 ++- ghc/compiler/simplCore/FloatOut.lhs | 27 +- ghc/compiler/simplCore/OccurAnal.lhs | 88 ++-- ghc/compiler/simplCore/SetLevels.lhs | 640 ++++++++++++----------------- ghc/compiler/simplCore/SimplMonad.lhs | 33 +- ghc/compiler/simplCore/SimplUtils.lhs | 23 +- ghc/compiler/simplCore/Simplify.lhs | 456 +++++++++++--------- ghc/compiler/simplStg/StgVarInfo.lhs | 8 +- ghc/compiler/specialise/Rules.lhs | 10 +- ghc/compiler/specialise/Specialise.lhs | 8 +- ghc/compiler/stgSyn/CoreToStg.lhs | 2 +- ghc/compiler/stranal/StrictAnal.lhs | 32 +- ghc/compiler/stranal/WorkWrap.lhs | 43 +- ghc/compiler/stranal/WwLib.lhs | 60 ++- ghc/compiler/typecheck/TcBinds.lhs | 16 +- ghc/compiler/typecheck/TcClassDcl.lhs | 11 +- ghc/compiler/typecheck/TcExpr.lhs | 17 +- ghc/compiler/typecheck/TcIfaceSig.lhs | 11 +- ghc/compiler/types/TyCon.lhs | 14 +- ghc/compiler/usageSP/UsageSPInf.lhs | 4 +- ghc/compiler/usageSP/UsageSPUtils.lhs | 6 +- ghc/lib/std/Ix.lhs | 2 +- ghc/lib/std/PrelBase.lhs | 19 +- ghc/lib/std/PrelList.lhs | 18 +- ghc/lib/std/PrelNumExtra.lhs | 1 + ghc/lib/std/PrelST.lhs | 4 +- ghc/lib/std/PrelShow.lhs | 78 ++-- 61 files changed, 1562 insertions(+), 1321 deletions(-) diff --git a/ghc/compiler/basicTypes/DataCon.hi-boot b/ghc/compiler/basicTypes/DataCon.hi-boot index e1a6dae..f11d4e4 100644 --- a/ghc/compiler/basicTypes/DataCon.hi-boot +++ b/ghc/compiler/basicTypes/DataCon.hi-boot @@ -1,6 +1,7 @@ _interface_ DataCon 1 _exports_ -DataCon DataCon dataConType ; +DataCon DataCon dataConType isExistentialDataCon ; _declarations_ 1 data DataCon ; 1 dataConType _:_ DataCon -> TypeRep.Type ;; +1 isExistentialDataCon _:_ DataCon -> PrelBase.Bool ;; diff --git a/ghc/compiler/basicTypes/DataCon.hi-boot-5 b/ghc/compiler/basicTypes/DataCon.hi-boot-5 index 31963e3..ea08f44 100644 --- a/ghc/compiler/basicTypes/DataCon.hi-boot-5 +++ b/ghc/compiler/basicTypes/DataCon.hi-boot-5 @@ -1,4 +1,5 @@ __interface DataCon 1 0 where -__export DataCon DataCon dataConType ; +__export DataCon DataCon dataConType isExistentialDataCon ; 1 data DataCon ; 1 dataConType :: DataCon -> TypeRep.Type ; +1 isExistentialDataCon :: DataCon -> PrelBase.Bool ; diff --git a/ghc/compiler/basicTypes/DataCon.lhs b/ghc/compiler/basicTypes/DataCon.lhs index f8aa66a..0117a4f 100644 --- a/ghc/compiler/basicTypes/DataCon.lhs +++ b/ghc/compiler/basicTypes/DataCon.lhs @@ -378,8 +378,7 @@ splitProductType_maybe splitProductType_maybe ty = case splitAlgTyConApp_maybe ty of Just (tycon,ty_args,[data_con]) - | isProductTyCon tycon && -- Checks for non-recursive - not (isExistentialDataCon data_con) + | isProductTyCon tycon -- Checks for non-recursive, non-existential -> Just (tycon, ty_args, data_con, data_con_arg_tys) where data_con_arg_tys = map (substTy (mkTyVarSubst (dcTyVars data_con) ty_args)) diff --git a/ghc/compiler/basicTypes/Id.lhs b/ghc/compiler/basicTypes/Id.lhs index 11aa08d..d562a4d 100644 --- a/ghc/compiler/basicTypes/Id.lhs +++ b/ghc/compiler/basicTypes/Id.lhs @@ -19,6 +19,7 @@ module Id ( -- Modifying an Id setIdName, setIdUnique, setIdType, setIdNoDiscard, setIdInfo, lazySetIdInfo, modifyIdInfo, maybeModifyIdInfo, + zapFragileIdInfo, zapLamIdInfo, -- Predicates omitIfaceSigForId, @@ -28,12 +29,12 @@ module Id ( -- Inline pragma stuff getInlinePragma, setInlinePragma, modifyInlinePragma, - idMustBeINLINEd, idMustNotBeINLINEd, isSpecPragmaId, isRecordSelector, isPrimitiveId_maybe, isDataConId_maybe, isConstantId, isBottomingId, idAppIsBottom, isExportedId, isUserExportedId, + mayHaveNoBinding, -- One shot lambda stuff isOneShotLambda, setOneShotLambda, clearOneShotLambda, @@ -48,6 +49,7 @@ module Id ( setIdUpdateInfo, setIdCafInfo, setIdCprInfo, + setIdOccInfo, getIdArity, getIdDemandInfo, @@ -57,7 +59,8 @@ module Id ( getIdSpecialisation, getIdUpdateInfo, getIdCafInfo, - getIdCprInfo + getIdCprInfo, + getIdOccInfo ) where @@ -74,17 +77,20 @@ import Var ( Id, DictId, externallyVisibleId ) import VarSet -import Type ( Type, tyVarsOfType, typePrimRep, addFreeTyVars, seqType ) -import IdInfo +import Type ( Type, tyVarsOfType, typePrimRep, addFreeTyVars, seqType, splitTyConApp_maybe ) + +import IdInfo + import Demand ( Demand, isStrict, wwLazy ) import Name ( Name, OccName, mkSysLocalName, mkLocalName, isWiredInName, isUserExportedName ) +import OccName ( UserFS ) import Const ( Con(..) ) import PrimRep ( PrimRep ) import PrimOp ( PrimOp ) -import TysPrim ( realWorldStatePrimTy ) +import TysPrim ( statePrimTyCon ) import FieldLabel ( FieldLabel(..) ) import SrcLoc ( SrcLoc ) import Unique ( Unique, mkBuiltinUnique, getBuiltinUniques ) @@ -131,8 +137,8 @@ mkVanillaId name ty = mkId name ty vanillaIdInfo -- SysLocal: for an Id being created by the compiler out of thin air... -- UserLocal: an Id with a name the user might recognize... -mkUserLocal :: OccName -> Unique -> Type -> SrcLoc -> Id -mkSysLocal :: FAST_STRING -> Unique -> Type -> Id +mkUserLocal :: OccName -> Unique -> Type -> SrcLoc -> Id +mkSysLocal :: UserFS -> Unique -> Type -> Id mkSysLocal fs uniq ty = mkVanillaId (mkSysLocalName uniq fs) ty mkUserLocal occ uniq ty loc = mkVanillaId (mkLocalName uniq occ loc) ty @@ -215,6 +221,14 @@ isSpecPragmaId id = case idFlavour id of SpecPragmaId -> True other -> False +mayHaveNoBinding id = isConstantId id + -- mayHaveNoBinding returns True of an Id which may not have a + -- binding, even though it is defined in this module. Notably, + -- the constructors of a dictionary are in this situation. + -- + -- mayHaveNoBinding returns True of some things that *do* have a local binding, + -- so it's only an approximation. That's ok... it's only use for assertions. + -- Don't drop a binding for an exported Id, -- if it otherwise looks dead. isExportedId :: Id -> Bool @@ -344,6 +358,14 @@ getIdCprInfo id = cprInfo (idInfo id) setIdCprInfo :: Id -> CprInfo -> Id setIdCprInfo id cpr_info = modifyIdInfo (`setCprInfo` cpr_info) id + + --------------------------------- + -- Occcurrence INFO +getIdOccInfo :: Id -> OccInfo +getIdOccInfo id = occInfo (idInfo id) + +setIdOccInfo :: Id -> OccInfo -> Id +setIdOccInfo id occ_info = modifyIdInfo (`setOccInfo` occ_info) id \end{code} @@ -361,15 +383,6 @@ setInlinePragma id prag = modifyIdInfo (`setInlinePragInfo` prag) id modifyInlinePragma :: Id -> (InlinePragInfo -> InlinePragInfo) -> Id modifyInlinePragma id fn = modifyIdInfo (\info -> info `setInlinePragInfo` (fn (inlinePragInfo info))) id - -idMustNotBeINLINEd id = case getInlinePragma id of - IMustNotBeINLINEd -> True - IAmALoopBreaker -> True - other -> False - -idMustBeINLINEd id = case getInlinePragma id of - IMustBeINLINEd -> True - other -> False \end{code} @@ -379,7 +392,9 @@ idMustBeINLINEd id = case getInlinePragma id of isOneShotLambda :: Id -> Bool isOneShotLambda id = case lbvarInfo (idInfo id) of IsOneShotLambda -> True - NoLBVarInfo -> idType id == realWorldStatePrimTy + NoLBVarInfo -> case splitTyConApp_maybe (idType id) of + Just (tycon,_) -> tycon == statePrimTyCon + other -> False -- The last clause is a gross hack. It claims that -- every function over realWorldStatePrimTy is a one-shot -- function. This is pretty true in practice, and makes a big @@ -391,9 +406,12 @@ isOneShotLambda id = case lbvarInfo (idInfo id) of -- When `thenST` gets inlined, we end up with -- let lvl = E in \s -> case a s of (r, s') -> ...lvl... -- and we don't re-inline E. - -- + -- -- It would be better to spot that r was one-shot to start with, but -- I don't want to rely on that. + -- + -- Another good example is in fill_in in PrelPack.lhs. We should be able to + -- spot that fill_in has arity 2 (and when Keith is done, we will) but we can't yet. setOneShotLambda :: Id -> Id setOneShotLambda id = modifyIdInfo (`setLBVarInfo` IsOneShotLambda) id @@ -407,3 +425,12 @@ clearOneShotLambda id -- f = \x -> e -- If we change the one-shot-ness of x, f's type changes \end{code} + +\begin{code} +zapFragileIdInfo :: Id -> Id +zapFragileIdInfo id = maybeModifyIdInfo zapFragileInfo id + +zapLamIdInfo :: Id -> Id +zapLamIdInfo id = maybeModifyIdInfo zapLamInfo id +\end{code} + diff --git a/ghc/compiler/basicTypes/IdInfo.hi-boot b/ghc/compiler/basicTypes/IdInfo.hi-boot index f88c4f6..f180e04 100644 --- a/ghc/compiler/basicTypes/IdInfo.hi-boot +++ b/ghc/compiler/basicTypes/IdInfo.hi-boot @@ -1,6 +1,7 @@ _interface_ IdInfo 1 _exports_ -IdInfo IdInfo seqIdInfo ; +IdInfo IdInfo seqIdInfo vanillaIdInfo; _declarations_ 1 data IdInfo ; 1 seqIdInfo _:_ IdInfo -> PrelBase.() ;; +1 vanillaIdInfo _:_ IdInfo ;; diff --git a/ghc/compiler/basicTypes/IdInfo.hi-boot-5 b/ghc/compiler/basicTypes/IdInfo.hi-boot-5 index 7e3e942..efd8cc4 100644 --- a/ghc/compiler/basicTypes/IdInfo.hi-boot-5 +++ b/ghc/compiler/basicTypes/IdInfo.hi-boot-5 @@ -1,5 +1,6 @@ __interface IdInfo 1 0 where -__export IdInfo IdInfo seqIdInfo ; +__export IdInfo IdInfo seqIdInfo vanillaIdInfo ; 1 data IdInfo ; 1 seqIdInfo :: IdInfo -> PrelBase.Z0T ; +1 vanillaIdInfo :: IdInfo ; diff --git a/ghc/compiler/basicTypes/IdInfo.lhs b/ghc/compiler/basicTypes/IdInfo.lhs index 61b3a0e..f899847 100644 --- a/ghc/compiler/basicTypes/IdInfo.lhs +++ b/ghc/compiler/basicTypes/IdInfo.lhs @@ -12,9 +12,12 @@ module IdInfo ( vanillaIdInfo, mkIdInfo, seqIdInfo, megaSeqIdInfo, + -- Zapping + zapFragileInfo, zapLamInfo, zapSpecPragInfo, copyIdInfo, + -- Flavour IdFlavour(..), flavourInfo, - setNoDiscardInfo, zapSpecPragInfo, copyIdInfo, + setNoDiscardInfo, ppFlavourInfo, -- Arity @@ -40,8 +43,12 @@ module IdInfo ( demandInfo, setDemandInfo, -- Inline prags - InlinePragInfo(..), OccInfo(..), - inlinePragInfo, setInlinePragInfo, notInsideLambda, + InlinePragInfo(..), + inlinePragInfo, setInlinePragInfo, pprInlinePragInfo, + + -- Occurrence info + OccInfo(..), InsideLam, OneBranch, insideLam, notInsideLam, oneBranch, notOneBranch, + occInfo, setOccInfo, isFragileOccInfo, -- Specialisation specInfo, setSpecInfo, @@ -56,9 +63,6 @@ module IdInfo ( -- Constructed Product Result Info CprInfo(..), cprInfo, setCprInfo, ppCprInfo, noCprInfo, - -- Zapping - zapLamIdInfo, zapFragileIdInfo, zapIdInfoForStg, - -- Lambda-bound variable info LBVarInfo(..), lbvarInfo, setLBVarInfo, noLBVarInfo ) where @@ -71,9 +75,9 @@ import {-# SOURCE #-} CoreSyn ( CoreExpr, CoreRules, emptyCoreRules, isEmptyCor import {-# SOURCE #-} Const ( Con ) import Var ( Id ) +import VarSet ( IdOrTyVarSet ) import FieldLabel ( FieldLabel ) import Demand ( Demand, isStrict, isLazy, wwLazy, pprDemands, seqDemand, seqDemands ) -import Type ( UsageAnn ) import Outputable import Maybe ( isJust ) @@ -86,7 +90,8 @@ infixl 1 `setUpdateInfo`, `setUnfoldingInfo`, `setCprInfo`, `setWorkerInfo`, - `setCafInfo` + `setCafInfo`, + `setOccInfo` -- infixl so you can say (id `set` a `set` b) \end{code} @@ -121,7 +126,8 @@ data IdInfo cafInfo :: CafInfo, cprInfo :: CprInfo, -- Function always constructs a product result lbvarInfo :: LBVarInfo, -- Info about a lambda-bound variable - inlinePragInfo :: InlinePragInfo -- Inline pragmas + inlinePragInfo :: InlinePragInfo, -- Inline pragma + occInfo :: OccInfo -- How it occurs } seqIdInfo :: IdInfo -> () @@ -143,7 +149,7 @@ megaSeqIdInfo info seqCaf (cafInfo info) `seq` seqCpr (cprInfo info) `seq` seqLBVar (lbvarInfo info) `seq` - seqInlinePrag (inlinePragInfo info) + seqOccInfo (occInfo info) \end{code} Setters @@ -152,6 +158,7 @@ Setters setWorkerInfo info wk = wk `seq` info { workerInfo = wk } setSpecInfo info sp = PSEQ sp (info { specInfo = sp }) setInlinePragInfo info pr = pr `seq` info { inlinePragInfo = pr } +setOccInfo info oc = oc `seq` info { occInfo = oc } setStrictnessInfo info st = st `seq` info { strictnessInfo = st } -- Try to avoid spack leaks by seq'ing @@ -173,33 +180,6 @@ zapSpecPragInfo info = case flavourInfo info of SpecPragmaId -> info { flavourInfo = VanillaId } other -> info -copyIdInfo :: IdInfo -- From - -> IdInfo -- To - -> IdInfo -- To, updated with stuff from From; except flavour unchanged --- copyIdInfo is used when shorting out a top-level binding --- f_local = BIG --- f = f_local --- where f is exported. We are going to swizzle it around to --- f = BIG --- f_local = f --- but we must be careful to combine their IdInfos right. --- The fact that things can go wrong here is a bad sign, but I can't see --- how to make it 'patently right', so copyIdInfo is derived (pretty much) by trial and error --- --- Here 'from' is f_local, 'to' is f, and the result is attached to f - -copyIdInfo from to = from { flavourInfo = flavourInfo to, - specInfo = specInfo to, - inlinePragInfo = inlinePragInfo to - } - -- It's important to preserve the inline pragma on 'f'; e.g. consider - -- {-# NOINLINE f #-} - -- f = local - -- - -- similarly, transformation rules may be attached to f - -- and we want to preserve them. - -- - -- On the other hand, we want the strictness info from f_local. \end{code} @@ -220,7 +200,8 @@ mkIdInfo flv = IdInfo { cafInfo = MayHaveCafRefs, cprInfo = NoCPRInfo, lbvarInfo = NoLBVarInfo, - inlinePragInfo = NoInlinePragInfo + inlinePragInfo = NoInlinePragInfo, + occInfo = NoOccInfo } \end{code} @@ -314,66 +295,78 @@ ppArityInfo (ArityAtLeast arity) = hsep [ptext SLIT("__AL"), int arity] \begin{code} data InlinePragInfo = NoInlinePragInfo + | IMustNotBeINLINEd Bool -- True <=> came from an INLINE prag, False <=> came from a NOINLINE prag + (Maybe Int) -- Phase number from pragma, if any + -- The True, Nothing case doesn't need to be recorded - | IMustNotBeINLINEd -- User NOINLINE pragma - - | IAmALoopBreaker -- Used by the occurrence analyser to mark loop-breakers - -- in a group of recursive definitions +instance Outputable InlinePragInfo where + -- This is now parsed in interface files + ppr NoInlinePragInfo = empty + ppr other_prag = ptext SLIT("__U") <> pprInlinePragInfo other_prag + +pprInlinePragInfo NoInlinePragInfo = empty +pprInlinePragInfo (IMustNotBeINLINEd True Nothing) = empty +pprInlinePragInfo (IMustNotBeINLINEd True (Just n)) = brackets (int n) +pprInlinePragInfo (IMustNotBeINLINEd False Nothing) = brackets (char '!') +pprInlinePragInfo (IMustNotBeINLINEd False (Just n)) = brackets (char '!' <> int n) + +instance Show InlinePragInfo where + showsPrec p prag = showsPrecSDoc p (ppr prag) +\end{code} - | ICanSafelyBeINLINEd -- Used by the occurrence analyser to mark things - -- that manifesly occur once, not inside SCCs, - -- not in constructor arguments - OccInfo -- Says whether the occurrence is inside a lambda - -- If so, must only substitute WHNFs +%************************************************************************ +%* * +\subsection{Occurrence information} +%* * +%************************************************************************ - Bool -- False <=> occurs in more than one case branch - -- If so, there's a code-duplication issue +\begin{code} +data OccInfo + = NoOccInfo | IAmDead -- Marks unused variables. Sometimes useful for -- lambda and case-bound variables. - | IMustBeINLINEd -- Absolutely must inline; used for PrimOps and - -- constructors only. + | OneOcc InsideLam -seqInlinePrag :: InlinePragInfo -> () -seqInlinePrag (ICanSafelyBeINLINEd occ alts) - = occ `seq` alts `seq` () -seqInlinePrag other - = () + OneBranch -instance Outputable InlinePragInfo where - -- only used for debugging; never parsed. KSW 1999-07 - ppr NoInlinePragInfo = empty - ppr IMustBeINLINEd = ptext SLIT("__UU") - ppr IMustNotBeINLINEd = ptext SLIT("__Unot") - ppr IAmALoopBreaker = ptext SLIT("__Ux") - ppr IAmDead = ptext SLIT("__Ud") - ppr (ICanSafelyBeINLINEd InsideLam _) = ptext SLIT("__Ul") - ppr (ICanSafelyBeINLINEd NotInsideLam True) = ptext SLIT("__Us") - ppr (ICanSafelyBeINLINEd NotInsideLam False) = ptext SLIT("__Us*") - -instance Show InlinePragInfo where - showsPrec p prag = showsPrecSDoc p (ppr prag) -\end{code} + | IAmALoopBreaker -- Used by the occurrence analyser to mark loop-breakers + -- in a group of recursive definitions -\begin{code} -data OccInfo - = NotInsideLam +seqOccInfo :: OccInfo -> () +seqOccInfo (OneOcc in_lam once) = in_lam `seq` once `seq` () +seqOccInfo occ = () - | InsideLam -- Inside a non-linear lambda (that is, a lambda which - -- is sure to be instantiated only once). +type InsideLam = Bool -- True <=> Occurs inside a non-linear lambda -- Substituting a redex for this occurrence is -- dangerous because it might duplicate work. +insideLam = True +notInsideLam = False -instance Outputable OccInfo where - ppr NotInsideLam = empty - ppr InsideLam = text "l" +type OneBranch = Bool -- True <=> Occurs in only one case branch + -- so no code-duplication issue to worry about +oneBranch = True +notOneBranch = False +isFragileOccInfo :: OccInfo -> Bool +isFragileOccInfo (OneOcc _ _) = True +isFragileOccInfo other = False +\end{code} -notInsideLambda :: OccInfo -> Bool -notInsideLambda NotInsideLam = True -notInsideLambda InsideLam = False +\begin{code} +instance Outputable OccInfo where + -- only used for debugging; never parsed. KSW 1999-07 + ppr NoOccInfo = empty + ppr IAmALoopBreaker = ptext SLIT("_Kx") + ppr IAmDead = ptext SLIT("_Kd") + ppr (OneOcc inside_lam one_branch) | inside_lam = ptext SLIT("_Kl") + | one_branch = ptext SLIT("_Ks") + | otherwise = ptext SLIT("_Ks*") + +instance Show OccInfo where + showsPrec p occ = showsPrecSDoc p (ppr occ) \end{code} %************************************************************************ @@ -535,98 +528,6 @@ ppCafInfo MayHaveCafRefs = empty %************************************************************************ %* * -\subsection[CAF-IdInfo]{CAF-related information} -%* * -%************************************************************************ - -zapFragileIdInfo is used when cloning binders, mainly in the -simplifier. We must forget about used-once information because that -isn't necessarily correct in the transformed program. -Also forget specialisations and unfoldings because they would need -substitution to be correct. (They get pinned back on separately.) - -\begin{code} -zapFragileIdInfo :: IdInfo -> Maybe IdInfo -zapFragileIdInfo info@(IdInfo {inlinePragInfo = inline_prag, - workerInfo = wrkr, - specInfo = rules, - unfoldingInfo = unfolding}) - | not is_fragile_inline_prag - -- We must forget about whether it was marked safe-to-inline, - -- because that isn't necessarily true in the simplified expression. - -- This is important because expressions may be re-simplified - - && isEmptyCoreRules rules - -- Specialisations would need substituting. They get pinned - -- back on separately. - - && not (workerExists wrkr) - - && not (hasUnfolding unfolding) - -- This is very important; occasionally a let-bound binder is used - -- as a binder in some lambda, in which case its unfolding is utterly - -- bogus. Also the unfolding uses old binders so if we left it we'd - -- have to substitute it. Much better simply to give the Id a new - -- unfolding each time, which is what the simplifier does. - = Nothing - - | otherwise - = Just (info {inlinePragInfo = safe_inline_prag, - workerInfo = noWorkerInfo, - specInfo = emptyCoreRules, - unfoldingInfo = noUnfolding}) - - where - is_fragile_inline_prag = case inline_prag of - ICanSafelyBeINLINEd _ _ -> True - --- We used to say the dead-ness was fragile, but I don't --- see why it is. Furthermore, deadness is a pain to lose; --- see Simplify.mkDupableCont (Select ...) --- IAmDead -> True - - other -> False - - -- Be careful not to destroy real 'pragma' info - safe_inline_prag | is_fragile_inline_prag = NoInlinePragInfo - | otherwise = inline_prag -\end{code} - - -@zapLamIdInfo@ is used for lambda binders that turn out to to be -part of an unsaturated lambda - -\begin{code} -zapLamIdInfo :: IdInfo -> Maybe IdInfo -zapLamIdInfo info@(IdInfo {inlinePragInfo = inline_prag, demandInfo = demand}) - | is_safe_inline_prag && not (isStrict demand) - = Nothing - | otherwise - = Just (info {inlinePragInfo = safe_inline_prag, - demandInfo = wwLazy}) - where - -- The "unsafe" prags are the ones that say I'm not in a lambda - -- because that might not be true for an unsaturated lambda - is_safe_inline_prag = case inline_prag of - ICanSafelyBeINLINEd NotInsideLam nalts -> False - other -> True - - safe_inline_prag = case inline_prag of - ICanSafelyBeINLINEd _ nalts - -> ICanSafelyBeINLINEd InsideLam nalts - other -> inline_prag -\end{code} - -\begin{code} -zapIdInfoForStg :: IdInfo -> IdInfo - -- Return only the info needed for STG stuff - -- Namely, nothing, I think -zapIdInfoForStg info = vanillaIdInfo -\end{code} - - -%************************************************************************ -%* * \subsection[cpr-IdInfo]{Constructed Product Result info about an @Id@} %* * %************************************************************************ @@ -740,3 +641,112 @@ instance Outputable LBVarInfo where instance Show LBVarInfo where showsPrec p c = showsPrecSDoc p (ppr c) \end{code} + + +%************************************************************************ +%* * +\subsection{Bulk operations on IdInfo} +%* * +%************************************************************************ + +zapFragileInfo is used when cloning binders, mainly in the +simplifier. We must forget about used-once information because that +isn't necessarily correct in the transformed program. +Also forget specialisations and unfoldings because they would need +substitution to be correct. (They get pinned back on separately.) + +\begin{code} +zapFragileInfo :: IdInfo -> Maybe IdInfo +zapFragileInfo info@(IdInfo {occInfo = occ, + workerInfo = wrkr, + specInfo = rules, + unfoldingInfo = unfolding}) + | not (isFragileOccInfo occ) + -- We must forget about whether it was marked safe-to-inline, + -- because that isn't necessarily true in the simplified expression. + -- This is important because expressions may be re-simplified + -- We don't zap deadness or loop-breaker-ness. + -- The latter is important because it tells MkIface not to + -- spit out an inlining for the thing. The former doesn't + -- seem so important, but there's no harm. + + && isEmptyCoreRules rules + -- Specialisations would need substituting. They get pinned + -- back on separately. + + && not (workerExists wrkr) + + && not (hasUnfolding unfolding) + -- This is very important; occasionally a let-bound binder is used + -- as a binder in some lambda, in which case its unfolding is utterly + -- bogus. Also the unfolding uses old binders so if we left it we'd + -- have to substitute it. Much better simply to give the Id a new + -- unfolding each time, which is what the simplifier does. + = Nothing + + | otherwise + = Just (info {occInfo = robust_occ_info, + workerInfo = noWorkerInfo, + specInfo = emptyCoreRules, + unfoldingInfo = noUnfolding}) + where + -- It's important to keep the loop-breaker info, + -- because the substitution doesn't remember it. + robust_occ_info = case occ of + OneOcc _ _ -> NoOccInfo + other -> occ +\end{code} + +@zapLamInfo@ is used for lambda binders that turn out to to be +part of an unsaturated lambda + +\begin{code} +zapLamInfo :: IdInfo -> Maybe IdInfo +zapLamInfo info@(IdInfo {occInfo = occ, demandInfo = demand}) + | is_safe_occ && not (isStrict demand) + = Nothing + | otherwise + = Just (info {occInfo = safe_occ, + demandInfo = wwLazy}) + where + -- The "unsafe" occ info is the ones that say I'm not in a lambda + -- because that might not be true for an unsaturated lambda + is_safe_occ = case occ of + OneOcc in_lam once -> in_lam + other -> True + + safe_occ = case occ of + OneOcc _ once -> OneOcc insideLam once + other -> occ +\end{code} + + +copyIdInfo is used when shorting out a top-level binding + f_local = BIG + f = f_local +where f is exported. We are going to swizzle it around to + f = BIG + f_local = f +but we must be careful to combine their IdInfos right. +The fact that things can go wrong here is a bad sign, but I can't see +how to make it 'patently right', so copyIdInfo is derived (pretty much) by trial and error + +Here 'from' is f_local, 'to' is f, and the result is attached to f + +\begin{code} +copyIdInfo :: IdInfo -- From + -> IdInfo -- To + -> IdInfo -- To, updated with stuff from From; except flavour unchanged +copyIdInfo from to = from { flavourInfo = flavourInfo to, + specInfo = specInfo to, + inlinePragInfo = inlinePragInfo to + } + -- It's important to preserve the inline pragma on 'f'; e.g. consider + -- {-# NOINLINE f #-} + -- f = local + -- + -- similarly, transformation rules may be attached to f + -- and we want to preserve them. + -- + -- On the other hand, we want the strictness info from f_local. +\end{code} diff --git a/ghc/compiler/basicTypes/MkId.lhs b/ghc/compiler/basicTypes/MkId.lhs index 20cdf6c..9da068a 100644 --- a/ghc/compiler/basicTypes/MkId.lhs +++ b/ghc/compiler/basicTypes/MkId.lhs @@ -47,7 +47,7 @@ import Type ( Type, ThetaType, mkUsgTy, UsageAnn(..) ) import Module ( Module ) -import CoreUnfold ( mkUnfolding ) +import CoreUnfold ( mkTopUnfolding, mkCompulsoryUnfolding ) import Subst ( mkTopTyVarSubst, substTheta ) import TyCon ( TyCon, isNewTyCon, tyConDataCons, isDataTyCon ) import Class ( Class, classBigSig, classTyCon ) @@ -199,7 +199,9 @@ dataConInfo data_con `setArityInfo` exactArity (n_dicts + n_ex_dicts + n_id_args) `setUnfoldingInfo` unfolding where - unfolding = mkUnfolding (Note InlineMe con_rhs) + unfolding = mkTopUnfolding (Note InlineMe con_rhs) + -- The dictionary constructors of a class don't get a binding, + -- but they are always saturated, so they should always be inlined. (tyvars, theta, ex_tyvars, ex_theta, orig_arg_tys, tycon) = dataConSig data_con @@ -290,7 +292,7 @@ mkRecordSelId field_label selector_ty -- ToDo: consider adding further IdInfo - unfolding = mkUnfolding sel_rhs + unfolding = mkTopUnfolding sel_rhs (tyvars, theta, tau) = splitSigmaTy selector_ty (data_ty,rhs_ty) = expectJust "StdIdInfoRec" (splitFunTy_maybe tau) @@ -343,7 +345,7 @@ mkNewTySelId field_label selector_ty = sel_id -- ToDo: consider adding further IdInfo - unfolding = mkUnfolding sel_rhs + unfolding = mkTopUnfolding sel_rhs (tyvars, theta, tau) = splitSigmaTy selector_ty (data_ty,rhs_ty) = expectJust "StdIdInfoRec" (splitFunTy_maybe tau) @@ -380,7 +382,7 @@ mkDictSelId name clas ty -- We no longer use 'must-inline' on record selectors. They'll -- inline like crazy if they scrutinise a constructor - unfolding = mkUnfolding rhs + unfolding = mkTopUnfolding rhs (tyvars, _, sc_sel_ids, op_sel_ids, defms) = classBigSig clas @@ -419,13 +421,11 @@ mkPrimitiveId prim_op info = mkIdInfo (ConstantId (PrimOp prim_op)) `setUnfoldingInfo` unfolding - `setInlinePragInfo` IMustBeINLINEd - -- The pragma @IMustBeINLINEd@ says that this Id absolutely + + unfolding = mkCompulsoryUnfolding rhs + -- The mkCompulsoryUnfolding says that this Id absolutely -- must be inlined. It's only used for primitives, -- because we don't want to make a closure for each of them. - - - unfolding = mkUnfolding rhs args = mkTemplateLocals arg_tys rhs = mkLams tyvars $ mkLams args $ @@ -500,8 +500,7 @@ unsafeCoerceId = pcMiscPrelId unsafeCoerceIdKey pREL_GHC SLIT("unsafeCoerce#") ty info where info = vanillaIdInfo - `setUnfoldingInfo` mkUnfolding rhs - `setInlinePragInfo` IMustBeINLINEd + `setUnfoldingInfo` mkCompulsoryUnfolding rhs ty = mkForAllTys [openAlphaTyVar,openBetaTyVar] @@ -520,8 +519,7 @@ getTagId = pcMiscPrelId getTagIdKey pREL_GHC SLIT("getTag#") ty info where info = vanillaIdInfo - `setUnfoldingInfo` mkUnfolding rhs - `setInlinePragInfo` IMustBeINLINEd + `setUnfoldingInfo` mkCompulsoryUnfolding rhs -- We don't provide a defn for this; you must inline it ty = mkForAllTys [alphaTyVar] (mkFunTy alphaTy intPrimTy) diff --git a/ghc/compiler/basicTypes/Name.lhs b/ghc/compiler/basicTypes/Name.lhs index 7709868..4a3bfaa 100644 --- a/ghc/compiler/basicTypes/Name.lhs +++ b/ghc/compiler/basicTypes/Name.lhs @@ -114,7 +114,7 @@ mkKnownKeyGlobal (rdr_name, uniq) (rdrNameOcc rdr_name) systemProvenance -mkSysLocalName :: Unique -> FAST_STRING -> Name +mkSysLocalName :: Unique -> UserFS -> Name mkSysLocalName uniq fs = Name { n_uniq = uniq, n_sort = Local, n_occ = mkSrcVarOcc fs, n_prov = systemProvenance } diff --git a/ghc/compiler/basicTypes/OccName.lhs b/ghc/compiler/basicTypes/OccName.lhs index f33c716..1720506 100644 --- a/ghc/compiler/basicTypes/OccName.lhs +++ b/ghc/compiler/basicTypes/OccName.lhs @@ -58,7 +58,7 @@ code the encoding operation is not performed on each occurrence. These type synonyms help documentation. \begin{code} -type UserFS = FAST_STRING -- As the user typed it +type UserFS = FAST_STRING -- As the user typed it type EncodedFS = FAST_STRING -- Encoded form type UserString = String -- As the user typed it diff --git a/ghc/compiler/basicTypes/Var.lhs b/ghc/compiler/basicTypes/Var.lhs index 3a070e7..489e42a 100644 --- a/ghc/compiler/basicTypes/Var.lhs +++ b/ghc/compiler/basicTypes/Var.lhs @@ -26,14 +26,14 @@ module Var ( -- Ids Id, DictId, idName, idType, idUnique, idInfo, modifyIdInfo, maybeModifyIdInfo, - setIdName, setIdUnique, setIdInfo, lazySetIdInfo, + setIdName, setIdUnique, setIdInfo, lazySetIdInfo, zapIdInfo, mkIdVar, isId, externallyVisibleId ) where #include "HsVersions.h" import {-# SOURCE #-} TypeRep( Type, Kind ) -import {-# SOURCE #-} IdInfo( IdInfo, seqIdInfo ) +import {-# SOURCE #-} IdInfo( IdInfo, seqIdInfo, vanillaIdInfo ) import Unique ( Unique, Uniquable(..), mkUniqueGrimily, getKey ) import Name ( Name, OccName, NamedThing(..), @@ -284,6 +284,9 @@ setIdInfo :: Id -> IdInfo -> Id setIdInfo var info = seqIdInfo info `seq` var {varInfo = info} -- Try to avoid spack leaks by seq'ing +zapIdInfo :: Id -> Id +zapIdInfo var = var {varInfo = vanillaIdInfo} + modifyIdInfo :: (IdInfo -> IdInfo) -> Id -> Id modifyIdInfo fn var@(Var {varInfo = info}) = seqIdInfo new_info `seq` var {varInfo = new_info} diff --git a/ghc/compiler/basicTypes/VarEnv.lhs b/ghc/compiler/basicTypes/VarEnv.lhs index 0088812..a103677 100644 --- a/ghc/compiler/basicTypes/VarEnv.lhs +++ b/ghc/compiler/basicTypes/VarEnv.lhs @@ -20,7 +20,8 @@ module VarEnv ( TidyEnv, emptyTidyEnv, -- SubstEnvs - SubstEnv, TyVarSubstEnv, SubstResult(..), emptySubstEnv, + SubstEnv, TyVarSubstEnv, SubstResult(..), + emptySubstEnv, mkSubstEnv, lookupSubstEnv, extendSubstEnv, extendSubstEnvList, delSubstEnv, noTypeSubst, isEmptySubstEnv ) where @@ -30,6 +31,7 @@ module VarEnv ( import {-# SOURCE #-} CoreSyn( CoreExpr ) import {-# SOURCE #-} TypeRep( Type ) +import IdInfo ( OccInfo ) import OccName ( TidyOccEnv, emptyTidyOccEnv ) import Var ( Var, Id, IdOrTyVar ) import UniqFM @@ -74,6 +76,8 @@ type TyVarSubstEnv = SubstEnv -- of the form (DoneTy ty) *only* data SubstResult = DoneEx CoreExpr -- Completed term + | DoneId Id OccInfo -- Completed term variable, with occurrence info; only + -- used by the simplifier | DoneTy Type -- Completed type | ContEx SubstEnv CoreExpr -- A suspended substitution diff --git a/ghc/compiler/codeGen/CgCase.lhs b/ghc/compiler/codeGen/CgCase.lhs index b7c092c..4e755ca 100644 --- a/ghc/compiler/codeGen/CgCase.lhs +++ b/ghc/compiler/codeGen/CgCase.lhs @@ -1,7 +1,7 @@ % % (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 % -% $Id: CgCase.lhs,v 1.35 1999/10/13 16:39:14 simonmar Exp $ +% $Id: CgCase.lhs,v 1.36 1999/11/01 17:10:06 simonpj Exp $ % %******************************************************** %* * @@ -174,6 +174,7 @@ cgCase (StgCon (PrimOp op) args res_ty) } `thenC` -- bind the default binder if necessary + -- The deadness info is set by StgVarInfo (if (isDeadBinder bndr) then nopC else bindNewToTemp bndr `thenFC` \ bndr_amode -> diff --git a/ghc/compiler/codeGen/CgClosure.lhs b/ghc/compiler/codeGen/CgClosure.lhs index 71a2c06..dc32608 100644 --- a/ghc/compiler/codeGen/CgClosure.lhs +++ b/ghc/compiler/codeGen/CgClosure.lhs @@ -1,7 +1,7 @@ % % (c) The GRASP/AQUA Project, Glasgow University, 1992-1998 % -% $Id: CgClosure.lhs,v 1.35 1999/10/13 16:39:15 simonmar Exp $ +% $Id: CgClosure.lhs,v 1.36 1999/11/01 17:10:07 simonpj Exp $ % \section[CgClosure]{Code generation for closures} @@ -46,7 +46,7 @@ import ClosureInfo -- lots and lots of stuff import CmdLineOpts ( opt_GranMacros, opt_SccProfilingOn, opt_DoTickyProfiling ) import CostCentre import Id ( Id, idName, idType, idPrimRep ) -import Name ( Name ) +import Name ( Name, isLocalName ) import Module ( Module, pprModule ) import ListSetOps ( minusList ) import PrimRep ( PrimRep(..) ) @@ -372,9 +372,10 @@ closureCodeBody binder_info closure_info cc all_args body -- fast_entry_code = forceHeapCheck [] True fast_entry_code' fast_entry_code - = profCtrC SLIT("TICK_CTR") [ + = moduleName `thenFC` \ mod_name -> + profCtrC SLIT("TICK_CTR") [ CLbl ticky_ctr_label DataPtrRep, - mkCString (_PK_ (showSDocDebug (ppr name))), + mkCString (_PK_ (ppr_for_ticky_name mod_name name)), mkIntCLit stg_arity, -- total # of args mkIntCLit sp_stk_args, -- # passed on stk mkCString (_PK_ (map (showTypeCategory . idType) all_args)) @@ -437,6 +438,14 @@ closureCodeBody binder_info closure_info cc all_args body name = closureName closure_info fast_label = mkFastEntryLabel name stg_arity info_label = mkInfoTableLabel name + + +-- When printing the name of a thing in a ticky file, we want to +-- give the module name even for *local* things. We print +-- just "x (M)" rather that "M.x" to distinguish them from the global kind. +ppr_for_ticky_name mod_name name + | isLocalName name = showSDocDebug (ppr name <+> (parens (ppr mod_name))) + | otherwise = showSDocDebug (ppr name) \end{code} For lexically scoped profiling we have to load the cost centre from diff --git a/ghc/compiler/codeGen/CgUsages.lhs b/ghc/compiler/codeGen/CgUsages.lhs index ce20791..6f3353d 100644 --- a/ghc/compiler/codeGen/CgUsages.lhs +++ b/ghc/compiler/codeGen/CgUsages.lhs @@ -21,6 +21,7 @@ module CgUsages ( #include "HsVersions.h" import AbsCSyn +import PrimRep ( PrimRep(..) ) import AbsCUtils ( mkAbstractCs ) import CgMonad \end{code} @@ -143,9 +144,10 @@ That's done by functions which allocate stack space. \begin{code} adjustSpAndHp :: VirtualSpOffset -- New offset for Arg stack ptr -> Code -adjustSpAndHp newRealSp info_down (MkCgState absC binds - ((vSp,fSp,realSp,hwSp), - (vHp, rHp))) +adjustSpAndHp newRealSp (MkCgInfoDown _ _ _ ticky_ctr _) + (MkCgState absC binds + ((vSp,fSp,realSp,hwSp), + (vHp, rHp))) = MkCgState (mkAbstractCs [absC,move_sp,move_hp]) binds new_usage where @@ -153,9 +155,14 @@ adjustSpAndHp newRealSp info_down (MkCgState absC binds else (CAssign (CReg Sp) (CAddr (spRel realSp newRealSp))) + -- Adjust the heap pointer backwards in case we over-allocated + -- Analogously, we also remove bytes from the ticky counter move_hp = if (rHp == vHp) then AbsCNop - else (CAssign (CReg Hp) - (CAddr (hpRel rHp vHp))) + else mkAbstractCs [ + CAssign (CReg Hp) (CAddr (hpRel rHp vHp)), + profCtrAbsC SLIT("TICK_ALLOC_HEAP") + [ mkIntCLit (vHp - rHp), CLbl ticky_ctr DataPtrRep ] + ] new_usage = ((vSp, fSp, newRealSp, hwSp), (vHp,vHp)) \end{code} diff --git a/ghc/compiler/coreSyn/CoreLint.lhs b/ghc/compiler/coreSyn/CoreLint.lhs index f778d0d..b3de053 100644 --- a/ghc/compiler/coreSyn/CoreLint.lhs +++ b/ghc/compiler/coreSyn/CoreLint.lhs @@ -21,7 +21,7 @@ import CoreUtils ( exprOkForSpeculation ) import Bag import Const ( Con(..), DataCon, conType, conOkForApp, conOkForAlt ) -import Id ( isConstantId, idMustBeINLINEd ) +import Id ( mayHaveNoBinding ) import Var ( IdOrTyVar, Id, TyVar, idType, tyVarKind, isTyVar, isId ) import VarSet import Subst ( mkTyVarSubst, substTy ) @@ -219,20 +219,7 @@ lintSingleBinding rec_flag (binder,rhs) \begin{code} lintCoreExpr :: CoreExpr -> LintM Type -lintCoreExpr (Var var) - | isConstantId var = returnL (idType var) - -- Micro-hack here... Class decls generate applications of their - -- dictionary constructor, but don't generate a binding for the - -- constructor (since it would never be used). After a single round - -- of simplification, these dictionary constructors have been - -- inlined (from their UnfoldInfo) to CoCons. Just between - -- desugaring and simplfication, though, they appear as naked, unbound - -- variables as the function in an application. - -- The hack here simply doesn't check for out-of-scope-ness for - -- data constructors (at least, in a function position). - -- Ditto primitive Ids - - | otherwise = checkIdInScope var `seqL` returnL (idType var) +lintCoreExpr (Var var) = checkIdInScope var `seqL` returnL (idType var) lintCoreExpr (Note (Coerce to_ty from_ty) expr) = lintCoreExpr expr `thenL` \ expr_ty -> @@ -573,9 +560,17 @@ checkInScope :: SDoc -> IdOrTyVar -> LintM () checkInScope loc_msg var loc scope errs | isLocallyDefined var && not (var `elemVarSet` scope) - && not (isId var && idMustBeINLINEd var) -- Constructors and dict selectors - -- don't have bindings, - -- just MustInline prags + && not (isId var && mayHaveNoBinding var) + -- Micro-hack here... Class decls generate applications of their + -- dictionary constructor, but don't generate a binding for the + -- constructor (since it would never be used). After a single round + -- of simplification, these dictionary constructors have been + -- inlined (from their UnfoldInfo) to CoCons. Just between + -- desugaring and simplfication, though, they appear as naked, unbound + -- variables as the function in an application. + -- The hack here simply doesn't check for out-of-scope-ness for + -- data constructors (at least, in a function position). + -- Ditto primitive Ids = (Nothing, addErr errs (hsep [ppr var, loc_msg]) loc) | otherwise = (Nothing,errs) diff --git a/ghc/compiler/coreSyn/CoreSyn.lhs b/ghc/compiler/coreSyn/CoreSyn.lhs index c1eb1f0..94aa741 100644 --- a/ghc/compiler/coreSyn/CoreSyn.lhs +++ b/ghc/compiler/coreSyn/CoreSyn.lhs @@ -43,9 +43,9 @@ import TysWiredIn ( boolTy, stringTy, nilDataCon ) import CostCentre ( CostCentre, isDupdCC, noCostCentre ) import Var ( Var, Id, TyVar, IdOrTyVar, isTyVar, isId, idType ) import VarEnv -import Id ( mkWildId, getInlinePragma, idInfo ) +import Id ( mkWildId, getIdOccInfo, idInfo ) import Type ( Type, UsageAnn, mkTyVarTy, isUnLiftedType, seqType ) -import IdInfo ( InlinePragInfo(..), megaSeqIdInfo ) +import IdInfo ( OccInfo(..), megaSeqIdInfo ) import Const ( Con(..), DataCon, Literal(NoRepStr), PrimOp ) import TysWiredIn ( trueDataCon, falseDataCon ) import VarSet @@ -279,7 +279,7 @@ rhssOfAlts :: [Alt b] -> [Expr b] rhssOfAlts alts = [e | (_,_,e) <- alts] isDeadBinder :: CoreBndr -> Bool -isDeadBinder bndr | isId bndr = case getInlinePragma bndr of +isDeadBinder bndr | isId bndr = case getIdOccInfo bndr of IAmDead -> True other -> False | otherwise = False -- TyVars count as not dead diff --git a/ghc/compiler/coreSyn/CoreTidy.lhs b/ghc/compiler/coreSyn/CoreTidy.lhs index 51a5175..a980409 100644 --- a/ghc/compiler/coreSyn/CoreTidy.lhs +++ b/ghc/compiler/coreSyn/CoreTidy.lhs @@ -219,9 +219,8 @@ tidyTopId mod env@(tidy_env, var_env) env_idinfo id \begin{code} -- tidyIdInfo does these things: --- a) tidy the specialisation info (if any) --- b) zap a complicated ICanSafelyBeINLINEd pragma, --- c) zap the unfolding +-- a) tidy the specialisation info and worker info (if any) +-- b) zap the unfolding and demand info -- The latter two are to avoid space leaks tidyIdInfo env info @@ -229,13 +228,9 @@ tidyIdInfo env info where rules = specInfo info - info1 | isEmptyCoreRules rules = info + info2 | isEmptyCoreRules rules = info | otherwise = info `setSpecInfo` tidyRules env rules - info2 = case inlinePragInfo info of - ICanSafelyBeINLINEd _ _ -> info1 `setInlinePragInfo` NoInlinePragInfo - other -> info1 - info3 = info2 `setUnfoldingInfo` noUnfolding info4 = info3 `setDemandInfo` wwLazy -- I don't understand why... diff --git a/ghc/compiler/coreSyn/CoreUnfold.hi-boot b/ghc/compiler/coreSyn/CoreUnfold.hi-boot index 86ee1da..149d225 100644 --- a/ghc/compiler/coreSyn/CoreUnfold.hi-boot +++ b/ghc/compiler/coreSyn/CoreUnfold.hi-boot @@ -1,10 +1,9 @@ _interface_ CoreUnfold 1 _exports_ -CoreUnfold Unfolding UnfoldingGuidance mkUnfolding noUnfolding hasUnfolding isEvaldUnfolding seqUnfolding ; +CoreUnfold Unfolding UnfoldingGuidance noUnfolding hasUnfolding isEvaldUnfolding seqUnfolding ; _declarations_ 1 data Unfolding; 1 data UnfoldingGuidance; -1 mkUnfolding _:_ CoreSyn.CoreExpr -> Unfolding ;; 1 noUnfolding _:_ Unfolding ;; 1 hasUnfolding _:_ Unfolding -> PrelBase.Bool ;; 1 seqUnfolding _:_ Unfolding -> PrelBase.() ;; diff --git a/ghc/compiler/coreSyn/CoreUnfold.hi-boot-5 b/ghc/compiler/coreSyn/CoreUnfold.hi-boot-5 index 32c1673..319191e 100644 --- a/ghc/compiler/coreSyn/CoreUnfold.hi-boot-5 +++ b/ghc/compiler/coreSyn/CoreUnfold.hi-boot-5 @@ -1,8 +1,7 @@ __interface CoreUnfold 1 0 where -__export CoreUnfold Unfolding UnfoldingGuidance mkUnfolding noUnfolding hasUnfolding isEvaldUnfolding seqUnfolding ; +__export CoreUnfold Unfolding UnfoldingGuidance noUnfolding hasUnfolding isEvaldUnfolding seqUnfolding ; 1 data Unfolding; 1 data UnfoldingGuidance; -1 mkUnfolding :: CoreSyn.CoreExpr -> Unfolding ; 1 noUnfolding :: Unfolding ; 1 hasUnfolding :: Unfolding -> PrelBase.Bool ; 1 seqUnfolding :: Unfolding -> PrelBase.Z0T ; diff --git a/ghc/compiler/coreSyn/CoreUnfold.lhs b/ghc/compiler/coreSyn/CoreUnfold.lhs index 96c93a6..faa3983 100644 --- a/ghc/compiler/coreSyn/CoreUnfold.lhs +++ b/ghc/compiler/coreSyn/CoreUnfold.lhs @@ -16,7 +16,7 @@ find, unsurprisingly, a Core expression. module CoreUnfold ( Unfolding, UnfoldingGuidance, -- types - noUnfolding, mkUnfolding, seqUnfolding, + noUnfolding, mkTopUnfolding, mkUnfolding, mkCompulsoryUnfolding, seqUnfolding, mkOtherCon, otherCons, unfoldingTemplate, maybeUnfoldingTemplate, isEvaldUnfolding, isCheapUnfolding, @@ -55,11 +55,11 @@ import VarSet import Name ( isLocallyDefined ) import Const ( Con(..), isLitLitLit, isWHNFCon ) import PrimOp ( PrimOp(..), primOpIsDupable ) -import IdInfo ( ArityInfo(..), InlinePragInfo(..), OccInfo(..), workerExists ) +import IdInfo ( ArityInfo(..), InlinePragInfo(..), OccInfo(..), insideLam, workerExists ) import TyCon ( tyConFamilySize ) import Type ( splitAlgTyConApp_maybe, splitFunTy_maybe, isUnLiftedType ) import Const ( isNoRepLit ) -import Unique ( Unique, buildIdKey, augmentIdKey, runSTRepIdKey ) +import Unique ( Unique, buildIdKey, augmentIdKey ) import Maybes ( maybeToBool ) import Bag import Util ( isIn, lengthExceeds ) @@ -89,8 +89,12 @@ data Unfolding -- case x of { C f -> ... } -- Here, f gets an OtherCon [] unfolding. + | CompulsoryUnfolding CoreExpr -- There is no "original" definition, + -- so you'd better unfold. + | CoreUnfolding -- An unfolding with redundant cached information CoreExpr -- Template; binder-info is correct + Bool -- This is a top-level binding Bool -- exprIsCheap template (cached); it won't duplicate (much) work -- if you inline this in more than one place Bool -- exprIsValue template (cached); it is ok to discard a `seq` on @@ -98,8 +102,8 @@ data Unfolding UnfoldingGuidance -- Tells about the *size* of the template. seqUnfolding :: Unfolding -> () -seqUnfolding (CoreUnfolding e b1 b2 g) - = seqExpr e `seq` b1 `seq` b2 `seq` seqGuidance g +seqUnfolding (CoreUnfolding e top b1 b2 g) + = seqExpr e `seq` top `seq` b1 `seq` b2 `seq` seqGuidance g seqUnfolding other = () \end{code} @@ -107,35 +111,44 @@ seqUnfolding other = () noUnfolding = NoUnfolding mkOtherCon = OtherCon -mkUnfolding expr +mkTopUnfolding expr = mkUnfolding True expr + +mkUnfolding top_lvl expr = CoreUnfolding (occurAnalyseGlobalExpr expr) + top_lvl (exprIsCheap expr) (exprIsValue expr) (calcUnfoldingGuidance opt_UF_CreationThreshold expr) +mkCompulsoryUnfolding expr -- Used for things that absolutely must be unfolded + = CompulsoryUnfolding (occurAnalyseGlobalExpr expr) + unfoldingTemplate :: Unfolding -> CoreExpr -unfoldingTemplate (CoreUnfolding expr _ _ _) = expr +unfoldingTemplate (CoreUnfolding expr _ _ _ _) = expr +unfoldingTemplate (CompulsoryUnfolding expr) = expr unfoldingTemplate other = panic "getUnfoldingTemplate" maybeUnfoldingTemplate :: Unfolding -> Maybe CoreExpr -maybeUnfoldingTemplate (CoreUnfolding expr _ _ _) = Just expr -maybeUnfoldingTemplate other = Nothing +maybeUnfoldingTemplate (CoreUnfolding expr _ _ _ _) = Just expr +maybeUnfoldingTemplate (CompulsoryUnfolding expr) = Just expr +maybeUnfoldingTemplate other = Nothing otherCons (OtherCon cons) = cons otherCons other = [] isEvaldUnfolding :: Unfolding -> Bool -isEvaldUnfolding (OtherCon _) = True -isEvaldUnfolding (CoreUnfolding _ _ is_evald _) = is_evald -isEvaldUnfolding other = False +isEvaldUnfolding (OtherCon _) = True +isEvaldUnfolding (CoreUnfolding _ _ _ is_evald _) = is_evald +isEvaldUnfolding other = False isCheapUnfolding :: Unfolding -> Bool -isCheapUnfolding (CoreUnfolding _ is_cheap _ _) = is_cheap -isCheapUnfolding other = False +isCheapUnfolding (CoreUnfolding _ _ is_cheap _ _) = is_cheap +isCheapUnfolding other = False hasUnfolding :: Unfolding -> Bool -hasUnfolding (CoreUnfolding _ _ _ _) = True -hasUnfolding other = False +hasUnfolding (CoreUnfolding _ _ _ _ _) = True +hasUnfolding (CompulsoryUnfolding _) = True +hasUnfolding other = False hasSomeUnfolding :: Unfolding -> Bool hasSomeUnfolding NoUnfolding = False @@ -143,11 +156,6 @@ hasSomeUnfolding other = True data UnfoldingGuidance = UnfoldNever - | UnfoldAlways -- There is no "original" definition, - -- so you'd better unfold. Or: something - -- so cheap to unfold (e.g., 1#) that - -- you should do it absolutely always. - | UnfoldIfGoodArgs Int -- and "n" value args [Int] -- Discount if the argument is evaluated. @@ -167,7 +175,6 @@ seqGuidance other = () \begin{code} instance Outputable UnfoldingGuidance where - ppr UnfoldAlways = ptext SLIT("ALWAYS") ppr UnfoldNever = ptext SLIT("NEVER") ppr (UnfoldIfGoodArgs v cs size discount) = hsep [ ptext SLIT("IF_ARGS"), int v, @@ -189,18 +196,20 @@ calcUnfoldingGuidance -> CoreExpr -- expression to look at -> UnfoldingGuidance calcUnfoldingGuidance bOMB_OUT_SIZE expr - | exprIsTrivial expr -- Often trivial expressions are never bound - -- to an expression, but it can happen. For - -- example, the Id for a nullary constructor has - -- a trivial expression as its unfolding, and - -- we want to make sure that we always unfold it. - = UnfoldAlways - - | otherwise = case collect_val_bndrs expr of { (inline, val_binders, body) -> + let + n_val_binders = length val_binders + in case (sizeExpr bOMB_OUT_SIZE val_binders body) of - TooBig -> UnfoldNever + TooBig + | not inline -> UnfoldNever + -- A big function with an INLINE pragma must + -- have an UnfoldIfGoodArgs guidance + | inline -> UnfoldIfGoodArgs n_val_binders + (map (const 0) val_binders) + (n_val_binders + 2) 0 + -- See comments with final_size below SizeIs size cased_args scrut_discount -> UnfoldIfGoodArgs @@ -211,14 +220,22 @@ calcUnfoldingGuidance bOMB_OUT_SIZE expr where boxed_size = I# size - n_val_binders = length val_binders - - final_size | inline = boxed_size `min` (n_val_binders + 2) + final_size | inline = 0 -- Trying very agresssive inlining of INLINE things. + -- Reason: we don't want to call the un-inlined version, + -- because its body is awful + -- boxed_size `min` (n_val_binders + 2) -- Trying "+2" again... | otherwise = boxed_size -- The idea is that if there is an INLINE pragma (inline is True) - -- and there's a big body, we give a size of n_val_binders+2. This - -- This is enough to defeat the no-size-increase test in callSiteInline; - -- we don't want to inline an INLINE thing into a totally boring context + -- and there's a big body, we give a size of n_val_binders+1. This + -- This is enough to pass the no-size-increase test in callSiteInline, + -- but no more. + -- I tried n_val_binders+2, to just defeat the test, on the grounds that + -- we don't want to inline an INLINE thing into a totally boring context, + -- but I found that some wrappers (notably one for a join point) weren't + -- getting inlined, and that was terrible. In that particular case, the + -- call site applied the wrapper to realWorld#, so if we made that an + -- "interesting" value the inlining would have happened... but it was + -- simpler to inline wrappers a little more eagerly instead. -- -- Sometimes, though, an INLINE thing is smaller than n_val_binders+2. -- A particular case in point is a constructor, which has size 1. @@ -306,15 +323,17 @@ sizeExpr (I# bOMB_OUT_SIZE) args expr ------------ size_up_app (App fun arg) args = size_up_app fun (arg:args) - size_up_app fun args = foldr (addSize . nukeScrutDiscount . size_up) (fun_discount fun) args + size_up_app fun args = foldr (addSize . nukeScrutDiscount . size_up) + (size_up_fun fun) + args -- A function application with at least one value argument -- so if the function is an argument give it an arg-discount -- Also behave specially if the function is a build - fun_discount (Var fun) | idUnique fun == buildIdKey = buildSize - | idUnique fun == augmentIdKey = augmentSize - | fun `is_elem` args = scrutArg fun - fun_discount other = sizeZero + size_up_fun (Var fun) | idUnique fun == buildIdKey = buildSize + | idUnique fun == augmentIdKey = augmentSize + | fun `is_elem` args = scrutArg fun `addSize` sizeOne + size_up_fun other = size_up other ------------ size_up_alt (con, bndrs, rhs) = size_up rhs @@ -443,7 +462,6 @@ couldBeSmallEnoughToInline other = True certainlySmallEnoughToInline :: UnfoldingGuidance -> Bool certainlySmallEnoughToInline UnfoldNever = False -certainlySmallEnoughToInline UnfoldAlways = True certainlySmallEnoughToInline (UnfoldIfGoodArgs _ _ size _) = size <= opt_UF_UseThreshold \end{code} @@ -500,95 +518,93 @@ so we can inline if it occurs once, or is small \begin{code} callSiteInline :: Bool -- True <=> the Id is black listed -> Bool -- 'inline' note at call site + -> OccInfo -> Id -- The Id -> [Bool] -- One for each value arg; True if it is interesting -> Bool -- True <=> continuation is interesting -> Maybe CoreExpr -- Unfolding, if any -callSiteInline black_listed inline_call id arg_infos interesting_cont +callSiteInline black_listed inline_call occ id arg_infos interesting_cont = case getIdUnfolding id of { NoUnfolding -> Nothing ; OtherCon _ -> Nothing ; - CoreUnfolding unf_template is_cheap _ guidance -> + CompulsoryUnfolding unf_template -> Just unf_template ; + CoreUnfolding unf_template is_top is_cheap _ guidance -> let result | yes_or_no = Just unf_template | otherwise = Nothing - inline_prag = getInlinePragma id n_val_args = length arg_infos - yes_or_no = - case inline_prag of - IAmDead -> pprTrace "callSiteInline: dead" (ppr id) False - IMustNotBeINLINEd -> False - IAmALoopBreaker -> False - IMustBeINLINEd -> True -- Overrides absolutely everything, including the black list - ICanSafelyBeINLINEd in_lam one_br -> consider in_lam True one_br - NoInlinePragInfo -> consider InsideLam False False - - consider in_lam once once_in_one_branch + yes_or_no | black_listed = False + | otherwise = case occ of + IAmDead -> pprTrace "callSiteInline: dead" (ppr id) False + IAmALoopBreaker -> False + OneOcc in_lam one_br -> (not in_lam || is_cheap) && consider_safe in_lam True one_br + NoOccInfo -> is_cheap && consider_safe True False False + + consider_safe in_lam once once_in_one_branch + -- consider_safe decides whether it's a good idea to inline something, + -- given that there's no work-duplication issue (the caller checks that). + -- once_in_one_branch = True means there's a unique textual occurrence | inline_call = True + | once_in_one_branch -- Be very keen to inline something if this is its unique occurrence; that -- gives a good chance of eliminating the original binding for the thing. -- The only time we hold back is when substituting inside a lambda; -- then if the context is totally uninteresting (not applied, not scrutinised) -- there is no point in substituting because it might just increase allocation. - = WARN( case in_lam of { NotInsideLam -> True; other -> False }, - text "callSiteInline:oneOcc" <+> ppr id ) - -- If it has one occurrence, not inside a lambda, PreInlineUnconditionally - -- should have zapped it already - is_cheap && (not (null arg_infos) || interesting_cont) + = not in_lam || not (null arg_infos) || interesting_cont - | otherwise -- Occurs (textually) more than once, so look at its size + | otherwise = case guidance of - UnfoldAlways -> True - UnfoldNever -> False + UnfoldNever -> False ; UnfoldIfGoodArgs n_vals_wanted arg_discounts size res_discount - | enough_args && size <= (n_vals_wanted + 1) + + | enough_args && size <= (n_vals_wanted + 1) -- No size increase -- Size of call is n_vals_wanted (+1 for the function) - -> case in_lam of - NotInsideLam -> True - InsideLam -> is_cheap - - | not (or arg_infos || really_interesting_cont || once) - -- If it occurs more than once, there must be something interesting - -- about some argument, or the result, to make it worth inlining - -- We also drop this case if the thing occurs once, although perhaps in - -- several branches. In this case we are keener about inlining in the hope - -- that we'll be able to drop the allocation for the function altogether. - -> False - - | otherwise - -> case in_lam of - NotInsideLam -> small_enough - InsideLam -> is_cheap && small_enough - - where - enough_args = n_val_args >= n_vals_wanted - really_interesting_cont | n_val_args < n_vals_wanted = False -- Too few args - | n_val_args == n_vals_wanted = interesting_cont - | otherwise = True -- Extra args - -- This rather elaborate defn for really_interesting_cont is important - -- Consider an I# = INLINE (\x -> I# {x}) - -- The unfolding guidance deems it to have size 2, and no arguments. - -- So in an application (I# y) we must take the extra arg 'y' as - -- evidence of an interesting context! - - small_enough = (size - discount) <= opt_UF_UseThreshold - discount = computeDiscount n_vals_wanted arg_discounts res_discount + -> True + + | otherwise + -> some_benefit && small_enough + + where + some_benefit = or arg_infos || really_interesting_cont || + (not is_top && (once || (n_vals_wanted > 0 && enough_args))) + -- If it occurs more than once, there must be something interesting + -- about some argument, or the result context, to make it worth inlining + -- + -- If a function has a nested defn we also record some-benefit, + -- on the grounds that we are often able to eliminate the binding, + -- and hence the allocation, for the function altogether; this is good + -- for join points. But this only makes sense for *functions*; + -- inlining a constructor doesn't help allocation unless the result is + -- scrutinised. UNLESS the constructor occurs just once, albeit possibly + -- in multiple case branches. Then inlining it doesn't increase allocation, + -- but it does increase the chance that the constructor won't be allocated at all + -- in the branches that don't use it. + + enough_args = n_val_args >= n_vals_wanted + really_interesting_cont | n_val_args < n_vals_wanted = False -- Too few args + | n_val_args == n_vals_wanted = interesting_cont + | otherwise = True -- Extra args + -- really_interesting_cont tells if the result of the + -- call is in an interesting context. + + small_enough = (size - discount) <= opt_UF_UseThreshold + discount = computeDiscount n_vals_wanted arg_discounts res_discount arg_infos really_interesting_cont - - + in #ifdef DEBUG if opt_D_dump_inlinings then pprTrace "Considering inlining" (ppr id <+> vcat [text "black listed" <+> ppr black_listed, - text "inline prag:" <+> ppr inline_prag, + text "occ info:" <+> ppr occ, text "arg infos" <+> ppr arg_infos, text "interesting continuation" <+> ppr interesting_cont, text "is cheap" <+> ppr is_cheap, @@ -646,6 +662,19 @@ For optimisation we use phase 1,2 and nothing (i.e. no -finline-phase flag) in that order. The meanings of these are determined by the @blackListed@ function here. +The final simplification doesn't have a phase number + +Pragmas +~~~~~~~ + Pragma Black list if + +(least black listing, most inlining) + INLINE n foo phase is Just p *and* p Maybe Int -- Inline phase @@ -655,39 +684,42 @@ blackListed :: IdSet -- Used in transformation rules -- inlined because of the inline phase we are in. This is the sole -- place that the inline phase number is looked at. --- ToDo: improve horrible coding style (too much duplication) +blackListed rule_vars Nothing -- Last phase + = \v -> case getInlinePragma v of + IMustNotBeINLINEd False Nothing -> True -- An unconditional NOINLINE pragma + other -> False +blackListed rule_vars (Just 0) -- Phase 0: used for 'no imported inlinings please' -- This prevents wrappers getting inlined which in turn is bad for full laziness -- NEW: try using 'not a wrapper' rather than 'not imported' in this phase. -- This allows a little more inlining, which seems to be important, sometimes. -- For example PrelArr.newIntArr gets better. -blackListed rule_vars (Just 0) - = \v -> let v_uniq = idUnique v - in - -- not (isLocallyDefined v) - workerExists (getIdWorkerInfo v) - || v `elemVarSet` rule_vars - || not (isEmptyCoreRules (getIdSpecialisation v)) - || v_uniq == runSTRepIdKey - --- Phase 1: don't inline any rule-y things or things with specialisations -blackListed rule_vars (Just 1) - = \v -> let v_uniq = idUnique v - in v `elemVarSet` rule_vars - || not (isEmptyCoreRules (getIdSpecialisation v)) - || v_uniq == runSTRepIdKey - --- Phase 2: allow build/augment to inline, and specialisations -blackListed rule_vars (Just 2) - = \v -> let v_uniq = idUnique v - in (v `elemVarSet` rule_vars && not (v_uniq == buildIdKey || - v_uniq == augmentIdKey)) - || v_uniq == runSTRepIdKey - --- Otherwise just go for it -blackListed rule_vars phase - = \v -> False + = \v -> -- workerExists (getIdWorkerInfo v) || normal_case rule_vars 0 v + -- True -- Try going back to no inlinings at all + -- BUT: I found that there is some advantage in doing + -- local inlinings first. For example in fish/Main.hs + -- it's advantageous to inline scale_vec2 before inlining + -- wrappers from PrelNum that make it look big. + not (isLocallyDefined v) -- This seems best at the moment + +blackListed rule_vars (Just phase) + = \v -> normal_case rule_vars phase v + +normal_case rule_vars phase v + = case getInlinePragma v of + NoInlinePragInfo -> has_rules + + IMustNotBeINLINEd from_INLINE Nothing + | from_INLINE -> has_rules -- Black list until final phase + | otherwise -> True -- Always blacklisted + + IMustNotBeINLINEd from_inline (Just threshold) + | from_inline -> phase < threshold && has_rules + | otherwise -> phase < threshold || has_rules + where + has_rules = v `elemVarSet` rule_vars + || not (isEmptyCoreRules (getIdSpecialisation v)) \end{code} diff --git a/ghc/compiler/coreSyn/CoreUtils.lhs b/ghc/compiler/coreSyn/CoreUtils.lhs index e2a3b13..198b406 100644 --- a/ghc/compiler/coreSyn/CoreUtils.lhs +++ b/ghc/compiler/coreSyn/CoreUtils.lhs @@ -105,7 +105,6 @@ applyTypeToArgs e op_ty (other_arg : args) Nothing -> pprPanic "applyTypeToArgs" (pprCoreExpr e) \end{code} - %************************************************************************ %* * \subsection{Figuring out things about expressions} @@ -344,8 +343,9 @@ exprEtaExpandArity (Case scrut _ alts) exprEtaExpandArity (Note note e) | ok_note note = exprEtaExpandArity e where - ok_note InlineCall = True - ok_note other = False + ok_note (Coerce _ _) = True + ok_note InlineCall = True + ok_note other = False -- Notice that we do not look through __inline_me__ -- This one is a bit more surprising, but consider -- f = _inline_me (\x -> e) @@ -355,11 +355,6 @@ exprEtaExpandArity (Note note e) -- giving just -- f = \x -> e -- A Bad Idea - -- - -- Notice also that we don't look through Coerce - -- This is simply because the etaExpand code in SimplUtils - -- isn't capable of making the alternating lambdas and coerces - -- that would be necessary to exploit it exprEtaExpandArity other = 0 -- Could do better for applications diff --git a/ghc/compiler/coreSyn/PprCore.lhs b/ghc/compiler/coreSyn/PprCore.lhs index e4f2d7b..9f8a16d 100644 --- a/ghc/compiler/coreSyn/PprCore.lhs +++ b/ghc/compiler/coreSyn/PprCore.lhs @@ -18,7 +18,7 @@ module PprCore ( import CoreSyn import CostCentre ( pprCostCentreCore ) -import Id ( idType, idInfo, getInlinePragma, getIdDemandInfo, Id ) +import Id ( idType, idInfo, getInlinePragma, getIdDemandInfo, getIdOccInfo, Id ) import Var ( isTyVar ) import IdInfo ( IdInfo, arityInfo, ppArityInfo, ppFlavourInfo, flavourInfo, @@ -334,7 +334,8 @@ pprTypedBinder binder -- printing interfaces, because we get \ x::(a->b) y::(c->d) -> ... -- When printing any Id binder in debug mode, we print its inline pragma and one-shot-ness -pprIdBndr id = ppr id <+> ifPprDebug (ppr (getInlinePragma id) <+> ppr (getIdDemandInfo id)) <+> ppr (lbvarInfo (idInfo id)) +pprIdBndr id = ppr id <+> ifPprDebug (ppr (getInlinePragma id) <+> ppr (getIdOccInfo id) <+> + ppr (getIdDemandInfo id)) <+> ppr (lbvarInfo (idInfo id)) \end{code} diff --git a/ghc/compiler/coreSyn/Subst.lhs b/ghc/compiler/coreSyn/Subst.lhs index 6974223..02599cb 100644 --- a/ghc/compiler/coreSyn/Subst.lhs +++ b/ghc/compiler/coreSyn/Subst.lhs @@ -7,12 +7,12 @@ module Subst ( -- In-scope set InScopeSet, emptyInScopeSet, - lookupInScope, setInScope, extendInScope, extendInScopes, isInScope, + lookupInScope, setInScope, extendInScope, extendInScopes, isInScope, modifyInScope, -- Substitution stuff Subst, TyVarSubst, IdSubst, emptySubst, mkSubst, substEnv, substInScope, - lookupSubst, isEmptySubst, extendSubst, extendSubstList, + lookupSubst, lookupIdSubst, isEmptySubst, extendSubst, extendSubstList, zapSubstEnv, setSubstEnv, bindSubst, unBindSubst, bindSubstList, unBindSubstList, @@ -44,13 +44,14 @@ import Type ( ThetaType, import VarSet import VarEnv import Var ( setVarUnique, isId ) -import Id ( idType, setIdType ) -import IdInfo ( IdInfo, zapFragileIdInfo, +import Id ( idType, setIdType, getIdOccInfo, zapFragileIdInfo ) +import Name ( isLocallyDefined ) +import IdInfo ( IdInfo, isFragileOccInfo, specInfo, setSpecInfo, workerExists, workerInfo, setWorkerInfo, WorkerInfo ) import UniqSupply ( UniqSupply, uniqFromSupply, splitUniqSupply ) -import Var ( Var, IdOrTyVar, Id, TyVar, isTyVar, maybeModifyIdInfo ) +import Var ( Var, IdOrTyVar, Id, TyVar, isTyVar ) import Outputable import Util ( mapAccumL, foldl2, seqList, ($!) ) \end{code} @@ -62,11 +63,11 @@ import Util ( mapAccumL, foldl2, seqList, ($!) ) %************************************************************************ \begin{code} -type InScopeSet = VarSet +type InScopeSet = VarEnv Var data Subst = Subst InScopeSet -- In scope SubstEnv -- Substitution itself - -- INVARIANT 1: The in-scope set is a superset + -- INVARIANT 1: The (domain of the) in-scope set is a superset -- of the free vars of the range of the substitution -- that might possibly clash with locally-bound variables -- in the thing being substituted in. @@ -85,9 +86,46 @@ data Subst = Subst InScopeSet -- In scope type IdSubst = Subst \end{code} +The general plan about the substitution and in-scope set for Ids is as follows + +* substId always adds new_id to the in-scope set. + new_id has a correctly-substituted type, but all its fragile IdInfo has been zapped. + That is added back in later. So new_id is the minimal thing it's + correct to substitute. + +* substId adds a binding (DoneVar new_id occ) to the substitution if + EITHER the Id's unique has changed + OR the Id has interesting occurrence information + Note, though that the substitution isn't necessarily extended + if the type changes. Why not? Because of the next point: + +* We *always, always* finish by looking up in the in-scope set + any variable that doesn't get a DoneEx or DoneVar hit in the substitution. + Reason: so that we never finish up with a "old" Id in the result. + An old Id might point to an old unfolding and so on... which gives a space leak. + + [The DoneEx and DoneVar hits map to "new" stuff.] + +* It follows that substExpr must not do a no-op if the substitution is empty. + substType is free to do so, however. + +* When we come to a let-binding (say) we generate new IdInfo, including an + unfolding, attach it to the binder, and add this newly adorned binder to + the in-scope set. So all subsequent occurrences of the binder will get mapped + to the full-adorned binder, which is also the one put in the binding site. + +* The in-scope "set" usually maps x->x; we use it simply for its domain. + But sometimes we have two in-scope Ids that are synomyms, and should + map to the same target: x->x, y->x. Notably: + case y of x { ... } + That's why the "set" is actually a VarEnv Var + \begin{code} emptyInScopeSet :: InScopeSet emptyInScopeSet = emptyVarSet + +add_in_scope :: InScopeSet -> Var -> InScopeSet +add_in_scope in_scope v = extendVarEnv in_scope v v \end{code} @@ -97,7 +135,7 @@ isEmptySubst :: Subst -> Bool isEmptySubst (Subst _ env) = isEmptySubstEnv env emptySubst :: Subst -emptySubst = Subst emptyVarSet emptySubstEnv +emptySubst = Subst emptyInScopeSet emptySubstEnv mkSubst :: InScopeSet -> SubstEnv -> Subst mkSubst in_scope env = Subst in_scope env @@ -120,24 +158,42 @@ extendSubstList (Subst in_scope env) v r = Subst in_scope (extendSubstEnvList en lookupSubst :: Subst -> Var -> Maybe SubstResult lookupSubst (Subst _ env) v = lookupSubstEnv env v +lookupIdSubst :: Subst -> Id -> SubstResult +-- Does the lookup in the in-scope set too +lookupIdSubst (Subst in_scope env) v + = case lookupSubstEnv env v of + Just (DoneId v' occ) -> case lookupVarEnv in_scope v' of + Just v'' -> DoneId v'' occ + Nothing -> DoneId v' occ + Just res -> res + Nothing -> DoneId v' (getIdOccInfo v') + where + v' = case lookupVarEnv in_scope v of + Just v' -> v' + Nothing -> v + lookupInScope :: Subst -> Var -> Maybe Var -lookupInScope (Subst in_scope _) v = lookupVarSet in_scope v +lookupInScope (Subst in_scope _) v = lookupVarEnv in_scope v isInScope :: Var -> Subst -> Bool -isInScope v (Subst in_scope _) = v `elemVarSet` in_scope +isInScope v (Subst in_scope _) = v `elemVarEnv` in_scope extendInScope :: Subst -> Var -> Subst -extendInScope (Subst in_scope env) v = Subst (extendVarSet in_scope v) env +extendInScope (Subst in_scope env) v = Subst (in_scope `add_in_scope` v) env + +modifyInScope :: Subst -> Var -> Var -> Subst +modifyInScope (Subst in_scope env) old_v new_v = Subst (extendVarEnv in_scope old_v new_v) env + -- make old_v map to new_v extendInScopes :: Subst -> [Var] -> Subst -extendInScopes (Subst in_scope env) vs = Subst (foldl extendVarSet in_scope vs) env +extendInScopes (Subst in_scope env) vs = Subst (foldl add_in_scope in_scope vs) env ------------------------------- bindSubst :: Subst -> Var -> Var -> Subst -- Extend with a substitution, v1 -> Var v2 -- and extend the in-scopes with v2 bindSubst (Subst in_scope env) old_bndr new_bndr - = Subst (in_scope `extendVarSet` new_bndr) + = Subst (in_scope `add_in_scope` new_bndr) (extendSubstEnv env old_bndr subst_result) where subst_result | isId old_bndr = DoneEx (Var new_bndr) @@ -147,7 +203,7 @@ unBindSubst :: Subst -> Var -> Var -> Subst -- Reverse the effect of bindSubst -- If old_bndr was already in the substitution, this doesn't quite work unBindSubst (Subst in_scope env) old_bndr new_bndr - = Subst (in_scope `delVarSet` new_bndr) (delSubstEnv env old_bndr) + = Subst (in_scope `delVarEnv` new_bndr) (delSubstEnv env old_bndr) -- And the "List" forms bindSubstList :: Subst -> [Var] -> [Var] -> Subst @@ -164,8 +220,7 @@ setInScope :: Subst -- Take env part from here -> InScopeSet -> Subst setInScope (Subst in_scope1 env1) in_scope2 - = ASSERT( in_scope1 `subVarSet` in_scope1 ) - Subst in_scope2 env1 + = Subst in_scope2 env1 setSubstEnv :: Subst -- Take in-scope part from here -> SubstEnv -- ... and env part from here @@ -194,7 +249,7 @@ mkTyVarSubst tyvars tys = Subst (tyVarsOfTypes tys) (zip_ty_env tyvars tys empty -- Here we expect that the free vars of the range of the -- substitution will be empty. mkTopTyVarSubst :: [TyVar] -> [Type] -> Subst -mkTopTyVarSubst tyvars tys = Subst emptyVarSet (zip_ty_env tyvars tys emptySubstEnv) +mkTopTyVarSubst tyvars tys = Subst emptyInScopeSet (zip_ty_env tyvars tys emptySubstEnv) zip_ty_env [] [] env = env zip_ty_env (tv:tvs) (ty:tys) env = zip_ty_env tvs tys (extendSubstEnv env tv (DoneTy ty)) @@ -244,7 +299,7 @@ substTyVar subst@(Subst in_scope env) old_var -- -- The new_id isn't cloned, but it may have a different type -- etc, so we must return it, not the old id - = (Subst (in_scope `extendVarSet` new_var) + = (Subst (in_scope `add_in_scope` new_var) (delSubstEnv env old_var), new_var) @@ -253,7 +308,7 @@ substTyVar subst@(Subst in_scope env) old_var -- Extending the substitution to do this renaming also -- has the (correct) effect of discarding any existing -- substitution for that variable - = (Subst (in_scope `extendVarSet` new_var) + = (Subst (in_scope `add_in_scope` new_var) (extendSubstEnv env old_var (DoneTy (TyVarTy new_var))), new_var) where @@ -279,51 +334,48 @@ and so far has proved unnecessary. \begin{code} substExpr :: Subst -> CoreExpr -> CoreExpr -substExpr subst expr | isEmptySubst subst = expr - | otherwise = subst_expr subst expr +substExpr subst expr + -- NB: we do not do a no-op when the substitution is empty, + -- because we always want to substitute the variables in the + -- in-scope set for their occurrences. Why? + -- (a) because they may contain more information + -- (b) because leaving an un-substituted Id might cause + -- a space leak (its unfolding might point to an old version + -- of its right hand side). -subst_expr subst expr = go expr where - go (Var v) = case lookupSubst subst v of - Just (DoneEx e') -> e' - Just (ContEx env' e') -> subst_expr (setSubstEnv subst env') e' --- NO! NO! SLPJ 14 July 99 - Nothing -> case lookupInScope subst v of - Just v' -> Var v' - Nothing -> Var v - -- NB: we look up in the in_scope set because the variable - -- there may have more info. In particular, when substExpr - -- is called from the simplifier, the type inside the *occurrences* - -- of a variable may not be right; we should replace it with the - -- binder, from the in_scope set. - --- Nothing -> Var v + go (Var v) = -- See the notes at the top, with the Subst data type declaration + case lookupIdSubst subst v of + + ContEx env' e' -> substExpr (setSubstEnv subst env') e' + DoneId v _ -> Var v + DoneEx e' -> e' go (Type ty) = Type (go_ty ty) go (Con con args) = Con con (map go args) go (App fun arg) = App (go fun) (go arg) go (Note note e) = Note (go_note note) (go e) - go (Lam bndr body) = Lam bndr' (subst_expr subst' body) + go (Lam bndr body) = Lam bndr' (substExpr subst' body) where (subst', bndr') = substBndr subst bndr - go (Let (NonRec bndr rhs) body) = Let (NonRec bndr' (go rhs)) (subst_expr subst' body) + go (Let (NonRec bndr rhs) body) = Let (NonRec bndr' (go rhs)) (substExpr subst' body) where (subst', bndr') = substBndr subst bndr - go (Let (Rec pairs) body) = Let (Rec pairs') (subst_expr subst' body) + go (Let (Rec pairs) body) = Let (Rec pairs') (substExpr subst' body) where (subst', bndrs') = substBndrs subst (map fst pairs) pairs' = bndrs' `zip` rhss' - rhss' = map (subst_expr subst' . snd) pairs + rhss' = map (substExpr subst' . snd) pairs go (Case scrut bndr alts) = Case (go scrut) bndr' (map (go_alt subst') alts) where (subst', bndr') = substBndr subst bndr - go_alt subst (con, bndrs, rhs) = (con, bndrs', subst_expr subst' rhs) + go_alt subst (con, bndrs, rhs) = (con, bndrs', substExpr subst' rhs) where (subst', bndrs') = substBndrs subst bndrs @@ -338,7 +390,6 @@ Substituting in binders is a rather tricky part of the whole compiler. When we hit a binder we may need to (a) apply the the type envt (if non-empty) to its type - (b) apply the type envt and id envt to its SpecEnv (if it has one) (c) give it a new unique to avoid name clashes \begin{code} @@ -355,16 +406,15 @@ substIds :: Subst -> [Id] -> (Subst, [Id]) substIds subst bndrs = mapAccumL substId subst bndrs substId :: Subst -> Id -> (Subst, Id) - --- Returns an Id with empty unfolding and spec-env. --- It's up to the caller to sort these out. + -- Returns an Id with empty IdInfo + -- See the notes with the Subst data type decl at the + -- top of this module substId subst@(Subst in_scope env) old_id - = (Subst (in_scope `extendVarSet` new_id) - (extendSubstEnv env old_id (DoneEx (Var new_id))), - new_id) + = (Subst (in_scope `add_in_scope` new_id) new_env, new_id) where id_ty = idType old_id + occ_info = getIdOccInfo old_id -- id1 has its type zapped id1 | noTypeSubst env @@ -374,11 +424,19 @@ substId subst@(Subst in_scope env) old_id -- in a Note in the id's type itself | otherwise = setIdType old_id (substTy subst id_ty) - -- id2 has its fragile IdInfo zapped - id2 = maybeModifyIdInfo zapFragileIdInfo id1 + -- id2 has its IdInfo zapped + id2 = zapFragileIdInfo id1 -- new_id is cloned if necessary new_id = uniqAway in_scope id2 + + -- Extend the substitution if the unique has changed, + -- or there's some useful occurrence information + -- See the notes with substTyVar for the delSubstEnv + new_env | new_id /= old_id || isFragileOccInfo occ_info + = extendSubstEnv env old_id (DoneId new_id occ_info) + | otherwise + = delSubstEnv env old_id \end{code} Now a variant that unconditionally allocates a new unique. @@ -392,7 +450,7 @@ substAndCloneIds subst us (b:bs) = case substAndCloneId subst us b of { (sub substAndCloneId :: Subst -> UniqSupply -> Id -> (Subst, UniqSupply, Id) substAndCloneId subst@(Subst in_scope env) us old_id - = (Subst (in_scope `extendVarSet` new_id) + = (Subst (in_scope `add_in_scope` new_id) (extendSubstEnv env old_id (DoneEx (Var new_id))), new_us, new_id) @@ -401,7 +459,7 @@ substAndCloneId subst@(Subst in_scope env) us old_id id1 | noTypeSubst env || isEmptyVarSet (tyVarsOfType id_ty) = old_id | otherwise = setIdType old_id (substTy subst id_ty) - id2 = maybeModifyIdInfo zapFragileIdInfo id1 + id2 = zapFragileIdInfo id1 new_id = setVarUnique id2 (uniqFromSupply us1) (us1,new_us) = splitUniqSupply us \end{code} @@ -448,6 +506,7 @@ substWorker subst Nothing substWorker subst (Just w) = case lookupSubst subst w of Nothing -> Just w + Just (DoneId w1 _) -> Just w1 Just (DoneEx (Var w1)) -> Just w1 Just (DoneEx other) -> WARN( True, text "substWorker: DoneEx" <+> ppr w ) Nothing -- Worker has got substituted away altogether @@ -479,6 +538,7 @@ substRules subst (Rules rules rhs_fvs) where subst_fv fv = case lookupSubstEnv se fv of Nothing -> unitVarSet fv + Just (DoneId fv' _) -> unitVarSet fv' Just (DoneEx expr) -> exprFreeVars expr Just (DoneTy ty) -> tyVarsOfType ty Just (ContEx se' expr) -> subst_fvs se' (exprFreeVars expr) diff --git a/ghc/compiler/hsSyn/HsBinds.lhs b/ghc/compiler/hsSyn/HsBinds.lhs index da37e20..4d3fe4a 100644 --- a/ghc/compiler/hsSyn/HsBinds.lhs +++ b/ghc/compiler/hsSyn/HsBinds.lhs @@ -227,9 +227,11 @@ data Sig name SrcLoc | InlineSig name -- INLINE f + (Maybe Int) -- phase SrcLoc | NoInlineSig name -- NOINLINE f + (Maybe Int) -- phase SrcLoc | SpecInstSig (HsType name) -- (Class tys); should be a specialisation of the @@ -250,8 +252,8 @@ sigsForMe f sigs sig_for_me (Sig n _ _) = f n sig_for_me (ClassOpSig n _ _ _) = f n sig_for_me (SpecSig n _ _) = f n - sig_for_me (InlineSig n _) = f n - sig_for_me (NoInlineSig n _) = f n + sig_for_me (InlineSig n _ _) = f n + sig_for_me (NoInlineSig n _ _) = f n sig_for_me (SpecInstSig _ _) = False sig_for_me (FixSig (FixitySig n _ _)) = f n @@ -265,11 +267,11 @@ isClassOpSig _ = False isPragSig :: Sig name -> Bool -- Identifies pragmas -isPragSig (SpecSig _ _ _) = True -isPragSig (InlineSig _ _) = True -isPragSig (NoInlineSig _ _) = True -isPragSig (SpecInstSig _ _) = True -isPragSig other = False +isPragSig (SpecSig _ _ _) = True +isPragSig (InlineSig _ _ _) = True +isPragSig (NoInlineSig _ _ _) = True +isPragSig (SpecInstSig _ _) = True +isPragSig other = False \end{code} \begin{code} @@ -291,15 +293,18 @@ ppr_sig (SpecSig var ty _) nest 4 (ppr ty <+> text "#-}") ] -ppr_sig (InlineSig var _) - = hsep [text "{-# INLINE", ppr var, text "#-}"] +ppr_sig (InlineSig var phase _) + = hsep [text "{-# INLINE", ppr_phase phase, ppr var, text "#-}"] -ppr_sig (NoInlineSig var _) - = hsep [text "{-# NOINLINE", ppr var, text "#-}"] +ppr_sig (NoInlineSig var phase _) + = hsep [text "{-# NOINLINE", ppr_phase phase, ppr var, text "#-}"] ppr_sig (SpecInstSig ty _) = hsep [text "{-# SPECIALIZE instance", ppr ty, text "#-}"] ppr_sig (FixSig fix_sig) = ppr fix_sig + +ppr_phase Nothing = empty +ppr_phase (Just n) = int n \end{code} diff --git a/ghc/compiler/hsSyn/HsCore.lhs b/ghc/compiler/hsSyn/HsCore.lhs index 120dcd3..07293c6 100644 --- a/ghc/compiler/hsSyn/HsCore.lhs +++ b/ghc/compiler/hsSyn/HsCore.lhs @@ -157,7 +157,7 @@ instance (Outputable name) => Outputable (IfaceSig name) where data HsIdInfo name = HsArity ArityInfo | HsStrictness HsStrictnessInfo - | HsUnfold InlinePragInfo (Maybe (UfExpr name)) + | HsUnfold InlinePragInfo (UfExpr name) | HsUpdate UpdateInfo | HsSpecialise (UfRuleBody name) | HsNoCafRefs diff --git a/ghc/compiler/main/CmdLineOpts.lhs b/ghc/compiler/main/CmdLineOpts.lhs index eafe458..3101d02 100644 --- a/ghc/compiler/main/CmdLineOpts.lhs +++ b/ghc/compiler/main/CmdLineOpts.lhs @@ -93,7 +93,6 @@ module CmdLineOpts ( opt_DoSemiTagging, opt_FoldrBuildOn, opt_LiberateCaseThreshold, - opt_NoPreInlining, opt_StgDoLetNoEscapes, opt_UnfoldCasms, opt_UsageSPOn, @@ -103,7 +102,6 @@ module CmdLineOpts ( opt_SimplDoLambdaEtaExpansion, opt_SimplCaseOfCase, opt_SimplCaseMerge, - opt_SimplLetToCase, opt_SimplPedanticBottoms, -- Unfolding control @@ -235,6 +233,8 @@ data StgToDo data SimplifierSwitch = MaxSimplifierIterations Int | SimplInlinePhase Int + | DontApplyRules + | SimplLetToCase \end{code} %************************************************************************ @@ -381,7 +381,6 @@ opt_DoEtaReduction = lookUp SLIT("-fdo-eta-reduction") opt_DoSemiTagging = lookUp SLIT("-fsemi-tagging") opt_FoldrBuildOn = lookUp SLIT("-ffoldr-build-on") opt_LiberateCaseThreshold = lookup_def_int "-fliberate-case-threshold" (10::Int) -opt_NoPreInlining = lookUp SLIT("-fno-pre-inlining") opt_StgDoLetNoEscapes = lookUp SLIT("-flet-no-escape") opt_UnfoldCasms = lookUp SLIT("-funfold-casms-in-hi-file") opt_UsageSPOn = lookUp SLIT("-fusagesp-on") @@ -421,7 +420,6 @@ opt_SimplDoEtaReduction = lookUp SLIT("-fdo-eta-reduction") opt_SimplDoLambdaEtaExpansion = lookUp SLIT("-fdo-lambda-eta-expansion") opt_SimplCaseOfCase = lookUp SLIT("-fcase-of-case") opt_SimplCaseMerge = lookUp SLIT("-fcase-merge") -opt_SimplLetToCase = lookUp SLIT("-flet-to-case") opt_SimplPedanticBottoms = lookUp SLIT("-fpedantic-bottoms") -- Unfolding control @@ -531,6 +529,8 @@ classifyOpts = sep argv [] [] -- accumulators... matchSimplSw opt = firstJust [ matchSwInt opt "-fmax-simplifier-iterations" MaxSimplifierIterations , matchSwInt opt "-finline-phase" SimplInlinePhase + , matchSwBool opt "-fno-rules" DontApplyRules + , matchSwBool opt "-flet-to-case" SimplLetToCase ] matchSwBool :: String -> String -> a -> Maybe a @@ -563,10 +563,12 @@ instance Ord SimplifierSwitch where tagOf_SimplSwitch (SimplInlinePhase _) = ILIT(1) tagOf_SimplSwitch (MaxSimplifierIterations _) = ILIT(2) +tagOf_SimplSwitch DontApplyRules = ILIT(3) +tagOf_SimplSwitch SimplLetToCase = ILIT(4) -- If you add anything here, be sure to change lAST_SIMPL_SWITCH_TAG, too! -lAST_SIMPL_SWITCH_TAG = 2 +lAST_SIMPL_SWITCH_TAG = 4 \end{code} %************************************************************************ diff --git a/ghc/compiler/main/Main.lhs b/ghc/compiler/main/Main.lhs index 1712dca..432a2f2 100644 --- a/ghc/compiler/main/Main.lhs +++ b/ghc/compiler/main/Main.lhs @@ -311,7 +311,8 @@ ppSourceStats short (HsModule name version exports imports decls src_loc) sig_info (Sig _ _ _) = (1,0,0,0) sig_info (ClassOpSig _ _ _ _) = (0,1,0,0) sig_info (SpecSig _ _ _) = (0,0,1,0) - sig_info (InlineSig _ _) = (0,0,0,1) + sig_info (InlineSig _ _ _) = (0,0,0,1) + sig_info (NoInlineSig _ _ _) = (0,0,0,1) sig_info _ = (0,0,0,0) import_info (ImportDecl _ _ qual as spec _) diff --git a/ghc/compiler/main/MkIface.lhs b/ghc/compiler/main/MkIface.lhs index a407ab7..9995ca3 100644 --- a/ghc/compiler/main/MkIface.lhs +++ b/ghc/compiler/main/MkIface.lhs @@ -31,7 +31,8 @@ import IdInfo ( IdInfo, StrictnessInfo(..), ArityInfo, InlinePragInfo(..), inli arityInfo, ppArityInfo, arityLowerBound, strictnessInfo, ppStrictnessInfo, isBottomingStrictness, cafInfo, ppCafInfo, specInfo, - cprInfo, ppCprInfo, + cprInfo, ppCprInfo, pprInlinePragInfo, + occInfo, OccInfo(..), workerExists, workerInfo, ppWorkerInfo ) import CoreSyn ( CoreExpr, CoreBind, Bind(..), rulesRules, rulesRhsFreeVars ) @@ -332,19 +333,25 @@ ifaceId get_idinfo needed_ids is_rec id rhs Just work_id = work_info + ------------ Occ info -------------- + loop_breaker = case occInfo core_idinfo of + IAmALoopBreaker -> True + other -> False + ------------ Unfolding -------------- inline_pragma = inlinePragInfo core_idinfo dont_inline = case inline_pragma of - IMustNotBeINLINEd -> True - IAmALoopBreaker -> True - other -> False + IMustNotBeINLINEd False Nothing -> True -- Unconditional NOINLINE + other -> False + - unfold_pretty | show_unfold = ptext SLIT("__U") <+> pprIfaceUnfolding rhs + unfold_pretty | show_unfold = ptext SLIT("__U") <> pprInlinePragInfo inline_pragma <+> pprIfaceUnfolding rhs | otherwise = empty show_unfold = not has_worker && -- Not unnecessary not bottoming_fn && -- Not necessary not dont_inline && + not loop_breaker && rhs_is_small && -- Small enough okToUnfoldInHiFile rhs -- No casms etc @@ -374,10 +381,11 @@ ifaceId get_idinfo needed_ids is_rec id rhs ------------ Sanity checking -------------- -- The arity of a wrapper function should match its strictness, -- or else an importing module will get very confused indeed. + -- [later: actually all that is necessary is for strictness to exceed arity] arity_matches_strictness = not has_worker || case strict_info of - StrictnessInfo ds _ -> length ds == arityLowerBound arity_info + StrictnessInfo ds _ -> length ds >= arityLowerBound arity_info other -> True interestingId id = isId id && isLocallyDefined id && diff --git a/ghc/compiler/parser/Parser.y b/ghc/compiler/parser/Parser.y index 239e64b..cc76e5d 100644 --- a/ghc/compiler/parser/Parser.y +++ b/ghc/compiler/parser/Parser.y @@ -1,6 +1,6 @@ {- ----------------------------------------------------------------------------- -$Id: Parser.y,v 1.14 1999/09/01 14:08:19 sof Exp $ +$Id: Parser.y,v 1.15 1999/11/01 17:10:23 simonpj Exp $ Haskell grammar. @@ -367,8 +367,8 @@ decl :: { RdrBinding } : signdecl { $1 } | fixdecl { $1 } | valdef { RdrValBinding $1 } - | '{-# INLINE' srcloc qvar '#-}' { RdrSig (InlineSig $3 $2) } - | '{-# NOINLINE' srcloc qvar '#-}' { RdrSig (NoInlineSig $3 $2) } + | '{-# INLINE' srcloc opt_phase qvar '#-}' { RdrSig (InlineSig $4 $3 $2) } + | '{-# NOINLINE' srcloc opt_phase qvar '#-}' { RdrSig (NoInlineSig $4 $3 $2) } | '{-# SPECIALISE' srcloc qvar '::' sigtypes '#-}' { foldr1 RdrAndBindings (map (\t -> RdrSig (SpecSig $3 t $2)) $5) } @@ -376,6 +376,10 @@ decl :: { RdrBinding } { RdrSig (SpecInstSig $4 $2) } | '{-# RULES' rules '#-}' { $2 } +opt_phase :: { Maybe Int } + : INTEGER { Just (fromInteger $1) } + | {- empty -} { Nothing } + sigtypes :: { [RdrNameHsType] } : sigtype { [ $1 ] } | sigtypes ',' sigtype { $3 : $1 } @@ -443,11 +447,11 @@ rule_forall :: { [RdrNameRuleBndr] } rule_var_list :: { [RdrNameRuleBndr] } : rule_var { [$1] } - | rule_var ',' rule_var_list { $1 : $3 } + | rule_var rule_var_list { $1 : $2 } rule_var :: { RdrNameRuleBndr } : varid { RuleBndr $1 } - | varid '::' ctype { RuleBndrSig $1 $3 } + | '(' varid '::' ctype ')' { RuleBndrSig $2 $4 } ----------------------------------------------------------------------------- -- Foreign import/export diff --git a/ghc/compiler/rename/ParseIface.y b/ghc/compiler/rename/ParseIface.y index 20cdf9f..df52ddd 100644 --- a/ghc/compiler/rename/ParseIface.y +++ b/ghc/compiler/rename/ParseIface.y @@ -14,7 +14,7 @@ import BasicTypes ( Fixity(..), FixityDirection(..), import CostCentre ( CostCentre(..), IsCafCC(..), IsDupdCC(..) ) import HsPragmas ( noDataPragmas, noClassPragmas ) import Type ( Kind, mkArrowKind, boxedTypeKind, openTypeKind, UsageAnn(..) ) -import IdInfo ( ArityInfo, exactArity, CprInfo(..) ) +import IdInfo ( ArityInfo, exactArity, CprInfo(..), InlinePragInfo(..) ) import Lex import RnMonad ( ImportVersion, LocalVersion, ParsedIface(..), WhatsImported(..), @@ -602,13 +602,17 @@ id_info :: { [HsIdInfo RdrName] } id_info_item :: { HsIdInfo RdrName } : '__A' INTEGER { HsArity (exactArity (fromInteger $2)) } - | '__U' core_expr { HsUnfold $1 (Just $2) } - | '__U' { HsUnfold $1 Nothing } + | '__U' inline_prag core_expr { HsUnfold $2 $3 } | '__M' { HsCprInfo $1 } | '__S' { HsStrictness (HsStrictnessInfo $1) } | '__C' { HsNoCafRefs } | '__P' qvar_name { HsWorker $2 } +inline_prag :: { InlinePragInfo } + : {- empty -} { NoInlinePragInfo } + | '[' INTEGER ']' { IMustNotBeINLINEd True (Just (fromInteger $2)) } -- INLINE n + | '[' '!' INTEGER ']' { IMustNotBeINLINEd False (Just (fromInteger $3)) } -- NOINLINE n + ------------------------------------------------------- core_expr :: { UfExpr RdrName } core_expr : '\\' core_bndrs '->' core_expr { foldr UfLam $4 $2 } diff --git a/ghc/compiler/rename/RnBinds.lhs b/ghc/compiler/rename/RnBinds.lhs index c29ecd9..ca0f820 100644 --- a/ghc/compiler/rename/RnBinds.lhs +++ b/ghc/compiler/rename/RnBinds.lhs @@ -536,30 +536,30 @@ renameSig lookup_occ_nm (SpecSig v ty src_loc) rnHsSigType (quotes (ppr v)) ty `thenRn` \ (new_ty,fvs) -> returnRn (SpecSig new_v new_ty src_loc, fvs `addOneFV` new_v) -renameSig lookup_occ_nm (InlineSig v src_loc) +renameSig lookup_occ_nm (FixSig (FixitySig v fix src_loc)) = pushSrcLocRn src_loc $ lookup_occ_nm v `thenRn` \ new_v -> - returnRn (InlineSig new_v src_loc, unitFV new_v) + returnRn (FixSig (FixitySig new_v fix src_loc), unitFV new_v) -renameSig lookup_occ_nm (FixSig (FixitySig v fix src_loc)) +renameSig lookup_occ_nm (InlineSig v p src_loc) = pushSrcLocRn src_loc $ lookup_occ_nm v `thenRn` \ new_v -> - returnRn (FixSig (FixitySig new_v fix src_loc), unitFV new_v) + returnRn (InlineSig new_v p src_loc, unitFV new_v) -renameSig lookup_occ_nm (NoInlineSig v src_loc) +renameSig lookup_occ_nm (NoInlineSig v p src_loc) = pushSrcLocRn src_loc $ lookup_occ_nm v `thenRn` \ new_v -> - returnRn (NoInlineSig new_v src_loc, unitFV new_v) + returnRn (NoInlineSig new_v p src_loc, unitFV new_v) \end{code} Checking for distinct signatures; oh, so boring \begin{code} cmp_sig :: RenamedSig -> RenamedSig -> Ordering -cmp_sig (Sig n1 _ _) (Sig n2 _ _) = n1 `compare` n2 -cmp_sig (InlineSig n1 _) (InlineSig n2 _) = n1 `compare` n2 -cmp_sig (NoInlineSig n1 _) (NoInlineSig n2 _) = n1 `compare` n2 -cmp_sig (SpecInstSig ty1 _) (SpecInstSig ty2 _) = cmpHsType compare ty1 ty2 +cmp_sig (Sig n1 _ _) (Sig n2 _ _) = n1 `compare` n2 +cmp_sig (InlineSig n1 _ _) (InlineSig n2 _ _) = n1 `compare` n2 +cmp_sig (NoInlineSig n1 _ _) (NoInlineSig n2 _ _) = n1 `compare` n2 +cmp_sig (SpecInstSig ty1 _) (SpecInstSig ty2 _) = cmpHsType compare ty1 ty2 cmp_sig (SpecSig n1 ty1 _) (SpecSig n2 ty2 _) = -- may have many specialisations for one value; -- but not ones that are exactly the same... @@ -571,8 +571,8 @@ cmp_sig other_1 other_2 -- Tags *must* be different sig_tag (Sig n1 _ _) = (ILIT(1) :: FAST_INT) sig_tag (SpecSig n1 _ _) = ILIT(2) -sig_tag (InlineSig n1 _) = ILIT(3) -sig_tag (NoInlineSig n1 _) = ILIT(4) +sig_tag (InlineSig n1 _ _) = ILIT(3) +sig_tag (NoInlineSig n1 _ _) = ILIT(4) sig_tag (SpecInstSig _ _) = ILIT(5) sig_tag (FixSig _) = ILIT(6) sig_tag _ = panic# "tag(RnBinds)" @@ -603,8 +603,8 @@ unknownSigErr sig sig_doc (Sig _ _ loc) = (SLIT("type signature"),loc) sig_doc (ClassOpSig _ _ _ loc) = (SLIT("class-method type signature"), loc) sig_doc (SpecSig _ _ loc) = (SLIT("SPECIALISE pragma"),loc) -sig_doc (InlineSig _ loc) = (SLIT("INLINE pragma"),loc) -sig_doc (NoInlineSig _ loc) = (SLIT("NOINLINE pragma"),loc) +sig_doc (InlineSig _ _ loc) = (SLIT("INLINE pragma"),loc) +sig_doc (NoInlineSig _ _ loc) = (SLIT("NOINLINE pragma"),loc) sig_doc (SpecInstSig _ loc) = (SLIT("SPECIALISE instance pragma"),loc) sig_doc (FixSig (FixitySig _ _ loc)) = (SLIT("fixity declaration"), loc) diff --git a/ghc/compiler/rename/RnSource.lhs b/ghc/compiler/rename/RnSource.lhs index ecc7015..61dd26b 100644 --- a/ghc/compiler/rename/RnSource.lhs +++ b/ghc/compiler/rename/RnSource.lhs @@ -617,7 +617,7 @@ rnForAll doc forall_tyvars ctxt ty --------------------------------------- rnHsType doc ty@(HsForAllTy _ _ inner_ty) - = addErrRn (unexpectedForAllTy ty) `thenRn_` + = addWarnRn (unexpectedForAllTy ty) `thenRn_` rnHsPolyType doc ty rnHsType doc (MonoTyVar tyvar) @@ -715,9 +715,8 @@ rnIdInfo (HsWorker worker) = lookupOccRn worker `thenRn` \ worker' -> returnRn (HsWorker worker', unitFV worker') -rnIdInfo (HsUnfold inline (Just expr)) = rnCoreExpr expr `thenRn` \ (expr', fvs) -> - returnRn (HsUnfold inline (Just expr'), fvs) -rnIdInfo (HsUnfold inline Nothing) = returnRn (HsUnfold inline Nothing, emptyFVs) +rnIdInfo (HsUnfold inline expr) = rnCoreExpr expr `thenRn` \ (expr', fvs) -> + returnRn (HsUnfold inline expr', fvs) rnIdInfo (HsArity arity) = returnRn (HsArity arity, emptyFVs) rnIdInfo (HsUpdate update) = returnRn (HsUpdate update, emptyFVs) rnIdInfo (HsNoCafRefs) = returnRn (HsNoCafRefs, emptyFVs) diff --git a/ghc/compiler/simplCore/BinderInfo.lhs b/ghc/compiler/simplCore/BinderInfo.lhs index f125975..1623bcd 100644 --- a/ghc/compiler/simplCore/BinderInfo.lhs +++ b/ghc/compiler/simplCore/BinderInfo.lhs @@ -9,7 +9,7 @@ \begin{code} module BinderInfo ( - BinderInfo(..), + BinderInfo, addBinderInfo, orBinderInfo, @@ -19,12 +19,12 @@ module BinderInfo ( getBinderInfoArity, setBinderInfoArityToZero, - occInfoToInlinePrag + binderInfoToOccInfo ) where #include "HsVersions.h" -import IdInfo ( InlinePragInfo(..), OccInfo(..) ) +import IdInfo ( OccInfo(..), InsideLam, OneBranch, insideLam, notInsideLam, oneBranch ) import GlaExts ( Int(..), (+#) ) import Outputable \end{code} @@ -46,10 +46,10 @@ data BinderInfo !Int -- number of arguments on stack when called; this is a minimum guarantee - | OneOcc -- Just one occurrence (or one each in + | SingleOcc -- Just one occurrence (or one each in -- mutually-exclusive case alts). - !OccInfo + !InsideLam !InsideSCC @@ -57,7 +57,7 @@ data BinderInfo -- in which it occurs -- Note that we only worry about the case-alt counts - -- if the OneOcc is substitutable -- that's the only + -- if the SingleOcc is substitutable -- that's the only -- time we *use* the info; we could be more clever for -- other cases if we really had to. (WDP/PS) @@ -79,10 +79,10 @@ noBinderInfo = ManyOcc 0 -- A non-committal value \end{code} \begin{code} -occInfoToInlinePrag :: BinderInfo -> InlinePragInfo -occInfoToInlinePrag DeadCode = IAmDead -occInfoToInlinePrag (OneOcc occ_info NotInsideSCC n_alts _) = ICanSafelyBeINLINEd occ_info (n_alts==1) -occInfoToInlinePrag other = NoInlinePragInfo +binderInfoToOccInfo :: BinderInfo -> OccInfo +binderInfoToOccInfo DeadCode = IAmDead +binderInfoToOccInfo (SingleOcc in_lam NotInsideSCC n_alts _) = OneOcc in_lam (n_alts==1) +binderInfoToOccInfo other = NoOccInfo \end{code} @@ -94,18 +94,18 @@ deadOccurrence :: BinderInfo deadOccurrence = DeadCode funOccurrence :: Int -> BinderInfo -funOccurrence = OneOcc NotInsideLam NotInsideSCC 1 +funOccurrence = SingleOcc notInsideLam NotInsideSCC 1 markMany, markInsideLam, markInsideSCC :: BinderInfo -> BinderInfo -markMany (OneOcc _ _ _ ar) = ManyOcc ar +markMany (SingleOcc _ _ _ ar) = ManyOcc ar markMany (ManyOcc ar) = ManyOcc ar markMany DeadCode = panic "markMany" -markInsideLam (OneOcc _ in_scc n_alts ar) = OneOcc InsideLam in_scc n_alts ar +markInsideLam (SingleOcc _ in_scc n_alts ar) = SingleOcc insideLam in_scc n_alts ar markInsideLam other = other -markInsideSCC (OneOcc dup_danger _ n_alts ar) = OneOcc dup_danger InsideSCC n_alts ar +markInsideSCC (SingleOcc dup_danger _ n_alts ar) = SingleOcc dup_danger InsideSCC n_alts ar markInsideSCC other = other addBinderInfo, orBinderInfo :: BinderInfo -> BinderInfo -> BinderInfo @@ -120,22 +120,20 @@ addBinderInfo info1 info2 orBinderInfo DeadCode info2 = info2 orBinderInfo info1 DeadCode = info1 -orBinderInfo (OneOcc dup1 scc1 n_alts1 ar_1) - (OneOcc dup2 scc2 n_alts2 ar_2) +orBinderInfo (SingleOcc dup1 scc1 n_alts1 ar_1) + (SingleOcc dup2 scc2 n_alts2 ar_2) = let scc = or_sccs scc1 scc2 dup = or_dups dup1 dup2 alts = n_alts1 + n_alts2 ar = min ar_1 ar_2 in - OneOcc dup scc alts ar + SingleOcc dup scc alts ar orBinderInfo info1 info2 = ManyOcc (min (getBinderInfoArity info1) (getBinderInfoArity info2)) -or_dups InsideLam _ = InsideLam -or_dups _ InsideLam = InsideLam -or_dups _ _ = NotInsideLam +or_dups in_lam1 in_lam2 = in_lam1 || in_lam2 or_sccs InsideSCC _ = InsideSCC or_sccs _ InsideSCC = InsideSCC @@ -144,20 +142,20 @@ or_sccs _ _ = NotInsideSCC setBinderInfoArityToZero :: BinderInfo -> BinderInfo setBinderInfoArityToZero DeadCode = DeadCode setBinderInfoArityToZero (ManyOcc _) = ManyOcc 0 -setBinderInfoArityToZero (OneOcc dd sc i _) = OneOcc dd sc i 0 +setBinderInfoArityToZero (SingleOcc dd sc i _) = SingleOcc dd sc i 0 \end{code} \begin{code} getBinderInfoArity (DeadCode) = 0 getBinderInfoArity (ManyOcc i) = i -getBinderInfoArity (OneOcc _ _ _ i) = i +getBinderInfoArity (SingleOcc _ _ _ i) = i \end{code} \begin{code} instance Outputable BinderInfo where ppr DeadCode = ptext SLIT("Dead") ppr (ManyOcc ar) = hcat [ ptext SLIT("Many-"), int ar ] - ppr (OneOcc dup_danger in_scc n_alts ar) + ppr (SingleOcc dup_danger in_scc n_alts ar) = hcat [ ptext SLIT("One-"), ppr dup_danger, char '-', pp_scc in_scc, char '-', int n_alts, char '-', int ar ] @@ -165,4 +163,3 @@ instance Outputable BinderInfo where pp_scc InsideSCC = ptext SLIT("*SCC*") pp_scc NotInsideSCC = ptext SLIT("noscc") \end{code} - diff --git a/ghc/compiler/simplCore/CSE.lhs b/ghc/compiler/simplCore/CSE.lhs index ee12ab9..d424653 100644 --- a/ghc/compiler/simplCore/CSE.lhs +++ b/ghc/compiler/simplCore/CSE.lhs @@ -25,7 +25,7 @@ import UniqFM Simple common sub-expression - + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ When we see x1 = C a b x2 = C x1 b @@ -37,12 +37,12 @@ When we then see y1 = C a b y2 = C y1 b we replace the C a b with x1. But then we *dont* want to -add x1 -> y to the mapping. Rather, we want the reverse, y -> x1 +add x1 -> y1 to the mapping. Rather, we want the reverse, y1 -> x1 so that a subsequent binding - z = C y b + y2 = C y1 b will get transformed to C x1 b, and then to x2. -So we carry an extra var->var mapping which we apply before looking up in the +So we carry an extra var->var mapping which we apply *before* looking up in the reverse mapping. @@ -56,7 +56,33 @@ For example, consider h = \x -> x+x in ... -Here we must *not* do CSE on the x+x! +Here we must *not* do CSE on the inner x+x! + + +Another important wrinkle +~~~~~~~~~~~~~~~~~~~~~~~~~ +Consider + + f = \x -> case x of wild { + (a:as) -> case a of wild1 { + (p,q) -> ...(wild1:as)... + +Here, (wild1:as) is morally the same as (a:as) and hence equal to wild. +But that's not quite obvious. In general we want to keep it as (wild1:as), +but for CSE purpose that's a bad idea. + +So we add the binding (wild1 -> a) to the extra var->var mapping. + + +Yet another wrinkle +~~~~~~~~~~~~~~~~~~~ +Consider + case (h x) of y -> ...(h x)... + +We'd like to replace (h x) in the alternative, by y. But because of +the preceding "Another important wrinkle", we only want to add the mapping + scrutinee -> case binder +to the CSE mapping if the scrutinee is a non-trivial expression. %************************************************************************ @@ -119,18 +145,28 @@ cseExpr env (Lam b e) = Lam b (cseExpr env e) cseExpr env (Let bind e) = let (env1, bind') = cseBind env bind in Let bind' (cseExpr env1 e) cseExpr env (Type t) = Type t -cseExpr env (Case scrut bndr alts) = Case (tryForCSE env scrut) bndr (cseAlts env bndr alts) +cseExpr env (Case scrut bndr alts) = Case scrut' bndr (cseAlts env scrut' bndr alts) + where + scrut' = tryForCSE env scrut -cseAlts env bndr alts +cseAlts env new_scrut bndr alts = map cse_alt alts where + (con_target, alt_env) + = case new_scrut of + Var v -> (v, extendSubst env bndr v) -- See "another important wrinkle" + -- map: bndr -> v + + other -> (bndr, extendCSEnv env bndr new_scrut) -- See "yet another wrinkle" + -- map: new_scrut -> bndr + arg_tys = case splitTyConApp_maybe (idType bndr) of Just (_, arg_tys) -> map Type arg_tys other -> pprPanic "cseAlts" (ppr bndr) cse_alt (con, args, rhs) - | null args || not (isBoxedDataCon con) = (con, args, cseExpr env rhs) + | null args || not (isBoxedDataCon con) = (con, args, cseExpr alt_env rhs) -- Don't try CSE if there are no args; it just increases the number -- of live vars. E.g. -- case x of { True -> ....True.... } @@ -138,7 +174,7 @@ cseAlts env bndr alts -- Hence the 'null args', which also deal with literals and DEFAULT -- And we can't CSE on unboxed tuples | otherwise - = (con, args, cseExpr (extendCSEnv env bndr (Con con (arg_tys ++ (map varToCoreExpr args)))) rhs) + = (con, args, cseExpr (extendCSEnv alt_env con_target (Con con (arg_tys ++ (map varToCoreExpr args)))) rhs) \end{code} diff --git a/ghc/compiler/simplCore/FloatOut.lhs b/ghc/compiler/simplCore/FloatOut.lhs index d41f3d9..83e5d5a 100644 --- a/ghc/compiler/simplCore/FloatOut.lhs +++ b/ghc/compiler/simplCore/FloatOut.lhs @@ -15,7 +15,7 @@ import CoreSyn import CmdLineOpts ( opt_D_verbose_core2core, opt_D_dump_simpl_stats ) import ErrUtils ( dumpIfSet ) import CostCentre ( dupifyCC, CostCentre ) -import Id ( Id ) +import Id ( Id, idType ) import Const ( isWHNFCon ) import VarEnv import CoreLint ( beginPass, endPass ) @@ -24,6 +24,7 @@ import SetLevels ( setLevels, Level(..), tOP_LEVEL, ltMajLvl, ltLvl, isTopLvl ) import BasicTypes ( Unused ) +import Type ( isUnLiftedType ) import Var ( TyVar ) import UniqSupply ( UniqSupply ) import List ( partition ) @@ -261,6 +262,16 @@ floatExpr env lvl (Note note@(SCC cc) expr) -- Note: Nested SCC's are preserved for the benefit of -- cost centre stack profiling (Durham) +-- At one time I tried the effect of not float anything out of an InlineMe, +-- but it sometimes works badly. For example, consider PrelArr.done. It +-- has the form __inline (\d. e) +-- where e doesn't mention d. If we float this to +-- __inline (let x = e in \d. x) +-- things are bad. The inliner doesn't even inline it because it doesn't look +-- like a head-normal form. So it seems a lesser evil to let things float. +-- In SetLevels we do set the context to (Level 0 0) when we get to an InlineMe +-- which discourages floating out. + floatExpr env lvl (Note note expr) -- Other than SCCs = case (floatExpr env lvl expr) of { (fs, floating_defns, expr') -> (fs, floating_defns, Note note expr') } @@ -359,10 +370,16 @@ partitionByMajorLevel, partitionByLevel partitionByMajorLevel ctxt_lvl defns = partition float_further defns where - float_further (my_lvl, _) = my_lvl `lt_major` ctxt_lvl - -my_lvl `lt_major` ctxt_lvl = my_lvl `ltMajLvl` ctxt_lvl || - isTopLvl my_lvl + -- Float it if we escape a value lambda, + -- or if we get to the top level + float_further (my_lvl, bind) = my_lvl `ltMajLvl` ctxt_lvl || isTopLvl my_lvl + -- The isTopLvl part says that if we can get to the top level, say "yes" anyway + -- This means that + -- x = f e + -- transforms to + -- lvl = e + -- x = f lvl + -- which is as it should be partitionByLevel ctxt_lvl defns = partition float_further defns diff --git a/ghc/compiler/simplCore/OccurAnal.lhs b/ghc/compiler/simplCore/OccurAnal.lhs index 01e5652..e4fb5b8 100644 --- a/ghc/compiler/simplCore/OccurAnal.lhs +++ b/ghc/compiler/simplCore/OccurAnal.lhs @@ -26,12 +26,12 @@ import CoreFVs ( idRuleVars ) import CoreUtils ( exprIsTrivial ) import Const ( Con(..), Literal(..) ) import Id ( isSpecPragmaId, isOneShotLambda, setOneShotLambda, - getInlinePragma, setInlinePragma, + getIdOccInfo, setIdOccInfo, isExportedId, modifyIdInfo, idInfo, getIdSpecialisation, idType, idUnique, Id ) -import IdInfo ( InlinePragInfo(..), OccInfo(..), copyIdInfo ) +import IdInfo ( OccInfo(..), insideLam, copyIdInfo ) import VarSet import VarEnv @@ -416,7 +416,7 @@ reOrderRec env (AcyclicSCC (bind, _, _)) = [bind] -- Common case of simple self-recursion reOrderRec env (CyclicSCC [bind]) - = [(setInlinePragma tagged_bndr IAmALoopBreaker, rhs)] + = [(setIdOccInfo tagged_bndr IAmALoopBreaker, rhs)] where ((tagged_bndr, rhs), _, _) = bind @@ -425,7 +425,7 @@ reOrderRec env (CyclicSCC (bind : binds)) -- do SCC analysis on the rest, and recursively sort them out concat (map (reOrderRec env) (stronglyConnCompR unchosen)) ++ - [(setInlinePragma tagged_bndr IAmALoopBreaker, rhs)] + [(setIdOccInfo tagged_bndr IAmALoopBreaker, rhs)] where (chosen_pair, unchosen) = choose_loop_breaker bind (score bind) [] binds @@ -458,10 +458,9 @@ reOrderRec env (CyclicSCC (bind : binds)) inlineCandidate :: Id -> CoreExpr -> Bool inlineCandidate id (Note InlineMe _) = True - inlineCandidate id rhs = case getInlinePragma id of - IMustBeINLINEd -> True - ICanSafelyBeINLINEd _ _ -> True - other -> False + inlineCandidate id rhs = case getIdOccInfo id of + OneOcc _ _ -> True + other -> False -- Real example (the Enum Ordering instance from PrelBase): -- rec f = \ x -> case d of (p,q,r) -> p x @@ -646,13 +645,25 @@ occAnal env (Case scrut bndr alts) case occAnal (zapCtxt env) scrut of { (scrut_usage, scrut') -> let alts_usage = foldr1 combineAltsUsageDetails alts_usage_s - (alts_usage1, tagged_bndr) = tagBinder alts_usage bndr + alts_usage' = addCaseBndrUsage alts_usage + (alts_usage1, tagged_bndr) = tagBinder alts_usage' bndr total_usage = scrut_usage `combineUsageDetails` alts_usage1 in total_usage `seq` (total_usage, Case scrut' tagged_bndr alts') }} where alt_env = env `addNewCand` bndr + -- The case binder gets a usage of either "many" or "dead", never "one". + -- Reason: we like to inline single occurrences, to eliminate a binding, + -- but inlining a case binder *doesn't* eliminate a binding. + -- We *don't* want to transform + -- case x of w { (p,q) -> f w } + -- into + -- case x of w { (p,q) -> f (p,q) } + addCaseBndrUsage usage = case lookupVarEnv usage bndr of + Nothing -> usage + Just occ -> extendVarEnv usage bndr (markMany occ) + occAnal env (Let bind body) = case occAnal new_env body of { (body_usage, body') -> case occAnalBind env bind body_usage of { (final_usage, new_binds) -> @@ -828,7 +839,7 @@ tagBinders :: UsageDetails -- Of scope tagBinders usage binders = let usage' = usage `delVarEnvList` binders - uss = map (setBinderPrag usage) binders + uss = map (setBinderOcc usage) binders in usage' `seq` (usage', uss) @@ -840,45 +851,27 @@ tagBinder :: UsageDetails -- Of scope tagBinder usage binder = let usage' = usage `delVarEnv` binder - binder' = setBinderPrag usage binder + binder' = setBinderOcc usage binder in usage' `seq` (usage', binder') -setBinderPrag :: UsageDetails -> CoreBndr -> CoreBndr -setBinderPrag usage bndr - | isTyVar bndr - = bndr - - | otherwise - = case old_prag of - NoInlinePragInfo -> new_bndr - IAmDead -> new_bndr -- The next three are annotations - ICanSafelyBeINLINEd _ _ -> new_bndr -- from the previous iteration of - IAmALoopBreaker -> new_bndr -- the occurrence analyser - - other | its_now_dead -> new_bndr -- Overwrite the others iff it's now dead - | otherwise -> bndr - +setBinderOcc :: UsageDetails -> CoreBndr -> CoreBndr +setBinderOcc usage bndr + | isTyVar bndr = bndr + | isExportedId bndr + = -- Don't use local usage info for visible-elsewhere things + -- BUT *do* erase any IAmALoopBreaker annotation, because we're + -- about to re-generate it and it shouldn't be "sticky" + case getIdOccInfo bndr of + NoOccInfo -> bndr + other -> setIdOccInfo bndr NoOccInfo + + | otherwise = setIdOccInfo bndr occ_info where - old_prag = getInlinePragma bndr - new_bndr = setInlinePragma bndr new_prag - - its_now_dead = case new_prag of - IAmDead -> True - other -> False - - new_prag = occInfoToInlinePrag occ_info - - occ_info - | isExportedId bndr = noBinderInfo - -- Don't use local usage info for visible-elsewhere things - -- But NB that we do set NoInlinePragma for exported things - -- thereby nuking any IAmALoopBreaker from a previous pass. - - | otherwise = case lookupVarEnv usage bndr of - Nothing -> deadOccurrence - Just info -> info + occ_info = case lookupVarEnv usage bndr of + Nothing -> IAmDead + Just info -> binderInfoToOccInfo info markBinderInsideLambda :: CoreBndr -> CoreBndr markBinderInsideLambda bndr @@ -886,10 +879,9 @@ markBinderInsideLambda bndr = bndr | otherwise - = case getInlinePragma bndr of - ICanSafelyBeINLINEd not_in_lam nalts - -> bndr `setInlinePragma` ICanSafelyBeINLINEd InsideLam nalts - other -> bndr + = case getIdOccInfo bndr of + OneOcc _ once -> bndr `setIdOccInfo` OneOcc insideLam once + other -> bndr funOccZero = funOccurrence 0 \end{code} diff --git a/ghc/compiler/simplCore/SetLevels.lhs b/ghc/compiler/simplCore/SetLevels.lhs index fb552e4..2ff4754 100644 --- a/ghc/compiler/simplCore/SetLevels.lhs +++ b/ghc/compiler/simplCore/SetLevels.lhs @@ -22,7 +22,12 @@ We do *not* clone top-level bindings, because some of them must not change, but we *do* clone bindings that are heading for the top level - +* In the expression + case x of wild { p -> ...wild... } + we substitute x for wild in the RHS of the case alternatives: + case x of wild { p -> ...x... } + This means that a sub-expression involving x is not "trapped" inside the RHS. + And it's not inconvenient because we already have a substitution. \begin{code} module SetLevels ( @@ -39,13 +44,17 @@ import CoreSyn import CoreUtils ( coreExprType, exprIsTrivial, exprIsBottom ) import CoreFVs -- all of it -import Id ( Id, idType, mkSysLocal, isOneShotLambda, modifyIdInfo ) -import IdInfo ( specInfo, setSpecInfo ) -import Var ( IdOrTyVar, Var, setVarUnique ) +import Id ( Id, idType, mkSysLocal, isOneShotLambda, modifyIdInfo, + getIdSpecialisation, getIdWorkerInfo + ) +import IdInfo ( workerExists ) +import Var ( IdOrTyVar, Var, TyVar, setVarUnique ) import VarEnv import Subst import VarSet -import Type ( isUnLiftedType, mkTyVarTys, mkForAllTys, Type ) +import Name ( getOccName ) +import OccName ( occNameUserString ) +import Type ( isUnLiftedType, mkTyVarTy, mkForAllTys, Type ) import BasicTypes ( TopLevelFlag(..) ) import VarSet import VarEnv @@ -53,8 +62,7 @@ import UniqSupply import Maybes ( maybeToBool ) import Util ( zipWithEqual, zipEqual ) import Outputable - -isLeakFreeType x y = False -- safe option; ToDo +import List ( nub ) \end{code} %************************************************************************ @@ -64,11 +72,9 @@ isLeakFreeType x y = False -- safe option; ToDo %************************************************************************ \begin{code} -data Level - = Top -- Means *really* the top level; short for (Level 0 0). - | Level Int -- Level number of enclosing lambdas - Int -- Number of big-lambda and/or case expressions between - -- here and the nearest enclosing lambda +data Level = Level Int -- Level number of enclosing lambdas + Int -- Number of big-lambda and/or case expressions between + -- here and the nearest enclosing lambda \end{code} The {\em level number} on a (type-)lambda-bound variable is the @@ -87,68 +93,44 @@ a_0 = let b_? = ... in x_1 = ... b ... in ... \end{verbatim} -Level 0 0 will make something get floated to a top-level "equals", -@Top@ makes it go right to the top. - The main function @lvlExpr@ carries a ``context level'' (@ctxt_lvl@). That's meant to be the level number of the enclosing binder in the final (floated) program. If the level number of a sub-expression is less than that of the context, then it might be worth let-binding the sub-expression so that it will indeed float. This context level starts -at @Level 0 0@; it is never @Top@. +at @Level 0 0@. \begin{code} type LevelledExpr = TaggedExpr Level type LevelledArg = TaggedArg Level type LevelledBind = TaggedBind Level -tOP_LEVEL = Top +tOP_LEVEL = Level 0 0 incMajorLvl :: Level -> Level -incMajorLvl Top = Level 1 0 incMajorLvl (Level major minor) = Level (major+1) 0 incMinorLvl :: Level -> Level -incMinorLvl Top = Level 0 1 incMinorLvl (Level major minor) = Level major (minor+1) -unTopify :: Type -> Level -> Level -unTopify ty lvl - | isUnLiftedType ty = case lvl of - Top -> Level 0 0 -- Unboxed floats can't go right - other -> lvl -- to the top - | otherwise = lvl - maxLvl :: Level -> Level -> Level -maxLvl Top l2 = l2 -maxLvl l1 Top = l1 maxLvl l1@(Level maj1 min1) l2@(Level maj2 min2) | (maj1 > maj2) || (maj1 == maj2 && min1 > min2) = l1 | otherwise = l2 ltLvl :: Level -> Level -> Bool -ltLvl l1 Top = False -ltLvl Top (Level _ _) = True ltLvl (Level maj1 min1) (Level maj2 min2) = (maj1 < maj2) || (maj1 == maj2 && min1 < min2) ltMajLvl :: Level -> Level -> Bool -- Tells if one level belongs to a difft *lambda* level to another -ltMajLvl l1 Top = False -ltMajLvl Top (Level 0 _) = False -ltMajLvl Top (Level _ _) = True ltMajLvl (Level maj1 _) (Level maj2 _) = maj1 < maj2 isTopLvl :: Level -> Bool -isTopLvl Top = True -isTopLvl other = False - -isTopMajLvl :: Level -> Bool -- Tells if it's the top *lambda* level -isTopMajLvl Top = True -isTopMajLvl (Level maj _) = maj == 0 +isTopLvl (Level 0 0) = True +isTopLvl other = False instance Outputable Level where - ppr Top = ptext SLIT("") ppr (Level maj min) = hcat [ char '<', int maj, char ',', int min, char '>' ] \end{code} @@ -175,41 +157,14 @@ setLevels binds us do_them (b:bs) = lvlTopBind b `thenLvl` \ (lvld_bind, _) -> do_them bs `thenLvl` \ lvld_binds -> - returnLvl (lvld_bind ++ lvld_binds) + returnLvl (lvld_bind : lvld_binds) lvlTopBind (NonRec binder rhs) - = lvlBind TopLevel Top initialEnv (AnnNonRec binder (freeVars rhs)) + = lvlBind TopLevel tOP_LEVEL initialEnv (AnnNonRec binder (freeVars rhs)) -- Rhs can have no free vars! lvlTopBind (Rec pairs) - = lvlBind TopLevel Top initialEnv (AnnRec [(b,freeVars rhs) | (b,rhs) <- pairs]) -\end{code} - -%************************************************************************ -%* * -\subsection{Bindings} -%* * -%************************************************************************ - -The binding stuff works for top level too. - -\begin{code} -lvlBind :: TopLevelFlag -- Used solely to decide whether to clone - -> Level -- Context level; might be Top even for bindings nested in the RHS - -- of a top level binding - -> LevelEnv - -> CoreBindWithFVs - -> LvlM ([LevelledBind], LevelEnv) - -lvlBind top_lvl ctxt_lvl env (AnnNonRec bndr rhs) - = setFloatLevel (Just bndr) ctxt_lvl env rhs ty `thenLvl` \ (final_lvl, rhs') -> - cloneVar top_lvl env bndr final_lvl `thenLvl` \ (new_env, new_bndr) -> - returnLvl ([NonRec (new_bndr, final_lvl) rhs'], new_env) - where - ty = idType bndr - - -lvlBind top_lvl ctxt_lvl env (AnnRec pairs) = lvlRecBind top_lvl ctxt_lvl env pairs + = lvlBind TopLevel tOP_LEVEL initialEnv (AnnRec [(b,freeVars rhs) | (b,rhs) <- pairs]) \end{code} %************************************************************************ @@ -226,9 +181,7 @@ lvlExpr :: Level -- ctxt_lvl: Level of enclosing expression \end{code} The @ctxt_lvl@ is, roughly, the level of the innermost enclosing -binder. - -Here's an example +binder. Here's an example v = \x -> ...\y -> let r = case (..x..) of ..x.. @@ -252,9 +205,14 @@ lvlExpr ctxt_lvl env (_, AnnCon con args) lvlExpr ctxt_lvl env (_, AnnApp fun arg) = lvlExpr ctxt_lvl env fun `thenLvl` \ fun' -> - lvlMFE ctxt_lvl env arg `thenLvl` \ arg' -> + lvlMFE False ctxt_lvl env arg `thenLvl` \ arg' -> returnLvl (App fun' arg') +lvlExpr ctxt_lvl env (_, AnnNote InlineMe expr) + -- Don't float anything out of an InlineMe + = lvlExpr tOP_LEVEL env expr `thenLvl` \ expr' -> + returnLvl (Note InlineMe expr') + lvlExpr ctxt_lvl env (_, AnnNote note expr) = lvlExpr ctxt_lvl env expr `thenLvl` \ expr' -> returnLvl (Note note expr') @@ -267,341 +225,243 @@ lvlExpr ctxt_lvl env (_, AnnNote note expr) -- lambdas makes them more expensive. lvlExpr ctxt_lvl env expr@(_, AnnLam bndr rhs) - = lvlMFE incd_lvl new_env body `thenLvl` \ body' -> - returnLvl (mk_lams lvld_bndrs expr body') - where - bndr_is_id = isId bndr - bndr_is_tyvar = isTyVar bndr - (more_bndrs, body) = go rhs - bndrs = bndr : more_bndrs - - incd_lvl | bndr_is_id && not (all isOneShotLambda bndrs) = incMajorLvl ctxt_lvl - | otherwise = incMinorLvl ctxt_lvl - -- Only bump the major level number if the binders include - -- at least one more-than-one-shot lambda - - lvld_bndrs = [(b,incd_lvl) | b <- bndrs] - new_env = extendLvlEnv env lvld_bndrs + = go (incMinorLvl ctxt_lvl) env False {- Havn't bumped major level in this group -} expr + where + go lvl env bumped_major (_, AnnLam bndr body) + = go new_lvl new_env new_bumped_major body `thenLvl` \ new_body -> + returnLvl (Lam lvld_bndr new_body) + where + -- Go to the next major level if this is a value binder, + -- and we havn't already gone to the next level (one jump per group) + -- and it isn't a one-shot lambda + (new_lvl, new_bumped_major) + | isId bndr && + not bumped_major && + not (isOneShotLambda bndr) = (incMajorLvl ctxt_lvl, True) + | otherwise = (lvl, bumped_major) + new_env = extendLvlEnv env [lvld_bndr] + lvld_bndr = (bndr, new_lvl) -- Ignore notes, because we don't want to split -- a lambda like this (\x -> coerce t (\s -> ...)) -- This happens quite a bit in state-transformer programs - go (_, AnnLam bndr rhs) | bndr_is_id && isId bndr - || bndr_is_tyvar && isTyVar bndr - = case go rhs of { (bndrs, body) -> (bndr:bndrs, body) } - go (_, AnnNote _ rhs) = go rhs - go body = ([], body) - - -- Have to reconstruct the right Notes, since we ignored - -- them when gathering the lambdas - mk_lams (lb : lbs) (_, AnnLam _ body) body' = Lam lb (mk_lams lbs body body') - mk_lams lbs (_, AnnNote note body) body' = Note note (mk_lams lbs body body') - mk_lams [] body body' = body' + go lvl env bumped_major (_, AnnNote note body) + = go lvl env bumped_major body `thenLvl` \ new_body -> + returnLvl (Note note new_body) + + go lvl env bumped_major body + = lvlMFE True lvl env body + lvlExpr ctxt_lvl env (_, AnnLet bind body) - = lvlBind NotTopLevel ctxt_lvl env bind `thenLvl` \ (binds', new_env) -> + = lvlBind NotTopLevel ctxt_lvl env bind `thenLvl` \ (bind', new_env) -> lvlExpr ctxt_lvl new_env body `thenLvl` \ body' -> - returnLvl (mkLets binds' body') + returnLvl (Let bind' body') lvlExpr ctxt_lvl env (_, AnnCase expr case_bndr alts) - = lvlMFE ctxt_lvl env expr `thenLvl` \ expr' -> - mapLvl lvl_alt alts `thenLvl` \ alts' -> + = lvlMFE True ctxt_lvl env expr `thenLvl` \ expr' -> + let + alts_env = extendCaseBndrLvlEnv env expr' case_bndr incd_lvl + in + mapLvl (lvl_alt alts_env) alts `thenLvl` \ alts' -> returnLvl (Case expr' (case_bndr, incd_lvl) alts') where expr_type = coreExprType (deAnnotate expr) incd_lvl = incMinorLvl ctxt_lvl - alts_env = extendLvlEnv env [(case_bndr,incd_lvl)] - - lvl_alt (con, bs, rhs) - = let - bs' = [ (b, incd_lvl) | b <- bs ] - new_env = extendLvlEnv alts_env bs' - in - lvlMFE incd_lvl new_env rhs `thenLvl` \ rhs' -> + + lvl_alt alts_env (con, bs, rhs) + = lvlMFE True incd_lvl new_env rhs `thenLvl` \ rhs' -> returnLvl (con, bs', rhs') + where + bs' = [ (b, incd_lvl) | b <- bs ] + new_env = extendLvlEnv alts_env bs' \end{code} @lvlMFE@ is just like @lvlExpr@, except that it might let-bind the expression, so that it can itself be floated. \begin{code} -lvlMFE :: Level -- Level of innermost enclosing lambda/tylam +lvlMFE :: Bool -- True <=> strict context [body of case or let] + -> Level -- Level of innermost enclosing lambda/tylam -> LevelEnv -- Level of in-scope names/tyvars -> CoreExprWithFVs -- input expression -> LvlM LevelledExpr -- Result expression -lvlMFE ctxt_lvl env (_, AnnType ty) +lvlMFE strict_ctxt ctxt_lvl env (_, AnnType ty) = returnLvl (Type ty) -lvlMFE ctxt_lvl env ann_expr - | isUnLiftedType ty -- Can't let-bind it - = lvlExpr ctxt_lvl env ann_expr - - | otherwise -- Not primitive type so could be let-bound - = setFloatLevel Nothing {- Not already let-bound -} - ctxt_lvl env ann_expr ty `thenLvl` \ (final_lvl, expr') -> - returnLvl expr' +lvlMFE strict_ctxt ctxt_lvl env ann_expr@(fvs, _) + | isUnLiftedType ty -- Can't let-bind it + || not (dest_lvl `ltMajLvl` ctxt_lvl) -- Does not escape a value lambda + -- A decision to float entails let-binding this thing, and we only do + -- that if we'll escape a value lambda. I considered doing it if it + -- would make the thing go to top level, but I found things like + -- concat = /\ a -> foldr ..a.. (++) [] + -- was getting turned into + -- concat = /\ a -> lvl a + -- lvl = /\ a -> foldr ..a.. (++) [] + -- which is pretty stupid. So for now at least, I don't let-bind things + -- simply because they could go to top level. + || exprIsTrivial expr -- Is trivial + || (strict_ctxt && exprIsBottom expr) -- Strict context and is bottom + = -- Don't float it out + lvlExpr ctxt_lvl env ann_expr + + | otherwise -- Float it out! + = lvlExpr expr_lvl expr_env ann_expr `thenLvl` \ expr' -> + newLvlVar "lvl" (mkForAllTys tyvars ty) `thenLvl` \ var -> + returnLvl (Let (NonRec (var,dest_lvl) (mkLams tyvars_w_lvls expr')) + (mkTyVarApps var tyvars)) where - ty = coreExprType (deAnnotate ann_expr) + expr = deAnnotate ann_expr + ty = coreExprType expr + dest_lvl = destLevel env fvs + (tyvars, tyvars_w_lvls, expr_lvl) = abstractTyVars dest_lvl env fvs + expr_env = extendLvlEnv env tyvars_w_lvls \end{code} %************************************************************************ %* * -\subsection{Deciding floatability} +\subsection{Bindings} %* * %************************************************************************ -@setFloatLevel@ is used for let-bound right-hand-sides, or for MFEs which -are being created as let-bindings - -Decision tree: -Let Bound? - YES. -> (a) try abstracting type variables. - If we abstract type variables it will go further, that is, past more - lambdas. same as asking if the level number given by the free - variables is less than the level number given by free variables - and type variables together. - Abstract offending type variables, e.g. - change f ty a b - to let v = /\ty' -> f ty' a b - in v ty - so that v' is not stopped by the level number of ty - tag the original let with its level number - (from its variables and type variables) - NO. is a WHNF? - YES. -> No point in let binding to float a WHNF. - Pin (leave) expression here. - NO. -> Will float past a lambda? - (check using free variables only, not type variables) - YES. -> do the same as (a) above. - NO. -> No point in let binding if it is not going anywhere - Pin (leave) expression here. +The binding stuff works for top level too. \begin{code} -setFloatLevel :: Maybe Id -- Just id <=> the expression is already let-bound to id - -- Nothing <=> it's a possible MFE - -> Level -- of context - -> LevelEnv - - -> CoreExprWithFVs -- Original rhs - -> Type -- Type of rhs - - -> LvlM (Level, -- Level to attribute to this let-binding - LevelledExpr) -- Final rhs - -setFloatLevel maybe_let_bound ctxt_lvl env expr@(expr_fvs, _) ty - --- Now deal with (by not floating) trivial non-let-bound expressions --- which just aren't worth let-binding in order to float. We always --- choose to float even trivial let-bound things because it doesn't do --- any harm, and not floating it may pin something important. For --- example --- --- x = let v = [] --- w = 1:v --- in ... --- --- Here, if we don't float v we won't float w, which is Bad News. --- If this gives any problems we could restrict the idea to things destined --- for top level. - - | not alreadyLetBound - && (expr_is_trivial || expr_is_bottom || not will_float_past_lambda) - - = -- Pin trivial non-let-bound expressions, - -- or ones which aren't going anywhere useful - lvlExpr ctxt_lvl env expr `thenLvl` \ expr' -> - returnLvl (safe_ctxt_lvl, expr') - -{- SDM 7/98 -The above case used to read (whnf_or_bottom || not will_float_past_lambda). -It was changed because we really do want to float out constructors if possible: -this can save a great deal of needless allocation inside a loop. On the other -hand, there's no point floating out nullary constructors and literals, hence -the expr_is_trivial condition. --} - - | alreadyLetBound && not worth_type_abstraction - = -- Process the expression with a new ctxt_lvl, obtained from - -- the free vars of the expression itself - lvlExpr expr_lvl env expr `thenLvl` \ expr' -> - returnLvl (safe_expr_lvl, expr') - - | otherwise -- This will create a let anyway, even if there is no - -- type variable to abstract, so we try to abstract anyway - = abstractWrtTyVars offending_tyvars ty env lvl_after_ty_abstr expr - `thenLvl` \ final_expr -> - returnLvl (safe_expr_lvl, final_expr) - -- OLD LIE: The body of the let, just a type application, isn't worth floating - -- so pin it with ctxt_lvl - -- The truth: better to give it expr_lvl in case it is pinning - -- something non-trivial which depends on it. - where - alreadyLetBound = maybeToBool maybe_let_bound - - safe_ctxt_lvl = unTopify ty ctxt_lvl - safe_expr_lvl = unTopify ty expr_lvl - - fvs = case maybe_let_bound of - Nothing -> expr_fvs - Just id -> expr_fvs `unionVarSet` idFreeVars id - - ids_only_lvl = foldVarSet (maxIdLvl env) tOP_LEVEL fvs - tyvars_only_lvl = foldVarSet (maxTyVarLvl env) tOP_LEVEL fvs - expr_lvl = ids_only_lvl `maxLvl` tyvars_only_lvl - lvl_after_ty_abstr = ids_only_lvl --`maxLvl` non_offending_tyvars_lvl - - -- Will escape lambda if let-bound - will_float_past_lambda = ids_only_lvl `ltMajLvl` ctxt_lvl - - -- Will escape (more) lambda(s)/type lambda(s) if type abstracted - worth_type_abstraction = (ids_only_lvl `ltLvl` tyvars_only_lvl) - && not expr_is_trivial -- Avoids abstracting trivial type applications - - offending_tyvars = filter offending_tv (varSetElems fvs) - offending_tv var | isId var = False - | otherwise = ids_only_lvl `ltLvl` varLevel env var - - expr_is_trivial = exprIsTrivial de_ann_expr - expr_is_bottom = exprIsBottom de_ann_expr - de_ann_expr = deAnnotate expr -\end{code} - -Abstract wrt tyvars, by making it just as if we had seen - - let v = /\a1..an. E - in v a1 ... an +lvlBind :: TopLevelFlag -- Used solely to decide whether to clone + -> Level -- Context level; might be Top even for bindings nested in the RHS + -- of a top level binding + -> LevelEnv + -> CoreBindWithFVs + -> LvlM (LevelledBind, LevelEnv) -instead of simply E. The idea is that v can be freely floated, since it -has no free type variables. Of course, if E has no free type -variables, then we just return E. +lvlBind top_lvl ctxt_lvl env (AnnNonRec bndr rhs@(rhs_fvs,_)) + | null tyvars + = -- No type abstraction; clone existing binder + lvlExpr rhs_lvl rhs_env rhs `thenLvl` \ rhs' -> + cloneVar top_lvl env bndr dest_lvl `thenLvl` \ (env', bndr') -> + returnLvl (NonRec (bndr', dest_lvl) rhs', env') -\begin{code} -abstractWrtTyVars offending_tyvars ty env lvl expr - = lvlExpr incd_lvl new_env expr `thenLvl` \ expr' -> - newLvlVar poly_ty `thenLvl` \ poly_var -> + | otherwise + = -- Yes, type abstraction; create a new binder, extend substitution, etc + WARN( workerExists (getIdWorkerInfo bndr) + || not (isEmptyCoreRules (getIdSpecialisation bndr)), + text "lvlBind: discarding info on" <+> ppr bndr ) + + lvl_poly_rhs tyvars_w_lvls rhs_lvl rhs_env rhs `thenLvl` \ rhs' -> + new_poly_bndr tyvars bndr `thenLvl` \ bndr' -> let - poly_var_rhs = mkLams tyvar_lvls expr' - poly_var_binding = NonRec (poly_var, lvl) poly_var_rhs - poly_var_app = mkTyApps (Var poly_var) (mkTyVarTys offending_tyvars) - final_expr = Let poly_var_binding poly_var_app -- mkCoLet* requires Core + env' = extendPolyLvlEnv env dest_lvl tyvars [(bndr, bndr')] in - returnLvl final_expr - where - poly_ty = mkForAllTys offending_tyvars ty + returnLvl (NonRec (bndr', dest_lvl) rhs', env') - -- These defns are just like those in the TyLam case of lvlExpr - incd_lvl = incMinorLvl lvl - tyvar_lvls = [(tv,incd_lvl) | tv <- offending_tyvars] - new_env = extendLvlEnv env tyvar_lvls -\end{code} + where + bind_fvs = rhs_fvs `unionVarSet` idFreeVars bndr -Recursive definitions. We want to transform + dest_lvl | isUnLiftedType (idType bndr) = destLevel env bind_fvs `maxLvl` Level 1 0 + | otherwise = destLevel env bind_fvs + -- Hack alert! We do have some unlifted bindings, for cheap primops, and + -- it is ok to float them out; but not to the top level. If they would otherwise + -- go to the top level, we pin them inside the topmost lambda - letrec - x1 = e1 - ... - xn = en - in - body + (tyvars, tyvars_w_lvls, rhs_lvl) = abstractTyVars dest_lvl env bind_fvs + rhs_env = extendLvlEnv env tyvars_w_lvls +\end{code} -to - letrec - x1' = /\ ab -> let D' in e1 - ... - xn' = /\ ab -> let D' in en - in - let D in body +\begin{code} +lvlBind top_lvl ctxt_lvl env (AnnRec pairs) + | null tyvars + = cloneVars top_lvl env bndrs dest_lvl `thenLvl` \ (new_env, new_bndrs) -> + mapLvl (lvlExpr rhs_lvl new_env) rhss `thenLvl` \ new_rhss -> + returnLvl (Rec ((new_bndrs `zip` repeat dest_lvl) `zip` new_rhss), new_env) -where ab are the tyvars pinning the defn further in than it -need be, and D is a bunch of simple type applications: + | otherwise + = mapLvl (new_poly_bndr tyvars) bndrs `thenLvl` \ new_bndrs -> + let + new_env = extendPolyLvlEnv env dest_lvl tyvars (bndrs `zip` new_bndrs) + rhs_env = extendLvlEnv new_env tyvars_w_lvls + in + mapLvl (lvl_poly_rhs tyvars_w_lvls rhs_lvl rhs_env) rhss `thenLvl` \ new_rhss -> + returnLvl (Rec ((new_bndrs `zip` repeat dest_lvl) `zip` new_rhss), new_env) - x1_cl = x1' ab - ... - xn_cl = xn' ab + where + (bndrs,rhss) = unzip pairs -The "_cl" indicates that in D, the level numbers on the xi are the context level -number; type applications aren't worth floating. The D' decls are -similar: + -- Finding the free vars of the binding group is annoying + bind_fvs = (unionVarSets [ idFreeVars bndr `unionVarSet` rhs_fvs + | (bndr, (rhs_fvs,_)) <- pairs]) + `minusVarSet` + mkVarSet bndrs - x1_ll = x1' ab - ... - xn_ll = xn' ab + dest_lvl = destLevel env bind_fvs -but differ in their level numbers; here the ab are the newly-introduced -type lambdas. + (tyvars, tyvars_w_lvls, rhs_lvl) = abstractTyVars dest_lvl env bind_fvs -\begin{code} -lvlRecBind top_lvl ctxt_lvl env pairs - | ids_only_lvl `ltLvl` tyvars_only_lvl - = -- Abstract wrt tyvars; - -- offending_tyvars is definitely non-empty - -- (I love the ASSERT to check this... WDP 95/02) - let - incd_lvl = incMinorLvl ids_only_lvl - tyvars_w_rhs_lvl = [(var,incd_lvl) | var <- offending_tyvars] - bndrs_w_rhs_lvl = [(var,incd_lvl) | var <- bndrs] - rhs_env = extendLvlEnv env (tyvars_w_rhs_lvl ++ bndrs_w_rhs_lvl) - in - mapLvl (lvlExpr incd_lvl rhs_env) rhss `thenLvl` \ rhss' -> - mapLvl newLvlVar poly_tys `thenLvl` \ poly_vars -> - cloneVars top_lvl env bndrs ctxt_lvl `thenLvl` \ (new_env, new_bndrs) -> - let - -- The "d_rhss" are the right-hand sides of "D" and "D'" - -- in the documentation above - d_rhss = [ mkTyApps (Var poly_var) offending_tyvar_tys | poly_var <- poly_vars] +---------------------------------------------------- +-- Three help functons Stuff for the type-abstraction case - -- "local_binds" are "D'" in the documentation above - local_binds = zipWithEqual "SetLevels" NonRec bndrs_w_rhs_lvl d_rhss +new_poly_bndr tyvars bndr + = newLvlVar ("poly_" ++ occNameUserString (getOccName bndr)) + (mkForAllTys tyvars (idType bndr)) - poly_var_rhss = [ mkLams tyvars_w_rhs_lvl (mkLets local_binds rhs') - | rhs' <- rhss' - ] +lvl_poly_rhs tyvars_w_lvls rhs_lvl rhs_env rhs + = lvlExpr rhs_lvl rhs_env rhs `thenLvl` \ rhs' -> + returnLvl (mkLams tyvars_w_lvls rhs') +\end{code} - poly_binds = zipEqual "poly_binds" [(poly_var, ids_only_lvl) | poly_var <- poly_vars] - poly_var_rhss - -- The new right-hand sides, just a type application, - -- aren't worth floating so pin it with ctxt_lvl - bndrs_w_lvl = new_bndrs `zip` repeat ctxt_lvl +%************************************************************************ +%* * +\subsection{Deciding floatability} +%* * +%************************************************************************ - -- "d_binds" are the "D" in the documentation above - d_binds = zipWithEqual "SetLevels" NonRec bndrs_w_lvl d_rhss - in - returnLvl (Rec poly_binds : d_binds, new_env) +\begin{code} +abstractTyVars :: Level -> LevelEnv -> VarSet + -> ([TyVar], [(TyVar,Level)], Level) + -- Find the tyvars whose level is higher than the supplied level + -- There should be no Ids with this property +abstractTyVars lvl env fvs + | null tyvars = ([], [], lvl) -- Don't increment level | otherwise - = -- Let it float freely - cloneVars top_lvl env bndrs expr_lvl `thenLvl` \ (new_env, new_bndrs) -> - let - bndrs_w_lvls = new_bndrs `zip` repeat expr_lvl - in - mapLvl (lvlExpr expr_lvl new_env) rhss `thenLvl` \ rhss' -> - returnLvl ([Rec (bndrs_w_lvls `zip` rhss')], new_env) - + = ASSERT( not (any bad fv_list) ) + (tyvars, tyvars_w_lvls, incd_lvl) where - (bndrs,rhss) = unzip pairs + bad v = isId v && lvl `ltLvl` varLevel env v + fv_list = varSetElems fvs + tyvars = nub [tv | v <- fv_list, tv <- tvs_of v, abstract_tv tv] - -- Finding the free vars of the binding group is annoying - bind_fvs = (unionVarSets (map fst rhss) `unionVarSet` unionVarSets (map idFreeVars bndrs)) - `minusVarSet` - mkVarSet bndrs + -- If f is free in the exression, and f maps to poly_f a b c in the + -- current substitution, then we must report a b c as candidate type + -- variables + tvs_of v | isId v = lookupTyVars env v + | otherwise = [v] - ids_only_lvl = foldVarSet (maxIdLvl env) tOP_LEVEL bind_fvs - tyvars_only_lvl = foldVarSet (maxTyVarLvl env) tOP_LEVEL bind_fvs - expr_lvl = ids_only_lvl `maxLvl` tyvars_only_lvl + abstract_tv var | isId var = False + | otherwise = lvl `ltLvl` varLevel env var - offending_tyvars = filter offending_tv (varSetElems bind_fvs) - offending_tv var | isId var = False - | otherwise = ids_only_lvl `ltLvl` varLevel env var - offending_tyvar_tys = mkTyVarTys offending_tyvars + -- These defns are just like those in the TyLam case of lvlExpr + incd_lvl = incMinorLvl lvl + tyvars_w_lvls = [(tv,incd_lvl) | tv <- tyvars] - tys = map idType bndrs - poly_tys = map (mkForAllTys offending_tyvars) tys + + -- Destintion level is the max Id level of the expression + -- (We'll abstract the type variables, if any.) +destLevel :: LevelEnv -> VarSet -> Level +destLevel env fvs = foldVarSet (maxIdLvl env) tOP_LEVEL fvs + +maxIdLvl :: LevelEnv -> IdOrTyVar -> Level -> Level +maxIdLvl (lvl_env,_,_) var lvl | isTyVar var = lvl + | otherwise = case lookupVarEnv lvl_env var of + Just lvl' -> maxLvl lvl' lvl + Nothing -> lvl \end{code} + %************************************************************************ %* * \subsection{Free-To-Level Monad} @@ -609,43 +469,68 @@ lvlRecBind top_lvl ctxt_lvl env pairs %************************************************************************ \begin{code} -type LevelEnv = (VarEnv Level, SubstEnv) +type LevelEnv = (VarEnv Level, SubstEnv, IdEnv ([TyVar], LevelledExpr)) -- We clone let-bound variables so that they are still - -- distinct when floated out; hence the SubstEnv - -- The domain of the VarEnv is *pre-cloned* Ids, though + -- distinct when floated out; hence the SubstEnv/IdEnv. + -- We also use these envs when making a variable polymorphic + -- because we want to float it out past a big lambda. + -- + -- The two Envs always implement the same mapping, but the + -- SubstEnv maps to CoreExpr and the IdEnv to LevelledExpr + -- Since the range is always a variable or type application, + -- there is never any difference between the two, but sadly + -- the types differ. The SubstEnv is used when substituting in + -- a variable's IdInfo; the IdEnv when we find a Var. + -- + -- In addition the IdEnv records a list of tyvars free in the + -- type application, just so we don't have to call freeVars on + -- the type application repeatedly. + -- + -- The domain of the both envs is *pre-cloned* Ids, though initialEnv :: LevelEnv -initialEnv = (emptyVarEnv, emptySubstEnv) +initialEnv = (emptyVarEnv, emptySubstEnv, emptyVarEnv) extendLvlEnv :: LevelEnv -> [(Var,Level)] -> LevelEnv -- Used when *not* cloning -extendLvlEnv (lvl_env, subst_env) prs - = (foldl add lvl_env prs, subst_env) - where - add env (v,l) = extendVarEnv env v l +extendLvlEnv (lvl_env, subst_env, id_env) prs + = (foldl add lvl_env prs, subst_env, id_env) + where + add env (v,l) = extendVarEnv env v l + +-- extendCaseBndrLvlEnv adds the mapping case-bndr->scrut-var if it can +extendCaseBndrLvlEnv (lvl_env, subst_env, id_env) scrut case_bndr lvl + = case scrut of + Var v -> (new_lvl_env, extendSubstEnv subst_env case_bndr (DoneEx (Var v)), + extendVarEnv id_env case_bndr ([], scrut)) + other -> (new_lvl_env, subst_env, id_env) + where + new_lvl_env = extendVarEnv lvl_env case_bndr lvl + +extendPolyLvlEnv (lvl_env, subst_env, id_env) dest_lvl tyvars bndr_pairs + = (foldl add_lvl lvl_env bndr_pairs, + foldl add_subst subst_env bndr_pairs, + foldl add_id id_env bndr_pairs) + where + add_lvl env (v,_ ) = extendVarEnv env v dest_lvl + add_subst env (v,v') = extendSubstEnv env v (DoneEx (mkTyVarApps v' tyvars)) + add_id env (v,v') = extendVarEnv env v (tyvars, mkTyVarApps v' tyvars) varLevel :: LevelEnv -> IdOrTyVar -> Level -varLevel (lvl_env, _) v +varLevel (lvl_env, _, _) v = case lookupVarEnv lvl_env v of Just level -> level Nothing -> tOP_LEVEL lookupVar :: LevelEnv -> Id -> LevelledExpr -lookupVar (_, subst) v = case lookupSubstEnv subst v of - Just (DoneEx (Var v')) -> Var v' -- Urgh! Types don't match - other -> Var v - -maxIdLvl :: LevelEnv -> IdOrTyVar -> Level -> Level -maxIdLvl (lvl_env,_) var lvl | isTyVar var = lvl - | otherwise = case lookupVarEnv lvl_env var of - Just lvl' -> maxLvl lvl' lvl - Nothing -> lvl - -maxTyVarLvl :: LevelEnv -> IdOrTyVar -> Level -> Level -maxTyVarLvl (lvl_env,_) var lvl | isId var = lvl - | otherwise = case lookupVarEnv lvl_env var of - Just lvl' -> maxLvl lvl' lvl - Nothing -> lvl +lookupVar (_, _, id_env) v = case lookupVarEnv id_env v of + Just (_, expr) -> expr + other -> Var v + +lookupTyVars :: LevelEnv -> Id -> [TyVar] +lookupTyVars (_, _, id_env) v = case lookupVarEnv id_env v of + Just (tyvars, _) -> tyvars + Nothing -> [] \end{code} \begin{code} @@ -658,9 +543,9 @@ mapLvl = mapUs \end{code} \begin{code} -newLvlVar :: Type -> LvlM Id -newLvlVar ty = getUniqueUs `thenLvl` \ uniq -> - returnUs (mkSysLocal SLIT("lvl") uniq ty) +newLvlVar :: String -> Type -> LvlM Id +newLvlVar str ty = getUniqueUs `thenLvl` \ uniq -> + returnUs (mkSysLocal (_PK_ str) uniq ty) -- The deeply tiresome thing is that we have to apply the substitution -- to the rules inside each Id. Grr. But it matters. @@ -668,28 +553,33 @@ newLvlVar ty = getUniqueUs `thenLvl` \ uniq -> cloneVar :: TopLevelFlag -> LevelEnv -> Id -> Level -> LvlM (LevelEnv, Id) cloneVar TopLevel env v lvl = returnUs (env, v) -- Don't clone top level things -cloneVar NotTopLevel (lvl_env, subst_env) v lvl +cloneVar NotTopLevel (lvl_env, subst_env, id_env) v lvl = getUniqueUs `thenLvl` \ uniq -> let subst = mkSubst emptyVarSet subst_env v' = setVarUnique v uniq v'' = modifyIdInfo (\info -> substIdInfo subst info info) v' subst_env' = extendSubstEnv subst_env v (DoneEx (Var v'')) - lvl_env' = extendVarEnv lvl_env v lvl + id_env' = extendVarEnv id_env v ([], Var v'') + lvl_env' = extendVarEnv lvl_env v lvl in - returnUs ((lvl_env', subst_env'), v'') + returnUs ((lvl_env', subst_env', id_env'), v'') cloneVars :: TopLevelFlag -> LevelEnv -> [Id] -> Level -> LvlM (LevelEnv, [Id]) cloneVars TopLevel env vs lvl = returnUs (env, vs) -- Don't clone top level things -cloneVars NotTopLevel (lvl_env, subst_env) vs lvl +cloneVars NotTopLevel (lvl_env, subst_env, id_env) vs lvl = getUniquesUs (length vs) `thenLvl` \ uniqs -> let subst = mkSubst emptyVarSet subst_env' vs' = zipWith setVarUnique vs uniqs vs'' = map (modifyIdInfo (\info -> substIdInfo subst info info)) vs' subst_env' = extendSubstEnvList subst_env vs [DoneEx (Var v'') | v'' <- vs''] + id_env' = extendVarEnvList id_env (vs `zip` [([], Var v') | v' <- vs'']) lvl_env' = extendVarEnvList lvl_env (vs `zip` repeat lvl) in - returnUs ((lvl_env', subst_env'), vs'') + returnUs ((lvl_env', subst_env', id_env'), vs'') + +mkTyVarApps var tyvars = foldl (\e tv -> App e (Type (mkTyVarTy tv))) + (Var var) tyvars \end{code} diff --git a/ghc/compiler/simplCore/SimplMonad.lhs b/ghc/compiler/simplCore/SimplMonad.lhs index 80b8553..fdc70a4 100644 --- a/ghc/compiler/simplCore/SimplMonad.lhs +++ b/ghc/compiler/simplCore/SimplMonad.lhs @@ -51,13 +51,14 @@ module SimplMonad ( #include "HsVersions.h" import Const ( Con(DEFAULT) ) -import Id ( Id, mkSysLocal, idMustBeINLINEd ) +import Id ( Id, mkSysLocal, isConstantId ) import IdInfo ( InlinePragInfo(..) ) import Demand ( Demand ) import CoreSyn import PprCore () -- Instances import Rules ( RuleBase ) import CostCentre ( CostCentreStack, subsumedCCS ) +import Name ( isLocallyDefined ) import Var ( TyVar ) import VarEnv import VarSet @@ -743,7 +744,23 @@ environment seems like wild overkill. \begin{code} switchOffInlining :: SimplM a -> SimplM a switchOffInlining m env us sc - = m (env { seBlackList = \v -> True }) us sc + = m (env { seBlackList = \v -> (v `isInScope` subst) || not (isLocallyDefined v) + }) us sc + -- Black list anything that is in scope or imported. + -- The in-scope thing arranges *not* to black list inlinings that are + -- completely inside the switch-off-inlining block. + -- This allows simplification to proceed un-hindered inside the block. + -- + -- At one time I had an exception for constant Ids (constructors, primops) + -- && (old_black_list v || not (isConstantId v )) + -- because (a) some don't have bindings, so we never want not to inline them + -- (b) their defns are very seldom big, so there's no size penalty + -- to inline them + -- But that failed because if we inline (say) [] in build's rhs, then + -- the exported thing doesn't match rules + where + subst = seSubst env + old_black_list = seBlackList env \end{code} @@ -813,15 +830,9 @@ setInScope :: InScopeSet -> SimplM a -> SimplM a setInScope in_scope m env@(SimplEnv {seSubst = subst}) us sc = m (env {seSubst = Subst.setInScope subst in_scope}) us sc -modifyInScope :: CoreBndr -> SimplM a -> SimplM a -modifyInScope v m env us sc -#ifdef DEBUG - | not (v `isInScope` seSubst env) - = pprTrace "modifyInScope: not in scope:" (ppr v) - m env us sc -#endif - | otherwise - = extendInScope v m env us sc +modifyInScope :: CoreBndr -> CoreBndr -> SimplM a -> SimplM a +modifyInScope v v' m env@(SimplEnv {seSubst = subst}) us sc + = m (env {seSubst = Subst.modifyInScope subst v v'}) us sc extendSubst :: CoreBndr -> SubstResult -> SimplM a -> SimplM a extendSubst var res m env@(SimplEnv {seSubst = subst}) us sc diff --git a/ghc/compiler/simplCore/SimplUtils.lhs b/ghc/compiler/simplCore/SimplUtils.lhs index b685876..835047b 100644 --- a/ghc/compiler/simplCore/SimplUtils.lhs +++ b/ghc/compiler/simplCore/SimplUtils.lhs @@ -21,10 +21,10 @@ import CoreFVs ( exprFreeVars ) import CoreUtils ( exprIsTrivial, cheapEqExpr, coreExprType, exprIsCheap, exprEtaExpandArity ) import Subst ( substBndrs, substBndr, substIds ) import Id ( Id, idType, getIdArity, isId, idName, - getInlinePragma, setInlinePragma, + getIdOccInfo, getIdDemandInfo, mkId, idInfo ) -import IdInfo ( arityLowerBound, InlinePragInfo(..), setInlinePragInfo, vanillaIdInfo ) +import IdInfo ( arityLowerBound, setOccInfo, vanillaIdInfo ) import Maybes ( maybeToBool, catMaybes ) import Const ( Con(..) ) import Name ( isLocalName, setNameUnique ) @@ -243,7 +243,7 @@ mkRhsTyLam tyvars body -- Only does something if there's a let poly_name = setNameUnique (idName var) uniq -- Keep same name poly_ty = mkForAllTys tyvars_here (idType var) -- But new type of course - -- It's crucial to copy the inline-prag of the original var, because + -- It's crucial to copy the occInfo of the original var, because -- we're looking at occurrence-analysed but as yet unsimplified code! -- In particular, we mustn't lose the loop breakers. -- @@ -254,14 +254,14 @@ mkRhsTyLam tyvars body -- Only does something if there's a let -- where x* has an INLINE prag on it. Now, once x* is inlined, -- the occurrences of x' will be just the occurrences originaly -- pinned on x. - poly_info = vanillaIdInfo `setInlinePragInfo` getInlinePragma var + poly_info = vanillaIdInfo `setOccInfo` getIdOccInfo var poly_id = mkId poly_name poly_ty poly_info in returnSmpl (poly_id, mkTyApps (Var poly_id) (mkTyVarTys tyvars_here)) - mk_silly_bind var rhs = NonRec (setInlinePragma var IMustBeINLINEd) rhs - -- The addInlinePragma is really important! If we don't say + mk_silly_bind var rhs = NonRec var rhs + -- The Inline note is really important! If we don't say -- INLINE on these silly little bindings then look what happens! -- Suppose we start with: -- @@ -273,7 +273,7 @@ mkRhsTyLam tyvars body -- Only does something if there's a let -- * but then it gets inlined into the rhs of g* -- * then the binding for g* is floated out of the /\b -- * so we're back to square one - -- The silly binding for g* must be IMustBeINLINEs, so that + -- The silly binding for g* must be INLINEd, so that -- we simply substitute for g* throughout. \end{code} @@ -541,11 +541,14 @@ findAlt con alts matches (DEFAULT, _, _) = True matches (con1, _, _) = con == con1 +\end{code} -mkCoerce to_ty (Note (Coerce _ from_ty) expr) +\begin{code} +mkCoerce :: Type -> CoreExpr -> CoreExpr +mkCoerce to_ty expr | to_ty == from_ty = expr | otherwise = Note (Coerce to_ty from_ty) expr -mkCoerce to_ty expr - = Note (Coerce to_ty (coreExprType expr)) expr + where + from_ty = coreExprType expr \end{code} diff --git a/ghc/compiler/simplCore/Simplify.lhs b/ghc/compiler/simplCore/Simplify.lhs index 0828a79..2d9740b 100644 --- a/ghc/compiler/simplCore/Simplify.lhs +++ b/ghc/compiler/simplCore/Simplify.lhs @@ -8,7 +8,7 @@ module Simplify ( simplTopBinds, simplExpr ) where #include "HsVersions.h" -import CmdLineOpts ( intSwitchSet, +import CmdLineOpts ( intSwitchSet, switchIsOn, opt_SccProfilingOn, opt_PprStyle_Debug, opt_SimplDoEtaReduction, opt_SimplNoPreInlining, opt_DictsStrict, opt_SimplPedanticBottoms, SimplifierSwitch(..) @@ -25,13 +25,15 @@ import Id ( Id, idType, idInfo, idUnique, getIdSpecialisation, setIdSpecialisation, getIdDemandInfo, setIdDemandInfo, setIdInfo, + getIdOccInfo, setIdOccInfo, + zapLamIdInfo, zapFragileIdInfo, getIdStrictness, - setInlinePragma, getInlinePragma, idMustBeINLINEd, + setInlinePragma, mayHaveNoBinding, setOneShotLambda, maybeModifyIdInfo ) import IdInfo ( InlinePragInfo(..), OccInfo(..), StrictnessInfo(..), - ArityInfo(..), atLeastArity, arityLowerBound, unknownArity, zapFragileIdInfo, - specInfo, inlinePragInfo, zapLamIdInfo, setArityInfo, setInlinePragInfo, setUnfoldingInfo + ArityInfo(..), atLeastArity, arityLowerBound, unknownArity, + specInfo, inlinePragInfo, setArityInfo, setInlinePragInfo, setUnfoldingInfo ) import Demand ( Demand, isStrict, wwLazy ) import Const ( isWHNFCon, conOkForAlt ) @@ -55,8 +57,8 @@ import Type ( Type, mkTyVarTy, mkTyVarTys, isUnLiftedType, seqType, mkFunTy, splitFunTys, splitTyConApp_maybe, splitFunTy_maybe, funResultTy, isDictTy, isDataType, applyTy, applyTys, mkFunTys ) -import Subst ( Subst, mkSubst, emptySubst, substExpr, substTy, - substEnv, lookupInScope, lookupSubst, substIdInfo +import Subst ( Subst, mkSubst, emptySubst, substTy, substExpr, + substEnv, isInScope, lookupInScope, lookupIdSubst, substIdInfo ) import TyCon ( isDataTyCon, tyConDataCons, tyConClass_maybe, tyConArity, isDataTyCon ) import TysPrim ( realWorldStatePrimTy ) @@ -66,6 +68,7 @@ import Maybes ( maybeToBool ) import Util ( zipWithEqual, stretchZipEqual, lengthExceeds ) import PprCore import Outputable +import Unique ( foldrIdKey ) -- Temp \end{code} @@ -87,22 +90,21 @@ simplTopBinds binds -- so that if a transformation rule has unexpectedly brought -- anything into scope, then we don't get a complaint about that. -- It's rather as if the top-level binders were imported. - extendInScopes top_binders $ - simpl_binds binds `thenSmpl` \ (binds', _) -> - freeTick SimplifierDone `thenSmpl_` + simplIds (bindersOfBinds binds) $ \ bndrs' -> + simpl_binds binds bndrs' `thenSmpl` \ (binds', _) -> + freeTick SimplifierDone `thenSmpl_` returnSmpl binds' where - top_binders = bindersOfBinds binds - simpl_binds [] = returnSmpl ([], panic "simplTopBinds corner") - simpl_binds (NonRec bndr rhs : binds) = simplLazyBind TopLevel bndr (zap bndr) rhs (simpl_binds binds) - simpl_binds (Rec pairs : binds) = simplRecBind TopLevel pairs (map (zap . fst) pairs) (simpl_binds binds) + -- We need to track the zapped top-level binders, because + -- they should have their fragile IdInfo zapped (notably occurrence info) + simpl_binds [] bs = ASSERT( null bs ) returnSmpl ([], panic "simplTopBinds corner") + simpl_binds (NonRec bndr rhs : binds) (b:bs) = simplLazyBind True bndr b rhs (simpl_binds binds bs) + simpl_binds (Rec pairs : binds) bs = simplRecBind True pairs (take n bs) (simpl_binds binds (drop n bs)) + where + n = length pairs - zap id = maybeModifyIdInfo zapFragileIdInfo id --- TEMP - - -simplRecBind :: TopLevelFlag -> [(InId, InExpr)] -> [OutId] +simplRecBind :: Bool -> [(InId, InExpr)] -> [OutId] -> SimplM (OutStuff a) -> SimplM (OutStuff a) simplRecBind top_lvl pairs bndrs' thing_inside = go pairs bndrs' `thenSmpl` \ (binds', stuff) -> @@ -238,7 +240,7 @@ simplExprF (Let (Rec pairs) body) cont -- NB: bndrs' don't have unfoldings or spec-envs -- We add them as we go down, using simplPrags - simplRecBind NotTopLevel pairs bndrs' (simplExprF body cont) + simplRecBind False pairs bndrs' (simplExprF body cont) simplExprF expr@(Lam _ _) cont = simplLam expr cont @@ -247,10 +249,25 @@ simplExprF (Type ty) cont simplType ty `thenSmpl` \ ty' -> rebuild (Type ty') cont +-- Comments about the Coerce case +-- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +-- It's worth checking for a coerce in the continuation, +-- in case we can cancel them. For example, in the initial form of a worker +-- we may find (coerce T (coerce S (\x.e))) y +-- and we'd like it to simplify to e[y/x] in one round of simplification + +simplExprF (Note (Coerce to from) e) (CoerceIt outer_to cont) + = simplType from `thenSmpl` \ from' -> + if outer_to == from' then + -- The coerces cancel out + simplExprF e cont + else + -- They don't cancel, but the inner one is redundant + simplExprF e (CoerceIt outer_to cont) + simplExprF (Note (Coerce to from) e) cont - | to == from = simplExprF e cont - | otherwise = simplType to `thenSmpl` \ to' -> - simplExprF e (CoerceIt to' cont) + = simplType to `thenSmpl` \ to' -> + simplExprF e (CoerceIt to' cont) -- hack: we only distinguish subsumed cost centre stacks for the purposes of -- inlining. All other CCCSs are mapped to currentCCS. @@ -305,7 +322,7 @@ simplExprF (Let (NonRec bndr rhs) body) cont simplLam fun cont = go fun cont where - zap_it = mkLamBndrZapper fun (countArgs cont) + zap_it = mkLamBndrZapper fun cont cont_ty = contResultType cont -- Type-beta reduction @@ -353,15 +370,19 @@ completeLam acc body cont -- Remember, acc is the *reversed* binders mkLamBndrZapper :: CoreExpr -- Function - -> Int -- Number of args + -> SimplCont -- The context -> Id -> Id -- Use this to zap the binders -mkLamBndrZapper fun n_args +mkLamBndrZapper fun cont | n_args >= n_params fun = \b -> b -- Enough args - | otherwise = \b -> maybeModifyIdInfo zapLamIdInfo b + | otherwise = \b -> zapLamIdInfo b where - n_params (Lam b e) | isId b = 1 + n_params e - | otherwise = n_params e - n_params other = 0::Int + -- NB: we count all the args incl type args + -- so we must count all the binders (incl type lambdas) + n_args = countArgs cont + + n_params (Note _ e) = n_params e + n_params (Lam b e) = 1 + n_params e + n_params other = 0::Int \end{code} @@ -371,27 +392,42 @@ That means it may generate some Lets, hence the strange type \begin{code} simplConArgs :: [InArg] -> ([OutArg] -> SimplM OutExprStuff) -> SimplM OutExprStuff -simplConArgs [] thing_inside - = thing_inside [] - -simplConArgs (arg:args) thing_inside - = switchOffInlining (simplExpr arg) `thenSmpl` \ arg' -> - -- Simplify the RHS with inlining switched off, so that - -- only absolutely essential things will happen. - -- If we don't do this, consider: - -- let x = e in C {x} - -- We end up inlining x back into C's argument, - -- and then let-binding it again! - - simplConArgs args $ \ args' -> - - -- If the argument ain't trivial, then let-bind it - if exprIsTrivial arg' then - thing_inside (arg' : args') - else - newId (coreExprType arg') $ \ arg_id -> - completeBeta arg_id arg_id arg' $ - thing_inside (Var arg_id : args') +simplConArgs args thing_inside + = getSubst `thenSmpl` \ subst -> + go subst args thing_inside + where + go subst [] thing_inside + = thing_inside [] + go subst (arg:args) thing_inside + | exprIsTrivial arg + = let + arg1 = substExpr subst arg + -- Simplify the RHS with inlining switched off, so that + -- only absolutely essential things will happen. + -- If we don't do this, consider: + -- let x = e in C {x} + -- We end up inlining x back into C's argument, + -- and then let-binding it again! + -- + -- It's important that the substitution *does* deal with case-binder synonyms: + -- case x of y { True -> (x,1) } + -- Here we must be sure to substitute y for x when simplifying the args of the pair, + -- to increase the chances of being able to inline x. The substituter will do + -- that because the x->y mapping is held in the in-scope set. + in + ASSERT( exprIsTrivial arg1 ) + go subst args $ \ args1 -> + thing_inside (arg1 : args1) + + | otherwise + = -- If the argument ain't trivial, then let-bind it + simplExpr arg `thenSmpl` \ arg1 -> + newId (coreExprType arg1) $ \ arg_id -> + go subst args $ \ args1 -> + thing_inside (Var arg_id : args1) `thenSmpl` \ res -> + returnSmpl (addBind (NonRec arg_id arg1) res) + -- I used to use completeBeta but that was wrong, because + -- arg_id isn't an InId \end{code} @@ -432,7 +468,7 @@ simplBeta bndr rhs rhs_se cont_ty thing_inside #endif simplBeta bndr rhs rhs_se cont_ty thing_inside - | preInlineUnconditionally bndr && not opt_SimplNoPreInlining + | preInlineUnconditionally False {- not black listed -} bndr = tick (PreInlineUnconditionally bndr) `thenSmpl_` extendSubst bndr (ContEx rhs_se rhs) thing_inside @@ -455,7 +491,7 @@ completeBeta bndr bndr' rhs' thing_inside returnSmpl ([], (in_scope, Case rhs' bndr' [(DEFAULT, [], mkLets floats body)])) | otherwise - = completeBinding bndr bndr' False rhs' thing_inside + = completeBinding bndr bndr' False False rhs' thing_inside \end{code} @@ -480,7 +516,8 @@ simplArg arg_ty demand arg arg_se cont_ty thing_inside etaFirst thing_inside rhs') | otherwise - = simplRhs NotTopLevel True {- OK to float unboxed -} + = simplRhs False {- Not top level -} + True {- OK to float unboxed -} arg_ty arg arg_se thing_inside @@ -514,19 +551,20 @@ It does *not* attempt to do let-to-case. Why? Because they are used for \begin{code} completeBinding :: InId -- Binder -> OutId -- New binder + -> Bool -- True <=> top level -> Bool -- True <=> black-listed; don't inline -> OutExpr -- Simplified RHS -> SimplM (OutStuff a) -- Thing inside -> SimplM (OutStuff a) -completeBinding old_bndr new_bndr black_listed new_rhs thing_inside - | isDeadBinder old_bndr -- This happens; for example, the case_bndr during case of - -- known constructor: case (a,b) of x { (p,q) -> ... } - -- Here x isn't mentioned in the RHS, so we don't want to +completeBinding old_bndr new_bndr top_lvl black_listed new_rhs thing_inside + | (case occ_info of -- This happens; for example, the case_bndr during case of + IAmDead -> True -- known constructor: case (a,b) of x { (p,q) -> ... } + other -> False) -- Here x isn't mentioned in the RHS, so we don't want to -- create the (dead) let-binding let x = (a,b) in ... = thing_inside - | not black_listed && postInlineUnconditionally old_bndr new_rhs + | postInlineUnconditionally black_listed occ_info old_bndr new_rhs -- Maybe we don't need a let-binding! Maybe we can just -- inline it right away. Unlike the preInlineUnconditionally case -- we are allowed to look at the RHS. @@ -534,6 +572,14 @@ completeBinding old_bndr new_bndr black_listed new_rhs thing_inside -- NB: a loop breaker never has postInlineUnconditionally True -- and non-loop-breakers only have *forward* references -- Hence, it's safe to discard the binding + -- + -- NB: You might think that postInlineUnconditionally is an optimisation, + -- but if we have + -- let x = f Bool in (x, y) + -- then because of the constructor, x will not be *inlined* in the pair, + -- so the trivial binding will stay. But in this postInlineUnconditionally + -- gag we use the *substitution* to substitute (f Bool) for x, and that *will* + -- happen. = tick (PostInlineUnconditionally old_bndr) `thenSmpl_` extendSubst old_bndr (DoneEx new_rhs) thing_inside @@ -542,26 +588,23 @@ completeBinding old_bndr new_bndr black_listed new_rhs thing_inside = getSubst `thenSmpl` \ subst -> let -- We make new IdInfo for the new binder by starting from the old binder, - -- doing appropriate substitutions, + -- doing appropriate substitutions. + -- Then we add arity and unfolding info to get the new binder new_bndr_info = substIdInfo subst (idInfo old_bndr) (idInfo new_bndr) `setArityInfo` ArityAtLeast (exprArity new_rhs) + `setUnfoldingInfo` mkUnfolding top_lvl new_rhs - -- At the *binding* site we use the new binder info - binding_site_id = new_bndr `setIdInfo` new_bndr_info - - -- At the *occurrence* sites we want to know the unfolding - -- We also want the occurrence info of the *original* - occ_site_id = new_bndr `setIdInfo` - (new_bndr_info `setUnfoldingInfo` mkUnfolding new_rhs - `setInlinePragInfo` getInlinePragma old_bndr) + final_id = new_bndr `setIdInfo` new_bndr_info in -- These seqs force the Ids, and hence the IdInfos, and hence any -- inner substitutions - binding_site_id `seq` - occ_site_id `seq` + final_id `seq` + + (modifyInScope new_bndr final_id thing_inside `thenSmpl` \ stuff -> + returnSmpl (addBind (NonRec final_id new_rhs) stuff)) - (modifyInScope occ_site_id thing_inside `thenSmpl` \ stuff -> - returnSmpl (addBind (NonRec binding_site_id new_rhs) stuff)) + where + occ_info = getIdOccInfo old_bndr \end{code} @@ -580,7 +623,7 @@ It does two important optimisations though: * It does eta expansion \begin{code} -simplLazyBind :: TopLevelFlag +simplLazyBind :: Bool -- True <=> top level -> InId -> OutId -> InExpr -- The RHS -> SimplM (OutStuff a) -- The body of the binding @@ -591,34 +634,31 @@ simplLazyBind :: TopLevelFlag simplLazyBind top_lvl bndr bndr' rhs thing_inside = getBlackList `thenSmpl` \ black_list_fn -> - let - black_listed = isTopLevel top_lvl && black_list_fn bndr - -- Only top level things can be black listed, so the - -- first test gets us 'False' without having to call - -- the function, in the common case. + let + black_listed = black_list_fn bndr in - if not black_listed && - preInlineUnconditionally bndr && - not opt_SimplNoPreInlining - then - tick (PreInlineUnconditionally bndr) `thenSmpl_` - getSubstEnv `thenSmpl` \ rhs_se -> + + if preInlineUnconditionally black_listed bndr then + -- Inline unconditionally + tick (PreInlineUnconditionally bndr) `thenSmpl_` + getSubstEnv `thenSmpl` \ rhs_se -> (extendSubst bndr (ContEx rhs_se rhs) thing_inside) + else - else -- Simplify the RHS - getSubstEnv `thenSmpl` \ rhs_se -> - simplRhs top_lvl False {- Not ok to float unboxed -} - (idType bndr') - rhs rhs_se $ \ rhs' -> + -- Simplify the RHS + getSubstEnv `thenSmpl` \ rhs_se -> + simplRhs top_lvl False {- Not ok to float unboxed -} + (idType bndr') + rhs rhs_se $ \ rhs' -> -- Now compete the binding and simplify the body - completeBinding bndr bndr' black_listed rhs' thing_inside + completeBinding bndr bndr' top_lvl black_listed rhs' thing_inside \end{code} \begin{code} -simplRhs :: TopLevelFlag +simplRhs :: Bool -- True <=> Top level -> Bool -- True <=> OK to float unboxed (speculative) bindings -> OutType -> InExpr -> SubstEnv -> (OutExpr -> SimplM (OutStuff a)) @@ -636,8 +676,8 @@ simplRhs top_lvl float_ubx rhs_ty rhs rhs_se thing_inside (floats_out, rhs'') | float_ubx = (floats, rhs') | otherwise = splitFloats floats rhs' in - if (isTopLevel top_lvl || exprIsCheap rhs') && -- Float lets if (a) we're at the top level - not (null floats_out) -- or (b) it exposes a cheap (i.e. duplicatable) expression + if (top_lvl || exprIsCheap rhs') && -- Float lets if (a) we're at the top level + not (null floats_out) -- or (b) it exposes a cheap (i.e. duplicatable) expression then tickLetFloat floats_out `thenSmpl_` -- Do the float @@ -692,45 +732,31 @@ splitFloats floats rhs \begin{code} simplVar var cont = getSubst `thenSmpl` \ subst -> - case lookupSubst subst var of - Just (DoneEx (Var v)) -> zapSubstEnv (simplVar v cont) - Just (DoneEx e) -> zapSubstEnv (simplExprF e cont) - Just (ContEx env' e) -> setSubstEnv env' (simplExprF e cont) - - Nothing -> let - var' = case lookupInScope subst var of - Just v' -> v' - Nothing -> -#ifdef DEBUG - if isLocallyDefined var && not (idMustBeINLINEd var) - -- The idMustBeINLINEd test accouunts for the fact - -- that class dictionary constructors don't have top level - -- bindings and hence aren't in scope. - then - -- Not in scope - pprTrace "simplVar:" (ppr var) var - else -#endif - var - in - getBlackList `thenSmpl` \ black_list -> - getInScope `thenSmpl` \ in_scope -> - completeCall black_list in_scope var var' cont + case lookupIdSubst subst var of + DoneEx e -> zapSubstEnv (simplExprF e cont) + ContEx env1 e -> setSubstEnv env1 (simplExprF e cont) + DoneId var1 occ -> WARN( not (isInScope var1 subst) && isLocallyDefined var1 && not (mayHaveNoBinding var1), + text "simplVar:" <+> ppr var ) + -- The mayHaveNoBinding test accouunts for the fact + -- that class dictionary constructors dont have top level + -- bindings and hence aren't in scope. + finish_var var1 occ + where + finish_var var occ + = getBlackList `thenSmpl` \ black_list -> + getInScope `thenSmpl` \ in_scope -> + completeCall black_list in_scope occ var cont --------------------------------------------------------- -- Dealing with a call -completeCall black_list_fn in_scope orig_var var cont --- For reasons I'm not very clear about, it's important *not* to plug 'var', --- which is replete with an inlining in its IdInfo, into the resulting expression --- Doing so results in a significant space leak. --- Instead we pass orig_var, which has no inlinings etc. +completeCall black_list_fn in_scope occ var cont -- Look for an unfolding. There's a binding for the -- thing, but perhaps we want to inline it anyway | maybeToBool maybe_inline = tick (UnfoldingDone var) `thenSmpl_` - zapSubstEnv (completeInlining orig_var unf_template discard_inline_cont) + zapSubstEnv (completeInlining var unf_template discard_inline_cont) -- The template is already simplified, so don't re-substitute. -- This is VITAL. Consider -- let x = e in @@ -740,7 +766,7 @@ completeCall black_list_fn in_scope orig_var var cont -- Then when we inline y, we must *not* replace x by x' in -- the inlined copy!! - | otherwise -- Neither rule nor inlining + | otherwise -- No inlining -- Use prepareArgs to use function strictness = prepareArgs (ppr var) (idType var) (get_str var) cont $ \ args' cont' -> @@ -757,13 +783,19 @@ completeCall black_list_fn in_scope orig_var var cont -- But the black-listing mechanism means that inlining of the wrapper -- won't occur for things that have specialisations till a later phase, so -- it's ok to try for inlining first. + getSwitchChecker `thenSmpl` \ chkr -> + if switchIsOn chkr DontApplyRules then + -- Don't try rules + rebuild (mkApps (Var var) args') cont' + else + -- Try rules first case lookupRule in_scope var args' of Just (rule_name, rule_rhs, rule_args) -> tick (RuleFired rule_name) `thenSmpl_` zapSubstEnv (simplExprF rule_rhs (pushArgs emptySubstEnv rule_args cont')) -- See note above about zapping the substitution here - Nothing -> rebuild (mkApps (Var orig_var) args') cont' + Nothing -> rebuild (mkApps (Var var) args') cont' where get_str var = case getIdStrictness var of @@ -779,7 +811,7 @@ completeCall black_list_fn in_scope orig_var var cont discard_inline_cont | inline_call = discardInline cont | otherwise = cont - maybe_inline = callSiteInline black_listed inline_call + maybe_inline = callSiteInline black_listed inline_call occ var arg_infos interesting_cont Just unf_template = maybe_inline black_listed = black_list_fn var @@ -900,8 +932,25 @@ tick_case_of_error other = tick BottomFound %* * %************************************************************************ +NB: At one time I tried not pre/post-inlining top-level things, +even if they occur exactly once. Reason: + (a) some might appear as a function argument, so we simply + replace static allocation with dynamic allocation: + l = <...> + x = f x + becomes + x = f <...> + + (b) some top level things might be black listed + +HOWEVER, I found that some useful foldr/build fusion was lost (most +notably in spectral/hartel/parstof) because the foldr didn't see the build. + +Doing the dynamic allocation isn't a big deal, in fact, but losing the +fusion can be. + \begin{code} -preInlineUnconditionally :: InId -> Bool +preInlineUnconditionally :: Bool {- Black listed -} -> InId -> Bool -- Examines a bndr to see if it is used just once in a -- completely safe way, so that it is safe to discard the binding -- inline its RHS at the (unique) usage site, REGARDLESS of how @@ -922,17 +971,18 @@ preInlineUnconditionally :: InId -> Bool -- -- Evne RHSs labelled InlineMe aren't caught here, because -- there might be no benefit from inlining at the call site. - -- But things labelled 'IMustBeINLINEd' *are* caught. We use this - -- for the trivial bindings introduced by SimplUtils.mkRhsTyLam -preInlineUnconditionally bndr - = case getInlinePragma bndr of - IMustBeINLINEd -> True - ICanSafelyBeINLINEd NotInsideLam True -> True -- Not inside a lambda, - -- one occurrence ==> safe! - other -> False + +preInlineUnconditionally black_listed bndr + | black_listed || opt_SimplNoPreInlining = False + | otherwise = case getIdOccInfo bndr of + OneOcc in_lam once -> not in_lam && once + -- Not inside a lambda, one occurrence ==> safe! + other -> False -postInlineUnconditionally :: InId -> OutExpr -> Bool +postInlineUnconditionally :: Bool -- Black listed + -> OccInfo + -> InId -> OutExpr -> Bool -- Examines a (bndr = rhs) binding, AFTER the rhs has been simplified -- It returns True if it's ok to discard the binding and inline the -- RHS at every use site. @@ -941,29 +991,26 @@ postInlineUnconditionally :: InId -> OutExpr -> Bool -- We're at the binding site right now, and -- we'll get another opportunity when we get to the ocurrence(s) -postInlineUnconditionally bndr rhs - | isExportedId bndr - = False - | otherwise - = case getInlinePragma bndr of - IAmALoopBreaker -> False - - ICanSafelyBeINLINEd InsideLam one_branch -> exprIsTrivial rhs - -- Don't inline even WHNFs inside lambdas; doing so may - -- simply increase allocation when the function is called - -- This isn't the last chance; see NOTE above. - - ICanSafelyBeINLINEd not_in_lam one_branch -> one_branch || exprIsTrivial rhs - -- Was 'exprIsDupable' instead of 'exprIsTrivial' but the - -- decision about duplicating code is best left to callSiteInline - - other -> exprIsTrivial rhs -- Duplicating is *free* - -- NB: Even InlineMe and IMustBeINLINEd are ignored here - -- Why? Because we don't even want to inline them into the - -- RHS of constructor arguments. See NOTE above - -- NB: Even IMustBeINLINEd is ignored here: if the rhs is trivial - -- it's best to inline it anyway. We often get a=E; b=a - -- from desugaring, with both a and b marked NOINLINE. +postInlineUnconditionally black_listed occ_info bndr rhs + | isExportedId bndr || + black_listed || + loop_breaker = False -- Don't inline these + | otherwise = exprIsTrivial rhs -- Duplicating is free + -- Don't inline even WHNFs inside lambdas; doing so may + -- simply increase allocation when the function is called + -- This isn't the last chance; see NOTE above. + -- + -- NB: Even inline pragmas (e.g. IMustBeINLINEd) are ignored here + -- Why? Because we don't even want to inline them into the + -- RHS of constructor arguments. See NOTE above + -- + -- NB: Even NOINLINEis ignored here: if the rhs is trivial + -- it's best to inline it anyway. We often get a=E; b=a + -- from desugaring, with both a and b marked NOINLINE. + where + loop_breaker = case occ_info of + IAmALoopBreaker -> True + other -> False \end{code} @@ -1052,7 +1099,7 @@ rebuild scrut (Select _ bndr alts se cont) = tick (CaseElim bndr) `thenSmpl_` ( setSubstEnv se $ simplBinder bndr $ \ bndr' -> - completeBinding bndr bndr' False scrut $ + completeBinding bndr bndr' False False scrut $ simplExprF rhs1 cont) | otherwise @@ -1164,17 +1211,13 @@ rebuild_case scrut case_bndr alts se cont -- Deal with variable scrutinee - ( simplBinder case_bndr $ \ case_bndr' -> - substForVarScrut scrut case_bndr' $ \ zap_occ_info -> - let - case_bndr'' = zap_occ_info case_bndr' - in + ( simplCaseBinder scrut case_bndr $ \ case_bndr' zap_occ_info -> - -- Deal with the case alternaatives + -- Deal with the case alternatives simplAlts zap_occ_info scrut_cons - case_bndr'' better_alts cont' `thenSmpl` \ alts' -> + case_bndr' better_alts cont' `thenSmpl` \ alts' -> - mkCase scrut case_bndr'' alts' + mkCase scrut case_bndr' alts' ) `thenSmpl` \ case_expr -> -- Notice that the simplBinder, prepareCaseCont, etc, do *not* scope @@ -1194,7 +1237,7 @@ knownCon expr con args bndr alts se cont simplBinder bndr $ \ bndr' -> case findAlt con alts of (DEFAULT, bs, rhs) -> ASSERT( null bs ) - completeBinding bndr bndr' False expr $ + completeBinding bndr bndr' False False expr $ -- Don't use completeBeta here. The expr might be -- an unboxed literal, like 3, or a variable -- whose unfolding is an unboxed literal... and @@ -1211,7 +1254,7 @@ knownCon expr con args bndr alts se cont simplExprF rhs cont (DataCon dc, bs, rhs) -> ASSERT( length bs == length real_args ) - completeBinding bndr bndr' False expr $ + completeBinding bndr bndr' False False expr $ -- See note above extendSubstList bs (map mk real_args) $ simplExprF rhs cont @@ -1229,10 +1272,15 @@ prepareCaseCont :: [InAlt] -> SimplCont -- Polymorphic recursion here! prepareCaseCont [alt] cont thing_inside = thing_inside cont -prepareCaseCont alts cont thing_inside = mkDupableCont (coreAltsType alts) cont thing_inside +prepareCaseCont alts cont thing_inside = simplType (coreAltsType alts) `thenSmpl` \ alts_ty -> + mkDupableCont alts_ty cont thing_inside + -- At one time I passed in the un-simplified type, and simplified + -- it only if we needed to construct a join binder, but that + -- didn't work because we have to decompse function types + -- (using funResultTy) in mkDupableCont. \end{code} -substForVarScrut checks whether the scrutinee is a variable, v. +simplCaseBinder checks whether the scrutinee is a variable, v. If so, try to eliminate uses of v in the RHSs in favour of case_bndr; that way, there's a chance that v will now only be used once, and hence inlined. @@ -1249,20 +1297,22 @@ case RHS, and eliminate the second case, we get case x or { (a,b) -> a b } Urk! b is alive! Reason: the scrutinee was a variable, and case elimination -happened. Hence the zap_occ_info function returned by substForVarScrut +happened. Hence the zap_occ_info function returned by simplCaseBinder \begin{code} -substForVarScrut (Var v) case_bndr' thing_inside - | isLocallyDefined v -- No point for imported things - = modifyInScope (v `setIdUnfolding` mkUnfolding (Var case_bndr') - `setInlinePragma` IMustBeINLINEd) $ +simplCaseBinder (Var v) case_bndr thing_inside + = simplBinder (zap case_bndr) $ \ case_bndr' -> + modifyInScope v case_bndr' $ -- We could extend the substitution instead, but it would be -- a hack because then the substitution wouldn't be idempotent - -- any more. - thing_inside (\ bndr -> bndr `setInlinePragma` NoInlinePragInfo) + -- any more (v is an OutId). And this just just as well. + thing_inside case_bndr' zap + where + zap b = b `setIdOccInfo` NoOccInfo -substForVarScrut other_scrut case_bndr' thing_inside - = thing_inside (\ bndr -> bndr) -- NoOp on bndr +simplCaseBinder other_scrut case_bndr thing_inside + = simplBinder case_bndr $ \ case_bndr' -> + thing_inside case_bndr' (\ bndr -> bndr) -- NoOp on bndr \end{code} prepareCaseAlts does two things: @@ -1316,10 +1366,10 @@ prepareCaseAlts _ _ scrut_cons alts ---------------------- -simplAlts zap_occ_info scrut_cons case_bndr'' alts cont' +simplAlts zap_occ_info scrut_cons case_bndr' alts cont' = mapSmpl simpl_alt alts where - inst_tys' = case splitTyConApp_maybe (idType case_bndr'') of + inst_tys' = case splitTyConApp_maybe (idType case_bndr') of Just (tycon, inst_tys) -> inst_tys -- handled_cons is all the constructors that are dealt @@ -1330,21 +1380,24 @@ simplAlts zap_occ_info scrut_cons case_bndr'' alts cont' = -- In the default case we record the constructors that the -- case-binder *can't* be. -- We take advantage of any OtherCon info in the case scrutinee - modifyInScope (case_bndr'' `setIdUnfolding` mkOtherCon handled_cons) $ + modifyInScope case_bndr' (case_bndr' `setIdUnfolding` mkOtherCon handled_cons) $ simplExprC rhs cont' `thenSmpl` \ rhs' -> returnSmpl (DEFAULT, [], rhs') simpl_alt (con, vs, rhs) = -- Deal with the pattern-bound variables -- Mark the ones that are in ! positions in the data constructor - -- as certainly-evaluated - simplBinders (add_evals con vs) $ \ vs' -> + -- as certainly-evaluated. + -- NB: it happens that simplBinders does *not* erase the OtherCon + -- form of unfolding, so it's ok to add this info before + -- doing simplBinders + simplBinders (add_evals con vs) $ \ vs' -> -- Bind the case-binder to (Con args) let con_app = Con con (map Type inst_tys' ++ map varToCoreExpr vs') in - modifyInScope (case_bndr'' `setIdUnfolding` mkUnfolding con_app) $ + modifyInScope case_bndr' (case_bndr' `setIdUnfolding` mkUnfolding False con_app) $ simplExprC rhs cont' `thenSmpl` \ rhs' -> returnSmpl (con, vs', rhs') @@ -1378,7 +1431,7 @@ simplAlts zap_occ_info scrut_cons case_bndr'' alts cont' %************************************************************************ \begin{code} -mkDupableCont :: InType -- Type of the thing to be given to the continuation +mkDupableCont :: OutType -- Type of the thing to be given to the continuation -> SimplCont -> (SimplCont -> SimplM (OutStuff a)) -> SimplM (OutStuff a) @@ -1396,9 +1449,7 @@ mkDupableCont ty (InlinePlease cont) thing_inside mkDupableCont join_arg_ty (ArgOf _ cont_ty cont_fn) thing_inside = -- Build the RHS of the join point - simplType join_arg_ty `thenSmpl` \ join_arg_ty' -> - newId join_arg_ty' ( \ arg_id -> - getSwitchChecker `thenSmpl` \ chkr -> + newId join_arg_ty ( \ arg_id -> cont_fn (Var arg_id) `thenSmpl` \ (binds, (_, rhs)) -> returnSmpl (Lam (setOneShotLambda arg_id) (mkLets binds rhs)) ) `thenSmpl` \ join_rhs -> @@ -1456,9 +1507,12 @@ mkDupableCont ty (Select _ case_bndr alts se cont) thing_inside mkDupableAlt :: InId -> OutId -> SimplCont -> InAlt -> SimplM (OutStuff InAlt) -mkDupableAlt case_bndr case_bndr' (Stop _) alt@(con, bndrs, rhs) - | exprIsDupable rhs - = -- It is worth checking for a small RHS because otherwise we +mkDupableAlt case_bndr case_bndr' cont alt@(con, bndrs, rhs) + = simplBinders bndrs $ \ bndrs' -> + simplExprC rhs cont `thenSmpl` \ rhs' -> + + if (case cont of { Stop _ -> exprIsDupable rhs'; other -> False}) then + -- It is worth checking for a small RHS because otherwise we -- get extra let bindings that may cause an extra iteration of the simplifier to -- inline back in place. Quite often the rhs is just a variable or constructor. -- The Ord instance of Maybe in PrelMaybe.lhs, for example, took several extra @@ -1468,14 +1522,16 @@ mkDupableAlt case_bndr case_bndr' (Stop _) alt@(con, bndrs, rhs) -- -- But since the continuation is absorbed into the rhs, we only do this -- for a Stop continuation. - returnSmpl ([], alt) + -- + -- NB: we have to check the size of rhs', not rhs. + -- Duplicating a small InAlt might invalidate occurrence information + -- However, if it *is* dupable, we return the *un* simplified alternative, + -- because otherwise we'd need to pair it up with an empty subst-env. + -- (Remember we must zap the subst-env before re-simplifying something). + -- Rather than do this we simply agree to re-simplify the original (small) thing later. + returnSmpl ([], alt) -mkDupableAlt case_bndr case_bndr' cont alt@(con, bndrs, rhs) - | otherwise - = -- Not worth checking whether the rhs is small; the - -- inliner will inline it if so. - simplBinders bndrs $ \ bndrs' -> - simplExprC rhs cont `thenSmpl` \ rhs' -> + else let rhs_ty' = coreExprType rhs' (used_bndrs, used_bndrs') diff --git a/ghc/compiler/simplStg/StgVarInfo.lhs b/ghc/compiler/simplStg/StgVarInfo.lhs index 6e93773..27756b7 100644 --- a/ghc/compiler/simplStg/StgVarInfo.lhs +++ b/ghc/compiler/simplStg/StgVarInfo.lhs @@ -13,12 +13,12 @@ module StgVarInfo ( setStgVarInfo ) where import StgSyn -import Id ( setIdArity, getIdArity, Id ) +import Id ( setIdArity, getIdArity, setIdOccInfo, Id ) import VarSet import VarEnv import Var import Const ( Con(..) ) -import IdInfo ( ArityInfo(..), InlinePragInfo(..), +import IdInfo ( ArityInfo(..), OccInfo(..), setInlinePragInfo ) import PrimOp ( PrimOp(..) ) import TysWiredIn ( isForeignObjTy ) @@ -294,8 +294,8 @@ varsExpr (StgCase scrut _ _ bndr srt alts) let -- determine whether the default binder is dead or not bndr'= if (bndr `elementOfFVInfo` alts_fvs) - then modifyIdInfo (`setInlinePragInfo` NoInlinePragInfo) bndr - else modifyIdInfo (`setInlinePragInfo` IAmDead) bndr + then bndr `setIdOccInfo` NoOccInfo + else bndr `setIdOccInfo` IAmDead -- for a _ccall_GC_, some of the *arguments* need to live across the -- call (see findLiveArgs comments.), so we annotate them as being live diff --git a/ghc/compiler/specialise/Rules.lhs b/ghc/compiler/specialise/Rules.lhs index e27b0e2..864013b 100644 --- a/ghc/compiler/specialise/Rules.lhs +++ b/ghc/compiler/specialise/Rules.lhs @@ -24,11 +24,11 @@ import Subst ( Subst, InScopeSet, substBndr, lookupSubst, extendSubst, mkSubst, substEnv, setSubstEnv, emptySubst, isInScope, unBindSubst, bindSubstList, unBindSubstList, substInScope ) -import Id ( Id, getIdUnfolding, +import Id ( Id, getIdUnfolding, zapLamIdInfo, getIdSpecialisation, setIdSpecialisation, setIdNoDiscard, maybeModifyIdInfo, modifyIdInfo ) -import IdInfo ( zapLamIdInfo, setSpecInfo, specInfo ) +import IdInfo ( setSpecInfo, specInfo ) import Name ( Name, isLocallyDefined ) import Var ( isTyVar, isId ) import VarSet @@ -205,13 +205,13 @@ matchRule in_scope rule@(Rule rn tpl_vars tpl_args rhs) args where senv = substEnv subst go v = case lookupSubstEnv senv v of - Just (DoneEx ex) -> ex - Just (DoneTy ty) -> Type ty + Just (DoneEx ex) -> ex + Just (DoneTy ty) -> Type ty -- Substitution should bind them all! zapOccInfo bndr | isTyVar bndr = bndr - | otherwise = maybeModifyIdInfo zapLamIdInfo bndr + | otherwise = zapLamIdInfo bndr \end{code} \begin{code} diff --git a/ghc/compiler/specialise/Specialise.lhs b/ghc/compiler/specialise/Specialise.lhs index edc928b..d6f59f1 100644 --- a/ghc/compiler/specialise/Specialise.lhs +++ b/ghc/compiler/specialise/Specialise.lhs @@ -22,7 +22,7 @@ import Type ( Type, mkTyVarTy, splitSigmaTy, splitFunTysN, mkForAllTys, boxedTypeKind ) import Subst ( Subst, mkSubst, substTy, emptySubst, substBndrs, extendSubstList, - substExpr, substId, substIds, substAndCloneId, substAndCloneIds, lookupSubst + substId, substAndCloneId, substAndCloneIds, lookupIdSubst ) import Var ( TyVar, mkSysTyVar, setVarUnique ) import VarSet @@ -609,9 +609,9 @@ dump_specs var = pprCoreRules var (getIdSpecialisation var) \begin{code} specVar :: Subst -> Id -> CoreExpr -specVar subst v = case lookupSubst subst v of - Nothing -> Var v - Just (DoneEx e) -> e +specVar subst v = case lookupIdSubst subst v of + DoneEx e -> e + DoneId v _ -> Var v specExpr :: Subst -> CoreExpr -> SpecM (CoreExpr, UsageDetails) -- We carry a substitution down: diff --git a/ghc/compiler/stgSyn/CoreToStg.lhs b/ghc/compiler/stgSyn/CoreToStg.lhs index 4ff2d3a..271615f 100644 --- a/ghc/compiler/stgSyn/CoreToStg.lhs +++ b/ghc/compiler/stgSyn/CoreToStg.lhs @@ -24,7 +24,7 @@ import Id ( Id, mkSysLocal, idType, getIdStrictness, idUnique, isExportedId, mk externallyVisibleId, setIdUnique, idName, getIdDemandInfo, setIdType ) import Var ( Var, varType, modifyIdInfo ) -import IdInfo ( setDemandInfo, StrictnessInfo(..), zapIdInfoForStg ) +import IdInfo ( setDemandInfo, StrictnessInfo(..) ) import UsageSPUtils ( primOpUsgTys ) import DataCon ( DataCon, dataConName, dataConId ) import Demand ( Demand, isStrict, wwStrict, wwLazy ) diff --git a/ghc/compiler/stranal/StrictAnal.lhs b/ghc/compiler/stranal/StrictAnal.lhs index a2e8188..081e039 100644 --- a/ghc/compiler/stranal/StrictAnal.lhs +++ b/ghc/compiler/stranal/StrictAnal.lhs @@ -19,6 +19,7 @@ import Id ( idType, setIdStrictness, ) import IdInfo ( mkStrictnessInfo ) import CoreLint ( beginPass, endPass ) +import Type ( repType, splitFunTys ) import ErrUtils ( dumpIfSet ) import SaAbsInt import SaLib @@ -324,19 +325,24 @@ addStrictnessInfoToId -> Id -- Augmented with strictness addStrictnessInfoToId str_val abs_val binder body - = case collectBindersIgnoringNotes body of - -- It's imporant to use collectBindersIgnoringNotes, so that INLINE prags - -- don't inhibit strictness info. In particular, foldr is marked INLINE, - -- but we still want it to be strict in its third arg, so that - -- foldr k z (case e of p -> build g) - -- gets transformed to - -- case e of p -> foldr k z (build g) - -- [foldr is only inlined late in compilation, after strictness analysis] - (binders, rhs) -> binder `setIdStrictness` - mkStrictnessInfo strictness - where - tys = [idType id | id <- binders, isId id] - strictness = findStrictness tys str_val abs_val + = binder `setIdStrictness` mkStrictnessInfo strictness + where + arg_tys = collect_arg_tys (idType binder) + strictness = findStrictness arg_tys str_val abs_val + + collect_arg_tys ty + | null arg_tys = [] + | otherwise = arg_tys ++ collect_arg_tys res_ty + where + (arg_tys, res_ty) = splitFunTys (repType ty) + -- repType looks through for-alls and new-types. And since we look on the + -- type info, we aren't confused by INLINE prags. + -- In particular, foldr is marked INLINE, + -- but we still want it to be strict in its third arg, so that + -- foldr k z (case e of p -> build g) + -- gets transformed to + -- case e of p -> foldr k z (build g) + -- [foldr is only inlined late in compilation, after strictness analysis] \end{code} \begin{code} diff --git a/ghc/compiler/stranal/WorkWrap.lhs b/ghc/compiler/stranal/WorkWrap.lhs index d919b73..9ae59c4 100644 --- a/ghc/compiler/stranal/WorkWrap.lhs +++ b/ghc/compiler/stranal/WorkWrap.lhs @@ -14,17 +14,17 @@ import CmdLineOpts ( opt_UF_CreationThreshold , opt_D_verbose_core2core, opt_D_dump_worker_wrapper ) import CoreLint ( beginPass, endPass ) -import CoreUtils ( coreExprType, exprArity ) +import CoreUtils ( coreExprType, exprEtaExpandArity ) import Const ( Con(..) ) import DataCon ( DataCon ) import MkId ( mkWorkerId ) -import Id ( Id, idType, getIdStrictness, setIdArity, - setIdStrictness, getIdDemandInfo, +import Id ( Id, idType, getIdStrictness, setIdArity, isOneShotLambda, + setIdStrictness, getIdDemandInfo, getInlinePragma, setIdWorkerInfo, getIdCprInfo ) import VarSet import Type ( Type, isNewType, splitForAllTys, splitFunTys ) import IdInfo ( mkStrictnessInfo, noStrictnessInfo, StrictnessInfo(..), - CprInfo(..), exactArity + CprInfo(..), exactArity, InlinePragInfo(..) ) import Demand ( Demand, wwLazy ) import SaLib @@ -203,13 +203,14 @@ tryWW non_rec fn_id rhs ) || arity == 0 -- Don't split if it's not a function + || never_inline fn_id || not (do_strict_ww || do_cpr_ww || do_coerce_ww) = returnUs [ (fn_id, rhs) ] | otherwise -- Do w/w split - = mkWwBodies fun_ty arity wrap_dmds cpr_info `thenUs` \ (work_args, wrap_fn, work_fn) -> - getUniqueUs `thenUs` \ work_uniq -> + = mkWwBodies fun_ty arity wrap_dmds one_shots cpr_info `thenUs` \ (work_args, wrap_fn, work_fn) -> + getUniqueUs `thenUs` \ work_uniq -> let work_rhs = work_fn rhs work_demands = [getIdDemandInfo v | v <- work_args, isId v] @@ -230,7 +231,12 @@ tryWW non_rec fn_id rhs -- Worker first, because wrapper mentions it where fun_ty = idType fn_id - arity = exprArity rhs + arity = exprEtaExpandArity rhs + + -- Don't split something which is marked unconditionally NOINLINE + never_inline fn_id = case getInlinePragma fn_id of + IMustNotBeINLINEd False Nothing -> True + other -> False strictness_info = getIdStrictness fn_id StrictnessInfo arg_demands result_bot = strictness_info @@ -253,16 +259,17 @@ tryWW non_rec fn_id rhs | otherwise = noStrictnessInfo ------------------------------------------------------------- - cpr_info = getIdCprInfo fn_id - has_cpr_info = case cpr_info of + cpr_info = getIdCprInfo fn_id + do_cpr_ww = case cpr_info of CPRInfo _ -> True other -> False - do_cpr_ww = has_cpr_info - ------------------------------------------------------------- do_coerce_ww = check_for_coerce arity fun_ty + ------------------------------------------------------------- + one_shots = get_one_shots rhs + -- See if there's a Coerce before we run out of arity; -- if so, it's worth trying a w/w split. Reason: we find -- functions like f = coerce (\s -> e) @@ -278,6 +285,16 @@ check_for_coerce arity ty where (_, tau) = splitForAllTys ty (arg_tys, res_ty) = splitFunTys tau + +-- If the original function has one-shot arguments, it is important to +-- make the wrapper and worker have corresponding one-shot arguments too. +-- Otherwise we spuriously float stuff out of case-expression join points, +-- which is very annoying. +get_one_shots (Lam b e) + | isId b = isOneShotLambda b : get_one_shots e + | otherwise = get_one_shots e +get_one_shots (Note _ e) = get_one_shots e +get_one_shots other = noOneShotInfo \end{code} @@ -299,8 +316,10 @@ mkWrapper :: Type -- Wrapper type -> UniqSM (Id -> CoreExpr) -- Wrapper body, missing worker Id mkWrapper fun_ty arity demands cpr_info - = mkWwBodies fun_ty arity demands cpr_info `thenUs` \ (_, wrap_fn, _) -> + = mkWwBodies fun_ty arity demands noOneShotInfo cpr_info `thenUs` \ (_, wrap_fn, _) -> returnUs wrap_fn + +noOneShotInfo = repeat False \end{code} diff --git a/ghc/compiler/stranal/WwLib.lhs b/ghc/compiler/stranal/WwLib.lhs index 1a6c4de..170e10b 100644 --- a/ghc/compiler/stranal/WwLib.lhs +++ b/ghc/compiler/stranal/WwLib.lhs @@ -14,6 +14,7 @@ module WwLib ( import CoreSyn import CoreUtils ( coreExprType ) import Id ( Id, idType, mkSysLocal, getIdDemandInfo, setIdDemandInfo, + isOneShotLambda, setOneShotLambda, mkWildId, setIdInfo ) import IdInfo ( CprInfo(..), noCprInfo, vanillaIdInfo ) @@ -34,8 +35,9 @@ import BasicTypes ( NewOrData(..), Arity ) import Var ( TyVar, IdOrTyVar ) import UniqSupply ( returnUs, thenUs, getUniqueUs, getUniquesUs, mapUs, UniqSM ) -import Util ( zipWithEqual, zipEqual ) +import Util ( zipWithEqual, zipEqual, lengthExceeds ) import Outputable +import List ( zipWith4 ) \end{code} @@ -223,17 +225,20 @@ allAbsent ds = all absent ds mkWwBodies :: Type -- Type of original function -> Arity -- Arity of original function -> [Demand] -- Strictness of original function + -> [Bool] -- One-shot-ness of the function -> CprInfo -- Result of CPR analysis -> UniqSM ([IdOrTyVar], -- Worker args Id -> CoreExpr, -- Wrapper body, lacking only the worker Id CoreExpr -> CoreExpr) -- Worker body, lacking the original function rhs -mkWwBodies fun_ty arity demands cpr_info - = WARN( arity /= length demands, text "mkWrapper" <+> ppr fun_ty <+> ppr arity <+> ppr demands ) - mkWWargs fun_ty arity demands `thenUs` \ (wrap_args, wrap_fn_args, work_fn_args, res_ty) -> - mkWWstr wrap_args `thenUs` \ (work_args, wrap_fn_str, work_fn_str) -> - mkWWcpr res_ty cpr_info `thenUs` \ (wrap_fn_cpr, work_fn_cpr, cpr_res_ty) -> - mkWWfixup cpr_res_ty work_args `thenUs` \ (wrap_fn_fixup, work_fn_fixup) -> +mkWwBodies fun_ty arity demands one_shots cpr_info + = WARN( not (lengthExceeds demands (arity-1)) + || not (lengthExceeds one_shots (arity-1)), + text "mkWrapper" <+> ppr fun_ty <+> ppr arity <+> ppr (take arity demands) <+> ppr (take arity one_shots) ) + mkWWargs fun_ty arity demands one_shots `thenUs` \ (wrap_args, wrap_fn_args, work_fn_args, res_ty) -> + mkWWstr wrap_args `thenUs` \ (work_args, wrap_fn_str, work_fn_str) -> + mkWWcpr res_ty cpr_info `thenUs` \ (wrap_fn_cpr, work_fn_cpr, cpr_res_ty) -> + mkWWfixup cpr_res_ty work_args `thenUs` \ (wrap_fn_fixup, work_fn_fixup) -> returnUs (work_args, Note InlineMe . wrap_fn_args . wrap_fn_cpr . wrap_fn_str . wrap_fn_fixup . Var, @@ -278,25 +283,28 @@ the \x to get what we want. -- It chomps bites off foralls, arrows, newtypes -- and keeps repeating that until it's satisfied the supplied arity -mkWWargs :: Type -> Int -> [Demand] - -> UniqSM ([IdOrTyVar], -- Wrapper args - CoreExpr -> CoreExpr, -- Wrapper fn - CoreExpr -> CoreExpr, -- Worker fn - Type) -- Type of wrapper body +mkWWargs :: Type -> Arity + -> [Demand] -> [Bool] -- Both these will in due course be derived + -- from the type. The [Bool] is True for a one-shot arg. + -> UniqSM ([IdOrTyVar], -- Wrapper args + CoreExpr -> CoreExpr, -- Wrapper fn + CoreExpr -> CoreExpr, -- Worker fn + Type) -- Type of wrapper body -mkWWargs fun_ty arity demands +mkWWargs fun_ty arity demands one_shots | arity == 0 = returnUs ([], id, id, fun_ty) | otherwise = getUniquesUs n_args `thenUs` \ wrap_uniqs -> let - val_args = zipWith3 mk_wrap_arg wrap_uniqs arg_tys demands + val_args = zipWith4 mk_wrap_arg wrap_uniqs arg_tys demands one_shots wrap_args = tyvars ++ val_args in mkWWargs body_rep_ty (arity - n_args) - (drop n_args demands) `thenUs` \ (more_wrap_args, wrap_fn_args, work_fn_args, res_ty) -> + (drop n_args demands) + (drop n_args one_shots) `thenUs` \ (more_wrap_args, wrap_fn_args, work_fn_args, res_ty) -> returnUs (wrap_args ++ more_wrap_args, mkLams wrap_args . wrap_coerce_fn . wrap_fn_args, @@ -319,7 +327,11 @@ mkWWargs fun_ty arity demands applyToVars :: [IdOrTyVar] -> CoreExpr -> CoreExpr applyToVars vars fn = mkVarApps fn vars -mk_wrap_arg uniq ty dmd = setIdDemandInfo (mkSysLocal SLIT("w") uniq ty) dmd +mk_wrap_arg uniq ty dmd one_shot + = set_one_shot one_shot (setIdDemandInfo (mkSysLocal SLIT("w") uniq ty) dmd) + where + set_one_shot True id = setOneShotLambda id + set_one_shot False id = id \end{code} @@ -401,7 +413,7 @@ mk_ww_str (arg : ds) getUniquesUs (length inst_con_arg_tys) `thenUs` \ uniqs -> let unpk_args = zipWith mk_ww_local uniqs inst_con_arg_tys - unpk_args_w_ds = zipWithEqual "mk_ww_str" setIdDemandInfo unpk_args cs + unpk_args_w_ds = zipWithEqual "mk_ww_str" set_worker_arg_info unpk_args cs in mk_ww_str (unpk_args_w_ds ++ ds) `thenUs` \ (worker_args, wrap_fn, work_fn) -> returnUs (worker_args, @@ -414,6 +426,14 @@ mk_ww_str (arg : ds) other_demand -> mk_ww_str ds `thenUs` \ (worker_args, wrap_fn, work_fn) -> returnUs (arg : worker_args, wrap_fn, work_fn) + where + -- If the wrapper argument is a one-shot lambda, then + -- so should (all) the corresponding worker arguments be + -- This bites when we do w/w on a case join point + set_worker_arg_info worker_arg demand = set_one_shot (setIdDemandInfo worker_arg demand) + + set_one_shot | isOneShotLambda arg = setOneShotLambda + | otherwise = \x -> x \end{code} @@ -451,8 +471,8 @@ mkWWcpr body_ty (CPRInfo cpr_args) work_wild = mk_ww_local work_uniq body_ty arg = mk_ww_local arg_uniq con_arg_ty1 in - returnUs (\ wkr_call -> mkConApp data_con (map Type tycon_arg_tys ++ [wkr_call]), - \ body -> Case body work_wild [(DataCon data_con, [arg], Var arg)], + returnUs (\ wkr_call -> Case wkr_call arg [(DEFAULT, [], mkConApp data_con (map Type tycon_arg_tys ++ [Var arg]))], + \ body -> Case body work_wild [(DataCon data_con, [arg], Var arg)], con_arg_ty1) | otherwise -- The general case @@ -502,7 +522,7 @@ splitProductType fname ty text "splitProductType hack: I happened!" <+> ppr ty ) (tycon, tycon_args, con, dataConArgTys con tycon_args) - Nothing -> pprPanic (fname ++ ": not a product") (ppr ty) + other -> pprPanic (fname ++ ": not a product") (ppr ty) \end{code} diff --git a/ghc/compiler/typecheck/TcBinds.lhs b/ghc/compiler/typecheck/TcBinds.lhs index 3fb4cdf..8c0ac2a 100644 --- a/ghc/compiler/typecheck/TcBinds.lhs +++ b/ghc/compiler/typecheck/TcBinds.lhs @@ -58,6 +58,7 @@ import Bag import Util ( isIn ) import Maybes ( maybeToBool ) import BasicTypes ( TopLevelFlag(..), RecFlag(..), isNotTopLevel ) +import FiniteMap ( listToFM, lookupFM ) import SrcLoc ( SrcLoc ) import Outputable \end{code} @@ -354,8 +355,14 @@ tcBindWithSigs top_lvl mbind tc_ty_sigs inline_sigs is_rec exports = zipWith mk_export binder_names zonked_mono_ids dict_tys = map idType dicts_bound - inlines = mkNameSet [name | InlineSig name loc <- inline_sigs] - no_inlines = mkNameSet [name | NoInlineSig name loc <- inline_sigs] + inlines = mkNameSet [name | InlineSig name _ loc <- inline_sigs] + no_inlines = listToFM ([(name, IMustNotBeINLINEd False phase) | NoInlineSig name phase loc <- inline_sigs] ++ + [(name, IMustNotBeINLINEd True phase) | InlineSig name phase loc <- inline_sigs, maybeToBool phase]) + -- "INLINE n foo" means inline foo, but not until at least phase n + -- "NOINLINE n foo" means don't inline foo until at least phase n, and even + -- then only if it is small enough etc. + -- "NOINLINE foo" means don't inline foo ever, which we signal with a (IMustNotBeINLINEd Nothing) + -- See comments in CoreUnfold.blackListed for the Authorised Version mk_export binder_name zonked_mono_id = (tyvars, @@ -408,8 +415,9 @@ justPatBindings (AndMonoBinds b1 b2) binds = justPatBindings other_bind binds = binds attachNoInlinePrag no_inlines bndr - | idName bndr `elemNameSet` no_inlines = bndr `setInlinePragma` IMustNotBeINLINEd - | otherwise = bndr + = case lookupFM no_inlines (idName bndr) of + Just prag -> bndr `setInlinePragma` prag + Nothing -> bndr \end{code} Polymorphic recursion diff --git a/ghc/compiler/typecheck/TcClassDcl.lhs b/ghc/compiler/typecheck/TcClassDcl.lhs index 264776a..ec003b4 100644 --- a/ghc/compiler/typecheck/TcClassDcl.lhs +++ b/ghc/compiler/typecheck/TcClassDcl.lhs @@ -179,10 +179,7 @@ tcClassDecl1 rec_env rec_inst_mapper rec_vrcs dict_component_tys tycon dict_con_id - -- In general, constructors don't have to be inlined, but this one - -- does, because we don't make a top level binding for it. dict_con_id = mkDataConId dict_con - `setInlinePragma` IMustBeINLINEd argvrcs = lookupWithDefaultFM rec_vrcs (pprPanic "tcClassDecl1: argvrcs:" $ ppr tycon_name) @@ -614,10 +611,10 @@ tcMethodBind clas origin inst_tyvars inst_tys inst_theta find_prags meth_name [] = [] find_prags meth_name (SpecSig name ty loc : prags) | name == sel_name = SpecSig meth_name ty loc : find_prags meth_name prags - find_prags meth_name (InlineSig name loc : prags) - | name == sel_name = InlineSig meth_name loc : find_prags meth_name prags - find_prags meth_name (NoInlineSig name loc : prags) - | name == sel_name = NoInlineSig meth_name loc : find_prags meth_name prags + find_prags meth_name (InlineSig name phase loc : prags) + | name == sel_name = InlineSig meth_name phase loc : find_prags meth_name prags + find_prags meth_name (NoInlineSig name phase loc : prags) + | name == sel_name = NoInlineSig meth_name phase loc : find_prags meth_name prags find_prags meth_name (prag:prags) = find_prags meth_name prags mk_default_bind local_meth_name loc diff --git a/ghc/compiler/typecheck/TcExpr.lhs b/ghc/compiler/typecheck/TcExpr.lhs index f3903d7..0a6b2c0 100644 --- a/ghc/compiler/typecheck/TcExpr.lhs +++ b/ghc/compiler/typecheck/TcExpr.lhs @@ -459,8 +459,9 @@ tcMonoExpr (ExplicitTuple exprs boxed) res_ty `thenTc` \ (exprs', lies) -> returnTc (ExplicitTuple exprs' boxed, plusLIEs lies) -tcMonoExpr (RecordCon con_name rbinds) res_ty - = tcId con_name `thenNF_Tc` \ (con_expr, con_lie, con_tau) -> +tcMonoExpr expr@(RecordCon con_name rbinds) res_ty + = tcAddErrCtxt (recordConCtxt expr) $ + tcId con_name `thenNF_Tc` \ (con_expr, con_lie, con_tau) -> let (_, record_ty) = splitFunTys con_tau in @@ -522,8 +523,8 @@ tcMonoExpr (RecordCon con_name rbinds) res_ty -- -- All this is done in STEP 4 below. -tcMonoExpr (RecordUpd record_expr rbinds) res_ty - = tcAddErrCtxt recordUpdCtxt $ +tcMonoExpr expr@(RecordUpd record_expr rbinds) res_ty + = tcAddErrCtxt (recordUpdCtxt expr) $ -- STEP 0 -- Check that the field names are really field names @@ -1091,7 +1092,8 @@ badFieldsUpd rbinds where fields = [field | (field, _, _) <- rbinds] -recordUpdCtxt = ptext SLIT("In a record update construct") +recordUpdCtxt expr = ptext SLIT("In the record update:") <+> ppr expr +recordConCtxt expr = ptext SLIT("In the record construction:") <+> ppr expr notSelector field = hsep [quotes (ppr field), ptext SLIT("is not a record selector")] @@ -1112,7 +1114,6 @@ missingStrictFieldCon con field missingFieldCon :: Name -> Name -> SDoc missingFieldCon con field - = hsep [ptext SLIT("Constructor") <+> quotes (ppr con), - ptext SLIT("does not have the field"), quotes (ppr field)] - + = hsep [ptext SLIT("Field") <+> quotes (ppr field), + ptext SLIT("is not initialised")] \end{code} diff --git a/ghc/compiler/typecheck/TcIfaceSig.lhs b/ghc/compiler/typecheck/TcIfaceSig.lhs index c4a59f3..2cf4095 100644 --- a/ghc/compiler/typecheck/TcIfaceSig.lhs +++ b/ghc/compiler/typecheck/TcIfaceSig.lhs @@ -89,17 +89,14 @@ tcIdInfo unf_env in_scope_vars name ty info info_ins tcPrag info (HsNoCafRefs) = returnTc (info `setCafInfo` NoCafRefs) tcPrag info (HsCprInfo cpr_info) = returnTc (info `setCprInfo` cpr_info) - tcPrag info (HsUnfold inline_prag maybe_expr) - = (case maybe_expr of - Just expr -> tcPragExpr unf_env name in_scope_vars expr - Nothing -> returnNF_Tc Nothing - ) `thenNF_Tc` \ maybe_expr' -> + tcPrag info (HsUnfold inline_prag expr) + = tcPragExpr unf_env name in_scope_vars expr `thenNF_Tc` \ maybe_expr' -> let -- maybe_expr doesn't get looked at if the unfolding -- is never inspected; so the typecheck doesn't even happen unfold_info = case maybe_expr' of Nothing -> noUnfolding - Just expr' -> mkUnfolding expr' + Just expr' -> mkTopUnfolding expr' info1 = info `setUnfoldingInfo` unfold_info info2 = info1 `setInlinePragInfo` inline_prag in @@ -122,7 +119,7 @@ tcWorkerInfo unf_env ty info worker_name let -- Watch out! We can't pull on unf_env too eagerly! info' = case explicitLookupValue unf_env worker_name of - Just worker_id -> info `setUnfoldingInfo` mkUnfolding (wrap_fn worker_id) + Just worker_id -> info `setUnfoldingInfo` mkTopUnfolding (wrap_fn worker_id) `setWorkerInfo` Just worker_id Nothing -> pprTrace "tcWorkerInfo failed:" (ppr worker_name) info diff --git a/ghc/compiler/types/TyCon.lhs b/ghc/compiler/types/TyCon.lhs index d79f003..14180b2 100644 --- a/ghc/compiler/types/TyCon.lhs +++ b/ghc/compiler/types/TyCon.lhs @@ -45,7 +45,7 @@ import {-# SOURCE #-} TypeRep ( Type, Kind, SuperKind ) -- Should just be Type(Type), but this fails due to bug present up to -- and including 4.02 involving slurping of hi-boot files. Bug is now fixed. -import {-# SOURCE #-} DataCon ( DataCon ) +import {-# SOURCE #-} DataCon ( DataCon, isExistentialDataCon ) import Class ( Class ) import Var ( TyVar ) @@ -276,10 +276,16 @@ isDataTyCon other = False isNewTyCon (AlgTyCon {algTyConFlavour = NewType}) = True isNewTyCon other = False --- A "product" tycon is non-recursive and has one constructor, and is *not* an unboxed tuple +-- A "product" tycon is +-- non-recursive +-- has one constructor, +-- is *not* existential +-- is *not* an unboxed tuple -- whether DataType or NewType -isProductTyCon (AlgTyCon {dataCons = [c], algTyConRec = NonRecursive}) = True -isProductTyCon (TupleTyCon { tyConBoxed = boxed }) = boxed +isProductTyCon (AlgTyCon {dataCons = [data_con], algTyConRec = NonRecursive}) + = not (isExistentialDataCon data_con) +isProductTyCon (TupleTyCon { tyConBoxed = boxed }) + = boxed isProductTyCon other = False isSynTyCon (SynTyCon {}) = True diff --git a/ghc/compiler/usageSP/UsageSPInf.lhs b/ghc/compiler/usageSP/UsageSPInf.lhs index 7fe753e..2b69448 100644 --- a/ghc/compiler/usageSP/UsageSPInf.lhs +++ b/ghc/compiler/usageSP/UsageSPInf.lhs @@ -30,7 +30,7 @@ import DataCon ( dataConType ) import Const ( Con(..), Literal(..), literalType ) import Var ( Var, UVar, varType, setVarType, mkUVar, modifyIdInfo ) import IdInfo ( setLBVarInfo, LBVarInfo(..) ) -import Id ( idMustBeINLINEd, isExportedId ) +import Id ( mayHaveNoBinding, isExportedId ) import Name ( isLocallyDefined ) import VarEnv import VarSet @@ -394,7 +394,7 @@ lookupVar :: VarEnv Var -> Var -> Var --lookupVar ve v = error "lookupVar unimplemented" lookupVar ve v = case lookupVarEnv ve v of Just v' -> v' - Nothing -> ASSERT( not (isLocallyDefined v) || (idMustBeINLINEd v) ) + Nothing -> ASSERT( not (isLocallyDefined v) || (mayHaveNoBinding v) ) ASSERT( isUsgTy (varType v) ) v diff --git a/ghc/compiler/usageSP/UsageSPUtils.lhs b/ghc/compiler/usageSP/UsageSPUtils.lhs index e41609a..d421d1b 100644 --- a/ghc/compiler/usageSP/UsageSPUtils.lhs +++ b/ghc/compiler/usageSP/UsageSPUtils.lhs @@ -27,7 +27,7 @@ module UsageSPUtils ( AnnotM(AnnotM), initAnnotM, import CoreSyn import Const ( Con(..), Literal(..) ) import Var ( IdOrTyVar, varName, varType, setVarType, mkUVar ) -import Id ( idMustBeINLINEd, isExportedId ) +import Id ( mayHaveNoBinding, isExportedId ) import Name ( isLocallyDefined ) import TypeRep ( Type(..), TyNote(..) ) -- friend import Type ( UsageAnn(..), isUsgTy, splitFunTys ) @@ -167,7 +167,7 @@ for us. @sigVarTyMF@ checks the variable to see how to set the flags. @hasLocalDef@ tells us if the given variable has an actual local definition that we can play with. This is not quite the same as -@isLocallyDefined@, since @IMustBeINLINEd@ things (usually) don't have +@isLocallyDefined@, since @mayHaveNoBindingId@ things (usually) don't have a local definition - the simplifier will inline whatever their unfolding is anyway. We treat these as if they were externally defined, since we don't have access to their definition (at least not @@ -182,7 +182,7 @@ assumed true (exactly) of all imported ids. \begin{code} hasLocalDef :: IdOrTyVar -> Bool hasLocalDef var = isLocallyDefined var - && not (idMustBeINLINEd var) + && not (mayHaveNoBinding var) hasUsgInfo :: IdOrTyVar -> Bool hasUsgInfo var = (not . isLocallyDefined) var diff --git a/ghc/lib/std/Ix.lhs b/ghc/lib/std/Ix.lhs index da7a5e4..e7ee204 100644 --- a/ghc/lib/std/Ix.lhs +++ b/ghc/lib/std/Ix.lhs @@ -101,9 +101,9 @@ instance Ix Int where index b i | inRange b i = unsafeIndex b i | otherwise = indexError b i "Int" + {-# INLINE inRange #-} inRange (I# m,I# n) (I# i) = m <=# i && i <=# n - ---------------------------------------------------------------------- instance Ix Integer where {-# INLINE range #-} diff --git a/ghc/lib/std/PrelBase.lhs b/ghc/lib/std/PrelBase.lhs index b48a3e6..9f8cb50 100644 --- a/ghc/lib/std/PrelBase.lhs +++ b/ghc/lib/std/PrelBase.lhs @@ -149,29 +149,32 @@ foldr k z xs = go xs go (x:xs) = x `k` go xs build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a] -{-# INLINE build #-} +{-# INLINE 2 build #-} -- The INLINE is important, even though build is tiny, -- because it prevents [] getting inlined in the version that -- appears in the interface file. If [] *is* inlined, it -- won't match with [] appearing in rules in an importing module. + -- + -- The "2" says to inline in phase 2 + build g = g (:) [] augment :: forall a. (forall b. (a->b->b) -> b -> b) -> [a] -> [a] -{-# INLINE augment #-} +{-# INLINE 2 augment #-} augment g xs = g (:) xs {-# RULES -"fold/build" forall k,z,g::forall b. (a->b->b) -> b -> b . +"fold/build" forall k z (g::forall b. (a->b->b) -> b -> b) . foldr k z (build g) = g k z -"foldr/augment" forall k,z,xs,g::forall b. (a->b->b) -> b -> b . +"foldr/augment" forall k z xs (g::forall b. (a->b->b) -> b -> b) . foldr k z (augment g xs) = g k (foldr k z xs) "foldr/id" foldr (:) [] = \x->x -"foldr/app" forall xs, ys. foldr (:) ys xs = append xs ys +"foldr/app" forall xs ys. foldr (:) ys xs = append xs ys -"foldr/cons" forall k,z,x,xs. foldr k z (x:xs) = k x (foldr k z xs) -"foldr/nil" forall k,z. foldr k z [] = z +"foldr/cons" forall k z x xs. foldr k z (x:xs) = k x (foldr k z xs) +"foldr/nil" forall k z. foldr k z [] = z #-} \end{code} @@ -193,7 +196,7 @@ mapList _ [] = [] mapList f (x:xs) = f x : mapList f xs {-# RULES -"mapFB" forall c,f,g. mapFB (mapFB c f) g = mapFB c (f.g) +"mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g) "mapList" forall f. foldr (mapFB (:) f) [] = mapList f #-} \end{code} diff --git a/ghc/lib/std/PrelList.lhs b/ghc/lib/std/PrelList.lhs index b1a0b7c..2fecdf2 100644 --- a/ghc/lib/std/PrelList.lhs +++ b/ghc/lib/std/PrelList.lhs @@ -64,9 +64,9 @@ badHead = errorEmptyList "head" -- This rule is useful in cases like -- head [y | (x,y) <- ps, x==t] {-# RULES -"head/build" forall g::forall b.(Bool->b->b)->b->b . +"head/build" forall (g::forall b.(Bool->b->b)->b->b) . head (build g) = g (\x _ -> x) badHead -"head/augment" forall xs, g::forall b. (a->b->b) -> b -> b . +"head/augment" forall xs (g::forall b. (a->b->b) -> b -> b) . head (augment g xs) = g (\x _ -> x) (head xs) #-} @@ -125,7 +125,7 @@ filterFB c p x r | p x = x `c` r | otherwise = r {-# RULES -"filterFB" forall c,p,q. filterFB (filterFB c p) q = filterFB c (\x -> p x && q x) +"filterFB" forall c p q. filterFB (filterFB c p) q = filterFB c (\x -> p x && q x) "filterList" forall p. foldr (filterFB (:) p) [] = filterList p #-} @@ -361,9 +361,9 @@ or [] = False or (x:xs) = x || or xs {-# RULES -"and/build" forall g::forall b.(Bool->b->b)->b->b . +"and/build" forall (g::forall b.(Bool->b->b)->b->b) . and (build g) = g (&&) True -"or/build" forall g::forall b.(Bool->b->b)->b->b . +"or/build" forall (g::forall b.(Bool->b->b)->b->b) . or (build g) = g (||) False #-} #endif @@ -381,9 +381,9 @@ any p (x:xs) = p x || any p xs all _ [] = True all p (x:xs) = p x && all p xs {-# RULES -"any/build" forall p, g::forall b.(a->b->b)->b->b . +"any/build" forall p (g::forall b.(a->b->b)->b->b) . any p (build g) = g ((||) . p) False -"all/build" forall p, g::forall b.(a->b->b)->b->b . +"all/build" forall p (g::forall b.(a->b->b)->b->b) . all p (build g) = g ((&&) . p) True #-} #endif @@ -475,10 +475,10 @@ foldr2_right k _z y r (x:xs) = k x y (r xs) -- foldr2 k z xs ys = foldr (foldr2_left k z) (\_ -> z) xs ys -- foldr2 k z xs ys = foldr (foldr2_right k z) (\_ -> z) ys xs {-# RULES -"foldr2/left" forall k,z,ys,g::forall b.(a->b->b)->b->b . +"foldr2/left" forall k z ys (g::forall b.(a->b->b)->b->b) . foldr2 k z (build g) ys = g (foldr2_left k z) (\_ -> z) ys -"foldr2/right" forall k,z,xs,g::forall b.(a->b->b)->b->b . +"foldr2/right" forall k z xs (g::forall b.(a->b->b)->b->b) . foldr2 k z xs (build g) = g (foldr2_right k z) (\_ -> z) xs #-} \end{code} diff --git a/ghc/lib/std/PrelNumExtra.lhs b/ghc/lib/std/PrelNumExtra.lhs index c96617f..c9fa3c5 100644 --- a/ghc/lib/std/PrelNumExtra.lhs +++ b/ghc/lib/std/PrelNumExtra.lhs @@ -5,6 +5,7 @@ \section[PrelNumExtra]{Module @PrelNumExtra@} \begin{code} +{-# OPTIONS -fno-cpr-analyse #-} {-# OPTIONS -fno-implicit-prelude #-} {-# OPTIONS -H20m #-} diff --git a/ghc/lib/std/PrelST.lhs b/ghc/lib/std/PrelST.lhs index a45c8b2..9d8855c 100644 --- a/ghc/lib/std/PrelST.lhs +++ b/ghc/lib/std/PrelST.lhs @@ -114,8 +114,10 @@ All calls to @f@ will share a {\em single} array! End SLPJ 95/04. runST :: (forall s. ST s a) -> a runST st = runSTRep (case st of { ST st_rep -> st_rep }) --- I'm letting runSTRep be inlined *after* full laziness +-- I'm only letting runSTRep be inlined right at the end, in particular *after* full laziness +-- That's what the "INLINE 100" says. -- SLPJ Apr 99 +{-# INLINE 100 runSTRep #-} runSTRep :: (forall s. STRep s a) -> a runSTRep st_rep = case st_rep realWorld# of (# _, r #) -> r diff --git a/ghc/lib/std/PrelShow.lhs b/ghc/lib/std/PrelShow.lhs index b09d74d..c631265 100644 --- a/ghc/lib/std/PrelShow.lhs +++ b/ghc/lib/std/PrelShow.lhs @@ -57,7 +57,7 @@ class Show a where showsPrec _ x s = show x ++ s show x = shows x "" - showList ls = showList__ shows ls + showList ls s = showList__ shows ls s showList__ :: (a -> ShowS) -> [a] -> ShowS showList__ _ [] s = "[]" ++ s @@ -95,26 +95,31 @@ instance Show Char where showsPrec _ c = showChar '\'' . showLitChar c . showChar '\'' showList cs = showChar '"' . showl cs - where showl "" = showChar '"' - showl ('"':xs) = showString "\\\"" . showl xs - showl (x:xs) = showLitChar x . showl xs + where showl "" s = showChar '"' s + showl ('"':xs) s = showString "\\\"" (showl xs s) + showl (x:xs) s = showLitChar x (showl xs s) + -- Making 's' an explicit parameter makes it clear to GHC + -- that showl has arity 2, which avoids it allocating an extra lambda + -- The sticking point is the recursive call to (showl xs), which + -- it can't figure out would be ok with arity 2. instance Show Int where showsPrec p n = showSignedInt p n instance Show a => Show (Maybe a) where - showsPrec _p Nothing = showString "Nothing" - showsPrec p@(I# p#) (Just x) - = showParen (p# >=# 10#) $ - showString "Just " . - showsPrec (I# 10#) x + showsPrec _p Nothing s = showString "Nothing" s + showsPrec p@(I# p#) (Just x) s + = (showParen (p# >=# 10#) $ + showString "Just " . + showsPrec (I# 10#) x) s instance (Show a, Show b) => Show (Either a b) where - showsPrec p@(I# p#) e = - showParen (p# >=# 10#) $ - case e of + showsPrec p@(I# p#) e s = + (showParen (p# >=# 10#) $ + case e of Left a -> showString "Left " . showsPrec (I# 10#) a - Right b -> showString "Right " . showsPrec (I# 10#) b + Right b -> showString "Right " . showsPrec (I# 10#) b) + s \end{code} @@ -126,27 +131,37 @@ instance (Show a, Show b) => Show (Either a b) where %********************************************************* \begin{code} +-- The explicit 's' parameters are important +-- Otherwise GHC thinks that "shows x" might take a lot of work to compute +-- and generates defns like +-- showsPrec _ (x,y) = let sx = shows x; sy = shows y in +-- \s -> showChar '(' (sx (showChar ',' (sy (showChar ')' s)))) + instance (Show a, Show b) => Show (a,b) where - showsPrec _ (x,y) = showChar '(' . shows x . showChar ',' . - shows y . showChar ')' + showsPrec _ (x,y) s = (showChar '(' . shows x . showChar ',' . + shows y . showChar ')') + s instance (Show a, Show b, Show c) => Show (a, b, c) where - showsPrec _ (x,y,z) = showChar '(' . shows x . showChar ',' . - shows y . showChar ',' . - shows z . showChar ')' + showsPrec _ (x,y,z) s = (showChar '(' . shows x . showChar ',' . + shows y . showChar ',' . + shows z . showChar ')') + s instance (Show a, Show b, Show c, Show d) => Show (a, b, c, d) where - showsPrec _ (w,x,y,z) = showChar '(' . shows w . showChar ',' . - shows x . showChar ',' . - shows y . showChar ',' . - shows z . showChar ')' + showsPrec _ (w,x,y,z) s = (showChar '(' . shows w . showChar ',' . + shows x . showChar ',' . + shows y . showChar ',' . + shows z . showChar ')') + s instance (Show a, Show b, Show c, Show d, Show e) => Show (a, b, c, d, e) where - showsPrec _ (v,w,x,y,z) = showChar '(' . shows v . showChar ',' . - shows w . showChar ',' . - shows x . showChar ',' . - shows y . showChar ',' . - shows z . showChar ')' + showsPrec _ (v,w,x,y,z) s = (showChar '(' . shows v . showChar ',' . + shows w . showChar ',' . + shows x . showChar ',' . + shows y . showChar ',' . + shows z . showChar ')') + s \end{code} @@ -177,7 +192,7 @@ Code specific for characters \begin{code} showLitChar :: Char -> ShowS -showLitChar c | c > '\DEL' = showChar '\\' . protectEsc isDigit (shows (ord c)) +showLitChar c | c > '\DEL' = \s -> showChar '\\' (protectEsc isDigit (shows (ord c)) s) showLitChar '\DEL' = showString "\\DEL" showLitChar '\\' = showString "\\\\" showLitChar c | c >= ' ' = showChar c @@ -188,8 +203,11 @@ showLitChar '\n' = showString "\\n" showLitChar '\r' = showString "\\r" showLitChar '\t' = showString "\\t" showLitChar '\v' = showString "\\v" -showLitChar '\SO' = protectEsc (== 'H') (showString "\\SO") -showLitChar c = showString ('\\' : asciiTab!!ord c) +showLitChar '\SO' = \s -> protectEsc (== 'H') (showString "\\SO") s +showLitChar c = \s -> showString ('\\' : asciiTab!!ord c) s + -- The "\s ->" here means that GHC knows it's ok to put the + -- asciiTab!!ord c inside the lambda. Otherwise we get an extra + -- lambda allocated, and that can be pretty bad protectEsc :: (Char -> Bool) -> ShowS -> ShowS protectEsc p f = f . cont -- 1.7.10.4