Adjust Activations for specialise and work/wrap, and better simplify in InlineRules

author simonpj@microsoft.com <unknown>

Wed, 16 Dec 2009 14:52:05 +0000 (14:52 +0000)

committer simonpj@microsoft.com <unknown>

Wed, 16 Dec 2009 14:52:05 +0000 (14:52 +0000)
author simonpj@microsoft.com <unknown>
Wed, 16 Dec 2009 14:52:05 +0000 (14:52 +0000)
committer simonpj@microsoft.com <unknown>
Wed, 16 Dec 2009 14:52:05 +0000 (14:52 +0000)
diff --git a/compiler/simplCore/SimplUtils.lhs b/compiler/simplCore/SimplUtils.lhs

index c645519..dd4cec6 100644 (file)
--- a/compiler/simplCore/SimplUtils.lhs
+++ b/compiler/simplCore/SimplUtils.lhs
@@ -430,25 +430,6 @@ interestingArgContext rules call_cont
  %*                                                                     *
  %************************************************************************
  
-\begin{code}
-simplEnvForGHCi :: SimplEnv
-simplEnvForGHCi = mkSimplEnv allOffSwitchChecker $
-                  SimplGently { sm_rules = False, sm_inline = False }
-   -- Do not do any inlining, in case we expose some unboxed
-   -- tuple stuff that confuses the bytecode interpreter
-
-simplEnvForRules :: SimplEnv
-simplEnvForRules = mkSimplEnv allOffSwitchChecker $
-                   SimplGently { sm_rules = True, sm_inline = False }
-
-updModeForInlineRules :: SimplifierMode -> SimplifierMode
-updModeForInlineRules mode
-  = case mode of      
-      SimplGently {} -> mode   -- Don't modify mode if we already gentle
-      SimplPhase  {} -> SimplGently { sm_rules = True, sm_inline = True }
-       -- Simplify as much as possible, subject to the usual "gentle" rules
-\end{code}
-
  Inlining is controlled partly by the SimplifierMode switch.  This has two
  settings
         
@@ -511,33 +492,79 @@ RULES are enabled when doing "gentle" simplification.  Two reasons:
      to work in Template Haskell when simplifying
      splices, so we get simpler code for literal strings
  
-Note [Simplifying gently inside InlineRules]
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-We don't do much simplification inside InlineRules (which come from
-INLINE pragmas).  It really is important to switch off inlinings
-inside such expressions.  Consider the following example
+Note [Simplifying inside InlineRules]
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+We must take care with simplification inside InlineRules (which come from
+INLINE pragmas).  
  
+First, consider the following example
         let f = \pq -> BIG
         in
         let g = \y -> f y y
             {-# INLINE g #-}
         in ...g...g...g...g...g...
+Now, if that's the ONLY occurrence of f, it might be inlined inside g,
+and thence copied multiple times when g is inlined. HENCE we treat
+any occurrence in an InlineRule as a multiple occurrence, not a single
+one; see OccurAnal.addRuleUsage.
+
+Second, we do want *do* to some modest rules/inlining stuff in InlineRules,
+partly to eliminate senseless crap, and partly to break the recursive knots
+generated by instance declarations.  To keep things simple, we always set 
+the phase to 'gentle' when processing InlineRules.  OK, so suppose we have
+       {-# INLINE <act> f #-}
+       f = <rhs>
+meaning "inline f in phases p where activation <act>(p) holds". 
+Then what inlinings/rules can we apply to the copy of <rhs> captured in
+f's InlineRule?  Our model is that literally <rhs> is substituted for
+f when it is inlined.  So our conservative plan (implemented by 
+updModeForInlineRules) is this:
+
+  -------------------------------------------------------------
+  When simplifying the RHS of an InlineRule,
+  If the InlineRule becomes active in phase p, then
+    if the current phase is *earlier than* p, 
+       make no inlinings or rules active when simplifying the RHS
+    otherwise 
+       set the phase to p when simplifying the RHS
+  -------------------------------------------------------------
+
+That ensures that
+
+  a) Rules/inlinings that *cease* being active before p will 
+     not apply to the InlineRule rhs, consistent with it being
+     inlined in its *original* form in phase p.
+
+  b) Rules/inlinings that only become active *after* p will
+     not apply to the InlineRule rhs, again to be consistent with
+     inlining the *original* rhs in phase p.
+
+For example, 
+       {-# INLINE f #-}
+       f x = ...g...
  
-Now, if that's the ONLY occurrence of f, it will be inlined inside g,
-and thence copied multiple times when g is inlined.  
+       {-# NOINLINE [1] g #-}
+       g y = ...
  
-This function may be inlinined in other modules, so we don't want to
-remove (by inlining) calls to functions that have specialisations, or
-that may have transformation rules in an importing scope.
+       {-# RULE h g = ... #-}
+Here we must not inline g into f's RHS, even when we get to phase 0,
+because when f is later inlined into some other module we want the
+rule for h to fire.
  
-E.g.   {-# INLINE f #-}
+Similarly, consider
+       {-# INLINE f #-}
         f x = ...g...
  
-and suppose that g is strict *and* has specialisations.  If we inline
-g's wrapper, we deny f the chance of getting the specialised version
-of g when f is inlined at some call site (perhaps in some other
-module).
+       g y = ...
+and suppose that there are auto-generated specialisations and a strictness
+wrapper for g.  The specialisations get activation AlwaysActive, and the
+strictness wrapper get activation (ActiveAfter 0).  So the strictness
+wrepper fails the test and won't be inlined into f's InlineRule. That
+means f can inline, expose the specialised call to g, so the specialisation
+rules can fire.
  
+A note about wrappers
+~~~~~~~~~~~~~~~~~~~~~
  It's also important not to inline a worker back into a wrapper.
  A wrapper looks like
         wraper = inline_me (\x -> ...worker... )
@@ -545,18 +572,45 @@ Normally, the inline_me prevents the worker getting inlined into
  the wrapper (initially, the worker's only call site!).  But,
  if the wrapper is sure to be called, the strictness analyser will
  mark it 'demanded', so when the RHS is simplified, it'll get an ArgOf
-continuation.  That's why the keep_inline predicate returns True for
-ArgOf continuations.  It shouldn't do any harm not to dissolve the
-inline-me note under these circumstances.
+continuation. 
  
-Although we do very little simplification inside an InlineRule,
-the RHS is simplified as normal.  For example:
+\begin{code}
+simplEnvForGHCi :: SimplEnv
+simplEnvForGHCi = mkSimplEnv allOffSwitchChecker $
+                  SimplGently { sm_rules = False, sm_inline = False }
+   -- Do not do any inlining, in case we expose some unboxed
+   -- tuple stuff that confuses the bytecode interpreter
  
-       all xs = foldr (&&) True xs
-       any p = all . map p  {-# INLINE any #-}
+simplEnvForRules :: SimplEnv
+simplEnvForRules = mkSimplEnv allOffSwitchChecker $
+                   SimplGently { sm_rules = True, sm_inline = False }
+
+updModeForInlineRules :: Activation -> SimplifierMode -> SimplifierMode
+-- See Note [Simplifying inside InlineRules]
+--    Treat Gentle as phase "infinity"
+--    If current_phase `earlier than` inline_rule_start_phase 
+--      then no_op
+--    else 
+--    if current_phase `same phase` inline_rule_start_phase 
+--      then current_phase   (keep gentle flags)
+--      else inline_rule_start_phase
+updModeForInlineRules inline_rule_act current_mode
+  = case inline_rule_act of
+      NeverActive     -> no_op
+      AlwaysActive    -> mk_gentle current_mode
+      ActiveBefore {} -> mk_gentle current_mode
+      ActiveAfter n   -> mk_phase n current_mode
+  where
+    no_op  = SimplGently { sm_rules = False, sm_inline = False }
+
+    mk_gentle (SimplGently {}) = current_mode
+    mk_gentle _                = SimplGently { sm_rules = True,  sm_inline = True }
  
-The RHS of 'any' will get optimised and deforested; but the InlineRule
-will still mention the original RHS.
+    mk_phase n (SimplPhase cp ss) 
+                    | cp > n    = no_op        -- Current phase earlier than n
+                    | otherwise = SimplPhase n ss
+    mk_phase _ (SimplGently {}) = no_op
+\end{code}
  
  
  preInlineUnconditionally
diff --git a/compiler/simplCore/Simplify.lhs b/compiler/simplCore/Simplify.lhs

index 106cd9d..b7084c8 100644 (file)
--- a/compiler/simplCore/Simplify.lhs
+++ b/compiler/simplCore/Simplify.lhs
@@ -671,15 +671,17 @@ simplUnfolding env _ _ _ _ (DFunUnfolding con ops)
    where
      ops' = map (CoreSubst.substExpr (mkCoreSubst env)) ops
  
-simplUnfolding env top_lvl _ _ _ 
+simplUnfolding env top_lvl id _ _ 
      (CoreUnfolding { uf_tmpl = expr, uf_arity = arity
                     , uf_src = src, uf_guidance = guide })
    | isInlineRuleSource src
-  = do { expr' <- simplExpr (updMode updModeForInlineRules env) expr
-                      -- See Note [Simplifying gently inside InlineRules] in SimplUtils
+  = do { expr' <- simplExpr rule_env expr
         ; let src' = CoreSubst.substUnfoldingSource (mkCoreSubst env) src
         ; return (mkCoreUnfolding (isTopLevel top_lvl) src' expr' arity guide) }
                 -- See Note [Top-level flag on inline rules] in CoreUnfold
+  where
+    rule_env = updMode (updModeForInlineRules (idInlineActivation id)) env
+                      -- See Note [Simplifying gently inside InlineRules] in SimplUtils
  
  simplUnfolding _ top_lvl id _occ_info new_rhs _
    = return (mkUnfolding (isTopLevel top_lvl) (isBottomingId id) new_rhs)
diff --git a/compiler/specialise/SpecConstr.lhs b/compiler/specialise/SpecConstr.lhs

index 36dda5e..404b6cc 100644 (file)
--- a/compiler/specialise/SpecConstr.lhs
+++ b/compiler/specialise/SpecConstr.lhs
@@ -39,7 +39,6 @@ import Name
  import DynFlags                ( DynFlags(..) )
  import StaticFlags     ( opt_PprStyle_Debug )
  import StaticFlags     ( opt_SpecInlineJoinPoints )
-import BasicTypes      ( Activation(..) )
  import Maybes          ( orElse, catMaybes, isJust, isNothing )
  import Demand
  import DmdAnal         ( both )
@@ -1179,18 +1178,19 @@ spec_one env fn arg_bndrs body (call_pat@(qvars, pats), rule_number)
                 -- Usual w/w hack to avoid generating 
                 -- a spec_rhs of unlifted type and no args
         
-             fn_name   = idName fn
-             fn_loc    = nameSrcSpan fn_name
-             spec_occ  = mkSpecOcc (nameOccName fn_name)
-             rule_name = mkFastString ("SC:" ++ showSDoc (ppr fn <> int rule_number))
-             spec_rhs  = mkLams spec_lam_args spec_body
-             spec_str  = calcSpecStrictness fn spec_lam_args pats
-             spec_id   = mkUserLocal spec_occ spec_uniq (mkPiTypes spec_lam_args body_ty) fn_loc
-                           `setIdStrictness` spec_str          -- See Note [Transfer strictness]
-                           `setIdArity` count isId spec_lam_args
-             body_ty   = exprType spec_body
-             rule_rhs  = mkVarApps (Var spec_id) spec_call_args
-             rule      = mkLocalRule rule_name specConstrActivation fn_name qvars pats rule_rhs
+             fn_name    = idName fn
+             fn_loc     = nameSrcSpan fn_name
+             spec_occ   = mkSpecOcc (nameOccName fn_name)
+             rule_name  = mkFastString ("SC:" ++ showSDoc (ppr fn <> int rule_number))
+             spec_rhs   = mkLams spec_lam_args spec_body
+             spec_str   = calcSpecStrictness fn spec_lam_args pats
+             spec_id    = mkUserLocal spec_occ spec_uniq (mkPiTypes spec_lam_args body_ty) fn_loc
+                            `setIdStrictness` spec_str         -- See Note [Transfer strictness]
+                            `setIdArity` count isId spec_lam_args
+             body_ty    = exprType spec_body
+             rule_rhs   = mkVarApps (Var spec_id) spec_call_args
+              inline_act = idInlineActivation fn
+             rule       = mkLocalRule rule_name inline_act fn_name qvars pats rule_rhs
         ; return (spec_usg, OS call_pat rule spec_id spec_rhs) }
  
  calcSpecStrictness :: Id                    -- The original function
@@ -1215,18 +1215,23 @@ calcSpecStrictness fn qvars pats
            | (Var _, args) <- collectArgs e = go env ds args
      go_one env _         _ = env
  
--- In which phase should the specialise-constructor rules be active?
--- Originally I made them always-active, but Manuel found that
--- this defeated some clever user-written rules.  So Plan B
--- is to make them active only in Phase 0; after all, currently,
--- the specConstr transformation is only run after the simplifier
--- has reached Phase 0.  In general one would want it to be 
--- flag-controllable, but for now I'm leaving it baked in
---                                     [SLPJ Oct 01]
-specConstrActivation :: Activation
-specConstrActivation = ActiveAfter 0   -- Baked in; see comments above
  \end{code}
  
+Note [Transfer activation]
+~~~~~~~~~~~~~~~~~~~~~~~~~~
+In which phase should the specialise-constructor rules be active?
+Originally I made them always-active, but Manuel found that this
+defeated some clever user-written rules.  Then I made them active only
+in Phase 0; after all, currently, the specConstr transformation is
+only run after the simplifier has reached Phase 0, but that meant
+that specialisations didn't fire inside wrappers; see test
+simplCore/should_compile/spec-inline.
+
+So now I just use the inline-activation of the parent Id, as the
+activation for the specialiation RULE, just like the main specialiser;
+see Note [Auto-specialisation and RULES] in Specialise.
+
+
  Note [Transfer strictness]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~
  We must transfer strictness information from the original function to
diff --git a/compiler/specialise/Specialise.lhs b/compiler/specialise/Specialise.lhs

index ad641d4..5d780ea 100644 (file)
--- a/compiler/specialise/Specialise.lhs
+++ b/compiler/specialise/Specialise.lhs
@@ -29,6 +29,7 @@ import Name
  import MkId            ( voidArgId, realWorldPrimId )
  import FiniteMap
  import Maybes          ( catMaybes, isJust )
+import BasicTypes      ( isNeverActive, inlinePragmaActivation )
  import Bag
  import Util
  import Outputable
@@ -773,6 +774,9 @@ specDefn subst body_uds fn rhs
    |  rhs_tyvars `lengthIs`     n_tyvars -- Rhs of fn's defn has right number of big lambdas
    && rhs_ids    `lengthAtLeast` n_dicts        -- and enough dict args
    && notNull calls_for_me              -- And there are some calls to specialise
+  && not (isNeverActive (idInlineActivation fn))
+       -- Don't specialise NOINLINE things
+       -- See Note [Auto-specialisation and RULES]
  
  --   && not (certainlyWillInline (idUnfolding fn))     -- And it's not small
  --     See Note [Inline specialisation] for why we do not 
@@ -804,7 +808,7 @@ specDefn subst body_uds fn rhs
      (tyvars, theta, _) = tcSplitSigmaTy fn_type
      n_tyvars          = length tyvars
      n_dicts           = length theta
-    inline_act         = idInlineActivation fn
+    inl_act            = inlinePragmaActivation (idInlinePragma fn)
  
         -- Figure out whether the function has an INLINE pragma
         -- See Note [Inline specialisations]
@@ -886,10 +890,6 @@ specDefn subst body_uds fn rhs
                  spec_id_ty = mkPiTypes lam_args body_ty
         
             ; spec_f <- newSpecIdSM fn spec_id_ty
-          ; let spec_f_w_arity = setIdArity spec_f (max 0 (fn_arity - n_dicts))
-               -- Adding arity information just propagates it a bit faster
-               -- See Note [Arity decrease] in Simplify
-
             ; (spec_rhs, rhs_uds) <- specExpr rhs_subst2 (mkLams lam_args body)
            ; let
                 -- The rule to put in the function's specialisation is:
@@ -897,20 +897,26 @@ specDefn subst body_uds fn rhs
                 rule_name = mkFastString ("SPEC " ++ showSDoc (ppr fn <+> ppr spec_ty_args))
                 spec_env_rule = mkLocalRule
                                   rule_name
-                                 inline_act    -- Note [Auto-specialisation and RULES]
+                                 inl_act       -- Note [Auto-specialisation and RULES]
                                   (idName fn)
                                   (poly_tyvars ++ inst_dict_ids)
                                   inst_args 
-                                 (mkVarApps (Var spec_f_w_arity) app_args)
+                                 (mkVarApps (Var spec_f) app_args)
  
                 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
                 final_uds = foldr consDictBind rhs_uds dx_binds
  
+               -- Adding arity information just propagates it a bit faster
+               --      See Note [Arity decrease] in Simplify
+               -- Copy InlinePragma information from the parent Id.
+               -- So if f has INLINE[1] so does spec_f
+               spec_f_w_arity = spec_f `setIdArity`          max 0 (fn_arity - n_dicts)
+                                        `setInlineActivation` inl_act
+
+               -- Add an InlineRule if the parent has one
                 -- See Note [Inline specialisations]
                 final_spec_f | Just sat <- fn_has_inline_rule
-                            = spec_f_w_arity `setInlineActivation` inline_act
-                                             `setIdUnfolding` mkInlineRule sat spec_rhs spec_arity
-                                               -- I'm not sure this should be unconditionally InlSat
+                            = spec_f_w_arity `setIdUnfolding` mkInlineRule sat spec_rhs spec_arity
                              | otherwise 
                              = spec_f_w_arity
            ; return (Just ((final_spec_f, spec_rhs), final_uds, spec_env_rule)) } }
@@ -1112,10 +1118,14 @@ also add
         RULE f g_spec = 0
  
  But that's a bit complicated.  For now we ask the programmer's help,
-by *copying the INLINE activation pragma* to the auto-specialised rule.
-So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule will also
-not be active until phase 2.  
+by *copying the INLINE activation pragma* to the auto-specialised
+rule.  So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule
+will also not be active until phase 2.  And that's what programmers
+should jolly well do anyway, even aside from specialisation, to ensure
+that g doesn't inline too early.
  
+This in turn means that the RULE would never fire for a NOINLINE
+thing so not much point in generating a specialisation at all.
  
  Note [Specialisation shape]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~
@@ -1141,13 +1151,12 @@ It's a silly exapmle, but we get
  where choose doesn't have any dict arguments.  Thus far I have not
  tried to fix this (wait till there's a real example).
  
-
  Note [Inline specialisations]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  We transfer to the specialised function any INLINE stuff from the
-original.  This means (a) the Activation in the IdInfo, and (b) any
-InlineMe on the RHS.  We do not, however, transfer the RuleMatchInfo
-since we do not expect the specialisation to occur in rewrite rules.
+original.  This means 
+   (a) the Activation for its inlining (from its InlinePragma)
+   (b) any InlineRule
  
  This is a change (Jun06).  Previously the idea is that the point of
  inlining was precisely to specialise the function at its call site,
@@ -1166,9 +1175,6 @@ arguments alone are enough to specialise (even though the args are too
  boring to trigger inlining), and it's certainly better to call the 
  specialised version.
  
-A case in point is dictionary functions, which are current marked
-INLINE, but which are worth specialising.
-
  
  %************************************************************************
  %*                                                                     *
diff --git a/compiler/stranal/WorkWrap.lhs b/compiler/stranal/WorkWrap.lhs

index 493015f..33ca298 100644 (file)
--- a/compiler/stranal/WorkWrap.lhs
+++ b/compiler/stranal/WorkWrap.lhs
@@ -12,7 +12,7 @@ import CoreUtils      ( exprType, exprIsHNF )
  import CoreArity       ( exprArity )
  import Var
  import Id              ( idType, isOneShotLambda, idUnfolding,
-                         setIdStrictness, mkWorkerId,
+                         setIdStrictness, mkWorkerId, setInlinePragma,
                           setInlineActivation, setIdUnfolding,
                           setIdArity )
  import Type            ( Type )
@@ -22,7 +22,8 @@ import Demand           ( Demand(..), StrictSig(..), DmdType(..), DmdResult(..),
                         )
  import UniqSupply
  import BasicTypes      ( RecFlag(..), isNonRec, isNeverActive,
-                          Activation, inlinePragmaActivation )
+                          Activation(..), InlinePragma(..), 
+                         inlinePragmaActivation, inlinePragmaRuleMatchInfo )
  import VarEnv          ( isEmptyVarEnv )
  import Maybes          ( orElse )
  import WwLib
@@ -145,33 +146,22 @@ wwExpr (Case expr binder ty alts) = do
  front-end into the proper form, then calls @mkWwBodies@ to do
  the business.
  
-We have to BE CAREFUL that we don't worker-wrapperize an Id that has
-already been w-w'd!  (You can end up with several liked-named Ids
-bouncing around at the same time---absolute mischief.)  So the
-criterion we use is: if an Id already has an unfolding (for whatever
-reason), then we don't w-w it.
-
  The only reason this is monadised is for the unique supply.
  
  Note [Don't w/w inline things (a)]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-It's very important to refrain from w/w-ing an INLINE function
-because the wrapper will then overwrite the InlineRule unfolding.
-
-It was wrong with the old InlineMe Note too: if we do so by mistake 
-we transform
-       f = __inline (\x -> E)
-into
-       f = __inline (\x -> case x of (a,b) -> fw E)
-       fw = \ab -> (__inline (\x -> E)) (a,b)
-and the original __inline now vanishes, so E is no longer
-inside its __inline wrapper.  Death!  Disaster!
+
+It's very important to refrain from w/w-ing an INLINE function (ie one
+with an InlineRule) because the wrapper will then overwrite the
+InlineRule unfolding.
  
  Furthermore, if the programmer has marked something as INLINE, 
  we may lose by w/w'ing it.
  
  If the strictness analyser is run twice, this test also prevents
-wrappers (which are INLINEd) from being re-done.
+wrappers (which are INLINEd) from being re-done.  (You can end up with
+several liked-named Ids bouncing around at the same time---absolute
+mischief.)  
  
  Notice that we refrain from w/w'ing an INLINE function even if it is
  in a recursive group.  It might not be the loop breaker.  (We could
@@ -179,11 +169,10 @@ test for loop-breaker-hood, but I'm not sure that ever matters.)
  
  Note [Don't w/w inline things (b)]
  ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-In general, therefore, we refrain from w/w-ing *small* functions,
-because they'll inline anyway.  But we must take care: it may look
-small now, but get to be big later after other inling has happened.
-So we take the precaution of adding an INLINE pragma to any such
-functions.  
+In general, we refrain from w/w-ing *small* functions, because they'll
+inline anyway.  But we must take care: it may look small now, but get
+to be big later after other inling has happened.  So we take the
+precaution of adding an INLINE pragma to any such functions.
  
  I made this change when I observed a big function at the end of
  compilation with a useful strictness signature but no w-w.  When 
@@ -191,6 +180,34 @@ I measured it on nofib, it didn't make much difference; just a few
  percent improved allocation on one benchmark (bspt/Euclid.space).  
  But nothing got worse.
  
+Note [Wrapper activation]
+~~~~~~~~~~~~~~~~~~~~~~~~~
+When should the wrapper inlining be active?  It must not be active
+earlier than the current Activation of the Id (eg it might have a
+NOINLINE pragma).  But in fact strictness analysis happens fairly
+late in the pipeline, and we want to prioritise specialisations over
+strictness.  Eg if we have 
+  module Foo where
+    f :: Num a => a -> Int -> a
+    f n 0 = n                     -- Strict in the Int, hence wrapper
+    f n x = f (n+n) (x-1)
+
+    g :: Int -> Int
+    g x = f x x                   -- Provokes a specialisation for f
+
+  module Bsr where
+    import Foo
+
+    h :: Int -> Int
+    h x = f 3 x
+
+Then we want the specialisation for 'f' to kick in before the wrapper does.
+
+Now in fact the 'gentle' simplification pass encourages this, by
+having rules on, but inlinings off.  But that's kind of lucky. It seems 
+more robust to give the wrapper an Activation of (ActiveAfter 0),
+so that it becomes active in an importing module at the same time that
+it appears in the first place in the defining module.
  
  \begin{code}
  tryWW  :: RecFlag
@@ -218,7 +235,7 @@ tryWW is_rec fn_id rhs
  
    | is_fun && worthSplittingFun wrap_dmds res_info
    = checkSize new_fn_id rhs $
-    splitFun new_fn_id fn_info wrap_dmds res_info inline_act rhs
+    splitFun new_fn_id fn_info wrap_dmds res_info rhs
  
    | otherwise
    = return [ (new_fn_id, rhs) ]
@@ -264,9 +281,9 @@ checkSize fn_id rhs thing_inside
      inline_rule = mkInlineRule unSaturatedOk rhs (unfoldingArity unfolding)
  
  ---------------------
-splitFun :: Id -> IdInfo -> [Demand] -> DmdResult -> Activation -> Expr Var
+splitFun :: Id -> IdInfo -> [Demand] -> DmdResult -> Expr Var
           -> UniqSM [(Id, CoreExpr)]
-splitFun fn_id fn_info wrap_dmds res_info inline_act rhs
+splitFun fn_id fn_info wrap_dmds res_info rhs
    = WARN( not (wrap_dmds `lengthIs` arity), ppr fn_id <+> (ppr arity $$ ppr wrap_dmds $$ ppr res_info) ) 
      (do {
         -- The arity should match the signature
@@ -275,32 +292,45 @@ splitFun fn_id fn_info wrap_dmds res_info inline_act rhs
      ; let
         work_rhs = work_fn rhs
         work_id  = mkWorkerId work_uniq fn_id (exprType work_rhs) 
-                       `setInlineActivation` inline_act
+                       `setInlineActivation` (inlinePragmaActivation inl_prag)
                                 -- Any inline activation (which sets when inlining is active) 
-                               -- on the original function is duplicated on the worker and wrapper
+                               -- on the original function is duplicated on the worker
                                 -- It *matters* that the pragma stays on the wrapper
                                 -- It seems sensible to have it on the worker too, although we
                                 -- can't think of a compelling reason. (In ptic, INLINE things are 
                                 -- not w/wd). However, the RuleMatchInfo is not transferred since
                                  -- it does not make sense for workers to be constructorlike.
+
                         `setIdStrictness` StrictSig (mkTopDmdType work_demands work_res_info)
                                 -- Even though we may not be at top level, 
                                 -- it's ok to give it an empty DmdEnv
+
                          `setIdArity` (exprArity work_rhs)
                                  -- Set the arity so that the Core Lint check that the 
                                  -- arity is consistent with the demand type goes through
  
-       wrap_rhs = wrap_fn work_id
-       wrap_id  = fn_id `setIdUnfolding` mkWwInlineRule work_id wrap_rhs arity
+       wrap_rhs  = wrap_fn work_id
+       wrap_prag = InlinePragma { inl_inline = True
+                                 , inl_act    = ActiveAfter 0
+                                 , inl_rule   = rule_match_info }
+
+       wrap_id   = fn_id `setIdUnfolding` mkWwInlineRule work_id wrap_rhs arity
+                         `setInlinePragma` wrap_prag
+                               -- See Note [Wrapper activation]
+                               -- The RuleMatchInfo is (and must be) unaffected
+                               -- The inl_inline is bound to be False, else we would not be
+                               --    making a wrapper
  
      ; return ([(work_id, work_rhs), (wrap_id, wrap_rhs)]) })
         -- Worker first, because wrapper mentions it
         -- mkWwBodies has already built a wrap_rhs with an INLINE pragma wrapped around it
    where
-    fun_ty = idType fn_id
-
-    arity  = arityInfo fn_info -- The arity is set by the simplifier using exprEtaExpandArity
-                               -- So it may be more than the number of top-level-visible lambdas
+    fun_ty          = idType fn_id
+    inl_prag        = inlinePragInfo fn_info
+    rule_match_info = inlinePragmaRuleMatchInfo inl_prag
+    arity           = arityInfo fn_info        
+                   -- The arity is set by the simplifier using exprEtaExpandArity
+                   -- So it may be more than the number of top-level-visible lambdas
  
      work_res_info | isBotRes res_info = BotRes -- Cpr stuff done by wrapper
                   | otherwise         = TopRes
author	simonpj@microsoft.com <unknown>
	Wed, 16 Dec 2009 14:52:05 +0000 (14:52 +0000)
committer	simonpj@microsoft.com <unknown>
	Wed, 16 Dec 2009 14:52:05 +0000 (14:52 +0000)
compiler/simplCore/SimplUtils.lhs		patch \| blob \| history
compiler/simplCore/Simplify.lhs		patch \| blob \| history
compiler/specialise/SpecConstr.lhs		patch \| blob \| history
compiler/specialise/Specialise.lhs		patch \| blob \| history
compiler/stranal/WorkWrap.lhs		patch \| blob \| history