2 % (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
4 %************************************************************************
6 \section[OccurAnal]{Occurrence analysis pass}
8 %************************************************************************
10 The occurrence analyser re-typechecks a core expression, returning a new
11 core expression with (hopefully) improved usage information.
15 occurAnalysePgm, occurAnalyseExpr
18 #include "HsVersions.h"
22 import CoreUtils ( exprIsTrivial, isDefaultAlt, isExpandableApp, mkCoerce )
26 import Name ( Name, localiseName )
34 import Maybes ( orElse )
35 import Digraph ( SCC(..), stronglyConnCompFromEdgedVerticesR )
36 import PrelNames ( buildIdKey, foldrIdKey, runSTRepIdKey, augmentIdKey )
39 import Util ( mapAndUnzip, filterOut )
47 %************************************************************************
49 \subsection[OccurAnal-main]{Counting occurrences: main function}
51 %************************************************************************
53 Here's the externally-callable interface:
56 occurAnalysePgm :: Maybe (Activation -> Bool) -> [CoreRule]
57 -> [CoreBind] -> [CoreBind]
58 occurAnalysePgm active_rule imp_rules binds
59 = snd (go (initOccEnv active_rule imp_rules) binds)
61 initial_uds = addIdOccs emptyDetails (rulesFreeVars imp_rules)
62 -- The RULES keep things alive!
64 go :: OccEnv -> [CoreBind] -> (UsageDetails, [CoreBind])
68 = (final_usage, bind' ++ binds')
70 (bs_usage, binds') = go env binds
71 (final_usage, bind') = occAnalBind env env bind bs_usage
73 occurAnalyseExpr :: CoreExpr -> CoreExpr
74 -- Do occurrence analysis, and discard occurence info returned
76 = snd (occAnal (initOccEnv all_active_rules []) expr)
78 -- To be conservative, we say that all inlines and rules are active
79 all_active_rules = Just (\_ -> True)
83 %************************************************************************
85 \subsection[OccurAnal-main]{Counting occurrences: main function}
87 %************************************************************************
93 occAnalBind :: OccEnv -- The incoming OccEnv
94 -> OccEnv -- Same, but trimmed by (binderOf bind)
96 -> UsageDetails -- Usage details of scope
97 -> (UsageDetails, -- Of the whole let(rec)
100 occAnalBind env _ (NonRec binder rhs) body_usage
101 | isTyVar binder -- A type let; we don't gather usage info
102 = (body_usage, [NonRec binder rhs])
104 | not (binder `usedIn` body_usage) -- It's not mentioned
107 | otherwise -- It's mentioned in the body
108 = (body_usage' +++ rhs_usage3, [NonRec tagged_binder rhs'])
110 (body_usage', tagged_binder) = tagBinder body_usage binder
111 (rhs_usage1, rhs') = occAnalRhs env (Just tagged_binder) rhs
112 rhs_usage2 = addIdOccs rhs_usage1 (idUnfoldingVars binder)
113 rhs_usage3 = addIdOccs rhs_usage2 (idRuleVars binder)
114 -- See Note [Rules are extra RHSs] and Note [Rule dependency info]
119 Dropping dead code for recursive bindings is done in a very simple way:
121 the entire set of bindings is dropped if none of its binders are
122 mentioned in its body; otherwise none are.
124 This seems to miss an obvious improvement.
136 Now 'f' is unused! But it's OK! Dependency analysis will sort this
137 out into a letrec for 'g' and a 'let' for 'f', and then 'f' will get
138 dropped. It isn't easy to do a perfect job in one blow. Consider
149 Note [Loop breaking and RULES]
150 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
151 Loop breaking is surprisingly subtle. First read the section 4 of
152 "Secrets of the GHC inliner". This describes our basic plan.
154 However things are made quite a bit more complicated by RULES. Remember
156 * Note [Rules are extra RHSs]
157 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
158 A RULE for 'f' is like an extra RHS for 'f'. That way the "parent"
159 keeps the specialised "children" alive. If the parent dies
160 (because it isn't referenced any more), then the children will die
161 too (unless they are already referenced directly).
163 To that end, we build a Rec group for each cyclic strongly
165 *treating f's rules as extra RHSs for 'f'*.
166 More concretely, the SCC analysis runs on a graph with an edge
167 from f -> g iff g is mentioned in
172 Under (b) we include variables free in *either* LHS *or* RHS of
173 the rule. The former might seems silly, but see Note [Rule
174 dependency info]. So in Example [eftInt], eftInt and eftIntFB
175 will be put in the same Rec, even though their 'main' RHSs are
178 * Note [Rules are visible in their own rec group]
179 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
180 We want the rules for 'f' to be visible in f's right-hand side.
181 And we'd like them to be visible in other functions in f's Rec
182 group. E.g. in Example [Specialisation rules] we want f' rule
183 to be visible in both f's RHS, and fs's RHS.
185 This means that we must simplify the RULEs first, before looking
186 at any of the definitions. This is done by Simplify.simplRecBind,
187 when it calls addLetIdInfo.
189 * Note [Choosing loop breakers]
190 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
191 We avoid infinite inlinings by choosing loop breakers, and
192 ensuring that a loop breaker cuts each loop. But what is a
193 "loop"? In particular, a RULE is like an equation for 'f' that
194 is *always* inlined if it is applicable. We do *not* disable
195 rules for loop-breakers. It's up to whoever makes the rules to
196 make sure that the rules themselves always terminate. See Note
197 [Rules for recursive functions] in Simplify.lhs
200 f's RHS (or its INLINE template if it has one) mentions g, and
201 g has a RULE that mentions h, and
202 h has a RULE that mentions f
204 then we *must* choose f to be a loop breaker. In general, take the
205 free variables of f's RHS, and augment it with all the variables
206 reachable by RULES from those starting points. That is the whole
207 reason for computing rule_fv_env in occAnalBind. (Of course we
208 only consider free vars that are also binders in this Rec group.)
209 See also Note [Finding rule RHS free vars]
211 Note that when we compute this rule_fv_env, we only consider variables
212 free in the *RHS* of the rule, in contrast to the way we build the
213 Rec group in the first place (Note [Rule dependency info])
215 Note that if 'g' has RHS that mentions 'w', we should add w to
216 g's loop-breaker edges. More concretely there is an edge from f -> g
218 (a) g is mentioned in f's RHS
219 (b) h is mentioned in f's RHS, and
220 g appears in the RHS of a RULE of h
221 or a transitive sequence of rules starting with h
223 Note that in Example [eftInt], *neither* eftInt *nor* eftIntFB is
224 chosen as a loop breaker, because their RHSs don't mention each other.
225 And indeed both can be inlined safely.
227 Note that the edges of the graph we use for computing loop breakers
228 are not the same as the edges we use for computing the Rec blocks.
229 That's why we compute
230 rec_edges for the Rec block analysis
231 loop_breaker_edges for the loop breaker analysis
233 * Note [Finding rule RHS free vars]
234 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
235 Consider this real example from Data Parallel Haskell
236 tagZero :: Array Int -> Array Tag
237 {-# INLINE [1] tagZeroes #-}
238 tagZero xs = pmap (\x -> fromBool (x==0)) xs
240 {-# RULES "tagZero" [~1] forall xs n.
241 pmap fromBool <blah blah> = tagZero xs #-}
242 So tagZero's RHS mentions pmap, and pmap's RULE mentions tagZero.
243 However, tagZero can only be inlined in phase 1 and later, while
244 the RULE is only active *before* phase 1. So there's no problem.
246 To make this work, we look for the RHS free vars only for
247 *active* rules. That's the reason for the is_active argument
248 to idRhsRuleVars, and the occ_rule_act field of the OccEnv.
250 * Note [Weak loop breakers]
251 ~~~~~~~~~~~~~~~~~~~~~~~~~
252 There is a last nasty wrinkle. Suppose we have
262 Remember that we simplify the RULES before any RHS (see Note
263 [Rules are visible in their own rec group] above).
265 So we must *not* postInlineUnconditionally 'g', even though
266 its RHS turns out to be trivial. (I'm assuming that 'g' is
267 not choosen as a loop breaker.) Why not? Because then we
268 drop the binding for 'g', which leaves it out of scope in the
271 We "solve" this by making g a "weak" or "rules-only" loop breaker,
272 with OccInfo = IAmLoopBreaker True. A normal "strong" loop breaker
273 has IAmLoopBreaker False. So
275 Inline postInlineUnconditionally
276 IAmLoopBreaker False no no
277 IAmLoopBreaker True yes no
280 The **sole** reason for this kind of loop breaker is so that
281 postInlineUnconditionally does not fire. Ugh.
283 * Note [Rule dependency info]
284 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
285 The VarSet in a SpecInfo is used for dependency analysis in the
286 occurrence analyser. We must track free vars in *both* lhs and rhs.
287 Hence use of idRuleVars, rather than idRuleRhsVars in occAnalBind.
291 Then if we substitute y for x, we'd better do so in the
292 rule's LHS too, so we'd better ensure the dependency is respected
295 * Note [Inline rules]
297 None of the above stuff about RULES applies to Inline Rules,
298 stored in a CoreUnfolding. The unfolding, if any, is simplified
299 at the same time as the regular RHS of the function, so it should
300 be treated *exactly* like an extra RHS.
302 There is a danger that we'll be sub-optimal if we see this
304 [INLINE f = ..no f...]
305 where f is recursive, but the INLINE is not. This can just about
306 happen with a sufficiently odd set of rules; eg
309 {-# INLINE [1] foo #-}
313 {-# INLINE [1] bar #-}
316 {-# RULES "foo" [~1] forall x. foo x = bar x #-}
318 Here the RULE makes bar recursive; but it's INLINE pragma remains
319 non-recursive. It's tempting to then say that 'bar' should not be
320 a loop breaker, but an attempt to do so goes wrong in two ways:
324 [INLINE $cfoo = ...no-$df...]
325 But we want $cfoo to depend on $df explicitly so that we
326 put the bindings in the right order to inline $df in $cfoo
327 and perhaps break the loop altogether. (Maybe this
334 Example (from GHC.Enum):
336 eftInt :: Int# -> Int# -> [Int]
337 eftInt x y = ...(non-recursive)...
339 {-# INLINE [0] eftIntFB #-}
340 eftIntFB :: (Int -> r -> r) -> r -> Int# -> Int# -> r
341 eftIntFB c n x y = ...(non-recursive)...
344 "eftInt" [~1] forall x y. eftInt x y = build (\ c n -> eftIntFB c n x y)
345 "eftIntList" [1] eftIntFB (:) [] = eftInt
348 Example [Specialisation rules]
349 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
350 Consider this group, which is typical of what SpecConstr builds:
352 fs a = ....f (C a)....
353 f x = ....f (C a)....
354 {-# RULE f (C a) = fs a #-}
356 So 'f' and 'fs' are in the same Rec group (since f refers to fs via its RULE).
358 But watch out! If 'fs' is not chosen as a loop breaker, we may get an infinite loop:
359 - the RULE is applied in f's RHS (see Note [Self-recursive rules] in Simplify
360 - fs is inlined (say it's small)
361 - now there's another opportunity to apply the RULE
363 This showed up when compiling Control.Concurrent.Chan.getChanContents.
367 occAnalBind _ env (Rec pairs) body_usage
368 = foldr (occAnalRec env) (body_usage, []) sccs
369 -- For a recursive group, we
370 -- * occ-analyse all the RHSs
371 -- * compute strongly-connected components
372 -- * feed those components to occAnalRec
374 -------------Dependency analysis ------------------------------
375 bndr_set = mkVarSet (map fst pairs)
377 sccs :: [SCC (Node Details)]
378 sccs = {-# SCC "occAnalBind.scc" #-} stronglyConnCompFromEdgedVerticesR rec_edges
380 rec_edges :: [Node Details]
381 rec_edges = {-# SCC "occAnalBind.assoc" #-} map make_node pairs
383 make_node (bndr, rhs)
384 = (details, varUnique bndr, keysUFM out_edges)
386 details = ND { nd_bndr = bndr, nd_rhs = rhs'
387 , nd_uds = rhs_usage3, nd_inl = inl_fvs}
389 (rhs_usage1, rhs') = occAnalRhs env Nothing rhs
390 rhs_usage2 = addIdOccs rhs_usage1 rule_fvs -- Note [Rules are extra RHSs]
391 rhs_usage3 = addIdOccs rhs_usage2 unf_fvs
392 unf = realIdUnfolding bndr -- Ignore any current loop-breaker flag
393 unf_fvs = stableUnfoldingVars unf
394 rule_fvs = idRuleVars bndr -- See Note [Rule dependency info]
396 inl_fvs = rhs_fvs `unionVarSet` unf_fvs
397 rhs_fvs = intersectUFM_C (\b _ -> b) bndr_set rhs_usage1
398 out_edges = intersectUFM_C (\b _ -> b) bndr_set rhs_usage3
399 -- (a -> b) means a mentions b
400 -- Given the usage details (a UFM that gives occ info for each free var of
401 -- the RHS) we can get the list of free vars -- or rather their Int keys --
402 -- by just extracting the keys from the finite map. Grimy, but fast.
403 -- Previously we had this:
404 -- [ bndr | bndr <- bndrs,
405 -- maybeToBool (lookupVarEnv rhs_usage bndr)]
406 -- which has n**2 cost, and this meant that edges_from alone
407 -- consumed 10% of total runtime!
409 -----------------------------
410 occAnalRec :: OccEnv -> SCC (Node Details)
411 -> (UsageDetails, [CoreBind])
412 -> (UsageDetails, [CoreBind])
414 -- The NonRec case is just like a Let (NonRec ...) above
415 occAnalRec _ (AcyclicSCC (ND { nd_bndr = bndr, nd_rhs = rhs, nd_uds = rhs_usage}, _, _))
417 | not (bndr `usedIn` body_usage)
418 = (body_usage, binds)
420 | otherwise -- It's mentioned in the body
421 = (body_usage' +++ rhs_usage,
422 NonRec tagged_bndr rhs : binds)
424 (body_usage', tagged_bndr) = tagBinder body_usage bndr
427 -- The Rec case is the interesting one
428 -- See Note [Loop breaking]
429 occAnalRec env (CyclicSCC nodes) (body_usage, binds)
430 | not (any (`usedIn` body_usage) bndrs) -- NB: look at body_usage, not total_usage
431 = (body_usage, binds) -- Dead code
433 | otherwise -- At this point we always build a single Rec
434 = (final_usage, Rec pairs : binds)
437 bndrs = [b | (ND { nd_bndr = b }, _, _) <- nodes]
438 bndr_set = mkVarSet bndrs
439 non_boring bndr = isId bndr &&
440 (isStableUnfolding (realIdUnfolding bndr) || idHasRules bndr)
442 ----------------------------
443 -- Tag the binders with their occurrence info
444 total_usage = foldl add_usage body_usage nodes
445 add_usage usage_so_far (ND { nd_uds = rhs_usage }, _, _) = usage_so_far +++ rhs_usage
446 (final_usage, tagged_nodes) = mapAccumL tag_node total_usage nodes
448 tag_node :: UsageDetails -> Node Details -> (UsageDetails, Node Details)
449 -- (a) Tag the binders in the details with occ info
450 -- (b) Mark the binder with "weak loop-breaker" OccInfo
451 -- saying "no preInlineUnconditionally" if it is used
452 -- in any rule (lhs or rhs) of the recursive group
453 -- See Note [Weak loop breakers]
454 tag_node usage (details@ND { nd_bndr = bndr }, k, ks)
455 = (usage `delVarEnv` bndr, (details { nd_bndr = bndr2 }, k, ks))
457 bndr2 | bndr `elemVarSet` all_rule_fvs = makeLoopBreaker True bndr1
459 bndr1 = setBinderOcc usage bndr
460 all_rule_fvs = bndr_set `intersectVarSet` foldr (unionVarSet . idRuleVars)
463 ----------------------------
464 -- Now reconstruct the cycle
465 pairs | any non_boring bndrs
466 = foldr (reOrderRec 0) [] $
467 stronglyConnCompFromEdgedVerticesR loop_breaker_edges
469 = reOrderCycle 0 tagged_nodes []
471 -- See Note [Choosing loop breakers] for loop_breaker_edges
472 loop_breaker_edges = map mk_node tagged_nodes
473 mk_node (details@(ND { nd_inl = inl_fvs }), k, _) = (details, k, new_ks)
475 new_ks = keysUFM (fst (extendFvs rule_fv_env inl_fvs))
477 ------------------------------------
478 rule_fv_env :: IdEnv IdSet -- Variables from this group mentioned in RHS of rules
479 -- Domain is *subset* of bound vars (others have no rule fvs)
480 rule_fv_env = transClosureFV init_rule_fvs
482 | Just is_active <- occ_rule_act env -- See Note [Finding rule RHS free vars]
486 , let rule_fvs = idRuleRhsVars is_active b
487 `intersectVarSet` bndr_set
488 , not (isEmptyVarSet rule_fvs)]
493 @reOrderRec@ is applied to the list of (binder,rhs) pairs for a cyclic
494 strongly connected component (there's guaranteed to be a cycle). It returns the
496 a) in a better order,
497 b) with some of the Ids having a IAmALoopBreaker pragma
499 The "loop-breaker" Ids are sufficient to break all cycles in the SCC. This means
500 that the simplifier can guarantee not to loop provided it never records an inlining
501 for these no-inline guys.
503 Furthermore, the order of the binds is such that if we neglect dependencies
504 on the no-inline Ids then the binds are topologically sorted. This means
505 that the simplifier will generally do a good job if it works from top bottom,
506 recording inlinings for any Ids which aren't marked as "no-inline" as it goes.
509 [June 98: I don't understand the following paragraphs, and I've
510 changed the a=b case again so that it isn't a special case any more.]
512 Here's a case that bit me:
520 Re-ordering doesn't change the order of bindings, but there was no loop-breaker.
522 My solution was to make a=b bindings record b as Many, rather like INLINE bindings.
523 Perhaps something cleverer would suffice.
528 type Node details = (details, Unique, [Unique]) -- The Ints are gotten from the Unique,
529 -- which is gotten from the Id.
531 = ND { nd_bndr :: Id -- Binder
532 , nd_rhs :: CoreExpr -- RHS
534 , nd_uds :: UsageDetails -- Usage from RHS,
535 -- including RULES and InlineRule unfolding
537 , nd_inl :: IdSet -- Other binders *from this Rec group* mentioned in
538 } -- its InlineRule unfolding (if present)
540 -- but *excluding* any RULES
541 -- This is the IdSet that may be used if the Id is inlined
543 reOrderRec :: Int -> SCC (Node Details)
544 -> [(Id,CoreExpr)] -> [(Id,CoreExpr)]
545 -- Sorted into a plausible order. Enough of the Ids have
546 -- IAmALoopBreaker pragmas that there are no loops left.
547 reOrderRec _ (AcyclicSCC (ND { nd_bndr = bndr, nd_rhs = rhs }, _, _))
548 pairs = (bndr, rhs) : pairs
549 reOrderRec depth (CyclicSCC cycle) pairs = reOrderCycle depth cycle pairs
551 reOrderCycle :: Int -> [Node Details] -> [(Id,CoreExpr)] -> [(Id,CoreExpr)]
553 = panic "reOrderCycle"
554 reOrderCycle _ [(ND { nd_bndr = bndr, nd_rhs = rhs }, _, _)] pairs
555 = -- Common case of simple self-recursion
556 (makeLoopBreaker False bndr, rhs) : pairs
558 reOrderCycle depth (bind : binds) pairs
559 = -- Choose a loop breaker, mark it no-inline,
560 -- do SCC analysis on the rest, and recursively sort them out
561 -- pprTrace "reOrderCycle" (ppr [b | (ND { nd_bndr = b }, _, _) <- bind:binds]) $
562 foldr (reOrderRec new_depth)
563 ([ (makeLoopBreaker False bndr, rhs)
564 | (ND { nd_bndr = bndr, nd_rhs = rhs }, _, _) <- chosen_binds] ++ pairs)
565 (stronglyConnCompFromEdgedVerticesR unchosen)
567 (chosen_binds, unchosen) = choose_loop_breaker [bind] (score bind) [] binds
569 approximate_loop_breaker = depth >= 2
570 new_depth | approximate_loop_breaker = 0
571 | otherwise = depth+1
572 -- After two iterations (d=0, d=1) give up
573 -- and approximate, returning to d=0
575 -- This loop looks for the bind with the lowest score
576 -- to pick as the loop breaker. The rest accumulate in
577 choose_loop_breaker loop_binds _loop_sc acc []
578 = (loop_binds, acc) -- Done
580 -- If approximate_loop_breaker is True, we pick *all*
581 -- nodes with lowest score, else just one
582 -- See Note [Complexity of loop breaking]
583 choose_loop_breaker loop_binds loop_sc acc (bind : binds)
584 | sc < loop_sc -- Lower score so pick this new one
585 = choose_loop_breaker [bind] sc (loop_binds ++ acc) binds
587 | approximate_loop_breaker && sc == loop_sc
588 = choose_loop_breaker (bind : loop_binds) loop_sc acc binds
590 | otherwise -- Higher score so don't pick it
591 = choose_loop_breaker loop_binds loop_sc (bind : acc) binds
595 score :: Node Details -> Int -- Higher score => less likely to be picked as loop breaker
596 score (ND { nd_bndr = bndr, nd_rhs = rhs }, _, _)
597 | not (isId bndr) = 100 -- A type or cercion variable is never a loop breaker
599 | isDFunId bndr = 9 -- Never choose a DFun as a loop breaker
600 -- Note [DFuns should not be loop breakers]
602 | Just inl_source <- isStableCoreUnfolding_maybe (idUnfolding bndr)
604 InlineWrapper {} -> 10 -- Note [INLINE pragmas]
605 _other -> 3 -- Data structures are more important than this
606 -- so that dictionary/method recursion unravels
607 -- Note that this case hits all InlineRule things, so we
608 -- never look at 'rhs for InlineRule stuff. That's right, because
609 -- 'rhs' is irrelevant for inlining things with an InlineRule
611 | is_con_app rhs = 5 -- Data types help with cases: Note [Constructor applications]
613 | exprIsTrivial rhs = 10 -- Practically certain to be inlined
614 -- Used to have also: && not (isExportedId bndr)
615 -- But I found this sometimes cost an extra iteration when we have
616 -- rec { d = (a,b); a = ...df...; b = ...df...; df = d }
617 -- where df is the exported dictionary. Then df makes a really
618 -- bad choice for loop breaker
621 -- If an Id is marked "never inline" then it makes a great loop breaker
622 -- The only reason for not checking that here is that it is rare
623 -- and I've never seen a situation where it makes a difference,
624 -- so it probably isn't worth the time to test on every binder
625 -- | isNeverActive (idInlinePragma bndr) = -10
627 | isOneOcc (idOccInfo bndr) = 2 -- Likely to be inlined
629 | canUnfold (realIdUnfolding bndr) = 1
630 -- The Id has some kind of unfolding
631 -- Ignore loop-breaker-ness here because that is what we are setting!
635 -- Checking for a constructor application
636 -- Cheap and cheerful; the simplifer moves casts out of the way
637 -- The lambda case is important to spot x = /\a. C (f a)
638 -- which comes up when C is a dictionary constructor and
639 -- f is a default method.
640 -- Example: the instance for Show (ST s a) in GHC.ST
642 -- However we *also* treat (\x. C p q) as a con-app-like thing,
643 -- Note [Closure conversion]
644 is_con_app (Var v) = isConLikeId v
645 is_con_app (App f _) = is_con_app f
646 is_con_app (Lam _ e) = is_con_app e
647 is_con_app (Note _ e) = is_con_app e
650 makeLoopBreaker :: Bool -> Id -> Id
651 -- Set the loop-breaker flag: see Note [Weak loop breakers]
652 makeLoopBreaker weak bndr
653 = ASSERT2( isId bndr, ppr bndr ) setIdOccInfo bndr (IAmALoopBreaker weak)
656 Note [Complexity of loop breaking]
657 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
658 The loop-breaking algorithm knocks out one binder at a time, and
659 performs a new SCC analysis on the remaining binders. That can
660 behave very badly in tightly-coupled groups of bindings; in the
661 worst case it can be (N**2)*log N, because it does a full SCC
662 on N, then N-1, then N-2 and so on.
664 To avoid this, we switch plans after 2 (or whatever) attempts:
665 Plan A: pick one binder with the lowest score, make it
666 a loop breaker, and try again
667 Plan B: pick *all* binders with the lowest score, make them
668 all loop breakers, and try again
669 Since there are only a small finite number of scores, this will
670 terminate in a constant number of iterations, rather than O(N)
673 You might thing that it's very unlikely, but RULES make it much
674 more likely. Here's a real example from Trac #1969:
675 Rec { $dm = \d.\x. op d
676 {-# RULES forall d. $dm Int d = $s$dm1
677 forall d. $dm Bool d = $s$dm2 #-}
679 dInt = MkD .... opInt ...
680 dInt = MkD .... opBool ...
685 $s$dm2 = \x. op dBool }
686 The RULES stuff means that we can't choose $dm as a loop breaker
687 (Note [Choosing loop breakers]), so we must choose at least (say)
688 opInt *and* opBool, and so on. The number of loop breakders is
689 linear in the number of instance declarations.
691 Note [INLINE pragmas]
692 ~~~~~~~~~~~~~~~~~~~~~
693 Avoid choosing a function with an INLINE pramga as the loop breaker!
694 If such a function is mutually-recursive with a non-INLINE thing,
695 then the latter should be the loop-breaker.
697 Usually this is just a question of optimisation. But a particularly
698 bad case is wrappers generated by the demand analyser: if you make
699 then into a loop breaker you may get an infinite inlining loop. For
702 $wfoo x = ....foo x....
704 {-loop brk-} foo x = ...$wfoo x...
706 The interface file sees the unfolding for $wfoo, and sees that foo is
707 strict (and hence it gets an auto-generated wrapper). Result: an
708 infinite inlining in the importing scope. So be a bit careful if you
709 change this. A good example is Tree.repTree in
710 nofib/spectral/minimax. If the repTree wrapper is chosen as the loop
711 breaker then compiling Game.hs goes into an infinite loop. This
712 happened when we gave is_con_app a lower score than inline candidates:
715 = __inline_me (/\a. \w w1 w2 ->
716 case Tree.$wrepTree @ a w w1 w2 of
717 { (# ww1, ww2 #) -> Branch @ a ww1 ww2 })
720 (# w2_smP, map a (Tree a) (Tree.repTree a w1 w) (w w2) #)
722 Here we do *not* want to choose 'repTree' as the loop breaker.
724 Note [DFuns should not be loop breakers]
725 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
726 It's particularly bad to make a DFun into a loop breaker. See
727 Note [How instance declarations are translated] in TcInstDcls
729 We give DFuns a higher score than ordinary CONLIKE things because
730 if there's a choice we want the DFun to be the non-looop breker. Eg
732 rec { sc = /\ a \$dC. $fBWrap (T a) ($fCT @ a $dC)
734 $fCT :: forall a_afE. (Roman.C a_afE) => Roman.C (Roman.T a_afE)
736 $fCT = /\a \$dC. MkD (T a) ((sc @ a $dC) |> blah) ($ctoF @ a $dC)
739 Here 'sc' (the superclass) looks CONLIKE, but we'll never get to it
740 if we can't unravel the DFun first.
742 Note [Constructor applications]
743 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
744 It's really really important to inline dictionaries. Real
745 example (the Enum Ordering instance from GHC.Base):
747 rec f = \ x -> case d of (p,q,r) -> p x
748 g = \ x -> case d of (p,q,r) -> q x
751 Here, f and g occur just once; but we can't inline them into d.
752 On the other hand we *could* simplify those case expressions if
753 we didn't stupidly choose d as the loop breaker.
754 But we won't because constructor args are marked "Many".
755 Inlining dictionaries is really essential to unravelling
756 the loops in static numeric dictionaries, see GHC.Float.
758 Note [Closure conversion]
759 ~~~~~~~~~~~~~~~~~~~~~~~~~
760 We treat (\x. C p q) as a high-score candidate in the letrec scoring algorithm.
761 The immediate motivation came from the result of a closure-conversion transformation
762 which generated code like this:
764 data Clo a b = forall c. Clo (c -> a -> b) c
766 ($:) :: Clo a b -> a -> b
767 Clo f env $: x = f env x
769 rec { plus = Clo plus1 ()
771 ; plus1 _ n = Clo plus2 n
774 ; plus2 (Succ m) n = Succ (plus $: m $: n) }
776 If we inline 'plus' and 'plus1', everything unravels nicely. But if
777 we choose 'plus1' as the loop breaker (which is entirely possible
778 otherwise), the loop does not unravel nicely.
781 @occAnalRhs@ deals with the question of bindings where the Id is marked
782 by an INLINE pragma. For these we record that anything which occurs
783 in its RHS occurs many times. This pessimistically assumes that ths
784 inlined binder also occurs many times in its scope, but if it doesn't
785 we'll catch it next time round. At worst this costs an extra simplifier pass.
786 ToDo: try using the occurrence info for the inline'd binder.
788 [March 97] We do the same for atomic RHSs. Reason: see notes with reOrderRec.
789 [June 98, SLPJ] I've undone this change; I don't understand it. See notes with reOrderRec.
794 -> Maybe Id -> CoreExpr -- Binder and rhs
795 -- Just b => non-rec, and alrady tagged with occurrence info
796 -- Nothing => Rec, no occ info
797 -> (UsageDetails, CoreExpr)
798 -- Returned usage details covers only the RHS,
799 -- and *not* the RULE or INLINE template for the Id
800 occAnalRhs env mb_bndr rhs
803 -- See Note [Cascading inlines]
804 ctxt = case mb_bndr of
805 Just b | certainly_inline b -> env
806 _other -> rhsCtxt env
808 certainly_inline bndr -- See Note [Cascading inlines]
809 = case idOccInfo bndr of
810 OneOcc in_lam one_br _ -> not in_lam && one_br && active && not_stable
813 active = isAlwaysActive (idInlineActivation bndr)
814 not_stable = not (isStableUnfolding (idUnfolding bndr))
816 addIdOccs :: UsageDetails -> VarSet -> UsageDetails
817 addIdOccs usage id_set = foldVarSet add usage id_set
819 add v u | isId v = addOneOcc u v NoOccInfo
821 -- Give a non-committal binder info (i.e NoOccInfo) because
822 -- a) Many copies of the specialised thing can appear
823 -- b) We don't want to substitute a BIG expression inside a RULE
824 -- even if that's the only occurrence of the thing
825 -- (Same goes for INLINE.)
828 Note [Cascading inlines]
829 ~~~~~~~~~~~~~~~~~~~~~~~~
830 By default we use an rhsCtxt for the RHS of a binding. This tells the
831 occ anal n that it's looking at an RHS, which has an effect in
832 occAnalApp. In particular, for constructor applications, it makes
833 the arguments appear to have NoOccInfo, so that we don't inline into
836 we do not want to inline x.
838 But there's a problem. Consider
843 First time round, it looks as if x1 and x2 occur as an arg of a
844 let-bound constructor ==> give them a many-occurrence.
845 But then x3 is inlined (unconditionally as it happens) and
846 next time round, x2 will be, and the next time round x1 will be
847 Result: multiple simplifier iterations. Sigh.
849 So, when analysing the RHS of x3 we notice that x3 will itself
850 definitely inline the next time round, and so we analyse x3's rhs in
851 an ordinary context, not rhsCtxt. Hence the "certainly_inline" stuff.
853 Annoyingly, we have to approximiate SimplUtils.preInlineUnconditionally.
854 If we say "yes" when preInlineUnconditionally says "no" the simplifier iterates
864 This is worse than the slow cascade, so we only want to say "certainly_inline"
865 if it really is certain. Look at the note with preInlineUnconditionally
866 for the various clauses.
873 -> (UsageDetails, -- Gives info only about the "interesting" Ids
876 occAnal _ expr@(Type _) = (emptyDetails, expr)
877 occAnal _ expr@(Lit _) = (emptyDetails, expr)
878 occAnal env expr@(Var v) = (mkOneOcc env v False, expr)
879 -- At one stage, I gathered the idRuleVars for v here too,
880 -- which in a way is the right thing to do.
881 -- But that went wrong right after specialisation, when
882 -- the *occurrences* of the overloaded function didn't have any
883 -- rules in them, so the *specialised* versions looked as if they
884 -- weren't used at all.
886 occAnal _ (Coercion co)
887 = (addIdOccs emptyDetails (coVarsOfCo co), Coercion co)
888 -- See Note [Gather occurrences of coercion veriables]
891 Note [Gather occurrences of coercion veriables]
892 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
893 We need to gather info about what coercion variables appear, so that
894 we can sort them into the right place when doing dependency analysis.
900 occAnal env (Note note@(SCC _) body)
901 = case occAnal env body of { (usage, body') ->
902 (mapVarEnv markInsideSCC usage, Note note body')
905 occAnal env (Note note body)
906 = case occAnal env body of { (usage, body') ->
907 (usage, Note note body')
910 occAnal env (Cast expr co)
911 = case occAnal env expr of { (usage, expr') ->
912 let usage1 = markManyIf (isRhsEnv env) usage
913 usage2 = addIdOccs usage1 (coVarsOfCo co)
914 -- See Note [Gather occurrences of coercion veriables]
915 in (usage2, Cast expr' co)
916 -- If we see let x = y `cast` co
917 -- then mark y as 'Many' so that we don't
918 -- immediately inline y again.
923 occAnal env app@(App _ _)
924 = occAnalApp env (collectArgs app)
926 -- Ignore type variables altogether
927 -- (a) occurrences inside type lambdas only not marked as InsideLam
928 -- (b) type variables not in environment
930 occAnal env (Lam x body) | isTyVar x
931 = case occAnal env body of { (body_usage, body') ->
932 (body_usage, Lam x body')
935 -- For value lambdas we do a special hack. Consider
937 -- If we did nothing, x is used inside the \y, so would be marked
938 -- as dangerous to dup. But in the common case where the abstraction
939 -- is applied to two arguments this is over-pessimistic.
940 -- So instead, we just mark each binder with its occurrence
941 -- info in the *body* of the multiple lambda.
942 -- Then, the simplifier is careful when partially applying lambdas.
944 occAnal env expr@(Lam _ _)
945 = case occAnal env_body body of { (body_usage, body') ->
947 (final_usage, tagged_binders) = tagLamBinders body_usage binders'
948 -- Use binders' to put one-shot info on the lambdas
950 -- URGH! Sept 99: we don't seem to be able to use binders' here, because
951 -- we get linear-typed things in the resulting program that we can't handle yet.
952 -- (e.g. PrelShow) TODO
954 really_final_usage = if linear then
957 mapVarEnv markInsideLam final_usage
960 mkLams tagged_binders body') }
962 env_body = vanillaCtxt (trimOccEnv env binders)
963 -- Body is (no longer) an RhsContext
964 (binders, body) = collectBinders expr
965 binders' = oneShotGroup env binders
966 linear = all is_one_shot binders'
967 is_one_shot b = isId b && isOneShotBndr b
969 occAnal env (Case scrut bndr ty alts)
970 = case occ_anal_scrut scrut alts of { (scrut_usage, scrut') ->
971 case mapAndUnzip occ_anal_alt alts of { (alts_usage_s, alts') ->
973 alts_usage = foldr1 combineAltsUsageDetails alts_usage_s
974 (alts_usage1, tagged_bndr) = tag_case_bndr alts_usage bndr
975 total_usage = scrut_usage +++ alts_usage1
977 total_usage `seq` (total_usage, Case scrut' tagged_bndr ty alts') }}
979 -- Note [Case binder usage]
980 -- ~~~~~~~~~~~~~~~~~~~~~~~~
981 -- The case binder gets a usage of either "many" or "dead", never "one".
982 -- Reason: we like to inline single occurrences, to eliminate a binding,
983 -- but inlining a case binder *doesn't* eliminate a binding.
984 -- We *don't* want to transform
985 -- case x of w { (p,q) -> f w }
987 -- case x of w { (p,q) -> f (p,q) }
988 tag_case_bndr usage bndr
989 = case lookupVarEnv usage bndr of
990 Nothing -> (usage, setIdOccInfo bndr IAmDead)
991 Just _ -> (usage `delVarEnv` bndr, setIdOccInfo bndr NoOccInfo)
993 alt_env = mkAltEnv env scrut bndr
994 occ_anal_alt = occAnalAlt alt_env bndr
996 occ_anal_scrut (Var v) (alt1 : other_alts)
997 | not (null other_alts) || not (isDefaultAlt alt1)
998 = (mkOneOcc env v True, Var v) -- The 'True' says that the variable occurs
999 -- in an interesting context; the case has
1000 -- at least one non-default alternative
1001 occ_anal_scrut scrut _alts
1002 = occAnal (vanillaCtxt env) scrut -- No need for rhsCtxt
1004 occAnal env (Let bind body)
1005 = case occAnal env_body body of { (body_usage, body') ->
1006 case occAnalBind env env_body bind body_usage of { (final_usage, new_binds) ->
1007 (final_usage, mkLets new_binds body') }}
1009 env_body = trimOccEnv env (bindersOf bind)
1011 occAnalArgs :: OccEnv -> [CoreExpr] -> (UsageDetails, [CoreExpr])
1012 occAnalArgs env args
1013 = case mapAndUnzip (occAnal arg_env) args of { (arg_uds_s, args') ->
1014 (foldr (+++) emptyDetails arg_uds_s, args')}
1016 arg_env = vanillaCtxt env
1019 Applications are dealt with specially because we want
1020 the "build hack" to work.
1022 Note [Arguments of let-bound constructors]
1023 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1025 f x = let y = expensive x in
1027 (case z of {(p,q)->q}, case z of {(p,q)->q})
1028 We feel free to duplicate the WHNF (True,y), but that means
1029 that y may be duplicated thereby.
1031 If we aren't careful we duplicate the (expensive x) call!
1032 Constructors are rather like lambdas in this way.
1035 occAnalApp :: OccEnv
1036 -> (Expr CoreBndr, [Arg CoreBndr])
1037 -> (UsageDetails, Expr CoreBndr)
1038 occAnalApp env (Var fun, args)
1039 = case args_stuff of { (args_uds, args') ->
1041 final_args_uds = markManyIf (isRhsEnv env && is_exp) args_uds
1042 -- We mark the free vars of the argument of a constructor or PAP
1043 -- as "many", if it is the RHS of a let(rec).
1044 -- This means that nothing gets inlined into a constructor argument
1045 -- position, which is what we want. Typically those constructor
1046 -- arguments are just variables, or trivial expressions.
1048 -- This is the *whole point* of the isRhsEnv predicate
1049 -- See Note [Arguments of let-bound constructors]
1051 (fun_uds +++ final_args_uds, mkApps (Var fun) args') }
1053 fun_uniq = idUnique fun
1054 fun_uds = mkOneOcc env fun (valArgCount args > 0)
1055 is_exp = isExpandableApp fun (valArgCount args)
1056 -- See Note [CONLIKE pragma] in BasicTypes
1057 -- The definition of is_exp should match that in
1058 -- Simplify.prepareRhs
1060 -- Hack for build, fold, runST
1061 args_stuff | fun_uniq == buildIdKey = appSpecial env 2 [True,True] args
1062 | fun_uniq == augmentIdKey = appSpecial env 2 [True,True] args
1063 | fun_uniq == foldrIdKey = appSpecial env 3 [False,True] args
1064 | fun_uniq == runSTRepIdKey = appSpecial env 2 [True] args
1065 -- (foldr k z xs) may call k many times, but it never
1066 -- shares a partial application of k; hence [False,True]
1067 -- This means we can optimise
1068 -- foldr (\x -> let v = ...x... in \y -> ...v...) z xs
1069 -- by floating in the v
1071 | otherwise = occAnalArgs env args
1074 occAnalApp env (fun, args)
1075 = case occAnal (addAppCtxt env args) fun of { (fun_uds, fun') ->
1076 -- The addAppCtxt is a bit cunning. One iteration of the simplifier
1077 -- often leaves behind beta redexs like
1078 -- (\x y -> e) a1 a2
1079 -- Here we would like to mark x,y as one-shot, and treat the whole
1080 -- thing much like a let. We do this by pushing some True items
1081 -- onto the context stack.
1083 case occAnalArgs env args of { (args_uds, args') ->
1085 final_uds = fun_uds +++ args_uds
1087 (final_uds, mkApps fun' args') }}
1090 markManyIf :: Bool -- If this is true
1091 -> UsageDetails -- Then do markMany on this
1093 markManyIf True uds = mapVarEnv markMany uds
1094 markManyIf False uds = uds
1096 appSpecial :: OccEnv
1097 -> Int -> CtxtTy -- Argument number, and context to use for it
1099 -> (UsageDetails, [CoreExpr])
1100 appSpecial env n ctxt args
1103 arg_env = vanillaCtxt env
1105 go _ [] = (emptyDetails, []) -- Too few args
1107 go 1 (arg:args) -- The magic arg
1108 = case occAnal (setCtxtTy arg_env ctxt) arg of { (arg_uds, arg') ->
1109 case occAnalArgs env args of { (args_uds, args') ->
1110 (arg_uds +++ args_uds, arg':args') }}
1113 = case occAnal arg_env arg of { (arg_uds, arg') ->
1114 case go (n-1) args of { (args_uds, args') ->
1115 (arg_uds +++ args_uds, arg':args') }}
1119 Note [Binders in case alternatives]
1120 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1122 case x of y { (a,b) -> f y }
1123 We treat 'a', 'b' as dead, because they don't physically occur in the
1124 case alternative. (Indeed, a variable is dead iff it doesn't occur in
1125 its scope in the output of OccAnal.) It really helps to know when
1126 binders are unused. See esp the call to isDeadBinder in
1127 Simplify.mkDupableAlt
1129 In this example, though, the Simplifier will bring 'a' and 'b' back to
1130 life, beause it binds 'y' to (a,b) (imagine got inlined and
1134 occAnalAlt :: OccEnv
1137 -> (UsageDetails, Alt IdWithOccInfo)
1138 occAnalAlt env case_bndr (con, bndrs, rhs)
1140 env' = trimOccEnv env bndrs
1142 case occAnal env' rhs of { (rhs_usage1, rhs1) ->
1144 proxies = getProxies env' case_bndr
1145 (rhs_usage2, rhs2) = foldrBag wrapProxy (rhs_usage1, rhs1) proxies
1146 (alt_usg, tagged_bndrs) = tagLamBinders rhs_usage2 bndrs
1147 bndrs' = tagged_bndrs -- See Note [Binders in case alternatives]
1149 (alt_usg, (con, bndrs', rhs2)) }
1151 wrapProxy :: ProxyBind -> (UsageDetails, CoreExpr) -> (UsageDetails, CoreExpr)
1152 wrapProxy (bndr, rhs_var, co) (body_usg, body)
1153 | not (bndr `usedIn` body_usg)
1156 = (body_usg' +++ rhs_usg, Let (NonRec tagged_bndr rhs) body)
1158 (body_usg', tagged_bndr) = tagBinder body_usg bndr
1159 rhs_usg = unitVarEnv rhs_var NoOccInfo -- We don't need exact info
1160 rhs = mkCoerce co (Var (zapIdOccInfo rhs_var)) -- See Note [Zap case binders in proxy bindings]
1164 %************************************************************************
1168 %************************************************************************
1172 = OccEnv { occ_encl :: !OccEncl -- Enclosing context information
1173 , occ_ctxt :: !CtxtTy -- Tells about linearity
1174 , occ_proxy :: ProxyEnv
1175 , occ_rule_fvs :: ImpRuleUsage
1176 , occ_rule_act :: Maybe (Activation -> Bool) -- Nothing => Rules are inactive
1177 -- See Note [Finding rule RHS free vars]
1181 -----------------------------
1182 -- OccEncl is used to control whether to inline into constructor arguments
1184 -- x = (p,q) -- Don't inline p or q
1185 -- y = /\a -> (p a, q a) -- Still don't inline p or q
1186 -- z = f (p,q) -- Do inline p,q; it may make a rule fire
1187 -- So OccEncl tells enought about the context to know what to do when
1188 -- we encounter a contructor application or PAP.
1191 = OccRhs -- RHS of let(rec), albeit perhaps inside a type lambda
1192 -- Don't inline into constructor args here
1193 | OccVanilla -- Argument of function, body of lambda, scruintee of case etc.
1194 -- Do inline into constructor args here
1196 instance Outputable OccEncl where
1197 ppr OccRhs = ptext (sLit "occRhs")
1198 ppr OccVanilla = ptext (sLit "occVanilla")
1200 type CtxtTy = [Bool]
1203 -- True:ctxt Analysing a function-valued expression that will be
1204 -- applied just once
1206 -- False:ctxt Analysing a function-valued expression that may
1207 -- be applied many times; but when it is,
1208 -- the CtxtTy inside applies
1210 initOccEnv :: Maybe (Activation -> Bool) -> [CoreRule]
1212 initOccEnv active_rule imp_rules
1213 = OccEnv { occ_encl = OccVanilla
1215 , occ_proxy = PE emptyVarEnv emptyVarSet
1216 , occ_rule_fvs = findImpRuleUsage active_rule imp_rules
1217 , occ_rule_act = active_rule }
1219 vanillaCtxt :: OccEnv -> OccEnv
1220 vanillaCtxt env = env { occ_encl = OccVanilla, occ_ctxt = [] }
1222 rhsCtxt :: OccEnv -> OccEnv
1223 rhsCtxt env = env { occ_encl = OccRhs, occ_ctxt = [] }
1225 setCtxtTy :: OccEnv -> CtxtTy -> OccEnv
1226 setCtxtTy env ctxt = env { occ_ctxt = ctxt }
1228 isRhsEnv :: OccEnv -> Bool
1229 isRhsEnv (OccEnv { occ_encl = OccRhs }) = True
1230 isRhsEnv (OccEnv { occ_encl = OccVanilla }) = False
1232 oneShotGroup :: OccEnv -> [CoreBndr] -> [CoreBndr]
1233 -- The result binders have one-shot-ness set that they might not have had originally.
1234 -- This happens in (build (\cn -> e)). Here the occurrence analyser
1235 -- linearity context knows that c,n are one-shot, and it records that fact in
1236 -- the binder. This is useful to guide subsequent float-in/float-out tranformations
1238 oneShotGroup (OccEnv { occ_ctxt = ctxt }) bndrs
1241 go _ [] rev_bndrs = reverse rev_bndrs
1243 go (lin_ctxt:ctxt) (bndr:bndrs) rev_bndrs
1244 | isId bndr = go ctxt bndrs (bndr':rev_bndrs)
1246 bndr' | lin_ctxt = setOneShotLambda bndr
1249 go ctxt (bndr:bndrs) rev_bndrs = go ctxt bndrs (bndr:rev_bndrs)
1251 addAppCtxt :: OccEnv -> [Arg CoreBndr] -> OccEnv
1252 addAppCtxt env@(OccEnv { occ_ctxt = ctxt }) args
1253 = env { occ_ctxt = replicate (valArgCount args) True ++ ctxt }
1256 %************************************************************************
1260 %************************************************************************
1263 type ImpRuleUsage = NameEnv UsageDetails
1264 -- Maps an *imported* Id f to the UsageDetails for *local* Ids
1265 -- used on the RHS for a *local* rule for f.
1270 Consider this, where A.g is an imported Id
1273 {-# RULE "foo" forall x. A.g x = f x #-}
1275 Obviously there's a loop, but the danger is that the occurrence analyser
1276 will say that 'f' is not a loop breaker. Then the simplifier will
1279 and then gaily inline 'f'. Result infinite loop. More realistically,
1280 these kind of rules are generated when specialising imported INLINABLE Ids.
1282 Solution: treat an occurrence of A.g as an occurrence of all the local Ids
1283 that occur on the RULE's RHS. This mapping from imported Id to local Ids
1284 is held in occ_rule_fvs.
1287 findImpRuleUsage :: Maybe (Activation -> Bool) -> [CoreRule] -> ImpRuleUsage
1288 -- Find the *local* Ids that can be reached transitively,
1289 -- via local rules, from each *imported* Id.
1290 -- Sigh: this function seems more complicated than it is really worth
1291 findImpRuleUsage Nothing _ = emptyNameEnv
1292 findImpRuleUsage (Just is_active) rules
1293 = mkNameEnv [ (f, mapUFM (\_ -> NoOccInfo) ls)
1295 , let ls = find_lcl_deps f
1296 , not (isEmptyVarSet ls) ]
1298 rule_names = map ru_fn rules
1299 rule_name_set = mkNameSet rule_names
1301 imp_deps :: NameEnv VarSet
1302 -- (f,g) means imported Id 'g' appears in RHS of
1303 -- rule for imported Id 'f', *or* does so transitively
1304 imp_deps = foldr add_imp emptyNameEnv rules
1306 | is_active (ruleActivation rule)
1307 = extendNameEnv_C unionVarSet acc (ru_fn rule)
1308 (exprSomeFreeVars keep_imp (ru_rhs rule))
1310 keep_imp v = isId v && (idName v `elemNameSet` rule_name_set)
1311 full_imp_deps = transClosureFV (ufmToList imp_deps)
1313 lcl_deps :: NameEnv VarSet
1314 -- (f, l) means localId 'l' appears immediately
1315 -- in the RHS of a rule for imported Id 'f'
1316 -- Remember, many rules might have the same ru_fn
1317 -- so we do need to fold
1318 lcl_deps = foldr add_lcl emptyNameEnv rules
1319 add_lcl rule acc = extendNameEnv_C unionVarSet acc (ru_fn rule)
1320 (exprFreeIds (ru_rhs rule))
1322 find_lcl_deps :: Name -> VarSet
1324 = foldVarSet (unionVarSet . lookup_lcl . idName) (lookup_lcl f)
1325 (lookupNameEnv full_imp_deps f `orElse` emptyVarSet)
1326 lookup_lcl :: Name -> VarSet
1327 lookup_lcl g = lookupNameEnv lcl_deps g `orElse` emptyVarSet
1330 transClosureFV :: Uniquable a => [(a, VarSet)] -> UniqFM VarSet
1331 -- If (f,g), (g,h) are in the input, then (f,h) is in the output
1332 transClosureFV fv_list
1334 | otherwise = transClosureFV new_fv_list
1336 env = listToUFM fv_list
1337 (no_change, new_fv_list) = mapAccumL bump True fv_list
1338 bump no_change (b,fvs)
1339 | no_change_here = (no_change, (b,fvs))
1340 | otherwise = (False, (b,new_fvs))
1342 (new_fvs, no_change_here) = extendFvs env fvs
1345 extendFvs :: UniqFM VarSet -> VarSet -> (VarSet, Bool)
1346 -- (extendFVs env s) returns
1347 -- (s `union` env(s), env(s) `subset` s)
1349 = foldVarSet add (s, True) s
1351 add v (vs, no_change_so_far)
1352 = case lookupUFM env v of
1353 Just fvs | not (fvs `subVarSet` s)
1354 -> (vs `unionVarSet` fvs, False)
1355 _ -> (vs, no_change_so_far)
1359 %************************************************************************
1363 %************************************************************************
1366 data ProxyEnv -- See Note [ProxyEnv]
1367 = PE (IdEnv -- Domain = scrutinee variables
1368 (Id, -- The scrutinee variable again
1369 [(Id,Coercion)])) -- The case binders that it maps to
1370 VarSet -- Free variables of both range and domain
1375 The ProxyEnv keeps track of the connection between case binders and
1376 scrutinee. Specifically, if
1377 sc |-> (sc, [...(cb, co)...])
1378 is a binding in the ProxyEnv, then
1380 Typically we add such a binding when encountering the case expression
1381 case (sc |> coi) of cb { ... }
1384 * The domain of the ProxyEnv is the variable (or casted variable)
1385 scrutinees of enclosing cases. This is additionally used
1386 to ensure we gather occurrence info even for GlobalId scrutinees;
1387 see Note [Binder swap for GlobalId scrutinee]
1389 * The ProxyEnv is just an optimisation; you can throw away any
1390 element without losing correctness. And we do so when pushing
1391 it inside a binding (see trimProxyEnv).
1393 * One scrutinee might map to many case binders: Eg
1394 case sc of cb1 { DEFAULT -> ....case sc of cb2 { ... } .. }
1397 * If sc1 |-> (sc2, [...(cb, co)...]), then sc1==sc2
1398 It's a UniqFM and we sometimes need the domain Id
1400 * Any particular case binder 'cb' occurs only once in entire range
1404 The Main Reason for having a ProxyEnv is so that when we encounter
1405 case e of cb { pi -> ri }
1406 we can find all the in-scope variables derivable from 'cb',
1407 and effectively add let-bindings for them (or at least for the
1408 ones *mentioned* in ri) thus:
1409 case e of cb { pi -> let { x = ..cb..; y = ...cb.. }
1411 In this way we'll replace occurrences of 'x', 'y' with 'cb',
1412 which implements the Binder-swap idea (see Note [Binder swap])
1414 The function getProxies finds these bindings; then we
1415 add just the necessary ones, using wrapProxy.
1419 We do these two transformations right here:
1421 (1) case x of b { pi -> ri }
1423 case x of b { pi -> let x=b in ri }
1425 (2) case (x |> co) of b { pi -> ri }
1427 case (x |> co) of b { pi -> let x = b |> sym co in ri }
1429 Why (2)? See Note [Case of cast]
1431 In both cases, in a particular alternative (pi -> ri), we only
1433 (a) x occurs free in (pi -> ri)
1434 (ie it occurs in ri, but is not bound in pi)
1435 (b) the pi does not bind b (or the free vars of co)
1436 We need (a) and (b) for the inserted binding to be correct.
1438 For the alternatives where we inject the binding, we can transfer
1439 all x's OccInfo to b. And that is the point.
1442 * The deliberate shadowing of 'x'.
1443 * That (a) rapidly becomes false, so no bindings are injected.
1445 The reason for doing these transformations here is because it allows
1446 us to adjust the OccInfo for 'x' and 'b' as we go.
1448 * Suppose the only occurrences of 'x' are the scrutinee and in the
1449 ri; then this transformation makes it occur just once, and hence
1450 get inlined right away.
1452 * If we do this in the Simplifier, we don't know whether 'x' is used
1453 in ri, so we are forced to pessimistically zap b's OccInfo even
1454 though it is typically dead (ie neither it nor x appear in the
1455 ri). There's nothing actually wrong with zapping it, except that
1456 it's kind of nice to know which variables are dead. My nose
1457 tells me to keep this information as robustly as possible.
1459 The Maybe (Id,CoreExpr) passed to occAnalAlt is the extra let-binding
1460 {x=b}; it's Nothing if the binder-swap doesn't happen.
1462 There is a danger though. Consider
1464 in case (f v) of w -> ...v...v...
1465 And suppose that (f v) expands to just v. Then we'd like to
1466 use 'w' instead of 'v' in the alternative. But it may be too
1467 late; we may have substituted the (cheap) x+#y for v in the
1468 same simplifier pass that reduced (f v) to v.
1470 I think this is just too bad. CSE will recover some of it.
1474 Consider case (x `cast` co) of b { I# ->
1475 ... (case (x `cast` co) of {...}) ...
1476 We'd like to eliminate the inner case. That is the motivation for
1477 equation (2) in Note [Binder swap]. When we get to the inner case, we
1478 inline x, cancel the casts, and away we go.
1480 Note [Binder swap on GlobalId scrutinees]
1481 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1482 When the scrutinee is a GlobalId we must take care in two ways
1484 i) In order to *know* whether 'x' occurs free in the RHS, we need its
1485 occurrence info. BUT, we don't gather occurrence info for
1486 GlobalIds. That's one use for the (small) occ_proxy env in OccEnv is
1487 for: it says "gather occurrence info for these.
1489 ii) We must call localiseId on 'x' first, in case it's a GlobalId, or
1490 has an External Name. See, for example, SimplEnv Note [Global Ids in
1493 Note [getProxies is subtle]
1494 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
1495 The code for getProxies isn't all that obvious. Consider
1497 case v |> cov of x { DEFAULT ->
1498 case x |> cox1 of y { DEFAULT ->
1499 case x |> cox2 of z { DEFAULT -> r
1501 These will give us a ProxyEnv looking like:
1502 x |-> (x, [(y, cox1), (z, cox2)])
1503 v |-> (v, [(x, cov)])
1505 From this we want to extract the bindings
1510 Notice that later bindings may mention earlier ones, and that
1511 we need to go "both ways".
1513 Note [Zap case binders in proxy bindings]
1514 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1516 case x of cb(dead) { p -> ...x... }
1518 case x of cb(live) { p -> let x = cb in ...x... }
1520 Core Lint never expects to find an *occurence* of an Id marked
1521 as Dead, so we must zap the OccInfo on cb before making the
1522 binding x = cb. See Trac #5028.
1524 Historical note [no-case-of-case]
1525 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1526 We *used* to suppress the binder-swap in case expressions when
1527 -fno-case-of-case is on. Old remarks:
1528 "This happens in the first simplifier pass,
1529 and enhances full laziness. Here's the bad case:
1530 f = \ y -> ...(case x of I# v -> ...(case x of ...) ... )
1531 If we eliminate the inner case, we trap it inside the I# v -> arm,
1532 which might prevent some full laziness happening. I've seen this
1533 in action in spectral/cichelli/Prog.hs:
1534 [(m,n) | m <- [1..max], n <- [1..max]]
1535 Hence the check for NoCaseOfCase."
1536 However, now the full-laziness pass itself reverses the binder-swap, so this
1537 check is no longer necessary.
1539 Historical note [Suppressing the case binder-swap]
1540 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1541 This old note describes a problem that is also fixed by doing the
1542 binder-swap in OccAnal:
1544 There is another situation when it might make sense to suppress the
1545 case-expression binde-swap. If we have
1547 case x of w1 { DEFAULT -> case x of w2 { A -> e1; B -> e2 }
1548 ...other cases .... }
1550 We'll perform the binder-swap for the outer case, giving
1552 case x of w1 { DEFAULT -> case w1 of w2 { A -> e1; B -> e2 }
1553 ...other cases .... }
1555 But there is no point in doing it for the inner case, because w1 can't
1556 be inlined anyway. Furthermore, doing the case-swapping involves
1557 zapping w2's occurrence info (see paragraphs that follow), and that
1558 forces us to bind w2 when doing case merging. So we get
1560 case x of w1 { A -> let w2 = w1 in e1
1561 B -> let w2 = w1 in e2
1562 ...other cases .... }
1564 This is plain silly in the common case where w2 is dead.
1566 Even so, I can't see a good way to implement this idea. I tried
1567 not doing the binder-swap if the scrutinee was already evaluated
1568 but that failed big-time:
1572 case v of w { MkT x ->
1573 case x of x1 { I# y1 ->
1574 case x of x2 { I# y2 -> ...
1576 Notice that because MkT is strict, x is marked "evaluated". But to
1577 eliminate the last case, we must either make sure that x (as well as
1578 x1) has unfolding MkT y1. THe straightforward thing to do is to do
1579 the binder-swap. So this whole note is a no-op.
1581 It's fixed by doing the binder-swap in OccAnal because we can do the
1582 binder-swap unconditionally and still get occurrence analysis
1586 extendProxyEnv :: ProxyEnv -> Id -> Coercion -> Id -> ProxyEnv
1587 -- (extendPE x co y) typically arises from
1588 -- case (x |> co) of y { ... }
1589 -- It extends the proxy env with the binding
1591 extendProxyEnv pe scrut co case_bndr
1592 | scrut == case_bndr = PE env1 fvs1 -- If case_bndr shadows scrut,
1593 | otherwise = PE env2 fvs2 -- don't extend
1595 PE env1 fvs1 = trimProxyEnv pe [case_bndr]
1596 env2 = extendVarEnv_Acc add single env1 scrut1 (case_bndr,co)
1597 single cb_co = (scrut1, [cb_co])
1598 add cb_co (x, cb_cos) = (x, cb_co:cb_cos)
1599 fvs2 = fvs1 `unionVarSet` tyCoVarsOfCo co
1600 `extendVarSet` case_bndr
1601 `extendVarSet` scrut1
1603 scrut1 = mkLocalId (localiseName (idName scrut)) (idType scrut)
1604 -- Localise the scrut_var before shadowing it; we're making a
1605 -- new binding for it, and it might have an External Name, or
1606 -- even be a GlobalId; Note [Binder swap on GlobalId scrutinees]
1607 -- Also we don't want any INLINE or NOINLINE pragmas!
1610 type ProxyBind = (Id, Id, Coercion)
1611 -- (scrut variable, case-binder variable, coercion)
1613 getProxies :: OccEnv -> Id -> Bag ProxyBind
1614 -- Return a bunch of bindings [...(xi,ei)...]
1615 -- such that let { ...; xi=ei; ... } binds the xi using y alone
1616 -- See Note [getProxies is subtle]
1617 getProxies (OccEnv { occ_proxy = PE pe _ }) case_bndr
1618 = -- pprTrace "wrapProxies" (ppr case_bndr) $
1621 fwd_pe :: IdEnv (Id, Coercion)
1622 fwd_pe = foldVarEnv add1 emptyVarEnv pe
1624 add1 (x,ycos) env = foldr (add2 x) env ycos
1625 add2 x (y,co) env = extendVarEnv env y (x,co)
1627 go_fwd :: Id -> Bag ProxyBind
1628 -- Return bindings derivable from case_bndr
1629 go_fwd case_bndr = -- pprTrace "go_fwd" (vcat [ppr case_bndr, text "fwd_pe =" <+> ppr fwd_pe,
1630 -- text "pe =" <+> ppr pe]) $
1634 | Just (scrut, co) <- lookupVarEnv fwd_pe case_bndr
1635 = unitBag (scrut, case_bndr, mkSymCo co)
1636 `unionBags` go_fwd scrut
1637 `unionBags` go_bwd scrut [pr | pr@(cb,_) <- lookup_bwd scrut
1642 lookup_bwd :: Id -> [(Id, Coercion)]
1643 -- Return case_bndrs that are connected to scrut
1644 lookup_bwd scrut = case lookupVarEnv pe scrut of
1646 Just (_, cb_cos) -> cb_cos
1648 go_bwd :: Id -> [(Id, Coercion)] -> Bag ProxyBind
1649 go_bwd scrut cb_cos = foldr (unionBags . go_bwd1 scrut) emptyBag cb_cos
1651 go_bwd1 :: Id -> (Id, Coercion) -> Bag ProxyBind
1652 go_bwd1 scrut (case_bndr, co)
1653 = -- pprTrace "go_bwd1" (ppr case_bndr) $
1654 unitBag (case_bndr, scrut, co)
1655 `unionBags` go_bwd case_bndr (lookup_bwd case_bndr)
1658 mkAltEnv :: OccEnv -> CoreExpr -> Id -> OccEnv
1659 -- Does two things: a) makes the occ_ctxt = OccVanilla
1660 -- b) extends the ProxyEnv if possible
1661 mkAltEnv env scrut cb
1662 = env { occ_encl = OccVanilla, occ_proxy = pe' }
1666 Var v -> extendProxyEnv pe v (mkReflCo (idType v)) cb
1667 Cast (Var v) co -> extendProxyEnv pe v co cb
1668 _other -> trimProxyEnv pe [cb]
1671 trimOccEnv :: OccEnv -> [CoreBndr] -> OccEnv
1672 trimOccEnv env bndrs = env { occ_proxy = trimProxyEnv (occ_proxy env) bndrs }
1675 trimProxyEnv :: ProxyEnv -> [CoreBndr] -> ProxyEnv
1676 -- We are about to push this ProxyEnv inside a binding for 'bndrs'
1677 -- So dump any ProxyEnv bindings which mention any of the bndrs
1678 trimProxyEnv (PE pe fvs) bndrs
1679 | not (bndr_set `intersectsVarSet` fvs)
1682 = PE pe' (fvs `minusVarSet` bndr_set)
1684 pe' = mapVarEnv trim pe
1685 bndr_set = mkVarSet bndrs
1686 trim (scrut, cb_cos) | scrut `elemVarSet` bndr_set = (scrut, [])
1687 | otherwise = (scrut, filterOut discard cb_cos)
1688 discard (cb,co) = bndr_set `intersectsVarSet`
1689 extendVarSet (tyCoVarsOfCo co) cb
1693 %************************************************************************
1695 \subsection[OccurAnal-types]{OccEnv}
1697 %************************************************************************
1700 type UsageDetails = IdEnv OccInfo -- A finite map from ids to their usage
1701 -- INVARIANT: never IAmDead
1702 -- (Deadness is signalled by not being in the map at all)
1704 (+++), combineAltsUsageDetails
1705 :: UsageDetails -> UsageDetails -> UsageDetails
1708 = plusVarEnv_C addOccInfo usage1 usage2
1710 combineAltsUsageDetails usage1 usage2
1711 = plusVarEnv_C orOccInfo usage1 usage2
1713 addOneOcc :: UsageDetails -> Id -> OccInfo -> UsageDetails
1714 addOneOcc usage id info
1715 = plusVarEnv_C addOccInfo usage (unitVarEnv id info)
1716 -- ToDo: make this more efficient
1718 emptyDetails :: UsageDetails
1719 emptyDetails = (emptyVarEnv :: UsageDetails)
1721 usedIn :: Id -> UsageDetails -> Bool
1722 v `usedIn` details = isExportedId v || v `elemVarEnv` details
1724 type IdWithOccInfo = Id
1726 tagLamBinders :: UsageDetails -- Of scope
1728 -> (UsageDetails, -- Details with binders removed
1729 [IdWithOccInfo]) -- Tagged binders
1730 -- Used for lambda and case binders
1731 -- It copes with the fact that lambda bindings can have InlineRule
1732 -- unfoldings, used for join points
1733 tagLamBinders usage binders = usage' `seq` (usage', bndrs')
1735 (usage', bndrs') = mapAccumR tag_lam usage binders
1736 tag_lam usage bndr = (usage2, setBinderOcc usage bndr)
1738 usage1 = usage `delVarEnv` bndr
1739 usage2 | isId bndr = addIdOccs usage1 (idUnfoldingVars bndr)
1740 | otherwise = usage1
1742 tagBinder :: UsageDetails -- Of scope
1744 -> (UsageDetails, -- Details with binders removed
1745 IdWithOccInfo) -- Tagged binders
1747 tagBinder usage binder
1749 usage' = usage `delVarEnv` binder
1750 binder' = setBinderOcc usage binder
1752 usage' `seq` (usage', binder')
1754 setBinderOcc :: UsageDetails -> CoreBndr -> CoreBndr
1755 setBinderOcc usage bndr
1756 | isTyVar bndr = bndr
1757 | isExportedId bndr = case idOccInfo bndr of
1759 _ -> setIdOccInfo bndr NoOccInfo
1760 -- Don't use local usage info for visible-elsewhere things
1761 -- BUT *do* erase any IAmALoopBreaker annotation, because we're
1762 -- about to re-generate it and it shouldn't be "sticky"
1764 | otherwise = setIdOccInfo bndr occ_info
1766 occ_info = lookupVarEnv usage bndr `orElse` IAmDead
1770 %************************************************************************
1772 \subsection{Operations over OccInfo}
1774 %************************************************************************
1777 mkOneOcc :: OccEnv -> Id -> InterestingCxt -> UsageDetails
1778 mkOneOcc env id int_cxt
1779 | isLocalId id = unitVarEnv id (OneOcc False True int_cxt)
1780 | PE env _ <- occ_proxy env
1781 , id `elemVarEnv` env = unitVarEnv id NoOccInfo
1782 | Just uds <- lookupNameEnv (occ_rule_fvs env) (idName id)
1784 | otherwise = emptyDetails
1786 markMany, markInsideLam, markInsideSCC :: OccInfo -> OccInfo
1788 markMany _ = NoOccInfo
1790 markInsideSCC occ = markMany occ
1792 markInsideLam (OneOcc _ one_br int_cxt) = OneOcc True one_br int_cxt
1793 markInsideLam occ = occ
1795 addOccInfo, orOccInfo :: OccInfo -> OccInfo -> OccInfo
1797 addOccInfo a1 a2 = ASSERT( not (isDeadOcc a1 || isDeadOcc a2) )
1798 NoOccInfo -- Both branches are at least One
1799 -- (Argument is never IAmDead)
1801 -- (orOccInfo orig new) is used
1802 -- when combining occurrence info from branches of a case
1804 orOccInfo (OneOcc in_lam1 _ int_cxt1)
1805 (OneOcc in_lam2 _ int_cxt2)
1806 = OneOcc (in_lam1 || in_lam2)
1807 False -- False, because it occurs in both branches
1808 (int_cxt1 && int_cxt2)
1809 orOccInfo a1 a2 = ASSERT( not (isDeadOcc a1 || isDeadOcc a2) )