2 % (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
4 %************************************************************************
6 \section[FloatIn]{Floating Inwards pass}
8 %************************************************************************
10 The main purpose of @floatInwards@ is floating into branches of a
11 case, so that we don't allocate things, save them on the stack, and
12 then discover that they aren't needed in the chosen branch.
15 module FloatIn ( floatInwards ) where
17 #include "HsVersions.h"
19 import DynFlags ( DynFlags, DynFlag(..) )
21 import CoreUtils ( exprIsHNF, exprIsDupable )
22 import CoreLint ( showPass, endPass )
23 import CoreFVs ( CoreExprWithFVs, freeVars, freeVarsOf )
24 import Id ( isOneShotBndr )
25 import Var ( Id, idType )
26 import Type ( isUnLiftedType )
28 import Util ( zipEqual, zipWithEqual, count )
32 Top-level interface function, @floatInwards@. Note that we do not
33 actually float any bindings downwards from the top-level.
36 floatInwards :: DynFlags -> [CoreBind] -> IO [CoreBind]
38 floatInwards dflags binds
40 showPass dflags "Float inwards";
41 let { binds' = map fi_top_bind binds };
42 endPass dflags "Float inwards" Opt_D_verbose_core2core binds'
43 {- no specific flag for dumping float-in -}
47 fi_top_bind (NonRec binder rhs)
48 = NonRec binder (fiExpr [] (freeVars rhs))
49 fi_top_bind (Rec pairs)
50 = Rec [ (b, fiExpr [] (freeVars rhs)) | (b, rhs) <- pairs ]
53 %************************************************************************
55 \subsection{Mail from Andr\'e [edited]}
57 %************************************************************************
59 {\em Will wrote: What??? I thought the idea was to float as far
60 inwards as possible, no matter what. This is dropping all bindings
61 every time it sees a lambda of any kind. Help! }
63 You are assuming we DO DO full laziness AFTER floating inwards! We
64 have to [not float inside lambdas] if we don't.
66 If we indeed do full laziness after the floating inwards (we could
67 check the compilation flags for that) then I agree we could be more
68 aggressive and do float inwards past lambdas.
70 Actually we are not doing a proper full laziness (see below), which
71 was another reason for not floating inwards past a lambda.
73 This can easily be fixed. The problem is that we float lets outwards,
74 but there are a few expressions which are not let bound, like case
75 scrutinees and case alternatives. After floating inwards the
76 simplifier could decide to inline the let and the laziness would be
80 let a = expensive ==> \b -> case expensive of ...
81 in \ b -> case a of ...
86 to let bind the algebraic case scrutinees (done, I think) and
87 the case alternatives (except the ones with an
88 unboxed type)(not done, I think). This is best done in the
89 SetLevels.lhs module, which tags things with their level numbers.
91 do the full laziness pass (floating lets outwards).
93 simplify. The simplifier inlines the (trivial) lets that were
94 created but were not floated outwards.
97 With the fix I think Will's suggestion that we can gain even more from
98 strictness by floating inwards past lambdas makes sense.
100 We still gain even without going past lambdas, as things may be
101 strict in the (new) context of a branch (where it was floated to) or
104 let a = something case x of
105 in case x of alt1 -> case something of a -> a + a
106 alt1 -> a + a ==> alt2 -> b
109 let a = something let b = case something of a -> a + a
110 in let b = a + a ==> in (b,b)
113 Also, even if a is not found to be strict in the new context and is
114 still left as a let, if the branch is not taken (or b is not entered)
115 the closure for a is not built.
117 %************************************************************************
119 \subsection{Main floating-inwards code}
121 %************************************************************************
124 type FreeVarsSet = IdSet
126 type FloatingBinds = [(CoreBind, FreeVarsSet)]
127 -- In reverse dependency order (innermost bindiner first)
129 -- The FreeVarsSet is the free variables of the binding. In the case
130 -- of recursive bindings, the set doesn't include the bound
133 fiExpr :: FloatingBinds -- Binds we're trying to drop
134 -- as far "inwards" as possible
135 -> CoreExprWithFVs -- Input expr
136 -> CoreExpr -- Result
138 fiExpr to_drop (_, AnnVar v) = mkCoLets' to_drop (Var v)
140 fiExpr to_drop (_, AnnType ty) = ASSERT( null to_drop )
143 fiExpr to_drop (_, AnnLit lit) = Lit lit
146 Applications: we do float inside applications, mainly because we
147 need to get at all the arguments. The next simplifier run will
148 pull out any silly ones.
151 fiExpr to_drop (_,AnnApp fun arg)
152 = mkCoLets' drop_here (App (fiExpr fun_drop fun) (fiExpr arg_drop arg))
154 [drop_here, fun_drop, arg_drop] = sepBindsByDropPoint False [freeVarsOf fun, freeVarsOf arg] to_drop
157 We are careful about lambdas:
159 * We must be careful about floating inside inside a value lambda.
160 That risks losing laziness.
161 The float-out pass might rescue us, but then again it might not.
163 * We must be careful about type lambdas too. At one time we did, and
164 there is no risk of duplicating work thereby, but we do need to be
165 careful. In particular, here is a bad case (it happened in the
168 in let f = /\t -> \a -> ...
170 let f = /\t -> let v = ... in \a -> ...
171 This is bad as now f is an updatable closure (update PAP)
174 So we treat lambda in groups, using the following rule:
176 Float inside a group of lambdas only if
177 they are all either type lambdas or one-shot lambdas.
179 Otherwise drop all the bindings outside the group.
182 -- Hack alert! We only float in through one-shot lambdas,
183 -- not (as you might guess) through big lambdas.
184 -- Reason: we float *out* past big lambdas (see the test in the Lam
185 -- case of FloatOut.floatExpr) and we don't want to float straight
188 -- It *is* important to float into one-shot lambdas, however;
189 -- see the remarks with noFloatIntoRhs.
190 fiExpr to_drop lam@(_, AnnLam _ _)
191 | all is_one_shot bndrs -- Float in
192 = mkLams bndrs (fiExpr to_drop body)
194 | otherwise -- Dump it all here
195 = mkCoLets' to_drop (mkLams bndrs (fiExpr [] body))
198 (bndrs, body) = collectAnnBndrs lam
201 We don't float lets inwards past an SCC.
202 ToDo: keep info on current cc, and when passing
203 one, if it is not the same, annotate all lets in binds with current
204 cc, change current cc to the new one and float binds into expr.
207 fiExpr to_drop (_, AnnNote note@(SCC cc) expr)
208 = -- Wimp out for now
209 mkCoLets' to_drop (Note note (fiExpr [] expr))
211 fiExpr to_drop (_, AnnNote InlineMe expr)
212 = -- Ditto... don't float anything into an INLINE expression
213 mkCoLets' to_drop (Note InlineMe (fiExpr [] expr))
215 fiExpr to_drop (_, AnnNote note@(Coerce _ _) expr)
216 = -- Just float in past coercion
217 Note note (fiExpr to_drop expr)
219 fiExpr to_drop (_, AnnNote note@(CoreNote _) expr)
220 = Note note (fiExpr to_drop expr)
223 For @Lets@, the possible ``drop points'' for the \tr{to_drop}
224 bindings are: (a)~in the body, (b1)~in the RHS of a NonRec binding,
225 or~(b2), in each of the RHSs of the pairs of a @Rec@.
227 Note that we do {\em weird things} with this let's binding. Consider:
236 Look at the inner \tr{let}. As \tr{w} is used in both the bind and
237 body of the inner let, we could panic and leave \tr{w}'s binding where
238 it is. But \tr{v} is floatable further into the body of the inner let, and
239 {\em then} \tr{w} will also be only in the body of that inner let.
241 So: rather than drop \tr{w}'s binding here, we add it onto the list of
242 things to drop in the outer let's body, and let nature take its
246 fiExpr to_drop (_,AnnLet (AnnNonRec id rhs@(rhs_fvs, ann_rhs)) body)
247 = fiExpr new_to_drop body
249 body_fvs = freeVarsOf body
251 final_body_fvs | noFloatIntoRhs ann_rhs
252 || isUnLiftedType (idType id) = body_fvs `unionVarSet` rhs_fvs
253 | otherwise = body_fvs
254 -- See commments with letrec below
255 -- No point in floating in only to float straight out again
256 -- Ditto ok-for-speculation unlifted RHSs
258 [shared_binds, rhs_binds, body_binds] = sepBindsByDropPoint False [rhs_fvs, final_body_fvs] to_drop
260 new_to_drop = body_binds ++ -- the bindings used only in the body
261 [(NonRec id rhs', rhs_fvs')] ++ -- the new binding itself
262 shared_binds -- the bindings used both in rhs and body
264 -- Push rhs_binds into the right hand side of the binding
265 rhs' = fiExpr rhs_binds rhs
266 rhs_fvs' = rhs_fvs `unionVarSet` floatedBindsFVs rhs_binds
268 fiExpr to_drop (_,AnnLet (AnnRec bindings) body)
269 = fiExpr new_to_drop body
271 rhss = map snd bindings
273 rhss_fvs = map freeVarsOf rhss
274 body_fvs = freeVarsOf body
276 -- Add to body_fvs the free vars of any RHS that has
277 -- a lambda at the top. This has the effect of making it seem
278 -- that such things are used in the body as well, and hence prevents
279 -- them getting floated in. The big idea is to avoid turning:
282 -- letrec f = \z. ...x#...f...
285 -- letrec f = let x# = y# +# 1# in \z. ...x#...f... in ...
287 -- Because now we can't float the let out again, because a letrec
288 -- can't have unboxed bindings.
290 final_body_fvs = foldr (unionVarSet . get_extras) body_fvs rhss
291 get_extras (rhs_fvs, rhs) | noFloatIntoRhs rhs = rhs_fvs
292 | otherwise = emptyVarSet
294 (shared_binds:body_binds:rhss_binds) = sepBindsByDropPoint False (final_body_fvs:rhss_fvs) to_drop
296 new_to_drop = -- the bindings used only in the body
298 -- the new binding itself
299 [(Rec (fi_bind rhss_binds bindings), rhs_fvs')] ++
300 -- the bindings used both in rhs and body or in more than one rhs
303 rhs_fvs' = unionVarSet (unionVarSets rhss_fvs)
304 (unionVarSets (map floatedBindsFVs rhss_binds))
306 -- Push rhs_binds into the right hand side of the binding
307 fi_bind :: [FloatingBinds] -- one per "drop pt" conjured w/ fvs_of_rhss
308 -> [(Id, CoreExprWithFVs)]
311 fi_bind to_drops pairs
312 = [ (binder, fiExpr to_drop rhs)
313 | ((binder, rhs), to_drop) <- zipEqual "fi_bind" pairs to_drops ]
316 For @Case@, the possible ``drop points'' for the \tr{to_drop}
317 bindings are: (a)~inside the scrutinee, (b)~inside one of the
318 alternatives/default [default FVs always {\em first}!].
321 fiExpr to_drop (_, AnnCase scrut case_bndr ty alts)
322 = mkCoLets' drop_here1 $
323 mkCoLets' drop_here2 $
324 Case (fiExpr scrut_drops scrut) case_bndr ty
325 (zipWith fi_alt alts_drops_s alts)
327 -- Float into the scrut and alts-considered-together just like App
328 [drop_here1, scrut_drops, alts_drops] = sepBindsByDropPoint False [scrut_fvs, all_alts_fvs] to_drop
330 -- Float into the alts with the is_case flag set
331 (drop_here2 : alts_drops_s) = sepBindsByDropPoint True alts_fvs alts_drops
333 scrut_fvs = freeVarsOf scrut
334 alts_fvs = map alt_fvs alts
335 all_alts_fvs = unionVarSets alts_fvs
336 alt_fvs (con, args, rhs) = foldl delVarSet (freeVarsOf rhs) (case_bndr:args)
337 -- Delete case_bndr and args from free vars of rhs
338 -- to get free vars of alt
340 fi_alt to_drop (con, args, rhs) = (con, args, fiExpr to_drop rhs)
342 noFloatIntoRhs (AnnNote InlineMe _) = True
343 noFloatIntoRhs (AnnLam b _) = not (is_one_shot b)
344 -- IMPORTANT: don't say 'True' for a RHS with a one-shot lambda at the top.
345 -- This makes a big difference for things like
346 -- f x# = let x = I# x#
347 -- in let j = \() -> ...x...
348 -- in if <condition> then normal-path else j ()
349 -- If x is used only in the error case join point, j, we must float the
350 -- boxing constructor into it, else we box it every time which is very bad
353 noFloatIntoRhs rhs = exprIsHNF (deAnnotate' rhs) -- We'd just float right back out again...
355 is_one_shot b = isId b && isOneShotBndr b
359 %************************************************************************
361 \subsection{@sepBindsByDropPoint@}
363 %************************************************************************
365 This is the crucial function. The idea is: We have a wad of bindings
366 that we'd like to distribute inside a collection of {\em drop points};
367 insides the alternatives of a \tr{case} would be one example of some
368 drop points; the RHS and body of a non-recursive \tr{let} binding
369 would be another (2-element) collection.
371 So: We're given a list of sets-of-free-variables, one per drop point,
372 and a list of floating-inwards bindings. If a binding can go into
373 only one drop point (without suddenly making something out-of-scope),
374 in it goes. If a binding is used inside {\em multiple} drop points,
375 then it has to go in a you-must-drop-it-above-all-these-drop-points
378 We have to maintain the order on these drop-point-related lists.
382 :: Bool -- True <=> is case expression
383 -> [FreeVarsSet] -- One set of FVs per drop point
384 -> FloatingBinds -- Candidate floaters
385 -> [FloatingBinds] -- FIRST one is bindings which must not be floated
386 -- inside any drop point; the rest correspond
387 -- one-to-one with the input list of FV sets
389 -- Every input floater is returned somewhere in the result;
390 -- none are dropped, not even ones which don't seem to be
391 -- free in *any* of the drop-point fvs. Why? Because, for example,
392 -- a binding (let x = E in B) might have a specialised version of
393 -- x (say x') stored inside x, but x' isn't free in E or B.
395 type DropBox = (FreeVarsSet, FloatingBinds)
397 sepBindsByDropPoint is_case drop_pts []
398 = [] : [[] | p <- drop_pts] -- cut to the chase scene; it happens
400 sepBindsByDropPoint is_case drop_pts floaters
401 = go floaters (map (\fvs -> (fvs, [])) (emptyVarSet : drop_pts))
403 go :: FloatingBinds -> [DropBox] -> [FloatingBinds]
404 -- The *first* one in the argument list is the drop_here set
405 -- The FloatingBinds in the lists are in the reverse of
406 -- the normal FloatingBinds order; that is, they are the right way round!
408 go [] drop_boxes = map (reverse . snd) drop_boxes
410 go (bind_w_fvs@(bind, bind_fvs) : binds) drop_boxes@(here_box : fork_boxes)
413 -- "here" means the group of bindings dropped at the top of the fork
415 (used_here : used_in_flags) = [ any (`elemVarSet` fvs) (bindersOf bind)
416 | (fvs, drops) <- drop_boxes]
418 drop_here = used_here || not can_push
420 -- For case expressions we duplicate the binding if it is
421 -- reasonably small, and if it is not used in all the RHSs
422 -- This is good for situations like
427 -- E -> ...not mentioning x...
429 n_alts = length used_in_flags
430 n_used_alts = count id used_in_flags -- returns number of Trues in list.
432 can_push = n_used_alts == 1 -- Used in just one branch
433 || (is_case && -- We are looking at case alternatives
434 n_used_alts > 1 && -- It's used in more than one
435 n_used_alts < n_alts && -- ...but not all
436 bindIsDupable bind) -- and we can duplicate the binding
438 new_boxes | drop_here = (insert here_box : fork_boxes)
439 | otherwise = (here_box : new_fork_boxes)
441 new_fork_boxes = zipWithEqual "FloatIn.sepBinds" insert_maybe fork_boxes used_in_flags
443 insert :: DropBox -> DropBox
444 insert (fvs,drops) = (fvs `unionVarSet` bind_fvs, bind_w_fvs:drops)
446 insert_maybe box True = insert box
447 insert_maybe box False = box
450 floatedBindsFVs :: FloatingBinds -> FreeVarsSet
451 floatedBindsFVs binds = unionVarSets (map snd binds)
453 mkCoLets' :: FloatingBinds -> CoreExpr -> CoreExpr
454 mkCoLets' to_drop e = foldl (flip (Let . fst)) e to_drop
455 -- Remember to_drop is in *reverse* dependency order
457 bindIsDupable (Rec prs) = all (exprIsDupable . snd) prs
458 bindIsDupable (NonRec b r) = exprIsDupable r