    ( ProcPointSet, Status(..)
    , callProcPoints, minimalProcPointSet
    , addProcPointProtocols, splitAtProcPoints, procPointAnalysis
import Prelude hiding (zip, unzip, last)
import Cmm hiding (blockId)
import Data.List (sortBy)
import MkZipCfgCmm hiding (CmmBlock, CmmGraph, CmmTopZ)
-- Compute a minimal set of proc points for a control-flow graph.
-- Determine a protocol for each proc point (which live variables will
-- be passed as arguments and which will be on the stack).
A proc point is a basic block that, after CPS transformation, will
start a new function. The entry block of the original function is a
proc point, as is the continuation of each function call.
A third kind of proc point arises if we want to avoid copying code.
Suppose we have code like the following:

    if (...) { ..1..; call foo(); ..2..}
    else     { ..3..; call bar(); ..4..}

The statement 'x = y + z' can be reached from two different proc
points: the continuations of foo() and bar(). We would prefer not to
put a copy in each continuation; instead we would like 'x = y + z' to
be the start of a new procedure to which the continuations can jump:

    if (...) { ..1..; push k_foo; jump foo_cps(); }
    else     { ..3..; push k_bar; jump bar_cps(); }

  k_foo() { ..2..; jump k_join(y, z); }
  k_bar() { ..4..; jump k_join(y, z); }
  k_join(y, z) { x = y + z; return x; }

You might think then that a criterion to make a node a proc point is
that it is directly reached by two distinct proc points. (See Note
[Direct reachability].) But this criterion is a bit too simple; for
example, 'return x' is also reached by two proc points, yet there is
no point in pulling it out of k_join. A good criterion would be to
say that a node should be made a proc point if it is reached by a set
of proc points that is different from the set that reaches its
immediate dominator. NR believes this criterion can be shown to
produce a minimum set of proc points, and given a dominator tree, the
proc points can be chosen in time linear in the number of blocks.
Lacking a dominator analysis, however, we turn instead to an
iterative solution, starting with no proc points and adding them
according to these rules:

  1. The entry block is a proc point.
  2. The continuation of a call is a proc point.
  3. A node is a proc point if it is directly reached by more proc
     points than one of its predecessors.

Because we don't understand the problem very well, we apply rule 3 at
most once per iteration, then recompute the reachability information.
(See Note [No simple dataflow].) The choice of the new proc point is
arbitrary, and I don't know if the choice affects the final solution,
so I don't know if the number of proc points chosen is the
minimum, although the set will be minimal.
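{- Sketch (not part of this module). The three rules above can be tried
out on a toy graph. Everything below is a hypothetical stand-in written
for illustration: 'Cfg', 'directReach', 'nreached' and 'procPoints' are
not the names or types this module uses, and the sketch assumes every
node is reachable from the entry.

```haskell
import qualified Data.Map as M
import qualified Data.Set as S

type Node = String

-- Toy CFG (hypothetical): successor lists plus the call continuations.
data Cfg = Cfg
  { entry   :: Node
  , succsOf :: M.Map Node [Node]
  , conts   :: S.Set Node
  }

-- For each non-proc-point node, the proc points that directly reach it:
-- follow edges from each proc point without passing through another one.
directReach :: Cfg -> S.Set Node -> M.Map Node (S.Set Node)
directReach g pps = M.fromListWith S.union
  [ (n, S.singleton p) | p <- S.toList pps, n <- S.toList (from p) ]
  where
    from p = go S.empty (next p)
    next n = M.findWithDefault [] n (succsOf g)
    go seen [] = seen
    go seen (n : ns)
      | n `S.member` pps || n `S.member` seen = go seen ns
      | otherwise = go (S.insert n seen) (next n ++ ns)

-- A proc point counts as reached by one proc point (itself).
nreached :: S.Set Node -> M.Map Node (S.Set Node) -> Node -> Int
nreached pps reach n
  | n `S.member` pps = 1
  | otherwise        = S.size (M.findWithDefault S.empty n reach)

procPoints :: Cfg -> S.Set Node
procPoints g = loop (S.insert (entry g) (conts g))    -- rules 1 and 2
  where
    loop pps =
      let reach = directReach g pps
          cands = [ s | (b, ss) <- M.toList (succsOf g), s <- ss
                      , not (s `S.member` pps)
                      , nreached pps reach s > nreached pps reach b ]
      in case cands of
           []      -> pps
           (s : _) -> loop (S.insert s pps)           -- rule 3, one per round

-- The foo/bar example: 'join' holds x = y + z, 'ret' holds return x.
example :: Cfg
example = Cfg
  { entry   = "entry"
  , succsOf = M.fromList
      [ ("entry", ["a", "b"])
      , ("a", ["k_foo"]), ("b", ["k_bar"])
      , ("k_foo", ["join"]), ("k_bar", ["join"])
      , ("join", ["ret"]) ]
  , conts   = S.fromList ["k_foo", "k_bar"]
  }
```

On this example 'join' is promoted in the first round; 'ret' is then
reached only by 'join' and stays where it is.
-}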
type ProcPointSet = BlockSet

  = ReachedBy ProcPointSet  -- set of proc points that directly reach the block
  | ProcPoint               -- this block is itself a proc point

instance Outputable Status where
      | isEmptyBlockSet ps = text "<not-reached>"
      | otherwise = text "reached by" <+>
                    (hsep $ punctuate comma $ map ppr $ blockSetToList ps)
  ppr ProcPoint = text "<procpt>"

lattice :: DataflowLattice Status
lattice = DataflowLattice "direct proc-point reachability" unreached add_to False
    where unreached = ReachedBy emptyBlockSet
          add_to _ ProcPoint = noTx ProcPoint
          add_to ProcPoint _ = aTx ProcPoint -- aTx because of previous case again
          add_to (ReachedBy p) (ReachedBy p') =
              let union = unionBlockSets p p'
              in  if sizeBlockSet union > sizeBlockSet p' then
                      aTx (ReachedBy union)
--------------------------------------------------
-- transfer equations

forward :: ForwardTransfers Middle Last Status
forward = ForwardTransfers first middle last exit
    where first id ProcPoint = ReachedBy $ unitBlockSet id
          last (LastCall _ (Just id) _ _ _) _ = LastOutFacts [(id, ProcPoint)]
          last l x = LastOutFacts $ map (\id -> (id, x)) (succs l)

-- It is worth distinguishing two sets of proc points:
-- those that are induced by calls in the original graph
-- and those that are introduced because they're reachable from multiple proc points.
callProcPoints      :: CmmGraph -> ProcPointSet
minimalProcPointSet :: ProcPointSet -> CmmGraph -> FuelMonad ProcPointSet

callProcPoints g = fold_blocks add (unitBlockSet (lg_entry g)) g
  where add b set = case last $ unzip b of
                      LastOther (LastCall _ (Just k) _ _ _) -> extendBlockSet set k
minimalProcPointSet callProcPoints g = extendPPSet g (postorder_dfs g) callProcPoints
type PPFix = FuelMonad (ForwardFixedPoint Middle Last Status ())

procPointAnalysis :: ProcPointSet -> CmmGraph -> FuelMonad (BlockEnv Status)
procPointAnalysis procPoints g =
  let addPP env id = extendBlockEnv env id ProcPoint
      initProcPoints = foldl addPP emptyBlockEnv (blockSetToList procPoints)
  in liftM zdfFpFacts $
        (zdfSolveFrom initProcPoints "proc-point reachability" lattice
                      forward (fact_bot lattice) $ graphOfLGraph g :: PPFix)

extendPPSet :: CmmGraph -> [CmmBlock] -> ProcPointSet -> FuelMonad ProcPointSet
extendPPSet g blocks procPoints =
    do env <- procPointAnalysis procPoints g
       let add block pps = let id = blockId block
                           in  case lookupBlockEnv env id of
                                 Just ProcPoint -> extendBlockSet pps id
           procPoints' = fold_blocks add emptyBlockSet g
           newPoints = mapMaybe ppSuccessor blocks
           newPoint  = listToMaybe newPoints
           ppSuccessor b@(Block bid _) =
               let nreached id = case lookupBlockEnv env id `orElse`
                                       pprPanic "no ppt" (ppr id <+> ppr b) of
                                   ReachedBy ps -> sizeBlockSet ps
                   block_procpoints = nreached bid
                   -- | Looking for a successor of b that is reached by
                   -- more proc points than b and is not already a proc
                   -- point. If found, it can become a proc point.
                   newId succ_id = not (elemBlockSet succ_id procPoints') &&
                                   nreached succ_id > block_procpoints
               in  listToMaybe $ filter newId $ succs b
           []  -> return procPoints'
           pps -> extendPPSet g blocks
                    (foldl extendBlockSet procPoints' pps)
       case newPoint of Just id ->
                          if elemBlockSet id procPoints' then panic "added old proc pt"
                          else extendPPSet g blocks (extendBlockSet procPoints' id)
                        Nothing -> return procPoints'
------------------------------------------------------------------------
--                    Computing Proc-Point Protocols                  --
------------------------------------------------------------------------

There is one major trick, discovered by Michael Adams, which is that
we want to choose protocols in a way that enables us to optimize away
some continuations. The optimization is very much like branch-chain
elimination, except that it involves passing results as well as
control. The idea is that if a call's continuation k does nothing but
CopyIn its results and then goto proc point P, the call's continuation
may be changed to P, *provided* P's protocol is identical to the
protocol for the CopyIn. We choose protocols to make this so.

Here's an explanatory example; we begin with the source code (lines
separate basic blocks):

Zipperization converts this code as follows:

      call g() returns to k;

What we'd like to do is assign P the same CopyIn protocol as k, so we

      call g() returns to P;

   P: CopyIn(x, y); ..2..;

Of course, P may be the target of more than one continuation, and
different continuations may have different protocols. Michael Adams
implemented a voting mechanism, but he thinks a simple greedy
algorithm would be just as good, so that's what we do.
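{- Sketch (not part of this module). One way to picture the greedy
choice: the first continuation that branches to P fixes P's protocol,
and a later call is redirected to P only when its continuation's
protocol is identical. The types 'Call' and 'Protocol' and the function
'assign' below are hypothetical simplifications for illustration; the
real code works on blocks and on the 'Protocol' type of this module.

```haskell
import qualified Data.Map as M

type Label    = String
type Protocol = [String]   -- hypothetical stand-in: just the formals

-- One call site: its continuation k, the protocol of k's CopyIn, and the
-- proc point that k branches to, if k does nothing else.
data Call = Call
  { cont      :: Label
  , contProto :: Protocol
  , target    :: Maybe Label
  }

-- Returns the protocols chosen for proc points and the list of
-- (continuation, proc point) pairs whose calls can be redirected.
assign :: [Call] -> (M.Map Label Protocol, [(Label, Label)])
assign = foldl step (M.empty, [])
  where
    step (protos, redirects) (Call k proto (Just p)) =
      case M.lookup p protos of
        Nothing -> (M.insert p proto protos, redirects ++ [(k, p)])
        Just proto'
          | proto == proto' -> (protos, redirects ++ [(k, p)])
          | otherwise       -> (protos, redirects)   -- keep k as it is
    step acc _ = acc

calls :: [Call]
calls =
  [ Call "k1" ["x", "y"] (Just "P")
  , Call "k2" ["x"]      (Just "P")   -- protocol differs: not redirected
  , Call "k3" ["x", "y"] (Just "P")
  , Call "k4" ["z"]      Nothing      -- does not branch to a proc point
  ]
```

Here k1 decides P's protocol, k3 agrees with it and is redirected, and
k2 keeps its own continuation.
-}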
data Protocol = Protocol Convention CmmFormals Area

instance Outputable Protocol where
  ppr (Protocol c fs a) = text "Protocol" <+> ppr c <+> ppr fs <+> ppr a

-- | Function 'optimize_calls' chooses protocols only for those proc
-- points that are relevant to the optimization explained above.
-- The others are assigned by 'add_unassigned', which is not yet clever.
addProcPointProtocols :: ProcPointSet -> ProcPointSet -> CmmGraph -> FuelMonad CmmGraph
addProcPointProtocols callPPs procPoints g =
  do liveness <- cmmLivenessZ g
     (protos, g') <- optimize_calls liveness g
     blocks'' <- add_CopyOuts protos procPoints g'
     return $ LGraph (lg_entry g) blocks''
    where optimize_calls liveness g =  -- see Note [Separate Adams optimization]
            do let (protos, blocks') =
                       fold_blocks maybe_add_call (init_protocols, emptyBlockEnv) g
                   protos' = add_unassigned liveness procPoints protos
               blocks <- add_CopyIns callPPs protos' blocks'
               let g' = LGraph (lg_entry g) (mkBlockEnv (map withKey (concat blocks)))
                   withKey b@(Block bid _) = (bid, b)
               return (protos', runTx removeUnreachableBlocksZ g')
          maybe_add_call :: CmmBlock -> (BlockEnv Protocol, BlockEnv CmmBlock)
                         -> (BlockEnv Protocol, BlockEnv CmmBlock)
          -- ^ If the block is a call whose continuation goes to a proc point
          -- whose protocol either matches the continuation's or is not yet set,
          -- redirect the call (cf 'newblock') and set the protocol if necessary
          maybe_add_call block (protos, blocks) =
              case goto_end $ unzip block of
                (h, LastOther (LastCall tgt (Just k) args res s))
                    | Just proto <- lookupBlockEnv protos k,
                      Just pee   <- branchesToProcPoint k
                    -> let newblock = zipht h (tailOfLast (LastCall tgt (Just pee)
                           changed_blocks   = insertBlock newblock blocks
                           unchanged_blocks = insertBlock block    blocks
                       in case lookupBlockEnv protos pee of
                            Nothing -> (extendBlockEnv protos pee proto,changed_blocks)
                              if proto == proto' then (protos, changed_blocks)
                              else (protos, unchanged_blocks)
                _ -> (protos, insertBlock block blocks)

          branchesToProcPoint :: BlockId -> Maybe BlockId
          -- ^ Tells whether the named block is just a branch to a proc point
          branchesToProcPoint id =
              let (Block _ t) = lookupBlockEnv (lg_blocks g) id `orElse`
                                panic "branch out of graph"
                   ZLast (LastOther (LastBranch pee))
                       | elemBlockSet pee procPoints -> Just pee
          init_protocols = fold_blocks maybe_add_proto emptyBlockEnv g
          maybe_add_proto :: CmmBlock -> BlockEnv Protocol -> BlockEnv Protocol
          --maybe_add_proto (Block id (ZTail (CopyIn c _ fs _srt) _)) env =
          --    extendBlockEnv env id (Protocol c fs $ toArea id fs)
          maybe_add_proto _ env = env
          -- JD: Is this proto stuff even necessary, now that we have
          -- common blockification?
-- | For now, following a suggestion by Ben Lippmeier, we pass all
-- live variables as arguments, hoping that a clever register
-- allocator might help.

add_unassigned :: BlockEnv CmmLive -> ProcPointSet -> BlockEnv Protocol ->
add_unassigned = pass_live_vars_as_args

pass_live_vars_as_args :: BlockEnv CmmLive -> ProcPointSet ->
                          BlockEnv Protocol -> BlockEnv Protocol
pass_live_vars_as_args _liveness procPoints protos = protos'
  where protos' = foldBlockSet addLiveVars protos procPoints
        addLiveVars :: BlockId -> BlockEnv Protocol -> BlockEnv Protocol
        addLiveVars id protos =
            case lookupBlockEnv protos id of
              Nothing   -> let live = emptyRegSet
                                      --lookupBlockEnv _liveness id `orElse`
                                      --panic ("no liveness at block " ++ show id)
                               formals = uniqSetToList live
                               prot = Protocol Private formals $ CallArea $ Young id
                           in  extendBlockEnv protos id prot
-- | Add copy-in instructions to each proc point that did not arise from a call
-- instruction. (Proc-points that arise from calls already have their copy-in instructions.)

add_CopyIns :: ProcPointSet -> BlockEnv Protocol -> BlockEnv CmmBlock ->
               FuelMonad [[CmmBlock]]
add_CopyIns callPPs protos blocks =
  liftUniq $ mapM maybe_insert_CopyIns (blockEnvToList blocks)
    where maybe_insert_CopyIns (_, b@(Block id t))
           | not $ elemBlockSet id callPPs
           = case lookupBlockEnv protos id of
               Just (Protocol c fs _area) ->
                 do LGraph _ blocks <-
                      lgraphOfAGraph (mkLabel id <*> copyInSlot c fs <*> mkZTail t)
                    return (map snd $ blockEnvToList blocks)
               Nothing -> return [b]
           | otherwise = return [b]
-- | Add a CopyOut node before each procpoint.
-- If the predecessor is a call, then the copy outs should already be done by the callee.
-- Note: If we need to add copy-out instructions, they may require stack space,
-- so we accumulate a map from the successors to the necessary stack space,
-- then update the successors after we have finished inserting the copy-outs.

add_CopyOuts :: BlockEnv Protocol -> ProcPointSet -> CmmGraph ->
                FuelMonad (BlockEnv CmmBlock)
add_CopyOuts protos procPoints g = fold_blocks mb_copy_out (return emptyBlockEnv) g
  where mb_copy_out :: CmmBlock -> FuelMonad (BlockEnv CmmBlock) ->
                                   FuelMonad (BlockEnv CmmBlock)
        mb_copy_out b@(Block bid _) z | bid == lg_entry g = skip b z
          case last $ unzip b of
            LastOther (LastCall _ _ _ _ _) -> skip b z -- copy out done by callee
        copy_out b z = fold_succs trySucc b init >>= finish
          where init = z >>= (\bmap -> return (b, bmap))
                  if elemBlockSet succId procPoints then
                    case lookupBlockEnv protos succId of
                      Just (Protocol c fs _area) -> insert z succId $ copyOutSlot c fs
                     (b, bs)   <- insertBetween b m succId
                     -- pprTrace "insert for succ" (ppr succId <> ppr m) $ do
                     return $ (b, foldl (flip insertBlock) bmap bs)
                finish (b@(Block bid _), bmap) =
                  return $ (extendBlockEnv bmap bid b)
        skip b@(Block bid _) bs =
          bs >>= (\bmap -> return (extendBlockEnv bmap bid b))
-- At this point, we have found a set of procpoints, each of which should be
-- the entry point of a procedure.
-- Now, we create the procedure for each proc point,
-- which requires that we:
-- 1. build a map from proc points to the blocks reachable from the proc point
-- 2. turn each branch to a proc point into a jump
-- 3. turn calls and returns into jumps
-- 4. build info tables for the procedures -- and update the info table for
--    the SRTs in the entry procedure as well.
-- Input invariant: A block should only be reachable from a single ProcPoint.
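{- Sketch (not part of this module). Step 1 and the input invariant can
be illustrated on plain lists. 'groupByOwner' is a hypothetical helper,
and 'Status' here is a simplified stand-in that stores the reaching proc
points in a list rather than a BlockSet. In this sketch any block whose
owner count is not exactly one is reported as an error.

```haskell
import qualified Data.Map as M

type BlockId = Int

-- Simplified stand-in for the analysis result of one block.
data Status = ReachedBy [BlockId] | ProcPoint
  deriving (Eq, Show)

-- Map each proc point to the ids of the blocks in its procedure.
groupByOwner :: [(BlockId, Status)] -> Either String (M.Map BlockId [BlockId])
groupByOwner = foldr add (Right M.empty)
  where
    add (bid, st) acc = do
      env   <- acc
      owner <- case st of
        ProcPoint     -> Right bid     -- a proc point owns itself
        ReachedBy [p] -> Right p       -- exactly one owner: fine
        ReachedBy ps  -> Left ("block " ++ show bid ++ " has owners " ++ show ps)
      Right (M.insertWith (++) owner [bid] env)

facts :: [(BlockId, Status)]
facts =
  [ (1, ProcPoint), (2, ReachedBy [1])
  , (3, ProcPoint), (4, ReachedBy [3]), (5, ReachedBy [1]) ]
```

With these facts, proc point 1 owns blocks 1, 2 and 5, and proc point 3
owns blocks 3 and 4; a block reached by both would be rejected.
-}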
splitAtProcPoints :: CLabel -> ProcPointSet -> ProcPointSet -> BlockEnv Status ->
                     CmmTopZ -> FuelMonad [CmmTopZ]
splitAtProcPoints entry_label callPPs procPoints procMap
                  (CmmProc (CmmInfo gc upd_fr info_tbl) top_l top_args
                           (stackInfo, g@(LGraph entry blocks))) =
  do -- Build a map from procpoints to the blocks they reach
     let addBlock b@(Block bid _) graphEnv =
           case lookupBlockEnv procMap bid of
             Just ProcPoint -> add graphEnv bid bid b
             Just (ReachedBy set) ->
               case blockSetToList set of
                 [id] -> add graphEnv id bid b
                 _ -> panic "Each block should be reachable from only one ProcPoint"
             Nothing -> pprPanic "block not reached by a proc point?" (ppr bid)
         add graphEnv procId bid b = extendBlockEnv graphEnv procId graph'
               where graph  = lookupBlockEnv graphEnv procId `orElse` emptyBlockEnv
                     graph' = extendBlockEnv graph bid b
     graphEnv <- return $ fold_blocks addBlock emptyBlockEnv g
     -- Build a map from proc point BlockId to labels for their new procedures
     -- Due to common blockification, we may overestimate the set of procpoints.
     let add_label map pp = return $ addToFM map pp lbl
           where lbl = if pp == entry then entry_label else blockLbl pp
     procLabels <- foldM add_label emptyFM
                         (filter (elemBlockEnv blocks) (blockSetToList procPoints))
     -- For each procpoint, we need to know the SP offset on entry.
     -- If the procpoint is:
     --  - continuation of a call, the SP offset is in the call
     --  - otherwise, 0 -- no overflow for passing those variables
     let add_sp_off b env =
           case last (unzip b) of
             LastOther (LastCall {cml_cont = Just succ, cml_ret_args = off,
                                  cml_ret_off = updfr_off}) ->
               extendBlockEnv env succ (off, updfr_off)
         spEntryMap = fold_blocks add_sp_off (mkBlockEnv [(entry, stackInfo)]) g
         getStackInfo id = lookupBlockEnv spEntryMap id `orElse` (0, Nothing)
     -- In each new graph, add blocks jumping off to the new procedures,
     -- and replace branches to procpoints with branches to the jump-off blocks
     let add_jump_block (env, bs) (pp, l) =
           do bid <- liftM mkBlockId getUniqueM
              let b = Block bid (ZLast (LastOther jump))
                  (argSpace, _) = getStackInfo pp
                  jump = LastCall (CmmLit (CmmLabel l')) Nothing argSpace 0 Nothing
                  l' = if elemBlockSet pp callPPs then entryLblToInfoLbl l else l
              return (extendBlockEnv env pp bid, b : bs)
         add_jumps (newGraphEnv) (ppId, blockEnv) =
           do let needed_jumps = -- find which procpoints we currently branch to
                    foldBlockEnv' add_if_branch_to_pp [] blockEnv
                  add_if_branch_to_pp block rst =
                    case last (unzip block) of
                      LastOther (LastBranch id) -> add_if_pp id rst
                      LastOther (LastCondBranch _ ti fi) ->
                        add_if_pp ti (add_if_pp fi rst)
                      LastOther (LastSwitch _ tbl) -> foldr add_if_pp rst (catMaybes tbl)
                  add_if_pp id rst = case lookupFM procLabels id of
                                       Just x -> (id, x) : rst
              (jumpEnv, jumpBlocks) <-
                 foldM add_jump_block (emptyBlockEnv, []) needed_jumps
                  -- update the entry block
              let b = expectJust "block in env" $ lookupBlockEnv blockEnv ppId
                  off = getStackInfo ppId
                  blockEnv' = extendBlockEnv blockEnv ppId b
                  -- replace branches to procpoints with branches to jumps
                  LGraph _ blockEnv'' = replaceBranches jumpEnv $ LGraph ppId blockEnv'
                  -- add the jump blocks to the graph
                  blockEnv''' = foldl (flip insertBlock) blockEnv'' jumpBlocks
              let g' = (off, LGraph ppId blockEnv''')
              -- pprTrace "g' pre jumps" (ppr g') $ do
              return (extendBlockEnv newGraphEnv ppId g')
     graphEnv <- foldM add_jumps emptyBlockEnv $ blockEnvToList graphEnv
     let to_proc (bid, g) | elemBlockSet bid callPPs =
               CmmProc (CmmInfo gc upd_fr info_tbl) top_l top_args (replacePPIds g)
               CmmProc emptyContInfoTable lbl [] (replacePPIds g)
             where lbl = expectJust "pp label" $ lookupFM procLabels bid
           CmmProc (CmmInfo Nothing Nothing CmmNonInfoTable) lbl [] (replacePPIds g)
             where lbl = expectJust "pp label" $ lookupFM procLabels bid
         -- References to procpoint IDs can now be replaced with the infotable's label
         replacePPIds (x, g) = (x, map_nodes id (mapExpMiddle repl) (mapExpLast repl) g)
           where repl e@(CmmLit (CmmBlock bid)) =
                   case lookupFM procLabels bid of
                     Just l  -> CmmLit (CmmLabel (entryLblToInfoLbl l))
     -- The C back end expects to see return continuations before the call sites.
     -- Here, we sort them in reverse order -- it gets reversed later.
     let (_, block_order) = foldl add_block_num (0::Int, emptyBlockEnv) (postorder_dfs g)
         add_block_num (i, map) (Block bid _) = (i+1, extendBlockEnv map bid i)
         sort_fn (bid, _) (bid', _) =
           compare (expectJust "block_order" $ lookupBlockEnv block_order bid)
                   (expectJust "block_order" $ lookupBlockEnv block_order bid')
     procs <- return $ map to_proc $ sortBy sort_fn $ blockEnvToList graphEnv
     return -- pprTrace "procLabels" (ppr procLabels)
            -- pprTrace "splitting graphs" (ppr procs)
splitAtProcPoints _ _ _ _ t@(CmmData _ _) = return [t]
----------------------------------------------------------------

Note [Direct reachability]

Block B is directly reachable from proc point P iff control can flow
from P to B without passing through an intervening proc point.
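{- Sketch (not part of this module). Read operationally, the definition
is a graph walk from P that refuses to continue through any other proc
point. 'directlyReachable' and the toy graph are hypothetical names for
illustration; in this module the same information comes out of the
dataflow analysis instead. The sketch reports only non-proc-point
blocks, since a proc point has its own status.

```haskell
import qualified Data.Map as M
import qualified Data.Set as S

type Node = String

-- Blocks directly reachable from proc point p, given successor lists
-- and the current set of proc points.
directlyReachable :: M.Map Node [Node] -> S.Set Node -> Node -> S.Set Node
directlyReachable succs pps p = go S.empty (next p)
  where
    next n = M.findWithDefault [] n succs
    go seen [] = seen
    go seen (n : ns)
      | n `S.member` pps || n `S.member` seen = go seen ns   -- stop here
      | otherwise = go (S.insert n seen) (next n ++ ns)

-- Toy graph: P -> A -> Q -> B and P -> C, with proc points P and Q.
exampleSuccs :: M.Map Node [Node]
exampleSuccs = M.fromList [("P", ["A", "C"]), ("A", ["Q"]), ("Q", ["B"])]

examplePPs :: S.Set Node
examplePPs = S.fromList ["P", "Q"]
```

In the toy graph, A and C are directly reachable from P, while B is
directly reachable only from Q because every path from P to B passes
through Q.
-}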
----------------------------------------------------------------

Note [No simple dataflow]

Sadly, it seems impossible to compute the proc points using a single
dataflow pass. One might attempt to use this simple lattice:

  data Location = Unknown
                | InProc BlockId -- node is in procedure headed by the named proc point
                | ProcPoint      -- node is itself a proc point

At a join, a node that is in two different procedures becomes a proc
point. The difficulty is that the change of information during iterative
computation may promote a node prematurely. Here's a program that
illustrates the difficulty:

  L2: if (...) { g(); goto L1; }

The only proc-point needed (besides the entry) is L1. But in an
iterative analysis, consider what happens to L2. On the first pass
through, it rises from Unknown to 'InProc entry', but when L1 is
promoted to a proc point (because it's the successor of g()), L1's
successors will be promoted to 'InProc L1'. The problem hits when the
new fact 'InProc L1' flows into L2 which is already bound to 'InProc entry'.
The join operation makes it a proc point when in fact it needn't be,
because its immediate dominator L1 is already a proc point and there
are no other proc points that directly reach L2.
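{- Sketch (not part of this module). The premature promotion can be
reproduced with a direct transcription of the naive lattice. The
function 'joinLoc' is a hypothetical name for its join, written here
only to illustrate the problem; it is not used anywhere in this module,
and BlockId is modelled as a String.

```haskell
type BlockId = String

data Location
  = Unknown
  | InProc BlockId   -- node is in the procedure headed by this proc point
  | ProcPoint        -- node is itself a proc point
  deriving (Eq, Show)

-- Join of the naive lattice: facts from two different procedures
-- promote the node to a proc point.
joinLoc :: Location -> Location -> Location
joinLoc Unknown l = l
joinLoc l Unknown = l
joinLoc ProcPoint _ = ProcPoint
joinLoc _ ProcPoint = ProcPoint
joinLoc (InProc a) (InProc b)
  | a == b    = InProc a
  | otherwise = ProcPoint

-- L2 as it evolves during the iteration described above.
l2FirstPass, l2AfterL1Promoted :: Location
l2FirstPass       = joinLoc Unknown (InProc "entry")
l2AfterL1Promoted = joinLoc l2FirstPass (InProc "L1")
```

The stale fact 'InProc entry' cannot be retracted, so the second join
yields ProcPoint for L2 even though only L1 directly reaches it.
-}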
{- Note [Separate Adams optimization]
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
It may be worthwhile to attempt the Adams optimization by rewriting
the graph before the assignment of proc-point protocols. Here are a

  g() returns to k;                    g() returns to L;
  k: CopyIn c ress; goto L;
  L: // no CopyIn node here            L: CopyIn c ress;

And when c == c' and ress == ress', this also:

  g() returns to k;                    g() returns to L;
  k: CopyIn c ress; goto L;
  L: CopyIn c' ress'                   L: CopyIn c' ress';

In both cases the goal is to eliminate k.