3 ( callProcPoints, minimalProcPointSet
4 , addProcPointProtocols
9 import Prelude hiding (zip, unzip, last)
13 import Cmm hiding (blockId)
20 import ForeignCall -- used in protocol for the entry point
21 import MachOp (MachHint(NoHint))
23 import MkZipCfgCmm hiding (CmmBlock, CmmGraph)
36 -- Compute a minimal set of proc points for a control-flow graph.
38 -- Determine a protocol for each proc point (which live variables will
39 -- be passed as arguments and which will be on the stack).
42 A proc point is a basic block that, after CPS transformation, will
43 start a new function. The entry block of the original function is a
44 proc point, as is the continuation of each function call.
45 A third kind of proc point arises if we want to avoid copying code.
46 Suppose we have code like the following:
49 if (...) { ..1..; call foo(); ..2..}
50 else { ..3..; call bar(); ..4..}
55 The statement 'x = y + z' can be reached from two different proc
56 points: the continuations of foo() and bar(). We would prefer not to
57 put a copy in each continuation; instead we would like 'x = y + z' to
58 be the start of a new procedure to which the continuations can jump:
61 if (...) { ..1..; push k_foo; jump foo_cps(); }
62 else { ..3..; push k_bar; jump bar_cps(); }
64 k_foo() { ..2..; jump k_join(y, z); }
65 k_bar() { ..4..; jump k_join(y, z); }
66 k_join(y, z) { x = y + z; return x; }
68 You might think then that a criterion to make a node a proc point is
69 that it is directly reached by two distinct proc points. (Note
70 [Direct reachability].) But this criterion is a bit two simple; for
71 example, 'return x' is also reached by two proc points, yet there is
72 no point in pulling it out of k_join. A good criterion would be to
73 say that a node should be made a proc point if it is reached by a set
74 of proc points that is different than its immediate dominator. NR
75 believes this criterion can be shown to produce a minimum set of proc
76 points, and given a dominator tree, the proc points can be chosen in
77 time linear in the number of blocks. Lacking a dominator analysis,
78 however, we turn instead to an iterative solution, starting with no
79 proc points and adding them according to these rules:
81 1. The entry block is a proc point.
82 2. The continuation of a call is a proc point.
83 3. A node is a proc point if it is directly reached by more proc
84 points than one of its predecessors.
86 Because we don't understand the problem very well, we apply rule 3 at
87 most once per iteration, then recompute the reachability information.
88 (See Note [No simple dataflow].) The choice of the new proc point is
89 arbitrary, and I don't know if the choice affects the final solution,
90 so I don't know if the number of proc points chosen is the
91 minimum---but the set will be minimal.
94 type ProcPointSet = BlockSet
97 = ReachedBy ProcPointSet -- set of proc points that directly reach the block
98 | ProcPoint -- this block is itself a proc point
100 instance Outputable Status where
102 | isEmptyUniqSet ps = text "<not-reached>"
103 | otherwise = text "reached by" <+>
104 (hsep $ punctuate comma $ map ppr $ uniqSetToList ps)
105 ppr ProcPoint = text "<procpt>"
108 lattice :: DataflowLattice Status
109 lattice = DataflowLattice "direct proc-point reachability" unreached add_to False
110 where unreached = ReachedBy emptyBlockSet
111 add_to _ ProcPoint = noTx ProcPoint
112 add_to ProcPoint _ = aTx ProcPoint -- aTx because of previous case again
113 add_to (ReachedBy p) (ReachedBy p') =
114 let union = unionUniqSets p p'
115 in if sizeUniqSet union > sizeUniqSet p' then
116 aTx (ReachedBy union)
119 --------------------------------------------------
120 -- transfer equations
122 forward :: ForwardTransfers Middle Last Status
123 forward = ForwardTransfers first middle last exit
124 where first ProcPoint id = ReachedBy $ unitUniqSet id
127 last _ (LastCall _ (Just id)) = LastOutFacts [(id, ProcPoint)]
128 last x l = LastOutFacts $ map (\id -> (id, x)) (succs l)
131 -- It is worth distinguishing two sets of proc points:
132 -- those that are induced by calls in the original graph
133 -- and those that are introduced because they're reachable from multiple proc points.
134 callProcPoints :: CmmGraph -> ProcPointSet
135 minimalProcPointSet :: ProcPointSet -> CmmGraph -> FuelMonad ProcPointSet
137 callProcPoints g = fold_blocks add entryPoint g
138 where entryPoint = unitUniqSet (lg_entry g)
139 add b set = case last $ unzip b of
140 LastOther (LastCall _ (Just k)) -> extendBlockSet set k
143 minimalProcPointSet callProcPoints g = extendPPSet g (postorder_dfs g) callProcPoints
145 type PPFix = FuelMonad (ForwardFixedPoint Middle Last Status ())
147 procPointAnalysis :: ProcPointSet -> CmmGraph -> FuelMonad PPFix
148 procPointAnalysis procPoints g =
149 let addPP env id = extendBlockEnv env id ProcPoint
150 initProcPoints = foldl addPP emptyBlockEnv (uniqSetToList procPoints)
151 in runDFM lattice $ -- init with old facts and solve
152 return $ (zdfSolveFrom initProcPoints "proc-point reachability" lattice
153 forward (fact_bot lattice) $ graphOfLGraph g :: PPFix)
155 extendPPSet :: CmmGraph -> [CmmBlock] -> ProcPointSet -> FuelMonad ProcPointSet
156 extendPPSet g blocks procPoints =
157 do res <- procPointAnalysis procPoints g
158 env <- liftM zdfFpFacts res
159 let add block pps = let id = blockId block
160 in case lookupBlockEnv env id of
161 Just ProcPoint -> extendBlockSet pps id
163 procPoints' = fold_blocks add emptyBlockSet g
164 newPoint = listToMaybe (mapMaybe ppSuccessor blocks)
165 ppSuccessor b@(Block id _) =
166 let nreached id = case lookupBlockEnv env id `orElse` panic "no ppt" of
168 ReachedBy ps -> sizeUniqSet ps
169 my_nreached = nreached id
170 -- | Looking for a successor of b that is reached by
171 -- more proc points than b and is not already a proc
172 -- point. If found, it can become a proc point.
173 newId succ_id = not (elemBlockSet succ_id procPoints') &&
174 nreached succ_id > my_nreached
175 in listToMaybe $ filter newId $ succs b
176 case newPoint of Just id ->
177 if elemBlockSet id procPoints' then panic "added old proc pt"
178 else extendPPSet g blocks (extendBlockSet procPoints' id)
179 Nothing -> return procPoints'
184 ------------------------------------------------------------------------
185 -- Computing Proc-Point Protocols --
186 ------------------------------------------------------------------------
190 There is one major trick, discovered by Michael Adams, which is that
191 we want to choose protocols in a way that enables us to optimize away
192 some continuations. The optimization is very much like branch-chain
193 elimination, except that it involves passing results as well as
194 control. The idea is that if a call's continuation k does nothing but
195 CopyIn its results and then goto proc point P, the call's continuation
196 may be changed to P, *provided* P's protocol is identical to the
197 protocol for the CopyIn. We choose protocols to make this so.
199 Here's an explanatory example; we begin with the source code (lines
200 separate basic blocks):
208 Zipperization converts this code as follows:
211 call g() returns to k;
218 What we'd like to do is assign P the same CopyIn protocol as k, so we
222 call g() returns to P;
224 P: CopyIn(x, y); ..2..;
226 Of course, P may be the target of more than one continuation, and
227 different continuations may have different protocols. Michael Adams
228 implemented a voting mechanism, but he thinks a simple greedy
229 algorithm would be just as good, so that's what we do.
233 data Protocol = Protocol Convention CmmFormals StackArea
235 instance Outputable Protocol where
236 ppr (Protocol c fs a) = text "Protocol" <+> ppr c <+> ppr fs <+> ppr a
238 -- | Function 'optimize_calls' chooses protocols only for those proc
239 -- points that are relevant to the optimization explained above.
240 -- The others are assigned by 'add_unassigned', which is not yet clever.
242 addProcPointProtocols :: ProcPointSet -> ProcPointSet -> CmmFormalsWithoutKinds ->
243 CmmGraph -> FuelMonad CmmGraph
244 addProcPointProtocols callPPs procPoints formals g =
245 do liveness <- cmmLivenessZ g
246 (protos, g') <- return $ optimize_calls liveness g
247 blocks'' <- add_CopyOuts protos procPoints g'
248 return $ LGraph (lg_entry g) blocks''
249 where optimize_calls liveness g = -- see Note [Separate Adams optimization]
250 let (protos, blocks') =
251 fold_blocks maybe_add_call (init_protocols, emptyBlockEnv) g
252 protos' = add_unassigned liveness procPoints protos
253 g' = LGraph (lg_entry g) $ add_CopyIns callPPs protos' blocks'
254 in (protos', runTx removeUnreachableBlocksZ g')
255 maybe_add_call :: CmmBlock -> (BlockEnv Protocol, BlockEnv CmmBlock)
256 -> (BlockEnv Protocol, BlockEnv CmmBlock)
257 -- ^ If the block is a call whose continuation goes to a proc point
258 -- whose protocol either matches the continuation's or is not yet set,
259 -- redirect the call (cf 'newblock') and set the protocol if necessary
260 maybe_add_call block (protos, blocks) =
261 case goto_end $ unzip block of
262 (h, LastOther (LastCall tgt (Just k)))
263 | Just proto <- lookupBlockEnv protos k,
264 Just pee <- jumpsToProcPoint k
266 zipht h (tailOfLast (LastCall tgt (Just pee)))
267 changed_blocks = insertBlock newblock blocks
268 unchanged_blocks = insertBlock block blocks
269 in case lookupBlockEnv protos pee of
270 Nothing -> (extendBlockEnv protos pee proto,changed_blocks)
272 if proto == proto' then (protos, changed_blocks)
273 else (protos, unchanged_blocks)
274 _ -> (protos, insertBlock block blocks)
276 jumpsToProcPoint :: BlockId -> Maybe BlockId
277 -- ^ Tells whether the named block is just a jump to a proc point
278 jumpsToProcPoint id =
279 let (Block _ t) = lookupBlockEnv (lg_blocks g) id `orElse`
280 panic "jump out of graph"
282 ZTail (CopyIn {}) (ZLast (LastOther (LastBranch pee)))
283 | elemBlockSet pee procPoints -> Just pee
285 init_protocols = fold_blocks maybe_add_proto emptyBlockEnv g
286 maybe_add_proto :: CmmBlock -> BlockEnv Protocol -> BlockEnv Protocol
287 maybe_add_proto (Block id (ZTail (CopyIn c fs _srt) _)) env =
288 extendBlockEnv env id (Protocol c fs $ toArea id fs)
289 maybe_add_proto (Block id _) env | id == lg_entry g =
290 extendBlockEnv env id (Protocol stdArgConvention hfs $ toArea id hfs)
291 maybe_add_proto _ env = env
292 toArea id fs = mkStackArea id fs $ Just fs
293 hfs = map (\x -> CmmKinded x NoHint) formals
294 stdArgConvention = ConventionStandard CmmCallConv Arguments
296 -- | For now, following a suggestion by Ben Lippmeier, we pass all
297 -- live variables as arguments, hoping that a clever register
298 -- allocator might help.
300 add_unassigned :: BlockEnv CmmLive -> ProcPointSet -> BlockEnv Protocol ->
302 add_unassigned = pass_live_vars_as_args
304 pass_live_vars_as_args :: BlockEnv CmmLive -> ProcPointSet ->
305 BlockEnv Protocol -> BlockEnv Protocol
306 pass_live_vars_as_args liveness procPoints protos = protos'
307 where protos' = foldUniqSet addLiveVars protos procPoints
308 addLiveVars :: BlockId -> BlockEnv Protocol -> BlockEnv Protocol
309 addLiveVars id protos =
310 case lookupBlockEnv protos id of
312 Nothing -> let live = lookupBlockEnv liveness id `orElse`
313 panic ("no liveness at block " ++ show id)
314 formals = map (\x -> CmmKinded x NoHint) $ uniqSetToList live
315 prot = Protocol ConventionPrivate formals $
316 mkStackArea id formals $ Just formals
317 in extendBlockEnv protos id prot
320 -- | Add copy-in instructions to each proc point that did not arise from a call
321 -- instruction. (Proc-points that arise from calls already have their copy-in instructions.)
323 add_CopyIns :: ProcPointSet -> BlockEnv Protocol -> BlockEnv CmmBlock -> BlockEnv CmmBlock
324 add_CopyIns callPPs protos = mapUFM maybe_insert_CopyIns
325 where maybe_insert_CopyIns :: CmmBlock -> CmmBlock
326 maybe_insert_CopyIns b@(Block id t) | not $ elementOfUniqSet id callPPs =
327 case lookupBlockEnv protos id of
329 Just (Protocol c fs area) ->
331 --ZTail (CopyIn c' fs' _) _ ->
332 -- if c == c' && fs == fs' then b
333 -- else panic ("mismatched protocols for block " ++ show id)
334 _ -> Block id -- (ZTail (CopyIn c fs NoC_SRT) t)
335 $ foldr ZTail t (copyIn c area fs)
336 maybe_insert_CopyIns b = b
338 -- | Add a CopyOut node before each procpoint.
339 -- If the predecessor is a call, then the CopyOut should already exist (in the callee).
341 add_CopyOuts :: BlockEnv Protocol -> ProcPointSet -> CmmGraph ->
342 FuelMonad (BlockEnv CmmBlock)
343 add_CopyOuts protos procPoints g = fold_blocks maybe_insert_CopyOut (return emptyBlockEnv) g
344 where maybe_insert_CopyOut :: CmmBlock -> FuelMonad (BlockEnv CmmBlock) ->
345 FuelMonad (BlockEnv CmmBlock)
346 maybe_insert_CopyOut b@(Block bid _) blocks =
347 case last $ unzip b of
348 LastOther (LastCall _ _) -> -- skip calls (copy out done by callee)
349 blocks >>= (\bmap -> return $ extendBlockEnv bmap bid b)
350 _ -> maybe_insert_CopyOut' b blocks
351 maybe_insert_CopyOut' b blocks = fold_succs trySucc b init >>= finish
352 where init = blocks >>= (\bmap -> return (b, bmap))
354 if elemBlockSet succId procPoints then
355 case lookupBlockEnv protos succId of
357 Just (Protocol c fs area) ->
358 insert z succId $ copyOut c area $ map fetch fs
359 -- CopyOut c $ map fetch fs
361 fetch k = k {kindlessCmm = CmmReg $ CmmLocal $ kindlessCmm k}
364 (b, bs) <- insertBetween b m succId
365 return $ (b, foldl (flip insertBlock) bmap bs)
366 finish (b@(Block bid _), bmap) = return $ extendBlockEnv bmap bid b
369 -- Input invariant: A block should only be reachable from a single ProcPoint.
370 -- If you want to duplicate blocks, do it before this gets called.
371 splitAtProcPoints :: CmmFormalsWithoutKinds -> CLabel -> ProcPointSet ->
372 CmmGraph -> FuelMonad [CmmGraph]
373 splitAtProcPoints formals entry_label procPoints g@(LGraph entry _) =
374 do let layout = layout_stack formals g
375 pprTrace "stack layout" (ppr layout) $ return ()
376 res <- procPointAnalysis procPoints g
377 procMap <- liftM zdfFpFacts res
378 let addBlock b@(Block bid _) graphEnv =
379 case lookupBlockEnv procMap bid of
380 Just ProcPoint -> add graphEnv bid bid b
381 Just (ReachedBy set) ->
382 case uniqSetToList set of
384 [id] -> add graphEnv id bid b
385 _ -> panic "Each block should be reachable from only one ProcPoint"
386 Nothing -> panic "block not reached by a proc point?"
387 add graphEnv procId bid b = extendBlockEnv graphEnv procId graph'
388 where graph = lookupBlockEnv graphEnv procId `orElse` emptyBlockEnv
389 graph' = extendBlockEnv graph bid b
390 graphEnv <- return $ fold_blocks addBlock emptyBlockEnv g
391 -- Build a map from proc point BlockId to labels for their new procedures
392 let add_label map pp = clabel pp >>= (\l -> return $ (pp, l) : map)
393 clabel procPoint = if procPoint == entry then return entry_label
394 else getUniqueM >>= return . to_label
395 to_label u = mkEntryLabel (mkFCallName u "procpoint")
396 procLabels <- foldM add_label [] (uniqSetToList procPoints)
397 -- In each new graph, add blocks jumping off to the new procedures,
398 -- and replace branches to procpoints with branches to the jump-off blocks
399 let add_jump_block (env, bs) (pp, l) =
400 do bid <- liftM mkBlockId getUniqueM
401 let b = Block bid (ZLast (LastOther (LastJump $ CmmLit $ CmmLabel l)))
402 return $ (extendBlockEnv env pp bid, b : bs)
403 add_jumps newGraphEnv (guniq, blockEnv) =
404 do (jumpEnv, jumpBlocks) <- foldM add_jump_block (emptyBlockEnv, []) procLabels
405 let ppId = mkBlockId guniq
406 LGraph _ blockEnv' = replaceLabelsZ jumpEnv $ LGraph ppId blockEnv
407 blockEnv'' = foldl (flip insertBlock) blockEnv' jumpBlocks
408 return $ extendBlockEnv newGraphEnv ppId $
409 runTx cmmCfgOptsZ $ LGraph ppId blockEnv''
410 _ <- return $ replaceLabelsZ
411 graphEnv <- foldM add_jumps emptyBlockEnv $ ufmToList graphEnv
412 return $ pprTrace "procLabels" (ppr procLabels) $
413 pprTrace "splitting graphs" (ppr graphEnv) $ [g]
415 ------------------------------------------------------------------------
416 -- Stack Layout (completely bogus for now) --
417 ------------------------------------------------------------------------
419 -- At some point, we'll do stack layout properly.
420 -- But for now, we can move forward on generating code by just producing
421 -- a brain dead layout, giving a separate slot to every variable,
422 -- and (incorrectly) assuming that all parameters are passed on the stack.
424 -- For now, variables are placed at explicit offsets from a virtual
426 -- We may want to use abstract stack slots at some point.
427 data Placement = VFPMinus Int
429 instance Outputable Placement where
430 ppr (VFPMinus k) = text "VFP - " <> int k
432 -- Build a map from registers to stack locations.
433 -- Return that map along with the offset to the end of the block
434 -- containing local registers.
435 layout_stack ::CmmFormalsWithoutKinds -> CmmGraph ->
436 (Int, FiniteMap LocalReg Placement, FiniteMap LocalReg Placement)
437 layout_stack formals g = (ix', incomingMap, localMap)
438 where (ix, incomingMap) = foldl (flip place) (1, emptyFM) formals -- IGNORES CC'S
439 -- 1 leaves space for the return infotable
440 (ix', localMap) = foldUniqSet place (ix, emptyFM) regs
441 place r (ix, map) = (ix', addToFM map r $ VFPMinus ix') where ix' = ix + 1
442 regs = fold_blocks (fold_fwd_block (\_ y -> y) add addL) emptyRegSet g
443 add x y = foldRegsDefd extendRegSet y x
444 addL (LastOther l) z = add l z
448 ----------------------------------------------------------------
451 Note [Direct reachability]
453 Block B is directly reachable from proc point P iff control can flow
454 from P to B without passing through an intervening proc point.
457 ----------------------------------------------------------------
460 Note [No simple dataflow]
462 Sadly, it seems impossible to compute the proc points using a single
463 dataflow pass. One might attempt to use this simple lattice:
465 data Location = Unknown
466 | InProc BlockId -- node is in procedure headed by the named proc point
467 | ProcPoint -- node is itself a proc point
469 At a join, a node in two different blocks becomes a proc point.
470 The difficulty is that the change of information during iterative
471 computation may promote a node prematurely. Here's a program that
472 illustrates the difficulty:
481 L2: if (...) { g(); goto L1; }
485 The only proc-point needed (besides the entry) is L1. But in an
486 iterative analysis, consider what happens to L2. On the first pass
487 through, it rises from Unknown to 'InProc entry', but when L1 is
488 promoted to a proc point (because it's the successor of g()), L1's
489 successors will be promoted to 'InProc L1'. The problem hits when the
490 new fact 'InProc L1' flows into L2 which is already bound to 'InProc entry'.
491 The join operation makes it a proc point when in fact it needn't be,
492 because its immediate dominator L1 is already a proc point and there
493 are no other proc points that directly reach L2.
498 {- Note [Separate Adams optimization]
499 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
500 It may be worthwhile to attempt the Adams optimization by rewriting
501 the graph before the assignment of proc-point protocols. Here are a
504 g() returns to k; g() returns to L;
505 k: CopyIn c ress; goto L:
507 L: // no CopyIn node here L: CopyIn c ress;
510 And when c == c' and ress == ress', this also:
512 g() returns to k; g() returns to L;
513 k: CopyIn c ress; goto L:
515 L: CopyIn c' ress' L: CopyIn c' ress' ;
517 In both cases the goal is to eliminate k.