2 % (c) The GRASP/AQUA Project, Glasgow University, 1998
5 Run through the STG code and compute the Static Reference Table for
6 each let-binding. At the same time, we figure out which top-level
7 bindings have no CAF references, and record the fact in their IdInfo.
12 import Id ( Id, setIdCafInfo, getIdCafInfo, externallyVisibleId,
15 import IdInfo ( CafInfo(..) )
23 computeSRTs :: [StgBinding] -> [(StgBinding,[Id])]
24 computeSRTs binds = srtBinds emptyUFM binds
28 srtBinds :: UniqFM CafInfo -> [StgBinding] -> [(StgBinding,[Id])]
31 srtTopBind rho b =: \(b, srt, rho) ->
32 (b,srt) : srtBinds rho bs
35 -----------------------------------------------------------------------------
36 Circular algorithm for simultaneously figuring out CafInfo and SRT
39 Our functions have type
41 :: UniqFM CafInfo -- which top-level ids don't refer to any CAfs
42 -> SrtOffset -- next free offset within the SRT
43 -> (UniqSet Id, -- global refs in the continuation
44 UniqFM (UniqSet Id))-- global refs in let-no-escaped variables
45 {- * -} -> StgExpr -- expression to analyse
47 -> (StgExpr, -- (e) newly annotated expression
48 UniqSet Id, -- (g) set of *all* global references
49 [Id], -- (s) SRT required for this expression
50 SrtOffset) -- (o) new offset
52 (g) is a set containing all local top-level and imported ids referred
53 to by the expression (e).
55 The set of all global references is used to build the environment,
56 which is passed in again. The environment is used to build the final
59 We build a single SRT for a recursive binding group, which is why the
60 SRT building is done at the binding level rather than the
63 Hence, the only argument which we can look at before returning is the
64 expression (marked with {- * -} above).
66 The SRT is built up in reverse order, to avoid too many expensive
67 appends. We therefore reverse the SRT before returning it, so that
68 the offsets will be from the beginning of the SRT.
70 -----------------------------------------------------------------------------
73 The environment contains a mapping from local top-level bindings to
74 CafInfo. The CafInfo is either
76 NoCafRefs - indicating that the id is not a CAF and furthermore
77 that it doesn't refer, even indirectly, to any CAFs.
79 MayHaveCafRefs - everything else.
81 A function whose CafInfo is NoCafRefs will have an empty SRT, and its
82 closure will not appear in the SRT of any other function (unless we're
83 compiling without optimisation and the CafInfos haven't been emitted
84 in the interface files).
86 Top-Level recursive groups
88 This gets a bit complicated, but the general idea is that we want a
89 single SRT for the whole group, and we'd rather not have recursive
90 references in it if at all possible.
92 We collect all the global references for the group, and filter out
93 those that are binders in the group and not CAFs themselves. This set
94 of references is then used to infer the CafInfo for each of the
95 binders in the group. Why is it done this way?
97 - if all the bindings in the group just refer to each other,
98 and none of them are CAFs, we'd like to get an empty SRT.
100 - if any of the bindings in the group refer to a CAF, this will
103 Hmm, that probably makes no sense.
109 -> (StgBinding, -- the new binding
110 [Id], -- the SRT for this binding
111 UniqFM CafInfo) -- the new environment
113 srtTopBind rho (StgNonRec binder rhs) =
115 -- no need to use circularity for non-recursive bindings
116 srtRhs rho (emptyUniqSet,emptyUFM) 0{-initial offset-} rhs
117 =: \(rhs, g, srt, off) ->
119 filtered_g = filter (mayHaveCafRefs rho) (uniqSetToList g)
120 caf_info = mk_caf_info rhs filtered_g
121 binder' = setIdCafInfo binder caf_info
122 rho' = addToUFM rho binder' caf_info
123 extra_refs = filter (`notElem` srt) filtered_g
124 bind_srt = reverse (extra_refs ++ srt)
127 StgRhsClosure _ _ _ _ _ _ _ ->
128 (StgNonRec binder' (attach_srt_rhs rhs 0 (length bind_srt)),
131 -- don't output an SRT for the constructor, but just remember
132 -- whether it had any caf references or not.
133 StgRhsCon _ _ _ -> (StgNonRec binder' rhs, [], rho')
136 srtTopBind rho (StgRec bs) =
137 (attach_srt_bind (StgRec (reverse new_bs')) 0 (length bind_srt),
140 (binders,rhss) = unzip bs
142 non_caf_binders = [ b | (b, rhs) <- bs, not (caf_rhs rhs) ]
144 -- circular: rho' is calculated from g below
145 (new_bs, g, srt, _) = doBinds bs [] emptyUniqSet [] 0
147 -- filter out ourselves from the global references: it makes no
148 -- sense to refer recursively to our SRT unless the recursive
149 -- reference is required by a nested SRT.
150 filtered_g = filter (\id -> id `notElem` non_caf_binders &&
151 mayHaveCafRefs rho id) (uniqSetToList g)
152 extra_refs = filter (`notElem` srt) filtered_g
153 bind_srt = reverse (extra_refs ++ srt)
154 caf_infos = map (\rhs -> mk_caf_info rhs filtered_g) rhss
155 rho' = addListToUFM rho (zip binders caf_infos)
156 binders' = zipWith setIdCafInfo binders caf_infos
158 new_bs' = zip binders' (map snd new_bs)
160 doBinds [] new_binds g srt off = (reverse new_binds, g, srt, off)
161 doBinds ((binder,rhs):binds) new_binds g srt off =
162 srtRhs rho' (emptyUniqSet,emptyUFM) off rhs
163 =: \(rhs, rhs_g, rhs_srt, off) ->
165 g' = unionUniqSets rhs_g g
166 srt' = rhs_srt ++ srt
168 doBinds binds ((binder,rhs):new_binds) g' srt' off
170 caf_rhs (StgRhsClosure _ _ _ free_vars _ [] body) = True
174 -----------------------------------------------------------------------------
175 Non-top-level bindings
178 srtBind :: UniqFM CafInfo -> (UniqSet Id, UniqFM (UniqSet Id))
179 -> Int -> StgBinding -> (StgBinding, UniqSet Id, [Id], Int)
181 srtBind rho cont_refs off (StgNonRec binder rhs) =
182 srtRhs rho cont_refs off rhs =: \(rhs, g, srt, off) ->
183 (StgNonRec binder rhs, g, srt, off)
185 srtBind rho cont_refs off (StgRec binds) =
186 (StgRec new_binds, g, srt, new_off)
188 -- process each binding
189 (new_binds, g, srt, new_off) = doBinds binds emptyUniqSet [] off []
191 doBinds [] g srt off new_binds = (reverse new_binds, g, srt, off)
192 doBinds ((binder,rhs):binds) g srt off new_binds =
193 srtRhs rho cont_refs off rhs =: \(rhs, g', srt', off) ->
194 doBinds binds (unionUniqSets g g') (srt'++srt) off
195 ((binder,rhs):new_binds)
198 -----------------------------------------------------------------------------
202 srtRhs :: UniqFM CafInfo -> (UniqSet Id, UniqFM (UniqSet Id))
203 -> Int -> StgRhs -> (StgRhs, UniqSet Id, [Id], Int)
205 srtRhs rho cont off (StgRhsClosure cc bi old_srt free_vars u args body) =
206 srtExpr rho cont off body =: \(body, g, srt, off) ->
207 (StgRhsClosure cc bi old_srt free_vars u args body, g, srt, off)
209 srtRhs rho cont off e@(StgRhsCon cc con args) =
210 (e, getGlobalRefs rho args, [], off)
213 -----------------------------------------------------------------------------
217 srtExpr :: UniqFM CafInfo -> (UniqSet Id, UniqFM (UniqSet Id))
218 -> Int -> StgExpr -> (StgExpr, UniqSet Id, [Id], Int)
220 srtExpr rho (cont,lne) off e@(StgApp f args) = (e, global_refs, [], off)
223 getGlobalRefs rho (StgVarArg f:args) `unionUniqSets`
224 lookupPossibleLNE lne f
226 srtExpr rho (cont,lne) off e@(StgCon con args ty) =
227 (e, cont `unionUniqSets` getGlobalRefs rho args, [], off)
229 srtExpr rho c@(cont,lne) off (StgCase scrut live1 live2 uniq _{-srt-} alts) =
230 srtCaseAlts rho c off alts =: \(alts, alts_g, alts_srt, alts_off) ->
232 -- construct the SRT for this case
233 let (this_srt, scrut_off) = construct_srt rho alts_g alts_srt alts_off in
235 -- global refs in the continuation is alts_g.
236 srtExpr rho (alts_g,lne) scrut_off scrut
237 =: \(scrut, scrut_g, scrut_srt, case_off) ->
239 g = unionUniqSets alts_g scrut_g
240 srt = scrut_srt ++ this_srt
241 srt_info = case length this_srt of
245 (StgCase scrut live1 live2 uniq srt_info alts, g, srt, case_off)
247 srtExpr rho cont off (StgLet bind body) =
248 srtLet rho cont off bind body StgLet (\_ cont -> cont)
250 srtExpr rho cont off (StgLetNoEscape live1 live2 b@(StgNonRec bndr rhs) body)
251 = srtLet rho cont off b body (StgLetNoEscape live1 live2) calc_cont
252 where calc_cont g (cont,lne) = (cont,addToUFM lne bndr g)
254 -- for recursive let-no-escapes, we do *two* passes, the first time
255 -- just to extract the list of global refs, and the second time we actually
256 -- construct the SRT now that we know what global refs should be in
257 -- the various let-no-escape continuations.
258 srtExpr rho conts@(cont,lne) off
259 (StgLetNoEscape live1 live2 bind@(StgRec pairs) body)
260 = srtBind rho conts off bind =: \(_, g, _, _) ->
262 lne' = addListToUFM lne [ (bndr,g) | (bndr,_) <- pairs ]
263 calc_cont _ conts = conts
265 srtLet rho (cont,lne') off bind body (StgLetNoEscape live1 live2) calc_cont
268 srtExpr rho cont off (StgSCC cc expr) =
269 srtExpr rho cont off expr =: \(expr, g, srt, off) ->
270 (StgSCC cc expr, g, srt, off)
273 -----------------------------------------------------------------------------
276 This is quite complicated stuff...
279 srtLet rho cont off bind body let_constr calc_cont
281 -- If the bindings are all constructors, then we don't need to
282 -- buid an SRT at all...
283 | all_con_binds bind =
284 srtBind rho cont off bind =: \(bind, bind_g, bind_srt, off) ->
285 srtExpr rho cont off body =: \(body, body_g, body_srt, off) ->
287 g = unionUniqSets bind_g body_g
288 srt = body_srt ++ bind_srt
290 (let_constr bind body, g, srt, off)
292 -- we have some closure bindings...
295 -- first, find the sub-SRTs in the binding
296 srtBind rho cont off bind =: \(bind, bind_g, bind_srt, bind_off) ->
298 -- construct the SRT for this binding
299 let (this_srt, body_off) = construct_srt rho bind_g bind_srt bind_off in
301 -- get the new continuation information (if a let-no-escape)
302 let new_cont = calc_cont bind_g cont in
304 -- now find the SRTs in the body
305 srtExpr rho cont body_off body =: \(body, body_g, body_srt, let_off) ->
308 -- union all the global references together
309 let_g = unionUniqSets bind_g body_g
311 -- concatenate the sub-SRTs
312 let_srt = body_srt ++ this_srt
314 -- attach the SRT info to the binding
315 bind' = attach_srt_bind bind off (length this_srt)
317 (let_constr bind' body, let_g, let_srt, let_off)
320 -----------------------------------------------------------------------------
323 Construct the SRT at this point from its sub-SRTs and any new global
324 references which aren't already contained in one of the sub-SRTs (and
328 construct_srt rho global_refs sub_srt current_offset
330 extra_refs = filter (`notElem` sub_srt)
331 (filter (mayHaveCafRefs rho) (uniqSetToList global_refs))
332 this_srt = extra_refs ++ sub_srt
334 -- Add the length of the new entries to the
335 -- current offset to get the next free offset in the global SRT.
336 new_offset = current_offset + length extra_refs
337 in (this_srt, new_offset)
340 -----------------------------------------------------------------------------
344 srtCaseAlts :: UniqFM CafInfo -> (UniqSet Id, UniqFM (UniqSet Id))
345 -> Int -> StgCaseAlts -> (StgCaseAlts, UniqSet Id, [Id], Int)
347 srtCaseAlts rho cont off (StgAlgAlts t alts dflt) =
348 srtAlgAlts rho cont off alts [] emptyUniqSet []
349 =: \(alts, alts_g, alts_srt, off) ->
350 srtDefault rho cont off dflt =: \(dflt, dflt_g, dflt_srt, off) ->
352 g = unionUniqSets alts_g dflt_g
353 srt = dflt_srt ++ alts_srt
355 (StgAlgAlts t alts dflt, g, srt, off)
357 srtCaseAlts rho cont off (StgPrimAlts t alts dflt) =
358 srtPrimAlts rho cont off alts [] emptyUniqSet []
359 =: \(alts, alts_g, alts_srt, off) ->
360 srtDefault rho cont off dflt =: \(dflt, dflt_g, dflt_srt, off) ->
362 g = unionUniqSets alts_g dflt_g
363 srt = dflt_srt ++ alts_srt
365 (StgPrimAlts t alts dflt, g, srt, off)
367 srtAlgAlts rho cont off [] new_alts g srt = (reverse new_alts, g, srt, off)
368 srtAlgAlts rho cont off ((con,args,used,rhs):alts) new_alts g srt =
369 srtExpr rho cont off rhs =: \(rhs, rhs_g, rhs_srt, off) ->
371 g' = unionUniqSets rhs_g g
372 srt' = rhs_srt ++ srt
374 srtAlgAlts rho cont off alts ((con,args,used,rhs) : new_alts) g' srt'
376 srtPrimAlts rho cont off [] new_alts g srt = (reverse new_alts, g, srt, off)
377 srtPrimAlts rho cont off ((lit,rhs):alts) new_alts g srt =
378 srtExpr rho cont off rhs =: \(rhs, rhs_g, rhs_srt, off) ->
380 g' = unionUniqSets rhs_g g
381 srt' = rhs_srt ++ srt
383 srtPrimAlts rho cont off alts ((lit,rhs) : new_alts) g' srt'
385 srtDefault rho cont off StgNoDefault = (StgNoDefault,emptyUniqSet,[],off)
386 srtDefault rho cont off (StgBindDefault rhs) =
387 srtExpr rho cont off rhs =: \(rhs, g, srt, off) ->
388 (StgBindDefault rhs, g, srt, off)
391 -----------------------------------------------------------------------------
393 Decide whether a closure looks like a CAF or not. In an effort to
394 keep the number of CAFs (and hence the size of the SRTs) down, we
395 would also like to look at the expression and decide whether it
396 requires a small bounded amount of heap, so we can ignore it as a CAF.
397 In these cases, we need to use an additional CAF list to keep track of
398 non-collectable CAFs.
400 We mark real CAFs as `MayHaveCafRefs' because this information is used
401 to decide whether a particular closure needs to be referenced in an
406 :: StgRhs -- right-hand-side of the definition
407 -> [Id] -- static references
410 -- special case for expressions which are always bottom,
411 -- such as 'error "..."'. We don't need to record it as
412 -- a CAF, since it can only be entered once.
413 mk_caf_info (StgRhsClosure _ _ _ free_vars _ [] e) srt
414 | isBottomingExpr e && null srt = NoCafRefs
416 mk_caf_info (StgRhsClosure _ _ _ free_vars upd args body) srt
417 | isUpdatable upd = MayHaveCafRefs -- a real live CAF
418 | null srt = NoCafRefs -- function w/ no static references
419 | otherwise = MayHaveCafRefs -- function w/ some static references
421 mk_caf_info rcon@(StgRhsCon cc con args) srt
422 | null srt = NoCafRefs -- constructor w/ no static references
423 | otherwise = MayHaveCafRefs -- otherwise, treat as a CAF
426 isBottomingExpr (StgLet bind expr) = isBottomingExpr expr
427 isBottomingExpr (StgApp f args) = idAppIsBottom f (length args)
428 isBottomingExpr _ = False
431 -----------------------------------------------------------------------------
433 Here we decide which Id's to place in the static reference table. An
434 internal top-level id will be in the environment with the appropriate
435 CafInfo, so we use that if available. An imported top-level Id will
436 have the CafInfo attached. Otherwise, we just ignore the Id.
439 getGlobalRefs :: UniqFM CafInfo -> [StgArg] -> UniqSet Id
440 getGlobalRefs rho args = mkUniqSet (concat (map (globalRefArg rho) args))
442 globalRefArg :: UniqFM CafInfo -> StgArg -> [Id]
444 globalRefArg rho (StgVarArg id)
447 case lookupUFM rho id of {
448 Just _ -> [id]; -- can't look at the caf_info yet...
451 if externallyVisibleId id
452 then case getIdCafInfo id of
453 MayHaveCafRefs -> [id]
458 globalRefArg rho _ = []
462 mayHaveCafRefs rho id =
463 case lookupUFM rho id of
464 Just MayHaveCafRefs -> True
465 Just NoCafRefs -> False
469 -----------------------------------------------------------------------------
473 attach_srt_bind :: StgBinding -> Int -> Int -> StgBinding
474 attach_srt_bind (StgNonRec binder rhs) off len =
475 StgNonRec binder (attach_srt_rhs rhs off len)
476 attach_srt_bind (StgRec binds) off len =
477 StgRec [ (v,attach_srt_rhs rhs off len) | (v,rhs) <- binds ]
479 attach_srt_rhs :: StgRhs -> Int -> Int -> StgRhs
480 attach_srt_rhs (StgRhsCon cc con args) off length
481 = StgRhsCon cc con args
482 attach_srt_rhs (StgRhsClosure cc bi _ free upd args rhs) off length
483 = StgRhsClosure cc bi srt free upd args rhs
485 srt | length == 0 = NoSRT
486 | otherwise = SRT off length
489 all_con_binds (StgNonRec x rhs) = con_rhs rhs
490 all_con_binds (StgRec bs) = all con_rhs (map snd bs)
492 con_rhs (StgRhsCon _ _ _) = True
499 -----------------------------------------------------------------------------
500 Fix up the SRT's in a let-no-escape.
502 (for a description of let-no-escapes, see CgLetNoEscape.lhs)
504 Here's the problem: a let-no-escape isn't represented by an activation
505 record on the stack. It seems either very difficult or impossible to
506 get the liveness bitmap right in the info table, so we don't do it
507 this way (the liveness mask isn't constant).
509 So, the question is how does the garbage collector get access to the
510 SRT for the rhs of the let-no-escape? It can't see an info table, so
511 it must get the SRT from somewhere else. Here's an example:
513 let-no-escape x = .... f ....
515 p -> .... x ... g ....
517 (f and g are global). Suppose we garbage collect while evaluating
518 'blah'. The stack will contain an activation record for the case,
519 which will point to an SRT containing [g] (according to our SRT
520 algorithm above). But, since the case continuation can call x, and
521 hence f, the SRT should really be [f,g].
525 let-no-escape {-rec-} z = \x -> case blah of
531 if we GC while evaluating blah2, then the case continuation on the
532 stack needs to refer to [f] in its SRT, because we can reach f by
533 calling z recursively.
537 The following code fixes up a let-no-escape expression after we've run
538 the SRT algorithm. It needs to know the SRT for the *whole*
539 expression (this is plugged in instead of the SRT for case exprsesions
540 in the body). The good news is that we only need to traverse nested
541 case expressions, since the let-no-escape bound variable can't occur
542 in the rhs of a let or in a case scrutinee.
544 For recursive let-no-escapes, the body is processed as for
545 non-recursive let-no-escapes, but case expressions in the rhs of each
546 binding have their SRTs replaced with the SRT for the binding group
547 (*not* the SRT of the whole let-no-escape expression).
550 lookupPossibleLNE lne_env f =
551 case lookupUFM lne_env f of
552 Nothing -> emptyUniqSet