2 % (c) The University of Glasgow 2006
3 % (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
7 module CgCase ( cgCase, saveVolatileVarsAndRegs,
8 restoreCurrentCostCentre
11 #include "HsVersions.h"
13 import {-# SOURCE #-} CgExpr ( cgExpr )
47 = GCMayHappen -- The scrutinee may involve GC, so everything must be
48 -- tidy before the code for the scrutinee.
50 | NoGC -- The scrutinee is a primitive value, or a call to a
51 -- primitive op which does no GC. Hence the case can
52 -- be done inline, without tidying up first.
55 It is quite interesting to decide whether to put a heap-check
56 at the start of each alternative. Of course we certainly have
57 to do so if the case forces an evaluation, or if there is a primitive
58 op which can trigger GC.
60 A more interesting situation is this:
67 default -> !C!; ...C...
70 where \tr{!x!} indicates a possible heap-check point. The heap checks
71 in the alternatives {\em can} be omitted, in which case the topmost
72 heapcheck will take their worst case into account.
74 In favour of omitting \tr{!B!}, \tr{!C!}:
76 - {\em May} save a heap overflow test,
77 if ...A... allocates anything. The other advantage
78 of this is that we can use relative addressing
79 from a single Hp to get at all the closures so allocated.
81 - No need to save volatile vars etc across the case
85 - May do more allocation than reqd. This sometimes bites us
86 badly. For example, nfib (ha!) allocates about 30\% more space if the
87 worst-casing is done, because many many calls to nfib are leaf calls
88 which don't need to allocate anything.
90 This never hurts us if there is only one alternative.
102 Special case #1: case of literal.
105 cgCase (StgLit lit) _live_in_whole_case _live_in_alts bndr
106 alt_type@(PrimAlt _) alts
107 = do { tmp_reg <- bindNewToTemp bndr
108 ; cm_lit <- cgLit lit
109 ; stmtC (CmmAssign (CmmLocal tmp_reg) (CmmLit cm_lit))
110 ; cgPrimAlts NoGC alt_type (CmmLocal tmp_reg) alts }
113 Special case #2: scrutinising a primitive-typed variable. No
114 evaluation required. We don't save volatile variables, nor do we do a
115 heap-check in the alternatives. Instead, the heap usage of the
116 alternatives is worst-cased and passed upstream. This can result in
117 allocating more heap than strictly necessary, but it will sometimes
118 eliminate a heap check altogether.
121 cgCase (StgApp v []) _live_in_whole_case _live_in_alts bndr
122 alt_type@(PrimAlt _) alts
123 = do { -- Careful! we can't just bind the default binder to the same thing
124 -- as the scrutinee, since it might be a stack location, and having
125 -- two bindings pointing at the same stack locn doesn't work (it
126 -- confuses nukeDeadBindings). Hence, use a new temp.
127 v_info <- getCgIdInfo v
128 ; amode <- idInfoToAmode v_info
129 ; tmp_reg <- bindNewToTemp bndr
130 ; stmtC (CmmAssign (CmmLocal tmp_reg) amode)
131 ; cgPrimAlts NoGC alt_type (CmmLocal tmp_reg) alts }
134 Special case #3: inline PrimOps and foreign calls.
137 cgCase (StgOpApp (StgPrimOp primop) args _)
138 _live_in_whole_case live_in_alts bndr alt_type alts
139 | not (primOpOutOfLine primop)
140 = cgInlinePrimOp primop args bndr alt_type live_in_alts alts
143 TODO: Case-of-case of primop can probably be done inline too (but
144 maybe better to translate it out beforehand). See
145 ghc/lib/misc/PackedString.lhs for examples where this crops up (with
148 Special case #4: inline foreign calls: an unsafe foreign call can be done
149 right here, just like an inline primop.
152 cgCase (StgOpApp (StgFCallOp fcall _) args _)
153 _live_in_whole_case live_in_alts _bndr _alt_type alts
154 | unsafe_foreign_call
155 = ASSERT( isSingleton alts )
156 do -- *must* be an unboxed tuple alt.
157 -- exactly like the cgInlinePrimOp case for unboxed tuple alts..
158 { res_tmps <- mapFCs bindNewToTemp non_void_res_ids
159 ; let res_hints = map (typeForeignHint.idType) non_void_res_ids
160 ; cgForeignCall (zipWith CmmHinted res_tmps res_hints) fcall args live_in_alts
163 (_, res_ids, _, rhs) = head alts
164 non_void_res_ids = filter (nonVoidArg . idCgRep) res_ids
168 CCall (CCallSpec _ _ s) -> not (playSafe s)
172 Special case: scrutinising a non-primitive variable.
173 This can be done a little better than the general case, because
174 we can reuse/trim the stack slot holding the variable (if it is in one).
177 cgCase (StgApp fun args)
178 _live_in_whole_case live_in_alts bndr alt_type alts
179 = do { fun_info <- getCgIdInfo fun
180 ; arg_amodes <- getArgAmodes args
182 -- Nuking dead bindings *before* calculating the saves is the
183 -- value-add here. We might end up freeing up some slots currently
184 -- occupied by variables only required for the call.
185 -- NOTE: we need to look up the variables used in the call before
186 -- doing this, because some of them may not be in the environment
188 ; nukeDeadBindings live_in_alts
189 ; (save_assts, alts_eob_info, maybe_cc_slot)
190 <- saveVolatileVarsAndRegs live_in_alts
193 <- forkEval alts_eob_info
194 (allocStackTop retAddrSizeW >> nopC)
195 (do { deAllocStackTop retAddrSizeW
196 ; cgEvalAlts maybe_cc_slot bndr alt_type alts })
198 ; setEndOfBlockInfo scrut_eob_info
199 (performTailCall fun_info arg_amodes save_assts) }
202 Note about return addresses: we *always* push a return address, even
203 if because of an optimisation we end up jumping direct to the return
204 code (not through the address itself). The alternatives always assume
205 that the return address is on the stack. The return address is
206 required in case the alternative performs a heap check, since it
207 encodes the liveness of the slots in the activation record.
209 On entry to the case alternative, we can re-use the slot containing
210 the return address immediately after the heap check. That's what the
211 deAllocStackTop call is doing above.
213 Finally, here is the general case.
216 cgCase expr live_in_whole_case live_in_alts bndr alt_type alts
217 = do { -- Figure out what volatile variables to save
218 nukeDeadBindings live_in_whole_case
220 ; (save_assts, alts_eob_info, maybe_cc_slot)
221 <- saveVolatileVarsAndRegs live_in_alts
223 -- Save those variables right now!
224 ; emitStmts save_assts
226 -- generate code for the alts
228 <- forkEval alts_eob_info
229 (do { nukeDeadBindings live_in_alts
230 ; allocStackTop retAddrSizeW -- space for retn address
232 (do { deAllocStackTop retAddrSizeW
233 ; cgEvalAlts maybe_cc_slot bndr alt_type alts })
235 ; setEndOfBlockInfo scrut_eob_info (cgExpr expr)
239 There's a lot of machinery going on behind the scenes to manage the
240 stack pointer here. forkEval takes the virtual Sp and free list from
241 the first argument, and turns that into the *real* Sp for the second
242 argument. It also uses this virtual Sp as the args-Sp in the EOB info
243 returned, so that the scrutinee will trim the real Sp back to the
244 right place before doing whatever it does.
245 --SDM (who just spent an hour figuring this out, and didn't want to
248 Why don't we push the return address just before evaluating the
249 scrutinee? Because the slot reserved for the return address might
250 contain something useful, so we wait until performing a tail call or
251 return before pushing the return address (see
252 CgTailCall.pushReturnAddress).
254 This also means that the environment doesn't need to know about the
255 free stack slot for the return address (for generating bitmaps),
256 because we don't reserve it until just before the eval.
258 TODO!! Problem: however, we have to save the current cost centre
259 stack somewhere, because at the eval point the current CCS might be
260 different. So we pick a free stack slot and save CCCS in it. One
261 consequence of this is that activation records on the stack don't
262 follow the layout of closures when we're profiling. The CCS could be
263 anywhere within the record).
265 %************************************************************************
269 %************************************************************************
272 cgInlinePrimOp :: PrimOp -> [StgArg] -> Id -> AltType -> StgLiveVars
273 -> [(AltCon, [Id], [Bool], StgExpr)]
275 cgInlinePrimOp primop args bndr (PrimAlt _) live_in_alts alts
276 | isVoidArg (idCgRep bndr)
277 = ASSERT( con == DEFAULT && isSingleton alts && null bs )
278 do { -- VOID RESULT; just sequencing,
279 -- so get in there and do it
280 cgPrimOp [] primop args live_in_alts
283 (con,bs,_,rhs) = head alts
285 cgInlinePrimOp primop args bndr (PrimAlt tycon) live_in_alts alts
286 = do { -- PRIMITIVE ALTS, with non-void result
287 tmp_reg <- bindNewToTemp bndr
288 ; cgPrimOp [tmp_reg] primop args live_in_alts
289 ; cgPrimAlts NoGC (PrimAlt tycon) (CmmLocal tmp_reg) alts }
291 cgInlinePrimOp primop args _ (UbxTupAlt _) live_in_alts alts
292 = ASSERT( isSingleton alts )
293 do { -- UNBOXED TUPLE ALTS
294 -- No heap check, no yield, just get in there and do it.
295 -- NB: the case binder isn't bound to anything;
296 -- it has a unboxed tuple type
298 res_tmps <- mapFCs bindNewToTemp non_void_res_ids
299 ; cgPrimOp res_tmps primop args live_in_alts
302 (_, res_ids, _, rhs) = head alts
303 non_void_res_ids = filter (nonVoidArg . idCgRep) res_ids
305 cgInlinePrimOp primop args bndr (AlgAlt tycon) live_in_alts alts
306 = do { -- ENUMERATION TYPE RETURN
307 -- Typical: case a ># b of { True -> ..; False -> .. }
308 -- The primop itself returns an index into the table of
309 -- closures for the enumeration type.
310 tag_amode <- ASSERT( isEnumerationTyCon tycon )
311 do_enum_primop primop
313 -- Bind the default binder if necessary
314 -- (avoiding it avoids the assignment)
315 -- The deadness info is set by StgVarInfo
316 ; whenC (not (isDeadBinder bndr))
317 (do { tmp_reg <- bindNewToTemp bndr
320 (tagToClosure tycon tag_amode)) })
323 ; (branches, mb_deflt) <- cgAlgAlts NoGC Nothing{-cc_slot-}
327 ; emitSwitch tag_amode branches mb_deflt 0 (tyConFamilySize tycon - 1)
331 do_enum_primop :: PrimOp -> FCode CmmExpr -- Returns amode for result
332 do_enum_primop TagToEnumOp -- No code!
334 (_,e) <- getArgAmode arg
336 do_enum_primop primop
337 = do tmp <- newTemp bWord
338 cgPrimOp [tmp] primop args live_in_alts
339 returnFC (CmmReg (CmmLocal tmp))
341 cgInlinePrimOp _ _ bndr _ _ _
342 = pprPanic "cgCase: case of primop has polymorphic type" (ppr bndr)
345 %************************************************************************
347 \subsection[CgCase-alts]{Alternatives}
349 %************************************************************************
351 @cgEvalAlts@ returns an addressing mode for a continuation for the
352 alternatives of a @case@, used in a context when there
353 is some evaluation to be done.
356 cgEvalAlts :: Maybe VirtualSpOffset -- Offset of cost-centre to be restored, if any
360 -> FCode Sequel -- Any addr modes inside are guaranteed
361 -- to be a label so that we can duplicate it
362 -- without risk of duplicating code
364 cgEvalAlts cc_slot bndr alt_type@(PrimAlt tycon) alts
365 = do { let rep = tyConCgRep tycon
366 reg = dataReturnConvPrim rep -- Bottom for voidRep
368 ; abs_c <- forkProc $ do
369 { -- Bind the case binder, except if it's void
370 -- (reg is bottom in that case)
371 whenC (nonVoidArg rep) $
372 bindNewToReg bndr reg (mkLFArgument bndr)
373 ; restoreCurrentCostCentre cc_slot True
374 ; cgPrimAlts GCMayHappen alt_type reg alts }
376 ; lbl <- emitReturnTarget (idName bndr) abs_c
377 ; returnFC (CaseAlts lbl Nothing bndr) }
379 cgEvalAlts cc_slot bndr (UbxTupAlt _) [(con,args,_,rhs)]
380 = -- Unboxed tuple case
381 -- By now, the simplifier should have have turned it
382 -- into case e of (# a,b #) -> e
383 -- There shouldn't be a
384 -- case e of DEFAULT -> e
385 ASSERT2( case con of { DataAlt _ -> True; _ -> False },
386 text "cgEvalAlts: dodgy case of unboxed tuple type" )
387 do { -- forkAbsC for the RHS, so that the envt is
388 -- not changed for the emitReturn call
389 abs_c <- forkProc $ do
390 { (live_regs, ptrs, nptrs, _) <- bindUnboxedTupleComponents args
391 -- Restore the CC *after* binding the tuple components,
392 -- so that we get the stack offset of the saved CC right.
393 ; restoreCurrentCostCentre cc_slot True
394 -- Generate a heap check if necessary
395 -- and finally the code for the alternative
396 ; unbxTupleHeapCheck live_regs ptrs nptrs noStmts
398 ; lbl <- emitReturnTarget (idName bndr) abs_c
399 ; returnFC (CaseAlts lbl Nothing bndr) }
401 cgEvalAlts cc_slot bndr alt_type alts
402 = -- Algebraic and polymorphic case
403 do { -- Bind the default binder
404 bindNewToReg bndr nodeReg (mkLFArgument bndr)
406 -- Generate sequel info for use downstream
407 -- At the moment, we only do it if the type is vector-returnable.
408 -- Reason: if not, then it costs extra to label the
409 -- alternatives, because we'd get return code like:
411 -- switch TagReg { 0 : JMP(alt_1); 1 : JMP(alt_2) ..etc }
413 -- which is worse than having the alt code in the switch statement
415 ; (alts, mb_deflt) <- cgAlgAlts GCMayHappen cc_slot alt_type alts
417 ; (lbl, branches) <- emitAlgReturnTarget (idName bndr)
420 ; returnFC (CaseAlts lbl branches bndr) }
422 fam_sz = case alt_type of
423 AlgAlt tc -> tyConFamilySize tc
425 PrimAlt _ -> panic "cgEvalAlts: PrimAlt"
426 UbxTupAlt _ -> panic "cgEvalAlts: UbxTupAlt"
430 HWL comment on {\em GrAnSim\/} (adding GRAN_YIELDs for context switch): If
431 we do an inlining of the case no separate functions for returning are
432 created, so we don't have to generate a GRAN_YIELD in that case. This info
433 must be propagated to cgAlgAltRhs (where the GRAN_YIELD macro might be
434 emitted). Hence, the new Bool arg to cgAlgAltRhs.
436 %************************************************************************
438 \subsection[CgCase-alg-alts]{Algebraic alternatives}
440 %************************************************************************
442 In @cgAlgAlts@, none of the binders in the alternatives are
443 assumed to be yet bound.
445 HWL comment on {\em GrAnSim\/} (adding GRAN_YIELDs for context switch): The
446 last arg of cgAlgAlts indicates if we want a context switch at the
447 beginning of each alternative. Normally we want that. The only exception
448 are inlined alternatives.
452 -> Maybe VirtualSpOffset
453 -> AltType -- ** AlgAlt or PolyAlt only **
454 -> [StgAlt] -- The alternatives
455 -> FCode ( [(ConTagZ, CgStmts)], -- The branches
456 Maybe CgStmts ) -- The default case
458 cgAlgAlts gc_flag cc_slot alt_type alts
459 = do alts <- forkAlts [ cgAlgAlt gc_flag cc_slot alt_type alt | alt <- alts]
461 mb_deflt = case alts of -- DEFAULT is always first, if present
462 ((DEFAULT,blks) : _) -> Just blks
465 branches = [(dataConTagZ con, blks)
466 | (DataAlt con, blks) <- alts]
468 return (branches, mb_deflt)
472 -> Maybe VirtualSpOffset -- Turgid state
473 -> AltType -- ** AlgAlt or PolyAlt only **
475 -> FCode (AltCon, CgStmts)
477 cgAlgAlt gc_flag cc_slot alt_type (con, args, _use_mask, rhs)
478 = do { abs_c <- getCgStmts $ do
479 { bind_con_args con args
480 ; restoreCurrentCostCentre cc_slot True
481 ; maybeAltHeapCheck gc_flag alt_type (cgExpr rhs) }
482 ; return (con, abs_c) }
484 bind_con_args DEFAULT _ = nopC
485 bind_con_args (DataAlt dc) args = bindConArgs dc args
486 bind_con_args (LitAlt _) _ = panic "cgAlgAlt: LitAlt"
490 %************************************************************************
492 \subsection[CgCase-prim-alts]{Primitive alternatives}
494 %************************************************************************
496 @cgPrimAlts@ generates suitable a @CSwitch@
497 for dealing with the alternatives of a primitive @case@, given an
498 addressing mode for the thing to scrutinise. It also keeps track of
499 the maximum stack depth encountered down any branch.
501 As usual, no binders in the alternatives are yet bound.
505 -> AltType -- Always PrimAlt, but passed to maybeAltHeapCheck
506 -> CmmReg -- Scrutinee
507 -> [StgAlt] -- Alternatives
509 -- NB: cgPrimAlts emits code that does the case analysis.
510 -- It's often used in inline situations, rather than to genearte
511 -- a labelled return point. That's why its interface is a little
512 -- different to cgAlgAlts
514 -- INVARIANT: the default binder is already bound
515 cgPrimAlts gc_flag alt_type scrutinee alts
516 = do { tagged_absCs <- forkAlts (map (cgPrimAlt gc_flag alt_type) alts)
517 ; let ((DEFAULT, deflt_absC) : others) = tagged_absCs -- There is always a default
518 alt_absCs = [(lit,rhs) | (LitAlt lit, rhs) <- others]
519 ; emitLitSwitch (CmmReg scrutinee) alt_absCs deflt_absC }
523 -> StgAlt -- The alternative
524 -> FCode (AltCon, CgStmts) -- Its compiled form
526 cgPrimAlt gc_flag alt_type (con, [], [], rhs)
527 = ASSERT( case con of { DEFAULT -> True; LitAlt _ -> True; _ -> False } )
528 do { abs_c <- getCgStmts (maybeAltHeapCheck gc_flag alt_type (cgExpr rhs))
529 ; returnFC (con, abs_c) }
530 cgPrimAlt _ _ _ = panic "cgPrimAlt: non-empty lists"
534 %************************************************************************
536 \subsection[CgCase-tidy]{Code for tidying up prior to an eval}
538 %************************************************************************
543 -> AltType -- PolyAlt, PrimAlt, AlgAlt, but *not* UbxTupAlt
544 -> Code -- Continuation
546 maybeAltHeapCheck NoGC _ code = code
547 maybeAltHeapCheck GCMayHappen alt_type code = altHeapCheck alt_type code
549 saveVolatileVarsAndRegs
550 :: StgLiveVars -- Vars which should be made safe
551 -> FCode (CmmStmts, -- Assignments to do the saves
552 EndOfBlockInfo, -- sequel for the alts
553 Maybe VirtualSpOffset) -- Slot for current cost centre
555 saveVolatileVarsAndRegs vars
556 = do { var_saves <- saveVolatileVars vars
557 ; (maybe_cc_slot, cc_save) <- saveCurrentCostCentre
558 ; eob_info <- getEndOfBlockInfo
559 ; returnFC (var_saves `plusStmts` cc_save,
564 saveVolatileVars :: StgLiveVars -- Vars which should be made safe
565 -> FCode CmmStmts -- Assignments to to the saves
567 saveVolatileVars vars
568 = do { stmts_s <- mapFCs save_it (varSetElems vars)
569 ; return (foldr plusStmts noStmts stmts_s) }
572 = do { v <- getCAddrModeIfVolatile var
574 Nothing -> return noStmts -- Non-volatile
575 Just vol_amode -> save_var var vol_amode -- Aha! It's volatile
578 save_var var vol_amode
579 = do { slot <- allocPrimStack (idCgRep var)
580 ; rebindToStack var slot
581 ; sp_rel <- getSpRelOffset slot
582 ; returnFC (oneStmt (CmmStore sp_rel vol_amode)) }
585 ---------------------------------------------------------------------------
587 When we save the current cost centre (which is done for lexical
588 scoping), we allocate a free stack location, and return (a)~the
589 virtual offset of the location, to pass on to the alternatives, and
590 (b)~the assignment to do the save (just as for @saveVolatileVars@).
593 saveCurrentCostCentre ::
594 FCode (Maybe VirtualSpOffset, -- Where we decide to store it
595 CmmStmts) -- Assignment to save it
597 saveCurrentCostCentre
598 | not opt_SccProfilingOn
599 = returnFC (Nothing, noStmts)
601 = do { slot <- allocPrimStack PtrArg
602 ; sp_rel <- getSpRelOffset slot
603 ; returnFC (Just slot,
604 oneStmt (CmmStore sp_rel curCCS)) }
606 -- Sometimes we don't free the slot containing the cost centre after restoring it
607 -- (see CgLetNoEscape.cgLetNoEscapeBody).
608 restoreCurrentCostCentre :: Maybe VirtualSpOffset -> Bool -> Code
609 restoreCurrentCostCentre Nothing _freeit = nopC
610 restoreCurrentCostCentre (Just slot) freeit
611 = do { sp_rel <- getSpRelOffset slot
612 ; whenC freeit (freeStackSlots [slot])
613 ; stmtC (CmmStore curCCSAddr (CmmLoad sp_rel bWord)) }