2 % (c) The University of Glasgow 2006
3 % (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
7 module CgCase ( cgCase, saveVolatileVarsAndRegs,
8 restoreCurrentCostCentre
11 #include "HsVersions.h"
13 import {-# SOURCE #-} CgExpr ( cgExpr )
48 = GCMayHappen -- The scrutinee may involve GC, so everything must be
49 -- tidy before the code for the scrutinee.
51 | NoGC -- The scrutinee is a primitive value, or a call to a
52 -- primitive op which does no GC. Hence the case can
53 -- be done inline, without tidying up first.
56 It is quite interesting to decide whether to put a heap-check
57 at the start of each alternative. Of course we certainly have
58 to do so if the case forces an evaluation, or if there is a primitive
59 op which can trigger GC.
61 A more interesting situation is this:
68 default -> !C!; ...C...
71 where \tr{!x!} indicates a possible heap-check point. The heap checks
72 in the alternatives {\em can} be omitted, in which case the topmost
73 heapcheck will take their worst case into account.
75 In favour of omitting \tr{!B!}, \tr{!C!}:
77 - {\em May} save a heap overflow test,
78 if ...A... allocates anything. The other advantage
79 of this is that we can use relative addressing
80 from a single Hp to get at all the closures so allocated.
82 - No need to save volatile vars etc across the case
86 - May do more allocation than reqd. This sometimes bites us
87 badly. For example, nfib (ha!) allocates about 30\% more space if the
88 worst-casing is done, because many many calls to nfib are leaf calls
89 which don't need to allocate anything.
91 This never hurts us if there is only one alternative.
103 Special case #1: case of literal.
106 cgCase (StgLit lit) live_in_whole_case live_in_alts bndr
107 alt_type@(PrimAlt tycon) alts
108 = do { tmp_reg <- bindNewToTemp bndr
109 ; cm_lit <- cgLit lit
110 ; stmtC (CmmAssign (CmmLocal tmp_reg) (CmmLit cm_lit))
111 ; cgPrimAlts NoGC alt_type (CmmLocal tmp_reg) alts }
114 Special case #2: scrutinising a primitive-typed variable. No
115 evaluation required. We don't save volatile variables, nor do we do a
116 heap-check in the alternatives. Instead, the heap usage of the
117 alternatives is worst-cased and passed upstream. This can result in
118 allocating more heap than strictly necessary, but it will sometimes
119 eliminate a heap check altogether.
122 cgCase (StgApp v []) live_in_whole_case live_in_alts bndr
123 alt_type@(PrimAlt tycon) alts
124 = do { -- Careful! we can't just bind the default binder to the same thing
125 -- as the scrutinee, since it might be a stack location, and having
126 -- two bindings pointing at the same stack locn doesn't work (it
127 -- confuses nukeDeadBindings). Hence, use a new temp.
128 v_info <- getCgIdInfo v
129 ; amode <- idInfoToAmode v_info
130 ; tmp_reg <- bindNewToTemp bndr
131 ; stmtC (CmmAssign (CmmLocal tmp_reg) amode)
132 ; cgPrimAlts NoGC alt_type (CmmLocal tmp_reg) alts }
135 Special case #3: inline PrimOps and foreign calls.
138 cgCase (StgOpApp op@(StgPrimOp primop) args _)
139 live_in_whole_case live_in_alts bndr alt_type alts
140 | not (primOpOutOfLine primop)
141 = cgInlinePrimOp primop args bndr alt_type live_in_alts alts
144 TODO: Case-of-case of primop can probably be done inline too (but
145 maybe better to translate it out beforehand). See
146 ghc/lib/misc/PackedString.lhs for examples where this crops up (with
149 Special case #4: inline foreign calls: an unsafe foreign call can be done
150 right here, just like an inline primop.
153 cgCase (StgOpApp op@(StgFCallOp fcall _) args _)
154 live_in_whole_case live_in_alts bndr alt_type alts
155 | unsafe_foreign_call
156 = ASSERT( isSingleton alts )
157 do -- *must* be an unboxed tuple alt.
158 -- exactly like the cgInlinePrimOp case for unboxed tuple alts..
159 { res_tmps <- mapFCs bindNewToTemp non_void_res_ids
160 ; let res_hints = map (typeHint.idType) non_void_res_ids
161 ; cgForeignCall (zip res_tmps res_hints) fcall args live_in_alts
164 (_, res_ids, _, rhs) = head alts
165 non_void_res_ids = filter (nonVoidArg . idCgRep) res_ids
169 CCall (CCallSpec _ _ s) -> not (playSafe s)
173 Special case: scrutinising a non-primitive variable.
174 This can be done a little better than the general case, because
175 we can reuse/trim the stack slot holding the variable (if it is in one).
178 cgCase (StgApp fun args)
179 live_in_whole_case live_in_alts bndr alt_type alts
180 = do { fun_info <- getCgIdInfo fun
181 ; arg_amodes <- getArgAmodes args
183 -- Nuking dead bindings *before* calculating the saves is the
184 -- value-add here. We might end up freeing up some slots currently
185 -- occupied by variables only required for the call.
186 -- NOTE: we need to look up the variables used in the call before
187 -- doing this, because some of them may not be in the environment
189 ; nukeDeadBindings live_in_alts
190 ; (save_assts, alts_eob_info, maybe_cc_slot)
191 <- saveVolatileVarsAndRegs live_in_alts
194 <- forkEval alts_eob_info
195 (allocStackTop retAddrSizeW >> nopC)
196 (do { deAllocStackTop retAddrSizeW
197 ; cgEvalAlts maybe_cc_slot bndr alt_type alts })
199 ; setEndOfBlockInfo scrut_eob_info
200 (performTailCall fun_info arg_amodes save_assts) }
203 Note about return addresses: we *always* push a return address, even
204 if because of an optimisation we end up jumping direct to the return
205 code (not through the address itself). The alternatives always assume
206 that the return address is on the stack. The return address is
207 required in case the alternative performs a heap check, since it
208 encodes the liveness of the slots in the activation record.
210 On entry to the case alternative, we can re-use the slot containing
211 the return address immediately after the heap check. That's what the
212 deAllocStackTop call is doing above.
214 Finally, here is the general case.
217 cgCase expr live_in_whole_case live_in_alts bndr alt_type alts
218 = do { -- Figure out what volatile variables to save
219 nukeDeadBindings live_in_whole_case
221 ; (save_assts, alts_eob_info, maybe_cc_slot)
222 <- saveVolatileVarsAndRegs live_in_alts
224 -- Save those variables right now!
225 ; emitStmts save_assts
227 -- generate code for the alts
229 <- forkEval alts_eob_info
230 (do { nukeDeadBindings live_in_alts
231 ; allocStackTop retAddrSizeW -- space for retn address
233 (do { deAllocStackTop retAddrSizeW
234 ; cgEvalAlts maybe_cc_slot bndr alt_type alts })
236 ; setEndOfBlockInfo scrut_eob_info (cgExpr expr)
240 There's a lot of machinery going on behind the scenes to manage the
241 stack pointer here. forkEval takes the virtual Sp and free list from
242 the first argument, and turns that into the *real* Sp for the second
243 argument. It also uses this virtual Sp as the args-Sp in the EOB info
244 returned, so that the scrutinee will trim the real Sp back to the
245 right place before doing whatever it does.
246 --SDM (who just spent an hour figuring this out, and didn't want to
249 Why don't we push the return address just before evaluating the
250 scrutinee? Because the slot reserved for the return address might
251 contain something useful, so we wait until performing a tail call or
252 return before pushing the return address (see
253 CgTailCall.pushReturnAddress).
255 This also means that the environment doesn't need to know about the
256 free stack slot for the return address (for generating bitmaps),
257 because we don't reserve it until just before the eval.
259 TODO!! Problem: however, we have to save the current cost centre
260 stack somewhere, because at the eval point the current CCS might be
261 different. So we pick a free stack slot and save CCCS in it. One
262 consequence of this is that activation records on the stack don't
263 follow the layout of closures when we're profiling. The CCS could be
264 anywhere within the record).
266 %************************************************************************
270 %************************************************************************
273 cgInlinePrimOp primop args bndr (PrimAlt tycon) live_in_alts alts
274 | isVoidArg (idCgRep bndr)
275 = ASSERT( con == DEFAULT && isSingleton alts && null bs )
276 do { -- VOID RESULT; just sequencing,
277 -- so get in there and do it
278 cgPrimOp [] primop args live_in_alts
281 (con,bs,_,rhs) = head alts
283 cgInlinePrimOp primop args bndr (PrimAlt tycon) live_in_alts alts
284 = do { -- PRIMITIVE ALTS, with non-void result
285 tmp_reg <- bindNewToTemp bndr
286 ; cgPrimOp [tmp_reg] primop args live_in_alts
287 ; cgPrimAlts NoGC (PrimAlt tycon) (CmmLocal tmp_reg) alts }
289 cgInlinePrimOp primop args bndr (UbxTupAlt tycon) live_in_alts alts
290 = ASSERT( isSingleton alts )
291 do { -- UNBOXED TUPLE ALTS
292 -- No heap check, no yield, just get in there and do it.
293 -- NB: the case binder isn't bound to anything;
294 -- it has a unboxed tuple type
296 res_tmps <- mapFCs bindNewToTemp non_void_res_ids
297 ; cgPrimOp res_tmps primop args live_in_alts
300 (_, res_ids, _, rhs) = head alts
301 non_void_res_ids = filter (nonVoidArg . idCgRep) res_ids
303 cgInlinePrimOp primop args bndr (AlgAlt tycon) live_in_alts alts
304 = do { -- ENUMERATION TYPE RETURN
305 -- Typical: case a ># b of { True -> ..; False -> .. }
306 -- The primop itself returns an index into the table of
307 -- closures for the enumeration type.
308 tag_amode <- ASSERT( isEnumerationTyCon tycon )
309 do_enum_primop primop
311 -- Bind the default binder if necessary
312 -- (avoiding it avoids the assignment)
313 -- The deadness info is set by StgVarInfo
314 ; this_pkg <- getThisPackage
315 ; whenC (not (isDeadBinder bndr))
316 (do { tmp_reg <- bindNewToTemp bndr
319 (tagToClosure this_pkg tycon tag_amode)) })
322 ; (branches, mb_deflt) <- cgAlgAlts NoGC Nothing{-cc_slot-}
326 ; emitSwitch tag_amode branches mb_deflt 0 (tyConFamilySize tycon - 1)
330 do_enum_primop :: PrimOp -> FCode CmmExpr -- Returns amode for result
331 do_enum_primop TagToEnumOp -- No code!
333 (_,e) <- getArgAmode arg
335 do_enum_primop primop
336 = do tmp <- newNonPtrTemp wordRep
337 cgPrimOp [tmp] primop args live_in_alts
338 returnFC (CmmReg (CmmLocal tmp))
340 cgInlinePrimOp primop arg_amodes bndr PolyAlt live_in_alts alts
341 = pprPanic "cgCase: case of primop has polymorphic type" (ppr bndr)
344 %************************************************************************
346 \subsection[CgCase-alts]{Alternatives}
348 %************************************************************************
350 @cgEvalAlts@ returns an addressing mode for a continuation for the
351 alternatives of a @case@, used in a context when there
352 is some evaluation to be done.
355 cgEvalAlts :: Maybe VirtualSpOffset -- Offset of cost-centre to be restored, if any
359 -> FCode Sequel -- Any addr modes inside are guaranteed
360 -- to be a label so that we can duplicate it
361 -- without risk of duplicating code
363 cgEvalAlts cc_slot bndr alt_type@(PrimAlt tycon) alts
364 = do { let rep = tyConCgRep tycon
365 reg = dataReturnConvPrim rep -- Bottom for voidRep
367 ; abs_c <- forkProc $ do
368 { -- Bind the case binder, except if it's void
369 -- (reg is bottom in that case)
370 whenC (nonVoidArg rep) $
371 bindNewToReg bndr reg (mkLFArgument bndr)
372 ; restoreCurrentCostCentre cc_slot True
373 ; cgPrimAlts GCMayHappen alt_type reg alts }
375 ; lbl <- emitReturnTarget (idName bndr) abs_c
376 ; returnFC (CaseAlts lbl Nothing bndr) }
378 cgEvalAlts cc_slot bndr (UbxTupAlt _) [(con,args,_,rhs)]
379 = -- Unboxed tuple case
380 -- By now, the simplifier should have have turned it
381 -- into case e of (# a,b #) -> e
382 -- There shouldn't be a
383 -- case e of DEFAULT -> e
384 ASSERT2( case con of { DataAlt _ -> True; other -> False },
385 text "cgEvalAlts: dodgy case of unboxed tuple type" )
386 do { -- forkAbsC for the RHS, so that the envt is
387 -- not changed for the emitReturn call
388 abs_c <- forkProc $ do
389 { (live_regs, ptrs, nptrs, _) <- bindUnboxedTupleComponents args
390 -- Restore the CC *after* binding the tuple components,
391 -- so that we get the stack offset of the saved CC right.
392 ; restoreCurrentCostCentre cc_slot True
393 -- Generate a heap check if necessary
394 -- and finally the code for the alternative
395 ; unbxTupleHeapCheck live_regs ptrs nptrs noStmts
397 ; lbl <- emitReturnTarget (idName bndr) abs_c
398 ; returnFC (CaseAlts lbl Nothing bndr) }
400 cgEvalAlts cc_slot bndr alt_type alts
401 = -- Algebraic and polymorphic case
402 do { -- Bind the default binder
403 bindNewToReg bndr nodeReg (mkLFArgument bndr)
405 -- Generate sequel info for use downstream
406 -- At the moment, we only do it if the type is vector-returnable.
407 -- Reason: if not, then it costs extra to label the
408 -- alternatives, because we'd get return code like:
410 -- switch TagReg { 0 : JMP(alt_1); 1 : JMP(alt_2) ..etc }
412 -- which is worse than having the alt code in the switch statement
414 ; (alts, mb_deflt) <- cgAlgAlts GCMayHappen cc_slot alt_type alts
416 ; (lbl, branches) <- emitAlgReturnTarget (idName bndr)
419 ; returnFC (CaseAlts lbl branches bndr) }
421 fam_sz = case alt_type of
422 AlgAlt tc -> tyConFamilySize tc
427 HWL comment on {\em GrAnSim\/} (adding GRAN_YIELDs for context switch): If
428 we do an inlining of the case no separate functions for returning are
429 created, so we don't have to generate a GRAN_YIELD in that case. This info
430 must be propagated to cgAlgAltRhs (where the GRAN_YIELD macro might be
431 emitted). Hence, the new Bool arg to cgAlgAltRhs.
433 %************************************************************************
435 \subsection[CgCase-alg-alts]{Algebraic alternatives}
437 %************************************************************************
439 In @cgAlgAlts@, none of the binders in the alternatives are
440 assumed to be yet bound.
442 HWL comment on {\em GrAnSim\/} (adding GRAN_YIELDs for context switch): The
443 last arg of cgAlgAlts indicates if we want a context switch at the
444 beginning of each alternative. Normally we want that. The only exception
445 are inlined alternatives.
449 -> Maybe VirtualSpOffset
450 -> AltType -- ** AlgAlt or PolyAlt only **
451 -> [StgAlt] -- The alternatives
452 -> FCode ( [(ConTagZ, CgStmts)], -- The branches
453 Maybe CgStmts ) -- The default case
455 cgAlgAlts gc_flag cc_slot alt_type alts
456 = do alts <- forkAlts [ cgAlgAlt gc_flag cc_slot alt_type alt | alt <- alts]
458 mb_deflt = case alts of -- DEFAULT is always first, if present
459 ((DEFAULT,blks) : _) -> Just blks
462 branches = [(dataConTagZ con, blks)
463 | (DataAlt con, blks) <- alts]
465 return (branches, mb_deflt)
469 -> Maybe VirtualSpOffset -- Turgid state
470 -> AltType -- ** AlgAlt or PolyAlt only **
472 -> FCode (AltCon, CgStmts)
474 cgAlgAlt gc_flag cc_slot alt_type (con, args, use_mask, rhs)
475 = do { abs_c <- getCgStmts $ do
476 { bind_con_args con args
477 ; restoreCurrentCostCentre cc_slot True
478 ; maybeAltHeapCheck gc_flag alt_type (cgExpr rhs) }
479 ; return (con, abs_c) }
481 bind_con_args DEFAULT args = nopC
482 bind_con_args (DataAlt dc) args = bindConArgs dc args
486 %************************************************************************
488 \subsection[CgCase-prim-alts]{Primitive alternatives}
490 %************************************************************************
492 @cgPrimAlts@ generates suitable a @CSwitch@
493 for dealing with the alternatives of a primitive @case@, given an
494 addressing mode for the thing to scrutinise. It also keeps track of
495 the maximum stack depth encountered down any branch.
497 As usual, no binders in the alternatives are yet bound.
501 -> AltType -- Always PrimAlt, but passed to maybeAltHeapCheck
502 -> CmmReg -- Scrutinee
503 -> [StgAlt] -- Alternatives
505 -- NB: cgPrimAlts emits code that does the case analysis.
506 -- It's often used in inline situations, rather than to genearte
507 -- a labelled return point. That's why its interface is a little
508 -- different to cgAlgAlts
510 -- INVARIANT: the default binder is already bound
511 cgPrimAlts gc_flag alt_type scrutinee alts
512 = do { tagged_absCs <- forkAlts (map (cgPrimAlt gc_flag alt_type) alts)
513 ; let ((DEFAULT, deflt_absC) : others) = tagged_absCs -- There is always a default
514 alt_absCs = [(lit,rhs) | (LitAlt lit, rhs) <- others]
515 ; emitLitSwitch (CmmReg scrutinee) alt_absCs deflt_absC }
519 -> StgAlt -- The alternative
520 -> FCode (AltCon, CgStmts) -- Its compiled form
522 cgPrimAlt gc_flag alt_type (con, [], [], rhs)
523 = ASSERT( case con of { DEFAULT -> True; LitAlt _ -> True; other -> False } )
524 do { abs_c <- getCgStmts (maybeAltHeapCheck gc_flag alt_type (cgExpr rhs))
525 ; returnFC (con, abs_c) }
529 %************************************************************************
531 \subsection[CgCase-tidy]{Code for tidying up prior to an eval}
533 %************************************************************************
538 -> AltType -- PolyAlt, PrimAlt, AlgAlt, but *not* UbxTupAlt
539 -> Code -- Continuation
541 maybeAltHeapCheck NoGC _ code = code
542 maybeAltHeapCheck GCMayHappen alt_type code = altHeapCheck alt_type code
544 saveVolatileVarsAndRegs
545 :: StgLiveVars -- Vars which should be made safe
546 -> FCode (CmmStmts, -- Assignments to do the saves
547 EndOfBlockInfo, -- sequel for the alts
548 Maybe VirtualSpOffset) -- Slot for current cost centre
550 saveVolatileVarsAndRegs vars
551 = do { var_saves <- saveVolatileVars vars
552 ; (maybe_cc_slot, cc_save) <- saveCurrentCostCentre
553 ; eob_info <- getEndOfBlockInfo
554 ; returnFC (var_saves `plusStmts` cc_save,
559 saveVolatileVars :: StgLiveVars -- Vars which should be made safe
560 -> FCode CmmStmts -- Assignments to to the saves
562 saveVolatileVars vars
563 = do { stmts_s <- mapFCs save_it (varSetElems vars)
564 ; return (foldr plusStmts noStmts stmts_s) }
567 = do { v <- getCAddrModeIfVolatile var
569 Nothing -> return noStmts -- Non-volatile
570 Just vol_amode -> save_var var vol_amode -- Aha! It's volatile
573 save_var var vol_amode
574 = do { slot <- allocPrimStack (idCgRep var)
575 ; rebindToStack var slot
576 ; sp_rel <- getSpRelOffset slot
577 ; returnFC (oneStmt (CmmStore sp_rel vol_amode)) }
580 ---------------------------------------------------------------------------
582 When we save the current cost centre (which is done for lexical
583 scoping), we allocate a free stack location, and return (a)~the
584 virtual offset of the location, to pass on to the alternatives, and
585 (b)~the assignment to do the save (just as for @saveVolatileVars@).
588 saveCurrentCostCentre ::
589 FCode (Maybe VirtualSpOffset, -- Where we decide to store it
590 CmmStmts) -- Assignment to save it
592 saveCurrentCostCentre
593 | not opt_SccProfilingOn
594 = returnFC (Nothing, noStmts)
596 = do { slot <- allocPrimStack PtrArg
597 ; sp_rel <- getSpRelOffset slot
598 ; returnFC (Just slot,
599 oneStmt (CmmStore sp_rel curCCS)) }
601 -- Sometimes we don't free the slot containing the cost centre after restoring it
602 -- (see CgLetNoEscape.cgLetNoEscapeBody).
603 restoreCurrentCostCentre :: Maybe VirtualSpOffset -> Bool -> Code
604 restoreCurrentCostCentre Nothing _freeit = nopC
605 restoreCurrentCostCentre (Just slot) freeit
606 = do { sp_rel <- getSpRelOffset slot
607 ; whenC freeit (freeStackSlots [slot])
608 ; stmtC (CmmStore curCCSAddr (CmmLoad sp_rel wordRep)) }