2 % (c) The University of Glasgow 2006
3 % (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
7 module CgCase ( cgCase, saveVolatileVarsAndRegs,
8 restoreCurrentCostCentre
11 #include "HsVersions.h"
13 import {-# SOURCE #-} CgExpr ( cgExpr )
48 = GCMayHappen -- The scrutinee may involve GC, so everything must be
49 -- tidy before the code for the scrutinee.
51 | NoGC -- The scrutinee is a primitive value, or a call to a
52 -- primitive op which does no GC. Hence the case can
53 -- be done inline, without tidying up first.
56 It is quite interesting to decide whether to put a heap-check
57 at the start of each alternative. Of course we certainly have
58 to do so if the case forces an evaluation, or if there is a primitive
59 op which can trigger GC.
61 A more interesting situation is this:
68 default -> !C!; ...C...
71 where \tr{!x!} indicates a possible heap-check point. The heap checks
72 in the alternatives {\em can} be omitted, in which case the topmost
73 heapcheck will take their worst case into account.
75 In favour of omitting \tr{!B!}, \tr{!C!}:
77 - {\em May} save a heap overflow test,
78 if ...A... allocates anything. The other advantage
79 of this is that we can use relative addressing
80 from a single Hp to get at all the closures so allocated.
82 - No need to save volatile vars etc across the case
86 - May do more allocation than reqd. This sometimes bites us
87 badly. For example, nfib (ha!) allocates about 30\% more space if the
88 worst-casing is done, because many many calls to nfib are leaf calls
89 which don't need to allocate anything.
91 This never hurts us if there is only one alternative.
104 Special case #1: case of literal.
107 cgCase (StgLit lit) live_in_whole_case live_in_alts bndr srt
108 alt_type@(PrimAlt tycon) alts
109 = do { tmp_reg <- bindNewToTemp bndr
110 ; cm_lit <- cgLit lit
111 ; stmtC (CmmAssign tmp_reg (CmmLit cm_lit))
112 ; cgPrimAlts NoGC alt_type tmp_reg alts }
115 Special case #2: scrutinising a primitive-typed variable. No
116 evaluation required. We don't save volatile variables, nor do we do a
117 heap-check in the alternatives. Instead, the heap usage of the
118 alternatives is worst-cased and passed upstream. This can result in
119 allocating more heap than strictly necessary, but it will sometimes
120 eliminate a heap check altogether.
123 cgCase (StgApp v []) live_in_whole_case live_in_alts bndr srt
124 alt_type@(PrimAlt tycon) alts
125 = do { -- Careful! we can't just bind the default binder to the same thing
126 -- as the scrutinee, since it might be a stack location, and having
127 -- two bindings pointing at the same stack locn doesn't work (it
128 -- confuses nukeDeadBindings). Hence, use a new temp.
129 v_info <- getCgIdInfo v
130 ; amode <- idInfoToAmode v_info
131 ; tmp_reg <- bindNewToTemp bndr
132 ; stmtC (CmmAssign tmp_reg amode)
133 ; cgPrimAlts NoGC alt_type tmp_reg alts }
136 Special case #3: inline PrimOps and foreign calls.
139 cgCase (StgOpApp op@(StgPrimOp primop) args _)
140 live_in_whole_case live_in_alts bndr srt alt_type alts
141 | not (primOpOutOfLine primop)
142 = cgInlinePrimOp primop args bndr alt_type live_in_alts alts
145 TODO: Case-of-case of primop can probably be done inline too (but
146 maybe better to translate it out beforehand). See
147 ghc/lib/misc/PackedString.lhs for examples where this crops up (with
150 Special case #4: inline foreign calls: an unsafe foreign call can be done
151 right here, just like an inline primop.
154 cgCase (StgOpApp op@(StgFCallOp fcall _) args _)
155 live_in_whole_case live_in_alts bndr srt alt_type alts
156 | unsafe_foreign_call
157 = ASSERT( isSingleton alts )
158 do -- *must* be an unboxed tuple alt.
159 -- exactly like the cgInlinePrimOp case for unboxed tuple alts..
160 { res_tmps <- mapFCs bindNewToTemp non_void_res_ids
161 ; let res_hints = map (typeHint.idType) non_void_res_ids
162 ; cgForeignCall (zip res_tmps res_hints) fcall args live_in_alts
165 (_, res_ids, _, rhs) = head alts
166 non_void_res_ids = filter (nonVoidArg . idCgRep) res_ids
170 CCall (CCallSpec _ _ s) -> not (playSafe s)
174 Special case: scrutinising a non-primitive variable.
175 This can be done a little better than the general case, because
176 we can reuse/trim the stack slot holding the variable (if it is in one).
179 cgCase (StgApp fun args)
180 live_in_whole_case live_in_alts bndr srt alt_type alts
181 = do { fun_info <- getCgIdInfo fun
182 ; arg_amodes <- getArgAmodes args
184 -- Nuking dead bindings *before* calculating the saves is the
185 -- value-add here. We might end up freeing up some slots currently
186 -- occupied by variables only required for the call.
187 -- NOTE: we need to look up the variables used in the call before
188 -- doing this, because some of them may not be in the environment
190 ; nukeDeadBindings live_in_alts
191 ; (save_assts, alts_eob_info, maybe_cc_slot)
192 <- saveVolatileVarsAndRegs live_in_alts
195 <- forkEval alts_eob_info
196 (allocStackTop retAddrSizeW >> nopC)
197 (do { deAllocStackTop retAddrSizeW
198 ; cgEvalAlts maybe_cc_slot bndr srt alt_type alts })
200 ; setEndOfBlockInfo (maybeReserveSeqFrame alt_type scrut_eob_info)
201 (performTailCall fun_info arg_amodes save_assts) }
204 Note about return addresses: we *always* push a return address, even
205 if because of an optimisation we end up jumping direct to the return
206 code (not through the address itself). The alternatives always assume
207 that the return address is on the stack. The return address is
208 required in case the alternative performs a heap check, since it
209 encodes the liveness of the slots in the activation record.
211 On entry to the case alternative, we can re-use the slot containing
212 the return address immediately after the heap check. That's what the
213 deAllocStackTop call is doing above.
215 Finally, here is the general case.
218 cgCase expr live_in_whole_case live_in_alts bndr srt alt_type alts
219 = do { -- Figure out what volatile variables to save
220 nukeDeadBindings live_in_whole_case
222 ; (save_assts, alts_eob_info, maybe_cc_slot)
223 <- saveVolatileVarsAndRegs live_in_alts
225 -- Save those variables right now!
226 ; emitStmts save_assts
228 -- generate code for the alts
230 <- forkEval alts_eob_info
231 (do { nukeDeadBindings live_in_alts
232 ; allocStackTop retAddrSizeW -- space for retn address
234 (do { deAllocStackTop retAddrSizeW
235 ; cgEvalAlts maybe_cc_slot bndr srt alt_type alts })
237 ; setEndOfBlockInfo (maybeReserveSeqFrame alt_type scrut_eob_info)
242 There's a lot of machinery going on behind the scenes to manage the
243 stack pointer here. forkEval takes the virtual Sp and free list from
244 the first argument, and turns that into the *real* Sp for the second
245 argument. It also uses this virtual Sp as the args-Sp in the EOB info
246 returned, so that the scrutinee will trim the real Sp back to the
247 right place before doing whatever it does.
248 --SDM (who just spent an hour figuring this out, and didn't want to
251 Why don't we push the return address just before evaluating the
252 scrutinee? Because the slot reserved for the return address might
253 contain something useful, so we wait until performing a tail call or
254 return before pushing the return address (see
255 CgTailCall.pushReturnAddress).
257 This also means that the environment doesn't need to know about the
258 free stack slot for the return address (for generating bitmaps),
259 because we don't reserve it until just before the eval.
261 TODO!! Problem: however, we have to save the current cost centre
262 stack somewhere, because at the eval point the current CCS might be
263 different. So we pick a free stack slot and save CCCS in it. One
264 consequence of this is that activation records on the stack don't
265 follow the layout of closures when we're profiling. The CCS could be
266 anywhere within the record).
269 maybeReserveSeqFrame PolyAlt (EndOfBlockInfo args_sp (CaseAlts amode stuff bndr _))
270 = EndOfBlockInfo (args_sp + retAddrSizeW) (CaseAlts amode stuff bndr True)
271 maybeReserveSeqFrame other scrut_eob_info = scrut_eob_info
275 %************************************************************************
279 %************************************************************************
282 cgInlinePrimOp primop args bndr (PrimAlt tycon) live_in_alts alts
283 | isVoidArg (idCgRep bndr)
284 = ASSERT( con == DEFAULT && isSingleton alts && null bs )
285 do { -- VOID RESULT; just sequencing,
286 -- so get in there and do it
287 cgPrimOp [] primop args live_in_alts
290 (con,bs,_,rhs) = head alts
292 cgInlinePrimOp primop args bndr (PrimAlt tycon) live_in_alts alts
293 = do { -- PRIMITIVE ALTS, with non-void result
294 tmp_reg <- bindNewToTemp bndr
295 ; cgPrimOp [tmp_reg] primop args live_in_alts
296 ; cgPrimAlts NoGC (PrimAlt tycon) tmp_reg alts }
298 cgInlinePrimOp primop args bndr (UbxTupAlt tycon) live_in_alts alts
299 = ASSERT( isSingleton alts )
300 do { -- UNBOXED TUPLE ALTS
301 -- No heap check, no yield, just get in there and do it.
302 -- NB: the case binder isn't bound to anything;
303 -- it has a unboxed tuple type
305 res_tmps <- mapFCs bindNewToTemp non_void_res_ids
306 ; cgPrimOp res_tmps primop args live_in_alts
309 (_, res_ids, _, rhs) = head alts
310 non_void_res_ids = filter (nonVoidArg . idCgRep) res_ids
312 cgInlinePrimOp primop args bndr (AlgAlt tycon) live_in_alts alts
313 = do { -- ENUMERATION TYPE RETURN
314 -- Typical: case a ># b of { True -> ..; False -> .. }
315 -- The primop itself returns an index into the table of
316 -- closures for the enumeration type.
317 tag_amode <- ASSERT( isEnumerationTyCon tycon )
318 do_enum_primop primop
320 -- Bind the default binder if necessary
321 -- (avoiding it avoids the assignment)
322 -- The deadness info is set by StgVarInfo
323 ; this_pkg <- getThisPackage
324 ; whenC (not (isDeadBinder bndr))
325 (do { tmp_reg <- bindNewToTemp bndr
326 ; stmtC (CmmAssign tmp_reg (tagToClosure this_pkg tycon tag_amode)) })
329 ; (branches, mb_deflt) <- cgAlgAlts NoGC Nothing{-cc_slot-}
333 ; emitSwitch tag_amode branches mb_deflt 0 (tyConFamilySize tycon - 1)
337 do_enum_primop :: PrimOp -> FCode CmmExpr -- Returns amode for result
338 do_enum_primop TagToEnumOp -- No code!
340 (_,e) <- getArgAmode arg
342 do_enum_primop primop
343 = do tmp <- newTemp wordRep
344 cgPrimOp [tmp] primop args live_in_alts
345 returnFC (CmmReg tmp)
347 cgInlinePrimOp primop arg_amodes bndr PolyAlt live_in_alts alts
348 = pprPanic "cgCase: case of primop has polymorphic type" (ppr bndr)
351 %************************************************************************
353 \subsection[CgCase-alts]{Alternatives}
355 %************************************************************************
357 @cgEvalAlts@ returns an addressing mode for a continuation for the
358 alternatives of a @case@, used in a context when there
359 is some evaluation to be done.
362 cgEvalAlts :: Maybe VirtualSpOffset -- Offset of cost-centre to be restored, if any
364 -> SRT -- SRT for the continuation
367 -> FCode Sequel -- Any addr modes inside are guaranteed
368 -- to be a label so that we can duplicate it
369 -- without risk of duplicating code
371 cgEvalAlts cc_slot bndr srt alt_type@(PrimAlt tycon) alts
372 = do { let rep = tyConCgRep tycon
373 reg = dataReturnConvPrim rep -- Bottom for voidRep
375 ; abs_c <- forkProc $ do
376 { -- Bind the case binder, except if it's void
377 -- (reg is bottom in that case)
378 whenC (nonVoidArg rep) $
379 bindNewToReg bndr reg (mkLFArgument bndr)
380 ; restoreCurrentCostCentre cc_slot True
381 ; cgPrimAlts GCMayHappen alt_type reg alts }
383 ; lbl <- emitDirectReturnTarget (idName bndr) abs_c srt
384 ; returnFC (CaseAlts lbl Nothing bndr False) }
386 cgEvalAlts cc_slot bndr srt (UbxTupAlt _) [(con,args,_,rhs)]
387 = -- Unboxed tuple case
388 -- By now, the simplifier should have have turned it
389 -- into case e of (# a,b #) -> e
390 -- There shouldn't be a
391 -- case e of DEFAULT -> e
392 ASSERT2( case con of { DataAlt _ -> True; other -> False },
393 text "cgEvalAlts: dodgy case of unboxed tuple type" )
394 do { -- forkAbsC for the RHS, so that the envt is
395 -- not changed for the emitDirectReturn call
396 abs_c <- forkProc $ do
397 { (live_regs, ptrs, nptrs, _) <- bindUnboxedTupleComponents args
398 -- Restore the CC *after* binding the tuple components,
399 -- so that we get the stack offset of the saved CC right.
400 ; restoreCurrentCostCentre cc_slot True
401 -- Generate a heap check if necessary
402 -- and finally the code for the alternative
403 ; unbxTupleHeapCheck live_regs ptrs nptrs noStmts
405 ; lbl <- emitDirectReturnTarget (idName bndr) abs_c srt
406 ; returnFC (CaseAlts lbl Nothing bndr False) }
408 cgEvalAlts cc_slot bndr srt alt_type alts
409 = -- Algebraic and polymorphic case
410 do { -- Bind the default binder
411 bindNewToReg bndr nodeReg (mkLFArgument bndr)
413 -- Generate sequel info for use downstream
414 -- At the moment, we only do it if the type is vector-returnable.
415 -- Reason: if not, then it costs extra to label the
416 -- alternatives, because we'd get return code like:
418 -- switch TagReg { 0 : JMP(alt_1); 1 : JMP(alt_2) ..etc }
420 -- which is worse than having the alt code in the switch statement
422 ; (alts, mb_deflt) <- cgAlgAlts GCMayHappen cc_slot alt_type alts
424 ; (lbl, branches) <- emitAlgReturnTarget (idName bndr)
425 alts mb_deflt srt ret_conv
427 ; returnFC (CaseAlts lbl branches bndr False) }
429 ret_conv = case alt_type of
430 AlgAlt tc -> ctrlReturnConvAlg tc
431 PolyAlt -> UnvectoredReturn 0
435 HWL comment on {\em GrAnSim\/} (adding GRAN_YIELDs for context switch): If
436 we do an inlining of the case no separate functions for returning are
437 created, so we don't have to generate a GRAN_YIELD in that case. This info
438 must be propagated to cgAlgAltRhs (where the GRAN_YIELD macro might be
439 emitted). Hence, the new Bool arg to cgAlgAltRhs.
441 %************************************************************************
443 \subsection[CgCase-alg-alts]{Algebraic alternatives}
445 %************************************************************************
447 In @cgAlgAlts@, none of the binders in the alternatives are
448 assumed to be yet bound.
450 HWL comment on {\em GrAnSim\/} (adding GRAN_YIELDs for context switch): The
451 last arg of cgAlgAlts indicates if we want a context switch at the
452 beginning of each alternative. Normally we want that. The only exception
453 are inlined alternatives.
457 -> Maybe VirtualSpOffset
458 -> AltType -- ** AlgAlt or PolyAlt only **
459 -> [StgAlt] -- The alternatives
460 -> FCode ( [(ConTagZ, CgStmts)], -- The branches
461 Maybe CgStmts ) -- The default case
463 cgAlgAlts gc_flag cc_slot alt_type alts
464 = do alts <- forkAlts [ cgAlgAlt gc_flag cc_slot alt_type alt | alt <- alts]
466 mb_deflt = case alts of -- DEFAULT is always first, if present
467 ((DEFAULT,blks) : _) -> Just blks
470 branches = [(dataConTagZ con, blks)
471 | (DataAlt con, blks) <- alts]
473 return (branches, mb_deflt)
477 -> Maybe VirtualSpOffset -- Turgid state
478 -> AltType -- ** AlgAlt or PolyAlt only **
480 -> FCode (AltCon, CgStmts)
482 cgAlgAlt gc_flag cc_slot alt_type (con, args, use_mask, rhs)
483 = do { abs_c <- getCgStmts $ do
484 { bind_con_args con args
485 ; restoreCurrentCostCentre cc_slot True
486 ; maybeAltHeapCheck gc_flag alt_type (cgExpr rhs) }
487 ; return (con, abs_c) }
489 bind_con_args DEFAULT args = nopC
490 bind_con_args (DataAlt dc) args = bindConArgs dc args
494 %************************************************************************
496 \subsection[CgCase-prim-alts]{Primitive alternatives}
498 %************************************************************************
500 @cgPrimAlts@ generates suitable a @CSwitch@
501 for dealing with the alternatives of a primitive @case@, given an
502 addressing mode for the thing to scrutinise. It also keeps track of
503 the maximum stack depth encountered down any branch.
505 As usual, no binders in the alternatives are yet bound.
509 -> AltType -- Always PrimAlt, but passed to maybeAltHeapCheck
510 -> CmmReg -- Scrutinee
511 -> [StgAlt] -- Alternatives
513 -- NB: cgPrimAlts emits code that does the case analysis.
514 -- It's often used in inline situations, rather than to genearte
515 -- a labelled return point. That's why its interface is a little
516 -- different to cgAlgAlts
518 -- INVARIANT: the default binder is already bound
519 cgPrimAlts gc_flag alt_type scrutinee alts
520 = do { tagged_absCs <- forkAlts (map (cgPrimAlt gc_flag alt_type) alts)
521 ; let ((DEFAULT, deflt_absC) : others) = tagged_absCs -- There is always a default
522 alt_absCs = [(lit,rhs) | (LitAlt lit, rhs) <- others]
523 ; emitLitSwitch (CmmReg scrutinee) alt_absCs deflt_absC }
527 -> StgAlt -- The alternative
528 -> FCode (AltCon, CgStmts) -- Its compiled form
530 cgPrimAlt gc_flag alt_type (con, [], [], rhs)
531 = ASSERT( case con of { DEFAULT -> True; LitAlt _ -> True; other -> False } )
532 do { abs_c <- getCgStmts (maybeAltHeapCheck gc_flag alt_type (cgExpr rhs))
533 ; returnFC (con, abs_c) }
537 %************************************************************************
539 \subsection[CgCase-tidy]{Code for tidying up prior to an eval}
541 %************************************************************************
546 -> AltType -- PolyAlt, PrimAlt, AlgAlt, but *not* UbxTupAlt
547 -> Code -- Continuation
549 maybeAltHeapCheck NoGC _ code = code
550 maybeAltHeapCheck GCMayHappen alt_type code = altHeapCheck alt_type code
552 saveVolatileVarsAndRegs
553 :: StgLiveVars -- Vars which should be made safe
554 -> FCode (CmmStmts, -- Assignments to do the saves
555 EndOfBlockInfo, -- sequel for the alts
556 Maybe VirtualSpOffset) -- Slot for current cost centre
558 saveVolatileVarsAndRegs vars
559 = do { var_saves <- saveVolatileVars vars
560 ; (maybe_cc_slot, cc_save) <- saveCurrentCostCentre
561 ; eob_info <- getEndOfBlockInfo
562 ; returnFC (var_saves `plusStmts` cc_save,
567 saveVolatileVars :: StgLiveVars -- Vars which should be made safe
568 -> FCode CmmStmts -- Assignments to to the saves
570 saveVolatileVars vars
571 = do { stmts_s <- mapFCs save_it (varSetElems vars)
572 ; return (foldr plusStmts noStmts stmts_s) }
575 = do { v <- getCAddrModeIfVolatile var
577 Nothing -> return noStmts -- Non-volatile
578 Just vol_amode -> save_var var vol_amode -- Aha! It's volatile
581 save_var var vol_amode
582 = do { slot <- allocPrimStack (idCgRep var)
583 ; rebindToStack var slot
584 ; sp_rel <- getSpRelOffset slot
585 ; returnFC (oneStmt (CmmStore sp_rel vol_amode)) }
588 ---------------------------------------------------------------------------
590 When we save the current cost centre (which is done for lexical
591 scoping), we allocate a free stack location, and return (a)~the
592 virtual offset of the location, to pass on to the alternatives, and
593 (b)~the assignment to do the save (just as for @saveVolatileVars@).
596 saveCurrentCostCentre ::
597 FCode (Maybe VirtualSpOffset, -- Where we decide to store it
598 CmmStmts) -- Assignment to save it
600 saveCurrentCostCentre
601 | not opt_SccProfilingOn
602 = returnFC (Nothing, noStmts)
604 = do { slot <- allocPrimStack PtrArg
605 ; sp_rel <- getSpRelOffset slot
606 ; returnFC (Just slot,
607 oneStmt (CmmStore sp_rel curCCS)) }
609 -- Sometimes we don't free the slot containing the cost centre after restoring it
610 -- (see CgLetNoEscape.cgLetNoEscapeBody).
611 restoreCurrentCostCentre :: Maybe VirtualSpOffset -> Bool -> Code
612 restoreCurrentCostCentre Nothing _freeit = nopC
613 restoreCurrentCostCentre (Just slot) freeit
614 = do { sp_rel <- getSpRelOffset slot
615 ; whenC freeit (freeStackSlots [slot])
616 ; stmtC (CmmStore curCCSAddr (CmmLoad sp_rel wordRep)) }