2 % (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
4 % $Id: AbsCSyn.lhs,v 1.42 2001/12/06 11:50:07 sewardj Exp $
6 \section[AbstractC]{Abstract C: the last stop before machine code}
8 This ``Abstract C'' data type describes the raw Spineless Tagless
9 machine model at a C-ish level; it is ``abstract'' in that it only
10 includes C-like structures that we happen to need. The conversion of
11 programs from @StgSyntax@ (basically a functional language) to
12 @AbstractC@ (basically imperative C) is the heart of code generation.
13 From @AbstractC@, one may convert to real C (for portability) or to
14 raw assembler/machine code.
25 mkAbstractCs, mkAbsCStmts, mkAlgAltsCSwitch,
34 MagicId(..), node, infoptr,
39 #include "HsVersions.h"
41 import {-# SOURCE #-} ClosureInfo ( ClosureInfo )
44 import Constants ( mAX_Vanilla_REG, mAX_Float_REG,
45 mAX_Double_REG, spRelToInt )
46 import CostCentre ( CostCentre, CostCentreStack )
47 import Literal ( mkMachInt, Literal(..) )
48 import ForeignCall ( CCallSpec )
49 import PrimRep ( PrimRep(..) )
50 import MachOp ( MachOp(..) )
51 import Unique ( Unique )
52 import StgSyn ( StgOp )
53 import TyCon ( TyCon )
54 import BitSet -- for liveness masks
55 import Maybes ( Maybe012(..) )
61 @AbstractC@ is a list of Abstract~C statements, but the data structure
62 is tree-ish, for easier and more efficient putting-together.
68 | AbsCStmts AbstractC AbstractC
70 -- and the individual stmts...
73 A note on @CAssign@: In general, the type associated with an assignment
74 is the type of the lhs. However, when the lhs is a pointer to mixed
75 types (e.g. SpB relative), the type of the assignment is the type of
76 the rhs for float types, or the generic StgWord for all other types.
77 (In particular, a CharRep on the rhs is promoted to IntRep when
78 stored in a mixed type location.)
86 CAddrMode -- Put this in the program counter
87 -- eg `CJump (CReg (VanillaReg PtrRep 1))' puts Ret1 in PC
88 -- Enter can be done by:
89 -- CJump (CVal NodeRel zeroOff)
92 CAddrMode -- Fall through into this routine
93 -- (for the benefit of the native code generators)
94 -- Equivalent to CJump in C land
96 | CReturn -- Perform a return
97 CAddrMode -- Address of a RET_<blah> info table
98 ReturnInfo -- Whether it's a direct or vectored return
101 [(Literal, AbstractC)] -- alternatives
102 AbstractC -- default; if there is no real Abstract C in here
103 -- (e.g., all comments; see function "nonemptyAbsC"),
104 -- then that means the default _cannot_ occur.
105 -- If there is only one alternative & no default code,
106 -- then there is no need to check the tag.
108 -- CSwitch m [(tag,code)] AbsCNop == code
110 | CCodeBlock CLabel AbstractC
111 -- A labelled block of code; this "statement" is not
112 -- executed; rather, the labelled code will be hoisted
113 -- out to the top level (out of line) & it can be
116 | CInitHdr -- to initialise the header of a closure (both fixed/var parts)
118 CAddrMode -- address of the info ptr
119 CAddrMode -- cost centre to place in closure
120 -- CReg CurCostCentre or CC_HDR(R1.p{-Node-})
121 Int -- size of closure, for profiling
123 -- NEW CASES FOR EXPANDED PRIMOPS
125 | CMachOpStmt -- Machine-level operation
126 (Maybe012 CAddrMode) -- 0, 1 or 2 results
128 [CAddrMode] -- Arguments
129 (Maybe [MagicId]) -- list of regs which need to be preserved
130 -- across the primop. This is allowed to be Nothing only if
131 -- machOpIsDefinitelyInline returns True. And that in turn may
132 -- only return True if we are absolutely sure that the mach op
133 -- can be done inline on all platforms.
135 | CSequential -- Do the nested AbstractCs sequentially.
136 [AbstractC] -- In particular, as far as the AbsCUtils.doSimultaneously
137 -- is concerned, these stmts are to be treated as atomic
138 -- and are not to be reordered.
140 -- end of NEW CASES FOR EXPANDED PRIMOPS
143 [CAddrMode] -- Results
145 [CAddrMode] -- Arguments
146 [MagicId] -- Potentially volatile/live registers
147 -- (to save/restore around the call/op)
149 -- INVARIANT: When a PrimOp which can cause GC is used, the
150 -- only live data is tidily on the STG stacks or in the STG
151 -- registers (the code generator ensures this).
153 -- Why this? Because if the arguments were arbitrary
154 -- addressing modes, they might be things like (Hp+6) which
155 -- will get utterly spongled by GC.
157 | CSimultaneous -- Perform simultaneously all the statements
158 AbstractC -- in the nested AbstractC. They are only
159 -- allowed to be CAssigns, COpStmts and AbsCNops, so the
160 -- "simultaneous" part just concerns making
161 -- sure that permutations work.
162 -- For example { a := b, b := a }
163 -- needs to go via (at least one) temporary
165 | CCheck -- heap or stack checks, or both.
166 CCheckMacro -- These might include some code to fill in tags
167 [CAddrMode] -- on the stack, so we can't use CMacroStmt below.
170 | CRetDirect -- Direct return
171 !Unique -- for making labels
172 AbstractC -- return code
174 Liveness -- stack liveness at the return point
176 -- see the notes about these next few; they follow below...
177 | CMacroStmt CStmtMacro [CAddrMode]
178 | CCallProfCtrMacro FAST_STRING [CAddrMode]
179 | CCallProfCCMacro FAST_STRING [CAddrMode]
181 {- The presence of this constructor is a makeshift solution;
182 it being used to work around a gcc-related problem of
183 handling typedefs within statement blocks (or, rather,
184 the inability to do so.)
186 The AbstractC flattener takes care of lifting out these
187 typedefs if needs be (i.e., when generating .hc code and
188 compiling 'foreign import dynamic's)
190 | CCallTypedef Bool {- True => use "typedef"; False => use "extern"-}
191 CCallSpec Unique [CAddrMode] [CAddrMode]
193 -- *** the next three [or so...] are DATA (those above are CODE) ***
196 CLabel -- The (full, not base) label to use for labelling the closure.
198 CAddrMode -- cost centre identifier to place in closure
199 [CAddrMode] -- free vars; ptrs, then non-ptrs.
201 | CSRT CLabel [CLabel] -- SRT declarations: basically an array of
202 -- pointers to static closures.
204 | CBitmap CLabel LivenessMask -- A bitmap to be emitted if and only if
205 -- it is larger than a target machine word.
207 | CClosureInfoAndCode
208 ClosureInfo -- Explains placement and layout of closure
209 AbstractC -- Slow entry point code
211 -- Fast entry point code, if any
212 String -- Closure description; NB we can't get this
213 -- from ClosureInfo, because the latter refers
214 -- to the *right* hand side of a defn, whereas
215 -- the "description" refers to *left* hand side
217 | CRetVector -- A labelled block of static data
221 Liveness -- stack liveness at the return point
223 | CClosureTbl -- table of constructors for enumerated types
224 TyCon -- which TyCon this table is for
226 | CModuleInitBlock -- module initialisation block
227 CLabel -- label for init block
228 AbstractC -- initialisation code
230 | CCostCentreDecl -- A cost centre *declaration*
231 Bool -- True <=> local => full declaration
232 -- False <=> extern; just say so
235 | CCostCentreStackDecl -- A cost centre stack *declaration*
236 CostCentreStack -- this is the declaration for a
237 -- pre-defined singleton CCS (see
240 | CSplitMarker -- Split into separate object modules here
242 -- C_SRT is what StgSyn.SRT gets translated to...
243 -- we add a label for the table, and expect only the 'offset/length' form
246 | C_SRT CLabel !Int{-offset-} !Int{-length-}
248 needsSRT :: C_SRT -> Bool
249 needsSRT NoC_SRT = False
250 needsSRT (C_SRT _ _ _) = True
253 About @CMacroStmt@, etc.: notionally, they all just call some
254 arbitrary C~macro or routine, passing the @CAddrModes@ as arguments.
255 However, we distinguish between various flavours of these things,
256 mostly just to keep things somewhat less wild and wooly.
260 Some {\em essential} bits of the STG execution model are done with C
261 macros. An example is @STK_CHK@, which checks for stack-space
262 overflow. This enumeration type lists all such macros:
265 = ARGS_CHK -- arg satisfaction check
266 | ARGS_CHK_LOAD_NODE -- arg check for top-level functions
267 | UPD_CAF -- update CAF closure with indirection
268 | UPD_BH_UPDATABLE -- eager backholing
269 | UPD_BH_SINGLE_ENTRY -- more eager blackholing
270 | PUSH_UPD_FRAME -- push update frame
271 | PUSH_SEQ_FRAME -- push seq frame
272 | UPDATE_SU_FROM_UPD_FRAME -- pull Su out of the update frame
273 | SET_TAG -- set TagReg if it exists
275 | REGISTER_FOREIGN_EXPORT -- register a foreign exported fun
276 | REGISTER_IMPORT -- register an imported module
277 | REGISTER_DIMPORT -- register an imported module from
280 | GRAN_FETCH -- for GrAnSim only -- HWL
281 | GRAN_RESCHEDULE -- for GrAnSim only -- HWL
282 | GRAN_FETCH_AND_RESCHEDULE -- for GrAnSim only -- HWL
283 | THREAD_CONTEXT_SWITCH -- for GrAnSim only -- HWL
284 | GRAN_YIELD -- for GrAnSim only -- HWL
287 Heap/Stack checks. There are far too many of these.
292 = HP_CHK_NP -- heap/stack checks when
293 | STK_CHK_NP -- node points to the closure
295 | HP_CHK_SEQ_NP -- for 'seq' style case alternatives
297 | HP_CHK -- heap/stack checks when
298 | STK_CHK -- node doesn't point
300 -- case alternative heap checks:
302 | HP_CHK_NOREGS -- no registers live
303 | HP_CHK_UNPT_R1 -- R1 is boxed/unlifted
304 | HP_CHK_UNBX_R1 -- R1 is unboxed
305 | HP_CHK_F1 -- FloatReg1 (only) is live
306 | HP_CHK_D1 -- DblReg1 (only) is live
307 | HP_CHK_L1 -- LngReg1 (only) is live
308 | HP_CHK_UT_ALT -- unboxed tuple return.
310 | HP_CHK_GEN -- generic heap check
313 \item[@CCallProfCtrMacro@:]
314 The @String@ names a macro that, if \tr{#define}d, will bump one/some
315 of the STG-event profiling counters.
317 \item[@CCallProfCCMacro@:]
318 The @String@ names a macro that, if \tr{#define}d, will perform some
319 cost-centre-profiling-related action.
322 %************************************************************************
324 \subsection[CAddrMode]{C addressing modes}
326 %************************************************************************
330 = CVal RegRelative PrimRep
331 -- On RHS of assign: Contents of Magic[n]
332 -- On LHS of assign: location Magic[n]
333 -- (ie at addr Magic+n)
336 -- On RHS of assign: Address of Magic[n]; ie Magic+n
337 -- n=0 gets the Magic location itself
338 -- (NB: n=0 case superceded by CReg)
339 -- On LHS of assign: only sensible if n=0,
340 -- which gives the magic location itself
341 -- (NB: superceded by CReg)
343 | CReg MagicId -- To replace (CAddr MagicId 0)
345 | CTemp !Unique !PrimRep -- Temporary locations
346 -- ``Temporaries'' correspond to local variables in C, and registers in
349 | CLbl CLabel -- Labels in the runtime system, etc.
350 PrimRep -- the kind is so we can generate accurate C decls
352 | CCharLike CAddrMode -- The address of a static char-like closure for
353 -- the specified character. It is guaranteed to be in
354 -- the range mIN_CHARLIKE..mAX_CHARLIKE
356 | CIntLike CAddrMode -- The address of a static int-like closure for the
357 -- specified small integer. It is guaranteed to be in
358 -- the range mIN_INTLIKE..mAX_INTLIKE
362 | CJoinPoint -- This is used as the amode of a let-no-escape-bound
364 VirtualSpOffset -- Sp value after any volatile free vars
365 -- of the rhs have been saved on stack.
366 -- Just before the code for the thing is jumped to,
367 -- Sp will be set to this value,
368 -- and then any stack-passed args pushed,
369 -- then the code for this thing will be entered
371 !PrimRep -- the kind of the result
372 CExprMacro -- the macro to generate a value
373 [CAddrMode] -- and its arguments
375 | CMem PrimRep -- A value :: PrimRep, in memory, at the
376 CAddrMode -- specified address
378 | CBytesPerWord -- Word size, in bytes, on this platform
381 Various C macros for values which are dependent on the back-end layout.
387 | ARG_TAG -- stack argument tagging
388 | GET_TAG -- get current constructor tag
394 Convenience functions:
397 mkIntCLit :: Int -> CAddrMode
398 mkIntCLit i = CLit (mkMachInt (toInteger i))
400 mkCString :: FAST_STRING -> CAddrMode
401 mkCString s = CLit (MachStr s)
403 mkCCostCentre :: CostCentre -> CAddrMode
404 mkCCostCentre cc = CLbl (mkCC_Label cc) DataPtrRep
406 mkCCostCentreStack :: CostCentreStack -> CAddrMode
407 mkCCostCentreStack ccs = CLbl (mkCCS_Label ccs) DataPtrRep
410 %************************************************************************
412 \subsection[RegRelative]{@RegRelatives@: ???}
414 %************************************************************************
419 | SpRel FastInt -- }- offsets in StgWords
420 | NodeRel FastInt -- }
421 | CIndex CAddrMode CAddrMode PrimRep -- pointer arithmetic :-)
422 -- CIndex a b k === (k*)a[b]
425 = DirectReturn -- Jump directly, if possible
426 | StaticVectoredReturn Int -- Fixed tag, starting at zero
427 | DynamicVectoredReturn CAddrMode -- Dynamic tag given by amode, starting at zero
429 hpRel :: VirtualHeapOffset -- virtual offset of Hp
430 -> VirtualHeapOffset -- virtual offset of The Thing
431 -> RegRelative -- integer offset
432 hpRel hp off = HpRel (iUnbox (hp - off))
434 spRel :: VirtualSpOffset -- virtual offset of Sp
435 -> VirtualSpOffset -- virtual offset of The Thing
436 -> RegRelative -- integer offset
437 spRel sp off = SpRel (iUnbox (spRelToInt sp off))
439 nodeRel :: VirtualHeapOffset
441 nodeRel off = NodeRel (iUnbox off)
445 %************************************************************************
447 \subsection[Liveness]{Liveness Masks}
449 %************************************************************************
451 We represent liveness bitmaps as a BitSet (whose internal
452 representation really is a bitmap). These are pinned onto case return
453 vectors to indicate the state of the stack for the garbage collector.
455 In the compiled program, liveness bitmaps that fit inside a single
456 word (StgWord) are stored as a single word, while larger bitmaps are
457 stored as a pointer to an array of words. When we compile via C
458 (especially when we bootstrap via HC files), we generate identical C
459 code regardless of whether words are 32- or 64-bit on the target
460 machine, by postponing the decision of how to store each liveness
461 bitmap to C compilation time (or rather, C preprocessing time).
464 type LivenessMask = [BitSet]
466 data Liveness = Liveness CLabel LivenessMask
469 %************************************************************************
471 \subsection[HeapOffset]{@Heap Offsets@}
473 %************************************************************************
475 This used to be a grotesquely complicated datatype in an attempt to
476 hide the details of header sizes from the compiler itself. Now these
477 constants are imported from the RTS, and we deal in real Ints.
480 type HeapOffset = Int -- ToDo: remove
482 type VirtualHeapOffset = HeapOffset
483 type VirtualSpOffset = Int
485 type HpRelOffset = HeapOffset
486 type SpRelOffset = Int
489 %************************************************************************
491 \subsection[MagicId]{@MagicIds@: registers and such}
493 %************************************************************************
497 = BaseReg -- mentioned only in nativeGen
499 -- Argument and return registers
500 | VanillaReg -- pointers, unboxed ints and chars
502 FastInt -- its number (1 .. mAX_Vanilla_REG)
504 | FloatReg -- single-precision floating-point registers
505 FastInt -- its number (1 .. mAX_Float_REG)
507 | DoubleReg -- double-precision floating-point registers
508 FastInt -- its number (1 .. mAX_Double_REG)
511 | Sp -- Stack ptr; points to last occupied stack location.
512 | Su -- Stack update frame pointer
513 | SpLim -- Stack limit
514 | Hp -- Heap ptr; points to last occupied heap location.
515 | HpLim -- Heap limit register
516 | CurCostCentre -- current cost centre register.
517 | VoidReg -- see "VoidPrim" type; just a placeholder;
518 -- no actual register
519 | LongReg -- long int registers (64-bit, really)
520 PrimRep -- Int64Rep or Word64Rep
521 FastInt -- its number (1 .. mAX_Long_REG)
523 | CurrentTSO -- pointer to current thread's TSO
524 | CurrentNursery -- pointer to allocation area
525 | HpAlloc -- allocation count for heap check failure
528 node = VanillaReg PtrRep (_ILIT 1) -- A convenient alias for Node
529 tagreg = VanillaReg WordRep (_ILIT 2) -- A convenient alias for TagReg
534 We need magical @Eq@ because @VanillaReg@s come in multiple flavors.
537 instance Eq MagicId where
538 reg1 == reg2 = tag reg1 ==# tag reg2
540 tag BaseReg = (_ILIT(0) :: FastInt)
546 tag CurCostCentre = _ILIT(6)
547 tag VoidReg = _ILIT(7)
549 tag (VanillaReg _ i) = _ILIT(8) +# i
551 tag (FloatReg i) = _ILIT(8) +# maxv +# i
552 tag (DoubleReg i) = _ILIT(8) +# maxv +# maxf +# i
553 tag (LongReg _ i) = _ILIT(8) +# maxv +# maxf +# maxd +# i
555 maxv = iUnbox mAX_Vanilla_REG
556 maxf = iUnbox mAX_Float_REG
557 maxd = iUnbox mAX_Double_REG
560 Returns True for any register that {\em potentially} dies across
561 C calls (or anything near equivalent). We just say @True@ and
562 let the (machine-specific) registering macros sort things out...
565 isVolatileReg :: MagicId -> Bool
566 isVolatileReg any = True