2 % (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
4 % $Id: AbsCSyn.lhs,v 1.51 2002/12/11 15:36:21 simonmar Exp $
6 \section[AbstractC]{Abstract C: the last stop before machine code}
8 This ``Abstract C'' data type describes the raw Spineless Tagless
9 machine model at a C-ish level; it is ``abstract'' in that it only
10 includes C-like structures that we happen to need. The conversion of
11 programs from @StgSyntax@ (basically a functional language) to
12 @AbstractC@ (basically imperative C) is the heart of code generation.
13 From @AbstractC@, one may convert to real C (for portability) or to
14 raw assembler/machine code.
25 mkAbstractCs, mkAbsCStmts, mkAlgAltsCSwitch,
34 MagicId(..), node, infoptr,
39 #include "HsVersions.h"
41 import {-# SOURCE #-} ClosureInfo ( ClosureInfo )
44 import Constants ( mAX_Vanilla_REG, mAX_Float_REG,
45 mAX_Double_REG, spRelToInt )
46 import CostCentre ( CostCentre, CostCentreStack )
47 import Literal ( mkMachInt, Literal(..) )
48 import ForeignCall ( CCallSpec )
49 import PrimRep ( PrimRep(..) )
50 import MachOp ( MachOp(..) )
51 import Unique ( Unique )
52 import StgSyn ( StgOp )
53 import TyCon ( TyCon )
54 import BitSet -- for liveness masks
59 @AbstractC@ is a list of Abstract~C statements, but the data structure
60 is tree-ish, for easier and more efficient putting-together.
66 | AbsCStmts AbstractC AbstractC
68 -- and the individual stmts...
71 A note on @CAssign@: In general, the type associated with an assignment
72 is the type of the lhs. However, when the lhs is a pointer to mixed
73 types (e.g. SpB relative), the type of the assignment is the type of
74 the rhs for float types, or the generic StgWord for all other types.
75 (In particular, a CharRep on the rhs is promoted to IntRep when
76 stored in a mixed type location.)
84 CAddrMode -- Put this in the program counter
85 -- eg `CJump (CReg (VanillaReg PtrRep 1))' puts Ret1 in PC
86 -- Enter can be done by:
87 -- CJump (CVal NodeRel zeroOff)
90 CAddrMode -- Fall through into this routine
91 -- (for the benefit of the native code generators)
92 -- Equivalent to CJump in C land
94 | CReturn -- Perform a return
95 CAddrMode -- Address of a RET_<blah> info table
96 ReturnInfo -- Whether it's a direct or vectored return
99 [(Literal, AbstractC)] -- alternatives
100 AbstractC -- default; if there is no real Abstract C in here
101 -- (e.g., all comments; see function "nonemptyAbsC"),
102 -- then that means the default _cannot_ occur.
103 -- If there is only one alternative & no default code,
104 -- then there is no need to check the tag.
106 -- CSwitch m [(tag,code)] AbsCNop == code
108 | CCodeBlock CLabel AbstractC
109 -- A labelled block of code; this "statement" is not
110 -- executed; rather, the labelled code will be hoisted
111 -- out to the top level (out of line) & it can be
114 | CInitHdr -- to initialise the header of a closure (both fixed/var parts)
116 CAddrMode -- address of the info ptr
117 !CAddrMode -- cost centre to place in closure
118 -- CReg CurCostCentre or CC_HDR(R1.p{-Node-})
119 Int -- size of closure, for profiling
121 -- NEW CASES FOR EXPANDED PRIMOPS
123 | CMachOpStmt -- Machine-level operation
126 [CAddrMode] -- Arguments
127 (Maybe [MagicId]) -- list of regs which need to be preserved
128 -- across the primop. This is allowed to be Nothing only if
129 -- machOpIsDefinitelyInline returns True. And that in turn may
130 -- only return True if we are absolutely sure that the mach op
131 -- can be done inline on all platforms.
133 | CSequential -- Do the nested AbstractCs sequentially.
134 [AbstractC] -- In particular, as far as the AbsCUtils.doSimultaneously
135 -- is concerned, these stmts are to be treated as atomic
136 -- and are not to be reordered.
138 -- end of NEW CASES FOR EXPANDED PRIMOPS
141 [CAddrMode] -- Results
143 [CAddrMode] -- Arguments
144 [MagicId] -- Potentially volatile/live registers
145 -- (to save/restore around the call/op)
147 -- INVARIANT: When a PrimOp which can cause GC is used, the
148 -- only live data is tidily on the STG stacks or in the STG
149 -- registers (the code generator ensures this).
151 -- Why this? Because if the arguments were arbitrary
152 -- addressing modes, they might be things like (Hp+6) which
153 -- will get utterly spongled by GC.
155 | CSimultaneous -- Perform simultaneously all the statements
156 AbstractC -- in the nested AbstractC. They are only
157 -- allowed to be CAssigns, COpStmts and AbsCNops, so the
158 -- "simultaneous" part just concerns making
159 -- sure that permutations work.
160 -- For example { a := b, b := a }
161 -- needs to go via (at least one) temporary
163 | CCheck -- heap or stack checks, or both.
164 CCheckMacro -- These might include some code to fill in tags
165 [CAddrMode] -- on the stack, so we can't use CMacroStmt below.
168 | CRetDirect -- Direct return
169 !Unique -- for making labels
170 AbstractC -- return code
172 Liveness -- stack liveness at the return point
174 -- see the notes about these next few; they follow below...
175 | CMacroStmt CStmtMacro [CAddrMode]
176 | CCallProfCtrMacro FastString [CAddrMode]
177 | CCallProfCCMacro FastString [CAddrMode]
179 {- The presence of this constructor is a makeshift solution;
180 it being used to work around a gcc-related problem of
181 handling typedefs within statement blocks (or, rather,
182 the inability to do so.)
184 The AbstractC flattener takes care of lifting out these
185 typedefs if needs be (i.e., when generating .hc code and
186 compiling 'foreign import dynamic's)
188 | CCallTypedef Bool {- True => use "typedef"; False => use "extern"-}
189 CCallSpec Unique [CAddrMode] [CAddrMode]
191 -- *** the next three [or so...] are DATA (those above are CODE) ***
194 CLabel -- The closure's label
195 ClosureInfo -- Todo: maybe info_lbl & closure_lbl instead?
196 CAddrMode -- cost centre identifier to place in closure
197 [CAddrMode] -- free vars; ptrs, then non-ptrs.
199 | CSRT CLabel [CLabel] -- SRT declarations: basically an array of
200 -- pointers to static closures.
202 | CBitmap Liveness -- A bitmap to be emitted if and only if
203 -- it is larger than a target machine word.
205 | CClosureInfoAndCode
206 ClosureInfo -- Explains placement and layout of closure
207 AbstractC -- Entry point code
209 | CRetVector -- A labelled block of static data
213 Liveness -- stack liveness at the return point
215 | CClosureTbl -- table of constructors for enumerated types
216 TyCon -- which TyCon this table is for
218 | CModuleInitBlock -- module initialisation block
219 CLabel -- "plain" label for init block
220 CLabel -- label for init block (with ver + way info)
221 AbstractC -- initialisation code
223 | CCostCentreDecl -- A cost centre *declaration*
224 Bool -- True <=> local => full declaration
225 -- False <=> extern; just say so
228 | CCostCentreStackDecl -- A cost centre stack *declaration*
229 CostCentreStack -- this is the declaration for a
230 -- pre-defined singleton CCS (see
233 | CSplitMarker -- Split into separate object modules here
235 -- C_SRT is what StgSyn.SRT gets translated to...
236 -- we add a label for the table, and expect only the 'offset/length' form
239 | C_SRT CLabel !Int{-offset-} !Int{-length-}
241 needsSRT :: C_SRT -> Bool
242 needsSRT NoC_SRT = False
243 needsSRT (C_SRT _ _ _) = True
246 About @CMacroStmt@, etc.: notionally, they all just call some
247 arbitrary C~macro or routine, passing the @CAddrModes@ as arguments.
248 However, we distinguish between various flavours of these things,
249 mostly just to keep things somewhat less wild and wooly.
253 Some {\em essential} bits of the STG execution model are done with C
254 macros. An example is @STK_CHK@, which checks for stack-space
255 overflow. This enumeration type lists all such macros:
258 = UPD_CAF -- update CAF closure with indirection
259 | UPD_BH_UPDATABLE -- eager backholing
260 | UPD_BH_SINGLE_ENTRY -- more eager blackholing
261 | PUSH_UPD_FRAME -- push update frame
262 | SET_TAG -- set TagReg if it exists
263 -- dataToTag# primop -- *only* used in unregisterised builds.
264 -- (see AbsCUtils.dsCOpStmt)
267 | REGISTER_FOREIGN_EXPORT -- register a foreign exported fun
268 | REGISTER_IMPORT -- register an imported module
269 | REGISTER_DIMPORT -- register an imported module from
272 | GRAN_FETCH -- for GrAnSim only -- HWL
273 | GRAN_RESCHEDULE -- for GrAnSim only -- HWL
274 | GRAN_FETCH_AND_RESCHEDULE -- for GrAnSim only -- HWL
275 | THREAD_CONTEXT_SWITCH -- for GrAnSim only -- HWL
276 | GRAN_YIELD -- for GrAnSim only -- HWL
279 Heap/Stack checks. There are far too many of these.
284 = HP_CHK_NP -- heap/stack checks when
285 | STK_CHK_NP -- node points to the closure
288 | HP_CHK_FUN -- heap/stack checks when
289 | STK_CHK_FUN -- node doesn't point
291 -- case alternative heap checks:
293 | HP_CHK_NOREGS -- no registers live
294 | HP_CHK_UNPT_R1 -- R1 is boxed/unlifted
295 | HP_CHK_UNBX_R1 -- R1 is unboxed
296 | HP_CHK_F1 -- FloatReg1 (only) is live
297 | HP_CHK_D1 -- DblReg1 (only) is live
298 | HP_CHK_L1 -- LngReg1 (only) is live
300 | HP_CHK_UNBX_TUPLE -- unboxed tuple heap check
303 \item[@CCallProfCtrMacro@:]
304 The @String@ names a macro that, if \tr{#define}d, will bump one/some
305 of the STG-event profiling counters.
307 \item[@CCallProfCCMacro@:]
308 The @String@ names a macro that, if \tr{#define}d, will perform some
309 cost-centre-profiling-related action.
312 %************************************************************************
314 \subsection[CAddrMode]{C addressing modes}
316 %************************************************************************
320 = CVal RegRelative PrimRep
321 -- On RHS of assign: Contents of Magic[n]
322 -- On LHS of assign: location Magic[n]
323 -- (ie at addr Magic+n)
326 -- On RHS of assign: Address of Magic[n]; ie Magic+n
327 -- n=0 gets the Magic location itself
328 -- (NB: n=0 case superceded by CReg)
329 -- On LHS of assign: only sensible if n=0,
330 -- which gives the magic location itself
331 -- (NB: superceded by CReg)
333 -- JRS 2002-02-05: CAddr is really scummy and should be fixed.
334 -- The effect is that the semantics of CAddr depend on what the
335 -- contained RegRelative is; it is decidely non-orthogonal.
337 | CReg MagicId -- To replace (CAddr MagicId 0)
339 | CTemp !Unique !PrimRep -- Temporary locations
340 -- ``Temporaries'' correspond to local variables in C, and registers in
343 | CLbl CLabel -- Labels in the runtime system, etc.
344 PrimRep -- the kind is so we can generate accurate C decls
346 | CCharLike CAddrMode -- The address of a static char-like closure for
347 -- the specified character. It is guaranteed to be in
348 -- the range mIN_CHARLIKE..mAX_CHARLIKE
350 | CIntLike CAddrMode -- The address of a static int-like closure for the
351 -- specified small integer. It is guaranteed to be in
352 -- the range mIN_INTLIKE..mAX_INTLIKE
356 | CJoinPoint -- This is used as the amode of a let-no-escape-bound
358 VirtualSpOffset -- Sp value after any volatile free vars
359 -- of the rhs have been saved on stack.
360 -- Just before the code for the thing is jumped to,
361 -- Sp will be set to this value,
362 -- and then any stack-passed args pushed,
363 -- then the code for this thing will be entered
365 !PrimRep -- the kind of the result
366 CExprMacro -- the macro to generate a value
367 [CAddrMode] -- and its arguments
369 | CBytesPerWord -- Word size, in bytes, on this platform
370 -- required for: half-word loads (used in fishing tags
371 -- out of info tables), and sizeofByteArray#.
374 Various C macros for values which are dependent on the back-end layout.
380 | ARG_TAG -- stack argument tagging
381 | GET_TAG -- get current constructor tag
384 | BYTE_ARR_CTS -- used when passing a ByteArray# to a ccall
385 | PTRS_ARR_CTS -- similarly for an Array#
386 | ForeignObj_CLOSURE_DATA -- and again for a ForeignObj#
389 Convenience functions:
392 mkIntCLit :: Int -> CAddrMode
393 mkIntCLit i = CLit (mkMachInt (toInteger i))
395 mkCString :: FastString -> CAddrMode
396 mkCString s = CLit (MachStr s)
398 mkCCostCentre :: CostCentre -> CAddrMode
399 mkCCostCentre cc = CLbl (mkCC_Label cc) DataPtrRep
401 mkCCostCentreStack :: CostCentreStack -> CAddrMode
402 mkCCostCentreStack ccs = CLbl (mkCCS_Label ccs) DataPtrRep
405 %************************************************************************
407 \subsection[RegRelative]{@RegRelatives@: ???}
409 %************************************************************************
414 | SpRel FastInt -- }- offsets in StgWords
415 | NodeRel FastInt -- }
416 | CIndex CAddrMode CAddrMode PrimRep -- pointer arithmetic :-)
417 -- CIndex a b k === (k*)a[b]
420 = DirectReturn -- Jump directly, if possible
421 | StaticVectoredReturn Int -- Fixed tag, starting at zero
422 | DynamicVectoredReturn CAddrMode -- Dynamic tag given by amode, starting at zero
424 hpRel :: VirtualHeapOffset -- virtual offset of Hp
425 -> VirtualHeapOffset -- virtual offset of The Thing
426 -> RegRelative -- integer offset
427 hpRel hp off = HpRel (iUnbox (hp - off))
429 spRel :: VirtualSpOffset -- virtual offset of Sp
430 -> VirtualSpOffset -- virtual offset of The Thing
431 -> RegRelative -- integer offset
432 spRel sp off = SpRel (iUnbox (spRelToInt sp off))
434 nodeRel :: VirtualHeapOffset
436 nodeRel off = NodeRel (iUnbox off)
440 %************************************************************************
442 \subsection[Liveness]{Liveness Masks}
444 %************************************************************************
446 We represent liveness bitmaps as a BitSet (whose internal
447 representation really is a bitmap). These are pinned onto case return
448 vectors to indicate the state of the stack for the garbage collector.
450 In the compiled program, liveness bitmaps that fit inside a single
451 word (StgWord) are stored as a single word, while larger bitmaps are
452 stored as a pointer to an array of words. When we compile via C
453 (especially when we bootstrap via HC files), we generate identical C
454 code regardless of whether words are 32- or 64-bit on the target
455 machine, by postponing the decision of how to store each liveness
456 bitmap to C compilation time (or rather, C preprocessing time).
459 type LivenessMask = [BitSet]
461 data Liveness = Liveness CLabel !Int LivenessMask
464 %************************************************************************
466 \subsection[HeapOffset]{@Heap Offsets@}
468 %************************************************************************
470 This used to be a grotesquely complicated datatype in an attempt to
471 hide the details of header sizes from the compiler itself. Now these
472 constants are imported from the RTS, and we deal in real Ints.
475 type HeapOffset = Int -- ToDo: remove
477 type VirtualHeapOffset = HeapOffset
478 type VirtualSpOffset = Int
480 type HpRelOffset = HeapOffset
481 type SpRelOffset = Int
484 %************************************************************************
486 \subsection[MagicId]{@MagicIds@: registers and such}
488 %************************************************************************
492 = BaseReg -- mentioned only in nativeGen
494 -- Argument and return registers
495 | VanillaReg -- pointers, unboxed ints and chars
497 FastInt -- its number (1 .. mAX_Vanilla_REG)
499 | FloatReg -- single-precision floating-point registers
500 FastInt -- its number (1 .. mAX_Float_REG)
502 | DoubleReg -- double-precision floating-point registers
503 FastInt -- its number (1 .. mAX_Double_REG)
506 | Sp -- Stack ptr; points to last occupied stack location.
507 | SpLim -- Stack limit
508 | Hp -- Heap ptr; points to last occupied heap location.
509 | HpLim -- Heap limit register
510 | CurCostCentre -- current cost centre register.
511 | VoidReg -- see "VoidPrim" type; just a placeholder;
512 -- no actual register
513 | LongReg -- long int registers (64-bit, really)
514 PrimRep -- Int64Rep or Word64Rep
515 FastInt -- its number (1 .. mAX_Long_REG)
517 | CurrentTSO -- pointer to current thread's TSO
518 | CurrentNursery -- pointer to allocation area
519 | HpAlloc -- allocation count for heap check failure
522 node = VanillaReg PtrRep (_ILIT 1) -- A convenient alias for Node
523 tagreg = VanillaReg WordRep (_ILIT 2) -- A convenient alias for TagReg
528 We need magical @Eq@ because @VanillaReg@s come in multiple flavors.
531 instance Eq MagicId where
532 reg1 == reg2 = tag reg1 ==# tag reg2
534 tag BaseReg = (_ILIT(0) :: FastInt)
539 tag CurCostCentre = _ILIT(6)
540 tag VoidReg = _ILIT(7)
542 tag (VanillaReg _ i) = _ILIT(8) +# i
544 tag (FloatReg i) = _ILIT(8) +# maxv +# i
545 tag (DoubleReg i) = _ILIT(8) +# maxv +# maxf +# i
546 tag (LongReg _ i) = _ILIT(8) +# maxv +# maxf +# maxd +# i
548 maxv = iUnbox mAX_Vanilla_REG
549 maxf = iUnbox mAX_Float_REG
550 maxd = iUnbox mAX_Double_REG
553 Returns True for any register that {\em potentially} dies across
554 C calls (or anything near equivalent). We just say @True@ and
555 let the (machine-specific) registering macros sort things out...
558 isVolatileReg :: MagicId -> Bool
559 isVolatileReg any = True