2 % (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
4 % $Id: AbsCSyn.lhs,v 1.52 2003/05/14 09:13:52 simonmar Exp $
6 \section[AbstractC]{Abstract C: the last stop before machine code}
8 This ``Abstract C'' data type describes the raw Spineless Tagless
9 machine model at a C-ish level; it is ``abstract'' in that it only
10 includes C-like structures that we happen to need. The conversion of
11 programs from @StgSyntax@ (basically a functional language) to
12 @AbstractC@ (basically imperative C) is the heart of code generation.
13 From @AbstractC@, one may convert to real C (for portability) or to
14 raw assembler/machine code.
25 mkAbstractCs, mkAbsCStmts, mkAlgAltsCSwitch,
34 MagicId(..), node, infoptr,
39 #include "HsVersions.h"
41 import {-# SOURCE #-} ClosureInfo ( ClosureInfo )
44 import Constants ( mAX_Vanilla_REG, mAX_Float_REG,
45 mAX_Double_REG, spRelToInt )
46 import CostCentre ( CostCentre, CostCentreStack )
47 import Literal ( mkMachInt, Literal(..) )
48 import ForeignCall ( CCallSpec )
49 import PrimRep ( PrimRep(..) )
50 import MachOp ( MachOp(..) )
51 import Unique ( Unique )
52 import StgSyn ( StgOp )
53 import TyCon ( TyCon )
54 import Bitmap ( Bitmap, mAX_SMALL_BITMAP_SIZE )
55 import SMRep ( StgWord, StgHalfWord )
60 @AbstractC@ is a list of Abstract~C statements, but the data structure
61 is tree-ish, for easier and more efficient putting-together.
67 | AbsCStmts AbstractC AbstractC
69 -- and the individual stmts...
72 A note on @CAssign@: In general, the type associated with an assignment
73 is the type of the lhs. However, when the lhs is a pointer to mixed
74 types (e.g. SpB relative), the type of the assignment is the type of
75 the rhs for float types, or the generic StgWord for all other types.
76 (In particular, a CharRep on the rhs is promoted to IntRep when
77 stored in a mixed type location.)
85 CAddrMode -- Put this in the program counter
86 -- eg `CJump (CReg (VanillaReg PtrRep 1))' puts Ret1 in PC
87 -- Enter can be done by:
88 -- CJump (CVal NodeRel zeroOff)
91 CAddrMode -- Fall through into this routine
92 -- (for the benefit of the native code generators)
93 -- Equivalent to CJump in C land
95 | CReturn -- Perform a return
96 CAddrMode -- Address of a RET_<blah> info table
97 ReturnInfo -- Whether it's a direct or vectored return
100 [(Literal, AbstractC)] -- alternatives
101 AbstractC -- default; if there is no real Abstract C in here
102 -- (e.g., all comments; see function "nonemptyAbsC"),
103 -- then that means the default _cannot_ occur.
104 -- If there is only one alternative & no default code,
105 -- then there is no need to check the tag.
107 -- CSwitch m [(tag,code)] AbsCNop == code
109 | CCodeBlock CLabel AbstractC
110 -- A labelled block of code; this "statement" is not
111 -- executed; rather, the labelled code will be hoisted
112 -- out to the top level (out of line) & it can be
115 | CInitHdr -- to initialise the header of a closure (both fixed/var parts)
117 CAddrMode -- address of the info ptr
118 !CAddrMode -- cost centre to place in closure
119 -- CReg CurCostCentre or CC_HDR(R1.p{-Node-})
120 Int -- size of closure, for profiling
122 -- NEW CASES FOR EXPANDED PRIMOPS
124 | CMachOpStmt -- Machine-level operation
127 [CAddrMode] -- Arguments
128 (Maybe [MagicId]) -- list of regs which need to be preserved
129 -- across the primop. This is allowed to be Nothing only if
130 -- machOpIsDefinitelyInline returns True. And that in turn may
131 -- only return True if we are absolutely sure that the mach op
132 -- can be done inline on all platforms.
134 | CSequential -- Do the nested AbstractCs sequentially.
135 [AbstractC] -- In particular, as far as the AbsCUtils.doSimultaneously
136 -- is concerned, these stmts are to be treated as atomic
137 -- and are not to be reordered.
139 -- end of NEW CASES FOR EXPANDED PRIMOPS
142 [CAddrMode] -- Results
144 [CAddrMode] -- Arguments
145 [MagicId] -- Potentially volatile/live registers
146 -- (to save/restore around the call/op)
148 -- INVARIANT: When a PrimOp which can cause GC is used, the
149 -- only live data is tidily on the STG stacks or in the STG
150 -- registers (the code generator ensures this).
152 -- Why this? Because if the arguments were arbitrary
153 -- addressing modes, they might be things like (Hp+6) which
154 -- will get utterly spongled by GC.
156 | CSimultaneous -- Perform simultaneously all the statements
157 AbstractC -- in the nested AbstractC. They are only
158 -- allowed to be CAssigns, COpStmts and AbsCNops, so the
159 -- "simultaneous" part just concerns making
160 -- sure that permutations work.
161 -- For example { a := b, b := a }
162 -- needs to go via (at least one) temporary
164 | CCheck -- heap or stack checks, or both.
165 CCheckMacro -- These might include some code to fill in tags
166 [CAddrMode] -- on the stack, so we can't use CMacroStmt below.
169 | CRetDirect -- Direct return
170 !Unique -- for making labels
171 AbstractC -- return code
173 Liveness -- stack liveness at the return point
175 -- see the notes about these next few; they follow below...
176 | CMacroStmt CStmtMacro [CAddrMode]
177 | CCallProfCtrMacro FastString [CAddrMode]
178 | CCallProfCCMacro FastString [CAddrMode]
180 {- The presence of this constructor is a makeshift solution;
181 it being used to work around a gcc-related problem of
182 handling typedefs within statement blocks (or, rather,
183 the inability to do so.)
185 The AbstractC flattener takes care of lifting out these
186 typedefs if needs be (i.e., when generating .hc code and
187 compiling 'foreign import dynamic's)
189 | CCallTypedef Bool {- True => use "typedef"; False => use "extern"-}
190 CCallSpec Unique [CAddrMode] [CAddrMode]
192 -- *** the next three [or so...] are DATA (those above are CODE) ***
195 CLabel -- The closure's label
196 ClosureInfo -- Todo: maybe info_lbl & closure_lbl instead?
197 CAddrMode -- cost centre identifier to place in closure
198 [CAddrMode] -- free vars; ptrs, then non-ptrs.
200 | CSRT CLabel [CLabel] -- SRT declarations: basically an array of
201 -- pointers to static closures.
203 | CBitmap Liveness -- A "large" bitmap to be emitted
205 | CSRTDesc -- A "large" SRT descriptor (one that doesn't
206 -- fit into the half-word bitmap in the itbl).
207 !CLabel -- Label for this SRT descriptor
208 !CLabel -- Pointer to the SRT
209 !Int -- Offset within the SRT
213 | CClosureInfoAndCode
214 ClosureInfo -- Explains placement and layout of closure
215 AbstractC -- Entry point code
217 | CRetVector -- A labelled block of static data
221 Liveness -- stack liveness at the return point
223 | CClosureTbl -- table of constructors for enumerated types
224 TyCon -- which TyCon this table is for
226 | CModuleInitBlock -- module initialisation block
227 CLabel -- "plain" label for init block
228 CLabel -- label for init block (with ver + way info)
229 AbstractC -- initialisation code
231 | CCostCentreDecl -- A cost centre *declaration*
232 Bool -- True <=> local => full declaration
233 -- False <=> extern; just say so
236 | CCostCentreStackDecl -- A cost centre stack *declaration*
237 CostCentreStack -- this is the declaration for a
238 -- pre-defined singleton CCS (see
241 | CSplitMarker -- Split into separate object modules here
243 -- C_SRT is what StgSyn.SRT gets translated to...
244 -- we add a label for the table, and expect only the 'offset/length' form
247 | C_SRT !CLabel !Int{-offset-} !StgHalfWord{-bitmap or escape-}
249 needsSRT :: C_SRT -> Bool
250 needsSRT NoC_SRT = False
251 needsSRT (C_SRT _ _ _) = True
254 About @CMacroStmt@, etc.: notionally, they all just call some
255 arbitrary C~macro or routine, passing the @CAddrModes@ as arguments.
256 However, we distinguish between various flavours of these things,
257 mostly just to keep things somewhat less wild and wooly.
261 Some {\em essential} bits of the STG execution model are done with C
262 macros. An example is @STK_CHK@, which checks for stack-space
263 overflow. This enumeration type lists all such macros:
266 = UPD_CAF -- update CAF closure with indirection
267 | UPD_BH_UPDATABLE -- eager backholing
268 | UPD_BH_SINGLE_ENTRY -- more eager blackholing
269 | PUSH_UPD_FRAME -- push update frame
270 | SET_TAG -- set TagReg if it exists
271 -- dataToTag# primop -- *only* used in unregisterised builds.
272 -- (see AbsCUtils.dsCOpStmt)
275 | REGISTER_FOREIGN_EXPORT -- register a foreign exported fun
276 | REGISTER_IMPORT -- register an imported module
277 | REGISTER_DIMPORT -- register an imported module from
280 | GRAN_FETCH -- for GrAnSim only -- HWL
281 | GRAN_RESCHEDULE -- for GrAnSim only -- HWL
282 | GRAN_FETCH_AND_RESCHEDULE -- for GrAnSim only -- HWL
283 | THREAD_CONTEXT_SWITCH -- for GrAnSim only -- HWL
284 | GRAN_YIELD -- for GrAnSim only -- HWL
287 Heap/Stack checks. There are far too many of these.
292 = HP_CHK_NP -- heap/stack checks when
293 | STK_CHK_NP -- node points to the closure
296 | HP_CHK_FUN -- heap/stack checks when
297 | STK_CHK_FUN -- node doesn't point
299 -- case alternative heap checks:
301 | HP_CHK_NOREGS -- no registers live
302 | HP_CHK_UNPT_R1 -- R1 is boxed/unlifted
303 | HP_CHK_UNBX_R1 -- R1 is unboxed
304 | HP_CHK_F1 -- FloatReg1 (only) is live
305 | HP_CHK_D1 -- DblReg1 (only) is live
306 | HP_CHK_L1 -- LngReg1 (only) is live
308 | HP_CHK_UNBX_TUPLE -- unboxed tuple heap check
311 \item[@CCallProfCtrMacro@:]
312 The @String@ names a macro that, if \tr{#define}d, will bump one/some
313 of the STG-event profiling counters.
315 \item[@CCallProfCCMacro@:]
316 The @String@ names a macro that, if \tr{#define}d, will perform some
317 cost-centre-profiling-related action.
320 %************************************************************************
322 \subsection[CAddrMode]{C addressing modes}
324 %************************************************************************
328 = CVal RegRelative PrimRep
329 -- On RHS of assign: Contents of Magic[n]
330 -- On LHS of assign: location Magic[n]
331 -- (ie at addr Magic+n)
334 -- On RHS of assign: Address of Magic[n]; ie Magic+n
335 -- n=0 gets the Magic location itself
336 -- (NB: n=0 case superceded by CReg)
337 -- On LHS of assign: only sensible if n=0,
338 -- which gives the magic location itself
339 -- (NB: superceded by CReg)
341 -- JRS 2002-02-05: CAddr is really scummy and should be fixed.
342 -- The effect is that the semantics of CAddr depend on what the
343 -- contained RegRelative is; it is decidely non-orthogonal.
345 | CReg MagicId -- To replace (CAddr MagicId 0)
347 | CTemp !Unique !PrimRep -- Temporary locations
348 -- ``Temporaries'' correspond to local variables in C, and registers in
351 | CLbl CLabel -- Labels in the runtime system, etc.
352 PrimRep -- the kind is so we can generate accurate C decls
354 | CCharLike CAddrMode -- The address of a static char-like closure for
355 -- the specified character. It is guaranteed to be in
356 -- the range mIN_CHARLIKE..mAX_CHARLIKE
358 | CIntLike CAddrMode -- The address of a static int-like closure for the
359 -- specified small integer. It is guaranteed to be in
360 -- the range mIN_INTLIKE..mAX_INTLIKE
364 | CJoinPoint -- This is used as the amode of a let-no-escape-bound
366 VirtualSpOffset -- Sp value after any volatile free vars
367 -- of the rhs have been saved on stack.
368 -- Just before the code for the thing is jumped to,
369 -- Sp will be set to this value,
370 -- and then any stack-passed args pushed,
371 -- then the code for this thing will be entered
373 !PrimRep -- the kind of the result
374 CExprMacro -- the macro to generate a value
375 [CAddrMode] -- and its arguments
378 Various C macros for values which are dependent on the back-end layout.
384 | ARG_TAG -- stack argument tagging
385 | GET_TAG -- get current constructor tag
388 | BYTE_ARR_CTS -- used when passing a ByteArray# to a ccall
389 | PTRS_ARR_CTS -- similarly for an Array#
390 | ForeignObj_CLOSURE_DATA -- and again for a ForeignObj#
393 Convenience functions:
396 mkIntCLit :: Int -> CAddrMode
397 mkIntCLit i = CLit (mkMachInt (toInteger i))
399 mkWordCLit :: StgWord -> CAddrMode
400 mkWordCLit wd = CLit (MachWord (fromIntegral wd))
402 mkCString :: FastString -> CAddrMode
403 mkCString s = CLit (MachStr s)
405 mkCCostCentre :: CostCentre -> CAddrMode
406 mkCCostCentre cc = CLbl (mkCC_Label cc) DataPtrRep
408 mkCCostCentreStack :: CostCentreStack -> CAddrMode
409 mkCCostCentreStack ccs = CLbl (mkCCS_Label ccs) DataPtrRep
412 %************************************************************************
414 \subsection[RegRelative]{@RegRelatives@: ???}
416 %************************************************************************
421 | SpRel FastInt -- }- offsets in StgWords
422 | NodeRel FastInt -- }
423 | CIndex CAddrMode CAddrMode PrimRep -- pointer arithmetic :-)
424 -- CIndex a b k === (k*)a[b]
427 = DirectReturn -- Jump directly, if possible
428 | StaticVectoredReturn Int -- Fixed tag, starting at zero
429 | DynamicVectoredReturn CAddrMode -- Dynamic tag given by amode, starting at zero
431 hpRel :: VirtualHeapOffset -- virtual offset of Hp
432 -> VirtualHeapOffset -- virtual offset of The Thing
433 -> RegRelative -- integer offset
434 hpRel hp off = HpRel (iUnbox (hp - off))
436 spRel :: VirtualSpOffset -- virtual offset of Sp
437 -> VirtualSpOffset -- virtual offset of The Thing
438 -> RegRelative -- integer offset
439 spRel sp off = SpRel (iUnbox (spRelToInt sp off))
441 nodeRel :: VirtualHeapOffset
443 nodeRel off = NodeRel (iUnbox off)
447 %************************************************************************
449 \subsection[Liveness]{Liveness Masks}
451 %************************************************************************
453 We represent liveness bitmaps as a BitSet (whose internal
454 representation really is a bitmap). These are pinned onto case return
455 vectors to indicate the state of the stack for the garbage collector.
457 In the compiled program, liveness bitmaps that fit inside a single
458 word (StgWord) are stored as a single word, while larger bitmaps are
459 stored as a pointer to an array of words.
462 data Liveness = Liveness CLabel !Int Bitmap
464 maybeLargeBitmap :: Liveness -> AbstractC
465 maybeLargeBitmap liveness@(Liveness _ size _)
466 | size <= mAX_SMALL_BITMAP_SIZE = AbsCNop
467 | otherwise = CBitmap liveness
470 %************************************************************************
472 \subsection[HeapOffset]{@Heap Offsets@}
474 %************************************************************************
476 This used to be a grotesquely complicated datatype in an attempt to
477 hide the details of header sizes from the compiler itself. Now these
478 constants are imported from the RTS, and we deal in real Ints.
481 type HeapOffset = Int -- ToDo: remove
483 type VirtualHeapOffset = HeapOffset
484 type VirtualSpOffset = Int
486 type HpRelOffset = HeapOffset
487 type SpRelOffset = Int
490 %************************************************************************
492 \subsection[MagicId]{@MagicIds@: registers and such}
494 %************************************************************************
498 = BaseReg -- mentioned only in nativeGen
500 -- Argument and return registers
501 | VanillaReg -- pointers, unboxed ints and chars
503 FastInt -- its number (1 .. mAX_Vanilla_REG)
505 | FloatReg -- single-precision floating-point registers
506 FastInt -- its number (1 .. mAX_Float_REG)
508 | DoubleReg -- double-precision floating-point registers
509 FastInt -- its number (1 .. mAX_Double_REG)
512 | Sp -- Stack ptr; points to last occupied stack location.
513 | SpLim -- Stack limit
514 | Hp -- Heap ptr; points to last occupied heap location.
515 | HpLim -- Heap limit register
516 | CurCostCentre -- current cost centre register.
517 | VoidReg -- see "VoidPrim" type; just a placeholder;
518 -- no actual register
519 | LongReg -- long int registers (64-bit, really)
520 PrimRep -- Int64Rep or Word64Rep
521 FastInt -- its number (1 .. mAX_Long_REG)
523 | CurrentTSO -- pointer to current thread's TSO
524 | CurrentNursery -- pointer to allocation area
525 | HpAlloc -- allocation count for heap check failure
528 node = VanillaReg PtrRep (_ILIT 1) -- A convenient alias for Node
529 tagreg = VanillaReg WordRep (_ILIT 2) -- A convenient alias for TagReg
534 We need magical @Eq@ because @VanillaReg@s come in multiple flavors.
537 instance Eq MagicId where
538 reg1 == reg2 = tag reg1 ==# tag reg2
540 tag BaseReg = (_ILIT(0) :: FastInt)
545 tag CurCostCentre = _ILIT(6)
546 tag VoidReg = _ILIT(7)
548 tag (VanillaReg _ i) = _ILIT(8) +# i
550 tag (FloatReg i) = _ILIT(8) +# maxv +# i
551 tag (DoubleReg i) = _ILIT(8) +# maxv +# maxf +# i
552 tag (LongReg _ i) = _ILIT(8) +# maxv +# maxf +# maxd +# i
554 maxv = iUnbox mAX_Vanilla_REG
555 maxf = iUnbox mAX_Float_REG
556 maxd = iUnbox mAX_Double_REG
559 Returns True for any register that {\em potentially} dies across
560 C calls (or anything near equivalent). We just say @True@ and
561 let the (machine-specific) registering macros sort things out...
564 isVolatileReg :: MagicId -> Bool
565 isVolatileReg any = True