% (c) The GRASP/AQUA Project, Glasgow University, 1992-1998
% $Id: AbsCSyn.lhs,v 1.53 2003/07/02 13:12:33 simonpj Exp $
\section[AbstractC]{Abstract C: the last stop before machine code}

This ``Abstract C'' data type describes the raw Spineless Tagless
machine model at a C-ish level; it is ``abstract'' in that it only
includes C-like structures that we happen to need. The conversion of
programs from @StgSyntax@ (basically a functional language) to
@AbstractC@ (basically imperative C) is the heart of code generation.
From @AbstractC@, one may convert to real C (for portability) or to
raw assembler/machine code.
module AbsCSyn where    -- export everything

#include "HsVersions.h"

import {-# SOURCE #-} ClosureInfo ( ClosureInfo )

import Constants        ( mAX_Vanilla_REG, mAX_Float_REG,
                          mAX_Double_REG, spRelToInt )
import CostCentre       ( CostCentre, CostCentreStack )
import Literal          ( mkMachInt, Literal(..) )
import ForeignCall      ( CCallSpec )
import PrimRep          ( PrimRep(..) )
import MachOp           ( MachOp(..) )
import Unique           ( Unique )
import StgSyn           ( StgOp )
import TyCon            ( TyCon )
import Bitmap           ( Bitmap, mAX_SMALL_BITMAP_SIZE )
import SMRep            ( StgWord, StgHalfWord )
@AbstractC@ is conceptually a list of Abstract~C statements, but the data
structure is tree-ish, so that sequences can be put together easily and cheaply.
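
The tree-ish representation can be illustrated with a small stand-alone
sketch. It is not the real @AbstractC@ type: @Stmts@, @Stmt@, @andThen@ and
@flatten@ are hypothetical names invented here, and only the roles of
@AbsCNop@ and @AbsCStmts@ are taken from this module. The point is that
appending two sequences costs a single node, and a list is only produced
once, at the end.

```haskell
-- Toy model only; the constructor names are illustrative assumptions.
data Stmts
  = Nop                 -- empty sequence (plays the role of AbsCNop)
  | Seq Stmts Stmts     -- concatenation (plays the role of AbsCStmts)
  | Stmt String         -- one statement, represented here as a string
  deriving Show

-- Appending two sequences just builds one node: O(1).
andThen :: Stmts -> Stmts -> Stmts
andThen = Seq

-- Flatten to a list in one pass. The accumulator avoids the quadratic
-- cost that repeated (++) would have on left-nested trees.
flatten :: Stmts -> [String]
flatten t = go t []
  where
    go Nop       rest = rest
    go (Seq a b) rest = go a (go b rest)
    go (Stmt s)  rest = s : rest

main :: IO ()
main = print (flatten ((Stmt "a" `andThen` Nop)
                       `andThen` (Stmt "b" `andThen` Stmt "c")))
```

Empty nodes vanish during flattening, so code that builds statements never
has to special-case the empty sequence.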
  | AbsCStmts AbstractC AbstractC

  -- and the individual stmts...

A note on @CAssign@: In general, the type associated with an assignment
is the type of the lhs. However, when the lhs is a pointer to mixed
types (e.g. SpB relative), the type of the assignment is the type of
the rhs for float types, or the generic StgWord for all other types.
(In particular, a CharRep on the rhs is promoted to IntRep when
stored in a mixed type location.)
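
The typing rule in the note above can be written down as a small decision
function. This is only a sketch under stated assumptions: @Rep@, @Lhs@,
@AssignTy@ and @assignType@ are hypothetical names, @Rep@ lists just a few
representations, and treating both @FloatRep@ and @DoubleRep@ as ``float
types'' is an assumption rather than something this module states.

```haskell
-- Hypothetical, cut-down stand-ins; not the real PrimRep or CAddrMode.
data Rep = CharRep | IntRep | PtrRep | FloatRep | DoubleRep
  deriving (Eq, Show)

-- The left-hand side either has one known type or is a mixed-type slot.
data Lhs = Typed Rep | Mixed
  deriving (Eq, Show)

-- The type used for the assignment itself.
data AssignTy = AsRep Rep | AsStgWord
  deriving (Eq, Show)

-- Assumption: "float types" means FloatRep and DoubleRep.
isFloating :: Rep -> Bool
isFloating r = r == FloatRep || r == DoubleRep

assignType :: Lhs -> Rep -> AssignTy
assignType (Typed lhsRep) _ = AsRep lhsRep    -- general case: type of the lhs
assignType Mixed rhsRep
  | isFloating rhsRep = AsRep rhsRep          -- mixed lhs, float rhs: type of the rhs
  | otherwise         = AsStgWord             -- mixed lhs, anything else: generic word

main :: IO ()
main = do
  print (assignType (Typed PtrRep) IntRep)
  print (assignType Mixed FloatRep)
  print (assignType Mixed CharRep)
```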
        CAddrMode       -- Put this in the program counter
                        -- eg `CJump (CReg (VanillaReg PtrRep 1))' puts Ret1 in PC
                        -- Enter can be done by:
                        --        CJump (CVal NodeRel zeroOff)

        CAddrMode       -- Fall through into this routine
                        -- (for the benefit of the native code generators)
                        -- Equivalent to CJump in C land

  | CReturn             -- Perform a return
        CAddrMode       -- Address of a RET_<blah> info table
        ReturnInfo      -- Whether it's a direct or vectored return

        [(Literal, AbstractC)]  -- alternatives
        AbstractC       -- default; if there is no real Abstract C in here
                        -- (e.g., all comments; see function "nonemptyAbsC"),
                        -- then that means the default _cannot_ occur.
                        -- If there is only one alternative & no default code,
                        -- then there is no need to check the tag.
                        --      CSwitch m [(tag,code)] AbsCNop == code

  | CCodeBlock CLabel AbstractC
                        -- A labelled block of code; this "statement" is not
                        -- executed; rather, the labelled code will be hoisted
                        -- out to the top level (out of line) & it can be

  | CInitHdr            -- to initialise the header of a closure (both fixed/var parts)
        CAddrMode       -- address of the info ptr
        !CAddrMode      -- cost centre to place in closure
                        --   CReg CurCostCentre or CC_HDR(R1.p{-Node-})
        Int             -- size of closure, for profiling
  -- NEW CASES FOR EXPANDED PRIMOPS

  | CMachOpStmt         -- Machine-level operation
        [CAddrMode]     -- Arguments
        (Maybe [MagicId])       -- list of regs which need to be preserved
        -- across the primop. This is allowed to be Nothing only if
        -- machOpIsDefinitelyInline returns True. And that in turn may
        -- only return True if we are absolutely sure that the mach op
        -- can be done inline on all platforms.

  | CSequential         -- Do the nested AbstractCs sequentially.
        [AbstractC]     -- In particular, as far as the AbsCUtils.doSimultaneously
                        -- is concerned, these stmts are to be treated as atomic
                        -- and are not to be reordered.

  -- end of NEW CASES FOR EXPANDED PRIMOPS

        [CAddrMode]     -- Results
        [CAddrMode]     -- Arguments
        [MagicId]       -- Potentially volatile/live registers
                        -- (to save/restore around the call/op)

        -- INVARIANT: When a PrimOp which can cause GC is used, the
        -- only live data is tidily on the STG stacks or in the STG
        -- registers (the code generator ensures this).
        --
        -- Why this? Because if the arguments were arbitrary
        -- addressing modes, they might be things like (Hp+6) which
        -- will get utterly spongled by GC.

  | CSimultaneous       -- Perform simultaneously all the statements
        AbstractC       -- in the nested AbstractC. They are only
                        -- allowed to be CAssigns, COpStmts and AbsCNops, so the
                        -- "simultaneous" part just concerns making
                        -- sure that permutations work.
                        -- For example { a := b, b := a }
                        -- needs to go via (at least one) temporary

  | CCheck              -- heap or stack checks, or both.
        CCheckMacro     -- These might include some code to fill in tags
        [CAddrMode]     -- on the stack, so we can't use CMacroStmt below.

  | CRetDirect          -- Direct return
        !Unique         -- for making labels
        AbstractC       -- return code
        Liveness        -- stack liveness at the return point

  -- see the notes about these next few; they follow below...
  | CMacroStmt          CStmtMacro      [CAddrMode]
  | CCallProfCtrMacro   FastString      [CAddrMode]
  | CCallProfCCMacro    FastString      [CAddrMode]

    {- The presence of this constructor is a makeshift solution;
       it is used to work around a gcc-related problem with
       handling typedefs within statement blocks (or, rather,
       the inability to do so).

       The AbstractC flattener takes care of lifting out these
       typedefs if need be (i.e., when generating .hc code and
       compiling 'foreign import dynamic's).
    -}
  | CCallTypedef Bool {- True => use "typedef"; False => use "extern"-}
                 CCallSpec Unique [CAddrMode] [CAddrMode]
  -- *** the next three [or so...] are DATA (those above are CODE) ***

        CLabel          -- The closure's label
        ClosureInfo     -- Todo: maybe info_lbl & closure_lbl instead?
        CAddrMode       -- cost centre identifier to place in closure
        [CAddrMode]     -- free vars; ptrs, then non-ptrs.

  | CSRT CLabel [CLabel]        -- SRT declarations: basically an array of
                                -- pointers to static closures.

  | CBitmap Liveness    -- A "large" bitmap to be emitted

  | CSRTDesc            -- A "large" SRT descriptor (one that doesn't
                        -- fit into the half-word bitmap in the itbl).
        !CLabel         -- Label for this SRT descriptor
        !CLabel         -- Pointer to the SRT
        !Int            -- Offset within the SRT

  | CClosureInfoAndCode
        ClosureInfo     -- Explains placement and layout of closure
        AbstractC       -- Entry point code

  | CRetVector          -- A labelled block of static data
        Liveness        -- stack liveness at the return point

  | CClosureTbl         -- table of constructors for enumerated types
        TyCon           -- which TyCon this table is for

  | CModuleInitBlock    -- module initialisation block
        CLabel          -- "plain" label for init block
        CLabel          -- label for init block (with ver + way info)
        AbstractC       -- initialisation code

  | CCostCentreDecl     -- A cost centre *declaration*
        Bool            -- True  <=> local => full declaration
                        -- False <=> extern; just say so

  | CCostCentreStackDecl        -- A cost centre stack *declaration*
        CostCentreStack         -- this is the declaration for a
                                -- pre-defined singleton CCS (see

  | CSplitMarker        -- Split into separate object modules here

-- C_SRT is what StgSyn.SRT gets translated to...
-- we add a label for the table, and expect only the 'offset/length' form

  | C_SRT !CLabel !Int{-offset-} !StgHalfWord{-bitmap or escape-}

needsSRT :: C_SRT -> Bool
needsSRT NoC_SRT       = False
needsSRT (C_SRT _ _ _) = True

About @CMacroStmt@, etc.: notionally, they all just call some
arbitrary C~macro or routine, passing the @CAddrModes@ as arguments.
However, we distinguish between various flavours of these things,
mostly just to keep things somewhat less wild and woolly.

Some {\em essential} bits of the STG execution model are done with C
macros. An example is @STK_CHK@, which checks for stack-space
overflow. This enumeration type lists all such macros:

  = UPD_CAF                     -- update CAF closure with indirection
  | UPD_BH_UPDATABLE            -- eager blackholing
  | UPD_BH_SINGLE_ENTRY         -- more eager blackholing
  | PUSH_UPD_FRAME              -- push update frame
  | SET_TAG                     -- set TagReg if it exists
      -- dataToTag# primop -- *only* used in unregisterised builds.
      -- (see AbsCUtils.dsCOpStmt)

  | REGISTER_FOREIGN_EXPORT     -- register a foreign exported fun
  | REGISTER_IMPORT             -- register an imported module
  | REGISTER_DIMPORT            -- register an imported module from

  | GRAN_FETCH                  -- for GrAnSim only -- HWL
  | GRAN_RESCHEDULE             -- for GrAnSim only -- HWL
  | GRAN_FETCH_AND_RESCHEDULE   -- for GrAnSim only -- HWL
  | THREAD_CONTEXT_SWITCH       -- for GrAnSim only -- HWL
  | GRAN_YIELD                  -- for GrAnSim only -- HWL

Heap/Stack checks. There are far too many of these.

  = HP_CHK_NP                   -- heap/stack checks when
  | STK_CHK_NP                  -- node points to the closure

  | HP_CHK_FUN                  -- heap/stack checks when
  | STK_CHK_FUN                 -- node doesn't point

  -- case alternative heap checks:

  | HP_CHK_NOREGS               -- no registers live
  | HP_CHK_UNPT_R1              -- R1 is boxed/unlifted
  | HP_CHK_UNBX_R1              -- R1 is unboxed
  | HP_CHK_F1                   -- FloatReg1 (only) is live
  | HP_CHK_D1                   -- DblReg1 (only) is live
  | HP_CHK_L1                   -- LngReg1 (only) is live

  | HP_CHK_UNBX_TUPLE           -- unboxed tuple heap check

\item[@CCallProfCtrMacro@:]
The @FastString@ names a macro that, if \tr{#define}d, will bump one or
more of the STG-event profiling counters.

\item[@CCallProfCCMacro@:]
The @FastString@ names a macro that, if \tr{#define}d, will perform some
cost-centre-profiling-related action.

%************************************************************************
\subsection[CAddrMode]{C addressing modes}
%************************************************************************

  = CVal RegRelative PrimRep
                        -- On RHS of assign: Contents of Magic[n]
                        -- On LHS of assign: location Magic[n]
                        -- (ie at addr Magic+n)

                        -- On RHS of assign: Address of Magic[n]; ie Magic+n
                        --      n=0 gets the Magic location itself
                        --      (NB: n=0 case superseded by CReg)
                        -- On LHS of assign: only sensible if n=0,
                        --      which gives the magic location itself
                        --      (NB: superseded by CReg)

             -- JRS 2002-02-05: CAddr is really scummy and should be fixed.
             -- The effect is that the semantics of CAddr depend on what the
             -- contained RegRelative is; it is decidedly non-orthogonal.

  | CReg MagicId        -- To replace (CAddr MagicId 0)

  | CTemp !Unique !PrimRep      -- Temporary locations
        -- ``Temporaries'' correspond to local variables in C, and registers in

  | CLbl CLabel         -- Labels in the runtime system, etc.
         PrimRep        -- the kind is so we can generate accurate C decls

  | CCharLike CAddrMode -- The address of a static char-like closure for
                        -- the specified character. It is guaranteed to be in
                        -- the range mIN_CHARLIKE..mAX_CHARLIKE

  | CIntLike CAddrMode  -- The address of a static int-like closure for the
                        -- specified small integer. It is guaranteed to be in
                        -- the range mIN_INTLIKE..mAX_INTLIKE

  | CJoinPoint          -- This is used as the amode of a let-no-escape-bound
        VirtualSpOffset   -- Sp value after any volatile free vars
                          -- of the rhs have been saved on stack.
                          -- Just before the code for the thing is jumped to,
                          -- Sp will be set to this value,
                          -- and then any stack-passed args pushed,
                          -- then the code for this thing will be entered

        !PrimRep        -- the kind of the result
        CExprMacro      -- the macro to generate a value
        [CAddrMode]     -- and its arguments

Various C macros for values which are dependent on the back-end layout.

  | ARG_TAG                     -- stack argument tagging
  | GET_TAG                     -- get current constructor tag

  | BYTE_ARR_CTS                -- used when passing a ByteArray# to a ccall
  | PTRS_ARR_CTS                -- similarly for an Array#
  | ForeignObj_CLOSURE_DATA     -- and again for a ForeignObj#

Convenience functions:

mkIntCLit :: Int -> CAddrMode
mkIntCLit i = CLit (mkMachInt (toInteger i))

mkWordCLit :: StgWord -> CAddrMode
mkWordCLit wd = CLit (MachWord (fromIntegral wd))

mkCString :: FastString -> CAddrMode
mkCString s = CLit (MachStr s)

mkCCostCentre :: CostCentre -> CAddrMode
mkCCostCentre cc = CLbl (mkCC_Label cc) DataPtrRep

mkCCostCentreStack :: CostCentreStack -> CAddrMode
mkCCostCentreStack ccs = CLbl (mkCCS_Label ccs) DataPtrRep

%************************************************************************
\subsection[RegRelative]{@RegRelatives@: ???}
%************************************************************************

  | SpRel   FastInt     -- }- offsets in StgWords
  | NodeRel FastInt     -- }
  | CIndex  CAddrMode CAddrMode PrimRep   -- pointer arithmetic :-)
                                          -- CIndex a b k === (k*)a[b]

  = DirectReturn                        -- Jump directly, if possible
  | StaticVectoredReturn Int            -- Fixed tag, starting at zero
  | DynamicVectoredReturn CAddrMode     -- Dynamic tag given by amode, starting at zero

hpRel :: VirtualHeapOffset      -- virtual offset of Hp
      -> VirtualHeapOffset      -- virtual offset of The Thing
      -> RegRelative            -- integer offset
hpRel hp off = HpRel (iUnbox (hp - off))

spRel :: VirtualSpOffset        -- virtual offset of Sp
      -> VirtualSpOffset        -- virtual offset of The Thing
      -> RegRelative            -- integer offset
spRel sp off = SpRel (iUnbox (spRelToInt sp off))

nodeRel :: VirtualHeapOffset
nodeRel off = NodeRel (iUnbox off)

%************************************************************************
\subsection[Liveness]{Liveness Masks}
%************************************************************************

We represent liveness bitmaps as a BitSet (whose internal
representation really is a bitmap). These are pinned onto case return
vectors to indicate the state of the stack for the garbage collector.

In the compiled program, liveness bitmaps that fit inside a single
word (StgWord) are stored as a single word, while larger bitmaps are
stored as a pointer to an array of words.

data Liveness = Liveness CLabel !Int Bitmap

maybeLargeBitmap :: Liveness -> AbstractC
maybeLargeBitmap liveness@(Liveness _ size _)
  | size <= mAX_SMALL_BITMAP_SIZE = AbsCNop
  | otherwise                     = CBitmap liveness
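
The small/large distinction described above can be sketched in isolation.
Everything in the following block is an assumption made for illustration:
the names @Emitted@, @packWord@, @chunksOf'@ and @emit@, the 64-bit word,
the bit order (first slot in the least significant bit) and the threshold
@maxSmallSize@, which merely stands in for @mAX_SMALL_BITMAP_SIZE@ and is not
its real value.

```haskell
import Data.Bits (setBit)
import Data.Word (Word64)

wordBits :: Int
wordBits = 64           -- assumption: 64-bit words

maxSmallSize :: Int
maxSmallSize = 32       -- hypothetical stand-in for mAX_SMALL_BITMAP_SIZE

-- A bitmap is either stored inline as one word, or emitted separately
-- as an array of words that is referred to by pointer.
data Emitted = Inline Word64 | OutOfLine [Word64]
  deriving (Eq, Show)

-- Pack up to wordBits flags into a word, first flag in bit 0.
packWord :: [Bool] -> Word64
packWord bs = foldl set 0 (zip [0 ..] bs)
  where set w (i, b) = if b then setBit w i else w

-- Split a list into pieces of at most n elements.
chunksOf' :: Int -> [a] -> [[a]]
chunksOf' _ [] = []
chunksOf' n xs = let (h, t) = splitAt n xs in h : chunksOf' n t

emit :: [Bool] -> Emitted
emit bs
  | length bs <= maxSmallSize = Inline (packWord bs)
  | otherwise                 = OutOfLine (map packWord (chunksOf' wordBits bs))

main :: IO ()
main = do
  print (emit [True, False, True])      -- Inline 5
  print (emit (replicate 70 True))      -- two words, out of line
```

This mirrors the shape of @maybeLargeBitmap@: nothing extra is emitted for a
small bitmap, and only a large one produces a separate table.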

%************************************************************************
\subsection[HeapOffset]{@Heap Offsets@}
%************************************************************************

This used to be a grotesquely complicated datatype in an attempt to
hide the details of header sizes from the compiler itself. Now these
constants are imported from the RTS, and we deal in real Ints.

type HeapOffset = Int                   -- ToDo: remove

type VirtualHeapOffset = HeapOffset
type VirtualSpOffset   = Int

type HpRelOffset = HeapOffset
type SpRelOffset = Int

%************************************************************************
\subsection[MagicId]{@MagicIds@: registers and such}
%************************************************************************

  = BaseReg     -- mentioned only in nativeGen

  -- Argument and return registers
  | VanillaReg  -- pointers, unboxed ints and chars
        FastInt -- its number (1 .. mAX_Vanilla_REG)

  | FloatReg    -- single-precision floating-point registers
        FastInt -- its number (1 .. mAX_Float_REG)

  | DoubleReg   -- double-precision floating-point registers
        FastInt -- its number (1 .. mAX_Double_REG)

  | Sp          -- Stack ptr; points to last occupied stack location.
  | SpLim       -- Stack limit
  | Hp          -- Heap ptr; points to last occupied heap location.
  | HpLim       -- Heap limit register
  | CurCostCentre -- current cost centre register.
  | VoidReg     -- see "VoidPrim" type; just a placeholder;
                --   no actual register
  | LongReg     -- long int registers (64-bit, really)
        PrimRep -- Int64Rep or Word64Rep
        FastInt -- its number (1 .. mAX_Long_REG)

  | CurrentTSO  -- pointer to current thread's TSO
  | CurrentNursery -- pointer to allocation area
  | HpAlloc     -- allocation count for heap check failure

node   = VanillaReg PtrRep  (_ILIT 1) -- A convenient alias for Node
tagreg = VanillaReg WordRep (_ILIT 2) -- A convenient alias for TagReg

We need a magical @Eq@ instance because @VanillaReg@s come in multiple
flavours: two @VanillaReg@s with the same number must compare equal even if
their @PrimRep@s differ.

instance Eq MagicId where
    reg1 == reg2 = tag reg1 ==# tag reg2

        tag BaseReg          = (_ILIT(0) :: FastInt)
        tag CurCostCentre    = _ILIT(6)
        tag VoidReg          = _ILIT(7)

        tag (VanillaReg _ i) = _ILIT(8) +# i

        tag (FloatReg i)     = _ILIT(8) +# maxv +# i
        tag (DoubleReg i)    = _ILIT(8) +# maxv +# maxf +# i
        tag (LongReg _ i)    = _ILIT(8) +# maxv +# maxf +# maxd +# i

        maxv = iUnbox mAX_Vanilla_REG
        maxf = iUnbox mAX_Float_REG
        maxd = iUnbox mAX_Double_REG
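
The tag-based equality can be tried out on a reduced model. The sketch below
is not the real @MagicId@: @Reg@, @Rep@ and @maxVanilla@ are hypothetical
stand-ins, boxed @Int@s replace @FastInt@, and the tag numbering is chosen
only for this example. It shows the two properties the instance relies on:
the representation of a vanilla register is ignored, and each register class
gets its own disjoint range of tags.

```haskell
-- Cut-down, hypothetical model of the register type.
data Rep = PtrRep | WordRep
  deriving Show

data Reg
  = BaseReg
  | Sp
  | VanillaReg Rep Int  -- representation hint, register number (1 .. maxVanilla)
  | FloatReg Int        -- register number
  deriving Show

maxVanilla :: Int
maxVanilla = 8          -- assumption; stands in for mAX_Vanilla_REG

-- Map every register to a unique small integer, ignoring the Rep field.
tag :: Reg -> Int
tag BaseReg          = 0
tag Sp               = 1
tag (VanillaReg _ i) = 2 + i
tag (FloatReg i)     = 2 + maxVanilla + i

instance Eq Reg where
  r1 == r2 = tag r1 == tag r2

main :: IO ()
main = do
  print (VanillaReg PtrRep 1 == VanillaReg WordRep 1)   -- True: Rep is ignored
  print (VanillaReg PtrRep 1 == FloatReg 1)             -- False: disjoint ranges
```

A derived @Eq@ would instead compare the @Rep@ field as well, which is
exactly what the hand-written instance avoids.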

Returns True for any register that {\em potentially} dies across
C calls (or anything near equivalent). We just say @True@ and
let the (machine-specific) registering macros sort things out...

isVolatileReg :: MagicId -> Bool
isVolatileReg any = True