2 % (c) The GRASP/AQUA Project, Glasgow University, 1992-1996
4 \section[AbstractC]{Abstract C: the last stop before machine code}
6 This ``Abstract C'' data type describes the raw Spineless Tagless
7 machine model at a C-ish level; it is ``abstract'' in that it only
8 includes C-like structures that we happen to need. The conversion of
9 programs from @StgSyntax@ (basically a functional language) to
10 @AbstractC@ (basically imperative C) is the heart of code generation.
11 From @AbstractC@, one may convert to real C (for portability) or to
12 raw assembler/machine code.
15 #include "HsVersions.h"
24 mkAbstractCs, mkAbsCStmts, mkAlgAltsCSwitch,
33 MagicId(..), node, infoptr,
34 isVolatileReg, noLiveRegsMask, mkLiveRegsMask,
40 import Constants ( mAX_Vanilla_REG, mAX_Float_REG,
41 mAX_Double_REG, lIVENESS_R1, lIVENESS_R2,
42 lIVENESS_R3, lIVENESS_R4, lIVENESS_R5,
43 lIVENESS_R6, lIVENESS_R7, lIVENESS_R8
45 import HeapOffs ( SYN_IE(VirtualSpAOffset), SYN_IE(VirtualSpBOffset),
46 SYN_IE(VirtualHeapOffset)
48 import Literal ( mkMachInt )
49 import PrimRep ( isFollowableRep, PrimRep(..) )
@AbstractC@ is conceptually a list of Abstract~C statements, but the
data structure is a tree, so that two sequences can be joined cheaply.
59 | AbsCStmts AbstractC AbstractC
61 -- and the individual stmts...
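The tree-shaped representation can be illustrated with a cut-down
stand-in type. This is only a sketch: @AbsC@, @Stmt@, @joinAbsC@,
@joinAll@ and @flatten@ are hypothetical names, and the sketch does not
reproduce the definitions of the exported @mkAbsCStmts@ and
@mkAbstractCs@.

```haskell
-- A cut-down stand-in for AbstractC: the two structural constructors
-- plus one hypothetical leaf statement.
data AbsC
  = Nop                 -- plays the role of AbsCNop
  | Stmts AbsC AbsC     -- plays the role of AbsCStmts
  | Stmt String         -- hypothetical leaf (stands in for CAssign, CJump, ...)
  deriving (Eq, Show)

-- Join two pieces in constant time, dropping no-ops on the way.
joinAbsC :: AbsC -> AbsC -> AbsC
joinAbsC Nop c   = c
joinAbsC c   Nop = c
joinAbsC c1  c2  = Stmts c1 c2

-- Join a whole list of pieces.
joinAll :: [AbsC] -> AbsC
joinAll = foldr joinAbsC Nop

-- Recover the statements in program order.  The accumulator avoids
-- repeated (++) on left-nested trees.
flatten :: AbsC -> [String]
flatten c = go c []
  where
    go Nop           rest = rest
    go (Stmts c1 c2) rest = go c1 (go c2 rest)
    go (Stmt s)      rest = s : rest
```

Joining never walks either operand; the cost of linearising the
statements is paid once, in @flatten@.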
64 A note on @CAssign@: In general, the type associated with an assignment
65 is the type of the lhs. However, when the lhs is a pointer to mixed
66 types (e.g. SpB relative), the type of the assignment is the type of
67 the rhs for float types, or the generic StgWord for all other types.
68 (In particular, a CharRep on the rhs is promoted to IntRep when
69 stored in a mixed type location.)
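The rule in the note above can be written as a small decision function.
This is a sketch under stated assumptions: @Rep@ is a reduced stand-in
for @PrimRep@, @AssignRep@, @isFloatingRep@ and @assignRep@ are
hypothetical names, and treating @DoubleRep@ as one of the ``float
types'' is a guess rather than something the note states.

```haskell
-- A small stand-in for PrimRep; the real type has many more cases.
data Rep = PtrRep | IntRep | CharRep | FloatRep | DoubleRep
  deriving (Eq, Show)

-- The representation used when emitting the assignment.
data AssignRep
  = UseRep Rep      -- use this specific representation
  | UseStgWord      -- fall back to the generic StgWord
  deriving (Eq, Show)

-- ASSUMPTION: "float types" covers both single and double precision.
isFloatingRep :: Rep -> Bool
isFloatingRep FloatRep  = True
isFloatingRep DoubleRep = True
isFloatingRep _         = False

-- First argument: does the lhs point into a mixed-type area
-- (e.g. an SpB-relative slot)?
assignRep :: Bool -> Rep -> Rep -> AssignRep
assignRep lhsIsMixed lhsRep rhsRep
  | not lhsIsMixed       = UseRep lhsRep   -- the general case: type of the lhs
  | isFloatingRep rhsRep = UseRep rhsRep   -- mixed lhs, floating rhs
  | otherwise            = UseStgWord      -- mixed lhs, anything else
```

Under this reading, a @CharRep@ stored into a mixed location takes the
@UseStgWord@ branch, which corresponds to the promotion mentioned in the
note.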
77 CAddrMode -- Put this in the program counter
78 -- eg `CJump (CReg (VanillaReg PtrRep 1))' puts Ret1 in PC
79 -- Enter can be done by:
80 -- CJump (CVal NodeRel zeroOff)
83 CAddrMode -- Fall through into this routine
84 -- (for the benefit of the native code generators)
85 -- Equivalent to CJump in C land
87 | CReturn -- This used to be RetVecRegRel
88 CAddrMode -- Any base address mode
89 ReturnInfo -- How to get the return address from the base address
92 [(Literal, AbstractC)] -- alternatives
93 AbstractC -- default; if there is no real Abstract C in here
94 -- (e.g., all comments; see function "nonemptyAbsC"),
95 -- then that means the default _cannot_ occur.
96 -- If there is only one alternative & no default code,
97 -- then there is no need to check the tag.
99 -- CSwitch m [(tag,code)] AbsCNop == code
101 | CCodeBlock CLabel AbstractC
102 -- [amode analog: CLabelledCode]
103 -- A labelled block of code; this "statement" is not
104 -- executed; rather, the labelled code will be hoisted
105 -- out to the top level (out of line) & it can be
108 | CInitHdr -- to initialise the header of a closure (both fixed/var parts)
110 RegRelative -- address of the info ptr
111 CAddrMode -- cost centre to place in closure
112 -- CReg CurCostCentre or CC_HDR(R1.p{-Node-})
113 Bool -- inplace update or allocate
116 [CAddrMode] -- Results
118 [CAddrMode] -- Arguments
119 Int -- Live registers (may be obtainable from volatility? ADR)
120 [MagicId] -- Potentially volatile/live registers
121 -- (to save/restore around the call/op)
123 -- INVARIANT: When a PrimOp which can cause GC is used, the
124 -- only live data is tidily on the STG stacks or in the STG
125 -- registers (the code generator ensures this).
127 -- Why this? Because if the arguments were arbitrary
128 -- addressing modes, they might be things like (Hp+6) which
129 -- will get utterly spongled by GC.
131 | CSimultaneous -- Perform simultaneously all the statements
132 AbstractC -- in the nested AbstractC. They are only
133 -- allowed to be CAssigns, COpStmts and AbsCNops, so the
134 -- "simultaneous" part just concerns making
135 -- sure that permutations work.
136 -- For example { a := b, b := a }
137 -- needs to go via (at least one) temporary
139 -- see the notes about these next few; they follow below...
140 | CMacroStmt CStmtMacro [CAddrMode]
141 | CCallProfCtrMacro FAST_STRING [CAddrMode]
142 | CCallProfCCMacro FAST_STRING [CAddrMode]
144 -- *** the next three [or so...] are DATA (those above are CODE) ***
147 CLabel -- The (full, not base) label to use for labelling the closure.
149 CAddrMode -- cost centre identifier to place in closure
150 [CAddrMode] -- free vars; ptrs, then non-ptrs
153 | CClosureInfoAndCode
154 ClosureInfo -- Explains placement and layout of closure
155 AbstractC -- Slow entry point code
157 -- Fast entry point code, if any
158 CAddrMode -- Address of update code; Nothing => should never be used
159 -- (which is the case for all except constructors)
160 String -- Closure description; NB we can't get this from
161 -- ClosureInfo, because the latter refers to the *right* hand
162 -- side of a defn, whereas the "description" refers to *left*
164 Int -- Liveness info; this is here because it is
165 -- easy to produce w/in the CgMonad; hard
166 -- thereafter. (WDP 95/11)
168 | CRetVector -- Return vector with "holes"
169 -- (Nothings) for the default
170 CLabel -- vector-table label
172 AbstractC -- (and what to put in a "hole" [when Nothing])
174 | CRetUnVector -- Direct return
175 CLabel -- unvector-table label
176 CAddrMode -- return code
178 | CFlatRetVector -- A labelled block of static data
179 CLabel -- This is the flattened version of CRetVector
182 | CCostCentreDecl -- A cost centre *declaration*
183 Bool -- True <=> local => full declaration
184 -- False <=> extern; just say so
188 AbstractC -- InRegs Info Table (CClosureInfoTable)
190 -- out of date -- HWL
192 | CSplitMarker -- Split into separate object modules here
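The @CSimultaneous@ constructor above requires that a group such as
{ a := b, b := a } behaves as if all right-hand sides were read before
any left-hand side is written. A minimal sketch of one way to get that
effect follows. It uses the simplest correct strategy (one temporary per
assignment), which is not claimed to be what the compiler does; @Loc@,
@Assign@, @sequentialise@ and @runSeq@ are hypothetical names.

```haskell
type Loc   = String
type Store = [(Loc, Int)]

-- One assignment: target := source.
data Assign = Loc := Loc
  deriving (Eq, Show)

-- Turn a simultaneous group into an equivalent sequence by first
-- saving every source in a fresh temporary.
-- ASSUMPTIONS: no location is already named "tmpN", and no target
-- appears twice in the group.
sequentialise :: [Assign] -> [Assign]
sequentialise grp = saves ++ restores
  where
    temps    = [ "tmp" ++ show i | i <- [0 :: Int ..] ]
    saves    = [ t := src | (t, _ := src) <- zip temps grp ]
    restores = [ dst := t | (t, dst := _) <- zip temps grp ]

-- Run a sequence of assignments over a store, one at a time.
runSeq :: [Assign] -> Store -> Store
runSeq stmts store0 = foldl step store0 stmts
  where
    step store (dst := src) =
      case lookup src store of
        Just v  -> (dst, v) : filter ((/= dst) . fst) store
        Nothing -> error ("unbound location: " ++ src)
```

Running the swap group directly with @runSeq@ loses the old value of
@a@; running @sequentialise@ first preserves it. A smarter scheme would
introduce temporaries only where the assignments form a cycle.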
About @CMacroStmt@, etc.: notionally, they all just call some
arbitrary C~macro or routine, passing the @CAddrModes@ as arguments.
However, we distinguish between various flavours of these things,
mostly just to keep things somewhat less wild and woolly.
202 Some {\em essential} bits of the STG execution model are done with C
203 macros. An example is @STK_CHK@, which checks for stack-space
204 overflow. This enumeration type lists all such macros:
207 = ARGS_CHK_A_LOAD_NODE
209 | ARGS_CHK_B_LOAD_NODE
218 | UPD_BH_SINGLE_ENTRY
222 | GRAN_FETCH -- for GrAnSim only -- HWL
223 | GRAN_RESCHEDULE -- for GrAnSim only -- HWL
224 | GRAN_FETCH_AND_RESCHEDULE -- for GrAnSim only -- HWL
225 | THREAD_CONTEXT_SWITCH -- for GrAnSim only -- HWL
226 | GRAN_YIELD -- for GrAnSim only -- HWL
230 \item[@CCallProfCtrMacro@:]
231 The @String@ names a macro that, if \tr{#define}d, will bump one/some
232 of the STG-event profiling counters.
234 \item[@CCallProfCCMacro@:]
235 The @String@ names a macro that, if \tr{#define}d, will perform some
236 cost-centre-profiling-related action.
239 HERE ARE SOME OLD NOTES ABOUT HEAP-CHK ENTRY POINTS:
Some parts of the system, {\em notably the storage manager}, are
implemented by C~routines that must know something about the internals
of the STG world, e.g., where the heap-pointer is. (The
``C-as-assembler'' document describes this stuff in detail.)
247 This is quite a tricky business, especially with ``optimised~C,'' so
248 we keep close tabs on these fellows. This enumeration type lists all
249 such ``STG~C'' routines:
251 HERE ARE SOME *OLD* NOTES ABOUT HEAP-CHK ENTRY POINTS:
253 Heap overflow invokes the garbage collector (of your choice :-), and
254 we have different entry points, to tell the GC the exact configuration
257 \item[Branch of a boxed case:]
The @Node@ register points off to somewhere legitimate, the @TagReg@
holds the tag, and the @RetReg@ points to the code for the
alternative which should be resumed. (ToDo: update)
262 \item[Branch of an unboxed case:]
263 The @Node@ register points nowhere of any particular interest, a
264 kind-specific register (@IntReg@, @FloatReg@, etc.) holds the unboxed
265 value, and the @RetReg@ points to the code for the alternative
266 which should be resumed. (ToDo: update)
268 \item[Closure entry:]
269 The @Node@ register points to the closure, and the @RetReg@ points
270 to the code to be resumed. (ToDo: update)
273 %************************************************************************
275 \subsection[CAddrMode]{C addressing modes}
277 %************************************************************************
Addressing modes: each one has a @PrimRep@ (its primitive kind) pinned on it.
282 = CVal RegRelative PrimRep
283 -- On RHS of assign: Contents of Magic[n]
284 -- On LHS of assign: location Magic[n]
285 -- (ie at addr Magic+n)
-- On RHS of assign: Address of Magic[n]; ie Magic+n
-- n=0 gets the Magic location itself
-- (NB: n=0 case superseded by CReg)
-- On LHS of assign: only sensible if n=0,
-- which gives the magic location itself
-- (NB: superseded by CReg)
295 | CReg MagicId -- To replace (CAddr MagicId 0)
297 | CTableEntry -- CVal should be generalized to allow this
300 PrimRep -- For casting
302 | CTemp Unique PrimRep -- Temporary locations
303 -- ``Temporaries'' correspond to local variables in C, and registers in
306 | CLbl CLabel -- Labels in the runtime system, etc.
307 -- See comment under CLabelledData about (String,Name)
308 PrimRep -- the kind is so we can generate accurate C decls
310 | CUnVecLbl -- A choice of labels left up to the back end
314 | CCharLike CAddrMode -- The address of a static char-like closure for
315 -- the specified character. It is guaranteed to be in
318 | CIntLike CAddrMode -- The address of a static int-like closure for the
319 -- specified small integer. It is guaranteed to be in the
320 -- range mIN_INTLIKE..mAX_INTLIKE
322 | CString FAST_STRING -- The address of the null-terminated string
324 | CLitLit FAST_STRING -- completely literal literal: just spit this String
328 | COffset HeapOffset -- A literal constant, not an offset *from* anything!
329 -- ToDo: this should really be CLitOffset
331 | CCode AbstractC -- Some code. Used mainly for return addresses.
333 | CLabelledCode CLabel AbstractC -- Almost defunct? (ToDo?) --JSM
334 -- Some code that must have a particular label
335 -- (which is jumpable to)
337 | CJoinPoint -- This is used as the amode of a let-no-escape-bound variable
338 VirtualSpAOffset -- SpA and SpB values after any volatile free vars
339 VirtualSpBOffset -- of the rhs have been saved on stack.
340 -- Just before the code for the thing is jumped to,
341 -- SpA/B will be set to these values,
342 -- and then any stack-passed args pushed,
343 -- then the code for this thing will be entered
346 PrimRep -- the kind of the result
347 CExprMacro -- the macro to generate a value
348 [CAddrMode] -- and its arguments
| CCostCentre -- If Bool is True ==> it is to be printed as a String,
351 CostCentre -- (*not* as a C identifier or some such).
352 Bool -- (It's not just the double-quotes on either side;
353 -- spaces and other funny characters will have been
354 -- fiddled in the non-String variant.)
357 = --ASSERT(not (currentOrSubsumedCosts cc))
358 --FALSE: We do put subsumedCC in static closures
362 Various C macros for values which are dependent on the back-end layout.
377 mkIntCLit :: Int -> CAddrMode
378 mkIntCLit i = CLit (mkMachInt (toInteger i))
381 %************************************************************************
383 \subsection[RegRelative]{@RegRelatives@: ???}
385 %************************************************************************
389 = HpRel VirtualHeapOffset -- virtual offset of Hp
390 VirtualHeapOffset -- virtual offset of The Thing
391 | SpARel VirtualSpAOffset -- virtual offset of SpA
392 VirtualSpAOffset -- virtual offset of The Thing
393 | SpBRel VirtualSpBOffset -- virtual offset of SpB
394 VirtualSpBOffset -- virtual offset of The Thing
395 | NodeRel VirtualHeapOffset
398 = DirectReturn -- Jump directly, if possible
399 | StaticVectoredReturn Int -- Fixed tag, starting at zero
400 | DynamicVectoredReturn CAddrMode -- Dynamic tag given by amode, starting at zero
403 %************************************************************************
405 \subsection[MagicId]{@MagicIds@: registers and such}
407 %************************************************************************
409 Much of what happens in Abstract-C is in terms of ``magic'' locations,
410 such as the stack pointer, heap pointer, etc. If possible, these will
411 be held in registers.
413 Here are some notes about what's active when:
415 \item[Always active:]
416 Hp, HpLim, SpA, SpB, SuA, SuB
422 Ptr regs: RetPtr1 (= Node), RetPtr2...
423 Int/char regs: RetData1 (= TagReg = IntReg), RetData2...
424 Float regs: RetFloat1, ...
425 Double regs: RetDouble1, ...
430 = BaseReg -- mentioned only in nativeGen
432 | StkOReg -- mentioned only in nativeGen
434 -- Argument and return registers
435 | VanillaReg -- pointers, unboxed ints and chars
436 PrimRep -- PtrRep, IntRep, CharRep, StablePtrRep or ForeignObjRep
437 -- (in case we need to distinguish)
438 FAST_INT -- its number (1 .. mAX_Vanilla_REG)
440 | FloatReg -- single-precision floating-point registers
441 FAST_INT -- its number (1 .. mAX_Float_REG)
443 | DoubleReg -- double-precision floating-point registers
444 FAST_INT -- its number (1 .. mAX_Double_REG)
446 | TagReg -- to return constructor tags; as almost all returns are vectored,
447 -- this is rarely used.
449 | RetReg -- topmost return address from the B stack
451 | SpA -- Stack ptr; points to last occupied stack location.
452 -- Stack grows downward.
453 | SuA -- mentioned only in nativeGen
455 | SpB -- Basic values, return addresses and update frames.
457 | SuB -- mentioned only in nativeGen
459 | Hp -- Heap ptr; points to last occupied heap location.
460 -- Free space at lower addresses.
462 | HpLim -- Heap limit register: mentioned only in nativeGen
464 | LivenessReg -- (parallel only) used when we need to record explicitly
465 -- what registers are live
467 | StdUpdRetVecReg -- mentioned only in nativeGen
468 | StkStubReg -- register holding STK_STUB_closure (for stubbing dead stack slots)
470 | CurCostCentre -- current cost centre register.
472 | VoidReg -- see "VoidPrim" type; just a placeholder; no actual register
474 node = VanillaReg PtrRep ILIT(1) -- A convenient alias for Node
475 infoptr = VanillaReg DataPtrRep ILIT(2) -- An alias for InfoPtr
478 noLiveRegsMask :: Int -- Mask indicating nothing live
482 :: [MagicId] -- Candidate live regs; depends what they have in them
486 = foldl do_reg noLiveRegsMask regs
488 do_reg acc (VanillaReg kind reg_no)
489 | isFollowableRep kind
490 = acc + (reg_tbl !! IBOX(reg_no _SUB_ ILIT(1)))
492 do_reg acc anything_else = acc
494 reg_tbl -- ToDo: mk Array!
495 = [lIVENESS_R1, lIVENESS_R2, lIVENESS_R3, lIVENESS_R4,
496 lIVENESS_R5, lIVENESS_R6, lIVENESS_R7, lIVENESS_R8]
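The mask computation above can be sketched in isolation as follows.
Several things here are assumptions rather than facts from this file:
the bit assigned to each register by @livenessBit@ (the real values come
from the @lIVENESS_Rn@ constants), the value 0 for the empty mask, and
the reduced @Rep@ and @Reg@ types. The sketch also combines bits with
bitwise OR instead of @+@, so that a register listed twice is not
counted twice; that is a deliberate difference from the code above.

```haskell
import Data.Bits ((.|.), shiftL)

-- Reduced stand-in for PrimRep.
data Rep = PtrRep | IntRep | CharRep
  deriving (Eq, Show)

-- Only the register shapes that matter for the mask.
data Reg
  = VanillaReg Rep Int   -- representation and register number (1-based)
  | OtherReg             -- any register that never contributes to the mask
  deriving (Eq, Show)

-- Stand-in for isFollowableRep: must the GC follow this value?
isFollowable :: Rep -> Bool
isFollowable PtrRep = True
isFollowable _      = False

-- ASSUMPTION: register n owns bit (n - 1).
livenessBit :: Int -> Int
livenessBit n = 1 `shiftL` (n - 1)

-- Fold over the candidates, setting a bit for each vanilla register
-- that currently holds a followable (pointer) value.
liveRegsMask :: [Reg] -> Int
liveRegsMask = foldl addReg 0
  where
    addReg acc (VanillaReg rep n)
      | isFollowable rep = acc .|. livenessBit n
    addReg acc _ = acc
```

As in the original, a @VanillaReg@ holding a non-pointer falls through
to the catch-all equation and leaves the mask unchanged.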
We need magical @Eq@ because @VanillaReg@s come in multiple flavours:
two @VanillaReg@s with the same number must compare equal even when
their @PrimRep@s differ.
502 instance Eq MagicId where
503 reg1 == reg2 = tag reg1 _EQ_ tag reg2
505 tag BaseReg = (ILIT(0) :: FAST_INT)
506 tag StkOReg = ILIT(1)
515 tag LivenessReg = ILIT(10)
516 tag StdUpdRetVecReg = ILIT(12)
517 tag StkStubReg = ILIT(13)
518 tag CurCostCentre = ILIT(14)
519 tag VoidReg = ILIT(15)
521 tag (VanillaReg _ i) = ILIT(15) _ADD_ i
523 tag (FloatReg i) = ILIT(15) _ADD_ maxv _ADD_ i
525 maxv = case mAX_Vanilla_REG of { IBOX(x) -> x }
527 tag (DoubleReg i) = ILIT(15) _ADD_ maxv _ADD_ maxf _ADD_ i
529 maxv = case mAX_Vanilla_REG of { IBOX(x) -> x }
530 maxf = case mAX_Float_REG of { IBOX(x) -> x }
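The tag-based equality above can be sketched with ordinary boxed
integers. This is a simplified illustration: the @Reg@ type lists only a
few registers, and @maxVanilla@ and @maxFloat@ are illustrative counts
standing in for @mAX_Vanilla_REG@ and @mAX_Float_REG@, whose values are
not shown in this file.

```haskell
data Rep = PtrRep | IntRep
  deriving (Eq, Show)

data Reg
  = Hp
  | VoidReg
  | VanillaReg Rep Int   -- number in 1 .. maxVanilla
  | FloatReg Int         -- number in 1 .. maxFloat
  | DoubleReg Int
  deriving Show

-- ASSUMPTION: illustrative register counts, not the real constants.
maxVanilla, maxFloat :: Int
maxVanilla = 8
maxFloat   = 4

-- Map each register to a distinct integer, ignoring the Rep.
-- Numbered registers are laid out in consecutive ranges after the
-- fixed ones, so the ranges cannot overlap.
tag :: Reg -> Int
tag Hp               = 0
tag VoidReg          = 1
tag (VanillaReg _ i) = 1 + i
tag (FloatReg i)     = 1 + maxVanilla + i
tag (DoubleReg i)    = 1 + maxVanilla + maxFloat + i

-- Two registers are equal exactly when their tags are.
instance Eq Reg where
  r1 == r2 = tag r1 == tag r2
```

With this instance, @VanillaReg PtrRep 1 == VanillaReg IntRep 1@ holds,
which a derived @Eq@ instance would not give.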
533 Returns True for any register that {\em potentially} dies across
534 C calls (or anything near equivalent). We just say @True@ and
535 let the (machine-specific) registering macros sort things out...
537 isVolatileReg :: MagicId -> Bool
539 isVolatileReg any = True
540 --isVolatileReg (FloatReg _) = True
541 --isVolatileReg (DoubleReg _) = True
544 %************************************************************************
546 \subsection[AbsCSyn-printing]{Pretty-printing Abstract~C}
548 %************************************************************************
550 It's in \tr{PprAbsC.lhs}.