ghc/compiler/codeGen/cgintro.lit

   1 \section[codegen-intro]{Intro/background info for the code generator}
   2
   3 \tr{NOTES.codeGen} LIVES!!!
   4
   5 \begin{verbatim}
   6 =======================
   7 NEW!  10 Nov 93                 Semi-tagging
   8
   9 Rough idea
  10
  11         case x of               -- NB just a variable scrutinised
  12           []     -> ...
  13           (p:ps) -> ...p...     -- eg.  ps not used
  14
  15 generates
  16
  17         Node = a ptr to x
  18         while TRUE do { switch TAG(Node) {
  19
  20           INDIRECTION_TAG : Node = Node[1]; break;      -- Dereference indirection
  21
  22           OTHER_TAG : adjust stack; push return address; ENTER(Node)
  23
  24           0 :   adjust stack;
  25                 JUMP( Nil_case )
  26
  27           1 :   adjust stack;
  28                 R2 := Node[2]   -- Get ps
  29                 JUMP( Cons_case )
  30         }
  31
  32 * The "return address" is a vector table, which contains pointers to
  33   Nil_case and Cons_case.
  34
  35 * The "adjust stack" in the case of OTHER_TAG is one word different to
  36   that in the case of a constructor tag (0,1,...), because it needs to
  37   take account of the return address.  That's why the stack adjust
  38   shows up in the branches, rather than before the switch.
  39
  40 * In the case of *unvectored* returns, the "return address" will be
  41   some code which switches on TagReg.  Currently, the branches of the
  42   case at the return address have the code for the alternatives
  43   actually there:
  44
  45         switch TagReg {
  46           0 : code for nil case
  47           1 : code for cons case
  48         }
  49
  50 But with semi-tagging, we'll have to label each branch:
  51
  52         switch TagReg {
  53           0 : JUMP( Nil_case )
  54           1 : JUMP( Cons_case )
  55         }
  56
  57 So there's an extra jump.  Boring.  Boring.  (But things are usually
  58 eval'd...in which case we save a jump.)
  59
  60 * TAG is a macro which gets a "tag" from the info table. The tag
  61   encodes whether the thing is (a) an indirection, (b) evaluated
  62   constructor with tag N, or (c) something else. The "something else"
  63   usually indicates something unevaluated, but it might also include
  64   FETCH_MEs etc.  Anything which must be entered.
  65
  66 * Maybe we should get the info ptr out of Node, into a temporary
  67   InfoPtrReg, so that TAG and ENTER share the info-ptr fetch.
  68
  69 * We only load registers which are live in the alternatives.  So at
  70   the start of an alternative, either the unused fields *will* be in
  71   regs (if we came via enter/return) or they *won't* (if we came via
  72   the semi-tagging switch).  If they aren't, GC had better not follow
  73   them. So we can't arrange that all live ptrs are neatly lined up in
  74   the first N regs any more.  So GC has to take a liveness
  75   bit-pattern, not just a "number of live regs" number.
  76
  77 * We need to know which of the constructors fields are live in the
  78   alternatives.  Hence STG code has to be elaborated to keep live vars
  79   for each alternative, or to tag each bound-var in the alternatives
  80   with whether or not it is used.
  81
  82 * The code generator needs to be able to construct unique labels for
  83   the case alternatives.  (Previously this was done by the AbsC
  84   flattening pass.) Reason: we now have an explicit join point at the
  85   start of each alternative.
  86
  87 * There's some question about how tags are mapped.  Is 0 the first
  88   tag?  (Good when switching on TagReg when there are only two
  89   constructors.)  What is OTHER_TAG and INDIRECTION_TAG?
  90
  91 * This whole deal can be freely mixed with un-semi-tagged code.
  92   There should be a compiler flag to control it.
  93
  94 =======================
  95 Many of the details herein are moldy and dubious, but the general
  96 principles are still mostly sound.
  97 \end{verbatim}
  98
  99 %************************************************************************
 100 %*                                                                      *
 101 \subsection{LIST OF OPTIMISATIONS TO DO}
 102 %*                                                                      *
 103 %************************************************************************
 104
 105 \begin{itemize}
 106 \item
 107 Register return conventions.
 108
 109 \item
 110 Optimisations for Enter when
 111         \begin{itemize}
 112         \item
 113         know code ptr, so don't indirect via Node
 114         \item
 115         know how many args
 116         \item
 117         top level closures don't load Node
 118         \end{itemize}
 119 \item
 120 Strings.
 121
 122 \item
 123 Case of unboxed op with more than one alternative, should generate
 124 a switch or an if statement.
 125 \end{itemize}
 126
 127 {\em Medium}
 128
 129 \begin{itemize}
 130 \item
 131 Don't allocate constructors with no args.
 132 Instead have a single global one.
 133
 134 \item
 135 Have global closures for all characters, and all small numbers.
 136 \end{itemize}
 137
 138
 139 {\em Small}
 140
 141 \begin{itemize}
 142 \item
 143 When a closure is one of its own free variables, don't waste a field
 144 on it.  Instead just use Node.
 145 \end{itemize}
 146
 147
 148 %************************************************************************
 149 %*                                                                      *
 150 \subsection{ENTERING THE GARBAGE COLLECTOR}
 151 %*                                                                      *
 152 %************************************************************************
 153
 154 [WDP: OLD]
 155
 156 There are the following ways to get into the garbage collector:
 157
 158 \begin{verbatim}
 159 _HEAP_OVERFLOW_ReturnViaNode
 160 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 161 Used for the GC trap at closure entry.
 162
 163         - Node is only live ptr
 164         - After GC, enter Node
 165
 166 _HEAP_OVERFLOW_ReturnDirect0, _HEAP_OVERFLOW_ReturnDirect1, ...
 167 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 168 Used:   for fast entry of functions, and
 169         case alternative where values are returned in regs
 170
 171         - PtrReg1..n are live ptrs
 172         - ReturnReg points to start of code (before hp oflo check)
 173         - After GC, jump to ReturnReg
 174         - TagReg is preserved, in case this is an unvectored return
 175
 176
 177 _HEAP_OVERFLOW_CaseReturnViaNode
 178 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 179         *** GRIP ONLY ***
 180
 181 Used for case alternatives which return node in heap
 182
 183         - Node is only live ptr
 184         - RetVecReg points to return vector
 185         - After GC, push RetVecReg and enter Node
 186 \end{verbatim}
 187
 188 Exactly equivalent to @GC_ReturnViaNode@, preceded by pushing @ReturnVectorReg@.
 189
 190 The only reason we re-enter Node is so that in a GRIP-ish world, the
 191 closure pointed to be Node is re-loaded into local store if necessary.
 192
 193 %************************************************************************
 194 %*                                                                      *
 195 \subsection{UPDATES}
 196 %*                                                                      *
 197 %************************************************************************
 198
 199 [New stuff 27 Nov 91]
 200
 201 \subsubsection{Return conventions}
 202 %~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 203
 204 When executing the update continuation code for a constructor,
 205 @RetVecReg@ points to the {\em beginning of} the return vector.  This is to
 206 enable the update code to find the normal continuation code.
 207 (@RetVecReg@ is set up by the code which jumps to the update continuation
 208 code.)
 209
 210 \subsubsection{Stack arrangement}
 211 %~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 212
 213 Each stack has a ``stack update ptr'', SuA and SuB, which point to the
 214 topmost word of the stack just after an update frame has been pushed.
 215
 216 A standard update frame (on the B stack) looks like this
 217 (stack grows downward in this picture):
 218
 219 \begin{verbatim}
 220         |                                       |
 221         |---------------------------------------|
 222         | Saved SuA                             |
 223         |---------------------------------------|
 224         | Saved SuB                             |
 225         |---------------------------------------|
 226         | Pointer to closure to be updated      |
 227         |---------------------------------------|
 228         | Pointer to Update return vector       |
 229         |---------------------------------------|
 230 \end{verbatim}
 231
 232 The SuB therefore points to the Update return vector component of the
 233 topmost update frame.
 234
 235 A {\em constructor} update frame, which is pushed only by closures
 236 which know they will evaluate to a data object, looks just the
 237 same, but without the saved SuA pointer.
 238
 239 \subsubsection{Pushing update frames}
 240 %~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 241
 242 An update is pushed right at the start of the code for an updatable
 243 closure.  But {\em after} the stack overflow check.  (The B-stack oflo
 244 check should thereby include allowance for the update frame itself.)
 245
 246 \subsubsection{Return vectors}
 247 %~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 248
 249 Every ``return address'' pushed on the stack by a boxed \tr{case} is a
 250 pointer to a vector of one or more pairs of code pointers:
 251
 252 \begin{verbatim}
 253         ------> -----------------
 254                 | Cont1         |
 255                 |---------------|
 256                 | Update1       |
 257                 -----------------
 258                 | Cont2         |
 259                 |---------------|
 260                 | Update2       |
 261                 -----------------
 262                 ...etc...
 263 \end{verbatim}
 264
 265 Each pair consists of a {\em continuation} code pointer and an
 266 {\em update} code pointer.
 267
 268 For data types with only one constructor, or too many constructors for
 269 vectoring, the return vector consists of a single pair.
 270
 271 When the \tr{data} decl for each data type is compiled, as well as
 272 making info tables for each constructor, an update code sequence for
 273 each constructor (or a single one, if unvectored) is also created.
 274
 275 ToDo: ** record naming convention for these code sequences somewhere **
 276
 277 When the update code is entered, it uses the value stored in the
 278 return registers used by that constructor to update the thing pointed
 279 to by the update frame (all of which except for the return address is
 280 still on the B stack).  If it can do an update in place (ie
 281 constructor takes 3 words or fewer) it does so.
 282
 283 In the unvectored case, this code first has to do a switch on the tag,
 284 UNLESS the return is in the heap, in which case simply overwrite with
 285 an indirection to the thing Node points to.
 286
 287 Tricky point: if the update code can't update in place it has to
 288 allocate a new object, by performing a heap-oflo check and jumping to
 289 the appropriate heap-overflow entry point depending on which RetPtr
 290 registers are live (just as when compiling a case alternative).
 291
 292 When the update code is entered, a register @ReturnReg@ is assumed to
 293 contain the ``return address'' popped from the B stack. This is so
 294 that the update code can enter the normal continuation code when it is
 295 done.
 296
 297 For standard update frames, the A and B stack update ptrs are restored
 298 from the saved versions before returning, too.
 299
 300 \subsubsection{Update return vector}
 301 %~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 302
 303 Both standard and constructor update frames have as their topmost word
 304 a pointer to a static, fixed, update return vector.
 305
 306 The ``continuation'' entry of each pair in this vector sets UpdReg to
 307 point to the thing to be updated (gotten from the update frame), pops
 308 the update frame, and returns to the ``update'' entry of the
 309 corresponding pair in the next return vector (now exposed on top of B
 310 stk).
 311
 312 The ``update'' entry of each pair in this vector overwrites the thing
 313 to be updated with an indirection to the thing UpdReg points to, and
 314 then returns in the same was as the "continuation" entry above.
 315
 316 There need to be enough pairs in the update return vector to cater for
 317 any constructor at all.
 318
 319
 320 *************************
 321
 322 Things which need to be altered if you change the number of constructors
 323 which switches off vectored returns:
 324 \begin{verbatim}
 325         Extra cases in update return vector (file xxx)
 326         The value xxxx in yyyy.lhs
 327         others?
 328 \end{verbatim}
 329 **************************
 330
 331 %************************************************************************
 332 %*                                                                      *
 333 \subsection{HEAP OBJECTS}
 334 %*                                                                      *
 335 %************************************************************************
 336
 337 The heap consists of {\em closures}.
 338 A closure can be either:
 339 \begin{itemize}
 340 \item
 341 a {\em suspension}, which is an unevaluated thunk.
 342 \item
 343 a {\em constructed object} (or just constructor); created by let(recs) and
 344 by updating.
 345 \item
 346 a {\em partial application} (only updating creates these).
 347 \end{itemize}
 348
 349 Closures are laid out with the {\em info pointer} at the lowest
 350 address (but see notes on the Global Address field for parallel
 351 system).  [We don't try to localise knowledge of this!  It is a royal
 352 pain having to cope with closures laid out backwards.]
 353
 354 Ptr fields occur first (before non-ptr ones).
 355
 356 Non-normal-form closures are always at least 3 words in size (excl
 357 global address), so they can be updated with a list cell (should they
 358 evaluate to that).
 359
 360 Normal form (constructor) closures are always at least 2 words in size
 361 (excl global address), so they have room enough for forwarding ptrs
 362 during GC, and FETCHME boxes after flushing.
 363
 364 1-word closures for normal-form closures in static space.  Explain
 365 more.
 366
 367 Ideally, the info pointer of a closure would point to...
 368 \begin{verbatim}
 369               |-------------|
 370               | info table  |
 371               |-------------|
 372 info ptr ---> code
 373 \end{verbatim}
 374
 375 But when C is the target code we can't guarantee the relative
 376 positions of code and data.  So the info ptr points to
 377 \begin{verbatim}
 378               |-------------|
 379 info ptr ---->|    ------------------------> code
 380               |-------------|
 381               | info table  |
 382               |-------------|
 383 \end{verbatim}
 384
 385 That is, there's an extra indirection involved; and the info table
 386 occurs AFTER the info pointer rather than before. The info table
 387 entries are ``reversed'' too, so that bigger negative offsets in the
 388 ``usual'' case turn into bigger positive offsets.
 389
 390 SUSPENSIONS
 391
 392 The simplest form of suspension is
 393 \begin{verbatim}
 394         info-ptr, ptr free vars, non-ptr free vars
 395 \end{verbatim}
 396
 397 where the info table for info-ptr gives
 398 \begin{itemize}
 399 \item
 400 the total number of words of free vars
 401 \item
 402 the number of words of ptr free vars (== number of ptr free vars)
 403 in its extra-info part.
 404 \end{itemize}
 405
 406 Optimised versions omit the size info from the info table, and instead
 407 use specialised GC routines.
 408
 409
 410 %************************************************************************
 411 %*                                                                      *
 412 \subsection{NAMING CONVENTIONS for compiled code}
 413 %*                                                                      *
 414 %************************************************************************
 415
 416
 417 Given a top-level closure called f defined in module M,
 418
 419 \begin{verbatim}
 420         _M_f_closure            labels the closure itself
 421                                 (only for top-level (ie static) closures)
 422
 423         _M_f_entry              labels the slow entry point of the code
 424         _M_f_fast               labels the fast entry point of the code
 425
 426         _M_f_info               labels the info pointer for the closure for f
 427                                 (NB the info ptr of a closure isn't public
 428                                 in the sense that these labels
 429                                 are.  It is private to a module, and
 430                                 its name can be a secret.)
 431 \end{verbatim}
 432
 433 These names are the REAL names that the linker sees. The initial underscores
 434 are attached by the C compiler.
 435
 436 A non-top-level closure has the same names, but as well as the \tr{f}
 437 the labels have the unique number, so that different local closures
 438 which share a name don't get confused.  The reason we need a naming
 439 convention at all is that with a little optimisation a tail call may
 440 jump direct to the fast entry of a locally-defined closure.
 441
 442 \tr{f} may be a constructor, in the case of closures which are the curried
 443 versions of the constructor.
 444
 445 For constructor closures, we have the following naming conventions, where
 446 the constructor is C defined in module M:
 447
 448 \begin{verbatim}
 449         _M_C_con_info           is the info ptr for the constructor
 450         _M_C_con_entry          is the corresponding code entry point
 451 \end{verbatim}
 452
 453 %************************************************************************
 454 %*                                                                      *
 455 \subsection{ENTRY CONVENTIONS}
 456 %*                                                                      *
 457 %************************************************************************
 458
 459 \begin{description}
 460 \item[Constructor objects:]
 461         On entry to the code for a constructor (\tr{_M_C_con_entry}), Node
 462         points to the constructor object.  [Even if the constructor has arity
 463         zero...]
 464
 465 \item[Non-top-level suspensions (both fast and slow entries):]
 466         Node points to the closure.
 467
 468 \item[Top-level suspensions, slow entry:]
 469         ReturnReg points to the slow entry point itself
 470
 471 \item[..ditto, fast entry:]
 472         No entry convention
 473 \end{description}
 474
 475
 476 %************************************************************************
 477 %*                                                                      *
 478 \subsection{CONSTRUCTOR RETURN CONVENTIONS}
 479 %*                                                                      *
 480 %************************************************************************
 481
 482 There is lots of excitement concerning the way in which constructors
 483 are returned to case expressions.
 484
 485 {\em Simplest version}
 486 %=====================
 487
 488 The return address on the stack points directly to some code.  It
 489 expects:
 490
 491 \begin{verbatim}
 492 Boxed objects:
 493         PtrReg1 points to the constructed value (in the heap) (unless arity=0)
 494         Tag     contains its tag (unless # of constructors = 1)
 495
 496 Unboxed Ints:   IntReg          contains the int
 497         Float:  FloatReg        contains the returned value
 498 \end{verbatim}
 499
 500 {\em Small improvement: vectoring}
 501 %=================================
 502
 503 If there are fewer than (say) 8 constructors in the type, the return
 504 address points to a vector of return addresses.  The constructor does
 505 a vectored return.  No CSwitch.
 506
 507 Complication: updates.  Update frames are built before the type of the
 508 thing which will be returned is known.  Hence their return address
 509 UPDATE has to be able to handle anything (vectored and nonvectored).
 510
 511 Hence the vector table goes BACKWARD from ONE WORD BEFORE the word
 512 pointed to by the return address.
 513
 514 {\em Big improvement: contents in registers}
 515 %===========================================
 516
 517 Constructor with few enough components (eg 8ish) return their
 518 arguments in registers.  [If there is only one constructor in the
 519 type, the tag register can be pressed into service for this purpose.]
 520
 521 Complication: updates.  Update frames are built before the type of the
 522 thing which will be returned is known.  Hence their return address
 523 UPDATE has to be able to handle anything.
 524
 525 So, a return address is a pointer to a PAIR of return addresses (or
 526 maybe a pointer to some code immediately preceded by a pointer to some
 527 code).
 528
 529 The ``main'' return address is just as before.
 530
 531 The ``update'' return address expects just the same regs to be in use
 532 as the ``main'' address, BUT AS WELL the magic loc UpdPtr points to a
 533 closure to be updated.  It carries out the update, and contines with
 534 the main return address.
 535
 536 The ``main'' code for UPDATE just loads UpdPtr the thing to be
 537 updated, and returns to the "update" entry of the next thing on the
 538 stack.
 539
 540 The ``update'' entry for UPDATE just overwrites the thing to be
 541 updated with an indirection to UpdPtr.
 542
 543 These two improvements can be combined orthogonally.
 544
 545
 546 %************************************************************************
 547 %*                                                                      *
 548 \subsection{REGISTERS}
 549 %*                                                                      *
 550 %************************************************************************
 551
 552 Separate registers for
 553 \begin{verbatim}
 554         C stack (incl interrupt handling, if this is not done on
 555                 another stk) (if interrupts don't mangle the C stack,
 556                 we could save it for most of the time and reuse the
 557                 register)
 558
 559         Arg stack
 560         Basic value and control stack
 561                 These two grow towards each other, so they are each
 562                 other's limits!
 563
 564         Heap pointer
 565 \end{verbatim}
 566
 567 And probably also
 568 \begin{verbatim}
 569         Heap limit
 570 \end{verbatim}
 571
 572
 573 %************************************************************************
 574 %*                                                                      *
 575 \subsection{THE OFFSET SWAMP}
 576 %*                                                                      *
 577 %************************************************************************
 578
 579 There are THREE kinds of offset:
 580 \begin{description}
 581 \item[virtual offsets:]
 582
 583     start at 1 at base of frame, and increase towards top of stack.
 584
 585     don't change when you adjust sp/hp.
 586
 587     independent of stack direction.
 588
 589     only exist inside the code generator, pre Abstract C
 590
 591     for multi-word objects, the offset identifies the word of the
 592     object with smallest offset
 593
 594 \item[reg-relative offsets:]
 595
 596     start at 0 for elt to which sp points, and increase ``into the
 597     interesting stuff.''
 598
 599     Specifically, towards
 600     \begin{itemize}
 601     \item
 602     bottom of stack (for SpA, SpB)
 603     \item
 604     beginning of heap (for Hp)
 605     \item
 606     end of closure (for Node)
 607     \end{itemize}
 608
 609     offset for a particular item changes when you adjust sp.
 610
 611     independent of stack direction.
 612
 613     exist in abstract C CVal and CAddr addressing modes
 614
 615     for multi-word objects, the offset identifies the word of the
 616     object with smallest offset
 617
 618 \item[real offsets:]
 619
 620     either the negation or identity of sp-relative offset.
 621
 622     start at 0 for elt to which sp points, and either increase or
 623     decrease towards bottom of stk, depending on stk direction
 624
 625     exist in real C, usually as a macro call passing an sp-rel offset
 626
 627     for multi-word objects, the offset identifies the word of the
 628     object with lowest address
 629 \end{description}
 630
 631 %************************************************************************
 632 %*                                                                      *
 633 \subsection{STACKS}
 634 %*                                                                      *
 635 %************************************************************************
 636
 637 There are two stacks, as in the STG paper.
 638 \begin{description}
 639 \item[A stack:]
 640 contains only closure pointers.  Its stack ptr is SpA.
 641
 642 \item[B stack:]
 643 contains basic values, return addresses, update frames.
 644 Its stack ptr is SpB.
 645 \end{description}
 646
 647 SpA and SpB point to the topmost allocated word of stack (though they
 648 may not be up to date in the middle of a basic block).
 649
 650 \subsubsection{STACK ALLOCATION}
 651
 652 A stack and B stack grow towards each other, so they overflow when
 653 they collide.
 654
 655 The A stack grows downward; the B stack grows upward.  [We'll try to
 656 localise stuff which uses this info.]
 657
 658 We can check for stack {\em overflow} not just at the start of a basic
 659 block, but at the start of an entire expression evaluation.  The
 660 high-water marks of case-expression alternatives can be max'd.
 661
 662 Within the code for a closure, the ``stack frame'' is deemed to start
 663 with the last argument taken by the closure (ie the one deepest in the
 664 stack).  Stack slots are can then be identified by ``virtual offsets''
 665 from the base of the frame; the bottom-most word of the frame has
 666 offset 1.
 667
 668 For multi-word slots (B stack only) the offset identifies the word
 669 with the smallest virtual offset. [If B grows upward, this is the word
 670 with the lowest physical address too.]
 671
 672 Since there are two stacks, a ``stack frame'' really consists of two
 673 stack frames, one on each stack.
 674
 675 For each stack, we keep track of the following:
 676
 677 \begin{verbatim}
 678 * virtSp        virtual stack ptr       offset of topmost occupied stack slot
 679                                         (initialised to 0 if no args)
 680
 681 * realSp        real stack ptr          offset of real stack ptr reg
 682                                         (initialised to 0 if no args)
 683
 684 * tailSp        tail-call ptr           offset of topmost slot to be retained
 685                                         at next tail call, excluding the
 686                                         argument to the tail call itself
 687
 688 * hwSp          high-water mark         largest value taken by virtSp
 689                                         in this closure body
 690 \end{verbatim}
 691
 692 The real stack pointer is (for now) only adjusted at the tail call itself,
 693 at which point it is made to point to the topmost occupied word of the stack.
 694
 695 We can't always adjust it at the beginning, because we don't
 696 necessarily know which tail call will be made (a conditional might
 697 intervene).  So stuff is actually put on the stack ``above'' the stack
 698 pointer.  This is ok because interrupts are serviced on a different
 699 stack.
 700
 701 The code generator works entirely in terms of stack {\em virtual
 702 offsets}.  The conversion to real addressing modes is done solely when
 703 we look up a binding.  When we move a stack pointer, the offsets of
 704 variables currently bound to stack offsets in the environment will
 705 change.  We provide operations in the @cgBindings@ type to perform
 706 this offset-change (to wit, @shiftStkOffsets@), leaving open whether
 707 it is done pronto, or kept separate and applied to lookups.
 708
 709 Stack overflow checking takes place at the start of a closure body, using
 710 the high-water mark information gotten from the closure body.
 711
 712
 713 %************************************************************************
 714 %*                                                                      *
 715 \subsection{HEAP ALLOCATION}
 716 %*                                                                      *
 717 %************************************************************************
 718
 719 Heap ptr reg (Hp) points to the last word of allocated space (and not
 720 to the first word of free space).
 721
 722 The heap limit register (HpLim) points to the last word of available
 723 space.
 724
 725 A basic block allocates a chunk of heap called a ``heap frame''.
 726 The word of the frame nearest to the previously-allocated stuff
 727 has virtual offset 1, and offsets increase from 1 to the size of the
 728 frame in words.
 729
 730 Closures are allocated with their code pointers having the lowest virtual
 731 offset.
 732
 733 NOTE: this means that closures are only laid out with code ptr at
 734 lowest PHYSICAL address if the heap grows upwards.
 735
 736 Heap ptr reg is moved at the beginning of a basic block to account for
 737 the allocation of the whole frame.  At this time a heap exhaustion
 738 check is made (has the heap ptr gone past the heap limit?).  In the
 739 basic block, indexed accesses off the heap ptr fill in this newly
 740 allocated block.  [Bias to RISC here: no cheap auto-inc mode, and free
 741 indexing.]
 742
 743 We maintain the following information during code generation:
 744
 745 \begin{verbatim}
 746 * virtHp        virtual heap ptr        offset of last word
 747                                         of the frame allocated so far
 748                                         Starts at 0 and increases.
 749 * realHp        virtual offset of
 750                 the real Hp register
 751 \end{verbatim}
 752
 753 Since virtHp only ever increases, it doubles as the heap high water mark.
 754
 755 \subsubsection{BINDINGS}
 756
 757 The code generator maintains info for each name about where it is.
 758 Each variable maps to:
 759
 760 \begin{verbatim}
 761         - its kind
 762
 763         - its volatile location:- a temporary variable
 764                                 - a virtual heap offset n, meaning the
 765                                         ADDRESS OF a word in the current
 766                                         heap frame
 767                                 - absent
 768
 769         - its stable location:  - a virtual stack offset n, meaning the
 770                                         CONTENTS OF an object in the
 771                                         current stack frame
 772                                 - absent
 773 \end{verbatim}
 774
 775 \subsubsection{ENTERING AN OBJECT}
 776
 777 When a closure is entered at the normal entry point, the magic locs
 778 \begin{verbatim}
 779         Node            points to the closure (unless it is a top-level closure)
 780         ReturnReg       points to the code being jumped to
 781 \end{verbatim}
 782 At the fast entry point, Node is still set up, but ReturnReg may not be.
 783 [Not sure about this.]