ghc/docs/NOTES.update-mechanism

   1         The Glorious New Update Mechanism
   2         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   3         Simon & Jim Dec 93
   4
   5 Return convention
   6 ~~~~~~~~~~~~~~~~~
   7 When a constructor returns it makes sure that
   8
   9         R2 contains the info pointer for the constructor
  10         R1,R3.. contain the components (if return in regs)
  11         R1 points to the constructor object itself (if return in heap)
  12
  13 The info table for a constructor contains a pointer to the
  14 constructor's update code.  If a constructor returns to an
  15 update frame, the update frame's code just jumps direct to the
  16 constructor's update code, via the info pointer in R2.
  17
  18 This penalises slightly the return of a new constructor,
  19 because we have to load R2 with the info ptr.  [Fact: in runs
  20 of the compiler, 20-30% of all returns are of a new constructor;
  21 70-80% are existing constructors.]
  22
  23 Info tables
  24 ~~~~~~~~~~~
  25 Each dynamic-heap-allocated constructor has *two* info tables:
  26
  27 * the "NewCon" info table is put into R2 when returning a new
  28   constructor, which does not yet exist in the heap; R1 is dead!
  29   The "NewCon" info table has no GC entries, because it's only ever used
  30   when returning in regs, never installed in a real constructor.
  31
  32   The NewCon table also needs a valid tag field (see taggery below)
  33
  34 * the "ExistingCon" info table is used for all constructors allocated
  35   in the heap.
  36
  37 The update code for a constructor
  38 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  39 The update code for a constructor actually performs the update
  40 right away.  [At present, the update is deferred until we get
  41 back to the case expression.]  It knows how to do the update
  42 because the update code is constructor-specific.
  43
  44 Once it's done the update, it makes R1 point to the constructor object
  45 in the heap (which'll either be freshly-allocated, if big, or the
  46 updated thing itself), and (for non-niladic constructors) makes R2 point
  47 to the "ExistingCon" info table for the constructor.  (Of course the new
  48 constructor will also have an ExistingCon info ptr.)  For niladic
  49 constructors, we do *not* use the "ExistingCon" info table.  We continue
  50 to overwrite updatees in-place, because this saves us an indirection
  51 prior to garbage collection (and the extra niladic constructors disappear
  52 during the next garbage collection anyway).
  53
  54 The update code in the ExistingCon info table simply updates with an
  55 indirection, using R1.  I *think* this can be one standard piece of
  56 code.  The only doubt here concerns GC; if updating with an
  57 indirection can cause GC (possible on GRIP?  or generational GC?),
  58 then we need to know which regs are live.  We can solve this by
  59 putting a liveness mask in the info table too.  [Arguably we want
  60 that anyway; consider returning to the bottom of a stack object.]
  61 So a liveness mask in the info table is probably a good idea.
  62
  63 Constructors which return in heap return with an ExistingCon info
  64 ptr.  They don't need a NewCon info table at all.
  65
  66 Notice that this means that when we return an *existing* constructor,
  67 to an update frame, the update is done with an indirection, rather
  68 than [as now] copying the constructor afresh.  This solves the space duplication
  69 problem which shows up in "clausify".
  70
  71 GC: R1 might be dead; R2 is a non-ptr.  So this return convention
  72 relies on using liveness masks for GC reg-liveness info, not the
  73 old no-of-live-ptrs info.
  74
  75 Taggery
  76 ~~~~~~~
  77
  78   [Current: For unvectored returns with more than one constructor, we
  79   currently load TagReg, and scrutinise it in the case expression.
  80   Worse, we also have to scrutinise TagReg in the update entry of the
  81   return vector.]
  82
  83 In the new world, updates are handled without any nonsense.  No need
  84 to look at any register, becase we just jump to the constructor
  85 specific update code.
  86
  87 Since we have an info ptr in R2, we can get the tag out of the info
  88 table, thus getting rid of TagReg altogether.  (This could conceivably
  89 be a *lose* on a machine with lots of regs, because it replaces a
  90 immediate small-const load by a memory fetch of the tag from the info
  91 table.
  92
  93 Not clear whether this is worth trying to improve.  Could
  94
  95         a) #define TagReg to be a register or an offset from R2
  96         b) issue a SET_TAG macro in the entry code for a constructor,
  97            which usually expands to nothing
  98
  99 [NB 75-95% of all returns are vectored in runs of the compiler itself.]
 100
 101 The entry code for a constructor
 102 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 103 The real reason the registers are assigned as above is to make the
 104 entry code for a constructor simple.  When the entry code is executed,
 105 we have a new entry convention:
 106
 107         R1 points to the object
 108         R2 is its info pointer
 109
 110 (Why? because we usually enter it by indirecting through its info
 111 table, so it seems a shame to load the info ptr from memory twice.)
 112
 113 So all the entry code has to do is to return (perhaps vectored-ly).
 114 (Maybe load TagReg, usually not --- see above.)
 115
 116 NB this entry convention applies, of course, to all thunks as
 117 well as constructors -- whenever we enter an unknown object via R1 (Node).
 118
 119 Case expressions
 120 ~~~~~~~~~~~~~~~~
 121 Return vectors no longer need update code.
 122
 123 Unvectored returns can therefore be *direct* to the code,
 124 rather than *indirect* via a 2-entry vector.
 125
 126 Penalty for this improvement: "polymorphic" return vectors,
 127 notably that in an update frame, needs to accomodate either
 128 a direct or a vectored return.  So it has to look like:
 129
 130         UpdVec: jmp UnvectoredUpd
 131                 .word UpdVec0
 132                 .word UpdVec1
 133                 ...
 134
 135 that is, the return vector is offset by some fixed amount
 136 from the pointer put on the stack.  Or, it could be done
 137 backwards:
 138
 139                 ...
 140                 .word UpdVec1
 141                 .word UpdVec0
 142         UpdVec: ...code for UnvectoredUpd...
 143
 144 and then vectored returns would use negative offsets.
 145
 146 This grunge is necessary *only* for a fixed set of polymorphic return
 147 vectors, part of the runtime system:
 148
 149         - update frames
 150         - restore cost centres
 151         - seq, par
 152         - thread base
 153         - stack object base
 154
 155 Case expressions generate either a direct return, or a vector,
 156 but never a combination.
 157
 158 Update Frames
 159 ~~~~~~~~~~~~~
 160
 161 Standard update frames are still required if we don't know the type of
 162 the constructor being returned.  However, we often do know the type.  In
 163 this case, we can generate a type-specific updating return-vector to place
 164 in the update frame rather than the StdUpdRetVector.  This saves us one
 165 level of indirection.
 166
 167 Partial applications
 168 ~~~~~~~~~~~~~~~~~~~~
 169 PAPs are basically handled just the same way as at present.
 170
 171 Changes from now
 172 ~~~~~~~~~~~~~~~~
 173 * UpdReg dies.
 174 * TagReg dies.
 175 * RetVecReg dies.  (Previously needed to do return after update.)
 176 * Return vectors have half the number of entries.
 177 * Unvectored returns go direct.
 178 * Polymorphic seq/par and friends.
 179 * No space duplication problem (cf clausify)
 180
 181
 182 Glosses
 183 ~~~~~~~
 184 Tag and update code are needed only for constructor info tables.
 185 It seems a shame to take up space in other info tables (ie 99% of them).
 186
 187 Possibilities:
 188
 189 - use an indirection to GC code, so the vari-sized gc stuff becomes
 190   fixed
 191 - put the tag/upd code ptrs before the start of the info table.  (or
 192   between the info table and code when reversing info tables...)
 193
 194
 195 Looks tricky to me.