ghc/docs/NOTES.saving-space

   1 Ways to save code space
   2 ~~~~~~~~~~~~~~~~~~~~~~~
   3 SLPJ/BOS 16 Sept 94
   4
   5
   6
   7
   8
   9 Heap entry points
  10 ~~~~~~~~~~~~~~~~~
  11 We have lots of thunks of the form
  12
  13         let
  14            x = f p q r
  15         in ...
  16
  17 where f is know function of arity 3 (ie saturated).
  18 At the moment we generate special code for this one closure,
  19 which:
  20         pushes an update frame
  21         loads p,q,r into registers from the closure (or using
  22                 immediate loads if they are literals),
  23         jumps to f_fast.
  24
  25 Since there are quite a lot of thunks of this form, the idea is to
  26 generate some code (and its info table) just once, *with the
  27 definition of f*, which does exactly as described above.  We can then
  28 use this code for every thunk of (exactly) this form.  Call this
  29 the "heap entry" for f:
  30
  31         slow entry:     args on stack
  32         fast entry:     args in regs
  33         heap entry:     args in closure pointed to by Node
  34
  35 So the thunk for x would look like this:
  36
  37         -----------------
  38   x =   | * | p | q | r |
  39         --|--------------
  40           |
  41           |       common heap entry code for f
  42           ------> push update frame
  43                   R2 := R1[2]   -- Second arg
  44                   R3 := R1[3]   -- Third arg
  45                   R1 := R1[1]   -- First arg
  46                   goto f_fast
  47
  48 The jump to f_fast can be implemented as a fall-through.  (The
  49 slow entry point can take a jump instead!)
  50
  51 Of course there are also lots of thunks which *aren't* of the heap-entry
  52 form:
  53         x = case y of ...
  54         x = let v = ... in ...
  55         etc
  56
  57 Things to watch out for:
  58
  59 * Literal args.  Consider
  60
  61         x = f 2 p 4
  62
  63 We don't *have* to use the heap entry for f (we could generate special
  64 code+info table as we do now), but we *can* use it provided we
  65 generate a thunk with 2 and 4 stored in it as well as p:
  66
  67         -----------------
  68         | * | 2 | p | 4 |
  69         --|--------------
  70           |
  71           |       common heap entry code for f
  72           ------> push update frame
  73                   R2 := R1[2]   -- Second arg
  74                   R3 := R1[3]   -- Third arg
  75                   R1 := R1[1]   -- First arg
  76                   goto f_fast
  77
  78 (If we have special code the thunk needs only p stored in it, because
  79 the special code can use immediate constants for 2 and 4:
  80
  81         ---------
  82         | * | p |
  83         --|------
  84           |
  85           |     special code for x
  86           ----> push update frame
  87                 R2 := R1[1]     -- Second arg
  88                 R3 := 4         -- Third  arg
  89                 R1 := 2         -- First arg
  90                 goto f_fast
  91
  92
  93 * Single-entry thunks.  If x is a single-entry thunk, there's no need to
  94 push an update frame. That suggests:
  95
  96         ---------------
  97     x = | * | 2 | p 4 |
  98         --|------------
  99           |
 100           |       heap entry code for f
 101           ---->   -- NO!  NO!  push update frame
 102                   R2 := R1[2]   -- Second arg
 103                   R3 := R1[3]   -- Third arg
 104                   R1 := R1[1]   -- First arg
 105                   goto f_fast
 106
 107 Let's call the two variants the
 108         standard heap entry
 109 and     no-update heap entry
 110
 111 We can't fall through from the standard heap-entry code (which pushes
 112 an update frame) to the arg-loading code, because both need an info table.
 113 We have to take a jump.
 114
 115 For non-exported functions we may be able to see that only one of the
 116 two heap entries is required.
 117
 118 * Local functions.  When f is a *local* (ie not top-level) function, its
 119 fast-entry convention is that
 120          R1   = the function closure
 121          R2.. = the args
 122
 123 For example:
 124
 125         top p q = let
 126                         f = \r -> ..r..p...q...
 127                   in
 128                   let
 129                         x = f q
 130                   in
 131                   ...
 132
 133
 134 The shape of the heap-entry closure for f must be
 135
 136                -------------
 137         x  =   | * | f | q |
 138                --|----------
 139                  |
 140                  -------> heap entry code
 141                           must load *f* into R1 as well as q and
 142                           the other args
 143
 144
 145
 146
 147
 148 Avoiding generating entries and info tables
 149 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 150 At present, for every function we generate all of the following,
 151 just in case.  But they aren't always all needed, as noted below:
 152
 153 [NB: all of this applies only to *functions*.  Thunks always
 154 have closure, info table, and entry code.]
 155
 156
 157 * Fast-entry code  ALWAYS NEEDED
 158
 159 * Slow-entry code
 160         Needed iff (a) we have any un-saturated calls to the function
 161         OR         (b) the function is passed as an arg
 162
 163 * The function closure
 164         Needed iff (a) we have any un-saturated calls to the function
 165         OR         (b) the function is passed as an arg
 166         OR         (c) if the function has free vars (ie top level)
 167
 168   Why case (a) here?  Because if the arg-satis check fails,
 169   UpdatePAP stuffs a pointer to the function closure in the PAP.
 170   [Could be changed; UpdatePAP could stuff in a code ptr instead,
 171    but doesn't seem worth it.]
 172
 173   [NB: these conditions imply that we might need the closure
 174   without the slow-entry code.  Here's how.
 175
 176         f x y = let g w = ...x..y..w...
 177                 in
 178                 ...(g t)...
 179
 180   Here we need a closure for g which contains x and y,
 181   but since the calls are all saturated we just jump to the
 182   fast entry point for g, with R1 pointing to the closure for g.]
 183
 184
 185 * Slow-entry info table
 186         Needed iff (a) we have any un-saturated calls to the function
 187         OR         (b) the function is passed as an arg
 188         OR         (c) the function has free vars (ie top level)
 189
 190         NB.  (c) is only required so that the function closure has
 191         an info table to point to, to keep the storage manager happy.
 192         If (c) alone is true we could fake up an info table by choosing
 193         one of a standard family of info tables, whose entry code just
 194         bombs out.
 195
 196         If (c) is retained, then we'll sometimes generate an info table
 197         (for storage mgr purposes) without slow-entry code.  Then we need
 198         to use an error label in the info table to substitute for the absent
 199         slow entry code.
 200
 201 * Standard heap-entry code
 202   Standard heap-entry info table
 203         Needed iff we have any updatable thunks of the standard heap-entry shape.
 204
 205 * Single-update heap-entry code
 206   Single-update heap-entry info table
 207         Needed iff we have any non-updatable thunks of the
 208         standard heap-entry shape.
 209
 210
 211 All are needed if the function is exported, just to play safe.
 212
 213 Idea: generate just the stuff we need!
 214
 215
 216
 217 \begin{code}
 218 staticClosureRequired           -- Assumption: it's a top-level, no-free-var binding
 219         :: StgBinderInfo
 220         -> [Id]                 -- Args
 221         -> Bool
 222 staticClosureRequired (StgBinderInfo arg_occ unsat_occ _ _) args
 223   = arg_occ ||          -- There's an argument occurrence
 224     unsat_occ ||        -- There's an unsaturated call
 225     null args           -- It's a thunk
 226
 227 staticClosureRequired NoStgBinderInfo args = True
 228
 229
 230
 231 slowFunEntryCodeRequired        -- Assumption: it's a function, not a thunk.
 232         :: StgBinderInfo
 233         -> Bool
 234 slowFunEntryCodeRequired (StgBinderInfo arg_occ unsat_occ _ _)
 235   = arg_occ ||          -- There's an argument occurrence
 236     unsat_occ           -- There's an unsaturated call
 237 slowFunEntryCodeRequired NoStgBinderInfo = True
 238
 239
 240 funInfoTableRequired            -- Assumption: it's a function, not a thunk.
 241         :: Bool                 -- Top level?
 242         -> StgBinderInfo
 243         -> Bool
 244 funInfoTableRequired top_level (StgBinderInfo arg_occ unsat_occ _ _)
 245   = not top_level ||
 246     arg_occ ||          -- There's an argument occurrence
 247     unsat_occ           -- There's an unsaturated call
 248
 249 funInfoTableRequired top_level NoStgBinderInfo = True
 250 \end{code}