driver/ordering-passes

   1         Ordering the compiler's passes
   2         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   3
   4 Change notes
   5 ~~~~~~~~~~~~
   6 1  Nov 94       * NB: if float-out is done after strictness, remember to
   7                   switch off demandedness flags on floated bindings!
   8 13 Oct 94       * Run Float Inwards once more after strictness-simplify [andre]
   9  4 Oct 94       * Do simplification between float-in and strictness [andre]
  10                 * Ignore-inline-pragmas flag for final simplification [andre]
  11
  12 Aug 94          Original: Simon, Andy, Andre
  13
  14
  15
  16
  17 This ordering obeys all the constraints except (5)
  18 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  19
  20         full laziness
  21         simplify with foldr/build
  22         float-in
  23         simplify
  24         strictness
  25         float-in
  26
  27 [check FFT2 still gets benefits with this ordering]
  28
  29 =================================
  30         Constraints
  31 =================================
  32
  33 1. float-in before strictness.
  34 Reason: floating inwards moves definitions inwards to a site at which
  35 the binding might well be strict.
  36
  37 Example         let x = ... in
  38                     y = x+1
  39                 in
  40                 ...
  41 ===>
  42                 let y = let x = ... in x+1
  43                 in ...
  44
  45 The strictness analyser will do a better job of the latter
  46 than the former.
  47
  48 2. Don't simplify between float-in and strictness,
  49 unless you disable float-let-out-of-let, otherwise
  50 the simiplifier's local floating might undo some
  51 useful floating-in.
  52
  53 Example         let f = let y = .. in \x-> x+y
  54                 in ...
  55 ===>
  56                 let y = ...
  57                     f = \x -> x+y
  58                 in ...
  59
  60 This is a bad move, because now y isn't strict.
  61 In the pre-float case, the binding for y is strict.
  62 Mind you, this isn't a very common case, and
  63 it's easy to disable float-let-from-let.
  64
  65 3. Want full-laziness before foldr/build.
  66 Reason: Give priority to sharing rather than deforestation.
  67
  68 Example         \z -> let xs = build g
  69                       in foldr k z xs
  70 ===>
  71                 let xs = build g
  72                 in \x -> foldr k z xs
  73
  74 In the post-full-laziness case, xs is shared between all
  75 applications of the function.   If we did foldr/build
  76 first, we'd have got
  77
  78                 \z -> g k z
  79
  80 and now we can't share xs.
  81
  82
  83 4.  Want strictness after foldr/build.
  84 Reason: foldr/build makes new function definitions which
  85 can benefit from strictness analysis.
  86
  87 Example:        sum [1..10]
  88 ===> (f/b)
  89                 let g x a | x > 10    = a
  90                           | otherwise = g (x+1) (a+x)
  91
  92 Here we clearly want to get strictness analysis on g.
  93
  94
  95 5.  Want full laziness after strictness
  96 Reason: absence may allow something to be floated out
  97 which would not otherwise be.
  98
  99 Example         \z -> let x = f (a,z) in ...
 100 ===> (absence anal + inline wrapper of f)
 101                 \z -> let x = f.wrk a in ...
 102 ===> (full laziness)
 103                 let x= f.wrk a in  \z -> ...
 104
 105 TOO BAD.  This doesn't look a common case to me.
 106
 107
 108 6. Want float-in after foldr/build.
 109 Reason: Desugaring list comprehensions + foldr/build
 110 gives rise to new float-in opportunities.
 111
 112 Example         ...some list comp...
 113 ==> (foldr/build)
 114                 let v = h xs in
 115                 case ... of
 116                   []     -> v
 117                   (y:ys) -> ...(t v)...
 118 ==> (simplifier)
 119                 let v = h xs in
 120                 case ... of
 121                   [] -> h xs
 122                   (y:ys) -> ...(t v)...
 123
 124 Now v could usefully be floated into the second branch.
 125
 126 7. Want simplify after float-inwards.
 127 [Occurred in the prelude, compiling ITup2.hs, function dfun.Ord.(*,*)]
 128 This is due to the following (that happens with dictionaries):
 129
 130 let a1 = case v of (a,b) -> a
 131 in let m1 = \ c -> case c of I# c# -> case c# of 1 -> a1 5
 132                                                  2 -> 6
 133 in let m2 = \ c -> case c of I# c# ->
 134                      case c# +# 1# of cc# -> let cc = I# cc#
 135                                              in m1 cc
 136    in (m1,m2)
 137
 138 floating inwards will push the definition of a1 into m1 (supposing
 139 it is only used there):
 140
 141 in let m1 = let a1 = case v of (a,b) -> a
 142             in \ c -> case c of I# c# -> case c# of 1 -> a1 5
 143                                                     2 -> 6
 144 in let m2 = \ c -> case c of I# c# ->
 145                      case c# +# 1# of cc# -> let cc = I# cc#
 146                                              in m1 cc
 147    in (m1,m2)
 148
 149 if we do strictness analysis now we will not get a worker-wrapper
 150 for m1, because of the "let a1 ..." (notice that a1 is not strict in
 151 its body).
 152
 153 Not having this worker wrapper might be very bad, because it might
 154 mean that we will have to rebox arguments to m1 if they are
 155 already unboxed, generating extra allocations, as occurs with m2 (cc)
 156 above.
 157
 158 To solve this problem we have decided to run the simplifier after
 159 float-inwards, so that lets whose body is a HNF are floated out,
 160 undoing the float-inwards transformation in these cases.
 161 We are then back to the original code, which would have a worker-wrapper
 162 for m1 after strictness analysis and would avoid the extra let in m2.
 163
 164 What we lose in this case are the opportunities for case-floating
 165 that could be presented if, for example, a1 would indeed be demanded (strict)
 166 after the floating inwards.
 167
 168 The only way of having the best of both is if we have the worker/wrapper
 169 pass explicitly called, and then we could do with
 170
 171 float-in
 172 strictness analysis
 173 simplify
 174 strictness analysis
 175 worker-wrapper generation
 176
 177 as we would
 178 a) be able to detect the strictness of m1 after the
 179    first call to the strictness analyser, and exploit it with the simplifier
 180    (in case it was strict).
 181 b) after the call to the simplifier (if m1 was not demanded)
 182    it would be floated out just like we currently do, before stricness
 183    analysis II and worker/wrapperisation.
 184
 185 The reason to not do worker/wrapperisation twice is to avoid
 186 generating wrappers for wrappers which could happen.
 187
 188
 189 8.  If full laziness is ever done after strictness, remember to switch off
 190 demandedness flags on floated bindings!  This isn't done at the moment.
 191
 192
 193 Ignore-inline-pragmas flag for final simplification
 194 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 195 [Occurred in the prelude, compiling ITup2.hs, function dfun.Ord.(*,*)]
 196 Sometimes (e.g. in dictionary methods) we generate
 197 worker/wrappers for functions but the wrappers are never
 198 inlined. In dictionaries we often have
 199
 200 dict = let f1 = ...
 201            f2 = ...
 202            ...
 203        in (f1,f2,...)
 204
 205 and if we create worker/wrappers for f1,...,fn the wrappers will not
 206 be inlined anywhere, and we will have ended up with extra
 207 closures (one for the worker and one for the wrapper) and extra
 208 function calls, as when we access the dictionary we will be acessing
 209 the wrapper, which will call the worker.
 210 The simplifier never inlines workers into wrappers, as the wrappers
 211 themselves have INLINE pragmas attached to them (so that they are always
 212 inlined, and we do not know in advance how many times they will be inlined).
 213
 214 To solve this problem, in the last call to the simplifier we will
 215 ignore these inline pragmas and handle the workers and the wrappers
 216 as normal definitions. This will allow a worker to be inlined into
 217 the wrapper if it satisfies all the criteria for inlining (e.g. it is
 218 the only occurrence of the worker etc.).
 219
 220 Run Float Inwards once more after strictness-simplify
 221 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 222 [Occurred in the prelude, compiling IInt.hs, function const.Int.index.wrk]
 223 When workers are generated after strictness analysis (worker/wrapper),
 224 we generate them with "reboxing" lets, that simply reboxes the unboxed
 225 arguments, as it may be the case that the worker will need the
 226 original boxed value:
 227
 228 f x y = case x of
 229           (a,b) -> case y of
 230                      (c,d) -> case a == c of
 231                                 True -> (x,x)
 232                                 False -> ((1,1),(2,2))
 233
 234 ==> (worker/wrapper)
 235
 236 f_wrapper x y = case x of
 237                   (a,b) -> case y of
 238                              (c,d) -> f_worker a b c d
 239
 240 f_worker a b c d = let x = (a,b)
 241                        y = (c,d)
 242                    in case a == c of
 243                         True -> (x,x)
 244                         False -> ((1,1),(2,2))
 245
 246 in this case the simplifier will remove the binding for y as it is not
 247 used (we expected this to happen very often, but we do not know how
 248 many "reboxers" are eventually removed and how many are kept), and
 249 will keep the binding for x.  But notice that x is only used in *one*
 250 of the branches in the case, but is always being allocated! The
 251 floating inwards pass would push its definition into the True branch.
 252 A similar benefit occurs if it is only used inside a let definition.
 253 These are basically the advantages of floating inwards, but they are
 254 only exposed after the S.A./worker-wrapperisation of the code!  As we
 255 also have reasons to float inwards before S.A. we have to run it
 256 twice.
 257