% - I'd like the transformation rules to appear clearly-identified in
% a box of some kind, so they can be distinguished from the examples.
\documentstyle[slpj,11pt]{article}
\renewcommand{\textfraction}{0.2}
\renewcommand{\floatpagefraction}{0.7}
\title{How to simplify matters}
\author{Simon Peyton Jones and Andre Santos\\
Department of Computing Science, University of Glasgow, G12 8QQ \\
@simonpj@@dcs.gla.ac.uk@
Quite a few compilers use the {\em compilation by transformation} idiom.
The idea is that as much as possible of the compilation process is
expressed as correctness-preserving transformations, each of which
transforms a program into a semantically-equivalent
program that (hopefully) executes more quickly or in less space.
Functional languages are particularly amenable to this approach because
they have a particularly rich family of possible transformations.
Examples of transformation-based compilers
include the Orbit compiler,[.kranz orbit thesis.]
Kelsey's compilers,[.kelsey thesis, hudak kelsey principles 1989.]
the New Jersey SML compiler,[.appel compiling with continuations.]
and the Glasgow Haskell compiler.[.ghc JFIT.] Of course many, perhaps most,
other compilers also use transformation to some degree.
Compilation by transformation uses automatic transformations; that is, those
which can safely be applied automatically by a compiler. There
is also a whole approach to programming, which we might call {\em programming by transformation},
in which the programmer manually transforms an inefficient specification into
an efficient program. This development process might be supported by
a programming environment which does the book-keeping, but the key steps
are guided by the programmer. We focus exclusively on automatic transformations.
Automatic program transformations seem to fall into two broad categories:
\item {\bf Glamorous transformations} are global, sophisticated,
intellectually satisfying transformations, sometimes guided by some
interesting kind of analysis.
lambda lifting,[.johnsson lambda lifting.]
full laziness,[.hughes thesis, lester spe.]
closure conversion,[.appel jim 1989.]
deforestation,[.wadler 1990 deforestation, marlow wadler deforestation Glasgow92, chin phd 1990 march, gill launchbury.]
transformations based on strictness analysis,[.peyton launchbury unboxed.]
and so on. It is easy to write papers about these sorts of transformations.
\item {\bf Humble transformations} are small, simple, local transformations,
which individually look pretty trivial. Here are two simple examples\footnote{
The notation @E[]@ stands for an arbitrary expression with zero or more holes.
The notation @E[e]@ denotes @E[]@ with the holes filled in by the expression @e@.
We implicitly assume that no name-capture happens --- it's just
a short-hand, not an algorithm.
let x = y in E[x]   ===>   E[y]
case (x:xs) of      ===>   E1[x,xs]
Transformations of this kind are almost embarrassingly simple. How could
anyone write a paper about them?
This paper is about humble transformations, and how to implement them.
Although each individual
transformation is simple enough, there is a scaling issue:
there are a large number of candidate transformations to consider, and
there are a very large number of opportunities to apply them.
In the Glasgow Haskell compiler, all humble transformations
are performed by the so-called {\em simplifier}.
Our goal in this paper is to give an overview of how the simplifier works, what
transformations it applies, and what issues arose in its design.
\section{The language}
Mutter mutter. Important points:
\item Second order lambda calculus.
\item Arguments are variables.
\item Unboxed data types, and unboxed cases.
Less important points:
\item Constructors and primitives are saturated.
\item if-then-else desugared to @case@
\section{Transformations}
This section lists all the transformations implemented by the simplifier.
Because it is a complete list, it is a long one.
We content ourselves with a brief statement of each transformation,
augmented with forward references to Section~\ref{sect:composing}
which gives examples of the ways in which the transformations can compose together.
\subsection{Beta reduction}
If a lambda abstraction is applied to an argument, we can simply
beta-reduce. This applies equally to ordinary lambda abstractions and
(\x -> E[x]) arg    ===>   E[arg]
(/\a -> E[a]) ty    ===>   E[ty]
There is no danger of duplicating work because the argument is
guaranteed to be a simple variable or literal.
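To make the rule concrete, here is a small self-contained Haskell sketch of beta reduction over a toy Core-like language in which arguments are already atomic. The data types and names are ours, purely for illustration; they are not GHC's actual representation. Shadowed binders are simply skipped, relying on the assumption (which the real simplifier establishes by renaming) that no capture occurs.

```haskell
-- Illustrative sketch only: a toy Core-like language with atomic arguments.
data Atom = Var String | Lit Int deriving (Eq, Show)

data Expr = AtomE Atom
          | Lam String Expr          -- \x -> body
          | App Expr Atom            -- application to an atomic argument
          deriving (Eq, Show)

-- Substitute an atom for a variable.  Because the argument is atomic,
-- substituting it freely cannot duplicate work.
subst :: String -> Atom -> Expr -> Expr
subst x a (AtomE (Var y)) | x == y = AtomE a
subst _ _ e@(AtomE _)              = e
subst x a (Lam y b)
  | x == y    = Lam y b              -- x is shadowed; stop here
  | otherwise = Lam y (subst x a b)
subst x a (App f arg) = App (subst x a f) (substAtom arg)
  where
    substAtom (Var y) | x == y = a
    substAtom other            = other

-- The beta-reduction transformation: (\x -> E[x]) arg ===> E[arg]
betaReduce :: Expr -> Expr
betaReduce (App (Lam x body) arg) = subst x arg body
betaReduce e                      = e
```

For example, @betaReduce@ rewrites the application of @\x -> f x@ to the literal @3@ into @f 3@ directly, with no residual binding.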
\subsubsection{Floating applications inward}
Applications can be floated inside a @let(rec)@ or @case@ expression.
This is a good idea, because they might find a lambda abstraction inside
(let(rec) Bind in E) arg   ===>   let(rec) Bind in (E arg)
(case E of {P1 -> E1;...; Pn -> En}) arg
   ===>
case E of {P1 -> E1 arg; ...; Pn -> En arg}
\subsection{Transformations concerning @let(rec)@}
\subsubsection{Floating @let@ out of @let@}
It is sometimes useful to float a @let(rec)@ out of a @let(rec)@ right-hand
let x = let(rec) Bind in B1      ===>   let(rec) Bind in
letrec x = let(rec) Bind in B1   ===>   let(rec) Bind
\subsubsection{Floating @case@ out of @let@}
\subsubsection{@let@ to @case@}
\subsection{Transformations concerning @case@}
\subsubsection{Case of known constructor}
If a @case@ expression scrutinises a constructor,
the @case@ can be eliminated. This transformation is a real
win: it eliminates a whole @case@ expression.
case (C a1 .. an) of   ===>   E[a1..an]
C b1 .. bn -> E[b1..bn]
If none of the constructors in the alternatives match, then
the default is taken:
case (C a1 .. an) of   ===>   let y = C a1 .. an
...[no alt matches C]...      in E
There is an important variant of this transformation when
the @case@ expression scrutinises a {\em variable}
which is known to be bound to a constructor.
This situation can arise for two reasons:
\item An enclosing @let(rec)@ binding binds the variable to a constructor.
let x = C p q in ... (case x of ...) ...
\item An enclosing @case@ expression scrutinises the same variable.
C p q -> ... (case x of ...) ...
This situation is particularly common, as we discuss in Section~\ref{sect:repeated-evals}.
In each of these examples, @x@ is known to be bound to @C p q@
at the inner @case@. The general rules are:
case x of {...; C b1 .. bn -> E[b1..bn]; ...}
   ===>   {x bound to C a1 .. an}
case x of {...[no alts match C]...; y -> E[y]}
   ===>   {x bound to C a1 .. an}
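The basic case-of-known-constructor rule is easy to express over a toy representation. The following Haskell sketch is illustrative only: the data types are ours, the treatment of defaults is simplified, and, as throughout, we assume binders have already been renamed apart so substitution cannot capture.

```haskell
-- Illustrative sketch only; not GHC's representation.
data Expr = Var String
          | Con String [String]         -- saturated constructor of variables
          | Case Expr [Alt]
          | Let String Expr Expr
          deriving (Eq, Show)

data Alt = Alt String [String] Expr     -- C b1 .. bn -> rhs
         | Default String Expr          -- y -> rhs
         deriving (Eq, Show)

-- Rename a variable through an environment of variable-to-variable bindings.
ren :: [(String, String)] -> String -> String
ren env x = maybe x id (lookup x env)

-- Substitute variables for variables (capture is ignored here; the real
-- simplifier renames every binder first).
substV :: [(String, String)] -> Expr -> Expr
substV env (Var x)       = Var (ren env x)
substV env (Con c xs)    = Con c (map (ren env) xs)
substV env (Case s alts) = Case (substV env s) (map sub alts)
  where sub (Alt c bs r)  = Alt c bs (substV env r)
        sub (Default y r) = Default y (substV env r)
substV env (Let x r b)   = Let x (substV env r) (substV env b)

-- case (C a1..an) of {...; C b1..bn -> E[b1..bn]; ...} ===> E[a1..an]
-- If no alternative matches, the default is taken, binding y to the
-- scrutinised constructor.
caseOfKnownCon :: Expr -> Expr
caseOfKnownCon e@(Case (Con c as) alts) = go alts
  where
    go (Alt c' bs rhs : rest)
      | c == c'   = substV (zip bs as) rhs
      | otherwise = go rest
    go (Default y rhs : _) = Let y (Con c as) rhs
    go []                  = e
caseOfKnownCon e = e
```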
\subsubsection{Dead alternative elimination}
   ===>   {x *not* bound to C}
We might know that @x@ is not bound to a particular constructor
because of an enclosing case:
Inside @E1@ we know that @x@ is bound to @C@.
However, if the type has more than two constructors,
inside @E2@ all we know is that @x@ is {\em not} bound to @C@.
This applies to unboxed cases also, in the obvious way.
\subsubsection{Case elimination}
If we can prove that @x@ is not bottom, then this rule applies.
We might know that @x@ is non-bottom because:
\item @x@ has an unboxed type.
\item There's an enclosing case which scrutinises @x@.
\item It is bound to an expression which provably terminates.
Since this transformation can only improve termination, even if we apply it
when @x@ is not provably non-bottom, we provide a compiler flag to
enable it all the time.
\subsubsection{Case of error}
case (error ty E) of Alts   ===>   error ty' E
        where ty' is the type of the whole case expression
Mutter about types. Mutter about variables bound to error.
Mutter about disguised forms of error.
\subsubsection{Floating @let(rec)@ out of @case@}
A @let(rec)@ binding can be floated out of a @case@ scrutinee:
case (let(rec) Bind in E) of Alts   ===>   let(rec) Bind in
                                           case E of Alts
This increases the likelihood of a case-of-known-constructor transformation,
because @E@ is not hidden from the @case@ by the @let(rec)@.
\subsubsection{Floating @case@ out of @case@}
Analogous to floating a @let(rec)@ from a @case@ scrutinee is
floating a @case@ from a @case@ scrutinee. We have to be
careful, though, about code size. If there's only one alternative
in the inner case, things are easy:
case (case E of {P -> R}) of   ===>   case E of {P -> case R of
If there's more than one alternative there's a danger
that we'll duplicate @S1@...@Sm@, which might be a lot of code.
Our solution is to create a new local definition for each alternative:
case (case E of {P1 -> R1; ...; Pn -> Rn}) of
let s1 = \x1 ... z1 -> S1
    sm = \xm ... zm -> Sm
P1 -> case R1 of {Q1 -> s1 x1 ... z1; ...; Qm -> sm xm ... zm}
Pn -> case Rn of {Q1 -> s1 x1 ... z1; ...; Qm -> sm xm ... zm}
Here, @x1 ... z1@ are that subset of
variables bound by the pattern @Q1@ which are free in @S1@, and
similarly for the other @si@.
Is this transformation a win? After all, we have introduced @m@ new
functions! Section~\ref{sect:join-points} discusses this point.
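For concreteness, here is a toy Haskell sketch of the many-alternative rule. It is a deliberate simplification of the rule above: patterns are bare constructor names, so the join points take no arguments, and the fresh names s1, s2, ... are simply assumed not to clash with existing ones (the real simplifier draws them from its unique supply).

```haskell
-- Illustrative sketch only; patterns carry no bound variables, so the
-- join points are nullary.
data Expr = Var String
          | Case Expr [(String, Expr)]     -- case e of { Pi -> Ei }
          | Let [(String, Expr)] Expr      -- non-recursive let bindings
          deriving (Eq, Show)

-- case (case E of {Pi -> Ri}) of {Qj -> Sj}
--   ===>
-- let sj = Sj in case E of {Pi -> case Ri of {Qj -> sj}}
caseOfCase :: Expr -> Expr
caseOfCase (Case (Case scrut inner) outer) =
    Let joins (Case scrut [ (p, Case r calls) | (p, r) <- inner ])
  where
    numbered = zip [1 :: Int ..] outer
    joins    = [ (jname i, rhs)      | (i, (_, rhs)) <- numbered ]
    calls    = [ (q, Var (jname i))  | (i, (q, _))   <- numbered ]
    jname i  = "s" ++ show i         -- assumed fresh
caseOfCase e = e
```

Each outer alternative's right-hand side appears exactly once, in a @let@ binding, and the duplicated inner alternatives mention only the small calls @s1@, @s2@, \ldots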
\subsubsection{Case merging}
Any alternatives in @[more alts]@ which are already covered by @[some alts]@
should first be eliminated by the dead-alternative transformation.
\subsection{Constructor reuse}
\subsection{Inlining}
The inlining transformation is simple enough:
let x = R in B[x]   ===>   B[R]
Inlining is more conventionally used to describe the instantiation of a function
body at its call site, with arguments substituted for formal parameters. We treat
this as a two-stage process: inlining followed by beta reduction. Since we are
working with a higher-order language, not all the arguments may be available at every
call site, so separating inlining from beta reduction allows us to concentrate on
one problem at a time.
The choice of exactly {\em which} bindings to inline has a major impact on efficiency.
Specifically, we need to consider the following factors:
Inlining a function at its call site, followed by some beta reduction,
very often exposes opportunities for further transformations.
We inline many simple arithmetic and boolean operators for this reason.
Inlining can increase code size.
Inlining can duplicate work, for example if a redex is inlined at more than one site.
Duplicating a single expensive redex can ruin a program's efficiency.
Our inlining strategy depends on the form of @R@:
\subsubsection{Dead code removal}
If a @let@-bound variable is not used the binding can be dropped:
let x = E in B   ===>   B
A similar transformation applies for @letrec@-bound variables.
Programmers seldom write dead code, of course, but bindings often become dead when they
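Dead-code removal for non-recursive bindings amounts to an occurrence count followed by a conditional rewrite. The following Haskell sketch (again with toy types of our own devising, not GHC's) makes that concrete:

```haskell
-- Illustrative sketch only: drop  let x = E in B  when x is unused in B.
data Expr = Var String
          | Lit Int
          | App Expr Expr
          | Lam String Expr
          | Let String Expr Expr
          deriving (Eq, Show)

-- Count free occurrences of a variable, respecting shadowing.
occurs :: String -> Expr -> Int
occurs x (Var y)     = if x == y then 1 else 0
occurs _ (Lit _)     = 0
occurs x (App f a)   = occurs x f + occurs x a
occurs x (Lam y b)   = if x == y then 0 else occurs x b
occurs x (Let y r b) = occurs x r + (if x == y then 0 else occurs x b)

-- let x = E in B  ===>  B,  provided x does not occur free in B
dropDeadLet :: Expr -> Expr
dropDeadLet (Let x _ body)
  | occurs x body == 0 = body
dropDeadLet e          = e
```

In the real simplifier the occurrence information comes from the analysis pass (Step 1), rather than being recomputed at each binding as it is here.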
\section{Composing transformations}
\label{sect:composing}
The really interesting thing about humble transformations is the way in which
they compose together to carry out substantial and useful transformations.
This section gives a collection of motivating examples, all of which have
shown up in real application programs.
\subsection{Repeated evals}
\label{sect:repeated-evals}
Example: x+x, as in unboxed paper.
\subsection{Lazy pattern matching}
Lazy pattern matching is pretty inefficient. Consider:
x = case t of (x,y) -> x
y = case t of (x,y) -> y
This code allocates three thunks! However, if @B@ is strict in {\em either}
@x@ {\em or} @y@, then the strictness analyser will easily spot that
the binding for @t@ is strict, so we can do a @let@-to-@case@ transformation:
(x,y) -> let t = (x,y) in
         let x = case t of (x,y) -> x
             y = case t of (x,y) -> y
whereupon the case-of-known-constructor transformation
eliminates the @case@ expressions in the right-hand side of @x@ and @y@,
and @t@ is then spotted as being dead, so we get
\subsection{Join points}
\label{sect:join-points}
One motivating example is this:
if (not x) then E1 else E2
After desugaring the conditional, and inlining the definition of
case (case x of {True -> False; False -> True}) of
Now, if we apply our case-of-case transformation we get:
True  -> case False of {True -> e1; False -> e2}
False -> case True  of {True -> e1; False -> e2}
Now the case-of-known-constructor transformation applies:
Since there is now only one occurrence of @e1@ and @e2@ we can
inline them, giving just what we hoped for:
case x of {True -> E2; False -> E1}
The point is that the local definitions will often disappear again.
\subsubsection{How join points occur}
But what if they don't disappear? Then the definitions @s1@ ... @sm@
play the role of ``join points''; they represent the places where
execution joins up again, having forked at the @case x@. The
``calls'' to the @si@ should really be just jumps. To see this more clearly
consider the expression
if (x || y) then E1 else E2
A C compiler will ``short-circuit'' the
evaluation of the condition if @x@ turns out to be true,
and generate code something like this:
if (y) {...code for E2...}
l1: ...code for E1...
In our setting, here's what will happen. First we desugar the
conditional, and inline the definition of @||@:
case (case x of {True -> True; False -> y}) of
Now apply the case-of-case transformation:
True  -> case True of {True -> e1; False -> e2}
False -> case y    of {True -> e1; False -> e2}
Unlike the @not@ example, only one of the two inner cases
simplifies, and we can therefore only inline @e2@, because
@e1@ is still mentioned twice\footnote{Unless the
inlining strategy decides that @E1@ is small enough to duplicate;
it is used in separate @case@ branches so there's no concern about duplicating
work. Here's another example of the way in which we make one part of the
simplifier (the inlining strategy) help with the work of another (@case@-expression
False -> case y of {True -> e1; False -> e2}
\subsubsection{Case of @error@}
The case-of-error transformation is often exposed by the case-of-case
transformation. Consider
After inlining @hd@, we get
case (case xs of [] -> error "hd"; (x:_) -> x) of
(I've omitted the type argument of @error@ to save clutter.)
Now doing case-of-case gives
[] -> case (error "hd") of { True -> e1; False -> e2 }
(x:_) -> case x of { True -> e1; False -> e2 }
Now the case-of-error transformation springs to life, after which
we can inline @e1@ and @e2@:
(x:_) -> case x of {True -> E1; False -> E2}
\subsection{Nested conditionals combined}
Sometimes programmers write something which should be done
by a single @case@ as a sequence of tests:
if x==0::Int then E0 else
After eliminating some redundant evals and doing the case-of-case
transformation we get
The case-merging transformation puts these together to get
Sometimes the sequence of tests cannot be eliminated from the source
code because of overloading:
f :: Num a => a -> Bool
If we specialise @f@ to @Int@ we'll get the previous example again.
\subsection{Error tests eliminated}
The elimination of redundant alternatives, and then of redundant cases,
arises when we inline functions which do error checking. A typical
if (x `rem` y) == 0 then (x `div` y) else y
Here, both @rem@ and @div@ do an error-check for @y@ being zero.
The second check is eliminated by the transformations.
After transformation the code becomes:
0# -> error "rem: zero divisor"
_  -> case x# rem# y# of
        0# -> case x# div# y# of
\subsection{Atomic arguments}
At this point it is possible to appreciate the usefulness of
the Core-language syntax requirement that arguments are atomic.
For example, suppose that arguments could be arbitrary expressions.
Here is a possible transformation:
f (case x of (p,q) -> p)
   ===>   {f strict in its second argument}
case x of (p,q) -> f (p,p)
Doing this transformation would be useful, because now the
argument to @f@ is a simple variable rather than a thunk.
However, if arguments are atomic, this transformation becomes
just a special case of floating a @case@ out of a strict @let@:
let a = case x of (p,q) -> p
   ===>   {(f a) strict in a}
case x of (p,q) -> let a=p in f a
   ===>
case x of (p,q) -> f p
There are many examples of this kind. For almost any transformation
involving @let@ there is a corresponding one involving a function
argument. The same effect is achieved with much less complexity
by restricting function arguments to be atomic.
\subsection{Renaming and cloning}
Every program-transformation system has to worry about name capture.
For example, here is an erroneous transformation:
(\x -> \y -> x + y) (y+3)
The transformation fails because the originally free occurrence
of @y@ in the argument @y+3@ has been ``captured'' by the @\y@-abstraction.
There are various sophisticated solutions to this difficulty, but
we adopted a very simple one: we uniquely rename every locally-bound identifier
on every pass of the simplifier.
Since we are in any case producing an entirely new program (rather than side-effecting
an existing one) it costs very little extra to rename the identifiers as we go.
So our example would become
(\x -> \y -> x + y) (y+3)
The simplifier accepts as input a program which has arbitrary bound
variable names, including ``shadowing'' (where a binding hides an
outer binding for the same identifier), but it produces a program in
which every bound identifier has a distinct name.
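The renaming pass itself can be sketched in a few lines of Haskell. This is an illustration of the idea, not GHC's code: binders are made unique by appending a counter drawn from a threaded integer supply, and an environment maps each in-scope old name to its new one, so shadowed bindings resolve correctly.

```haskell
-- Illustrative sketch only: give every binder a fresh name.
data Expr = Var String
          | App Expr Expr
          | Lam String Expr
          deriving (Eq, Show)

-- The environment maps old names to new; the Int is the next unique.
rename :: [(String, String)] -> Int -> Expr -> (Expr, Int)
rename env n (Var x)   = (Var (maybe x id (lookup x env)), n)
rename env n (App f a) = let (f', n')  = rename env n  f
                             (a', n'') = rename env n' a
                         in (App f' a', n'')
rename env n (Lam x b) = let x'       = x ++ "_" ++ show n   -- fresh name
                             (b', n') = rename ((x, x') : env) (n + 1) b
                         in (Lam x' b', n')
```

Because a new binding is consed onto the front of the environment, an inner binder for @y@ shadows an outer one, and occurrences of @y@ in the body correctly pick up the innermost renaming.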
Both the ``old'' and ``new'' identifiers have type @Id@, but when writing
type signatures for functions in the simplifier we use the types @InId@, for
identifiers from the input program, and @OutId@ for identifiers from the output program:
This nomenclature extends naturally to expressions: a value of type @InExpr@ is an
expression whose identifiers are from the input-program name-space, and similarly
\section{The simplifier}
The basic algorithm followed by the simplifier is:
\item Analyse: perform occurrence analysis and dependency analysis.
\item Simplify: apply as many transformations as possible.
\item Iterate: perform the above two steps repeatedly until no further transformations are possible.
(A compiler flag allows the programmer to bound the maximum number of iterations.)
We make an effort to apply as many transformations as possible in Step
2. To see why this is a good idea, just consider a sequence of
transformations in which each transformation enables the next. If
each iteration of Step 2 only performs one transformation, then the
entire program will be re-analysed by Step 1, and re-traversed by
Step 2, for each transformation of the sequence. Sometimes this is
unavoidable, but it is often possible to perform a sequence of
transformations in a single pass.
The key function, which simplifies expressions, has the following type:
simplExpr :: SimplEnv
          -> InExpr -> [OutArg]
          -> SmplM OutExpr
The monad, @SmplM@, can quickly be disposed of. It has only two purposes:
\item It plumbs around a supply of unique names, so that the simplifier can
easily invent new names.
\item It gathers together counts of how many of each kind of transformation
has been applied, for statistical purposes. These counts are also used
in Step 3 to decide when the simplification process has terminated.
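A monad with exactly these two purposes is easy to sketch. The definition below is a hypothetical reconstruction, not GHC's actual @SmplM@: it is a state monad whose state pairs the next unique with a count of transformations performed.

```haskell
-- Illustrative sketch of a SmplM-like monad; not GHC's actual definition.
-- State: (next unique, number of transformations applied so far).
newtype SmplM a = SmplM { runSmplM :: (Int, Int) -> (a, (Int, Int)) }

instance Functor SmplM where
  fmap f m = SmplM $ \s -> let (a, s') = runSmplM m s in (f a, s')

instance Applicative SmplM where
  pure a    = SmplM $ \s -> (a, s)
  mf <*> ma = SmplM $ \s -> let (f, s')  = runSmplM mf s
                                (a, s'') = runSmplM ma s'
                            in (f a, s'')

instance Monad SmplM where
  m >>= k = SmplM $ \s -> let (a, s') = runSmplM m s
                          in runSmplM (k a) s'

-- Invent a fresh name from the unique supply.
newName :: SmplM String
newName = SmplM $ \(u, c) -> ("s" ++ show u, (u + 1, c))

-- Record that a transformation fired; the iteration logic (Step 3)
-- can stop when a whole pass records no ticks.
tick :: SmplM ()
tick = SmplM $ \(u, c) -> ((), (u, c + 1))
```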
The signature can be understood like this:
\item The environment, of type @SimplEnv@, provides information about
identifiers bound by the enclosing context.
\item The second and third arguments together specify the expression to be simplified.
\item The result is the simplified expression, wrapped up by the monad.
The simplifier's invariant is this:
@simplExpr@~env~expr~[a_1,\ldots,a_n] = expr[env]~a_1~\ldots~a_n
That is, the expression returned by $@simplExpr@~env~expr~[a_1,\ldots,a_n]$
is semantically equal to (though hopefully more efficient than)
$expr$, with the renamings in $env$ applied to it, applied to the arguments
\subsection{Application and beta reduction}
The arguments are carried ``inwards'' by @simplExpr@, as an accumulating parameter.
This is a convenient way of implementing the transformations which float
arguments inside a @let@ and @case@. This list of pending arguments
requires a new data type, @CoreArg@, along with its ``in'' and ``out'' synonyms,
because an argument might be a type or an atom:
data CoreArg bindee = TypeArg UniType
                    | ValArg  (CoreAtom bindee)
type InArg  = CoreArg InId
type OutArg = CoreArg OutId
The equations for applications simply apply
the environment to the argument (to handle renaming) and put the result
on the argument stack, tagged to say whether it is a type argument or value argument:
simplExpr env (CoApp fun arg) args
  = simplExpr env fun (ValArg (simplAtom env arg) : args)
simplExpr env (CoTyApp fun ty) args
  = simplExpr env fun (TypeArg (simplTy env ty) : args)
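The way the pending-argument stack makes beta reduction and argument-floating fall out naturally can be seen in a runnable miniature. The sketch below uses toy types of our own (untyped, with plain-variable arguments and a variable-to-variable renaming environment); it is not GHC's @simplExpr@, but it follows the same shape. As before, we assume all binders have been renamed apart, so floating arguments past a @let@ cannot capture them.

```haskell
-- Illustrative miniature of simplExpr's accumulating argument stack.
data Expr = Var String
          | App Expr String            -- arguments are atomic: just variables
          | Lam String Expr
          | Let String Expr Expr
          deriving (Eq, Show)

type Env = [(String, String)]          -- pending renamings

lookupEnv :: Env -> String -> String
lookupEnv env x = maybe x id (lookup x env)

simplExpr :: Env -> Expr -> [String] -> Expr
-- Applications: rename the argument and push it on the stack.
simplExpr env (App fun arg) args
  = simplExpr env fun (lookupEnv env arg : args)
-- A lambda meeting a pending argument: beta reduction, recorded as
-- a renaming rather than an immediate substitution.
simplExpr env (Lam x body) (arg : args)
  = simplExpr ((x, arg) : env) body args
simplExpr env (Lam x body) []
  = Lam x (simplExpr env body [])
-- Floating applications inward: the pending arguments pass the let
-- and are consumed by the body.
simplExpr env (Let x rhs body) args
  = Let x (simplExpr env rhs []) (simplExpr env body args)
-- A variable: rename it and rebuild any application left over.
simplExpr env (Var x) args
  = foldl App (Var (lookupEnv env x)) args
```

For example, applying @let z = a in (\x -> x)@ to @y@ floats the argument past the @let@ and then beta-reduces, yielding @let z = a in y@ in a single traversal.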