% Andre:
%
% - I'd like the transformation rules to appear clearly-identified in
%   a box of some kind, so they can be distinguished from the examples.
%

\documentstyle[slpj,11pt]{article}

\renewcommand{\textfraction}{0.2}
\renewcommand{\floatpagefraction}{0.7}

\begin{document}

\title{How to simplify matters}

\author{Simon Peyton Jones and Andre Santos\\
Department of Computing Science, University of Glasgow, G12 8QQ \\
@simonpj@@dcs.gla.ac.uk@
}

\maketitle

\section{Motivation}

Quite a few compilers use the {\em compilation by transformation} idiom.
The idea is that as much as possible of the compilation process is
expressed as correctness-preserving transformations, each of which
transforms a program into a semantically-equivalent program that
(hopefully) executes more quickly or in less space.
Functional languages are particularly amenable to this approach because
they have a particularly rich family of possible transformations.
Examples of transformation-based compilers
include the Orbit compiler,[.kranz orbit thesis.]
Kelsey's compilers,[.kelsey thesis, hudak kelsey principles 1989.]
the New Jersey SML compiler,[.appel compiling with continuations.]
and the Glasgow Haskell compiler.[.ghc JFIT.]  Of course many, perhaps
most, other compilers also use transformation to some degree.

Compilation by transformation uses automatic transformations; that is,
those which can safely be applied automatically by a compiler.  There is
also a whole approach to programming, which we might call {\em
programming by transformation}, in which the programmer manually
transforms an inefficient specification into an efficient program.  This
development process might be supported by a programming environment
which does the book-keeping, but the key steps are guided by the
programmer.  We focus exclusively on automatic transformations in this
paper.

Automatic program transformations seem to fall into two broad categories:
\begin{itemize}
\item {\bf Glamorous transformations} are global, sophisticated,
intellectually satisfying transformations, sometimes guided by some
interesting kind of analysis.  Examples include:
lambda lifting,[.johnsson lambda lifting.]
full laziness,[.hughes thesis, lester spe.]
closure conversion,[.appel jim 1989.]
deforestation,[.wadler 1990 deforestation, marlow wadler deforestation
Glasgow92, chin phd 1990 march, gill launchbury.]
transformations based on strictness analysis,[.peyton launchbury unboxed.]
and so on.  It is easy to write papers about these sorts of
transformations.

\item {\bf Humble transformations} are small, simple, local
transformations, which individually look pretty trivial.  Here are two
simple examples\footnote{
The notation @E[]@ stands for an arbitrary expression with zero or more
holes.  The notation @E[e]@ denotes @E[]@ with the holes filled in by
the expression @e@.  We implicitly assume that no name-capture happens
--- it's just a short-hand, not an algorithm.
}:
@
  let x = y in E[x]        ===>   E[y]

  case (x:xs) of           ===>   E1[x,xs]
    (y:ys) -> E1[y,ys]
    []     -> E2
@
Transformations of this kind are almost embarrassingly simple.  How
could anyone write a paper about them?
\end{itemize}
This paper is about humble transformations, and how to implement them.
Although each individual transformation is simple enough, there is a
scaling issue: there are a large number of candidate transformations to
consider, and there are a very large number of opportunities to apply
them.

In the Glasgow Haskell compiler, all humble transformations are
performed by the so-called {\em simplifier}.
Our goal in this paper is to give an overview of how the simplifier
works, what transformations it applies, and what issues arose in its
design.

\section{The language}

Mutter mutter.  Important points:
\begin{itemize}
\item Second order lambda calculus.
\item Arguments are variables.
\item Unboxed data types, and unboxed cases.
\end{itemize}
Less important points:
\begin{itemize}
\item Constructors and primitives are saturated.
\item if-then-else desugared to @case@
\end{itemize}

Give data type.
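Pending that, here is a minimal sketch of the flavour of data type we
have in mind.  It is illustrative only: the placeholder types at the top
are stand-ins invented for this paper, and the compiler's real
definitions differ in detail.
@
  type Id      = String     -- Stand-ins for this sketch only; the
  type TyVar   = String     -- compiler's versions are much richer.
  data UniType = TyVarTy TyVar | AppTy String [UniType]
  data Literal = MachInt Int | MachChar Char

  data CoreExpr
    = CoVar   Id
    | CoLit   Literal
    | CoCon   Id [UniType] [CoreAtom]   -- saturated constructor application
    | CoPrim  Id [UniType] [CoreAtom]   -- saturated primitive application
    | CoLam   Id    CoreExpr
    | CoTyLam TyVar CoreExpr            -- type abstraction...
    | CoApp   CoreExpr CoreAtom         -- note: the argument is an atom
    | CoTyApp CoreExpr UniType          -- ...and type application
    | CoLet   CoreBinding CoreExpr
    | CoCase  CoreExpr [(CorePat, CoreExpr)]

  data CoreAtom    = CoVarAtom Id | CoLitAtom Literal  -- arguments are atomic
  data CoreBinding = CoNonRec Id CoreExpr
                   | CoRec    [(Id, CoreExpr)]
  data CorePat     = CoConPat Id [Id]
                   | CoLitPat Literal                  -- unboxed cases
                   | CoDefault Id                      -- default: y -> E
@
In the compiler these types are additionally parameterised over the type
of binders (hence @CoreArg bindee@ later on); we suppress the parameter
here for clarity.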
\section{Transformations}

This section lists all the transformations implemented by the
simplifier.  Because it is a complete list, it is a long one.  We
content ourselves with a brief statement of each transformation,
augmented with forward references to Section~\ref{sect:composing} which
gives examples of the ways in which the transformations can compose
together.

\subsection{Beta reduction}

If a lambda abstraction is applied to an argument, we can simply
beta-reduce.  This applies equally to ordinary lambda abstractions and
type abstractions:
@
  (\x  -> E[x]) arg   ===>   E[arg]
  (/\a -> E[a]) ty    ===>   E[ty]
@
There is no danger of duplicating work because the argument is
guaranteed to be a simple variable or literal.

\subsubsection{Floating applications inward}

Applications can be floated inside a @let(rec)@ or @case@ expression.
This is a good idea, because they might find a lambda abstraction inside
to beta-reduce with:
@
  (let(rec) Bind in E) arg   ===>   let(rec) Bind in (E arg)

  (case E of {P1 -> E1; ...; Pn -> En}) arg
     ===>   case E of {P1 -> E1 arg; ...; Pn -> En arg}
@

\subsection{Transformations concerning @let(rec)@}

\subsubsection{Floating @let@ out of @let@}

It is sometimes useful to float a @let(rec)@ out of a @let(rec)@
right-hand side:
@
  let x = let(rec) Bind in B1   ===>   let(rec) Bind in
  in B2                                let x = B1
                                       in B2

  letrec x = let(rec) Bind in B1   ===>   letrec Bind
  in B2                                          x = B1
                                           in B2
@

\subsubsection{Floating @case@ out of @let@}

\subsubsection{@let@ to @case@}

\subsection{Transformations concerning @case@}

\subsubsection{Case of known constructor}

If a @case@ expression scrutinises a constructor, the @case@ can be
eliminated.  This transformation is a real win: it eliminates a whole
@case@ expression.
@
  case (C a1 .. an) of        ===>   E[a1..an]
    ...
    C b1 .. bn -> E[b1..bn]
    ...
@
If none of the constructors in the alternatives match, then the default
is taken:
@
  case (C a1 .. an) of        ===>   let y = C a1 .. an
    ...[no alt matches C]...         in E
    y -> E
@
There is an important variant of this transformation when the @case@
expression scrutinises a {\em variable} which is known to be bound to a
constructor.  This situation can arise for two reasons:
\begin{itemize}
\item An enclosing @let(rec)@ binding binds the variable to a
constructor.  For example:
@
  let x = C p q in ... (case x of ...) ...
@
\item An enclosing @case@ expression scrutinises the same variable.
For example:
@
  case x of
    ...
    C p q -> ... (case x of ...) ...
    ...
@
This situation is particularly common, as we discuss in
Section~\ref{sect:repeated-evals}.
\end{itemize}
In each of these examples, @x@ is known to be bound to @C p q@ at the
inner @case@.  The general rules are:
@
  case x of {...; C b1 .. bn -> E[b1..bn]; ...}
  ===>                                          {x bound to C a1 .. an}
  E[a1..an]

  case x of {...[no alts match C]...; y -> E[y]}
  ===>                                          {x bound to C a1 .. an}
  E[x]
@

\subsubsection{Dead alternative elimination}
@
  case x of
    C a .. z -> E
    ...[other alts]...
  ===>                                          {x *not* bound to C}
  case x of
    ...[other alts]...
@
We might know that @x@ is not bound to a particular constructor because
of an enclosing case:
@
  case x of
    C a .. z -> E1
    other    -> E2
@
Inside @E1@ we know that @x@ is bound to @C@.  However, if the type has
more than two constructors, inside @E2@ all we know is that @x@ is
{\em not} bound to @C@.

This applies to unboxed cases also, in the obvious way.
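To make the flavour of implementation concrete, here is a sketch of the
case-of-known-constructor and dead-alternative rules, written against
the data type sketched earlier.  The function names and the @substAtoms@
helper are inventions of this sketch, not the compiler's code; in
particular, the real simplifier learns that a variable scrutinee is
bound to a constructor from its environment, rather than by matching on
a literal constructor application as we do here.  The naive,
non-capture-avoiding substitution is justified by the unique-naming
invariant described later (every pass renames all bound variables
apart).
@
  -- Substitute atoms for variables.  Safe only because the simplifier
  -- guarantees that all bound variables are distinct, so there is no
  -- shadowing to worry about.
  substAtoms :: [(Id, CoreAtom)] -> CoreExpr -> CoreExpr
  substAtoms env = go
    where
      atom a@(CoVarAtom v) = case lookup v env of { Just a' -> a'; Nothing -> a }
      atom a               = a

      go (CoVar v)        = case lookup v env of
                              Just (CoVarAtom v') -> CoVar v'
                              Just (CoLitAtom l)  -> CoLit l
                              Nothing             -> CoVar v
      go e@(CoLit _)      = e
      go (CoCon c ts as)  = CoCon  c ts (map atom as)
      go (CoPrim p ts as) = CoPrim p ts (map atom as)
      go (CoLam x e)      = CoLam   x  (go e)
      go (CoTyLam tv e)   = CoTyLam tv (go e)
      go (CoApp f a)      = CoApp   (go f) (atom a)
      go (CoTyApp f t)    = CoTyApp (go f) t
      go (CoLet b e)      = CoLet (goBind b) (go e)
      go (CoCase s alts)  = CoCase (go s) [ (p, go e) | (p, e) <- alts ]

      goBind (CoNonRec x r) = CoNonRec x (go r)
      goBind (CoRec prs)    = CoRec [ (x, go r) | (x, r) <- prs ]

  -- Case-of-known-constructor, dropping dead alternatives on the way.
  caseOfKnownCon :: CoreExpr -> Maybe CoreExpr
  caseOfKnownCon (CoCase scrut@(CoCon con _ args) alts) = go alts
    where
      go ((CoConPat c bs, rhs) : rest)
        | c == con  = Just (substAtoms (zip bs args) rhs)
        | otherwise = go rest                  -- dead alternative: drop it
      go ((CoLitPat _, _) : rest) = go rest    -- cannot match a constructor
      go ((CoDefault y, rhs) : _)              -- no alternative matches C:
        = Just (CoLet (CoNonRec y scrut) rhs)  --   take the default
      go []           = Nothing
  caseOfKnownCon _    = Nothing
@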
\subsubsection{Case elimination}

If we can prove that @x@ is not bottom, then this rule applies:
@
  case x of       ===>   E[x]
    y -> E[y]
@
We might know that @x@ is non-bottom because:
\begin{itemize}
\item @x@ has an unboxed type.
\item There's an enclosing case which scrutinises @x@.
\item It is bound to an expression which provably terminates.
\end{itemize}
Since this transformation can only improve termination, we provide a
compiler flag which enables it all the time, even when @x@ is not
provably non-bottom.

\subsubsection{Case of error}

@
  case (error ty E) of Alts   ===>   error ty' E
    where ty' is the type of the whole case expression
@
Mutter about types.

Mutter about variables bound to error.

Mutter about disguised forms of error.

\subsubsection{Floating @let(rec)@ out of @case@}

A @let(rec)@ binding can be floated out of a @case@ scrutinee:
@
  case (let(rec) Bind in E) of Alts   ===>   let(rec) Bind in
                                             case E of Alts
@
This increases the likelihood of a case-of-known-constructor
transformation, because @E@ is not hidden from the @case@ by the
@let(rec)@.

\subsubsection{Floating @case@ out of @case@}

Analogous to floating a @let(rec)@ from a @case@ scrutinee is floating a
@case@ from a @case@ scrutinee.  We have to be careful, though, about
code size.  If there's only one alternative in the inner case, things
are easy:
@
  case (case E of {P -> R}) of   ===>   case E of {P -> case R of
    Q1 -> S1                                              Q1 -> S1
    ...                                                   ...
    Qm -> Sm                                              Qm -> Sm}
@
If there's more than one alternative there's a danger that we'll
duplicate @S1@...@Sm@, which might be a lot of code.  Our solution is to
create a new local definition for each alternative:
@
  case (case E of {P1 -> R1; ...; Pn -> Rn}) of
    Q1 -> S1
    ...
    Qm -> Sm
  ===>
  let s1 = \x1 ... z1 -> S1
      ...
      sm = \xm ... zm -> Sm
  in
  case E of
    P1 -> case R1 of {Q1 -> s1 x1 ... z1; ...; Qm -> sm xm ... zm}
    ...
    Pn -> case Rn of {Q1 -> s1 x1 ... z1; ...; Qm -> sm xm ... zm}
@
Here, @x1 ... z1@ are that subset of variables bound by the pattern @Q1@
which are free in @S1@, and similarly for the other @si@.

Is this transformation a win?  After all, we have introduced @m@ new
functions!  Section~\ref{sect:join-points} discusses this point.

\subsubsection{Case merging}

@
  case x of
    ...[some alts]...
    other -> case x of
               ...[more alts]...
  ===>
  case x of
    ...[some alts]...
    ...[more alts]...
@
Any alternatives in @[more alts]@ which are already covered by
@[some alts]@ should first be eliminated by the dead-alternative
transformation.

\subsection{Constructor reuse}

\subsection{Inlining}

The inlining transformation is simple enough:
@
  let x = R in B[x]   ===>   B[R]
@
Inlining is more conventionally used to describe the instantiation of a
function body at its call site, with arguments substituted for formal
parameters.  We treat this as a two-stage process: inlining followed by
beta reduction.  Since we are working with a higher-order language, not
all the arguments may be available at every call site, so separating
inlining from beta reduction allows us to concentrate on one problem at
a time.
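For example (the fragment is invented, but the two steps are exactly the
transformations just described), consider a call site where only one
argument is available:
@
  let f = \x -> \y -> E[x] in f a
  ===>   (\x -> \y -> E[x]) a          {inlining}
  ===>   \y -> E[a]                    {beta reduction}
@
Inlining merely substitutes @f@'s right-hand side; beta reduction then
mops up as many of the pending arguments as there are lambdas.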
The choice of exactly {\em which} bindings to inline has a major impact
on efficiency.  Specifically, we need to consider the following factors:
\begin{itemize}
\item Inlining a function at its call site, followed by some beta
reduction, very often exposes opportunities for further transformations.
We inline many simple arithmetic and boolean operators for this reason.
\item Inlining can increase code size.
\item Inlining can duplicate work, for example if a redex is inlined at
more than one site.  Duplicating a single expensive redex can ruin a
program's efficiency.
\end{itemize}
Our inlining strategy depends on the form of @R@:

Mutter mutter.

\subsubsection{Dead code removal}

If a @let@-bound variable is not used the binding can be dropped:
@
  let x = E in B   ===>   B            {x not free in B}
@
A similar transformation applies for @letrec@-bound variables.

Programmers seldom write dead code, of course, but bindings often become
dead when they are inlined.

\section{Composing transformations}
\label{sect:composing}

The really interesting thing about humble transformations is the way in
which they compose together to carry out substantial and useful
transformations.  This section gives a collection of motivating
examples, all of which have shown up in real application programs.

\subsection{Repeated evals}
\label{sect:repeated-evals}

Example: x+x, as in unboxed paper.
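To sketch the development we have in mind (reconstructed from the
unboxed-values paper [.peyton launchbury unboxed.]; the details there
differ slightly): inlining the @Int@ version of @+@ turns @x+x@ into two
successive evals of the same variable:
@
  case x of
    I# x# -> case x of
               I# y# -> I# (x# +# y#)
@
At the inner @case@, the enclosing @case@ has already told us that @x@
is bound to @I# x#@, so the variable variant of the
case-of-known-constructor transformation eliminates the second eval:
@
  case x of
    I# x# -> I# (x# +# x#)
@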
\subsection{Lazy pattern matching}

Lazy pattern matching is pretty inefficient.  Consider:
@
  let (x,y) = E in B
@
which desugars to:
@
  let t = E
      x = case t of (x,y) -> x
      y = case t of (x,y) -> y
  in B
@
This code allocates three thunks!  However, if @B@ is strict in
{\em either} @x@ {\em or} @y@, then the strictness analyser will easily
spot that the binding for @t@ is strict, so we can do a @let@-to-@case@
transformation:
@
  case E of
    (x,y) -> let t = (x,y) in
             let x = case t of (x,y) -> x
                 y = case t of (x,y) -> y
             in B
@
whereupon the case-of-known-constructor transformation eliminates the
@case@ expressions in the right-hand sides of @x@ and @y@, and @t@ is
then spotted as being dead, so we get:
@
  case E of
    (x,y) -> B
@

\subsection{Join points}
\label{sect:join-points}

One motivating example is this:
@
  if (not x) then E1 else E2
@
After desugaring the conditional, and inlining the definition of @not@,
we get:
@
  case (case x of {True -> False; False -> True}) of
    True  -> E1
    False -> E2
@
Now, if we apply our case-of-case transformation we get:
@
  let e1 = E1
      e2 = E2
  in
  case x of
    True  -> case False of {True -> e1; False -> e2}
    False -> case True  of {True -> e1; False -> e2}
@
Now the case-of-known-constructor transformation applies:
@
  let e1 = E1
      e2 = E2
  in
  case x of
    True  -> e2
    False -> e1
@
Since there is now only one occurrence of @e1@ and @e2@ we can inline
them, giving just what we hoped for:
@
  case x of {True -> E2; False -> E1}
@
The point is that the local definitions will often disappear again.

\subsubsection{How join points occur}

But what if they don't disappear?  Then the definitions @s1@ ... @sm@
play the role of ``join points''; they represent the places where
execution joins up again, having forked at the @case x@.  The ``calls''
to the @si@ should really be just jumps.  To see this more clearly
consider the expression
@
  if (x || y) then E1 else E2
@
A C compiler will ``short-circuit'' the evaluation of the condition if
@x@ turns out to be true, generating code something like this:
@
      if (x) goto l1;
      if (y) goto l1;
      ...code for E2...
      goto l2;
  l1: ...code for E1...
  l2: ...
@
In our setting, here's what will happen.  First we desugar the
conditional, and inline the definition of @||@:
@
  case (case x of {True -> True; False -> y}) of
    True  -> E1
    False -> E2
@
Now apply the case-of-case transformation:
@
  let e1 = E1
      e2 = E2
  in
  case x of
    True  -> case True of {True -> e1; False -> e2}
    False -> case y    of {True -> e1; False -> e2}
@
Unlike the @not@ example, only one of the two inner cases simplifies,
and we can therefore only inline @e2@, because @e1@ is still mentioned
twice\footnote{Unless the inlining strategy decides that @E1@ is small
enough to duplicate; it is used in separate @case@ branches so there's
no concern about duplicating work.  Here's another example of the way in
which we make one part of the simplifier (the inlining strategy) help
with the work of another (@case@-expression simplification).}:
@
  let e1 = E1
  in
  case x of
    True  -> e1
    False -> case y of {True -> e1; False -> E2}
@
The code generator produces essentially the same code as the C code
given above.  The binding for @e1@ turns into just a label, which is
jumped to from the two occurrences of @e1@.

\subsubsection{Case of @error@}

The case-of-error transformation is often exposed by the case-of-case
transformation.  Consider:
@
  case (hd xs) of
    True  -> E1
    False -> E2
@
After inlining @hd@, we get:
@
  case (case xs of {[] -> error "hd"; (x:_) -> x}) of
    True  -> E1
    False -> E2
@
(I've omitted the type argument of @error@ to save clutter.)  Now doing
case-of-case gives:
@
  let e1 = E1
      e2 = E2
  in
  case xs of
    []    -> case (error "hd") of {True -> e1; False -> e2}
    (x:_) -> case x            of {True -> e1; False -> e2}
@
Now the case-of-error transformation springs to life, after which we can
inline @e1@ and @e2@:
@
  case xs of
    []    -> error "hd"
    (x:_) -> case x of {True -> E1; False -> E2}
@

\subsection{Nested conditionals combined}

Sometimes programmers write something which should be done by a single
@case@ as a sequence of tests:
@
  if x==0::Int then E0 else
  if x==1      then E1 else E2
@
After eliminating some redundant evals and doing the case-of-case
transformation we get:
@
  case x of
    I# x# -> case x# of
               0#    -> E0
               other -> case x# of
                          1#    -> E1
                          other -> E2
@
The case-merging transformation puts these together to get:
@
  case x of
    I# x# -> case x# of
               0#    -> E0
               1#    -> E1
               other -> E2
@
Sometimes the sequence of tests cannot be eliminated from the source
code because of overloading:
@
  f :: Num a => a -> Bool
  f 0 = True
  f 3 = True
  f n = False
@
If we specialise @f@ to @Int@ we'll get the previous example again.

\subsection{Error tests eliminated}

The elimination of redundant alternatives, and then of redundant cases,
arises when we inline functions which do error checking.  A typical
example is this:
@
  if (x `rem` y) == 0 then (x `div` y) else y
@
Here, both @rem@ and @div@ do an error-check for @y@ being zero.  The
second check is eliminated by the transformations.  After transformation
the code becomes:
@
  case x of
    I# x# ->
      case y of
        I# y# ->
          case y# of
            0# -> error "rem: zero divisor"
            _  -> case x# rem# y# of
                    0# -> case x# div# y# of
                            r# -> I# r#
                    _  -> y
@

\subsection{Atomic arguments}

At this point it is possible to appreciate the usefulness of the
Core-language syntax requirement that arguments are atomic.  For
example, suppose that arguments could be arbitrary expressions.  Here is
a possible transformation:
@
  f (case x of (p,q) -> p)
  ===>                                  {f strict in its argument}
  case x of (p,q) -> f p
@
Doing this transformation would be useful, because now the argument to
@f@ is a simple variable rather than a thunk.
However, if arguments are atomic, this transformation becomes just a
special case of floating a @case@ out of a strict @let@:
@
  let a = case x of (p,q) -> p
  in f a
  ===>                                  {f strict in a}
  case x of (p,q) -> let a = p in f a
  ===>
  case x of (p,q) -> f p
@
There are many examples of this kind.  For almost any transformation
involving @let@ there is a corresponding one involving a function
argument.  The same effect is achieved with much less complexity by
restricting function arguments to be atomic.

\section{Design}

Dependency analysis.

Occurrence analysis.

\subsection{Renaming and cloning}

Every program-transformation system has to worry about name capture.
For example, here is an erroneous transformation:
@
  let y = E
  in (\x -> \y -> x + y) (y+3)
  ===>                                  WRONG!
  let y = E
  in (\y -> (y+3) + y)
@
The transformation fails because the originally free occurrence of @y@
in the argument @y+3@ has been ``captured'' by the @\y@-abstraction.
There are various sophisticated solutions to this difficulty, but we
adopted a very simple one: we uniquely rename every locally-bound
identifier on every pass of the simplifier.  Since we are in any case
producing an entirely new program (rather than side-effecting an
existing one) it costs very little extra to rename the identifiers as we
go.  So our example would become:
@
  let y = E
  in (\x -> \y -> x + y) (y+3)
  ===>
  let y1 = E
  in (\y2 -> (y1+3) + y2)
@
The simplifier accepts as input a program which has arbitrary bound
variable names, including ``shadowing'' (where a binding hides an outer
binding for the same identifier), but it produces a program in which
every bound identifier has a distinct name.

Both the ``old'' and ``new'' identifiers have type @Id@, but when
writing type signatures for functions in the simplifier we use the types
@InId@, for identifiers from the input program, and @OutId@ for
identifiers from the output program:
@
  type InId  = Id
  type OutId = Id
@
This nomenclature extends naturally to expressions: a value of type
@InExpr@ is an expression whose identifiers are from the input-program
name-space, and similarly @OutExpr@.
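To give the idea, here is a self-contained sketch of the renaming pass
on a tiny lambda language rather than the full Core data type; the name
@MiniExpr@ and the naming convention used for fresh identifiers are
inventions of this sketch.  The real simplifier threads its unique
supply through the @SmplM@ monad described in the next section, and
renames as part of simplification rather than in a separate traversal.
@
  -- The environment maps each in-scope identifier to its new, unique
  -- name.  Shadowing in the input is handled for free, because an
  -- inner binding extends the environment in front of the outer one.
  data MiniExpr = V Id | L Id MiniExpr | A MiniExpr MiniExpr

  rename :: Int -> [(Id, Id)] -> MiniExpr -> (MiniExpr, Int)
  rename u env (V x)   = (V (maybe x id (lookup x env)), u)
                         -- variables not in the environment (imports,
                         -- top-level names) are kept as they are
  rename u env (L x b) = (L x' b', u')
    where
      x'       = x ++ "_" ++ show u    -- invent a fresh name for the binder
      (b', u') = rename (u+1) ((x, x') : env) b
  rename u env (A f a) = (A f' a', u2)
    where
      (f', u1) = rename u  env f
      (a', u2) = rename u1 env a
@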
\section{The simplifier}

The basic algorithm followed by the simplifier is:
\begin{enumerate}
\item Analyse: perform occurrence analysis and dependency analysis.
\item Simplify: apply as many transformations as possible.
\item Iterate: perform the above two steps repeatedly until no further
transformations are possible.  (A compiler flag allows the programmer to
bound the maximum number of iterations.)
\end{enumerate}
We make an effort to apply as many transformations as possible in
Step 2.  To see why this is a good idea, just consider a sequence of
transformations in which each transformation enables the next.  If each
iteration of Step 2 only performs one transformation, then the entire
program will have to be re-analysed by Step 1, and re-traversed by
Step 2, for each transformation in the sequence.  Sometimes this is
unavoidable, but it is often possible to perform a sequence of
transformations in a single pass.

The key function, which simplifies expressions, has the following type:
@
  simplExpr :: SimplEnv
            -> InExpr
            -> [OutArg]
            -> SmplM OutExpr
@
The monad, @SmplM@, can quickly be disposed of.  It has only two
purposes:
\begin{itemize}
\item It plumbs around a supply of unique names, so that the simplifier
can easily invent new names.
\item It gathers together counts of how many of each kind of
transformation has been applied, for statistical purposes.  These counts
are also used in Step 3 to decide when the simplification process has
terminated.
\end{itemize}
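To make this concrete, here is a sketch of what such a monad might look
like; the names @returnSmpl@, @thenSmpl@, @newId@ and @tick@ are
inventions of this sketch, and the compiler's actual definition differs
in detail.  It is simply a state monad threading a supply of unique
integers and a count of transformations performed:
@
  type UniqueSupply = Int
  type TickCount    = Int
  type SmplM a = (UniqueSupply, TickCount) -> (a, (UniqueSupply, TickCount))

  returnSmpl :: a -> SmplM a
  returnSmpl x s = (x, s)

  thenSmpl :: SmplM a -> (a -> SmplM b) -> SmplM b
  thenSmpl m k s = case m s of (x, s') -> k x s'

  newId :: SmplM Id               -- invent a fresh name
  newId (u, n) = ("s_" ++ show u, (u+1, n))

  tick :: SmplM ()                -- record that a transformation fired
  tick (u, n) = ((), (u, n+1))
@
On this reading, the counts gathered by @tick@ are exactly what Step 3
consults: an iteration in which no ticks occur means simplification has
terminated.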
The signature can be understood like this:
\begin{itemize}
\item The environment, of type @SimplEnv@, provides information about
identifiers bound by the enclosing context.
\item The second and third arguments together specify the expression to
be simplified.
\item The result is the simplified expression, wrapped up by the monad.
\end{itemize}
The simplifier's invariant is this:
$$
@simplExpr@~env~expr~[a_1,\ldots,a_n] = expr[env]~a_1~\ldots~a_n
$$
That is, the expression returned by
$@simplExpr@~env~expr~[a_1,\ldots,a_n]$ is semantically equal to
(although hopefully more efficient than) $expr$, with the renamings in
$env$ applied to it, applied to the arguments $a_1,\ldots,a_n$.

\subsection{Application and beta reduction}

The arguments are carried ``inwards'' by @simplExpr@, as an accumulating
parameter.  This is a convenient way of implementing the transformations
which float arguments inside @let(rec)@ and @case@ expressions.  This
list of pending arguments requires a new data type, @CoreArg@, along
with its ``in'' and ``out'' synonyms, because an argument might be a
type or an atom:
@
  data CoreArg bindee = TypeArg UniType
                      | ValArg  (CoreAtom bindee)

  type InArg  = CoreArg InId
  type OutArg = CoreArg OutId
@
The equations for applications simply apply the environment to the
argument (to handle renaming) and put the result on the argument stack,
tagged to say whether it is a type argument or value argument:
@
  simplExpr env (CoApp fun arg) args
    = simplExpr env fun (ValArg (simplAtom env arg) : args)

  simplExpr env (CoTyApp fun ty) args
    = simplExpr env fun (TypeArg (simplTy env ty) : args)
@

\end{document}