%************************************************************************
%*									*
\section{How to add an optimisation pass}
%*									*
%************************************************************************
\subsection{Summary of the steps required}

Well, with all the preliminaries out of the way, here is all that it
takes to add your optimisation pass to the glorious new Glasgow
Haskell compiler:
\begin{enumerate}
\item
Select the input and output types for your pass; these will very
likely be particular parameterisations of the Core or annotated Core
data types.  There is a small chance you will prefer to work at the
STG-syntax level.  (If these data types are inadequate to this
purpose, {\em please} let us know!)

\item
Depending on exactly how you want your pass to work, set up some
monad-ery, so as to avoid lots of horrible needless plumbing.  The
whole compiler is written in a monadic style, and there are plenty of
examples from which you may steal.  Section~\ref{sec:monadic-style}
gives further details about how you might do this.

\item
Write your optimisation pass, and...

{\em Do} use the existing types in the compiler, e.g., @UniType@,
and the ``approved'' interfaces to them.

{\em Don't} rewrite lots of utility code!  Scattered around in various
sometimes-obvious places, there is considerable code already written
for the mangling and massaging of expressions, types, variables, etc.

Section~\ref{sec:reuse-code} says more about how to re-use existing
compiler bits.

\item
Follow our naming conventions \smiley{} Seriously, it may lead to greater
acceptance of our code and yours if readers find a system written with
at least a veneer of uniformity.
\ToDo{mention Style Guide, if it ever really exists.}

\item
To hook your pass into the compiler, add it in one of three places:
the @Main@ module of the compiler,\srcloc{main/Main.lhs} the
Core-to-Core simplification driver,\srcloc{simplCore/SimplCore.lhs} or
the STG-to-STG driver.\srcloc{simplStg/SimplStg.lhs}

Also add something to the compilation-system
driver\srcloc{ghc/driver/ghc.lprl}
(called @ghc@ here) so that appropriate user-provided command-line
options will be transmogrified into the correct options fed to the
@Main@ module.

\item
Add some appropriate documentation to the user's guide,
@ghc/docs/users_guide@.

\item
Run your optimisation on {\em real programs}, measure, and improve.
(Separate from this compiler's distribution, we provide a ``NoFib''
suite of ``real Haskell programs'' \cite{partain92a}.  We strongly
encourage their use, so you can more readily compare your work with
others'.)

\item
Send us your contribution so we can add it to the distribution!  We
will be happy to include anything semi-reasonable.
This will practically ensure fame, if
not fortune, and---with a little luck---considerable notoriety.
\end{enumerate}
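
As a rough illustration of steps 1 to 3, the sketch below shows the
general shape of a pass: a function from one syntax type to another.
The @Expr@ type and the @foldConstants@ function are hypothetical
stand-ins invented for this example; they are not the compiler's Core
types, and a real pass would use those (and probably a monad) instead.
\begin{verbatim}
module Main (main) where

-- Hypothetical toy syntax; a stand-in for a parameterisation of Core.
data Expr
  = Lit Int
  | Var String
  | Add Expr Expr
  | Mul Expr Expr
  deriving (Eq, Show)

-- The "pass": input and output types are both Expr.
-- It folds additions and multiplications of literals, bottom-up.
foldConstants :: Expr -> Expr
foldConstants (Add a b) =
  case (foldConstants a, foldConstants b) of
    (Lit x, Lit y) -> Lit (x + y)
    (a', b')       -> Add a' b'
foldConstants (Mul a b) =
  case (foldConstants a, foldConstants b) of
    (Lit x, Lit y) -> Lit (x * y)
    (a', b')       -> Mul a' b'
foldConstants e = e

main :: IO ()
main = print (foldConstants (Add (Var "x") (Mul (Lit 2) (Lit 3))))
\end{verbatim}
Here the product of the two literals is folded to @Lit 6@, while the
addition involving the variable is left alone.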

%************************************************************************
%*									*
\subsection{Using monadic code}\label{sec:monadic-style}
%*									*
%************************************************************************

{\em Monads} are one way of structuring functional programs. Phil
Wadler is their champion, and his recent papers on the subject are a
good place to start your investigations.  ``The essence of functional
programming'' even has a section about the use of monads in this
compiler \cite{wadler92a}!  An earlier paper describes ``monad
comprehensions'' \cite{wadler90a}.  For a smaller self-contained
example, see his ``literate typechecker'' \cite{wadler90b}.

We use monads extensively in this compiler, mainly as a way to plumb
state around.  The simplest example is a monad to plumb a
@UniqueSupply@\srcloc{basicTypes/Unique.lhs} (i.e., name supply)
through a function.
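
To give a flavour of the idea, here is a minimal sketch of a
name-supply monad written with explicit ``then'' and ``return''
combinators.  It is an assumption-laden toy, not the code in
@basicTypes/Unique.lhs@: the supply is just a counter, and the names
@Supply@, @SupplyM@, @returnS@, @thenS@, @getFresh@ and @label@ are all
invented for this example.
\begin{verbatim}
module Main (main) where

-- Hypothetical toy supply: a counter.  The real UniqueSupply differs.
type Supply = Int

-- A computation that may draw names: supply in, result and rest out.
type SupplyM a = Supply -> (a, Supply)

returnS :: a -> SupplyM a
returnS x s = (x, s)

thenS :: SupplyM a -> (a -> SupplyM b) -> SupplyM b
thenS m k s = case m s of
                (x, s') -> k x s'

-- Draw one fresh number from the supply.
getFresh :: SupplyM Int
getFresh s = (s, s + 1)

data Tree a = Leaf a | Node (Tree a) (Tree a)
  deriving (Eq, Show)

-- Pair every leaf with a distinct number; no supply is passed by hand.
label :: Tree a -> SupplyM (Tree (Int, a))
label (Leaf x)   = getFresh `thenS` \n ->
                   returnS (Leaf (n, x))
label (Node l r) = label l `thenS` \l' ->
                   label r `thenS` \r' ->
                   returnS (Node l' r')

main :: IO ()
main = print (fst (label tree 0))
  where tree = Node (Leaf 'a') (Node (Leaf 'b') (Leaf 'c'))
\end{verbatim}
The point of the exercise is that @label@ never mentions the supply
itself; the threading lives entirely in @thenS@.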

\ToDo{Actually describe one monad thing completely.}

We encourage you to use a monadic style, where appropriate, in
the code you add to the compiler.  To this end, here is a list of
monads already in use in the compiler:
\begin{description}
\item[@UniqueSupply@ monad:] \srcloc{basicTypes/Unique.lhs}
To carry a name supply around; do a @getUnique@ when you
need one.  Used in several parts of the compiler.

\item[Typechecker monad:] \srcloc{typecheck/TcMonad.lhs}
Quite a complicated monad; carries around a substitution, some
source-location information, and a @UniqueSupply@; also plumbs
typechecker success/failure back up to the right place.

\item[Desugarer monad:] \srcloc{deSugar/DsMonad.lhs}
Carries around a @UniqueSupply@ and source-location information (to
put in pattern-matching-failure error messages).

\item[Code-generator monad:] \srcloc{codeGen/CgMonad.lhs}
Carries around an environment that maps variables to addressing modes
(e.g., ``in this block, @f@ is at @Node@ offset 3''); also, carries
around stack- and heap-usage information.  Quite tricky plumbing, in
part so that the ``Abstract~C'' output will be produced lazily.

\item[Monad for underlying I/O machinery:] \srcloc{ghc/lib/io/GlaIOMonad.lhs}
This is the basis of our I/O implementation.  See the paper about it
\cite{peyton-jones92b}.
\end{description}

%************************************************************************
%*									*
\subsection{Adding a new @PrimitiveOp@}\label{sec:add-primop}
%*									*
%************************************************************************

You may find yourself wanting to add a new
@PrimitiveOp@\srcloc{prelude/PrimOps.lhs} to the compiler's
repertoire: these are the lowest-level operations that cannot be
expressed in Haskell---in our case, things written in C.

To add a new op, you would need to do the following:
\begin{itemize}
\item
Add it to the @PrimitiveOp@ datatype in @prelude/PrimOps.lhs@; it's
just an enumeration type.
\item
Most importantly, add an entry in the @primOpInfo@ function for your
new primitive.
\item
If you want your primitive to be visible to some other part of the
compiler, export it via the @AbsPrel@\srcloc{prelude/AbsPrel.lhs}
interface (and import it from there).
\item
If you want your primitive to be visible to the user (modulo some
``show-me-nonstd-names'' compiler flag...), add your primitive to one
or more of the appropriate lists in @builtinNameFuns@, in
@prelude/AbsPrel.lhs@.
\item
If your primitive can be implemented with just a C macro, add it to
@ghc/imports/StgMacros.lh@.  If it needs a C function, put that in
@ghc/runtime/prims/@, somewhere appropriate; you might need to put a
declaration of some kind in a C header file in @ghc/imports/@.
\item
If these steps are not enough, please get in touch.
\end{itemize}
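
The first two steps might look roughly like the sketch below.
Everything in it is a simplified, hypothetical stand-in: the
constructor names, the @PrimOpInfo@ record and its fields, and the new
op @IntPopCountOp@ are invented for illustration and do not reflect
the real definitions in @prelude/PrimOps.lhs@.
\begin{verbatim}
module Main (main) where

-- Hypothetical enumeration of primitive operations.
data PrimitiveOp
  = IntAddOp
  | IntSubOp
  | IntPopCountOp        -- the newly added op (invented for this sketch)
  deriving (Eq, Show, Enum, Bounded)

-- Hypothetical per-op information: a printable name and an arity.
data PrimOpInfo = PrimOpInfo
  { opName  :: String
  , opArity :: Int
  } deriving (Eq, Show)

-- One equation per constructor.  Forgetting the new op here would
-- leave this function partial.
primOpInfo :: PrimitiveOp -> PrimOpInfo
primOpInfo IntAddOp      = PrimOpInfo "intAdd"      2
primOpInfo IntSubOp      = PrimOpInfo "intSub"      2
primOpInfo IntPopCountOp = PrimOpInfo "intPopCount" 1

main :: IO ()
main = mapM_ (print . primOpInfo) [minBound .. maxBound]
\end{verbatim}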

%************************************************************************
%*									*
\section{How to add a new ``PrimOp'' (primitive operation)}
%*									*
%************************************************************************

%************************************************************************
%*									*
\section{How to add a new ``user pragma''}
%*									*
%************************************************************************

%************************************************************************
%*									*
\section{GHC utilities and re-usable code}\label{sec:reuse-code}
%*									*
%************************************************************************

%************************************************************************
%*									*
\subsection{Reuse existing utilities}
%*									*
%************************************************************************

Besides the utility functions provided in Haskell's standard prelude,
we have several modules of generally-useful utilities in \mbox{\tt utils/}
(no need to re-invent them!):
\begin{description}
\item[@Maybe@ and @MaybeErr@:]
Two very widely used types (and some operations on them):
\begin{verbatim}
data Maybe    a   = Nothing | Just a
data MaybeErr a b = Succeeded a | Failed b
\end{verbatim}

\item[@Set@:]
A simple implementation of sets (an abstract type).  The things you
want to have @Sets@ of must be in class @Ord@.

\item[@ListSetOps@:]
A module providing operations on lists that have @Set@-sounding names;
e.g., @unionLists@.

\item[@Digraph@:]
A few functions to do with directed graphs, notably finding
strongly-connected components (and cycles).

\item[@Util@:]
General grab-bag of utility functions not provided by the standard
prelude.
\end{description}
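
The @MaybeErr@ type listed above lends itself to chaining computations
that may fail.  The following sketch repeats the two-constructor
definition from the text and adds a sequencing combinator; the names
@thenME@, @safeDiv@ and @average@ are made up for this example and are
not claimed to be operations that the @utils/@ modules export.
\begin{verbatim}
module Main (main) where

-- As given in the text: success carries an 'a', failure a 'b'.
data MaybeErr a b = Succeeded a | Failed b
  deriving (Eq, Show)

-- Hypothetical combinator: run the next step only on success.
thenME :: MaybeErr a e -> (a -> MaybeErr b e) -> MaybeErr b e
thenME (Succeeded x) k = k x
thenME (Failed e)    _ = Failed e

safeDiv :: Int -> Int -> MaybeErr Int String
safeDiv _ 0 = Failed "division by zero"
safeDiv x y = Succeeded (x `div` y)

-- Integer mean of a list, failing on the empty list.
average :: [Int] -> MaybeErr Int String
average xs = safeDiv (sum xs) (length xs)

main :: IO ()
main = do
  print (average [2, 4, 6] `thenME` \m -> safeDiv 100 m)
  print (average []        `thenME` \m -> safeDiv 100 m)
\end{verbatim}
The first line of output is a @Succeeded@ value; the second is a
@Failed@ one, because the failure in @average@ short-circuits the rest
of the chain.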

Much of the compiler is structured around major datatypes, e.g.,
@UniType@ or @Id@.  These datatypes (often ``abstract''---you can't
see their actual constructors) are packaged with many useful
operations on them.  So, again, look around a little for these
functions before rolling your own.  Section~\ref{sec:reuse-datatypes}
goes into this matter in more detail.

%************************************************************************
%*									*
\subsection{Use pretty-printing and forcing machinery}
%*									*
%************************************************************************

All of the non-trivial datatypes in the compiler are in class
@Outputable@, meaning you can pretty-print them (method: @ppr@) or
force them (method: @frc@).

Pretty-printing is by far the more common operation.  @ppr@ takes a
``style'' as its first argument; it can be one of @PprForUser@,
@PprDebug@, or @PprShowAll@, which---in turn---are intended to show
more and more detail.  For example, @ppr PprForUser@ on a @UniType@
should print a type that would be recognisable to a Haskell user;
@ppr PprDebug@ prints a type in the way an implementer would normally
want to see it (e.g., with all the ``for all...''s), and
@ppr PprShowAll@ prints everything you could ever want to know about that
type.

@ppr@ produces a @Pretty@, which should eventually wend its way to
@main@.  @main@ can then peruse the program's command-line options to
decide on a @PprStyle@ and column width in which to print.  In
particular, it's bad form to @ppShow@ the @Pretty@ into a @String@
deep in the bowels of the compiler, where the user cannot control the
printing.

If you introduce non-trivial datatypes, please make them instances of
class @Outputable@.
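
The arrangement can be sketched as follows.  This is a cut-down toy
under stated assumptions: the class here has only @ppr@ (no forcing
method), @Pretty@ is modelled as a function from column width to text
rather than the compiler's real @Pretty@ type, and the @Var@ datatype
and the @fit@ helper are invented for the example.  Only the three
style names come from the text above.
\begin{verbatim}
module Main (main) where

-- The three styles named in the text, least to most detailed.
data PprStyle = PprForUser | PprDebug | PprShowAll
  deriving (Eq, Show)

-- Toy stand-in for the real Pretty type: a function from the
-- available column width to the rendered text.
type Pretty = Int -> String

-- Hypothetical, cut-down version of the class.
class Outputable a where
  ppr :: PprStyle -> a -> Pretty

-- An invented datatype: a variable with a unique number and a type.
data Var = Var String Int String

instance Outputable Var where
  ppr PprForUser (Var name _ _)  = fit name
  ppr PprDebug   (Var name u _)  = fit (name ++ "_" ++ show u)
  ppr PprShowAll (Var name u ty) =
    fit (name ++ "_" ++ show u ++ " :: " ++ ty)

-- Truncate to the column width (a real Pretty would lay out properly).
fit :: String -> Pretty
fit s width = take width s

-- Only main picks the style and the width.
main :: IO ()
main = mapM_ (\sty -> putStrLn (ppr sty v 80))
             [PprForUser, PprDebug, PprShowAll]
  where v = Var "x" 42 "Int"
\end{verbatim}
Note that nothing below @main@ turns a @Pretty@ into a @String@; the
instance merely describes what to print in each style.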

%************************************************************************
%*									*
\subsection{Use existing data types appropriately}\label{sec:reuse-datatypes}
%*									*
%************************************************************************

The compiler uses many datatypes.  Believe it or not, these have
carefully structured interfaces to the ``outside world''!  Unfortunately,
the current Haskell module system does not let us enforce proper
access to these datatypes to the extent we would prefer.  Here is a
list of datatypes (and their operations) you should feel free to use,
as well as how to access them.

The first major group of datatypes is the ``syntax datatypes,'' the
various ways in which the program text is represented as it makes its
way through the compiler.  These are notable in that you are allowed
to see and make use of all of their constructors:
\begin{description}
\item[Prefix form:]\srcloc{reader/PrefixSyn.lhs}  You shouldn't need
this. 

\item[Abstract Haskell syntax:]\srcloc{abstractSyn/AbsSyn.lhs}  Access
via the @AbsSyn@ interface.  An example of what you should {\em not}
do is import the @AbsSynFuns@ (or @HsBinds@ or ...) interface
directly.  @AbsSyn@ tells you what you're supposed to see.

\item[Core syntax:]\srcloc{coreSyn/*Core.lhs}  Core syntax is
parameterised, and you should access it {\em via one of the
parameterisations}.  The most common is @PlainCore@; another is
@TaggedCore@.  Don't use @CoreSyn@, though.

\item[STG syntax:]\srcloc{stgSyn/StgSyn.lhs} Access via the @StgSyn@ interface.

\item[Abstract~C syntax:]\srcloc{absCSyn/AbsCSyn.lhs} Access via the
@AbsCSyn@ interface.
\end{description}
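
To illustrate what ``parameterised'' means for a syntax type, the
sketch below defines a toy expression type indexed by the type of its
binders, with two instantiations.  All of the names (@Expr@,
@PlainExpr@, @TaggedExpr@, @tagBinders@, @occurrences@) are
hypothetical; the real @PlainCore@ and @TaggedCore@ are richer, and
this shows only the idea.
\begin{verbatim}
module Main (main) where

-- Toy syntax, parameterised over what sits at each binding site.
data Expr binder
  = Var String
  | Lam binder (Expr binder)
  | App (Expr binder) (Expr binder)
  deriving (Eq, Show)

-- "Plain": binders are just names.
type PlainExpr = Expr String

-- "Tagged": each binder carries an extra annotation of type tag.
type TaggedExpr tag = Expr (String, tag)

-- A pass whose input and output are different parameterisations:
-- tag every binder with the number of times its name occurs
-- in the body of its lambda.
tagBinders :: PlainExpr -> TaggedExpr Int
tagBinders (Var v)   = Var v
tagBinders (Lam b e) = Lam (b, occurrences b e) (tagBinders e)
tagBinders (App f a) = App (tagBinders f) (tagBinders a)

-- Count occurrences of a name, ignoring shadowing for brevity.
occurrences :: String -> PlainExpr -> Int
occurrences v (Var w)   = if v == w then 1 else 0
occurrences v (Lam _ e) = occurrences v e
occurrences v (App f a) = occurrences v f + occurrences v a

main :: IO ()
main = print (tagBinders (Lam "x" (App (Var "x") (Var "x"))))
\end{verbatim}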

The second major group of datatypes is the ``basic entity''
datatypes; these are notable in that you don't need to know their
representation to use them.  Several have already been mentioned:
\begin{description}
\item[UniTypes:]\srcloc{uniType/AbsUniType.lhs} This is a gigantic
interface onto the world of @UniTypes@; accessible via the
@AbsUniType@ interface.  You should import operations on all the {\em
pieces} of @UniTypes@ (@TyVars@, @TyVarTemplates@, @TyCons@,
@Classes@, and @ClassOps@) from here as well---everything for the
``type world.''

{\em Please don't grab type-related functions from internal modules,
behind @AbsUniType@'s back!}  (Otherwise, we won't discover the
shortcomings of the interface...)

\item[Identifiers:]\srcloc{basicTypes/Id.lhs}  Interface: @Id@.

\item[``Core'' literals:]\srcloc{basicTypes/CoreLit.lhs}  These are
the unboxed literals used in Core syntax onwards.  Interface: @CoreLit@.

\item[Environments:]\srcloc{envs/GenericEnv.lhs}
A generic environment datatype, plus a generally useful set of
operations, is provided via the @GenericEnv@ interface.  We encourage
you to use this, rather than roll your own; then your code will
benefit when we speed up the generic code.  All of the typechecker's
environment stuff (of which there is plenty) is built on @GenericEnv@,
so there are plenty of examples to follow.

\item[@Uniques@:]\srcloc{basicTypes/Unique.lhs} Essentially @Ints@.
Use them when you need something unique for fast comparisons.  Interface:
@Unique@.  This interface also provides a simple @UniqueSupply@ monad;
often just the thing...

\item[Wired-in standard prelude knowledge:]\srcloc{prelude/} The
compiler has to know a lot about the standard prelude.  What it knows
is in the @compiler/prelude@ directory; all the rest of the compiler
gets its prelude knowledge through the @AbsPrel@ interface.

The prelude stuff can get hairy.  There is a separate document about
it.  Check the @ghc/docs/README@ list for a pointer to it...
\end{description}

The above list isn't exhaustive.  By all means, ask if you think
``Surely a function like {\em this} is in here somewhere...''


%************************************************************************
%*									*
\section{Cross-module pragmatic info: the mysteries revealed}
%*									*
%************************************************************************

\ToDo{mention wired-in info.}

%************************************************************************
%*									*
\section{GHC hacking tips and ``good practice''}
%*									*
%************************************************************************

ASSERT

%************************************************************************
%*									*
\section{Glasgow pragmatics: build trees, etc.}
%*									*
%************************************************************************