size of update frames, and eliminates
``stack-stubbing'' instructions.)
+\item The ``return in registers'' return convention has been dropped
+because it was complicated and doesn't work well on register-poor
+architectures. It has been partly replaced by unboxed
+tuples~\ref{sect:unboxed-tuples} which allow the programmer to
+explicitly state where results should be returned in registers (or on
+the stack) instead of on the heap.
+
\end{itemize}
\subsection{Wish list}
only anticipate one, however.
\end{itemize}
-\subsection{Terminology}
+\subsection{Glossary}
+
+\ToDo{This terminology is not used consistently within the document.
+If you find soemthing which disagrees with this terminology, fix the
+usage.}
\begin{itemize}
\item A {\em pointer} is (at least) 32 bits and big enough to hold a
function pointer or a data pointer.
+\item A {\em boxed} type is one whose elements are heap allocated.
+
+\item An {\em unboxed} type is one whose elements are {\em not} heap allocated.
+
+\item A {\em pointed} type is one that contains $\bot$. Variables with
+pointed types are the only things which can be lazily evaluated. In
+the STG machine, this means that they are the only things that can be
+{\em entered} or {\em updated} and it requires that they be boxed.
+
+\item An {\em unpointed} type is one that does not contains $\bot$.
+Variables with unpointed types are never delayed --- they are always
+evaluated when they are constructed. In the STG machine, this means
+that they cannot be {\em entered} or {\em updated}. Unpointed objects
+may be boxed (like @Array#@) or unboxed (like @Int#@).
+
\item A {\em closure} is a (representation of) a value of a {\em pointed}
- type. It may be in HNF or it may be unevaluated --- in either case, you can
- try to evaluate it again.
+type. It may be in HNF or it may be unevaluated --- in either case, you can
+try to evaluate it again.
\item A {\em thunk} is a (representation of) a value of a {\em pointed}
- type which is {\em not} in HNF.
+type which is {\em not} in HNF.
+
+\item A {\em value} is an object in HNF. It can be pointed or unpointed.
\end{itemize}
% More terminology to mention.
% unboxed, unpointed
-There are a few other system invariants which need to be mentioned ---
-though not necessarily here:
+\subsection{Invariants}
+
+There are a few system invariants which need to be mentioned ---
+though this is probably the wrong place for them.
\begin{itemize}
\end{itemize}
+\section{Source Language}
+
+\subsection{Unboxed tuples}\label{sect:unboxed-tuples}
+
+Functions can take multiple arguments as easily as they can take one
+argument: there's no cost for adding another argument. But functions
+can only return one result: the cost of adding a second ``result'' is
+that the function must construct a tuple of ``results'' on the heap.
+The assymetry is rather galling and can make certain programming
+styles quite expensive. For example, consider a simple state transformer
+monad:
+@
+> type S a = State -> (a,State)
+> bindS m k s0 = case m s0 of { (a,s1) -> k a s1 }
+> returnS a s = (a,s)
+> getS s = (s,s)
+> setS s _ = ((),s)
+@
+Here, every use of @returnS@, @getS@ or @setS@ constructs a new tuple
+in the heap which is instantly taken apart (and becomes garbage) by
+the case analysis in @bind@. Even a short state-transformer program
+will construct a lot of these temporary tuples.
+
+Unboxed tuples provide a way for the programmer to indicate that they
+do not expect a tuple to be shared and that they do not expect it to
+be allocated in the heap. Syntactically, unboxed tuples are just like
+single constructor datatypes except for the annotation @unboxed@.
+@
+> data unboxed AAndState# a = AnS a State
+> type S a = State -> AAndState# a
+> bindS m k s0 = case m s0 of { AnS a s1 -> k a s1 }
+> returnS a s = AnS a s
+> getS s = AnS s s
+> setS s _ = AnS () s
+@
+Semantically, unboxed tuples are just unlifted tuples and are subject
+to the same restrictions as other unpointed types.
+
+Operationally, unboxed tuples are never built on the heap. When
+unboxed tuples are returned, they are returned in multiple registers
+or multiple stack slots. At first sight, this seems a little strange
+but it's no different from passing double precision floats in two
+registers.
+
+Note that unboxed tuples can only have one constructor and that
+thunks never have unboxed types --- so we'll never try to update an
+unboxed constructor. The restriction to a single constructor is
+largely to avoid garbage collection complications.
+
+
%-----------------------------------------------------------------------------
\part{Evaluation Model}
\section{Compiled Execution}
\Arg{2} \ldots \Arg{m+1} and jumps to the fast entry point.
-\item {\em Unknown function closure or thunk.}
+\item {\em Unknown function closure, thunk or constructor.}
Sometimes, the function being called is not statically identifiable.
Consider, for example, the @compose@ function:
(section~\ref{sect:activation-records}) --- should a garbage collection
be required.
-\item If @x@ has a pointed type (e.g.~a data constructor or a function),
+\item If @x@ has a boxed type (e.g.~a data constructor or a function),
a pointer to @x@ is returned in \Arg{1}.
\ToDo{Warn that components of E should be extracted as soon as
possible to avoid a space leak.}
-\item If @x@ is an unpointed type (e.g.~@Int#@ or @Float#@), @x@ is
+\item If @x@ is an unboxed type (e.g.~@Int#@ or @Float#@), @x@ is
returned in \Arg{1}
-\item If @x@ is an unpointed tuple constructor, the components of @x@
+\item If @x@ is an unboxed tuple constructor, the components of @x@
are returned in \Arg{1} \ldots \Arg{n} but no object is constructed in
-the heap. Unboxed tuple constructors are useful for functions which
-want to return multiple values such as those used in an (explicitly
-encoded) state monad:
-
-\ToDo{Move stuff about unboxed tuples to a separate section}
-
-@
-data unpointed AAndState a = AnS a State
-type S a = State -> AAndState a
-
-bindS m k s0 = case m s0 of { AnS s1 a -> k s1 a }
-returnS a s = AnS a s
-@
-
-Note that unboxed datatypes can only have one constructor and that
-thunks never have unboxed types --- so we'll never try to update an
-unboxed constructor. Unboxed tuples are \emph{never} built on the
-heap.
+the heap.
When passing an unboxed tuple to a function, the components are
flattened out and passed in \Arg{1} \ldots \Arg{n} as usual.
\subsection{Updates}
\label{sect:data-updates}
-The entry code for an updatable thunk (which must also be of arity 0):
+The entry code for an updatable thunk (which must be of arity 0):
\begin{itemize}
\item copies the free variables out of the thunk into registers or
return address on the stack.
If we knew that the scrutinee was already evaluated we could generate
-(better) code which simply jumps to the appropriate branch of the @case@
-with a pointer to the scrutinee in \Arg{1}.
+(better) code which simply jumps to the appropriate branch of the
+@case@ with a pointer to the scrutinee in \Arg{1}. (For direct
+returns to multiconstructor datatypes, we might also load the tag into
+\Arg{2}).
An obvious idea, therefore, is to test dynamically whether the heap
closure is a value (using the tag in the info table). If not, we
\note{I reckon these deserve a subsection of their own}
-Don't move heap pointer before success occurs.
+The storage manager detects that it needs to garbage collect the old
+generation when the evaluator requests a garbage collection without
+having moved the heap pointer since the last garbage collection. It
+is therefore important that the GC routines {\em not} move the heap
+pointer unless the heap check fails. This is different from what
+happens in the current STG implementation.
+
Talk about how stack check looks ahead into the branches of case expressions.
+
\subsection{Handling interrupts/signals}
@
\label{sect:hugs-calling-conventions}
The calling convention for any byte-code function is straightforward:
-
\begin{itemize}
\item Push any arguments on the stack.
\item Push a pointer to the BCO.
\item Begin interpreting the byte code.
\end{itemize}
-The @ENTER@ byte-code instruction decides how to enter its argument.
-The object being entered must be either
+But it isn't enough for the bytecode interpreter to handle just
+byte-code functions specially. At a minimum, it must also be able to
+enter constructors too because of the way that GHC returns to
+Hugs-compiled return addresses (Section~\ref{sect:ghc-to-hugs-return}).
+
+In principle, all other closure types could be handled by switching to
+the compiled world (as described in
+Section~\ref{sect:hugs-to-ghc-closure}) and entering the closure
+there. This would work but it would obviously be very inefficient if
+we entered a @HUGS_AP@ by switching worlds, entering the @HUGS_AP@,
+pushing the arguments and function onto the stack, and entering the
+function which, likely as not, will be a byte-code object which we
+will enter by \emph{returning} to the byte-code interpreter. To avoid
+such gratuitious world switching, we choose to recognise certain
+closure types as being ``standard'' --- and duplicate the entry code
+for the ``standard closures'' in the bytecode interpreter.
+
+A closure is said to be ``standard'' if its entry code is entirely
+determined by its info table. \emph{Standard Closures} have the
+desirable property that the byte-code interpreter can enter
+the closure by simply ``interpreting'' the info table instead of
+switching to the compiled world. The standard closures include:
-\begin{itemize}
-\item A BCO,
-\item A @HUGS_AP@,
-\item A @HUGS_PAP@,
-\item A constructor,
-\item A GHC-built closure, or
-\item An indirection.
-\end{itemize}
+\begin{description}
+\item[@HUGS_AP@]
+To enter a @HUGS_AP@ we push an update frame, push the values from the
+@HUGS_AP@ on the stack, and enter its associated object.
+
+\item[@HUGS_PAP@]
+To enter a @HUGS_PAP@, we push its values on the stack and enter the
+first one.
+
+\item[@PAP@]
+Same as @HUGS_PAP@.
+
+\item[Constructor]
+To enter a constructor, we simply return (see Section
+\ref{sect:hugs-return-convention}).
+
+\item[Indirection]
+To enter an indirection, we simply enter the object it points to
+after possibly adjusting the current cost centre.
+
+\item[Selector]
+To enter a selector, we test whether the selectee is a value. If so,
+we simply select the appropriate component; if not, it's simplest to
+treat it as a GHC-built closure --- though we could interpret it if we
+wanted.
+
+\end{description}
+
+The most obvious omissions from the above list are @BCO@s (which we
+dealt with above) and GHC-built closures (which are covered in Section
+\ref{sect:hugs-to-ghc-closure}).
-If @ENTER@ is applied to a BCO, we just begin interpreting the
-byte-code contained therein. If the object is an @HUGS_AP@, we push an
-update frame, push the values from the @HUGS_AP@ on the stack, and enter
-its associated object. If the object is a @HUGS_PAP@, we push its
-values on the stack and enter the first one. If the object is a
-constructor, we simply return (see Section
-\ref{sect:hugs-return-convention}). The fourth case is convered in
-Section \ref{sect:hugs-to-ghc-closure}. If the object is an
-indirection, we simply enter the object it points to.
\subsection{Return convention}
\label{sect:hugs-return-convention}
\begin{center}
\input{hugs_ret.pstex_t}
\end{center}
-\caption{Stack layout for a hugs return address}
+\caption{Stack layout for a Hugs return address}
\label{fig:hugs-return-stack}
\end{figure}
\begin{center}
\input{hugs_ret2.pstex_t}
\end{center}
-\caption{Stack layout on enterings a hugs return address}
+\caption{Stack layout on enterings a Hugs return address}
\label{fig:hugs-return2}
\end{figure}
pick a runnable thread;
do {
switch (thread->whatNext) {
- case (RunGHC pc): status=runGHC(pc); break;
- case (RunHugs bc): status=runHugs(bc); break;
+ case (EnterGHC pc): status=runGHC(pc); break;
+ case (EnterHugs bc): status=runHugs(bc); break;
}
switch (status) { // handle local problems
case (StackOverflow): enlargeStack; break;
\label{fig:closure}
\end{figure}
-Every {\em heap object}, or {\em closure} is a contiguous block
+Every {\em heap object} is a contiguous block
of memory, consisting of a fixed-format {\em header} followed
by zero or more {\em data words}.
-The header consists of
-the following fields:
+
+\ToDo{I absolutely do not believe that every heap object has a header
+like this - ADR. I believe that they all have an info pointer but I
+see no readon why stack objects and unpointed heap objects would have
+an entry count since this will always be zero.}
+
+The header consists of the following fields:
\begin{itemize}
\item A one-word {\em info pointer}, which points to
the object's static {\em info table}.
A second useful classification is this:
\begin{itemize}
\item
-{\em Executive closures}, such as thunks and data constructors,
+{\em Executive objects}, such as thunks and data constructors,
participate directly in a program's execution. They can be subdivided into
-two kinds of objects according to their type:
+three kinds of objects according to their type:
\begin{itemize}
\item
{\em Pointed objects}, represent values of a {\em pointed} type
\end{itemize}
-\item {\em Administrative closures}, such as stack objects and thread
+\item {\em Administrative objects}, such as stack objects and thread
state objects, do not represent values in the original program.
\end{itemize}
@BCO_CAF@ & & 1 & & & 1 & & & & & \ref{sect:BCO} \\
@HUGS_AP@ & & 1 & & & 1 & & & & & \ref{sect:HUGS-AP} \\
-@HUGS_PAP@ & & & 1 & & & & & & & \ref{sect:HUGS-AP} \\
+@HUGS_PAP@ & 1 & & 1 & & & & & & & \ref{sect:HUGS-AP} \\
@PAP@ & 1 & & 1 & & & & & & & \ref{sect:PAP} \\
{\em Fixed header} & {\em No of arg words} & {\em Function closure} & {\em Arg stack} \\ \hline
\end{tabular}
\end{center}
-The ``arg stack'' is a copy of of the chunk of stack above the update
+The ``arg stack'' is a copy of the chunk of stack above the update
frame; ``no of arg words'' tells how many words it consists of. The
function closure is (a pointer to) the closure for the function whose
argument-satisfaction check failed.
the expression
@
let xs = [1..1000] in last xs
-@
+@
where @last@ is a function that returns the last element of its
argument list. When the thunk is entered it will call @last@, which
will consume @xs@ until it finds the last element. Since the list
\fi
-\subsection{Space leaks and selectors}
+\subsection{Space leaks and selectors}\label{sect:space-leaks-and-selectors}
\iffalse
A very similar idea, dubbed ``stingy evaluation'', is described
by <.stingy.>.
+\ToDo{Simple generalisation: handle all the ``standard closures'' this way.}
+
<.sparud lazy pattern matching.> describes another solution to the
lazy-pattern-matching
problem. His solution involves adding code to the two thunks for