\marginparsep 0 in
\sloppy
+\usepackage{epsfig}
+
\newcommand{\note}[1]{{\em Note: #1}}
% DIMENSION OF TEXT:
\textheight 8.5 in
\title{The STG runtime system (revised)}
\author{Simon Peyton Jones \\ Glasgow University and Oregon Graduate Institute \and
+Simon Marlow \\ Glasgow University \and
Alastair Reid \\ Yale University}
\maketitle
\tableofcontents
\newpage
-\section{Introduction}
+\part{Introduction}
+\section{Overview}
This document describes the GHC/Hugs run-time system. It serves as
a Glasgow/Yale/Nottingham ``contract'' about what the RTS does.
\end{itemize}
-
-\section{The Scheduler}
-
-The Scheduler is the heart of the run-time system. A running program
-consists of a single running thread, and a list of runnable and
-blocked threads. The running thread returns to the scheduler when any
-of the following conditions arises:
-
-\begin{itemize}
-\item A heap check fails, and a garbage collection is required
-\item Compiled code needs to switch to interpreted code, and vice
-versa.
-\item The thread becomes blocked.
-\item The thread is preempted.
-\end{itemize}
-
-A running system has a global state, consisting of
-
-\begin{itemize}
-\item @Hp@, the current heap pointer, which points to the next
-available address in the Heap.
-\item @HpLim@, the heap limit pointer, which points to the end of the
-heap.
-\item The Thread Preemption Flag, which is set whenever the currently
-running thread should be preempted at the next opportunity.
-\end{itemize}
-
-Each thread has a thread-local state, which consists of
-
-\begin{itemize}
-\item @TSO@, the Thread State Object for this thread. This is a heap
-object that is used to store the current thread state when the thread
-is blocked or sleeping.
-\item @Sp@, the current stack pointer.
-\item @Su@, the current stack update frame pointer. This register
-points to the most recent update frame on the stack, and is used to
-calculate the number of arguments available when entering a function.
-\item @SpLim@, the stack limit pointer. This points to the end of the
-current stack chunk.
-\item Several general purpose registers, used for passing arguments to
-functions.
-\end{itemize}
-
-\noindent and various other bits of information used in specialised
-circumstances, such as profiling and parallel execution. These are
-described in the appropriate sections.
-
-The following is pseudo-code for the inner loop of the scheduler
-itself.
-
-@
-while (threads_exist) {
- // handle global problems: GC, parallelism, etc
- if (need_gc) gc();
- if (external_message) service_message();
- // deal with other urgent stuff
-
- pick a runnable thread;
- do {
- switch (thread->whatNext) {
- case (RunGHC pc): status=runGHC(pc); break;
- case (RunHugs bc): status=runHugs(bc); break;
- }
- switch (status) { // handle local problems
- case (StackOverflow): enlargeStack; break;
- case (Error e) : error(thread,e); break;
- case (ExitWith e) : exit(e); break;
- case (Yield) : break;
- }
- } while (thread_runnable);
-}
-@
-
-Optimisations to avoid excess trampolining from Hugs into itself.
-How do we invoke GC, ccalls, etc.
-General ccall (@ccall-GC@) and optimised ccall.
-
-\section{Evaluation}
+%-----------------------------------------------------------------------------
+\part{Evaluation Model}
+\section{Compiled Execution}
This section describes the framework in which compiled code evaluates
expressions. Only at certain points will compiled code need to be
able to talk to the interpreted world; these are discussed in Section
-\ref{sec:hugs-ghc-interaction}.
+\ref{sect:switching-worlds}.
\subsection{Calling conventions}
May have to revert black holes - ouch!
@
+\section{Interpreted Execution}
+
+This section describes how the Hugs interpreter interprets code in the
+same environment as compiled code executes. Both evaluation models
+use a common garbage collector, so they must agree on the form of
+objects in the heap.
+
+Hugs interprets code by converting it to byte-code and applying a
+byte-code interpreter to it. Wherever possible, we try to ensure that
+the byte-code is all that is required to interpret a section of code.
+This means not dynamically generating info tables, and hence we can
+only have a small number of possible heap objects each with a staticly
+compiled info table. Similarly for stack objects: in fact we only
+have one Hugs stack object, in which all information is tagged for the
+garbage collector.
+
+There is, however, one exception to this rule. Hugs must generate
+info tables for any constructors it is asked to compile, since the
+alternative is to force a context-switch each time compiled code
+enters a Hugs-built constructor, which would be prohibitively
+expensive.
+
+\subsection{Hugs Heap Objects}
+\label{sect:hugs-heap-objects}
+
+\subsubsection{Byte-Code Objects}
+
+Compiled byte code lives on the global heap, in objects called
+Byte-Code Objects (or BCOs). The layout of BCOs is described in
+detail in Section \ref{sect:BCO}, in this section we will describe
+their semantics.
+
+Since byte-code lives on the heap, it can be garbage collected just
+like any other heap-resident data. Hugs maintains a table of
+currently live BCOs, which is treated as a table of live pointers by
+the garbage collector. When a module is unloaded, the pointers to its
+BCOs are removed from the table, and the code will be garbage
+collected some time later.
+
+A BCO represents a basic block of code - all entry points are at the
+beginning of a BCO, and it is impossible to jump into the middle of
+one. A BCO represents not only the code for a function, but also its
+closure; a BCO can be entered just like any other closure. Hugs
+performs lambda-lifting during compilation to byte-code, and each
+top-level combinator becomes a BCO in the heap.
+
+\subsubsection{Thunks and partial applications}
+
+A thunk consists of a code pointer, and values for the free variables
+of that code. Since Hugs byte-code is lambda-lifted, free variables
+become arguments and are expected to be on the stack by the called
+function.
+
+Hugs represents thunks with an @HUGS_AP@ object. The @HUGS_AP@ object
+contains one or more pointers to other heap objects. When it is
+entered, it pushes an update frame followed by its payload on the
+stack, and enters the first word (which will be a pointer to a BCO).
+The layout of @HUGS_AP@ objects is described in more detail in Section
+\ref{sect:HUGS-AP}.
+
+Partial applications are represented by @HUGS_PAP@ objects, which are
+identical to @HUGS_AP@s except that they are non-updatable.
+
+\ToDo{Hugs Constructors}.
+
+\subsection{Calling conventions}
+\label{sect:hugs-calling-conventions}
+
+The calling convention for any byte-code function is straightforward:
+
+\begin{itemize}
+\item Push any arguments on the stack.
+\item Push a pointer to the BCO.
+\item Begin interpreting the byte code.
+\end{itemize}
+
+The @ENTER@ byte-code instruction decides how to enter its argument.
+The object being entered must be either
+
+\begin{itemize}
+\item A BCO,
+\item A @HUGS_AP@,
+\item A @HUGS_PAP@,
+\item A constructor,
+\item A GHC-built closure, or
+\item An indirection.
+\end{itemize}
+
+If @ENTER@ is applied to a BCO, we just begin interpreting the
+byte-code contained therein. If the object is an @HUGS_AP@, we push an
+update frame, push the values from the @HUGS_AP@ on the stack, and enter
+its associated object. If the object is a @HUGS_PAP@, we push its
+values on the stack and enter the first one. If the object is a
+constructor, we simply return (see Section
+\ref{sect:hugs-return-convention}). The fourth case is convered in
+Section \ref{sect:hugs-to-ghc-closure}. If the object is an
+indirection, we simply enter the object it points to.
+
+\subsection{Return convention}
+\label{sect:hugs-return-convention}
+
+When Hugs pushes a return address, it pushes both a pointer to the BCO
+to return to, and a pointer to a static code fragment @HUGS_RET@ (this
+will be described in Section \ref{sect:ghc-to-hugs-return}). The
+stack layout is shown in Figure \ref{fig:hugs-return-stack}.
+
+\begin{figure}
+\begin{center}
+\input{hugs_ret.pstex_t}
+\end{center}
+\caption{Stack layout for a hugs return address}
+\label{fig:hugs-return-stack}
+\end{figure}
+
+\begin{figure}
+\begin{center}
+\input{hugs_ret2.pstex_t}
+\end{center}
+\caption{Stack layout on enterings a hugs return address}
+\label{fig:hugs-return2}
+\end{figure}
+
+When a Hugs byte-code sequence is returning, it first places the
+return value on the stack. It then examines the return address (now
+the second word on the stack):
+
+\begin{itemize}
+
+\item If the return address is @HUGS_RET@, rearrange the stack so that
+it has the returned object followed by the pointer to the BCO at the
+top, then enter the BCO (Figure \ref{fig:hugs-return2}).
+
+\item If the top of the stack is not @HUGS_RET@, we need to do a world
+switch as described in Section \ref{sect:hugs-to-ghc-return}.
+
+\end{itemize}
+
+
+\section{The Scheduler}
+
+The Scheduler is the heart of the run-time system. A running program
+consists of a single running thread, and a list of runnable and
+blocked threads. The running thread returns to the scheduler when any
+of the following conditions arises:
+
+\begin{itemize}
+\item A heap check fails, and a garbage collection is required
+\item Compiled code needs to switch to interpreted code, and vice
+versa.
+\item The thread becomes blocked.
+\item The thread is preempted.
+\end{itemize}
+
+A running system has a global state, consisting of
+
+\begin{itemize}
+\item @Hp@, the current heap pointer, which points to the next
+available address in the Heap.
+\item @HpLim@, the heap limit pointer, which points to the end of the
+heap.
+\item The Thread Preemption Flag, which is set whenever the currently
+running thread should be preempted at the next opportunity.
+\item A list of runnable threads.
+\item A list of blocked threads.
+\end{itemize}
+
+Each thread is represented by a Thread State Object (TSO), which is
+described in detail in Section \ref{sect:TSO}.
+
+The following is pseudo-code for the inner loop of the scheduler
+itself.
+
+@
+while (threads_exist) {
+ // handle global problems: GC, parallelism, etc
+ if (need_gc) gc();
+ if (external_message) service_message();
+ // deal with other urgent stuff
+
+ pick a runnable thread;
+ do {
+ switch (thread->whatNext) {
+ case (RunGHC pc): status=runGHC(pc); break;
+ case (RunHugs bc): status=runHugs(bc); break;
+ }
+ switch (status) { // handle local problems
+ case (StackOverflow): enlargeStack; break;
+ case (Error e) : error(thread,e); break;
+ case (ExitWith e) : exit(e); break;
+ case (Yield) : break;
+ }
+ } while (thread_runnable);
+}
+@
+
+Optimisations to avoid excess trampolining from Hugs into itself.
+How do we invoke GC, ccalls, etc.
+General ccall (@ccall-GC@) and optimised ccall.
+
\section{Switching Worlds}
+\label{sect:switching-worlds}
+
Because this is a combined compiled/interpreted system, the
interpreter will sometimes encounter compiled code, and vice-versa.
-There are six cases we need to consider:
+All world-switches go via the scheduler, ensuring that the world is in
+a known state ready to enter either compiled code or the interpreter.
+When a thread is run from the scheduler, the @whatNext@ field in the
+TSO (Section \ref{sect:TSO}) is checked to find out how to execute the
+thread.
+
+\begin{itemize}
+\item If @whatNext@ is set to @ReturnGHC@, we load up the required
+registers from the TSO and jump to the address at the top of the user
+stack.
+\item If @whatNext@ is set to @EnterGHC@, we load up the required
+registers from the TSO and enter the closure pointed to by the top
+word of the stack.
+\item If @whatNext@ is set to @EnterHugs@, we enter the top thing on
+the stack, using the interpreter.
+\end{itemize}
+
+There are four cases we need to consider:
\begin{enumerate}
-\item A GHC thread enters a Hugs-built thunk.
-\item A GHC thread calls a Hugs-compiled function.
+\item A GHC thread enters a Hugs-built closure.
\item A GHC thread returns to a Hugs-compiled return address.
-\item A Hugs thread enters a GHC-built thunk.
-\item A Hugs thread calls a GHC-compiled function.
+\item A Hugs thread enters a GHC-built closure.
\item A Hugs thread returns to a Hugs-compiled return address.
\end{enumerate}
-\subsection{A GHC thread enters a Hugs-built thunk}
+GHC-compiled modules cannot call functions in a Hugs-compiled module
+directly, because the compiler has no information about arities in the
+external module. Therefore it must assume any top-level objects are
+CAFs, and enter their closures.
-A Hugs-built thunk looks like this:
+\ToDo{dynamic linking stuff}
+\ToDo{Hugs-built constructors?}
-\begin{center}
-\begin{tabular}{|l|l|}
-\hline
-\emph{Hugs} & \emph{Hugs-specific information} \\
-\hline
-\end{tabular}
-\end{center}
+We now examine the various cases one by one and describe how the
+switch happens in each situation.
+
+\subsection{A GHC thread enters a Hugs-built closure}
+\label{sect:ghc-to-hugs-closure}
-\noindent where \emph{Hugs} is a pointer to a small
-statically-compiled piece of code that does the following:
+There are three possibilities: GHC has entered the BCO directly (for a
+top-level function closure), it has entered a @HUGS_AP@, or it has
+entered a @HUGS_PAP@.
+
+The code for all three objects is the same:
\begin{itemize}
-\item Push the address of the thunk on the stack.
-\item Push @entertop@ on the stack.
-\item Save the current state of the thread in the TSO.
-\item Return to the scheduler, with the @whatNext@ field set to
-@RunHugs@.
+\item Push the address of the object entered on the stack.
+\item Save the current state of the thread in its TSO.
+\item Return to the scheduler, setting @whatNext@ to @EnterHugs@.
\end{itemize}
-\noindent where @entertop@ is a small statically-compiled piece of
-code that does the following:
+\subsection{A GHC thread returns to a Hugs-compiled return address}
+\label{sect:ghc-to-hugs-return}
+
+Hugs return addresses are laid out as in Figure
+\ref{fig:hugs-return-stack}. If GHC is returning, it will return to
+the address at the top of the stack, namely @HUGS_RET@. The code at
+@HUGS_RET@ performs the following:
\begin{itemize}
-\item pop the return address from the stack.
-\item pop the next word off the stack into \Arg{1}.
-\item enter \Arg{1}.
+\item pushes \Arg{1} (the return value) on the stack.
+\item saves the thread state in the TSO
+\item returns to the scheduler with @whatNext@ set to @EnterHugs@.
\end{itemize}
-The infotable for @entertop@ has some byte-codes attached that do
-essentially the same thing if the code is entered from Hugs.
+\noindent When Hugs runs, it will enter the return value, which will
+return using the correct Hugs convention (Section
+\ref{sect:hugs-return-convention}) to the return address underneath it
+on the stack.
-\subsection{A GHC thread calls a Hugs-compiled function}
+\subsection{A Hugs thread enters a GHC-compiled closure}
+\label{sect:hugs-to-ghc-closure}
-How do we do this?
-
-\subsection{A GHC thread returns to a Hugs-compiled return address}
+Hugs can recognise a GHC-built closure as not being one of the
+following types of object:
-\subsection{A Hugs thread enters a GHC-compiled thunk}
+\begin{itemize}
+\item A @BCO@,
+\item A @HUGS_AP@,
+\item A @HUGS_PAP@,
+\item An indirection, or
+\item A constructor.
+\end{itemize}
-When Hugs is called on to enter a non-Hugs closure (these are
-recognisable by the lack of a \emph{Hugs} pointer at the front), the
-following sequence of instructions is executed:
+When Hugs is called on to enter a GHC closure, it executes the
+following sequence of instructions:
\begin{itemize}
-\item Push the address of the thunk on the stack.
-\item Push @entertop@ on the stack.
+\item Push the address of the closure on the stack.
\item Save the current state of the thread in the TSO.
\item Return to the scheduler, with the @whatNext@ field set to
-@RunGHC@.
+@EnterGHC@.
\end{itemize}
-\subsection{A Hugs thread calls a GHC-compiled function}
+\subsection{A Hugs thread returns to a GHC-compiled return address}
+\label{sect:hugs-to-ghc-return}
-Hugs never calls GHC-functions directly, it only enters closures
-(which point to the slow entry point for the function). Hence in this
-case, we just push the arguments on the stack and proceed as for a
-thunk.
+When hugs encounters a return address on the stack that is not
+@HUGS_RET@, it knows that a world-switch is required. At this point
+the stack contains a pointer to the return value, followed by the GHC
+return address. The following sequence is then performed:
-\subsection{A Hugs thread returns to a GHC-compiled return address}
+\begin{itemize}
+\item save the state of the thread in the TSO.
+\item return to the scheduler, setting @whatNext@ to @EnterGHC@.
+\end{itemize}
+The first thing that GHC will do is enter the object on the top of the
+stack, which is a pointer to the return value. This value will then
+return itself to the return address using the GHC return convention.
+
+\part{Implementation}
\section{Heap objects}
\label{sect:fixed-header}
\item Statistics (e.g. a word to track how many times a thunk is entered.).
We add a Ticky word to the fixed-header part of closures. This is
-used to record indicate if a closure has been updated but not yet
-entered. It is set when the closure is updated and cleared when
-subsequently entered.
+used to indicate if a closure has been updated but not yet entered. It
+is set when the closure is updated and cleared when subsequently
+entered.
NB: It is {\em not} an ``entry count'', it is an
``entries-after-update count.'' The commoning up of @CONST@,
@CHARLIKE@ and @INTLIKE@ closures is turned off(?) if this is
required. This has only been done for 2s collection.
-
-
\end{itemize}
\end{itemize}
+
Most of the RTS is completely insensitive to the number of admin words.
The total size of the fixed header is @FIXED_HS@.
\hline Parallelism Info
\\ \hline Profile Info
\\ \hline Debug Info
-\\ \hline Tag/bytecode pointer
-\\ \hline Static reference table
+\\ \hline Tag / Static reference table
\\ \hline Storage manager layout info
\\ \hline Closure type
-\\ \hline entry code \ldots
-\\ \hline
+\\ \hline entry code
+\\ \vdots
\end{tabular}
\end{center}
An info table has the following contents (working backwards in memory
precise layout, for the benefit of the garbage collector and the code
that stuffs graph into packets for transmission over the network.
-\item A one-pointer {\em Static Reference Table (SRT) pointer}, @INFO_SRT@, points to
-a table which enables the garbage collector to identify all accessible
-code and CAFs. They are fully described in Section~\ref{sect:srt}.
-
-\item A one-pointer {\em tag/bytecode-pointer} field, @INFO_TAG@ or @INFO_BC@.
-For data constructors this field contains the constructor tag, in the
-range $0..n-1$ where $n$ is the number of constructors.
-
-For other objects that can be entered this field points to the byte
-codes for the object. For the constructor case you can think of the
-tag as the name of a a suitable bytecode sequence but it can also be used to
-implement semi-tagging (section~\ref{sect:semi-tagging}).
-
-One awkward question (which may not belong here) is ``how does the
-bytecode interpreter know whether to do a vectored return?'' The
-answer is it examines the @INFO_TYPE@ field of the return address:
-@RET_VEC_@$sz$ requires a vectored return and @RET_@$sz$ requires a
-direct return.
+\item A one-word {\em Tag/Static Reference Table} field, @INFO_SRT@.
+For data constructors, this field contains the constructor tag, in the
+range $0..n-1$ where $n$ is the number of constructors. For all other
+objects it contains a pointer to a table which enables the garbage
+collector to identify all accessible code and CAFs. They are fully
+described in Section~\ref{sect:srt}.
\item {\em Profiling info\/}
\end{itemize}
-
-\section{Kinds of Heap Object}
+%-----------------------------------------------------------------------------
+\subsection{Kinds of Heap Object}
\label{sect:closures}
Heap objects can be classified in several ways, but one useful one is
{\em Pointed} \\
\hline
-@CONSTR@ & 1 & & 1 & & & & & & & \ref{sect:CONSTR} \\
-@CONSTR_STATIC@ & 1 & & 1 & 1 & & & & & & \ref{sect:CONSTR} \\
-@CONSTR_STATIC_NOCAF@ & 1 & & 1 & 1 & & & & & & \ref{sect:CONSTR} \\
-
-@FUN@ & 1 & & ? & & & & & & & \ref{sect:FUN} \\
-@FUN_STATIC@ & 1 & & ? & 1 & & & & & & \ref{sect:FUN} \\
-
-@THUNK@ & 1 & 1 & & & 1 & & & & & \ref{sect:THUNK} \\
-@THUNK_STATIC@ & 1 & 1 & & 1 & 1 & & & & & \ref{sect:THUNK} \\
-@THUNK_SELECTOR@ & 1 & 1 & 1 & & 1 & & & & & \ref{sect:THUNK_SEL} \\
-
-@PAP@ & 1 & & ? & & & & & & & \ref{sect:PAP} \\
-
-@IND@ & & & 1 & & ? & & & & 1 & \ref{sect:IND} \\
-@IND_OLDGEN@ & 1 & & 1 & & ? & & & & 1 & \ref{sect:IND} \\
-@IND_PERM@ & & & 1 & & ? & & & & 1 & \ref{sect:IND} \\
-@IND_OLDGEN_PERM@ & 1 & & 1 & & ? & & & & 1 & \ref{sect:IND} \\
-@IND_STATIC@ & ? & & 1 & 1 & ? & & & & 1 & \ref{sect:IND} \\
-
-\hline
-{\em Unpointed} \\
-\hline
-
-
-@ARR_WORDS@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:ARR_WORDS1},\ref{sect:ARR_WORDS2} \\
-@ARR_PTRS@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:ARR_PTRS} \\
-@MUTVAR@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:MUTVAR} \\
-@MUTARR_PTRS@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:MUTARR_PTRS} \\
-@MUTARR_PTRS_FROZEN@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:MUTARR_PTRS_FROZEN} \\
-
-@FOREIGN@ & 1 & & 1 & & & & 1 & & & \ref{sect:FOREIGN} \\
-
-@BH@ & ? & 0/1 & 1 & & ? & ? & & 1 & ? & \ref{sect:BH} \\
-@MVAR@ & & & & & & & & & & \ref{sect:MVAR} \\
-@IVAR@ & & & & & & & & & & \ref{sect:IVAR} \\
-@FETCHME@ & & & & & & & & & & \ref{sect:FETCHME} \\
+@CONSTR@ & 1 & & 1 & & & & & & & \ref{sect:CONSTR} \\
+@CONSTR_STATIC@ & 1 & & 1 & 1 & & & & & & \ref{sect:CONSTR} \\
+@CONSTR_STATIC_NOCAF@ & 1 & & 1 & 1 & & & & & & \ref{sect:CONSTR} \\
+
+@FUN@ & 1 & & ? & & & & & & & \ref{sect:FUN} \\
+@FUN_STATIC@ & 1 & & ? & 1 & & & & & & \ref{sect:FUN} \\
+
+@THUNK@ & & 1 & & & 1 & & & & & \ref{sect:THUNK} \\
+@THUNK_STATIC@ & & 1 & & 1 & 1 & & & & & \ref{sect:THUNK} \\
+@THUNK_SELECTOR@ & & 1 & 1 & & 1 & & & & & \ref{sect:THUNK_SEL} \\
+
+@BCO@ & 1 & & 1 & & & & & & & \ref{sect:BCO} \\
+@BCO_CAF@ & & 1 & & & 1 & & & & & \ref{sect:BCO} \\
+
+@HUGS_AP@ & & 1 & & & 1 & & & & & \ref{sect:HUGS-AP} \\
+@HUGS_PAP@ & & & 1 & & & & & & & \ref{sect:HUGS-AP} \\
+
+@PAP@ & 1 & & 1 & & & & & & & \ref{sect:PAP} \\
+
+@IND@ & ? & & ? & & ? & & & & 1 & \ref{sect:IND} \\
+@IND_OLDGEN@ & ? & & ? & & ? & & & & 1 & \ref{sect:IND} \\
+@IND_PERM@ & ? & & ? & & ? & & & & 1 & \ref{sect:IND} \\
+@IND_OLDGEN_PERM@ & ? & & ? & & ? & & & & 1 & \ref{sect:IND} \\
+@IND_STATIC@ & ? & & ? & 1 & ? & & & & 1 & \ref{sect:IND} \\
+
+\hline
+{\em Unpointed} \\
+\hline
+
+
+@ARR_WORDS@ & 1 & & 1 & & & & 1 & & & \ref{sect:ARR_WORDS1},\ref{sect:ARR_WORDS2} \\
+@ARR_PTRS@ & 1 & & 1 & & & & 1 & & & \ref{sect:ARR_PTRS} \\
+@MUTVAR@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:MUTVAR} \\
+@MUTARR_PTRS@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:MUTARR_PTRS} \\
+@MUTARR_PTRS_FROZEN@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:MUTARR_PTRS_FROZEN} \\
+
+@FOREIGN@ & 1 & & 1 & & & & 1 & & & \ref{sect:FOREIGN} \\
+
+@BH@ & & 1 & 1 & & ? & ? & & 1 & ? & \ref{sect:BH} \\
+@MVAR@ & 1 & & 1 & & & & & & & \ref{sect:MVAR} \\
+@IVAR@ & 1 & & 1 & & & & & & & \ref{sect:IVAR} \\
+@FETCHME@ & 1 & & 1 & & & & & & & \ref{sect:FETCHME} \\
\hline
\end{tabular}
+\subsection{Hugs Objects}
+
+\subsubsection{Byte-Code Objects}
+\label{sect:BCO}
+
+A Byte-Code Object (BCO) is a container for a a chunk of byte-code,
+which can be executed by Hugs. The byte-code represents a
+supercombinator in the program: when hugs compiles a module, it
+performs lambda lifting and each resulting supercombinator becomes a
+byte-code object in the heap.
+
+There are two kinds of BCO: a standard @BCO@ which has an arity of one
+or more, and a @BCO_CAF@ which takes no arguments and can be updated.
+When a @BCO_CAF@ is updated, the code is thrown away!
+
+The semantics of BCOs are described in Section
+\ref{sect:hugs-heap-objects}. A BCO has the following structure:
+
+\begin{center}
+\begin{tabular}{|l|l|l|l|l|l|}
+\hline
+\emph{Fixed Header} & \emph{Layout} & \emph{Offset} & \emph{Size} &
+\emph{Literals} & \emph{Byte code} \\
+\hline
+\end{tabular}
+\end{center}
+
+\noindent where:
+\begin{itemize}
+\item The entry code is a static code fragment/info table that
+returns to the scheduler to invoke Hugs (Section
+\ref{sect:ghc-to-hugs-closure}).
+\item \emph{Layout} contains the number of pointer literals in the
+\emph{Literals} field.
+\item \emph{Offset} is the offset to the byte code from the start of
+the object.
+\item \emph{Size} is the number of words of byte code in the object.
+\item \emph{Literals} contains any pointer and non-pointer literals used in
+the byte-codes (including jump addresses), pointers first.
+\item \emph{Byte code} contains \emph{Size} words of non-pointer byte
+code.
+\end{itemize}
+
+\subsubsection{@HUGS_AP@ objects}
+\label{sect:HUGS-AP}
+
+There are two kinds of @HUGS_AP@ objects: a standard @HUGS_AP@, used
+to represent thunks buit by Hugs, and a @HUGS_PAP@, used for partial
+applications. The only difference between the two is that a
+@HUGS_PAP@ is non-updatable.
+
+\begin{center}
+\begin{tabular}{|l|l|l|l|}
+\hline
+\emph{Fixed Header} & \emph{BCO} & \emph{Layout} & \emph{Free Variables} \\
+\hline
+\end{tabular}
+\end{center}
+
+\noindent where:
+
+\begin{itemize}
+
+\item The entry code is a statically-compiled code fragment/info table
+that returns to the scheduler to invoke Hugs (Sections
+\ref{sect:ghc-to-hugs-closure}, \ref{sect:ghc-to-hugs-return}).
+
+\item \emph{BCO} is a pointer to the BCO for the thunk.
+
+\item \emph{Layout} contains the number of pointers and the size of
+the \emph{Free Variables} field.
+
+\item \emph{Free Variables} contains the free variables of the
+thunk/partial application/return address, pointers first.
+
+\end{itemize}
+
\subsection{Pointed Objects}
All pointed objects can be entered.
{\em Fixed header} & {\em Static object link} \\ \hline
\end{tabular}
\end{center}
-Static function closurs have no free variables. (However they may refer to other
+Static function closures have no free variables. (However they may refer to other
static closures; these references are recorded in the function closure's SRT.)
They have one field that is not present in dynamic closures, the {\em static object
link} field. This is used by the garbage collector in the same way that to-space
\end{tabular}
\end{center}
-The SRT pointer in a data constructor's info table is never used --- the
-code for a constructor does not make any static references.
-\note{Use it for something else?? E.g. tag?}
+The SRT pointer in a data constructor's info table is used for the
+constructor tag, since a constructor never has any static references.
There are several different sorts of constructor:
\begin{itemize}
\subsection{Internal workings of the Generational Collector}
+\section{Dynamic Linking}
\section{Profiling}