[project @ 1997-10-15 12:44:35 by simonm]

diff --git a/docs/rts/rts.verb b/docs/rts/rts.verb

index 25021c4..7a5a62e 100644
--- a/docs/rts/rts.verb
+++ b/docs/rts/rts.verb
@@ -20,6 +20,8 @@
  \marginparsep 0 in 
  \sloppy
  
+\usepackage{epsfig}
+
  \newcommand{\note}[1]{{\em Note: #1}}
  % DIMENSION OF TEXT:
  \textheight 8.5 in
@@ -54,7 +56,8 @@ Alastair Reid \\ Yale University}
  \tableofcontents
  \newpage
  
-\section{Introduction}
+\part{Introduction}
+\section{Overview}
  
  This document describes the GHC/Hugs run-time system.  It serves as 
  a Glasgow/Yale/Nottingham ``contract'' about what the RTS does.
@@ -176,89 +179,14 @@ the old generation is no bigger than the current new generation.
  
  \end{itemize}
  
-
-\section{The Scheduler}
-
-The Scheduler is the heart of the run-time system.  A running program
-consists of a single running thread, and a list of runnable and
-blocked threads.  The running thread returns to the scheduler when any
-of the following conditions arises:
-
-\begin{itemize}
-\item A heap check fails, and a garbage collection is required
-\item Compiled code needs to switch to interpreted code, and vice
-versa.
-\item The thread becomes blocked.
-\item The thread is preempted.
-\end{itemize}
-
-A running system has a global state, consisting of
-
-\begin{itemize}
-\item @Hp@, the current heap pointer, which points to the next
-available address in the Heap.
-\item @HpLim@, the heap limit pointer, which points to the end of the
-heap.
-\item The Thread Preemption Flag, which is set whenever the currently
-running thread should be preempted at the next opportunity.
-\end{itemize}
-
-Each thread has a thread-local state, which consists of
-
-\begin{itemize}
-\item @TSO@, the Thread State Object for this thread.  This is a heap
-object that is used to store the current thread state when the thread
-is blocked or sleeping.
-\item @Sp@, the current stack pointer.
-\item @Su@, the current stack update frame pointer.  This register
-points to the most recent update frame on the stack, and is used to
-calculate the number of arguments available when entering a function.
-\item @SpLim@, the stack limit pointer.  This points to the end of the
-current stack chunk.
-\item Several general purpose registers, used for passing arguments to
-functions.
-\end{itemize}
-
-\noindent and various other bits of information used in specialised
-circumstances, such as profiling and parallel execution.  These are
-described in the appropriate sections.
-
-The following is pseudo-code for the inner loop of the scheduler
-itself.
-
-@
-while (threads_exist) {
-  // handle global problems: GC, parallelism, etc
-  if (need_gc) gc();  
-  if (external_message) service_message();
-  // deal with other urgent stuff
-
-  pick a runnable thread;
-  do {
-    switch (thread->whatNext) {
-      case (RunGHC  pc): status=runGHC(pc);  break;
-      case (RunHugs bc): status=runHugs(bc); break;
-    }
-    switch (status) {  // handle local problems
-      case (StackOverflow): enlargeStack; break;
-      case (Error e)      : error(thread,e); break;
-      case (ExitWith e)   : exit(e); break;
-      case (Yield)        : break;
-    }
-  } while (thread_runnable);
-}
-@
-
-Optimisations to avoid excess trampolining from Hugs into itself.
-How do we invoke GC, ccalls, etc.
-General ccall (@ccall-GC@) and optimised ccall.
-
-\section{Evaluation}
+%-----------------------------------------------------------------------------
+\part{Evaluation Model}
+\section{Compiled Execution}
  
  This section describes the framework in which compiled code evaluates
  expressions.  Only at certain points will compiled code need to be
  able to talk to the interpreted world; these are discussed in Section
-\ref{sec:hugs-ghc-interaction}.
+\ref{sect:switching-worlds}.
  
  \subsection{Calling conventions}
  
@@ -742,172 +670,321 @@ May have to keep C stack pointer in register to placate OS?
  May have to revert black holes - ouch!
  @
  
-\section{Switching Worlds}
-\label{sect:switching-worlds}
+\section{Interpreted Execution}
+
+This section describes how the Hugs interpreter interprets code in the
+same environment as compiled code executes.  Both evaluation models
+use a common garbage collector, so they must agree on the form of
+objects in the heap.
+
+Hugs interprets code by converting it to byte-code and applying a
+byte-code interpreter to it.  Wherever possible, we try to ensure that
+the byte-code is all that is required to interpret a section of code.
+This means not dynamically generating info tables, and hence we can
+only have a small number of possible heap objects, each with a statically
+compiled info table.  Similarly for stack objects: in fact we only
+have one Hugs stack object, in which all information is tagged for the
+garbage collector.
+
+There is, however, one exception to this rule.  Hugs must generate
+info tables for any constructors it is asked to compile, since the
+alternative is to force a context-switch each time compiled code
+enters a Hugs-built constructor, which would be prohibitively
+expensive.
+
+\subsection{Hugs Heap Objects}
+\label{sect:hugs-heap-objects}
+
+\subsubsection{Byte-Code Objects}
+
+Compiled byte code lives on the global heap, in objects called
+Byte-Code Objects (or BCOs).  The layout of BCOs is described in
+detail in Section \ref{sect:BCO}; this section describes their
+semantics.
+
+Since byte-code lives on the heap, it can be garbage collected just
+like any other heap-resident data.  Hugs maintains a table of
+currently live BCOs, which is treated as a table of live pointers by
+the garbage collector.  When a module is unloaded, the pointers to its
+BCOs are removed from the table, and the code will be garbage
+collected some time later.
+
+A BCO represents a basic block of code - all entry points are at the
+beginning of a BCO, and it is impossible to jump into the middle of
+one.  A BCO represents not only the code for a function, but also its
+closure; a BCO can be entered just like any other closure.  Hugs
+performs lambda-lifting during compilation to byte-code, and each
+top-level combinator becomes a BCO in the heap.
  
-Because this is a combined compiled/interpreted system, the
-interpreter will sometimes encounter compiled code, and vice-versa.
+\subsubsection{Thunks}
  
-All world-switches go via the scheduler, ensuring that the world is in
-a known state ready to enter either compiled code or the interpreter.
-When a thread is run from the scheduler, the @whatNext@ field is
-checked to find out how to execute the thread.
+A thunk consists of a code pointer, and values for the free variables
+of that code.  Since Hugs byte-code is lambda-lifted, free variables
+become arguments and are expected to be on the stack by the called
+function.
+
+Hugs represents thunks with an AP object.  The AP object contains one
+or more pointers to other heap objects.  When it is entered, it pushes
+an update frame followed by its payload on the stack, and enters the
+first word (which will be a pointer to a BCO).  The layout of AP
+objects is described in more detail in Section \ref{sect:AP}.
+
+\subsubsection{Partial Applications}
+
+Partial applications are represented by PAP objects.  A PAP object is
+exactly the same as an AP object, except that it is non-updatable.
+The layout of PAP objects is described in Section \ref{sect:PAP}.
+
+\subsection{Calling conventions}
+\label{sect:hugs-calling-conventions}
+
+The calling convention for any byte-code function is straightforward:
  
  \begin{itemize}
-\item If @whatNext@ is set to @RunGHC@, we load up the required
-registers from the TSO and jump to the address at the top of the user
-stack.
-\item If @whatNext@ is set to @RunHugs@, we execute the byte-code
-object pointed to by the top word of the stack.
+\item Push any arguments on the stack.
+\item Push a pointer to the BCO.
+\item Begin interpreting the byte code.
  \end{itemize}
  
-Sometimes instead of returning to the address at the top of the stack,
-we need to enter a closure instead.  This is achieved by pushing a
-pointer to the closure to be entered on the stack, followed by a
-pointer to a canned code sequence called @ghc_entertop@, or the dual
-byte-code object @hugs_entertop@.  Both code sequences do the following:
+The @ENTER@ byte-code instruction decides how to enter its argument.
+The object being entered must be either
  
  \begin{itemize}
-\item pop the top word (either @ghc_entertop@ or @hugs_entertop@) from
-the stack.
-\item pop the next word off the stack and enter it.
+\item A BCO,
+\item An AP,
+\item A PAP,
+\item A constructor,
+\item A GHC-built closure, or
+\item An indirection.
  \end{itemize}
  
-There are six cases we need to consider:
+If @ENTER@ is applied to a BCO, we just begin interpreting the
+byte-code contained therein.  If the object is an AP, we push an
+update frame, push the values from the AP on the stack, and enter its
+associated object.  If the object is a PAP, we push its values on the
+stack and enter the first one.  If the object is a constructor, we
+simply return (see Section \ref{sect:hugs-return-convention}).  The
+case of a GHC-built closure is covered in Section
+\ref{sect:hugs-to-ghc-closure}.  If the object is an indirection, we
+simply enter the object it points to.
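The dispatch performed by @ENTER@ can be modelled in a few lines.  The following is a toy Python sketch, not the interpreter's actual code: the class names, the returned action strings, and the order in which an AP's arguments are pushed (first argument ends up on top) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class BCO:
    name: str


@dataclass
class AP:  # updatable thunk: a function plus its arguments
    fun: Any
    args: List[Any]


@dataclass
class PAP:  # partial application: same shape as AP, but not updatable
    fun: Any
    args: List[Any]


@dataclass
class Con:
    tag: str


@dataclass
class GHCClosure:
    name: str


@dataclass
class Ind:
    target: Any


@dataclass
class UpdateFrame:
    updatee: Any


def enter(obj, stack):
    """Model of ENTER.  The stack is a list whose last element is the top.

    Returns a hypothetical (action, object) pair describing what happens next.
    """
    while True:
        if isinstance(obj, Ind):
            obj = obj.target                 # follow the indirection
        elif isinstance(obj, AP):
            stack.append(UpdateFrame(obj))   # AP is updatable
            stack.extend(reversed(obj.args))
            obj = obj.fun
        elif isinstance(obj, PAP):
            stack.extend(reversed(obj.args))  # no update frame for a PAP
            obj = obj.fun
        elif isinstance(obj, BCO):
            return ("interpret", obj)
        elif isinstance(obj, Con):
            return ("return", obj)
        elif isinstance(obj, GHCClosure):
            return ("switch_to_ghc", obj)
        else:
            raise TypeError(f"cannot enter {obj!r}")
```

Entering an indirection to an AP follows the indirection, pushes the update frame and the arguments, and ends up interpreting the AP's BCO; entering a GHC-built closure yields the world-switch action instead.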
  
-\begin{enumerate}
-\item A GHC thread enters a Hugs-built closure.
-\item A GHC thread calls a Hugs-compiled function.
-\item A GHC thread returns to a Hugs-compiled return address.
-\item A Hugs thread enters a GHC-built closure.
-\item A Hugs thread calls a GHC-compiled function.
-\item A Hugs thread returns to a Hugs-compiled return address.
-\end{enumerate}
+\subsection{Return convention}
+\label{sect:hugs-return-convention}
  
-We now examine the various cases one by one and describe how the
-switch happens in each situation.
+When Hugs pushes a return address, it pushes both a pointer to the BCO
+to return to, and a pointer to a static code fragment @HUGS_RET@ (this
+will be described in Section \ref{sect:ghc-to-hugs-return}).  The
+stack layout is shown in Figure \ref{fig:hugs-return-stack}.
  
-\subsection{A GHC thread enters a Hugs-built closure}
-
-All Hugs-built closures look like this:
+\begin{figure}
+\begin{center}
+\input{hugs_ret.pstex_t}
+\end{center}
+\caption{Stack layout for a Hugs return address}
+\label{fig:hugs-return-stack}
+\end{figure}
  
+\begin{figure}
  \begin{center}
-\begin{tabular}{|l|l|}
-\hline
-\emph{Hugs} & \emph{Hugs-specific payload} \\
-\hline 
-\end{tabular}
+\input{hugs_ret2.pstex_t}
  \end{center}
+\caption{Stack layout on entering a Hugs return address}
+\label{fig:hugs-return2}
+\end{figure}
  
-\noindent where \emph{Hugs} is a pointer to a small statically
-compiled-piece of code that does the following:
+When a Hugs byte-code sequence is returning, it first places the
+return value on the stack.  It then examines the return address (now
+the second word on the stack):
  
  \begin{itemize}
-\item Push the address of this thunk on the stack.
-\item Push @hugs_entertop@ on the stack.
-\item Save the current state of the thread in the TSO.
-\item Return to the scheduler, with @whatNext@ set to @RunHugs@.
-\end{itemize}
  
-\ToDo{What about static thunks?  If all code lives on the heap, we'll
-need an extra level of indirection for GHC references to Hugs
-closures.}
+\item If the return address is @HUGS_RET@, rearrange the stack so that
+it has the returned object followed by the pointer to the BCO at the
+top, then enter the BCO (Figure \ref{fig:hugs-return2}).
  
-\subsection{A GHC thread calls a Hugs-compiled function}
+\item If the return address is not @HUGS_RET@, we need to do a world
+switch as described in Section \ref{sect:hugs-to-ghc-return}.
  
-In order to call the fast entry point for a function, GHC needs arity
-information from the defining module's interface file.  Hugs doesn't
-supply this information, so GHC will always call the slow entry point
-for functions in Hugs-compiled modules.
+\end{itemize}
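The two cases above can be sketched as a small Python model.  This is an illustration only: the string @"HUGS_RET"@ stands in for the address of the static code fragment, the action names are hypothetical, and the sketch assumes that after the rearrangement the BCO pointer is on top of the stack with the returned object directly beneath it.

```python
HUGS_RET = "HUGS_RET"  # stand-in for the address of the static code fragment


def hugs_return(stack, value):
    """Model of a Hugs byte-code sequence returning `value`.

    The stack is a list whose last element is the top.  On entry the top
    of the stack is the return address; a Hugs return address is the pair
    (BCO pointer, HUGS_RET) with HUGS_RET on top.
    """
    stack.append(value)        # place the return value on the stack
    return_address = stack[-2]  # now the second word on the stack
    if return_address == HUGS_RET:
        value = stack.pop()
        stack.pop()             # discard HUGS_RET
        bco = stack.pop()
        stack.append(value)     # returned object ...
        stack.append(bco)       # ... followed by the BCO on top (assumed order)
        return "enter_bco"
    # Anything else is a GHC return address: a world switch is needed.
    return "world_switch"
```

In the world-switch case the stack is left holding the return value on top of the GHC return address, which is the state the scheduler is handed.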
  
-When a GHC module is linked into a running system, the calls to
-external Hugs-compiled functions will be resolved to point to
-dynamically-generated code that does the following:
+
+\section{The Scheduler}
+
+The Scheduler is the heart of the run-time system.  A running program
+consists of a single running thread, and a list of runnable and
+blocked threads.  The running thread returns to the scheduler when any
+of the following conditions arises:
  
  \begin{itemize}
-\item Push a pointer to the Hugs byte code object for the function on
-the stack.
-\item Push @hugs_entertop@ on the stack.
-\item Save the current thread state in the TSO.
-\item Return to the scheduler with @whatNext@ set to @RunHugs@
+\item A heap check fails, and a garbage collection is required
+\item Compiled code needs to switch to interpreted code, or vice
+versa.
+\item The thread becomes blocked.
+\item The thread is preempted.
  \end{itemize}
  
-Ok, but how does Hugs find the byte code object for the function?
-These live on the heap, and can therefore move around.  One solution
-is to use a jump table, where each element in the table has two
-elements:
+A running system has a global state, consisting of
  
  \begin{itemize}
-\item A call instruction pointing to the code fragment above.
-\item A pointer to the byte-code object for the function.
+\item @Hp@, the current heap pointer, which points to the next
+available address in the Heap.
+\item @HpLim@, the heap limit pointer, which points to the end of the
+heap.
+\item The Thread Preemption Flag, which is set whenever the currently
+running thread should be preempted at the next opportunity.
+\item A list of runnable threads. 
+\item A list of blocked threads.
  \end{itemize}
  
-When GHC jumps to the address in the jump table, the call takes it to
-the statically-compiled code fragment, leaving a pointer to a pointer
-to the byte-code object on the C stack, which can then be retrieved.
+Each thread is represented by a Thread State Object (TSO), which is
+described in detail in Section \ref{sect:TSO}.
  
-\subsection{A GHC thread returns to a Hugs-compiled return address}
-
-When Hugs pushes return addresses on the stack, they look like this:
+The following is pseudo-code for the inner loop of the scheduler
+itself.
  
  @
-       |               |
-       |_______________|
-       |               |  -----> bytecode object
-       |_______________|
-       |               |  _____
-       |_______________|       | 
-                               |       _____
-                               |       |    | Info Table
-                               |       |    | 
-                               |_____\ |____| hugs_return
-                                     / .    .
-                                       .    . Code
-                                       .    .
+while (threads_exist) {
+  // handle global problems: GC, parallelism, etc
+  if (need_gc) gc();  
+  if (external_message) service_message();
+  // deal with other urgent stuff
+
+  pick a runnable thread;
+  do {
+    switch (thread->whatNext) {
+      case (RunGHC  pc): status=runGHC(pc);  break;
+      case (RunHugs bc): status=runHugs(bc); break;
+    }
+    switch (status) {  // handle local problems
+      case (StackOverflow): enlargeStack; break;
+      case (Error e)      : error(thread,e); break;
+      case (ExitWith e)   : exit(e); break;
+      case (Yield)        : break;
+    }
+  } while (thread_runnable);
+}
  @
  
-If GHC is returning, it will return to the address at the top of the
-stack.  This address a pointer to a statically compiled code fragment
-called @hugs_return@, which:
+Optimisations to avoid excess trampolining from Hugs into itself.
+How do we invoke GC, ccalls, etc.
+General ccall (@ccall-GC@) and optimised ccall.
+
+\section{Switching Worlds}
+
+\label{sect:switching-worlds}
+
+Because this is a combined compiled/interpreted system, the
+interpreter will sometimes encounter compiled code, and vice-versa.
+
+All world-switches go via the scheduler, ensuring that the world is in
+a known state ready to enter either compiled code or the interpreter.
+When a thread is run from the scheduler, the @whatNext@ field in the
+TSO (Section \ref{sect:TSO}) is checked to find out how to execute the
+thread.
+
+\begin{itemize}
+\item If @whatNext@ is set to @ReturnGHC@, we load up the required
+registers from the TSO and jump to the address at the top of the user
+stack.
+\item If @whatNext@ is set to @EnterGHC@, we load up the required
+registers from the TSO and enter the closure pointed to by the top
+word of the stack.
+\item If @whatNext@ is set to @EnterHugs@, we enter the top thing on
+the stack, using the interpreter.
+\end{itemize}
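The three @whatNext@ cases amount to a small dispatch on the TSO.  A minimal Python sketch follows, assuming the TSO is a dictionary with a @what_next@ field and a @stack@ list (top at the end); the returned action names are hypothetical, and the sketch leaves the top word on the stack because the text does not say whether it is popped.

```python
def resume(tso):
    """Decide how the scheduler runs a thread, based on its whatNext field.

    Returns a hypothetical (action, target) pair.
    """
    what_next = tso["what_next"]
    top = tso["stack"][-1]
    if what_next == "ReturnGHC":
        # Registers would be loaded from the TSO here.
        return ("jump_to_address", top)
    if what_next == "EnterGHC":
        # Registers would be loaded from the TSO here.
        return ("enter_compiled_closure", top)
    if what_next == "EnterHugs":
        return ("enter_with_interpreter", top)
    raise ValueError(f"unknown whatNext value {what_next!r}")
```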
+
+There are four cases we need to consider:
+
+\begin{enumerate}
+\item A GHC thread enters a Hugs-built closure.
+\item A GHC thread returns to a Hugs-compiled return address.
+\item A Hugs thread enters a GHC-built closure.
+\item A Hugs thread returns to a GHC-compiled return address.
+\end{enumerate}
+
+GHC-compiled modules cannot call functions in a Hugs-compiled module
+directly, because the compiler has no information about arities in the
+external module.  Therefore it must assume any top-level objects are
+CAFs, and enter their closures.
+
+\ToDo{dynamic linking stuff}
+\ToDo{Hugs-built constructors?}
+
+We now examine the various cases one by one and describe how the
+switch happens in each situation.
+
+\subsection{A GHC thread enters a Hugs-built closure}
+\label{sect:ghc-to-hugs-closure}
+
+There are two possibilities: GHC has entered the BCO directly (for a
+top-level function closure), or it has entered an AP.
+
+The code for both objects is the same:
  
  \begin{itemize}
-\item pops the return address off the user stack.
+\item Push the address of the BCO on the stack.
+\item Save the current state of the thread in its TSO.
+\item Return to the scheduler, setting @whatNext@ to @EnterHugs@.
+\end{itemize}
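The three steps above can be sketched as follows.  This is an illustrative Python model, not the statically compiled entry code: the TSO is assumed to be a dictionary, the register file is a dictionary of names such as @Sp@ and @Su@, and the return value @"ReturnToScheduler"@ is a hypothetical marker.

```python
def ghc_enters_hugs_closure(tso, registers, closure):
    """Model of the entry code shared by BCOs and APs when run by GHC."""
    tso["stack"].append(closure)              # push the address of the object
    tso["saved_registers"] = dict(registers)  # save the thread state (a copy)
    tso["what_next"] = "EnterHugs"            # the interpreter takes over
    return "ReturnToScheduler"
```

Once the scheduler resumes the thread, it finds the closure on top of the stack and @whatNext@ set to @EnterHugs@, so the interpreter enters it.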
+
+\subsection{A GHC thread returns to a Hugs-compiled return address}
+\label{sect:ghc-to-hugs-return}
+
+Hugs return addresses are laid out as in Figure
+\ref{fig:hugs-return-stack}.  If GHC is returning, it will return to
+the address at the top of the stack, namely @HUGS_RET@.  The code at
+@HUGS_RET@ performs the following:
+
+\begin{itemize}
+\item pushes \Arg{1} (the return value) on the stack.
  \item saves the thread state in the TSO
-\item returns to the scheduler with @whatNext@ set to @RunHugs@.
+\item returns to the scheduler with @whatNext@ set to @EnterHugs@.
  \end{itemize}
  
+\noindent When Hugs runs, it will enter the return value, which will
+return using the correct Hugs convention (Section
+\ref{sect:hugs-return-convention}) to the return address underneath it
+on the stack.
+
  \subsection{A Hugs thread enters a GHC-compiled closure}
+\label{sect:hugs-to-ghc-closure}
  
-When Hugs is called on to enter a GHC closure (these are recognisable
-by the lack of a \emph{Hugs} pointer at the front), the following
-sequence of instructions is executed:
+Hugs can recognise a GHC-built closure as not being one of the
+following types of object:
  
  \begin{itemize}
-\item Push the address of the thunk on the stack.
-\item Push @ghc_entertop@ on the stack.
-\item Save the current state of the thread in the TSO.
-\item Return to the scheduler, with the @whatNext@ field set to
-@RunGHC@.
+\item A BCO.
+\item An AP.
+\item A constructor.
  \end{itemize}
  
-\subsection{A Hugs thread calls a GHC-compiled function}
+When Hugs is called on to enter a GHC closure, it executes the
+following sequence of instructions:
  
-Hugs never calls GHC-functions directly, it only enters closures
-(which point to the slow entry point for the function).  Hence in this
-case, we just push the arguments on the stack and proceed as above.
+\begin{itemize}
+\item Push the address of the closure on the stack.
+\item Save the current state of the thread in the TSO.
+\item Return to the scheduler, with the @whatNext@ field set to
+@EnterGHC@.
+\end{itemize}
  
  \subsection{A Hugs thread returns to a GHC-compiled return address}
+\label{sect:hugs-to-ghc-return}
  
-The return address at the top of the stack is recognisable as a
-GHC-return address by virtue of not being @hugs_return@.  In this
-case, hugs recognises that it needs to do a world-switch and performs
-the following sequence:
+When Hugs encounters a return address on the stack that is not
+@HUGS_RET@, it knows that a world-switch is required.  At this point
+the stack contains a pointer to the return value, followed by the GHC
+return address.  The following sequence is then performed:
  
  \begin{itemize}
  \item save the state of the thread in the TSO.
-\item return to the scheduler, setting @whatNext@ to @RunGHC@.
+\item return to the scheduler, setting @whatNext@ to @EnterGHC@.
  \end{itemize}
  
+The first thing that GHC will do is enter the object on the top of the
+stack, which is a pointer to the return value.  This value will then
+return itself to the return address using the GHC return convention.
+
+\part{Implementation}
  \section{Heap objects}
  \label{sect:fixed-header}
  
@@ -1384,6 +1461,70 @@ under evaluation (BH), or by now an HNF.  Thus, indirections get NoSpark flag.
  
  
  
+\subsection{Hugs Objects}
+
+\subsubsection{Byte-Code Objects}
+\label{sect:BCO}
+
+A Byte-Code Object (BCO) is a container for a chunk of byte-code,
+which can be executed by Hugs.  For a top-level function, the BCO also
+serves as the closure for the function.
+
+The semantics of BCOs are described in Section
+\ref{sect:hugs-heap-objects}.  A BCO has the following structure:
+
+\begin{center}
+\begin{tabular}{|l|l|l|l|l|l|}
+\hline 
+\emph{BCO} & \emph{Layout} & \emph{Offset} & \emph{Size} &
+\emph{Literals} & \emph{Byte code} \\
+\hline
+\end{tabular}
+\end{center}
+
+\noindent where:
+\begin{itemize}
+\item \emph{BCO} is a pointer to a static code fragment/info table that
+returns to the scheduler to invoke Hugs (Section
+\ref{sect:ghc-to-hugs-closure}).
+\item \emph{Layout} contains the number of pointer literals in the
+\emph{Literals} field.
+\item \emph{Offset} is the offset to the byte code from the start of
+the object.
+\item \emph{Size} is the number of words of byte code in the object.
+\item \emph{Literals} contains any pointer and non-pointer literals used in
+the byte-codes (including jump addresses), pointers first.
+\item \emph{Byte code} contains \emph{Size} words of non-pointer byte
+code.
+\end{itemize}
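To make the relationship between the fields concrete, here is a small Python sketch that lays a BCO out as a flat list of words.  It rests on assumptions the text does not state: each of the four header fields occupies one word, \emph{Offset} is measured in words, and the function names are invented for illustration.

```python
HEADER_WORDS = 4  # BCO, Layout, Offset, Size: assumed to be one word each


def layout_bco(info_ptr, ptr_literals, nonptr_literals, byte_code):
    """Build the word sequence for a BCO, pointer literals first."""
    literals = list(ptr_literals) + list(nonptr_literals)
    offset = HEADER_WORDS + len(literals)  # words from object start to code
    header = [info_ptr, len(ptr_literals), offset, len(byte_code)]
    return header + literals + list(byte_code)


def pointers_of(words):
    """What a garbage collector would follow: the first Layout literals."""
    return words[HEADER_WORDS:HEADER_WORDS + words[1]]


def byte_code_of(words):
    """Recover the byte code using the Offset and Size fields."""
    offset, size = words[2], words[3]
    return words[offset:offset + size]
```

Placing the pointer literals first means the collector needs only the \emph{Layout} count to find every pointer in the object, without decoding the byte code.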
+
+\subsubsection{AP objects}
+\label{sect:AP}
+
+Hugs uses a standard object called an AP for thunks and partial
+applications.  The layout of an AP is
+
+\begin{center}
+\begin{tabular}{|l|l|l|l|}
+\hline
+\emph{AP} & \emph{BCO} & \emph{Layout} & \emph{Free Variables} \\
+\hline
+\end{tabular}
+\end{center}
+
+\noindent where:
+
+\begin{itemize}
+\item \emph{AP} is a pointer to a statically-compiled code
+fragment/info table that returns to the scheduler to invoke Hugs
+(Sections \ref{sect:ghc-to-hugs-closure}, \ref{sect:ghc-to-hugs-return}).
+\item \emph{BCO} is a pointer to the BCO for the thunk.
+\item \emph{Layout} contains the number of pointers and the size of
+the \emph{Free Variables} field.
+\item \emph{Free Variables} contains the free variables of the
+thunk/partial application/return address, pointers first.
+\end{itemize}
+
  \subsection{Pointed Objects}
  
  All pointed objects can be entered.
@@ -2764,6 +2905,7 @@ to implement than Sparud's.
  \subsection{Internal workings of the Generational Collector}
  
  
+\section{Dynamic Linking}
  
  \section{Profiling}