X-Git-Url: http://git.megacz.com/?a=blobdiff_plain;f=docs%2Frts%2Frts.verb;h=3548688d2c3cb19a805624531e2d526a4bd064ec;hb=779993020d8a21908018707f43a0e0d95247f057;hp=97a4fe0602d72c7e1509ab6182ab1eda3f21bf0f;hpb=2fa7ebd47c00b5716cb6904648637a65c97b09c7;p=ghc-hetmet.git diff --git a/docs/rts/rts.verb b/docs/rts/rts.verb index 97a4fe0..3548688 100644 --- a/docs/rts/rts.verb +++ b/docs/rts/rts.verb @@ -20,6 +20,8 @@ \marginparsep 0 in \sloppy +\usepackage{epsfig} + \newcommand{\note}[1]{{\em Note: #1}} % DIMENSION OF TEXT: \textheight 8.5 in @@ -47,6 +49,7 @@ \title{The STG runtime system (revised)} \author{Simon Peyton Jones \\ Glasgow University and Oregon Graduate Institute \and +Simon Marlow \\ Glasgow University \and Alastair Reid \\ Yale University} \maketitle @@ -54,7 +57,8 @@ Alastair Reid \\ Yale University} \tableofcontents \newpage -\section{Introduction} +\part{Introduction} +\section{Overview} This document describes the GHC/Hugs run-time system. It serves as a Glasgow/Yale/Nottingham ``contract'' about what the RTS does. @@ -176,89 +180,14 @@ the old generation is no bigger than the current new generation. \end{itemize} - -\section{The Scheduler} - -The Scheduler is the heart of the run-time system. A running program -consists of a single running thread, and a list of runnable and -blocked threads. The running thread returns to the scheduler when any -of the following conditions arises: - -\begin{itemize} -\item A heap check fails, and a garbage collection is required -\item Compiled code needs to switch to interpreted code, and vice -versa. -\item The thread becomes blocked. -\item The thread is preempted. -\end{itemize} - -A running system has a global state, consisting of - -\begin{itemize} -\item @Hp@, the current heap pointer, which points to the next -available address in the Heap. -\item @HpLim@, the heap limit pointer, which points to the end of the -heap. 
-\item The Thread Preemption Flag, which is set whenever the currently -running thread should be preempted at the next opportunity. -\end{itemize} - -Each thread has a thread-local state, which consists of - -\begin{itemize} -\item @TSO@, the Thread State Object for this thread. This is a heap -object that is used to store the current thread state when the thread -is blocked or sleeping. -\item @Sp@, the current stack pointer. -\item @Su@, the current stack update frame pointer. This register -points to the most recent update frame on the stack, and is used to -calculate the number of arguments available when entering a function. -\item @SpLim@, the stack limit pointer. This points to the end of the -current stack chunk. -\item Several general purpose registers, used for passing arguments to -functions. -\end{itemize} - -\noindent and various other bits of information used in specialised -circumstances, such as profiling and parallel execution. These are -described in the appropriate sections. - -The following is pseudo-code for the inner loop of the scheduler -itself. - -@ -while (threads_exist) { - // handle global problems: GC, parallelism, etc - if (need_gc) gc(); - if (external_message) service_message(); - // deal with other urgent stuff - - pick a runnable thread; - do { - switch (thread->whatNext) { - case (RunGHC pc): status=runGHC(pc); break; - case (RunHugs bc): status=runHugs(bc); break; - } - switch (status) { // handle local problems - case (StackOverflow): enlargeStack; break; - case (Error e) : error(thread,e); break; - case (ExitWith e) : exit(e); break; - case (Yield) : break; - } - } while (thread_runnable); -} -@ - -Optimisations to avoid excess trampolining from Hugs into itself. -How do we invoke GC, ccalls, etc. -General ccall (@ccall-GC@) and optimised ccall. 
- -\section{Evaluation} +%----------------------------------------------------------------------------- +\part{Evaluation Model} +\section{Compiled Execution} This section describes the framework in which compiled code evaluates expressions. Only at certain points will compiled code need to be able to talk to the interpreted world; these are discussed in Section -\ref{sec:hugs-ghc-interaction}. +\ref{sect:switching-worlds}. \subsection{Calling conventions} @@ -742,116 +671,325 @@ May have to keep C stack pointer in register to placate OS? May have to revert black holes - ouch! @ -\section{Switching Worlds} +\section{Interpreted Execution} -Because this is a combined compiled/interpreted system, the -interpreter will sometimes encounter compiled code, and vice-versa. +This section describes how the Hugs interpreter interprets code in the +same environment as compiled code executes. Both evaluation models +use a common garbage collector, so they must agree on the form of +objects in the heap. -There are six cases we need to consider: +Hugs interprets code by converting it to byte-code and applying a +byte-code interpreter to it. Wherever possible, we try to ensure that +the byte-code is all that is required to interpret a section of code. +This means not dynamically generating info tables, and hence we can +only have a small number of possible heap objects each with a statically +compiled info table. Similarly for stack objects: in fact we only +have one Hugs stack object, in which all information is tagged for the +garbage collector. -\begin{enumerate} -\item A GHC thread enters a Hugs-built thunk. -\item A GHC thread calls a Hugs-compiled function. -\item A GHC thread returns to a Hugs-compiled return address. -\item A Hugs thread enters a GHC-built thunk. -\item A Hugs thread calls a GHC-compiled function. -\item A Hugs thread returns to a Hugs-compiled return address. -\end{enumerate} +There is, however, one exception to this rule. 
Hugs must generate +info tables for any constructors it is asked to compile, since the +alternative is to force a context-switch each time compiled code +enters a Hugs-built constructor, which would be prohibitively +expensive. + +\subsection{Hugs Heap Objects} +\label{sect:hugs-heap-objects} + +\subsubsection{Byte-Code Objects} + +Compiled byte code lives on the global heap, in objects called +Byte-Code Objects (or BCOs). The layout of BCOs is described in +detail in Section \ref{sect:BCO}, in this section we will describe +their semantics. + +Since byte-code lives on the heap, it can be garbage collected just +like any other heap-resident data. Hugs maintains a table of +currently live BCOs, which is treated as a table of live pointers by +the garbage collector. When a module is unloaded, the pointers to its +BCOs are removed from the table, and the code will be garbage +collected some time later. + +A BCO represents a basic block of code - all entry points are at the +beginning of a BCO, and it is impossible to jump into the middle of +one. A BCO represents not only the code for a function, but also its +closure; a BCO can be entered just like any other closure. Hugs +performs lambda-lifting during compilation to byte-code, and each +top-level combinator becomes a BCO in the heap. + +\subsubsection{Thunks and partial applications} + +A thunk consists of a code pointer, and values for the free variables +of that code. Since Hugs byte-code is lambda-lifted, free variables +become arguments and are expected to be on the stack by the called +function. + +Hugs represents thunks with an @HUGS_AP@ object. The @HUGS_AP@ object +contains one or more pointers to other heap objects. When it is +entered, it pushes an update frame followed by its payload on the +stack, and enters the first word (which will be a pointer to a BCO). +The layout of @HUGS_AP@ objects is described in more detail in Section +\ref{sect:HUGS-AP}. 
+ +Partial applications are represented by @HUGS_PAP@ objects, which are +identical to @HUGS_AP@s except that they are non-updatable. + +\ToDo{Hugs Constructors}. + +\subsection{Calling conventions} +\label{sect:hugs-calling-conventions} + +The calling convention for any byte-code function is straightforward: + +\begin{itemize} +\item Push any arguments on the stack. +\item Push a pointer to the BCO. +\item Begin interpreting the byte code. +\end{itemize} + +The @ENTER@ byte-code instruction decides how to enter its argument. +The object being entered must be either + +\begin{itemize} +\item A BCO, +\item A @HUGS_AP@, +\item A @HUGS_PAP@, +\item A constructor, +\item A GHC-built closure, or +\item An indirection. +\end{itemize} -\subsection{A GHC thread enters a Hugs-built thunk} +If @ENTER@ is applied to a BCO, we just begin interpreting the +byte-code contained therein. If the object is a @HUGS_AP@, we push an +update frame, push the values from the @HUGS_AP@ on the stack, and enter +its associated object. If the object is a @HUGS_PAP@, we push its +values on the stack and enter the first one. If the object is a +constructor, we simply return (see Section +\ref{sect:hugs-return-convention}). The case of a GHC-built closure is covered in +Section \ref{sect:hugs-to-ghc-closure}. If the object is an +indirection, we simply enter the object it points to. -A Hugs-built thunk looks like this: +\subsection{Return convention} +\label{sect:hugs-return-convention} +When Hugs pushes a return address, it pushes both a pointer to the BCO +to return to, and a pointer to a static code fragment @HUGS_RET@ (this +will be described in Section \ref{sect:ghc-to-hugs-return}). The +stack layout is shown in Figure \ref{fig:hugs-return-stack}. 
+ +\begin{figure} \begin{center} -\begin{tabular}{|l|l|} -\hline -\emph{Hugs} & \emph{Hugs-specific information} \\ -\hline -\end{tabular} +\input{hugs_ret.pstex_t} \end{center} +\caption{Stack layout for a Hugs return address} +\label{fig:hugs-return-stack} +\end{figure} -\noindent where \emph{Hugs} is a pointer to a small -statically-compiled piece of code that does the following: +\begin{figure} +\begin{center} +\input{hugs_ret2.pstex_t} +\end{center} +\caption{Stack layout on entering a Hugs return address} +\label{fig:hugs-return2} +\end{figure} + +When a Hugs byte-code sequence is returning, it first places the +return value on the stack. It then examines the return address (now +the second word on the stack): \begin{itemize} -\item Push the address of the thunk on the stack. -\item Push @entertop@ on the stack. -\item Save the current state of the thread in the TSO. -\item Return to the scheduler, with the @whatNext@ field set to -@RunHugs@. + +\item If the return address is @HUGS_RET@, rearrange the stack so that +it has the returned object followed by the pointer to the BCO at the +top, then enter the BCO (Figure \ref{fig:hugs-return2}). + +\item If the return address is not @HUGS_RET@, we need to do a world +switch as described in Section \ref{sect:hugs-to-ghc-return}. + \end{itemize} -\noindent where @entertop@ is a small statically-compiled piece of -code that does the following: + +\section{The Scheduler} + +The Scheduler is the heart of the run-time system. A running program +consists of a single running thread, and a list of runnable and +blocked threads. The running thread returns to the scheduler when any +of the following conditions arises: \begin{itemize} -\item pop the return address from the stack. -\item pop the next word off the stack into \Arg{1}. -\item enter \Arg{1}. +\item A heap check fails, and a garbage collection is required +\item Compiled code needs to switch to interpreted code, and vice +versa. 
+\item The thread becomes blocked. +\item The thread is preempted. \end{itemize} -The infotable for @entertop@ has some byte-codes attached that do -essentially the same thing if the code is entered from Hugs. - -\subsection{A GHC thread calls a Hugs-compiled function} +A running system has a global state, consisting of -How do we do this? +\begin{itemize} +\item @Hp@, the current heap pointer, which points to the next +available address in the Heap. +\item @HpLim@, the heap limit pointer, which points to the end of the +heap. +\item The Thread Preemption Flag, which is set whenever the currently +running thread should be preempted at the next opportunity. +\item A list of runnable threads. +\item A list of blocked threads. +\end{itemize} -\subsection{A GHC thread returns to a Hugs-compiled return address} +Each thread is represented by a Thread State Object (TSO), which is +described in detail in Section \ref{sect:TSO}. -When Hugs pushes return addresses on the stack, they look like this: +The following is pseudo-code for the inner loop of the scheduler +itself. @ - | | - |_______________| - | | -----> bytecode object - |_______________| - | | _____ - |_______________| |___ GHC-friendly return code - _____ - | | - | | Info Table - |____| - . . - . . Code - . . +while (threads_exist) { + // handle global problems: GC, parallelism, etc + if (need_gc) gc(); + if (external_message) service_message(); + // deal with other urgent stuff + + pick a runnable thread; + do { + switch (thread->whatNext) { + case (RunGHC pc): status=runGHC(pc); break; + case (RunHugs bc): status=runHugs(bc); break; + } + switch (status) { // handle local problems + case (StackOverflow): enlargeStack; break; + case (Error e) : error(thread,e); break; + case (ExitWith e) : exit(e); break; + case (Yield) : break; + } + } while (thread_runnable); +} @ -If GHC is returning, it will return to the address at the top of the -stack. 
The code at this address +Optimisations to avoid excess trampolining from Hugs into itself. +How do we invoke GC, ccalls, etc. +General ccall (@ccall-GC@) and optimised ccall. + +\section{Switching Worlds} + +\label{sect:switching-worlds} + +Because this is a combined compiled/interpreted system, the +interpreter will sometimes encounter compiled code, and vice-versa. + +All world-switches go via the scheduler, ensuring that the world is in +a known state ready to enter either compiled code or the interpreter. +When a thread is run from the scheduler, the @whatNext@ field in the +TSO (Section \ref{sect:TSO}) is checked to find out how to execute the +thread. + +\begin{itemize} +\item If @whatNext@ is set to @ReturnGHC@, we load up the required +registers from the TSO and jump to the address at the top of the user +stack. +\item If @whatNext@ is set to @EnterGHC@, we load up the required +registers from the TSO and enter the closure pointed to by the top +word of the stack. +\item If @whatNext@ is set to @EnterHugs@, we enter the top thing on +the stack, using the interpreter. +\end{itemize} + +There are four cases we need to consider: + +\begin{enumerate} +\item A GHC thread enters a Hugs-built closure. +\item A GHC thread returns to a Hugs-compiled return address. +\item A Hugs thread enters a GHC-built closure. +\item A Hugs thread returns to a GHC-compiled return address. +\end{enumerate} + +GHC-compiled modules cannot call functions in a Hugs-compiled module +directly, because the compiler has no information about arities in the +external module. Therefore it must assume any top-level objects are +CAFs, and enter their closures. + +\ToDo{dynamic linking stuff} +\ToDo{Hugs-built constructors?} + +We now examine the various cases one by one and describe how the +switch happens in each situation. 
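As a rough illustration of the @whatNext@ dispatch described above, the following C sketch maps each of the three values to the action the scheduler takes. Only the names @ReturnGHC@, @EnterGHC@ and @EnterHugs@ come from the text; the `TSO` layout, the `Action` type and `resume_action` are hypothetical names invented for this sketch, and loading registers from the TSO is elided.

```c
#include <assert.h>
#include <stddef.h>

/* The three whatNext values named in the text. */
typedef enum { ReturnGHC, EnterGHC, EnterHugs } WhatNext;

/* Hypothetical: what the scheduler does with the top word of the stack. */
typedef enum {
    ACT_JUMP_TO_RETURN_ADDRESS,  /* ReturnGHC: jump to the address on top     */
    ACT_ENTER_CLOSURE,           /* EnterGHC: enter the closure on top        */
    ACT_INTERPRET                /* EnterHugs: enter the top thing using Hugs */
} Action;

/* Hypothetical, heavily simplified thread state object. */
typedef struct {
    WhatNext whatNext;  /* how the scheduler should resume this thread */
    void   **sp;        /* user stack pointer; sp[0] is the top word   */
} TSO;

/* Decide how to resume a thread and report the word the action applies to. */
Action resume_action(const TSO *tso, void **target)
{
    assert(tso != NULL && tso->sp != NULL && target != NULL);
    *target = tso->sp[0];
    switch (tso->whatNext) {
    case ReturnGHC: return ACT_JUMP_TO_RETURN_ADDRESS;
    case EnterGHC:  return ACT_ENTER_CLOSURE;
    case EnterHugs: return ACT_INTERPRET;
    }
    return ACT_INTERPRET;  /* not reached for a valid whatNext */
}
```

In this sketch every world switch reduces to writing one field of the TSO and returning to the scheduler, which is the property the text relies on: the scheduler is the only place that needs to know how to start either world.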
+ +\subsection{A GHC thread enters a Hugs-built closure} +\label{sect:ghc-to-hugs-closure} + +There are three possibilities: GHC has entered the BCO directly (for a +top-level function closure), it has entered a @HUGS_AP@, or it has +entered a @HUGS_PAP@. + +The code for all three objects is the same: + +\begin{itemize} +\item Push the address of the object entered on the stack. +\item Save the current state of the thread in its TSO. +\item Return to the scheduler, setting @whatNext@ to @EnterHugs@. +\end{itemize} + +\subsection{A GHC thread returns to a Hugs-compiled return address} +\label{sect:ghc-to-hugs-return} + +Hugs return addresses are laid out as in Figure +\ref{fig:hugs-return-stack}. If GHC is returning, it will return to +the address at the top of the stack, namely @HUGS_RET@. The code at +@HUGS_RET@ performs the following: \begin{itemize} +\item pushes \Arg{1} (the return value) on the stack. \item saves the thread state in the TSO -\item returns to the scheduler with a @whatNext@ field of @RunHugs@. +\item returns to the scheduler with @whatNext@ set to @EnterHugs@. \end{itemize} -If Hugs is returning to one of these addresses, it can spot the -special return address at the top and instead jump to the bytecodes -pointed to by the second word on the stack. +\noindent When Hugs runs, it will enter the return value, which will +return using the correct Hugs convention (Section +\ref{sect:hugs-return-convention}) to the return address underneath it +on the stack. 
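The steps performed by @HUGS_RET@ can be modelled with a small C sketch. This is an illustration under stated assumptions, not the actual code fragment: the fixed-size `TSO` stack, the `push` helper and the function name `hugs_ret` are all hypothetical, and saving the machine registers into the TSO is elided.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_WORDS 16  /* arbitrary stack size chosen for this sketch */

typedef enum { ReturnGHC, EnterGHC, EnterHugs } WhatNext;

/* Hypothetical, heavily simplified thread state object. */
typedef struct {
    WhatNext whatNext;
    void    *stack[STACK_WORDS];
    size_t   depth;  /* words in use; stack[depth - 1] is the top */
} TSO;

static void push(TSO *tso, void *word)
{
    assert(tso->depth < STACK_WORDS);
    tso->stack[tso->depth++] = word;
}

/* Models the HUGS_RET fragment: called when GHC returns to a Hugs
 * return address with the return value in Arg 1. */
void hugs_ret(TSO *tso, void *arg1)
{
    push(tso, arg1);            /* push the return value on the stack      */
    /* saving the thread state in the TSO is elided in this sketch */
    tso->whatNext = EnterHugs;  /* the scheduler resumes in the interpreter */
}
```

Whatever return address was on the stack stays underneath the pushed value, so when the interpreter enters the return value it finds that address immediately below it, as the text describes.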
+ +\subsection{A Hugs thread enters a GHC-compiled closure} +\label{sect:hugs-to-ghc-closure} -\subsection{A Hugs thread enters a GHC-compiled thunk} +Hugs can recognise a GHC-built closure as not being one of the +following types of object: -When Hugs is called on to enter a non-Hugs closure (these are -recognisable by the lack of a \emph{Hugs} pointer at the front), the -following sequence of instructions is executed: +\begin{itemize} +\item A @BCO@, +\item A @HUGS_AP@, +\item A @HUGS_PAP@, +\item An indirection, or +\item A constructor. +\end{itemize} + +When Hugs is called on to enter a GHC closure, it executes the +following sequence of instructions: \begin{itemize} -\item Push the address of the thunk on the stack. -\item Push @entertop@ on the stack. +\item Push the address of the closure on the stack. \item Save the current state of the thread in the TSO. \item Return to the scheduler, with the @whatNext@ field set to -@RunGHC@. +@EnterGHC@. \end{itemize} -\subsection{A Hugs thread calls a GHC-compiled function} +\subsection{A Hugs thread returns to a GHC-compiled return address} +\label{sect:hugs-to-ghc-return} -Hugs never calls GHC-functions directly, it only enters closures -(which point to the slow entry point for the function). Hence in this -case, we just push the arguments on the stack and proceed as for a -thunk. +When Hugs encounters a return address on the stack that is not +@HUGS_RET@, it knows that a world-switch is required. At this point +the stack contains a pointer to the return value, followed by the GHC +return address. The following sequence is then performed: -\subsection{A Hugs thread returns to a GHC-compiled return address} +\begin{itemize} +\item save the state of the thread in the TSO. +\item return to the scheduler, setting @whatNext@ to @EnterGHC@. +\end{itemize} + +The first thing that GHC will do is enter the object on the top of the +stack, which is a pointer to the return value. 
This value will then +return itself to the return address using the GHC return convention. +\part{Implementation} \section{Heap objects} \label{sect:fixed-header} @@ -883,19 +1021,18 @@ though GUM keeps a separate hash table). \item Statistics (e.g. a word to track how many times a thunk is entered.). We add a Ticky word to the fixed-header part of closures. This is -used to record indicate if a closure has been updated but not yet -entered. It is set when the closure is updated and cleared when -subsequently entered. +used to indicate if a closure has been updated but not yet entered. It +is set when the closure is updated and cleared when subsequently +entered. NB: It is {\em not} an ``entry count'', it is an ``entries-after-update count.'' The commoning up of @CONST@, @CHARLIKE@ and @INTLIKE@ closures is turned off(?) if this is required. This has only been done for 2s collection. - - \end{itemize} \end{itemize} + Most of the RTS is completely insensitive to the number of admin words. The total size of the fixed header is @FIXED_HS@. @@ -949,12 +1086,11 @@ successive decreasing memory addresses. \hline Parallelism Info \\ \hline Profile Info \\ \hline Debug Info -\\ \hline Tag/bytecode pointer -\\ \hline Static reference table +\\ \hline Tag / Static reference table \\ \hline Storage manager layout info \\ \hline Closure type -\\ \hline entry code \ldots -\\ \hline +\\ \hline entry code +\\ \vdots \end{tabular} \end{center} An info table has the following contents (working backwards in memory @@ -977,24 +1113,12 @@ are represented as high-order bits so they can be tested quickly. precise layout, for the benefit of the garbage collector and the code that stuffs graph into packets for transmission over the network. -\item A one-pointer {\em Static Reference Table (SRT) pointer}, @INFO_SRT@, points to -a table which enables the garbage collector to identify all accessible -code and CAFs. They are fully described in Section~\ref{sect:srt}. 
- -\item A one-pointer {\em tag/bytecode-pointer} field, @INFO_TAG@ or @INFO_BC@. -For data constructors this field contains the constructor tag, in the -range $0..n-1$ where $n$ is the number of constructors. - -For other objects that can be entered this field points to the byte -codes for the object. For the constructor case you can think of the -tag as the name of a a suitable bytecode sequence but it can also be used to -implement semi-tagging (section~\ref{sect:semi-tagging}). - -One awkward question (which may not belong here) is ``how does the -bytecode interpreter know whether to do a vectored return?'' The -answer is it examines the @INFO_TYPE@ field of the return address: -@RET_VEC_@$sz$ requires a vectored return and @RET_@$sz$ requires a -direct return. +\item A one-word {\em Tag/Static Reference Table} field, @INFO_SRT@. +For data constructors, this field contains the constructor tag, in the +range $0..n-1$ where $n$ is the number of constructors. For all other +objects it contains a pointer to a table which enables the garbage +collector to identify all accessible code and CAFs. They are fully +described in Section~\ref{sect:srt}. \item {\em Profiling info\/} @@ -1095,8 +1219,8 @@ Something internal to the runtime system. \end{itemize} - -\section{Kinds of Heap Object} +%----------------------------------------------------------------------------- +\subsection{Kinds of Heap Object} \label{sect:closures} Heap objects can be classified in several ways, but one useful one is @@ -1160,42 +1284,48 @@ closure kind & HNF & UPD & NS & STA & THU & MUT & UPT & BH & IND & Sect {\em Pointed} \\ \hline -@CONSTR@ & 1 & & 1 & & & & & & & \ref{sect:CONSTR} \\ -@CONSTR_STATIC@ & 1 & & 1 & 1 & & & & & & \ref{sect:CONSTR} \\ -@CONSTR_STATIC_NOCAF@ & 1 & & 1 & 1 & & & & & & \ref{sect:CONSTR} \\ - -@FUN@ & 1 & & ? & & & & & & & \ref{sect:FUN} \\ -@FUN_STATIC@ & 1 & & ? 
& 1 & & & & & & \ref{sect:FUN} \\ - -@THUNK@ & 1 & 1 & & & 1 & & & & & \ref{sect:THUNK} \\ -@THUNK_STATIC@ & 1 & 1 & & 1 & 1 & & & & & \ref{sect:THUNK} \\ -@THUNK_SELECTOR@ & 1 & 1 & 1 & & 1 & & & & & \ref{sect:THUNK_SEL} \\ - -@PAP@ & 1 & & ? & & & & & & & \ref{sect:PAP} \\ - -@IND@ & & & 1 & & ? & & & & 1 & \ref{sect:IND} \\ -@IND_OLDGEN@ & 1 & & 1 & & ? & & & & 1 & \ref{sect:IND} \\ -@IND_PERM@ & & & 1 & & ? & & & & 1 & \ref{sect:IND} \\ -@IND_OLDGEN_PERM@ & 1 & & 1 & & ? & & & & 1 & \ref{sect:IND} \\ -@IND_STATIC@ & ? & & 1 & 1 & ? & & & & 1 & \ref{sect:IND} \\ - -\hline -{\em Unpointed} \\ -\hline - - -@ARR_WORDS@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:ARR_WORDS1},\ref{sect:ARR_WORDS2} \\ -@ARR_PTRS@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:ARR_PTRS} \\ -@MUTVAR@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:MUTVAR} \\ -@MUTARR_PTRS@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:MUTARR_PTRS} \\ -@MUTARR_PTRS_FROZEN@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:MUTARR_PTRS_FROZEN} \\ - -@FOREIGN@ & 1 & & 1 & & & & 1 & & & \ref{sect:FOREIGN} \\ - -@BH@ & ? & 0/1 & 1 & & ? & ? & & 1 & ? & \ref{sect:BH} \\ -@MVAR@ & & & & & & & & & & \ref{sect:MVAR} \\ -@IVAR@ & & & & & & & & & & \ref{sect:IVAR} \\ -@FETCHME@ & & & & & & & & & & \ref{sect:FETCHME} \\ +@CONSTR@ & 1 & & 1 & & & & & & & \ref{sect:CONSTR} \\ +@CONSTR_STATIC@ & 1 & & 1 & 1 & & & & & & \ref{sect:CONSTR} \\ +@CONSTR_STATIC_NOCAF@ & 1 & & 1 & 1 & & & & & & \ref{sect:CONSTR} \\ + +@FUN@ & 1 & & ? & & & & & & & \ref{sect:FUN} \\ +@FUN_STATIC@ & 1 & & ? 
& 1 & & & & & & \ref{sect:FUN} \\ + +@THUNK@ & & 1 & & & 1 & & & & & \ref{sect:THUNK} \\ +@THUNK_STATIC@ & & 1 & & 1 & 1 & & & & & \ref{sect:THUNK} \\ +@THUNK_SELECTOR@ & & 1 & 1 & & 1 & & & & & \ref{sect:THUNK_SEL} \\ + +@BCO@ & 1 & & 1 & & & & & & & \ref{sect:BCO} \\ +@BCO_CAF@ & & 1 & & & 1 & & & & & \ref{sect:BCO} \\ + +@HUGS_AP@ & & 1 & & & 1 & & & & & \ref{sect:HUGS-AP} \\ +@HUGS_PAP@ & & & 1 & & & & & & & \ref{sect:HUGS-AP} \\ + +@PAP@ & 1 & & 1 & & & & & & & \ref{sect:PAP} \\ + +@IND@ & ? & & ? & & ? & & & & 1 & \ref{sect:IND} \\ +@IND_OLDGEN@ & ? & & ? & & ? & & & & 1 & \ref{sect:IND} \\ +@IND_PERM@ & ? & & ? & & ? & & & & 1 & \ref{sect:IND} \\ +@IND_OLDGEN_PERM@ & ? & & ? & & ? & & & & 1 & \ref{sect:IND} \\ +@IND_STATIC@ & ? & & ? & 1 & ? & & & & 1 & \ref{sect:IND} \\ + +\hline +{\em Unpointed} \\ +\hline + + +@ARR_WORDS@ & 1 & & 1 & & & & 1 & & & \ref{sect:ARR_WORDS1},\ref{sect:ARR_WORDS2} \\ +@ARR_PTRS@ & 1 & & 1 & & & & 1 & & & \ref{sect:ARR_PTRS} \\ +@MUTVAR@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:MUTVAR} \\ +@MUTARR_PTRS@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:MUTARR_PTRS} \\ +@MUTARR_PTRS_FROZEN@ & 1 & & 1 & & & 1 & 1 & & & \ref{sect:MUTARR_PTRS_FROZEN} \\ + +@FOREIGN@ & 1 & & 1 & & & & 1 & & & \ref{sect:FOREIGN} \\ + +@BH@ & & 1 & 1 & & ? & ? & & 1 & ? & \ref{sect:BH} \\ +@MVAR@ & 1 & & 1 & & & & & & & \ref{sect:MVAR} \\ +@IVAR@ & 1 & & 1 & & & & & & & \ref{sect:IVAR} \\ +@FETCHME@ & 1 & & 1 & & & & & & & \ref{sect:FETCHME} \\ \hline \end{tabular} @@ -1328,6 +1458,83 @@ under evaluation (BH), or by now an HNF. Thus, indirections get NoSpark flag. +\subsection{Hugs Objects} + +\subsubsection{Byte-Code Objects} +\label{sect:BCO} + +A Byte-Code Object (BCO) is a container for a chunk of byte-code, +which can be executed by Hugs. The byte-code represents a +supercombinator in the program: when Hugs compiles a module, it +performs lambda lifting and each resulting supercombinator becomes a +byte-code object in the heap. 
+ +There are two kinds of BCO: a standard @BCO@ which has an arity of one +or more, and a @BCO_CAF@ which takes no arguments and can be updated. +When a @BCO_CAF@ is updated, the code is thrown away! + +The semantics of BCOs are described in Section +\ref{sect:hugs-heap-objects}. A BCO has the following structure: + +\begin{center} +\begin{tabular}{|l|l|l|l|l|l|} +\hline +\emph{Fixed Header} & \emph{Layout} & \emph{Offset} & \emph{Size} & +\emph{Literals} & \emph{Byte code} \\ +\hline +\end{tabular} +\end{center} + +\noindent where: +\begin{itemize} +\item The entry code is a static code fragment/info table that +returns to the scheduler to invoke Hugs (Section +\ref{sect:ghc-to-hugs-closure}). +\item \emph{Layout} contains the number of pointer literals in the +\emph{Literals} field. +\item \emph{Offset} is the offset to the byte code from the start of +the object. +\item \emph{Size} is the number of words of byte code in the object. +\item \emph{Literals} contains any pointer and non-pointer literals used in +the byte-codes (including jump addresses), pointers first. +\item \emph{Byte code} contains \emph{Size} words of non-pointer byte +code. +\end{itemize} + +\subsubsection{@HUGS_AP@ objects} +\label{sect:HUGS-AP} + +There are two kinds of @HUGS_AP@ objects: a standard @HUGS_AP@, used +to represent thunks built by Hugs, and a @HUGS_PAP@, used for partial +applications. The only difference between the two is that a +@HUGS_PAP@ is non-updatable. + +\begin{center} +\begin{tabular}{|l|l|l|l|} +\hline +\emph{Fixed Header} & \emph{BCO} & \emph{Layout} & \emph{Free Variables} \\ +\hline +\end{tabular} +\end{center} + +\noindent where: + +\begin{itemize} + +\item The entry code is a statically-compiled code fragment/info table +that returns to the scheduler to invoke Hugs (Sections +\ref{sect:ghc-to-hugs-closure}, \ref{sect:ghc-to-hugs-return}). + +\item \emph{BCO} is a pointer to the BCO for the thunk. 
+ +\item \emph{Layout} contains the number of pointers and the size of +the \emph{Free Variables} field. + +\item \emph{Free Variables} contains the free variables of the +thunk/partial application/return address, pointers first. + +\end{itemize} + \subsection{Pointed Objects} All pointed objects can be entered. @@ -1372,7 +1579,7 @@ layout than dynamic ones: {\em Fixed header} & {\em Static object link} \\ \hline \end{tabular} \end{center} -Static function closurs have no free variables. (However they may refer to other +Static function closures have no free variables. (However they may refer to other static closures; these references are recorded in the function closure's SRT.) They have one field that is not present in dynamic closures, the {\em static object link} field. This is used by the garbage collector in the same way that to-space @@ -1406,9 +1613,8 @@ closures. That is \end{tabular} \end{center} -The SRT pointer in a data constructor's info table is never used --- the -code for a constructor does not make any static references. -\note{Use it for something else?? E.g. tag?} +The SRT pointer in a data constructor's info table is used for the +constructor tag, since a constructor never has any static references. There are several different sorts of constructor: \begin{itemize} @@ -2708,6 +2914,7 @@ to implement than Sparud's. \subsection{Internal workings of the Generational Collector} +\section{Dynamic Linking} \section{Profiling}