\documentclass{article}
-\usepackage{code}
+\usepackage{code,a4wide}
+
+\usepackage{graphics,epsfig,epic,eepic}
\setlength{\parskip}{0.25cm}
\setlength{\parsep}{0.25cm}
\newcommand{\note}[1]{{\em $\spadesuit$ #1}}
\begin{document}
-\title{The GHC storage manager}
-\author{Simon Peyton Jones}
+\title{The GHC Storage Manager}
+\author{Simon Peyton Jones and Sungwoo Park}
\makeatactive
\maketitle
-
\section{Introduction}
-This is a draft design document, not a completed thing.
+This document describes the details of the GHC storage manager, including
+the interface and implementation of each of its components.
\section{Goals}
-Storage management goals:
-
+Storage management goals are:
\begin{itemize}
\item Generational collection, supporting multiple generations.
-
\item The ability to pin the allocation
area into a few pages that we hope will fit entirely in the cache.
-
\item Allows objects to age within a generation before getting promoted.
-
\item Heap can grow as needed, rather than having to be pre-sized
by the programmer.
-
\item We support mark/sweep/compact collection for older generations.
This is a Good Thing when the live memory approaches the available
physical memory, because it reduces paging.
-
-\item Little OS support needed. No mmap etc. All that we require is
- the ability to @malloc@ (or whatever) a new chunk of memory.
+\item Little OS support needed. No @mmap()@ etc. All that we require is
+ the ability to call @malloc()@ to allocate a new chunk of memory.
There can be intervening ``sandbars'' allocated by other programs
- (e.g. DLLs or other @malloc@'d structures) between chunks of heap.
-
-\item Possible feature: ability to support mostly-copying conservative
-collection.[.bartlett mostly 1989, morrisset mostly.]
-Not top priority.
+ (e.g. DLLs or other @malloc()@'d structures) between chunks of heap.
\end{itemize}
-Language-support goals:
+Language-support goals are:
\begin{itemize}
\item The garbage collector ``shorts out'' indirection objects introduced
by the mutator (notably when overwriting a thunk with an indirection).
-
-\item The garbage collector executes selector thunks.
+\item The garbage collector executes selector thunks.
For example, a thunk for
@(fst x)@ where @x@ is a pointer to a pair @(a,b)@ would be
evaluated by the garbage collector to just @a@. This is an important
strategy for plugging space leaks.
-
\item The garbage collector traversese the code tree, as well as
-the heap data structures, to find which CAFs are live. This is a Royal Pain.
-
+the heap data structures, to find which CAFs are live. This is a royal pain.
\item The garbage collector finalises some objects (typically a tiny minority).
At the moment ``finalisation'' means ``call a C routine when this thing
-dies'' but it would be more general to schedule a call to a Haskell
+dies'' but it would be more general to schedule a call to a Haskell
procedure.
\end{itemize}
-Instrumentation goals:
-
+Instrumentation goals are:
\begin{itemize}
\item The garbage collector can gather heap-census information for profiling.
To this end we can force GC to happen more often than it otherwise would,
associated with each heap object.
\end{itemize}
+\section{The architecture of the storage manager}
+
+The storage manager is a component of the GHC system which is responsible
+for allocating fresh memory for new objects and reclaiming memory
+that is no longer used.
+It is built on a layered architecture and consists of four main parts:
+\emph{megablock allocator}, \emph{block allocator}, \emph{heap allocator},
+and \emph{garbage collector} (Figure~\ref{fig-architecture}).
+The megablock allocator communicates directly with the underlying
+operating system and forms the lowest level of the storage manager.
+The heap allocator and garbage collector lie in the topmost level of
+the storage manager and process requests from
+the mutator (the Haskell realm at the runtime) and the runtime system.
+The block allocator lies between the two levels.
+
+\begin{figure}[ht]
+\begin{center}
+\input{architecture.eepic}
+\caption{The overall architecture of the storage manager}
+\label{fig-architecture}
+\end{center}
+\end{figure}
-\section{Assumptions}
-
-Every garbage collector uses assupmtions about the sort of
-objects it manipulates, and how the mutator behaves. Here we
-collect our assumptions:
-\begin{itemize}
-\item Each heap object starts with a one-word header that encodes, or points
-to, information about the layout of the object. The details of this
-encoding aren't fundamental to the collector, but are, of course, rather
-important.
-
-\item Every heap pointer points to the beginning of
-an object; no pointers into the middle of objects.
-
-\item There are two spare bits in the LSB of a pointer, because
-every heap object is at least word aligned.
-
-\item
-Heap objects are packed contiguously, with no ``slop'' between them.
-This assumption makes it possible to scan a region of heap linearly.
-It's easy to arrange that objects are packed contiguously in the
-to-space of a copying collector, but the no-slop assumption also
-applies during mutation. This is a bit of a pain, because we often
-update a thunk with an indirection to the thunk's value; and the
-indirection isn't necessarily the same size as the thunk. This can be
-solved, but it takes a few more instructions.
-
-We know of two reasons to uphold the no-slop assumption:
-\begin{itemize}
-\item A {\em mostly-copying conservative (MCC) collector} has to deal with
-possible pointers from conservative regions.[.morrisset mostly.]
-Given a possible pointer the MCC collector must find the head of the object into
-which the putative pointer points. One easy way to do this is
-to scan forward from a known point; but only if there is no
-slop.
-\item An {\em incremental collector}
-will scan linearly during mutation.[.field while.]
-\end{itemize}
-\end{itemize}
-
-
-\section{\Block{}s}
+\section{The megablock allocator}
-The basic memory structure is similar to other BIBOP-based
-collectors.[.stop the bibop, reppy garbage collector, moss garbage toolkit.]
+The megablock allocator requests a chunk of physical memory of a fixed size,
+which is called a \emph{megablock}, from the operating system and returns it
+to the block allocator. A new megablock is not initialized by the
+megablock allocator; it is later initialized by the block allocator.
-A {\em \block} is a contiguous chunk of $2^K$ bytes, starting on a
-$2^K$-byte boundary. We expect a \block{} to be between 4k and 64k bytes.
+\subsection{Interface}
-A \block{} is the unit of allocation for the storage manager. That is,
-the {\em block allocator} hands over store to the mutator in multiples of
-one \block. (The mutator may then allocate multiple heap objects within
-a \block, of course.)
+\begin{description}
+\item[@void *getMBlock()@] allocates a single megablock and returns its
+starting address. @MBlock.c@.
+\item[@void *getMBlocks(nat n)@] allocates @n@ contiguous megablocks
+and returns their starting address. @MBlock.c@.
+\end{description}
-\Block{}s are often allocated individually, but for large
-objects, or just to reduce inter-block linkage costs, a
-contiguous group of \block{}s can be allocated; we call
-this a {\em \block{} group} or sometimes just {\em group}.
-The first \block{} of a group is called the {\em group head}.
+\subsection{Implementation}
-\subsection{\Block{} descriptors}
+Since the megablock allocator communicates directly with the underlying
+operating system, its implementation relies on memory allocation functions
+provided by the operating system; thus, the implementation varies between
+platforms.
+However, every megablock is always of a fixed size $2^K$ and aligned on a
+$2^K$ boundary, regardless of the platform
+(@MBLOCK_SIZE@ in @include/Constants.h@ defines the size of megablocks).
+@mblocks_allocated@ in @MBlock.c@ stores the number of megablocks allocated.
+
+For implementation details, see @MBlock.c@, @MBlock.h@, @include/Block.h@.
+
+\section{The block allocator}
+
+The block allocator divides a megablock returned by the megablock allocator
+into a contiguous group of \emph{block descriptors} followed by another
+contiguous group of \emph{blocks}.
+
+A block is a contiguous chunk of $2^K$ bytes, starting on
+a $2^K$-byte boundary (@BLOCK_SIZE@ in
+@include/Constants.h@ defines the size of blocks).
+Each block has its own associated block descriptor, which records the
+current state of the block.
+Both blocks and block descriptors are placed in a common megablock.
+
+Figure~\ref{fig-megablock} shows a megablock after initialization by the
+megablock allocator.
+Block descriptors occupy the lower address space and blocks the higher address
+space in the megablock.
+Since block descriptors are ordered linearly, we can always locate a block
+descriptor corresponding to a particular block from the starting address
+of the block.
+
+\begin{figure}[ht]
+\begin{center}
+\input{megablock.eepic}
+\caption{A megablock after initialization}
+\label{fig-megablock}
+\end{center}
+\end{figure}
+
+A block is the unit of allocation for the block allocator.
+That is, the block allocator hands over store to the heap allocator in multiples of
+one block, where multiple heap objects may be allocated.
+A contiguous group of blocks, which is called a \emph{block group}, can be
+directly handed over to the heap allocator to reduce inter-block
+linkage costs.
+The first block of a block group is called the \emph{group head}.\footnote{
+An alternative design has the block descriptor at the start of each block.
+This makes it easy to locate the block descriptor corresponding to a particular
+block, but is pessimistic for cache locality when fiddling with block descriptors.
+It also means that only the first block in a contiguous chunk of blocks can
+have a block descriptor. This in turn makes it difficult to achieve an
+efficient mostly-copying conservative (MCC) garbage collector.}
+
+\subsection{Block descriptors}
+
+A block descriptor has the following structure, defined in
+@include/Blocks.h@:
-Each block has an associated {\em block{} descriptor}, a
-record with the following structure:
\begin{code}
- typedef struct {
- int Gen;
- int Step;
- void *Start;
- void *Free;
- bdescr *Link
- } bdescr;
+typedef struct _bdescr {
+ StgPtr start;
+ StgPtr free;
+ StgWord32 blocks;
+ struct _bdescr *link;
+ /* additional fields */
+} bdescr;
\end{code}
-The function @BDescr(a)@ takes an address @a@ and returns
-the address of its block descriptor. It is expected that
-@BDescr@ only takes a few instructions (see below).
-The field of a \block{} descriptor have the following
-purposes:
-\begin{center}
-\begin{tabular}{lp{4.5in}}
-@Gen@ & The generation to which this block belongs. The
-youngest generation is 0. Valid for all blocks whether or not
-it is a group head.
-\\ \\
-@Step@ & The number of GCs that this block has survived
-while within its current generation. Only valid for group heads.
-\\ \\
-@Start@ & Points to the first byte of the block itself.
-Valid for all blocks.
-\\ \\
-@Free@ & For a group head, @Free@ points to the first free (un-allocated)
-byte in the group. For non-group heads, @Free@ is set to zero;
-this identifies the \block{} as a non-head.
-\\ \\
-@Link@ & Used to chain \block{}s together. For non-group-heads,
-@Link@ points to the (\block{} descriptor of the) group head.
-\end{tabular}
-\end{center}
-\subsection{\Step{}s}
+The fields of a block descriptor have the following purposes:
-A {\em \step} is a linked list of \block{} groups, {\em not} necessarily contiguous,
-all belonging to the same generation,
-and all of which have survived the same number of garbage collections.
+\begin{description}
+\item[@start@] points to the first byte of the corresponding block.
+\item[@free@] For a group head, @free@ points to the first free byte in
+the block group. For a non-group head, @free@ is set to zero to identify
+the corresponding block as a non-group head.
+\item[@blocks@] For a group head, @blocks@ stores the number of blocks
+in the block group. It is not used for non-group heads.
+\item[@link@] For a group head, @link@ is used to chain all individual
+blocks or block groups together. For a non-group head, @link@ points
+to the block descriptor of the group head.
+\end{description}
-We think of a \step{} as {\em logically} contiguous (via their @Link@ fields) even though
-it is not physically contiguous.
+\subsection{Interface}
-The \step{} number of a \step{} indicates its age (in GCs) within its generation.
-(\Step{} zero is the youngest \step{} in a generation.) During GC of generation G,
-live objects
-from \step{} N are moved to \step{} N+1, and live objects from \step{} @MaxStep@(G)
-are promoted to generation G+1. In this way, objects are given a decent chance
-to die before being promoted.
+\begin{description}
+\item[@void initBlockAllocator(void)@] initializes the block allocator.
+@BlockAlloc.c@.
+\item[@bdescr *allocBlock(void)@] requests a single block and returns
+the address of its block descriptor. @BlockAlloc.c@.
+\item[@bdescr *allocGroup(nat n)@] allocates a block group of size @n@
+and returns the address of the block descriptor for the group head.
+@BlockAlloc.c@.
+\item[@void freeGroup(bdescr *p)@] frees the block group where @p@ points
+to the block descriptor of the group head, and places it in a pool of
+free block groups. @BlockAlloc.c@.
+\item[@bdescr *Bdescr(StgPtr p)@] takes a pointer @p@ to any byte within
+a block and returns a pointer to its block descriptor. It is implemented as
+an @inline@ procedure. @include/Block.h@.
+\end{description}
-In effect, we record the age for a set of objects grouped by address (the \step{}),
-rather than recording the age for each object individually. This is quite
-a conventional idea.
+\subsection{Implementation}
-\subsection{Generations}
+The block allocator maintains a linked list of free block groups, whose head
+is stored in @free_list@ in @BlockAlloc.c@ (Figure~\ref{fig-freelist}).
+When @allocBlock()@ or @allocGroup()@ is called, the block allocator
+scans the linked list from @free_list@ and finds the first block group
+which can handle the request.
+If such a block group exists, it takes off the requested number of blocks
+from the block group, creates a new block group from them,
+initializes it if needed, and returns it to the caller.
+The rest of the old block group, if any, is linked back to the list of free block
+groups as another block group.
+If such a block group does not exist, the block allocator requests a megablock
+from the megablock allocator and processes the request using the new megablock.
+
+For implementation details, see @BlockAlloc.c@.
+
+\begin{figure}[ht]
+\begin{center}
+\input{freelist.eepic}
+\caption{Linked list of free block groups}
+\label{fig-freelist}
+\end{center}
+\end{figure}
+
+\section{Heap allocator}
+
+The role of the heap allocator in the storage manager is to allocate fresh
+memory upon requests from the mutator and the runtime system.
+Memory allocation takes place frequently during the execution of Haskell
+programs, and hence its efficiency is crucial to the overall performance.
+To handle requests from the mutator and the runtime system efficiently,
+the heap allocator maintains three different memory stores,
+each of which has its own purpose.
+
+The first store is the \emph{nursery}, where typical Haskell
+objects are born.
+The mutator itself can allocate fresh memory directly in the nursery
+without invoking an interface function:
+the configuration of the nursery is always revealed to the mutator and can even
+be changed by the mutator when it allocates fresh memory from the nursery
+on its own.
+Thus, although the small overhead in manipulating the nursery results in fast
+memory allocation, it is up to the mutator to keep the nursery in an
+uncorrupted state.
+
+The second and the third are the \emph{small object pool} and the
+\emph{large object pool}.
+The heap allocator provides a common interface function to be shared by both stores:
+the size of fresh memory requested, which is passed as an argument to the
+interface function, determines which of the two stores to be used.
+The interface function can be called by both the mutator and the runtime system.
+
+\subsection{Interface}
+
+\begin{description}
+\item[@void allocNurseries(void)@] creates and initializes the nursery.
+@Storage.c@.
+\item[@void resetNurseries(void)@] re-initializes the nursery. @Storage.c@.
+\item[@Hp@] points to the byte before the first byte of the current allocation area
+(contiguous free bytes in the nursery).
+Defined as @MainRegTable.rHp@ in @include/Regs.h@.
+\item[@HpLim@] marks the boundary of the current allocation area:
+it points to the last byte in the current allocation area, and thus
+all the bytes of memory addresses from @Hp@$ + 1$ to @HpLim@ are free.
+The mutator can obtain fresh memory simply by adjusting @Hp@ as long as the new
+value of @Hp@ does not exceed @HpLim@. For instance, if the mutator
+increases @Hp@ by @n@, it can now store an object of size up to @n@ at the
+address pointed to by the old value of @Hp@$ + 1$.
+Defined as @MainRegTable.rHpLim@ in @include/Regs.h@.\footnote{@HpLim@ is not always
+correct. It may point event to a word in the next block.}
+\item[@OpenNursery(hp, hplim)@] opens an allocation area and sets @hp@ and
+@hplim@ appropriately. A macro in @include/StgStorage.h@.
+\item[@CloseNursery(hp)@] closes the current allocation area beginning at @hp@.
+A macro in @include/StgStorage.h@.
+\item[@ExtendNursery(hp, hplim)@] closes the current allocation area and
+tries to find a new allocation area in the nursery.
+If it succeeds, it sets @hp@ and @hplim@ appropriately and returns @rtsTrue@;
+otherwise, it returns @rtsFalse@,
+which means that the nursery has been exhausted.
+The new allocation area is not necessarily contiguous with the old one.
+A macro in @Storage.h@.
+\item[@StgPtr allocate(nat n)@] allocates @n@ words from either the small
+object pool or the large object pool, depending on the argument @n@,
+and returns a pointer to the first byte. It always succeeds.
+@Storage.c@.
+\end{description}
-A {\em generation} is a set of \step{}s
-(not necessarily contiguous). Most allocation takes
-place in generation zero, the youngest generation.
+\subsection{Implementation}
-The big difference between generations and \step{}s is that:
-\begin{itemize}
-\item Garbage collection applies to all of generation G and all younger
- generations (for some G). You cannot do garbage collection on just
- part of a generation.
+The nursery is implemented with a fixed number of blocks (@nursery_blocks@
+in @Storage.c@ specifies the number of blocks).
+Each of these blocks forms its own block group, and they are all linked together
+by @allocNurseries()@ (Figure~\ref{fig-nursery}).
+The head of the linked list is stored in @MainRegTable.rNursery@.
+These blocks are never freed.
-\item Consequently we have to track inter-generational pointers from
-older to younger generations; we do not
- track inter-\step{} pointers within a generation.
-\end{itemize}
+\begin{figure}[ht]
+\begin{center}
+\input{nursery.eepic}
+\caption{Nursery during the mutator time}
+\label{fig-nursery}
+\end{center}
+\end{figure}
+
+A single block called the \emph{active block} serves as the allocation area
+at any moment.
+When the free space left in the block is not enough for the request from
+the mutator, the heap allocator sets the @free@ field in the corresponding
+block descriptor to the first free byte in the block and moves the allocation
+area to the next block.
+The block descriptor of the active block is stored in
+@MainRegTable.rCurrentNursery@.
+@Hp@ and @HpLim@ are kept valid during the mutator time so that
+the mutator can obtain fresh memory on its own simply by adjusting @Hp@.
+
+The blocks in the nursery are carefully allocated in a contiguous address
+range so that they fit next to each other in the cache.
+
+The small object pool is implemented with a linked list of block groups,
+each of which consists of a single block (Figure~\ref{fig-smallobjectpool}).
+The head of the linked list is stored in @small_alloc_list@ in @Storage.c@.
+
+\begin{figure}[ht]
+\begin{center}
+\input{smallobjectpool.eepic}
+\caption{Small object pool}
+\label{fig-smallobjectpool}
+\end{center}
+\end{figure}
+
+The allocation in the small object pool is done in the same way as in the
+nursery; @alloc_Hp@ and @alloc_HpLim@ (both defined in @Storage.c@)
+point to the first free byte and the boundary of the small object pool,
+respectively.
+Thus, if when @allocate()@ is called and the heap allocator decides to
+allocate fresh memory in the small object pool, it simply increases @alloc_Hp@
+by the size of memory requested.
+If the allocation cannot be done in the current small object pool, the
+heap allocator calls @allocBlock()@ to obtain a new block from the block
+allocator, puts it to the head of the linked list, and
+sets @alloc_Hp@ and @alloc_HpLim@ appropriately.
+
+The large object pool is also implemented with a (doubly) linked list of block
+groups (Figure~\ref{fig-largeobjectpool}).
+The difference from the small object pool is that each block group stores only
+a single object: each time the argument to @allocate()@ is
+greater than a threshold value (computed from @LARGE_OBJECT_THRESHOLD@
+in @include/Constants.h@), a new block group accommodating the requested size
+is created to store a single object.
+The new block group is put to the head of the list.
+The head of the linked list is available as @g0s0->large_objects@.\footnote{The
+meaning of @g0s0@ is explained in the next section.}
+
+\begin{figure}[ht]
+\begin{center}
+\input{largeobjectpool.eepic}
+\caption{Large object pool}
+\label{fig-largeobjectpool}
+\end{center}
+\end{figure}
+
+For implementation details, see @Storage.c@ and @include/StgStorage.h@.
+
+\section{Garbage collector}
+
+The garbage collector finds all the objects unreachable from a given set of
+roots and frees the memory allocated to them. By invoking the
+garbage collector regularly, the storage manager prevents the heap from
+growing indefinitely and allows Haskell programs to be executed at a
+reasonable memory cost.
+
+The garbage collector in the storage manager is based upon the generational
+garbage collection algorithm.
+The storage manager records the age for every object in the heap.
+An object surviving one garbage collection grows old by one \emph{step},
+and an object surviving a certain number of garbage collections
+is promoted to the next \emph{generation}.
+That is, a step can be defined as a collection of objects which have survived
+the same number of garbage collections (or a collection objects which are
+born at some point between two particular successive garbage collections),
+and a generation as a group of steps belonging to a certain range of ages.
+Notice that the unit of garbage collections is not step but generation:
+a garbage collection applies to all the steps in a generation, and we cannot
+perform a garbage collection just on part of a generation.
+Furthermore, if a particular generation is garbage collected, so are
+all the younger generations.\footnote{Some
+authors define a generation as the set of
+all the objects created between two particular garbage collection and
+an object cannot change its generation (e.g., 1960's, 1970's, and so on).
+In this document,
+an object can change its generation as it survives garbage collections
+(e.g., teenagers, 20's, and so on).}
+
+Figure~\ref{fig-generation} illustrates how an object grows old.
+Every object is created in step $0$ of generation $0$.
+As it survives garbage collections, it is moved to the next step of the
+same generation until it is finally promoted to the next generation and
+moved to step $0$ of the next generation:
+during a garbage collection of generation $g < G$, live objects from
+step $s < S$ are moved to step $s + 1$, and live objects from
+the last step $S$ are promoted to step $0$ in generation $g + 1$.
+Live objects in step $0$ of generation $G$ stay in the same step;
+the oldest generation maintains only one step because there is no point
+in aging objects in the oldest generation.
+In this way, objects are given a decent chance of dying before being
+promoted to the next generation.
+
+\begin{figure}[ht]
+\begin{center}
+\input{generation.eepic}
+\caption{Evolution of objects through garbage collections}
+\label{fig-generation}
+\end{center}
+\end{figure}
+
+The main reason that we separate steps from generations is to
+reduce the cost of maintaining \emph{backward inter-generational pointers},
+that is, pointers from older generations to younger generations.
+Suppose that a garbage collection applies to all generations $0$
+through $g$. If an object @O@ in one of these generations is pointed to
+by another object in generation $g' > g$, we cannot free the object @O@
+even though generation $g'$ is out of consideration. Consequently
+we have to track backward inter-generational pointers to perform garbage
+collections correctly.
+Since maintaining backward pointers is costly, we
+choose to track backward inter-generational pointers only;
+we do not track backward inter-step pointers.
+
+By grouping all the objects created between two successive garbage collections
+and grouping multiple age groups into one generation, the garbage
+collector makes an efficient use of heap memory.
+
+\subsection{Steps}
+
+A step has the following structure, defined in
+@include/StgStorage.h@:
-\subsection{The \block{} allocator}
+\begin{code}
+typedef struct _step {
+ unsigned int no;
+ bdescr *blocks;
+ unsigned int n_blocks;
+ bdescr *large_objects;
+ /* additional fields */
+} step;
+\end{code}
-The {\em \block{} allocator} is responsible for
-allocating blocks. It mediates between the higher levels of the
-storage manager and the operating system.
+The fields of a step have the following purposes (Figure~\ref{fig-step}):
-It supports the following API:
\begin{description}
-\item{@InitialiseBlockAllocator()@} initialises the \block{} allocator.
-\item{@bdescr *AllocBlocks( int n )@} allocates a {\em contiguous}
-group of @n@ \block{}s, returning a pointer to the \block{}
-descriptor of the first. Returns @NULL@ if the request can't be
-satisfied. The block descriptors of the group are initialised by the block allocator.
-\item{@FreeBlocks( bdescr *p )@} returns a block group to the block allocator.
-\item{@BDescr( void *p )@} takes a pointer @p@ to any
-byte within a block{} and returns a pointer to its \block{}
-descriptor. It does no error checking, and fails horribly if the @p@
-does not point into a \block{} allocated by the \block{} allocator.
-It is probably implemented as an @inline@ procedure.
+\item[@no@] indicates the age within its generation.
+$0$ indicates the youngest step in a generation.
+\item[@blocks@] is a linked list of all the blocks in this step
+which contain small objects.
+Each block forms its own block group.
+\item[@n\_blocks@] is the number of blocks in the linked list @blocks@.
+\item[@large\_objects@] is a (doubly) linked list of all the block groups
+in this step which contain large objects.
+Each block group stores only a single object.
\end{description}
+\begin{figure}[ht]
+\begin{center}
+\input{step.eepic}
+\caption{Memory layout of a step}
+\label{fig-step}
+\end{center}
+\end{figure}
+
+The linked list @blocks@ of step $s$ in generation $g$ is created
+during a garbage collection
+from live small objects of step $s - 1$ in the same generation
+(or the last step in the previous generation if $s = 0$).
+The @free@ field in every block descriptor never changes because
+no objects are added after the garbage collection; new objects are created
+only in step $0$ in generation $0$.
+Likewise, the linked list @large_objects@ is created during a
+garbage collection from live large objects of the previous step.
+
+There are three exceptions to the above rules.
+First, both @blocks@ and @large_objects@ of
+step $0$ in generation $0$ are not filled with new objects during a garbage
+collection.
+They are simply re-initialized by the garbage collector and
+grow during during the execution of a program as new objects are
+created.
+Step $0$ in generation $0$ is accessible via a global variable @g0s0@,
+and this is the reason why the large object pool (described in the previous
+section) is indeed stored in @g0s0->large_objects@.
+For the same reason, @MainRegTable.rNursery@ holds the same address as
+@g0s0->blocks@ all the time.
+Second, @blocks@ of step $1$ in generation $0$ is created not only from
+the nursery (@blocks@ of step $0$ in the same generation) but also from the
+small object pool. In other words, all the live small objects created since
+the previous garbage collection, either directly by the mutator or indirectly
+through @allocate()@, are gathered together in the same linked list.
+Finally, step $0$ of the last generation serves the source for itself during
+any garbage collection (because there exists no older step).
-\subsection{Implementation}
+\subsection{Generations}
+
+A generation has the following structure, defined in
+@include/StgStorage.h@:
-Our current implementation plan for the \block{} allocator is as
-follows. It allocates memory from the OS in {\em megablocks}. A
-megablock is a contiguous chunk of memory of $2^M$ bytes or less,
-starting on a $2^M$ byte boundary. All the \block{} descriptors are
-laid out in an array at the start of the megablock. Each is less than 32 bytes
-long. So to get from an address @p@ to its block descriptor, @BDESCR@
-does this:
\begin{code}
- bdescr *BDescr( void *p ) {
- return (((p>>(K-5)) & Bmask) | (p & Mmask)));
- }
+typedef struct _generation {
+ unsigned int no;
+ step *steps;
+ unsigned int n_steps;
+ unsigned int max_blocks;
+ StgMutClosure *mut_list;
+ StgMutClosure *mut_once_list;
+ /* additional fields */
+} generation;
\end{code}
-Depending on how many \block{}s fit in the megablock, the first
-one or more descriptors will be unused.
-
-(An alternative design has the \block{} descriptors at the start of
-each \block{}. This makes @BDescr@ shorter, but is pessimal for
-cache locality when fiddling with \block{} descriptors.
-It also means that only the first block in a contiguous chunk of
-blocks can have a descriptor. That in turn makes life difficult
-for a mostly-copying conservative collector. Given a possible pointer
-from a conservative region, a MCC collector needs
-to find the start of the object into which the putative pointer
-points. One way to do this is to scan from the start of the
-contiguous chunk of \block{}s into which the pointer points; and to
-find that it helps to have a descriptor for each \block{}.)
-
-The \block{} allocator maintains a pool of free \block{}s.
-Because it may be asked for a contiguous chunk of \block{}s it
-needs to keep the free-list ordered (or something similar).
+The fields of a generation have the following purposes (Figure~\ref{fig-gen}):
-\section{The Storage Manager API}
-
-The storage manager supports the following API.
\begin{description}
-\item[@void *Allocate( int n )@] allocates @n@ bytes and returns
-a pointer to the first byte of the block. Returns @NULL@ if
-the request could not be satisfied --- @Allocate@ does not call
-the garbage collector itself. It is the responsibility of the caller
-to completely fill in the allocated chunk with objects packed end to
-end. @OpenNursery( void **p_hp, void **p_hplim )@
-
-\item[@OpenNursery( void **p\_hp, void **p\_hplim )@]
-returns pointers in @*p_hp@ and @*p_hplim@, to the beginning and end
-of a free chunk of store.
-More precisely, @*p_hp@ points to the first byte of the region
-and @*p_hplim@ to the first byte beyond the end of the region.
-@*p_hp@ is set to @NULL@ if there is no chunk available.
-
-@OpenNursery@ is used by the main mutuator to get a chunk of
-contiguous store to allocate in. It allocates until @hp@ gets
-too close to @hplim@ and then calls @ExtendNursery@ to try
-to get a new region of nursery.
-
-\item[@ExtendNursery( void **p\_hp, void **p\_hplim )@]
-returns to the storage manager the region handed out by
-the previous @OpenNursery@ or @ExtendNursery@. Suppose that
-this previous call returned with @*p_hp@ set
-to $hp$ and @p_hplim@ set to
-$hplim$.
-Then the @*p_hplim@ given to @ExtendNursery@ should be the same
-as $hplim$, and @*p_hp@ should be greater than or equal to $hp$ and
-less than or equal to $hplim$.
-
-@ExtendNursery@ tries to find another suitable region. If it
-succeeds, it sets @*hp@ and @*hplim@ appropriately; if not, it sets
-@*hp@ to @NULL@. The new region is not necessarily contiguous
-with the old one.
-
-\note{Maybe @ExtendNursery@ should be a macro, so that we can pass a
-globally allocated register to it? (We can't take the address of a register.)
-If so then presumably so should @OpenNursery@.}
-
-\item[@ZapNursery()@] arranges that the next call to @ExtendNursery@
-will fail. @ZapNursery@ is called by asynchronous interrupts to arrange that
-execution is brought to an orderly halt.
-
-\item[@GarbageCollect( void *Roots )@] performs garbage collection.
-The parameter @Roots@ is a procedure which is called by the garbage
-collector when it wishes to find all the roots. This procedure
-should in turn call @MarkRoot@ on each such root.
-
-\item[@void *MarkRoot( void *p )@] informs the garbage collector that
-@p@ is a root. It returns the new location of the object.
-
-\item[@RecordMutable( void *p )@] informs the storage manager that
-@p@ points to an object that may contain pointers to objects in
-a younger generation.
-
-It is not necessary to call @RecordMutable@ on objects known to
-be in the nursery.
-
-It is only necessary to call @RecordMutable@ once. For objects that
-are genuinely mutable (like mutable arrays), the object is permanently
-recorded as mutable. On the other hand, thunks that are updated only
-once can be dropped from the mutables pool once their pointer has been
-dealt with (promoted).
-
-\item{@boolean InNursery( void *p )@} tells whether @p@ points into
-the nursery. This should be a rather fast test; it is used to guard
-calls to @RecordMutable@.
+\item[@no@] is the generation number.
+\item[@steps@] points to an array of @step@ structures. @steps[@$i$@]@
+corresponds to step $i$ in this generation, i.e.,
+@steps[@$i$@].no@ is equal to $i$.
+\item[@n\_steps@] is the number of @step@ structures in the array pointed to
+by @steps@.
+\item[@max\_blocks@] is the maximum number of blocks allowed in step $0$ of
+this generation. If the number of blocks allocated
+in step @0@ exceeds @max_blocks@,
+this generation is garbage collected during the next garbage collection.
+\item[@mut\_list@] links all mutable objects in this generation, that is,
+objects whose contents can be updated and hence may contain pointers to
+younger generations.
+Every object in this linked list is a dynamic object residing in the heap
+and has a structure compatible with @StgMutClosure@.
+The structure @StgMutClosure@ (@includes/Closures.h@) has a field
+@mut_link@ (called a mutable link field) of type @StgMutClosure *@, which
+points to the next object in this linked list.
+The end mark of this linked list is a pointer to a statically allocated object
+@END_MUT_LIST@ (@StoragePriv.h@).
+\item[@mut\_once\_list@] links objects in this generation whose contents
+cannot be updated any more but may already have pointers to younger generations.
+As with @mut_list@, it links only those objects whose structure is compatible
+with @StgMutClosure@ and ends with @END_MUT_LIST@.
\end{description}
-
-\section{Allocation}
-
-We want to experiment with the idea of keeping the youngest generation
-entirely in the cache, so that an object might be allocated, used,
-and garbage collected without ever hitting main memory.
-
-To do this, generation zero consists of a fixed number of one-\block{}
-\segment{}s, carefully allocated in a contiguous address range so that
-they fit snugly next to each other in the cache.
-(Large objects are allocated directly into generation 1,
-so that there is no need to find contiguous \block{}s in generation 0 --- Section~\ref{sect:large}.)
-
-After a GC, the storage manager decides how many generation-0 \segment{}s
-(= \block{}s) can be
-allocated before the next minor GC; it chains these \segment{}s together
-through their @SegLink@ fields. The storage manager then makes the allocation pointer
-(@Hp@) point to the start of the first \segment{}, and the limit pointer
-(@HpLim@) point to the end of that \segment{}.
-Objects are allocated by incrementing the @Hp@ until
-it hits @HpLim@. If the @SegLink@ field of the current \segment{} is NIL,
-then it is time for GC; otherwise @Hp@ is set to point to the first data
-word of the next \segment{},
-and @HpLim@ to the end of the \segment{}, and allocation
-continues. So between GCs allocation is only interrupted briefly
-(i.e. a handful of instructions)
-to move @Hp@/@HpLim@ on to the next \segment{}. Actually, this regular
-interruption can be useful, because it makes a good moment to pre-empt
-a thread to schedule another, or to respond to an interrupt.
-(Pre-emption cannot occur at arbitrary times because we are assuming
-an accurate garbage collector.)
-
-We expect there to be two \step{}s in generation zero, \step{} 0 and \step{} 1.
-At a minor GC we promote live objects in \step{} 1, and copy \step{} 0 to
-become a new \step{} 1. The new \step{} 1 should fit also fit into the cache.
-Suppose there are 16 generation-zero \segment{}s. A reasonable budget
-might be this:
+\begin{figure}[ht]
\begin{center}
-\begin{tabular}{l}
- 10 \segment{}s to allocate into (\step{} 0) \\
- 3 \segment{}s for \step{} 1 \\
- 3 \segment{}s free for the new \step{} 1
-\end{tabular}
+\input{gen.eepic}
+\caption{Memory layout of a generation}
+\label{fig-gen}
\end{center}
+\end{figure}
+
+The garbage collector maintains an array @generations@ of @generation@ structure
+(defined in @Storage.c@), whose size is stored in a runtime system flag
+(@RtsFlags.GcFlags.generations@).
+The generation number of each generation coincides with its index into
+the array @generations@, i.e., @generations[@$i$@].no@ is equal to $i$.
+
+As mentioned before, lists of objects which may have pointers to younger
+generations are kept per generation, not per step. The youngest generation,
+accessible via a global variable @g0@, does not keep such a list because it
+does not have younger generations.
+
+The oldest generation, accessible via a global variable @oldest_gen@, may
+contain static objects (as opposed to dynamic objects residing in the heap)
+in its list @mut_once_list@. This happens when a static
+thunk, also known as a \emph{constant applicative form} (CAF), is entered.
+When a CAF (corresponding to closure type @THUNK_STATIC@, defined
+in @includes/ClosureTypes.h@) is entered,
+it is first put in the list @mut_once_list@ of the oldest generation
+and then overwritten with an appropriate static indirection object
+(corresponding to closure type @IND_STATIC@).\footnote{Actually a static
+indirection object does not have a @mut\_link@ field.
+We use its @static\_link@ field as a substitute for @mut\_link@.
+See the structure @StgIndStatic@ in @include/Closures.h@.}\footnote{For
+details of this operation, see the macro @UPD\_CAF()@ in @includes/Updates.h@}
+If the CAF is dynamically loaded (e.g., in an interactive environment), it is
+instead put in a separate linked list @caf_list@
+(declared in @Storage.c@).
+
+The evaluation result of the
+CAF is stored in a separate dynamic object in the heap and the static
+indirection object has a pointer to the dynamic object.
+Thus, the new static indirection object is put in the list
+@mut_once_list@ of the oldest generation (or the list @caf_list@) so that the
+dynamic object is not removed during the next garbage collection.
+Once it is created, the static indirection object remains unaltered, which
+is the reason why it is put in the @mut_once_list@ list, not in the
+@mut_list@ list.
+Since the static indirection object survives any garbage collection (because
+it comes from a static object) and would be eventually moved to the oldest
+generation,
+we put it in the @mut_once_list@ of the oldest generation as soon
+as it is created.
+
+\subsection{Interface}
-This assumes a 30\% survival rate from \step{} 0 to \step{} 1. We can do in-flight
-measurements to tune this. And (MOST IMPORTANT) if we get it wrong, and
-get a blip in which 9 \segment{}s survive minor GC, all is not lost... we simply
-have to use some new \block{}s and generation zero gets bigger for a while
-and perhaps less cache resident. (This contrasts with systems which rely
-on contiguous partitions, where an overflow is catastrophic, and which must
-therefore over-reserve. We get better address-space usage by not having
-to be so conservative. Again, this isn't a new observation.)
-
-
-\section{Doing garbage collection}
+\begin{description}
+%\item[@StgClosure *MarkRoot(StgClosure *root)@] informs the garbage collector
+%that @root@ is an object in the root set. It returns the new location of
+%the object. @GC.c@.
+\item[@void *mark\_root(StgClosure **root)@] informs the garbage collector
+that @*root@ is an object in the root set. It replaces @*root@ by
+the new location of the object. @GC.c@.
+\item[@void GarbageCollect(void (*get\_roots)(evac\_fn), rtsBool force\_major\_gc)@]
+performs a garbage collection.
+@get_roots()@ is a function which is called by the garbage collector when
+it wishes to find all the objects in the root set (other than those
+it can find itself).
+Therefore it is incumbent on the caller to find the root set.
+@force_major_gc@ specifies whether a major garbage collection is required
+or not. If a major garbage collection is not required, the garbage collector
+decides an oldest generation $g$ to garbage collect on its own.
+@GC.c@.
+\item[@rtsBool doYouWantToGC(void)@] returns @rtsTrue@ if the garbage
+collector is ready to perform a garbage collection. Specifically, it returns
+@rtsTrue@ if the number of allocated blocks since the last garbage collection
+(@alloc_blocks@ in @Storage.c@) exceeds an approximate limit
+(@alloc_blocks_lim@ in @Storage.c@).
+@Storage.h@.
+\item[@void recordMutable(StgMutClosure *p)@] informs the garbage collector
+that a previously immutable object @p@ has become mutable.
+The garbage collector then puts the object @p@ in the list @mut_list@ of the
+generation to which it belongs.\footnote{It is easy to
+locate the generation to which a dynamic object belongs from its address:
+we can identify the block in which the object resides from its address,
+and the corresponding block descriptor stores pointers
+to the step and the generation (@gen@ and @step@ fields in the @bdescr@
+structure) to which it belongs.}
+It suffices to call @RecordMutable()@ only once for any object.
+
+For an object which is genuinely mutable (e.g., mutable arrays),
+it is permanently recorded as mutable.
+On the other hand,
+an object which is temporarily mutable (e.g., frozen arrays),
+can be dropped from the list @mut_list@ once its pointer has been dealt with
+during garbage collections. @Storage.h@.
+\item[@void recordOldToNewPtrs(StgMutClosure *p)@] puts the object @p@ in the
+list @mut_once_list@ of the generation to which it belongs.
+\item[@void newCAF(StgClosure *caf)@] puts the CAF @caf@ either
+in the list @caf_list@ or
+in the list @mut_once_list@ of the oldest generation,
+depending on whether it is dynamically loaded or not.
+\end{description}
-When GC happens, we first decide which generation to collect. (How?)
-When we collect generation G we collect all younger generations too.
+\subsection{Implementation}
-To begin with we assume a copying style garbage collection
-algorithm. The structure of a full GC is as follows:
+The overall structure of a garbage collection is as follows:
\begin{enumerate}
-\item Initialise
-\item For each root @r@ do @r:=Evacuate(r)@.
-\item Scavenge
-\item Tidy up
+\item[(1)] Initialize.
+\item[(2)] Scavenge lists @mut_once_list@ and @mut_list@ if necessary.
+\item[(3)] Scavenge CAFs.
+\item[(4)] Evacuate roots.
+\item[(5)] Scavenge objects.
+\item[(6)] Tidy up.
\end{enumerate}
-A {\em root} is either a pointer held by the RTS (notably the queue
-of runnable threads), or a pointer from a generation > G that might
-point into a generation <= G.
+\subsubsection{(1) Initialization}
+
+During initialization, the garbage collector first decides which generation
+to garbage collect.
+Specifically,
+if the argument @force_major_gc@ to @GarbageCollect()@ is @rtsFalse@,
+it decides the greatest generation number $N$ such
+that the number of blocks allocated in step $0$ of generation $N$ exceeds
+@generations[@$N$@].max_blocks@.
+If the argument @force_major_gc@ to @GarbageCollect()@ is @rtsTrue@,
+$N$ is set to the greatest generation number, namely,
+$@RtsFlags.GcFlags.generations@ - 1$.
+The garbage collector considers up to generation $N$ for garbage collection.
+A major garbage collection takes place if $N$ is set to
+$@RtsFlags.GcFlags.generations@ - 1$ during this process.
+
+Then, the garbage collector initialize the \emph{to-space} (as opposed to
+\emph{from-space}) for each step of
+each generation, which is complete with an \emph{allocation pointer} and
+an \emph{sweep pointer}.
+The to-space of a step is the memory to which any object belonging to the
+step can be copied when it survives a garbage collection.
+For instance, a live object in step $s$ of generation $g$ can first be copied
+to the to-space associated with step $s$, which eventually becomes
+associated with the next step $s + 1$ (or step $0$ of the next generation)
+during tidying up.
+This operation effectively moves an object to the next step if it survives
+a garbage collection.
+The allocation pointer points to the next free in the to-space while
+the sweep pointer points to the next object considered for scavenging.
+
+During major garbage collections,
+the static link field of every static object indicates whether it has
+been visited by the garbage collector or not.
+Therefore, the static link field of every static object must have
+a null value before a major garbage collection starts.
+The list @mut_once_list@ of the oldest generation may contain static
+indirection objects, and thus
+the garbage collector invokes @zero_mutable_list()@ on the list,
+Although this breaks up the list, it does not cause any problem because
+the list is not employed during major garbage collections.
+
+\subsubsection{\tt evacuate()}
+
+The function @evacuate()@ (defined in @GC.c@), which
+is called eventually for every live object
+(including even static objects reachable from roots),
+moves an object to
+a safe place so as not to be garbage collected.
+Before invoking the function @evacuate()@ on an object @o@, the caller specifies
+a \emph{desired generation} for @o@ in a variable @evac_gen@
+(declared in @GC.c@).
+The desired generation is the youngest generation to which the caller wishes
+@o@ to be evacuated; the garbage collector should evacuate @o@ to a
+generation no younger than the desired generation.
+
+Depending on @evac_gen@ and the generation $M$ where @o@ currently resides,
+@evacuate()@ behaves itself as follows:
+\begin{itemize}
+\item If @evac_gen@ $\leq M$ and $N < M$, it does nothing because @o@ is already
+ in a generation no younger than @evac_gen@.
+\item If @evac_gen@ $\leq M \leq N$, it evacuates @o@ to the to-space of the
+step to which @o@ currently belongs. @o@ will be moved to the next step later.
+@recordMutable()@ may be invoked on @o@ depending on its type (e.g., @MVAR@).
+\item If $M <$ @evac_gen@, @o@ is evacuated to the to-space of step $0$
+ of generation @even_gen@, which accomplishes the request.
+ This happens even when $N \leq$ @evac_gen@. Therefore, those generations
+ which are not considered for garbage collection may still be augmented
+ with new objects during garbage collection.
+ @recordMutable()@ may be invoked on @o@ depending on its type.
+\end{itemize}
+If @o@ has already been evacuated, @evacuate()@ either does nothing (when
+@even_gen@ $\leq M$) or reports
+a failure to evacuate @o@ by setting the flag @failed_to_evac@ (declared
+in @GC.c@).
+
+Evacuating a large object is handled by @evacuate_large()@.
+Since it is costly to allocate new memory blocks and copy all the contents
+of the object, the garbage collector simply removes the object form
+the list @large_alloc_list@ of its step and links it to another list,
+from which it will be scavenged later.
+
+\subsubsection{Set of roots for garbage collection}
+Part of the set of roots for garbage collection is obtained indirectly by
+invoking the function
+@get_roots()@, an argument to @GarbageCollect()@: the garbage collector
+invokes @get_roots()@ with @mark_root()@ as an argument, and @get_roots()@
+in turn invokes @mark_root()@ on each of known roots.
+The rest of the set of roots is obtained from the lists @mut_list@ and
+@mut_once_list@ of generation $N + 1$ through the oldest generation:
+any objects in these lists may have pointers to objects in generations
+$0$ to $N$, and thus must be considered as a root.
+If a major garbage collection takes place, no @mut_list@ and @mut_once_list@
+lists are consider for scavenging and step (2) is skipped.
+The entire set of roots is now specified by @get_roots()@ alone.
+
+\subsubsection{(2) Scavenging lists {\tt mut\_once\_list} and {\tt mut\_list}}
+
+Since the roots obtained from the lists @mut_list@ and @mut_once_list@ are
+already in generations $N' > N$, we only have to scavenge them.
+That is, it suffices to invoke @evacuate()@ once on each object
+which is currently pointed to by an object in these lists.
+
+When scavenging an object @r@ in the list @mut_once_list@ of generation $M$,
+the desired generation is set to $M$ for each object @o@ pointed
+to by @r@ before invoking @evacuate()@.
+The rationale is that the contents of @r@ cannot be updated any more,
+and thus @r@ is always survived by @o@; @o@ is live as long as @r@ is.
+Therefore, we wish @r@ to be evacuated to the same generation $M$ as @r@
+currently resides (not to its next step).
+If the evacuation succeeds (indicated by a @rtsFalse@ value of a variable
+@failed_to_evac@, declared in @GC.c@) for every object @o@, @r@ is removed
+from the list @mut_once_list@ because it does not hold any backward
+inter-generational pointers.\footnote{It turns out that @r@ can have only
+one such object @o@. The type of @r@ is one of the following:
+@IND\_OLDGEN@, @IND\_OLDGEN\_PERM@, @IND\_STATIC@, and @MUT\_VAR@.}
+
+Scavenging a list @mut_list@ is similar to the case of @mut_once_list@.
+When scavenging an object @r@ in the list @mut_list@ of generation $M$,
+the desired generation is set to $M$ for each object pointed to by @r@
+if @r@ is known to be immutable (e.g., @MUT_ARR_PTRS_FROZEN@,
+@IND_OLDGEN@)
+or to $0$ if @r@ is still mutable (e.g., @MUT_ARR_PTRS@, @MUT_VAR@).
+The list @mut_once_list@ is also adjusted if it is safe to remove @r@ from
+@mut_list@.
+
+\subsubsection{(3) Scavenging CAFs}
+
+When a dynamically loaded CAF is entered, it it first put to the list
+@caf_list@ and then overwritten with a static indirection object.
+The evaluation result of the CAF is stored in a dynamic object in the heap
+and the static indirection object stores a pointer to the dynamic object.
+Although the static indirection object (or the CAF) itself is never freed,
+it may be removed later from the @caf_list@ when it is reverted to the
+original CAF, and the dynamic object may not be live afterwards.
+Hence, we treat the dynamic object just as normal dynamic objects and
+set the desired generation to $0$.
+
+\subsubsection{(4) Evacuating roots}
+
+Evacuating roots (other than those in the lists @mut_once_list@ and
+@mut_list@) is simply done by invoking @get_roots()@ with @mark_root()@
+as an argument.
+Since these roots are normal dynamic objects, we set the desired generation
+to $0$.
+
+\subsubsection{(5) Scavenging}
+
+The garbage collector scavenges all the objects in the to-space of
+each step (by invoking @evacuate()@ on each object reachable from them)
+until every sweep pointer has reached its corresponding
+allocation pointer.
+It repeatedly examines all the to-spaces because not only sweep pointers
+but also allocation pointers change during scavenging:
+when an object @r@ is scavenged, each object reachable from
+@r@ is evacuated to a certain to-space, which increases the corresponding
+allocation pointer, and
+the sweep pointer of the to-space which currently contains @r@
+increases as well upon finishing scavenging the object @r@.
+Thus, the garbage collector cannot anticipate in advance how many times
+it needs to scan through all the to-spaces; it keeps scavenging until
+no objects are left to be scavenged.
+
+\subsubsection{Scavenging static objects}
+
+Since it is possible for dynamic objects to point to static objects,
+the garbage collector may invoke @evacuate()@ on static objects
+while scavenging dynamic objects in to-spaces.
+This complicates the garbage collector because
+static objects cannot be evacuated in general yet
+they may have pointers to dynamic objects, which must be evacuated.
+Thus the garbage collector needs to at least scavenge live static objects
+(as opposed to those static objects currently not reachable from roots).
+
+When a minor garbage collection is performed, any invocation of
+@evacuate()@ on static objects is simply ignored.
+Furthermore, no static object is considered for scavenging
+(except those in the list @mut_once_list@ of the oldest generation during).
+Still all dynamic objects which are marked as live due to static objects
+are safely evacuated.
+The reason is that we can reach all such dynamic objects from
+indirection static objects stored in the list
+@mut_once_list@ of the oldest generation, which is scavenged during step (2),
+and the list @caf_list@.
+In other words, in order to evacuate all such dynamic objects, it is
+sufficient to evacuate all dynamic objects reachable from
+static indirection objects in
+the list @mut_once_list@ of the oldest generation and the list @caf_list@.
+However, the garbage collector may unnecessarily scavenge certain static
+indirection objects which are no longer used.
+They are not scavenged during a major garbage collection, however.
+
+During a major garbage collection,
+if an invocation of @evacuate()@ on a static object @r@ is made,
+the garbage collector first checks whether @r@ needs to be scavenged or not.
+If its SRT (Static Reference Table) is empty and it has no other pointers,
+no dynamic objects are reachable from @r@ and it is ignored.\footnote{If
+no dynamic objects are reachable from a static object @r@ (even indirectly
+via multiple static objects),
+@r@ is not stored in \emph{any} SRT table because it would be no use attempting
+to follow any pointers in @r@.}
+Otherwise, it is put in the list @static_objects@.
+At the beginning of each scavenging loop in step (5),
+the garbage collector invokes @scavenge_static()@ if the list @static_objects@
+is not empty.
+@scavenge_static()@ scavenges the static objects in the list @static_objects@
+by invoking @evacuate()@ on every object reachable from them.
+The desired generation is set to the oldest generation (because any
+dynamic object directly pointed to by a static object lives
+forever).
+These static objects are then put in another list @scavenged_static_objects@
+and removed from the list @static_objects@.
+For a static indirection object, if the evacuation
+fails, it is put back to the list @mut_once_list@ of the oldest generation;
+it can be thought of as a CAF just entered.
+
+After a major garbage collection, therefore, the list @scavenged_static_objects@
+links all live static objects except for static indirection objects put back
+to the list @mut_once_list@ of the oldest generation.
+Dynamically loaded CAFs are found in the list @caf_list@.
+
+\subsubsection{(6) Tidying up}
+
+The garbage collector tidies up the heap by
+moving the to-space of each step to the next step.
+It also re-initialize the small object pool (which now does not contain
+any live objects), frees any large objects which have not been scavenged,
+and invokes @resetNurseries()@.
+If a major garbage collection has been performed, it
+invokes @zero_static_object_list()@ on the list @scavenged_static_objects@
+so that all static objects
+(other than those in the list @mut_once_list@ of the oldest generation)
+have a null static link field again.
+
+At this point, both the small allocation pool and the large object pool are
+empty. Upon the exit from @GarbageCollect()@, however, they may not
+be empty any more because the garbage collector invokes @scheduleFinalizer()@
+before exiting, which tries to run pending finalizers on dead weak pointers and
+may create new objects through @allocate()@.
+The nursery still remains intact.
+
+The heap may contain extra objects which are not reachable from the roots
+used during the garbage collection: 1) weak head pointers; 2) dead
+weak head pointers. Weak head pointers can be tracked from
+the list @weak_ptr_list@ (declared in @Weak.c@). However, there is no way
+of reaching dead weak pointers; they will be garbage collected during the
+next garbage collection.
+
+For implementation details, see @GC.c@.
+
+\section{Miscellaneous notes}
-The @Evacuate@ operation does this:
+\begin{itemize}
+\item To see how to add new fields to Haskell closures,
+see the document on the implementation of retainer profiling
+(section `Adding Retainer Set Fields').
-\begin{code}
-Evacuate(p) {
- /* Ignore pointers into non-collected generations */
- if generation(p) > G then return(p);
+\item To see how to traverse the graph and visit every live closure,
+see the document on the implementation of retainer profiling
+(section `Graph Traversal').
- /* Copy the object and return its new location */
- p' := copy the object pointed to by p into to-space(g);
+\item To see how to linearly scan the heap at any random moment during
+program execution, see the document on the implementation of LDVU profiling
+(section `Heap Censuses').
- return(p');
-}
-\end{code}
-The @Scavenge@ operation operation does this:
-\begin{code}
-Scavenge() {
- While there is an un-scavenged object O in to-space do
- for each pointer q in O do
- q := Mark(q)
- remember that O has been scavenged
-}
-\end{code}
-We say ``copy into to-space'' and ``find an un-scavenged object in to-space'',
-but actually each \step{} of each generation has a separate to-space,
-complete with allocation pointer and sweep pointer. (The sweep pointer
-finds the un-scavenged objects.) When copying
-into a to-space a check must be made for to-space
-overflow and a new \segment{} chained on if necessary.
-This is much as when the
-mutator is doing allocation. The difference is that we are allocating
-in multiple \segment{}s in an interleaved fashion, so we can't hold
-the allocation pointers in registers.
-
-Only when all the sweep pointers
-have reached the corresponding allocation pointer is GC done.
-This entails multiple passes through the \step{}s until a pass
-finds no un-scavenged objects. [NB: if we treated inter-generational pointers
-between generations <= G as roots, then we could avoid some of these multiple
-passes. But we'd collect less garbage, and hence copy more
-objects, so it would probably be slower.]
-
-Notice that there is no issue about partitioning the address space
-among generations, as there might be if each lived in a contiguous address
-space.
-
-\section{Large objects}
-\label{sect:large}
-
-Large objects are allocated directly in generation 1, each alone in its
-own \segment{}. This means that they never, ever, need to be copied! Instead,
-we simply change their generation or \step{} number. We also need to chain the
-object onto a list of large objects to be swept for pointers. This act also
-serves to mark the object, so we don't chain it on twice, or increment its
-\step{} number twice.
-
-Stack objects probably count as large objects. This is one reason for not
-wanting \block{}s to be too big.
-
-
-\section{Mark sweep}
-
-So far we have assumed copying collection. Actually, for
-older generations we may want to use
-mark/sweep/compact because it is a bit less greedy on address
-space, and may help paging behaviour.
-
-What I have in mind is a pointer-reversing mark phase, so that we don't have
-to allocate space for a mark stack, although one could argue that a mark
-stack will almost invariably be small, and if it gets big then you can always
-malloc some more space. Mumble mumble.
-
-\bibliographystyle{plain}
-\bibliography{bibl,simon}
+\item To see how to linearly scan the from-space during garbage collections,
+see the document on the implementation of LDVU profiling
+(section `Destruction of Closures').
+
+\end{itemize}
\end{document}