Consistently use <sect1> etc rather than <section>; fixes #5009
[ghc-hetmet.git] / docs / users_guide / runtime_control.xml
index 95365ba..be341b2 100644 (file)
 
   <para>To make an executable program, the GHC system compiles your
   code and then links it with a non-trivial runtime system (RTS),
-  which handles storage management, profiling, etc.</para>
-
-  <para>You have some control over the behaviour of the RTS, by giving
-  special command-line arguments to your program.</para>
-
-  <para>When your Haskell program starts up, its RTS extracts
-  command-line arguments bracketed between
-  <option>+RTS</option><indexterm><primary><option>+RTS</option></primary></indexterm>
-  and
-  <option>-RTS</option><indexterm><primary><option>-RTS</option></primary></indexterm>
-  as its own.  For example:</para>
+  which handles storage management, thread scheduling, profiling, and
+  so on.</para>
+
+  <para>
+    The RTS has a lot of options to control its behaviour.  For
+    example, you can change the context-switch interval, set the default
+    size of the heap, and enable heap profiling.  These options can be
+    passed to the runtime system in a variety of different ways; the
+    next section (<xref linkend="setting-rts-options" />) describes
+    the various methods, and the following sections describe the RTS
+    options themselves.
+  </para>
+
+  <sect2 id="setting-rts-options">
+    <title>Setting RTS options</title>
+    <indexterm><primary>RTS options, setting</primary></indexterm>
+
+    <para>
+      There are four ways to set RTS options:
+
+      <itemizedlist>
+        <listitem>
+          <para>
+          on the command line between <literal>+RTS ... -RTS</literal>, when running the program
+           (<xref linkend="rts-opts-cmdline" />)
+          </para>
+        </listitem>
+        <listitem>
+          <para>at compile time, using <option>-with-rtsopts</option>
+            (<xref linkend="rts-opts-compile-time" />)
+          </para>
+        </listitem>
+        <listitem>
+          <para>with the environment variable <envar>GHCRTS</envar>
+          (<xref linkend="rts-options-environment" />)
+          </para>
+        </listitem>
+        <listitem>
+          <para>by overriding &ldquo;hooks&rdquo; in the runtime system
+            (<xref linkend="rts-hooks" />)
+          </para>
+        </listitem>
+      </itemizedlist>
+    </para>
+
+    <sect3 id="rts-opts-cmdline">
+      <title>Setting RTS options on the command line</title>
+
+      <para>
+        If you set the <literal>-rtsopts</literal> flag appropriately
+        when linking (see <xref linkend="options-linker" />), you can
+        give RTS options on the command line when running your
+        program.
+      </para>
+
+      <para>
+        When your Haskell program starts up, the RTS extracts
+        command-line arguments bracketed between
+        <option>+RTS</option><indexterm><primary><option>+RTS</option></primary></indexterm>
+        and
+        <option>-RTS</option><indexterm><primary><option>-RTS</option></primary></indexterm>
+        as its own.  For example:
+      </para>
 
 <screen>
-% ./a.out -f +RTS -p -S -RTS -h foo bar
+$ ghc prog.hs -rtsopts
+[1 of 1] Compiling Main             ( prog.hs, prog.o )
+Linking prog ...
+$ ./prog -f +RTS -H32m -S -RTS -h foo bar
 </screen>
 
-  <para>The RTS will snaffle <option>-p</option> <option>-S</option>
-  for itself, and the remaining arguments <literal>-f -h foo bar</literal>
-  will be handed to your program if/when it calls
-  <function>System.getArgs</function>.</para>
+        <para>
+          The RTS will
+          snaffle <option>-H32m</option> <option>-S</option> for itself,
+          and the remaining arguments <literal>-f -h foo bar</literal>
+          will be available to your program if/when it calls
+          <function>System.Environment.getArgs</function>.
+        </para>
 
-  <para>No <option>-RTS</option> option is required if the
-  runtime-system options extend to the end of the command line, as in
-  this example:</para>
+        <para>
+          No <option>-RTS</option> option is required if the
+          runtime-system options extend to the end of the command line, as in
+          this example:
+        </para>
 
 <screen>
 % hls -ltr /usr/etc +RTS -A5m
 </screen>
 
-  <para>If you absolutely positively want all the rest of the options
-  in a command line to go to the program (and not the RTS), use a
-  <option>&ndash;&ndash;RTS</option><indexterm><primary><option>--RTS</option></primary></indexterm>.</para>
-
-  <para>As always, for RTS options that take
-  <replaceable>size</replaceable>s: If the last character of
-  <replaceable>size</replaceable> is a K or k, multiply by 1000; if an
-  M or m, by 1,000,000; if a G or G, by 1,000,000,000.  (And any
-  wraparound in the counters is <emphasis>your</emphasis>
-  fault!)</para>
-
-  <para>Giving a <literal>+RTS -f</literal>
-  <indexterm><primary><option>-f</option></primary><secondary>RTS option</secondary></indexterm> option
-  will print out the RTS options actually available in your program
-  (which vary, depending on how you compiled).</para>
-
-  <para>NOTE: since GHC is itself compiled by GHC, you can change RTS
-  options in the compiler using the normal
-  <literal>+RTS ... -RTS</literal>
-  combination.  eg. to increase the maximum heap
-  size for a compilation to 128M, you would add
-  <literal>+RTS -M128m -RTS</literal>
-  to the command line.</para>
-
-  <sect2 id="rts-optinos-environment">
-    <title>Setting global RTS options</title>
-
-    <indexterm><primary>RTS options</primary><secondary>from the environment</secondary></indexterm>
-    <indexterm><primary>environment variable</primary><secondary>for
-    setting RTS options</secondary></indexterm>
-
-    <para>RTS options are also taken from the environment variable
-    <envar>GHCRTS</envar><indexterm><primary><envar>GHCRTS</envar></primary>
-      </indexterm>.  For example, to set the maximum heap size
-    to 128M for all GHC-compiled programs (using an
-    <literal>sh</literal>-like shell):</para>
+        <para>
+          If you absolutely positively want all the rest of the options
+          in a command line to go to the program (and not the RTS), use a
+          <option>&ndash;&ndash;RTS</option><indexterm><primary><option>--RTS</option></primary></indexterm>.
+        </para>
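The bracketing rules described above can be sketched in C. This is only an illustration of the behaviour the text describes (`+RTS` opens a bracket, `-RTS` closes it, `--RTS` hands everything after it to the program); it is not the RTS's actual argument parser, and the function name `split_rts_args` is hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Illustrative only: partition argv as the text describes.
 * prog_args and rts_args must each have room for argc entries.
 * Returns the number of program arguments; *n_rts receives the
 * number of RTS arguments. argv[0] is skipped. */
size_t split_rts_args(int argc, char **argv,
                      char **prog_args, char **rts_args,
                      size_t *n_rts)
{
    size_t n_prog = 0;
    int in_rts = 0;    /* currently inside a +RTS ... -RTS bracket? */
    int rts_done = 0;  /* seen --RTS: the rest belongs to the program */

    *n_rts = 0;
    for (int i = 1; i < argc; i++) {
        const char *a = argv[i];
        if (!rts_done && strcmp(a, "--RTS") == 0) {
            rts_done = 1;
            in_rts = 0;
        } else if (!rts_done && strcmp(a, "+RTS") == 0) {
            in_rts = 1;
        } else if (!rts_done && strcmp(a, "-RTS") == 0) {
            in_rts = 0;
        } else if (in_rts) {
            rts_args[(*n_rts)++] = argv[i];
        } else {
            prog_args[n_prog++] = argv[i];
        }
    }
    return n_prog;
}
```

Applied to the command line from the first example, this sketch would place `-H32m` and `-S` in `rts_args` and leave `-f -h foo bar` in `prog_args`.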
+
+        <para>
+          As always, for RTS options that take
+          <replaceable>size</replaceable>s: If the last character of
+          <replaceable>size</replaceable> is a K or k, multiply by 1000; if an
+          M or m, by 1,000,000; if a G or g, by 1,000,000,000.  (And any
+          wraparound in the counters is <emphasis>your</emphasis>
+          fault!)
+        </para>
+
+        <para>
+          Giving a <literal>+RTS -?</literal>
+          <indexterm><primary><option>-?</option></primary><secondary>RTS option</secondary></indexterm> option
+          will print out the RTS options actually available in your program
+          (which vary, depending on how you compiled).</para>
+
+        <para>
+          NOTE: since GHC is itself compiled by GHC, you can change RTS
+          options in the compiler using the normal
+          <literal>+RTS ... -RTS</literal>
+          combination.  For example, to set the maximum heap
+          size for a compilation to 128M, you would add
+          <literal>+RTS -M128m -RTS</literal>
+          to the command line.
+        </para>
+      </sect3>
+
+      <sect3 id="rts-opts-compile-time">
+        <title>Setting RTS options at compile time</title>
+
+        <para>
+          GHC lets you change the default RTS options for a program at
+          compile time, using the <literal>-with-rtsopts</literal>
+          flag (<xref linkend="options-linker" />). For example, to
+          set <literal>-H128m -K64m</literal>, link
+          with <literal>-with-rtsopts="-H128m -K64m"</literal>.
+        </para>
+      </sect3>
+
+      <sect3 id="rts-options-environment">
+        <title>Setting RTS options with the <envar>GHCRTS</envar>
+          environment variable</title>
+
+        <indexterm><primary>RTS options</primary><secondary>from the environment</secondary></indexterm>
+        <indexterm><primary>environment variable</primary><secondary>for
+            setting RTS options</secondary></indexterm>
+
+        <para>
+          If the <literal>-rtsopts</literal> flag is set to
+          something other than <literal>none</literal> when linking,
+          RTS options are also taken from the environment variable
+          <envar>GHCRTS</envar><indexterm><primary><envar>GHCRTS</envar></primary>
+          </indexterm>.  For example, to set the maximum heap size
+          to 2G for all GHC-compiled programs (using an
+          <literal>sh</literal>-like shell):
+        </para>
 
 <screen>
-   GHCRTS='-M128m'
+   GHCRTS='-M2G'
    export GHCRTS
 </screen>
 
-    <para>RTS options taken from the <envar>GHCRTS</envar> environment
-    variable can be overridden by options given on the command
-    line.</para>
+        <para>
+          RTS options taken from the <envar>GHCRTS</envar> environment
+          variable can be overridden by options given on the command
+          line.
+        </para>
+
+        <para>
+          Tip: setting something like <literal>GHCRTS=-M2G</literal>
+          in your environment is a handy way to avoid Haskell programs
+          growing beyond the real memory in your machine, which is
+          easy to do by accident and can cause the machine to slow to
+          a crawl until the OS decides to kill the process (and you
+          hope it kills the right one).
+        </para>
+      </sect3>
+
+  <sect3 id="rts-hooks">
+    <title>&ldquo;Hooks&rdquo; to change RTS behaviour</title>
 
-  </sect2>
+    <indexterm><primary>hooks</primary><secondary>RTS</secondary></indexterm>
+    <indexterm><primary>RTS hooks</primary></indexterm>
+    <indexterm><primary>RTS behaviour, changing</primary></indexterm>
+
+    <para>GHC lets you exercise rudimentary control over the RTS
+    settings for any given program, by compiling in a
+    &ldquo;hook&rdquo; that is called by the run-time system.  The RTS
+    contains stub definitions for all these hooks, but by writing your
+    own version and linking it on the GHC command line, you can
+    override the defaults.</para>
+
+    <para>Owing to the vagaries of DLL linking, these hooks don't work
+    under Windows when the program is built dynamically.</para>
+
+    <para>The hook <literal>ghc_rts_opts</literal><indexterm><primary><literal>ghc_rts_opts</literal></primary>
+      </indexterm>lets you set RTS
+    options permanently for a given program, in the same way as the
+    newer <option>-with-rtsopts</option> linker option does.  A common use for this is
+    to give your program a default heap and/or stack size that is
+    greater than the default.  For example, to set <literal>-H128m
+    -K1m</literal>, place the following definition in a C source
+    file:</para>
+
+<programlisting>
+char *ghc_rts_opts = "-H128m -K1m";
+</programlisting>
+
+    <para>Compile the C file, and include the object file on the
+    command line when you link your Haskell program.</para>
+
+    <para>These flags are interpreted first, before any RTS flags from
+    the <literal>GHCRTS</literal> environment variable and any flags
+    on the command line.</para>
+
+    <para>You can also change the messages printed when the runtime
+    system &ldquo;blows up,&rdquo; e.g., on stack overflow.  The hooks
+    for these are as follows:</para>
+
+    <variablelist>
+
+      <varlistentry>
+       <term>
+          <function>void OutOfHeapHook (unsigned long, unsigned long)</function>
+          <indexterm><primary><function>OutOfHeapHook</function></primary></indexterm>
+        </term>
+       <listitem>
+         <para>The heap-overflow message.</para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
+       <term>
+          <function>void StackOverflowHook (long int)</function>
+          <indexterm><primary><function>StackOverflowHook</function></primary></indexterm>
+        </term>
+       <listitem>
+         <para>The stack-overflow message.</para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
+       <term>
+          <function>void MallocFailHook (long int)</function>
+          <indexterm><primary><function>MallocFailHook</function></primary></indexterm>
+        </term>
+       <listitem>
+         <para>The message printed if <function>malloc</function>
+         fails.</para>
+       </listitem>
+      </varlistentry>
+    </variablelist>
+
+    <para>For examples of the use of these hooks, see GHC's own
+    versions in the file
+    <filename>ghc/compiler/parser/hschooks.c</filename> in a GHC
+    source tree.</para>
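As a rough sketch of what such an override might look like, the following C file defines its own `OutOfHeapHook` with the signature listed above. The parameter names, the reading of the second argument as the current heap limit in bytes, the message wording, and the helper `format_heap_message` are all assumptions made for illustration; consult the hook definitions in the GHC source tree for the authoritative versions.

```c
#include <stdio.h>

/* Hypothetical helper: builds the message separately so it can be
 * reused or checked without triggering a real heap overflow. */
int format_heap_message(char *buf, size_t len, unsigned long heap_size)
{
    return snprintf(buf, len,
                    "myprog: heap exhausted (current limit: %lu bytes); "
                    "try +RTS -M<size>", heap_size);
}

/* Intended to replace the RTS stub of the same name when the object
 * file is linked into the program. Parameter meanings are assumed. */
void OutOfHeapHook(unsigned long request_size, unsigned long heap_size)
{
    char msg[160];
    (void)request_size;  /* unused in this sketch */
    format_heap_message(msg, sizeof msg, heap_size);
    fprintf(stderr, "%s\n", msg);
}
```

As with `ghc_rts_opts`, the file would be compiled separately and its object file included on the command line when linking the Haskell program.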
+  </sect3>
+
+    </sect2>
 
   <sect2 id="rts-options-misc">
     <title>Miscellaneous RTS options</title>
          increase the resolution of the time profiler.</para>
 
          <para>Using a value of zero disables the RTS clock
-         completetly, and has the effect of disabling timers that
+         completely, and has the effect of disabling timers that
          depend on it: the context switch timer and the heap profiling
          timer.  Context switches will still happen, but
          deterministically and at a rate much faster than normal.
          things like ctrl-C. This option is primarily useful for when
          you are using the Haskell code as a DLL, and want to set your
          own signal handlers.</para>
+
+         <para>Note that even
+           with <option>--install-signal-handlers=no</option>, the RTS
+           interval timer signal is still enabled.  The timer signal
+           is either SIGVTALRM or SIGALRM, depending on the RTS
+           configuration and OS capabilities.  To disable the timer
+           signal, use the <literal>-V0</literal> RTS option (see
+           above).
+         </para>
+       </listitem>
+     </varlistentry>
+
+     <varlistentry>
+       <term><option>-xm<replaceable>address</replaceable></option>
+       <indexterm><primary><option>-xm</option></primary><secondary>RTS
+       option</secondary></indexterm></term>
+       <listitem>
+         <para>
+           WARNING: this option is for working around memory
+           allocation problems only.  Do not use unless GHCi fails
+           with a message like &ldquo;<literal>failed to mmap() memory below 2Gb</literal>&rdquo;.  If you need to use this option to get GHCi working
+           on your machine, please file a bug.
+         </para>
+         
+         <para>
+           On 64-bit machines, the RTS needs to allocate memory in the
+           low 2Gb of the address space.  Support for this across
+           different operating systems is patchy, and sometimes fails.
+           This option is there to give the RTS a hint about where it
+           should be able to allocate memory in the low 2Gb of the
+           address space.  For example, <literal>+RTS -xm20000000
+           -RTS</literal> would hint that the RTS should allocate
+           starting at the 0.5Gb mark.  The default is to use the OS's
+           built-in support for allocating memory in the low 2Gb if
+           available (e.g. <literal>mmap</literal>
+           with <literal>MAP_32BIT</literal> on Linux), or
+           otherwise <literal>-xm40000000</literal>.
+         </para>
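The arithmetic behind the "0.5Gb mark" can be checked with a few lines of C, assuming (as the examples imply) that the digits after `-xm` are a hexadecimal address. The helper name `xm_hint_mib` is hypothetical.

```c
#include <stdlib.h>

/* Interpret the digits following -xm as a hexadecimal address and
 * convert that address to whole mebibytes. */
unsigned long long xm_hint_mib(const char *hex_digits)
{
    unsigned long long addr = strtoull(hex_digits, NULL, 16);
    return addr / (1024ULL * 1024ULL);
}
```

Under that assumption, `xm_hint_mib("20000000")` gives 512 (half a gigabyte) and `xm_hint_mib("40000000")` gives 1024 (the 1Gb mark used as the fallback default).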
        </listitem>
      </varlistentry>
     </variablelist>
           <indexterm><primary>allocation area, size</primary></indexterm>
         </term>
        <listitem>
-         <para>&lsqb;Default: 256k&rsqb; Set the allocation area size
+         <para>&lsqb;Default: 512k&rsqb; Set the allocation area size
           used by the garbage collector.  The allocation area
           (actually generation 0 step 0) is fixed and is never resized
           (unless you use <option>-H</option>, below).</para>
       </varlistentry>
 
       <varlistentry>
+        <term>
+          <option>-qg<optional><replaceable>gen</replaceable></optional></option>
+          <indexterm><primary><option>-qg</option></primary><secondary>RTS
+          option</secondary></indexterm>
+        </term>
+        <listitem>
+          <para>&lsqb;New in GHC 6.12.1&rsqb; &lsqb;Default: 0&rsqb;
+            Use parallel GC in
+            generation <replaceable>gen</replaceable> and higher.
+            Omitting <replaceable>gen</replaceable> turns off the
+            parallel GC completely, reverting to sequential GC.</para>
+          
+          <para>The default parallel GC settings are usually suitable
+            for parallel programs (i.e. those
+            using <literal>par</literal>, Strategies, or with multiple
+            threads).  However, it is sometimes beneficial to enable
+            the parallel GC for a single-threaded sequential program
+            too, especially if the program has a large amount of heap
+            data and GC is a significant fraction of runtime.  To use
+            the parallel GC in a sequential program, enable the
+            parallel runtime with a suitable <literal>-N</literal>
+            option, and additionally it might be beneficial to
+            restrict parallel GC to the old generation
+            with <literal>-qg1</literal>.</para>
+        </listitem>
+      </varlistentry>        
+
+      <varlistentry>
+        <term>
+          <option>-qb<optional><replaceable>gen</replaceable></optional></option>
+          <indexterm><primary><option>-qb</option></primary><secondary>RTS
+          option</secondary></indexterm>
+        </term>
+        <listitem>
+          <para>
+            &lsqb;New in GHC 6.12.1&rsqb; &lsqb;Default: 1&rsqb; Use
+            load-balancing in the parallel GC in
+            generation <replaceable>gen</replaceable> and higher.
+            Omitting <replaceable>gen</replaceable> disables
+            load-balancing entirely.</para>
+          
+          <para>
+            Load-balancing shares out the work of GC between the
+            available cores.  This is a good idea when the heap is
+            large and we need to parallelise the GC work; however, it
+            is also pessimal for the short young-generation
+            collections in a parallel program, because it can harm
+            locality by moving data from the cache of the CPU where it
+            is being used to the cache of another CPU.  Hence the
+            default is to do load-balancing only in the
+            old-generation.  In fact, for a parallel program it is
+            sometimes beneficial to disable load-balancing entirely
+            with <literal>-qb</literal>.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
        <term>
           <option>-H</option><replaceable>size</replaceable>
           <indexterm><primary><option>-H</option></primary><secondary>RTS option</secondary></indexterm>
 
       <varlistentry>
        <term>
-         <option>-k</option><replaceable>size</replaceable>
+         <option>-ki</option><replaceable>size</replaceable>
          <indexterm><primary><option>-k</option></primary><secondary>RTS option</secondary></indexterm>
-         <indexterm><primary>stack, minimum size</primary></indexterm>
+         <indexterm><primary>stack, initial size</primary></indexterm>
         </term>
        <listitem>
-         <para>&lsqb;Default: 1k&rsqb; Set the initial stack size for
-          new threads.  Thread stacks (including the main thread's
-          stack) live on the heap, and grow as required.  The default
-          value is good for concurrent applications with lots of small
-          threads; if your program doesn't fit this model then
-          increasing this option may help performance.</para>
-
-         <para>The main thread is normally started with a slightly
-          larger heap to cut down on unnecessary stack growth while
-          the program is starting up.</para>
-       </listitem>
+          <para>
+            &lsqb;Default: 1k&rsqb; Set the initial stack size for new
+            threads.  (Note: this flag used to be
+            simply <option>-k</option>, but was renamed
+            to <option>-ki</option> in GHC 7.2.1.  The old name is
+            still accepted for backwards compatibility, but that may
+            be removed in a future version.)
+          </para>
+
+          <para>
+            Thread stacks (including the main thread's stack) live on
+            the heap.  As the stack grows, new stack chunks are added
+            as required; if the stack shrinks again, these extra stack
+            chunks are reclaimed by the garbage collector.  The
+            default initial stack size is deliberately small, in order
+            to keep the time and space overhead for thread creation to
+            a minimum, and to make it practical to spawn threads for
+            even tiny pieces of work.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term>
+          <option>-kc</option><replaceable>size</replaceable>
+          <indexterm><primary><option>-kc</option></primary><secondary>RTS
+          option</secondary></indexterm>
+          <indexterm><primary>stack</primary><secondary>chunk size</secondary></indexterm>
+        </term>
+        <listitem>
+          <para>
+            &lsqb;Default: 32k&rsqb; Set the size of &ldquo;stack
+            chunks&rdquo;.  When a thread's current stack overflows, a
+            new stack chunk is created and added to the thread's
+            stack, until the limit set by <option>-K</option> is
+            reached.
+          </para>
+
+          <para>
+            The advantage of smaller stack chunks is that the garbage
+            collector can avoid traversing stack chunks if they are
+            known to be unmodified since the last collection, so
+            reducing the chunk size means that the garbage collector
+            can identify more stack as unmodified, and the GC overhead
+            might be reduced.  On the other hand, making stack chunks
+            too small adds some overhead as there will be more
+            overflow/underflow between chunks.  The default setting of
+            32k appears to be a reasonable compromise in most cases.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term>
+          <option>-kb</option><replaceable>size</replaceable>
+          <indexterm><primary><option>-kb</option></primary><secondary>RTS
+          option</secondary></indexterm>
+          <indexterm><primary>stack</primary><secondary>chunk buffer size</secondary></indexterm>
+        </term>
+        <listitem>
+          <para>
+            &lsqb;Default: 1k&rsqb; Sets the stack chunk buffer size.
+            When a stack chunk overflows and a new stack chunk is
+            created, some of the data from the previous stack chunk is
+            moved into the new chunk, to avoid an immediate underflow
+            and repeated overflow/underflow at the boundary.  The
+            amount of stack moved is set by the <option>-kb</option>
+            option.
+          </para>
+          <para>
+            Note that to avoid wasting space, this value should
+            typically be less than 10&percnt; of the size of a stack
+            chunk (<option>-kc</option>), because in a chain of stack
+            chunks, each chunk will have a gap of unused space of this
+            size.
+          </para>
+        </listitem>
       </varlistentry>
 
       <varlistentry>
        <listitem>
          <para>&lsqb;Default: 8M&rsqb; Set the maximum stack size for
           an individual thread to <replaceable>size</replaceable>
-          bytes.  This option is there purely to stop the program
-          eating up all the available memory in the machine if it gets
-          into an infinite loop.</para>
+          bytes.  If the thread attempts to exceed this limit, it will
+            be sent the <literal>StackOverflow</literal> exception.
+          </para>
+          <para>
+            This option is there mainly to stop the program eating up
+            all the available memory in the machine if it gets into an
+            infinite loop.
+          </para>
        </listitem>
       </varlistentry>
 
       </varlistentry>
 
       <varlistentry>
+        <term>
+          <option>-t</option><optional><replaceable>file</replaceable></optional>
+          <indexterm><primary><option>-t</option></primary><secondary>RTS option</secondary></indexterm>
+        </term>
        <term>
-          <option>-s</option><replaceable>file</replaceable>
+          <option>-s</option><optional><replaceable>file</replaceable></optional>
           <indexterm><primary><option>-s</option></primary><secondary>RTS option</secondary></indexterm>
         </term>
        <term>
-          <option>-S</option><replaceable>file</replaceable>
+          <option>-S</option><optional><replaceable>file</replaceable></optional>
           <indexterm><primary><option>-S</option></primary><secondary>RTS option</secondary></indexterm>
         </term>
-       <listitem>
-         <para>Write modest (<option>-s</option>) or verbose
-          (<option>-S</option>) garbage-collector statistics into file
-          <replaceable>file</replaceable>. The default
-          <replaceable>file</replaceable> is
-          <filename><replaceable>program</replaceable>.stat</filename>. The
-          <replaceable>file</replaceable> <constant>stderr</constant>
-          is treated specially, with the output really being sent to
-          <constant>stderr</constant>.</para>
-
-         <para>This option is useful for watching how the storage
-          manager adjusts the heap size based on the current amount of
-          live data.</para>
-       </listitem>
-      </varlistentry>
-
-      <varlistentry>
        <term>
-          <option>-t<replaceable>file</replaceable></option>
-          <indexterm><primary><option>-t</option></primary><secondary>RTS option</secondary></indexterm>
+          <option>--machine-readable</option>
+          <indexterm><primary><option>--machine-readable</option></primary><secondary>RTS option</secondary></indexterm>
         </term>
        <listitem>
-         <para>Write a one-line GC stats summary after running the
-         program.  This output is in the same format as that produced
-         by the <option>-Rghc-timing</option> option.</para>
-
-         <para>As with <option>-s</option>, the default
-          <replaceable>file</replaceable> is
-          <filename><replaceable>program</replaceable>.stat</filename>. The
-          <replaceable>file</replaceable> <constant>stderr</constant>
-          is treated specially, with the output really being sent to
-          <constant>stderr</constant>.</para>
+         <para>These options produce runtime-system statistics, such
+         as the amount of time spent executing the program and in the
+         garbage collector, the amount of memory allocated, the
+         maximum size of the heap, and so on.  The three
+         variants give different levels of detail:
+         <option>-t</option> produces a single line of output in the
+         same format as GHC's <option>-Rghc-timing</option> option,
+         <option>-s</option> produces a more detailed summary at the
+         end of the program, and <option>-S</option> additionally
+         produces information about each and every garbage
+         collection.</para>
+
+          <para>The output is placed in
+          <replaceable>file</replaceable>.  If
+          <replaceable>file</replaceable> is omitted, then the output
+          is sent to <constant>stderr</constant>.</para>
+
+    <para>
+        If you use the <literal>-t</literal> flag then, when your
+        program finishes, you will see something like this:
+    </para>
+
+<programlisting>
+&lt;&lt;ghc: 36169392 bytes, 69 GCs, 603392/1065272 avg/max bytes residency (2 samples), 3M in use, 0.00 INIT (0.00 elapsed), 0.02 MUT (0.02 elapsed), 0.07 GC (0.07 elapsed) :ghc&gt;&gt;
+</programlisting>
+
+    <para>
+        This tells you:
+    </para>
+
+    <itemizedlist>
+      <listitem>
+        <para>
+          The total number of bytes allocated by the program over the
+          whole run.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+          The total number of garbage collections performed.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+          The average and maximum "residency", which is the amount of
+          live data in bytes.  The runtime can only determine the
+          amount of live data during a major GC, which is why the
+          number of samples corresponds to the number of major GCs
+          (and is usually relatively small).  To get a better picture
+          of the heap profile of your program, use
+          the <option>-hT</option> RTS option
+          (<xref linkend="rts-profiling" />).
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+          The peak memory the RTS has allocated from the OS. 
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+          The amount of CPU time and elapsed wall clock time while
+          initialising the runtime system (INIT), running the program
+          itself (MUT, the mutator), and garbage collecting (GC).
+        </para>
+      </listitem>
+    </itemizedlist>
+
+    <para>
+        You can also get this in a more future-proof, machine-readable
+        format, with <literal>-t --machine-readable</literal>:
+    </para>
+
+<programlisting>
+ [("bytes allocated", "36169392")
+ ,("num_GCs", "69")
+ ,("average_bytes_used", "603392")
+ ,("max_bytes_used", "1065272")
+ ,("num_byte_usage_samples", "2")
+ ,("peak_megabytes_allocated", "3")
+ ,("init_cpu_seconds", "0.00")
+ ,("init_wall_seconds", "0.00")
+ ,("mutator_cpu_seconds", "0.02")
+ ,("mutator_wall_seconds", "0.02")
+ ,("GC_cpu_seconds", "0.07")
+ ,("GC_wall_seconds", "0.07")
+ ]
+</programlisting>
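A tool consuming this output could pull out individual fields with a small amount of string handling. The C sketch below looks up one key in text shaped like the listing above; it assumes each pair is written exactly as `("key", "value")` with no escaped quotes inside the value, and the function name `lookup_stat` is hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Copy the value paired with `key` into out (at most out_len-1 chars).
 * Returns 1 on success, 0 if the key is absent or the value does not
 * fit. Assumes values contain no escaped double quotes. */
int lookup_stat(const char *text, const char *key,
                char *out, size_t out_len)
{
    char pattern[128];
    int n = snprintf(pattern, sizeof pattern, "(\"%s\", \"", key);
    if (n < 0 || (size_t)n >= sizeof pattern)
        return 0;

    const char *start = strstr(text, pattern);
    if (start == NULL)
        return 0;
    start += n;  /* skip past the opening quote of the value */

    const char *end = strchr(start, '"');
    if (end == NULL || (size_t)(end - start) >= out_len)
        return 0;

    memcpy(out, start, (size_t)(end - start));
    out[end - start] = '\0';
    return 1;
}
```

For the sample output above, looking up `num_GCs` would yield the string `69`, which the caller can then convert with `strtol` or similar.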
+
+    <para>
+        If you use the <literal>-s</literal> flag then, when your
+        program finishes, you will see something like this (the exact
+        details will vary depending on what sort of RTS you have, e.g.
+        you will only see profiling data if your RTS is compiled for
+        profiling):
+    </para>
+
+<programlisting>
+      36,169,392 bytes allocated in the heap
+       4,057,632 bytes copied during GC
+       1,065,272 bytes maximum residency (2 sample(s))
+          54,312 bytes maximum slop
+               3 MB total memory in use (0 MB lost due to fragmentation)
+
+  Generation 0:    67 collections,     0 parallel,  0.04s,  0.03s elapsed
+  Generation 1:     2 collections,     0 parallel,  0.03s,  0.04s elapsed
+
+  SPARKS: 359207 (557 converted, 149591 pruned)
+
+  INIT  time    0.00s  (  0.00s elapsed)
+  MUT   time    0.01s  (  0.02s elapsed)
+  GC    time    0.07s  (  0.07s elapsed)
+  EXIT  time    0.00s  (  0.00s elapsed)
+  Total time    0.08s  (  0.09s elapsed)
+
+  %GC time      89.5%  (75.3% elapsed)
+
+  Alloc rate    4,520,608,923 bytes per MUT second
+
+  Productivity  10.5% of total user, 9.1% of total elapsed
+</programlisting>
+
+    <itemizedlist>
+      <listitem>
+        <para>
+        The "bytes allocated in the heap" is the total bytes allocated
+        by the program over the whole run.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+        GHC uses a copying garbage collector by default. "bytes copied
+        during GC" tells you how many bytes it had to copy during
+        garbage collection.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+        The maximum space actually used by your program is the
+        "bytes maximum residency" figure. This is only checked during
+        major garbage collections, so it is only an approximation;
+        the number of samples tells you how many times it is checked.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+        The "bytes maximum slop" tells you the most space that is ever
+        wasted due to the way GHC allocates memory in blocks.  Slop is
+        memory at the end of a block that was wasted.  There's no way
+        to control this; we just like to see how much memory is being
+        lost this way.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+        The "total memory in use" tells you the peak memory the RTS has
+        allocated from the OS.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+        Next there is information about the garbage collections done.
+        For each generation it says how many garbage collections were
+        done, how many of those collections were done in parallel,
+        the total CPU time used for garbage collecting that generation,
+        and the total wall clock time elapsed while garbage collecting
+        that generation.
+        </para>
+      </listitem>
+      <listitem>
+        <para>The <literal>SPARKS</literal> statistic refers to the
+          use of <literal>Control.Parallel.par</literal> and related
+          functionality in the program.  Each spark represents a call
+          to <literal>par</literal>; a spark is "converted" when it is
+          executed in parallel; and a spark is "pruned" when it is
+          found to be already evaluated and is discarded from the pool
+          by the garbage collector.  Any remaining sparks are
+          discarded at the end of execution, so "converted" plus
+          "pruned" does not necessarily add up to the total.</para>
+      </listitem>
+      <listitem>
+        <para>
+        Next there is the CPU time and wall clock time elapsed broken
+        down by what the runtime system was doing at the time.
+        INIT is the runtime system initialisation.
+        MUT is the mutator time, i.e. the time spent actually running
+        your code.
+        GC is the time spent doing garbage collection.
+        RP is the time spent doing retainer profiling.
+        PROF is the time spent doing other profiling.
+        EXIT is the runtime system shutdown time.
+        And finally, Total is, of course, the total.
+        </para>
+        <para>
+        %GC time tells you what percentage GC is of Total.
+        "Alloc rate" tells you the "bytes allocated in the heap" divided
+        by the MUT CPU time.
+        "Productivity" tells you what percentage of the Total CPU and wall
+        clock elapsed times are spent in the mutator (MUT).
+        </para>
+      </listitem>
+    </itemizedlist>
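The derived figures at the bottom of the summary can be recomputed from the per-phase times, following the descriptions above. The sketch below is illustrative only: the record type `Times`, the function names, and the input figures are assumptions made for this example, not part of the RTS. Because the RTS works from unrounded timings, recomputing from the two-decimal values it prints will not generally reproduce its percentages exactly.

```haskell
-- CPU seconds per phase, as listed in the -s summary.
-- (Hypothetical record type, for illustration only.)
data Times = Times { initT, mutT, gcT, exitT :: Double }

-- Total is the sum of the phases shown in this example.
totalT :: Times -> Double
totalT t = initT t + mutT t + gcT t + exitT t

-- "%GC time": GC as a percentage of Total.
percentGC :: Times -> Double
percentGC t = 100 * gcT t / totalT t

-- "Productivity": mutator time as a percentage of Total.
productivity :: Times -> Double
productivity t = 100 * mutT t / totalT t

-- "Alloc rate": bytes allocated in the heap per second of MUT CPU time.
-- A zero MUT time yields Infinity rather than an exception.
allocRate :: Double -> Times -> Double
allocRate bytesAllocated t = bytesAllocated / mutT t

main :: IO ()
main = do
  -- Made-up figures, not taken from a real run.
  let t = Times { initT = 0.0, mutT = 2.0, gcT = 0.5, exitT = 0.0 }
  putStrLn ("%GC time:     " ++ show (percentGC t))
  putStrLn ("Productivity: " ++ show (productivity t))
  putStrLn ("Alloc rate:   " ++ show (allocRate 1.0e9 t))
```

The same functions apply to the elapsed (wall clock) column if the record is filled with elapsed times instead of CPU times.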
+
+    <para>
+        The <literal>-S</literal> flag, as well as giving the same
+        output as the <literal>-s</literal> flag, prints information
+        about each GC as it happens:
+    </para>
+
+<programlisting>
+    Alloc    Copied     Live    GC    GC     TOT     TOT  Page Flts
+    bytes     bytes     bytes  user  elap    user    elap
+   528496     47728    141512  0.01  0.02    0.02    0.02    0    0  (Gen:  1)
+[...]
+   524944    175944   1726384  0.00  0.00    0.08    0.11    0    0  (Gen:  0)
+</programlisting>
+
+    <para>
+        For each garbage collection, we print:
+    </para>
+
+    <itemizedlist>
+      <listitem>
+        <para>
+          How many bytes we allocated this garbage collection.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+          How many bytes we copied this garbage collection.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+          How many bytes are currently live.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+          How long this garbage collection took (CPU time and elapsed
+          wall clock time).
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+          How long the program has been running (CPU time and elapsed
+          wall clock time).
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+          How many page faults occurred during this garbage collection.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+          How many page faults occurred since the end of the last garbage
+          collection.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+          Which generation is being garbage collected.
+        </para>
+      </listitem>
+    </itemizedlist>
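For post-processing, each per-collection row can be split on whitespace and read into a record. This is a minimal sketch that assumes the column layout shown in the sample above; the type `GCRow`, its field names, and `parseRow` are hypothetical, and the exact layout may differ between RTS versions.

```haskell
import Text.Read (readMaybe)

-- One row of -S output. Field names are illustrative, not RTS names.
data GCRow = GCRow
  { allocBytes  :: Integer  -- bytes allocated
  , copiedBytes :: Integer  -- bytes copied
  , liveBytes   :: Integer  -- bytes currently live
  , gcUser      :: Double   -- CPU time of this collection
  , gcElapsed   :: Double   -- wall clock time of this collection
  , totUser     :: Double   -- CPU time since program start
  , totElapsed  :: Double   -- wall clock time since program start
  , faultsInGC  :: Int      -- page faults during this collection
  , faultsSince :: Int      -- page faults since the last collection
  , generation  :: Int      -- generation being collected
  } deriving (Eq, Show)

-- Parse one row; header lines and elisions such as "[...]" give Nothing.
parseRow :: String -> Maybe GCRow
parseRow line = case words line of
  [a, c, l, gu, ge, tu, te, f1, f2, "(Gen:", g] ->
    GCRow <$> readMaybe a <*> readMaybe c <*> readMaybe l
          <*> readMaybe gu <*> readMaybe ge
          <*> readMaybe tu <*> readMaybe te
          <*> readMaybe f1 <*> readMaybe f2
          <*> readMaybe (takeWhile (/= ')') g)  -- drop the closing paren
  _ -> Nothing

main :: IO ()
main = mapM_ (print . parseRow)
  [ "    bytes     bytes     bytes  user  elap    user    elap"
  , "   528496     47728    141512  0.01  0.02    0.02    0.02    0    0  (Gen:  1)"
  , "[...]"
  ]
```

Matching on the exact token count keeps the parser strict, so a change in the column layout shows up as `Nothing` rather than as silently misread fields.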
+
        </listitem>
       </varlistentry>
     </variablelist>
   </sect2>
 
   <sect2>
-    <title>RTS options for profiling and parallelism</title>
+    <title>RTS options for concurrency and parallelism</title>
 
-    <para>The RTS options related to profiling are described in <xref
-    linkend="rts-options-heap-prof"/>, those for concurrency in
+    <para>The RTS options related to concurrency are described in
       <xref linkend="using-concurrent" />, and those for parallelism in
       <xref linkend="parallel-options"/>.</para>
   </sect2>
 
+  <sect2 id="rts-profiling">
+    <title>RTS options for profiling</title>
+
+    <para>Most profiling runtime options are only available when you
+    compile your program for profiling (see
+    <xref linkend="prof-compiler-options" />, and
+    <xref linkend="rts-options-heap-prof" /> for the runtime options).
+    However, there is one profiling option that is available
+    for ordinary non-profiled executables:</para>
+
+    <variablelist>
+      <varlistentry>
+        <term>
+          <option>-hT</option>
+          <indexterm><primary><option>-hT</option></primary><secondary>RTS
+              option</secondary></indexterm>
+        </term>
+        <listitem>
+          <para>Generates a basic heap profile, in the
+            file <literal><replaceable>prog</replaceable>.hp</literal>.
+            To produce the heap profile graph,
+            use <command>hp2ps</command> (see <xref linkend="hp2ps"
+                                                    />).  The basic heap profile is broken down by data
+            constructor, with other types of closures (functions, thunks,
+            etc.) grouped into broad categories
+            (e.g. <literal>FUN</literal>, <literal>THUNK</literal>).  To
+            get a more detailed profile, use the full profiling
+            support (<xref linkend="profiling" />).</para>
+        </listitem>
+      </varlistentry>
+    </variablelist>
+  </sect2>
+
+  <sect2 id="rts-eventlog">
+    <title>Tracing</title>
+
+    <indexterm><primary>tracing</primary></indexterm>
+    <indexterm><primary>events</primary></indexterm>
+    <indexterm><primary>eventlog files</primary></indexterm>
+
+    <para>
+      When the program is linked with the <option>-eventlog</option>
+      option (<xref linkend="options-linker" />), runtime events can
+      be logged in two ways:
+    </para>
+
+    <itemizedlist>
+      <listitem>
+        <para>
+          In binary format to a file for later analysis by a
+          variety of tools.  One such tool
+          is <ulink url="http://hackage.haskell.org/package/ThreadScope">ThreadScope</ulink><indexterm><primary>ThreadScope</primary></indexterm>,
+          which interprets the event log to produce a visual parallel
+          execution profile of the program.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+          As text to standard output, for debugging purposes.
+        </para>
+      </listitem>
+    </itemizedlist>
+
+    <variablelist>
+      <varlistentry>
+        <term>
+          <option>-l<optional><replaceable>flags</replaceable></optional></option>
+          <indexterm><primary><option>-l</option></primary><secondary>RTS option</secondary></indexterm>
+        </term>
+        <listitem>
+          <para>
+            Log events in binary format to the
+            file <filename><replaceable>program</replaceable>.eventlog</filename>,
+            where <replaceable>flags</replaceable> is a sequence of
+            zero or more characters indicating which kinds of events
+            to log.  Currently there is only one type
+            supported: <literal>-ls</literal>, for scheduler events.
+          </para>
+
+          <para>
+            The format of the log file is described by the header
+            <filename>EventLogFormat.h</filename> that comes with
+            GHC, and it can be parsed in Haskell using
+            the <ulink url="http://hackage.haskell.org/package/ghc-events">ghc-events</ulink>
+            library.  To dump the contents of
+            a <literal>.eventlog</literal> file as text, use the
+            tool <literal>show-ghc-events</literal> that comes with
+            the <ulink url="http://hackage.haskell.org/package/ghc-events">ghc-events</ulink>
+            package.
+          </para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term>
+          <option>-v</option><optional><replaceable>flags</replaceable></optional>
+          <indexterm><primary><option>-v</option></primary><secondary>RTS option</secondary></indexterm>
+        </term>
+        <listitem>
+          <para>
+            Log events as text to standard output, instead of to
+            the <literal>.eventlog</literal> file.
+            The <replaceable>flags</replaceable> are the same as
+            for <option>-l</option>, with the additional
+            option <literal>t</literal> which indicates that
+            each event printed should be preceded by a timestamp value
+            (in the binary <literal>.eventlog</literal> file, all
+            events are automatically associated with a timestamp).
+          </para>
+        </listitem>
+      </varlistentry>
+
+    </variablelist>
+
+    <para>
+      The debugging
+      options <option>-D<replaceable>x</replaceable></option> also
+      generate events which are logged using the tracing framework.
+      By default those events are dumped as text to stdout
+      (<option>-D<replaceable>x</replaceable></option>
+      implies <option>-v</option>), but they may instead be stored in
+      the binary eventlog file by using the <option>-l</option>
+      option.
+    </para>
+  </sect2>
+
   <sect2 id="rts-options-debugging">
     <title>RTS options for hackers, debuggers, and over-interested
     souls</title>
 
       <varlistentry>
        <term>
-          <option>-D</option><replaceable>num</replaceable>
+          <option>-D</option><replaceable>x</replaceable>
           <indexterm><primary>-D</primary><secondary>RTS option</secondary></indexterm>
         </term>
        <listitem>
-         <para>An RTS debugging flag; varying quantities of output
-          depending on which bits are set in
-          <replaceable>num</replaceable>.  Only works if the RTS was
-          compiled with the <option>DEBUG</option> option.</para>
+         <para>
+            An RTS debugging flag; only available if the program was
+           linked with the <option>-debug</option> option.  Various
+           values of <replaceable>x</replaceable> are provided to
+           enable debug messages and additional runtime sanity checks
+           in different subsystems in the RTS, for
+           example <literal>+RTS -Ds -RTS</literal> enables debug
+           messages from the scheduler.
+           Use <literal>+RTS&nbsp;-?</literal> to find out which
+           debug flags are supported.
+          </para>
+
+          <para>
+            Debug messages will be sent to the binary event log file
+            instead of stdout if the <option>-l</option> option is
+            added.  This might be useful for reducing the overhead of
+            debug tracing.
+          </para>
        </listitem>
       </varlistentry>
 
         </term>
        <listitem>
          <para>Produce &ldquo;ticky-ticky&rdquo; statistics at the
-          end of the program run.  The <replaceable>file</replaceable>
-          business works just like on the <option>-S</option> RTS
-          option (above).</para>
-
-         <para>&ldquo;Ticky-ticky&rdquo; statistics are counts of
-          various program actions (updates, enters, etc.)  The program
-          must have been compiled using
-          <option>-ticky</option><indexterm><primary><option>-ticky</option></primary></indexterm>
-          (a.k.a. &ldquo;ticky-ticky profiling&rdquo;), and, for it to
-          be really useful, linked with suitable system libraries.
-          Not a trivial undertaking: consult the installation guide on
-          how to set things up for easy &ldquo;ticky-ticky&rdquo;
-          profiling.  For more information, see <xref
-          linkend="ticky-ticky"/>.</para>
+          end of the program run (only available if the program was
+          linked with <option>-debug</option>).
+          The <replaceable>file</replaceable> business works just like
+          on the <option>-S</option> RTS option, above.</para>
+
+          <para>For more information on ticky-ticky profiling, see
+          <xref linkend="ticky-ticky"/>.</para>
        </listitem>
       </varlistentry>
 
 
   </sect2>
 
-  <sect2 id="rts-hooks">
-    <title>&ldquo;Hooks&rdquo; to change RTS behaviour</title>
-
-    <indexterm><primary>hooks</primary><secondary>RTS</secondary></indexterm>
-    <indexterm><primary>RTS hooks</primary></indexterm>
-    <indexterm><primary>RTS behaviour, changing</primary></indexterm>
-
-    <para>GHC lets you exercise rudimentary control over the RTS
-    settings for any given program, by compiling in a
-    &ldquo;hook&rdquo; that is called by the run-time system.  The RTS
-    contains stub definitions for all these hooks, but by writing your
-    own version and linking it on the GHC command line, you can
-    override the defaults.</para>
-
-    <para>Owing to the vagaries of DLL linking, these hooks don't work
-    under Windows when the program is built dynamically.</para>
+  <sect2>
+    <title>Getting information about the RTS</title>
 
-    <para>The hook <literal>ghc_rts_opts</literal><indexterm><primary><literal>ghc_rts_opts</literal></primary>
-      </indexterm>lets you set RTS
-    options permanently for a given program.  A common use for this is
-    to give your program a default heap and/or stack size that is
-    greater than the default.  For example, to set <literal>-H128m
-    -K1m</literal>, place the following definition in a C source
-    file:</para>
+    <indexterm><primary>RTS</primary></indexterm>
 
-<programlisting>
-char *ghc_rts_opts = "-H128m -K1m";
-</programlisting>
+    <para>It is possible to ask the RTS to give some information about
+    itself. To do this, use the <option>--info</option> flag, e.g.</para>
+<screen>
+$ ./a.out +RTS --info
+ [("GHC RTS", "YES")
+ ,("GHC version", "6.7")
+ ,("RTS way", "rts_p")
+ ,("Host platform", "x86_64-unknown-linux")
+ ,("Host architecture", "x86_64")
+ ,("Host OS", "linux")
+ ,("Host vendor", "unknown")
+ ,("Build platform", "x86_64-unknown-linux")
+ ,("Build architecture", "x86_64")
+ ,("Build OS", "linux")
+ ,("Build vendor", "unknown")
+ ,("Target platform", "x86_64-unknown-linux")
+ ,("Target architecture", "x86_64")
+ ,("Target OS", "linux")
+ ,("Target vendor", "unknown")
+ ,("Word size", "64")
+ ,("Compiler unregisterised", "NO")
+ ,("Tables next to code", "YES")
+ ]
+</screen>
+    <para>The information is formatted such that it can be read as a
+    value of type <literal>[(String, String)]</literal>. Currently the following
+    fields are present:</para>
 
-    <para>Compile the C file, and include the object file on the
-    command line when you link your Haskell program.</para>
+    <variablelist>
 
-    <para>These flags are interpreted first, before any RTS flags from
-    the <literal>GHCRTS</literal> environment variable and any flags
-    on the command line.</para>
+      <varlistentry>
+        <term><literal>GHC RTS</literal></term>
+        <listitem>
+          <para>Is this program linked against the GHC RTS? (always
+          "YES").</para>
+        </listitem>
+      </varlistentry>
 
-    <para>You can also change the messages printed when the runtime
-    system &ldquo;blows up,&rdquo; e.g., on stack overflow.  The hooks
-    for these are as follows:</para>
+      <varlistentry>
+        <term><literal>GHC version</literal></term>
+        <listitem>
+          <para>The version of GHC used to compile this program.</para>
+        </listitem>
+      </varlistentry>
 
-    <variablelist>
+      <varlistentry>
+        <term><literal>RTS way</literal></term>
+        <listitem>
+          <para>The variant (&ldquo;way&rdquo;) of the runtime. The
+          most common values are <literal>rts</literal> (vanilla),
+          <literal>rts_thr</literal> (threaded runtime, i.e. linked using the
+          <literal>-threaded</literal> option) and <literal>rts_p</literal>
+          (profiling runtime, i.e. linked using the <literal>-prof</literal>
+          option). Other variants include <literal>debug</literal>
+          (linked using <literal>-debug</literal>),
+          <literal>t</literal> (ticky-ticky profiling) and
+          <literal>dyn</literal> (the RTS is
+          linked in dynamically, i.e. a shared library, rather than statically
+          linked into the executable itself). These can be combined,
+          e.g. you might have <literal>rts_thr_debug_p</literal>.</para>
+        </listitem>
+      </varlistentry>
 
       <varlistentry>
-       <term>
-          <function>void OutOfHeapHook (unsigned long, unsigned long)</function>
-          <indexterm><primary><function>OutOfHeapHook</function></primary></indexterm>
+        <term>
+            <literal>Target platform</literal>,
+            <literal>Target architecture</literal>,
+            <literal>Target OS</literal>,
+            <literal>Target vendor</literal>
         </term>
-       <listitem>
-         <para>The heap-overflow message.</para>
-       </listitem>
+        <listitem>
+          <para>This is the platform the program is compiled to run on.</para>
+        </listitem>
       </varlistentry>
 
       <varlistentry>
-       <term>
-          <function>void StackOverflowHook (long int)</function>
-          <indexterm><primary><function>StackOverflowHook</function></primary></indexterm>
+        <term>
+            <literal>Build platform</literal>,
+            <literal>Build architecture</literal>,
+            <literal>Build OS</literal>,
+            <literal>Build vendor</literal>
         </term>
-       <listitem>
-         <para>The stack-overflow message.</para>
-       </listitem>
+        <listitem>
+          <para>This is the platform on which the program was
+          built. (That is, the target platform of GHC itself.) Ordinarily
+          this is identical to the target platform. (It could potentially
+          be different if cross-compiling.)</para>
+        </listitem>
       </varlistentry>
 
       <varlistentry>
-       <term>
-          <function>void MallocFailHook (long int)</function>
-          <indexterm><primary><function>MallocFailHook</function></primary></indexterm>
+        <term>
+            <literal>Host platform</literal>,
+            <literal>Host architecture</literal>,
+            <literal>Host OS</literal>,
+            <literal>Host vendor</literal>
         </term>
-       <listitem>
-         <para>The message printed if <function>malloc</function>
-         fails.</para>
-       </listitem>
+        <listitem>
+          <para>This is the platform on which GHC itself was compiled.
+          Again, this would normally be identical to the build and
+          target platforms.</para>
+        </listitem>
       </varlistentry>
+
+      <varlistentry>
+        <term><literal>Word size</literal></term>
+        <listitem>
+          <para>Either <literal>"32"</literal> or <literal>"64"</literal>,
+          reflecting the word size of the target platform.</para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><literal>Compiler unregisterised</literal></term>
+        <listitem>
+          <para>Was this program compiled with an &ldquo;unregisterised&rdquo;
+          version of GHC? (I.e., a version of GHC that has no platform-specific
+          optimisations compiled in, usually because this is a currently
+          unsupported platform.) This value will usually be no, unless you're
+          using an experimental build of GHC.</para>
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term><literal>Tables next to code</literal></term>
+        <listitem>
+          <para>Putting info tables directly next to entry code is a useful
+          performance optimisation that is not available on all platforms.
+          This field tells you whether the program has been compiled with
+          this optimisation. (Usually yes, except on unusual platforms.)</para>
+        </listitem>
+      </varlistentry>
+
     </variablelist>
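Because the output has the shape of a Haskell list of string pairs, a program can parse it with `read` and inspect individual fields. The following is a minimal sketch using an abbreviated copy of the sample output above; `sampleInfo`, `parseInfo` and `field` are names invented for this example, and it assumes the output arrives intact with nothing else mixed into it.

```haskell
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- Abbreviated text in the shape printed by +RTS --info.
sampleInfo :: String
sampleInfo = unlines
  [ " [(\"GHC RTS\", \"YES\")"
  , " ,(\"GHC version\", \"6.7\")"
  , " ,(\"RTS way\", \"rts_p\")"
  , " ,(\"Word size\", \"64\")"
  , " ]"
  ]

-- Parse the whole output as an association list.
-- Returns Nothing if the text is not a well-formed list of pairs.
parseInfo :: String -> Maybe [(String, String)]
parseInfo = readMaybe

-- Look up one field, with a fallback for fields that are absent.
field :: String -> [(String, String)] -> String
field key = fromMaybe "(unknown)" . lookup key

main :: IO ()
main = case parseInfo sampleInfo of
  Nothing   -> putStrLn "could not parse --info output"
  Just info -> do
    putStrLn ("RTS way:   " ++ field "RTS way" info)
    putStrLn ("Word size: " ++ field "Word size" info)
```

Looking fields up by name rather than by position keeps such code working if later versions add fields or reorder them.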
 
-    <para>For examples of the use of these hooks, see GHC's own
-    versions in the file
-    <filename>ghc/compiler/parser/hschooks.c</filename> in a GHC
-    source tree.</para>
   </sect2>
 </sect1>
 
 <!-- Emacs stuff:
      ;;; Local Variables: ***
-     ;;; mode: xml ***
      ;;; sgml-parent-document: ("users_guide.xml" "book" "chapter" "sect1") ***
      ;;; End: ***
  -->