<para>To make an executable program, the GHC system compiles your
code and then links it with a non-trivial runtime system (RTS),
- which handles storage management, profiling, etc.</para>
-
- <para>You have some control over the behaviour of the RTS, by giving
- special command-line arguments to your program.</para>
-
- <para>When your Haskell program starts up, its RTS extracts
- command-line arguments bracketed between
- <option>+RTS</option><indexterm><primary><option>+RTS</option></primary></indexterm>
- and
- <option>-RTS</option><indexterm><primary><option>-RTS</option></primary></indexterm>
- as its own. For example:</para>
+ which handles storage management, thread scheduling, profiling, and
+ so on.</para>
+
+ <para>
+ The RTS has a lot of options to control its behaviour. For
+ example, you can change the context-switch interval, the default
+ size of the heap, and enable heap profiling. These options can be
+ passed to the runtime system in a variety of different ways; the
+ next section (<xref linkend="setting-rts-options" />) describes
+ the various methods, and the following sections describe the RTS
+ options themselves.
+ </para>
+
+ <sect2 id="setting-rts-options">
+ <title>Setting RTS options</title>
+ <indexterm><primary>RTS options, setting</primary></indexterm>
+
+ <para>
+ There are four ways to set RTS options:
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ on the command line between <literal>+RTS ... -RTS</literal>, when running the program
+ (<xref linkend="rts-opts-cmdline" />)
+ </para>
+ </listitem>
+ <listitem>
+ <para>at compile-time, using <option>-with-rtsopts</option>
+ (<xref linkend="rts-opts-compile-time" />)
+ </para>
+ </listitem>
+ <listitem>
+ <para>with the environment variable <envar>GHCRTS</envar>
+ (<xref linkend="rts-options-environment" />)
+ </para>
+ </listitem>
+ <listitem>
+ <para>by overriding “hooks” in the runtime system
+ (<xref linkend="rts-hooks" />)
+ </para>
+ </listitem>
+ </itemizedlist>
+ </para>
+
+ <sect3 id="rts-opts-cmdline">
+ <title>Setting RTS options on the command line</title>
+
+ <para>
+ If you set the <literal>-rtsopts</literal> flag appropriately
+ when linking (see <xref linkend="options-linker" />), you can
+ give RTS options on the command line when running your
+ program.
+ </para>
+
+ <para>
+ When your Haskell program starts up, the RTS extracts
+ command-line arguments bracketed between
+ <option>+RTS</option><indexterm><primary><option>+RTS</option></primary></indexterm>
+ and
+ <option>-RTS</option><indexterm><primary><option>-RTS</option></primary></indexterm>
+ as its own. For example:
+ </para>
<screen>
-% ./a.out -f +RTS -p -S -RTS -h foo bar
+$ ghc prog.hs -rtsopts
+[1 of 1] Compiling Main ( prog.hs, prog.o )
+Linking prog ...
+$ ./prog -f +RTS -H32m -S -RTS -h foo bar
</screen>
- <para>The RTS will snaffle <option>-p</option> <option>-S</option>
- for itself, and the remaining arguments <literal>-f -h foo bar</literal>
- will be handed to your program if/when it calls
- <function>System.getArgs</function>.</para>
+ <para>
+ The RTS will
+ snaffle <option>-H32m</option> <option>-S</option> for itself,
+ and the remaining arguments <literal>-f -h foo bar</literal>
+ will be available to your program if/when it calls
+ <function>System.Environment.getArgs</function>.
+ </para>
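+
+ <para>
+ As a minimal sketch of this behaviour, assume that
+ <filename>prog.hs</filename> contains nothing but the
+ following (the file contents are an illustration, not part
+ of GHC). Run as in the example above, it should print only
+ <literal>-f</literal>, <literal>-h</literal>,
+ <literal>foo</literal> and <literal>bar</literal>:
+ </para>
+<programlisting>
+import System.Environment (getArgs)
+
+-- Print the arguments that remain once the RTS has removed its own.
+main :: IO ()
+main = getArgs >>= print
+</programlisting>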
- <para>No <option>-RTS</option> option is required if the
- runtime-system options extend to the end of the command line, as in
- this example:</para>
+ <para>
+ No <option>-RTS</option> option is required if the
+ runtime-system options extend to the end of the command line, as in
+ this example:
+ </para>
<screen>
% hls -ltr /usr/etc +RTS -A5m
</screen>
- <para>If you absolutely positively want all the rest of the options
- in a command line to go to the program (and not the RTS), use a
- <option>––RTS</option><indexterm><primary><option>--RTS</option></primary></indexterm>.</para>
-
- <para>As always, for RTS options that take
- <replaceable>size</replaceable>s: If the last character of
- <replaceable>size</replaceable> is a K or k, multiply by 1000; if an
- M or m, by 1,000,000; if a G or G, by 1,000,000,000. (And any
- wraparound in the counters is <emphasis>your</emphasis>
- fault!)</para>
-
- <para>Giving a <literal>+RTS -f</literal>
- <indexterm><primary><option>-f</option></primary><secondary>RTS option</secondary></indexterm> option
- will print out the RTS options actually available in your program
- (which vary, depending on how you compiled).</para>
-
- <para>NOTE: since GHC is itself compiled by GHC, you can change RTS
- options in the compiler using the normal
- <literal>+RTS ... -RTS</literal>
- combination. eg. to increase the maximum heap
- size for a compilation to 128M, you would add
- <literal>+RTS -M128m -RTS</literal>
- to the command line.</para>
-
- <sect2 id="rts-optinos-environment">
- <title>Setting global RTS options</title>
-
- <indexterm><primary>RTS options</primary><secondary>from the environment</secondary></indexterm>
- <indexterm><primary>environment variable</primary><secondary>for
- setting RTS options</secondary></indexterm>
-
- <para>RTS options are also taken from the environment variable
- <envar>GHCRTS</envar><indexterm><primary><envar>GHCRTS</envar></primary>
- </indexterm>. For example, to set the maximum heap size
- to 128M for all GHC-compiled programs (using an
- <literal>sh</literal>-like shell):</para>
+ <para>
+ If you absolutely positively want all the rest of the options
+ in a command line to go to the program (and not the RTS), use a
+ <option>--RTS</option><indexterm><primary><option>--RTS</option></primary></indexterm>.
+ </para>
+
+ <para>
+ As always, for RTS options that take
+ <replaceable>size</replaceable>s: If the last character of
+ <replaceable>size</replaceable> is a K or k, multiply by 1000; if an
+ M or m, by 1,000,000; if a G or g, by 1,000,000,000. (And any
+ wraparound in the counters is <emphasis>your</emphasis>
+ fault!)
+ </para>
+
+ <para>
+ Giving a <literal>+RTS -?</literal>
+ <indexterm><primary><option>-?</option></primary><secondary>RTS option</secondary></indexterm> option
+ will print out the RTS options actually available in your program
+ (which vary, depending on how you compiled).</para>
+
+ <para>
+ NOTE: since GHC is itself compiled by GHC, you can change RTS
+ options in the compiler using the normal
+ <literal>+RTS ... -RTS</literal>
+ combination. For example, to set the maximum heap
+ size for a compilation to 128M, you would add
+ <literal>+RTS -M128m -RTS</literal>
+ to the command line.
+ </para>
+ </sect3>
+
+ <sect3 id="rts-opts-compile-time">
+ <title>Setting RTS options at compile time</title>
+
+ <para>
+ GHC lets you change the default RTS options for a program at
+ compile time, using the <literal>-with-rtsopts</literal>
+ flag (<xref linkend="options-linker" />). For example, to
+ set <literal>-H128m -K64m</literal>, link
+ with <literal>-with-rtsopts="-H128m -K64m"</literal>.
+ </para>
+ </sect3>
+
+ <sect3 id="rts-options-environment">
+ <title>Setting RTS options with the <envar>GHCRTS</envar>
+ environment variable</title>
+
+ <indexterm><primary>RTS options</primary><secondary>from the environment</secondary></indexterm>
+ <indexterm><primary>environment variable</primary><secondary>for
+ setting RTS options</secondary></indexterm>
+
+ <para>
+ If the <literal>-rtsopts</literal> flag is set to
+ something other than <literal>none</literal> when linking,
+ RTS options are also taken from the environment variable
+ <envar>GHCRTS</envar><indexterm><primary><envar>GHCRTS</envar></primary>
+ </indexterm>. For example, to set the maximum heap size
+ to 2G for all GHC-compiled programs (using an
+ <literal>sh</literal>-like shell):
+ </para>
<screen>
- GHCRTS='-M128m'
+ GHCRTS='-M2G'
export GHCRTS
</screen>
- <para>RTS options taken from the <envar>GHCRTS</envar> environment
- variable can be overridden by options given on the command
- line.</para>
+ <para>
+ RTS options taken from the <envar>GHCRTS</envar> environment
+ variable can be overridden by options given on the command
+ line.
+ </para>
+
+ <para>
+ Tip: setting something like <literal>GHCRTS=-M2G</literal>
+ in your environment is a handy way to avoid Haskell programs
+ growing beyond the real memory in your machine, which is
+ easy to do by accident and can cause the machine to slow to
+ a crawl until the OS decides to kill the process (and you
+ hope it kills the right one).
+ </para>
+ </sect3>
+
+ <sect3 id="rts-hooks">
+ <title>“Hooks” to change RTS behaviour</title>
- </sect2>
+ <indexterm><primary>hooks</primary><secondary>RTS</secondary></indexterm>
+ <indexterm><primary>RTS hooks</primary></indexterm>
+ <indexterm><primary>RTS behaviour, changing</primary></indexterm>
+
+ <para>GHC lets you exercise rudimentary control over the RTS
+ settings for any given program, by compiling in a
+ “hook” that is called by the run-time system. The RTS
+ contains stub definitions for all these hooks, but by writing your
+ own version and linking it on the GHC command line, you can
+ override the defaults.</para>
+
+ <para>Owing to the vagaries of DLL linking, these hooks don't work
+ under Windows when the program is built dynamically.</para>
+
+ <para>The hook <literal>ghc_rts_opts</literal><indexterm><primary><literal>ghc_rts_opts</literal></primary>
+ </indexterm>lets you set RTS
+ options permanently for a given program, in the same way as the
+ newer <option>-with-rtsopts</option> linker option does. A common use for this is
+ to give your program a default heap and/or stack size that is
+ greater than the default. For example, to set <literal>-H128m
+ -K1m</literal>, place the following definition in a C source
+ file:</para>
+
+<programlisting>
+char *ghc_rts_opts = "-H128m -K1m";
+</programlisting>
+
+ <para>Compile the C file, and include the object file on the
+ command line when you link your Haskell program.</para>
+
+ <para>These flags are interpreted first, before any RTS flags from
+ the <literal>GHCRTS</literal> environment variable and any flags
+ on the command line.</para>
+
+ <para>You can also change the messages printed when the runtime
+ system “blows up,” e.g., on stack overflow. The hooks
+ for these are as follows:</para>
+
+ <variablelist>
+
+ <varlistentry>
+ <term>
+ <function>void OutOfHeapHook (unsigned long, unsigned long)</function>
+ <indexterm><primary><function>OutOfHeapHook</function></primary></indexterm>
+ </term>
+ <listitem>
+ <para>The heap-overflow message.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <function>void StackOverflowHook (long int)</function>
+ <indexterm><primary><function>StackOverflowHook</function></primary></indexterm>
+ </term>
+ <listitem>
+ <para>The stack-overflow message.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <function>void MallocFailHook (long int)</function>
+ <indexterm><primary><function>MallocFailHook</function></primary></indexterm>
+ </term>
+ <listitem>
+ <para>The message printed if <function>malloc</function>
+ fails.</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+
+ <para>For examples of the use of these hooks, see GHC's own
+ versions in the file
+ <filename>ghc/compiler/parser/hschooks.c</filename> in a GHC
+ source tree.</para>
+ </sect3>
+
+ </sect2>
<sect2 id="rts-options-misc">
<title>Miscellaneous RTS options</title>
the <option>-C</option> or <option>-i</option> options.
However, setting <option>-V</option> is required in order to
increase the resolution of the time profiler.</para>
+
+ <para>Using a value of zero disables the RTS clock
+ completely, and has the effect of disabling timers that
+ depend on it: the context switch timer and the heap profiling
+ timer. Context switches will still happen, but
+ deterministically and at a rate much faster than normal.
+ Disabling the interval timer is useful for debugging, because
+ it eliminates a source of non-determinism at runtime.</para>
</listitem>
</varlistentry>
things like ctrl-C. This option is primarily useful for when
you are using the Haskell code as a DLL, and want to set your
own signal handlers.</para>
+
+ <para>Note that even
+ with <option>--install-signal-handlers=no</option>, the RTS
+ interval timer signal is still enabled. The timer signal
+ is either SIGVTALRM or SIGALRM, depending on the RTS
+ configuration and OS capabilities. To disable the timer
+ signal, use the <literal>-V0</literal> RTS option (see
+ above).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><option>-xm<replaceable>address</replaceable></option>
+ <indexterm><primary><option>-xm</option></primary><secondary>RTS
+ option</secondary></indexterm></term>
+ <listitem>
+ <para>
+ WARNING: this option is for working around memory
+ allocation problems only. Do not use unless GHCi fails
+ with a message like “<literal>failed to mmap() memory below 2Gb</literal>”. If you need to use this option to get GHCi working
+ on your machine, please file a bug.
+ </para>
+
+ <para>
+ On 64-bit machines, the RTS needs to allocate memory in the
+ low 2Gb of the address space. Support for this across
+ different operating systems is patchy, and sometimes fails.
+ This option is there to give the RTS a hint about where it
+ should be able to allocate memory in the low 2Gb of the
+ address space. For example, <literal>+RTS -xm20000000
+ -RTS</literal> would hint that the RTS should allocate
+ starting at the 0.5Gb mark. The default is to use the OS's
+ built-in support for allocating memory in the low 2Gb if
+ available (e.g. <literal>mmap</literal>
+ with <literal>MAP_32BIT</literal> on Linux), or
+ otherwise <literal>-xm40000000</literal>.
+ </para>
</listitem>
</varlistentry>
</variablelist>
<indexterm><primary>allocation area, size</primary></indexterm>
</term>
<listitem>
- <para>[Default: 256k] Set the allocation area size
+ <para>[Default: 512k] Set the allocation area size
used by the garbage collector. The allocation area
(actually generation 0 step 0) is fixed and is never resized
(unless you use <option>-H</option>, below).</para>
</varlistentry>
<varlistentry>
+ <term>
+ <option>-qg<optional><replaceable>gen</replaceable></optional></option>
+ <indexterm><primary><option>-qg</option></primary><secondary>RTS
+ option</secondary></indexterm>
+ </term>
+ <listitem>
+ <para>[New in GHC 6.12.1] [Default: 0]
+ Use parallel GC in
+ generation <replaceable>gen</replaceable> and higher.
+ Omitting <replaceable>gen</replaceable> turns off the
+ parallel GC completely, reverting to sequential GC.</para>
+
+ <para>The default parallel GC settings are usually suitable
+ for parallel programs (i.e. those
+ using <literal>par</literal>, Strategies, or with multiple
+ threads). However, it is sometimes beneficial to enable
+ the parallel GC for a single-threaded sequential program
+ too, especially if the program has a large amount of heap
+ data and GC is a significant fraction of runtime. To use
+ the parallel GC in a sequential program, enable the
+ parallel runtime with a suitable <literal>-N</literal>
+ option, and additionally it might be beneficial to
+ restrict parallel GC to the old generation
+ with <literal>-qg1</literal>.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <option>-qb<optional><replaceable>gen</replaceable></optional></option>
+ <indexterm><primary><option>-qb</option></primary><secondary>RTS
+ option</secondary></indexterm>
+ </term>
+ <listitem>
+ <para>
+ [New in GHC 6.12.1] [Default: 1] Use
+ load-balancing in the parallel GC in
+ generation <replaceable>gen</replaceable> and higher.
+ Omitting <replaceable>gen</replaceable> disables
+ load-balancing entirely.</para>
+
+ <para>
+ Load-balancing shares out the work of GC between the
+ available cores. This is a good idea when the heap is
+ large and we need to parallelise the GC work; however, it
+ is also pessimal for the short young-generation
+ collections in a parallel program, because it can harm
+ locality by moving data from the cache of the CPU where it
+ is being used to the cache of another CPU. Hence the
+ default is to do load-balancing only in the
+ old generation. In fact, for a parallel program it is
+ sometimes beneficial to disable load-balancing entirely
+ with <literal>-qb</literal>.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term>
<option>-H</option><replaceable>size</replaceable>
<indexterm><primary><option>-H</option></primary><secondary>RTS option</secondary></indexterm>
<varlistentry>
<term>
- <option>-k</option><replaceable>size</replaceable>
+ <option>-ki</option><replaceable>size</replaceable>
<indexterm><primary><option>-k</option></primary><secondary>RTS option</secondary></indexterm>
- <indexterm><primary>stack, minimum size</primary></indexterm>
+ <indexterm><primary>stack, initial size</primary></indexterm>
</term>
<listitem>
- <para>[Default: 1k] Set the initial stack size for
- new threads. Thread stacks (including the main thread's
- stack) live on the heap, and grow as required. The default
- value is good for concurrent applications with lots of small
- threads; if your program doesn't fit this model then
- increasing this option may help performance.</para>
-
- <para>The main thread is normally started with a slightly
- larger heap to cut down on unnecessary stack growth while
- the program is starting up.</para>
- </listitem>
+ <para>
+ [Default: 1k] Set the initial stack size for new
+ threads. (Note: this flag used to be
+ simply <option>-k</option>, but was renamed
+ to <option>-ki</option> in GHC 7.2.1. The old name is
+ still accepted for backwards compatibility, but that may
+ be removed in a future version).
+ </para>
+
+ <para>
+ Thread stacks (including the main thread's stack) live on
+ the heap. As the stack grows, new stack chunks are added
+ as required; if the stack shrinks again, these extra stack
+ chunks are reclaimed by the garbage collector. The
+ default initial stack size is deliberately small, in order
+ to keep the time and space overhead for thread creation to
+ a minimum, and to make it practical to spawn threads for
+ even tiny pieces of work.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <option>-kc</option><replaceable>size</replaceable>
+ <indexterm><primary><option>-kc</option></primary><secondary>RTS
+ option</secondary></indexterm>
+ <indexterm><primary>stack</primary><secondary>chunk size</secondary></indexterm>
+ </term>
+ <listitem>
+ <para>
+ [Default: 32k] Set the size of “stack
+ chunks”. When a thread's current stack overflows, a
+ new stack chunk is created and added to the thread's
+ stack, until the limit set by <option>-K</option> is
+ reached.
+ </para>
+
+ <para>
+ The advantage of smaller stack chunks is that the garbage
+ collector can avoid traversing stack chunks if they are
+ known to be unmodified since the last collection, so
+ reducing the chunk size means that the garbage collector
+ can identify more stack as unmodified, and the GC overhead
+ might be reduced. On the other hand, making stack chunks
+ too small adds some overhead as there will be more
+ overflow/underflow between chunks. The default setting of
+ 32k appears to be a reasonable compromise in most cases.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <option>-kb</option><replaceable>size</replaceable>
+ <indexterm><primary><option>-kb</option></primary><secondary>RTS
+ option</secondary></indexterm>
+ <indexterm><primary>stack</primary><secondary>chunk buffer size</secondary></indexterm>
+ </term>
+ <listitem>
+ <para>
+ [Default: 1k] Sets the stack chunk buffer size.
+ When a stack chunk overflows and a new stack chunk is
+ created, some of the data from the previous stack chunk is
+ moved into the new chunk, to avoid an immediate underflow
+ and repeated overflow/underflow at the boundary. The
+ amount of stack moved is set by the <option>-kb</option>
+ option.
+ </para>
+ <para>
+ Note that to avoid wasting space, this value should
+ typically be less than 10% of the size of a stack
+ chunk (<option>-kc</option>), because in a chain of stack
+ chunks, each chunk will have a gap of unused space of this
+ size.
+ </para>
+ </listitem>
</varlistentry>
<varlistentry>
<listitem>
<para>[Default: 8M] Set the maximum stack size for
an individual thread to <replaceable>size</replaceable>
- bytes. This option is there purely to stop the program
- eating up all the available memory in the machine if it gets
- into an infinite loop.</para>
+ bytes. If the thread attempts to exceed this limit, it will
+ be sent the <literal>StackOverflow</literal> exception.
+ </para>
+ <para>
+ This option is there mainly to stop the program eating up
+ all the available memory in the machine if it gets into an
+ infinite loop.
+ </para>
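+ <para>
+ The exception can be caught like any other. The following
+ is only a sketch: <literal>deepSum</literal> is a
+ hypothetical function chosen because it recurses deeply,
+ and whether the limit is actually reached depends on the
+ <option>-K</option> setting in force and on how the program
+ was compiled. Running it with a small limit such as
+ <literal>+RTS -K1m</literal> makes an overflow more likely.
+ </para>
+<programlisting>
+import Control.Exception (AsyncException (StackOverflow), evaluate, try)
+
+-- Hypothetical example: a non-tail-recursive sum that needs stack
+-- space proportional to its argument.
+deepSum :: Integer -> Integer
+deepSum 0 = 0
+deepSum n = n + deepSum (n - 1)
+
+main :: IO ()
+main = try (evaluate (deepSum 10000000)) >>= report
+  where
+    -- The type signature selects which exceptions 'try' catches.
+    report :: Either AsyncException Integer -> IO ()
+    report (Left StackOverflow) = putStrLn "stack limit reached"
+    report (Left e)             = putStrLn ("other exception: " ++ show e)
+    report (Right n)            = print n
+</programlisting>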
</listitem>
</varlistentry>
</varlistentry>
<varlistentry>
+ <term>
+ <option>-t</option><optional><replaceable>file</replaceable></optional>
+ <indexterm><primary><option>-t</option></primary><secondary>RTS option</secondary></indexterm>
+ </term>
<term>
- <option>-s</option><replaceable>file</replaceable>
+ <option>-s</option><optional><replaceable>file</replaceable></optional>
<indexterm><primary><option>-s</option></primary><secondary>RTS option</secondary></indexterm>
</term>
<term>
- <option>-S</option><replaceable>file</replaceable>
+ <option>-S</option><optional><replaceable>file</replaceable></optional>
<indexterm><primary><option>-S</option></primary><secondary>RTS option</secondary></indexterm>
</term>
- <listitem>
- <para>Write modest (<option>-s</option>) or verbose
- (<option>-S</option>) garbage-collector statistics into file
- <replaceable>file</replaceable>. The default
- <replaceable>file</replaceable> is
- <filename><replaceable>program</replaceable>.stat</filename>. The
- <replaceable>file</replaceable> <constant>stderr</constant>
- is treated specially, with the output really being sent to
- <constant>stderr</constant>.</para>
-
- <para>This option is useful for watching how the storage
- manager adjusts the heap size based on the current amount of
- live data.</para>
- </listitem>
- </varlistentry>
-
- <varlistentry>
<term>
- <option>-t<replaceable>file</replaceable></option>
- <indexterm><primary><option>-t</option></primary><secondary>RTS option</secondary></indexterm>
+ <option>--machine-readable</option>
+ <indexterm><primary><option>--machine-readable</option></primary><secondary>RTS option</secondary></indexterm>
</term>
<listitem>
- <para>Write a one-line GC stats summary after running the
- program. This output is in the same format as that produced
- by the <option>-Rghc-timing</option> option.</para>
-
- <para>As with <option>-s</option>, the default
- <replaceable>file</replaceable> is
- <filename><replaceable>program</replaceable>.stat</filename>. The
- <replaceable>file</replaceable> <constant>stderr</constant>
- is treated specially, with the output really being sent to
- <constant>stderr</constant>.</para>
+ <para>These options produce runtime-system statistics, such
+ as the amount of time spent executing the program and in the
+ garbage collector, the amount of memory allocated, the
+ maximum size of the heap, and so on. The three
+ variants give different levels of detail:
+ <option>-t</option> produces a single line of output in the
+ same format as GHC's <option>-Rghc-timing</option> option,
+ <option>-s</option> produces a more detailed summary at the
+ end of the program, and <option>-S</option> additionally
+ produces information about each and every garbage
+ collection.</para>
+
+ <para>The output is placed in
+ <replaceable>file</replaceable>. If
+ <replaceable>file</replaceable> is omitted, then the output
+ is sent to <constant>stderr</constant>.</para>
+
+ <para>
+ If you use the <literal>-t</literal> flag then, when your
+ program finishes, you will see something like this:
+ </para>
+
+<programlisting>
+&lt;&lt;ghc: 36169392 bytes, 69 GCs, 603392/1065272 avg/max bytes residency (2 samples), 3M in use, 0.00 INIT (0.00 elapsed), 0.02 MUT (0.02 elapsed), 0.07 GC (0.07 elapsed) :ghc&gt;&gt;
+</programlisting>
+
+ <para>
+ This tells you:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ The total number of bytes allocated by the program over the
+ whole run.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The total number of garbage collections performed.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The average and maximum "residency", which is the amount of
+ live data in bytes. The runtime can only determine the
+ amount of live data during a major GC, which is why the
+ number of samples corresponds to the number of major GCs
+ (and is usually relatively small). To get a better picture
+ of the heap profile of your program, use
+ the <option>-hT</option> RTS option
+ (<xref linkend="rts-profiling" />).
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The peak memory the RTS has allocated from the OS.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The amount of CPU time and elapsed wall clock time while
+ initialising the runtime system (INIT), running the program
+ itself (MUT, the mutator), and garbage collecting (GC).
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <para>
+ You can also get this in a more future-proof, machine-readable
+ format, with <literal>-t --machine-readable</literal>:
+ </para>
+
+<programlisting>
+ [("bytes allocated", "36169392")
+ ,("num_GCs", "69")
+ ,("average_bytes_used", "603392")
+ ,("max_bytes_used", "1065272")
+ ,("num_byte_usage_samples", "2")
+ ,("peak_megabytes_allocated", "3")
+ ,("init_cpu_seconds", "0.00")
+ ,("init_wall_seconds", "0.00")
+ ,("mutator_cpu_seconds", "0.02")
+ ,("mutator_wall_seconds", "0.02")
+ ,("GC_cpu_seconds", "0.07")
+ ,("GC_wall_seconds", "0.07")
+ ]
+</programlisting>
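+
+ <para>
+ Because this output has the syntax of a Haskell list of
+ pairs of strings, it can be read back with the standard
+ <function>read</function> function. The sketch below
+ assumes that the list has already been separated from any
+ other output; <literal>parseStats</literal> and
+ <literal>sample</literal> are illustrative names, and
+ <literal>sample</literal> is a shortened copy of the output
+ above. It prints <literal>GCs: 69</literal>.
+ </para>
+<programlisting>
+import Data.Maybe (fromMaybe)
+
+-- The text is a valid [(String, String)] literal, so the standard
+-- Read instance can parse it directly.
+parseStats :: String -> [(String, String)]
+parseStats = read
+
+-- A shortened copy of the sample output (illustrative only).
+sample :: String
+sample = unlines
+  [ " [(\"bytes allocated\", \"36169392\")"
+  , " ,(\"num_GCs\", \"69\")"
+  , " ,(\"max_bytes_used\", \"1065272\")"
+  , " ]"
+  ]
+
+main :: IO ()
+main = putStrLn ("GCs: " ++ fromMaybe "unknown" (lookup "num_GCs" (parseStats sample)))
+</programlisting>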
+
+ <para>
+ If you use the <literal>-s</literal> flag then, when your
+ program finishes, you will see something like this (the exact
+ details will vary depending on what sort of RTS you have, e.g.
+ you will only see profiling data if your RTS is compiled for
+ profiling):
+ </para>
+
+<programlisting>
+ 36,169,392 bytes allocated in the heap
+ 4,057,632 bytes copied during GC
+ 1,065,272 bytes maximum residency (2 sample(s))
+ 54,312 bytes maximum slop
+ 3 MB total memory in use (0 MB lost due to fragmentation)
+
+ Generation 0: 67 collections, 0 parallel, 0.04s, 0.03s elapsed
+ Generation 1: 2 collections, 0 parallel, 0.03s, 0.04s elapsed
+
+ SPARKS: 359207 (557 converted, 149591 pruned)
+
+ INIT time 0.00s ( 0.00s elapsed)
+ MUT time 0.01s ( 0.02s elapsed)
+ GC time 0.07s ( 0.07s elapsed)
+ EXIT time 0.00s ( 0.00s elapsed)
+ Total time 0.08s ( 0.09s elapsed)
+
+ %GC time 89.5% (75.3% elapsed)
+
+ Alloc rate 4,520,608,923 bytes per MUT second
+
+ Productivity 10.5% of total user, 9.1% of total elapsed
+</programlisting>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ The "bytes allocated in the heap" is the total bytes allocated
+ by the program over the whole run.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ GHC uses a copying garbage collector by default. "bytes copied
+ during GC" tells you how many bytes it had to copy during
+ garbage collection.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The maximum space actually used by your program is the
+ "bytes maximum residency" figure. This is only checked during
+ major garbage collections, so it is only an approximation;
+ the number of samples tells you how many times it is checked.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The "bytes maximum slop" tells you the most space that is ever
+ wasted due to the way GHC allocates memory in blocks. Slop is
+ memory at the end of a block that was wasted. There's no way
+ to control this; we just like to see how much memory is being
+ lost this way.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ The "total memory in use" tells you the peak memory the RTS has
+ allocated from the OS.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Next there is information about the garbage collections done.
+ For each generation it says how many garbage collections were
+ done, how many of those collections were done in parallel,
+ the total CPU time used for garbage collecting that generation,
+ and the total wall clock time elapsed while garbage collecting
+ that generation.
+ </para>
+ </listitem>
+ <listitem>
+ <para>The <literal>SPARKS</literal> statistic refers to the
+ use of <literal>Control.Parallel.par</literal> and related
+ functionality in the program. Each spark represents a call
+ to <literal>par</literal>; a spark is "converted" when it is
+ executed in parallel; and a spark is "pruned" when it is
+ found to be already evaluated and is discarded from the pool
+ by the garbage collector. Any remaining sparks are
+ discarded at the end of execution, so "converted" plus
+ "pruned" does not necessarily add up to the total.</para>
+ </listitem>
+ <listitem>
+ <para>
+ Next there is the CPU time and wall clock time elapsed broken
+ down by what the runtime system was doing at the time.
+ INIT is the runtime system initialisation.
+ MUT is the mutator time, i.e. the time spent actually running
+ your code.
+ GC is the time spent doing garbage collection.
+ RP is the time spent doing retainer profiling.
+ PROF is the time spent doing other profiling.
+ EXIT is the runtime system shutdown time.
+ And finally, Total is, of course, the total.
+ </para>
+ <para>
+ %GC time tells you what percentage GC is of Total.
+ "Alloc rate" tells you the "bytes allocated in the heap" divided
+ by the MUT CPU time.
+ "Productivity" tells you what percentage of the Total CPU and wall
+ clock elapsed times are spent in the mutator (MUT).
+ </para>
+ </listitem>
+ </itemizedlist>
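+
+ <para>
+ To see a non-zero <literal>SPARKS</literal> figure, a
+ program has to call <literal>par</literal>. The following
+ sketch assumes that the <literal>parallel</literal> package
+ is installed; <literal>fib</literal> is just a hypothetical
+ workload. It creates a single spark, and is intended to be
+ built with <option>-threaded</option> and run with
+ something like <literal>+RTS -N2 -s</literal>. Whether the
+ spark is converted or pruned depends on scheduling.
+ </para>
+<programlisting>
+import Control.Parallel (par, pseq)
+
+-- Hypothetical workload: a deliberately slow Fibonacci function.
+fib :: Int -> Integer
+fib n
+  | n > 1     = fib (n - 1) + fib (n - 2)
+  | otherwise = toInteger n
+
+-- 'par' creates a spark for a; 'pseq' evaluates b in the current
+-- thread before the two results are added together.
+main :: IO ()
+main = a `par` (b `pseq` print (a + b))
+  where
+    a = fib 27
+    b = fib 28
+</programlisting>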
+
+ <para>
+ The <literal>-S</literal> flag, as well as giving the same
+ output as the <literal>-s</literal> flag, prints information
+ about each GC as it happens:
+ </para>
+
+<programlisting>
+ Alloc Copied Live GC GC TOT TOT Page Flts
+ bytes bytes bytes user elap user elap
+ 528496 47728 141512 0.01 0.02 0.02 0.02 0 0 (Gen: 1)
+[...]
+ 524944 175944 1726384 0.00 0.00 0.08 0.11 0 0 (Gen: 0)
+</programlisting>
+
+ <para>
+ For each garbage collection, we print:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ How many bytes we allocated this garbage collection.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ How many bytes we copied this garbage collection.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ How many bytes are currently live.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ How long this garbage collection took (CPU time and elapsed
+ wall clock time).
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ How long the program has been running (CPU time and elapsed
+ wall clock time).
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ How many page faults occurred during this garbage collection.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ How many page faults occurred since the end of the last garbage
+ collection.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ Which generation is being garbage collected.
+ </para>
+ </listitem>
+ </itemizedlist>
+
</listitem>
</varlistentry>
</variablelist>
</sect2>
<sect2>
- <title>RTS options for profiling and parallelism</title>
+ <title>RTS options for concurrency and parallelism</title>
- <para>The RTS options related to profiling are described in <xref
- linkend="rts-options-heap-prof"/>, those for concurrency in
+ <para>The RTS options related to concurrency are described in
<xref linkend="using-concurrent" />, and those for parallelism in
<xref linkend="parallel-options"/>.</para>
</sect2>
+ <sect2 id="rts-profiling">
+ <title>RTS options for profiling</title>
+
+ <para>Most profiling runtime options are only available when you
+ compile your program for profiling (see
+ <xref linkend="prof-compiler-options" />, and
+ <xref linkend="rts-options-heap-prof" /> for the runtime options).
+ However, there is one profiling option that is available
+ for ordinary non-profiled executables:</para>
+
+ <variablelist>
+ <varlistentry>
+ <term>
+ <option>-hT</option>
+ <indexterm><primary><option>-hT</option></primary><secondary>RTS
+ option</secondary></indexterm>
+ </term>
+ <listitem>
+ <para>Generates a basic heap profile, in the
+ file <literal><replaceable>prog</replaceable>.hp</literal>.
+ To produce the heap profile graph,
+ use <command>hp2ps</command> (see <xref linkend="hp2ps"
+ />). The basic heap profile is broken down by data
+ constructor, with other types of closures (functions, thunks,
+ etc.) grouped into broad categories
+ (e.g. <literal>FUN</literal>, <literal>THUNK</literal>). To
+ get a more detailed profile, use the full profiling
+ support (<xref linkend="profiling" />).</para>
+ </listitem>
+ </varlistentry>
+ </variablelist>
+ </sect2>
+
+ <sect2 id="rts-eventlog">
+ <title>Tracing</title>
+
+ <indexterm><primary>tracing</primary></indexterm>
+ <indexterm><primary>events</primary></indexterm>
+ <indexterm><primary>eventlog files</primary></indexterm>
+
+ <para>
+ When the program is linked with the <option>-eventlog</option>
+ option (<xref linkend="options-linker" />), runtime events can
+ be logged in two ways:
+ </para>
+
+ <itemizedlist>
+ <listitem>
+ <para>
+ In binary format to a file for later analysis by a
+ variety of tools. One such tool
+ is <ulink url="http://hackage.haskell.org/package/ThreadScope">ThreadScope</ulink><indexterm><primary>ThreadScope</primary></indexterm>,
+ which interprets the event log to produce a visual parallel
+ execution profile of the program.
+ </para>
+ </listitem>
+ <listitem>
+ <para>
+ As text to standard output, for debugging purposes.
+ </para>
+ </listitem>
+ </itemizedlist>
+
+ <variablelist>
+ <varlistentry>
+ <term>
+ <option>-l<optional><replaceable>flags</replaceable></optional></option>
+ <indexterm><primary><option>-l</option></primary><secondary>RTS option</secondary></indexterm>
+ </term>
+ <listitem>
+ <para>
+ Log events in binary format to the
+ file <filename><replaceable>program</replaceable>.eventlog</filename>,
+ where <replaceable>flags</replaceable> is a sequence of
+ zero or more characters indicating which kinds of events
+ to log. Currently there is only one type
+ supported: <literal>-ls</literal>, for scheduler events.
+ </para>
+
+ <para>
+ The format of the log file is described by the header
+ <filename>EventLogFormat.h</filename> that comes with
+ GHC, and it can be parsed in Haskell using
+ the <ulink url="http://hackage.haskell.org/package/ghc-events">ghc-events</ulink>
+ library. To dump the contents of
+ a <literal>.eventlog</literal> file as text, use the
+ tool <literal>show-ghc-events</literal> that comes with
+ the <ulink url="http://hackage.haskell.org/package/ghc-events">ghc-events</ulink>
+ package.
+ </para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term>
+ <option>-v</option><optional><replaceable>flags</replaceable></optional>
+ <indexterm><primary><option>-v</option></primary><secondary>RTS option</secondary></indexterm>
+ </term>
+ <listitem>
+ <para>
+ Log events as text to standard output, instead of to
+ the <literal>.eventlog</literal> file.
+ The <replaceable>flags</replaceable> are the same as
+ for <option>-l</option>, with the additional
+ option <literal>t</literal> which indicates that
+ each event printed should be preceded by a timestamp value
+ (in the binary <literal>.eventlog</literal> file, all
+ events are automatically associated with a timestamp).
+ </para>
+ </listitem>
+ </varlistentry>
+
+ </variablelist>
+
+ <para>
+ The debugging
+ options <option>-D<replaceable>x</replaceable></option> also
+ generate events which are logged using the tracing framework.
+ By default those events are dumped as text to stdout
+ (<option>-D<replaceable>x</replaceable></option>
+ implies <option>-v</option>), but they may instead be stored in
+ the binary eventlog file by using the <option>-l</option>
+ option.
+ </para>
+ </sect2>
+
<sect2 id="rts-options-debugging">
<title>RTS options for hackers, debuggers, and over-interested
souls</title>
<varlistentry>
<term>
- <option>-D</option><replaceable>num</replaceable>
+ <option>-D</option><replaceable>x</replaceable>
<indexterm><primary>-D</primary><secondary>RTS option</secondary></indexterm>
</term>
<listitem>
- <para>An RTS debugging flag; varying quantities of output
- depending on which bits are set in
- <replaceable>num</replaceable>. Only works if the RTS was
- compiled with the <option>DEBUG</option> option.</para>
+ <para>
+ An RTS debugging flag; only available if the program was
+ linked with the <option>-debug</option> option. Various
+ values of <replaceable>x</replaceable> are provided to
+ enable debug messages and additional runtime sanity checks
+ in different subsystems in the RTS, for
+ example <literal>+RTS -Ds -RTS</literal> enables debug
+ messages from the scheduler.
+ Use <literal>+RTS -?</literal> to find out which
+ debug flags are supported.
+ </para>
+
+ <para>
+ Debug messages will be sent to the binary event log file
+ instead of stdout if the <option>-l</option> option is
+ added. This might be useful for reducing the overhead of
+ debug tracing.
+ </para>
</listitem>
</varlistentry>
</term>
<listitem>
<para>Produce “ticky-ticky” statistics at the
- end of the program run. The <replaceable>file</replaceable>
- business works just like on the <option>-S</option> RTS
- option (above).</para>
-
- <para>“Ticky-ticky” statistics are counts of
- various program actions (updates, enters, etc.) The program
- must have been compiled using
- <option>-ticky</option><indexterm><primary><option>-ticky</option></primary></indexterm>
- (a.k.a. “ticky-ticky profiling”), and, for it to
- be really useful, linked with suitable system libraries.
- Not a trivial undertaking: consult the installation guide on
- how to set things up for easy “ticky-ticky”
- profiling. For more information, see <xref
- linkend="ticky-ticky"/>.</para>
+ end of the program run (only available if the program was
+ linked with <option>-debug</option>).
+ The <replaceable>file</replaceable> business works just like
+ on the <option>-S</option> RTS option, above.</para>
+
+ <para>For more information on ticky-ticky profiling, see
+ <xref linkend="ticky-ticky"/>.</para>
</listitem>
</varlistentry>
</sect2>
- <sect2 id="rts-hooks">
- <title>“Hooks” to change RTS behaviour</title>
-
- <indexterm><primary>hooks</primary><secondary>RTS</secondary></indexterm>
- <indexterm><primary>RTS hooks</primary></indexterm>
- <indexterm><primary>RTS behaviour, changing</primary></indexterm>
-
- <para>GHC lets you exercise rudimentary control over the RTS
- settings for any given program, by compiling in a
- “hook” that is called by the run-time system. The RTS
- contains stub definitions for all these hooks, but by writing your
- own version and linking it on the GHC command line, you can
- override the defaults.</para>
-
- <para>Owing to the vagaries of DLL linking, these hooks don't work
- under Windows when the program is built dynamically.</para>
+ <sect2>
+ <title>Getting information about the RTS</title>
- <para>The hook <literal>ghc_rts_opts</literal><indexterm><primary><literal>ghc_rts_opts</literal></primary>
- </indexterm>lets you set RTS
- options permanently for a given program. A common use for this is
- to give your program a default heap and/or stack size that is
- greater than the default. For example, to set <literal>-H128m
- -K1m</literal>, place the following definition in a C source
- file:</para>
+ <indexterm><primary>RTS</primary></indexterm>
-<programlisting>
-char *ghc_rts_opts = "-H128m -K1m";
-</programlisting>
+ <para>It is possible to ask the RTS to give some information about
+ itself. To do this, use the <option>--info</option> flag, e.g.</para>
+<screen>
+$ ./a.out +RTS --info
+ [("GHC RTS", "YES")
+ ,("GHC version", "6.7")
+ ,("RTS way", "rts_p")
+ ,("Host platform", "x86_64-unknown-linux")
+ ,("Host architecture", "x86_64")
+ ,("Host OS", "linux")
+ ,("Host vendor", "unknown")
+ ,("Build platform", "x86_64-unknown-linux")
+ ,("Build architecture", "x86_64")
+ ,("Build OS", "linux")
+ ,("Build vendor", "unknown")
+ ,("Target platform", "x86_64-unknown-linux")
+ ,("Target architecture", "x86_64")
+ ,("Target OS", "linux")
+ ,("Target vendor", "unknown")
+ ,("Word size", "64")
+ ,("Compiler unregisterised", "NO")
+ ,("Tables next to code", "YES")
+ ]
+</screen>
+ <para>The information is formatted such that it can be read as a
+ Haskell value of type <literal>[(String, String)]</literal>. Currently the following
+ fields are present:</para>
- <para>Compile the C file, and include the object file on the
- command line when you link your Haskell program.</para>
+ <variablelist>
- <para>These flags are interpreted first, before any RTS flags from
- the <literal>GHCRTS</literal> environment variable and any flags
- on the command line.</para>
+ <varlistentry>
+ <term><literal>GHC RTS</literal></term>
+ <listitem>
+ <para>Is this program linked against the GHC RTS? (always
+ "YES").</para>
+ </listitem>
+ </varlistentry>
- <para>You can also change the messages printed when the runtime
- system “blows up,” e.g., on stack overflow. The hooks
- for these are as follows:</para>
+ <varlistentry>
+ <term><literal>GHC version</literal></term>
+ <listitem>
+ <para>The version of GHC used to compile this program.</para>
+ </listitem>
+ </varlistentry>
- <variablelist>
+ <varlistentry>
+ <term><literal>RTS way</literal></term>
+ <listitem>
+ <para>The variant (“way”) of the runtime. The
+ most common values are <literal>rts</literal> (vanilla),
+ <literal>rts_thr</literal> (threaded runtime, i.e. linked using the
+ <literal>-threaded</literal> option) and <literal>rts_p</literal>
+ (profiling runtime, i.e. linked using the <literal>-prof</literal>
+ option). Other variants include <literal>debug</literal>
+ (linked using <literal>-debug</literal>),
+ <literal>t</literal> (ticky-ticky profiling) and
+ <literal>dyn</literal> (the RTS is
+ linked in dynamically, i.e. a shared library, rather than statically
+ linked into the executable itself). These can be combined,
+ e.g. you might have <literal>rts_thr_debug_p</literal>.</para>
+ </listitem>
+ </varlistentry>
<varlistentry>
- <term>
- <function>void OutOfHeapHook (unsigned long, unsigned long)</function>
- <indexterm><primary><function>OutOfHeapHook</function></primary></indexterm>
+ <term>
+ <literal>Target platform</literal>,
+ <literal>Target architecture</literal>,
+ <literal>Target OS</literal>,
+ <literal>Target vendor</literal>
</term>
- <listitem>
- <para>The heap-overflow message.</para>
- </listitem>
+ <listitem>
+ <para>These describe the platform the program is compiled to run on.</para>
+ </listitem>
</varlistentry>
<varlistentry>
- <term>
- <function>void StackOverflowHook (long int)</function>
- <indexterm><primary><function>StackOverflowHook</function></primary></indexterm>
+ <term>
+ <literal>Build platform</literal>,
+ <literal>Build architecture</literal>,
+ <literal>Build OS</literal>,
+ <literal>Build vendor</literal>
</term>
- <listitem>
- <para>The stack-overflow message.</para>
- </listitem>
+ <listitem>
+ <para>These describe the platform on which the program was
+ built. (That is, the target platform of GHC itself.) Ordinarily
+ this is identical to the target platform. (It could potentially
+ be different if cross-compiling.)</para>
+ </listitem>
</varlistentry>
<varlistentry>
- <term>
- <function>void MallocFailHook (long int)</function>
- <indexterm><primary><function>MallocFailHook</function></primary></indexterm>
+ <term>
+ <literal>Host platform</literal>,
+ <literal>Host architecture</literal>,
+ <literal>Host OS</literal>,
+ <literal>Host vendor</literal>
</term>
- <listitem>
- <para>The message printed if <function>malloc</function>
- fails.</para>
- </listitem>
+ <listitem>
+ <para>These describe the platform on which GHC itself was compiled.
+ Again, this would normally be identical to the build and
+ target platforms.</para>
+ </listitem>
</varlistentry>
+
+ <varlistentry>
+ <term><literal>Word size</literal></term>
+ <listitem>
+ <para>Either <literal>"32"</literal> or <literal>"64"</literal>,
+ reflecting the word size of the target platform.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>Compiler unregisterised</literal></term>
+ <listitem>
+ <para>Was this program compiled with an “unregisterised”
+ version of GHC? (I.e., a version of GHC that has no platform-specific
+ optimisations compiled in, usually because this is a currently
+ unsupported platform.) This value will usually be no, unless you're
+ using an experimental build of GHC.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
+ <term><literal>Tables next to code</literal></term>
+ <listitem>
+ <para>Putting info tables directly next to entry code is a useful
+ performance optimisation that is not available on all platforms.
+ This field tells you whether the program has been compiled with
+ this optimisation. (Usually yes, except on unusual platforms.)</para>
+ </listitem>
+ </varlistentry>
+
</variablelist>
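+
+ <para>
+ Because the output has the shape of a Haskell list of string
+ pairs, it can be parsed with the standard <literal>Read</literal>
+ machinery. The following is a minimal sketch, not a definitive
+ implementation: <literal>sampleInfo</literal>,
+ <literal>parseRtsInfo</literal> and <literal>rtsField</literal>
+ are hypothetical names introduced for this example, and the sample
+ text is an abbreviated copy of the output shown above.
+ </para>

```haskell
import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- Abbreviated sample of the text printed by "+RTS --info".
-- (Copied by hand for this example; a real program would capture it
-- from the standard output of the executable.)
sampleInfo :: String
sampleInfo = unlines
  [ " [(\"GHC RTS\", \"YES\")"
  , " ,(\"GHC version\", \"6.7\")"
  , " ,(\"RTS way\", \"rts_p\")"
  , " ,(\"Word size\", \"64\")"
  , " ]"
  ]

-- Parse the text into an association list; Nothing on malformed input.
parseRtsInfo :: String -> Maybe [(String, String)]
parseRtsInfo = readMaybe

-- Look up a single field by name.
rtsField :: String -> String -> Maybe String
rtsField key text = parseRtsInfo text >>= lookup key

main :: IO ()
main = do
  putStrLn (fromMaybe "unknown" (rtsField "RTS way" sampleInfo))    -- prints rts_p
  putStrLn (fromMaybe "unknown" (rtsField "Word size" sampleInfo))  -- prints 64
```

+ <para>
+ Using <literal>readMaybe</literal> rather than
+ <literal>read</literal> means that unexpected output yields
+ <literal>Nothing</literal> instead of a runtime error.
+ </para>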
- <para>For examples of the use of these hooks, see GHC's own
- versions in the file
- <filename>ghc/compiler/parser/hschooks.c</filename> in a GHC
- source tree.</para>
</sect2>
</sect1>
<!-- Emacs stuff:
;;; Local Variables: ***
- ;;; mode: xml ***
;;; sgml-parent-document: ("users_guide.xml" "book" "chapter" "sect1") ***
;;; End: ***
-->