Document the new RTS linker flags
[ghc-hetmet.git] / docs / users_guide / runtime_control.xml
index 69e26bc..14732c5 100644 (file)
@@ -10,7 +10,8 @@
   code and then links it with a non-trivial runtime system (RTS),
   which handles storage management, profiling, etc.</para>
 
-  <para>You have some control over the behaviour of the RTS, by giving
+  <para>If you use the <literal>-rtsopts</literal> flag when linking,
+  you have some control over the behaviour of the RTS, by giving
   special command-line arguments to your program.</para>
 
   <para>When your Haskell program starts up, its RTS extracts
@@ -68,7 +69,8 @@
     <indexterm><primary>environment variable</primary><secondary>for
     setting RTS options</secondary></indexterm>
 
-    <para>RTS options are also taken from the environment variable
+    <para>If the <literal>-rtsopts</literal> flag is used when linking,
+    RTS options are also taken from the environment variable
     <envar>GHCRTS</envar><indexterm><primary><envar>GHCRTS</envar></primary>
       </indexterm>.  For example, to set the maximum heap size
     to 128M for all GHC-compiled programs (using an
 
       <varlistentry>
         <term>
-          <option>-q1</option>
-          <indexterm><primary><option>-q1</option><secondary>RTS
+          <option>-qg<optional><replaceable>gen</replaceable></optional></option>
+          <indexterm><primary><option>-qg</option><secondary>RTS
           option</secondary></primary></indexterm>
         </term>
         <listitem>
-          <para>&lsqb;New in GHC 6.12.1&rsqb; Disable the parallel GC.
-            The parallel GC is turned on automatically when parallel
-            execution is enabled with the <option>-N</option> option;
-            this option is available to turn it off if
-            necessary.</para>
+          <para>&lsqb;New in GHC 6.12.1&rsqb; &lsqb;Default: 0&rsqb;
+            Use parallel GC in
+            generation <replaceable>gen</replaceable> and higher.
+            Omitting <replaceable>gen</replaceable> turns off the
+            parallel GC completely, reverting to sequential GC.</para>
           
-          <para>Experiments have shown that parallel GC usually
-            results in a performance improvement given 3 cores or
-            more; with 2 cores it may or may not be beneficial,
-            depending on the workload.  Bigger heaps work better with
-            parallel GC, so set your <option>-H</option> value high (3
-            or more times the maximum residency).  Look at the timing
-            stats with <option>+RTS -s</option> to see whether you're
-            getting any benefit from parallel GC or not.  If you find
-            parallel GC is significantly <emphasis>slower</emphasis>
-            (in elapsed time) than sequential GC, please report it as
-            a bug.</para>
-
-          <para>In GHC 6.10.1 it was possible to use a different
-            number of threads for GC than for execution, because the GC
-            used its own pool of threads.  Now, the GC uses the same
-            threads as the mutator (for executing the program).</para>
+          <para>The default parallel GC settings are usually suitable
+            for parallel programs (i.e. those
+            using <literal>par</literal>, Strategies, or with multiple
+            threads).  However, it is sometimes beneficial to enable
+            the parallel GC for a single-threaded sequential program
+            too, especially if the program has a large amount of heap
+            data and GC is a significant fraction of runtime.  To use
+            the parallel GC in a sequential program, enable the
+            parallel runtime with a suitable <literal>-N</literal>
+            option, and additionally it might be beneficial to
+            restrict parallel GC to the old generation
+            with <literal>-qg1</literal>.</para>
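+
+          <para>For example, assuming a hypothetical single-threaded
+            program <filename>prog</filename> that was linked with
+            <option>-threaded</option> and <option>-rtsopts</option>,
+            something like the following would run it on two cores
+            with parallel GC restricted to the old generation (the
+            core count is only illustrative):</para>
+
+<screen>
+$ ./prog +RTS -N2 -qg1 -RTS
+</screen>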
         </listitem>
       </varlistentry>        
 
       <varlistentry>
         <term>
-          <option>-qg<replaceable>n</replaceable></option>
-          <indexterm><primary><option>-qg</option><secondary>RTS
+          <option>-qb<optional><replaceable>gen</replaceable></optional></option>
+          <indexterm><primary><option>-qb</option><secondary>RTS
           option</secondary></primary></indexterm>
         </term>
         <listitem>
           <para>
-            &lsqb;Default: 1&rsqb; &lsqb;New in GHC 6.12.1&rsqb;
-            Enable the parallel GC only in
-            generation <replaceable>n</replaceable> and greater.
-            Parallel GC is often not worthwhile for collections in
-            generation 0 (the young generation), so it is enabled by
-            default only for collections in generation 1 (and higher,
-            if applicable).
+            &lsqb;New in GHC 6.12.1&rsqb; &lsqb;Default: 1&rsqb; Use
+            load-balancing in the parallel GC in
+            generation <replaceable>gen</replaceable> and higher.
+            Omitting <replaceable>gen</replaceable> disables
+            load-balancing entirely.</para>
+          
+          <para>
+            Load-balancing shares out the work of GC between the
+            available cores.  This is a good idea when the heap is
+            large and we need to parallelise the GC work; however, it
+            is also pessimal for the short young-generation
+            collections in a parallel program, because it can harm
+            locality by moving data from the cache of the CPU where
+            it is being used to the cache of another CPU.  Hence the
+            default is to do load-balancing only in the
+            old generation.  In fact, for a parallel program it is
+            sometimes beneficial to disable load-balancing entirely
+            with <literal>-qb</literal>.
           </para>
         </listitem>
       </varlistentry>
     </variablelist>
   </sect2>
 
+  <sect2 id="rts-eventlog">
+    <title>Tracing</title>
+
+    <indexterm><primary>tracing</primary></indexterm>
+    <indexterm><primary>events</primary></indexterm>
+    <indexterm><primary>eventlog files</primary></indexterm>
+
+    <para>
+      When the program is linked with the <option>-eventlog</option>
+      option (<xref linkend="options-linker" />), runtime events can
+      be logged in two ways:
+    </para>
+
+    <itemizedlist>
+      <listitem>
+        <para>
+          In binary format to a file for later analysis by a
+          variety of tools.  One such tool
+          is <ulink url="http://hackage.haskell.org/package/ThreadScope">ThreadScope</ulink><indexterm><primary>ThreadScope</primary></indexterm>,
+          which interprets the event log to produce a visual parallel
+          execution profile of the program.
+        </para>
+      </listitem>
+      <listitem>
+        <para>
+          As text to standard output, for debugging purposes.
+        </para>
+      </listitem>
+    </itemizedlist>
+
+    <variablelist>
+      <varlistentry>
+        <term>
+          <option>-l<optional><replaceable>flags</replaceable></optional></option>
+          <indexterm><primary><option>-l</option></primary><secondary>RTS option</secondary></indexterm>
+        </term>
+        <listitem>
+          <para>
+            Log events in binary format to the
+            file <filename><replaceable>program</replaceable>.eventlog</filename>,
+            where <replaceable>flags</replaceable> is a sequence of
+            zero or more characters indicating which kinds of events
+            to log.  Currently only one kind is
+            supported: <literal>s</literal> (as in <literal>-ls</literal>), for scheduler events.
+          </para>
+
+          <para>
+            The format of the log file is described by the header
+            <filename>EventLogFormat.h</filename> that comes with
+            GHC, and it can be parsed in Haskell using
+            the <ulink url="http://hackage.haskell.org/package/ghc-events">ghc-events</ulink>
+            library.  To dump the contents of
+            a <literal>.eventlog</literal> file as text, use the
+            tool <literal>show-ghc-events</literal> that comes with
+            the <ulink url="http://hackage.haskell.org/package/ghc-events">ghc-events</ulink>
+            package.
+          </para>
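+
+          <para>
+            As a sketch of the whole workflow, assuming a hypothetical
+            program <filename>prog.hs</filename> and that
+            the <literal>ghc-events</literal> package is installed:
+          </para>
+
+<screen>
+$ ghc -eventlog -rtsopts -o prog prog.hs
+$ ./prog +RTS -ls -RTS
+$ show-ghc-events prog.eventlog
+</screen>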
+        </listitem>
+      </varlistentry>
+
+      <varlistentry>
+        <term>
+          <option>-v<optional><replaceable>flags</replaceable></optional></option>
+          <indexterm><primary><option>-v</option></primary><secondary>RTS option</secondary></indexterm>
+        </term>
+        <listitem>
+          <para>
+            Log events as text to standard output, instead of to
+            the <literal>.eventlog</literal> file.
+            The <replaceable>flags</replaceable> are the same as
+            for <option>-l</option>, with the additional
+            option <literal>t</literal> which indicates that
+            each event printed should be preceded by a timestamp value
+            (in the binary <literal>.eventlog</literal> file, all
+            events are automatically associated with a timestamp).
+          </para>
+        </listitem>
+      </varlistentry>
+
+    </variablelist>
+
+    <para>
+      The debugging
+      options <option>-D<replaceable>x</replaceable></option> also
+      generate events which are logged using the tracing framework.
+      By default those events are dumped as text to stdout
+      (<option>-D<replaceable>x</replaceable></option>
+      implies <option>-v</option>), but they may instead be stored in
+      the binary eventlog file by using the <option>-l</option>
+      option.
+    </para>
+  </sect2>
+
   <sect2 id="rts-options-debugging">
     <title>RTS options for hackers, debuggers, and over-interested
     souls</title>
 
       <varlistentry>
        <term>
-          <option>-D</option><replaceable>num</replaceable>
+          <option>-D</option><replaceable>x</replaceable>
           <indexterm><primary>-D</primary><secondary>RTS option</secondary></indexterm>
         </term>
        <listitem>
-         <para>An RTS debugging flag; varying quantities of output
-          depending on which bits are set in
-          <replaceable>num</replaceable>.  Only works if the RTS was
-          compiled with the <option>DEBUG</option> option.</para>
+         <para>
+            An RTS debugging flag; only available if the program was
+           linked with the <option>-debug</option> option.  Various
+           values of <replaceable>x</replaceable> are provided to
+           enable debug messages and additional runtime sanity checks
+           in different subsystems in the RTS, for
+           example <literal>+RTS -Ds -RTS</literal> enables debug
+           messages from the scheduler.
+           Use <literal>+RTS&nbsp;-?</literal> to find out which
+           debug flags are supported.
+          </para>
+
+          <para>
+            Debug messages will be sent to the binary event log file
+            instead of stdout if the <option>-l</option> option is
+            added.  This might be useful for reducing the overhead of
+            debug tracing.
+          </para>
        </listitem>
       </varlistentry>
 
         </term>
        <listitem>
          <para>Produce &ldquo;ticky-ticky&rdquo; statistics at the
-          end of the program run.  The <replaceable>file</replaceable>
-          business works just like on the <option>-S</option> RTS
-          option (above).</para>
-
-         <para>&ldquo;Ticky-ticky&rdquo; statistics are counts of
-          various program actions (updates, enters, etc.)  The program
-          must have been compiled using
-          <option>-ticky</option><indexterm><primary><option>-ticky</option></primary></indexterm>
-          (a.k.a. &ldquo;ticky-ticky profiling&rdquo;), and, for it to
-          be really useful, linked with suitable system libraries.
-          Not a trivial undertaking: consult the installation guide on
-          how to set things up for easy &ldquo;ticky-ticky&rdquo;
-          profiling.  For more information, see <xref
-          linkend="ticky-ticky"/>.</para>
+          end of the program run (only available if the program was
+          linked with <option>-debug</option>).
+          The <replaceable>file</replaceable> business works just like
+          on the <option>-S</option> RTS option, above.</para>
+
+          <para>For more information on ticky-ticky profiling, see
+          <xref linkend="ticky-ticky"/>.</para>
        </listitem>
       </varlistentry>
 
 
   </sect2>
 
+  <sect2>
+    <title>Linker flags to change RTS behaviour</title>
+
+    <indexterm><primary>RTS behaviour, changing</primary></indexterm>
+
+    <para>
+      GHC lets you exercise rudimentary control over the RTS settings
+      for any given program, by using the <literal>-with-rtsopts</literal>
+      linker flag. For example, to set <literal>-H128m -K1m</literal>,
+      link with <literal>-with-rtsopts="-H128m -K1m"</literal>.
+    </para>
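+
+    <para>
+      For instance, assuming a hypothetical program
+      <filename>prog.hs</filename>, the options can be fixed at link
+      time so that the resulting binary uses them as its defaults
+      without any <literal>+RTS</literal> arguments on the command
+      line:
+    </para>
+
+<screen>
+$ ghc -with-rtsopts="-H128m -K1m" -o prog prog.hs
+$ ./prog
+</screen>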
+
+  </sect2>
+
   <sect2 id="rts-hooks">
     <title>&ldquo;Hooks&rdquo; to change RTS behaviour</title>