X-Git-Url: http://git.megacz.com/?a=blobdiff_plain;f=ghc%2Fdocs%2Fusers_guide%2Fprofiling.sgml;h=e79e63824af5a1fb52504c45f3c8e6e8c43fa3f8;hb=9a9feb62db17daf7f2566395f719c2aecec5feb0;hp=4bb22a1e13a43cece92f69d54df8afd69f0d63a0;hpb=ec655d31e7a73d0e2b9eb160faa7ebd7c9fe3577;p=ghc-hetmet.git

diff --git a/ghc/docs/users_guide/profiling.sgml b/ghc/docs/users_guide/profiling.sgml
index 4bb22a1..e79e638 100644
--- a/ghc/docs/users_guide/profiling.sgml
+++ b/ghc/docs/users_guide/profiling.sgml
@@ -45,7 +45,7 @@
     
   </orderedlist>
   
-  <sect1>
+  <sect1 id="cost-centres">
     <title>Cost centres and cost-centre stacks</title>
     
     <para>GHC's profiling system assigns <firstterm>costs</firstterm>
@@ -292,6 +292,12 @@ MAIN                     MAIN             0    0.0   0.0    100.0 100.0
 	  variable.</para></footnote>.  Notice that this is a recursive
 	  definition.</para>
 	</listitem>
+
+	<listitem>
+	  <para>Time spent in foreign code (see <xref linkend="ffi">)
+	  is always attributed to the cost centre in force at the
+	  Haskell call-site of the foreign function.</para>
+	</listitem>
       </itemizedlist>
 
       <para>What do we mean by one-off costs?  Well, Haskell is a lazy
@@ -327,7 +333,6 @@ x = nfib 25
       doesn't look like you expect it to, feel free to send it (and
       your program) to us at
       <email>glasgow-haskell-bugs@haskell.org</email>.</para>
-
     </sect2>
   </sect1>
 
@@ -676,7 +681,7 @@ x = nfib 25
       currently support mixing the <option>-hr</option> and
       <option>-hb</option> options.</para>
 
-      <para>There's one more option which relates to heap
+      <para>There are two more options which relate to heap
       profiling:</para>
 
       <variablelist>
@@ -684,7 +689,7 @@ x = nfib 25
 	  <term><Option>-i<replaceable>secs</replaceable></Option>:</Term>
 	  <indexterm><primary><option>-i</option></primary></indexterm>
 	  <listItem>
-	    <para> Set the profiling (sampling) interval to
+	    <para>Set the profiling (sampling) interval to
             <replaceable>secs</replaceable> seconds (the default is
             0.1&nbsp;second).  Fractions are allowed: for example
             <Option>-i0.2</Option> will get 5 samples per second.
@@ -692,6 +697,27 @@ x = nfib 25
             sampled on a 1/50 second frequency.</para>
 	  </listItem>
 	</varlistentry>
+
+	<varlistentry>
+	  <term><option>-xt</option></term>
+	  <indexterm><primary><option>-xt</option></primary><secondary>RTS option</secondary>
+	  </indexterm>
+	  <listitem>
+	    <para>Include the memory occupied by threads in a heap
+	    profile.  Each thread takes up a small area for its thread
+	    state in addition to the space allocated for its stack
+	    (stacks normally start small and then grow as
+	    necessary).</para>
+	    
+	    <para>This includes the main thread, so using
+	    <option>-xt</option> is a good way to see how much stack
+	    space the program is using.</para>
+
+	    <para>Memory occupied by threads and their stacks is
+	    labelled as &ldquo;TSO&rdquo; when displaying the profile
+	    by closure description or type description.</para>
+	  </listitem>
+	</varlistentry>
       </variablelist>
 
     </sect2>
@@ -838,6 +864,10 @@ x = nfib 25
       information simultaneously.</para>
     </sect2>
 
+
+
+
+
   </sect1>
 
   <sect1 id="prof-xml-tool">
@@ -1059,6 +1089,132 @@ hp2ps [flags] [&lt;file&gt;[.hp]]
 	</listItem>
       </varListEntry>
     </variableList>
+
+
+    <sect2 id="manipulating-hp">
+      <title>Manipulating the hp file</title>
+
+<para>(Notes kindly offered by Jan-Willem Maessen.)</para>
+
+<para>
+The <filename>FOO.hp</filename> file produced when you ask for the
+heap profile of a program <filename>FOO</filename> is a text file with a particularly
+simple structure. Here's a representative example, with much of the
+actual data omitted:
+<screen>
+JOB "FOO -hC"
+DATE "Thu Dec 26 18:17 2002"
+SAMPLE_UNIT "seconds"
+VALUE_UNIT "bytes"
+BEGIN_SAMPLE 0.00
+END_SAMPLE 0.00
+BEGIN_SAMPLE 15.07
+  ... sample data ...
+END_SAMPLE 15.07
+BEGIN_SAMPLE 30.23
+  ... sample data ...
+END_SAMPLE 30.23
+... etc.
+BEGIN_SAMPLE 11695.47
+END_SAMPLE 11695.47
+</screen>
+The first four lines (<literal>JOB</literal>, <literal>DATE</literal>, <literal>SAMPLE_UNIT</literal>, <literal>VALUE_UNIT</literal>) form a
+header.  Each block of lines starting with <literal>BEGIN_SAMPLE</literal> and ending
+with <literal>END_SAMPLE</literal> forms a single sample (you can think of this as a
+vertical slice of your heap profile).  The <command>hp2ps</command> utility should accept
+any input with a properly-formatted header followed by a series of
+<emphasis>complete</emphasis> samples.
+</para>
+</sect2>
+
+    <sect2>
+      <title>Zooming in on regions of your profile</title>
+
+<para>
+You can look at particular regions of your profile simply by loading a
+copy of the <filename>.hp</filename> file into a text editor and deleting the unwanted
+samples.  The resulting <filename>.hp</filename> file can be run through <command>hp2ps</command> and viewed
+or printed.
+</para>
+</sect2>
+
+    <sect2>
+      <title>Viewing the heap profile of a running program</title>
+
+<para>
+The <filename>.hp</filename> file is generated incrementally as your
+program runs.  In principle, running <command>hp2ps</command> on the incomplete file
+should produce a snapshot of your program's heap usage.  However, the
+last sample in the file may be incomplete, causing <command>hp2ps</command> to fail.  If
+you are using a machine with UNIX utilities installed, it's not too
+hard to work around this problem (though the resulting command line
+looks rather Byzantine):
+<screen>
+  head -`fgrep -n END_SAMPLE FOO.hp | tail -1 | cut -d : -f 1` FOO.hp \
+    | hp2ps > FOO.ps
+</screen>
+
+The command <command>fgrep -n END_SAMPLE FOO.hp</command> finds the
+end of every complete sample in <filename>FOO.hp</filename>, and labels each sample with
+its ending line number.  We then select the line number of the last
+complete sample using <command>tail</command> and <command>cut</command>.  This is used as a
+parameter to <command>head</command>; the result is as if we deleted the final
+incomplete sample from <filename>FOO.hp</filename>.  This results in a properly-formatted
+<filename>.hp</filename> file which we feed directly to <command>hp2ps</command>.
+</para>
+</sect2>
+    <sect2>
+      <title>Viewing a heap profile in real time</title>
+
+<para>
+The <command>gv</command> and <command>ghostview</command> programs
+have a "watch file" option that can be used to view an up-to-date heap
+profile of your program as it runs.  Simply generate an incremental
+heap profile as described in the previous section.  Run <command>gv</command> on your
+profile:
+<screen>
+  gv -watch -seascape FOO.ps 
+</screen>
+If you forget the <literal>-watch</literal> flag you can still select
+"Watch file" from the "State" menu.  Now each time you generate a new
+profile <filename>FOO.ps</filename> the view will update automatically.
+</para>
+
+<para>
+This can all be encapsulated in a little script:
+<screen>
+  #!/bin/sh
+  head -`fgrep -n END_SAMPLE FOO.hp | tail -1 | cut -d : -f 1` FOO.hp \
+    | hp2ps > FOO.ps
+  gv -watch -seascape FOO.ps &
+  while [ 1 ] ; do
+    sleep 10 # We generate a new profile every 10 seconds.
+    head -`fgrep -n END_SAMPLE FOO.hp | tail -1 | cut -d : -f 1` FOO.hp \
+      | hp2ps > FOO.ps
+  done
+</screen>
+Occasionally <command>gv</command> will choke as it tries to read an incomplete copy of
+<filename>FOO.ps</filename> (because <command>hp2ps</command> is still running as an update
+occurs).  A slightly more complicated script works around this
+problem by relying on the fact that sending <literal>SIGHUP</literal> to <command>gv</command> causes it
+to re-read its input file:
+<screen>
+  #!/bin/sh
+  head -`fgrep -n END_SAMPLE FOO.hp | tail -1 | cut -d : -f 1` FOO.hp \
+    | hp2ps > FOO.ps
+  gv FOO.ps &
+  gvpsnum=$!
+  while [ 1 ] ; do
+    sleep 10
+    head -`fgrep -n END_SAMPLE FOO.hp | tail -1 | cut -d : -f 1` FOO.hp \
+      | hp2ps > FOO.ps
+    kill -HUP $gvpsnum
+  done
+</screen>
+</para>
+</sect2>
+
+
   </sect1>
 
   <sect1 id="ticky-ticky">