1 <?xml version="1.0" encoding="iso-8859-1"?>
2 <sect1 id="runtime-control">
3 <title>Running a compiled program</title>
5 <indexterm><primary>runtime control of Haskell programs</primary></indexterm>
6 <indexterm><primary>running, compiled program</primary></indexterm>
7 <indexterm><primary>RTS options</primary></indexterm>
9 <para>To make an executable program, the GHC system compiles your
10 code and then links it with a non-trivial runtime system (RTS),
11 which handles storage management, profiling, etc.</para>
13 <para>If you use the <literal>-rtsopts</literal> flag when linking,
14 you have some control over the behaviour of the RTS, by giving
15 special command-line arguments to your program.</para>
17 <para>When your Haskell program starts up, its RTS extracts
18 command-line arguments bracketed between
<option>+RTS</option><indexterm><primary><option>+RTS</option></primary></indexterm>
and
<option>-RTS</option><indexterm><primary><option>-RTS</option></primary></indexterm>
22 as its own. For example:</para>
<screen>
% ./a.out -f +RTS -p -S -RTS -h foo bar
</screen>
28 <para>The RTS will snaffle <option>-p</option> <option>-S</option>
29 for itself, and the remaining arguments <literal>-f -h foo bar</literal>
30 will be handed to your program if/when it calls
31 <function>System.getArgs</function>.</para>
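<para>As a minimal sketch of what the program itself gets to see, the
following module prints the arguments it receives. The module is purely
illustrative and not part of GHC; it uses <function>getArgs</function>
from <literal>System.Environment</literal>.</para>

```haskell
module Main (main) where

import System.Environment (getArgs)

-- Print whatever arguments are left after the RTS has removed
-- everything bracketed by +RTS ... -RTS.
main :: IO ()
main = getArgs >>= print
```

<para>Assuming the program is linked with <literal>-rtsopts</literal>
and run as <literal>./a.out -f +RTS -S -RTS -h foo bar</literal>, it
should print <literal>["-f","-h","foo","bar"]</literal> on standard
output, while the statistics requested by <option>-S</option> go to
<constant>stderr</constant>.</para>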
33 <para>No <option>-RTS</option> option is required if the
runtime-system options extend to the end of the command line, as in
this example:</para>

<screen>
% hls -ltr /usr/etc +RTS -A5m
</screen>
41 <para>If you absolutely positively want all the rest of the options
42 in a command line to go to the program (and not the RTS), use a
<option>--RTS</option><indexterm><primary><option>--RTS</option></primary></indexterm>.</para>
45 <para>As always, for RTS options that take
<replaceable>size</replaceable>s: if the last character of
<replaceable>size</replaceable> is a K or k, multiply by 1000; if an
M or m, by 1,000,000; if a G or g, by 1,000,000,000. (And any
wraparound in the counters is <emphasis>your</emphasis>
fault!)</para>
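<para>The suffix rule can be modelled as a small pure function. This is
only a sketch of the rule as stated here, not the RTS's own parser: the
name <literal>decodeSize</literal> is hypothetical, and the sketch
accepts whole numbers only.</para>

```haskell
module Main (main) where

import Data.Char (isDigit, toLower)

-- Model of the documented suffix rule: K/k, M/m and G/g scale the
-- number by 1000, 1,000,000 and 1,000,000,000 respectively.
-- Returns Nothing for anything this sketch does not recognise.
decodeSize :: String -> Maybe Integer
decodeSize s =
  case span isDigit s of
    ("", _)   -> Nothing
    (ds, "")  -> Just (read ds)
    (ds, [c]) -> fmap (read ds *) (multiplier (toLower c))
    _         -> Nothing
  where
    multiplier 'k' = Just 1000
    multiplier 'm' = Just 1000000
    multiplier 'g' = Just 1000000000
    multiplier _   = Nothing

main :: IO ()
main = mapM_ (print . decodeSize) ["512k", "128m", "5M", "oops"]
```

<para>For instance, <literal>decodeSize "128m"</literal> evaluates to
<literal>Just 128000000</literal>.</para>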
<para>Giving a <literal>+RTS -?</literal>
<indexterm><primary><option>-?</option></primary><secondary>RTS option</secondary></indexterm> option
54 will print out the RTS options actually available in your program
55 (which vary, depending on how you compiled).</para>
57 <para>NOTE: since GHC is itself compiled by GHC, you can change RTS
58 options in the compiler using the normal
59 <literal>+RTS ... -RTS</literal>
combination. For example, to increase the maximum heap
61 size for a compilation to 128M, you would add
62 <literal>+RTS -M128m -RTS</literal>
63 to the command line.</para>
65 <sect2 id="rts-optinos-environment">
66 <title>Setting global RTS options</title>
68 <indexterm><primary>RTS options</primary><secondary>from the environment</secondary></indexterm>
69 <indexterm><primary>environment variable</primary><secondary>for
70 setting RTS options</secondary></indexterm>
72 <para>When the <literal>-rtsopts</literal> flag is used when linking,
73 RTS options are also taken from the environment variable
74 <envar>GHCRTS</envar><indexterm><primary><envar>GHCRTS</envar></primary>
75 </indexterm>. For example, to set the maximum heap size
76 to 128M for all GHC-compiled programs (using an
77 <literal>sh</literal>-like shell):</para>
<para>RTS options taken from the <envar>GHCRTS</envar> environment
variable can be overridden by options given on the command
line.</para>
90 <sect2 id="rts-options-misc">
91 <title>Miscellaneous RTS options</title>
95 <term><option>-V<replaceable>secs</replaceable></option>
96 <indexterm><primary><option>-V</option></primary><secondary>RTS
97 option</secondary></indexterm></term>
99 <para>Sets the interval that the RTS clock ticks at. The
100 runtime uses a single timer signal to count ticks; this timer
signal is used to control the context switch timer (<xref
linkend="using-concurrent" />) and the heap profiling
timer (<xref linkend="rts-options-heap-prof" />). Also, the
104 time profiler uses the RTS timer signal directly to record
105 time profiling samples.</para>
107 <para>Normally, setting the <option>-V</option> option
108 directly is not necessary: the resolution of the RTS timer is
109 adjusted automatically if a short interval is requested with
110 the <option>-C</option> or <option>-i</option> options.
111 However, setting <option>-V</option> is required in order to
112 increase the resolution of the time profiler.</para>
114 <para>Using a value of zero disables the RTS clock
115 completely, and has the effect of disabling timers that
116 depend on it: the context switch timer and the heap profiling
117 timer. Context switches will still happen, but
118 deterministically and at a rate much faster than normal.
119 Disabling the interval timer is useful for debugging, because
120 it eliminates a source of non-determinism at runtime.</para>
125 <term><option>--install-signal-handlers=<replaceable>yes|no</replaceable></option>
126 <indexterm><primary><option>--install-signal-handlers</option></primary><secondary>RTS
127 option</secondary></indexterm></term>
129 <para>If yes (the default), the RTS installs signal handlers to catch
130 things like ctrl-C. This option is primarily useful for when
131 you are using the Haskell code as a DLL, and want to set your
132 own signal handlers.</para>
137 <term><option>-xm<replaceable>address</replaceable></option>
138 <indexterm><primary><option>-xm</option></primary><secondary>RTS
139 option</secondary></indexterm></term>
142 WARNING: this option is for working around memory
143 allocation problems only. Do not use unless GHCi fails
144 with a message like “<literal>failed to mmap() memory below 2Gb</literal>”. If you need to use this option to get GHCi working
145 on your machine, please file a bug.
149 On 64-bit machines, the RTS needs to allocate memory in the
150 low 2Gb of the address space. Support for this across
151 different operating systems is patchy, and sometimes fails.
152 This option is there to give the RTS a hint about where it
153 should be able to allocate memory in the low 2Gb of the
154 address space. For example, <literal>+RTS -xm20000000
155 -RTS</literal> would hint that the RTS should allocate
156 starting at the 0.5Gb mark. The default is to use the OS's
157 built-in support for allocating memory in the low 2Gb if
158 available (e.g. <literal>mmap</literal>
159 with <literal>MAP_32BIT</literal> on Linux), or
160 otherwise <literal>-xm40000000</literal>.
167 <sect2 id="rts-options-gc">
168 <title>RTS options to control the garbage collector</title>
170 <indexterm><primary>garbage collector</primary><secondary>options</secondary></indexterm>
171 <indexterm><primary>RTS options</primary><secondary>garbage collection</secondary></indexterm>
173 <para>There are several options to give you precise control over
174 garbage collection. Hopefully, you won't need any of these in
175 normal operation, but there are several things that can be tweaked
176 for maximum performance.</para>
182 <option>-A</option><replaceable>size</replaceable>
183 <indexterm><primary><option>-A</option></primary><secondary>RTS option</secondary></indexterm>
184 <indexterm><primary>allocation area, size</primary></indexterm>
187 <para>[Default: 512k] Set the allocation area size
188 used by the garbage collector. The allocation area
(actually generation 0 step 0) has a fixed size and is never resized
190 (unless you use <option>-H</option>, below).</para>
192 <para>Increasing the allocation area size may or may not
193 give better performance (a bigger allocation area means
194 worse cache behaviour but fewer garbage collections and less
197 <para>With only 1 generation (<option>-G1</option>) the
198 <option>-A</option> option specifies the minimum allocation
199 area, since the actual size of the allocation area will be
200 resized according to the amount of data in the heap (see
201 <option>-F</option>, below).</para>
208 <indexterm><primary><option>-c</option></primary><secondary>RTS option</secondary></indexterm>
209 <indexterm><primary>garbage collection</primary><secondary>compacting</secondary></indexterm>
210 <indexterm><primary>compacting garbage collection</primary></indexterm>
213 <para>Use a compacting algorithm for collecting the oldest
214 generation. By default, the oldest generation is collected
215 using a copying algorithm; this option causes it to be
216 compacted in-place instead. The compaction algorithm is
217 slower than the copying algorithm, but the savings in memory
218 use can be considerable.</para>
220 <para>For a given heap size (using the <option>-H</option>
221 option), compaction can in fact reduce the GC cost by
222 allowing fewer GCs to be performed. This is more likely
223 when the ratio of live data to heap size is high, say
224 >30%.</para>
<para>NOTE: compaction doesn't currently work when a single
generation is requested using the <option>-G1</option>
option.</para>
233 <term><option>-c</option><replaceable>n</replaceable></term>
236 <para>[Default: 30] Automatically enable
237 compacting collection when the live data exceeds
238 <replaceable>n</replaceable>% of the maximum heap size
239 (see the <option>-M</option> option). Note that the maximum
240 heap size is unlimited by default, so this option has no
241 effect unless the maximum heap size is set with
242 <option>-M</option><replaceable>size</replaceable>. </para>
248 <option>-F</option><replaceable>factor</replaceable>
249 <indexterm><primary><option>-F</option></primary><secondary>RTS option</secondary></indexterm>
250 <indexterm><primary>heap size, factor</primary></indexterm>
254 <para>[Default: 2] This option controls the amount
255 of memory reserved for the older generations (and in the
256 case of a two space collector the size of the allocation
257 area) as a factor of the amount of live data. For example,
258 if there was 2M of live data in the oldest generation when
259 we last collected it, then by default we'll wait until it
260 grows to 4M before collecting it again.</para>
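<para>The arithmetic in this example can be written out as follows.
<literal>nextThreshold</literal> is a hypothetical helper that models
the rule described above; it is not the collector's actual sizing
code.</para>

```haskell
module Main (main) where

-- Model of the -F rule: the oldest generation is collected again once
-- it reaches factor * (live data found at its last collection).
nextThreshold :: Double -> Integer -> Integer
nextThreshold factor liveBytes = ceiling (factor * fromIntegral liveBytes)

main :: IO ()
main = print (nextThreshold 2 2000000)  -- 2M live data, default factor of 2
```

<para>With 2M of live data and the default factor this prints
<literal>4000000</literal>, matching the 4M figure above.</para>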
262 <para>The default seems to work well here. If you have
plenty of memory, it is usually better to use
<option>-H</option><replaceable>size</replaceable> than to
increase
<option>-F</option><replaceable>factor</replaceable>.</para>
<para>The garbage collector automatically reduces the
<option>-F</option> setting as the heap approaches the maximum
heap size (the <option>-M</option><replaceable>size</replaceable>
setting).</para>
277 <option>-G</option><replaceable>generations</replaceable>
278 <indexterm><primary><option>-G</option></primary><secondary>RTS option</secondary></indexterm>
279 <indexterm><primary>generations, number of</primary></indexterm>
282 <para>[Default: 2] Set the number of generations
283 used by the garbage collector. The default of 2 seems to be
284 good, but the garbage collector can support any number of
285 generations. Anything larger than about 4 is probably not a
286 good idea unless your program runs for a
287 <emphasis>long</emphasis> time, because the oldest
288 generation will hardly ever get collected.</para>
290 <para>Specifying 1 generation with <option>+RTS -G1</option>
291 gives you a simple 2-space collector, as you would expect.
292 In a 2-space collector, the <option>-A</option> option (see
293 above) specifies the <emphasis>minimum</emphasis> allocation
294 area size, since the allocation area will grow with the
295 amount of live data in the heap. In a multi-generational
296 collector the allocation area is a fixed size (unless you
297 use the <option>-H</option> option, see below).</para>
303 <option>-qg<optional><replaceable>gen</replaceable></optional></option>
<indexterm><primary><option>-qg</option></primary><secondary>RTS
option</secondary></indexterm>
<para>[New in GHC 6.12.1] [Default: 0] Use parallel GC in
generation <replaceable>gen</replaceable> and higher.
311 Omitting <replaceable>gen</replaceable> turns off the
312 parallel GC completely, reverting to sequential GC.</para>
314 <para>The default parallel GC settings are usually suitable
315 for parallel programs (i.e. those
316 using <literal>par</literal>, Strategies, or with multiple
317 threads). However, it is sometimes beneficial to enable
318 the parallel GC for a single-threaded sequential program
319 too, especially if the program has a large amount of heap
320 data and GC is a significant fraction of runtime. To use
321 the parallel GC in a sequential program, enable the
322 parallel runtime with a suitable <literal>-N</literal>
323 option, and additionally it might be beneficial to
324 restrict parallel GC to the old generation
325 with <literal>-qg1</literal>.</para>
331 <option>-qb<optional><replaceable>gen</replaceable></optional></option>
<indexterm><primary><option>-qb</option></primary><secondary>RTS
option</secondary></indexterm>
337 [New in GHC 6.12.1] [Default: 1] Use
338 load-balancing in the parallel GC in
339 generation <replaceable>gen</replaceable> and higher.
340 Omitting <replaceable>gen</replaceable> disables
341 load-balancing entirely.</para>
344 Load-balancing shares out the work of GC between the
available cores. This is a good idea when the heap is
large and we need to parallelise the GC work. However, it
is also pessimal for the short young-generation
collections in a parallel program, because it can harm
locality by moving data from the cache of the CPU where
it is being used to the cache of another CPU. Hence the
default is to do load-balancing only in the
old generation. In fact, for a parallel program it is
353 sometimes beneficial to disable load-balancing entirely
354 with <literal>-qb</literal>.
361 <option>-H</option><replaceable>size</replaceable>
362 <indexterm><primary><option>-H</option></primary><secondary>RTS option</secondary></indexterm>
363 <indexterm><primary>heap size, suggested</primary></indexterm>
366 <para>[Default: 0] This option provides a
367 “suggested heap size” for the garbage collector. The
368 garbage collector will use about this much memory until the
369 program residency grows and the heap size needs to be
370 expanded to retain reasonable performance.</para>
372 <para>By default, the heap will start small, and grow and
373 shrink as necessary. This can be bad for performance, so if
374 you have plenty of memory it's worthwhile supplying a big
375 <option>-H</option><replaceable>size</replaceable>. For
376 improving GC performance, using
377 <option>-H</option><replaceable>size</replaceable> is
378 usually a better bet than
379 <option>-A</option><replaceable>size</replaceable>.</para>
385 <option>-I</option><replaceable>seconds</replaceable>
386 <indexterm><primary><option>-I</option></primary>
387 <secondary>RTS option</secondary>
389 <indexterm><primary>idle GC</primary>
<para>[Default: 0.3] In the threaded and SMP versions of the RTS (see
394 <option>-threaded</option>, <xref linkend="options-linker" />), a
395 major GC is automatically performed if the runtime has been idle
396 (no Haskell computation has been running) for a period of time.
397 The amount of idle time which must pass before a GC is performed is
398 set by the <option>-I</option><replaceable>seconds</replaceable>
399 option. Specifying <option>-I0</option> disables the idle GC.</para>
401 <para>For an interactive application, it is probably a good idea to
402 use the idle GC, because this will allow finalizers to run and
403 deadlocked threads to be detected in the idle time when no Haskell
404 computation is happening. Also, it will mean that a GC is less
405 likely to happen when the application is busy, and so
406 responsiveness may be improved. However, if the amount of live data in
407 the heap is particularly large, then the idle GC can cause a
408 significant delay, and too small an interval could adversely affect
409 interactive responsiveness.</para>
<para>This is an experimental feature; please let us know if it
412 causes problems and/or could benefit from further tuning.</para>
418 <option>-k</option><replaceable>size</replaceable>
419 <indexterm><primary><option>-k</option></primary><secondary>RTS option</secondary></indexterm>
420 <indexterm><primary>stack, minimum size</primary></indexterm>
423 <para>[Default: 1k] Set the initial stack size for
424 new threads. Thread stacks (including the main thread's
425 stack) live on the heap, and grow as required. The default
426 value is good for concurrent applications with lots of small
427 threads; if your program doesn't fit this model then
428 increasing this option may help performance.</para>
430 <para>The main thread is normally started with a slightly
431 larger heap to cut down on unnecessary stack growth while
432 the program is starting up.</para>
438 <option>-K</option><replaceable>size</replaceable>
439 <indexterm><primary><option>-K</option></primary><secondary>RTS option</secondary></indexterm>
440 <indexterm><primary>stack, maximum size</primary></indexterm>
443 <para>[Default: 8M] Set the maximum stack size for
444 an individual thread to <replaceable>size</replaceable>
445 bytes. This option is there purely to stop the program
446 eating up all the available memory in the machine if it gets
447 into an infinite loop.</para>
453 <option>-m</option><replaceable>n</replaceable>
454 <indexterm><primary><option>-m</option></primary><secondary>RTS option</secondary></indexterm>
455 <indexterm><primary>heap, minimum free</primary></indexterm>
458 <para>Minimum % <replaceable>n</replaceable> of heap
459 which must be available for allocation. The default is
466 <option>-M</option><replaceable>size</replaceable>
467 <indexterm><primary><option>-M</option></primary><secondary>RTS option</secondary></indexterm>
468 <indexterm><primary>heap size, maximum</primary></indexterm>
471 <para>[Default: unlimited] Set the maximum heap size to
472 <replaceable>size</replaceable> bytes. The heap normally
473 grows and shrinks according to the memory requirements of
474 the program. The only reason for having this option is to
475 stop the heap growing without bound and filling up all the
available swap space, which at the least will result in the
program being summarily killed by the operating
system.</para>
480 <para>The maximum heap size also affects other garbage
481 collection parameters: when the amount of live data in the
482 heap exceeds a certain fraction of the maximum heap size,
483 compacting collection will be automatically enabled for the
484 oldest generation, and the <option>-F</option> parameter
will be reduced in order to avoid exceeding the maximum heap
size.</para>
492 <option>-t</option><optional><replaceable>file</replaceable></optional>
493 <indexterm><primary><option>-t</option></primary><secondary>RTS option</secondary></indexterm>
496 <option>-s</option><optional><replaceable>file</replaceable></optional>
497 <indexterm><primary><option>-s</option></primary><secondary>RTS option</secondary></indexterm>
500 <option>-S</option><optional><replaceable>file</replaceable></optional>
501 <indexterm><primary><option>-S</option></primary><secondary>RTS option</secondary></indexterm>
504 <option>--machine-readable</option>
505 <indexterm><primary><option>--machine-readable</option></primary><secondary>RTS option</secondary></indexterm>
508 <para>These options produce runtime-system statistics, such
509 as the amount of time spent executing the program and in the
510 garbage collector, the amount of memory allocated, the
511 maximum size of the heap, and so on. The three
512 variants give different levels of detail:
513 <option>-t</option> produces a single line of output in the
514 same format as GHC's <option>-Rghc-timing</option> option,
515 <option>-s</option> produces a more detailed summary at the
516 end of the program, and <option>-S</option> additionally
produces information about each and every garbage
collection.</para>
520 <para>The output is placed in
521 <replaceable>file</replaceable>. If
522 <replaceable>file</replaceable> is omitted, then the output
523 is sent to <constant>stderr</constant>.</para>
526 If you use the <literal>-t</literal> flag then, when your
527 program finishes, you will see something like this:
<screen>
&lt;&lt;ghc: 36169392 bytes, 69 GCs, 603392/1065272 avg/max bytes residency (2 samples), 3M in use, 0.00 INIT (0.00 elapsed), 0.02 MUT (0.02 elapsed), 0.07 GC (0.07 elapsed) :ghc&gt;&gt;
</screen>
The total number of bytes allocated by the program over the
whole run.
547 The total number of garbage collections performed.
552 The average and maximum "residency", which is the amount of
553 live data in bytes. The runtime can only determine the
554 amount of live data during a major GC, which is why the
555 number of samples corresponds to the number of major GCs
556 (and is usually relatively small). To get a better picture
557 of the heap profile of your program, use
558 the <option>-hT</option> RTS option
559 (<xref linkend="rts-profiling" />).
564 The peak memory the RTS has allocated from the OS.
569 The amount of CPU time and elapsed wall clock time while
570 initialising the runtime system (INIT), running the program
571 itself (MUT, the mutator), and garbage collecting (GC).
You can also get this in a more future-proof, machine-readable
578 format, with <literal>-t --machine-readable</literal>:
<screen>
 [("bytes allocated", "36169392")
 ,("average_bytes_used", "603392")
 ,("max_bytes_used", "1065272")
 ,("num_byte_usage_samples", "2")
 ,("peak_megabytes_allocated", "3")
 ,("init_cpu_seconds", "0.00")
 ,("init_wall_seconds", "0.00")
 ,("mutator_cpu_seconds", "0.02")
 ,("mutator_wall_seconds", "0.02")
 ,("GC_cpu_seconds", "0.07")
 ,("GC_wall_seconds", "0.07")
 ]
</screen>
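<para>Because this output has the shape of a Haskell list of string
pairs, it can be read back with the Prelude's <function>read</function>.
The following is a sketch only, assuming the statistics text has already
been captured into a string on its own. The sample string is abbreviated,
and <literal>parseStats</literal> and <literal>lookupStat</literal> are
hypothetical helper names.</para>

```haskell
module Main (main) where

-- Abbreviated sample of the machine-readable statistics.
sample :: String
sample = unlines
  [ "[(\"bytes allocated\", \"36169392\")"
  , " ,(\"max_bytes_used\", \"1065272\")"
  , " ,(\"GC_cpu_seconds\", \"0.07\")"
  , " ]"
  ]

-- The text is a valid Haskell value of type [(String, String)].
parseStats :: String -> [(String, String)]
parseStats = read

-- Look up one statistic and convert its value to a number.
lookupStat :: String -> [(String, String)] -> Maybe Double
lookupStat key stats = fmap read (lookup key stats)

main :: IO ()
main = do
  let stats = parseStats sample
  print (lookupStat "max_bytes_used" stats)
  print (lookupStat "GC_cpu_seconds" stats)
```

<para>Keeping the values as strings until they are looked up means that
keys added in later releases do not break the parse.</para>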
598 If you use the <literal>-s</literal> flag then, when your
599 program finishes, you will see something like this (the exact
600 details will vary depending on what sort of RTS you have, e.g.
you will only see profiling data if your RTS is compiled for
profiling):
606 36,169,392 bytes allocated in the heap
607 4,057,632 bytes copied during GC
608 1,065,272 bytes maximum residency (2 sample(s))
609 54,312 bytes maximum slop
610 3 MB total memory in use (0 MB lost due to fragmentation)
612 Generation 0: 67 collections, 0 parallel, 0.04s, 0.03s elapsed
613 Generation 1: 2 collections, 0 parallel, 0.03s, 0.04s elapsed
615 SPARKS: 359207 (557 converted, 149591 pruned)
617 INIT time 0.00s ( 0.00s elapsed)
618 MUT time 0.01s ( 0.02s elapsed)
619 GC time 0.07s ( 0.07s elapsed)
620 EXIT time 0.00s ( 0.00s elapsed)
621 Total time 0.08s ( 0.09s elapsed)
623 %GC time 89.5% (75.3% elapsed)
625 Alloc rate 4,520,608,923 bytes per MUT second
627 Productivity 10.5% of total user, 9.1% of total elapsed
633 The "bytes allocated in the heap" is the total bytes allocated
634 by the program over the whole run.
639 GHC uses a copying garbage collector by default. "bytes copied
during GC" tells you how many bytes it had to copy during
garbage collection.
646 The maximum space actually used by your program is the
647 "bytes maximum residency" figure. This is only checked during
648 major garbage collections, so it is only an approximation;
649 the number of samples tells you how many times it is checked.
654 The "bytes maximum slop" tells you the most space that is ever
655 wasted due to the way GHC allocates memory in blocks. Slop is
656 memory at the end of a block that was wasted. There's no way
657 to control this; we just like to see how much memory is being
663 The "total memory in use" tells you the peak memory the RTS has
664 allocated from the OS.
669 Next there is information about the garbage collections done.
670 For each generation it says how many garbage collections were
671 done, how many of those collections were done in parallel,
672 the total CPU time used for garbage collecting that generation,
and the total wall clock time elapsed while garbage collecting
that generation.
678 <para>The <literal>SPARKS</literal> statistic refers to the
679 use of <literal>Control.Parallel.par</literal> and related
680 functionality in the program. Each spark represents a call
681 to <literal>par</literal>; a spark is "converted" when it is
682 executed in parallel; and a spark is "pruned" when it is
683 found to be already evaluated and is discarded from the pool
684 by the garbage collector. Any remaining sparks are
685 discarded at the end of execution, so "converted" plus
686 "pruned" does not necessarily add up to the total.</para>
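<para>To make the notion of a spark concrete, here is a small
illustrative program. It assumes the <literal>parallel</literal> package,
which provides <literal>Control.Parallel</literal>;
<literal>pfib</literal> is a hypothetical example function and is not
tuned for real speedups.</para>

```haskell
module Main (main) where

import Control.Parallel (par, pseq)

-- Naive Fibonacci in which every recursive step calls `par` once,
-- so every step creates one spark for the value x.
pfib :: Int -> Integer
pfib 0 = 0
pfib 1 = 1
pfib n = x `par` (y `pseq` (x + y))
  where
    x = pfib (n - 1)
    y = pfib (n - 2)

main :: IO ()
main = print (pfib 25)
```

<para>If this is built with <option>-threaded</option> and run with a
suitable <option>-N</option> option together with <option>-s</option>,
the <literal>SPARKS</literal> line reports how many of these sparks were
converted or pruned. The exact counts vary from run to run.</para>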
690 Next there is the CPU time and wall clock time elapsed broken
691 down by what the runtime system was doing at the time.
692 INIT is the runtime system initialisation.
MUT is the mutator time, i.e. the time spent actually running
the program.
GC is the time spent doing garbage collection.
696 RP is the time spent doing retainer profiling.
697 PROF is the time spent doing other profiling.
698 EXIT is the runtime system shutdown time.
699 And finally, Total is, of course, the total.
%GC time tells you what percentage GC is of Total.
"Alloc rate" tells you the "bytes allocated in the heap" divided
by the MUT CPU time.
"Productivity" tells you what percentage of the Total CPU and wall
706 clock elapsed times are spent in the mutator (MUT).
712 The <literal>-S</literal> flag, as well as giving the same
713 output as the <literal>-s</literal> flag, prints information
714 about each GC as it happens:
<screen>
    Alloc    Copied     Live    GC    GC     TOT     TOT  Page Flts
    bytes     bytes     bytes  user  elap    user    elap
   528496     47728    141512  0.01  0.02    0.02    0.02    0    0  (Gen:  1)
   524944    175944   1726384  0.00  0.00    0.08    0.11    0    0  (Gen:  0)
</screen>
726 For each garbage collection, we print:
732 How many bytes we allocated this garbage collection.
737 How many bytes we copied this garbage collection.
742 How many bytes are currently live.
How long this garbage collection took (CPU time and elapsed
wall clock time).
How long the program has been running (CPU time and elapsed
wall clock time).
How many page faults occurred this garbage collection.
How many page faults occurred since the end of the last garbage
collection.
770 Which generation is being garbage collected.
782 <title>RTS options for concurrency and parallelism</title>
784 <para>The RTS options related to concurrency are described in
785 <xref linkend="using-concurrent" />, and those for parallelism in
786 <xref linkend="parallel-options"/>.</para>
789 <sect2 id="rts-profiling">
790 <title>RTS options for profiling</title>
792 <para>Most profiling runtime options are only available when you
793 compile your program for profiling (see
794 <xref linkend="prof-compiler-options" />, and
795 <xref linkend="rts-options-heap-prof" /> for the runtime options).
796 However, there is one profiling option that is available
797 for ordinary non-profiled executables:</para>
803 <indexterm><primary><option>-hT</option></primary><secondary>RTS
804 option</secondary></indexterm>
807 <para>Generates a basic heap profile, in the
808 file <literal><replaceable>prog</replaceable>.hp</literal>.
809 To produce the heap profile graph,
810 use <command>hp2ps</command> (see <xref linkend="hp2ps"
811 />). The basic heap profile is broken down by data
812 constructor, with other types of closures (functions, thunks,
813 etc.) grouped into broad categories
814 (e.g. <literal>FUN</literal>, <literal>THUNK</literal>). To
815 get a more detailed profile, use the full profiling
816 support (<xref linkend="profiling" />).</para>
822 <sect2 id="rts-eventlog">
823 <title>Tracing</title>
825 <indexterm><primary>tracing</primary></indexterm>
826 <indexterm><primary>events</primary></indexterm>
827 <indexterm><primary>eventlog files</primary></indexterm>
830 When the program is linked with the <option>-eventlog</option>
831 option (<xref linkend="options-linker" />), runtime events can
832 be logged in two ways:
838 In binary format to a file for later analysis by a
839 variety of tools. One such tool
840 is <ulink url="http://hackage.haskell.org/package/ThreadScope">ThreadScope</ulink><indexterm><primary>ThreadScope</primary></indexterm>,
841 which interprets the event log to produce a visual parallel
842 execution profile of the program.
847 As text to standard output, for debugging purposes.
855 <option>-l<optional><replaceable>flags</replaceable></optional></option>
856 <indexterm><primary><option>-l</option></primary><secondary>RTS option</secondary></indexterm>
860 Log events in binary format to the
861 file <filename><replaceable>program</replaceable>.eventlog</filename>,
862 where <replaceable>flags</replaceable> is a sequence of
863 zero or more characters indicating which kinds of events
864 to log. Currently there is only one type
865 supported: <literal>-ls</literal>, for scheduler events.
869 The format of the log file is described by the header
870 <filename>EventLogFormat.h</filename> that comes with
871 GHC, and it can be parsed in Haskell using
872 the <ulink url="http://hackage.haskell.org/package/ghc-events">ghc-events</ulink>
873 library. To dump the contents of
874 a <literal>.eventlog</literal> file as text, use the
875 tool <literal>show-ghc-events</literal> that comes with
the <ulink url="http://hackage.haskell.org/package/ghc-events">ghc-events</ulink>
package.
884 <option>-v</option><optional><replaceable>flags</replaceable></optional>
885 <indexterm><primary><option>-v</option></primary><secondary>RTS option</secondary></indexterm>
889 Log events as text to standard output, instead of to
890 the <literal>.eventlog</literal> file.
891 The <replaceable>flags</replaceable> are the same as
892 for <option>-l</option>, with the additional
option <literal>t</literal> which indicates that
each event printed should be preceded by a timestamp value
895 (in the binary <literal>.eventlog</literal> file, all
896 events are automatically associated with a timestamp).
<para>The debugging
options <option>-D<replaceable>x</replaceable></option> also
generate events which are logged using the tracing framework.
By default those events are dumped as text to stdout
(<option>-D<replaceable>x</replaceable></option>
implies <option>-v</option>), but they may instead be stored in
the binary eventlog file by using the <option>-l</option>
option.</para>
915 <sect2 id="rts-options-debugging">
916 <title>RTS options for hackers, debuggers, and over-interested
919 <indexterm><primary>RTS options, hacking/debugging</primary></indexterm>
921 <para>These RTS options might be used (a) to avoid a GHC bug,
922 (b) to see “what's really happening”, or
923 (c) because you feel like it. Not recommended for everyday
931 <indexterm><primary><option>-B</option></primary><secondary>RTS option</secondary></indexterm>
<para>Sound the bell at the start of each (major) garbage
collection.</para>
937 <para>Oddly enough, people really do use this option! Our
938 pal in Durham (England), Paul Callaghan, writes: “Some
939 people here use it for a variety of
940 purposes—honestly!—e.g., confirmation that the
941 code/machine is doing something, infinite loop detection,
942 gauging cost of recently added code. Certain people can even
943 tell what stage [the program] is in by the beep
944 pattern. But the major use is for annoying others in the
945 same office…”</para>
951 <option>-D</option><replaceable>x</replaceable>
952 <indexterm><primary>-D</primary><secondary>RTS option</secondary></indexterm>
An RTS debugging flag; only available if the program was
957 linked with the <option>-debug</option> option. Various
958 values of <replaceable>x</replaceable> are provided to
959 enable debug messages and additional runtime sanity checks
960 in different subsystems in the RTS, for
961 example <literal>+RTS -Ds -RTS</literal> enables debug
962 messages from the scheduler.
963 Use <literal>+RTS -?</literal> to find out which
964 debug flags are supported.
968 Debug messages will be sent to the binary event log file
969 instead of stdout if the <option>-l</option> option is
970 added. This might be useful for reducing the overhead of
978 <option>-r</option><replaceable>file</replaceable>
979 <indexterm><primary><option>-r</option></primary><secondary>RTS option</secondary></indexterm>
980 <indexterm><primary>ticky ticky profiling</primary></indexterm>
981 <indexterm><primary>profiling</primary><secondary>ticky ticky</secondary></indexterm>
984 <para>Produce “ticky-ticky” statistics at the
985 end of the program run (only available if the program was
986 linked with <option>-debug</option>).
987 The <replaceable>file</replaceable> argument works just as it does
988 for the <option>-S</option> RTS option, above.</para>
990 <para>For more information on ticky-ticky profiling, see
991 <xref linkend="ticky-ticky"/>.</para>
998 <indexterm><primary><option>-xc</option></primary><secondary>RTS option</secondary></indexterm>
1001 <para>(Only available when the program is compiled for
1002 profiling.) When an exception is raised in the program,
1003 this option causes the current cost-centre-stack to be
1004 dumped to <literal>stderr</literal>.</para>
1006 <para>This can be particularly useful for debugging: if your
1007 program is complaining about a <literal>head []</literal>
1008 error and you haven't got a clue which bit of code is
1009 causing it, compiling with <literal>-prof
1010 -auto-all</literal> and running with <literal>+RTS -xc
1011 -RTS</literal> will tell you exactly the call stack at the
1012 point the error was raised.</para>
1014 <para>The output contains one line for each exception raised
1015 in the program (the program might raise and catch several
1016 exceptions during its execution), where each line is of the
1020 &lt; cc<subscript>1</subscript>, ..., cc<subscript>n</subscript> &gt;
1022 <para>each <literal>cc</literal><subscript>i</subscript> is
1023 a cost centre in the program (see <xref
1024 linkend="cost-centres"/>), and the sequence represents the
1025 “call stack” at the point the exception was
1026 raised. The leftmost item is the innermost function in the
1027 call stack, and the rightmost item is the outermost
1036 <indexterm><primary><option>-Z</option></primary><secondary>RTS option</secondary></indexterm>
1039 <para>Turn <emphasis>off</emphasis> “update-frame
1040 squeezing” at garbage-collection time. (There's no
1041 particularly good reason to turn it off, except to ensure
1042 the accuracy of certain data collected regarding thunk entry
1051 <title>Linker flags to change RTS behaviour</title>
1053 <indexterm><primary>RTS behaviour, changing</primary></indexterm>
1056 GHC lets you exercise rudimentary control over the RTS settings
1057 for any given program, by using the <literal>-with-rtsopts</literal>
1058 linker flag. For example, to set <literal>-H128m -K1m</literal>,
1059 link with <literal>-with-rtsopts="-H128m -K1m"</literal>.
1064 <sect2 id="rts-hooks">
1065 <title>“Hooks” to change RTS behaviour</title>
1067 <indexterm><primary>hooks</primary><secondary>RTS</secondary></indexterm>
1068 <indexterm><primary>RTS hooks</primary></indexterm>
1069 <indexterm><primary>RTS behaviour, changing</primary></indexterm>
1071 <para>GHC lets you exercise rudimentary control over the RTS
1072 settings for any given program, by compiling in a
1073 “hook” that is called by the run-time system. The RTS
1074 contains stub definitions for all these hooks, but by writing your
1075 own version and linking it on the GHC command line, you can
1076 override the defaults.</para>
1078 <para>Owing to the vagaries of DLL linking, these hooks don't work
1079 under Windows when the program is built dynamically.</para>
1081 <para>The hook <literal>ghc_rts_opts</literal><indexterm><primary><literal>ghc_rts_opts</literal></primary>
1082 </indexterm>lets you set RTS
1083 options permanently for a given program. A common use for this is
1084 to give your program a heap and/or stack size that is
1085 larger than the RTS default. For example, to set <literal>-H128m
1086 -K1m</literal>, place the following definition in a C source
1090 char *ghc_rts_opts = "-H128m -K1m";
1093 <para>Compile the C file, and include the object file on the
1094 command line when you link your Haskell program.</para>
1096 <para>These flags are interpreted first, before any RTS flags from
1097 the <literal>GHCRTS</literal> environment variable and any flags
1098 on the command line.</para>
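<para>As a minimal sketch of the steps above, the hook can live in a
small C file of its own. The file name <filename>rts_hooks.c</filename>,
the module <filename>Main.hs</filename> and the program name
<filename>prog</filename> are placeholders chosen for illustration, and
the build commands in the comment assume a typical
<command>ghc</command> installation:</para>

```c
/*
 * rts_hooks.c (hypothetical file name)
 *
 * Overrides the RTS stub definition of ghc_rts_opts so that the program
 * starts as if it had been run with: +RTS -H128m -K1m -RTS
 *
 * Build sketch (file and program names are placeholders):
 *   ghc -c rts_hooks.c
 *   ghc --make Main.hs rts_hooks.o -o prog
 */
char *ghc_rts_opts = "-H128m -K1m";
```

<para>Because the hook is read first, a later setting of the same
option, whether from <literal>GHCRTS</literal> or from the command
line, would be expected to override the value compiled in
here.</para>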
1100 <para>You can also change the messages printed when the runtime
1101 system “blows up,” e.g., on stack overflow. The hooks
1102 for these are as follows:</para>
1108 <function>void OutOfHeapHook (unsigned long, unsigned long)</function>
1109 <indexterm><primary><function>OutOfHeapHook</function></primary></indexterm>
1112 <para>The heap-overflow message.</para>
1118 <function>void StackOverflowHook (long int)</function>
1119 <indexterm><primary><function>StackOverflowHook</function></primary></indexterm>
1122 <para>The stack-overflow message.</para>
1128 <function>void MallocFailHook (long int)</function>
1129 <indexterm><primary><function>MallocFailHook</function></primary></indexterm>
1132 <para>The message printed if <function>malloc</function>
1138 <para>For examples of the use of these hooks, see GHC's own
1139 versions in the file
1140 <filename>ghc/compiler/parser/hschooks.c</filename> in a GHC
1145 <title>Getting information about the RTS</title>
1147 <indexterm><primary>RTS</primary></indexterm>
1149 <para>It is possible to ask the RTS to give some information about
1150 itself. To do this, use the <option>--info</option> flag, e.g.</para>
1152 $ ./a.out +RTS --info
1154 ,("GHC version", "6.7")
1155 ,("RTS way", "rts_p")
1156 ,("Host platform", "x86_64-unknown-linux")
1157 ,("Host architecture", "x86_64")
1158 ,("Host OS", "linux")
1159 ,("Host vendor", "unknown")
1160 ,("Build platform", "x86_64-unknown-linux")
1161 ,("Build architecture", "x86_64")
1162 ,("Build OS", "linux")
1163 ,("Build vendor", "unknown")
1164 ,("Target platform", "x86_64-unknown-linux")
1165 ,("Target architecture", "x86_64")
1166 ,("Target OS", "linux")
1167 ,("Target vendor", "unknown")
1168 ,("Word size", "64")
1169 ,("Compiler unregisterised", "NO")
1170 ,("Tables next to code", "YES")
1173 <para>The information is formatted such that it can be read as a
1174 value of type <literal>[(String, String)]</literal>. Currently the following
1175 fields are present:</para>
1180 <term><literal>GHC RTS</literal></term>
1182 <para>Is this program linked against the GHC RTS? (always
1188 <term><literal>GHC version</literal></term>
1190 <para>The version of GHC used to compile this program.</para>
1195 <term><literal>RTS way</literal></term>
1197 <para>The variant (“way”) of the runtime. The
1198 most common values are <literal>rts</literal> (vanilla),
1199 <literal>rts_thr</literal> (threaded runtime, i.e. linked using the
1200 <literal>-threaded</literal> option) and <literal>rts_p</literal>
1201 (profiling runtime, i.e. linked using the <literal>-prof</literal>
1202 option). Other variants include <literal>debug</literal>
1203 (linked using <literal>-debug</literal>),
1204 <literal>t</literal> (ticky-ticky profiling) and
1205 <literal>dyn</literal> (the RTS is
1206 linked in dynamically, i.e. a shared library, rather than statically
1207 linked into the executable itself). These can be combined,
1208 e.g. you might have <literal>rts_thr_debug_p</literal>.</para>
1214 <literal>Target platform</literal>,
1215 <literal>Target architecture</literal>,
1216 <literal>Target OS</literal>,
1217 <literal>Target vendor</literal>
1220 <para>This is the platform the program was compiled to run on.</para>
1226 <literal>Build platform</literal>,
1227 <literal>Build architecture</literal>,
1228 <literal>Build OS</literal>,
1229 <literal>Build vendor</literal>
1232 <para>This is the platform on which the program was built.
1233 (That is, the target platform of GHC itself.) Ordinarily
1234 this is identical to the target platform. (It could potentially
1235 be different if cross-compiling.)</para>
1241 <literal>Host platform</literal>,
1242 <literal>Host architecture</literal>,
1243 <literal>Host OS</literal>,
1244 <literal>Host vendor</literal>
1247 <para>This is the platform on which GHC itself was compiled.
1248 Again, this would normally be identical to the build and
1249 target platforms.</para>
1254 <term><literal>Word size</literal></term>
1256 <para>Either <literal>"32"</literal> or <literal>"64"</literal>,
1257 reflecting the word size of the target platform.</para>
1262 <term><literal>Compiler unregisterised</literal></term>
1264 <para>Was this program compiled with an “unregisterised”
1265 version of GHC? (I.e., a version of GHC that has no platform-specific
1266 optimisations compiled in, usually because this is a currently
1267 unsupported platform.) This value will usually be no, unless you're
1268 using an experimental build of GHC.</para>
1273 <term><literal>Tables next to code</literal></term>
1275 <para>Putting info tables directly next to entry code is a useful
1276 performance optimisation that is not available on all platforms.
1277 This field tells you whether the program has been compiled with
1278 this optimisation. (Usually yes, except on unusual platforms.)</para>
1288 ;;; Local Variables: ***
1290 ;;; sgml-parent-document: ("users_guide.xml" "book" "chapter" "sect1") ***