1 <?xml version="1.0" encoding="iso-8859-1"?>
2 <sect1 id="runtime-control">
3 <title>Running a compiled program</title>
5 <indexterm><primary>runtime control of Haskell programs</primary></indexterm>
6 <indexterm><primary>running, compiled program</primary></indexterm>
7 <indexterm><primary>RTS options</primary></indexterm>
9 <para>To make an executable program, the GHC system compiles your
10 code and then links it with a non-trivial runtime system (RTS),
11 which handles storage management, profiling, etc.</para>
13 <para>If you set the <literal>-rtsopts</literal> flag appropriately when linking,
14 you have some control over the behaviour of the RTS, by giving
15 special command-line arguments to your program.</para>
17 <para>When your Haskell program starts up, its RTS extracts
18 command-line arguments bracketed between
19 <option>+RTS</option><indexterm><primary><option>+RTS</option></primary></indexterm>
21 <option>-RTS</option><indexterm><primary><option>-RTS</option></primary></indexterm>
22 as its own. For example:</para>
25 % ./a.out -f +RTS -p -S -RTS -h foo bar
28 <para>The RTS will snaffle <option>-p</option> <option>-S</option>
29 for itself, and the remaining arguments <literal>-f -h foo bar</literal>
30 will be handed to your program if/when it calls
31 <function>System.getArgs</function>.</para>
33 <para>No <option>-RTS</option> option is required if the
34 runtime-system options extend to the end of the command line, as in
38 % hls -ltr /usr/etc +RTS -A5m
41 <para>If you absolutely positively want all the rest of the options
42 in a command line to go to the program (and not the RTS), use a
43 <option>––RTS</option><indexterm><primary><option>--RTS</option></primary></indexterm>.</para>
45 <para>As always, for RTS options that take
46 <replaceable>size</replaceable>s: If the last character of
47 <replaceable>size</replaceable> is a K or k, multiply by 1000; if an
48 M or m, by 1,000,000; if a G or G, by 1,000,000,000. (And any
49 wraparound in the counters is <emphasis>your</emphasis>
52 <para>Giving a <literal>+RTS -?</literal>
53 <indexterm><primary><option>-?</option></primary><secondary>RTS option</secondary></indexterm> option
54 will print out the RTS options actually available in your program
55 (which vary, depending on how you compiled).</para>
57 <para>NOTE: since GHC is itself compiled by GHC, you can change RTS
58 options in the compiler using the normal
59 <literal>+RTS ... -RTS</literal>
60 combination. eg. to increase the maximum heap
61 size for a compilation to 128M, you would add
62 <literal>+RTS -M128m -RTS</literal>
63 to the command line.</para>
65 <sect2 id="rts-options-environment">
66 <title>Setting global RTS options</title>
68 <indexterm><primary>RTS options</primary><secondary>from the environment</secondary></indexterm>
69 <indexterm><primary>environment variable</primary><secondary>for
70 setting RTS options</secondary></indexterm>
72 <para>If the <literal>-rtsopts</literal> flag is set to
73 something other than <literal>none</literal> when linking,
74 RTS options are also taken from the environment variable
75 <envar>GHCRTS</envar><indexterm><primary><envar>GHCRTS</envar></primary>
76 </indexterm>. For example, to set the maximum heap size
77 to 128M for all GHC-compiled programs (using an
78 <literal>sh</literal>-like shell):</para>
85 <para>RTS options taken from the <envar>GHCRTS</envar> environment
86 variable can be overridden by options given on the command
91 <sect2 id="rts-options-misc">
92 <title>Miscellaneous RTS options</title>
96 <term><option>-V<replaceable>secs</replaceable></option>
97 <indexterm><primary><option>-V</option></primary><secondary>RTS
98 option</secondary></indexterm></term>
100 <para>Sets the interval that the RTS clock ticks at. The
101 runtime uses a single timer signal to count ticks; this timer
102 signal is used to control the context switch timer (<xref
103 linkend="using-concurrent" />) and the heap profiling
104 timer <xref linkend="rts-options-heap-prof" />. Also, the
105 time profiler uses the RTS timer signal directly to record
106 time profiling samples.</para>
108 <para>Normally, setting the <option>-V</option> option
109 directly is not necessary: the resolution of the RTS timer is
110 adjusted automatically if a short interval is requested with
111 the <option>-C</option> or <option>-i</option> options.
112 However, setting <option>-V</option> is required in order to
113 increase the resolution of the time profiler.</para>
115 <para>Using a value of zero disables the RTS clock
116 completely, and has the effect of disabling timers that
117 depend on it: the context switch timer and the heap profiling
118 timer. Context switches will still happen, but
119 deterministically and at a rate much faster than normal.
120 Disabling the interval timer is useful for debugging, because
121 it eliminates a source of non-determinism at runtime.</para>
126 <term><option>--install-signal-handlers=<replaceable>yes|no</replaceable></option>
127 <indexterm><primary><option>--install-signal-handlers</option></primary><secondary>RTS
128 option</secondary></indexterm></term>
130 <para>If yes (the default), the RTS installs signal handlers to catch
131 things like ctrl-C. This option is primarily useful for when
132 you are using the Haskell code as a DLL, and want to set your
133 own signal handlers.</para>
136 with <option>--install-signal-handlers=no</option>, the RTS
137 interval timer signal is still enabled. The timer signal
138 is either SIGVTALRM or SIGALRM, depending on the RTS
139 configuration and OS capabilities. To disable the timer
140 signal, use the <literal>-V0</literal> RTS option (see
147 <term><option>-xm<replaceable>address</replaceable></option>
148 <indexterm><primary><option>-xm</option></primary><secondary>RTS
149 option</secondary></indexterm></term>
152 WARNING: this option is for working around memory
153 allocation problems only. Do not use unless GHCi fails
154 with a message like “<literal>failed to mmap() memory below 2Gb</literal>”. If you need to use this option to get GHCi working
155 on your machine, please file a bug.
159 On 64-bit machines, the RTS needs to allocate memory in the
160 low 2Gb of the address space. Support for this across
161 different operating systems is patchy, and sometimes fails.
162 This option is there to give the RTS a hint about where it
163 should be able to allocate memory in the low 2Gb of the
164 address space. For example, <literal>+RTS -xm20000000
165 -RTS</literal> would hint that the RTS should allocate
166 starting at the 0.5Gb mark. The default is to use the OS's
167 built-in support for allocating memory in the low 2Gb if
168 available (e.g. <literal>mmap</literal>
169 with <literal>MAP_32BIT</literal> on Linux), or
170 otherwise <literal>-xm40000000</literal>.
177 <sect2 id="rts-options-gc">
178 <title>RTS options to control the garbage collector</title>
180 <indexterm><primary>garbage collector</primary><secondary>options</secondary></indexterm>
181 <indexterm><primary>RTS options</primary><secondary>garbage collection</secondary></indexterm>
183 <para>There are several options to give you precise control over
184 garbage collection. Hopefully, you won't need any of these in
185 normal operation, but there are several things that can be tweaked
186 for maximum performance.</para>
192 <option>-A</option><replaceable>size</replaceable>
193 <indexterm><primary><option>-A</option></primary><secondary>RTS option</secondary></indexterm>
194 <indexterm><primary>allocation area, size</primary></indexterm>
197 <para>[Default: 512k] Set the allocation area size
198 used by the garbage collector. The allocation area
199 (actually generation 0 step 0) is fixed and is never resized
200 (unless you use <option>-H</option>, below).</para>
202 <para>Increasing the allocation area size may or may not
203 give better performance (a bigger allocation area means
204 worse cache behaviour but fewer garbage collections and less
207 <para>With only 1 generation (<option>-G1</option>) the
208 <option>-A</option> option specifies the minimum allocation
209 area, since the actual size of the allocation area will be
210 resized according to the amount of data in the heap (see
211 <option>-F</option>, below).</para>
218 <indexterm><primary><option>-c</option></primary><secondary>RTS option</secondary></indexterm>
219 <indexterm><primary>garbage collection</primary><secondary>compacting</secondary></indexterm>
220 <indexterm><primary>compacting garbage collection</primary></indexterm>
223 <para>Use a compacting algorithm for collecting the oldest
224 generation. By default, the oldest generation is collected
225 using a copying algorithm; this option causes it to be
226 compacted in-place instead. The compaction algorithm is
227 slower than the copying algorithm, but the savings in memory
228 use can be considerable.</para>
230 <para>For a given heap size (using the <option>-H</option>
231 option), compaction can in fact reduce the GC cost by
232 allowing fewer GCs to be performed. This is more likely
233 when the ratio of live data to heap size is high, say
234 >30%.</para>
236 <para>NOTE: compaction doesn't currently work when a single
237 generation is requested using the <option>-G1</option>
243 <term><option>-c</option><replaceable>n</replaceable></term>
246 <para>[Default: 30] Automatically enable
247 compacting collection when the live data exceeds
248 <replaceable>n</replaceable>% of the maximum heap size
249 (see the <option>-M</option> option). Note that the maximum
250 heap size is unlimited by default, so this option has no
251 effect unless the maximum heap size is set with
252 <option>-M</option><replaceable>size</replaceable>. </para>
258 <option>-F</option><replaceable>factor</replaceable>
259 <indexterm><primary><option>-F</option></primary><secondary>RTS option</secondary></indexterm>
260 <indexterm><primary>heap size, factor</primary></indexterm>
264 <para>[Default: 2] This option controls the amount
265 of memory reserved for the older generations (and in the
266 case of a two space collector the size of the allocation
267 area) as a factor of the amount of live data. For example,
268 if there was 2M of live data in the oldest generation when
269 we last collected it, then by default we'll wait until it
270 grows to 4M before collecting it again.</para>
272 <para>The default seems to work well here. If you have
273 plenty of memory, it is usually better to use
274 <option>-H</option><replaceable>size</replaceable> than to
276 <option>-F</option><replaceable>factor</replaceable>.</para>
278 <para>The <option>-F</option> setting will be automatically
279 reduced by the garbage collector when the maximum heap size
280 (the <option>-M</option><replaceable>size</replaceable>
281 setting) is approaching.</para>
287 <option>-G</option><replaceable>generations</replaceable>
288 <indexterm><primary><option>-G</option></primary><secondary>RTS option</secondary></indexterm>
289 <indexterm><primary>generations, number of</primary></indexterm>
292 <para>[Default: 2] Set the number of generations
293 used by the garbage collector. The default of 2 seems to be
294 good, but the garbage collector can support any number of
295 generations. Anything larger than about 4 is probably not a
296 good idea unless your program runs for a
297 <emphasis>long</emphasis> time, because the oldest
298 generation will hardly ever get collected.</para>
300 <para>Specifying 1 generation with <option>+RTS -G1</option>
301 gives you a simple 2-space collector, as you would expect.
302 In a 2-space collector, the <option>-A</option> option (see
303 above) specifies the <emphasis>minimum</emphasis> allocation
304 area size, since the allocation area will grow with the
305 amount of live data in the heap. In a multi-generational
306 collector the allocation area is a fixed size (unless you
307 use the <option>-H</option> option, see below).</para>
313 <option>-qg<optional><replaceable>gen</replaceable></optional></option>
314 <indexterm><primary><option>-qg</option><secondary>RTS
315 option</secondary></primary></indexterm>
318 <para>[New in GHC 6.12.1] [Default: 0]
320 generation <replaceable>gen</replaceable> and higher.
321 Omitting <replaceable>gen</replaceable> turns off the
322 parallel GC completely, reverting to sequential GC.</para>
324 <para>The default parallel GC settings are usually suitable
325 for parallel programs (i.e. those
326 using <literal>par</literal>, Strategies, or with multiple
327 threads). However, it is sometimes beneficial to enable
328 the parallel GC for a single-threaded sequential program
329 too, especially if the program has a large amount of heap
330 data and GC is a significant fraction of runtime. To use
331 the parallel GC in a sequential program, enable the
332 parallel runtime with a suitable <literal>-N</literal>
333 option, and additionally it might be beneficial to
334 restrict parallel GC to the old generation
335 with <literal>-qg1</literal>.</para>
341 <option>-qb<optional><replaceable>gen</replaceable></optional></option>
342 <indexterm><primary><option>-qb</option><secondary>RTS
343 option</secondary></primary></indexterm>
347 [New in GHC 6.12.1] [Default: 1] Use
348 load-balancing in the parallel GC in
349 generation <replaceable>gen</replaceable> and higher.
350 Omitting <replaceable>gen</replaceable> disables
351 load-balancing entirely.</para>
354 Load-balancing shares out the work of GC between the
355 available cores. This is a good idea when the heap is
356 large and we need to parallelise the GC work, however it
357 is also pessimal for the short young-generation
358 collections in a parallel program, because it can harm
359 locality by moving data from the cache of the CPU where is
360 it being used to the cache of another CPU. Hence the
361 default is to do load-balancing only in the
362 old-generation. In fact, for a parallel program it is
363 sometimes beneficial to disable load-balancing entirely
364 with <literal>-qb</literal>.
371 <option>-H</option><replaceable>size</replaceable>
372 <indexterm><primary><option>-H</option></primary><secondary>RTS option</secondary></indexterm>
373 <indexterm><primary>heap size, suggested</primary></indexterm>
376 <para>[Default: 0] This option provides a
377 “suggested heap size” for the garbage collector. The
378 garbage collector will use about this much memory until the
379 program residency grows and the heap size needs to be
380 expanded to retain reasonable performance.</para>
382 <para>By default, the heap will start small, and grow and
383 shrink as necessary. This can be bad for performance, so if
384 you have plenty of memory it's worthwhile supplying a big
385 <option>-H</option><replaceable>size</replaceable>. For
386 improving GC performance, using
387 <option>-H</option><replaceable>size</replaceable> is
388 usually a better bet than
389 <option>-A</option><replaceable>size</replaceable>.</para>
395 <option>-I</option><replaceable>seconds</replaceable>
396 <indexterm><primary><option>-I</option></primary>
397 <secondary>RTS option</secondary>
399 <indexterm><primary>idle GC</primary>
403 <para>(default: 0.3) In the threaded and SMP versions of the RTS (see
404 <option>-threaded</option>, <xref linkend="options-linker" />), a
405 major GC is automatically performed if the runtime has been idle
406 (no Haskell computation has been running) for a period of time.
407 The amount of idle time which must pass before a GC is performed is
408 set by the <option>-I</option><replaceable>seconds</replaceable>
409 option. Specifying <option>-I0</option> disables the idle GC.</para>
411 <para>For an interactive application, it is probably a good idea to
412 use the idle GC, because this will allow finalizers to run and
413 deadlocked threads to be detected in the idle time when no Haskell
414 computation is happening. Also, it will mean that a GC is less
415 likely to happen when the application is busy, and so
416 responsiveness may be improved. However, if the amount of live data in
417 the heap is particularly large, then the idle GC can cause a
418 significant delay, and too small an interval could adversely affect
419 interactive responsiveness.</para>
421 <para>This is an experimental feature, please let us know if it
422 causes problems and/or could benefit from further tuning.</para>
428 <option>-ki</option><replaceable>size</replaceable>
429 <indexterm><primary><option>-k</option></primary><secondary>RTS option</secondary></indexterm>
430 <indexterm><primary>stack, initial size</primary></indexterm>
434 [Default: 1k] Set the initial stack size for new
435 threads. (Note: this flag used to be
436 simply <option>-k</option>, but was renamed
437 to <option>-ki</option> in GHC 7.2.1. The old name is
438 still accepted for backwards compatibility, but that may
439 be removed in a future version).
443 Thread stacks (including the main thread's stack) live on
444 the heap. As the stack grows, new stack chunks are added
445 as required; if the stack shrinks again, these extra stack
446 chunks are reclaimed by the garbage collector. The
447 default initial stack size is deliberately small, in order
448 to keep the time and space overhead for thread creation to
449 a minimum, and to make it practical to spawn threads for
450 even tiny pieces of work.
457 <option>-kc</option><replaceable>size</replaceable>
458 <indexterm><primary><option>-kc</option></primary><secondary>RTS
459 option</secondary></indexterm>
460 <indexterm><primary>stack</primary><secondary>chunk size</secondary></indexterm>
464 [Default: 32k] Set the size of “stack
465 chunks”. When a thread's current stack overflows, a
466 new stack chunk is created and added to the thread's
467 stack, until the limit set by <option>-K</option> is
472 The advantage of smaller stack chunks is that the garbage
473 collector can avoid traversing stack chunks if they are
474 known to be unmodified since the last collection, so
475 reducing the chunk size means that the garbage collector
476 can identify more stack as unmodified, and the GC overhead
477 might be reduced. On the other hand, making stack chunks
478 too small adds some overhead as there will be more
479 overflow/underflow between chunks. The default setting of
480 32k appears to be a reasonable compromise in most cases.
487 <option>-kb</option><replaceable>size</replaceable>
488 <indexterm><primary><option>-kc</option></primary><secondary>RTS
489 option</secondary></indexterm>
490 <indexterm><primary>stack</primary><secondary>chunk buffer size</secondary></indexterm>
494 [Default: 1k] Sets the stack chunk buffer size.
495 When a stack chunk overflows and a new stack chunk is
496 created, some of the data from the previous stack chunk is
497 moved into the new chunk, to avoid an immediate underflow
498 and repeated overflow/underflow at the boundary. The
499 amount of stack moved is set by the <option>-kb</option>
503 Note that to avoid wasting space, this value should
504 typically be less than 10% of the size of a stack
505 chunk (<option>-kc</option>), because in a chain of stack
506 chunks, each chunk will have a gap of unused space of this
514 <option>-K</option><replaceable>size</replaceable>
515 <indexterm><primary><option>-K</option></primary><secondary>RTS option</secondary></indexterm>
516 <indexterm><primary>stack, maximum size</primary></indexterm>
519 <para>[Default: 8M] Set the maximum stack size for
520 an individual thread to <replaceable>size</replaceable>
521 bytes. If the thread attempts to exceed this limit, it will
522 be send the <literal>StackOverflow</literal> exception.
525 This option is there mainly to stop the program eating up
526 all the available memory in the machine if it gets into an
534 <option>-m</option><replaceable>n</replaceable>
535 <indexterm><primary><option>-m</option></primary><secondary>RTS option</secondary></indexterm>
536 <indexterm><primary>heap, minimum free</primary></indexterm>
539 <para>Minimum % <replaceable>n</replaceable> of heap
540 which must be available for allocation. The default is
547 <option>-M</option><replaceable>size</replaceable>
548 <indexterm><primary><option>-M</option></primary><secondary>RTS option</secondary></indexterm>
549 <indexterm><primary>heap size, maximum</primary></indexterm>
552 <para>[Default: unlimited] Set the maximum heap size to
553 <replaceable>size</replaceable> bytes. The heap normally
554 grows and shrinks according to the memory requirements of
555 the program. The only reason for having this option is to
556 stop the heap growing without bound and filling up all the
557 available swap space, which at the least will result in the
558 program being summarily killed by the operating
561 <para>The maximum heap size also affects other garbage
562 collection parameters: when the amount of live data in the
563 heap exceeds a certain fraction of the maximum heap size,
564 compacting collection will be automatically enabled for the
565 oldest generation, and the <option>-F</option> parameter
566 will be reduced in order to avoid exceeding the maximum heap
573 <option>-t</option><optional><replaceable>file</replaceable></optional>
574 <indexterm><primary><option>-t</option></primary><secondary>RTS option</secondary></indexterm>
577 <option>-s</option><optional><replaceable>file</replaceable></optional>
578 <indexterm><primary><option>-s</option></primary><secondary>RTS option</secondary></indexterm>
581 <option>-S</option><optional><replaceable>file</replaceable></optional>
582 <indexterm><primary><option>-S</option></primary><secondary>RTS option</secondary></indexterm>
585 <option>--machine-readable</option>
586 <indexterm><primary><option>--machine-readable</option></primary><secondary>RTS option</secondary></indexterm>
589 <para>These options produce runtime-system statistics, such
590 as the amount of time spent executing the program and in the
591 garbage collector, the amount of memory allocated, the
592 maximum size of the heap, and so on. The three
593 variants give different levels of detail:
594 <option>-t</option> produces a single line of output in the
595 same format as GHC's <option>-Rghc-timing</option> option,
596 <option>-s</option> produces a more detailed summary at the
597 end of the program, and <option>-S</option> additionally
598 produces information about each and every garbage
601 <para>The output is placed in
602 <replaceable>file</replaceable>. If
603 <replaceable>file</replaceable> is omitted, then the output
604 is sent to <constant>stderr</constant>.</para>
607 If you use the <literal>-t</literal> flag then, when your
608 program finishes, you will see something like this:
612 <<ghc: 36169392 bytes, 69 GCs, 603392/1065272 avg/max bytes residency (2 samples), 3M in use, 0.00 INIT (0.00 elapsed), 0.02 MUT (0.02 elapsed), 0.07 GC (0.07 elapsed) :ghc>>
622 The total number of bytes allocated by the program over the
628 The total number of garbage collections performed.
633 The average and maximum "residency", which is the amount of
634 live data in bytes. The runtime can only determine the
635 amount of live data during a major GC, which is why the
636 number of samples corresponds to the number of major GCs
637 (and is usually relatively small). To get a better picture
638 of the heap profile of your program, use
639 the <option>-hT</option> RTS option
640 (<xref linkend="rts-profiling" />).
645 The peak memory the RTS has allocated from the OS.
650 The amount of CPU time and elapsed wall clock time while
651 initialising the runtime system (INIT), running the program
652 itself (MUT, the mutator), and garbage collecting (GC).
658 You can also get this in a more future-proof, machine readable
659 format, with <literal>-t --machine-readable</literal>:
663 [("bytes allocated", "36169392")
665 ,("average_bytes_used", "603392")
666 ,("max_bytes_used", "1065272")
667 ,("num_byte_usage_samples", "2")
668 ,("peak_megabytes_allocated", "3")
669 ,("init_cpu_seconds", "0.00")
670 ,("init_wall_seconds", "0.00")
671 ,("mutator_cpu_seconds", "0.02")
672 ,("mutator_wall_seconds", "0.02")
673 ,("GC_cpu_seconds", "0.07")
674 ,("GC_wall_seconds", "0.07")
679 If you use the <literal>-s</literal> flag then, when your
680 program finishes, you will see something like this (the exact
681 details will vary depending on what sort of RTS you have, e.g.
682 you will only see profiling data if your RTS is compiled for
687 36,169,392 bytes allocated in the heap
688 4,057,632 bytes copied during GC
689 1,065,272 bytes maximum residency (2 sample(s))
690 54,312 bytes maximum slop
691 3 MB total memory in use (0 MB lost due to fragmentation)
693 Generation 0: 67 collections, 0 parallel, 0.04s, 0.03s elapsed
694 Generation 1: 2 collections, 0 parallel, 0.03s, 0.04s elapsed
696 SPARKS: 359207 (557 converted, 149591 pruned)
698 INIT time 0.00s ( 0.00s elapsed)
699 MUT time 0.01s ( 0.02s elapsed)
700 GC time 0.07s ( 0.07s elapsed)
701 EXIT time 0.00s ( 0.00s elapsed)
702 Total time 0.08s ( 0.09s elapsed)
704 %GC time 89.5% (75.3% elapsed)
706 Alloc rate 4,520,608,923 bytes per MUT second
708 Productivity 10.5% of total user, 9.1% of total elapsed
714 The "bytes allocated in the heap" is the total bytes allocated
715 by the program over the whole run.
720 GHC uses a copying garbage collector by default. "bytes copied
721 during GC" tells you how many bytes it had to copy during
727 The maximum space actually used by your program is the
728 "bytes maximum residency" figure. This is only checked during
729 major garbage collections, so it is only an approximation;
730 the number of samples tells you how many times it is checked.
735 The "bytes maximum slop" tells you the most space that is ever
736 wasted due to the way GHC allocates memory in blocks. Slop is
737 memory at the end of a block that was wasted. There's no way
738 to control this; we just like to see how much memory is being
744 The "total memory in use" tells you the peak memory the RTS has
745 allocated from the OS.
750 Next there is information about the garbage collections done.
751 For each generation it says how many garbage collections were
752 done, how many of those collections were done in parallel,
753 the total CPU time used for garbage collecting that generation,
754 and the total wall clock time elapsed while garbage collecting
759 <para>The <literal>SPARKS</literal> statistic refers to the
760 use of <literal>Control.Parallel.par</literal> and related
761 functionality in the program. Each spark represents a call
762 to <literal>par</literal>; a spark is "converted" when it is
763 executed in parallel; and a spark is "pruned" when it is
764 found to be already evaluated and is discarded from the pool
765 by the garbage collector. Any remaining sparks are
766 discarded at the end of execution, so "converted" plus
767 "pruned" does not necessarily add up to the total.</para>
771 Next there is the CPU time and wall clock time elapsed broken
772 down by what the runtime system was doing at the time.
773 INIT is the runtime system initialisation.
774 MUT is the mutator time, i.e. the time spent actually running
776 GC is the time spent doing garbage collection.
777 RP is the time spent doing retainer profiling.
778 PROF is the time spent doing other profiling.
779 EXIT is the runtime system shutdown time.
780 And finally, Total is, of course, the total.
783 %GC time tells you what percentage GC is of Total.
784 "Alloc rate" tells you the "bytes allocated in the heap" divided
786 "Productivity" tells you what percentage of the Total CPU and wall
787 clock elapsed times are spent in the mutator (MUT).
793 The <literal>-S</literal> flag, as well as giving the same
794 output as the <literal>-s</literal> flag, prints information
795 about each GC as it happens:
799 Alloc Copied Live GC GC TOT TOT Page Flts
800 bytes bytes bytes user elap user elap
801 528496 47728 141512 0.01 0.02 0.02 0.02 0 0 (Gen: 1)
803 524944 175944 1726384 0.00 0.00 0.08 0.11 0 0 (Gen: 0)
807 For each garbage collection, we print:
813 How many bytes we allocated this garbage collection.
818 How many bytes we copied this garbage collection.
823 How many bytes are currently live.
828 How long this garbage collection took (CPU time and elapsed
834 How long the program has been running (CPU time and elapsed
840 How many page faults occured this garbage collection.
845 How many page faults occured since the end of the last garbage
851 Which generation is being garbage collected.
863 <title>RTS options for concurrency and parallelism</title>
865 <para>The RTS options related to concurrency are described in
866 <xref linkend="using-concurrent" />, and those for parallelism in
867 <xref linkend="parallel-options"/>.</para>
870 <sect2 id="rts-profiling">
871 <title>RTS options for profiling</title>
873 <para>Most profiling runtime options are only available when you
874 compile your program for profiling (see
875 <xref linkend="prof-compiler-options" />, and
876 <xref linkend="rts-options-heap-prof" /> for the runtime options).
877 However, there is one profiling option that is available
878 for ordinary non-profiled executables:</para>
884 <indexterm><primary><option>-hT</option></primary><secondary>RTS
885 option</secondary></indexterm>
888 <para>Generates a basic heap profile, in the
889 file <literal><replaceable>prog</replaceable>.hp</literal>.
890 To produce the heap profile graph,
891 use <command>hp2ps</command> (see <xref linkend="hp2ps"
892 />). The basic heap profile is broken down by data
893 constructor, with other types of closures (functions, thunks,
894 etc.) grouped into broad categories
895 (e.g. <literal>FUN</literal>, <literal>THUNK</literal>). To
896 get a more detailed profile, use the full profiling
897 support (<xref linkend="profiling" />).</para>
903 <sect2 id="rts-eventlog">
904 <title>Tracing</title>
906 <indexterm><primary>tracing</primary></indexterm>
907 <indexterm><primary>events</primary></indexterm>
908 <indexterm><primary>eventlog files</primary></indexterm>
911 When the program is linked with the <option>-eventlog</option>
912 option (<xref linkend="options-linker" />), runtime events can
913 be logged in two ways:
919 In binary format to a file for later analysis by a
920 variety of tools. One such tool
921 is <ulink url="http://hackage.haskell.org/package/ThreadScope">ThreadScope</ulink><indexterm><primary>ThreadScope</primary></indexterm>,
922 which interprets the event log to produce a visual parallel
923 execution profile of the program.
928 As text to standard output, for debugging purposes.
936 <option>-l<optional><replaceable>flags</replaceable></optional></option>
937 <indexterm><primary><option>-l</option></primary><secondary>RTS option</secondary></indexterm>
941 Log events in binary format to the
942 file <filename><replaceable>program</replaceable>.eventlog</filename>,
943 where <replaceable>flags</replaceable> is a sequence of
944 zero or more characters indicating which kinds of events
945 to log. Currently there is only one type
946 supported: <literal>-ls</literal>, for scheduler events.
950 The format of the log file is described by the header
951 <filename>EventLogFormat.h</filename> that comes with
952 GHC, and it can be parsed in Haskell using
953 the <ulink url="http://hackage.haskell.org/package/ghc-events">ghc-events</ulink>
954 library. To dump the contents of
955 a <literal>.eventlog</literal> file as text, use the
956 tool <literal>show-ghc-events</literal> that comes with
957 the <ulink url="http://hackage.haskell.org/package/ghc-events">ghc-events</ulink>
965 <option>-v</option><optional><replaceable>flags</replaceable></optional>
966 <indexterm><primary><option>-v</option></primary><secondary>RTS option</secondary></indexterm>
970 Log events as text to standard output, instead of to
971 the <literal>.eventlog</literal> file.
972 The <replaceable>flags</replaceable> are the same as
973 for <option>-l</option>, with the additional
974 option <literal>t</literal> which indicates that the
975 each event printed should be preceded by a timestamp value
976 (in the binary <literal>.eventlog</literal> file, all
977 events are automatically associated with a timestamp).
986 options <option>-D<replaceable>x</replaceable></option> also
987 generate events which are logged using the tracing framework.
988 By default those events are dumped as text to stdout
989 (<option>-D<replaceable>x</replaceable></option>
990 implies <option>-v</option>), but they may instead be stored in
991 the binary eventlog file by using the <option>-l</option>
996 <sect2 id="rts-options-debugging">
997 <title>RTS options for hackers, debuggers, and over-interested
1000 <indexterm><primary>RTS options, hacking/debugging</primary></indexterm>
1002 <para>These RTS options might be used (a) to avoid a GHC bug,
1003 (b) to see “what's really happening”, or
1004 (c) because you feel like it. Not recommended for everyday
1012 <indexterm><primary><option>-B</option></primary><secondary>RTS option</secondary></indexterm>
1015 <para>Sound the bell at the start of each (major) garbage
1018 <para>Oddly enough, people really do use this option! Our
1019 pal in Durham (England), Paul Callaghan, writes: “Some
1020 people here use it for a variety of
1021 purposes—honestly!—e.g., confirmation that the
1022 code/machine is doing something, infinite loop detection,
1023 gauging cost of recently added code. Certain people can even
1024 tell what stage [the program] is in by the beep
1025 pattern. But the major use is for annoying others in the
1026 same office…”</para>
1032 <option>-D</option><replaceable>x</replaceable>
1033 <indexterm><primary>-D</primary><secondary>RTS option</secondary></indexterm>
1037 An RTS debugging flag; only availble if the program was
1038 linked with the <option>-debug</option> option. Various
1039 values of <replaceable>x</replaceable> are provided to
1040 enable debug messages and additional runtime sanity checks
1041 in different subsystems in the RTS, for
1042 example <literal>+RTS -Ds -RTS</literal> enables debug
1043 messages from the scheduler.
1044 Use <literal>+RTS -?</literal> to find out which
1045 debug flags are supported.
1049 Debug messages will be sent to the binary event log file
1050 instead of stdout if the <option>-l</option> option is
1051 added. This might be useful for reducing the overhead of
1059 <option>-r</option><replaceable>file</replaceable>
1060 <indexterm><primary><option>-r</option></primary><secondary>RTS option</secondary></indexterm>
1061 <indexterm><primary>ticky ticky profiling</primary></indexterm>
1062 <indexterm><primary>profiling</primary><secondary>ticky ticky</secondary></indexterm>
1065 <para>Produce “ticky-ticky” statistics at the
1066 end of the program run (only available if the program was
1067 linked with <option>-debug</option>).
1068 The <replaceable>file</replaceable> business works just like
1069 on the <option>-S</option> RTS option, above.</para>
1071 <para>For more information on ticky-ticky profiling, see
1072 <xref linkend="ticky-ticky"/>.</para>
1078 <option>-xc</option>
1079 <indexterm><primary><option>-xc</option></primary><secondary>RTS option</secondary></indexterm>
1082 <para>(Only available when the program is compiled for
1083 profiling.) When an exception is raised in the program,
1084 this option causes the current cost-centre-stack to be
1085 dumped to <literal>stderr</literal>.</para>
1087 <para>This can be particularly useful for debugging: if your
1088 program is complaining about a <literal>head []</literal>
1089 error and you haven't got a clue which bit of code is
1090 causing it, compiling with <literal>-prof
1091 -auto-all</literal> and running with <literal>+RTS -xc
1092 -RTS</literal> will tell you exactly the call stack at the
1093 point the error was raised.</para>
1095 <para>The output contains one line for each exception raised
1096 in the program (the program might raise and catch several
1097 exceptions during its execution), where each line is of the
1101 < cc<subscript>1</subscript>, ..., cc<subscript>n</subscript> >
1103 <para>each <literal>cc</literal><subscript>i</subscript> is
1104 a cost centre in the program (see <xref
1105 linkend="cost-centres"/>), and the sequence represents the
1106 “call stack” at the point the exception was
1107 raised. The leftmost item is the innermost function in the
1108 call stack, and the rightmost item is the outermost
1117 <indexterm><primary><option>-Z</option></primary><secondary>RTS option</secondary></indexterm>
1120 <para>Turn <emphasis>off</emphasis> “update-frame
1121 squeezing” at garbage-collection time. (There's no
1122 particularly good reason to turn it off, except to ensure
1123 the accuracy of certain data collected regarding thunk entry
1132 <title>Linker flags to change RTS behaviour</title>
1134 <indexterm><primary>RTS behaviour, changing</primary></indexterm>
1137 GHC lets you exercise rudimentary control over the RTS settings
1138 for any given program, by using the <literal>-with-rtsopts</literal>
1139 linker flag. For example, to set <literal>-H128m -K1m</literal>,
1140 link with <literal>-with-rtsopts="-H128m -K1m"</literal>.
1145 <sect2 id="rts-hooks">
1146 <title>“Hooks” to change RTS behaviour</title>
1148 <indexterm><primary>hooks</primary><secondary>RTS</secondary></indexterm>
1149 <indexterm><primary>RTS hooks</primary></indexterm>
1150 <indexterm><primary>RTS behaviour, changing</primary></indexterm>
1152 <para>GHC lets you exercise rudimentary control over the RTS
1153 settings for any given program, by compiling in a
1154 “hook” that is called by the run-time system. The RTS
1155 contains stub definitions for all these hooks, but by writing your
1156 own version and linking it on the GHC command line, you can
1157 override the defaults.</para>
1159 <para>Owing to the vagaries of DLL linking, these hooks don't work
1160 under Windows when the program is built dynamically.</para>
1162 <para>The hook <literal>ghc_rts_opts</literal><indexterm><primary><literal>ghc_rts_opts</literal></primary>
1163 </indexterm>lets you set RTS
1164 options permanently for a given program, in the same way as the
1165 newer <option>-with-rtsopts</option> linker option does. A common use for this is
1166 to give your program a default heap and/or stack size that is
1167 greater than the default. For example, to set <literal>-H128m
1168 -K1m</literal>, place the following definition in a C source
1172 char *ghc_rts_opts = "-H128m -K1m";
1175 <para>Compile the C file, and include the object file on the
1176 command line when you link your Haskell program.</para>
1178 <para>These flags are interpreted first, before any RTS flags from
1179 the <literal>GHCRTS</literal> environment variable and any flags
1180 on the command line.</para>
1182 <para>You can also change the messages printed when the runtime
1183 system “blows up,” e.g., on stack overflow. The hooks
1184 for these are as follows:</para>
1190 <function>void OutOfHeapHook (unsigned long, unsigned long)</function>
1191 <indexterm><primary><function>OutOfHeapHook</function></primary></indexterm>
1194 <para>The heap-overflow message.</para>
1200 <function>void StackOverflowHook (long int)</function>
1201 <indexterm><primary><function>StackOverflowHook</function></primary></indexterm>
1204 <para>The stack-overflow message.</para>
1210 <function>void MallocFailHook (long int)</function>
1211 <indexterm><primary><function>MallocFailHook</function></primary></indexterm>
1214 <para>The message printed if <function>malloc</function>
1220 <para>For examples of the use of these hooks, see GHC's own
1221 versions in the file
1222 <filename>ghc/compiler/parser/hschooks.c</filename> in a GHC
1227 <title>Getting information about the RTS</title>
1229 <indexterm><primary>RTS</primary></indexterm>
1231 <para>It is possible to ask the RTS to give some information about
1232 itself. To do this, use the <option>--info</option> flag, e.g.</para>
1234 $ ./a.out +RTS --info
1236 ,("GHC version", "6.7")
1237 ,("RTS way", "rts_p")
1238 ,("Host platform", "x86_64-unknown-linux")
1239 ,("Host architecture", "x86_64")
1240 ,("Host OS", "linux")
1241 ,("Host vendor", "unknown")
1242 ,("Build platform", "x86_64-unknown-linux")
1243 ,("Build architecture", "x86_64")
1244 ,("Build OS", "linux")
1245 ,("Build vendor", "unknown")
1246 ,("Target platform", "x86_64-unknown-linux")
1247 ,("Target architecture", "x86_64")
1248 ,("Target OS", "linux")
1249 ,("Target vendor", "unknown")
1250 ,("Word size", "64")
1251 ,("Compiler unregisterised", "NO")
1252 ,("Tables next to code", "YES")
1255 <para>The information is formatted such that it can be read as a
1256 of type <literal>[(String, String)]</literal>. Currently the following
1257 fields are present:</para>
1262 <term><literal>GHC RTS</literal></term>
1264 <para>Is this program linked against the GHC RTS? (always
1270 <term><literal>GHC version</literal></term>
1272 <para>The version of GHC used to compile this program.</para>
1277 <term><literal>RTS way</literal></term>
1279 <para>The variant (“way”) of the runtime. The
1280 most common values are <literal>rts</literal> (vanilla),
1281 <literal>rts_thr</literal> (threaded runtime, i.e. linked using the
1282 <literal>-threaded</literal> option) and <literal>rts_p</literal>
1283 (profiling runtime, i.e. linked using the <literal>-prof</literal>
1284 option). Other variants include <literal>debug</literal>
1285 (linked using <literal>-debug</literal>),
1286 <literal>t</literal> (ticky-ticky profiling) and
1287 <literal>dyn</literal> (the RTS is
1288 linked in dynamically, i.e. a shared library, rather than statically
1289 linked into the executable itself). These can be combined,
1290 e.g. you might have <literal>rts_thr_debug_p</literal>.</para>
1296 <literal>Target platform</literal>,
1297 <literal>Target architecture</literal>,
1298 <literal>Target OS</literal>,
1299 <literal>Target vendor</literal>
1302 <para>These are the platform the program is compiled to run on.</para>
1308 <literal>Build platform</literal>,
1309 <literal>Build architecture</literal>,
1310 <literal>Build OS</literal>,
1311 <literal>Build vendor</literal>
1314 <para>These are the platform where the program was built
1315 on. (That is, the target platform of GHC itself.) Ordinarily
1316 this is identical to the target platform. (It could potentially
1317 be different if cross-compiling.)</para>
1323 <literal>Host platform</literal>,
1324 <literal>Host architecture</literal>
1325 <literal>Host OS</literal>
1326 <literal>Host vendor</literal>
1329 <para>These are the platform where GHC itself was compiled.
1330 Again, this would normally be identical to the build and
1331 target platforms.</para>
1336 <term><literal>Word size</literal></term>
1338 <para>Either <literal>"32"</literal> or <literal>"64"</literal>,
1339 reflecting the word size of the target platform.</para>
1344 <term><literal>Compiler unregistered</literal></term>
1346 <para>Was this program compiled with an “unregistered”
1347 version of GHC? (I.e., a version of GHC that has no platform-specific
1348 optimisations compiled in, usually because this is a currently
1349 unsupported platform.) This value will usually be no, unless you're
1350 using an experimental build of GHC.</para>
1355 <term><literal>Tables next to code</literal></term>
1357 <para>Putting info tables directly next to entry code is a useful
1358 performance optimisation that is not available on all platforms.
1359 This field tells you whether the program has been compiled with
1360 this optimisation. (Usually yes, except on unusual platforms.)</para>
1370 ;;; Local Variables: ***
1371 ;;; sgml-parent-document: ("users_guide.xml" "book" "chapter" "sect1") ***