X-Git-Url: http://git.megacz.com/?a=blobdiff_plain;f=docs%2Fusers_guide%2Fruntime_control.xml;h=22ca59dcd25b1b1f0551bc427d8c00828b69072c;hb=4df9f0ee56ae232e1cf2f9531205af0dd916b496;hp=69e26bc9208ea67c35b1a81d01e065f05a459324;hpb=9ee105bd1f977a8fd68281e7658383d5a0c86156;p=ghc-hetmet.git diff --git a/docs/users_guide/runtime_control.xml b/docs/users_guide/runtime_control.xml index 69e26bc..22ca59d 100644 --- a/docs/users_guide/runtime_control.xml +++ b/docs/users_guide/runtime_control.xml @@ -10,7 +10,8 @@ code and then links it with a non-trivial runtime system (RTS), which handles storage management, profiling, etc. - You have some control over the behaviour of the RTS, by giving + If you set the -rtsopts flag appropriately when linking, + you have some control over the behaviour of the RTS, by giving special command-line arguments to your program. When your Haskell program starts up, its RTS extracts @@ -48,8 +49,8 @@ wraparound in the counters is your fault!) - Giving a +RTS -f - RTS option option + Giving a +RTS -? + RTS option option will print out the RTS options actually available in your program (which vary, depending on how you compiled). @@ -61,14 +62,16 @@ +RTS -M128m -RTS to the command line. - + Setting global RTS options RTS optionsfrom the environment environment variablefor setting RTS options - RTS options are also taken from the environment variable + If the -rtsopts flag is set to + something other than none when linking, + RTS options are also taken from the environment variable GHCRTSGHCRTS . For example, to set the maximum heap size to 128M for all GHC-compiled programs (using an @@ -128,6 +131,15 @@ things like ctrl-C. This option is primarily useful for when you are using the Haskell code as a DLL, and want to set your own signal handlers. + + Note that even + with , the RTS + interval timer signal is still enabled. The timer signal + is either SIGVTALRM or SIGALRM, depending on the RTS + configuration and OS capabilities. 
To disable the timer + signal, use the -V0 RTS option (see + above). + @@ -298,51 +310,58 @@ - - RTS + + RTS option - [New in GHC 6.12.1] Disable the parallel GC. - The parallel GC is turned on automatically when parallel - execution is enabled with the option; - this option is available to turn it off if - necessary. + [New in GHC 6.12.1] [Default: 0] + Use parallel GC in + generation gen and higher. + Omitting gen turns off the + parallel GC completely, reverting to sequential GC. - Experiments have shown that parallel GC usually - results in a performance improvement given 3 cores or - more; with 2 cores it may or may not be beneficial, - depending on the workload. Bigger heaps work better with - parallel GC, so set your value high (3 - or more times the maximum residency). Look at the timing - stats with to see whether you're - getting any benefit from parallel GC or not. If you find - parallel GC is significantly slower - (in elapsed time) than sequential GC, please report it as - a bug. - - In GHC 6.10.1 it was possible to use a different - number of threads for GC than for execution, because the GC - used its own pool of threads. Now, the GC uses the same - threads as the mutator (for executing the program). + The default parallel GC settings are usually suitable + for parallel programs (i.e. those + using par, Strategies, or with multiple + threads). However, it is sometimes beneficial to enable + the parallel GC for a single-threaded sequential program + too, especially if the program has a large amount of heap + data and GC is a significant fraction of runtime. To use + the parallel GC in a sequential program, enable the + parallel runtime with a suitable -N + option, and additionally it might be beneficial to + restrict parallel GC to the old generation + with -qg1. - - RTS + + RTS option - [Default: 1] [New in GHC 6.12.1] - Enable the parallel GC only in - generation n and greater. 
- Parallel GC is often not worthwhile for collections in - generation 0 (the young generation), so it is enabled by - default only for collections in generation 1 (and higher, - if applicable). + [New in GHC 6.12.1] [Default: 1] Use + load-balancing in the parallel GC in + generation gen and higher. + Omitting gen disables + load-balancing entirely. + + + Load-balancing shares out the work of GC between the + available cores. This is a good idea when the heap is + large and we need to parallelise the GC work, however it + is also pessimal for the short young-generation + collections in a parallel program, because it can harm + locality by moving data from the cache of the CPU where it + is being used to the cache of another CPU. Hence the + default is to do load-balancing only in the + old-generation. In fact, for a parallel program it is + sometimes beneficial to disable load-balancing entirely + with -qb. @@ -406,22 +425,88 @@ - size + size RTS option - stack, minimum size + stack, initial size - [Default: 1k] Set the initial stack size for - new threads. Thread stacks (including the main thread's - stack) live on the heap, and grow as required. The default - value is good for concurrent applications with lots of small - threads; if your program doesn't fit this model then - increasing this option may help performance. - - The main thread is normally started with a slightly - larger heap to cut down on unnecessary stack growth while - the program is starting up. - + + [Default: 1k] Set the initial stack size for new + threads. (Note: this flag used to be + simply , but was renamed + to in GHC 7.2.1. The old name is + still accepted for backwards compatibility, but that may + be removed in a future version). + + + + Thread stacks (including the main thread's stack) live on + the heap. As the stack grows, new stack chunks are added + as required; if the stack shrinks again, these extra stack + chunks are reclaimed by the garbage collector.
The + default initial stack size is deliberately small, in order + to keep the time and space overhead for thread creation to + a minimum, and to make it practical to spawn threads for + even tiny pieces of work. + + + + + + + size + RTS + option + stackchunk size + + + + [Default: 32k] Set the size of “stack + chunks”. When a thread's current stack overflows, a + new stack chunk is created and added to the thread's + stack, until the limit set by is + reached. + + + + The advantage of smaller stack chunks is that the garbage + collector can avoid traversing stack chunks if they are + known to be unmodified since the last collection, so + reducing the chunk size means that the garbage collector + can identify more stack as unmodified, and the GC overhead + might be reduced. On the other hand, making stack chunks + too small adds some overhead as there will be more + overflow/underflow between chunks. The default setting of + 32k appears to be a reasonable compromise in most cases. + + + + + + + size + RTS + option + stackchunk buffer size + + + + [Default: 1k] Sets the stack chunk buffer size. + When a stack chunk overflows and a new stack chunk is + created, some of the data from the previous stack chunk is + moved into the new chunk, to avoid an immediate underflow + and repeated overflow/underflow at the boundary. The + amount of stack moved is set by the + option. + + + Note that to avoid wasting space, this value should + typically be less than 10% of the size of a stack + chunk (), because in a chain of stack + chunks, each chunk will have a gap of unused space of this + size. + + @@ -433,9 +518,14 @@ [Default: 8M] Set the maximum stack size for an individual thread to size - bytes. This option is there purely to stop the program - eating up all the available memory in the machine if it gets - into an infinite loop. + bytes. If the thread attempts to exceed this limit, it will + be sent the StackOverflow exception.
+ + + This option is there mainly to stop the program eating up + all the available memory in the machine if it gets into an + infinite loop. + + @@ -810,6 +900,99 @@ + + Tracing + + tracing + events + eventlog files + + + When the program is linked with the + option (), runtime events can + be logged in two ways: + + + + + + In binary format to a file for later analysis by a + variety of tools. One such tool + is ThreadScopeThreadScope, + which interprets the event log to produce a visual parallel + execution profile of the program. + + + + + As text to standard output, for debugging purposes. + + + + + + + + + RTS option + + + + Log events in binary format to the + file program.eventlog, + where flags is a sequence of + zero or more characters indicating which kinds of events + to log. Currently there is only one type + supported: -ls, for scheduler events. + + + + The format of the log file is described by the header + EventLogFormat.h that comes with + GHC, and it can be parsed in Haskell using + the ghc-events + library. To dump the contents of + a .eventlog file as text, use the + tool show-ghc-events that comes with + the ghc-events + package. + + + + + + + flags + RTS option + + + + Log events as text to standard output, instead of to + the .eventlog file. + The flags are the same as + for , with the additional + option t which indicates that + each event printed should be preceded by a timestamp value + (in the binary .eventlog file, all + events are automatically associated with a timestamp). + + + + + + + + The debugging + options also + generate events which are logged using the tracing framework. + By default those events are dumped as text to stdout + ( + implies ), but they may instead be stored in + the binary eventlog file by using the + option.
+ + + RTS options for hackers, debuggers, and over-interested souls @@ -846,14 +1029,28 @@ - num + x -DRTS option - An RTS debugging flag; varying quantities of output - depending on which bits are set in - num. Only works if the RTS was - compiled with the option. + + An RTS debugging flag; only available if the program was + linked with the option. Various + values of x are provided to + enable debug messages and additional runtime sanity checks + in different subsystems in the RTS, for + example +RTS -Ds -RTS enables debug + messages from the scheduler. + Use +RTS -? to find out which + debug flags are supported. + + + + Debug messages will be sent to the binary event log file + instead of stdout if the option is + added. This might be useful for reducing the overhead of + debug tracing. + @@ -866,20 +1063,13 @@ Produce “ticky-ticky” statistics at the - end of the program run. The file - business works just like on the RTS - option (above). - - “Ticky-ticky” statistics are counts of - various program actions (updates, enters, etc.) The program - must have been compiled using - - (a.k.a. “ticky-ticky profiling”), and, for it to - be really useful, linked with suitable system libraries. - Not a trivial undertaking: consult the installation guide on - how to set things up for easy “ticky-ticky” - profiling. For more information, see . + end of the program run (only available if the program was + linked with ). + The file business works just like + on the RTS option, above. + + For more information on ticky-ticky profiling, see + . @@ -938,6 +1128,20 @@ + + Linker flags to change RTS behaviour + + RTS behaviour, changing + + + GHC lets you exercise rudimentary control over the RTS settings + for any given program, by using the -with-rtsopts + linker flag. For example, to set -H128m -K1m, + link with -with-rtsopts="-H128m -K1m".
+ + + + “Hooks” to change RTS behaviour @@ -957,7 +1161,8 @@ The hook ghc_rts_optsghc_rts_opts lets you set RTS - options permanently for a given program. A common use for this is + options permanently for a given program, in the same way as the + newer linker option does. A common use for this is to give your program a default heap and/or stack size that is greater than the default. For example, to set -H128m -K1m, place the following definition in a C source @@ -1163,7 +1368,6 @@ $ ./a.out +RTS --info