X-Git-Url: http://git.megacz.com/?a=blobdiff_plain;f=docs%2Fusers_guide%2Fruntime_control.xml;h=94995b3e8d27ff962a856ab67d34d5a245c60ff7;hb=aedb94f5f220b5e442b23ecc445fd38c8d9b6ba0;hp=daed07cee340c6c5a3d8ff764854dcbb98c4c55f;hpb=0065d5ab628975892cea1ec7303f968c3338cbe1;p=ghc-hetmet.git diff --git a/docs/users_guide/runtime_control.xml b/docs/users_guide/runtime_control.xml index daed07c..94995b3 100644 --- a/docs/users_guide/runtime_control.xml +++ b/docs/users_guide/runtime_control.xml @@ -85,6 +85,83 @@ + + Miscellaneous RTS options + + + + + RTS + option + + Sets the interval that the RTS clock ticks at. The + runtime uses a single timer signal to count ticks; this timer + signal is used to control the context switch timer () and the heap profiling + timer . Also, the + time profiler uses the RTS timer signal directly to record + time profiling samples. + + Normally, setting the option + directly is not necessary: the resolution of the RTS timer is + adjusted automatically if a short interval is requested with + the or options. + However, setting is required in order to + increase the resolution of the time profiler. + + Using a value of zero disables the RTS clock + completely, and has the effect of disabling timers that + depend on it: the context switch timer and the heap profiling + timer. Context switches will still happen, but + deterministically and at a rate much faster than normal. + Disabling the interval timer is useful for debugging, because + it eliminates a source of non-determinism at runtime. + + + + + + RTS + option + + If yes (the default), the RTS installs signal handlers to catch + things like ctrl-C. This option is primarily useful for when + you are using the Haskell code as a DLL, and want to set your + own signal handlers. + + + + + + RTS + option + + + WARNING: this option is for working around memory + allocation problems only. Do not use unless GHCi fails + with a message like “failed to mmap() memory below 2Gb”. 
If you need to use this option to get GHCi working + on your machine, please file a bug. + + + + On 64-bit machines, the RTS needs to allocate memory in the + low 2Gb of the address space. Support for this across + different operating systems is patchy, and sometimes fails. + This option is there to give the RTS a hint about where it + should be able to allocate memory in the low 2Gb of the + address space. For example, +RTS -xm20000000 + -RTS would hint that the RTS should allocate + starting at the 0.5Gb mark. The default is to use the OS's + built-in support for allocating memory in the low 2Gb if + available (e.g. mmap + with MAP_32BIT on Linux), or + otherwise -xm40000000. + + + + + + RTS options to control the garbage collector @@ -220,6 +297,43 @@ + + threads + RTS option + + + [Default: 1] [new in GHC 6.10] Set the number + of threads to use for garbage collection. This option is + only accepted when the program was linked with the + option; see . + + The garbage collector is able to work in parallel when + given more than one OS thread. Experiments have shown + that this usually results in a performance improvement + given 3 cores or more; with 2 cores it may or may not be + beneficial, depending on the workload. Bigger heaps work + better with parallel GC, so set your + value high (3 or more times the maximum residency). Look + at the timing stats with to + see whether you're getting any benefit from parallel GC or + not. If you find parallel GC is + significantly slower (in elapsed + time) than sequential GC, please report it as a + bug. + + This value is set automatically when the + option is used, so the only reason to + use would be if you wanted to use a + different number of threads for GC than for execution. + For example, if your program is strictly single-threaded + but you still want to benefit from parallel GC, then it + might make sense to use rather than + . 
+ + + + size RTS option @@ -246,7 +360,7 @@ seconds - + RTS option idle GC @@ -351,46 +465,283 @@ + + file + RTS option + - file + file RTS option - file + file RTS option - - Write modest () or verbose - () garbage-collector statistics into file - file. The default - file is - program.stat. The - file stderr - is treated specially, with the output really being sent to - stderr. - - This option is useful for watching how the storage - manager adjusts the heap size based on the current amount of - live data. - - - - - - RTS option + + RTS option - Write a one-line GC stats summary after running the - program. This output is in the same format as that produced - by the option. - - As with , the default - file is - program.stat. The - file stderr - is treated specially, with the output really being sent to - stderr. + These options produce runtime-system statistics, such + as the amount of time spent executing the program and in the + garbage collector, the amount of memory allocated, the + maximum size of the heap, and so on. The three + variants give different levels of detail: + produces a single line of output in the + same format as GHC's option, + produces a more detailed summary at the + end of the program, and additionally + produces information about each and every garbage + collection. + + The output is placed in + file. If + file is omitted, then the output + is sent to stderr. + + + If you use the -t flag then, when your + program finishes, you will see something like this: + + + +<<ghc: 36169392 bytes, 69 GCs, 603392/1065272 avg/max bytes residency (2 samples), 3M in use, 0.00 INIT (0.00 elapsed), 0.02 MUT (0.02 elapsed), 0.07 GC (0.07 elapsed) :ghc>> + + + + This tells you: + + + + + + The total bytes allocated by the program. This may be less + than the peak memory use, as some may be freed. + + + + + The total number of garbage collections that occurred. + + + + + The average and maximum space used by your program. 
+ This is only checked during major garbage collections, so it + is only an approximation; the number of samples tells you how + many times it is checked. + + + + + The peak memory the RTS has allocated from the OS. + + + + + The amount of CPU time and elapsed wall clock time while + initialising the runtime system (INIT), running the program + itself (MUT, the mutator), and garbage collecting (GC). + + + + + + You can also get this in a more future-proof, machine readable + format, with -t --machine-readable: + + + + [("bytes allocated", "36169392") + ,("num_GCs", "69") + ,("average_bytes_used", "603392") + ,("max_bytes_used", "1065272") + ,("num_byte_usage_samples", "2") + ,("peak_megabytes_allocated", "3") + ,("init_cpu_seconds", "0.00") + ,("init_wall_seconds", "0.00") + ,("mutator_cpu_seconds", "0.02") + ,("mutator_wall_seconds", "0.02") + ,("GC_cpu_seconds", "0.07") + ,("GC_wall_seconds", "0.07") + ] + + + + If you use the -s flag then, when your + program finishes, you will see something like this (the exact + details will vary depending on what sort of RTS you have, e.g. + you will only see profiling data if your RTS is compiled for + profiling): + + + + 36,169,392 bytes allocated in the heap + 4,057,632 bytes copied during GC + 1,065,272 bytes maximum residency (2 sample(s)) + 54,312 bytes maximum slop + 3 MB total memory in use (0 MB lost due to fragmentation) + + Generation 0: 67 collections, 0 parallel, 0.04s, 0.03s elapsed + Generation 1: 2 collections, 0 parallel, 0.03s, 0.04s elapsed + + SPARKS: 359207 (557 converted, 149591 pruned) + + INIT time 0.00s ( 0.00s elapsed) + MUT time 0.01s ( 0.02s elapsed) + GC time 0.07s ( 0.07s elapsed) + EXIT time 0.00s ( 0.00s elapsed) + Total time 0.08s ( 0.09s elapsed) + + %GC time 89.5% (75.3% elapsed) + + Alloc rate 4,520,608,923 bytes per MUT second + + Productivity 10.5% of total user, 9.1% of total elapsed + + + + + + The "bytes allocated in the heap" is the total bytes allocated + by the program. 
This may be less than the peak memory use, as + some may be freed. + + + + + GHC uses a copying garbage collector. "bytes copied during GC" + tells you how many bytes it had to copy during garbage collection. + + + + + The maximum space actually used by your program is the + "bytes maximum residency" figure. This is only checked during + major garbage collections, so it is only an approximation; + the number of samples tells you how many times it is checked. + + + + + The "bytes maximum slop" tells you the most space that is ever + wasted due to the way GHC packs data into so-called "megablocks". + + + + + The "total memory in use" tells you the peak memory the RTS has + allocated from the OS. + + + + + Next there is information about the garbage collections done. + For each generation it says how many garbage collections were + done, how many of those collections used multiple threads, + the total CPU time used for garbage collecting that generation, + and the total wall clock time elapsed while garbage collecting + that generation. + + + + The SPARKS statistic refers to the + use of Control.Parallel.par and related + functionality in the program. Each spark represents a call + to par; a spark is "converted" when it is + executed in parallel; and a spark is "pruned" when it is + found to be already evaluated and is discarded from the pool + by the garbage collector. Any remaining sparks are + discarded at the end of execution, so "converted" plus + "pruned" does not necessarily add up to the total. + + + + Next there is the CPU time and wall clock time elapsed, broken + down by what the runtime system was doing at the time. + INIT is the runtime system initialisation. + MUT is the mutator time, i.e. the time spent actually running + your code. + GC is the time spent doing garbage collection. + RP is the time spent doing retainer profiling. + PROF is the time spent doing other profiling. + EXIT is the runtime system shutdown time. 
+ And finally, Total is, of course, the total. + + + %GC time tells you what percentage GC is of Total. + "Alloc rate" tells you the "bytes allocated in the heap" divided + by the MUT CPU time. + "Productivity" tells you what percentage of the Total CPU and wall + clock elapsed times are spent in the mutator (MUT). + + + + + + The -S flag, as well as giving the same + output as the -s flag, prints information + about each GC as it happens: + + + + Alloc Copied Live GC GC TOT TOT Page Flts + bytes bytes bytes user elap user elap + 528496 47728 141512 0.01 0.02 0.02 0.02 0 0 (Gen: 1) +[...] + 524944 175944 1726384 0.00 0.00 0.08 0.11 0 0 (Gen: 0) + + + + For each garbage collection, we print: + + + + + + How many bytes we allocated this garbage collection. + + + + + How many bytes we copied this garbage collection. + + + + + How many bytes are currently live. + + + + + How long this garbage collection took (CPU time and elapsed + wall clock time). + + + + + How long the program has been running (CPU time and elapsed + wall clock time). + + + + + How many page faults occurred during this garbage collection. + + + + + How many page faults occurred since the end of the last garbage + collection. + + + + + Which generation is being garbage collected. + + + + @@ -398,11 +749,44 @@ - RTS options for profiling and Concurrent/Parallel Haskell + RTS options for concurrency and parallelism - The RTS options related to profiling are described in ; and those for concurrent/parallel - stuff, in . + The RTS options related to concurrency are described in + , and those for parallelism in + . + + + + RTS options for profiling + + Most profiling runtime options are only available when you + compile your program for profiling (see + , and + for the runtime options). + However, there is one profiling option that is available + for ordinary non-profiled executables: + + + + + + RTS + option + + + Generates a basic heap profile, in the + file prog.hp. 
+ To produce the heap profile graph, + use hp2ps (see ). The basic heap profile is broken down by data + constructor, with other types of closures (functions, thunks, + etc.) grouped into broad categories + (e.g. FUN, THUNK). To + get a more detailed profile, use the full profiling + support (). + + + @@ -612,6 +996,29 @@ char *ghc_rts_opts = "-H128m -K1m"; ghc/compiler/parser/hschooks.c in a GHC source tree. + + + Getting information about the RTS + + RTS + + It is possible to ask the RTS to give some information about + itself. To do this, use the --info flag, e.g. + +$ ./a.out +RTS --info + [("GHC RTS", "Yes") + ,("GHC version", "6.7") + ,("RTS way", "rts_p") + ,("Host platform", "x86_64-unknown-linux") + ,("Build platform", "x86_64-unknown-linux") + ,("Target platform", "x86_64-unknown-linux") + ,("Compiler unregisterised", "NO") + ,("Tables next to code", "YES") + ] + + The information is formatted such that it can be read as a + value of type [(String, String)]. +
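Because the --info output has the shape of a list of string pairs, it can in principle be consumed directly from Haskell. The following is a minimal sketch and not part of the patch: the names parseRtsInfo and sampleInfo are hypothetical, the sample text is copied from the example output above, and the code assumes a base library that provides Text.Read.readMaybe. A real program would capture the text from the executable's standard output rather than embed it.

```haskell
module Main (main) where

import Data.Maybe (fromMaybe)
import Text.Read (readMaybe)

-- | Parse the text printed by @+RTS --info@ into an association list.
-- Yields Nothing if the text is not a well-formed [(String, String)].
-- (Hypothetical helper name, introduced only for this illustration.)
parseRtsInfo :: String -> Maybe [(String, String)]
parseRtsInfo = readMaybe

-- | Sample text copied from the example output above. This is an
-- assumption for illustration; real output varies by GHC version.
sampleInfo :: String
sampleInfo = unlines
  [ " [(\"GHC RTS\", \"Yes\")"
  , " ,(\"GHC version\", \"6.7\")"
  , " ,(\"RTS way\", \"rts_p\")"
  , " ,(\"Host platform\", \"x86_64-unknown-linux\")"
  , " ,(\"Build platform\", \"x86_64-unknown-linux\")"
  , " ,(\"Target platform\", \"x86_64-unknown-linux\")"
  , " ,(\"Compiler unregisterised\", \"NO\")"
  , " ,(\"Tables next to code\", \"YES\")"
  , " ]"
  ]

main :: IO ()
main =
  case parseRtsInfo sampleInfo of
    Nothing   -> putStrLn "could not parse RTS info"
    Just info -> do
      -- Look up a single field by its key; fall back if it is absent.
      putStrLn ("RTS way: " ++ fromMaybe "<unknown>" (lookup "RTS way" info))
      -- Report how many key/value pairs were read.
      putStrLn ("Fields: " ++ show (length info))
```

Using readMaybe rather than read means malformed or unexpected output produces Nothing instead of a runtime exception, which seems the safer choice when the text comes from another process.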