X-Git-Url: http://git.megacz.com/?a=blobdiff_plain;f=ghc%2Fdocs%2Fusers_guide%2Fprofiling.sgml;h=613a8bcd5388257a2d1b3a8c427b17ad0f8d480d;hb=57ad929d509b5f1b5c5ad510eb2831163f2071fa;hp=a0bd4f68b51f46de1e97bc9fb0f1947997722ff2;hpb=67909e79eea6f0e9b2350d01d1322fde587bcc64;p=ghc-hetmet.git
diff --git a/ghc/docs/users_guide/profiling.sgml b/ghc/docs/users_guide/profiling.sgml
index a0bd4f6..613a8bc 100644
--- a/ghc/docs/users_guide/profiling.sgml
+++ b/ghc/docs/users_guide/profiling.sgml
@@ -5,9 +5,6 @@ profiling, with cost-centres cost-centre profiling
-
-
-
 Glasgow Haskell comes with a time and space profiling system. Its purpose is to help you improve your understanding of your program's execution behaviour, so you can improve it.
@@ -42,18 +39,18 @@ f x y
-The costs of the evaluating the expressions bound to output1,
-output2 and output3 will be attributed to the ``cost
-centres'' Pass1, Pass2 and Pass3, respectively.
+The costs of the evaluating the expressions bound to output1,
+output2 and output3 will be attributed to the ``cost
+centres'' Pass1, Pass2 and Pass3, respectively.
 The costs of evaluating other expressions, e.g., concat output4,
-will be inherited by the scope which referenced the function f.
+will be inherited by the scope which referenced the function f.
-You can put in cost-centres via _scc_ constructs by hand, as in the
+You can put in cost-centres via _scc_ constructs by hand, as in the
 example above. Perfectly cool. That's probably what you would do if your program divided into obvious ``passes'' or ``phases'', or whatever.
@@ -61,9 +58,9 @@ example above. Perfectly cool. That's probably what you
 If your program is large or you have no clue what might be gobbling
-all the time, you can get GHC to mark all functions with _scc_
-constructs, automagically. Add an -auto compilation flag to the
-usual -prof option.
+all the time, you can get GHC to mark all functions with _scc_
+constructs, automagically.
Add an compilation flag to the
+usual option.
@@ -97,18 +94,18 @@ system. Just visit the Glasgow FP gro
 To make use of the cost centre profiling system all modules must
-be compiled and linked with the -prof option.-prof option
-Any _scc_ constructs you've put in your source will spring to life.
+be compiled and linked with the option.-prof option
+Any _scc_ constructs you've put in your source will spring to life.
-Without a -prof option, your _scc_s are ignored; so you can
-compiled _scc_-laden code without changing it.
+Without a option, your _scc_s are ignored; so you can
+compiled _scc_-laden code without changing it.
 There are a few other profiling-related compilation options. Use them
-in addition to -prof. These do not have to be used
+in addition to . These do not have to be used
 consistently for all modules in a program.
@@ -116,28 +113,28 @@ consistently for all modules in a program.
--auto:
+:
 -auto option cost centres, automatically inserting
-GHC will automatically add _scc_ constructs for
+GHC will automatically add _scc_ constructs for
 all top-level, exported functions.
--auto-all:
+:
 -auto-all option All top-level functions, exported or not, will be automatically
-_scc_'d.
+_scc_'d.
--caf-all:
+:
 -caf-all option
@@ -148,18 +145,18 @@ An ``if all else fails'' option…
--ignore-scc:
+:
 -ignore-scc option
-Ignore any _scc_ constructs,
-so a module which already has _scc_s can be
+Ignore any _scc_ constructs,
+so a module which already has _scc_s can be
 compiled for profiling with the annotations ignored.
--G<group>:
+:
 -G<group> option
@@ -173,10 +170,10 @@ module name.
-In addition to the -prof option your system might be setup to enable
-you to compile and link with the -prof-details -prof-details
+In addition to the option your system might be setup to enable
+you to compile and link with the -prof-details
 option option instead. This enables additional detailed counts
-to be reported with the -P RTS option.
+to be reported with the RTS option.
@@ -191,7 +188,7 @@ to be reported with the -P RTS option.
-It isn't enough to compile your program for profiling with -prof!
+It isn't enough to compile your program for profiling with !
@@ -202,34 +199,34 @@ set the sampling interval used in time profiling.
-Executive summary: ./a.out +RTS -pT produces a time profile in
-a.out.prof; ./a.out +RTS -hC produces space-profiling
-info which can be mangled by hp2ps and viewed with ghostview
+Executive summary: ./a.out +RTS -pT produces a time profile in
+a.out.prof; ./a.out +RTS -hC produces space-profiling
+info which can be mangled by hp2ps and viewed with ghostview
 (or equivalent). Profiling runtime flags are passed to your program between the usual
-+RTS and -RTS options.
+ and options.
--p<sort> or -P<sort>:
+ or :
 -p<sort> RTS option (profiling) -P<sort> RTS option (profiling) time profile serial time profile
-The -p? option produces a standard time profile report.
-It is written into the file <program>@.prof.
+The option produces a standard time profile report.
+It is written into the file <program>@.prof.
-The -P? option produces a more detailed report containing the
+The option produces a more detailed report containing the
 actual time and allocation data as well. (Not used much.)
@@ -239,7 +236,7 @@ report. Valid <sort> options are:
-T:
+:
 by time, largest first (the default);
@@ -247,7 +244,7 @@ by time, largest first (the default);
-A:
+:
 by bytes allocated, largest first;
@@ -255,7 +252,7 @@ by bytes allocated, largest first;
-C:
+:
 alphabetically by group, module and cost centre.
@@ -267,18 +264,18 @@ alphabetically by group, module and cost centre.
--i<secs>:
+:
 -i<secs> RTS option (profiling) Set the profiling (sampling) interval to <secs> seconds (the default is 1 second). Fractions are allowed: for example
--i0.2 will get 5 samples per second.
+ will get 5 samples per second.
--h<break-down>:
+:
 -h<break-down> RTS option (profiling)
@@ -287,8 +284,8 @@ seconds (the default is 1 second).
Fractions are allowed: for example
 Produce a detailed space profile of the heap occupied by live
-closures. The profile is written to the file <program>@.hp from
-which a PostScript graph can be produced using hp2ps (see
+closures. The profile is written to the file <program>@.hp from
+which a PostScript graph can be produced using hp2ps (see
 ).
@@ -297,7 +294,7 @@ The heap space profile may be broken down by different criteria:
--hC:
+:
 cost centre which produced the closure (the default).
@@ -305,7 +302,7 @@ cost centre which produced the closure (the default).
--hM:
+:
 cost centre module which produced the closure.
@@ -313,7 +310,7 @@ cost centre module which produced the closure.
--hG:
+:
 cost centre group which produced the closure.
@@ -321,7 +318,7 @@ cost centre group which produced the closure.
--hD:
+:
 closure description—a string describing the closure.
@@ -329,7 +326,7 @@ closure description—a string describing the closure.
--hY:
+:
 closure type—a string describing the closure's type.
@@ -348,7 +345,7 @@ closures of interest can be selected (see below).
 Heap (space) profiling uses hash tables. If these tables should fill the run will abort. The
--z<tbl><size>-z<tbl><size> RTS option (profiling) option is used to
+-z<tbl><size> RTS option (profiling) option is used to
 increase the size of the relevant hash table (C, M, G, D or Y, defined as for <break-down> above). The actual size used is the next largest power of 2.
@@ -365,7 +362,7 @@ and kind) using the following options:
--c{<mod>:<lab>,<mod>:<lab>...}:
+}:
 -c{<lab> RTS option (profiling)}
@@ -374,7 +371,7 @@ Selects individual cost centre(s).
--m{<mod>,<mod>...}:
+}:
 -m{<mod> RTS option (profiling)}
@@ -383,7 +380,7 @@ Selects all cost centres from the module(s) specified.
--g{<grp>,<grp>...}:
+}:
 -g{<grp> RTS option (profiling)}
@@ -392,7 +389,7 @@ Selects all cost centres from the groups(s) specified.
--d{<des>,<des>...}:
+}:
 -d{<des> RTS option (profiling)}
@@ -401,7 +398,7 @@ Selects closures which have one of the specified descriptions.
--y{<typ>,<typ>...}:
+}:
 -y{<typ> RTS option (profiling)}
@@ -410,7 +407,7 @@ Selects closures which have one of the specified type descriptions.
--k{<knd>,<knd>...}:
+}:
 -k{<knd> RTS option (profiling)}
@@ -449,7 +446,7 @@ centre) is selected by the option (or the option is not specified).
-When you run your profiled program with the -p RTS option -p
+When you run your profiled program with the RTS option -p
 RTS option, you get the following information about your ``cost centres'':
@@ -480,7 +477,7 @@ different modules.
 How many times this cost-centre was entered; think
-of it as ``I got to the _scc_ construct this many times…''
+of it as ``I got to the _scc_ construct this many times…''
@@ -535,7 +532,7 @@ How many dictionaries this cost centre evaluated.
-In addition you can use the -P RTS option to get the following additional information:
+In addition you can use the RTS option to get the following additional information:
@@ -562,8 +559,8 @@ get the %alloc figure mentioned above.
-Finally if you built your program with -prof-details
- the -P RTS option will also
+Finally if you built your program with
+ the RTS option will also
 produce the following information:
@@ -631,7 +628,7 @@ Utility programs which produce graphical profiles.
-<Literal>hp2ps</Literal>--heap profile to PostScript
+<Title><Command>hp2ps</Command>--heap profile to PostScript
@@ -653,16 +650,16 @@ hp2ps [flags] [<file>[.stat]]
-The program hp2pshp2ps program converts a heap profile
-as produced by the -h<break-down>-h<break-down> RTS
+The program hp2pshp2ps program converts a heap profile
+as produced by the -h<break-down> RTS
 option runtime option into a PostScript graph of the heap
-profile. By convention, the file to be processed by hp2ps has a
-.hp extension. The PostScript output is written to <file>@.ps.
If
-<file> is omitted entirely, then the program behaves as a filter.
+profile. By convention, the file to be processed by hp2ps has a
+.hp extension. The PostScript output is written to <file>@.ps. If
+<file> is omitted entirely, then the program behaves as a filter.
-hp2ps is distributed in ghc/utils/hp2ps in a GHC source
+hp2ps is distributed in ghc/utils/hp2ps in a GHC source
 distribution. It was originally developed by Dave Wakeling as part of the HBC/LML heap profiler.
@@ -672,64 +669,64 @@ The flags are:
--d
+
-In order to make graphs more readable, hp2ps sorts the shaded
+In order to make graphs more readable, hp2ps sorts the shaded
 bands for each identifier. The default sort ordering is for the bands with the largest area to be stacked on top of the smaller ones. The
--d option causes rougher bands (those representing series of
+ option causes rougher bands (those representing series of
 values with the largest standard deviations) to be stacked on top of smoother ones.
--b
+
-Normally, hp2ps puts the title of the graph in a small box at the
+Normally, hp2ps puts the title of the graph in a small box at the
 top of the page. However, if the JOB string is too long to fit in a small box (more than 35 characters), then
-hp2ps will choose to use a big box instead. The -b
-option forces hp2ps to use a big box.
+hp2ps will choose to use a big box instead. The
+option forces hp2ps to use a big box.
--e<float>[in|mm|pt]
+
 Generate encapsulated PostScript suitable for inclusion in LaTeX documents. Usually, the PostScript graph is drawn in landscape mode
-in an area 9 inches wide by 6 inches high, and hp2ps arranges
+in an area 9 inches wide by 6 inches high, and hp2ps arranges
 for this area to be approximately centred on a sheet of a4 paper. This format is convenient of studying the graph in detail, but it is
-unsuitable for inclusion in LaTeX documents. The -e option
+unsuitable for inclusion in LaTeX documents.
The option
 causes the graph to be drawn in portrait mode, with float specifying the width in inches, millimetres or points (the default). The resulting PostScript file conforms to the Encapsulated PostScript (EPS) convention, and it can be included in a LaTeX document using
-Rokicki's dvi-to-PostScript converter dvips.
+Rokicki's dvi-to-PostScript converter dvips.
--g
+
-Create output suitable for the gs PostScript previewer (or
+Create output suitable for the gs PostScript previewer (or
 similar). In this case the graph is printed in portrait mode without scaling. The output is unsuitable for a laser printer.
--l
+
 Normally a profile is limited to 20 bands with additional identifiers
-being grouped into an OTHER band. The -l flag removes this
+being grouped into an OTHER band. The flag removes this
 20 band and limit, producing as many bands as necessary. No key is produced as it won't fit!. It is useful for creation time profiles with many bands.
@@ -737,38 +734,38 @@ with many bands.
--m<int>
+
 Normally a profile is limited to 20 bands with additional identifiers
-being grouped into an OTHER band. The -m flag specifies an
+being grouped into an OTHER band. The flag specifies an
 alternative band limit (the maximum is 20).
--m0 requests the band limit to be removed. As many bands as
+ requests the band limit to be removed. As many bands as
 necessary are produced. However no key is produced as it won't fit! It is useful for displaying creation time profiles with many bands.
--p
+
 Use previous parameters. By default, the PostScript graph is automatically scaled both horizontally and vertically so that it fills the page. However, when preparing a series of graphs for use in a presentation, it is often useful to draw a new graph using the same
-scale, shading and ordering as a previous one. The -p flag causes
+scale, shading and ordering as a previous one. The flag causes
 the graph to be drawn using the parameters determined by a previous
-run of hp2ps on file.
These are extracted from
-file@.aux.
+run of hp2ps on file. These are extracted from
+file@.aux.
--s
+
 Use a small box for the title.
@@ -776,22 +773,22 @@ Use a small box for the title.
--t<float>
+
 Normally trace elements which sum to a total of less than 1% of the
-profile are removed from the profile. The -t option allows this
+profile are removed from the profile. The option allows this
 percentage to be modified (maximum 5%).
--t0 requests no trace elements to be removed from the profile,
+ requests no trace elements to be removed from the profile,
 ensuring that all the data will be displayed.
--?
+
 Print out usage information.
@@ -804,7 +801,7 @@ Print out usage information.
-<Literal>stat2resid</Literal>—residency info from GC stats
+<Title><Command>stat2resid</Command>—residency info from GC stats
@@ -826,30 +823,30 @@ stat2resid [<file>[.stat] [<outfile>]]
-The program stat2residstat2resid converts a detailed
+The program stat2residstat2resid converts a detailed
 garbage collection statistics file produced by the
--S-S RTS option runtime option into a PostScript heap
+-S RTS option runtime option into a PostScript heap
 residency graph. The garbage collection statistics file can be produced without compiling your program for profiling.
-By convention, the file to be processed by stat2resid has a
-.stat extension. If the <outfile> is not specified the
-PostScript will be written to <file>@.resid.ps. If
-<file> is omitted entirely, then the program behaves as a filter.
+By convention, the file to be processed by stat2resid has a
+.stat extension. If the <outfile> is not specified the
+PostScript will be written to <file>@.resid.ps. If
+<file> is omitted entirely, then the program behaves as a filter.
 The plot can not be produced from the statistics file for a generational collector, though a suitable stats file can be produced
-using the -F2s-F2s RTS option runtime option when the
+using the -F2s RTS option runtime option when the
 program has been compiled for generational garbage collection (the default).
-stat2resid is distributed in ghc/utils/stat2resid in a GHC source
+stat2resid is distributed in ghc/utils/stat2resid in a GHC source
 distribution.
@@ -892,13 +889,13 @@ appropriate libraries and things when you made the system. See
 To get your compiled program to spit out the ticky-ticky numbers, use
-a -r RTS option-r RTS option. See .
+a RTS option-r RTS option. See .
-Compiling your program with the -ticky switch yields an executable
+Compiling your program with the switch yields an executable
 that performs these counts. Here is a sample ticky-ticky statistics
-file, generated by the invocation foo +RTS -rfoo.ticky.
+file, generated by the invocation foo +RTS -rfoo.ticky.