X-Git-Url: http://git.megacz.com/?a=blobdiff_plain;f=docs%2Fusers_guide%2Fprofiling.xml;h=01c7576b63ea41772a3f24754a1101e7b76446a3;hb=67d845652defc09807eaf134c6d30c8bd26b665a;hp=3442aee0812397a3a889045585444be8e3bd2759;hpb=23c11847ac06735217d778e4e78d527ca0d55649;p=ghc-hetmet.git diff --git a/docs/users_guide/profiling.xml b/docs/users_guide/profiling.xml index 3442aee..01c7576 100644 --- a/docs/users_guide/profiling.xml +++ b/docs/users_guide/profiling.xml @@ -8,7 +8,7 @@ Glasgow Haskell comes with a time and space profiling system. Its purpose is to help you improve your understanding of your program's execution behaviour, so you can improve it. - + Any comments, suggestions and/or improvements you have are welcome. Recommended “profiling tricks” would be especially cool! @@ -33,22 +33,24 @@ Run your program with one of the profiling options, eg. +RTS -p -RTS. This generates a file of - profiling information. + profiling information. Note that multi-processor execution + (e.g. +RTS -N2) is not supported while + profiling. RTS option - + Examine the generated profiling information, using one of GHC's profiling tools. The tool to use will depend on the kind of profiling information generated. - + - + Cost centres and cost-centre stacks - + GHC's profiling system assigns costs to cost centres. A cost is simply the time or space required to evaluate an expression. Cost centres are @@ -240,17 +242,15 @@ MAIN MAIN 0 0.0 0.0 100.0 100.0 although GHC does keep information about which groups of functions called each other recursively, this information isn't displayed in the basic time and allocation profile, instead the call-graph is - flattened into a tree. The XML profiling tool (described in ) will be able to display real loops in - the call-graph. + flattened into a tree. Inserting cost centres by hand Cost centres are just program annotations. 
When you say to the compiler, it automatically inserts a cost centre annotation around every top-level function - in your program, but you are entirely free to add the cost - centre annotations yourself. + not marked INLINE in your program, but you are entirely free to + add the cost centre annotations yourself. The syntax of a cost centre annotation is @@ -263,7 +263,29 @@ MAIN MAIN 0 0.0 0.0 100.0 100.0 in the profiling output, and <expression> is any Haskell expression. An SCC annotation extends as - far to the right as possible when parsing. + far to the right as possible when parsing. (SCC stands for "Set + Cost Centre"). + + Here is an example of a program with a couple of SCCs: + + +main :: IO () +main = do let xs = {-# SCC "X" #-} [1..1000000] + let ys = {-# SCC "Y" #-} [1..2000000] + print $ last xs + print $ last $ init xs + print $ last ys + print $ last $ init ys + + + which gives this heap profile when run: + + + @@ -363,7 +385,7 @@ x = nfib 25 - + There are a few other profiling-related compilation options. Use them in addition to . These do not have to be used consistently @@ -379,19 +401,23 @@ x = nfib 25 GHC will automatically add _scc_ constructs for all - top-level, exported functions. + top-level, exported functions not marked INLINE. If you + want a cost centre on an INLINE function, you have to add + it manually. - + : - All top-level functions, - exported or not, will be automatically - _scc_'d. + All top-level functions + not marked INLINE, exported or not, will be automatically + _scc_'d. + The functions marked INLINE must be given a cost centre + manually. @@ -436,9 +462,10 @@ x = nfib 25 - or : + or or : + time profile @@ -450,21 +477,24 @@ x = nfib 25 The option produces a more detailed report containing the actual time and allocation data as well. (Not used much.) - - - - - : - - - - The option generates profiling - information in the XML format understood by our new - profiling tool, see . 
+ The option produces the most detailed + report containing all cost centres in addition to the actual time + and allocation data. + + + RTS + option + + Sets the interval that the RTS clock ticks at, which is + also the sampling interval of the time and allocation profile. + The default is 0.02 second. + + + @@ -479,7 +509,7 @@ x = nfib 25 - + @@ -511,7 +541,7 @@ x = nfib 25 file, prog.ps. The hp2ps utility is described in detail in - . + . Display the heap profile using a postscript viewer such @@ -520,6 +550,11 @@ x = nfib 25 + You might also want to take a look + at hp2any, + a more advanced suite of tools (not distributed with GHC) for + displaying heap profiles. + RTS options for heap profiling @@ -577,7 +612,7 @@ x = nfib 25 represent an approximation to the actual type. - + @@ -609,7 +644,7 @@ x = nfib 25 to display a profile by type but only for data produced by a certain module, or a profile by retainer for a certain type of data. Restrictions are specified as follows: - + @@ -667,7 +702,7 @@ x = nfib 25 types. - + cc,... @@ -725,7 +760,8 @@ x = nfib 25 0.1 second). Fractions are allowed: for example will get 5 samples per second. This only affects heap profiling; time profiles are always - sampled on a 1/50 second frequency. + sampled with the frequency of the RTS clock. See + for changing that. @@ -740,7 +776,7 @@ x = nfib 25 state in addition to the space allocated for its stack (stacks normally start small and then grow as necessary). - + This includes the main thread, so using is a good way to see how much stack space the program is using. @@ -766,7 +802,7 @@ x = nfib 25 - + Retainer Profiling @@ -807,7 +843,7 @@ x = nfib 25 set MANY. 
The maximum set size defaults to 8 and can be altered with the RTS option: - + size @@ -847,7 +883,7 @@ x = nfib 25 prog +RTS -hr -hcB - + This trick isn't foolproof, because there might be other B closures in the heap which aren't the retainers we are interested in, but we've found this to be a useful technique @@ -961,48 +997,6 @@ x = nfib 25 - - Graphical time/allocation profile - - You can view the time and allocation profiling graph of your - program graphically, using ghcprof. This is a - new tool with GHC 4.08, and will eventually be the de-facto - standard way of viewing GHC profilesActually this - isn't true any more, we are working on a new tool for - displaying heap profiles using Gtk+HS, so - ghcprof may go away at some point in the future. - - - To run ghcprof, you need - uDraw(Graph) installed, which can be - obtained from uDraw(Graph). Install one of - the binary - distributions, and set your - UDG_HOME environment variable to point to the - installation directory. - - ghcprof uses an XML-based profiling log - format, and you therefore need to run your program with a - different option: . The file generated is - still called <prog>.prof. To see the - profile, run ghcprof like this: - - - - -$ ghcprof <prog>.prof - - - which should pop up a window showing the call-graph of your - program in glorious detail. More information on using - ghcprof can be found at The - Cost-Centre Stack Profiling Tool for - GHC. - - - <command>hp2ps</command>––heap profile to PostScript @@ -1010,9 +1004,9 @@ $ ghcprof <prog>.prof heap profiles postscript, from heap profiles - + Usage: - + hp2ps [flags] [<file>[.hp]] @@ -1036,7 +1030,7 @@ hp2ps [flags] [<file>[.hp]] The flags are: - + @@ -1142,7 +1136,7 @@ hp2ps [flags] [<file>[.hp]] Use a small box for the title. - + @@ -1163,14 +1157,14 @@ hp2ps [flags] [<file>[.hp]] Generate colour output. - + Ignore marks. 
- + @@ -1183,7 +1177,7 @@ hp2ps [flags] [<file>[.hp]] Manipulating the hp file -(Notes kindly offered by Jan-Willhem Maessen.) +(Notes kindly offered by Jan-Willem Maessen.) The FOO.hp file produced when you ask for the @@ -1262,7 +1256,7 @@ profile of your program as it runs. Simply generate an incremental heap profile as described in the previous section. Run gv on your profile: - gv -watch -seascape FOO.ps + gv -watch -seascape FOO.ps If you forget the -watch flag you can still select "Watch file" from the "State" menu. Now each time you generate a new @@ -1298,7 +1292,7 @@ to re-read its input file: head -`fgrep -n END_SAMPLE FOO.hp | tail -1 | cut -d : -f 1` FOO.hp \ | hp2ps > FOO.ps kill -HUP $gvpsnum - done + done @@ -1311,46 +1305,44 @@ to re-read its input file: hpc - Code coverage tools allow a programer to determine what parts of + Code coverage tools allow a programmer to determine what parts of their code have been actually executed, and which parts have never actually been invoked. GHC has an option for generating instrumented code that records code coverage as part of the Haskell Program Coverage - (HPC) toolkit. HPC tools can be used to render the - outputed code coverage infomation into human understandable - format. - + (HPC) toolkit, which is included with GHC. HPC tools can + be used to render the generated code coverage information into + human understandable format. - HPC provides coverage information of two kinds: source coverage - and boolean-control coverage. Source coverage is the extent to - which every part of the program was used, measured at three - different levels: declarations (both top-level and local), - alternatives (among several equations or case branches) and - expressions (at every level). Boolean coverage is the extent to - which each of the values True and False is obtained in every - syntactic boolean context (ie. guard, condition, qualifier). 
- + Correctly instrumented code provides coverage information of two + kinds: source coverage and boolean-control coverage. Source + coverage is the extent to which every part of the program was + used, measured at three different levels: declarations (both + top-level and local), alternatives (among several equations or + case branches) and expressions (at every level). Boolean + coverage is the extent to which each of the values True and + False is obtained in every syntactic boolean context (ie. guard, + condition, qualifier). - HPC displays both kinds of information in two different ways: - textual reports with summary statistics (hpc-report) and sources - with color mark-up (hpc-markup). For boolean coverage, there + HPC displays both kinds of information in two primary ways: + textual reports with summary statistics (hpc report) and sources + with color mark-up (hpc markup). For boolean coverage, there are four possible outcomes for each guard, condition or qualifier: both True and False values occur; only True; only False; never evaluated. In hpc-markup output, highlighting with a yellow background indicates a part of the program that was never evaluated; a green background indicates an always-True expression and a red background indicates an always-False one. - + A small example: Reciprocation - For an example we have a program which computes exact decimal + For an example we have a program, called Recip.hs, which computes exact decimal representations of reciprocals, with recurring parts indicated in - brackets. We first build an instrumented version using the - hpc-build script. Assuming the source file is Recip.hs. + brackets. reciprocal :: Int -> (String, Int) @@ -1385,27 +1377,27 @@ main = do main -` The HPC intrumentation is enabled using the -fhpc flag. + The HPC instrumentation is enabled using the -fhpc flag. -$ ghc -fhpc Recip.hs --make +$ ghc -fhpc Recip.hs --make - HPC index (.mix) files are placed placed in .hpc subdirectory. 
These can be considered like - the .hi files for HPC. They contain information about what parts of the haskell each modules. + HPC index (.mix) files are placed in the .hpc subdirectory. These can be considered like + the .hi files for HPC. $ ./Recip 1/3 = 0.(3) - Now for a textual summary of coverage: + We can generate a textual summary of coverage: -$ hpc-report Recip +$ hpc report Recip 80% expressions used (81/101) 12% boolean coverage (1/8) - 14% guards (1/7), 3 always True, - 1 always False, + 14% guards (1/7), 3 always True, + 1 always False, 2 unevaluated 0% 'if' conditions (0/1), 1 always False 100% qualifiers (0/0) @@ -1413,17 +1405,215 @@ $ hpc-report Recip 100% local declarations used (9/9) 100% top-level declarations used (5/5) - Finally, we generate a marked-up version of the source. + We can also generate a marked-up version of the source. -$ hpc-markup Recip +$ hpc markup Recip writing Recip.hs.html -
- Recip.hs.html - -
+ + This generates one file per Haskell module, and 4 index files, + hpc_index.html, hpc_index_alt.html, hpc_index_exp.html, + hpc_index_fun.html. + +
+ + Options for instrumenting code for coverage + + Turning on code coverage is easy: use the -fhpc flag. + Instrumented and non-instrumented code can be freely mixed. + When compiling the Main module GHC automatically detects when there + is an hpc compiled file, and adds the correct initialization code. + + + + + The hpc toolkit + + + The hpc toolkit uses a cvs/svn/darcs-like interface, where a + single binary contains many function units. + +$ hpc +Usage: hpc COMMAND ... + +Commands: + help Display help for hpc or a single command +Reporting Coverage: + report Output textual report about program coverage + markup Markup Haskell source with program coverage +Processing Coverage files: + sum Sum multiple .tix files in a single .tix file + combine Combine two .tix files in a single .tix file + map Map a function over a single .tix file +Coverage Overlays: + overlay Generate a .tix file from an overlay file + draft Generate draft overlay that provides 100% coverage +Others: + show Show .tix file in readable, verbose format + version Display version for hpc + + + In general, these commands act on a .tix file after an + instrumented binary has generated it, with hpc acting as a + conduit between the raw .tix file and the more detailed reports + produced. + + + + The hpc tool assumes you are in the top-level directory of + the location where you built your application, and the .tix + file is in the same top-level directory. You can use the + flag --srcdir to use hpc for any other directory, and use + --srcdir multiple times to analyse programs compiled from + different locations, as is typical for packages. + + + + We now explain in more detail the major modes of hpc. + + + hpc report + hpc report gives a textual report of coverage. By default, + all modules and packages are considered in generating the report, + unless include or exclude are used. The report is a summary + unless the --per-module flag is used.
The --xml-output option + allows for tools to use hpc to glean coverage. + + +$ hpc help report +Usage: hpc report [OPTION] .. <TIX_FILE> [<MODULE> [<MODULE> ..]] + +Options: + + --per-module show module level detail + --decl-list show unused decls + --exclude=[PACKAGE:][MODULE] exclude MODULE and/or PACKAGE + --include=[PACKAGE:][MODULE] include MODULE and/or PACKAGE + --srcdir=DIR path to source directory of .hs files + multi-use of srcdir possible + --hpcdir=DIR sub-directory that contains .mix files + default .hpc [rarely used] + --xml-output show output in XML + + + hpc markup + hpc markup marks up source files into colored html. + + +$ hpc help markup +Usage: hpc markup [OPTION] .. <TIX_FILE> [<MODULE> [<MODULE> ..]] + +Options: + + --exclude=[PACKAGE:][MODULE] exclude MODULE and/or PACKAGE + --include=[PACKAGE:][MODULE] include MODULE and/or PACKAGE + --srcdir=DIR path to source directory of .hs files + multi-use of srcdir possible + --hpcdir=DIR sub-directory that contains .mix files + default .hpc [rarely used] + --fun-entry-count show top-level function entry counts + --highlight-covered highlight covered code, rather that code gaps + --destdir=DIR path to write output to + + + + hpc sum + hpc sum adds together any number of .tix files into a single + .tix file. hpc sum does not change the original .tix file; it generates a new .tix file. + + +$ hpc help sum +Usage: hpc sum [OPTION] .. <TIX_FILE> [<TIX_FILE> [<TIX_FILE> ..]] +Sum multiple .tix files in a single .tix file + +Options: - + --exclude=[PACKAGE:][MODULE] exclude MODULE and/or PACKAGE + --include=[PACKAGE:][MODULE] include MODULE and/or PACKAGE + --output=FILE output FILE + --union use the union of the module namespace (default is intersection) + + + hpc combine + hpc combine is the swiss army knife of hpc. It can be + used to take the difference between .tix files, to subtract one + .tix file from another, or to add two .tix files. 
hpc combine does not + change the original .tix file; it generates a new .tix file. + + +$ hpc help combine +Usage: hpc combine [OPTION] .. <TIX_FILE> <TIX_FILE> +Combine two .tix files in a single .tix file + +Options: + + --exclude=[PACKAGE:][MODULE] exclude MODULE and/or PACKAGE + --include=[PACKAGE:][MODULE] include MODULE and/or PACKAGE + --output=FILE output FILE + --function=FUNCTION combine .tix files with join function, default = ADD + FUNCTION = ADD | DIFF | SUB + --union use the union of the module namespace (default is intersection) + + + hpc map + hpc map inverts or zeros a .tix file. hpc map does not + change the original .tix file; it generates a new .tix file. + + +$ hpc help map +Usage: hpc map [OPTION] .. <TIX_FILE> +Map a function over a single .tix file + +Options: + + --exclude=[PACKAGE:][MODULE] exclude MODULE and/or PACKAGE + --include=[PACKAGE:][MODULE] include MODULE and/or PACKAGE + --output=FILE output FILE + --function=FUNCTION apply function to .tix files, default = ID + FUNCTION = ID | INV | ZERO + --union use the union of the module namespace (default is intersection) + + + hpc overlay and hpc draft + + Overlays are an experimental feature of HPC, a textual description + of coverage. hpc draft is used to generate a draft overlay from a .tix file, + and hpc overlay generates a .tix file from an overlay. + + +% hpc help overlay +Usage: hpc overlay [OPTION] .. <OVERLAY_FILE> [<OVERLAY_FILE> [...]] + +Options: + + --srcdir=DIR path to source directory of .hs files + multi-use of srcdir possible + --hpcdir=DIR sub-directory that contains .mix files + default .hpc [rarely used] + --output=FILE output FILE +% hpc help draft +Usage: hpc draft [OPTION] ..
<TIX_FILE> + +Options: + + --exclude=[PACKAGE:][MODULE] exclude MODULE and/or PACKAGE + --include=[PACKAGE:][MODULE] include MODULE and/or PACKAGE + --srcdir=DIR path to source directory of .hs files + multi-use of srcdir possible + --hpcdir=DIR sub-directory that contains .mix files + default .hpc [rarely used] + --output=FILE output FILE + + + + Caveats and Shortcomings of Haskell Program Coverage + + HPC does not attempt to lock the .tix file, so multiple concurrently running + binaries in the same directory will exhibit a race condition. There is no way + to change the name of the .tix file generated, apart from renaming the binary. + HPC does not work with GHCi. + +
@@ -1432,13 +1622,13 @@ writing Recip.hs.html (ToDo: document properly.) - It is possible to compile Glasgow Haskell programs so that + It is possible to compile Haskell programs so that they will count lots and lots of interesting things, e.g., number of updates, number of data constructors entered, etc., etc. We call this “ticky-ticky” profiling,ticky-ticky profiling profiling, - ticky-ticky because that's the sound a Sun4 + ticky-ticky because that's the sound a CPU makes when it is running up all those counters (slowly). @@ -1446,25 +1636,52 @@ writing Recip.hs.html it is quite separate from the main “cost-centre” profiling system, intended for all users everywhere. - To be able to use ticky-ticky profiling, you will need to - have built the ticky RTS. (This should be described in - the building guide, but amounts to building the RTS with way - "t" enabled.) + + You don't need to build GHC, the libraries, or the RTS a special + way in order to use ticky-ticky profiling. You can decide on a + module-by-module basis which parts of a program have the + counters compiled in, using the + compile-time option. Those modules that + were not compiled with won't contribute + to the ticky-ticky profiling results, and that will normally + include all the pre-compiled packages that your program links + with. + - To get your compiled program to spit out the ticky-ticky - numbers, use a RTS - option-r RTS option. - See . + + To get your compiled program to spit out the ticky-ticky + numbers: - Compiling your program with the - switch yields an executable that performs these counts. Here is a - sample ticky-ticky statistics file, generated by the invocation - foo +RTS -rfoo.ticky. + + + + Link the program with + ( is a synonym + for at link-time). This links in + the debug version of the RTS, which includes the code for + aggregating and reporting the results of ticky-ticky + profiling. + + + + + Run the program with the RTS + option-r RTS option. + See . 
+ + + + + + + Here is a sample ticky-ticky statistics file, generated by + the invocation + foo +RTS -rfoo.ticky. + foo +RTS -rfoo.ticky - ALLOCATIONS: 3964631 (11330900 words total: 3999476 admin, 6098829 goods, 1232595 slop) total words: 2 3 4 5 6+ 69647 ( 1.8%) function values 50.0 50.0 0.0 0.0 0.0 @@ -1567,7 +1784,6 @@ Total bytes copied during GC: 190096