diff --git a/ghc/docs/users_guide/profiling.sgml b/ghc/docs/users_guide/profiling.sgml
index 30175ae..e79e638 100644
--- a/ghc/docs/users_guide/profiling.sgml
+++ b/ghc/docs/users_guide/profiling.sgml
@@ -4,13 +4,13 @@
cost-centre profiling
- Glasgow Haskell comes with a time and space profiling
+ Glasgow Haskell comes with a time and space profiling
system. Its purpose is to help you improve your understanding of
- your program's execution behaviour, so you can improve it.
+ your program's execution behaviour, so you can improve it.
- Any comments, suggestions and/or improvements you have are
+ Any comments, suggestions and/or improvements you have are
welcome. Recommended “profiling tricks” would be
- especially cool!
+ especially cool! Profiling a program is a three-step process:
@@ -30,12 +30,10 @@
- Run your program with one of the profiling options
- -p or -h. This generates
- a file of profiling information.
- -pRTS
- option
- -hRTS
+ Run your program with one of the profiling options, eg.
+ +RTS -p -RTS. This generates a file of
+ profiling information.
+ RTS
option
@@ -47,7 +45,7 @@
-
+ Cost centres and cost-centre stacksGHC's profiling system assigns costs
@@ -81,7 +79,7 @@ $
will contain something like this:
- Tue Apr 18 12:52 2000 Time and Allocation Profiling Report (Final)
+ Fri May 12 14:06 2000 Time and Allocation Profiling Report (Final)
Main +RTS -p -RTS
@@ -93,15 +91,16 @@ COST CENTRE MODULE %time %alloc
nfib Main 100.0 100.0
-COST CENTRE MODULE scc %time %alloc inner cafs
+ individual inherited
+COST CENTRE MODULE entries %time %alloc %time %alloc
-MAIN MAIN 0 0.0 0.0 0 1
- main Main 0 0.0 0.0 0 1
- CAF PrelHandle 3 0.0 0.0 0 3
- CAF PrelAddr 1 0.0 0.0 0 0
- CAF Main 6 0.0 0.0 1 0
- main Main 1 0.0 0.0 1 1
- nfib Main 242785 100.0 100.0 242784 4
+MAIN MAIN 0 0.0 0.0 100.0 100.0
+ main Main 0 0.0 0.0 0.0 0.0
+ CAF PrelHandle 3 0.0 0.0 0.0 0.0
+ CAF PrelAddr 1 0.0 0.0 0.0 0.0
+ CAF Main 6 0.0 0.0 100.0 100.0
+ main Main 1 0.0 0.0 100.0 100.0
+ nfib Main 242785 100.0 100.0 100.0 100.0
@@ -125,6 +124,12 @@ MAIN MAIN 0 0.0 0.0 0 1
the costly call to nfib came from
main.
+ The time and allocation incurred by a given part of the
+ program is displayed in two ways: “individual”, which
+ are the costs incurred by the code covered by this cost centre
+ stack alone, and “inherited”, which includes the costs
+ incurred by all the children of this node.
+
The usefulness of cost-centre stacks is better demonstrated
by modifying the example slightly:
@@ -139,18 +144,18 @@ nfib n = if n < 2 then 1 else nfib (n-1) + nfib (n-2)
the new profiling results:
-COST CENTRE MODULE scc %time %alloc inner cafs
-
-MAIN MAIN 0 0.0 0.0 0 1
- main Main 0 0.0 0.0 0 1
- CAF PrelHandle 3 0.0 0.0 0 3
- CAF PrelAddr 1 0.0 0.0 0 0
- CAF Main 9 0.0 0.0 1 1
- main Main 1 0.0 0.0 2 2
- g Main 1 0.0 0.0 1 3
- nfib Main 465 0.0 0.2 464 0
- f Main 1 0.0 0.0 1 1
- nfib Main 242785 100.0 99.8 242784 1
+COST CENTRE MODULE scc %time %alloc %time %alloc
+
+MAIN MAIN 0 0.0 0.0 100.0 100.0
+ main Main 0 0.0 0.0 0.0 0.0
+ CAF PrelHandle 3 0.0 0.0 0.0 0.0
+ CAF PrelAddr 1 0.0 0.0 0.0 0.0
+ CAF Main 9 0.0 0.0 100.0 100.0
+ main Main 1 0.0 0.0 100.0 100.0
+ g Main 1 0.0 0.0 0.0 0.2
+ nfib Main 465 0.0 0.2 0.0 0.2
+ f Main 1 0.0 0.0 100.0 99.8
+ nfib Main 242785 100.0 99.8 100.0 99.8
Now although we had two calls to nfib
@@ -161,7 +166,7 @@ MAIN MAIN 0 0.0 0.0 0 1
- scc
+ entriesThe number of times this particular point in the call
graph was entered.
@@ -169,7 +174,7 @@ MAIN MAIN 0 0.0 0.0 0 1
- %time
+ individual %timeThe percentage of the total run time of the program
spent at this point in the call graph.
@@ -177,7 +182,7 @@ MAIN MAIN 0 0.0 0.0 0 1
- %alloc
+ individual %allocThe percentage of the total memory allocations
(excluding profiling overheads) of the program made by this
@@ -186,19 +191,19 @@ MAIN MAIN 0 0.0 0.0 0 1
- inner
+ inherited %time
- The number of times an inner call-graph context was
- entered from here (including recursive calls).
+ The percentage of the total run time of the program
+ spent below this point in the call graph.
- cafs
+ inherited %alloc
- The number of times a CAF context was entered from
- here. CAFs are described in .
+ The percentage of the total memory allocations
+ (excluding profiling overheads) of the program made by this
+ call and all of its sub-calls.
@@ -211,20 +216,20 @@ MAIN MAIN 0 0.0 0.0 0 1
ticks
- The raw number of time “ticks” which were
+ The raw number of time “ticks” which were
attributed to this cost-centre; from this, we get the
%time figure mentioned
- above.
+ above.bytes
- Number of bytes allocated in the heap while in this
+ Number of bytes allocated in the heap while in this
cost-centre; again, this is the raw number from which we get
the %alloc figure mentioned
- above.
+ above.
@@ -249,14 +254,14 @@ MAIN MAIN 0 0.0 0.0 0 1
The syntax of a cost centre annotation is
- _scc_ "name" <expression>
+ {-# SCC "name" #-} <expression>
where "name" is an arbitrary string
that will become the name of your cost centre as it appears
in the profiling output, and
<expression> is any Haskell
- expression. An _scc_ annotation extends as
+ expression. An SCC annotation extends as
far to the right as possible when parsing.
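For instance, a minimal sketch (the module and cost-centre names here are invented for illustration):

```haskell
-- Illustrative only: compile with -prof for the annotation to take
-- effect; without it, the SCC is ignored and the program is unchanged.
module Main where

main :: IO ()
main = print sumSquares

sumSquares :: Integer
sumSquares = {-# SCC "sum_squares" #-} sum (map (\x -> x * x) [1 .. 1000])
-- The annotation extends as far right as possible: it covers the whole
-- expression  sum (map (\x -> x * x) [1 .. 1000]),  not just  sum.
```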
@@ -272,14 +277,14 @@ MAIN MAIN 0 0.0 0.0 0 1
If the expression is part of the
one-off costs of evaluating the
enclosing top-level definition, then costs are attributed to
- the stack of lexically enclosing _scc_
+ the stack of lexically enclosing SCC
annotations on top of the special CAF
cost-centre. Otherwise, costs are attributed to the stack of
- lexically-enclosing _scc_ annotations,
+ lexically-enclosing SCC annotations,
appended to the cost-centre stack in effect at the
call site of the current top-level
definitionThe call-site is just the place
@@ -287,6 +292,12 @@ MAIN MAIN 0 0.0 0.0 0 1
variable.. Notice that this is a recursive
definition.
+
+
+ Time spent in foreign code (see )
+ is always attributed to the cost centre in force at the
+ Haskell call-site of the foreign function.
+ What do we mean by one-off costs? Well, Haskell is a lazy
@@ -322,143 +333,80 @@ x = nfib 25
doesn't look like you expect it to, feel free to send it (and
your program) to us at
glasgow-haskell-bugs@haskell.org.
-
-
- Profiling memory usage
-
- In addition to profiling the time and allocation behaviour
- of your program, you can also generate a graph of its memory usage
- over time. This is useful for detecting the causes of
- space leaks, when your program holds on to
- more memory at run-time that it needs to. Space leaks lead to
- longer run-times due to heavy garbage collector ativity, and may
- even cause the program to run out of memory altogether.
-
- To generate a heap profile from your program, compile it as
- before, but this time run it with the runtime
- option. This generates a file
- <prog>.hp file, which you then process
- with hp2ps to produce a Postscript file
- <prog>.ps. The Postscript file can be
- viewed with something like ghostview, or
- printed out on a Postscript-compatible printer.
-
- For the RTS options that control the kind of heap profile
- generated, see . Details on the
- usage of the hp2ps program are given in
-
-
-
-
- Graphical time/allocation profile
-
- You can view the time and allocation profiling graph of your
- program graphically, using ghcprof. This is a
- new tool with GHC 4.07, and will eventually be the de-facto
- standard way of viewing GHC profiles.
-
- To run ghcprof, you need
- daVinci installed, which can be
- obtained from The Graph
- Visualisation Tool daVinci. Install one of
- the binary
- distributionsdaVinci is
- sadly not open-source :-(., and set your
- DAVINCIHOME environment variable to point to the
- installation directory.
-
- ghcprof uses an XML-based profiling log
- format, and you therefore need to run your program with a
- different option: . The file generated is
- still called <prog>.prof. To see the
- profile, run ghcprof like this:
-
-
-
-
-$ ghcprof <prog>.prof
-
-
- which should pop up a window showing the call-graph of your
- program in glorious detail. More information on using
- ghcprof can be found at The
- Cost-Centre Stack Profiling Tool for
- GHC.
-
-
-
Compiler options for profilingprofilingoptionsoptionsfor profiling
- To make use of the cost centre profiling system
- all modules must be compiled and linked with
- the option. Any
- _scc_ constructs you've put in
- your source will spring to life.
-
- -prof
-
- Without a option, your
- _scc_s are ignored; so you can
- compiled _scc_-laden code
- without changing it.
-
- There are a few other profiling-related compilation options.
- Use them in addition to
- . These do not have to be used consistently
- for all modules in a program.
-
-
- :
- -auto
+ :
+
+
+ To make use of the profiling system
+ all modules must be compiled and linked
+ with the option. Any
+ SCC annotations you've put in your source
+ will spring to life.
+
+ Without a option, your
+ SCCs are ignored; so you can compile
+ SCC-laden code without changing
+ it.
+
+
+
+
+ There are a few other profiling-related compilation options.
+ Use them in addition to
+ . These do not have to be used consistently
+ for all modules in a program.
+
+
+
+ :
+ cost centresautomatically inserting
- GHC will automatically add
+ GHC will automatically add
_scc_ constructs for all
- top-level, exported functions.
+ top-level, exported functions.
- :
- -auto-all
+ :
+
- All top-level functions,
+ All top-level functions,
exported or not, will be automatically
- _scc_'d.
+ _scc_'d.
- :
- -caf-all
+ :
+
- The costs of all CAFs in a module are usually
+ The costs of all CAFs in a module are usually
attributed to one “big” CAF cost-centre. With
this option, all CAFs get their own cost-centre. An
- “if all else fails” option…
+ “if all else fails” option…
- :
- -ignore-scc
+ :
+
- Ignore any _scc_
+ Ignore any _scc_
constructs, so a module which already has
_scc_s can be compiled
- for profiling with the annotations ignored.
+ for profiling with the annotations ignored.
@@ -466,47 +414,29 @@ $ ghcprof <prog>.prof
-
- Runtime options for profiling
-
- profiling RTS options
- RTS options, for profiling
+
+ Time and allocation profiling
- It isn't enough to compile your program for profiling with
- !
-
- When you run your profiled program, you
- must tell the runtime system (RTS) what you want to profile (e.g.,
- time and/or space), and how you wish the collected data to be
- reported. You also may wish to set the sampling interval used in
- time profiling.
-
- Executive summary: ./a.out +RTS -pT
- produces a time profile in a.out.prof;
- ./a.out +RTS -hC produces space-profiling info
- which can be mangled by hp2ps and viewed with
- ghostview (or equivalent).
-
- Profiling runtime flags are passed to your program between
- the usual and
- options.
+ To generate a time and allocation profile, give one of the
+ following RTS options to the compiled program when you run it (RTS
+ options should be enclosed between +RTS...-RTS
+ as usual):
-
or :time profile
- The option produces a standard
+ The option produces a standard
time profile report. It is written
into the file
- <program>.prof.
+ program.prof.
- The option produces a more
+ The option produces a more
detailed report containing the actual time and allocation
- data as well. (Not used much.)
+ data as well. (Not used much.)
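As a concrete sketch of the whole cycle (program and file names are invented):

```haskell
-- A minimal profiled program. Assuming the steps described above:
--
--   $ ghc -prof -o Main Main.hs     -- compile and link for profiling
--   $ ./Main +RTS -p -RTS           -- run; writes the report to Main.prof
--
module Main where

main :: IO ()
main = print (length (filter even [1 .. 100000 :: Int]))
```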
@@ -521,89 +451,471 @@ $ ghcprof <prog>.prof
- :
-
-
- Set the profiling (sampling) interval to
- <secs> seconds (the default is
- 1 second). Fractions are allowed: for example
- will get 5 samples per second. This
- only affects heap profiling; time profiles are always
- sampled on a 1/50 second frequency.
-
+
+ RTS
+ option
+
+ This option makes use of the extra information
+ maintained by the cost-centre-stack profiler to provide
+ useful information about the location of runtime errors.
+ See .
+
-
- :
-
- heap profile
-
- Produce a detailed heap profile
- of the heap occupied by live closures. The profile is
- written to the file <program>.hp
- from which a PostScript graph can be produced using
- hp2ps (see ).
-
- The heap space profile may be broken down by different
- criteria:
-
-
-
-
- :
-
- cost centre which produced the closure (the
- default).
-
-
-
-
- :
-
- cost centre module which produced the
- closure.
-
-
-
-
- :
-
- closure description—a string describing
- the closure.
-
-
-
-
- :
-
- closure type—a string describing the
- closure's type.
-
-
-
+
+
+
-
-
+
+ Profiling memory usage
-
- :
-
+ In addition to profiling the time and allocation behaviour
+ of your program, you can also generate a graph of its memory usage
+ over time. This is useful for detecting the causes of
+ space leaks, when your program holds on to
+ more memory at run-time than it needs to. Space leaks lead to
+ longer run-times due to heavy garbage collector activity, and may
+ even cause the program to run out of memory altogether.
+
+ To generate a heap profile from your program:
+
+
+
+ Compile the program for profiling ().
+
+
+ Run it with one of the heap profiling options described
+ below (eg. for a basic producer profile).
+ This generates the file
+ prog.hp.
+
+
+ Run hp2ps to produce a Postscript
+ file,
+ prog.ps. The
+ hp2ps utility is described in detail in
+ .
+
+
+ Display the heap profile using a PostScript viewer such
+ as Ghostview, or print it out on a
+ PostScript-capable printer.
+
+
+
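The four steps above can be sketched as follows (program name, and the choice of the cost-centre break-down option, are illustrative only):

```haskell
-- Hypothetical end-to-end heap-profiling session for a program Wib:
--
--   $ ghc -prof -o Wib Wib.hs       -- step 1: compile for profiling
--   $ ./Wib +RTS -hc -RTS           -- step 2: run; writes Wib.hp
--   $ hp2ps Wib.hp                  -- step 3: produces Wib.ps
--   $ ghostview Wib.ps              -- step 4: view the graph
--
module Main where

main :: IO ()
main = print (sum [1 .. 1000000 :: Integer])
```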
+
+ RTS options for heap profiling
+
+ There are several different kinds of heap profile that can
+ be generated. All the different profile types yield a graph of
+ live heap against time, but they differ in how the live heap is
+ broken down into bands. The following RTS options select which
+ break-down to use:
+
+
+
+
+ RTS
+ option
+
+ Breaks down the graph by the cost-centre stack which
+ produced the data.
+
+
+
+
+
+ RTS
+ option
+
+ Break down the live heap by the module containing
+ the code which produced the data.
+
+
+
+
+
+ RTS
+ option
+
+ Breaks down the graph by closure
+ description. For actual data, the description
+ is just the constructor name, for other closures it is a
+ compiler-generated string identifying the closure.
+
+
+
+
+
+ RTS
+ option
+
+ Breaks down the graph by
+ type. For closures which have
+ function type or unknown/polymorphic type, the string will
+ represent an approximation to the actual type.
+
+
+
+
+
+ RTS
+ option
+
+ Break down the graph by retainer
+ set. Retainer profiling is described in more
+ detail below ().
+
+
+
+
+
+ RTS
+ option
+
+ Break down the graph by
+ biography. Biographical profiling
+ is described in more detail below ().
+
+
+
+
+ In addition, the profile can be restricted to heap data
+ which satisfies certain criteria - for example, you might want
+ to display a profile by type but only for data produced by a
+ certain module, or a profile by retainer for a certain type of
+ data. Restrictions are specified as follows:
+
+
+
+ name,...
+ RTS
+ option
+
+ Restrict the profile to closures produced by
+ cost-centre stacks with one of the specified cost centres
+ at the top.
+
+
+
+
+ name,...
+ RTS
+ option
+
+ Restrict the profile to closures produced by
+ cost-centre stacks with one of the specified cost centres
+ anywhere in the stack.
+
+
+
+
+ module,...
+ RTS
+ option
+
+ Restrict the profile to closures produced by the
+ specified modules.
+
+
+
+
+ desc,...
+ RTS
+ option
+
+ Restrict the profile to closures with the specified
+ description strings.
+
+
+
+
+ type,...
+ RTS
+ option
+
+ Restrict the profile to closures with the specified
+ types.
+
+
+
+
+ cc,...
+ RTS
+ option
+
+ Restrict the profile to closures with retainer sets
+ containing cost-centre stacks with one of the specified
+ cost centres at the top.
+
+
+
+
+ bio,...
+ RTS
+ option
+
+ Restrict the profile to closures with one of the
+ specified biographies, where
+ bio is one of
+ lag, drag,
+ void, or use.
+
+
+
+
+ For example, the following options will generate a
+ retainer profile restricted to Branch and
+ Leaf constructors:
+
+
+prog +RTS -hr -hdBranch,Leaf
+
+
+ There can only be one "break-down" option
+ (eg. in the example above), but there is no
+ limit on the number of further restrictions that may be applied.
+ All the options may be combined, with one exception: GHC doesn't
+ currently support mixing the and
+ options.
+
+ There are two more options which relate to heap
+ profiling:
+
+
+
+ :
+
+
+ Set the profiling (sampling) interval to
+ secs seconds (the default is
+ 0.1 second). Fractions are allowed: for example
+ will get 5 samples per second.
+ This only affects heap profiling; time profiles are always
+ sampled on a 1/50 second frequency.
+
+
+
+
+
+ RTS option
+
+
+ Include the memory occupied by threads in a heap
+ profile. Each thread takes up a small area for its thread
+ state in addition to the space allocated for its stack
+ (stacks normally start small and then grow as
+ necessary).
+
+ This includes the main thread, so using
+ is a good way to see how much stack
+ space the program is using.
+
+ Memory occupied by threads and their stacks is
+ labelled as “TSO” when displaying the profile
+ by closure description or type description.
+
+
+
+
+
+
+
+ Retainer Profiling
+
+ Retainer profiling is designed to help answer questions
+ like why is this data being retained?. We start
+ by defining what we mean by a retainer:
+
+
+ A retainer is either the system stack, or an unevaluated
+ closure (thunk).
+
+
+ In particular, constructors are not
+ retainers.
+
+ An object A is retained by an object B if object A can be
+ reached by recursively following pointers starting from object
+ B but not meeting any other retainers on the way. Each object
+ has one or more retainers, collectively called its
+ retainer set.
+
+ When retainer profiling is requested by giving the program
+ the option, a graph is generated which is
+ broken down by retainer set. A retainer set is displayed as a
+ set of cost-centre stacks; because this is usually too large to
+ fit on the profile graph, each retainer set is numbered and
+ shown abbreviated on the graph along with its number, and the
+ full list of retainer sets is dumped into the file
+ prog.prof.
+
+ Retainer profiling requires multiple passes over the live
+ heap in order to discover the full retainer set for each
+ object, which can be quite slow. So we set a limit on the
+ maximum size of a retainer set, where all retainer sets larger
+ than the maximum retainer set size are replaced by the special
+ set MANY. The maximum set size defaults to 8
+ and can be altered with the RTS
+ option:
+
+
+
+ size
+
+ Restrict the number of elements in a retainer set to
+ size (default 8).
+
+
+
+
+
+ Hints for using retainer profiling
+
+ The definition of retainers is designed to reflect a
+ common cause of space leaks: a large structure is retained by
+ an unevaluated computation, and will be released once the
+ computation is forced. A good example is looking up a value in
+ a finite map, where, unless the lookup is forced in a timely
+ manner, the unevaluated lookup will cause the whole mapping to
+ be retained. These kinds of space leaks can often be
+ eliminated by forcing the relevant computations to be
+ performed eagerly, using seq or strictness
+ annotations on data constructor fields.
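A minimal sketch of this pattern, using an association list and invented names (the same idea applies to any finite-map lookup):

```haskell
-- Forcing a lookup promptly, so the structure it was drawn from
-- can be garbage-collected instead of being retained by a thunk.
lookupNow :: Eq k => k -> [(k, v)] -> IO (Maybe v)
lookupNow key xs = do
  let r = lookup key xs
  r `seq` return r   -- without the seq, the thunk for r retains all of xs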
+
+ Often a particular data structure is being retained by a
+ chain of unevaluated closures, only the nearest of which will
+ be reported by retainer profiling - for example A retains B, B
+ retains C, and C retains a large structure. There might be a
+ large number of Bs but only a single A, so A is really the one
+ we're interested in eliminating. However, retainer profiling
+ will in this case report B as the retainer of the large
+ structure. To move further up the chain of retainers, we can
+ ask for another retainer profile but this time restrict the
+ profile to B objects, so we get a profile of the retainers of
+ B:
+
+
+prog +RTS -hr -hcB
+
+
+ This trick isn't foolproof, because there might be other
+ B closures in the heap which aren't the retainers we are
+ interested in, but we've found this to be a useful technique
+ in most cases.
+
+
+
+
+ Biographical Profiling
+
+ A typical heap object may be in one of the following four
+ states at each point in its lifetime:
+
+
- The option generates heap
- profiling information in the XML format understood by our
- new profiling tool (NOTE: heap profiling with the new tool
- is not yet working! Use hp2ps-style heap
- profiling for the time being).
+ The lag stage, which is the
+ time between creation and the first use of the
+ object,
-
+
+ The use stage, which lasts from
+ the first use until the last use of the object, and
+
+
+ The drag stage, which lasts
+ from the final use until the last reference to the object
+ is dropped.
+
+
+ An object which is never used is said to be in the
+ void state for its whole
+ lifetime.
+
+
+
+ A biographical heap profile displays the portion of the
+ live heap in each of the four states listed above. Usually the
+ most interesting states are the void and drag states: live heap
+ in these states is more likely to be wasted space than heap in
+ the lag or use states.
+
+ It is also possible to break down the heap in one or more
+ of these states by a different criterion, by restricting a
+ profile by biography. For example, to show the portion of the
+ heap in the drag or void state by producer:
+
+
+prog +RTS -hc -hbdrag,void
+
+
+ Once you know the producer or the type of the heap in the
+ drag or void states, the next step is usually to find the
+ retainer(s):
+
+
+prog +RTS -hr -hccc...
+
+
+ NOTE: this two-stage process is required because GHC
+ cannot currently profile using both biographical and retainer
+ information simultaneously.
+
+
+
+
+
+
+
+
+
+ Graphical time/allocation profile
+
+ You can view the time and allocation profiling graph of your
+ program graphically, using ghcprof. This is a
+ new tool with GHC 4.08, and will eventually be the de-facto
+ standard way of viewing GHC profilesActually this
+ isn't true any more; we are working on a new tool for
+ displaying heap profiles using Gtk+HS, so
+ ghcprof may go away at some point in the future.
+
+
+ To run ghcprof, you need
+ daVinci installed, which can be
+ obtained from The Graph
+ Visualisation Tool daVinci. Install one of
+ the binary
+ distributionsdaVinci is
+ sadly not open-source :-(., and set your
+ DAVINCIHOME environment variable to point to the
+ installation directory.
+
+ ghcprof uses an XML-based profiling log
+ format, and you therefore need to run your program with a
+ different option: . The file generated is
+ still called <prog>.prof. To see the
+ profile, run ghcprof like this:
+
+
+
+
+$ ghcprof <prog>.prof
+
+
+ which should pop up a window showing the call-graph of your
+ program in glorious detail. More information on using
+ ghcprof can be found at The
+ Cost-Centre Stack Profiling Tool for
+ GHC.
-
-
- hp2ps--heap profile to PostScript
+ hp2ps––heap profile to PostScripthp2psheap profiles
@@ -777,6 +1089,132 @@ hp2ps [flags] [<file>[.hp]]
+
+
+
+ Manipulating the hp file
+
+(Notes kindly offered by Jan-Willem Maessen.)
+
+
+The FOO.hp file produced when you ask for the
+heap profile of a program FOO is a text file with a particularly
+simple structure. Here's a representative example, with much of the
+actual data omitted:
+
+JOB "FOO -hC"
+DATE "Thu Dec 26 18:17 2002"
+SAMPLE_UNIT "seconds"
+VALUE_UNIT "bytes"
+BEGIN_SAMPLE 0.00
+END_SAMPLE 0.00
+BEGIN_SAMPLE 15.07
+ ... sample data ...
+END_SAMPLE 15.07
+BEGIN_SAMPLE 30.23
+ ... sample data ...
+END_SAMPLE 30.23
+... etc.
+BEGIN_SAMPLE 11695.47
+END_SAMPLE 11695.47
+
+The first four lines (JOB, DATE, SAMPLE_UNIT, VALUE_UNIT) form a
+header. Each block of lines starting with BEGIN_SAMPLE and ending
+with END_SAMPLE forms a single sample (you can think of this as a
+vertical slice of your heap profile). The hp2ps utility should accept
+any input with a properly-formatted header followed by a series of
+*complete* samples.
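Because the format is so simple, it is easy to process mechanically. As a sketch (this function is not part of hp2ps; it relies only on the header/sample structure shown above):

```haskell
-- Pull the sample times out of a .hp file's contents.
import Data.List (isPrefixOf)

sampleTimes :: String -> [Double]
sampleTimes src =
  [ read t
  | l <- lines src
  , "BEGIN_SAMPLE" `isPrefixOf` l
  , [_, t] <- [words l]
  ]
```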
+
+
+
+
+ Zooming in on regions of your profile
+
+
+You can look at particular regions of your profile simply by loading a
+copy of the .hp file into a text editor and deleting the unwanted
+samples. The resulting .hp file can be run through hp2ps and viewed
+or printed.
+
+
+
+
+ Viewing the heap profile of a running program
+
+
+The .hp file is generated incrementally as your
+program runs. In principle, running hp2ps on the incomplete file
+should produce a snapshot of your program's heap usage. However, the
+last sample in the file may be incomplete, causing hp2ps to fail. If
+you are using a machine with UNIX utilities installed, it's not too
+hard to work around this problem (though the resulting command line
+looks rather Byzantine):
+
+ head -`fgrep -n END_SAMPLE FOO.hp | tail -1 | cut -d : -f 1` FOO.hp \
+ | hp2ps > FOO.ps
+
+
+The command fgrep -n END_SAMPLE FOO.hp finds the
+end of every complete sample in FOO.hp, and labels each sample with
+its ending line number. We then select the line number of the last
+complete sample using tail and cut. This is used as a
+parameter to head; the result is as if we deleted the final
+incomplete sample from FOO.hp. This results in a properly-formatted
+.hp file which we feed directly to hp2ps.
+
+
+
+ Viewing a heap profile in real time
+
+
+The gv and ghostview programs
+have a "watch file" option that can be used to view an up-to-date heap
+profile of your program as it runs. Simply generate an incremental
+heap profile as described in the previous section. Run gv on your
+profile:
+
+ gv -watch -seascape FOO.ps
+
+If you forget the -watch flag you can still select
+"Watch file" from the "State" menu. Now each time you generate a new
+profile FOO.ps the view will update automatically.
+
+
+
+This can all be encapsulated in a little script:
+
+ #!/bin/sh
+ head -`fgrep -n END_SAMPLE FOO.hp | tail -1 | cut -d : -f 1` FOO.hp \
+ | hp2ps > FOO.ps
+ gv -watch -seascape FOO.ps &
+ while [ 1 ] ; do
+ sleep 10 # We generate a new profile every 10 seconds.
+ head -`fgrep -n END_SAMPLE FOO.hp | tail -1 | cut -d : -f 1` FOO.hp \
+ | hp2ps > FOO.ps
+ done
+
+Occasionally gv will choke as it tries to read an incomplete copy of
+FOO.ps (because hp2ps is still running as an update
+occurs). A slightly more complicated script works around this
+problem, by using the fact that sending a SIGHUP to gv will cause it
+to re-read its input file:
+
+ #!/bin/sh
+ head -`fgrep -n END_SAMPLE FOO.hp | tail -1 | cut -d : -f 1` FOO.hp \
+ | hp2ps > FOO.ps
+ gv FOO.ps &
+ gvpsnum=$!
+ while [ 1 ] ; do
+ sleep 10
+ head -`fgrep -n END_SAMPLE FOO.hp | tail -1 | cut -d : -f 1` FOO.hp \
+ | hp2ps > FOO.ps
+ kill -HUP $gvpsnum
+ done
+
+
+
+
+