\documentstyle[11pt,literate]{article}
\title{The Glorious Haskell Compilation System\\ Profiling Guide}
\author{The GHC Team (Patrick M. Sansom)\\
Department of Computing Science\\
University of Glasgow\\
Email: glasgow-haskell-\{users,bugs\}-request\@dcs.gla.ac.uk}
\section[profiling]{Profiling Haskell programs}
\index{profiling, with cost-centres}
\index{cost-centre profiling}

Glasgow Haskell comes with a time and space profiling system. Its
purpose is to help you improve your understanding of your program's
execution behaviour, so you can improve it.

%This profiling system is still under development.
%Please e-mail reports of any bugs you discover to
%\tr{glasgow-haskell-bugs@dcs.gla.ac.uk}.

Any comments, suggestions and/or improvements you have are welcome.
Recommended ``profiling tricks'' would be especially cool!

\subsection[profiling-intro]{How to profile a Haskell program}

The GHC approach to profiling is very simple: annotate the expressions
you consider ``interesting'' with {\em cost centre} labels (strings);
so, for example, you might have:

\begin{verbatim}
f x y
  = let
        output1 = _scc_ "Pass1" ( pass1 x )
        output2 = _scc_ "Pass2" ( pass2 output1 y )
        output3 = _scc_ "Pass3" ( pass3 (output2 `zip` [1 .. ]) )
    in concat output3
\end{verbatim}

The costs of evaluating the expressions bound to \tr{output1},
\tr{output2} and \tr{output3} will be attributed to the ``cost
centres'' \tr{Pass1}, \tr{Pass2} and \tr{Pass3}, respectively.

The costs of evaluating other expressions, e.g., \tr{concat output3},
will be inherited by the scope which referenced the function \tr{f}.
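
As a self-contained sketch of the same idea: current GHC releases spell
the annotation as an \tr{SCC} pragma rather than \tr{_scc_}, and the
bodies of \tr{pass1}, \tr{pass2} and \tr{pass3} below are made-up
stand-ins, so treat this as an illustration under those assumptions
rather than the definitive form.

\begin{verbatim}
module Main (main) where

-- Hypothetical stand-ins for the three passes; they are not part of
-- GHC and exist only so that the example is complete.
pass1 :: Int -> [Int]
pass1 x = [1 .. x]

pass2 :: [Int] -> Int -> [Int]
pass2 xs y = map (* y) xs

pass3 :: [(Int, Int)] -> [[Int]]
pass3 = map (\(v, i) -> replicate i v)

-- Each let-bound expression is labelled with its own cost centre.
-- "concat output3" carries no annotation of its own, so its costs
-- go to whichever scope called f.
f :: Int -> Int -> [Int]
f x y =
  let output1 = {-# SCC "Pass1" #-} (pass1 x)
      output2 = {-# SCC "Pass2" #-} (pass2 output1 y)
      output3 = {-# SCC "Pass3" #-} (pass3 (output2 `zip` [1 ..]))
  in  concat output3

-- Assumption about a current GHC installation (not the flags described
-- in this guide): build with "ghc -prof Main.hs" and run with
-- "./Main +RTS -p" to get a time and allocation report.
main :: IO ()
main = print (f 3 10)   -- prints [10,20,20,30,30,30]
\end{verbatim}

Without profiling enabled the pragmas are simply ignored, so the
annotations can stay in the source permanently.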

You can put in cost-centres via \tr{_scc_} constructs by hand, as in
the example above. Perfectly cool. That's probably what you {\em
would} do if your program divided into obvious ``passes'' or
``phases'', or whatever.

If your program is large or you have no clue what might be gobbling
all the time, you can get GHC to mark all functions with \tr{_scc_}
constructs, automagically. Add an \tr{-auto} compilation flag to the
usual \tr{-prof} option.
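
Conceptually, automatic annotation resembles writing one cost centre
per function by hand, each named after the function it wraps. The
sketch below spells that out using the \tr{SCC} pragma of current GHC
releases; the function names are invented for illustration, and exactly
which functions the compiler chooses to annotate is an assumption here,
not something this sketch establishes.

\begin{verbatim}
module Main (main) where

-- Hypothetical functions from an otherwise unannotated module.
-- The annotations are what one would write by hand to get one cost
-- centre per function, named after that function.

wordCount :: String -> Int
wordCount s = {-# SCC "wordCount" #-} length (words s)

lineCount :: String -> Int
lineCount s = {-# SCC "lineCount" #-} length (lines s)

summary :: String -> (Int, Int)
summary s = {-# SCC "summary" #-} (lineCount s, wordCount s)

main :: IO ()
main = print (summary "one two\nthree four five\n")   -- prints (2,5)
\end{verbatim}

Letting the compiler do this saves the typing and gives a first,
coarse picture of where the costs fall.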

Once you start homing in on the Guilty Suspects, you may well switch
from automagically-inserted cost-centres to a few well-chosen ones of
your own.

To use profiling, you must {\em compile} and {\em run} with special
options. (We usually forget the ``run'' magic!---Do as we say, not as
we do...) Details follow.

If you're serious about this profiling game, you should probably read
one or more of the Sansom/Peyton Jones papers about the GHC profiling
system. Just visit the Glasgow FP Web page...

%************************************************************************
\subsection[prof-compiler-options]{Compiling programs for profiling}
\index{profiling options}
\index{options, for profiling}
%************************************************************************

\input{prof-compiler-options.lit}

%************************************************************************
\subsection[prof-rts-options]{How to control your profiled program at runtime}
\index{profiling RTS options}
\index{RTS options, for profiling}
%************************************************************************

\input{prof-rts-options.lit}

%************************************************************************
\subsection[prof-output]{What's in a profiling report?}
\index{profiling report, meaning thereof}
%************************************************************************

\input{prof-output.lit}

%************************************************************************
\subsection[prof-graphs]{Producing graphical heap profiles}
\index{heap profiles, producing}
%************************************************************************

\input{prof-post-processors.lit}

% \subsection[cost-centres]{Profiling by Cost Centres}
%
% Problems with lazy evaluation
%
% The central idea is to identify particular source code expressions of
% interest. These expressions are annotated with a {\em cost
% centre}\index{cost centre} label. Execution and allocation costs are
% attributed to the cost centre label which encloses the expression
% incurring the costs.
%
% (Note: the paper in \tr{ghc/docs/papers/profiling.ps} may have some
% decent examples...)
%
% Costs are attributed to one cost centre.
% Inheritance of un-profiled costs.
%
% Degree of evaluation
% Unevaluated arguments
% Optimisation and transformation
% Evaluation of instances
% escaping functions: evaluation vs lexical
%
% \subsection[prof-annotations]{Annotating your Haskell source}
%
% Explicit annotations
% Automatic annotation
%
% \subsection[prof-information]{Profiling information}
%
% Cost Centre Label,Module,Group
% Example time/alloc profile
%
% Description of heap profile
% Closure Description, Type and Kind
% \subsection[limitations]{Limitations of the current profiling system}
%
% There are a number of limitations and shortcomings of the current
% profiling system. Any comments on the impact of these and any
% suggested improvements would be greatly appreciated.
%
% \subsubsection*{Explicit \tr{_scc_} annotations}
%
% Explicit \tr{_scc_} annotations:
%
% The explicit \tr{_scc_} source annotations cannot annotate entire
% function declarations, because clauses and pattern matching are not
% part of the expression syntax --- they are syntactic sugar. It is
% possible to remove the syntactic sugar by hand, translating to a simple
% declaration with case expressions on the rhs, but this is very
% tedious.
%
% We propose to introduce an additional annotation to enable a \tr{_scc_}
% annotation to be placed around an entire declaration.
%
% To further ease the explicit annotation process we also propose to
% provide annotations which instruct the compiler to annotate all the
% declarations in a particular \tr{let} or \tr{where} clause with the
% name of the declaration.
%
% Other annotation schemes are feasible. Any suggestions / requests?
%
% \subsubsection*{Closure descriptions}
%
% Closure descriptions:
%
% The closure descriptions are by no means perfect ...
%
% The descriptions for expressions are somewhat tedious as they reflect
% some of the structure of the transformed STG code. This is largely to
% provide additional information so use of the STG code can be made if
% required (use the compiler option \tr{-ddump-stg}). This may be
% removed if the name of the \pl{corner} is considered sufficient.
%
% Local bindings introduced by the compiler have a name \tr{?<tag>}.
% Most of these are not related to the source in any meaningful way. For
% example, the \tr{?stg} names are introduced during the CoreToStg pass.
% Some other arbitrary compiler introduced names are: \tr{?ds},
% \tr{?tpl}, \tr{?si}, \tr{?cs}, \tr{?ll}, and \tr{?sat}. Please let us
% know if any of these turn out to be a problem. We could introduce a
% more meaningful naming scheme into the compiler which assigns names
% that reflect the nearest enclosing source binding. Another possibility
% is to add the unique identifier so they aren't all clumped together as
% one indistinguishable description.
%
% There is only one closure description and type for all black holes,
% ``BH''. It might be useful to record the closure that is currently
% being evaluated as part of the black hole description.
%
% Similarly there is only one partial application description, ``PAP''.
% It might be useful to record the function being applied in the partial
% application as part of the partial application description.
%
% \subsubsection*{Garbage collection and paging}
%
% Garbage collection and paging:
%
% Currently the profiling implementation requires the two-space
% (\tr{-gc-2s}) garbage collector to be used. When using the \tr{-prof}
% options a particular garbage collector should not be specified. This
% imposes particular paging characteristics which may be different from
% the garbage collector your program normally uses. These paging
% characteristics may distort the user time profiling results, though we
% do not believe this is a significant problem.
%
% \subsection[references]{Papers describing this work}
%
% Our initial ideas are described in the paper
% ``Profiling Lazy Functional Languages'' by Patrick Sansom and Simon
% Peyton Jones.
%
% It is in the GHC distribution in \tr{ghc/docs/papers/profiling.ps},
% or it can be retrieved using ftp from
% \tr{ftp.dcs.gla.ac.uk} (\tr{[130.209.240.50]})
% in
% \tr{pub/glasgow-fp/papers/lazy-profiling.ps}.

\begin{onlystandalone}