Debugging the compiler
HACKER TERRITORY. HACKER TERRITORY.
(You were warned.)
Replacing the program for one or more phases.
You may specify that a different program be used for one of the phases
of the compilation system, in place of whatever the driver ghc has
wired into it. For example, you might want to try a different
assembler. The -pgm<phase-code><program-name> option to ghc will
cause it to use <program-name> for phase <phase-code>, where the
codes to indicate the phases are:
code   phase
L      literate pre-processor
P      C pre-processor (if -cpp only)
C      Haskell compiler
c      C compiler
a      assembler
l      linker
dep    Makefile dependency generator
Forcing options to a particular phase.
The preceding sections describe driver options that are mostly
applicable to one particular phase. You may also force a
specific option <option> to be passed to a particular phase
<phase-code> by feeding the driver the option
-opt<phase-code><option>. The codes to indicate the phases are the
same as in the previous section.
So, for example, to force an -Ewurble option to the assembler, you
would tell the driver -opta-Ewurble (the dash before the E is
required).
Besides getting options to the Haskell compiler with -optC<blah>,
you can get options through to its runtime system with
-optCrts<blah>.
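The naming scheme for these two flag families is purely positional: the phase code follows -pgm or -opt directly, with no separator, and the rest of the flag is the program name or the forwarded option. A minimal Haskell sketch of that scheme follows. The names Phase, phaseCode, pgmFlag, optFlag and rtsFlag are hypothetical helpers for illustration only; they are not part of the driver.

```haskell
-- Hypothetical helpers (not part of GHC): build driver flags from the
-- phase codes listed in the table of phase codes.
data Phase
  = LiteratePreprocessor   -- code L
  | CPreprocessor          -- code P
  | HaskellCompiler        -- code C
  | CCompiler              -- code c
  | Assembler              -- code a
  | Linker                 -- code l
  | DependencyGenerator    -- code dep
  deriving (Eq, Show)

phaseCode :: Phase -> String
phaseCode LiteratePreprocessor = "L"
phaseCode CPreprocessor        = "P"
phaseCode HaskellCompiler      = "C"
phaseCode CCompiler            = "c"
phaseCode Assembler            = "a"
phaseCode Linker               = "l"
phaseCode DependencyGenerator  = "dep"

-- -pgm<phase-code><program-name>
pgmFlag :: Phase -> FilePath -> String
pgmFlag phase prog = "-pgm" ++ phaseCode phase ++ prog

-- -opt<phase-code><option>
optFlag :: Phase -> String -> String
optFlag phase opt = "-opt" ++ phaseCode phase ++ opt

-- Options for the Haskell compiler's runtime system: -optCrts<option>
rtsFlag :: String -> String
rtsFlag opt = optFlag HaskellCompiler ("rts" ++ opt)
```

Loaded into an interpreter, pgmFlag Assembler "/usr/local/bin/my-as" (a made-up path) evaluates to "-pgma/usr/local/bin/my-as", and rtsFlag "-PT" evaluates to "-optCrts-PT".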
So, for example: when I want to use my normal driver but with my
profiled compiler binary, I use this script:
#! /bin/sh
exec /local/grasp_tmp3/simonpj/ghc-BUILDS/working-alpha/ghc/driver/ghc \
-pgmC/local/grasp_tmp3/simonpj/ghc-BUILDS/working-hsc-prof/hsc \
-optCrts-i0.5 \
-optCrts-PT \
"$@"
Dumping out compiler intermediate structures
-noC:
Don't bother generating C output or an interface file. Usually
used in conjunction with one or more of the -ddump-* options; for
example: ghc -noC -ddump-simpl Foo.hs
-hi:
Do generate an interface file. This would normally be used in
conjunction with -noC, which turns off interface generation;
thus: -noC -hi.
-dshow-passes:
Prints a message to stderr as each pass starts. Gives a warm but
undoubtedly misleading feeling that GHC is telling you what's
happening.
-ddump-<pass>:
Make a debugging dump after pass <pass> (may be common enough to
need a short form…). You can get all of these at once (lots of
output) by using -ddump-all, or most of them with -ddump-most.
Some of the most useful ones are:
-ddump-parsed:
parser output
-ddump-rn:
renamer output
-ddump-minimal-imports:
Dump to the file "M.imports" (where M is the module being compiled)
a "minimal" set of import declarations. You can safely replace
all the import declarations in "M.hs" with those found in "M.imports".
Why would you want to do that? Because the "minimal" imports (a) import
everything explicitly, by name, and (b) import nothing that is not required.
It can be quite painful to maintain this property by hand, so this flag is
intended to reduce the labour.
-ddump-hi-diffs:
Dump to stdout a summary of the differences between the existing interface file (if any)
for this module, and the new one.
-ddump-tc:
typechecker output
-ddump-types:
Dump a type signature for each value defined at the top level
of the module. The list is sorted alphabetically.
Using -dppr-debug dumps a type signature for
all the imported and system-defined things as well; useful
for debugging the compiler.
-ddump-deriv:
derived instances
-ddump-ds:
desugarer output
-ddump-spec:
output of specialisation pass
-ddump-rules:
dumps all rewrite rules (including those generated by the specialisation pass)
-ddump-simpl:
simplifier output (Core-to-Core passes)
-ddump-usagesp:
UsageSP inference pre-inf and output
-ddump-cpranal:
CPR analyser output
-ddump-stranal:
strictness analyser output
-ddump-workwrap:
worker/wrapper split output
-ddump-occur-anal:
`occurrence analysis' output
-ddump-stg:
output of STG-to-STG passes
-ddump-absC:
unflattened Abstract C
-ddump-flatC:
flattened Abstract C
-ddump-realC:
same as what goes to the C compiler
-ddump-asm:
assembly language from the native-code generator
-dverbose-simpl and -dverbose-stg:
Show the output of the intermediate Core-to-Core and STG-to-STG
passes, respectively. (Lots of output!) So: when we're
really desperate:
% ghc -noC -O -ddump-simpl -dverbose-simpl -dcore-lint Foo.hs
-ddump-simpl-iterations:
Show the output of each iteration of the simplifier (each run of
the simplifier has a maximum number of iterations, normally 4). Used
when even -dverbose-simpl doesn't cut it.
-dppr-user and -dppr-debug:
Debugging output is in one of several “styles.” Take the printing
of types, for example. In the “user” style, the compiler's internal
ideas about types are presented in Haskell source-level syntax,
insofar as possible. In the “debug” style (which is the default for
debugging output), the types are printed with
explicit foralls, and variables have their unique-id attached (so you
can check for things that look the same but aren't).
-ddump-simpl-stats:
Dump statistics about how many of each kind
of transformation took place. If you add -dppr-debug you get more detailed information.
-ddump-raw-asm:
Dump out the assembly-language stuff, before the “mangler” gets it.
-ddump-rn-trace:
Make the renamer be *real* chatty about what it is up to.
-dshow-rn-stats:
Print out a summary of what kind of information the renamer had to bring
in.
-dshow-unused-imports:
Have the renamer report which imports do not contribute.
Checking for consistency
-dcore-lint:
Turn on heavyweight intra-pass sanity-checking within GHC, at Core
level. (It checks GHC's sanity, not yours.)
-dstg-lint:
Ditto for STG level.
-dusagesp-lint:
Turn on checks around UsageSP inference (-fusagesp). This verifies
various simple properties of the results of the inference, and also
warns if any identifier with a used-once annotation before the
inference has a used-many annotation afterwards; this could indicate a
non-worksafe transformation is being applied.
How to read Core syntax (from some -ddump-* flags)
Let's do this by commenting an example. It's from doing
-ddump-ds on this code:
skip2 m = m : skip2 (m+2)
Before we jump in, a word about names of things. Within GHC,
variables, type constructors, etc., are identified by their
“Uniques.” These are of the form `letter' plus `number' (both
loosely interpreted). The `letter' gives some idea of where the
Unique came from; e.g., _ means “built-in type variable”;
t means “from the typechecker”; s means “from the
simplifier”; and so on. The `number' is printed fairly compactly in
a `base-62' format, which everyone hates except me (WDP).
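For readers unfamiliar with this compact printing, here is a minimal sketch of base-62 conversion in Haskell. The digit alphabet (0-9, then A-Z, then a-z) is an assumption chosen for illustration, and the compiler's actual choice and ordering of digits may differ; digits62 and toBase62 are hypothetical names.

```haskell
-- A sketch of printing a non-negative number in base 62.
-- ASSUMPTION: the digit alphabet below (0-9, A-Z, a-z) is illustrative;
-- GHC's own choice and ordering of digits may differ.
digits62 :: String
digits62 = ['0'..'9'] ++ ['A'..'Z'] ++ ['a'..'z']

toBase62 :: Int -> String
toBase62 n
  | n < 0     = error "toBase62: negative input"
  | n < 62    = [digits62 !! n]                      -- a single digit
  | otherwise = toBase62 (n `div` 62)                -- leading digits
                  ++ [digits62 !! (n `mod` 62)]      -- last digit
```

With this alphabet, toBase62 61 is "z" and toBase62 62 is "10", so even large numbers take only a few characters.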
Remember, everything has a “Unique” and it is usually printed out
when debugging, in some form or another. So here we go…
Desugared:
Main.skip2{-r1L6-} :: _forall_ a$_4 =>{{Num a$_4}} -> a$_4 -> [a$_4]
--# `r1L6' is the Unique for Main.skip2;
--# `_4' is the Unique for the type-variable (template) `a'
--# `{{Num a$_4}}' is a dictionary argument
_NI_
--# `_NI_' means "no (pragmatic) information" yet; it will later
--# evolve into the GHC_PRAGMA info that goes into interface files.
Main.skip2{-r1L6-} =
    /\ _4 -> \ d.Num.t4Gt ->
        let {
          {- CoRec -}
          +.t4Hg :: _4 -> _4 -> _4
          _NI_
          +.t4Hg = (+{-r3JH-} _4) d.Num.t4Gt
          fromInt.t4GS :: Int{-2i-} -> _4
          _NI_
          fromInt.t4GS = (fromInt{-r3JX-} _4) d.Num.t4Gt
--# The `+' class method (Unique: r3JH) selects the addition code
--# from a `Num' dictionary (now an explicit lambda'd argument).
--# Because Core is 2nd-order lambda-calculus, type applications
--# and lambdas (/\) are explicit. So `+' is first applied to a
--# type (`_4'), then to a dictionary, yielding the actual addition
--# function that we will use subsequently...
--# We play the exact same game with the (non-standard) class method
--# `fromInt'. Unsurprisingly, the type `Int' is wired into the
--# compiler.
          lit.t4Hb :: _4
          _NI_
          lit.t4Hb =
              let {
                ds.d4Qz :: Int{-2i-}
                _NI_
                ds.d4Qz = I#! 2#
              } in fromInt.t4GS ds.d4Qz
--# `I# 2#' is just the literal Int `2'; it reflects the fact that
--# GHC defines `data Int = I# Int#', where Int# is the primitive
--# unboxed type. (see relevant info about unboxed types elsewhere...)
--# The `!' after `I#' indicates that this is a *saturated*
--# application of the `I#' data constructor (i.e., not partially
--# applied).
          skip2.t3Ja :: _4 -> [_4]
          _NI_
          skip2.t3Ja =
              \ m.r1H4 ->
                  let { ds.d4QQ :: [_4]
                        _NI_
                        ds.d4QQ =
                            let {
                              ds.d4QY :: _4
                              _NI_
                              ds.d4QY = +.t4Hg m.r1H4 lit.t4Hb
                            } in skip2.t3Ja ds.d4QY
                  } in
                  :! _4 m.r1H4 ds.d4QQ
          {- end CoRec -}
        } in skip2.t3Ja
(“It's just a simple functional language” is an unregisterised
trademark of Peyton Jones Enterprises, plc.)
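The dictionary passing shown in the dump can be mimicked in ordinary Haskell. The following is only a sketch of the idea: NumDict, its fields, intDict and skip2D are hypothetical names, the record carries just the two methods this example uses, and type abstraction and application stay implicit because they have no source-level syntax here.

```haskell
-- A hand-written analogue of the desugared skip2: the Num constraint
-- becomes an explicit record ("dictionary") argument.
-- All names here are hypothetical; this is not GHC's representation.
data NumDict a = NumDict
  { dictPlus    :: a -> a -> a   -- plays the role of the `+' method
  , dictFromInt :: Int -> a      -- plays the role of `fromInt'
  }

-- The dictionary for Int, built from ordinary Prelude operations.
intDict :: NumDict Int
intDict = NumDict { dictPlus = (+), dictFromInt = id }

-- Source: skip2 m = m : skip2 (m+2)
skip2D :: NumDict a -> a -> [a]
skip2D d = go
  where
    plus = dictPlus d           -- cf. +.t4Hg: method selected once from d
    lit  = dictFromInt d 2      -- cf. lit.t4Hb: the literal 2 via fromInt
    go m = m : go (plus m lit)  -- cf. skip2.t3Ja: the local recursive loop
```

For instance, take 4 (skip2D intDict 1) evaluates to [1,3,5,7]. As in the dump, the method selections and the literal are bound once, outside the recursive loop.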
Command line options in source files
Sometimes it is useful to make the connection between a source file
and the command-line options it requires quite tight. For instance,
if a (Glasgow) Haskell source file uses casms, the C back-end
often needs to be told about which header files to include. Rather than
maintaining the list of files the source depends on in a
Makefile (using the -#include command-line option), it is
possible to do this directly in the source file using the OPTIONS
pragma:
{-# OPTIONS -#include "foo.h" #-}
module X where
...
OPTIONS pragmas are only looked for at the top of your source
files, up to the first (non-literate, non-empty) line not containing
OPTIONS. Multiple OPTIONS pragmas are recognised. Note
that your command shell does not get to process the source file
options; they are just included literally in the array of command-line
arguments the compiler driver maintains internally, so you'll be
desperately disappointed if you try to glob etc. inside OPTIONS.
NOTE: the contents of OPTIONS are prepended to the command-line
options, so you do have the ability to override OPTIONS settings
via the command line.
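The two rules above (where OPTIONS pragmas are looked for, and how they combine with the command line) can be sketched in Haskell as follows. This is a simplified model, not the driver's code: it ignores literate sources, assumes each pragma sits alone on one line, and splits options on whitespace, so quoted arguments containing spaces are not handled. The names optionsPragmas, pragmaWords and effectiveArgs are hypothetical.

```haskell
import Data.Char (isSpace)
import Data.List (isInfixOf)

-- Collect options from OPTIONS pragmas at the top of a source file.
-- Scanning stops at the first non-empty line that does not contain OPTIONS.
-- SIMPLIFICATIONS (assumptions of this sketch): no literate-source
-- handling, and each pragma is assumed to sit alone on one line.
optionsPragmas :: String -> [String]
optionsPragmas src = concatMap pragmaWords topLines
  where
    nonEmpty = filter (not . all isSpace) (lines src)
    topLines = takeWhile ("OPTIONS" `isInfixOf`) nonEmpty

-- Strip the pragma delimiters and split what is left on whitespace.
pragmaWords :: String -> [String]
pragmaWords line =
  case words line of
    ("{-#" : "OPTIONS" : rest)
      | not (null rest), last rest == "#-}" -> init rest
    _ -> []

-- Pragma options are prepended, so command-line arguments come last.
effectiveArgs :: String -> [String] -> [String]
effectiveArgs src cmdline = optionsPragmas src ++ cmdline
```

For a file whose first line is the pragma from the example above, optionsPragmas returns the two words -#include and "foo.h" (quotes included, since no shell ever strips them), and effectiveArgs places them ahead of whatever was given on the command line.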
It is not recommended to move all the contents of your Makefiles into
your source files, but in some circumstances, the OPTIONS pragma
is the Right Thing. (If you use -keep-hc-file-too and have OPTIONS
flags in your module, the OPTIONS will get put into the generated .hc
file.)
Unregisterised compilation
The term "unregisterised" really means "compile via vanilla
C", disabling some of the platform-specific tricks that GHC
normally uses to make programs go faster. When compiling
unregisterised, GHC simply generates a C file which is compiled
via gcc.
Unregisterised compilation can be useful when porting GHC to
a new machine, since it reduces the prerequisite tools to
gcc, as, and ld and nothing more, and furthermore the amount
of platform-specific code that needs to be written in order to get
unregisterised compilation going is usually fairly small.
-unreg:
Compile via vanilla ANSI C only, turning off
platform-specific optimisations. NOTE: in order to use
-unreg, you need to have a set of libraries
(including the RTS) built for unregisterised compilation.
This amounts to building GHC with way "u" enabled.