Using GHC
GHC is a command-line compiler: in order to compile a Haskell program,
GHC must be invoked on the source file(s) by typing a command to the
shell. The steps involved in compiling a program can be automated
using the make tool (this is especially useful if the program
consists of multiple source files which depend on each other). This
section describes how to use GHC from the command line.
Overall command-line structure
An invocation of GHC takes the following form:
ghc [argument...]
Command-line arguments are either options or file names.
Command-line options begin with -. They may not be
grouped: several options run together into one word are not the same as
those options given separately. Options need not
precede filenames: e.g., ghc *.o -o foo. All options are
processed and then applied to all files; you cannot, for example, invoke
ghc -c -O1 Foo.hs -O2 Bar.hs to apply different optimisation
levels to the files Foo.hs and Bar.hs. For conflicting
options, we reserve the right to do anything we
want. (Usually, the last one applies.)
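The rule that all options are collected first and then applied to every file can be pictured with a small sketch. This is only an illustration, not GHC's driver code: the name splitArgs is hypothetical, and the sketch ignores options that take a separate argument (such as -o foo).

```haskell
import Data.List (isPrefixOf, partition)

-- Hypothetical helper: separate arguments that look like options
-- (they begin with '-') from file names, keeping the order of each.
splitArgs :: [String] -> ([String], [FilePath])
splitArgs = partition ("-" `isPrefixOf`)

main :: IO ()
main = do
  -- Both -O1 and -O2 end up in the one option set that applies to
  -- Foo.hs and Bar.hs alike, which is why they conflict.
  let (opts, files) = splitArgs ["-c", "-O1", "Foo.hs", "-O2", "Bar.hs"]
  putStrLn ("options: " ++ unwords opts)
  putStrLn ("files:   " ++ unwords files)
```

Because the options are pooled, the position of -O2 relative to Bar.hs carries no meaning.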
Meaningful file suffixes
File names with “meaningful” suffixes (e.g., .lhs or .o)
cause the “right thing” to happen to those files.
.lhs:
A “literate Haskell” module.
.hs:
A not-so-literate Haskell module.
.hi:
A Haskell interface file, probably compiler-generated.
.hc:
Intermediate C file produced by the Haskell compiler.
.c:
A C file not produced by the Haskell compiler.
.s:
An assembly-language source file, usually
produced by the compiler.
.o:
An object file, produced by an assembler.
Files with other suffixes (or without suffixes) are passed straight
to the linker.
Help and verbosity options
-help, -?:
Cause GHC to spew a long usage message to standard
output and then exit.
-v:
The -v option makes GHC
verbose: it reports its version number
and shows (on stderr) exactly how it invokes each phase of
the compilation system. Moreover, it passes the -v
flag to most phases; each reports its
version number (and possibly some other information).
Please, oh please, use the -v option
when reporting bugs! Knowing that you ran the right bits in
the right order is always the first thing we want to
verify.
-vn:
To provide more control over the compiler's verbosity,
the -v flag takes an optional numeric
argument. Specifying -v on its own is
equivalent to -v3, and the other levels
have the following meanings:
-v0:
Disable all non-essential messages (this is the
default).
-v1:
Minimal verbosity: print one line per
compilation (this is the default when
certain other options are on).
-v2:
Print the name of each compilation phase as it
is executed.
-v3:
The same as -v2, except that in
addition the full command line (if appropriate) for
each compilation phase is also printed.
-v4:
The same as -v3 except that the
intermediate program representation after each
compilation phase is also printed (excluding
preprocessed and C/assembly files).
--version:
Print a one-line string including GHC's version number.
--numeric-version:
Print GHC's numeric version number only.
Running the right phases in the right order
The basic task of the ghc driver is to
run each input file through the right phases (compiling, linking,
etc.).
The first phase to run is determined by the input-file
suffix, and the last phase is determined by a flag. If no
relevant flag is present, then go all the way through linking.
This table summarises:
Phase of the compilation system   Suffix saying “start here”   Flag saying “stop after”   (suffix of) output file
literate pre-processor            .lhs                         -                          .hs
C pre-processor (opt.)            .hs (with -cpp)              -E                         .hspp
Haskell compiler                  .hs                          -C, -S                     .hc, .s
C compiler (opt.)                 .hc or .c                    -S                         .s
assembler                         .s                           -c                         .o
linker                            other                        -                          a.out
Thus, a common invocation would be: ghc -c Foo.hs
Note: What the Haskell compiler proper produces depends on whether a
native-code generator is used (producing assembly language) or not
(producing C).
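One way to read the phase table is as a pair of lookups: the suffix picks the first phase and a flag picks the last. The sketch below models that reading only; it is not how the ghc driver is written. The names Phase, startPhase, stopPhase and phasesToRun are hypothetical, optional phases are listed regardless of whether they would actually run, and -S is assumed to stop after the C compiler row.

```haskell
import Data.List (isSuffixOf)

-- Phases in the order the table lists them.
data Phase = Unlit | Cpp | HsCompile | CCompile | Assemble | Link
  deriving (Show, Eq, Ord, Enum)

-- First phase, chosen from the file suffix ("start here" column).
-- Files with unrecognised suffixes go straight to the linker.
startPhase :: FilePath -> Phase
startPhase file
  | ".lhs" `isSuffixOf` file = Unlit
  | ".hs"  `isSuffixOf` file = HsCompile  -- the C pre-processor row applies only with -cpp
  | ".hc"  `isSuffixOf` file = CCompile
  | ".c"   `isSuffixOf` file = CCompile
  | ".s"   `isSuffixOf` file = Assemble
  | otherwise                = Link

-- Last phase, chosen from the flags ("stop after" column).
-- With no relevant flag we go all the way through linking.
stopPhase :: [String] -> Phase
stopPhase flags
  | "-E" `elem` flags = Cpp
  | "-C" `elem` flags = HsCompile
  | "-S" `elem` flags = CCompile  -- assumption: model -S via the C compiler row
  | "-c" `elem` flags = Assemble
  | otherwise         = Link

phasesToRun :: FilePath -> [String] -> [Phase]
phasesToRun file flags = [startPhase file .. stopPhase flags]

main :: IO ()
main = do
  print (phasesToRun "Foo.hs" ["-c"])
  print (phasesToRun "Foo.lhs" ["-E"])
  print (phasesToRun "Foo.s" [])
```

For ghc -c Foo.hs the model yields the Haskell compiler, C compiler and assembler phases, matching the common invocation mentioned above.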
NOTE: the -E
option runs just the pre-processing passes
of the compiler, dumping the result in a file. Note that this
differs from all GHCs prior to version 4.11, in which the result
was dumped to the standard output. If used in conjunction with
-cpp, the output is the code blocks of the original (literate)
source after having put it through the grinder that is the C
pre-processor. Sans -cpp, the output is the
de-litted version of the original source.
The following options also affect which phases get run:
-cpp:
Run the C pre-processor on the Haskell source before
compiling it.
Native code generator:
Use GHC's native code generator rather than compiling
via C. This will compile faster (up to twice as fast), but
may produce code that is slightly slower than compiling via
C. This is the default when optimisation
is off (see Optimisation (code improvement) below).
-fvia-C:
Compile via C instead of using the native code
generator. This is the default for optimised compilations, and
on architectures for which GHC doesn't have a native code
generator.
Re-directing the compilation output(s)
-o:
GHC's compiled output normally goes into a
.hc, .o, etc.,
file, depending on the last-run compilation phase. The
-o foo
option re-directs the output of that
last-run phase to file foo.
Note: this “feature” can be
counterintuitive: ghc -C -o foo.o foo.hs
will put the intermediate C code in the file
foo.o, name notwithstanding!
-odir:
The -o option isn't of much use if
you have several input files…
Non-interface output files are normally put in the same
directory as their corresponding input file came from. You
may specify that they be put in another directory using the
-odir
<dir> option (the “Oh,
dear” option). For example:
% ghc -c parse/Foo.hs parse/Bar.hs gurgle/Bumble.hs -odir `arch`
The output files, Foo.o,
Bar.o, and
Bumble.o would be put into a
subdirectory named after the architecture of the executing
machine (sun4,
mips, etc). The directory must already
exist; it won't be created.
Note that the -odir option does
not affect where the interface files
are put. In the above example, they would still be put in
parse/Foo.hi,
parse/Bar.hi, and
gurgle/Bumble.hi.
-ohi:
-osuf <suffix>, -hisuf <suffix>:
EXOTICA: The -osuf
<suffix> option will change the
.o file suffix for object files to
whatever you specify. (We use this in compiling the
prelude.)
Similarly, the -hisuf
<suffix> option will change the
.hi file suffix for non-system
interface files.
The -hisuf/-osuf
game is useful if you want to compile a program with both
GHC and HBC (say) in the same directory. Let HBC use the
standard .hi/.o
suffixes; add the two options above to your make rule for
GHC compiling…
Keeping Intermediate Files
The following options are useful for keeping certain
intermediate files around, when normally GHC would throw these
away after compilation:
-keep-hc-files:
Keep intermediate .hc files when
doing .hs-to-.o
compilations via C (NOTE: .hc files
aren't generated when using the native code generator; you
may need to use -fvia-C to force them
to be produced).
-keep-s-files:
Keep intermediate .s files.
-keep-raw-s-files:
Keep intermediate .raw-s files.
These are the direct output from the C compiler, before
GHC does “assembly mangling” to produce the
.s file. Again, these are not produced
when using the native code generator.
-keep-tmp-files:
Instructs the GHC driver not to delete any of its
temporary files, which it normally keeps in
/tmp (or possibly elsewhere). Running GHC with
-v will show you what temporary files
were generated along the way.
Redirecting temporary files
-tmpdir:
If you have trouble because of running out of space
in /tmp (or wherever your
installation thinks temporary files should go), you may
use the -tmpdir
<dir> option to specify
an alternate directory. For example, -tmpdir . says to put temporary files in the current
working directory.
Alternatively, use your TMPDIR
environment variable. Set it to the
name of the directory where temporary files should be put.
GCC and other programs will honour the
TMPDIR variable as well.
Even better idea: Set the
DEFAULT_TMPDIR make variable when
building GHC, and never worry about
TMPDIR again (see the build
documentation).
Warnings and sanity-checking
GHC has a number of options that select which types of
non-fatal error messages, otherwise known as warnings, can be
generated during compilation. By default, you get a standard set
of warnings which are generally likely to indicate bugs in your
program. These include
-fwarn-duplicate-exports,
-fwarn-missing-fields,
-fwarn-missing-methods, and
-fwarn-overlapping-patterns, each of which is described below as on by default. The following flags are
simple ways to select standard “packages” of warnings:
-W:
Provides the standard warnings plus
a few more.
-w:
Turns off all warnings, including the standard ones.
-Wall:
Turns on all warning options.
The full set of warning options is described below. To turn
off any warning, simply give the corresponding
-fno-warn-... option on the command line.
-fwarn-deprecations:
Causes a warning to be emitted when a deprecated
function or type is used. Entities can be marked as
deprecated using a pragma.
-fwarn-duplicate-exports:
Have the compiler warn about duplicate entries in
export lists. This is useful information if you maintain
large export lists, and want to avoid the continued export
of a definition after you've deleted (one) mention of it in
the export list.
This option is on by default.
-fwarn-hi-shadowing:
Causes the compiler to emit a warning when a module or
interface file in the current directory is shadowing one
with the same module name in a library or other
directory.
-fwarn-incomplete-patterns:
The function
g below will fail when applied to
non-empty lists, so the compiler will emit a warning about
this when -fwarn-incomplete-patterns is
enabled.
g [] = 2
This option isn't enabled by default because it can be
a bit noisy, and it doesn't always indicate a bug in the
program. However, it's generally considered good practice
to cover all the cases in your functions.
-fwarn-missing-fields:
This option is on by default, and warns you whenever
the construction of a labelled field constructor isn't
complete, missing initializers for one or more fields. While
not an error (the missing fields are initialised with
bottoms), it is often an indication of a programmer error.
-fwarn-missing-methods:
This option is on by default, and warns you whenever
an instance declaration is missing one or more methods, and
the corresponding class declaration has no default
declaration for them.
-fwarn-missing-signatures:
If you would like GHC to check that every top-level
function/value has a type signature, use the
-fwarn-missing-signatures option. This
option is off by default.
-fwarn-name-shadowing:
This option causes a warning to be emitted whenever an
inner-scope value has the same name as an outer-scope value,
i.e. the inner value shadows the outer one. This can catch
typographical errors that turn into hard-to-find bugs, e.g.,
in the inadvertent cyclic definition let x = ... x
... in.
Consequently, this option will complain about cyclic recursive
definitions.
-fwarn-overlapping-patterns:
By default, the compiler will warn you if a set of
patterns are overlapping, e.g.,
f :: String -> Int
f [] = 0
f (_:xs) = 1
f "2" = 2
where the last pattern match in f
won't ever be reached, as the second pattern overlaps
it. More often than not, redundant patterns are a programmer
mistake, so this option is enabled by default.
A further option causes the compiler to warn about lambda-bound
patterns that can fail, e.g. \(x:xs)->....
Normally, these aren't treated as incomplete patterns by
-fwarn-incomplete-patterns.
-fwarn-type-defaults:
Have the compiler warn/inform you where in your source
the Haskell defaulting mechanism for numeric types kicks
in. This is useful information when converting code from a
context that assumed one default into one with another,
e.g., the `default default' for Haskell 1.4 caused the
otherwise unconstrained value 1 to be
given the type Int, whereas Haskell 98
defaults it to Integer. This may lead to
differences in performance and behaviour, hence the
usefulness of being non-silent about this.
This warning is off by default.
-fwarn-unused-binds:
Report any function definitions (and local bindings)
which are unused. For top-level functions, the warning is
only given if the binding is not exported.
-fwarn-unused-imports:
Report any objects that are explicitly imported but
never used.
-fwarn-unused-matches:
Report all unused variables which arise from pattern
matches, including patterns consisting of a single variable.
For instance f x y = [] would report
x and y as unused. To
eliminate the warning, all unused variables can be replaced
with wildcards.
If you're feeling really paranoid, the
-dcore-lint
option is a good choice. It turns on
heavyweight intra-pass sanity-checking within GHC. (It checks
GHC's sanity, not yours.)
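Several of the warnings above are easier to recognise in a complete module. The following is a minimal sketch that gathers the situations described for incomplete patterns, overlapping patterns, unused matches and name shadowing. The names gTotal, h and shadow, and the value returned by gTotal, are hypothetical choices made for illustration; which warnings are actually printed depends on the options given and on the compiler version.

```haskell
module Main (main) where

-- Incomplete patterns: g only handles the empty list, so applying it
-- to a non-empty list fails at run time.
g :: [Int] -> Int
g [] = 2

-- A total variant covers every case explicitly.
gTotal :: [Int] -> Int
gTotal []    = 2
gTotal (_:_) = 0  -- hypothetical result, chosen only for illustration

-- Overlapping patterns: the "2" clause comes before the general cons
-- pattern here, so it can be reached.
f :: String -> Int
f []    = 0
f "2"   = 2
f (_:_) = 1

-- Unused matches: the arguments are never used, so they are written
-- as wildcards rather than as named variables.
h :: a -> b -> [c]
h _ _ = []

-- Name shadowing: the inner n hides the parameter n, so the argument
-- passed to shadow is ignored.
shadow :: Int -> Int
shadow n = let n = 10 in n + 1

main :: IO ()
main = do
  print (g [])
  print (gTotal [1, 2, 3])
  print (map f ["", "2", "ab"])
  print (length (h 'a' True :: [Int]))
  print (shadow 5)
```

Swapping the last two clauses of f reproduces the overlapping case from the text, and renaming the wildcards in h to x and y reproduces the unused-matches case.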
Optimisation (code improvement)
The -O* options specify convenient
“packages” of optimisation flags; the
-f* options described later on specify
individual optimisations to be turned on/off;
the -m* options specify
machine-specific optimisations to be turned
on/off.
-O*: convenient “packages” of optimisation flags
There are many options that affect
the quality of code produced by GHC. Most people only have a
general goal, something like “Compile quickly” or
“Make my program run like greased lightning.” The
following “packages” of optimisations (or lack
thereof) should suffice.
Once you choose a
“package,” stick with it; don't chop and
change. Modules' interfaces will change
with a shift to a new -O* option, and you may
have to recompile a large chunk of all importing modules before
your program can again be run safely.
No -O*-type option specified:
This is taken to mean: “Please compile
quickly; I'm not over-bothered about compiled-code
quality.” So, for example: ghc -c
Foo.hs
-O0:
Means “turn off all optimisation”,
reverting to the same settings as if no -O
options had been specified. Saying -O0
can be useful if
e.g. make has inserted a -O
on the command line already.
-O or -O1:
Means: “Generate good-quality code without
taking too long about it.” Thus, for example:
ghc -c -O Main.lhs
-O2:
Means: “Apply every non-dangerous
optimisation, even if it means significantly longer
compile times.”
The avoided “dangerous” optimisations
are those that can make runtime or space
worse if you're unlucky. They are
normally turned on or off individually.
At the moment, -O2 is
unlikely to produce better code than
-O.
-O2-for-C:
Says to run GCC with -O2, which may
be worth a few percent in execution speed. Don't forget
-fvia-C, lest you use the native-code
generator and bypass GCC altogether!
-Ofile <file>:
(NOTE: not supported yet in GHC 5.x. Please ask if
you're interested in this.)
For those who need absolute
control over exactly what options are
used (e.g., compiler writers, sometimes :-), a list of
options can be put in a file and then slurped in with
-Ofile.
In that file, comments are of the
#-to-end-of-line variety; blank
lines and most whitespace are ignored.
Please ask if you are baffled and would like an
example of -Ofile!
We don't use a -O* flag for day-to-day
work. We use -O to get respectable speed;
e.g., when we want to measure something. When we want to go for
broke, we tend to use more aggressive settings (and
we go for lots of coffee breaks).
The easiest way to see what -O (etc.)
“really mean” is to run with -v,
then stand back in amazement.
-f*: platform-independent flags
These flags turn on and off individual optimisations.
They are normally set via the options
described above, and as such, you shouldn't need to set any of
them explicitly (indeed, doing so could lead to unexpected
results). However, there are one or two that may be of
interest:
One option allows excess precision: when it is given, intermediate floating
point values can have a greater
precision/range than the final type. Generally this is a
good thing, but some programs may rely on the exact
precision/range of
Float/Double values
and should not use this option for their compilation.
Another turns off the strictness analyser; sometimes it eats
too many cycles.
Another turns off the CPR (constructed product result)
analysis; it is somewhat experimental.
Strict constructor fields:
This option causes all constructor fields which are
marked strict (i.e. “!”) to be unboxed or
unpacked if possible. For example:
data T = T !Float !Float
will create a constructor T
containing two unboxed floats if this
flag is given.
This may not always be an optimisation: if the
T constructor is scrutinised and the
floats passed to a non-strict function for example, they
will have to be reboxed (this is done automatically by the
compiler).
This option should only be used in conjunction with
-O, in order to expose unfoldings to the
compiler so the reboxing can be removed as often as
possible. For example:
f :: T -> Float
f (T f1 f2) = f1 + f2
The compiler will avoid reboxing
f1 and f2 by
inlining + on floats, but only when
-O is on.
Any single-constructor data is eligible for
unpacking; for example
data T = T !(Int,Int)
will store the two Ints directly
in the T constructor, by flattening
the pair. Multi-level unpacking is also supported:
data T = T !S
data S = S !Int !Int
will store two unboxed Int#s
directly in the T constructor.
A further option switches on an experimental "optimisation".
Switching it on makes the compiler a little keener to
inline a function that returns a constructor, if the
context is that of a thunk.
x = plusInt a b
If we inlined plusInt we might get an opportunity to use
update-in-place for the thunk 'x'.
Controlling inlining and unfolding (three thresholds):
(Default: 30) By raising or lowering the first threshold,
you can raise or lower the amount of pragmatic junk that
gets spewed into interface files. (An unfolding has a
“size” that reflects the cost in terms of
“code bloat” of expanding that unfolding in
another module. A bigger function would be assigned a
bigger cost.)
(Default: 30) The second threshold is similar to
the first, except
that it governs unfoldings within a single module.
Increasing this figure is more likely to result in longer
compile times than faster code. The next option is more
useful:
(Default: 8) This is the magic cut-off figure for
unfolding: below this size, a function definition will be
unfolded at the call-site, any bigger and it won't. The
size computed for a function depends on two things: the
actual size of the expression minus any discounts that
apply.
Using Concurrent Haskell
GHC (as of version 4.00) supports Concurrent Haskell by default,
without requiring a special option or libraries compiled in a certain
way. To get access to the support libraries for Concurrent Haskell
(i.e. Concurrent and friends), you need to pass the relevant
option.
Three RTS options are provided for modifying the behaviour of the
threaded runtime system; see their descriptions under
RTS options for Concurrent/Parallel Haskell below.
Concurrent Haskell is described in more detail elsewhere in the documentation.
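As a rough illustration of what a Concurrent Haskell program looks like, here is a minimal sketch that forks one thread and waits for its result. It assumes the module names Control.Concurrent and Control.Concurrent.MVar from current GHC libraries, which differ from the Concurrent module named in this text; the worker function is hypothetical.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hypothetical piece of work for the forked thread.
worker :: Int -> Int
worker n = sum [1 .. n]

main :: IO ()
main = do
  done <- newEmptyMVar
  -- Fork a thread that computes a result and hands it back
  -- through the MVar.
  _ <- forkIO (putMVar done (worker 100))
  -- The main thread blocks here until the result is available.
  result <- takeMVar done
  print result
```

The MVar serves both as the communication channel and as the synchronisation point, so the main thread cannot exit before the forked thread has delivered its answer.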
Using Parallel Haskell
[You won't be able to execute parallel Haskell programs unless PVM3
(Parallel Virtual Machine, version 3) is installed at your site.]
To compile a Haskell program for parallel execution under PVM, use the
-parallel
option both when compiling and
linking. You will probably want to import
Parallel into your Haskell modules.
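To give a flavour of the kind of annotation a parallel program uses, the sketch below sparks one computation while evaluating another. It is written under a stated assumption: it imports par and pseq from GHC.Conc in current GHC libraries rather than from the Parallel module named in this text, and the names fib and parSum are hypothetical. Without a parallel runtime the annotations are only hints and the result is the same.

```haskell
import GHC.Conc (par, pseq)

-- A deliberately naive function, used only to have some work to do.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Spark the evaluation of x, evaluate y in the current thread,
-- then combine the two results.
parSum :: Int -> Int -> Integer
parSum a b = x `par` (y `pseq` (x + y))
  where
    x = fib a
    y = fib b

main :: IO ()
main = print (parSum 20 21)
```

The value computed does not depend on whether the spark is ever picked up by another processor; only the running time can change.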
To run your parallel program, once PVM is going, just invoke it
“as normal”. The main extra RTS option is
-N<N>, to say how many PVM
“processors” your program is to run on. (For more details of
all relevant RTS options, please see RTS options for Concurrent/Parallel Haskell below.)
In truth, running Parallel Haskell programs and getting information
out of them (e.g., parallelism profiles) is a battle with the vagaries of
PVM, detailed in the following sections.
Dummy's guide to using PVM
Before you can run a parallel program under PVM, you must set the
required environment variables (PVM's idea, not ours); something like,
probably in your .cshrc or equivalent:
setenv PVM_ROOT /wherever/you/put/it
setenv PVM_ARCH `$PVM_ROOT/lib/pvmgetarch`
setenv PVM_DPATH $PVM_ROOT/lib/pvmd
Creating and/or controlling your “parallel machine” is a purely-PVM
business; nothing specific to Parallel Haskell.
You use the pvm command to start PVM on your
machine. You can then do various things to control/monitor your
“parallel machine”; the most useful being:
Control-D        exit pvm, leaving it running
halt             kill off this “parallel machine” & exit
add <host>       add <host> as a processor
delete <host>    delete <host>
reset            kill what's going, but leave PVM up
conf             list the current configuration
ps               report processes' status
pstat <pid>      status of a particular process
The PVM documentation can tell you much, much more about pvm!
Parallelism profiles
With Parallel Haskell programs, we usually don't care about the
results, only about “how parallel” the run was! We want pretty pictures.
Parallelism profiles (à la hbcpp) can be generated with the
-q RTS option. The
per-processor profiling info is dumped into files named
<full-path><program>.gr. These are then munged into a PostScript picture,
which you can then display. For example, to run your program
a.out on 8 processors, then view the parallelism profile, do:
% ./a.out +RTS -N8 -q
% grs2gr *.???.gr > temp.gr # combine the 8 .gr files into one
% gr2ps -O temp.gr # cvt to .ps; output in temp.ps
% ghostview -seascape temp.ps # look at it!
The scripts for processing the parallelism profiles are distributed
in ghc/utils/parallel/.
Other useful info about running parallel programs
The “garbage-collection statistics” RTS options can be useful for
seeing what parallel programs are doing. If you use the
-Sstderr RTS option, for example, then
you'll get mutator, garbage-collection, etc., times on standard
error. The standard error of all PE's other than the `main thread'
appears in /tmp/pvml.nnn, courtesy of PVM.
Whether doing this or not, a handy way to watch
what's happening overall is: tail -f /tmp/pvml.nnn.
RTS options for Concurrent/Parallel Haskell
Besides the usual runtime system (RTS) options,
there are a few options particularly
for concurrent/parallel execution.
-N<N>:
(PARALLEL ONLY) Use <N> PVM processors to run this program;
the default is 2.
-C<s>:
Sets the context switch interval to <s> seconds.
A context switch will occur at the next heap block allocation after
the timer expires (a heap block allocation occurs every 4k of
allocation). With -C0 or -C,
context switches will occur as often as possible (at every heap block
allocation). By default, context switches occur every 20
milliseconds. Note that GHC's internal timer ticks every 20ms, and
the context switch timer is always a multiple of this timer, so 20ms
is the finest granularity available for timed context switches.
-q:
(PARALLEL ONLY) Produce a quasi-parallel profile of thread activity,
in the file <program>.qp. In the style of hbcpp, this profile
records the movement of threads between the green (runnable) and red
(blocked) queues. If you specify the verbose suboption, the
green queue is split into green (for the currently running thread
only) and amber (for other runnable threads). We do not recommend
that you use the verbose suboption if you are planning to use the
hbcpp profiling tools or if you are context switching at every heap
check (with -C).
-t<num>:
(PARALLEL ONLY) Limit the number of concurrent threads per processor
to <num>. The default is 32. Each thread requires slightly over 1K
words in the heap for thread state and stack objects. (For
32-bit machines, this translates to 4K bytes, and for 64-bit machines,
8K bytes.)
-d:
(PARALLEL ONLY) Turn on debugging. It pops up one xterm (or GDB, or
something…) per PVM processor. We use the standard debugger
script that comes with PVM3, but we sometimes meddle with the
debugger2 script. We include ours in the GHC distribution,
in ghc/utils/pvm/.
-e<num>:
(PARALLEL ONLY) Limit the number of pending sparks per processor to
<num>. The default is 100. A larger number may be appropriate if
your program generates large amounts of parallelism initially.
-Q<num>:
(PARALLEL ONLY) Set the size of packets transmitted between processors
to <num>. The default is 1024 words. A larger number may be
appropriate if your machine has a high communication cost relative to
computation speed.
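The per-thread memory figure given for the -t<num> option above can be turned into a quick estimate of how much heap the thread limit accounts for. The sketch below simply multiplies out the numbers from the text; it treats "slightly over 1K words" as exactly 1024 words, so the results are lower bounds, and the function names are hypothetical.

```haskell
-- Approximate heap bytes needed per thread: about 1K words (the text
-- says "slightly over"), times the word size in bytes.
threadBytes :: Int -> Int
threadBytes wordSizeBytes = 1024 * wordSizeBytes

-- Approximate heap bytes for a given number of threads per processor.
totalThreadBytes :: Int -> Int -> Int
totalThreadBytes wordSizeBytes numThreads =
  numThreads * threadBytes wordSizeBytes

main :: IO ()
main = do
  -- The default limit of 32 threads, on 32-bit and 64-bit machines.
  print (totalThreadBytes 4 32)
  print (totalThreadBytes 8 32)
```

With the default limit of 32 threads this comes to about 128K bytes per processor on a 32-bit machine and about 256K bytes on a 64-bit one.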
Platform-specific Flags
Some flags only make sense for particular target
platforms.
-mv8 (SPARC machines):
Means to pass the like-named
option to GCC; it says to use the Version 8 SPARC
instructions, notably integer multiply and divide. The
similar -m* GCC options for SPARC also
work, actually.
-mlong-calls (HPPA machines):
Means to pass the
like-named option to GCC. Required for Very Big modules,
maybe. (Probably means you're in trouble…)
-monly-N-regs (iX86 machines):
GHC tries to
“steal” four registers from GCC, for performance
reasons; it almost always works. However, when GCC is
compiling some modules with four stolen registers, it will
crash, probably saying:
Foo.hc:533: fixed or forbidden register was spilled.
This may be due to a compiler bug or to impossible asm
statements or clauses.
Just give some registers back with
-monly-N-regs. Try `3' first, then `2'.
If `2' doesn't work, please report the bug to us.