Using GHC
GHC is a command-line compiler: in order to compile a Haskell program,
GHC must be invoked on the source file(s) by typing a command to the
shell. The steps involved in compiling a program can be automated
using the make tool (this is especially useful if the program
consists of multiple source files which depend on each other). This
section describes how to use GHC from the command-line.
Overall command-line structure
An invocation of GHC takes the following form:
ghc [argument...]
Command-line arguments are either options or file names.
Command-line options begin with -. They may not be
grouped; each option must be given as a separate argument. Options need not
precede filenames: e.g., ghc *.o -o foo. All options are
processed and then applied to all files; you cannot, for example, invoke
ghc -c -O1 Foo.hs -O2 Bar.hs to apply different optimisation
levels to the files Foo.hs and Bar.hs. For conflicting
options, we reserve the right to do anything we
want. (Usually, the last one applies.)
Meaningful file suffixes
File names with “meaningful” suffixes (e.g., .lhs or .o)
cause the “right thing” to happen to those files.
.lhs:
A “literate Haskell” module.
.hs:
A not-so-literate Haskell module.
.hi:
A Haskell interface file, probably compiler-generated.
.hc:
Intermediate C file produced by the Haskell compiler.
.c:
A C file not produced by the Haskell compiler.
.s:
An assembly-language source file, usually
produced by the compiler.
.o:
An object file, produced by an assembler.
Files with other suffixes (or without suffixes) are passed straight
to the linker.
Help and verbosity options
A good option to start with is the -help (or -?) option.
GHC spews a long message to standard output and then exits.
The -v option makes GHC verbose: it
reports its version number and shows (on stderr) exactly how it invokes each
phase of the compilation system. Moreover, it passes
the -v flag to most phases; each reports
its version number (and possibly some other information).
Please, oh please, use the -v option when reporting bugs!
Knowing that you ran the right bits in the right order is always the
first thing we want to verify.
If you're just interested in the compiler version number, the
--version option prints out a
one-line string containing the requested info.
Running the right phases in the right order
The basic task of the ghc driver is to run each input file
through the right phases (compiling, linking, etc.).
The first phase to run is determined by the input-file suffix, and the
last phase is determined by a flag. If no relevant flag is present,
then go all the way through linking. This table summarises:
Phase of the compilation system   Suffix saying “start here”   Flag saying “stop after”   (suffix of) output file
literate pre-processor            .lhs                         -                          -
C pre-processor (opt.)            -                            -                          -
Haskell compiler                  .hs                          -C, -S                     .hc, .s
C compiler (opt.)                 .hc or .c                    -S                         .s
assembler                         .s                           -c                         .o
linker                            other                        -                          a.out
Thus, a common invocation would be: ghc -c Foo.hs
Note: What the Haskell compiler proper produces depends on whether a
native-code generator is used (producing assembly language) or not
(producing C).
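The way the suffix and the flag together pick a range of phases can be sketched in Haskell. This is only a toy model of the table above, not the driver's actual code: the Phase type and the functions startPhase, stopPhase and phasesFor are names invented for this illustration, the optional C pre-processor is left out, and the via-C route is assumed (so -S stops after the C compiler).

```haskell
module Main where

import Data.List (isSuffixOf)

-- Phases in the order the driver runs them (hypothetical type; the
-- optional C pre-processor is left out of this sketch).
data Phase = Unlit | HaskellCompiler | CCompiler | Assembler | Linker
  deriving (Show, Eq, Ord, Enum)

-- The input-file suffix selects the first phase; anything unrecognised
-- is handed straight to the linker.
startPhase :: FilePath -> Phase
startPhase file
  | ".lhs" `isSuffixOf` file = Unlit
  | ".hs"  `isSuffixOf` file = HaskellCompiler
  | ".hc"  `isSuffixOf` file = CCompiler
  | ".c"   `isSuffixOf` file = CCompiler
  | ".s"   `isSuffixOf` file = Assembler
  | otherwise                = Linker

-- A flag selects the last phase; with no such flag we go through linking.
stopPhase :: [String] -> Phase
stopPhase flags
  | "-C" `elem` flags = HaskellCompiler
  | "-S" `elem` flags = CCompiler
  | "-c" `elem` flags = Assembler
  | otherwise         = Linker

-- All phases from the first to the last, inclusive.
phasesFor :: [String] -> FilePath -> [Phase]
phasesFor flags file = [startPhase file .. stopPhase flags]

main :: IO ()
main = do
  print (phasesFor ["-c"] "Foo.hs")
  print (phasesFor [] "Main.lhs")
```

Under these assumptions, ghc -c Foo.hs corresponds to the first call: the Haskell compiler, the C compiler and the assembler run, and the linker does not.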
The option -cpp must be given for the C
pre-processor phase to be run, that is, the pre-processor will be run
over your Haskell source file before continuing.
The option -E runs just the pre-processing
passes of the compiler, outputting the result on stdout before
stopping. If used in conjunction with -cpp, the output is the
code blocks of the original (literate) source after having put it
through the grinder that is the C pre-processor. Sans -cpp, the
output is the de-litted version of the original source.
The option -optcpp-E runs just the
pre-processing stage of the C-compiling phase, sending the result to
stdout. (For debugging or obfuscation contests, usually.)
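As a small illustration of what the C pre-processor phase does to a Haskell source file, here is a module with a conditional definition. DEBUG is a hypothetical macro chosen for this example. The LANGUAGE CPP pragma on the first line is an assumption that only applies to newer GHC releases, where it switches the pre-processor on for one file; with the compiler described here the -cpp option does that job.

```haskell
{-# LANGUAGE CPP #-}
module Main where

-- DEBUG is a hypothetical macro; it is undefined unless the build
-- defines it, so the #else branch is the one that survives.
greeting :: String
#ifdef DEBUG
greeting = "debug build"
#else
greeting = "release build"
#endif

main :: IO ()
main = putStrLn greeting
```

Running the pre-processing passes alone with -E on such a file would show the source with one of the two definitions removed.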
Re-directing the compilation output(s)
GHC's compiled output normally goes into a .hc, .o, etc., file,
depending on the last-run compilation phase. The option -o foo re-directs the output of that last-run
phase to file foo.
Note: this “feature” can be counterintuitive:
ghc -C -o foo.o foo.hs will put the intermediate C code in the
file foo.o, name notwithstanding!
EXOTICA: But the -o option isn't of much use if you have
several input files… Non-interface output files are
normally put in the same directory as their corresponding input file
came from. You may specify that they be put in another directory
using the -odir <dir> option (the
“Oh, dear” option). For example:
% ghc -c parse/Foo.hs parse/Bar.hs gurgle/Bumble.hs -odir `arch`
The output files, Foo.o, Bar.o, and Bumble.o would be
put into a subdirectory named after the architecture of the executing
machine (sun4, mips, etc). The directory must already
exist; it won't be created.
Note that the -odir option does not affect where the
interface files are put. In the above example, they would still be
put in parse/Foo.hi, parse/Bar.hi, and gurgle/Bumble.hi.
MORE EXOTICA: The -osuf <suffix>
option will change the .o file suffix for object files to
whatever you specify. (We use this in compiling the prelude.)
Similarly, the -hisuf <suffix>
option will change the .hi file suffix for non-system
interface files.
The -hisuf/-osuf game is useful if you want to compile a program
with both GHC and HBC (say) in the same directory. Let HBC use the
standard .hi/.o suffixes; add -hisuf and -osuf options (with suffixes of your choosing) to your
make rule for GHC compiling…
Keeping Intermediate Files
The following options are useful for keeping certain
intermediate files around, when normally GHC would throw these
away after compilation:
-keep-hc-files:
Keep intermediate .hc files when
doing .hs-to-.o
compilations via C. (NOTE: .hc files
aren't generated when using the native code generator; you
may need to use -fvia-C to force them
to be produced.)
-keep-s-files:
Keep intermediate .s files.
-keep-raw-s-files:
Keep intermediate .raw-s files.
These are the direct output from the C compiler, before
GHC does “assembly mangling” to produce the
.s file. Again, these are not produced
when using the native code generator.
-keep-tmp-files:
Instructs the GHC driver not to delete any of its
temporary files, which it normally keeps in
/tmp (or possibly elsewhere). Running GHC with
-v will show you what temporary files
were generated along the way.
Saving GHC's standard error output
Sometimes, you may cause GHC to be rather chatty on standard error;
with -v, for example. You can instruct GHC to append this
output to a particular log file with a -odump
<blah> option.
Redirecting temporary files
If you have trouble because of running out of space in
/tmp (or wherever your installation thinks
temporary files should go), you may use the -tmpdir <dir>
option to specify an alternate directory.
For example, -tmpdir . says to put temporary files in
the current working directory.
Alternatively, use your TMPDIR environment
variable. Set it to the name of the directory
where temporary files should be put. GCC and other programs will
honour the TMPDIR variable as well.
Even better idea: Set the TMPDIR variable when building GHC, and
never worry about TMPDIR again. (See the build documentation.)
Warnings and sanity-checking
GHC has a number of options that select which types of non-fatal error
messages, otherwise known as warnings, can be generated during
compilation. By default, you get a standard set of warnings which are
generally likely to indicate bugs in your program; the descriptions
below note which options are on by default. The following flags are simple ways to
select standard “packages” of warnings:
-Wnot:
Turns off all warnings, including the standard ones.
-w:
Synonym for -Wnot.
-W:
Provides the standard warnings plus some additional ones.
-Wall:
Turns on all warning options.
The full set of warning options is described below. To turn off any
warning, simply give the corresponding -fno-warn-... option on
the command line.
-fwarn-name-shadowing:
This option causes a warning to be emitted whenever an inner-scope
value has the same name as an outer-scope value, i.e. the inner value
shadows the outer one. This can catch typographical errors that turn
into hard-to-find bugs, e.g., in the inadvertent cyclic definition
let x = ... x ... in.
Consequently, the warning is also emitted for a deliberately cyclic
definition whose name shadows an outer one.
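A minimal sketch of the kind of code this warning is aimed at. The function names scaled and scaled' are made up for the example; the first definition shadows its own parameter, the second renames the inner binding.

```haskell
module Main where

-- The let-bound x shadows the parameter x. Because let is recursive,
-- writing `let x = x + 1` here would define x in terms of itself and
-- loop, rather than incrementing the parameter.
scaled :: Int -> Int
scaled x =
  let x = 2        -- inner x: the shadowing warning points here
  in x * 10

-- Renaming the inner binding removes the warning and the ambiguity.
scaled' :: Int -> Int
scaled' x =
  let y = x + 1
  in y * 10

main :: IO ()
main = do
  print (scaled 7)    -- the parameter is ignored
  print (scaled' 7)
```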
-fwarn-overlapping-patterns:
By default, the compiler will warn you if a set of patterns is
overlapping, e.g.,
f :: String -> Int
f [] = 0
f (_:xs) = 1
f "2" = 2
where the last pattern match in f won't ever be reached, as the
second pattern overlaps it. More often than not, redundant patterns
are a programmer mistake, so this option is enabled by default.
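One way to repair the example, assuming the intent was for the string "2" to be treated specially, is to put the more specific pattern before the general one. This is a sketch of that reordering, not the only possible fix.

```haskell
module Main where

-- The specific pattern now comes before the general one, so every
-- equation can be reached and no overlap is reported.
f :: String -> Int
f []    = 0
f "2"   = 2
f (_:_) = 1

main :: IO ()
main = print (map f ["", "2", "abc"])
```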
-fwarn-incomplete-patterns:
Similarly for incomplete patterns, the function g below will fail
when applied to non-empty lists, so the compiler will emit a warning
about this when -fwarn-incomplete-patterns is enabled.
g [] = 2
This option isn't enabled by default because it can be a bit noisy,
and it doesn't always indicate a bug in the program. However, it's
generally considered good practice to cover all the cases in your
functions.
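For comparison, here is the incomplete g next to a version that covers every case. The name g' and the result chosen for the non-empty case are assumptions made for the illustration.

```haskell
module Main where

-- Incomplete: applying g to a non-empty list fails at run time.
g :: [a] -> Int
g [] = 2

-- Complete: both constructors of the list type are covered.
g' :: [a] -> Int
g' []    = 2
g' (_:_) = 0   -- hypothetical result for the non-empty case

main :: IO ()
main = do
  print (g ([] :: [Int]))
  print (g' "abc")
```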
-fwarn-missing-methods:
This option is on by default, and warns you whenever an instance
declaration is missing one or more methods, and the corresponding
class declaration has no default declaration for them.
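A small example of an instance that would be reported. The class Shape and the type Square are hypothetical names; perimeter has no default in the class and is left out of the instance.

```haskell
module Main where

class Shape a where
  area      :: a -> Double
  perimeter :: a -> Double   -- no default declaration

data Square = Square Double

-- perimeter is missing here, which is what the warning reports.
-- Calling perimeter on a Square would fail at run time.
instance Shape Square where
  area (Square s) = s * s

main :: IO ()
main = print (area (Square 3))
```

The program still compiles and runs as long as the missing method is never called, which is why this is a warning and not an error.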
-fwarn-missing-fields:
This option is on by default, and warns you whenever the construction
of a labelled field constructor isn't complete, missing initializers
for one or more fields. While not an error (the missing fields are
initialised with bottoms), it is often an indication of a programmer
error.
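The following sketch shows a record construction with a missing initialiser. The Server type and its fields are invented for the example.

```haskell
module Main where

data Server = Server { host :: String, port :: Int }

-- port has no initialiser, so it is bottom; this construction is what
-- the missing-fields warning reports.
partial :: Server
partial = Server { host = "localhost" }

main :: IO ()
main = putStrLn (host partial)   -- fine: port is never demanded
```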
-fwarn-unused-imports:
Report any objects that are explicitly imported but never used.
-fwarn-unused-binds:
Report any function definitions (and local bindings) which are unused.
For top-level functions, the warning is only given if the binding is
not exported.
-fwarn-unused-matches:
Report all unused variables which arise from pattern matches,
including patterns consisting of a single variable. For instance f x
y = [] would report x and y as unused. To eliminate the warning,
all unused variables can be replaced with wildcards.
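The example from the text, written out in full next to its wildcard version. The type signatures and the name f' are additions made so that the sketch compiles on its own.

```haskell
module Main where

-- x and y are bound but never used: both are reported as unused.
f :: a -> b -> [c]
f x y = []

-- Wildcards state the intent and silence the warning.
f' :: a -> b -> [c]
f' _ _ = []

main :: IO ()
main = print (length (f 'a' True ++ f' 'b' False :: [Int]))
```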
-fwarn-duplicate-exports:
Have the compiler warn about duplicate entries in export lists. This
is useful information if you maintain large export lists, and want to
avoid the continued export of a definition after you've deleted (one)
mention of it in the export list.
This option is on by default.
-fwarn-type-defaults:
Have the compiler warn/inform you where in your source the Haskell
defaulting mechanism for numeric types kicks in. This is useful
information when converting code from a context that assumed one
default into one with another, e.g., the `default default' for Haskell
1.4 caused the otherwise unconstrained value 1 to be given
the type Int, whereas Haskell 98 defaults it to
Integer. This may lead to differences in performance and
behaviour, hence the usefulness of being non-silent about this.
This warning is off by default.
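A short sketch of why the choice of default matters. The binding big deliberately has no type signature, so the Haskell 98 defaulting rules pick Integer for it; wrapped forces Int instead. Both names are invented for the example, and the value of wrapped depends on the width of Int on the machine in question.

```haskell
module Main where

-- No signature constrains the literals, so defaulting picks Integer
-- and the result is exact. This is where the warning would fire.
big = 2 ^ 64

-- With Int the same expression overflows; the value printed depends
-- on how wide Int is in the implementation.
wrapped :: Int
wrapped = 2 ^ 64

main :: IO ()
main = do
  print big
  print wrapped
```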
-fwarn-missing-signatures:
If you would like GHC to check that every top-level function/value has
a type signature, use the -fwarn-missing-signatures option. This
option is off by default.
If you're feeling really paranoid, the
-dcore-lint option is a good choice. It turns on
heavyweight intra-pass sanity-checking within GHC. (It checks GHC's
sanity, not yours.)
Separate compilation
This section describes how GHC supports separate compilation.
Interface files
When GHC compiles a source file F which contains a module A, say,
it generates an object F.o, and a companion interface
file A.hi. The interface file is not intended for human
consumption, as you'll see if you take a look at one. It's merely
there to help the compiler compile other modules in the same program.
NOTE: Having the name of the interface file follow the module name and
not the file name means that working with tools such as make
becomes harder. make implicitly assumes that any output files
produced by processing a translation unit will have file names that
can be derived from the file name of the translation unit. For
instance, pattern rules become unusable. For this reason, we
recommend you stick to using the same file name as the module name.
The interface file for A contains information needed by the compiler
when it compiles any module B that imports A, whether directly or
indirectly. When compiling B, GHC will read A.hi to find the
details that it needs to know about things defined in A.
Furthermore, when compiling module C which imports B, GHC may
decide that it needs to know something about A—for example, B
might export a function that involves a type defined in A. In this
case, GHC will go and read A.hi even though C does not explicitly
import A at all.
The interface file may contain all sorts of things that aren't
explicitly exported from A by the programmer. For example, even
though a data type is exported abstractly, A.hi will contain the
full data type definition. For small function definitions, A.hi
will contain the complete definition of the function. For bigger
functions, A.hi will contain strictness information about the
function. And so on. GHC puts much more information into .hi files
when optimisation is turned on with the -O flag. Without -O it
puts in just the minimum; with -O it lobs in a whole pile of stuff.
A.hi should really be thought of as a compiler-readable version of
A.o. If you use a .hi file that wasn't generated by the same
compilation run that generates the .o file, the compiler may assume
all sorts of incorrect things about A, resulting in core dumps and
other unpleasant happenings.
Finding interface files
In your program, you import a module Foo by saying
import Foo. GHC goes looking for an interface file, Foo.hi.
It has a builtin list of directories (notably including .) where
it looks.
-i<dirs>:
This flag
prepends a colon-separated list of dirs to the “import
directories” list.
Note that it matters whether the pathnames in the list are
relative or absolute.
Giving -i with no directories resets the “import directories” list back to nothing.
-fno-implicit-prelude:
GHC normally imports Prelude.hi files for you. If you'd rather it
didn't, then give it a -fno-implicit-prelude option. You are
unlikely to get very far without a Prelude, but, hey, it's a free
country.
-I<dir>:
Once a Haskell module has been compiled to C (.hc file), you may
wish to specify where GHC tells the C compiler to look for .h files.
(Or, if you are using the -cpp option, where
it tells the C pre-processor to look…) For this purpose, use a
-I option in the usual C-ish way.
Other options related to interface files
The interface output may be directed to another file
bar2/Wurble.iface with the option -ohi
<file> (not recommended).
To avoid generating an interface file at all, use a
-nohi option.
The compiler does not overwrite an existing .hi interface file if
the new one is byte-for-byte the same as the old one; this is friendly
to make. When an interface does change, it is often enlightening to
be informed. The -hi-diffs option will
make GHC run diff on the old and new .hi files. You can also
record the difference in the interface file itself; the
-keep-hi-diffs option takes care of that.
The .hi files from GHC contain “usage” information which changes
often and uninterestingly. If you really want to see these changes
reported, you need to use the
-hi-diffs-with-usages option.
Interface files are normally jammed full of compiler-produced
pragmas, which record arities, strictness info, etc. If you
think these pragmas are messing you up (or you are doing some kind of
weird experiment), you can tell GHC to ignore them with the
-fignore-interface-pragmas option.
When compiling without optimisations on, the compiler is extra-careful
about not slurping in data constructors and instance declarations that
it will not need. If you believe it is getting it wrong and not
importing stuff which you think it should, this optimisation can be
turned off with -fno-prune-tydecls and -fno-prune-instdecls.
See also the section describing how the linker finds standard
Haskell libraries.
The recompilation checker
Recompilation checking is on by default. It will stop
compilation early, leaving an existing .o file in
place, if it can be determined that the module does not need to be
recompiled.
A companion option turns recompilation checking off.
In the olden days, GHC compared the newly-generated
.hi file with the previous version; if they were
identical, it left the old one alone and didn't change its
modification date. In consequence, importers of a module with an
unchanged output .hi file were not recompiled.
This doesn't work any more. In our earlier example, module
C does not import module A
directly, yet changes to A.hi should force a
recompilation of C. And some changes to
A (changing the definition of a function that
appears in an inlining of a function exported by B,
say) may conceivably not change B.hi one jot. So
now…
GHC keeps a version number on each interface file, and on each type
signature within the interface file. It also keeps in every interface
file a list of the version numbers of everything it used when it last
compiled the file. If the source file's modification date is earlier
than the .o file's date (i.e. the source hasn't
changed since the file was last compiled), and recompilation
checking is enabled, GHC will be
clever. It compares the version numbers on the things it needs this
time with the version numbers on the things it needed last time
(gleaned from the interface file of the module being compiled); if
they are all the same it stops compiling rather early in the process
saying “Compilation IS NOT required”. What a beautiful
sight!
Patrick Sansom had a workshop paper about how all this is done (though
the details have changed quite a bit). Ask him if you want a copy.
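The comparison described above can be modelled in a few lines of Haskell. This is a toy sketch of the idea only, not GHC's implementation: the types Name, Version and Usages, the function needsRecompile, and the sample entity names are all invented for the illustration.

```haskell
module Main where

type Name    = String
type Version = Int

-- What a module used the last time it was compiled, as recorded in
-- its own interface file.
type Usages = [(Name, Version)]

-- Recompile if the source is newer than the object, or if any recorded
-- usage no longer matches the current version (or has disappeared).
needsRecompile :: Bool -> Usages -> [(Name, Version)] -> Bool
needsRecompile sourceChanged used current =
  sourceChanged || any stale used
  where
    stale (name, v) = lookup name current /= Just v

main :: IO ()
main = do
  let used = [("A.f", 3), ("B.T", 1)]
  putStrLn (verdict (needsRecompile False used [("A.f", 3), ("B.T", 1), ("B.g", 7)]))
  putStrLn (verdict (needsRecompile False used [("A.f", 4), ("B.T", 1)]))
  where
    verdict True  = "Compilation IS required"
    verdict False = "Compilation IS NOT required"
```

In the first call nothing the module used has changed version, so the new entry B.g is irrelevant; in the second, A.f has moved from version 3 to 4 and the module must be recompiled.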
Using make
It is reasonably straightforward to set up a Makefile to use with GHC, assuming you name your source files the same as your modules.
Thus:
HC      = ghc
HC_OPTS = -cpp $(EXTRA_HC_OPTS)

SRCS = Main.lhs Foo.lhs Bar.lhs
OBJS = Main.o   Foo.o   Bar.o

.SUFFIXES : .o .hs .hi .lhs .hc .s

cool_pgm : $(OBJS)
	rm -f $@
	$(HC) -o $@ $(HC_OPTS) $(OBJS)

# Standard suffix rules
.o.hi:
	@:

.lhs.o:
	$(HC) -c $< $(HC_OPTS)

.hs.o:
	$(HC) -c $< $(HC_OPTS)

# Inter-module dependencies
Foo.o Foo.hc Foo.s    : Baz.hi          # Foo imports Baz
Main.o Main.hc Main.s : Foo.hi Baz.hi   # Main imports Foo and Baz
(Sophisticated make variants may achieve some of the above more
elegantly. Notably, gmake's pattern rules let you write the more
comprehensible:
%.o : %.lhs
	$(HC) -c $< $(HC_OPTS)
What we've shown should work with any make.)
Note the cheesy .o.hi rule: It records the dependency of the
interface (.hi) file on the source. The rule says a .hi file can
be made from a .o file by doing…nothing. Which is true.
Note the inter-module dependencies at the end of the Makefile, which
take the form
Foo.o Foo.hc Foo.s : Baz.hi # Foo imports Baz
They tell make that if any of Foo.o, Foo.hc or Foo.s have an
earlier modification date than Baz.hi, then the out-of-date file
must be brought up to date. To bring it up to date, make looks for
a rule to do so; one of the preceding suffix rules does the job
nicely.
Dependency generation
Putting inter-dependencies of the form Foo.o :
Bar.hi into your Makefile by hand
is rather error-prone. Don't worry, GHC has support for
automatically generating the required dependencies. Add the
following to your Makefile:
depend :
	ghc -M $(HC_OPTS) $(SRCS)
Now, before you start compiling, and any time you change
the imports in your program, do make
depend before you do make
cool_pgm. ghc -M will append
the needed dependencies to your
Makefile.
In general, if module A contains the
line
import B ...blah...
then ghc -M will generate a dependency
line of the form:
A.o : B.hi
If module A contains the line
import {-# SOURCE #-} B ...blah...
then ghc -M will generate a dependency
line of the form:
A.o : B.hi-boot
(See the section on mutually recursive modules for details of hi-boot interface files.)
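The mapping from import lines to dependency lines can be sketched as a small Haskell program. This is only an illustration of the rule just described, not how the dependency generator is implemented: the functions dependencyLines, importTarget and importedModule are hypothetical, and the scan is deliberately naive (it looks at each line's words and knows nothing about comments or layout).

```haskell
module Main where

-- One "A.o : B.hi" style line for each import found in the source text.
dependencyLines :: String -> String -> [String]
dependencyLines modName src =
  [ modName ++ ".o : " ++ target
  | line <- lines src
  , Just target <- [importTarget (words line)]
  ]

-- An import marked with the SOURCE pragma depends on the hi-boot file.
importTarget :: [String] -> Maybe String
importTarget ("import" : "{-#" : "SOURCE" : "#-}" : rest) =
  fmap (++ ".hi-boot") (importedModule rest)
importTarget ("import" : rest) = fmap (++ ".hi") (importedModule rest)
importTarget _ = Nothing

-- Skip an optional "qualified" and cut off any import list glued to
-- the module name.
importedModule :: [String] -> Maybe String
importedModule ("qualified" : rest) = importedModule rest
importedModule (m : _) = Just (takeWhile (/= '(') m)
importedModule [] = Nothing

main :: IO ()
main = mapM_ putStrLn (dependencyLines "A" source)
  where
    source = unlines
      [ "module A where"
      , "import B"
      , "import {-# SOURCE #-} C"
      , "import qualified D"
      , "f = 1"
      ]
```

For the sample module this prints one line per import, with A.o as the target each time.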
If A imports multiple modules, then there
will be multiple lines with A.o as the
target.
By default, ghc -M generates all the
dependencies, and then concatenates them onto the end of
makefile (or Makefile
if makefile doesn't exist) bracketed by the
lines "# DO NOT DELETE: Beginning of Haskell
dependencies" and "# DO NOT DELETE: End
of Haskell dependencies". If these lines already
exist in the makefile, then the old
dependencies are deleted first.
Internally, GHC uses a script to generate the
dependencies, called mkdependHS. This script
has some options of its own, which you might find useful.
Options can be passed directly to mkdependHS
with GHC's -optdep option. For example, to
generate the dependencies into a file called
.depend instead of
Makefile:
ghc -M -optdep-f -optdep.depend ...
The options accepted by mkdependHS do the following:
Turn off warnings about interface file shadowing.
Use blah as the makefile,
rather than makefile or
Makefile. If
blah doesn't exist,
mkdependHS creates it. We often use this
to put the dependencies in
.depend and then
include the file
.depend into
Makefile.
Use .<osuf> as the
"target file" suffix (default: o).
Multiple such flags are permitted (GHC 2.05
onwards). Thus giving the flag once for hc and once for o will
generate dependencies for .hc and
.o files.
Make extra dependencies that declare that files with
suffix
.<suf>_<osuf>
depend on interface files with suffix
.<suf>_hi, or (for
{-# SOURCE #-}
imports) on .hi-boot. Multiple such
flags are permitted. For example, giving the flag for the
suffixes a and b, with hc as the target suffix,
will make dependencies
for .hc on .hi,
.a_hc on
.a_hi, and
.b_hc on
.b_hi. (Useful in conjunction
with NoFib "ways".)
Regard <file> as
"stable"; i.e., exclude it from having dependencies on
it. A synonym for this option is also accepted.
Regard the colon-separated list of directories
<dirs> as containing stable modules;
don't generate any dependencies on modules therein. A synonym
for this option is also accepted.
Regard <file> as not
"stable"; i.e., generate dependencies on it (if any). This
option is normally used in conjunction with the
directory-exclusion option above.
Regard prelude libraries as unstable, i.e., generate
dependencies on the prelude modules used (including
Prelude). This option is normally only
used by the various system libraries. If a
library-selecting option is used, dependencies will
also be generated on the library's interfaces.
How to compile mutually recursive modules
Currently, the compiler does not have proper support for dealing with
mutually recursive modules:
module A where
import B
newtype TA = MkTA Int
f :: TB -> TA
f (MkTB x) = MkTA x
--------
module B where
import A
data TB = MkTB !Int
g :: TA -> TB
g (MkTA x) = MkTB x
When compiling either module A or B, the compiler will try (in vain)
to look for the interface file of the other. So, to get mutually
recursive modules off the ground, you need to hand write an interface
file for A or B, so as to break the loop. These hand-written
interface files are called hi-boot files, and are placed in a file
called <module>.hi-boot. To import from an hi-boot file instead
of the standard .hi file, use the following syntax in the importing module:
import {-# SOURCE #-} A
The hand-written interface need only contain the bare minimum of
information needed to get the bootstrapping process started. For
example, it doesn't need to contain declarations for everything
that module A exports, only the things required by the module that
imports A recursively.
For the example at hand, the boot interface file for A would look like
the following:
__interface A 1 404 where
__export A TA{MkTA} ;
1 newtype TA = MkTA PrelBase.Int ;
The syntax is essentially the same as a normal .hi file
(unfortunately), but you can usually tailor an existing .hi file to
make a .hi-boot file.
Notice that we only put the declaration for the newtype TA in the
hi-boot file, not the signature for f, since f isn't used by
B.
The number “1” after “__interface A” gives the version number of module A;
it is incremented whenever anything in A's interface file changes. The “404” is
the version number of the interface file syntax; we change it when
we change the syntax of interface files so that you get a better error message when
you try to read an old-format file with a new-format compiler.
The number “1” at the beginning of a declaration is the version
number of that declaration: for the purposes of .hi-boot files
these can all be set to 1. All names must be fully qualified with the
original module that an object comes from: for example, the
reference to Int in the interface for A comes from PrelBase,
which is a module internal to GHC's prelude. It's a pain, but that's
the way it is.
If you want an hi-boot file to export a data type, but you don't want to give its constructors
(because the constructors aren't used by the SOURCE-importing module), you can write simply:
__interface A 1 404 where
__export A TA;
1 data TA
(You must write all the type parameters, but leave out the '=' and everything that follows it.)
Note: This is all a temporary solution; a version of the
compiler that handles mutually recursive modules properly, without the manual
construction of interface files, is (allegedly) in the works.
Packages
Packages are collections of libraries, conveniently grouped
together as a single entity. The package system is flexible: a
package may consist of Haskell code, foreign language code (e.g. C
libraries), or a mixture of the two. A package is a good way to
group together related Haskell modules, and is essential if you
intend to make the modules into a Windows DLL (see below).
Because packages can contain both Haskell and C libraries, they
are also a good way to provide convenient access to a Haskell
layer over a C library.
GHC comes with several packages, and packages can be added to or removed from an
existing GHC installation.
Listing the available packages
To see what packages are currently installed, use the
--list-packages option:
$ ghc --list-packages
gmp, rts, std, lang, concurrent, data, net, posix, text, util
Note that your GHC installation might have a slightly
different set of packages installed.
The gmp and rts
packages are always present, and represent the multi-precision
integer and runtime system libraries respectively. The
std package contains the Haskell prelude.
The rest of the packages are optional libraries.
Using a package
To use a package, add the -package flag
to the command line:
-package <lib>:
This option brings into scope all the modules from
package <lib> (they still have to
be imported in your Haskell source, however). It also
causes the relevant libraries to be linked when linking is
being done.
Some packages depend on other packages, for example the
text package makes use of some of the modules
in the lang package. The package system
takes care of all these dependencies, so that when you say
-package text on the command line, you
automatically get -package lang too.
Building a package from Haskell source
It takes some special considerations to build a new
package:
A package may contain several Haskell modules. A
package may span many directories, or many packages may
exist in a single directory. Packages may not be mutually
recursive.
A package has a name
(e.g. std).
The Haskell code in a package may be built into one or
more Unix libraries (e.g. libHSfoo.a),
or a single DLL on Windows
(e.g. HSfoo.dll). The restriction to a
single DLL on Windows exists because the package system is used to
tell the compiler when it should make an inter-DLL call
rather than an intra-DLL call (inter-DLL calls require an
extra indirection).
GHC does not maintain detailed cross-package
dependency information. It does remember which modules in
other packages the current module depends on, but not which
things within those imported things.
To compile a module which is to be part of a new package,
use the -package-name option:
-package-name foo:
This option is added to the command line when
compiling a module that is destined to be part of package
foo. If this flag is omitted then the
default package Main is assumed.
Failure to use the -package-name option
when compiling a package will result in disaster on Windows, but
is relatively harmless on Unix at the moment (it will just cause
a few extra dependencies in some interface files). However,
bear in mind that we might add support for Unix shared libraries
at some point in the future.
It is worth noting that on Windows, because each package
is built as a DLL, and a reference to a DLL costs an extra
indirection, intra-package references are cheaper than
inter-package references. Of course, this applies to the
Main package as well.
Package management
GHC uses a package configuration file, called
packages.conf, which can be found in your GHC
install directory. This file isn't intended to be edited
directly; instead GHC provides options for adding and removing
packages:
--add-package:
Reads a package specification (see below) on stdin,
and adds it to the database of installed packages. The
package specification must be a package that isn't already
installed.
--delete-package:
Removes the specified package from the installed
configuration.
In both cases, the old package configuration file is saved
in packages.conf.old in your GHC install
directory, so in an emergency you can always copy this file into
package.conf to restore the old
settings.
A package specification looks like this:
("mypkg",
"4.08",
Package
{
import_dirs = ["/usr/local/lib/imports/mypkg"],
library_dirs = ["/usr/local/lib"],
libraries = ["HSmypkg", "HSmypkg_cbits"],
include_dirs = [],
c_includes = ["HsMyPkg.h"],
package_deps = ["text", "data"],
extra_ghc_opts = [],
extra_cc_opts = [],
extra_ld_opts = ["-lmy_clib"]
}
)
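The specification above has the shape of a Haskell value: a triple of two strings and a record. As a rough sketch, here is a record type that such a value would inhabit. The declarations of Package and PackageSpec are assumptions made for this illustration and are not taken from GHC's sources; only the field names and the sample values come from the specification shown.

```haskell
module Main where

-- Hypothetical record type mirroring the fields of the specification.
data Package = Package
  { import_dirs    :: [String]
  , library_dirs   :: [String]
  , libraries      :: [String]
  , include_dirs   :: [String]
  , c_includes     :: [String]
  , package_deps   :: [String]
  , extra_ghc_opts :: [String]
  , extra_cc_opts  :: [String]
  , extra_ld_opts  :: [String]
  } deriving Show

-- Package name, GHC version, and the remaining components.
type PackageSpec = (String, String, Package)

mypkg :: PackageSpec
mypkg =
  ( "mypkg"
  , "4.08"
  , Package
      { import_dirs    = ["/usr/local/lib/imports/mypkg"]
      , library_dirs   = ["/usr/local/lib"]
      , libraries      = ["HSmypkg", "HSmypkg_cbits"]
      , include_dirs   = []
      , c_includes     = ["HsMyPkg.h"]
      , package_deps   = ["text", "data"]
      , extra_ghc_opts = []
      , extra_cc_opts  = []
      , extra_ld_opts  = ["-lmy_clib"]
      }
  )

main :: IO ()
main = do
  let (name, ghcVersion, pkg) = mypkg
  putStrLn (name ++ " (built with GHC " ++ ghcVersion ++ ")")
  mapM_ putStrLn (package_deps pkg)
```

Because record construction in Haskell accepts fields in any order, this reading is consistent with the components after the first two being order-independent.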
The first line is the name of the package, for use with
the -package flag and as listed in the
--list-packages list. The second line is the
version of GHC that was used to compile any Haskell code in this
package (GHC will refuse to add the package if its version
number differs from this one). The rest of the components of
the package specification may be specified in any order, and
are:
import_dirs:
A list of directories containing interface files
(.hi files) for this package.
library_dirs:
A list of directories containing libraries for this
package.
libraries:
A list of libraries for this package, with the
.a or .dll suffix
omitted. On Unix, the lib prefix is
also omitted.
include_dirs:
A list of directories containing C includes for this
package (maybe the empty list).
c_includes:
A list of files to include for via-C compilations
using this package. Typically this include file will
contain function prototypes for any C functions used in
the package, in case they end up being called as a result
of Haskell functions from the package being
inlined.
package_deps:
A list of packages which this package depends
on.
extra_ghc_opts:
Extra arguments to be added to the GHC command line
when this package is being used.
extra_cc_opts:
Extra arguments to be added to the gcc command line
when this package is being used (only for via-C
compilations).
extra_ld_opts:
Extra arguments to be added to the gcc command line
(for linking) when this package is being used.
For examples of more package specifications, take a look
at the package.conf in your GHC
installation.
Optimisation (code improvement)
optimisation (GHC)improvement, code (GHC)
The -O* options specify convenient “packages” of optimisation
flags; the -f* options described later on specify
individual optimisations to be turned on/off; the -m*
options specify machine-specific optimisations to be turned
on/off.
-O*: convenient “packages” of optimisation flags.
There are many options that affect the quality of code
produced by GHC. Most people only have a general goal, something like
“Compile quickly” or “Make my program run like greased lightning.”
The following “packages” of optimisations (or lack thereof) should
suffice.
Once you choose a “package,” stick with it; don't chop and
change. Modules' interfaces will change with a shift to a new
-O* option, and you may have to recompile a large chunk of all
importing modules before your program can again be run
safely (see ).
No -O*-type option specified:
This is taken to mean: “Please compile quickly; I'm not over-bothered
about compiled-code quality.” So, for example: ghc -c Foo.hs
-O or -O1:
Means: “Generate good-quality code without taking too long about
it.” Thus, for example: ghc -c -O Main.lhs
-O2:
Means: “Apply every non-dangerous optimisation, even if it means
significantly longer compile times.”
The avoided “dangerous” optimisations are those that can make
runtime or space worse if you're unlucky. They are
normally turned on or off individually.
At the moment, -O2 is unlikely to produce
better code than -O.
-O2-for-C:
Says to run GCC with -O2, which may be worth a few percent in
execution speed. Don't forget -fvia-C, lest you use the native-code
generator and bypass GCC altogether!
-Onot:
This option will make GHC “forget” any
-Oish options it has seen so far. Sometimes useful;
for example: make all
EXTRA_HC_OPTS=-Onot.
-Ofile <file>:
For those who need absolute control over
exactly what options are used (e.g., compiler
writers, sometimes :-), a list of options can be put in a file and
then slurped in with -Ofile.
In that file, comments are of the
#-to-end-of-line variety; blank lines and most
whitespace are ignored.
Please ask if you are baffled and would like an example of -Ofile!
At Glasgow, we don't use a -O* flag for day-to-day work. We use
-O to get respectable speed; e.g., when we want to measure
something. When we want to go for broke, we tend to use (and we go for lots of coffee breaks).
The easiest way to see what -O (etc.) “really mean” is to run with
-v, then stand back in amazement. Alternatively, just look at the
HsC_minus<blah> lists in the GHC driver script.
-f*: platform-independent flags
Flags can be turned off individually. (NB: I hope you have a
good reason for doing this…) To turn off the -f<opt> flag, just use
the -fno-<opt> flag. So, for
example, you can say -fno-strictness, which will then drop out
any running of the strictness analyser.
The options you are most likely to want to turn off are:
-fno-strictness (strictness
analyser, because it is sometimes slow),
-fno-specialise (automatic
specialisation of overloaded functions, because it can make your code
bigger) (US spelling also accepted), and
-fno-cpr-analyse (the CPR, or constructed product
result, analyser).
Should you wish to turn individual flags on, you are advised
to use the -Ofile option, described above. Because the order in
which optimisation passes are run is sometimes crucial, it's quite
hard to do with command-line options.
Here are some “dangerous” optimisations you might want to try:
-fvia-C:
Compile via C, and don't use the native-code generator. (There are many
cases when GHC does this on its own.) You might pick up a little bit of
speed by compiling via C (e.g. for floating-point intensive code on Intel).
If you use _casm_s (which are utterly
deprecated), you probably have to use
-fvia-C.
The lower-case incantation, -fvia-c, is synonymous.
Compiling via C will probably be slower (in compilation time) than
using GHC's native code generator.
-funfolding-interface-threshold:
(Default: 30) By raising or lowering this number, you can raise or
lower the amount of pragmatic junk that gets spewed into interface
files. (An unfolding has a “size” that reflects the cost in terms
of “code bloat” of expanding that unfolding in another module. A
bigger function would be assigned a bigger cost.)
-funfolding-creation-threshold:
(Default: 30) This option is similar to
-funfolding-interface-threshold, except that it governs unfoldings
within a single module. Increasing this figure is more likely to
result in longer compile times than faster code. The next option is
more useful:
-funfolding-use-threshold:
(Default: 8) This is the magic cut-off figure for unfolding: below
this size, a function definition will be unfolded at the call-site,
any bigger and it won't. The size computed for a function depends on
two things: the actual size of the expression minus any discounts that
apply (see -funfolding-con-discount).
-funfolding-con-discount:
(Default: 2) If the compiler decides that it can eliminate some
computation by performing an unfolding, then this is a discount factor
that it applies to the function size before deciding whether to unfold
it or not.
OK, folks, these magic numbers `30', `8', and `2' are mildly
arbitrary; they are of the “seem to be OK” variety. The `8' is the
more critical one; it's what determines how eager GHC is about
expanding unfoldings.
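The cut-off rule described for -funfolding-use-threshold can be pictured with a small model. This is only a sketch of the rule of thumb stated above, not GHC's actual inliner: the names effectiveSize and shouldUnfold are hypothetical, the discount is treated as a plain subtraction, and a size exactly equal to the threshold is assumed not to be unfolded (the text does not say which way that case goes).

```haskell
module Main where

-- Hypothetical names throughout; this models only the rule of thumb in
-- the text, not the real unfolding heuristics inside GHC.

-- | Size of an unfolding after discounts, never below zero.
effectiveSize :: Int -> Int -> Int
effectiveSize rawSize discounts = max 0 (rawSize - discounts)

-- | Unfold at the call site when the discounted size is below the use
-- threshold (assumption: a size equal to the threshold is not unfolded).
shouldUnfold :: Int -> Int -> Int -> Bool
shouldUnfold useThreshold rawSize discounts =
  effectiveSize rawSize discounts < useThreshold

main :: IO ()
main = do
  print (shouldUnfold 8 6 0)   -- small function, no discount
  print (shouldUnfold 8 12 2)  -- still too big after a discount of 2
  print (shouldUnfold 8 9 2)   -- the discount brings it under the threshold
```

With the default threshold of 8, only the first and third calls come out in favour of unfolding.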
-funbox-strict-fields:
This option causes all constructor fields which are marked strict
(i.e. “!”) to be unboxed or unpacked if possible. For example:
data T = T !Float !Float
will create a constructor T containing two unboxed floats if the
-funbox-strict-fields flag is given. This may not always be an
optimisation: if the T constructor is scrutinised and the floats
passed to a non-strict function for example, they will have to be
reboxed (this is done automatically by the compiler).
This option should only be used in conjunction with -O, in order to
expose unfoldings to the compiler so the reboxing can be removed as
often as possible. For example:
f :: T -> Float
f (T f1 f2) = f1 + f2
The compiler will avoid reboxing f1 and f2 by inlining + on
floats, but only when -O is on.
Any single-constructor data is eligible for unpacking; for example
data T = T !(Int,Int)
will store the two Ints directly in the T constructor, by flattening
the pair. Multi-level unpacking is also supported:
data T = T !S
data S = S !Int !Int
will store two unboxed Int#s directly in the T constructor.
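For anyone who wants to experiment, the multi-level example can be turned into a complete module along the following lines. This is a minimal sketch: sumT and main are additions for illustration only, and the program prints the same result whether or not the fields are unpacked, since the option changes the representation and not the meaning.

```haskell
module Main where

-- Strict fields (marked with !) are the candidates for unpacking.
data S = S !Int !Int
data T = T !S

-- Pattern matching reads both Ints; with unpacking they would sit
-- directly in the T constructor rather than behind a pointer to an S.
sumT :: T -> Int
sumT (T (S a b)) = a + b

main :: IO ()
main = print (sumT (T (S 3 4)))
```

Compiling this once with the option and once without, and comparing the generated code, is one way to see what the flag actually does.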
:
This option (which does not work with the native-code generator)
tells the compiler to add extra code to test for already-evaluated
values. You win if you have lots of such values during a run of your
program, you lose otherwise. (And you pay in extra code space.)
We have not played with enough to recommend it.
(For all we know, it doesn't even work anymore… Sigh.)
:
This option has an effect similar to Java's strictfp
modifier: When it is not given, intermediate floating point values can
have a greater precision/range than the final type.
Generally this is a good thing, but some programs may rely on the exact
precision/range of Float/Double
values and should use this option for their compilation.
-m*: platform-specific flags
Some flags only make sense for particular target platforms.
-mv8:
(SPARC machines)
Means to pass the like-named option to GCC; it says to use the
Version 8 SPARC instructions, notably integer multiply and divide.
The similar GCC options for SPARC also work, actually.
-mlong-calls:
(HPPA machines)
Means to pass the like-named option to GCC. Required for Very Big
modules, maybe. (Probably means you're in trouble…)
-monly-N-regs:
(iX86 machines)
GHC tries to “steal” four registers from GCC, for performance
reasons; it almost always works. However, when GCC is compiling some
modules with four stolen registers, it will crash, probably saying:
Foo.hc:533: fixed or forbidden register was spilled.
This may be due to a compiler bug or to impossible asm
statements or clauses.
Just give some registers back with -monly-N-regs. Try `3' first,
then `2'. If `2' doesn't work, please report the bug to us.
Code improvement by the C compiler.
The C compiler (GCC) is run with -O turned on. (It has
to be, actually.)
If you want to run GCC with -O2 (which may be worth a few
percent in execution speed), you can give a
-O2-for-C option.
Options related to a particular phase
The C pre-processor
The C pre-processor cpp is run over your Haskell code only if the
-cpp option is given. Unless you are
building a large system with significant doses of conditional
compilation, you really shouldn't need it.
-D<name>:
Define macro <name> in the usual way. NB: does not affect
macros passed to the C compiler when compiling via C! For those,
use the hack… (see ).
-U<name>:
Undefine macro <name> in the usual way.
-I<dir>:
Specify a directory in which to look for #include files, in
the usual C way.
The GHC driver pre-defines several macros when processing Haskell
source code (.hs or .lhs files):
__HASKELL98__:
If defined, this means that GHC supports the language defined by the
Haskell 98 report.
__HASKELL__=98:
In GHC 4.04 and later, the __HASKELL__ macro is defined as having
the value 98.
__HASKELL1__:
If defined to n, that means GHC supports the Haskell language
defined in the Haskell report version 1.n. Currently 5. This
macro is deprecated, and will probably disappear in future versions.
__GLASGOW_HASKELL__:
For version n of the GHC system, this will be #defined to
100 × n. So, for version 4.00, it is 400.
With any luck, __GLASGOW_HASKELL__ will be undefined in all other
implementations that support C-style pre-processing.
(For reference: the comparable symbols for other systems are:
__HUGS__ for Hugs and __HBC__ for Chalmers.)
NB. This macro is set when pre-processing both Haskell source and C
source, including the C source generated from a Haskell module
(i.e. .hs, .lhs, .c and .hc files).
__CONCURRENT_HASKELL__:
This symbol is defined when pre-processing Haskell (input) and
pre-processing C (GHC output). Since GHC supports Concurrent Haskell
by default from version 4.00 onwards, this symbol is always defined.
__PARALLEL_HASKELL__:
Only defined when -parallel is in use! This symbol is defined when
pre-processing Haskell (input) and pre-processing C (GHC output).
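As an illustration of how these macros are typically used, here is a small sketch that selects a definition according to __GLASGOW_HASKELL__. The name compilerNote is hypothetical. The LANGUAGE CPP pragma on the first line is how recent GHC releases switch on pre-processing from within the file; with the compiler described here, the pre-processor is enabled on the command line with the -cpp option instead, so treat the pragma as an assumption about a newer toolchain.

```haskell
{-# LANGUAGE CPP #-}
module Main where

-- compilerNote is a hypothetical name used only for this sketch.
compilerNote :: String
#if defined(__GLASGOW_HASKELL__) && __GLASGOW_HASKELL__ >= 400
compilerNote = "GHC 4.00 or later"
#elif defined(__GLASGOW_HASKELL__)
compilerNote = "GHC before 4.00"
#else
compilerNote = "not GHC"
#endif

main :: IO ()
main = putStrLn compilerNote
```

Testing with defined(...) first keeps the conditional well-behaved under implementations that do not define the macro at all.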
Options other than the above can be forced through to the C
pre-processor with the flags (see
).
A small word of warning: -cpp is not friendly to “string
gaps”. In other words, strings such as the following:
strmod = "\
\ p \
\ "
don't work with -cpp; /usr/bin/cpp elides the
backslash-newline pairs.
However, it appears that if you add a space at the end of the line,
then cpp (at least GNU cpp and possibly other cpps)
leaves the backslash-space pairs alone and the string gap works as
expected.
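Another way to sidestep the problem, not taken from the text above but a common habit, is to avoid string gaps altogether in modules that go through cpp and to join ordinary literals with (++). The sketch below assumes nothing beyond the Prelude; strmod here is just an example value, not a rewrite of the string shown earlier.

```haskell
module Main where

-- A multi-line literal written with (++) instead of string gaps, so
-- there is no backslash-newline pair for cpp to remove.
strmod :: String
strmod =
  "first part, " ++
  "second part"

main :: IO ()
main = putStrLn strmod
```

Each line ends in an operator rather than a backslash, so the pre-processor has nothing to elide.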
Options affecting the C compiler (if applicable)
At the moment, quite a few common C-compiler options are passed on
quietly to the C compilation of Haskell-compiler-generated C files.
THIS MAY CHANGE. Meanwhile, options so sent are:
-ansi        do ANSI C (not K&R)
-pedantic    be so
-dgcc-lint   (hack) short for “make GCC very paranoid”
If you are compiling with lots of foreign calls, you may need to
tell the C compiler about some #include files. There is no real
pretty way to do this, but you can use this hack from the
command-line:
% ghc -c '-#include <X/Xlib.h>' Xstuff.lhs
Linking and consistency-checking
GHC has to link your code with various libraries, possibly including:
user-supplied, GHC-supplied, and system-supplied (the -lm math
library, for example).
-l<lib>:
Link in a library named lib<lib>.a which resides somewhere on the
library directories path.
Because of the sad state of most UNIX linkers, the order of such
options does matter. Thus: ghc -lbar *.o is almost certainly
wrong, because it will search libbar.a before it has
collected unresolved symbols from the *.o files.
ghc *.o -lbar is probably better.
The linker will of course be informed about some GHC-supplied
libraries automatically; these are:
-l equivalent   description
-lHSrts         basic runtime libraries
-lHS            standard Prelude library
-lHS_cbits      C support code for standard Prelude library
-lgmp           GNU multi-precision library (for Integers)
-package <name>:
If you are using a Haskell “system library” (e.g., the POSIX
library), just use the -package <name> option, and the correct code
should be linked in.
-L<dir>:
Where to find user-supplied libraries… Prepend the directory
<dir> to the library directories path.
-static:
Tell the linker to avoid shared libraries.
-no-link-chk and -link-chk:
By default, immediately after linking an executable, GHC verifies that
the pieces that went into it were compiled with compatible flags; a
“consistency check”.
(This is to avoid mysterious failures caused by non-meshing of
incompatibly-compiled programs; e.g., if one .o file was compiled
for a parallel machine and the others weren't.) You may turn off this
check with -no-link-chk. You can turn it (back) on with
-link-chk (the default).
-no-hs-main:
In the event you want to include ghc-compiled code as part of another
(non-Haskell) program, the RTS will not be supplying its definition of
main() at link-time; you will have to. To signal that to the
driver script when linking, use -no-hs-main.
Notice that since the command-line passed to the linker is rather
involved, you probably want to use the ghc driver script to do the
final link of your `mixed-language' application. This is not a
requirement, though: just try linking once with -v on to see what
options the driver passes through to the linker.
Using Concurrent Haskell
GHC (as of version 4.00) supports Concurrent Haskell by default,
without requiring a special option or libraries compiled in a certain
way. To get access to the support libraries for Concurrent Haskell
(i.e. Concurrent and friends), use the
option.
Three RTS options are provided for modifying the behaviour of the
threaded runtime system. See the descriptions of
, , and
in .
Concurrent Haskell is described in more detail in .
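To give a feel for what a Concurrent Haskell program looks like, here is a minimal sketch that forks a thread and collects its result through an MVar. It is written against the module names found in recent GHC releases (Control.Concurrent and Control.Concurrent.MVar); with the compiler described here the equivalent operations come from the Concurrent module, so the imports are an assumption about a newer toolchain. The helper sumInThread is hypothetical.

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- | Sum a list in a forked thread and hand the result back through an
-- MVar. sumInThread is a hypothetical helper for this sketch.
sumInThread :: [Int] -> IO Int
sumInThread xs = do
  box <- newEmptyMVar
  _ <- forkIO $ do
    let total = sum xs
    total `seq` putMVar box total   -- force the sum in the worker thread
  takeMVar box                      -- block until the worker has finished

main :: IO ()
main = sumInThread [1 .. 100] >>= print
```

The main thread blocks in takeMVar, so the program cannot exit before the forked thread has delivered its answer.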
Using Parallel Haskell
[You won't be able to execute parallel Haskell programs unless PVM3
(Parallel Virtual Machine, version 3) is installed at your site.]
To compile a Haskell program for parallel execution under PVM, use the
-parallel option, both when compiling and
linking. You will probably want to import
Parallel into your Haskell modules.
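The usual way to express parallelism in source code is to spark one computation while evaluating another. The sketch below shows the idea under two assumptions: that the Parallel module mentioned above provides a par combinator of this kind, and that the reader may be trying this on a recent GHC, where par and pseq can be imported from GHC.Conc in the base library (the import used here). It does not involve PVM at all, and fib and parFib are hypothetical names.

```haskell
module Main where

import GHC.Conc (par, pseq)

-- Naive Fibonacci, used only to have some work to do.
fib :: Int -> Integer
fib n
  | n < 2     = toInteger n
  | otherwise = fib (n - 1) + fib (n - 2)

-- Spark the first call, evaluate the second, then combine the two.
parFib :: Int -> Integer
parFib n
  | n < 2     = toInteger n
  | otherwise = x `par` (y `pseq` (x + y))
  where
    x = fib (n - 1)
    y = fib (n - 2)

main :: IO ()
main = print (parFib 20)
```

The result is the same as for the sequential fib; par only gives the runtime permission to evaluate x elsewhere.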
To run your parallel program, once PVM is going, just invoke it
“as normal”. The main extra RTS option is
-N<N>, to say how many PVM
“processors” your program is to run on. (For more details of
all relevant RTS options, please see .)
In truth, running Parallel Haskell programs and getting information
out of them (e.g., parallelism profiles) is a battle with the vagaries of
PVM, detailed in the following sections.
Dummy's guide to using PVM
Before you can run a parallel program under PVM, you must set the
required environment variables (PVM's idea, not ours); something like,
probably in your .cshrc or equivalent:
setenv PVM_ROOT /wherever/you/put/it
setenv PVM_ARCH `$PVM_ROOT/lib/pvmgetarch`
setenv PVM_DPATH $PVM_ROOT/lib/pvmd
Creating and/or controlling your “parallel machine” is a purely-PVM
business; nothing specific to Parallel Haskell.
You use the pvm command to start PVM on your
machine. You can then do various things to control/monitor your
“parallel machine;” the most useful being:
Control-D       exit pvm, leaving it running
halt            kill off this “parallel machine” & exit
add <host>      add <host> as a processor
delete <host>   delete <host>
reset           kill what's going, but leave PVM up
conf            list the current configuration
ps              report processes' status
pstat <pid>     status of a particular process
The PVM documentation can tell you much, much more about pvm!
Parallelism profiles
With Parallel Haskell programs, we usually don't care about the
results, only about “how parallel” it was! We want pretty pictures.
Parallelism profiles (à la hbcpp) can be generated with the
-q RTS option. The
per-processor profiling info is dumped into files named
<full-path><program>.gr. These are then munged into a PostScript picture,
which you can then display. For example, to run your program
a.out on 8 processors, then view the parallelism profile, do:
% ./a.out +RTS -N8 -q
% grs2gr *.???.gr > temp.gr # combine the 8 .gr files into one
% gr2ps -O temp.gr # cvt to .ps; output in temp.ps
% ghostview -seascape temp.ps # look at it!
The scripts for processing the parallelism profiles are distributed
in ghc/utils/parallel/.
Other useful info about running parallel programs
The “garbage-collection statistics” RTS options can be useful for
seeing what parallel programs are doing. If you do either
-Sstderr RTS option or , then
you'll get mutator, garbage-collection, etc., times on standard
error. The standard error of all PE's other than the `main thread'
appears in /tmp/pvml.nnn, courtesy of PVM.
Whether doing or not, a handy way to watch
what's happening overall is: tail -f /tmp/pvml.nnn.
RTS options for Concurrent/Parallel Haskell
Besides the usual runtime system (RTS) options
(), there are a few options particularly
for concurrent/parallel execution.
-N<N>:
(PARALLEL ONLY) Use <N> PVM processors to run this program;
the default is 2.
-C<us>:
Sets the context switch interval to <us> microseconds. A context
switch will occur at the next heap allocation after the timer expires.
With or , context switches will occur as often as
possible (at every heap allocation). By default, context switches
occur every 10 milliseconds. Note that many interval timers are only
capable of 10 millisecond granularity, so the default setting may be
the finest granularity possible, short of a context switch at every
heap allocation.
[NOTE: this option currently has no effect (version 4.00). Context
switches happen when the current heap block is full, i.e. every 4k of
allocation].
-q:
(PARALLEL ONLY) Produce a quasi-parallel profile of thread activity,
in the file <program>.qp. In the style of hbcpp, this profile
records the movement of threads between the green (runnable) and red
(blocked) queues. If you specify the verbose suboption (), the
green queue is split into green (for the currently running thread
only) and amber (for other runnable threads). We do not recommend
that you use the verbose suboption if you are planning to use the
hbcpp profiling tools or if you are context switching at every heap
check (with ).
-t<num>:
(PARALLEL ONLY) Limit the number of concurrent threads per processor
to <num>. The default is 32. Each thread requires slightly over 1K
words in the heap for thread state and stack objects. (For
32-bit machines, this translates to 4K bytes, and for 64-bit machines,
8K bytes.)
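The arithmetic behind those figures is easy to reproduce. The sketch below takes the "about 1K words per thread" estimate at face value (the text says slightly over, so the results are lower bounds) and multiplies it out for the default limit of 32 threads; wordsPerThread and threadStateBytes are hypothetical names.

```haskell
module Main where

-- Rough figure from the text: about 1K words per thread.
wordsPerThread :: Int
wordsPerThread = 1024

-- | Approximate bytes of heap used for thread state when the
-- per-processor thread limit is reached.
threadStateBytes :: Int -> Int -> Int
threadStateBytes bytesPerWord maxThreads =
  maxThreads * wordsPerThread * bytesPerWord

main :: IO ()
main = do
  print (threadStateBytes 4 32)  -- 32-bit words, default limit of 32
  print (threadStateBytes 8 32)  -- 64-bit words, default limit of 32
```

So a processor running at the default limit needs on the order of 128K bytes for thread state on a 32-bit machine, and twice that on a 64-bit one.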
-d:
(PARALLEL ONLY) Turn on debugging. It pops up one xterm (or GDB, or
something…) per PVM processor. We use the standard debugger
script that comes with PVM3, but we sometimes meddle with the
debugger2 script. We include ours in the GHC distribution,
in ghc/utils/pvm/.
-e<num>:
(PARALLEL ONLY) Limit the number of pending sparks per processor to
<num>. The default is 100. A larger number may be appropriate if
your program generates large amounts of parallelism initially.
-Q<num>:
(PARALLEL ONLY) Set the size of packets transmitted between processors
to <num>. The default is 1024 words. A larger number may be
appropriate if your machine has a high communication cost relative to
computation speed.