Building the Glasgow Functional Programming Tools Suite
The GHC Team
glasgow-haskell-{users,bugs}@haskell.org
November 2001

The Glasgow fptools suite is a collection of Functional Programming related tools, including the Glasgow Haskell Compiler (GHC). The source code for the whole suite is kept in a single CVS repository and shares a common build and installation system. This guide is intended for people who want to build or modify programs from the Glasgow fptools suite (as distinct from those who merely want to run them). Installation instructions are now provided in the user guide. The bulk of this guide applies to building on Unix systems; see for Windows notes.
Getting the Glasgow fptools suite

Building the Glasgow tools can be complicated, mostly because there are so many permutations of what/why/how, e.g., ``Build Happy with HBC, everything else with GHC, leave out profiling, and test it all on the `real' NoFib programs.'' Yeeps! Happily, such complications don't apply to most people. A few common ``strategies'' serve most purposes. Pick one and proceed as suggested:

Binary distribution. If your only purpose is to install some of the fptools suite then the easiest thing to do is to get a binary distribution. In a binary distribution everything is pre-compiled for your particular machine architecture and operating system, so all you should have to do is install the binaries and libraries in suitable places. The user guide describes how to do this. A binary distribution may not work for you for two reasons. First, we may not have built the suite for the particular architecture/OS platform you want. That may be due to lack of time and energy (in which case you can get a source distribution and build from it; see below). Alternatively, it may be because we haven't yet ported the suite to your architecture, in which case you are considerably worse off. The second reason a binary distribution may not be what you want is if you want to read or modify the source code.

Source distribution. You have a supported platform, but (a) you like the warm fuzzy feeling of compiling things yourself; (b) you want to build something ``extra''—e.g., a set of libraries with strictness-analysis turned off; or (c) you want to hack on GHC yourself. A source distribution contains complete sources for one or more projects in the fptools suite. Not only that, but the more awkward machine-independent steps are done for you. For example, if you don't have happy you'll find it convenient that the source distribution contains the result of running happy on the parser specifications.
If you don't want to alter the parser then this saves you having to find and install happy. You will still need a working version of GHC (preferably version 4.08+) on your machine in order to compile (most of) the sources, however.

The CVS repository. We make releases infrequently. If you want more up-to-the-minute (but less tested) source code then you need to get access to our CVS repository. All the fptools source code is held in a CVS repository. CVS is a pretty good source-code control system, and best of all it works over the network. The repository holds source code only. It holds no mechanically generated files at all. So if you check out a source tree from CVS you will need to install every utility so that you can build all the derived files from scratch. More information about our CVS repository can be found in .

Build GHC from intermediate C (.hc) files: You need a working GHC to use a source distribution. What if you don't have a working GHC? Then you may be able to bootstrap up from the intermediate C (.hc) files that we provide. Building GHC on an unsupported platform falls into this category. Beware: this route is not for the faint hearted! Please see . Once you have built GHC, you can build the other Glasgow tools with it. In theory, you can (could?) build GHC with another Haskell compiler (e.g., HBC). We haven't tried to do this for ages and it almost certainly doesn't work any more (for tedious reasons).

If you are going to do any building from sources (either from a source distribution or the CVS repository) then you need to read all of this manual in detail.

Using the CVS repository

We use CVS (Concurrent Versions System) to keep track of our sources for various software projects. CVS lets several people work on the same software at the same time, allowing changes to be checked in incrementally. This section is a set of guidelines for how to use our CVS repository, and will probably evolve in time.
The main thing to remember is that most mistakes can be undone, but if there's anything you're not sure about feel free to bug the local CVS meister (namely Jeff Lewis jlewis@galconn.com).

Getting access to the CVS Repository

You can access the repository in one of two ways: read-only (), or read-write ().

Remote Read-only CVS Access

Read-only access is available to anyone - there's no need to ask us first. With read-only CVS access you can do anything except commit changes to the repository. You can make changes to your local tree, and still use CVS's merge facility to keep your tree up to date, and you can generate patches using 'cvs diff' in order to send to us for inclusion. To get read-only access to the repository:

Make sure that cvs is installed on your machine.

Set your $CVSROOT environment variable to :pserver:anoncvs@glass.cse.ogi.edu:/cvs

Run the command

$ cvs login

The password is simply cvs. This sets up a file in your home directory called .cvspass, which squirrels away the dummy password, so you only need to do this step once. Now go to .

Remote Read-Write CVS Access

We generally supply read-write access to folk doing serious development on some part of the source tree, when going through us would be a pain. If you're developing some feature, or think you have the time and inclination to fix bugs in our sources, feel free to ask for read-write access. There is a certain amount of responsibility that goes with commit privileges; we are more likely to grant you access if you've demonstrated your competence by sending us patches via mail in the past. To get remote read-write CVS access, you need to do the following steps.

Make sure that cvs and ssh are both installed on your machine.

Generate a DSA private-key/public-key pair, thus:

$ ssh-keygen -d

(ssh-keygen comes with ssh.) Running ssh-keygen -d creates the private and public keys in $HOME/.ssh/id_dsa and $HOME/.ssh/id_dsa.pub respectively (assuming you accept the standard defaults).
ssh-keygen -d will only work if you have Version 2 ssh installed; it will fail harmlessly otherwise. If you only have Version 1 you can instead generate an RSA key pair using plain

$ ssh-keygen

Doing so creates the private and public RSA keys in $HOME/.ssh/identity and $HOME/.ssh/identity.pub respectively. [Deprecated.] Incidentally, you can force a Version 2 ssh to use the Version 1 protocol by creating $HOME/.ssh/config with the following in it:

BatchMode Yes
Host cvs.haskell.org
Protocol 1

In both cases, ssh-keygen will ask for a passphrase. The passphrase is a password that protects your private key. In response to the 'Enter passphrase' question, you can either: [Recommended.] Enter a passphrase, which you will quote each time you use CVS. ssh-agent makes this entirely un-tiresome. [Deprecated.] Just hit return (i.e. use an empty passphrase); then you won't need to quote the passphrase when using CVS. The downside is that anyone who can see into your .ssh directory, and thereby get your private key, can mess up the repository. So you must keep the .ssh directory with draconian no-access permissions.

[Windows users.] The programs ssh-keygen1, ssh1, and cvs seem to lock up bash entirely if they try to get user input (e.g. if they ask for a password). To solve this, start up cmd.exe and run it as follows:

c:\tmp> set CYGWIN32=tty
c:\tmp> c:/user/local/bin/ssh-keygen1

[Windows users.] To protect your .ssh from access by anyone else, right-click your .ssh directory, and select Properties. If you are not on the access control list, add yourself, and give yourself full permissions (the second panel). Remove everyone else from the access control list. Don't leave them there but deny them access, because 'they' may be a list that includes you!

Send a message to the CVS repository administrator (currently Jeff Lewis jeff@galconn.com), containing: your desired user-name, and your .ssh/id_dsa.pub (or .ssh/identity.pub). He will set up your account.
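The ``draconian no-access permissions'' mentioned above correspond to Unix mode 700 on the directory and 600 on the private key. A minimal sketch, sandboxed in a temporary directory so it doesn't touch your real $HOME/.ssh (the file names here are just illustrative):

```shell
# Draconian no-access permissions on a private key directory,
# as recommended for $HOME/.ssh (sandboxed in a temp dir here).
tmp=$(mktemp -d)
mkdir "$tmp/.ssh"
touch "$tmp/.ssh/id_dsa"

chmod 700 "$tmp/.ssh"         # owner only: rwx------
chmod 600 "$tmp/.ssh/id_dsa"  # owner only: rw-------

ls -ld "$tmp/.ssh"
```

The same 600 mode is what you later need to preserve on authorized_keys when adding keys for extra machines.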
Set the following environment variables:

$HOME: points to your home directory. This is where CVS will look for its .cvsrc file.

$CVS_RSH to ssh. [Windows users.] Setting your CVS_RSH to ssh assumes that your CVS client understands how to execute shell scripts ("#!" interpreters, really), which is what ssh is. This may not be the case on Win32 platforms, so in that case set CVS_RSH to ssh1.

$CVSROOT to :ext:your-username@cvs.haskell.org:/home/cvs/root where your-username is your user name on cvs.haskell.org. The CVSROOT environment variable will be recorded in the checked-out tree, so you don't need to set this every time.

$CVSEDITOR: bin/gnuclient.exe if you want to use an Emacs buffer for typing in those long commit messages.

$SHELL: To use bash as the shell in Emacs, you need to set this to point to bash.exe.

Put the following in $HOME/.cvsrc:

checkout -P
release -d
update -P
diff -u

These are the default options for the specified CVS commands, and represent better defaults than the usual ones. (Feel free to change them.) [Windows users.] Filenames starting with . were illegal in the 8.3 DOS filesystem, but that restriction should have been lifted by now (i.e., you're using VFAT or later filesystems). If you're still having problems creating it, don't worry; .cvsrc is entirely optional.

[Experts.] Once your account is set up, you can get access from other machines without bothering Jeff, thus: Generate a public/private key pair on the new machine. Use ssh to log in to cvs.haskell.org from your old machine. Add the public key for the new machine to the file $HOME/.ssh/authorized_keys on cvs.haskell.org. (authorized_keys2, I think, for Version 2 protocol.) Make sure that the new version of authorized_keys still has 600 file permissions.

Checking Out a Source Tree

Make sure you set your CVSROOT environment variable according to either of the remote methods above.
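Collected into a shell profile fragment, the read-write settings above look like this (your-username is a placeholder for your own account name; the Emacs-related variables are only needed if you use that setup):

```shell
# Shell profile fragment for read-write CVS access over ssh.
# "your-username" is a placeholder, not a real account.
export CVS_RSH=ssh
export CVSROOT=:ext:your-username@cvs.haskell.org:/home/cvs/root
```

For the anonymous read-only method you would instead set CVSROOT to the :pserver: form given earlier.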
The Approved Way to check out a source tree is as follows:

$ cvs checkout fpconfig

At this point you have a new directory called fptools which contains the basic stuff for the fptools suite, including the configuration files and some other junk. [Windows users.] The following messages appear to be harmless:

setsockopt IPTOS_LOWDELAY: Invalid argument
setsockopt IPTOS_THROUGHPUT: Invalid argument

You can call the fptools directory whatever you like, CVS won't mind:

$ mv fptools directory

NB: after you've read the CVS manual you might be tempted to try

$ cvs checkout -d directory fpconfig

instead of checking out fpconfig and then renaming it. But this doesn't work, and will result in checking out the entire repository instead of just the fpconfig bit.

$ cd directory
$ cvs checkout ghc hslibs

The second command here checks out the relevant modules you want to work on. For a GHC build, for instance, you need at least the ghc and hslibs modules (for a full list of the projects available, see ).

Committing Changes

This is only if you have read-write access to the repository. For anoncvs users, CVS will issue a "read-only repository" error if you try to commit changes. Build the software, if necessary. Unless you're just working on documentation, you'll probably want to build the software in order to test any changes you make. Make changes. Preferably small ones first. Test them. You can see exactly what changes you've made by using the cvs diff command:

$ cvs diff

lists all the changes (using the diff command) in and below the current directory. In emacs, C-c C-v = runs cvs diff on the current buffer and shows you the results. Before checking in a change, you need to update your source tree:

$ cd fptools
$ cvs update

This pulls in any changes that other people have made, and merges them with yours. If there are any conflicts, CVS will tell you, and you'll have to resolve them before you can check your changes in.
The documentation describes what to do in the event of a conflict. It's not always necessary to do a full cvs update before checking in a change, since CVS will always tell you if you try to check in a file that someone else has changed. However, you should still update at regular intervals to avoid making changes that don't work in conjunction with changes that someone else made. Keeping an eye on what goes by on the mailing list can help here. When you're happy that your change isn't going to break anything, check it in. For a one-file change:

$ cvs commit filename

CVS will then pop up an editor for you to enter a "commit message"; this is just a short description of what your change does, and will be kept in the history of the file. If you're using emacs, simply load up the file into a buffer and type C-x C-q, and emacs will prompt for a commit message and then check in the file for you. For a multiple-file change, things are a bit trickier. There are several ways to do this, but this is the way I find easiest. First type the commit message into a temporary file. Then either

$ cvs commit -F commit-message file_1 .... file_n

or, if nothing else has changed in this part of the source tree,

$ cvs commit -F commit-message directory

where directory is a common parent directory for all your changes, and commit-message is the name of the file containing the commit message. Shortly afterwards, you'll get some mail from the relevant mailing list saying which files changed, and giving the commit message. For a multiple-file change, you should still get only one message.

Updating Your Source Tree

It can be tempting to cvs update just part of a source tree to bring in some changes that someone else has made, or before committing your own changes. This is NOT RECOMMENDED! Quite often changes in one part of the tree are dependent on changes in another part of the tree (the mk/*.mk files are a good example where problems crop up quite often).
Having an inconsistent tree is a major cause of headaches. So, to avoid a lot of hassle, follow this recipe for updating your tree:

$ cd fptools
$ cvs update -Pd 2>&1 | tee log

Look at the log file, and fix any conflicts (denoted by a C in the first column). If you're using multiple build trees, then for every build tree you have pointing at this source tree, you need to update the links in case any new files have appeared:

$ cd build-tree
$ lndir source-tree

Some files might have been removed, so you need to remove the links pointing to these non-existent files:

$ find . -xtype l -exec rm '{}' \;

To be really safe, you should do

$ gmake all

from the top-level, to update the dependencies and build any changed files.

GHC Tag Policy

If you want to check out a particular version of GHC, you'll need to know how we tag versions in the repository. The policy (as of 4.04) is: The tree is branched before every major release. The branch tag is ghc-x-xx-branch, where x-xx is the version number of the release with the '.' replaced by a '-'. For example, the 4.04 release lives on ghc-4-04-branch. The release itself is tagged with ghc-x-xx (on the branch), e.g. 4.06 is called ghc-4-06. We didn't always follow these guidelines, so to see what tags there are for previous versions, do cvs log on a file that's been around for a while (like fptools/ghc/README). So, to check out a fresh GHC 4.06 tree you would do:

$ cvs co -r ghc-4-06 fpconfig
$ cd fptools
$ cvs co -r ghc-4-06 ghc hslibs

General Hints

As a general rule: commit changes in small units, preferably addressing one issue or implementing a single feature. Provide a descriptive log message so that the repository records exactly which changes were required to implement a given feature/fix a bug. I've found this very useful in the past for finding out when a particular bug was introduced: you can just wind back the CVS tree until the bug disappears. Keep the sources at least *buildable* at any given time.
No doubt bugs will creep in, but it's quite easy to ensure that any change made at least leaves the tree in a buildable state. We do nightly builds of GHC to keep an eye on what things work/don't work each day and how we're doing in relation to previous versions. This idea is truly wrecked if the compiler won't build in the first place! To check out extra bits into an already-checked-out tree, use the following procedure. Suppose you have a checked-out fptools tree containing just ghc, and you want to add nofib to it:

$ cd fptools
$ cvs checkout nofib

or:

$ cd fptools
$ cvs update -d nofib

(the -d flag tells update to create a new directory). If you just want part of the nofib suite, you can do

$ cd fptools
$ cvs checkout nofib/spectral

This works because nofib is a module in its own right, and spectral is a subdirectory of the nofib module. The path argument to checkout must always start with a module name. There's no equivalent form of this command using update.

What projects are there?

The fptools suite consists of several projects, most of which can be downloaded, built and installed individually. Each project corresponds to a subdirectory in the source tree, and if checking out from CVS then each project can be checked out individually by sitting in the top level of your source tree and typing cvs checkout project. Here is a list of the projects currently available:

ghc: The Glasgow Haskell Compiler (minus libraries). Absolutely required for building GHC.
glafp-utils: Utility programs, some of which are used by the build/installation system. Required for pretty much everything.
green-card: The Green Card system for generating Haskell foreign function interfaces.
haggis: The Haggis Haskell GUI framework.
happy: The Happy parser generator.
hdirect: The H/Direct Haskell interoperability tool.
hood: The Haskell Object Observation Debugger.
hslibs: GHC's libraries. Required for building GHC.
libraries: Hierarchical Haskell library suite (experimental).
mhms: The Modular Haskell Metric System.
nofib: The NoFib suite: a collection of Haskell programs used primarily for benchmarking.
testsuite: A testing framework, including GHC's regression test suite.

So, to build GHC you need at least the ghc and hslibs projects (a GHC source distribution will already include the bits you need).

Things to check before you start

Here's a list of things to check before you get started.

Disk space needed: from about 100Mb for a basic GHC build, up to probably 500Mb for a GHC build with everything included (libraries built several different ways, etc.).

Use an appropriate machine, compilers, and things. SPARC boxes, PCs running Linux or FreeBSD, and Alphas running OSF/1 are all fully supported. Win32 and HP boxes are in pretty good shape. PCs running Solaris, DEC Alphas running Linux or some BSD variant, MIPS and AIX boxes will need some minimal porting effort before they work (as of 4.06). gives the full run-down on ports or lack thereof.

Be sure that the ``pre-supposed'' utilities are installed. elaborates.

If you have any problem when building or installing the Glasgow tools, please check the ``known pitfalls'' (). Also check the FAQ for the version you're building, which should be available from the relevant download page on the GHC web site. If you feel there is still some shortcoming in our procedure or instructions, please report it. For GHC, please see the bug-reporting section of the GHC Users' Guide (separate document), to maximise the usefulness of your report. If in doubt, please send a message to glasgow-haskell-bugs@haskell.org.
What machines the Glasgow tools run on

The main question is whether or not the Haskell compiler (GHC) runs on your platform. A ``platform'' is an architecture/manufacturer/operating-system combination, such as sparc-sun-solaris2. Other common ones are alpha-dec-osf2, hppa1.1-hp-hpux9, i386-unknown-linux, i386-unknown-solaris2, i386-unknown-freebsd, i386-unknown-cygwin32, m68k-sun-sunos4, mips-sgi-irix5, sparc-sun-sunos4, sparc-sun-solaris2, powerpc-ibm-aix. Bear in mind that certain ``bundles'', e.g. parallel Haskell, may not work on all machines for which basic Haskell compiling is supported. Some libraries may only work on a limited number of platforms; for example, a sockets library is of no use unless the operating system supports the underlying BSDisms.

What platforms the Haskell compiler (GHC) runs on

The GHC hierarchy of Porting Goodness: (a) best is a native-code generator; (b) next best is a ``registerised'' port; (c) the bare minimum is an ``unregisterised'' port. (``Unregisterised'' is so terrible that we won't say more about it.) We use Sparcs running Solaris 2.7 and x86 boxes running FreeBSD and Linux, so those are the best supported platforms, unsurprisingly. Here's everything that's known about GHC ports. We identify platforms by their ``canonical'' CPU/Manufacturer/OS triple.

alpha-dec-{osf,linux,freebsd,openbsd,netbsd}: The OSF port is currently working (as of GHC version 5.02.1) and well supported. The native code generator is currently non-working. Other operating systems will require some minor porting.

sparc-sun-sunos4: Probably works with minor tweaks, hasn't been tested for a while.

sparc-sun-solaris2: Fully supported, including native-code generator.
hppa1.1-hp-hpux (HP-PA boxes running HPUX 9.x): Works registerised. No native-code generator.

i386-unknown-linux (PCs running Linux, ELF binary format): GHC works registerised and has a native code generator. You must have GCC 2.7.x or later. NOTE about glibc versions: GHC binaries built on a system running glibc 2.0 won't work on a system running glibc 2.1, and vice versa. In general, don't expect compatibility between glibc versions, even if the shared library version hasn't changed.

i386-unknown-freebsd (PCs running FreeBSD 2.2 or higher): GHC works registerised. Pre-built packages are available in the native package format, so if you just need binaries you're better off just installing the package.

i386-unknown-{netbsd,openbsd} (PCs running NetBSD and OpenBSD): Will require some minor porting effort, but should work registerised.

i386-unknown-cygwin32: Fully supported under Win9x/NT, including a native code generator. Requires the cygwin32 compatibility library and a healthy collection of GNU tools (i.e., gcc, GNU ld, bash etc.).

mips-sgi-irix5: Port currently doesn't work, needs some minimal porting effort. As usual, we don't have access to machines and there hasn't been an overwhelming demand for this port, but feel free to get in touch.

powerpc-ibm-aix: Port currently doesn't work, needs some minimal porting effort. As usual, we don't have access to machines and there hasn't been an overwhelming demand for this port, but feel free to get in touch.

powerpc-apple-darwin: Works, unregisterised only at the moment.

Various other systems have had GHC ported to them in the distant past, including various Motorola 68k boxes. The 68k support still remains, but porting to one of these systems will certainly be a non-trivial task.
What machines the other tools run on

Unless you hear otherwise, the other tools work if GHC works.

Installing pre-supposed utilities

Here are the gory details about some utility programs you may need; perl, gcc and happy are the only important ones. (PVM is important if you're going for Parallel Haskell.) The configure script will tell you if you are missing something.

Perl: You have to have Perl to proceed! It is pretty easy to install. Perl 5 is required. For Win32 platforms, you should use the binary supplied in the InstallShield (copy it to /bin). The Cygwin-supplied Perl seems not to work. Perl should be put somewhere so that it can be invoked by the #! script-invoking mechanism. The full pathname may need to be less than 32 characters long on some systems.

GNU C (gcc): We recommend using GCC version 2.95.2 on all platforms. Failing that, version 2.7.2 is stable on most platforms. Earlier versions of GCC can be assumed not to work, and versions in between 2.7.2 and 2.95.2 (including egcs) have varying degrees of stability depending on the platform. If your GCC dies with ``internal error'' on some GHC source file, please let us know, so we can report it and get things improved. (Exception: on iX86 boxes—you may need to fiddle with GHC's option; see the User's Guide.)

Happy: Happy is a parser generator tool for Haskell, and is used to generate GHC's parsers. Happy is written in Haskell, and is a project in the CVS repository (fptools/happy). It can be built from source, but bear in mind that you'll need GHC installed in order to build it. To avoid the chicken/egg problem, install a binary distribution of either Happy or GHC to get started. Happy distributions are available from Happy's Web Page.
Autoconf: GNU Autoconf is needed if you intend to build from the CVS sources; it is not needed if you just intend to build a standard source distribution. Autoconf builds the configure script from configure.in and aclocal.m4. If you modify either of these files, you'll need autoconf to rebuild configure.

sed: You need a working sed if you are going to build from sources. The build-configuration stuff needs it. GNU sed version 2.0.4 is no good! It has a bug in it that is tickled by the build-configuration. 2.0.5 is OK. Others are probably OK too (assuming we don't create too elaborate configure scripts).

One fptools project is worth a quick note at this point, because it is useful for all the others: glafp-utils contains several utilities which aren't particularly Glasgow-ish, but Occasionally Indispensable. Like lndir for creating symbolic link trees.

Tools for building parallel GHC (GPH)

PVM version 3: PVM is the Parallel Virtual Machine on which Parallel Haskell programs run. (You only need this if you plan to run Parallel Haskell. Concurrent Haskell, which runs concurrent threads on a uniprocessor, doesn't need it.) Underneath PVM, you can have (for example) a network of workstations (slow) or a multiprocessor box (faster). The current version of PVM is 3.3.11; we use 3.3.7. It is readily available on the net; I think I got it from research.att.com, in netlib. A PVM installation is slightly quirky, but easy to do. Just follow the Readme instructions.

bash: (Parallel Haskell only.) Sadly, the gr2ps script, used to convert ``parallelism profiles'' to PostScript, is written in Bash (GNU's Bourne Again shell). This bug will be fixed (someday).
Tools for building the Documentation

The following additional tools are required if you want to format the documentation that comes with the fptools projects:

DocBook: All our documentation is written in SGML, using the DocBook DTD. Instructions on installing and configuring the DocBook tools are in the installation guide (in the GHC user guide).

TeX: A decent TeX distribution is required if you want to produce printable documentation. We recommend teTeX, which includes just about everything you need.

In order to actually build any documentation, you need to set SGMLDocWays in your build.mk. Valid values to add to this list are: dvi, ps, pdf, html, and rtf.

Other useful tools

Flex: This is a quite-a-bit-better-than-Lex lexer. Used to build a couple of utilities in glafp-utils. Depending on your operating system, the supplied lex may or may not work; you should get the GNU version.

Building from source

You've been rash enough to want to build some of the Glasgow Functional Programming tools (GHC, Happy, nofib, etc.) from source. You've slurped the source, from the CVS repository or from a source distribution, and now you're sitting looking at a huge mound of bits, wondering what to do next. Gingerly, you type make. Wrong already! The rest of this guide is intended for duffers like me, who aren't really interested in Makefiles and systems configurations, but who need a mental model of the interlocking pieces so that they can make them work, extend them consistently when adding new software, and lay hands on them gently when they don't work.

Your source tree

The source code is held in your source tree. The root directory of your source tree must contain the following directories and files:

Makefile: the root Makefile.
mk/: the directory that contains the main Makefile code, shared by all the fptools software.

configure.in, config.sub, config.guess: these files support the configuration process.

install-sh.

All the other directories are individual projects of the fptools system—for example, the Glasgow Haskell Compiler (ghc), the Happy parser generator (happy), the nofib benchmark suite, and so on. You can have zero or more of these. Needless to say, some of them are needed to build others. The important thing to remember is that even if you want only one project (happy, say), you must have a source tree whose root directory contains Makefile, mk/, configure.in, and the project(s) you want (happy/ in this case). You cannot get by with just the happy/ directory.

Build trees

While you can build a system in the source tree, we don't recommend it. We often want to build multiple versions of our software for different architectures, or with different options (e.g. profiling). It's very desirable to share a single copy of the source code among all these builds. So for every source tree we have zero or more build trees. Each build tree is initially an exact copy of the source tree, except that each file is a symbolic link to the source file, rather than being a copy of the source file. There are ``standard'' Unix utilities that make such copies, so standard that they go by different names: lndir and mkshadowdir are two. (If you don't have either, the source distribution includes sources for the X11 lndir—check out fptools/glafp-utils/lndir.) See for a typical invocation. The build tree does not need to be anywhere near the source tree in the file system.
Indeed, one advantage of separating the build tree from the source is that the build tree can be placed in a non-backed-up partition, saving your systems support people from backing up untold megabytes of easily-regenerated, and rapidly-changing, gubbins. The golden rule is that (with a single exception—) absolutely everything in the build tree is either a symbolic link to the source tree, or else is mechanically generated. It should be perfectly OK for your build tree to vanish overnight; an hour or two compiling and you're on the road again. You need to be a bit careful, though, that any new files you create (if you do any development work) are in the source tree, not a build tree! Remember that the source files in the build tree are symbolic links to the files in the source tree. (The build tree soon accumulates lots of built files like Foo.o, as well.) You can delete a source file from the build tree without affecting the source tree (though it's an odd thing to do). On the other hand, if you edit a source file from the build tree, you'll edit the source-tree file directly. (You can set up Emacs so that if you edit a source file from the build tree, Emacs will silently create an edited copy of the source file in the build tree, leaving the source file unchanged; but the danger is that you think you've edited the source file whereas actually all you've done is edit the build-tree copy. More commonly you do want to edit the source file.) Like the source tree, the top level of your build tree must be (a linked copy of) the root directory of the fptools suite. Inside Makefiles, the root of your build tree is called $(FPTOOLS_TOP). In the rest of this document path names are relative to $(FPTOOLS_TOP) unless otherwise stated. For example, the file ghc/mk/target.mk is actually $(FPTOOLS_TOP)/ghc/mk/target.mk.
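To make the link-tree idea concrete, here is a toy sketch using only standard tools. It is a simplification for illustration, not a substitute for the real lndir, and the file names are made up; it also demonstrates the dangling-link cleanup needed when a source file later disappears:

```shell
# Toy shadow tree: mirror a source tree into a build tree of
# symbolic links (roughly what lndir does), sandboxed in a temp dir.
tmp=$(mktemp -d)
src="$tmp/source-tree"; bld="$tmp/build-tree"

mkdir -p "$src/mk"
echo 'all:' > "$src/Makefile"
echo '# rules' > "$src/mk/target.mk"

# Recreate the directory structure, then link each file back
# to its counterpart in the source tree.
(cd "$src" && find . -type d) | while read -r d; do mkdir -p "$bld/$d"; done
(cd "$src" && find . -type f) | while read -r f; do ln -s "$src/$f" "$bld/$f"; done

# If a source file is removed, its link in the build tree dangles;
# -xtype l (GNU find) matches symlinks whose target no longer exists.
rm "$src/mk/target.mk"
(cd "$bld" && find . -xtype l -exec rm '{}' \;)
```

After this, build-tree/Makefile is a symlink into the source tree, and the link to the deleted target.mk has been cleaned away, mirroring the update recipe given earlier.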
Getting the build you want When you build fptools you will be compiling code on a particular host platform, to run on a particular target platform (usually the same as the host platform). The difficulty is that there are minor differences between different platforms; minor, but enough that the code needs to be a bit different for each. There are some big differences too: for a different architecture we need to build GHC with a different native-code generator. There are also knobs you can turn to control how the fptools software is built. For example, you might want to build GHC optimised (so that it runs fast) or unoptimised (so that you can compile it fast after you've modified it). Or, you might want to compile it with debugging on (so that extra consistency-checking code gets included) or off. And so on. All of this stuff is called the configuration of your build. You set the configuration using a three-step process. Step 1: get ready for configuration. Change directory to $(FPTOOLS_TOP) and issue the command autoconf (with no arguments). This GNU program converts $(FPTOOLS_TOP)/configure.in to a shell script called $(FPTOOLS_TOP)/configure. Some projects, including GHC, have their own configure script. If there's an $(FPTOOLS_TOP)/<project>/configure.in, then you need to run autoconf in that directory too. Both these steps are completely platform-independent; they just mean that the human-written file (configure.in) can be short, although the resulting shell script, configure, and mk/config.h.in, are long. In case you don't have autoconf we distribute the results, configure, and mk/config.h.in, with the source distribution. They aren't kept in the repository, though. Step 2: system configuration.
Run the newly-created configure script, thus: ./configure args configure's mission is to scurry round your computer working out what architecture it has, what operating system, whether it has the vfork system call, where yacc is kept, whether gcc is available, where various obscure #include files are, whether it's a leap year, and what the systems manager had for lunch. It communicates these snippets of information in two ways: It translates mk/config.mk.in to mk/config.mk, substituting for things between ``@'' brackets. So, ``@HaveGcc@'' will be replaced by ``YES'' or ``NO'' depending on what configure finds. mk/config.mk is included by every Makefile (directly or indirectly), so the configuration information is thereby communicated to all Makefiles. It translates mk/config.h.in to mk/config.h. The latter is #included by various C programs, which can thereby make use of configuration information. configure takes some optional arguments. Use ./configure --help to get a list of the available arguments. Here are some of the ones you might need: --with-ghc=path specifies the path to an installed GHC which you would like to use. This compiler will be used for compiling GHC-specific code (e.g. GHC itself). This option cannot be specified using build.mk (see later), because configure needs to auto-detect the version of GHC you're using. The default is to look for a compiler named ghc in your path. --with-hc=path specifies the path to any installed Haskell compiler. This compiler will be used for compiling generic Haskell code. The default is to use ghc. --with-gcc=path specifies the path to the installed GCC. This compiler will be used to compile all C files, except any generated by the installed Haskell compiler, which will have its own idea of which C compiler (if any) to use. The default is to use gcc. configure caches the results of its run in config.cache.
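The ``@...@'' substitution can be illustrated with a few lines of shell that mimic what configure does. The real script derives the values by probing your system; the file contents, paths, and substituted values below are invented for the demonstration.

```shell
# Mimic configure translating config.mk.in into config.mk with sed.
set -e
mkdir -p /tmp/cfg-demo/mk

# A pretend config.mk.in with two @...@ placeholders.
printf 'HaveGcc = @HaveGcc@\nYACC = @YaccCmd@\n' > /tmp/cfg-demo/mk/config.mk.in

# configure would compute these values itself; we hard-wire them here.
sed -e 's/@HaveGcc@/YES/' -e 's|@YaccCmd@|/usr/bin/yacc|' \
    /tmp/cfg-demo/mk/config.mk.in > /tmp/cfg-demo/mk/config.mk

cat /tmp/cfg-demo/mk/config.mk
```

Every Makefile that includes the resulting mk/config.mk then sees HaveGcc and YACC as ordinary make variables.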
Quite often you don't want that; you're running configure a second time because something has changed. In that case, simply delete config.cache. Step 3: build configuration. Next, you say how this build of fptools is to differ from the standard defaults by creating a new file mk/build.mk in the build tree. This file is the one and only file you edit in the build tree, precisely because it says how this build differs from the source. (Just in case your build tree does die, you might want to keep a private directory of build.mk files, and use a symbolic link in each build tree to point to the appropriate one.) So mk/build.mk never exists in the source tree—you create one in each build tree from the template. We'll discuss what to put in it shortly. And that's it for configuration. Simple, eh? What do you put in your build-specific configuration file mk/build.mk? For almost all purposes all you will do is put make variable definitions that override those in mk/config.mk.in. The whole point of mk/config.mk.in—and its derived counterpart mk/config.mk—is to define the build configuration. It is heavily commented, as you will see if you look at it. So generally, what you do is look at mk/config.mk.in, and add definitions in mk/build.mk that override any of the config.mk definitions that you want to change. (The override occurs because the main boilerplate file, mk/boilerplate.mk, includes build.mk after config.mk.) For example, config.mk.in contains the definition: GhcHcOpts=-O -Rghc-timing The accompanying comment explains that this is the list of flags passed to GHC when building GHC itself. For doing development, it is wise to add -DDEBUG, to enable debugging code. So you would add the following to build.mk: GhcHcOpts=-O -Rghc-timing -DDEBUG or, if you prefer, GhcHcOpts += -DDEBUG (GNU make allows existing definitions to have new text appended using the ``+='' operator, which is quite a convenient feature.)
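Put together, a development-oriented mk/build.mk might look like the fragment below. The flag values are illustrative only; check the comments in your mk/config.mk.in for what your version actually supports.

```make
# mk/build.mk -- lives only in the build tree, never in the source tree.

# Build GHC itself with debugging consistency checks enabled:
GhcHcOpts += -DDEBUG
```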
If you want to remove the -O as well (a good idea when developing, because the turn-around cycle gets a lot quicker), you can just override GhcHcOpts altogether: GhcHcOpts=-DDEBUG -Rghc-timing When reading config.mk.in, remember that anything between ``@...@'' signs is going to be substituted by configure later. You can override the resulting definition if you want, but you need to be a bit more sure of what you are doing. For example, there's a line that says: YACC = @YaccCmd@ This defines the make variable YACC to the pathname for a yacc that configure finds somewhere. If you have your own pet yacc you want to use instead, that's fine. Just add this line to mk/build.mk: YACC = myyacc You do not have to have a mk/build.mk file at all; if you don't, you'll get all the default settings from mk/config.mk.in. You can also use build.mk to override anything that configure got wrong. One place where this happens often is with the definition of FPTOOLS_TOP_ABS: this variable is supposed to be the canonical path to the top of your source tree, but if your system uses an automounter then the correct directory is hard to find automatically. If you find that configure has got it wrong, just put the correct definition in build.mk. The story so far Let's summarise the steps you need to carry out to get yourself a fully-configured build tree from scratch. Get your source tree from somewhere (CVS repository or source distribution). Say you call the root directory myfptools (it does not have to be called fptools). Make sure that you have the essential files (see above). (Optional) Use lndir or mkshadowdir to create a build tree. $ cd myfptools $ mkshadowdir . /scratch/joe-bloggs/myfptools-sun4 (N.B. mkshadowdir's first argument is taken relative to its second.) You probably want to give the build tree a name that suggests its main defining characteristic (in your mind at least), in case you later add others. Change directory to the build tree. Everything is going to happen there now.
$ cd /scratch/joe-bloggs/myfptools-sun4 Prepare for system configuration: $ autoconf (You can skip this step if you are starting from a source distribution, and you already have configure and mk/config.h.in.) Some projects, including GHC itself, have their own configure scripts, so it is necessary to run autoconf again in the appropriate subdirectories. e.g.: $ (cd ghc; autoconf) Do system configuration: $ ./configure Don't forget to check whether you need to add any arguments to configure; for example, a common requirement is to specify which GHC to use with --with-ghc. Create the file mk/build.mk, adding definitions for your desired configuration options. $ emacs mk/build.mk You can make subsequent changes to mk/build.mk as often as you like. You do not have to run any further configuration programs to make these changes take effect. In theory you should, however, say gmake clean, gmake all, because configuration option changes could affect anything—but in practice you are likely to know what's affected. Making things At this point you have made yourself a fully-configured build tree, so you are ready to start building real things. The first thing you need to know is that you must use GNU make, usually called gmake, not standard Unix make. If you use standard Unix make you will get all sorts of error messages (but no damage) because the fptools Makefiles use GNU make's facilities extensively. To just build the whole thing, cd to the top of your fptools tree and type gmake. This will prepare the tree and build the various projects in the correct order. Standard Targets <indexterm><primary>targets, standard makefile</primary></indexterm> <indexterm><primary>makefile targets</primary></indexterm> In any directory you should be able to make the following: boot: does the one-off preparation required to get ready for the real work. Notably, it does gmake depend in all directories that contain programs. It also builds the necessary tools for compilation to proceed. Invoking the boot target explicitly is not normally necessary.
From the top-level fptools directory, invoking gmake causes gmake boot all to be invoked in each of the project subdirectories, in the order specified by $(AllTargets) in config.mk. If you're working in a subdirectory somewhere and need to update the dependencies, gmake boot is a good way to do it. all: makes all the final target(s) for this Makefile. Depending on which directory you are in a ``final target'' may be an executable program, a library archive, a shell script, or a Postscript file. Typing gmake alone is generally the same as typing gmake all. install: installs the things built by all (except for the documentation). Where does it install them? That is specified by mk/config.mk.in; you can override it in mk/build.mk, or by running configure with command-line arguments like --bindir=/home/simonpj/bin; see ./configure --help for the full details. install-docs: installs the documentation. Otherwise behaves just like install. uninstall: reverses the effect of install. clean: Delete all files from the current directory that are normally created by building the program. Don't delete the files that record the configuration, or files generated by gmake boot. Also preserve files that could be made by building, but normally aren't because the distribution comes with them. distclean: Delete all files from the current directory that are created by configuring or building the program. If you have unpacked the source and built the program without creating any other files, make distclean should leave only the files that were in the distribution. mostlyclean: Like clean, but may refrain from deleting a few files that people normally don't want to recompile. maintainer-clean: Delete everything from the current directory that can be reconstructed with this Makefile. This typically includes everything deleted by distclean, plus more: C source files produced by Bison, tags tables, Info files, and so on. 
One exception, however: make maintainer-clean should not delete configure even if configure can be remade using a rule in the Makefile. More generally, make maintainer-clean should not delete anything that needs to exist in order to run configure and then begin to build the program. check: run the test suite. All of these standard targets automatically recurse into sub-directories. Certain other standard targets do not: configure: is only available in the root directory $(FPTOOLS_TOP); it has been discussed above. depend: make a .depend file in each directory that needs it. This .depend file contains mechanically-generated dependency information; for example, suppose a directory contains a Haskell source module Foo.lhs which imports another module Baz. Then the generated .depend file will contain the dependency: Foo.o : Baz.hi which says that the object file Foo.o depends on the interface file Baz.hi generated by compiling module Baz. The .depend file is automatically included by every Makefile. binary-dist: make a binary distribution. This is the target we use to build the binary distributions of GHC and Happy. dist: make a source distribution. Note that this target does ``make distclean'' as part of its work; don't use it if you want to keep what you've built. Most Makefiles have targets other than these. You can discover them by looking in the Makefile itself. Using a project from the build tree If you want to build GHC (say) and just use it direct from the build tree without doing make install first, you can run the in-place driver script: ghc/compiler/ghc-inplace. Do NOT use ghc/compiler/ghc, or ghc/compiler/ghc-5.xx, as these are the scripts intended for installation, and contain hard-wired paths to the installed libraries, rather than the libraries in the build tree. Happy can similarly be run from the build tree, using happy/src/happy-inplace.
Fast Making <indexterm><primary>fastmake</primary></indexterm> <indexterm><primary>dependencies, omitting</primary></indexterm> <indexterm><primary>FAST, makefile variable</primary></indexterm> Sometimes the dependencies get in the way: if you've made a small change to one file, and you're absolutely sure that it won't affect anything else, but you know that make is going to rebuild everything anyway, the following hack may be useful: gmake FAST=YES This tells the make system to ignore dependencies and just build what you tell it to. In other words, it's equivalent to temporarily removing the .depend file in the current directory (where mkdependHS and friends store their dependency information). A bit of history: GHC used to come with a fastmake script that did the above job, but GNU make provides the features we need to do it without resorting to a script. Also, we've found that fastmaking is less useful since the advent of GHC's recompilation checker (see the User's Guide section on ``Separate Compilation''). The <filename>Makefile</filename> architecture <indexterm><primary>makefile architecture</primary></indexterm> make is great if everything works—you type gmake install and lo! the right things get compiled and installed in the right places. Our goal is to make this happen often, but somehow it often doesn't; instead some weird error message eventually emerges from the bowels of a directory you didn't know existed. The purpose of this section is to give you a road-map to help you figure out what is going right and what is going wrong. Debugging Debugging Makefiles is something of a black art, but here are a couple of tricks that we find particularly useful. The following command allows you to see the contents of any make variable in the context of the current Makefile: $ make show VALUE=HS_SRCS where you can replace HS_SRCS with the name of any variable you wish to see the value of.
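A show target of this kind can be implemented with GNU make's nested variable expansion, $($(VALUE)). The sketch below is self-contained and assumes GNU make is installed as make; the variable contents and /tmp path are invented.

```shell
# Sketch of a "show" target: print any make variable by name.
set -e
rm -rf /tmp/show-demo && mkdir /tmp/show-demo && cd /tmp/show-demo

# $(VALUE) expands to the variable's name, $($(VALUE)) to its value.
printf 'HS_SRCS = Foo.lhs Bar.lhs\nshow:\n\t@echo $(VALUE)=$($(VALUE))\n' > Makefile

make show VALUE=HS_SRCS
```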
GNU make has a debugging option (-d, or the more selective --debug) which generates a dump of the decision procedure used to arrive at a conclusion about which files should be recompiled. Sometimes useful for tracking down problems with superfluous or missing recompilations. A small project To get started, let us look at the Makefile for an imaginary small fptools project, small. Each project in fptools has its own directory in FPTOOLS_TOP, so the small project will have its own directory FPTOOLS_TOP/small/. Inside the small/ directory there will be a Makefile, looking something like this: # Makefile for fptools project "small" TOP = .. include $(TOP)/mk/boilerplate.mk SRCS = $(wildcard *.lhs) $(wildcard *.c) HS_PROG = small include $(TOP)/mk/target.mk This Makefile has three sections: The first section includes a file of ``boilerplate'' code from the level above (which in this case will be FPTOOLS_TOP/mk/boilerplate.mk). (One of the most important features of GNU make that we use is the ability for a Makefile to include another named file, very like cpp's #include directive.) As its name suggests, boilerplate.mk consists of a large quantity of standard Makefile code. We discuss this boilerplate in more detail below. Before the include statement, you must define the make variable TOP to be the directory containing the mk directory in which the boilerplate.mk file is. It is not OK to simply say include ../mk/boilerplate.mk # NO NO NO Why? Because the boilerplate.mk file needs to know where it is, so that it can, in turn, include other files. (Unfortunately, when an included file does an include, the filename is treated relative to the directory in which gmake is being run, not the directory in which the included file sits.) In general, every file foo.mk assumes that $(TOP)/mk/foo.mk refers to itself. It is up to the Makefile doing the include to ensure this is the case.
Files intended for inclusion in other Makefiles are written to have the following property: after foo.mk is included, it leaves TOP containing the same value as it had just before the include statement. In our example, this invariant guarantees that the include for target.mk will look in the same directory as that for boilerplate.mk. The second section defines the following standard make variables: SRCS (the source files from which the program is to be built), and HS_PROG (the executable binary to be built). We will discuss in more detail what the ``standard variables'' are, and how they affect what happens, below. The definition for SRCS uses the useful GNU make construct $(wildcard pat), which expands to a list of all the files matching the pattern pat in the current directory. In this example, SRCS is set to the list of all the .lhs and .c files in the directory. (Let's suppose there is one of each, Foo.lhs and Baz.c.) The last section includes a second file of standard code, called target.mk. It contains the rules that tell gmake how to make the standard targets described above. Why, you ask, can't this standard code be part of boilerplate.mk? Good question. We discuss the reason later. You do not have to include the target.mk file. Instead, you can write rules of your own for all the standard targets. Usually, though, you will find quite a big payoff from using the canned rules in target.mk; the price tag is that you have to understand what canned rules get enabled, and what they do. In our example Makefile, most of the work is done by the two included files. When you say gmake all, the following things happen: gmake figures out that the object files are Foo.o and Baz.o. It uses a boilerplate pattern rule to compile Foo.lhs to Foo.o using a Haskell compiler. (Which one? That is set in the build configuration.) It uses another standard pattern rule to compile Baz.c to Baz.o, using a C compiler. (Ditto.)
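The wildcard behaviour is easy to check with a throwaway Makefile. The sketch assumes GNU make is installed as make; the /tmp path and file names are invented.

```shell
# Demonstrate $(wildcard ...) expanding to the matching files.
set -e
rm -rf /tmp/wild-demo && mkdir /tmp/wild-demo && cd /tmp/wild-demo

# One .lhs file, one .c file, and one file that matches neither pattern.
touch Foo.lhs Baz.c README

printf 'SRCS = $(wildcard *.lhs) $(wildcard *.c)\nall:\n\t@echo $(SRCS)\n' > Makefile

make   # prints: Foo.lhs Baz.c
```

README is not picked up, because it matches neither *.lhs nor *.c.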
It links the resulting .o files together to make small, using the Haskell compiler to do the link step. (Why not use ld? Because the Haskell compiler knows what standard libraries to link in. How did gmake know to use the Haskell compiler to do the link, rather than the C compiler? Because we set the variable HS_PROG rather than C_PROG.) All Makefiles should follow the above three-section format. A larger project Larger projects are usually structured into a number of sub-directories, each of which has its own Makefile. (In very large projects, this sub-structure might be iterated recursively, though that is rare.) To give you the idea, here's part of the directory structure for the (rather large) GHC project: $(FPTOOLS_TOP)/ghc/ Makefile mk/ boilerplate.mk rules.mk docs/ Makefile ...source files for documentation... driver/ Makefile ...source files for driver... compiler/ Makefile parser/...source files for parser... renamer/...source files for renamer... ...etc... The sub-directories docs, driver, compiler, and so on, each contains a sub-component of GHC, and each has its own Makefile. There must also be a Makefile in $(FPTOOLS_TOP)/ghc. It does most of its work by recursively invoking gmake on the Makefiles in the sub-directories. We say that ghc/Makefile is a non-leaf Makefile, because it does little except organise its children, while the Makefiles in the sub-directories are all leaf Makefiles. (In principle the sub-directories might themselves contain a non-leaf Makefile and several sub-sub-directories, but that does not happen in GHC.) The Makefile in ghc/compiler is considered a leaf Makefile even though ghc/compiler has sub-directories, because these sub-directories do not themselves have Makefiles in them. They are just used to structure the collection of modules that make up GHC, but all are managed by the single Makefile in ghc/compiler. You will notice that ghc/ also contains a directory ghc/mk/. It contains GHC-specific Makefile boilerplate code.
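The non-leaf/leaf split can be sketched in miniature: a top-level Makefile that does nothing but recurse into its children, each of which does the real work. This assumes GNU make installed as make; the directory names and /tmp path are invented.

```shell
# A non-leaf Makefile driving two leaf Makefiles, in order.
set -e
rm -rf /tmp/rec-demo && mkdir -p /tmp/rec-demo/docs /tmp/rec-demo/driver
cd /tmp/rec-demo

# Non-leaf: organise the children, do no real work itself.
printf 'SUBDIRS = docs driver\nall:\n\t@for d in $(SUBDIRS); do $(MAKE) --no-print-directory -C $$d all; done\n' > Makefile

# Leaf Makefiles: here they just announce themselves.
printf 'all:\n\t@echo built docs\n' > docs/Makefile
printf 'all:\n\t@echo built driver\n' > driver/Makefile

make -s
```

Using $(MAKE), rather than a hard-wired make, is what lets options like -s propagate to the sub-makes.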
More precisely: ghc/mk/boilerplate.mk is included at the top of ghc/Makefile, and of all the leaf Makefiles in the sub-directories. It in turn includes the main boilerplate file mk/boilerplate.mk. ghc/mk/target.mk is included at the bottom of ghc/Makefile, and of all the leaf Makefiles in the sub-directories. It in turn includes the file mk/target.mk. So these two files are the place to look for GHC-wide customisation of the standard boilerplate. Boilerplate architecture <indexterm><primary>boilerplate architecture</primary></indexterm> Every Makefile includes a boilerplate.mk file at the top, and a target.mk file at the bottom. In this section we discuss what is in these files, and why there have to be two of them. In general: boilerplate.mk consists of: Definitions of millions of make variables that collectively specify the build configuration. Examples: HC_OPTS, the options to feed to the Haskell compiler; NoFibSubDirs, the sub-directories to enable within the nofib project; GhcWithHc, the name of the Haskell compiler to use when compiling GHC in the ghc project. Standard pattern rules that tell gmake how to construct one file from another. boilerplate.mk needs to be included at the top of each Makefile, so that the user can replace the boilerplate definitions or pattern rules by simply giving a new definition or pattern rule in the Makefile. gmake simply takes the last definition as the definitive one. Instead of replacing boilerplate definitions, it is also quite common to augment them. For example, a Makefile might say: SRC_HC_OPTS += -O thereby adding ``-O'' to the end of SRC_HC_OPTS. target.mk contains make rules for the standard targets described above. These rules are selectively included, depending on the setting of certain make variables. These variables are usually set in the middle section of the Makefile between the two includes.
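Both behaviours — the last definition winning, and += appending — can be seen with a two-file example. This assumes GNU make installed as make; the option strings and /tmp path are invented.

```shell
# gmake takes the last definition as definitive, and += appends.
set -e
rm -rf /tmp/ovr-demo && mkdir /tmp/ovr-demo && cd /tmp/ovr-demo

# A pretend boilerplate file, included at the top of the Makefile.
printf 'SRC_HC_OPTS = -H16m\n' > boilerplate.mk

# The Makefile augments the boilerplate definition after including it.
printf 'include boilerplate.mk\nSRC_HC_OPTS += -O\nall:\n\t@echo [$(SRC_HC_OPTS)]\n' > Makefile

make -s   # prints: [-H16m -O]
```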
target.mk must be included at the end (rather than being part of boilerplate.mk) for several tiresome reasons: gmake commits target and dependency lists earlier than it should. For example, target.mk has a rule that looks like this: $(HS_PROG) : $(OBJS) $(HC) $(LD_OPTS) $< -o $@ If this rule were in boilerplate.mk then $(HS_PROG) and $(OBJS) would not have their final values at the moment gmake encountered the rule. Alas, gmake takes a snapshot of their current values, and wires that snapshot into the rule. (In contrast, the commands executed when the rule ``fires'' are only substituted at the moment of firing.) So, the rule must follow the definitions given in the Makefile itself. Unlike pattern rules, ordinary rules cannot be overridden or replaced by subsequent rules for the same target (at least, not without an error message). Including ordinary rules in boilerplate.mk would prevent the user from writing rules for specific targets in specific cases. There are a couple of other reasons I've forgotten, but it doesn't matter too much. The main <filename>mk/boilerplate.mk</filename> file <indexterm><primary>boilerplate.mk</primary></indexterm> If you look at $(FPTOOLS_TOP)/mk/boilerplate.mk you will find that it consists of the following sections, each held in a separate file: config.mk is the build configuration file we discussed at length above. paths.mk defines make variables for pathnames and file lists. This file contains code for automatically compiling lists of source files and deriving lists of object files from those. The results can be overridden in the Makefile, but in most cases the automatic setup should do the right thing. The following variables may be set in the Makefile to affect how the automatic source file search is done: ALL_DIRS: set to a list of directories to search in addition to the current directory for source files.
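The snapshot behaviour described above can be demonstrated directly: the prerequisite list of a rule is expanded when the rule is read, while the recipe is expanded only when the rule fires. This assumes GNU make installed as make; the variable names and /tmp path are invented.

```shell
# Prerequisite lists are expanded at read time; recipes at fire time.
set -e
rm -rf /tmp/snap-demo && mkdir /tmp/snap-demo && cd /tmp/snap-demo

# OBJS is empty when the rule is read, and gets its value afterwards.
printf 'OBJS =\nprog: $(OBJS)\n\t@echo "prereqs=[$^] OBJS=[$(OBJS)]"\nOBJS = Foo.o\n' > Makefile

make -s prog   # prints: prereqs=[] OBJS=[Foo.o]
```

The empty prereqs list is exactly the wired-in snapshot; only the recipe sees the later definition, which is why target.mk must come after the Makefile's own variable definitions.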
EXCLUDE_SRCS: set to a list of source files (relative to the current directory) to omit from the automatic search. The source searching machinery is clever enough to know that if you exclude a source file from which other sources are derived, then the derived sources should also be excluded. For example, if you set EXCLUDE_SRCS to include Foo.y, then Foo.hs will also be excluded. EXTRA_SRCS: set to a list of extra source files (perhaps in directories not listed in ALL_DIRS) that should be considered. The results of the automatic source file search are placed in the following make variables: SRCS: all source files found, sorted and without duplicates, including those which might not exist yet but will be derived from other existing sources. SRCS can be overridden if necessary, in which case the variables below will follow suit. HS_SRCS: all Haskell source files in the current directory, including those derived from other source files (e.g. Happy sources also give rise to Haskell sources). HS_OBJS: object files derived from HS_SRCS. HS_IFACES: interface files (.hi files) derived from HS_SRCS. C_SRCS: all C source files found. C_OBJS: object files derived from C_SRCS. SCRIPT_SRCS: all script source files found (.lprl files). SCRIPT_OBJS: object files derived from SCRIPT_SRCS (.prl files). HSC_SRCS: all hsc2hs source files (.hsc files). HAPPY_SRCS: all happy source files (.y or .hy files). OBJS: the concatenation of $(HS_OBJS), $(C_OBJS), and $(SCRIPT_OBJS). Any or all of these definitions can easily be overridden by giving new definitions in your Makefile. What, exactly, does paths.mk consider a source file to be? It's based on the file's suffix (e.g. .hs, .lhs, .c, .hy, etc.), but this is the kind of detail that changes, so rather than enumerate the source suffixes here the best thing to do is to look in paths.mk.
opts.mk defines make variables for option strings to pass to each program. For example, it defines HC_OPTS, the option strings to pass to the Haskell compiler. See below. suffix.mk defines standard pattern rules—see below. Any of the variables and pattern rules defined by the boilerplate file can easily be overridden in any particular Makefile, because the boilerplate include comes first. Definitions after this include directive simply override the default ones in boilerplate.mk. Pattern rules and options <indexterm><primary>Pattern rules</primary></indexterm> The file suffix.mk defines standard pattern rules that say how to build one kind of file from another, for example, how to build a .o file from a .c file. (GNU make's pattern rules are more powerful and easier to use than Unix make's suffix rules.) Almost all the rules look something like this: %.o : %.c $(RM) $@ $(CC) $(CC_OPTS) -c $< -o $@ Here's how to understand the rule. It says that something.o (say Foo.o) can be built from something.c (Foo.c), by invoking the C compiler (path name held in $(CC)), passing to it the options $(CC_OPTS) and the rule's dependency $< (Foo.c in this case), and putting the result in the rule's target $@ (Foo.o in this case). Every program is held in a make variable defined in mk/config.mk—look in mk/config.mk for the complete list. One important one is the Haskell compiler, which is called $(HC). Every program's options are held in a make variable called <prog>_OPTS. The <prog>_OPTS variables are defined in mk/opts.mk. Almost all of them are defined like this: CC_OPTS = $(SRC_CC_OPTS) $(WAY$(_way)_CC_OPTS) $($*_CC_OPTS) $(EXTRA_CC_OPTS) The four variables from which CC_OPTS is built have the following meaning: SRC_CC_OPTS: options passed to all C compilations. WAY_<way>_CC_OPTS: options passed to C compilations for way <way>. For example, WAY_mp_CC_OPTS gives options to pass to the C compiler when compiling way mp.
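A stripped-down pattern rule of this shape can be exercised on its own, with the recipe reduced to an echo plus a touch so it runs anywhere. This assumes GNU make installed as make; the file names and /tmp path are invented.

```shell
# A suffix.mk-style pattern rule: $< is the dependency, $@ the target.
set -e
rm -rf /tmp/pat-demo && mkdir /tmp/pat-demo && cd /tmp/pat-demo

# %% in the printf format emits a literal % into the Makefile.
printf '%%.o : %%.c\n\t@echo "compiling $< into $@"; touch $@\n' > Makefile

touch Foo.c
make -s Foo.o   # prints: compiling Foo.c into Foo.o
```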
The variable WAY_CC_OPTS holds options to pass to the C compiler when compiling the standard way. (Multi-way compilation is discussed below.) <module>_CC_OPTS: options to pass to the C compiler that are specific to module <module>. For example, SMap_CC_OPTS gives the specific options to pass to the C compiler when compiling SMap.c. EXTRA_CC_OPTS: extra options to pass to all C compilations. This is intended for command line use, thus: gmake libHS.a EXTRA_CC_OPTS="-v" The main <filename>mk/target.mk</filename> file <indexterm><primary>target.mk</primary></indexterm> target.mk contains canned rules for all the standard targets described above. It is complicated by the fact that you don't want all of these rules to be active in every Makefile. Rather than have a plethora of tiny files which you can include selectively, there is a single file, target.mk, which selectively includes rules based on whether you have defined certain variables in your Makefile. This section explains what rules you get, what variables control them, and what the rules do. Hopefully, you will also get enough of an idea of what is supposed to happen that you can read and understand any weird special cases yourself. HS_PROG. If HS_PROG is defined, you get rules with the following targets: HS_PROG itself. This rule links $(OBJS) with the Haskell runtime system to get an executable called $(HS_PROG). install installs $(HS_PROG) in $(bindir). C_PROG is similar to HS_PROG, except that the link step links $(C_OBJS) with the C runtime system. LIBRARY is similar to HS_PROG, except that it links $(LIB_OBJS) to make the library archive $(LIBRARY), and install installs it in $(libdir). LIB_DATA, LIB_EXEC. HS_SRCS, C_SRCS. If HS_SRCS is defined and non-empty, a rule for the target depend is included, which generates dependency information for Haskell programs. Similarly for C_SRCS.
All of these rules are ``double-colon'' rules, thus install :: $(HS_PROG) ...how to install it... GNU make treats double-colon rules as separate entities. If there are several double-colon rules for the same target it takes each in turn and fires it if its dependencies say to do so. This means that you can, for example, define both HS_PROG and LIBRARY, which will generate two rules for install. When you type gmake install both rules will be fired, and both the program and the library will be installed, just as you wanted. Recursion <indexterm><primary>recursion, in makefiles</primary></indexterm> <indexterm><primary>Makefile, recursing into subdirectories</primary></indexterm> In leaf Makefiles the variable SUBDIRS is undefined. In non-leaf Makefiles, SUBDIRS is set to the list of sub-directories that contain subordinate Makefiles. It is up to you to set SUBDIRS in the Makefile. There is no automation here—SUBDIRS is too important to automate. When SUBDIRS is defined, target.mk includes a rather neat rule for the standard targets that simply invokes make recursively in each of the sub-directories. These recursive invocations are guaranteed to occur in the order in which the list of directories is specified in SUBDIRS. This guarantee can be important. For example, when you say gmake boot it can be important that the recursive invocation of make boot is done in one sub-directory (the include files, say) before another (the source files). Generally, put the most independent sub-directory first, and the most dependent last. Way management <indexterm><primary>way management</primary></indexterm> We sometimes want to build essentially the same system in several different ``ways''. For example, we want to build GHC's Prelude libraries with and without profiling, so that there is an appropriately-built library archive to link with when the user compiles his program.
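The both-rules-fire behaviour of double-colon rules is easy to see in isolation. This assumes GNU make installed as make; the recipes just echo instead of installing anything, and the /tmp path is invented.

```shell
# Double-colon rules: each rule for the same target fires independently.
set -e
rm -rf /tmp/dc-demo && mkdir /tmp/dc-demo && cd /tmp/dc-demo

# Two separate install:: rules, as HS_PROG and LIBRARY would produce.
printf 'install ::\n\t@echo installing program\ninstall ::\n\t@echo installing library\n' > Makefile

make -s install   # both recipes run, in order of appearance
```

With ordinary single-colon rules the second definition would instead provoke a warning and replace the first recipe.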
It would be possible to have a completely separate build tree for each such ``way'', but it would be horribly bureaucratic, especially since often only parts of the build tree need to be constructed in multiple ways. Instead, target.mk contains some clever magic to allow you to build several versions of a system, and to control locally how many versions are built and how they differ. This section explains the magic. The files for a particular way are distinguished by munging the suffix. The normal way is always built, and its files have the standard suffixes .o, .hi, and so on. In addition, you can build one or more extra ways, each distinguished by a way tag. The object files and interface files for one of these extra ways are distinguished by their suffix. For example, way mp has files .mp_o and .mp_hi. Library archives have their way tag the other side of the dot, for boring reasons; thus, libHS_mp.a. A make variable called way holds the current way tag. way is only ever set on the command line of gmake (usually in a recursive invocation of gmake by the system). It is never set inside a Makefile. So it is a global constant for any one invocation of gmake. Two other make variables, way_ and _way, are immediately derived from $(way) and never altered. If way is not set, then neither are way_ and _way, and the invocation of make will build the normal way. If way is set, then the other two variables are set in sympathy. For example, if $(way) is ``mp'', then way_ is set to ``mp_'' and _way is set to ``_mp''. These three variables are then used when constructing file names. So how does make ever get recursively invoked with way set? There are two ways in which this happens: For some (but not all) of the standard targets, when in a leaf sub-directory, make is recursively invoked for each way tag in $(WAYS). You set WAYS in the Makefile to the list of way tags you want these targets built for. 
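As a rough sketch (an illustration of the mechanism, not the actual code in the suite's mk/*.mk files), the derived variables and a leaf Makefile's way list look like this:

```make
# Sketch only; the real logic lives in the suite's mk/*.mk files.
# Derivation of way_ and _way from way, as described above:
ifneq "$(way)" ""
way_ := $(way)_
_way := _$(way)
endif

# In a leaf Makefile you simply list the extra way tags to build.
# With tags p and mp, objects become %.p_o and %.mp_o alongside
# the normal-way %.o files:
WAYS = p mp
```

With way=mp on the command line, $(way_) expands to mp_ and $(_way) to _mp, which is exactly what the suffix-munged file names above require.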
The mechanism here is very much like the recursive invocation of make in sub-directories (). It is up to you to set WAYS in your Makefile; this is how you control what ways will get built. For a useful collection of targets (such as libHS_mp.a, Foo.mp_o) there is a rule which recursively invokes make to make the specified target, setting the way variable. So if you say gmake Foo.mp_o you should see a recursive invocation gmake Foo.mp_o way=mp, and in this recursive invocation the pattern rule for compiling a Haskell file into a .o file will match. The key pattern rules (in suffix.mk) look like this: %.$(way_)o : %.lhs $(HC) $(HC_OPTS) $< -o $@ Neat, eh? You can invoke make with a particular way setting yourself, in order to build files related to a particular way in the current directory. For example, $ make way=p will build files for the profiling way only in the current directory. When the canned rule isn't right Sometimes the canned rule just doesn't do the right thing. For example, in the nofib suite we want the link step to print out timing information. The thing to do here is not to define HS_PROG or C_PROG, and instead define a special-purpose rule in your own Makefile. By using different variable names you will avoid the canned rules being included, and conflicting with yours. Booting/porting from C (<filename>.hc</filename>) files <indexterm><primary>building GHC from .hc files</primary></indexterm> <indexterm><primary>booting GHC from .hc files</primary></indexterm> <indexterm><primary>porting GHC</primary></indexterm> This section is for people trying to get GHC going by using the supplied intermediate C (.hc) files. This would probably be because no binaries have been provided, or because the machine is not ``fully supported''. The intermediate C files are normally made available together with a source release; please check the announce message for exact directions of where to find them. If we haven't made them available or you can't find them, please ask. 
Assuming you've got them, unpack them on top of a fresh source tree. This will place matching .hc files next to the corresponding Haskell source in the compiler subdirectory ghc and in the language package of hslibs (i.e., in hslibs/lang). Then follow the `normal' instructions in for setting up a build tree. The actual build process is fully automated by the hc-build script located in the distrib directory. If you eventually want to install GHC into the directory INSTALL_DIRECTORY, the following command will execute the whole build process (it won't install yet): foo% distrib/hc-build --prefix=INSTALL_DIRECTORY By default, the installation directory is /usr/local. If that is what you want, you may omit the argument to hc-build. Generally, any option given to hc-build is passed through to the configuration script configure. If hc-build successfully completes the build process, you can install the resulting system, as normal, with foo% make install That's the mechanics of the boot process, but, of course, if you're trying to boot on a platform that is not supported and significantly `different' from any of the supported ones, this is only the start of the adventure…(ToDo: porting tips—stuff to look out for, etc.) Known pitfalls in building Glasgow Haskell <indexterm><primary>problems, building</primary></indexterm> <indexterm><primary>pitfalls, in building</primary></indexterm> <indexterm><primary>building pitfalls</primary></indexterm> WARNINGS about pitfalls and known ``problems'': One difficulty that comes up from time to time is running out of space in TMPDIR. (It is impossible for the configuration stuff to compensate for the vagaries of different sysadmin approaches to temp space.) <indexterm><primary>tmp, running out of space in</primary></indexterm> The quickest way around it is setenv TMPDIR /usr/tmp or even setenv TMPDIR . (or the equivalent incantation with your shell of choice). The best way around it is to say export TMPDIR=<dir> in your build.mk file. 
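For example, a build.mk along these lines sets it once for the whole build; the directory shown is illustrative:

```make
# In mk/build.mk; /big/scratch/tmp is an illustrative path on a
# partition with plenty of free space.
export TMPDIR = /big/scratch/tmp
```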
Then GHC and the other fptools programs will use the appropriate directory in all cases. In compiling some support-code bits, e.g., in ghc/rts/gmp and even in ghc/lib, you may get a few C-compiler warnings. We think these are OK. When compiling via C, you'll sometimes get ``warning: assignment from incompatible pointer type'' out of GCC. Harmless. Similarly, archiving warning messages like the following are not a problem: ar: filename GlaIOMonad__1_2s.o truncated to GlaIOMonad_ ar: filename GlaIOMonad__2_2s.o truncated to GlaIOMonad_ ... In compiling the compiler proper (in compiler/), you may get an ``Out of heap space'' error message. These can vary with the vagaries of different systems, it seems. The solution is simple: If you're compiling with GHC 4.00 or later, then the maximum heap size must have been reached. This is somewhat unlikely, since the maximum is set to 64M by default. Anyway, you can raise it with the flag (add this flag to the <module>_HC_OPTS make variable in the appropriate Makefile). For GHC < 4.00, add a suitable flag to the Makefile, as above, and try again: gmake. (see for information about <module>_HC_OPTS.) Alternatively, just cut to the chase: % cd ghc/compiler % make EXTRA_HC_OPTS=-optCrts-M128M If you try to compile some Haskell, and you get errors from GCC about lots of things from /usr/include/math.h, then your GCC was mis-installed. fixincludes wasn't run when it should've been. As fixincludes is now automagically run as part of GCC installation, this bug also suggests that you have an old GCC. You may need to re-ranlib<indexterm><primary>ranlib</primary></indexterm> your libraries (on Sun4s). % cd $(libdir)/ghc-x.xx/sparc-sun-sunos4 % foreach i ( `find . -name '*.a' -print` ) # or other-shell equiv... ? ranlib $i ? # or, on some machines: ar s $i ? end We'd be interested to know if this is still necessary. GHC's sources go through cpp before being compiled, and cpp varies a bit from one Unix to another. 
One particular gotcha is macro calls like this: SLIT("Hello, world") Some cpps treat the comma inside the string as separating two macro arguments, so you get :731: macro `SLIT' used with too many (2) args Alas, cpp doesn't tell you the offending file! Workaround: don't put weird things in string args to cpp macros. Notes for building under Windows This section summarises how to get the utilities you need on your Win95/98/NT/2000 machine to use CVS and build GHC. Similar notes for installing and running GHC may be found in the user guide. In general, Win95/Win98 behave the same, and WinNT/Win2k behave the same. You should read the GHC installation guide sections on Windows (in the user guide) before continuing to read these notes. Before you start Make sure that the user environment variable MAKE_MODE is set to UNIX. If you don't do this you get very weird messages when you type make, such as: /c: /c: No such file or directory GHC uses the mingw C compiler to generate code, so you have to install that. Just pick up a mingw bundle at http://www.mingw.org/. We install it in c:/mingw. Install a version of GHC, and put it in your PATH. Because of various hard-wired infelicities, you need to copy bash.exe, perl.exe and cat.exe (from Cygwin's bin directory), to /bin (discover where your Cygwin root directory is by typing mount). If /bin points to the Cygwin bin directory, there's no need to copy anything. By default, cygwin provides the command shell ash as sh.exe. It has a couple of 'issues', so in your /bin directory, make sure that bash.exe is also provided as sh.exe. You should not need to install ssh and cvs: they come with Cygwin. Check out a copy of GHC sources from the CVS repository, following the instructions above (). Building GHC Run autoconf both in fptools and in fptools/ghc. If you omit the latter step you'll get an error when you run ./configure: ...lots of stuff... 
creating mk/config.h mk/config.h is unchanged configuring in ghc running /bin/sh ./configure --cache-file=.././config.cache --srcdir=. ./configure: ./configure: No such file or directory configure: error: ./configure failed for ghc You either need to add ghc to your PATH before you invoke configure, or use the configure option . The Windows installer for GHC tells you at the end what additions you need to make to your PATH. After autoconf run ./configure in fptools/ thus: ./configure --host=i386-unknown-mingw32 --with-gcc=/mingw/bin/gcc Both these options are important! It's possible to get into trouble using the wrong C compiler!
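In summary, a typical configuration session under Windows therefore looks like this (a sketch of the steps above; your paths may differ):

```
$ cd fptools
$ autoconf
$ (cd ghc && autoconf)
$ ./configure --host=i386-unknown-mingw32 --with-gcc=/mingw/bin/gcc
```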