X-Git-Url: http://git.megacz.com/?a=blobdiff_plain;f=ghc%2Fdocs%2Fusers_guide%2Fseparate_compilation.sgml;h=6595b88435f75914a4c6cb4381eca7e298c92efe;hb=ed1703f0decbe5dc01a3c90c698dc72a946bc4fd;hp=c124eb2865335035bed54033f1955af10bf743d1;hpb=7d67a216958e8986cf067f96f9e9c7a4101fd29b;p=ghc-hetmet.git diff --git a/ghc/docs/users_guide/separate_compilation.sgml b/ghc/docs/users_guide/separate_compilation.sgml index c124eb2..6595b88 100644 --- a/ghc/docs/users_guide/separate_compilation.sgml +++ b/ghc/docs/users_guide/separate_compilation.sgml @@ -1,131 +1,482 @@ - Separate compilation - + Filenames and separate compilation + separate compilation recompilation checker make and recompilation - - This section describes how GHC supports separate - compilation. - - - Interface files - + + This section describes what files GHC expects to find, what + files it creates, where these files are stored, and what options + affect this behaviour. + + Note that this section is written with + hierarchical modules in mind (see ); hierarchical modules are an + extension to Haskell 98 which extends the lexical syntax of + module names to include a dot ‘.’. Non-hierarchical + modules are thus a special case in which none of the module names + contain dots. + + Pathname conventions vary from system to system. In + particular, the directory separator is + ‘/’ on Unix systems and + ‘\’ on Windows systems. In the + sections that follow, we shall consistently use + ‘/’ as the directory separator; + substitute the appropriate character for your + system. + + + Haskell source files + + Each Haskell source module should be placed in a file on + its own. + + The file should usually be named after the module name, by + replacing dots in the module name by directory separators. For + example, on a Unix system, the module A.B.C + should be placed in the file A/B/C.hs, + relative to some base directory. 
GHC's behaviour if this rule + is not followed is fully defined by the following section (). + + + + Output files + interface files .hi files - - When GHC compiles a source file A.hs - which contains a module A, say, it generates - an object A.o, and a - companion interface file - A.hi. The interface file is not intended - for human consumption, as you'll see if you take a look at one. - It's merely there to help the compiler compile other modules in - the same program. - - NOTE: In general, the name of a file containing module - M should be named M.hs - or M.lhs. The only exception to this rule is - module Main, which can be placed in any - file.filenamesfor - modules - - The interface file for A contains - information needed by the compiler when it compiles any module - B that imports A, whether - directly or indirectly. When compiling B, - GHC will read A.hi to find the details that - it needs to know about things defined in - A. - - The interface file may contain all sorts of things that - aren't explicitly exported from A by the - programmer. For example, even though a data type is exported - abstractly, A.hi will contain the full data - type definition. For small function definitions, - A.hi will contain the complete definition - of the function. For bigger functions, - A.hi will contain strictness information - about the function. And so on. GHC puts much more information - into .hi files when optimisation is turned - on with the flag (see ). Without it - puts in just the minimum; with it lobs in a - whole pile of stuff. optimsation, effect on - .hi files - - A.hi should really be thought of as a - compiler-readable version of A.o. If you - use a .hi file that wasn't generated by the - same compilation run that generates the .o - file the compiler may assume all sorts of incorrect things about - A, resulting in core dumps and other - unpleasant happenings. 
- + object files + .o files + + When asked to compile a source file, GHC normally + generates two files: an object file, and + an interface file. + + The object file, which normally ends in a + .o suffix (or .obj if + you're on Windows), contains the compiled code for the module. + + The interface file, + which normally ends in a .hi suffix, contains + the information that GHC needs in order to compile further + modules that depend on this module. It contains things like the + types of exported functions, definitions of data types, and so + on. It is stored in a binary format, so don't try to read one; + use the option instead (see ). + + You should think of the object file and the interface file as a + pair, since the interface file is in a sense a compiler-readable + description of the contents of the object file. If the + interface file and object file get out of sync for any reason, + then the compiler may end up making assumptions about the object + file that aren't true; trouble will almost certainly follow. + For this reason, we recommend keeping object files and interface + files in the same place (GHC does this by default, but it is + possible to override the defaults as we'll explain + shortly). + + Every module has a module name + defined in its source code (module A.B.C where + ...). + + The name of the object file generated by GHC is derived + according to the following rules, where + osuf is the object-file suffix (this + can be changed with the option). + + + + If there is no option (the + default), then the object filename is derived from the + source filename (ignoring the module name) by replacing the + suffix with osuf. + + + If +  dir + has been specified, then the object filename is + dir/mod.osuf, + where mod is the module name with + dots replaced by slashes. 
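The two naming rules above can be sketched as a small shell function. This is an illustration only, not something GHC provides: the name object_file_name, its argument order, and the convention of passing the suffix without its leading dot are all assumptions made for the example. It does not run GHC; it only predicts the file name the rules describe.

```shell
# Hypothetical helper (not part of GHC): predicts the object file name
# from the two rules above.
#   $1 = source file
#   $2 = module name
#   $3 = object directory ("" if none was given)
#   $4 = object suffix without the leading dot (defaults to "o")
object_file_name() {
    src=$1 mod=$2 odir=$3 osuf=${4:-o}
    if [ -z "$odir" ]; then
        # No object directory: replace the source file's suffix with osuf.
        # The module name is ignored in this case.
        printf '%s.%s\n' "${src%.*}" "$osuf"
    else
        # Object directory given: dir/mod.osuf, with the dots in the
        # module name replaced by slashes.
        printf '%s/%s.%s\n' "$odir" "$(printf '%s' "$mod" | tr . /)" "$osuf"
    fi
}

object_file_name src/A/B/C.hs A.B.C ""      # prints src/A/B/C.o
object_file_name src/A/B/C.hs A.B.C build   # prints build/A/B/C.o
object_file_name foo.hs Main build          # prints build/Main.o
```

Note how the third call ignores the source file name entirely: once an object directory is given, only the module name determines where the object file goes.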
+ + + + The name of the interface file is derived using the same + rules, except that the suffix is + hisuf (.hi by + default) instead of osuf, and the + relevant options are and + instead of and + respectively. + + For example, if GHC compiles the module + A.B.C in the file + src/A/B/C.hs, with no + -odir or -hidir flags, the + interface file will be put in src/A/B/C.hi + and the object file in src/A/B/C.o. + + For any module that is imported, GHC requires that the + name of the module in the import statement exactly matches the + name of the module in the interface file (or source file) found + using the strategy specified in . + This means that for most modules, the source file name should + match the module name. + + Note, however, that it is reasonable to have a module + Main in a file named + foo.hs; this only works because GHC + never needs to search for the interface for module + Main (because it is never imported). It is + therefore possible to have several Main + modules in separate source files in the same directory, and GHC + will not get confused. + + In batch compilation mode, the name of the object file can + also be overridden using the option, and the + name of the interface file can be specified directly using the + option. - - Finding interface files + + The search path + search path + interface files, finding them finding interface files In your program, you import a module Foo by saying import Foo. - GHC goes looking for an interface file, - Foo.hi. It has a builtin list of - directories (notably including .) where it - looks. + In mode or GHCi, GHC will look for a + source file for Foo and arrange to compile it + first. Without , GHC will look for the + interface file for Foo, which should have + been created by an earlier compilation of + Foo. GHC uses the same strategy in each of + these cases for finding the appropriate file. + + This strategy is as follows: GHC keeps a list of + directories called the search path. 
For + each of these directories, it tries appending + basename.extension + to the directory, and checks whether the file exists. The value + of basename is the module name with + dots replaced by the directory separator ('/' or '\', depending + on the system), and extension is a + source extension (hs, lhs) + if we are in mode or GHCi, or + hisuf otherwise. + + For example, suppose the search path contains directories + d1, d2, and + d3, and we are in --make + mode looking for the source file for a module + A.B.C. GHC will look in + d1/A/B/C.hs, d1/A/B/C.lhs, + d2/A/B/C.hs, and so on. + + The search path by default contains a single directory: + . (i.e. the current directory). The following + options can be used to add to or change the contents of the + search path: - - + - - This flag prepends a colon-separated - list of dirs to the “import - directories” list. See also - for the significance of using relative and absolute - pathnames in the list. + + This flag appends a colon-separated + list of dirs to the search path. - resets the “import directories” list - back to nothing. + resets the search path back to nothing. - - See also the section on packages (), which describes how to use installed - libraries. - + This isn't the whole story: GHC also looks for modules in + pre-compiled libraries, known as packages. See the section on + packages (), for details. - - Other options related to interface files - interface files, options + + Redirecting the compilation output(s) + + output-directing options + redirecting compilation output + file + + + GHC's compiled output normally goes into a + .hc, .o, etc., + file, depending on the last-run compilation phase. The + option + re-directs the output of that last-run phase to + file. + + Note: this “feature” can be + counterintuitive: ghc -C -o foo.o + foo.hs will put the intermediate C code in the + file foo.o, name + notwithstanding! 
+ + This option is most often used when creating an + executable file, to set the filename of the executable. + For example: + ghc -o prog --make Main + + will compile the program starting with module + Main and put the executable in the + file prog. + + Note: on Windows, if the result is an executable + file, the extension ".exe" is added + if the specified filename does not already have an + extension. Thus + + ghc -o foo Main.hs + + will compile and link the module + Main.hs, and put the resulting + executable in foo.exe (not + foo). + + + + + dir + + + Redirects object files to directory + dir. For example: + + +$ ghc -c parse/Foo.hs parse/Bar.hs gurgle/Bumble.hs -odir `arch` + + + The object files, Foo.o, + Bar.o, and + Bumble.o would be put into a + subdirectory named after the architecture of the executing + machine (x86, + mips, etc). + + Note that the option does + not affect where the interface files + are put; use the option for that. + In the above example, they would still be put in + parse/Foo.hi, + parse/Bar.hi, and + gurgle/Bumble.hi. + + + + file The interface output may be directed to another file bar2/Wurble.iface with the option - (not recommended). - To avoid generating an interface at all, you can say - -ohi /dev/null, for example. + (not + recommended). + + WARNING: if you redirect the interface file + somewhere that GHC can't find it, then the recompilation + checker may get confused (at the least, you won't get any + recompilation avoidance). We recommend using a + combination of and + options instead, if + possible. + + To avoid generating an interface at all, you could + use this option to redirect the interface into the bit + bucket: -ohi /dev/null, for + example. + + + + + dir + + + + Redirects all generated interface files into + dir, instead of the + default. + + + + + suffix + suffix + suffix + + + + + The + suffix will change the + .o file suffix for object files to + whatever you specify. 
We use this when compiling + libraries, so that objects for the profiling versions of + the libraries don't clobber the normal ones. + + Similarly, the + suffix will change the + .hi file suffix for non-system + interface files (see ). + + Finally, the option + suffix will change the + .hc file suffix for compiler-generated + intermediate C files. + + The / + game is particularly useful if you want to compile a + program both with and without profiling, in the same + directory. You can say: + + ghc ... + to get the ordinary version, and + + ghc ... -osuf prof.o -hisuf prof.hi -prof -auto-all + to get the profiled version. + + + + + + + Keeping Intermediate Files + intermediate files, saving + + .hc files, saving + + .s files, saving + + + The following options are useful for keeping certain + intermediate files around, when normally GHC would throw these + away after compilation: + + + + + + + + + Keep intermediate .hc files when + doing .hs-to-.o + compilations via C (NOTE: .hc files + aren't generated when using the native code generator; you + may need to use to force them + to be produced). + + + + + + + + + + Keep intermediate .s files. + + + + + + Keep intermediate .raw-s files. + These are the direct output from the C compiler, before + GHC does “assembly mangling” to produce the + .s file. Again, these are not produced + when using the native code generator. + + + + + + + + + + temporary files + keeping + + + Instructs the GHC driver not to delete any of its + temporary files, which it normally keeps in + /tmp (or possibly elsewhere; see ). Running GHC with + will show you what temporary files + were generated along the way. + + + + + + + Redirecting temporary files + + + temporary files + redirecting + + + + + + + + If you have trouble because of running out of space + in /tmp (or wherever your + installation thinks temporary files should go), you may + use the -tmpdir + <dir> option to specify + an alternate directory. 
For example, says to put temporary files in the current + working directory. + + Alternatively, use your TMPDIR + environment variable.TMPDIR + environment variable Set it to the + name of the directory where temporary files should be put. + GCC and other programs will honour the + TMPDIR variable as well. + + Even better idea: Set the + DEFAULT_TMPDIR make variable when + building GHC, and never worry about + TMPDIR again. (see the build + documentation). + + + + + + + Other options related to interface files + interface files, options + + + @@ -166,8 +517,19 @@ the labour. + + + + file + + + + Where file is the name of + an interface file, dumps the contents of that interface in + a human-readable (ish) format. + + - @@ -190,7 +552,7 @@ - + In the olden days, GHC compared the newly-generated .hi file with the previous version; if they were identical, it left the old one alone and didn't change its @@ -201,10 +563,14 @@ This doesn't work any more. Suppose module C imports module B, and B imports module A. So - changes to A.hi should force a - recompilation of C. And some changes to - A (changing the definition of a function that - appears in an inlining of a function exported by + changes to module A might require module + C to be recompiled, and hence when + A.hi changes we should check whether + C should be recompiled. However, the + dependencies of C will only list + B.hi, not A.hi, and some + changes to A (changing the definition of a + function that appears in an inlining of a function exported by B, say) may conceivably not change B.hi one jot. So now… @@ -248,7 +614,7 @@ OBJS = Main.o Foo.o Bar.o .SUFFIXES : .o .hs .hi .lhs .hc .s cool_pgm : $(OBJS) - rm $@ + rm -f $@ $(HC) -o $@ $(HC_OPTS) $(OBJS) # Standard suffix rules @@ -367,6 +733,14 @@ A.o : B.hi-boot makefile, then the old dependencies are deleted first. 
+ Don't forget to use the same + options on the ghc -M command line as you + would when compiling; this enables the dependency generator to + locate any imported modules that come from packages. The + package modules won't be included in the dependencies + generated, though (but see the + option below). + The dependency generation phase of GHC can take some additional options, which you may find useful. For historical reasons, each option passed to the dependency generator from @@ -377,9 +751,9 @@ A.o : B.hi-boot ghc -M -optdep-f -optdep.depend ... - + The options which affect dependency generation are: - + @@ -387,7 +761,7 @@ ghc -M -optdep-f -optdep.depend ... Turn off warnings about interface file shadowing. - + file @@ -404,6 +778,7 @@ ghc -M -optdep-f -optdep.depend ... + @@ -439,7 +814,7 @@ ghc -M -optdep-f -optdep.depend ... - + Regard <file> as "stable"; i.e., exclude it from having dependencies on @@ -450,12 +825,12 @@ ghc -M -optdep-f -optdep.depend ... - same as + same as - + Regard the colon-separated list of directories <dirs> as containing stable, @@ -465,25 +840,23 @@ ghc -M -optdep-f -optdep.depend ... - + Regard <file> as not "stable"; i.e., generate dependencies on it (if any). This option is normally used in conjunction with - the option. + the option. - + - Regard prelude libraries as unstable, i.e., - generate dependencies on the prelude modules used - (including Prelude). This option is - normally only used by the various system libraries. If a - option is used, dependencies - will also be generated on the library's - interfaces. + Regard modules imported from packages as unstable, + i.e., generate dependencies on the package modules used + (including Prelude, and all other + standard Haskell libraries). This option is normally + only used by the various system libraries. 
@@ -548,61 +921,147 @@ import {-# SOURCE #-} A would look like the following: -__interface A 1 0 where -__export A TA{MkTA} ; -1 newtype TA = MkTA PrelBase.Int ; +module A where +newtype TA = MkTA GHC.Base.Int - The syntax is essentially the same as a normal - .hi file (unfortunately), so you can - usually tailor an existing .hi file to make - a .hi-boot file. + The syntax is similar to a normal Haskell source file, but + with some important differences: + + + + Non-local entities must be qualified with their + original defining module. Qualifying + by a module which just re-exports the entity won't do. In + particular, most Prelude entities aren't + actually defined in the Prelude (see for + example GHC.Base.Int in the above + example). HINT: to find out the fully-qualified name for + entities in the Prelude (or anywhere for + that matter), try using GHCi's + :info command, e.g. +Prelude> :m -Prelude +> :i IO.IO +-- GHC.IOBase.IO is a type constructor +newtype GHC.IOBase.IO a +... + + + Only data, type, + newtype, class, and + type signature declarations may be included. You cannot declare + instances or derive them automatically. + + + + For a data or newtype declaration, you may omit all +the constructors, by omitting the '=' and everything that follows it: + +module A where + data TA + + In a source program + this would declare TA to have no constructors (a GHC extension: see ), + but in an hi-boot file it means "I don't know or care what the constructors are". + This is the most common form of data type declaration, because it's easy to get right. + + You can also write out the constructors but, if you do so, you must write + them out precisely as in the real definition. + It is especially delicate if you use a strictness annotation "!", + with or without an {-# UNPACK #-} pragma. In a source file + GHC may or may not choose to unbox the argument, but in an hi-boot file it's + assumed that you express the outcome of this decision. 
+ (So in the cases where GHC decided not to unpack, you must not use the pragma.) + Tread with care. + + Regardless of whether you write the constructors, you must write all the type parameters, + including their kinds + if they are not '*'. (You can give explicit kinds in source files too (), + but you must do so in hi-boot files.) + + + For a class declaration, you may not specify any class +operations. We could lift this restriction if it became tiresome. + + Notice that we only put the declaration for the newtype TA in the hi-boot file, not the signature for f, since f isn't used by B. - The number “1” after - “__interface A” gives the version - number of module A; it is incremented whenever anything in A's - interface file changes. In a normal interface file, the - “0” is the version number of the compiler which - generated the interface file; it is used to ensure that we don't - mix-and-match interface files between compiler versions. - Leaving it as zero in an hi-boot file turns - off this check. - - The number “1” at the beginning of a - declaration is the version number of that - declaration: for the purposes of .hi-boot - files these can all be set to 1. All names must be fully - qualified with the original module that an - object comes from: for example, the reference to - Int in the interface for A - comes from PrelBase, which is a module - internal to GHC's prelude. It's a pain, but that's the way it - is. - - If you want an hi-boot file to export a - data type, but you don't want to give its constructors (because - the constructors aren't used by the SOURCE-importing module), - you can write simply: + + + + Orphan modules and instance declarations + + Haskell specifies that when compiling module M, any instance +declaration in any module "below" M is visible. (Module A is "below" +M if A is imported directly by M, or if A is below a module that M imports directly.) 
+In principle, GHC must therefore read the interface files of every module below M, +just in case they contain an instance declaration that matters to M. This would +be a disaster in practice, so GHC tries to be clever. + +In particular, if an instance declaration is in the same module as the definition +of any type or class mentioned in the head of the instance declaration, then +GHC has to visit that interface file anyway. Example: -__interface A 1 0 where -__export A TA; -1 data TA + module A where + instance C a => D (T a) where ... + data T a = ... + The instance declaration is only relevant if the type T is in use, and if +so, GHC will have visited A's interface file to find T's definition. - (You must write all the type parameters, but leave out the - '=' and everything that follows it.) + The only problem comes when a module contains an instance declaration +and GHC has no other reason for visiting the module. Example: + + module Orphan where + instance C a => D (T a) where ... + class C a where ... + +Here, neither D nor T is declared in module Orphan. +We call such modules ``orphan modules'', +defined thus: + + An orphan module + orphan module + contains at least one orphan instance or at + least one orphan rule. + + An instance declaration in a module M is an orphan instance if + orphan instance + none of the type constructors + or classes mentioned in the instance head (the part after the ``=>'') are declared + in M. + + Only the instance head counts. In the example above, it is not good enough for C's declaration + to be in module A; it must be the declaration of D or T. + + + A rewrite rule in a module M is an orphan rule + orphan rule + if none of the variables, type constructors, + or classes that are free in the left hand side of the rule are declared in M. + + + + + GHC identifies orphan modules, and visits the interface file of +every orphan module below the module being compiled. This is usually +wasted work, but there is no avoiding it. 
You should therefore do +your best to have as few orphan modules as possible. + + + + You can identify an orphan module by looking in its interface +file, M.hi, using the +. If there is a ``!'' on the first line, +GHC considers it an orphan module. + + - Note: This is all a temporary - solution, a version of the compiler that handles mutually - recursive modules properly without the manual construction of - interface files, is (allegedly) in the works. -