X-Git-Url: http://git.megacz.com/?a=blobdiff_plain;f=docs%2Fusers_guide%2Fsooner.xml;fp=docs%2Fusers_guide%2Fsooner.xml;h=1aba5d1af0afb4ffa66c489d98c93373f6875da2;hb=0065d5ab628975892cea1ec7303f968c3338cbe1;hp=0000000000000000000000000000000000000000;hpb=28a464a75e14cece5db40f2765a29348273ff2d2;p=ghc-hetmet.git diff --git a/docs/users_guide/sooner.xml b/docs/users_guide/sooner.xml new file mode 100644 index 0000000..1aba5d1 --- /dev/null +++ b/docs/users_guide/sooner.xml @@ -0,0 +1,602 @@ + + +Advice on: sooner, faster, smaller, thriftier + +Please advise us of other “helpful hints” that +should go here! + + +Sooner: producing a program more quickly + + +compiling faster +faster compiling + + + + Don't use or (especially) : + + By using them, you are telling GHC that you are + willing to suffer longer compilation times for + better-quality code. + + GHC is surprisingly zippy for normal compilations + without ! + + + + + Use more memory: + + Within reason, more memory for heap space means less + garbage collection for GHC, which means less compilation + time. If you use the option, + you'll get a garbage-collector report. (Again, you can use + the cheap-and-nasty + option to send the GC stats straight to standard + error.) + + If it says you're using more than 20% of total + time in garbage collecting, then more memory would + help. + + If the heap size is approaching the maximum (64M by + default), and you have lots of memory, try increasing the + maximum with the + -M<size> + option option, e.g.: ghc -c + -O -M1024m Foo.hs. + + Increasing the default allocation area size used by + the compiler's RTS might also help: use the + -A<size> + option option. + + If GHC persists in being a bad memory citizen, please + report it as a bug. + + + + + Don't use too much memory! + + As soon as GHC plus its “fellow citizens” + (other processes on your machine) start using more than the + real memory on your machine, and the + machine starts “thrashing,” the party + is over. 
Compile times will be worse than + terrible! Use something like the csh-builtin + time command to get a report on how many + page faults you're getting. + + If you don't know what virtual memory, thrashing, and + page faults are, or you don't know the memory configuration + of your machine, don't try to be clever + about memory use: you'll just make your life a misery (and + for other people, too, probably). + + + + + Try to use local disks when linking: + + Because Haskell objects and libraries tend to be + large, it can take many real seconds to slurp the bits + to/from a remote filesystem. + + It would be quite sensible to + compile on a fast machine using + remotely-mounted disks; then link on a + slow machine that had your disks directly mounted. + + + + + Don't derive/use Read unnecessarily: + + It's ugly and slow. + + + + + GHC compiles some program constructs slowly: + + Deeply-nested list comprehensions seem to be one such; + in the past, very large constant tables were bad, + too. + + We'd rather you reported such behaviour as a bug, so + that we can try to correct it. + + The part of the compiler that is occasionally prone to + wandering off for a long time is the strictness analyser. + You can turn this off individually with + . + -fno-strictness + anti-option + + To figure out which part of the compiler is badly + behaved, the + + option is your friend. + + If your module has big wads of constant data, GHC may + produce a huge basic block that will cause the native-code + generator's register allocator to founder. Bring on + -fvia-C + option (not that GCC will be that + quick about it, either). + + + + + Explicit import declarations: + + Instead of saying import Foo, say + import Foo (...stuff I want...) You can + get GHC to tell you the minimal set of required imports by + using the option + (see ). + + Truthfully, the reduction on compilation time will be + very small. 
However, judicious use of + import declarations can make a program + easier to understand, so it may be a good idea + anyway. + + + + + + + Faster: producing a program that runs quicker + + faster programs, how to produce + + The key tool to use in making your Haskell program run + faster are GHC's profiling facilities, described separately in + . There is no + substitute for finding where your program's time/space + is really going, as opposed to where you + imagine it is going. + + Another point to bear in mind: By far the best way to + improve a program's performance dramatically + is to use better algorithms. Once profiling has thrown the + spotlight on the guilty time-consumer(s), it may be better to + re-think your program than to try all the tweaks listed below. + + Another extremely efficient way to make your program snappy + is to use library code that has been Seriously Tuned By Someone + Else. You might be able to write a better + quicksort than the one in Data.List, but it + will take you much longer than typing import + Data.List. + + Please report any overly-slow GHC-compiled programs. Since + GHC doesn't have any credible competition in the performance + department these days it's hard to say what overly-slow means, so + just use your judgement! Of course, if a GHC compiled program + runs slower than the same program compiled with NHC or Hugs, then + it's definitely a bug. + + + + Optimise, using or : + + This is the most basic way to make your program go + faster. Compilation time will be slower, especially with + . + + At present, is nearly + indistinguishable from . + + + + + Compile via C and crank up GCC: + + The native code-generator is designed to be quick, not + mind-bogglingly clever. Better to let GCC have a go, as it + tries much harder on register allocation, etc. + + At the moment, if you turn on you + get GCC instead. This may change in the future. + + So, when we want very fast code, we use: . 
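The "Seriously Tuned By Someone Else" advice earlier in this section can be sketched in a couple of lines. This is illustrative only: it assumes nothing beyond the Data.List sort discussed above.

```haskell
-- A sketch of the library-code advice above: Data.List's sort has had
-- far more tuning than a hand-rolled quicksort is likely to get, and
-- using it costs one import.
import Data.List (sort)

main :: IO ()
main = print (sort [3, 1, 2 :: Int])
```

Running it prints [1,2,3]; the point is simply that the one-line import buys tuned library code for free.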
+ + + + + Overloaded functions are not your friend: + + Haskell's overloading (using type classes) is elegant, + neat, etc., etc., but it is death to performance if left to + linger in an inner loop. How can you squash it? + + + + Give explicit type signatures: + + Signatures are the basic trick; putting them on + exported, top-level functions is good + software-engineering practice, anyway. (Tip: using + -fwarn-missing-signatures + option can help enforce good + signature-practice). + + The automatic specialisation of overloaded + functions (with ) should take care + of overloaded local and/or unexported functions. + + + + + Use SPECIALIZE pragmas: + + SPECIALIZE pragma + overloading, death to + + Specialize the overloading on key functions in + your program. See + and . + + + + + “But how do I know where overloading is creeping in?”: + + A low-tech way: grep (search) your interface + files for overloaded type signatures. You can view + interface files using the + option (see ). + + +% ghc --show-iface Foo.hi | egrep '^[a-z].*::.*=>' + + + + + + + + + + Strict functions are your dear friends: + + and, among other things, lazy pattern-matching is your + enemy. + + (If you don't know what a “strict + function” is, please consult a functional-programming + textbook. A sentence or two of explanation here probably + would not do much good.) + + Consider these two code fragments: + + +f (Wibble x y) = ... # strict + +f arg = let { (Wibble x y) = arg } in ... # lazy + + + The former will result in far better code. + + A less contrived example shows the use of + cases instead of lets + to get stricter code (a good thing): + + +f (Wibble x y) # beautiful but slow + = let + (a1, b1, c1) = unpackFoo x + (a2, b2, c2) = unpackFoo y + in ... + +f (Wibble x y) # ugly, and proud of it + = case (unpackFoo x) of { (a1, b1, c1) -> + case (unpackFoo y) of { (a2, b2, c2) -> + ... 
+ }} + + + + + + + + GHC loves single-constructor data-types: + + It's all the better if a function is strict in a + single-constructor type (a type with only one + data-constructor; for example, tuples are single-constructor + types). + + + + + Newtypes are better than datatypes: + + If your datatype has a single constructor with a + single field, use a newtype declaration + instead of a data declaration. The + newtype will be optimised away in most + cases. + + + + + “How do I find out a function's strictness?” + + Don't guess—look it up. + + Look for your function in the interface file, then for + the third field in the pragma; it should say + __S <string>. The + <string> gives the strictness of + the function's arguments. L is lazy + (bad), S and E are + strict (good), P is + “primitive” (good), U(...) + is strict and “unpackable” (very good), and + A is absent (very good). + + For an “unpackable” + U(...) argument, the info inside tells + the strictness of its components. So, if the argument is a + pair, and it says U(AU(LSS)), that + means “the first component of the pair isn't used; the + second component is itself unpackable, with three components + (lazy in the first, strict in the second \& + third).” + + If the function isn't exported, just compile with the + extra flag ; next to the + signature for any binder, it will print the self-same + pragmatic information as would be put in an interface file. + (Besides, Core syntax is fun to look at!) + + + + + Force key functions to be INLINEd (esp. monads): + + Placing INLINE pragmas on certain + functions that are used a lot can have a dramatic effect. + See . + + + + + Explicit export list: + + If you do not have an explicit export list in a + module, GHC must assume that everything in that module will + be exported. This has various pessimising effects. 
For + example, if a bit of code is actually + unused (perhaps because of unfolding + effects), GHC will not be able to throw it away, because it + is exported and some other module may be relying on its + existence. + + GHC can be quite a bit more aggressive with pieces of + code if it knows they are not exported. + + + + + Look at the Core syntax! + + (The form in which GHC manipulates your code.) Just + run your compilation with + (don't forget the ). + + If profiling has pointed the finger at particular + functions, look at their Core code. lets + are bad, cases are good, dictionaries + (d.<Class>.<Unique>) [or + anything overloading-ish] are bad, nested lambdas are + bad, explicit data constructors are good, primitive + operations (e.g., eqInt#) are + good,… + + + + + Use strictness annotations: + + Putting a strictness annotation ('!') on a constructor + field helps in two ways: it adds strictness to the program, + which gives the strictness analyser more to work with, and + it might help to reduce space leaks. + + It can also help in a third way: when used with + (see ), a strict field can be unpacked or + unboxed in the constructor, and one or more levels of + indirection may be removed. Unpacking only happens for + single-constructor datatypes (Int is a + good candidate, for example). + + Using is only + really a good idea in conjunction with , + because otherwise the extra packing and unpacking won't be + optimised away. In fact, it is possible that + may worsen + performance even with + , but this is unlikely (let us know if it + happens to you). + + + + + Use unboxed types (a GHC extension): + + When you are really desperate for + speed, and you want to get right down to the “raw + bits.” Please see for + some information about using unboxed types. + + Before resorting to explicit unboxed types, try using + strict constructor fields and + first (see above). + That way, your code stays portable. 
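As a sketch of the strict-fields advice above (the type and function names are hypothetical; the point is the '!' annotations):

```haskell
-- Strict fields: each '!' forces the coordinate when a Point is
-- built, which gives the strictness analyser more to work with.
-- Combined with the unpacking option discussed above, the Ints can
-- be stored unboxed in the constructor, removing an indirection.
data Point = Point !Int !Int

addPoints :: Point -> Point -> Point
addPoints (Point x1 y1) (Point x2 y2) = Point (x1 + x2) (y1 + y2)

main :: IO ()
main = let Point x y = addPoints (Point 1 2) (Point 3 4)
       in print (x, y)
```

This stays portable Haskell, as the paragraph above recommends, while still letting the optimiser remove the boxing.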
+ + + + + Use foreign import (a GHC extension) to plug into fast libraries: + + This may take real work, but… There exist piles + of massively-tuned library code, and the best thing is not + to compete with it, but link with it. + + describes the foreign function + interface. + + + + + Don't use Floats: + + If you're using Complex, definitely + use Complex Double rather than + Complex Float (the former is specialised + heavily, but the latter isn't). + + Floats (probably 32-bits) are + almost always a bad idea, anyway, unless you Really Know + What You Are Doing. Use Doubles. + There's rarely a speed disadvantage—modern machines + will use the same floating-point unit for both. With + Doubles, you are much less likely to hang + yourself with numerical errors. + + One time when Float might be a good + idea is if you have a lot of them, say + a giant array of Floats. They take up + half the space in the heap compared to + Doubles. However, this isn't true on a + 64-bit machine. + + + + + Use unboxed arrays (UArray) + + GHC supports arrays of unboxed elements, for several + basic arithmetic element types including + Int and Char: see the + Data.Array.Unboxed library for details. + These arrays are likely to be much faster than using + standard Haskell 98 arrays from the + Data.Array library. + + + + + Use a bigger heap! + + If your program's GC stats + (-S RTS + option RTS option) indicate that it's + doing lots of garbage-collection (say, more than 20% + of execution time), more memory might help—with the + -M<size> + RTS option or + -A<size> + RTS option RTS options (see ). + + This is especially important if your program uses a + lot of mutable arrays of pointers or mutable variables + (i.e. STArray, + IOArray, STRef and + IORef, but not UArray, + STUArray or IOUArray). + GHC's garbage collector currently scans these objects on + every collection, so your program won't benefit from + generational GC in the normal way if you use lots of + these. 
Increasing the heap size to reduce the number of + collections will probably help. + + + + + + + +Smaller: producing a program that is smaller + + + +smaller programs, how to produce + + + +Decrease the “go-for-it” threshold for unfolding smallish +expressions. Give a +-funfolding-use-threshold0 +option option for the extreme case. (“Only unfoldings with +zero cost should proceed.”) Warning: except in certain specialised +cases (like Happy parsers) this is likely to actually +increase the size of your program, because unfolding +generally enables extra simplifying optimisations to be performed. + + + +Avoid Read. + + + +Use strip on your executables. + + + + + +Thriftier: producing a program that gobbles less heap space + + + +memory, using less heap +space-leaks, avoiding +heap space, using less + + + +“I think I have a space leak…” Re-run your program +with , and remove all doubt! (You'll +see the heap usage get bigger and bigger…) +[Hmmm…this might be even easier with the + RTS option; so… ./a.out +RTS +-Sstderr -G1...] +-G RTS option +-Sstderr RTS option + + + +Once again, the profiling facilities () are +the basic tool for demystifying the space behaviour of your program. + + + +Strict functions are good for space usage, as they are for time, as +discussed in the previous section. Strict functions get right down to +business, rather than filling up the heap with closures (the system's +notes to itself about how to evaluate something, should it eventually +be required). + + + + + + +