GHC's uses a many optimsations and GHC specific techniques (unboxed values, RULES pragmas, and so on) to make the heavily used Prelude code as fast as possible.
What's this "lazy" thing. Well, pseq is a seq for a parallel setting. We really mean "evaluate a, then b". But if the strictness analyser sees that pseq is strict in b, then b might be evaluated before a, which is all wrong.pseq a b = a `seq` lazy b
Solution: wrap the 'b' in a call to GHC.Base.lazy. This function is just the identity function, except that it's put into the built-in environment in MkId.lhs. That is, the MkId.lhs defn over-rides the inlining and strictness information that comes in from GHC.Base.hi. And that makes lazy look lazy, and have no inlining. So the strictness analyser gets no traction.
In the worker/wrapper phase, after strictness analysis, lazy is "manually" inlined (see WorkWrap.lhs), so we get all the efficiency back.
This supersedes an earlier scheme involving an even grosser hack in which par# and seq# returned an Int#. Now there is no seq# operator at all.
There is a lot of magic in PrelBase.lhs
-
among other things, the RULES
pragmas implementing the fold/build
optimisation. The code for map
is
a good example for how it all works. In the prelude code for version
5.03 it reads as follows:
map :: (a -> b) -> [a] -> [b] map _ [] = [] map f (x:xs) = f x : map f xs -- Note eta expanded mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst {-# INLINE [0] mapFB #-} mapFB c f x ys = c (f x) ys {-# RULES "map" [~1] forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs) "mapList" [1] forall f. foldr (mapFB (:) f) [] = map f "mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g) #-}
Up to (but not including) phase 1, we use the "map"
rule to
rewrite all saturated applications of map
with its
build/fold form, hoping for fusion to happen. In phase 1 and 0, we
switch off that rule, inline build, and switch on the
"mapList"
rule, which rewrites the foldr/mapFB thing back
into plain map.
It's important that these two rules aren't both active at once (along with build's unfolding) else we'd get an infinite loop in the rules. Hence the activation control using explicit phase numbers.
The "mapFB" rule optimises compositions of map.
The mechanism as described above is new in 5.03 since January 2002,
where the [~
N]
syntax for phase number
annotations at rules was introduced. Before that the whole arrangement
was more complicated, as the corresponding prelude code for version
4.08.1 shows:
map :: (a -> b) -> [a] -> [b] map = mapList -- Note eta expanded mapFB :: (elt -> lst -> lst) -> (a -> elt) -> a -> lst -> lst mapFB c f x ys = c (f x) ys mapList :: (a -> b) -> [a] -> [b] mapList _ [] = [] mapList f (x:xs) = f x : mapList f xs {-# RULES "map" forall f xs. map f xs = build (\c n -> foldr (mapFB c f) n xs) "mapFB" forall c f g. mapFB (mapFB c f) g = mapFB c (f.g) "mapList" forall f. foldr (mapFB (:) f) [] = mapList f #-}
This code is structured as it is, because the "map" rule first
breaks the map open, which exposes it to the various
foldr/build rules, and if no foldr/build rule matches, the "mapList"
rule closes it again in a later phase of optimisation - after
build was inlined. As a consequence, the whole thing depends a bit on
the timing of the various optimsations (the map might be closed again
before any of the foldr/build rules fires). To make the timing
deterministic, build
gets a {-# INLINE 2 build
#-}
pragma, which delays build
's inlining, and thus,
the closing of the map. [NB: Phase numbering was forward at that time.]
Last modified: Mon Feb 11 20:00:49 EST 2002