+++ /dev/null
- ------------------------------------
- GHCI hacking
- ------------------------------------
-
-* Don't forget to put deferred-type-decls back into RnIfaces
-
-* Do we want to record a package name in a .hi file?
- Does pi_mod have a ModuleName or a Module?
-
- ------------------------------------
- Mainly FunDeps (23 Jan 01)
- ------------------------------------
-
-This commit re-engineers the handling of functional dependencies.
-A functional dependency is no longer an Inst; instead, the necessary
-dependencies are snaffled out of their Class when necessary.
-
-As part of this exercise I found that I had to re-work how to do generalisation
-in a binding group. There is rather exhaustive documentation on the new Plan
-at the top of TcSimplify.
-
- ******************
- WARNING: I have compiled all the libraries with this new compiler
- and all looks well, but I have not run many programs.
- Things may break. Let me know if so.
- ******************
-
-The main changes are these:
-
-1. typecheck/TcBinds and TcSimplify have a lot of changes due to the
- new generalisation and context reduction story. There are extensive
- comments at the start of TcSimplify
-
-2. typecheck/TcImprove is removed altogether. Instead, improvement is
- interleaved with context reduction (until a fixpoint is reached).
- All this is done in TcSimplify.
-
-3. types/FunDeps has new exports
- * 'improve' does improvement, returning a list of equations
- * 'grow' and 'oclose' close a list of type variables wrt a set of
- PredTypes, but in slightly different ways. Comments in file.
-
-4. I improved the way in which we check that main::IO t. It's tidier now.
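As a rough illustration of what "improvement" means for item 3, here is a minimal source-level sketch. The class `Container` and its instance are hypothetical and are not taken from the compiler; the point is only that the dependency `c -> e` lets the type checker derive the equation `e ~ Int` from the instance, so the unannotated literals below need no type signature.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}

-- Hypothetical class: the container type c determines its element type e.
class Container c e | c -> e where
  empty  :: c
  insert :: e -> c -> c

-- With this instance in scope, knowing c = [Int] forces e = Int.
instance Container [Int] Int where
  empty  = []
  insert = (:)

-- The literals 1 and 2 carry no annotation. The wanted constraint is
-- (Container [Int] e, Num e); the dependency improves e to Int.
example :: [Int]
example = insert 1 (insert 2 empty)
```

Without the dependency, `e` would be ambiguous in `example` and the definition would be rejected.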
-
-In addition
-
-* typecheck/TcMatches:
- a) Tidy up, introducing a common function tcCheckExistentialPat
-
- b) Improve the typechecking of parallel list comprehensions,
- which wasn't quite right before. (see comments with tcStmts)
-
- WARNING: (b) is untested! Jeff, you might want to check.
-
-* Numerous other incidental changes in the typechecker
-
-* Manuel found that rules don't fire well when you have partial applications
- from overloading. For example, we may get
-
- f a (d::Ord a) = let m_g = g a d
- in
- \y :: a -> ...(m_g (h y))...
-
- The 'method' m_g doesn't get inlined because (g a d) might be a redex.
- Yet a rule that looks like
- g a d (h y) = ...
- won't fire because the saturated call never shows up. One way out would be to make
- the rule matcher a bit less paranoid about duplicating work, but instead
- I've added a flag
- -fno-method-sharing
- which controls whether we generate things like m_g in the first place.
- It's not clear that they are a win in the first place.
-
- The flag is actually consulted in Inst.tcInstId
-
-
-
- ------------------------------------
- Mainly PredTypes (28 Sept 00)
- ------------------------------------
-
-Three things in this commit:
-
- 1. Main thing: tidy up PredTypes
- 2. Move all Keys into PrelNames
- 3. Check for unboxed tuples in function args
-
-1. Tidy up PredTypes
-~~~~~~~~~~~~~~~~~~~~
-The main thing in this commit is to modify the representation of Types
-so that they are a (much) better fit for the qualified-type world. This
-should simplify Jeff's life as he proceeds with implicit parameters
-and functional dependencies. In particular, PredType, introduced by
-Jeff, is now blessed and dignified with a place in TypeRep.lhs:
-
- data PredType = Class Class [Type]
- | IParam Name Type
-
-Consider these examples:
- f :: (Eq a) => a -> Int
- g :: (?x :: Int -> Int) => a -> Int
- h :: (r\l) => {r} => {l::Int | r}
-
-Here the "Eq a" and "?x :: Int -> Int" and "r\l" are all called
-*predicates*, and are represented by a PredType. (We don't support
-TREX records yet, but the setup is designed to expand to allow them.)
-
-In addition, Type gains an extra constructor:
-
- data Type = .... | PredTy PredType
-
-so that PredType is injected directly into Type. So the type
- p => t
-is represented by
- PredTy p `FunTy` t
-
-I have deleted the hackish IPNote stuff; predicates are dealt with entirely
-through PredTys, not through NoteTy at all.
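The representation described in this section can be modelled with a small stand-alone sketch. It is a toy, not the contents of TypeRep.lhs: `Class` and `Name` are plain strings here, the `TyVarTy` and `TyConTy` constructors are simplified stand-ins, and `splitPredTys` is a hypothetical helper added only to show how predicates sit to the left of `FunTy`.

```haskell
-- Stand-ins for the compiler's real Class and Name types (assumption).
type Class = String
type Name  = String

data PredType = Class Class [Type]   -- e.g. Eq a
              | IParam Name Type     -- e.g. ?x :: Int -> Int
              deriving (Eq, Show)

data Type = TyVarTy Name
          | TyConTy Name
          | FunTy Type Type
          | PredTy PredType          -- a predicate injected into Type
          deriving (Eq, Show)

-- f :: (Eq a) => a -> Int, encoded as  PredTy p `FunTy` t
fTy :: Type
fTy = PredTy (Class "Eq" [TyVarTy "a"])
        `FunTy` (TyVarTy "a" `FunTy` TyConTy "Int")

-- Hypothetical helper: peel leading predicates off a type.
splitPredTys :: Type -> ([PredType], Type)
splitPredTys (FunTy (PredTy p) t) =
  let (ps, body) = splitPredTys t in (p : ps, body)
splitPredTys t = ([], t)
```

Because a predicate is just another argument of `FunTy`, code that walks types needs no separate case for qualified types beyond `PredTy`.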
-
-
-2. Move Keys into PrelNames
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-This is just a housekeeping operation. I've moved all the pre-assigned Uniques
-(aka Keys) from Unique.lhs into PrelNames.lhs. I've also moved knowKeyRdrNames
-from PrelInfo down into PrelNames. This localises in PrelNames lots of stuff
-about predefined names. Previously one had to alter three files to add a
-predefined name; now only one.
-
-3. Unboxed tuples
-~~~~~~~~~~~~~~~~~~
-Add a static check for unboxed tuple arguments. E.g.
- data T = T (# Int, Int #)
-is illegal
-
-
-
- ---------------------------------------
- Update in place
- ---------------------------------------
-
--funfolding-update-in-place
-Switching it on doesn't affect many programs, except those listed below.
-The sphere result is because it makes a critical function (vecsub) more inlinable.
-
- sphere 66465k -20.61%
- infer 13390k +1.27%
- parstof 1461k +1.18%
- fluid 3442k +1.61%
- atom 177163k +13.20%
- bspt 4837k +4.85%
- cichelli 33546k +2.69%
- typecheck 146023k +1.47%
-
-
- ---------------------------------------
- Simon's tuning changes: early Sept 2000
- ---------------------------------------
-
-Library changes
-~~~~~~~~~~~~~~~
-* Eta expand PrelShow.showLitChar. It's impossible to compile this well,
- and it makes a big difference to some programs (e.g. gen_regexps)
-
-* Make PrelList.concat into a good producer (in the foldr/build sense)
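For readers unfamiliar with the foldr/build terminology, the following is a minimal sketch of what a "good producer" version of concat can look like. It is not the PrelList definition: `build'` is a local stand-in for the library's `build`, and `concatB` and `sumConcat` are hypothetical names used for illustration.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Local stand-in for 'build', so the sketch is self-contained.
build' :: (forall b. (a -> b -> b) -> b -> b) -> [a]
build' g = g (:) []

-- concat as a producer: the result is abstracted over its cons and nil.
concatB :: [[a]] -> [a]
concatB xss = build' (\c n -> foldr (\xs rest -> foldr c rest xs) n xss)

-- A foldr-based consumer of the produced list.
sumConcat :: [[Int]] -> Int
sumConcat = foldr (+) 0 . concatB
```

With the real `build` and the foldr/build rewrite rule, a consumer such as `sumConcat` could be fused with the producer so that the intermediate list is never allocated; the stand-in here only demonstrates the shape of the definition.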
-
-
-Flag changes
-~~~~~~~~~~~~
-* Add -ddump-hi-diffs to print out changes in interface files. Useful
- when watching what the compiler is doing
-
-* Add -funfolding-update-in-place to enable the experimental optimisation
- that makes the inliner a bit keener to inline if it's in the RHS of
- a thunk that might be updated in place. Sometimes this is a bad idea
- (one example is in spectral/sphere; see notes in nofib/Simon-nofib-notes)
-
-
-Tuning things
-~~~~~~~~~~~~~
-* Fix a bug in SetLevels.lvlMFE. (change ctxt_lvl to dest_level)
- I don't think this has any performance effect, but it saves making
- a redundant let-binding that is later eliminated.
-
-* Desugar.dsProgram and DsForeign
- Glom together all the bindings into a single Rec. Previously the
- bindings generated by 'foreign' declarations were not glommed together, but
- this led to an infelicity (i.e. poorer code than necessary) in the modules
- that actually declare Float and Double (explained a bit more in Desugar.dsProgram)
-
-* OccurAnal.shortMeOut and IdInfo.shortableIdInfo
- Don't do the occurrence analyser's shorting out stuff for things which
- have rules. Comments near IdInfo.shortableIdInfo.
- This is deeply boring, and mainly to do with making rules work well.
- Maybe rules should have phases attached too....
-
-* CprAnalyse.addIdCprInfo
- Be a bit more willing to add CPR information to thunks;
- in particular, if the strictness analyser has just discovered that this
- is a strict let, then the let-to-case transform will happen, and CPR is fine.
- This made a big difference to PrelBase.modInt, which had something like
- modInt = \ x -> let r = ... -> I# v in
- ...body strict in r...
- r's RHS isn't a value yet; but modInt returns r in various branches, so
- if r doesn't have the CPR property then neither does modInt
-
-* MkId.mkDataConWrapId
- Arrange that vanilla constructors, like (:) and I#, get unfoldings that are
- just a simple variable $w:, $wI#. This ensures they'll be inlined even into
- rules etc, which makes matching a bit more reliable. The downside is that in
- situations like (map (:) xs), we'll end up with (map (\y ys. $w: y ys) xs).
- Which is tiresome but it doesn't happen much.
-
-* SaAbsInt.findStrictness
- Deal with the case where a thing with no arguments is bottom. This is Good.
- E.g. module M where { foo = error "help" }
- Suppose we have in another module
- case M.foo of ...
- Then we'd like to do the case-of-error transform, without inlining foo.
-
-
-Tidying up things
-~~~~~~~~~~~~~~~~~
-* Reorganised Simplify.completeBinding (again).
-
-* Removed the is_bot field in CoreUnfolding (is_cheap is true if is_bot is!)
- This is just a tidy up
-
-* HsDecls and others
- Remove the NewCon constructor from ConDecl. It just added code, and nothing else.
- And it led to a bug in MkIface, which thought that a newtype decl was always changing!
-
-* IdInfo and many others
- Remove all vestiges of UpdateInfo (hasn't been used for years)
-
- ------------------------------
- Join points Sept 2000
- ------------------------------
-
-With Andrew Kennedy, I found out why a few of the join points introduced by
-the simplifier end up as *not* let-no-escaped. Here's an example:
-
-f x y = case (pwr x b) == 1 of
- False -> False
- True -> pwr x c == 1
-
-This compiles to:
- f = \ @ t w :: Integer ->
- let {
- $j :: (State# RealWorld -> Bool)
- P
- $j
- = \ w1 :: (State# RealWorld) ->
- case pwr w c of wild {
- S# i -> case i of wild1 { 1 -> $wTrue; __DEFAULT -> $wFalse };
- J# s d1 ->
- case cmpIntegerInt# s d1 1 of wild2 {
- 0 -> $wTrue; __DEFAULT -> $wFalse
- }
- }
- } in
- case pwr w b of wild {
- S# i ->
- case i of wild1 { 1 -> $j realWorld#; __DEFAULT -> $wFalse };
- J# s d1 ->
- case cmpIntegerInt# s d1 1 of wild2 {
- 0 -> $j realWorld#; __DEFAULT -> $wFalse
- }
- }
-
-Now consider
-
- case (f x) of
- True -> False
- False -> True
-
-Suppose f is inlined into this case. No new join points are introduced,
-because the alternatives are both small. But the consumer
- case [.] of {True -> False; False -> True}
-will move into the body of f, be duplicated 4 ways, and end up consuming
-the result of the four outcomes at the body of f. This yields:
- $j :: (State# RealWorld -> Bool)
- P
- $j
- = \ w1 :: (State# RealWorld) ->
- case pwr w c of wild {
- S# i -> case i of wild1 { 1 -> $wTrue; __DEFAULT -> $wFalse };
- J# s d1 ->
- case cmpIntegerInt# s d1 1 of wild2 {
- 0 -> $wTrue; __DEFAULT -> $wFalse
- }
- }
- } in
- case pwr w b of wild {
- S# i ->
- case i of wild1 { 1 -> case $j realWorld# of {T->F; F->T}
- ; __DEFAULT -> $wTrue };
- J# s d1 ->
- case cmpIntegerInt# s d1 1 of wild2 {
- 0 -> case $j realWorld# of {T->F; F->T}
- ; __DEFAULT -> $wTrue
- }
- }
-
-And, voila, the join point $j isn't let-no-escaped any more.
-The point is that the consuming context can't "see inside" the join point.
-It's a phase ordering thing. If f is inlined before the join points
-are built in the first place, then all is well.
-
-
-
- -----------------------------
- Sept 7 2000
- -----------------------------
-
-* Make the simplifier's Stop continuation record whether the expression being
- simplified is the RHS of a thunk, or (say) the body of a lambda or case RHS.
- In the thunk case we want to be a bit keener about inlining if the type of
- the thunk is amenable to update in place.
-
-* SetLevels was being a bit too eager to float things to the top
- level; e.g. _inline_me_ (\a -> e); here e got floated...
- Easily fixed by a change to ltMajLvl
-
-* Make CoreUnfold.calcUnfoldingGuidance a bit less keen to make case expressions
- seem small. The original idea was to make inlined wrappers look small, so that
- when we inline a wrapper it doesn't make the call site (much) bigger.
- Otherwise we get nasty phase ordering stuff:
- -- f x = g x x
- -- h y = ...(f e)...
- If we inline g's wrapper, f looks big, and doesn't get inlined
- into h; if we inline f first, while it looks small, then g's
- wrapper will get inlined later anyway. To avoid this nasty
- ordering difference, we make (case a of (x,y) -> ...),
- *where a is one of the arguments* look free.
-
- BUT (a) It's too eager. We don't want to inline a wrapper into a
- context with no benefit.
- E.g. \ x. f (x+x) No point in inlining (+) here!
-
- (b) It's ineffective. Once g's wrapper is inlined, its case-expressions
- aren't scrutinising arguments any more
-
- So I've rescinded this idea for now. cases still look fairly small.
-
-* Fix interestingArg, which was being too liberal, and hence doing
- too much inlining.
-
-* Extended CoreUtils.exprIsCheap to make two more things cheap:
- - case (coerce x) of ...
- - let x = y +# z
- This makes a bit more eta expansion happen. It was provoked by
- a program of Marcin's.
-
-* The simplifier used to glom together all the top-level bindings into
- a single Rec every time it was invoked. The reason for this is explained
- in SimplCore.lhs, but for at least one simple program it meant that the
- simplifier never got around to unravelling the recursive group into
- non-recursive pieces. So I've put the glomming under explicit flag
- control with a -fglom-binds simplifier pass. A side benefit is
- that because it happens less often, the (expensive) SCC algorithm
- runs less often.
-
-* MkIface.ifaceBinds. Make sure that we emit rules for things
- (like class operations) that don't get a top-level binding in the
- interface file. Previously such rules were silently forgotten.
-
-* Move transformRhs to *after* simplification, which makes it a
- little easier to do, and means that the arity it computes is
- readily available to completeBinding. This gets much better
- arities.
-
-* Do coerce splitting in completeBinding. This gets good code for
- newtype CInt = CInt Int
-
- test:: CInt -> Int
- test x = case x of
- 1 -> 2
- 2 -> 4
- 3 -> 8
- 4 -> 16
- _ -> 0
-
-* Modify the meaning of "arity" so that during compilation it means
- "if you apply this function to fewer args, it will do virtually
- no work". So, for example
- f = coerce t (\x -> e)
- has arity at least 1. When a function is exported, its arity becomes
- the number of exposed, top-level lambdas, which is subtly different.
- But that's ok.
-
- I removed CoreUtils.exprArity altogether: it looked only at the exposed
- lambdas. Instead, we use exprEtaExpandArity exclusively.
-
- All of this makes I/O programs work much better.
-
-
- -----------------------------
- Sept 4 2000
- -----------------------------
-
-* PrimRep, TysPrim. Add PrimPtrRep as the representation for
- MVars and MutVars. Previously they were given PtrRep, but that
- crashed dataReturnConvPrim! Here's the program that killed it:
- data STRef s a = STRef (MutVar# s a)
- from (STRef x) = x
-
-* Make the desugarer use string equality for string literal
- patterns longer than 1 character. And put a specialised
- eqString into PrelBase, with a suitable specialisation rule.
- This makes a huge difference to the size of the code generated
- by deriving(Read) notably in Time.lhs
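A minimal sketch of the string-equality idea follows. The `eqString` below is written by hand and is not claimed to be the exact PrelBase definition or to include its specialisation rule; `classify` is a hypothetical function showing, at source level, roughly what matching on string literals by equality tests (rather than character-by-character pattern matching) amounts to.

```haskell
-- Monomorphic string equality, avoiding the overloaded (==) on lists.
eqString :: String -> String -> Bool
eqString []       []       = True
eqString (c1:cs1) (c2:cs2) = c1 == c2 && cs1 `eqString` cs2
eqString _        _        = False

-- Hypothetical function: each literal pattern becomes one equality test,
-- instead of a nested case on every character of every literal.
classify :: String -> Int
classify s
  | s `eqString` "=<" = 0
  | s `eqString` "="  = 1
  | otherwise         = 2
```

The code-size benefit comes from emitting one call per literal rather than a decision tree over characters, which matters for the long keyword lists produced by derived Read instances.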
-
- -----------------------------
- Marktoberdorf Commits (Aug 2000)
- -----------------------------
-
-1. Tidy up the renaming story for "system binders", such as
-dictionary functions, default methods, constructor workers etc. These
-are now documented in HsDecls. The main effect of the change, apart
-from tidying up, is to make the *type-checker* (instead of the
-renamer) generate names for dict-funs and default-methods. This is
-good because Sergei's generic-class stuff generates new classes at
-typecheck time.
-
-
-2. Fix the CSE pass so it does not require the no-shadowing invariant.
-Keith discovered that the simplifier occasionally returns a result
-with shadowing. After much fiddling around (which has improved the
-code in the simplifier a bit) I found that it is nearly impossible to
-arrange that it really does do no-shadowing. So I gave up and fixed
-the CSE pass (which is the only one to rely on it) instead.
-
-
-3. Fix a performance bug in the simplifier. The change is in
-SimplUtils.interestingArg. It computes whether an argument should
-be considered "interesting"; if a function is applied to an interesting
-argument, we are more likely to inline that function.
-Consider this case
- let x = 3 in f x
-The 'x' argument was considered "uninteresting" for a silly reason.
-Since x only occurs once, it was unconditionally substituted, but
-interestingArg didn't take account of that case. Now it does.
-
-I also made interestingArg a bit more liberal. Let's see if we
-get too much inlining now.
-
-
-4. In the occurrence analyser, we were choosing a bad loop breaker.
-Here's the comment that's now in OccurAnal.reOrderRec
-
- score ((bndr, rhs), _, _)
- | exprIsTrivial rhs = 3 -- Practically certain to be inlined
- -- Used to have also: && not (isExportedId bndr)
- -- But I found this sometimes cost an extra iteration when we have
- -- rec { d = (a,b); a = ...df...; b = ...df...; df = d }
- -- where df is the exported dictionary. Then df makes a really
- -- bad choice for loop breaker
-
-I also increased the score for bindings with a non-functional type, so that
-dictionaries have a better chance of getting inlined early
-
-
-5. Add a hash code to the InScopeSet (and make it properly abstract)
-This should make uniqAway a lot more robust. Simple experiments suggest
-that uniqAway no longer gets into the long iteration chains that it used
-to.
-
-
-6. Fix a bug in the inliner that made the simplifier tend to get into
-a loop where it would keep iterating ("4 iterations, bailing out" message).
-In SimplUtils.mkRhsTyLam we float bindings out past a big lambda, thus:
- x = /\ b -> let g = \x -> f x x
- in E
-becomes
- g* = /\a -> \x -> f x x
- x = /\ b -> let g = g* b in E
-
-It's essential that we don't simply inline g* back into the RHS of g,
-else we will be back to square 1. The inliner is meant not to do this
-because there's no benefit to the inlining, but the size calculation
-was a little off in CoreUnfold.
-
-
-7. In SetLevels we were bogus-ly building a Subst with an empty in-scope
-set, so a WARNING popped up when compiling some modules. (knights/ChessSetList
-was the example that tickled it.) Now in fact the warning wasn't an error,
-but the Right Thing to do is to carry down a proper Subst in SetLevels, so
-that is what I have now done. It is very little more expensive.
-
-
-
- ~~~~~~~~~~~~
- Apr/May 2000
- ~~~~~~~~~~~~
-
-This is a pretty big commit! It adds stuff I've been working on
-over the last month or so. DO NOT MERGE IT WITH 4.07!
-
-Recompilation checking
-~~~~~~~~~~~~~~~~~~~~~~
-Substantial improvement in recompilation checking. The version management
-is now entirely internal to GHC. ghc-iface.lprl is dead!
-
-The trick is to generate the new interface file in two steps:
- - first convert Types etc to HsTypes etc, and thereby
- build a new ParsedIface
- - then compare against the parsed (but not renamed) version of the old
- interface file
-Doing this meant adding code to convert *to* HsSyn things, and to
-compare HsSyn things for equality. That is the main tedious bit.
-
-Another improvement is that we now track version info for
-fixities and rules, which was missing before.
-
-
-Interface file reading
-~~~~~~~~~~~~~~~~~~~~~~
-Make interface file reading more robust.
- * If the old interface file is unreadable, don't fail. [bug fix]
-
- * If the old interface file mentions interfaces
- that are unreadable, don't fail. [bug fix]
-
- * When we can't find the interface file,
- print the directories we are looking in. [feature]
-
-
-Type signatures
-~~~~~~~~~~~~~~~
- * New flag -ddump-types to print type signatures
-
-
-Type pruning
-~~~~~~~~~~~~
-When importing
- data T = T1 A | T2 B | T3 C
-it seems excessive to import the types A, B, C as well, unless
-the constructors T1, T2 etc are used. A,B,C might be more types,
-and importing them may mean reading more interfaces, and so on.
- So the idea is that the renamer will just import the decl
- data T
-unless one of the constructors is used. This turns out to be quite
-easy to implement. The downside is that we must make sure the
-constructors are always available if they are really needed, so
-I regard this as an experimental feature.
-
-
-Eliminate ThinAir names
-~~~~~~~~~~~~~~~~~~~~~~~
-Eliminate ThinAir.lhs and all its works. It was always a hack, and now
-that the desugarer carries around an environment, I think we can nuke ThinAir
-altogether.
-
-As part of this, I had to move all the Prelude RdrName defns from PrelInfo
-to PrelMods --- so I renamed PrelMods as PrelNames.
-
-I also had to move the builtinRules so that they are injected by the renamer
-(rather than appearing out of the blue in SimplCore). This is if anything simpler.
-
-Miscellaneous
-~~~~~~~~~~~~~
-* Tidy up the data types involved in Rules
-
-* Eliminate RnEnv.better_provenance; use Name.hasBetterProv instead
-
-* Add Unique.hasKey :: Uniquable a => a -> Unique -> Bool
- It's useful in a lot of places
-
-* Fix a bug in interface file parsing for __U[!]
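The intent of `hasKey` can be shown with a small self-contained model. Everything except the signature of `hasKey` is an assumption made for illustration: the toy `Unique`, the `Uniquable` class with a single `getUnique` method, the `Name` record, and the key value `intTyConKey = Unique 1` are hypothetical and do not reproduce Unique.lhs.

```haskell
-- Toy model of pre-assigned uniques (assumption, not the compiler's type).
newtype Unique = Unique Int deriving (Eq, Show)

class Uniquable a where
  getUnique :: a -> Unique

-- Same signature as in the note: compare a thing's unique against a key.
hasKey :: Uniquable a => a -> Unique -> Bool
x `hasKey` k = getUnique x == k

-- Hypothetical named entity carrying a unique.
data Name = Name { nameUnique :: Unique, nameString :: String }

instance Uniquable Name where
  getUnique = nameUnique

-- Hypothetical pre-assigned key.
intTyConKey :: Unique
intTyConKey = Unique 1

isIntName :: Name -> Bool
isIntName n = n `hasKey` intTyConKey
```

Tests such as `isIntName` read more directly than spelling out `getUnique n == intTyConKey` at every use site, which is presumably why the helper is useful in many places.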
-
-
-=======================================
-To-do
-~~~~~
-* Try the effect of enhancing update in place with the CPR
- idea in CoreUnfold.calcUnfoldingGuidance
-
-* Check with Simon M re srt on Lit
-
-* Make all primops return a data type so that we can't over-apply a primop
- This makes code gen simpler. Currently the only primops with a polymorphic
- return type are:
- raise# :: a -> b
- catch# :: a -> (b->a) -> a
- tagToEnum# :: Int -> a
-
- Very strange code for PrelException.catchException! What has STret got
- to do with it?
-
-* Liberate case
-
-* Missing w/w for coerce in go2 functions of fibToList' in fibheaps
-
-* Watch out for re-boxing in workers; sometimes it happens
- and then w/w is a Bad Thing
-
-* Only two uses of mkCompulsoryUnfolding -- try to nuke it
-
-* Note that mkDupAlt makes alts that have binders that
- are guaranteed to appear just once or not at all
- (a,b) -> j a
- Same for case binder, but that's harder to take into account.
-
-* max :: Int -> Int -> Int could be CPRd but isn't.
-
-* In mandel2 we do a little less well than 4.04 because we aren't
- inlining point_colour, and that means we have to box up an argument
- before calling it. [This was due to a bug in 4.04]
- There's also a great opportunity for liberateCase
- in check_radius, where it loops around with two lazy F# built each time
-
-* In PrelShow.itos' we find a thunk like:
- tpl = case chrzh {(zpzh {(remIntzh {x{-aMf-} 10}) 48})}
- of tpl{-X1j-} __D P { __DEFAULT ->
- PrelBase.Czh{-62,s-} {tpl{-X1j-}}
- }
- This is a pity. The remInt# can't fail because the divisor isn't 0,
- so we could do the sum eagerly and allocate a character instead of a thunk.
-
-* It's good to do let-to-case before we wrap up. Consider
- f b xs = let ys = partition isUpper xs
- zs = case ys of (a,b) -> a
- in case b of
- True -> case ys of
- (a,b) -> (zs,[])
- False -> case ys of
- (a,b) -> (zs ++ xs,[])
- If we don't do let-to-case at all, we get 3 redundant case ys left.
- On the other hand we don't want to do it too early, because it
- prevents inlining into strict arg positions, which is important for
- rules to work.
-
-* Strict dictionaries.
-
-* INLINE functions are not always INLINEd, so it's sad to leave
- stuff in their bodies like constructors that haven't been inlined.
-
-* If let x = e in b is strict, then CPR can use the CPR info from x
- This bites in the mod method of Integral Int
-
-* Inline wrappers if they are the RHS of a let, so that update in place
- can happen?
-
-* Consider doing unboxing on strict constr args in a pattern match,
- as part of w/w.
-
-* In spectral/expert/Search.ask there's a statically visible CSE. Catching this
- depends almost entirely on chance, which is a pity.
-
-* Think about exprEtaExpandArity in WwLib. Perhaps eliminate eta expand in simplify?
- Perhaps use even if no coerces etc, just eta expansion. (e.g. PrelArr.done)
-
-* In knights/KnightHeuristic, we don't find that possibleMoves is strict
- (with important knock-on effects) unless we apply rules before floating
- out the literal list [A,B,C...].
- Similarly, in f_se (F_Cmp ...) in listcompr (but a smaller effect)
-
-* Floating can float the entire body of an INLINE thing out.
- e.g. PrelArr.done
- This is sad, and a bit stupid.
-
-* In spectral/multiplier, we have
- xor = lift21 forceBit f
- where f :: Bit -> Bit -> Bit
- f 0 0 = 0
- f 0 1 = 1
- f 1 0 = 1
- f 1 1 = 0
- Trouble is, f is CPR'd, and that means that instead of returning
- the constants I# 0, I# 1, it returns 0,1 and then boxes them.
- So allocation goes up. I don't see a way around this.
-
-* spectral/hartel/parstof ends up saying
- case (unpackCString "x") of { c:cs -> ... }
- quite a bit. We should spot these and behave accordingly.
-
-* Try a different hashing algorithm in hashUFM. This might reduce long CSE lists
- as well as making uniqAway faster.
-
-* [I'm not sure this is really important in the end.]
- Don't float out partial applications in lvlMFE. E.g. (in hPutStr defn of shoveString)
- \x -> case .. of
- [] -> setBufWPtr a b
- ...
- setBufWPtr has arity 3. Floating it out is plain silly. And in this particular
- case it's harmful, because it ends up preventing eta expansion on the \x.
- That in turn leads to a big extra cost in hPutStr.
-
- *** Try not doing lvlMFE on the body of a lambda and case alternative ***
-
-* PrelNumExtra.lhs we get three copies of dropTrailing0s. Too much inlining!
- drop0 has cost 21, but gets a discount of 6 (3 * #constrs) for its arg.
- With a keenness factor of 2, that makes a discount of 12. Add two for
- the arguments and we get 21-12-2, which is just small enough to inline.
- But that is plainly stupid.
-
- Add one for cases; and decrease discount for constructors.
-
-* IO.hGetContents still doesn't see that it is strict in the handle.
- Coerces still getting in the way.
-
-* Try not having really_interesting_cont (subsumed by changes in the
- way guidance is calculated for inline things?)
-
-* Enumeration types in worker/wrapper for strictness analysis
-
-* This should be reported as an error:
- data T k = MkT (k Int#)
-
-* Bogus report of overlapped pattern for
- f (R {field = [c]}) = 1
- f (R {}) = 2
- This shows up for TyCon.maybeTyConSingleCon
-
-* > module Main( main ) where
-
- > f :: String -> Int
- > f "=<" = 0
- > f "=" = 0
-
- > g :: [Char] -> Int
- > g ['=','<'] = 0
- > g ['='] = 0
-
- > main = return ()
-
- For ``f'' the following is reported.
-
- tmp.lhs:4:
- Pattern match(es) are overlapped in the definition of function `f'
- "=" = ...
-
- There are no complaints for definition for ``g''.
-
-* Without -O I don't think we need change the module version
- if the usages change; I forget why it changes even with -O
-
-* Record selectors for existential type; no good! What to do?
- Record update doesn't make sense either.
-
- Need to be careful when figuring out strictness, and when generating
- worker-wrapper split.
-
- Also when deriving.
-
-
- Jan 2000
- ~~~~~~~~
-
-A fairly big pile of work originally aimed at
-removing the Con form of Core expression, and replacing it with a simple
-Lit form. However, I wanted to make sure that the resulting thing
-performed better than the original, so I ended up making an absolute
-raft of other changes.
-
-Removing the Con form of Core expressions
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-The big thing is that
-
- For every constructor C there are now *two* Ids:
-
- C is the constructor's *wrapper*. It evaluates and unboxes arguments
- before calling $wC. It has a perfectly ordinary top-level defn
- in the module defining the data type.
-
- $wC is the constructor's *worker*. It is like a primop that simply
- allocates and builds the constructor value. Its arguments are the
- actual representation arguments of the constructor.
-
- For every primop P there is *one* Id, its (curried) Id
-
- Neither the constructor worker Id nor the primop Id has a definition anywhere.
- Instead they are saturated during the core-to-STG pass, and the code generator
- generates code for them directly. The STG language still has saturated
- primops and constructor applications.
-
-* The Const type disappears, along with Const.lhs. The literal part
- of Const.lhs reappears as Literal.lhs. Much tidying up in here,
- to bring all the range checking into this one module.
-
-* I got rid of NoRep literals entirely. They just seem to be too much trouble.
-
-* Because Con's don't exist any more, the funny C { args } syntax
- disappears from interface files.
-
-* Every constructor, C, comes with a
-
- *wrapper*, called C, whose type is exactly what it looks like
- in the source program. It is an ordinary function,
- and it gets a top-level binding like any other function
-
- *worker*, called $wC, which is the actual data constructor.
- Its type may be different to C, because:
- - useless dict args are dropped
- - strict args may be flattened
- It does not have a binding.
-
- The worker is very like a primop, in that it has no binding.
-
-
-Parsing
-~~~~~~~
-* Result type signatures now work
- f :: Int -> Int = \x -> x
- -- The Int->Int is the type of f
-
- g x y :: Int = x+y
- -- The Int is the type of the result of (g x y)
-
-
-Recompilation checking and make
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-* The .hi file for a module is not touched if it doesn't change. (It used to
- be touched regardless, forcing a chain of recompilations.) The penalty for this
- is that we record exported things just as if they were mentioned in the body of
- the module. And the penalty for that is that we may recompile a module when
- the only things that have changed are the things it is passing on without using.
- But it seems like a good trade.
-
-* -recomp is on by default
-
-Foreign declarations
-~~~~~~~~~~~~~~~~~~~~
-* If you say
- foreign export zoo :: Int -> IO Int
- then you get a C procedure called 'zoo', not 'zzoo' as before.
- I've also added a check that complains if you export (or import) a C
- procedure whose name isn't legal C.
-
-
-Code generation and labels
-~~~~~~~~~~~~~~~~~~~~~~~~~~
-* Now that constructor workers and wrappers have distinct names, there's
- no need to have a Foo_static_closure and a Foo_closure for constructor Foo.
- I nuked the entire StaticClosure story. This has effects in some of
- the RTS headers (i.e. s/static_closure/closure/g)
-
-
-Rules, constant folding
-~~~~~~~~~~~~~~~~~~~~~~~
-* Constant folding becomes just another rewrite rule, attached to the Id for the
- PrimOp. To achieve this, there's a new form of Rule, a BuiltinRule (see CoreSyn.lhs).
- The prelude rules are in prelude/PrelRules.lhs, while simplCore/ConFold.lhs has gone.
-
-* Appending of constant strings now works, using fold/build fusion, plus
- the rewrite rule
- unpack "foo" c (unpack "baz" c n) = unpack "foobaz" c n
- Implemented in PrelRules.lhs
-
-* The CCall primop is tidied up quite a bit. There is now a data type CCall,
- defined in PrimOp, that packages up the info needed for a particular CCall.
- There is a new Id for each new ccall, with a big "occurrence name"
- {__ccall "foo" gc Int# -> Int#}
- In interface files, this is parsed as a single Id, which is what it is, really.
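The string-append rewrite rule above can be checked against a simple model. The `unpack` below is an assumption: it is defined as a fold over an ordinary String so that the equation can be tested, whereas the real function works on packed literals and the rule is a built-in one in PrelRules.lhs. The names `lhs` and `rhs` are hypothetical.

```haskell
-- Model of 'unpack': fold the characters of a literal with c and n.
unpack :: String -> (Char -> b -> b) -> b -> b
unpack s c n = foldr c n s

-- The two sides of the rule
--   unpack "foo" c (unpack "baz" c n) = unpack "foobaz" c n
lhs, rhs :: (Char -> b -> b) -> b -> b
lhs c n = unpack "foo" c (unpack "baz" c n)
rhs c n = unpack "foobaz" c n
```

Under this model the two sides agree for any `c` and `n`, because folding over "foo" with the fold over "baz" as its tail is the same as folding over the concatenation.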
-
-Miscellaneous
-~~~~~~~~~~~~~
-* There were numerous places where the host compiler's
- minInt/maxInt was being used as the target machine's minInt/maxInt.
- I nuked all of these; everything is localised to inIntRange and inWordRange,
- in Literal.lhs
-
-* Desugaring record updates was broken: it didn't generate correct matches when
- used with records with fancy unboxing etc. It now uses matchWrapper.
-
-* Significant tidying up in codeGen/SMRep.lhs
-
-* Add __word, __word64, __int64 terminals to signal the obvious types
- in interface files. Add the ability to print word values in hex into
- C code.
-
-* PrimOp.lhs is no longer part of a loop. Remove PrimOp.hi-boot*
-
-
-Types
-~~~~~
-* isProductTyCon no longer returns False for recursive products, nor
- for unboxed products; you have to test for these separately.
- There's no reason not to do CPR for recursive product types, for example.
- Ditto splitProductType_maybe.
-
-Simplification
-~~~~~~~~~~~~~~~
-* New -fno-case-of-case flag for the simplifier. We use this in the first run
- of the simplifier, where it helps to stop messing up expressions that
- the (subsequent) full laziness pass would otherwise float out.
- It's much more effective than previous half-baked hacks in inlining.
-
- Actually, it turned out that there were three places in Simplify.lhs that
- needed to know about this flag.
-
-* Make the float-in pass push duplicatable bindings into the branches of
- a case expression, in the hope that we never have to allocate them.
- (see FloatIn.sepBindsByDropPoint)
-
-* Arrange that top-level bottoming Ids get a NOINLINE pragma
- This reduced gratuitous inlining of error messages.
- But arrange that such things still get w/w'd.
-
-* Arrange that a strict argument position is regarded as an 'interesting'
- context, so that if we see
- foldr k z (g x)
- then we'll be inclined to inline g; this can expose a build.
-
-* There was a missing case in CoreUtils.exprEtaExpandArity that meant
- we were missing some obvious cases for eta expansion
- Also improve the code when handling applications.
-
-* Make record selectors (identifiable by their IdFlavour) into "cheap" operations.
- [The change is a 2-liner in CoreUtils.exprIsCheap]
- This means that record selection may be inlined into function bodies, which
- greatly improves the arities of overloaded functions.
-
-* Make a cleaner job of inlining "lone variables". There was some distributed
- cunning, but I've centralised it all now in SimplUtils.analyseCont, which
- analyses the context of a call to decide whether it is "interesting".
-
-* Don't specialise very small functions in Specialise.specDefn
- It's better to inline it. Rather like the worker/wrapper case.
-
-* Be just a little more aggressive when floating out of let rhss.
- See comments with Simplify.wantToExpose
- A small change with an occasional big effect.
-
-* Make the inline-size computation think that
- case x of I# x -> ...
- is *free*.
-
-
-CPR analysis
-~~~~~~~~~~~~
-* Fix what was essentially a bug in CPR analysis. Consider
-
- letrec f x = let g y = let ... in f e1
-              in
-              if ... then (a,b) else g x
-
- g has the CPR property if f does; so when generating the final annotated
- RHS for f, we must use an envt in which f is bound to its final abstract
- value. This wasn't happening. Instead, f was given the CPR tag but g
- wasn't; but of course the w/w pass gives rotten results in that case!!
- (Because f's CPR-ness relied on g's.)
-
- On the way I tidied up the code in CprAnalyse. It's quite a bit shorter.
-
- The fact that some data constructors return a constructed product shows
- up in their CPR info (MkId.mkDataConId) not in CprAnalyse.lhs
-
-
-
-Strictness analysis and worker/wrapper
-~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-* BIG THING: pass in the demand to StrictAnal.saExpr. This affects situations
- like
- f (let x = e1 in (x,x))
- where f turns out to have strictness u(SS), say. In this case we can
- mark x as demanded, and use a case expression for it.
-
- The situation before was that we didn't "know" that there is the u(SS)
- demand on the argument, so we simply computed that the body of the let
- expression is lazy in x, and marked x as lazily-demanded. Then even after
- f was w/w'd we got
-
- let x = e1 in case (x,x) of (a,b) -> $wf a b
-
- and hence
-
- let x = e1 in $wf x x
-
- so x is still allocated as a thunk, even though $wf is strict in it.
-
- I found a much more complicated situation in spectral/sphere/Main.shade,
- which improved quite a bit with this change.
-
-* Moved the StrictnessInfo type from IdInfo to Demand. It's the logical
- place for it, and helps avoid module loops
-
-* Do worker/wrapper for coerces even if the arity is zero. Thus:
- stdout = coerce Handle (..blurg..)
- ==>
- wibble = (...blurg...)
- stdout = coerce Handle wibble
- This is good because I found places where we were saying
- case coerce t stdout of { MVar a ->
- ...
- case coerce t stdout of { MVar b ->
- ...
- and the redundant case wasn't getting eliminated because of the coerce.
-
-
-
-End December
-~~~~~~~~~~~~
-* Fix a few renamer bugs
-
-* Substantially reorganise the Prelude to eliminate all orphan declarations.
- Details in PrelBase.lhs
-
-* Do a much better job of appending literal strings
- - remove NoRepStr
- - move unpackCString stuff to PrelBase
- - add BuiltinRules to the Rule type
- - add fold/build rules for literal strings
-
-
-
-Week of Mon 25 Oct
-~~~~~~~~~~~~~~~~~~
-* Fix a terrible bug in Simplify.mkDupableAlt; we were duplicating a small
- *InAlt*, but doing so invalidated occurrence info, which could lead to
- substantial code duplication.
-
-* Fix a bug in WwLib.mkWWcpr; I was generating CPR wrappers like
- I# (case x of ...)
- which is utterly wrong. It should be
- case x of ...(I# r)
- (The effect was to make functions stricter than they really are.)
-
-* Try doing no inlining at all in phase 0. This noticeably improved
- spectral/fish (esp Main.hs I think), by improving floating.
- This single change has quite a large effect on some programs (allocation)
-
-             Don't inline   Don't inline
-             wrappers       anything
-             in phase 0     in phase 0
- awards      113k           -7.08%
- cichelli    28962k         -3.12%
- wave4main   88089k         +130.45%
- fibheaps    31731k         +19.01%
- fish        8273k          -1.64%
- typecheck   148713k        +4.91%
-
- But I found that fish worked much better if we inline *local* things
- in phase 0, but not *imported* things.
-
-* Fix a terrible bug in Simplify.mkLamBndrZapper. It was counting
- type args in one place, but not type binders, so it was sometimes
- inlining into unsaturated lambdas!
-
-* I found that there were some very bad loss-of-arity cases in PrelShow.
- In particular, we had:
-
- showl ""       = showChar '"'
- showl ('"':xs) = showString "\\\"" . showl xs
- showl (x:xs)   = showLitChar x . showl xs
-
- Trouble is, we get
- showl = \xs -> case xs of
-           ...
-           (x:xs) -> let f = showLitChar x
-                         g = showl xs
-                     in \s -> f (g s)
- which is TERRIBLE. We can't spot that showLitChar has arity 2 because
- it looks like this:
-
- ...other eqns...
- showLitChar c = showString ('\\' : asciiTab!!ord c)
-
- Notice that the (asciiTab!!ord c) is outside the \s, so GHC can't rewrite it to
-
- showLitChar c = \s -> showString ('\\' : asciiTab!!ord c) s
-
- So I've changed PrelShow.showLitChar to use explicit \s. Even then, showl
- doesn't work, because GHC can't see that showl xs can be pushed inside the \s.
- So I've put an explicit \s there too.
-
- showl "" s = showChar '"' s
- showl ('"':xs) s = showString "\\\"" (showl xs s)
- showl (x:xs) s = showLitChar x (showl xs s)
-
- Net result: imaginary/gen_regexps more than halves in allocation!
-
- Turns out that the mkLamBndrZapper bug (above) meant that showl was
- erroneously inlining showLitChar x and showl xs, which is why this
- problem hasn't shown up before.
-
-* Improve CSE a bit. In particular, in
- case h x of y -> ...(h x)...
- CSE now replaces the inner (h x) by y.
-
-* Inline INLINE things very aggressively, even though we get code duplication
- thereby. Reason: otherwise we sometimes call the original un-inlined INLINE
- defns, which have constructors etc still un-inlined in their RHSs. The
- improvement is dramatic for a few programs:
-
- typecheck   150865k        -1.43%
- wave4main   114216k        -22.87%
- boyer       28793k         -7.86%
- cichelli    33786k         -14.28%
- ida         59505k         -1.79%
- rewrite     14665k         -4.91%
- sched       17641k         -4.22%
-
- Code size increases by 10% which is not so good. There must be a better way.
- Another bad thing showed up in fish/Main.hs. Here we have
- (x1,y1) `vec_add` (x2,y2) = (x1+x2, y1+y2)
- which tends to get inlined. But if we first inline (+), it looks big,
- so we don't inline it. Sigh.
-
-
-* Don't inline constructors in INLINE RHSs. Ever. Otherwise rules don't match.
- E.g. build
-
-* In ebnf2ps/Lexer.uncommentString, it would be a good idea to inline a constructor
- that occurs once in each branch of a case. That way it doesn't get allocated
- in the branches that don't use it. And in fact in this particular case
- something else good happens. So CoreUnfold now does that.
-
-* Reverted to n_val_binders+2 in calcUnfoldingGuidance
- Otherwise wrappers are inlined even if there's no benefit.
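-
-The showl rewrite described earlier in this week's notes changes only how the
-arity is exposed, not what the function computes. A minimal sketch that
-compares the two shapes, assuming showLitChar from Data.Char in place of the
-PrelShow one; the names showlCompose and showlExplicit are made up for the
-comparison.
-
- ```haskell
- import Data.Char (showLitChar)
-
- -- Composition style: the \s is hidden inside (.), so the arity looks like 1.
- showlCompose :: String -> ShowS
- showlCompose ""       = showChar '"'
- showlCompose ('"':xs) = showString "\\\"" . showlCompose xs
- showlCompose (x:xs)   = showLitChar x . showlCompose xs
-
- -- Explicit style: the extra argument s makes arity 2 syntactically visible.
- showlExplicit :: String -> ShowS
- showlExplicit ""       s = showChar '"' s
- showlExplicit ('"':xs) s = showString "\\\"" (showlExplicit xs s)
- showlExplicit (x:xs)   s = showLitChar x (showlExplicit xs s)
-
- main :: IO ()
- main = do
-   let inputs = ["", "abc", "say \"hi\"", "tab\there"]
-   -- Both versions should agree on every input.
-   print (all (\i -> showlCompose i "" == showlExplicit i "") inputs)
- ```
-
-The two definitions agree on their results; the difference that matters to
-the compiler is that the second never builds the intermediate closures.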
-
-
-Week of Mon 18 Oct
-~~~~~~~~~~~~~~~~~~
-* Arrange that simplConArgs works in one less pass than before.
- This exposed a bug: a bogus call to completeBeta.
-
-* Add a top-level flag in CoreUnfolding, used in callSiteInline
-
-* Extend w/w to use etaExpandArity, so it does eta/coerce expansion
-
-* Don't float anything out of an INLINE.
- Don't float things to top level unless they also escape a value lambda.
- [see comments with SetLevels.lvlMFE]
- Without at least one of these changes, I found that
- {-# INLINE concat #-}
- concat = __inline (/\a -> foldr (++) [])
- was getting floated to
- concat = __inline( /\a -> lvl a )
- lvl = ...inlined version of foldr...
-
- Subsequently I found that not floating constants out of an INLINE
- gave really bad code like
- __inline (let x = e in \y -> ...)
- so I now let things float out of INLINE
-
-* Implement inline phases. The meaning of the inline pragmas is
- described in CoreUnfold.lhs
-
-* Implement the "reverse-mapping" idea for CSE; actually it turned out to be easier
- to implement it in SetLevels, and may benefit full laziness too.
-
-Thurs 14 Oct
-~~~~~~~~~~~~
-* It's a good idea to inline inRange. Consider
-
- index (l,h) i = case inRange (l,h) i of
-                   True  -> l+i
-                   False -> error
- inRange itself isn't strict in h, but if it's inlined then 'index'
- *does* become strict in h. Interesting!
-
-* Big change to the way unfoldings and occurrence info are propagated in the simplifier.
- The plan is described in Subst.lhs with the Subst type.
- Occurrence info is now in a separate IdInfo field from user pragmas.
-
-* I found that
- (coerce T (coerce S (\x.e))) y
- didn't simplify in one round. First we get to
- (\x.e) y
- and only then do the beta. Solution: cancel the coerces in the continuation
-
-* Amazingly, CoreUnfold wasn't counting the cost of the function in an application.
-
-Early Oct
-~~~~~~~~~
-* No commas between for-alls in RULES
-
-* Disable rules in initial simplifier run. Otherwise full laziness
- doesn't get a chance to lift out a MFE before a rule (e.g. fusion)
- zaps it. queens is a case in point
-
-* Improve float-out stuff significantly. The big change is that if we have
-
- \x -> ... /\a -> ...let p = ..a.. in let q = ...p...
-
- where p's rhs doesn't mention x, we abstract a from p, so that we can get p past x.
- (We did that before.) But we also substitute (p a) for p in q, and then
- we can do the same thing for q. (We didn't do that, so q got stuck.)
- This is much better. It involves doing a substitution "as we go" in SetLevels,
- though.
-
-
-Weds 15 Sept
-~~~~~~~~~~~~
-* exprIsDupable for an application (f e1 .. en) wasn't calling exprIsDupable
- on the arguments!! So applications with few, but large, args were being duplicated.
-
-* sizeExpr on an application wasn't doing a nukeScrutDiscount on the arg of
- an application!! So bogus discounts could accumulate from arguments!
-
-* Improve handling of INLINE pragmas in calcUnfoldingGuidance. It was really
- wrong before
-
-* Substantially improve handling of coerces in worker/wrapper
-
-Tuesday 6 June
-~~~~~~~~~~~~~~
-* Fix Kevin Atkinson's cant-find-instance bug. Turns out that Rename.slurpSourceRefs
- needs to repeatedly call getImportedInstDecls, and then go back to slurping
- source-refs. Comments with Rename.slurpSourceRefs.
-
-* Add a case to Simplify.mkDupableAlt for the quite-common case where there's
- a very simple alternative, in which case there's no point in creating a
- join-point binding.
-
-* Fix CoreUtils.exprOkForSpeculation so that it returns True for (==# a# b#).
- Without this,
- case ==# a# b# of { True -> x; False -> x }
- was not simplifying
-
-* Make float-out dump bindings at the top of a function argument, as
- at the top of a let(rec) rhs. See notes with FloatOut.floatRhs
-
-* Make the ArgOf case of mkDupableAlt generate a OneShot lambda.
- This gave a noticeable boost to spectral/boyer2
-
-
-Monday 5 June
-~~~~~~~~~~~~~
-Work, using IO.hPutStr as an example, to reduce the number of coerces.
-The main idea is in WwLib.mkWWcoerce. The gloss is that we must do
-the w/w split even for small non-recursive things. See notes with
-WorkWrap.tryWw.
-
-
-Friday 2 June
-~~~~~~~~~~~~~
-Study why gen_regexps is slower than before. Problem is in IO.writeLines,
-in particular the local defn shoveString. Two things are getting
-in the way of arity expansion, which means we build far more function
-closures than we should:
- shove = \ x -> let lvl = \s -> ...
-                in \s -> ... lvl ...
-
-The two things are:
- a) coerces
- b) full laziness floats
-
-
-Solution to (a): add coerces to the worker/wrapper stuff.
-See notes with WwLib.mkWWcoerce.
-
-This further complicated getWorkerId, so I finally bit the bullet and
-made the workerInfo field of the IdInfo work properly, including
-under substitutions. Death to getWorkerId.
-
-
-
-Solution to (b): make all lambdas over realWorldStatePrimTy
-into one-shot lambdas. This is a GROSS HACK.
-
-* Also make the occurrence analyser aware of one-shot lambdas.
-
-
-Thurs 1 June
-~~~~~~~~~~~~
-Fix SetLevels so that it does not clone top-level bindings, but it
-*does* clone bindings that are destined for the top level.
-
-The global invariant is that the top level bindings are always
-unique, and never cloned.