X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=blobdiff_plain;f=compiler%2FSimon-log;fp=compiler%2FSimon-log;h=9d60ccc6eb696b94ec503a28f8f11cfcb8af5afe;hp=0000000000000000000000000000000000000000;hb=0065d5ab628975892cea1ec7303f968c3338cbe1;hpb=28a464a75e14cece5db40f2765a29348273ff2d2 diff --git a/compiler/Simon-log b/compiler/Simon-log new file mode 100644 index 0000000..9d60ccc --- /dev/null +++ b/compiler/Simon-log @@ -0,0 +1,1260 @@ + ------------------------------------ + GHCI hacking + ------------------------------------ + +* Don't forget to put deferred-type-decls back into RnIfaces + +* Do we want to record a package name in a .hi file? + Does pi_mod have a ModuleName or a Module? + + ------------------------------------ + Mainly FunDeps (23 Jan 01) + ------------------------------------ + +This commit re-engineers the handling of functional dependencies. +A functional dependency is no longer an Inst; instead, the necessary +dependencies are snaffled out of their Class when necessary. + +As part of this exercise I found that I had to re-work how to do generalisation +in a binding group. There is rather exhaustive documentation on the new Plan +at the top of TcSimplify. + + ****************** + WARNING: I have compiled all the libraries with this new compiler + and all looks well, but I have not run many programs. + Things may break. Let me know if so. + ****************** + +The main changes are these: + +1. typecheck/TcBinds and TcSimplify have a lot of changes due to the + new generalisation and context reduction story. There are extensive + comments at the start of TcSimplify + +2. typecheck/TcImprove is removed altogether. Instead, improvement is + interleaved with context reduction (until a fixpoint is reached). + All this is done in TcSimplify. + +3. types/FunDeps has new exports + * 'improve' does improvement, returning a list of equations + * 'grow' and 'oclose' close a list of type variables wrt a set of + PredTypes, but in slightly different ways. Comments in file. + +4. I improved the way in which we check that main::IO t. It's tidier now. + +In addition + +* typecheck/TcMatches: + a) Tidy up, introducing a common function tcCheckExistentialPat + + b) Improve the typechecking of parallel list comprehensions, + which wasn't quite right before. (see comments with tcStmts) + + WARNING: (b) is untested! Jeff, you might want to check. + +* Numerous other incidental changes in the typechecker + +* Manuel found that rules don't fire well when you have partial applications + from overloading. For example, we may get + + f a (d::Ord a) = let m_g = g a d + in + \y :: a -> ...(m_g (h y))... + + The 'method' m_g doesn't get inlined because (g a d) might be a redex. + Yet a rule that looks like + g a d (h y) = ... + won't fire because that doesn't show up. One way out would be to make + the rule matcher a bit less paranoid about duplicating work, but instead + I've added a flag + -fno-method-sharing + which controls whether we generate things like m_g in the first place. + It's not clear that they are a win in the first place. + + The flag is actually consulted in Inst.tcInstId + + + + ------------------------------------ + Mainly PredTypes (28 Sept 00) + ------------------------------------ + +Three things in this commit: + + 1. Main thing: tidy up PredTypes + 2. Move all Keys into PrelNames + 3. Check for unboxed tuples in function args + +1. Tidy up PredTypes +~~~~~~~~~~~~~~~~~~~~ +The main thing in this commit is to modify the representation of Types +so that they are a (much) better for the qualified-type world. This +should simplify Jeff's life as he proceeds with implicit parameters +and functional dependencies. In particular, PredType, introduced by +Jeff, is now blessed and dignified with a place in TypeRep.lhs: + + data PredType = Class Class [Type] + | IParam Name Type + +Consider these examples: + f :: (Eq a) => a -> Int + g :: (?x :: Int -> Int) => a -> Int + h :: (r\l) => {r} => {l::Int | r} + +Here the "Eq a" and "?x :: Int -> Int" and "r\l" are all called +*predicates*, and are represented by a PredType. (We don't support +TREX records yet, but the setup is designed to expand to allow them.) + +In addition, Type gains an extra constructor: + + data Type = .... | PredTy PredType + +so that PredType is injected directly into Type. So the type + p => t +is represented by + PredType p `FunTy` t + +I have deleted the hackish IPNote stuff; predicates are dealt with entirely +through PredTys, not through NoteTy at all. + + +2. Move Keys into PrelNames +~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +This is just a housekeeping operation. I've moved all the pre-assigned Uniques +(aka Keys) from Unique.lhs into PrelNames.lhs. I've also moved knowKeyRdrNames +from PrelInfo down into PrelNames. This localises in PrelNames lots of stuff +about predefined names. Previously one had to alter three files to add one, +now only one. + +3. Unboxed tuples +~~~~~~~~~~~~~~~~~~ +Add a static check for unboxed tuple arguments. E.g. + data T = T (# Int, Int #) +is illegal + + + + --------------------------------------- + Update in place + --------------------------------------- + +-funfolding-update-in-place +Switching it on doesn't affect many programs, except these +sphere is because it makes a critical function (vecsub) more inlinable + + sphere 66465k -20.61% + infer 13390k +1.27% + parstof 1461k +1.18% + fluid 3442k +1.61% + atom 177163k +13.20% + bspt 4837k +4.85% + cichelli 33546k +2.69% + typecheck 146023k +1.47% + + + --------------------------------------- + Simon's tuning changes: early Sept 2000 + --------------------------------------- + +Library changes +~~~~~~~~~~~~~~~ +* Eta expand PrelShow.showLitChar. It's impossible to compile this well, + and it makes a big difference to some programs (e.g. gen_regexps) + +* Make PrelList.concat into a good producer (in the foldr/build sense) + + +Flag changes +~~~~~~~~~~~~ +* Add -ddump-hi-diffs to print out changes in interface files. Useful + when watching what the compiler is doing + +* Add -funfolding-update-in-place to enable the experimental optimisation + that makes the inliner a bit keener to inline if it's in the RHS of + a thunk that might be updated in place. Sometimes this is a bad idea + (one example is in spectral/sphere; see notes in nofib/Simon-nofib-notes) + + +Tuning things +~~~~~~~~~~~~~ +* Fix a bug in SetLevels.lvlMFE. (change ctxt_lvl to dest_level) + I don't think this has any performance effect, but it saves making + a redundant let-binding that is later eliminated. + +* Desugar.dsProgram and DsForeign + Glom together all the bindings into a single Rec. Previously the + bindings generated by 'foreign' declarations were not glommed together, but + this led to an infelicity (i.e. poorer code than necessary) in the modules + that actually declare Float and Double (explained a bit more in Desugar.dsProgram) + +* OccurAnal.shortMeOut and IdInfo.shortableIdInfo + Don't do the occurrence analyser's shorting out stuff for things which + have rules. Comments near IdInfo.shortableIdInfo. + This is deeply boring, and mainly to do with making rules work well. + Maybe rules should have phases attached too.... + +* CprAnalyse.addIdCprInfo + Be a bit more willing to add CPR information to thunks; + in particular, if the strictness analyser has just discovered that this + is a strict let, then the let-to-case transform will happen, and CPR is fine. + This made a big difference to PrelBase.modInt, which had something like + modInt = \ x -> let r = ... -> I# v in + ...body strict in r... + r's RHS isn't a value yet; but modInt returns r in various branches, so + if r doesn't have the CPR property then neither does modInt + +* MkId.mkDataConWrapId + Arrange that vanilla constructors, like (:) and I#, get unfoldings that are + just a simple variable $w:, $wI#. This ensures they'll be inlined even into + rules etc, which makes matching a bit more reliable. The downside is that in + situations like (map (:) xs), we'll end up with (map (\y ys. $w: y ys) xs. + Which is tiresome but it doesn't happen much. + +* SaAbsInt.findStrictness + Deal with the case where a thing with no arguments is bottom. This is Good. + E.g. module M where { foo = error "help" } + Suppose we have in another module + case M.foo of ... + Then we'd like to do the case-of-error transform, without inlining foo. + + +Tidying up things +~~~~~~~~~~~~~~~~~ +* Reorganised Simplify.completeBinding (again). + +* Removed the is_bot field in CoreUnfolding (is_cheap is true if is_bot is!) + This is just a tidy up + +* HsDecls and others + Remove the NewCon constructor from ConDecl. It just added code, and nothing else. + And it led to a bug in MkIface, which though that a newtype decl was always changing! + +* IdInfo and many others + Remove all vestiges of UpdateInfo (hasn't been used for years) + + ------------------------------ + Join points Sept 2000 + ------------------------------ + +With Andrew Kennedy, I found out why a few of the join points introduced by +the simplifier end up as *not* let-no-escpaed. Here's an example: + +f x y = case (pwr x b) == 1 of + False -> False + True -> pwr x c == 1 + +This compiles to: + f = \ @ t w :: Integer -> + let { + $j :: (State# RealWorld -> Bool) + P + $j + = \ w1 :: (State# RealWorld) -> + case pwr w c of wild { + S# i -> case i of wild1 { 1 -> $wTrue; __DEFAULT -> $wFalse }; + J# s d1 -> + case cmpIntegerInt# s d1 1 of wild2 { + 0 -> $wTrue; __DEFAULT -> $wFalse + } + } + } in + case pwr w b of wild { + S# i -> + case i of wild1 { 1 -> $j realWorld#; __DEFAULT -> $wFalse }; + J# s d1 -> + case cmpIntegerInt# s d1 1 of wild2 { + 0 -> $j realWorld#; __DEFAULT -> $wFalse + } + } + +Now consider + + case (f x) of + True -> False + False -> True + +Suppose f is inlined into this case. No new join points are introduced, +because the alternatives are both small. But the consumer + case [.] of {True -> False; False -> True} +will move into the body of f, be duplicated 4 ways, and end up consuming +the result of the four outcomes at the body of f. This yields: + $j :: (State# RealWorld -> Bool) + P + $j + = \ w1 :: (State# RealWorld) -> + case pwr w c of wild { + S# i -> case i of wild1 { 1 -> $wTrue; __DEFAULT -> $wFalse }; + J# s d1 -> + case cmpIntegerInt# s d1 1 of wild2 { + 0 -> $wTrue; __DEFAULT -> $wFalse + } + } + } in + case pwr w b of wild { + S# i -> + case i of wild1 { 1 -> case $j realWorld# of {T->F; F->T} + ; __DEFAULT -> $wTrue }; + J# s d1 -> + case cmpIntegerInt# s d1 1 of wild2 { + 0 -> case $j realWorld# of {T->F; F->T} + ; __DEFAULT -> $wTrue + } + } + +And, voila, the join point $j isn't let-no-escaped any more. +The point is that the consuming context can't "see inside" the join point. +It's a phase ordering thing. If f is inlined before the join points +are built in the first place, then all is well. + + + + ----------------------------- + Sept 7 2000 + ----------------------------- + +* Make the simplifier's Stop continuation record whether the expression being + simplified is the RHS of a thunk, or (say) the body of a lambda or case RHS. + In the thunk case we want to be a bit keener about inlining if the type of + the thunk is amenable to update in place. + +* SetLevels was being a bit too eager to float things to the top + level; e.g. _inline_me_ (\a -> e); here e got floated... + Easily fixed by a change to ltMajLvl + +* Make CoreUnfold.calcUnfoldingGuidance a bit less keen to make case expressions + seem small. The original idea was to make inlined wrappers look small, so that + when we inline a wrapper it doesn't make call site (much) bigger + Otherwise we get nasty phase ordering stuff: + -- f x = g x x + -- h y = ...(f e)... + If we inline g's wrapper, f looks big, and doesn't get inlined + into h; if we inline f first, while it looks small, then g's + wrapper will get inlined later anyway. To avoid this nasty + ordering difference, we make (case a of (x,y) -> ...), + *where a is one of the arguments* look free. + + BUT (a) It's too eager. We don't want to inline a wrapper into a + context with no benefit. + E.g. \ x. f (x+x) o point in inlining (+) here! + + (b) It's ineffective. Once g's wrapper is inlined, its case-expressions + aren't scrutinising arguments any more + + So I've rescinded this idea for now. cases still look fairly small. + +* Fix interestingArg, which was being too liberal, and hence doing + too much inlining. + +* Extended CoreUtils.exprIsCheap to make two more things cheap: + - case (coerce x) of ... + - let x = y +# z + This makes a bit more eta expansion happen. It was provoked by + a program of Marcin's. + +* The simplifier used to glom together all the top-level bindings into + a single Rec every time it was invoked. The reason for this is explained + in SimplCore.lhs, but for at least one simple program it meant that the + simplifier never got around to unravelling the recursive group into + non-recursive pieces. So I've put the glomming under explicit flag + control with a -fglom-binds simplifier pass. A side benefit is + that because it happens less often, the (expensive) SCC algorithm + runs less often. + +* MkIface.ifaceBinds. Make sure that we emit rules for things + (like class operations) that don't get a top-level binding in the + interface file. Previously such rules were silently forgotten. + +* Move transformRhs to *after* simplification, which makes it a + little easier to do, and means that the arity it computes is + readily available to completeBinding. This gets much better + arities. + +* Do coerce splitting in completeBinding. This gets good code for + newtype CInt = CInt Int + + test:: CInt -> Int + test x = case x of + 1 -> 2 + 2 -> 4 + 3 -> 8 + 4 -> 16 + _ -> 0 + +* Modify the meaning of "arity" so that during compilation it means + "if you apply this function to fewer args, it will do virtually + no work". So, for example + f = coerce t (\x -> e) + has arity at least 1. When a function is exported, it's arity becomes + the number of exposed, top-level lambdas, which is subtly different. + But that's ok. + + I removed CoreUtils.exprArity altogether: it looked only at the exposed + lambdas. Instead, we use exprEtaExpandArity exclusively. + + All of this makes I/O programs work much better. + + + ----------------------------- + Sept 4 2000 + ----------------------------- + +* PrimRep, TysPrim. Add PrimPtrRep as the representation for + MVars and MutVars. Previously they were given PtrRep, but that + crashed dataReturnConvPrim! Here's the program the killed it: + data STRef s a = STRef (MutVar# s a) + from (STRef x) = x + +* Make the desugarer use string equality for string literal + patterns longer than 1 character. And put a specialised + eqString into PrelBase, with a suitable specialisation rule. + This makes a huge difference to the size of the code generated + by deriving(Read) notably in Time.lhs + + ----------------------------- + Marktoberdorf Commits (Aug 2000) + ----------------------------- + +1. Tidy up the renaming story for "system binders", such as +dictionary functions, default methods, constructor workers etc. These +are now documented in HsDecls. The main effect of the change, apart +from tidying up, is to make the *type-checker* (instead of the +renamer) generate names for dict-funs and default-methods. This is +good because Sergei's generic-class stuff generates new classes at +typecheck time. + + +2. Fix the CSE pass so it does not require the no-shadowing invariant. +Keith discovered that the simplifier occasionally returns a result +with shadowing. After much fiddling around (which has improved the +code in the simplifier a bit) I found that it is nearly impossible to +arrange that it really does do no-shadowing. So I gave up and fixed +the CSE pass (which is the only one to rely on it) instead. + + +3. Fix a performance bug in the simplifier. The change is in +SimplUtils.interestingArg. It computes whether an argment should +be considered "interesting"; if a function is applied to an interesting +argument, we are more likely to inline that function. +Consider this case + let x = 3 in f x +The 'x' argument was considered "uninteresting" for a silly reason. +Since x only occurs once, it was unconditionally substituted, but +interestingArg didn't take account of that case. Now it does. + +I also made interestingArg a bit more liberal. Let's see if we +get too much inlining now. + + +4. In the occurrence analyser, we were choosing a bad loop breaker. +Here's the comment that's now in OccurAnal.reOrderRec + + score ((bndr, rhs), _, _) + | exprIsTrivial rhs = 3 -- Practically certain to be inlined + -- Used to have also: && not (isExportedId bndr) + -- But I found this sometimes cost an extra iteration when we have + -- rec { d = (a,b); a = ...df...; b = ...df...; df = d } + -- where df is the exported dictionary. Then df makes a really + -- bad choice for loop breaker + +I also increased the score for bindings with a non-functional type, so that +dictionaries have a better chance of getting inlined early + + +5. Add a hash code to the InScopeSet (and make it properly abstract) +This should make uniqAway a lot more robust. Simple experiments suggest +that uniqAway no longer gets into the long iteration chains that it used +to. + + +6. Fix a bug in the inliner that made the simplifier tend to get into +a loop where it would keep iterating ("4 iterations, bailing out" message). +In SimplUtils.mkRhsTyLam we float bindings out past a big lambda, thus: + x = /\ b -> let g = \x -> f x x + in E +becomes + g* = /\a -> \x -> f x x + x = /\ b -> let g = g* b in E + +It's essential that we don't simply inling g* back into the RHS of g, +else we will be back to square 1. The inliner is meant not to do this +because there's no benefit to the inlining, but the size calculation +was a little off in CoreUnfold. + + +7. In SetLevels we were bogus-ly building a Subst with an empty in-scope +set, so a WARNING popped up when compiling some modules. (knights/ChessSetList +was the example that tickled it.) Now in fact the warning wasn't an error, +but the Right Thing to do is to carry down a proper Subst in SetLevels, so +that is what I have now done. It is very little more expensive. + + + + ~~~~~~~~~~~~ + Apr/May 2000 + ~~~~~~~~~~~~ + +This is a pretty big commit! It adds stuff I've been working on +over the last month or so. DO NOT MERGE IT WITH 4.07! + +Recompilation checking +~~~~~~~~~~~~~~~~~~~~~~ +Substantial improvement in recompilation checking. The version management +is now entirely internal to GHC. ghc-iface.lprl is dead! + +The trick is to generate the new interface file in two steps: + - first convert Types etc to HsTypes etc, and thereby + build a new ParsedIface + - then compare against the parsed (but not renamed) version of the old + interface file +Doing this meant adding code to convert *to* HsSyn things, and to +compare HsSyn things for equality. That is the main tedious bit. + +Another improvement is that we now track version info for +fixities and rules, which was missing before. + + +Interface file reading +~~~~~~~~~~~~~~~~~~~~~~ +Make interface files reading more robust. + * If the old interface file is unreadable, don't fail. [bug fix] + + * If the old interface file mentions interfaces + that are unreadable, don't fail. [bug fix] + + * When we can't find the interface file, + print the directories we are looking in. [feature] + + +Type signatures +~~~~~~~~~~~~~~~ + * New flag -ddump-types to print type signatures + + +Type pruning +~~~~~~~~~~~~ +When importing + data T = T1 A | T2 B | T3 C +it seems excessive to import the types A, B, C as well, unless +the constructors T1, T2 etc are used. A,B,C might be more types, +and importing them may mean reading more interfaces, and so on. + So the idea is that the renamer will just import the decl + data T +unless one of the constructors is used. This turns out to be quite +easy to implement. The downside is that we must make sure the +constructors are always available if they are really needed, so +I regard this as an experimental feature. + + +Elimininate ThinAir names +~~~~~~~~~~~~~~~~~~~~~~~~~ +Eliminate ThinAir.lhs and all its works. It was always a hack, and now +the desugarer carries around an environment I think we can nuke ThinAir +altogether. + +As part of this, I had to move all the Prelude RdrName defns from PrelInfo +to PrelMods --- so I renamed PrelMods as PrelNames. + +I also had to move the builtinRules so that they are injected by the renamer +(rather than appearing out of the blue in SimplCore). This is if anything simpler. + +Miscellaneous +~~~~~~~~~~~~~ +* Tidy up the data types involved in Rules + +* Eliminate RnEnv.better_provenance; use Name.hasBetterProv instead + +* Add Unique.hasKey :: Uniquable a => a -> Unique -> Bool + It's useful in a lot of places + +* Fix a bug in interface file parsing for __U[!] + + +======================================= +To-do +~~~~~ +* Try the effect of enhancing update in place with the CPR + idea in CoreUnfold.calcUnfoldingGuidance + +* Check with Simon M re srt on Lit + +* Make all primops return a data type so that we can't over-apply a primop + This makes code gen simpler. Currently the only primops with a polymorphic + return type are: + raise# :: a -> b + catch# :: a -> (b->a) -> a + tagToEnum# :: Int -> a + + Very strange code for PrelException.catchException! What has STret got + to do with it? + +* Liberate case + +* Missing w/w for coerce in go2 functions of fibToList' in fibheaps + +* Watch out for re-boxing in workers; sometimes it happens + and then w/w is a Bad Thing + +* Only two uses of mkCompulsoryUnfolding -- try to nuke it + +* Note that mkDupAlt makes alts that have binders that + are guaranteed to appear just once or not at all + (a,b) -> j a + Same for case binder, but that's harder to take into account. + +* max :: Int -> Int -> Int could be CPRd but isn't. + +* In mandel2 we do a little less well than 4.04 because we aren't + inlining point_colour, and that means we have to box up an argument + before calling it. [This was due to a bug in 4.04] + There's also a great opportunity for liberateCase + in check_radius, where it loops around with two lazy F# built each time + +* In PrelShow.itos' we find a thunk like: + tpl = case chrzh {(zpzh {(remIntzh {x{-aMf-} 10}) 48})} + of tpl{-X1j-} __D P { __DEFAULT -> + PrelBase.Czh{-62,s-} {tpl{-X1j-}} + } + This is a pity. The remInt# can't fail because the divisor isn't 0, + so we could do the sum eagerly and allocate a charcter instead of a thunk. + +* It's good to do let-to-case before we wrap up. Consider + f b xs = let ys = partition isUpper xs + zs = case ys of (a,b) -> a + in case b of + True -> case ys of + (a,b) -> (zs,[]) + False -> case ys of + (a,b) -> (zs ++ xs,[]) + If we don't do let-to-case at all, we get 3 redundant case ys left. + On the other hand we don't want to do it too early, because it + prevents inlining into strict arg positions, which is important for + rules to work. + +* Strict dictionaries. + +* INLINE functions are not always INLINEd, so it's sad to leave + stuff in their bodies like constructors that havn't been inlined. + +* If let x = e in b is strict, then CPR can use the CPR info from x + This bites in the mod method of Integral Int + +* Inline wrappers if they are the RHS of a let, so that update in place + can happen? + +* Consider doing unboxing on strict constr args in a pattern match, + as part of w/w. + +* In spectral/expert/Search.ask there's a statically visible CSE. Catching this + depends almost entirely on chance, which is a pity. + +* Think about exprEtaExpandArity in WwLib. Perhaps eliminate eta expand in simplify? + Perhaps use even if no coerces etc, just eta expansion. (e.g. PrelArr.done) + +* In knights/KnightHeuristic, we don't find that possibleMoves is strict + (with important knock-on effects) unless we apply rules before floating + out the literal list [A,B,C...]. + Similarly, in f_se (F_Cmp ...) in listcompr (but a smaller effect) + +* Floating can float the entire body of an INLINE thing out. + e.g. PrelArr.done + This is sad, and a bit stupid. + +* In spectral/multiplier, we have + xor = lift21 forceBit f + where f :: Bit -> Bit -> Bit + f 0 0 = 0 + f 0 1 = 1 + f 1 0 = 1 + f 1 1 = 0 + Trouble is, f is CPR'd, and that means that instead of returning + the constants I# 0, I# 1, it returns 0,1 and then boxes them. + So allocation goes up. I don't see a way around this. + +* spectral/hartel/parstof ends up saying + case (unpackCString "x") of { c:cs -> ... } + quite a bit. We should spot these and behave accordingly. + +* Try a different hashing algorithms in hashUFM. This might reduce long CSE lists + as well as making uniqAway faster. + +* [I'm not sure this is really important in the end.] + Don't float out partial applications in lvlMFE. E.g. (in hPutStr defn of shoveString) + \x -> case .. of + [] -> setBufWPtr a b + ... + setBufWPtr has arity 3. Floating it out is plain silly. And in this particular + case it's harmful, because it ends up preventing eta expansion on the \x. + That in turn leads to a big extra cost in hPutStr. + + *** Try not doing lvlMFE on the body of a lambda and case alternative *** + +* PrelNumExtra.lhs we get three copies of dropTrailing0s. Too much inlining! + drop0 has cost 21, but gets a discount of 6 (3 * #constrs) for its arg. + With a keen-neess factor of 2, that makes a discount of 12. Add two for + the arguments and we get 21-12-2, which is just small enough to inline. + But that is plainly stupid. + + Add one for cases; and decrease discount for constructors. + +* IO.hGetContents still doesn't see that it is strict in the handle. + Coerces still getting in the way. + +* Try not having really_interesting_cont (subsumed by changes in the + way guidance is calculated for inline things?) + +* Enumeration types in worker/wrapper for strictness analysis + +* This should be reported as an error: + data T k = MkT (k Int#) + +* Bogus report of overlapped pattern for + f (R {field = [c]}) = 1 + f (R {}) = 2 + This shows up for TyCon.maybeTyConSingleCon + +* > module Main( main ) where + + > f :: String -> Int + > f "=<" = 0 + > f "=" = 0 + + > g :: [Char] -> Int + > g ['=','<'] = 0 + > g ['='] = 0 + + > main = return () + + For ``f'' the following is reported. + + tmp.lhs:4: + Pattern match(es) are overlapped in the definition of function `f' + "=" = ... + + There are no complaints for definition for ``g''. + +* Without -O I don't think we need change the module version + if the usages change; I forget why it changes even with -O + +* Record selectors for existential type; no good! What to do? + Record update doesn't make sense either. + + Need to be careful when figuring out strictness, and when generating + worker-wrapper split. + + Also when deriving. + + + Jan 2000 + ~~~~~~~~ + +A fairly big pile of work originally aimed at +removing the Con form of Core expression, and replacing it with simple +Lit form. However, I wanted to make sure that the resulting thing +performed better than the original, so I ended up making an absolute +raft of other changes. + +Removing the Con form of Core expressions +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +The big thing is that + + For every constructor C there are now *two* Ids: + + C is the constructor's *wrapper*. It evaluates and unboxes arguments + before calling $wC. It has a perfectly ordinary top-level defn + in the module defining the data type. + + $wC is the constructor's *worker*. It is like a primop that simply + allocates and builds the constructor value. Its arguments are the + actual representation arguments of the constructor. + + For every primop P there is *one* Id, its (curried) Id + + Neither contructor worker Id nor the primop Id have a defminition anywhere. + Instead they are saturated during the core-to-STG pass, and the code generator + generates code for them directly. The STG language still has saturated + primops and constructor applications. + +* The Const type disappears, along with Const.lhs. The literal part + of Const.lhs reappears as Literal.lhs. Much tidying up in here, + to bring all the range checking into this one module. + +* I got rid of NoRep literals entirely. They just seem to be too much trouble. + +* Because Con's don't exist any more, the funny C { args } syntax + disappears from inteface files. + +* Every constructor, C, comes with a + + *wrapper*, called C, whose type is exactly what it looks like + in the source program. It is an ordinary function, + and it gets a top-level binding like any other function + + *worker*, called $wC, which is the actual data constructor. + Its type may be different to C, because: + - useless dict args are dropped + - strict args may be flattened + It does not have a binding. + + The worker is very like a primop, in that it has no binding, + + +Parsing +~~~~~~~ +* Result type signatures now work + f :: Int -> Int = \x -> x + -- The Int->Int is the type of f + + g x y :: Int = x+y + -- The Int is the type of the result of (g x y) + + +Recompilation checking and make +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +* The .hi file for a modules is not touched if it doesn't change. (It used to + be touched regardless, forcing a chain of recompilations.) The penalty for this + is that we record exported things just as if they were mentioned in the body of + the module. And the penalty for that is that we may recompile a module when + the only things that have changed are the things it is passing on without using. + But it seems like a good trade. + +* -recomp is on by default + +Foreign declarations +~~~~~~~~~~~~~~~~~~~~ +* If you say + foreign export zoo :: Int -> IO Int + then you get a C produre called 'zoo', not 'zzoo' as before. + I've also added a check that complains if you export (or import) a C + procedure whose name isn't legal C. + + +Code generation and labels +~~~~~~~~~~~~~~~~~~~~~~~~~~ +* Now that constructor workers and wrappers have distinct names, there's + no need to have a Foo_static_closure and a Foo_closure for constructor Foo. + I nuked the entire StaticClosure story. This has effects in some of + the RTS headers (i.e. s/static_closure/closure/g) + + +Rules, constant folding +~~~~~~~~~~~~~~~~~~~~~~~ +* Constant folding becomes just another rewrite rule, attached to the Id for the + PrimOp. To achieve this, there's a new form of Rule, a BuiltinRule (see CoreSyn.lhs). + The prelude rules are in prelude/PrelRules.lhs, while simplCore/ConFold.lhs has gone. + +* Appending of constant strings now works, using fold/build fusion, plus + the rewrite rule + unpack "foo" c (unpack "baz" c n) = unpack "foobaz" c n + Implemented in PrelRules.lhs + +* The CCall primop is tidied up quite a bit. There is now a data type CCall, + defined in PrimOp, that packages up the info needed for a particular CCall. + There is a new Id for each new ccall, with an big "occurrence name" + {__ccall "foo" gc Int# -> Int#} + In interface files, this is parsed as a single Id, which is what it is, really. + +Miscellaneous +~~~~~~~~~~~~~ +* There were numerous places where the host compiler's + minInt/maxInt was being used as the target machine's minInt/maxInt. + I nuked all of these; everything is localised to inIntRange and inWordRange, + in Literal.lhs + +* Desugaring record updates was broken: it didn't generate correct matches when + used withe records with fancy unboxing etc. It now uses matchWrapper. + +* Significant tidying up in codeGen/SMRep.lhs + +* Add __word, __word64, __int64 terminals to signal the obvious types + in interface files. Add the ability to print word values in hex into + C code. + +* PrimOp.lhs is no longer part of a loop. Remove PrimOp.hi-boot* + + +Types +~~~~~ +* isProductTyCon no longer returns False for recursive products, nor + for unboxed products; you have to test for these separately. + There's no reason not to do CPR for recursive product types, for example. + Ditto splitProductType_maybe. + +Simplification +~~~~~~~~~~~~~~~ +* New -fno-case-of-case flag for the simplifier. We use this in the first run + of the simplifier, where it helps to stop messing up expressions that + the (subsequent) full laziness pass would otherwise find float out. + It's much more effective than previous half-baked hacks in inlining. + + Actually, it turned out that there were three places in Simplify.lhs that + needed to know use this flag. + +* Make the float-in pass push duplicatable bindings into the branches of + a case expression, in the hope that we never have to allocate them. + (see FloatIn.sepBindsByDropPoint) + +* Arrange that top-level bottoming Ids get a NOINLINE pragma + This reduced gratuitous inlining of error messages. + But arrange that such things still get w/w'd. + +* Arrange that a strict argument position is regarded as an 'interesting' + context, so that if we see + foldr k z (g x) + then we'll be inclined to inline g; this can expose a build. + +* There was a missing case in CoreUtils.exprEtaExpandArity that meant + we were missing some obvious cases for eta expansion + Also improve the code when handling applications. + +* Make record selectors (identifiable by their IdFlavour) into "cheap" operations. + [The change is a 2-liner in CoreUtils.exprIsCheap] + This means that record selection may be inlined into function bodies, which + greatly improves the arities of overloaded functions. + +* Make a cleaner job of inlining "lone variables". There was some distributed + cunning, but I've centralised it all now in SimplUtils.analyseCont, which + analyses the context of a call to decide whether it is "interesting". + +* Don't specialise very small functions in Specialise.specDefn + It's better to inline it. Rather like the worker/wrapper case. + +* Be just a little more aggressive when floating out of let rhss. + See comments with Simplify.wantToExpose + A small change with an occasional big effect. + +* Make the inline-size computation think that + case x of I# x -> ... + is *free*. + + +CPR analysis +~~~~~~~~~~~~ +* Fix what was essentially a bug in CPR analysis. Consider + + letrec f x = let g y = let ... in f e1 + in + if ... then (a,b) else g x + + g has the CPR property if f does; so when generating the final annotated + RHS for f, we must use an envt in which f is bound to its final abstract + value. This wasn't happening. Instead, f was given the CPR tag but g + wasn't; but of course the w/w pass gives rotten results in that case!! + (Because f's CPR-ness relied on g's.) + + On they way I tidied up the code in CprAnalyse. It's quite a bit shorter. + + The fact that some data constructors return a constructed product shows + up in their CPR info (MkId.mkDataConId) not in CprAnalyse.lhs + + + +Strictness analysis and worker/wrapper +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ +* BIG THING: pass in the demand to StrictAnal.saExpr. This affects situations + like + f (let x = e1 in (x,x)) + where f turns out to have strictness u(SS), say. In this case we can + mark x as demanded, and use a case expression for it. + + The situation before is that we didn't "know" that there is the u(SS) + demand on the argument, so we simply computed that the body of the let + expression is lazy in x, and marked x as lazily-demanded. Then even after + f was w/w'd we got + + let x = e1 in case (x,x) of (a,b) -> $wf a b + + and hence + + let x = e1 in $wf a b + + I found a much more complicated situation in spectral/sphere/Main.shade, + which improved quite a bit with this change. + +* Moved the StrictnessInfo type from IdInfo to Demand. It's the logical + place for it, and helps avoid module loops + +* Do worker/wrapper for coerces even if the arity is zero. Thus: + stdout = coerce Handle (..blurg..) + ==> + wibble = (...blurg...) + stdout = coerce Handle wibble + This is good because I found places where we were saying + case coerce t stdout of { MVar a -> + ... + case coerce t stdout of { MVar b -> + ... + and the redundant case wasn't getting eliminated because of the coerce. + + + +End December +~~~~~~~~~~~~ +* Fix a few renamer bugs + +* Substantially reorganise the Prelude to eliminate all orphan declarations. + Details in PrelBase.lhs + +* Do a much better job of appending literal strings + - remove NoRepStr + - move unpackCString stuff to PrelBase + - add BuiltinRules to the Rule type + - add fold/build rules for literal strings + + + +Week of Mon 25 Oct +~~~~~~~~~~~~~~~~~~ +* Fix a terrible bug in Simplify.mkDupableAlt; we were duplicating a small + *InAlt*, but doing so invalidated occurrence info, which could lead to + substantial code duplication. + +* Fix a bug in WwLib.mkWWcpr; I was generating CPR wrappers like + I# (case x of ...) + which is utterly wrong. It should be + case x of ...(I# r) + (The effect was to make functions stricter than they really are.) + +* Try doing no inlining at all in phase 0. This noticeably improved + spectral/fish (esp Main.hs I think), by improving floating. + This single change has quite a large effect on some programs (allocation) + + Don't inline Don't inline + wrappers anything + in phase 0 in phase 0 + awards 113k -7.08% + cichelli 28962k -3.12% + wave4main 88089k +130.45% + fibheaps 31731k +19.01% + fish 8273k -1.64% + typecheck 148713k +4.91% + + But I found that fish worked much better if we inline *local* things + in phase 0, but not *imported* things. + +* Fix a terrible bug in Simplify.mkLamBndrZapper. It was counting + type args in one place, but not type binders, so it was sometimes + inlining into unsaturated lambdas! + +* I found that there were some very bad loss-of-arity cases in PrelShow. + In particular, we had: + + showl "" = showChar '"' s + showl ('"':xs) = showString "\\\"" . showl xs + showl (x:xs) = showLitChar x . showl xs + + Trouble is, we get + showl = \xs -> case xs of + ... + (x:xs) -> let f = showLitChar x + g = showl xs + in \s -> f (g x) + which is TERRIBLE. We can't spot that showLitChar has arity 2 because + it looks like this: + + ...other eqns... + showLitChar c = showString ('\\' : asciiTab!!ord c) + + notice that the (asciiTab!!orc c) is outside the \s, so GHC can't rewrite it to + + showLitChar c = \s -> showString ('\\' : asciiTab!!ord c) s + + So I've changed PrelShow.showLitChar to use explicit \s. Even then, showl + doesn't work, because GHC can't see that showl xs can be pushed inside the \s. + So I've put an explict \s there too. + + showl "" s = showChar '"' s + showl ('"':xs) s = showString "\\\"" (showl xs s) + showl (x:xs) s = showLitChar x (showl xs s) + + Net result: imaginary/gen_regexps more than halves in allocation! + + Turns out that the mkLamBndrZapper bug (above) meant that showl was + erroneously inlining showLitChar x and showl xs, which is why this + problem hasn't shown up before. + +* Improve CSE a bit. In ptic + case h x of y -> ...(h x)... + replaces (h x) by y. + +* Inline INLINE things very agressively, even though we get code duplication + thereby. Reason: otherwise we sometimes call the original un-inlined INLINE + defns, which have constructors etc still un-inlined in their RHSs. The + improvement is dramatic for a few programs: + + typecheck 150865k -1.43% + wave4main 114216k -22.87% + boyer 28793k -7.86% + cichelli 33786k -14.28% + ida 59505k -1.79% + rewrite 14665k -4.91% + sched 17641k -4.22% + + Code size increases by 10% which is not so good. There must be a better way. + Another bad thing showed up in fish/Main.hs. Here we have + (x1,y1) `vec_add` (x2,y2) = (x1+x2, y1+y2) + which tends to get inlined. But if we first inline (+), it looks big, + so we don't inline it. Sigh. + + +* Don't inline constructors in INLINE RHSs. Ever. Otherwise rules don't match. + E.g. build + +* In ebnf2ps/Lexer.uncommentString, it would be a good idea to inline a constructor + that occurs once in each branch of a case. That way it doesn't get allocated + in the branches that don't use it. And in fact in this particular case + something else good happens. So CoreUnfold now does that. + +* Reverted to n_val_binders+2 in calcUnfoldingGuidance + Otherwise wrappers are inlined even if there's no benefit. + + +Week of Mon 18 Oct +~~~~~~~~~~ +* Arrange that simplConArgs works in one less pass than before. + This exposed a bug: a bogus call to completeBeta. + +* Add a top-level flag in CoreUnfolding, used in callSiteInline + +* Extend w/w to use etaExpandArity, so it does eta/coerce expansion + +* Don't float anything out of an INLINE. + Don't float things to top level unless they also escape a value lambda. + [see comments with SetLevels.lvlMFE + Without at least one of these changes, I found that + {-# INLINE concat #-} + concat = __inline (/\a -> foldr (++) []) + was getting floated to + concat = __inline( /\a -> lvl a ) + lvl = ...inlined version of foldr... + + Subsequently I found that not floating constants out of an INLINE + gave really bad code like + __inline (let x = e in \y -> ...) + so I now let things float out of INLINE + +* Implement inline phases. The meaning of the inline pragmas is + described in CoreUnfold.lhs + +* Implement the "reverse-mapping" idea for CSE; actually it turned out to be easier + to implement it in SetLevels, and may benefit full laziness too. + +Thurs 14 Oct +~~~~~~~~~~~~ +* It's a good idea to inline inRange. Consider + + index (l,h) i = case inRange (l,h) i of + True -> l+i + False -> error + inRange itself isn't strict in h, but if it't inlined then 'index' + *does* become strict in h. Interesting! + +* Big change to the way unfoldings and occurrence info is propagated in the simplifier + The plan is described in Subst.lhs with the Subst type + Occurrence info is now in a separate IdInfo field than user pragmas + +* I found that + (coerce T (coerce S (\x.e))) y + didn't simplify in one round. First we get to + (\x.e) y + and only then do the beta. Solution: cancel the coerces in the continuation + +* Amazingly, CoreUnfold wasn't counting the cost of a function an application. + +Early Oct +~~~~~~~~~ +* No commas between for-alls in RULES + +* Disable rules in initial simplifier run. Otherwise full laziness + doesn't get a chance to lift out a MFE before a rule (e.g. fusion) + zaps it. queens is a case in point + +* Improve float-out stuff significantly. The big change is that if we have + + \x -> ... /\a -> ...let p = ..a.. in let q = ...p... + + where p's rhs doesn't x, we abstract a from p, so that we can get p past x. + (We did that before.) But we also substitute (p a) for p in q, and then + we can do the same thing for q. (We didn't do that, so q got stuck.) + This is much better. It involves doing a substitution "as we go" in SetLevels, + though. + + +Weds 15 Sept +~~~~~~~~~~~~ +* exprIsDupable for an application (f e1 .. en) wasn't calling exprIsDupable + on the arguments!! So applications with few, but large, args were being dupliated. + +* sizeExpr on an application wasn't doing a nukeScrutDiscount on the arg of + an application!! So bogus discounts could accumulate from arguments! + +* Improve handling of INLINE pragmas in calcUnfoldingGuidance. It was really + wrong before + +* Substantially improve handling of coerces in worker/wrapper + +Tuesday 6 June +~~~~~~~~~~~~~~ +* Fix Kevin Atkinson's cant-find-instance bug. Turns out that Rename.slurpSourceRefs + needs to repeatedly call getImportedInstDecls, and then go back to slurping + source-refs. Comments with Rename.slurpSourceRefs. + +* Add a case to Simplify.mkDupableAlt for the quite-common case where there's + a very simple alternative, in which case there's no point in creating a + join-point binding. + +* Fix CoreUtils.exprOkForSpeculation so that it returns True of (==# a# b#). + This lack meant that + case ==# a# b# of { True -> x; False -> x } + was not simplifying + +* Make float-out dump bindings at the top of a function argument, as + at the top of a let(rec) rhs. See notes with FloatOut.floatRhs + +* Make the ArgOf case of mkDupableAlt generate a OneShot lambda. + This gave a noticeable boost to spectral/boyer2 + + +Monday 5 June +~~~~~~~~~~~~~ +Work, using IO.hPutStr as an example, to reduce the number of coerces. +The main idea is in WwLib.mkWWcoerce. The gloss is that we must do +the w/w split even for small non-recursive things. See notes with +WorkWrap.tryWw. + + +Friday 2 June +~~~~~~~~~~~~~ +Study why gen_regexps is slower than before. Problem is in IO.writeLines, +in particular the local defn shoveString. Two things are getting +in the way of arity expansion, which means we build far more function +closures than we should: + shove = \ x -> let lvl = \s -> ... + in \s -> ... lvl ... + +The two things are: + a) coerces + b) full laziness floats + + +Solution to (a): add coerces to the worker/wrapper stuff. +See notes with WwLib.mkWWcoerce. + +This further complicated getWorkerId, so I finally bit the bullet and +make the workerInfo field of the IdInfo work properly, including +under substitutions. Death to getWorkerId. + + + +Solution to (b): make all lambdas over realWorldStatePrimTy +into one-shot lambdas. This is a GROSS HACK. + +* Also make the occurrence analyser aware of one-shot lambdas. + + +Thurs 1 June +~~~~~~~~~~~~ +Fix SetLevels so that it does not clone top-level bindings, but it +*does* clone bindings that are destined for the top level. + +The global invariant is that the top level bindings are always +unique, and never cloned.