2 % (c) The GRASP/AQUA Project, Glasgow University, 1993-1996
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
12 #include "HsVersions.h"
14 import Id ( Id, DictVar, idType, mkUserLocal,
16 getIdSpecialisation, setIdSpecialisation,
18 IdSet, mkIdSet, addOneToIdSet, intersectIdSets, isEmptyIdSet,
19 emptyIdSet, unionIdSets, minusIdSet, unitIdSet, elementOfIdSet,
21 IdEnv, mkIdEnv, lookupIdEnv, addOneToIdEnv, delOneFromIdEnv
24 import Type ( Type, mkTyVarTy, splitSigmaTy, instantiateTy, isDictTy,
25 tyVarsOfType, tyVarsOfTypes, applyTys, mkForAllTys
27 import TyCon ( TyCon )
29 TyVarSet, mkTyVarSet, isEmptyTyVarSet, intersectTyVarSets,
30 elementOfTyVarSet, unionTyVarSets, emptyTyVarSet,
31 TyVarEnv, mkTyVarEnv, delFromTyVarEnv
34 import PprCore () -- Instances
35 import Name ( NamedThing(..), getSrcLoc )
36 import SpecEnv ( addToSpecEnv, lookupSpecEnv, specEnvValues )
38 import UniqSupply ( UniqSupply,
39 UniqSM, initUs, thenUs, returnUs, getUnique, mapUs
43 import Maybes ( MaybeErr(..), maybeToBool )
45 import List ( partition )
46 import Util ( zipEqual )
53 %************************************************************************
55 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
57 %************************************************************************
59 These notes describe how we implement specialisation to eliminate
60 overloading, and optionally to eliminate unboxed polymorphism, and
63 The specialisation pass is a partial evaluator which works on Core
64 syntax, complete with all the explicit dictionary application,
65 abstraction and construction as added by the type checker. The
66 existing type checker remains largely as it is.
68 One important thought: the {\em types} passed to an overloaded
69 function, and the {\em dictionaries} passed are mutually redundant.
70 If the same function is applied to the same type(s) then it is sure to
71 be applied to the same dictionary(s)---or rather to the same {\em
72 values}. (The arguments might look different but they will evaluate
75 Second important thought: we know that we can make progress by
76 treating dictionary arguments as static and worth specialising on. So
77 we can do without binding-time analysis, and instead specialise on
78 dictionary arguments and no others.
87 and suppose f is overloaded.
89 STEP 1: CALL-INSTANCE COLLECTION
91 We traverse <body>, accumulating all applications of f to types and
94 (Might there be partial applications, to just some of its types and
95 dictionaries? In principle yes, but in practice the type checker only
96 builds applications of f to all its types and dictionaries, so partial
97 applications could only arise as a result of transformation, and even
98 then I think it's unlikely. In any case, we simply don't accumulate such
99 partial applications.)
101 There's a choice of whether to collect details of all *polymorphic* functions
102 or simply all *overloaded* ones. How to sort this out?
103 Pass in a predicate on the function to say if it is "interesting"?
104 This is dependent on the user flags: SpecialiseOverloaded
110 So now we have a collection of calls to f:
114 Notice that f may take several type arguments. To avoid ambiguity, we
115 say that f is called at type t1/t2 and t3/t4.
117 We take equivalence classes using equality of the *types* (ignoring
118 the dictionary args, which as mentioned previously are redundant).
120 STEP 3: SPECIALISATION
122 For each equivalence class, choose a representative (f t1 t2 d1 d2),
123 and create a local instance of f, defined thus:
125 f@t1/t2 = <f_rhs> t1 t2 d1 d2
127 (f_rhs presumably has some big lambdas and dictionary lambdas, so lots
128 of simplification will now result.) Then we should recursively do
131 The new id has its own unique, but its print-name (if exported) has
132 an explicit representation of the instance types t1/t2.
134 Add this new id to f's IdInfo, to record that f has a specialised version.
136 Before doing any of this, check that f's IdInfo doesn't already
137 tell us about an existing instance of f at the required type/s.
138 (This might happen if specialisation was applied more than once, or
139 it might arise from user SPECIALIZE pragmas.)
143 Wait a minute! What if f is recursive? Then we can't just plug in
144 its right-hand side, can we?
146 But it's ok. The type checker *always* creates non-recursive definitions
147 for overloaded recursive functions. For example:
149 f x = f (x+x) -- Yes I know its silly
153 f a (d::Num a) = let p = +.sel a d
155 letrec fl (y::a) = fl (p y y)
159 We still have recusion for non-overloadd functions which we
160 speciailise, but the recursive call should get speciailised to the
161 same recursive version.
167 All this is crystal clear when the function is applied to *constant
168 types*; that is, types which have no type variables inside. But what if
169 it is applied to non-constant types? Suppose we find a call of f at type
170 t1/t2. There are two possibilities:
172 (a) The free type variables of t1, t2 are in scope at the definition point
173 of f. In this case there's no problem, we proceed just as before. A common
174 example is as follows. Here's the Haskell:
179 After typechecking we have
181 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
182 in +.sel a d (f a d y) (f a d y)
184 Notice that the call to f is at type type "a"; a non-constant type.
185 Both calls to f are at the same type, so we can specialise to give:
187 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
188 in +.sel a d (f@a y) (f@a y)
191 (b) The other case is when the type variables in the instance types
192 are *not* in scope at the definition point of f. The example we are
193 working with above is a good case. There are two instances of (+.sel a d),
194 but "a" is not in scope at the definition of +.sel. Can we do anything?
195 Yes, we can "common them up", a sort of limited common sub-expression deal.
198 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
199 f@a (x::a) = +.sel@a x x
200 in +.sel@a (f@a y) (f@a y)
202 This can save work, and can't be spotted by the type checker, because
203 the two instances of +.sel weren't originally at the same type.
207 * There are quite a few variations here. For example, the defn of
208 +.sel could be floated ouside the \y, to attempt to gain laziness.
209 It certainly mustn't be floated outside the \d because the d has to
212 * We don't want to inline f_rhs in this case, because
213 that will duplicate code. Just commoning up the call is the point.
215 * Nothing gets added to +.sel's IdInfo.
217 * Don't bother unless the equivalence class has more than one item!
219 Not clear whether this is all worth it. It is of course OK to
220 simply discard call-instances when passing a big lambda.
222 Polymorphism 2 -- Overloading
224 Consider a function whose most general type is
226 f :: forall a b. Ord a => [a] -> b -> b
228 There is really no point in making a version of g at Int/Int and another
229 at Int/Bool, because it's only instancing the type variable "a" which
230 buys us any efficiency. Since g is completely polymorphic in b there
231 ain't much point in making separate versions of g for the different
234 That suggests that we should identify which of g's type variables
235 are constrained (like "a") and which are unconstrained (like "b").
236 Then when taking equivalence classes in STEP 2, we ignore the type args
237 corresponding to unconstrained type variable. In STEP 3 we make
238 polymorphic versions. Thus:
240 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
242 This seems pretty simple, and a Good Thing.
244 Polymorphism 3 -- Unboxed
247 If we are speciailising at unboxed types we must speciailise
248 regardless of the overloading constraint. In the exaple above it is
249 worth speciailising at types Int/Int#, Int/Bool# and a/Int#, Int#/Int#
252 Note that specialising an overloaded type at an uboxed type requires
253 an unboxed instance -- we cannot default to an unspecialised version!
260 f x = let g p q = p==q
266 Before specialisation, leaving out type abstractions we have
268 f df x = let g :: Eq a => a -> a -> Bool
270 h :: Num a => a -> a -> (a, Bool)
271 h dh r s = let deq = eqFromNum dh
272 in (+ dh r s, g deq r s)
276 After specialising h we get a specialised version of h, like this:
278 h' r s = let deq = eqFromNum df
279 in (+ df r s, g deq r s)
281 But we can't naively make an instance for g from this, because deq is not in scope
282 at the defn of g. Instead, we have to float out the (new) defn of deq
283 to widen its scope. Notice that this floating can't be done in advance -- it only
284 shows up when specialisation is done.
286 DELICATE MATTER: the way we tell a dictionary binding is by looking to
287 see if it has a Dict type. If the type has been "undictify'd", so that
288 it looks like a tuple, then the dictionary binding won't be floated, and
289 an opportunity to specialise might be lost.
291 User SPECIALIZE pragmas
292 ~~~~~~~~~~~~~~~~~~~~~~~
293 Specialisation pragmas can be digested by the type checker, and implemented
294 by adding extra definitions along with that of f, in the same way as before
296 f@t1/t2 = <f_rhs> t1 t2 d1 d2
298 Indeed the pragmas *have* to be dealt with by the type checker, because
299 only it knows how to build the dictionaries d1 and d2! For example
301 g :: Ord a => [a] -> [a]
302 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
304 Here, the specialised version of g is an application of g's rhs to the
305 Ord dictionary for (Tree Int), which only the type checker can conjure
306 up. There might not even *be* one, if (Tree Int) is not an instance of
307 Ord! (All the other specialision has suitable dictionaries to hand
310 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
311 it is buried in a complex (as-yet-un-desugared) binding group.
314 f@t1/t2 = f* t1 t2 d1 d2
316 where f* is the Id f with an IdInfo which says "inline me regardless!".
317 Indeed all the specialisation could be done in this way.
318 That in turn means that the simplifier has to be prepared to inline absolutely
319 any in-scope let-bound thing.
322 Again, the pragma should permit polymorphism in unconstrained variables:
324 h :: Ord a => [a] -> b -> b
325 {-# SPECIALIZE h :: [Int] -> b -> b #-}
327 We *insist* that all overloaded type variables are specialised to ground types,
328 (and hence there can be no context inside a SPECIALIZE pragma).
329 We *permit* unconstrained type variables to be specialised to
331 - or left as a polymorphic type variable
332 but nothing in between. So
334 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
336 is *illegal*. (It can be handled, but it adds complication, and gains the
340 SPECIALISING INSTANCE DECLARATIONS
341 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
344 instance Foo a => Foo [a] where
346 {-# SPECIALIZE instance Foo [Int] #-}
348 The original instance decl creates a dictionary-function
351 dfun.Foo.List :: forall a. Foo a -> Foo [a]
353 The SPECIALIZE pragma just makes a specialised copy, just as for
354 ordinary function definitions:
356 dfun.Foo.List@Int :: Foo [Int]
357 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
359 The information about what instance of the dfun exist gets added to
360 the dfun's IdInfo in the same way as a user-defined function too.
362 In fact, matters are a little bit more complicated than this.
363 When we make one of these specialised instances, we are defining
364 a constant dictionary, and so we want immediate access to its constant
365 methods and superclasses. Indeed, these constant methods and superclasses
366 must be in the IdInfo for the class selectors! We need help from the
367 typechecker to sort this out, perhaps by generating a separate IdInfo
370 Automatic instance decl specialisation?
371 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
372 Can instance decls be specialised automatically? It's tricky.
373 We could collect call-instance information for each dfun, but
374 then when we specialised their bodies we'd get new call-instances
375 for ordinary functions; and when we specialised their bodies, we might get
376 new call-instances of the dfuns, and so on. This all arises because of
377 the unrestricted mutual recursion between instance decls and value decls.
379 Furthermore, instance decls are usually exported and used non-locally,
380 so we'll want to compile enough to get those specialisations done.
382 Lastly, there's no such thing as a local instance decl, so we can
383 survive solely by spitting out *usage* information, and then reading that
384 back in as a pragma when next compiling the file. So for now,
385 we only specialise instance decls in response to pragmas.
387 That means that even if an instance decl ain't otherwise exported it
388 needs to be spat out as with a SPECIALIZE pragma. Furthermore, it needs
389 something to say which module defined the instance, so the usage info
390 can be fed into the right reqts info file. Blegh.
393 SPECIAILISING DATA DECLARATIONS
394 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
396 With unboxed specialisation (or full specialisation) we also require
397 data types (and their constructors) to be speciailised on unboxed
400 In addition to normal call instances we gather TyCon call instances at
401 unboxed types, determine equivalence classes for the locally defined
402 TyCons and build speciailised data constructor Ids for each TyCon and
403 substitute these in the Con calls.
405 We need the list of local TyCons to partition the TyCon instance info.
406 We pass out a FiniteMap from local TyCons to Specialised Instances to
407 give to the interface and code genertors.
409 N.B. The specialised data constructors reference the original data
410 constructor and type constructor which do not have the updated
411 specialisation info attached. Any specialisation info must be
412 extracted from the TyCon map returned.
415 SPITTING OUT USAGE INFORMATION
416 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
418 To spit out usage information we need to traverse the code collecting
419 call-instance information for all imported (non-prelude?) functions
420 and data types. Then we equivalence-class it and spit it out.
422 This is done at the top-level when all the call instances which escape
423 must be for imported functions and data types.
426 Partial specialisation by pragmas
427 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
428 What about partial specialisation:
430 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
431 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
435 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
437 Seems quite reasonable. Similar things could be done with instance decls:
439 instance (Foo a, Foo b) => Foo (a,b) where
441 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
442 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
444 Ho hum. Things are complex enough without this. I pass.
447 Requirements for the simplifer
448 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
449 The simplifier has to be able to take advantage of the specialisation.
451 * When the simplifier finds an application of a polymorphic f, it looks in
452 f's IdInfo in case there is a suitable instance to call instead. This converts
454 f t1 t2 d1 d2 ===> f_t1_t2
456 Note that the dictionaries get eaten up too!
458 * Dictionary selection operations on constant dictionaries must be
461 +.sel Int d ===> +Int
463 The obvious way to do this is in the same way as other specialised
464 calls: +.sel has inside it some IdInfo which tells that if it's applied
465 to the type Int then it should eat a dictionary and transform to +Int.
467 In short, dictionary selectors need IdInfo inside them for constant
470 * Exactly the same applies if a superclass dictionary is being
473 Eq.sel Int d ===> dEqInt
475 * Something similar applies to dictionary construction too. Suppose
476 dfun.Eq.List is the function taking a dictionary for (Eq a) to
477 one for (Eq [a]). Then we want
479 dfun.Eq.List Int d ===> dEq.List_Int
481 Where does the Eq [Int] dictionary come from? It is built in
482 response to a SPECIALIZE pragma on the Eq [a] instance decl.
484 In short, dfun Ids need IdInfo with a specialisation for each
485 constant instance of their instance declaration.
488 What does the specialisation IdInfo look like?
489 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
492 [Maybe Type] -- Instance types
493 Int -- No of dicts to eat
494 Id -- Specialised version
496 For example, if f has this SpecInfo:
498 SpecInfo [Just t1, Nothing, Just t3] 2 f'
502 f t1 t2 t3 d1 d2 ===> f t2
504 The "Nothings" identify type arguments in which the specialised
505 version is polymorphic.
507 What can't be done this way?
508 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
509 There is no way, post-typechecker, to get a dictionary for (say)
510 Eq a from a dictionary for Eq [a]. So if we find
514 we can't transform to
519 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
521 Of course, we currently have no way to automatically derive
522 eqList, nor to connect it to the Eq [a] instance decl, but you
523 can imagine that it might somehow be possible. Taking advantage
524 of this is permanently ruled out.
526 Still, this is no great hardship, because we intend to eliminate
527 overloading altogether anyway!
532 What about types/classes mentioned in SPECIALIZE pragmas spat out,
533 but not otherwise exported. Even if they are exported, what about
534 their original names.
536 Suggestion: use qualified names in pragmas, omitting module for
537 prelude and "this module".
544 f a (d::Num a) = let g = ...
546 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
548 Here, g is only called at one type, but the dictionary isn't in scope at the
549 definition point for g. Usually the type checker would build a
550 definition for d1 which enclosed g, but the transformation system
551 might have moved d1's defn inward.
557 What should we do when a value is specialised to a *strict* unboxed value?
559 map_*_* f (x:xs) = let h = f x
563 Could convert let to case:
565 map_*_Int# f (x:xs) = case f x of h# ->
569 This may be undesirable since it forces evaluation here, but the value
570 may not be used in all branches of the body. In the general case this
571 transformation is impossible since the mutual recursion in a letrec
572 cannot be expressed as a case.
574 There is also a problem with top-level unboxed values, since our
575 implementation cannot handle unboxed values at the top level.
577 Solution: Lift the binding of the unboxed value and extract it when it
580 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
585 Now give it to the simplifier and the _Lifting will be optimised away.
587 The benfit is that we have given the specialised "unboxed" values a
588 very simplep lifted semantics and then leave it up to the simplifier to
589 optimise it --- knowing that the overheads will be removed in nearly
592 In particular, the value will only be evaluted in the branches of the
593 program which use it, rather than being forced at the point where the
594 value is bound. For example:
596 filtermap_*_* p f (x:xs)
603 filtermap_*_Int# p f (x:xs)
604 = let h = case (f x) of h# -> _Lift h#
607 True -> case h of _Lift h#
611 The binding for h can still be inlined in the one branch and the
615 Question: When won't the _Lifting be eliminated?
617 Answer: When they at the top-level (where it is necessary) or when
618 inlining would duplicate work (or possibly code depending on
619 options). However, the _Lifting will still be eliminated if the
620 strictness analyser deems the lifted binding strict.
623 A note about non-tyvar dictionaries
624 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
625 Some Ids have types like
627 forall a,b,c. Eq a -> Ord [a] -> tau
629 This seems curious at first, because we usually only have dictionary
630 args whose types are of the form (C a) where a is a type variable.
631 But this doesn't hold for the functions arising from instance decls,
632 which sometimes get arguements with types of form (C (T a)) for some
635 Should we specialise wrt this compound-type dictionary? We used to say
637 "This is a heuristic judgement, as indeed is the fact that we
638 specialise wrt only dictionaries. We choose *not* to specialise
639 wrt compound dictionaries because at the moment the only place
640 they show up is in instance decls, where they are simply plugged
641 into a returned dictionary. So nothing is gained by specialising
644 But it is simpler and more uniform to specialise wrt these dicts too;
645 and in future GHC is likely to support full fledged type signatures
647 f ;: Eq [(a,b)] => ...
650 %************************************************************************
652 \subsubsection{The new specialiser}
654 %************************************************************************
656 Our basic game plan is this. For let(rec) bound function
657 f :: (C a, D c) => (a,b,c,d) -> Bool
659 * Find any specialised calls of f, (f ts ds), where
660 ts are the type arguments t1 .. t4, and
661 ds are the dictionary arguments d1 .. d2.
663 * Add a new definition for f1 (say):
665 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
667 Note that we abstract over the unconstrained type arguments.
671 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
673 to the specialisations of f. This will be used by the
674 simplifier to replace calls
675 (f t1 t2 t3 t4) da db
677 (\d1 d1 -> f1 t2 t4) da db
679 All the stuff about how many dictionaries to discard, and what types
680 to apply the specialised function to, are handled by the fact that the
681 SpecEnv contains a template for the result of the specialisation.
683 We don't build *partial* specialisations for f. For example:
685 f :: Eq a => a -> a -> Bool
686 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
688 Here, little is gained by making a specialised copy of f.
689 There's a distinct danger that the specialised version would
690 first build a dictionary for (Eq b, Eq c), and then select the (==)
691 method from it! Even if it didn't, not a great deal is saved.
693 We do, however, generate polymorphic, but not overloaded, specialisations:
695 f :: Eq a => [a] -> b -> b -> b
696 {#- SPECIALISE f :: [Int] -> b -> b -> b #-}
698 Hence, the invariant is this:
700 *** no specialised version is overloaded ***
703 %************************************************************************
705 \subsubsection{The exported function}
707 %************************************************************************
710 specProgram :: UniqSupply -> [CoreBinding] -> [CoreBinding]
712 = initSM us (go binds `thenSM` \ (binds', uds') ->
713 returnSM (dumpAllDictBinds uds' binds')
716 go [] = returnSM ([], emptyUDs)
717 go (bind:binds) = go binds `thenSM` \ (binds', uds) ->
718 specBind bind uds `thenSM` \ (bind', uds') ->
719 returnSM (bind' ++ binds', uds')
722 %************************************************************************
724 \subsubsection{@specExpr@: the main function}
726 %************************************************************************
729 specExpr :: CoreExpr -> SpecM (CoreExpr, UsageDetails)
731 ---------------- First the easy cases --------------------
732 specExpr e@(Var _) = returnSM (e, emptyUDs)
733 specExpr e@(Lit _) = returnSM (e, emptyUDs)
734 specExpr e@(Con _ _) = returnSM (e, emptyUDs)
735 specExpr e@(Prim _ _) = returnSM (e, emptyUDs)
737 specExpr (Coerce co ty body)
738 = specExpr body `thenSM` \ (body', uds) ->
739 returnSM (Coerce co ty body', uds)
741 specExpr (SCC cc body)
742 = specExpr body `thenSM` \ (body', uds) ->
743 returnSM (SCC cc body', uds)
746 ---------------- Applications might generate a call instance --------------------
747 specExpr e@(App fun arg)
750 go (App fun arg) args = go fun (arg:args)
751 go (Var f) args = returnSM (e, mkCallUDs f args)
752 go other args = specExpr other `thenSM` \ (e', uds) ->
753 returnSM (foldl App e' args, uds)
755 ---------------- Lambda/case require dumping of usage details --------------------
757 = specExpr body `thenSM` \ (body', uds) ->
759 (filtered_uds, body'') = dumpUDs bndrs uds body'
761 returnSM (foldr Lam body'' bndrs, filtered_uds)
763 (bndrs, body) = go [] e
765 -- More efficient to collect a group of binders together all at once
766 go bndrs (Lam bndr e) = go (bndr:bndrs) e
767 go bndrs e = (reverse bndrs, e)
770 specExpr (Case scrut alts)
771 = specExpr scrut `thenSM` \ (scrut', uds_scrut) ->
772 spec_alts alts `thenSM` \ (alts', uds_alts) ->
773 returnSM (Case scrut' alts', uds_scrut `plusUDs` uds_alts)
775 spec_alts (AlgAlts alts deflt)
776 = mapAndCombineSM spec_alg_alt alts `thenSM` \ (alts', uds1) ->
777 spec_deflt deflt `thenSM` \ (deflt', uds2) ->
778 returnSM (AlgAlts alts' deflt', uds1 `plusUDs` uds2)
780 spec_alts (PrimAlts alts deflt)
781 = mapAndCombineSM spec_prim_alt alts `thenSM` \ (alts', uds1) ->
782 spec_deflt deflt `thenSM` \ (deflt', uds2) ->
783 returnSM (PrimAlts alts' deflt', uds1 `plusUDs` uds2)
785 spec_alg_alt (con, args, rhs)
786 = specExpr rhs `thenSM` \ (rhs', uds) ->
788 (uds', rhs'') = dumpUDs (map ValBinder args) uds rhs'
790 returnSM ((con, args, rhs''), uds')
792 spec_prim_alt (lit, rhs)
793 = specExpr rhs `thenSM` \ (rhs', uds) ->
794 returnSM ((lit, rhs'), uds)
796 spec_deflt NoDefault = returnSM (NoDefault, emptyUDs)
797 spec_deflt (BindDefault arg rhs)
798 = specExpr rhs `thenSM` \ (rhs', uds) ->
800 (uds', rhs'') = dumpUDs [ValBinder arg] uds rhs'
802 returnSM (BindDefault arg rhs'', uds')
804 ---------------- Finally, let is the interesting case --------------------
805 specExpr (Let bind body)
806 = -- Deal with the body
807 specExpr body `thenSM` \ (body', body_uds) ->
809 -- Deal with the bindings
810 specBind bind body_uds `thenSM` \ (binds', uds) ->
813 returnSM (foldr Let body' binds', uds)
816 %************************************************************************
818 \subsubsection{Dealing with a binding}
820 %************************************************************************
823 specBind :: CoreBinding
824 -> UsageDetails -- Info on how the scope of the binding
825 -> SpecM ([CoreBinding], -- New bindings
826 UsageDetails) -- And info to pass upstream
828 specBind (NonRec bndr rhs) body_uds
829 | isDictTy (idType bndr)
830 = -- It's a dictionary binding
831 -- Pick it up and float it outwards.
832 specExpr rhs `thenSM` \ (rhs', rhs_uds) ->
834 all_uds = rhs_uds `plusUDs` addDictBind body_uds bndr rhs'
836 returnSM ([], all_uds)
839 = -- Deal with the RHS, specialising it according
840 -- to the calls found in the body
841 specDefn (calls body_uds) (bndr,rhs) `thenSM` \ ((bndr',rhs'), spec_defns, spec_uds) ->
843 (all_uds, (dict_binds, dump_calls))
844 = splitUDs [ValBinder bndr] (spec_uds `plusUDs` body_uds)
846 -- If we make specialisations then we Rec the whole lot together
847 -- If not, leave it as a NonRec
848 new_bind | null spec_defns = NonRec bndr' rhs'
849 | otherwise = Rec ((bndr',rhs'):spec_defns)
851 returnSM ( new_bind : dict_binds, all_uds )
853 specBind (Rec pairs) body_uds
854 = mapSM (specDefn (calls body_uds)) pairs `thenSM` \ stuff ->
856 (pairs', spec_defns_s, spec_uds_s) = unzip3 stuff
857 spec_defns = concat spec_defns_s
858 spec_uds = plusUDList spec_uds_s
859 (all_uds, (dict_binds, dump_calls))
860 = splitUDs (map (ValBinder . fst) pairs) (spec_uds `plusUDs` body_uds)
861 new_bind = Rec (spec_defns ++ pairs')
863 returnSM ( new_bind : dict_binds, all_uds )
865 specDefn :: CallDetails -- Info on how it is used in its scope
866 -> (Id, CoreExpr) -- The thing being bound and its un-processed RHS
867 -> SpecM ((Id, CoreExpr), -- The thing and its processed RHS
868 -- the Id may now have specialisations attached
869 [(Id,CoreExpr)], -- Extra, specialised bindings
870 UsageDetails -- Stuff to fling upwards from the RHS and its
871 ) -- specialised versions
873 specDefn calls (fn, rhs)
874 -- The first case is the interesting one
875 | n_tyvars == length rhs_tyvars -- Rhs of fn's defn has right number of big lambdas
876 && n_dicts <= length rhs_bndrs -- and enough dict args
877 && not (null calls_for_me) -- And there are some calls to specialise
878 = -- Specialise the body of the function
879 specExpr body `thenSM` \ (body', body_uds) ->
881 (float_uds, bound_uds@(dict_binds,_)) = splitUDs rhs_bndrs body_uds
884 -- Make a specialised version for each call in calls_for_me
885 mapSM (spec_call bound_uds) calls_for_me `thenSM` \ stuff ->
887 (spec_defns, spec_uds, spec_env_stuff) = unzip3 stuff
889 fn' = addIdSpecialisations fn spec_env_stuff
890 rhs' = foldr Lam (foldr Let body' dict_binds) rhs_bndrs
892 returnSM ((fn',rhs'),
894 float_uds `plusUDs` plusUDList spec_uds)
896 | otherwise -- No calls or RHS doesn't fit our preconceptions
897 = specExpr rhs `thenSM` \ (rhs', rhs_uds) ->
898 returnSM ((fn, rhs'), [], rhs_uds)
902 (tyvars, theta, tau) = splitSigmaTy fn_type
903 n_tyvars = length tyvars
904 n_dicts = length theta
905 mk_spec_tys call_ts = zipWith mk_spec_ty call_ts tyvars
907 mk_spec_ty (Just ty) _ = ty
908 mk_spec_ty Nothing tyvar = mkTyVarTy tyvar
910 (rhs_tyvars, rhs_ids, rhs_body) = collectBinders rhs
911 rhs_dicts = take n_dicts rhs_ids
912 rhs_bndrs = map TyBinder rhs_tyvars ++ map ValBinder rhs_dicts
913 body = mkValLam (drop n_dicts rhs_ids) rhs_body
914 -- Glue back on the non-dict lambdas
916 calls_for_me = case lookupFM calls fn of
918 Just cs -> fmToList cs
920 -- Filter out calls for which we already have a specialisation
921 calls_to_spec = filter spec_me calls_for_me
922 spec_me (call_ts, _) = not (maybeToBool (lookupSpecEnv id_spec_env (mk_spec_tys call_ts)))
923 id_spec_env = getIdSpecialisation fn
925 ----------------------------------------------------------
926 -- Specialise to one particular call pattern
927 spec_call :: ProtoUsageDetails -- From the original body, captured by
928 -- the dictionary lambdas
929 -> ([Maybe Type], [DictVar]) -- Call instance
930 -> SpecM ((Id,CoreExpr), -- Specialised definition
931 UsageDetails, -- Usage details from specialised body
932 ([TyVar], [Type], CoreExpr)) -- Info for the Id's SpecEnv
933 spec_call bound_uds (call_ts, call_ds)
934 = ASSERT( length call_ts == n_tyvars && length call_ds == n_dicts )
935 -- Calls are only recorded for properly-saturated applications
937 -- Supppose the call is for f [Just t1, Nothing, Just t3, Nothing] [d1, d2]
939 -- Construct the new binding
940 -- f1 = /\ b d -> (..rhs of f..) t1 b t3 d d1 d2
941 -- and the type of this binder
943 spec_tyvars = [tyvar | (tyvar, Nothing) <- tyvars `zip` call_ts]
944 spec_tys = mk_spec_tys call_ts
945 spec_rhs = mkTyLam spec_tyvars $
946 mkGenApp rhs (map TyArg spec_tys ++ map VarArg call_ds)
947 spec_id_ty = mkForAllTys spec_tyvars (instantiateTy ty_env tau)
948 ty_env = mkTyVarEnv (zipEqual "spec_call" tyvars spec_tys)
950 newIdSM fn spec_id_ty `thenSM` \ spec_f ->
953 -- Construct the stuff for f's spec env
954 -- [b,d] [t1,b,t3,d] |-> \d1 d2 -> f1 b d
956 spec_env_rhs = mkValLam call_ds $
957 mkTyApp (Var spec_f) $
958 map mkTyVarTy spec_tyvars
959 spec_env_info = (spec_tyvars, spec_tys, spec_env_rhs)
962 -- Specialise the UDs from f's RHS
964 -- Only the overloaded tyvars should be free in the uds
965 ty_env = [ (rhs_tyvar,ty)
966 | (rhs_tyvar, Just ty) <- zipEqual "specUDs1" rhs_tyvars call_ts
968 dict_env = zipEqual "specUDs2" rhs_dicts call_ds
970 specUDs ty_env dict_env bound_uds `thenSM` \ spec_uds ->
972 returnSM ((spec_f, spec_rhs),
978 %************************************************************************
980 \subsubsection{UsageDetails and suchlike}
982 %************************************************************************
985 type FreeDicts = IdSet
989 dict_binds :: !(Bag (DictVar, CoreExpr, TyVarSet, FreeDicts)),
990 -- Floated dictionary bindings
991 -- The order is important;
992 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
993 -- (Remember, Bags preserve order in GHC.)
994 -- The FreeDicts is the free vars of the RHS
996 calls :: !CallDetails
999 emptyUDs = MkUD { dict_binds = emptyBag, calls = emptyFM }
1001 type ProtoUsageDetails = ([CoreBinding], -- Dict bindings
1002 [(Id, [Maybe Type], [DictVar])]
1005 ------------------------------------------------------------
1006 type CallDetails = FiniteMap Id CallInfo
1007 type CallInfo = FiniteMap [Maybe Type] -- Nothing => unconstrained type argument
1008 [DictVar] -- Dict args
1009 -- The finite maps eliminate duplicates
1010 -- The list of types and dictionaries is guaranteed to
1011 -- match the type of f
1013 callDetailsToList calls = [ (id,tys,dicts)
1014 | (id,fm) <- fmToList calls,
1015 (tys,dicts) <- fmToList fm
1018 listToCallDetails calls = foldr (unionCalls . singleCall) emptyFM calls
1020 unionCalls :: CallDetails -> CallDetails -> CallDetails
1021 unionCalls c1 c2 = plusFM_C plusFM c1 c2
1023 singleCall (id, tys, dicts) = unitFM id (unitFM tys dicts)
1027 || length spec_tys /= n_tyvars
1028 || length dicts /= n_dicts
1029 = emptyUDs -- Not overloaded
1032 = MkUD {dict_binds = emptyBag,
1033 calls = singleCall (f, spec_tys, dicts)
1036 (tyvars, theta, tau) = splitSigmaTy (idType f)
1037 constrained_tyvars = foldr (unionTyVarSets . tyVarsOfTypes . snd) emptyTyVarSet theta
1038 n_tyvars = length tyvars
1039 n_dicts = length theta
1041 spec_tys = [mk_spec_ty tv ty | (tv, TyArg ty) <- tyvars `zip` args]
1042 dicts = [d | (_, VarArg d) <- theta `zip` (drop n_tyvars args)]
1044 mk_spec_ty tyvar ty | tyvar `elementOfTyVarSet` constrained_tyvars
1049 ------------------------------------------------------------
1050 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
1051 plusUDs (MkUD {dict_binds = db1, calls = calls1})
1052 (MkUD {dict_binds = db2, calls = calls2})
1053 = MkUD {dict_binds, calls}
1055 dict_binds = db1 `unionBags` db2
1056 calls = calls1 `unionCalls` calls2
1058 plusUDList = foldr plusUDs emptyUDs
1060 mkDB dict rhs = (dict, rhs, db_ftvs, db_fvs)
1062 db_ftvs = tyVarsOfType (idType dict) -- Superset of RHS fvs
1063 db_fvs = dictRhsFVs rhs
1065 addDictBind uds dict rhs = uds { dict_binds = mkDB dict rhs `consBag` dict_binds uds }
1067 dumpAllDictBinds (MkUD {dict_binds = dbs}) binds
1068 = foldrBag add binds dbs
1070 add (dict,rhs,_,_) binds = NonRec dict rhs : binds
1072 dumpUDs :: [CoreBinder]
1073 -> UsageDetails -> CoreExpr
1074 -> (UsageDetails, CoreExpr)
1075 dumpUDs bndrs uds body
1076 = (free_uds, foldr Let body dict_binds)
1078 (free_uds, (dict_binds, _)) = splitUDs bndrs uds
1080 splitUDs :: [CoreBinder]
1082 -> (UsageDetails, -- These don't mention the binders
1083 ProtoUsageDetails) -- These do
1085 splitUDs bndrs uds@(MkUD {dict_binds = orig_dbs,
1086 calls = orig_calls})
1088 = if isEmptyBag dump_dbs && null dump_calls then
1089 -- Common case: binder doesn't affect floats
1093 -- Binders bind some of the fvs of the floats
1094 (MkUD {dict_binds = free_dbs,
1095 calls = listToCallDetails free_calls},
1096 (bagToList dump_dbs, dump_calls)
1100 tyvar_set = mkTyVarSet [tv | TyBinder tv <- bndrs]
1101 id_set = mkIdSet [id | ValBinder id <- bndrs]
1103 (free_dbs, dump_dbs, dump_idset)
1104 = foldlBag dump_db (emptyBag, emptyBag, id_set) orig_dbs
1105 -- Important that it's foldl not foldr;
1106 -- we're accumulating the set of dumped ids in dump_set
1108 -- Filter out any calls that mention things that are being dumped
1109 -- Don't need to worry about the tyvars because the dicts will
1110 -- spot the captured ones; any fully polymorphic arguments will
1111 -- be Nothings in the call details
1112 orig_call_list = callDetailsToList orig_calls
1113 (dump_calls, free_calls) = partition captured orig_call_list
1114 captured (id,tys,dicts) = any (`elementOfIdSet` dump_idset) (id:dicts)
1116 dump_db (free_dbs, dump_dbs, dump_idset) db@(dict, rhs, ftvs, fvs)
1117 | isEmptyIdSet (dump_idset `intersectIdSets` fvs)
1118 && isEmptyTyVarSet (tyvar_set `intersectTyVarSets` ftvs)
1119 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
1121 | otherwise -- Dump it
1122 = (free_dbs, dump_dbs `snocBag` NonRec dict rhs,
1123 dump_idset `addOneToIdSet` dict)
1126 Given a type and value substitution, specUDs creates a specialised copy of
1130 specUDs :: [(TyVar,Type)] -> [(DictVar,DictVar)] -> ProtoUsageDetails -> SpecM UsageDetails
1131 specUDs tv_env_list dict_env_list (dbs, calls)
1132 = specDBs dict_env dbs `thenSM` \ (dict_env', dbs') ->
1133 returnSM (MkUD { dict_binds = dbs',
1134 calls = listToCallDetails (map (inst_call dict_env') calls)
1137 tv_env = mkTyVarEnv tv_env_list
1138 dict_env = mkIdEnv dict_env_list
1140 inst_call dict_env (id, tys, dicts) = (id, map inst_maybe_ty tys,
1141 map (lookupId dict_env) dicts)
1143 inst_maybe_ty Nothing = Nothing
1144 inst_maybe_ty (Just ty) = Just (instantiateTy tv_env ty)
1147 = returnSM (dict_env, emptyBag)
1148 specDBs dict_env (NonRec dict rhs : dbs)
1149 = newIdSM dict (instantiateTy tv_env (idType dict)) `thenSM` \ dict' ->
1151 dict_env' = addOneToIdEnv dict_env dict dict'
1152 rhs' = instantiateDictRhs tv_env dict_env rhs
1154 specDBs dict_env' dbs `thenSM` \ (dict_env'', dbs') ->
1155 returnSM ( dict_env'', mkDB dict' rhs' `consBag` dbs' )
1158 %************************************************************************
1160 \subsubsection{Boring helper functions}
1162 %************************************************************************
1165 lookupId:: IdEnv Id -> Id -> Id
1166 lookupId env id = case lookupIdEnv env id of
1170 instantiateDictRhs :: TyVarEnv Type -> IdEnv Id -> CoreExpr -> CoreExpr
1171 -- Cheapo function for simple RHSs
1172 instantiateDictRhs ty_env id_env rhs
1175 go_arg (VarArg a) = VarArg (lookupId id_env a)
1176 go_arg (TyArg t) = TyArg (instantiateTy ty_env t)
1178 go (App e1 arg) = App (go e1) (go_arg arg)
1179 go (Var v) = Var (lookupId id_env v)
1181 go (Con con args) = Con con (map go_arg args)
1182 go (Coerce c t e) = Coerce c (instantiateTy ty_env t) (go e)
1183 go (Case e alts) = Case (go e) alts -- See comment below re alts
1184 go other = pprPanic "instantiateDictRhs" (ppr rhs)
1187 dictRhsFVs :: CoreExpr -> IdSet
1188 -- Cheapo function for simple RHSs
1192 go (App e1 (VarArg a)) = go e1 `addOneToIdSet` a
1193 go (App e1 (TyArg t)) = go e1
1194 go (Var v) = unitIdSet v
1195 go (Lit l) = emptyIdSet
1196 go (Con _ args) = mkIdSet [id | VarArg id <- args]
1197 go (Coerce _ _ e) = go e
1199 go (Case e _) = go e -- Claim: no free dictionaries in the alternatives
1200 -- These case expressions are of the form
1201 -- case d of { D a b c -> b }
1203 go other = pprPanic "dictRhsFVs" (ppr e)
1206 addIdSpecialisations id spec_stuff
1207 = (if not (null errs) then
1208 pprTrace "Duplicate specialisations" (vcat (map ppr errs))
1211 setIdSpecialisation id new_spec_env
1213 (new_spec_env, errs) = foldr add (getIdSpecialisation id, []) spec_stuff
1215 add (tyvars, tys, template) (spec_env, errs)
1216 = case addToSpecEnv True spec_env tyvars tys template of
1217 Succeeded spec_env' -> (spec_env', errs)
1218 Failed err -> (spec_env, err:errs)
1220 -- Given an Id, isSpecVars returns all its specialisations.
1221 -- We extract these from its SpecEnv.
1222 -- This is used by the occurrence analyser and free-var finder;
1223 -- we regard an Id's specialisations as free in the Id's definition.
1225 idSpecVars :: Id -> [Id]
1227 = map get_spec (specEnvValues (getIdSpecialisation id))
1229 -- get_spec is another cheapo function like dictRhsFVs
1230 -- It knows what these specialisation temlates look like,
1231 -- and just goes for the jugular
1232 get_spec (App f _) = get_spec f
1233 get_spec (Lam _ b) = get_spec b
1234 get_spec (Var v) = v
1236 ----------------------------------------
1237 type SpecM a = UniqSM a
1241 getUniqSM = getUnique
1245 mapAndCombineSM f [] = returnSM ([], emptyUDs)
1246 mapAndCombineSM f (x:xs) = f x `thenSM` \ (y, uds1) ->
1247 mapAndCombineSM f xs `thenSM` \ (ys, uds2) ->
1248 returnSM (y:ys, uds1 `plusUDs` uds2)
1250 newIdSM old_id new_ty
1251 = getUnique `thenSM` \ uniq ->
1252 returnSM (mkUserLocal (getOccName old_id)