2 % (c) The GRASP/AQUA Project, Glasgow University, 1993-1996
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
11 #include "HsVersions.h"
13 import Id ( Id, DictVar, idType, mkUserLocal,
15 getIdSpecialisation, addIdSpecialisation, isSpecPragmaId,
17 IdSet, mkIdSet, addOneToIdSet, intersectIdSets, isEmptyIdSet,
18 emptyIdSet, unionIdSets, minusIdSet, unitIdSet, elementOfIdSet,
20 IdEnv, mkIdEnv, lookupIdEnv, addOneToIdEnv
23 import Type ( Type, mkTyVarTy, splitSigmaTy, instantiateTy, isDictTy,
24 tyVarsOfType, tyVarsOfTypes, applyTys, mkForAllTys
26 import TyCon ( TyCon )
28 TyVarSet, mkTyVarSet, isEmptyTyVarSet, intersectTyVarSets,
29 elementOfTyVarSet, unionTyVarSets, emptyTyVarSet,
33 import OccurAnal ( occurAnalyseGlobalExpr )
34 import Name ( NamedThing(..), getSrcLoc )
35 import SpecEnv ( addToSpecEnv )
37 import UniqSupply ( UniqSupply,
38 UniqSM, initUs, thenUs, returnUs, getUnique, mapUs
42 import Maybes ( MaybeErr(..) )
44 import List ( partition )
45 import Util ( zipEqual )
52 %************************************************************************
54 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
56 %************************************************************************
58 These notes describe how we implement specialisation to eliminate
59 overloading, and optionally to eliminate unboxed polymorphism, and
62 The specialisation pass is a partial evaluator which works on Core
63 syntax, complete with all the explicit dictionary application,
64 abstraction and construction as added by the type checker. The
65 existing type checker remains largely as it is.
67 One important thought: the {\em types} passed to an overloaded
68 function, and the {\em dictionaries} passed are mutually redundant.
69 If the same function is applied to the same type(s) then it is sure to
70 be applied to the same dictionary(s)---or rather to the same {\em
71 values}. (The arguments might look different but they will evaluate
74 Second important thought: we know that we can make progress by
75 treating dictionary arguments as static and worth specialising on. So
76 we can do without binding-time analysis, and instead specialise on
77 dictionary arguments and no others.
86 and suppose f is overloaded.
88 STEP 1: CALL-INSTANCE COLLECTION
90 We traverse <body>, accumulating all applications of f to types and
93 (Might there be partial applications, to just some of its types and
94 dictionaries? In principle yes, but in practice the type checker only
95 builds applications of f to all its types and dictionaries, so partial
96 applications could only arise as a result of transformation, and even
97 then I think it's unlikely. In any case, we simply don't accumulate such
98 partial applications.)
100 There's a choice of whether to collect details of all *polymorphic* functions
101 or simply all *overloaded* ones. How to sort this out?
102 Pass in a predicate on the function to say if it is "interesting"?
103 This is dependent on the user flags: SpecialiseOverloaded
109 So now we have a collection of calls to f:
113 Notice that f may take several type arguments. To avoid ambiguity, we
114 say that f is called at type t1/t2 and t3/t4.
116 We take equivalence classes using equality of the *types* (ignoring
117 the dictionary args, which as mentioned previously are redundant).
119 STEP 3: SPECIALISATION
121 For each equivalence class, choose a representative (f t1 t2 d1 d2),
122 and create a local instance of f, defined thus:
124 f@t1/t2 = <f_rhs> t1 t2 d1 d2
126 (f_rhs presumably has some big lambdas and dictionary lambdas, so lots
127 of simplification will now result.) Then we should recursively do
130 The new id has its own unique, but its print-name (if exported) has
131 an explicit representation of the instance types t1/t2.
133 Add this new id to f's IdInfo, to record that f has a specialised version.
135 Before doing any of this, check that f's IdInfo doesn't already
136 tell us about an existing instance of f at the required type/s.
137 (This might happen if specialisation was applied more than once, or
138 it might arise from user SPECIALIZE pragmas.)
142 Wait a minute! What if f is recursive? Then we can't just plug in
143 its right-hand side, can we?
145 But it's ok. The type checker *always* creates non-recursive definitions
146 for overloaded recursive functions. For example:
148 f x = f (x+x) -- Yes I know its silly
152 f a (d::Num a) = let p = +.sel a d
154 letrec fl (y::a) = fl (p y y)
158 We still have recusion for non-overloadd functions which we
159 speciailise, but the recursive call should get speciailised to the
160 same recursive version.
166 All this is crystal clear when the function is applied to *constant
167 types*; that is, types which have no type variables inside. But what if
168 it is applied to non-constant types? Suppose we find a call of f at type
169 t1/t2. There are two possibilities:
171 (a) The free type variables of t1, t2 are in scope at the definition point
172 of f. In this case there's no problem, we proceed just as before. A common
173 example is as follows. Here's the Haskell:
178 After typechecking we have
180 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
181 in +.sel a d (f a d y) (f a d y)
183 Notice that the call to f is at type type "a"; a non-constant type.
184 Both calls to f are at the same type, so we can specialise to give:
186 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
187 in +.sel a d (f@a y) (f@a y)
190 (b) The other case is when the type variables in the instance types
191 are *not* in scope at the definition point of f. The example we are
192 working with above is a good case. There are two instances of (+.sel a d),
193 but "a" is not in scope at the definition of +.sel. Can we do anything?
194 Yes, we can "common them up", a sort of limited common sub-expression deal.
197 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
198 f@a (x::a) = +.sel@a x x
199 in +.sel@a (f@a y) (f@a y)
201 This can save work, and can't be spotted by the type checker, because
202 the two instances of +.sel weren't originally at the same type.
206 * There are quite a few variations here. For example, the defn of
207 +.sel could be floated ouside the \y, to attempt to gain laziness.
208 It certainly mustn't be floated outside the \d because the d has to
211 * We don't want to inline f_rhs in this case, because
212 that will duplicate code. Just commoning up the call is the point.
214 * Nothing gets added to +.sel's IdInfo.
216 * Don't bother unless the equivalence class has more than one item!
218 Not clear whether this is all worth it. It is of course OK to
219 simply discard call-instances when passing a big lambda.
221 Polymorphism 2 -- Overloading
223 Consider a function whose most general type is
225 f :: forall a b. Ord a => [a] -> b -> b
227 There is really no point in making a version of g at Int/Int and another
228 at Int/Bool, because it's only instancing the type variable "a" which
229 buys us any efficiency. Since g is completely polymorphic in b there
230 ain't much point in making separate versions of g for the different
233 That suggests that we should identify which of g's type variables
234 are constrained (like "a") and which are unconstrained (like "b").
235 Then when taking equivalence classes in STEP 2, we ignore the type args
236 corresponding to unconstrained type variable. In STEP 3 we make
237 polymorphic versions. Thus:
239 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
241 This seems pretty simple, and a Good Thing.
243 Polymorphism 3 -- Unboxed
246 If we are speciailising at unboxed types we must speciailise
247 regardless of the overloading constraint. In the exaple above it is
248 worth speciailising at types Int/Int#, Int/Bool# and a/Int#, Int#/Int#
251 Note that specialising an overloaded type at an uboxed type requires
252 an unboxed instance -- we cannot default to an unspecialised version!
259 f x = let g p q = p==q
265 Before specialisation, leaving out type abstractions we have
267 f df x = let g :: Eq a => a -> a -> Bool
269 h :: Num a => a -> a -> (a, Bool)
270 h dh r s = let deq = eqFromNum dh
271 in (+ dh r s, g deq r s)
275 After specialising h we get a specialised version of h, like this:
277 h' r s = let deq = eqFromNum df
278 in (+ df r s, g deq r s)
280 But we can't naively make an instance for g from this, because deq is not in scope
281 at the defn of g. Instead, we have to float out the (new) defn of deq
282 to widen its scope. Notice that this floating can't be done in advance -- it only
283 shows up when specialisation is done.
285 DELICATE MATTER: the way we tell a dictionary binding is by looking to
286 see if it has a Dict type. If the type has been "undictify'd", so that
287 it looks like a tuple, then the dictionary binding won't be floated, and
288 an opportunity to specialise might be lost.
290 User SPECIALIZE pragmas
291 ~~~~~~~~~~~~~~~~~~~~~~~
292 Specialisation pragmas can be digested by the type checker, and implemented
293 by adding extra definitions along with that of f, in the same way as before
295 f@t1/t2 = <f_rhs> t1 t2 d1 d2
297 Indeed the pragmas *have* to be dealt with by the type checker, because
298 only it knows how to build the dictionaries d1 and d2! For example
300 g :: Ord a => [a] -> [a]
301 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
303 Here, the specialised version of g is an application of g's rhs to the
304 Ord dictionary for (Tree Int), which only the type checker can conjure
305 up. There might not even *be* one, if (Tree Int) is not an instance of
306 Ord! (All the other specialision has suitable dictionaries to hand
309 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
310 it is buried in a complex (as-yet-un-desugared) binding group.
313 f@t1/t2 = f* t1 t2 d1 d2
315 where f* is the Id f with an IdInfo which says "inline me regardless!".
316 Indeed all the specialisation could be done in this way.
317 That in turn means that the simplifier has to be prepared to inline absolutely
318 any in-scope let-bound thing.
321 Again, the pragma should permit polymorphism in unconstrained variables:
323 h :: Ord a => [a] -> b -> b
324 {-# SPECIALIZE h :: [Int] -> b -> b #-}
326 We *insist* that all overloaded type variables are specialised to ground types,
327 (and hence there can be no context inside a SPECIALIZE pragma).
328 We *permit* unconstrained type variables to be specialised to
330 - or left as a polymorphic type variable
331 but nothing in between. So
333 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
335 is *illegal*. (It can be handled, but it adds complication, and gains the
339 SPECIALISING INSTANCE DECLARATIONS
340 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
343 instance Foo a => Foo [a] where
345 {-# SPECIALIZE instance Foo [Int] #-}
347 The original instance decl creates a dictionary-function
350 dfun.Foo.List :: forall a. Foo a -> Foo [a]
352 The SPECIALIZE pragma just makes a specialised copy, just as for
353 ordinary function definitions:
355 dfun.Foo.List@Int :: Foo [Int]
356 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
358 The information about what instance of the dfun exist gets added to
359 the dfun's IdInfo in the same way as a user-defined function too.
361 In fact, matters are a little bit more complicated than this.
362 When we make one of these specialised instances, we are defining
363 a constant dictionary, and so we want immediate access to its constant
364 methods and superclasses. Indeed, these constant methods and superclasses
365 must be in the IdInfo for the class selectors! We need help from the
366 typechecker to sort this out, perhaps by generating a separate IdInfo
369 Automatic instance decl specialisation?
370 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
371 Can instance decls be specialised automatically? It's tricky.
372 We could collect call-instance information for each dfun, but
373 then when we specialised their bodies we'd get new call-instances
374 for ordinary functions; and when we specialised their bodies, we might get
375 new call-instances of the dfuns, and so on. This all arises because of
376 the unrestricted mutual recursion between instance decls and value decls.
378 Furthermore, instance decls are usually exported and used non-locally,
379 so we'll want to compile enough to get those specialisations done.
381 Lastly, there's no such thing as a local instance decl, so we can
382 survive solely by spitting out *usage* information, and then reading that
383 back in as a pragma when next compiling the file. So for now,
384 we only specialise instance decls in response to pragmas.
386 That means that even if an instance decl ain't otherwise exported it
387 needs to be spat out as with a SPECIALIZE pragma. Furthermore, it needs
388 something to say which module defined the instance, so the usage info
389 can be fed into the right reqts info file. Blegh.
392 SPECIAILISING DATA DECLARATIONS
393 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
395 With unboxed specialisation (or full specialisation) we also require
396 data types (and their constructors) to be speciailised on unboxed
399 In addition to normal call instances we gather TyCon call instances at
400 unboxed types, determine equivalence classes for the locally defined
401 TyCons and build speciailised data constructor Ids for each TyCon and
402 substitute these in the Con calls.
404 We need the list of local TyCons to partition the TyCon instance info.
405 We pass out a FiniteMap from local TyCons to Specialised Instances to
406 give to the interface and code genertors.
408 N.B. The specialised data constructors reference the original data
409 constructor and type constructor which do not have the updated
410 specialisation info attached. Any specialisation info must be
411 extracted from the TyCon map returned.
414 SPITTING OUT USAGE INFORMATION
415 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
417 To spit out usage information we need to traverse the code collecting
418 call-instance information for all imported (non-prelude?) functions
419 and data types. Then we equivalence-class it and spit it out.
421 This is done at the top-level when all the call instances which escape
422 must be for imported functions and data types.
425 Partial specialisation by pragmas
426 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
427 What about partial specialisation:
429 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
430 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
434 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
436 Seems quite reasonable. Similar things could be done with instance decls:
438 instance (Foo a, Foo b) => Foo (a,b) where
440 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
441 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
443 Ho hum. Things are complex enough without this. I pass.
446 Requirements for the simplifer
447 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
448 The simplifier has to be able to take advantage of the specialisation.
450 * When the simplifier finds an application of a polymorphic f, it looks in
451 f's IdInfo in case there is a suitable instance to call instead. This converts
453 f t1 t2 d1 d2 ===> f_t1_t2
455 Note that the dictionaries get eaten up too!
457 * Dictionary selection operations on constant dictionaries must be
460 +.sel Int d ===> +Int
462 The obvious way to do this is in the same way as other specialised
463 calls: +.sel has inside it some IdInfo which tells that if it's applied
464 to the type Int then it should eat a dictionary and transform to +Int.
466 In short, dictionary selectors need IdInfo inside them for constant
469 * Exactly the same applies if a superclass dictionary is being
472 Eq.sel Int d ===> dEqInt
474 * Something similar applies to dictionary construction too. Suppose
475 dfun.Eq.List is the function taking a dictionary for (Eq a) to
476 one for (Eq [a]). Then we want
478 dfun.Eq.List Int d ===> dEq.List_Int
480 Where does the Eq [Int] dictionary come from? It is built in
481 response to a SPECIALIZE pragma on the Eq [a] instance decl.
483 In short, dfun Ids need IdInfo with a specialisation for each
484 constant instance of their instance declaration.
487 What does the specialisation IdInfo look like?
488 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
491 [Maybe Type] -- Instance types
492 Int -- No of dicts to eat
493 Id -- Specialised version
495 For example, if f has this SpecInfo:
497 SpecInfo [Just t1, Nothing, Just t3] 2 f'
501 f t1 t2 t3 d1 d2 ===> f t2
503 The "Nothings" identify type arguments in which the specialised
504 version is polymorphic.
506 What can't be done this way?
507 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
508 There is no way, post-typechecker, to get a dictionary for (say)
509 Eq a from a dictionary for Eq [a]. So if we find
513 we can't transform to
518 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
520 Of course, we currently have no way to automatically derive
521 eqList, nor to connect it to the Eq [a] instance decl, but you
522 can imagine that it might somehow be possible. Taking advantage
523 of this is permanently ruled out.
525 Still, this is no great hardship, because we intend to eliminate
526 overloading altogether anyway!
531 What about types/classes mentioned in SPECIALIZE pragmas spat out,
532 but not otherwise exported. Even if they are exported, what about
533 their original names.
535 Suggestion: use qualified names in pragmas, omitting module for
536 prelude and "this module".
543 f a (d::Num a) = let g = ...
545 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
547 Here, g is only called at one type, but the dictionary isn't in scope at the
548 definition point for g. Usually the type checker would build a
549 definition for d1 which enclosed g, but the transformation system
550 might have moved d1's defn inward.
556 What should we do when a value is specialised to a *strict* unboxed value?
558 map_*_* f (x:xs) = let h = f x
562 Could convert let to case:
564 map_*_Int# f (x:xs) = case f x of h# ->
568 This may be undesirable since it forces evaluation here, but the value
569 may not be used in all branches of the body. In the general case this
570 transformation is impossible since the mutual recursion in a letrec
571 cannot be expressed as a case.
573 There is also a problem with top-level unboxed values, since our
574 implementation cannot handle unboxed values at the top level.
576 Solution: Lift the binding of the unboxed value and extract it when it
579 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
584 Now give it to the simplifier and the _Lifting will be optimised away.
586 The benfit is that we have given the specialised "unboxed" values a
587 very simplep lifted semantics and then leave it up to the simplifier to
588 optimise it --- knowing that the overheads will be removed in nearly
591 In particular, the value will only be evaluted in the branches of the
592 program which use it, rather than being forced at the point where the
593 value is bound. For example:
595 filtermap_*_* p f (x:xs)
602 filtermap_*_Int# p f (x:xs)
603 = let h = case (f x) of h# -> _Lift h#
606 True -> case h of _Lift h#
610 The binding for h can still be inlined in the one branch and the
614 Question: When won't the _Lifting be eliminated?
616 Answer: When they at the top-level (where it is necessary) or when
617 inlining would duplicate work (or possibly code depending on
618 options). However, the _Lifting will still be eliminated if the
619 strictness analyser deems the lifted binding strict.
622 A note about non-tyvar dictionaries
623 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
624 Some Ids have types like
626 forall a,b,c. Eq a -> Ord [a] -> tau
628 This seems curious at first, because we usually only have dictionary
629 args whose types are of the form (C a) where a is a type variable.
630 But this doesn't hold for the functions arising from instance decls,
631 which sometimes get arguements with types of form (C (T a)) for some
634 Should we specialise wrt this compound-type dictionary? We used to say
636 "This is a heuristic judgement, as indeed is the fact that we
637 specialise wrt only dictionaries. We choose *not* to specialise
638 wrt compound dictionaries because at the moment the only place
639 they show up is in instance decls, where they are simply plugged
640 into a returned dictionary. So nothing is gained by specialising
643 But it is simpler and more uniform to specialise wrt these dicts too;
644 and in future GHC is likely to support full fledged type signatures
646 f ;: Eq [(a,b)] => ...
649 %************************************************************************
651 \subsubsection{The new specialiser}
653 %************************************************************************
655 Our basic game plan is this. For let(rec) bound function
656 f :: (C a, D c) => (a,b,c,d) -> Bool
658 * Find any specialised calls of f, (f ts ds), where
659 ts are the type arguments t1 .. t4, and
660 ds are the dictionary arguments d1 .. d2.
662 * Add a new definition for f1 (say):
664 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
666 Note that we abstract over the unconstrained type arguments.
670 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
672 to the specialisations of f. This will be used by the
673 simplifier to replace calls
674 (f t1 t2 t3 t4) da db
676 (\d1 d1 -> f1 t2 t4) da db
678 All the stuff about how many dictionaries to discard, and what types
679 to apply the specialised function to, are handled by the fact that the
680 SpecEnv contains a template for the result of the specialisation.
682 We don't build *partial* specialisations for f. For example:
684 f :: Eq a => a -> a -> Bool
685 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
687 Here, little is gained by making a specialised copy of f.
688 There's a distinct danger that the specialised version would
689 first build a dictionary for (Eq b, Eq c), and then select the (==)
690 method from it! Even if it didn't, not a great deal is saved.
692 We do, however, generate polymorphic, but not overloaded, specialisations:
694 f :: Eq a => [a] -> b -> b -> b
695 {#- SPECIALISE f :: [Int] -> b -> b -> b #-}
697 Hence, the invariant is this:
699 *** no specialised version is overloaded ***
702 %************************************************************************
704 \subsubsection{The exported function}
706 %************************************************************************
709 specProgram :: UniqSupply -> [CoreBinding] -> [CoreBinding]
711 = initSM us (go binds `thenSM` \ (binds', _) ->
715 go [] = returnSM ([], emptyUDs)
716 go (bind:binds) = go binds `thenSM` \ (binds', uds) ->
717 specBind bind uds `thenSM` \ (bind', uds') ->
718 returnSM (bind' ++ binds', uds')
721 %************************************************************************
723 \subsubsection{@specExpr@: the main function}
725 %************************************************************************
728 specExpr :: CoreExpr -> SpecM (CoreExpr, UsageDetails)
730 ---------------- First the easy cases --------------------
731 specExpr e@(Var _) = returnSM (e, emptyUDs)
732 specExpr e@(Lit _) = returnSM (e, emptyUDs)
733 specExpr e@(Con _ _) = returnSM (e, emptyUDs)
734 specExpr e@(Prim _ _) = returnSM (e, emptyUDs)
736 specExpr (Coerce co ty body)
737 = specExpr body `thenSM` \ (body', uds) ->
738 returnSM (Coerce co ty body', uds)
740 specExpr (SCC cc body)
741 = specExpr body `thenSM` \ (body', uds) ->
742 returnSM (SCC cc body', uds)
745 ---------------- Applications might generate a call instance --------------------
746 specExpr e@(App fun arg)
749 go (App fun arg) args = go fun (arg:args)
750 go (Var f) args = returnSM (e, mkCallUDs f args)
751 go other args = specExpr other `thenSM` \ (e', uds) ->
752 returnSM (foldl App e' args, uds)
754 ---------------- Lambda/case require dumping of usage details --------------------
756 = specExpr body `thenSM` \ (body', uds) ->
758 (filtered_uds, body'') = dumpUDs bndrs uds body'
760 returnSM (foldr Lam body'' bndrs, filtered_uds)
762 (bndrs, body) = go [] e
764 -- More efficient to collect a group of binders together all at once
765 go bndrs (Lam bndr e) = go (bndr:bndrs) e
766 go bndrs e = (reverse bndrs, e)
769 specExpr (Case scrut alts)
770 = specExpr scrut `thenSM` \ (scrut', uds_scrut) ->
771 spec_alts alts `thenSM` \ (alts', uds_alts) ->
772 returnSM (Case scrut' alts', uds_scrut `plusUDs` uds_alts)
774 spec_alts (AlgAlts alts deflt)
775 = mapAndCombineSM spec_alg_alt alts `thenSM` \ (alts', uds1) ->
776 spec_deflt deflt `thenSM` \ (deflt', uds2) ->
777 returnSM (AlgAlts alts' deflt', uds1 `plusUDs` uds2)
779 spec_alts (PrimAlts alts deflt)
780 = mapAndCombineSM spec_prim_alt alts `thenSM` \ (alts', uds1) ->
781 spec_deflt deflt `thenSM` \ (deflt', uds2) ->
782 returnSM (PrimAlts alts' deflt', uds1 `plusUDs` uds2)
784 spec_alg_alt (con, args, rhs)
785 = specExpr rhs `thenSM` \ (rhs', uds) ->
787 (uds', rhs'') = dumpUDs (map ValBinder args) uds rhs'
789 returnSM ((con, args, rhs''), uds')
791 spec_prim_alt (lit, rhs)
792 = specExpr rhs `thenSM` \ (rhs', uds) ->
793 returnSM ((lit, rhs'), uds)
795 spec_deflt NoDefault = returnSM (NoDefault, emptyUDs)
796 spec_deflt (BindDefault arg rhs)
797 = specExpr rhs `thenSM` \ (rhs', uds) ->
799 (uds', rhs'') = dumpUDs [ValBinder arg] uds rhs'
801 returnSM (BindDefault arg rhs'', uds')
803 ---------------- Finally, let is the interesting case --------------------
804 specExpr (Let bind body)
805 = -- Deal with the body
806 specExpr body `thenSM` \ (body', body_uds) ->
808 -- Deal with the bindings
809 specBind bind body_uds `thenSM` \ (binds', uds) ->
812 returnSM (foldr Let body' binds', uds)
815 %************************************************************************
817 \subsubsection{Dealing with a binding}
819 %************************************************************************
822 specBind :: CoreBinding
823 -> UsageDetails -- Info on how the scope of the binding
824 -> SpecM ([CoreBinding], -- New bindings
825 UsageDetails) -- And info to pass upstream
827 specBind (NonRec bndr rhs) body_uds
828 | isDictTy (idType bndr)
829 = -- It's a dictionary binding
830 -- Pick it up and float it outwards.
831 specExpr rhs `thenSM` \ (rhs', rhs_uds) ->
833 all_uds = rhs_uds `plusUDs` addDictBind body_uds bndr rhs'
835 returnSM ([], all_uds)
837 | isSpecPragmaId bndr
838 -- SpecPragmaIds are there solely to generate specialisations
839 -- Just drop the whole binding; keep only its usage details
840 = specExpr rhs `thenSM` \ (rhs', rhs_uds) ->
841 returnSM ([], rhs_uds `plusUDs` body_uds)
844 = -- Deal with the RHS, specialising it according
845 -- to the calls found in the body
846 specDefn (calls body_uds) (bndr,rhs) `thenSM` \ ((bndr',rhs'), spec_defns, spec_uds) ->
848 (all_uds, (dict_binds, dump_calls))
849 = splitUDs [ValBinder bndr'] (spec_uds `plusUDs` body_uds)
851 returnSM ( [NonRec bndr' rhs']
856 specBind (Rec pairs) body_uds
857 = mapSM (specDefn (calls body_uds)) pairs `thenSM` \ stuff ->
859 (pairs', spec_defns_s, spec_uds_s) = unzip3 stuff
860 spec_defns = concat spec_defns_s
861 spec_uds = plusUDList spec_uds_s
862 (all_uds, (dict_binds, dump_calls))
863 = splitUDs (map (ValBinder . fst) pairs') (spec_uds `plusUDs` body_uds)
865 returnSM ( [Rec pairs']
870 specDefn :: CallDetails -- Info on how it is used in its scope
871 -> (Id, CoreExpr) -- The thing being bound and its un-processed RHS
872 -> SpecM ((Id, CoreExpr), -- The thing and its processed RHS
873 -- the Id may now have specialisations attached
874 [CoreBinding], -- Extra, specialised bindings
875 UsageDetails -- Stuff to fling upwards from the RHS and its
876 ) -- specialised versions
878 specDefn calls (fn, rhs)
879 -- The first case is the interesting one
880 | n_tyvars == length rhs_tyvars -- Rhs of fn's defn has right number of big lambdas
881 && n_dicts <= length rhs_bndrs -- and enough dict args
882 && not (null calls_for_me) -- And there are some calls to specialise
883 = -- Specialise the body of the function
884 specExpr body `thenSM` \ (body', body_uds) ->
886 (float_uds, bound_uds@(dict_binds,_)) = splitUDs rhs_bndrs body_uds
889 -- Make a specialised version for each call in calls_for_me
890 mapSM (spec_call bound_uds) calls_for_me `thenSM` \ stuff ->
892 (spec_defns, spec_uds, spec_env_stuff) = unzip3 stuff
894 fn' = addIdSpecialisations fn spec_env_stuff
895 rhs' = foldr Lam (foldr Let body' dict_binds) rhs_bndrs
897 returnSM ((fn',rhs'),
899 float_uds `plusUDs` plusUDList spec_uds)
901 | otherwise -- No calls or RHS doesn't fit our preconceptions
902 = specExpr rhs `thenSM` \ (rhs', rhs_uds) ->
903 returnSM ((fn, rhs'), [], rhs_uds)
907 (tyvars, theta, tau) = splitSigmaTy fn_type
908 n_tyvars = length tyvars
909 n_dicts = length theta
911 (rhs_tyvars, rhs_ids, rhs_body) = collectBinders rhs
912 rhs_dicts = take n_dicts rhs_ids
913 rhs_bndrs = map TyBinder rhs_tyvars ++ map ValBinder rhs_dicts
914 body = mkValLam (drop n_dicts rhs_ids) rhs_body
915 -- Glue back on the non-dict lambdas
917 calls_for_me = case lookupFM calls fn of
919 Just cs -> fmToList cs
921 -- Specialise to one particular call pattern
922 spec_call :: ProtoUsageDetails -- From the original body, captured by
923 -- the dictionary lambdas
924 -> ([Maybe Type], [DictVar]) -- Call instance
925 -> SpecM (CoreBinding, -- Specialised definition
926 UsageDetails, -- Usage details from specialised body
927 ([Type], CoreExpr)) -- Info for the Id's SpecEnv
928 spec_call bound_uds (call_ts, call_ds)
929 = ASSERT( length call_ts == n_tyvars && length call_ds == n_dicts )
930 -- Calls are only recorded for properly-saturated applications
932 -- Supppose the call is for f [Just t1, Nothing, Just t3, Nothing] [d1, d2]
934 -- Construct the new binding
935 -- f1 = /\ b d -> (..rhs of f..) t1 b t3 d d1 d2
936 -- and the type of this binder
938 spec_tyvars = [tyvar | (tyvar, Nothing) <- tyvars `zip` call_ts]
939 spec_tys = zipWith mk_spec_ty call_ts tyvars
940 spec_rhs = mkTyLam spec_tyvars $
941 mkGenApp rhs (map TyArg spec_tys ++ map VarArg call_ds)
942 spec_id_ty = mkForAllTys spec_tyvars (applyTys fn_type spec_tys)
944 mk_spec_ty (Just ty) _ = ty
945 mk_spec_ty Nothing tyvar = mkTyVarTy tyvar
947 newIdSM fn spec_id_ty `thenSM` \ spec_f ->
950 -- Construct the stuff for f's spec env
951 -- [t1,b,t3,d] |-> \d1 d2 -> f1 b d
953 spec_env_rhs = mkValLam call_ds $
954 mkTyApp (Var spec_f) $
955 map mkTyVarTy spec_tyvars
956 spec_env_info = (spec_tys, spec_env_rhs)
959 -- Specialise the UDs from f's RHS
961 tv_env = [ (rhs_tyvar,ty)
962 | (rhs_tyvar, Just ty) <- zipEqual "specUDs1" rhs_tyvars call_ts
964 dict_env = zipEqual "specUDs2" rhs_dicts call_ds
966 specUDs tv_env dict_env bound_uds `thenSM` \ spec_uds ->
968 returnSM (NonRec spec_f spec_rhs,
974 %************************************************************************
976 \subsubsection{UsageDetails and suchlike}
978 %************************************************************************
981 type FreeDicts = IdSet
985 dict_binds :: !(Bag (DictVar, CoreExpr, TyVarSet, FreeDicts)),
986 -- Floated dictionary bindings
987 -- The order is important;
988 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
989 -- (Remember, Bags preserve order in GHC.)
990 -- The FreeDicts is the free vars of the RHS
992 calls :: !CallDetails
995 emptyUDs = MkUD { dict_binds = emptyBag, calls = emptyFM }
997 type ProtoUsageDetails = ([CoreBinding], -- Dict bindings
998 [(Id, [Maybe Type], [DictVar])]
1001 ------------------------------------------------------------
1002 type CallDetails = FiniteMap Id CallInfo
1003 type CallInfo = FiniteMap [Maybe Type] -- Nothing => unconstrained type argument
1004 [DictVar] -- Dict args
1005 -- The finite maps eliminate duplicates
1006 -- The list of types and dictionaries is guaranteed to
1007 -- match the type of f
1009 callDetailsToList calls = [ (id,tys,dicts)
1010 | (id,fm) <- fmToList calls,
1011 (tys,dicts) <- fmToList fm
1014 listToCallDetails calls = foldr (unionCalls . singleCall) emptyFM calls
1016 unionCalls :: CallDetails -> CallDetails -> CallDetails
1017 unionCalls c1 c2 = plusFM_C plusFM c1 c2
1019 singleCall (id, tys, dicts) = unitFM id (unitFM tys dicts)
1023 || length spec_tys /= n_tyvars
1024 || length dicts /= n_dicts
1025 = emptyUDs -- Not overloaded
1028 = MkUD {dict_binds = emptyBag,
1029 calls = singleCall (f, spec_tys, dicts)
1032 (tyvars, theta, tau) = splitSigmaTy (idType f)
1033 constrained_tyvars = foldr (unionTyVarSets . tyVarsOfTypes . snd) emptyTyVarSet theta
1034 n_tyvars = length tyvars
1035 n_dicts = length theta
1037 spec_tys = [mk_spec_ty tv ty | (tv, TyArg ty) <- tyvars `zip` args]
1038 dicts = [d | (_, VarArg d) <- theta `zip` (drop n_tyvars args)]
1040 mk_spec_ty tyvar ty | tyvar `elementOfTyVarSet` constrained_tyvars
1045 ------------------------------------------------------------
1046 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
1047 plusUDs (MkUD {dict_binds = db1, calls = calls1})
1048 (MkUD {dict_binds = db2, calls = calls2})
1049 = MkUD {dict_binds, calls}
1051 dict_binds = db1 `unionBags` db2
1052 calls = calls1 `unionCalls` calls2
1054 plusUDList = foldr plusUDs emptyUDs
1056 mkDB dict rhs = (dict, rhs, db_ftvs, db_fvs)
1058 db_ftvs = tyVarsOfType (idType dict) -- Superset of RHS fvs
1059 db_fvs = dictRhsFVs rhs
1061 addDictBind uds dict rhs = uds { dict_binds = mkDB dict rhs `consBag` dict_binds uds }
1063 dumpUDs :: [CoreBinder]
1064 -> UsageDetails -> CoreExpr
1065 -> (UsageDetails, CoreExpr)
1066 dumpUDs bndrs uds body
1067 = (free_uds, foldr Let body dict_binds)
1069 (free_uds, (dict_binds, _)) = splitUDs bndrs uds
1071 splitUDs :: [CoreBinder]
1073 -> (UsageDetails, -- These don't mention the binders
1074 ProtoUsageDetails) -- These do
1076 splitUDs bndrs uds@(MkUD {dict_binds = orig_dbs,
1077 calls = orig_calls})
1079 = if isEmptyBag dump_dbs && null dump_calls then
1080 -- Common case: binder doesn't affect floats
1084 -- Binders bind some of the fvs of the floats
1085 (MkUD {dict_binds = free_dbs,
1086 calls = listToCallDetails free_calls},
1087 (bagToList dump_dbs, dump_calls)
1091 tyvar_set = mkTyVarSet [tv | TyBinder tv <- bndrs]
1092 id_set = mkIdSet [id | ValBinder id <- bndrs]
1094 (free_dbs, dump_dbs, dump_idset)
1095 = foldlBag dump_db (emptyBag, emptyBag, id_set) orig_dbs
1096 -- Important that it's foldl not foldr;
1097 -- we're accumulating the set of dumped ids in dump_set
1099 -- Filter out any calls that mention things that are being dumped
1100 -- Don't need to worry about the tyvars because the dicts will
1101 -- spot the captured ones; any fully polymorphic arguments will
1102 -- be Nothings in the call details
1103 orig_call_list = callDetailsToList orig_calls
1104 (dump_calls, free_calls) = partition captured orig_call_list
1105 captured (id,tys,dicts) = any (`elementOfIdSet` dump_idset) (id:dicts)
1107 dump_db (free_dbs, dump_dbs, dump_idset) db@(dict, rhs, ftvs, fvs)
1108 | isEmptyIdSet (dump_idset `intersectIdSets` fvs)
1109 && isEmptyTyVarSet (tyvar_set `intersectTyVarSets` ftvs)
1110 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
1112 | otherwise -- Dump it
1113 = (free_dbs, dump_dbs `snocBag` NonRec dict rhs,
1114 dump_idset `addOneToIdSet` dict)
1117 Given a type and value substitution, specUDs creates a specialised copy of
1121 specUDs :: [(TyVar,Type)] -> [(DictVar,DictVar)] -> ProtoUsageDetails -> SpecM UsageDetails
1122 specUDs tv_env_list dict_env_list (dbs, calls)
1123 = specDBs dict_env dbs `thenSM` \ (dict_env', dbs') ->
1124 returnSM (MkUD { dict_binds = dbs',
1125 calls = listToCallDetails (map (inst_call dict_env') calls)
1128 tv_env = mkTyVarEnv tv_env_list
1129 dict_env = mkIdEnv dict_env_list
1131 inst_call dict_env (id, tys, dicts) = (id, map inst_maybe_ty tys,
1132 map (lookupId dict_env) dicts)
1134 inst_maybe_ty Nothing = Nothing
1135 inst_maybe_ty (Just ty) = Just (instantiateTy tv_env ty)
1138 = returnSM (dict_env, emptyBag)
1139 specDBs dict_env (NonRec dict rhs : dbs)
1140 = newIdSM dict (instantiateTy tv_env (idType dict)) `thenSM` \ dict' ->
1142 dict_env' = addOneToIdEnv dict_env dict dict'
1143 rhs' = instantiateDictRhs tv_env dict_env rhs
1145 specDBs dict_env' dbs `thenSM` \ (dict_env'', dbs') ->
1146 returnSM ( dict_env'', mkDB dict' rhs' `consBag` dbs' )
1149 %************************************************************************
1151 \subsubsection{Boring helper functions}
1153 %************************************************************************
1156 lookupId:: IdEnv Id -> Id -> Id
1157 lookupId env id = case lookupIdEnv env id of
1161 instantiateDictRhs :: TyVarEnv Type -> IdEnv Id -> CoreExpr -> CoreExpr
1162 -- Cheapo function for simple RHSs
1163 instantiateDictRhs ty_env id_env rhs
1166 go (App e1 (VarArg a)) = App (go e1) (VarArg (lookupId id_env a))
1167 go (App e1 (TyArg t)) = App (go e1) (TyArg (instantiateTy ty_env t))
1168 go (Var v) = Var (lookupId id_env v)
1171 dictRhsFVs :: CoreExpr -> IdSet
1172 -- Cheapo function for simple RHSs
1173 dictRhsFVs (App e1 (VarArg a)) = dictRhsFVs e1 `addOneToIdSet` a
1174 dictRhsFVs (App e1 (TyArg t)) = dictRhsFVs e1
1175 dictRhsFVs (Var v) = unitIdSet v
1176 dictRhsFVs (Lit l) = emptyIdSet
1179 addIdSpecialisations id spec_stuff
1180 = (if not (null errs) then
1181 pprTrace "Duplicate specialisations" (vcat (map ppr errs))
1184 addIdSpecialisation id new_spec_env
1186 (new_spec_env, errs) = foldr add (getIdSpecialisation id, []) spec_stuff
1188 add (tys, template) (spec_env, errs)
1189 = case addToSpecEnv spec_env tys (occurAnalyseGlobalExpr template) of
1190 Succeeded spec_env' -> (spec_env', errs)
1191 Failed err -> (spec_env, err:errs)
1193 ----------------------------------------
1194 type SpecM a = UniqSM a
1198 getUniqSM = getUnique
1202 mapAndCombineSM f [] = returnSM ([], emptyUDs)
1203 mapAndCombineSM f (x:xs) = f x `thenSM` \ (y, uds1) ->
1204 mapAndCombineSM f xs `thenSM` \ (ys, uds2) ->
1205 returnSM (y:ys, uds1 `plusUDs` uds2)
1207 newIdSM old_id new_ty
1208 = getUnique `thenSM` \ uniq ->
1209 returnSM (mkUserLocal (getOccName old_id)