2 % (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
7 module Specialise ( specProgram ) where
9 #include "HsVersions.h"
11 import CmdLineOpts ( opt_D_verbose_core2core, opt_D_dump_spec, opt_D_dump_rules )
12 import Id ( Id, idName, idType, mkTemplateLocals, mkUserLocal,
13 getIdSpecialisation, setIdNoDiscard, isExportedId,
16 import IdInfo ( zapSpecPragInfo )
20 import Type ( Type, mkTyVarTy, splitSigmaTy, splitFunTysN,
21 tyVarsOfType, tyVarsOfTypes, applyTys,
22 mkForAllTys, boxedTypeKind
24 import Subst ( Subst, mkSubst, substTy, emptySubst, substBndrs, extendSubstList,
25 substExpr, substId, substIds, substAndCloneId, substAndCloneIds, lookupSubst
27 import Var ( TyVar, mkSysTyVar, setVarUnique )
31 import CoreUtils ( coreExprType, applyTypeToArgs )
32 import CoreFVs ( exprFreeVars, exprsFreeVars )
33 import CoreLint ( beginPass, endPass )
34 import PprCore ( pprCoreRules )
35 import Rules ( addIdSpecialisations )
37 import UniqSupply ( UniqSupply,
38 UniqSM, initUs_, thenUs, thenUs_, returnUs, getUniqueUs,
39 getUs, setUs, uniqFromSupply, splitUniqSupply, mapUs
41 import Name ( nameOccName, mkSpecOcc, getSrcLoc )
43 import Maybes ( MaybeErr(..), catMaybes )
44 import ErrUtils ( dumpIfSet )
46 import List ( partition )
47 import Util ( zipEqual, zipWithEqual, mapAccumL )
54 %************************************************************************
56 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
58 %************************************************************************
60 These notes describe how we implement specialisation to eliminate
63 The specialisation pass works on Core
64 syntax, complete with all the explicit dictionary application,
65 abstraction and construction as added by the type checker. The
66 existing type checker remains largely as it is.
68 One important thought: the {\em types} passed to an overloaded
69 function, and the {\em dictionaries} passed are mutually redundant.
70 If the same function is applied to the same type(s) then it is sure to
71 be applied to the same dictionary(s)---or rather to the same {\em
72 values}. (The arguments might look different but they will evaluate
75 Second important thought: we know that we can make progress by
76 treating dictionary arguments as static and worth specialising on. So
77 we can do without binding-time analysis, and instead specialise on
78 dictionary arguments and no others.
87 and suppose f is overloaded.
89 STEP 1: CALL-INSTANCE COLLECTION
91 We traverse <body>, accumulating all applications of f to types and
94 (Might there be partial applications, to just some of its types and
95 dictionaries? In principle yes, but in practice the type checker only
96 builds applications of f to all its types and dictionaries, so partial
97 applications could only arise as a result of transformation, and even
98 then I think it's unlikely. In any case, we simply don't accumulate such
99 partial applications.)
104 So now we have a collection of calls to f:
108 Notice that f may take several type arguments. To avoid ambiguity, we
109 say that f is called at type t1/t2 and t3/t4.
111 We take equivalence classes using equality of the *types* (ignoring
112 the dictionary args, which as mentioned previously are redundant).
114 STEP 3: SPECIALISATION
116 For each equivalence class, choose a representative (f t1 t2 d1 d2),
117 and create a local instance of f, defined thus:
119 f@t1/t2 = <f_rhs> t1 t2 d1 d2
121 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
122 of simplification will now result. However we don't actually *do* that
123 simplification. Rather, we leave it for the simplifier to do. If we
124 *did* do it, though, we'd get more call instances from the specialised
125 RHS. We can work out what they are by instantiating the call-instance
126 set from f's RHS with the types t1, t2.
128 Add this new id to f's IdInfo, to record that f has a specialised version.
130 Before doing any of this, check that f's IdInfo doesn't already
131 tell us about an existing instance of f at the required type/s.
132 (This might happen if specialisation was applied more than once, or
133 it might arise from user SPECIALIZE pragmas.)
137 Wait a minute! What if f is recursive? Then we can't just plug in
138 its right-hand side, can we?
140 But it's ok. The type checker *always* creates non-recursive definitions
141 for overloaded recursive functions. For example:
143 f x = f (x+x) -- Yes I know its silly
147 f a (d::Num a) = let p = +.sel a d
149 letrec fl (y::a) = fl (p y y)
153 We still have recusion for non-overloaded functions which we
154 speciailise, but the recursive call should get specialised to the
155 same recursive version.
161 All this is crystal clear when the function is applied to *constant
162 types*; that is, types which have no type variables inside. But what if
163 it is applied to non-constant types? Suppose we find a call of f at type
164 t1/t2. There are two possibilities:
166 (a) The free type variables of t1, t2 are in scope at the definition point
167 of f. In this case there's no problem, we proceed just as before. A common
168 example is as follows. Here's the Haskell:
173 After typechecking we have
175 g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
176 in +.sel a d (f a d y) (f a d y)
178 Notice that the call to f is at type type "a"; a non-constant type.
179 Both calls to f are at the same type, so we can specialise to give:
181 g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
182 in +.sel a d (f@a y) (f@a y)
185 (b) The other case is when the type variables in the instance types
186 are *not* in scope at the definition point of f. The example we are
187 working with above is a good case. There are two instances of (+.sel a d),
188 but "a" is not in scope at the definition of +.sel. Can we do anything?
189 Yes, we can "common them up", a sort of limited common sub-expression deal.
192 g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
193 f@a (x::a) = +.sel@a x x
194 in +.sel@a (f@a y) (f@a y)
196 This can save work, and can't be spotted by the type checker, because
197 the two instances of +.sel weren't originally at the same type.
201 * There are quite a few variations here. For example, the defn of
202 +.sel could be floated ouside the \y, to attempt to gain laziness.
203 It certainly mustn't be floated outside the \d because the d has to
206 * We don't want to inline f_rhs in this case, because
207 that will duplicate code. Just commoning up the call is the point.
209 * Nothing gets added to +.sel's IdInfo.
211 * Don't bother unless the equivalence class has more than one item!
213 Not clear whether this is all worth it. It is of course OK to
214 simply discard call-instances when passing a big lambda.
216 Polymorphism 2 -- Overloading
218 Consider a function whose most general type is
220 f :: forall a b. Ord a => [a] -> b -> b
222 There is really no point in making a version of g at Int/Int and another
223 at Int/Bool, because it's only instancing the type variable "a" which
224 buys us any efficiency. Since g is completely polymorphic in b there
225 ain't much point in making separate versions of g for the different
228 That suggests that we should identify which of g's type variables
229 are constrained (like "a") and which are unconstrained (like "b").
230 Then when taking equivalence classes in STEP 2, we ignore the type args
231 corresponding to unconstrained type variable. In STEP 3 we make
232 polymorphic versions. Thus:
234 f@t1/ = /\b -> <f_rhs> t1 b d1 d2
243 f a (d::Num a) = let g = ...
245 ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
247 Here, g is only called at one type, but the dictionary isn't in scope at the
248 definition point for g. Usually the type checker would build a
249 definition for d1 which enclosed g, but the transformation system
250 might have moved d1's defn inward. Solution: float dictionary bindings
251 outwards along with call instances.
255 f x = let g p q = p==q
261 Before specialisation, leaving out type abstractions we have
263 f df x = let g :: Eq a => a -> a -> Bool
265 h :: Num a => a -> a -> (a, Bool)
266 h dh r s = let deq = eqFromNum dh
267 in (+ dh r s, g deq r s)
271 After specialising h we get a specialised version of h, like this:
273 h' r s = let deq = eqFromNum df
274 in (+ df r s, g deq r s)
276 But we can't naively make an instance for g from this, because deq is not in scope
277 at the defn of g. Instead, we have to float out the (new) defn of deq
278 to widen its scope. Notice that this floating can't be done in advance -- it only
279 shows up when specialisation is done.
281 User SPECIALIZE pragmas
282 ~~~~~~~~~~~~~~~~~~~~~~~
283 Specialisation pragmas can be digested by the type checker, and implemented
284 by adding extra definitions along with that of f, in the same way as before
286 f@t1/t2 = <f_rhs> t1 t2 d1 d2
288 Indeed the pragmas *have* to be dealt with by the type checker, because
289 only it knows how to build the dictionaries d1 and d2! For example
291 g :: Ord a => [a] -> [a]
292 {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
294 Here, the specialised version of g is an application of g's rhs to the
295 Ord dictionary for (Tree Int), which only the type checker can conjure
296 up. There might not even *be* one, if (Tree Int) is not an instance of
297 Ord! (All the other specialision has suitable dictionaries to hand
300 Problem. The type checker doesn't have to hand a convenient <f_rhs>, because
301 it is buried in a complex (as-yet-un-desugared) binding group.
304 f@t1/t2 = f* t1 t2 d1 d2
306 where f* is the Id f with an IdInfo which says "inline me regardless!".
307 Indeed all the specialisation could be done in this way.
308 That in turn means that the simplifier has to be prepared to inline absolutely
309 any in-scope let-bound thing.
312 Again, the pragma should permit polymorphism in unconstrained variables:
314 h :: Ord a => [a] -> b -> b
315 {-# SPECIALIZE h :: [Int] -> b -> b #-}
317 We *insist* that all overloaded type variables are specialised to ground types,
318 (and hence there can be no context inside a SPECIALIZE pragma).
319 We *permit* unconstrained type variables to be specialised to
321 - or left as a polymorphic type variable
322 but nothing in between. So
324 {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
326 is *illegal*. (It can be handled, but it adds complication, and gains the
330 SPECIALISING INSTANCE DECLARATIONS
331 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
334 instance Foo a => Foo [a] where
336 {-# SPECIALIZE instance Foo [Int] #-}
338 The original instance decl creates a dictionary-function
341 dfun.Foo.List :: forall a. Foo a -> Foo [a]
343 The SPECIALIZE pragma just makes a specialised copy, just as for
344 ordinary function definitions:
346 dfun.Foo.List@Int :: Foo [Int]
347 dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
349 The information about what instance of the dfun exist gets added to
350 the dfun's IdInfo in the same way as a user-defined function too.
353 Automatic instance decl specialisation?
354 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
355 Can instance decls be specialised automatically? It's tricky.
356 We could collect call-instance information for each dfun, but
357 then when we specialised their bodies we'd get new call-instances
358 for ordinary functions; and when we specialised their bodies, we might get
359 new call-instances of the dfuns, and so on. This all arises because of
360 the unrestricted mutual recursion between instance decls and value decls.
362 Still, there's no actual problem; it just means that we may not do all
363 the specialisation we could theoretically do.
365 Furthermore, instance decls are usually exported and used non-locally,
366 so we'll want to compile enough to get those specialisations done.
368 Lastly, there's no such thing as a local instance decl, so we can
369 survive solely by spitting out *usage* information, and then reading that
370 back in as a pragma when next compiling the file. So for now,
371 we only specialise instance decls in response to pragmas.
374 SPITTING OUT USAGE INFORMATION
375 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
377 To spit out usage information we need to traverse the code collecting
378 call-instance information for all imported (non-prelude?) functions
379 and data types. Then we equivalence-class it and spit it out.
381 This is done at the top-level when all the call instances which escape
382 must be for imported functions and data types.
384 *** Not currently done ***
387 Partial specialisation by pragmas
388 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
389 What about partial specialisation:
391 k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
392 {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
396 {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
398 Seems quite reasonable. Similar things could be done with instance decls:
400 instance (Foo a, Foo b) => Foo (a,b) where
402 {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
403 {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
405 Ho hum. Things are complex enough without this. I pass.
408 Requirements for the simplifer
409 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
410 The simplifier has to be able to take advantage of the specialisation.
412 * When the simplifier finds an application of a polymorphic f, it looks in
413 f's IdInfo in case there is a suitable instance to call instead. This converts
415 f t1 t2 d1 d2 ===> f_t1_t2
417 Note that the dictionaries get eaten up too!
419 * Dictionary selection operations on constant dictionaries must be
422 +.sel Int d ===> +Int
424 The obvious way to do this is in the same way as other specialised
425 calls: +.sel has inside it some IdInfo which tells that if it's applied
426 to the type Int then it should eat a dictionary and transform to +Int.
428 In short, dictionary selectors need IdInfo inside them for constant
431 * Exactly the same applies if a superclass dictionary is being
434 Eq.sel Int d ===> dEqInt
436 * Something similar applies to dictionary construction too. Suppose
437 dfun.Eq.List is the function taking a dictionary for (Eq a) to
438 one for (Eq [a]). Then we want
440 dfun.Eq.List Int d ===> dEq.List_Int
442 Where does the Eq [Int] dictionary come from? It is built in
443 response to a SPECIALIZE pragma on the Eq [a] instance decl.
445 In short, dfun Ids need IdInfo with a specialisation for each
446 constant instance of their instance declaration.
448 All this uses a single mechanism: the SpecEnv inside an Id
451 What does the specialisation IdInfo look like?
452 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
454 The SpecEnv of an Id maps a list of types (the template) to an expression
458 For example, if f has this SpecInfo:
460 [Int, a] -> \d:Ord Int. f' a
462 it means that we can replace the call
464 f Int t ===> (\d. f' t)
466 This chucks one dictionary away and proceeds with the
467 specialised version of f, namely f'.
470 What can't be done this way?
471 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
472 There is no way, post-typechecker, to get a dictionary for (say)
473 Eq a from a dictionary for Eq [a]. So if we find
477 we can't transform to
482 eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
484 Of course, we currently have no way to automatically derive
485 eqList, nor to connect it to the Eq [a] instance decl, but you
486 can imagine that it might somehow be possible. Taking advantage
487 of this is permanently ruled out.
489 Still, this is no great hardship, because we intend to eliminate
490 overloading altogether anyway!
494 A note about non-tyvar dictionaries
495 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
496 Some Ids have types like
498 forall a,b,c. Eq a -> Ord [a] -> tau
500 This seems curious at first, because we usually only have dictionary
501 args whose types are of the form (C a) where a is a type variable.
502 But this doesn't hold for the functions arising from instance decls,
503 which sometimes get arguements with types of form (C (T a)) for some
506 Should we specialise wrt this compound-type dictionary? We used to say
508 "This is a heuristic judgement, as indeed is the fact that we
509 specialise wrt only dictionaries. We choose *not* to specialise
510 wrt compound dictionaries because at the moment the only place
511 they show up is in instance decls, where they are simply plugged
512 into a returned dictionary. So nothing is gained by specialising
515 But it is simpler and more uniform to specialise wrt these dicts too;
516 and in future GHC is likely to support full fledged type signatures
518 f ;: Eq [(a,b)] => ...
521 %************************************************************************
523 \subsubsection{The new specialiser}
525 %************************************************************************
527 Our basic game plan is this. For let(rec) bound function
528 f :: (C a, D c) => (a,b,c,d) -> Bool
530 * Find any specialised calls of f, (f ts ds), where
531 ts are the type arguments t1 .. t4, and
532 ds are the dictionary arguments d1 .. d2.
534 * Add a new definition for f1 (say):
536 f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
538 Note that we abstract over the unconstrained type arguments.
542 [t1,b,t3,d] |-> \d1 d2 -> f1 b d
544 to the specialisations of f. This will be used by the
545 simplifier to replace calls
546 (f t1 t2 t3 t4) da db
548 (\d1 d1 -> f1 t2 t4) da db
550 All the stuff about how many dictionaries to discard, and what types
551 to apply the specialised function to, are handled by the fact that the
552 SpecEnv contains a template for the result of the specialisation.
554 We don't build *partial* specialisations for f. For example:
556 f :: Eq a => a -> a -> Bool
557 {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
559 Here, little is gained by making a specialised copy of f.
560 There's a distinct danger that the specialised version would
561 first build a dictionary for (Eq b, Eq c), and then select the (==)
562 method from it! Even if it didn't, not a great deal is saved.
564 We do, however, generate polymorphic, but not overloaded, specialisations:
566 f :: Eq a => [a] -> b -> b -> b
567 {#- SPECIALISE f :: [Int] -> b -> b -> b #-}
569 Hence, the invariant is this:
571 *** no specialised version is overloaded ***
574 %************************************************************************
576 \subsubsection{The exported function}
578 %************************************************************************
581 specProgram :: UniqSupply -> [CoreBind] -> IO [CoreBind]
584 beginPass "Specialise"
586 let binds' = initSM us (go binds `thenSM` \ (binds', uds') ->
587 returnSM (dumpAllDictBinds uds' binds'))
589 endPass "Specialise" (opt_D_dump_spec || opt_D_verbose_core2core) binds'
591 dumpIfSet opt_D_dump_rules "Top-level specialisations"
592 (vcat (map dump_specs (concat (map bindersOf binds'))))
596 go [] = returnSM ([], emptyUDs)
597 go (bind:binds) = go binds `thenSM` \ (binds', uds) ->
598 specBind emptySubst bind uds `thenSM` \ (bind', uds') ->
599 returnSM (bind' ++ binds', uds')
601 dump_specs var = pprCoreRules var (getIdSpecialisation var)
604 %************************************************************************
606 \subsubsection{@specExpr@: the main function}
608 %************************************************************************
611 specVar :: Subst -> Id -> CoreExpr
612 specVar subst v = case lookupSubst subst v of
616 specExpr :: Subst -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
617 -- We carry a substitution down:
618 -- a) we must clone any binding that might flaot outwards,
619 -- to avoid name clashes
620 -- b) we carry a type substitution to use when analysing
621 -- the RHS of specialised bindings (no type-let!)
623 ---------------- First the easy cases --------------------
624 specExpr subst (Type ty) = returnSM (Type (substTy subst ty), emptyUDs)
625 specExpr subst (Var v) = returnSM (specVar subst v, emptyUDs)
627 specExpr subst e@(Con con args)
628 = mapAndCombineSM (specExpr subst) args `thenSM` \ (args', uds) ->
629 returnSM (Con con args', uds)
631 specExpr subst (Note note body)
632 = specExpr subst body `thenSM` \ (body', uds) ->
633 returnSM (Note (specNote subst note) body', uds)
636 ---------------- Applications might generate a call instance --------------------
637 specExpr subst expr@(App fun arg)
640 go (App fun arg) args = specExpr subst arg `thenSM` \ (arg', uds_arg) ->
641 go fun (arg':args) `thenSM` \ (fun', uds_app) ->
642 returnSM (App fun' arg', uds_arg `plusUDs` uds_app)
644 go (Var f) args = case specVar subst f of
645 Var f' -> returnSM (Var f', mkCallUDs f' args)
646 e' -> returnSM (e', emptyUDs) -- I don't expect this!
647 go other args = specExpr subst other
649 ---------------- Lambda/case require dumping of usage details --------------------
650 specExpr subst e@(Lam _ _)
651 = specExpr subst' body `thenSM` \ (body', uds) ->
653 (filtered_uds, body'') = dumpUDs bndrs' uds body'
655 returnSM (mkLams bndrs' body'', filtered_uds)
657 (bndrs, body) = collectBinders e
658 (subst', bndrs') = substBndrs subst bndrs
659 -- More efficient to collect a group of binders together all at once
660 -- and we don't want to split a lambda group with dumped bindings
662 specExpr subst (Case scrut case_bndr alts)
663 = specExpr subst scrut `thenSM` \ (scrut', uds_scrut) ->
664 mapAndCombineSM spec_alt alts `thenSM` \ (alts', uds_alts) ->
665 returnSM (Case scrut' case_bndr' alts', uds_scrut `plusUDs` uds_alts)
667 (subst_alt, case_bndr') = substId subst case_bndr
669 spec_alt (con, args, rhs)
670 = specExpr subst_rhs rhs `thenSM` \ (rhs', uds) ->
672 (uds', rhs'') = dumpUDs args uds rhs'
674 returnSM ((con, args', rhs''), uds')
676 (subst_rhs, args') = substBndrs subst_alt args
678 ---------------- Finally, let is the interesting case --------------------
679 specExpr subst (Let bind body)
681 cloneBindSM subst bind `thenSM` \ (rhs_subst, body_subst, bind') ->
683 -- Deal with the body
684 specExpr body_subst body `thenSM` \ (body', body_uds) ->
686 -- Deal with the bindings
687 specBind rhs_subst bind' body_uds `thenSM` \ (binds', uds) ->
690 returnSM (foldr Let body' binds', uds)
692 -- Must apply the type substitution to coerceions
693 specNote subst (Coerce t1 t2) = Coerce (substTy subst t1) (substTy subst t2)
694 specNote subst note = note
697 %************************************************************************
699 \subsubsection{Dealing with a binding}
701 %************************************************************************
704 specBind :: Subst -- Use this for RHSs
706 -> UsageDetails -- Info on how the scope of the binding
707 -> SpecM ([CoreBind], -- New bindings
708 UsageDetails) -- And info to pass upstream
710 specBind rhs_subst bind body_uds
711 = specBindItself rhs_subst bind (calls body_uds) `thenSM` \ (bind', bind_uds) ->
713 bndrs = bindersOf bind
714 all_uds = zapCalls bndrs (body_uds `plusUDs` bind_uds)
715 -- It's important that the `plusUDs` is this way round,
716 -- because body_uds may bind dictionaries that are
717 -- used in the calls passed to specDefn. So the
718 -- dictionary bindings in bind_uds may mention
719 -- dictionaries bound in body_uds.
721 case splitUDs bndrs all_uds of
723 (_, ([],[])) -- This binding doesn't bind anything needed
724 -- in the UDs, so put the binding here
725 -- This is the case for most non-dict bindings, except
726 -- for the few that are mentioned in a dict binding
727 -- that is floating upwards in body_uds
728 -> returnSM ([bind'], all_uds)
730 (float_uds, (dict_binds, calls)) -- This binding is needed in the UDs, so float it out
731 -> returnSM ([], float_uds `plusUDs` mkBigUD bind' dict_binds calls)
734 -- A truly gruesome function
735 mkBigUD bind@(NonRec _ _) dbs calls
736 = -- Common case: non-recursive and no specialisations
737 -- (if there were any specialistions it would have been made recursive)
738 MkUD { dict_binds = listToBag (mkDB bind : dbs),
739 calls = listToCallDetails calls }
741 mkBigUD bind dbs calls
743 MkUD { dict_binds = unitBag (mkDB (Rec (bind_prs bind ++ dbsToPairs dbs))),
745 calls = listToCallDetails calls }
747 bind_prs (NonRec b r) = [(b,r)]
748 bind_prs (Rec prs) = prs
751 dbsToPairs ((bind,_):dbs) = bind_prs bind ++ dbsToPairs dbs
753 -- specBindItself deals with the RHS, specialising it according
754 -- to the calls found in the body (if any)
755 specBindItself rhs_subst (NonRec bndr rhs) call_info
756 = specDefn rhs_subst call_info (bndr,rhs) `thenSM` \ ((bndr',rhs'), spec_defns, spec_uds) ->
758 new_bind | null spec_defns = NonRec bndr' rhs'
759 | otherwise = Rec ((bndr',rhs'):spec_defns)
760 -- bndr' mentions the spec_defns in its SpecEnv
761 -- Not sure why we couln't just put the spec_defns first
763 returnSM (new_bind, spec_uds)
765 specBindItself rhs_subst (Rec pairs) call_info
766 = mapSM (specDefn rhs_subst call_info) pairs `thenSM` \ stuff ->
768 (pairs', spec_defns_s, spec_uds_s) = unzip3 stuff
769 spec_defns = concat spec_defns_s
770 spec_uds = plusUDList spec_uds_s
771 new_bind = Rec (spec_defns ++ pairs')
773 returnSM (new_bind, spec_uds)
776 specDefn :: Subst -- Subst to use for RHS
777 -> CallDetails -- Info on how it is used in its scope
778 -> (Id, CoreExpr) -- The thing being bound and its un-processed RHS
779 -> SpecM ((Id, CoreExpr), -- The thing and its processed RHS
780 -- the Id may now have specialisations attached
781 [(Id,CoreExpr)], -- Extra, specialised bindings
782 UsageDetails -- Stuff to fling upwards from the RHS and its
783 ) -- specialised versions
785 specDefn subst calls (fn, rhs)
786 -- The first case is the interesting one
787 | n_tyvars == length rhs_tyvars -- Rhs of fn's defn has right number of big lambdas
788 && n_dicts <= length rhs_bndrs -- and enough dict args
789 && not (null calls_for_me) -- And there are some calls to specialise
790 = -- Specialise the body of the function
791 specExpr subst rhs `thenSM` \ (rhs', rhs_uds) ->
793 -- Make a specialised version for each call in calls_for_me
794 mapSM spec_call calls_for_me `thenSM` \ stuff ->
796 (spec_defns, spec_uds, spec_env_stuff) = unzip3 stuff
798 fn' = addIdSpecialisations zapped_fn spec_env_stuff
800 returnSM ((fn',rhs'),
802 rhs_uds `plusUDs` plusUDList spec_uds)
804 | otherwise -- No calls or RHS doesn't fit our preconceptions
805 = specExpr subst rhs `thenSM` \ (rhs', rhs_uds) ->
806 returnSM ((zapped_fn, rhs'), [], rhs_uds)
809 zapped_fn = modifyIdInfo zapSpecPragInfo fn
810 -- If the fn is a SpecPragmaId, make it discardable
811 -- It's role as a holder for a call instance is o'er
812 -- But it might be alive for some other reason by now.
815 (tyvars, theta, tau) = splitSigmaTy fn_type
816 n_tyvars = length tyvars
817 n_dicts = length theta
819 (rhs_tyvars, rhs_ids, rhs_body) = collectTyAndValBinders rhs
820 rhs_dicts = take n_dicts rhs_ids
821 rhs_bndrs = rhs_tyvars ++ rhs_dicts
822 body = mkLams (drop n_dicts rhs_ids) rhs_body
823 -- Glue back on the non-dict lambdas
825 calls_for_me = case lookupFM calls fn of
827 Just cs -> fmToList cs
829 ----------------------------------------------------------
830 -- Specialise to one particular call pattern
831 spec_call :: ([Maybe Type], ([DictExpr], IdOrTyVarSet)) -- Call instance
832 -> SpecM ((Id,CoreExpr), -- Specialised definition
833 UsageDetails, -- Usage details from specialised body
834 ([CoreBndr], [CoreExpr], CoreExpr)) -- Info for the Id's SpecEnv
835 spec_call (call_ts, (call_ds, call_fvs))
836 = ASSERT( length call_ts == n_tyvars && length call_ds == n_dicts )
837 -- Calls are only recorded for properly-saturated applications
839 -- Suppose f's defn is f = /\ a b c d -> \ d1 d2 -> rhs
840 -- Supppose the call is for f [Just t1, Nothing, Just t3, Nothing] [dx1, dx2]
842 -- Construct the new binding
843 -- f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b d -> rhs)
844 -- PLUS the usage-details
845 -- { d1' = dx1; d2' = dx2 }
846 -- where d1', d2' are cloned versions of d1,d2, with the type substitution applied.
848 -- Note that the substitution is applied to the whole thing.
849 -- This is convenient, but just slightly fragile. Notably:
850 -- * There had better be no name clashes in a/b/c/d
853 -- poly_tyvars = [b,d] in the example above
854 -- spec_tyvars = [a,c]
855 -- ty_args = [t1,b,t3,d]
856 poly_tyvars = [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts]
857 spec_tyvars = [tv | (tv, Just _) <- rhs_tyvars `zip` call_ts]
858 ty_args = zipWithEqual "spec_call" mk_ty_arg rhs_tyvars call_ts
860 mk_ty_arg rhs_tyvar Nothing = Type (mkTyVarTy rhs_tyvar)
861 mk_ty_arg rhs_tyvar (Just ty) = Type ty
862 rhs_subst = extendSubstList subst spec_tyvars [DoneTy ty | Just ty <- call_ts]
864 cloneBinders rhs_subst rhs_dicts `thenSM` \ (rhs_subst', rhs_dicts') ->
866 inst_args = ty_args ++ map Var rhs_dicts'
868 -- Figure out the type of the specialised function
869 spec_id_ty = mkForAllTys poly_tyvars (applyTypeToArgs rhs fn_type inst_args)
871 newIdSM fn spec_id_ty `thenSM` \ spec_f ->
872 specExpr rhs_subst' (mkLams poly_tyvars body) `thenSM` \ (spec_rhs, rhs_uds) ->
874 -- The rule to put in the function's specialisation is:
875 -- forall b,d, d1',d2'. f t1 b t3 d d1' d2' = f1 b d
876 spec_env_rule = (poly_tyvars ++ rhs_dicts',
878 mkTyApps (Var spec_f) (map mkTyVarTy poly_tyvars))
880 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
881 final_uds = foldr addDictBind rhs_uds (zipEqual "spec_call" rhs_dicts' call_ds)
883 returnSM ((spec_f, spec_rhs),
888 %************************************************************************
890 \subsubsection{UsageDetails and suchlike}
892 %************************************************************************
897 dict_binds :: !(Bag DictBind),
898 -- Floated dictionary bindings
899 -- The order is important;
900 -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
901 -- (Remember, Bags preserve order in GHC.)
903 calls :: !CallDetails
906 type DictBind = (CoreBind, IdOrTyVarSet)
907 -- The set is the free vars of the binding
908 -- both tyvars and dicts
910 type DictExpr = CoreExpr
912 emptyUDs = MkUD { dict_binds = emptyBag, calls = emptyFM }
914 type ProtoUsageDetails = ([DictBind],
915 [(Id, [Maybe Type], ([DictExpr], IdOrTyVarSet))]
918 ------------------------------------------------------------
919 type CallDetails = FiniteMap Id CallInfo
920 type CallInfo = FiniteMap [Maybe Type] -- Nothing => unconstrained type argument
921 ([DictExpr], IdOrTyVarSet) -- Dict args and the vars of the whole
922 -- call (including tyvars)
923 -- [*not* include the main id itself, of course]
924 -- The finite maps eliminate duplicates
925 -- The list of types and dictionaries is guaranteed to
926 -- match the type of f
928 unionCalls :: CallDetails -> CallDetails -> CallDetails
929 unionCalls c1 c2 = plusFM_C plusFM c1 c2
931 singleCall :: (Id, [Maybe Type], [DictExpr]) -> CallDetails
932 singleCall (id, tys, dicts)
933 = unitFM id (unitFM tys (dicts, call_fvs))
935 call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs
936 tys_fvs = tyVarsOfTypes (catMaybes tys)
937 -- The type args (tys) are guaranteed to be part of the dictionary
938 -- types, because they are just the constrained types,
939 -- and the dictionary is therefore sure to be bound
940 -- inside the binding for any type variables free in the type;
941 -- hence it's safe to neglect tyvars free in tys when making
942 -- the free-var set for this call
943 -- BUT I don't trust this reasoning; play safe and include tys_fvs
945 -- We don't include the 'id' itself.
947 listToCallDetails calls
948 = foldr (unionCalls . mk_call) emptyFM calls
950 mk_call (id, tys, dicts_w_fvs) = unitFM id (unitFM tys dicts_w_fvs)
951 -- NB: the free vars of the call are provided
953 callDetailsToList calls = [ (id,tys,dicts)
954 | (id,fm) <- fmToList calls,
955 (tys,dicts) <- fmToList fm
960 || length spec_tys /= n_tyvars
961 || length dicts /= n_dicts
962 = emptyUDs -- Not overloaded
965 = MkUD {dict_binds = emptyBag,
966 calls = singleCall (f, spec_tys, dicts)
969 (tyvars, theta, tau) = splitSigmaTy (idType f)
970 constrained_tyvars = foldr (unionVarSet . tyVarsOfTypes . snd) emptyVarSet theta
971 n_tyvars = length tyvars
972 n_dicts = length theta
974 spec_tys = [mk_spec_ty tv ty | (tv, Type ty) <- tyvars `zip` args]
975 dicts = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)]
977 mk_spec_ty tyvar ty | tyvar `elemVarSet` constrained_tyvars
982 ------------------------------------------------------------
983 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
984 plusUDs (MkUD {dict_binds = db1, calls = calls1})
985 (MkUD {dict_binds = db2, calls = calls2})
986 = MkUD {dict_binds = d, calls = c}
988 d = db1 `unionBags` db2
989 c = calls1 `unionCalls` calls2
991 plusUDList = foldr plusUDs emptyUDs
993 -- zapCalls deletes calls to ids from uds
994 zapCalls ids uds = uds {calls = delListFromFM (calls uds) ids}
996 mkDB bind = (bind, bind_fvs bind)
998 bind_fvs (NonRec bndr rhs) = exprFreeVars rhs
999 bind_fvs (Rec prs) = foldl delVarSet rhs_fvs bndrs
1002 rhs_fvs = unionVarSets [exprFreeVars rhs | (bndr,rhs) <- prs]
1004 addDictBind (dict,rhs) uds = uds { dict_binds = mkDB (NonRec dict rhs) `consBag` dict_binds uds }
1006 dumpAllDictBinds (MkUD {dict_binds = dbs}) binds
1007 = foldrBag add binds dbs
1009 add (bind,_) binds = bind : binds
1011 dumpUDs :: [CoreBndr]
1012 -> UsageDetails -> CoreExpr
1013 -> (UsageDetails, CoreExpr)
1014 dumpUDs bndrs uds body
1015 = (free_uds, foldr add_let body dict_binds)
1017 (free_uds, (dict_binds, _)) = splitUDs bndrs uds
1018 add_let (bind,_) body = Let bind body
1020 splitUDs :: [CoreBndr]
1022 -> (UsageDetails, -- These don't mention the binders
1023 ProtoUsageDetails) -- These do
1025 splitUDs bndrs uds@(MkUD {dict_binds = orig_dbs,
1026 calls = orig_calls})
1028 = if isEmptyBag dump_dbs && null dump_calls then
1029 -- Common case: binder doesn't affect floats
1033 -- Binders bind some of the fvs of the floats
1034 (MkUD {dict_binds = free_dbs,
1035 calls = listToCallDetails free_calls},
1036 (bagToList dump_dbs, dump_calls)
1040 bndr_set = mkVarSet bndrs
1042 (free_dbs, dump_dbs, dump_idset)
1043 = foldlBag dump_db (emptyBag, emptyBag, bndr_set) orig_dbs
1044 -- Important that it's foldl not foldr;
1045 -- we're accumulating the set of dumped ids in dump_set
1047 -- Filter out any calls that mention things that are being dumped
1048 orig_call_list = callDetailsToList orig_calls
1049 (dump_calls, free_calls) = partition captured orig_call_list
1050 captured (id,tys,(dicts, fvs)) = fvs `intersectsVarSet` dump_idset
1051 || id `elemVarSet` dump_idset
1053 dump_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
1054 | dump_idset `intersectsVarSet` fvs -- Dump it
1055 = (free_dbs, dump_dbs `snocBag` db,
1056 dump_idset `unionVarSet` mkVarSet (bindersOf bind))
1058 | otherwise -- Don't dump it
1059 = (free_dbs `snocBag` db, dump_dbs, dump_idset)
1063 %************************************************************************
1065 \subsubsection{Boring helper functions}
1067 %************************************************************************
1070 lookupId:: IdEnv Id -> Id -> Id
1071 lookupId env id = case lookupVarEnv env id of
1075 ----------------------------------------
1076 type SpecM a = UniqSM a
1081 getUniqSM = getUniqueUs
1082 getUniqSupplySM = getUs
1083 setUniqSupplySM = setUs
1087 mapAndCombineSM f [] = returnSM ([], emptyUDs)
1088 mapAndCombineSM f (x:xs) = f x `thenSM` \ (y, uds1) ->
1089 mapAndCombineSM f xs `thenSM` \ (ys, uds2) ->
1090 returnSM (y:ys, uds1 `plusUDs` uds2)
1092 cloneBindSM :: Subst -> CoreBind -> SpecM (Subst, Subst, CoreBind)
1093 -- Clone the binders of the bind; return new bind with the cloned binders
1094 -- Return the substitution to use for RHSs, and the one to use for the body
1095 cloneBindSM subst (NonRec bndr rhs)
1096 = getUs `thenUs` \ us ->
1098 (subst', us', bndr') = substAndCloneId subst us bndr
1101 returnUs (subst, subst', NonRec bndr' rhs)
1103 cloneBindSM subst (Rec pairs)
1104 = getUs `thenUs` \ us ->
1106 (subst', us', bndrs') = substAndCloneIds subst us (map fst pairs)
1109 returnUs (subst', subst', Rec (bndrs' `zip` map snd pairs))
1111 cloneBinders subst bndrs
1112 = getUs `thenUs` \ us ->
1114 (subst', us', bndrs') = substAndCloneIds subst us bndrs
1117 returnUs (subst', bndrs')
1120 newIdSM old_id new_ty
1121 = getUniqSM `thenSM` \ uniq ->
1123 -- Give the new Id a similar occurrence name to the old one
1124 name = idName old_id
1125 new_id = mkUserLocal (mkSpecOcc (nameOccName name)) uniq new_ty (getSrcLoc name)
1127 -- If the old Id was exported, make the new one non-discardable,
1128 -- else we will discard it since it doesn't seem to be called.
1129 new_id' | isExportedId old_id = setIdNoDiscard new_id
1130 | otherwise = new_id
1135 = getUniqSM `thenSM` \ uniq ->
1136 returnSM (mkSysTyVar uniq boxedTypeKind)
1140 Old (but interesting) stuff about unboxed bindings
1141 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1143 What should we do when a value is specialised to a *strict* unboxed value?
1145 map_*_* f (x:xs) = let h = f x
1149 Could convert let to case:
1151 map_*_Int# f (x:xs) = case f x of h# ->
1155 This may be undesirable since it forces evaluation here, but the value
1156 may not be used in all branches of the body. In the general case this
1157 transformation is impossible since the mutual recursion in a letrec
1158 cannot be expressed as a case.
1160 There is also a problem with top-level unboxed values, since our
1161 implementation cannot handle unboxed values at the top level.
1163 Solution: Lift the binding of the unboxed value and extract it when it
1166 map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
1171 Now give it to the simplifier and the _Lifting will be optimised away.
1173 The benfit is that we have given the specialised "unboxed" values a
1174 very simplep lifted semantics and then leave it up to the simplifier to
1175 optimise it --- knowing that the overheads will be removed in nearly
1178 In particular, the value will only be evaluted in the branches of the
1179 program which use it, rather than being forced at the point where the
1180 value is bound. For example:
1182 filtermap_*_* p f (x:xs)
1189 filtermap_*_Int# p f (x:xs)
1190 = let h = case (f x) of h# -> _Lift h#
1193 True -> case h of _Lift h#
1197 The binding for h can still be inlined in the one branch and the
1198 _Lifting eliminated.
1201 Question: When won't the _Lifting be eliminated?
1203 Answer: When they at the top-level (where it is necessary) or when
1204 inlining would duplicate work (or possibly code depending on
1205 options). However, the _Lifting will still be eliminated if the
1206 strictness analyser deems the lifted binding strict.