compiler/specialise/Specialise.lhs

   1 %
   2 % (c) The GRASP/AQUA Project, Glasgow University, 1993-1998
   3 %
   4 \section[Specialise]{Stamping out overloading, and (optionally) polymorphism}
   5
   6 \begin{code}
   7 {-# OPTIONS -w #-}
   8 -- The above warning supression flag is a temporary kludge.
   9 -- While working on this module you are encouraged to remove it and fix
  10 -- any warnings in the module. See
  11 --     http://hackage.haskell.org/trac/ghc/wiki/Commentary/CodingStyle#Warnings
  12 -- for details
  13
  14 module Specialise ( specProgram ) where
  15
  16 #include "HsVersions.h"
  17
  18 import DynFlags ( DynFlags, DynFlag(..) )
  19 import Id               ( Id, idName, idType, mkUserLocal,
  20                           idInlinePragma, setInlinePragma )
  21 import TcType           ( Type, mkTyVarTy, tcSplitSigmaTy,
  22                           tyVarsOfTypes, tyVarsOfTheta, isClassPred,
  23                           tcCmpType, isUnLiftedType
  24                         )
  25 import CoreSubst        ( Subst, mkEmptySubst, extendTvSubstList, lookupIdSubst,
  26                           substBndr, substBndrs, substTy, substInScope,
  27                           cloneIdBndr, cloneIdBndrs, cloneRecIdBndrs
  28                         )
  29 import VarSet
  30 import VarEnv
  31 import CoreSyn
  32 import CoreUtils        ( applyTypeToArgs, mkPiTypes )
  33 import CoreFVs          ( exprFreeVars, exprsFreeVars, idFreeVars )
  34 import CoreTidy         ( tidyRules )
  35 import CoreLint         ( showPass, endPass )
  36 import Rules            ( addIdSpecialisations, mkLocalRule, lookupRule, emptyRuleBase, rulesOfBinds )
  37 import PprCore          ( pprRules )
  38 import UniqSupply       ( UniqSupply,
  39                           UniqSM, initUs_,
  40                           MonadUnique(..)
  41                         )
  42 import Name
  43 import MkId             ( voidArgId, realWorldPrimId )
  44 import FiniteMap
  45 import Maybes           ( catMaybes, maybeToBool )
  46 import ErrUtils         ( dumpIfSet_dyn )
  47 import BasicTypes       ( Activation( AlwaysActive ) )
  48 import Bag
  49 import List             ( partition )
  50 import Util             ( zipEqual, zipWithEqual, cmpList, lengthIs,
  51                           equalLength, lengthAtLeast, notNull )
  52 import Outputable
  53 import FastString
  54
  55 \end{code}
  56
  57 %************************************************************************
  58 %*                                                                      *
  59 \subsection[notes-Specialise]{Implementation notes [SLPJ, Aug 18 1993]}
  60 %*                                                                      *
  61 %************************************************************************
  62
  63 These notes describe how we implement specialisation to eliminate
  64 overloading.
  65
  66 The specialisation pass works on Core
  67 syntax, complete with all the explicit dictionary application,
  68 abstraction and construction as added by the type checker.  The
  69 existing type checker remains largely as it is.
  70
  71 One important thought: the {\em types} passed to an overloaded
  72 function, and the {\em dictionaries} passed are mutually redundant.
  73 If the same function is applied to the same type(s) then it is sure to
  74 be applied to the same dictionary(s)---or rather to the same {\em
  75 values}.  (The arguments might look different but they will evaluate
  76 to the same value.)
  77
  78 Second important thought: we know that we can make progress by
  79 treating dictionary arguments as static and worth specialising on.  So
  80 we can do without binding-time analysis, and instead specialise on
  81 dictionary arguments and no others.
  82
  83 The basic idea
  84 ~~~~~~~~~~~~~~
  85 Suppose we have
  86
  87         let f = <f_rhs>
  88         in <body>
  89
  90 and suppose f is overloaded.
  91
  92 STEP 1: CALL-INSTANCE COLLECTION
  93
  94 We traverse <body>, accumulating all applications of f to types and
  95 dictionaries.
  96
  97 (Might there be partial applications, to just some of its types and
  98 dictionaries?  In principle yes, but in practice the type checker only
  99 builds applications of f to all its types and dictionaries, so partial
 100 applications could only arise as a result of transformation, and even
 101 then I think it's unlikely.  In any case, we simply don't accumulate such
 102 partial applications.)
 103
 104
 105 STEP 2: EQUIVALENCES
 106
 107 So now we have a collection of calls to f:
 108         f t1 t2 d1 d2
 109         f t3 t4 d3 d4
 110         ...
 111 Notice that f may take several type arguments.  To avoid ambiguity, we
 112 say that f is called at type t1/t2 and t3/t4.
 113
 114 We take equivalence classes using equality of the *types* (ignoring
 115 the dictionary args, which as mentioned previously are redundant).
 116
 117 STEP 3: SPECIALISATION
 118
 119 For each equivalence class, choose a representative (f t1 t2 d1 d2),
 120 and create a local instance of f, defined thus:
 121
 122         f@t1/t2 = <f_rhs> t1 t2 d1 d2
 123
 124 f_rhs presumably has some big lambdas and dictionary lambdas, so lots
 125 of simplification will now result.  However we don't actually *do* that
 126 simplification.  Rather, we leave it for the simplifier to do.  If we
 127 *did* do it, though, we'd get more call instances from the specialised
 128 RHS.  We can work out what they are by instantiating the call-instance
 129 set from f's RHS with the types t1, t2.
 130
 131 Add this new id to f's IdInfo, to record that f has a specialised version.
 132
 133 Before doing any of this, check that f's IdInfo doesn't already
 134 tell us about an existing instance of f at the required type/s.
 135 (This might happen if specialisation was applied more than once, or
 136 it might arise from user SPECIALIZE pragmas.)
 137
 138 Recursion
 139 ~~~~~~~~~
 140 Wait a minute!  What if f is recursive?  Then we can't just plug in
 141 its right-hand side, can we?
 142
 143 But it's ok.  The type checker *always* creates non-recursive definitions
 144 for overloaded recursive functions.  For example:
 145
 146         f x = f (x+x)           -- Yes I know its silly
 147
 148 becomes
 149
 150         f a (d::Num a) = let p = +.sel a d
 151                          in
 152                          letrec fl (y::a) = fl (p y y)
 153                          in
 154                          fl
 155
 156 We still have recusion for non-overloaded functions which we
 157 speciailise, but the recursive call should get specialised to the
 158 same recursive version.
 159
 160
 161 Polymorphism 1
 162 ~~~~~~~~~~~~~~
 163
 164 All this is crystal clear when the function is applied to *constant
 165 types*; that is, types which have no type variables inside.  But what if
 166 it is applied to non-constant types?  Suppose we find a call of f at type
 167 t1/t2.  There are two possibilities:
 168
 169 (a) The free type variables of t1, t2 are in scope at the definition point
 170 of f.  In this case there's no problem, we proceed just as before.  A common
 171 example is as follows.  Here's the Haskell:
 172
 173         g y = let f x = x+x
 174               in f y + f y
 175
 176 After typechecking we have
 177
 178         g a (d::Num a) (y::a) = let f b (d'::Num b) (x::b) = +.sel b d' x x
 179                                 in +.sel a d (f a d y) (f a d y)
 180
 181 Notice that the call to f is at type type "a"; a non-constant type.
 182 Both calls to f are at the same type, so we can specialise to give:
 183
 184         g a (d::Num a) (y::a) = let f@a (x::a) = +.sel a d x x
 185                                 in +.sel a d (f@a y) (f@a y)
 186
 187
 188 (b) The other case is when the type variables in the instance types
 189 are *not* in scope at the definition point of f.  The example we are
 190 working with above is a good case.  There are two instances of (+.sel a d),
 191 but "a" is not in scope at the definition of +.sel.  Can we do anything?
 192 Yes, we can "common them up", a sort of limited common sub-expression deal.
 193 This would give:
 194
 195         g a (d::Num a) (y::a) = let +.sel@a = +.sel a d
 196                                     f@a (x::a) = +.sel@a x x
 197                                 in +.sel@a (f@a y) (f@a y)
 198
 199 This can save work, and can't be spotted by the type checker, because
 200 the two instances of +.sel weren't originally at the same type.
 201
 202 Further notes on (b)
 203
 204 * There are quite a few variations here.  For example, the defn of
 205   +.sel could be floated ouside the \y, to attempt to gain laziness.
 206   It certainly mustn't be floated outside the \d because the d has to
 207   be in scope too.
 208
 209 * We don't want to inline f_rhs in this case, because
 210 that will duplicate code.  Just commoning up the call is the point.
 211
 212 * Nothing gets added to +.sel's IdInfo.
 213
 214 * Don't bother unless the equivalence class has more than one item!
 215
 216 Not clear whether this is all worth it.  It is of course OK to
 217 simply discard call-instances when passing a big lambda.
 218
 219 Polymorphism 2 -- Overloading
 220 ~~~~~~~~~~~~~~
 221 Consider a function whose most general type is
 222
 223         f :: forall a b. Ord a => [a] -> b -> b
 224
 225 There is really no point in making a version of g at Int/Int and another
 226 at Int/Bool, because it's only instancing the type variable "a" which
 227 buys us any efficiency. Since g is completely polymorphic in b there
 228 ain't much point in making separate versions of g for the different
 229 b types.
 230
 231 That suggests that we should identify which of g's type variables
 232 are constrained (like "a") and which are unconstrained (like "b").
 233 Then when taking equivalence classes in STEP 2, we ignore the type args
 234 corresponding to unconstrained type variable.  In STEP 3 we make
 235 polymorphic versions.  Thus:
 236
 237         f@t1/ = /\b -> <f_rhs> t1 b d1 d2
 238
 239 We do this.
 240
 241
 242 Dictionary floating
 243 ~~~~~~~~~~~~~~~~~~~
 244 Consider this
 245
 246         f a (d::Num a) = let g = ...
 247                          in
 248                          ...(let d1::Ord a = Num.Ord.sel a d in g a d1)...
 249
 250 Here, g is only called at one type, but the dictionary isn't in scope at the
 251 definition point for g.  Usually the type checker would build a
 252 definition for d1 which enclosed g, but the transformation system
 253 might have moved d1's defn inward.  Solution: float dictionary bindings
 254 outwards along with call instances.
 255
 256 Consider
 257
 258         f x = let g p q = p==q
 259                   h r s = (r+s, g r s)
 260               in
 261               h x x
 262
 263
 264 Before specialisation, leaving out type abstractions we have
 265
 266         f df x = let g :: Eq a => a -> a -> Bool
 267                      g dg p q = == dg p q
 268                      h :: Num a => a -> a -> (a, Bool)
 269                      h dh r s = let deq = eqFromNum dh
 270                                 in (+ dh r s, g deq r s)
 271               in
 272               h df x x
 273
 274 After specialising h we get a specialised version of h, like this:
 275
 276                     h' r s = let deq = eqFromNum df
 277                              in (+ df r s, g deq r s)
 278
 279 But we can't naively make an instance for g from this, because deq is not in scope
 280 at the defn of g.  Instead, we have to float out the (new) defn of deq
 281 to widen its scope.  Notice that this floating can't be done in advance -- it only
 282 shows up when specialisation is done.
 283
 284 User SPECIALIZE pragmas
 285 ~~~~~~~~~~~~~~~~~~~~~~~
 286 Specialisation pragmas can be digested by the type checker, and implemented
 287 by adding extra definitions along with that of f, in the same way as before
 288
 289         f@t1/t2 = <f_rhs> t1 t2 d1 d2
 290
 291 Indeed the pragmas *have* to be dealt with by the type checker, because
 292 only it knows how to build the dictionaries d1 and d2!  For example
 293
 294         g :: Ord a => [a] -> [a]
 295         {-# SPECIALIZE f :: [Tree Int] -> [Tree Int] #-}
 296
 297 Here, the specialised version of g is an application of g's rhs to the
 298 Ord dictionary for (Tree Int), which only the type checker can conjure
 299 up.  There might not even *be* one, if (Tree Int) is not an instance of
 300 Ord!  (All the other specialision has suitable dictionaries to hand
 301 from actual calls.)
 302
 303 Problem.  The type checker doesn't have to hand a convenient <f_rhs>, because
 304 it is buried in a complex (as-yet-un-desugared) binding group.
 305 Maybe we should say
 306
 307         f@t1/t2 = f* t1 t2 d1 d2
 308
 309 where f* is the Id f with an IdInfo which says "inline me regardless!".
 310 Indeed all the specialisation could be done in this way.
 311 That in turn means that the simplifier has to be prepared to inline absolutely
 312 any in-scope let-bound thing.
 313
 314
 315 Again, the pragma should permit polymorphism in unconstrained variables:
 316
 317         h :: Ord a => [a] -> b -> b
 318         {-# SPECIALIZE h :: [Int] -> b -> b #-}
 319
 320 We *insist* that all overloaded type variables are specialised to ground types,
 321 (and hence there can be no context inside a SPECIALIZE pragma).
 322 We *permit* unconstrained type variables to be specialised to
 323         - a ground type
 324         - or left as a polymorphic type variable
 325 but nothing in between.  So
 326
 327         {-# SPECIALIZE h :: [Int] -> [c] -> [c] #-}
 328
 329 is *illegal*.  (It can be handled, but it adds complication, and gains the
 330 programmer nothing.)
 331
 332
 333 SPECIALISING INSTANCE DECLARATIONS
 334 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 335 Consider
 336
 337         instance Foo a => Foo [a] where
 338                 ...
 339         {-# SPECIALIZE instance Foo [Int] #-}
 340
 341 The original instance decl creates a dictionary-function
 342 definition:
 343
 344         dfun.Foo.List :: forall a. Foo a -> Foo [a]
 345
 346 The SPECIALIZE pragma just makes a specialised copy, just as for
 347 ordinary function definitions:
 348
 349         dfun.Foo.List@Int :: Foo [Int]
 350         dfun.Foo.List@Int = dfun.Foo.List Int dFooInt
 351
 352 The information about what instance of the dfun exist gets added to
 353 the dfun's IdInfo in the same way as a user-defined function too.
 354
 355
 356 Automatic instance decl specialisation?
 357 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 358 Can instance decls be specialised automatically?  It's tricky.
 359 We could collect call-instance information for each dfun, but
 360 then when we specialised their bodies we'd get new call-instances
 361 for ordinary functions; and when we specialised their bodies, we might get
 362 new call-instances of the dfuns, and so on.  This all arises because of
 363 the unrestricted mutual recursion between instance decls and value decls.
 364
 365 Still, there's no actual problem; it just means that we may not do all
 366 the specialisation we could theoretically do.
 367
 368 Furthermore, instance decls are usually exported and used non-locally,
 369 so we'll want to compile enough to get those specialisations done.
 370
 371 Lastly, there's no such thing as a local instance decl, so we can
 372 survive solely by spitting out *usage* information, and then reading that
 373 back in as a pragma when next compiling the file.  So for now,
 374 we only specialise instance decls in response to pragmas.
 375
 376
 377 SPITTING OUT USAGE INFORMATION
 378 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 379
 380 To spit out usage information we need to traverse the code collecting
 381 call-instance information for all imported (non-prelude?) functions
 382 and data types. Then we equivalence-class it and spit it out.
 383
 384 This is done at the top-level when all the call instances which escape
 385 must be for imported functions and data types.
 386
 387 *** Not currently done ***
 388
 389
 390 Partial specialisation by pragmas
 391 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 392 What about partial specialisation:
 393
 394         k :: (Ord a, Eq b) => [a] -> b -> b -> [a]
 395         {-# SPECIALIZE k :: Eq b => [Int] -> b -> b -> [a] #-}
 396
 397 or even
 398
 399         {-# SPECIALIZE k :: Eq b => [Int] -> [b] -> [b] -> [a] #-}
 400
 401 Seems quite reasonable.  Similar things could be done with instance decls:
 402
 403         instance (Foo a, Foo b) => Foo (a,b) where
 404                 ...
 405         {-# SPECIALIZE instance Foo a => Foo (a,Int) #-}
 406         {-# SPECIALIZE instance Foo b => Foo (Int,b) #-}
 407
 408 Ho hum.  Things are complex enough without this.  I pass.
 409
 410
 411 Requirements for the simplifer
 412 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 413 The simplifier has to be able to take advantage of the specialisation.
 414
 415 * When the simplifier finds an application of a polymorphic f, it looks in
 416 f's IdInfo in case there is a suitable instance to call instead.  This converts
 417
 418         f t1 t2 d1 d2   ===>   f_t1_t2
 419
 420 Note that the dictionaries get eaten up too!
 421
 422 * Dictionary selection operations on constant dictionaries must be
 423   short-circuited:
 424
 425         +.sel Int d     ===>  +Int
 426
 427 The obvious way to do this is in the same way as other specialised
 428 calls: +.sel has inside it some IdInfo which tells that if it's applied
 429 to the type Int then it should eat a dictionary and transform to +Int.
 430
 431 In short, dictionary selectors need IdInfo inside them for constant
 432 methods.
 433
 434 * Exactly the same applies if a superclass dictionary is being
 435   extracted:
 436
 437         Eq.sel Int d   ===>   dEqInt
 438
 439 * Something similar applies to dictionary construction too.  Suppose
 440 dfun.Eq.List is the function taking a dictionary for (Eq a) to
 441 one for (Eq [a]).  Then we want
 442
 443         dfun.Eq.List Int d      ===> dEq.List_Int
 444
 445 Where does the Eq [Int] dictionary come from?  It is built in
 446 response to a SPECIALIZE pragma on the Eq [a] instance decl.
 447
 448 In short, dfun Ids need IdInfo with a specialisation for each
 449 constant instance of their instance declaration.
 450
 451 All this uses a single mechanism: the SpecEnv inside an Id
 452
 453
 454 What does the specialisation IdInfo look like?
 455 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 456
 457 The SpecEnv of an Id maps a list of types (the template) to an expression
 458
 459         [Type]  |->  Expr
 460
 461 For example, if f has this SpecInfo:
 462
 463         [Int, a]  ->  \d:Ord Int. f' a
 464
 465 it means that we can replace the call
 466
 467         f Int t  ===>  (\d. f' t)
 468
 469 This chucks one dictionary away and proceeds with the
 470 specialised version of f, namely f'.
 471
 472
 473 What can't be done this way?
 474 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 475 There is no way, post-typechecker, to get a dictionary for (say)
 476 Eq a from a dictionary for Eq [a].  So if we find
 477
 478         ==.sel [t] d
 479
 480 we can't transform to
 481
 482         eqList (==.sel t d')
 483
 484 where
 485         eqList :: (a->a->Bool) -> [a] -> [a] -> Bool
 486
 487 Of course, we currently have no way to automatically derive
 488 eqList, nor to connect it to the Eq [a] instance decl, but you
 489 can imagine that it might somehow be possible.  Taking advantage
 490 of this is permanently ruled out.
 491
 492 Still, this is no great hardship, because we intend to eliminate
 493 overloading altogether anyway!
 494
 495
 496
 497 A note about non-tyvar dictionaries
 498 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 499 Some Ids have types like
 500
 501         forall a,b,c. Eq a -> Ord [a] -> tau
 502
 503 This seems curious at first, because we usually only have dictionary
 504 args whose types are of the form (C a) where a is a type variable.
 505 But this doesn't hold for the functions arising from instance decls,
 506 which sometimes get arguements with types of form (C (T a)) for some
 507 type constructor T.
 508
 509 Should we specialise wrt this compound-type dictionary?  We used to say
 510 "no", saying:
 511         "This is a heuristic judgement, as indeed is the fact that we
 512         specialise wrt only dictionaries.  We choose *not* to specialise
 513         wrt compound dictionaries because at the moment the only place
 514         they show up is in instance decls, where they are simply plugged
 515         into a returned dictionary.  So nothing is gained by specialising
 516         wrt them."
 517
 518 But it is simpler and more uniform to specialise wrt these dicts too;
 519 and in future GHC is likely to support full fledged type signatures
 520 like
 521         f ;: Eq [(a,b)] => ...
 522
 523
 524 %************************************************************************
 525 %*                                                                      *
 526 \subsubsection{The new specialiser}
 527 %*                                                                      *
 528 %************************************************************************
 529
 530 Our basic game plan is this.  For let(rec) bound function
 531         f :: (C a, D c) => (a,b,c,d) -> Bool
 532
 533 * Find any specialised calls of f, (f ts ds), where
 534   ts are the type arguments t1 .. t4, and
 535   ds are the dictionary arguments d1 .. d2.
 536
 537 * Add a new definition for f1 (say):
 538
 539         f1 = /\ b d -> (..body of f..) t1 b t3 d d1 d2
 540
 541   Note that we abstract over the unconstrained type arguments.
 542
 543 * Add the mapping
 544
 545         [t1,b,t3,d]  |->  \d1 d2 -> f1 b d
 546
 547   to the specialisations of f.  This will be used by the
 548   simplifier to replace calls
 549                 (f t1 t2 t3 t4) da db
 550   by
 551                 (\d1 d1 -> f1 t2 t4) da db
 552
 553   All the stuff about how many dictionaries to discard, and what types
 554   to apply the specialised function to, are handled by the fact that the
 555   SpecEnv contains a template for the result of the specialisation.
 556
 557 We don't build *partial* specialisations for f.  For example:
 558
 559   f :: Eq a => a -> a -> Bool
 560   {-# SPECIALISE f :: (Eq b, Eq c) => (b,c) -> (b,c) -> Bool #-}
 561
 562 Here, little is gained by making a specialised copy of f.
 563 There's a distinct danger that the specialised version would
 564 first build a dictionary for (Eq b, Eq c), and then select the (==)
 565 method from it!  Even if it didn't, not a great deal is saved.
 566
 567 We do, however, generate polymorphic, but not overloaded, specialisations:
 568
 569   f :: Eq a => [a] -> b -> b -> b
 570   {#- SPECIALISE f :: [Int] -> b -> b -> b #-}
 571
 572 Hence, the invariant is this:
 573
 574         *** no specialised version is overloaded ***
 575
 576
 577 %************************************************************************
 578 %*                                                                      *
 579 \subsubsection{The exported function}
 580 %*                                                                      *
 581 %************************************************************************
 582
 583 \begin{code}
 584 specProgram :: DynFlags -> UniqSupply -> [CoreBind] -> IO [CoreBind]
 585 specProgram dflags us binds = do
 586
 587         showPass dflags "Specialise"
 588
 589         let binds' = initSM us (do (binds', uds') <- go binds
 590                                    return (dumpAllDictBinds uds' binds'))
 591
 592         endPass dflags "Specialise" Opt_D_dump_spec binds'
 593
 594         dumpIfSet_dyn dflags Opt_D_dump_rules "Top-level specialisations"
 595                   (pprRules (tidyRules emptyTidyEnv (rulesOfBinds binds')))
 596
 597         return binds'
 598   where
 599         -- We need to start with a Subst that knows all the things
 600         -- that are in scope, so that the substitution engine doesn't
 601         -- accidentally re-use a unique that's already in use
 602         -- Easiest thing is to do it all at once, as if all the top-level
 603         -- decls were mutually recursive
 604     top_subst       = mkEmptySubst (mkInScopeSet (mkVarSet (bindersOfBinds binds)))
 605
 606     go []           = return ([], emptyUDs)
 607     go (bind:binds) = do (binds', uds) <- go binds
 608                          (bind', uds') <- specBind top_subst bind uds
 609                          return (bind' ++ binds', uds')
 610 \end{code}
 611
 612 %************************************************************************
 613 %*                                                                      *
 614 \subsubsection{@specExpr@: the main function}
 615 %*                                                                      *
 616 %************************************************************************
 617
 618 \begin{code}
 619 specVar :: Subst -> Id -> CoreExpr
 620 specVar subst v = lookupIdSubst subst v
 621
 622 specExpr :: Subst -> CoreExpr -> SpecM (CoreExpr, UsageDetails)
 623 -- We carry a substitution down:
 624 --      a) we must clone any binding that might flaot outwards,
 625 --         to avoid name clashes
 626 --      b) we carry a type substitution to use when analysing
 627 --         the RHS of specialised bindings (no type-let!)
 628
 629 ---------------- First the easy cases --------------------
 630 specExpr subst (Type ty) = return (Type (substTy subst ty), emptyUDs)
 631 specExpr subst (Var v)   = return (specVar subst v,         emptyUDs)
 632 specExpr subst (Lit lit) = return (Lit lit,                 emptyUDs)
 633 specExpr subst (Cast e co) = do
 634     (e', uds) <- specExpr subst e
 635     return ((Cast e' (substTy subst co)), uds)
 636 specExpr subst (Note note body) = do
 637     (body', uds) <- specExpr subst body
 638     return (Note (specNote subst note) body', uds)
 639
 640
 641 ---------------- Applications might generate a call instance --------------------
 642 specExpr subst expr@(App fun arg)
 643   = go expr []
 644   where
 645     go (App fun arg) args = do (arg', uds_arg) <- specExpr subst arg
 646                                (fun', uds_app) <- go fun (arg':args)
 647                                return (App fun' arg', uds_arg `plusUDs` uds_app)
 648
 649     go (Var f)       args = case specVar subst f of
 650                                 Var f' -> return (Var f', mkCallUDs subst f' args)
 651                                 e'     -> return (e', emptyUDs) -- I don't expect this!
 652     go other         args = specExpr subst other
 653
 654 ---------------- Lambda/case require dumping of usage details --------------------
 655 specExpr subst e@(Lam _ _) = do
 656     (body', uds) <- specExpr subst' body
 657     let (filtered_uds, body'') = dumpUDs bndrs' uds body'
 658     return (mkLams bndrs' body'', filtered_uds)
 659   where
 660     (bndrs, body) = collectBinders e
 661     (subst', bndrs') = substBndrs subst bndrs
 662         -- More efficient to collect a group of binders together all at once
 663         -- and we don't want to split a lambda group with dumped bindings
 664
 665 specExpr subst (Case scrut case_bndr ty alts) = do
 666     (scrut', uds_scrut) <- specExpr subst scrut
 667     (alts', uds_alts) <- mapAndCombineSM spec_alt alts
 668     return (Case scrut' case_bndr' (substTy subst ty) alts', uds_scrut `plusUDs` uds_alts)
 669   where
 670     (subst_alt, case_bndr') = substBndr subst case_bndr
 671         -- No need to clone case binder; it can't float like a let(rec)
 672
 673     spec_alt (con, args, rhs) = do
 674           (rhs', uds) <- specExpr subst_rhs rhs
 675           let (uds', rhs'') = do dumpUDs args uds rhs'
 676           return ((con, args', rhs''), uds')
 677         where
 678           (subst_rhs, args') = substBndrs subst_alt args
 679
 680 ---------------- Finally, let is the interesting case --------------------
 681 specExpr subst (Let bind body) = do
 682         -- Clone binders
 683     (rhs_subst, body_subst, bind') <- cloneBindSM subst bind
 684
 685         -- Deal with the body
 686     (body', body_uds) <- specExpr body_subst body
 687
 688         -- Deal with the bindings
 689     (binds', uds) <- specBind rhs_subst bind' body_uds
 690
 691         -- All done
 692     return (foldr Let body' binds', uds)
 693
 694 -- Must apply the type substitution to coerceions
 695 specNote subst note           = note
 696 \end{code}
 697
 698 %************************************************************************
 699 %*                                                                      *
 700 \subsubsection{Dealing with a binding}
 701 %*                                                                      *
 702 %************************************************************************
 703
 704 \begin{code}
 705 specBind :: Subst                       -- Use this for RHSs
 706          -> CoreBind
 707          -> UsageDetails                -- Info on how the scope of the binding
 708          -> SpecM ([CoreBind],          -- New bindings
 709                    UsageDetails)        -- And info to pass upstream
 710
 711 specBind rhs_subst bind body_uds = do
 712     (bind', bind_uds) <- specBindItself rhs_subst bind (calls body_uds)
 713     let
 714         bndrs   = bindersOf bind
 715         all_uds = zapCalls bndrs (body_uds `plusUDs` bind_uds)
 716                         -- It's important that the `plusUDs` is this way round,
 717                         -- because body_uds may bind dictionaries that are
 718                         -- used in the calls passed to specDefn.  So the
 719                         -- dictionary bindings in bind_uds may mention
 720                         -- dictionaries bound in body_uds.
 721     case splitUDs bndrs all_uds of
 722
 723         (_, ([],[]))    -- This binding doesn't bind anything needed
 724                         -- in the UDs, so put the binding here
 725                         -- This is the case for most non-dict bindings, except
 726                         -- for the few that are mentioned in a dict binding
 727                         -- that is floating upwards in body_uds
 728                 -> return ([bind'], all_uds)
 729
 730         (float_uds, (dict_binds, calls))        -- This binding is needed in the UDs, so float it out
 731                 -> return ([], float_uds `plusUDs` mkBigUD bind' dict_binds calls)
 732
 733
 734 -- A truly gruesome function
 735 mkBigUD bind@(NonRec _ _) dbs calls
 736   =     -- Common case: non-recursive and no specialisations
 737         -- (if there were any specialistions it would have been made recursive)
 738     MkUD { dict_binds = listToBag (mkDB bind : dbs),
 739            calls = listToCallDetails calls }
 740
 741 mkBigUD bind dbs calls
 742   =     -- General case
 743     MkUD { dict_binds = unitBag (mkDB (Rec (bind_prs bind ++ dbsToPairs dbs))),
 744                         -- Make a huge Rec
 745            calls = listToCallDetails calls }
 746   where
 747     bind_prs (NonRec b r) = [(b,r)]
 748     bind_prs (Rec prs)    = prs
 749
 750     dbsToPairs []             = []
 751     dbsToPairs ((bind,_):dbs) = bind_prs bind ++ dbsToPairs dbs
 752
 753 -- specBindItself deals with the RHS, specialising it according
 754 -- to the calls found in the body (if any)
 755 specBindItself rhs_subst (NonRec bndr rhs) call_info = do
 756     ((bndr',rhs'), spec_defns, spec_uds) <- specDefn rhs_subst call_info (bndr,rhs)
 757     let
 758         new_bind | null spec_defns = NonRec bndr' rhs'
 759                  | otherwise       = Rec ((bndr',rhs'):spec_defns)
 760                 -- bndr' mentions the spec_defns in its SpecEnv
 761                 -- Not sure why we couln't just put the spec_defns first
 762     return (new_bind, spec_uds)
 763
 764 specBindItself rhs_subst (Rec pairs) call_info = do
 765     stuff <- mapM (specDefn rhs_subst call_info) pairs
 766     let
 767         (pairs', spec_defns_s, spec_uds_s) = unzip3 stuff
 768         spec_defns = concat spec_defns_s
 769         spec_uds   = plusUDList spec_uds_s
 770         new_bind   = Rec (spec_defns ++ pairs')
 771     return (new_bind, spec_uds)
 772
 773
 774 specDefn :: Subst                       -- Subst to use for RHS
 775          -> CallDetails                 -- Info on how it is used in its scope
 776          -> (Id, CoreExpr)              -- The thing being bound and its un-processed RHS
 777          -> SpecM ((Id, CoreExpr),      -- The thing and its processed RHS
 778                                         --      the Id may now have specialisations attached
 779                    [(Id,CoreExpr)],     -- Extra, specialised bindings
 780                    UsageDetails         -- Stuff to fling upwards from the RHS and its
 781             )                           --      specialised versions
 782
 783 specDefn subst calls (fn, rhs)
 784         -- The first case is the interesting one
 785   |  rhs_tyvars `lengthIs`     n_tyvars -- Rhs of fn's defn has right number of big lambdas
 786   && rhs_ids    `lengthAtLeast` n_dicts -- and enough dict args
 787   && notNull calls_for_me               -- And there are some calls to specialise
 788
 789 --   && not (certainlyWillInline (idUnfolding fn))      -- And it's not small
 790 --      See Note [Inline specialisation] for why we do not
 791 --      switch off specialisation for inline functions = do
 792   = do
 793      -- Specialise the body of the function
 794     (rhs', rhs_uds) <- specExpr subst rhs
 795
 796       -- Make a specialised version for each call in calls_for_me
 797     stuff <- mapM spec_call calls_for_me
 798     let
 799         (spec_defns, spec_uds, spec_rules) = unzip3 stuff
 800
 801         fn' = addIdSpecialisations fn spec_rules
 802
 803     return ((fn',rhs'),
 804               spec_defns,
 805               rhs_uds `plusUDs` plusUDList spec_uds)
 806
 807   | otherwise   -- No calls or RHS doesn't fit our preconceptions
 808   = WARN( notNull calls_for_me, ptext SLIT("Missed specialisation opportunity for") <+> ppr fn )
 809           -- Note [Specialisation shape]
 810     (do  { (rhs', rhs_uds) <- specExpr subst rhs
 811         ; return ((fn, rhs'), [], rhs_uds) })
 812
 813   where
 814     fn_type            = idType fn
 815     (tyvars, theta, _) = tcSplitSigmaTy fn_type
 816     n_tyvars           = length tyvars
 817     n_dicts            = length theta
 818     inline_prag        = idInlinePragma fn
 819
 820         -- It's important that we "see past" any INLINE pragma
 821         -- else we'll fail to specialise an INLINE thing
 822     (inline_rhs, rhs_inside) = dropInline rhs
 823     (rhs_tyvars, rhs_ids, rhs_body) = collectTyAndValBinders rhs_inside
 824
 825     rhs_dicts = take n_dicts rhs_ids
 826     rhs_bndrs = rhs_tyvars ++ rhs_dicts
 827     body      = mkLams (drop n_dicts rhs_ids) rhs_body
 828                 -- Glue back on the non-dict lambdas
 829
 830     calls_for_me = case lookupFM calls fn of
 831                         Nothing -> []
 832                         Just cs -> fmToList cs
 833
 834     ----------------------------------------------------------
 835         -- Specialise to one particular call pattern
 836     spec_call :: (CallKey, ([DictExpr], VarSet))        -- Call instance
 837               -> SpecM ((Id,CoreExpr),                  -- Specialised definition
 838                         UsageDetails,                   -- Usage details from specialised body
 839                         CoreRule)                       -- Info for the Id's SpecEnv
 840     spec_call (CallKey call_ts, (call_ds, call_fvs))
 841       = ASSERT( call_ts `lengthIs` n_tyvars  && call_ds `lengthIs` n_dicts ) do
 842                 -- Calls are only recorded for properly-saturated applications
 843
 844         -- Suppose f's defn is  f = /\ a b c d -> \ d1 d2 -> rhs
 845         -- Supppose the call is for f [Just t1, Nothing, Just t3, Nothing] [dx1, dx2]
 846
 847         -- Construct the new binding
 848         --      f1 = SUBST[a->t1,c->t3, d1->d1', d2->d2'] (/\ b d -> rhs)
 849         -- PLUS the usage-details
 850         --      { d1' = dx1; d2' = dx2 }
 851         -- where d1', d2' are cloned versions of d1,d2, with the type substitution applied.
 852         --
 853         -- Note that the substitution is applied to the whole thing.
 854         -- This is convenient, but just slightly fragile.  Notably:
 855         --      * There had better be no name clashes in a/b/c/d
 856         --
 857         let
 858                 -- poly_tyvars = [b,d] in the example above
 859                 -- spec_tyvars = [a,c]
 860                 -- ty_args     = [t1,b,t3,d]
 861            poly_tyvars = [tv | (tv, Nothing) <- rhs_tyvars `zip` call_ts]
 862            spec_tyvars = [tv | (tv, Just _)  <- rhs_tyvars `zip` call_ts]
 863            ty_args     = zipWithEqual "spec_call" mk_ty_arg rhs_tyvars call_ts
 864                        where
 865                          mk_ty_arg rhs_tyvar Nothing   = Type (mkTyVarTy rhs_tyvar)
 866                          mk_ty_arg rhs_tyvar (Just ty) = Type ty
 867            rhs_subst  = extendTvSubstList subst (spec_tyvars `zip` [ty | Just ty <- call_ts])
 868
 869         (rhs_subst', rhs_dicts') <- cloneBinders rhs_subst rhs_dicts
 870         let
 871            inst_args = ty_args ++ map Var rhs_dicts'
 872
 873                 -- Figure out the type of the specialised function
 874            body_ty = applyTypeToArgs rhs fn_type inst_args
 875            (lam_args, app_args)                 -- Add a dummy argument if body_ty is unlifted
 876                 | isUnLiftedType body_ty        -- C.f. WwLib.mkWorkerArgs
 877                 = (poly_tyvars ++ [voidArgId], poly_tyvars ++ [realWorldPrimId])
 878                 | otherwise = (poly_tyvars, poly_tyvars)
 879            spec_id_ty = mkPiTypes lam_args body_ty
 880
 881         spec_f <- newIdSM fn spec_id_ty
 882         (spec_rhs, rhs_uds) <- specExpr rhs_subst' (mkLams lam_args body)
 883         let
 884                 -- The rule to put in the function's specialisation is:
 885                 --      forall b,d, d1',d2'.  f t1 b t3 d d1' d2' = f1 b d
 886            spec_env_rule = mkLocalRule (mkFastString ("SPEC " ++ showSDoc (ppr fn)))
 887                                 inline_prag     -- Note [Auto-specialisation and RULES]
 888                                 (idName fn)
 889                                 (poly_tyvars ++ rhs_dicts')
 890                                 inst_args
 891                                 (mkVarApps (Var spec_f) app_args)
 892
 893                 -- Add the { d1' = dx1; d2' = dx2 } usage stuff
 894            final_uds = foldr addDictBind rhs_uds (my_zipEqual "spec_call" rhs_dicts' call_ds)
 895
 896            spec_pr | inline_rhs = (spec_f `setInlinePragma` inline_prag, Note InlineMe spec_rhs)
 897                    | otherwise  = (spec_f,                               spec_rhs)
 898
 899         return (spec_pr, final_uds, spec_env_rule)
 900
 901       where
 902         my_zipEqual doc xs ys
 903 #ifdef DEBUG
 904          | not (equalLength xs ys) = pprPanic "my_zipEqual" (vcat
 905                                                 [ ppr xs, ppr ys
 906                                                 , ppr fn <+> ppr call_ts
 907                                                 , ppr (idType fn), ppr theta
 908                                                 , ppr n_dicts, ppr rhs_dicts
 909                                                 , ppr rhs])
 910 #endif
 911          | otherwise               = zipEqual doc xs ys
 912 \end{code}
 913
 914 Note [Auto-specialisation and RULES]
 915 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 916 Consider:
 917    g :: Num a => a -> a
 918    g = ...
 919
 920    f :: (Int -> Int) -> Int
 921    f w = ...
 922    {-# RULE f g = 0 #-}
 923
 924 Suppose that auto-specialisation makes a specialised version of
 925 g::Int->Int That version won't appear in the LHS of the RULE for f.
 926 So if the specialisation rule fires too early, the rule for f may
 927 never fire.
 928
 929 It might be possible to add new rules, to "complete" the rewrite system.
 930 Thus when adding
 931         RULE forall d. g Int d = g_spec
 932 also add
 933         RULE f g_spec = 0
 934
 935 But that's a bit complicated.  For now we ask the programmer's help,
 936 by *copying the INLINE activation pragma* to the auto-specialised rule.
 937 So if g says {-# NOINLINE[2] g #-}, then the auto-spec rule will also
 938 not be active until phase 2.
 939
 940
 941 Note [Specialisation shape]
 942 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
 943 We only specialise a function if it has visible top-level lambdas
 944 corresponding to its overloading.  E.g. if
 945         f :: forall a. Eq a => ....
 946 then its body must look like
 947         f = /\a. \d. ...
 948
 949 Reason: when specialising the body for a call (f ty dexp), we want to
 950 substitute dexp for d, and pick up specialised calls in the body of f.
 951
 952 This doesn't always work.  One example I came across was htis:
 953         newtype Gen a = MkGen{ unGen :: Int -> a }
 954
 955         choose :: Eq a => a -> Gen a
 956         choose n = MkGen (\r -> n)
 957
 958         oneof = choose (1::Int)
 959
 960 It's a silly exapmle, but we get
 961         choose = /\a. g `cast` co
 962 where choose doesn't have any dict arguments.  Thus far I have not
 963 tried to fix this (wait till there's a real example).
 964
 965
 966 Note [Inline specialisations]
 967 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 968 We transfer to the specialised function any INLINE stuff from the
 969 original.  This means (a) the Activation in the IdInfo, and (b) any
 970 InlineMe on the RHS.
 971
 972 This is a change (Jun06).  Previously the idea is that the point of
 973 inlining was precisely to specialise the function at its call site,
 974 and that's not so important for the specialised copies.  But
 975 *pragma-directed* specialisation now takes place in the
 976 typechecker/desugarer, with manually specified INLINEs.  The
 977 specialiation here is automatic.  It'd be very odd if a function
 978 marked INLINE was specialised (because of some local use), and then
 979 forever after (including importing modules) the specialised version
 980 wasn't INLINEd.  After all, the programmer said INLINE!
 981
 982 You might wonder why we don't just not specialise INLINE functions.
 983 It's because even INLINE functions are sometimes not inlined, when
 984 they aren't applied to interesting arguments.  But perhaps the type
 985 arguments alone are enough to specialise (even though the args are too
 986 boring to trigger inlining), and it's certainly better to call the
 987 specialised version.
 988
 989 A case in point is dictionary functions, which are current marked
 990 INLINE, but which are worth specialising.
 991
 992 \begin{code}
 993 dropInline :: CoreExpr -> (Bool, CoreExpr)
 994 dropInline (Note InlineMe rhs) = (True,  rhs)
 995 dropInline rhs                 = (False, rhs)
 996 \end{code}
 997
 998 %************************************************************************
 999 %*                                                                      *
1000 \subsubsection{UsageDetails and suchlike}
1001 %*                                                                      *
1002 %************************************************************************
1003
1004 \begin{code}
1005 data UsageDetails
1006   = MkUD {
1007         dict_binds :: !(Bag DictBind),
1008                         -- Floated dictionary bindings
1009                         -- The order is important;
1010                         -- in ds1 `union` ds2, bindings in ds2 can depend on those in ds1
1011                         -- (Remember, Bags preserve order in GHC.)
1012
1013         calls     :: !CallDetails
1014     }
1015
1016 type DictBind = (CoreBind, VarSet)
1017         -- The set is the free vars of the binding
1018         -- both tyvars and dicts
1019
1020 type DictExpr = CoreExpr
1021
1022 emptyUDs = MkUD { dict_binds = emptyBag, calls = emptyFM }
1023
1024 type ProtoUsageDetails = ([DictBind],
1025                           [(Id, CallKey, ([DictExpr], VarSet))]
1026                          )
1027
1028 ------------------------------------------------------------
1029 type CallDetails  = FiniteMap Id CallInfo
1030 newtype CallKey   = CallKey [Maybe Type]                        -- Nothing => unconstrained type argument
1031 type CallInfo     = FiniteMap CallKey
1032                               ([DictExpr], VarSet)              -- Dict args and the vars of the whole
1033                                                                 -- call (including tyvars)
1034                                                                 -- [*not* include the main id itself, of course]
1035         -- The finite maps eliminate duplicates
1036         -- The list of types and dictionaries is guaranteed to
1037         -- match the type of f
1038
1039 -- Type isn't an instance of Ord, so that we can control which
1040 -- instance we use.  That's tiresome here.  Oh well
1041 instance Eq CallKey where
1042   k1 == k2 = case k1 `compare` k2 of { EQ -> True; other -> False }
1043
1044 instance Ord CallKey where
1045   compare (CallKey k1) (CallKey k2) = cmpList cmp k1 k2
1046                 where
1047                   cmp Nothing Nothing     = EQ
1048                   cmp Nothing (Just t2)   = LT
1049                   cmp (Just t1) Nothing   = GT
1050                   cmp (Just t1) (Just t2) = tcCmpType t1 t2
1051
1052 unionCalls :: CallDetails -> CallDetails -> CallDetails
1053 unionCalls c1 c2 = plusFM_C plusFM c1 c2
1054
1055 singleCall :: Id -> [Maybe Type] -> [DictExpr] -> CallDetails
1056 singleCall id tys dicts
1057   = unitFM id (unitFM (CallKey tys) (dicts, call_fvs))
1058   where
1059     call_fvs = exprsFreeVars dicts `unionVarSet` tys_fvs
1060     tys_fvs  = tyVarsOfTypes (catMaybes tys)
1061         -- The type args (tys) are guaranteed to be part of the dictionary
1062         -- types, because they are just the constrained types,
1063         -- and the dictionary is therefore sure to be bound
1064         -- inside the binding for any type variables free in the type;
1065         -- hence it's safe to neglect tyvars free in tys when making
1066         -- the free-var set for this call
1067         -- BUT I don't trust this reasoning; play safe and include tys_fvs
1068         --
1069         -- We don't include the 'id' itself.
1070
1071 listToCallDetails calls
1072   = foldr (unionCalls . mk_call) emptyFM calls
1073   where
1074     mk_call (id, tys, dicts_w_fvs) = unitFM id (unitFM tys dicts_w_fvs)
1075         -- NB: the free vars of the call are provided
1076
1077 callDetailsToList calls = [ (id,tys,dicts)
1078                           | (id,fm) <- fmToList calls,
1079                             (tys, dicts) <- fmToList fm
1080                           ]
1081
1082 mkCallUDs subst f args
1083   | null theta
1084   || not (all isClassPred theta)
1085         -- Only specialise if all overloading is on class params.
1086         -- In ptic, with implicit params, the type args
1087         --  *don't* say what the value of the implicit param is!
1088   || not (spec_tys `lengthIs` n_tyvars)
1089   || not ( dicts   `lengthIs` n_dicts)
1090   || maybeToBool (lookupRule (\act -> True) (substInScope subst) emptyRuleBase f args)
1091         -- There's already a rule covering this call.  A typical case
1092         -- is where there's an explicit user-provided rule.  Then
1093         -- we don't want to create a specialised version
1094         -- of the function that overlaps.
1095   = emptyUDs    -- Not overloaded, or no specialisation wanted
1096
1097   | otherwise
1098   = MkUD {dict_binds = emptyBag,
1099           calls      = singleCall f spec_tys dicts
1100     }
1101   where
1102     (tyvars, theta, _) = tcSplitSigmaTy (idType f)
1103     constrained_tyvars = tyVarsOfTheta theta
1104     n_tyvars           = length tyvars
1105     n_dicts            = length theta
1106
1107     spec_tys = [mk_spec_ty tv ty | (tv, Type ty) <- tyvars `zip` args]
1108     dicts    = [dict_expr | (_, dict_expr) <- theta `zip` (drop n_tyvars args)]
1109
1110     mk_spec_ty tyvar ty
1111         | tyvar `elemVarSet` constrained_tyvars = Just ty
1112         | otherwise                             = Nothing
1113
1114 ------------------------------------------------------------
1115 plusUDs :: UsageDetails -> UsageDetails -> UsageDetails
1116 plusUDs (MkUD {dict_binds = db1, calls = calls1})
1117         (MkUD {dict_binds = db2, calls = calls2})
1118   = MkUD {dict_binds = d, calls = c}
1119   where
1120     d = db1    `unionBags`   db2
1121     c = calls1 `unionCalls`  calls2
1122
1123 plusUDList = foldr plusUDs emptyUDs
1124
1125 -- zapCalls deletes calls to ids from uds
1126 zapCalls ids uds = uds {calls = delListFromFM (calls uds) ids}
1127
1128 mkDB bind = (bind, bind_fvs bind)
1129
1130 bind_fvs (NonRec bndr rhs) = pair_fvs (bndr,rhs)
1131 bind_fvs (Rec prs)         = foldl delVarSet rhs_fvs bndrs
1132                            where
1133                              bndrs = map fst prs
1134                              rhs_fvs = unionVarSets (map pair_fvs prs)
1135
1136 pair_fvs (bndr, rhs) = exprFreeVars rhs `unionVarSet` idFreeVars bndr
1137         -- Don't forget variables mentioned in the
1138         -- rules of the bndr.  C.f. OccAnal.addRuleUsage
1139         -- Also tyvars mentioned in its type; they may not appear in the RHS
1140         --      type T a = Int
1141         --      x :: T a = 3
1142
1143 addDictBind (dict,rhs) uds = uds { dict_binds = mkDB (NonRec dict rhs) `consBag` dict_binds uds }
1144
1145 dumpAllDictBinds (MkUD {dict_binds = dbs}) binds
1146   = foldrBag add binds dbs
1147   where
1148     add (bind,_) binds = bind : binds
1149
1150 dumpUDs :: [CoreBndr]
1151         -> UsageDetails -> CoreExpr
1152         -> (UsageDetails, CoreExpr)
1153 dumpUDs bndrs uds body
1154   = (free_uds, foldr add_let body dict_binds)
1155   where
1156     (free_uds, (dict_binds, _)) = splitUDs bndrs uds
1157     add_let (bind,_) body       = Let bind body
1158
1159 splitUDs :: [CoreBndr]
1160          -> UsageDetails
1161          -> (UsageDetails,              -- These don't mention the binders
1162              ProtoUsageDetails)         -- These do
1163
1164 splitUDs bndrs uds@(MkUD {dict_binds = orig_dbs,
1165                           calls      = orig_calls})
1166
1167   = if isEmptyBag dump_dbs && null dump_calls then
1168         -- Common case: binder doesn't affect floats
1169         (uds, ([],[]))
1170
1171     else
1172         -- Binders bind some of the fvs of the floats
1173         (MkUD {dict_binds = free_dbs,
1174                calls      = listToCallDetails free_calls},
1175          (bagToList dump_dbs, dump_calls)
1176         )
1177
1178   where
1179     bndr_set = mkVarSet bndrs
1180
1181     (free_dbs, dump_dbs, dump_idset)
1182           = foldlBag dump_db (emptyBag, emptyBag, bndr_set) orig_dbs
1183                 -- Important that it's foldl not foldr;
1184                 -- we're accumulating the set of dumped ids in dump_set
1185
1186         -- Filter out any calls that mention things that are being dumped
1187     orig_call_list                 = callDetailsToList orig_calls
1188     (dump_calls, free_calls)       = partition captured orig_call_list
1189     captured (id,tys,(dicts, fvs)) =  fvs `intersectsVarSet` dump_idset
1190                                    || id `elemVarSet` dump_idset
1191
1192     dump_db (free_dbs, dump_dbs, dump_idset) db@(bind, fvs)
1193         | dump_idset `intersectsVarSet` fvs     -- Dump it
1194         = (free_dbs, dump_dbs `snocBag` db,
1195            extendVarSetList dump_idset (bindersOf bind))
1196
1197         | otherwise     -- Don't dump it
1198         = (free_dbs `snocBag` db, dump_dbs, dump_idset)
1199 \end{code}
1200
1201
1202 %************************************************************************
1203 %*                                                                      *
1204 \subsubsection{Boring helper functions}
1205 %*                                                                      *
1206 %************************************************************************
1207
1208 \begin{code}
1209 type SpecM a = UniqSM a
1210
1211 initSM    = initUs_
1212
1213 mapAndCombineSM f []     = return ([], emptyUDs)
1214 mapAndCombineSM f (x:xs) = do (y, uds1) <- f x
1215                               (ys, uds2) <- mapAndCombineSM f xs
1216                               return (y:ys, uds1 `plusUDs` uds2)
1217
1218 cloneBindSM :: Subst -> CoreBind -> SpecM (Subst, Subst, CoreBind)
1219 -- Clone the binders of the bind; return new bind with the cloned binders
1220 -- Return the substitution to use for RHSs, and the one to use for the body
1221 cloneBindSM subst (NonRec bndr rhs) = do
1222     us <- getUniqueSupplyM
1223     let (subst', bndr') = do cloneIdBndr subst us bndr
1224     return (subst, subst', NonRec bndr' rhs)
1225
1226 cloneBindSM subst (Rec pairs) = do
1227     us <- getUniqueSupplyM
1228     let (subst', bndrs') = cloneRecIdBndrs subst us (map fst pairs)
1229     return (subst', subst', Rec (bndrs' `zip` map snd pairs))
1230
1231 cloneBinders subst bndrs = do
1232     us <- getUniqueSupplyM
1233     return (cloneIdBndrs subst us bndrs)
1234
1235 newIdSM old_id new_ty = do
1236     uniq <- getUniqueM
1237     let
1238         -- Give the new Id a similar occurrence name to the old one
1239         name   = idName old_id
1240         new_id = mkUserLocal (mkSpecOcc (nameOccName name)) uniq new_ty (getSrcSpan name)
1241     return new_id
1242 \end{code}
1243
1244
1245                 Old (but interesting) stuff about unboxed bindings
1246                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1247
1248 What should we do when a value is specialised to a *strict* unboxed value?
1249
1250         map_*_* f (x:xs) = let h = f x
1251                                t = map f xs
1252                            in h:t
1253
1254 Could convert let to case:
1255
1256         map_*_Int# f (x:xs) = case f x of h# ->
1257                               let t = map f xs
1258                               in h#:t
1259
1260 This may be undesirable since it forces evaluation here, but the value
1261 may not be used in all branches of the body. In the general case this
1262 transformation is impossible since the mutual recursion in a letrec
1263 cannot be expressed as a case.
1264
1265 There is also a problem with top-level unboxed values, since our
1266 implementation cannot handle unboxed values at the top level.
1267
1268 Solution: Lift the binding of the unboxed value and extract it when it
1269 is used:
1270
1271         map_*_Int# f (x:xs) = let h = case (f x) of h# -> _Lift h#
1272                                   t = map f xs
1273                               in case h of
1274                                  _Lift h# -> h#:t
1275
1276 Now give it to the simplifier and the _Lifting will be optimised away.
1277
1278 The benfit is that we have given the specialised "unboxed" values a
1279 very simplep lifted semantics and then leave it up to the simplifier to
1280 optimise it --- knowing that the overheads will be removed in nearly
1281 all cases.
1282
1283 In particular, the value will only be evaluted in the branches of the
1284 program which use it, rather than being forced at the point where the
1285 value is bound. For example:
1286
1287         filtermap_*_* p f (x:xs)
1288           = let h = f x
1289                 t = ...
1290             in case p x of
1291                 True  -> h:t
1292                 False -> t
1293    ==>
1294         filtermap_*_Int# p f (x:xs)
1295           = let h = case (f x) of h# -> _Lift h#
1296                 t = ...
1297             in case p x of
1298                 True  -> case h of _Lift h#
1299                            -> h#:t
1300                 False -> t
1301
1302 The binding for h can still be inlined in the one branch and the
1303 _Lifting eliminated.
1304
1305
1306 Question: When won't the _Lifting be eliminated?
1307
1308 Answer: When they at the top-level (where it is necessary) or when
1309 inlining would duplicate work (or possibly code depending on
1310 options). However, the _Lifting will still be eliminated if the
1311 strictness analyser deems the lifted binding strict.
1312