ghc/docs/users_guide/glasgow_exts.sgml

   1 <para>
   2 <indexterm><primary>language, GHC</primary></indexterm>
   3 <indexterm><primary>extensions, GHC</primary></indexterm>
   4 As with all known Haskell systems, GHC implements some extensions to
   5 the language.  To use them, you'll need to give a <option>-fglasgow-exts</option>
   6 <indexterm><primary>-fglasgow-exts option</primary></indexterm> option.
   7 </para>
   8
   9 <para>
  10 Virtually all of the Glasgow extensions serve to give you access to
  11 the underlying facilities with which we implement Haskell.  Thus, you
  12 can get at the Raw Iron, if you are willing to write some non-standard
  13 code at a more primitive level.  You need not be &ldquo;stuck&rdquo; on
  14 performance because of the implementation costs of Haskell's
  15 &ldquo;high-level&rdquo; features&mdash;you can always code &ldquo;under&rdquo; them.  In an extreme case, you can write all your time-critical code in C, and then just glue it together with Haskell!
  16 </para>
  17
  18 <para>
  19 Executive summary of our extensions:
  20 </para>
  21
  22   <variablelist>
  23
  24     <varlistentry>
  25       <term>Unboxed types and primitive operations:</Term>
  26       <listitem>
  27         <para>You can get right down to the raw machine types and
  28         operations; included in this are &ldquo;primitive
  29         arrays&rdquo; (direct access to Big Wads of Bytes).  Please
  30         see <XRef LinkEnd="glasgow-unboxed"> and following.</para>
  31       </listitem>
  32     </varlistentry>
  33
  34     <varlistentry>
  35       <term>Type system extensions:</term>
  36       <listitem>
  37         <para> GHC supports a large number of extensions to Haskell's
  38         type system.  Specifically:</para>
  39
  40         <variablelist>
  41           <varlistentry>
  42             <term>Class method types:</term>
  43             <listitem>
              <para><xref LinkEnd="class-method-types"></para>
  45             </listitem>
  46           </varlistentry>
  47
  48           <varlistentry>
  49             <term>Multi-parameter type classes:</term>
  50             <listitem>
  51               <para><xref LinkEnd="multi-param-type-classes"></para>
  52             </listitem>
  53           </varlistentry>
  54
  55           <varlistentry>
  56             <term>Functional dependencies:</term>
  57             <listitem>
  58               <para><xref LinkEnd="functional-dependencies"></para>
  59             </listitem>
  60           </varlistentry>
  61
  62           <varlistentry>
  63             <term>Implicit parameters:</term>
  64             <listitem>
  65               <para><xref LinkEnd="implicit-parameters"></para>
  66             </listitem>
  67           </varlistentry>
  68
  69           <varlistentry>
  70             <term>Linear implicit parameters:</term>
  71             <listitem>
  72               <para><xref LinkEnd="linear-implicit-parameters"></para>
  73             </listitem>
  74           </varlistentry>
  75
  76           <varlistentry>
  77             <term>Local universal quantification:</term>
  78             <listitem>
  79               <para><xref LinkEnd="universal-quantification"></para>
  80             </listitem>
  81           </varlistentry>
  82
  83           <varlistentry>
            <term>Existential quantification in data types:</term>
  85             <listitem>
  86               <para><xref LinkEnd="existential-quantification"></para>
  87             </listitem>
  88           </varlistentry>
  89
  90           <varlistentry>
  91             <term>Scoped type variables:</term>
  92             <listitem>
  93               <para>Scoped type variables enable the programmer to
  94               supply type signatures for some nested declarations,
  95               where this would not be legal in Haskell 98.  Details in
  96               <xref LinkEnd="scoped-type-variables">.</para>
  97             </listitem>
  98           </varlistentry>
  99         </variablelist>
 100       </listitem>
 101     </varlistentry>
 102
 103     <varlistentry>
 104       <term>Pattern guards</term>
 105       <listitem>
 106         <para>Instead of being a boolean expression, a guard is a list
 107         of qualifiers, exactly as in a list comprehension. See <xref
 108         LinkEnd="pattern-guards">.</para>
 109       </listitem>
 110     </varlistentry>
 111
 112     <varlistentry>
 113       <term>Data types with no constructors</term>
 114       <listitem>
 115         <para>See <xref LinkEnd="nullary-types">.</para>
 116       </listitem>
 117     </varlistentry>
 118
 119     <varlistentry>
 120       <term>Parallel list comprehensions</term>
 121       <listitem>
 122         <para>An extension to the list comprehension syntax to support
 123         <literal>zipWith</literal>-like functionality.  See <xref
 124         linkend="parallel-list-comprehensions">.</para>
 125       </listitem>
 126     </varlistentry>
 127
 128     <varlistentry>
 129       <term>Foreign calling:</term>
 130       <listitem>
 131         <para>Just what it sounds like.  We provide
 132         <emphasis>lots</emphasis> of rope that you can dangle around
 133         your neck.  Please see <xref LinkEnd="ffi">.</para>
 134       </listitem>
 135     </varlistentry>
 136
 137     <varlistentry>
 138       <term>Pragmas</term>
 139       <listitem>
 140         <para>Pragmas are special instructions to the compiler placed
 141         in the source file.  The pragmas GHC supports are described in
 142         <xref LinkEnd="pragmas">.</para>
 143       </listitem>
 144     </varlistentry>
 145
 146     <varlistentry>
 147       <term>Rewrite rules:</term>
 148       <listitem>
 149         <para>The programmer can specify rewrite rules as part of the
 150         source program (in a pragma).  GHC applies these rewrite rules
 151         wherever it can.  Details in <xref
 152         LinkEnd="rewrite-rules">.</para>
 153       </listitem>
 154     </varlistentry>
 155
 156     <varlistentry>
 157       <term>Generic classes:</term>
 158       <listitem>
 159         <para>(Note: support for generic classes is currently broken
 160         in GHC 5.02).</para>
 161
 162         <para>Generic class declarations allow you to define a class
 163         whose methods say how to work over an arbitrary data type.
 164         Then it's really easy to make any new type into an instance of
 165         the class.  This generalises the rather ad-hoc "deriving"
 166         feature of Haskell 98.  Details in <xref
 167         LinkEnd="generic-classes">.</para>
 168       </listitem>
 169     </varlistentry>
 170   </variablelist>
 171
 172 <para>
 173 Before you get too carried away working at the lowest level (e.g.,
 174 sloshing <literal>MutableByteArray&num;</literal>s around your
 175 program), you may wish to check if there are libraries that provide a
 176 &ldquo;Haskellised veneer&rdquo; over the features you want.  See
 177 <xref linkend="book-hslibs">.
 178 </para>
 179
 180   <sect1 id="options-language">
 181     <title>Language options</title>
 182
 183     <indexterm><primary>language</primary><secondary>option</secondary>
 184     </indexterm>
 185     <indexterm><primary>options</primary><secondary>language</secondary>
 186     </indexterm>
 187     <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
 188     </indexterm>
 189
    <para> These flags control which variations of the language are
 191     permitted.  Leaving out all of them gives you standard Haskell
 192     98.</para>
 193
 194     <variablelist>
 195
 196       <varlistentry>
 197         <term><option>-fglasgow-exts</option>:</term>
 198         <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
 199         <listitem>
 200           <para>This simultaneously enables all of the extensions to
 201           Haskell 98 described in <xref
 202           linkend="ghc-language-features">, except where otherwise
 203           noted. </para>
 204         </listitem>
 205       </varlistentry>
 206
 207       <varlistentry>
 208         <term><option>-fno-monomorphism-restriction</option>:</term>
 209         <indexterm><primary><option>-fno-monomorphism-restriction</option></primary></indexterm>
 210         <listitem>
 211           <para> Switch off the Haskell 98 monomorphism restriction.
 212           Independent of the <option>-fglasgow-exts</option>
 213           flag. </para>
 214         </listitem>
 215       </varlistentry>
 216
 217       <varlistentry>
 218         <term><option>-fallow-overlapping-instances</option></term>
 219         <term><option>-fallow-undecidable-instances</option></term>
 220         <term><option>-fallow-incoherent-instances</option></term>
 221         <term><option>-fcontext-stack</option></term>
 222         <indexterm><primary><option>-fallow-overlapping-instances</option></primary></indexterm>
 223         <indexterm><primary><option>-fallow-undecidable-instances</option></primary></indexterm>
 224         <indexterm><primary><option>-fcontext-stack</option></primary></indexterm>
 225         <listitem>
 226           <para> See <xref LinkEnd="instance-decls">.  Only relevant
 227           if you also use <option>-fglasgow-exts</option>.</para>
 228         </listitem>
 229       </varlistentry>
 230
 231       <varlistentry>
 232         <term><option>-finline-phase</option></term>
 233         <indexterm><primary><option>-finline-phase</option></primary></indexterm>
 234         <listitem>
 235           <para>See <xref LinkEnd="rewrite-rules">.  Only relevant if
 236           you also use <option>-fglasgow-exts</option>.</para>
 237         </listitem>
 238       </varlistentry>
 239
 240       <varlistentry>
 241         <term><option>-fgenerics</option></term>
 242         <indexterm><primary><option>-fgenerics</option></primary></indexterm>
 243         <listitem>
 244           <para>See <xref LinkEnd="generic-classes">.  Independent of
 245           <option>-fglasgow-exts</option>.</para>
 246         </listitem>
 247       </varlistentry>
 248
 249         <varlistentry>
 250           <term><option>-fno-implicit-prelude</option></term>
 251           <listitem>
 252             <para><indexterm><primary>-fno-implicit-prelude
 253             option</primary></indexterm> GHC normally imports
 254             <filename>Prelude.hi</filename> files for you.  If you'd
 255             rather it didn't, then give it a
 256             <option>-fno-implicit-prelude</option> option.  The idea
 257             is that you can then import a Prelude of your own.  (But
 258             don't call it <literal>Prelude</literal>; the Haskell
 259             module namespace is flat, and you must not conflict with
 260             any Prelude module.)</para>
 261
 262             <para>Even though you have not imported the Prelude, all
 263             the built-in syntax still refers to the built-in Haskell
 264             Prelude types and values, as specified by the Haskell
 265             Report.  For example, the type <literal>[Int]</literal>
 266             still means <literal>Prelude.[] Int</literal>; tuples
 267             continue to refer to the standard Prelude tuples; the
 268             translation for list comprehensions continues to use
 269             <literal>Prelude.map</literal> etc.</para>
 270
 271             <para> With one group of exceptions!  You may want to
 272             define your own numeric class hierarchy.  It completely
 273             defeats that purpose if the literal "1" means
 274             "<literal>Prelude.fromInteger 1</literal>", which is what
 275             the Haskell Report specifies.  So the
 276             <option>-fno-implicit-prelude</option> flag causes the
 277             following pieces of built-in syntax to refer to <emphasis>whatever
 278             is in scope</emphasis>, not the Prelude versions:</para>
 279
 280             <itemizedlist>
 281               <listitem>
 282                 <para>Integer and fractional literals mean
 283                 "<literal>fromInteger 1</literal>" and
 284                 "<literal>fromRational 3.2</literal>", not the
 285                 Prelude-qualified versions; both in expressions and in
 286                 patterns.</para>
 287               </listitem>
 288
 289               <listitem>
 290                 <para>Negation (e.g. "<literal>- (f x)</literal>")
 291                 means "<literal>negate (f x)</literal>" (not
 292                 <literal>Prelude.negate</literal>).</para>
 293               </listitem>
 294
 295               <listitem>
 296                 <para>In an n+k pattern, the standard Prelude
 297                 <literal>Ord</literal> class is still used for comparison,
 298                 but the necessary subtraction uses whatever
 299                 "<literal>(-)</literal>" is in scope (not
 300                 "<literal>Prelude.(-)</literal>").</para>
 301               </listitem>
 302             </itemizedlist>
 303
 304              <para>Note: Negative literals, such as <literal>-3</literal>, are
 305              specified by (a careful reading of) the Haskell Report as
 306              meaning <literal>Prelude.negate (Prelude.fromInteger 3)</literal>.
 307              However, GHC deviates from this slightly, and treats them as meaning
 308              <literal>fromInteger (-3)</literal>.  One particular effect of this
 309              slightly-non-standard reading is that there is no difficulty with
 310              the literal <literal>-2147483648</literal> at type <literal>Int</literal>;
 311              it means <literal>fromInteger (-2147483648)</literal>.  The strict interpretation
 312              would be <literal>negate (fromInteger 2147483648)</literal>,
 313              and the call to <literal>fromInteger</literal> would overflow
 314              (at type <literal>Int</literal>, remember).
 315              </para>
 316
 317           </listitem>
 318         </varlistentry>
 319
 320     </variablelist>
 321   </sect1>
 322
 323 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
 324 &primitives;
 325
 326 <sect1 id="glasgow-ST-monad">
 327 <title>Primitive state-transformer monad</title>
 328
 329 <para>
 330 <indexterm><primary>state transformers (Glasgow extensions)</primary></indexterm>
 331 <indexterm><primary>ST monad (Glasgow extension)</primary></indexterm>
 332 </para>
 333
 334 <para>
 335 This monad underlies our implementation of arrays, mutable and
 336 immutable, and our implementation of I/O, including &ldquo;C calls&rdquo;.
 337 </para>
 338
 339 <para>
 340 The <literal>ST</literal> library, which provides access to the
 341 <function>ST</function> monad, is described in <xref
 342 linkend="sec-ST">.
 343 </para>
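
<para>
As a rough illustration of the state-transformer style (a minimal
sketch, not taken from the library documentation), the following sums a
list by updating a mutable reference in place.  The module names
<literal>Control.Monad.ST</literal> and <literal>Data.STRef</literal>
are those used by later hierarchical library releases and are an
assumption here; <function>sumST</function> is a hypothetical name.
</para>

<programlisting>
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, modifySTRef)

-- Sum a list with an in-place accumulator.  The mutation is local to
-- the runST call, so sumST is an ordinary pure function to its callers.
sumST :: [Int] -&gt; Int
sumST xs = runST (do
  ref &lt;- newSTRef 0                        -- allocate the accumulator
  mapM_ (\x -&gt; modifySTRef ref (+ x)) xs   -- update it once per element
  readSTRef ref)                           -- return the final contents
</programlisting>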
 344
 345 </sect1>
 346
 347 <sect1 id="glasgow-prim-arrays">
 348 <title>Primitive arrays, mutable and otherwise
 349 </title>
 350
 351 <para>
 352 <indexterm><primary>primitive arrays (Glasgow extension)</primary></indexterm>
 353 <indexterm><primary>arrays, primitive (Glasgow extension)</primary></indexterm>
 354 </para>
 355
 356 <para>
 357 GHC knows about quite a few flavours of Large Swathes of Bytes.
 358 </para>
 359
 360 <para>
 361 First, GHC distinguishes between primitive arrays of (boxed) Haskell
 362 objects (type <literal>Array&num; obj</literal>) and primitive arrays of bytes (type
 363 <literal>ByteArray&num;</literal>).
 364 </para>
 365
 366 <para>
 367 Second, it distinguishes between&hellip;
 368 <variablelist>
 369
 370 <varlistentry>
 371 <term>Immutable:</term>
 372 <listitem>
 373 <para>
 374 Arrays that do not change (as with &ldquo;standard&rdquo; Haskell arrays); you
 375 can only read from them.  Obviously, they do not need the care and
 376 attention of the state-transformer monad.
 377 </para>
 378 </listitem>
 379 </varlistentry>
 380 <varlistentry>
 381 <term>Mutable:</term>
 382 <listitem>
 383 <para>
 384 Arrays that may be changed or &ldquo;mutated.&rdquo;  All the operations on them
 385 live within the state-transformer monad and the updates happen
 386 <emphasis>in-place</emphasis>.
 387 </para>
 388 </listitem>
 389 </varlistentry>
 390 <varlistentry>
 391 <term>&ldquo;Static&rdquo; (in C land):</term>
 392 <listitem>
 393 <para>
 394 A C routine may pass an <literal>Addr&num;</literal> pointer back into Haskell land.  There
 395 are then primitive operations with which you may merrily grab values
 396 over in C land, by indexing off the &ldquo;static&rdquo; pointer.
 397 </para>
 398 </listitem>
 399 </varlistentry>
 400 <varlistentry>
 401 <term>&ldquo;Stable&rdquo; pointers:</term>
 402 <listitem>
 403 <para>
 404 If, for some reason, you wish to hand a Haskell pointer (i.e.,
 405 <emphasis>not</emphasis> an unboxed value) to a C routine, you first make the
 406 pointer &ldquo;stable,&rdquo; so that the garbage collector won't forget that it
 407 exists.  That is, GHC provides a safe way to pass Haskell pointers to
 408 C.
 409 </para>
 410
 411 <para>
 412 Please see <xref LinkEnd="sec-stable-pointers"> for more details.
 413 </para>
 414 </listitem>
 415 </varlistentry>
 416 <varlistentry>
 417 <term>&ldquo;Foreign objects&rdquo;:</term>
 418 <listitem>
 419 <para>
 420 A &ldquo;foreign object&rdquo; is a safe way to pass an external object (a
 421 C-allocated pointer, say) to Haskell and have Haskell do the Right
 422 Thing when it no longer references the object.  So, for example, C
 423 could pass a large bitmap over to Haskell and say &ldquo;please free this
 424 memory when you're done with it.&rdquo;
 425 </para>
 426
 427 <para>
 428 Please see <xref LinkEnd="sec-ForeignObj"> for more details.
 429 </para>
 430 </listitem>
 431 </varlistentry>
 432 </variablelist>
 433 </para>
 434
 435 <para>
The libraries documentation gives more details on all these
 437 &ldquo;primitive array&rdquo; types and the operations on them.
 438 </para>
 439
 440 </sect1>
 441
 442
 443 <sect1 id="nullary-types">
 444 <title>Data types with no constructors</title>
 445
 446 <para>With the <option>-fglasgow-exts</option> flag, GHC lets you declare
 447 a data type with no constructors.  For example:</para>
 448 <programlisting>
 449   data S      -- S :: *
 450   data T a    -- T :: * -> *
 451 </programlisting>
 452 <para>Syntactically, the declaration lacks the "= constrs" part.  The
 453 type can be parameterised, but only over ordinary types, of kind *; since
 454 Haskell does not have kind signatures, you cannot parameterise over higher-kinded
 455 types.</para>
 456
 457 <para>Such data types have only one value, namely bottom.
 458 Nevertheless, they can be useful when defining "phantom types".</para>
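
<para>A small sketch of the phantom-type idiom, assuming the module is
compiled with <option>-fglasgow-exts</option>.  The names
<literal>Metres</literal>, <literal>Feet</literal>,
<literal>Length</literal>, <function>addLength</function> and
<function>toDouble</function> are hypothetical and serve only as
illustration:</para>
<programlisting>
  data Metres                -- no constructors: used only as a type-level tag
  data Feet

  -- The parameter "unit" never appears on the right-hand side;
  -- it only records which unit a length is measured in.
  newtype Length unit = Length Double

  -- Only lengths carrying the same tag can be added.
  addLength :: Length unit -&gt; Length unit -&gt; Length unit
  addLength (Length x) (Length y) = Length (x + y)

  toDouble :: Length unit -&gt; Double
  toDouble (Length x) = x

  total :: Length Metres
  total = addLength (Length 1.5) (Length 2.5)
</programlisting>
<para>With these definitions, an expression such as
<literal>addLength total (Length 3 :: Length Feet)</literal> is
rejected by the type checker, even though both values are represented
by a plain <literal>Double</literal>.</para>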
 459 </sect1>
 460
 461 <sect1 id="pattern-guards">
 462 <title>Pattern guards</title>
 463
 464 <para>
 465 <indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
 466 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ULink URL="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ULink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
 467 </para>
 468
 469 <para>
 470 Suppose we have an abstract data type of finite maps, with a
 471 lookup operation:
 472
 473 <programlisting>
 474 lookup :: FiniteMap -> Int -> Maybe Int
 475 </programlisting>
 476
 477 The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
 478 where <VarName>v</VarName> is the value that the key maps to.  Now consider the following definition:
 479 </para>
 480
 481 <programlisting>
clunky env var1 var2 | ok1 && ok2 = val1 + val2
                     | otherwise  = var1 + var2
  where
    m1 = lookup env var1
    m2 = lookup env var2
    ok1 = maybeToBool m1
    ok2 = maybeToBool m2
    val1 = expectJust m1
    val2 = expectJust m2
 491 </programlisting>
 492
 493 <para>
 494 The auxiliary functions are
 495 </para>
 496
 497 <programlisting>
 498 maybeToBool :: Maybe a -&gt; Bool
 499 maybeToBool (Just x) = True
 500 maybeToBool Nothing  = False
 501
 502 expectJust :: Maybe a -&gt; a
 503 expectJust (Just x) = x
 504 expectJust Nothing  = error "Unexpected Nothing"
 505 </programlisting>
 506
 507 <para>
 508 What is <function>clunky</function> doing? The guard <literal>ok1 &&
 509 ok2</literal> checks that both lookups succeed, using
 510 <function>maybeToBool</function> to convert the <function>Maybe</function>
 511 types to booleans. The (lazily evaluated) <function>expectJust</function>
 512 calls extract the values from the results of the lookups, and binds the
 513 returned values to <VarName>val1</VarName> and <VarName>val2</VarName>
 514 respectively.  If either lookup fails, then clunky takes the
 515 <literal>otherwise</literal> case and returns the sum of its arguments.
 516 </para>
 517
 518 <para>
 519 This is certainly legal Haskell, but it is a tremendously verbose and
 520 un-obvious way to achieve the desired effect.  Arguably, a more direct way
 521 to write clunky would be to use case expressions:
 522 </para>
 523
 524 <programlisting>
clunky env var1 var2 = case lookup env var1 of
    Nothing -&gt; fail
    Just val1 -&gt; case lookup env var2 of
      Nothing -&gt; fail
      Just val2 -&gt; val1 + val2
  where
    fail = var1 + var2
 532 </programlisting>
 533
 534 <para>
 535 This is a bit shorter, but hardly better.  Of course, we can rewrite any set
 536 of pattern-matching, guarded equations as case expressions; that is
 537 precisely what the compiler does when compiling equations! The reason that
 538 Haskell provides guarded equations is because they allow us to write down
 539 the cases we want to consider, one at a time, independently of each other.
 540 This structure is hidden in the case version.  Two of the right-hand sides
 541 are really the same (<function>fail</function>), and the whole expression
 542 tends to become more and more indented.
 543 </para>
 544
 545 <para>
 546 Here is how I would write clunky:
 547 </para>
 548
 549 <programlisting>
clunky env var1 var2
 551   | Just val1 &lt;- lookup env var1
 552   , Just val2 &lt;- lookup env var2
 553   = val1 + val2
 554 ...other equations for clunky...
 555 </programlisting>
 556
 557 <para>
The semantics should be clear enough.  The qualifiers are matched in order.
 559 For a <literal>&lt;-</literal> qualifier, which I call a pattern guard, the
 560 right hand side is evaluated and matched against the pattern on the left.
 561 If the match fails then the whole guard fails and the next equation is
 562 tried.  If it succeeds, then the appropriate binding takes place, and the
 563 next qualifier is matched, in the augmented environment.  Unlike list
 564 comprehensions, however, the type of the expression to the right of the
 565 <literal>&lt;-</literal> is the same as the type of the pattern to its
 566 left.  The bindings introduced by pattern guards scope over all the
 567 remaining guard qualifiers, and over the right hand side of the equation.
 568 </para>
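
<para>
The rules above can be tried out with a self-contained sketch, assuming
the module is compiled with <option>-fglasgow-exts</option>.  Here the
finite map is stood in for by an association list; the names
<literal>Env</literal> and <function>lookupEnv</function> are
hypothetical and are not part of any GHC library:
</para>

<programlisting>
-- A stand-in for the abstract finite map: a list of (key, value) pairs.
type Env = [(Int, Int)]

-- Lookup with the argument order used in the text (map first, key second).
lookupEnv :: Env -&gt; Int -&gt; Maybe Int
lookupEnv env key = lookup key env

clunky :: Env -&gt; Int -&gt; Int -&gt; Int
clunky env var1 var2
  | Just val1 &lt;- lookupEnv env var1   -- binds val1 if the first lookup succeeds
  , Just val2 &lt;- lookupEnv env var2   -- val1 is already in scope here
  = val1 + val2
clunky _ var1 var2 = var1 + var2      -- tried when either pattern guard fails
</programlisting>

<para>
With <literal>env = [(1,10), (2,20)]</literal>, the call
<literal>clunky env 1 2</literal> takes the first equation, while
<literal>clunky env 1 3</literal> falls through to the second because
the lookup of <literal>3</literal> returns <function>Nothing</function>.
</para>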
 569
 570 <para>
 571 Just as with list comprehensions, boolean expressions can be freely mixed
among the pattern guards.  For example:
 573 </para>
 574
 575 <programlisting>
 576 f x | [y] <- x
 577     , y > 3
 578     , Just z <- h y
 579     = ...
 580 </programlisting>
 581
 582 <para>
 583 Haskell's current guards therefore emerge as a special case, in which the
 584 qualifier list has just one element, a boolean expression.
 585 </para>
 586 </sect1>
 587
 588   <sect1 id="parallel-list-comprehensions">
 589     <title>Parallel List Comprehensions</title>
 590     <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary>
 591     </indexterm>
 592     <indexterm><primary>parallel list comprehensions</primary>
 593     </indexterm>
 594
 595     <para>Parallel list comprehensions are a natural extension to list
 596     comprehensions.  List comprehensions can be thought of as a nice
 597     syntax for writing maps and filters.  Parallel comprehensions
 598     extend this to include the zipWith family.</para>
 599
 600     <para>A parallel list comprehension has multiple independent
 601     branches of qualifier lists, each separated by a `|' symbol.  For
 602     example, the following zips together two lists:</para>
 603
 604 <programlisting>
 605    [ (x, y) | x <- xs | y <- ys ]
 606 </programlisting>
 607
 608     <para>The behavior of parallel list comprehensions follows that of
 609     zip, in that the resulting list will have the same length as the
 610     shortest branch.</para>
 611
 612     <para>We can define parallel list comprehensions by translation to
 613     regular comprehensions.  Here's the basic idea:</para>
 614
 615     <para>Given a parallel comprehension of the form: </para>
 616
 617 <programlisting>
 618    [ e | p1 <- e11, p2 <- e12, ...
 619        | q1 <- e21, q2 <- e22, ...
 620        ...
 621    ]
 622 </programlisting>
 623
 624     <para>This will be translated to: </para>
 625
 626 <programlisting>
 627    [ e | ((p1,p2), (q1,q2), ...) <- zipN [(p1,p2) | p1 <- e11, p2 <- e12, ...]
 628                                          [(q1,q2) | q1 <- e21, q2 <- e22, ...]
 629                                          ...
 630    ]
 631 </programlisting>
 632
 633     <para>where `zipN' is the appropriate zip for the given number of
 634     branches.</para>
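
    <para>As a concrete sketch of this translation, assuming the
    module is compiled with <option>-fglasgow-exts</option>, the two
    definitions below should denote the same list.  The names
    <literal>parallel</literal> and <literal>translated</literal> are
    hypothetical:</para>

<programlisting>
   -- Two branches; the first has two qualifiers and so yields four pairs.
   parallel :: [(Int, Char, Bool)]
   parallel = [ (x, y, z) | x &lt;- [1, 2], y &lt;- "ab"
                          | z &lt;- [True, False, True] ]

   -- The same list written by hand, following the translation scheme:
   -- each branch becomes an ordinary comprehension and the results are zipped.
   translated :: [(Int, Char, Bool)]
   translated = [ (x, y, z)
                | ((x, y), z) &lt;- zip [ (x, y) | x &lt;- [1, 2], y &lt;- "ab" ]
                                     [ z | z &lt;- [True, False, True] ] ]
</programlisting>

    <para>The second branch has only three elements, so both lists
    stop after three entries:
    <literal>[(1,'a',True),(1,'b',False),(2,'a',True)]</literal>.</para>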
 635
 636   </sect1>
 637
 638 <sect1 id="class-method-types">
 639 <title>Class method types
 640 </title>
 641 <para>
Haskell 98 prohibits class method types from mentioning constraints on the
 643 class type variable, thus:
 644 <programlisting>
 645   class Seq s a where
 646     fromList :: [a] -> s a
 647     elem     :: Eq a => a -> s a -> Bool
 648 </programlisting>
 649 The type of <literal>elem</literal> is illegal in Haskell 98, because it
contains the constraint <literal>Eq a</literal>, which constrains only the
 651 class type variable (in this case <literal>a</literal>).
 652 </para>
 653 <para>
With the <option>-fglasgow-exts</option> flag, GHC lifts this restriction.
 655 </para>
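
<para>
A minimal sketch of the class above in use, assuming the module is
compiled with <option>-fglasgow-exts</option> (the class has two
parameters, so it also relies on multi-parameter type classes).  The
type <literal>ListSeq</literal> and the value
<literal>hasThree</literal> are hypothetical:
<programlisting>
  import Prelude hiding (elem)   -- avoid a clash with the Prelude's elem

  class Seq s a where
    fromList :: [a] -&gt; s a
    elem     :: Eq a =&gt; a -&gt; s a -&gt; Bool   -- constrains the class variable a

  -- A sequence represented directly by a list.
  newtype ListSeq a = ListSeq [a]

  instance Seq ListSeq a where
    fromList = ListSeq
    elem x (ListSeq ys) = any (== x) ys    -- (==) comes from the method's Eq a

  hasThree :: Bool
  hasThree = elem (3 :: Int) (fromList [1, 2, 3] :: ListSeq Int)
</programlisting>
The <literal>Eq a</literal> constraint is demanded only where
<literal>elem</literal> is used, so <literal>fromList</literal> remains
available at element types that have no equality.
</para>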
 656
 657 </sect1>
 658
 659 <sect1 id="multi-param-type-classes">
 660 <title>Multi-parameter type classes
 661 </title>
 662
 663 <para>
 664 This section documents GHC's implementation of multi-parameter type
 665 classes.  There's lots of background in the paper <ULink
 666 URL="http://research.microsoft.com/~simonpj/multi.ps.gz" >Type
 667 classes: exploring the design space</ULink > (Simon Peyton Jones, Mark
 668 Jones, Erik Meijer).
 669 </para>
 670
 671 <para>
I'd like to thank people who reported shortcomings in the GHC 3.02
 673 implementation.  Our default decisions were all conservative ones, and
 674 the experience of these heroic pioneers has given useful concrete
 675 examples to support several generalisations.  (These appear below as
 676 design choices not implemented in 3.02.)
 677 </para>
 678
 679 <para>
 680 I've discussed these notes with Mark Jones, and I believe that Hugs
 681 will migrate towards the same design choices as I outline here.
 682 Thanks to him, and to many others who have offered very useful
 683 feedback.
 684 </para>
 685
 686 <sect2>
 687 <title>Types</title>
 688
 689 <para>
 690 There are the following restrictions on the form of a qualified
 691 type:
 692 </para>
 693
 694 <para>
 695
 696 <programlisting>
 697   forall tv1..tvn (c1, ...,cn) => type
 698 </programlisting>
 699
 700 </para>
 701
 702 <para>
 703 (Here, I write the "foralls" explicitly, although the Haskell source
 704 language omits them; in Haskell 1.4, all the free type variables of an
 705 explicit source-language type signature are universally quantified,
 706 except for the class type variables in a class declaration.  However,
 707 in GHC, you can give the foralls if you want.  See <xref LinkEnd="universal-quantification">).
 708 </para>
 709
 710 <para>
 711
 712 <OrderedList>
 713 <listitem>
 714
 715 <para>
 716  <emphasis>Each universally quantified type variable
 717 <literal>tvi</literal> must be mentioned (i.e. appear free) in <literal>type</literal></emphasis>.
 718
 719 The reason for this is that a value with a type that does not obey
 720 this restriction could not be used without introducing
 721 ambiguity. Here, for example, is an illegal type:
 722
 723
 724 <programlisting>
 725   forall a. Eq a => Int
 726 </programlisting>
 727
 728
 729 When a value with this type was used, the constraint <literal>Eq tv</literal>
 730 would be introduced where <literal>tv</literal> is a fresh type variable, and
 731 (in the dictionary-translation implementation) the value would be
 732 applied to a dictionary for <literal>Eq tv</literal>.  The difficulty is that we
 733 can never know which instance of <literal>Eq</literal> to use because we never
 734 get any more information about <literal>tv</literal>.
 735
 736 </para>
 737 </listitem>
 738 <listitem>
 739
 740 <para>
 741  <emphasis>Every constraint <literal>ci</literal> must mention at least one of the
 742 universally quantified type variables <literal>tvi</literal></emphasis>.
 743
 744 For example, this type is OK because <literal>C a b</literal> mentions the
universally quantified type variable <literal>a</literal>:
 746
 747
 748 <programlisting>
 749   forall a. C a b => burble
 750 </programlisting>
 751
 752
 753 The next type is illegal because the constraint <literal>Eq b</literal> does not
 754 mention <literal>a</literal>:
 755
 756
 757 <programlisting>
 758   forall a. Eq b => burble
 759 </programlisting>
 760
 761
 762 The reason for this restriction is milder than the other one.  The
 763 excluded types are never useful or necessary (because the offending
 764 context doesn't need to be witnessed at this point; it can be floated
 765 out).  Furthermore, floating them out increases sharing. Lastly,
 766 excluding them is a conservative choice; it leaves a patch of
 767 territory free in case we need it later.
 768
 769 </para>
 770 </listitem>
 771
 772 </OrderedList>
 773
 774 </para>
 775
 776 <para>
 777 These restrictions apply to all types, whether declared in a type signature
 778 or inferred.
 779 </para>
 780
 781 <para>
 782 Unlike Haskell 1.4, constraints in types do <emphasis>not</emphasis> have to be of
 783 the form <emphasis>(class type-variables)</emphasis>.  Thus, these type signatures
 784 are perfectly OK
 785 </para>
 786
 787 <para>
 788
 789 <programlisting>
 790   f :: Eq (m a) => [m a] -> [m a]
 791   g :: Eq [a] => ...
 792 </programlisting>
 793
 794 </para>
 795
 796 <para>
 797 This choice recovers principal types, a property that Haskell 1.4 does not have.
 798 </para>
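
<para>
As an illustration of a constraint that is not of the form
<emphasis>(class type-variable)</emphasis>, here is a minimal sketch,
assuming the module is compiled with <option>-fglasgow-exts</option>.
The function <function>dedup</function> and the value
<literal>example</literal> are hypothetical:
</para>

<programlisting>
  -- Remove duplicates, keeping the first occurrence of each element.
  -- The constraint mentions the applied type (m a), not a bare variable.
  dedup :: Eq (m a) =&gt; [m a] -&gt; [m a]
  dedup []     = []
  dedup (x:xs) = x : dedup (filter (/= x) xs)

  example :: [Maybe Int]
  example = dedup [Just 1, Nothing, Just 1]
</programlisting>

<para>
At the use site <literal>m</literal> is instantiated to
<literal>Maybe</literal> and <literal>a</literal> to
<literal>Int</literal>, so the constraint becomes
<literal>Eq (Maybe Int)</literal> and is discharged by the ordinary
instances.
</para>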
 799
 800 </sect2>
 801
 802 <sect2>
 803 <title>Class declarations</title>
 804
 805 <para>
 806
 807 <OrderedList>
 808 <listitem>
 809
 810 <para>
 811  <emphasis>Multi-parameter type classes are permitted</emphasis>. For example:
 812
 813
 814 <programlisting>
 815   class Collection c a where
 816     union :: c a -> c a -> c a
 817     ...etc.
 818 </programlisting>
 819
 820
 821
 822 </para>
 823 </listitem>
 824 <listitem>
 825
 826 <para>
 827  <emphasis>The class hierarchy must be acyclic</emphasis>.  However, the definition
 828 of "acyclic" involves only the superclass relationships.  For example,
 829 this is OK:
 830
 831
 832 <programlisting>
 833   class C a where {
 834     op :: D b => a -> b -> b
 835   }
 836
 837   class C a => D a where { ... }
 838 </programlisting>
 839
 840
 841 Here, <literal>C</literal> is a superclass of <literal>D</literal>, but it's OK for a
 842 class operation <literal>op</literal> of <literal>C</literal> to mention <literal>D</literal>.  (It
 843 would not be OK for <literal>D</literal> to be a superclass of <literal>C</literal>.)
 844
 845 </para>
 846 </listitem>
 847 <listitem>
 848
 849 <para>
 850  <emphasis>There are no restrictions on the context in a class declaration
 851 (which introduces superclasses), except that the class hierarchy must
 852 be acyclic</emphasis>.  So these class declarations are OK:
 853
 854
 855 <programlisting>
 856   class Functor (m k) => FiniteMap m k where
 857     ...
 858
 859   class (Monad m, Monad (t m)) => Transform t m where
 860     lift :: m a -> (t m) a
 861 </programlisting>
 862
 863
 864 </para>
 865 </listitem>
 866 <listitem>
 867
 868 <para>
 869  <emphasis>In the signature of a class operation, every constraint
 870 must mention at least one type variable that is not a class type
 871 variable</emphasis>.
 872
 873 Thus:
 874
 875
 876 <programlisting>
 877   class Collection c a where
 878     mapC :: Collection c b => (a->b) -> c a -> c b
 879 </programlisting>
 880
 881
 882 is OK because the constraint <literal>(Collection c b)</literal> mentions
 883 <literal>b</literal>, even though it also mentions the class variable
 884 <literal>c</literal>.  On the other hand:
 885
 886
 887 <programlisting>
 888   class C a where
 889     op :: Eq a => (a,b) -> (a,b)
 890 </programlisting>
 891
 892
 893 is not OK because the constraint <literal>(Eq a)</literal> mentions only the class
 894 type variable <literal>a</literal>, but not <literal>b</literal>.  However, any such
 895 example is easily fixed by moving the offending context up to the
 896 superclass context:
 897
 898
 899 <programlisting>
 900   class Eq a => C a where
 901     op ::(a,b) -> (a,b)
 902 </programlisting>
 903
 904
 905 A yet more relaxed rule would allow the context of a class-op signature
 906 to mention only class type variables.  However, that conflicts with
 907 Rule 1(b) for types above.
 908
 909 </para>
 910 </listitem>
 911 <listitem>
 912
 913 <para>
 914  <emphasis>The type of each class operation must mention <emphasis>all</emphasis> of
 915 the class type variables</emphasis>.  For example:
 916
 917
 918 <programlisting>
 919   class Coll s a where
 920     empty  :: s
 921     insert :: s -> a -> s
 922 </programlisting>
 923
 924
 925 is not OK, because the type of <literal>empty</literal> doesn't mention
 926 <literal>a</literal>.  This rule is a consequence of Rule 1(a), above, for
 927 types, and has the same motivation.
 928
 929 Sometimes, offending class declarations exhibit misunderstandings.  For
 930 example, <literal>Coll</literal> might be rewritten
 931
 932
 933 <programlisting>
 934   class Coll s a where
 935     empty  :: s a
 936     insert :: s a -> a -> s a
 937 </programlisting>
 938
 939
 940 which makes the connection between the type of a collection of
 941 <literal>a</literal>'s (namely <literal>(s a)</literal>) and the element type <literal>a</literal>.
 942 Occasionally this really doesn't work, in which case you can split the
 943 class like this:
 944
 945
 946 <programlisting>
 947   class CollE s where
 948     empty  :: s
 949
 950   class CollE s => Coll s a where
 951     insert :: s -> a -> s
 952 </programlisting>
 953
 954
 955 </para>
 956 </listitem>
 957
 958 </OrderedList>
 959
 960 </para>
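<para>
The rewritten <literal>Coll</literal> class from the last rule can be tried
out as follows.  This is only a sketch: it assumes a GHC that takes
<literal>LANGUAGE</literal> pragmas rather than <option>-fglasgow-exts</option>,
and the type <literal>ListSet</literal> and the function <literal>elems</literal>
are invented for the example.
<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
module Main where

-- Every method type mentions both class variables, s and a.
class Coll s a where
  empty  :: s a
  insert :: s a -> a -> s a

-- Hypothetical collection type: a list without duplicates.
newtype ListSet a = ListSet [a]

instance Eq a => Coll ListSet a where
  empty = ListSet []
  insert (ListSet xs) x
    | x `elem` xs = ListSet xs
    | otherwise   = ListSet (x : xs)

elems :: ListSet a -> [a]
elems (ListSet xs) = xs

main :: IO ()
main = print (elems (insert (insert (insert empty 1) 2) 1 :: ListSet Int))
-- prints [2,1]
</programlisting>
Because <literal>empty</literal> has type <literal>s a</literal>, the type
annotation <literal>ListSet Int</literal> is enough to fix both class
parameters, which is exactly what the original <literal>empty :: s</literal>
could not do.
</para>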
 961
 962 </sect2>
 963
 964 <sect2 id="instance-decls">
 965 <title>Instance declarations</title>
 966
 967 <para>
 968
 969 <OrderedList>
 970 <listitem>
 971
 972 <para>
 973  <emphasis>Instance declarations may not overlap</emphasis>.  The two instance
 974 declarations
 975
 976
 977 <programlisting>
 978   instance context1 => C type1 where ...
 979   instance context2 => C type2 where ...
 980 </programlisting>
 981
 982
 983 "overlap" if <literal>type1</literal> and <literal>type2</literal> unify.
 984
 985 However, if you give the command line option
 986 <option>-fallow-overlapping-instances</option><indexterm><primary>-fallow-overlapping-instances
 987 option</primary></indexterm> then overlapping instance declarations are permitted.
 988 Even so, GHC arranges never to commit to using an instance declaration
 989 if another instance declaration also applies, either now or later.  Two instance declarations are accepted if:
 990
 991 <itemizedlist>
 992 <listitem>
 993
 994 <para>
 995  EITHER <literal>type1</literal> and <literal>type2</literal> do not unify
 996 </para>
 997 </listitem>
 998 <listitem>
 999
1000 <para>
1001  OR <literal>type2</literal> is a substitution instance of <literal>type1</literal>
1002 (but not identical to <literal>type1</literal>), or vice versa.
1003 </para>
1004 </listitem>
1005 </itemizedlist>
1006 Notice that these rules
1007 <itemizedlist>
1008 <listitem>
1009
1010 <para>
1011  make it clear which instance decl to use
1012 (pick the most specific one that matches)
1013
1014 </para>
1015 </listitem>
1016 <listitem>
1017
1018 <para>
1019  do not mention the contexts <literal>context1</literal>, <literal>context2</literal>.
1020 Reason: you can pick which instance decl
1021 "matches" based on the type alone.
1022 </para>
1023 </listitem>
1024
1025 </itemizedlist>
1026 However, the rules are over-conservative.  Two instance declarations can overlap,
1027 but it can still be clear in particular situations which to use.  For example:
1028 <programlisting>
1029   instance C (Int,a) where ...
1030   instance C (a,Bool) where ...
1031 </programlisting>
1032 These are rejected by GHC's rules, but it is clear what to do when trying
1033 to solve the constraint <literal>C (Int,Int)</literal> because the second instance
1034 cannot apply.  Yell if this restriction bites you.
1035 </para>
1036 <para>
1037 GHC is also conservative about committing to an overlapping instance.  For example:
1038 <programlisting>
1039   class C a where { op :: a -> a }
1040   instance C [Int] where ...
1041   instance C a => C [a] where ...
1042
1043   f :: C b => [b] -> [b]
1044   f x = op x
1045 </programlisting>
1046 From the RHS of <literal>f</literal> we get the constraint <literal>C [b]</literal>.  But
1047 GHC does not commit to the second instance declaration, because in a particular
1048 call of <literal>f</literal>, <literal>b</literal> might be instantiated to <literal>Int</literal>, so the first instance declaration
1049 would be appropriate.  So GHC rejects the program.  If you add <option>-fallow-incoherent-instances</option>
1050 GHC will instead silently pick the second instance, without complaining about
1051 the problem of subsequent instantiations.
1052 </para>
1053 <para>
1054 Regrettably, GHC doesn't guarantee to detect overlapping instance
1055 declarations if they appear in different modules.  GHC can "see" the
1056 instance declarations in the transitive closure of all the modules
1057 imported by the one being compiled, so it can "see" all instance decls
1058 when it is compiling <literal>Main</literal>.  However, it currently chooses not
1059 to look at ones that can't possibly be of use in the module currently
1060 being compiled, in the interests of efficiency.  (Perhaps we should
1061 change that decision, at least for <literal>Main</literal>.)
1062
1063 </para>
1064 </listitem>
1065 <listitem>
1066
1067 <para>
1068  <emphasis>There are no restrictions on the types in an instance
1069 <emphasis>head</emphasis>, except that at least one of them must not be a type variable</emphasis>.
1070 The instance "head" is the bit after the "=>" in an instance decl. For
1071 example, these are OK:
1072
1073
1074 <programlisting>
1075   instance C Int a where ...
1076
1077   instance D (Int, Int) where ...
1078
1079   instance E [[a]] where ...
1080 </programlisting>
1081
1082
1083 Note that instance heads <emphasis>may</emphasis> contain repeated type variables.
1084 For example, this is OK:
1085
1086
1087 <programlisting>
1088   instance Stateful (ST s) (MutVar s) where ...
1089 </programlisting>
1090
1091
1092 The "at least one not a type variable" restriction is to ensure that
1093 context reduction terminates: each reduction step removes one type
1094 constructor.  For example, the following would make the type checker
1095 loop if it wasn't excluded:
1096
1097
1098 <programlisting>
1099   instance C a => C a where ...
1100 </programlisting>
1101
1102
1103 There are two situations in which the rule is a bit of a pain. First,
1104 if one allows overlapping instance declarations then it's quite
1105 convenient to have a "default instance" declaration that applies if
1106 something more specific does not:
1107
1108
1109 <programlisting>
1110   instance C a where
1111     op = ... -- Default
1112 </programlisting>
1113
1114
1115 Second, sometimes you might want to use the following to get the
1116 effect of a "class synonym":
1117
1118
1119 <programlisting>
1120   class (C1 a, C2 a, C3 a) => C a where { }
1121
1122   instance (C1 a, C2 a, C3 a) => C a where { }
1123 </programlisting>
1124
1125
1126 This allows you to write shorter signatures:
1127
1128
1129 <programlisting>
1130   f :: C a => ...
1131 </programlisting>
1132
1133
1134 instead of
1135
1136
1137 <programlisting>
1138   f :: (C1 a, C2 a, C3 a) => ...
1139 </programlisting>
1140
1141
1142 I'm on the lookout for a simple rule that preserves decidability while
1143 allowing these idioms.  The experimental flag
1144 <option>-fallow-undecidable-instances</option><indexterm><primary>-fallow-undecidable-instances
1145 option</primary></indexterm> lifts this restriction, allowing all the types in an
1146 instance head to be type variables.
1147
1148 </para>
1149 </listitem>
1150 <listitem>
1151
1152 <para>
1153  <emphasis>Unlike Haskell 1.4, instance heads may use type
1154 synonyms</emphasis>.  As always, using a type synonym is just shorthand for
1155 writing the RHS of the type synonym definition.  For example:
1156
1157
1158 <programlisting>
1159   type Point = (Int,Int)
1160   instance C Point   where ...
1161   instance C [Point] where ...
1162 </programlisting>
1163
1164
1165 is legal.  However, if you added
1166
1167
1168 <programlisting>
1169   instance C (Int,Int) where ...
1170 </programlisting>
1171
1172
1173 as well, then the compiler will complain about the overlapping
1174 (actually, identical) instance declarations.  As always, type synonyms
1175 must be fully applied.  You cannot, for example, write:
1176
1177
1178 <programlisting>
1179   type P a = [[a]]
1180   instance Monad P where ...
1181 </programlisting>
1182
1183
1184 This design decision is independent of all the others, and easily
1185 reversed, but it makes sense to me.
1186
1187 </para>
1188 </listitem>
1189 <listitem>
1190
1191 <para>
1192 <emphasis>The types in an instance-declaration <emphasis>context</emphasis> must all
1193 be type variables</emphasis>. Thus
1194
1195
1196 <programlisting>
1197 instance C a b => Eq (a,b) where ...
1198 </programlisting>
1199
1200
1201 is OK, but
1202
1203
1204 <programlisting>
1205 instance C Int b => Foo b where ...
1206 </programlisting>
1207
1208
1209 is not OK.  Again, the intent here is to make sure that context
1210 reduction terminates.
1211
1212 Voluminous correspondence on the Haskell mailing list has convinced me
1213 that it's worth experimenting with a more liberal rule.  If you use
1214 the flag <option>-fallow-undecidable-instances</option> you can use arbitrary
1215 types in an instance context.  Termination is ensured by having a
1216 fixed-depth recursion stack.  If you exceed the stack depth you get a
1217 sort of backtrace, and the opportunity to increase the stack depth
1218 with <option>-fcontext-stack</option><emphasis>N</emphasis>.
1219
1220 </para>
1221 </listitem>
1222
1223 </OrderedList>
1224
1225 </para>
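<para>
The "pick the most specific one that matches" rule for overlapping
instances can be seen in a small program.  The sketch below is an
assumption about how one would write it for a recent GHC: it uses the
per-instance <literal>OVERLAPPING</literal> and <literal>OVERLAPPABLE</literal>
pragmas instead of the <option>-fallow-overlapping-instances</option> flag
described above, and the class <literal>Describe</literal> is invented for
the example.
<programlisting>
{-# LANGUAGE FlexibleInstances #-}
module Main where

import Data.List (intercalate)

class Describe a where
  describe :: a -> String

instance Describe Bool where
  describe = show

instance Describe Int where
  describe = show

-- General instance: any list whose elements can be described.
instance {-# OVERLAPPABLE #-} Describe a => Describe [a] where
  describe xs = "[" ++ intercalate "," (map describe xs) ++ "]"

-- Specific instance: a substitution instance of the one above.
instance {-# OVERLAPPING #-} Describe [Int] where
  describe xs = "Ints summing to " ++ show (sum xs)

main :: IO ()
main = do
  putStrLn (describe [True, False])     -- prints [True,False]
  putStrLn (describe [1, 2, 3 :: Int])  -- prints Ints summing to 6
</programlisting>
Both list instances match <literal>[Int]</literal>, and the more specific
one wins; only the general one matches <literal>[Bool]</literal>.  The
contexts play no part in the choice, as noted above.
</para>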
1226
1227 </sect2>
1228
1229 </sect1>
1230
1231 <sect1 id="implicit-parameters">
1232 <title>Implicit parameters
1233 </title>
1234
1235 <para> Implicit parameters are implemented as described in
1236 "Implicit parameters: dynamic scoping with static types",
1237 J Lewis, MB Shields, E Meijer, J Launchbury,
1238 27th ACM Symposium on Principles of Programming Languages (POPL'00),
1239 Boston, Jan 2000.
1240 </para>
1241 <para>(Most of the following, still rather incomplete, documentation is due to Jeff Lewis.)</para>
1242 <para>
1243 A variable is called <emphasis>dynamically bound</emphasis> when it is bound by the calling
1244 context of a function and <emphasis>statically bound</emphasis> when bound by the callee's
1245 context. In Haskell, all variables are statically bound. Dynamic
1246 binding of variables is a notion that goes back to Lisp, but was later
1247 discarded in more modern incarnations, such as Scheme. Dynamic binding
1248 can be very confusing in an untyped language, and unfortunately, typed
1249 languages, in particular Hindley-Milner typed languages like Haskell,
1250 only support static scoping of variables.
1251 </para>
1252 <para>
1253 However, by a simple extension to the type class system of Haskell, we
1254 can support dynamic binding. Basically, we express the use of a
1255 dynamically bound variable as a constraint on the type. These
1256 constraints lead to types of the form <literal>(?x::t') => t</literal>, which says "this
1257 function uses a dynamically-bound variable <literal>?x</literal>
1258 of type <literal>t'</literal>". For
1259 example, the following expresses the type of a sort function,
1260 implicitly parameterized by a comparison function named <literal>cmp</literal>.
1261 <programlisting>
1262   sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]
1263 </programlisting>
1264 The dynamic binding constraints are just a new form of predicate in the type class system.
1265 </para>
1266 <para>
1267 An implicit parameter is introduced by the special form <literal>?x</literal>,
1268 where <literal>x</literal> is
1269 any valid identifier. Use of this construct also introduces new
1270 dynamic binding constraints. For example, the following definition
1271 shows how we can define an implicitly parameterized sort function in
1272 terms of an explicitly parameterized <literal>sortBy</literal> function:
1273 <programlisting>
1274   sortBy :: (a -> a -> Bool) -> [a] -> [a]
1275
1276   sort   :: (?cmp :: a -> a -> Bool) => [a] -> [a]
1277   sort    = sortBy ?cmp
1278 </programlisting>
1279 Dynamic binding constraints behave just like other type class
1280 constraints in that they are automatically propagated. Thus, when a
1281 function is used, its implicit parameters are inherited by the
1282 function that called it. For example, our <literal>sort</literal> function might be used
1283 to pick out the least value in a list:
1284 <programlisting>
1285   least   :: (?cmp :: a -> a -> Bool) => [a] -> a
1286   least xs = head (sort xs)
1287 </programlisting>
1288 Without lifting a finger, the <literal>?cmp</literal> parameter is
1289 propagated to become a parameter of <literal>least</literal> as well. With explicit
1290 parameters, the default is that parameters must always be explicitly
1291 propagated. With implicit parameters, the default is to always
1292 propagate them.
1293 </para>
1294 <para>
1295 An implicit parameter differs from other type class constraints in the
1296 following way: All uses of a particular implicit parameter must have
1297 the same type. This means that the type of <literal>(?x, ?x)</literal>
1298 is <literal>(?x::a) => (a,a)</literal>, and not
1299 <literal>(?x::a, ?x::b) => (a, b)</literal>, as would be the case for type
1300 class constraints.
1301 </para>
1302 <para>
1303 An implicit parameter is bound using an expression of the form
1304 <emphasis>expr</emphasis> <literal>with</literal> <emphasis>binds</emphasis>,
1305 where <literal>with</literal> is a new keyword. This form binds the implicit
1306 parameters arising in the body, not the free variables as a <literal>let</literal> or
1307 <literal>where</literal> would do. For example, we define the <literal>min</literal> function by binding
1308 <literal>?cmp</literal>.
1309 <programlisting>
1310   min :: Ord a => [a] -> a
1311   min  = least with ?cmp = (<=)
1312 </programlisting>
1313 Syntactically, the <emphasis>binds</emphasis> part of a <literal>with</literal> construct must be a
1314 collection of simple bindings to variables (no function-style
1315 bindings, and no type signatures); these bindings are neither
1316 polymorphic nor recursive.
1317 </para>
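<para>
The pieces above fit together as in the following sketch.  The
<literal>with</literal> form is the one described here; later GHC releases,
as far as I know, write the same binding as
<literal>let ?x = ... in ...</literal> under the
<literal>ImplicitParams</literal> pragma, and the sketch uses that spelling
so that it can be compiled with a current compiler.  The names
<literal>sortByLe</literal>, <literal>sortI</literal>,
<literal>smallest</literal> and <literal>largest</literal> are invented to
avoid clashing with the Prelude.
<programlisting>
{-# LANGUAGE ImplicitParams #-}
module Main where

-- Explicitly parameterised insertion sort; le plays the role of "less or equal".
sortByLe :: (a -> a -> Bool) -> [a] -> [a]
sortByLe le = foldr ins []
  where
    ins x [] = [x]
    ins x (y : ys)
      | x `le` y  = x : y : ys
      | otherwise = y : ins x ys

-- Implicitly parameterised by ?cmp.
sortI :: (?cmp :: a -> a -> Bool) => [a] -> [a]
sortI = sortByLe ?cmp

-- The ?cmp constraint is propagated from sortI to its caller.
least :: (?cmp :: a -> a -> Bool) => [a] -> a
least xs = head (sortI xs)

-- Binding ?cmp discharges the constraint.
smallest :: Ord a => [a] -> a
smallest xs = let ?cmp = (<=) in least xs

largest :: Ord a => [a] -> a
largest xs = let ?cmp = (>=) in least xs

main :: IO ()
main = do
  print (smallest [3, 1, 2 :: Int])  -- prints 1
  print (largest  [3, 1, 2 :: Int])  -- prints 3
</programlisting>
The same <literal>least</literal> yields either end of the ordering,
depending only on which comparison is bound at the point of use.
</para>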
1318 <para>
1319 Note the following additional constraints:
1320 <itemizedlist>
1321 <listitem>
1322 <para> You can't have an implicit parameter in the context of a class or instance
1323 declaration.  For example, both these declarations are illegal:
1324 <programlisting>
1325   class (?x::Int) => C a where ...
1326   instance (?x::a) => Foo [a] where ...
1327 </programlisting>
1328 Reason: exactly which implicit parameter you pick up depends on exactly where
1329 you invoke a function. But the "invocation" of instance declarations is done
1330 behind the scenes by the compiler, so it's hard to figure out exactly where it is done.
1331 The easiest thing is to outlaw the offending types.</para>
1332 </listitem>
1333 </itemizedlist>
1334 </para>
1335
1336 </sect1>
1337
1338 <sect1 id="linear-implicit-parameters">
1339 <title>Linear implicit parameters
1340 </title>
1341 <para>
1342 Linear implicit parameters are an idea developed by Koen Claessen,
1343 Mark Shields, and Simon PJ.  They address the long-standing
1344 problem that monads seem over-kill for certain sorts of problem, notably:
1345 </para>
1346 <itemizedlist>
1347 <listitem> <para> distributing a supply of unique names </para> </listitem>
1348 <listitem> <para> distributing a supply of random numbers </para> </listitem>
1349 <listitem> <para> distributing an oracle (as in QuickCheck) </para> </listitem>
1350 </itemizedlist>
1351
1352 <para>
1353 Linear implicit parameters are just like ordinary implicit parameters,
1354 except that they are "linear" -- that is, they cannot be copied, and
1355 must be explicitly "split" instead.  Linear implicit parameters are
1356 written '<literal>%x</literal>' instead of '<literal>?x</literal>'.
1357 (The '/' in the '%' suggests the split!)
1358 </para>
1359 <para>
1360 For example:
1361 <programlisting>
1362     data NameSupply = ...
1363
1364     splitNS :: NameSupply -> (NameSupply, NameSupply)
1365     newName :: NameSupply -> Name
1366
1367     instance PrelSplit.Splittable NameSupply where
1368         split = splitNS
1369
1370
1371     f :: (%ns :: NameSupply) => Env -> Expr -> Expr
1372     f env (Lam x e) = Lam x' (f env e)
1373                     where
1374                       x'   = newName %ns
1375                       env' = extend env x x'
1376     ...more equations for f...
1377 </programlisting>
1378 Notice that the implicit parameter %ns is consumed
1379 <itemizedlist>
1380 <listitem> <para> once by the call to <literal>newName</literal> </para> </listitem>
1381 <listitem> <para> once by the recursive call to <literal>f</literal> </para></listitem>
1382 </itemizedlist>
1383 </para>
1384 <para>
1385 So the translation done by the type checker makes
1386 the parameter explicit:
1387 <programlisting>
1388     f :: NameSupply -> Env -> Expr -> Expr
1389     f ns env (Lam x e) = Lam x' (f ns1 env e)
1390                        where
1391                          (ns1,ns2) = splitNS ns
1392                          x' = newName ns2
1393                          env' = extend env x x'
1394 </programlisting>
1395 Notice the call to 'split' introduced by the type checker.
1396 How did it know to use 'splitNS'?  Because what it really did
1397 was to introduce a call to the overloaded function 'split',
1398 defined by
1399 <programlisting>
1400         class Splittable a where
1401           split :: a -> (a,a)
1402 </programlisting>
1403 The instance for <literal>Splittable NameSupply</literal> tells GHC how to implement
1404 split for name supplies.  But we can simply write
1405 <programlisting>
1406         g x = (x, %ns, %ns)
1407 </programlisting>
1408 and GHC will infer
1409 <programlisting>
1410         g :: (Splittable a, %ns :: a) => b -> (b,a,a)
1411 </programlisting>
1412 The <literal>Splittable</literal> class is built into GHC.  It's defined in <literal>PrelSplit</literal>,
1413 and exported by <literal>GlaExts</literal>.
1414 </para>
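<para>
The explicit form that the type checker produces can be written by hand
in ordinary Haskell, which may help in seeing what the linear parameter
stands for.  The following is a sketch only: <literal>Name</literal>,
<literal>Expr</literal>, <literal>Env</literal>, <literal>extend</literal>,
the numbering scheme inside <literal>NameSupply</literal> and the function
name <literal>rename</literal> are all hypothetical stand-ins for the parts
the text leaves out, and <literal>Splittable</literal> is declared locally
rather than imported.
<programlisting>
module Main where

type Name = String
type Env  = [(Name, Name)]

data Expr = Var Name | Lam Name Expr | App Expr Expr
  deriving (Eq, Show)

class Splittable a where
  split :: a -> (a, a)

-- A supply is a node in an infinite binary tree; its two halves are its children.
newtype NameSupply = NameSupply Integer

instance Splittable NameSupply where
  split (NameSupply n) = (NameSupply (2 * n), NameSupply (2 * n + 1))

newName :: NameSupply -> Name
newName (NameSupply n) = "v" ++ show n

extend :: Env -> Name -> Name -> Env
extend env x x' = (x, x') : env

-- The supply is split once for every place that consumes it.
rename :: NameSupply -> Env -> Expr -> Expr
rename ns env (Lam x e) = Lam x' (rename ns1 env' e)
  where
    (ns1, ns2) = split ns
    x'         = newName ns2
    env'       = extend env x x'
rename ns env (App f a) = App (rename ns1 env f) (rename ns2 env a)
  where
    (ns1, ns2) = split ns
rename _ env (Var x) = Var (maybe x id (lookup x env))

main :: IO ()
main = print (rename (NameSupply 1) [] (Lam "x" (Lam "y" (App (Var "x") (Var "y")))))
-- prints Lam "v3" (Lam "v5" (App (Var "v3") (Var "v5")))
</programlisting>
Each binder receives a distinct name because the two halves of a split
supply never share a node.
</para>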
1415 <para>
1416 Other points:
1417 <itemizedlist>
1418 <listitem> <para> '<literal>?x</literal>' and '<literal>%x</literal>'
1419 are entirely distinct implicit parameters: you
1420   can use them together and they won't interfere with each other. </para>
1421 </listitem>
1422
1423 <listitem> <para> You can bind linear implicit parameters in 'with' clauses. </para> </listitem>
1424
1425 <listitem> <para>You cannot have implicit parameters (whether linear or not)
1426   in the context of a class or instance declaration. </para></listitem>
1427 </itemizedlist>
1428 </para>
1429
1430 <sect2><title>Warnings</title>
1431
1432 <para>
1433 The monomorphism restriction is even more important than usual.
1434 Consider the example above:
1435 <programlisting>
1436     f :: (%ns :: NameSupply) => Env -> Expr -> Expr
1437     f env (Lam x e) = Lam x' (f env e)
1438                     where
1439                       x'   = newName %ns
1440                       env' = extend env x x'
1441 </programlisting>
1442 If we replaced the two occurrences of x' by (newName %ns), which is
1443 usually a harmless thing to do, we get:
1444 <programlisting>
1445     f :: (%ns :: NameSupply) => Env -> Expr -> Expr
1446     f env (Lam x e) = Lam (newName %ns) (f env e)
1447                     where
1448                       env' = extend env x (newName %ns)
1449 </programlisting>
1450 But now the name supply is consumed in <emphasis>three</emphasis> places
1451 (the two calls to newName, and the recursive call to f), so
1452 the result is utterly different.  Urk!  We don't even have
1453 the beta rule.
1454 </para>
1455 <para>
1456 Well, this is an experimental change.  With implicit
1457 parameters we have already lost beta reduction anyway, and
1458 (as John Launchbury puts it) we can't sensibly reason about
1459 Haskell programs without knowing their typing.
1460 </para>
1461
1462 </sect2>
1463
1464 </sect1>
1465
1466 <sect1 id="functional-dependencies">
1467 <title>Functional dependencies
1468 </title>
1469
1470 <para> Functional dependencies are implemented as described by Mark Jones
1471 in "Type Classes with Functional Dependencies", Mark P. Jones,
1472 In Proceedings of the 9th European Symposium on Programming,
1473 ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782.
1474 </para>
1475
1476 <para>
1477 There should be more documentation, but there isn't (yet).  Yell if you need it.
1478 </para>
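<para>
Until that documentation exists, a small example may help.  It is a
sketch, not a specification: the class <literal>Container</literal> and its
methods are invented here, and the pragmas assume a GHC that takes
<literal>LANGUAGE</literal> pragmas rather than <option>-fglasgow-exts</option>.
The dependency <literal>c -> e</literal> says that the container type
determines the element type.
<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
module Main where

-- The container type c uniquely determines the element type e.
class Container c e | c -> e where
  cempty  :: c
  cinsert :: e -> c -> c
  ctoList :: c -> [e]

instance Container [a] a where
  cempty  = []
  cinsert = (:)
  ctoList = id

main :: IO ()
main = print (ctoList (cinsert 1 (cinsert 2 (cempty :: [Int]))))
-- prints [1,2]
</programlisting>
Only the container type is annotated.  The dependency lets the type
checker conclude that the literals <literal>1</literal> and
<literal>2</literal> are <literal>Int</literal>s, and it is also what makes
the type of <literal>cempty</literal>, which does not mention
<literal>e</literal>, unambiguous.
</para>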
1479 </sect1>
1480
1481
1482 <sect1 id="universal-quantification">
1483 <title>Explicit universal quantification
1484 </title>
1485
1486 <para>
1487 Haskell type signatures are implicitly quantified.  The new keyword <literal>forall</literal>
1488 allows us to say exactly what this means.  For example:
1489 </para>
1490 <para>
1491 <programlisting>
1492         g :: b -> b
1493 </programlisting>
1494 means this:
1495 <programlisting>
1496         g :: forall b. (b -> b)
1497 </programlisting>
1498 The two are treated identically.
1499 </para>
1500
1501 <para>
1502 However, GHC's type system supports <emphasis>arbitrary-rank</emphasis>
1503 explicit universal quantification in
1504 types.
1505 For example, all the following types are legal:
1506 <programlisting>
1507     f1 :: forall a b. a -> b -> a
1508     g1 :: forall a b. (Ord a, Eq  b) => a -> b -> a
1509
1510     f2 :: (forall a. a->a) -> Int -> Int
1511     g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int
1512
1513     f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool
1514 </programlisting>
1515 Here, <literal>f1</literal> and <literal>g1</literal> are rank-1 types, and
1516 can be written in standard Haskell (e.g. <literal>f1 :: a->b->a</literal>).
1517 The <literal>forall</literal> makes explicit the universal quantification that
1518 is implicitly added by Haskell.
1519 </para>
1520 <para>
1521 The functions <literal>f2</literal> and <literal>g2</literal> have rank-2 types;
1522 the <literal>forall</literal> is on the left of a function arrrow.  As <literal>g2</literal>
1523 shows, the polymorphic type on the left of the function arrow can be overloaded.
1524 </para>
1525 <para>
1526 The function <literal>f3</literal> has a rank-3 type;
1527 it has a rank-2 type on the left of a function arrow.
1528 </para>
1529 <para>
1530 GHC allows types of arbitrary rank; you can nest <literal>forall</literal>s
1531 arbitrarily deep in function arrows.   (GHC used to be restricted to rank 2, but
1532 that restriction has now been lifted.)
1533 In particular, a forall-type (also called a "type scheme"),
1534 including an optional type class context, is legal:
1535 <itemizedlist>
1536 <listitem> <para> On the left of a function arrow </para> </listitem>
1537 <listitem> <para> On the right of a function arrow (see <xref linkend="hoist">) </para> </listitem>
1538 <listitem> <para> As the argument of a constructor, or type of a field, in a data type declaration. For
1539 example, any of the <literal>f1,f2,f3,g1,g2</literal> above would be valid
1540 field type signatures.</para> </listitem>
1541 <listitem> <para> As the type of an implicit parameter </para> </listitem>
1542 <listitem> <para> In a pattern type signature (see <xref linkend="scoped-type-variables">) </para> </listitem>
1543 </itemizedlist>
1544 There is one place you cannot put a <literal>forall</literal>:
1545 you cannot instantiate a type variable with a forall-type.  So you cannot
1546 make a forall-type the argument of a type constructor.  So these types are illegal:
1547 <programlisting>
1548     x1 :: [forall a. a->a]
1549     x2 :: (forall a. a->a, Int)
1550     x3 :: Maybe (forall a. a->a)
1551 </programlisting>
1552 Of course <literal>forall</literal> becomes a keyword; you can't use <literal>forall</literal> as
1553 a type variable any more!
1554 </para>
1555
1556
1557 <sect2 id="univ">
1558 <title>Examples
1559 </title>
1560
1561 <para>
1562 In a <literal>data</literal> or <literal>newtype</literal> declaration one can quantify
1563 the types of the constructor arguments.  Here are several examples:
1564 </para>
1565
1566 <para>
1567
1568 <programlisting>
1569 data T a = T1 (forall b. b -> b -> b) a
1570
1571 data MonadT m = MkMonad { return :: forall a. a -> m a,
1572                           bind   :: forall a b. m a -> (a -> m b) -> m b
1573                         }
1574
1575 newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
1576 </programlisting>
1577
1578 </para>
1579
1580 <para>
1581 The constructors have rank-2 types:
1582 </para>
1583
1584 <para>
1585
1586 <programlisting>
1587 T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
1588 MkMonad :: forall m. (forall a. a -> m a)
1589                   -> (forall a b. m a -> (a -> m b) -> m b)
1590                   -> MonadT m
1591 MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
1592 </programlisting>
1593
1594 </para>
1595
1596 <para>
1597 Notice that you don't need to use a <literal>forall</literal> if there's an
1598 explicit context.  For example in the first argument of the
1599 constructor <function>MkSwizzle</function>, an implicit "<literal>forall a.</literal>" is
1600 prefixed to the argument type.  The implicit <literal>forall</literal>
1601 quantifies all type variables that are not already in scope, and are
1602 mentioned in the type quantified over.
1603 </para>
1604
1605 <para>
1606 As for type signatures, implicit quantification happens for non-overloaded
1607 types too.  So if you write this:
1608
1609 <programlisting>
1610   data T a = MkT (Either a b) (b -> b)
1611 </programlisting>
1612
1613 it's just as if you had written this:
1614
1615 <programlisting>
1616   data T a = MkT (forall b. Either a b) (forall b. b -> b)
1617 </programlisting>
1618
1619 That is, since the type variable <literal>b</literal> isn't in scope, it's
1620 implicitly universally quantified.  (Arguably, it would be better
1621 to <emphasis>require</emphasis> explicit quantification on constructor arguments
1622 where that is what is wanted.  Feedback welcomed.)
1623 </para>
1624
1625 <para>
1626 You construct values of types <literal>T1, MonadT, Swizzle</literal> by applying
1627 the constructor to suitable values, just as usual.  For example,
1628 </para>
1629
1630 <para>
1631
1632 <programlisting>
1633     a1 :: T Int
1634     a1 = T1 (\x y->x) 3
1635
1636     a2, a3 :: Swizzle
1637     a2 = MkSwizzle sort
1638     a3 = MkSwizzle reverse
1639
1640     a4 :: MonadT Maybe
1641     a4 = let r x = Just x
1642              b m k = case m of
1643                        Just y -> k y
1644                        Nothing -> Nothing
1645          in
1646          MkMonad r b
1647
1648     mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
1649     mkTs f x y = [T1 f x, T1 f y]
1650 </programlisting>
1651
1652 </para>
1653
1654 <para>
1655 The type of the argument can, as usual, be more general than the type
1656 required, as <literal>(MkSwizzle reverse)</literal> shows.  (<function>reverse</function>
1657 does not need the <literal>Ord</literal> constraint.)
1658 </para>
1659
1660 <para>
1661 When you use pattern matching, the bound variables may now have
1662 polymorphic types.  For example:
1663 </para>
1664
1665 <para>
1666
1667 <programlisting>
1668     f :: T a -> a -> (a, Char)
1669     f (T1 w k) x = (w k x, w 'c' 'd')
1670
1671     g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
1672     g (MkSwizzle s) xs f = s (map f (s xs))
1673
1674     h :: MonadT m -> [m a] -> m [a]
1675     h m [] = return m []
1676     h m (x:xs) = bind m x          $ \y ->
1677                  bind m (h m xs)   $ \ys ->
1678                  return m (y:ys)
1679 </programlisting>
1680
1681 </para>
1682
1683 <para>
1684 In the function <function>h</function> we use the record selectors <literal>return</literal>
1685 and <literal>bind</literal> to extract the polymorphic bind and return functions
1686 from the <literal>MonadT</literal> data structure, rather than using pattern
1687 matching.
1688 </para>
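<para>
The fragments in this section can be assembled into one module, roughly
as follows.  This is a sketch under some assumptions: it targets a GHC
with the <literal>RankNTypes</literal> pragma, it writes every
<literal>forall</literal> explicitly (which, as far as I know, current
compilers require in constructor arguments), and it renames the
<literal>return</literal> field to <literal>unit</literal> and the functions
to <literal>useT</literal>, <literal>swizzleBoth</literal> and
<literal>sequenceT</literal> so that nothing clashes with the Prelude.
<programlisting>
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.List (sort)

data T a = T1 (forall b. b -> b -> b) a

newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

data MonadT m = MkMonad
  { unit :: forall a. a -> m a
  , bind :: forall a b. m a -> (a -> m b) -> m b
  }

-- w is bound by pattern matching and keeps its polymorphic type.
useT :: T a -> a -> (a, Char)
useT (T1 w k) x = (w k x, w 'c' 'd')

-- s is used at two different element types.
swizzleBoth :: Swizzle -> [Int] -> String -> ([Int], String)
swizzleBoth (MkSwizzle s) ns cs = (s ns, s cs)

-- The record selectors extract the polymorphic operations.
sequenceT :: MonadT m -> [m a] -> m [a]
sequenceT m []       = unit m []
sequenceT m (x : xs) =
  bind m x (\y -> bind m (sequenceT m xs) (\ys -> unit m (y : ys)))

maybeMonad :: MonadT Maybe
maybeMonad = MkMonad Just (\m k -> case m of
                                     Just y  -> k y
                                     Nothing -> Nothing)

main :: IO ()
main = do
  print (useT (T1 (\x _ -> x) 1) (2 :: Int))               -- prints (1,'c')
  print (swizzleBoth (MkSwizzle sort) [3, 1, 2] "cab")     -- prints ([1,2,3],"abc")
  print (swizzleBoth (MkSwizzle reverse) [3, 1, 2] "cab")  -- prints ([2,1,3],"bac")
  print (sequenceT maybeMonad [Just 1, Just 2 :: Maybe Int])  -- prints Just [1,2]
</programlisting>
</para>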
1689 </sect2>
1690
1691 <sect2>
1692 <title>Type inference</title>
1693
1694 <para>
1695 In general, type inference for arbitrary-rank types is undecidable.
1696 GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96)
1697 to get a decidable algorithm by requiring some help from the programmer.
1698 We do not yet have a formal specification of "some help" but the rule is this:
1699 </para>
1700 <para>
1701 <emphasis>For a lambda-bound or case-bound variable, x, either the programmer
1702 provides an explicit polymorphic type for x, or GHC's type inference will assume
1703 that x's type has no foralls in it</emphasis>.
1704 </para>
1705 <para>
1706 What does it mean to "provide" an explicit type for x?  You can do that by
1707 giving a type signature for x directly, using a pattern type signature
1708 (<xref linkend="scoped-type-variables">), thus:
1709 <programlisting>
1710      \ (f :: forall a. a->a) -> (f True, f 'c')
1711 </programlisting>
1712 Alternatively, you can give a type signature to the enclosing
1713 context, which GHC can "push down" to find the type for the variable:
1714 <programlisting>
1715      (\ f -> (f True, f 'c')) :: (forall a. a->a) -> (Bool,Char)
1716 </programlisting>
1717 Here the type signature on the expression can be pushed inwards
1718 to give a type signature for f.  Similarly, and more commonly,
1719 one can give a type signature for the function itself:
1720 <programlisting>
1721      h :: (forall a. a->a) -> (Bool,Char)
1722      h f = (f True, f 'c')
1723 </programlisting>
1724 You don't need to give a type signature if the lambda-bound variable
1725 is a constructor argument.  Here is an example we saw earlier:
1726 <programlisting>
1727     f :: T a -> a -> (a, Char)
1728     f (T1 w k) x = (w k x, w 'c' 'd')
1729 </programlisting>
1730 Here we do not need to give a type signature to <literal>w</literal>, because
1731 it is an argument of constructor <literal>T1</literal> and that tells GHC all
1732 it needs to know.
1733 </para>
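<para>
Two of these ways of supplying the polymorphic type can be put side by side
in a small module.  This is only a sketch: it assumes a GHC recent enough to
accept the <literal>RankNTypes</literal> language pragma (an older compiler
would be given <option>-fglasgow-exts</option> instead), and the name
<literal>viaContext</literal> is invented for the illustration.
</para>
<programlisting>
{-# LANGUAGE RankNTypes #-}
module Main where

-- Signature on the enclosing expression; GHC pushes it inwards to f.
viaContext :: (Bool, Char)
viaContext =
  ((\ f -> (f True, f 'c')) :: (forall a. a -> a) -> (Bool, Char)) id

-- Signature on the function itself.
h :: (forall a. a -> a) -> (Bool, Char)
h f = (f True, f 'c')

main :: IO ()
main = print [viaContext, h id]   -- prints [(True,'c'),(True,'c')]
</programlisting>
<para>
Deleting either signature leaves <literal>f</literal> with no explicit polymorphic
type, so by the rule above GHC assumes a type without foralls and rejects the
use of <literal>f</literal> at both <literal>Bool</literal> and <literal>Char</literal>.
</para>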
1734
1735 </sect2>
1736
1737
1738 <sect2 id="implicit-quant">
1739 <title>Implicit quantification</title>
1740
1741 <para>
1742 GHC performs implicit quantification as follows.  <emphasis>At the top level (only) of
1743 user-written types, if and only if there is no explicit <literal>forall</literal>,
1744 GHC finds all the type variables mentioned in the type that are not already
1745 in scope, and universally quantifies them.</emphasis>  For example, the following pairs are
1746 equivalent:
1747 <programlisting>
1748   f :: a -> a
1749   f :: forall a. a -> a
1750
1751   g (x::a) = let
1752                 h :: a -> b -> b
1753                 h x y = y
1754              in ...
1755   g (x::a) = let
1756                 h :: forall b. a -> b -> b
1757                 h x y = y
1758              in ...
1759 </programlisting>
1760 </para>
1761 <para>
1762 Notice that GHC does <emphasis>not</emphasis> find the innermost possible quantification
1763 point.  For example:
1764 <programlisting>
1765   f :: (a -> a) -> Int
1766            -- MEANS
1767   f :: forall a. (a -> a) -> Int
1768            -- NOT
1769   f :: (forall a. a -> a) -> Int
1770
1771
1772   g :: (Ord a => a -> a) -> Int
1773            -- MEANS the illegal type
1774   g :: forall a. (Ord a => a -> a) -> Int
1775            -- NOT
1776   g :: (forall a. Ord a => a -> a) -> Int
1777 </programlisting>
Implicit quantification gives <literal>g</literal> an illegal type, which you might think is silly,
but at least the rule is simple.  If you want the second, rank-2 type, you
must write the <literal>forall</literal>s explicitly.  Indeed, doing so is strongly advised
for rank-2 types.
1782 </para>
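<para>
The difference between the two quantification points shows up in what the
function body and its callers may do.  The following sketch assumes a GHC that
accepts the <literal>RankNTypes</literal> pragma; <literal>f</literal> and
<literal>g</literal> are cut-down variants of the signatures above (without the
<literal>Ord</literal> context), and their bodies are invented for the illustration.
</para>
<programlisting>
{-# LANGUAGE RankNTypes #-}
module Main where

-- Implicitly quantified at the top: forall a. (a -> a) -> Int.
-- The caller picks a, so any function of type t -> t will do.
f :: (a -> a) -> Int
f _ = 0

-- Explicit inner forall: the argument must itself be polymorphic,
-- so the body may use it at any type, here Int.
g :: (forall a. a -> a) -> Int
g k = k 1

main :: IO ()
main = print (f not, g id)   -- prints (0,1)
</programlisting>
<para>
Here <literal>f not</literal> is accepted because the caller instantiates
<literal>a</literal> to <literal>Bool</literal>, whereas <literal>g not</literal>
would be rejected: <literal>not</literal> is not polymorphic enough.
</para>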
1783 </sect2>
1784 </sect1>
1785
1786 <sect1 id="hoist">
1787 <title>Type synonyms and hoisting
1788 </title>
1789
1790 <para>
Type synonyms are like macros at the type level, and GHC is much more liberal
1792 about them than Haskell 98.  In particular:
1793 <itemizedlist>
1794 <listitem> <para>You can write a <literal>forall</literal> (including overloading)
1795 in a type synonym, thus:
1796 <programlisting>
1797   type Discard a = forall b. Show b => a -> b -> (a, String)
1798
1799   f :: Discard a
1800   f x y = (x, show y)
1801
  g :: Discard Int -> (Int,String)    -- A rank-2 type
  g f = f 3 True
1804 </programlisting>
1805 </para>
1806 </listitem>
1807
1808 <listitem><para>
1809 You can write an unboxed tuple in a type synonym:
1810 <programlisting>
1811   type Pr = (# Int, Int #)
1812
1813   h :: Int -> Pr
1814   h x = (# x, x #)
1815 </programlisting>
1816 </para></listitem>
1817 </itemizedlist>
1818 </para>
1819 <para>
1820 GHC does validity checking on types <emphasis>after expanding type synonyms</emphasis>
1821 so, for example,
1822 this will be rejected:
1823 <programlisting>
1824   type Pr = (# Int, Int #)
1825
1826   h :: Pr -> Int
1827   h x = ...
1828 </programlisting>
because GHC does not allow unboxed tuples on the left of a function arrow.
1830 </para>
1831
1832 <para>
However, it is often convenient to use this sort of generalised synonym at the right-hand
1834 end of an arrow, thus:
1835 <programlisting>
1836   type Discard a = forall b. a -> b -> a
1837
1838   g :: Int -> Discard Int
1839   g x y z = x+y
1840 </programlisting>
1841 Simply expanding the type synonym would give
1842 <programlisting>
1843   g :: Int -> (forall b. Int -> b -> Int)
1844 </programlisting>
1845 but GHC "hoists" the <literal>forall</literal> to give the isomorphic type
1846 <programlisting>
1847   g :: forall b. Int -> Int -> b -> Int
1848 </programlisting>
1849 In general, the rule is this: <emphasis>to determine the type specified by any explicit
1850 user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly
1851 performs the transformation:</emphasis>
1852 <programlisting>
1853   <emphasis>type1</emphasis> -> forall a1..an. <emphasis>context2</emphasis> => <emphasis>type2</emphasis>
1854 ==>
1855   forall a1..an. <emphasis>context2</emphasis> => <emphasis>type1</emphasis> -> <emphasis>type2</emphasis>
1856 </programlisting>
1857 (In fact, GHC tries to retain as much synonym information as possible for use in
1858 error messages, but that is a usability issue.)  This rule applies, of course, whether
1859 or not the <literal>forall</literal> comes from a synonym. For example, here is another
1860 valid way to write <literal>g</literal>'s type signature:
1861 <programlisting>
1862   g :: Int -> Int -> forall b. b -> Int
1863 </programlisting>
1864 </para>
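<para>
As a rough, self-contained sketch of both uses of a generalised synonym (to the
left of an arrow and at its right-hand end), assuming a GHC that accepts the
<literal>RankNTypes</literal> pragma.  The names <literal>Ignore</literal> and
<literal>h</literal> are invented here so that the two synonyms from the text
can live in one module.
</para>
<programlisting>
{-# LANGUAGE RankNTypes #-}
module Main where

type Discard a = forall b. Show b => a -> b -> (a, String)

f :: Discard a
f x y = (x, show y)

-- Synonym to the left of an arrow: a rank-2 type.
g :: Discard Int -> (Int, String)
g k = k 3 True

type Ignore a = forall b. a -> b -> a

-- Synonym at the right-hand end of an arrow; h still takes three arguments.
h :: Int -> Ignore Int
h x y _ = x + y

main :: IO ()
main = do
  print (g f)              -- prints (3,"True")
  print (h 1 2 "unused")   -- prints 3
</programlisting>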
1865 </sect1>
1866
1867
1868 <sect1 id="existential-quantification">
1869 <title>Existentially quantified data constructors
1870 </title>
1871
1872 <para>
1873 The idea of using existential quantification in data type declarations
was suggested by Laufer (I believe, though doubtless someone will
1875 correct me), and implemented in Hope+. It's been in Lennart
1876 Augustsson's <Command>hbc</Command> Haskell compiler for several years, and
1877 proved very useful.  Here's the idea.  Consider the declaration:
1878 </para>
1879
1880 <para>
1881
1882 <programlisting>
1883   data Foo = forall a. MkFoo a (a -> Bool)
1884            | Nil
1885 </programlisting>
1886
1887 </para>
1888
1889 <para>
1890 The data type <literal>Foo</literal> has two constructors with types:
1891 </para>
1892
1893 <para>
1894
1895 <programlisting>
1896   MkFoo :: forall a. a -> (a -> Bool) -> Foo
1897   Nil   :: Foo
1898 </programlisting>
1899
1900 </para>
1901
1902 <para>
1903 Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function>
1904 does not appear in the data type itself, which is plain <literal>Foo</literal>.
1905 For example, the following expression is fine:
1906 </para>
1907
1908 <para>
1909
1910 <programlisting>
1911   [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
1912 </programlisting>
1913
1914 </para>
1915
1916 <para>
1917 Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function
1918 <function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c'
1919 isUpper</function> packages a character with a compatible function.  These
1920 two things are each of type <literal>Foo</literal> and can be put in a list.
1921 </para>
1922
1923 <para>
What can we do with a value of type <literal>Foo</literal>?  In particular,
1925 what happens when we pattern-match on <function>MkFoo</function>?
1926 </para>
1927
1928 <para>
1929
1930 <programlisting>
1931   f (MkFoo val fn) = ???
1932 </programlisting>
1933
1934 </para>
1935
1936 <para>
1937 Since all we know about <literal>val</literal> and <function>fn</function> is that they
1938 are compatible, the only (useful) thing we can do with them is to
1939 apply <function>fn</function> to <literal>val</literal> to get a boolean.  For example:
1940 </para>
1941
1942 <para>
1943
1944 <programlisting>
1945   f :: Foo -> Bool
1946   f (MkFoo val fn) = fn val
1947 </programlisting>
1948
1949 </para>
1950
1951 <para>
What this allows us to do is to package heterogeneous values
1953 together with a bunch of functions that manipulate them, and then treat
1954 that collection of packages in a uniform manner.  You can express
1955 quite a bit of object-oriented-like programming this way.
1956 </para>
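<para>
Putting the pieces together gives a complete program.  This is a sketch under
two assumptions that are not in the text above: a GHC that accepts the
<literal>ExistentialQuantification</literal> pragma, and an arbitrary choice
that <literal>f</literal> returns <literal>False</literal> for <literal>Nil</literal>.
The third list element is also an addition, to show a package that yields
<literal>True</literal>.
</para>
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

f :: Foo -> Bool
f (MkFoo val fn) = fn val
f Nil            = False   -- assumption: an empty package counts as False

-- Three packages with different hidden types, treated uniformly.
foos :: [Foo]
foos = [MkFoo (3 :: Int) even, MkFoo 'c' isUpper, MkFoo "" null]

main :: IO ()
main = print (map f foos)   -- prints [False,False,True]
</programlisting>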
1957
1958 <sect2 id="existential">
1959 <title>Why existential?
1960 </title>
1961
1962 <para>
1963 What has this to do with <emphasis>existential</emphasis> quantification?
1964 Simply that <function>MkFoo</function> has the (nearly) isomorphic type
1965 </para>
1966
1967 <para>
1968
1969 <programlisting>
1970   MkFoo :: (exists a . (a, a -> Bool)) -> Foo
1971 </programlisting>
1972
1973 </para>
1974
1975 <para>
1976 But Haskell programmers can safely think of the ordinary
1977 <emphasis>universally</emphasis> quantified type given above, thereby avoiding
1978 adding a new existential quantification construct.
1979 </para>
1980
1981 </sect2>
1982
1983 <sect2>
1984 <title>Type classes</title>
1985
1986 <para>
1987 An easy extension (implemented in <Command>hbc</Command>) is to allow
1988 arbitrary contexts before the constructor.  For example:
1989 </para>
1990
1991 <para>
1992
1993 <programlisting>
1994 data Baz = forall a. Eq a => Baz1 a a
1995          | forall b. Show b => Baz2 b (b -> b)
1996 </programlisting>
1997
1998 </para>
1999
2000 <para>
2001 The two constructors have the types you'd expect:
2002 </para>
2003
2004 <para>
2005
2006 <programlisting>
2007 Baz1 :: forall a. Eq a => a -> a -> Baz
2008 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
2009 </programlisting>
2010
2011 </para>
2012
2013 <para>
2014 But when pattern matching on <function>Baz1</function> the matched values can be compared
2015 for equality, and when pattern matching on <function>Baz2</function> the first matched
2016 value can be converted to a string (as well as applying the function to it).
2017 So this program is legal:
2018 </para>
2019
2020 <para>
2021
2022 <programlisting>
2023   f :: Baz -> String
2024   f (Baz1 p q) | p == q    = "Yes"
2025                | otherwise = "No"
2026   f (Baz2 v fn)            = show (fn v)
2027 </programlisting>
2028
2029 </para>
2030
2031 <para>
2032 Operationally, in a dictionary-passing implementation, the
2033 constructors <function>Baz1</function> and <function>Baz2</function> must store the
dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and
extract them on pattern matching.
2036 </para>
2037
2038 <para>
2039 Notice the way that the syntax fits smoothly with that used for
2040 universal quantification earlier.
2041 </para>
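<para>
A runnable version of the <literal>Baz</literal> example might look as follows.
It is a sketch that assumes a GHC accepting the
<literal>ExistentialQuantification</literal> pragma; the sample values in
<literal>main</literal> are invented for the illustration.
</para>
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

f :: Baz -> String
f (Baz1 p q) | p == q    = "Yes"   -- uses the Eq dictionary stored in Baz1
             | otherwise = "No"
f (Baz2 v fn)            = show (fn v)   -- uses the Show dictionary in Baz2

main :: IO ()
main = mapM_ (putStrLn . f)
  [ Baz1 'x' 'x'               -- prints Yes
  , Baz1 True False            -- prints No
  , Baz2 (41 :: Int) (+ 1)     -- prints 42
  ]
</programlisting>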
2042
2043 </sect2>
2044
2045 <sect2>
2046 <title>Restrictions</title>
2047
2048 <para>
2049 There are several restrictions on the ways in which existentially-quantified
constructors can be used.
2051 </para>
2052
2053 <para>
2054
2055 <itemizedlist>
2056 <listitem>
2057
2058 <para>
2059  When pattern matching, each pattern match introduces a new,
2060 distinct, type for each existential type variable.  These types cannot
2061 be unified with any other type, nor can they escape from the scope of
2062 the pattern match.  For example, these fragments are incorrect:
2063
2064
2065 <programlisting>
2066 f1 (MkFoo a f) = a
2067 </programlisting>
2068
2069
2070 Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal>
2071 is the result of <function>f1</function>.  One way to see why this is wrong is to
2072 ask what type <function>f1</function> has:
2073
2074
2075 <programlisting>
2076   f1 :: Foo -> a             -- Weird!
2077 </programlisting>
2078
2079
2080 What is this "<literal>a</literal>" in the result type? Clearly we don't mean
2081 this:
2082
2083
2084 <programlisting>
2085   f1 :: forall a. Foo -> a   -- Wrong!
2086 </programlisting>
2087
2088
The original program is just plain wrong.  Here's another sort of error:
2090
2091
2092 <programlisting>
2093   f2 (Baz1 a b) (Baz1 p q) = a==q
2094 </programlisting>
2095
2096
2097 It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but
2098 <literal>a==q</literal> is wrong because it equates the two distinct types arising
2099 from the two <function>Baz1</function> constructors.
2100
2101
2102 </para>
2103 </listitem>
2104 <listitem>
2105
2106 <para>
2107 You can't pattern-match on an existentially quantified
2108 constructor in a <literal>let</literal> or <literal>where</literal> group of
2109 bindings. So this is illegal:
2110
2111
2112 <programlisting>
2113   f3 x = a==b where { Baz1 a b = x }
2114 </programlisting>
2115
2116
2117 You can only pattern-match
2118 on an existentially-quantified constructor in a <literal>case</literal> expression or
2119 in the patterns of a function definition.
2120
2121 The reason for this restriction is really an implementation one.
2122 Type-checking binding groups is already a nightmare without
2123 existentials complicating the picture.  Also an existential pattern
2124 binding at the top level of a module doesn't make sense, because it's
2125 not clear how to prevent the existentially-quantified type "escaping".
2126 So for now, there's a simple-to-state restriction.  We'll see how
2127 annoying it is.
2128
2129 </para>
2130 </listitem>
2131 <listitem>
2132
2133 <para>
2134 You can't use existential quantification for <literal>newtype</literal>
2135 declarations.  So this is illegal:
2136
2137
2138 <programlisting>
2139   newtype T = forall a. Ord a => MkT a
2140 </programlisting>
2141
2142
2143 Reason: a value of type <literal>T</literal> must be represented as a pair
of a dictionary for <literal>Ord a</literal> and a value of type <literal>a</literal>.
2145 That contradicts the idea that <literal>newtype</literal> should have no
2146 concrete representation.  You can get just the same efficiency and effect
2147 by using <literal>data</literal> instead of <literal>newtype</literal>.  If there is no
2148 overloading involved, then there is more of a case for allowing
an existentially-quantified <literal>newtype</literal>,
because the <literal>data</literal> version does carry an implementation cost,
2151 but single-field existentially quantified constructors aren't much
2152 use.  So the simple restriction (no existential stuff on <literal>newtype</literal>)
2153 stands, unless there are convincing reasons to change it.
2154
2155
2156 </para>
2157 </listitem>
2158 <listitem>
2159
2160 <para>
2161  You can't use <literal>deriving</literal> to define instances of a
2162 data type with existentially quantified data constructors.
2163
Reason: in most cases it would not make sense. For example:
2165
2166 <programlisting>
2167 data T = forall a. MkT [a] deriving( Eq )
2168 </programlisting>
2169
2170 To derive <literal>Eq</literal> in the standard way we would need to have equality
2171 between the single component of two <function>MkT</function> constructors:
2172
2173 <programlisting>
2174 instance Eq T where
2175   (MkT a) == (MkT b) = ???
2176 </programlisting>
2177
2178 But <VarName>a</VarName> and <VarName>b</VarName> have distinct types, and so can't be compared.
2179 It's just about possible to imagine examples in which the derived instance
2180 would make sense, but it seems altogether simpler simply to prohibit such
2181 declarations.  Define your own instances!
2182 </para>
2183 </listitem>
2184
2185 </itemizedlist>
2186
2187 </para>
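<para>
Two of these restrictions have straightforward workarounds, sketched below
under the assumption of a GHC that accepts the
<literal>ExistentialQuantification</literal> pragma.  The <literal>case</literal>
version of <literal>f3</literal> replaces the illegal <literal>where</literal>
binding, and a hand-written <literal>Eq</literal> instance replaces
<literal>deriving</literal>.  Comparing lengths is an arbitrary notion of
equality chosen only for the illustration.
</para>
<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a => Baz1 a a

-- Legal counterpart of f3: match in a case expression, not a where binding.
f3 :: Baz -> Bool
f3 x = case x of
  Baz1 a b -> a == b

data T = forall a. MkT [a]

-- The element types of two MkT values are unrelated, so only
-- type-independent properties (here, the length) can be compared.
instance Eq T where
  (MkT xs) == (MkT ys) = length xs == length ys

main :: IO ()
main = print (f3 (Baz1 'p' 'p'), MkT "ab" == MkT [True, False])
       -- prints (True,True)
</programlisting>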
2188
2189 </sect2>
2190
2191 </sect1>
2192
2193 <sect1 id="scoped-type-variables">
2194 <title>Scoped Type Variables
2195 </title>
2196
2197 <para>
2198 A <emphasis>pattern type signature</emphasis> can introduce a <emphasis>scoped type
2199 variable</emphasis>.  For example
2200 </para>
2201
2202 <para>
2203
2204 <programlisting>
2205 f (xs::[a]) = ys ++ ys
2206            where
2207               ys :: [a]
2208               ys = reverse xs
2209 </programlisting>
2210
2211 </para>
2212
2213 <para>
2214 The pattern <literal>(xs::[a])</literal> includes a type signature for <VarName>xs</VarName>.
2215 This brings the type variable <literal>a</literal> into scope; it scopes over
2216 all the patterns and right hand sides for this equation for <function>f</function>.
In particular, it is in scope at the type signature for <VarName>ys</VarName>.
2218 </para>
2219
2220 <para>
2221  Pattern type signatures are completely orthogonal to ordinary, separate
2222 type signatures.  The two can be used independently or together.
2223 At ordinary type signatures, such as that for <VarName>ys</VarName>, any type variables
2224 mentioned in the type signature <emphasis>that are not in scope</emphasis> are
2225 implicitly universally quantified.  (If there are no type variables in
2226 scope, all type variables mentioned in the signature are universally
2227 quantified, which is just as in Haskell 98.)  In this case, since <VarName>a</VarName>
2228 is in scope, it is not universally quantified, so the type of <VarName>ys</VarName> is
2229 the same as that of <VarName>xs</VarName>.  In Haskell 98 it is not possible to declare
2230 a type for <VarName>ys</VarName>; a major benefit of scoped type variables is that
2231 it becomes possible to do so.
2232 </para>
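<para>
The example can be tried out as a complete module.  This sketch assumes a GHC
that accepts the <literal>ScopedTypeVariables</literal> pragma.  The separate
signature for <literal>f</literal> is an addition; it deliberately uses a
different variable name, <literal>b</literal>, to show that the
<literal>a</literal> in the signature for <literal>ys</literal> is the one
bound by the pattern.
</para>
<programlisting>
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

f :: [b] -> [b]
f (xs :: [a]) = ys ++ ys
  where
    ys :: [a]          -- this a is the one bound by the pattern (xs :: [a])
    ys = reverse xs

main :: IO ()
main = putStrLn (f "abc")   -- prints cbacba
</programlisting>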
2233
2234 <para>
2235 Scoped type variables are implemented in both GHC and Hugs.  Where the
2236 implementations differ from the specification below, those differences
2237 are noted.
2238 </para>
2239
2240 <para>
2241 So much for the basic idea.  Here are the details.
2242 </para>
2243
2244 <sect2>
2245 <title>What a pattern type signature means</title>
2246 <para>
2247 A type variable brought into scope by a pattern type signature is simply
the name for a type.  The restriction such signatures express is that all occurrences
2249 of the same name mean the same type.  For example:
2250 <programlisting>
2251   f :: [Int] -> Int -> Int
2252   f (xs::[a]) (y::a) = (head xs + y) :: a
2253 </programlisting>
2254 The pattern type signatures on the left hand side of
2255 <literal>f</literal> express the fact that <literal>xs</literal>
2256 must be a list of things of some type <literal>a</literal>; and that <literal>y</literal>
2257 must have this same type.  The type signature on the expression <literal>(head xs)</literal>
2258 specifies that this expression must have the same type <literal>a</literal>.
2259 <emphasis>There is no requirement that the type named by "<literal>a</literal>" is
2260 in fact a type variable</emphasis>.  Indeed, in this case, the type named by "<literal>a</literal>" is
2261 <literal>Int</literal>.  (This is a slight liberalisation from the original rather complex
2262 rules, which specified that a pattern-bound type variable should be universally quantified.)
2263 For example, all of these are legal:</para>
2264
2265 <programlisting>
2266   t (x::a) (y::a) = x+y*2
2267
2268   f (x::a) (y::b) = [x,y]       -- a unifies with b
2269
2270   g (x::a) = x + 1::Int         -- a unifies with Int
2271
2272   h x = let k (y::a) = [x,y]    -- a is free in the
2273         in k x                  -- environment
2274
2275   k (x::a) True    = ...        -- a unifies with Int
2276   k (x::Int) False = ...
2277
2278   w :: [b] -> [b]
2279   w (x::a) = x                  -- a unifies with [b]
2280 </programlisting>
2281
2282 </sect2>
2283
2284 <sect2>
2285 <title>Scope and implicit quantification</title>
2286
2287 <para>
2288
2289 <itemizedlist>
2290 <listitem>
2291
2292 <para>
2293 All the type variables mentioned in a pattern,
2294 that are not already in scope,
2295 are brought into scope by the pattern.  We describe this set as
2296 the <emphasis>type variables bound by the pattern</emphasis>.
2297 For example:
2298 <programlisting>
2299   f (x::a) = let g (y::(a,b)) = fst y
2300              in
2301              g (x,True)
2302 </programlisting>
2303 The pattern <literal>(x::a)</literal> brings the type variable
2304 <literal>a</literal> into scope, as well as the term
2305 variable <literal>x</literal>.  The pattern <literal>(y::(a,b))</literal>
2306 contains an occurrence of the already-in-scope type variable <literal>a</literal>,
2307 and brings into scope the type variable <literal>b</literal>.
2308 </para>
2309 </listitem>
2310
2311 <listitem>
2312 <para>
2313 The type variable(s) bound by the pattern have the same scope
2314 as the term variable(s) bound by the pattern.  For example:
2315 <programlisting>
2316   let
2317     f (x::a) = <...rhs of f...>
2318     (p::b, q::b) = (1,2)
2319   in <...body of let...>
2320 </programlisting>
2321 Here, the type variable <literal>a</literal> scopes over the right hand side of <literal>f</literal>,
2322 just like <literal>x</literal> does; while the type variable <literal>b</literal> scopes over the
2323 body of the <literal>let</literal>, and all the other definitions in the <literal>let</literal>,
2324 just like <literal>p</literal> and <literal>q</literal> do.
2325 Indeed, the newly bound type variables also scope over any ordinary, separate
2326 type signatures in the <literal>let</literal> group.
2327 </para>
2328 </listitem>
2329
2330
2331 <listitem>
2332 <para>
2333 The type variables bound by the pattern may be
2334 mentioned in ordinary type signatures or pattern
2335 type signatures anywhere within their scope.
2336
2337 </para>
2338 </listitem>
2339
2340 <listitem>
2341 <para>
2342  In ordinary type signatures, any type variable mentioned in the
2343 signature that is in scope is <emphasis>not</emphasis> universally quantified.
2344
2345 </para>
2346 </listitem>
2347
2348 <listitem>
2349
2350 <para>
2351  Ordinary type signatures do not bring any new type variables
2352 into scope (except in the type signature itself!). So this is illegal:
2353
2354 <programlisting>
2355   f :: a -> a
2356   f x = x::a
2357 </programlisting>
2358
2359 It's illegal because <VarName>a</VarName> is not in scope in the body of <function>f</function>,
2360 so the ordinary signature <literal>x::a</literal> is equivalent to <literal>x::forall a.a</literal>;
2361 and that is an incorrect typing.
2362
2363 </para>
2364 </listitem>
2365
2366 <listitem>
2367 <para>
2368 The pattern type signature is a monotype:
2369 </para>
2370
2371 <itemizedlist>
2372 <listitem> <para>
2373 A pattern type signature cannot contain any explicit <literal>forall</literal> quantification.
2374 </para> </listitem>
2375
2376 <listitem>  <para>
2377 The type variables bound by a pattern type signature can only be instantiated to monotypes,
2378 not to type schemes.
2379 </para> </listitem>
2380
2381 <listitem>  <para>
2382 There is no implicit universal quantification on pattern type signatures (in contrast to
2383 ordinary type signatures).
2384 </para> </listitem>
2385
2386 </itemizedlist>
2387
2388 </listitem>
2389
2390 <listitem>
2391 <para>
2392
2393 The type variables in the head of a <literal>class</literal> or <literal>instance</literal> declaration
2394 scope over the methods defined in the <literal>where</literal> part.  For example:
2395
2396
2397 <programlisting>
2398   class C a where
2399     op :: [a] -> a
2400
2401     op xs = let ys::[a]
2402                 ys = reverse xs
2403             in
2404             head ys
2405 </programlisting>
2406
2407
2408 (Not implemented in Hugs yet, Dec 98).
2409 </para>
2410 </listitem>
2411
2412 </itemizedlist>
2413
2414 </para>
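<para>
The last rule, about class declarations, can be exercised with a small
program.  This is a sketch assuming a GHC that accepts the
<literal>ScopedTypeVariables</literal> pragma; the <literal>Int</literal>
instance is hypothetical and exists only so that the default method can be called.
</para>
<programlisting>
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

class C a where
  op :: [a] -> a

  -- Default method: the a in the signature for ys is the class variable.
  op xs = let ys :: [a]
              ys = reverse xs
          in
          head ys

instance C Int   -- hypothetical instance; relies on the default op

main :: IO ()
main = print (op [1, 2, 3 :: Int])   -- prints 3
</programlisting>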
2415
2416 </sect2>
2417
2418 <sect2>
2419 <title>Result type signatures</title>
2420
2421 <para>
2422
2423 <itemizedlist>
2424 <listitem>
2425
2426 <para>
2427  The result type of a function can be given a signature,
2428 thus:
2429
2430
2431 <programlisting>
2432   f (x::a) :: [a] = [x,x,x]
2433 </programlisting>
2434
2435
2436 The final <literal>:: [a]</literal> after all the patterns gives a signature to the
2437 result type.  Sometimes this is the only way of naming the type variable
2438 you want:
2439
2440
2441 <programlisting>
2442   f :: Int -> [a] -> [a]
2443   f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
2444                         in \xs -> map g (reverse xs `zip` xs)
2445 </programlisting>
2446
2447
2448 </para>
2449 </listitem>
2450
2451 </itemizedlist>
2452
2453 </para>
2454
2455 <para>
2456 Result type signatures are not yet implemented in Hugs.
2457 </para>
2458
2459 </sect2>
2460
2461 <sect2>
2462 <title>Where a pattern type signature can occur</title>
2463
2464 <para>
2465 A pattern type signature can occur in any pattern.  For example:
2466 <itemizedlist>
2467
2468 <listitem>
2469 <para>
A pattern type signature can be on an arbitrary sub-pattern, not
just on a variable:
2472
2473
2474 <programlisting>
2475   f ((x,y)::(a,b)) = (y,x) :: (b,a)
2476 </programlisting>
2477
2478
2479 </para>
2480 </listitem>
2481 <listitem>
2482
2483 <para>
2484  Pattern type signatures, including the result part, can be used
2485 in lambda abstractions:
2486
2487 <programlisting>
2488   (\ (x::a, y) :: a -> x)
2489 </programlisting>
2490 </para>
2491 </listitem>
2492 <listitem>
2493
2494 <para>
2495  Pattern type signatures, including the result part, can be used
2496 in <literal>case</literal> expressions:
2497
2498
2499 <programlisting>
2500   case e of { (x::a, y) :: a -> x }
2501 </programlisting>
2502
2503 </para>
2504 </listitem>
2505
2506 <listitem>
2507 <para>
2508 To avoid ambiguity, the type after the &ldquo;<literal>::</literal>&rdquo; in a result
2509 pattern signature on a lambda or <literal>case</literal> must be atomic (i.e. a single
2510 token or a parenthesised type of some sort).  To see why,
2511 consider how one would parse this:
2512
2513
2514 <programlisting>
2515   \ x :: a -> b -> x
2516 </programlisting>
2517
2518
2519 </para>
2520 </listitem>
2521
2522 <listitem>
2523
2524 <para>
2525  Pattern type signatures can bind existential type variables.
2526 For example:
2527
2528
2529 <programlisting>
2530   data T = forall a. MkT [a]
2531
2532   f :: T -> T
2533   f (MkT [t::a]) = MkT t3
2534                  where
2535                    t3::[a] = [t,t,t]
2536 </programlisting>
2537
2538
2539 </para>
2540 </listitem>
2541
2542
2543 <listitem>
2544
2545 <para>
2546 Pattern type signatures
2547 can be used in pattern bindings:
2548
2549 <programlisting>
2550   f x = let (y, z::a) = x in ...
2551   f1 x                = let (y, z::Int) = x in ...
2552   f2 (x::(Int,a))     = let (y, z::a)   = x in ...
2553   f3 :: (b->b)        = \x -> x
2554 </programlisting>
2555
2556 In all such cases, the binding is not generalised over the pattern-bound
2557 type variables.  Thus <literal>f3</literal> is monomorphic; <literal>f3</literal>
2558 has type <literal>b -&gt; b</literal> for some type <literal>b</literal>,
2559 and <emphasis>not</emphasis> <literal>forall b. b -&gt; b</literal>.
2560 In contrast, the binding
2561 <programlisting>
2562   f4 :: b->b
2563   f4 = \x -> x
2564 </programlisting>
2565 makes a polymorphic function, but <literal>b</literal> is not in scope anywhere
2566 in <literal>f4</literal>'s scope.
2567
2568 </para>
2569 </listitem>
2570 </itemizedlist>
2571 </para>
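<para>
The existential case is the one where a pattern type signature is hardest to
do without, so here is a sketch of it as a complete module.  It assumes a GHC
that accepts the <literal>ExistentialQuantification</literal> and
<literal>ScopedTypeVariables</literal> pragmas.  The fall-through equation for
<literal>f</literal> and the helper <literal>size</literal> are additions made
so that the result can be observed, and <literal>t3</literal> is given an
ordinary type signature rather than a pattern binding.
</para>
<programlisting>
{-# LANGUAGE ExistentialQuantification, ScopedTypeVariables #-}
module Main where

data T = forall a. MkT [a]

f :: T -> T
f (MkT [t :: a]) = MkT t3
  where
    t3 :: [a]          -- a names the existential type hidden inside MkT
    t3 = [t, t, t]
f other = other        -- assumption: leave non-singleton lists unchanged

size :: T -> Int       -- hypothetical helper
size (MkT xs) = length xs

main :: IO ()
main = print (size (f (MkT "x")), size (f (MkT "xy")))   -- prints (3,2)
</programlisting>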
2572
2573 </sect2>
2574 </sect1>
2575
2576 <sect1 id="sec-kinding">
2577 <title>Explicitly-kinded quantification</title>
2578
2579 <para>
2580 Haskell infers the kind of each type variable.  Sometimes it is nice to be able
2581 to give the kind explicitly as (machine-checked) documentation,
2582 just as it is nice to give a type signature for a function.  On some occasions,
2583 it is essential to do so.  For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999)
2584 John Hughes had to define the data type:
2585 <Screen>
2586      data Set cxt a = Set [a]
2587                     | Unused (cxt a -> ())
2588 </Screen>
2589 The only use for the <literal>Unused</literal> constructor was to force the correct
2590 kind for the type variable <literal>cxt</literal>.
2591 </para>
2592 <para>
2593 GHC now instead allows you to specify the kind of a type variable directly, wherever
2594 a type variable is explicitly bound.  Namely:
2595 <itemizedlist>
2596 <listitem><para><literal>data</literal> declarations:
2597 <Screen>
2598   data Set (cxt :: * -> *) a = Set [a]
2599 </Screen></para></listitem>
2600 <listitem><para><literal>type</literal> declarations:
2601 <Screen>
2602   type T (f :: * -> *) = f Int
2603 </Screen></para></listitem>
2604 <listitem><para><literal>class</literal> declarations:
2605 <Screen>
2606   class (Eq a) => C (f :: * -> *) a where ...
2607 </Screen></para></listitem>
2608 <listitem><para><literal>forall</literal>'s in type signatures:
2609 <Screen>
2610   f :: forall (cxt :: * -> *). Set cxt Int
2611 </Screen></para></listitem>
2612 </itemizedlist>
2613 </para>
2614
2615 <para>
2616 The parentheses are required.  Some of the spaces are required too, to
2617 separate the lexemes.  If you write <literal>(f::*->*)</literal> you
2618 will get a parse error, because "<literal>::*->*</literal>" is a
2619 single lexeme in Haskell.
2620 </para>
2621
2622 <para>
2623 As part of the same extension, you can put kind annotations in types
2624 as well.  Thus:
2625 <Screen>
2626    f :: (Int :: *) -> Int
2627    g :: forall a. a -> (a :: *)
2628 </Screen>
2629 The syntax is
2630 <Screen>
   atype ::= '(' ctype '::' kind ')'
2632 </Screen>
2633 The parentheses are required.
2634 </para>
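<para>
A short module exercising these annotations is sketched below.  It assumes a
GHC that accepts the <literal>KindSignatures</literal> and
<literal>RankNTypes</literal> pragmas and still reads <literal>*</literal> as
the kind of types; the function <literal>size</literal> is invented for the
illustration.
</para>
<programlisting>
{-# LANGUAGE KindSignatures, RankNTypes #-}
module Main where

-- No Unused constructor is needed to fix the kind of cxt.
data Set (cxt :: * -> *) a = Set [a]

type T (f :: * -> *) = f Int

-- Kind annotation on a variable bound by an explicit forall.
size :: forall (cxt :: * -> *) a. Set cxt a -> Int
size (Set xs) = length xs

main :: IO ()
main = print (size (Set "abc" :: Set Maybe Char), Just 3 :: T Maybe)
       -- prints (3,Just 3)
</programlisting>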
2635 </sect1>
2636
2637 <sect1 id="sec-assertions">
2638 <title>Assertions
2639 <indexterm><primary>Assertions</primary></indexterm>
2640 </title>
2641
2642 <para>
2643 If you want to make use of assertions in your standard Haskell code, you
2644 could define a function like the following:
2645 </para>
2646
2647 <para>
2648
2649 <programlisting>
2650 assert :: Bool -> a -> a
2651 assert False x = error "assertion failed!"
2652 assert _     x = x
2653 </programlisting>
2654
2655 </para>
2656
2657 <para>
2658 which works, but gives you back a less than useful error message --
2659 an assertion failed, but which and where?
2660 </para>
2661
2662 <para>
2663 One way out is to define an extended <function>assert</function> function which also
2664 takes a descriptive string to include in the error message and
2665 perhaps combine this with the use of a pre-processor which inserts
2666 the source location where <function>assert</function> was used.
2667 </para>
2668
2669 <para>
GHC offers a helping hand here, doing all of this for you.  Consider a
use of <function>assert</function> in the user's source:
2672 </para>
2673
2674 <para>
2675
2676 <programlisting>
2677 kelvinToC :: Double -> Double
kelvinToC k = assert (k &gt;= 0.0) (k-273.15)
2679 </programlisting>
2680
2681 </para>
2682
2683 <para>
GHC will rewrite this to also include the source location where the
assertion was made:
2686 </para>
2687
2688 <para>
2689
2690 <programlisting>
2691 assert pred val ==> assertError "Main.hs|15" pred val
2692 </programlisting>
2693
2694 </para>
2695
2696 <para>
2697 The rewrite is only performed by the compiler when it spots
2698 applications of <function>Exception.assert</function>, so you can still define and
2699 use your own versions of <function>assert</function>, should you so wish. If not,
import <literal>Exception</literal> to make use of <function>assert</function> in your code.
2701 </para>
2702
2703 <para>
2704 To have the compiler ignore uses of assert, use the compiler option
2705 <option>-fignore-asserts</option>. <indexterm><primary>-fignore-asserts option</primary></indexterm> That is,
2706 expressions of the form <literal>assert pred e</literal> will be rewritten to <literal>e</literal>.
2707 </para>
2708
2709 <para>
Assertion failures can be caught; see the documentation for the
2711 <literal>Exception</literal> library (<xref linkend="sec-Exception">)
2712 for the details.
2713 </para>
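<para>
For a complete program using the built-in <function>assert</function>, a
minimal sketch follows.  It assumes a current library layout, in which
<function>assert</function> is exported by <literal>Control.Exception</literal>
rather than by the <literal>Exception</literal> module named above.
</para>
<programlisting>
module Main where

import Control.Exception (assert)

kelvinToC :: Double -> Double
kelvinToC k = assert (k &gt;= 0.0) (k - 273.15)

main :: IO ()
main = print (kelvinToC 273.15)   -- prints 0.0
</programlisting>
<para>
Calling <literal>kelvinToC</literal> with a negative argument makes the
assertion fail when the result is demanded, unless assertions have been
switched off as described above.
</para>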
2714
2715 </sect1>
2716
2717   <sect1 id="pragmas">
2718     <title>Pragmas</title>
2719
2720     <indexterm><primary>pragma</primary></indexterm>
2721
2722     <para>GHC supports several pragmas, or instructions to the
2723     compiler placed in the source code.  Pragmas don't normally affect
2724     the meaning of the program, but they might affect the efficiency
2725     of the generated code.</para>
2726
2727     <para>Pragmas all take the form
2728
2729 <literal>{-# <replaceable>word</replaceable> ... #-}</literal>
2730
2731     where <replaceable>word</replaceable> indicates the type of
2732     pragma, and is followed optionally by information specific to that
2733     type of pragma.  Case is ignored in
2734     <replaceable>word</replaceable>.  The various values for
2735     <replaceable>word</replaceable> that GHC understands are described
2736     in the following sections; any pragma encountered with an
2737     unrecognised <replaceable>word</replaceable> is (silently)
2738     ignored.</para>
2739
2740 <sect2 id="inline-pragma">
2741 <title>INLINE pragma
2742
2743 <indexterm><primary>INLINE pragma</primary></indexterm>
2744 <indexterm><primary>pragma, INLINE</primary></indexterm></title>
2745
2746 <para>
2747 GHC (with <option>-O</option>, as always) tries to inline (or &ldquo;unfold&rdquo;)
2748 functions/values that are &ldquo;small enough,&rdquo; thus avoiding the call
2749 overhead and possibly exposing other more-wonderful optimisations.
2750 </para>
2751
2752 <para>
2753 You will probably see these unfoldings (in Core syntax) in your
2754 interface files.
2755 </para>
2756
2757 <para>
2758 Normally, if GHC decides a function is &ldquo;too expensive&rdquo; to inline, it
2759 will not do so, nor will it export that unfolding for other modules to
2760 use.
2761 </para>
2762
2763 <para>
2764 The sledgehammer you can bring to bear is the
2765 <literal>INLINE</literal><indexterm><primary>INLINE pragma</primary></indexterm> pragma, used thusly:
2766
2767 <programlisting>
2768 key_function :: Int -> String -> (Bool, Double)
2769
2770 #ifdef __GLASGOW_HASKELL__
2771 {-# INLINE key_function #-}
2772 #endif
2773 </programlisting>
2774
2775 (You don't need to do the C pre-processor carry-on unless you're going
2776 to stick the code through HBC&mdash;it doesn't like <literal>INLINE</literal> pragmas.)
2777 </para>
2778
2779 <para>
2780 The major effect of an <literal>INLINE</literal> pragma is to declare a function's
2781 &ldquo;cost&rdquo; to be very low.  The normal unfolding machinery will then be
2782 very keen to inline it.
2783 </para>
2784
2785 <para>
2786 An <literal>INLINE</literal> pragma for a function can be put anywhere its type
2787 signature could be put.
2788 </para>
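
<para>
A minimal complete module showing the placement might look like this.  The
function <function>square</function> is only an illustration; something this small
would probably be inlined anyway:
</para>

<programlisting>
module Main ( main ) where

square :: Int -> Int
{-# INLINE square #-}   -- may go wherever the type signature may go
square x = x * x

main :: IO ()
main = print (sum (map square [1 .. 10]))
</programlisting>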
2789
2790 <para>
2791 <literal>INLINE</literal> pragmas are a particularly good idea for the
2792 <literal>then</literal>/<literal>return</literal> (or <literal>bind</literal>/<literal>unit</literal>) functions in a monad.
2793 For example, in GHC's own <literal>UniqueSupply</literal> monad code, we have:
2794
2795 <programlisting>
2796 #ifdef __GLASGOW_HASKELL__
2797 {-# INLINE thenUs #-}
2798 {-# INLINE returnUs #-}
2799 #endif
2800 </programlisting>
2801
2802 </para>
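
<para>
To make the idea concrete, here is a much-simplified stand-in for such a monad.
It is a sketch, not GHC's actual <literal>UniqueSupply</literal> code: the supply is
assumed to be a plain <literal>Int</literal> counter, and <function>getUnique</function>
is a hypothetical helper:
</para>

<programlisting>
module Main ( main ) where

-- A computation that consumes unique numbers from an Int counter.
type UniqSM a = Int -> (a, Int)

{-# INLINE thenUs #-}
thenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b
thenUs m k us = case m us of
                  (a, us') -> k a us'

{-# INLINE returnUs #-}
returnUs :: a -> UniqSM a
returnUs a us = (a, us)

-- Hand out the current counter value and advance the counter.
getUnique :: UniqSM Int
getUnique us = (us, us + 1)

main :: IO ()
main = print (fst (twoUniques 0))
  where
    twoUniques = getUnique `thenUs` \a ->
                 getUnique `thenUs` \b ->
                 returnUs (a, b)
</programlisting>

<para>
Because every monadic step goes through <function>thenUs</function>, inlining it
lets the optimiser see the tuple construction and the <literal>case</literal>
together.
</para>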
2803
2804 </sect2>
2805
2806 <sect2 id="noinline-pragma">
2807 <title>NOINLINE pragma
2808 </title>
2809
2810 <indexterm><primary>NOINLINE pragma</primary></indexterm>
2811 <indexterm><primary>pragma</primary><secondary>NOINLINE</secondary></indexterm>
2812 <indexterm><primary>NOTINLINE pragma</primary></indexterm>
2813 <indexterm><primary>pragma</primary><secondary>NOTINLINE</secondary></indexterm>
2814
2815 <para>
2816 The <literal>NOINLINE</literal> pragma does exactly what you'd expect:
2817 it stops the named function from being inlined by the compiler.  You
2818 shouldn't ever need to do this, unless you're very cautious about code
2819 size.
2820 </para>
2821
2822 <para><literal>NOTINLINE</literal> is a synonym for
2823 <literal>NOINLINE</literal> (<literal>NOTINLINE</literal> is specified
2824 by Haskell 98 as the standard way to disable inlining, so it should be
2825 used if you want your code to be portable).</para>
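
<para>
The pragma is written in the same position as <literal>INLINE</literal>.  In this
sketch the function <function>sumSquares</function> is invented for illustration:
</para>

<programlisting>
module Main ( main ) where

sumSquares :: Int -> Int
{-# NOINLINE sumSquares #-}   -- NOTINLINE is the Haskell 98 spelling
sumSquares n = sum (map (\i -> i * i) [1 .. n])

main :: IO ()
main = print (sumSquares 10 + sumSquares 20)
</programlisting>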
2826
2827 </sect2>
2828
2829     <sect2 id="specialize-pragma">
2830       <title>SPECIALIZE pragma</title>
2831
2832       <indexterm><primary>SPECIALIZE pragma</primary></indexterm>
2833       <indexterm><primary>pragma, SPECIALIZE</primary></indexterm>
2834       <indexterm><primary>overloading, death to</primary></indexterm>
2835
2836       <para>(UK spelling also accepted.)  For key overloaded
2837       functions, you can create extra versions (NB: more code space)
2838       specialised to particular types.  Thus, if you have an
2839       overloaded function:</para>
2840
2841 <programlisting>
2842 hammeredLookup :: Ord key => [(key, value)] -> key -> value
2843 </programlisting>
2844
2845       <para>If it is heavily used on lists with
2846       <literal>Widget</literal> keys, you could specialise it as
2847       follows:</para>
2848
2849 <programlisting>
2850 {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
2851 </programlisting>
2852
2853       <para>To get very fancy, you can also specify a named function
2854       to use for the specialised value, as in:</para>
2855
2856 <programlisting>
{-# RULES "hammeredLookup" hammeredLookup = blah #-}
2858 </programlisting>
2859
2860       <para>where <literal>blah</literal> is an implementation of
      <literal>hammeredLookup</literal> written specially for
2862       <literal>Widget</literal> lookups.  It's <emphasis>Your
2863       Responsibility</emphasis> to make sure that
2864       <function>blah</function> really behaves as a specialised
2865       version of <function>hammeredLookup</function>!!!</para>
2866
      <para>Note we use the <literal>RULES</literal> pragma here to
2868       indicate that <literal>hammeredLookup</literal> applied at a
2869       certain type should be replaced by <literal>blah</literal>.  See
2870       <xref linkend="rules"> for more information on
2871       <literal>RULES</literal>.</para>
2872
2873       <para>An example in which using <literal>RULES</literal> for
2874       specialisation will Win Big:
2875
2876 <programlisting>
2877 toDouble :: Real a => a -> Double
2878 toDouble = fromRational . toRational
2879
{-# RULES "toDouble/Int" toDouble = i2d #-}
2881 i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
2882 </programlisting>
2883
2884       The <function>i2d</function> function is virtually one machine
2885       instruction; the default conversion&mdash;via an intermediate
2886       <literal>Rational</literal>&mdash;is obscenely expensive by
2887       comparison.</para>
2888
2889       <para>A <literal>SPECIALIZE</literal> pragma for a function can
2890       be put anywhere its type signature could be put.</para>
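
      <para>Putting the pieces together, a complete module might look
      like the following sketch.  The <literal>Widget</literal> type and
      the body of <function>hammeredLookup</function> are assumptions
      made for illustration; only the signature and the pragma come
      from the text above.</para>

<programlisting>
module Main ( main ) where

-- Hypothetical key type.
newtype Widget = Widget Int deriving (Eq, Ord)

hammeredLookup :: Ord key => [(key, value)] -> key -> value
{-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
hammeredLookup table key =
  case lookup key table of
    Just v  -> v
    Nothing -> error "hammeredLookup: key not found"

main :: IO ()
main = putStrLn (hammeredLookup [(Widget 1, "one"), (Widget 2, "two")] (Widget 2))
</programlisting>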
2891
2892     </sect2>
2893
2894 <sect2 id="specialize-instance-pragma">
2895 <title>SPECIALIZE instance pragma
2896 </title>
2897
2898 <para>
2899 <indexterm><primary>SPECIALIZE pragma</primary></indexterm>
2900 <indexterm><primary>overloading, death to</primary></indexterm>
2901 Same idea, except for instance declarations.  For example:
2902
2903 <programlisting>
2904 instance (Eq a) => Eq (Foo a) where {
2905    {-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
2906    ... usual stuff ...
2907  }
2908 </programlisting>
2909 The pragma must occur inside the <literal>where</literal> part
2910 of the instance declaration.
2911 </para>
2912 <para>
2913 Compatible with HBC, by the way, except perhaps in the placement
2914 of the pragma.
2915 </para>
2916
2917 </sect2>
2918
2919 <sect2 id="line-pragma">
2920 <title>LINE pragma
2921 </title>
2922
2923 <para>
2924 <indexterm><primary>LINE pragma</primary></indexterm>
2925 <indexterm><primary>pragma, LINE</primary></indexterm>
2926 </para>
2927
2928 <para>
2929 This pragma is similar to C's <literal>&num;line</literal> pragma, and is mainly for use in
2930 automatically generated Haskell code.  It lets you specify the line
2931 number and filename of the original code; for example
2932 </para>
2933
2934 <para>
2935
2936 <programlisting>
2937 {-# LINE 42 "Foo.vhs" #-}
2938 </programlisting>
2939
2940 </para>
2941
2942 <para>
2943 if you'd generated the current file from something called <filename>Foo.vhs</filename>
2944 and this line corresponds to line 42 in the original.  GHC will adjust
2945 its error messages to refer to the line/file named in the <literal>LINE</literal>
2946 pragma.
2947 </para>
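
<para>
A generated file might therefore begin as in the following sketch, in which the
file name and line number are made up.  We would expect an error in the
definition of <function>main</function> to be reported against
<filename>Foo.vhs</filename> from line 42 onwards, rather than against the
generated file:
</para>

<programlisting>
module Main ( main ) where

{-# LINE 42 "Foo.vhs" #-}
main :: IO ()
main = putStrLn "generated from Foo.vhs"
</programlisting>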
2948
2949 </sect2>
2950
2951 <sect2 id="rules">
2952 <title>RULES pragma</title>
2953
2954 <para>
2955 The RULES pragma lets you specify rewrite rules.  It is described in
2956 <xref LinkEnd="rewrite-rules">.
2957 </para>
2958
2959 </sect2>
2960
2961 <sect2 id="deprecated-pragma">
2962 <title>DEPRECATED pragma</title>
2963
2964 <para>
2965 The DEPRECATED pragma lets you specify that a particular function, class, or type, is deprecated.
2966 There are two forms.
2967 </para>
2968 <itemizedlist>
2969 <listitem><para>
2970 You can deprecate an entire module thus:</para>
2971 <programlisting>
2972    module Wibble {-# DEPRECATED "Use Wobble instead" #-} where
2973      ...
2974 </programlisting>
2975 <para>
When you compile any module that imports <literal>Wibble</literal>, GHC will print
2977 the specified message.</para>
2978 </listitem>
2979
2980 <listitem>
2981 <para>
2982 You can deprecate a function, class, or type, with the following top-level declaration:
2983 </para>
2984 <programlisting>
2985    {-# DEPRECATED f, C, T "Don't use these" #-}
2986 </programlisting>
2987 <para>
When you compile any module that imports and uses any of the specified entities,
2989 GHC will print the specified message.
2990 </para>
2991 </listitem>
2992 </itemizedlist>
2993 <para>You can suppress the warnings with the flag <option>-fno-warn-deprecations</option>.</para>
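
<para>
For instance, a library that has renamed a function could keep the old name
alive for a while.  The module and function names in this sketch are
hypothetical:
</para>

<programlisting>
module Shapes ( area, oldArea ) where

{-# DEPRECATED oldArea "Use area instead" #-}

area :: Double -> Double -> Double
area w h = w * h

-- Kept only so that existing clients continue to compile.
oldArea :: Double -> Double -> Double
oldArea = area
</programlisting>

<para>
A client that imports <literal>Shapes</literal> and calls
<function>oldArea</function> should then see the message when it is compiled.
</para>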
2994
2995 </sect2>
2996
2997 </sect1>
2998
2999 <sect1 id="rewrite-rules">
3000 <title>Rewrite rules
3001
<indexterm><primary>RULES pragma</primary></indexterm>
3003 <indexterm><primary>pragma, RULES</primary></indexterm>
3004 <indexterm><primary>rewrite rules</primary></indexterm></title>
3005
3006 <para>
3007 The programmer can specify rewrite rules as part of the source program
3008 (in a pragma).  GHC applies these rewrite rules wherever it can.
3009 </para>
3010
3011 <para>
3012 Here is an example:
3013
3014 <programlisting>
3015   {-# RULES
3016         "map/map"       forall f g xs. map f (map g xs) = map (f.g) xs
3017   #-}
3018 </programlisting>
3019
3020 </para>
3021
3022 <sect2>
3023 <title>Syntax</title>
3024
3025 <para>
3026 From a syntactic point of view:
3027
3028 <itemizedlist>
3029 <listitem>
3030
3031 <para>
3032  Each rule has a name, enclosed in double quotes.  The name itself has
3033 no significance at all.  It is only used when reporting how many times the rule fired.
3034 </para>
3035 </listitem>
3036 <listitem>
3037
3038 <para>
3039  There may be zero or more rules in a <literal>RULES</literal> pragma.
3040 </para>
3041 </listitem>
3042 <listitem>
3043
3044 <para>
3045  Layout applies in a <literal>RULES</literal> pragma.  Currently no new indentation level
3046 is set, so you must lay out your rules starting in the same column as the
3047 enclosing definitions.
3048 </para>
3049 </listitem>
3050 <listitem>
3051
3052 <para>
3053  Each variable mentioned in a rule must either be in scope (e.g. <function>map</function>),
3054 or bound by the <literal>forall</literal> (e.g. <function>f</function>, <function>g</function>, <function>xs</function>).  The variables bound by
3055 the <literal>forall</literal> are called the <emphasis>pattern</emphasis> variables.  They are separated
3056 by spaces, just like in a type <literal>forall</literal>.
3057 </para>
3058 </listitem>
3059 <listitem>
3060
3061 <para>
3062  A pattern variable may optionally have a type signature.
3063 If the type of the pattern variable is polymorphic, it <emphasis>must</emphasis> have a type signature.
3064 For example, here is the <literal>foldr/build</literal> rule:
3065
3066 <programlisting>
3067 "fold/build"  forall k z (g::forall b. (a->b->b) -> b -> b) .
3068               foldr k z (build g) = g k z
3069 </programlisting>
3070
3071 Since <function>g</function> has a polymorphic type, it must have a type signature.
3072
3073 </para>
3074 </listitem>
3075 <listitem>
3076
3077 <para>
3078 The left hand side of a rule must consist of a top-level variable applied
3079 to arbitrary expressions.  For example, this is <emphasis>not</emphasis> OK:
3080
3081 <programlisting>
3082 "wrong1"   forall e1 e2.  case True of { True -> e1; False -> e2 } = e1
3083 "wrong2"   forall f.      f True = True
3084 </programlisting>
3085
3086 In <literal>"wrong1"</literal>, the LHS is not an application; in <literal>"wrong2"</literal>, the LHS has a pattern variable
3087 in the head.
3088 </para>
3089 </listitem>
3090 <listitem>
3091
3092 <para>
3093  A rule does not need to be in the same module as (any of) the
3094 variables it mentions, though of course they need to be in scope.
3095 </para>
3096 </listitem>
3097 <listitem>
3098
3099 <para>
3100  Rules are automatically exported from a module, just as instance declarations are.
3101 </para>
3102 </listitem>
3103
3104 </itemizedlist>
3105
3106 </para>
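
<para>
The points above can be seen together in a small, self-contained module.  This
is only a sketch: <function>double</function> and <function>quadruple</function>
are invented names, and the <literal>NOINLINE</literal> pragma is there on the
assumption that it keeps <function>double</function> visible for the rule to
match:
</para>

<programlisting>
module Main ( main ) where

{-# NOINLINE double #-}
double :: Int -> Int
double x = x + x

quadruple :: Int -> Int
quadruple x = 4 * x

-- The LHS is the top-level variable double applied to an expression;
-- x is the only pattern variable.
{-# RULES
"double/double"  forall x.  double (double x) = quadruple x
  #-}

main :: IO ()
main = print (double (double 5))
</programlisting>

<para>
Both sides of the rule have type <literal>Int</literal> and denote the same
value, so the program prints the same answer whether or not the rule fires.
</para>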
3107
3108 </sect2>
3109
3110 <sect2>
3111 <title>Semantics</title>
3112
3113 <para>
3114 From a semantic point of view:
3115
3116 <itemizedlist>
3117 <listitem>
3118
3119 <para>
3120 Rules are only applied if you use the <option>-O</option> flag.
3121 </para>
3122 </listitem>
3123
3124 <listitem>
3125 <para>
3126  Rules are regarded as left-to-right rewrite rules.
3127 When GHC finds an expression that is a substitution instance of the LHS
3128 of a rule, it replaces the expression by the (appropriately-substituted) RHS.
3129 By "a substitution instance" we mean that the LHS can be made equal to the
3130 expression by substituting for the pattern variables.
3131
3132 </para>
3133 </listitem>
3134 <listitem>
3135
3136 <para>
3137  The LHS and RHS of a rule are typechecked, and must have the
3138 same type.
3139
3140 </para>
3141 </listitem>
3142 <listitem>
3143
3144 <para>
3145  GHC makes absolutely no attempt to verify that the LHS and RHS
of a rule have the same meaning.  That is undecidable in general, and
3147 infeasible in most interesting cases.  The responsibility is entirely the programmer's!
3148
3149 </para>
3150 </listitem>
3151 <listitem>
3152
3153 <para>
3154  GHC makes no attempt to make sure that the rules are confluent or
3155 terminating.  For example:
3156
3157 <programlisting>
  "loop"        forall x y.  f x y = f y x
3159 </programlisting>
3160
3161 This rule will cause the compiler to go into an infinite loop.
3162
3163 </para>
3164 </listitem>
3165 <listitem>
3166
3167 <para>
3168  If more than one rule matches a call, GHC will choose one arbitrarily to apply.
3169
3170 </para>
3171 </listitem>
3172 <listitem>
3173 <para>
3174  GHC currently uses a very simple, syntactic, matching algorithm
3175 for matching a rule LHS with an expression.  It seeks a substitution
3176 which makes the LHS and expression syntactically equal modulo alpha
3177 conversion.  The pattern (rule), but not the expression, is eta-expanded if
necessary.  (Eta-expanding the expression can lead to laziness bugs.)
GHC does not match modulo beta conversion (that would be higher-order matching).
3180 </para>
3181
3182 <para>
3183 Matching is carried out on GHC's intermediate language, which includes
3184 type abstractions and applications.  So a rule only matches if the
3185 types match too.  See <xref LinkEnd="rule-spec"> below.
3186 </para>
3187 </listitem>
3188 <listitem>
3189
3190 <para>
3191  GHC keeps trying to apply the rules as it optimises the program.
3192 For example, consider:
3193
3194 <programlisting>
3195   let s = map f
3196       t = map g
3197   in
3198   s (t xs)
3199 </programlisting>
3200
3201 The expression <literal>s (t xs)</literal> does not match the rule <literal>"map/map"</literal>, but GHC
3202 will substitute for <VarName>s</VarName> and <VarName>t</VarName>, giving an expression which does match.
3203 If <VarName>s</VarName> or <VarName>t</VarName> was (a) used more than once, and (b) large or a redex, then it would
3204 not be substituted, and the rule would not fire.
3205
3206 </para>
3207 </listitem>
3208 <listitem>
3209
3210 <para>
3211  In the earlier phases of compilation, GHC inlines <emphasis>nothing
3212 that appears on the LHS of a rule</emphasis>, because once you have substituted
3213 for something you can't match against it (given the simple minded
3214 matching).  So if you write the rule
3215
3216 <programlisting>
        "map/map"       forall f g.  map f . map g = map (f.g)
3218 </programlisting>
3219
3220 this <emphasis>won't</emphasis> match the expression <literal>map f (map g xs)</literal>.
3221 It will only match something written with explicit use of ".".
3222 Well, not quite.  It <emphasis>will</emphasis> match the expression
3223
3224 <programlisting>
3225 wibble f g xs
3226 </programlisting>
3227
3228 where <function>wibble</function> is defined:
3229
3230 <programlisting>
3231 wibble f g = map f . map g
3232 </programlisting>
3233
3234 because <function>wibble</function> will be inlined (it's small).
3235
3236 Later on in compilation, GHC starts inlining even things on the
3237 LHS of rules, but still leaves the rules enabled.  This inlining
3238 policy is controlled by the per-simplification-pass flag <option>-finline-phase</option><emphasis>n</emphasis>.
3239
3240 </para>
3241 </listitem>
3242 <listitem>
3243
3244 <para>
3245  All rules are implicitly exported from the module, and are therefore
3246 in force in any module that imports the module that defined the rule, directly
3247 or indirectly.  (That is, if A imports B, which imports C, then C's rules are
3248 in force when compiling A.)  The situation is very similar to that for instance
3249 declarations.
3250 </para>
3251 </listitem>
3252
3253 </itemizedlist>
3254
3255 </para>
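
<para>
To see why the responsibility for correctness rests with the programmer,
consider this deliberately wrong rule.  The names are invented for
illustration, and whether the rule actually fires depends on the
optimisation settings:
</para>

<programlisting>
module Main ( main ) where

{-# NOINLINE twice #-}
twice :: Int -> Int
twice x = x + x

-- Typechecks (both sides are Int) but does NOT preserve meaning.
{-# RULES
"twice/wrong"  forall x.  twice x = x
  #-}

main :: IO ()
main = print (twice 21)
</programlisting>

<para>
Without <option>-O</option> the rule is ignored and the program computes
<literal>21 + 21</literal>; with <option>-O</option> the call may be rewritten
to plain <literal>21</literal>.  GHC accepts the rule without complaint in
either case.
</para>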
3256
3257 </sect2>
3258
3259 <sect2>
3260 <title>List fusion</title>
3261
3262 <para>
3263 The RULES mechanism is used to implement fusion (deforestation) of common list functions.
3264 If a "good consumer" consumes an intermediate list constructed by a "good producer", the
3265 intermediate list should be eliminated entirely.
3266 </para>
3267
3268 <para>
3269 The following are good producers:
3270
3271 <itemizedlist>
3272 <listitem>
3273
3274 <para>
3275  List comprehensions
3276 </para>
3277 </listitem>
3278 <listitem>
3279
3280 <para>
3281  Enumerations of <literal>Int</literal> and <literal>Char</literal> (e.g. <literal>['a'..'z']</literal>).
3282 </para>
3283 </listitem>
3284 <listitem>
3285
3286 <para>
3287  Explicit lists (e.g. <literal>[True, False]</literal>)
3288 </para>
3289 </listitem>
3290 <listitem>
3291
3292 <para>
 The cons constructor (e.g. <literal>3:4:[]</literal>)
3294 </para>
3295 </listitem>
3296 <listitem>
3297
3298 <para>
3299  <function>++</function>
3300 </para>
3301 </listitem>
3302
3303 <listitem>
3304 <para>
3305  <function>map</function>
3306 </para>
3307 </listitem>
3308
3309 <listitem>
3310 <para>
3311  <function>filter</function>
3312 </para>
3313 </listitem>
3314 <listitem>
3315
3316 <para>
3317  <function>iterate</function>, <function>repeat</function>
3318 </para>
3319 </listitem>
3320 <listitem>
3321
3322 <para>
3323  <function>zip</function>, <function>zipWith</function>
3324 </para>
3325 </listitem>
3326
3327 </itemizedlist>
3328
3329 </para>
3330
3331 <para>
3332 The following are good consumers:
3333
3334 <itemizedlist>
3335 <listitem>
3336
3337 <para>
3338  List comprehensions
3339 </para>
3340 </listitem>
3341 <listitem>
3342
3343 <para>
3344  <function>array</function> (on its second argument)
3345 </para>
3346 </listitem>
3347 <listitem>
3348
3349 <para>
3350  <function>length</function>
3351 </para>
3352 </listitem>
3353 <listitem>
3354
3355 <para>
3356  <function>++</function> (on its first argument)
3357 </para>
3358 </listitem>
3359
3360 <listitem>
3361 <para>
3362  <function>foldr</function>
3363 </para>
3364 </listitem>
3365
3366 <listitem>
3367 <para>
3368  <function>map</function>
3369 </para>
3370 </listitem>
3371 <listitem>
3372
3373 <para>
3374  <function>filter</function>
3375 </para>
3376 </listitem>
3377 <listitem>
3378
3379 <para>
3380  <function>concat</function>
3381 </para>
3382 </listitem>
3383 <listitem>
3384
3385 <para>
3386  <function>unzip</function>, <function>unzip2</function>, <function>unzip3</function>, <function>unzip4</function>
3387 </para>
3388 </listitem>
3389 <listitem>
3390
3391 <para>
3392  <function>zip</function>, <function>zipWith</function> (but on one argument only; if both are good producers, <function>zip</function>
3393 will fuse with one but not the other)
3394 </para>
3395 </listitem>
3396 <listitem>
3397
3398 <para>
3399  <function>partition</function>
3400 </para>
3401 </listitem>
3402 <listitem>
3403
3404 <para>
3405  <function>head</function>
3406 </para>
3407 </listitem>
3408 <listitem>
3409
3410 <para>
3411  <function>and</function>, <function>or</function>, <function>any</function>, <function>all</function>
3412 </para>
3413 </listitem>
3414 <listitem>
3415
3416 <para>
3417  <function>sequence&lowbar;</function>
3418 </para>
3419 </listitem>
3420 <listitem>
3421
3422 <para>
3423  <function>msum</function>
3424 </para>
3425 </listitem>
3426 <listitem>
3427
3428 <para>
3429  <function>sortBy</function>
3430 </para>
3431 </listitem>
3432
3433 </itemizedlist>
3434
3435 </para>
3436
3437 <para>
3438 So, for example, the following should generate no intermediate lists:
3439
3440 <programlisting>
3441 array (1,10) [(i,i*i) | i &#60;- map (+ 1) [0..9]]
3442 </programlisting>
3443
3444 </para>
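
<para>
As a further sketch, the pipeline below is built only from functions in the
two lists: an <literal>Int</literal> enumeration feeds
<function>filter</function>, which feeds <function>map</function>, which is
consumed by <function>foldr</function>.  With <option>-O</option> we would
hope for a single loop with no intermediate lists, though that is not
something the program itself can observe:
</para>

<programlisting>
module Main ( main ) where

-- Sum of the doubled even numbers between 1 and 100.
total :: Int
total = foldr (+) 0 (map (* 2) (filter even [1 .. 100]))

main :: IO ()
main = print total
</programlisting>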
3445
3446 <para>
3447 This list could readily be extended; if there are Prelude functions that you use
3448 a lot which are not included, please tell us.
3449 </para>
3450
3451 <para>
3452 If you want to write your own good consumers or producers, look at the
3453 Prelude definitions of the above functions to see how to do so.
3454 </para>
3455
3456 </sect2>
3457
3458 <sect2 id="rule-spec">
3459 <title>Specialisation
3460 </title>
3461
3462 <para>
3463 Rewrite rules can be used to get the same effect as a feature
present in earlier versions of GHC:
3465
3466 <programlisting>
3467   {-# SPECIALIZE fromIntegral :: Int8 -> Int16 = int8ToInt16 #-}
3468 </programlisting>
3469
3470 This told GHC to use <function>int8ToInt16</function> instead of <function>fromIntegral</function> whenever
3471 the latter was called with type <literal>Int8 -&gt; Int16</literal>.  That is, rather than
3472 specialising the original definition of <function>fromIntegral</function> the programmer is
3473 promising that it is safe to use <function>int8ToInt16</function> instead.
3474 </para>
3475
3476 <para>
3477 This feature is no longer in GHC.  But rewrite rules let you do the
3478 same thing:
3479
3480 <programlisting>
3481 {-# RULES
3482   "fromIntegral/Int8/Int16" fromIntegral = int8ToInt16
3483 #-}
3484 </programlisting>
3485
3486 This slightly odd-looking rule instructs GHC to replace <function>fromIntegral</function>
3487 by <function>int8ToInt16</function> <emphasis>whenever the types match</emphasis>.  Speaking more operationally,
3488 GHC adds the type and dictionary applications to get the typed rule
3489
3490 <programlisting>
3491 forall (d1::Integral Int8) (d2::Num Int16) .
3492         fromIntegral Int8 Int16 d1 d2 = int8ToInt16
3493 </programlisting>
3494
3495 What is more,
this rule does not need to be in the same file as <function>fromIntegral</function>,
3497 unlike the <literal>SPECIALISE</literal> pragmas which currently do (so that they
3498 have an original definition available to specialise).
3499 </para>
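
<para>
The same technique works for your own overloaded functions.  In the following
sketch <function>convert</function> and <function>intToDouble</function> are
invented names, and <literal>NOINLINE</literal> is used on the assumption that
it keeps <function>convert</function> available for matching:
</para>

<programlisting>
module Main ( main ) where

{-# NOINLINE convert #-}
convert :: (Integral a, Num b) => a -> b
convert = fromInteger . toInteger

-- A cheaper route for one particular pair of types.
intToDouble :: Int -> Double
intToDouble = fromIntegral

-- The type of the RHS fixes the types at which the rule applies.
{-# RULES
"convert/Int/Double"  convert = intToDouble
  #-}

main :: IO ()
main = print (convert (3 :: Int) :: Double)
</programlisting>

<para>
The rule applies only to uses of <function>convert</function> at type
<literal>Int -&gt; Double</literal>; uses at other types are left alone.
</para>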
3500
3501 </sect2>
3502
3503 <sect2>
3504 <title>Controlling what's going on</title>
3505
3506 <para>
3507
3508 <itemizedlist>
3509 <listitem>
3510
3511 <para>
3512  Use <option>-ddump-rules</option> to see what transformation rules GHC is using.
3513 </para>
3514 </listitem>
3515 <listitem>
3516
3517 <para>
3518  Use <option>-ddump-simpl-stats</option> to see what rules are being fired.
3519 If you add <option>-dppr-debug</option> you get a more detailed listing.
3520 </para>
3521 </listitem>
3522 <listitem>
3523
3524 <para>
 The definition of (say) <function>build</function> in <FileName>PrelBase.lhs</FileName> looks like this:
3526
3527 <programlisting>
3528         build   :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
3529         {-# INLINE build #-}
3530         build g = g (:) []
3531 </programlisting>
3532
3533 Notice the <literal>INLINE</literal>!  That prevents <literal>(:)</literal> from being inlined when compiling
3534 <literal>PrelBase</literal>, so that an importing module will &ldquo;see&rdquo; the <literal>(:)</literal>, and can
3535 match it on the LHS of a rule.  <literal>INLINE</literal> prevents any inlining happening
3536 in the RHS of the <literal>INLINE</literal> thing.  I regret the delicacy of this.
3537
3538 </para>
3539 </listitem>
3540 <listitem>
3541
3542 <para>
3543  In <filename>ghc/lib/std/PrelBase.lhs</filename> look at the rules for <function>map</function> to
3544 see how to write rules that will do fusion and yet give an efficient
3545 program even if fusion doesn't happen.  More rules in <filename>PrelList.lhs</filename>.
3546 </para>
3547 </listitem>
3548
3549 </itemizedlist>
3550
3551 </para>
3552
3553 </sect2>
3554
3555 </sect1>
3556
3557 <sect1 id="generic-classes">
3558 <title>Generic classes</title>
3559
3560     <para>(Note: support for generic classes is currently broken in
3561     GHC 5.02).</para>
3562
3563 <para>
3564 The ideas behind this extension are described in detail in "Derivable type classes",
3565 Ralf Hinze and Simon Peyton Jones, Haskell Workshop, Montreal Sept 2000, pp94-105.
3566 An example will give the idea:
3567 </para>
3568
3569 <programlisting>
3570   import Generics
3571
3572   class Bin a where
3573     toBin   :: a -> [Int]
3574     fromBin :: [Int] -> (a, [Int])
3575
3576     toBin {| Unit |}    Unit      = []
3577     toBin {| a :+: b |} (Inl x)   = 0 : toBin x
3578     toBin {| a :+: b |} (Inr y)   = 1 : toBin y
3579     toBin {| a :*: b |} (x :*: y) = toBin x ++ toBin y
3580
3581     fromBin {| Unit |}    bs      = (Unit, bs)
3582     fromBin {| a :+: b |} (0:bs)  = (Inl x, bs')    where (x,bs') = fromBin bs
3583     fromBin {| a :+: b |} (1:bs)  = (Inr y, bs')    where (y,bs') = fromBin bs
3584     fromBin {| a :*: b |} bs      = (x :*: y, bs'') where (x,bs' ) = fromBin bs
3585                                                           (y,bs'') = fromBin bs'
3586 </programlisting>
3587 <para>
3588 This class declaration explains how <literal>toBin</literal> and <literal>fromBin</literal>
3589 work for arbitrary data types.  They do so by giving cases for unit, product, and sum,
3590 which are defined thus in the library module <literal>Generics</literal>:
3591 </para>
3592 <programlisting>
3593   data Unit    = Unit
3594   data a :+: b = Inl a | Inr b
3595   data a :*: b = a :*: b
3596 </programlisting>
3597 <para>
3598 Now you can make a data type into an instance of Bin like this:
3599 <programlisting>
3600   instance (Bin a, Bin b) => Bin (a,b)
3601   instance Bin a => Bin [a]
3602 </programlisting>
That is, just leave off the "where" clause.  Of course, you can put in the
where clause and override whichever methods you please.
3605 </para>
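
<para>
To give a feel for what an empty instance declaration stands for, here is a
hand-written sketch in ordinary Haskell of the kind of code involved for
<literal>Bool</literal>.  It is an illustration of the idea, not the code GHC
generates: the prefix types <literal>Sum</literal> and <literal>Prod</literal>
stand in for <literal>:+:</literal> and <literal>:*:</literal>, and the
conversion functions <function>fromBool</function> and
<function>toBool</function> are hypothetical names:
</para>

<programlisting>
module Main ( main ) where

data Unit     = Unit
data Sum a b  = Inl a | Inr b
data Prod a b = Prod a b

class Bin a where
  toBin   :: a -> [Int]
  fromBin :: [Int] -> (a, [Int])

instance Bin Unit where
  toBin Unit = []
  fromBin bs = (Unit, bs)

instance (Bin a, Bin b) => Bin (Sum a b) where
  toBin (Inl x) = 0 : toBin x
  toBin (Inr y) = 1 : toBin y
  fromBin (0 : bs) = let (x, bs') = fromBin bs in (Inl x, bs')
  fromBin (1 : bs) = let (y, bs') = fromBin bs in (Inr y, bs')
  fromBin _        = error "fromBin: bad tag"

instance (Bin a, Bin b) => Bin (Prod a b) where
  toBin (Prod x y) = toBin x ++ toBin y
  fromBin bs = let (x, bs')  = fromBin bs
                   (y, bs'') = fromBin bs'
               in (Prod x y, bs'')

-- Bool viewed as a sum of two units.
fromBool :: Bool -> Sum Unit Unit
fromBool False = Inl Unit
fromBool True  = Inr Unit

toBool :: Sum Unit Unit -> Bool
toBool (Inl Unit) = False
toBool (Inr Unit) = True

-- What an empty "instance Bin Bool" would amount to.
instance Bin Bool where
  toBin      = toBin . fromBool
  fromBin bs = let (r, bs') = fromBin bs in (toBool r, bs')

main :: IO ()
main = do
  print (toBin True)
  print (fst (fromBin [0, 1] :: (Bool, [Int])))
</programlisting>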
3606
3607     <sect2>
3608       <title> Using generics </title>
3609       <para>To use generics you need to</para>
3610       <itemizedlist>
3611         <listitem>
3612           <para>Use the flags <option>-fglasgow-exts</option> (to enable the extra syntax),
3613                 <option>-fgenerics</option> (to generate extra per-data-type code),
                and <option>-package lang</option> (to make the <literal>Generics</literal> library
                available).</para>
3616         </listitem>
3617         <listitem>
3618           <para>Import the module <literal>Generics</literal> from the
3619           <literal>lang</literal> package.  This import brings into
3620           scope the data types <literal>Unit</literal>,
3621           <literal>:*:</literal>, and <literal>:+:</literal>.  (You
3622           don't need this import if you don't mention these types
3623           explicitly; for example, if you are simply giving instance
3624           declarations.)</para>
3625         </listitem>
3626       </itemizedlist>
3627     </sect2>
3628
3629 <sect2> <title> Changes wrt the paper </title>
3630 <para>
3631 Note that the type constructors <literal>:+:</literal> and <literal>:*:</literal>
3632 can be written infix (indeed, you can now use
3633 any operator starting in a colon as an infix type constructor).  Also note that
3634 the type constructors are not exactly as in the paper (Unit instead of 1, etc).
3635 Finally, note that the syntax of the type patterns in the class declaration
3636 uses "<literal>{|</literal>" and "<literal>|}</literal>" brackets; curly braces
alone would be ambiguous when they appear on right hand sides (an extension we
3638 anticipate wanting).
3639 </para>
3640 </sect2>
3641
3642 <sect2> <title>Terminology and restrictions</title>
3643 <para>
3644 Terminology.  A "generic default method" in a class declaration
3645 is one that is defined using type patterns as above.
3646 A "polymorphic default method" is a default method defined as in Haskell 98.
3647 A "generic class declaration" is a class declaration with at least one
3648 generic default method.
3649 </para>
3650
3651 <para>
3652 Restrictions:
3653 <itemizedlist>
3654 <listitem>
3655 <para>
3656 Alas, we do not yet implement the stuff about constructor names and
3657 field labels.
3658 </para>
3659 </listitem>
3660
3661 <listitem>
3662 <para>
3663 A generic class can have only one parameter; you can't have a generic
3664 multi-parameter class.
3665 </para>
3666 </listitem>
3667
3668 <listitem>
3669 <para>
3670 A default method must be defined entirely using type patterns, or entirely
3671 without.  So this is illegal:
3672 <programlisting>
3673   class Foo a where
3674     op :: a -> (a, Bool)
3675     op {| Unit |} Unit = (Unit, True)
3676     op x               = (x,    False)
3677 </programlisting>
3678 However it is perfectly OK for some methods of a generic class to have
3679 generic default methods and others to have polymorphic default methods.
3680 </para>
3681 </listitem>
3682
3683 <listitem>
3684 <para>
3685 The type variable(s) in the type pattern for a generic method declaration
scope over the right hand side.  So this is legal (note the use of the type variable <literal>p</literal> in a type signature on the right hand side):
3687 <programlisting>
3688   class Foo a where
3689     op :: a -> Bool
3690     op {| p :*: q |} (x :*: y) = op (x :: p)
3691     ...
3692 </programlisting>
3693 </para>
3694 </listitem>
3695
3696 <listitem>
3697 <para>
3698 The type patterns in a generic default method must take one of the forms:
3699 <programlisting>
3700        a :+: b
3701        a :*: b
3702        Unit
3703 </programlisting>
3704 where "a" and "b" are type variables.  Furthermore, all the type patterns for
3705 a single type constructor (<literal>:*:</literal>, say) must be identical; they
3706 must use the same type variables.  So this is illegal:
3707 <programlisting>
3708   class Foo a where
3709     op :: a -> Bool
3710     op {| a :+: b |} (Inl x) = True
3711     op {| p :+: q |} (Inr y) = False
3712 </programlisting>
3713 The type patterns must be identical, even in equations for different methods of the class.
3714 So this too is illegal:
3715 <programlisting>
3716   class Foo a where
3717     op1 :: a -> Bool
3718     op1 {| a :*: b |} (x :*: y) = True
3719
3720     op2 :: a -> Bool
3721     op2 {| p :*: q |} (x :*: y) = False
3722 </programlisting>
(The reason for this restriction is that we gather all the equations for a particular type constructor
3724 into a single generic instance declaration.)
3725 </para>
3726 </listitem>
3727
3728 <listitem>
3729 <para>
3730 A generic method declaration must give a case for each of the three type constructors.
3731 </para>
3732 </listitem>
3733
3734 <listitem>
3735 <para>
3736 The type for a generic method can be built only from:
3737   <itemizedlist>
3738   <listitem> <para> Function arrows </para> </listitem>
3739   <listitem> <para> Type variables </para> </listitem>
3740   <listitem> <para> Tuples </para> </listitem>
3741   <listitem> <para> Arbitrary types not involving type variables </para> </listitem>
3742   </itemizedlist>
3743 Here are some example type signatures for generic methods:
3744 <programlisting>
3745     op1 :: a -> Bool
3746     op2 :: Bool -> (a,Bool)
3747     op3 :: [Int] -> a -> a
3748     op4 :: [a] -> Bool
3749 </programlisting>
3750 Here, op1, op2, op3 are OK, but op4 is rejected, because it has a type variable
3751 inside a list.
3752 </para>
3753 <para>
This restriction is an implementation restriction: we just haven't got around to
3755 implementing the necessary bidirectional maps over arbitrary type constructors.
3756 It would be relatively easy to add specific type constructors, such as Maybe and list,
3757 to the ones that are allowed.</para>
3758 </listitem>
3759
3760 <listitem>
3761 <para>
3762 In an instance declaration for a generic class, the idea is that the compiler
will fill in the methods for you, based on the generic templates.  However, it can only
3764 do so if
3765   <itemizedlist>
3766   <listitem>
3767   <para>
3768   The instance type is simple (a type constructor applied to type variables, as in Haskell 98).
3769   </para>
3770   </listitem>
3771   <listitem>
3772   <para>
3773   No constructor of the instance type has unboxed fields.
3774   </para>
3775   </listitem>
3776   </itemizedlist>
3777 (Of course, these things can only arise if you are already using GHC extensions.)
However, you can still give instance declarations for types which break these rules,
3779 provided you give explicit code to override any generic default methods.
3780 </para>
3781 </listitem>
3782
3783 </itemizedlist>
3784 </para>
3785
3786 <para>
3787 The option <option>-ddump-deriv</option> dumps incomprehensible stuff giving details of
3788 what the compiler does with generic declarations.
3789 </para>
3790
3791 </sect2>
3792
3793 <sect2> <title> Another example </title>
3794 <para>
3795 Just to finish with, here's another example I rather like:
3796 <programlisting>
3797   class Tag a where
3798     nCons :: a -> Int
3799     nCons {| Unit |}    _ = 1
3800     nCons {| a :*: b |} _ = 1
3801     nCons {| a :+: b |} _ = nCons (bot::a) + nCons (bot::b)
3802
3803     tag :: a -> Int
3804     tag {| Unit |}    _       = 1
3805     tag {| a :*: b |} _       = 1
3806     tag {| a :+: b |} (Inl x) = tag x
3807     tag {| a :+: b |} (Inr y) = nCons (bot::a) + tag y
3808 </programlisting>
3809 </para>
3810 </sect2>
3811 </sect1>
3812
3813 <sect1 id="newtype-deriving">
3814 <title>Generalised derived instances for newtypes</title>
3815
3816 <para>
3817 When you define an abstract type using <literal>newtype</literal>, you may want
3818 the new type to inherit some instances from its representation. In
3819 Haskell 98, you can inherit instances of <literal>Eq</literal>, <literal>Ord</literal>,
3820 <literal>Enum</literal> and <literal>Bounded</literal> by deriving them, but for any
3821 other classes you have to write an explicit instance declaration. For
3822 example, if you define
3823
3824 <programlisting>
3825   newtype Dollars = Dollars Int
3826 </programlisting>
3827
3828 and you want to use arithmetic on <literal>Dollars</literal>, you have to
3829 explicitly define an instance of <literal>Num</literal>:
3830
3831 <programlisting>
3832   instance Num Dollars where
3833     Dollars a + Dollars b = Dollars (a+b)
3834     ...
3835 </programlisting>
3836 All the instance does is apply and remove the <literal>newtype</literal>
3837 constructor. It is particularly galling that, since the constructor
3838 doesn't appear at run-time, this instance declaration defines a
3839 dictionary which is <emphasis>wholly equivalent</emphasis> to the <literal>Int</literal>
3840 dictionary, only slower!
3841 </para>
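<para>
For concreteness, the complete hand-written instance might look like the
sketch below.  It assumes a recent GHC, where <literal>Num</literal> has the
methods shown; the <literal>deriving (Eq, Show)</literal> clause is an addition
made here only so that results can be compared and printed.
<programlisting>
  newtype Dollars = Dollars Int deriving (Eq, Show)

  -- Every method only unwraps its arguments, calls the Int method,
  -- and wraps the result again.
  instance Num Dollars where
    Dollars a + Dollars b = Dollars (a + b)
    Dollars a - Dollars b = Dollars (a - b)
    Dollars a * Dollars b = Dollars (a * b)
    negate (Dollars a)    = Dollars (negate a)
    abs (Dollars a)       = Dollars (abs a)
    signum (Dollars a)    = Dollars (signum a)
    fromInteger n         = Dollars (fromInteger n)
</programlisting>
With this instance, <literal>Dollars 2 + Dollars 3 * 4</literal> evaluates to
<literal>Dollars 14</literal>; the literal <literal>4</literal> is converted by
<literal>fromInteger</literal>.
</para>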
3842
3843 <sect2> <title> Generalising the deriving clause </title>
3844 <para>
3845 GHC now permits such instances to be derived instead, so one can write
3846 <programlisting>
3847   newtype Dollars = Dollars Int deriving (Eq,Show,Num)
3848 </programlisting>
3849
3850 and the implementation uses the <emphasis>same</emphasis> <literal>Num</literal> dictionary
3851 for <literal>Dollars</literal> as for <literal>Int</literal>. Notionally, the compiler
3852 derives an instance declaration of the form
3853
3854 <programlisting>
3855   instance Num Int => Num Dollars
3856 </programlisting>
3857
3858 which just adds or removes the <literal>newtype</literal> constructor according to the type.
3859 </para>
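<para>
As a minimal sketch of the derived version: the listing below assumes a recent
GHC in which the extension is switched on with the
<literal>GeneralizedNewtypeDeriving</literal> language pragma (the text above
uses <option>-fglasgow-exts</option>), and <literal>total</literal> is a
hypothetical helper introduced only for illustration.
<programlisting>
  {-# LANGUAGE GeneralizedNewtypeDeriving #-}

  newtype Dollars = Dollars Int deriving (Eq, Show, Num)

  -- Sum a list of amounts using the Num instance inherited from Int.
  total :: [Dollars] -> Dollars
  total = sum
</programlisting>
Here <literal>total [Dollars 3, 4, 5]</literal> evaluates to
<literal>Dollars 12</literal>, with no hand-written <literal>Num</literal>
instance in sight.
</para>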
3860 <para>
3861
3862 We can also derive instances of constructor classes in a similar
3863 way. For example, suppose we have implemented state and failure monad
3864 transformers, such that
3865
3866 <programlisting>
3867   instance Monad m => Monad (State s m)
3868   instance Monad m => Monad (Failure m)
3869 </programlisting>
3870 In Haskell 98, we can define a parsing monad by
3871 <programlisting>
3872   type Parser tok m a = State [tok] (Failure m) a
3873 </programlisting>
3874
3875 which is automatically a monad thanks to the instance declarations
3876 above. With the extension, we can make the parser type abstract,
3877 without needing to write an instance of class <literal>Monad</literal>, via
3878
3879 <programlisting>
3880   newtype Parser tok m a = Parser (State [tok] (Failure m) a)
3881                          deriving Monad
3882 </programlisting>
3883 In this case the derived instance declaration is of the form
3884 <programlisting>
3885   instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)
3886 </programlisting>
3887
3888 Notice that, since <literal>Monad</literal> is a constructor class, the
3889 instance is a <emphasis>partial application</emphasis> of the new type, not the
3890 entire left hand side. We can imagine that the type declaration is
&ldquo;eta-converted&rdquo; to generate the context of the instance
3892 declaration.
3893 </para>
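<para>
The following self-contained sketch shows the idea end to end.  The definitions
of <literal>State</literal> and <literal>Failure</literal> are hypothetical
minimal stand-ins for the transformers mentioned above, and
<literal>item</literal>, <literal>twoItems</literal> and
<literal>runParser</literal> are names invented for the example.  It assumes a
recent GHC, which enables the extension with the
<literal>GeneralizedNewtypeDeriving</literal> pragma and requires
<literal>Functor</literal> and <literal>Applicative</literal> instances
alongside <literal>Monad</literal>.
<programlisting>
  {-# LANGUAGE GeneralizedNewtypeDeriving #-}

  import Control.Applicative (Applicative (..))
  import Control.Monad (liftM, liftM2)

  -- Hypothetical minimal transformers standing in for State and Failure.
  newtype Failure m a = Failure { runFailure :: m (Maybe a) }
  newtype State s m a = State { runState :: s -> m (a, s) }

  instance Monad m => Functor (Failure m) where
    fmap = liftM
  instance Monad m => Applicative (Failure m) where
    pure x = Failure (return (Just x))
    liftA2 = liftM2
  instance Monad m => Monad (Failure m) where
    Failure mx >>= f = Failure (mx >>= maybe (return Nothing) (runFailure . f))

  instance Monad m => Functor (State s m) where
    fmap = liftM
  instance Monad m => Applicative (State s m) where
    pure x = State (\s -> return (x, s))
    liftA2 = liftM2
  instance Monad m => Monad (State s m) where
    State g >>= f = State (\s -> g s >>= \(x, s') -> runState (f x) s')

  -- The abstract parser type: its instances reuse those of the representation.
  newtype Parser tok m a = Parser (State [tok] (Failure m) a)
    deriving (Functor, Applicative, Monad)

  -- Consume one token, failing on empty input.
  item :: Monad m => Parser tok m tok
  item = Parser (State step)
    where
      step []       = Failure (return Nothing)
      step (t : ts) = Failure (return (Just (t, ts)))

  -- Sequence two parsers using the derived (>>=).
  twoItems :: Monad m => Parser tok m (tok, tok)
  twoItems = item >>= \x -> item >>= \y -> return (x, y)

  -- Unwrap the newtype and run both layers on a token list.
  runParser :: Parser tok m a -> [tok] -> m (Maybe (a, [tok]))
  runParser (Parser p) toks = runFailure (runState p toks)
</programlisting>
Taking <literal>Maybe</literal> as the underlying monad,
<literal>runParser twoItems "abc"</literal> gives
<literal>Just (Just (('a','b'),"c"))</literal>, while
<literal>runParser twoItems "a"</literal> gives <literal>Just Nothing</literal>
because the second <literal>item</literal> fails.
</para>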
3894 <para>
3895
3896 We can even derive instances of multi-parameter classes, provided the
newtype is the last class parameter. In this case, a &ldquo;partial
application&rdquo; of the class appears in the <literal>deriving</literal>
3899 clause. For example, given the class
3900
3901 <programlisting>
3902   class StateMonad s m | m -> s where ...
3903   instance Monad m => StateMonad s (State s m) where ...
3904 </programlisting>
3905 then we can derive an instance of <literal>StateMonad</literal> for <literal>Parser</literal>s by
3906 <programlisting>
3907   newtype Parser tok m a = Parser (State [tok] (Failure m) a)
3908                          deriving (Monad, StateMonad [tok])
3909 </programlisting>
3910
3911 The derived instance is obtained by completing the application of the
3912 class to the new type:
3913
3914 <programlisting>
3915   instance StateMonad [tok] (State [tok] (Failure m)) =>
3916            StateMonad [tok] (Parser tok m)
3917 </programlisting>
3918 </para>
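<para>
A runnable sketch of the multi-parameter case follows.  Several things in it
are assumptions rather than part of the text above: the methods
<literal>get</literal> and <literal>put</literal> (the class body is elided in
the original), the minimal <literal>State</literal> definition, the helper
names <literal>pop</literal> and <literal>runParser</literal>, and the language
pragmas, which are what a recent GHC expects.  To keep the listing short the
<literal>Failure</literal> layer is left out of <literal>Parser</literal>.
<programlisting>
  {-# LANGUAGE FlexibleInstances, FunctionalDependencies,
               GeneralizedNewtypeDeriving, MultiParamTypeClasses #-}

  import Control.Applicative (Applicative (..))
  import Control.Monad (liftM, liftM2)

  -- Hypothetical minimal state transformer standing in for State.
  newtype State s m a = State { runState :: s -> m (a, s) }

  instance Monad m => Functor (State s m) where
    fmap = liftM
  instance Monad m => Applicative (State s m) where
    pure x = State (\s -> return (x, s))
    liftA2 = liftM2
  instance Monad m => Monad (State s m) where
    State g >>= f = State (\s -> g s >>= \(x, s') -> runState (f x) s')

  -- The methods get and put are assumptions; the text elides the class body.
  class StateMonad s m | m -> s where
    get :: m s
    put :: s -> m ()

  instance Monad m => StateMonad s (State s m) where
    get   = State (\s -> return (s, s))
    put s = State (\_ -> return ((), s))

  -- Simplified Parser (no Failure layer).  The newtype is the last class
  -- parameter, so the deriving clause names the partial application.
  newtype Parser tok m a = Parser (State [tok] m a)
    deriving (Functor, Applicative, Monad, StateMonad [tok])

  -- Remove and return the first token, if there is one.
  pop :: Monad m => Parser tok m (Maybe tok)
  pop = get >>= \ts -> case ts of
    []       -> return Nothing
    (t : us) -> put us >> return (Just t)

  -- Unwrap the newtype and run the state transformer on a token list.
  runParser :: Parser tok m a -> [tok] -> m (a, [tok])
  runParser (Parser p) = runState p
</programlisting>
Note that <literal>pop</literal> is written purely in terms of the class
methods; it never mentions the <literal>Parser</literal> constructor.  With
<literal>Maybe</literal> as the underlying monad,
<literal>runParser pop "ab"</literal> gives
<literal>Just (Just 'a',"b")</literal>.
</para>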
3919 <para>
3920
3921 As a result of this extension, all derived instances in newtype
3922 declarations are treated uniformly (and implemented just by reusing
3923 the dictionary for the representation type), <emphasis>except</emphasis>
3924 <literal>Show</literal> and <literal>Read</literal>, which really behave differently for
3925 the newtype and its representation.
3926 </para>
3927 </sect2>
3928
3929 <sect2> <title> A more precise specification </title>
3930 <para>
3931 Derived instance declarations are constructed as follows. Consider the
3932 declaration (after expansion of any type synonyms)
3933
3934 <programlisting>
3935   newtype T v1...vn = T' (S t1...tk vk+1...vn) deriving (c1...cm)
3936 </programlisting>
3937
3938 where <literal>S</literal> is a type constructor, <literal>t1...tk</literal> are
3939 types,
3940 <literal>vk+1...vn</literal> are type variables which do not occur in any of
3941 the <literal>ti</literal>, and the <literal>ci</literal> are partial applications of
3942 classes of the form <literal>C t1'...tj'</literal>.  The derived instance
3943 declarations are, for each <literal>ci</literal>,
3944
3945 <programlisting>
  instance ci (S t1...tk vk+1...vp) => ci (T v1...vp)
3947 </programlisting>
where <literal>p</literal> is chosen so that <literal>T v1...vp</literal> is of the
right <emphasis>kind</emphasis> for the last parameter of the class in <literal>ci</literal>.
3950 </para>
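<para>
To see how the scheme applies, here is the <literal>Parser</literal>
declaration from the previous section read against it.  This is only a worked
instantiation of the rule, not an additional one:
<programlisting>
  newtype Parser tok m a = Parser (State [tok] (Failure m) a) deriving Monad

  T         = Parser     v1...vn = tok m a               (n = 3)
  S         = State      t1...tk = [tok] (Failure m)     (k = 2)
  vk+1...vn = a          c1      = Monad
</programlisting>
<literal>Monad</literal> expects a type of kind <literal>* -> *</literal>, so
<literal>p</literal> is 2: the trailing variable <literal>a</literal> is
dropped from both sides, and the derived declaration is
<programlisting>
  instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)
</programlisting>
The side condition holds because <literal>a</literal> occurs in neither
<literal>[tok]</literal> nor <literal>Failure m</literal>.
</para>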
3951 <para>
3952
3953 As an example which does <emphasis>not</emphasis> work, consider
3954 <programlisting>
3955   newtype NonMonad m s = NonMonad (State s m s) deriving Monad
3956 </programlisting>
3957 Here we cannot derive the instance
3958 <programlisting>
3959   instance Monad (State s m) => Monad (NonMonad m)
3960 </programlisting>
3961
3962 because the type variable <literal>s</literal> occurs in <literal>State s m</literal>,
and so cannot be &ldquo;eta-converted&rdquo; away. It is a good thing that this
<literal>deriving</literal> clause is rejected, because <literal>NonMonad m</literal> is
not, in fact, a monad, for the same reason. Try defining
<literal>&gt;&gt;=</literal> with the correct type: you won't be able to.
3967 </para>
3968 <para>
3969
3970 Notice also that the <emphasis>order</emphasis> of class parameters becomes
3971 important, since we can only derive instances for the last one. If the
3972 <literal>StateMonad</literal> class above were instead defined as
3973
3974 <programlisting>
3975   class StateMonad m s | m -> s where ...
3976 </programlisting>
3977
3978 then we would not have been able to derive an instance for the
3979 <literal>Parser</literal> type above. We hypothesise that multi-parameter
classes usually have one &ldquo;main&rdquo; parameter for which deriving new
3981 instances is most interesting.
3982 </para>
3983 </sect2>
3984 </sect1>
3985
3986
3987
3988 <!-- Emacs stuff:
3989      ;;; Local Variables: ***
3990      ;;; mode: sgml ***
3991      ;;; sgml-parent-document: ("users_guide.sgml" "book" "chapter" "sect1") ***
3992      ;;; End: ***
3993  -->