ghc/docs/users_guide/glasgow_exts.sgml

   1 <para>
   2 <indexterm><primary>language, GHC</primary></indexterm>
   3 <indexterm><primary>extensions, GHC</primary></indexterm>
   4 As with all known Haskell systems, GHC implements some extensions to
   5 the language.  To use them, you'll need to give a <option>-fglasgow-exts</option>
   6 <indexterm><primary>-fglasgow-exts option</primary></indexterm> option.
   7 </para>
   8
   9 <para>
  10 Virtually all of the Glasgow extensions serve to give you access to
  11 the underlying facilities with which we implement Haskell.  Thus, you
  12 can get at the Raw Iron, if you are willing to write some non-standard
  13 code at a more primitive level.  You need not be &ldquo;stuck&rdquo; on
  14 performance because of the implementation costs of Haskell's
  15 &ldquo;high-level&rdquo; features&mdash;you can always code &ldquo;under&rdquo; them.  In an extreme case, you can write all your time-critical code in C, and then just glue it together with Haskell!
  16 </para>
  17
  18 <para>
  19 Executive summary of our extensions:
  20 </para>
  21
  22   <variablelist>
  23
  24     <varlistentry>
  25       <term>Unboxed types and primitive operations:</Term>
  26       <listitem>
  27         <para>You can get right down to the raw machine types and
  28         operations; included in this are &ldquo;primitive
  29         arrays&rdquo; (direct access to Big Wads of Bytes).  Please
  30         see <XRef LinkEnd="glasgow-unboxed"> and following.</para>
  31       </listitem>
  32     </varlistentry>
  33
  34     <varlistentry>
  35       <term>Type system extensions:</term>
  36       <listitem>
  37         <para> GHC supports a large number of extensions to Haskell's
  38         type system.  Specifically:</para>
  39
  40         <variablelist>
  41           <varlistentry>
  42             <term>Multi-parameter type classes:</term>
  43             <listitem>
  44               <para><xref LinkEnd="multi-param-type-classes"></para>
  45             </listitem>
  46           </varlistentry>
  47
  48           <varlistentry>
  49             <term>Functional dependencies:</term>
  50             <listitem>
  51               <para><xref LinkEnd="functional-dependencies"></para>
  52             </listitem>
  53           </varlistentry>
  54
  55           <varlistentry>
  56             <term>Implicit parameters:</term>
  57             <listitem>
  58               <para><xref LinkEnd="implicit-parameters"></para>
  59             </listitem>
  60           </varlistentry>
  61
  62           <varlistentry>
  63             <term>Local universal quantification:</term>
  64             <listitem>
  65               <para><xref LinkEnd="universal-quantification"></para>
  66             </listitem>
  67           </varlistentry>
  68
  69           <varlistentry>
            <term>Existential quantification in data types:</term>
  71             <listitem>
  72               <para><xref LinkEnd="existential-quantification"></para>
  73             </listitem>
  74           </varlistentry>
  75
  76           <varlistentry>
  77             <term>Scoped type variables:</term>
  78             <listitem>
  79               <para>Scoped type variables enable the programmer to
  80               supply type signatures for some nested declarations,
  81               where this would not be legal in Haskell 98.  Details in
  82               <xref LinkEnd="scoped-type-variables">.</para>
  83             </listitem>
  84           </varlistentry>
  85         </variablelist>
  86       </listitem>
  87     </varlistentry>
  88
  89     <varlistentry>
  90       <term>Pattern guards</term>
  91       <listitem>
  92         <para>Instead of being a boolean expression, a guard is a list
  93         of qualifiers, exactly as in a list comprehension. See <xref
  94         LinkEnd="pattern-guards">.</para>
  95       </listitem>
  96     </varlistentry>
  97
  98     <varlistentry>
  99       <term>Data types with no constructors</term>
 100       <listitem>
 101         <para>See <xref LinkEnd="nullary-types">.</para>
 102       </listitem>
 103     </varlistentry>
 104
 105     <varlistentry>
 106       <term>Parallel list comprehensions</term>
 107       <listitem>
 108         <para>An extension to the list comprehension syntax to support
 109         <literal>zipWith</literal>-like functionality.  See <xref
 110         linkend="parallel-list-comprehensions">.</para>
 111       </listitem>
 112     </varlistentry>
 113
 114     <varlistentry>
 115       <term>Foreign calling:</term>
 116       <listitem>
 117         <para>Just what it sounds like.  We provide
 118         <emphasis>lots</emphasis> of rope that you can dangle around
 119         your neck.  Please see <xref LinkEnd="ffi">.</para>
 120       </listitem>
 121     </varlistentry>
 122
 123     <varlistentry>
 124       <term>Pragmas</term>
 125       <listitem>
 126         <para>Pragmas are special instructions to the compiler placed
 127         in the source file.  The pragmas GHC supports are described in
 128         <xref LinkEnd="pragmas">.</para>
 129       </listitem>
 130     </varlistentry>
 131
 132     <varlistentry>
 133       <term>Rewrite rules:</term>
 134       <listitem>
 135         <para>The programmer can specify rewrite rules as part of the
 136         source program (in a pragma).  GHC applies these rewrite rules
 137         wherever it can.  Details in <xref
 138         LinkEnd="rewrite-rules">.</para>
 139       </listitem>
 140     </varlistentry>
 141
 142     <varlistentry>
 143       <term>Generic classes:</term>
 144       <listitem>
 145         <para>(Note: support for generic classes is currently broken
 146         in GHC 5.02).</para>
 147
 148         <para>Generic class declarations allow you to define a class
 149         whose methods say how to work over an arbitrary data type.
 150         Then it's really easy to make any new type into an instance of
 151         the class.  This generalises the rather ad-hoc "deriving"
 152         feature of Haskell 98.  Details in <xref
 153         LinkEnd="generic-classes">.</para>
 154       </listitem>
 155     </varlistentry>
 156   </variablelist>
 157
 158 <para>
 159 Before you get too carried away working at the lowest level (e.g.,
 160 sloshing <literal>MutableByteArray&num;</literal>s around your
 161 program), you may wish to check if there are libraries that provide a
 162 &ldquo;Haskellised veneer&rdquo; over the features you want.  See
 163 <xref linkend="book-hslibs">.
 164 </para>
 165
 166   <sect1 id="options-language">
 167     <title>Language options</title>
 168
 169     <indexterm><primary>language</primary><secondary>option</secondary>
 170     </indexterm>
 171     <indexterm><primary>options</primary><secondary>language</secondary>
 172     </indexterm>
 173     <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
 174     </indexterm>
 175
    <para> These flags control which variations of the language are
    permitted.  Leaving out all of them gives you standard Haskell
    98.</para>
 179
 180     <variablelist>
 181
 182       <varlistentry>
 183         <term><option>-fglasgow-exts</option>:</term>
 184         <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
 185         <listitem>
 186           <para>This simultaneously enables all of the extensions to
 187           Haskell 98 described in <xref
 188           linkend="ghc-language-features">, except where otherwise
 189           noted. </para>
 190         </listitem>
 191       </varlistentry>
 192
 193       <varlistentry>
 194         <term><option>-fno-monomorphism-restriction</option>:</term>
 195         <indexterm><primary><option>-fno-monomorphism-restriction</option></primary></indexterm>
 196         <listitem>
 197           <para> Switch off the Haskell 98 monomorphism restriction.
 198           Independent of the <option>-fglasgow-exts</option>
 199           flag. </para>
 200         </listitem>
 201       </varlistentry>
 202
 203       <varlistentry>
 204         <term><option>-fallow-overlapping-instances</option></term>
 205         <term><option>-fallow-undecidable-instances</option></term>
 206         <term><option>-fcontext-stack</option></term>
 207         <indexterm><primary><option>-fallow-overlapping-instances</option></primary></indexterm>
 208         <indexterm><primary><option>-fallow-undecidable-instances</option></primary></indexterm>
 209         <indexterm><primary><option>-fcontext-stack</option></primary></indexterm>
 210         <listitem>
 211           <para> See <xref LinkEnd="instance-decls">.  Only relevant
 212           if you also use <option>-fglasgow-exts</option>.</para>
 213         </listitem>
 214       </varlistentry>
 215
 216       <varlistentry>
 217         <term><option>-finline-phase</option></term>
 218         <indexterm><primary><option>-finline-phase</option></primary></indexterm>
 219         <listitem>
 220           <para>See <xref LinkEnd="rewrite-rules">.  Only relevant if
 221           you also use <option>-fglasgow-exts</option>.</para>
 222         </listitem>
 223       </varlistentry>
 224
 225       <varlistentry>
 226         <term><option>-fgenerics</option></term>
 227         <indexterm><primary><option>-fgenerics</option></primary></indexterm>
 228         <listitem>
 229           <para>See <xref LinkEnd="generic-classes">.  Independent of
 230           <option>-fglasgow-exts</option>.</para>
 231         </listitem>
 232       </varlistentry>
 233
 234         <varlistentry>
 235           <term><option>-fno-implicit-prelude</option></term>
 236           <listitem>
 237             <para><indexterm><primary>-fno-implicit-prelude
 238             option</primary></indexterm> GHC normally imports
 239             <filename>Prelude.hi</filename> files for you.  If you'd
 240             rather it didn't, then give it a
 241             <option>-fno-implicit-prelude</option> option.  The idea
 242             is that you can then import a Prelude of your own.  (But
 243             don't call it <literal>Prelude</literal>; the Haskell
 244             module namespace is flat, and you must not conflict with
 245             any Prelude module.)</para>
 246
 247             <para>Even though you have not imported the Prelude, all
 248             the built-in syntax still refers to the built-in Haskell
 249             Prelude types and values, as specified by the Haskell
 250             Report.  For example, the type <literal>[Int]</literal>
 251             still means <literal>Prelude.[] Int</literal>; tuples
 252             continue to refer to the standard Prelude tuples; the
 253             translation for list comprehensions continues to use
 254             <literal>Prelude.map</literal> etc.</para>
 255
 256             <para> With one group of exceptions!  You may want to
 257             define your own numeric class hierarchy.  It completely
 258             defeats that purpose if the literal "1" means
 259             "<literal>Prelude.fromInteger 1</literal>", which is what
 260             the Haskell Report specifies.  So the
 261             <option>-fno-implicit-prelude</option> flag causes the
 262             following pieces of built-in syntax to refer to <emphasis>whatever
 263             is in scope</emphasis>, not the Prelude versions:</para>
 264
 265             <itemizedlist>
 266               <listitem>
 267                 <para>Integer and fractional literals mean
 268                 "<literal>fromInteger 1</literal>" and
 269                 "<literal>fromRational 3.2</literal>", not the
 270                 Prelude-qualified versions; both in expressions and in
 271                 patterns.</para>
 272               </listitem>
 273
 274               <listitem>
 275                 <para>Negation (e.g. "<literal>- (f x)</literal>")
 276                 means "<literal>negate (f x)</literal>" (not
 277                 <literal>Prelude.negate</literal>).</para>
 278               </listitem>
 279
 280               <listitem>
 281                 <para>In an n+k pattern, the standard Prelude
 282                 <literal>Ord</literal> class is still used for comparison,
 283                 but the necessary subtraction uses whatever
 284                 "<literal>(-)</literal>" is in scope (not
 285                 "<literal>Prelude.(-)</literal>").</para>
 286               </listitem>
 287             </itemizedlist>
 288
 289              <para>Note: Negative literals, such as <literal>-3</literal>, are
 290              specified by (a careful reading of) the Haskell Report as
 291              meaning <literal>Prelude.negate (Prelude.fromInteger 3)</literal>.
 292              However, GHC deviates from this slightly, and treats them as meaning
 293              <literal>fromInteger (-3)</literal>.  One particular effect of this
 294              slightly-non-standard reading is that there is no difficulty with
 295              the literal <literal>-2147483648</literal> at type <literal>Int</literal>;
 296              it means <literal>fromInteger (-2147483648)</literal>.  The strict interpretation
 297              would be <literal>negate (fromInteger 2147483648)</literal>,
 298              and the call to <literal>fromInteger</literal> would overflow
 299              (at type <literal>Int</literal>, remember).
 300              </para>
 301
 302           </listitem>
 303         </varlistentry>
 304
 305     </variablelist>
 306   </sect1>
 307
 308 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
 309 &primitives;
 310
 311 <sect1 id="glasgow-ST-monad">
 312 <title>Primitive state-transformer monad</title>
 313
 314 <para>
 315 <indexterm><primary>state transformers (Glasgow extensions)</primary></indexterm>
 316 <indexterm><primary>ST monad (Glasgow extension)</primary></indexterm>
 317 </para>
 318
 319 <para>
 320 This monad underlies our implementation of arrays, mutable and
 321 immutable, and our implementation of I/O, including &ldquo;C calls&rdquo;.
 322 </para>
 323
 324 <para>
 325 The <literal>ST</literal> library, which provides access to the
 326 <function>ST</function> monad, is described in <xref
 327 linkend="sec-ST">.
 328 </para>
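
<para>
As a minimal sketch of what state-transformer code looks like, the
following sums a list by accumulating into a mutable reference.  It
assumes the hierarchical module names <literal>Control.Monad.ST</literal>
and <literal>Data.STRef</literal> found in current GHC library layouts,
which may differ from the <literal>ST</literal> library of the release
described here; the name <function>sumST</function> is invented for this
illustration.
</para>

<programlisting>
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, modifySTRef')

-- Sum a list by accumulating into a mutable reference.
sumST :: [Int] -> Int
sumST xs = runST $ do
  ref <- newSTRef 0                        -- allocate the accumulator
  mapM_ (\x -> modifySTRef' ref (+ x)) xs  -- update it in place
  readSTRef ref                            -- read the final value
</programlisting>

<para>
Because <function>runST</function> seals the mutable state inside the
computation, <function>sumST</function> is an ordinary pure function as
far as its callers are concerned.
</para>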
 329
 330 </sect1>
 331
 332 <sect1 id="glasgow-prim-arrays">
 333 <title>Primitive arrays, mutable and otherwise
 334 </title>
 335
 336 <para>
 337 <indexterm><primary>primitive arrays (Glasgow extension)</primary></indexterm>
 338 <indexterm><primary>arrays, primitive (Glasgow extension)</primary></indexterm>
 339 </para>
 340
 341 <para>
 342 GHC knows about quite a few flavours of Large Swathes of Bytes.
 343 </para>
 344
 345 <para>
 346 First, GHC distinguishes between primitive arrays of (boxed) Haskell
 347 objects (type <literal>Array&num; obj</literal>) and primitive arrays of bytes (type
 348 <literal>ByteArray&num;</literal>).
 349 </para>
 350
 351 <para>
 352 Second, it distinguishes between&hellip;
 353 <variablelist>
 354
 355 <varlistentry>
 356 <term>Immutable:</term>
 357 <listitem>
 358 <para>
 359 Arrays that do not change (as with &ldquo;standard&rdquo; Haskell arrays); you
 360 can only read from them.  Obviously, they do not need the care and
 361 attention of the state-transformer monad.
 362 </para>
 363 </listitem>
 364 </varlistentry>
 365 <varlistentry>
 366 <term>Mutable:</term>
 367 <listitem>
 368 <para>
 369 Arrays that may be changed or &ldquo;mutated.&rdquo;  All the operations on them
 370 live within the state-transformer monad and the updates happen
 371 <emphasis>in-place</emphasis>.
 372 </para>
 373 </listitem>
 374 </varlistentry>
 375 <varlistentry>
 376 <term>&ldquo;Static&rdquo; (in C land):</term>
 377 <listitem>
 378 <para>
 379 A C routine may pass an <literal>Addr&num;</literal> pointer back into Haskell land.  There
 380 are then primitive operations with which you may merrily grab values
 381 over in C land, by indexing off the &ldquo;static&rdquo; pointer.
 382 </para>
 383 </listitem>
 384 </varlistentry>
 385 <varlistentry>
 386 <term>&ldquo;Stable&rdquo; pointers:</term>
 387 <listitem>
 388 <para>
 389 If, for some reason, you wish to hand a Haskell pointer (i.e.,
 390 <emphasis>not</emphasis> an unboxed value) to a C routine, you first make the
 391 pointer &ldquo;stable,&rdquo; so that the garbage collector won't forget that it
 392 exists.  That is, GHC provides a safe way to pass Haskell pointers to
 393 C.
 394 </para>
 395
 396 <para>
 397 Please see <xref LinkEnd="sec-stable-pointers"> for more details.
 398 </para>
 399 </listitem>
 400 </varlistentry>
 401 <varlistentry>
 402 <term>&ldquo;Foreign objects&rdquo;:</term>
 403 <listitem>
 404 <para>
 405 A &ldquo;foreign object&rdquo; is a safe way to pass an external object (a
 406 C-allocated pointer, say) to Haskell and have Haskell do the Right
 407 Thing when it no longer references the object.  So, for example, C
 408 could pass a large bitmap over to Haskell and say &ldquo;please free this
 409 memory when you're done with it.&rdquo;
 410 </para>
 411
 412 <para>
 413 Please see <xref LinkEnd="sec-ForeignObj"> for more details.
 414 </para>
 415 </listitem>
 416 </varlistentry>
 417 </variablelist>
 418 </para>
 419
 420 <para>
The libraries documentation gives more details on all these
 422 &ldquo;primitive array&rdquo; types and the operations on them.
 423 </para>
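
<para>
To make the immutable/mutable distinction concrete, here is a hedged
sketch that goes through a library veneer rather than the raw primitives.
It assumes the <literal>Data.Array.ST</literal> and
<literal>Data.Array.Unboxed</literal> modules of the
<literal>array</literal> package shipped with current GHC releases; the
function name <function>doubled</function> is made up for the example.  A
mutable array is filled and updated in place inside the state-transformer
monad, then frozen into an immutable array that can only be read.
</para>

<programlisting>
import Control.Monad (forM_)
import Data.Array.ST (newListArray, readArray, writeArray, runSTUArray)
import Data.Array.Unboxed (UArray, elems)

-- Double every element in place, then freeze the result.
doubled :: [Int] -> UArray Int Int
doubled xs = runSTUArray (do
  let n = length xs
  arr <- newListArray (0, n - 1) xs   -- mutable, lives in the ST monad
  forM_ [0 .. n - 1] (\i -> do
    v <- readArray arr i
    writeArray arr i (2 * v))         -- in-place update
  return arr)                         -- frozen by runSTUArray
</programlisting>

<para>
The result is an ordinary immutable value; for instance,
<literal>elems (doubled [1, 2, 3])</literal> reads its contents back as a
list.
</para>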
 424
 425 </sect1>
 426
 427
 428 <sect1 id="nullary-types">
 429 <title>Data types with no constructors</title>
 430
 431 <para>With the <option>-fglasgow-exts</option> flag, GHC lets you declare
 432 a data type with no constructors.  For example:</para>
 433 <programlisting>
 434   data S      -- S :: *
 435   data T a    -- T :: * -> *
 436 </programlisting>
 437 <para>Syntactically, the declaration lacks the "= constrs" part.  The
 438 type can be parameterised, but only over ordinary types, of kind *; since
 439 Haskell does not have kind signatures, you cannot parameterise over higher-kinded
 440 types.</para>
 441
 442 <para>Such data types have only one value, namely bottom.
 443 Nevertheless, they can be useful when defining "phantom types".</para>
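
<para>
A small sketch of the phantom-type idiom may help.  All the names here
(<literal>Validated</literal>, <literal>Unvalidated</literal>,
<literal>Input</literal> and the three functions) are hypothetical,
chosen only for illustration.  The two constructor-less types serve
purely as compile-time tags, so no value of either type is ever needed.
</para>

<programlisting>
-- Two data types with no constructors, used only as tags.
data Validated
data Unvalidated

-- 'tag' is a phantom parameter: it does not occur on the right-hand side.
newtype Input tag = Input String

-- Freshly received input starts out unvalidated.
raw :: String -> Input Unvalidated
raw = Input

-- Validation changes only the tag, not the representation.
validate :: Input Unvalidated -> Maybe (Input Validated)
validate (Input s)
  | null s    = Nothing
  | otherwise = Just (Input s)

-- Only validated input is accepted here.
use :: Input Validated -> Int
use (Input s) = length s
</programlisting>

<para>
With these definitions, <literal>use (raw "abc")</literal> is rejected
by the type checker, because the tags do not match.
</para>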
 444 </sect1>
 445
 446 <sect1 id="pattern-guards">
 447 <title>Pattern guards</title>
 448
 449 <para>
 450 <indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
 451 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ULink URL="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ULink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
 452 </para>
 453
 454 <para>
 455 Suppose we have an abstract data type of finite maps, with a
 456 lookup operation:
 457
 458 <programlisting>
 459 lookup :: FiniteMap -> Int -> Maybe Int
 460 </programlisting>
 461
 462 The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
 463 where <VarName>v</VarName> is the value that the key maps to.  Now consider the following definition:
 464 </para>
 465
 466 <programlisting>
clunky env var1 var2 | ok1 && ok2 = val1 + val2
                     | otherwise  = var1 + var2
  where
    m1 = lookup env var1
    m2 = lookup env var2
    ok1 = maybeToBool m1
    ok2 = maybeToBool m2
    val1 = expectJust m1
    val2 = expectJust m2
 476 </programlisting>
 477
 478 <para>
 479 The auxiliary functions are
 480 </para>
 481
 482 <programlisting>
 483 maybeToBool :: Maybe a -&gt; Bool
 484 maybeToBool (Just x) = True
 485 maybeToBool Nothing  = False
 486
 487 expectJust :: Maybe a -&gt; a
 488 expectJust (Just x) = x
 489 expectJust Nothing  = error "Unexpected Nothing"
 490 </programlisting>
 491
 492 <para>
What is <function>clunky</function> doing? The guard <literal>ok1 &&
ok2</literal> checks that both lookups succeed, using
<function>maybeToBool</function> to convert the <function>Maybe</function>
types to booleans. The (lazily evaluated) <function>expectJust</function>
calls extract the values from the results of the lookups, and bind the
returned values to <VarName>val1</VarName> and <VarName>val2</VarName>
respectively.  If either lookup fails, then <function>clunky</function> takes the
<literal>otherwise</literal> case and returns the sum of its arguments.
 501 </para>
 502
 503 <para>
 504 This is certainly legal Haskell, but it is a tremendously verbose and
 505 un-obvious way to achieve the desired effect.  Arguably, a more direct way
 506 to write clunky would be to use case expressions:
 507 </para>
 508
 509 <programlisting>
clunky env var1 var2 = case lookup env var1 of
  Nothing -&gt; fail
  Just val1 -&gt; case lookup env var2 of
    Nothing -&gt; fail
    Just val2 -&gt; val1 + val2
  where
    fail = var1 + var2
 517 </programlisting>
 518
 519 <para>
This is a bit shorter, but hardly better.  Of course, we can rewrite any set
of pattern-matching, guarded equations as case expressions; that is
precisely what the compiler does when compiling equations! Haskell
provides guarded equations because they allow us to write down
the cases we want to consider, one at a time, independently of each other.
This structure is hidden in the case version.  Two of the right-hand sides
are really the same (<function>fail</function>), and the whole expression
tends to become more and more indented.
 528 </para>
 529
 530 <para>
 531 Here is how I would write clunky:
 532 </para>
 533
 534 <programlisting>
clunky env var1 var2
  | Just val1 &lt;- lookup env var1
  , Just val2 &lt;- lookup env var2
  = val1 + val2
...other equations for clunky...
 540 </programlisting>
 541
 542 <para>
The semantics should be clear enough.  The qualifiers are matched in order.
For a <literal>&lt;-</literal> qualifier, which I call a pattern guard, the
right hand side is evaluated and matched against the pattern on the left.
If the match fails then the whole guard fails and the next equation is
tried.  If it succeeds, then the appropriate binding takes place, and the
next qualifier is matched, in the augmented environment.  Unlike list
comprehensions, however, the type of the expression to the right of the
<literal>&lt;-</literal> is the same as the type of the pattern to its
left.  The bindings introduced by pattern guards scope over all the
remaining guard qualifiers, and over the right hand side of the equation.
 553 </para>
 554
 555 <para>
Just as with list comprehensions, boolean expressions can be freely mixed
among the pattern guards.  For example:
 558 </para>
 559
 560 <programlisting>
 561 f x | [y] <- x
 562     , y > 3
 563     , Just z <- h y
 564     = ...
 565 </programlisting>
 566
 567 <para>
 568 Haskell's current guards therefore emerge as a special case, in which the
 569 qualifier list has just one element, a boolean expression.
 570 </para>
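
<para>
For readers who want to try this out, here is a self-contained sketch of
the pattern-guard version of <function>clunky</function>.  Representing
<literal>FiniteMap</literal> as an association list, and implementing
<function>lookup</function> on top of <literal>Data.List.lookup</literal>,
are assumptions made only so that the example can be loaded; the abstract
type in the discussion above could be anything.
</para>

<programlisting>
import Prelude hiding (lookup)
import qualified Data.List as List

-- Hypothetical stand-in for the abstract finite map type.
type FiniteMap = [(Int, Int)]

-- Same argument order as in the text: the map first, then the key.
lookup :: FiniteMap -> Int -> Maybe Int
lookup env key = List.lookup key env

clunky :: FiniteMap -> Int -> Int -> Int
clunky env var1 var2
  | Just val1 <- lookup env var1     -- first pattern guard
  , Just val2 <- lookup env var2     -- tried only if the first matched
  = val1 + val2
clunky _ var1 var2 = var1 + var2     -- reached when either guard fails
</programlisting>

<para>
If both keys are present the looked-up values are added; if either lookup
fails, the whole guard fails and the second equation adds the keys
themselves.
</para>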
 571 </sect1>
 572
 573   <sect1 id="parallel-list-comprehensions">
 574     <title>Parallel List Comprehensions</title>
 575     <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary>
 576     </indexterm>
 577     <indexterm><primary>parallel list comprehensions</primary>
 578     </indexterm>
 579
 580     <para>Parallel list comprehensions are a natural extension to list
 581     comprehensions.  List comprehensions can be thought of as a nice
 582     syntax for writing maps and filters.  Parallel comprehensions
 583     extend this to include the zipWith family.</para>
 584
 585     <para>A parallel list comprehension has multiple independent
 586     branches of qualifier lists, each separated by a `|' symbol.  For
 587     example, the following zips together two lists:</para>
 588
 589 <programlisting>
 590    [ (x, y) | x <- xs | y <- ys ]
 591 </programlisting>
 592
 593     <para>The behavior of parallel list comprehensions follows that of
 594     zip, in that the resulting list will have the same length as the
 595     shortest branch.</para>
 596
 597     <para>We can define parallel list comprehensions by translation to
 598     regular comprehensions.  Here's the basic idea:</para>
 599
 600     <para>Given a parallel comprehension of the form: </para>
 601
 602 <programlisting>
 603    [ e | p1 <- e11, p2 <- e12, ...
 604        | q1 <- e21, q2 <- e22, ...
 605        ...
 606    ]
 607 </programlisting>
 608
 609     <para>This will be translated to: </para>
 610
 611 <programlisting>
 612    [ e | ((p1,p2), (q1,q2), ...) <- zipN [(p1,p2) | p1 <- e11, p2 <- e12, ...]
 613                                          [(q1,q2) | q1 <- e21, q2 <- e22, ...]
 614                                          ...
 615    ]
 616 </programlisting>
 617
 618     <para>where `zipN' is the appropriate zip for the given number of
 619     branches.</para>
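
    <para>The translation can be checked on a small case.  The sketch
    below assumes a GHC recent enough to enable the extension with the
    <literal>ParallelListComp</literal> pragma (with the release
    described here you would pass <option>-fglasgow-exts</option>
    instead); the function names are invented for the example.  Each
    parallel comprehension is paired with its hand-translated
    form.</para>

<programlisting>
{-# LANGUAGE ParallelListComp #-}

-- Two single-qualifier branches: behaves like zip.
pairsParallel :: [Int] -> [Char] -> [(Int, Char)]
pairsParallel xs ys = [ (x, y) | x <- xs | y <- ys ]

-- The same list, written out using the translation.
pairsZipped :: [Int] -> [Char] -> [(Int, Char)]
pairsZipped xs ys = [ (x, y) | (x, y) <- zip [ x | x <- xs ] [ y | y <- ys ] ]

-- A branch with two qualifiers, zipped against a second branch.
sumsParallel :: [Int]
sumsParallel = [ a + b + c | a <- [1, 2], b <- [10, 20] | c <- [100, 200, 300] ]

-- Its translation: each branch becomes an ordinary comprehension.
sumsZipped :: [Int]
sumsZipped = [ a + b + c | ((a, b), c) <- zip [ (a, b) | a <- [1, 2], b <- [10, 20] ]
                                              [ c | c <- [100, 200, 300] ] ]
</programlisting>

    <para>In <function>sumsParallel</function> the first branch yields
    four pairs and the second yields three values, so the result has
    three elements, following the shortest branch.</para>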
 620
 621   </sect1>
 622
 623 <sect1 id="multi-param-type-classes">
 624 <title>Multi-parameter type classes
 625 </title>
 626
 627 <para>
 628 This section documents GHC's implementation of multi-parameter type
 629 classes.  There's lots of background in the paper <ULink
 630 URL="http://research.microsoft.com/~simonpj/multi.ps.gz" >Type
 631 classes: exploring the design space</ULink > (Simon Peyton Jones, Mark
 632 Jones, Erik Meijer).
 633 </para>
 634
 635 <para>
I'd like to thank people who reported shortcomings in the GHC 3.02
implementation.  Our default decisions were all conservative ones, and
the experience of these heroic pioneers has given useful concrete
examples to support several generalisations.  (These appear below as
design choices not implemented in 3.02.)
 641 </para>
 642
 643 <para>
 644 I've discussed these notes with Mark Jones, and I believe that Hugs
 645 will migrate towards the same design choices as I outline here.
 646 Thanks to him, and to many others who have offered very useful
 647 feedback.
 648 </para>
 649
 650 <sect2>
 651 <title>Types</title>
 652
 653 <para>
 654 There are the following restrictions on the form of a qualified
 655 type:
 656 </para>
 657
 658 <para>
 659
 660 <programlisting>
 661   forall tv1..tvn (c1, ...,cn) => type
 662 </programlisting>
 663
 664 </para>
 665
 666 <para>
 667 (Here, I write the "foralls" explicitly, although the Haskell source
 668 language omits them; in Haskell 1.4, all the free type variables of an
 669 explicit source-language type signature are universally quantified,
 670 except for the class type variables in a class declaration.  However,
 671 in GHC, you can give the foralls if you want.  See <xref LinkEnd="universal-quantification">).
 672 </para>
 673
 674 <para>
 675
 676 <OrderedList>
 677 <listitem>
 678
 679 <para>
 680  <emphasis>Each universally quantified type variable
 681 <literal>tvi</literal> must be mentioned (i.e. appear free) in <literal>type</literal></emphasis>.
 682
 683 The reason for this is that a value with a type that does not obey
 684 this restriction could not be used without introducing
 685 ambiguity. Here, for example, is an illegal type:
 686
 687
 688 <programlisting>
 689   forall a. Eq a => Int
 690 </programlisting>
 691
 692
 693 When a value with this type was used, the constraint <literal>Eq tv</literal>
 694 would be introduced where <literal>tv</literal> is a fresh type variable, and
 695 (in the dictionary-translation implementation) the value would be
 696 applied to a dictionary for <literal>Eq tv</literal>.  The difficulty is that we
 697 can never know which instance of <literal>Eq</literal> to use because we never
 698 get any more information about <literal>tv</literal>.
 699
 700 </para>
 701 </listitem>
 702 <listitem>
 703
 704 <para>
 705  <emphasis>Every constraint <literal>ci</literal> must mention at least one of the
 706 universally quantified type variables <literal>tvi</literal></emphasis>.
 707
For example, this type is OK because <literal>C a b</literal> mentions the
universally quantified type variable <literal>a</literal>:
 710
 711
 712 <programlisting>
 713   forall a. C a b => burble
 714 </programlisting>
 715
 716
 717 The next type is illegal because the constraint <literal>Eq b</literal> does not
 718 mention <literal>a</literal>:
 719
 720
 721 <programlisting>
 722   forall a. Eq b => burble
 723 </programlisting>
 724
 725
 726 The reason for this restriction is milder than the other one.  The
 727 excluded types are never useful or necessary (because the offending
 728 context doesn't need to be witnessed at this point; it can be floated
 729 out).  Furthermore, floating them out increases sharing. Lastly,
 730 excluding them is a conservative choice; it leaves a patch of
 731 territory free in case we need it later.
 732
 733 </para>
 734 </listitem>
 735
 736 </OrderedList>
 737
 738 </para>
 739
 740 <para>
 741 These restrictions apply to all types, whether declared in a type signature
 742 or inferred.
 743 </para>
 744
 745 <para>
 746 Unlike Haskell 1.4, constraints in types do <emphasis>not</emphasis> have to be of
 747 the form <emphasis>(class type-variables)</emphasis>.  Thus, these type signatures
 748 are perfectly OK
 749 </para>
 750
 751 <para>
 752
 753 <programlisting>
 754   f :: Eq (m a) => [m a] -> [m a]
 755   g :: Eq [a] => ...
 756 </programlisting>
 757
 758 </para>
 759
 760 <para>
 761 This choice recovers principal types, a property that Haskell 1.4 does not have.
 762 </para>
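
<para>
As an illustration of a constraint that is not of the form
<emphasis>(class type-variable)</emphasis>, here is a minimal sketch.  It
assumes a GHC that enables such contexts through the
<literal>FlexibleContexts</literal> pragma; the function name
<function>dedup</function> is hypothetical.
</para>

<programlisting>
{-# LANGUAGE FlexibleContexts #-}
import Data.List (nub)

-- The constraint is on the compound type (m a), not on a bare type variable.
dedup :: Eq (m a) => [m a] -> [m a]
dedup = nub
</programlisting>

<para>
At a use site such as <literal>dedup [Just 1, Just 1, Nothing]</literal>,
<literal>m</literal> is instantiated to <literal>Maybe</literal> and the
constraint <literal>Eq (Maybe a)</literal> is discharged in the usual way.
</para>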
 763
 764 </sect2>
 765
 766 <sect2>
 767 <title>Class declarations</title>
 768
 769 <para>
 770
 771 <OrderedList>
 772 <listitem>
 773
 774 <para>
 775  <emphasis>Multi-parameter type classes are permitted</emphasis>. For example:
 776
 777
 778 <programlisting>
 779   class Collection c a where
 780     union :: c a -> c a -> c a
 781     ...etc.
 782 </programlisting>
 783
 784
 785
 786 </para>
 787 </listitem>
 788 <listitem>
 789
 790 <para>
 791  <emphasis>The class hierarchy must be acyclic</emphasis>.  However, the definition
 792 of "acyclic" involves only the superclass relationships.  For example,
 793 this is OK:
 794
 795
 796 <programlisting>
 797   class C a where {
 798     op :: D b => a -> b -> b
 799   }
 800
 801   class C a => D a where { ... }
 802 </programlisting>
 803
 804
 805 Here, <literal>C</literal> is a superclass of <literal>D</literal>, but it's OK for a
 806 class operation <literal>op</literal> of <literal>C</literal> to mention <literal>D</literal>.  (It
 807 would not be OK for <literal>D</literal> to be a superclass of <literal>C</literal>.)
 808
 809 </para>
 810 </listitem>
 811 <listitem>
 812
 813 <para>
 814  <emphasis>There are no restrictions on the context in a class declaration
 815 (which introduces superclasses), except that the class hierarchy must
 816 be acyclic</emphasis>.  So these class declarations are OK:
 817
 818
 819 <programlisting>
 820   class Functor (m k) => FiniteMap m k where
 821     ...
 822
 823   class (Monad m, Monad (t m)) => Transform t m where
 824     lift :: m a -> (t m) a
 825 </programlisting>
 826
 827
 828 </para>
 829 </listitem>
 830 <listitem>
 831
 832 <para>
 833  <emphasis>In the signature of a class operation, every constraint
 834 must mention at least one type variable that is not a class type
 835 variable</emphasis>.
 836
 837 Thus:
 838
 839
 840 <programlisting>
 841   class Collection c a where
 842     mapC :: Collection c b => (a->b) -> c a -> c b
 843 </programlisting>
 844
 845
is OK because the constraint <literal>(Collection c b)</literal> mentions
<literal>b</literal>, even though it also mentions the class variable
<literal>c</literal>.  On the other hand:
 849
 850
 851 <programlisting>
 852   class C a where
 853     op :: Eq a => (a,b) -> (a,b)
 854 </programlisting>
 855
 856
is not OK because the constraint <literal>(Eq a)</literal> mentions only the class
type variable <literal>a</literal>, and not <literal>b</literal>.  However, any such
 859 example is easily fixed by moving the offending context up to the
 860 superclass context:
 861
 862
 863 <programlisting>
 864   class Eq a => C a where
    op :: (a,b) -> (a,b)
 866 </programlisting>
 867
 868
 869 A yet more relaxed rule would allow the context of a class-op signature
 870 to mention only class type variables.  However, that conflicts with
 871 Rule 1(b) for types above.
 872
 873 </para>
 874 </listitem>
 875 <listitem>
 876
 877 <para>
 878  <emphasis>The type of each class operation must mention <emphasis>all</emphasis> of
 879 the class type variables</emphasis>.  For example:
 880
 881
 882 <programlisting>
 883   class Coll s a where
 884     empty  :: s
 885     insert :: s -> a -> s
 886 </programlisting>
 887
 888
 889 is not OK, because the type of <literal>empty</literal> doesn't mention
 890 <literal>a</literal>.  This rule is a consequence of Rule 1(a), above, for
 891 types, and has the same motivation.
 892
 893 Sometimes, offending class declarations exhibit misunderstandings.  For
 894 example, <literal>Coll</literal> might be rewritten
 895
 896
 897 <programlisting>
 898   class Coll s a where
 899     empty  :: s a
 900     insert :: s a -> a -> s a
 901 </programlisting>
 902
 903
 904 which makes the connection between the type of a collection of
 905 <literal>a</literal>'s (namely <literal>(s a)</literal>) and the element type <literal>a</literal>.
 906 Occasionally this really doesn't work, in which case you can split the
 907 class like this:
 908
 909
 910 <programlisting>
 911   class CollE s where
 912     empty  :: s
 913
 914   class CollE s => Coll s a where
 915     insert :: s -> a -> s
 916 </programlisting>
 917
 918
 919 </para>
 920 </listitem>
 921
 922 </OrderedList>
 923
 924 </para>
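<para>
As a minimal sketch of the last rule, here is the repaired <literal>Coll</literal>
class together with one instance.  The list instance and the name
<literal>example</literal> are assumptions made for illustration, and the
<literal>LANGUAGE</literal> pragmas assume a recent GHC, where they stand in for
<option>-fglasgow-exts</option>:
</para>

<programlisting>
```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

-- Every operation mentions both class variables, s and a.
class Coll s a where
  empty  :: s a
  insert :: s a -> a -> s a

-- Hypothetical instance: lists act as collections of any element type.
instance Coll [] a where
  empty       = []
  insert xs x = x : xs

-- The annotation fixes s = [] and a = Int.
example :: [Int]
example = insert (insert empty 1) 2
```
</programlisting>

<para>
With this instance, <literal>example</literal> evaluates to <literal>[2,1]</literal>,
because <literal>insert</literal> adds each element at the front.
</para>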
 925
 926 </sect2>
 927
 928 <sect2 id="instance-decls">
 929 <title>Instance declarations</title>
 930
 931 <para>
 932
 933 <OrderedList>
 934 <listitem>
 935
 936 <para>
 937  <emphasis>Instance declarations may not overlap</emphasis>.  The two instance
 938 declarations
 939
 940
 941 <programlisting>
 942   instance context1 => C type1 where ...
 943   instance context2 => C type2 where ...
 944 </programlisting>
 945
 946
"overlap" if <literal>type1</literal> and <literal>type2</literal> unify.
 948
 949 However, if you give the command line option
 950 <option>-fallow-overlapping-instances</option><indexterm><primary>-fallow-overlapping-instances
 951 option</primary></indexterm> then two overlapping instance declarations are permitted
 952 iff
 953
 954
 955 <itemizedlist>
 956 <listitem>
 957
 958 <para>
 959  EITHER <literal>type1</literal> and <literal>type2</literal> do not unify
 960 </para>
 961 </listitem>
 962 <listitem>
 963
 964 <para>
 965  OR <literal>type2</literal> is a substitution instance of <literal>type1</literal>
 966 (but not identical to <literal>type1</literal>)
 967 </para>
 968 </listitem>
 969 <listitem>
 970
 971 <para>
 972  OR vice versa
 973 </para>
 974 </listitem>
 975
 976 </itemizedlist>
 977
 978
 979 Notice that these rules
 980
 981
 982 <itemizedlist>
 983 <listitem>
 984
 985 <para>
 986  make it clear which instance decl to use
 987 (pick the most specific one that matches)
 988
 989 </para>
 990 </listitem>
 991 <listitem>
 992
 993 <para>
 do not mention the contexts <literal>context1</literal>, <literal>context2</literal>.
 995 Reason: you can pick which instance decl
 996 "matches" based on the type.
 997 </para>
 998 </listitem>
 999
1000 </itemizedlist>
1001
1002
1003 Regrettably, GHC doesn't guarantee to detect overlapping instance
1004 declarations if they appear in different modules.  GHC can "see" the
1005 instance declarations in the transitive closure of all the modules
1006 imported by the one being compiled, so it can "see" all instance decls
1007 when it is compiling <literal>Main</literal>.  However, it currently chooses not
1008 to look at ones that can't possibly be of use in the module currently
1009 being compiled, in the interests of efficiency.  (Perhaps we should
1010 change that decision, at least for <literal>Main</literal>.)
1011
1012 </para>
1013 </listitem>
1014 <listitem>
1015
1016 <para>
 <emphasis>There are no restrictions on the types in an instance
<emphasis>head</emphasis>, except that at least one of them must not be a type variable</emphasis>.
1019 The instance "head" is the bit after the "=>" in an instance decl. For
1020 example, these are OK:
1021
1022
1023 <programlisting>
1024   instance C Int a where ...
1025
1026   instance D (Int, Int) where ...
1027
1028   instance E [[a]] where ...
1029 </programlisting>
1030
1031
1032 Note that instance heads <emphasis>may</emphasis> contain repeated type variables.
1033 For example, this is OK:
1034
1035
1036 <programlisting>
1037   instance Stateful (ST s) (MutVar s) where ...
1038 </programlisting>
1039
1040
1041 The "at least one not a type variable" restriction is to ensure that
1042 context reduction terminates: each reduction step removes one type
1043 constructor.  For example, the following would make the type checker
1044 loop if it wasn't excluded:
1045
1046
1047 <programlisting>
1048   instance C a => C a where ...
1049 </programlisting>
1050
1051
1052 There are two situations in which the rule is a bit of a pain. First,
1053 if one allows overlapping instance declarations then it's quite
1054 convenient to have a "default instance" declaration that applies if
1055 something more specific does not:
1056
1057
1058 <programlisting>
1059   instance C a where
1060     op = ... -- Default
1061 </programlisting>
1062
1063
1064 Second, sometimes you might want to use the following to get the
1065 effect of a "class synonym":
1066
1067
1068 <programlisting>
1069   class (C1 a, C2 a, C3 a) => C a where { }
1070
1071   instance (C1 a, C2 a, C3 a) => C a where { }
1072 </programlisting>
1073
1074
1075 This allows you to write shorter signatures:
1076
1077
1078 <programlisting>
1079   f :: C a => ...
1080 </programlisting>
1081
1082
1083 instead of
1084
1085
1086 <programlisting>
1087   f :: (C1 a, C2 a, C3 a) => ...
1088 </programlisting>
1089
1090
1091 I'm on the lookout for a simple rule that preserves decidability while
1092 allowing these idioms.  The experimental flag
1093 <option>-fallow-undecidable-instances</option><indexterm><primary>-fallow-undecidable-instances
1094 option</primary></indexterm> lifts this restriction, allowing all the types in an
1095 instance head to be type variables.
1096
1097 </para>
1098 </listitem>
1099 <listitem>
1100
1101 <para>
1102  <emphasis>Unlike Haskell 1.4, instance heads may use type
1103 synonyms</emphasis>.  As always, using a type synonym is just shorthand for
1104 writing the RHS of the type synonym definition.  For example:
1105
1106
1107 <programlisting>
1108   type Point = (Int,Int)
1109   instance C Point   where ...
1110   instance C [Point] where ...
1111 </programlisting>
1112
1113
1114 is legal.  However, if you added
1115
1116
1117 <programlisting>
1118   instance C (Int,Int) where ...
1119 </programlisting>
1120
1121
1122 as well, then the compiler will complain about the overlapping
1123 (actually, identical) instance declarations.  As always, type synonyms
1124 must be fully applied.  You cannot, for example, write:
1125
1126
1127 <programlisting>
1128   type P a = [[a]]
1129   instance Monad P where ...
1130 </programlisting>
1131
1132
1133 This design decision is independent of all the others, and easily
1134 reversed, but it makes sense to me.
1135
1136 </para>
1137 </listitem>
1138 <listitem>
1139
1140 <para>
1141 <emphasis>The types in an instance-declaration <emphasis>context</emphasis> must all
1142 be type variables</emphasis>. Thus
1143
1144
1145 <programlisting>
1146 instance C a b => Eq (a,b) where ...
1147 </programlisting>
1148
1149
1150 is OK, but
1151
1152
1153 <programlisting>
1154 instance C Int b => Foo b where ...
1155 </programlisting>
1156
1157
1158 is not OK.  Again, the intent here is to make sure that context
1159 reduction terminates.
1160
1161 Voluminous correspondence on the Haskell mailing list has convinced me
1162 that it's worth experimenting with a more liberal rule.  If you use
the flag <option>-fallow-undecidable-instances</option>, you can use arbitrary
1164 types in an instance context.  Termination is ensured by having a
1165 fixed-depth recursion stack.  If you exceed the stack depth you get a
1166 sort of backtrace, and the opportunity to increase the stack depth
1167 with <option>-fcontext-stack</option><emphasis>N</emphasis>.
1168
1169 </para>
1170 </listitem>
1171
1172 </OrderedList>
1173
1174 </para>
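<para>
The "pick the most specific one that matches" rule for overlapping instances can
be sketched as follows.  The class <literal>Describe</literal> and its instances are
hypothetical.  The sketch also assumes a recent GHC, where overlap is permitted
per instance with an <literal>OVERLAPPING</literal> pragma rather than with the
command-line flag described above:
</para>

<programlisting>
```haskell
{-# LANGUAGE FlexibleInstances #-}

class Describe a where
  describe :: a -> String

-- General instance: matches any list type.
instance Describe [a] where
  describe _ = "some list"

-- More specific instance: [Char] is a substitution instance of [a].
instance {-# OVERLAPPING #-} Describe [Char] where
  describe _ = "a string"

general, specific :: String
general  = describe [True, False]   -- only the [a] instance matches
specific = describe "hello"         -- both match; the more specific one wins
```
</programlisting>

<para>
Here <literal>general</literal> is <literal>"some list"</literal> and
<literal>specific</literal> is <literal>"a string"</literal>.  Neither choice
depends on an instance context, only on the types in the instance heads.
</para>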
1175
1176 </sect2>
1177
1178 </sect1>
1179
1180 <sect1 id="implicit-parameters">
1181 <title>Implicit parameters
1182 </title>
1183
<para> Implicit parameters are implemented as described in
1185 "Implicit parameters: dynamic scoping with static types",
1186 J Lewis, MB Shields, E Meijer, J Launchbury,
1187 27th ACM Symposium on Principles of Programming Languages (POPL'00),
1188 Boston, Jan 2000.
1189 </para>
1190
1191 <para>
1192 There should be more documentation, but there isn't (yet).  Yell if you need it.
1193 </para>
1194 <itemizedlist>
1195 <listitem>
1196 <para> You can't have an implicit parameter in the context of a class or instance
1197 declaration.  For example, both these declarations are illegal:
1198 <programlisting>
1199   class (?x::Int) => C a where ...
1200   instance (?x::a) => Foo [a] where ...
1201 </programlisting>
Reason: exactly which implicit parameter you pick up depends on exactly where
you invoke a function. But the &ldquo;invocation&rdquo; of instance declarations is done
behind the scenes by the compiler, so it's hard to figure out exactly where it is done.
The easiest thing is to outlaw the offending types.</para>
1206 </listitem>
1207
1208 </itemizedlist>
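<para>
For orientation, here is a small sketch of an implicit parameter in a legal
position, namely the context of an ordinary function signature.  The names
<literal>indentLine</literal>, <literal>rendered</literal> and
<literal>?indent</literal> are hypothetical, and the <literal>let</literal> binding
form is the one accepted by recent GHC versions under the
<literal>ImplicitParams</literal> pragma, which is an assumption about the
compiler in use:
</para>

<programlisting>
```haskell
{-# LANGUAGE ImplicitParams #-}

-- ?indent appears in the context, not in the argument list.
indentLine :: (?indent :: Int) => String -> String
indentLine s = replicate ?indent ' ' ++ s

-- The caller decides the value by binding ?indent at the call site.
rendered :: String
rendered = let ?indent = 4 in indentLine "body"
```
</programlisting>

<para>
<literal>rendered</literal> is the string <literal>"body"</literal> preceded by four
spaces.  Which binding of <literal>?indent</literal> is picked up depends on where
<function>indentLine</function> is called, which is the reason given above for
banning implicit parameters from class and instance contexts.
</para>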
1209
1210 </sect1>
1211
1212
1213 <sect1 id="functional-dependencies">
1214 <title>Functional dependencies
1215 </title>
1216
1217 <para> Functional dependencies are implemented as described by Mark Jones
1218 in "Type Classes with Functional Dependencies", Mark P. Jones,
1219 In Proceedings of the 9th European Symposium on Programming,
1220 ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782.
1221 </para>
1222
1223 <para>
1224 There should be more documentation, but there isn't (yet).  Yell if you need it.
1225 </para>
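<para>
Until fuller documentation exists, the following sketch may help.  It is in the
spirit of the collection example from the paper cited above, but the class name
<literal>Collects</literal>, its methods and the list instance are assumptions
chosen for illustration, as are the <literal>LANGUAGE</literal> pragmas, which
assume a recent GHC:
</para>

<programlisting>
```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}

-- The dependency "c -> e" says the collection type determines the element type.
class Collects c e | c -> e where
  cempty  :: c
  cinsert :: e -> c -> c
  elems   :: c -> [e]

-- Hypothetical instance: a list of a's is a collection of a's.
instance Collects [a] a where
  cempty  = []
  cinsert = (:)
  elems   = id

-- From c = [Char] the dependency lets the compiler conclude e = Char,
-- so cempty is not ambiguous even though its type mentions only c.
sample :: String
sample = cinsert 'a' (cinsert 'b' cempty)
```
</programlisting>

<para>
<literal>sample</literal> evaluates to <literal>"ab"</literal>.  Without the
dependency, the type of <function>cempty</function> would not determine
<literal>e</literal>.
</para>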
1226 </sect1>
1227
1228
1229 <sect1 id="universal-quantification">
1230 <title>Explicit universal quantification
1231 </title>
1232
1233 <para>
1234 GHC's type system supports explicit universal quantification in
1235 constructor fields and function arguments.  This is useful for things
1236 like defining <literal>runST</literal> from the state-thread world.
1237 GHC's syntax for this now agrees with Hugs's, namely:
1238 </para>
1239
1240 <para>
1241
1242 <programlisting>
1243         forall a b. (Ord a, Eq  b) => a -> b -> a
1244 </programlisting>
1245
1246 </para>
1247
1248 <para>
1249 The context is, of course, optional.  You can't use <literal>forall</literal> as
1250 a type variable any more!
1251 </para>
1252
1253 <para>
1254 Haskell type signatures are implicitly quantified.  The <literal>forall</literal>
1255 allows us to say exactly what this means.  For example:
1256 </para>
1257
1258 <para>
1259
1260 <programlisting>
1261         g :: b -> b
1262 </programlisting>
1263
1264 </para>
1265
1266 <para>
1267 means this:
1268 </para>
1269
1270 <para>
1271
1272 <programlisting>
1273         g :: forall b. (b -> b)
1274 </programlisting>
1275
1276 </para>
1277
1278 <para>
1279 The two are treated identically.
1280 </para>
1281
1282 <sect2 id="univ">
1283 <title>Universally-quantified data type fields
1284 </title>
1285
1286 <para>
1287 In a <literal>data</literal> or <literal>newtype</literal> declaration one can quantify
1288 the types of the constructor arguments.  Here are several examples:
1289 </para>
1290
1291 <para>
1292
1293 <programlisting>
1294 data T a = T1 (forall b. b -> b -> b) a
1295
1296 data MonadT m = MkMonad { return :: forall a. a -> m a,
1297                           bind   :: forall a b. m a -> (a -> m b) -> m b
1298                         }
1299
1300 newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
1301 </programlisting>
1302
1303 </para>
1304
1305 <para>
The constructors now have so-called <emphasis>rank 2</emphasis> polymorphic
types, in which there is a for-all in the argument types:
1308 </para>
1309
1310 <para>
1311
1312 <programlisting>
1313 T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
1314 MkMonad :: forall m. (forall a. a -> m a)
1315                   -> (forall a b. m a -> (a -> m b) -> m b)
1316                   -> MonadT m
1317 MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
1318 </programlisting>
1319
1320 </para>
1321
1322 <para>
1323 Notice that you don't need to use a <literal>forall</literal> if there's an
1324 explicit context.  For example in the first argument of the
1325 constructor <function>MkSwizzle</function>, an implicit "<literal>forall a.</literal>" is
1326 prefixed to the argument type.  The implicit <literal>forall</literal>
1327 quantifies all type variables that are not already in scope, and are
1328 mentioned in the type quantified over.
1329 </para>
1330
1331 <para>
1332 As for type signatures, implicit quantification happens for non-overloaded
1333 types too.  So if you write this:
1334
1335 <programlisting>
1336   data T a = MkT (Either a b) (b -> b)
1337 </programlisting>
1338
1339 it's just as if you had written this:
1340
1341 <programlisting>
1342   data T a = MkT (forall b. Either a b) (forall b. b -> b)
1343 </programlisting>
1344
1345 That is, since the type variable <literal>b</literal> isn't in scope, it's
1346 implicitly universally quantified.  (Arguably, it would be better
1347 to <emphasis>require</emphasis> explicit quantification on constructor arguments
1348 where that is what is wanted.  Feedback welcomed.)
1349 </para>
1350
1351 </sect2>
1352
1353 <sect2>
1354 <title>Construction </title>
1355
1356 <para>
1357 You construct values of types <literal>T1, MonadT, Swizzle</literal> by applying
1358 the constructor to suitable values, just as usual.  For example,
1359 </para>
1360
1361 <para>
1362
1363 <programlisting>
(T1 (\x y -> x) 3) :: T Int
1365
1366 (MkSwizzle sort)    :: Swizzle
1367 (MkSwizzle reverse) :: Swizzle
1368
1369 (let r x = Just x
1370      b m k = case m of
1371                 Just y -> k y
1372                 Nothing -> Nothing
1373   in
1374   MkMonad r b) :: MonadT Maybe
1375 </programlisting>
1376
1377 </para>
1378
1379 <para>
1380 The type of the argument can, as usual, be more general than the type
1381 required, as <literal>(MkSwizzle reverse)</literal> shows.  (<function>reverse</function>
1382 does not need the <literal>Ord</literal> constraint.)
1383 </para>
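<para>
The constructions above can be collected into one small module.  This is a
sketch under some assumptions: it targets a recent GHC with the
<literal>RankNTypes</literal> pragma, it writes the <literal>forall</literal> in
<literal>Swizzle</literal> explicitly (recent GHC versions appear to require
this), and the helpers <literal>applySwizzle</literal> and
<literal>applyAll</literal> are hypothetical names added to make the values
observable:
</para>

<programlisting>
```haskell
{-# LANGUAGE RankNTypes #-}
import Data.List (sort)

data T a = T1 (forall b. b -> b -> b) a

-- The quantifier is written out rather than left implicit.
newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

t :: T Int
t = T1 (\x _ -> x) 3

-- reverse is more general than required: it does not use Ord at all.
swizzles :: [Swizzle]
swizzles = [MkSwizzle sort, MkSwizzle reverse]

-- Unwrap one Swizzle and run it on a list (hypothetical helper).
applySwizzle :: Ord a => Swizzle -> [a] -> [a]
applySwizzle (MkSwizzle s) xs = s xs

-- Run every Swizzle in the list above on the same input.
applyAll :: Ord a => [a] -> [[a]]
applyAll xs = map (\sw -> applySwizzle sw xs) swizzles
```
</programlisting>

<para>
For instance, <literal>applyAll [3,1,2]</literal> gives
<literal>[[1,2,3],[2,1,3]]</literal>: the sorted list followed by the reversed one.
</para>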
1384
1385 </sect2>
1386
1387 <sect2>
1388 <title>Pattern matching</title>
1389
1390 <para>
1391 When you use pattern matching, the bound variables may now have
1392 polymorphic types.  For example:
1393 </para>
1394
1395 <para>
1396
1397 <programlisting>
1398         f :: T a -> a -> (a, Char)
1399         f (T1 f k) x = (f k x, f 'c' 'd')
1400
1401         g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
1402         g (MkSwizzle s) xs f = s (map f (s xs))
1403
1404         h :: MonadT m -> [m a] -> m [a]
1405         h m [] = return m []
1406         h m (x:xs) = bind m x           $ \y ->
1407                       bind m (h m xs)   $ \ys ->
1408                       return m (y:ys)
1409 </programlisting>
1410
1411 </para>
1412
1413 <para>
1414 In the function <function>h</function> we use the record selectors <literal>return</literal>
1415 and <literal>bind</literal> to extract the polymorphic bind and return functions
1416 from the <literal>MonadT</literal> data structure, rather than using pattern
1417 matching.
1418 </para>
1419
1420 <para>
1421 You cannot pattern-match against an argument that is polymorphic.
1422 For example:
1423
1424 <programlisting>
1425         newtype TIM s a = TIM (ST s (Maybe a))
1426
1427         runTIM :: (forall s. TIM s a) -> Maybe a
1428         runTIM (TIM m) = runST m
1429 </programlisting>
1430
1431 </para>
1432
1433 <para>
1434 Here the pattern-match fails, because you can't pattern-match against
1435 an argument of type <literal>(forall s. TIM s a)</literal>.  Instead you
1436 must bind the variable and pattern match in the right hand side:
1437
1438 <programlisting>
1439         runTIM :: (forall s. TIM s a) -> Maybe a
1440         runTIM tm = case tm of { TIM m -> runST m }
1441 </programlisting>
1442
1443 The <literal>tm</literal> on the right hand side is (invisibly) instantiated, like
1444 any polymorphic value at its occurrence site, and now you can pattern-match
1445 against it.
1446 </para>
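<para>
The function <function>h</function> above can be tried out with a sketch like the
following.  To avoid a clash with the Prelude's <function>return</function>, the
record field is renamed <literal>ret</literal>.  That renaming, the names
<literal>maybeMonad</literal>, <literal>bindMaybe</literal> and
<literal>sequenceT</literal>, and the <literal>RankNTypes</literal> pragma (which
assumes a recent GHC) are not part of the text above:
</para>

<programlisting>
```haskell
{-# LANGUAGE RankNTypes #-}

data MonadT m = MkMonad
  { ret  :: forall a. a -> m a
  , bind :: forall a b. m a -> (a -> m b) -> m b
  }

bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe m k = case m of
  Just y  -> k y
  Nothing -> Nothing

maybeMonad :: MonadT Maybe
maybeMonad = MkMonad Just bindMaybe

-- Same shape as h: the selectors hand back polymorphic functions,
-- which are then used at whatever types each call needs.
sequenceT :: MonadT m -> [m a] -> m [a]
sequenceT m []     = ret m []
sequenceT m (x:xs) = bind m x                  $ \y ->
                     bind m (sequenceT m xs)   $ \ys ->
                     ret m (y:ys)
```
</programlisting>

<para>
With these definitions, <literal>sequenceT maybeMonad [Just 1, Just 2]</literal> is
<literal>Just [1,2]</literal>, and any <literal>Nothing</literal> in the input list
makes the whole result <literal>Nothing</literal>.
</para>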
1447
1448 </sect2>
1449
1450 <sect2>
1451 <title>The partial-application restriction</title>
1452
1453 <para>
1454 There is really only one way in which data structures with polymorphic
1455 components might surprise you: you must not partially apply them.
1456 For example, this is illegal:
1457 </para>
1458
1459 <para>
1460
1461 <programlisting>
1462         map MkSwizzle [sort, reverse]
1463 </programlisting>
1464
1465 </para>
1466
1467 <para>
1468 The restriction is this: <emphasis>every subexpression of the program must
1469 have a type that has no for-alls, except that in a function
1470 application (f e1&hellip;en) the partial applications are not subject to
1471 this rule</emphasis>.  The restriction makes type inference feasible.
1472 </para>
1473
1474 <para>
1475 In the illegal example, the sub-expression <literal>MkSwizzle</literal> has the
1476 polymorphic type <literal>(Ord b => [b] -> [b]) -> Swizzle</literal> and is not
1477 a sub-expression of an enclosing application.  On the other hand, this
1478 expression is OK:
1479 </para>
1480
1481 <para>
1482
1483 <programlisting>
1484         map (T1 (\a b -> a)) [1,2,3]
1485 </programlisting>
1486
1487 </para>
1488
1489 <para>
1490 even though it involves a partial application of <function>T1</function>, because
1491 the sub-expression <literal>T1 (\a b -> a)</literal> has type <literal>Int -> T
1492 Int</literal>.
1493 </para>
1494
1495 </sect2>
1496
1497 <sect2 id="sigs">
1498 <title>Type signatures
1499 </title>
1500
1501 <para>
1502 Once you have data constructors with universally-quantified fields, or
1503 constants such as <Constant>runST</Constant> that have rank-2 types, it isn't long
1504 before you discover that you need more!  Consider:
1505 </para>
1506
1507 <para>
1508
1509 <programlisting>
1510   mkTs f x y = [T1 f x, T1 f y]
1511 </programlisting>
1512
1513 </para>
1514
1515 <para>
<function>mkTs</function> is a function that constructs some values of type
<literal>T</literal>, using some pieces passed to it.  The trouble is that since
<literal>f</literal> is a function argument, Haskell assumes that it is
monomorphic, so we'll get a type error when applying <function>T1</function> to
it.  This is a rather silly example, but the problem really bites in
practice.  Lots of people trip over the fact that you can't make
"wrapper functions" for <Constant>runST</Constant> for exactly the same reason.
1523 In short, it is impossible to build abstractions around functions with
1524 rank-2 types.
1525 </para>
1526
1527 <para>
1528 The solution is fairly clear.  We provide the ability to give a rank-2
1529 type signature for <emphasis>ordinary</emphasis> functions (not only data
1530 constructors), thus:
1531 </para>
1532
1533 <para>
1534
1535 <programlisting>
  mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
1537   mkTs f x y = [T1 f x, T1 f y]
1538 </programlisting>
1539
1540 </para>
1541
1542 <para>
1543 This type signature tells the compiler to attribute <literal>f</literal> with
1544 the polymorphic type <literal>(forall b. b -> b -> b)</literal> when type
1545 checking the body of <function>mkTs</function>, so now the application of
1546 <function>T1</function> is fine.
1547 </para>
1548
1549 <para>
1550 There are two restrictions:
1551 </para>
1552
1553 <para>
1554
1555 <itemizedlist>
1556 <listitem>
1557
1558 <para>
1559  You can only define a rank 2 type, specified by the following
1560 grammar:
1561
1562
1563 <programlisting>
1564 rank2type ::= [forall tyvars .] [context =>] funty
1565 funty     ::= ([forall tyvars .] [context =>] ty) -> funty
1566             | ty
1567 ty        ::= ...current Haskell monotype syntax...
1568 </programlisting>
1569
1570
1571 Informally, the universal quantification must all be right at the beginning,
1572 or at the top level of a function argument.
1573
1574 </para>
1575 </listitem>
1576 <listitem>
1577
1578 <para>
1579  There is a restriction on the definition of a function whose
1580 type signature is a rank-2 type: the polymorphic arguments must be
1581 matched on the left hand side of the "<literal>=</literal>" sign.  You can't
1582 define <function>mkTs</function> like this:
1583
1584
1585 <programlisting>
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
1587 mkTs = \ f x y -> [T1 f x, T1 f y]
1588 </programlisting>
1589
1590
1591
1592 The same partial-application rule applies to ordinary functions with
1593 rank-2 types as applied to data constructors.
1594
1595 </para>
1596 </listitem>
1597
1598 </itemizedlist>
1599
1600 </para>
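<para>
A self-contained sketch of a rank-2 signature on an ordinary function follows.
The signature gives <function>mkTs</function> one polymorphic argument and two
arguments of type <literal>a</literal>, matching its three parameters.  The
helpers <literal>pick</literal> and <literal>firsts</literal> are hypothetical,
and the <literal>RankNTypes</literal> pragma assumes a recent GHC:
</para>

<programlisting>
```haskell
{-# LANGUAGE RankNTypes #-}

data T a = T1 (forall b. b -> b -> b) a

-- The polymorphic argument f is matched on the left of the "=" sign.
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
mkTs f x y = [T1 f x, T1 f y]

-- Use the stored function to choose between the stored value and a default.
pick :: T a -> a -> a
pick (T1 f k) x = f k x

firsts :: [Int]
firsts = map (\t -> pick t 0) (mkTs (\a _ -> a) 10 20)
```
</programlisting>

<para>
Because the function passed in always returns its first argument,
<literal>firsts</literal> is <literal>[10,20]</literal>.
</para>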
1601
1602 </sect2>
1603
1604
1605 <sect2 id="hoist">
1606 <title>Type synonyms and hoisting
1607 </title>
1608
1609 <para>
1610 GHC also allows you to write a <literal>forall</literal> in a type synonym, thus:
1611 <programlisting>
1612   type Discard a = forall b. a -> b -> a
1613
1614   f :: Discard a
1615   f x y = x
1616 </programlisting>
However, it is often convenient to use this sort of synonym at the right hand
end of an arrow, thus:
1619 <programlisting>
1620   type Discard a = forall b. a -> b -> a
1621
1622   g :: Int -> Discard Int
1623   g x y z = x+y
1624 </programlisting>
1625 Simply expanding the type synonym would give
1626 <programlisting>
1627   g :: Int -> (forall b. Int -> b -> Int)
1628 </programlisting>
1629 but GHC "hoists" the <literal>forall</literal> to give the isomorphic type
1630 <programlisting>
1631   g :: forall b. Int -> Int -> b -> Int
1632 </programlisting>
1633 In general, the rule is this: <emphasis>to determine the type specified by any explicit
1634 user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly
1635 performs the transformation:</emphasis>
1636 <programlisting>
1637   <emphasis>type1</emphasis> -> forall a. <emphasis>type2</emphasis>
1638 ==>
1639   forall a. <emphasis>type1</emphasis> -> <emphasis>type2</emphasis>
1640 </programlisting>
1641 (In fact, GHC tries to retain as much synonym information as possible for use in
1642 error messages, but that is a usability issue.)  This rule applies, of course, whether
1643 or not the <literal>forall</literal> comes from a synonym. For example, here is another
1644 valid way to write <literal>g</literal>'s type signature:
1645 <programlisting>
1646   g :: Int -> Int -> forall b. b -> Int
1647 </programlisting>
1648 </para>
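<para>
The two uses of <literal>Discard</literal> can be put in one module as a sketch.
It assumes a recent GHC with the <literal>RankNTypes</literal> pragma.  The
sketch only applies <function>g</function> to all of its arguments, so it does
not depend on how the compiler arranges the quantifier internally:
</para>

<programlisting>
```haskell
{-# LANGUAGE RankNTypes #-}

type Discard a = forall b. a -> b -> a

-- The synonym used as a whole signature.
f :: Discard a
f x _ = x

-- The synonym used at the right hand end of an arrow.
g :: Int -> Discard Int
g x y _ = x + y
```
</programlisting>

<para>
For example, <literal>f 'a' True</literal> is <literal>'a'</literal>, and
<literal>g 1 2 "ignored"</literal> is <literal>3</literal>, whatever the type of
the discarded argument.
</para>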
1649 </sect2>
1650
1651 </sect1>
1652
1653 <sect1 id="existential-quantification">
1654 <title>Existentially quantified data constructors
1655 </title>
1656
1657 <para>
1658 The idea of using existential quantification in data type declarations
was suggested by Laufer (I believe, though doubtless someone will
1660 correct me), and implemented in Hope+. It's been in Lennart
1661 Augustsson's <Command>hbc</Command> Haskell compiler for several years, and
1662 proved very useful.  Here's the idea.  Consider the declaration:
1663 </para>
1664
1665 <para>
1666
1667 <programlisting>
1668   data Foo = forall a. MkFoo a (a -> Bool)
1669            | Nil
1670 </programlisting>
1671
1672 </para>
1673
1674 <para>
1675 The data type <literal>Foo</literal> has two constructors with types:
1676 </para>
1677
1678 <para>
1679
1680 <programlisting>
1681   MkFoo :: forall a. a -> (a -> Bool) -> Foo
1682   Nil   :: Foo
1683 </programlisting>
1684
1685 </para>
1686
1687 <para>
1688 Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function>
1689 does not appear in the data type itself, which is plain <literal>Foo</literal>.
1690 For example, the following expression is fine:
1691 </para>
1692
1693 <para>
1694
1695 <programlisting>
1696   [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
1697 </programlisting>
1698
1699 </para>
1700
1701 <para>
1702 Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function
1703 <function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c'
1704 isUpper</function> packages a character with a compatible function.  These
1705 two things are each of type <literal>Foo</literal> and can be put in a list.
1706 </para>
1707
1708 <para>
What can we do with a value of type <literal>Foo</literal>?  In particular,
1710 what happens when we pattern-match on <function>MkFoo</function>?
1711 </para>
1712
1713 <para>
1714
1715 <programlisting>
1716   f (MkFoo val fn) = ???
1717 </programlisting>
1718
1719 </para>
1720
1721 <para>
1722 Since all we know about <literal>val</literal> and <function>fn</function> is that they
1723 are compatible, the only (useful) thing we can do with them is to
1724 apply <function>fn</function> to <literal>val</literal> to get a boolean.  For example:
1725 </para>
1726
1727 <para>
1728
1729 <programlisting>
1730   f :: Foo -> Bool
1731   f (MkFoo val fn) = fn val
1732 </programlisting>
1733
1734 </para>
1735
1736 <para>
What this allows us to do is to package heterogeneous values
1738 together with a bunch of functions that manipulate them, and then treat
1739 that collection of packages in a uniform manner.  You can express
1740 quite a bit of object-oriented-like programming this way.
1741 </para>
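<para>
Put together, the <literal>Foo</literal> example might look like the sketch below.
The <literal>Nil</literal> case of <function>f</function>, the list
<literal>foos</literal> and the name <literal>results</literal> are assumptions
added so that the module is complete, and the
<literal>ExistentialQuantification</literal> pragma assumes a recent GHC:
</para>

<programlisting>
```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

-- Values of different hidden types share one list.
foos :: [Foo]
foos = [MkFoo (4 :: Int) even, MkFoo 'c' isUpper, Nil]

f :: Foo -> Bool
f (MkFoo val fn) = fn val
f Nil            = False   -- assumption: an empty package counts as False

results :: [Bool]
results = map f foos
```
</programlisting>

<para>
<literal>results</literal> is <literal>[True,False,False]</literal>: 4 is even,
<literal>'c'</literal> is not upper case, and <literal>Nil</literal> falls to the
assumed default.
</para>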
1742
1743 <sect2 id="existential">
1744 <title>Why existential?
1745 </title>
1746
1747 <para>
1748 What has this to do with <emphasis>existential</emphasis> quantification?
1749 Simply that <function>MkFoo</function> has the (nearly) isomorphic type
1750 </para>
1751
1752 <para>
1753
1754 <programlisting>
1755   MkFoo :: (exists a . (a, a -> Bool)) -> Foo
1756 </programlisting>
1757
1758 </para>
1759
1760 <para>
1761 But Haskell programmers can safely think of the ordinary
1762 <emphasis>universally</emphasis> quantified type given above, thereby avoiding
1763 adding a new existential quantification construct.
1764 </para>
1765
1766 </sect2>
1767
1768 <sect2>
1769 <title>Type classes</title>
1770
1771 <para>
1772 An easy extension (implemented in <Command>hbc</Command>) is to allow
1773 arbitrary contexts before the constructor.  For example:
1774 </para>
1775
1776 <para>
1777
1778 <programlisting>
1779 data Baz = forall a. Eq a => Baz1 a a
1780          | forall b. Show b => Baz2 b (b -> b)
1781 </programlisting>
1782
1783 </para>
1784
1785 <para>
1786 The two constructors have the types you'd expect:
1787 </para>
1788
1789 <para>
1790
1791 <programlisting>
1792 Baz1 :: forall a. Eq a => a -> a -> Baz
1793 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
1794 </programlisting>
1795
1796 </para>
1797
1798 <para>
1799 But when pattern matching on <function>Baz1</function> the matched values can be compared
1800 for equality, and when pattern matching on <function>Baz2</function> the first matched
1801 value can be converted to a string (as well as applying the function to it).
1802 So this program is legal:
1803 </para>
1804
1805 <para>
1806
1807 <programlisting>
1808   f :: Baz -> String
1809   f (Baz1 p q) | p == q    = "Yes"
1810                | otherwise = "No"
  f (Baz2 v fn)            = show (fn v)
1812 </programlisting>
1813
1814 </para>
1815
1816 <para>
1817 Operationally, in a dictionary-passing implementation, the
1818 constructors <function>Baz1</function> and <function>Baz2</function> must store the
1819 dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and
extract them on pattern matching.
1821 </para>
1822
1823 <para>
1824 Notice the way that the syntax fits smoothly with that used for
1825 universal quantification earlier.
1826 </para>
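<para>
As a sketch, the <literal>Baz</literal> example can be run as follows, with the
second equation of <function>f</function> matching on <function>Baz2</function>.
The list <literal>answers</literal> is a hypothetical test value, and the
<literal>ExistentialQuantification</literal> pragma assumes a recent GHC:
</para>

<programlisting>
```haskell
{-# LANGUAGE ExistentialQuantification #-}

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

f :: Baz -> String
f (Baz1 p q) | p == q    = "Yes"   -- uses the Eq dictionary stored in Baz1
             | otherwise = "No"
f (Baz2 v fn)            = show (fn v)   -- uses the Show dictionary in Baz2

answers :: [String]
answers = map f [Baz1 'x' 'x', Baz1 True False, Baz2 (20 :: Int) (+ 1)]
```
</programlisting>

<para>
<literal>answers</literal> is <literal>["Yes","No","21"]</literal>.
</para>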
1827
1828 </sect2>
1829
1830 <sect2>
1831 <title>Restrictions</title>
1832
1833 <para>
1834 There are several restrictions on the ways in which existentially-quantified
constructors can be used.
1836 </para>
1837
1838 <para>
1839
1840 <itemizedlist>
1841 <listitem>
1842
1843 <para>
1844  When pattern matching, each pattern match introduces a new,
1845 distinct, type for each existential type variable.  These types cannot
1846 be unified with any other type, nor can they escape from the scope of
1847 the pattern match.  For example, these fragments are incorrect:
1848
1849
1850 <programlisting>
1851 f1 (MkFoo a f) = a
1852 </programlisting>
1853
1854
1855 Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal>
1856 is the result of <function>f1</function>.  One way to see why this is wrong is to
1857 ask what type <function>f1</function> has:
1858
1859
1860 <programlisting>
1861   f1 :: Foo -> a             -- Weird!
1862 </programlisting>
1863
1864
1865 What is this "<literal>a</literal>" in the result type? Clearly we don't mean
1866 this:
1867
1868
1869 <programlisting>
1870   f1 :: forall a. Foo -> a   -- Wrong!
1871 </programlisting>
1872
1873
The original program is just plain wrong.  Here's another sort of error:
1875
1876
1877 <programlisting>
1878   f2 (Baz1 a b) (Baz1 p q) = a==q
1879 </programlisting>
1880
1881
1882 It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but
1883 <literal>a==q</literal> is wrong because it equates the two distinct types arising
1884 from the two <function>Baz1</function> constructors.
1885
1886
1887 </para>
1888 </listitem>
1889 <listitem>
1890
1891 <para>
1892 You can't pattern-match on an existentially quantified
1893 constructor in a <literal>let</literal> or <literal>where</literal> group of
1894 bindings. So this is illegal:
1895
1896
1897 <programlisting>
1898   f3 x = a==b where { Baz1 a b = x }
1899 </programlisting>
1900
1901
1902 You can only pattern-match
1903 on an existentially-quantified constructor in a <literal>case</literal> expression or
1904 in the patterns of a function definition.
1905
1906 The reason for this restriction is really an implementation one.
1907 Type-checking binding groups is already a nightmare without
1908 existentials complicating the picture.  Also an existential pattern
1909 binding at the top level of a module doesn't make sense, because it's
1910 not clear how to prevent the existentially-quantified type "escaping".
1911 So for now, there's a simple-to-state restriction.  We'll see how
1912 annoying it is.
1913
1914 </para>
1915 </listitem>
1916 <listitem>
1917
1918 <para>
1919 You can't use existential quantification for <literal>newtype</literal>
1920 declarations.  So this is illegal:
1921
1922
1923 <programlisting>
1924   newtype T = forall a. Ord a => MkT a
1925 </programlisting>
1926
1927
1928 Reason: a value of type <literal>T</literal> must be represented as a pair
1929 of a dictionary for <literal>Ord t</literal> and a value of type <literal>t</literal>.
1930 That contradicts the idea that <literal>newtype</literal> should have no
1931 concrete representation.  You can get just the same efficiency and effect
1932 by using <literal>data</literal> instead of <literal>newtype</literal>.  If there is no
1933 overloading involved, then there is more of a case for allowing
an existentially-quantified <literal>newtype</literal>,
because the <literal>data</literal> version does carry an implementation cost,
1936 but single-field existentially quantified constructors aren't much
1937 use.  So the simple restriction (no existential stuff on <literal>newtype</literal>)
1938 stands, unless there are convincing reasons to change it.
1939
1940
1941 </para>
1942 </listitem>
1943 <listitem>
1944
1945 <para>
1946  You can't use <literal>deriving</literal> to define instances of a
1947 data type with existentially quantified data constructors.
1948
Reason: in most cases it would not make sense. For example:
1950
1951 <programlisting>
1952 data T = forall a. MkT [a] deriving( Eq )
1953 </programlisting>
1954
1955 To derive <literal>Eq</literal> in the standard way we would need to have equality
1956 between the single component of two <function>MkT</function> constructors:
1957
1958 <programlisting>
1959 instance Eq T where
1960   (MkT a) == (MkT b) = ???
1961 </programlisting>
1962
1963 But <VarName>a</VarName> and <VarName>b</VarName> have distinct types, and so can't be compared.
1964 It's just about possible to imagine examples in which the derived instance
1965 would make sense, but it seems altogether simpler simply to prohibit such
1966 declarations.  Define your own instances!
1967 </para>
1968 </listitem>
1969
1970 </itemizedlist>
1971
1972 </para>
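<para>
Each of the restrictions above has a straightforward workaround.  The
following is only a sketch: <literal>Baz</literal> is assumed to be declared
as earlier in this section, while <literal>T1</literal>, <literal>T2</literal>
and the length-based notion of equality are hypothetical choices made for
illustration.
</para>

<programlisting>
-- Compile with -fglasgow-exts.
data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b b

-- Match with case (or in a function argument) instead of let/where.
f3 :: Baz -> Bool
f3 x = case x of
         Baz1 a b -> a == b
         Baz2 _ _ -> False

-- Use data where an existential newtype would be rejected.
data T1 = forall a. Ord a => MkT1 a

-- Write the instance by hand where deriving is rejected.  Comparing
-- lengths is one arbitrary notion of equality; it never compares
-- elements of the two (possibly different) hidden types.
data T2 = forall a. MkT2 [a]

instance Eq T2 where
  MkT2 xs == MkT2 ys = length xs == length ys
</programlisting>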
1973
1974 </sect2>
1975
1976 </sect1>
1977
1978 <sect1 id="sec-assertions">
1979 <title>Assertions
1980 <indexterm><primary>Assertions</primary></indexterm>
1981 </title>
1982
1983 <para>
1984 If you want to make use of assertions in your standard Haskell code, you
1985 could define a function like the following:
1986 </para>
1987
1988 <para>
1989
1990 <programlisting>
1991 assert :: Bool -> a -> a
1992 assert False x = error "assertion failed!"
1993 assert _     x = x
1994 </programlisting>
1995
1996 </para>
1997
1998 <para>
1999 which works, but gives you back a less than useful error message --
2000 an assertion failed, but which and where?
2001 </para>
2002
2003 <para>
2004 One way out is to define an extended <function>assert</function> function which also
2005 takes a descriptive string to include in the error message and
2006 perhaps combine this with the use of a pre-processor which inserts
2007 the source location where <function>assert</function> was used.
2008 </para>
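<para>
A minimal sketch of that hand-rolled approach follows.  The name
<function>assertMsg</function> and the message text are hypothetical, and
since no pre-processor is involved the location has to be written out by
hand:
</para>

<programlisting>
-- assertMsg is a hypothetical helper, not part of any library.
assertMsg :: String -> Bool -> a -> a
assertMsg msg False _ = error ("assertion failed: " ++ msg)
assertMsg _   True  x = x

kelvinToC :: Double -> Double
kelvinToC k = assertMsg "Main.hs: kelvinToC: negative temperature"
                        (k >= 0.0) (k - 273.15)
</programlisting>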
2009
2010 <para>
GHC offers a helping hand here, doing all of this for you. For every
2012 use of <function>assert</function> in the user's source:
2013 </para>
2014
2015 <para>
2016
2017 <programlisting>
2018 kelvinToC :: Double -> Double
kelvinToC k = assert (k &gt;= 0.0) (k-273.15)
2020 </programlisting>
2021
2022 </para>
2023
2024 <para>
GHC will rewrite this to also include the source location where the
2026 assertion was made,
2027 </para>
2028
2029 <para>
2030
2031 <programlisting>
2032 assert pred val ==> assertError "Main.hs|15" pred val
2033 </programlisting>
2034
2035 </para>
2036
2037 <para>
2038 The rewrite is only performed by the compiler when it spots
2039 applications of <function>Exception.assert</function>, so you can still define and
2040 use your own versions of <function>assert</function>, should you so wish. If not,
import <literal>Exception</literal> to make use of <function>assert</function> in your code.
2042 </para>
2043
2044 <para>
2045 To have the compiler ignore uses of assert, use the compiler option
2046 <option>-fignore-asserts</option>. <indexterm><primary>-fignore-asserts option</primary></indexterm> That is,
2047 expressions of the form <literal>assert pred e</literal> will be rewritten to <literal>e</literal>.
2048 </para>
2049
2050 <para>
Assertion failures can be caught; see the documentation for the
2052 <literal>Exception</literal> library (<xref linkend="sec-Exception">)
2053 for the details.
2054 </para>
2055
2056 </sect1>
2057
2058 <sect1 id="scoped-type-variables">
2059 <title>Scoped Type Variables
2060 </title>
2061
2062 <para>
2063 A <emphasis>pattern type signature</emphasis> can introduce a <emphasis>scoped type
2064 variable</emphasis>.  For example
2065 </para>
2066
2067 <para>
2068
2069 <programlisting>
2070 f (xs::[a]) = ys ++ ys
2071            where
2072               ys :: [a]
2073               ys = reverse xs
2074 </programlisting>
2075
2076 </para>
2077
2078 <para>
2079 The pattern <literal>(xs::[a])</literal> includes a type signature for <VarName>xs</VarName>.
2080 This brings the type variable <literal>a</literal> into scope; it scopes over
2081 all the patterns and right hand sides for this equation for <function>f</function>.
In particular, it is in scope at the type signature for <VarName>ys</VarName>.
2083 </para>
2084
2085 <para>
2086  Pattern type signatures are completely orthogonal to ordinary, separate
2087 type signatures.  The two can be used independently or together.
2088 At ordinary type signatures, such as that for <VarName>ys</VarName>, any type variables
2089 mentioned in the type signature <emphasis>that are not in scope</emphasis> are
2090 implicitly universally quantified.  (If there are no type variables in
2091 scope, all type variables mentioned in the signature are universally
2092 quantified, which is just as in Haskell 98.)  In this case, since <VarName>a</VarName>
2093 is in scope, it is not universally quantified, so the type of <VarName>ys</VarName> is
2094 the same as that of <VarName>xs</VarName>.  In Haskell 98 it is not possible to declare
2095 a type for <VarName>ys</VarName>; a major benefit of scoped type variables is that
2096 it becomes possible to do so.
2097 </para>
2098
2099 <para>
2100 Scoped type variables are implemented in both GHC and Hugs.  Where the
2101 implementations differ from the specification below, those differences
2102 are noted.
2103 </para>
2104
2105 <para>
2106 So much for the basic idea.  Here are the details.
2107 </para>
2108
2109 <sect2>
2110 <title>What a pattern type signature means</title>
2111 <para>
A type variable brought into scope by a pattern type signature is simply
the name for a type.   The only restriction such names express is that all occurrences
of the same name mean the same type.  For example:
2115 <programlisting>
2116   f :: [Int] -> Int -> Int
2117   f (xs::[a]) (y::a) = (head xs + y) :: a
2118 </programlisting>
2119 The pattern type signatures on the left hand side of
2120 <literal>f</literal> express the fact that <literal>xs</literal>
2121 must be a list of things of some type <literal>a</literal>; and that <literal>y</literal>
must have this same type.  The type signature on the expression <literal>(head xs + y)</literal>
2123 specifies that this expression must have the same type <literal>a</literal>.
2124 <emphasis>There is no requirement that the type named by "<literal>a</literal>" is
2125 in fact a type variable</emphasis>.  Indeed, in this case, the type named by "<literal>a</literal>" is
2126 <literal>Int</literal>.  (This is a slight liberalisation from the original rather complex
2127 rules, which specified that a pattern-bound type variable should be universally quantified.)
2128 For example, all of these are legal:</para>
2129
2130 <programlisting>
2131   t (x::a) (y::a) = x+y*2
2132
2133   f (x::a) (y::b) = [x,y]       -- a unifies with b
2134
2135   g (x::a) = x + 1::Int         -- a unifies with Int
2136
2137   h x = let k (y::a) = [x,y]    -- a is free in the
2138         in k x                  -- environment
2139
2140   k (x::a) True    = ...        -- a unifies with Int
2141   k (x::Int) False = ...
2142
2143   w :: [b] -> [b]
2144   w (x::a) = x                  -- a unifies with [b]
2145 </programlisting>
2146
2147 </sect2>
2148
2149 <sect2>
2150 <title>Scope and implicit quantification</title>
2151
2152 <para>
2153
2154 <itemizedlist>
2155 <listitem>
2156
2157 <para>
2158 All the type variables mentioned in a pattern,
2159 that are not already in scope,
2160 are brought into scope by the pattern.  We describe this set as
2161 the <emphasis>type variables bound by the pattern</emphasis>.
2162 For example:
2163 <programlisting>
2164   f (x::a) = let g (y::(a,b)) = fst y
2165              in
2166              g (x,True)
2167 </programlisting>
2168 The pattern <literal>(x::a)</literal> brings the type variable
2169 <literal>a</literal> into scope, as well as the term
2170 variable <literal>x</literal>.  The pattern <literal>(y::(a,b))</literal>
2171 contains an occurrence of the already-in-scope type variable <literal>a</literal>,
2172 and brings into scope the type variable <literal>b</literal>.
2173 </para>
2174 </listitem>
2175
2176 <listitem>
2177 <para>
2178  The type variables thus brought into scope may be mentioned
2179 in ordinary type signatures or pattern type signatures anywhere within
2180 their scope.
2181
2182 </para>
2183 </listitem>
2184
2185 <listitem>
2186 <para>
2187  In ordinary type signatures, any type variable mentioned in the
2188 signature that is in scope is <emphasis>not</emphasis> universally quantified.
2189
2190 </para>
2191 </listitem>
2192
2193 <listitem>
2194
2195 <para>
2196  Ordinary type signatures do not bring any new type variables
2197 into scope (except in the type signature itself!). So this is illegal:
2198
2199 <programlisting>
2200   f :: a -> a
2201   f x = x::a
2202 </programlisting>
2203
2204 It's illegal because <VarName>a</VarName> is not in scope in the body of <function>f</function>,
2205 so the ordinary signature <literal>x::a</literal> is equivalent to <literal>x::forall a.a</literal>;
2206 and that is an incorrect typing.
2207
2208 </para>
2209 </listitem>
2210
2211 <listitem>
2212 <para>
2213  There is no implicit universal quantification on pattern type
2214 signatures, nor may one write an explicit <literal>forall</literal> type in a pattern
2215 type signature.  The pattern type signature is a monotype.
2216
2217 </para>
2218 </listitem>
2219
2220 <listitem>
2221 <para>
2222
2223 The type variables in the head of a <literal>class</literal> or <literal>instance</literal> declaration
2224 scope over the methods defined in the <literal>where</literal> part.  For example:
2225
2226
2227 <programlisting>
2228   class C a where
2229     op :: [a] -> a
2230
2231     op xs = let ys::[a]
2232                 ys = reverse xs
2233             in
2234             head ys
2235 </programlisting>
2236
2237
2238 (Not implemented in Hugs yet, Dec 98).
2239 </para>
2240 </listitem>
2241
2242 </itemizedlist>
2243
2244 </para>
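<para>
The scoping rules above can be seen together in a small sketch.  The
function names are hypothetical.  The first definition is the legal
counterpart of the illegal one in the list: the pattern type signature,
unlike the separate signature, brings <literal>a</literal> into scope in
the body.  The second shows a local signature that mixes an in-scope
type variable with a fresh one.
</para>

<programlisting>
-- Compile with -fglasgow-exts.

-- The pattern (x::a) binds the type variable a, so the signature
-- on the right-hand side refers to that same type.
ident :: b -> b
ident (x::a) = x :: a

tagAll :: [b] -> [(Bool, b)]
tagAll (xs::[a]) = map (tag True) xs
  where
    -- a is in scope, so it is not quantified here; c is not in
    -- scope, so it is implicitly universally quantified.
    tag :: c -> a -> (c, a)
    tag t v = (t, v)
</programlisting>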
2245
2246 </sect2>
2247
2248 <sect2>
2249 <title>Result type signatures</title>
2250
2251 <para>
2252
2253 <itemizedlist>
2254 <listitem>
2255
2256 <para>
2257  The result type of a function can be given a signature,
2258 thus:
2259
2260
2261 <programlisting>
2262   f (x::a) :: [a] = [x,x,x]
2263 </programlisting>
2264
2265
2266 The final <literal>:: [a]</literal> after all the patterns gives a signature to the
2267 result type.  Sometimes this is the only way of naming the type variable
2268 you want:
2269
2270
2271 <programlisting>
2272   f :: Int -> [a] -> [a]
2273   f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
2274                         in \xs -> map g (reverse xs `zip` xs)
2275 </programlisting>
2276
2277
2278 </para>
2279 </listitem>
2280
2281 </itemizedlist>
2282
2283 </para>
2284
2285 <para>
2286 Result type signatures are not yet implemented in Hugs.
2287 </para>
2288
2289 </sect2>
2290
2291 <sect2>
2292 <title>Where a pattern type signature can occur</title>
2293
2294 <para>
2295 A pattern type signature can occur in any pattern, but there
2296 are restrictions on pattern bindings:
2297 <itemizedlist>
2298
2299 <listitem>
2300 <para>
2301 A pattern type signature can be on an arbitrary sub-pattern, not
just on a variable:
2303
2304
2305 <programlisting>
2306   f ((x,y)::(a,b)) = (y,x) :: (b,a)
2307 </programlisting>
2308
2309
2310 </para>
2311 </listitem>
2312 <listitem>
2313
2314 <para>
2315  Pattern type signatures, including the result part, can be used
2316 in lambda abstractions:
2317
2318 <programlisting>
2319   (\ (x::a, y) :: a -> x)
2320 </programlisting>
2321 </para>
2322 </listitem>
2323 <listitem>
2324
2325 <para>
2326  Pattern type signatures, including the result part, can be used
2327 in <literal>case</literal> expressions:
2328
2329
2330 <programlisting>
2331   case e of { (x::a, y) :: a -> x }
2332 </programlisting>
2333
2334 </para>
2335 </listitem>
2336
2337 <listitem>
2338 <para>
2339 To avoid ambiguity, the type after the &ldquo;<literal>::</literal>&rdquo; in a result
2340 pattern signature on a lambda or <literal>case</literal> must be atomic (i.e. a single
2341 token or a parenthesised type of some sort).  To see why,
2342 consider how one would parse this:
2343
2344
2345 <programlisting>
2346   \ x :: a -> b -> x
2347 </programlisting>
2348
2349
2350 </para>
2351 </listitem>
2352
2353 <listitem>
2354
2355 <para>
2356  Pattern type signatures can bind existential type variables.
2357 For example:
2358
2359
2360 <programlisting>
2361   data T = forall a. MkT [a]
2362
2363   f :: T -> T
2364   f (MkT [t::a]) = MkT t3
2365                  where
2366                    t3::[a] = [t,t,t]
2367 </programlisting>
2368
2369
2370 </para>
2371 </listitem>
2372
2373
2374 <listitem>
2375
2376 <para>
2377 Pattern type signatures that bind new type variables
2378 may not be used in pattern bindings at all.
2379 So this is illegal:
2380
2381
2382 <programlisting>
2383   f x = let (y, z::a) = x in ...
2384 </programlisting>
2385
2386
2387 But these are OK, because they do not bind fresh type variables:
2388
2389
2390 <programlisting>
2391   f1 x            = let (y, z::Int) = x in ...
2392   f2 (x::(Int,a)) = let (y, z::a)   = x in ...
2393 </programlisting>
2394
2395
2396 However a single variable is considered a degenerate function binding,
rather than a degenerate pattern binding, so this is permitted, even
2398 though it binds a type variable:
2399
2400
2401 <programlisting>
2402   f :: (b->b) = \(x::b) -> x
2403 </programlisting>
2404
2405
2406 </para>
2407 </listitem>
2408
2409 </itemizedlist>
2410
Such degenerate function bindings do not fall under the monomorphism
2412 restriction.  Thus:
2413 </para>
2414
2415 <para>
2416
2417 <programlisting>
  g :: a -> a -> Bool = \x y -> x==y
2419 </programlisting>
2420
2421 </para>
2422
2423 <para>
2424 Here <function>g</function> has type <literal>forall a. Eq a =&gt; a -&gt; a -&gt; Bool</literal>, just as if
2425 <function>g</function> had a separate type signature.  Lacking a type signature, <function>g</function>
2426 would get a monomorphic type.
2427 </para>
2428
2429 </sect2>
2430
2431
2432 </sect1>
2433
2434   <sect1 id="pragmas">
2435     <title>Pragmas</title>
2436
2437     <indexterm><primary>pragma</primary></indexterm>
2438
2439     <para>GHC supports several pragmas, or instructions to the
2440     compiler placed in the source code.  Pragmas don't normally affect
2441     the meaning of the program, but they might affect the efficiency
2442     of the generated code.</para>
2443
2444     <para>Pragmas all take the form
2445
2446 <literal>{-# <replaceable>word</replaceable> ... #-}</literal>
2447
2448     where <replaceable>word</replaceable> indicates the type of
2449     pragma, and is followed optionally by information specific to that
2450     type of pragma.  Case is ignored in
2451     <replaceable>word</replaceable>.  The various values for
2452     <replaceable>word</replaceable> that GHC understands are described
2453     in the following sections; any pragma encountered with an
2454     unrecognised <replaceable>word</replaceable> is (silently)
2455     ignored.</para>
2456
2457 <sect2 id="inline-pragma">
2458 <title>INLINE pragma
2459
2460 <indexterm><primary>INLINE pragma</primary></indexterm>
2461 <indexterm><primary>pragma, INLINE</primary></indexterm></title>
2462
2463 <para>
2464 GHC (with <option>-O</option>, as always) tries to inline (or &ldquo;unfold&rdquo;)
2465 functions/values that are &ldquo;small enough,&rdquo; thus avoiding the call
2466 overhead and possibly exposing other more-wonderful optimisations.
2467 </para>
2468
2469 <para>
2470 You will probably see these unfoldings (in Core syntax) in your
2471 interface files.
2472 </para>
2473
2474 <para>
2475 Normally, if GHC decides a function is &ldquo;too expensive&rdquo; to inline, it
2476 will not do so, nor will it export that unfolding for other modules to
2477 use.
2478 </para>
2479
2480 <para>
2481 The sledgehammer you can bring to bear is the
2482 <literal>INLINE</literal><indexterm><primary>INLINE pragma</primary></indexterm> pragma, used thusly:
2483
2484 <programlisting>
2485 key_function :: Int -> String -> (Bool, Double)
2486
2487 #ifdef __GLASGOW_HASKELL__
2488 {-# INLINE key_function #-}
2489 #endif
2490 </programlisting>
2491
2492 (You don't need to do the C pre-processor carry-on unless you're going
2493 to stick the code through HBC&mdash;it doesn't like <literal>INLINE</literal> pragmas.)
2494 </para>
2495
2496 <para>
2497 The major effect of an <literal>INLINE</literal> pragma is to declare a function's
2498 &ldquo;cost&rdquo; to be very low.  The normal unfolding machinery will then be
2499 very keen to inline it.
2500 </para>
2501
2502 <para>
2503 An <literal>INLINE</literal> pragma for a function can be put anywhere its type
2504 signature could be put.
2505 </para>
2506
2507 <para>
2508 <literal>INLINE</literal> pragmas are a particularly good idea for the
2509 <literal>then</literal>/<literal>return</literal> (or <literal>bind</literal>/<literal>unit</literal>) functions in a monad.
2510 For example, in GHC's own <literal>UniqueSupply</literal> monad code, we have:
2511
2512 <programlisting>
2513 #ifdef __GLASGOW_HASKELL__
2514 {-# INLINE thenUs #-}
2515 {-# INLINE returnUs #-}
2516 #endif
2517 </programlisting>
2518
2519 </para>
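<para>
GHC's real <literal>UniqueSupply</literal> code is not reproduced here.
As a rough illustration of why these two functions are good candidates,
the sketch below uses a hypothetical, much-simplified monad that threads
an <literal>Int</literal> counter; <literal>UniqSM</literal>,
<function>getUnique</function> and <function>runUs</function> are
assumptions made for the example.  Both combinators are tiny wrappers
around a lambda, so inlining them exposes the underlying state passing
to the optimiser.
</para>

<programlisting>
-- A hypothetical, much-simplified unique-supply monad: a state
-- transformer over an Int counter.
newtype UniqSM a = UniqSM (Int -> (a, Int))

{-# INLINE thenUs #-}
thenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b
thenUs (UniqSM m) k = UniqSM (\s -> case m s of
                                      (a, s') -> case k a of
                                                   UniqSM m' -> m' s')

{-# INLINE returnUs #-}
returnUs :: a -> UniqSM a
returnUs a = UniqSM (\s -> (a, s))

-- Hand out the current counter value and bump the counter.
getUnique :: UniqSM Int
getUnique = UniqSM (\s -> (s, s + 1))

-- Run a computation from an initial counter, discarding the final one.
runUs :: UniqSM a -> Int -> a
runUs (UniqSM m) s = fst (m s)
</programlisting>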
2520
2521 </sect2>
2522
2523 <sect2 id="noinline-pragma">
2524 <title>NOINLINE pragma
2525 </title>
2526
2527 <indexterm><primary>NOINLINE pragma</primary></indexterm>
2528 <indexterm><primary>pragma</primary><secondary>NOINLINE</secondary></indexterm>
2529 <indexterm><primary>NOTINLINE pragma</primary></indexterm>
2530 <indexterm><primary>pragma</primary><secondary>NOTINLINE</secondary></indexterm>
2531
2532 <para>
2533 The <literal>NOINLINE</literal> pragma does exactly what you'd expect:
2534 it stops the named function from being inlined by the compiler.  You
2535 shouldn't ever need to do this, unless you're very cautious about code
2536 size.
2537 </para>
2538
2539 <para><literal>NOTINLINE</literal> is a synonym for
2540 <literal>NOINLINE</literal> (<literal>NOTINLINE</literal> is specified
2541 by Haskell 98 as the standard way to disable inlining, so it should be
2542 used if you want your code to be portable).</para>
2543
2544 </sect2>
2545
2546     <sect2 id="specialize-pragma">
2547       <title>SPECIALIZE pragma</title>
2548
2549       <indexterm><primary>SPECIALIZE pragma</primary></indexterm>
2550       <indexterm><primary>pragma, SPECIALIZE</primary></indexterm>
2551       <indexterm><primary>overloading, death to</primary></indexterm>
2552
2553       <para>(UK spelling also accepted.)  For key overloaded
2554       functions, you can create extra versions (NB: more code space)
2555       specialised to particular types.  Thus, if you have an
2556       overloaded function:</para>
2557
2558 <programlisting>
2559 hammeredLookup :: Ord key => [(key, value)] -> key -> value
2560 </programlisting>
2561
2562       <para>If it is heavily used on lists with
2563       <literal>Widget</literal> keys, you could specialise it as
2564       follows:</para>
2565
2566 <programlisting>
2567 {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
2568 </programlisting>
2569
2570       <para>To get very fancy, you can also specify a named function
2571       to use for the specialised value, as in:</para>
2572
2573 <programlisting>
{-# RULES "hammeredLookup" hammeredLookup = blah #-}
2575 </programlisting>
2576
2577       <para>where <literal>blah</literal> is an implementation of
      <literal>hammeredLookup</literal> written specially for
2579       <literal>Widget</literal> lookups.  It's <emphasis>Your
2580       Responsibility</emphasis> to make sure that
2581       <function>blah</function> really behaves as a specialised
2582       version of <function>hammeredLookup</function>!!!</para>
2583
      <para>Note we use the <literal>RULES</literal> pragma here to
2585       indicate that <literal>hammeredLookup</literal> applied at a
2586       certain type should be replaced by <literal>blah</literal>.  See
2587       <xref linkend="rules"> for more information on
2588       <literal>RULES</literal>.</para>
2589
2590       <para>An example in which using <literal>RULES</literal> for
2591       specialisation will Win Big:
2592
2593 <programlisting>
2594 toDouble :: Real a => a -> Double
2595 toDouble = fromRational . toRational
2596
{-# RULES "toDouble/Int" toDouble = i2d #-}
2598 i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
2599 </programlisting>
2600
2601       The <function>i2d</function> function is virtually one machine
2602       instruction; the default conversion&mdash;via an intermediate
2603       <literal>Rational</literal>&mdash;is obscenely expensive by
2604       comparison.</para>
2605
2606       <para>A <literal>SPECIALIZE</literal> pragma for a function can
2607       be put anywhere its type signature could be put.</para>
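      <para>Putting the pieces together, a complete definition with
      its <literal>SPECIALIZE</literal> pragma might look like the
      sketch below.  The <literal>Widget</literal> type and the body
      of <function>hammeredLookup</function> are assumptions made so
      that the fragment is self-contained.</para>

<programlisting>
-- Widget is a hypothetical key type, invented for this sketch.
newtype Widget = Widget Int deriving (Eq, Ord)

hammeredLookup :: Ord key => [(key, value)] -> key -> value
{-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
hammeredLookup [] _ = error "hammeredLookup: key not found"
hammeredLookup ((k, v) : rest) key
  | key == k  = v
  | otherwise = hammeredLookup rest key
</programlisting>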
2608
2609     </sect2>
2610
2611 <sect2 id="specialize-instance-pragma">
2612 <title>SPECIALIZE instance pragma
2613 </title>
2614
2615 <para>
2616 <indexterm><primary>SPECIALIZE pragma</primary></indexterm>
2617 <indexterm><primary>overloading, death to</primary></indexterm>
2618 Same idea, except for instance declarations.  For example:
2619
2620 <programlisting>
2621 instance (Eq a) => Eq (Foo a) where { ... usual stuff ... }
2622
{-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
2624 </programlisting>
2625
2626 Compatible with HBC, by the way.
2627 </para>
2628
2629 </sect2>
2630
2631 <sect2 id="line-pragma">
2632 <title>LINE pragma
2633 </title>
2634
2635 <para>
2636 <indexterm><primary>LINE pragma</primary></indexterm>
2637 <indexterm><primary>pragma, LINE</primary></indexterm>
2638 </para>
2639
2640 <para>
2641 This pragma is similar to C's <literal>&num;line</literal> pragma, and is mainly for use in
2642 automatically generated Haskell code.  It lets you specify the line
2643 number and filename of the original code; for example
2644 </para>
2645
2646 <para>
2647
2648 <programlisting>
2649 {-# LINE 42 "Foo.vhs" #-}
2650 </programlisting>
2651
2652 </para>
2653
2654 <para>
2655 if you'd generated the current file from something called <filename>Foo.vhs</filename>
2656 and this line corresponds to line 42 in the original.  GHC will adjust
2657 its error messages to refer to the line/file named in the <literal>LINE</literal>
2658 pragma.
2659 </para>
2660
2661 </sect2>
2662
2663 <sect2 id="rules">
2664 <title>RULES pragma</title>
2665
2666 <para>
2667 The RULES pragma lets you specify rewrite rules.  It is described in
2668 <xref LinkEnd="rewrite-rules">.
2669 </para>
2670
2671 </sect2>
2672
2673 <sect2 id="deprecated-pragma">
2674 <title>DEPRECATED pragma</title>
2675
2676 <para>
2677 The DEPRECATED pragma lets you specify that a particular function, class, or type, is deprecated.
2678 There are two forms.
2679 </para>
2680 <itemizedlist>
2681 <listitem><para>
2682 You can deprecate an entire module thus:</para>
2683 <programlisting>
2684    module Wibble {-# DEPRECATED "Use Wobble instead" #-} where
2685      ...
2686 </programlisting>
2687 <para>
When you compile any module that imports <literal>Wibble</literal>, GHC will print
2689 the specified message.</para>
2690 </listitem>
2691
2692 <listitem>
2693 <para>
2694 You can deprecate a function, class, or type, with the following top-level declaration:
2695 </para>
2696 <programlisting>
2697    {-# DEPRECATED f, C, T "Don't use these" #-}
2698 </programlisting>
2699 <para>
When you compile any module that imports and uses any of the specified entities,
2701 GHC will print the specified message.
2702 </para>
2703 </listitem>
2704 </itemizedlist>
2705 <para>You can suppress the warnings with the flag <option>-fno-warn-deprecations</option>.</para>
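<para>
As a sketch of the second form in context, the fragment below deprecates
one function in favour of another.  The names <function>oldSum</function>
and <function>newSum</function> are hypothetical:
</para>

<programlisting>
-- oldSum and newSum are hypothetical names.
{-# DEPRECATED oldSum "Use newSum instead" #-}
oldSum :: [Int] -> Int
oldSum = foldr (+) 0

newSum :: [Int] -> Int
newSum = sum
</programlisting>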
2706
2707 </sect2>
2708
2709 </sect1>
2710
2711 <sect1 id="rewrite-rules">
2712 <title>Rewrite rules
2713
<indexterm><primary>RULES pragma</primary></indexterm>
2715 <indexterm><primary>pragma, RULES</primary></indexterm>
2716 <indexterm><primary>rewrite rules</primary></indexterm></title>
2717
2718 <para>
2719 The programmer can specify rewrite rules as part of the source program
2720 (in a pragma).  GHC applies these rewrite rules wherever it can.
2721 </para>
2722
2723 <para>
2724 Here is an example:
2725
2726 <programlisting>
2727   {-# RULES
2728         "map/map"       forall f g xs. map f (map g xs) = map (f.g) xs
2729   #-}
2730 </programlisting>
2731
2732 </para>
2733
2734 <sect2>
2735 <title>Syntax</title>
2736
2737 <para>
2738 From a syntactic point of view:
2739
2740 <itemizedlist>
2741 <listitem>
2742
2743 <para>
2744  Each rule has a name, enclosed in double quotes.  The name itself has
2745 no significance at all.  It is only used when reporting how many times the rule fired.
2746 </para>
2747 </listitem>
2748 <listitem>
2749
2750 <para>
2751  There may be zero or more rules in a <literal>RULES</literal> pragma.
2752 </para>
2753 </listitem>
2754 <listitem>
2755
2756 <para>
2757  Layout applies in a <literal>RULES</literal> pragma.  Currently no new indentation level
2758 is set, so you must lay out your rules starting in the same column as the
2759 enclosing definitions.
2760 </para>
2761 </listitem>
2762 <listitem>
2763
2764 <para>
2765  Each variable mentioned in a rule must either be in scope (e.g. <function>map</function>),
2766 or bound by the <literal>forall</literal> (e.g. <function>f</function>, <function>g</function>, <function>xs</function>).  The variables bound by
2767 the <literal>forall</literal> are called the <emphasis>pattern</emphasis> variables.  They are separated
2768 by spaces, just like in a type <literal>forall</literal>.
2769 </para>
2770 </listitem>
2771 <listitem>
2772
2773 <para>
2774  A pattern variable may optionally have a type signature.
2775 If the type of the pattern variable is polymorphic, it <emphasis>must</emphasis> have a type signature.
2776 For example, here is the <literal>foldr/build</literal> rule:
2777
2778 <programlisting>
2779 "fold/build"  forall k z (g::forall b. (a->b->b) -> b -> b) .
2780               foldr k z (build g) = g k z
2781 </programlisting>
2782
2783 Since <function>g</function> has a polymorphic type, it must have a type signature.
2784
2785 </para>
2786 </listitem>
2787 <listitem>
2788
2789 <para>
2790 The left hand side of a rule must consist of a top-level variable applied
2791 to arbitrary expressions.  For example, this is <emphasis>not</emphasis> OK:
2792
2793 <programlisting>
2794 "wrong1"   forall e1 e2.  case True of { True -> e1; False -> e2 } = e1
2795 "wrong2"   forall f.      f True = True
2796 </programlisting>
2797
2798 In <literal>"wrong1"</literal>, the LHS is not an application; in <literal>"wrong2"</literal>, the LHS has a pattern variable
2799 in the head.
2800 </para>
2801 </listitem>
2802 <listitem>
2803
2804 <para>
2805  A rule does not need to be in the same module as (any of) the
2806 variables it mentions, though of course they need to be in scope.
2807 </para>
2808 </listitem>
2809 <listitem>
2810
2811 <para>
2812  Rules are automatically exported from a module, just as instance declarations are.
2813 </para>
2814 </listitem>
2815
2816 </itemizedlist>
2817
2818 </para>
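<para>
The syntactic points above can be seen in one complete pragma.  This is
a sketch only: <function>keep</function> and <function>both</function>
are hypothetical functions defined locally so that every name in the
rules is in scope.  The pragma holds two rules, each with a quoted name,
each starting in the same column as the enclosing definitions, and each
with a top-level variable at the head of its left hand side.
</para>

<programlisting>
-- keep is a hypothetical stand-in for filter.
keep :: (a -> Bool) -> [a] -> [a]
keep _ [] = []
keep p (x:xs)
  | p x       = x : keep p xs
  | otherwise = keep p xs

-- both q p holds when q holds and then p holds.
both :: (a -> Bool) -> (a -> Bool) -> a -> Bool
both q p x = if q x then p x else False

{-# RULES
"keep/keep"  forall p q xs.  keep p (keep q xs) = keep (both q p) xs
"keep/nil"   forall p.       keep p [] = []
  #-}
</programlisting>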
2819
2820 </sect2>
2821
2822 <sect2>
2823 <title>Semantics</title>
2824
2825 <para>
2826 From a semantic point of view:
2827
2828 <itemizedlist>
2829 <listitem>
2830
2831 <para>
2832 Rules are only applied if you use the <option>-O</option> flag.
2833 </para>
2834 </listitem>
2835
2836 <listitem>
2837 <para>
2838  Rules are regarded as left-to-right rewrite rules.
2839 When GHC finds an expression that is a substitution instance of the LHS
2840 of a rule, it replaces the expression by the (appropriately-substituted) RHS.
2841 By "a substitution instance" we mean that the LHS can be made equal to the
2842 expression by substituting for the pattern variables.
2843
2844 </para>
2845 </listitem>
2846 <listitem>
2847
2848 <para>
2849  The LHS and RHS of a rule are typechecked, and must have the
2850 same type.
2851
2852 </para>
2853 </listitem>
2854 <listitem>
2855
2856 <para>
2857  GHC makes absolutely no attempt to verify that the LHS and RHS
of a rule have the same meaning.  That is undecidable in general, and
2859 infeasible in most interesting cases.  The responsibility is entirely the programmer's!
2860
2861 </para>
2862 </listitem>
2863 <listitem>
2864
2865 <para>
 GHC makes no attempt to make sure that the rules are confluent or
terminating.  For example:

<programlisting>
  "loop"        forall x y.  f x y = f y x
</programlisting>

This rule will cause the compiler to go into an infinite loop.
2874
2875 </para>
2876 </listitem>
2877 <listitem>
2878
2879 <para>
2880  If more than one rule matches a call, GHC will choose one arbitrarily to apply.
2881
2882 </para>
2883 </listitem>
2884 <listitem>
2885 <para>
 GHC currently uses a very simple, syntactic, matching algorithm
for matching a rule LHS with an expression.  It seeks a substitution
which makes the LHS and expression syntactically equal modulo alpha
conversion.  The pattern (rule), but not the expression, is eta-expanded if
necessary.  (Eta-expanding the expression can lead to laziness bugs.)
Matching is not done modulo beta conversion (that would be higher-order matching).
2892 </para>
2893
2894 <para>
2895 Matching is carried out on GHC's intermediate language, which includes
2896 type abstractions and applications.  So a rule only matches if the
2897 types match too.  See <xref LinkEnd="rule-spec"> below.
2898 </para>
2899 </listitem>
2900 <listitem>
2901
2902 <para>
2903  GHC keeps trying to apply the rules as it optimises the program.
2904 For example, consider:
2905
2906 <programlisting>
2907   let s = map f
2908       t = map g
2909   in
2910   s (t xs)
2911 </programlisting>
2912
2913 The expression <literal>s (t xs)</literal> does not match the rule <literal>"map/map"</literal>, but GHC
2914 will substitute for <VarName>s</VarName> and <VarName>t</VarName>, giving an expression which does match.
2915 If <VarName>s</VarName> or <VarName>t</VarName> was (a) used more than once, and (b) large or a redex, then it would
2916 not be substituted, and the rule would not fire.
2917
2918 </para>
2919 </listitem>
2920 <listitem>
2921
2922 <para>
2923  In the earlier phases of compilation, GHC inlines <emphasis>nothing
2924 that appears on the LHS of a rule</emphasis>, because once you have substituted
for something you can't match against it (given the simple-minded
matching).  So if you write the rule

<programlisting>
        "map/map"       forall f g.  map f . map g = map (f.g)
</programlisting>
2931
2932 this <emphasis>won't</emphasis> match the expression <literal>map f (map g xs)</literal>.
2933 It will only match something written with explicit use of ".".
2934 Well, not quite.  It <emphasis>will</emphasis> match the expression
2935
2936 <programlisting>
2937 wibble f g xs
2938 </programlisting>
2939
2940 where <function>wibble</function> is defined:
2941
2942 <programlisting>
2943 wibble f g = map f . map g
2944 </programlisting>
2945
2946 because <function>wibble</function> will be inlined (it's small).
2947
2948 Later on in compilation, GHC starts inlining even things on the
2949 LHS of rules, but still leaves the rules enabled.  This inlining
2950 policy is controlled by the per-simplification-pass flag <option>-finline-phase</option><emphasis>n</emphasis>.
2951
2952 </para>
2953 </listitem>
2954 <listitem>
2955
2956 <para>
2957  All rules are implicitly exported from the module, and are therefore
2958 in force in any module that imports the module that defined the rule, directly
2959 or indirectly.  (That is, if A imports B, which imports C, then C's rules are
2960 in force when compiling A.)  The situation is very similar to that for instance
2961 declarations.
2962 </para>
2963 </listitem>
2964
2965 </itemizedlist>
2966
2967 </para>
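
<para>
As a minimal sketch of the substitution behaviour described above, the
following module defines its own list map and a rule for it.  The names
<function>myMap</function> and <function>twice</function> are hypothetical
and not part of any library; the rule uses the space-separated
<literal>forall</literal> syntax accepted by recent GHC releases; and the
<literal>NOINLINE</literal> pragma is an assumption made here to keep
<function>myMap</function> visible to the rule matcher:
</para>

<programlisting>
-- Hypothetical private copy of map, so that this rule does not
-- interact with the rules the Prelude already defines for map.
myMap :: (a -> b) -> [a] -> [b]
myMap _ []     = []
myMap f (x:xs) = f x : myMap f xs
{-# NOINLINE myMap #-}

{-# RULES
  "myMap/myMap" forall f g xs.  myMap f (myMap g xs) = myMap (f . g) xs
  #-}

-- s and t are each used once, so GHC can substitute for them,
-- after which the body has the shape the rule expects.
twice :: [Int] -> [Int]
twice xs =
  let s = myMap (+ 1)
      t = myMap (* 2)
  in
  s (t xs)
</programlisting>

<para>
The rule only changes how the result is computed, not what it is:
<literal>twice [1,2,3]</literal> evaluates to <literal>[3,5,7]</literal>
whether or not the rule fires.
</para>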
2968
2969 </sect2>
2970
2971 <sect2>
2972 <title>List fusion</title>
2973
2974 <para>
2975 The RULES mechanism is used to implement fusion (deforestation) of common list functions.
2976 If a "good consumer" consumes an intermediate list constructed by a "good producer", the
2977 intermediate list should be eliminated entirely.
2978 </para>
2979
2980 <para>
2981 The following are good producers:
2982
2983 <itemizedlist>
2984 <listitem>
2985
2986 <para>
2987  List comprehensions
2988 </para>
2989 </listitem>
2990 <listitem>
2991
2992 <para>
2993  Enumerations of <literal>Int</literal> and <literal>Char</literal> (e.g. <literal>['a'..'z']</literal>).
2994 </para>
2995 </listitem>
2996 <listitem>
2997
2998 <para>
2999  Explicit lists (e.g. <literal>[True, False]</literal>)
3000 </para>
3001 </listitem>
3002 <listitem>
3003
3004 <para>
The cons constructor (e.g. <literal>3:4:[]</literal>)
3006 </para>
3007 </listitem>
3008 <listitem>
3009
3010 <para>
3011  <function>++</function>
3012 </para>
3013 </listitem>
3014
3015 <listitem>
3016 <para>
3017  <function>map</function>
3018 </para>
3019 </listitem>
3020
3021 <listitem>
3022 <para>
3023  <function>filter</function>
3024 </para>
3025 </listitem>
3026 <listitem>
3027
3028 <para>
3029  <function>iterate</function>, <function>repeat</function>
3030 </para>
3031 </listitem>
3032 <listitem>
3033
3034 <para>
3035  <function>zip</function>, <function>zipWith</function>
3036 </para>
3037 </listitem>
3038
3039 </itemizedlist>
3040
3041 </para>
3042
3043 <para>
3044 The following are good consumers:
3045
3046 <itemizedlist>
3047 <listitem>
3048
3049 <para>
3050  List comprehensions
3051 </para>
3052 </listitem>
3053 <listitem>
3054
3055 <para>
3056  <function>array</function> (on its second argument)
3057 </para>
3058 </listitem>
3059 <listitem>
3060
3061 <para>
3062  <function>length</function>
3063 </para>
3064 </listitem>
3065 <listitem>
3066
3067 <para>
3068  <function>++</function> (on its first argument)
3069 </para>
3070 </listitem>
3071
3072 <listitem>
3073 <para>
3074  <function>foldr</function>
3075 </para>
3076 </listitem>
3077
3078 <listitem>
3079 <para>
3080  <function>map</function>
3081 </para>
3082 </listitem>
3083 <listitem>
3084
3085 <para>
3086  <function>filter</function>
3087 </para>
3088 </listitem>
3089 <listitem>
3090
3091 <para>
3092  <function>concat</function>
3093 </para>
3094 </listitem>
3095 <listitem>
3096
3097 <para>
3098  <function>unzip</function>, <function>unzip2</function>, <function>unzip3</function>, <function>unzip4</function>
3099 </para>
3100 </listitem>
3101 <listitem>
3102
3103 <para>
3104  <function>zip</function>, <function>zipWith</function> (but on one argument only; if both are good producers, <function>zip</function>
3105 will fuse with one but not the other)
3106 </para>
3107 </listitem>
3108 <listitem>
3109
3110 <para>
3111  <function>partition</function>
3112 </para>
3113 </listitem>
3114 <listitem>
3115
3116 <para>
3117  <function>head</function>
3118 </para>
3119 </listitem>
3120 <listitem>
3121
3122 <para>
3123  <function>and</function>, <function>or</function>, <function>any</function>, <function>all</function>
3124 </para>
3125 </listitem>
3126 <listitem>
3127
3128 <para>
3129  <function>sequence&lowbar;</function>
3130 </para>
3131 </listitem>
3132 <listitem>
3133
3134 <para>
3135  <function>msum</function>
3136 </para>
3137 </listitem>
3138 <listitem>
3139
3140 <para>
3141  <function>sortBy</function>
3142 </para>
3143 </listitem>
3144
3145 </itemizedlist>
3146
3147 </para>
3148
3149 <para>
3150 So, for example, the following should generate no intermediate lists:
3151
3152 <programlisting>
3153 array (1,10) [(i,i*i) | i &#60;- map (+ 1) [0..9]]
3154 </programlisting>
3155
3156 </para>
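
<para>
As a further illustration, here is a small sketch that chains only
functions from the two lists above: an <literal>Int</literal> enumeration
feeds <function>filter</function>, then <function>map</function>, then
<function>foldr</function>.  The name <function>sumEvenSquares</function>
is hypothetical.  Whether the intermediate lists are actually removed
depends on the optimisation level and compiler version, so treat this as
a candidate for fusion rather than a guarantee:
</para>

<programlisting>
-- Each stage is a good producer for the stage that consumes it.
sumEvenSquares :: Int -> Int
sumEvenSquares n = foldr (+) 0 (map square (filter even [1 .. n]))
  where
    square x = x * x
</programlisting>

<para>
For example, <literal>sumEvenSquares 10</literal> is
<literal>220</literal> (4 + 16 + 36 + 64 + 100), with or without fusion.
</para>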
3157
3158 <para>
3159 This list could readily be extended; if there are Prelude functions that you use
3160 a lot which are not included, please tell us.
3161 </para>
3162
3163 <para>
3164 If you want to write your own good consumers or producers, look at the
3165 Prelude definitions of the above functions to see how to do so.
3166 </para>
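
<para>
A minimal sketch of a user-written good producer follows.  It assumes a
recent GHC in which <function>build</function> is exported from the module
<literal>GHC.Exts</literal> (in the libraries described in this guide it
lives in <filename>PrelBase.lhs</filename>).  The names
<function>countdown</function> and <function>total</function> are
hypothetical:
</para>

<programlisting>
import GHC.Exts (build)

-- countdown n yields [n, n-1 .. 1].  Writing it with build abstracts
-- over the list constructors, so a foldr-based consumer can supply
-- its own "cons" and "nil" instead of allocating a list.
countdown :: Int -> [Int]
countdown n = build (\cons nil ->
  let go i | i > 0     = i `cons` go (i - 1)
           | otherwise = nil
  in go n)
{-# INLINE countdown #-}

-- foldr is a good consumer, so this is a candidate for fusion.
total :: Int -> Int
total n = foldr (+) 0 (countdown n)
</programlisting>

<para>
Here <literal>countdown 4</literal> is <literal>[4,3,2,1]</literal> and
<literal>total 4</literal> is <literal>10</literal>.  The
<literal>INLINE</literal> pragma is there so that the call to
<function>build</function> is exposed at the call site, where the fusion
rules can see it.
</para>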
3167
3168 </sect2>
3169
3170 <sect2 id="rule-spec">
<title>Specialisation</title>
3173
3174 <para>
3175 Rewrite rules can be used to get the same effect as a feature
present in earlier versions of GHC:
3177
3178 <programlisting>
3179   {-# SPECIALIZE fromIntegral :: Int8 -> Int16 = int8ToInt16 #-}
3180 </programlisting>
3181
3182 This told GHC to use <function>int8ToInt16</function> instead of <function>fromIntegral</function> whenever
3183 the latter was called with type <literal>Int8 -&gt; Int16</literal>.  That is, rather than
3184 specialising the original definition of <function>fromIntegral</function> the programmer is
3185 promising that it is safe to use <function>int8ToInt16</function> instead.
3186 </para>
3187
3188 <para>
3189 This feature is no longer in GHC.  But rewrite rules let you do the
3190 same thing:
3191
3192 <programlisting>
3193 {-# RULES
3194   "fromIntegral/Int8/Int16" fromIntegral = int8ToInt16
3195 #-}
3196 </programlisting>
3197
3198 This slightly odd-looking rule instructs GHC to replace <function>fromIntegral</function>
3199 by <function>int8ToInt16</function> <emphasis>whenever the types match</emphasis>.  Speaking more operationally,
3200 GHC adds the type and dictionary applications to get the typed rule
3201
3202 <programlisting>
3203 forall (d1::Integral Int8) (d2::Num Int16) .
3204         fromIntegral Int8 Int16 d1 d2 = int8ToInt16
3205 </programlisting>
3206
What is more,
this rule does not need to be in the same file as <function>fromIntegral</function>,
unlike the <literal>SPECIALISE</literal> pragmas, which currently do (so that they
have an original definition available to specialise).
3211 </para>
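
<para>
The same pattern can be sketched on a user-defined function.  In the
following, <function>convert</function> and <function>widen</function> are
hypothetical names, and <function>int8ToInt16</function> is only a
stand-in for a hand-written conversion.  The rule is attached to
<function>convert</function> rather than to
<function>fromIntegral</function> so that the stand-in, which itself calls
<function>fromIntegral</function>, cannot be rewritten into a call to
itself.  The <literal>NOINLINE</literal> pragma is an assumption made here
to keep <function>convert</function> available for matching:
</para>

<programlisting>
import Data.Int (Int8, Int16)

-- Hypothetical generic conversion.
convert :: (Integral a, Num b) => a -> b
convert x = fromInteger (toInteger x)
{-# NOINLINE convert #-}

-- Stand-in for a hand-written, type-specific version.
int8ToInt16 :: Int8 -> Int16
int8ToInt16 x = fromIntegral x

-- Applies only where convert is used at type Int8 -> Int16.
{-# RULES
  "convert/Int8/Int16" convert = int8ToInt16
  #-}

widen :: Int8 -> Int16
widen x = convert x
</programlisting>

<para>
As with the pragma it replaces, the programmer is promising that the two
functions agree at this type; here <literal>widen (-5)</literal> is
<literal>-5</literal> whichever definition ends up being used.
</para>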
3212
3213 </sect2>
3214
3215 <sect2>
3216 <title>Controlling what's going on</title>
3217
3218 <para>
3219
3220 <itemizedlist>
3221 <listitem>
3222
3223 <para>
3224  Use <option>-ddump-rules</option> to see what transformation rules GHC is using.
3225 </para>
3226 </listitem>
3227 <listitem>
3228
3229 <para>
3230  Use <option>-ddump-simpl-stats</option> to see what rules are being fired.
3231 If you add <option>-dppr-debug</option> you get a more detailed listing.
3232 </para>
3233 </listitem>
3234 <listitem>
3235
3236 <para>
The definition of (say) <function>build</function> in <FileName>PrelBase.lhs</FileName> looks like this:
3238
3239 <programlisting>
3240         build   :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
3241         {-# INLINE build #-}
3242         build g = g (:) []
3243 </programlisting>
3244
3245 Notice the <literal>INLINE</literal>!  That prevents <literal>(:)</literal> from being inlined when compiling
3246 <literal>PrelBase</literal>, so that an importing module will &ldquo;see&rdquo; the <literal>(:)</literal>, and can
3247 match it on the LHS of a rule.  <literal>INLINE</literal> prevents any inlining happening
3248 in the RHS of the <literal>INLINE</literal> thing.  I regret the delicacy of this.
3249
3250 </para>
3251 </listitem>
3252 <listitem>
3253
3254 <para>
3255  In <filename>ghc/lib/std/PrelBase.lhs</filename> look at the rules for <function>map</function> to
3256 see how to write rules that will do fusion and yet give an efficient
program even if fusion doesn't happen.  There are more rules in <filename>PrelList.lhs</filename>.
3258 </para>
3259 </listitem>
3260
3261 </itemizedlist>
3262
3263 </para>
3264
3265 </sect2>
3266
3267 </sect1>
3268
3269 <sect1 id="generic-classes">
3270 <title>Generic classes</title>
3271
3272     <para>(Note: support for generic classes is currently broken in
3273     GHC 5.02).</para>
3274
3275 <para>
3276 The ideas behind this extension are described in detail in "Derivable type classes",
3277 Ralf Hinze and Simon Peyton Jones, Haskell Workshop, Montreal Sept 2000, pp94-105.
3278 An example will give the idea:
3279 </para>
3280
3281 <programlisting>
3282   import Generics
3283
3284   class Bin a where
3285     toBin   :: a -> [Int]
3286     fromBin :: [Int] -> (a, [Int])
3287
3288     toBin {| Unit |}    Unit      = []
3289     toBin {| a :+: b |} (Inl x)   = 0 : toBin x
3290     toBin {| a :+: b |} (Inr y)   = 1 : toBin y
3291     toBin {| a :*: b |} (x :*: y) = toBin x ++ toBin y
3292
3293     fromBin {| Unit |}    bs      = (Unit, bs)
3294     fromBin {| a :+: b |} (0:bs)  = (Inl x, bs')    where (x,bs') = fromBin bs
3295     fromBin {| a :+: b |} (1:bs)  = (Inr y, bs')    where (y,bs') = fromBin bs
3296     fromBin {| a :*: b |} bs      = (x :*: y, bs'') where (x,bs' ) = fromBin bs
3297                                                           (y,bs'') = fromBin bs'
3298 </programlisting>
3299 <para>
3300 This class declaration explains how <literal>toBin</literal> and <literal>fromBin</literal>
3301 work for arbitrary data types.  They do so by giving cases for unit, product, and sum,
3302 which are defined thus in the library module <literal>Generics</literal>:
3303 </para>
3304 <programlisting>
3305   data Unit    = Unit
3306   data a :+: b = Inl a | Inr b
3307   data a :*: b = a :*: b
3308 </programlisting>
3309 <para>
3310 Now you can make a data type into an instance of Bin like this:
3311 <programlisting>
3312   instance (Bin a, Bin b) => Bin (a,b)
3313   instance Bin a => Bin [a]
3314 </programlisting>
That is, just leave off the "where" clause.  Of course, you can put in the
where clause and override whichever methods you please.
3317 </para>
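
<para>
To make the idea concrete without relying on the generic-class syntax,
the following sketch writes the three cases of <literal>toBin</literal>
as ordinary instances and converts <literal>Bool</literal> to a sum of
two units by hand.  This only imitates, in plain type classes, the kind
of code the compiler is meant to fill in; it is not the extension itself.
It assumes a recent GHC with the <literal>TypeOperators</literal> language
extension, declares the representation types locally instead of importing
<literal>Generics</literal>, and uses the hypothetical helper
<function>fromBool</function>:
</para>

<programlisting>
{-# LANGUAGE TypeOperators #-}

-- Local copies of the representation types.
data Unit    = Unit
data a :+: b = Inl a | Inr b
data a :*: b = a :*: b

class Bin a where
  toBin :: a -> [Int]

-- One ordinary instance per type pattern in the generic class.
instance Bin Unit where
  toBin Unit = []

instance (Bin a, Bin b) => Bin (a :+: b) where
  toBin (Inl x) = 0 : toBin x
  toBin (Inr y) = 1 : toBin y

instance (Bin a, Bin b) => Bin (a :*: b) where
  toBin (x :*: y) = toBin x ++ toBin y

-- Bool viewed as a sum of two nullary constructors.
fromBool :: Bool -> Unit :+: Unit
fromBool False = Inl Unit
fromBool True  = Inr Unit

-- The instance a generic default would provide, written by hand.
instance Bin Bool where
  toBin = toBin . fromBool
</programlisting>

<para>
With these definitions <literal>toBin True</literal> is
<literal>[1]</literal> and <literal>toBin (True :*: False)</literal> is
<literal>[1,0]</literal>.
</para>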
3318
3319     <sect2>
3320       <title> Using generics </title>
3321       <para>To use generics you need to</para>
3322       <itemizedlist>
3323         <listitem>
3324           <para>Use the flags <option>-fglasgow-exts</option> (to enable the extra syntax),
3325                 <option>-fgenerics</option> (to generate extra per-data-type code),
3326                 and <option>-package lang</option> (to make the <literal>Generics</literal> library
                available).</para>
3328         </listitem>
3329         <listitem>
3330           <para>Import the module <literal>Generics</literal> from the
3331           <literal>lang</literal> package.  This import brings into
3332           scope the data types <literal>Unit</literal>,
3333           <literal>:*:</literal>, and <literal>:+:</literal>.  (You
3334           don't need this import if you don't mention these types
3335           explicitly; for example, if you are simply giving instance
3336           declarations.)</para>
3337         </listitem>
3338       </itemizedlist>
3339     </sect2>
3340
3341 <sect2> <title> Changes wrt the paper </title>
3342 <para>
3343 Note that the type constructors <literal>:+:</literal> and <literal>:*:</literal>
3344 can be written infix (indeed, you can now use
3345 any operator starting in a colon as an infix type constructor).  Also note that
3346 the type constructors are not exactly as in the paper (Unit instead of 1, etc).
3347 Finally, note that the syntax of the type patterns in the class declaration
3348 uses "<literal>{|</literal>" and "<literal>|}</literal>" brackets; curly braces
alone would be ambiguous when they appear on right hand sides (an extension we
3350 anticipate wanting).
3351 </para>
3352 </sect2>
3353
3354 <sect2> <title>Terminology and restrictions</title>
3355 <para>
3356 Terminology.  A "generic default method" in a class declaration
3357 is one that is defined using type patterns as above.
3358 A "polymorphic default method" is a default method defined as in Haskell 98.
3359 A "generic class declaration" is a class declaration with at least one
3360 generic default method.
3361 </para>
3362
3363 <para>
3364 Restrictions:
3365 <itemizedlist>
3366 <listitem>
3367 <para>
3368 Alas, we do not yet implement the stuff about constructor names and
3369 field labels.
3370 </para>
3371 </listitem>
3372
3373 <listitem>
3374 <para>
3375 A generic class can have only one parameter; you can't have a generic
3376 multi-parameter class.
3377 </para>
3378 </listitem>
3379
3380 <listitem>
3381 <para>
3382 A default method must be defined entirely using type patterns, or entirely
3383 without.  So this is illegal:
3384 <programlisting>
3385   class Foo a where
3386     op :: a -> (a, Bool)
3387     op {| Unit |} Unit = (Unit, True)
3388     op x               = (x,    False)
3389 </programlisting>
3390 However it is perfectly OK for some methods of a generic class to have
3391 generic default methods and others to have polymorphic default methods.
3392 </para>
3393 </listitem>
3394
3395 <listitem>
3396 <para>
3397 The type variable(s) in the type pattern for a generic method declaration
scope over the right hand side.  So this is legal (note the use of the type variable <literal>p</literal> in a type signature on the right hand side):
3399 <programlisting>
3400   class Foo a where
3401     op :: a -> Bool
3402     op {| p :*: q |} (x :*: y) = op (x :: p)
3403     ...
3404 </programlisting>
3405 </para>
3406 </listitem>
3407
3408 <listitem>
3409 <para>
3410 The type patterns in a generic default method must take one of the forms:
3411 <programlisting>
3412        a :+: b
3413        a :*: b
3414        Unit
3415 </programlisting>
3416 where "a" and "b" are type variables.  Furthermore, all the type patterns for
3417 a single type constructor (<literal>:*:</literal>, say) must be identical; they
3418 must use the same type variables.  So this is illegal:
3419 <programlisting>
3420   class Foo a where
3421     op :: a -> Bool
3422     op {| a :+: b |} (Inl x) = True
3423     op {| p :+: q |} (Inr y) = False
3424 </programlisting>
3425 The type patterns must be identical, even in equations for different methods of the class.
3426 So this too is illegal:
3427 <programlisting>
3428   class Foo a where
3429     op1 :: a -> Bool
3430     op1 {| a :*: b |} (x :*: y) = True
3431
3432     op2 :: a -> Bool
3433     op2 {| p :*: q |} (x :*: y) = False
3434 </programlisting>
(The reason for this restriction is that we gather all the equations for a particular type constructor
3436 into a single generic instance declaration.)
3437 </para>
3438 </listitem>
3439
3440 <listitem>
3441 <para>
3442 A generic method declaration must give a case for each of the three type constructors.
3443 </para>
3444 </listitem>
3445
3446 <listitem>
3447 <para>
3448 The type for a generic method can be built only from:
3449   <itemizedlist>
3450   <listitem> <para> Function arrows </para> </listitem>
3451   <listitem> <para> Type variables </para> </listitem>
3452   <listitem> <para> Tuples </para> </listitem>
3453   <listitem> <para> Arbitrary types not involving type variables </para> </listitem>
3454   </itemizedlist>
3455 Here are some example type signatures for generic methods:
3456 <programlisting>
3457     op1 :: a -> Bool
3458     op2 :: Bool -> (a,Bool)
3459     op3 :: [Int] -> a -> a
3460     op4 :: [a] -> Bool
3461 </programlisting>
3462 Here, op1, op2, op3 are OK, but op4 is rejected, because it has a type variable
3463 inside a list.
3464 </para>
3465 <para>
This restriction is an implementation restriction: we just haven't got around to
3467 implementing the necessary bidirectional maps over arbitrary type constructors.
3468 It would be relatively easy to add specific type constructors, such as Maybe and list,
3469 to the ones that are allowed.</para>
3470 </listitem>
3471
3472 <listitem>
3473 <para>
3474 In an instance declaration for a generic class, the idea is that the compiler
3475 will fill in the methods for you, based on the generic templates.  However it can only
3476 do so if
3477   <itemizedlist>
3478   <listitem>
3479   <para>
3480   The instance type is simple (a type constructor applied to type variables, as in Haskell 98).
3481   </para>
3482   </listitem>
3483   <listitem>
3484   <para>
3485   No constructor of the instance type has unboxed fields.
3486   </para>
3487   </listitem>
3488   </itemizedlist>
3489 (Of course, these things can only arise if you are already using GHC extensions.)
However, you can still give instance declarations for types which break these rules,
3491 provided you give explicit code to override any generic default methods.
3492 </para>
3493 </listitem>
3494
3495 </itemizedlist>
3496 </para>
3497
3498 <para>
3499 The option <option>-ddump-deriv</option> dumps incomprehensible stuff giving details of
3500 what the compiler does with generic declarations.
3501 </para>
3502
3503 </sect2>
3504
3505 <sect2> <title> Another example </title>
3506 <para>
3507 Just to finish with, here's another example I rather like:
3508 <programlisting>
3509   class Tag a where
3510     nCons :: a -> Int
3511     nCons {| Unit |}    _ = 1
3512     nCons {| a :*: b |} _ = 1
3513     nCons {| a :+: b |} _ = nCons (bot::a) + nCons (bot::b)
3514
3515     tag :: a -> Int
3516     tag {| Unit |}    _       = 1
3517     tag {| a :*: b |} _       = 1
3518     tag {| a :+: b |} (Inl x) = tag x
3519     tag {| a :+: b |} (Inr y) = nCons (bot::a) + tag y
3520 </programlisting>
3521 </para>
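
<para>
The behaviour of this example can be sketched with ordinary instances.
The sketch assumes a recent GHC with the <literal>TypeOperators</literal>
and <literal>ScopedTypeVariables</literal> extensions (the latter lets the
instance's type variables be mentioned in the method bodies), declares the
representation types locally, and assumes that <literal>bot</literal>
stands for an undefined value that is never evaluated and serves only to
pick an instance.  The names <literal>Three</literal> and
<function>third</function> are hypothetical:
</para>

<programlisting>
{-# LANGUAGE TypeOperators, ScopedTypeVariables #-}

data Unit    = Unit
data a :+: b = Inl a | Inr b
data a :*: b = a :*: b

-- Assumption: bot is never forced; only its type matters.
bot :: a
bot = undefined

class Tag a where
  nCons :: a -> Int   -- number of constructors
  tag   :: a -> Int   -- 1-based position of the constructor used

instance Tag Unit where
  nCons _ = 1
  tag   _ = 1

instance Tag (a :*: b) where
  nCons _ = 1
  tag   _ = 1

instance (Tag a, Tag b) => Tag (a :+: b) where
  nCons _     = nCons (bot :: a) + nCons (bot :: b)
  tag (Inl x) = tag x
  tag (Inr y) = nCons (bot :: a) + tag y

-- A type with three nullary constructors, as a nested sum.
type Three = Unit :+: (Unit :+: Unit)

third :: Three
third = Inr (Inr Unit)
</programlisting>

<para>
Here <literal>nCons third</literal> and <literal>tag third</literal> are
both <literal>3</literal>, while
<literal>tag (Inl Unit :: Three)</literal> is <literal>1</literal>.
</para>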
3522 </sect2>
3523 </sect1>
3524
3525 <!-- Emacs stuff:
3526      ;;; Local Variables: ***
3527      ;;; mode: sgml ***
3528      ;;; sgml-parent-document: ("users_guide.sgml" "book" "chapter" "sect1") ***
3529      ;;; End: ***
3530  -->