ghc/docs/users_guide/glasgow_exts.sgml

   1 <para>
   2 <indexterm><primary>language, GHC</primary></indexterm>
   3 <indexterm><primary>extensions, GHC</primary></indexterm>
   4 As with all known Haskell systems, GHC implements some extensions to
   5 the language.  To use them, you'll need to give a <option>-fglasgow-exts</option>
   6 <indexterm><primary>-fglasgow-exts option</primary></indexterm> option.
   7 </para>
   8
   9 <para>
  10 Virtually all of the Glasgow extensions serve to give you access to
  11 the underlying facilities with which we implement Haskell.  Thus, you
  12 can get at the Raw Iron, if you are willing to write some non-standard
  13 code at a more primitive level.  You need not be &ldquo;stuck&rdquo; on
  14 performance because of the implementation costs of Haskell's
  15 &ldquo;high-level&rdquo; features&mdash;you can always code &ldquo;under&rdquo; them.  In an extreme case, you can write all your time-critical code in C, and then just glue it together with Haskell!
  16 </para>
  17
  18 <para>
  19 Executive summary of our extensions:
  20 </para>
  21
  22   <variablelist>
  23
  24     <varlistentry>
  25       <term>Unboxed types and primitive operations:</Term>
  26       <listitem>
  27         <para>You can get right down to the raw machine types and
  28         operations; included in this are &ldquo;primitive
  29         arrays&rdquo; (direct access to Big Wads of Bytes).  Please
  30         see <XRef LinkEnd="glasgow-unboxed"> and following.</para>
  31       </listitem>
  32     </varlistentry>
  33
  34     <varlistentry>
  35       <term>Type system extensions:</term>
  36       <listitem>
  37         <para> GHC supports a large number of extensions to Haskell's
  38         type system.  Specifically:</para>
  39
  40         <variablelist>
  41           <varlistentry>
  42             <term>Multi-parameter type classes:</term>
  43             <listitem>
  44               <para><xref LinkEnd="multi-param-type-classes"></para>
  45             </listitem>
  46           </varlistentry>
  47
  48           <varlistentry>
  49             <term>Functional dependencies:</term>
  50             <listitem>
  51               <para><xref LinkEnd="functional-dependencies"></para>
  52             </listitem>
  53           </varlistentry>
  54
  55           <varlistentry>
  56             <term>Implicit parameters:</term>
  57             <listitem>
  58               <para><xref LinkEnd="implicit-parameters"></para>
  59             </listitem>
  60           </varlistentry>
  61
  62           <varlistentry>
  63             <term>Local universal quantification:</term>
  64             <listitem>
  65               <para><xref LinkEnd="universal-quantification"></para>
  66             </listitem>
  67           </varlistentry>
  68
  69           <varlistentry>
  70             <term>Existential quantification in data types:</term>
  71             <listitem>
  72               <para><xref LinkEnd="existential-quantification"></para>
  73             </listitem>
  74           </varlistentry>
  75
  76           <varlistentry>
  77             <term>Scoped type variables:</term>
  78             <listitem>
  79               <para>Scoped type variables enable the programmer to
  80               supply type signatures for some nested declarations,
  81               where this would not be legal in Haskell 98.  Details in
  82               <xref LinkEnd="scoped-type-variables">.</para>
  83             </listitem>
  84           </varlistentry>
  85         </variablelist>
  86       </listitem>
  87     </varlistentry>
  88
  89     <varlistentry>
  90       <term>Pattern guards</term>
  91       <listitem>
  92         <para>Instead of being a boolean expression, a guard is a list
  93         of qualifiers, exactly as in a list comprehension. See <xref
  94         LinkEnd="pattern-guards">.</para>
  95       </listitem>
  96     </varlistentry>
  97
  98     <varlistentry>
  99       <term>Foreign calling:</term>
 100       <listitem>
 101         <para>Just what it sounds like.  We provide
 102         <emphasis>lots</emphasis> of rope that you can dangle around
 103         your neck.  Please see <xref LinkEnd="ffi">.</para>
 104       </listitem>
 105     </varlistentry>
 106
 107     <varlistentry>
 108       <term>Pragmas</term>
 109       <listitem>
 110         <para>Pragmas are special instructions to the compiler placed
 111         in the source file.  The pragmas GHC supports are described in
 112         <xref LinkEnd="pragmas">.</para>
 113       </listitem>
 114     </varlistentry>
 115
 116     <varlistentry>
 117       <term>Rewrite rules:</term>
 118       <listitem>
 119         <para>The programmer can specify rewrite rules as part of the
 120         source program (in a pragma).  GHC applies these rewrite rules
 121         wherever it can.  Details in <xref
 122         LinkEnd="rewrite-rules">.</para>
 123       </listitem>
 124     </varlistentry>
 125
 126     <varlistentry>
 127       <term>Generic classes:</term>
 128       <listitem>
 129         <para>Generic class declarations allow you to define a class
 130         whose methods say how to work over an arbitrary data type.
 131         Then it's really easy to make any new type into an instance of
 132         the class.  This generalises the rather ad-hoc "deriving"
 133         feature of Haskell 98.  Details in <xref
 134         LinkEnd="generic-classes">.</para>
 135       </listitem>
 136     </varlistentry>
 137   </variablelist>
 138
 139 <para>
 140 Before you get too carried away working at the lowest level (e.g.,
 141 sloshing <literal>MutableByteArray&num;</literal>s around your
 142 program), you may wish to check if there are libraries that provide a
 143 &ldquo;Haskellised veneer&rdquo; over the features you want.  See
 144 <xref linkend="book-hslibs">.
 145 </para>
 146
 147   <sect1 id="options-language">
 148     <title>Language options</title>
 149
 150     <indexterm><primary>language</primary><secondary>option</secondary>
 151     </indexterm>
 152     <indexterm><primary>options</primary><secondary>language</secondary>
 153     </indexterm>
 154     <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
 155     </indexterm>
 156
 157     <para> These flags control which variations of the language are
 158     permitted.  Leaving out all of them gives you standard Haskell
 159     98.</para>
 160
 161     <variablelist>
 162
 163       <varlistentry>
 164         <term><option>-fglasgow-exts</option>:</term>
 165         <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
 166         <listitem>
 167           <para>This simultaneously enables all of the extensions to
 168           Haskell 98 described in <xref
 169           linkend="ghc-language-features">, except where otherwise
 170           noted. </para>
 171         </listitem>
 172       </varlistentry>
 173
 174       <varlistentry>
 175         <term><option>-fno-monomorphism-restriction</option>:</term>
 176         <indexterm><primary><option>-fno-monomorphism-restriction</option></primary></indexterm>
 177         <listitem>
 178           <para> Switch off the Haskell 98 monomorphism restriction.
 179           Independent of the <option>-fglasgow-exts</option>
 180           flag. </para>
 181         </listitem>
 182       </varlistentry>
 183
 184       <varlistentry>
 185         <term><option>-fallow-overlapping-instances</option></term>
 186         <term><option>-fallow-undecidable-instances</option></term>
 187         <term><option>-fcontext-stack</option></term>
 188         <indexterm><primary><option>-fallow-overlapping-instances</option></primary></indexterm>
 189         <indexterm><primary><option>-fallow-undecidable-instances</option></primary></indexterm>
 190         <indexterm><primary><option>-fcontext-stack</option></primary></indexterm>
 191         <listitem>
 192           <para> See <xref LinkEnd="instance-decls">.  Only relevant
 193           if you also use <option>-fglasgow-exts</option>.</para>
 194         </listitem>
 195       </varlistentry>
 196
 197       <varlistentry>
 198         <term><option>-finline-phase</option></term>
 199         <indexterm><primary><option>-finline-phase</option></primary></indexterm>
 200         <listitem>
 201           <para>See <xref LinkEnd="rewrite-rules">.  Only relevant if
 202           you also use <option>-fglasgow-exts</option>.</para>
 203         </listitem>
 204       </varlistentry>
 205
 206       <varlistentry>
 207         <term><option>-fgenerics</option></term>
 208         <indexterm><primary><option>-fgenerics</option></primary></indexterm>
 209         <listitem>
 210           <para>See <xref LinkEnd="generic-classes">.  Independent of
 211           <option>-fglasgow-exts</option>.</para>
 212         </listitem>
 213       </varlistentry>
 214
 215         <varlistentry>
 216           <term><option>-fno-implicit-prelude</option></term>
 217           <listitem>
 218             <para><indexterm><primary>-fno-implicit-prelude
 219             option</primary></indexterm> GHC normally imports
 220             <filename>Prelude.hi</filename> files for you.  If you'd
 221             rather it didn't, then give it a
 222             <option>-fno-implicit-prelude</option> option.  The idea
 223             is that you can then import a Prelude of your own.  (But
 224             don't call it <literal>Prelude</literal>; the Haskell
 225             module namespace is flat, and you must not conflict with
 226             any Prelude module.)</para>
 227
 228             <para>Even though you have not imported the Prelude, all
 229             the built-in syntax still refers to the built-in Haskell
 230             Prelude types and values, as specified by the Haskell
 231             Report.  For example, the type <literal>[Int]</literal>
 232             still means <literal>Prelude.[] Int</literal>; tuples
 233             continue to refer to the standard Prelude tuples; the
 234             translation for list comprehensions continues to use
 235             <literal>Prelude.map</literal> etc.</para>
 236
 237             <para> With one group of exceptions!  You may want to
 238             define your own numeric class hierarchy.  It completely
 239             defeats that purpose if the literal "1" means
 240             "<literal>Prelude.fromInteger 1</literal>", which is what
 241             the Haskell Report specifies.  So the
 242             <option>-fno-implicit-prelude</option> flag causes the
 243             following pieces of built-in syntax to refer to <emphasis>whatever
 244             is in scope</emphasis>, not the Prelude versions:</para>
 245
 246             <itemizedlist>
 247               <listitem>
 248                 <para>Integer and fractional literals mean
 249                 "<literal>fromInteger 1</literal>" and
 250                 "<literal>fromRational 3.2</literal>", not the
 251                 Prelude-qualified versions; both in expressions and in
 252                 patterns.</para>
 253               </listitem>
 254
 255               <listitem>
 256                 <para>Negation (e.g. "<literal>- (f x)</literal>")
 257                 means "<literal>negate (f x)</literal>" (not
 258                 <literal>Prelude.negate</literal>).</para>
 259               </listitem>
 260
 261               <listitem>
 262                 <para>In an n+k pattern, the standard Prelude
 263                 <literal>Ord</literal> class is still used for comparison,
 264                 but the necessary subtraction uses whatever
 265                 "<literal>(-)</literal>" is in scope (not
 266                 "<literal>Prelude.(-)</literal>").</para>
 267               </listitem>
 268             </itemizedlist>
 269
 270              <para>Note: Negative literals, such as <literal>-3</literal>, are
 271              specified by (a careful reading of) the Haskell Report as
 272              meaning <literal>Prelude.negate (Prelude.fromInteger 3)</literal>.
 273              However, GHC deviates from this slightly, and treats them as meaning
 274              <literal>fromInteger (-3)</literal>.  One particular effect of this
 275              slightly-non-standard reading is that there is no difficulty with
 276              the literal <literal>-2147483648</literal> at type <literal>Int</literal>;
 277              it means <literal>fromInteger (-2147483648)</literal>.  The strict interpretation
 278              would be <literal>negate (fromInteger 2147483648)</literal>,
 279              and the call to <literal>fromInteger</literal> would overflow
 280              (at type <literal>Int</literal>, remember).
 281              </para>
 282
 283           </listitem>
 284         </varlistentry>
 285
 286     </variablelist>
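
    <para>As a rough sketch of the literal-rebinding behaviour described
    under <option>-fno-implicit-prelude</option>, the module below brings
    its own <literal>fromInteger</literal> into scope, so that integer
    literals are interpreted through it.  The names
    <literal>MyNum</literal>, <literal>Mod7</literal> and
    <literal>ten</literal> are hypothetical.  The
    <literal>LANGUAGE</literal> pragma is an assumption: it is how
    current GHC releases request this rebinding, whereas with the
    compiler described here you would pass
    <option>-fno-implicit-prelude</option> on the command line
    instead.</para>

<programlisting>
{-# LANGUAGE RebindableSyntax #-}
module OwnLiterals where

import Prelude hiding (fromInteger)

-- A hypothetical one-method class standing in for a user-defined
-- numeric hierarchy.
class MyNum a where
  fromInteger :: Integer -&gt; a

-- Literals used at type Integer are passed through unchanged.
instance MyNum Integer where
  fromInteger n = n

-- Integers modulo 7; literals are reduced as they are converted.
newtype Mod7 = Mod7 Integer deriving (Eq, Show)

instance MyNum Mod7 where
  fromInteger n = Mod7 (n `mod` 7)

-- The literal 10 means "fromInteger 10" using the fromInteger in
-- scope, so this is Mod7 3; Prelude.fromInteger is not involved.
ten :: Mod7
ten = 10
</programlisting>

    <para>Note that every integer literal in such a module goes through
    the in-scope <literal>fromInteger</literal>, including the
    <literal>7</literal> inside the <literal>Mod7</literal> instance,
    which is why the sketch also needs an instance at
    <literal>Integer</literal>.</para>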
 287   </sect1>
 288
 289 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
 290 &primitives;
 291
 292 <sect1 id="glasgow-ST-monad">
 293 <title>Primitive state-transformer monad</title>
 294
 295 <para>
 296 <indexterm><primary>state transformers (Glasgow extensions)</primary></indexterm>
 297 <indexterm><primary>ST monad (Glasgow extension)</primary></indexterm>
 298 </para>
 299
 300 <para>
 301 This monad underlies our implementation of arrays, mutable and
 302 immutable, and our implementation of I/O, including &ldquo;C calls&rdquo;.
 303 </para>
 304
 305 <para>
 306 The <literal>ST</literal> library, which provides access to the
 307 <function>ST</function> monad, is described in <xref
 308 linkend="sec-ST">.
 309 </para>
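
<para>
As an informal illustration of the state-transformer style, here is a
minimal sketch that sums a list through a mutable reference.  It
assumes the module names <literal>Control.Monad.ST</literal> and
<literal>Data.STRef</literal> used by current libraries, rather than the
<literal>ST</literal> library named above; <function>sumST</function> is
a hypothetical name.
</para>

<programlisting>
module STDemo where

import Control.Monad.ST (runST)
import Data.STRef (newSTRef, readSTRef, modifySTRef')

-- Sum a list using a mutable accumulator.  The mutation is local to
-- runST, so from the outside sumST is an ordinary pure function.
sumST :: [Int] -&gt; Int
sumST xs = runST (do
  ref &lt;- newSTRef 0
  mapM_ (\x -&gt; modifySTRef' ref (+ x)) xs
  readSTRef ref)
</programlisting>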
 310
 311 </sect1>
 312
 313 <sect1 id="glasgow-prim-arrays">
 314 <title>Primitive arrays, mutable and otherwise
 315 </title>
 316
 317 <para>
 318 <indexterm><primary>primitive arrays (Glasgow extension)</primary></indexterm>
 319 <indexterm><primary>arrays, primitive (Glasgow extension)</primary></indexterm>
 320 </para>
 321
 322 <para>
 323 GHC knows about quite a few flavours of Large Swathes of Bytes.
 324 </para>
 325
 326 <para>
 327 First, GHC distinguishes between primitive arrays of (boxed) Haskell
 328 objects (type <literal>Array&num; obj</literal>) and primitive arrays of bytes (type
 329 <literal>ByteArray&num;</literal>).
 330 </para>
 331
 332 <para>
 333 Second, it distinguishes between&hellip;
 334 <variablelist>
 335
 336 <varlistentry>
 337 <term>Immutable:</term>
 338 <listitem>
 339 <para>
 340 Arrays that do not change (as with &ldquo;standard&rdquo; Haskell arrays); you
 341 can only read from them.  Obviously, they do not need the care and
 342 attention of the state-transformer monad.
 343 </para>
 344 </listitem>
 345 </varlistentry>
 346 <varlistentry>
 347 <term>Mutable:</term>
 348 <listitem>
 349 <para>
 350 Arrays that may be changed or &ldquo;mutated.&rdquo;  All the operations on them
 351 live within the state-transformer monad and the updates happen
 352 <emphasis>in-place</emphasis>.
 353 </para>
 354 </listitem>
 355 </varlistentry>
 356 <varlistentry>
 357 <term>&ldquo;Static&rdquo; (in C land):</term>
 358 <listitem>
 359 <para>
 360 A C routine may pass an <literal>Addr&num;</literal> pointer back into Haskell land.  There
 361 are then primitive operations with which you may merrily grab values
 362 over in C land, by indexing off the &ldquo;static&rdquo; pointer.
 363 </para>
 364 </listitem>
 365 </varlistentry>
 366 <varlistentry>
 367 <term>&ldquo;Stable&rdquo; pointers:</term>
 368 <listitem>
 369 <para>
 370 If, for some reason, you wish to hand a Haskell pointer (i.e.,
 371 <emphasis>not</emphasis> an unboxed value) to a C routine, you first make the
 372 pointer &ldquo;stable,&rdquo; so that the garbage collector won't forget that it
 373 exists.  That is, GHC provides a safe way to pass Haskell pointers to
 374 C.
 375 </para>
 376
 377 <para>
 378 Please see <xref LinkEnd="sec-stable-pointers"> for more details.
 379 </para>
 380 </listitem>
 381 </varlistentry>
 382 <varlistentry>
 383 <term>&ldquo;Foreign objects&rdquo;:</term>
 384 <listitem>
 385 <para>
 386 A &ldquo;foreign object&rdquo; is a safe way to pass an external object (a
 387 C-allocated pointer, say) to Haskell and have Haskell do the Right
 388 Thing when it no longer references the object.  So, for example, C
 389 could pass a large bitmap over to Haskell and say &ldquo;please free this
 390 memory when you're done with it.&rdquo;
 391 </para>
 392
 393 <para>
 394 Please see <xref LinkEnd="sec-ForeignObj"> for more details.
 395 </para>
 396 </listitem>
 397 </varlistentry>
 398 </variablelist>
 399 </para>
 400
 401 <para>
 402 The libraries documentation gives more details on all these
 403 &ldquo;primitive array&rdquo; types and the operations on them.
 404 </para>
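
<para>
To make the mutable/immutable distinction concrete, the following
sketch fills a mutable unboxed array in place inside the
state-transformer monad and then hands it back as an immutable one.  It
uses the higher-level <literal>Data.Array.ST</literal> and
<literal>Data.Array.Unboxed</literal> modules of current libraries (an
assumption; it does not touch <literal>MutableByteArray&num;</literal>
directly), and the function names are hypothetical.
</para>

<programlisting>
module ArrayDemo where

import Data.Array.ST (newArray, writeArray, runSTUArray)
import Data.Array.Unboxed (UArray, elems)

-- Build the array [0, 1, 4, 9, ...] of the first n squares.
-- Inside runSTUArray the array is mutable and updated in place;
-- the result is an immutable array that can only be read.
squares :: Int -&gt; UArray Int Int
squares n = runSTUArray (do
  arr &lt;- newArray (0, n - 1) 0
  mapM_ (\i -&gt; writeArray arr i (i * i)) [0 .. n - 1]
  return arr)

-- Reading from the immutable result needs no monad at all.
squaresList :: Int -&gt; [Int]
squaresList n = elems (squares n)
</programlisting>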
 405
 406 </sect1>
 407
 408
 409 <sect1 id="pattern-guards">
 410 <title>Pattern guards</title>
 411
 412 <para>
 413 <indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
 414 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ULink URL="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ULink>. (Note that the proposal was written before pattern guards were implemented, so it refers to them as unimplemented.)
 415 </para>
 416
 417 <para>
 418 Suppose we have an abstract data type of finite maps, with a
 419 lookup operation:
 420
 421 <programlisting>
 422 lookup :: FiniteMap -> Int -> Maybe Int
 423 </programlisting>
 424
 425 The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
 426 where <VarName>v</VarName> is the value that the key maps to.  Now consider the following definition:
 427 </para>
 428
 429 <programlisting>
 430 clunky env var1 var2 | ok1 && ok2 = val1 + val2
 431                      | otherwise  = var1 + var2
 432   where
 433     m1 = lookup env var1
 434     m2 = lookup env var2
 435     ok1 = maybeToBool m1
 436     ok2 = maybeToBool m2
 437     val1 = expectJust m1
 438     val2 = expectJust m2
 439 </programlisting>
 440
 441 <para>
 442 The auxiliary functions are
 443 </para>
 444
 445 <programlisting>
 446 maybeToBool :: Maybe a -&gt; Bool
 447 maybeToBool (Just x) = True
 448 maybeToBool Nothing  = False
 449
 450 expectJust :: Maybe a -&gt; a
 451 expectJust (Just x) = x
 452 expectJust Nothing  = error "Unexpected Nothing"
 453 </programlisting>
 454
 455 <para>
 456 What is <function>clunky</function> doing? The guard <literal>ok1 &&
 457 ok2</literal> checks that both lookups succeed, using
 458 <function>maybeToBool</function> to convert the <function>Maybe</function>
 459 types to booleans. The (lazily evaluated) <function>expectJust</function>
 460 calls extract the values from the results of the lookups, and bind the
 461 returned values to <VarName>val1</VarName> and <VarName>val2</VarName>
 462 respectively.  If either lookup fails, then clunky takes the
 463 <literal>otherwise</literal> case and returns the sum of its arguments.
 464 </para>
 465
 466 <para>
 467 This is certainly legal Haskell, but it is a tremendously verbose and
 468 un-obvious way to achieve the desired effect.  Arguably, a more direct way
 469 to write clunky would be to use case expressions:
 470 </para>
 471
 472 <programlisting>
 473 clunky env var1 var2 = case lookup env var1 of
 474   Nothing -&gt; fail
 475   Just val1 -&gt; case lookup env var2 of
 476     Nothing -&gt; fail
 477     Just val2 -&gt; val1 + val2
 478   where
 479     fail = var1 + var2
 480 </programlisting>
 481
 482 <para>
 483 This is a bit shorter, but hardly better.  Of course, we can rewrite any set
 484 of pattern-matching, guarded equations as case expressions; that is
 485 precisely what the compiler does when compiling equations! The reason that
 486 Haskell provides guarded equations is that they allow us to write down
 487 the cases we want to consider, one at a time, independently of each other.
 488 This structure is hidden in the case version.  Two of the right-hand sides
 489 are really the same (<function>fail</function>), and the whole expression
 490 tends to become more and more indented.
 491 </para>
 492
 493 <para>
 494 Here is how I would write clunky:
 495 </para>
 496
 497 <programlisting>
 498 clunky env var1 var2
 499   | Just val1 &lt;- lookup env var1
 500   , Just val2 &lt;- lookup env var2
 501   = val1 + val2
 502 ...other equations for clunky...
 503 </programlisting>
 504
 505 <para>
 506 The semantics should be clear enough.  The qualifiers are matched in order.
 507 For a <literal>&lt;-</literal> qualifier, which I call a pattern guard, the
 508 right hand side is evaluated and matched against the pattern on the left.
 509 If the match fails then the whole guard fails and the next equation is
 510 tried.  If it succeeds, then the appropriate binding takes place, and the
 511 next qualifier is matched, in the augmented environment.  Unlike list
 512 comprehensions, however, the type of the expression to the right of the
 513 <literal>&lt;-</literal> is the same as the type of the pattern to its
 514 left.  The bindings introduced by pattern guards scope over all the
 515 remaining guard qualifiers, and over the right hand side of the equation.
 516 </para>
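
<para>
For readers who want to try this out, here is a self-contained sketch of
<function>clunky</function> written with pattern guards.  It assumes a
simple association list in place of the abstract finite map;
<literal>Env</literal> and <function>lookupEnv</function> are
hypothetical names, and the final equation stands in for the
&ldquo;other equations&rdquo; elided above.
</para>

<programlisting>
module PatternGuardDemo where

-- A stand-in for the abstract finite map: a list of key/value pairs.
type Env = [(Int, Int)]

-- Same argument order as the lookup used in the text.
lookupEnv :: Env -&gt; Int -&gt; Maybe Int
lookupEnv env key = lookup key env

clunky :: Env -&gt; Int -&gt; Int -&gt; Int
clunky env var1 var2
  | Just val1 &lt;- lookupEnv env var1
  , Just val2 &lt;- lookupEnv env var2
  = val1 + val2
-- Reached when either pattern guard fails to match.
clunky _ var1 var2 = var1 + var2
</programlisting>

<para>
With the environment <literal>[(1,10),(2,20)]</literal>,
<literal>clunky env 1 2</literal> adds the two looked-up values, while
<literal>clunky env 1 3</literal> falls through to the second equation
because the lookup of <literal>3</literal> yields
<function>Nothing</function>.
</para>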
 517
 518 <para>
 519 Just as with list comprehensions, boolean expressions can be freely mixed
 520 among the pattern guards.  For example:
 521 </para>
 522
 523 <programlisting>
 524 f x | [y] &lt;- x
 525     , y &gt; 3
 526     , Just z &lt;- h y
 527     = ...
 528 </programlisting>
 529
 530 <para>
 531 Haskell's current guards therefore emerge as a special case, in which the
 532 qualifier list has just one element, a boolean expression.
 533 </para>
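
<para>
A runnable variant of the mixed guard, as a sketch only: the helper
<function>h</function>, the result types and the fall-through equation
are all invented here for illustration, since the text leaves them
unspecified.
</para>

<programlisting>
module MixedGuardDemo where

-- Hypothetical helper: halve even numbers, reject odd ones.
h :: Int -&gt; Maybe Int
h y | even y    = Just (y `div` 2)
    | otherwise = Nothing

-- The qualifiers are tried left to right: x must be a singleton
-- list, its element must exceed 3, and h must succeed on it.
f :: [Int] -&gt; Int
f x | [y] &lt;- x
    , y &gt; 3
    , Just z &lt;- h y
    = z
-- Any failing qualifier sends us to this equation.
f _ = 0
</programlisting>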
 534 </sect1>
 535
 536   <sect1 id="sec-ffi">
 537     <title>The foreign interface</title>
 538
 539     <para>The foreign interface consists of the following components:</para>
 540
 541     <itemizedlist>
 542       <listitem>
 543         <para>The Foreign Function Interface language specification
 544         (included in this manual, in <xref linkend="ffi">).</para>
 545       </listitem>
 546
 547       <listitem>
 548         <para>The <literal>Foreign</literal> module (see <xref
 549         linkend="sec-Foreign">) collects together several interfaces
 550         which are useful in specifying foreign language
 551         interfaces, including the following:</para>
 552
 553         <itemizedlist>
 554           <listitem>
 555             <para>The <literal>ForeignObj</literal> module (see <xref
 556             linkend="sec-ForeignObj">), for managing pointers from
 557             Haskell into the outside world.</para>
 558           </listitem>
 559
 560           <listitem>
 561             <para>The <literal>StablePtr</literal> module (see <xref
 562             linkend="sec-stable-pointers">), for managing pointers
 563             into Haskell from the outside world.</para>
 564           </listitem>
 565
 566           <listitem>
 567             <para>The <literal>CTypes</literal> module (see <xref
 568             linkend="sec-CTypes">) gives Haskell equivalents for the
 569             standard C datatypes, for use in making Haskell bindings
 570             to existing C libraries.</para>
 571           </listitem>
 572
 573           <listitem>
 574             <para>The <literal>CTypesISO</literal> module (see <xref
 575             linkend="sec-CTypesISO">) gives Haskell equivalents for C
 576             types defined by the ISO C standard.</para>
 577           </listitem>
 578
 579           <listitem>
 580             <para>The <literal>Storable</literal> library, for
 581             primitive marshalling of data types between Haskell and
 582             the foreign language.</para>
 583           </listitem>
 584         </itemizedlist>
 585
 586       </listitem>
 587     </itemizedlist>
 588
 589 <para>The following sections also give some hints and tips on the use
 590 of the foreign function interface in GHC.</para>
 591
 592 <sect2 id="glasgow-foreign-headers">
 593 <title>Using function headers
 594 </title>
 595
 596 <para>
 597 <indexterm><primary>C calls, function headers</primary></indexterm>
 598 </para>
 599
 600 <para>
 601 When generating C (using the <option>-fvia-C</option> option), one can assist the
 602 C compiler in detecting type errors by using the <Command>-&num;include</Command> directive
 603 to provide <filename>.h</filename> files containing function headers.
 604 </para>
 605
 606 <para>
 607 For example,
 608 </para>
 609
 610 <para>
 611
 612 <programlisting>
 613 #include "HsFFI.h"
 614
 615 void         initialiseEFS (HsInt size);
 616 HsInt        terminateEFS (void);
 617 HsForeignObj emptyEFS(void);
 618 HsForeignObj updateEFS (HsForeignObj a, HsInt i, HsInt x);
 619 HsInt        lookupEFS (HsForeignObj a, HsInt i);
 620 </programlisting>
 621 </para>
 622
 623       <para>The types <literal>HsInt</literal>,
 624       <literal>HsForeignObj</literal> etc. are described in <xref
 625       linkend="sec-mapping-table">.</para>
 626
 627       <para>Note that this approach is only
 628       <emphasis>essential</emphasis> for returning
 629       <literal>float</literal>s (or if <literal>sizeof(int) !=
 630       sizeof(int *)</literal> on your architecture) but is a Good
 631       Thing for anyone who cares about writing solid code.  You're
 632       crazy not to do it.</para>
 633
 634 </sect2>
 635
 636 </sect1>
 637
 638 <sect1 id="multi-param-type-classes">
 639 <title>Multi-parameter type classes
 640 </title>
 641
 642 <para>
 643 This section documents GHC's implementation of multi-parameter type
 644 classes.  There's lots of background in the paper <ULink
 645 URL="http://research.microsoft.com/~simonpj/multi.ps.gz" >Type
 646 classes: exploring the design space</ULink > (Simon Peyton Jones, Mark
 647 Jones, Erik Meijer).
 648 </para>
 649
 650 <para>
 651 I'd like to thank people who reported shortcomings in the GHC 3.02
 652 implementation.  Our default decisions were all conservative ones, and
 653 the experience of these heroic pioneers has given useful concrete
 654 examples to support several generalisations.  (These appear below as
 655 design choices not implemented in 3.02.)
 656 </para>
 657
 658 <para>
 659 I've discussed these notes with Mark Jones, and I believe that Hugs
 660 will migrate towards the same design choices as I outline here.
 661 Thanks to him, and to many others who have offered very useful
 662 feedback.
 663 </para>
 664
 665 <sect2>
 666 <title>Types</title>
 667
 668 <para>
 669 There are the following restrictions on the form of a qualified
 670 type:
 671 </para>
 672
 673 <para>
 674
 675 <programlisting>
 676   forall tv1..tvn. (c1, ...,cn) => type
 677 </programlisting>
 678
 679 </para>
 680
 681 <para>
 682 (Here, I write the "foralls" explicitly, although the Haskell source
 683 language omits them; in Haskell 1.4, all the free type variables of an
 684 explicit source-language type signature are universally quantified,
 685 except for the class type variables in a class declaration.  However,
 686 in GHC, you can give the foralls if you want.  See <xref LinkEnd="universal-quantification">).
 687 </para>
 688
 689 <para>
 690
 691 <OrderedList>
 692 <listitem>
 693
 694 <para>
 695  <emphasis>Each universally quantified type variable
 696 <literal>tvi</literal> must be mentioned (i.e. appear free) in <literal>type</literal></emphasis>.
 697
 698 The reason for this is that a value with a type that does not obey
 699 this restriction could not be used without introducing
 700 ambiguity. Here, for example, is an illegal type:
 701
 702
 703 <programlisting>
 704   forall a. Eq a => Int
 705 </programlisting>
 706
 707
 708 When a value with this type was used, the constraint <literal>Eq tv</literal>
 709 would be introduced where <literal>tv</literal> is a fresh type variable, and
 710 (in the dictionary-translation implementation) the value would be
 711 applied to a dictionary for <literal>Eq tv</literal>.  The difficulty is that we
 712 can never know which instance of <literal>Eq</literal> to use because we never
 713 get any more information about <literal>tv</literal>.
 714
 715 </para>
 716 </listitem>
 717 <listitem>
 718
 719 <para>
 720  <emphasis>Every constraint <literal>ci</literal> must mention at least one of the
 721 universally quantified type variables <literal>tvi</literal></emphasis>.
 722
 723 For example, this type is OK because <literal>C a b</literal> mentions the
 724 universally quantified type variable <literal>a</literal>:
 725
 726
 727 <programlisting>
 728   forall a. C a b => burble
 729 </programlisting>
 730
 731
 732 The next type is illegal because the constraint <literal>Eq b</literal> does not
 733 mention <literal>a</literal>:
 734
 735
 736 <programlisting>
 737   forall a. Eq b => burble
 738 </programlisting>
 739
 740
 741 The reason for this restriction is milder than the other one.  The
 742 excluded types are never useful or necessary (because the offending
 743 context doesn't need to be witnessed at this point; it can be floated
 744 out).  Furthermore, floating them out increases sharing. Lastly,
 745 excluding them is a conservative choice; it leaves a patch of
 746 territory free in case we need it later.
 747
 748 </para>
 749 </listitem>
 750
 751 </OrderedList>
 752
 753 </para>
 754
 755 <para>
 756 These restrictions apply to all types, whether declared in a type signature
 757 or inferred.
 758 </para>
 759
 760 <para>
 761 Unlike Haskell 1.4, constraints in types do <emphasis>not</emphasis> have to be of
 762 the form <emphasis>(class type-variables)</emphasis>.  Thus, these type signatures
 763 are perfectly OK:
 764 </para>
 765
 766 <para>
 767
 768 <programlisting>
 769   f :: Eq (m a) => [m a] -> [m a]
 770   g :: Eq [a] => ...
 771 </programlisting>
 772
 773 </para>
 774
 775 <para>
 776 This choice recovers principal types, a property that Haskell 1.4 does not have.
 777 </para>
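
<para>
As a small sketch of signatures with such contexts, the module below
gives complete definitions for two functions whose types follow the
shapes of <literal>f</literal> and <literal>g</literal> above.  The
function names and bodies are hypothetical.  The
<literal>LANGUAGE</literal> pragma is an assumption about current GHC
releases; with the compiler described here the corresponding switch is
<option>-fglasgow-exts</option>.
</para>

<programlisting>
{-# LANGUAGE FlexibleContexts #-}
module ContextDemo where

-- Context of the form Eq (m a): drop adjacent duplicates from a
-- list whose elements have some type (m a) supporting equality.
dedup :: Eq (m a) =&gt; [m a] -&gt; [m a]
dedup (x : y : rest)
  | x == y    = dedup (y : rest)
  | otherwise = x : dedup (y : rest)
dedup xs = xs

-- Context of the form Eq [a]: the class is applied to a list type
-- rather than to a bare type variable.
sameList :: Eq [a] =&gt; [a] -&gt; [a] -&gt; Bool
sameList xs ys = xs == ys
</programlisting>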
 778
 779 </sect2>
 780
 781 <sect2>
 782 <title>Class declarations</title>
 783
 784 <para>
 785
 786 <OrderedList>
 787 <listitem>
 788
 789 <para>
 790  <emphasis>Multi-parameter type classes are permitted</emphasis>. For example:
 791
 792
 793 <programlisting>
 794   class Collection c a where
 795     union :: c a -> c a -> c a
 796     ...etc.
 797 </programlisting>
 798
 799
 800
 801 </para>
 802 </listitem>
 803 <listitem>
 804
 805 <para>
 806  <emphasis>The class hierarchy must be acyclic</emphasis>.  However, the definition
 807 of "acyclic" involves only the superclass relationships.  For example,
 808 this is OK:
 809
 810
 811 <programlisting>
 812   class C a where {
 813     op :: D b => a -> b -> b
 814   }
 815
 816   class C a => D a where { ... }
 817 </programlisting>
 818
 819
 820 Here, <literal>C</literal> is a superclass of <literal>D</literal>, but it's OK for a
 821 class operation <literal>op</literal> of <literal>C</literal> to mention <literal>D</literal>.  (It
 822 would not be OK for <literal>D</literal> to be a superclass of <literal>C</literal>.)
 823
 824 </para>
 825 </listitem>
 826 <listitem>
 827
 828 <para>
 829  <emphasis>There are no restrictions on the context in a class declaration
 830 (which introduces superclasses), except that the class hierarchy must
 831 be acyclic</emphasis>.  So these class declarations are OK:
 832
 833
 834 <programlisting>
 835   class Functor (m k) => FiniteMap m k where
 836     ...
 837
 838   class (Monad m, Monad (t m)) => Transform t m where
 839     lift :: m a -> (t m) a
 840 </programlisting>
 841
 842
 843 </para>
 844 </listitem>
 845 <listitem>
 846
 847 <para>
 848  <emphasis>In the signature of a class operation, every constraint
 849 must mention at least one type variable that is not a class type
 850 variable</emphasis>.
 851
 852 Thus:
 853
 854
 855 <programlisting>
 856   class Collection c a where
 857     mapC :: Collection c b => (a->b) -> c a -> c b
 858 </programlisting>
 859
 860
 861 is OK because the constraint <literal>(Collection c b)</literal> mentions
 862 <literal>b</literal>, even though it also mentions the class variable
 863 <literal>c</literal>.  On the other hand:
 864
 865
 866 <programlisting>
 867   class C a where
 868     op :: Eq a => (a,b) -> (a,b)
 869 </programlisting>
 870
 871
 872 is not OK because the constraint <literal>(Eq a)</literal> mentions only the class
 873 type variable <literal>a</literal>, but not <literal>b</literal>.  However, any such
 874 example is easily fixed by moving the offending context up to the
 875 superclass context:
 876
 877
 878 <programlisting>
 879   class Eq a => C a where
 880     op :: (a,b) -> (a,b)
 881 </programlisting>
 882
 883
 884 A yet more relaxed rule would allow the context of a class-op signature
 885 to mention only class type variables.  However, that conflicts with
 886 Rule 1(b) for types above.
 887
 888 </para>
 889 </listitem>
 890 <listitem>
 891
 892 <para>
 893  <emphasis>The type of each class operation must mention <emphasis>all</emphasis> of
 894 the class type variables</emphasis>.  For example:
 895
 896
 897 <programlisting>
 898   class Coll s a where
 899     empty  :: s
 900     insert :: s -> a -> s
 901 </programlisting>
 902
 903
 904 is not OK, because the type of <literal>empty</literal> doesn't mention
 905 <literal>a</literal>.  This rule is a consequence of Rule 1(a), above, for
 906 types, and has the same motivation.
 907
 908 Sometimes, offending class declarations exhibit misunderstandings.  For
 909 example, <literal>Coll</literal> might be rewritten
 910
 911
 912 <programlisting>
 913   class Coll s a where
 914     empty  :: s a
 915     insert :: s a -> a -> s a
 916 </programlisting>
 917
 918
 919 which makes the connection between the type of a collection of
 920 <literal>a</literal>'s (namely <literal>(s a)</literal>) and the element type <literal>a</literal>.
 921 Occasionally this really doesn't work, in which case you can split the
 922 class like this:
 923
 924
 925 <programlisting>
 926   class CollE s where
 927     empty  :: s
 928
 929   class CollE s => Coll s a where
 930     insert :: s -> a -> s
 931 </programlisting>
 932
 933
 934 </para>
 935 </listitem>
 936
 937 </OrderedList>
 938
 939 </para>
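
<para>
To make the first and last rules concrete, here is a small, self-contained
sketch of a multi-parameter class together with one instance.  The type
<literal>ListSet</literal> and the operation <literal>toL</literal> are
invented for illustration.  The <literal>LANGUAGE</literal> pragmas are an
assumption about more recent GHCs; with the compiler described here you
would rely on <option>-fglasgow-exts</option> instead.
</para>

<para>

<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
import Data.List (nub)

-- c is the collection constructor, a the element type.  The type of
-- every operation mentions both class variables.
class Collection c a where
  empty  :: c a
  insert :: a -> c a -> c a
  union  :: c a -> c a -> c a
  toL    :: c a -> [a]

-- A hypothetical collection: a list without duplicates.
newtype ListSet a = ListSet [a]

instance Eq a => Collection ListSet a where
  empty                           = ListSet []
  insert x (ListSet xs)           = ListSet (nub (x : xs))
  union (ListSet xs) (ListSet ys) = ListSet (nub (xs ++ ys))
  toL (ListSet xs)                = xs
</programlisting>

</para>

<para>
With these definitions,
<literal>toL (insert 2 (insert 1 (empty :: ListSet Int)))</literal>
evaluates to <literal>[2,1]</literal>.
</para>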
 940
 941 </sect2>
 942
 943 <sect2 id="instance-decls">
 944 <title>Instance declarations</title>
 945
 946 <para>
 947
 948 <OrderedList>
 949 <listitem>
 950
 951 <para>
 952  <emphasis>Instance declarations may not overlap</emphasis>.  The two instance
 953 declarations
 954
 955
 956 <programlisting>
 957   instance context1 => C type1 where ...
 958   instance context2 => C type2 where ...
 959 </programlisting>
 960
 961
 962 "overlap" if <literal>type1</literal> and <literal>type2</literal> unify.
 963
 964 However, if you give the command line option
 965 <option>-fallow-overlapping-instances</option><indexterm><primary>-fallow-overlapping-instances
 966 option</primary></indexterm> then two overlapping instance declarations are permitted
 967 iff
 968
 969
 970 <itemizedlist>
 971 <listitem>
 972
 973 <para>
 974  EITHER <literal>type1</literal> and <literal>type2</literal> do not unify
 975 </para>
 976 </listitem>
 977 <listitem>
 978
 979 <para>
 980  OR <literal>type2</literal> is a substitution instance of <literal>type1</literal>
 981 (but not identical to <literal>type1</literal>)
 982 </para>
 983 </listitem>
 984 <listitem>
 985
 986 <para>
 987  OR vice versa
 988 </para>
 989 </listitem>
 990
 991 </itemizedlist>
 992
 993
 994 Notice that these rules
 995
 996
 997 <itemizedlist>
 998 <listitem>
 999
1000 <para>
1001  make it clear which instance decl to use
1002 (pick the most specific one that matches)
1003
1004 </para>
1005 </listitem>
1006 <listitem>
1007
1008 <para>
1009  do not mention the contexts <literal>context1</literal>, <literal>context2</literal>.
1010 Reason: you can pick which instance decl
1011 "matches" based on the type.
1012 </para>
1013 </listitem>
1014
1015 </itemizedlist>
1016
1017
1018 Regrettably, GHC doesn't guarantee to detect overlapping instance
1019 declarations if they appear in different modules.  GHC can "see" the
1020 instance declarations in the transitive closure of all the modules
1021 imported by the one being compiled, so it can "see" all instance decls
1022 when it is compiling <literal>Main</literal>.  However, it currently chooses not
1023 to look at ones that can't possibly be of use in the module currently
1024 being compiled, in the interests of efficiency.  (Perhaps we should
1025 change that decision, at least for <literal>Main</literal>.)
1026
1027 </para>
1028 </listitem>
1029 <listitem>
1030
1031 <para>
1032  <emphasis>There are no restrictions on the types in an instance
1033 <emphasis>head</emphasis>, except that at least one of them must not be a type variable</emphasis>.
1034 The instance "head" is the bit after the "=>" in an instance decl. For
1035 example, these are OK:
1036
1037
1038 <programlisting>
1039   instance C Int a where ...
1040
1041   instance D (Int, Int) where ...
1042
1043   instance E [[a]] where ...
1044 </programlisting>
1045
1046
1047 Note that instance heads <emphasis>may</emphasis> contain repeated type variables.
1048 For example, this is OK:
1049
1050
1051 <programlisting>
1052   instance Stateful (ST s) (MutVar s) where ...
1053 </programlisting>
1054
1055
1056 The "at least one not a type variable" restriction is to ensure that
1057 context reduction terminates: each reduction step removes one type
1058 constructor.  For example, the following would make the type checker
1059 loop if it wasn't excluded:
1060
1061
1062 <programlisting>
1063   instance C a => C a where ...
1064 </programlisting>
1065
1066
1067 There are two situations in which the rule is a bit of a pain. First,
1068 if one allows overlapping instance declarations then it's quite
1069 convenient to have a "default instance" declaration that applies if
1070 something more specific does not:
1071
1072
1073 <programlisting>
1074   instance C a where
1075     op = ... -- Default
1076 </programlisting>
1077
1078
1079 Second, sometimes you might want to use the following to get the
1080 effect of a "class synonym":
1081
1082
1083 <programlisting>
1084   class (C1 a, C2 a, C3 a) => C a where { }
1085
1086   instance (C1 a, C2 a, C3 a) => C a where { }
1087 </programlisting>
1088
1089
1090 This allows you to write shorter signatures:
1091
1092
1093 <programlisting>
1094   f :: C a => ...
1095 </programlisting>
1096
1097
1098 instead of
1099
1100
1101 <programlisting>
1102   f :: (C1 a, C2 a, C3 a) => ...
1103 </programlisting>
1104
1105
1106 I'm on the lookout for a simple rule that preserves decidability while
1107 allowing these idioms.  The experimental flag
1108 <option>-fallow-undecidable-instances</option><indexterm><primary>-fallow-undecidable-instances
1109 option</primary></indexterm> lifts this restriction, allowing all the types in an
1110 instance head to be type variables.
1111
1112 </para>
1113 </listitem>
1114 <listitem>
1115
1116 <para>
1117  <emphasis>Unlike Haskell 1.4, instance heads may use type
1118 synonyms</emphasis>.  As always, using a type synonym is just shorthand for
1119 writing the RHS of the type synonym definition.  For example:
1120
1121
1122 <programlisting>
1123   type Point = (Int,Int)
1124   instance C Point   where ...
1125   instance C [Point] where ...
1126 </programlisting>
1127
1128
1129 is legal.  However, if you added
1130
1131
1132 <programlisting>
1133   instance C (Int,Int) where ...
1134 </programlisting>
1135
1136
1137 as well, then the compiler will complain about the overlapping
1138 (actually, identical) instance declarations.  As always, type synonyms
1139 must be fully applied.  You cannot, for example, write:
1140
1141
1142 <programlisting>
1143   type P a = [[a]]
1144   instance Monad P where ...
1145 </programlisting>
1146
1147
1148 This design decision is independent of all the others, and easily
1149 reversed, but it makes sense to me.
1150
1151 </para>
1152 </listitem>
1153 <listitem>
1154
1155 <para>
1156 <emphasis>The types in an instance-declaration <emphasis>context</emphasis> must all
1157 be type variables</emphasis>. Thus
1158
1159
1160 <programlisting>
1161 instance C a b => Eq (a,b) where ...
1162 </programlisting>
1163
1164
1165 is OK, but
1166
1167
1168 <programlisting>
1169 instance C Int b => Foo b where ...
1170 </programlisting>
1171
1172
1173 is not OK.  Again, the intent here is to make sure that context
1174 reduction terminates.
1175
1176 Voluminous correspondence on the Haskell mailing list has convinced me
1177 that it's worth experimenting with a more liberal rule.  If you use
1178 the flag <option>-fallow-undecidable-instances</option> you can use arbitrary
1179 types in an instance context.  Termination is ensured by having a
1180 fixed-depth recursion stack.  If you exceed the stack depth you get a
1181 sort of backtrace, and the opportunity to increase the stack depth
1182 with <option>-fcontext-stack</option><emphasis>N</emphasis>.
1183
1184 </para>
1185 </listitem>
1186
1187 </OrderedList>
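
To see the "pick the most specific one that matches" rule at work, here
is a minimal sketch.  The class <literal>Describe</literal> is made up for
the purpose.  The per-instance <literal>OVERLAPPABLE</literal> and
<literal>OVERLAPPING</literal> pragmas are an assumption about more recent
GHCs; with the compiler described here you would leave them out and give
<option>-fallow-overlapping-instances</option> instead.

<programlisting>
{-# LANGUAGE FlexibleInstances #-}

class Describe a where
  describe :: a -> String

-- The general instance: it matches every list type.
instance {-# OVERLAPPABLE #-} Describe [a] where
  describe xs = "a list of " ++ show (length xs) ++ " elements"

-- [Char] is a substitution instance of [a], so the overlap is permitted.
instance {-# OVERLAPPING #-} Describe [Char] where
  describe s = "the string " ++ show s
</programlisting>

Here <literal>describe "hi"</literal> uses the <literal>[Char]</literal>
instance, while <literal>describe [True, False]</literal> falls back to the
general one.  The choice depends only on the types, never on the contexts.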
1188
1189 </para>
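
<para>
The "class synonym" idiom can be sketched as follows.  The names
<literal>ShowOrd</literal> and <literal>largest</literal> are invented,
and the pragmas are an assumption about more recent GHCs; with the
compiler described here the corresponding switch is
<option>-fallow-undecidable-instances</option>.
</para>

<para>

<programlisting>
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}

-- ShowOrd stands for the pair of constraints (Show a, Ord a).
class (Show a, Ord a) => ShowOrd a

-- The instance head is a bare type variable, which is exactly what
-- the "at least one not a type variable" rule forbids by default.
instance (Show a, Ord a) => ShowOrd a

-- A shorter signature than (Show a, Ord a) => [a] -> String.
-- Like maximum, this is only defined for non-empty lists.
largest :: ShowOrd a => [a] -> String
largest xs = show (maximum xs)
</programlisting>

</para>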
1190
1191 </sect2>
1192
1193 </sect1>
1194
1195 <sect1 id="implicit-parameters">
1196 <title>Implicit parameters
1197 </title>
1198
1199 <para> Implicit parameters are implemented as described in
1200 "Implicit parameters: dynamic scoping with static types",
1201 J Lewis, MB Shields, E Meijer, J Launchbury,
1202 27th ACM Symposium on Principles of Programming Languages (POPL'00),
1203 Boston, Jan 2000.
1204 </para>
1205
1206 <para>
1207 There should be more documentation, but there isn't (yet).  Yell if you need it.
1208 </para>
1209 <itemizedlist>
1210 <listitem>
1211 <para> You can't have an implicit parameter in the context of a class or instance
1212 declaration.  For example, both these declarations are illegal:
1213 <programlisting>
1214   class (?x::Int) => C a where ...
1215   instance (?x::a) => Foo [a] where ...
1216 </programlisting>
1217 Reason: exactly which implicit parameter you pick up depends on exactly where
1218 you invoke a function. But the "invocation" of instance declarations is done
1219 behind the scenes by the compiler, so it's hard to figure out exactly where it is done.
1220 The easiest thing is to outlaw the offending types.</para>
1221 </listitem>
1222
1223 </itemizedlist>
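
<para>
For orientation only, here is a minimal sketch of an implicit parameter
in use.  The function names are invented.  The binding form
<literal>let ?cmp = ...</literal> and the <literal>LANGUAGE</literal>
pragma are those of more recent GHCs, which is an assumption on my part;
the compiler described here may spell the binding construct differently.
</para>

<para>

<programlisting>
{-# LANGUAGE ImplicitParams #-}
import Data.List (sortBy)

-- ?cmp is an implicit parameter: it appears in the context of the
-- type, but not in the argument list.
sortImplicit :: (?cmp :: a -> a -> Ordering) => [a] -> [a]
sortImplicit xs = sortBy ?cmp xs

-- The caller binds ?cmp; the binding reaches sortImplicit without
-- being passed explicitly.
descending :: [Int] -> [Int]
descending xs = let ?cmp = flip compare in sortImplicit xs
</programlisting>

</para>

<para>
Which <literal>?cmp</literal> is picked up depends on where
<function>sortImplicit</function> is invoked, which is the reason given
above for keeping implicit parameters out of class and instance contexts.
</para>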
1224
1225 </sect1>
1226
1227
1228 <sect1 id="functional-dependencies">
1229 <title>Functional dependencies
1230 </title>
1231
1232 <para> Functional dependencies are implemented as described by Mark Jones
1233 in "Type Classes with Functional Dependencies", Mark P. Jones,
1234 In Proceedings of the 9th European Symposium on Programming,
1235 ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782.
1236 </para>
1237
1238 <para>
1239 There should be more documentation, but there isn't (yet).  Yell if you need it.
1240 </para>
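
<para>
Until then, a minimal sketch in the style of the paper's collection
class may help.  The operation names are invented, and the pragmas are
an assumption about more recent GHCs; with the compiler described here
you would use <option>-fglasgow-exts</option>.
</para>

<para>

<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}

-- "ce -> e" says that the collection type determines the element type.
class Collects e ce | ce -> e where
  cempty   :: ce
  cinsert  :: e -> ce -> ce
  contents :: ce -> [e]

instance Collects a [a] where
  cempty   = []
  cinsert  = (:)
  contents = id

-- The type of cempty mentions only ce; the dependency recovers e from it.
pair :: [Char]
pair = cinsert 'a' (cinsert 'b' cempty)
</programlisting>

</para>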
1241 </sect1>
1242
1243
1244 <sect1 id="universal-quantification">
1245 <title>Explicit universal quantification
1246 </title>
1247
1248 <para>
1249 GHC's type system supports explicit universal quantification in
1250 constructor fields and function arguments.  This is useful for things
1251 like defining <literal>runST</literal> from the state-thread world.
1252 GHC's syntax for this now agrees with Hugs's, namely:
1253 </para>
1254
1255 <para>
1256
1257 <programlisting>
1258         forall a b. (Ord a, Eq  b) => a -> b -> a
1259 </programlisting>
1260
1261 </para>
1262
1263 <para>
1264 The context is, of course, optional.  You can't use <literal>forall</literal> as
1265 a type variable any more!
1266 </para>
1267
1268 <para>
1269 Haskell type signatures are implicitly quantified.  The <literal>forall</literal>
1270 allows us to say exactly what this means.  For example:
1271 </para>
1272
1273 <para>
1274
1275 <programlisting>
1276         g :: b -> b
1277 </programlisting>
1278
1279 </para>
1280
1281 <para>
1282 means this:
1283 </para>
1284
1285 <para>
1286
1287 <programlisting>
1288         g :: forall b. (b -> b)
1289 </programlisting>
1290
1291 </para>
1292
1293 <para>
1294 The two are treated identically.
1295 </para>
1296
1297 <sect2 id="univ">
1298 <title>Universally-quantified data type fields
1299 </title>
1300
1301 <para>
1302 In a <literal>data</literal> or <literal>newtype</literal> declaration one can quantify
1303 the types of the constructor arguments.  Here are several examples:
1304 </para>
1305
1306 <para>
1307
1308 <programlisting>
1309 data T a = T1 (forall b. b -> b -> b) a
1310
1311 data MonadT m = MkMonad { return :: forall a. a -> m a,
1312                           bind   :: forall a b. m a -> (a -> m b) -> m b
1313                         }
1314
1315 newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
1316 </programlisting>
1317
1318 </para>
1319
1320 <para>
1321 The constructors now have so-called <emphasis>rank 2</emphasis> polymorphic
1322 types, in which there is a for-all in the argument types:
1323 </para>
1324
1325 <para>
1326
1327 <programlisting>
1328 T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
1329 MkMonad :: forall m. (forall a. a -> m a)
1330                   -> (forall a b. m a -> (a -> m b) -> m b)
1331                   -> MonadT m
1332 MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
1333 </programlisting>
1334
1335 </para>
1336
1337 <para>
1338 Notice that you don't need to use a <literal>forall</literal> if there's an
1339 explicit context.  For example in the first argument of the
1340 constructor <function>MkSwizzle</function>, an implicit "<literal>forall a.</literal>" is
1341 prefixed to the argument type.  The implicit <literal>forall</literal>
1342 quantifies all type variables that are not already in scope, and are
1343 mentioned in the type quantified over.
1344 </para>
1345
1346 <para>
1347 As for type signatures, implicit quantification happens for non-overloaded
1348 types too.  So if you write this:
1349
1350 <programlisting>
1351   data T a = MkT (Either a b) (b -> b)
1352 </programlisting>
1353
1354 it's just as if you had written this:
1355
1356 <programlisting>
1357   data T a = MkT (forall b. Either a b) (forall b. b -> b)
1358 </programlisting>
1359
1360 That is, since the type variable <literal>b</literal> isn't in scope, it's
1361 implicitly universally quantified.  (Arguably, it would be better
1362 to <emphasis>require</emphasis> explicit quantification on constructor arguments
1363 where that is what is wanted.  Feedback welcomed.)
1364 </para>
1365
1366 </sect2>
1367
1368 <sect2>
1369 <title>Construction </title>
1370
1371 <para>
1372 You construct values of types <literal>T</literal>, <literal>MonadT</literal> and <literal>Swizzle</literal> by applying
1373 the constructor to suitable values, just as usual.  For example,
1374 </para>
1375
1376 <para>
1377
1378 <programlisting>
1379 (T1 (\x y -> x) 3) :: T Int
1380
1381 (MkSwizzle sort)    :: Swizzle
1382 (MkSwizzle reverse) :: Swizzle
1383
1384 (let r x = Just x
1385      b m k = case m of
1386                 Just y -> k y
1387                 Nothing -> Nothing
1388   in
1389   MkMonad r b) :: MonadT Maybe
1390 </programlisting>
1391
1392 </para>
1393
1394 <para>
1395 The type of the argument can, as usual, be more general than the type
1396 required, as <literal>(MkSwizzle reverse)</literal> shows.  (<function>reverse</function>
1397 does not need the <literal>Ord</literal> constraint.)
1398 </para>
1399
1400 </sect2>
1401
1402 <sect2>
1403 <title>Pattern matching</title>
1404
1405 <para>
1406 When you use pattern matching, the bound variables may now have
1407 polymorphic types.  For example:
1408 </para>
1409
1410 <para>
1411
1412 <programlisting>
1413         f :: T a -> a -> (a, Char)
1414         f (T1 f k) x = (f k x, f 'c' 'd')
1415
1416         g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
1417         g (MkSwizzle s) xs f = s (map f (s xs))
1418
1419         h :: MonadT m -> [m a] -> m [a]
1420         h m [] = return m []
1421         h m (x:xs) = bind m x           $ \y ->
1422                       bind m (h m xs)   $ \ys ->
1423                       return m (y:ys)
1424 </programlisting>
1425
1426 </para>
1427
1428 <para>
1429 In the function <function>h</function> we use the record selectors <literal>return</literal>
1430 and <literal>bind</literal> to extract the polymorphic bind and return functions
1431 from the <literal>MonadT</literal> data structure, rather than using pattern
1432 matching.
1433 </para>
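
<para>
Putting construction and pattern matching together gives the following
self-contained sketch.  It departs from the declarations above in three
ways, all of them assumptions made so that it loads into a more recent
GHC: the <literal>LANGUAGE</literal> pragma, the explicit
<literal>forall</literal> in <function>MkSwizzle</function>, and the field
name <literal>ret</literal> (chosen to avoid a clash with the Prelude's
<function>return</function>).  The value <literal>maybeT</literal> is
invented for the example.
</para>

<para>

<programlisting>
{-# LANGUAGE RankNTypes #-}
import Data.List (sort)

data T a = T1 (forall b. b -> b -> b) a

newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

data MonadT m = MkMonad { ret  :: forall a. a -> m a
                        , bind :: forall a b. m a -> (a -> m b) -> m b }

-- k is polymorphic, so it can be used at type a and at type Char.
f :: T a -> a -> (a, Char)
f (T1 k v) x = (k v x, k 'c' 'd')

-- s is used at type [a] and again at type [b].
g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
g (MkSwizzle s) xs fn = s (map fn (s xs))

-- The record selectors extract the polymorphic functions.
h :: MonadT m -> [m a] -> m [a]
h m []     = ret m []
h m (x:xs) = bind m x        $ \y  ->
             bind m (h m xs) $ \ys ->
             ret m (y:ys)

maybeT :: MonadT Maybe
maybeT = MkMonad Just (\m k -> case m of { Just y -> k y; Nothing -> Nothing })
</programlisting>

</para>

<para>
For instance, <literal>g (MkSwizzle sort) [3,1,2] negate</literal> gives
<literal>[-3,-2,-1]</literal>, and
<literal>h maybeT [Just 'a', Just 'b']</literal> gives
<literal>Just "ab"</literal>.
</para>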
1434
1435 <para>
1436 You cannot pattern-match against an argument that is polymorphic.
1437 For example:
1438
1439 <programlisting>
1440         newtype TIM s a = TIM (ST s (Maybe a))
1441
1442         runTIM :: (forall s. TIM s a) -> Maybe a
1443         runTIM (TIM m) = runST m
1444 </programlisting>
1445
1446 </para>
1447
1448 <para>
1449 Here the pattern-match fails, because you can't pattern-match against
1450 an argument of type <literal>(forall s. TIM s a)</literal>.  Instead you
1451 must bind the variable and pattern match in the right hand side:
1452
1453 <programlisting>
1454         runTIM :: (forall s. TIM s a) -> Maybe a
1455         runTIM tm = runST (case tm of { TIM m -> m })
1456 </programlisting>
1457
1458 The <literal>tm</literal> on the right hand side is (invisibly) instantiated, like
1459 any polymorphic value at its occurrence site, and now you can pattern-match
1460 against it.
1461 </para>
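
<para>
A runnable sketch of the same idea follows.  The action
<literal>counter</literal> is invented, and the pragma and module names
are assumptions about more recent GHCs and their libraries.  Note that
the <literal>case</literal> expression sits inside the argument of
<function>runST</function>, so that the state-thread variable
<literal>s</literal> is still polymorphic where <function>runST</function>
needs it to be.
</para>

<para>

<programlisting>
{-# LANGUAGE RankNTypes #-}
import Control.Monad.ST (ST, runST)
import Data.STRef (newSTRef, modifySTRef, readSTRef)

newtype TIM s a = TIM (ST s (Maybe a))

-- Bind the polymorphic argument to a variable, then match on it.
runTIM :: (forall s. TIM s a) -> Maybe a
runTIM tm = runST (case tm of { TIM m -> m })

-- Allocate a reference, bump it once, and return its contents.
counter :: TIM s Int
counter = TIM (newSTRef 0 >>= \r -> modifySTRef r (+ 1) >> fmap Just (readSTRef r))
</programlisting>

</para>

<para>
Here <literal>runTIM counter</literal> evaluates to <literal>Just 1</literal>.
</para>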
1462
1463 </sect2>
1464
1465 <sect2>
1466 <title>The partial-application restriction</title>
1467
1468 <para>
1469 There is really only one way in which data structures with polymorphic
1470 components might surprise you: you must not partially apply them.
1471 For example, this is illegal:
1472 </para>
1473
1474 <para>
1475
1476 <programlisting>
1477         map MkSwizzle [sort, reverse]
1478 </programlisting>
1479
1480 </para>
1481
1482 <para>
1483 The restriction is this: <emphasis>every subexpression of the program must
1484 have a type that has no for-alls, except that in a function
1485 application (f e1&hellip;en) the partial applications are not subject to
1486 this rule</emphasis>.  The restriction makes type inference feasible.
1487 </para>
1488
1489 <para>
1490 In the illegal example, the sub-expression <literal>MkSwizzle</literal> has the
1491 polymorphic type <literal>(Ord b => [b] -> [b]) -> Swizzle</literal> and is
1492 passed as an argument rather than being applied itself.  On the other hand, this
1493 expression is OK:
1494 </para>
1495
1496 <para>
1497
1498 <programlisting>
1499         map (T1 (\a b -> a)) [1,2,3]
1500 </programlisting>
1501
1502 </para>
1503
1504 <para>
1505 even though it involves a partial application of <function>T1</function>, because
1506 the sub-expression <literal>T1 (\a b -> a)</literal> has type <literal>Int -> T
1507 Int</literal>.
1508 </para>
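
<para>
The usual way round the restriction is to apply the constructor at each
use, as in this sketch.  The names <literal>swizzles</literal> and
<literal>apply</literal> are invented; the pragma and the explicit
<literal>forall</literal> in the declaration are assumptions about more
recent GHCs.
</para>

<para>

<programlisting>
{-# LANGUAGE RankNTypes #-}
import Data.List (sort)

newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

-- Rejected: MkSwizzle is passed to map without being applied.
--   swizzles = map MkSwizzle [sort, reverse]

-- Accepted: every occurrence of MkSwizzle is applied to its argument.
swizzles :: [Swizzle]
swizzles = [MkSwizzle sort, MkSwizzle reverse]

apply :: Swizzle -> [Int] -> [Int]
apply (MkSwizzle s) xs = s xs
</programlisting>

</para>

<para>
With these definitions,
<literal>map (\sw -> apply sw [3,1,2]) swizzles</literal> gives
<literal>[[1,2,3],[2,1,3]]</literal>.
</para>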
1509
1510 </sect2>
1511
1512 <sect2 id="sigs">
1513 <title>Type signatures
1514 </title>
1515
1516 <para>
1517 Once you have data constructors with universally-quantified fields, or
1518 constants such as <Constant>runST</Constant> that have rank-2 types, it isn't long
1519 before you discover that you need more!  Consider:
1520 </para>
1521
1522 <para>
1523
1524 <programlisting>
1525   mkTs f x y = [T1 f x, T1 f y]
1526 </programlisting>
1527
1528 </para>
1529
1530 <para>
1531 <function>mkTs</function> is a function that constructs some values of type
1532 <literal>T</literal>, using some pieces passed to it.  The trouble is that since
1533 <literal>f</literal> is a function argument, Haskell assumes that it is
1534 monomorphic, so we'll get a type error when applying <function>T1</function> to
1535 it.  This is a rather silly example, but the problem really bites in
1536 practice.  Lots of people trip over the fact that you can't make
1537 "wrapper functions" for <Constant>runST</Constant> for exactly the same reason.
1538 In short, it is impossible to build abstractions around functions with
1539 rank-2 types.
1540 </para>
1541
1542 <para>
1543 The solution is fairly clear.  We provide the ability to give a rank-2
1544 type signature for <emphasis>ordinary</emphasis> functions (not only data
1545 constructors), thus:
1546 </para>
1547
1548 <para>
1549
1550 <programlisting>
1551   mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
1552   mkTs f x y = [T1 f x, T1 f y]
1553 </programlisting>
1554
1555 </para>
1556
1557 <para>
1558 This type signature tells the compiler to attribute <literal>f</literal> with
1559 the polymorphic type <literal>(forall b. b -> b -> b)</literal> when type
1560 checking the body of <function>mkTs</function>, so now the application of
1561 <function>T1</function> is fine.
1562 </para>
1563
1564 <para>
1565 There are two restrictions:
1566 </para>
1567
1568 <para>
1569
1570 <itemizedlist>
1571 <listitem>
1572
1573 <para>
1574  You can only define a rank 2 type, specified by the following
1575 grammar:
1576
1577
1578 <programlisting>
1579 rank2type ::= [forall tyvars .] [context =>] funty
1580 funty     ::= ([forall tyvars .] [context =>] ty) -> funty
1581             | ty
1582 ty        ::= ...current Haskell monotype syntax...
1583 </programlisting>
1584
1585
1586 Informally, the universal quantification must all be right at the beginning,
1587 or at the top level of a function argument.
1588
1589 </para>
1590 </listitem>
1591 <listitem>
1592
1593 <para>
1594  There is a restriction on the definition of a function whose
1595 type signature is a rank-2 type: the polymorphic arguments must be
1596 matched on the left hand side of the "<literal>=</literal>" sign.  You can't
1597 define <function>mkTs</function> like this:
1598
1599
1600 <programlisting>
1601 mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
1602 mkTs = \ f x y -> [T1 f x, T1 f y]
1603 </programlisting>
1604
1605
1606
1607 The same partial-application rule applies to ordinary functions with
1608 rank-2 types as applied to data constructors.
1609
1610 </para>
1611 </listitem>
1612
1613 </itemizedlist>
1614
1615 </para>
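
<para>
Here is a complete sketch of <function>mkTs</function> and a use of it.
Since <function>mkTs</function> takes the function and two element
arguments, its signature is written here with two arguments of type
<literal>a</literal>.  The helper <literal>pick</literal> and the value
<literal>firsts</literal> are invented, and the pragma is an assumption
about more recent GHCs.
</para>

<para>

<programlisting>
{-# LANGUAGE RankNTypes #-}

data T a = T1 (forall b. b -> b -> b) a

-- The polymorphic argument f is matched on the left of the "=" sign.
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
mkTs f x y = [T1 f x, T1 f y]

-- Hypothetical helper: apply the stored function to the stored value and w.
pick :: T a -> a -> a
pick (T1 f v) w = f v w

firsts :: [Int]
firsts = map (\t -> pick t 0) (mkTs const 1 2)
</programlisting>

</para>

<para>
Because <function>const</function> returns its first argument,
<literal>firsts</literal> is <literal>[1,2]</literal>.
</para>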
1616
1617 </sect2>
1618
1619
1620 <sect2 id="hoist">
1621 <title>Type synonyms and hoisting
1622 </title>
1623
1624 <para>
1625 GHC also allows you to write a <literal>forall</literal> in a type synonym, thus:
1626 <programlisting>
1627   type Discard a = forall b. a -> b -> a
1628
1629   f :: Discard a
1630   f x y = x
1631 </programlisting>
1632 However, it is often convenient to use this sort of synonym at the right hand
1633 end of an arrow, thus:
1634 <programlisting>
1635   type Discard a = forall b. a -> b -> a
1636
1637   g :: Int -> Discard Int
1638   g x y z = x+y
1639 </programlisting>
1640 Simply expanding the type synonym would give
1641 <programlisting>
1642   g :: Int -> (forall b. Int -> b -> Int)
1643 </programlisting>
1644 but GHC "hoists" the <literal>forall</literal> to give the isomorphic type
1645 <programlisting>
1646   g :: forall b. Int -> Int -> b -> Int
1647 </programlisting>
1648 In general, the rule is this: <emphasis>to determine the type specified by any explicit
1649 user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly
1650 performs the transformation:</emphasis>
1651 <programlisting>
1652   <emphasis>type1</emphasis> -> forall a. <emphasis>type2</emphasis>
1653 ==>
1654   forall a. <emphasis>type1</emphasis> -> <emphasis>type2</emphasis>
1655 </programlisting>
1656 (In fact, GHC tries to retain as much synonym information as possible for use in
1657 error messages, but that is a usability issue.)  This rule applies, of course, whether
1658 or not the <literal>forall</literal> comes from a synonym. For example, here is another
1659 valid way to write <literal>g</literal>'s type signature:
1660 <programlisting>
1661   g :: Int -> Int -> forall b. b -> Int
1662 </programlisting>
1663 </para>
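
<para>
The two ways of writing the signature can be tried side by side in the
following sketch.  The name <literal>g2</literal> is invented, and the
pragma is an assumption about more recent GHCs, which may treat such
signatures somewhat differently internally; the definitions and the
calls are the point here.
</para>

<para>

<programlisting>
{-# LANGUAGE RankNTypes #-}

type Discard a = forall b. a -> b -> a

-- The forall comes from the synonym at the right of the arrow.
g :: Int -> Discard Int
g x y _ = x + y

-- The forall is written out by hand at the right of an arrow.
g2 :: Int -> Int -> forall b. b -> Int
g2 x y _ = x + y
</programlisting>

</para>

<para>
Both functions take three arguments and ignore the last one, whatever
its type: <literal>g 1 2 'z'</literal> and
<literal>g2 1 2 "ignored"</literal> are both <literal>3</literal>.
</para>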
1664 </sect2>
1665
1666 </sect1>
1667
1668 <sect1 id="existential-quantification">
1669 <title>Existentially quantified data constructors
1670 </title>
1671
1672 <para>
1673 The idea of using existential quantification in data type declarations
1674 was suggested by Laufer (I believe, though doubtless someone will
1675 correct me), and implemented in Hope+. It's been in Lennart
1676 Augustsson's <Command>hbc</Command> Haskell compiler for several years, and
1677 proved very useful.  Here's the idea.  Consider the declaration:
1678 </para>
1679
1680 <para>
1681
1682 <programlisting>
1683   data Foo = forall a. MkFoo a (a -> Bool)
1684            | Nil
1685 </programlisting>
1686
1687 </para>
1688
1689 <para>
1690 The data type <literal>Foo</literal> has two constructors with types:
1691 </para>
1692
1693 <para>
1694
1695 <programlisting>
1696   MkFoo :: forall a. a -> (a -> Bool) -> Foo
1697   Nil   :: Foo
1698 </programlisting>
1699
1700 </para>
1701
1702 <para>
1703 Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function>
1704 does not appear in the data type itself, which is plain <literal>Foo</literal>.
1705 For example, the following expression is fine:
1706 </para>
1707
1708 <para>
1709
1710 <programlisting>
1711   [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
1712 </programlisting>
1713
1714 </para>
1715
1716 <para>
1717 Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function
1718 <function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c'
1719 isUpper</function> packages a character with a compatible function.  These
1720 two things are each of type <literal>Foo</literal> and can be put in a list.
1721 </para>
1722
1723 <para>
1724 What can we do with a value of type <literal>Foo</literal>?  In particular,
1725 what happens when we pattern-match on <function>MkFoo</function>?
1726 </para>
1727
1728 <para>
1729
1730 <programlisting>
1731   f (MkFoo val fn) = ???
1732 </programlisting>
1733
1734 </para>
1735
1736 <para>
1737 Since all we know about <literal>val</literal> and <function>fn</function> is that they
1738 are compatible, the only (useful) thing we can do with them is to
1739 apply <function>fn</function> to <literal>val</literal> to get a boolean.  For example:
1740 </para>
1741
1742 <para>
1743
1744 <programlisting>
1745   f :: Foo -> Bool
1746   f (MkFoo val fn) = fn val
1747 </programlisting>
1748
1749 </para>
1750
1751 <para>
1752 What this allows us to do is to package heterogeneous values
1753 together with a bunch of functions that manipulate them, and then treat
1754 that collection of packages in a uniform manner.  You can express
1755 quite a bit of object-oriented-like programming this way.
1756 </para>
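
<para>
The pieces so far fit together as in this sketch.  The result chosen for
<function>Nil</function> and the list <literal>foos</literal> are
invented, and the pragma is an assumption about more recent GHCs.
</para>

<para>

<programlisting>
{-# LANGUAGE ExistentialQuantification #-}
import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

f :: Foo -> Bool
f (MkFoo val fn) = fn val
f Nil            = False   -- assumption: the text leaves this case open

-- Values of different types, packaged uniformly as Foo.
foos :: [Foo]
foos = [MkFoo (3 :: Int) even, MkFoo 'C' isUpper, Nil]
</programlisting>

</para>

<para>
Here <literal>map f foos</literal> is <literal>[False,True,False]</literal>.
</para>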
1757
1758 <sect2 id="existential">
1759 <title>Why existential?
1760 </title>
1761
1762 <para>
1763 What has this to do with <emphasis>existential</emphasis> quantification?
1764 Simply that <function>MkFoo</function> has the (nearly) isomorphic type
1765 </para>
1766
1767 <para>
1768
1769 <programlisting>
1770   MkFoo :: (exists a . (a, a -> Bool)) -> Foo
1771 </programlisting>
1772
1773 </para>
1774
1775 <para>
1776 But Haskell programmers can safely think of the ordinary
1777 <emphasis>universally</emphasis> quantified type given above, thereby avoiding
1778 adding a new existential quantification construct.
1779 </para>
1780
1781 </sect2>
1782
1783 <sect2>
1784 <title>Type classes</title>
1785
1786 <para>
1787 An easy extension (implemented in <Command>hbc</Command>) is to allow
1788 arbitrary contexts before the constructor.  For example:
1789 </para>
1790
1791 <para>
1792
1793 <programlisting>
1794 data Baz = forall a. Eq a => Baz1 a a
1795          | forall b. Show b => Baz2 b (b -> b)
1796 </programlisting>
1797
1798 </para>
1799
1800 <para>
1801 The two constructors have the types you'd expect:
1802 </para>
1803
1804 <para>
1805
1806 <programlisting>
1807 Baz1 :: forall a. Eq a => a -> a -> Baz
1808 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
1809 </programlisting>
1810
1811 </para>
1812
1813 <para>
1814 But when pattern matching on <function>Baz1</function> the matched values can be compared
1815 for equality, and when pattern matching on <function>Baz2</function> the first matched
1816 value can be converted to a string (as well as applying the function to it).
1817 So this program is legal:
1818 </para>
1819
1820 <para>
1821
1822 <programlisting>
1823   f :: Baz -> String
1824   f (Baz1 p q) | p == q    = "Yes"
1825                | otherwise = "No"
1826   f (Baz2 v fn)            = show (fn v)
1827 </programlisting>
1828
1829 </para>
1830
1831 <para>
1832 Operationally, in a dictionary-passing implementation, the
1833 constructors <function>Baz1</function> and <function>Baz2</function> must store the
1834 dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and
1835 extract them on pattern matching.
1836 </para>
1837
1838 <para>
1839 Notice the way that the syntax fits smoothly with that used for
1840 universal quantification earlier.
1841 </para>
1842
1843 </sect2>
1844
1845 <sect2>
1846 <title>Restrictions</title>
1847
1848 <para>
1849 There are several restrictions on the ways in which existentially-quantified
1850 constructors can be used.
1851 </para>
1852
1853 <para>
1854
1855 <itemizedlist>
1856 <listitem>
1857
1858 <para>
1859  When pattern matching, each pattern match introduces a new,
1860 distinct, type for each existential type variable.  These types cannot
1861 be unified with any other type, nor can they escape from the scope of
1862 the pattern match.  For example, these fragments are incorrect:
1863
1864
1865 <programlisting>
1866 f1 (MkFoo a f) = a
1867 </programlisting>
1868
1869
1870 Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal>
1871 is the result of <function>f1</function>.  One way to see why this is wrong is to
1872 ask what type <function>f1</function> has:
1873
1874
1875 <programlisting>
1876   f1 :: Foo -> a             -- Weird!
1877 </programlisting>
1878
1879
1880 What is this "<literal>a</literal>" in the result type? Clearly we don't mean
1881 this:
1882
1883
1884 <programlisting>
1885   f1 :: forall a. Foo -> a   -- Wrong!
1886 </programlisting>
1887
1888
1889 The original program is just plain wrong.  Here's another sort of error:
1890
1891
1892 <programlisting>
1893   f2 (Baz1 a b) (Baz1 p q) = a==q
1894 </programlisting>
1895
1896
1897 It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but
1898 <literal>a==q</literal> is wrong because it equates the two distinct types arising
1899 from the two <function>Baz1</function> constructors.
1900
1901
1902 </para>
1903 </listitem>
1904 <listitem>
1905
1906 <para>
1907 You can't pattern-match on an existentially quantified
1908 constructor in a <literal>let</literal> or <literal>where</literal> group of
1909 bindings. So this is illegal:
1910
1911
1912 <programlisting>
1913   f3 x = a==b where { Baz1 a b = x }
1914 </programlisting>
1915
1916
1917 You can only pattern-match
1918 on an existentially-quantified constructor in a <literal>case</literal> expression or
1919 in the patterns of a function definition.
1920
1921 The reason for this restriction is really an implementation one.
1922 Type-checking binding groups is already a nightmare without
1923 existentials complicating the picture.  Also an existential pattern
1924 binding at the top level of a module doesn't make sense, because it's
1925 not clear how to prevent the existentially-quantified type "escaping".
1926 So for now, there's a simple-to-state restriction.  We'll see how
1927 annoying it is.
1928
1929 </para>
1930 </listitem>
1931 <listitem>
1932
1933 <para>
1934 You can't use existential quantification for <literal>newtype</literal>
1935 declarations.  So this is illegal:
1936
1937
1938 <programlisting>
1939   newtype T = forall a. Ord a => MkT a
1940 </programlisting>
1941
1942
1943 Reason: a value of type <literal>T</literal> must be represented as a pair
1944 of a dictionary for <literal>Ord a</literal> and a value of type <literal>a</literal>.
1945 That contradicts the idea that <literal>newtype</literal> should have no
1946 concrete representation.  You can get just the same efficiency and effect
1947 by using <literal>data</literal> instead of <literal>newtype</literal>.  If there is no
1948 overloading involved, then there is more of a case for allowing
1949 an existentially-quantified <literal>newtype</literal>, because the
1950 <literal>data</literal> version does carry an implementation cost,
1951 but single-field existentially quantified constructors aren't much
1952 use.  So the simple restriction (no existential stuff on <literal>newtype</literal>)
1953 stands, unless there are convincing reasons to change it.
1954
1955
1956 </para>
1957 </listitem>
1958 <listitem>
1959
1960 <para>
1961  You can't use <literal>deriving</literal> to define instances of a
1962 data type with existentially quantified data constructors.
1963
1964 Reason: in most cases it would not make sense. For example:
1965
1966 <programlisting>
1967 data T = forall a. MkT [a] deriving( Eq )
1968 </programlisting>
1969
1970 To derive <literal>Eq</literal> in the standard way we would need to have equality
1971 between the single component of two <function>MkT</function> constructors:
1972
1973 <programlisting>
1974 instance Eq T where
1975   (MkT a) == (MkT b) = ???
1976 </programlisting>
1977
1978 But <VarName>a</VarName> and <VarName>b</VarName> have distinct types, and so can't be compared.
1979 It's just about possible to imagine examples in which the derived instance
1980 would make sense, but it seems altogether simpler simply to prohibit such
1981 declarations.  Define your own instances!
1982 </para>
1983 </listitem>
1984
1985 </itemizedlist>
1986
1987 </para>
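<para>
As a hedged illustration of the restrictions above, the following sketch
re-declares a cut-down <literal>Baz</literal> (an assumption: only the
<function>Baz1</function> constructor is reproduced) and introduces a
hypothetical type <literal>Showable</literal>.  It replaces the illegal
<literal>where</literal> binding with a <literal>case</literal> expression,
and writes by hand the kind of instance that <literal>deriving</literal>
will not produce:

<programlisting>
-- Compile with -fglasgow-exts (later GHC releases call the
-- extension ExistentialQuantification).
module Main where

-- Assumed, cut-down version of the Baz type.
data Baz = forall a. Eq a => Baz1 a a

-- Legal counterpart of f3: the constructor is matched in a case
-- expression, so a and b share the one existential type.
f3 :: Baz -> Bool
f3 x = case x of { Baz1 a b -> a == b }

-- Hypothetical existential type that carries a Show dictionary.
data Showable = forall a. Show a => MkShowable a

-- Written by hand, because deriving is not available here.
instance Show Showable where
  show (MkShowable v) = "MkShowable " ++ show v

main :: IO ()
main = do
  print (f3 (Baz1 'x' 'x'))
  print (MkShowable True)
</programlisting>

The hand-written instance makes sense here only because the constructor
packages a <literal>Show</literal> dictionary alongside the value.
</para>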
1988
1989 </sect2>
1990
1991 </sect1>
1992
1993 <sect1 id="sec-assertions">
1994 <title>Assertions
1995 <indexterm><primary>Assertions</primary></indexterm>
1996 </title>
1997
1998 <para>
1999 If you want to make use of assertions in your standard Haskell code, you
2000 could define a function like the following:
2001 </para>
2002
2003 <para>
2004
2005 <programlisting>
2006 assert :: Bool -> a -> a
2007 assert False x = error "assertion failed!"
2008 assert _     x = x
2009 </programlisting>
2010
2011 </para>
2012
2013 <para>
2014 which works, but gives you back a less than useful error message:
2015 an assertion failed, but which and where?
2016 </para>
2017
2018 <para>
2019 One way out is to define an extended <function>assert</function> function which also
2020 takes a descriptive string to include in the error message and
2021 perhaps combine this with the use of a pre-processor which inserts
2022 the source location where <function>assert</function> was used.
2023 </para>
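<para>
A minimal sketch of such an extended function, in plain Haskell 98, might
look like this.  The names <function>assertMsg</function> and
<function>reciprocal</function> are hypothetical, chosen only for the
example, and the location string still has to be typed in by hand:

<programlisting>
module Main where

-- assertMsg is a hypothetical helper, not a library function.
-- It behaves like assert, but reports the supplied description.
assertMsg :: String -> Bool -> a -> a
assertMsg msg False _ = error ("assertion failed: " ++ msg)
assertMsg _   True  x = x

-- The description doubles as a hand-written source location.
reciprocal :: Double -> Double
reciprocal x = assertMsg "reciprocal: zero argument" (x /= 0) (1 / x)

main :: IO ()
main = print (reciprocal 4)
</programlisting>

</para>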
2024
2025 <para>
2026 GHC offers a helping hand here, doing all of this for you. For every
2027 use of <function>assert</function> in the user's source, such as:
2028 </para>
2029
2030 <para>
2031
2032 <programlisting>
2033 kelvinToC :: Double -> Double
2034 kelvinToC k = assert (k &gt;= 0.0) (k-273.15)
2035 </programlisting>
2036
2037 </para>
2038
2039 <para>
2040 GHC will rewrite this to also include the source location where the
2041 assertion was made:
2042 </para>
2043
2044 <para>
2045
2046 <programlisting>
2047 assert pred val ==> assertError "Main.hs|15" pred val
2048 </programlisting>
2049
2050 </para>
2051
2052 <para>
2053 The rewrite is only performed by the compiler when it spots
2054 applications of <function>Exception.assert</function>, so you can still define and
2055 use your own versions of <function>assert</function>, should you so wish. If not,
2056 import <literal>Exception</literal> to make use of <function>assert</function> in your code.
2057 </para>
2058
2059 <para>
2060 To have the compiler ignore uses of assert, use the compiler option
2061 <option>-fignore-asserts</option>. <indexterm><primary>-fignore-asserts option</primary></indexterm> That is,
2062 expressions of the form <literal>assert pred e</literal> will be rewritten to <literal>e</literal>.
2063 </para>
2064
2065 <para>
2066 Assertion failures can be caught; see the documentation for the
2067 <literal>Exception</literal> library (<xref linkend="sec-Exception">)
2068 for the details.
2069 </para>
2070
2071 </sect1>
2072
2073 <sect1 id="scoped-type-variables">
2074 <title>Scoped Type Variables
2075 </title>
2076
2077 <para>
2078 A <emphasis>pattern type signature</emphasis> can introduce a <emphasis>scoped type
2079 variable</emphasis>.  For example
2080 </para>
2081
2082 <para>
2083
2084 <programlisting>
2085 f (xs::[a]) = ys ++ ys
2086            where
2087               ys :: [a]
2088               ys = reverse xs
2089 </programlisting>
2090
2091 </para>
2092
2093 <para>
2094 The pattern <literal>(xs::[a])</literal> includes a type signature for <VarName>xs</VarName>.
2095 This brings the type variable <literal>a</literal> into scope; it scopes over
2096 all the patterns and right hand sides for this equation for <function>f</function>.
2097 In particular, it is in scope at the type signature for <VarName>ys</VarName>.
2098 </para>
2099
2100 <para>
2101 At ordinary type signatures, such as that for <VarName>ys</VarName>, any type variables
2102 mentioned in the type signature <emphasis>that are not in scope</emphasis> are
2103 implicitly universally quantified.  (If there are no type variables in
2104 scope, all type variables mentioned in the signature are universally
2105 quantified, which is just as in Haskell 98.)  In this case, since <VarName>a</VarName>
2106 is in scope, it is not universally quantified, so the type of <VarName>ys</VarName> is
2107 the same as that of <VarName>xs</VarName>.  In Haskell 98 it is not possible to declare
2108 a type for <VarName>ys</VarName>; a major benefit of scoped type variables is that
2109 it becomes possible to do so.
2110 </para>
2111
2112 <para>
2113 Scoped type variables are implemented in both GHC and Hugs.  Where the
2114 implementations differ from the specification below, those differences
2115 are noted.
2116 </para>
2117
2118 <para>
2119 So much for the basic idea.  Here are the details.
2120 </para>
2121
2122 <sect2>
2123 <title>Scope and implicit quantification</title>
2124
2125 <para>
2126
2127 <itemizedlist>
2128 <listitem>
2129
2130 <para>
2131  All the type variables mentioned in the patterns for a single
2132 function definition equation, that are not already in scope,
2133 are brought into scope by the patterns.  We describe this set as
2134 the <emphasis>type variables bound by the equation</emphasis>.
2135
2136 </para>
2137 </listitem>
2138 <listitem>
2139
2140 <para>
2141  The type variables thus brought into scope may be mentioned
2142 in ordinary type signatures or pattern type signatures anywhere within
2143 their scope.
2144
2145 </para>
2146 </listitem>
2147 <listitem>
2148
2149 <para>
2150  In ordinary type signatures, any type variable mentioned in the
2151 signature that is in scope is <emphasis>not</emphasis> universally quantified.
2152
2153 </para>
2154 </listitem>
2155 <listitem>
2156
2157 <para>
2158  Ordinary type signatures do not bring any new type variables
2159 into scope (except in the type signature itself!). So this is illegal:
2160
2161
2162 <programlisting>
2163   f :: a -> a
2164   f x = x::a
2165 </programlisting>
2166
2167
2168 It's illegal because <VarName>a</VarName> is not in scope in the body of <function>f</function>,
2169 so the ordinary signature <literal>x::a</literal> is equivalent to <literal>x::forall a.a</literal>;
2170 and that is an incorrect typing.
2171
2172 </para>
2173 </listitem>
2174 <listitem>
2175
2176 <para>
2177  There is no implicit universal quantification on pattern type
2178 signatures, nor may one write an explicit <literal>forall</literal> type in a pattern
2179 type signature.  The pattern type signature is a monotype.
2180
2181 </para>
2182 </listitem>
2183 <listitem>
2184
2185 <para>
2186
2187 The type variables in the head of a <literal>class</literal> or <literal>instance</literal> declaration
2188 scope over the methods defined in the <literal>where</literal> part.  For example:
2189
2190
2191 <programlisting>
2192   class C a where
2193     op :: [a] -> a
2194
2195     op xs = let ys::[a]
2196                 ys = reverse xs
2197             in
2198             head ys
2199 </programlisting>
2200
2201
2202 (Not implemented in Hugs yet, Dec 98).
2203 </para>
2204 </listitem>
2205
2206 </itemizedlist>
2207
2208 </para>
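<para>
The scoping rules above can be tried out with a small, self-contained
sketch.  The function <function>halves</function> is a hypothetical
example; the point is only that the local signatures mention the
<literal>a</literal> bound by the pattern, and so are not implicitly
quantified:

<programlisting>
-- Compile with -fglasgow-exts.
module Main where

-- The pattern signature on xs binds a; both local signatures
-- then refer to that same a rather than to a fresh variable.
halves (xs::[a]) = (front, back)
  where
    front, back :: [a]
    front = take n xs
    back  = drop n xs
    n     = length xs `div` 2

main :: IO ()
main = print (halves "scoped")
</programlisting>

Without the pattern signature, the signature for <VarName>front</VarName>
and <VarName>back</VarName> would be read as <literal>forall a. [a]</literal>,
which is not a type these bindings have.
</para>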
2209
2210 </sect2>
2211
2212 <sect2>
2213 <title>Polymorphism</title>
2214
2215 <para>
2216
2217 <itemizedlist>
2218 <listitem>
2219
2220 <para>
2221  Pattern type signatures are completely orthogonal to ordinary, separate
2222 type signatures.  The two can be used independently or together.  There is
2223 no scoping associated with the names of the type variables in a separate type signature.
2224
2225
2226 <programlisting>
2227    f :: [a] -> [a]
2228    f (xs::[b]) = reverse xs
2229 </programlisting>
2230
2231
2232 </para>
2233 </listitem>
2234 <listitem>
2235
2236 <para>
2237  The function must be polymorphic in the type variables
2238 bound by all its equations.  Operationally, the type variables bound
2239 by one equation must not:
2240
2241
2242 <itemizedlist>
2243 <listitem>
2244
2245 <para>
2246  Be unified with a type (such as <literal>Int</literal>, or <literal>[a]</literal>).
2247 </para>
2248 </listitem>
2249 <listitem>
2250
2251 <para>
2252  Be unified with a type variable free in the environment.
2253 </para>
2254 </listitem>
2255 <listitem>
2256
2257 <para>
2258  Be unified with each other.  (They may unify with the type variables
2259 bound by another equation for the same function, of course.)
2260 </para>
2261 </listitem>
2262
2263 </itemizedlist>
2264
2265
2266 For example, the following all fail to type check:
2267
2268
2269 <programlisting>
2270   f (x::a) (y::b) = [x,y]       -- a unifies with b
2271
2272   g (x::a) = x + 1::Int         -- a unifies with Int
2273
2274   h x = let k (y::a) = [x,y]    -- a is free in the
2275         in k x                  -- environment
2276
2277   k (x::a) True    = ...        -- a unifies with Int
2278   k (x::Int) False = ...
2279
2280   w :: [b] -> [b]
2281   w (x::a) = x                  -- a unifies with [b]
2282 </programlisting>
2283
2284
2285 </para>
2286 </listitem>
2287 <listitem>
2288
2289 <para>
2290  The pattern-bound type variable may, however, be constrained
2291 by the context of the principal type, thus:
2292
2293
2294 <programlisting>
2295   f (x::a) (y::a) = x+y*2
2296 </programlisting>
2297
2298
2299 gets the inferred type: <literal>forall a. Num a =&gt; a -&gt; a -&gt; a</literal>.
2300 </para>
2301 </listitem>
2302
2303 </itemizedlist>
2304
2305 </para>
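<para>
For contrast with the failing examples above, here is a sketch of variants
of <function>f</function>, <function>g</function>, <function>h</function>
and <function>w</function> that should type check under the same rules.
The repairs shown are one possible choice among several:

<programlisting>
-- Compile with -fglasgow-exts.
module Main where

f (x::a) (y::a) = [x,y]          -- one variable, used twice

g (x::Int) = x + 1               -- name the type Int directly

h (x::a) = let k (y::a) = [x,y]  -- a is already in scope, so the
           in k x                -- inner signature binds nothing new

w :: [b] -> [b]
w (x::[a]) = x                   -- a stands for the element type

main :: IO ()
main = do
  print (f 'p' 'q')
  print (g 41)
  print (h True)
  print (w "ok")
</programlisting>

</para>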
2306
2307 </sect2>
2308
2309 <sect2>
2310 <title>Result type signatures</title>
2311
2312 <para>
2313
2314 <itemizedlist>
2315 <listitem>
2316
2317 <para>
2318  The result type of a function can be given a signature,
2319 thus:
2320
2321
2322 <programlisting>
2323   f (x::a) :: [a] = [x,x,x]
2324 </programlisting>
2325
2326
2327 The final <literal>:: [a]</literal> after all the patterns gives a signature to the
2328 result type.  Sometimes this is the only way of naming the type variable
2329 you want:
2330
2331
2332 <programlisting>
2333   f :: Int -> [a] -> [a]
2334   f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
2335                         in \xs -> map g (reverse xs `zip` xs)
2336 </programlisting>
2337
2338
2339 </para>
2340 </listitem>
2341
2342 </itemizedlist>
2343
2344 </para>
2345
2346 <para>
2347 Result type signatures are not yet implemented in Hugs.
2348 </para>
2349
2350 </sect2>
2351
2352 <sect2>
2353 <title>Pattern signatures on other constructs</title>
2354
2355 <para>
2356
2357 <itemizedlist>
2358 <listitem>
2359
2360 <para>
2361  A pattern type signature can be on an arbitrary sub-pattern, not
2362 just on a variable:
2363
2364
2365 <programlisting>
2366   f ((x,y)::(a,b)) = (y,x) :: (b,a)
2367 </programlisting>
2368
2369
2370 </para>
2371 </listitem>
2372 <listitem>
2373
2374 <para>
2375  Pattern type signatures, including the result part, can be used
2376 in lambda abstractions:
2377
2378
2379 <programlisting>
2380   (\ (x::a, y) :: a -> x)
2381 </programlisting>
2382
2383
2384 Type variables bound by these patterns must be polymorphic in
2385 the sense defined above.
2386 For example:
2387
2388
2389 <programlisting>
2390   f1 (x::c) = f1 x      -- ok
2391   f2 = \(x::c) -> f2 x  -- not ok
2392 </programlisting>
2393
2394
2395 Here, <function>f1</function> is OK, but <function>f2</function> is not, because <VarName>c</VarName> gets unified
2396 with a type variable free in the environment, in this
2397 case, the type of <function>f2</function>, which is in the environment when
2398 the lambda abstraction is checked.
2399
2400 </para>
2401 </listitem>
2402 <listitem>
2403
2404 <para>
2405  Pattern type signatures, including the result part, can be used
2406 in <literal>case</literal> expressions:
2407
2408
2409 <programlisting>
2410   case e of { (x::a, y) :: a -> x }
2411 </programlisting>
2412
2413
2414 The pattern-bound type variables must, as usual,
2415 be polymorphic in the following sense: each case alternative,
2416 considered as a lambda abstraction, must be polymorphic.
2417 Thus this is OK:
2418
2419
2420 <programlisting>
2421   case (True,False) of { (x::a, y) -> x }
2422 </programlisting>
2423
2424
2425 Even though the context is that of a pair of booleans,
2426 the alternative itself is polymorphic.  Of course, it is
2427 also OK to say:
2428
2429
2430 <programlisting>
2431   case (True,False) of { (x::Bool, y) -> x }
2432 </programlisting>
2433
2434
2435 </para>
2436 </listitem>
2437 <listitem>
2438
2439 <para>
2440 To avoid ambiguity, the type after the &ldquo;<literal>::</literal>&rdquo; in a result
2441 pattern signature on a lambda or <literal>case</literal> must be atomic (i.e. a single
2442 token or a parenthesised type of some sort).  To see why,
2443 consider how one would parse this:
2444
2445
2446 <programlisting>
2447   \ x :: a -> b -> x
2448 </programlisting>
2449
2450
2451 </para>
2452 </listitem>
2453 <listitem>
2454
2455 <para>
2456  Pattern type signatures that bind new type variables
2457 may not be used in pattern bindings at all.
2458 So this is illegal:
2459
2460
2461 <programlisting>
2462   f x = let (y, z::a) = x in ...
2463 </programlisting>
2464
2465
2466 But these are OK, because they do not bind fresh type variables:
2467
2468
2469 <programlisting>
2470   f1 x            = let (y, z::Int) = x in ...
2471   f2 (x::(Int,a)) = let (y, z::a)   = x in ...
2472 </programlisting>
2473
2474
2475 However a single variable is considered a degenerate function binding,
2476 rather than a degenerate pattern binding, so this is permitted, even
2477 though it binds a type variable:
2478
2479
2480 <programlisting>
2481   f :: (b->b) = \(x::b) -> x
2482 </programlisting>
2483
2484
2485 </para>
2486 </listitem>
2487
2488 </itemizedlist>
2489
2490 Such degenerate function bindings do not fall under the monomorphism
2491 restriction.  Thus:
2492 </para>
2493
2494 <para>
2495
2496 <programlisting>
2497   g :: a -> a -> Bool = \x y -> x==y
2498 </programlisting>
2499
2500 </para>
2501
2502 <para>
2503 Here <function>g</function> has type <literal>forall a. Eq a =&gt; a -&gt; a -&gt; Bool</literal>, just as if
2504 <function>g</function> had a separate type signature.  Lacking a type signature, <function>g</function>
2505 would get a monomorphic type.
2506 </para>
2507
2508 </sect2>
2509
2510 <sect2>
2511 <title>Existentials</title>
2512
2513 <para>
2514
2515 <itemizedlist>
2516 <listitem>
2517
2518 <para>
2519  Pattern type signatures can bind existential type variables.
2520 For example:
2521
2522
2523 <programlisting>
2524   data T = forall a. MkT [a]
2525
2526   f :: T -> T
2527   f (MkT [t::a]) = MkT t3
2528                  where
2529                    t3::[a] = [t,t,t]
2530 </programlisting>
2531
2532
2533 </para>
2534 </listitem>
2535
2536 </itemizedlist>
2537
2538 </para>
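<para>
The same idea can be sketched with a <literal>case</literal> expression and
an ordinary local signature in place of the result signature.  The names
<function>lengthT</function> and <function>triple</function> are
hypothetical, introduced only for this example:

<programlisting>
-- Compile with -fglasgow-exts.
module Main where

data T = forall a. MkT [a]

-- The element type is hidden, but the length is still observable.
lengthT :: T -> Int
lengthT (MkT xs) = length xs

-- The pattern signature names the existential type a, so the
-- local signature for t3 can mention it.
triple :: T -> T
triple v = case v of
  MkT [t::a] -> let t3 :: [a]
                    t3 = [t,t,t]
                in MkT t3
  other      -> other

main :: IO ()
main = print (lengthT (triple (MkT "x")))
</programlisting>

Here <literal>MkT "x"</literal> holds a one-element list, so the first
alternative applies and the program prints <literal>3</literal>.
</para>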
2539
2540 </sect2>
2541
2542 </sect1>
2543
2544 <sect1 id="pragmas">
2545 <title>Pragmas
2546 </title>
2547
2548 <para>
2549 GHC supports several pragmas, or instructions to the compiler placed
2550 in the source code.  Pragmas don't affect the meaning of the program,
2551 but they might affect the efficiency of the generated code.
2552 </para>
2553
2554 <sect2 id="inline-pragma">
2555 <title>INLINE pragma
2556
2557 <indexterm><primary>INLINE pragma</primary></indexterm>
2558 <indexterm><primary>pragma, INLINE</primary></indexterm></title>
2559
2560 <para>
2561 GHC (with <option>-O</option>, as always) tries to inline (or &ldquo;unfold&rdquo;)
2562 functions/values that are &ldquo;small enough,&rdquo; thus avoiding the call
2563 overhead and possibly exposing other more-wonderful optimisations.
2564 </para>
2565
2566 <para>
2567 You will probably see these unfoldings (in Core syntax) in your
2568 interface files.
2569 </para>
2570
2571 <para>
2572 Normally, if GHC decides a function is &ldquo;too expensive&rdquo; to inline, it
2573 will not do so, nor will it export that unfolding for other modules to
2574 use.
2575 </para>
2576
2577 <para>
2578 The sledgehammer you can bring to bear is the
2579 <literal>INLINE</literal><indexterm><primary>INLINE pragma</primary></indexterm> pragma, used thusly:
2580
2581 <programlisting>
2582 key_function :: Int -> String -> (Bool, Double)
2583
2584 #ifdef __GLASGOW_HASKELL__
2585 {-# INLINE key_function #-}
2586 #endif
2587 </programlisting>
2588
2589 (You don't need to do the C pre-processor carry-on unless you're going
2590 to stick the code through HBC&mdash;it doesn't like <literal>INLINE</literal> pragmas.)
2591 </para>
2592
2593 <para>
2594 The major effect of an <literal>INLINE</literal> pragma is to declare a function's
2595 &ldquo;cost&rdquo; to be very low.  The normal unfolding machinery will then be
2596 very keen to inline it.
2597 </para>
2598
2599 <para>
2600 An <literal>INLINE</literal> pragma for a function can be put anywhere its type
2601 signature could be put.
2602 </para>
2603
2604 <para>
2605 <literal>INLINE</literal> pragmas are a particularly good idea for the
2606 <literal>then</literal>/<literal>return</literal> (or <literal>bind</literal>/<literal>unit</literal>) functions in a monad.
2607 For example, in GHC's own <literal>UniqueSupply</literal> monad code, we have:
2608
2609 <programlisting>
2610 #ifdef __GLASGOW_HASKELL__
2611 {-# INLINE thenUs #-}
2612 {-# INLINE returnUs #-}
2613 #endif
2614 </programlisting>
2615
2616 </para>
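<para>
To make this concrete, here is a toy state-passing monad with both of its
combinators marked <literal>INLINE</literal>.  It is an illustrative sketch
only: <literal>UniqM</literal>, <function>thenU</function>,
<function>returnU</function> and <function>fresh</function> are hypothetical
names, not GHC's own <literal>UniqueSupply</literal> code:

<programlisting>
module Main where

-- A computation that threads an Int counter through.
type UniqM a = Int -> (a, Int)

{-# INLINE thenU #-}
thenU :: UniqM a -> (a -> UniqM b) -> UniqM b
thenU m k = \s -> case m s of (x, s') -> k x s'

{-# INLINE returnU #-}
returnU :: a -> UniqM a
returnU x = \s -> (x, s)

-- Hand out the current counter value and bump the counter.
fresh :: UniqM Int
fresh = \s -> (s, s + 1)

twoFresh :: UniqM (Int, Int)
twoFresh = fresh `thenU` \a -> fresh `thenU` \b -> returnU (a, b)

main :: IO ()
main = print (fst (twoFresh 0))
</programlisting>

The combinators are tiny but are called at every step of a monadic
computation, which is why removing the call overhead tends to pay off.
</para>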
2617
2618 </sect2>
2619
2620 <sect2 id="noinline-pragma">
2621 <title>NOINLINE pragma
2622 </title>
2623
2624 <para>
2625 <indexterm><primary>NOINLINE pragma</primary></indexterm>
2626 <indexterm><primary>pragma, NOINLINE</primary></indexterm>
2627 </para>
2628
2629 <para>
2630 The <literal>NOINLINE</literal> pragma does exactly what you'd expect: it stops the
2631 named function from being inlined by the compiler.  You shouldn't ever
2632 need to do this, unless you're very cautious about code size.
2633 </para>
2634
2635 </sect2>
2636
2637     <sect2 id="specialize-pragma">
2638       <title>SPECIALIZE pragma</title>
2639
2640       <indexterm><primary>SPECIALIZE pragma</primary></indexterm>
2641       <indexterm><primary>pragma, SPECIALIZE</primary></indexterm>
2642       <indexterm><primary>overloading, death to</primary></indexterm>
2643
2644       <para>(UK spelling also accepted.)  For key overloaded
2645       functions, you can create extra versions (NB: more code space)
2646       specialised to particular types.  Thus, if you have an
2647       overloaded function:</para>
2648
2649 <programlisting>
2650 hammeredLookup :: Ord key => [(key, value)] -> key -> value
2651 </programlisting>
2652
2653       <para>If it is heavily used on lists with
2654       <literal>Widget</literal> keys, you could specialise it as
2655       follows:</para>
2656
2657 <programlisting>
2658 {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
2659 </programlisting>
2660
2661       <para>To get very fancy, you can also specify a named function
2662       to use for the specialised value, as in:</para>
2663
2664 <programlisting>
2665 {-# RULES "hammeredLookup" hammeredLookup = blah #-}
2666 </programlisting>
2667
2668       <para>where <literal>blah</literal> is an implementation of
2669       <literal>hammeredLookup</literal> written specially for
2670       <literal>Widget</literal> lookups.  It's <emphasis>Your
2671       Responsibility</emphasis> to make sure that
2672       <function>blah</function> really behaves as a specialised
2673       version of <function>hammeredLookup</function>!!!</para>
2674
2675       <para>Note we use the <literal>RULES</literal> pragma here to
2676       indicate that <literal>hammeredLookup</literal> applied at a
2677       certain type should be replaced by <literal>blah</literal>.  See
2678       <xref linkend="rules"> for more information on
2679       <literal>RULES</literal>.</para>
2680
2681       <para>An example in which using <literal>RULES</literal> for
2682       specialisation will Win Big:
2683
2684 <programlisting>
2685 toDouble :: Real a => a -> Double
2686 toDouble = fromRational . toRational
2687
2688 {-# RULES "toDouble/Int" toDouble = i2d #-}
2689 i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
2690 </programlisting>
2691
2692       The <function>i2d</function> function is virtually one machine
2693       instruction; the default conversion&mdash;via an intermediate
2694       <literal>Rational</literal>&mdash;is obscenely expensive by
2695       comparison.</para>
2696
2697       <para>A <literal>SPECIALIZE</literal> pragma for a function can
2698       be put anywhere its type signature could be put.</para>
2699
2700     </sect2>
2701
2702 <sect2 id="specialize-instance-pragma">
2703 <title>SPECIALIZE instance pragma
2704 </title>
2705
2706 <para>
2707 <indexterm><primary>SPECIALIZE pragma</primary></indexterm>
2708 <indexterm><primary>overloading, death to</primary></indexterm>
2709 Same idea, except for instance declarations.  For example:
2710
2711 <programlisting>
2712 instance (Eq a) => Eq (Foo a) where { ... usual stuff ... }
2713
2714 {-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
2715 </programlisting>
2716
2717 Compatible with HBC, by the way.
2718 </para>
2719
2720 </sect2>
2721
2722 <sect2 id="line-pragma">
2723 <title>LINE pragma
2724 </title>
2725
2726 <para>
2727 <indexterm><primary>LINE pragma</primary></indexterm>
2728 <indexterm><primary>pragma, LINE</primary></indexterm>
2729 </para>
2730
2731 <para>
2732 This pragma is similar to C's <literal>&num;line</literal> pragma, and is mainly for use in
2733 automatically generated Haskell code.  It lets you specify the line
2734 number and filename of the original code; for example
2735 </para>
2736
2737 <para>
2738
2739 <programlisting>
2740 {-# LINE 42 "Foo.vhs" #-}
2741 </programlisting>
2742
2743 </para>
2744
2745 <para>
2746 if you'd generated the current file from something called <filename>Foo.vhs</filename>
2747 and this line corresponds to line 42 in the original.  GHC will adjust
2748 its error messages to refer to the line/file named in the <literal>LINE</literal>
2749 pragma.
2750 </para>
2751
2752 </sect2>
2753
2754 <sect2 id="rules">
2755 <title>RULES pragma</title>
2756
2757 <para>
2758 The RULES pragma lets you specify rewrite rules.  It is described in
2759 <xref LinkEnd="rewrite-rules">.
2760 </para>
2761
2762 </sect2>
2763
2764 </sect1>
2765
2766 <sect1 id="rewrite-rules">
2767 <title>Rewrite rules
2768
2769 <indexterm><primary>RULES pragma</primary></indexterm>
2770 <indexterm><primary>pragma, RULES</primary></indexterm>
2771 <indexterm><primary>rewrite rules</primary></indexterm></title>
2772
2773 <para>
2774 The programmer can specify rewrite rules as part of the source program
2775 (in a pragma).  GHC applies these rewrite rules wherever it can.
2776 </para>
2777
2778 <para>
2779 Here is an example:
2780
2781 <programlisting>
2782   {-# RULES
2783         "map/map"       forall f g xs. map f (map g xs) = map (f.g) xs
2784   #-}
2785 </programlisting>
2786
2787 </para>
2788
2789 <sect2>
2790 <title>Syntax</title>
2791
2792 <para>
2793 From a syntactic point of view:
2794
2795 <itemizedlist>
2796 <listitem>
2797
2798 <para>
2799  Each rule has a name, enclosed in double quotes.  The name itself has
2800 no significance at all.  It is only used when reporting how many times the rule fired.
2801 </para>
2802 </listitem>
2803 <listitem>
2804
2805 <para>
2806  There may be zero or more rules in a <literal>RULES</literal> pragma.
2807 </para>
2808 </listitem>
2809 <listitem>
2810
2811 <para>
2812  Layout applies in a <literal>RULES</literal> pragma.  Currently no new indentation level
2813 is set, so you must lay out your rules starting in the same column as the
2814 enclosing definitions.
2815 </para>
2816 </listitem>
2817 <listitem>
2818
2819 <para>
2820  Each variable mentioned in a rule must either be in scope (e.g. <function>map</function>),
2821 or bound by the <literal>forall</literal> (e.g. <function>f</function>, <function>g</function>, <function>xs</function>).  The variables bound by
2822 the <literal>forall</literal> are called the <emphasis>pattern</emphasis> variables.  They are separated
2823 by spaces, just like in a type <literal>forall</literal>.
2824 </para>
2825 </listitem>
2826 <listitem>
2827
2828 <para>
2829  A pattern variable may optionally have a type signature.
2830 If the type of the pattern variable is polymorphic, it <emphasis>must</emphasis> have a type signature.
2831 For example, here is the <literal>foldr/build</literal> rule:
2832
2833 <programlisting>
2834 "foldr/build"  forall k z (g::forall b. (a->b->b) -> b -> b) .
2835               foldr k z (build g) = g k z
2836 </programlisting>
2837
2838 Since <function>g</function> has a polymorphic type, it must have a type signature.
2839
2840 </para>
2841 </listitem>
2842 <listitem>
2843
2844 <para>
2845 The left hand side of a rule must consist of a top-level variable applied
2846 to arbitrary expressions.  For example, these are <emphasis>not</emphasis> OK:
2847
2848 <programlisting>
2849 "wrong1"   forall e1 e2.  case True of { True -> e1; False -> e2 } = e1
2850 "wrong2"   forall f.      f True = True
2851 </programlisting>
2852
2853 In <literal>"wrong1"</literal>, the LHS is not an application; in <literal>"wrong2"</literal>, the LHS has a pattern variable
2854 in the head.
2855 </para>
2856 </listitem>
2857 <listitem>
2858
2859 <para>
2860  A rule does not need to be in the same module as (any of) the
2861 variables it mentions, though of course they need to be in scope.
2862 </para>
2863 </listitem>
2864 <listitem>
2865
2866 <para>
2867  Rules are automatically exported from a module, just as instance declarations are.
2868 </para>
2869 </listitem>
2870
2871 </itemizedlist>
2872
2873 </para>
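<para>
As a sketch of how these syntactic points fit together, here is a small
module containing one named rule with a typed pattern variable.  The
function <function>neg</function> and the rule name are hypothetical, and
the <literal>NOINLINE</literal> pragma is only a precaution so that calls
to <function>neg</function> are still visible when the rule is tried:

<programlisting>
-- Compile with -O -fglasgow-exts.
module Main where

neg :: Int -> Int
neg x = negate x
{-# NOINLINE neg #-}

-- The head of the LHS is the top-level variable neg; x is a
-- pattern variable bound by the forall, with an optional signature.
{-# RULES
"neg/neg"  forall (x::Int).  neg (neg x) = x
  #-}

main :: IO ()
main = print (neg (neg 7))
</programlisting>

The rule starts in the same column as the surrounding top-level
definitions, as the layout rule above requires.
</para>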
2874
2875 </sect2>
2876
2877 <sect2>
2878 <title>Semantics</title>
2879
2880 <para>
2881 From a semantic point of view:
2882
2883 <itemizedlist>
2884 <listitem>
2885
2886 <para>
2887 Rules are only applied if you use the <option>-O</option> flag.
2888 </para>
2889 </listitem>
2890
2891 <listitem>
2892 <para>
2893  Rules are regarded as left-to-right rewrite rules.
2894 When GHC finds an expression that is a substitution instance of the LHS
2895 of a rule, it replaces the expression by the (appropriately-substituted) RHS.
2896 By "a substitution instance" we mean that the LHS can be made equal to the
2897 expression by substituting for the pattern variables.
2898
2899 </para>
2900 </listitem>
2901 <listitem>
2902
2903 <para>
2904  The LHS and RHS of a rule are typechecked, and must have the
2905 same type.
2906
2907 </para>
2908 </listitem>
2909 <listitem>
2910
2911 <para>
2912  GHC makes absolutely no attempt to verify that the LHS and RHS
2913 of a rule have the same meaning.  That is undecideable in general, and
2914 infeasible in most interesting cases.  The responsibility is entirely the programmer's!
2915
2916 </para>
2917 </listitem>
2918 <listitem>
2919
2920 <para>
2921  GHC makes no attempt to make sure that the rules are confluent or
2922 terminating.  For example:
2923
2924 <programlisting>
2925   "loop"        forall x y.  f x y = f y x
2926 </programlisting>
2927
2928 This rule will cause the compiler to go into an infinite loop.
2929
2930 </para>
2931 </listitem>
2932 <listitem>
2933
2934 <para>
2935  If more than one rule matches a call, GHC will choose one arbitrarily to apply.
2936
2937 </para>
2938 </listitem>
2939 <listitem>
2940 <para>
2941  GHC currently uses a very simple, syntactic, matching algorithm
2942 for matching a rule LHS with an expression.  It seeks a substitution
2943 which makes the LHS and expression syntactically equal modulo alpha
2944 conversion.  The pattern (rule), but not the expression, is eta-expanded if
2945 necessary.  (Eta-expanding the expression can lead to laziness bugs.)
2946 Matching is not done modulo beta conversion (that would be higher-order matching).
2947 </para>
2948
2949 <para>
2950 Matching is carried out on GHC's intermediate language, which includes
2951 type abstractions and applications.  So a rule only matches if the
2952 types match too.  See <xref LinkEnd="rule-spec"> below.
2953 </para>
2954 </listitem>
2955 <listitem>
2956
2957 <para>
2958  GHC keeps trying to apply the rules as it optimises the program.
2959 For example, consider:
2960
2961 <programlisting>
2962   let s = map f
2963       t = map g
2964   in
2965   s (t xs)
2966 </programlisting>
2967
2968 The expression <literal>s (t xs)</literal> does not match the rule <literal>"map/map"</literal>, but GHC
2969 will substitute for <VarName>s</VarName> and <VarName>t</VarName>, giving an expression which does match.
2970 If <VarName>s</VarName> or <VarName>t</VarName> was (a) used more than once, and (b) large or a redex, then it would
2971 not be substituted, and the rule would not fire.
2972
2973 </para>
2974 </listitem>
2975 <listitem>
2976
2977 <para>
2978  In the earlier phases of compilation, GHC inlines <emphasis>nothing
2979 that appears on the LHS of a rule</emphasis>, because once you have substituted
2980 for something you can't match against it (given the simple-minded
2981 matching).  So if you write the rule
2982
2983 <programlisting>
2984         "map/map"       forall f g.  map f . map g = map (f.g)
2985 </programlisting>
2986
2987 this <emphasis>won't</emphasis> match the expression <literal>map f (map g xs)</literal>.
2988 It will only match something written with explicit use of ".".
2989 Well, not quite.  It <emphasis>will</emphasis> match the expression
2990
2991 <programlisting>
2992 wibble f g xs
2993 </programlisting>
2994
2995 where <function>wibble</function> is defined:
2996
2997 <programlisting>
2998 wibble f g = map f . map g
2999 </programlisting>
3000
3001 because <function>wibble</function> will be inlined (it's small).
3002
3003 Later on in compilation, GHC starts inlining even things on the
3004 LHS of rules, but still leaves the rules enabled.  This inlining
3005 policy is controlled by the per-simplification-pass flag <option>-finline-phase</option><emphasis>n</emphasis>.
3006
3007 </para>
3008 </listitem>
3009 <listitem>
3010
3011 <para>
3012  All rules are implicitly exported from the module, and are therefore
3013 in force in any module that imports the module that defined the rule, directly
3014 or indirectly.  (That is, if A imports B, which imports C, then C's rules are
3015 in force when compiling A.)  The situation is very similar to that for instance
3016 declarations.
3017 </para>
3018 </listitem>
3019
3020 </itemizedlist>
3021
3022 </para>
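<para>
One way to experiment with these points is a small program whose result
does not depend on whether the rule fires.  The sketch below uses a
hypothetical, user-defined <function>mapL</function> rather than the
Prelude's <function>map</function>; it is up to the programmer to check
that the two sides of the rule really agree:

<programlisting>
-- Compile once with -O and once without; the output should be the same.
module Main where

mapL :: (a -> b) -> [a] -> [b]
mapL _ []     = []
mapL f (x:xs) = f x : mapL f xs
{-# NOINLINE mapL #-}

-- Read left to right: two traversals are replaced by one.
{-# RULES
"mapL/mapL"  forall f g xs.  mapL f (mapL g xs) = mapL (f . g) xs
  #-}

main :: IO ()
main = print (mapL (+ 1) (mapL (* 2) [1, 2, 3 :: Int]))
</programlisting>

Both sides of the rule have type <literal>[c]</literal> for the same
<literal>c</literal>, which is all that GHC checks.
</para>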
3023
3024 </sect2>
3025
3026 <sect2>
3027 <title>List fusion</title>
3028
3029 <para>
3030 The RULES mechanism is used to implement fusion (deforestation) of common list functions.
3031 If a "good consumer" consumes an intermediate list constructed by a "good producer", the
3032 intermediate list should be eliminated entirely.
3033 </para>
3034
3035 <para>
3036 The following are good producers:
3037
3038 <itemizedlist>
3039 <listitem>
3040
3041 <para>
3042  List comprehensions
3043 </para>
3044 </listitem>
3045 <listitem>
3046
3047 <para>
3048  Enumerations of <literal>Int</literal> and <literal>Char</literal> (e.g. <literal>['a'..'z']</literal>).
3049 </para>
3050 </listitem>
3051 <listitem>
3052
3053 <para>
3054  Explicit lists (e.g. <literal>[True, False]</literal>)
3055 </para>
3056 </listitem>
3057 <listitem>
3058
3059 <para>
 The cons constructor (e.g. <literal>3:4:[]</literal>)
3061 </para>
3062 </listitem>
3063 <listitem>
3064
3065 <para>
3066  <function>++</function>
3067 </para>
3068 </listitem>
3069 <listitem>
3070
3071 <para>
3072  <function>map</function>
3073 </para>
3074 </listitem>
3075 <listitem>
3076
3077 <para>
3078  <function>filter</function>
3079 </para>
3080 </listitem>
3081 <listitem>
3082
3083 <para>
3084  <function>iterate</function>, <function>repeat</function>
3085 </para>
3086 </listitem>
3087 <listitem>
3088
3089 <para>
3090  <function>zip</function>, <function>zipWith</function>
3091 </para>
3092 </listitem>
3093
3094 </itemizedlist>
3095
3096 </para>
3097
3098 <para>
3099 The following are good consumers:
3100
3101 <itemizedlist>
3102 <listitem>
3103
3104 <para>
3105  List comprehensions
3106 </para>
3107 </listitem>
3108 <listitem>
3109
3110 <para>
3111  <function>array</function> (on its second argument)
3112 </para>
3113 </listitem>
3114 <listitem>
3115
3116 <para>
3117  <function>length</function>
3118 </para>
3119 </listitem>
3120 <listitem>
3121
3122 <para>
3123  <function>++</function> (on its first argument)
3124 </para>
3125 </listitem>
3126 <listitem>
3127
3128 <para>
3129  <function>map</function>
3130 </para>
3131 </listitem>
3132 <listitem>
3133
3134 <para>
3135  <function>filter</function>
3136 </para>
3137 </listitem>
3138 <listitem>
3139
3140 <para>
3141  <function>concat</function>
3142 </para>
3143 </listitem>
3144 <listitem>
3145
3146 <para>
3147  <function>unzip</function>, <function>unzip2</function>, <function>unzip3</function>, <function>unzip4</function>
3148 </para>
3149 </listitem>
3150 <listitem>
3151
3152 <para>
3153  <function>zip</function>, <function>zipWith</function> (but on one argument only; if both are good producers, <function>zip</function>
3154 will fuse with one but not the other)
3155 </para>
3156 </listitem>
3157 <listitem>
3158
3159 <para>
3160  <function>partition</function>
3161 </para>
3162 </listitem>
3163 <listitem>
3164
3165 <para>
3166  <function>head</function>
3167 </para>
3168 </listitem>
3169 <listitem>
3170
3171 <para>
3172  <function>and</function>, <function>or</function>, <function>any</function>, <function>all</function>
3173 </para>
3174 </listitem>
3175 <listitem>
3176
3177 <para>
3178  <function>sequence&lowbar;</function>
3179 </para>
3180 </listitem>
3181 <listitem>
3182
3183 <para>
3184  <function>msum</function>
3185 </para>
3186 </listitem>
3187 <listitem>
3188
3189 <para>
3190  <function>sortBy</function>
3191 </para>
3192 </listitem>
3193
3194 </itemizedlist>
3195
3196 </para>
3197
3198 <para>
3199 So, for example, the following should generate no intermediate lists:
3200
3201 <programlisting>
3202 array (1,10) [(i,i*i) | i &#60;- map (+ 1) [0..9]]
3203 </programlisting>
3204
3205 </para>
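<para>
A further sketch in the same spirit: each function below chains only producers and
consumers taken from the lists above, so its intermediate lists are candidates for
fusion.  The function names are hypothetical, and whether fusion actually happens
depends on the compiler version and optimisation flags in use.
</para>

<programlisting>
-- Enumeration (producer), map and filter (both), length (consumer).
countEvenMultiples :: Int -> Int
countEvenMultiples n = length (filter even (map (* 3) [1 .. n]))

-- Enumeration (producer), map (both), any (consumer).
anyLarge :: Int -> Bool
anyLarge n = any (> 100) (map (* 2) [1 .. n])
</programlisting>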
3206
3207 <para>
3208 This list could readily be extended; if there are Prelude functions that you use
3209 a lot which are not included, please tell us.
3210 </para>
3211
3212 <para>
3213 If you want to write your own good consumers or producers, look at the
3214 Prelude definitions of the above functions to see how to do so.
3215 </para>
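<para>
As a rough sketch of what such definitions can look like, the producer below is
written with <function>build</function> and the consumer with
<function>foldr</function>.  The names <function>upTo</function> and
<function>total</function> are hypothetical, and importing
<function>build</function> from <literal>GHC.Exts</literal> is an assumption that
holds for recent GHC releases; it is not the module layout described in this guide.
</para>

<programlisting>
import GHC.Exts (build)

-- Hypothetical producer: the list [1 .. n], with the list constructors
-- abstracted out as cons and nil and then supplied by build.
upTo :: Int -> [Int]
upTo n = build (\cons nil ->
  let go i | i > n     = nil
           | otherwise = i `cons` go (i + 1)
  in go 1)
{-# INLINE upTo #-}

-- Hypothetical consumer: defined with foldr.
total :: [Int] -> Int
total = foldr (+) 0
{-# INLINE total #-}
</programlisting>

<para>
With these definitions, <literal>total (upTo n)</literal> has the shape of a
<function>foldr</function> applied to a <function>build</function>, which is the
pattern the fusion rules are written to match.
</para>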
3216
3217 </sect2>
3218
3219 <sect2 id="rule-spec">
3220 <title>Specialisation
3221 </title>
3222
3223 <para>
3224 Rewrite rules can be used to get the same effect as a feature
present in earlier versions of GHC:
3226
3227 <programlisting>
3228   {-# SPECIALIZE fromIntegral :: Int8 -> Int16 = int8ToInt16 #-}
3229 </programlisting>
3230
3231 This told GHC to use <function>int8ToInt16</function> instead of <function>fromIntegral</function> whenever
3232 the latter was called with type <literal>Int8 -&gt; Int16</literal>.  That is, rather than
3233 specialising the original definition of <function>fromIntegral</function> the programmer is
3234 promising that it is safe to use <function>int8ToInt16</function> instead.
3235 </para>
3236
3237 <para>
3238 This feature is no longer in GHC.  But rewrite rules let you do the
3239 same thing:
3240
3241 <programlisting>
3242 {-# RULES
3243   "fromIntegral/Int8/Int16" fromIntegral = int8ToInt16
3244 #-}
3245 </programlisting>
3246
3247 This slightly odd-looking rule instructs GHC to replace <function>fromIntegral</function>
3248 by <function>int8ToInt16</function> <emphasis>whenever the types match</emphasis>.  Speaking more operationally,
3249 GHC adds the type and dictionary applications to get the typed rule
3250
3251 <programlisting>
3252 forall (d1::Integral Int8) (d2::Num Int16) .
3253         fromIntegral Int8 Int16 d1 d2 = int8ToInt16
3254 </programlisting>
3255
What is more,
this rule does not need to be in the same file as <function>fromIntegral</function>,
unlike the <literal>SPECIALISE</literal> pragmas, which currently must be (so that they
have an original definition available to specialise).
3260 </para>
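<para>
The same technique can be tried on a function of your own.  In the sketch below,
<function>genericElem</function>, <function>intElem</function> and the rule name are
hypothetical; the rule promises that <function>intElem</function> may be used wherever
<function>genericElem</function> is called at type <literal>Int</literal>.  Note that
the specialised function is deliberately not defined in terms of the overloaded one,
since the rule would otherwise rewrite its own right-hand side.
</para>

<programlisting>
-- Hypothetical overloaded function.
genericElem :: Eq a => a -> [a] -> Bool
genericElem _ []     = False
genericElem x (y:ys) = x == y || genericElem x ys
{-# NOINLINE genericElem #-}

-- Hypothetical hand-written version for Int.
intElem :: Int -> [Int] -> Bool
intElem x = go
  where
    go []     = False
    go (y:ys) = x == y || go ys

-- The types of the two sides restrict the rule to calls at type Int.
{-# RULES
  "genericElem/Int" genericElem = intElem
  #-}
</programlisting>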
3261
3262 </sect2>
3263
3264 <sect2>
3265 <title>Controlling what's going on</title>
3266
3267 <para>
3268
3269 <itemizedlist>
3270 <listitem>
3271
3272 <para>
3273  Use <option>-ddump-rules</option> to see what transformation rules GHC is using.
3274 </para>
3275 </listitem>
3276 <listitem>
3277
3278 <para>
3279  Use <option>-ddump-simpl-stats</option> to see what rules are being fired.
3280 If you add <option>-dppr-debug</option> you get a more detailed listing.
3281 </para>
3282 </listitem>
3283 <listitem>
3284
3285 <para>
 The definition of (say) <function>build</function> in <FileName>PrelBase.lhs</FileName> looks like this:
3287
3288 <programlisting>
3289         build   :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
3290         {-# INLINE build #-}
3291         build g = g (:) []
3292 </programlisting>
3293
3294 Notice the <literal>INLINE</literal>!  That prevents <literal>(:)</literal> from being inlined when compiling
3295 <literal>PrelBase</literal>, so that an importing module will &ldquo;see&rdquo; the <literal>(:)</literal>, and can
3296 match it on the LHS of a rule.  <literal>INLINE</literal> prevents any inlining happening
3297 in the RHS of the <literal>INLINE</literal> thing.  I regret the delicacy of this.
3298
3299 </para>
3300 </listitem>
3301 <listitem>
3302
3303 <para>
3304  In <filename>ghc/lib/std/PrelBase.lhs</filename> look at the rules for <function>map</function> to
3305 see how to write rules that will do fusion and yet give an efficient
3306 program even if fusion doesn't happen.  More rules in <filename>PrelList.lhs</filename>.
3307 </para>
3308 </listitem>
3309
3310 </itemizedlist>
3311
3312 </para>
3313
3314 </sect2>
3315
3316 </sect1>
3317
3318 <sect1 id="generic-classes">
3319 <title>Generic classes</title>
3320
3321 <para>
3322 The ideas behind this extension are described in detail in "Derivable type classes",
Ralf Hinze and Simon Peyton Jones, Haskell Workshop, Montreal, Sept 2000, pp. 94-105.
3324 An example will give the idea:
3325 </para>
3326
3327 <programlisting>
3328   import Generics
3329
3330   class Bin a where
3331     toBin   :: a -> [Int]
3332     fromBin :: [Int] -> (a, [Int])
3333
3334     toBin {| Unit |}    Unit      = []
3335     toBin {| a :+: b |} (Inl x)   = 0 : toBin x
3336     toBin {| a :+: b |} (Inr y)   = 1 : toBin y
3337     toBin {| a :*: b |} (x :*: y) = toBin x ++ toBin y
3338
3339     fromBin {| Unit |}    bs      = (Unit, bs)
3340     fromBin {| a :+: b |} (0:bs)  = (Inl x, bs')    where (x,bs') = fromBin bs
3341     fromBin {| a :+: b |} (1:bs)  = (Inr y, bs')    where (y,bs') = fromBin bs
3342     fromBin {| a :*: b |} bs      = (x :*: y, bs'') where (x,bs' ) = fromBin bs
3343                                                           (y,bs'') = fromBin bs'
3344 </programlisting>
3345 <para>
3346 This class declaration explains how <literal>toBin</literal> and <literal>fromBin</literal>
3347 work for arbitrary data types.  They do so by giving cases for unit, product, and sum,
3348 which are defined thus in the library module <literal>Generics</literal>:
3349 </para>
3350 <programlisting>
3351   data Unit    = Unit
3352   data a :+: b = Inl a | Inr b
3353   data a :*: b = a :*: b
3354 </programlisting>
3355 <para>
3356 Now you can make a data type into an instance of Bin like this:
3357 <programlisting>
3358   instance (Bin a, Bin b) => Bin (a,b)
3359   instance Bin a => Bin [a]
3360 </programlisting>
That is, just leave off the "where" clause.  Of course, you can put in the
where clause and override whichever methods you please.
3363 </para>
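<para>
To make the effect of such an instance more concrete, here is a plain-Haskell sketch
of roughly what the generic definitions amount to, written as ordinary instances over
the three representation types.  It does not use the generic class syntax, and it
assumes a recent GHC with the <literal>TypeOperators</literal> extension.  The
conversion functions <function>fromBool</function> and <function>toBool</function>
are hypothetical stand-ins for what the compiler works out for itself; the code GHC
actually generates may well differ.
</para>

<programlisting>
{-# LANGUAGE TypeOperators #-}

data Unit    = Unit
data a :+: b = Inl a | Inr b
data a :*: b = a :*: b

class Bin a where
  toBin   :: a -> [Int]
  fromBin :: [Int] -> (a, [Int])

-- One ordinary instance per type pattern of the generic class.
instance Bin Unit where
  toBin Unit = []
  fromBin bs = (Unit, bs)

instance (Bin a, Bin b) => Bin (a :+: b) where
  toBin (Inl x) = 0 : toBin x
  toBin (Inr y) = 1 : toBin y
  fromBin (0:bs) = (Inl x, bs') where (x, bs') = fromBin bs
  fromBin (1:bs) = (Inr y, bs') where (y, bs') = fromBin bs
  fromBin _      = error "fromBin: malformed input"

instance (Bin a, Bin b) => Bin (a :*: b) where
  toBin (x :*: y) = toBin x ++ toBin y
  fromBin bs = (x :*: y, bs'')
    where (x, bs')  = fromBin bs
          (y, bs'') = fromBin bs'

-- Bool has two nullary constructors, so it is represented as Unit :+: Unit.
fromBool :: Bool -> Unit :+: Unit
fromBool False = Inl Unit
fromBool True  = Inr Unit

toBool :: Unit :+: Unit -> Bool
toBool (Inl Unit) = False
toBool (Inr Unit) = True

-- The instance converts to the representation and reuses the cases above.
instance Bin Bool where
  toBin b    = toBin (fromBool b)
  fromBin bs = (toBool r, bs') where (r, bs') = fromBin bs
</programlisting>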
3364
3365     <sect2>
3366       <title> Using generics </title>
3367       <para>To use generics you need to</para>
3368       <itemizedlist>
3369         <listitem>
3370           <para>Use the <option>-fgenerics</option> flag.</para>
3371         </listitem>
3372         <listitem>
3373           <para>Import the module <literal>Generics</literal> from the
3374           <literal>lang</literal> package.  This import brings into
3375           scope the data types <literal>Unit</literal>,
3376           <literal>:*:</literal>, and <literal>:+:</literal>.  (You
3377           don't need this import if you don't mention these types
3378           explicitly; for example, if you are simply giving instance
3379           declarations.)</para>
3380         </listitem>
3381       </itemizedlist>
3382     </sect2>
3383
3384 <sect2> <title> Changes wrt the paper </title>
3385 <para>
3386 Note that the type constructors <literal>:+:</literal> and <literal>:*:</literal>
3387 can be written infix (indeed, you can now use
3388 any operator starting in a colon as an infix type constructor).  Also note that
3389 the type constructors are not exactly as in the paper (Unit instead of 1, etc).
3390 Finally, note that the syntax of the type patterns in the class declaration
uses "<literal>{|</literal>" and "<literal>|}</literal>" brackets; curly braces
alone would be ambiguous when they appear on right hand sides (an extension we
3393 anticipate wanting).
3394 </para>
3395 </sect2>
3396
3397 <sect2> <title>Terminology and restrictions</title>
3398 <para>
3399 Terminology.  A "generic default method" in a class declaration
3400 is one that is defined using type patterns as above.
3401 A "polymorphic default method" is a default method defined as in Haskell 98.
3402 A "generic class declaration" is a class declaration with at least one
3403 generic default method.
3404 </para>
3405
3406 <para>
3407 Restrictions:
3408 <itemizedlist>
3409 <listitem>
3410 <para>
3411 Alas, we do not yet implement the stuff about constructor names and
3412 field labels.
3413 </para>
3414 </listitem>
3415
3416 <listitem>
3417 <para>
3418 A generic class can have only one parameter; you can't have a generic
3419 multi-parameter class.
3420 </para>
3421 </listitem>
3422
3423 <listitem>
3424 <para>
3425 A default method must be defined entirely using type patterns, or entirely
3426 without.  So this is illegal:
3427 <programlisting>
3428   class Foo a where
3429     op :: a -> (a, Bool)
3430     op {| Unit |} Unit = (Unit, True)
3431     op x               = (x,    False)
3432 </programlisting>
3433 However it is perfectly OK for some methods of a generic class to have
3434 generic default methods and others to have polymorphic default methods.
3435 </para>
3436 </listitem>
3437
3438 <listitem>
3439 <para>
3440 The type variable(s) in the type pattern for a generic method declaration
scope over the right hand side.  So this is legal (note the use of the type variable <literal>p</literal> in a type signature on the right hand side):
3442 <programlisting>
3443   class Foo a where
3444     op :: a -> Bool
3445     op {| p :*: q |} (x :*: y) = op (x :: p)
3446     ...
3447 </programlisting>
3448 </para>
3449 </listitem>
3450
3451 <listitem>
3452 <para>
3453 The type patterns in a generic default method must take one of the forms:
3454 <programlisting>
3455        a :+: b
3456        a :*: b
3457        Unit
3458 </programlisting>
3459 where "a" and "b" are type variables.  Furthermore, all the type patterns for
3460 a single type constructor (<literal>:*:</literal>, say) must be identical; they
3461 must use the same type variables.  So this is illegal:
3462 <programlisting>
3463   class Foo a where
3464     op :: a -> Bool
3465     op {| a :+: b |} (Inl x) = True
3466     op {| p :+: q |} (Inr y) = False
3467 </programlisting>
3468 The type patterns must be identical, even in equations for different methods of the class.
3469 So this too is illegal:
3470 <programlisting>
  class Foo a where
    op1 :: a -> Bool
    op1 {| a :*: b |} (x :*: y) = True

    op2 :: a -> Bool
    op2 {| p :*: q |} (x :*: y) = False
3477 </programlisting>
(The reason for this restriction is that we gather all the equations for a particular type constructor
3479 into a single generic instance declaration.)
3480 </para>
3481 </listitem>
3482
3483 <listitem>
3484 <para>
3485 A generic method declaration must give a case for each of the three type constructors.
3486 </para>
3487 </listitem>
3488
3489 <listitem>
3490 <para>
3491 The type for a generic method can be built only from:
3492   <itemizedlist>
3493   <listitem> <para> Function arrows </para> </listitem>
3494   <listitem> <para> Type variables </para> </listitem>
3495   <listitem> <para> Tuples </para> </listitem>
3496   <listitem> <para> Arbitrary types not involving type variables </para> </listitem>
3497   </itemizedlist>
3498 Here are some example type signatures for generic methods:
3499 <programlisting>
3500     op1 :: a -> Bool
3501     op2 :: Bool -> (a,Bool)
3502     op3 :: [Int] -> a -> a
3503     op4 :: [a] -> Bool
3504 </programlisting>
3505 Here, op1, op2, op3 are OK, but op4 is rejected, because it has a type variable
3506 inside a list.
3507 </para>
3508 <para>
This restriction is an implementation restriction: we just haven't got around to
3510 implementing the necessary bidirectional maps over arbitrary type constructors.
3511 It would be relatively easy to add specific type constructors, such as Maybe and list,
3512 to the ones that are allowed.</para>
3513 </listitem>
3514
3515 <listitem>
3516 <para>
3517 In an instance declaration for a generic class, the idea is that the compiler
3518 will fill in the methods for you, based on the generic templates.  However it can only
3519 do so if
3520   <itemizedlist>
3521   <listitem>
3522   <para>
3523   The instance type is simple (a type constructor applied to type variables, as in Haskell 98).
3524   </para>
3525   </listitem>
3526   <listitem>
3527   <para>
3528   No constructor of the instance type has unboxed fields.
3529   </para>
3530   </listitem>
3531   </itemizedlist>
3532 (Of course, these things can only arise if you are already using GHC extensions.)
However, you can still give instance declarations for types which break these rules,
3534 provided you give explicit code to override any generic default methods.
3535 </para>
3536 </listitem>
3537
3538 </itemizedlist>
3539 </para>
3540
3541 <para>
3542 The option <option>-ddump-deriv</option> dumps incomprehensible stuff giving details of
3543 what the compiler does with generic declarations.
3544 </para>
3545
3546 </sect2>
3547
3548 <sect2> <title> Another example </title>
3549 <para>
3550 Just to finish with, here's another example I rather like:
3551 <programlisting>
3552   class Tag a where
3553     nCons :: a -> Int
3554     nCons {| Unit |}    _ = 1
3555     nCons {| a :*: b |} _ = 1
3556     nCons {| a :+: b |} _ = nCons (bot::a) + nCons (bot::b)
3557
3558     tag :: a -> Int
3559     tag {| Unit |}    _       = 1
3560     tag {| a :*: b |} _       = 1
3561     tag {| a :+: b |} (Inl x) = tag x
3562     tag {| a :+: b |} (Inr y) = nCons (bot::a) + tag y
3563 </programlisting>
3564 </para>
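<para>
The example leaves <function>bot</function> undefined.  The sketch below is one
reading of it in plain Haskell, assuming that <function>bot</function> is simply
<literal>undefined</literal> and that a recent GHC with the
<literal>TypeOperators</literal> and <literal>ScopedTypeVariables</literal>
extensions is used in place of the generic class syntax.  Here
<function>nCons</function> counts the alternatives of a sum without ever inspecting
its argument, and <function>tag</function> numbers them from 1.
</para>

<programlisting>
{-# LANGUAGE TypeOperators, ScopedTypeVariables #-}

data Unit    = Unit
data a :+: b = Inl a | Inr b
data a :*: b = a :*: b

-- Assumption: bot is only used to carry a type, and is never evaluated.
bot :: a
bot = undefined

class Tag a where
  nCons :: a -> Int
  tag   :: a -> Int

instance Tag Unit where
  nCons _ = 1
  tag   _ = 1

instance Tag (a :*: b) where
  nCons _ = 1
  tag   _ = 1

-- The type variables a and b of the instance head scope over the methods.
instance (Tag a, Tag b) => Tag (a :+: b) where
  nCons _     = nCons (bot :: a) + nCons (bot :: b)
  tag (Inl x) = tag x
  tag (Inr y) = nCons (bot :: a) + tag y

-- A three-way sum, standing for a data type with three constructors.
type Three = Unit :+: (Unit :+: Unit)
</programlisting>

<para>
For the type <literal>Three</literal>, <function>nCons</function> gives 3, and the
values <literal>Inl Unit</literal>, <literal>Inr (Inl Unit)</literal> and
<literal>Inr (Inr Unit)</literal> receive the tags 1, 2 and 3 respectively.
</para>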
3565 </sect2>
3566 </sect1>
3567
3568 <!-- Emacs stuff:
3569      ;;; Local Variables: ***
3570      ;;; mode: sgml ***
3571      ;;; sgml-parent-document: ("users_guide.sgml" "book" "chapter" "sect1") ***
3572      ;;; End: ***
3573  -->