ghc/docs/users_guide/glasgow_exts.sgml

   1 <para>
   2 <indexterm><primary>language, GHC</primary></indexterm>
   3 <indexterm><primary>extensions, GHC</primary></indexterm>
   4 As with all known Haskell systems, GHC implements some extensions to
   5 the language.  To use them, you'll need to give a <option>-fglasgow-exts</option>
   6 <indexterm><primary>-fglasgow-exts option</primary></indexterm> option.
   7 </para>
   8
   9 <para>
  10 Virtually all of the Glasgow extensions serve to give you access to
  11 the underlying facilities with which we implement Haskell.  Thus, you
  12 can get at the Raw Iron, if you are willing to write some non-standard
  13 code at a more primitive level.  You need not be &ldquo;stuck&rdquo; on
  14 performance because of the implementation costs of Haskell's
  15 &ldquo;high-level&rdquo; features&mdash;you can always code &ldquo;under&rdquo; them.  In an extreme case, you can write all your time-critical code in C, and then just glue it together with Haskell!
  16 </para>
  17
  18 <para>
  19 Executive summary of our extensions:
  20 </para>
  21
  22   <variablelist>
  23
  24     <varlistentry>
  25       <term>Unboxed types and primitive operations:</Term>
  26       <listitem>
  27         <para>You can get right down to the raw machine types and
  28         operations; included in this are &ldquo;primitive
  29         arrays&rdquo; (direct access to Big Wads of Bytes).  Please
  30         see <XRef LinkEnd="glasgow-unboxed"> and following.</para>
  31       </listitem>
  32     </varlistentry>
  33
  34     <varlistentry>
  35       <term>Type system extensions:</term>
  36       <listitem>
  37         <para> GHC supports a large number of extensions to Haskell's
  38         type system.  Specifically:</para>
  39
  40         <variablelist>
  41           <varlistentry>
  42             <term>Multi-parameter type classes:</term>
  43             <listitem>
  44               <para><xref LinkEnd="multi-param-type-classes"></para>
  45             </listitem>
  46           </varlistentry>
  47
  48           <varlistentry>
  49             <term>Functional dependencies:</term>
  50             <listitem>
  51               <para><xref LinkEnd="functional-dependencies"></para>
  52             </listitem>
  53           </varlistentry>
  54
  55           <varlistentry>
  56             <term>Implicit parameters:</term>
  57             <listitem>
  58               <para><xref LinkEnd="implicit-parameters"></para>
  59             </listitem>
  60           </varlistentry>
  61
  62           <varlistentry>
  63             <term>Local universal quantification:</term>
  64             <listitem>
  65               <para><xref LinkEnd="universal-quantification"></para>
  66             </listitem>
  67           </varlistentry>
  68
  69           <varlistentry>
  70             <term>Extistentially quantification in data types:</term>
  71             <listitem>
  72               <para><xref LinkEnd="existential-quantification"></para>
  73             </listitem>
  74           </varlistentry>
  75
  76           <varlistentry>
  77             <term>Scoped type variables:</term>
  78             <listitem>
  79               <para>Scoped type variables enable the programmer to
  80               supply type signatures for some nested declarations,
  81               where this would not be legal in Haskell 98.  Details in
  82               <xref LinkEnd="scoped-type-variables">.</para>
  83             </listitem>
  84           </varlistentry>
  85         </variablelist>
  86       </listitem>
  87     </varlistentry>
  88
  89     <varlistentry>
  90       <term>Pattern guards</term>
  91       <listitem>
  92         <para>Instead of being a boolean expression, a guard is a list
  93         of qualifiers, exactly as in a list comprehension. See <xref
  94         LinkEnd="pattern-guards">.</para>
  95       </listitem>
  96     </varlistentry>
  97
  98     <varlistentry>
  99       <term>Foreign calling:</term>
 100       <listitem>
 101         <para>Just what it sounds like.  We provide
 102         <emphasis>lots</emphasis> of rope that you can dangle around
 103         your neck.  Please see <xref LinkEnd="ffi">.</para>
 104       </listitem>
 105     </varlistentry>
 106
 107     <varlistentry>
 108       <term>Pragmas</term>
 109       <listitem>
 110         <para>Pragmas are special instructions to the compiler placed
 111         in the source file.  The pragmas GHC supports are described in
 112         <xref LinkEnd="pragmas">.</para>
 113       </listitem>
 114     </varlistentry>
 115
 116     <varlistentry>
 117       <term>Rewrite rules:</term>
 118       <listitem>
 119         <para>The programmer can specify rewrite rules as part of the
 120         source program (in a pragma).  GHC applies these rewrite rules
 121         wherever it can.  Details in <xref
 122         LinkEnd="rewrite-rules">.</para>
 123       </listitem>
 124     </varlistentry>
 125
 126     <varlistentry>
 127       <term>Generic classes:</term>
 128       <listitem>
 129         <para>Generic class declarations allow you to define a class
 130         whose methods say how to work over an arbitrary data type.
 131         Then it's really easy to make any new type into an instance of
 132         the class.  This generalises the rather ad-hoc "deriving"
 133         feature of Haskell 98.  Details in <xref
 134         LinkEnd="generic-classes">.</para>
 135       </listitem>
 136     </varlistentry>
 137   </variablelist>
 138
 139 <para>
 140 Before you get too carried away working at the lowest level (e.g.,
 141 sloshing <literal>MutableByteArray&num;</literal>s around your
 142 program), you may wish to check if there are libraries that provide a
 143 &ldquo;Haskellised veneer&rdquo; over the features you want.  See
 144 <xref linkend="book-hslibs">.
 145 </para>
 146
 147   <sect1 id="options-language">
 148     <title>Language options</title>
 149
 150     <indexterm><primary>language</primary><secondary>option</secondary>
 151     </indexterm>
 152     <indexterm><primary>options</primary><secondary>language</secondary>
 153     </indexterm>
 154     <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
 155     </indexterm>
 156
 157     <para> These flags control what variation of the language are
 158     permitted.  Leaving out all of them gives you standard Haskell
 159     98.</para>
 160
 161     <variablelist>
 162
 163       <varlistentry>
 164         <term><option>-fglasgow-exts</option>:</term>
 165         <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
 166         <listitem>
 167           <para>This simultaneously enables all of the extensions to
 168           Haskell 98 described in <xref
 169           linkend="ghc-language-features">, except where otherwise
 170           noted. </para>
 171         </listitem>
 172       </varlistentry>
 173
 174       <varlistentry>
 175         <term><option>-fno-monomorphism-restriction</option>:</term>
 176         <indexterm><primary><option>-fno-monomorphism-restriction</option></primary></indexterm>
 177         <listitem>
 178           <para> Switch off the Haskell 98 monomorphism restriction.
 179           Independent of the <option>-fglasgow-exts</option>
 180           flag. </para>
 181         </listitem>
 182       </varlistentry>
 183
 184       <varlistentry>
 185         <term><option>-fallow-overlapping-instances</option></term>
 186         <term><option>-fallow-undecidable-instances</option></term>
 187         <term><option>-fcontext-stack</option></term>
 188         <indexterm><primary><option>-fallow-overlapping-instances</option></primary></indexterm>
 189         <indexterm><primary><option>-fallow-undecidable-instances</option></primary></indexterm>
 190         <indexterm><primary><option>-fcontext-stack</option></primary></indexterm>
 191         <listitem>
 192           <para> See <xref LinkEnd="instance-decls">.  Only relevant
 193           if you also use <option>-fglasgow-exts</option>.</para>
 194         </listitem>
 195       </varlistentry>
 196
 197       <varlistentry>
 198         <term><option>-finline-phase</option></term>
 199         <indexterm><primary><option>-finline-phase</option></primary></indexterm>
 200         <listitem>
 201           <para>See <xref LinkEnd="rewrite-rules">.  Only relevant if
 202           you also use <option>-fglasgow-exts</option>.</para>
 203         </listitem>
 204       </varlistentry>
 205
 206       <varlistentry>
 207         <term><option>-fgenerics</option></term>
 208         <indexterm><primary><option>-fgenerics</option></primary></indexterm>
 209         <listitem>
 210           <para>See <xref LinkEnd="generic-classes">.  Independent of
 211           <option>-fglasgow-exts</option>.</para>
 212         </listitem>
 213       </varlistentry>
 214
 215         <varlistentry>
 216           <term><option>-fno-implicit-prelude</option></term>
 217           <listitem>
 218             <para><indexterm><primary>-fno-implicit-prelude
 219             option</primary></indexterm> GHC normally imports
 220             <filename>Prelude.hi</filename> files for you.  If you'd
 221             rather it didn't, then give it a
 222             <option>-fno-implicit-prelude</option> option.  The idea
 223             is that you can then import a Prelude of your own.  (But
 224             don't call it <literal>Prelude</literal>; the Haskell
 225             module namespace is flat, and you must not conflict with
 226             any Prelude module.)</para>
 227
 228             <para>Even though you have not imported the Prelude, all
 229             the built-in syntax still refers to the built-in Haskell
 230             Prelude types and values, as specified by the Haskell
 231             Report.  For example, the type <literal>[Int]</literal>
 232             still means <literal>Prelude.[] Int</literal>; tuples
 233             continue to refer to the standard Prelude tuples; the
 234             translation for list comprehensions continues to use
 235             <literal>Prelude.map</literal> etc.</para>
 236
 237             <para> With one group of exceptions!  You may want to
 238             define your own numeric class hierarchy.  It completely
 239             defeats that purpose if the literal "1" means
 240             "<literal>Prelude.fromInteger 1</literal>", which is what
 241             the Haskell Report specifies.  So the
 242             <option>-fno-implicit-prelude</option> flag causes the
 243             following pieces of built-in syntax to refer to <emphasis>whatever
 244             is in scope</emphasis>, not the Prelude versions:</para>
 245
 246             <itemizedlist>
 247               <listitem>
 248                 <para>Integer and fractional literals mean
 249                 "<literal>fromInteger 1</literal>" and
 250                 "<literal>fromRational 3.2</literal>", not the
 251                 Prelude-qualified versions; both in expressions and in
 252                 patterns.</para>
 253               </listitem>
 254
 255               <listitem>
 256                 <para>Negation (e.g. "<literal>- (f x)</literal>")
 257                 means "<literal>negate (f x)</literal>" (not
 258                 <literal>Prelude.negate</literal>).</para>
 259               </listitem>
 260
 261               <listitem>
 262                 <para>In an n+k pattern, the standard Prelude
 263                 <literal>Ord</literal> class is still used for comparison,
 264                 but the necessary subtraction uses whatever
 265                 "<literal>(-)</literal>" is in scope (not
 266                 "<literal>Prelude.(-)</literal>").</para>
 267               </listitem>
 268             </itemizedlist>
 269
 270              <para>Note: Negative literals, such as <literal>-3</literal>, are
 271              specified by (a careful reading of) the Haskell Report as
 272              meaning <literal>Prelude.negate (Prelude.fromInteger 3)</literal>.
 273              However, GHC deviates from this slightly, and treats them as meaning
 274              <literal>fromInteger (-3)</literal>.  One particular effect of this
 275              slightly-non-standard reading is that there is no difficulty with
 276              the literal <literal>-2147483648</literal> at type <literal>Int</literal>;
 277              it means <literal>fromInteger (-2147483648)</literal>.  The strict interpretation
 278              would be <literal>negate (fromInteger 21474836483)</literal>,
 279              and the call to <literal>fromInteger</literal> would overflow (at type <literal>Int</literal>, remember).
 280              </para>
 281
 282           </listitem>
 283         </varlistentry>
 284
 285     </variablelist>
 286   </sect1>
 287
 288 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
 289 &primitives;
 290
 291 <sect1 id="glasgow-ST-monad">
 292 <title>Primitive state-transformer monad</title>
 293
 294 <para>
 295 <indexterm><primary>state transformers (Glasgow extensions)</primary></indexterm>
 296 <indexterm><primary>ST monad (Glasgow extension)</primary></indexterm>
 297 </para>
 298
 299 <para>
 300 This monad underlies our implementation of arrays, mutable and
 301 immutable, and our implementation of I/O, including &ldquo;C calls&rdquo;.
 302 </para>
 303
 304 <para>
 305 The <literal>ST</literal> library, which provides access to the
 306 <function>ST</function> monad, is described in <xref
 307 linkend="sec-ST">.
 308 </para>
 309
 310 </sect1>
 311
 312 <sect1 id="glasgow-prim-arrays">
 313 <title>Primitive arrays, mutable and otherwise
 314 </title>
 315
 316 <para>
 317 <indexterm><primary>primitive arrays (Glasgow extension)</primary></indexterm>
 318 <indexterm><primary>arrays, primitive (Glasgow extension)</primary></indexterm>
 319 </para>
 320
 321 <para>
 322 GHC knows about quite a few flavours of Large Swathes of Bytes.
 323 </para>
 324
 325 <para>
 326 First, GHC distinguishes between primitive arrays of (boxed) Haskell
 327 objects (type <literal>Array&num; obj</literal>) and primitive arrays of bytes (type
 328 <literal>ByteArray&num;</literal>).
 329 </para>
 330
 331 <para>
 332 Second, it distinguishes between&hellip;
 333 <variablelist>
 334
 335 <varlistentry>
 336 <term>Immutable:</term>
 337 <listitem>
 338 <para>
 339 Arrays that do not change (as with &ldquo;standard&rdquo; Haskell arrays); you
 340 can only read from them.  Obviously, they do not need the care and
 341 attention of the state-transformer monad.
 342 </para>
 343 </listitem>
 344 </varlistentry>
 345 <varlistentry>
 346 <term>Mutable:</term>
 347 <listitem>
 348 <para>
 349 Arrays that may be changed or &ldquo;mutated.&rdquo;  All the operations on them
 350 live within the state-transformer monad and the updates happen
 351 <emphasis>in-place</emphasis>.
 352 </para>
 353 </listitem>
 354 </varlistentry>
 355 <varlistentry>
 356 <term>&ldquo;Static&rdquo; (in C land):</term>
 357 <listitem>
 358 <para>
 359 A C routine may pass an <literal>Addr&num;</literal> pointer back into Haskell land.  There
 360 are then primitive operations with which you may merrily grab values
 361 over in C land, by indexing off the &ldquo;static&rdquo; pointer.
 362 </para>
 363 </listitem>
 364 </varlistentry>
 365 <varlistentry>
 366 <term>&ldquo;Stable&rdquo; pointers:</term>
 367 <listitem>
 368 <para>
 369 If, for some reason, you wish to hand a Haskell pointer (i.e.,
 370 <emphasis>not</emphasis> an unboxed value) to a C routine, you first make the
 371 pointer &ldquo;stable,&rdquo; so that the garbage collector won't forget that it
 372 exists.  That is, GHC provides a safe way to pass Haskell pointers to
 373 C.
 374 </para>
 375
 376 <para>
 377 Please see <xref LinkEnd="sec-stable-pointers"> for more details.
 378 </para>
 379 </listitem>
 380 </varlistentry>
 381 <varlistentry>
 382 <term>&ldquo;Foreign objects&rdquo;:</term>
 383 <listitem>
 384 <para>
 385 A &ldquo;foreign object&rdquo; is a safe way to pass an external object (a
 386 C-allocated pointer, say) to Haskell and have Haskell do the Right
 387 Thing when it no longer references the object.  So, for example, C
 388 could pass a large bitmap over to Haskell and say &ldquo;please free this
 389 memory when you're done with it.&rdquo;
 390 </para>
 391
 392 <para>
 393 Please see <xref LinkEnd="sec-ForeignObj"> for more details.
 394 </para>
 395 </listitem>
 396 </varlistentry>
 397 </variablelist>
 398 </para>
 399
 400 <para>
 401 The libraries documentatation gives more details on all these
 402 &ldquo;primitive array&rdquo; types and the operations on them.
 403 </para>
 404
 405 </sect1>
 406
 407
 408 <sect1 id="pattern-guards">
 409 <title>Pattern guards</title>
 410
 411 <para>
 412 <indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
 413 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ULink URL="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ULink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
 414 </para>
 415
 416 <para>
 417 Suppose we have an abstract data type of finite maps, with a
 418 lookup operation:
 419
 420 <programlisting>
 421 lookup :: FiniteMap -> Int -> Maybe Int
 422 </programlisting>
 423
 424 The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
 425 where <VarName>v</VarName> is the value that the key maps to.  Now consider the following definition:
 426 </para>
 427
 428 <programlisting>
 429 clunky env var1 var2 | ok1 && ok2 = val1 + val2
 430 | otherwise  = var1 + var2
 431 where
 432   m1 = lookup env var1
 433   m2 = lookup env var2
 434   ok1 = maybeToBool m1
 435   ok2 = maybeToBool m2
 436   val1 = expectJust m1
 437   val2 = expectJust m2
 438 </programlisting>
 439
 440 <para>
 441 The auxiliary functions are
 442 </para>
 443
 444 <programlisting>
 445 maybeToBool :: Maybe a -&gt; Bool
 446 maybeToBool (Just x) = True
 447 maybeToBool Nothing  = False
 448
 449 expectJust :: Maybe a -&gt; a
 450 expectJust (Just x) = x
 451 expectJust Nothing  = error "Unexpected Nothing"
 452 </programlisting>
 453
 454 <para>
 455 What is <function>clunky</function> doing? The guard <literal>ok1 &&
 456 ok2</literal> checks that both lookups succeed, using
 457 <function>maybeToBool</function> to convert the <function>Maybe</function>
 458 types to booleans. The (lazily evaluated) <function>expectJust</function>
 459 calls extract the values from the results of the lookups, and binds the
 460 returned values to <VarName>val1</VarName> and <VarName>val2</VarName>
 461 respectively.  If either lookup fails, then clunky takes the
 462 <literal>otherwise</literal> case and returns the sum of its arguments.
 463 </para>
 464
 465 <para>
 466 This is certainly legal Haskell, but it is a tremendously verbose and
 467 un-obvious way to achieve the desired effect.  Arguably, a more direct way
 468 to write clunky would be to use case expressions:
 469 </para>
 470
 471 <programlisting>
 472 clunky env var1 var1 = case lookup env var1 of
 473   Nothing -&gt; fail
 474   Just val1 -&gt; case lookup env var2 of
 475     Nothing -&gt; fail
 476     Just val2 -&gt; val1 + val2
 477 where
 478   fail = val1 + val2
 479 </programlisting>
 480
 481 <para>
 482 This is a bit shorter, but hardly better.  Of course, we can rewrite any set
 483 of pattern-matching, guarded equations as case expressions; that is
 484 precisely what the compiler does when compiling equations! The reason that
 485 Haskell provides guarded equations is because they allow us to write down
 486 the cases we want to consider, one at a time, independently of each other.
 487 This structure is hidden in the case version.  Two of the right-hand sides
 488 are really the same (<function>fail</function>), and the whole expression
 489 tends to become more and more indented.
 490 </para>
 491
 492 <para>
 493 Here is how I would write clunky:
 494 </para>
 495
 496 <programlisting>
 497 clunky env var1 var1
 498   | Just val1 &lt;- lookup env var1
 499   , Just val2 &lt;- lookup env var2
 500   = val1 + val2
 501 ...other equations for clunky...
 502 </programlisting>
 503
 504 <para>
 505 The semantics should be clear enough.  The qualifers are matched in order.
 506 For a <literal>&lt;-</literal> qualifier, which I call a pattern guard, the
 507 right hand side is evaluated and matched against the pattern on the left.
 508 If the match fails then the whole guard fails and the next equation is
 509 tried.  If it succeeds, then the appropriate binding takes place, and the
 510 next qualifier is matched, in the augmented environment.  Unlike list
 511 comprehensions, however, the type of the expression to the right of the
 512 <literal>&lt;-</literal> is the same as the type of the pattern to its
 513 left.  The bindings introduced by pattern guards scope over all the
 514 remaining guard qualifiers, and over the right hand side of the equation.
 515 </para>
 516
 517 <para>
 518 Just as with list comprehensions, boolean expressions can be freely mixed
 519 with among the pattern guards.  For example:
 520 </para>
 521
 522 <programlisting>
 523 f x | [y] <- x
 524     , y > 3
 525     , Just z <- h y
 526     = ...
 527 </programlisting>
 528
 529 <para>
 530 Haskell's current guards therefore emerge as a special case, in which the
 531 qualifier list has just one element, a boolean expression.
 532 </para>
 533 </sect1>
 534
 535   <sect1 id="sec-ffi">
 536     <title>The foreign interface</title>
 537
 538     <para>The foreign interface consists of the following components:</para>
 539
 540     <itemizedlist>
 541       <listitem>
 542         <para>The Foreign Function Interface language specification
 543         (included in this manual, in <xref linkend="ffi">).</para>
 544       </listitem>
 545
 546       <listitem>
 547         <para>The <literal>Foreign</literal> module (see <xref
 548         linkend="sec-Foreign">) collects together several interfaces
 549         which are useful in specifying foreign language
 550         interfaces, including the following:</para>
 551
 552         <itemizedlist>
 553           <listitem>
 554             <para>The <literal>ForeignObj</literal> module (see <xref
 555             linkend="sec-ForeignObj">), for managing pointers from
 556             Haskell into the outside world.</para>
 557           </listitem>
 558
 559           <listitem>
 560             <para>The <literal>StablePtr</literal> module (see <xref
 561             linkend="sec-stable-pointers">), for managing pointers
 562             into Haskell from the outside world.</para>
 563           </listitem>
 564
 565           <listitem>
 566             <para>The <literal>CTypes</literal> module (see <xref
 567             linkend="sec-CTypes">) gives Haskell equivalents for the
 568             standard C datatypes, for use in making Haskell bindings
 569             to existing C libraries.</para>
 570           </listitem>
 571
 572           <listitem>
 573             <para>The <literal>CTypesISO</literal> module (see <xref
 574             linkend="sec-CTypesISO">) gives Haskell equivalents for C
 575             types defined by the ISO C standard.</para>
 576           </listitem>
 577
 578           <listitem>
 579             <para>The <literal>Storable</literal> library, for
 580             primitive marshalling of data types between Haskell and
 581             the foreign language.</para>
 582           </listitem>
 583         </itemizedlist>
 584
 585       </listitem>
 586     </itemizedlist>
 587
 588 <para>The following sections also give some hints and tips on the use
 589 of the foreign function interface in GHC.</para>
 590
 591 <sect2 id="glasgow-foreign-headers">
 592 <title>Using function headers
 593 </title>
 594
 595 <para>
 596 <indexterm><primary>C calls, function headers</primary></indexterm>
 597 </para>
 598
 599 <para>
 600 When generating C (using the <option>-fvia-C</option> directive), one can assist the
 601 C compiler in detecting type errors by using the <Command>-&num;include</Command> directive
 602 to provide <filename>.h</filename> files containing function headers.
 603 </para>
 604
 605 <para>
 606 For example,
 607 </para>
 608
 609 <para>
 610
 611 <programlisting>
 612 #include "HsFFI.h"
 613
 614 void         initialiseEFS (HsInt size);
 615 HsInt        terminateEFS (void);
 616 HsForeignObj emptyEFS(void);
 617 HsForeignObj updateEFS (HsForeignObj a, HsInt i, HsInt x);
 618 HsInt        lookupEFS (HsForeignObj a, HsInt i);
 619 </programlisting>
 620 </para>
 621
 622       <para>The types <literal>HsInt</literal>,
 623       <literal>HsForeignObj</literal> etc. are described in <xref
 624       linkend="sec-mapping-table">.</para>
 625
 626       <para>Note that this approach is only
 627       <emphasis>essential</emphasis> for returning
 628       <literal>float</literal>s (or if <literal>sizeof(int) !=
 629       sizeof(int *)</literal> on your architecture) but is a Good
 630       Thing for anyone who cares about writing solid code.  You're
 631       crazy not to do it.</para>
 632
 633 </sect2>
 634
 635 </sect1>
 636
 637 <sect1 id="multi-param-type-classes">
 638 <title>Multi-parameter type classes
 639 </title>
 640
 641 <para>
 642 This section documents GHC's implementation of multi-parameter type
 643 classes.  There's lots of background in the paper <ULink
 644 URL="http://research.microsoft.com/~simonpj/multi.ps.gz" >Type
 645 classes: exploring the design space</ULink > (Simon Peyton Jones, Mark
 646 Jones, Erik Meijer).
 647 </para>
 648
 649 <para>
 650 I'd like to thank people who reported shorcomings in the GHC 3.02
 651 implementation.  Our default decisions were all conservative ones, and
 652 the experience of these heroic pioneers has given useful concrete
 653 examples to support several generalisations.  (These appear below as
 654 design choices not implemented in 3.02.)
 655 </para>
 656
 657 <para>
 658 I've discussed these notes with Mark Jones, and I believe that Hugs
 659 will migrate towards the same design choices as I outline here.
 660 Thanks to him, and to many others who have offered very useful
 661 feedback.
 662 </para>
 663
 664 <sect2>
 665 <title>Types</title>
 666
 667 <para>
 668 There are the following restrictions on the form of a qualified
 669 type:
 670 </para>
 671
 672 <para>
 673
 674 <programlisting>
 675   forall tv1..tvn (c1, ...,cn) => type
 676 </programlisting>
 677
 678 </para>
 679
 680 <para>
 681 (Here, I write the "foralls" explicitly, although the Haskell source
 682 language omits them; in Haskell 1.4, all the free type variables of an
 683 explicit source-language type signature are universally quantified,
 684 except for the class type variables in a class declaration.  However,
 685 in GHC, you can give the foralls if you want.  See <xref LinkEnd="universal-quantification">).
 686 </para>
 687
 688 <para>
 689
 690 <OrderedList>
 691 <listitem>
 692
 693 <para>
 694  <emphasis>Each universally quantified type variable
 695 <literal>tvi</literal> must be mentioned (i.e. appear free) in <literal>type</literal></emphasis>.
 696
 697 The reason for this is that a value with a type that does not obey
 698 this restriction could not be used without introducing
 699 ambiguity. Here, for example, is an illegal type:
 700
 701
 702 <programlisting>
 703   forall a. Eq a => Int
 704 </programlisting>
 705
 706
 707 When a value with this type was used, the constraint <literal>Eq tv</literal>
 708 would be introduced where <literal>tv</literal> is a fresh type variable, and
 709 (in the dictionary-translation implementation) the value would be
 710 applied to a dictionary for <literal>Eq tv</literal>.  The difficulty is that we
 711 can never know which instance of <literal>Eq</literal> to use because we never
 712 get any more information about <literal>tv</literal>.
 713
 714 </para>
 715 </listitem>
 716 <listitem>
 717
 718 <para>
 719  <emphasis>Every constraint <literal>ci</literal> must mention at least one of the
 720 universally quantified type variables <literal>tvi</literal></emphasis>.
 721
 722 For example, this type is OK because <literal>C a b</literal> mentions the
 723 universally quantified type variable <literal>b</literal>:
 724
 725
 726 <programlisting>
 727   forall a. C a b => burble
 728 </programlisting>
 729
 730
 731 The next type is illegal because the constraint <literal>Eq b</literal> does not
 732 mention <literal>a</literal>:
 733
 734
 735 <programlisting>
 736   forall a. Eq b => burble
 737 </programlisting>
 738
 739
 740 The reason for this restriction is milder than the other one.  The
 741 excluded types are never useful or necessary (because the offending
 742 context doesn't need to be witnessed at this point; it can be floated
 743 out).  Furthermore, floating them out increases sharing. Lastly,
 744 excluding them is a conservative choice; it leaves a patch of
 745 territory free in case we need it later.
 746
 747 </para>
 748 </listitem>
 749
 750 </OrderedList>
 751
 752 </para>
 753
 754 <para>
 755 These restrictions apply to all types, whether declared in a type signature
 756 or inferred.
 757 </para>
 758
 759 <para>
 760 Unlike Haskell 1.4, constraints in types do <emphasis>not</emphasis> have to be of
 761 the form <emphasis>(class type-variables)</emphasis>.  Thus, these type signatures
 762 are perfectly OK
 763 </para>
 764
 765 <para>
 766
 767 <programlisting>
 768   f :: Eq (m a) => [m a] -> [m a]
 769   g :: Eq [a] => ...
 770 </programlisting>
 771
 772 </para>
 773
 774 <para>
 775 This choice recovers principal types, a property that Haskell 1.4 does not have.
 776 </para>
 777
 778 </sect2>
 779
 780 <sect2>
 781 <title>Class declarations</title>
 782
 783 <para>
 784
 785 <OrderedList>
 786 <listitem>
 787
 788 <para>
 789  <emphasis>Multi-parameter type classes are permitted</emphasis>. For example:
 790
 791
 792 <programlisting>
 793   class Collection c a where
 794     union :: c a -> c a -> c a
 795     ...etc.
 796 </programlisting>
 797
 798
 799
 800 </para>
 801 </listitem>
 802 <listitem>
 803
 804 <para>
 805  <emphasis>The class hierarchy must be acyclic</emphasis>.  However, the definition
 806 of "acyclic" involves only the superclass relationships.  For example,
 807 this is OK:
 808
 809
 810 <programlisting>
 811   class C a where {
 812     op :: D b => a -> b -> b
 813   }
 814
 815   class C a => D a where { ... }
 816 </programlisting>
 817
 818
 819 Here, <literal>C</literal> is a superclass of <literal>D</literal>, but it's OK for a
 820 class operation <literal>op</literal> of <literal>C</literal> to mention <literal>D</literal>.  (It
 821 would not be OK for <literal>D</literal> to be a superclass of <literal>C</literal>.)
 822
 823 </para>
 824 </listitem>
 825 <listitem>
 826
 827 <para>
 828  <emphasis>There are no restrictions on the context in a class declaration
 829 (which introduces superclasses), except that the class hierarchy must
 830 be acyclic</emphasis>.  So these class declarations are OK:
 831
 832
 833 <programlisting>
 834   class Functor (m k) => FiniteMap m k where
 835     ...
 836
 837   class (Monad m, Monad (t m)) => Transform t m where
 838     lift :: m a -> (t m) a
 839 </programlisting>
 840
 841
 842 </para>
 843 </listitem>
 844 <listitem>
 845
 846 <para>
 847  <emphasis>In the signature of a class operation, every constraint
 848 must mention at least one type variable that is not a class type
 849 variable</emphasis>.
 850
 851 Thus:
 852
 853
 854 <programlisting>
 855   class Collection c a where
 856     mapC :: Collection c b => (a->b) -> c a -> c b
 857 </programlisting>
 858
 859
 860 is OK because the constraint <literal>(Collection a b)</literal> mentions
 861 <literal>b</literal>, even though it also mentions the class variable
 862 <literal>a</literal>.  On the other hand:
 863
 864
 865 <programlisting>
 866   class C a where
 867     op :: Eq a => (a,b) -> (a,b)
 868 </programlisting>
 869
 870
 871 is not OK because the constraint <literal>(Eq a)</literal> mentions on the class
 872 type variable <literal>a</literal>, but not <literal>b</literal>.  However, any such
 873 example is easily fixed by moving the offending context up to the
 874 superclass context:
 875
 876
 877 <programlisting>
 878   class Eq a => C a where
 879     op ::(a,b) -> (a,b)
 880 </programlisting>
 881
 882
 883 A yet more relaxed rule would allow the context of a class-op signature
 884 to mention only class type variables.  However, that conflicts with
 885 Rule 1(b) for types above.
 886
 887 </para>
 888 </listitem>
 889 <listitem>
 890
 891 <para>
 892  <emphasis>The type of each class operation must mention <emphasis>all</emphasis> of
 893 the class type variables</emphasis>.  For example:
 894
 895
 896 <programlisting>
 897   class Coll s a where
 898     empty  :: s
 899     insert :: s -> a -> s
 900 </programlisting>
 901
 902
 903 is not OK, because the type of <literal>empty</literal> doesn't mention
 904 <literal>a</literal>.  This rule is a consequence of Rule 1(a), above, for
 905 types, and has the same motivation.
 906
 907 Sometimes, offending class declarations exhibit misunderstandings.  For
 908 example, <literal>Coll</literal> might be rewritten
 909
 910
 911 <programlisting>
 912   class Coll s a where
 913     empty  :: s a
 914     insert :: s a -> a -> s a
 915 </programlisting>
 916
 917
 918 which makes the connection between the type of a collection of
 919 <literal>a</literal>'s (namely <literal>(s a)</literal>) and the element type <literal>a</literal>.
 920 Occasionally this really doesn't work, in which case you can split the
 921 class like this:
 922
 923
 924 <programlisting>
 925   class CollE s where
 926     empty  :: s
 927
 928   class CollE s => Coll s a where
 929     insert :: s -> a -> s
 930 </programlisting>
 931
 932
 933 </para>
 934 </listitem>
 935
 936 </OrderedList>
 937
 938 </para>
 939
 940 </sect2>
 941
 942 <sect2 id="instance-decls">
 943 <title>Instance declarations</title>
 944
 945 <para>
 946
 947 <OrderedList>
 948 <listitem>
 949
 950 <para>
 951  <emphasis>Instance declarations may not overlap</emphasis>.  The two instance
 952 declarations
 953
 954
 955 <programlisting>
 956   instance context1 => C type1 where ...
 957   instance context2 => C type2 where ...
 958 </programlisting>
 959
 960
 961 "overlap" if <literal>type1</literal> and <literal>type2</literal> unify
 962
 963 However, if you give the command line option
 964 <option>-fallow-overlapping-instances</option><indexterm><primary>-fallow-overlapping-instances
 965 option</primary></indexterm> then two overlapping instance declarations are permitted
 966 iff
 967
 968
 969 <itemizedlist>
 970 <listitem>
 971
 972 <para>
 973  EITHER <literal>type1</literal> and <literal>type2</literal> do not unify
 974 </para>
 975 </listitem>
 976 <listitem>
 977
 978 <para>
 979  OR <literal>type2</literal> is a substitution instance of <literal>type1</literal>
 980 (but not identical to <literal>type1</literal>)
 981 </para>
 982 </listitem>
 983 <listitem>
 984
 985 <para>
 986  OR vice versa
 987 </para>
 988 </listitem>
 989
 990 </itemizedlist>
 991
 992
 993 Notice that these rules
 994
 995
 996 <itemizedlist>
 997 <listitem>
 998
 999 <para>
1000  make it clear which instance decl to use
1001 (pick the most specific one that matches)
1002
1003 </para>
1004 </listitem>
1005 <listitem>
1006
1007 <para>
1008  do not mention the contexts <literal>context1</literal>, <literal>context2</literal>
1009 Reason: you can pick which instance decl
1010 "matches" based on the type.
1011 </para>
1012 </listitem>
1013
1014 </itemizedlist>
1015
1016
1017 Regrettably, GHC doesn't guarantee to detect overlapping instance
1018 declarations if they appear in different modules.  GHC can "see" the
1019 instance declarations in the transitive closure of all the modules
1020 imported by the one being compiled, so it can "see" all instance decls
1021 when it is compiling <literal>Main</literal>.  However, it currently chooses not
1022 to look at ones that can't possibly be of use in the module currently
1023 being compiled, in the interests of efficiency.  (Perhaps we should
1024 change that decision, at least for <literal>Main</literal>.)
1025
1026 </para>
1027 </listitem>
1028 <listitem>
1029
1030 <para>
1031  <emphasis>There are no restrictions on the type in an instance
1032 <emphasis>head</emphasis>, except that at least one must not be a type variable</emphasis>.
1033 The instance "head" is the bit after the "=>" in an instance decl. For
1034 example, these are OK:
1035
1036
1037 <programlisting>
1038   instance C Int a where ...
1039
1040   instance D (Int, Int) where ...
1041
1042   instance E [[a]] where ...
1043 </programlisting>
1044
1045
1046 Note that instance heads <emphasis>may</emphasis> contain repeated type variables.
1047 For example, this is OK:
1048
1049
1050 <programlisting>
1051   instance Stateful (ST s) (MutVar s) where ...
1052 </programlisting>
1053
1054
1055 The "at least one not a type variable" restriction is to ensure that
1056 context reduction terminates: each reduction step removes one type
1057 constructor.  For example, the following would make the type checker
1058 loop if it wasn't excluded:
1059
1060
1061 <programlisting>
1062   instance C a => C a where ...
1063 </programlisting>
1064
1065
1066 There are two situations in which the rule is a bit of a pain. First,
1067 if one allows overlapping instance declarations then it's quite
1068 convenient to have a "default instance" declaration that applies if
1069 something more specific does not:
1070
1071
1072 <programlisting>
1073   instance C a where
1074     op = ... -- Default
1075 </programlisting>
1076
1077
1078 Second, sometimes you might want to use the following to get the
1079 effect of a "class synonym":
1080
1081
1082 <programlisting>
1083   class (C1 a, C2 a, C3 a) => C a where { }
1084
1085   instance (C1 a, C2 a, C3 a) => C a where { }
1086 </programlisting>
1087
1088
1089 This allows you to write shorter signatures:
1090
1091
1092 <programlisting>
1093   f :: C a => ...
1094 </programlisting>
1095
1096
1097 instead of
1098
1099
1100 <programlisting>
1101   f :: (C1 a, C2 a, C3 a) => ...
1102 </programlisting>
1103
1104
1105 I'm on the lookout for a simple rule that preserves decidability while
1106 allowing these idioms.  The experimental flag
1107 <option>-fallow-undecidable-instances</option><indexterm><primary>-fallow-undecidable-instances
1108 option</primary></indexterm> lifts this restriction, allowing all the types in an
1109 instance head to be type variables.
1110
1111 </para>
1112 </listitem>
1113 <listitem>
1114
1115 <para>
1116  <emphasis>Unlike Haskell 1.4, instance heads may use type
1117 synonyms</emphasis>.  As always, using a type synonym is just shorthand for
1118 writing the RHS of the type synonym definition.  For example:
1119
1120
1121 <programlisting>
1122   type Point = (Int,Int)
1123   instance C Point   where ...
1124   instance C [Point] where ...
1125 </programlisting>
1126
1127
1128 is legal.  However, if you added
1129
1130
1131 <programlisting>
1132   instance C (Int,Int) where ...
1133 </programlisting>
1134
1135
1136 as well, then the compiler will complain about the overlapping
1137 (actually, identical) instance declarations.  As always, type synonyms
1138 must be fully applied.  You cannot, for example, write:
1139
1140
1141 <programlisting>
1142   type P a = [[a]]
1143   instance Monad P where ...
1144 </programlisting>
1145
1146
1147 This design decision is independent of all the others, and easily
1148 reversed, but it makes sense to me.
1149
1150 </para>
1151 </listitem>
1152 <listitem>
1153
1154 <para>
1155 <emphasis>The types in an instance-declaration <emphasis>context</emphasis> must all
1156 be type variables</emphasis>. Thus
1157
1158
1159 <programlisting>
1160 instance C a b => Eq (a,b) where ...
1161 </programlisting>
1162
1163
1164 is OK, but
1165
1166
1167 <programlisting>
1168 instance C Int b => Foo b where ...
1169 </programlisting>
1170
1171
1172 is not OK.  Again, the intent here is to make sure that context
1173 reduction terminates.
1174
1175 Voluminous correspondence on the Haskell mailing list has convinced me
1176 that it's worth experimenting with a more liberal rule.  If you use
1177 the flag <option>-fallow-undecidable-instances</option> can use arbitrary
1178 types in an instance context.  Termination is ensured by having a
1179 fixed-depth recursion stack.  If you exceed the stack depth you get a
1180 sort of backtrace, and the opportunity to increase the stack depth
1181 with <option>-fcontext-stack</option><emphasis>N</emphasis>.
1182
1183 </para>
1184 </listitem>
1185
1186 </OrderedList>
1187
1188 </para>
1189
1190 </sect2>
1191
1192 </sect1>
1193
1194 <sect1 id="implicit-parameters">
1195 <title>Implicit parameters
1196 </title>
1197
1198 <para> Implicit paramters are implemented as described in
1199 "Implicit parameters: dynamic scoping with static types",
1200 J Lewis, MB Shields, E Meijer, J Launchbury,
1201 27th ACM Symposium on Principles of Programming Languages (POPL'00),
1202 Boston, Jan 2000.
1203 </para>
1204
1205 <para>
1206 There should be more documentation, but there isn't (yet).  Yell if you need it.
1207 </para>
1208 <itemizedlist>
1209 <listitem>
1210 <para> You can't have an implicit parameter in the context of a class or instance
1211 declaration.  For example, both these declarations are illegal:
1212 <programlisting>
1213   class (?x::Int) => C a where ...
1214   instance (?x::a) => Foo [a] where ...
1215 </programlisting>
1216 Reason: exactly which implicit parameter you pick up depends on exactly where
1217 you invoke a function. But the ``invocation'' of instance declarations is done
1218 behind the scenes by the compiler, so it's hard to figure out exactly where it is done.
1219 Easiest thing is to outlaw the offending types.</para>
1220 </listitem>
1221
1222 </itemizedlist>
1223
1224 </sect1>
1225
1226
1227 <sect1 id="functional-dependencies">
1228 <title>Functional dependencies
1229 </title>
1230
1231 <para> Functional dependencies are implemented as described by Mark Jones
1232 in "Type Classes with Functional Dependencies", Mark P. Jones,
1233 In Proceedings of the 9th European Symposium on Programming,
1234 ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782.
1235 </para>
1236
1237 <para>
1238 There should be more documentation, but there isn't (yet).  Yell if you need it.
1239 </para>
1240 </sect1>
1241
1242
1243 <sect1 id="universal-quantification">
1244 <title>Explicit universal quantification
1245 </title>
1246
1247 <para>
1248 GHC's type system supports explicit universal quantification in
1249 constructor fields and function arguments.  This is useful for things
1250 like defining <literal>runST</literal> from the state-thread world.
1251 GHC's syntax for this now agrees with Hugs's, namely:
1252 </para>
1253
1254 <para>
1255
1256 <programlisting>
1257         forall a b. (Ord a, Eq  b) => a -> b -> a
1258 </programlisting>
1259
1260 </para>
1261
1262 <para>
1263 The context is, of course, optional.  You can't use <literal>forall</literal> as
1264 a type variable any more!
1265 </para>
1266
1267 <para>
1268 Haskell type signatures are implicitly quantified.  The <literal>forall</literal>
1269 allows us to say exactly what this means.  For example:
1270 </para>
1271
1272 <para>
1273
1274 <programlisting>
1275         g :: b -> b
1276 </programlisting>
1277
1278 </para>
1279
1280 <para>
1281 means this:
1282 </para>
1283
1284 <para>
1285
1286 <programlisting>
1287         g :: forall b. (b -> b)
1288 </programlisting>
1289
1290 </para>
1291
1292 <para>
1293 The two are treated identically.
1294 </para>
1295
1296 <sect2 id="univ">
1297 <title>Universally-quantified data type fields
1298 </title>
1299
1300 <para>
1301 In a <literal>data</literal> or <literal>newtype</literal> declaration one can quantify
1302 the types of the constructor arguments.  Here are several examples:
1303 </para>
1304
1305 <para>
1306
1307 <programlisting>
1308 data T a = T1 (forall b. b -> b -> b) a
1309
1310 data MonadT m = MkMonad { return :: forall a. a -> m a,
1311                           bind   :: forall a b. m a -> (a -> m b) -> m b
1312                         }
1313
1314 newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
1315 </programlisting>
1316
1317 </para>
1318
1319 <para>
1320 The constructors now have so-called <emphasis>rank 2</emphasis> polymorphic
1321 types, in which there is a for-all in the argument types.:
1322 </para>
1323
1324 <para>
1325
1326 <programlisting>
1327 T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
1328 MkMonad :: forall m. (forall a. a -> m a)
1329                   -> (forall a b. m a -> (a -> m b) -> m b)
1330                   -> MonadT m
1331 MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
1332 </programlisting>
1333
1334 </para>
1335
1336 <para>
1337 Notice that you don't need to use a <literal>forall</literal> if there's an
1338 explicit context.  For example in the first argument of the
1339 constructor <function>MkSwizzle</function>, an implicit "<literal>forall a.</literal>" is
1340 prefixed to the argument type.  The implicit <literal>forall</literal>
1341 quantifies all type variables that are not already in scope, and are
1342 mentioned in the type quantified over.
1343 </para>
1344
1345 <para>
1346 As for type signatures, implicit quantification happens for non-overloaded
1347 types too.  So if you write this:
1348
1349 <programlisting>
1350   data T a = MkT (Either a b) (b -> b)
1351 </programlisting>
1352
1353 it's just as if you had written this:
1354
1355 <programlisting>
1356   data T a = MkT (forall b. Either a b) (forall b. b -> b)
1357 </programlisting>
1358
1359 That is, since the type variable <literal>b</literal> isn't in scope, it's
1360 implicitly universally quantified.  (Arguably, it would be better
1361 to <emphasis>require</emphasis> explicit quantification on constructor arguments
1362 where that is what is wanted.  Feedback welcomed.)
1363 </para>
1364
1365 </sect2>
1366
1367 <sect2>
1368 <title>Construction </title>
1369
1370 <para>
1371 You construct values of types <literal>T1, MonadT, Swizzle</literal> by applying
1372 the constructor to suitable values, just as usual.  For example,
1373 </para>
1374
1375 <para>
1376
1377 <programlisting>
1378 (T1 (\xy->x) 3) :: T Int
1379
1380 (MkSwizzle sort)    :: Swizzle
1381 (MkSwizzle reverse) :: Swizzle
1382
1383 (let r x = Just x
1384      b m k = case m of
1385                 Just y -> k y
1386                 Nothing -> Nothing
1387   in
1388   MkMonad r b) :: MonadT Maybe
1389 </programlisting>
1390
1391 </para>
1392
1393 <para>
1394 The type of the argument can, as usual, be more general than the type
1395 required, as <literal>(MkSwizzle reverse)</literal> shows.  (<function>reverse</function>
1396 does not need the <literal>Ord</literal> constraint.)
1397 </para>
1398
1399 </sect2>
1400
1401 <sect2>
1402 <title>Pattern matching</title>
1403
1404 <para>
1405 When you use pattern matching, the bound variables may now have
1406 polymorphic types.  For example:
1407 </para>
1408
1409 <para>
1410
1411 <programlisting>
1412         f :: T a -> a -> (a, Char)
1413         f (T1 f k) x = (f k x, f 'c' 'd')
1414
1415         g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
1416         g (MkSwizzle s) xs f = s (map f (s xs))
1417
1418         h :: MonadT m -> [m a] -> m [a]
1419         h m [] = return m []
1420         h m (x:xs) = bind m x           $ \y ->
1421                       bind m (h m xs)   $ \ys ->
1422                       return m (y:ys)
1423 </programlisting>
1424
1425 </para>
1426
1427 <para>
1428 In the function <function>h</function> we use the record selectors <literal>return</literal>
1429 and <literal>bind</literal> to extract the polymorphic bind and return functions
1430 from the <literal>MonadT</literal> data structure, rather than using pattern
1431 matching.
1432 </para>
1433
1434 <para>
1435 You cannot pattern-match against an argument that is polymorphic.
1436 For example:
1437
1438 <programlisting>
1439         newtype TIM s a = TIM (ST s (Maybe a))
1440
1441         runTIM :: (forall s. TIM s a) -> Maybe a
1442         runTIM (TIM m) = runST m
1443 </programlisting>
1444
1445 </para>
1446
1447 <para>
1448 Here the pattern-match fails, because you can't pattern-match against
1449 an argument of type <literal>(forall s. TIM s a)</literal>.  Instead you
1450 must bind the variable and pattern match in the right hand side:
1451
1452 <programlisting>
1453         runTIM :: (forall s. TIM s a) -> Maybe a
1454         runTIM tm = case tm of { TIM m -> runST m }
1455 </programlisting>
1456
1457 The <literal>tm</literal> on the right hand side is (invisibly) instantiated, like
1458 any polymorphic value at its occurrence site, and now you can pattern-match
1459 against it.
1460 </para>
1461
1462 </sect2>
1463
1464 <sect2>
1465 <title>The partial-application restriction</title>
1466
1467 <para>
1468 There is really only one way in which data structures with polymorphic
1469 components might surprise you: you must not partially apply them.
1470 For example, this is illegal:
1471 </para>
1472
1473 <para>
1474
1475 <programlisting>
1476         map MkSwizzle [sort, reverse]
1477 </programlisting>
1478
1479 </para>
1480
1481 <para>
1482 The restriction is this: <emphasis>every subexpression of the program must
1483 have a type that has no for-alls, except that in a function
1484 application (f e1&hellip;en) the partial applications are not subject to
1485 this rule</emphasis>.  The restriction makes type inference feasible.
1486 </para>
1487
1488 <para>
1489 In the illegal example, the sub-expression <literal>MkSwizzle</literal> has the
1490 polymorphic type <literal>(Ord b => [b] -> [b]) -> Swizzle</literal> and is not
1491 a sub-expression of an enclosing application.  On the other hand, this
1492 expression is OK:
1493 </para>
1494
1495 <para>
1496
1497 <programlisting>
1498         map (T1 (\a b -> a)) [1,2,3]
1499 </programlisting>
1500
1501 </para>
1502
1503 <para>
1504 even though it involves a partial application of <function>T1</function>, because
1505 the sub-expression <literal>T1 (\a b -> a)</literal> has type <literal>Int -> T
1506 Int</literal>.
1507 </para>
1508
1509 </sect2>
1510
1511 <sect2 id="sigs">
1512 <title>Type signatures
1513 </title>
1514
1515 <para>
1516 Once you have data constructors with universally-quantified fields, or
1517 constants such as <Constant>runST</Constant> that have rank-2 types, it isn't long
1518 before you discover that you need more!  Consider:
1519 </para>
1520
1521 <para>
1522
1523 <programlisting>
1524   mkTs f x y = [T1 f x, T1 f y]
1525 </programlisting>
1526
1527 </para>
1528
1529 <para>
1530 <function>mkTs</function> is a fuction that constructs some values of type
1531 <literal>T</literal>, using some pieces passed to it.  The trouble is that since
1532 <literal>f</literal> is a function argument, Haskell assumes that it is
1533 monomorphic, so we'll get a type error when applying <function>T1</function> to
1534 it.  This is a rather silly example, but the problem really bites in
1535 practice.  Lots of people trip over the fact that you can't make
1536 "wrappers functions" for <Constant>runST</Constant> for exactly the same reason.
1537 In short, it is impossible to build abstractions around functions with
1538 rank-2 types.
1539 </para>
1540
1541 <para>
1542 The solution is fairly clear.  We provide the ability to give a rank-2
1543 type signature for <emphasis>ordinary</emphasis> functions (not only data
1544 constructors), thus:
1545 </para>
1546
1547 <para>
1548
1549 <programlisting>
1550   mkTs :: (forall b. b -> b -> b) -> a -> [T a]
1551   mkTs f x y = [T1 f x, T1 f y]
1552 </programlisting>
1553
1554 </para>
1555
1556 <para>
1557 This type signature tells the compiler to attribute <literal>f</literal> with
1558 the polymorphic type <literal>(forall b. b -> b -> b)</literal> when type
1559 checking the body of <function>mkTs</function>, so now the application of
1560 <function>T1</function> is fine.
1561 </para>
1562
1563 <para>
1564 There are two restrictions:
1565 </para>
1566
1567 <para>
1568
1569 <itemizedlist>
1570 <listitem>
1571
1572 <para>
1573  You can only define a rank 2 type, specified by the following
1574 grammar:
1575
1576
1577 <programlisting>
1578 rank2type ::= [forall tyvars .] [context =>] funty
1579 funty     ::= ([forall tyvars .] [context =>] ty) -> funty
1580             | ty
1581 ty        ::= ...current Haskell monotype syntax...
1582 </programlisting>
1583
1584
1585 Informally, the universal quantification must all be right at the beginning,
1586 or at the top level of a function argument.
1587
1588 </para>
1589 </listitem>
1590 <listitem>
1591
1592 <para>
1593  There is a restriction on the definition of a function whose
1594 type signature is a rank-2 type: the polymorphic arguments must be
1595 matched on the left hand side of the "<literal>=</literal>" sign.  You can't
1596 define <function>mkTs</function> like this:
1597
1598
1599 <programlisting>
1600 mkTs :: (forall b. b -> b -> b) -> a -> [T a]
1601 mkTs = \ f x y -> [T1 f x, T1 f y]
1602 </programlisting>
1603
1604
1605
1606 The same partial-application rule applies to ordinary functions with
1607 rank-2 types as applied to data constructors.
1608
1609 </para>
1610 </listitem>
1611
1612 </itemizedlist>
1613
1614 </para>
1615
1616 </sect2>
1617
1618
1619 <sect2 id="hoist">
1620 <title>Type synonyms and hoisting
1621 </title>
1622
1623 <para>
1624 GHC also allows you to write a <literal>forall</literal> in a type synonym, thus:
1625 <programlisting>
1626   type Discard a = forall b. a -> b -> a
1627
1628   f :: Discard a
1629   f x y = x
1630 </programlisting>
1631 However, it is often convenient to use these sort of synonyms at the right hand
1632 end of an arrow, thus:
1633 <programlisting>
1634   type Discard a = forall b. a -> b -> a
1635
1636   g :: Int -> Discard Int
1637   g x y z = x+y
1638 </programlisting>
1639 Simply expanding the type synonym would give
1640 <programlisting>
1641   g :: Int -> (forall b. Int -> b -> Int)
1642 </programlisting>
1643 but GHC "hoists" the <literal>forall</literal> to give the isomorphic type
1644 <programlisting>
1645   g :: forall b. Int -> Int -> b -> Int
1646 </programlisting>
1647 In general, the rule is this: <emphasis>to determine the type specified by any explicit
1648 user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly
1649 performs the transformation:</emphasis>
1650 <programlisting>
1651   <emphasis>type1</emphasis> -> forall a. <emphasis>type2</emphasis>
1652 ==>
1653   forall a. <emphasis>type1</emphasis> -> <emphasis>type2</emphasis>
1654 </programlisting>
1655 (In fact, GHC tries to retain as much synonym information as possible for use in
1656 error messages, but that is a usability issue.)  This rule applies, of course, whether
1657 or not the <literal>forall</literal> comes from a synonym. For example, here is another
1658 valid way to write <literal>g</literal>'s type signature:
1659 <programlisting>
1660   g :: Int -> Int -> forall b. b -> Int
1661 </programlisting>
1662 </para>
1663 </sect2>
1664
1665 </sect1>
1666
1667 <sect1 id="existential-quantification">
1668 <title>Existentially quantified data constructors
1669 </title>
1670
1671 <para>
1672 The idea of using existential quantification in data type declarations
1673 was suggested by Laufer (I believe, thought doubtless someone will
1674 correct me), and implemented in Hope+. It's been in Lennart
1675 Augustsson's <Command>hbc</Command> Haskell compiler for several years, and
1676 proved very useful.  Here's the idea.  Consider the declaration:
1677 </para>
1678
1679 <para>
1680
1681 <programlisting>
1682   data Foo = forall a. MkFoo a (a -> Bool)
1683            | Nil
1684 </programlisting>
1685
1686 </para>
1687
1688 <para>
1689 The data type <literal>Foo</literal> has two constructors with types:
1690 </para>
1691
1692 <para>
1693
1694 <programlisting>
1695   MkFoo :: forall a. a -> (a -> Bool) -> Foo
1696   Nil   :: Foo
1697 </programlisting>
1698
1699 </para>
1700
1701 <para>
1702 Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function>
1703 does not appear in the data type itself, which is plain <literal>Foo</literal>.
1704 For example, the following expression is fine:
1705 </para>
1706
1707 <para>
1708
1709 <programlisting>
1710   [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
1711 </programlisting>
1712
1713 </para>
1714
1715 <para>
1716 Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function
1717 <function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c'
1718 isUpper</function> packages a character with a compatible function.  These
1719 two things are each of type <literal>Foo</literal> and can be put in a list.
1720 </para>
1721
1722 <para>
1723 What can we do with a value of type <literal>Foo</literal>?.  In particular,
1724 what happens when we pattern-match on <function>MkFoo</function>?
1725 </para>
1726
1727 <para>
1728
1729 <programlisting>
1730   f (MkFoo val fn) = ???
1731 </programlisting>
1732
1733 </para>
1734
1735 <para>
1736 Since all we know about <literal>val</literal> and <function>fn</function> is that they
1737 are compatible, the only (useful) thing we can do with them is to
1738 apply <function>fn</function> to <literal>val</literal> to get a boolean.  For example:
1739 </para>
1740
1741 <para>
1742
1743 <programlisting>
1744   f :: Foo -> Bool
1745   f (MkFoo val fn) = fn val
1746 </programlisting>
1747
1748 </para>
1749
1750 <para>
1751 What this allows us to do is to package heterogenous values
1752 together with a bunch of functions that manipulate them, and then treat
1753 that collection of packages in a uniform manner.  You can express
1754 quite a bit of object-oriented-like programming this way.
1755 </para>
1756
1757 <sect2 id="existential">
1758 <title>Why existential?
1759 </title>
1760
1761 <para>
1762 What has this to do with <emphasis>existential</emphasis> quantification?
1763 Simply that <function>MkFoo</function> has the (nearly) isomorphic type
1764 </para>
1765
1766 <para>
1767
1768 <programlisting>
1769   MkFoo :: (exists a . (a, a -> Bool)) -> Foo
1770 </programlisting>
1771
1772 </para>
1773
1774 <para>
1775 But Haskell programmers can safely think of the ordinary
1776 <emphasis>universally</emphasis> quantified type given above, thereby avoiding
1777 adding a new existential quantification construct.
1778 </para>
1779
1780 </sect2>
1781
1782 <sect2>
1783 <title>Type classes</title>
1784
1785 <para>
1786 An easy extension (implemented in <Command>hbc</Command>) is to allow
1787 arbitrary contexts before the constructor.  For example:
1788 </para>
1789
1790 <para>
1791
1792 <programlisting>
1793 data Baz = forall a. Eq a => Baz1 a a
1794          | forall b. Show b => Baz2 b (b -> b)
1795 </programlisting>
1796
1797 </para>
1798
1799 <para>
1800 The two constructors have the types you'd expect:
1801 </para>
1802
1803 <para>
1804
1805 <programlisting>
1806 Baz1 :: forall a. Eq a => a -> a -> Baz
1807 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
1808 </programlisting>
1809
1810 </para>
1811
1812 <para>
1813 But when pattern matching on <function>Baz1</function> the matched values can be compared
1814 for equality, and when pattern matching on <function>Baz2</function> the first matched
1815 value can be converted to a string (as well as applying the function to it).
1816 So this program is legal:
1817 </para>
1818
1819 <para>
1820
1821 <programlisting>
1822   f :: Baz -> String
1823   f (Baz1 p q) | p == q    = "Yes"
1824                | otherwise = "No"
1825   f (Baz1 v fn)            = show (fn v)
1826 </programlisting>
1827
1828 </para>
1829
1830 <para>
1831 Operationally, in a dictionary-passing implementation, the
1832 constructors <function>Baz1</function> and <function>Baz2</function> must store the
1833 dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and
1834 extract it on pattern matching.
1835 </para>
1836
1837 <para>
1838 Notice the way that the syntax fits smoothly with that used for
1839 universal quantification earlier.
1840 </para>
1841
1842 </sect2>
1843
1844 <sect2>
1845 <title>Restrictions</title>
1846
1847 <para>
1848 There are several restrictions on the ways in which existentially-quantified
1849 constructors can be use.
1850 </para>
1851
1852 <para>
1853
1854 <itemizedlist>
1855 <listitem>
1856
1857 <para>
1858  When pattern matching, each pattern match introduces a new,
1859 distinct, type for each existential type variable.  These types cannot
1860 be unified with any other type, nor can they escape from the scope of
1861 the pattern match.  For example, these fragments are incorrect:
1862
1863
1864 <programlisting>
1865 f1 (MkFoo a f) = a
1866 </programlisting>
1867
1868
1869 Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal>
1870 is the result of <function>f1</function>.  One way to see why this is wrong is to
1871 ask what type <function>f1</function> has:
1872
1873
1874 <programlisting>
1875   f1 :: Foo -> a             -- Weird!
1876 </programlisting>
1877
1878
1879 What is this "<literal>a</literal>" in the result type? Clearly we don't mean
1880 this:
1881
1882
1883 <programlisting>
1884   f1 :: forall a. Foo -> a   -- Wrong!
1885 </programlisting>
1886
1887
1888 The original program is just plain wrong.  Here's another sort of error
1889
1890
1891 <programlisting>
1892   f2 (Baz1 a b) (Baz1 p q) = a==q
1893 </programlisting>
1894
1895
1896 It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but
1897 <literal>a==q</literal> is wrong because it equates the two distinct types arising
1898 from the two <function>Baz1</function> constructors.
1899
1900
1901 </para>
1902 </listitem>
1903 <listitem>
1904
1905 <para>
1906 You can't pattern-match on an existentially quantified
1907 constructor in a <literal>let</literal> or <literal>where</literal> group of
1908 bindings. So this is illegal:
1909
1910
1911 <programlisting>
1912   f3 x = a==b where { Baz1 a b = x }
1913 </programlisting>
1914
1915
1916 You can only pattern-match
1917 on an existentially-quantified constructor in a <literal>case</literal> expression or
1918 in the patterns of a function definition.
1919
1920 The reason for this restriction is really an implementation one.
1921 Type-checking binding groups is already a nightmare without
1922 existentials complicating the picture.  Also an existential pattern
1923 binding at the top level of a module doesn't make sense, because it's
1924 not clear how to prevent the existentially-quantified type "escaping".
1925 So for now, there's a simple-to-state restriction.  We'll see how
1926 annoying it is.
1927
1928 </para>
1929 </listitem>
1930 <listitem>
1931
1932 <para>
1933 You can't use existential quantification for <literal>newtype</literal>
1934 declarations.  So this is illegal:
1935
1936
1937 <programlisting>
1938   newtype T = forall a. Ord a => MkT a
1939 </programlisting>
1940
1941
1942 Reason: a value of type <literal>T</literal> must be represented as a pair
1943 of a dictionary for <literal>Ord t</literal> and a value of type <literal>t</literal>.
1944 That contradicts the idea that <literal>newtype</literal> should have no
1945 concrete representation.  You can get just the same efficiency and effect
1946 by using <literal>data</literal> instead of <literal>newtype</literal>.  If there is no
1947 overloading involved, then there is more of a case for allowing
1948 an existentially-quantified <literal>newtype</literal>, because the <literal>data</literal>
1949 because the <literal>data</literal> version does carry an implementation cost,
1950 but single-field existentially quantified constructors aren't much
1951 use.  So the simple restriction (no existential stuff on <literal>newtype</literal>)
1952 stands, unless there are convincing reasons to change it.
1953
1954
1955 </para>
1956 </listitem>
1957 <listitem>
1958
1959 <para>
1960  You can't use <literal>deriving</literal> to define instances of a
1961 data type with existentially quantified data constructors.
1962
1963 Reason: in most cases it would not make sense. For example:&num;
1964
1965 <programlisting>
1966 data T = forall a. MkT [a] deriving( Eq )
1967 </programlisting>
1968
1969 To derive <literal>Eq</literal> in the standard way we would need to have equality
1970 between the single component of two <function>MkT</function> constructors:
1971
1972 <programlisting>
1973 instance Eq T where
1974   (MkT a) == (MkT b) = ???
1975 </programlisting>
1976
1977 But <VarName>a</VarName> and <VarName>b</VarName> have distinct types, and so can't be compared.
1978 It's just about possible to imagine examples in which the derived instance
1979 would make sense, but it seems altogether simpler simply to prohibit such
1980 declarations.  Define your own instances!
1981 </para>
1982 </listitem>
1983
1984 </itemizedlist>
1985
1986 </para>
1987
1988 </sect2>
1989
1990 </sect1>
1991
1992 <sect1 id="sec-assertions">
1993 <title>Assertions
1994 <indexterm><primary>Assertions</primary></indexterm>
1995 </title>
1996
1997 <para>
1998 If you want to make use of assertions in your standard Haskell code, you
1999 could define a function like the following:
2000 </para>
2001
2002 <para>
2003
2004 <programlisting>
2005 assert :: Bool -> a -> a
2006 assert False x = error "assertion failed!"
2007 assert _     x = x
2008 </programlisting>
2009
2010 </para>
2011
2012 <para>
2013 which works, but gives you back a less than useful error message --
2014 an assertion failed, but which and where?
2015 </para>
2016
2017 <para>
2018 One way out is to define an extended <function>assert</function> function which also
2019 takes a descriptive string to include in the error message and
2020 perhaps combine this with the use of a pre-processor which inserts
2021 the source location where <function>assert</function> was used.
2022 </para>
2023
2024 <para>
2025 Ghc offers a helping hand here, doing all of this for you. For every
2026 use of <function>assert</function> in the user's source:
2027 </para>
2028
2029 <para>
2030
2031 <programlisting>
2032 kelvinToC :: Double -> Double
2033 kelvinToC k = assert (k &gt;= 0.0) (k+273.15)
2034 </programlisting>
2035
2036 </para>
2037
2038 <para>
2039 Ghc will rewrite this to also include the source location where the
2040 assertion was made,
2041 </para>
2042
2043 <para>
2044
2045 <programlisting>
2046 assert pred val ==> assertError "Main.hs|15" pred val
2047 </programlisting>
2048
2049 </para>
2050
2051 <para>
2052 The rewrite is only performed by the compiler when it spots
2053 applications of <function>Exception.assert</function>, so you can still define and
2054 use your own versions of <function>assert</function>, should you so wish. If not,
2055 import <literal>Exception</literal> to make use <function>assert</function> in your code.
2056 </para>
2057
2058 <para>
2059 To have the compiler ignore uses of assert, use the compiler option
2060 <option>-fignore-asserts</option>. <indexterm><primary>-fignore-asserts option</primary></indexterm> That is,
2061 expressions of the form <literal>assert pred e</literal> will be rewritten to <literal>e</literal>.
2062 </para>
2063
2064 <para>
2065 Assertion failures can be caught, see the documentation for the
2066 <literal>Exception</literal> library (<xref linkend="sec-Exception">)
2067 for the details.
2068 </para>
2069
2070 </sect1>
2071
2072 <sect1 id="scoped-type-variables">
2073 <title>Scoped Type Variables
2074 </title>
2075
2076 <para>
2077 A <emphasis>pattern type signature</emphasis> can introduce a <emphasis>scoped type
2078 variable</emphasis>.  For example
2079 </para>
2080
2081 <para>
2082
2083 <programlisting>
2084 f (xs::[a]) = ys ++ ys
2085            where
2086               ys :: [a]
2087               ys = reverse xs
2088 </programlisting>
2089
2090 </para>
2091
2092 <para>
2093 The pattern <literal>(xs::[a])</literal> includes a type signature for <VarName>xs</VarName>.
2094 This brings the type variable <literal>a</literal> into scope; it scopes over
2095 all the patterns and right hand sides for this equation for <function>f</function>.
2096 In particular, it is in scope at the type signature for <VarName>y</VarName>.
2097 </para>
2098
2099 <para>
2100 At ordinary type signatures, such as that for <VarName>ys</VarName>, any type variables
2101 mentioned in the type signature <emphasis>that are not in scope</emphasis> are
2102 implicitly universally quantified.  (If there are no type variables in
2103 scope, all type variables mentioned in the signature are universally
2104 quantified, which is just as in Haskell 98.)  In this case, since <VarName>a</VarName>
2105 is in scope, it is not universally quantified, so the type of <VarName>ys</VarName> is
2106 the same as that of <VarName>xs</VarName>.  In Haskell 98 it is not possible to declare
2107 a type for <VarName>ys</VarName>; a major benefit of scoped type variables is that
2108 it becomes possible to do so.
2109 </para>
2110
2111 <para>
2112 Scoped type variables are implemented in both GHC and Hugs.  Where the
2113 implementations differ from the specification below, those differences
2114 are noted.
2115 </para>
2116
2117 <para>
2118 So much for the basic idea.  Here are the details.
2119 </para>
2120
2121 <sect2>
2122 <title>Scope and implicit quantification</title>
2123
2124 <para>
2125
2126 <itemizedlist>
2127 <listitem>
2128
2129 <para>
2130  All the type variables mentioned in the patterns for a single
2131 function definition equation, that are not already in scope,
2132 are brought into scope by the patterns.  We describe this set as
2133 the <emphasis>type variables bound by the equation</emphasis>.
2134
2135 </para>
2136 </listitem>
2137 <listitem>
2138
2139 <para>
2140  The type variables thus brought into scope may be mentioned
2141 in ordinary type signatures or pattern type signatures anywhere within
2142 their scope.
2143
2144 </para>
2145 </listitem>
2146 <listitem>
2147
2148 <para>
2149  In ordinary type signatures, any type variable mentioned in the
2150 signature that is in scope is <emphasis>not</emphasis> universally quantified.
2151
2152 </para>
2153 </listitem>
2154 <listitem>
2155
2156 <para>
2157  Ordinary type signatures do not bring any new type variables
2158 into scope (except in the type signature itself!). So this is illegal:
2159
2160
2161 <programlisting>
2162   f :: a -> a
2163   f x = x::a
2164 </programlisting>
2165
2166
2167 It's illegal because <VarName>a</VarName> is not in scope in the body of <function>f</function>,
2168 so the ordinary signature <literal>x::a</literal> is equivalent to <literal>x::forall a.a</literal>;
2169 and that is an incorrect typing.
2170
2171 </para>
2172 </listitem>
2173 <listitem>
2174
2175 <para>
2176  There is no implicit universal quantification on pattern type
2177 signatures, nor may one write an explicit <literal>forall</literal> type in a pattern
2178 type signature.  The pattern type signature is a monotype.
2179
2180 </para>
2181 </listitem>
2182 <listitem>
2183
2184 <para>
2185
2186 The type variables in the head of a <literal>class</literal> or <literal>instance</literal> declaration
2187 scope over the methods defined in the <literal>where</literal> part.  For example:
2188
2189
2190 <programlisting>
2191   class C a where
2192     op :: [a] -> a
2193
2194     op xs = let ys::[a]
2195                 ys = reverse xs
2196             in
2197             head ys
2198 </programlisting>
2199
2200
2201 (Not implemented in Hugs yet, Dec 98).
2202 </para>
2203 </listitem>
2204
2205 </itemizedlist>
2206
2207 </para>
2208
2209 </sect2>
2210
2211 <sect2>
2212 <title>Polymorphism</title>
2213
2214 <para>
2215
2216 <itemizedlist>
2217 <listitem>
2218
2219 <para>
2220  Pattern type signatures are completely orthogonal to ordinary, separate
2221 type signatures.  The two can be used independently or together.  There is
2222 no scoping associated with the names of the type variables in a separate type signature.
2223
2224
2225 <programlisting>
2226    f :: [a] -> [a]
2227    f (xs::[b]) = reverse xs
2228 </programlisting>
2229
2230
2231 </para>
2232 </listitem>
2233 <listitem>
2234
2235 <para>
2236  The function must be polymorphic in the type variables
2237 bound by all its equations.  Operationally, the type variables bound
2238 by one equation must not:
2239
2240
2241 <itemizedlist>
2242 <listitem>
2243
2244 <para>
2245  Be unified with a type (such as <literal>Int</literal>, or <literal>[a]</literal>).
2246 </para>
2247 </listitem>
2248 <listitem>
2249
2250 <para>
2251  Be unified with a type variable free in the environment.
2252 </para>
2253 </listitem>
2254 <listitem>
2255
2256 <para>
2257  Be unified with each other.  (They may unify with the type variables
2258 bound by another equation for the same function, of course.)
2259 </para>
2260 </listitem>
2261
2262 </itemizedlist>
2263
2264
2265 For example, the following all fail to type check:
2266
2267
2268 <programlisting>
2269   f (x::a) (y::b) = [x,y]       -- a unifies with b
2270
2271   g (x::a) = x + 1::Int         -- a unifies with Int
2272
2273   h x = let k (y::a) = [x,y]    -- a is free in the
2274         in k x                  -- environment
2275
2276   k (x::a) True    = ...        -- a unifies with Int
2277   k (x::Int) False = ...
2278
2279   w :: [b] -> [b]
2280   w (x::a) = x                  -- a unifies with [b]
2281 </programlisting>
2282
2283
2284 </para>
2285 </listitem>
2286 <listitem>
2287
2288 <para>
2289  The pattern-bound type variable may, however, be constrained
2290 by the context of the principal type, thus:
2291
2292
2293 <programlisting>
2294   f (x::a) (y::a) = x+y*2
2295 </programlisting>
2296
2297
2298 gets the inferred type: <literal>forall a. Num a =&gt; a -&gt; a -&gt; a</literal>.
2299 </para>
2300 </listitem>
2301
2302 </itemizedlist>
2303
2304 </para>
2305
2306 </sect2>
2307
2308 <sect2>
2309 <title>Result type signatures</title>
2310
2311 <para>
2312
2313 <itemizedlist>
2314 <listitem>
2315
2316 <para>
2317  The result type of a function can be given a signature,
2318 thus:
2319
2320
2321 <programlisting>
2322   f (x::a) :: [a] = [x,x,x]
2323 </programlisting>
2324
2325
2326 The final <literal>:: [a]</literal> after all the patterns gives a signature to the
2327 result type.  Sometimes this is the only way of naming the type variable
2328 you want:
2329
2330
2331 <programlisting>
2332   f :: Int -> [a] -> [a]
2333   f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
2334                         in \xs -> map g (reverse xs `zip` xs)
2335 </programlisting>
2336
2337
2338 </para>
2339 </listitem>
2340
2341 </itemizedlist>
2342
2343 </para>
2344
2345 <para>
2346 Result type signatures are not yet implemented in Hugs.
2347 </para>
2348
2349 </sect2>
2350
2351 <sect2>
2352 <title>Pattern signatures on other constructs</title>
2353
2354 <para>
2355
2356 <itemizedlist>
2357 <listitem>
2358
2359 <para>
2360  A pattern type signature can be on an arbitrary sub-pattern, not
2361 just on a variable:
2362
2363
2364 <programlisting>
2365   f ((x,y)::(a,b)) = (y,x) :: (b,a)
2366 </programlisting>
2367
2368
2369 </para>
2370 </listitem>
2371 <listitem>
2372
2373 <para>
2374  Pattern type signatures, including the result part, can be used
2375 in lambda abstractions:
2376
2377
2378 <programlisting>
2379   (\ (x::a, y) :: a -> x)
2380 </programlisting>
2381
2382
2383 Type variables bound by these patterns must be polymorphic in
2384 the sense defined above.
2385 For example:
2386
2387
2388 <programlisting>
2389   f1 (x::c) = f1 x      -- ok
2390   f2 = \(x::c) -> f2 x  -- not ok
2391 </programlisting>
2392
2393
2394 Here, <function>f1</function> is OK, but <function>f2</function> is not, because <VarName>c</VarName> gets unified
2395 with a type variable free in the environment, in this
2396 case, the type of <function>f2</function>, which is in the environment when
2397 the lambda abstraction is checked.
2398
2399 </para>
2400 </listitem>
2401 <listitem>
2402
2403 <para>
2404  Pattern type signatures, including the result part, can be used
2405 in <literal>case</literal> expressions:
2406
2407
2408 <programlisting>
2409   case e of { (x::a, y) :: a -> x }
2410 </programlisting>
2411
2412
2413 The pattern-bound type variables must, as usual,
2414 be polymorphic in the following sense: each case alternative,
2415 considered as a lambda abstraction, must be polymorphic.
2416 Thus this is OK:
2417
2418
2419 <programlisting>
2420   case (True,False) of { (x::a, y) -> x }
2421 </programlisting>
2422
2423
2424 Even though the context is that of a pair of booleans,
2425 the alternative itself is polymorphic.  Of course, it is
2426 also OK to say:
2427
2428
2429 <programlisting>
2430   case (True,False) of { (x::Bool, y) -> x }
2431 </programlisting>
2432
2433
2434 </para>
2435 </listitem>
2436 <listitem>
2437
2438 <para>
2439 To avoid ambiguity, the type after the &ldquo;<literal>::</literal>&rdquo; in a result
2440 pattern signature on a lambda or <literal>case</literal> must be atomic (i.e. a single
2441 token or a parenthesised type of some sort).  To see why,
2442 consider how one would parse this:
2443
2444
2445 <programlisting>
2446   \ x :: a -> b -> x
2447 </programlisting>
2448
2449
2450 </para>
2451 </listitem>
2452 <listitem>
2453
2454 <para>
2455  Pattern type signatures that bind new type variables
2456 may not be used in pattern bindings at all.
2457 So this is illegal:
2458
2459
2460 <programlisting>
2461   f x = let (y, z::a) = x in ...
2462 </programlisting>
2463
2464
2465 But these are OK, because they do not bind fresh type variables:
2466
2467
2468 <programlisting>
2469   f1 x            = let (y, z::Int) = x in ...
2470   f2 (x::(Int,a)) = let (y, z::a)   = x in ...
2471 </programlisting>
2472
2473
2474 However a single variable is considered a degenerate function binding,
2475 rather than a degerate pattern binding, so this is permitted, even
2476 though it binds a type variable:
2477
2478
2479 <programlisting>
2480   f :: (b->b) = \(x::b) -> x
2481 </programlisting>
2482
2483
2484 </para>
2485 </listitem>
2486
2487 </itemizedlist>
2488
2489 Such degnerate function bindings do not fall under the monomorphism
2490 restriction.  Thus:
2491 </para>
2492
2493 <para>
2494
2495 <programlisting>
2496   g :: a -> a -> Bool = \x y. x==y
2497 </programlisting>
2498
2499 </para>
2500
2501 <para>
2502 Here <function>g</function> has type <literal>forall a. Eq a =&gt; a -&gt; a -&gt; Bool</literal>, just as if
2503 <function>g</function> had a separate type signature.  Lacking a type signature, <function>g</function>
2504 would get a monomorphic type.
2505 </para>
2506
2507 </sect2>
2508
2509 <sect2>
2510 <title>Existentials</title>
2511
2512 <para>
2513
2514 <itemizedlist>
2515 <listitem>
2516
2517 <para>
2518  Pattern type signatures can bind existential type variables.
2519 For example:
2520
2521
2522 <programlisting>
2523   data T = forall a. MkT [a]
2524
2525   f :: T -> T
2526   f (MkT [t::a]) = MkT t3
2527                  where
2528                    t3::[a] = [t,t,t]
2529 </programlisting>
2530
2531
2532 </para>
2533 </listitem>
2534
2535 </itemizedlist>
2536
2537 </para>
2538
2539 </sect2>
2540
2541 </sect1>
2542
2543 <sect1 id="pragmas">
2544 <title>Pragmas
2545 </title>
2546
2547 <para>
2548 GHC supports several pragmas, or instructions to the compiler placed
2549 in the source code.  Pragmas don't affect the meaning of the program,
2550 but they might affect the efficiency of the generated code.
2551 </para>
2552
2553 <sect2 id="inline-pragma">
2554 <title>INLINE pragma
2555
2556 <indexterm><primary>INLINE pragma</primary></indexterm>
2557 <indexterm><primary>pragma, INLINE</primary></indexterm></title>
2558
2559 <para>
2560 GHC (with <option>-O</option>, as always) tries to inline (or &ldquo;unfold&rdquo;)
2561 functions/values that are &ldquo;small enough,&rdquo; thus avoiding the call
2562 overhead and possibly exposing other more-wonderful optimisations.
2563 </para>
2564
2565 <para>
2566 You will probably see these unfoldings (in Core syntax) in your
2567 interface files.
2568 </para>
2569
2570 <para>
2571 Normally, if GHC decides a function is &ldquo;too expensive&rdquo; to inline, it
2572 will not do so, nor will it export that unfolding for other modules to
2573 use.
2574 </para>
2575
2576 <para>
2577 The sledgehammer you can bring to bear is the
2578 <literal>INLINE</literal><indexterm><primary>INLINE pragma</primary></indexterm> pragma, used thusly:
2579
2580 <programlisting>
2581 key_function :: Int -> String -> (Bool, Double)
2582
2583 #ifdef __GLASGOW_HASKELL__
2584 {-# INLINE key_function #-}
2585 #endif
2586 </programlisting>
2587
2588 (You don't need to do the C pre-processor carry-on unless you're going
2589 to stick the code through HBC&mdash;it doesn't like <literal>INLINE</literal> pragmas.)
2590 </para>
2591
2592 <para>
2593 The major effect of an <literal>INLINE</literal> pragma is to declare a function's
2594 &ldquo;cost&rdquo; to be very low.  The normal unfolding machinery will then be
2595 very keen to inline it.
2596 </para>
2597
2598 <para>
2599 An <literal>INLINE</literal> pragma for a function can be put anywhere its type
2600 signature could be put.
2601 </para>
2602
2603 <para>
2604 <literal>INLINE</literal> pragmas are a particularly good idea for the
2605 <literal>then</literal>/<literal>return</literal> (or <literal>bind</literal>/<literal>unit</literal>) functions in a monad.
2606 For example, in GHC's own <literal>UniqueSupply</literal> monad code, we have:
2607
2608 <programlisting>
2609 #ifdef __GLASGOW_HASKELL__
2610 {-# INLINE thenUs #-}
2611 {-# INLINE returnUs #-}
2612 #endif
2613 </programlisting>
2614
2615 </para>
2616
2617 </sect2>
2618
2619 <sect2 id="noinline-pragma">
2620 <title>NOINLINE pragma
2621 </title>
2622
2623 <para>
2624 <indexterm><primary>NOINLINE pragma</primary></indexterm>
2625 <indexterm><primary>pragma, NOINLINE</primary></indexterm>
2626 </para>
2627
2628 <para>
2629 The <literal>NOINLINE</literal> pragma does exactly what you'd expect: it stops the
2630 named function from being inlined by the compiler.  You shouldn't ever
2631 need to do this, unless you're very cautious about code size.
2632 </para>
2633
2634 </sect2>
2635
2636     <sect2 id="specialize-pragma">
2637       <title>SPECIALIZE pragma</title>
2638
2639       <indexterm><primary>SPECIALIZE pragma</primary></indexterm>
2640       <indexterm><primary>pragma, SPECIALIZE</primary></indexterm>
2641       <indexterm><primary>overloading, death to</primary></indexterm>
2642
2643       <para>(UK spelling also accepted.)  For key overloaded
2644       functions, you can create extra versions (NB: more code space)
2645       specialised to particular types.  Thus, if you have an
2646       overloaded function:</para>
2647
2648 <programlisting>
2649 hammeredLookup :: Ord key => [(key, value)] -> key -> value
2650 </programlisting>
2651
2652       <para>If it is heavily used on lists with
2653       <literal>Widget</literal> keys, you could specialise it as
2654       follows:</para>
2655
2656 <programlisting>
2657 {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
2658 </programlisting>
2659
2660       <para>To get very fancy, you can also specify a named function
2661       to use for the specialised value, as in:</para>
2662
2663 <programlisting>
2664 {-# RULES hammeredLookup = blah #-}
2665 </programlisting>
2666
2667       <para>where <literal>blah</literal> is an implementation of
2668       <literal>hammerdLookup</literal> written specialy for
2669       <literal>Widget</literal> lookups.  It's <emphasis>Your
2670       Responsibility</emphasis> to make sure that
2671       <function>blah</function> really behaves as a specialised
2672       version of <function>hammeredLookup</function>!!!</para>
2673
2674       <para>Note we use the <literal>RULE</literal> pragma here to
2675       indicate that <literal>hammeredLookup</literal> applied at a
2676       certain type should be replaced by <literal>blah</literal>.  See
2677       <xref linkend="rules"> for more information on
2678       <literal>RULES</literal>.</para>
2679
2680       <para>An example in which using <literal>RULES</literal> for
2681       specialisation will Win Big:
2682
2683 <programlisting>
2684 toDouble :: Real a => a -> Double
2685 toDouble = fromRational . toRational
2686
2687 {-# SPECIALIZE toDouble :: Int -> Double = i2d #-}
2688 i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
2689 </programlisting>
2690
2691       The <function>i2d</function> function is virtually one machine
2692       instruction; the default conversion&mdash;via an intermediate
2693       <literal>Rational</literal>&mdash;is obscenely expensive by
2694       comparison.</para>
2695
2696       <para>A <literal>SPECIALIZE</literal> pragma for a function can
2697       be put anywhere its type signature could be put.</para>
2698
2699     </sect2>
2700
2701 <sect2 id="specialize-instance-pragma">
2702 <title>SPECIALIZE instance pragma
2703 </title>
2704
2705 <para>
2706 <indexterm><primary>SPECIALIZE pragma</primary></indexterm>
2707 <indexterm><primary>overloading, death to</primary></indexterm>
2708 Same idea, except for instance declarations.  For example:
2709
2710 <programlisting>
2711 instance (Eq a) => Eq (Foo a) where { ... usual stuff ... }
2712
2713 {-# SPECIALIZE instance Eq (Foo [(Int, Bar)] #-}
2714 </programlisting>
2715
2716 Compatible with HBC, by the way.
2717 </para>
2718
2719 </sect2>
2720
2721 <sect2 id="line-pragma">
2722 <title>LINE pragma
2723 </title>
2724
2725 <para>
2726 <indexterm><primary>LINE pragma</primary></indexterm>
2727 <indexterm><primary>pragma, LINE</primary></indexterm>
2728 </para>
2729
2730 <para>
2731 This pragma is similar to C's <literal>&num;line</literal> pragma, and is mainly for use in
2732 automatically generated Haskell code.  It lets you specify the line
2733 number and filename of the original code; for example
2734 </para>
2735
2736 <para>
2737
2738 <programlisting>
2739 {-# LINE 42 "Foo.vhs" #-}
2740 </programlisting>
2741
2742 </para>
2743
2744 <para>
2745 if you'd generated the current file from something called <filename>Foo.vhs</filename>
2746 and this line corresponds to line 42 in the original.  GHC will adjust
2747 its error messages to refer to the line/file named in the <literal>LINE</literal>
2748 pragma.
2749 </para>
2750
2751 </sect2>
2752
2753 <sect2 id="rules">
2754 <title>RULES pragma</title>
2755
2756 <para>
2757 The RULES pragma lets you specify rewrite rules.  It is described in
2758 <xref LinkEnd="rewrite-rules">.
2759 </para>
2760
2761 </sect2>
2762
2763 </sect1>
2764
2765 <sect1 id="rewrite-rules">
2766 <title>Rewrite rules
2767
2768 <indexterm><primary>RULES pagma</primary></indexterm>
2769 <indexterm><primary>pragma, RULES</primary></indexterm>
2770 <indexterm><primary>rewrite rules</primary></indexterm></title>
2771
2772 <para>
2773 The programmer can specify rewrite rules as part of the source program
2774 (in a pragma).  GHC applies these rewrite rules wherever it can.
2775 </para>
2776
2777 <para>
2778 Here is an example:
2779
2780 <programlisting>
2781   {-# RULES
2782         "map/map"       forall f g xs. map f (map g xs) = map (f.g) xs
2783   #-}
2784 </programlisting>
2785
2786 </para>
2787
2788 <sect2>
2789 <title>Syntax</title>
2790
2791 <para>
2792 From a syntactic point of view:
2793
2794 <itemizedlist>
2795 <listitem>
2796
2797 <para>
2798  Each rule has a name, enclosed in double quotes.  The name itself has
2799 no significance at all.  It is only used when reporting how many times the rule fired.
2800 </para>
2801 </listitem>
2802 <listitem>
2803
2804 <para>
2805  There may be zero or more rules in a <literal>RULES</literal> pragma.
2806 </para>
2807 </listitem>
2808 <listitem>
2809
2810 <para>
2811  Layout applies in a <literal>RULES</literal> pragma.  Currently no new indentation level
2812 is set, so you must lay out your rules starting in the same column as the
2813 enclosing definitions.
2814 </para>
2815 </listitem>
2816 <listitem>
2817
2818 <para>
2819  Each variable mentioned in a rule must either be in scope (e.g. <function>map</function>),
2820 or bound by the <literal>forall</literal> (e.g. <function>f</function>, <function>g</function>, <function>xs</function>).  The variables bound by
2821 the <literal>forall</literal> are called the <emphasis>pattern</emphasis> variables.  They are separated
2822 by spaces, just like in a type <literal>forall</literal>.
2823 </para>
2824 </listitem>
2825 <listitem>
2826
2827 <para>
2828  A pattern variable may optionally have a type signature.
2829 If the type of the pattern variable is polymorphic, it <emphasis>must</emphasis> have a type signature.
2830 For example, here is the <literal>foldr/build</literal> rule:
2831
2832 <programlisting>
2833 "fold/build"  forall k z (g::forall b. (a->b->b) -> b -> b) .
2834               foldr k z (build g) = g k z
2835 </programlisting>
2836
2837 Since <function>g</function> has a polymorphic type, it must have a type signature.
2838
2839 </para>
2840 </listitem>
2841 <listitem>
2842
2843 <para>
2844 The left hand side of a rule must consist of a top-level variable applied
2845 to arbitrary expressions.  For example, this is <emphasis>not</emphasis> OK:
2846
2847 <programlisting>
2848 "wrong1"   forall e1 e2.  case True of { True -> e1; False -> e2 } = e1
2849 "wrong2"   forall f.      f True = True
2850 </programlisting>
2851
2852 In <literal>"wrong1"</literal>, the LHS is not an application; in <literal>"wrong2"</literal>, the LHS has a pattern variable
2853 in the head.
2854 </para>
2855 </listitem>
2856 <listitem>
2857
2858 <para>
2859  A rule does not need to be in the same module as (any of) the
2860 variables it mentions, though of course they need to be in scope.
2861 </para>
2862 </listitem>
2863 <listitem>
2864
2865 <para>
2866  Rules are automatically exported from a module, just as instance declarations are.
2867 </para>
2868 </listitem>
2869
2870 </itemizedlist>
2871
2872 </para>
2873
2874 </sect2>
2875
2876 <sect2>
2877 <title>Semantics</title>
2878
2879 <para>
2880 From a semantic point of view:
2881
2882 <itemizedlist>
2883 <listitem>
2884
2885 <para>
2886 Rules are only applied if you use the <option>-O</option> flag.
2887 </para>
2888 </listitem>
2889
2890 <listitem>
2891 <para>
2892  Rules are regarded as left-to-right rewrite rules.
2893 When GHC finds an expression that is a substitution instance of the LHS
2894 of a rule, it replaces the expression by the (appropriately-substituted) RHS.
2895 By "a substitution instance" we mean that the LHS can be made equal to the
2896 expression by substituting for the pattern variables.
2897
2898 </para>
2899 </listitem>
2900 <listitem>
2901
2902 <para>
2903  The LHS and RHS of a rule are typechecked, and must have the
2904 same type.
2905
2906 </para>
2907 </listitem>
2908 <listitem>
2909
2910 <para>
2911  GHC makes absolutely no attempt to verify that the LHS and RHS
2912 of a rule have the same meaning.  That is undecideable in general, and
2913 infeasible in most interesting cases.  The responsibility is entirely the programmer's!
2914
2915 </para>
2916 </listitem>
2917 <listitem>
2918
2919 <para>
2920  GHC makes no attempt to make sure that the rules are confluent or
2921 terminating.  For example:
2922
2923 <programlisting>
2924   "loop"        forall x,y.  f x y = f y x
2925 </programlisting>
2926
2927 This rule will cause the compiler to go into an infinite loop.
2928
2929 </para>
2930 </listitem>
2931 <listitem>
2932
2933 <para>
2934  If more than one rule matches a call, GHC will choose one arbitrarily to apply.
2935
2936 </para>
2937 </listitem>
2938 <listitem>
2939 <para>
2940  GHC currently uses a very simple, syntactic, matching algorithm
2941 for matching a rule LHS with an expression.  It seeks a substitution
2942 which makes the LHS and expression syntactically equal modulo alpha
2943 conversion.  The pattern (rule), but not the expression, is eta-expanded if
2944 necessary.  (Eta-expanding the epression can lead to laziness bugs.)
2945 But not beta conversion (that's called higher-order matching).
2946 </para>
2947
2948 <para>
2949 Matching is carried out on GHC's intermediate language, which includes
2950 type abstractions and applications.  So a rule only matches if the
2951 types match too.  See <xref LinkEnd="rule-spec"> below.
2952 </para>
2953 </listitem>
2954 <listitem>
2955
2956 <para>
2957  GHC keeps trying to apply the rules as it optimises the program.
2958 For example, consider:
2959
2960 <programlisting>
2961   let s = map f
2962       t = map g
2963   in
2964   s (t xs)
2965 </programlisting>
2966
2967 The expression <literal>s (t xs)</literal> does not match the rule <literal>"map/map"</literal>, but GHC
2968 will substitute for <VarName>s</VarName> and <VarName>t</VarName>, giving an expression which does match.
2969 If <VarName>s</VarName> or <VarName>t</VarName> was (a) used more than once, and (b) large or a redex, then it would
2970 not be substituted, and the rule would not fire.
2971
2972 </para>
2973 </listitem>
2974 <listitem>
2975
2976 <para>
2977  In the earlier phases of compilation, GHC inlines <emphasis>nothing
2978 that appears on the LHS of a rule</emphasis>, because once you have substituted
2979 for something you can't match against it (given the simple minded
2980 matching).  So if you write the rule
2981
2982 <programlisting>
2983         "map/map"       forall f,g.  map f . map g = map (f.g)
2984 </programlisting>
2985
2986 this <emphasis>won't</emphasis> match the expression <literal>map f (map g xs)</literal>.
2987 It will only match something written with explicit use of ".".
2988 Well, not quite.  It <emphasis>will</emphasis> match the expression
2989
2990 <programlisting>
2991 wibble f g xs
2992 </programlisting>
2993
2994 where <function>wibble</function> is defined:
2995
2996 <programlisting>
2997 wibble f g = map f . map g
2998 </programlisting>
2999
3000 because <function>wibble</function> will be inlined (it's small).
3001
3002 Later on in compilation, GHC starts inlining even things on the
3003 LHS of rules, but still leaves the rules enabled.  This inlining
3004 policy is controlled by the per-simplification-pass flag <option>-finline-phase</option><emphasis>n</emphasis>.
3005
3006 </para>
3007 </listitem>
3008 <listitem>
3009
3010 <para>
3011  All rules are implicitly exported from the module, and are therefore
3012 in force in any module that imports the module that defined the rule, directly
3013 or indirectly.  (That is, if A imports B, which imports C, then C's rules are
3014 in force when compiling A.)  The situation is very similar to that for instance
3015 declarations.
3016 </para>
3017 </listitem>
3018
3019 </itemizedlist>
3020
3021 </para>
3022
3023 </sect2>
3024
3025 <sect2>
3026 <title>List fusion</title>
3027
3028 <para>
3029 The RULES mechanism is used to implement fusion (deforestation) of common list functions.
3030 If a "good consumer" consumes an intermediate list constructed by a "good producer", the
3031 intermediate list should be eliminated entirely.
3032 </para>
3033
3034 <para>
3035 The following are good producers:
3036
3037 <itemizedlist>
3038 <listitem>
3039
3040 <para>
3041  List comprehensions
3042 </para>
3043 </listitem>
3044 <listitem>
3045
3046 <para>
3047  Enumerations of <literal>Int</literal> and <literal>Char</literal> (e.g. <literal>['a'..'z']</literal>).
3048 </para>
3049 </listitem>
3050 <listitem>
3051
3052 <para>
3053  Explicit lists (e.g. <literal>[True, False]</literal>)
3054 </para>
3055 </listitem>
3056 <listitem>
3057
3058 <para>
3059  The cons constructor (e.g <literal>3:4:[]</literal>)
3060 </para>
3061 </listitem>
3062 <listitem>
3063
3064 <para>
3065  <function>++</function>
3066 </para>
3067 </listitem>
3068 <listitem>
3069
3070 <para>
3071  <function>map</function>
3072 </para>
3073 </listitem>
3074 <listitem>
3075
3076 <para>
3077  <function>filter</function>
3078 </para>
3079 </listitem>
3080 <listitem>
3081
3082 <para>
3083  <function>iterate</function>, <function>repeat</function>
3084 </para>
3085 </listitem>
3086 <listitem>
3087
3088 <para>
3089  <function>zip</function>, <function>zipWith</function>
3090 </para>
3091 </listitem>
3092
3093 </itemizedlist>
3094
3095 </para>
3096
3097 <para>
3098 The following are good consumers:
3099
3100 <itemizedlist>
3101 <listitem>
3102
3103 <para>
3104  List comprehensions
3105 </para>
3106 </listitem>
3107 <listitem>
3108
3109 <para>
3110  <function>array</function> (on its second argument)
3111 </para>
3112 </listitem>
3113 <listitem>
3114
3115 <para>
3116  <function>length</function>
3117 </para>
3118 </listitem>
3119 <listitem>
3120
3121 <para>
3122  <function>++</function> (on its first argument)
3123 </para>
3124 </listitem>
3125 <listitem>
3126
3127 <para>
3128  <function>map</function>
3129 </para>
3130 </listitem>
3131 <listitem>
3132
3133 <para>
3134  <function>filter</function>
3135 </para>
3136 </listitem>
3137 <listitem>
3138
3139 <para>
3140  <function>concat</function>
3141 </para>
3142 </listitem>
3143 <listitem>
3144
3145 <para>
3146  <function>unzip</function>, <function>unzip2</function>, <function>unzip3</function>, <function>unzip4</function>
3147 </para>
3148 </listitem>
3149 <listitem>
3150
3151 <para>
3152  <function>zip</function>, <function>zipWith</function> (but on one argument only; if both are good producers, <function>zip</function>
3153 will fuse with one but not the other)
3154 </para>
3155 </listitem>
3156 <listitem>
3157
3158 <para>
3159  <function>partition</function>
3160 </para>
3161 </listitem>
3162 <listitem>
3163
3164 <para>
3165  <function>head</function>
3166 </para>
3167 </listitem>
3168 <listitem>
3169
3170 <para>
3171  <function>and</function>, <function>or</function>, <function>any</function>, <function>all</function>
3172 </para>
3173 </listitem>
3174 <listitem>
3175
3176 <para>
3177  <function>sequence&lowbar;</function>
3178 </para>
3179 </listitem>
3180 <listitem>
3181
3182 <para>
3183  <function>msum</function>
3184 </para>
3185 </listitem>
3186 <listitem>
3187
3188 <para>
3189  <function>sortBy</function>
3190 </para>
3191 </listitem>
3192
3193 </itemizedlist>
3194
3195 </para>
3196
3197 <para>
3198 So, for example, the following should generate no intermediate lists:
3199
3200 <programlisting>
3201 array (1,10) [(i,i*i) | i &#60;- map (+ 1) [0..9]]
3202 </programlisting>
3203
3204 </para>
3205
3206 <para>
3207 This list could readily be extended; if there are Prelude functions that you use
3208 a lot which are not included, please tell us.
3209 </para>
3210
3211 <para>
3212 If you want to write your own good consumers or producers, look at the
3213 Prelude definitions of the above functions to see how to do so.
3214 </para>
3215
3216 </sect2>
3217
3218 <sect2 id="rule-spec">
3219 <title>Specialisation
3220 </title>
3221
3222 <para>
3223 Rewrite rules can be used to get the same effect as a feature
3224 present in earlier version of GHC:
3225
3226 <programlisting>
3227   {-# SPECIALIZE fromIntegral :: Int8 -> Int16 = int8ToInt16 #-}
3228 </programlisting>
3229
3230 This told GHC to use <function>int8ToInt16</function> instead of <function>fromIntegral</function> whenever
3231 the latter was called with type <literal>Int8 -&gt; Int16</literal>.  That is, rather than
3232 specialising the original definition of <function>fromIntegral</function> the programmer is
3233 promising that it is safe to use <function>int8ToInt16</function> instead.
3234 </para>
3235
3236 <para>
3237 This feature is no longer in GHC.  But rewrite rules let you do the
3238 same thing:
3239
3240 <programlisting>
3241 {-# RULES
3242   "fromIntegral/Int8/Int16" fromIntegral = int8ToInt16
3243 #-}
3244 </programlisting>
3245
3246 This slightly odd-looking rule instructs GHC to replace <function>fromIntegral</function>
3247 by <function>int8ToInt16</function> <emphasis>whenever the types match</emphasis>.  Speaking more operationally,
3248 GHC adds the type and dictionary applications to get the typed rule
3249
3250 <programlisting>
3251 forall (d1::Integral Int8) (d2::Num Int16) .
3252         fromIntegral Int8 Int16 d1 d2 = int8ToInt16
3253 </programlisting>
3254
3255 What is more,
3256 this rule does not need to be in the same file as fromIntegral,
3257 unlike the <literal>SPECIALISE</literal> pragmas which currently do (so that they
3258 have an original definition available to specialise).
3259 </para>
3260
3261 </sect2>
3262
3263 <sect2>
3264 <title>Controlling what's going on</title>
3265
3266 <para>
3267
3268 <itemizedlist>
3269 <listitem>
3270
3271 <para>
3272  Use <option>-ddump-rules</option> to see what transformation rules GHC is using.
3273 </para>
3274 </listitem>
3275 <listitem>
3276
3277 <para>
3278  Use <option>-ddump-simpl-stats</option> to see what rules are being fired.
3279 If you add <option>-dppr-debug</option> you get a more detailed listing.
3280 </para>
3281 </listitem>
3282 <listitem>
3283
3284 <para>
3285  The defintion of (say) <function>build</function> in <FileName>PrelBase.lhs</FileName> looks llike this:
3286
3287 <programlisting>
3288         build   :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
3289         {-# INLINE build #-}
3290         build g = g (:) []
3291 </programlisting>
3292
3293 Notice the <literal>INLINE</literal>!  That prevents <literal>(:)</literal> from being inlined when compiling
3294 <literal>PrelBase</literal>, so that an importing module will &ldquo;see&rdquo; the <literal>(:)</literal>, and can
3295 match it on the LHS of a rule.  <literal>INLINE</literal> prevents any inlining happening
3296 in the RHS of the <literal>INLINE</literal> thing.  I regret the delicacy of this.
3297
3298 </para>
3299 </listitem>
3300 <listitem>
3301
3302 <para>
3303  In <filename>ghc/lib/std/PrelBase.lhs</filename> look at the rules for <function>map</function> to
3304 see how to write rules that will do fusion and yet give an efficient
3305 program even if fusion doesn't happen.  More rules in <filename>PrelList.lhs</filename>.
3306 </para>
3307 </listitem>
3308
3309 </itemizedlist>
3310
3311 </para>
3312
3313 </sect2>
3314
3315 </sect1>
3316
3317 <sect1 id="generic-classes">
3318 <title>Generic classes</title>
3319
3320 <para>
3321 The ideas behind this extension are described in detail in "Derivable type classes",
3322 Ralf Hinze and Simon Peyton Jones, Haskell Workshop, Montreal Sept 2000, pp94-105.
3323 An example will give the idea:
3324 </para>
3325
3326 <programlisting>
3327   import Generics
3328
3329   class Bin a where
3330     toBin   :: a -> [Int]
3331     fromBin :: [Int] -> (a, [Int])
3332
3333     toBin {| Unit |}    Unit      = []
3334     toBin {| a :+: b |} (Inl x)   = 0 : toBin x
3335     toBin {| a :+: b |} (Inr y)   = 1 : toBin y
3336     toBin {| a :*: b |} (x :*: y) = toBin x ++ toBin y
3337
3338     fromBin {| Unit |}    bs      = (Unit, bs)
3339     fromBin {| a :+: b |} (0:bs)  = (Inl x, bs')    where (x,bs') = fromBin bs
3340     fromBin {| a :+: b |} (1:bs)  = (Inr y, bs')    where (y,bs') = fromBin bs
3341     fromBin {| a :*: b |} bs      = (x :*: y, bs'') where (x,bs' ) = fromBin bs
3342                                                           (y,bs'') = fromBin bs'
3343 </programlisting>
3344 <para>
3345 This class declaration explains how <literal>toBin</literal> and <literal>fromBin</literal>
3346 work for arbitrary data types.  They do so by giving cases for unit, product, and sum,
3347 which are defined thus in the library module <literal>Generics</literal>:
3348 </para>
3349 <programlisting>
3350   data Unit    = Unit
3351   data a :+: b = Inl a | Inr b
3352   data a :*: b = a :*: b
3353 </programlisting>
3354 <para>
3355 Now you can make a data type into an instance of Bin like this:
3356 <programlisting>
3357   instance (Bin a, Bin b) => Bin (a,b)
3358   instance Bin a => Bin [a]
3359 </programlisting>
3360 That is, just leave off the "where" clasuse.  Of course, you can put in the
3361 where clause and over-ride whichever methods you please.
3362 </para>
3363
3364     <sect2>
3365       <title> Using generics </title>
3366       <para>To use generics you need to</para>
3367       <itemizedlist>
3368         <listitem>
3369           <para>Use the <option>-fgenerics</option> flag.</para>
3370         </listitem>
3371         <listitem>
3372           <para>Import the module <literal>Generics</literal> from the
3373           <literal>lang</literal> package.  This import brings into
3374           scope the data types <literal>Unit</literal>,
3375           <literal>:*:</literal>, and <literal>:+:</literal>.  (You
3376           don't need this import if you don't mention these types
3377           explicitly; for example, if you are simply giving instance
3378           declarations.)</para>
3379         </listitem>
3380       </itemizedlist>
3381     </sect2>
3382
3383 <sect2> <title> Changes wrt the paper </title>
3384 <para>
3385 Note that the type constructors <literal>:+:</literal> and <literal>:*:</literal>
3386 can be written infix (indeed, you can now use
3387 any operator starting in a colon as an infix type constructor).  Also note that
3388 the type constructors are not exactly as in the paper (Unit instead of 1, etc).
3389 Finally, note that the syntax of the type patterns in the class declaration
3390 uses "<literal>{|</literal>" and "<literal>{|</literal>" brackets; curly braces
3391 alone would ambiguous when they appear on right hand sides (an extension we
3392 anticipate wanting).
3393 </para>
3394 </sect2>
3395
3396 <sect2> <title>Terminology and restrictions</title>
3397 <para>
3398 Terminology.  A "generic default method" in a class declaration
3399 is one that is defined using type patterns as above.
3400 A "polymorphic default method" is a default method defined as in Haskell 98.
3401 A "generic class declaration" is a class declaration with at least one
3402 generic default method.
3403 </para>
3404
3405 <para>
3406 Restrictions:
3407 <itemizedlist>
3408 <listitem>
3409 <para>
3410 Alas, we do not yet implement the stuff about constructor names and
3411 field labels.
3412 </para>
3413 </listitem>
3414
3415 <listitem>
3416 <para>
3417 A generic class can have only one parameter; you can't have a generic
3418 multi-parameter class.
3419 </para>
3420 </listitem>
3421
3422 <listitem>
3423 <para>
3424 A default method must be defined entirely using type patterns, or entirely
3425 without.  So this is illegal:
3426 <programlisting>
3427   class Foo a where
3428     op :: a -> (a, Bool)
3429     op {| Unit |} Unit = (Unit, True)
3430     op x               = (x,    False)
3431 </programlisting>
3432 However it is perfectly OK for some methods of a generic class to have
3433 generic default methods and others to have polymorphic default methods.
3434 </para>
3435 </listitem>
3436
3437 <listitem>
3438 <para>
3439 The type variable(s) in the type pattern for a generic method declaration
3440 scope over the right hand side.  So this is legal (note the use of the type variable ``p'' in a type signature on the right hand side:
3441 <programlisting>
3442   class Foo a where
3443     op :: a -> Bool
3444     op {| p :*: q |} (x :*: y) = op (x :: p)
3445     ...
3446 </programlisting>
3447 </para>
3448 </listitem>
3449
3450 <listitem>
3451 <para>
3452 The type patterns in a generic default method must take one of the forms:
3453 <programlisting>
3454        a :+: b
3455        a :*: b
3456        Unit
3457 </programlisting>
3458 where "a" and "b" are type variables.  Furthermore, all the type patterns for
3459 a single type constructor (<literal>:*:</literal>, say) must be identical; they
3460 must use the same type variables.  So this is illegal:
3461 <programlisting>
3462   class Foo a where
3463     op :: a -> Bool
3464     op {| a :+: b |} (Inl x) = True
3465     op {| p :+: q |} (Inr y) = False
3466 </programlisting>
3467 The type patterns must be identical, even in equations for different methods of the class.
3468 So this too is illegal:
3469 <programlisting>
3470   class Foo a where
3471     op1 :: a -> Bool
3472     op {| a :*: b |} (Inl x) = True
3473
3474     op2 :: a -> Bool
3475     op {| p :*: q |} (Inr y) = False
3476 </programlisting>
3477 (The reason for this restriction is that we gather all the equations for a particular type consructor
3478 into a single generic instance declaration.)
3479 </para>
3480 </listitem>
3481
3482 <listitem>
3483 <para>
3484 A generic method declaration must give a case for each of the three type constructors.
3485 </para>
3486 </listitem>
3487
3488 <listitem>
3489 <para>
3490 The type for a generic method can be built only from:
3491   <itemizedlist>
3492   <listitem> <para> Function arrows </para> </listitem>
3493   <listitem> <para> Type variables </para> </listitem>
3494   <listitem> <para> Tuples </para> </listitem>
3495   <listitem> <para> Arbitrary types not involving type variables </para> </listitem>
3496   </itemizedlist>
3497 Here are some example type signatures for generic methods:
3498 <programlisting>
3499     op1 :: a -> Bool
3500     op2 :: Bool -> (a,Bool)
3501     op3 :: [Int] -> a -> a
3502     op4 :: [a] -> Bool
3503 </programlisting>
3504 Here, op1, op2, op3 are OK, but op4 is rejected, because it has a type variable
3505 inside a list.
3506 </para>
3507 <para>
3508 This restriction is an implementation restriction: we just havn't got around to
3509 implementing the necessary bidirectional maps over arbitrary type constructors.
3510 It would be relatively easy to add specific type constructors, such as Maybe and list,
3511 to the ones that are allowed.</para>
3512 </listitem>
3513
3514 <listitem>
3515 <para>
3516 In an instance declaration for a generic class, the idea is that the compiler
3517 will fill in the methods for you, based on the generic templates.  However it can only
3518 do so if
3519   <itemizedlist>
3520   <listitem>
3521   <para>
3522   The instance type is simple (a type constructor applied to type variables, as in Haskell 98).
3523   </para>
3524   </listitem>
3525   <listitem>
3526   <para>
3527   No constructor of the instance type has unboxed fields.
3528   </para>
3529   </listitem>
3530   </itemizedlist>
3531 (Of course, these things can only arise if you are already using GHC extensions.)
3532 However, you can still give an instance declarations for types which break these rules,
3533 provided you give explicit code to override any generic default methods.
3534 </para>
3535 </listitem>
3536
3537 </itemizedlist>
3538 </para>
3539
3540 <para>
3541 The option <option>-ddump-deriv</option> dumps incomprehensible stuff giving details of
3542 what the compiler does with generic declarations.
3543 </para>
3544
3545 </sect2>
3546
3547 <sect2> <title> Another example </title>
3548 <para>
3549 Just to finish with, here's another example I rather like:
3550 <programlisting>
3551   class Tag a where
3552     nCons :: a -> Int
3553     nCons {| Unit |}    _ = 1
3554     nCons {| a :*: b |} _ = 1
3555     nCons {| a :+: b |} _ = nCons (bot::a) + nCons (bot::b)
3556
3557     tag :: a -> Int
3558     tag {| Unit |}    _       = 1
3559     tag {| a :*: b |} _       = 1
3560     tag {| a :+: b |} (Inl x) = tag x
3561     tag {| a :+: b |} (Inr y) = nCons (bot::a) + tag y
3562 </programlisting>
3563 </para>
3564 </sect2>
3565 </sect1>
3566
3567 <!-- Emacs stuff:
3568      ;;; Local Variables: ***
3569      ;;; mode: sgml ***
3570      ;;; sgml-parent-document: ("users_guide.sgml" "book" "chapter" "sect1") ***
3571      ;;; End: ***
3572  -->