ghc/docs/users_guide/glasgow_exts.sgml

   1 <Para>
   2 <IndexTerm><Primary>language, GHC</Primary></IndexTerm>
   3 <IndexTerm><Primary>extensions, GHC</Primary></IndexTerm>
   4 As with all known Haskell systems, GHC implements some extensions to
   5 the language.  To use them, you'll need to give a <Option>-fglasgow-exts</Option>
   6 <IndexTerm><Primary>-fglasgow-exts option</Primary></IndexTerm> option.
   7 </Para>
   8
   9 <Para>
  10 Virtually all of the Glasgow extensions serve to give you access to
  11 the underlying facilities with which we implement Haskell.  Thus, you
  12 can get at the Raw Iron, if you are willing to write some non-standard
  13 code at a more primitive level.  You need not be &ldquo;stuck&rdquo; on
  14 performance because of the implementation costs of Haskell's
  15 &ldquo;high-level&rdquo; features&mdash;you can always code &ldquo;under&rdquo; them.  In an extreme case, you can write all your time-critical code in C, and then just glue it together with Haskell!
  16 </Para>
  17
  18 <Para>
  19 Executive summary of our extensions:
  20 </Para>
  21
  22 <Para>
  23 <VariableList>
  24
  25 <VarListEntry>
  26 <Term>Unboxed types and primitive operations:</Term>
  27 <ListItem>
  28 <Para>
  29 You can get right down to the raw machine types and operations;
  30 included in this are &ldquo;primitive arrays&rdquo; (direct access to Big Wads
  31 of Bytes).  Please see <XRef LinkEnd="glasgow-unboxed"> and following.
  32 </Para>
  33 </ListItem>
  34 </VarListEntry>
  35
  36 <VarListEntry>
  37 <Term>Type system extensions:</Term>
  38 <ListItem>
  39 <Para> GHC supports a large number of extensions to Haskell's type
  40 system.  Specifically:
  41 </Para>
  42
  43 <VariableList>
  44 <VarListEntry>
  45 <Term>Multi-parameter type classes:</Term>
  46 <ListItem>
  47 <Para>
  48 <XRef LinkEnd="multi-param-type-classes">
  49 </Para>
  50 </ListItem>
  51 </VarListEntry>
  52
  53 <VarListEntry>
  54 <Term>Functional dependencies:</Term>
  55 <ListItem>
  56 <Para>
  57 <XRef LinkEnd="functional-dependencies">
  58 </Para>
  59 </ListItem>
  60 </VarListEntry>
  61
  62 <VarListEntry>
  63 <Term>Implicit parameters:</Term>
  64 <ListItem>
  65 <Para>
<XRef LinkEnd="implicit-parameters">
  67 </Para>
  68 </ListItem>
  69 </VarListEntry>
  70
  71 <VarListEntry>
  72 <Term>Local universal quantification:</Term>
  73 <ListItem>
  74 <Para>
  75 <XRef LinkEnd="universal-quantification">
  76 </Para>
  77 </ListItem>
  78 </VarListEntry>
  79
  80 <VarListEntry>
<Term>Existential quantification in data types:</Term>
  82 <ListItem>
  83 <Para>
  84 <XRef LinkEnd="existential-quantification">
  85 </Para>
  86 </ListItem>
  87 </VarListEntry>
  88
  89 <VarListEntry>
  90 <Term>Scoped type variables:</Term>
  91 <ListItem>
  92 <Para>
  93 Scoped type variables enable the programmer to supply type signatures
  94 for some nested declarations, where this would not be legal in Haskell
  95 98.  Details in <XRef LinkEnd="scoped-type-variables">.
  96 </Para>
  97 </ListItem>
  98 </VarListEntry>
</VariableList>
</ListItem>
</VarListEntry>
 100
 101
 102 <VarListEntry>
<Term>Pattern guards:</Term>
 104 <ListItem>
 105 <Para>
 106 Instead of being a boolean expression, a guard is a list of qualifiers, exactly as in a list comprehension. See <XRef LinkEnd="pattern-guards">.
 107 </Para>
 108 </ListItem>
 109 </VarListEntry>
 110
 111 <VarListEntry>
 112 <Term>Foreign calling:</Term>
 113 <ListItem>
 114 <Para>
 115 Just what it sounds like.  We provide <Emphasis>lots</Emphasis> of rope that you
 116 can dangle around your neck.  Please see <XRef LinkEnd="ffi">.
 117 </Para>
 118 </ListItem>
 119 </VarListEntry>
 120
 121 <VarListEntry>
<Term>Pragmas:</Term>
 123 <ListItem>
 124 <Para>
 125 Pragmas are special instructions to the compiler placed in the source
 126 file.  The pragmas GHC supports are described in <XRef LinkEnd="pragmas">.
 127 </Para>
 128 </ListItem>
 129 </VarListEntry>
 130
 131 <VarListEntry>
 132 <Term>Rewrite rules:</Term>
 133 <ListItem>
 134 <Para>
 135 The programmer can specify rewrite rules as part of the source program
 136 (in a pragma).  GHC applies these rewrite rules wherever it can.
 137 Details in <XRef LinkEnd="rewrite-rules">.
 138 </Para>
 139 </ListItem>
 140 </VarListEntry>
 141
 142 <VarListEntry>
 143 <Term>Generic classes:</Term>
 144 <ListItem>
 145 <Para>
 146 Generic class declarations allow you to define a class
 147 whose methods say how to work over an arbitrary data type.
 148 Then it's really easy to make any new type into an instance of
 149 the class.  This generalises the rather ad-hoc "deriving" feature
 150 of Haskell 98.
 151 Details in <XRef LinkEnd="generic-classes">.
 152 </Para>
 153 </ListItem>
 154 </VarListEntry>
 155 </VariableList>
 156 </Para>
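
<Para>
As a taste of the pattern guards mentioned in the summary above, the
following minimal sketch adds the values bound to two keys in an
association list.  The name <Literal>addLookup</Literal> and the
association-list environment are invented for illustration; with the
GHC described here the module would be compiled with
<Option>-fglasgow-exts</Option>.
</Para>

<ProgramListing>
-- Hypothetical helper: add the values bound to two keys, if both exist.
-- Each qualifier "pat &#60;- expr" must match for the clause to be chosen,
-- and variables bound by one qualifier are in scope in the later ones.
addLookup :: [(String, Int)] -> String -> String -> Maybe Int
addLookup env k1 k2
  | Just v1 &#60;- lookup k1 env
  , Just v2 &#60;- lookup k2 env
  = Just (v1 + v2)
  | otherwise
  = Nothing
</ProgramListing>

<Para>
If either lookup fails, the first guard fails as a whole and the
<Literal>otherwise</Literal> clause is tried.
</Para>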
 157
 158 <Para>
 159 Before you get too carried away working at the lowest level (e.g.,
 160 sloshing <Literal>MutableByteArray&num;</Literal>s around your
 161 program), you may wish to check if there are libraries that provide a
 162 &ldquo;Haskellised veneer&rdquo; over the features you want.  See
 163 <xref linkend="book-hslibs">.
 164 </Para>
 165
 166   <sect1 id="options-language">
 167     <title>Language options</title>
 168
 169     <indexterm><primary>language</primary><secondary>option</secondary>
 170     </indexterm>
 171     <indexterm><primary>options</primary><secondary>language</secondary>
 172     </indexterm>
 173     <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
 174     </indexterm>
 175
    <para> These flags control which variations of the language are
    permitted.  Leaving out all of them gives you standard Haskell
    98.</Para>
 179
 180     <variablelist>
 181
 182       <varlistentry>
 183         <term><option>-fglasgow-exts</option>:</term>
 184         <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
 185         <listitem>
 186           <para>This simultaneously enables all of the extensions to
 187           Haskell 98 described in <xref
 188           linkend="ghc-language-features">, except where otherwise
 189           noted. </para>
 190         </listitem>
 191       </varlistentry>
 192
 193       <varlistentry>
 194         <term><option>-fno-monomorphism-restriction</option>:</term>
 195         <indexterm><primary><option>-fno-monomorphism-restriction</option></primary></indexterm>
 196         <listitem>
 197           <para> Switch off the Haskell 98 monomorphism restriction.
 198           Independent of the <Option>-fglasgow-exts</Option>
 199           flag. </para>
 200         </listitem>
 201       </varlistentry>
 202
 203       <varlistentry>
 204         <term><option>-fallow-overlapping-instances</option></term>
 205         <term><option>-fallow-undecidable-instances</option></term>
 206         <term><option>-fcontext-stack</option></term>
 207         <indexterm><primary><option>-fallow-overlapping-instances</option></primary></indexterm>
 208         <indexterm><primary><option>-fallow-undecidable-instances</option></primary></indexterm>
 209         <indexterm><primary><option>-fcontext-stack</option></primary></indexterm>
 210         <listitem>
 211           <para> See <XRef LinkEnd="instance-decls">.  Only relevant
 212           if you also use <option>-fglasgow-exts</option>.</para>
 213         </listitem>
 214       </varlistentry>
 215
 216       <varlistentry>
 217         <term><option>-fignore-asserts</option>:</term>
 218         <indexterm><primary><option>-fignore-asserts</option></primary></indexterm>
 219         <listitem>
 220           <para>See <XRef LinkEnd="sec-assertions">.  Only relevant if
 221           you also use <option>-fglasgow-exts</option>.</Para>
 222         </listitem>
 223       </varlistentry>
 224
 225       <varlistentry>
 226         <term><option>-finline-phase</option></term>
 227         <indexterm><primary><option>-finline-phase</option></primary></indexterm>
 228         <listitem>
 229           <para>See <XRef LinkEnd="rewrite-rules">.  Only relevant if
 230           you also use <Option>-fglasgow-exts</Option>.</para>
 231         </listitem>
 232       </varlistentry>
 233
 234       <varlistentry>
 235         <term><option>-fgenerics</option></term>
 236         <indexterm><primary><option>-fgenerics</option></primary></indexterm>
 237         <listitem>
 238           <para>See <XRef LinkEnd="generic-classes">.  Independent of
 239           <Option>-fglasgow-exts</Option>.</para>
 240         </listitem>
 241       </varlistentry>
 242
 243         <varlistentry>
 244           <term><option>-fno-implicit-prelude</option></term>
 245           <listitem>
 246             <para><indexterm><primary>-fno-implicit-prelude
 247             option</primary></indexterm> GHC normally imports
 248             <filename>Prelude.hi</filename> files for you.  If you'd
 249             rather it didn't, then give it a
 250             <option>-fno-implicit-prelude</option> option.  The idea
 251             is that you can then import a Prelude of your own.  (But
 252             don't call it <literal>Prelude</literal>; the Haskell
 253             module namespace is flat, and you must not conflict with
 254             any Prelude module.)</para>
 255
 256             <para>Even though you have not imported the Prelude, all
 257             the built-in syntax still refers to the built-in Haskell
 258             Prelude types and values, as specified by the Haskell
 259             Report.  For example, the type <literal>[Int]</literal>
 260             still means <literal>Prelude.[] Int</literal>; tuples
 261             continue to refer to the standard Prelude tuples; the
 262             translation for list comprehensions continues to use
 263             <literal>Prelude.map</literal> etc.</para>
 264
 265             <para> With one group of exceptions!  You may want to
 266             define your own numeric class hierarchy.  It completely
 267             defeats that purpose if the literal "1" means
 268             "<literal>Prelude.fromInteger 1</literal>", which is what
 269             the Haskell Report specifies.  So the
 270             <option>-fno-implicit-prelude</option> flag causes the
 271             following pieces of built-in syntax to refer to whatever
 272             is in scope, not the Prelude versions:</para>
 273
 274             <itemizedlist>
 275               <listitem>
 276                 <para>Integer and fractional literals mean
 277                 "<literal>fromInteger 1</literal>" and
 278                 "<literal>fromRational 3.2</literal>", not the
 279                 Prelude-qualified versions; both in expressions and in
 280                 patterns.</para>
 281               </listitem>
 282
 283               <listitem>
 284                 <para>Negation (e.g. "<literal>- (f x)</literal>")
 285                 means "<literal>negate (f x)</literal>" (not
 286                 <literal>Prelude.negate</literal>).</para>
 287               </listitem>
 288
 289               <listitem>
 290                 <para>In an n+k pattern, the standard Prelude
 291                 <literal>Ord</literal> class is used for comparison,
 292                 but the necessary subtraction uses whatever
 293                 "<literal>(-)</literal>" is in scope (not
 294                 "<literal>Prelude.(-)</literal>").</para>
 295               </listitem>
 296             </itemizedlist>
 297
 298           </listitem>
 299         </varlistentry>
 300
 301     </variablelist>
 302   </sect1>
 303
 304 <Sect1 id="primitives">
 305 <Title>Unboxed types and primitive operations
 306 </Title>
 307 <IndexTerm><Primary>PrelGHC module</Primary></IndexTerm>
 308
 309 <Para>
 310 This module defines all the types which are primitive in Glasgow
 311 Haskell, and the operations provided for them.
 312 </Para>
 313
 314 <Sect2 id="glasgow-unboxed">
 315 <Title>Unboxed types
 316 </Title>
 317
 318 <Para>
 319 <IndexTerm><Primary>Unboxed types (Glasgow extension)</Primary></IndexTerm>
 320 </Para>
 321
<para>Most types in GHC are <firstterm>boxed</firstterm>, which means
that values of that type are represented by a pointer to a heap
object.  The representation of a Haskell <literal>Int</literal>, for
example, is a two-word heap object.  An <firstterm>unboxed</firstterm>
type, however, is represented by the value itself; no pointers or heap
allocation are involved.
</para>
 329
 330 <Para>
 331 Unboxed types correspond to the &ldquo;raw machine&rdquo; types you
 332 would use in C: <Literal>Int&num;</Literal> (long int),
 333 <Literal>Double&num;</Literal> (double), <Literal>Addr&num;</Literal>
 334 (void *), etc.  The <Emphasis>primitive operations</Emphasis>
 335 (PrimOps) on these types are what you might expect; e.g.,
 336 <Literal>(+&num;)</Literal> is addition on
 337 <Literal>Int&num;</Literal>s, and is the machine-addition that we all
 338 know and love&mdash;usually one instruction.
 339 </Para>
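
<Para>
To make this concrete, here is a minimal sketch of arithmetic on
<Literal>Int&num;</Literal>.  It assumes a recent GHC, in which the
<Literal>MagicHash</Literal> language pragma and the
<Literal>GHC.Exts</Literal> module stand in for
<Option>-fglasgow-exts</Option> and <Literal>PrelGHC</Literal>; both
are assumptions about your installation rather than something this
guide specifies, and <Literal>squarePlus</Literal> is a made-up
example function.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Int (I#), (+#), (*#))

-- Hypothetical example: compute x*x + y on raw machine integers.
-- Pattern matching on I# exposes the Int# inside a boxed Int;
-- applying I# to the result boxes it again.
squarePlus :: Int -> Int -> Int
squarePlus (I# x) (I# y) = I# ((x *# x) +# y)
</ProgramListing>

<Para>
All of the intermediate arithmetic happens on unboxed values; only the
final result is put back into a heap-allocated
<Literal>Int</Literal>.
</Para>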
 340
 341 <Para>
 342 Primitive (unboxed) types cannot be defined in Haskell, and are
 343 therefore built into the language and compiler.  Primitive types are
 344 always unlifted; that is, a value of a primitive type cannot be
 345 bottom.  We use the convention that primitive types, values, and
 346 operations have a <Literal>&num;</Literal> suffix.
 347 </Para>
 348
<Para>
Values of many primitive types, such as <Literal>Int&num;</Literal>,
<Literal>Float&num;</Literal> and <Literal>Double&num;</Literal>, are
represented by a simple bit-pattern.  But this is not necessarily the case:
a primitive value might be represented by a pointer to a
heap-allocated object.  Examples include
<Literal>Array&num;</Literal>, the type of primitive arrays.  A
primitive array is heap-allocated because it is too big a value to fit
in a register, and would be too expensive to copy around; in a sense,
it is accidental that it is represented by a pointer.  If a pointer
represents a primitive value, then it really does point to that value:
no unevaluated thunks, no indirections&hellip;nothing other than the
primitive value can be at the other end of the pointer.
</Para>
 363
 364 <Para>
 365 There are some restrictions on the use of primitive types, the main
 366 one being that you can't pass a primitive value to a polymorphic
 367 function or store one in a polymorphic data type.  This rules out
 368 things like <Literal>[Int&num;]</Literal> (i.e. lists of primitive
 369 integers).  The reason for this restriction is that polymorphic
 370 arguments and constructor fields are assumed to be pointers: if an
 371 unboxed integer is stored in one of these, the garbage collector would
 372 attempt to follow it, leading to unpredictable space leaks.  Or a
 373 <Function>seq</Function> operation on the polymorphic component may
 374 attempt to dereference the pointer, with disastrous results.  Even
 375 worse, the unboxed value might be larger than a pointer
 376 (<Literal>Double&num;</Literal> for instance).
 377 </Para>
 378
<Para>
Nevertheless, a numerically-intensive program using unboxed types can
go a <Emphasis>lot</Emphasis> faster than its &ldquo;standard&rdquo;
counterpart&mdash;we saw a threefold speedup on one example.
</Para>
 384
 385 </sect2>
 386
 387 <Sect2 id="unboxed-tuples">
 388 <Title>Unboxed Tuples
 389 </Title>
 390
 391 <Para>
Unboxed tuples aren't really exported by <Literal>PrelGHC</Literal>;
they're available by default with <Option>-fglasgow-exts</Option>.  An
unboxed tuple looks like this:
 395 </Para>
 396
 397 <Para>
 398
 399 <ProgramListing>
 400 (# e_1, ..., e_n #)
 401 </ProgramListing>
 402
 403 </Para>
 404
 405 <Para>
 406 where <Literal>e&lowbar;1..e&lowbar;n</Literal> are expressions of any
 407 type (primitive or non-primitive).  The type of an unboxed tuple looks
 408 the same.
 409 </Para>
 410
 411 <Para>
 412 Unboxed tuples are used for functions that need to return multiple
 413 values, but they avoid the heap allocation normally associated with
 414 using fully-fledged tuples.  When an unboxed tuple is returned, the
 415 components are put directly into registers or on the stack; the
 416 unboxed tuple itself does not have a composite representation.  Many
 417 of the primitive operations listed in this section return unboxed
 418 tuples.
 419 </Para>
 420
 421 <Para>
 422 There are some pretty stringent restrictions on the use of unboxed tuples:
 423 </Para>
 424
 425 <Para>
 426
 427 <ItemizedList>
 428 <ListItem>
 429
 430 <Para>
 431  Unboxed tuple types are subject to the same restrictions as
 432 other unboxed types; i.e. they may not be stored in polymorphic data
 433 structures or passed to polymorphic functions.
 434
 435 </Para>
 436 </ListItem>
 437 <ListItem>
 438
 439 <Para>
 Unboxed tuples may only be constructed as the direct result of
a function, and may only be deconstructed with a <Literal>case</Literal> expression.
For example, the following are valid:
 443
 444
 445 <ProgramListing>
 446 f x y = (# x+1, y-1 #)
 447 g x = case f x x of { (# a, b #) -&#62; a + b }
 448 </ProgramListing>
 449
 450
 451 but the following are invalid:
 452
 453
 454 <ProgramListing>
 455 f x y = g (# x, y #)
 456 g (# x, y #) = x + y
 457 </ProgramListing>
 458
 459
 460 </Para>
 461 </ListItem>
 462 <ListItem>
 463
 464 <Para>
 465  No variable can have an unboxed tuple type.  This is illegal:
 466
 467
 468 <ProgramListing>
 469 f :: (# Int, Int #) -&#62; (# Int, Int #)
 470 f x = x
 471 </ProgramListing>
 472
 473
 474 because <VarName>x</VarName> has an unboxed tuple type.
 475
 476 </Para>
 477 </ListItem>
 478
 479 </ItemizedList>
 480
 481 </Para>
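
<Para>
The restrictions above can be seen at work in a small sketch.  It
assumes a recent GHC, where the <Literal>UnboxedTuples</Literal>
language pragma takes the place of <Option>-fglasgow-exts</Option>;
the names <Literal>sumProd</Literal> and
<Literal>sumPlusProd</Literal> are invented for illustration.
</Para>

<ProgramListing>
{-# LANGUAGE UnboxedTuples #-}

-- Hypothetical example: return both the sum and the product of two Ints
-- without allocating a tuple for the result.
sumProd :: Int -> Int -> (# Int, Int #)
sumProd x y = (# x + y, x * y #)

-- The unboxed tuple is taken apart immediately with a case expression,
-- so no variable is ever bound to the tuple itself.
sumPlusProd :: Int -> Int -> Int
sumPlusProd x y = case sumProd x y of
  (# s, p #) -> s + p
</ProgramListing>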
 482
 483 <Para>
 484 Note: we may relax some of these restrictions in the future.
 485 </Para>
 486
 487 <Para>
 488 The <Literal>IO</Literal> and <Literal>ST</Literal> monads use unboxed
 489 tuples to avoid unnecessary allocation during sequences of operations.
 490 </Para>
 491
 492 </Sect2>
 493
 494 <Sect2>
 495 <Title>Character and numeric types</Title>
 496
 497 <IndexTerm><Primary>character types, primitive</Primary></IndexTerm>
 498 <IndexTerm><Primary>numeric types, primitive</Primary></IndexTerm>
 499 <IndexTerm><Primary>integer types, primitive</Primary></IndexTerm>
 500 <IndexTerm><Primary>floating point types, primitive</Primary></IndexTerm>
 501 <Para>
 502 There are the following obvious primitive types:
 503 </Para>
 504
 505 <ProgramListing>
 506 type Char#
 507 type Int#
 508 type Word#
 509 type Addr#
 510 type Float#
 511 type Double#
 512 type Int64#
 513 type Word64#
 514 </ProgramListing>
 515
 516 <IndexTerm><Primary><literal>Char&num;</literal></Primary></IndexTerm>
 517 <IndexTerm><Primary><literal>Int&num;</literal></Primary></IndexTerm>
 518 <IndexTerm><Primary><literal>Word&num;</literal></Primary></IndexTerm>
 519 <IndexTerm><Primary><literal>Addr&num;</literal></Primary></IndexTerm>
 520 <IndexTerm><Primary><literal>Float&num;</literal></Primary></IndexTerm>
 521 <IndexTerm><Primary><literal>Double&num;</literal></Primary></IndexTerm>
 522 <IndexTerm><Primary><literal>Int64&num;</literal></Primary></IndexTerm>
 523 <IndexTerm><Primary><literal>Word64&num;</literal></Primary></IndexTerm>
 524
 525 <Para>
 526 If you really want to know their exact equivalents in C, see
 527 <Filename>ghc/includes/StgTypes.h</Filename> in the GHC source tree.
 528 </Para>
 529
 530 <Para>
 531 Literals for these types may be written as follows:
 532 </Para>
 533
 534 <Para>
 535
 536 <ProgramListing>
 537 1#              an Int#
 538 1.2#            a Float#
 539 1.34##          a Double#
 540 'a'#            a Char#; for weird characters, use e.g. '\o&#60;octal&#62;'#
 541 "a"#            an Addr# (a `char *'); only characters '\0'..'\255' allowed
 542 </ProgramListing>
 543
 544 <IndexTerm><Primary>literals, primitive</Primary></IndexTerm>
 545 <IndexTerm><Primary>constants, primitive</Primary></IndexTerm>
 546 <IndexTerm><Primary>numbers, primitive</Primary></IndexTerm>
 547 </Para>
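
<Para>
A minimal sketch of these literals in use, boxing each one so that it
can be handed to ordinary Haskell code.  It assumes a recent GHC, where
the <Literal>MagicHash</Literal> pragma enables the literal syntax and
<Literal>GHC.Exts</Literal> exports the boxing constructors; both are
assumptions about your installation, and the binding names are
made up.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Char (C#), Double (D#), Float (F#), Int (I#))

-- Each primitive literal is boxed straight away by the matching
-- constructor, giving an ordinary (lifted) Haskell value.
boxedInt :: Int
boxedInt = I# 1#

boxedFloat :: Float
boxedFloat = F# 1.2#

boxedDouble :: Double
boxedDouble = D# 1.34##

boxedChar :: Char
boxedChar = C# 'a'#
</ProgramListing>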
 548
 549 </Sect2>
 550
 551 <Sect2>
 552 <Title>Comparison operations</Title>
 553
 554 <Para>
 555 <IndexTerm><Primary>comparisons, primitive</Primary></IndexTerm>
 556 <IndexTerm><Primary>operators, comparison</Primary></IndexTerm>
 557 </Para>
 558
 559 <Para>
 560
 561 <ProgramListing>
 562 {&#62;,&#62;=,==,/=,&#60;,&#60;=}# :: Int# -&#62; Int# -&#62; Bool
 563
 564 {gt,ge,eq,ne,lt,le}Char# :: Char# -&#62; Char# -&#62; Bool
 565     -- ditto for Word# and Addr#
 566 </ProgramListing>
 567
 568 <IndexTerm><Primary><literal>&#62;&num;</literal></Primary></IndexTerm>
 569 <IndexTerm><Primary><literal>&#62;=&num;</literal></Primary></IndexTerm>
 570 <IndexTerm><Primary><literal>==&num;</literal></Primary></IndexTerm>
 571 <IndexTerm><Primary><literal>/=&num;</literal></Primary></IndexTerm>
 572 <IndexTerm><Primary><literal>&#60;&num;</literal></Primary></IndexTerm>
 573 <IndexTerm><Primary><literal>&#60;=&num;</literal></Primary></IndexTerm>
 574 <IndexTerm><Primary><literal>gt&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 575 <IndexTerm><Primary><literal>ge&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 576 <IndexTerm><Primary><literal>eq&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 577 <IndexTerm><Primary><literal>ne&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 578 <IndexTerm><Primary><literal>lt&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 579 <IndexTerm><Primary><literal>le&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 580 </Para>
 581
 582 </Sect2>
 583
 584 <Sect2>
 585 <Title>Primitive-character operations</Title>
 586
 587 <Para>
 588 <IndexTerm><Primary>characters, primitive operations</Primary></IndexTerm>
 589 <IndexTerm><Primary>operators, primitive character</Primary></IndexTerm>
 590 </Para>
 591
 592 <Para>
 593
 594 <ProgramListing>
 595 ord# :: Char# -&#62; Int#
 596 chr# :: Int# -&#62; Char#
 597 </ProgramListing>
 598
 599 <IndexTerm><Primary><literal>ord&num;</literal></Primary></IndexTerm>
 600 <IndexTerm><Primary><literal>chr&num;</literal></Primary></IndexTerm>
 601 </Para>
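
<Para>
A short sketch combining the two operations, under the assumption of a
recent GHC with the <Literal>MagicHash</Literal> pragma and the
<Literal>GHC.Exts</Literal> module; <Literal>nextChar</Literal> is a
made-up name.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Char (C#), chr#, ord#, (+#))

-- Hypothetical example: the character whose code is one greater than
-- the argument's.  Like chr# itself, this does no range checking.
nextChar :: Char -> Char
nextChar (C# c) = C# (chr# (ord# c +# 1#))
</ProgramListing>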
 602
 603 </Sect2>
 604
 605 <Sect2>
 606 <Title>Primitive-<Literal>Int</Literal> operations</Title>
 607
 608 <Para>
 609 <IndexTerm><Primary>integers, primitive operations</Primary></IndexTerm>
 610 <IndexTerm><Primary>operators, primitive integer</Primary></IndexTerm>
 611 </Para>
 612
 613 <Para>
 614
 615 <ProgramListing>
 616 {+,-,*,quotInt,remInt,gcdInt}# :: Int# -&#62; Int# -&#62; Int#
 617 negateInt# :: Int# -&#62; Int#
 618
 619 iShiftL#, iShiftRA#, iShiftRL# :: Int# -&#62; Int# -&#62; Int#
 620         -- shift left, right arithmetic, right logical
 621
 622 addIntC#, subIntC#, mulIntC# :: Int# -> Int# -> (# Int#, Int# #)
 623         -- add, subtract, multiply with carry
 624 </ProgramListing>
 625
 626 <IndexTerm><Primary><literal>+&num;</literal></Primary></IndexTerm>
 627 <IndexTerm><Primary><literal>-&num;</literal></Primary></IndexTerm>
 628 <IndexTerm><Primary><literal>*&num;</literal></Primary></IndexTerm>
 629 <IndexTerm><Primary><literal>quotInt&num;</literal></Primary></IndexTerm>
 630 <IndexTerm><Primary><literal>remInt&num;</literal></Primary></IndexTerm>
 631 <IndexTerm><Primary><literal>gcdInt&num;</literal></Primary></IndexTerm>
 632 <IndexTerm><Primary><literal>iShiftL&num;</literal></Primary></IndexTerm>
 633 <IndexTerm><Primary><literal>iShiftRA&num;</literal></Primary></IndexTerm>
 634 <IndexTerm><Primary><literal>iShiftRL&num;</literal></Primary></IndexTerm>
 635 <IndexTerm><Primary><literal>addIntC&num;</literal></Primary></IndexTerm>
 636 <IndexTerm><Primary><literal>subIntC&num;</literal></Primary></IndexTerm>
 637 <IndexTerm><Primary><literal>mulIntC&num;</literal></Primary></IndexTerm>
 638 <IndexTerm><Primary>shift operations, integer</Primary></IndexTerm>
 639 </Para>
 640
 641 <Para>
 642 <Emphasis>Note:</Emphasis> No error/overflow checking!
 643 </Para>
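
<Para>
Since the plain operations do not check for overflow, the
carry-returning variants are the way to detect it yourself.  The
sketch below assumes a recent GHC (the <Literal>MagicHash</Literal> and
<Literal>UnboxedTuples</Literal> pragmas, and the
<Literal>GHC.Exts</Literal> module), and assumes that the second
component of the result is zero when no overflow occurred;
<Literal>checkedAdd</Literal> is a made-up name.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash, UnboxedTuples #-}

import GHC.Exts (Int (I#), addIntC#)

-- Hypothetical helper: addition that reports overflow instead of wrapping.
-- Assumption: the second component returned by addIntC# is zero exactly
-- when the true sum fits in an Int#.
checkedAdd :: Int -> Int -> Maybe Int
checkedAdd (I# a) (I# b) =
  case addIntC# a b of
    (# r, c #) -> case c of
      0# -> Just (I# r)
      _  -> Nothing
</ProgramListing>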
 644
 645 </Sect2>
 646
 647 <Sect2>
 648 <Title>Primitive-<Literal>Double</Literal> and <Literal>Float</Literal> operations</Title>
 649
 650 <Para>
 651 <IndexTerm><Primary>floating point numbers, primitive</Primary></IndexTerm>
 652 <IndexTerm><Primary>operators, primitive floating point</Primary></IndexTerm>
 653 </Para>
 654
 655 <Para>
 656
 657 <ProgramListing>
{+,-,*,/}##         :: Double# -&#62; Double# -&#62; Double#
{&#60;,&#60;=,==,/=,&#62;=,&#62;}## :: Double# -&#62; Double# -&#62; Bool
negateDouble#       :: Double# -&#62; Double#
double2Int#         :: Double# -&#62; Int#
int2Double#         :: Int#    -&#62; Double#

{plus,minus,times,divide}Float# :: Float# -&#62; Float# -&#62; Float#
{gt,ge,eq,ne,lt,le}Float# :: Float# -&#62; Float# -&#62; Bool
negateFloat#        :: Float# -&#62; Float#
float2Int#          :: Float# -&#62; Int#
int2Float#          :: Int#   -&#62; Float#
 669 </ProgramListing>
 670
 671 </Para>
 672
 673 <Para>
 674 <IndexTerm><Primary><literal>+&num;&num;</literal></Primary></IndexTerm>
 675 <IndexTerm><Primary><literal>-&num;&num;</literal></Primary></IndexTerm>
 676 <IndexTerm><Primary><literal>*&num;&num;</literal></Primary></IndexTerm>
 677 <IndexTerm><Primary><literal>/&num;&num;</literal></Primary></IndexTerm>
 678 <IndexTerm><Primary><literal>&#60;&num;&num;</literal></Primary></IndexTerm>
 679 <IndexTerm><Primary><literal>&#60;=&num;&num;</literal></Primary></IndexTerm>
 680 <IndexTerm><Primary><literal>==&num;&num;</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>/=&num;&num;</literal></Primary></IndexTerm>
 682 <IndexTerm><Primary><literal>&#62;=&num;&num;</literal></Primary></IndexTerm>
 683 <IndexTerm><Primary><literal>&#62;&num;&num;</literal></Primary></IndexTerm>
 684 <IndexTerm><Primary><literal>negateDouble&num;</literal></Primary></IndexTerm>
 685 <IndexTerm><Primary><literal>double2Int&num;</literal></Primary></IndexTerm>
 686 <IndexTerm><Primary><literal>int2Double&num;</literal></Primary></IndexTerm>
 687 </Para>
 688
 689 <Para>
 690 <IndexTerm><Primary><literal>plusFloat&num;</literal></Primary></IndexTerm>
 691 <IndexTerm><Primary><literal>minusFloat&num;</literal></Primary></IndexTerm>
 692 <IndexTerm><Primary><literal>timesFloat&num;</literal></Primary></IndexTerm>
 693 <IndexTerm><Primary><literal>divideFloat&num;</literal></Primary></IndexTerm>
 694 <IndexTerm><Primary><literal>gtFloat&num;</literal></Primary></IndexTerm>
 695 <IndexTerm><Primary><literal>geFloat&num;</literal></Primary></IndexTerm>
 696 <IndexTerm><Primary><literal>eqFloat&num;</literal></Primary></IndexTerm>
 697 <IndexTerm><Primary><literal>neFloat&num;</literal></Primary></IndexTerm>
 698 <IndexTerm><Primary><literal>ltFloat&num;</literal></Primary></IndexTerm>
 699 <IndexTerm><Primary><literal>leFloat&num;</literal></Primary></IndexTerm>
 700 <IndexTerm><Primary><literal>negateFloat&num;</literal></Primary></IndexTerm>
 701 <IndexTerm><Primary><literal>float2Int&num;</literal></Primary></IndexTerm>
 702 <IndexTerm><Primary><literal>int2Float&num;</literal></Primary></IndexTerm>
 703 </Para>
 704
<Para>
And a full complement of elementary functions (exponential,
logarithmic, trigonometric and hyperbolic):
</Para>
 708
 709 <Para>
 710
 711 <ProgramListing>
 712 expDouble#      :: Double# -&#62; Double#
 713 logDouble#      :: Double# -&#62; Double#
 714 sqrtDouble#     :: Double# -&#62; Double#
 715 sinDouble#      :: Double# -&#62; Double#
 716 cosDouble#      :: Double# -&#62; Double#
 717 tanDouble#      :: Double# -&#62; Double#
 718 asinDouble#     :: Double# -&#62; Double#
 719 acosDouble#     :: Double# -&#62; Double#
 720 atanDouble#     :: Double# -&#62; Double#
 721 sinhDouble#     :: Double# -&#62; Double#
 722 coshDouble#     :: Double# -&#62; Double#
 723 tanhDouble#     :: Double# -&#62; Double#
 724 powerDouble#    :: Double# -&#62; Double# -&#62; Double#
 725 </ProgramListing>
 726
 727 <IndexTerm><Primary>trigonometric functions, primitive</Primary></IndexTerm>
 728 </Para>
 729
 730 <Para>
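The same operations are provided for <Literal>Float&num;</Literal>,
with <Literal>Float</Literal> in place of <Literal>Double</Literal> in
each name.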
 732 </Para>
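
<Para>
The arithmetic and elementary-function primitives combine in the
obvious way.  Here is a minimal sketch, assuming a recent GHC with the
<Literal>MagicHash</Literal> pragma and the
<Literal>GHC.Exts</Literal> module; <Literal>hypot</Literal> is a
made-up example function.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Double (D#), sqrtDouble#, (*##), (+##))

-- Hypothetical example: length of the hypotenuse, computed entirely on
-- unboxed doubles and boxed only once at the end.
hypot :: Double -> Double -> Double
hypot (D# x) (D# y) = D# (sqrtDouble# ((x *## x) +## (y *## y)))
</ProgramListing>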
 733
 734 <Para>
 735 There are two coercion functions for <Literal>Float&num;</Literal>/<Literal>Double&num;</Literal>:
 736 </Para>
 737
 738 <Para>
 739
 740 <ProgramListing>
 741 float2Double#   :: Float# -&#62; Double#
 742 double2Float#   :: Double# -&#62; Float#
 743 </ProgramListing>
 744
 745 <IndexTerm><Primary><literal>float2Double&num;</literal></Primary></IndexTerm>
 746 <IndexTerm><Primary><literal>double2Float&num;</literal></Primary></IndexTerm>
 747 </Para>
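
<Para>
A sketch of the two coercions wrapped up as ordinary functions, again
assuming a recent GHC with the <Literal>MagicHash</Literal> pragma and
the <Literal>GHC.Exts</Literal> module; <Literal>widen</Literal> and
<Literal>narrow</Literal> are made-up names.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}

import GHC.Exts (Double (D#), Float (F#), double2Float#, float2Double#)

-- Hypothetical wrapper: convert a Float to a Double.
widen :: Float -> Double
widen (F# f) = D# (float2Double# f)

-- Hypothetical wrapper: convert a Double to a Float, which may lose precision.
narrow :: Double -> Float
narrow (D# d) = F# (double2Float# d)
</ProgramListing>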
 748
 749 <Para>
 750 The primitive version of <Function>decodeDouble</Function>
 751 (<Function>encodeDouble</Function> is implemented as an external C
 752 function):
 753 </Para>
 754
 755 <Para>
 756
 757 <ProgramListing>
 758 decodeDouble#   :: Double# -&#62; PrelNum.ReturnIntAndGMP
 759 </ProgramListing>
 760
 761 <IndexTerm><Primary><literal>encodeDouble&num;</literal></Primary></IndexTerm>
 762 <IndexTerm><Primary><literal>decodeDouble&num;</literal></Primary></IndexTerm>
 763 </Para>
 764
 765 <Para>
 766 (And the same for <Literal>Float&num;</Literal>s.)
 767 </Para>
 768
 769 </Sect2>
 770
 771 <Sect2 id="integer-operations">
<Title>Operations on/for <Literal>Integer</Literal>s (interface to GMP)
 773 </Title>
 774
 775 <Para>
 776 <IndexTerm><Primary>arbitrary precision integers</Primary></IndexTerm>
 777 <IndexTerm><Primary>Integer, operations on</Primary></IndexTerm>
 778 </Para>
 779
 780 <Para>
We implement <Literal>Integer</Literal>s (arbitrary-precision
integers) using the GNU multiple-precision (GMP) package (version
2.0.2).
 784 </Para>
 785
 786 <Para>
 787 The data type for <Literal>Integer</Literal> is either a small
 788 integer, represented by an <Literal>Int</Literal>, or a large integer
 789 represented using the pieces required by GMP's
 790 <Literal>MP&lowbar;INT</Literal> in <Filename>gmp.h</Filename> (see
 791 <Filename>gmp.info</Filename> in
 792 <Filename>ghc/includes/runtime/gmp</Filename>).  It comes out as:
 793 </Para>
 794
 795 <Para>
 796
 797 <ProgramListing>
 798 data Integer = S# Int#             -- small integers
 799              | J# Int# ByteArray#  -- large integers
 800 </ProgramListing>
 801
<IndexTerm><Primary>Integer type</Primary></IndexTerm> The primitive
ops to support large <Literal>Integer</Literal>s use the
&ldquo;pieces&rdquo; of the representation, and are as follows:
 805 </Para>
 806
 807 <Para>
 808
 809 <ProgramListing>
negateInteger#  :: Int# -&#62; ByteArray# -&#62; Integer

{plus,minus,times}Integer#, gcdInteger#,
  quotInteger#, remInteger#, divExactInteger#
        :: Int# -> ByteArray#
        -> Int# -> ByteArray#
        -> (# Int#, ByteArray# #)

cmpInteger#
        :: Int# -> ByteArray#
        -> Int# -> ByteArray#
        -> Int# -- -1 for &#60;; 0 for ==; +1 for >

cmpIntegerInt#
        :: Int# -> ByteArray#
        -> Int#
        -> Int# -- -1 for &#60;; 0 for ==; +1 for >

gcdIntegerInt#
        :: Int# -> ByteArray#
        -> Int#
        -> Int#

divModInteger#, quotRemInteger#
        :: Int# -> ByteArray#
        -> Int# -> ByteArray#
        -> (# Int#, ByteArray#,
                  Int#, ByteArray# #)

integer2Int# :: Int# -> ByteArray# -> Int#

int2Integer#  :: Int#  -> Integer -- NB: no error-checking on these two!
word2Integer# :: Word# -> Integer

addr2Integer# :: Addr# -> Integer
        -- the Addr# is taken to be a `char *' string
        -- to be converted into an Integer.
 847 </ProgramListing>
 848
 849 <IndexTerm><Primary><literal>negateInteger&num;</literal></Primary></IndexTerm>
 850 <IndexTerm><Primary><literal>plusInteger&num;</literal></Primary></IndexTerm>
 851 <IndexTerm><Primary><literal>minusInteger&num;</literal></Primary></IndexTerm>
 852 <IndexTerm><Primary><literal>timesInteger&num;</literal></Primary></IndexTerm>
 853 <IndexTerm><Primary><literal>quotInteger&num;</literal></Primary></IndexTerm>
 854 <IndexTerm><Primary><literal>remInteger&num;</literal></Primary></IndexTerm>
 855 <IndexTerm><Primary><literal>gcdInteger&num;</literal></Primary></IndexTerm>
 856 <IndexTerm><Primary><literal>gcdIntegerInt&num;</literal></Primary></IndexTerm>
 857 <IndexTerm><Primary><literal>divExactInteger&num;</literal></Primary></IndexTerm>
 858 <IndexTerm><Primary><literal>cmpInteger&num;</literal></Primary></IndexTerm>
 859 <IndexTerm><Primary><literal>divModInteger&num;</literal></Primary></IndexTerm>
 860 <IndexTerm><Primary><literal>quotRemInteger&num;</literal></Primary></IndexTerm>
 861 <IndexTerm><Primary><literal>integer2Int&num;</literal></Primary></IndexTerm>
 862 <IndexTerm><Primary><literal>int2Integer&num;</literal></Primary></IndexTerm>
 863 <IndexTerm><Primary><literal>word2Integer&num;</literal></Primary></IndexTerm>
 864 <IndexTerm><Primary><literal>addr2Integer&num;</literal></Primary></IndexTerm>
 865 </Para>
 866
 867 </Sect2>
 868
 869 <Sect2>
 870 <Title>Words and addresses</Title>
 871
 872 <Para>
 873 <IndexTerm><Primary>word, primitive type</Primary></IndexTerm>
 874 <IndexTerm><Primary>address, primitive type</Primary></IndexTerm>
 875 <IndexTerm><Primary>unsigned integer, primitive type</Primary></IndexTerm>
 876 <IndexTerm><Primary>pointer, primitive type</Primary></IndexTerm>
 877 </Para>
 878
 879 <Para>
 880 A <Literal>Word&num;</Literal> is used for bit-twiddling operations.
 881 It is the same size as an <Literal>Int&num;</Literal>, but has no sign
 882 nor any arithmetic operations.
 883
 884 <ProgramListing>
 885 type Word#      -- Same size/etc as Int# but *unsigned*
 886 type Addr#      -- A pointer from outside the "Haskell world" (from C, probably);
 887                 -- described under "arrays"
 888 </ProgramListing>
 889
 890 <IndexTerm><Primary><literal>Word&num;</literal></Primary></IndexTerm>
 891 <IndexTerm><Primary><literal>Addr&num;</literal></Primary></IndexTerm>
 892 </Para>
 893
 894 <Para>
 895 <Literal>Word&num;</Literal>s and <Literal>Addr&num;</Literal>s have
 896 the usual comparison operations.  Other
 897 unboxed-<Literal>Word</Literal> ops (bit-twiddling and coercions):
 898 </Para>
 899
 900 <Para>
 901
 902 <ProgramListing>
 903 {gt,ge,eq,ne,lt,le}Word# :: Word# -> Word# -> Bool
 904
 905 and#, or#, xor# :: Word# -> Word# -> Word#
 906         -- standard bit ops.
 907
 908 quotWord#, remWord# :: Word# -> Word# -> Word#
 909         -- word (i.e. unsigned) versions are different from int
 910         -- versions, so we have to provide these explicitly.
 911
 912 not# :: Word# -> Word#
 913
 914 shiftL#, shiftRL# :: Word# -> Int# -> Word#
 915         -- shift left, right logical
 916
 917 int2Word#       :: Int#  -> Word# -- just a cast, really
 918 word2Int#       :: Word# -> Int#
 919 </ProgramListing>
 920
 921 <IndexTerm><Primary>bit operations, Word and Addr</Primary></IndexTerm>
 922 <IndexTerm><Primary><literal>gtWord&num;</literal></Primary></IndexTerm>
 923 <IndexTerm><Primary><literal>geWord&num;</literal></Primary></IndexTerm>
 924 <IndexTerm><Primary><literal>eqWord&num;</literal></Primary></IndexTerm>
 925 <IndexTerm><Primary><literal>neWord&num;</literal></Primary></IndexTerm>
 926 <IndexTerm><Primary><literal>ltWord&num;</literal></Primary></IndexTerm>
 927 <IndexTerm><Primary><literal>leWord&num;</literal></Primary></IndexTerm>
 928 <IndexTerm><Primary><literal>and&num;</literal></Primary></IndexTerm>
 929 <IndexTerm><Primary><literal>or&num;</literal></Primary></IndexTerm>
 930 <IndexTerm><Primary><literal>xor&num;</literal></Primary></IndexTerm>
 931 <IndexTerm><Primary><literal>not&num;</literal></Primary></IndexTerm>
 932 <IndexTerm><Primary><literal>quotWord&num;</literal></Primary></IndexTerm>
 933 <IndexTerm><Primary><literal>remWord&num;</literal></Primary></IndexTerm>
 934 <IndexTerm><Primary><literal>shiftL&num;</literal></Primary></IndexTerm>
 935 <IndexTerm><Primary><literal>shiftRA&num;</literal></Primary></IndexTerm>
 936 <IndexTerm><Primary><literal>shiftRL&num;</literal></Primary></IndexTerm>
 937 <IndexTerm><Primary><literal>int2Word&num;</literal></Primary></IndexTerm>
 938 <IndexTerm><Primary><literal>word2Int&num;</literal></Primary></IndexTerm>
 939 </Para>
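<Para>
As a rough illustration of the bit-twiddling operations, here is a minimal sketch that extracts one byte from a boxed <Literal>Word</Literal>.  It assumes a recent GHC with the <Literal>MagicHash</Literal> extension and the <Literal>GHC.Exts</Literal> module, neither of which is described in this section; the names <Literal>WordBits</Literal> and <Function>byteAt</Function> are made up for the example.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}
module WordBits where

import GHC.Exts (Int (I#), Word (W#), and#, shiftRL#)

-- | Shift the word right (logically) by n bits and keep the low 8 bits.
-- byteAt is a hypothetical helper, not part of any GHC library.
byteAt :: Word -> Int -> Word
byteAt (W# w) (I# n) = W# ((w `shiftRL#` n) `and#` 255##)
</ProgramListing>

<Para>
With these definitions, <Literal>byteAt 0xABCD 8</Literal> should evaluate to <Literal>0xAB</Literal>.
</Para>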
 940
 941 <Para>
 942 Unboxed-<Literal>Addr</Literal> ops (C casts, really):
 943
 944 <ProgramListing>
 945 {gt,ge,eq,ne,lt,le}Addr# :: Addr# -> Addr# -> Bool
 946
 947 int2Addr#       :: Int#  -> Addr#
 948 addr2Int#       :: Addr# -> Int#
 949 addr2Integer#   :: Addr# -> (# Int#, ByteArray# #)
 950 </ProgramListing>
 951
 952 <IndexTerm><Primary><literal>gtAddr&num;</literal></Primary></IndexTerm>
 953 <IndexTerm><Primary><literal>geAddr&num;</literal></Primary></IndexTerm>
 954 <IndexTerm><Primary><literal>eqAddr&num;</literal></Primary></IndexTerm>
 955 <IndexTerm><Primary><literal>neAddr&num;</literal></Primary></IndexTerm>
 956 <IndexTerm><Primary><literal>ltAddr&num;</literal></Primary></IndexTerm>
 957 <IndexTerm><Primary><literal>leAddr&num;</literal></Primary></IndexTerm>
 958 <IndexTerm><Primary><literal>int2Addr&num;</literal></Primary></IndexTerm>
 959 <IndexTerm><Primary><literal>addr2Int&num;</literal></Primary></IndexTerm>
 960 <IndexTerm><Primary><literal>addr2Integer&num;</literal></Primary></IndexTerm>
 961 </Para>
 962
 963 <Para>
 964 The casts between <Literal>Int&num;</Literal>,
 965 <Literal>Word&num;</Literal> and <Literal>Addr&num;</Literal>
 966 correspond to null operations at the machine level, but are required
 967 to keep the Haskell type checker happy.
 968 </Para>
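<Para>
To make the &ldquo;null operation&rdquo; point concrete, the following sketch reinterprets the bits of a boxed <Literal>Int</Literal> as a <Literal>Word</Literal> and back.  It assumes a recent GHC with <Literal>MagicHash</Literal> and <Literal>GHC.Exts</Literal>; the function names <Function>intBits</Function> and <Function>wordBits</Function> are hypothetical.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}
module Casts where

import GHC.Exts (Int (I#), Word (W#), int2Word#, word2Int#)

-- | Reinterpret the bits of an Int as a Word; only the type changes.
intBits :: Int -> Word
intBits (I# i) = W# (int2Word# i)

-- | The reverse cast.
wordBits :: Word -> Int
wordBits (W# w) = I# (word2Int# w)
</ProgramListing>

<Para>
On a two's-complement machine, <Literal>intBits (-1)</Literal> should equal <Literal>maxBound</Literal>, and <Literal>wordBits maxBound</Literal> should equal <Literal>-1</Literal>.
</Para>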
 969
 970 <Para>
 971 Operations for indexing off of C pointers
 972 (<Literal>Addr&num;</Literal>s) to snatch values are listed under
 973 &ldquo;arrays&rdquo;.
 974 </Para>
 975
 976 </Sect2>
 977
 978 <Sect2>
 979 <Title>Arrays</Title>
 980
 981 <Para>
 982 <IndexTerm><Primary>arrays, primitive</Primary></IndexTerm>
 983 </Para>
 984
 985 <Para>
 986 The type <Literal>Array&num; elt</Literal> is the type of primitive,
 987 unpointed arrays of values of type <Literal>elt</Literal>.
 988 </Para>
 989
 990 <Para>
 991
 992 <ProgramListing>
 993 type Array# elt
 994 </ProgramListing>
 995
 996 <IndexTerm><Primary><literal>Array&num;</literal></Primary></IndexTerm>
 997 </Para>
 998
 999 <Para>
1000 <Literal>Array&num;</Literal> is more primitive than a Haskell
1001 array&mdash;indeed, the Haskell <Literal>Array</Literal> interface is
1002 implemented using <Literal>Array&num;</Literal>&mdash;in that an
1003 <Literal>Array&num;</Literal> is indexed only by
1004 <Literal>Int&num;</Literal>s, starting at zero.  It is also more
1005 primitive by virtue of being unboxed.  That doesn't mean that it isn't
1006 a heap-allocated object&mdash;of course, it is.  Rather, being unboxed
1007 means that it is represented by a pointer to the array itself, and not
1008 to a thunk which will evaluate to the array (or to bottom).  The
1009 components of an <Literal>Array&num;</Literal> are themselves boxed.
1010 </Para>
1011
1012 <Para>
1013 The type <Literal>ByteArray&num;</Literal> is similar to
1014 <Literal>Array&num;</Literal>, except that it contains just a string
1015 of (non-pointer) bytes.
1016 </Para>
1017
1018 <Para>
1019
1020 <ProgramListing>
1021 type ByteArray#
1022 </ProgramListing>
1023
1024 <IndexTerm><Primary><literal>ByteArray&num;</literal></Primary></IndexTerm>
1025 </Para>
1026
1027 <Para>
1028 Arrays of these types are useful when a Haskell program wishes to
1029 construct a value to pass to a C procedure. It is also possible to use
1030 them to build (say) arrays of unboxed characters for internal use in a
1031 Haskell program.  Given these uses, <Literal>ByteArray&num;</Literal>
1032 is deliberately a bit vague about the type of its components.
1033 Operations are provided to extract values of type
1034 <Literal>Char&num;</Literal>, <Literal>Int&num;</Literal>,
1035 <Literal>Float&num;</Literal>, <Literal>Double&num;</Literal>, and
1036 <Literal>Addr&num;</Literal> from arbitrary offsets within a
<Literal>ByteArray&num;</Literal>.  (For type
<Literal>Foo&num;</Literal>, the <Emphasis>i</Emphasis>th offset gets you the <Emphasis>i</Emphasis>th
<Literal>Foo&num;</Literal>, not the <Literal>Foo&num;</Literal> at
byte-position <Emphasis>i</Emphasis>.  Mumble.)  (If you want a
<Literal>Word&num;</Literal>, grab an <Literal>Int&num;</Literal>,
then coerce it.)
1043 </Para>
1044
1045 <Para>
Lastly, we have static byte-arrays, of type
<Literal>Addr&num;</Literal> (mentioned previously).  (Remember
the duality between arrays and pointers in C.)  Arrays of this type
1049 are represented by a pointer to an array in the world outside Haskell,
1050 so this pointer is not followed by the garbage collector.  In other
1051 respects they are just like <Literal>ByteArray&num;</Literal>.  They
1052 are only needed in order to pass values from C to Haskell.
1053 </Para>
1054
1055 </Sect2>
1056
1057 <Sect2>
1058 <Title>Reading and writing</Title>
1059
1060 <Para>
1061 Primitive arrays are linear, and indexed starting at zero.
1062 </Para>
1063
1064 <Para>
1065 The size and indices of a <Literal>ByteArray&num;</Literal>, <Literal>Addr&num;</Literal>, and
1066 <Literal>MutableByteArray&num;</Literal> are all in bytes.  It's up to the program to
1067 calculate the correct byte offset from the start of the array.  This
1068 allows a <Literal>ByteArray&num;</Literal> to contain a mixture of values of different
1069 type, which is often needed when preparing data for and unpicking
1070 results from C.  (Umm&hellip;not true of indices&hellip;WDP 95/09)
1071 </Para>
1072
1073 <Para>
1074 <Emphasis>Should we provide some <Literal>sizeOfDouble&num;</Literal> constants?</Emphasis>
1075 </Para>
1076
1077 <Para>
1078 Out-of-range errors on indexing should be caught by the code which
1079 uses the primitive operation; the primitive operations themselves do
1080 <Emphasis>not</Emphasis> check for out-of-range indexes. The intention is that the
1081 primitive ops compile to one machine instruction or thereabouts.
1082 </Para>
1083
1084 <Para>
1085 We use the terms &ldquo;reading&rdquo; and &ldquo;writing&rdquo; to refer to accessing
1086 <Emphasis>mutable</Emphasis> arrays (see <XRef LinkEnd="sect-mutable">), and
1087 &ldquo;indexing&rdquo; to refer to reading a value from an <Emphasis>immutable</Emphasis>
1088 array.
1089 </Para>
1090
1091 <Para>
1092 Immutable byte arrays are straightforward to index (all indices in bytes):
1093
1094 <ProgramListing>
1095 indexCharArray#   :: ByteArray# -> Int# -> Char#
1096 indexIntArray#    :: ByteArray# -> Int# -> Int#
1097 indexAddrArray#   :: ByteArray# -> Int# -> Addr#
1098 indexFloatArray#  :: ByteArray# -> Int# -> Float#
1099 indexDoubleArray# :: ByteArray# -> Int# -> Double#
1100
1101 indexCharOffAddr#   :: Addr# -> Int# -> Char#
1102 indexIntOffAddr#    :: Addr# -> Int# -> Int#
1103 indexFloatOffAddr#  :: Addr# -> Int# -> Float#
1104 indexDoubleOffAddr# :: Addr# -> Int# -> Double#
1105 indexAddrOffAddr#   :: Addr# -> Int# -> Addr#
1106  -- Get an Addr# from an Addr# offset
1107 </ProgramListing>
1108
1109 <IndexTerm><Primary><literal>indexCharArray&num;</literal></Primary></IndexTerm>
1110 <IndexTerm><Primary><literal>indexIntArray&num;</literal></Primary></IndexTerm>
1111 <IndexTerm><Primary><literal>indexAddrArray&num;</literal></Primary></IndexTerm>
1112 <IndexTerm><Primary><literal>indexFloatArray&num;</literal></Primary></IndexTerm>
1113 <IndexTerm><Primary><literal>indexDoubleArray&num;</literal></Primary></IndexTerm>
1114 <IndexTerm><Primary><literal>indexCharOffAddr&num;</literal></Primary></IndexTerm>
1115 <IndexTerm><Primary><literal>indexIntOffAddr&num;</literal></Primary></IndexTerm>
1116 <IndexTerm><Primary><literal>indexFloatOffAddr&num;</literal></Primary></IndexTerm>
1117 <IndexTerm><Primary><literal>indexDoubleOffAddr&num;</literal></Primary></IndexTerm>
1118 <IndexTerm><Primary><literal>indexAddrOffAddr&num;</literal></Primary></IndexTerm>
1119 </Para>
1120
1121 <Para>
1122 The last of these, <Function>indexAddrOffAddr&num;</Function>, extracts an <Literal>Addr&num;</Literal> using an offset
1123 from another <Literal>Addr&num;</Literal>, thereby providing the ability to follow a chain of
1124 C pointers.
1125 </Para>
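<Para>
Indexing off an <Literal>Addr&num;</Literal> can be sketched as follows.  The example assumes a recent GHC, where the <Literal>MagicHash</Literal> extension lets us write a static C string as the literal <Literal>"hello"&num;</Literal>; the function name <Function>nthOfGreeting</Function> is hypothetical.  As stated above, no bounds check is performed, so the caller must keep the index in range.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}
module OffAddr where

import GHC.Exts (Char (C#), Int (I#), indexCharOffAddr#)

-- | Return the n-th character of a static C string.
-- The string literal "hello"# has type Addr#.
nthOfGreeting :: Int -> Char
nthOfGreeting (I# n) = C# (indexCharOffAddr# "hello"# n)
</ProgramListing>

<Para>
Here <Literal>nthOfGreeting 1</Literal> should yield <Literal>'e'</Literal>.
</Para>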
1126
1127 <Para>
1128 Something a bit more interesting goes on when indexing arrays of boxed
1129 objects, because the result is simply the boxed object. So presumably
it should be entered&mdash;we do not normally return an unevaluated
1131 object!  This is a pain: primitive ops aren't supposed to do
1132 complicated things like enter objects.  The current solution is to
1133 return a single element unboxed tuple (see <XRef LinkEnd="unboxed-tuples">).
1134 </Para>
1135
1136 <Para>
1137
1138 <ProgramListing>
1139 indexArray#       :: Array# elt -> Int# -> (# elt #)
1140 </ProgramListing>
1141
1142 <IndexTerm><Primary><literal>indexArray&num;</literal></Primary></IndexTerm>
1143 </Para>
1144
1145 </Sect2>
1146
1147 <Sect2>
1148 <Title>The state type</Title>
1149
1150 <Para>
1151 <IndexTerm><Primary><literal>state, primitive type</literal></Primary></IndexTerm>
1152 <IndexTerm><Primary><literal>State&num;</literal></Primary></IndexTerm>
1153 </Para>
1154
1155 <Para>
1156 The primitive type <Literal>State&num;</Literal> represents the state of a state
1157 transformer.  It is parameterised on the desired type of state, which
1158 serves to keep states from distinct threads distinct from one another.
1159 But the <Emphasis>only</Emphasis> effect of this parameterisation is in the type
1160 system: all values of type <Literal>State&num;</Literal> are represented in the same way.
1161 Indeed, they are all represented by nothing at all!  The code
1162 generator &ldquo;knows&rdquo; to generate no code, and allocate no registers
1163 etc, for primitive states.
1164 </Para>
1165
1166 <Para>
1167
1168 <ProgramListing>
1169 type State# s
1170 </ProgramListing>
1171
1172 </Para>
1173
1174 <Para>
1175 The type <Literal>GHC.RealWorld</Literal> is truly opaque: there are no values defined
1176 of this type, and no operations over it.  It is &ldquo;primitive&rdquo; in that
1177 sense - but it is <Emphasis>not unlifted!</Emphasis> Its only role in life is to be
1178 the type which distinguishes the <Literal>IO</Literal> state transformer.
1179 </Para>
1180
1181 <Para>
1182
1183 <ProgramListing>
1184 data RealWorld
1185 </ProgramListing>
1186
1187 </Para>
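<Para>
The role of <Literal>State&num; RealWorld</Literal> can be seen in a small sketch.  It assumes a recent GHC in which the <Literal>IO</Literal> constructor is importable from <Literal>GHC.IO</Literal> and wraps a function over the real-world state; the names <Function>answerRep</Function> and <Function>answer</Function> are hypothetical.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module StateSketch where

import GHC.Exts (RealWorld, State#)
import GHC.IO (IO (IO))

-- | A state-transformer function over the real-world state: it passes the
-- state token through unchanged and pairs it with a result.
answerRep :: State# RealWorld -> (# State# RealWorld, Int #)
answerRep s = (# s, 42 #)

-- | Wrapping the function in the IO constructor gives an ordinary IO action.
answer :: IO Int
answer = IO answerRep
</ProgramListing>

<Para>
Running <Function>answer</Function> in <Literal>IO</Literal> simply returns 42; the state token costs nothing at run time, in line with the description above.
</Para>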
1188
1189 </Sect2>
1190
1191 <Sect2>
1192 <Title>State of the world</Title>
1193
1194 <Para>
1195 A single, primitive, value of type <Literal>State&num; RealWorld</Literal> is provided.
1196 </Para>
1197
1198 <Para>
1199
1200 <ProgramListing>
1201 realWorld# :: State# RealWorld
1202 </ProgramListing>
1203
1204 <IndexTerm><Primary>realWorld&num; state object</Primary></IndexTerm>
1205 </Para>
1206
1207 <Para>
(Note: in the compiler, <Literal>realWorld&num;</Literal> is not a <Literal>PrimOp</Literal>; it is just a very magic
<Literal>Id</Literal>.  It is exported from <Literal>GHC</Literal>, though.)
1210 </Para>
1211
1212 </Sect2>
1213
1214 <Sect2 id="sect-mutable">
1215 <Title>Mutable arrays</Title>
1216
1217 <Para>
1218 <IndexTerm><Primary>mutable arrays</Primary></IndexTerm>
1219 <IndexTerm><Primary>arrays, mutable</Primary></IndexTerm>
1220 Corresponding to <Literal>Array&num;</Literal> and <Literal>ByteArray&num;</Literal>, we have the types of
1221 mutable versions of each.  In each case, the representation is a
1222 pointer to a suitable block of (mutable) heap-allocated storage.
1223 </Para>
1224
1225 <Para>
1226
1227 <ProgramListing>
1228 type MutableArray# s elt
1229 type MutableByteArray# s
1230 </ProgramListing>
1231
1232 <IndexTerm><Primary><literal>MutableArray&num;</literal></Primary></IndexTerm>
1233 <IndexTerm><Primary><literal>MutableByteArray&num;</literal></Primary></IndexTerm>
1234 </Para>
1235
1236 <Sect3>
1237 <Title>Allocation</Title>
1238
1239 <Para>
1240 <IndexTerm><Primary>mutable arrays, allocation</Primary></IndexTerm>
1241 <IndexTerm><Primary>arrays, allocation</Primary></IndexTerm>
1242 <IndexTerm><Primary>allocation, of mutable arrays</Primary></IndexTerm>
1243 </Para>
1244
1245 <Para>
1246 Mutable arrays can be allocated. Only pointer-arrays are initialised;
1247 arrays of non-pointers are filled in by &ldquo;user code&rdquo; rather than by
1248 the array-allocation primitive.  Reason: only the pointer case has to
1249 worry about GC striking with a partly-initialised array.
1250 </Para>
1251
1252 <Para>
1253
<ProgramListing>
newArray#       :: Int# -> elt -> State# s -> (# State# s, MutableArray# s elt #)

newCharArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newIntArray#    :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newAddrArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newFloatArray#  :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newDoubleArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
</ProgramListing>
1263
1264 <IndexTerm><Primary><literal>newArray&num;</literal></Primary></IndexTerm>
1265 <IndexTerm><Primary><literal>newCharArray&num;</literal></Primary></IndexTerm>
1266 <IndexTerm><Primary><literal>newIntArray&num;</literal></Primary></IndexTerm>
1267 <IndexTerm><Primary><literal>newAddrArray&num;</literal></Primary></IndexTerm>
1268 <IndexTerm><Primary><literal>newFloatArray&num;</literal></Primary></IndexTerm>
1269 <IndexTerm><Primary><literal>newDoubleArray&num;</literal></Primary></IndexTerm>
1270 </Para>
1271
1272 <Para>
1273 The size of a <Literal>ByteArray&num;</Literal> is given in bytes.
1274 </Para>
1275
1276 </Sect3>
1277
1278 <Sect3>
1279 <Title>Reading and writing</Title>
1280
1281 <Para>
1282 <IndexTerm><Primary>arrays, reading and writing</Primary></IndexTerm>
1283 </Para>
1284
1285 <Para>
1286
1287 <ProgramListing>
1288 readArray#       :: MutableArray# s elt -> Int# -> State# s -> (# State# s, elt #)
1289 readCharArray#   :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Char# #)
1290 readIntArray#    :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #)
1291 readAddrArray#   :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Addr# #)
1292 readFloatArray#  :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Float# #)
1293 readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Double# #)
1294
1295 writeArray#       :: MutableArray# s elt -> Int# -> elt     -> State# s -> State# s
1296 writeCharArray#   :: MutableByteArray# s -> Int# -> Char#   -> State# s -> State# s
1297 writeIntArray#    :: MutableByteArray# s -> Int# -> Int#    -> State# s -> State# s
1298 writeAddrArray#   :: MutableByteArray# s -> Int# -> Addr#   -> State# s -> State# s
1299 writeFloatArray#  :: MutableByteArray# s -> Int# -> Float#  -> State# s -> State# s
1300 writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s
1301 </ProgramListing>
1302
1303 <IndexTerm><Primary><literal>readArray&num;</literal></Primary></IndexTerm>
1304 <IndexTerm><Primary><literal>readCharArray&num;</literal></Primary></IndexTerm>
1305 <IndexTerm><Primary><literal>readIntArray&num;</literal></Primary></IndexTerm>
1306 <IndexTerm><Primary><literal>readAddrArray&num;</literal></Primary></IndexTerm>
1307 <IndexTerm><Primary><literal>readFloatArray&num;</literal></Primary></IndexTerm>
1308 <IndexTerm><Primary><literal>readDoubleArray&num;</literal></Primary></IndexTerm>
1309 <IndexTerm><Primary><literal>writeArray&num;</literal></Primary></IndexTerm>
1310 <IndexTerm><Primary><literal>writeCharArray&num;</literal></Primary></IndexTerm>
1311 <IndexTerm><Primary><literal>writeIntArray&num;</literal></Primary></IndexTerm>
1312 <IndexTerm><Primary><literal>writeAddrArray&num;</literal></Primary></IndexTerm>
1313 <IndexTerm><Primary><literal>writeFloatArray&num;</literal></Primary></IndexTerm>
1314 <IndexTerm><Primary><literal>writeDoubleArray&num;</literal></Primary></IndexTerm>
1315 </Para>
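<Para>
The following sketch allocates a mutable byte array, writes two <Literal>Int&num;</Literal>s into it and reads the second one back.  It is written against a recent GHC, which is an assumption: there the single allocator <Function>newByteArray&num;</Function> (size in bytes) replaces the per-type allocators listed above, and <Function>readIntArray&num;</Function> and <Function>writeIntArray&num;</Function> count their index in <Literal>Int&num;</Literal>-sized elements.  The name <Function>secondOf</Function> is hypothetical.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module ByteArraySketch where

import Foreign.Storable (sizeOf)
import GHC.Exts (Int (I#), newByteArray#, readIntArray#, writeIntArray#)
import GHC.ST (ST (ST), runST)

-- | Store two Ints in a fresh mutable byte array and read the second back.
secondOf :: Int -> Int -> Int
secondOf (I# a) (I# b) =
  case 2 * sizeOf (0 :: Int) of          -- room for two Ints, in bytes
    I# nbytes -> runST (ST (\s0 ->
      case newByteArray# nbytes s0 of    -- contents are uninitialised
        (# s1, mba #) ->
          case writeIntArray# mba 0# a s1 of
            s2 -> case writeIntArray# mba 1# b s2 of
              s3 -> case readIntArray# mba 1# s3 of
                (# s4, r #) -> (# s4, I# r #)))
</ProgramListing>

<Para>
Each primitive takes the current state and returns a new one, which is what forces the writes to happen before the read.  <Literal>secondOf 10 20</Literal> should give 20.
</Para>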
1316
1317 </Sect3>
1318
1319 <Sect3>
1320 <Title>Equality</Title>
1321
1322 <Para>
1323 <IndexTerm><Primary>arrays, testing for equality</Primary></IndexTerm>
1324 </Para>
1325
1326 <Para>
1327 One can take &ldquo;equality&rdquo; of mutable arrays.  What is compared is the
1328 <Emphasis>name</Emphasis> or reference to the mutable array, not its contents.
1329 </Para>
1330
1331 <Para>
1332
1333 <ProgramListing>
1334 sameMutableArray#     :: MutableArray# s elt -> MutableArray# s elt -> Bool
1335 sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
1336 </ProgramListing>
1337
1338 <IndexTerm><Primary><literal>sameMutableArray&num;</literal></Primary></IndexTerm>
1339 <IndexTerm><Primary><literal>sameMutableByteArray&num;</literal></Primary></IndexTerm>
1340 </Para>
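<Para>
A small sketch of reference equality: two arrays with identical contents are still different arrays.  It assumes a recent GHC, where <Function>sameMutableArray&num;</Function> returns an <Literal>Int&num;</Literal> that <Function>isTrue&num;</Function> converts to a <Literal>Bool</Literal>, rather than returning <Literal>Bool</Literal> directly as in the signatures above.  The name <Function>compareRefs</Function> is hypothetical.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module SameArray where

import GHC.Exts (isTrue#, newArray#, sameMutableArray#)
import GHC.ST (ST (ST), runST)

-- | Allocate two arrays with the same contents and compare references.
-- The result is (array versus itself, array versus the other array).
compareRefs :: (Bool, Bool)
compareRefs = runST (ST (\s0 ->
  case newArray# 1# 'x' s0 of
    (# s1, arr1 #) ->
      case newArray# 1# 'x' s1 of
        (# s2, arr2 #) ->
          (# s2, ( isTrue# (sameMutableArray# arr1 arr1)
                 , isTrue# (sameMutableArray# arr1 arr2) ) #)))
</ProgramListing>

<Para>
The expected result is <Literal>(True, False)</Literal>.
</Para>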
1341
1342 </Sect3>
1343
1344 <Sect3>
1345 <Title>Freezing mutable arrays</Title>
1346
1347 <Para>
1348 <IndexTerm><Primary>arrays, freezing mutable</Primary></IndexTerm>
1349 <IndexTerm><Primary>freezing mutable arrays</Primary></IndexTerm>
1350 <IndexTerm><Primary>mutable arrays, freezing</Primary></IndexTerm>
1351 </Para>
1352
1353 <Para>
1354 Only unsafe-freeze has a primitive.  (Safe freeze is done directly in Haskell
1355 by copying the array and then using <Function>unsafeFreeze</Function>.)
1356 </Para>
1357
1358 <Para>
1359
<ProgramListing>
unsafeFreezeArray#     :: MutableArray# s elt -> State# s -> (# State# s, Array# elt #)
unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> (# State# s, ByteArray# #)
</ProgramListing>
1364
1365 <IndexTerm><Primary><literal>unsafeFreezeArray&num;</literal></Primary></IndexTerm>
1366 <IndexTerm><Primary><literal>unsafeFreezeByteArray&num;</literal></Primary></IndexTerm>
1367 </Para>
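<Para>
Putting allocation, writing, freezing and indexing together, here is a minimal sketch for boxed arrays.  It assumes a recent GHC with <Literal>GHC.Exts</Literal> and <Literal>GHC.ST</Literal>; the name <Function>pairSum</Function> is hypothetical.  The freeze is safe here only because the mutable array is never touched again afterwards.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module FreezeSketch where

import GHC.Exts (indexArray#, newArray#, unsafeFreezeArray#, writeArray#)
import GHC.ST (ST (ST), runST)

-- | Build a two-element array, freeze it in place, and index both slots.
pairSum :: Int -> Int -> Int
pairSum x y = runST (ST (\s0 ->
  case newArray# 2# x s0 of              -- both slots initialised to x
    (# s1, marr #) ->
      case writeArray# marr 1# y s1 of   -- overwrite slot 1 with y
        s2 ->
          case unsafeFreezeArray# marr s2 of
            (# s3, arr #) ->
              case indexArray# arr 0# of -- single-element unboxed tuple
                (# a #) ->
                  case indexArray# arr 1# of
                    (# b #) -> (# s3, a + b #)))
</ProgramListing>

<Para>
<Literal>pairSum 3 4</Literal> should give 7.
</Para>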
1368
1369 </Sect3>
1370
1371 </Sect2>
1372
1373 <Sect2>
<Title>Synchronising variables (M-vars)</Title>
1375
1376 <Para>
1377 <IndexTerm><Primary>synchronising variables (M-vars)</Primary></IndexTerm>
1378 <IndexTerm><Primary>M-Vars</Primary></IndexTerm>
1379 </Para>
1380
1381 <Para>
1382 Synchronising variables are the primitive type used to implement
1383 Concurrent Haskell's MVars (see the Concurrent Haskell paper for
1384 the operational behaviour of these operations).
1385 </Para>
1386
1387 <Para>
1388
<ProgramListing>
type MVar# s elt        -- primitive

newMVar#    :: State# s -> (# State# s, MVar# s elt #)
takeMVar#   :: MVar# s elt -> State# s -> (# State# s, elt #)
putMVar#    :: MVar# s elt -> elt -> State# s -> State# s
</ProgramListing>

<IndexTerm><Primary><literal>MVar&num;</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>newMVar&num;</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>takeMVar&num;</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>putMVar&num;</literal></Primary></IndexTerm>
1401 </Para>
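<Para>
As a minimal sketch of how these primitives thread the state, the following creates an empty variable, fills it and takes the value straight back out.  It assumes a recent GHC, where the <Literal>IO</Literal> constructor is importable from <Literal>GHC.IO</Literal> and the put operation takes the element as its second argument; the name <Function>roundTrip</Function> is hypothetical.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module MVarSketch where

import GHC.Exts (newMVar#, putMVar#, takeMVar#)
import GHC.IO (IO (IO))

-- | Create an empty MVar#, fill it, and take the value back out.
-- The take cannot block, because the put has already happened.
roundTrip :: a -> IO a
roundTrip x = IO (\s0 ->
  case newMVar# s0 of
    (# s1, mv #) ->
      case putMVar# mv x s1 of
        s2 -> takeMVar# mv s2)
</ProgramListing>

<Para>
Ordinary programs would use the boxed <Literal>MVar</Literal> interface instead; the sketch only shows what lies underneath it.
</Para>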
1402
1403 </Sect2>
1404
1405 </Sect1>
1406
1407 <Sect1 id="glasgow-ST-monad">
1408 <Title>Primitive state-transformer monad
1409 </Title>
1410
1411 <Para>
1412 <IndexTerm><Primary>state transformers (Glasgow extensions)</Primary></IndexTerm>
1413 <IndexTerm><Primary>ST monad (Glasgow extension)</Primary></IndexTerm>
1414 </Para>
1415
1416 <Para>
1417 This monad underlies our implementation of arrays, mutable and
1418 immutable, and our implementation of I/O, including &ldquo;C calls&rdquo;.
1419 </Para>
1420
1421 <Para>
1422 The <Literal>ST</Literal> library, which provides access to the
1423 <Function>ST</Function> monad, is described in <xref
1424 linkend="sec-ST">.
1425 </Para>
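<Para>
For orientation, here is a small sketch of the monad in use.  It assumes the module names <Literal>Control.Monad.ST</Literal> and <Literal>Data.STRef</Literal> found in recent GHC distributions, which may differ from the library described in the section referenced above; the name <Function>sumST</Function> is hypothetical.
</Para>

<ProgramListing>
module STSketch where

import Control.Monad.ST (ST, runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- | Sum a list using a mutable accumulator that never escapes runST.
sumST :: [Int] -> Int
sumST xs = runST (do
  acc <- newSTRef 0
  mapM_ (\x -> modifySTRef' acc (+ x)) xs
  readSTRef acc)
</ProgramListing>

<Para>
Although <Function>sumST</Function> updates a reference in place, it is a pure function from the outside: <Literal>sumST [1, 2, 3]</Literal> should give 6.
</Para>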
1426
1427 </Sect1>
1428
1429 <Sect1 id="glasgow-prim-arrays">
1430 <Title>Primitive arrays, mutable and otherwise
1431 </Title>
1432
1433 <Para>
1434 <IndexTerm><Primary>primitive arrays (Glasgow extension)</Primary></IndexTerm>
1435 <IndexTerm><Primary>arrays, primitive (Glasgow extension)</Primary></IndexTerm>
1436 </Para>
1437
1438 <Para>
1439 GHC knows about quite a few flavours of Large Swathes of Bytes.
1440 </Para>
1441
1442 <Para>
1443 First, GHC distinguishes between primitive arrays of (boxed) Haskell
1444 objects (type <Literal>Array&num; obj</Literal>) and primitive arrays of bytes (type
1445 <Literal>ByteArray&num;</Literal>).
1446 </Para>
1447
1448 <Para>
1449 Second, it distinguishes between&hellip;
1450 <VariableList>
1451
1452 <VarListEntry>
1453 <Term>Immutable:</Term>
1454 <ListItem>
1455 <Para>
1456 Arrays that do not change (as with &ldquo;standard&rdquo; Haskell arrays); you
1457 can only read from them.  Obviously, they do not need the care and
1458 attention of the state-transformer monad.
1459 </Para>
1460 </ListItem>
1461 </VarListEntry>
1462 <VarListEntry>
1463 <Term>Mutable:</Term>
1464 <ListItem>
1465 <Para>
1466 Arrays that may be changed or &ldquo;mutated.&rdquo;  All the operations on them
1467 live within the state-transformer monad and the updates happen
1468 <Emphasis>in-place</Emphasis>.
1469 </Para>
1470 </ListItem>
1471 </VarListEntry>
1472 <VarListEntry>
1473 <Term>&ldquo;Static&rdquo; (in C land):</Term>
1474 <ListItem>
1475 <Para>
1476 A C routine may pass an <Literal>Addr&num;</Literal> pointer back into Haskell land.  There
1477 are then primitive operations with which you may merrily grab values
1478 over in C land, by indexing off the &ldquo;static&rdquo; pointer.
1479 </Para>
1480 </ListItem>
1481 </VarListEntry>
1482 <VarListEntry>
1483 <Term>&ldquo;Stable&rdquo; pointers:</Term>
1484 <ListItem>
1485 <Para>
1486 If, for some reason, you wish to hand a Haskell pointer (i.e.,
1487 <Emphasis>not</Emphasis> an unboxed value) to a C routine, you first make the
1488 pointer &ldquo;stable,&rdquo; so that the garbage collector won't forget that it
1489 exists.  That is, GHC provides a safe way to pass Haskell pointers to
1490 C.
1491 </Para>
1492
1493 <Para>
1494 Please see <XRef LinkEnd="sec-stable-pointers"> for more details.
1495 </Para>
1496 </ListItem>
1497 </VarListEntry>
1498 <VarListEntry>
1499 <Term>&ldquo;Foreign objects&rdquo;:</Term>
1500 <ListItem>
1501 <Para>
1502 A &ldquo;foreign object&rdquo; is a safe way to pass an external object (a
1503 C-allocated pointer, say) to Haskell and have Haskell do the Right
1504 Thing when it no longer references the object.  So, for example, C
1505 could pass a large bitmap over to Haskell and say &ldquo;please free this
1506 memory when you're done with it.&rdquo;
1507 </Para>
1508
1509 <Para>
1510 Please see <XRef LinkEnd="sec-ForeignObj"> for more details.
1511 </Para>
1512 </ListItem>
1513 </VarListEntry>
1514 </VariableList>
1515 </Para>
1516
1517 <Para>
The libraries documentation gives more details on all these
1519 &ldquo;primitive array&rdquo; types and the operations on them.
1520 </Para>
1521
1522 </Sect1>
1523
1524
1525 <Sect1 id="pattern-guards">
1526 <Title>Pattern guards</Title>
1527
1528 <Para>
1529 <IndexTerm><Primary>Pattern guards (Glasgow extension)</Primary></IndexTerm>
1530 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ULink URL="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ULink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
1531 </Para>
1532
1533 <Para>
1534 Suppose we have an abstract data type of finite maps, with a
1535 lookup operation:
1536
1537 <ProgramListing>
1538 lookup :: FiniteMap -> Int -> Maybe Int
1539 </ProgramListing>
1540
1541 The lookup returns <Function>Nothing</Function> if the supplied key is not in the domain of the mapping, and <Function>(Just v)</Function> otherwise,
1542 where <VarName>v</VarName> is the value that the key maps to.  Now consider the following definition:
1543 </Para>
1544
<ProgramListing>
clunky env var1 var2 | ok1 && ok2 = val1 + val2
                     | otherwise  = var1 + var2
  where
    m1 = lookup env var1
    m2 = lookup env var2
    ok1 = maybeToBool m1
    ok2 = maybeToBool m2
    val1 = expectJust m1
    val2 = expectJust m2
</ProgramListing>
1556
1557 <Para>
1558 The auxiliary functions are
1559 </Para>
1560
1561 <ProgramListing>
1562 maybeToBool :: Maybe a -&gt; Bool
1563 maybeToBool (Just x) = True
1564 maybeToBool Nothing  = False
1565
1566 expectJust :: Maybe a -&gt; a
1567 expectJust (Just x) = x
1568 expectJust Nothing  = error "Unexpected Nothing"
1569 </ProgramListing>
1570
1571 <Para>
1572 What is <Function>clunky</Function> doing? The guard <Literal>ok1 &&
1573 ok2</Literal> checks that both lookups succeed, using
1574 <Function>maybeToBool</Function> to convert the <Function>Maybe</Function>
1575 types to booleans. The (lazily evaluated) <Function>expectJust</Function>
calls extract the values from the results of the lookups, and bind the
1577 returned values to <VarName>val1</VarName> and <VarName>val2</VarName>
1578 respectively.  If either lookup fails, then clunky takes the
1579 <Literal>otherwise</Literal> case and returns the sum of its arguments.
1580 </Para>
1581
1582 <Para>
1583 This is certainly legal Haskell, but it is a tremendously verbose and
1584 un-obvious way to achieve the desired effect.  Arguably, a more direct way
1585 to write clunky would be to use case expressions:
1586 </Para>
1587
<ProgramListing>
clunky env var1 var2 = case lookup env var1 of
  Nothing -&gt; fail
  Just val1 -&gt; case lookup env var2 of
    Nothing -&gt; fail
    Just val2 -&gt; val1 + val2
  where
    fail = var1 + var2
</ProgramListing>
1597
1598 <Para>
1599 This is a bit shorter, but hardly better.  Of course, we can rewrite any set
1600 of pattern-matching, guarded equations as case expressions; that is
precisely what the compiler does when compiling equations! Haskell
provides guarded equations because they allow us to write down
1603 the cases we want to consider, one at a time, independently of each other.
1604 This structure is hidden in the case version.  Two of the right-hand sides
1605 are really the same (<Function>fail</Function>), and the whole expression
1606 tends to become more and more indented.
1607 </Para>
1608
1609 <Para>
1610 Here is how I would write clunky:
1611 </Para>
1612
<ProgramListing>
clunky env var1 var2
  | Just val1 &lt;- lookup env var1
  , Just val2 &lt;- lookup env var2
  = val1 + val2
...other equations for clunky...
</ProgramListing>
1620
1621 <Para>
The semantics should be clear enough.  The qualifiers are matched in order.
1623 For a <Literal>&lt;-</Literal> qualifier, which I call a pattern guard, the
1624 right hand side is evaluated and matched against the pattern on the left.
1625 If the match fails then the whole guard fails and the next equation is
1626 tried.  If it succeeds, then the appropriate binding takes place, and the
1627 next qualifier is matched, in the augmented environment.  Unlike list
1628 comprehensions, however, the type of the expression to the right of the
1629 <Literal>&lt;-</Literal> is the same as the type of the pattern to its
1630 left.  The bindings introduced by pattern guards scope over all the
1631 remaining guard qualifiers, and over the right hand side of the equation.
1632 </Para>
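<Para>
A complete, compilable version of the pattern-guard definition might look like the following sketch.  An association list stands in for the abstract <Literal>FiniteMap</Literal> type, and the names <Literal>Env</Literal> and <Function>lookupEnv</Function> are assumptions made for the example; the final equation plays the part of the &ldquo;other equations&rdquo;.
</Para>

<ProgramListing>
module PatternGuards where

-- | Stand-in for the abstract finite map: a simple association list.
type Env = [(Int, Int)]

-- | Lookup with the argument order used in the text (map first, then key).
lookupEnv :: Env -> Int -> Maybe Int
lookupEnv env key = lookup key env

clunky :: Env -> Int -> Int -> Int
clunky env var1 var2
  | Just val1 <- lookupEnv env var1
  , Just val2 <- lookupEnv env var2
  = val1 + val2
-- Reached when either pattern guard fails.
clunky _ var1 var2 = var1 + var2
</ProgramListing>

<Para>
With the environment <Literal>[(1, 10), (2, 20)]</Literal>, <Literal>clunky env 1 2</Literal> should give 30, while <Literal>clunky env 1 3</Literal> falls through to the second equation and gives 4.
</Para>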
1633
1634 <Para>
Just as with list comprehensions, boolean expressions can be freely mixed
among the pattern guards.  For example:
1637 </Para>
1638
1639 <ProgramListing>
1640 f x | [y] <- x
1641     , y > 3
1642     , Just z <- h y
1643     = ...
1644 </ProgramListing>
1645
1646 <Para>
1647 Haskell's current guards therefore emerge as a special case, in which the
1648 qualifier list has just one element, a boolean expression.
1649 </Para>
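<Para>
To see pattern guards and boolean guards side by side, here is a runnable sketch.  The helper <Function>half</Function> is a hypothetical stand-in for <Function>h</Function>, and the right-hand sides are invented for illustration.
</Para>

<ProgramListing>
module MixedGuards where

-- | Hypothetical stand-in for h: halve even numbers only.
-- This uses ordinary boolean guards, the one-qualifier special case.
half :: Int -> Maybe Int
half n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

-- | Succeeds only for a singleton list whose element exceeds 3 and is even.
f :: [Int] -> Maybe Int
f x | [y] <- x
    , y > 3
    , Just z <- half y
    = Just z
f _ = Nothing
</ProgramListing>

<Para>
Here <Literal>f [8]</Literal> should give <Literal>Just 4</Literal>, whereas <Literal>f [2]</Literal>, <Literal>f [5]</Literal> and <Literal>f [8, 8]</Literal> each fail one of the qualifiers and give <Literal>Nothing</Literal>.
</Para>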
1650 </Sect1>
1651
1652   <sect1 id="sec-ffi">
1653     <title>The foreign interface</title>
1654
1655     <para>The foreign interface consists of the following components:</para>
1656
1657     <itemizedlist>
1658       <listitem>
1659         <para>The Foreign Function Interface language specification
1660         (included in this manual, in <xref linkend="ffi">).</para>
1661       </listitem>
1662
1663       <listitem>
1664         <para>The <literal>Foreign</literal> module (see <xref
1665         linkend="sec-Foreign">) collects together several interfaces
1666         which are useful in specifying foreign language
1667         interfaces, including the following:</para>
1668
1669         <itemizedlist>
1670           <listitem>
1671             <para>The <literal>ForeignObj</literal> module (see <xref
1672             linkend="sec-ForeignObj">), for managing pointers from
1673             Haskell into the outside world.</para>
1674           </listitem>
1675
1676           <listitem>
1677             <para>The <literal>StablePtr</literal> module (see <xref
1678             linkend="sec-stable-pointers">), for managing pointers
1679             into Haskell from the outside world.</para>
1680           </listitem>
1681
1682           <listitem>
1683             <para>The <literal>CTypes</literal> module (see <xref
1684             linkend="sec-CTypes">) gives Haskell equivalents for the
1685             standard C datatypes, for use in making Haskell bindings
1686             to existing C libraries.</para>
1687           </listitem>
1688
1689           <listitem>
1690             <para>The <literal>CTypesISO</literal> module (see <xref
1691             linkend="sec-CTypesISO">) gives Haskell equivalents for C
1692             types defined by the ISO C standard.</para>
1693           </listitem>
1694
1695           <listitem>
1696             <para>The <literal>Storable</literal> library, for
1697             primitive marshalling of data types between Haskell and
1698             the foreign language.</para>
1699           </listitem>
1700         </itemizedlist>
1701
1702       </listitem>
1703     </itemizedlist>
1704
1705 <para>The following sections also give some hints and tips on the use
1706 of the foreign function interface in GHC.</para>
1707
1708 <Sect2 id="glasgow-foreign-headers">
1709 <Title>Using function headers
1710 </Title>
1711
1712 <Para>
1713 <IndexTerm><Primary>C calls, function headers</Primary></IndexTerm>
1714 </Para>
1715
1716 <Para>
When generating C (using the <Option>-fvia-C</Option> option), one can assist the
C compiler in detecting type errors by using the <Command>-&num;include</Command> option
1719 to provide <Filename>.h</Filename> files containing function headers.
1720 </Para>
1721
1722 <Para>
1723 For example,
1724 </Para>
1725
1726 <Para>
1727
1728 <ProgramListing>
1729 #include "HsFFI.h"
1730
1731 void         initialiseEFS (HsInt size);
1732 HsInt        terminateEFS (void);
1733 HsForeignObj emptyEFS(void);
1734 HsForeignObj updateEFS (HsForeignObj a, HsInt i, HsInt x);
1735 HsInt        lookupEFS (HsForeignObj a, HsInt i);
1736 </ProgramListing>
1737 </Para>
1738
1739       <para>The types <literal>HsInt</literal>,
1740       <literal>HsForeignObj</literal> etc. are described in <xref
1741       linkend="sec-mapping-table">.</Para>
1742
1743       <Para>Note that this approach is only
1744       <Emphasis>essential</Emphasis> for returning
1745       <Literal>float</Literal>s (or if <Literal>sizeof(int) !=
1746       sizeof(int *)</Literal> on your architecture) but is a Good
1747       Thing for anyone who cares about writing solid code.  You're
1748       crazy not to do it.</Para>
1749
1750 </Sect2>
1751
1752 </Sect1>
1753
1754 <Sect1 id="multi-param-type-classes">
1755 <Title>Multi-parameter type classes
1756 </Title>
1757
1758 <Para>
1759 This section documents GHC's implementation of multi-parameter type
1760 classes.  There's lots of background in the paper <ULink
1761 URL="http://research.microsoft.com/~simonpj/multi.ps.gz" >Type
1762 classes: exploring the design space</ULink > (Simon Peyton Jones, Mark
1763 Jones, Erik Meijer).
1764 </Para>
1765
1766 <Para>
I'd like to thank people who reported shortcomings in the GHC 3.02
1768 implementation.  Our default decisions were all conservative ones, and
1769 the experience of these heroic pioneers has given useful concrete
1770 examples to support several generalisations.  (These appear below as
1771 design choices not implemented in 3.02.)
1772 </Para>
1773
1774 <Para>
1775 I've discussed these notes with Mark Jones, and I believe that Hugs
1776 will migrate towards the same design choices as I outline here.
1777 Thanks to him, and to many others who have offered very useful
1778 feedback.
1779 </Para>
1780
1781 <Sect2>
1782 <Title>Types</Title>
1783
1784 <Para>
1785 There are the following restrictions on the form of a qualified
1786 type:
1787 </Para>
1788
1789 <Para>
1790
1791 <ProgramListing>
  forall tv1..tvn. (c1, ...,cn) => type
1793 </ProgramListing>
1794
1795 </Para>
1796
1797 <Para>
1798 (Here, I write the "foralls" explicitly, although the Haskell source
1799 language omits them; in Haskell 1.4, all the free type variables of an
1800 explicit source-language type signature are universally quantified,
1801 except for the class type variables in a class declaration.  However,
1802 in GHC, you can give the foralls if you want.  See <XRef LinkEnd="universal-quantification">).
1803 </Para>
1804
1805 <Para>
1806
1807 <OrderedList>
1808 <ListItem>
1809
1810 <Para>
1811  <Emphasis>Each universally quantified type variable
1812 <Literal>tvi</Literal> must be mentioned (i.e. appear free) in <Literal>type</Literal></Emphasis>.
1813
1814 The reason for this is that a value with a type that does not obey
1815 this restriction could not be used without introducing
1816 ambiguity. Here, for example, is an illegal type:
1817
1818
1819 <ProgramListing>
1820   forall a. Eq a => Int
1821 </ProgramListing>
1822
1823
1824 When a value with this type was used, the constraint <Literal>Eq tv</Literal>
1825 would be introduced where <Literal>tv</Literal> is a fresh type variable, and
1826 (in the dictionary-translation implementation) the value would be
1827 applied to a dictionary for <Literal>Eq tv</Literal>.  The difficulty is that we
1828 can never know which instance of <Literal>Eq</Literal> to use because we never
1829 get any more information about <Literal>tv</Literal>.
1830
1831 </Para>
1832 </ListItem>
1833 <ListItem>
1834
1835 <Para>
1836  <Emphasis>Every constraint <Literal>ci</Literal> must mention at least one of the
1837 universally quantified type variables <Literal>tvi</Literal></Emphasis>.
1838
1839 For example, this type is OK because <Literal>C a b</Literal> mentions the
universally quantified type variable <Literal>a</Literal>:
1841
1842
1843 <ProgramListing>
1844   forall a. C a b => burble
1845 </ProgramListing>
1846
1847
1848 The next type is illegal because the constraint <Literal>Eq b</Literal> does not
1849 mention <Literal>a</Literal>:
1850
1851
1852 <ProgramListing>
1853   forall a. Eq b => burble
1854 </ProgramListing>
1855
1856
1857 The reason for this restriction is milder than the other one.  The
1858 excluded types are never useful or necessary (because the offending
1859 context doesn't need to be witnessed at this point; it can be floated
1860 out).  Furthermore, floating them out increases sharing. Lastly,
1861 excluding them is a conservative choice; it leaves a patch of
1862 territory free in case we need it later.
1863
1864 </Para>
1865 </ListItem>
1866
1867 </OrderedList>
1868
1869 </Para>
1870
1871 <Para>
1872 These restrictions apply to all types, whether declared in a type signature
1873 or inferred.
1874 </Para>
1875
1876 <Para>
1877 Unlike Haskell 1.4, constraints in types do <Emphasis>not</Emphasis> have to be of
1878 the form <Emphasis>(class type-variables)</Emphasis>.  Thus, these type signatures
1879 are perfectly OK
1880 </Para>
1881
1882 <Para>
1883
1884 <ProgramListing>
1885   f :: Eq (m a) => [m a] -> [m a]
1886   g :: Eq [a] => ...
1887 </ProgramListing>
1888
1889 </Para>
1890
1891 <Para>
1892 This choice recovers principal types, a property that Haskell 1.4 does not have.
1893 </Para>
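
<Para>
A signature of the first shape can be exercised with a small sketch.
The function <Function>dedup</Function> is hypothetical, and the
<Literal>LANGUAGE</Literal> pragma is the spelling used by later GHC
releases, so treat it as an assumption; with the compiler described
here, <Option>-fglasgow-exts</Option> plays that role.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE FlexibleContexts #-}
module Main where

-- Drop adjacent duplicates.  The constraint is on the compound type
-- (m a), not on a bare type variable.
dedup :: Eq (m a) => [m a] -> [m a]
dedup (x : y : rest)
  | x == y    = dedup (y : rest)
  | otherwise = x : dedup (y : rest)
dedup xs = xs

main :: IO ()
main = print (dedup [Just 1, Just 1, Nothing, Just (2 :: Int)])
</ProgramListing>

</Para>

<Para>
Here <Literal>m</Literal> is instantiated to <Literal>Maybe</Literal> and
<Literal>a</Literal> to <Literal>Int</Literal>, so the constraint becomes
<Literal>Eq (Maybe Int)</Literal>; the program should print
<Literal>[Just 1,Nothing,Just 2]</Literal>.
</Para>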
1894
1895 </Sect2>
1896
1897 <Sect2>
1898 <Title>Class declarations</Title>
1899
1900 <Para>
1901
1902 <OrderedList>
1903 <ListItem>
1904
1905 <Para>
1906  <Emphasis>Multi-parameter type classes are permitted</Emphasis>. For example:
1907
1908
1909 <ProgramListing>
1910   class Collection c a where
1911     union :: c a -> c a -> c a
1912     ...etc.
1913 </ProgramListing>
1914
1915
1916
1917 </Para>
1918 </ListItem>
1919 <ListItem>
1920
1921 <Para>
1922  <Emphasis>The class hierarchy must be acyclic</Emphasis>.  However, the definition
1923 of "acyclic" involves only the superclass relationships.  For example,
1924 this is OK:
1925
1926
1927 <ProgramListing>
1928   class C a where {
1929     op :: D b => a -> b -> b
1930   }
1931
1932   class C a => D a where { ... }
1933 </ProgramListing>
1934
1935
1936 Here, <Literal>C</Literal> is a superclass of <Literal>D</Literal>, but it's OK for a
1937 class operation <Literal>op</Literal> of <Literal>C</Literal> to mention <Literal>D</Literal>.  (It
1938 would not be OK for <Literal>D</Literal> to be a superclass of <Literal>C</Literal>.)
1939
1940 </Para>
1941 </ListItem>
1942 <ListItem>
1943
1944 <Para>
1945  <Emphasis>There are no restrictions on the context in a class declaration
1946 (which introduces superclasses), except that the class hierarchy must
1947 be acyclic</Emphasis>.  So these class declarations are OK:
1948
1949
1950 <ProgramListing>
1951   class Functor (m k) => FiniteMap m k where
1952     ...
1953
1954   class (Monad m, Monad (t m)) => Transform t m where
1955     lift :: m a -> (t m) a
1956 </ProgramListing>
1957
1958
1959 </Para>
1960 </ListItem>
1961 <ListItem>
1962
1963 <Para>
1964  <Emphasis>In the signature of a class operation, every constraint
1965 must mention at least one type variable that is not a class type
1966 variable</Emphasis>.
1967
1968 Thus:
1969
1970
1971 <ProgramListing>
1972   class Collection c a where
1973     mapC :: Collection c b => (a->b) -> c a -> c b
1974 </ProgramListing>
1975
1976
is OK because the constraint <Literal>(Collection c b)</Literal> mentions
<Literal>b</Literal>, even though it also mentions the class variable
<Literal>c</Literal>.  On the other hand:
1980
1981
1982 <ProgramListing>
1983   class C a where
1984     op :: Eq a => (a,b) -> (a,b)
1985 </ProgramListing>
1986
1987
is not OK because the constraint <Literal>(Eq a)</Literal> mentions only the class
type variable <Literal>a</Literal>, but not <Literal>b</Literal>.  However, any such
1990 example is easily fixed by moving the offending context up to the
1991 superclass context:
1992
1993
1994 <ProgramListing>
1995   class Eq a => C a where
    op :: (a,b) -> (a,b)
1997 </ProgramListing>
1998
1999
2000 A yet more relaxed rule would allow the context of a class-op signature
2001 to mention only class type variables.  However, that conflicts with
2002 Rule 1(b) for types above.
2003
2004 </Para>
2005 </ListItem>
2006 <ListItem>
2007
2008 <Para>
2009  <Emphasis>The type of each class operation must mention <Emphasis>all</Emphasis> of
2010 the class type variables</Emphasis>.  For example:
2011
2012
2013 <ProgramListing>
2014   class Coll s a where
2015     empty  :: s
2016     insert :: s -> a -> s
2017 </ProgramListing>
2018
2019
2020 is not OK, because the type of <Literal>empty</Literal> doesn't mention
2021 <Literal>a</Literal>.  This rule is a consequence of Rule 1(a), above, for
2022 types, and has the same motivation.
2023
2024 Sometimes, offending class declarations exhibit misunderstandings.  For
2025 example, <Literal>Coll</Literal> might be rewritten
2026
2027
2028 <ProgramListing>
2029   class Coll s a where
2030     empty  :: s a
2031     insert :: s a -> a -> s a
2032 </ProgramListing>
2033
2034
2035 which makes the connection between the type of a collection of
2036 <Literal>a</Literal>'s (namely <Literal>(s a)</Literal>) and the element type <Literal>a</Literal>.
2037 Occasionally this really doesn't work, in which case you can split the
2038 class like this:
2039
2040
2041 <ProgramListing>
2042   class CollE s where
2043     empty  :: s
2044
2045   class CollE s => Coll s a where
2046     insert :: s -> a -> s
2047 </ProgramListing>
2048
2049
2050 </Para>
2051 </ListItem>
2052
2053 </OrderedList>
2054
2055 </Para>
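
<Para>
To make the first rule concrete, here is a minimal sketch of a
two-parameter class together with one instance.  The list instance and
its duplicate-dropping <Function>union</Function> are assumptions chosen
for illustration, and the <Literal>LANGUAGE</Literal> pragmas are the
spelling of later GHC releases (with the compiler described here, use
<Option>-fglasgow-exts</Option>).
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
module Main where

-- A class relating a container constructor c to an element type a.
class Collection c a where
  union :: c a -> c a -> c a

-- Lists are a collection of any element type that supports equality.
-- Elements of the second list already present in the first are dropped.
instance Eq a => Collection [] a where
  union xs ys = xs ++ filter (`notElem` xs) ys

main :: IO ()
main = print (union [1, 2, 3] [3, 4 :: Int])
</ProgramListing>

</Para>

<Para>
The argument type <Literal>[Int]</Literal> fixes both class parameters
(<Literal>c</Literal> is the list constructor and <Literal>a</Literal> is
<Literal>Int</Literal>), and the program should print
<Literal>[1,2,3,4]</Literal>.
</Para>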
2056
2057 </Sect2>
2058
2059 <Sect2 id="instance-decls">
2060 <Title>Instance declarations</Title>
2061
2062 <Para>
2063
2064 <OrderedList>
2065 <ListItem>
2066
2067 <Para>
2068  <Emphasis>Instance declarations may not overlap</Emphasis>.  The two instance
2069 declarations
2070
2071
2072 <ProgramListing>
2073   instance context1 => C type1 where ...
2074   instance context2 => C type2 where ...
2075 </ProgramListing>
2076
2077
"overlap" if <Literal>type1</Literal> and <Literal>type2</Literal> unify.
2079
2080 However, if you give the command line option
2081 <Option>-fallow-overlapping-instances</Option><IndexTerm><Primary>-fallow-overlapping-instances
2082 option</Primary></IndexTerm> then two overlapping instance declarations are permitted
2083 iff
2084
2085
2086 <ItemizedList>
2087 <ListItem>
2088
2089 <Para>
2090  EITHER <Literal>type1</Literal> and <Literal>type2</Literal> do not unify
2091 </Para>
2092 </ListItem>
2093 <ListItem>
2094
2095 <Para>
2096  OR <Literal>type2</Literal> is a substitution instance of <Literal>type1</Literal>
2097 (but not identical to <Literal>type1</Literal>)
2098 </Para>
2099 </ListItem>
2100 <ListItem>
2101
2102 <Para>
2103  OR vice versa
2104 </Para>
2105 </ListItem>
2106
2107 </ItemizedList>
2108
2109
2110 Notice that these rules
2111
2112
2113 <ItemizedList>
2114 <ListItem>
2115
2116 <Para>
2117  make it clear which instance decl to use
2118 (pick the most specific one that matches)
2119
2120 </Para>
2121 </ListItem>
2122 <ListItem>
2123
2124 <Para>
 do not mention the contexts <Literal>context1</Literal>, <Literal>context2</Literal>.
2126 Reason: you can pick which instance decl
2127 "matches" based on the type.
2128 </Para>
2129 </ListItem>
2130
2131 </ItemizedList>
2132
2133
2134 Regrettably, GHC doesn't guarantee to detect overlapping instance
2135 declarations if they appear in different modules.  GHC can "see" the
2136 instance declarations in the transitive closure of all the modules
2137 imported by the one being compiled, so it can "see" all instance decls
2138 when it is compiling <Literal>Main</Literal>.  However, it currently chooses not
2139 to look at ones that can't possibly be of use in the module currently
2140 being compiled, in the interests of efficiency.  (Perhaps we should
2141 change that decision, at least for <Literal>Main</Literal>.)
2142
2143 </Para>
2144 </ListItem>
2145 <ListItem>
2146
2147 <Para>
2148  <Emphasis>There are no restrictions on the type in an instance
2149 <Emphasis>head</Emphasis>, except that at least one must not be a type variable</Emphasis>.
2150 The instance "head" is the bit after the "=>" in an instance decl. For
2151 example, these are OK:
2152
2153
2154 <ProgramListing>
2155   instance C Int a where ...
2156
2157   instance D (Int, Int) where ...
2158
2159   instance E [[a]] where ...
2160 </ProgramListing>
2161
2162
2163 Note that instance heads <Emphasis>may</Emphasis> contain repeated type variables.
2164 For example, this is OK:
2165
2166
2167 <ProgramListing>
2168   instance Stateful (ST s) (MutVar s) where ...
2169 </ProgramListing>
2170
2171
2172 The "at least one not a type variable" restriction is to ensure that
2173 context reduction terminates: each reduction step removes one type
2174 constructor.  For example, the following would make the type checker
2175 loop if it wasn't excluded:
2176
2177
2178 <ProgramListing>
2179   instance C a => C a where ...
2180 </ProgramListing>
2181
2182
2183 There are two situations in which the rule is a bit of a pain. First,
2184 if one allows overlapping instance declarations then it's quite
2185 convenient to have a "default instance" declaration that applies if
2186 something more specific does not:
2187
2188
2189 <ProgramListing>
2190   instance C a where
2191     op = ... -- Default
2192 </ProgramListing>
2193
2194
2195 Second, sometimes you might want to use the following to get the
2196 effect of a "class synonym":
2197
2198
2199 <ProgramListing>
2200   class (C1 a, C2 a, C3 a) => C a where { }
2201
2202   instance (C1 a, C2 a, C3 a) => C a where { }
2203 </ProgramListing>
2204
2205
2206 This allows you to write shorter signatures:
2207
2208
2209 <ProgramListing>
2210   f :: C a => ...
2211 </ProgramListing>
2212
2213
2214 instead of
2215
2216
2217 <ProgramListing>
2218   f :: (C1 a, C2 a, C3 a) => ...
2219 </ProgramListing>
2220
2221
2222 I'm on the lookout for a simple rule that preserves decidability while
2223 allowing these idioms.  The experimental flag
2224 <Option>-fallow-undecidable-instances</Option><IndexTerm><Primary>-fallow-undecidable-instances
2225 option</Primary></IndexTerm> lifts this restriction, allowing all the types in an
2226 instance head to be type variables.
2227
2228 </Para>
2229 </ListItem>
2230 <ListItem>
2231
2232 <Para>
2233  <Emphasis>Unlike Haskell 1.4, instance heads may use type
2234 synonyms</Emphasis>.  As always, using a type synonym is just shorthand for
2235 writing the RHS of the type synonym definition.  For example:
2236
2237
2238 <ProgramListing>
2239   type Point = (Int,Int)
2240   instance C Point   where ...
2241   instance C [Point] where ...
2242 </ProgramListing>
2243
2244
2245 is legal.  However, if you added
2246
2247
2248 <ProgramListing>
2249   instance C (Int,Int) where ...
2250 </ProgramListing>
2251
2252
2253 as well, then the compiler will complain about the overlapping
2254 (actually, identical) instance declarations.  As always, type synonyms
2255 must be fully applied.  You cannot, for example, write:
2256
2257
2258 <ProgramListing>
2259   type P a = [[a]]
2260   instance Monad P where ...
2261 </ProgramListing>
2262
2263
2264 This design decision is independent of all the others, and easily
2265 reversed, but it makes sense to me.
2266
2267 </Para>
2268 </ListItem>
2269 <ListItem>
2270
2271 <Para>
2272 <Emphasis>The types in an instance-declaration <Emphasis>context</Emphasis> must all
2273 be type variables</Emphasis>. Thus
2274
2275
2276 <ProgramListing>
2277 instance C a b => Eq (a,b) where ...
2278 </ProgramListing>
2279
2280
2281 is OK, but
2282
2283
2284 <ProgramListing>
2285 instance C Int b => Foo b where ...
2286 </ProgramListing>
2287
2288
2289 is not OK.  Again, the intent here is to make sure that context
2290 reduction terminates.
2291
2292 Voluminous correspondence on the Haskell mailing list has convinced me
2293 that it's worth experimenting with a more liberal rule.  If you use
the flag <Option>-fallow-undecidable-instances</Option>, you can use arbitrary
2295 types in an instance context.  Termination is ensured by having a
2296 fixed-depth recursion stack.  If you exceed the stack depth you get a
2297 sort of backtrace, and the opportunity to increase the stack depth
2298 with <Option>-fcontext-stack</Option><Emphasis>N</Emphasis>.
2299
2300 </Para>
2301 </ListItem>
2302
2303 </OrderedList>
2304
2305 </Para>
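
<Para>
The "pick the most specific one that matches" rule for overlapping
instances can be seen in a small sketch.  Everything here is an
assumption made for illustration: the class
<Literal>Describe</Literal> is hypothetical, and the per-instance
<Literal>OVERLAPPING</Literal> pragma is the mechanism of later GHC
releases, standing in for the
<Option>-fallow-overlapping-instances</Option> flag described above.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE FlexibleInstances #-}
module Main where

class Describe a where
  describe :: a -> String

-- General instance: matches every list type.
instance Describe [a] where
  describe xs = "a list of " ++ show (length xs) ++ " elements"

-- More specific instance: [Char] is a substitution instance of [a].
instance {-# OVERLAPPING #-} Describe [Char] where
  describe s = "the string " ++ show s

main :: IO ()
main = do
  putStrLn (describe [True, False])  -- only the general instance matches
  putStrLn (describe "hello")        -- both match; the specific one is chosen
</ProgramListing>

</Para>

<Para>
The first call should print <Literal>a list of 2 elements</Literal> and
the second <Literal>the string "hello"</Literal>.  The choice is made on
the types in the instance heads alone, without looking at any contexts.
</Para>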
2306
2307 </Sect2>
2308
2309 </Sect1>
2310
2311 <Sect1 id="implicit-parameters">
2312 <Title>Implicit parameters
2313 </Title>
2314
<Para> Implicit parameters are implemented as described in
2316 "Implicit parameters: dynamic scoping with static types",
2317 J Lewis, MB Shields, E Meijer, J Launchbury,
2318 27th ACM Symposium on Principles of Programming Languages (POPL'00),
2319 Boston, Jan 2000.
2320 </Para>
2321
2322 <Para>
2323 There should be more documentation, but there isn't (yet).  Yell if you need it.
2324 </Para>
2325 <ItemizedList>
2326 <ListItem>
2327 <Para> You can't have an implicit parameter in the context of a class or instance
2328 declaration.  For example, both these declarations are illegal:
2329 <ProgramListing>
2330   class (?x::Int) => C a where ...
2331   instance (?x::a) => Foo [a] where ...
2332 </ProgramListing>
Reason: exactly which implicit parameter you pick up depends on exactly where
you invoke a function. But the "invocation" of instance declarations is done
behind the scenes by the compiler, so it's hard to figure out exactly where it is done.
The easiest thing is to outlaw the offending types.
</Para>
2337 </ListItem>
2338
2339 </ItemizedList>
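
<Para>
Until fuller documentation exists, a minimal sketch may help.  The
names <Function>scale</Function> and <Function>scaleAll</Function> are
hypothetical, and both the <Literal>LANGUAGE</Literal> pragma and the
<Literal>let ?x = ...</Literal> binding form are those of later GHC
releases, so treat them as assumptions rather than as the syntax of the
compiler described here.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE ImplicitParams #-}
module Main where

-- The constraint (?factor :: Int) says that an implicit parameter
-- named ?factor must be bound wherever scale is called.
scale :: (?factor :: Int) => Int -> Int
scale x = ?factor * x

-- scaleAll never mentions ?factor in its body; it simply passes the
-- requirement on to its own caller through its type.
scaleAll :: (?factor :: Int) => [Int] -> [Int]
scaleAll = map scale

main :: IO ()
main = let ?factor = 3 in print (scaleAll [1, 2, 3])
</ProgramListing>

</Para>

<Para>
The binding in <Function>main</Function> is picked up at the call site,
so the program should print <Literal>[3,6,9]</Literal>.  This is also why
the restriction above exists: the parameter that is picked up depends on
where the function is invoked.
</Para>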
2340
2341 </Sect1>
2342
2343
2344 <Sect1 id="functional-dependencies">
2345 <Title>Functional dependencies
2346 </Title>
2347
2348 <Para> Functional dependencies are implemented as described by Mark Jones
2349 in "Type Classes with Functional Dependencies", Mark P. Jones,
2350 In Proceedings of the 9th European Symposium on Programming,
2351 ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782.
2352 </Para>
2353
2354 <Para>
2355 There should be more documentation, but there isn't (yet).  Yell if you need it.
2356 </Para>
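
<Para>
As a stop-gap, here is a minimal sketch of a class with a functional
dependency, assuming the syntax of Mark Jones's paper.  It revisits a
<Literal>Coll</Literal> class whose <Literal>empty</Literal> operation
does not mention the element type; the list instance is an assumption
made for illustration, and the <Literal>LANGUAGE</Literal> pragmas are the
spelling of later GHC releases.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
module Main where

-- The dependency "s -> a" states that the collection type s
-- determines the element type a.
class Coll s a | s -> a where
  empty  :: s
  insert :: s -> a -> s

-- A list of a's is a collection whose elements have type a.
instance Coll [a] a where
  empty       = []
  insert xs x = x : xs

main :: IO ()
main = print (insert (insert empty 1) 2 :: [Int])
</ProgramListing>

</Para>

<Para>
Because <Literal>s</Literal> determines <Literal>a</Literal>, the type of
<Literal>empty</Literal> is no longer ambiguous even though it mentions
only <Literal>s</Literal>.  The annotation <Literal>[Int]</Literal> fixes
<Literal>s</Literal>, the dependency then fixes <Literal>a</Literal>, and
the program should print <Literal>[2,1]</Literal>.
</Para>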
2357 </Sect1>
2358
2359
2360 <Sect1 id="universal-quantification">
2361 <Title>Explicit universal quantification
2362 </Title>
2363
2364 <Para>
2365 GHC's type system supports explicit universal quantification in
2366 constructor fields and function arguments.  This is useful for things
2367 like defining <Literal>runST</Literal> from the state-thread world.
2368 GHC's syntax for this now agrees with Hugs's, namely:
2369 </Para>
2370
2371 <Para>
2372
2373 <ProgramListing>
2374         forall a b. (Ord a, Eq  b) => a -> b -> a
2375 </ProgramListing>
2376
2377 </Para>
2378
2379 <Para>
2380 The context is, of course, optional.  You can't use <Literal>forall</Literal> as
2381 a type variable any more!
2382 </Para>
2383
2384 <Para>
2385 Haskell type signatures are implicitly quantified.  The <Literal>forall</Literal>
2386 allows us to say exactly what this means.  For example:
2387 </Para>
2388
2389 <Para>
2390
2391 <ProgramListing>
2392         g :: b -> b
2393 </ProgramListing>
2394
2395 </Para>
2396
2397 <Para>
2398 means this:
2399 </Para>
2400
2401 <Para>
2402
2403 <ProgramListing>
2404         g :: forall b. (b -> b)
2405 </ProgramListing>
2406
2407 </Para>
2408
2409 <Para>
2410 The two are treated identically.
2411 </Para>
2412
2413 <Sect2 id="univ">
2414 <Title>Universally-quantified data type fields
2415 </Title>
2416
2417 <Para>
2418 In a <Literal>data</Literal> or <Literal>newtype</Literal> declaration one can quantify
2419 the types of the constructor arguments.  Here are several examples:
2420 </Para>
2421
2422 <Para>
2423
2424 <ProgramListing>
2425 data T a = T1 (forall b. b -> b -> b) a
2426
2427 data MonadT m = MkMonad { return :: forall a. a -> m a,
2428                           bind   :: forall a b. m a -> (a -> m b) -> m b
2429                         }
2430
2431 newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
2432 </ProgramListing>
2433
2434 </Para>
2435
2436 <Para>
2437 The constructors now have so-called <Emphasis>rank 2</Emphasis> polymorphic
types, in which there is a for-all in the argument types:
2439 </Para>
2440
2441 <Para>
2442
2443 <ProgramListing>
2444 T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
2445 MkMonad :: forall m. (forall a. a -> m a)
2446                   -> (forall a b. m a -> (a -> m b) -> m b)
2447                   -> MonadT m
2448 MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
2449 </ProgramListing>
2450
2451 </Para>
2452
2453 <Para>
2454 Notice that you don't need to use a <Literal>forall</Literal> if there's an
2455 explicit context.  For example in the first argument of the
2456 constructor <Function>MkSwizzle</Function>, an implicit "<Literal>forall a.</Literal>" is
2457 prefixed to the argument type.  The implicit <Literal>forall</Literal>
2458 quantifies all type variables that are not already in scope, and are
2459 mentioned in the type quantified over.
2460 </Para>
2461
2462 <Para>
2463 As for type signatures, implicit quantification happens for non-overloaded
2464 types too.  So if you write this:
2465
2466 <ProgramListing>
2467   data T a = MkT (Either a b) (b -> b)
2468 </ProgramListing>
2469
2470 it's just as if you had written this:
2471
2472 <ProgramListing>
2473   data T a = MkT (forall b. Either a b) (forall b. b -> b)
2474 </ProgramListing>
2475
2476 That is, since the type variable <Literal>b</Literal> isn't in scope, it's
2477 implicitly universally quantified.  (Arguably, it would be better
2478 to <Emphasis>require</Emphasis> explicit quantification on constructor arguments
2479 where that is what is wanted.  Feedback welcomed.)
2480 </Para>
2481
2482 </Sect2>
2483
2484 <Sect2>
2485 <Title>Construction </Title>
2486
2487 <Para>
2488 You construct values of types <Literal>T1, MonadT, Swizzle</Literal> by applying
2489 the constructor to suitable values, just as usual.  For example,
2490 </Para>
2491
2492 <Para>
2493
2494 <ProgramListing>
(T1 (\x y -> x) 3) :: T Int
2496
2497 (MkSwizzle sort)    :: Swizzle
2498 (MkSwizzle reverse) :: Swizzle
2499
2500 (let r x = Just x
2501      b m k = case m of
2502                 Just y -> k y
2503                 Nothing -> Nothing
2504   in
2505   MkMonad r b) :: MonadT Maybe
2506 </ProgramListing>
2507
2508 </Para>
2509
2510 <Para>
2511 The type of the argument can, as usual, be more general than the type
2512 required, as <Literal>(MkSwizzle reverse)</Literal> shows.  (<Function>reverse</Function>
2513 does not need the <Literal>Ord</Literal> constraint.)
2514 </Para>
2515
2516 </Sect2>
2517
2518 <Sect2>
2519 <Title>Pattern matching</Title>
2520
2521 <Para>
2522 When you use pattern matching, the bound variables may now have
2523 polymorphic types.  For example:
2524 </Para>
2525
2526 <Para>
2527
2528 <ProgramListing>
2529         f :: T a -> a -> (a, Char)
        f (T1 w k) x = (w k x, w 'c' 'd')
2531
2532         g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
2533         g (MkSwizzle s) xs f = s (map f (s xs))
2534
2535         h :: MonadT m -> [m a] -> m [a]
2536         h m [] = return m []
2537         h m (x:xs) = bind m x           $ \y ->
2538                       bind m (h m xs)   $ \ys ->
2539                       return m (y:ys)
2540 </ProgramListing>
2541
2542 </Para>
2543
2544 <Para>
2545 In the function <Function>h</Function> we use the record selectors <Literal>return</Literal>
2546 and <Literal>bind</Literal> to extract the polymorphic bind and return functions
2547 from the <Literal>MonadT</Literal> data structure, rather than using pattern
2548 matching.
2549 </Para>
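
<Para>
Construction and pattern matching for <Literal>Swizzle</Literal> can be
tried out together in a short program.  This is a sketch under two
assumptions: the <Literal>forall</Literal> in the constructor argument is
written out explicitly, and the <Literal>LANGUAGE</Literal> pragma is the
spelling of later GHC releases (with the compiler described here, use
<Option>-fglasgow-exts</Option>).
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.List (sort)

-- The constructor argument is itself polymorphic.
newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

-- The bound variable s is used at two element types: a and b.
g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
g (MkSwizzle s) xs f = s (map f (s xs))

main :: IO ()
main = do
  print (g (MkSwizzle sort) [3, 1, 2 :: Int] show)  -- s used at Int and String
  print (g (MkSwizzle reverse) "abc" fromEnum)      -- s used at Char and Int
</ProgramListing>

</Para>

<Para>
The first call should print <Literal>["1","2","3"]</Literal> and the
second <Literal>[97,98,99]</Literal>.  Neither call would type-check if
<Literal>s</Literal> were monomorphic, because it is applied to lists of
two different element types within one right-hand side.
</Para>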
2550
2551 <Para>
2552 You cannot pattern-match against an argument that is polymorphic.
2553 For example:
2554
2555 <ProgramListing>
2556         newtype TIM s a = TIM (ST s (Maybe a))
2557
2558         runTIM :: (forall s. TIM s a) -> Maybe a
2559         runTIM (TIM m) = runST m
2560 </ProgramListing>
2561
2562 </Para>
2563
2564 <Para>
2565 Here the pattern-match fails, because you can't pattern-match against
2566 an argument of type <Literal>(forall s. TIM s a)</Literal>.  Instead you
2567 must bind the variable and pattern match in the right hand side:
2568
2569 <ProgramListing>
2570         runTIM :: (forall s. TIM s a) -> Maybe a
2571         runTIM tm = case tm of { TIM m -> runST m }
2572 </ProgramListing>
2573
2574 The <Literal>tm</Literal> on the right hand side is (invisibly) instantiated, like
2575 any polymorphic value at its occurrence site, and now you can pattern-match
2576 against it.
2577 </Para>
2578
2579 </Sect2>
2580
2581 <Sect2>
2582 <Title>The partial-application restriction</Title>
2583
2584 <Para>
2585 There is really only one way in which data structures with polymorphic
2586 components might surprise you: you must not partially apply them.
2587 For example, this is illegal:
2588 </Para>
2589
2590 <Para>
2591
2592 <ProgramListing>
2593         map MkSwizzle [sort, reverse]
2594 </ProgramListing>
2595
2596 </Para>
2597
2598 <Para>
2599 The restriction is this: <Emphasis>every subexpression of the program must
2600 have a type that has no for-alls, except that in a function
2601 application (f e1&hellip;en) the partial applications are not subject to
2602 this rule</Emphasis>.  The restriction makes type inference feasible.
2603 </Para>
2604
2605 <Para>
2606 In the illegal example, the sub-expression <Literal>MkSwizzle</Literal> has the
2607 polymorphic type <Literal>(Ord b => [b] -> [b]) -> Swizzle</Literal> and is not
2608 a sub-expression of an enclosing application.  On the other hand, this
2609 expression is OK:
2610 </Para>
2611
2612 <Para>
2613
2614 <ProgramListing>
2615         map (T1 (\a b -> a)) [1,2,3]
2616 </ProgramListing>
2617
2618 </Para>
2619
2620 <Para>
2621 even though it involves a partial application of <Function>T1</Function>, because
2622 the sub-expression <Literal>T1 (\a b -> a)</Literal> has type <Literal>Int -> T
2623 Int</Literal>.
2624 </Para>
2625
2626 </Sect2>
2627
2628 <Sect2 id="sigs">
2629 <Title>Type signatures
2630 </Title>
2631
2632 <Para>
2633 Once you have data constructors with universally-quantified fields, or
2634 constants such as <Constant>runST</Constant> that have rank-2 types, it isn't long
2635 before you discover that you need more!  Consider:
2636 </Para>
2637
2638 <Para>
2639
2640 <ProgramListing>
2641   mkTs f x y = [T1 f x, T1 f y]
2642 </ProgramListing>
2643
2644 </Para>
2645
2646 <Para>
<Function>mkTs</Function> is a function that constructs some values of type
2648 <Literal>T</Literal>, using some pieces passed to it.  The trouble is that since
2649 <Literal>f</Literal> is a function argument, Haskell assumes that it is
2650 monomorphic, so we'll get a type error when applying <Function>T1</Function> to
2651 it.  This is a rather silly example, but the problem really bites in
2652 practice.  Lots of people trip over the fact that you can't make
"wrapper functions" for <Constant>runST</Constant> for exactly the same reason.
2654 In short, it is impossible to build abstractions around functions with
2655 rank-2 types.
2656 </Para>
2657
2658 <Para>
2659 The solution is fairly clear.  We provide the ability to give a rank-2
2660 type signature for <Emphasis>ordinary</Emphasis> functions (not only data
2661 constructors), thus:
2662 </Para>
2663
2664 <Para>
2665
2666 <ProgramListing>
  mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
2668   mkTs f x y = [T1 f x, T1 f y]
2669 </ProgramListing>
2670
2671 </Para>
2672
2673 <Para>
2674 This type signature tells the compiler to attribute <Literal>f</Literal> with
2675 the polymorphic type <Literal>(forall b. b -> b -> b)</Literal> when type
2676 checking the body of <Function>mkTs</Function>, so now the application of
2677 <Function>T1</Function> is fine.
2678 </Para>
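
<Para>
Here is the same idea as a complete program.  It is a sketch: the
helper <Function>pick</Function> is hypothetical, the signature gives
<Function>mkTs</Function> one argument type for each of its three
arguments, and the <Literal>LANGUAGE</Literal> pragma is the spelling of
later GHC releases.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE RankNTypes #-}
module Main where

data T a = T1 (forall b. b -> b -> b) a

-- The rank-2 signature lets f keep its polymorphic type inside the body.
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
mkTs f x y = [T1 f x, T1 f y]

-- Hypothetical helper: apply the stored function to the stored value
-- and a caller-supplied default.
pick :: T a -> a -> a
pick (T1 w k) d = w k d

main :: IO ()
main = print (map (\t -> pick t 0) (mkTs (\a _ -> a) 1 (2 :: Int)))
</ProgramListing>

</Para>

<Para>
The function passed to <Function>mkTs</Function> always returns its
first argument, so the program should print <Literal>[1,2]</Literal>.
Without the signature, <Literal>f</Literal> would be treated as
monomorphic and the applications of <Function>T1</Function> would be
rejected.
</Para>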
2679
2680 <Para>
2681 There are two restrictions:
2682 </Para>
2683
2684 <Para>
2685
2686 <ItemizedList>
2687 <ListItem>
2688
2689 <Para>
2690  You can only define a rank 2 type, specified by the following
2691 grammar:
2692
2693
2694 <ProgramListing>
2695 rank2type ::= [forall tyvars .] [context =>] funty
2696 funty     ::= ([forall tyvars .] [context =>] ty) -> funty
2697             | ty
2698 ty        ::= ...current Haskell monotype syntax...
2699 </ProgramListing>
2700
2701
2702 Informally, the universal quantification must all be right at the beginning,
2703 or at the top level of a function argument.
2704
2705 </Para>
2706 </ListItem>
2707 <ListItem>
2708
2709 <Para>
2710  There is a restriction on the definition of a function whose
2711 type signature is a rank-2 type: the polymorphic arguments must be
2712 matched on the left hand side of the "<Literal>=</Literal>" sign.  You can't
2713 define <Function>mkTs</Function> like this:
2714
2715
2716 <ProgramListing>
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
2718 mkTs = \ f x y -> [T1 f x, T1 f y]
2719 </ProgramListing>
2720
2721
2722
2723 The same partial-application rule applies to ordinary functions with
2724 rank-2 types as applied to data constructors.
2725
2726 </Para>
2727 </ListItem>
2728
2729 </ItemizedList>
2730
2731 </Para>
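
<Para>
To see what the rank-2 signature buys you, here is a minimal, self-contained
sketch.  The declaration of <Literal>T</Literal> is assumed to match the one given
earlier in this section, and <Function>useT</Function> is a hypothetical consumer
added purely for illustration.  The <Literal>LANGUAGE</Literal> pragma is how more
recent GHC releases switch the feature on; with the compiler described here,
<Option>-fglasgow-exts</Option> plays that role.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE RankNTypes #-}
module Main where

-- Assumed to match the earlier definition of T.
data T a = T1 (forall b. b -> b -> b) a

-- The polymorphic argument f is matched on the left of the "=" sign.
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
mkTs f x y = [T1 f x, T1 f y]

-- Hypothetical consumer: applies the stored function at two different
-- types, which is only possible because the field is polymorphic.
useT :: T Int -> (Int, Bool)
useT (T1 f n) = (f n 0, f True False)

main :: IO ()
main = print (map useT (mkTs (\p _ -> p) 1 2))
-- prints [(1,True),(2,True)]
</ProgramListing>

</Para>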
2732
2733 </Sect2>
2734
2735
2736 <Sect2 id="hoist">
2737 <Title>Type synonyms and hoisting
2738 </Title>
2739
2740 <Para>
2741 GHC also allows you to write a <Literal>forall</Literal> in a type synonym, thus:
2742 <ProgramListing>
2743   type Discard a = forall b. a -> b -> a
2744
2745   f :: Discard a
2746   f x y = x
2747 </ProgramListing>
However, it is often convenient to use this sort of synonym at the right hand
end of an arrow, thus:
2750 <ProgramListing>
2751   type Discard a = forall b. a -> b -> a
2752
2753   g :: Int -> Discard Int
2754   g x y z = x+y
2755 </ProgramListing>
2756 Simply expanding the type synonym would give
2757 <ProgramListing>
2758   g :: Int -> (forall b. Int -> b -> Int)
2759 </ProgramListing>
2760 but GHC "hoists" the <Literal>forall</Literal> to give the isomorphic type
2761 <ProgramListing>
2762   g :: forall b. Int -> Int -> b -> Int
2763 </ProgramListing>
2764 In general, the rule is this: <Emphasis>to determine the type specified by any explicit
2765 user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly
2766 performs the transformation:</Emphasis>
2767 <ProgramListing>
2768   <Emphasis>type1</Emphasis> -> forall a. <Emphasis>type2</Emphasis>
2769 ==>
2770   forall a. <Emphasis>type1</Emphasis> -> <Emphasis>type2</Emphasis>
2771 </ProgramListing>
2772 (In fact, GHC tries to retain as much synonym information as possible for use in
2773 error messages, but that is a usability issue.)  This rule applies, of course, whether
2774 or not the <Literal>forall</Literal> comes from a synonym. For example, here is another
2775 valid way to write <Literal>g</Literal>'s type signature:
2776 <ProgramListing>
2777   g :: Int -> Int -> forall b. b -> Int
2778 </ProgramListing>
2779 </Para>
2780 </Sect2>
2781
2782 </Sect1>
2783
2784 <Sect1 id="existential-quantification">
2785 <Title>Existentially quantified data constructors
2786 </Title>
2787
2788 <Para>
2789 The idea of using existential quantification in data type declarations
was suggested by Laufer (I believe, though doubtless someone will
2791 correct me), and implemented in Hope+. It's been in Lennart
2792 Augustsson's <Command>hbc</Command> Haskell compiler for several years, and
2793 proved very useful.  Here's the idea.  Consider the declaration:
2794 </Para>
2795
2796 <Para>
2797
2798 <ProgramListing>
2799   data Foo = forall a. MkFoo a (a -> Bool)
2800            | Nil
2801 </ProgramListing>
2802
2803 </Para>
2804
2805 <Para>
2806 The data type <Literal>Foo</Literal> has two constructors with types:
2807 </Para>
2808
2809 <Para>
2810
2811 <ProgramListing>
2812   MkFoo :: forall a. a -> (a -> Bool) -> Foo
2813   Nil   :: Foo
2814 </ProgramListing>
2815
2816 </Para>
2817
2818 <Para>
2819 Notice that the type variable <Literal>a</Literal> in the type of <Function>MkFoo</Function>
2820 does not appear in the data type itself, which is plain <Literal>Foo</Literal>.
2821 For example, the following expression is fine:
2822 </Para>
2823
2824 <Para>
2825
2826 <ProgramListing>
2827   [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
2828 </ProgramListing>
2829
2830 </Para>
2831
2832 <Para>
2833 Here, <Literal>(MkFoo 3 even)</Literal> packages an integer with a function
2834 <Function>even</Function> that maps an integer to <Literal>Bool</Literal>; and <Function>MkFoo 'c'
2835 isUpper</Function> packages a character with a compatible function.  These
2836 two things are each of type <Literal>Foo</Literal> and can be put in a list.
2837 </Para>
2838
2839 <Para>
What can we do with a value of type <Literal>Foo</Literal>?  In particular,
2841 what happens when we pattern-match on <Function>MkFoo</Function>?
2842 </Para>
2843
2844 <Para>
2845
2846 <ProgramListing>
2847   f (MkFoo val fn) = ???
2848 </ProgramListing>
2849
2850 </Para>
2851
2852 <Para>
2853 Since all we know about <Literal>val</Literal> and <Function>fn</Function> is that they
2854 are compatible, the only (useful) thing we can do with them is to
2855 apply <Function>fn</Function> to <Literal>val</Literal> to get a boolean.  For example:
2856 </Para>
2857
2858 <Para>
2859
2860 <ProgramListing>
2861   f :: Foo -> Bool
2862   f (MkFoo val fn) = fn val
2863 </ProgramListing>
2864
2865 </Para>
2866
2867 <Para>
What this allows us to do is to package heterogeneous values
2869 together with a bunch of functions that manipulate them, and then treat
2870 that collection of packages in a uniform manner.  You can express
2871 quite a bit of object-oriented-like programming this way.
2872 </Para>
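
<Para>
Putting the pieces above together gives the following small program.  It is
only a sketch: the <Literal>Nil</Literal> case of <Function>f</Function>, the list
<Literal>xs</Literal> and its element values are our own choices, and the
<Literal>LANGUAGE</Literal> pragma is the spelling used by more recent GHC releases
(with the compiler described here, use <Option>-fglasgow-exts</Option> instead).
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

-- The only useful thing to do with the package is apply fn to val.
f :: Foo -> Bool
f (MkFoo val fn) = fn val
f Nil            = False      -- assumption: an empty package counts as False

-- A heterogeneous list: an Int with even, a Char with isUpper.
xs :: [Foo]
xs = [MkFoo (4 :: Int) even, MkFoo 'C' isUpper, Nil]

main :: IO ()
main = print (map f xs)
-- prints [True,True,False]
</ProgramListing>

</Para>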
2873
2874 <Sect2 id="existential">
2875 <Title>Why existential?
2876 </Title>
2877
2878 <Para>
2879 What has this to do with <Emphasis>existential</Emphasis> quantification?
2880 Simply that <Function>MkFoo</Function> has the (nearly) isomorphic type
2881 </Para>
2882
2883 <Para>
2884
2885 <ProgramListing>
2886   MkFoo :: (exists a . (a, a -> Bool)) -> Foo
2887 </ProgramListing>
2888
2889 </Para>
2890
2891 <Para>
2892 But Haskell programmers can safely think of the ordinary
2893 <Emphasis>universally</Emphasis> quantified type given above, thereby avoiding
2894 adding a new existential quantification construct.
2895 </Para>
2896
2897 </Sect2>
2898
2899 <Sect2>
2900 <Title>Type classes</Title>
2901
2902 <Para>
2903 An easy extension (implemented in <Command>hbc</Command>) is to allow
2904 arbitrary contexts before the constructor.  For example:
2905 </Para>
2906
2907 <Para>
2908
2909 <ProgramListing>
2910 data Baz = forall a. Eq a => Baz1 a a
2911          | forall b. Show b => Baz2 b (b -> b)
2912 </ProgramListing>
2913
2914 </Para>
2915
2916 <Para>
2917 The two constructors have the types you'd expect:
2918 </Para>
2919
2920 <Para>
2921
2922 <ProgramListing>
2923 Baz1 :: forall a. Eq a => a -> a -> Baz
2924 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
2925 </ProgramListing>
2926
2927 </Para>
2928
2929 <Para>
2930 But when pattern matching on <Function>Baz1</Function> the matched values can be compared
2931 for equality, and when pattern matching on <Function>Baz2</Function> the first matched
2932 value can be converted to a string (as well as applying the function to it).
2933 So this program is legal:
2934 </Para>
2935
2936 <Para>
2937
2938 <ProgramListing>
2939   f :: Baz -> String
2940   f (Baz1 p q) | p == q    = "Yes"
2941                | otherwise = "No"
  f (Baz2 v fn)            = show (fn v)
2943 </ProgramListing>
2944
2945 </Para>
2946
2947 <Para>
2948 Operationally, in a dictionary-passing implementation, the
2949 constructors <Function>Baz1</Function> and <Function>Baz2</Function> must store the
2950 dictionaries for <Literal>Eq</Literal> and <Literal>Show</Literal> respectively, and
extract them on pattern matching.
2952 </Para>
2953
2954 <Para>
2955 Notice the way that the syntax fits smoothly with that used for
2956 universal quantification earlier.
2957 </Para>
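
<Para>
As a runnable sketch of the above, the program below builds one value with each
constructor and hands them all to <Function>f</Function>.  The sample values in
<Function>main</Function> are our own; the <Literal>LANGUAGE</Literal> pragma is the
spelling used by more recent GHC releases.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

f :: Baz -> String
f (Baz1 p q) | p == q    = "Yes"          -- uses the stored Eq dictionary
             | otherwise = "No"
f (Baz2 v fn)            = show (fn v)    -- uses the stored Show dictionary

main :: IO ()
main = mapM_ (putStrLn . f)
         [ Baz1 'x' 'x'
         , Baz1 True False
         , Baz2 (20 :: Int) (+ 1)
         ]
-- prints Yes, No and 21 on separate lines
</ProgramListing>

</Para>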
2958
2959 </Sect2>
2960
2961 <Sect2>
2962 <Title>Restrictions</Title>
2963
2964 <Para>
2965 There are several restrictions on the ways in which existentially-quantified
constructors can be used.
2967 </Para>
2968
2969 <Para>
2970
2971 <ItemizedList>
2972 <ListItem>
2973
2974 <Para>
2975  When pattern matching, each pattern match introduces a new,
2976 distinct, type for each existential type variable.  These types cannot
2977 be unified with any other type, nor can they escape from the scope of
2978 the pattern match.  For example, these fragments are incorrect:
2979
2980
2981 <ProgramListing>
2982 f1 (MkFoo a f) = a
2983 </ProgramListing>
2984
2985
2986 Here, the type bound by <Function>MkFoo</Function> "escapes", because <Literal>a</Literal>
2987 is the result of <Function>f1</Function>.  One way to see why this is wrong is to
2988 ask what type <Function>f1</Function> has:
2989
2990
2991 <ProgramListing>
2992   f1 :: Foo -> a             -- Weird!
2993 </ProgramListing>
2994
2995
2996 What is this "<Literal>a</Literal>" in the result type? Clearly we don't mean
2997 this:
2998
2999
3000 <ProgramListing>
3001   f1 :: forall a. Foo -> a   -- Wrong!
3002 </ProgramListing>
3003
3004
The original program is just plain wrong.  Here's another sort of error:
3006
3007
3008 <ProgramListing>
3009   f2 (Baz1 a b) (Baz1 p q) = a==q
3010 </ProgramListing>
3011
3012
3013 It's ok to say <Literal>a==b</Literal> or <Literal>p==q</Literal>, but
3014 <Literal>a==q</Literal> is wrong because it equates the two distinct types arising
3015 from the two <Function>Baz1</Function> constructors.
3016
3017
3018 </Para>
3019 </ListItem>
3020 <ListItem>
3021
3022 <Para>
3023 You can't pattern-match on an existentially quantified
3024 constructor in a <Literal>let</Literal> or <Literal>where</Literal> group of
3025 bindings. So this is illegal:
3026
3027
3028 <ProgramListing>
3029   f3 x = a==b where { Baz1 a b = x }
3030 </ProgramListing>
3031
3032
3033 You can only pattern-match
3034 on an existentially-quantified constructor in a <Literal>case</Literal> expression or
3035 in the patterns of a function definition.
3036
3037 The reason for this restriction is really an implementation one.
3038 Type-checking binding groups is already a nightmare without
3039 existentials complicating the picture.  Also an existential pattern
3040 binding at the top level of a module doesn't make sense, because it's
3041 not clear how to prevent the existentially-quantified type "escaping".
3042 So for now, there's a simple-to-state restriction.  We'll see how
3043 annoying it is.
3044
3045 </Para>
3046 </ListItem>
3047 <ListItem>
3048
3049 <Para>
3050 You can't use existential quantification for <Literal>newtype</Literal>
3051 declarations.  So this is illegal:
3052
3053
3054 <ProgramListing>
3055   newtype T = forall a. Ord a => MkT a
3056 </ProgramListing>
3057
3058
3059 Reason: a value of type <Literal>T</Literal> must be represented as a pair
3060 of a dictionary for <Literal>Ord t</Literal> and a value of type <Literal>t</Literal>.
3061 That contradicts the idea that <Literal>newtype</Literal> should have no
3062 concrete representation.  You can get just the same efficiency and effect
3063 by using <Literal>data</Literal> instead of <Literal>newtype</Literal>.  If there is no
3064 overloading involved, then there is more of a case for allowing
an existentially-quantified <Literal>newtype</Literal>, because the <Literal>data</Literal>
version does carry an implementation cost,
3067 but single-field existentially quantified constructors aren't much
3068 use.  So the simple restriction (no existential stuff on <Literal>newtype</Literal>)
3069 stands, unless there are convincing reasons to change it.
3070
3071
3072 </Para>
3073 </ListItem>
3074 <ListItem>
3075
3076 <Para>
3077  You can't use <Literal>deriving</Literal> to define instances of a
3078 data type with existentially quantified data constructors.
3079
Reason: in most cases it would not make sense. For example:
3081
3082 <ProgramListing>
3083 data T = forall a. MkT [a] deriving( Eq )
3084 </ProgramListing>
3085
3086 To derive <Literal>Eq</Literal> in the standard way we would need to have equality
3087 between the single component of two <Function>MkT</Function> constructors:
3088
3089 <ProgramListing>
3090 instance Eq T where
3091   (MkT a) == (MkT b) = ???
3092 </ProgramListing>
3093
3094 But <VarName>a</VarName> and <VarName>b</VarName> have distinct types, and so can't be compared.
3095 It's just about possible to imagine examples in which the derived instance
3096 would make sense, but it seems altogether simpler simply to prohibit such
3097 declarations.  Define your own instances!
3098 </Para>
3099 </ListItem>
3100
3101 </ItemizedList>
3102
3103 </Para>
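
<Para>
The restrictions are easy to live with in practice.  The sketch below shows the
two usual workarounds: a <Literal>case</Literal> expression in place of a
<Literal>where</Literal> pattern binding, and a hand-written instance in place of
<Literal>deriving</Literal>.  Comparing lengths is a hypothetical notion of
equality, chosen only because it needs nothing from the hidden element type.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

-- Open the existential package with case, not with let or where.
f3 :: Baz -> Bool
f3 x = case x of
  Baz1 a b -> a == b
  Baz2 _ _ -> False        -- arbitrary choice for this sketch

data T = forall a. MkT [a]

-- Hand-written instance: the element types of the two lists are
-- distinct, so we only compare something type-independent.
instance Eq T where
  MkT xs == MkT ys = length xs == length ys

main :: IO ()
main = print (f3 (Baz1 'a' 'a'), MkT "ab" == MkT [True, False])
-- prints (True,True)
</ProgramListing>

</Para>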
3104
3105 </Sect2>
3106
3107 </Sect1>
3108
3109 <Sect1 id="sec-assertions">
3110 <Title>Assertions
3111 <IndexTerm><Primary>Assertions</Primary></IndexTerm>
3112 </Title>
3113
3114 <Para>
3115 If you want to make use of assertions in your standard Haskell code, you
3116 could define a function like the following:
3117 </Para>
3118
3119 <Para>
3120
3121 <ProgramListing>
3122 assert :: Bool -> a -> a
3123 assert False x = error "assertion failed!"
3124 assert _     x = x
3125 </ProgramListing>
3126
3127 </Para>
3128
3129 <Para>
3130 which works, but gives you back a less than useful error message --
3131 an assertion failed, but which and where?
3132 </Para>
3133
3134 <Para>
3135 One way out is to define an extended <Function>assert</Function> function which also
3136 takes a descriptive string to include in the error message and
3137 perhaps combine this with the use of a pre-processor which inserts
3138 the source location where <Function>assert</Function> was used.
3139 </Para>
3140
3141 <Para>
GHC offers a helping hand here, doing all of this for you. Given a
use of <Function>assert</Function> in the user's source, such as:
3144 </Para>
3145
3146 <Para>
3147
3148 <ProgramListing>
3149 kelvinToC :: Double -> Double
kelvinToC k = assert (k &gt;= 0.0) (k-273.15)
3151 </ProgramListing>
3152
3153 </Para>
3154
3155 <Para>
GHC will rewrite this to also include the source location where the
assertion was made:
3158 </Para>
3159
3160 <Para>
3161
3162 <ProgramListing>
3163 assert pred val ==> assertError "Main.hs|15" pred val
3164 </ProgramListing>
3165
3166 </Para>
3167
3168 <Para>
3169 The rewrite is only performed by the compiler when it spots
3170 applications of <Function>Exception.assert</Function>, so you can still define and
3171 use your own versions of <Function>assert</Function>, should you so wish. If not,
import <Literal>Exception</Literal> to make use of <Function>assert</Function> in your code.
3173 </Para>
3174
3175 <Para>
3176 To have the compiler ignore uses of assert, use the compiler option
3177 <Option>-fignore-asserts</Option>. <IndexTerm><Primary>-fignore-asserts option</Primary></IndexTerm> That is,
3178 expressions of the form <Literal>assert pred e</Literal> will be rewritten to <Literal>e</Literal>.
3179 </Para>
3180
3181 <Para>
Assertion failures can be caught; see the documentation for the
3183 <literal>Exception</literal> library (<xref linkend="sec-Exception">)
3184 for the details.
3185 </Para>
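
<Para>
A complete program using the built-in <Function>assert</Function> might look like
the sketch below.  One assumption to flag: in more recent GHC releases the
function is exported from <Literal>Control.Exception</Literal>, which is what the
import here uses; with the libraries described in this guide the module is
<Literal>Exception</Literal>.
</Para>

<Para>

<ProgramListing>
module Main where

import Control.Exception (assert)   -- "Exception" in the libraries described here

kelvinToC :: Double -> Double
kelvinToC k = assert (k >= 0.0) (k - 273.15)

main :: IO ()
main = do
  print (kelvinToC 300.0)      -- assertion holds, the result is returned
  print (kelvinToC (-1.0))     -- assertion fails unless asserts are ignored
</ProgramListing>

</Para>

<Para>
Compiled normally, the second call reports the source location of the failed
assertion; compiled with <Option>-fignore-asserts</Option> (or with optimisation on
in more recent releases, which also turns assertions off), it simply returns a value.
</Para>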
3186
3187 </Sect1>
3188
3189 <Sect1 id="scoped-type-variables">
3190 <Title>Scoped Type Variables
3191 </Title>
3192
3193 <Para>
3194 A <Emphasis>pattern type signature</Emphasis> can introduce a <Emphasis>scoped type
3195 variable</Emphasis>.  For example
3196 </Para>
3197
3198 <Para>
3199
3200 <ProgramListing>
3201 f (xs::[a]) = ys ++ ys
3202            where
3203               ys :: [a]
3204               ys = reverse xs
3205 </ProgramListing>
3206
3207 </Para>
3208
3209 <Para>
3210 The pattern <Literal>(xs::[a])</Literal> includes a type signature for <VarName>xs</VarName>.
3211 This brings the type variable <Literal>a</Literal> into scope; it scopes over
3212 all the patterns and right hand sides for this equation for <Function>f</Function>.
In particular, it is in scope at the type signature for <VarName>ys</VarName>.
3214 </Para>
3215
3216 <Para>
3217 At ordinary type signatures, such as that for <VarName>ys</VarName>, any type variables
3218 mentioned in the type signature <Emphasis>that are not in scope</Emphasis> are
3219 implicitly universally quantified.  (If there are no type variables in
3220 scope, all type variables mentioned in the signature are universally
3221 quantified, which is just as in Haskell 98.)  In this case, since <VarName>a</VarName>
3222 is in scope, it is not universally quantified, so the type of <VarName>ys</VarName> is
3223 the same as that of <VarName>xs</VarName>.  In Haskell 98 it is not possible to declare
3224 a type for <VarName>ys</VarName>; a major benefit of scoped type variables is that
3225 it becomes possible to do so.
3226 </Para>
3227
3228 <Para>
3229 Scoped type variables are implemented in both GHC and Hugs.  Where the
3230 implementations differ from the specification below, those differences
3231 are noted.
3232 </Para>
3233
3234 <Para>
3235 So much for the basic idea.  Here are the details.
3236 </Para>
3237
3238 <Sect2>
3239 <Title>Scope and implicit quantification</Title>
3240
3241 <Para>
3242
3243 <ItemizedList>
3244 <ListItem>
3245
3246 <Para>
3247  All the type variables mentioned in the patterns for a single
3248 function definition equation, that are not already in scope,
3249 are brought into scope by the patterns.  We describe this set as
3250 the <Emphasis>type variables bound by the equation</Emphasis>.
3251
3252 </Para>
3253 </ListItem>
3254 <ListItem>
3255
3256 <Para>
3257  The type variables thus brought into scope may be mentioned
3258 in ordinary type signatures or pattern type signatures anywhere within
3259 their scope.
3260
3261 </Para>
3262 </ListItem>
3263 <ListItem>
3264
3265 <Para>
3266  In ordinary type signatures, any type variable mentioned in the
3267 signature that is in scope is <Emphasis>not</Emphasis> universally quantified.
3268
3269 </Para>
3270 </ListItem>
3271 <ListItem>
3272
3273 <Para>
3274  Ordinary type signatures do not bring any new type variables
3275 into scope (except in the type signature itself!). So this is illegal:
3276
3277
3278 <ProgramListing>
3279   f :: a -> a
3280   f x = x::a
3281 </ProgramListing>
3282
3283
3284 It's illegal because <VarName>a</VarName> is not in scope in the body of <Function>f</Function>,
3285 so the ordinary signature <Literal>x::a</Literal> is equivalent to <Literal>x::forall a.a</Literal>;
3286 and that is an incorrect typing.
3287
3288 </Para>
3289 </ListItem>
3290 <ListItem>
3291
3292 <Para>
3293  There is no implicit universal quantification on pattern type
3294 signatures, nor may one write an explicit <Literal>forall</Literal> type in a pattern
3295 type signature.  The pattern type signature is a monotype.
3296
3297 </Para>
3298 </ListItem>
3299 <ListItem>
3300
3301 <Para>
3302
3303 The type variables in the head of a <Literal>class</Literal> or <Literal>instance</Literal> declaration
3304 scope over the methods defined in the <Literal>where</Literal> part.  For example:
3305
3306
3307 <ProgramListing>
3308   class C a where
3309     op :: [a] -> a
3310
3311     op xs = let ys::[a]
3312                 ys = reverse xs
3313             in
3314             head ys
3315 </ProgramListing>
3316
3317
3318 (Not implemented in Hugs yet, Dec 98).
3319 </Para>
3320 </ListItem>
3321
3322 </ItemizedList>
3323
3324 </Para>
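
<Para>
The rules above can be seen at work in the following sketch.  The function name
<Function>twice</Function> is our own, and the <Literal>LANGUAGE</Literal> pragma is
the way more recent GHC releases enable the feature.  Note that the separate
type signature uses <Literal>a</Literal> while the pattern binds <Literal>b</Literal>:
only the pattern-bound variable is in scope in the <Literal>where</Literal> clause.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

twice :: [a] -> [a]
twice (xs :: [b]) = ys ++ ys
  where
    ys :: [b]            -- this b is the one bound by the pattern,
    ys = reverse xs      -- so ys has the same type as xs

main :: IO ()
main = print (twice [1, 2, 3 :: Int])
-- prints [3,2,1,3,2,1]
</ProgramListing>

</Para>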
3325
3326 </Sect2>
3327
3328 <Sect2>
3329 <Title>Polymorphism</Title>
3330
3331 <Para>
3332
3333 <ItemizedList>
3334 <ListItem>
3335
3336 <Para>
3337  Pattern type signatures are completely orthogonal to ordinary, separate
3338 type signatures.  The two can be used independently or together.  There is
3339 no scoping associated with the names of the type variables in a separate type signature.
3340
3341
3342 <ProgramListing>
3343    f :: [a] -> [a]
3344    f (xs::[b]) = reverse xs
3345 </ProgramListing>
3346
3347
3348 </Para>
3349 </ListItem>
3350 <ListItem>
3351
3352 <Para>
3353  The function must be polymorphic in the type variables
3354 bound by all its equations.  Operationally, the type variables bound
3355 by one equation must not:
3356
3357
3358 <ItemizedList>
3359 <ListItem>
3360
3361 <Para>
3362  Be unified with a type (such as <Literal>Int</Literal>, or <Literal>[a]</Literal>).
3363 </Para>
3364 </ListItem>
3365 <ListItem>
3366
3367 <Para>
3368  Be unified with a type variable free in the environment.
3369 </Para>
3370 </ListItem>
3371 <ListItem>
3372
3373 <Para>
3374  Be unified with each other.  (They may unify with the type variables
3375 bound by another equation for the same function, of course.)
3376 </Para>
3377 </ListItem>
3378
3379 </ItemizedList>
3380
3381
3382 For example, the following all fail to type check:
3383
3384
3385 <ProgramListing>
3386   f (x::a) (y::b) = [x,y]       -- a unifies with b
3387
3388   g (x::a) = x + 1::Int         -- a unifies with Int
3389
3390   h x = let k (y::a) = [x,y]    -- a is free in the
3391         in k x                  -- environment
3392
3393   k (x::a) True    = ...        -- a unifies with Int
3394   k (x::Int) False = ...
3395
3396   w :: [b] -> [b]
3397   w (x::a) = x                  -- a unifies with [b]
3398 </ProgramListing>
3399
3400
3401 </Para>
3402 </ListItem>
3403 <ListItem>
3404
3405 <Para>
3406  The pattern-bound type variable may, however, be constrained
3407 by the context of the principal type, thus:
3408
3409
3410 <ProgramListing>
3411   f (x::a) (y::a) = x+y*2
3412 </ProgramListing>
3413
3414
3415 gets the inferred type: <Literal>forall a. Num a =&gt; a -&gt; a -&gt; a</Literal>.
3416 </Para>
3417 </ListItem>
3418
3419 </ItemizedList>
3420
3421 </Para>
3422
3423 </Sect2>
3424
3425 <Sect2>
3426 <Title>Result type signatures</Title>
3427
3428 <Para>
3429
3430 <ItemizedList>
3431 <ListItem>
3432
3433 <Para>
3434  The result type of a function can be given a signature,
3435 thus:
3436
3437
3438 <ProgramListing>
3439   f (x::a) :: [a] = [x,x,x]
3440 </ProgramListing>
3441
3442
3443 The final <Literal>:: [a]</Literal> after all the patterns gives a signature to the
3444 result type.  Sometimes this is the only way of naming the type variable
3445 you want:
3446
3447
3448 <ProgramListing>
3449   f :: Int -> [a] -> [a]
3450   f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
3451                         in \xs -> map g (reverse xs `zip` xs)
3452 </ProgramListing>
3453
3454
3455 </Para>
3456 </ListItem>
3457
3458 </ItemizedList>
3459
3460 </Para>
3461
3462 <Para>
3463 Result type signatures are not yet implemented in Hugs.
3464 </Para>
3465
3466 </Sect2>
3467
3468 <Sect2>
3469 <Title>Pattern signatures on other constructs</Title>
3470
3471 <Para>
3472
3473 <ItemizedList>
3474 <ListItem>
3475
3476 <Para>
3477  A pattern type signature can be on an arbitrary sub-pattern, not
3478 just on a variable:
3479
3480
3481 <ProgramListing>
3482   f ((x,y)::(a,b)) = (y,x) :: (b,a)
3483 </ProgramListing>
3484
3485
3486 </Para>
3487 </ListItem>
3488 <ListItem>
3489
3490 <Para>
3491  Pattern type signatures, including the result part, can be used
3492 in lambda abstractions:
3493
3494
3495 <ProgramListing>
3496   (\ (x::a, y) :: a -> x)
3497 </ProgramListing>
3498
3499
3500 Type variables bound by these patterns must be polymorphic in
3501 the sense defined above.
3502 For example:
3503
3504
3505 <ProgramListing>
3506   f1 (x::c) = f1 x      -- ok
3507   f2 = \(x::c) -> f2 x  -- not ok
3508 </ProgramListing>
3509
3510
3511 Here, <Function>f1</Function> is OK, but <Function>f2</Function> is not, because <VarName>c</VarName> gets unified
3512 with a type variable free in the environment, in this
3513 case, the type of <Function>f2</Function>, which is in the environment when
3514 the lambda abstraction is checked.
3515
3516 </Para>
3517 </ListItem>
3518 <ListItem>
3519
3520 <Para>
3521  Pattern type signatures, including the result part, can be used
3522 in <Literal>case</Literal> expressions:
3523
3524
3525 <ProgramListing>
3526   case e of { (x::a, y) :: a -> x }
3527 </ProgramListing>
3528
3529
3530 The pattern-bound type variables must, as usual,
3531 be polymorphic in the following sense: each case alternative,
3532 considered as a lambda abstraction, must be polymorphic.
3533 Thus this is OK:
3534
3535
3536 <ProgramListing>
3537   case (True,False) of { (x::a, y) -> x }
3538 </ProgramListing>
3539
3540
3541 Even though the context is that of a pair of booleans,
3542 the alternative itself is polymorphic.  Of course, it is
3543 also OK to say:
3544
3545
3546 <ProgramListing>
3547   case (True,False) of { (x::Bool, y) -> x }
3548 </ProgramListing>
3549
3550
3551 </Para>
3552 </ListItem>
3553 <ListItem>
3554
3555 <Para>
3556 To avoid ambiguity, the type after the &ldquo;<Literal>::</Literal>&rdquo; in a result
3557 pattern signature on a lambda or <Literal>case</Literal> must be atomic (i.e. a single
3558 token or a parenthesised type of some sort).  To see why,
3559 consider how one would parse this:
3560
3561
3562 <ProgramListing>
3563   \ x :: a -> b -> x
3564 </ProgramListing>
3565
3566
3567 </Para>
3568 </ListItem>
3569 <ListItem>
3570
3571 <Para>
3572  Pattern type signatures that bind new type variables
3573 may not be used in pattern bindings at all.
3574 So this is illegal:
3575
3576
3577 <ProgramListing>
3578   f x = let (y, z::a) = x in ...
3579 </ProgramListing>
3580
3581
3582 But these are OK, because they do not bind fresh type variables:
3583
3584
3585 <ProgramListing>
3586   f1 x            = let (y, z::Int) = x in ...
3587   f2 (x::(Int,a)) = let (y, z::a)   = x in ...
3588 </ProgramListing>
3589
3590
3591 However a single variable is considered a degenerate function binding,
rather than a degenerate pattern binding, so this is permitted, even
3593 though it binds a type variable:
3594
3595
3596 <ProgramListing>
3597   f :: (b->b) = \(x::b) -> x
3598 </ProgramListing>
3599
3600
3601 </Para>
3602 </ListItem>
3603
3604 </ItemizedList>
3605
Such degenerate function bindings do not fall under the monomorphism
3607 restriction.  Thus:
3608 </Para>
3609
3610 <Para>
3611
3612 <ProgramListing>
  g :: a -> a -> Bool = \x y -> x==y
3614 </ProgramListing>
3615
3616 </Para>
3617
3618 <Para>
3619 Here <Function>g</Function> has type <Literal>forall a. Eq a =&gt; a -&gt; a -&gt; Bool</Literal>, just as if
3620 <Function>g</Function> had a separate type signature.  Lacking a type signature, <Function>g</Function>
3621 would get a monomorphic type.
3622 </Para>
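
<Para>
Here is a small sketch of pattern type signatures on a sub-pattern and in a
<Literal>case</Literal> alternative.  The names <Function>swapPair</Function> and
<Function>firstOf</Function> are hypothetical, separate type signatures have been
added so the example stands alone, and the <Literal>LANGUAGE</Literal> pragma is the
spelling used by more recent GHC releases.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

-- Signature on a whole tuple pattern; c and d are then in scope
-- for the expression signature on the right hand side.
swapPair :: (a, b) -> (b, a)
swapPair ((x, y) :: (c, d)) = (y, x) :: (d, c)

-- Signature on a variable inside a case alternative.
firstOf :: (Bool, Bool) -> Bool
firstOf e = case e of { (x :: Bool, _) -> x }

main :: IO ()
main = print (swapPair (1 :: Int, 'a'), firstOf (True, False))
-- prints (('a',1),True)
</ProgramListing>

</Para>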
3623
3624 </Sect2>
3625
3626 <Sect2>
3627 <Title>Existentials</Title>
3628
3629 <Para>
3630
3631 <ItemizedList>
3632 <ListItem>
3633
3634 <Para>
3635  Pattern type signatures can bind existential type variables.
3636 For example:
3637
3638
3639 <ProgramListing>
3640   data T = forall a. MkT [a]
3641
3642   f :: T -> T
3643   f (MkT [t::a]) = MkT t3
3644                  where
3645                    t3::[a] = [t,t,t]
3646 </ProgramListing>
3647
3648
3649 </Para>
3650 </ListItem>
3651
3652 </ItemizedList>
3653
3654 </Para>
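
<Para>
A runnable variant of the example above is sketched below.  The catch-all
equation for <Function>f</Function> and the helper <Function>size</Function> are our
own additions, there only so that the program is total and has something to
print; <Literal>t3</Literal> is given an ordinary type signature, which may mention
<Literal>a</Literal> because the pattern has brought it into scope.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE ExistentialQuantification, ScopedTypeVariables #-}
module Main where

data T = forall a. MkT [a]

f :: T -> T
f (MkT [t :: a]) = MkT t3
  where
    t3 :: [a]              -- a names the existential type of t
    t3 = [t, t, t]
f other          = other   -- assumption: leave other lists unchanged

-- Hypothetical helper: the length is all we can observe from outside.
size :: T -> Int
size (MkT xs) = length xs

main :: IO ()
main = print (size (f (MkT "x")), size (f (MkT [True, False])))
-- prints (3,2)
</ProgramListing>

</Para>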
3655
3656 </Sect2>
3657
3658 </Sect1>
3659
3660 <Sect1 id="pragmas">
3661 <Title>Pragmas
3662 </Title>
3663
3664 <Para>
3665 GHC supports several pragmas, or instructions to the compiler placed
3666 in the source code.  Pragmas don't affect the meaning of the program,
3667 but they might affect the efficiency of the generated code.
3668 </Para>
3669
3670 <Sect2 id="inline-pragma">
3671 <Title>INLINE pragma
3672
3673 <IndexTerm><Primary>INLINE pragma</Primary></IndexTerm>
3674 <IndexTerm><Primary>pragma, INLINE</Primary></IndexTerm></Title>
3675
3676 <Para>
3677 GHC (with <Option>-O</Option>, as always) tries to inline (or &ldquo;unfold&rdquo;)
3678 functions/values that are &ldquo;small enough,&rdquo; thus avoiding the call
3679 overhead and possibly exposing other more-wonderful optimisations.
3680 </Para>
3681
3682 <Para>
3683 You will probably see these unfoldings (in Core syntax) in your
3684 interface files.
3685 </Para>
3686
3687 <Para>
3688 Normally, if GHC decides a function is &ldquo;too expensive&rdquo; to inline, it
3689 will not do so, nor will it export that unfolding for other modules to
3690 use.
3691 </Para>
3692
3693 <Para>
3694 The sledgehammer you can bring to bear is the
<Literal>INLINE</Literal><IndexTerm><Primary>INLINE pragma</Primary></IndexTerm> pragma, used thus:
3696
3697 <ProgramListing>
3698 key_function :: Int -> String -> (Bool, Double)
3699
3700 #ifdef __GLASGOW_HASKELL__
3701 {-# INLINE key_function #-}
3702 #endif
3703 </ProgramListing>
3704
3705 (You don't need to do the C pre-processor carry-on unless you're going
3706 to stick the code through HBC&mdash;it doesn't like <Literal>INLINE</Literal> pragmas.)
3707 </Para>
3708
3709 <Para>
3710 The major effect of an <Literal>INLINE</Literal> pragma is to declare a function's
3711 &ldquo;cost&rdquo; to be very low.  The normal unfolding machinery will then be
3712 very keen to inline it.
3713 </Para>
3714
3715 <Para>
3716 An <Literal>INLINE</Literal> pragma for a function can be put anywhere its type
3717 signature could be put.
3718 </Para>
3719
3720 <Para>
3721 <Literal>INLINE</Literal> pragmas are a particularly good idea for the
3722 <Literal>then</Literal>/<Literal>return</Literal> (or <Literal>bind</Literal>/<Literal>unit</Literal>) functions in a monad.
3723 For example, in GHC's own <Literal>UniqueSupply</Literal> monad code, we have:
3724
3725 <ProgramListing>
3726 #ifdef __GLASGOW_HASKELL__
3727 {-# INLINE thenUs #-}
3728 {-# INLINE returnUs #-}
3729 #endif
3730 </ProgramListing>
3731
3732 </Para>
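
<Para>
To make the monad example concrete, here is a cut-down sketch in the same
spirit.  It is not GHC's actual <Literal>UniqueSupply</Literal> code: the type
<Literal>UniqSM</Literal>, the function <Function>getUnique</Function> and the
definitions below are simplified stand-ins, written only to show where the
pragmas go.
</Para>

<Para>

<ProgramListing>
module Main where

-- Simplified stand-in: a computation threads an Int counter.
type UniqSM a = Int -> (a, Int)

{-# INLINE thenUs #-}
thenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b
thenUs m k = \s -> case m s of (a, s') -> k a s'

{-# INLINE returnUs #-}
returnUs :: a -> UniqSM a
returnUs a = \s -> (a, s)

-- Hand out the current counter value and bump it.
getUnique :: UniqSM Int
getUnique = \s -> (s, s + 1)

twoUniques :: UniqSM (Int, Int)
twoUniques =
  getUnique `thenUs` \a ->
  getUnique `thenUs` \b ->
  returnUs (a, b)

main :: IO ()
main = print (fst (twoUniques 0))
-- prints (0,1)
</ProgramListing>

</Para>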
3733
3734 </Sect2>
3735
3736 <Sect2 id="noinline-pragma">
3737 <Title>NOINLINE pragma
3738 </Title>
3739
3740 <Para>
3741 <IndexTerm><Primary>NOINLINE pragma</Primary></IndexTerm>
3742 <IndexTerm><Primary>pragma, NOINLINE</Primary></IndexTerm>
3743 </Para>
3744
3745 <Para>
3746 The <Literal>NOINLINE</Literal> pragma does exactly what you'd expect: it stops the
3747 named function from being inlined by the compiler.  You shouldn't ever
3748 need to do this, unless you're very cautious about code size.
3749 </Para>
3750
3751 </Sect2>
3752
3753 <Sect2 id="specialize-pragma">
3754 <Title>SPECIALIZE pragma
3755 </Title>
3756
3757 <Para>
3758 <IndexTerm><Primary>SPECIALIZE pragma</Primary></IndexTerm>
3759 <IndexTerm><Primary>pragma, SPECIALIZE</Primary></IndexTerm>
3760 <IndexTerm><Primary>overloading, death to</Primary></IndexTerm>
3761 </Para>
3762
3763 <Para>
3764 (UK spelling also accepted.)  For key overloaded functions, you can
3765 create extra versions (NB: more code space) specialised to particular
3766 types.  Thus, if you have an overloaded function:
3767 </Para>
3768
3769 <Para>
3770
3771 <ProgramListing>
3772 hammeredLookup :: Ord key => [(key, value)] -> key -> value
3773 </ProgramListing>
3774
3775 </Para>
3776
3777 <Para>
3778 If it is heavily used on lists with <Literal>Widget</Literal> keys, you could
3779 specialise it as follows:
3780
3781 <ProgramListing>
3782 {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
3783 </ProgramListing>
3784
3785 </Para>
3786
3787 <Para>
3788 To get very fancy, you can also specify a named function to use for
3789 the specialised value, by adding <Literal>= blah</Literal>, as in:
3790
3791 <ProgramListing>
3792 {-# SPECIALIZE hammeredLookup :: ...as before... = blah #-}
3793 </ProgramListing>
3794
3795 It's <Emphasis>Your Responsibility</Emphasis> to make sure that <Function>blah</Function> really
3796 behaves as a specialised version of <Function>hammeredLookup</Function>!!!
3797 </Para>
3798
3799 <Para>
3800 NOTE: the <Literal>=blah</Literal> feature isn't implemented in GHC 4.xx.
3801 </Para>
3802
3803 <Para>
3804 An example in which the <Literal>= blah</Literal> form will Win Big:
3805
3806 <ProgramListing>
3807 toDouble :: Real a => a -> Double
3808 toDouble = fromRational . toRational
3809
3810 {-# SPECIALIZE toDouble :: Int -> Double = i2d #-}
3811 i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
3812 </ProgramListing>
3813
3814 The <Function>i2d</Function> function is virtually one machine instruction; the
3815 default conversion&mdash;via an intermediate <Literal>Rational</Literal>&mdash;is obscenely
3816 expensive by comparison.
3817 </Para>
3818
3819 <Para>
3820 By using the US spelling, your <Literal>SPECIALIZE</Literal> pragma will work with
3821 HBC, too.  Note that HBC doesn't support the <Literal>= blah</Literal> form.
3822 </Para>
3823
3824 <Para>
3825 A <Literal>SPECIALIZE</Literal> pragma for a function can be put anywhere its type
3826 signature could be put.
3827 </Para>
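
<Para>
For completeness, here is a self-contained sketch of the first form of the
pragma.  The body of <Function>hammeredLookup</Function> and the definition of
<Literal>Widget</Literal> are hypothetical, since the text above gives only the type
signature; the pragma itself is as shown earlier.
</Para>

<Para>

<ProgramListing>
module Main where

-- Hypothetical key type for the sketch.
newtype Widget = Widget Int deriving (Eq, Ord)

-- Hypothetical body: a plain linear search.
hammeredLookup :: Ord key => [(key, value)] -> key -> value
hammeredLookup ((k, v) : rest) key
  | k == key  = v
  | otherwise = hammeredLookup rest key
hammeredLookup [] _ = error "hammeredLookup: key not found"

-- Ask for an extra copy specialised to Widget keys.
{-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}

main :: IO ()
main = putStrLn (hammeredLookup [(Widget 1, "one"), (Widget 2, "two")] (Widget 2))
-- prints two
</ProgramListing>

</Para>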
3828
3829 </Sect2>
3830
3831 <Sect2 id="specialize-instance-pragma">
3832 <Title>SPECIALIZE instance pragma
3833 </Title>
3834
3835 <Para>
3836 <IndexTerm><Primary>SPECIALIZE pragma</Primary></IndexTerm>
3837 <IndexTerm><Primary>overloading, death to</Primary></IndexTerm>
3838 Same idea, except for instance declarations.  For example:
3839
3840 <ProgramListing>
3841 instance (Eq a) => Eq (Foo a) where { ... usual stuff ... }
3842
{-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
3844 </ProgramListing>
3845
3846 Compatible with HBC, by the way.
3847 </Para>
3848
3849 </Sect2>
3850
3851 <Sect2 id="line-pragma">
3852 <Title>LINE pragma
3853 </Title>
3854
3855 <Para>
3856 <IndexTerm><Primary>LINE pragma</Primary></IndexTerm>
3857 <IndexTerm><Primary>pragma, LINE</Primary></IndexTerm>
3858 </Para>
3859
3860 <Para>
This pragma is similar to C's <Literal>&num;line</Literal> directive, and is mainly for use in
3862 automatically generated Haskell code.  It lets you specify the line
3863 number and filename of the original code; for example
3864 </Para>
3865
3866 <Para>
3867
3868 <ProgramListing>
3869 {-# LINE 42 "Foo.vhs" #-}
3870 </ProgramListing>
3871
3872 </Para>
3873
3874 <Para>
3875 if you'd generated the current file from something called <Filename>Foo.vhs</Filename>
3876 and this line corresponds to line 42 in the original.  GHC will adjust
3877 its error messages to refer to the line/file named in the <Literal>LINE</Literal>
3878 pragma.
3879 </Para>
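
<Para>
A small sketch of what a generated file might look like.  The declarations are
invented for illustration; the only part taken from the text above is the
pragma itself.

<ProgramListing>
module Main where

-- Pretend this file was generated from Foo.vhs.  The line that follows the
-- pragma is treated as line 42 of Foo.vhs when messages are reported.
{-# LINE 42 "Foo.vhs" #-}
answer :: Int
answer = 6 * 7

main :: IO ()
main = print answer
</ProgramListing>

If <Function>answer</Function> contained a type error, the message should name
<Filename>Foo.vhs</Filename> rather than the generated file.
</Para>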
3880
3881 </Sect2>
3882
3883 <Sect2>
3884 <Title>RULES pragma</Title>
3885
3886 <Para>
3887 The RULES pragma lets you specify rewrite rules.  It is described in
3888 <XRef LinkEnd="rewrite-rules">.
3889 </Para>
3890
3891 </Sect2>
3892
3893 </Sect1>
3894
3895 <Sect1 id="rewrite-rules">
3896 <Title>Rewrite rules
3897
<IndexTerm><Primary>RULES pragma</Primary></IndexTerm>
3899 <IndexTerm><Primary>pragma, RULES</Primary></IndexTerm>
3900 <IndexTerm><Primary>rewrite rules</Primary></IndexTerm></Title>
3901
3902 <Para>
3903 The programmer can specify rewrite rules as part of the source program
3904 (in a pragma).  GHC applies these rewrite rules wherever it can.
3905 </Para>
3906
3907 <Para>
3908 Here is an example:
3909
3910 <ProgramListing>
3911   {-# RULES
3912         "map/map"       forall f g xs. map f (map g xs) = map (f.g) xs
3913   #-}
3914 </ProgramListing>
3915
3916 </Para>
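
<Para>
To try a rule in isolation, a self-contained module along the following lines
may help.  It is only a sketch: <Function>myMap</Function> is a hypothetical
stand-in for <Function>map</Function>, and the <Literal>NOINLINE</Literal>
pragma is an assumption made here so that calls to
<Function>myMap</Function> stay visible to the rule.

<ProgramListing>
module Main where

-- Hypothetical local copy of map, so the rule does not interact with
-- whatever rules the Prelude attaches to the real map.
myMap :: (a -> b) -> [a] -> [b]
myMap _ []     = []
myMap f (x:xs) = f x : myMap f xs
{-# NOINLINE myMap #-}

{-# RULES
"myMap/myMap"   forall f g xs. myMap f (myMap g xs) = myMap (f . g) xs
  #-}

main :: IO ()
main = print (myMap (+ 1) (myMap (* 2) [1, 2, 3 :: Int]))
</ProgramListing>

The program prints <Literal>[3,5,7]</Literal> whether or not the rule fires;
the rule changes only how many list traversals are performed.
</Para>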
3917
3918 <Sect2>
3919 <Title>Syntax</Title>
3920
3921 <Para>
3922 From a syntactic point of view:
3923
3924 <ItemizedList>
3925 <ListItem>
3926
3927 <Para>
3928  Each rule has a name, enclosed in double quotes.  The name itself has
3929 no significance at all.  It is only used when reporting how many times the rule fired.
3930 </Para>
3931 </ListItem>
3932 <ListItem>
3933
3934 <Para>
3935  There may be zero or more rules in a <Literal>RULES</Literal> pragma.
3936 </Para>
3937 </ListItem>
3938 <ListItem>
3939
3940 <Para>
3941  Layout applies in a <Literal>RULES</Literal> pragma.  Currently no new indentation level
3942 is set, so you must lay out your rules starting in the same column as the
3943 enclosing definitions.
3944 </Para>
3945 </ListItem>
3946 <ListItem>
3947
3948 <Para>
3949  Each variable mentioned in a rule must either be in scope (e.g. <Function>map</Function>),
3950 or bound by the <Literal>forall</Literal> (e.g. <Function>f</Function>, <Function>g</Function>, <Function>xs</Function>).  The variables bound by
3951 the <Literal>forall</Literal> are called the <Emphasis>pattern</Emphasis> variables.  They are separated
3952 by spaces, just like in a type <Literal>forall</Literal>.
3953 </Para>
3954 </ListItem>
3955 <ListItem>
3956
3957 <Para>
3958  A pattern variable may optionally have a type signature.
3959 If the type of the pattern variable is polymorphic, it <Emphasis>must</Emphasis> have a type signature.
3960 For example, here is the <Literal>foldr/build</Literal> rule:
3961
3962 <ProgramListing>
3963 "fold/build"  forall k z (g::forall b. (a->b->b) -> b -> b) .
3964               foldr k z (build g) = g k z
3965 </ProgramListing>
3966
3967 Since <Function>g</Function> has a polymorphic type, it must have a type signature.
3968
3969 </Para>
3970 </ListItem>
3971 <ListItem>
3972
3973 <Para>
3974 The left hand side of a rule must consist of a top-level variable applied
3975 to arbitrary expressions.  For example, this is <Emphasis>not</Emphasis> OK:
3976
3977 <ProgramListing>
3978 "wrong1"   forall e1 e2.  case True of { True -> e1; False -> e2 } = e1
3979 "wrong2"   forall f.      f True = True
3980 </ProgramListing>
3981
3982 In <Literal>"wrong1"</Literal>, the LHS is not an application; in <Literal>"wrong2"</Literal>, the LHS has a pattern variable
3983 in the head.
3984 </Para>
3985 </ListItem>
3986 <ListItem>
3987
3988 <Para>
3989  A rule does not need to be in the same module as (any of) the
3990 variables it mentions, though of course they need to be in scope.
3991 </Para>
3992 </ListItem>
3993 <ListItem>
3994
3995 <Para>
3996  Rules are automatically exported from a module, just as instance declarations are.
3997 </Para>
3998 </ListItem>
3999
4000 </ItemizedList>
4001
4002 </Para>
4003
4004 </Sect2>
4005
4006 <Sect2>
4007 <Title>Semantics</Title>
4008
4009 <Para>
4010 From a semantic point of view:
4011
4012 <ItemizedList>
4013 <ListItem>
4014
4015 <Para>
4016 Rules are only applied if you use the <Option>-O</Option> flag.
4017 </Para>
4018 </ListItem>
4019
4020 <ListItem>
4021 <Para>
4022  Rules are regarded as left-to-right rewrite rules.
4023 When GHC finds an expression that is a substitution instance of the LHS
4024 of a rule, it replaces the expression by the (appropriately-substituted) RHS.
4025 By "a substitution instance" we mean that the LHS can be made equal to the
4026 expression by substituting for the pattern variables.
4027
4028 </Para>
4029 </ListItem>
4030 <ListItem>
4031
4032 <Para>
4033  The LHS and RHS of a rule are typechecked, and must have the
4034 same type.
4035
4036 </Para>
4037 </ListItem>
4038 <ListItem>
4039
4040 <Para>
4041  GHC makes absolutely no attempt to verify that the LHS and RHS
of a rule have the same meaning.  That is undecidable in general, and
4043 infeasible in most interesting cases.  The responsibility is entirely the programmer's!
4044
4045 </Para>
4046 </ListItem>
4047 <ListItem>
4048
4049 <Para>
4050  GHC makes no attempt to make sure that the rules are confluent or
4051 terminating.  For example:
4052
4053 <ProgramListing>
  "loop"        forall x y.  f x y = f y x
4055 </ProgramListing>
4056
4057 This rule will cause the compiler to go into an infinite loop.
4058
4059 </Para>
4060 </ListItem>
4061 <ListItem>
4062
4063 <Para>
4064  If more than one rule matches a call, GHC will choose one arbitrarily to apply.
4065
4066 </Para>
4067 </ListItem>
4068 <ListItem>
4069 <Para>
4070  GHC currently uses a very simple, syntactic, matching algorithm
4071 for matching a rule LHS with an expression.  It seeks a substitution
4072 which makes the LHS and expression syntactically equal modulo alpha
4073 conversion.  The pattern (rule), but not the expression, is eta-expanded if
necessary.  (Eta-expanding the expression can lead to laziness bugs.)
Matching is not done modulo beta conversion; that would be higher-order matching.
4076 </Para>
4077
4078 <Para>
4079 Matching is carried out on GHC's intermediate language, which includes
4080 type abstractions and applications.  So a rule only matches if the
4081 types match too.  See <XRef LinkEnd="rule-spec"> below.
4082 </Para>
4083 </ListItem>
4084 <ListItem>
4085
4086 <Para>
4087  GHC keeps trying to apply the rules as it optimises the program.
4088 For example, consider:
4089
4090 <ProgramListing>
4091   let s = map f
4092       t = map g
4093   in
4094   s (t xs)
4095 </ProgramListing>
4096
4097 The expression <Literal>s (t xs)</Literal> does not match the rule <Literal>"map/map"</Literal>, but GHC
4098 will substitute for <VarName>s</VarName> and <VarName>t</VarName>, giving an expression which does match.
4099 If <VarName>s</VarName> or <VarName>t</VarName> was (a) used more than once, and (b) large or a redex, then it would
4100 not be substituted, and the rule would not fire.
4101
4102 </Para>
4103 </ListItem>
4104 <ListItem>
4105
4106 <Para>
4107  In the earlier phases of compilation, GHC inlines <Emphasis>nothing
4108 that appears on the LHS of a rule</Emphasis>, because once you have substituted
for something you can't match against it (given the simple-minded
4110 matching).  So if you write the rule
4111
4112 <ProgramListing>
        "map/map"       forall f g.  map f . map g = map (f.g)
4114 </ProgramListing>
4115
4116 this <Emphasis>won't</Emphasis> match the expression <Literal>map f (map g xs)</Literal>.
4117 It will only match something written with explicit use of ".".
4118 Well, not quite.  It <Emphasis>will</Emphasis> match the expression
4119
4120 <ProgramListing>
4121 wibble f g xs
4122 </ProgramListing>
4123
4124 where <Function>wibble</Function> is defined:
4125
4126 <ProgramListing>
4127 wibble f g = map f . map g
4128 </ProgramListing>
4129
4130 because <Function>wibble</Function> will be inlined (it's small).
4131
4132 Later on in compilation, GHC starts inlining even things on the
4133 LHS of rules, but still leaves the rules enabled.  This inlining
4134 policy is controlled by the per-simplification-pass flag <Option>-finline-phase</Option><Emphasis>n</Emphasis>.
4135
4136 </Para>
4137 </ListItem>
4138 <ListItem>
4139
4140 <Para>
4141  All rules are implicitly exported from the module, and are therefore
4142 in force in any module that imports the module that defined the rule, directly
4143 or indirectly.  (That is, if A imports B, which imports C, then C's rules are
4144 in force when compiling A.)  The situation is very similar to that for instance
4145 declarations.
4146 </Para>
4147 </ListItem>
4148
4149 </ItemizedList>
4150
4151 </Para>
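
<Para>
Since GHC does not check that the two sides of a rule agree, it can be worth
testing the equation yourself before turning it into a rule.  The sketch below
states the <Literal>"map/map"</Literal> equation as an ordinary Boolean
property and tries it on a few inputs; the name
<Function>mapMapAgrees</Function> and the sample inputs are made up for
illustration, and passing such a test is of course no proof.

<ProgramListing>
module Main where

-- The equation claimed by the "map/map" rule, as a plain Boolean property.
mapMapAgrees :: (Int -> Int) -> (Int -> Int) -> [Int] -> Bool
mapMapAgrees f g xs = map f (map g xs) == map (f . g) xs

main :: IO ()
main = print (all (mapMapAgrees (+ 1) (* 2)) [[], [0], [1, 2, 3], [-5 .. 5]])
</ProgramListing>
</Para>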
4152
4153 </Sect2>
4154
4155 <Sect2>
4156 <Title>List fusion</Title>
4157
4158 <Para>
4159 The RULES mechanism is used to implement fusion (deforestation) of common list functions.
4160 If a "good consumer" consumes an intermediate list constructed by a "good producer", the
4161 intermediate list should be eliminated entirely.
4162 </Para>
4163
4164 <Para>
4165 The following are good producers:
4166
4167 <ItemizedList>
4168 <ListItem>
4169
4170 <Para>
4171  List comprehensions
4172 </Para>
4173 </ListItem>
4174 <ListItem>
4175
4176 <Para>
4177  Enumerations of <Literal>Int</Literal> and <Literal>Char</Literal> (e.g. <Literal>['a'..'z']</Literal>).
4178 </Para>
4179 </ListItem>
4180 <ListItem>
4181
4182 <Para>
4183  Explicit lists (e.g. <Literal>[True, False]</Literal>)
4184 </Para>
4185 </ListItem>
4186 <ListItem>
4187
4188 <Para>
 The cons constructor (e.g. <Literal>3:4:[]</Literal>)
4190 </Para>
4191 </ListItem>
4192 <ListItem>
4193
4194 <Para>
4195  <Function>++</Function>
4196 </Para>
4197 </ListItem>
4198 <ListItem>
4199
4200 <Para>
4201  <Function>map</Function>
4202 </Para>
4203 </ListItem>
4204 <ListItem>
4205
4206 <Para>
4207  <Function>filter</Function>
4208 </Para>
4209 </ListItem>
4210 <ListItem>
4211
4212 <Para>
4213  <Function>iterate</Function>, <Function>repeat</Function>
4214 </Para>
4215 </ListItem>
4216 <ListItem>
4217
4218 <Para>
4219  <Function>zip</Function>, <Function>zipWith</Function>
4220 </Para>
4221 </ListItem>
4222
4223 </ItemizedList>
4224
4225 </Para>
4226
4227 <Para>
4228 The following are good consumers:
4229
4230 <ItemizedList>
4231 <ListItem>
4232
4233 <Para>
4234  List comprehensions
4235 </Para>
4236 </ListItem>
4237 <ListItem>
4238
4239 <Para>
4240  <Function>array</Function> (on its second argument)
4241 </Para>
4242 </ListItem>
4243 <ListItem>
4244
4245 <Para>
4246  <Function>length</Function>
4247 </Para>
4248 </ListItem>
4249 <ListItem>
4250
4251 <Para>
4252  <Function>++</Function> (on its first argument)
4253 </Para>
4254 </ListItem>
4255 <ListItem>
4256
4257 <Para>
4258  <Function>map</Function>
4259 </Para>
4260 </ListItem>
4261 <ListItem>
4262
4263 <Para>
4264  <Function>filter</Function>
4265 </Para>
4266 </ListItem>
4267 <ListItem>
4268
4269 <Para>
4270  <Function>concat</Function>
4271 </Para>
4272 </ListItem>
4273 <ListItem>
4274
4275 <Para>
4276  <Function>unzip</Function>, <Function>unzip2</Function>, <Function>unzip3</Function>, <Function>unzip4</Function>
4277 </Para>
4278 </ListItem>
4279 <ListItem>
4280
4281 <Para>
4282  <Function>zip</Function>, <Function>zipWith</Function> (but on one argument only; if both are good producers, <Function>zip</Function>
4283 will fuse with one but not the other)
4284 </Para>
4285 </ListItem>
4286 <ListItem>
4287
4288 <Para>
4289  <Function>partition</Function>
4290 </Para>
4291 </ListItem>
4292 <ListItem>
4293
4294 <Para>
4295  <Function>head</Function>
4296 </Para>
4297 </ListItem>
4298 <ListItem>
4299
4300 <Para>
4301  <Function>and</Function>, <Function>or</Function>, <Function>any</Function>, <Function>all</Function>
4302 </Para>
4303 </ListItem>
4304 <ListItem>
4305
4306 <Para>
4307  <Function>sequence&lowbar;</Function>
4308 </Para>
4309 </ListItem>
4310 <ListItem>
4311
4312 <Para>
4313  <Function>msum</Function>
4314 </Para>
4315 </ListItem>
4316 <ListItem>
4317
4318 <Para>
4319  <Function>sortBy</Function>
4320 </Para>
4321 </ListItem>
4322
4323 </ItemizedList>
4324
4325 </Para>
4326
4327 <Para>
4328 So, for example, the following should generate no intermediate lists:
4329
4330 <ProgramListing>
4331 array (1,10) [(i,i*i) | i &#60;- map (+ 1) [0..9]]
4332 </ProgramListing>
4333
4334 </Para>
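
<Para>
Wrapped into a complete program, the expression above might look as follows.
This is a sketch: the module name <Literal>Data.Array</Literal> is an
assumption about your library layout (Haskell 98 calls the module
<Literal>Array</Literal>), and <Function>squares</Function> is a made-up name.

<ProgramListing>
module Main where

import Data.Array (Array, array, elems)

-- map is the producer, the comprehension consumes it and produces in turn,
-- and array consumes the comprehension.
squares :: Array Int Int
squares = array (1, 10) [(i, i * i) | i &#60;- map (+ 1) [0 .. 9]]

main :: IO ()
main = print (elems squares)
</ProgramListing>

Compiling with <Option>-O</Option> and <Option>-ddump-simpl-stats</Option>
should show which rules actually fired.
</Para>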
4335
4336 <Para>
4337 This list could readily be extended; if there are Prelude functions that you use
4338 a lot which are not included, please tell us.
4339 </Para>
4340
4341 <Para>
4342 If you want to write your own good consumers or producers, look at the
4343 Prelude definitions of the above functions to see how to do so.
4344 </Para>
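
<Para>
As a hedged illustration of the general shape, a producer is usually written
by abstracting over cons and nil and passing the result to
<Function>build</Function>.  In the sketch below <Function>upTo</Function> is
a hypothetical function, and importing <Function>build</Function> from
<Literal>GHC.Exts</Literal> is an assumption about where your GHC exposes it
(this guide locates its definition in <Filename>PrelBase.lhs</Filename>).

<ProgramListing>
module Main where

import GHC.Exts (build)

-- A hypothetical producer: the integers from 1 up to n.  The list is
-- described in terms of an abstract cons and nil, then handed to build.
upTo :: Int -> [Int]
upTo n = build (\cons nil ->
  let go i | i > n     = nil
           | otherwise = i `cons` go (i + 1)
  in go 1)
{-# INLINE upTo #-}

main :: IO ()
main = print (foldr (+) 0 (upTo 10))
</ProgramListing>

The intent is that, once <Function>upTo</Function> is inlined, the
<Literal>foldr</Literal>/<Literal>build</Literal> rule can remove the
intermediate list; whether it does so in a given program is best checked with
<Option>-ddump-simpl-stats</Option>.
</Para>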
4345
4346 </Sect2>
4347
4348 <Sect2 id="rule-spec">
4349 <Title>Specialisation
4350 </Title>
4351
4352 <Para>
4353 Rewrite rules can be used to get the same effect as a feature
present in earlier versions of GHC:
4355
4356 <ProgramListing>
4357   {-# SPECIALIZE fromIntegral :: Int8 -> Int16 = int8ToInt16 #-}
4358 </ProgramListing>
4359
4360 This told GHC to use <Function>int8ToInt16</Function> instead of <Function>fromIntegral</Function> whenever
4361 the latter was called with type <Literal>Int8 -&gt; Int16</Literal>.  That is, rather than
4362 specialising the original definition of <Function>fromIntegral</Function> the programmer is
4363 promising that it is safe to use <Function>int8ToInt16</Function> instead.
4364 </Para>
4365
4366 <Para>
4367 This feature is no longer in GHC.  But rewrite rules let you do the
4368 same thing:
4369
4370 <ProgramListing>
4371 {-# RULES
4372   "fromIntegral/Int8/Int16" fromIntegral = int8ToInt16
4373 #-}
4374 </ProgramListing>
4375
4376 This slightly odd-looking rule instructs GHC to replace <Function>fromIntegral</Function>
4377 by <Function>int8ToInt16</Function> <Emphasis>whenever the types match</Emphasis>.  Speaking more operationally,
4378 GHC adds the type and dictionary applications to get the typed rule
4379
4380 <ProgramListing>
4381 forall (d1::Integral Int8) (d2::Num Int16) .
4382         fromIntegral Int8 Int16 d1 d2 = int8ToInt16
4383 </ProgramListing>
4384
4385 What is more,
this rule does not need to be in the same file as <Function>fromIntegral</Function>,
4387 unlike the <Literal>SPECIALISE</Literal> pragmas which currently do (so that they
4388 have an original definition available to specialise).
4389 </Para>
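
<Para>
The same trick applied to the <Function>toDouble</Function> example from the
<Literal>SPECIALIZE</Literal> section might look like this.  It is a sketch
under assumptions: <Function>intToDouble</Function> is a hypothetical
hand-written replacement, and the <Literal>NOINLINE</Literal> pragma is added
here only so that calls to <Function>toDouble</Function> remain visible to the
rule.

<ProgramListing>
module Main where

toDouble :: Real a => a -> Double
toDouble = fromRational . toRational
{-# NOINLINE toDouble #-}

-- Hypothetical replacement for the Int case; the programmer promises that
-- it agrees with toDouble at type Int -> Double.
intToDouble :: Int -> Double
intToDouble = fromIntegral

{-# RULES
"toDouble/Int"  toDouble = intToDouble
  #-}

main :: IO ()
main = print (toDouble (3 :: Int))
</ProgramListing>

The result is <Literal>3.0</Literal> whether or not the rule fires; the rule
affects only which conversion is used to compute it.
</Para>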
4390
4391 </Sect2>
4392
4393 <Sect2>
4394 <Title>Controlling what's going on</Title>
4395
4396 <Para>
4397
4398 <ItemizedList>
4399 <ListItem>
4400
4401 <Para>
4402  Use <Option>-ddump-rules</Option> to see what transformation rules GHC is using.
4403 </Para>
4404 </ListItem>
4405 <ListItem>
4406
4407 <Para>
4408  Use <Option>-ddump-simpl-stats</Option> to see what rules are being fired.
4409 If you add <Option>-dppr-debug</Option> you get a more detailed listing.
4410 </Para>
4411 </ListItem>
4412 <ListItem>
4413
4414 <Para>
 The definition of (say) <Function>build</Function> in <FileName>PrelBase.lhs</FileName> looks like this:
4416
4417 <ProgramListing>
4418         build   :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
4419         {-# INLINE build #-}
4420         build g = g (:) []
4421 </ProgramListing>
4422
4423 Notice the <Literal>INLINE</Literal>!  That prevents <Literal>(:)</Literal> from being inlined when compiling
4424 <Literal>PrelBase</Literal>, so that an importing module will &ldquo;see&rdquo; the <Literal>(:)</Literal>, and can
4425 match it on the LHS of a rule.  <Literal>INLINE</Literal> prevents any inlining happening
4426 in the RHS of the <Literal>INLINE</Literal> thing.  I regret the delicacy of this.
4427
4428 </Para>
4429 </ListItem>
4430 <ListItem>
4431
4432 <Para>
4433  In <Filename>ghc/lib/std/PrelBase.lhs</Filename> look at the rules for <Function>map</Function> to
4434 see how to write rules that will do fusion and yet give an efficient
4435 program even if fusion doesn't happen.  More rules in <Filename>PrelList.lhs</Filename>.
4436 </Para>
4437 </ListItem>
4438
4439 </ItemizedList>
4440
4441 </Para>
4442
4443 </Sect2>
4444
4445 </Sect1>
4446
4447 <Sect1 id="generic-classes">
4448 <Title>Generic classes</Title>
4449
4450 <Para>
4451 The ideas behind this extension are described in detail in "Derivable type classes",
4452 Ralf Hinze and Simon Peyton Jones, Haskell Workshop, Montreal Sept 2000, pp94-105.
4453 An example will give the idea:
4454 </Para>
4455
4456 <ProgramListing>
4457   import Generics
4458
4459   class Bin a where
4460     toBin   :: a -> [Int]
4461     fromBin :: [Int] -> (a, [Int])
4462
4463     toBin {| Unit |}    Unit      = []
4464     toBin {| a :+: b |} (Inl x)   = 0 : toBin x
4465     toBin {| a :+: b |} (Inr y)   = 1 : toBin y
4466     toBin {| a :*: b |} (x :*: y) = toBin x ++ toBin y
4467
4468     fromBin {| Unit |}    bs      = (Unit, bs)
4469     fromBin {| a :+: b |} (0:bs)  = (Inl x, bs')    where (x,bs') = fromBin bs
4470     fromBin {| a :+: b |} (1:bs)  = (Inr y, bs')    where (y,bs') = fromBin bs
4471     fromBin {| a :*: b |} bs      = (x :*: y, bs'') where (x,bs' ) = fromBin bs
4472                                                           (y,bs'') = fromBin bs'
4473 </ProgramListing>
4474 <Para>
4475 This class declaration explains how <Literal>toBin</Literal> and <Literal>fromBin</Literal>
4476 work for arbitrary data types.  They do so by giving cases for unit, product, and sum,
4477 which are defined thus in the library module <Literal>Generics</Literal>:
4478 </Para>
4479 <ProgramListing>
4480   data Unit    = Unit
4481   data a :+: b = Inl a | Inr b
4482   data a :*: b = a :*: b
4483 </ProgramListing>
4484 <Para>
4485 Now you can make a data type into an instance of Bin like this:
4486 <ProgramListing>
4487   instance (Bin a, Bin b) => Bin (a,b)
4488   instance Bin a => Bin [a]
4489 </ProgramListing>
That is, just leave off the "where" clause.  Of course, you can put in the
where clause and override whichever methods you please.
4492 </Para>
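
<Para>
To get a feel for what such an instance amounts to, here is a hand-written
sketch in ordinary Haskell, with no generic default methods.  It is not what
GHC generates: the conversion functions <Function>fromBool</Function> and
<Function>toBool</Function>, the explicit instances for the three
representation types, the fallback error case, and the
<Literal>TypeOperators</Literal> language pragma (needed by recent compilers
for infix type constructors) are all assumptions made for illustration.

<ProgramListing>
{-# LANGUAGE TypeOperators #-}
module Main where

data Unit    = Unit
data a :+: b = Inl a | Inr b
data a :*: b = a :*: b

class Bin a where
  toBin   :: a -> [Int]
  fromBin :: [Int] -> (a, [Int])

-- The three cases of the generic definition, written as plain instances.
instance Bin Unit where
  toBin Unit = []
  fromBin bs = (Unit, bs)

instance (Bin a, Bin b) => Bin (a :+: b) where
  toBin (Inl x) = 0 : toBin x
  toBin (Inr y) = 1 : toBin y
  fromBin (0:bs) = (Inl x, bs') where (x, bs') = fromBin bs
  fromBin (1:bs) = (Inr y, bs') where (y, bs') = fromBin bs
  fromBin _      = error "fromBin: bad tag"

instance (Bin a, Bin b) => Bin (a :*: b) where
  toBin (x :*: y) = toBin x ++ toBin y
  fromBin bs = (x :*: y, bs'')
    where (x, bs')  = fromBin bs
          (y, bs'') = fromBin bs'

-- Bool viewed as a sum of two units: False on the left, True on the right.
fromBool :: Bool -> (Unit :+: Unit)
fromBool False = Inl Unit
fromBool True  = Inr Unit

toBool :: (Unit :+: Unit) -> Bool
toBool (Inl Unit) = False
toBool (Inr Unit) = True

-- The instance converts to the representation type and back.
instance Bin Bool where
  toBin      = toBin . fromBool
  fromBin bs = (toBool r, rest) where (r, rest) = fromBin bs

main :: IO ()
main = print (toBin True, fst (fromBin [1, 0]) :: Bool)
</ProgramListing>

With generic classes the compiler is meant to supply the conversion and the
three representation cases itself, so that the bare instance declaration is
enough.
</Para>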
4493
4494     <Sect2>
4495       <Title> Using generics </Title>
4496       <Para>To use generics you need to</para>
4497       <ItemizedList>
4498         <ListItem>
4499           <Para>Use the <Option>-fgenerics</Option> flag.</Para>
4500         </ListItem>
4501         <ListItem>
4502           <Para>Import the module <Literal>Generics</Literal> from the
4503           <Literal>lang</Literal> package.  This import brings into
4504           scope the data types <Literal>Unit</Literal>,
4505           <Literal>:*:</Literal>, and <Literal>:+:</Literal>.  (You
4506           don't need this import if you don't mention these types
4507           explicitly; for example, if you are simply giving instance
4508           declarations.)</Para>
4509         </ListItem>
4510       </ItemizedList>
4511     </Sect2>
4512
4513 <Sect2> <Title> Changes wrt the paper </Title>
4514 <Para>
4515 Note that the type constructors <Literal>:+:</Literal> and <Literal>:*:</Literal>
4516 can be written infix (indeed, you can now use
4517 any operator starting in a colon as an infix type constructor).  Also note that
4518 the type constructors are not exactly as in the paper (Unit instead of 1, etc).
4519 Finally, note that the syntax of the type patterns in the class declaration
uses "<Literal>{|</Literal>" and "<Literal>|}</Literal>" brackets; curly braces
alone would be ambiguous when they appear on right hand sides (an extension we
4522 anticipate wanting).
4523 </Para>
4524 </Sect2>
4525
4526 <Sect2> <Title>Terminology and restrictions</Title>
4527 <Para>
4528 Terminology.  A "generic default method" in a class declaration
4529 is one that is defined using type patterns as above.
4530 A "polymorphic default method" is a default method defined as in Haskell 98.
4531 A "generic class declaration" is a class declaration with at least one
4532 generic default method.
4533 </Para>
4534
4535 <Para>
4536 Restrictions:
4537 <ItemizedList>
4538 <ListItem>
4539 <Para>
4540 Alas, we do not yet implement the stuff about constructor names and
4541 field labels.
4542 </Para>
4543 </ListItem>
4544
4545 <ListItem>
4546 <Para>
4547 A generic class can have only one parameter; you can't have a generic
4548 multi-parameter class.
4549 </Para>
4550 </ListItem>
4551
4552 <ListItem>
4553 <Para>
4554 A default method must be defined entirely using type patterns, or entirely
4555 without.  So this is illegal:
4556 <ProgramListing>
4557   class Foo a where
4558     op :: a -> (a, Bool)
4559     op {| Unit |} Unit = (Unit, True)
4560     op x               = (x,    False)
4561 </ProgramListing>
4562 However it is perfectly OK for some methods of a generic class to have
4563 generic default methods and others to have polymorphic default methods.
4564 </Para>
4565 </ListItem>
4566
4567 <ListItem>
4568 <Para>
4569 The type variable(s) in the type pattern for a generic method declaration
scope over the right hand side.  So this is legal (note the use of the type variable ``p'' in a type signature on the right hand side):
4571 <ProgramListing>
4572   class Foo a where
4573     op :: a -> Bool
4574     op {| p :*: q |} (x :*: y) = op (x :: p)
4575     ...
4576 </ProgramListing>
4577 </Para>
4578 </ListItem>
4579
4580 <ListItem>
4581 <Para>
4582 The type patterns in a generic default method must take one of the forms:
4583 <ProgramListing>
4584        a :+: b
4585        a :*: b
4586        Unit
4587 </ProgramListing>
4588 where "a" and "b" are type variables.  Furthermore, all the type patterns for
4589 a single type constructor (<Literal>:*:</Literal>, say) must be identical; they
4590 must use the same type variables.  So this is illegal:
4591 <ProgramListing>
4592   class Foo a where
4593     op :: a -> Bool
4594     op {| a :+: b |} (Inl x) = True
4595     op {| p :+: q |} (Inr y) = False
4596 </ProgramListing>
4597 The type patterns must be identical, even in equations for different methods of the class.
4598 So this too is illegal:
4599 <ProgramListing>
4600   class Foo a where
4601     op1 :: a -> Bool
    op1 {| a :*: b |} (x :*: y) = True

    op2 :: a -> Bool
    op2 {| p :*: q |} (x :*: y) = False
4606 </ProgramListing>
(The reason for this restriction is that we gather all the equations for a particular type constructor
4608 into a single generic instance declaration.)
4609 </Para>
4610 </ListItem>
4611
4612 <ListItem>
4613 <Para>
4614 A generic method declaration must give a case for each of the three type constructors.
4615 </Para>
4616 </ListItem>
4617
4618 <ListItem>
4619 <Para>
4620 The type for a generic method can be built only from:
4621   <ItemizedList>
4622   <ListItem> <Para> Function arrows </Para> </ListItem>
4623   <ListItem> <Para> Type variables </Para> </ListItem>
4624   <ListItem> <Para> Tuples </Para> </ListItem>
4625   <ListItem> <Para> Arbitrary types not involving type variables </Para> </ListItem>
4626   </ItemizedList>
4627 Here are some example type signatures for generic methods:
4628 <ProgramListing>
4629     op1 :: a -> Bool
4630     op2 :: Bool -> (a,Bool)
4631     op3 :: [Int] -> a -> a
4632     op4 :: [a] -> Bool
4633 </ProgramListing>
4634 Here, op1, op2, op3 are OK, but op4 is rejected, because it has a type variable
4635 inside a list.
4636 </Para>
4637 <Para>
This restriction is an implementation restriction: we just haven't got around to
4639 implementing the necessary bidirectional maps over arbitrary type constructors.
4640 It would be relatively easy to add specific type constructors, such as Maybe and list,
4641 to the ones that are allowed.</para>
4642 </ListItem>
4643
4644 <ListItem>
4645 <Para>
4646 In an instance declaration for a generic class, the idea is that the compiler
4647 will fill in the methods for you, based on the generic templates.  However it can only
4648 do so if
4649   <ItemizedList>
4650   <ListItem>
4651   <Para>
4652   The instance type is simple (a type constructor applied to type variables, as in Haskell 98).
4653   </Para>
4654   </ListItem>
4655   <ListItem>
4656   <Para>
4657   No constructor of the instance type has unboxed fields.
4658   </Para>
4659   </ListItem>
4660   </ItemizedList>
4661 (Of course, these things can only arise if you are already using GHC extensions.)
However, you can still give instance declarations for types which break these rules,
4663 provided you give explicit code to override any generic default methods.
4664 </Para>
4665 </ListItem>
4666
4667 </ItemizedList>
4668 </Para>
4669
4670 <Para>
4671 The option <Option>-ddump-deriv</Option> dumps incomprehensible stuff giving details of
4672 what the compiler does with generic declarations.
4673 </Para>
4674
4675 </Sect2>
4676
4677 <Sect2> <Title> Another example </Title>
4678 <Para>
4679 Just to finish with, here's another example I rather like:
4680 <ProgramListing>
4681   class Tag a where
4682     nCons :: a -> Int
4683     nCons {| Unit |}    _ = 1
4684     nCons {| a :*: b |} _ = 1
4685     nCons {| a :+: b |} _ = nCons (bot::a) + nCons (bot::b)
4686
4687     tag :: a -> Int
4688     tag {| Unit |}    _       = 1
4689     tag {| a :*: b |} _       = 1
4690     tag {| a :+: b |} (Inl x) = tag x
4691     tag {| a :+: b |} (Inr y) = nCons (bot::a) + tag y
4692 </ProgramListing>
4693 </Para>
4694 </Sect2>
4695 </Sect1>
4696
4697 <!-- Emacs stuff:
4698      ;;; Local Variables: ***
4699      ;;; mode: sgml ***
4700      ;;; sgml-parent-document: ("users_guide.sgml" "book" "chapter" "sect1") ***
4701      ;;; End: ***
4702  -->