ghc/docs/users_guide/glasgow_exts.sgml

   1 <Para>
   2 <IndexTerm><Primary>language, GHC</Primary></IndexTerm>
   3 <IndexTerm><Primary>extensions, GHC</Primary></IndexTerm>
   4 As with all known Haskell systems, GHC implements some extensions to
   5 the language.  To use them, you'll need to give a <Option>-fglasgow-exts</Option>
   6 <IndexTerm><Primary>-fglasgow-exts option</Primary></IndexTerm> option.
   7 </Para>
   8
   9 <Para>
  10 Virtually all of the Glasgow extensions serve to give you access to
  11 the underlying facilities with which we implement Haskell.  Thus, you
  12 can get at the Raw Iron, if you are willing to write some non-standard
  13 code at a more primitive level.  You need not be &ldquo;stuck&rdquo; on
  14 performance because of the implementation costs of Haskell's
  15 &ldquo;high-level&rdquo; features&mdash;you can always code &ldquo;under&rdquo; them.  In an extreme case, you can write all your time-critical code in C, and then just glue it together with Haskell!
  16 </Para>
  17
  18 <Para>
  19 Executive summary of our extensions:
  20 </Para>
  21
  22 <Para>
  23 <VariableList>
  24
  25 <VarListEntry>
  26 <Term>Unboxed types and primitive operations:</Term>
  27 <ListItem>
  28 <Para>
  29 You can get right down to the raw machine types and operations;
  30 included in this are &ldquo;primitive arrays&rdquo; (direct access to Big Wads
  31 of Bytes).  Please see <XRef LinkEnd="glasgow-unboxed"> and following.
  32 </Para>
  33 </ListItem>
  34 </VarListEntry>
  35
  36 <VarListEntry>
  37 <Term>Multi-parameter type classes:</Term>
  38 <ListItem>
  39 <Para>
  40 GHC's type system supports extended type classes with multiple
  41 parameters.  Please see <XRef LinkEnd="multi-param-type-classes">.
  42 </Para>
  43 </ListItem>
  44 </VarListEntry>
  45
  46 <VarListEntry>
  47 <Term>Local universal quantification:</Term>
  48 <ListItem>
  49 <Para>
  50 GHC's type system supports explicit universal quantification in
  51 constructor fields and function arguments.  This is useful for things
  52 like defining <Literal>runST</Literal> from the state-thread world.  See <XRef LinkEnd="universal-quantification">.
  53 </Para>
  54 </ListItem>
  55 </VarListEntry>
  56
  57 <VarListEntry>
<Term>Existential quantification in data types:</Term>
  59 <ListItem>
  60 <Para>
  61 Some or all of the type variables in a datatype declaration may be
  62 <Emphasis>existentially quantified</Emphasis>.  More details in <XRef LinkEnd="existential-quantification">.
  63 </Para>
  64 </ListItem>
  65 </VarListEntry>
  66
  67 <VarListEntry>
  68 <Term>Scoped type variables:</Term>
  69 <ListItem>
  70 <Para>
  71 Scoped type variables enable the programmer to supply type signatures
  72 for some nested declarations, where this would not be legal in Haskell
  73 98.  Details in <XRef LinkEnd="scoped-type-variables">.
  74 </Para>
  75 </ListItem>
  76 </VarListEntry>
  77
  78 <VarListEntry>
<Term>Pattern guards:</Term>
  80 <ListItem>
  81 <Para>
  82 Instead of being a boolean expression, a guard is a list of qualifiers, exactly as in a list comprehension. See <XRef LinkEnd="pattern-guards">.
  83 </Para>
  84 </ListItem>
  85 </VarListEntry>
  86
  87 <VarListEntry>
  88 <Term>Foreign calling:</Term>
  89 <ListItem>
  90 <Para>
  91 Just what it sounds like.  We provide <Emphasis>lots</Emphasis> of rope that you
  92 can dangle around your neck.  Please see <XRef LinkEnd="ffi">.
  93 </Para>
  94 </ListItem>
  95 </VarListEntry>
  96
  97 <VarListEntry>
<Term>Pragmas:</Term>
  99 <ListItem>
 100 <Para>
 101 Pragmas are special instructions to the compiler placed in the source
 102 file.  The pragmas GHC supports are described in <XRef LinkEnd="pragmas">.
 103 </Para>
 104 </ListItem>
 105 </VarListEntry>
 106
 107 <VarListEntry>
 108 <Term>Rewrite rules:</Term>
 109 <ListItem>
 110 <Para>
 111 The programmer can specify rewrite rules as part of the source program
 112 (in a pragma).  GHC applies these rewrite rules wherever it can.
 113 Details in <XRef LinkEnd="rewrite-rules">.
 114 </Para>
 115 </ListItem>
 116 </VarListEntry>
 117
 118 <VarListEntry>
 119 <Term>Generic classes:</Term>
 120 <ListItem>
 121 <Para>
 122 Generic class declarations allow you to define a class
 123 whose methods say how to work over an arbitrary data type.
 124 Then it's really easy to make any new type into an instance of
 125 the class.  This generalises the rather ad-hoc "deriving" feature
 126 of Haskell 98.
 127 Details in <XRef LinkEnd="generic-classes">.
 128 </Para>
 129 </ListItem>
 130 </VarListEntry>
 131 </VariableList>
 132 </Para>
 133
 134 <Para>
 135 Before you get too carried away working at the lowest level (e.g.,
 136 sloshing <Literal>MutableByteArray&num;</Literal>s around your
 137 program), you may wish to check if there are libraries that provide a
 138 &ldquo;Haskellised veneer&rdquo; over the features you want.  See
 139 <xref linkend="book-hslibs">.
 140 </Para>
 141
 142 <Sect1 id="language-options">
 143 <Title>Language variations
 144 </Title>
 145
<Para> There are several flags that control which variations of the language are permitted.
Leaving out all of them gives you standard Haskell 98.</Para>
 148
 149 <VariableList>
 150
 151 <VarListEntry>
 152 <Term><Option>-fglasgow-exts</Option>:</Term>
 153 <ListItem>
 154 <Para>This simultaneously enables all of the extensions to Haskell 98 described in this
 155 chapter, except where otherwise noted. </Para>
 156 </ListItem> </VarListEntry>
 157
 158 <VarListEntry>
 159 <Term><Option>-fno-monomorphism-restriction</Option>:</Term>
 160 <ListItem>
 161 <Para> Switch off the Haskell 98 monomorphism restriction.  Independent of the <Option>-fglasgow-exts</Option>
 162 flag. </Para>
 163 </ListItem> </VarListEntry>
 164
 165 <VarListEntry>
 166 <Term><Option>-fallow-overlapping-instances</Option>,
 167       <Option>-fallow-undecidable-instances</Option>,
 168       <Option>-fcontext-stack</Option>:</Term>
 169 <ListItem>
 170 <Para> See <XRef LinkEnd="instance-decls">.
 171 Only relevant if you also use <Option>-fglasgow-exts</Option>.
 172 </Para>
 173 </ListItem> </VarListEntry>
 174
 175 <VarListEntry>
 176 <Term><Option>-fignore-asserts</Option>:</Term>
 177 <ListItem>
 178 <Para> See <XRef LinkEnd="sec-assertions">.
 179 Only relevant if you also use <Option>-fglasgow-exts</Option>.
 180 </Para>
 181 </ListItem> </VarListEntry>
 182
 183 <VarListEntry>
 184 <Term> <Option>-finline-phase</Option>:</Term>
 185 <ListItem>
 186 <Para> See <XRef LinkEnd="rewrite-rules">.
Only relevant if you also use <Option>-fglasgow-exts</Option>.
</Para>
 188 </ListItem> </VarListEntry>
 189
 190 <VarListEntry>
 191 <Term> <Option>-fgenerics</Option>:</Term>
 192 <ListItem>
 193 <Para> See <XRef LinkEnd="generic-classes">.
 194 Independent of <Option>-fglasgow-exts</Option>.
 195 </Para>
 196 </ListItem> </VarListEntry>
 197
</VariableList>

</Sect1>
 199
 200 <Sect1 id="primitives">
 201 <Title>Unboxed types and primitive operations
 202 </Title>
 203 <IndexTerm><Primary>PrelGHC module</Primary></IndexTerm>
 204
 205 <Para>
The <Literal>PrelGHC</Literal> module defines all the types which are
primitive in Glasgow Haskell, and the operations provided for them.
 208 </Para>
 209
 210 <Sect2 id="glasgow-unboxed">
 211 <Title>Unboxed types
 212 </Title>
 213
 214 <Para>
 215 <IndexTerm><Primary>Unboxed types (Glasgow extension)</Primary></IndexTerm>
 216 </Para>
 217
 218 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
 219 that values of that type are represented by a pointer to a heap
 220 object.  The representation of a Haskell <literal>Int</literal>, for
 221 example, is a two-word heap object.  An <firstterm>unboxed</firstterm>
type, however, is represented by the value itself; no pointers or heap
allocation are involved.
 224 </para>
 225
 226 <Para>
 227 Unboxed types correspond to the &ldquo;raw machine&rdquo; types you
 228 would use in C: <Literal>Int&num;</Literal> (long int),
 229 <Literal>Double&num;</Literal> (double), <Literal>Addr&num;</Literal>
 230 (void *), etc.  The <Emphasis>primitive operations</Emphasis>
 231 (PrimOps) on these types are what you might expect; e.g.,
 232 <Literal>(+&num;)</Literal> is addition on
 233 <Literal>Int&num;</Literal>s, and is the machine-addition that we all
 234 know and love&mdash;usually one instruction.
 235 </Para>
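
<Para>
As a minimal sketch of how boxed and unboxed values relate, the
following unwraps two ordinary <Literal>Int</Literal>s, adds them with
<Literal>(+&num;)</Literal>, and boxes the result again.  It assumes a
recent GHC, where the primitives are imported from
<Literal>GHC.Exts</Literal> and the <Literal>MagicHash</Literal>
language pragma enables the <Literal>&num;</Literal> suffix; neither
name appears in this chapter, and <Function>plusInt</Function> is a
name made up for the illustration.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}
module Main (main) where

-- Assumption: a recent GHC that exports the primitives from GHC.Exts.
import GHC.Exts (Int (I#), (+#))

-- Unwrap both boxed Ints, add the raw machine integers with (+#),
-- and wrap the result in the I# constructor again.
plusInt :: Int -> Int -> Int
plusInt (I# x) (I# y) = I# (x +# y)

main :: IO ()
main = print (plusInt 3 4)
</ProgramListing>

<Para>
The pattern match on <Literal>I&num;</Literal> forces each argument, so
by the time <Literal>(+&num;)</Literal> runs there is nothing left to
evaluate.
</Para>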
 236
 237 <Para>
 238 Primitive (unboxed) types cannot be defined in Haskell, and are
 239 therefore built into the language and compiler.  Primitive types are
 240 always unlifted; that is, a value of a primitive type cannot be
 241 bottom.  We use the convention that primitive types, values, and
 242 operations have a <Literal>&num;</Literal> suffix.
 243 </Para>
 244
 245 <Para>
 246 Primitive values are often represented by a simple bit-pattern, such
 247 as <Literal>Int&num;</Literal>, <Literal>Float&num;</Literal>,
 248 <Literal>Double&num;</Literal>.  But this is not necessarily the case:
 249 a primitive value might be represented by a pointer to a
 250 heap-allocated object.  Examples include
 251 <Literal>Array&num;</Literal>, the type of primitive arrays.  A
 252 primitive array is heap-allocated because it is too big a value to fit
 253 in a register, and would be too expensive to copy around; in a sense,
 254 it is accidental that it is represented by a pointer.  If a pointer
 255 represents a primitive value, then it really does point to that value:
 256 no unevaluated thunks, no indirections&hellip;nothing can be at the
 257 other end of the pointer than the primitive value.
 258 </Para>
 259
 260 <Para>
 261 There are some restrictions on the use of primitive types, the main
 262 one being that you can't pass a primitive value to a polymorphic
 263 function or store one in a polymorphic data type.  This rules out
 264 things like <Literal>[Int&num;]</Literal> (i.e. lists of primitive
 265 integers).  The reason for this restriction is that polymorphic
 266 arguments and constructor fields are assumed to be pointers: if an
 267 unboxed integer is stored in one of these, the garbage collector would
 268 attempt to follow it, leading to unpredictable space leaks.  Or a
 269 <Function>seq</Function> operation on the polymorphic component may
 270 attempt to dereference the pointer, with disastrous results.  Even
 271 worse, the unboxed value might be larger than a pointer
 272 (<Literal>Double&num;</Literal> for instance).
 273 </Para>
 274
 275 <Para>
Nevertheless, a numerically intensive program using unboxed types can
 277 go a <Emphasis>lot</Emphasis> faster than its &ldquo;standard&rdquo;
 278 counterpart&mdash;we saw a threefold speedup on one example.
 279 </Para>
 280
 281 </sect2>
 282
 283 <Sect2 id="unboxed-tuples">
 284 <Title>Unboxed Tuples
 285 </Title>
 286
 287 <Para>
Unboxed tuples aren't really exported by <Literal>PrelGHC</Literal>;
they're available by default with <Option>-fglasgow-exts</Option>.  An
 290 unboxed tuple looks like this:
 291 </Para>
 292
 293 <Para>
 294
 295 <ProgramListing>
 296 (# e_1, ..., e_n #)
 297 </ProgramListing>
 298
 299 </Para>
 300
 301 <Para>
 302 where <Literal>e&lowbar;1..e&lowbar;n</Literal> are expressions of any
 303 type (primitive or non-primitive).  The type of an unboxed tuple looks
 304 the same.
 305 </Para>
 306
 307 <Para>
 308 Unboxed tuples are used for functions that need to return multiple
 309 values, but they avoid the heap allocation normally associated with
 310 using fully-fledged tuples.  When an unboxed tuple is returned, the
 311 components are put directly into registers or on the stack; the
 312 unboxed tuple itself does not have a composite representation.  Many
 313 of the primitive operations listed in this section return unboxed
 314 tuples.
 315 </Para>
 316
 317 <Para>
 318 There are some pretty stringent restrictions on the use of unboxed tuples:
 319 </Para>
 320
 321 <Para>
 322
 323 <ItemizedList>
 324 <ListItem>
 325
 326 <Para>
 327  Unboxed tuple types are subject to the same restrictions as
 328 other unboxed types; i.e. they may not be stored in polymorphic data
 329 structures or passed to polymorphic functions.
 330
 331 </Para>
 332 </ListItem>
 333 <ListItem>
 334
 335 <Para>
 336  Unboxed tuples may only be constructed as the direct result of
 337 a function, and may only be deconstructed with a <Literal>case</Literal> expression.
For example, the following are valid:
 339
 340
 341 <ProgramListing>
 342 f x y = (# x+1, y-1 #)
 343 g x = case f x x of { (# a, b #) -&#62; a + b }
 344 </ProgramListing>
 345
 346
 347 but the following are invalid:
 348
 349
 350 <ProgramListing>
 351 f x y = g (# x, y #)
 352 g (# x, y #) = x + y
 353 </ProgramListing>
 354
 355
 356 </Para>
 357 </ListItem>
 358 <ListItem>
 359
 360 <Para>
 361  No variable can have an unboxed tuple type.  This is illegal:
 362
 363
 364 <ProgramListing>
 365 f :: (# Int, Int #) -&#62; (# Int, Int #)
 366 f x = x
 367 </ProgramListing>
 368
 369
 370 because <VarName>x</VarName> has an unboxed tuple type.
 371
 372 </Para>
 373 </ListItem>
 374
 375 </ItemizedList>
 376
 377 </Para>
 378
 379 <Para>
 380 Note: we may relax some of these restrictions in the future.
 381 </Para>
 382
 383 <Para>
 384 The <Literal>IO</Literal> and <Literal>ST</Literal> monads use unboxed tuples to avoid unnecessary
 385 allocation during sequences of operations.
 386 </Para>
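
<Para>
The rules above can be sketched as a complete program.  This is only
an illustration: it assumes a recent GHC, where the
<Literal>UnboxedTuples</Literal> language pragma (not mentioned in this
chapter) enables the syntax, and the function names are invented for
the example.
</Para>

<ProgramListing>
{-# LANGUAGE UnboxedTuples #-}
module Main (main) where

-- The unboxed tuple is constructed as the direct result of a function,
-- so no tuple is allocated on the heap.
sumAndDiff :: Int -> Int -> (# Int, Int #)
sumAndDiff x y = (# x + y, x - y #)

-- The only way to take it apart is a case expression.
productOfBoth :: Int -> Int -> Int
productOfBoth x y = case sumAndDiff x y of
  (# s, d #) -> s * d

main :: IO ()
main = print (productOfBoth 5 3)
</ProgramListing>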
 387
 388 </Sect2>
 389
 390 <Sect2>
 391 <Title>Character and numeric types</Title>
 392
 393 <Para>
 394 <IndexTerm><Primary>character types, primitive</Primary></IndexTerm>
 395 <IndexTerm><Primary>numeric types, primitive</Primary></IndexTerm>
 396 <IndexTerm><Primary>integer types, primitive</Primary></IndexTerm>
 397 <IndexTerm><Primary>floating point types, primitive</Primary></IndexTerm>
 398 There are the following obvious primitive types:
 399 </Para>
 400
 401 <Para>
 402
 403 <ProgramListing>
 404 type Char#
 405 type Int#
 406 type Word#
 407 type Addr#
 408 type Float#
 409 type Double#
 410 type Int64#
 411 type Word64#
 412 </ProgramListing>
 413
 414 <IndexTerm><Primary><literal>Char&num;</literal></Primary></IndexTerm>
 415 <IndexTerm><Primary><literal>Int&num;</literal></Primary></IndexTerm>
 416 <IndexTerm><Primary><literal>Word&num;</literal></Primary></IndexTerm>
 417 <IndexTerm><Primary><literal>Addr&num;</literal></Primary></IndexTerm>
 418 <IndexTerm><Primary><literal>Float&num;</literal></Primary></IndexTerm>
 419 <IndexTerm><Primary><literal>Double&num;</literal></Primary></IndexTerm>
 420 <IndexTerm><Primary><literal>Int64&num;</literal></Primary></IndexTerm>
 421 <IndexTerm><Primary><literal>Word64&num;</literal></Primary></IndexTerm>
 422 </Para>
 423
 424 <Para>
 425 If you really want to know their exact equivalents in C, see
 426 <Filename>ghc/includes/StgTypes.h</Filename> in the GHC source tree.
 427 </Para>
 428
 429 <Para>
 430 Literals for these types may be written as follows:
 431 </Para>
 432
 433 <Para>
 434
 435 <ProgramListing>
 436 1#              an Int#
 437 1.2#            a Float#
 438 1.34##          a Double#
 439 'a'#            a Char#; for weird characters, use e.g. '\o&#60;octal&#62;'#
 440 "a"#            an Addr# (a `char *'); only characters '\0'..'\255' allowed
 441 </ProgramListing>
 442
 443 <IndexTerm><Primary>literals, primitive</Primary></IndexTerm>
 444 <IndexTerm><Primary>constants, primitive</Primary></IndexTerm>
 445 <IndexTerm><Primary>numbers, primitive</Primary></IndexTerm>
 446 </Para>
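
<Para>
A short sketch that boxes each kind of literal so it can be printed.
It assumes a recent GHC with the <Literal>MagicHash</Literal> pragma and
the boxing constructors exported by <Literal>GHC.Exts</Literal>; the
name <VarName>boxedLiterals</VarName> is hypothetical.  The
<Literal>Addr&num;</Literal> literal is read back as a C string through
<Function>peekCString</Function>.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}
module Main (main) where

import Foreign.C.String (peekCString)
-- Assumption: a recent GHC that exports these constructors from GHC.Exts.
import GHC.Exts (Char (C#), Double (D#), Float (F#), Int (I#), Ptr (Ptr))

-- One primitive literal of each numeric and character kind, boxed.
boxedLiterals :: (Int, Float, Double, Char)
boxedLiterals = (I# 1#, F# 1.2#, D# 1.34##, C# 'a'#)

main :: IO ()
main = do
  print boxedLiterals
  -- "a"# is an Addr# pointing at a NUL-terminated string.
  peekCString (Ptr "a"#) >>= putStrLn
</ProgramListing>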
 447
 448 </Sect2>
 449
 450 <Sect2>
 451 <Title>Comparison operations</Title>
 452
 453 <Para>
 454 <IndexTerm><Primary>comparisons, primitive</Primary></IndexTerm>
 455 <IndexTerm><Primary>operators, comparison</Primary></IndexTerm>
 456 </Para>
 457
 458 <Para>
 459
 460 <ProgramListing>
 461 {&#62;,&#62;=,==,/=,&#60;,&#60;=}# :: Int# -&#62; Int# -&#62; Bool
 462
 463 {gt,ge,eq,ne,lt,le}Char# :: Char# -&#62; Char# -&#62; Bool
 464     -- ditto for Word# and Addr#
 465 </ProgramListing>
 466
 467 <IndexTerm><Primary><literal>&#62;&num;</literal></Primary></IndexTerm>
 468 <IndexTerm><Primary><literal>&#62;=&num;</literal></Primary></IndexTerm>
 469 <IndexTerm><Primary><literal>==&num;</literal></Primary></IndexTerm>
 470 <IndexTerm><Primary><literal>/=&num;</literal></Primary></IndexTerm>
 471 <IndexTerm><Primary><literal>&#60;&num;</literal></Primary></IndexTerm>
 472 <IndexTerm><Primary><literal>&#60;=&num;</literal></Primary></IndexTerm>
 473 <IndexTerm><Primary><literal>gt&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 474 <IndexTerm><Primary><literal>ge&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 475 <IndexTerm><Primary><literal>eq&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 476 <IndexTerm><Primary><literal>ne&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 477 <IndexTerm><Primary><literal>lt&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 478 <IndexTerm><Primary><literal>le&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 479 </Para>
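
<Para>
A hedged sketch of the comparison primitives in use.  Note one
assumption that differs from the signatures above: in recent GHC
releases these operations return an <Literal>Int&num;</Literal> (1 for
true, 0 for false) rather than a <Literal>Bool</Literal>, so the
example converts the result with <Function>isTrue&num;</Function> from
<Literal>GHC.Exts</Literal>.  The function names are made up for the
illustration.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}
module Main (main) where

-- Assumption: a recent GHC, where comparisons yield Int# and
-- isTrue# turns that into a Bool.
import GHC.Exts (Char (C#), Int (I#), isTrue#, ltChar#, (>#))

-- Pick the larger of two Ints by comparing the unboxed values.
maxInt :: Int -> Int -> Int
maxInt a@(I# x) b@(I# y)
  | isTrue# (x ># y) = a
  | otherwise        = b

-- Character comparison uses the named form rather than an operator.
charBefore :: Char -> Char -> Bool
charBefore (C# c) (C# d) = isTrue# (ltChar# c d)

main :: IO ()
main = do
  print (maxInt 2 9)
  print (charBefore 'a' 'b')
</ProgramListing>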
 480
 481 </Sect2>
 482
 483 <Sect2>
 484 <Title>Primitive-character operations</Title>
 485
 486 <Para>
 487 <IndexTerm><Primary>characters, primitive operations</Primary></IndexTerm>
 488 <IndexTerm><Primary>operators, primitive character</Primary></IndexTerm>
 489 </Para>
 490
 491 <Para>
 492
 493 <ProgramListing>
 494 ord# :: Char# -&#62; Int#
 495 chr# :: Int# -&#62; Char#
 496 </ProgramListing>
 497
 498 <IndexTerm><Primary><literal>ord&num;</literal></Primary></IndexTerm>
 499 <IndexTerm><Primary><literal>chr&num;</literal></Primary></IndexTerm>
 500 </Para>
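
<Para>
For instance, the two conversions compose to step from one character
to the next.  This is a sketch under the assumption of a recent GHC
(<Literal>MagicHash</Literal> pragma, imports from
<Literal>GHC.Exts</Literal>); <Function>nextChar</Function> is a
hypothetical name, and like the primitives themselves it does no range
checking.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}
module Main (main) where

-- Assumption: a recent GHC that exports the primitives from GHC.Exts.
import GHC.Exts (Char (C#), chr#, ord#, (+#))

-- Convert to the code point, add one, convert back.
-- No check is made that the result is a valid character.
nextChar :: Char -> Char
nextChar (C# c) = C# (chr# (ord# c +# 1#))

main :: IO ()
main = print (nextChar 'a')
</ProgramListing>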
 501
 502 </Sect2>
 503
 504 <Sect2>
 505 <Title>Primitive-<Literal>Int</Literal> operations</Title>
 506
 507 <Para>
 508 <IndexTerm><Primary>integers, primitive operations</Primary></IndexTerm>
 509 <IndexTerm><Primary>operators, primitive integer</Primary></IndexTerm>
 510 </Para>
 511
 512 <Para>
 513
 514 <ProgramListing>
 515 {+,-,*,quotInt,remInt,gcdInt}# :: Int# -&#62; Int# -&#62; Int#
 516 negateInt# :: Int# -&#62; Int#
 517
 518 iShiftL#, iShiftRA#, iShiftRL# :: Int# -&#62; Int# -&#62; Int#
 519         -- shift left, right arithmetic, right logical
 520
 521 addIntC#, subIntC#, mulIntC# :: Int# -> Int# -> (# Int#, Int# #)
 522         -- add, subtract, multiply with carry
 523 </ProgramListing>
 524
 525 <IndexTerm><Primary><literal>+&num;</literal></Primary></IndexTerm>
 526 <IndexTerm><Primary><literal>-&num;</literal></Primary></IndexTerm>
 527 <IndexTerm><Primary><literal>*&num;</literal></Primary></IndexTerm>
 528 <IndexTerm><Primary><literal>quotInt&num;</literal></Primary></IndexTerm>
 529 <IndexTerm><Primary><literal>remInt&num;</literal></Primary></IndexTerm>
 530 <IndexTerm><Primary><literal>gcdInt&num;</literal></Primary></IndexTerm>
 531 <IndexTerm><Primary><literal>iShiftL&num;</literal></Primary></IndexTerm>
 532 <IndexTerm><Primary><literal>iShiftRA&num;</literal></Primary></IndexTerm>
 533 <IndexTerm><Primary><literal>iShiftRL&num;</literal></Primary></IndexTerm>
 534 <IndexTerm><Primary><literal>addIntC&num;</literal></Primary></IndexTerm>
 535 <IndexTerm><Primary><literal>subIntC&num;</literal></Primary></IndexTerm>
 536 <IndexTerm><Primary><literal>mulIntC&num;</literal></Primary></IndexTerm>
 537 <IndexTerm><Primary>shift operations, integer</Primary></IndexTerm>
 538 </Para>
 539
 540 <Para>
 541 <Emphasis>Note:</Emphasis> No error/overflow checking!
 542 </Para>
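
<Para>
Since the plain operations do not check for overflow, the carry
variants are the way to detect it.  The following sketch wraps
<Function>addIntC&num;</Function> in a function returning
<Literal>Maybe</Literal>; it assumes a recent GHC (the
<Literal>MagicHash</Literal> and <Literal>UnboxedTuples</Literal>
pragmas, imports from <Literal>GHC.Exts</Literal>, and comparison
results converted with <Function>isTrue&num;</Function>).
<Function>safeAdd</Function> is a name invented for the example.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

-- Assumption: a recent GHC that exports the primitives from GHC.Exts.
import GHC.Exts (Int (I#), addIntC#, isTrue#, (==#))

-- addIntC# returns the (possibly wrapped) sum together with a flag
-- that is non-zero when the addition overflowed.
safeAdd :: Int -> Int -> Maybe Int
safeAdd (I# x) (I# y) = case addIntC# x y of
  (# r, c #)
    | isTrue# (c ==# 0#) -> Just (I# r)
    | otherwise          -> Nothing

main :: IO ()
main = do
  print (safeAdd 1 2)
  print (safeAdd maxBound 1)
</ProgramListing>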
 543
 544 </Sect2>
 545
 546 <Sect2>
 547 <Title>Primitive-<Literal>Double</Literal> and <Literal>Float</Literal> operations</Title>
 548
 549 <Para>
 550 <IndexTerm><Primary>floating point numbers, primitive</Primary></IndexTerm>
 551 <IndexTerm><Primary>operators, primitive floating point</Primary></IndexTerm>
 552 </Para>
 553
 554 <Para>
 555
 556 <ProgramListing>
 557 {+,-,*,/}##         :: Double# -&#62; Double# -&#62; Double#
 558 {&#60;,&#60;=,==,/=,&#62;=,&#62;}## :: Double# -&#62; Double# -&#62; Bool
 559 negateDouble#       :: Double# -&#62; Double#
 560 double2Int#         :: Double# -&#62; Int#
 561 int2Double#         :: Int#    -&#62; Double#
 562
{plus,minus,times,divide}Float# :: Float# -&#62; Float# -&#62; Float#
 564 {gt,ge,eq,ne,lt,le}Float# :: Float# -&#62; Float# -&#62; Bool
 565 negateFloat#        :: Float# -&#62; Float#
 566 float2Int#          :: Float# -&#62; Int#
 567 int2Float#          :: Int#   -&#62; Float#
 568 </ProgramListing>
 569
 570 </Para>
 571
 572 <Para>
 573 <IndexTerm><Primary><literal>+&num;&num;</literal></Primary></IndexTerm>
 574 <IndexTerm><Primary><literal>-&num;&num;</literal></Primary></IndexTerm>
 575 <IndexTerm><Primary><literal>*&num;&num;</literal></Primary></IndexTerm>
 576 <IndexTerm><Primary><literal>/&num;&num;</literal></Primary></IndexTerm>
 577 <IndexTerm><Primary><literal>&#60;&num;&num;</literal></Primary></IndexTerm>
 578 <IndexTerm><Primary><literal>&#60;=&num;&num;</literal></Primary></IndexTerm>
 579 <IndexTerm><Primary><literal>==&num;&num;</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>/=&num;&num;</literal></Primary></IndexTerm>
 581 <IndexTerm><Primary><literal>&#62;=&num;&num;</literal></Primary></IndexTerm>
 582 <IndexTerm><Primary><literal>&#62;&num;&num;</literal></Primary></IndexTerm>
 583 <IndexTerm><Primary><literal>negateDouble&num;</literal></Primary></IndexTerm>
 584 <IndexTerm><Primary><literal>double2Int&num;</literal></Primary></IndexTerm>
 585 <IndexTerm><Primary><literal>int2Double&num;</literal></Primary></IndexTerm>
 586 </Para>
 587
 588 <Para>
 589 <IndexTerm><Primary><literal>plusFloat&num;</literal></Primary></IndexTerm>
 590 <IndexTerm><Primary><literal>minusFloat&num;</literal></Primary></IndexTerm>
 591 <IndexTerm><Primary><literal>timesFloat&num;</literal></Primary></IndexTerm>
 592 <IndexTerm><Primary><literal>divideFloat&num;</literal></Primary></IndexTerm>
 593 <IndexTerm><Primary><literal>gtFloat&num;</literal></Primary></IndexTerm>
 594 <IndexTerm><Primary><literal>geFloat&num;</literal></Primary></IndexTerm>
 595 <IndexTerm><Primary><literal>eqFloat&num;</literal></Primary></IndexTerm>
 596 <IndexTerm><Primary><literal>neFloat&num;</literal></Primary></IndexTerm>
 597 <IndexTerm><Primary><literal>ltFloat&num;</literal></Primary></IndexTerm>
 598 <IndexTerm><Primary><literal>leFloat&num;</literal></Primary></IndexTerm>
 599 <IndexTerm><Primary><literal>negateFloat&num;</literal></Primary></IndexTerm>
 600 <IndexTerm><Primary><literal>float2Int&num;</literal></Primary></IndexTerm>
 601 <IndexTerm><Primary><literal>int2Float&num;</literal></Primary></IndexTerm>
 602 </Para>
 603
 604 <Para>
And a full complement of elementary and trigonometric functions:
 606 </Para>
 607
 608 <Para>
 609
 610 <ProgramListing>
 611 expDouble#      :: Double# -&#62; Double#
 612 logDouble#      :: Double# -&#62; Double#
 613 sqrtDouble#     :: Double# -&#62; Double#
 614 sinDouble#      :: Double# -&#62; Double#
 615 cosDouble#      :: Double# -&#62; Double#
 616 tanDouble#      :: Double# -&#62; Double#
 617 asinDouble#     :: Double# -&#62; Double#
 618 acosDouble#     :: Double# -&#62; Double#
 619 atanDouble#     :: Double# -&#62; Double#
 620 sinhDouble#     :: Double# -&#62; Double#
 621 coshDouble#     :: Double# -&#62; Double#
 622 tanhDouble#     :: Double# -&#62; Double#
 623 powerDouble#    :: Double# -&#62; Double# -&#62; Double#
 624 </ProgramListing>
 625
 626 <IndexTerm><Primary>trigonometric functions, primitive</Primary></IndexTerm>
 627 </Para>
 628
 629 <Para>
 630 similarly for <Literal>Float&num;</Literal>.
 631 </Para>
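
<Para>
As an illustration of the <Literal>Double&num;</Literal> operations
working together, here is a sketch of a Euclidean-length function.  It
assumes a recent GHC (the <Literal>MagicHash</Literal> pragma and
imports from <Literal>GHC.Exts</Literal>); <Function>hypot</Function>
is a hypothetical name and makes no attempt to avoid intermediate
overflow.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}
module Main (main) where

-- Assumption: a recent GHC that exports the primitives from GHC.Exts.
import GHC.Exts (Double (D#), sqrtDouble#, (*##), (+##))

-- sqrt (x*x + y*y), computed entirely on unboxed doubles.
hypot :: Double -> Double -> Double
hypot (D# x) (D# y) = D# (sqrtDouble# ((x *## x) +## (y *## y)))

main :: IO ()
main = print (hypot 3 4)
</ProgramListing>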
 632
 633 <Para>
 634 There are two coercion functions for <Literal>Float&num;</Literal>/<Literal>Double&num;</Literal>:
 635 </Para>
 636
 637 <Para>
 638
 639 <ProgramListing>
 640 float2Double#   :: Float# -&#62; Double#
 641 double2Float#   :: Double# -&#62; Float#
 642 </ProgramListing>
 643
 644 <IndexTerm><Primary><literal>float2Double&num;</literal></Primary></IndexTerm>
 645 <IndexTerm><Primary><literal>double2Float&num;</literal></Primary></IndexTerm>
 646 </Para>
 647
 648 <Para>
 649 The primitive version of <Function>decodeDouble</Function>
 650 (<Function>encodeDouble</Function> is implemented as an external C
 651 function):
 652 </Para>
 653
 654 <Para>
 655
 656 <ProgramListing>
 657 decodeDouble#   :: Double# -&#62; PrelNum.ReturnIntAndGMP
 658 </ProgramListing>
 659
 660 <IndexTerm><Primary><literal>encodeDouble&num;</literal></Primary></IndexTerm>
 661 <IndexTerm><Primary><literal>decodeDouble&num;</literal></Primary></IndexTerm>
 662 </Para>
 663
 664 <Para>
 665 (And the same for <Literal>Float&num;</Literal>s.)
 666 </Para>
 667
 668 </Sect2>
 669
 670 <Sect2 id="integer-operations">
<Title>Operations on/for <Literal>Integer</Literal>s (interface to GMP)
 672 </Title>
 673
 674 <Para>
 675 <IndexTerm><Primary>arbitrary precision integers</Primary></IndexTerm>
 676 <IndexTerm><Primary>Integer, operations on</Primary></IndexTerm>
 677 </Para>
 678
 679 <Para>
We implement <Literal>Integer</Literal>s (arbitrary-precision
 681 integers) using the GNU multiple-precision (GMP) package (version
 682 2.0.2).
 683 </Para>
 684
 685 <Para>
 686 The data type for <Literal>Integer</Literal> is either a small
 687 integer, represented by an <Literal>Int</Literal>, or a large integer
 688 represented using the pieces required by GMP's
 689 <Literal>MP&lowbar;INT</Literal> in <Filename>gmp.h</Filename> (see
 690 <Filename>gmp.info</Filename> in
 691 <Filename>ghc/includes/runtime/gmp</Filename>).  It comes out as:
 692 </Para>
 693
 694 <Para>
 695
 696 <ProgramListing>
 697 data Integer = S# Int#             -- small integers
 698              | J# Int# ByteArray#  -- large integers
 699 </ProgramListing>
 700
 701 <IndexTerm><Primary>Integer type</Primary></IndexTerm> The primitive
ops to support large <Literal>Integer</Literal>s use the
 703 &ldquo;pieces&rdquo; of the representation, and are as follows:
 704 </Para>
 705
 706 <Para>
 707
 708 <ProgramListing>
 709 negateInteger#  :: Int# -&#62; ByteArray# -&#62; Integer
 710
 711 {plus,minus,times}Integer#, gcdInteger#,
 712   quotInteger#, remInteger#, divExactInteger#
 713         :: Int# -> ByteArray#
 714         -> Int# -> ByteArray#
 715         -> (# Int#, ByteArray# #)
 716
 717 cmpInteger#
 718         :: Int# -> ByteArray#
 719         -> Int# -> ByteArray#
 720         -> Int# -- -1 for &#60;; 0 for ==; +1 for >
 721
 722 cmpIntegerInt#
 723         :: Int# -> ByteArray#
 724         -> Int#
 725         -> Int# -- -1 for &#60;; 0 for ==; +1 for >
 726
gcdIntegerInt#
 728         :: Int# -> ByteArray#
 729         -> Int#
 730         -> Int#
 731
 732 divModInteger#, quotRemInteger#
 733         :: Int# -> ByteArray#
 734         -> Int# -> ByteArray#
 735         -> (# Int#, ByteArray#,
 736                   Int#, ByteArray# #)
 737
 738 integer2Int# :: Int# -> ByteArray# -> Int#
 739
 740 int2Integer#  :: Int#  -> Integer -- NB: no error-checking on these two!
 741 word2Integer# :: Word# -> Integer
 742
 743 addr2Integer# :: Addr# -> Integer
 744         -- the Addr# is taken to be a `char *' string
 745         -- to be converted into an Integer.
 746 </ProgramListing>
 747
 748 <IndexTerm><Primary><literal>negateInteger&num;</literal></Primary></IndexTerm>
 749 <IndexTerm><Primary><literal>plusInteger&num;</literal></Primary></IndexTerm>
 750 <IndexTerm><Primary><literal>minusInteger&num;</literal></Primary></IndexTerm>
 751 <IndexTerm><Primary><literal>timesInteger&num;</literal></Primary></IndexTerm>
 752 <IndexTerm><Primary><literal>quotInteger&num;</literal></Primary></IndexTerm>
 753 <IndexTerm><Primary><literal>remInteger&num;</literal></Primary></IndexTerm>
 754 <IndexTerm><Primary><literal>gcdInteger&num;</literal></Primary></IndexTerm>
 755 <IndexTerm><Primary><literal>gcdIntegerInt&num;</literal></Primary></IndexTerm>
 756 <IndexTerm><Primary><literal>divExactInteger&num;</literal></Primary></IndexTerm>
 757 <IndexTerm><Primary><literal>cmpInteger&num;</literal></Primary></IndexTerm>
 758 <IndexTerm><Primary><literal>divModInteger&num;</literal></Primary></IndexTerm>
 759 <IndexTerm><Primary><literal>quotRemInteger&num;</literal></Primary></IndexTerm>
 760 <IndexTerm><Primary><literal>integer2Int&num;</literal></Primary></IndexTerm>
 761 <IndexTerm><Primary><literal>int2Integer&num;</literal></Primary></IndexTerm>
 762 <IndexTerm><Primary><literal>word2Integer&num;</literal></Primary></IndexTerm>
 763 <IndexTerm><Primary><literal>addr2Integer&num;</literal></Primary></IndexTerm>
 764 </Para>
 765
 766 </Sect2>
 767
 768 <Sect2>
 769 <Title>Words and addresses</Title>
 770
 771 <Para>
 772 <IndexTerm><Primary>word, primitive type</Primary></IndexTerm>
 773 <IndexTerm><Primary>address, primitive type</Primary></IndexTerm>
 774 <IndexTerm><Primary>unsigned integer, primitive type</Primary></IndexTerm>
 775 <IndexTerm><Primary>pointer, primitive type</Primary></IndexTerm>
 776 </Para>
 777
 778 <Para>
 779 A <Literal>Word&num;</Literal> is used for bit-twiddling operations.
It is the same size as an <Literal>Int&num;</Literal>, but is unsigned
and has almost no arithmetic operations (only
<Literal>quotWord&num;</Literal> and <Literal>remWord&num;</Literal>).
 782
 783 <ProgramListing>
 784 type Word#      -- Same size/etc as Int# but *unsigned*
 785 type Addr#      -- A pointer from outside the "Haskell world" (from C, probably);
 786                 -- described under "arrays"
 787 </ProgramListing>
 788
 789 <IndexTerm><Primary><literal>Word&num;</literal></Primary></IndexTerm>
 790 <IndexTerm><Primary><literal>Addr&num;</literal></Primary></IndexTerm>
 791 </Para>
 792
 793 <Para>
 794 <Literal>Word&num;</Literal>s and <Literal>Addr&num;</Literal>s have
 795 the usual comparison operations.  Other
 796 unboxed-<Literal>Word</Literal> ops (bit-twiddling and coercions):
 797 </Para>
 798
 799 <Para>
 800
 801 <ProgramListing>
 802 {gt,ge,eq,ne,lt,le}Word# :: Word# -> Word# -> Bool
 803
 804 and#, or#, xor# :: Word# -> Word# -> Word#
 805         -- standard bit ops.
 806
 807 quotWord#, remWord# :: Word# -> Word# -> Word#
 808         -- word (i.e. unsigned) versions are different from int
 809         -- versions, so we have to provide these explicitly.
 810
 811 not# :: Word# -> Word#
 812
 813 shiftL#, shiftRL# :: Word# -> Int# -> Word#
 814         -- shift left, right logical
 815
 816 int2Word#       :: Int#  -> Word# -- just a cast, really
 817 word2Int#       :: Word# -> Int#
 818 </ProgramListing>
 819
 820 <IndexTerm><Primary>bit operations, Word and Addr</Primary></IndexTerm>
 821 <IndexTerm><Primary><literal>gtWord&num;</literal></Primary></IndexTerm>
 822 <IndexTerm><Primary><literal>geWord&num;</literal></Primary></IndexTerm>
 823 <IndexTerm><Primary><literal>eqWord&num;</literal></Primary></IndexTerm>
 824 <IndexTerm><Primary><literal>neWord&num;</literal></Primary></IndexTerm>
 825 <IndexTerm><Primary><literal>ltWord&num;</literal></Primary></IndexTerm>
 826 <IndexTerm><Primary><literal>leWord&num;</literal></Primary></IndexTerm>
 827 <IndexTerm><Primary><literal>and&num;</literal></Primary></IndexTerm>
 828 <IndexTerm><Primary><literal>or&num;</literal></Primary></IndexTerm>
 829 <IndexTerm><Primary><literal>xor&num;</literal></Primary></IndexTerm>
 830 <IndexTerm><Primary><literal>not&num;</literal></Primary></IndexTerm>
 831 <IndexTerm><Primary><literal>quotWord&num;</literal></Primary></IndexTerm>
 832 <IndexTerm><Primary><literal>remWord&num;</literal></Primary></IndexTerm>
 833 <IndexTerm><Primary><literal>shiftL&num;</literal></Primary></IndexTerm>
 834 <IndexTerm><Primary><literal>shiftRA&num;</literal></Primary></IndexTerm>
 835 <IndexTerm><Primary><literal>shiftRL&num;</literal></Primary></IndexTerm>
 836 <IndexTerm><Primary><literal>int2Word&num;</literal></Primary></IndexTerm>
 837 <IndexTerm><Primary><literal>word2Int&num;</literal></Primary></IndexTerm>
 838 </Para>
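
<Para>
A small sketch of the bit operations and the
<Literal>Int&num;</Literal>/<Literal>Word&num;</Literal> cast.  It
assumes a recent GHC (the <Literal>MagicHash</Literal> pragma and
imports from <Literal>GHC.Exts</Literal>); the names
<Function>lowBits</Function> and <Function>toWord</Function> are
invented for the example.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash #-}
module Main (main) where

-- Assumption: a recent GHC that exports the primitives from GHC.Exts.
import GHC.Exts (Int (I#), Word (W#), and#, int2Word#, not#, shiftL#)

-- Keep only the low n bits of a word.  The mask is built by shifting
-- an all-ones word left by n and inverting it.
lowBits :: Int -> Word -> Word
lowBits (I# n) (W# w) =
  W# (and# w (not# (shiftL# (not# (int2Word# 0#)) n)))

-- Reinterpret an Int as a Word; the bit pattern is unchanged.
toWord :: Int -> Word
toWord (I# i) = W# (int2Word# i)

main :: IO ()
main = do
  print (lowBits 4 0xFF)
  print (toWord (-1) == maxBound)
</ProgramListing>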
 839
 840 <Para>
 841 Unboxed-<Literal>Addr</Literal> ops (C casts, really):
 842
 843 <ProgramListing>
 844 {gt,ge,eq,ne,lt,le}Addr# :: Addr# -> Addr# -> Bool
 845
 846 int2Addr#       :: Int#  -> Addr#
 847 addr2Int#       :: Addr# -> Int#
 848 addr2Integer#   :: Addr# -> (# Int#, ByteArray# #)
 849 </ProgramListing>
 850
 851 <IndexTerm><Primary><literal>gtAddr&num;</literal></Primary></IndexTerm>
 852 <IndexTerm><Primary><literal>geAddr&num;</literal></Primary></IndexTerm>
 853 <IndexTerm><Primary><literal>eqAddr&num;</literal></Primary></IndexTerm>
 854 <IndexTerm><Primary><literal>neAddr&num;</literal></Primary></IndexTerm>
 855 <IndexTerm><Primary><literal>ltAddr&num;</literal></Primary></IndexTerm>
 856 <IndexTerm><Primary><literal>leAddr&num;</literal></Primary></IndexTerm>
 857 <IndexTerm><Primary><literal>int2Addr&num;</literal></Primary></IndexTerm>
 858 <IndexTerm><Primary><literal>addr2Int&num;</literal></Primary></IndexTerm>
 859 <IndexTerm><Primary><literal>addr2Integer&num;</literal></Primary></IndexTerm>
 860 </Para>
 861
 862 <Para>
 863 The casts between <Literal>Int&num;</Literal>,
 864 <Literal>Word&num;</Literal> and <Literal>Addr&num;</Literal>
 865 correspond to null operations at the machine level, but are required
 866 to keep the Haskell type checker happy.
 867 </Para>
 868
 869 <Para>
 870 Operations for indexing off of C pointers
 871 (<Literal>Addr&num;</Literal>s) to snatch values are listed under
 872 &ldquo;arrays&rdquo;.
 873 </Para>
 874
 875 </Sect2>
 876
 877 <Sect2>
 878 <Title>Arrays</Title>
 879
 880 <Para>
 881 <IndexTerm><Primary>arrays, primitive</Primary></IndexTerm>
 882 </Para>
 883
 884 <Para>
 885 The type <Literal>Array&num; elt</Literal> is the type of primitive,
 886 unpointed arrays of values of type <Literal>elt</Literal>.
 887 </Para>
 888
 889 <Para>
 890
 891 <ProgramListing>
 892 type Array# elt
 893 </ProgramListing>
 894
 895 <IndexTerm><Primary><literal>Array&num;</literal></Primary></IndexTerm>
 896 </Para>
 897
 898 <Para>
 899 <Literal>Array&num;</Literal> is more primitive than a Haskell
 900 array&mdash;indeed, the Haskell <Literal>Array</Literal> interface is
 901 implemented using <Literal>Array&num;</Literal>&mdash;in that an
 902 <Literal>Array&num;</Literal> is indexed only by
 903 <Literal>Int&num;</Literal>s, starting at zero.  It is also more
 904 primitive by virtue of being unboxed.  That doesn't mean that it isn't
 905 a heap-allocated object&mdash;of course, it is.  Rather, being unboxed
 906 means that it is represented by a pointer to the array itself, and not
 907 to a thunk which will evaluate to the array (or to bottom).  The
 908 components of an <Literal>Array&num;</Literal> are themselves boxed.
 909 </Para>
 910
 911 <Para>
 912 The type <Literal>ByteArray&num;</Literal> is similar to
 913 <Literal>Array&num;</Literal>, except that it contains just a string
 914 of (non-pointer) bytes.
 915 </Para>
 916
 917 <Para>
 918
 919 <ProgramListing>
 920 type ByteArray#
 921 </ProgramListing>
 922
 923 <IndexTerm><Primary><literal>ByteArray&num;</literal></Primary></IndexTerm>
 924 </Para>
 925
 926 <Para>
 927 Arrays of these types are useful when a Haskell program wishes to
 928 construct a value to pass to a C procedure. It is also possible to use
 929 them to build (say) arrays of unboxed characters for internal use in a
 930 Haskell program.  Given these uses, <Literal>ByteArray&num;</Literal>
 931 is deliberately a bit vague about the type of its components.
 932 Operations are provided to extract values of type
 933 <Literal>Char&num;</Literal>, <Literal>Int&num;</Literal>,
 934 <Literal>Float&num;</Literal>, <Literal>Double&num;</Literal>, and
 935 <Literal>Addr&num;</Literal> from arbitrary offsets within a
 936 <Literal>ByteArray&num;</Literal>.  (For type
<Literal>Foo&num;</Literal>, the <VarName>i</VarName>th offset gets you the <VarName>i</VarName>th
<Literal>Foo&num;</Literal>, not the <Literal>Foo&num;</Literal> at
byte-position <VarName>i</VarName>.  Mumble.)  (If you want a
 940 <Literal>Word&num;</Literal>, grab an <Literal>Int&num;</Literal>,
 941 then coerce it.)
 942 </Para>
 943
 944 <Para>
 945 Lastly, we have static byte-arrays, of type
<Literal>Addr&num;</Literal> (mentioned previously).  (Remember
the duality between arrays and pointers in C.)  Arrays of this type
 948 are represented by a pointer to an array in the world outside Haskell,
 949 so this pointer is not followed by the garbage collector.  In other
 950 respects they are just like <Literal>ByteArray&num;</Literal>.  They
 951 are only needed in order to pass values from C to Haskell.
 952 </Para>
 953
 954 </Sect2>
 955
 956 <Sect2>
 957 <Title>Reading and writing</Title>
 958
 959 <Para>
 960 Primitive arrays are linear, and indexed starting at zero.
 961 </Para>
 962
 963 <Para>
 964 The size and indices of a <Literal>ByteArray&num;</Literal>, <Literal>Addr&num;</Literal>, and
 965 <Literal>MutableByteArray&num;</Literal> are all in bytes.  It's up to the program to
 966 calculate the correct byte offset from the start of the array.  This
 967 allows a <Literal>ByteArray&num;</Literal> to contain a mixture of values of different
 968 type, which is often needed when preparing data for and unpicking
 969 results from C.  (Umm&hellip;not true of indices&hellip;WDP 95/09)
 970 </Para>
 971
 972 <Para>
 973 <Emphasis>Should we provide some <Literal>sizeOfDouble&num;</Literal> constants?</Emphasis>
 974 </Para>
 975
 976 <Para>
 977 Out-of-range errors on indexing should be caught by the code which
 978 uses the primitive operation; the primitive operations themselves do
 979 <Emphasis>not</Emphasis> check for out-of-range indexes. The intention is that the
 980 primitive ops compile to one machine instruction or thereabouts.
 981 </Para>
 982
 983 <Para>
 984 We use the terms &ldquo;reading&rdquo; and &ldquo;writing&rdquo; to refer to accessing
 985 <Emphasis>mutable</Emphasis> arrays (see <XRef LinkEnd="sect-mutable">), and
 986 &ldquo;indexing&rdquo; to refer to reading a value from an <Emphasis>immutable</Emphasis>
 987 array.
 988 </Para>
 989
 990 <Para>
 991 Immutable byte arrays are straightforward to index (all indices in bytes):
 992
 993 <ProgramListing>
 994 indexCharArray#   :: ByteArray# -> Int# -> Char#
 995 indexIntArray#    :: ByteArray# -> Int# -> Int#
 996 indexAddrArray#   :: ByteArray# -> Int# -> Addr#
 997 indexFloatArray#  :: ByteArray# -> Int# -> Float#
 998 indexDoubleArray# :: ByteArray# -> Int# -> Double#
 999
1000 indexCharOffAddr#   :: Addr# -> Int# -> Char#
1001 indexIntOffAddr#    :: Addr# -> Int# -> Int#
1002 indexFloatOffAddr#  :: Addr# -> Int# -> Float#
1003 indexDoubleOffAddr# :: Addr# -> Int# -> Double#
1004 indexAddrOffAddr#   :: Addr# -> Int# -> Addr#
1005  -- Get an Addr# from an Addr# offset
1006 </ProgramListing>
1007
1008 <IndexTerm><Primary><literal>indexCharArray&num;</literal></Primary></IndexTerm>
1009 <IndexTerm><Primary><literal>indexIntArray&num;</literal></Primary></IndexTerm>
1010 <IndexTerm><Primary><literal>indexAddrArray&num;</literal></Primary></IndexTerm>
1011 <IndexTerm><Primary><literal>indexFloatArray&num;</literal></Primary></IndexTerm>
1012 <IndexTerm><Primary><literal>indexDoubleArray&num;</literal></Primary></IndexTerm>
1013 <IndexTerm><Primary><literal>indexCharOffAddr&num;</literal></Primary></IndexTerm>
1014 <IndexTerm><Primary><literal>indexIntOffAddr&num;</literal></Primary></IndexTerm>
1015 <IndexTerm><Primary><literal>indexFloatOffAddr&num;</literal></Primary></IndexTerm>
1016 <IndexTerm><Primary><literal>indexDoubleOffAddr&num;</literal></Primary></IndexTerm>
1017 <IndexTerm><Primary><literal>indexAddrOffAddr&num;</literal></Primary></IndexTerm>
1018 </Para>
1019
1020 <Para>
1021 The last of these, <Function>indexAddrOffAddr&num;</Function>, extracts an <Literal>Addr&num;</Literal> using an offset
1022 from another <Literal>Addr&num;</Literal>, thereby providing the ability to follow a chain of
1023 C pointers.
1024 </Para>
1025
1026 <Para>
1027 Something a bit more interesting goes on when indexing arrays of boxed
1028 objects, because the result is simply the boxed object. So presumably
1029 it should be entered&mdash;we never usually return an unevaluated
1030 object!  This is a pain: primitive ops aren't supposed to do
1031 complicated things like enter objects.  The current solution is to
return a single-element unboxed tuple (see <XRef LinkEnd="unboxed-tuples">).
1033 </Para>
1034
1035 <Para>
1036
1037 <ProgramListing>
1038 indexArray#       :: Array# elt -> Int# -> (# elt #)
1039 </ProgramListing>
1040
1041 <IndexTerm><Primary><literal>indexArray&num;</literal></Primary></IndexTerm>
1042 </Para>
1043
1044 </Sect2>
1045
1046 <Sect2>
1047 <Title>The state type</Title>
1048
1049 <Para>
1050 <IndexTerm><Primary><literal>state, primitive type</literal></Primary></IndexTerm>
1051 <IndexTerm><Primary><literal>State&num;</literal></Primary></IndexTerm>
1052 </Para>
1053
1054 <Para>
1055 The primitive type <Literal>State&num;</Literal> represents the state of a state
1056 transformer.  It is parameterised on the desired type of state, which
1057 serves to keep states from distinct threads distinct from one another.
1058 But the <Emphasis>only</Emphasis> effect of this parameterisation is in the type
1059 system: all values of type <Literal>State&num;</Literal> are represented in the same way.
1060 Indeed, they are all represented by nothing at all!  The code
1061 generator &ldquo;knows&rdquo; to generate no code, and allocate no registers
1062 etc, for primitive states.
1063 </Para>
1064
1065 <Para>
1066
1067 <ProgramListing>
1068 type State# s
1069 </ProgramListing>
1070
1071 </Para>
1072
1073 <Para>
1074 The type <Literal>GHC.RealWorld</Literal> is truly opaque: there are no values defined
1075 of this type, and no operations over it.  It is &ldquo;primitive&rdquo; in that
1076 sense - but it is <Emphasis>not unlifted!</Emphasis> Its only role in life is to be
1077 the type which distinguishes the <Literal>IO</Literal> state transformer.
1078 </Para>
1079
1080 <Para>
1081
1082 <ProgramListing>
1083 data RealWorld
1084 </ProgramListing>
1085
1086 </Para>
1087
1088 </Sect2>
1089
1090 <Sect2>
1091 <Title>State of the world</Title>
1092
1093 <Para>
1094 A single, primitive, value of type <Literal>State&num; RealWorld</Literal> is provided.
1095 </Para>
1096
1097 <Para>
1098
1099 <ProgramListing>
1100 realWorld# :: State# RealWorld
1101 </ProgramListing>
1102
1103 <IndexTerm><Primary>realWorld&num; state object</Primary></IndexTerm>
1104 </Para>
1105
1106 <Para>
1107 (Note: in the compiler, not a <Literal>PrimOp</Literal>; just a mucho magic
1108 <Literal>Id</Literal>. Exported from <Literal>GHC</Literal>, though).
1109 </Para>
1110
1111 </Sect2>
1112
1113 <Sect2 id="sect-mutable">
1114 <Title>Mutable arrays</Title>
1115
1116 <Para>
1117 <IndexTerm><Primary>mutable arrays</Primary></IndexTerm>
1118 <IndexTerm><Primary>arrays, mutable</Primary></IndexTerm>
1119 Corresponding to <Literal>Array&num;</Literal> and <Literal>ByteArray&num;</Literal>, we have the types of
1120 mutable versions of each.  In each case, the representation is a
1121 pointer to a suitable block of (mutable) heap-allocated storage.
1122 </Para>
1123
1124 <Para>
1125
1126 <ProgramListing>
1127 type MutableArray# s elt
1128 type MutableByteArray# s
1129 </ProgramListing>
1130
1131 <IndexTerm><Primary><literal>MutableArray&num;</literal></Primary></IndexTerm>
1132 <IndexTerm><Primary><literal>MutableByteArray&num;</literal></Primary></IndexTerm>
1133 </Para>
1134
1135 <Sect3>
1136 <Title>Allocation</Title>
1137
1138 <Para>
1139 <IndexTerm><Primary>mutable arrays, allocation</Primary></IndexTerm>
1140 <IndexTerm><Primary>arrays, allocation</Primary></IndexTerm>
1141 <IndexTerm><Primary>allocation, of mutable arrays</Primary></IndexTerm>
1142 </Para>
1143
1144 <Para>
1145 Mutable arrays can be allocated. Only pointer-arrays are initialised;
1146 arrays of non-pointers are filled in by &ldquo;user code&rdquo; rather than by
1147 the array-allocation primitive.  Reason: only the pointer case has to
1148 worry about GC striking with a partly-initialised array.
1149 </Para>
1150
1151 <Para>
1152
1153 <ProgramListing>
1154 newArray#       :: Int# -> elt -> State# s -> (# State# s, MutableArray# s elt #)
1155
newCharArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newIntArray#    :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newAddrArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newFloatArray#  :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newDoubleArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
1161 </ProgramListing>
1162
1163 <IndexTerm><Primary><literal>newArray&num;</literal></Primary></IndexTerm>
1164 <IndexTerm><Primary><literal>newCharArray&num;</literal></Primary></IndexTerm>
1165 <IndexTerm><Primary><literal>newIntArray&num;</literal></Primary></IndexTerm>
1166 <IndexTerm><Primary><literal>newAddrArray&num;</literal></Primary></IndexTerm>
1167 <IndexTerm><Primary><literal>newFloatArray&num;</literal></Primary></IndexTerm>
1168 <IndexTerm><Primary><literal>newDoubleArray&num;</literal></Primary></IndexTerm>
1169 </Para>
1170
1171 <Para>
1172 The size of a <Literal>ByteArray&num;</Literal> is given in bytes.
1173 </Para>
1174
1175 </Sect3>
1176
1177 <Sect3>
1178 <Title>Reading and writing</Title>
1179
1180 <Para>
1181 <IndexTerm><Primary>arrays, reading and writing</Primary></IndexTerm>
1182 </Para>
1183
1184 <Para>
1185
1186 <ProgramListing>
1187 readArray#       :: MutableArray# s elt -> Int# -> State# s -> (# State# s, elt #)
1188 readCharArray#   :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Char# #)
1189 readIntArray#    :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #)
1190 readAddrArray#   :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Addr# #)
1191 readFloatArray#  :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Float# #)
1192 readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Double# #)
1193
1194 writeArray#       :: MutableArray# s elt -> Int# -> elt     -> State# s -> State# s
1195 writeCharArray#   :: MutableByteArray# s -> Int# -> Char#   -> State# s -> State# s
1196 writeIntArray#    :: MutableByteArray# s -> Int# -> Int#    -> State# s -> State# s
1197 writeAddrArray#   :: MutableByteArray# s -> Int# -> Addr#   -> State# s -> State# s
1198 writeFloatArray#  :: MutableByteArray# s -> Int# -> Float#  -> State# s -> State# s
1199 writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s
1200 </ProgramListing>
1201
1202 <IndexTerm><Primary><literal>readArray&num;</literal></Primary></IndexTerm>
1203 <IndexTerm><Primary><literal>readCharArray&num;</literal></Primary></IndexTerm>
1204 <IndexTerm><Primary><literal>readIntArray&num;</literal></Primary></IndexTerm>
1205 <IndexTerm><Primary><literal>readAddrArray&num;</literal></Primary></IndexTerm>
1206 <IndexTerm><Primary><literal>readFloatArray&num;</literal></Primary></IndexTerm>
1207 <IndexTerm><Primary><literal>readDoubleArray&num;</literal></Primary></IndexTerm>
1208 <IndexTerm><Primary><literal>writeArray&num;</literal></Primary></IndexTerm>
1209 <IndexTerm><Primary><literal>writeCharArray&num;</literal></Primary></IndexTerm>
1210 <IndexTerm><Primary><literal>writeIntArray&num;</literal></Primary></IndexTerm>
1211 <IndexTerm><Primary><literal>writeAddrArray&num;</literal></Primary></IndexTerm>
1212 <IndexTerm><Primary><literal>writeFloatArray&num;</literal></Primary></IndexTerm>
1213 <IndexTerm><Primary><literal>writeDoubleArray&num;</literal></Primary></IndexTerm>
1214 </Para>
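
<Para>
To show how the state token is threaded through a sequence of reads and
writes, here is a minimal sketch.  It is written against a recent GHC and
is an assumption about that environment, not a description of the release
documented here: the primops come from <Literal>GHC.Exts</Literal>, there is
a single <Function>newByteArray&num;</Function> allocator instead of the
per-type ones listed above, and the <Literal>ST</Literal> constructor is
taken from <Literal>GHC.ST</Literal>.  The name
<Function>sumTwoSlots</Function> is made up for the example.
</Para>

<ProgramListing>
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main (main) where

import Foreign.Storable (sizeOf)
import GHC.Exts (Int (I#), newByteArray#, readIntArray#, writeIntArray#, (+#))
import GHC.ST (ST (ST), runST)

-- Allocate room for two Ints, fill both slots, then read them back and add.
-- In the GHC assumed here the size is given in bytes, while the indices
-- 0# and 1# count Int-sized elements.
sumTwoSlots :: Int -&gt; Int -&gt; Int
sumTwoSlots (I# x) (I# y) =
  case 2 * sizeOf (0 :: Int) of
    I# nBytes -&gt; runST (ST (\s0 -&gt;
      case newByteArray# nBytes s0 of
        (# s1, mba #) -&gt;
          -- each operation consumes one state token and yields the next
          case writeIntArray# mba 0# x s1 of
            s2 -&gt; case writeIntArray# mba 1# y s2 of
              s3 -&gt; case readIntArray# mba 0# s3 of
                (# s4, a #) -&gt; case readIntArray# mba 1# s4 of
                  (# s5, b #) -&gt; (# s5, I# (a +# b) #)))

main :: IO ()
main = print (sumTwoSlots 3 4)   -- prints 7
</ProgramListing>

<Para>
Each state token is used exactly once, and that data dependency is what
fixes the order of the reads and writes.
</Para>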
1215
1216 </Sect3>
1217
1218 <Sect3>
1219 <Title>Equality</Title>
1220
1221 <Para>
1222 <IndexTerm><Primary>arrays, testing for equality</Primary></IndexTerm>
1223 </Para>
1224
1225 <Para>
1226 One can take &ldquo;equality&rdquo; of mutable arrays.  What is compared is the
1227 <Emphasis>name</Emphasis> or reference to the mutable array, not its contents.
1228 </Para>
1229
1230 <Para>
1231
1232 <ProgramListing>
1233 sameMutableArray#     :: MutableArray# s elt -> MutableArray# s elt -> Bool
1234 sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
1235 </ProgramListing>
1236
1237 <IndexTerm><Primary><literal>sameMutableArray&num;</literal></Primary></IndexTerm>
1238 <IndexTerm><Primary><literal>sameMutableByteArray&num;</literal></Primary></IndexTerm>
1239 </Para>
1240
1241 </Sect3>
1242
1243 <Sect3>
1244 <Title>Freezing mutable arrays</Title>
1245
1246 <Para>
1247 <IndexTerm><Primary>arrays, freezing mutable</Primary></IndexTerm>
1248 <IndexTerm><Primary>freezing mutable arrays</Primary></IndexTerm>
1249 <IndexTerm><Primary>mutable arrays, freezing</Primary></IndexTerm>
1250 </Para>
1251
1252 <Para>
1253 Only unsafe-freeze has a primitive.  (Safe freeze is done directly in Haskell
1254 by copying the array and then using <Function>unsafeFreeze</Function>.)
1255 </Para>
1256
1257 <Para>
1258
1259 <ProgramListing>
unsafeFreezeArray#     :: MutableArray# s elt -> State# s -> (# State# s, Array# elt #)
1261 unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> (# State# s, ByteArray# #)
1262 </ProgramListing>
1263
1264 <IndexTerm><Primary><literal>unsafeFreezeArray&num;</literal></Primary></IndexTerm>
1265 <IndexTerm><Primary><literal>unsafeFreezeByteArray&num;</literal></Primary></IndexTerm>
1266 </Para>
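
<Para>
The copy-then-unsafe-freeze recipe can be sketched at the library level.
The following assumes the <Literal>Data.Array.ST</Literal> and
<Literal>Data.Array.Unsafe</Literal> modules of the
<Literal>array</Literal> package shipped with recent GHCs, and works on
boxed <Literal>STArray</Literal>s, not directly on the primitives above.
<Function>copySTArray</Function> and <Function>safeFreeze</Function> are
illustrative names, not library functions.
</Para>

<ProgramListing>
module Main (main) where

import Control.Monad.ST (ST, runST)
import Data.Array (Array, elems)
import Data.Array.ST (STArray, getBounds, getElems, newListArray, writeArray)
import Data.Array.Unsafe (unsafeFreeze)

-- Make a fresh mutable array with the same bounds and contents.
copySTArray :: STArray s Int e -&gt; ST s (STArray s Int e)
copySTArray marr = do
  bnds &lt;- getBounds marr
  xs   &lt;- getElems marr
  newListArray bnds xs

-- "Safe" freeze: the copy is never written to again, so freezing it
-- in place cannot be observed.
safeFreeze :: STArray s Int e -&gt; ST s (Array Int e)
safeFreeze marr = copySTArray marr &gt;&gt;= unsafeFreeze

demo :: ([Int], [Int])
demo = runST (do
  marr   &lt;- newListArray (0, 2) [1, 2, 3] :: ST s (STArray s Int Int)
  frozen &lt;- safeFreeze marr
  writeArray marr 0 99          -- mutate the original after freezing
  after  &lt;- getElems marr
  return (elems frozen, after))

main :: IO ()
main = print demo   -- prints ([1,2,3],[99,2,3])
</ProgramListing>

<Para>
Had <Function>unsafeFreeze</Function> been applied to
<VarName>marr</VarName> itself, the later write could show through in the
supposedly immutable result; the copy is what buys safety.
</Para>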
1267
1268 </Sect3>
1269
1270 </Sect2>
1271
1272 <Sect2>
<Title>Synchronising variables (M-vars)</Title>
1274
1275 <Para>
1276 <IndexTerm><Primary>synchronising variables (M-vars)</Primary></IndexTerm>
1277 <IndexTerm><Primary>M-Vars</Primary></IndexTerm>
1278 </Para>
1279
1280 <Para>
1281 Synchronising variables are the primitive type used to implement
1282 Concurrent Haskell's MVars (see the Concurrent Haskell paper for
1283 the operational behaviour of these operations).
1284 </Para>
1285
1286 <Para>
1287
1288 <ProgramListing>
1289 type MVar# s elt        -- primitive
1290
1291 newMVar#    :: State# s -> (# State# s, MVar# s elt #)
takeMVar#   :: MVar# s elt -> State# s -> (# State# s, elt #)
putMVar#    :: MVar# s elt -> elt -> State# s -> State# s
1294 </ProgramListing>
1295
<IndexTerm><Primary><literal>MVar&num;</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>newMVar&num;</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>takeMVar&num;</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>putMVar&num;</literal></Primary></IndexTerm>
1300 </Para>
1301
1302 </Sect2>
1303
1304 </Sect1>
1305
1306 <Sect1 id="glasgow-ST-monad">
1307 <Title>Primitive state-transformer monad
1308 </Title>
1309
1310 <Para>
1311 <IndexTerm><Primary>state transformers (Glasgow extensions)</Primary></IndexTerm>
1312 <IndexTerm><Primary>ST monad (Glasgow extension)</Primary></IndexTerm>
1313 </Para>
1314
1315 <Para>
1316 This monad underlies our implementation of arrays, mutable and
1317 immutable, and our implementation of I/O, including &ldquo;C calls&rdquo;.
1318 </Para>
1319
1320 <Para>
1321 The <Literal>ST</Literal> library, which provides access to the
1322 <Function>ST</Function> monad, is described in <xref
1323 linkend="sec-ST">.
1324 </Para>
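
<Para>
As a small taste of what the monad buys you, here is a sketch that sums a
list with a mutable accumulator and still presents a pure interface.  It
assumes the module names <Literal>Control.Monad.ST</Literal> and
<Literal>Data.STRef</Literal> from recent versions of the base library;
<Function>sumST</Function> is a made-up name.
</Para>

<ProgramListing>
module Main (main) where

import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- The mutable reference never escapes runST, so sumST is an ordinary
-- pure function from the caller's point of view.
sumST :: [Int] -&gt; Int
sumST xs = runST (do
  acc &lt;- newSTRef 0
  mapM_ (\x -&gt; modifySTRef' acc (+ x)) xs
  readSTRef acc)

main :: IO ()
main = print (sumST [1, 2, 3, 4])   -- prints 10
</ProgramListing>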
1325
1326 </Sect1>
1327
1328 <Sect1 id="glasgow-prim-arrays">
1329 <Title>Primitive arrays, mutable and otherwise
1330 </Title>
1331
1332 <Para>
1333 <IndexTerm><Primary>primitive arrays (Glasgow extension)</Primary></IndexTerm>
1334 <IndexTerm><Primary>arrays, primitive (Glasgow extension)</Primary></IndexTerm>
1335 </Para>
1336
1337 <Para>
1338 GHC knows about quite a few flavours of Large Swathes of Bytes.
1339 </Para>
1340
1341 <Para>
1342 First, GHC distinguishes between primitive arrays of (boxed) Haskell
1343 objects (type <Literal>Array&num; obj</Literal>) and primitive arrays of bytes (type
1344 <Literal>ByteArray&num;</Literal>).
1345 </Para>
1346
1347 <Para>
1348 Second, it distinguishes between&hellip;
1349 <VariableList>
1350
1351 <VarListEntry>
1352 <Term>Immutable:</Term>
1353 <ListItem>
1354 <Para>
1355 Arrays that do not change (as with &ldquo;standard&rdquo; Haskell arrays); you
1356 can only read from them.  Obviously, they do not need the care and
1357 attention of the state-transformer monad.
1358 </Para>
1359 </ListItem>
1360 </VarListEntry>
1361 <VarListEntry>
1362 <Term>Mutable:</Term>
1363 <ListItem>
1364 <Para>
1365 Arrays that may be changed or &ldquo;mutated.&rdquo;  All the operations on them
1366 live within the state-transformer monad and the updates happen
1367 <Emphasis>in-place</Emphasis>.
1368 </Para>
1369 </ListItem>
1370 </VarListEntry>
1371 <VarListEntry>
1372 <Term>&ldquo;Static&rdquo; (in C land):</Term>
1373 <ListItem>
1374 <Para>
1375 A C routine may pass an <Literal>Addr&num;</Literal> pointer back into Haskell land.  There
1376 are then primitive operations with which you may merrily grab values
1377 over in C land, by indexing off the &ldquo;static&rdquo; pointer.
1378 </Para>
1379 </ListItem>
1380 </VarListEntry>
1381 <VarListEntry>
1382 <Term>&ldquo;Stable&rdquo; pointers:</Term>
1383 <ListItem>
1384 <Para>
1385 If, for some reason, you wish to hand a Haskell pointer (i.e.,
1386 <Emphasis>not</Emphasis> an unboxed value) to a C routine, you first make the
1387 pointer &ldquo;stable,&rdquo; so that the garbage collector won't forget that it
1388 exists.  That is, GHC provides a safe way to pass Haskell pointers to
1389 C.
1390 </Para>
1391
1392 <Para>
1393 Please see <XRef LinkEnd="sec-stable-pointers"> for more details.
1394 </Para>
1395 </ListItem>
1396 </VarListEntry>
1397 <VarListEntry>
1398 <Term>&ldquo;Foreign objects&rdquo;:</Term>
1399 <ListItem>
1400 <Para>
1401 A &ldquo;foreign object&rdquo; is a safe way to pass an external object (a
1402 C-allocated pointer, say) to Haskell and have Haskell do the Right
1403 Thing when it no longer references the object.  So, for example, C
1404 could pass a large bitmap over to Haskell and say &ldquo;please free this
1405 memory when you're done with it.&rdquo;
1406 </Para>
1407
1408 <Para>
1409 Please see <XRef LinkEnd="sec-ForeignObj"> for more details.
1410 </Para>
1411 </ListItem>
1412 </VarListEntry>
1413 </VariableList>
1414 </Para>
1415
1416 <Para>
The libraries documentation gives more details on all these
1418 &ldquo;primitive array&rdquo; types and the operations on them.
1419 </Para>
1420
1421 </Sect1>
1422
1423
1424 <Sect1 id="pattern-guards">
1425 <Title>Pattern guards</Title>
1426
1427 <Para>
1428 <IndexTerm><Primary>Pattern guards (Glasgow extension)</Primary></IndexTerm>
The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ULink URL="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ULink>. (Note that the proposal was written before pattern guards were implemented, so it refers to them as unimplemented.)
1430 </Para>
1431
1432 <Para>
1433 Suppose we have an abstract data type of finite maps, with a
1434 lookup operation:
1435
1436 <ProgramListing>
1437 lookup :: FiniteMap -> Int -> Maybe Int
1438 </ProgramListing>
1439
1440 The lookup returns <Function>Nothing</Function> if the supplied key is not in the domain of the mapping, and <Function>(Just v)</Function> otherwise,
1441 where <VarName>v</VarName> is the value that the key maps to.  Now consider the following definition:
1442 </Para>
1443
1444 <ProgramListing>
clunky env var1 var2 | ok1 &amp;&amp; ok2 = val1 + val2
                     | otherwise  = var1 + var2
  where
    m1 = lookup env var1
    m2 = lookup env var2
    ok1 = maybeToBool m1
    ok2 = maybeToBool m2
    val1 = expectJust m1
    val2 = expectJust m2
1454 </ProgramListing>
1455
1456 <Para>
1457 The auxiliary functions are
1458 </Para>
1459
1460 <ProgramListing>
1461 maybeToBool :: Maybe a -&gt; Bool
1462 maybeToBool (Just x) = True
1463 maybeToBool Nothing  = False
1464
1465 expectJust :: Maybe a -&gt; a
1466 expectJust (Just x) = x
1467 expectJust Nothing  = error "Unexpected Nothing"
1468 </ProgramListing>
1469
1470 <Para>
1471 What is <Function>clunky</Function> doing? The guard <Literal>ok1 &&
1472 ok2</Literal> checks that both lookups succeed, using
1473 <Function>maybeToBool</Function> to convert the <Function>Maybe</Function>
1474 types to booleans. The (lazily evaluated) <Function>expectJust</Function>
calls extract the values from the results of the lookups, and bind the
1476 returned values to <VarName>val1</VarName> and <VarName>val2</VarName>
1477 respectively.  If either lookup fails, then clunky takes the
1478 <Literal>otherwise</Literal> case and returns the sum of its arguments.
1479 </Para>
1480
1481 <Para>
1482 This is certainly legal Haskell, but it is a tremendously verbose and
1483 un-obvious way to achieve the desired effect.  Arguably, a more direct way
1484 to write clunky would be to use case expressions:
1485 </Para>
1486
1487 <ProgramListing>
clunky env var1 var2 = case lookup env var1 of
    Nothing -&gt; fail
    Just val1 -&gt; case lookup env var2 of
        Nothing -&gt; fail
        Just val2 -&gt; val1 + val2
  where
    fail = var1 + var2
1495 </ProgramListing>
1496
1497 <Para>
1498 This is a bit shorter, but hardly better.  Of course, we can rewrite any set
1499 of pattern-matching, guarded equations as case expressions; that is
1500 precisely what the compiler does when compiling equations! The reason that
Haskell provides guarded equations is that they allow us to write down
1502 the cases we want to consider, one at a time, independently of each other.
1503 This structure is hidden in the case version.  Two of the right-hand sides
1504 are really the same (<Function>fail</Function>), and the whole expression
1505 tends to become more and more indented.
1506 </Para>
1507
1508 <Para>
1509 Here is how I would write clunky:
1510 </Para>
1511
1512 <ProgramListing>
clunky env var1 var2
1514   | Just val1 &lt;- lookup env var1
1515   , Just val2 &lt;- lookup env var2
1516   = val1 + val2
1517 ...other equations for clunky...
1518 </ProgramListing>
1519
1520 <Para>
The semantics should be clear enough.  The qualifiers are matched in order.
1522 For a <Literal>&lt;-</Literal> qualifier, which I call a pattern guard, the
1523 right hand side is evaluated and matched against the pattern on the left.
1524 If the match fails then the whole guard fails and the next equation is
1525 tried.  If it succeeds, then the appropriate binding takes place, and the
1526 next qualifier is matched, in the augmented environment.  Unlike list
1527 comprehensions, however, the type of the expression to the right of the
1528 <Literal>&lt;-</Literal> is the same as the type of the pattern to its
1529 left.  The bindings introduced by pattern guards scope over all the
1530 remaining guard qualifiers, and over the right hand side of the equation.
1531 </Para>
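
<Para>
To watch the fall-through behaviour in action, here is a self-contained
sketch.  The finite map is modelled as a plain association list, and
<Function>lookupFM</Function> is a hypothetical stand-in for the abstract
lookup operation; the second equation plays the part of the other
equations for clunky.
</Para>

<ProgramListing>
module Main (main) where

-- Assumption for this sketch only: a finite map is an association list.
type FiniteMap = [(Int, Int)]

lookupFM :: FiniteMap -&gt; Int -&gt; Maybe Int
lookupFM env k = lookup k env

clunky :: FiniteMap -&gt; Int -&gt; Int -&gt; Int
clunky env var1 var2
  | Just val1 &lt;- lookupFM env var1
  , Just val2 &lt;- lookupFM env var2
  = val1 + val2
-- Reached whenever either pattern guard above fails to match.
clunky _ var1 var2 = var1 + var2

main :: IO ()
main = do
  let env = [(1, 10), (2, 20)]
  print (clunky env 1 2)   -- both keys present: prints 30
  print (clunky env 1 3)   -- key 3 is missing:  prints 4
</ProgramListing>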
1532
1533 <Para>
Just as with list comprehensions, boolean expressions can be freely mixed
among the pattern guards.  For example:
1536 </Para>
1537
1538 <ProgramListing>
f x | [y] &lt;- x
    , y &gt; 3
    , Just z &lt;- h y
1542     = ...
1543 </ProgramListing>
1544
1545 <Para>
1546 Haskell's current guards therefore emerge as a special case, in which the
1547 qualifier list has just one element, a boolean expression.
1548 </Para>
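
<Para>
A runnable variant of the mixed example may help.  The helper
<Function>h</Function>, the result <Literal>z</Literal> and the catch-all
equation are invented here purely for illustration.
</Para>

<ProgramListing>
module Main (main) where

-- Hypothetical helper: succeeds only on even numbers.
h :: Int -&gt; Maybe Int
h y | even y    = Just (y `div` 2)
    | otherwise = Nothing

f :: [Int] -&gt; Int
f x | [y] &lt;- x          -- pattern guard: x must be a one-element list
    , y &gt; 3             -- ordinary boolean guard, may mention y
    , Just z &lt;- h y     -- pattern guard, may mention y as well
    = z
f _ = 0                    -- taken when any qualifier above fails

main :: IO ()
main = print (map f [[8], [2], [5], [8, 9]])   -- prints [4,0,0,0]
</ProgramListing>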
1549 </Sect1>
1550
1551   <sect1 id="sec-ffi">
1552     <title>The foreign interface</title>
1553
1554     <para>The foreign interface consists of the following components:</para>
1555
1556     <itemizedlist>
1557       <listitem>
1558         <para>The Foreign Function Interface language specification
1559         (included in this manual, in <xref linkend="ffi">).</para>
1560       </listitem>
1561
1562       <listitem>
1563         <para>The <literal>Foreign</literal> module (see <xref
1564         linkend="sec-Foreign">) collects together several interfaces
1565         which are useful in specifying foreign language
1566         interfaces, including the following:</para>
1567
1568         <itemizedlist>
1569           <listitem>
1570             <para>The <literal>ForeignObj</literal> module (see <xref
1571             linkend="sec-ForeignObj">), for managing pointers from
1572             Haskell into the outside world.</para>
1573           </listitem>
1574
1575           <listitem>
1576             <para>The <literal>StablePtr</literal> module (see <xref
1577             linkend="sec-stable-pointers">), for managing pointers
1578             into Haskell from the outside world.</para>
1579           </listitem>
1580
1581           <listitem>
1582             <para>The <literal>CTypes</literal> module (see <xref
1583             linkend="sec-CTypes">) gives Haskell equivalents for the
1584             standard C datatypes, for use in making Haskell bindings
1585             to existing C libraries.</para>
1586           </listitem>
1587
1588           <listitem>
1589             <para>The <literal>CTypesISO</literal> module (see <xref
1590             linkend="sec-CTypesISO">) gives Haskell equivalents for C
1591             types defined by the ISO C standard.</para>
1592           </listitem>
1593
1594           <listitem>
1595             <para>The <literal>Storable</literal> library, for
1596             primitive marshalling of data types between Haskell and
1597             the foreign language.</para>
1598           </listitem>
1599         </itemizedlist>
1600
1601       </listitem>
1602     </itemizedlist>
1603
1604 <para>The following sections also give some hints and tips on the use
1605 of the foreign function interface in GHC.</para>
1606
1607 <Sect2 id="glasgow-foreign-headers">
1608 <Title>Using function headers
1609 </Title>
1610
1611 <Para>
1612 <IndexTerm><Primary>C calls, function headers</Primary></IndexTerm>
1613 </Para>
1614
1615 <Para>
1616 When generating C (using the <Option>-fvia-C</Option> directive), one can assist the
1617 C compiler in detecting type errors by using the <Command>-&num;include</Command> directive
1618 to provide <Filename>.h</Filename> files containing function headers.
1619 </Para>
1620
1621 <Para>
1622 For example,
1623 </Para>
1624
1625 <Para>
1626
1627 <ProgramListing>
1628 #include "HsFFI.h"
1629
1630 void         initialiseEFS (HsInt size);
1631 HsInt        terminateEFS (void);
1632 HsForeignObj emptyEFS(void);
1633 HsForeignObj updateEFS (HsForeignObj a, HsInt i, HsInt x);
1634 HsInt        lookupEFS (HsForeignObj a, HsInt i);
1635 </ProgramListing>
1636 </Para>
1637
1638       <para>The types <literal>HsInt</literal>,
1639       <literal>HsForeignObj</literal> etc. are described in <xref
1640       linkend="sec-mapping-table">.</Para>
1641
1642       <Para>Note that this approach is only
1643       <Emphasis>essential</Emphasis> for returning
1644       <Literal>float</Literal>s (or if <Literal>sizeof(int) !=
1645       sizeof(int *)</Literal> on your architecture) but is a Good
1646       Thing for anyone who cares about writing solid code.  You're
1647       crazy not to do it.</Para>
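
<Para>
For comparison, here is a minimal sketch of naming the header on the
Haskell side.  It assumes the <Literal>foreign import</Literal> syntax as
later standardised in Haskell 2010, which may differ from what this release
accepts, and it uses the C library's <Function>sin</Function> because its
prototype is known.  <Function>c_sin</Function> is an arbitrary name.
</Para>

<ProgramListing>
{-# LANGUAGE ForeignFunctionInterface #-}
module Main (main) where

import Foreign.C.Types (CDouble)

-- "math.h sin" names the header that declares the function, followed by
-- the C function itself.
foreign import ccall unsafe "math.h sin"
  c_sin :: CDouble -&gt; CDouble

main :: IO ()
main = print (c_sin 0)
</ProgramListing>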
1648
1649 </Sect2>
1650
1651 </Sect1>
1652
1653 <Sect1 id="multi-param-type-classes">
1654 <Title>Multi-parameter type classes
1655 </Title>
1656
1657 <Para>
1658 This section documents GHC's implementation of multi-parameter type
1659 classes.  There's lots of background in the paper <ULink
1660 URL="http://research.microsoft.com/~simonpj/multi.ps.gz" >Type
1661 classes: exploring the design space</ULink > (Simon Peyton Jones, Mark
1662 Jones, Erik Meijer).
1663 </Para>
1664
1665 <Para>
I'd like to thank people who reported shortcomings in the GHC 3.02
1667 implementation.  Our default decisions were all conservative ones, and
1668 the experience of these heroic pioneers has given useful concrete
1669 examples to support several generalisations.  (These appear below as
1670 design choices not implemented in 3.02.)
1671 </Para>
1672
1673 <Para>
1674 I've discussed these notes with Mark Jones, and I believe that Hugs
1675 will migrate towards the same design choices as I outline here.
1676 Thanks to him, and to many others who have offered very useful
1677 feedback.
1678 </Para>
1679
1680 <Sect2>
1681 <Title>Types</Title>
1682
1683 <Para>
1684 There are the following restrictions on the form of a qualified
1685 type:
1686 </Para>
1687
1688 <Para>
1689
1690 <ProgramListing>
  forall tv1..tvn. (c1, ...,cn) => type
1692 </ProgramListing>
1693
1694 </Para>
1695
1696 <Para>
1697 (Here, I write the "foralls" explicitly, although the Haskell source
1698 language omits them; in Haskell 1.4, all the free type variables of an
1699 explicit source-language type signature are universally quantified,
1700 except for the class type variables in a class declaration.  However,
1701 in GHC, you can give the foralls if you want.  See <XRef LinkEnd="universal-quantification">).
1702 </Para>
1703
1704 <Para>
1705
1706 <OrderedList>
1707 <ListItem>
1708
1709 <Para>
1710  <Emphasis>Each universally quantified type variable
1711 <Literal>tvi</Literal> must be mentioned (i.e. appear free) in <Literal>type</Literal></Emphasis>.
1712
1713 The reason for this is that a value with a type that does not obey
1714 this restriction could not be used without introducing
1715 ambiguity. Here, for example, is an illegal type:
1716
1717
1718 <ProgramListing>
1719   forall a. Eq a => Int
1720 </ProgramListing>
1721
1722
1723 When a value with this type was used, the constraint <Literal>Eq tv</Literal>
1724 would be introduced where <Literal>tv</Literal> is a fresh type variable, and
1725 (in the dictionary-translation implementation) the value would be
1726 applied to a dictionary for <Literal>Eq tv</Literal>.  The difficulty is that we
1727 can never know which instance of <Literal>Eq</Literal> to use because we never
1728 get any more information about <Literal>tv</Literal>.
1729
1730 </Para>
1731 </ListItem>
1732 <ListItem>
1733
1734 <Para>
1735  <Emphasis>Every constraint <Literal>ci</Literal> must mention at least one of the
1736 universally quantified type variables <Literal>tvi</Literal></Emphasis>.
1737
For example, this type is OK because <Literal>C a b</Literal> mentions the
universally quantified type variable <Literal>a</Literal>:
1740
1741
1742 <ProgramListing>
1743   forall a. C a b => burble
1744 </ProgramListing>
1745
1746
1747 The next type is illegal because the constraint <Literal>Eq b</Literal> does not
1748 mention <Literal>a</Literal>:
1749
1750
1751 <ProgramListing>
1752   forall a. Eq b => burble
1753 </ProgramListing>
1754
1755
1756 The reason for this restriction is milder than the other one.  The
1757 excluded types are never useful or necessary (because the offending
1758 context doesn't need to be witnessed at this point; it can be floated
1759 out).  Furthermore, floating them out increases sharing. Lastly,
1760 excluding them is a conservative choice; it leaves a patch of
1761 territory free in case we need it later.
1762
1763 </Para>
1764 </ListItem>
1765
1766 </OrderedList>
1767
1768 </Para>
1769
1770 <Para>
1771 These restrictions apply to all types, whether declared in a type signature
1772 or inferred.
1773 </Para>
1774
1775 <Para>
1776 Unlike Haskell 1.4, constraints in types do <Emphasis>not</Emphasis> have to be of
1777 the form <Emphasis>(class type-variables)</Emphasis>.  Thus, these type signatures
1778 are perfectly OK
1779 </Para>
1780
1781 <Para>
1782
1783 <ProgramListing>
1784   f :: Eq (m a) => [m a] -> [m a]
1785   g :: Eq [a] => ...
1786 </ProgramListing>
1787
1788 </Para>
1789
1790 <Para>
1791 This choice recovers principal types, a property that Haskell 1.4 does not have.
1792 </Para>
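
<Para>
As a minimal sketch of such signatures in use (not taken from GHC itself), the
two functions below carry constraints on compound types rather than on bare
type variables.  The names <Function>dedup</Function> and <Function>sameRow</Function>
are hypothetical, and the <Literal>LANGUAGE</Literal> pragma assumes a GHC recent
enough to accept it; with the compiler described here you would give
<Option>-fglasgow-exts</Option> instead.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE FlexibleContexts #-}

-- Drop later duplicates.  The constraint is on the compound type (m a).
dedup :: Eq (m a) => [m a] -> [m a]
dedup []     = []
dedup (x:xs) = x : dedup (filter (/= x) xs)

-- Test whether a row occurs in a table.  The constraint is on [a].
sameRow :: Eq [a] => [a] -> [[a]] -> Bool
sameRow row rows = row `elem` rows
</ProgramListing>

</Para>

<Para>
For instance, <Literal>dedup [Just 1, Nothing, Just 1]</Literal> uses the
<Literal>Eq (Maybe a)</Literal> instance directly and yields
<Literal>[Just 1, Nothing]</Literal>.
</Para>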
1793
1794 </Sect2>
1795
1796 <Sect2>
1797 <Title>Class declarations</Title>
1798
1799 <Para>
1800
1801 <OrderedList>
1802 <ListItem>
1803
1804 <Para>
1805  <Emphasis>Multi-parameter type classes are permitted</Emphasis>. For example:
1806
1807
1808 <ProgramListing>
1809   class Collection c a where
1810     union :: c a -> c a -> c a
1811     ...etc.
1812 </ProgramListing>
1813
1814
1815
1816 </Para>
1817 </ListItem>
1818 <ListItem>
1819
1820 <Para>
1821  <Emphasis>The class hierarchy must be acyclic</Emphasis>.  However, the definition
1822 of "acyclic" involves only the superclass relationships.  For example,
1823 this is OK:
1824
1825
1826 <ProgramListing>
1827   class C a where {
1828     op :: D b => a -> b -> b
1829   }
1830
1831   class C a => D a where { ... }
1832 </ProgramListing>
1833
1834
1835 Here, <Literal>C</Literal> is a superclass of <Literal>D</Literal>, but it's OK for a
1836 class operation <Literal>op</Literal> of <Literal>C</Literal> to mention <Literal>D</Literal>.  (It
1837 would not be OK for <Literal>D</Literal> to be a superclass of <Literal>C</Literal>.)
1838
1839 </Para>
1840 </ListItem>
1841 <ListItem>
1842
1843 <Para>
1844  <Emphasis>There are no restrictions on the context in a class declaration
1845 (which introduces superclasses), except that the class hierarchy must
1846 be acyclic</Emphasis>.  So these class declarations are OK:
1847
1848
1849 <ProgramListing>
1850   class Functor (m k) => FiniteMap m k where
1851     ...
1852
1853   class (Monad m, Monad (t m)) => Transform t m where
1854     lift :: m a -> (t m) a
1855 </ProgramListing>
1856
1857
1858 </Para>
1859 </ListItem>
1860 <ListItem>
1861
1862 <Para>
1863  <Emphasis>In the signature of a class operation, every constraint
1864 must mention at least one type variable that is not a class type
1865 variable</Emphasis>.
1866
1867 Thus:
1868
1869
1870 <ProgramListing>
1871   class Collection c a where
1872     mapC :: Collection c b => (a->b) -> c a -> c b
1873 </ProgramListing>
1874
1875
is OK because the constraint <Literal>(Collection c b)</Literal> mentions
<Literal>b</Literal>, even though it also mentions the class variable
<Literal>c</Literal>.  On the other hand:
1879
1880
1881 <ProgramListing>
1882   class C a where
1883     op :: Eq a => (a,b) -> (a,b)
1884 </ProgramListing>
1885
1886
is not OK because the constraint <Literal>(Eq a)</Literal> mentions only the class
1888 type variable <Literal>a</Literal>, but not <Literal>b</Literal>.  However, any such
1889 example is easily fixed by moving the offending context up to the
1890 superclass context:
1891
1892
1893 <ProgramListing>
1894   class Eq a => C a where
    op :: (a,b) -> (a,b)
1896 </ProgramListing>
1897
1898
A yet more relaxed rule would allow the context of a class-op signature
to mention only class type variables.  However, that conflicts with
the second restriction on qualified types above (every constraint must
mention at least one universally quantified type variable).
1902
1903 </Para>
1904 </ListItem>
1905 <ListItem>
1906
1907 <Para>
1908  <Emphasis>The type of each class operation must mention <Emphasis>all</Emphasis> of
1909 the class type variables</Emphasis>.  For example:
1910
1911
1912 <ProgramListing>
1913   class Coll s a where
1914     empty  :: s
1915     insert :: s -> a -> s
1916 </ProgramListing>
1917
1918
1919 is not OK, because the type of <Literal>empty</Literal> doesn't mention
<Literal>a</Literal>.  This rule is a consequence of the first restriction on
qualified types above (each universally quantified type variable must
appear in the type), and has the same motivation.
1922
1923 Sometimes, offending class declarations exhibit misunderstandings.  For
1924 example, <Literal>Coll</Literal> might be rewritten
1925
1926
1927 <ProgramListing>
1928   class Coll s a where
1929     empty  :: s a
1930     insert :: s a -> a -> s a
1931 </ProgramListing>
1932
1933
1934 which makes the connection between the type of a collection of
1935 <Literal>a</Literal>'s (namely <Literal>(s a)</Literal>) and the element type <Literal>a</Literal>.
1936 Occasionally this really doesn't work, in which case you can split the
1937 class like this:
1938
1939
1940 <ProgramListing>
1941   class CollE s where
1942     empty  :: s
1943
1944   class CollE s => Coll s a where
1945     insert :: s -> a -> s
1946 </ProgramListing>
1947
1948
1949 </Para>
1950 </ListItem>
1951
1952 </OrderedList>
1953
1954 </Para>
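
<Para>
The rules above can be seen together in a small, hedged sketch: a
two-parameter class whose operations each mention both class variables, plus
one instance.  The method set, the list instance and the helper
<Function>fromList</Function> are illustrative assumptions, not part of any GHC
library, and the <Literal>LANGUAGE</Literal> pragma assumes a GHC recent enough
to accept it (otherwise use <Option>-fglasgow-exts</Option>).
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

-- c is the collection type constructor, a the element type.
-- Every operation mentions both class variables.
class Collection c a where
  empty    :: c a
  insert   :: a -> c a -> c a
  elements :: c a -> [a]

-- Lists as duplicate-free collections; insertion needs Eq on the elements.
instance Eq a => Collection [] a where
  empty = []
  insert x xs
    | x `elem` xs = xs
    | otherwise   = x : xs
  elements = id

-- Works for any collection, not only lists.
fromList :: Collection c a => [a] -> c a
fromList = foldr insert empty
</ProgramListing>

</Para>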
1955
1956 </Sect2>
1957
1958 <Sect2 id="instance-decls">
1959 <Title>Instance declarations</Title>
1960
1961 <Para>
1962
1963 <OrderedList>
1964 <ListItem>
1965
1966 <Para>
1967  <Emphasis>Instance declarations may not overlap</Emphasis>.  The two instance
1968 declarations
1969
1970
1971 <ProgramListing>
1972   instance context1 => C type1 where ...
1973   instance context2 => C type2 where ...
1974 </ProgramListing>
1975
1976
"overlap" if <Literal>type1</Literal> and <Literal>type2</Literal> unify.
1978
1979 However, if you give the command line option
1980 <Option>-fallow-overlapping-instances</Option><IndexTerm><Primary>-fallow-overlapping-instances
1981 option</Primary></IndexTerm> then two overlapping instance declarations are permitted
1982 iff
1983
1984
1985 <ItemizedList>
1986 <ListItem>
1987
1988 <Para>
1989  EITHER <Literal>type1</Literal> and <Literal>type2</Literal> do not unify
1990 </Para>
1991 </ListItem>
1992 <ListItem>
1993
1994 <Para>
1995  OR <Literal>type2</Literal> is a substitution instance of <Literal>type1</Literal>
1996 (but not identical to <Literal>type1</Literal>)
1997 </Para>
1998 </ListItem>
1999 <ListItem>
2000
2001 <Para>
2002  OR vice versa
2003 </Para>
2004 </ListItem>
2005
2006 </ItemizedList>
2007
2008
2009 Notice that these rules
2010
2011
2012 <ItemizedList>
2013 <ListItem>
2014
2015 <Para>
2016  make it clear which instance decl to use
2017 (pick the most specific one that matches)
2018
2019 </Para>
2020 </ListItem>
2021 <ListItem>
2022
2023 <Para>
 do not mention the contexts <Literal>context1</Literal>, <Literal>context2</Literal>.
2025 Reason: you can pick which instance decl
2026 "matches" based on the type.
2027 </Para>
2028 </ListItem>
2029
2030 </ItemizedList>
2031
2032
2033 Regrettably, GHC doesn't guarantee to detect overlapping instance
2034 declarations if they appear in different modules.  GHC can "see" the
2035 instance declarations in the transitive closure of all the modules
2036 imported by the one being compiled, so it can "see" all instance decls
2037 when it is compiling <Literal>Main</Literal>.  However, it currently chooses not
2038 to look at ones that can't possibly be of use in the module currently
2039 being compiled, in the interests of efficiency.  (Perhaps we should
2040 change that decision, at least for <Literal>Main</Literal>.)
2041
2042 </Para>
2043 </ListItem>
2044 <ListItem>
2045
2046 <Para>
 <Emphasis>There are no restrictions on the types in an instance
<Emphasis>head</Emphasis>, except that at least one of them must not be a type variable</Emphasis>.
2049 The instance "head" is the bit after the "=>" in an instance decl. For
2050 example, these are OK:
2051
2052
2053 <ProgramListing>
2054   instance C Int a where ...
2055
2056   instance D (Int, Int) where ...
2057
2058   instance E [[a]] where ...
2059 </ProgramListing>
2060
2061
2062 Note that instance heads <Emphasis>may</Emphasis> contain repeated type variables.
2063 For example, this is OK:
2064
2065
2066 <ProgramListing>
2067   instance Stateful (ST s) (MutVar s) where ...
2068 </ProgramListing>
2069
2070
2071 The "at least one not a type variable" restriction is to ensure that
2072 context reduction terminates: each reduction step removes one type
2073 constructor.  For example, the following would make the type checker
2074 loop if it wasn't excluded:
2075
2076
2077 <ProgramListing>
2078   instance C a => C a where ...
2079 </ProgramListing>
2080
2081
2082 There are two situations in which the rule is a bit of a pain. First,
2083 if one allows overlapping instance declarations then it's quite
2084 convenient to have a "default instance" declaration that applies if
2085 something more specific does not:
2086
2087
2088 <ProgramListing>
2089   instance C a where
2090     op = ... -- Default
2091 </ProgramListing>
2092
2093
2094 Second, sometimes you might want to use the following to get the
2095 effect of a "class synonym":
2096
2097
2098 <ProgramListing>
2099   class (C1 a, C2 a, C3 a) => C a where { }
2100
2101   instance (C1 a, C2 a, C3 a) => C a where { }
2102 </ProgramListing>
2103
2104
2105 This allows you to write shorter signatures:
2106
2107
2108 <ProgramListing>
2109   f :: C a => ...
2110 </ProgramListing>
2111
2112
2113 instead of
2114
2115
2116 <ProgramListing>
2117   f :: (C1 a, C2 a, C3 a) => ...
2118 </ProgramListing>
2119
2120
2121 I'm on the lookout for a simple rule that preserves decidability while
2122 allowing these idioms.  The experimental flag
2123 <Option>-fallow-undecidable-instances</Option><IndexTerm><Primary>-fallow-undecidable-instances
2124 option</Primary></IndexTerm> lifts this restriction, allowing all the types in an
2125 instance head to be type variables.
2126
2127 </Para>
2128 </ListItem>
2129 <ListItem>
2130
2131 <Para>
2132  <Emphasis>Unlike Haskell 1.4, instance heads may use type
2133 synonyms</Emphasis>.  As always, using a type synonym is just shorthand for
2134 writing the RHS of the type synonym definition.  For example:
2135
2136
2137 <ProgramListing>
2138   type Point = (Int,Int)
2139   instance C Point   where ...
2140   instance C [Point] where ...
2141 </ProgramListing>
2142
2143
2144 is legal.  However, if you added
2145
2146
2147 <ProgramListing>
2148   instance C (Int,Int) where ...
2149 </ProgramListing>
2150
2151
2152 as well, then the compiler will complain about the overlapping
2153 (actually, identical) instance declarations.  As always, type synonyms
2154 must be fully applied.  You cannot, for example, write:
2155
2156
2157 <ProgramListing>
2158   type P a = [[a]]
2159   instance Monad P where ...
2160 </ProgramListing>
2161
2162
2163 This design decision is independent of all the others, and easily
2164 reversed, but it makes sense to me.
2165
2166 </Para>
2167 </ListItem>
2168 <ListItem>
2169
2170 <Para>
2171 <Emphasis>The types in an instance-declaration <Emphasis>context</Emphasis> must all
2172 be type variables</Emphasis>. Thus
2173
2174
2175 <ProgramListing>
2176 instance C a b => Eq (a,b) where ...
2177 </ProgramListing>
2178
2179
2180 is OK, but
2181
2182
2183 <ProgramListing>
2184 instance C Int b => Foo b where ...
2185 </ProgramListing>
2186
2187
2188 is not OK.  Again, the intent here is to make sure that context
2189 reduction terminates.
2190
2191 Voluminous correspondence on the Haskell mailing list has convinced me
2192 that it's worth experimenting with a more liberal rule.  If you use
the flag <Option>-fallow-undecidable-instances</Option>, you can use arbitrary
2194 types in an instance context.  Termination is ensured by having a
2195 fixed-depth recursion stack.  If you exceed the stack depth you get a
2196 sort of backtrace, and the opportunity to increase the stack depth
2197 with <Option>-fcontext-stack</Option><Emphasis>N</Emphasis>.
2198
2199 </Para>
2200 </ListItem>
2201
2202 </OrderedList>
2203
2204 </Para>
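
<Para>
To make the overlap rule and the "class synonym" idiom concrete, here is a
sketch under stated assumptions.  It is written for a later GHC, where overlap
is requested per instance with <Literal>OVERLAPPABLE</Literal> and
<Literal>OVERLAPPING</Literal> pragmas and the liberal instance rules are
switched on with <Literal>LANGUAGE</Literal> pragmas; with the compiler
described here the corresponding switches are the command-line flags named
above.  The classes <Literal>Describe</Literal> and <Literal>ShowOrd</Literal> are
hypothetical.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}

class Describe a where
  describe :: a -> String

-- General instance: applies when nothing more specific matches.
instance {-# OVERLAPPABLE #-} Describe [a] where
  describe _ = "some list"

-- [Char] is a substitution instance of [a], so it is the more specific one.
instance {-# OVERLAPPING #-} Describe [Char] where
  describe s = "the string " ++ s

-- "Class synonym": ShowOrd a abbreviates (Show a, Ord a).
-- The instance head is a bare type variable, hence the liberal rules.
class (Show a, Ord a) => ShowOrd a
instance (Show a, Ord a) => ShowOrd a

largest :: ShowOrd a => [a] -> String
largest = show . maximum
</ProgramListing>

</Para>

<Para>
Here <Literal>describe "hi"</Literal> selects the <Literal>[Char]</Literal> instance,
while <Literal>describe [True]</Literal> falls back to the general one; neither
choice looks at an instance context.
</Para>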
2205
2206 </Sect2>
2207
2208 </Sect1>
2209
2210 <Sect1 id="universal-quantification">
2211 <Title>Explicit universal quantification
2212 </Title>
2213
2214 <Para>
2215 GHC now allows you to write explicitly quantified types.  GHC's
2216 syntax for this now agrees with Hugs's, namely:
2217 </Para>
2218
2219 <Para>
2220
2221 <ProgramListing>
2222         forall a b. (Ord a, Eq  b) => a -> b -> a
2223 </ProgramListing>
2224
2225 </Para>
2226
2227 <Para>
2228 The context is, of course, optional.  You can't use <Literal>forall</Literal> as
2229 a type variable any more!
2230 </Para>
2231
2232 <Para>
2233 Haskell type signatures are implicitly quantified.  The <Literal>forall</Literal>
2234 allows us to say exactly what this means.  For example:
2235 </Para>
2236
2237 <Para>
2238
2239 <ProgramListing>
2240         g :: b -> b
2241 </ProgramListing>
2242
2243 </Para>
2244
2245 <Para>
2246 means this:
2247 </Para>
2248
2249 <Para>
2250
2251 <ProgramListing>
2252         g :: forall b. (b -> b)
2253 </ProgramListing>
2254
2255 </Para>
2256
2257 <Para>
2258 The two are treated identically.
2259 </Para>
2260
2261 <Sect2 id="univ">
2262 <Title>Universally-quantified data type fields
2263 </Title>
2264
2265 <Para>
2266 In a <Literal>data</Literal> or <Literal>newtype</Literal> declaration one can quantify
2267 the types of the constructor arguments.  Here are several examples:
2268 </Para>
2269
2270 <Para>
2271
2272 <ProgramListing>
2273 data T a = T1 (forall b. b -> b -> b) a
2274
2275 data MonadT m = MkMonad { return :: forall a. a -> m a,
2276                           bind   :: forall a b. m a -> (a -> m b) -> m b
2277                         }
2278
2279 newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
2280 </ProgramListing>
2281
2282 </Para>
2283
2284 <Para>
2285 The constructors now have so-called <Emphasis>rank 2</Emphasis> polymorphic
types, in which there is a for-all in the argument types:
2287 </Para>
2288
2289 <Para>
2290
2291 <ProgramListing>
2292 T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
2293 MkMonad :: forall m. (forall a. a -> m a)
2294                   -> (forall a b. m a -> (a -> m b) -> m b)
2295                   -> MonadT m
2296 MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
2297 </ProgramListing>
2298
2299 </Para>
2300
2301 <Para>
2302 Notice that you don't need to use a <Literal>forall</Literal> if there's an
2303 explicit context.  For example in the first argument of the
2304 constructor <Function>MkSwizzle</Function>, an implicit "<Literal>forall a.</Literal>" is
2305 prefixed to the argument type.  The implicit <Literal>forall</Literal>
2306 quantifies all type variables that are not already in scope, and are
2307 mentioned in the type quantified over.
2308 </Para>
2309
2310 <Para>
2311 As for type signatures, implicit quantification happens for non-overloaded
2312 types too.  So if you write this:
2313
2314 <ProgramListing>
2315   data T a = MkT (Either a b) (b -> b)
2316 </ProgramListing>
2317
2318 it's just as if you had written this:
2319
2320 <ProgramListing>
2321   data T a = MkT (forall b. Either a b) (forall b. b -> b)
2322 </ProgramListing>
2323
2324 That is, since the type variable <Literal>b</Literal> isn't in scope, it's
2325 implicitly universally quantified.  (Arguably, it would be better
2326 to <Emphasis>require</Emphasis> explicit quantification on constructor arguments
2327 where that is what is wanted.  Feedback welcomed.)
2328 </Para>
2329
2330 </Sect2>
2331
2332 <Sect2>
2333 <Title>Construction </Title>
2334
2335 <Para>
You construct values of types <Literal>T, MonadT, Swizzle</Literal> by applying
2337 the constructor to suitable values, just as usual.  For example,
2338 </Para>
2339
2340 <Para>
2341
2342 <ProgramListing>
(T1 (\x y -> x) 3) :: T Int
2344
2345 (MkSwizzle sort)    :: Swizzle
2346 (MkSwizzle reverse) :: Swizzle
2347
2348 (let r x = Just x
2349      b m k = case m of
2350                 Just y -> k y
2351                 Nothing -> Nothing
2352   in
2353   MkMonad r b) :: MonadT Maybe
2354 </ProgramListing>
2355
2356 </Para>
2357
2358 <Para>
2359 The type of the argument can, as usual, be more general than the type
2360 required, as <Literal>(MkSwizzle reverse)</Literal> shows.  (<Function>reverse</Function>
2361 does not need the <Literal>Ord</Literal> constraint.)
2362 </Para>
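
<Para>
The constructions above can be collected into one self-contained sketch.
Several details are assumptions made so that it loads in a later GHC: the
<Literal>LANGUAGE</Literal> pragma stands in for <Option>-fglasgow-exts</Option>,
the <Literal>forall</Literal> in <Literal>Swizzle</Literal> is written explicitly,
the record field is called <Literal>unit</Literal> rather than
<Literal>return</Literal> to avoid clashing with the Prelude, and the helper
names (<Function>applyT</Function>, <Function>applySwizzle</Function> and the
value names) are hypothetical.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE RankNTypes #-}

import Data.List (sort)

data T a = T1 (forall b. b -> b -> b) a

data MonadT m = MkMonad
  { unit :: forall a. a -> m a
  , bind :: forall a b. m a -> (a -> m b) -> m b
  }

newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

t1 :: T Int
t1 = T1 (\x _ -> x) 3

bySort, byReverse :: Swizzle
bySort    = MkSwizzle sort
byReverse = MkSwizzle reverse   -- more general than required: no Ord needed

maybeMonad :: MonadT Maybe
maybeMonad = MkMonad unitMaybe bindMaybe
  where
    unitMaybe x   = Just x
    bindMaybe m k = case m of
      Just y  -> k y
      Nothing -> Nothing

-- Small observers so that the constructed values can be inspected.
applyT :: T a -> a -> a
applyT (T1 f x) y = f x y

applySwizzle :: Ord a => Swizzle -> [a] -> [a]
applySwizzle (MkSwizzle s) = s
</ProgramListing>

</Para>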
2363
2364 </Sect2>
2365
2366 <Sect2>
2367 <Title>Pattern matching</Title>
2368
2369 <Para>
2370 When you use pattern matching, the bound variables may now have
2371 polymorphic types.  For example:
2372 </Para>
2373
2374 <Para>
2375
2376 <ProgramListing>
2377         f :: T a -> a -> (a, Char)
2378         f (T1 f k) x = (f k x, f 'c' 'd')
2379
2380         g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
2381         g (MkSwizzle s) xs f = s (map f (s xs))
2382
2383         h :: MonadT m -> [m a] -> m [a]
2384         h m [] = return m []
2385         h m (x:xs) = bind m x           $ \y ->
2386                       bind m (h m xs)   $ \ys ->
2387                       return m (y:ys)
2388 </ProgramListing>
2389
2390 </Para>
2391
2392 <Para>
2393 In the function <Function>h</Function> we use the record selectors <Literal>return</Literal>
2394 and <Literal>bind</Literal> to extract the polymorphic bind and return functions
2395 from the <Literal>MonadT</Literal> data structure, rather than using pattern
2396 matching.
2397 </Para>
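
<Para>
A runnable variant of these three functions might look as follows.  This is a
sketch, not the guide's original text: the functions are renamed
(<Function>pairUp</Function>, <Function>twice</Function>,
<Function>sequenceT</Function>) so that they can live in one module, the field
<Literal>return</Literal> becomes <Literal>unit</Literal> to avoid the Prelude
name, <Literal>maybeT</Literal> is a hypothetical test value, and the
<Literal>LANGUAGE</Literal> pragma assumes a later GHC than the one described
here.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE RankNTypes #-}

import Data.List (sort)

data T a = T1 (forall b. b -> b -> b) a

newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

data MonadT m = MkMonad
  { unit :: forall a. a -> m a
  , bind :: forall a b. m a -> (a -> m b) -> m b
  }

-- pick is bound at a polymorphic type: used once at a, once at Char.
pairUp :: T a -> a -> (a, Char)
pairUp (T1 pick k) x = (pick k x, pick 'c' 'd')

-- s is used at [a] and again at [b].
twice :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
twice (MkSwizzle s) xs fn = s (map fn (s xs))

-- Record selectors extract the polymorphic fields instead of a pattern.
sequenceT :: MonadT m -> [m a] -> m [a]
sequenceT m []     = unit m []
sequenceT m (x:xs) = bind m x                $ \y ->
                     bind m (sequenceT m xs) $ \ys ->
                     unit m (y:ys)

maybeT :: MonadT Maybe
maybeT = MkMonad Just (\m k -> maybe Nothing k m)
</ProgramListing>

</Para>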
2398
2399 <Para>
2400 You cannot pattern-match against an argument that is polymorphic.
2401 For example:
2402
2403 <ProgramListing>
2404         newtype TIM s a = TIM (ST s (Maybe a))
2405
2406         runTIM :: (forall s. TIM s a) -> Maybe a
2407         runTIM (TIM m) = runST m
2408 </ProgramListing>
2409
2410 </Para>
2411
2412 <Para>
2413 Here the pattern-match fails, because you can't pattern-match against
2414 an argument of type <Literal>(forall s. TIM s a)</Literal>.  Instead you
2415 must bind the variable and pattern match in the right hand side:
2416
2417 <ProgramListing>
2418         runTIM :: (forall s. TIM s a) -> Maybe a
2419         runTIM tm = case tm of { TIM m -> runST m }
2420 </ProgramListing>
2421
2422 The <Literal>tm</Literal> on the right hand side is (invisibly) instantiated, like
2423 any polymorphic value at its occurrence site, and now you can pattern-match
2424 against it.
2425 </Para>
2426
2427 </Sect2>
2428
2429 <Sect2>
2430 <Title>The partial-application restriction</Title>
2431
2432 <Para>
2433 There is really only one way in which data structures with polymorphic
2434 components might surprise you: you must not partially apply them.
2435 For example, this is illegal:
2436 </Para>
2437
2438 <Para>
2439
2440 <ProgramListing>
2441         map MkSwizzle [sort, reverse]
2442 </ProgramListing>
2443
2444 </Para>
2445
2446 <Para>
2447 The restriction is this: <Emphasis>every subexpression of the program must
2448 have a type that has no for-alls, except that in a function
2449 application (f e1&hellip;en) the partial applications are not subject to
2450 this rule</Emphasis>.  The restriction makes type inference feasible.
2451 </Para>
2452
2453 <Para>
2454 In the illegal example, the sub-expression <Literal>MkSwizzle</Literal> has the
2455 polymorphic type <Literal>(Ord b => [b] -> [b]) -> Swizzle</Literal> and is not
2456 a sub-expression of an enclosing application.  On the other hand, this
2457 expression is OK:
2458 </Para>
2459
2460 <Para>
2461
2462 <ProgramListing>
2463         map (T1 (\a b -> a)) [1,2,3]
2464 </ProgramListing>
2465
2466 </Para>
2467
2468 <Para>
2469 even though it involves a partial application of <Function>T1</Function>, because
2470 the sub-expression <Literal>T1 (\a b -> a)</Literal> has type <Literal>Int -> T
2471 Int</Literal>.
2472 </Para>
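
<Para>
One way to stay within the restriction is to apply the constructor at each
element instead of passing it to <Function>map</Function>.  The sketch below
shows that rewrite next to the legal partial application of
<Function>T1</Function>.  The helpers <Function>runSwizzle</Function> and
<Function>payload</Function> are hypothetical, and the explicit
<Literal>forall</Literal> and <Literal>LANGUAGE</Literal> pragma assume a later
GHC than the one described here.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE RankNTypes #-}

import Data.List (sort)

data T a = T1 (forall b. b -> b -> b) a

newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

-- Instead of: map MkSwizzle [sort, reverse]
swizzles :: [Swizzle]
swizzles = [MkSwizzle sort, MkSwizzle reverse]

-- T1 applied to its polymorphic argument has the ordinary type Int -> T Int.
ts :: [T Int]
ts = map (T1 (\a _ -> a)) [1, 2, 3]

runSwizzle :: Ord a => Swizzle -> [a] -> [a]
runSwizzle (MkSwizzle s) = s

payload :: T a -> a
payload (T1 _ x) = x
</ProgramListing>

</Para>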
2473
2474 </Sect2>
2475
2476 <Sect2 id="sigs">
2477 <Title>Type signatures
2478 </Title>
2479
2480 <Para>
2481 Once you have data constructors with universally-quantified fields, or
2482 constants such as <Constant>runST</Constant> that have rank-2 types, it isn't long
2483 before you discover that you need more!  Consider:
2484 </Para>
2485
2486 <Para>
2487
2488 <ProgramListing>
2489   mkTs f x y = [T1 f x, T1 f y]
2490 </ProgramListing>
2491
2492 </Para>
2493
2494 <Para>
<Function>mkTs</Function> is a function that constructs some values of type
2496 <Literal>T</Literal>, using some pieces passed to it.  The trouble is that since
2497 <Literal>f</Literal> is a function argument, Haskell assumes that it is
2498 monomorphic, so we'll get a type error when applying <Function>T1</Function> to
2499 it.  This is a rather silly example, but the problem really bites in
2500 practice.  Lots of people trip over the fact that you can't make
"wrapper functions" for <Constant>runST</Constant> for exactly the same reason.
2502 In short, it is impossible to build abstractions around functions with
2503 rank-2 types.
2504 </Para>
2505
2506 <Para>
2507 The solution is fairly clear.  We provide the ability to give a rank-2
2508 type signature for <Emphasis>ordinary</Emphasis> functions (not only data
2509 constructors), thus:
2510 </Para>
2511
2512 <Para>
2513
2514 <ProgramListing>
  mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
2516   mkTs f x y = [T1 f x, T1 f y]
2517 </ProgramListing>
2518
2519 </Para>
2520
2521 <Para>
2522 This type signature tells the compiler to attribute <Literal>f</Literal> with
2523 the polymorphic type <Literal>(forall b. b -> b -> b)</Literal> when type
2524 checking the body of <Function>mkTs</Function>, so now the application of
2525 <Function>T1</Function> is fine.
2526 </Para>
2527
2528 <Para>
2529 There are two restrictions:
2530 </Para>
2531
2532 <Para>
2533
2534 <ItemizedList>
2535 <ListItem>
2536
2537 <Para>
2538  You can only define a rank 2 type, specified by the following
2539 grammar:
2540
2541
2542 <ProgramListing>
2543 rank2type ::= [forall tyvars .] [context =>] funty
2544 funty     ::= ([forall tyvars .] [context =>] ty) -> funty
2545             | ty
2546 ty        ::= ...current Haskell monotype syntax...
2547 </ProgramListing>
2548
2549
2550 Informally, the universal quantification must all be right at the beginning,
2551 or at the top level of a function argument.
2552
2553 </Para>
2554 </ListItem>
2555 <ListItem>
2556
2557 <Para>
2558  There is a restriction on the definition of a function whose
2559 type signature is a rank-2 type: the polymorphic arguments must be
2560 matched on the left hand side of the "<Literal>=</Literal>" sign.  You can't
2561 define <Function>mkTs</Function> like this:
2562
2563
2564 <ProgramListing>
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
2566 mkTs = \ f x y -> [T1 f x, T1 f y]
2567 </ProgramListing>
2568
2569
2570
2571 The same partial-application rule applies to ordinary functions with
2572 rank-2 types as applied to data constructors.
2573
2574 </Para>
2575 </ListItem>
2576
2577 </ItemizedList>
2578
2579 </Para>
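
<Para>
Put together, a complete module using a rank-2 signature on an ordinary
function could look like the sketch below.  Note that <Function>mkTs</Function>
takes three arguments, so its signature has three argument types.  The
observer <Function>choose</Function> is a hypothetical helper, and the
<Literal>LANGUAGE</Literal> pragma assumes a later GHC than the one described
here.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE RankNTypes #-}

data T a = T1 (forall b. b -> b -> b) a

-- The polymorphic argument f is matched on the left of the "=" sign.
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
mkTs f x y = [T1 f x, T1 f y]

-- Apply the stored selector to the stored value and one more value.
choose :: T a -> a -> a
choose (T1 f x) y = f x y
</ProgramListing>

</Para>

<Para>
For example, <Literal>map (\t -> choose t 0) (mkTs (\a _ -> a) 1 2)</Literal>
gives <Literal>[1,2]</Literal>, because the selector passed in keeps its first
argument.
</Para>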
2580
2581 </Sect2>
2582
2583
2584 <Sect2 id="hoist">
2585 <Title>Type synonyms and hoisting
2586 </Title>
2587
2588 <Para>
2589 GHC also allows you to write a <Literal>forall</Literal> in a type synonym, thus:
2590 <ProgramListing>
2591   type Discard a = forall b. a -> b -> a
2592
2593   f :: Discard a
2594   f x y = x
2595 </ProgramListing>
However, it is often convenient to use this sort of synonym at the right hand
end of an arrow, thus:
2598 <ProgramListing>
2599   type Discard a = forall b. a -> b -> a
2600
2601   g :: Int -> Discard Int
2602   g x y z = x+y
2603 </ProgramListing>
2604 Simply expanding the type synonym would give
2605 <ProgramListing>
2606   g :: Int -> (forall b. Int -> b -> Int)
2607 </ProgramListing>
2608 but GHC "hoists" the <Literal>forall</Literal> to give the isomorphic type
2609 <ProgramListing>
2610   g :: forall b. Int -> Int -> b -> Int
2611 </ProgramListing>
2612 In general, the rule is this: <Emphasis>to determine the type specified by any explicit
2613 user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly
2614 performs the transformation:</Emphasis>
2615 <ProgramListing>
2616   <Emphasis>type1</Emphasis> -> forall a. <Emphasis>type2</Emphasis>
2617 ==>
2618   forall a. <Emphasis>type1</Emphasis> -> <Emphasis>type2</Emphasis>
2619 </ProgramListing>
2620 (In fact, GHC tries to retain as much synonym information as possible for use in
2621 error messages, but that is a usability issue.)  This rule applies, of course, whether
2622 or not the <Literal>forall</Literal> comes from a synonym. For example, here is another
2623 valid way to write <Literal>g</Literal>'s type signature:
2624 <ProgramListing>
2625   g :: Int -> Int -> forall b. b -> Int
2626 </ProgramListing>
2627 </Para>
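
<Para>
The signatures discussed in this section can be tried out with the following
sketch.  It assumes a GHC that accepts the <Literal>LANGUAGE</Literal> pragma
(otherwise use <Option>-fglasgow-exts</Option>); whether a particular compiler
version hoists the <Literal>forall</Literal> internally should not matter for
these definitions.  The name <Function>g'</Function> is hypothetical and only
distinguishes the two ways of writing the signature.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE RankNTypes #-}

type Discard a = forall b. a -> b -> a

f :: Discard a
f x _ = x

-- The synonym appears at the right hand end of an arrow.
g :: Int -> Discard Int
g x y _ = x + y

-- The same function with the forall written out after two arrows.
g' :: Int -> Int -> forall b. b -> Int
g' x y _ = x + y
</ProgramListing>

</Para>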
2628 </Sect2>
2629
2630 </Sect1>
2631
2632 <Sect1 id="existential-quantification">
2633 <Title>Existentially quantified data constructors
2634 </Title>
2635
2636 <Para>
2637 The idea of using existential quantification in data type declarations
was suggested by Laufer (I believe, though doubtless someone will
2639 correct me), and implemented in Hope+. It's been in Lennart
2640 Augustsson's <Command>hbc</Command> Haskell compiler for several years, and
2641 proved very useful.  Here's the idea.  Consider the declaration:
2642 </Para>
2643
2644 <Para>
2645
2646 <ProgramListing>
2647   data Foo = forall a. MkFoo a (a -> Bool)
2648            | Nil
2649 </ProgramListing>
2650
2651 </Para>
2652
2653 <Para>
2654 The data type <Literal>Foo</Literal> has two constructors with types:
2655 </Para>
2656
2657 <Para>
2658
2659 <ProgramListing>
2660   MkFoo :: forall a. a -> (a -> Bool) -> Foo
2661   Nil   :: Foo
2662 </ProgramListing>
2663
2664 </Para>
2665
2666 <Para>
2667 Notice that the type variable <Literal>a</Literal> in the type of <Function>MkFoo</Function>
2668 does not appear in the data type itself, which is plain <Literal>Foo</Literal>.
2669 For example, the following expression is fine:
2670 </Para>
2671
2672 <Para>
2673
2674 <ProgramListing>
2675   [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
2676 </ProgramListing>
2677
2678 </Para>
2679
2680 <Para>
2681 Here, <Literal>(MkFoo 3 even)</Literal> packages an integer with a function
2682 <Function>even</Function> that maps an integer to <Literal>Bool</Literal>; and <Function>MkFoo 'c'
2683 isUpper</Function> packages a character with a compatible function.  These
2684 two things are each of type <Literal>Foo</Literal> and can be put in a list.
2685 </Para>
2686
2687 <Para>
What can we do with a value of type <Literal>Foo</Literal>?  In particular,
2689 what happens when we pattern-match on <Function>MkFoo</Function>?
2690 </Para>
2691
2692 <Para>
2693
2694 <ProgramListing>
2695   f (MkFoo val fn) = ???
2696 </ProgramListing>
2697
2698 </Para>
2699
2700 <Para>
2701 Since all we know about <Literal>val</Literal> and <Function>fn</Function> is that they
2702 are compatible, the only (useful) thing we can do with them is to
2703 apply <Function>fn</Function> to <Literal>val</Literal> to get a boolean.  For example:
2704 </Para>
2705
2706 <Para>
2707
2708 <ProgramListing>
2709   f :: Foo -> Bool
2710   f (MkFoo val fn) = fn val
2711 </ProgramListing>
2712
2713 </Para>
2714
2715 <Para>
What this allows us to do is to package heterogeneous values
2717 together with a bunch of functions that manipulate them, and then treat
2718 that collection of packages in a uniform manner.  You can express
2719 quite a bit of object-oriented-like programming this way.
2720 </Para>
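
<Para>
As a small, hedged illustration of that uniform treatment, the sketch below
builds a list of <Literal>Foo</Literal> values and runs each packaged test.
The names <Function>check</Function> and <Literal>foos</Literal> are
hypothetical, and the <Literal>LANGUAGE</Literal> pragma assumes a later GHC
than the one described here.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE ExistentialQuantification #-}

import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)
         | Nil

-- The only useful thing to do with the pair is to apply fn to val.
check :: Foo -> Bool
check (MkFoo val fn) = fn val
check Nil            = False

-- An Int and a Char, each packaged with a compatible predicate.
foos :: [Foo]
foos = [MkFoo (4 :: Int) even, MkFoo 'c' isUpper, Nil]
</ProgramListing>

</Para>

<Para>
Evaluating <Literal>map check foos</Literal> gives
<Literal>[True,False,False]</Literal>.
</Para>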
2721
2722 <Sect2 id="existential">
2723 <Title>Why existential?
2724 </Title>
2725
2726 <Para>
2727 What has this to do with <Emphasis>existential</Emphasis> quantification?
2728 Simply that <Function>MkFoo</Function> has the (nearly) isomorphic type
2729 </Para>
2730
2731 <Para>
2732
2733 <ProgramListing>
2734   MkFoo :: (exists a . (a, a -> Bool)) -> Foo
2735 </ProgramListing>
2736
2737 </Para>
2738
2739 <Para>
2740 But Haskell programmers can safely think of the ordinary
2741 <Emphasis>universally</Emphasis> quantified type given above, thereby avoiding
2742 adding a new existential quantification construct.
2743 </Para>
2744
2745 </Sect2>
2746
2747 <Sect2>
2748 <Title>Type classes</Title>
2749
2750 <Para>
2751 An easy extension (implemented in <Command>hbc</Command>) is to allow
2752 arbitrary contexts before the constructor.  For example:
2753 </Para>
2754
2755 <Para>
2756
2757 <ProgramListing>
2758 data Baz = forall a. Eq a => Baz1 a a
2759          | forall b. Show b => Baz2 b (b -> b)
2760 </ProgramListing>
2761
2762 </Para>
2763
2764 <Para>
2765 The two constructors have the types you'd expect:
2766 </Para>
2767
2768 <Para>
2769
2770 <ProgramListing>
2771 Baz1 :: forall a. Eq a => a -> a -> Baz
2772 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
2773 </ProgramListing>
2774
2775 </Para>
2776
2777 <Para>
2778 But when pattern matching on <Function>Baz1</Function> the matched values can be compared
2779 for equality, and when pattern matching on <Function>Baz2</Function> the first matched
2780 value can be converted to a string (as well as applying the function to it).
2781 So this program is legal:
2782 </Para>
2783
2784 <Para>
2785
2786 <ProgramListing>
2787   f :: Baz -> String
2788   f (Baz1 p q) | p == q    = "Yes"
2789                | otherwise = "No"
  f (Baz2 v fn)            = show (fn v)
2791 </ProgramListing>
2792
2793 </Para>
2794
2795 <Para>
2796 Operationally, in a dictionary-passing implementation, the
2797 constructors <Function>Baz1</Function> and <Function>Baz2</Function> must store the
2798 dictionaries for <Literal>Eq</Literal> and <Literal>Show</Literal> respectively, and
extract them on pattern matching.
2800 </Para>
2801
2802 <Para>
2803 Notice the way that the syntax fits smoothly with that used for
2804 universal quantification earlier.
2805 </Para>
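<Para>
As a minimal sketch of how these pieces fit together, the module below
wraps the declarations above into a complete program.  The function name
<Function>describe</Function> and the sample values are illustrative assumptions,
as is the <Literal>LANGUAGE</Literal> pragma, which assumes a GHC recent enough
to accept it (with the compiler described here, <Option>-fglasgow-exts</Option>
plays that role):
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

-- Each match brings the stored dictionary back into scope, so
-- (==) is available for Baz1 and show is available for Baz2.
describe :: Baz -> String
describe (Baz1 p q) | p == q    = "Yes"
                    | otherwise = "No"
describe (Baz2 v fn)            = show (fn v)

main :: IO ()
main = mapM_ (putStrLn . describe)
  [ Baz1 'x' 'x'              -- prints Yes
  , Baz1 (1 :: Int) 2         -- prints No
  , Baz2 (20 :: Int) (+ 1)    -- prints 21
  , Baz2 "ab" reverse         -- prints "ba" (with the quotes)
  ]
</ProgramListing>

</Para>

<Para>
The list elements hide four different types, yet <Function>describe</Function>
treats them uniformly, using only the operations that each constructor's
context provides.
</Para>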
2806
2807 </Sect2>
2808
2809 <Sect2>
2810 <Title>Restrictions</Title>
2811
2812 <Para>
2813 There are several restrictions on the ways in which existentially-quantified
constructors can be used.
2815 </Para>
2816
2817 <Para>
2818
2819 <ItemizedList>
2820 <ListItem>
2821
2822 <Para>
2823  When pattern matching, each pattern match introduces a new,
2824 distinct, type for each existential type variable.  These types cannot
2825 be unified with any other type, nor can they escape from the scope of
2826 the pattern match.  For example, these fragments are incorrect:
2827
2828
2829 <ProgramListing>
2830 f1 (MkFoo a f) = a
2831 </ProgramListing>
2832
2833
2834 Here, the type bound by <Function>MkFoo</Function> "escapes", because <Literal>a</Literal>
2835 is the result of <Function>f1</Function>.  One way to see why this is wrong is to
2836 ask what type <Function>f1</Function> has:
2837
2838
2839 <ProgramListing>
2840   f1 :: Foo -> a             -- Weird!
2841 </ProgramListing>
2842
2843
2844 What is this "<Literal>a</Literal>" in the result type? Clearly we don't mean
2845 this:
2846
2847
2848 <ProgramListing>
2849   f1 :: forall a. Foo -> a   -- Wrong!
2850 </ProgramListing>
2851
2852
The original program is just plain wrong.  Here's another sort of error:
2854
2855
2856 <ProgramListing>
2857   f2 (Baz1 a b) (Baz1 p q) = a==q
2858 </ProgramListing>
2859
2860
2861 It's ok to say <Literal>a==b</Literal> or <Literal>p==q</Literal>, but
2862 <Literal>a==q</Literal> is wrong because it equates the two distinct types arising
2863 from the two <Function>Baz1</Function> constructors.
2864
2865
2866 </Para>
2867 </ListItem>
2868 <ListItem>
2869
2870 <Para>
2871 You can't pattern-match on an existentially quantified
2872 constructor in a <Literal>let</Literal> or <Literal>where</Literal> group of
2873 bindings. So this is illegal:
2874
2875
2876 <ProgramListing>
2877   f3 x = a==b where { Baz1 a b = x }
2878 </ProgramListing>
2879
2880
2881 You can only pattern-match
2882 on an existentially-quantified constructor in a <Literal>case</Literal> expression or
2883 in the patterns of a function definition.
2884
2885 The reason for this restriction is really an implementation one.
2886 Type-checking binding groups is already a nightmare without
2887 existentials complicating the picture.  Also an existential pattern
2888 binding at the top level of a module doesn't make sense, because it's
2889 not clear how to prevent the existentially-quantified type "escaping".
2890 So for now, there's a simple-to-state restriction.  We'll see how
2891 annoying it is.
2892
2893 </Para>
2894 </ListItem>
2895 <ListItem>
2896
2897 <Para>
2898 You can't use existential quantification for <Literal>newtype</Literal>
2899 declarations.  So this is illegal:
2900
2901
2902 <ProgramListing>
2903   newtype T = forall a. Ord a => MkT a
2904 </ProgramListing>
2905
2906
2907 Reason: a value of type <Literal>T</Literal> must be represented as a pair
of a dictionary for <Literal>Ord a</Literal> and a value of type <Literal>a</Literal>.
2909 That contradicts the idea that <Literal>newtype</Literal> should have no
2910 concrete representation.  You can get just the same efficiency and effect
2911 by using <Literal>data</Literal> instead of <Literal>newtype</Literal>.  If there is no
2912 overloading involved, then there is more of a case for allowing
an existentially-quantified <Literal>newtype</Literal>, because the <Literal>data</Literal>
version does carry an implementation cost,
2915 but single-field existentially quantified constructors aren't much
2916 use.  So the simple restriction (no existential stuff on <Literal>newtype</Literal>)
2917 stands, unless there are convincing reasons to change it.
2918
2919
2920 </Para>
2921 </ListItem>
2922 <ListItem>
2923
2924 <Para>
2925  You can't use <Literal>deriving</Literal> to define instances of a
2926 data type with existentially quantified data constructors.
2927
Reason: in most cases it would not make sense. For example:
2929
2930 <ProgramListing>
2931 data T = forall a. MkT [a] deriving( Eq )
2932 </ProgramListing>
2933
2934 To derive <Literal>Eq</Literal> in the standard way we would need to have equality
2935 between the single component of two <Function>MkT</Function> constructors:
2936
2937 <ProgramListing>
2938 instance Eq T where
2939   (MkT a) == (MkT b) = ???
2940 </ProgramListing>
2941
2942 But <VarName>a</VarName> and <VarName>b</VarName> have distinct types, and so can't be compared.
2943 It's just about possible to imagine examples in which the derived instance
2944 would make sense, but it seems altogether simpler simply to prohibit such
2945 declarations.  Define your own instances!
2946 </Para>
2947 </ListItem>
2948
2949 </ItemizedList>
2950
2951 </Para>
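<Para>
The restrictions above are easy to live with in practice.  The following
sketch shows two of the usual workarounds: matching in a <Literal>case</Literal>
expression where a <Literal>where</Literal> binding is not allowed, and writing
an instance by hand where <Literal>deriving</Literal> is not allowed.  The
cut-down <Literal>Baz</Literal> type, the choice of comparing lengths as the
meaning of equality on <Literal>T</Literal>, and the <Literal>LANGUAGE</Literal>
pragma (which needs a reasonably recent GHC) are assumptions made for illustration:
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a => Baz1 a a

-- A let/where pattern binding on Baz1 is rejected, so match with case.
f3 :: Baz -> Bool
f3 x = case x of { Baz1 a b -> a == b }

data T = forall a. MkT [a]

-- No deriving, so the instance is written by hand.  The elements of the
-- two lists have distinct hidden types and cannot be compared; comparing
-- the lengths is an arbitrary choice made for this example.
instance Eq T where
  MkT xs == MkT ys = length xs == length ys

main :: IO ()
main = do
  print (f3 (Baz1 'a' 'a'))               -- True
  print (MkT "ab" == MkT [True, False])   -- True: both lists have length 2
</ProgramListing>

</Para>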
2952
2953 </Sect2>
2954
2955 </Sect1>
2956
2957 <Sect1 id="sec-assertions">
2958 <Title>Assertions
2959 <IndexTerm><Primary>Assertions</Primary></IndexTerm>
2960 </Title>
2961
2962 <Para>
2963 If you want to make use of assertions in your standard Haskell code, you
2964 could define a function like the following:
2965 </Para>
2966
2967 <Para>
2968
2969 <ProgramListing>
2970 assert :: Bool -> a -> a
2971 assert False x = error "assertion failed!"
2972 assert _     x = x
2973 </ProgramListing>
2974
2975 </Para>
2976
2977 <Para>
2978 which works, but gives you back a less than useful error message --
2979 an assertion failed, but which and where?
2980 </Para>
2981
2982 <Para>
2983 One way out is to define an extended <Function>assert</Function> function which also
2984 takes a descriptive string to include in the error message and
2985 perhaps combine this with the use of a pre-processor which inserts
2986 the source location where <Function>assert</Function> was used.
2987 </Para>
2988
2989 <Para>
GHC offers a helping hand here, doing all of this for you. For every
2991 use of <Function>assert</Function> in the user's source:
2992 </Para>
2993
2994 <Para>
2995
2996 <ProgramListing>
2997 kelvinToC :: Double -> Double
kelvinToC k = assert (k &gt;= 0.0) (k-273.15)
2999 </ProgramListing>
3000
3001 </Para>
3002
3003 <Para>
GHC will rewrite this to also include the source location where the
assertion was made:
3006 </Para>
3007
3008 <Para>
3009
3010 <ProgramListing>
3011 assert pred val ==> assertError "Main.hs|15" pred val
3012 </ProgramListing>
3013
3014 </Para>
3015
3016 <Para>
3017 The rewrite is only performed by the compiler when it spots
3018 applications of <Function>Exception.assert</Function>, so you can still define and
3019 use your own versions of <Function>assert</Function>, should you so wish. If not,
import <Literal>Exception</Literal> to make use of <Function>assert</Function> in your code.
3021 </Para>
3022
3023 <Para>
3024 To have the compiler ignore uses of assert, use the compiler option
3025 <Option>-fignore-asserts</Option>. <IndexTerm><Primary>-fignore-asserts option</Primary></IndexTerm> That is,
3026 expressions of the form <Literal>assert pred e</Literal> will be rewritten to <Literal>e</Literal>.
3027 </Para>
3028
3029 <Para>
Assertion failures can be caught; see the documentation for the
3031 <literal>Exception</literal> library (<xref linkend="sec-Exception">)
3032 for the details.
3033 </Para>
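<Para>
As a hedged sketch of catching a failed assertion: the program below assumes
a more recent library layout than the one described here, in which
<Function>assert</Function>, <Function>try</Function>, <Function>evaluate</Function> and the
<Literal>AssertionFailed</Literal> exception type are exported by the module
<Literal>Control.Exception</Literal>.  The negative input is chosen
deliberately to trip the assertion.
</Para>

<Para>

<ProgramListing>
module Main where

import Control.Exception (AssertionFailed (..), assert, evaluate, try)

kelvinToC :: Double -> Double
kelvinToC k = assert (k &gt;= 0.0) (k - 273.15)

main :: IO ()
main = do
  -- evaluate forces the result inside IO, so try can observe the failure.
  result &lt;- try (evaluate (kelvinToC (-5.0)))
  case result of
    Left (AssertionFailed msg) -> putStrLn ("assertion caught: " ++ msg)
    Right celsius              -> print celsius
</ProgramListing>

</Para>

<Para>
When assertions are being ignored (for instance under
<Option>-fignore-asserts</Option>), the check disappears and the
<Literal>Right</Literal> branch is taken instead.
</Para>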
3034
3035 </Sect1>
3036
3037 <Sect1 id="scoped-type-variables">
3038 <Title>Scoped Type Variables
3039 </Title>
3040
3041 <Para>
3042 A <Emphasis>pattern type signature</Emphasis> can introduce a <Emphasis>scoped type
3043 variable</Emphasis>.  For example
3044 </Para>
3045
3046 <Para>
3047
3048 <ProgramListing>
3049 f (xs::[a]) = ys ++ ys
3050            where
3051               ys :: [a]
3052               ys = reverse xs
3053 </ProgramListing>
3054
3055 </Para>
3056
3057 <Para>
3058 The pattern <Literal>(xs::[a])</Literal> includes a type signature for <VarName>xs</VarName>.
3059 This brings the type variable <Literal>a</Literal> into scope; it scopes over
3060 all the patterns and right hand sides for this equation for <Function>f</Function>.
In particular, it is in scope at the type signature for <VarName>ys</VarName>.
3062 </Para>
3063
3064 <Para>
3065 At ordinary type signatures, such as that for <VarName>ys</VarName>, any type variables
3066 mentioned in the type signature <Emphasis>that are not in scope</Emphasis> are
3067 implicitly universally quantified.  (If there are no type variables in
3068 scope, all type variables mentioned in the signature are universally
3069 quantified, which is just as in Haskell 98.)  In this case, since <VarName>a</VarName>
3070 is in scope, it is not universally quantified, so the type of <VarName>ys</VarName> is
3071 the same as that of <VarName>xs</VarName>.  In Haskell 98 it is not possible to declare
3072 a type for <VarName>ys</VarName>; a major benefit of scoped type variables is that
3073 it becomes possible to do so.
3074 </Para>
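<Para>
Here is a second small sketch of the same idea, this time with two
pattern-bound type variables.  The function <Function>tagAll</Function> and its
helper <Function>attach</Function> are made up for illustration, and the
<Literal>LANGUAGE</Literal> pragma assumes a GHC recent enough to accept it:
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

-- The pattern signatures bring the type variables t and a into scope.
tagAll (tag :: t) (xs :: [a]) = map attach xs
  where
    -- t and a are in scope, so this signature is not implicitly
    -- quantified: attach works only at the types of tag and xs.
    attach :: a -> (t, a)
    attach x = (tag, x)

main :: IO ()
main = print (tagAll True "ab")   -- [(True,'a'),(True,'b')]
</ProgramListing>

</Para>

<Para>
Without the pattern signatures, the signature for <Function>attach</Function>
would be read as <Literal>forall a t. a -&gt; (t, a)</Literal>, which its
definition does not satisfy.
</Para>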
3075
3076 <Para>
3077 Scoped type variables are implemented in both GHC and Hugs.  Where the
3078 implementations differ from the specification below, those differences
3079 are noted.
3080 </Para>
3081
3082 <Para>
3083 So much for the basic idea.  Here are the details.
3084 </Para>
3085
3086 <Sect2>
3087 <Title>Scope and implicit quantification</Title>
3088
3089 <Para>
3090
3091 <ItemizedList>
3092 <ListItem>
3093
3094 <Para>
3095  All the type variables mentioned in the patterns for a single
3096 function definition equation, that are not already in scope,
3097 are brought into scope by the patterns.  We describe this set as
3098 the <Emphasis>type variables bound by the equation</Emphasis>.
3099
3100 </Para>
3101 </ListItem>
3102 <ListItem>
3103
3104 <Para>
3105  The type variables thus brought into scope may be mentioned
3106 in ordinary type signatures or pattern type signatures anywhere within
3107 their scope.
3108
3109 </Para>
3110 </ListItem>
3111 <ListItem>
3112
3113 <Para>
3114  In ordinary type signatures, any type variable mentioned in the
3115 signature that is in scope is <Emphasis>not</Emphasis> universally quantified.
3116
3117 </Para>
3118 </ListItem>
3119 <ListItem>
3120
3121 <Para>
3122  Ordinary type signatures do not bring any new type variables
3123 into scope (except in the type signature itself!). So this is illegal:
3124
3125
3126 <ProgramListing>
3127   f :: a -> a
3128   f x = x::a
3129 </ProgramListing>
3130
3131
3132 It's illegal because <VarName>a</VarName> is not in scope in the body of <Function>f</Function>,
3133 so the ordinary signature <Literal>x::a</Literal> is equivalent to <Literal>x::forall a.a</Literal>;
3134 and that is an incorrect typing.
3135
3136 </Para>
3137 </ListItem>
3138 <ListItem>
3139
3140 <Para>
3141  There is no implicit universal quantification on pattern type
3142 signatures, nor may one write an explicit <Literal>forall</Literal> type in a pattern
3143 type signature.  The pattern type signature is a monotype.
3144
3145 </Para>
3146 </ListItem>
3147 <ListItem>
3148
3149 <Para>
3150
3151 The type variables in the head of a <Literal>class</Literal> or <Literal>instance</Literal> declaration
3152 scope over the methods defined in the <Literal>where</Literal> part.  For example:
3153
3154
3155 <ProgramListing>
3156   class C a where
3157     op :: [a] -> a
3158
3159     op xs = let ys::[a]
3160                 ys = reverse xs
3161             in
3162             head ys
3163 </ProgramListing>
3164
3165
3166 (Not implemented in Hugs yet, Dec 98).
3167 </Para>
3168 </ListItem>
3169
3170 </ItemizedList>
3171
3172 </Para>
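<Para>
The rules above can be exercised with a small program.  This is a sketch
under assumptions: the names <Function>idVia</Function> and the instance
<Literal>C Int</Literal> are invented here, and the <Literal>LANGUAGE</Literal>
pragma assumes a GHC recent enough to accept it.  The first function is the
legal counterpart of the rejected <Literal>f x = x::a</Literal>, with
the type variable bound by a pattern type signature rather than by the
separate signature.
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

-- The separate signature binds nothing in the body; the pattern
-- signature is what brings a into scope for the annotation x :: a.
idVia :: b -> b
idVia (x :: a) = x :: a

-- The class variable a scopes over the default method, so the
-- local signature for ys refers to that same a.
class C a where
  op :: [a] -> a

  op xs = let ys :: [a]
              ys = reverse xs
          in
          head ys

instance C Int   -- uses the default definition of op

main :: IO ()
main = do
  print (idVia 'q')             -- 'q'
  print (op [1, 2, 3 :: Int])   -- 3
</ProgramListing>

</Para>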
3173
3174 </Sect2>
3175
3176 <Sect2>
3177 <Title>Polymorphism</Title>
3178
3179 <Para>
3180
3181 <ItemizedList>
3182 <ListItem>
3183
3184 <Para>
3185  Pattern type signatures are completely orthogonal to ordinary, separate
3186 type signatures.  The two can be used independently or together.  There is
3187 no scoping associated with the names of the type variables in a separate type signature.
3188
3189
3190 <ProgramListing>
3191    f :: [a] -> [a]
3192    f (xs::[b]) = reverse xs
3193 </ProgramListing>
3194
3195
3196 </Para>
3197 </ListItem>
3198 <ListItem>
3199
3200 <Para>
3201  The function must be polymorphic in the type variables
3202 bound by all its equations.  Operationally, the type variables bound
3203 by one equation must not:
3204
3205
3206 <ItemizedList>
3207 <ListItem>
3208
3209 <Para>
3210  Be unified with a type (such as <Literal>Int</Literal>, or <Literal>[a]</Literal>).
3211 </Para>
3212 </ListItem>
3213 <ListItem>
3214
3215 <Para>
3216  Be unified with a type variable free in the environment.
3217 </Para>
3218 </ListItem>
3219 <ListItem>
3220
3221 <Para>
3222  Be unified with each other.  (They may unify with the type variables
3223 bound by another equation for the same function, of course.)
3224 </Para>
3225 </ListItem>
3226
3227 </ItemizedList>
3228
3229
3230 For example, the following all fail to type check:
3231
3232
3233 <ProgramListing>
3234   f (x::a) (y::b) = [x,y]       -- a unifies with b
3235
3236   g (x::a) = x + 1::Int         -- a unifies with Int
3237
3238   h x = let k (y::a) = [x,y]    -- a is free in the
3239         in k x                  -- environment
3240
3241   k (x::a) True    = ...        -- a unifies with Int
3242   k (x::Int) False = ...
3243
3244   w :: [b] -> [b]
3245   w (x::a) = x                  -- a unifies with [b]
3246 </ProgramListing>
3247
3248
3249 </Para>
3250 </ListItem>
3251 <ListItem>
3252
3253 <Para>
3254  The pattern-bound type variable may, however, be constrained
3255 by the context of the principal type, thus:
3256
3257
3258 <ProgramListing>
3259   f (x::a) (y::a) = x+y*2
3260 </ProgramListing>
3261
3262
3263 gets the inferred type: <Literal>forall a. Num a =&gt; a -&gt; a -&gt; a</Literal>.
3264 </Para>
3265 </ListItem>
3266
3267 </ItemizedList>
3268
3269 </Para>
3270
3271 </Sect2>
3272
3273 <Sect2>
3274 <Title>Result type signatures</Title>
3275
3276 <Para>
3277
3278 <ItemizedList>
3279 <ListItem>
3280
3281 <Para>
3282  The result type of a function can be given a signature,
3283 thus:
3284
3285
3286 <ProgramListing>
3287   f (x::a) :: [a] = [x,x,x]
3288 </ProgramListing>
3289
3290
3291 The final <Literal>:: [a]</Literal> after all the patterns gives a signature to the
3292 result type.  Sometimes this is the only way of naming the type variable
3293 you want:
3294
3295
3296 <ProgramListing>
3297   f :: Int -> [a] -> [a]
3298   f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
3299                         in \xs -> map g (reverse xs `zip` xs)
3300 </ProgramListing>
3301
3302
3303 </Para>
3304 </ListItem>
3305
3306 </ItemizedList>
3307
3308 </Para>
3309
3310 <Para>
3311 Result type signatures are not yet implemented in Hugs.
3312 </Para>
3313
3314 </Sect2>
3315
3316 <Sect2>
3317 <Title>Pattern signatures on other constructs</Title>
3318
3319 <Para>
3320
3321 <ItemizedList>
3322 <ListItem>
3323
3324 <Para>
3325  A pattern type signature can be on an arbitrary sub-pattern, not
3326 just on a variable:
3327
3328
3329 <ProgramListing>
3330   f ((x,y)::(a,b)) = (y,x) :: (b,a)
3331 </ProgramListing>
3332
3333
3334 </Para>
3335 </ListItem>
3336 <ListItem>
3337
3338 <Para>
3339  Pattern type signatures, including the result part, can be used
3340 in lambda abstractions:
3341
3342
3343 <ProgramListing>
3344   (\ (x::a, y) :: a -> x)
3345 </ProgramListing>
3346
3347
3348 Type variables bound by these patterns must be polymorphic in
3349 the sense defined above.
3350 For example:
3351
3352
3353 <ProgramListing>
3354   f1 (x::c) = f1 x      -- ok
3355   f2 = \(x::c) -> f2 x  -- not ok
3356 </ProgramListing>
3357
3358
3359 Here, <Function>f1</Function> is OK, but <Function>f2</Function> is not, because <VarName>c</VarName> gets unified
3360 with a type variable free in the environment, in this
3361 case, the type of <Function>f2</Function>, which is in the environment when
3362 the lambda abstraction is checked.
3363
3364 </Para>
3365 </ListItem>
3366 <ListItem>
3367
3368 <Para>
3369  Pattern type signatures, including the result part, can be used
3370 in <Literal>case</Literal> expressions:
3371
3372
3373 <ProgramListing>
3374   case e of { (x::a, y) :: a -> x }
3375 </ProgramListing>
3376
3377
3378 The pattern-bound type variables must, as usual,
3379 be polymorphic in the following sense: each case alternative,
3380 considered as a lambda abstraction, must be polymorphic.
3381 Thus this is OK:
3382
3383
3384 <ProgramListing>
3385   case (True,False) of { (x::a, y) -> x }
3386 </ProgramListing>
3387
3388
3389 Even though the context is that of a pair of booleans,
3390 the alternative itself is polymorphic.  Of course, it is
3391 also OK to say:
3392
3393
3394 <ProgramListing>
3395   case (True,False) of { (x::Bool, y) -> x }
3396 </ProgramListing>
3397
3398
3399 </Para>
3400 </ListItem>
3401 <ListItem>
3402
3403 <Para>
3404 To avoid ambiguity, the type after the &ldquo;<Literal>::</Literal>&rdquo; in a result
3405 pattern signature on a lambda or <Literal>case</Literal> must be atomic (i.e. a single
3406 token or a parenthesised type of some sort).  To see why,
3407 consider how one would parse this:
3408
3409
3410 <ProgramListing>
3411   \ x :: a -> b -> x
3412 </ProgramListing>
3413
3414
3415 </Para>
3416 </ListItem>
3417 <ListItem>
3418
3419 <Para>
3420  Pattern type signatures that bind new type variables
3421 may not be used in pattern bindings at all.
3422 So this is illegal:
3423
3424
3425 <ProgramListing>
3426   f x = let (y, z::a) = x in ...
3427 </ProgramListing>
3428
3429
3430 But these are OK, because they do not bind fresh type variables:
3431
3432
3433 <ProgramListing>
3434   f1 x            = let (y, z::Int) = x in ...
3435   f2 (x::(Int,a)) = let (y, z::a)   = x in ...
3436 </ProgramListing>
3437
3438
3439 However a single variable is considered a degenerate function binding,
rather than a degenerate pattern binding, so this is permitted, even
3441 though it binds a type variable:
3442
3443
3444 <ProgramListing>
3445   f :: (b->b) = \(x::b) -> x
3446 </ProgramListing>
3447
3448
3449 </Para>
3450 </ListItem>
3451
3452 </ItemizedList>
3453
Such degenerate function bindings do not fall under the monomorphism
3455 restriction.  Thus:
3456 </Para>
3457
3458 <Para>
3459
3460 <ProgramListing>
  g :: a -> a -> Bool = \x y -> x==y
3462 </ProgramListing>
3463
3464 </Para>
3465
3466 <Para>
3467 Here <Function>g</Function> has type <Literal>forall a. Eq a =&gt; a -&gt; a -&gt; Bool</Literal>, just as if
3468 <Function>g</Function> had a separate type signature.  Lacking a type signature, <Function>g</Function>
3469 would get a monomorphic type.
3470 </Para>
3471
3472 </Sect2>
3473
3474 <Sect2>
3475 <Title>Existentials</Title>
3476
3477 <Para>
3478
3479 <ItemizedList>
3480 <ListItem>
3481
3482 <Para>
3483  Pattern type signatures can bind existential type variables.
3484 For example:
3485
3486
3487 <ProgramListing>
3488   data T = forall a. MkT [a]
3489
3490   f :: T -> T
3491   f (MkT [t::a]) = MkT t3
3492                  where
3493                    t3::[a] = [t,t,t]
3494 </ProgramListing>
3495
3496
3497 </Para>
3498 </ListItem>
3499
3500 </ItemizedList>
3501
3502 </Para>
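<Para>
A runnable variant of this example might look as follows.  The catch-all
equation for <Function>f</Function>, the helper <Function>size</Function> and the
<Literal>LANGUAGE</Literal> pragmas (which assume a reasonably recent GHC) are
additions made for the sake of a complete program:
</Para>

<Para>

<ProgramListing>
{-# LANGUAGE ExistentialQuantification, ScopedTypeVariables #-}
module Main where

data T = forall a. MkT [a]

f :: T -> T
f (MkT [t :: a]) = MkT t3
  where
    -- a names the existential type hidden inside this MkT.
    t3 :: [a]
    t3 = [t, t, t]
f other = other          -- any list that is not a singleton is left alone

-- The elements cannot be inspected, but the list can still be measured.
size :: T -> Int
size (MkT xs) = length xs

main :: IO ()
main = print (size (f (MkT "x")))   -- 3
</ProgramListing>

</Para>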
3503
3504 </Sect2>
3505
3506 </Sect1>
3507
3508 <Sect1 id="pragmas">
3509 <Title>Pragmas
3510 </Title>
3511
3512 <Para>
3513 GHC supports several pragmas, or instructions to the compiler placed
3514 in the source code.  Pragmas don't affect the meaning of the program,
3515 but they might affect the efficiency of the generated code.
3516 </Para>
3517
3518 <Sect2 id="inline-pragma">
3519 <Title>INLINE pragma
3520
3521 <IndexTerm><Primary>INLINE pragma</Primary></IndexTerm>
3522 <IndexTerm><Primary>pragma, INLINE</Primary></IndexTerm></Title>
3523
3524 <Para>
3525 GHC (with <Option>-O</Option>, as always) tries to inline (or &ldquo;unfold&rdquo;)
3526 functions/values that are &ldquo;small enough,&rdquo; thus avoiding the call
3527 overhead and possibly exposing other more-wonderful optimisations.
3528 </Para>
3529
3530 <Para>
3531 You will probably see these unfoldings (in Core syntax) in your
3532 interface files.
3533 </Para>
3534
3535 <Para>
3536 Normally, if GHC decides a function is &ldquo;too expensive&rdquo; to inline, it
3537 will not do so, nor will it export that unfolding for other modules to
3538 use.
3539 </Para>
3540
3541 <Para>
3542 The sledgehammer you can bring to bear is the
3543 <Literal>INLINE</Literal><IndexTerm><Primary>INLINE pragma</Primary></IndexTerm> pragma, used thusly:
3544
3545 <ProgramListing>
3546 key_function :: Int -> String -> (Bool, Double)
3547
3548 #ifdef __GLASGOW_HASKELL__
3549 {-# INLINE key_function #-}
3550 #endif
3551 </ProgramListing>
3552
3553 (You don't need to do the C pre-processor carry-on unless you're going
3554 to stick the code through HBC&mdash;it doesn't like <Literal>INLINE</Literal> pragmas.)
3555 </Para>
3556
3557 <Para>
3558 The major effect of an <Literal>INLINE</Literal> pragma is to declare a function's
3559 &ldquo;cost&rdquo; to be very low.  The normal unfolding machinery will then be
3560 very keen to inline it.
3561 </Para>
3562
3563 <Para>
3564 An <Literal>INLINE</Literal> pragma for a function can be put anywhere its type
3565 signature could be put.
3566 </Para>
3567
3568 <Para>
3569 <Literal>INLINE</Literal> pragmas are a particularly good idea for the
3570 <Literal>then</Literal>/<Literal>return</Literal> (or <Literal>bind</Literal>/<Literal>unit</Literal>) functions in a monad.
3571 For example, in GHC's own <Literal>UniqueSupply</Literal> monad code, we have:
3572
3573 <ProgramListing>
3574 #ifdef __GLASGOW_HASKELL__
3575 {-# INLINE thenUs #-}
3576 {-# INLINE returnUs #-}
3577 #endif
3578 </ProgramListing>
3579
3580 </Para>
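<Para>
To make the monad case concrete, here is a small sketch.  The type
<Literal>UniqSM</Literal> and the helpers <Function>unUs</Function>,
<Function>getUnique</Function> and <Function>runUs</Function> are hypothetical stand-ins;
this is not GHC's actual <Literal>UniqueSupply</Literal> code, and the
C pre-processor guards are left out for brevity.
</Para>

<Para>

<ProgramListing>
module Main where

-- A state-passing monad that hands out fresh integers.
newtype UniqSM a = UniqSM (Int -> (a, Int))

unUs :: UniqSM a -> Int -> (a, Int)
unUs (UniqSM f) = f

-- The glue functions are tiny and called everywhere, which makes
-- them natural candidates for INLINE.
{-# INLINE thenUs #-}
thenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b
thenUs m k = UniqSM (\s -> let (a, s') = unUs m s in unUs (k a) s')

{-# INLINE returnUs #-}
returnUs :: a -> UniqSM a
returnUs a = UniqSM (\s -> (a, s))

getUnique :: UniqSM Int
getUnique = UniqSM (\s -> (s, s + 1))

runUs :: UniqSM a -> Int -> a
runUs m s = fst (unUs m s)

pairOfUniques :: UniqSM (Int, Int)
pairOfUniques =
  getUnique `thenUs` \a ->
  getUnique `thenUs` \b ->
  returnUs (a, b)

main :: IO ()
main = print (runUs pairOfUniques 0)   -- (0,1)
</ProgramListing>

</Para>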
3581
3582 </Sect2>
3583
3584 <Sect2 id="noinline-pragma">
3585 <Title>NOINLINE pragma
3586 </Title>
3587
3588 <Para>
3589 <IndexTerm><Primary>NOINLINE pragma</Primary></IndexTerm>
3590 <IndexTerm><Primary>pragma, NOINLINE</Primary></IndexTerm>
3591 </Para>
3592
3593 <Para>
3594 The <Literal>NOINLINE</Literal> pragma does exactly what you'd expect: it stops the
3595 named function from being inlined by the compiler.  You shouldn't ever
3596 need to do this, unless you're very cautious about code size.
3597 </Para>
3598
3599 </Sect2>
3600
3601 <Sect2 id="specialize-pragma">
3602 <Title>SPECIALIZE pragma
3603 </Title>
3604
3605 <Para>
3606 <IndexTerm><Primary>SPECIALIZE pragma</Primary></IndexTerm>
3607 <IndexTerm><Primary>pragma, SPECIALIZE</Primary></IndexTerm>
3608 <IndexTerm><Primary>overloading, death to</Primary></IndexTerm>
3609 </Para>
3610
3611 <Para>
3612 (UK spelling also accepted.)  For key overloaded functions, you can
3613 create extra versions (NB: more code space) specialised to particular
types.  Suppose you have an overloaded function:
3615 </Para>
3616
3617 <Para>
3618
3619 <ProgramListing>
3620 hammeredLookup :: Ord key => [(key, value)] -> key -> value
3621 </ProgramListing>
3622
3623 </Para>
3624
3625 <Para>
3626 If it is heavily used on lists with <Literal>Widget</Literal> keys, you could
3627 specialise it as follows:
3628
3629 <ProgramListing>
3630 {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
3631 </ProgramListing>
3632
3633 </Para>
3634
3635 <Para>
3636 To get very fancy, you can also specify a named function to use for
3637 the specialised value, by adding <Literal>= blah</Literal>, as in:
3638
3639 <ProgramListing>
3640 {-# SPECIALIZE hammeredLookup :: ...as before... = blah #-}
3641 </ProgramListing>
3642
3643 It's <Emphasis>Your Responsibility</Emphasis> to make sure that <Function>blah</Function> really
3644 behaves as a specialised version of <Function>hammeredLookup</Function>!!!
3645 </Para>
3646
3647 <Para>
3648 NOTE: the <Literal>=blah</Literal> feature isn't implemented in GHC 4.xx.
3649 </Para>
3650
3651 <Para>
3652 An example in which the <Literal>= blah</Literal> form will Win Big:
3653
3654 <ProgramListing>
3655 toDouble :: Real a => a -> Double
3656 toDouble = fromRational . toRational
3657
3658 {-# SPECIALIZE toDouble :: Int -> Double = i2d #-}
3659 i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
3660 </ProgramListing>
3661
3662 The <Function>i2d</Function> function is virtually one machine instruction; the
3663 default conversion&mdash;via an intermediate <Literal>Rational</Literal>&mdash;is obscenely
3664 expensive by comparison.
3665 </Para>
3666
3667 <Para>
3668 By using the US spelling, your <Literal>SPECIALIZE</Literal> pragma will work with
3669 HBC, too.  Note that HBC doesn't support the <Literal>= blah</Literal> form.
3670 </Para>
3671
3672 <Para>
3673 A <Literal>SPECIALIZE</Literal> pragma for a function can be put anywhere its type
3674 signature could be put.
3675 </Para>
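<Para>
Putting the pieces together, a complete module might look like the sketch
below.  The definition of <Literal>Widget</Literal> and the body of
<Function>hammeredLookup</Function> are assumptions made for illustration; only
the type signature and the pragma come from the text above.
</Para>

<Para>

<ProgramListing>
module Main where

-- A hypothetical key type.
newtype Widget = Widget Int deriving (Eq, Ord)

hammeredLookup :: Ord key => [(key, value)] -> key -> value
hammeredLookup ((k, v) : rest) key
  | k == key  = v
  | otherwise = hammeredLookup rest key
hammeredLookup [] _ = error "hammeredLookup: key not found"

-- Ask for an extra copy of the function with key fixed to Widget.
{-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}

main :: IO ()
main = putStrLn (hammeredLookup table (Widget 2))   -- prints two
  where
    table = [(Widget 1, "one"), (Widget 2, "two")]
</ProgramListing>

</Para>

<Para>
The pragma changes only the code that is generated; the program prints the
same result with or without it.
</Para>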
3676
3677 </Sect2>
3678
3679 <Sect2 id="specialize-instance-pragma">
3680 <Title>SPECIALIZE instance pragma
3681 </Title>
3682
3683 <Para>
3684 <IndexTerm><Primary>SPECIALIZE pragma</Primary></IndexTerm>
3685 <IndexTerm><Primary>overloading, death to</Primary></IndexTerm>
3686 Same idea, except for instance declarations.  For example:
3687
3688 <ProgramListing>
3689 instance (Eq a) => Eq (Foo a) where { ... usual stuff ... }
3690
{-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
3692 </ProgramListing>
3693
3694 Compatible with HBC, by the way.
3695 </Para>
3696
3697 </Sect2>
3698
3699 <Sect2 id="line-pragma">
3700 <Title>LINE pragma
3701 </Title>
3702
3703 <Para>
3704 <IndexTerm><Primary>LINE pragma</Primary></IndexTerm>
3705 <IndexTerm><Primary>pragma, LINE</Primary></IndexTerm>
3706 </Para>
3707
3708 <Para>
3709 This pragma is similar to C's <Literal>&num;line</Literal> pragma, and is mainly for use in
3710 automatically generated Haskell code.  It lets you specify the line
3711 number and filename of the original code; for example
3712 </Para>
3713
3714 <Para>
3715
3716 <ProgramListing>
3717 {-# LINE 42 "Foo.vhs" #-}
3718 </ProgramListing>
3719
3720 </Para>
3721
3722 <Para>
3723 if you'd generated the current file from something called <Filename>Foo.vhs</Filename>
3724 and this line corresponds to line 42 in the original.  GHC will adjust
3725 its error messages to refer to the line/file named in the <Literal>LINE</Literal>
3726 pragma.
3727 </Para>
3728
3729 </Sect2>
3730
3731 <Sect2>
3732 <Title>RULES pragma</Title>
3733
3734 <Para>
3735 The RULES pragma lets you specify rewrite rules.  It is described in
3736 <XRef LinkEnd="rewrite-rules">.
3737 </Para>
3738
3739 </Sect2>
3740
3741 </Sect1>
3742
3743 <Sect1 id="rewrite-rules">
3744 <Title>Rewrite rules
3745
<IndexTerm><Primary>RULES pragma</Primary></IndexTerm>
3747 <IndexTerm><Primary>pragma, RULES</Primary></IndexTerm>
3748 <IndexTerm><Primary>rewrite rules</Primary></IndexTerm></Title>
3749
3750 <Para>
3751 The programmer can specify rewrite rules as part of the source program
3752 (in a pragma).  GHC applies these rewrite rules wherever it can.
3753 </Para>
3754
3755 <Para>
3756 Here is an example:
3757
3758 <ProgramListing>
3759   {-# RULES
3760         "map/map"       forall f g xs. map f (map g xs) = map (f.g) xs
3761   #-}
3762 </ProgramListing>
3763
3764 </Para>
3765
3766 <Sect2>
3767 <Title>Syntax</Title>
3768
3769 <Para>
3770 From a syntactic point of view:
3771
3772 <ItemizedList>
3773 <ListItem>
3774
3775 <Para>
3776  Each rule has a name, enclosed in double quotes.  The name itself has
3777 no significance at all.  It is only used when reporting how many times the rule fired.
3778 </Para>
3779 </ListItem>
3780 <ListItem>
3781
3782 <Para>
3783  There may be zero or more rules in a <Literal>RULES</Literal> pragma.
3784 </Para>
3785 </ListItem>
3786 <ListItem>
3787
3788 <Para>
3789  Layout applies in a <Literal>RULES</Literal> pragma.  Currently no new indentation level
3790 is set, so you must lay out your rules starting in the same column as the
3791 enclosing definitions.
3792 </Para>
3793 </ListItem>
3794 <ListItem>
3795
3796 <Para>
3797  Each variable mentioned in a rule must either be in scope (e.g. <Function>map</Function>),
3798 or bound by the <Literal>forall</Literal> (e.g. <Function>f</Function>, <Function>g</Function>, <Function>xs</Function>).  The variables bound by
3799 the <Literal>forall</Literal> are called the <Emphasis>pattern</Emphasis> variables.  They are separated
3800 by spaces, just like in a type <Literal>forall</Literal>.
3801 </Para>
3802 </ListItem>
3803 <ListItem>
3804
3805 <Para>
3806  A pattern variable may optionally have a type signature.
3807 If the type of the pattern variable is polymorphic, it <Emphasis>must</Emphasis> have a type signature.
3808 For example, here is the <Literal>foldr/build</Literal> rule:
3809
3810 <ProgramListing>
3811 "fold/build"  forall k z (g::forall b. (a->b->b) -> b -> b) .
3812               foldr k z (build g) = g k z
3813 </ProgramListing>
3814
3815 Since <Function>g</Function> has a polymorphic type, it must have a type signature.
3816
3817 </Para>
3818 </ListItem>
3819 <ListItem>
3820
3821 <Para>
3822 The left hand side of a rule must consist of a top-level variable applied
to arbitrary expressions.  For example, these are <Emphasis>not</Emphasis> OK:
3824
3825 <ProgramListing>
3826 "wrong1"   forall e1 e2.  case True of { True -> e1; False -> e2 } = e1
3827 "wrong2"   forall f.      f True = True
3828 </ProgramListing>
3829
3830 In <Literal>"wrong1"</Literal>, the LHS is not an application; in <Literal>"wrong2"</Literal>, the LHS has a pattern variable
3831 in the head.
3832 </Para>
3833 </ListItem>
3834 <ListItem>
3835
3836 <Para>
3837  A rule does not need to be in the same module as (any of) the
3838 variables it mentions, though of course they need to be in scope.
3839 </Para>
3840 </ListItem>
3841 <ListItem>
3842
3843 <Para>
3844  Rules are automatically exported from a module, just as instance declarations are.
3845 </Para>
3846 </ListItem>
3847
3848 </ItemizedList>
3849
3850 </Para>
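<Para>
The syntactic points above can be seen in a small self-contained module.
This is only a sketch: the functions <Function>double</Function> and
<Function>quadruple</Function> and the rule name are invented for the example,
and the <Literal>NOINLINE</Literal> pragmas are a precaution to keep
<Function>double</Function> visible to the rule matcher.  The rule sits on one
line, which sidesteps the layout question.
</Para>

<Para>

<ProgramListing>
module Main where

-- The LHS is a top-level variable (double) applied to an expression;
-- x is the only pattern variable.
{-# RULES "double/double" forall x. double (double x) = quadruple x #-}

{-# NOINLINE double #-}
double :: Int -> Int
double x = x + x

{-# NOINLINE quadruple #-}
quadruple :: Int -> Int
quadruple x = 4 * x

main :: IO ()
main = print (double (double 5))   -- 20, whether or not the rule fires
</ProgramListing>

</Para>

<Para>
Both sides of the rule have type <Literal>Int</Literal> and denote the same
value, so the program's result does not depend on whether the rewrite
happens.
</Para>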
3851
3852 </Sect2>
3853
3854 <Sect2>
3855 <Title>Semantics</Title>
3856
3857 <Para>
3858 From a semantic point of view:
3859
3860 <ItemizedList>
3861 <ListItem>
3862
3863 <Para>
3864 Rules are only applied if you use the <Option>-O</Option> flag.
3865 </Para>
3866 </ListItem>
3867
3868 <ListItem>
3869 <Para>
3870  Rules are regarded as left-to-right rewrite rules.
3871 When GHC finds an expression that is a substitution instance of the LHS
3872 of a rule, it replaces the expression by the (appropriately-substituted) RHS.
3873 By "a substitution instance" we mean that the LHS can be made equal to the
3874 expression by substituting for the pattern variables.
3875
3876 </Para>
3877 </ListItem>
3878 <ListItem>
3879
3880 <Para>
3881  The LHS and RHS of a rule are typechecked, and must have the
3882 same type.
3883
3884 </Para>
3885 </ListItem>
3886 <ListItem>
3887
3888 <Para>
3889  GHC makes absolutely no attempt to verify that the LHS and RHS
of a rule have the same meaning.  That is undecidable in general, and
3891 infeasible in most interesting cases.  The responsibility is entirely the programmer's!
3892
3893 </Para>
3894 </ListItem>
3895 <ListItem>
3896
3897 <Para>
3898  GHC makes no attempt to make sure that the rules are confluent or
3899 terminating.  For example:
3900
3901 <ProgramListing>
  "loop"        forall x y.  f x y = f y x
3903 </ProgramListing>
3904
3905 This rule will cause the compiler to go into an infinite loop.
3906
3907 </Para>
3908 </ListItem>
3909 <ListItem>
3910
3911 <Para>
3912  If more than one rule matches a call, GHC will choose one arbitrarily to apply.
3913
3914 </Para>
3915 </ListItem>
3916 <ListItem>
3917 <Para>
3918  GHC currently uses a very simple, syntactic, matching algorithm
3919 for matching a rule LHS with an expression.  It seeks a substitution
3920 which makes the LHS and expression syntactically equal modulo alpha
conversion.  The pattern (rule), but not the expression, is eta-expanded if
necessary.  (Eta-expanding the expression can lead to laziness bugs.)
Matching is not done modulo beta conversion (that would be higher-order matching).
3924 </Para>
3925
3926 <Para>
3927 Matching is carried out on GHC's intermediate language, which includes
3928 type abstractions and applications.  So a rule only matches if the
3929 types match too.  See <XRef LinkEnd="rule-spec"> below.
3930 </Para>
3931 </ListItem>
3932 <ListItem>
3933
3934 <Para>
3935  GHC keeps trying to apply the rules as it optimises the program.
3936 For example, consider:
3937
3938 <ProgramListing>
3939   let s = map f
3940       t = map g
3941   in
3942   s (t xs)
3943 </ProgramListing>
3944
3945 The expression <Literal>s (t xs)</Literal> does not match the rule <Literal>"map/map"</Literal>, but GHC
3946 will substitute for <VarName>s</VarName> and <VarName>t</VarName>, giving an expression which does match.
3947 If <VarName>s</VarName> or <VarName>t</VarName> was (a) used more than once, and (b) large or a redex, then it would
3948 not be substituted, and the rule would not fire.
3949
3950 </Para>
3951 </ListItem>
3952 <ListItem>
3953
3954 <Para>
3955  In the earlier phases of compilation, GHC inlines <Emphasis>nothing
3956 that appears on the LHS of a rule</Emphasis>, because once you have substituted
for something you can't match against it (given the simple-minded
3958 matching).  So if you write the rule
3959
3960 <ProgramListing>
        "map/map"       forall f g.  map f . map g = map (f.g)
3962 </ProgramListing>
3963
3964 this <Emphasis>won't</Emphasis> match the expression <Literal>map f (map g xs)</Literal>.
3965 It will only match something written with explicit use of ".".
3966 Well, not quite.  It <Emphasis>will</Emphasis> match the expression
3967
3968 <ProgramListing>
3969 wibble f g xs
3970 </ProgramListing>
3971
3972 where <Function>wibble</Function> is defined:
3973
3974 <ProgramListing>
3975 wibble f g = map f . map g
3976 </ProgramListing>
3977
3978 because <Function>wibble</Function> will be inlined (it's small).
3979
3980 Later on in compilation, GHC starts inlining even things on the
3981 LHS of rules, but still leaves the rules enabled.  This inlining
3982 policy is controlled by the per-simplification-pass flag <Option>-finline-phase</Option><Emphasis>n</Emphasis>.
3983
3984 </Para>
3985 </ListItem>
3986 <ListItem>
3987
3988 <Para>
3989  All rules are implicitly exported from the module, and are therefore
3990 in force in any module that imports the module that defined the rule, directly
3991 or indirectly.  (That is, if A imports B, which imports C, then C's rules are
3992 in force when compiling A.)  The situation is very similar to that for instance
3993 declarations.
3994 </Para>
3995 </ListItem>
3996
3997 </ItemizedList>
3998
3999 </Para>
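
<Para>
As a rough illustration of the points above, here is a minimal sketch of a
complete module containing one rewrite rule.  The functions
<Function>double</Function> and <Function>halve</Function> are hypothetical
names invented for this example, and the <Literal>NOINLINE</Literal> pragmas
are a precaution (an assumption on our part, not something the rule mechanism
demands) so that calls to them stay visible to the matcher.  The rule is only
expected to fire when the module is compiled with <Option>-O</Option>; because
both sides really are equal for every <Literal>Integer</Literal>, the program
prints the same result either way.

<ProgramListing>
module Main where

-- Hypothetical functions; NOINLINE keeps calls to them visible to the matcher.
double :: Integer -> Integer
double x = x + x
{-# NOINLINE double #-}

halve :: Integer -> Integer
halve x = x `div` 2
{-# NOINLINE halve #-}

-- Left-to-right rewrite: any expression of the form  halve (double e)
-- may be replaced by  e.  GHC does not check that the two sides agree;
-- that promise is ours.
{-# RULES
  "halve/double"  forall x.  halve (double x) = x
  #-}

main :: IO ()
main = print (halve (double 21))
</ProgramListing>

Had the two functions been written for <Literal>Int</Literal> instead, the rule
would no longer be meaning-preserving, because <Literal>x + x</Literal> can
overflow.  GHC would accept it all the same, which is exactly the kind of
mistake the programmer has to guard against.
</Para>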
4000
4001 </Sect2>
4002
4003 <Sect2>
4004 <Title>List fusion</Title>
4005
4006 <Para>
4007 The RULES mechanism is used to implement fusion (deforestation) of common list functions.
4008 If a "good consumer" consumes an intermediate list constructed by a "good producer", the
4009 intermediate list should be eliminated entirely.
4010 </Para>
4011
4012 <Para>
4013 The following are good producers:
4014
4015 <ItemizedList>
4016 <ListItem>
4017
4018 <Para>
4019  List comprehensions
4020 </Para>
4021 </ListItem>
4022 <ListItem>
4023
4024 <Para>
4025  Enumerations of <Literal>Int</Literal> and <Literal>Char</Literal> (e.g. <Literal>['a'..'z']</Literal>).
4026 </Para>
4027 </ListItem>
4028 <ListItem>
4029
4030 <Para>
4031  Explicit lists (e.g. <Literal>[True, False]</Literal>)
4032 </Para>
4033 </ListItem>
4034 <ListItem>
4035
4036 <Para>
 The cons constructor (e.g. <Literal>3:4:[]</Literal>)
4038 </Para>
4039 </ListItem>
4040 <ListItem>
4041
4042 <Para>
4043  <Function>++</Function>
4044 </Para>
4045 </ListItem>
4046 <ListItem>
4047
4048 <Para>
4049  <Function>map</Function>
4050 </Para>
4051 </ListItem>
4052 <ListItem>
4053
4054 <Para>
4055  <Function>filter</Function>
4056 </Para>
4057 </ListItem>
4058 <ListItem>
4059
4060 <Para>
4061  <Function>iterate</Function>, <Function>repeat</Function>
4062 </Para>
4063 </ListItem>
4064 <ListItem>
4065
4066 <Para>
4067  <Function>zip</Function>, <Function>zipWith</Function>
4068 </Para>
4069 </ListItem>
4070
4071 </ItemizedList>
4072
4073 </Para>
4074
4075 <Para>
4076 The following are good consumers:
4077
4078 <ItemizedList>
4079 <ListItem>
4080
4081 <Para>
4082  List comprehensions
4083 </Para>
4084 </ListItem>
4085 <ListItem>
4086
4087 <Para>
4088  <Function>array</Function> (on its second argument)
4089 </Para>
4090 </ListItem>
4091 <ListItem>
4092
4093 <Para>
4094  <Function>length</Function>
4095 </Para>
4096 </ListItem>
4097 <ListItem>
4098
4099 <Para>
4100  <Function>++</Function> (on its first argument)
4101 </Para>
4102 </ListItem>
4103 <ListItem>
4104
4105 <Para>
4106  <Function>map</Function>
4107 </Para>
4108 </ListItem>
4109 <ListItem>
4110
4111 <Para>
4112  <Function>filter</Function>
4113 </Para>
4114 </ListItem>
4115 <ListItem>
4116
4117 <Para>
4118  <Function>concat</Function>
4119 </Para>
4120 </ListItem>
4121 <ListItem>
4122
4123 <Para>
4124  <Function>unzip</Function>, <Function>unzip2</Function>, <Function>unzip3</Function>, <Function>unzip4</Function>
4125 </Para>
4126 </ListItem>
4127 <ListItem>
4128
4129 <Para>
4130  <Function>zip</Function>, <Function>zipWith</Function> (but on one argument only; if both are good producers, <Function>zip</Function>
4131 will fuse with one but not the other)
4132 </Para>
4133 </ListItem>
4134 <ListItem>
4135
4136 <Para>
4137  <Function>partition</Function>
4138 </Para>
4139 </ListItem>
4140 <ListItem>
4141
4142 <Para>
4143  <Function>head</Function>
4144 </Para>
4145 </ListItem>
4146 <ListItem>
4147
4148 <Para>
4149  <Function>and</Function>, <Function>or</Function>, <Function>any</Function>, <Function>all</Function>
4150 </Para>
4151 </ListItem>
4152 <ListItem>
4153
4154 <Para>
4155  <Function>sequence&lowbar;</Function>
4156 </Para>
4157 </ListItem>
4158 <ListItem>
4159
4160 <Para>
4161  <Function>msum</Function>
4162 </Para>
4163 </ListItem>
4164 <ListItem>
4165
4166 <Para>
4167  <Function>sortBy</Function>
4168 </Para>
4169 </ListItem>
4170
4171 </ItemizedList>
4172
4173 </Para>
4174
4175 <Para>
4176 So, for example, the following should generate no intermediate lists:
4177
4178 <ProgramListing>
4179 array (1,10) [(i,i*i) | i &#60;- map (+ 1) [0..9]]
4180 </ProgramListing>
4181
4182 </Para>
4183
4184 <Para>
4185 This list could readily be extended; if there are Prelude functions that you use
4186 a lot which are not included, please tell us.
4187 </Para>
4188
4189 <Para>
4190 If you want to write your own good consumers or producers, look at the
4191 Prelude definitions of the above functions to see how to do so.
4192 </Para>
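
<Para>
One possible shape for a hand-written producer and consumer is sketched below.
It is only a sketch under stated assumptions: it assumes a GHC whose
<Literal>GHC.Exts</Literal> module exports <Function>build</Function> (in the
library layout described here the definition lives in
<Filename>PrelBase.lhs</Filename>), and the names <Function>upTo</Function> and
<Function>total</Function> are hypothetical.  The producer is expressed with
<Function>build</Function> and the consumer with <Function>foldr</Function>,
so that after inlining the two can meet and the intermediate list can be
removed.  Whether that actually happens depends on the optimiser; the result is
the same either way.

<ProgramListing>
module Main where

import GHC.Exts (build)   -- assumption: build is exported from here

-- Hypothetical producer: the list lo, lo+1, ..., hi.
-- It never mentions (:) or [] directly; it uses whatever "cons" and "nil"
-- the consumer supplies through build.
upTo :: Int -> Int -> [Int]
upTo lo hi = build (\cons nil ->
  let go i
        | i > hi    = nil
        | otherwise = i `cons` go (i + 1)
  in  go lo)
{-# INLINE upTo #-}

-- Hypothetical consumer, written with foldr so that an expression of the
-- form  foldr k z (build g)  can be rewritten to  g k z.
total :: [Int] -> Int
total xs = foldr (+) 0 xs

main :: IO ()
main = print (total (upTo 1 10))
</ProgramListing>
</Para>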
4193
4194 </Sect2>
4195
4196 <Sect2 id="rule-spec">
4197 <Title>Specialisation
4198 </Title>
4199
4200 <Para>
4201 Rewrite rules can be used to get the same effect as a feature
present in earlier versions of GHC:
4203
4204 <ProgramListing>
4205   {-# SPECIALIZE fromIntegral :: Int8 -> Int16 = int8ToInt16 #-}
4206 </ProgramListing>
4207
4208 This told GHC to use <Function>int8ToInt16</Function> instead of <Function>fromIntegral</Function> whenever
4209 the latter was called with type <Literal>Int8 -&gt; Int16</Literal>.  That is, rather than
4210 specialising the original definition of <Function>fromIntegral</Function> the programmer is
4211 promising that it is safe to use <Function>int8ToInt16</Function> instead.
4212 </Para>
4213
4214 <Para>
4215 This feature is no longer in GHC.  But rewrite rules let you do the
4216 same thing:
4217
4218 <ProgramListing>
4219 {-# RULES
4220   "fromIntegral/Int8/Int16" fromIntegral = int8ToInt16
4221 #-}
4222 </ProgramListing>
4223
4224 This slightly odd-looking rule instructs GHC to replace <Function>fromIntegral</Function>
4225 by <Function>int8ToInt16</Function> <Emphasis>whenever the types match</Emphasis>.  Speaking more operationally,
4226 GHC adds the type and dictionary applications to get the typed rule
4227
4228 <ProgramListing>
4229 forall (d1::Integral Int8) (d2::Num Int16) .
4230         fromIntegral Int8 Int16 d1 d2 = int8ToInt16
4231 </ProgramListing>
4232
4233 What is more,
this rule does not need to be in the same file as <Function>fromIntegral</Function>,
4235 unlike the <Literal>SPECIALISE</Literal> pragmas which currently do (so that they
4236 have an original definition available to specialise).
4237 </Para>
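
<Para>
The same technique can be tried on your own overloaded functions.  The module
below is a minimal sketch, not a recommended idiom: <Function>sumAll</Function>
and <Function>sumInts</Function> are hypothetical names, and the
<Literal>NOINLINE</Literal> pragma is an assumption made so that calls to
<Function>sumAll</Function> survive long enough to be matched.  As with
<Function>fromIntegral</Function> above, the rule has no explicit type
annotation; the type of the right-hand side fixes the type at which the
left-hand side is matched.

<ProgramListing>
module Main where

-- Hypothetical overloaded function.
sumAll :: Num a => [a] -> a
sumAll = foldr (+) 0
{-# NOINLINE sumAll #-}

-- Hypothetical hand-written version for Int, using an accumulator.
sumInts :: [Int] -> Int
sumInts = go 0
  where
    go acc []     = acc
    go acc (x:xs) = go (acc + x) xs

-- Use sumInts in place of sumAll whenever sumAll is called at type
-- [Int] -> Int.  We are promising that the two agree at that type.
{-# RULES
  "sumAll/Int"  sumAll = sumInts
  #-}

main :: IO ()
main = print (sumAll [1, 2, 3 :: Int])
</ProgramListing>
</Para>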
4238
4239 </Sect2>
4240
4241 <Sect2>
4242 <Title>Controlling what's going on</Title>
4243
4244 <Para>
4245
4246 <ItemizedList>
4247 <ListItem>
4248
4249 <Para>
4250  Use <Option>-ddump-rules</Option> to see what transformation rules GHC is using.
4251 </Para>
4252 </ListItem>
4253 <ListItem>
4254
4255 <Para>
4256  Use <Option>-ddump-simpl-stats</Option> to see what rules are being fired.
4257 If you add <Option>-dppr-debug</Option> you get a more detailed listing.
4258 </Para>
4259 </ListItem>
4260 <ListItem>
4261
4262 <Para>
 The definition of (say) <Function>build</Function> in <FileName>PrelBase.lhs</FileName> looks like this:
4264
4265 <ProgramListing>
4266         build   :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
4267         {-# INLINE build #-}
4268         build g = g (:) []
4269 </ProgramListing>
4270
4271 Notice the <Literal>INLINE</Literal>!  That prevents <Literal>(:)</Literal> from being inlined when compiling
4272 <Literal>PrelBase</Literal>, so that an importing module will &ldquo;see&rdquo; the <Literal>(:)</Literal>, and can
4273 match it on the LHS of a rule.  <Literal>INLINE</Literal> prevents any inlining happening
4274 in the RHS of the <Literal>INLINE</Literal> thing.  I regret the delicacy of this.
4275
4276 </Para>
4277 </ListItem>
4278 <ListItem>
4279
4280 <Para>
4281  In <Filename>ghc/lib/std/PrelBase.lhs</Filename> look at the rules for <Function>map</Function> to
4282 see how to write rules that will do fusion and yet give an efficient
4283 program even if fusion doesn't happen.  More rules in <Filename>PrelList.lhs</Filename>.
4284 </Para>
4285 </ListItem>
4286
4287 </ItemizedList>
4288
4289 </Para>
4290
4291 </Sect2>
4292
4293 </Sect1>
4294
4295 <Sect1 id="generic-classes">
4296 <Title>Generic classes</Title>
4297
4298 <Para>
4299 The ideas behind this extension are described in detail in "Derivable type classes",
4300 Ralf Hinze and Simon Peyton Jones, Haskell Workshop, Montreal Sept 2000, pp94-105.
4301 An example will give the idea:
4302 </Para>
4303
4304 <ProgramListing>
4305   import Generics
4306
4307   class Bin a where
4308     toBin   :: a -> [Int]
4309     fromBin :: [Int] -> (a, [Int])
4310
4311     toBin {| Unit |}    Unit      = []
4312     toBin {| a :+: b |} (Inl x)   = 0 : toBin x
4313     toBin {| a :+: b |} (Inr y)   = 1 : toBin y
4314     toBin {| a :*: b |} (x :*: y) = toBin x ++ toBin y
4315
4316     fromBin {| Unit |}    bs      = (Unit, bs)
4317     fromBin {| a :+: b |} (0:bs)  = (Inl x, bs')    where (x,bs') = fromBin bs
4318     fromBin {| a :+: b |} (1:bs)  = (Inr y, bs')    where (y,bs') = fromBin bs
4319     fromBin {| a :*: b |} bs      = (x :*: y, bs'') where (x,bs' ) = fromBin bs
4320                                                           (y,bs'') = fromBin bs'
4321 </ProgramListing>
4322 <Para>
4323 This class declaration explains how <Literal>toBin</Literal> and <Literal>fromBin</Literal>
4324 work for arbitrary data types.  They do so by giving cases for unit, product, and sum,
4325 which are defined thus in the library module <Literal>Generics</Literal>:
4326 </Para>
4327 <ProgramListing>
4328   data Unit    = Unit
4329   data a :+: b = Inl a | Inr b
4330   data a :*: b = a :*: b
4331 </ProgramListing>
4332 <Para>
4333 Now you can make a data type into an instance of Bin like this:
4334 <ProgramListing>
4335   instance (Bin a, Bin b) => Bin (a,b)
4336   instance Bin a => Bin [a]
4337 </ProgramListing>
That is, just leave off the "where" clause.  Of course, you can put in the
where clause and override whichever methods you please.
4340 </Para>
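
<Para>
To get a feel for what the generic definitions amount to, it can help to write
the same thing out by hand in ordinary Haskell.  The module below is only a
model, not the code GHC generates: it declares its own copies of
<Literal>Unit</Literal>, <Literal>:+:</Literal> and <Literal>:*:</Literal>
rather than importing <Literal>Generics</Literal>, turns each type-pattern case
into an ordinary instance, and views <Literal>Bool</Literal> as the sum
<Literal>Unit :+: Unit</Literal>.  The names <Literal>BoolRep</Literal>,
<Function>fromBool</Function> and <Function>toBool</Function> are hypothetical,
and the <Literal>TypeOperators</Literal> pragma is an assumption about the
compiler being used.

<ProgramListing>
{-# LANGUAGE TypeOperators #-}
module Main where

-- Local stand-ins for the types of the Generics library.
data Unit    = Unit
data a :+: b = Inl a | Inr b
data a :*: b = a :*: b

class Bin a where
  toBin   :: a -> [Int]
  fromBin :: [Int] -> (a, [Int])

-- One ordinary instance per type pattern of the generic class.
instance Bin Unit where
  toBin Unit = []
  fromBin bs = (Unit, bs)

instance (Bin a, Bin b) => Bin (a :+: b) where
  toBin (Inl x)  = 0 : toBin x
  toBin (Inr y)  = 1 : toBin y
  fromBin (0:bs) = (Inl x, bs') where (x, bs') = fromBin bs
  fromBin (1:bs) = (Inr y, bs') where (y, bs') = fromBin bs
  fromBin _      = error "fromBin: bad tag"   -- added for totality

instance (Bin a, Bin b) => Bin (a :*: b) where
  toBin (x :*: y) = toBin x ++ toBin y
  fromBin bs      = (x :*: y, bs'')
    where (x, bs')  = fromBin bs
          (y, bs'') = fromBin bs'

-- Hypothetical view of Bool as a sum of two units.
type BoolRep = Unit :+: Unit

fromBool :: Bool -> BoolRep
fromBool False = Inl Unit
fromBool True  = Inr Unit

toBool :: BoolRep -> Bool
toBool (Inl Unit) = False
toBool (Inr Unit) = True

-- Convert to the representation, then reuse the instances above.
instance Bin Bool where
  toBin      = toBin . fromBool
  fromBin bs = (toBool r, rest) where (r, rest) = fromBin bs

main :: IO ()
main = do
  print (toBin True)
  print (fst (fromBin [0] :: (Bool, [Int])))
</ProgramListing>
</Para>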
4341
4342     <Sect2>
4343       <Title> Using generics </Title>
4344       <Para>To use generics you need to</para>
4345       <ItemizedList>
4346         <ListItem>
4347           <Para>Use the <Option>-fgenerics</Option> flag.</Para>
4348         </ListItem>
4349         <ListItem>
4350           <Para>Import the module <Literal>Generics</Literal> from the
4351           <Literal>lang</Literal> package.  This import brings into
4352           scope the data types <Literal>Unit</Literal>,
4353           <Literal>:*:</Literal>, and <Literal>:+:</Literal>.  (You
4354           don't need this import if you don't mention these types
4355           explicitly; for example, if you are simply giving instance
4356           declarations.)</Para>
4357         </ListItem>
4358       </ItemizedList>
4359     </Sect2>
4360
4361 <Sect2> <Title> Changes wrt the paper </Title>
4362 <Para>
4363 Note that the type constructors <Literal>:+:</Literal> and <Literal>:*:</Literal>
4364 can be written infix (indeed, you can now use
4365 any operator starting in a colon as an infix type constructor).  Also note that
4366 the type constructors are not exactly as in the paper (Unit instead of 1, etc).
4367 Finally, note that the syntax of the type patterns in the class declaration
uses "<Literal>{|</Literal>" and "<Literal>|}</Literal>" brackets; curly braces
alone would be ambiguous when they appear on right hand sides (an extension we
4370 anticipate wanting).
4371 </Para>
4372 </Sect2>
4373
4374 <Sect2> <Title>Terminology and restrictions</Title>
4375 <Para>
4376 Terminology.  A "generic default method" in a class declaration
4377 is one that is defined using type patterns as above.
4378 A "polymorphic default method" is a default method defined as in Haskell 98.
4379 A "generic class declaration" is a class declaration with at least one
4380 generic default method.
4381 </Para>
4382
4383 <Para>
4384 Restrictions:
4385 <ItemizedList>
4386 <ListItem>
4387 <Para>
4388 Alas, we do not yet implement the stuff about constructor names and
4389 field labels.
4390 </Para>
4391 </ListItem>
4392
4393 <ListItem>
4394 <Para>
4395 A generic class can have only one parameter; you can't have a generic
4396 multi-parameter class.
4397 </Para>
4398 </ListItem>
4399
4400 <ListItem>
4401 <Para>
4402 A default method must be defined entirely using type patterns, or entirely
4403 without.  So this is illegal:
4404 <ProgramListing>
4405   class Foo a where
4406     op :: a -> (a, Bool)
4407     op {| Unit |} Unit = (Unit, True)
4408     op x               = (x,    False)
4409 </ProgramListing>
4410 However it is perfectly OK for some methods of a generic class to have
4411 generic default methods and others to have polymorphic default methods.
4412 </Para>
4413 </ListItem>
4414
4415 <ListItem>
4416 <Para>
4417 The type variable(s) in the type pattern for a generic method declaration
scope over the right hand side.  So this is legal (note the use of the type variable ``p'' in a type signature on the right hand side):
4419 <ProgramListing>
4420   class Foo a where
4421     op :: a -> Bool
4422     op {| p :*: q |} (x :*: y) = op (x :: p)
4423     ...
4424 </ProgramListing>
4425 </Para>
4426 </ListItem>
4427
4428 <ListItem>
4429 <Para>
4430 The type patterns in a generic default method must take one of the forms:
4431 <ProgramListing>
4432        a :+: b
4433        a :*: b
4434        Unit
4435 </ProgramListing>
4436 where "a" and "b" are type variables.  Furthermore, all the type patterns for
4437 a single type constructor (<Literal>:*:</Literal>, say) must be identical; they
4438 must use the same type variables.  So this is illegal:
4439 <ProgramListing>
4440   class Foo a where
4441     op :: a -> Bool
4442     op {| a :+: b |} (Inl x) = True
4443     op {| p :+: q |} (Inr y) = False
4444 </ProgramListing>
4445 The type patterns must be identical, even in equations for different methods of the class.
4446 So this too is illegal:
4447 <ProgramListing>
4448   class Foo a where
4449     op1 :: a -> Bool
    op1 {| a :*: b |} (x :*: y) = True

    op2 :: a -> Bool
    op2 {| p :*: q |} (x :*: y) = False
4454 </ProgramListing>
(The reason for this restriction is that we gather all the equations for a particular type constructor
4456 into a single generic instance declaration.)
4457 </Para>
4458 </ListItem>
4459
4460 <ListItem>
4461 <Para>
4462 A generic method declaration must give a case for each of the three type constructors.
4463 </Para>
4464 </ListItem>
4465
4466 <ListItem>
4467 <Para>
4468 In an instance declaration for a generic class, the idea is that the compiler
4469 will fill in the methods for you, based on the generic templates.  However it can only
4470 do so if
4471   <ItemizedList>
4472   <ListItem>
4473   <Para>
4474   The instance type is simple (a type constructor applied to type variables, as in Haskell 98).
4475   </Para>
4476   </ListItem>
4477   <ListItem>
4478   <Para>
4479   No constructor of the instance type has unboxed fields.
4480   </Para>
4481   </ListItem>
4482   </ItemizedList>
4483 (Of course, these things can only arise if you are already using GHC extensions.)
However, you can still give instance declarations for types which break these rules,
4485 provided you give explicit code to override any generic default methods.
4486 </Para>
4487 </ListItem>
4488
4489 </ItemizedList>
4490 </Para>
4491
4492 <Para>
4493 The option <Option>-ddump-deriv</Option> dumps incomprehensible stuff giving details of
4494 what the compiler does with generic declarations.
4495 </Para>
4496
4497 </Sect2>
4498
4499 <Sect2> <Title> Another example </Title>
4500 <Para>
4501 Just to finish with, here's another example I rather like:
4502 <ProgramListing>
4503   class Tag a where
4504     nCons :: a -> Int
4505     nCons {| Unit |}    _ = 1
4506     nCons {| a :*: b |} _ = 1
4507     nCons {| a :+: b |} _ = nCons (bot::a) + nCons (bot::b)
4508
4509     tag :: a -> Int
4510     tag {| Unit |}    _       = 1
4511     tag {| a :*: b |} _       = 1
4512     tag {| a :+: b |} (Inl x) = tag x
4513     tag {| a :+: b |} (Inr y) = nCons (bot::a) + tag y
4514 </ProgramListing>
4515 </Para>
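
<Para>
The class above can be modelled in ordinary Haskell to see what the two methods
compute.  This is a sketch under assumptions, not generated code: it declares
local copies of the representation types, assumes that <Literal>bot</Literal>
stands for an undefined value (written <Literal>undefined</Literal> here), and
relies on the <Literal>ScopedTypeVariables</Literal> and
<Literal>TypeOperators</Literal> pragmas of the compiler being used.  The type
<Literal>Three</Literal> is a hypothetical representation of a type with three
nullary constructors.

<ProgramListing>
{-# LANGUAGE ScopedTypeVariables, TypeOperators #-}
module Main where

-- Local stand-ins for the types of the Generics library.
data Unit    = Unit
data a :+: b = Inl a | Inr b
data a :*: b = a :*: b

class Tag a where
  nCons :: a -> Int   -- number of constructors; never inspects its argument
  tag   :: a -> Int   -- position of the constructor used, counting from 1

instance Tag Unit where
  nCons _ = 1
  tag   _ = 1

instance Tag (a :*: b) where
  nCons _ = 1
  tag   _ = 1

instance (Tag a, Tag b) => Tag (a :+: b) where
  -- The type variables a and b from the instance head are in scope here.
  nCons _     = nCons (undefined :: a) + nCons (undefined :: b)
  tag (Inl x) = tag x
  tag (Inr y) = nCons (undefined :: a) + tag y

-- Hypothetical representation of a three-constructor type.
type Three = Unit :+: (Unit :+: Unit)

main :: IO ()
main = do
  print (nCons (undefined :: Three))
  print (map tag ([Inl Unit, Inr (Inl Unit), Inr (Inr Unit)] :: [Three]))
</ProgramListing>

Here <Function>nCons</Function> counts three constructors, and
<Function>tag</Function> numbers them 1, 2 and 3 from left to right: each
<Literal>Inr</Literal> skips over the constructors on the left of the sum.
</Para>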
4516 </Sect2>
4517 </Sect1>
4518
4519 <!-- Emacs stuff:
4520      ;;; Local Variables: ***
4521      ;;; mode: sgml ***
4522      ;;; sgml-parent-document: ("users_guide.sgml" "book" "chapter" "sect1") ***
4523      ;;; End: ***
4524  -->