ghc/docs/users_guide/glasgow_exts.sgml

   1 <Para>
   2 <IndexTerm><Primary>language, GHC</Primary></IndexTerm>
   3 <IndexTerm><Primary>extensions, GHC</Primary></IndexTerm>
   4 As with all known Haskell systems, GHC implements some extensions to
   5 the language.  To use them, you'll need to give a <Option>-fglasgow-exts</Option>
   6 <IndexTerm><Primary>-fglasgow-exts option</Primary></IndexTerm> option.
   7 </Para>
   8
   9 <Para>
  10 Virtually all of the Glasgow extensions serve to give you access to
  11 the underlying facilities with which we implement Haskell.  Thus, you
  12 can get at the Raw Iron, if you are willing to write some non-standard
  13 code at a more primitive level.  You need not be &ldquo;stuck&rdquo; on
  14 performance because of the implementation costs of Haskell's
  15 &ldquo;high-level&rdquo; features&mdash;you can always code &ldquo;under&rdquo; them.  In an extreme case, you can write all your time-critical code in C, and then just glue it together with Haskell!
  16 </Para>
  17
  18 <Para>
  19 Executive summary of our extensions:
  20 </Para>
  21
  22 <Para>
  23 <VariableList>
  24
  25 <VarListEntry>
  26 <Term>Unboxed types and primitive operations:</Term>
  27 <ListItem>
  28 <Para>
  29 You can get right down to the raw machine types and operations;
  30 included in this are &ldquo;primitive arrays&rdquo; (direct access to Big Wads
  31 of Bytes).  Please see <XRef LinkEnd="glasgow-unboxed"> and following.
  32 </Para>
  33 </ListItem>
  34 </VarListEntry>
  35
  36 <VarListEntry>
  37 <Term>Multi-parameter type classes:</Term>
  38 <ListItem>
  39 <Para>
  40 GHC's type system supports extended type classes with multiple
  41 parameters.  Please see <XRef LinkEnd="multi-param-type-classes">.
  42 </Para>
  43 </ListItem>
  44 </VarListEntry>
  45
  46 <VarListEntry>
  47 <Term>Local universal quantification:</Term>
  48 <ListItem>
  49 <Para>
  50 GHC's type system supports explicit universal quantification in
  51 constructor fields and function arguments.  This is useful for things
  52 like defining <Literal>runST</Literal> from the state-thread world.  See <XRef LinkEnd="universal-quantification">.
  53 </Para>
  54 </ListItem>
  55 </VarListEntry>
  56
  57 <VarListEntry>
  58 <Term>Extistentially quantification in data types:</Term>
  59 <ListItem>
  60 <Para>
  61 Some or all of the type variables in a datatype declaration may be
  62 <Emphasis>existentially quantified</Emphasis>.  More details in <XRef LinkEnd="existential-quantification">.
  63 </Para>
  64 </ListItem>
  65 </VarListEntry>
  66
  67 <VarListEntry>
  68 <Term>Scoped type variables:</Term>
  69 <ListItem>
  70 <Para>
  71 Scoped type variables enable the programmer to supply type signatures
  72 for some nested declarations, where this would not be legal in Haskell
  73 98.  Details in <XRef LinkEnd="scoped-type-variables">.
  74 </Para>
  75 </ListItem>
  76 </VarListEntry>
  77
  78 <VarListEntry>
  79 <Term>Pattern guards</Term>
  80 <ListItem>
  81 <Para>
  82 Instead of being a boolean expression, a guard is a list of qualifiers, exactly as in a list comprehension. See <XRef LinkEnd="pattern-guards">.
  83 </Para>
  84 </ListItem>
  85 </VarListEntry>
  86
  87 <VarListEntry>
  88 <Term>Foreign calling:</Term>
  89 <ListItem>
  90 <Para>
  91 Just what it sounds like.  We provide <Emphasis>lots</Emphasis> of rope that you
  92 can dangle around your neck.  Please see <XRef LinkEnd="ffi">.
  93 </Para>
  94 </ListItem>
  95 </VarListEntry>
  96
  97 <VarListEntry>
  98 <Term>Pragmas</Term>
  99 <ListItem>
 100 <Para>
 101 Pragmas are special instructions to the compiler placed in the source
 102 file.  The pragmas GHC supports are described in <XRef LinkEnd="pragmas">.
 103 </Para>
 104 </ListItem>
 105 </VarListEntry>
 106
 107 <VarListEntry>
 108 <Term>Rewrite rules:</Term>
 109 <ListItem>
 110 <Para>
 111 The programmer can specify rewrite rules as part of the source program
 112 (in a pragma).  GHC applies these rewrite rules wherever it can.
 113 Details in <XRef LinkEnd="rewrite-rules">.
 114 </Para>
 115 </ListItem>
 116 </VarListEntry>
 117
 118 <VarListEntry>
 119 <Term>Generic classes:</Term>
 120 <ListItem>
 121 <Para>
 122 Generic class declarations allow you to define a class
 123 whose methods say how to work over an arbitrary data type.
 124 Then it's really easy to make any new type into an instance of
 125 the class.  This generalises the rather ad-hoc "deriving" feature
 126 of Haskell 98.
 127 Details in <XRef LinkEnd="generic-classes">.
 128 </Para>
 129 </ListItem>
 130 </VarListEntry>
 131 </VariableList>
 132 </Para>
 133
 134 <Para>
 135 Before you get too carried away working at the lowest level (e.g.,
 136 sloshing <Literal>MutableByteArray&num;</Literal>s around your
 137 program), you may wish to check if there are libraries that provide a
 138 &ldquo;Haskellised veneer&rdquo; over the features you want.  See
 139 <xref linkend="book-hslibs">.
 140 </Para>
 141
 142 <Sect1 id="language-options">
 143 <Title>Language variations
 144 </Title>
 145
 146 <Para> There are several flags that control what variation of the language are permitted.
 147 Leaving out all of them gives you standard Haskell 98.</Para>
 148
 149 <VariableList>
 150
 151 <VarListEntry>
 152 <Term><Option>-fglasgow-exts</Option>:</Term>
 153 <ListItem>
 154 <Para>This simultaneously enables all of the extensions to Haskell 98 described in this
 155 chapter, except where otherwise noted. </Para>
 156 </ListItem> </VarListEntry>
 157
 158 <VarListEntry>
 159 <Term><Option>-fno-monomorphism-restriction</Option>:</Term>
 160 <ListItem>
 161 <Para> Switch off the Haskell 98 monomorphism restriction.  Independent of the <Option>-fglasgow-exts</Option>
 162 flag. </Para>
 163 </ListItem> </VarListEntry>
 164
 165 <VarListEntry>
 166 <Term><Option>-fallow-overlapping-instances</Option>,
 167       <Option>-fallow-undecidable-instances</Option>,
 168       <Option>-fcontext-stack</Option>:</Term>
 169 <ListItem>
 170 <Para> See <XRef LinkEnd="instance-decls">.
 171 Only relevant if you also use <Option>-fglasgow-exts</Option>.
 172 </Para>
 173 </ListItem> </VarListEntry>
 174
 175 <VarListEntry>
 176 <Term><Option>-fignore-asserts</Option>:</Term>
 177 <ListItem>
 178 <Para> See <XRef LinkEnd="sec-assertions">.
 179 Only relevant if you also use <Option>-fglasgow-exts</Option>.
 180 </Para>
 181 </ListItem> </VarListEntry>
 182
 183 <VarListEntry>
 184 <Term> <Option>-finline-phase</Option>:</Term>
 185 <ListItem>
 186 <Para> See <XRef LinkEnd="rewrite-rules">.
 187 Only relevant if you also use <Option>-fglasgow-exts</Option>.</para>
 188 </ListItem> </VarListEntry>
 189
 190 <VarListEntry>
 191 <Term> <Option>-fgenerics</Option>:</Term>
 192 <ListItem>
 193 <Para> See <XRef LinkEnd="generic-classes">.
 194 Independent of <Option>-fglasgow-exts</Option>.
 195 </Para>
 196 </ListItem> </VarListEntry>
 197
 198 </VariableList>
 199   </sect1>
 200
 201 <Sect1 id="primitives">
 202 <Title>Unboxed types and primitive operations
 203 </Title>
 204 <IndexTerm><Primary>PrelGHC module</Primary></IndexTerm>
 205
 206 <Para>
 207 This module defines all the types which are primitive in Glasgow
 208 Haskell, and the operations provided for them.
 209 </Para>
 210
 211 <Sect2 id="glasgow-unboxed">
 212 <Title>Unboxed types
 213 </Title>
 214
 215 <Para>
 216 <IndexTerm><Primary>Unboxed types (Glasgow extension)</Primary></IndexTerm>
 217 </Para>
 218
 219 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
 220 that values of that type are represented by a pointer to a heap
 221 object.  The representation of a Haskell <literal>Int</literal>, for
 222 example, is a two-word heap object.  An <firstterm>unboxed</firstterm>
 223 type, however, is represented by the value itself, no pointers or heap
 224 allocation are involved.
 225 </para>
 226
 227 <Para>
 228 Unboxed types correspond to the &ldquo;raw machine&rdquo; types you
 229 would use in C: <Literal>Int&num;</Literal> (long int),
 230 <Literal>Double&num;</Literal> (double), <Literal>Addr&num;</Literal>
 231 (void *), etc.  The <Emphasis>primitive operations</Emphasis>
 232 (PrimOps) on these types are what you might expect; e.g.,
 233 <Literal>(+&num;)</Literal> is addition on
 234 <Literal>Int&num;</Literal>s, and is the machine-addition that we all
 235 know and love&mdash;usually one instruction.
 236 </Para>
 237
 238 <Para>
 239 Primitive (unboxed) types cannot be defined in Haskell, and are
 240 therefore built into the language and compiler.  Primitive types are
 241 always unlifted; that is, a value of a primitive type cannot be
 242 bottom.  We use the convention that primitive types, values, and
 243 operations have a <Literal>&num;</Literal> suffix.
 244 </Para>
 245
 246 <Para>
 247 Primitive values are often represented by a simple bit-pattern, such
 248 as <Literal>Int&num;</Literal>, <Literal>Float&num;</Literal>,
 249 <Literal>Double&num;</Literal>.  But this is not necessarily the case:
 250 a primitive value might be represented by a pointer to a
 251 heap-allocated object.  Examples include
 252 <Literal>Array&num;</Literal>, the type of primitive arrays.  A
 253 primitive array is heap-allocated because it is too big a value to fit
 254 in a register, and would be too expensive to copy around; in a sense,
 255 it is accidental that it is represented by a pointer.  If a pointer
 256 represents a primitive value, then it really does point to that value:
 257 no unevaluated thunks, no indirections&hellip;nothing can be at the
 258 other end of the pointer than the primitive value.
 259 </Para>
 260
 261 <Para>
 262 There are some restrictions on the use of primitive types, the main
 263 one being that you can't pass a primitive value to a polymorphic
 264 function or store one in a polymorphic data type.  This rules out
 265 things like <Literal>[Int&num;]</Literal> (i.e. lists of primitive
 266 integers).  The reason for this restriction is that polymorphic
 267 arguments and constructor fields are assumed to be pointers: if an
 268 unboxed integer is stored in one of these, the garbage collector would
 269 attempt to follow it, leading to unpredictable space leaks.  Or a
 270 <Function>seq</Function> operation on the polymorphic component may
 271 attempt to dereference the pointer, with disastrous results.  Even
 272 worse, the unboxed value might be larger than a pointer
 273 (<Literal>Double&num;</Literal> for instance).
 274 </Para>
 275
 276 <Para>
 277 Nevertheless, A numerically-intensive program using unboxed types can
 278 go a <Emphasis>lot</Emphasis> faster than its &ldquo;standard&rdquo;
 279 counterpart&mdash;we saw a threefold speedup on one example.
 280 </Para>
 281
 282 </sect2>
 283
 284 <Sect2 id="unboxed-tuples">
 285 <Title>Unboxed Tuples
 286 </Title>
 287
 288 <Para>
 289 Unboxed tuples aren't really exported by <Literal>PrelGHC</Literal>,
 290 they're available by default with <Option>-fglasgow-exts</Option>.  An
 291 unboxed tuple looks like this:
 292 </Para>
 293
 294 <Para>
 295
 296 <ProgramListing>
 297 (# e_1, ..., e_n #)
 298 </ProgramListing>
 299
 300 </Para>
 301
 302 <Para>
 303 where <Literal>e&lowbar;1..e&lowbar;n</Literal> are expressions of any
 304 type (primitive or non-primitive).  The type of an unboxed tuple looks
 305 the same.
 306 </Para>
 307
 308 <Para>
 309 Unboxed tuples are used for functions that need to return multiple
 310 values, but they avoid the heap allocation normally associated with
 311 using fully-fledged tuples.  When an unboxed tuple is returned, the
 312 components are put directly into registers or on the stack; the
 313 unboxed tuple itself does not have a composite representation.  Many
 314 of the primitive operations listed in this section return unboxed
 315 tuples.
 316 </Para>
 317
 318 <Para>
 319 There are some pretty stringent restrictions on the use of unboxed tuples:
 320 </Para>
 321
 322 <Para>
 323
 324 <ItemizedList>
 325 <ListItem>
 326
 327 <Para>
 328  Unboxed tuple types are subject to the same restrictions as
 329 other unboxed types; i.e. they may not be stored in polymorphic data
 330 structures or passed to polymorphic functions.
 331
 332 </Para>
 333 </ListItem>
 334 <ListItem>
 335
 336 <Para>
 337  Unboxed tuples may only be constructed as the direct result of
 338 a function, and may only be deconstructed with a <Literal>case</Literal> expression.
 339 eg. the following are valid:
 340
 341
 342 <ProgramListing>
 343 f x y = (# x+1, y-1 #)
 344 g x = case f x x of { (# a, b #) -&#62; a + b }
 345 </ProgramListing>
 346
 347
 348 but the following are invalid:
 349
 350
 351 <ProgramListing>
 352 f x y = g (# x, y #)
 353 g (# x, y #) = x + y
 354 </ProgramListing>
 355
 356
 357 </Para>
 358 </ListItem>
 359 <ListItem>
 360
 361 <Para>
 362  No variable can have an unboxed tuple type.  This is illegal:
 363
 364
 365 <ProgramListing>
 366 f :: (# Int, Int #) -&#62; (# Int, Int #)
 367 f x = x
 368 </ProgramListing>
 369
 370
 371 because <VarName>x</VarName> has an unboxed tuple type.
 372
 373 </Para>
 374 </ListItem>
 375
 376 </ItemizedList>
 377
 378 </Para>
 379
 380 <Para>
 381 Note: we may relax some of these restrictions in the future.
 382 </Para>
 383
 384 <Para>
 385 The <Literal>IO</Literal> and <Literal>ST</Literal> monads use unboxed
 386 tuples to avoid unnecessary allocation during sequences of operations.
 387 </Para>
 388
 389 </Sect2>
 390
 391 <Sect2>
 392 <Title>Character and numeric types</Title>
 393
 394 <IndexTerm><Primary>character types, primitive</Primary></IndexTerm>
 395 <IndexTerm><Primary>numeric types, primitive</Primary></IndexTerm>
 396 <IndexTerm><Primary>integer types, primitive</Primary></IndexTerm>
 397 <IndexTerm><Primary>floating point types, primitive</Primary></IndexTerm>
 398 <Para>
 399 There are the following obvious primitive types:
 400 </Para>
 401
 402 <ProgramListing>
 403 type Char#
 404 type Int#
 405 type Word#
 406 type Addr#
 407 type Float#
 408 type Double#
 409 type Int64#
 410 type Word64#
 411 </ProgramListing>
 412
 413 <IndexTerm><Primary><literal>Char&num;</literal></Primary></IndexTerm>
 414 <IndexTerm><Primary><literal>Int&num;</literal></Primary></IndexTerm>
 415 <IndexTerm><Primary><literal>Word&num;</literal></Primary></IndexTerm>
 416 <IndexTerm><Primary><literal>Addr&num;</literal></Primary></IndexTerm>
 417 <IndexTerm><Primary><literal>Float&num;</literal></Primary></IndexTerm>
 418 <IndexTerm><Primary><literal>Double&num;</literal></Primary></IndexTerm>
 419 <IndexTerm><Primary><literal>Int64&num;</literal></Primary></IndexTerm>
 420 <IndexTerm><Primary><literal>Word64&num;</literal></Primary></IndexTerm>
 421
 422 <Para>
 423 If you really want to know their exact equivalents in C, see
 424 <Filename>ghc/includes/StgTypes.h</Filename> in the GHC source tree.
 425 </Para>
 426
 427 <Para>
 428 Literals for these types may be written as follows:
 429 </Para>
 430
 431 <Para>
 432
 433 <ProgramListing>
 434 1#              an Int#
 435 1.2#            a Float#
 436 1.34##          a Double#
 437 'a'#            a Char#; for weird characters, use e.g. '\o&#60;octal&#62;'#
 438 "a"#            an Addr# (a `char *'); only characters '\0'..'\255' allowed
 439 </ProgramListing>
 440
 441 <IndexTerm><Primary>literals, primitive</Primary></IndexTerm>
 442 <IndexTerm><Primary>constants, primitive</Primary></IndexTerm>
 443 <IndexTerm><Primary>numbers, primitive</Primary></IndexTerm>
 444 </Para>
 445
 446 </Sect2>
 447
 448 <Sect2>
 449 <Title>Comparison operations</Title>
 450
 451 <Para>
 452 <IndexTerm><Primary>comparisons, primitive</Primary></IndexTerm>
 453 <IndexTerm><Primary>operators, comparison</Primary></IndexTerm>
 454 </Para>
 455
 456 <Para>
 457
 458 <ProgramListing>
 459 {&#62;,&#62;=,==,/=,&#60;,&#60;=}# :: Int# -&#62; Int# -&#62; Bool
 460
 461 {gt,ge,eq,ne,lt,le}Char# :: Char# -&#62; Char# -&#62; Bool
 462     -- ditto for Word# and Addr#
 463 </ProgramListing>
 464
 465 <IndexTerm><Primary><literal>&#62;&num;</literal></Primary></IndexTerm>
 466 <IndexTerm><Primary><literal>&#62;=&num;</literal></Primary></IndexTerm>
 467 <IndexTerm><Primary><literal>==&num;</literal></Primary></IndexTerm>
 468 <IndexTerm><Primary><literal>/=&num;</literal></Primary></IndexTerm>
 469 <IndexTerm><Primary><literal>&#60;&num;</literal></Primary></IndexTerm>
 470 <IndexTerm><Primary><literal>&#60;=&num;</literal></Primary></IndexTerm>
 471 <IndexTerm><Primary><literal>gt&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 472 <IndexTerm><Primary><literal>ge&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 473 <IndexTerm><Primary><literal>eq&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 474 <IndexTerm><Primary><literal>ne&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 475 <IndexTerm><Primary><literal>lt&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 476 <IndexTerm><Primary><literal>le&lcub;Char,Word,Addr&rcub;&num;</literal></Primary></IndexTerm>
 477 </Para>
 478
 479 </Sect2>
 480
 481 <Sect2>
 482 <Title>Primitive-character operations</Title>
 483
 484 <Para>
 485 <IndexTerm><Primary>characters, primitive operations</Primary></IndexTerm>
 486 <IndexTerm><Primary>operators, primitive character</Primary></IndexTerm>
 487 </Para>
 488
 489 <Para>
 490
 491 <ProgramListing>
 492 ord# :: Char# -&#62; Int#
 493 chr# :: Int# -&#62; Char#
 494 </ProgramListing>
 495
 496 <IndexTerm><Primary><literal>ord&num;</literal></Primary></IndexTerm>
 497 <IndexTerm><Primary><literal>chr&num;</literal></Primary></IndexTerm>
 498 </Para>
 499
 500 </Sect2>
 501
 502 <Sect2>
 503 <Title>Primitive-<Literal>Int</Literal> operations</Title>
 504
 505 <Para>
 506 <IndexTerm><Primary>integers, primitive operations</Primary></IndexTerm>
 507 <IndexTerm><Primary>operators, primitive integer</Primary></IndexTerm>
 508 </Para>
 509
 510 <Para>
 511
 512 <ProgramListing>
 513 {+,-,*,quotInt,remInt,gcdInt}# :: Int# -&#62; Int# -&#62; Int#
 514 negateInt# :: Int# -&#62; Int#
 515
 516 iShiftL#, iShiftRA#, iShiftRL# :: Int# -&#62; Int# -&#62; Int#
 517         -- shift left, right arithmetic, right logical
 518
 519 addIntC#, subIntC#, mulIntC# :: Int# -> Int# -> (# Int#, Int# #)
 520         -- add, subtract, multiply with carry
 521 </ProgramListing>
 522
 523 <IndexTerm><Primary><literal>+&num;</literal></Primary></IndexTerm>
 524 <IndexTerm><Primary><literal>-&num;</literal></Primary></IndexTerm>
 525 <IndexTerm><Primary><literal>*&num;</literal></Primary></IndexTerm>
 526 <IndexTerm><Primary><literal>quotInt&num;</literal></Primary></IndexTerm>
 527 <IndexTerm><Primary><literal>remInt&num;</literal></Primary></IndexTerm>
 528 <IndexTerm><Primary><literal>gcdInt&num;</literal></Primary></IndexTerm>
 529 <IndexTerm><Primary><literal>iShiftL&num;</literal></Primary></IndexTerm>
 530 <IndexTerm><Primary><literal>iShiftRA&num;</literal></Primary></IndexTerm>
 531 <IndexTerm><Primary><literal>iShiftRL&num;</literal></Primary></IndexTerm>
 532 <IndexTerm><Primary><literal>addIntC&num;</literal></Primary></IndexTerm>
 533 <IndexTerm><Primary><literal>subIntC&num;</literal></Primary></IndexTerm>
 534 <IndexTerm><Primary><literal>mulIntC&num;</literal></Primary></IndexTerm>
 535 <IndexTerm><Primary>shift operations, integer</Primary></IndexTerm>
 536 </Para>
 537
 538 <Para>
 539 <Emphasis>Note:</Emphasis> No error/overflow checking!
 540 </Para>
 541
 542 </Sect2>
 543
 544 <Sect2>
 545 <Title>Primitive-<Literal>Double</Literal> and <Literal>Float</Literal> operations</Title>
 546
 547 <Para>
 548 <IndexTerm><Primary>floating point numbers, primitive</Primary></IndexTerm>
 549 <IndexTerm><Primary>operators, primitive floating point</Primary></IndexTerm>
 550 </Para>
 551
 552 <Para>
 553
 554 <ProgramListing>
 555 {+,-,*,/}##         :: Double# -&#62; Double# -&#62; Double#
 556 {&#60;,&#60;=,==,/=,&#62;=,&#62;}## :: Double# -&#62; Double# -&#62; Bool
 557 negateDouble#       :: Double# -&#62; Double#
 558 double2Int#         :: Double# -&#62; Int#
 559 int2Double#         :: Int#    -&#62; Double#
 560
 561 {plus,minux,times,divide}Float# :: Float# -&#62; Float# -&#62; Float#
 562 {gt,ge,eq,ne,lt,le}Float# :: Float# -&#62; Float# -&#62; Bool
 563 negateFloat#        :: Float# -&#62; Float#
 564 float2Int#          :: Float# -&#62; Int#
 565 int2Float#          :: Int#   -&#62; Float#
 566 </ProgramListing>
 567
 568 </Para>
 569
 570 <Para>
 571 <IndexTerm><Primary><literal>+&num;&num;</literal></Primary></IndexTerm>
 572 <IndexTerm><Primary><literal>-&num;&num;</literal></Primary></IndexTerm>
 573 <IndexTerm><Primary><literal>*&num;&num;</literal></Primary></IndexTerm>
 574 <IndexTerm><Primary><literal>/&num;&num;</literal></Primary></IndexTerm>
 575 <IndexTerm><Primary><literal>&#60;&num;&num;</literal></Primary></IndexTerm>
 576 <IndexTerm><Primary><literal>&#60;=&num;&num;</literal></Primary></IndexTerm>
 577 <IndexTerm><Primary><literal>==&num;&num;</literal></Primary></IndexTerm>
 578 <IndexTerm><Primary><literal>=/&num;&num;</literal></Primary></IndexTerm>
 579 <IndexTerm><Primary><literal>&#62;=&num;&num;</literal></Primary></IndexTerm>
 580 <IndexTerm><Primary><literal>&#62;&num;&num;</literal></Primary></IndexTerm>
 581 <IndexTerm><Primary><literal>negateDouble&num;</literal></Primary></IndexTerm>
 582 <IndexTerm><Primary><literal>double2Int&num;</literal></Primary></IndexTerm>
 583 <IndexTerm><Primary><literal>int2Double&num;</literal></Primary></IndexTerm>
 584 </Para>
 585
 586 <Para>
 587 <IndexTerm><Primary><literal>plusFloat&num;</literal></Primary></IndexTerm>
 588 <IndexTerm><Primary><literal>minusFloat&num;</literal></Primary></IndexTerm>
 589 <IndexTerm><Primary><literal>timesFloat&num;</literal></Primary></IndexTerm>
 590 <IndexTerm><Primary><literal>divideFloat&num;</literal></Primary></IndexTerm>
 591 <IndexTerm><Primary><literal>gtFloat&num;</literal></Primary></IndexTerm>
 592 <IndexTerm><Primary><literal>geFloat&num;</literal></Primary></IndexTerm>
 593 <IndexTerm><Primary><literal>eqFloat&num;</literal></Primary></IndexTerm>
 594 <IndexTerm><Primary><literal>neFloat&num;</literal></Primary></IndexTerm>
 595 <IndexTerm><Primary><literal>ltFloat&num;</literal></Primary></IndexTerm>
 596 <IndexTerm><Primary><literal>leFloat&num;</literal></Primary></IndexTerm>
 597 <IndexTerm><Primary><literal>negateFloat&num;</literal></Primary></IndexTerm>
 598 <IndexTerm><Primary><literal>float2Int&num;</literal></Primary></IndexTerm>
 599 <IndexTerm><Primary><literal>int2Float&num;</literal></Primary></IndexTerm>
 600 </Para>
 601
 602 <Para>
 603 And a full complement of trigonometric functions:
 604 </Para>
 605
 606 <Para>
 607
 608 <ProgramListing>
 609 expDouble#      :: Double# -&#62; Double#
 610 logDouble#      :: Double# -&#62; Double#
 611 sqrtDouble#     :: Double# -&#62; Double#
 612 sinDouble#      :: Double# -&#62; Double#
 613 cosDouble#      :: Double# -&#62; Double#
 614 tanDouble#      :: Double# -&#62; Double#
 615 asinDouble#     :: Double# -&#62; Double#
 616 acosDouble#     :: Double# -&#62; Double#
 617 atanDouble#     :: Double# -&#62; Double#
 618 sinhDouble#     :: Double# -&#62; Double#
 619 coshDouble#     :: Double# -&#62; Double#
 620 tanhDouble#     :: Double# -&#62; Double#
 621 powerDouble#    :: Double# -&#62; Double# -&#62; Double#
 622 </ProgramListing>
 623
 624 <IndexTerm><Primary>trigonometric functions, primitive</Primary></IndexTerm>
 625 </Para>
 626
 627 <Para>
 628 similarly for <Literal>Float&num;</Literal>.
 629 </Para>
 630
 631 <Para>
 632 There are two coercion functions for <Literal>Float&num;</Literal>/<Literal>Double&num;</Literal>:
 633 </Para>
 634
 635 <Para>
 636
 637 <ProgramListing>
 638 float2Double#   :: Float# -&#62; Double#
 639 double2Float#   :: Double# -&#62; Float#
 640 </ProgramListing>
 641
 642 <IndexTerm><Primary><literal>float2Double&num;</literal></Primary></IndexTerm>
 643 <IndexTerm><Primary><literal>double2Float&num;</literal></Primary></IndexTerm>
 644 </Para>
 645
 646 <Para>
 647 The primitive version of <Function>decodeDouble</Function>
 648 (<Function>encodeDouble</Function> is implemented as an external C
 649 function):
 650 </Para>
 651
 652 <Para>
 653
 654 <ProgramListing>
 655 decodeDouble#   :: Double# -&#62; PrelNum.ReturnIntAndGMP
 656 </ProgramListing>
 657
 658 <IndexTerm><Primary><literal>encodeDouble&num;</literal></Primary></IndexTerm>
 659 <IndexTerm><Primary><literal>decodeDouble&num;</literal></Primary></IndexTerm>
 660 </Para>
 661
 662 <Para>
 663 (And the same for <Literal>Float&num;</Literal>s.)
 664 </Para>
 665
 666 </Sect2>
 667
 668 <Sect2 id="integer-operations">
 669 <Title>Operations on/for <Literal>Integers</Literal> (interface to GMP)
 670 </Title>
 671
 672 <Para>
 673 <IndexTerm><Primary>arbitrary precision integers</Primary></IndexTerm>
 674 <IndexTerm><Primary>Integer, operations on</Primary></IndexTerm>
 675 </Para>
 676
 677 <Para>
 678 We implement <Literal>Integers</Literal> (arbitrary-precision
 679 integers) using the GNU multiple-precision (GMP) package (version
 680 2.0.2).
 681 </Para>
 682
 683 <Para>
 684 The data type for <Literal>Integer</Literal> is either a small
 685 integer, represented by an <Literal>Int</Literal>, or a large integer
 686 represented using the pieces required by GMP's
 687 <Literal>MP&lowbar;INT</Literal> in <Filename>gmp.h</Filename> (see
 688 <Filename>gmp.info</Filename> in
 689 <Filename>ghc/includes/runtime/gmp</Filename>).  It comes out as:
 690 </Para>
 691
 692 <Para>
 693
 694 <ProgramListing>
 695 data Integer = S# Int#             -- small integers
 696              | J# Int# ByteArray#  -- large integers
 697 </ProgramListing>
 698
 699 <IndexTerm><Primary>Integer type</Primary></IndexTerm> The primitive
 700 ops to support large <Literal>Integers</Literal> use the
 701 &ldquo;pieces&rdquo; of the representation, and are as follows:
 702 </Para>
 703
 704 <Para>
 705
 706 <ProgramListing>
 707 negateInteger#  :: Int# -&#62; ByteArray# -&#62; Integer
 708
 709 {plus,minus,times}Integer#, gcdInteger#,
 710   quotInteger#, remInteger#, divExactInteger#
 711         :: Int# -> ByteArray#
 712         -> Int# -> ByteArray#
 713         -> (# Int#, ByteArray# #)
 714
 715 cmpInteger#
 716         :: Int# -> ByteArray#
 717         -> Int# -> ByteArray#
 718         -> Int# -- -1 for &#60;; 0 for ==; +1 for >
 719
 720 cmpIntegerInt#
 721         :: Int# -> ByteArray#
 722         -> Int#
 723         -> Int# -- -1 for &#60;; 0 for ==; +1 for >
 724
 725 gcdIntegerInt# ::
 726         :: Int# -> ByteArray#
 727         -> Int#
 728         -> Int#
 729
 730 divModInteger#, quotRemInteger#
 731         :: Int# -> ByteArray#
 732         -> Int# -> ByteArray#
 733         -> (# Int#, ByteArray#,
 734                   Int#, ByteArray# #)
 735
 736 integer2Int# :: Int# -> ByteArray# -> Int#
 737
 738 int2Integer#  :: Int#  -> Integer -- NB: no error-checking on these two!
 739 word2Integer# :: Word# -> Integer
 740
 741 addr2Integer# :: Addr# -> Integer
 742         -- the Addr# is taken to be a `char *' string
 743         -- to be converted into an Integer.
 744 </ProgramListing>
 745
 746 <IndexTerm><Primary><literal>negateInteger&num;</literal></Primary></IndexTerm>
 747 <IndexTerm><Primary><literal>plusInteger&num;</literal></Primary></IndexTerm>
 748 <IndexTerm><Primary><literal>minusInteger&num;</literal></Primary></IndexTerm>
 749 <IndexTerm><Primary><literal>timesInteger&num;</literal></Primary></IndexTerm>
 750 <IndexTerm><Primary><literal>quotInteger&num;</literal></Primary></IndexTerm>
 751 <IndexTerm><Primary><literal>remInteger&num;</literal></Primary></IndexTerm>
 752 <IndexTerm><Primary><literal>gcdInteger&num;</literal></Primary></IndexTerm>
 753 <IndexTerm><Primary><literal>gcdIntegerInt&num;</literal></Primary></IndexTerm>
 754 <IndexTerm><Primary><literal>divExactInteger&num;</literal></Primary></IndexTerm>
 755 <IndexTerm><Primary><literal>cmpInteger&num;</literal></Primary></IndexTerm>
 756 <IndexTerm><Primary><literal>divModInteger&num;</literal></Primary></IndexTerm>
 757 <IndexTerm><Primary><literal>quotRemInteger&num;</literal></Primary></IndexTerm>
 758 <IndexTerm><Primary><literal>integer2Int&num;</literal></Primary></IndexTerm>
 759 <IndexTerm><Primary><literal>int2Integer&num;</literal></Primary></IndexTerm>
 760 <IndexTerm><Primary><literal>word2Integer&num;</literal></Primary></IndexTerm>
 761 <IndexTerm><Primary><literal>addr2Integer&num;</literal></Primary></IndexTerm>
 762 </Para>
 763
 764 </Sect2>
 765
 766 <Sect2>
 767 <Title>Words and addresses</Title>
 768
 769 <Para>
 770 <IndexTerm><Primary>word, primitive type</Primary></IndexTerm>
 771 <IndexTerm><Primary>address, primitive type</Primary></IndexTerm>
 772 <IndexTerm><Primary>unsigned integer, primitive type</Primary></IndexTerm>
 773 <IndexTerm><Primary>pointer, primitive type</Primary></IndexTerm>
 774 </Para>
 775
 776 <Para>
 777 A <Literal>Word&num;</Literal> is used for bit-twiddling operations.
 778 It is the same size as an <Literal>Int&num;</Literal>, but has no sign
 779 nor any arithmetic operations.
 780
 781 <ProgramListing>
 782 type Word#      -- Same size/etc as Int# but *unsigned*
 783 type Addr#      -- A pointer from outside the "Haskell world" (from C, probably);
 784                 -- described under "arrays"
 785 </ProgramListing>
 786
 787 <IndexTerm><Primary><literal>Word&num;</literal></Primary></IndexTerm>
 788 <IndexTerm><Primary><literal>Addr&num;</literal></Primary></IndexTerm>
 789 </Para>
 790
 791 <Para>
 792 <Literal>Word&num;</Literal>s and <Literal>Addr&num;</Literal>s have
 793 the usual comparison operations.  Other
 794 unboxed-<Literal>Word</Literal> ops (bit-twiddling and coercions):
 795 </Para>
 796
 797 <Para>
 798
 799 <ProgramListing>
 800 {gt,ge,eq,ne,lt,le}Word# :: Word# -> Word# -> Bool
 801
 802 and#, or#, xor# :: Word# -> Word# -> Word#
 803         -- standard bit ops.
 804
 805 quotWord#, remWord# :: Word# -> Word# -> Word#
 806         -- word (i.e. unsigned) versions are different from int
 807         -- versions, so we have to provide these explicitly.
 808
 809 not# :: Word# -> Word#
 810
 811 shiftL#, shiftRL# :: Word# -> Int# -> Word#
 812         -- shift left, right logical
 813
 814 int2Word#       :: Int#  -> Word# -- just a cast, really
 815 word2Int#       :: Word# -> Int#
 816 </ProgramListing>
 817
 818 <IndexTerm><Primary>bit operations, Word and Addr</Primary></IndexTerm>
 819 <IndexTerm><Primary><literal>gtWord&num;</literal></Primary></IndexTerm>
 820 <IndexTerm><Primary><literal>geWord&num;</literal></Primary></IndexTerm>
 821 <IndexTerm><Primary><literal>eqWord&num;</literal></Primary></IndexTerm>
 822 <IndexTerm><Primary><literal>neWord&num;</literal></Primary></IndexTerm>
 823 <IndexTerm><Primary><literal>ltWord&num;</literal></Primary></IndexTerm>
 824 <IndexTerm><Primary><literal>leWord&num;</literal></Primary></IndexTerm>
 825 <IndexTerm><Primary><literal>and&num;</literal></Primary></IndexTerm>
 826 <IndexTerm><Primary><literal>or&num;</literal></Primary></IndexTerm>
 827 <IndexTerm><Primary><literal>xor&num;</literal></Primary></IndexTerm>
 828 <IndexTerm><Primary><literal>not&num;</literal></Primary></IndexTerm>
 829 <IndexTerm><Primary><literal>quotWord&num;</literal></Primary></IndexTerm>
 830 <IndexTerm><Primary><literal>remWord&num;</literal></Primary></IndexTerm>
 831 <IndexTerm><Primary><literal>shiftL&num;</literal></Primary></IndexTerm>
 832 <IndexTerm><Primary><literal>shiftRA&num;</literal></Primary></IndexTerm>
 833 <IndexTerm><Primary><literal>shiftRL&num;</literal></Primary></IndexTerm>
 834 <IndexTerm><Primary><literal>int2Word&num;</literal></Primary></IndexTerm>
 835 <IndexTerm><Primary><literal>word2Int&num;</literal></Primary></IndexTerm>
 836 </Para>
 837
 838 <Para>
 839 Unboxed-<Literal>Addr</Literal> ops (C casts, really):
 840
 841 <ProgramListing>
 842 {gt,ge,eq,ne,lt,le}Addr# :: Addr# -> Addr# -> Bool
 843
 844 int2Addr#       :: Int#  -> Addr#
 845 addr2Int#       :: Addr# -> Int#
 846 addr2Integer#   :: Addr# -> (# Int#, ByteArray# #)
 847 </ProgramListing>
 848
 849 <IndexTerm><Primary><literal>gtAddr&num;</literal></Primary></IndexTerm>
 850 <IndexTerm><Primary><literal>geAddr&num;</literal></Primary></IndexTerm>
 851 <IndexTerm><Primary><literal>eqAddr&num;</literal></Primary></IndexTerm>
 852 <IndexTerm><Primary><literal>neAddr&num;</literal></Primary></IndexTerm>
 853 <IndexTerm><Primary><literal>ltAddr&num;</literal></Primary></IndexTerm>
 854 <IndexTerm><Primary><literal>leAddr&num;</literal></Primary></IndexTerm>
 855 <IndexTerm><Primary><literal>int2Addr&num;</literal></Primary></IndexTerm>
 856 <IndexTerm><Primary><literal>addr2Int&num;</literal></Primary></IndexTerm>
 857 <IndexTerm><Primary><literal>addr2Integer&num;</literal></Primary></IndexTerm>
 858 </Para>
 859
 860 <Para>
 861 The casts between <Literal>Int&num;</Literal>,
 862 <Literal>Word&num;</Literal> and <Literal>Addr&num;</Literal>
 863 correspond to null operations at the machine level, but are required
 864 to keep the Haskell type checker happy.
 865 </Para>
 866
 867 <Para>
 868 Operations for indexing off of C pointers
 869 (<Literal>Addr&num;</Literal>s) to snatch values are listed under
 870 &ldquo;arrays&rdquo;.
 871 </Para>
 872
 873 </Sect2>
 874
 875 <Sect2>
 876 <Title>Arrays</Title>
 877
 878 <Para>
 879 <IndexTerm><Primary>arrays, primitive</Primary></IndexTerm>
 880 </Para>
 881
 882 <Para>
 883 The type <Literal>Array&num; elt</Literal> is the type of primitive,
 884 unpointed arrays of values of type <Literal>elt</Literal>.
 885 </Para>
 886
 887 <Para>
 888
 889 <ProgramListing>
 890 type Array# elt
 891 </ProgramListing>
 892
 893 <IndexTerm><Primary><literal>Array&num;</literal></Primary></IndexTerm>
 894 </Para>
 895
 896 <Para>
 897 <Literal>Array&num;</Literal> is more primitive than a Haskell
 898 array&mdash;indeed, the Haskell <Literal>Array</Literal> interface is
 899 implemented using <Literal>Array&num;</Literal>&mdash;in that an
 900 <Literal>Array&num;</Literal> is indexed only by
 901 <Literal>Int&num;</Literal>s, starting at zero.  It is also more
 902 primitive by virtue of being unboxed.  That doesn't mean that it isn't
 903 a heap-allocated object&mdash;of course, it is.  Rather, being unboxed
 904 means that it is represented by a pointer to the array itself, and not
 905 to a thunk which will evaluate to the array (or to bottom).  The
 906 components of an <Literal>Array&num;</Literal> are themselves boxed.
 907 </Para>
 908
 909 <Para>
 910 The type <Literal>ByteArray&num;</Literal> is similar to
 911 <Literal>Array&num;</Literal>, except that it contains just a string
 912 of (non-pointer) bytes.
 913 </Para>
 914
 915 <Para>
 916
 917 <ProgramListing>
 918 type ByteArray#
 919 </ProgramListing>
 920
 921 <IndexTerm><Primary><literal>ByteArray&num;</literal></Primary></IndexTerm>
 922 </Para>
 923
 924 <Para>
 925 Arrays of these types are useful when a Haskell program wishes to
 926 construct a value to pass to a C procedure. It is also possible to use
 927 them to build (say) arrays of unboxed characters for internal use in a
 928 Haskell program.  Given these uses, <Literal>ByteArray&num;</Literal>
 929 is deliberately a bit vague about the type of its components.
 930 Operations are provided to extract values of type
 931 <Literal>Char&num;</Literal>, <Literal>Int&num;</Literal>,
 932 <Literal>Float&num;</Literal>, <Literal>Double&num;</Literal>, and
 933 <Literal>Addr&num;</Literal> from arbitrary offsets within a
 934 <Literal>ByteArray&num;</Literal>.  (For type
 935 <Literal>Foo&num;</Literal>, the $i$th offset gets you the $i$th
 936 <Literal>Foo&num;</Literal>, not the <Literal>Foo&num;</Literal> at
 937 byte-position $i$.  Mumble.)  (If you want a
 938 <Literal>Word&num;</Literal>, grab an <Literal>Int&num;</Literal>,
 939 then coerce it.)
 940 </Para>
 941
 942 <Para>
 943 Lastly, we have static byte-arrays, of type
 944 <Literal>Addr&num;</Literal> &lsqb;mentioned previously].  (Remember
 945 the duality between arrays and pointers in C.)  Arrays of this types
 946 are represented by a pointer to an array in the world outside Haskell,
 947 so this pointer is not followed by the garbage collector.  In other
 948 respects they are just like <Literal>ByteArray&num;</Literal>.  They
 949 are only needed in order to pass values from C to Haskell.
 950 </Para>
 951
 952 </Sect2>
 953
 954 <Sect2>
 955 <Title>Reading and writing</Title>
 956
 957 <Para>
 958 Primitive arrays are linear, and indexed starting at zero.
 959 </Para>
 960
 961 <Para>
 962 The size and indices of a <Literal>ByteArray&num;</Literal>, <Literal>Addr&num;</Literal>, and
 963 <Literal>MutableByteArray&num;</Literal> are all in bytes.  It's up to the program to
 964 calculate the correct byte offset from the start of the array.  This
 965 allows a <Literal>ByteArray&num;</Literal> to contain a mixture of values of different
 966 type, which is often needed when preparing data for and unpicking
 967 results from C.  (Umm&hellip;not true of indices&hellip;WDP 95/09)
 968 </Para>
 969
 970 <Para>
 971 <Emphasis>Should we provide some <Literal>sizeOfDouble&num;</Literal> constants?</Emphasis>
 972 </Para>
 973
 974 <Para>
 975 Out-of-range errors on indexing should be caught by the code which
 976 uses the primitive operation; the primitive operations themselves do
 977 <Emphasis>not</Emphasis> check for out-of-range indexes. The intention is that the
 978 primitive ops compile to one machine instruction or thereabouts.
 979 </Para>
 980
 981 <Para>
 982 We use the terms &ldquo;reading&rdquo; and &ldquo;writing&rdquo; to refer to accessing
 983 <Emphasis>mutable</Emphasis> arrays (see <XRef LinkEnd="sect-mutable">), and
 984 &ldquo;indexing&rdquo; to refer to reading a value from an <Emphasis>immutable</Emphasis>
 985 array.
 986 </Para>
 987
 988 <Para>
 989 Immutable byte arrays are straightforward to index (all indices in bytes):
 990
 991 <ProgramListing>
 992 indexCharArray#   :: ByteArray# -> Int# -> Char#
 993 indexIntArray#    :: ByteArray# -> Int# -> Int#
 994 indexAddrArray#   :: ByteArray# -> Int# -> Addr#
 995 indexFloatArray#  :: ByteArray# -> Int# -> Float#
 996 indexDoubleArray# :: ByteArray# -> Int# -> Double#
 997
 998 indexCharOffAddr#   :: Addr# -> Int# -> Char#
 999 indexIntOffAddr#    :: Addr# -> Int# -> Int#
1000 indexFloatOffAddr#  :: Addr# -> Int# -> Float#
1001 indexDoubleOffAddr# :: Addr# -> Int# -> Double#
1002 indexAddrOffAddr#   :: Addr# -> Int# -> Addr#
1003  -- Get an Addr# from an Addr# offset
1004 </ProgramListing>
1005
1006 <IndexTerm><Primary><literal>indexCharArray&num;</literal></Primary></IndexTerm>
1007 <IndexTerm><Primary><literal>indexIntArray&num;</literal></Primary></IndexTerm>
1008 <IndexTerm><Primary><literal>indexAddrArray&num;</literal></Primary></IndexTerm>
1009 <IndexTerm><Primary><literal>indexFloatArray&num;</literal></Primary></IndexTerm>
1010 <IndexTerm><Primary><literal>indexDoubleArray&num;</literal></Primary></IndexTerm>
1011 <IndexTerm><Primary><literal>indexCharOffAddr&num;</literal></Primary></IndexTerm>
1012 <IndexTerm><Primary><literal>indexIntOffAddr&num;</literal></Primary></IndexTerm>
1013 <IndexTerm><Primary><literal>indexFloatOffAddr&num;</literal></Primary></IndexTerm>
1014 <IndexTerm><Primary><literal>indexDoubleOffAddr&num;</literal></Primary></IndexTerm>
1015 <IndexTerm><Primary><literal>indexAddrOffAddr&num;</literal></Primary></IndexTerm>
1016 </Para>
1017
1018 <Para>
1019 The last of these, <Function>indexAddrOffAddr&num;</Function>, extracts an <Literal>Addr&num;</Literal> using an offset
1020 from another <Literal>Addr&num;</Literal>, thereby providing the ability to follow a chain of
1021 C pointers.
1022 </Para>
1023
1024 <Para>
1025 Something a bit more interesting goes on when indexing arrays of boxed
1026 objects, because the result is simply the boxed object. So presumably
1027 it should be entered&mdash;we never usually return an unevaluated
1028 object!  This is a pain: primitive ops aren't supposed to do
1029 complicated things like enter objects.  The current solution is to
1030 return a single element unboxed tuple (see <XRef LinkEnd="unboxed-tuples">).
1031 </Para>
1032
1033 <Para>
1034
1035 <ProgramListing>
1036 indexArray#       :: Array# elt -> Int# -> (# elt #)
1037 </ProgramListing>
1038
1039 <IndexTerm><Primary><literal>indexArray&num;</literal></Primary></IndexTerm>
1040 </Para>
1041
1042 </Sect2>
1043
1044 <Sect2>
1045 <Title>The state type</Title>
1046
1047 <Para>
1048 <IndexTerm><Primary><literal>state, primitive type</literal></Primary></IndexTerm>
1049 <IndexTerm><Primary><literal>State&num;</literal></Primary></IndexTerm>
1050 </Para>
1051
1052 <Para>
1053 The primitive type <Literal>State&num;</Literal> represents the state of a state
1054 transformer.  It is parameterised on the desired type of state, which
1055 serves to keep states from distinct threads distinct from one another.
1056 But the <Emphasis>only</Emphasis> effect of this parameterisation is in the type
1057 system: all values of type <Literal>State&num;</Literal> are represented in the same way.
1058 Indeed, they are all represented by nothing at all!  The code
1059 generator &ldquo;knows&rdquo; to generate no code, and allocate no registers
1060 etc, for primitive states.
1061 </Para>
1062
1063 <Para>
1064
1065 <ProgramListing>
1066 type State# s
1067 </ProgramListing>
1068
1069 </Para>
1070
1071 <Para>
1072 The type <Literal>GHC.RealWorld</Literal> is truly opaque: there are no values defined
1073 of this type, and no operations over it.  It is &ldquo;primitive&rdquo; in that
1074 sense - but it is <Emphasis>not unlifted!</Emphasis> Its only role in life is to be
1075 the type which distinguishes the <Literal>IO</Literal> state transformer.
1076 </Para>
1077
1078 <Para>
1079
1080 <ProgramListing>
1081 data RealWorld
1082 </ProgramListing>
1083
1084 </Para>
1085
1086 </Sect2>
1087
1088 <Sect2>
1089 <Title>State of the world</Title>
1090
1091 <Para>
1092 A single, primitive, value of type <Literal>State&num; RealWorld</Literal> is provided.
1093 </Para>
1094
1095 <Para>
1096
1097 <ProgramListing>
1098 realWorld# :: State# RealWorld
1099 </ProgramListing>
1100
1101 <IndexTerm><Primary>realWorld&num; state object</Primary></IndexTerm>
1102 </Para>
1103
1104 <Para>
1105 (Note: in the compiler, not a <Literal>PrimOp</Literal>; just a mucho magic
1106 <Literal>Id</Literal>. Exported from <Literal>GHC</Literal>, though).
1107 </Para>
1108
1109 </Sect2>
1110
1111 <Sect2 id="sect-mutable">
1112 <Title>Mutable arrays</Title>
1113
1114 <Para>
1115 <IndexTerm><Primary>mutable arrays</Primary></IndexTerm>
1116 <IndexTerm><Primary>arrays, mutable</Primary></IndexTerm>
1117 Corresponding to <Literal>Array&num;</Literal> and <Literal>ByteArray&num;</Literal>, we have the types of
1118 mutable versions of each.  In each case, the representation is a
1119 pointer to a suitable block of (mutable) heap-allocated storage.
1120 </Para>
1121
1122 <Para>
1123
1124 <ProgramListing>
1125 type MutableArray# s elt
1126 type MutableByteArray# s
1127 </ProgramListing>
1128
1129 <IndexTerm><Primary><literal>MutableArray&num;</literal></Primary></IndexTerm>
1130 <IndexTerm><Primary><literal>MutableByteArray&num;</literal></Primary></IndexTerm>
1131 </Para>
1132
1133 <Sect3>
1134 <Title>Allocation</Title>
1135
1136 <Para>
1137 <IndexTerm><Primary>mutable arrays, allocation</Primary></IndexTerm>
1138 <IndexTerm><Primary>arrays, allocation</Primary></IndexTerm>
1139 <IndexTerm><Primary>allocation, of mutable arrays</Primary></IndexTerm>
1140 </Para>
1141
1142 <Para>
1143 Mutable arrays can be allocated. Only pointer-arrays are initialised;
1144 arrays of non-pointers are filled in by &ldquo;user code&rdquo; rather than by
1145 the array-allocation primitive.  Reason: only the pointer case has to
1146 worry about GC striking with a partly-initialised array.
1147 </Para>
1148
1149 <Para>
1150
1151 <ProgramListing>
1152 newArray#       :: Int# -> elt -> State# s -> (# State# s, MutableArray# s elt #)
1153
1154 newCharArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
1155 newIntArray#    :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
1156 newAddrArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
1157 newFloatArray#  :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
1158 newDoubleArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
1159 </ProgramListing>
1160
1161 <IndexTerm><Primary><literal>newArray&num;</literal></Primary></IndexTerm>
1162 <IndexTerm><Primary><literal>newCharArray&num;</literal></Primary></IndexTerm>
1163 <IndexTerm><Primary><literal>newIntArray&num;</literal></Primary></IndexTerm>
1164 <IndexTerm><Primary><literal>newAddrArray&num;</literal></Primary></IndexTerm>
1165 <IndexTerm><Primary><literal>newFloatArray&num;</literal></Primary></IndexTerm>
1166 <IndexTerm><Primary><literal>newDoubleArray&num;</literal></Primary></IndexTerm>
1167 </Para>
1168
1169 <Para>
1170 The size of a <Literal>ByteArray&num;</Literal> is given in bytes.
1171 </Para>
1172
1173 </Sect3>
1174
1175 <Sect3>
1176 <Title>Reading and writing</Title>
1177
1178 <Para>
1179 <IndexTerm><Primary>arrays, reading and writing</Primary></IndexTerm>
1180 </Para>
1181
1182 <Para>
1183
1184 <ProgramListing>
1185 readArray#       :: MutableArray# s elt -> Int# -> State# s -> (# State# s, elt #)
1186 readCharArray#   :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Char# #)
1187 readIntArray#    :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #)
1188 readAddrArray#   :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Addr# #)
1189 readFloatArray#  :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Float# #)
1190 readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Double# #)
1191
1192 writeArray#       :: MutableArray# s elt -> Int# -> elt     -> State# s -> State# s
1193 writeCharArray#   :: MutableByteArray# s -> Int# -> Char#   -> State# s -> State# s
1194 writeIntArray#    :: MutableByteArray# s -> Int# -> Int#    -> State# s -> State# s
1195 writeAddrArray#   :: MutableByteArray# s -> Int# -> Addr#   -> State# s -> State# s
1196 writeFloatArray#  :: MutableByteArray# s -> Int# -> Float#  -> State# s -> State# s
1197 writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s
1198 </ProgramListing>
1199
1200 <IndexTerm><Primary><literal>readArray&num;</literal></Primary></IndexTerm>
1201 <IndexTerm><Primary><literal>readCharArray&num;</literal></Primary></IndexTerm>
1202 <IndexTerm><Primary><literal>readIntArray&num;</literal></Primary></IndexTerm>
1203 <IndexTerm><Primary><literal>readAddrArray&num;</literal></Primary></IndexTerm>
1204 <IndexTerm><Primary><literal>readFloatArray&num;</literal></Primary></IndexTerm>
1205 <IndexTerm><Primary><literal>readDoubleArray&num;</literal></Primary></IndexTerm>
1206 <IndexTerm><Primary><literal>writeArray&num;</literal></Primary></IndexTerm>
1207 <IndexTerm><Primary><literal>writeCharArray&num;</literal></Primary></IndexTerm>
1208 <IndexTerm><Primary><literal>writeIntArray&num;</literal></Primary></IndexTerm>
1209 <IndexTerm><Primary><literal>writeAddrArray&num;</literal></Primary></IndexTerm>
1210 <IndexTerm><Primary><literal>writeFloatArray&num;</literal></Primary></IndexTerm>
1211 <IndexTerm><Primary><literal>writeDoubleArray&num;</literal></Primary></IndexTerm>
1212 </Para>
1213
1214 </Sect3>
1215
1216 <Sect3>
1217 <Title>Equality</Title>
1218
1219 <Para>
1220 <IndexTerm><Primary>arrays, testing for equality</Primary></IndexTerm>
1221 </Para>
1222
1223 <Para>
1224 One can take &ldquo;equality&rdquo; of mutable arrays.  What is compared is the
1225 <Emphasis>name</Emphasis> or reference to the mutable array, not its contents.
1226 </Para>
1227
1228 <Para>
1229
1230 <ProgramListing>
1231 sameMutableArray#     :: MutableArray# s elt -> MutableArray# s elt -> Bool
1232 sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
1233 </ProgramListing>
1234
1235 <IndexTerm><Primary><literal>sameMutableArray&num;</literal></Primary></IndexTerm>
1236 <IndexTerm><Primary><literal>sameMutableByteArray&num;</literal></Primary></IndexTerm>
1237 </Para>
1238
1239 </Sect3>
1240
1241 <Sect3>
1242 <Title>Freezing mutable arrays</Title>
1243
1244 <Para>
1245 <IndexTerm><Primary>arrays, freezing mutable</Primary></IndexTerm>
1246 <IndexTerm><Primary>freezing mutable arrays</Primary></IndexTerm>
1247 <IndexTerm><Primary>mutable arrays, freezing</Primary></IndexTerm>
1248 </Para>
1249
1250 <Para>
1251 Only unsafe-freeze has a primitive.  (Safe freeze is done directly in Haskell
1252 by copying the array and then using <Function>unsafeFreeze</Function>.)
1253 </Para>
1254
1255 <Para>
1256
1257 <ProgramListing>
1258 unsafeFreezeArray#     :: MutableArray# s elt -> State# s -> (# State# s, Array# s elt #)
1259 unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> (# State# s, ByteArray# #)
1260 </ProgramListing>
1261
1262 <IndexTerm><Primary><literal>unsafeFreezeArray&num;</literal></Primary></IndexTerm>
1263 <IndexTerm><Primary><literal>unsafeFreezeByteArray&num;</literal></Primary></IndexTerm>
1264 </Para>
1265
1266 </Sect3>
1267
1268 </Sect2>
1269
1270 <Sect2>
1271 <Title>Synchronizing variables (M-vars)</Title>
1272
1273 <Para>
1274 <IndexTerm><Primary>synchronising variables (M-vars)</Primary></IndexTerm>
1275 <IndexTerm><Primary>M-Vars</Primary></IndexTerm>
1276 </Para>
1277
1278 <Para>
1279 Synchronising variables are the primitive type used to implement
1280 Concurrent Haskell's MVars (see the Concurrent Haskell paper for
1281 the operational behaviour of these operations).
1282 </Para>
1283
1284 <Para>
1285
1286 <ProgramListing>
1287 type MVar# s elt        -- primitive
1288
1289 newMVar#    :: State# s -> (# State# s, MVar# s elt #)
1290 takeMVar#   :: SynchVar# s elt -> State# s -> (# State# s, elt #)
1291 putMVar#    :: SynchVar# s elt -> State# s -> State# s
1292 </ProgramListing>
1293
1294 <IndexTerm><Primary><literal>SynchVar&num;</literal></Primary></IndexTerm>
1295 <IndexTerm><Primary><literal>newSynchVar&num;</literal></Primary></IndexTerm>
1296 <IndexTerm><Primary><literal>takeMVar</literal></Primary></IndexTerm>
1297 <IndexTerm><Primary><literal>putMVar</literal></Primary></IndexTerm>
1298 </Para>
1299
1300 </Sect2>
1301
1302 </Sect1>
1303
1304 <Sect1 id="glasgow-ST-monad">
1305 <Title>Primitive state-transformer monad
1306 </Title>
1307
1308 <Para>
1309 <IndexTerm><Primary>state transformers (Glasgow extensions)</Primary></IndexTerm>
1310 <IndexTerm><Primary>ST monad (Glasgow extension)</Primary></IndexTerm>
1311 </Para>
1312
1313 <Para>
1314 This monad underlies our implementation of arrays, mutable and
1315 immutable, and our implementation of I/O, including &ldquo;C calls&rdquo;.
1316 </Para>
1317
1318 <Para>
1319 The <Literal>ST</Literal> library, which provides access to the
1320 <Function>ST</Function> monad, is described in <xref
1321 linkend="sec-ST">.
1322 </Para>
1323
1324 </Sect1>
1325
1326 <Sect1 id="glasgow-prim-arrays">
1327 <Title>Primitive arrays, mutable and otherwise
1328 </Title>
1329
1330 <Para>
1331 <IndexTerm><Primary>primitive arrays (Glasgow extension)</Primary></IndexTerm>
1332 <IndexTerm><Primary>arrays, primitive (Glasgow extension)</Primary></IndexTerm>
1333 </Para>
1334
1335 <Para>
1336 GHC knows about quite a few flavours of Large Swathes of Bytes.
1337 </Para>
1338
1339 <Para>
1340 First, GHC distinguishes between primitive arrays of (boxed) Haskell
1341 objects (type <Literal>Array&num; obj</Literal>) and primitive arrays of bytes (type
1342 <Literal>ByteArray&num;</Literal>).
1343 </Para>
1344
1345 <Para>
1346 Second, it distinguishes between&hellip;
1347 <VariableList>
1348
1349 <VarListEntry>
1350 <Term>Immutable:</Term>
1351 <ListItem>
1352 <Para>
1353 Arrays that do not change (as with &ldquo;standard&rdquo; Haskell arrays); you
1354 can only read from them.  Obviously, they do not need the care and
1355 attention of the state-transformer monad.
1356 </Para>
1357 </ListItem>
1358 </VarListEntry>
1359 <VarListEntry>
1360 <Term>Mutable:</Term>
1361 <ListItem>
1362 <Para>
1363 Arrays that may be changed or &ldquo;mutated.&rdquo;  All the operations on them
1364 live within the state-transformer monad and the updates happen
1365 <Emphasis>in-place</Emphasis>.
1366 </Para>
1367 </ListItem>
1368 </VarListEntry>
1369 <VarListEntry>
1370 <Term>&ldquo;Static&rdquo; (in C land):</Term>
1371 <ListItem>
1372 <Para>
1373 A C routine may pass an <Literal>Addr&num;</Literal> pointer back into Haskell land.  There
1374 are then primitive operations with which you may merrily grab values
1375 over in C land, by indexing off the &ldquo;static&rdquo; pointer.
1376 </Para>
1377 </ListItem>
1378 </VarListEntry>
1379 <VarListEntry>
1380 <Term>&ldquo;Stable&rdquo; pointers:</Term>
1381 <ListItem>
1382 <Para>
1383 If, for some reason, you wish to hand a Haskell pointer (i.e.,
1384 <Emphasis>not</Emphasis> an unboxed value) to a C routine, you first make the
1385 pointer &ldquo;stable,&rdquo; so that the garbage collector won't forget that it
1386 exists.  That is, GHC provides a safe way to pass Haskell pointers to
1387 C.
1388 </Para>
1389
1390 <Para>
1391 Please see <XRef LinkEnd="sec-stable-pointers"> for more details.
1392 </Para>
1393 </ListItem>
1394 </VarListEntry>
1395 <VarListEntry>
1396 <Term>&ldquo;Foreign objects&rdquo;:</Term>
1397 <ListItem>
1398 <Para>
1399 A &ldquo;foreign object&rdquo; is a safe way to pass an external object (a
1400 C-allocated pointer, say) to Haskell and have Haskell do the Right
1401 Thing when it no longer references the object.  So, for example, C
1402 could pass a large bitmap over to Haskell and say &ldquo;please free this
1403 memory when you're done with it.&rdquo;
1404 </Para>
1405
1406 <Para>
1407 Please see <XRef LinkEnd="sec-ForeignObj"> for more details.
1408 </Para>
1409 </ListItem>
1410 </VarListEntry>
1411 </VariableList>
1412 </Para>
1413
1414 <Para>
1415 The libraries documentatation gives more details on all these
1416 &ldquo;primitive array&rdquo; types and the operations on them.
1417 </Para>
1418
1419 </Sect1>
1420
1421
1422 <Sect1 id="pattern-guards">
1423 <Title>Pattern guards</Title>
1424
1425 <Para>
1426 <IndexTerm><Primary>Pattern guards (Glasgow extension)</Primary></IndexTerm>
1427 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ULink URL="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ULink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
1428 </Para>
1429
1430 <Para>
1431 Suppose we have an abstract data type of finite maps, with a
1432 lookup operation:
1433
1434 <ProgramListing>
1435 lookup :: FiniteMap -> Int -> Maybe Int
1436 </ProgramListing>
1437
1438 The lookup returns <Function>Nothing</Function> if the supplied key is not in the domain of the mapping, and <Function>(Just v)</Function> otherwise,
1439 where <VarName>v</VarName> is the value that the key maps to.  Now consider the following definition:
1440 </Para>
1441
1442 <ProgramListing>
1443 clunky env var1 var2 | ok1 && ok2 = val1 + val2
1444 | otherwise  = var1 + var2
1445 where
1446   m1 = lookup env var1
1447   m2 = lookup env var2
1448   ok1 = maybeToBool m1
1449   ok2 = maybeToBool m2
1450   val1 = expectJust m1
1451   val2 = expectJust m2
1452 </ProgramListing>
1453
1454 <Para>
1455 The auxiliary functions are
1456 </Para>
1457
1458 <ProgramListing>
1459 maybeToBool :: Maybe a -&gt; Bool
1460 maybeToBool (Just x) = True
1461 maybeToBool Nothing  = False
1462
1463 expectJust :: Maybe a -&gt; a
1464 expectJust (Just x) = x
1465 expectJust Nothing  = error "Unexpected Nothing"
1466 </ProgramListing>
1467
1468 <Para>
1469 What is <Function>clunky</Function> doing? The guard <Literal>ok1 &&
1470 ok2</Literal> checks that both lookups succeed, using
1471 <Function>maybeToBool</Function> to convert the <Function>Maybe</Function>
1472 types to booleans. The (lazily evaluated) <Function>expectJust</Function>
1473 calls extract the values from the results of the lookups, and binds the
1474 returned values to <VarName>val1</VarName> and <VarName>val2</VarName>
1475 respectively.  If either lookup fails, then clunky takes the
1476 <Literal>otherwise</Literal> case and returns the sum of its arguments.
1477 </Para>
1478
1479 <Para>
1480 This is certainly legal Haskell, but it is a tremendously verbose and
1481 un-obvious way to achieve the desired effect.  Arguably, a more direct way
1482 to write clunky would be to use case expressions:
1483 </Para>
1484
1485 <ProgramListing>
1486 clunky env var1 var1 = case lookup env var1 of
1487   Nothing -&gt; fail
1488   Just val1 -&gt; case lookup env var2 of
1489     Nothing -&gt; fail
1490     Just val2 -&gt; val1 + val2
1491 where
1492   fail = val1 + val2
1493 </ProgramListing>
1494
1495 <Para>
1496 This is a bit shorter, but hardly better.  Of course, we can rewrite any set
1497 of pattern-matching, guarded equations as case expressions; that is
1498 precisely what the compiler does when compiling equations! The reason that
1499 Haskell provides guarded equations is because they allow us to write down
1500 the cases we want to consider, one at a time, independently of each other.
1501 This structure is hidden in the case version.  Two of the right-hand sides
1502 are really the same (<Function>fail</Function>), and the whole expression
1503 tends to become more and more indented.
1504 </Para>
1505
1506 <Para>
1507 Here is how I would write clunky:
1508 </Para>
1509
1510 <ProgramListing>
1511 clunky env var1 var1
1512   | Just val1 &lt;- lookup env var1
1513   , Just val2 &lt;- lookup env var2
1514   = val1 + val2
1515 ...other equations for clunky...
1516 </ProgramListing>
1517
1518 <Para>
1519 The semantics should be clear enough.  The qualifers are matched in order.
1520 For a <Literal>&lt;-</Literal> qualifier, which I call a pattern guard, the
1521 right hand side is evaluated and matched against the pattern on the left.
1522 If the match fails then the whole guard fails and the next equation is
1523 tried.  If it succeeds, then the appropriate binding takes place, and the
1524 next qualifier is matched, in the augmented environment.  Unlike list
1525 comprehensions, however, the type of the expression to the right of the
1526 <Literal>&lt;-</Literal> is the same as the type of the pattern to its
1527 left.  The bindings introduced by pattern guards scope over all the
1528 remaining guard qualifiers, and over the right hand side of the equation.
1529 </Para>
1530
1531 <Para>
1532 Just as with list comprehensions, boolean expressions can be freely mixed
1533 with among the pattern guards.  For example:
1534 </Para>
1535
1536 <ProgramListing>
1537 f x | [y] <- x
1538     , y > 3
1539     , Just z <- h y
1540     = ...
1541 </ProgramListing>
1542
1543 <Para>
1544 Haskell's current guards therefore emerge as a special case, in which the
1545 qualifier list has just one element, a boolean expression.
1546 </Para>
1547 </Sect1>
1548
1549   <sect1 id="sec-ffi">
1550     <title>The foreign interface</title>
1551
1552     <para>The foreign interface consists of the following components:</para>
1553
1554     <itemizedlist>
1555       <listitem>
1556         <para>The Foreign Function Interface language specification
1557         (included in this manual, in <xref linkend="ffi">).</para>
1558       </listitem>
1559
1560       <listitem>
1561         <para>The <literal>Foreign</literal> module (see <xref
1562         linkend="sec-Foreign">) collects together several interfaces
1563         which are useful in specifying foreign language
1564         interfaces, including the following:</para>
1565
1566         <itemizedlist>
1567           <listitem>
1568             <para>The <literal>ForeignObj</literal> module (see <xref
1569             linkend="sec-ForeignObj">), for managing pointers from
1570             Haskell into the outside world.</para>
1571           </listitem>
1572
1573           <listitem>
1574             <para>The <literal>StablePtr</literal> module (see <xref
1575             linkend="sec-stable-pointers">), for managing pointers
1576             into Haskell from the outside world.</para>
1577           </listitem>
1578
1579           <listitem>
1580             <para>The <literal>CTypes</literal> module (see <xref
1581             linkend="sec-CTypes">) gives Haskell equivalents for the
1582             standard C datatypes, for use in making Haskell bindings
1583             to existing C libraries.</para>
1584           </listitem>
1585
1586           <listitem>
1587             <para>The <literal>CTypesISO</literal> module (see <xref
1588             linkend="sec-CTypesISO">) gives Haskell equivalents for C
1589             types defined by the ISO C standard.</para>
1590           </listitem>
1591
1592           <listitem>
1593             <para>The <literal>Storable</literal> library, for
1594             primitive marshalling of data types between Haskell and
1595             the foreign language.</para>
1596           </listitem>
1597         </itemizedlist>
1598
1599       </listitem>
1600     </itemizedlist>
1601
1602 <para>The following sections also give some hints and tips on the use
1603 of the foreign function interface in GHC.</para>
1604
1605 <Sect2 id="glasgow-foreign-headers">
1606 <Title>Using function headers
1607 </Title>
1608
1609 <Para>
1610 <IndexTerm><Primary>C calls, function headers</Primary></IndexTerm>
1611 </Para>
1612
1613 <Para>
1614 When generating C (using the <Option>-fvia-C</Option> directive), one can assist the
1615 C compiler in detecting type errors by using the <Command>-&num;include</Command> directive
1616 to provide <Filename>.h</Filename> files containing function headers.
1617 </Para>
1618
1619 <Para>
1620 For example,
1621 </Para>
1622
1623 <Para>
1624
1625 <ProgramListing>
1626 #include "HsFFI.h"
1627
1628 void         initialiseEFS (HsInt size);
1629 HsInt        terminateEFS (void);
1630 HsForeignObj emptyEFS(void);
1631 HsForeignObj updateEFS (HsForeignObj a, HsInt i, HsInt x);
1632 HsInt        lookupEFS (HsForeignObj a, HsInt i);
1633 </ProgramListing>
1634 </Para>
1635
1636       <para>The types <literal>HsInt</literal>,
1637       <literal>HsForeignObj</literal> etc. are described in <xref
1638       linkend="sec-mapping-table">.</Para>
1639
1640       <Para>Note that this approach is only
1641       <Emphasis>essential</Emphasis> for returning
1642       <Literal>float</Literal>s (or if <Literal>sizeof(int) !=
1643       sizeof(int *)</Literal> on your architecture) but is a Good
1644       Thing for anyone who cares about writing solid code.  You're
1645       crazy not to do it.</Para>
1646
1647 </Sect2>
1648
1649 </Sect1>
1650
1651 <Sect1 id="multi-param-type-classes">
1652 <Title>Multi-parameter type classes
1653 </Title>
1654
1655 <Para>
1656 This section documents GHC's implementation of multi-parameter type
1657 classes.  There's lots of background in the paper <ULink
1658 URL="http://research.microsoft.com/~simonpj/multi.ps.gz" >Type
1659 classes: exploring the design space</ULink > (Simon Peyton Jones, Mark
1660 Jones, Erik Meijer).
1661 </Para>
1662
1663 <Para>
1664 I'd like to thank people who reported shorcomings in the GHC 3.02
1665 implementation.  Our default decisions were all conservative ones, and
1666 the experience of these heroic pioneers has given useful concrete
1667 examples to support several generalisations.  (These appear below as
1668 design choices not implemented in 3.02.)
1669 </Para>
1670
1671 <Para>
1672 I've discussed these notes with Mark Jones, and I believe that Hugs
1673 will migrate towards the same design choices as I outline here.
1674 Thanks to him, and to many others who have offered very useful
1675 feedback.
1676 </Para>
1677
1678 <Sect2>
1679 <Title>Types</Title>
1680
1681 <Para>
1682 There are the following restrictions on the form of a qualified
1683 type:
1684 </Para>
1685
1686 <Para>
1687
1688 <ProgramListing>
1689   forall tv1..tvn (c1, ...,cn) => type
1690 </ProgramListing>
1691
1692 </Para>
1693
1694 <Para>
1695 (Here, I write the "foralls" explicitly, although the Haskell source
1696 language omits them; in Haskell 1.4, all the free type variables of an
1697 explicit source-language type signature are universally quantified,
1698 except for the class type variables in a class declaration.  However,
1699 in GHC, you can give the foralls if you want.  See <XRef LinkEnd="universal-quantification">).
1700 </Para>
1701
1702 <Para>
1703
1704 <OrderedList>
1705 <ListItem>
1706
1707 <Para>
1708  <Emphasis>Each universally quantified type variable
1709 <Literal>tvi</Literal> must be mentioned (i.e. appear free) in <Literal>type</Literal></Emphasis>.
1710
1711 The reason for this is that a value with a type that does not obey
1712 this restriction could not be used without introducing
1713 ambiguity. Here, for example, is an illegal type:
1714
1715
1716 <ProgramListing>
1717   forall a. Eq a => Int
1718 </ProgramListing>
1719
1720
1721 When a value with this type was used, the constraint <Literal>Eq tv</Literal>
1722 would be introduced where <Literal>tv</Literal> is a fresh type variable, and
1723 (in the dictionary-translation implementation) the value would be
1724 applied to a dictionary for <Literal>Eq tv</Literal>.  The difficulty is that we
1725 can never know which instance of <Literal>Eq</Literal> to use because we never
1726 get any more information about <Literal>tv</Literal>.
1727
1728 </Para>
1729 </ListItem>
1730 <ListItem>
1731
1732 <Para>
1733  <Emphasis>Every constraint <Literal>ci</Literal> must mention at least one of the
1734 universally quantified type variables <Literal>tvi</Literal></Emphasis>.
1735
1736 For example, this type is OK because <Literal>C a b</Literal> mentions the
1737 universally quantified type variable <Literal>b</Literal>:
1738
1739
1740 <ProgramListing>
1741   forall a. C a b => burble
1742 </ProgramListing>
1743
1744
1745 The next type is illegal because the constraint <Literal>Eq b</Literal> does not
1746 mention <Literal>a</Literal>:
1747
1748
1749 <ProgramListing>
1750   forall a. Eq b => burble
1751 </ProgramListing>
1752
1753
1754 The reason for this restriction is milder than the other one.  The
1755 excluded types are never useful or necessary (because the offending
1756 context doesn't need to be witnessed at this point; it can be floated
1757 out).  Furthermore, floating them out increases sharing. Lastly,
1758 excluding them is a conservative choice; it leaves a patch of
1759 territory free in case we need it later.
1760
1761 </Para>
1762 </ListItem>
1763
1764 </OrderedList>
1765
1766 </Para>
1767
1768 <Para>
1769 These restrictions apply to all types, whether declared in a type signature
1770 or inferred.
1771 </Para>
1772
1773 <Para>
1774 Unlike Haskell 1.4, constraints in types do <Emphasis>not</Emphasis> have to be of
1775 the form <Emphasis>(class type-variables)</Emphasis>.  Thus, these type signatures
1776 are perfectly OK
1777 </Para>
1778
1779 <Para>
1780
1781 <ProgramListing>
1782   f :: Eq (m a) => [m a] -> [m a]
1783   g :: Eq [a] => ...
1784 </ProgramListing>
1785
1786 </Para>
1787
1788 <Para>
1789 This choice recovers principal types, a property that Haskell 1.4 does not have.
1790 </Para>
1791
1792 </Sect2>
1793
1794 <Sect2>
1795 <Title>Class declarations</Title>
1796
1797 <Para>
1798
1799 <OrderedList>
1800 <ListItem>
1801
1802 <Para>
1803  <Emphasis>Multi-parameter type classes are permitted</Emphasis>. For example:
1804
1805
1806 <ProgramListing>
1807   class Collection c a where
1808     union :: c a -> c a -> c a
1809     ...etc.
1810 </ProgramListing>
1811
1812
1813
1814 </Para>
1815 </ListItem>
1816 <ListItem>
1817
1818 <Para>
1819  <Emphasis>The class hierarchy must be acyclic</Emphasis>.  However, the definition
1820 of "acyclic" involves only the superclass relationships.  For example,
1821 this is OK:
1822
1823
1824 <ProgramListing>
1825   class C a where {
1826     op :: D b => a -> b -> b
1827   }
1828
1829   class C a => D a where { ... }
1830 </ProgramListing>
1831
1832
1833 Here, <Literal>C</Literal> is a superclass of <Literal>D</Literal>, but it's OK for a
1834 class operation <Literal>op</Literal> of <Literal>C</Literal> to mention <Literal>D</Literal>.  (It
1835 would not be OK for <Literal>D</Literal> to be a superclass of <Literal>C</Literal>.)
1836
1837 </Para>
1838 </ListItem>
1839 <ListItem>
1840
1841 <Para>
1842  <Emphasis>There are no restrictions on the context in a class declaration
1843 (which introduces superclasses), except that the class hierarchy must
1844 be acyclic</Emphasis>.  So these class declarations are OK:
1845
1846
1847 <ProgramListing>
1848   class Functor (m k) => FiniteMap m k where
1849     ...
1850
1851   class (Monad m, Monad (t m)) => Transform t m where
1852     lift :: m a -> (t m) a
1853 </ProgramListing>
1854
1855
1856 </Para>
1857 </ListItem>
1858 <ListItem>
1859
1860 <Para>
1861  <Emphasis>In the signature of a class operation, every constraint
1862 must mention at least one type variable that is not a class type
1863 variable</Emphasis>.
1864
1865 Thus:
1866
1867
1868 <ProgramListing>
1869   class Collection c a where
1870     mapC :: Collection c b => (a->b) -> c a -> c b
1871 </ProgramListing>
1872
1873
1874 is OK because the constraint <Literal>(Collection a b)</Literal> mentions
1875 <Literal>b</Literal>, even though it also mentions the class variable
1876 <Literal>a</Literal>.  On the other hand:
1877
1878
1879 <ProgramListing>
1880   class C a where
1881     op :: Eq a => (a,b) -> (a,b)
1882 </ProgramListing>
1883
1884
1885 is not OK because the constraint <Literal>(Eq a)</Literal> mentions on the class
1886 type variable <Literal>a</Literal>, but not <Literal>b</Literal>.  However, any such
1887 example is easily fixed by moving the offending context up to the
1888 superclass context:
1889
1890
1891 <ProgramListing>
1892   class Eq a => C a where
1893     op ::(a,b) -> (a,b)
1894 </ProgramListing>
1895
1896
1897 A yet more relaxed rule would allow the context of a class-op signature
1898 to mention only class type variables.  However, that conflicts with
1899 Rule 1(b) for types above.
1900
1901 </Para>
1902 </ListItem>
1903 <ListItem>
1904
1905 <Para>
1906  <Emphasis>The type of each class operation must mention <Emphasis>all</Emphasis> of
1907 the class type variables</Emphasis>.  For example:
1908
1909
1910 <ProgramListing>
1911   class Coll s a where
1912     empty  :: s
1913     insert :: s -> a -> s
1914 </ProgramListing>
1915
1916
1917 is not OK, because the type of <Literal>empty</Literal> doesn't mention
1918 <Literal>a</Literal>.  This rule is a consequence of Rule 1(a), above, for
1919 types, and has the same motivation.
1920
1921 Sometimes, offending class declarations exhibit misunderstandings.  For
1922 example, <Literal>Coll</Literal> might be rewritten
1923
1924
1925 <ProgramListing>
1926   class Coll s a where
1927     empty  :: s a
1928     insert :: s a -> a -> s a
1929 </ProgramListing>
1930
1931
1932 which makes the connection between the type of a collection of
1933 <Literal>a</Literal>'s (namely <Literal>(s a)</Literal>) and the element type <Literal>a</Literal>.
1934 Occasionally this really doesn't work, in which case you can split the
1935 class like this:
1936
1937
1938 <ProgramListing>
1939   class CollE s where
1940     empty  :: s
1941
1942   class CollE s => Coll s a where
1943     insert :: s -> a -> s
1944 </ProgramListing>
1945
1946
1947 </Para>
1948 </ListItem>
1949
1950 </OrderedList>
1951
1952 </Para>
1953
1954 </Sect2>
1955
1956 <Sect2 id="instance-decls">
1957 <Title>Instance declarations</Title>
1958
1959 <Para>
1960
1961 <OrderedList>
1962 <ListItem>
1963
1964 <Para>
1965  <Emphasis>Instance declarations may not overlap</Emphasis>.  The two instance
1966 declarations
1967
1968
1969 <ProgramListing>
1970   instance context1 => C type1 where ...
1971   instance context2 => C type2 where ...
1972 </ProgramListing>
1973
1974
1975 "overlap" if <Literal>type1</Literal> and <Literal>type2</Literal> unify
1976
1977 However, if you give the command line option
1978 <Option>-fallow-overlapping-instances</Option><IndexTerm><Primary>-fallow-overlapping-instances
1979 option</Primary></IndexTerm> then two overlapping instance declarations are permitted
1980 iff
1981
1982
1983 <ItemizedList>
1984 <ListItem>
1985
1986 <Para>
1987  EITHER <Literal>type1</Literal> and <Literal>type2</Literal> do not unify
1988 </Para>
1989 </ListItem>
1990 <ListItem>
1991
1992 <Para>
1993  OR <Literal>type2</Literal> is a substitution instance of <Literal>type1</Literal>
1994 (but not identical to <Literal>type1</Literal>)
1995 </Para>
1996 </ListItem>
1997 <ListItem>
1998
1999 <Para>
2000  OR vice versa
2001 </Para>
2002 </ListItem>
2003
2004 </ItemizedList>
2005
2006
2007 Notice that these rules
2008
2009
2010 <ItemizedList>
2011 <ListItem>
2012
2013 <Para>
2014  make it clear which instance decl to use
2015 (pick the most specific one that matches)
2016
2017 </Para>
2018 </ListItem>
2019 <ListItem>
2020
2021 <Para>
2022  do not mention the contexts <Literal>context1</Literal>, <Literal>context2</Literal>
2023 Reason: you can pick which instance decl
2024 "matches" based on the type.
2025 </Para>
2026 </ListItem>
2027
2028 </ItemizedList>
2029
2030
2031 Regrettably, GHC doesn't guarantee to detect overlapping instance
2032 declarations if they appear in different modules.  GHC can "see" the
2033 instance declarations in the transitive closure of all the modules
2034 imported by the one being compiled, so it can "see" all instance decls
2035 when it is compiling <Literal>Main</Literal>.  However, it currently chooses not
2036 to look at ones that can't possibly be of use in the module currently
2037 being compiled, in the interests of efficiency.  (Perhaps we should
2038 change that decision, at least for <Literal>Main</Literal>.)
2039
2040 </Para>
2041 </ListItem>
2042 <ListItem>
2043
2044 <Para>
2045  <Emphasis>There are no restrictions on the type in an instance
2046 <Emphasis>head</Emphasis>, except that at least one must not be a type variable</Emphasis>.
2047 The instance "head" is the bit after the "=>" in an instance decl. For
2048 example, these are OK:
2049
2050
2051 <ProgramListing>
2052   instance C Int a where ...
2053
2054   instance D (Int, Int) where ...
2055
2056   instance E [[a]] where ...
2057 </ProgramListing>
2058
2059
2060 Note that instance heads <Emphasis>may</Emphasis> contain repeated type variables.
2061 For example, this is OK:
2062
2063
2064 <ProgramListing>
2065   instance Stateful (ST s) (MutVar s) where ...
2066 </ProgramListing>
2067
2068
2069 The "at least one not a type variable" restriction is to ensure that
2070 context reduction terminates: each reduction step removes one type
2071 constructor.  For example, the following would make the type checker
2072 loop if it wasn't excluded:
2073
2074
2075 <ProgramListing>
2076   instance C a => C a where ...
2077 </ProgramListing>
2078
2079
2080 There are two situations in which the rule is a bit of a pain. First,
2081 if one allows overlapping instance declarations then it's quite
2082 convenient to have a "default instance" declaration that applies if
2083 something more specific does not:
2084
2085
2086 <ProgramListing>
2087   instance C a where
2088     op = ... -- Default
2089 </ProgramListing>
2090
2091
2092 Second, sometimes you might want to use the following to get the
2093 effect of a "class synonym":
2094
2095
2096 <ProgramListing>
2097   class (C1 a, C2 a, C3 a) => C a where { }
2098
2099   instance (C1 a, C2 a, C3 a) => C a where { }
2100 </ProgramListing>
2101
2102
2103 This allows you to write shorter signatures:
2104
2105
2106 <ProgramListing>
2107   f :: C a => ...
2108 </ProgramListing>
2109
2110
2111 instead of
2112
2113
2114 <ProgramListing>
2115   f :: (C1 a, C2 a, C3 a) => ...
2116 </ProgramListing>
2117
2118
2119 I'm on the lookout for a simple rule that preserves decidability while
2120 allowing these idioms.  The experimental flag
2121 <Option>-fallow-undecidable-instances</Option><IndexTerm><Primary>-fallow-undecidable-instances
2122 option</Primary></IndexTerm> lifts this restriction, allowing all the types in an
2123 instance head to be type variables.
2124
2125 </Para>
2126 </ListItem>
2127 <ListItem>
2128
2129 <Para>
2130  <Emphasis>Unlike Haskell 1.4, instance heads may use type
2131 synonyms</Emphasis>.  As always, using a type synonym is just shorthand for
2132 writing the RHS of the type synonym definition.  For example:
2133
2134
2135 <ProgramListing>
2136   type Point = (Int,Int)
2137   instance C Point   where ...
2138   instance C [Point] where ...
2139 </ProgramListing>
2140
2141
2142 is legal.  However, if you added
2143
2144
2145 <ProgramListing>
2146   instance C (Int,Int) where ...
2147 </ProgramListing>
2148
2149
2150 as well, then the compiler will complain about the overlapping
2151 (actually, identical) instance declarations.  As always, type synonyms
2152 must be fully applied.  You cannot, for example, write:
2153
2154
2155 <ProgramListing>
2156   type P a = [[a]]
2157   instance Monad P where ...
2158 </ProgramListing>
2159
2160
2161 This design decision is independent of all the others, and easily
2162 reversed, but it makes sense to me.
2163
2164 </Para>
2165 </ListItem>
2166 <ListItem>
2167
2168 <Para>
2169 <Emphasis>The types in an instance-declaration <Emphasis>context</Emphasis> must all
2170 be type variables</Emphasis>. Thus
2171
2172
2173 <ProgramListing>
2174 instance C a b => Eq (a,b) where ...
2175 </ProgramListing>
2176
2177
2178 is OK, but
2179
2180
2181 <ProgramListing>
2182 instance C Int b => Foo b where ...
2183 </ProgramListing>
2184
2185
2186 is not OK.  Again, the intent here is to make sure that context
2187 reduction terminates.
2188
2189 Voluminous correspondence on the Haskell mailing list has convinced me
2190 that it's worth experimenting with a more liberal rule.  If you use
2191 the flag <Option>-fallow-undecidable-instances</Option> can use arbitrary
2192 types in an instance context.  Termination is ensured by having a
2193 fixed-depth recursion stack.  If you exceed the stack depth you get a
2194 sort of backtrace, and the opportunity to increase the stack depth
2195 with <Option>-fcontext-stack</Option><Emphasis>N</Emphasis>.
2196
2197 </Para>
2198 </ListItem>
2199
2200 </OrderedList>
2201
2202 </Para>
2203
2204 </Sect2>
2205
2206 </Sect1>
2207
2208 <Sect1 id="universal-quantification">
2209 <Title>Explicit universal quantification
2210 </Title>
2211
2212 <Para>
2213 GHC now allows you to write explicitly quantified types.  GHC's
2214 syntax for this now agrees with Hugs's, namely:
2215 </Para>
2216
2217 <Para>
2218
2219 <ProgramListing>
2220         forall a b. (Ord a, Eq  b) => a -> b -> a
2221 </ProgramListing>
2222
2223 </Para>
2224
2225 <Para>
2226 The context is, of course, optional.  You can't use <Literal>forall</Literal> as
2227 a type variable any more!
2228 </Para>
2229
2230 <Para>
2231 Haskell type signatures are implicitly quantified.  The <Literal>forall</Literal>
2232 allows us to say exactly what this means.  For example:
2233 </Para>
2234
2235 <Para>
2236
2237 <ProgramListing>
2238         g :: b -> b
2239 </ProgramListing>
2240
2241 </Para>
2242
2243 <Para>
2244 means this:
2245 </Para>
2246
2247 <Para>
2248
2249 <ProgramListing>
2250         g :: forall b. (b -> b)
2251 </ProgramListing>
2252
2253 </Para>
2254
2255 <Para>
2256 The two are treated identically.
2257 </Para>
2258
2259 <Sect2 id="univ">
2260 <Title>Universally-quantified data type fields
2261 </Title>
2262
2263 <Para>
2264 In a <Literal>data</Literal> or <Literal>newtype</Literal> declaration one can quantify
2265 the types of the constructor arguments.  Here are several examples:
2266 </Para>
2267
2268 <Para>
2269
2270 <ProgramListing>
2271 data T a = T1 (forall b. b -> b -> b) a
2272
2273 data MonadT m = MkMonad { return :: forall a. a -> m a,
2274                           bind   :: forall a b. m a -> (a -> m b) -> m b
2275                         }
2276
2277 newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
2278 </ProgramListing>
2279
2280 </Para>
2281
2282 <Para>
2283 The constructors now have so-called <Emphasis>rank 2</Emphasis> polymorphic
2284 types, in which there is a for-all in the argument types.:
2285 </Para>
2286
2287 <Para>
2288
2289 <ProgramListing>
2290 T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
2291 MkMonad :: forall m. (forall a. a -> m a)
2292                   -> (forall a b. m a -> (a -> m b) -> m b)
2293                   -> MonadT m
2294 MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
2295 </ProgramListing>
2296
2297 </Para>
2298
2299 <Para>
2300 Notice that you don't need to use a <Literal>forall</Literal> if there's an
2301 explicit context.  For example in the first argument of the
2302 constructor <Function>MkSwizzle</Function>, an implicit "<Literal>forall a.</Literal>" is
2303 prefixed to the argument type.  The implicit <Literal>forall</Literal>
2304 quantifies all type variables that are not already in scope, and are
2305 mentioned in the type quantified over.
2306 </Para>
2307
2308 <Para>
2309 As for type signatures, implicit quantification happens for non-overloaded
2310 types too.  So if you write this:
2311
2312 <ProgramListing>
2313   data T a = MkT (Either a b) (b -> b)
2314 </ProgramListing>
2315
2316 it's just as if you had written this:
2317
2318 <ProgramListing>
2319   data T a = MkT (forall b. Either a b) (forall b. b -> b)
2320 </ProgramListing>
2321
2322 That is, since the type variable <Literal>b</Literal> isn't in scope, it's
2323 implicitly universally quantified.  (Arguably, it would be better
2324 to <Emphasis>require</Emphasis> explicit quantification on constructor arguments
2325 where that is what is wanted.  Feedback welcomed.)
2326 </Para>
2327
2328 </Sect2>
2329
2330 <Sect2>
2331 <Title>Construction </Title>
2332
2333 <Para>
2334 You construct values of types <Literal>T1, MonadT, Swizzle</Literal> by applying
2335 the constructor to suitable values, just as usual.  For example,
2336 </Para>
2337
2338 <Para>
2339
2340 <ProgramListing>
2341 (T1 (\xy->x) 3) :: T Int
2342
2343 (MkSwizzle sort)    :: Swizzle
2344 (MkSwizzle reverse) :: Swizzle
2345
2346 (let r x = Just x
2347      b m k = case m of
2348                 Just y -> k y
2349                 Nothing -> Nothing
2350   in
2351   MkMonad r b) :: MonadT Maybe
2352 </ProgramListing>
2353
2354 </Para>
2355
2356 <Para>
2357 The type of the argument can, as usual, be more general than the type
2358 required, as <Literal>(MkSwizzle reverse)</Literal> shows.  (<Function>reverse</Function>
2359 does not need the <Literal>Ord</Literal> constraint.)
2360 </Para>
2361
2362 </Sect2>
2363
2364 <Sect2>
2365 <Title>Pattern matching</Title>
2366
2367 <Para>
2368 When you use pattern matching, the bound variables may now have
2369 polymorphic types.  For example:
2370 </Para>
2371
2372 <Para>
2373
2374 <ProgramListing>
2375         f :: T a -> a -> (a, Char)
2376         f (T1 f k) x = (f k x, f 'c' 'd')
2377
2378         g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
2379         g (MkSwizzle s) xs f = s (map f (s xs))
2380
2381         h :: MonadT m -> [m a] -> m [a]
2382         h m [] = return m []
2383         h m (x:xs) = bind m x           $ \y ->
2384                       bind m (h m xs)   $ \ys ->
2385                       return m (y:ys)
2386 </ProgramListing>
2387
2388 </Para>
2389
2390 <Para>
2391 In the function <Function>h</Function> we use the record selectors <Literal>return</Literal>
2392 and <Literal>bind</Literal> to extract the polymorphic bind and return functions
2393 from the <Literal>MonadT</Literal> data structure, rather than using pattern
2394 matching.
2395 </Para>
2396
2397 <Para>
2398 You cannot pattern-match against an argument that is polymorphic.
2399 For example:
2400
2401 <ProgramListing>
2402         newtype TIM s a = TIM (ST s (Maybe a))
2403
2404         runTIM :: (forall s. TIM s a) -> Maybe a
2405         runTIM (TIM m) = runST m
2406 </ProgramListing>
2407
2408 </Para>
2409
2410 <Para>
2411 Here the pattern-match fails, because you can't pattern-match against
2412 an argument of type <Literal>(forall s. TIM s a)</Literal>.  Instead you
2413 must bind the variable and pattern match in the right hand side:
2414
2415 <ProgramListing>
2416         runTIM :: (forall s. TIM s a) -> Maybe a
2417         runTIM tm = case tm of { TIM m -> runST m }
2418 </ProgramListing>
2419
2420 The <Literal>tm</Literal> on the right hand side is (invisibly) instantiated, like
2421 any polymorphic value at its occurrence site, and now you can pattern-match
2422 against it.
2423 </Para>
2424
2425 </Sect2>
2426
2427 <Sect2>
2428 <Title>The partial-application restriction</Title>
2429
2430 <Para>
2431 There is really only one way in which data structures with polymorphic
2432 components might surprise you: you must not partially apply them.
2433 For example, this is illegal:
2434 </Para>
2435
2436 <Para>
2437
2438 <ProgramListing>
2439         map MkSwizzle [sort, reverse]
2440 </ProgramListing>
2441
2442 </Para>
2443
2444 <Para>
2445 The restriction is this: <Emphasis>every subexpression of the program must
2446 have a type that has no for-alls, except that in a function
2447 application (f e1&hellip;en) the partial applications are not subject to
2448 this rule</Emphasis>.  The restriction makes type inference feasible.
2449 </Para>
2450
2451 <Para>
2452 In the illegal example, the sub-expression <Literal>MkSwizzle</Literal> has the
2453 polymorphic type <Literal>(Ord b => [b] -> [b]) -> Swizzle</Literal> and is not
2454 a sub-expression of an enclosing application.  On the other hand, this
2455 expression is OK:
2456 </Para>
2457
2458 <Para>
2459
2460 <ProgramListing>
2461         map (T1 (\a b -> a)) [1,2,3]
2462 </ProgramListing>
2463
2464 </Para>
2465
2466 <Para>
2467 even though it involves a partial application of <Function>T1</Function>, because
2468 the sub-expression <Literal>T1 (\a b -> a)</Literal> has type <Literal>Int -> T
2469 Int</Literal>.
2470 </Para>
2471
2472 </Sect2>
2473
2474 <Sect2 id="sigs">
2475 <Title>Type signatures
2476 </Title>
2477
2478 <Para>
2479 Once you have data constructors with universally-quantified fields, or
2480 constants such as <Constant>runST</Constant> that have rank-2 types, it isn't long
2481 before you discover that you need more!  Consider:
2482 </Para>
2483
2484 <Para>
2485
2486 <ProgramListing>
2487   mkTs f x y = [T1 f x, T1 f y]
2488 </ProgramListing>
2489
2490 </Para>
2491
2492 <Para>
2493 <Function>mkTs</Function> is a fuction that constructs some values of type
2494 <Literal>T</Literal>, using some pieces passed to it.  The trouble is that since
2495 <Literal>f</Literal> is a function argument, Haskell assumes that it is
2496 monomorphic, so we'll get a type error when applying <Function>T1</Function> to
2497 it.  This is a rather silly example, but the problem really bites in
2498 practice.  Lots of people trip over the fact that you can't make
2499 "wrappers functions" for <Constant>runST</Constant> for exactly the same reason.
2500 In short, it is impossible to build abstractions around functions with
2501 rank-2 types.
2502 </Para>
2503
2504 <Para>
2505 The solution is fairly clear.  We provide the ability to give a rank-2
2506 type signature for <Emphasis>ordinary</Emphasis> functions (not only data
2507 constructors), thus:
2508 </Para>
2509
2510 <Para>
2511
2512 <ProgramListing>
2513   mkTs :: (forall b. b -> b -> b) -> a -> [T a]
2514   mkTs f x y = [T1 f x, T1 f y]
2515 </ProgramListing>
2516
2517 </Para>
2518
2519 <Para>
2520 This type signature tells the compiler to attribute <Literal>f</Literal> with
2521 the polymorphic type <Literal>(forall b. b -> b -> b)</Literal> when type
2522 checking the body of <Function>mkTs</Function>, so now the application of
2523 <Function>T1</Function> is fine.
2524 </Para>
2525
2526 <Para>
2527 There are two restrictions:
2528 </Para>
2529
2530 <Para>
2531
2532 <ItemizedList>
2533 <ListItem>
2534
2535 <Para>
2536  You can only define a rank 2 type, specified by the following
2537 grammar:
2538
2539
2540 <ProgramListing>
2541 rank2type ::= [forall tyvars .] [context =>] funty
2542 funty     ::= ([forall tyvars .] [context =>] ty) -> funty
2543             | ty
2544 ty        ::= ...current Haskell monotype syntax...
2545 </ProgramListing>
2546
2547
2548 Informally, the universal quantification must all be right at the beginning,
2549 or at the top level of a function argument.
2550
2551 </Para>
2552 </ListItem>
2553 <ListItem>
2554
2555 <Para>
2556  There is a restriction on the definition of a function whose
2557 type signature is a rank-2 type: the polymorphic arguments must be
2558 matched on the left hand side of the "<Literal>=</Literal>" sign.  You can't
2559 define <Function>mkTs</Function> like this:
2560
2561
2562 <ProgramListing>
2563 mkTs :: (forall b. b -> b -> b) -> a -> [T a]
2564 mkTs = \ f x y -> [T1 f x, T1 f y]
2565 </ProgramListing>
2566
2567
2568
2569 The same partial-application rule applies to ordinary functions with
2570 rank-2 types as applied to data constructors.
2571
2572 </Para>
2573 </ListItem>
2574
2575 </ItemizedList>
2576
2577 </Para>
2578
2579 </Sect2>
2580
2581
2582 <Sect2 id="hoist">
2583 <Title>Type synonyms and hoisting
2584 </Title>
2585
2586 <Para>
2587 GHC also allows you to write a <Literal>forall</Literal> in a type synonym, thus:
2588 <ProgramListing>
2589   type Discard a = forall b. a -> b -> a
2590
2591   f :: Discard a
2592   f x y = x
2593 </ProgramListing>
2594 However, it is often convenient to use these sort of synonyms at the right hand
2595 end of an arrow, thus:
2596 <ProgramListing>
2597   type Discard a = forall b. a -> b -> a
2598
2599   g :: Int -> Discard Int
2600   g x y z = x+y
2601 </ProgramListing>
2602 Simply expanding the type synonym would give
2603 <ProgramListing>
2604   g :: Int -> (forall b. Int -> b -> Int)
2605 </ProgramListing>
2606 but GHC "hoists" the <Literal>forall</Literal> to give the isomorphic type
2607 <ProgramListing>
2608   g :: forall b. Int -> Int -> b -> Int
2609 </ProgramListing>
2610 In general, the rule is this: <Emphasis>to determine the type specified by any explicit
2611 user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly
2612 performs the transformation:</Emphasis>
2613 <ProgramListing>
2614   <Emphasis>type1</Emphasis> -> forall a. <Emphasis>type2</Emphasis>
2615 ==>
2616   forall a. <Emphasis>type1</Emphasis> -> <Emphasis>type2</Emphasis>
2617 </ProgramListing>
2618 (In fact, GHC tries to retain as much synonym information as possible for use in
2619 error messages, but that is a usability issue.)  This rule applies, of course, whether
2620 or not the <Literal>forall</Literal> comes from a synonym. For example, here is another
2621 valid way to write <Literal>g</Literal>'s type signature:
2622 <ProgramListing>
2623   g :: Int -> Int -> forall b. b -> Int
2624 </ProgramListing>
2625 </Para>
2626 </Sect2>
2627
2628 </Sect1>
2629
2630 <Sect1 id="existential-quantification">
2631 <Title>Existentially quantified data constructors
2632 </Title>
2633
2634 <Para>
2635 The idea of using existential quantification in data type declarations
2636 was suggested by Laufer (I believe, thought doubtless someone will
2637 correct me), and implemented in Hope+. It's been in Lennart
2638 Augustsson's <Command>hbc</Command> Haskell compiler for several years, and
2639 proved very useful.  Here's the idea.  Consider the declaration:
2640 </Para>
2641
2642 <Para>
2643
2644 <ProgramListing>
2645   data Foo = forall a. MkFoo a (a -> Bool)
2646            | Nil
2647 </ProgramListing>
2648
2649 </Para>
2650
2651 <Para>
2652 The data type <Literal>Foo</Literal> has two constructors with types:
2653 </Para>
2654
2655 <Para>
2656
2657 <ProgramListing>
2658   MkFoo :: forall a. a -> (a -> Bool) -> Foo
2659   Nil   :: Foo
2660 </ProgramListing>
2661
2662 </Para>
2663
2664 <Para>
2665 Notice that the type variable <Literal>a</Literal> in the type of <Function>MkFoo</Function>
2666 does not appear in the data type itself, which is plain <Literal>Foo</Literal>.
2667 For example, the following expression is fine:
2668 </Para>
2669
2670 <Para>
2671
2672 <ProgramListing>
2673   [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
2674 </ProgramListing>
2675
2676 </Para>
2677
2678 <Para>
2679 Here, <Literal>(MkFoo 3 even)</Literal> packages an integer with a function
2680 <Function>even</Function> that maps an integer to <Literal>Bool</Literal>; and <Function>MkFoo 'c'
2681 isUpper</Function> packages a character with a compatible function.  These
2682 two things are each of type <Literal>Foo</Literal> and can be put in a list.
2683 </Para>
2684
2685 <Para>
2686 What can we do with a value of type <Literal>Foo</Literal>?.  In particular,
2687 what happens when we pattern-match on <Function>MkFoo</Function>?
2688 </Para>
2689
2690 <Para>
2691
2692 <ProgramListing>
2693   f (MkFoo val fn) = ???
2694 </ProgramListing>
2695
2696 </Para>
2697
2698 <Para>
2699 Since all we know about <Literal>val</Literal> and <Function>fn</Function> is that they
2700 are compatible, the only (useful) thing we can do with them is to
2701 apply <Function>fn</Function> to <Literal>val</Literal> to get a boolean.  For example:
2702 </Para>
2703
2704 <Para>
2705
2706 <ProgramListing>
2707   f :: Foo -> Bool
2708   f (MkFoo val fn) = fn val
2709 </ProgramListing>
2710
2711 </Para>
2712
2713 <Para>
2714 What this allows us to do is to package heterogenous values
2715 together with a bunch of functions that manipulate them, and then treat
2716 that collection of packages in a uniform manner.  You can express
2717 quite a bit of object-oriented-like programming this way.
2718 </Para>
2719
2720 <Sect2 id="existential">
2721 <Title>Why existential?
2722 </Title>
2723
2724 <Para>
2725 What has this to do with <Emphasis>existential</Emphasis> quantification?
2726 Simply that <Function>MkFoo</Function> has the (nearly) isomorphic type
2727 </Para>
2728
2729 <Para>
2730
2731 <ProgramListing>
2732   MkFoo :: (exists a . (a, a -> Bool)) -> Foo
2733 </ProgramListing>
2734
2735 </Para>
2736
2737 <Para>
2738 But Haskell programmers can safely think of the ordinary
2739 <Emphasis>universally</Emphasis> quantified type given above, thereby avoiding
2740 adding a new existential quantification construct.
2741 </Para>
2742
2743 </Sect2>
2744
2745 <Sect2>
2746 <Title>Type classes</Title>
2747
2748 <Para>
2749 An easy extension (implemented in <Command>hbc</Command>) is to allow
2750 arbitrary contexts before the constructor.  For example:
2751 </Para>
2752
2753 <Para>
2754
2755 <ProgramListing>
2756 data Baz = forall a. Eq a => Baz1 a a
2757          | forall b. Show b => Baz2 b (b -> b)
2758 </ProgramListing>
2759
2760 </Para>
2761
2762 <Para>
2763 The two constructors have the types you'd expect:
2764 </Para>
2765
2766 <Para>
2767
2768 <ProgramListing>
2769 Baz1 :: forall a. Eq a => a -> a -> Baz
2770 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
2771 </ProgramListing>
2772
2773 </Para>
2774
2775 <Para>
2776 But when pattern matching on <Function>Baz1</Function> the matched values can be compared
2777 for equality, and when pattern matching on <Function>Baz2</Function> the first matched
2778 value can be converted to a string (as well as applying the function to it).
2779 So this program is legal:
2780 </Para>
2781
2782 <Para>
2783
2784 <ProgramListing>
2785   f :: Baz -> String
2786   f (Baz1 p q) | p == q    = "Yes"
2787                | otherwise = "No"
2788   f (Baz1 v fn)            = show (fn v)
2789 </ProgramListing>
2790
2791 </Para>
2792
2793 <Para>
2794 Operationally, in a dictionary-passing implementation, the
2795 constructors <Function>Baz1</Function> and <Function>Baz2</Function> must store the
2796 dictionaries for <Literal>Eq</Literal> and <Literal>Show</Literal> respectively, and
2797 extract it on pattern matching.
2798 </Para>
2799
2800 <Para>
2801 Notice the way that the syntax fits smoothly with that used for
2802 universal quantification earlier.
2803 </Para>
2804
2805 </Sect2>
2806
2807 <Sect2>
2808 <Title>Restrictions</Title>
2809
2810 <Para>
2811 There are several restrictions on the ways in which existentially-quantified
2812 constructors can be use.
2813 </Para>
2814
2815 <Para>
2816
2817 <ItemizedList>
2818 <ListItem>
2819
2820 <Para>
2821  When pattern matching, each pattern match introduces a new,
2822 distinct, type for each existential type variable.  These types cannot
2823 be unified with any other type, nor can they escape from the scope of
2824 the pattern match.  For example, these fragments are incorrect:
2825
2826
2827 <ProgramListing>
2828 f1 (MkFoo a f) = a
2829 </ProgramListing>
2830
2831
2832 Here, the type bound by <Function>MkFoo</Function> "escapes", because <Literal>a</Literal>
2833 is the result of <Function>f1</Function>.  One way to see why this is wrong is to
2834 ask what type <Function>f1</Function> has:
2835
2836
2837 <ProgramListing>
2838   f1 :: Foo -> a             -- Weird!
2839 </ProgramListing>
2840
2841
2842 What is this "<Literal>a</Literal>" in the result type? Clearly we don't mean
2843 this:
2844
2845
2846 <ProgramListing>
2847   f1 :: forall a. Foo -> a   -- Wrong!
2848 </ProgramListing>
2849
2850
2851 The original program is just plain wrong.  Here's another sort of error
2852
2853
2854 <ProgramListing>
2855   f2 (Baz1 a b) (Baz1 p q) = a==q
2856 </ProgramListing>
2857
2858
2859 It's ok to say <Literal>a==b</Literal> or <Literal>p==q</Literal>, but
2860 <Literal>a==q</Literal> is wrong because it equates the two distinct types arising
2861 from the two <Function>Baz1</Function> constructors.
2862
2863
2864 </Para>
2865 </ListItem>
2866 <ListItem>
2867
2868 <Para>
2869 You can't pattern-match on an existentially quantified
2870 constructor in a <Literal>let</Literal> or <Literal>where</Literal> group of
2871 bindings. So this is illegal:
2872
2873
2874 <ProgramListing>
2875   f3 x = a==b where { Baz1 a b = x }
2876 </ProgramListing>
2877
2878
2879 You can only pattern-match
2880 on an existentially-quantified constructor in a <Literal>case</Literal> expression or
2881 in the patterns of a function definition.
2882
2883 The reason for this restriction is really an implementation one.
2884 Type-checking binding groups is already a nightmare without
2885 existentials complicating the picture.  Also an existential pattern
2886 binding at the top level of a module doesn't make sense, because it's
2887 not clear how to prevent the existentially-quantified type "escaping".
2888 So for now, there's a simple-to-state restriction.  We'll see how
2889 annoying it is.
2890
2891 </Para>
2892 </ListItem>
2893 <ListItem>
2894
2895 <Para>
2896 You can't use existential quantification for <Literal>newtype</Literal>
2897 declarations.  So this is illegal:
2898
2899
2900 <ProgramListing>
2901   newtype T = forall a. Ord a => MkT a
2902 </ProgramListing>
2903
2904
2905 Reason: a value of type <Literal>T</Literal> must be represented as a pair
2906 of a dictionary for <Literal>Ord t</Literal> and a value of type <Literal>t</Literal>.
2907 That contradicts the idea that <Literal>newtype</Literal> should have no
2908 concrete representation.  You can get just the same efficiency and effect
2909 by using <Literal>data</Literal> instead of <Literal>newtype</Literal>.  If there is no
2910 overloading involved, then there is more of a case for allowing
2911 an existentially-quantified <Literal>newtype</Literal>, because the <Literal>data</Literal>
2912 because the <Literal>data</Literal> version does carry an implementation cost,
2913 but single-field existentially quantified constructors aren't much
2914 use.  So the simple restriction (no existential stuff on <Literal>newtype</Literal>)
2915 stands, unless there are convincing reasons to change it.
2916
2917
2918 </Para>
2919 </ListItem>
2920 <ListItem>
2921
2922 <Para>
2923  You can't use <Literal>deriving</Literal> to define instances of a
2924 data type with existentially quantified data constructors.
2925
2926 Reason: in most cases it would not make sense. For example:&num;
2927
2928 <ProgramListing>
2929 data T = forall a. MkT [a] deriving( Eq )
2930 </ProgramListing>
2931
2932 To derive <Literal>Eq</Literal> in the standard way we would need to have equality
2933 between the single component of two <Function>MkT</Function> constructors:
2934
2935 <ProgramListing>
2936 instance Eq T where
2937   (MkT a) == (MkT b) = ???
2938 </ProgramListing>
2939
2940 But <VarName>a</VarName> and <VarName>b</VarName> have distinct types, and so can't be compared.
2941 It's just about possible to imagine examples in which the derived instance
2942 would make sense, but it seems altogether simpler simply to prohibit such
2943 declarations.  Define your own instances!
2944 </Para>
2945 </ListItem>
2946
2947 </ItemizedList>
2948
2949 </Para>
2950
2951 </Sect2>
2952
2953 </Sect1>
2954
2955 <Sect1 id="sec-assertions">
2956 <Title>Assertions
2957 <IndexTerm><Primary>Assertions</Primary></IndexTerm>
2958 </Title>
2959
2960 <Para>
2961 If you want to make use of assertions in your standard Haskell code, you
2962 could define a function like the following:
2963 </Para>
2964
2965 <Para>
2966
2967 <ProgramListing>
2968 assert :: Bool -> a -> a
2969 assert False x = error "assertion failed!"
2970 assert _     x = x
2971 </ProgramListing>
2972
2973 </Para>
2974
2975 <Para>
2976 which works, but gives you back a less than useful error message --
2977 an assertion failed, but which and where?
2978 </Para>
2979
2980 <Para>
2981 One way out is to define an extended <Function>assert</Function> function which also
2982 takes a descriptive string to include in the error message and
2983 perhaps combine this with the use of a pre-processor which inserts
2984 the source location where <Function>assert</Function> was used.
2985 </Para>
2986
2987 <Para>
2988 Ghc offers a helping hand here, doing all of this for you. For every
2989 use of <Function>assert</Function> in the user's source:
2990 </Para>
2991
2992 <Para>
2993
2994 <ProgramListing>
2995 kelvinToC :: Double -> Double
2996 kelvinToC k = assert (k &gt;= 0.0) (k+273.15)
2997 </ProgramListing>
2998
2999 </Para>
3000
3001 <Para>
3002 Ghc will rewrite this to also include the source location where the
3003 assertion was made,
3004 </Para>
3005
3006 <Para>
3007
3008 <ProgramListing>
3009 assert pred val ==> assertError "Main.hs|15" pred val
3010 </ProgramListing>
3011
3012 </Para>
3013
3014 <Para>
3015 The rewrite is only performed by the compiler when it spots
3016 applications of <Function>Exception.assert</Function>, so you can still define and
3017 use your own versions of <Function>assert</Function>, should you so wish. If not,
3018 import <Literal>Exception</Literal> to make use <Function>assert</Function> in your code.
3019 </Para>
3020
3021 <Para>
3022 To have the compiler ignore uses of assert, use the compiler option
3023 <Option>-fignore-asserts</Option>. <IndexTerm><Primary>-fignore-asserts option</Primary></IndexTerm> That is,
3024 expressions of the form <Literal>assert pred e</Literal> will be rewritten to <Literal>e</Literal>.
3025 </Para>
3026
3027 <Para>
3028 Assertion failures can be caught, see the documentation for the
3029 <literal>Exception</literal> library (<xref linkend="sec-Exception">)
3030 for the details.
3031 </Para>
3032
3033 </Sect1>
3034
3035 <Sect1 id="scoped-type-variables">
3036 <Title>Scoped Type Variables
3037 </Title>
3038
3039 <Para>
3040 A <Emphasis>pattern type signature</Emphasis> can introduce a <Emphasis>scoped type
3041 variable</Emphasis>.  For example
3042 </Para>
3043
3044 <Para>
3045
3046 <ProgramListing>
3047 f (xs::[a]) = ys ++ ys
3048            where
3049               ys :: [a]
3050               ys = reverse xs
3051 </ProgramListing>
3052
3053 </Para>
3054
3055 <Para>
3056 The pattern <Literal>(xs::[a])</Literal> includes a type signature for <VarName>xs</VarName>.
3057 This brings the type variable <Literal>a</Literal> into scope; it scopes over
3058 all the patterns and right hand sides for this equation for <Function>f</Function>.
3059 In particular, it is in scope at the type signature for <VarName>y</VarName>.
3060 </Para>
3061
3062 <Para>
3063 At ordinary type signatures, such as that for <VarName>ys</VarName>, any type variables
3064 mentioned in the type signature <Emphasis>that are not in scope</Emphasis> are
3065 implicitly universally quantified.  (If there are no type variables in
3066 scope, all type variables mentioned in the signature are universally
3067 quantified, which is just as in Haskell 98.)  In this case, since <VarName>a</VarName>
3068 is in scope, it is not universally quantified, so the type of <VarName>ys</VarName> is
3069 the same as that of <VarName>xs</VarName>.  In Haskell 98 it is not possible to declare
3070 a type for <VarName>ys</VarName>; a major benefit of scoped type variables is that
3071 it becomes possible to do so.
3072 </Para>
3073
3074 <Para>
3075 Scoped type variables are implemented in both GHC and Hugs.  Where the
3076 implementations differ from the specification below, those differences
3077 are noted.
3078 </Para>
3079
3080 <Para>
3081 So much for the basic idea.  Here are the details.
3082 </Para>
3083
3084 <Sect2>
3085 <Title>Scope and implicit quantification</Title>
3086
3087 <Para>
3088
3089 <ItemizedList>
3090 <ListItem>
3091
3092 <Para>
3093  All the type variables mentioned in the patterns for a single
3094 function definition equation, that are not already in scope,
3095 are brought into scope by the patterns.  We describe this set as
3096 the <Emphasis>type variables bound by the equation</Emphasis>.
3097
3098 </Para>
3099 </ListItem>
3100 <ListItem>
3101
3102 <Para>
3103  The type variables thus brought into scope may be mentioned
3104 in ordinary type signatures or pattern type signatures anywhere within
3105 their scope.
3106
3107 </Para>
3108 </ListItem>
3109 <ListItem>
3110
3111 <Para>
3112  In ordinary type signatures, any type variable mentioned in the
3113 signature that is in scope is <Emphasis>not</Emphasis> universally quantified.
3114
3115 </Para>
3116 </ListItem>
3117 <ListItem>
3118
3119 <Para>
3120  Ordinary type signatures do not bring any new type variables
3121 into scope (except in the type signature itself!). So this is illegal:
3122
3123
3124 <ProgramListing>
3125   f :: a -> a
3126   f x = x::a
3127 </ProgramListing>
3128
3129
3130 It's illegal because <VarName>a</VarName> is not in scope in the body of <Function>f</Function>,
3131 so the ordinary signature <Literal>x::a</Literal> is equivalent to <Literal>x::forall a.a</Literal>;
3132 and that is an incorrect typing.
3133
3134 </Para>
3135 </ListItem>
3136 <ListItem>
3137
3138 <Para>
3139  There is no implicit universal quantification on pattern type
3140 signatures, nor may one write an explicit <Literal>forall</Literal> type in a pattern
3141 type signature.  The pattern type signature is a monotype.
3142
3143 </Para>
3144 </ListItem>
3145 <ListItem>
3146
3147 <Para>
3148
3149 The type variables in the head of a <Literal>class</Literal> or <Literal>instance</Literal> declaration
3150 scope over the methods defined in the <Literal>where</Literal> part.  For example:
3151
3152
3153 <ProgramListing>
3154   class C a where
3155     op :: [a] -> a
3156
3157     op xs = let ys::[a]
3158                 ys = reverse xs
3159             in
3160             head ys
3161 </ProgramListing>
3162
3163
3164 (Not implemented in Hugs yet, Dec 98).
3165 </Para>
3166 </ListItem>
3167
3168 </ItemizedList>
3169
3170 </Para>
3171
3172 </Sect2>
3173
3174 <Sect2>
3175 <Title>Polymorphism</Title>
3176
3177 <Para>
3178
3179 <ItemizedList>
3180 <ListItem>
3181
3182 <Para>
3183  Pattern type signatures are completely orthogonal to ordinary, separate
3184 type signatures.  The two can be used independently or together.  There is
3185 no scoping associated with the names of the type variables in a separate type signature.
3186
3187
3188 <ProgramListing>
3189    f :: [a] -> [a]
3190    f (xs::[b]) = reverse xs
3191 </ProgramListing>
3192
3193
3194 </Para>
3195 </ListItem>
3196 <ListItem>
3197
3198 <Para>
3199  The function must be polymorphic in the type variables
3200 bound by all its equations.  Operationally, the type variables bound
3201 by one equation must not:
3202
3203
3204 <ItemizedList>
3205 <ListItem>
3206
3207 <Para>
3208  Be unified with a type (such as <Literal>Int</Literal>, or <Literal>[a]</Literal>).
3209 </Para>
3210 </ListItem>
3211 <ListItem>
3212
3213 <Para>
3214  Be unified with a type variable free in the environment.
3215 </Para>
3216 </ListItem>
3217 <ListItem>
3218
3219 <Para>
3220  Be unified with each other.  (They may unify with the type variables
3221 bound by another equation for the same function, of course.)
3222 </Para>
3223 </ListItem>
3224
3225 </ItemizedList>
3226
3227
3228 For example, the following all fail to type check:
3229
3230
3231 <ProgramListing>
3232   f (x::a) (y::b) = [x,y]       -- a unifies with b
3233
3234   g (x::a) = x + 1::Int         -- a unifies with Int
3235
3236   h x = let k (y::a) = [x,y]    -- a is free in the
3237         in k x                  -- environment
3238
3239   k (x::a) True    = ...        -- a unifies with Int
3240   k (x::Int) False = ...
3241
3242   w :: [b] -> [b]
3243   w (x::a) = x                  -- a unifies with [b]
3244 </ProgramListing>
3245
3246
3247 </Para>
3248 </ListItem>
3249 <ListItem>
3250
3251 <Para>
3252  The pattern-bound type variable may, however, be constrained
3253 by the context of the principal type, thus:
3254
3255
3256 <ProgramListing>
3257   f (x::a) (y::a) = x+y*2
3258 </ProgramListing>
3259
3260
3261 gets the inferred type: <Literal>forall a. Num a =&gt; a -&gt; a -&gt; a</Literal>.
3262 </Para>
3263 </ListItem>
3264
3265 </ItemizedList>
3266
3267 </Para>
3268
3269 </Sect2>
3270
3271 <Sect2>
3272 <Title>Result type signatures</Title>
3273
3274 <Para>
3275
3276 <ItemizedList>
3277 <ListItem>
3278
3279 <Para>
3280  The result type of a function can be given a signature,
3281 thus:
3282
3283
3284 <ProgramListing>
3285   f (x::a) :: [a] = [x,x,x]
3286 </ProgramListing>
3287
3288
3289 The final <Literal>:: [a]</Literal> after all the patterns gives a signature to the
3290 result type.  Sometimes this is the only way of naming the type variable
3291 you want:
3292
3293
3294 <ProgramListing>
3295   f :: Int -> [a] -> [a]
3296   f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
3297                         in \xs -> map g (reverse xs `zip` xs)
3298 </ProgramListing>
3299
3300
3301 </Para>
3302 </ListItem>
3303
3304 </ItemizedList>
3305
3306 </Para>
3307
3308 <Para>
3309 Result type signatures are not yet implemented in Hugs.
3310 </Para>
3311
3312 </Sect2>
3313
3314 <Sect2>
3315 <Title>Pattern signatures on other constructs</Title>
3316
3317 <Para>
3318
3319 <ItemizedList>
3320 <ListItem>
3321
3322 <Para>
3323  A pattern type signature can be on an arbitrary sub-pattern, not
3324 just on a variable:
3325
3326
3327 <ProgramListing>
3328   f ((x,y)::(a,b)) = (y,x) :: (b,a)
3329 </ProgramListing>
3330
3331
3332 </Para>
3333 </ListItem>
3334 <ListItem>
3335
3336 <Para>
3337  Pattern type signatures, including the result part, can be used
3338 in lambda abstractions:
3339
3340
3341 <ProgramListing>
3342   (\ (x::a, y) :: a -> x)
3343 </ProgramListing>
3344
3345
3346 Type variables bound by these patterns must be polymorphic in
3347 the sense defined above.
3348 For example:
3349
3350
3351 <ProgramListing>
3352   f1 (x::c) = f1 x      -- ok
3353   f2 = \(x::c) -> f2 x  -- not ok
3354 </ProgramListing>
3355
3356
3357 Here, <Function>f1</Function> is OK, but <Function>f2</Function> is not, because <VarName>c</VarName> gets unified
3358 with a type variable free in the environment, in this
3359 case, the type of <Function>f2</Function>, which is in the environment when
3360 the lambda abstraction is checked.
3361
3362 </Para>
3363 </ListItem>
3364 <ListItem>
3365
3366 <Para>
3367  Pattern type signatures, including the result part, can be used
3368 in <Literal>case</Literal> expressions:
3369
3370
3371 <ProgramListing>
3372   case e of { (x::a, y) :: a -> x }
3373 </ProgramListing>
3374
3375
3376 The pattern-bound type variables must, as usual,
3377 be polymorphic in the following sense: each case alternative,
3378 considered as a lambda abstraction, must be polymorphic.
3379 Thus this is OK:
3380
3381
3382 <ProgramListing>
3383   case (True,False) of { (x::a, y) -> x }
3384 </ProgramListing>
3385
3386
3387 Even though the context is that of a pair of booleans,
3388 the alternative itself is polymorphic.  Of course, it is
3389 also OK to say:
3390
3391
3392 <ProgramListing>
3393   case (True,False) of { (x::Bool, y) -> x }
3394 </ProgramListing>
3395
3396
3397 </Para>
3398 </ListItem>
3399 <ListItem>
3400
3401 <Para>
3402 To avoid ambiguity, the type after the &ldquo;<Literal>::</Literal>&rdquo; in a result
3403 pattern signature on a lambda or <Literal>case</Literal> must be atomic (i.e. a single
3404 token or a parenthesised type of some sort).  To see why,
3405 consider how one would parse this:
3406
3407
3408 <ProgramListing>
3409   \ x :: a -> b -> x
3410 </ProgramListing>
3411
3412
3413 </Para>
3414 </ListItem>
3415 <ListItem>
3416
3417 <Para>
3418  Pattern type signatures that bind new type variables
3419 may not be used in pattern bindings at all.
3420 So this is illegal:
3421
3422
3423 <ProgramListing>
3424   f x = let (y, z::a) = x in ...
3425 </ProgramListing>
3426
3427
3428 But these are OK, because they do not bind fresh type variables:
3429
3430
3431 <ProgramListing>
3432   f1 x            = let (y, z::Int) = x in ...
3433   f2 (x::(Int,a)) = let (y, z::a)   = x in ...
3434 </ProgramListing>
3435
3436
3437 However a single variable is considered a degenerate function binding,
3438 rather than a degerate pattern binding, so this is permitted, even
3439 though it binds a type variable:
3440
3441
3442 <ProgramListing>
3443   f :: (b->b) = \(x::b) -> x
3444 </ProgramListing>
3445
3446
3447 </Para>
3448 </ListItem>
3449
3450 </ItemizedList>
3451
3452 Such degnerate function bindings do not fall under the monomorphism
3453 restriction.  Thus:
3454 </Para>
3455
3456 <Para>
3457
3458 <ProgramListing>
3459   g :: a -> a -> Bool = \x y. x==y
3460 </ProgramListing>
3461
3462 </Para>
3463
3464 <Para>
3465 Here <Function>g</Function> has type <Literal>forall a. Eq a =&gt; a -&gt; a -&gt; Bool</Literal>, just as if
3466 <Function>g</Function> had a separate type signature.  Lacking a type signature, <Function>g</Function>
3467 would get a monomorphic type.
3468 </Para>
3469
3470 </Sect2>
3471
3472 <Sect2>
3473 <Title>Existentials</Title>
3474
3475 <Para>
3476
3477 <ItemizedList>
3478 <ListItem>
3479
3480 <Para>
3481  Pattern type signatures can bind existential type variables.
3482 For example:
3483
3484
3485 <ProgramListing>
3486   data T = forall a. MkT [a]
3487
3488   f :: T -> T
3489   f (MkT [t::a]) = MkT t3
3490                  where
3491                    t3::[a] = [t,t,t]
3492 </ProgramListing>
3493
3494
3495 </Para>
3496 </ListItem>
3497
3498 </ItemizedList>
3499
3500 </Para>
3501
3502 </Sect2>
3503
3504 </Sect1>
3505
3506 <Sect1 id="pragmas">
3507 <Title>Pragmas
3508 </Title>
3509
3510 <Para>
3511 GHC supports several pragmas, or instructions to the compiler placed
3512 in the source code.  Pragmas don't affect the meaning of the program,
3513 but they might affect the efficiency of the generated code.
3514 </Para>
3515
3516 <Sect2 id="inline-pragma">
3517 <Title>INLINE pragma
3518
3519 <IndexTerm><Primary>INLINE pragma</Primary></IndexTerm>
3520 <IndexTerm><Primary>pragma, INLINE</Primary></IndexTerm></Title>
3521
3522 <Para>
3523 GHC (with <Option>-O</Option>, as always) tries to inline (or &ldquo;unfold&rdquo;)
3524 functions/values that are &ldquo;small enough,&rdquo; thus avoiding the call
3525 overhead and possibly exposing other more-wonderful optimisations.
3526 </Para>
3527
3528 <Para>
3529 You will probably see these unfoldings (in Core syntax) in your
3530 interface files.
3531 </Para>
3532
3533 <Para>
3534 Normally, if GHC decides a function is &ldquo;too expensive&rdquo; to inline, it
3535 will not do so, nor will it export that unfolding for other modules to
3536 use.
3537 </Para>
3538
3539 <Para>
3540 The sledgehammer you can bring to bear is the
3541 <Literal>INLINE</Literal><IndexTerm><Primary>INLINE pragma</Primary></IndexTerm> pragma, used thusly:
3542
3543 <ProgramListing>
3544 key_function :: Int -> String -> (Bool, Double)
3545
3546 #ifdef __GLASGOW_HASKELL__
3547 {-# INLINE key_function #-}
3548 #endif
3549 </ProgramListing>
3550
3551 (You don't need to do the C pre-processor carry-on unless you're going
3552 to stick the code through HBC&mdash;it doesn't like <Literal>INLINE</Literal> pragmas.)
3553 </Para>
3554
3555 <Para>
3556 The major effect of an <Literal>INLINE</Literal> pragma is to declare a function's
3557 &ldquo;cost&rdquo; to be very low.  The normal unfolding machinery will then be
3558 very keen to inline it.
3559 </Para>
3560
3561 <Para>
3562 An <Literal>INLINE</Literal> pragma for a function can be put anywhere its type
3563 signature could be put.
3564 </Para>
3565
3566 <Para>
3567 <Literal>INLINE</Literal> pragmas are a particularly good idea for the
3568 <Literal>then</Literal>/<Literal>return</Literal> (or <Literal>bind</Literal>/<Literal>unit</Literal>) functions in a monad.
3569 For example, in GHC's own <Literal>UniqueSupply</Literal> monad code, we have:
3570
3571 <ProgramListing>
3572 #ifdef __GLASGOW_HASKELL__
3573 {-# INLINE thenUs #-}
3574 {-# INLINE returnUs #-}
3575 #endif
3576 </ProgramListing>
3577
3578 </Para>
3579
3580 </Sect2>
3581
3582 <Sect2 id="noinline-pragma">
3583 <Title>NOINLINE pragma
3584 </Title>
3585
3586 <Para>
3587 <IndexTerm><Primary>NOINLINE pragma</Primary></IndexTerm>
3588 <IndexTerm><Primary>pragma, NOINLINE</Primary></IndexTerm>
3589 </Para>
3590
3591 <Para>
3592 The <Literal>NOINLINE</Literal> pragma does exactly what you'd expect: it stops the
3593 named function from being inlined by the compiler.  You shouldn't ever
3594 need to do this, unless you're very cautious about code size.
3595 </Para>
3596
3597 </Sect2>
3598
3599 <Sect2 id="specialize-pragma">
3600 <Title>SPECIALIZE pragma
3601 </Title>
3602
3603 <Para>
3604 <IndexTerm><Primary>SPECIALIZE pragma</Primary></IndexTerm>
3605 <IndexTerm><Primary>pragma, SPECIALIZE</Primary></IndexTerm>
3606 <IndexTerm><Primary>overloading, death to</Primary></IndexTerm>
3607 </Para>
3608
3609 <Para>
3610 (UK spelling also accepted.)  For key overloaded functions, you can
3611 create extra versions (NB: more code space) specialised to particular
3612 types.  Thus, if you have an overloaded function:
3613 </Para>
3614
3615 <Para>
3616
3617 <ProgramListing>
3618 hammeredLookup :: Ord key => [(key, value)] -> key -> value
3619 </ProgramListing>
3620
3621 </Para>
3622
3623 <Para>
3624 If it is heavily used on lists with <Literal>Widget</Literal> keys, you could
3625 specialise it as follows:
3626
3627 <ProgramListing>
3628 {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
3629 </ProgramListing>
3630
3631 </Para>
3632
3633 <Para>
3634 To get very fancy, you can also specify a named function to use for
3635 the specialised value, by adding <Literal>= blah</Literal>, as in:
3636
3637 <ProgramListing>
3638 {-# SPECIALIZE hammeredLookup :: ...as before... = blah #-}
3639 </ProgramListing>
3640
3641 It's <Emphasis>Your Responsibility</Emphasis> to make sure that <Function>blah</Function> really
3642 behaves as a specialised version of <Function>hammeredLookup</Function>!!!
3643 </Para>
3644
3645 <Para>
3646 NOTE: the <Literal>=blah</Literal> feature isn't implemented in GHC 4.xx.
3647 </Para>
3648
3649 <Para>
3650 An example in which the <Literal>= blah</Literal> form will Win Big:
3651
3652 <ProgramListing>
3653 toDouble :: Real a => a -> Double
3654 toDouble = fromRational . toRational
3655
3656 {-# SPECIALIZE toDouble :: Int -> Double = i2d #-}
3657 i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
3658 </ProgramListing>
3659
3660 The <Function>i2d</Function> function is virtually one machine instruction; the
3661 default conversion&mdash;via an intermediate <Literal>Rational</Literal>&mdash;is obscenely
3662 expensive by comparison.
3663 </Para>
3664
3665 <Para>
3666 By using the US spelling, your <Literal>SPECIALIZE</Literal> pragma will work with
3667 HBC, too.  Note that HBC doesn't support the <Literal>= blah</Literal> form.
3668 </Para>
3669
3670 <Para>
3671 A <Literal>SPECIALIZE</Literal> pragma for a function can be put anywhere its type
3672 signature could be put.
3673 </Para>
3674
3675 </Sect2>
3676
3677 <Sect2 id="specialize-instance-pragma">
3678 <Title>SPECIALIZE instance pragma
3679 </Title>
3680
3681 <Para>
3682 <IndexTerm><Primary>SPECIALIZE pragma</Primary></IndexTerm>
3683 <IndexTerm><Primary>overloading, death to</Primary></IndexTerm>
3684 Same idea, except for instance declarations.  For example:
3685
3686 <ProgramListing>
3687 instance (Eq a) => Eq (Foo a) where { ... usual stuff ... }
3688
3689 {-# SPECIALIZE instance Eq (Foo [(Int, Bar)] #-}
3690 </ProgramListing>
3691
3692 Compatible with HBC, by the way.
3693 </Para>
3694
3695 </Sect2>
3696
3697 <Sect2 id="line-pragma">
3698 <Title>LINE pragma
3699 </Title>
3700
3701 <Para>
3702 <IndexTerm><Primary>LINE pragma</Primary></IndexTerm>
3703 <IndexTerm><Primary>pragma, LINE</Primary></IndexTerm>
3704 </Para>
3705
3706 <Para>
3707 This pragma is similar to C's <Literal>&num;line</Literal> pragma, and is mainly for use in
3708 automatically generated Haskell code.  It lets you specify the line
3709 number and filename of the original code; for example
3710 </Para>
3711
3712 <Para>
3713
3714 <ProgramListing>
3715 {-# LINE 42 "Foo.vhs" #-}
3716 </ProgramListing>
3717
3718 </Para>
3719
3720 <Para>
3721 if you'd generated the current file from something called <Filename>Foo.vhs</Filename>
3722 and this line corresponds to line 42 in the original.  GHC will adjust
3723 its error messages to refer to the line/file named in the <Literal>LINE</Literal>
3724 pragma.
3725 </Para>
3726
3727 </Sect2>
3728
3729 <Sect2>
3730 <Title>RULES pragma</Title>
3731
3732 <Para>
3733 The RULES pragma lets you specify rewrite rules.  It is described in
3734 <XRef LinkEnd="rewrite-rules">.
3735 </Para>
3736
3737 </Sect2>
3738
3739 </Sect1>
3740
3741 <Sect1 id="rewrite-rules">
3742 <Title>Rewrite rules
3743
3744 <IndexTerm><Primary>RULES pagma</Primary></IndexTerm>
3745 <IndexTerm><Primary>pragma, RULES</Primary></IndexTerm>
3746 <IndexTerm><Primary>rewrite rules</Primary></IndexTerm></Title>
3747
3748 <Para>
3749 The programmer can specify rewrite rules as part of the source program
3750 (in a pragma).  GHC applies these rewrite rules wherever it can.
3751 </Para>
3752
3753 <Para>
3754 Here is an example:
3755
3756 <ProgramListing>
3757   {-# RULES
3758         "map/map"       forall f g xs. map f (map g xs) = map (f.g) xs
3759   #-}
3760 </ProgramListing>
3761
3762 </Para>
3763
3764 <Sect2>
3765 <Title>Syntax</Title>
3766
3767 <Para>
3768 From a syntactic point of view:
3769
3770 <ItemizedList>
3771 <ListItem>
3772
3773 <Para>
3774  Each rule has a name, enclosed in double quotes.  The name itself has
3775 no significance at all.  It is only used when reporting how many times the rule fired.
3776 </Para>
3777 </ListItem>
3778 <ListItem>
3779
3780 <Para>
3781  There may be zero or more rules in a <Literal>RULES</Literal> pragma.
3782 </Para>
3783 </ListItem>
3784 <ListItem>
3785
3786 <Para>
3787  Layout applies in a <Literal>RULES</Literal> pragma.  Currently no new indentation level
3788 is set, so you must lay out your rules starting in the same column as the
3789 enclosing definitions.
3790 </Para>
3791 </ListItem>
3792 <ListItem>
3793
3794 <Para>
3795  Each variable mentioned in a rule must either be in scope (e.g. <Function>map</Function>),
3796 or bound by the <Literal>forall</Literal> (e.g. <Function>f</Function>, <Function>g</Function>, <Function>xs</Function>).  The variables bound by
3797 the <Literal>forall</Literal> are called the <Emphasis>pattern</Emphasis> variables.  They are separated
3798 by spaces, just like in a type <Literal>forall</Literal>.
3799 </Para>
3800 </ListItem>
3801 <ListItem>
3802
3803 <Para>
3804  A pattern variable may optionally have a type signature.
3805 If the type of the pattern variable is polymorphic, it <Emphasis>must</Emphasis> have a type signature.
3806 For example, here is the <Literal>foldr/build</Literal> rule:
3807
3808 <ProgramListing>
3809 "fold/build"  forall k z (g::forall b. (a->b->b) -> b -> b) .
3810               foldr k z (build g) = g k z
3811 </ProgramListing>
3812
3813 Since <Function>g</Function> has a polymorphic type, it must have a type signature.
3814
3815 </Para>
3816 </ListItem>
3817 <ListItem>
3818
3819 <Para>
3820 The left hand side of a rule must consist of a top-level variable applied
3821 to arbitrary expressions.  For example, this is <Emphasis>not</Emphasis> OK:
3822
3823 <ProgramListing>
3824 "wrong1"   forall e1 e2.  case True of { True -> e1; False -> e2 } = e1
3825 "wrong2"   forall f.      f True = True
3826 </ProgramListing>
3827
3828 In <Literal>"wrong1"</Literal>, the LHS is not an application; in <Literal>"wrong2"</Literal>, the LHS has a pattern variable
3829 in the head.
3830 </Para>
3831 </ListItem>
3832 <ListItem>
3833
3834 <Para>
3835  A rule does not need to be in the same module as (any of) the
3836 variables it mentions, though of course they need to be in scope.
3837 </Para>
3838 </ListItem>
3839 <ListItem>
3840
3841 <Para>
3842  Rules are automatically exported from a module, just as instance declarations are.
3843 </Para>
3844 </ListItem>
3845
3846 </ItemizedList>
3847
3848 </Para>
3849
3850 </Sect2>
3851
3852 <Sect2>
3853 <Title>Semantics</Title>
3854
3855 <Para>
3856 From a semantic point of view:
3857
3858 <ItemizedList>
3859 <ListItem>
3860
3861 <Para>
3862 Rules are only applied if you use the <Option>-O</Option> flag.
3863 </Para>
3864 </ListItem>
3865
3866 <ListItem>
3867 <Para>
3868  Rules are regarded as left-to-right rewrite rules.
3869 When GHC finds an expression that is a substitution instance of the LHS
3870 of a rule, it replaces the expression by the (appropriately-substituted) RHS.
3871 By "a substitution instance" we mean that the LHS can be made equal to the
3872 expression by substituting for the pattern variables.
3873
3874 </Para>
3875 </ListItem>
3876 <ListItem>
3877
3878 <Para>
3879  The LHS and RHS of a rule are typechecked, and must have the
3880 same type.
3881
3882 </Para>
3883 </ListItem>
3884 <ListItem>
3885
3886 <Para>
3887  GHC makes absolutely no attempt to verify that the LHS and RHS
3888 of a rule have the same meaning.  That is undecideable in general, and
3889 infeasible in most interesting cases.  The responsibility is entirely the programmer's!
3890
3891 </Para>
3892 </ListItem>
3893 <ListItem>
3894
3895 <Para>
3896  GHC makes no attempt to make sure that the rules are confluent or
3897 terminating.  For example:
3898
3899 <ProgramListing>
3900   "loop"        forall x,y.  f x y = f y x
3901 </ProgramListing>
3902
3903 This rule will cause the compiler to go into an infinite loop.
3904
3905 </Para>
3906 </ListItem>
3907 <ListItem>
3908
3909 <Para>
3910  If more than one rule matches a call, GHC will choose one arbitrarily to apply.
3911
3912 </Para>
3913 </ListItem>
3914 <ListItem>
3915 <Para>
3916  GHC currently uses a very simple, syntactic, matching algorithm
3917 for matching a rule LHS with an expression.  It seeks a substitution
3918 which makes the LHS and expression syntactically equal modulo alpha
3919 conversion.  The pattern (rule), but not the expression, is eta-expanded if
3920 necessary.  (Eta-expanding the epression can lead to laziness bugs.)
3921 But not beta conversion (that's called higher-order matching).
3922 </Para>
3923
3924 <Para>
3925 Matching is carried out on GHC's intermediate language, which includes
3926 type abstractions and applications.  So a rule only matches if the
3927 types match too.  See <XRef LinkEnd="rule-spec"> below.
3928 </Para>
3929 </ListItem>
3930 <ListItem>
3931
3932 <Para>
3933  GHC keeps trying to apply the rules as it optimises the program.
3934 For example, consider:
3935
3936 <ProgramListing>
3937   let s = map f
3938       t = map g
3939   in
3940   s (t xs)
3941 </ProgramListing>
3942
3943 The expression <Literal>s (t xs)</Literal> does not match the rule <Literal>"map/map"</Literal>, but GHC
3944 will substitute for <VarName>s</VarName> and <VarName>t</VarName>, giving an expression which does match.
3945 If <VarName>s</VarName> or <VarName>t</VarName> was (a) used more than once, and (b) large or a redex, then it would
3946 not be substituted, and the rule would not fire.
3947
3948 </Para>
3949 </ListItem>
3950 <ListItem>
3951
3952 <Para>
3953  In the earlier phases of compilation, GHC inlines <Emphasis>nothing
3954 that appears on the LHS of a rule</Emphasis>, because once you have substituted
3955 for something you can't match against it (given the simple minded
3956 matching).  So if you write the rule
3957
3958 <ProgramListing>
3959         "map/map"       forall f,g.  map f . map g = map (f.g)
3960 </ProgramListing>
3961
3962 this <Emphasis>won't</Emphasis> match the expression <Literal>map f (map g xs)</Literal>.
3963 It will only match something written with explicit use of ".".
3964 Well, not quite.  It <Emphasis>will</Emphasis> match the expression
3965
3966 <ProgramListing>
3967 wibble f g xs
3968 </ProgramListing>
3969
3970 where <Function>wibble</Function> is defined:
3971
3972 <ProgramListing>
3973 wibble f g = map f . map g
3974 </ProgramListing>
3975
3976 because <Function>wibble</Function> will be inlined (it's small).
3977
3978 Later on in compilation, GHC starts inlining even things on the
3979 LHS of rules, but still leaves the rules enabled.  This inlining
3980 policy is controlled by the per-simplification-pass flag <Option>-finline-phase</Option><Emphasis>n</Emphasis>.
3981
3982 </Para>
3983 </ListItem>
3984 <ListItem>
3985
3986 <Para>
3987  All rules are implicitly exported from the module, and are therefore
3988 in force in any module that imports the module that defined the rule, directly
3989 or indirectly.  (That is, if A imports B, which imports C, then C's rules are
3990 in force when compiling A.)  The situation is very similar to that for instance
3991 declarations.
3992 </Para>
3993 </ListItem>
3994
3995 </ItemizedList>
3996
3997 </Para>
3998
3999 </Sect2>
4000
4001 <Sect2>
4002 <Title>List fusion</Title>
4003
4004 <Para>
4005 The RULES mechanism is used to implement fusion (deforestation) of common list functions.
4006 If a "good consumer" consumes an intermediate list constructed by a "good producer", the
4007 intermediate list should be eliminated entirely.
4008 </Para>
4009
4010 <Para>
4011 The following are good producers:
4012
4013 <ItemizedList>
4014 <ListItem>
4015
4016 <Para>
4017  List comprehensions
4018 </Para>
4019 </ListItem>
4020 <ListItem>
4021
4022 <Para>
4023  Enumerations of <Literal>Int</Literal> and <Literal>Char</Literal> (e.g. <Literal>['a'..'z']</Literal>).
4024 </Para>
4025 </ListItem>
4026 <ListItem>
4027
4028 <Para>
4029  Explicit lists (e.g. <Literal>[True, False]</Literal>)
4030 </Para>
4031 </ListItem>
4032 <ListItem>
4033
4034 <Para>
4035  The cons constructor (e.g <Literal>3:4:[]</Literal>)
4036 </Para>
4037 </ListItem>
4038 <ListItem>
4039
4040 <Para>
4041  <Function>++</Function>
4042 </Para>
4043 </ListItem>
4044 <ListItem>
4045
4046 <Para>
4047  <Function>map</Function>
4048 </Para>
4049 </ListItem>
4050 <ListItem>
4051
4052 <Para>
4053  <Function>filter</Function>
4054 </Para>
4055 </ListItem>
4056 <ListItem>
4057
4058 <Para>
4059  <Function>iterate</Function>, <Function>repeat</Function>
4060 </Para>
4061 </ListItem>
4062 <ListItem>
4063
4064 <Para>
4065  <Function>zip</Function>, <Function>zipWith</Function>
4066 </Para>
4067 </ListItem>
4068
4069 </ItemizedList>
4070
4071 </Para>
4072
4073 <Para>
4074 The following are good consumers:
4075
4076 <ItemizedList>
4077 <ListItem>
4078
4079 <Para>
4080  List comprehensions
4081 </Para>
4082 </ListItem>
4083 <ListItem>
4084
4085 <Para>
4086  <Function>array</Function> (on its second argument)
4087 </Para>
4088 </ListItem>
4089 <ListItem>
4090
4091 <Para>
4092  <Function>length</Function>
4093 </Para>
4094 </ListItem>
4095 <ListItem>
4096
4097 <Para>
4098  <Function>++</Function> (on its first argument)
4099 </Para>
4100 </ListItem>
4101 <ListItem>
4102
4103 <Para>
4104  <Function>map</Function>
4105 </Para>
4106 </ListItem>
4107 <ListItem>
4108
4109 <Para>
4110  <Function>filter</Function>
4111 </Para>
4112 </ListItem>
4113 <ListItem>
4114
4115 <Para>
4116  <Function>concat</Function>
4117 </Para>
4118 </ListItem>
4119 <ListItem>
4120
4121 <Para>
4122  <Function>unzip</Function>, <Function>unzip2</Function>, <Function>unzip3</Function>, <Function>unzip4</Function>
4123 </Para>
4124 </ListItem>
4125 <ListItem>
4126
4127 <Para>
4128  <Function>zip</Function>, <Function>zipWith</Function> (but on one argument only; if both are good producers, <Function>zip</Function>
4129 will fuse with one but not the other)
4130 </Para>
4131 </ListItem>
4132 <ListItem>
4133
4134 <Para>
4135  <Function>partition</Function>
4136 </Para>
4137 </ListItem>
4138 <ListItem>
4139
4140 <Para>
4141  <Function>head</Function>
4142 </Para>
4143 </ListItem>
4144 <ListItem>
4145
4146 <Para>
4147  <Function>and</Function>, <Function>or</Function>, <Function>any</Function>, <Function>all</Function>
4148 </Para>
4149 </ListItem>
4150 <ListItem>
4151
4152 <Para>
4153  <Function>sequence&lowbar;</Function>
4154 </Para>
4155 </ListItem>
4156 <ListItem>
4157
4158 <Para>
4159  <Function>msum</Function>
4160 </Para>
4161 </ListItem>
4162 <ListItem>
4163
4164 <Para>
4165  <Function>sortBy</Function>
4166 </Para>
4167 </ListItem>
4168
4169 </ItemizedList>
4170
4171 </Para>
4172
4173 <Para>
4174 So, for example, the following should generate no intermediate lists:
4175
4176 <ProgramListing>
4177 array (1,10) [(i,i*i) | i &#60;- map (+ 1) [0..9]]
4178 </ProgramListing>
4179
4180 </Para>
4181
4182 <Para>
4183 This list could readily be extended; if there are Prelude functions that you use
4184 a lot which are not included, please tell us.
4185 </Para>
4186
4187 <Para>
4188 If you want to write your own good consumers or producers, look at the
4189 Prelude definitions of the above functions to see how to do so.
4190 </Para>
4191
4192 </Sect2>
4193
4194 <Sect2 id="rule-spec">
4195 <Title>Specialisation
4196 </Title>
4197
4198 <Para>
4199 Rewrite rules can be used to get the same effect as a feature
4200 present in earlier version of GHC:
4201
4202 <ProgramListing>
4203   {-# SPECIALIZE fromIntegral :: Int8 -> Int16 = int8ToInt16 #-}
4204 </ProgramListing>
4205
4206 This told GHC to use <Function>int8ToInt16</Function> instead of <Function>fromIntegral</Function> whenever
4207 the latter was called with type <Literal>Int8 -&gt; Int16</Literal>.  That is, rather than
4208 specialising the original definition of <Function>fromIntegral</Function> the programmer is
4209 promising that it is safe to use <Function>int8ToInt16</Function> instead.
4210 </Para>
4211
4212 <Para>
4213 This feature is no longer in GHC.  But rewrite rules let you do the
4214 same thing:
4215
4216 <ProgramListing>
4217 {-# RULES
4218   "fromIntegral/Int8/Int16" fromIntegral = int8ToInt16
4219 #-}
4220 </ProgramListing>
4221
4222 This slightly odd-looking rule instructs GHC to replace <Function>fromIntegral</Function>
4223 by <Function>int8ToInt16</Function> <Emphasis>whenever the types match</Emphasis>.  Speaking more operationally,
4224 GHC adds the type and dictionary applications to get the typed rule
4225
4226 <ProgramListing>
4227 forall (d1::Integral Int8) (d2::Num Int16) .
4228         fromIntegral Int8 Int16 d1 d2 = int8ToInt16
4229 </ProgramListing>
4230
4231 What is more,
4232 this rule does not need to be in the same file as fromIntegral,
4233 unlike the <Literal>SPECIALISE</Literal> pragmas which currently do (so that they
4234 have an original definition available to specialise).
4235 </Para>
4236
4237 </Sect2>
4238
4239 <Sect2>
4240 <Title>Controlling what's going on</Title>
4241
4242 <Para>
4243
4244 <ItemizedList>
4245 <ListItem>
4246
4247 <Para>
4248  Use <Option>-ddump-rules</Option> to see what transformation rules GHC is using.
4249 </Para>
4250 </ListItem>
4251 <ListItem>
4252
4253 <Para>
4254  Use <Option>-ddump-simpl-stats</Option> to see what rules are being fired.
4255 If you add <Option>-dppr-debug</Option> you get a more detailed listing.
4256 </Para>
4257 </ListItem>
4258 <ListItem>
4259
4260 <Para>
4261  The defintion of (say) <Function>build</Function> in <FileName>PrelBase.lhs</FileName> looks llike this:
4262
4263 <ProgramListing>
4264         build   :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
4265         {-# INLINE build #-}
4266         build g = g (:) []
4267 </ProgramListing>
4268
4269 Notice the <Literal>INLINE</Literal>!  That prevents <Literal>(:)</Literal> from being inlined when compiling
4270 <Literal>PrelBase</Literal>, so that an importing module will &ldquo;see&rdquo; the <Literal>(:)</Literal>, and can
4271 match it on the LHS of a rule.  <Literal>INLINE</Literal> prevents any inlining happening
4272 in the RHS of the <Literal>INLINE</Literal> thing.  I regret the delicacy of this.
4273
4274 </Para>
4275 </ListItem>
4276 <ListItem>
4277
4278 <Para>
4279  In <Filename>ghc/lib/std/PrelBase.lhs</Filename> look at the rules for <Function>map</Function> to
4280 see how to write rules that will do fusion and yet give an efficient
4281 program even if fusion doesn't happen.  More rules in <Filename>PrelList.lhs</Filename>.
4282 </Para>
4283 </ListItem>
4284
4285 </ItemizedList>
4286
4287 </Para>
4288
4289 </Sect2>
4290
4291 </Sect1>
4292
4293 <Sect1 id="generic-classes">
4294 <Title>Generic classes</Title>
4295
4296 <Para>
4297 The ideas behind this extension are described in detail in "Derivable type classes",
4298 Ralf Hinze and Simon Peyton Jones, Haskell Workshop, Montreal Sept 2000, pp94-105.
4299 An example will give the idea:
4300 </Para>
4301
4302 <ProgramListing>
4303   import Generics
4304
4305   class Bin a where
4306     toBin   :: a -> [Int]
4307     fromBin :: [Int] -> (a, [Int])
4308
4309     toBin {| Unit |}    Unit      = []
4310     toBin {| a :+: b |} (Inl x)   = 0 : toBin x
4311     toBin {| a :+: b |} (Inr y)   = 1 : toBin y
4312     toBin {| a :*: b |} (x :*: y) = toBin x ++ toBin y
4313
4314     fromBin {| Unit |}    bs      = (Unit, bs)
4315     fromBin {| a :+: b |} (0:bs)  = (Inl x, bs')    where (x,bs') = fromBin bs
4316     fromBin {| a :+: b |} (1:bs)  = (Inr y, bs')    where (y,bs') = fromBin bs
4317     fromBin {| a :*: b |} bs      = (x :*: y, bs'') where (x,bs' ) = fromBin bs
4318                                                           (y,bs'') = fromBin bs'
4319 </ProgramListing>
4320 <Para>
4321 This class declaration explains how <Literal>toBin</Literal> and <Literal>fromBin</Literal>
4322 work for arbitrary data types.  They do so by giving cases for unit, product, and sum,
4323 which are defined thus in the library module <Literal>Generics</Literal>:
4324 </Para>
4325 <ProgramListing>
4326   data Unit    = Unit
4327   data a :+: b = Inl a | Inr b
4328   data a :*: b = a :*: b
4329 </ProgramListing>
4330 <Para>
4331 Now you can make a data type into an instance of Bin like this:
4332 <ProgramListing>
4333   instance (Bin a, Bin b) => Bin (a,b)
4334   instance Bin a => Bin [a]
4335 </ProgramListing>
4336 That is, just leave off the "where" clasuse.  Of course, you can put in the
4337 where clause and over-ride whichever methods you please.
4338 </Para>
4339
4340     <Sect2>
4341       <Title> Using generics </Title>
4342       <Para>To use generics you need to</para>
4343       <ItemizedList>
4344         <ListItem>
4345           <Para>Use the <Option>-fgenerics</Option> flag.</Para>
4346         </ListItem>
4347         <ListItem>
4348           <Para>Import the module <Literal>Generics</Literal> from the
4349           <Literal>lang</Literal> package.  This import brings into
4350           scope the data types <Literal>Unit</Literal>,
4351           <Literal>:*:</Literal>, and <Literal>:+:</Literal>.  (You
4352           don't need this import if you don't mention these types
4353           explicitly; for example, if you are simply giving instance
4354           declarations.)</Para>
4355         </ListItem>
4356       </ItemizedList>
4357     </Sect2>
4358
4359 <Sect2> <Title> Changes wrt the paper </Title>
4360 <Para>
4361 Note that the type constructors <Literal>:+:</Literal> and <Literal>:*:</Literal>
4362 can be written infix (indeed, you can now use
4363 any operator starting in a colon as an infix type constructor).  Also note that
4364 the type constructors are not exactly as in the paper (Unit instead of 1, etc).
4365 Finally, note that the syntax of the type patterns in the class declaration
4366 uses "<Literal>{|</Literal>" and "<Literal>{|</Literal>" brackets; curly braces
4367 alone would ambiguous when they appear on right hand sides (an extension we
4368 anticipate wanting).
4369 </Para>
4370 </Sect2>
4371
4372 <Sect2> <Title>Terminology and restrictions</Title>
4373 <Para>
4374 Terminology.  A "generic default method" in a class declaration
4375 is one that is defined using type patterns as above.
4376 A "polymorphic default method" is a default method defined as in Haskell 98.
4377 A "generic class declaration" is a class declaration with at least one
4378 generic default method.
4379 </Para>
4380
4381 <Para>
4382 Restrictions:
4383 <ItemizedList>
4384 <ListItem>
4385 <Para>
4386 Alas, we do not yet implement the stuff about constructor names and
4387 field labels.
4388 </Para>
4389 </ListItem>
4390
4391 <ListItem>
4392 <Para>
4393 A generic class can have only one parameter; you can't have a generic
4394 multi-parameter class.
4395 </Para>
4396 </ListItem>
4397
4398 <ListItem>
4399 <Para>
4400 A default method must be defined entirely using type patterns, or entirely
4401 without.  So this is illegal:
4402 <ProgramListing>
4403   class Foo a where
4404     op :: a -> (a, Bool)
4405     op {| Unit |} Unit = (Unit, True)
4406     op x               = (x,    False)
4407 </ProgramListing>
4408 However it is perfectly OK for some methods of a generic class to have
4409 generic default methods and others to have polymorphic default methods.
4410 </Para>
4411 </ListItem>
4412
4413 <ListItem>
4414 <Para>
4415 The type variable(s) in the type pattern for a generic method declaration
4416 scope over the right hand side.  So this is legal (note the use of the type variable ``p'' in a type signature on the right hand side:
4417 <ProgramListing>
4418   class Foo a where
4419     op :: a -> Bool
4420     op {| p :*: q |} (x :*: y) = op (x :: p)
4421     ...
4422 </ProgramListing>
4423 </Para>
4424 </ListItem>
4425
4426 <ListItem>
4427 <Para>
4428 The type patterns in a generic default method must take one of the forms:
4429 <ProgramListing>
4430        a :+: b
4431        a :*: b
4432        Unit
4433 </ProgramListing>
4434 where "a" and "b" are type variables.  Furthermore, all the type patterns for
4435 a single type constructor (<Literal>:*:</Literal>, say) must be identical; they
4436 must use the same type variables.  So this is illegal:
4437 <ProgramListing>
4438   class Foo a where
4439     op :: a -> Bool
4440     op {| a :+: b |} (Inl x) = True
4441     op {| p :+: q |} (Inr y) = False
4442 </ProgramListing>
4443 The type patterns must be identical, even in equations for different methods of the class.
4444 So this too is illegal:
4445 <ProgramListing>
4446   class Foo a where
4447     op1 :: a -> Bool
4448     op {| a :*: b |} (Inl x) = True
4449
4450     op2 :: a -> Bool
4451     op {| p :*: q |} (Inr y) = False
4452 </ProgramListing>
4453 (The reason for this restriction is that we gather all the equations for a particular type consructor
4454 into a single generic instance declaration.)
4455 </Para>
4456 </ListItem>
4457
4458 <ListItem>
4459 <Para>
4460 A generic method declaration must give a case for each of the three type constructors.
4461 </Para>
4462 </ListItem>
4463
4464 <ListItem>
4465 <Para>
4466 In an instance declaration for a generic class, the idea is that the compiler
4467 will fill in the methods for you, based on the generic templates.  However it can only
4468 do so if
4469   <ItemizedList>
4470   <ListItem>
4471   <Para>
4472   The instance type is simple (a type constructor applied to type variables, as in Haskell 98).
4473   </Para>
4474   </ListItem>
4475   <ListItem>
4476   <Para>
4477   No constructor of the instance type has unboxed fields.
4478   </Para>
4479   </ListItem>
4480   </ItemizedList>
4481 (Of course, these things can only arise if you are already using GHC extensions.)
4482 However, you can still give an instance declarations for types which break these rules,
4483 provided you give explicit code to override any generic default methods.
4484 </Para>
4485 </ListItem>
4486
4487 </ItemizedList>
4488 </Para>
4489
4490 <Para>
4491 The option <Option>-ddump-deriv</Option> dumps incomprehensible stuff giving details of
4492 what the compiler does with generic declarations.
4493 </Para>
4494
4495 </Sect2>
4496
4497 <Sect2> <Title> Another example </Title>
4498 <Para>
4499 Just to finish with, here's another example I rather like:
4500 <ProgramListing>
4501   class Tag a where
4502     nCons :: a -> Int
4503     nCons {| Unit |}    _ = 1
4504     nCons {| a :*: b |} _ = 1
4505     nCons {| a :+: b |} _ = nCons (bot::a) + nCons (bot::b)
4506
4507     tag :: a -> Int
4508     tag {| Unit |}    _       = 1
4509     tag {| a :*: b |} _       = 1
4510     tag {| a :+: b |} (Inl x) = tag x
4511     tag {| a :+: b |} (Inr y) = nCons (bot::a) + tag y
4512 </ProgramListing>
4513 </Para>
4514 </Sect2>
4515 </Sect1>
4516
4517 <!-- Emacs stuff:
4518      ;;; Local Variables: ***
4519      ;;; mode: sgml ***
4520      ;;; sgml-parent-document: ("users_guide.sgml" "book" "chapter" "sect1") ***
4521      ;;; End: ***
4522  -->