ghc/docs/users_guide/glasgow_exts.vsgml

   1 %
   2 % $Id: glasgow_exts.vsgml,v 1.12 1999/07/14 11:33:10 simonmar Exp $
   3 %
   4 % GHC Language Extensions.
   5 %
   6
   7 As with all known Haskell systems, GHC implements some extensions to
   8 the language.  To use them, you'll need to give a @-fglasgow-exts@%
   9 <nidx>-fglasgow-exts option</nidx> option.
  10
  11 Virtually all of the Glasgow extensions serve to give you access to
  12 the underlying facilities with which we implement Haskell.  Thus, you
  13 can get at the Raw Iron, if you are willing to write some non-standard
  14 code at a more primitive level.  You need not be ``stuck'' on
  15 performance because of the implementation costs of Haskell's
  16 ``high-level'' features---you can always code ``under'' them.  In an
  17 extreme case, you can write all your time-critical code in C, and then
  18 just glue it together with Haskell!
  19
  20 Executive summary of our extensions:
  21
  22 <descrip>
  23
  24 <tag>Unboxed types and primitive operations:</tag>
  25
  26 You can get right down to the raw machine types and operations;
  27 included in this are ``primitive arrays'' (direct access to Big Wads
  28 of Bytes).  Please see Section <ref name="Unboxed types"
  29 id="glasgow-unboxed"> and following.
  30
  31 <tag>Multi-parameter type classes:</tag>
  32
  33 GHC's type system supports extended type classes with multiple
  34 parameters.  Please see Section <ref name="Mult-parameter type
  35 classes" id="multi-param-type-classes">.
  36
  37 <tag>Local universal quantification:</tag>
  38
  39 GHC's type system supports explicit universal quantification in
  40 constructor fields and function arguments.  This is useful for things
  41 like defining @runST@ from the state-thread world.  See Section <ref
  42 name="Local universal quantification" id="universal-quantification">.
  43
  44 <tag>Extistentially quantification in data types:</tag>
  45
  46 Some or all of the type variables in a datatype declaration may be
  47 <em>existentially quantified</em>.  More details in Section <ref
  48 name="Existential Quantification" id="existential-quantification">.
  49
  50 <tag>Scoped type variables:</tag>
  51
  52 Scoped type variables enable the programmer to supply type signatures
  53 for some nested declarations, where this would not be legal in Haskell
  54 98.  Details in Section <ref name="Scoped Type Variables"
  55 id="scoped-type-variables">.
  56
  57 <tag>Calling out to C:</tag>
  58
  59 Just what it sounds like.  We provide <em>lots</em> of rope that you
  60 can dangle around your neck.  Please see Section <ref name="Calling~C
  61 directly from Haskell" id="glasgow-ccalls">.
  62
  63 <tag>Pragmas</tag>
  64
  65 Pragmas are special instructions to the compiler placed in the source
  66 file.  The pragmas GHC supports are described in Section <ref
  67 name="Pragmas" id="pragmas">.
  68
  69 <tag>Rewrite rules:</tag>
  70
  71 The programmer can specify rewrite rules as part of the source program
  72 (in a pragma).  GHC applies these rewrite rules wherever it can.
  73 Details in Section <ref name="Rewrite Rules"
  74 id="rewrite-rules">.
  75
  76 </descrip>
  77
  78 Before you get too carried away working at the lowest level (e.g.,
  79 sloshing @MutableByteArray#@s around your program), you may wish to
  80 check if there are system libraries that provide a ``Haskellised
  81 veneer'' over the features you want.  See Section <ref name="GHC
  82 Prelude and libraries" id="ghc-prelude">.
  83
  84 %************************************************************************
  85 %*                                                                      *
  86 <sect1>Unboxed types
  87 <label id="glasgow-unboxed">
  88 <p>
  89 <nidx>Unboxed types (Glasgow extension)</nidx>
  90 %*                                                                      *
  91 %************************************************************************
  92
  93 These types correspond to the ``raw machine'' types you would use in
  94 C: @Int#@ (long int), @Double#@ (double), @Addr#@ (void *), etc.  The
  95 <em>primitive operations</em> (PrimOps) on these types are what you
  96 might expect; e.g., @(+#)@ is addition on @Int#@s, and is the
  97 machine-addition that we all know and love---usually one instruction.
  98
  99 There are some restrictions on the use of unboxed types, the main one
 100 being that you can't pass an unboxed value to a polymorphic function
 101 or store one in a polymorphic data type.  This rules out things like
 102 @[Int#]@ (ie. lists of unboxed integers).  The reason for this
 103 restriction is that polymorphic arguments and constructor fields are
 104 assumed to be pointers: if an unboxed integer is stored in one of
 105 these, the garbage collector would attempt to follow it, leading to
 106 unpredictable space leaks.  Or a @seq@ operation on the polymorphic
 107 component may attempt to dereference the pointer, with disastrous
 108 results.  Even worse, the unboxed value might be larger than a pointer
 109 (@Double#@ for instance).
 110
 111 Nevertheless, A numerically-intensive program using unboxed types can
 112 go a <em>lot</em> faster than its ``standard'' counterpart---we saw a
 113 threefold speedup on one example.
 114
 115 Please see Section <ref name="The module PrelGHC: really primitive
 116 stuff" id="ghc-libs-ghc"> for the details of unboxed types and the
 117 operations on them.
 118
 119 %************************************************************************
 120 %*                                                                      *
 121 <sect1>Primitive state-transformer monad
 122 <label id="glasgow-ST-monad">
 123 <p>
 124 <nidx>state transformers (Glasgow extensions)</nidx>
 125 <nidx>ST monad (Glasgow extension)</nidx>
 126 %*                                                                      *
 127 %************************************************************************
 128
 129 This monad underlies our implementation of arrays, mutable and
 130 immutable, and our implementation of I/O, including ``C calls''.
 131
 132 The @ST@ library, which provides access to the @ST@ monad, is a
 133 GHC/Hugs extension library and is described in the separate <htmlurl
 134 name="GHC/Hugs Extension Libraries" url="libs.html"> document.
 135
 136 %************************************************************************
 137 %*                                                                      *
 138 <sect1>Primitive arrays, mutable and otherwise
 139 <label id="glasgow-prim-arrays">
 140 <p>
 141 <nidx>primitive arrays (Glasgow extension)</nidx>
 142 <nidx>arrays, primitive (Glasgow extension)</nidx>
 143 %*                                                                      *
 144 %************************************************************************
 145
 146 GHC knows about quite a few flavours of Large Swathes of Bytes.
 147
 148 First, GHC distinguishes between primitive arrays of (boxed) Haskell
 149 objects (type @Array# obj@) and primitive arrays of bytes (type
 150 @ByteArray#@).
 151
 152 Second, it distinguishes between...
 153 <descrip>
 154 <tag>Immutable:</tag>
 155 Arrays that do not change (as with ``standard'' Haskell arrays); you
 156 can only read from them.  Obviously, they do not need the care and
 157 attention of the state-transformer monad.
 158
 159 <tag>Mutable:</tag>
 160 Arrays that may be changed or ``mutated.''  All the operations on them
 161 live within the state-transformer monad and the updates happen
 162 <em>in-place</em>.
 163
 164 <tag>``Static'' (in C land):</tag>
 165 A C routine may pass an @Addr#@ pointer back into Haskell land.  There
 166 are then primitive operations with which you may merrily grab values
 167 over in C land, by indexing off the ``static'' pointer.
 168
 169 <tag>``Stable'' pointers:</tag>
 170 If, for some reason, you wish to hand a Haskell pointer (i.e.,
 171 <em>not</em> an unboxed value) to a C routine, you first make the
 172 pointer ``stable,'' so that the garbage collector won't forget that it
 173 exists.  That is, GHC provides a safe way to pass Haskell pointers to
 174 C.
 175
 176 Please see Section <ref name="Subverting automatic unboxing with
 177 ``stable pointers''" id="glasgow-stablePtrs"> for more details.
 178
 179 <tag>``Foreign objects'':</tag>
 180 A ``foreign object'' is a safe way to pass an external object (a
 181 C-allocated pointer, say) to Haskell and have Haskell do the Right
 182 Thing when it no longer references the object.  So, for example, C
 183 could pass a large bitmap over to Haskell and say ``please free this
 184 memory when you're done with it.''
 185
 186 Please see Section <ref name="Pointing outside the Haskell heap"
 187 id="glasgow-foreignObjs"> for more details.
 188
 189 </descrip>
 190
 191 The libraries section gives more details on all these ``primitive
 192 array'' types and the operations on them, Section <ref name="The GHC
 193 Prelude and Libraries" id="ghc-prelude">.  Some of these extensions
 194 are also supported by Hugs, and the supporting libraries are described
 195 in the <htmlurl name="GHC/Hugs Extension Libraries" url="libs.html">
 196 document.
 197
 198 %************************************************************************
 199 %*                                                                      *
 200 <sect1>Calling~C directly from Haskell
 201 <label id="glasgow-ccalls">
 202 <p>
 203 <nidx>C calls (Glasgow extension)</nidx>
 204 <nidx>_ccall_ (Glasgow extension)</nidx>
 205 <nidx>_casm_ (Glasgow extension)</nidx>
 206 %*                                                                      *
 207 %************************************************************************
 208
 209 GOOD ADVICE: Because this stuff is not Entirely Stable as far as names
 210 and things go, you would be well-advised to keep your C-callery
 211 corraled in a few modules, rather than sprinkled all over your code.
 212 It will then be quite easy to update later on.
 213
 214 %************************************************************************
 215 %*                                                                      *
 216 <sect2>@_ccall_@ and @_casm_@: an introduction
 217 <label id="ccall-intro">
 218 <p>
 219 %*                                                                      *
 220 %************************************************************************
 221
 222 The simplest way to use a simple C function
 223
 224 <tscreen><verb>
 225 double fooC( FILE *in, char c, int i, double d, unsigned int u )
 226 </verb></tscreen>
 227
 228 is to provide a Haskell wrapper:
 229
 230 <tscreen><verb>
 231 fooH :: Char -> Int -> Double -> Word -> IO Double
 232 fooH c i d w = _ccall_ fooC (``stdin''::Addr) c i d w
 233 </verb></tscreen>
 234
 235 The function @fooH@ will unbox all of its arguments, call the C
 236 function @fooC@ and box the corresponding arguments.
 237
 238 One of the annoyances about @_ccall_@s is when the C types don't quite
 239 match the Haskell compiler's ideas.  For this, the @_casm_@ variant
 240 may be just the ticket (NB: <em>no chance</em> of such code going
 241 through a native-code generator):
 242
 243 <tscreen><verb>
 244 import Addr
 245 import CString
 246
 247 oldGetEnv name
 248   = _casm_ ``%r = getenv((char *) %0);'' name >>= \ litstring ->
 249     return (
 250         if (litstring == nullAddr) then
 251             Left ("Fail:oldGetEnv:"++name)
 252         else
 253             Right (unpackCString litstring)
 254     )
 255 </verb></tscreen>
 256
 257 The first literal-literal argument to a @_casm_@ is like a @printf@
 258 format: @%r@ is replaced with the ``result,'' @%0@--@%n-1@ are
 259 replaced with the 1st--nth arguments.  As you can see above, it is an
 260 easy way to do simple C~casting.  Everything said about @_ccall_@ goes
 261 for @_casm_@ as well.
 262
 263 The use of @_casm_@ in your code does pose a problem to the compiler
 264 when it comes to generating an interface file for a freshly compiled
 265 module. Included in an interface file is the unfolding (if any) of a
 266 declaration. However, if a declaration's unfolding happens to contain
 267 a @_casm_@, its unfolding will <em/not/ be emitted into the interface
 268 file even if it qualifies by all the other criteria. The reason why
 269 the compiler prevents this from happening is that unfolding @_casm_@s
 270 into an interface file unduly constrains how code that import your
 271 module have to be compiled. If an imported declaration is unfolded and
 272 it contains a @_casm_@, you now have to be using a compiler backend
 273 capable of dealing with it (i.e., the C compiler backend). If you are
 274 using the C compiler backend, the unfolded @_casm_@ may still cause you
 275 problems since the C code snippet it contains may mention CPP symbols
 276 that were in scope when compiling the original module are not when
 277 compiling the importing module.
 278
 279 If you're willing to put up with the drawbacks of doing cross-module
 280 inlining of C code (GHC - A Better C Compiler :-), the option
 281 @-funfold-casms-in-hi-file@ will turn off the default behaviour.
 282 <nidx>-funfold-casms-in-hi-file option</nidx>
 283
 284 %************************************************************************
 285 %*                                                                      *
 286 <sect2>Literal-literals
 287 <label id="glasgow-literal-literals">
 288 <p>
 289 <nidx>Literal-literals</nidx>
 290 %*                                                                      *
 291 %************************************************************************
 292
 293 The literal-literal argument to @_casm_@ can be made use of separately
 294 from the @_casm_@ construct itself. Indeed, we've already used it:
 295
 296 <tscreen><verb>
 297 fooH :: Char -> Int -> Double -> Word -> IO Double
 298 fooH c i d w = _ccall_ fooC (``stdin''::Addr) c i d w
 299 </verb></tscreen>
 300
 301 The first argument that's passed to @fooC@ is given as a literal-literal,
 302 that is, a literal chunk of C code that will be inserted into the generated
 303 @.hc@ code at the right place.
 304
 305 A literal-literal is restricted to having a type that's an instance of
 306 the @CCallable@ class, see <ref name="CCallable" id="ccall-gotchas">
 307 for more information.
 308
 309 Notice that literal-literals are by their very nature unfriendly to
 310 native code generators, so exercise judgement about whether or not to
 311 make use of them in your code.
 312
 313 %************************************************************************
 314 %*                                                                      *
 315 <sect2>Using function headers
 316 <label id="glasgow-foreign-headers">
 317 <p>
 318 <nidx>C calls, function headers</nidx>
 319 %*                                                                      *
 320 %************************************************************************
 321
 322 When generating C (using the @-fvia-C@ directive), one can assist the
 323 C compiler in detecting type errors by using the @-#include@ directive
 324 to provide @.h@ files containing function headers.
 325
 326 For example,
 327
 328 <tscreen><verb>
 329 typedef unsigned long *StgForeignObj;
 330 typedef long StgInt;
 331
 332 void          initialiseEFS (StgInt size);
 333 StgInt        terminateEFS (void);
 334 StgForeignObj emptyEFS(void);
 335 StgForeignObj updateEFS (StgForeignObj a, StgInt i, StgInt x);
 336 StgInt        lookupEFS (StgForeignObj a, StgInt i);
 337 </verb></tscreen>
 338
 339 You can find appropriate definitions for @StgInt@, @StgForeignObj@,
 340 etc using @gcc@ on your architecture by consulting
 341 @ghc/includes/StgTypes.h@.  The following table summarises the
 342 relationship between Haskell types and C types.
 343
 344 <tabular ca="ll">
 345 <bf>C type name</bf>      | <bf>Haskell Type</bf> @@
 346 @@
 347 @StgChar@          | @Char#@ @@
 348 @StgInt@           | @Int#@ @@
 349 @StgWord@          | @Word#@ @@
 350 @StgAddr@          | @Addr#@ @@
 351 @StgFloat@         | @Float#@ @@
 352 @StgDouble@        | @Double#@ @@
 353
 354 @StgArray@         | @Array#@ @@
 355 @StgByteArray@     | @ByteArray#@ @@
 356 @StgArray@         | @MutableArray#@ @@
 357 @StgByteArray@     | @MutableByteArray#@ @@
 358
 359 @StgStablePtr@     | @StablePtr#@ @@
 360 @StgForeignObj@    | @ForeignObj#@
 361 </tabular>
 362
 363 Note that this approach is only <em>essential</em> for returning
 364 @float@s (or if @sizeof(int) != sizeof(int *)@ on your
 365 architecture) but is a Good Thing for anyone who cares about writing
 366 solid code.  You're crazy not to do it.
 367
 368 %************************************************************************
 369 %*                                                                      *
 370 <sect2>Subverting automatic unboxing with ``stable pointers''
 371 <label id="glasgow-stablePtrs">
 372 <p>
 373 <nidx>stable pointers (Glasgow extension)</nidx>
 374 %*                                                                      *
 375 %************************************************************************
 376
 377 The arguments of a @_ccall_@ are automatically unboxed before the
 378 call.  There are two reasons why this is usually the Right Thing to
 379 do:
 380
 381 <itemize>
 382 <item>
 383 C is a strict language: it would be excessively tedious to pass
 384 unevaluated arguments and require the C programmer to force their
 385 evaluation before using them.
 386
 387 <item> Boxed values are stored on the Haskell heap and may be moved
 388 within the heap if a garbage collection occurs---that is, pointers
 389 to boxed objects are not <em>stable</em>.
 390 </itemize>
 391
 392 It is possible to subvert the unboxing process by creating a ``stable
 393 pointer'' to a value and passing the stable pointer instead.  For
 394 example, to pass/return an integer lazily to C functions @storeC@ and
 395 @fetchC@, one might write:
 396
 397 <tscreen><verb>
 398 storeH :: Int -> IO ()
 399 storeH x = makeStablePtr x              >>= \ stable_x ->
 400            _ccall_ storeC stable_x
 401
 402 fetchH :: IO Int
 403 fetchH x = _ccall_ fetchC               >>= \ stable_x ->
 404            deRefStablePtr stable_x      >>= \ x ->
 405            freeStablePtr stable_x       >>
 406            return x
 407 </verb></tscreen>
 408
 409 The garbage collector will refrain from throwing a stable pointer away
 410 until you explicitly call one of the following from C or Haskell.
 411
 412 <tscreen><verb>
 413 void freeStablePointer( StgStablePtr stablePtrToToss )
 414 freeStablePtr :: StablePtr a -> IO ()
 415 </verb></tscreen>
 416
 417 As with the use of @free@ in C programs, GREAT CARE SHOULD BE
 418 EXERCISED to ensure these functions are called at the right time: too
 419 early and you get dangling references (and, if you're lucky, an error
 420 message from the runtime system); too late and you get space leaks.
 421
 422 And to force evaluation of the argument within @fooC@, one would
 423 call one of the following C functions (according to type of argument).
 424
 425 <tscreen><verb>
 426 void     performIO  ( StgStablePtr stableIndex /* StablePtr s (IO ()) */ );
 427 StgInt   enterInt   ( StgStablePtr stableIndex /* StablePtr s Int */ );
 428 StgFloat enterFloat ( StgStablePtr stableIndex /* StablePtr s Float */ );
 429 </verb></tscreen>
 430
 431 <nidx>performIO</nidx>
 432 <nidx>enterInt</nidx>
 433 <nidx>enterFloat</nidx>
 434
 435 Note Bene: @_ccall_GC_@<nidx>_ccall_GC_</nidx> must be used if any of
 436 these functions are used.
 437
 438 %************************************************************************
 439 %*                                                                      *
 440 <sect2>Foreign objects: pointing outside the Haskell heap
 441 <label id="glasgow-foreignObjs">
 442 <p>
 443 <nidx>foreign objects (Glasgow extension)</nidx>
 444 %*                                                                      *
 445 %************************************************************************
 446
 447 There are two types that @ghc@ programs can use to reference
 448 (heap-allocated) objects outside the Haskell world: @Addr@ and
 449 @ForeignObj@.
 450
 451 If you use @Addr@, it is up to you to the programmer to arrange
 452 allocation and deallocation of the objects.
 453
 454 If you use @ForeignObj@, @ghc@'s garbage collector will call upon the
 455 user-supplied <em>finaliser</em> function to free the object when the
 456 Haskell world no longer can access the object.  (An object is
 457 associated with a finaliser function when the abstract
 458  Haskell type @ForeignObj@ is created). The finaliser function is
 459 expressed in C, and is passed as argument the object:
 460
 461 <tscreen><verb>
 462 void foreignFinaliser ( StgForeignObj fo )
 463 </verb></tscreen>
 464
 465 when the Haskell world can no longer access the object.  Since
 466 @ForeignObj@s only get released when a garbage collection occurs, we
 467 provide ways of triggering a garbage collection from within C and from
 468 within Haskell.
 469
 470 <tscreen><verb>
 471 void GarbageCollect()
 472 performGC :: IO ()
 473 </verb></tscreen>
 474
 475 More information on the programmers' interface to @ForeignObj@ can be
 476 found in the library documentation.
 477
 478 %************************************************************************
 479 %*                                                                      *
 480 <sect2>Avoiding monads
 481 <label id="glasgow-avoiding-monads">
 482 <p>
 483 <nidx>C calls to `pure C'</nidx>
 484 <nidx>unsafePerformIO</nidx>
 485 %*                                                                      *
 486 %************************************************************************
 487
 488 The @_ccall_@ construct is part of the @IO@ monad because 9 out of 10
 489 uses will be to call imperative functions with side effects such as
 490 @printf@.  Use of the monad ensures that these operations happen in a
 491 predictable order in spite of laziness and compiler optimisations.
 492
 493 To avoid having to be in the monad to call a C function, it is
 494 possible to use @unsafePerformIO@, which is available from the
 495 @IOExts@ module.  There are three situations where one might like to
 496 call a C function from outside the IO world:
 497
 498 <itemize>
 499 <item>
 500 Calling a function with no side-effects:
 501 <tscreen><verb>
 502 atan2d :: Double -> Double -> Double
 503 atan2d y x = unsafePerformIO (_ccall_ atan2d y x)
 504
 505 sincosd :: Double -> (Double, Double)
 506 sincosd x = unsafePerformIO $ do
 507         da <- newDoubleArray (0, 1)
 508         _casm_ ``sincosd( %0, &((double *)%1[0]), &((double *)%1[1]) );'' x da
 509         s <- readDoubleArray da 0
 510         c <- readDoubleArray da 1
 511         return (s, c)
 512 </verb></tscreen>
 513
 514 <item> Calling a set of functions which have side-effects but which can
 515 be used in a purely functional manner.
 516
 517 For example, an imperative implementation of a purely functional
 518 lookup-table might be accessed using the following functions.
 519
 520 <tscreen><verb>
 521 empty  :: EFS x
 522 update :: EFS x -> Int -> x -> EFS x
 523 lookup :: EFS a -> Int -> a
 524
 525 empty = unsafePerformIO (_ccall_ emptyEFS)
 526
 527 update a i x = unsafePerformIO $
 528         makeStablePtr x         >>= \ stable_x ->
 529         _ccall_ updateEFS a i stable_x
 530
 531 lookup a i = unsafePerformIO $
 532         _ccall_ lookupEFS a i   >>= \ stable_x ->
 533         deRefStablePtr stable_x
 534 </verb></tscreen>
 535
 536 You will almost always want to use @ForeignObj@s with this.
 537
 538 <item> Calling a side-effecting function even though the results will
 539 be unpredictable.  For example the @trace@ function is defined by:
 540
 541 <tscreen><verb>
 542 trace :: String -> a -> a
 543 trace string expr
 544   = unsafePerformIO (
 545         ((_ccall_ PreTraceHook sTDERR{-msg-}):: IO ())  >>
 546         fputs sTDERR string                             >>
 547         ((_ccall_ PostTraceHook sTDERR{-msg-}):: IO ()) >>
 548         return expr )
 549   where
 550     sTDERR = (``stderr'' :: Addr)
 551 </verb></tscreen>
 552
 553 (This kind of use is not highly recommended --- it is only really
 554 useful in debugging code.)
 555 </itemize>
 556
 557 %************************************************************************
 558 %*                                                                      *
 559 <sect2>C-calling ``gotchas'' checklist
 560 <label id="ccall-gotchas">
 561 <p>
 562 <nidx>C call dangers</nidx>
 563 <nidx>CCallable</nidx>
 564 <nidx>CReturnable</nidx>
 565 %*                                                                      *
 566 %************************************************************************
 567
 568 And some advice, too.
 569
 570 <itemize>
 571 <item> For modules that use @_ccall_@s, etc., compile with
 572 @-fvia-C@.<nidx>-fvia-C option</nidx> You don't have to, but you should.
 573
 574 Also, use the @-#include "prototypes.h"@ flag (hack) to inform the C
 575 compiler of the fully-prototyped types of all the C functions you
 576 call.  (Section <ref name="Using function headers"
 577 id="glasgow-foreign-headers"> says more about this...)
 578
 579 This scheme is the <em>only</em> way that you will get <em>any</em>
 580 typechecking of your @_ccall_@s.  (It shouldn't be that way, but...).
 581 GHC will pass the flag @-Wimplicit@ to gcc so that you'll get warnings
 582 if any @_ccall_@ed functions have no prototypes.
 583
 584 <item>
 585 Try to avoid @_ccall_@s to C~functions that take @float@
 586 arguments or return @float@ results.  Reason: if you do, you will
 587 become entangled in (ANSI?) C's rules for when arguments/results are
 588 promoted to @doubles@.  It's a nightmare and just not worth it.
 589 Use @doubles@ if possible.
 590
 591 If you do use @floats@, check and re-check that the right thing is
 592 happening.  Perhaps compile with @-keep-hc-file-too@ and look at
 593 the intermediate C (@.hc@ file).
 594
 595 <item> The compiler uses two non-standard type-classes when
 596 type-checking the arguments and results of @_ccall_@: the arguments
 597 (respectively result) of @_ccall_@ must be instances of the class
 598 @CCallable@ (respectively @CReturnable@).  Both classes may be
 599 imported from the module @CCall@, but this should only be
 600 necessary if you want to define a new instance.  (Neither class
 601 defines any methods --- their only function is to keep the
 602 type-checker happy.)
 603
 604 The type checker must be able to figure out just which of the
 605 C-callable/returnable types is being used.  If it can't, you have to
 606 add type signatures. For example,
 607
 608 <tscreen><verb>
 609 f x = _ccall_ foo x
 610 </verb></tscreen>
 611
 612 is not good enough, because the compiler can't work out what type @x@
 613 is, nor what type the @_ccall_@ returns.  You have to write, say:
 614
 615 <tscreen><verb>
 616 f :: Int -> IO Double
 617 f x = _ccall_ foo x
 618 </verb></tscreen>
 619
 620 This table summarises the standard instances of these classes.
 621
 622 % ToDo: check this table against implementation!
 623
 624 <tabular ca="llll">
 625 <bf>Type</bf>       |<bf>CCallable</bf>|<bf>CReturnable</bf> | <bf>Which is probably...</bf> @@
 626
 627 @Char@              | Yes  | Yes   | @unsigned char@ @@
 628 @Int@               | Yes  | Yes   | @long int@ @@
 629 @Word@              | Yes  | Yes   | @unsigned long int@ @@
 630 @Addr@              | Yes  | Yes   | @void *@ @@
 631 @Float@             | Yes  | Yes   | @float@ @@
 632 @Double@            | Yes  | Yes   | @double@ @@
 633 @()@                | No   | Yes   | @void@ @@
 634 @[Char]@            | Yes  | No    | @char *@ (null-terminated) @@
 635
 636 @Array@             | Yes  | No    | @unsigned long *@ @@
 637 @ByteArray@         | Yes  | No    | @unsigned long *@ @@
 638 @MutableArray@      | Yes  | No    | @unsigned long *@ @@
 639 @MutableByteArray@  | Yes  | No    | @unsigned long *@ @@
 640
 641 @State@             | Yes  | Yes   | nothing!@@
 642
 643 @StablePtr@         | Yes  | Yes   | @unsigned long *@ @@
 644 @ForeignObjs@       | Yes  | Yes   | see later @@
 645 </tabular>
 646
 647 Actually, the @Word@ type is defined as being the same size as a
 648 pointer on the target architecture, which is <em>probably</em>
 649 @unsigned long int@.
 650
 651 The brave and careful programmer can add their own instances of these
 652 classes for the following types:
 653
 654 <itemize>
 655 <item>
 656 A <em>boxed-primitive</em> type may be made an instance of both
 657 @CCallable@ and @CReturnable@.
 658
 659 A boxed primitive type is any data type with a
 660 single unary constructor with a single primitive argument.  For
 661 example, the following are all boxed primitive types:
 662
 663 <tscreen><verb>
 664 Int
 665 Double
 666 data XDisplay = XDisplay Addr#
 667 data EFS a = EFS# ForeignObj#
 668 </verb></tscreen>
 669
 670 <tscreen><verb>
 671 instance CCallable   (EFS a)
 672 instance CReturnable (EFS a)
 673 </verb></tscreen>
 674
 675 <item> Any datatype with a single nullary constructor may be made an
 676 instance of @CReturnable@.  For example:
 677
 678 <tscreen><verb>
 679 data MyVoid = MyVoid
 680 instance CReturnable MyVoid
 681 </verb></tscreen>
 682
 683 <item> As at version 2.09, @String@ (i.e., @[Char]@) is still
 684 not a @CReturnable@ type.
 685
 686 Also, the now-builtin type @PackedString@ is neither
 687 @CCallable@ nor @CReturnable@.  (But there are functions in
 688 the PackedString interface to let you get at the necessary bits...)
 689 </itemize>
 690
 691 <item> The code-generator will complain if you attempt to use @%r@ in
 692 a @_casm_@ whose result type is @IO ()@; or if you don't use @%r@
 693 <em>precisely</em> once for any other result type.  These messages are
 694 supposed to be helpful and catch bugs---please tell us if they wreck
 695 your life.
 696
 697 <item> If you call out to C code which may trigger the Haskell garbage
 698 collector or create new threads (examples of this later...), then you
 699 must use the @_ccall_GC_@<nidx>_ccall_GC_ primitive</nidx> or
 700 @_casm_GC_@<nidx>_casm_GC_ primitive</nidx> variant of C-calls.  (This
 701 does not work with the native code generator - use @\fvia-C@.) This
 702 stuff is hairy with a capital H!  </itemize>
 703
 704 <sect1> Multi-parameter type classes
 705 <label id="multi-param-type-classes">
 706 <p>
 707
 708 This section documents GHC's implementation of multi-paramter type
 709 classes.  There's lots of background in the paper <url name="Type
 710 classes: exploring the design space"
 711 url="http://www.dcs.gla.ac.uk/~simonpj/multi.ps.gz"> (Simon Peyton
 712 Jones, Mark Jones, Erik Meijer).
 713
 714 I'd like to thank people who reported shorcomings in the GHC 3.02
 715 implementation.  Our default decisions were all conservative ones, and
 716 the experience of these heroic pioneers has given useful concrete
 717 examples to support several generalisations.  (These appear below as
 718 design choices not implemented in 3.02.)
 719
 720 I've discussed these notes with Mark Jones, and I believe that Hugs
 721 will migrate towards the same design choices as I outline here.
 722 Thanks to him, and to many others who have offered very useful
 723 feedback.
 724
 725 <sect2>Types
 726 <p>
 727
 728 There are the following restrictions on the form of a qualified
 729 type:
 730
 731 <tscreen><verb>
 732   forall tv1..tvn (c1, ...,cn) => type
 733 </verb></tscreen>
 734
 735 (Here, I write the "foralls" explicitly, although the Haskell source
 736 language omits them; in Haskell 1.4, all the free type variables of an
 737 explicit source-language type signature are universally quantified,
 738 except for the class type variables in a class declaration.  However,
 739 in GHC, you can give the foralls if you want.  See Section <ref
 740 name="Explicit universal quantification"
 741 id="universal-quantification">).
 742
 743 <enum>
 744
 745 <item> <bf>Each universally quantified type variable
 746 @tvi@ must be mentioned (i.e. appear free) in @type@</bf>.
 747
 748 The reason for this is that a value with a type that does not obey
 749 this restriction could not be used without introducing
 750 ambiguity. Here, for example, is an illegal type:
 751
 752 <tscreen><verb>
 753   forall a. Eq a => Int
 754 </verb></tscreen>
 755
 756 When a value with this type was used, the constraint <tt>Eq tv</tt>
 757 would be introduced where <tt>tv</tt> is a fresh type variable, and
 758 (in the dictionary-translation implementation) the value would be
 759 applied to a dictionary for <tt>Eq tv</tt>.  The difficulty is that we
 760 can never know which instance of <tt>Eq</tt> to use because we never
 761 get any more information about <tt>tv</tt>.
 762
 763 <item> <bf>Every constraint @ci@ must mention at least one of the
 764 universally quantified type variables @tvi@</bf>.
 765
 766 For example, this type is OK because <tt>C a b</tt> mentions the
 767 universally quantified type variable <tt>b</tt>:
 768
 769 <tscreen><verb>
 770   forall a. C a b => burble
 771 </verb></tscreen>
 772
 773 The next type is illegal because the constraint <tt>Eq b</tt> does not
 774 mention <tt>a</tt>:
 775
 776 <tscreen><verb>
 777   forall a. Eq b => burble
 778 </verb></tscreen>
 779
 780 The reason for this restriction is milder than the other one.  The
 781 excluded types are never useful or necessary (because the offending
 782 context doesn't need to be witnessed at this point; it can be floated
 783 out).  Furthermore, floating them out increases sharing. Lastly,
 784 excluding them is a conservative choice; it leaves a patch of
 785 territory free in case we need it later.
 786
 787 </enum>
 788
 789 These restrictions apply to all types, whether declared in a type signature
 790 or inferred.
 791
 792 Unlike Haskell 1.4, constraints in types do <bf>not</bf> have to be of
 793 the form <em>(class type-variables)</em>.  Thus, these type signatures
 794 are perfectly OK
 795
 796 <tscreen><verb>
 797   f :: Eq (m a) => [m a] -> [m a]
 798   g :: Eq [a] => ...
 799 </verb></tscreen>
 800
 801 This choice recovers principal types, a property that Haskell 1.4 does not have.
 802
 803 <sect2>Class declarations
 804 <p>
 805
 806 <enum>
 807
 808 <item> <bf>Multi-parameter type classes are permitted</bf>. For example:
 809
 810 <tscreen><verb>
 811   class Collection c a where
 812     union :: c a -> c a -> c a
 813     ...etc..
 814 </verb></tscreen>
 815
 816
 817 <item> <bf>The class hierarchy must be acyclic</bf>.  However, the definition
 818 of "acyclic" involves only the superclass relationships.  For example,
 819 this is OK:
 820
 821 <tscreen><verb>
 822   class C a where {
 823     op :: D b => a -> b -> b
 824   }
 825
 826   class C a => D a where { ... }
 827 </verb></tscreen>
 828
 829 Here, <tt>C</tt> is a superclass of <tt>D</tt>, but it's OK for a
 830 class operation <tt>op</tt> of <tt>C</tt> to mention <tt>D</tt>.  (It
 831 would not be OK for <tt>D</tt> to be a superclass of <tt>C</tt>.)
 832
 833 <item> <bf>There are no restrictions on the context in a class declaration
 834 (which introduces superclasses), except that the class hierarchy must
 835 be acyclic</bf>.  So these class declarations are OK:
 836
 837 <tscreen><verb>
 838   class Functor (m k) => FiniteMap m k where
 839     ...
 840
 841   class (Monad m, Monad (t m)) => Transform t m where
 842     lift :: m a -> (t m) a
 843 </verb></tscreen>
 844
 845 <item> <bf>In the signature of a class operation, every constraint
 846 must mention at least one type variable that is not a class type
 847 variable</bf>.
 848
 849 Thus:
 850
 851 <tscreen><verb>
 852   class Collection c a where
 853     mapC :: Collection c b => (a->b) -> c a -> c b
 854 </verb></tscreen>
 855
 856 is OK because the constraint <tt>(Collection a b)</tt> mentions
 857 <tt>b</tt>, even though it also mentions the class variable
 858 <tt>a</tt>.  On the other hand:
 859
 860 <tscreen><verb>
 861   class C a where
 862     op :: Eq a => (a,b) -> (a,b)
 863 </verb></tscreen>
 864
 865 is not OK because the constraint <tt>(Eq a)</tt> mentions on the class
 866 type variable <tt>a</tt>, but not <tt>b</tt>.  However, any such
 867 example is easily fixed by moving the offending context up to the
 868 superclass context:
 869
 870 <tscreen><verb>
 871   class Eq a => C a where
 872     op ::(a,b) -> (a,b)
 873 </verb></tscreen>
 874
 875 A yet more relaxed rule would allow the context of a class-op signature
 876 to mention only class type variables.  However, that conflicts with
 877 Rule 1(b) for types above.
 878
 879 <item> <bf>The type of each class operation must mention <em/all/ of
 880 the class type variables</bf>.  For example:
 881
 882 <tscreen><verb>
 883   class Coll s a where
 884     empty  :: s
 885     insert :: s -> a -> s
 886 </verb></tscreen>
 887
 888 is not OK, because the type of <tt>empty</tt> doesn't mention
 889 <tt>a</tt>.  This rule is a consequence of Rule 1(a), above, for
 890 types, and has the same motivation.
 891
 892 Sometimes, offending class declarations exhibit misunderstandings.  For
 893 example, <tt>Coll</tt> might be rewritten
 894
 895 <tscreen><verb>
 896   class Coll s a where
 897     empty  :: s a
 898     insert :: s a -> a -> s a
 899 </verb></tscreen>
 900
 901 which makes the connection between the type of a collection of
 902 <tt>a</tt>'s (namely <tt>(s a)</tt>) and the element type <tt>a</tt>.
 903 Occasionally this really doesn't work, in which case you can split the
 904 class like this:
 905
 906 <tscreen><verb>
 907   class CollE s where
 908     empty  :: s
 909
 910   class CollE s => Coll s a where
 911     insert :: s -> a -> s
 912 </verb></tscreen>
 913
 914 </enum>
 915
 916 <sect2>Instance declarations
 917 <p>
 918
 919 <enum>
 920
 921 <item> <bf>Instance declarations may not overlap</bf>.  The two instance
 922 declarations
 923
 924 <tscreen><verb>
 925   instance context1 => C type1 where ...
 926   instance context2 => C type2 where ...
 927 </verb></tscreen>
 928
 929 "overlap" if @type1@ and @type2@ unify
 930
 931 However, if you give the command line option
 932 @-fallow-overlapping-instances@<nidx>-fallow-overlapping-instances
 933 option</nidx> then two overlapping instance declarations are permitted
 934 iff
 935
 936 <itemize>
 937 <item> EITHER @type1@ and @type2@ do not unify
 938 <item> OR @type2@ is a substitution instance of @type1@
 939                 (but not identical to @type1@)
 940 <item> OR vice versa
 941 </itemize>
 942
 943 Notice that these rules
 944
 945 <itemize>
 946 <item> make it clear which instance decl to use
 947            (pick the most specific one that matches)
 948
 949 <item> do not mention the contexts @context1@, @context2@
 950             Reason: you can pick which instance decl
 951             "matches" based on the type.
 952 </itemize>
 953
 954 Regrettably, GHC doesn't guarantee to detect overlapping instance
 955 declarations if they appear in different modules.  GHC can "see" the
 956 instance declarations in the transitive closure of all the modules
 957 imported by the one being compiled, so it can "see" all instance decls
 958 when it is compiling <tt>Main</tt>.  However, it currently chooses not
 959 to look at ones that can't possibly be of use in the module currently
 960 being compiled, in the interests of efficiency.  (Perhaps we should
 961 change that decision, at least for <tt>Main</tt>.)
 962
 963 <item> <bf>There are no restrictions on the type in an instance
 964 <em/head/, except that at least one must not be a type variable</bf>.
 965 The instance "head" is the bit after the "=>" in an instance decl. For
 966 example, these are OK:
 967
 968 <tscreen><verb>
 969   instance C Int a where ...
 970
 971   instance D (Int, Int) where ...
 972
 973   instance E [[a]] where ...
 974 </verb></tscreen>
 975
 976 Note that instance heads <bf>may</bf> contain repeated type variables.
 977 For example, this is OK:
 978
 979 <tscreen><verb>
 980   instance Stateful (ST s) (MutVar s) where ...
 981 </verb></tscreen>
 982
 983 The "at least one not a type variable" restriction is to ensure that
 984 context reduction terminates: each reduction step removes one type
 985 constructor.  For example, the following would make the type checker
 986 loop if it wasn't excluded:
 987
 988 <tscreen><verb>
 989   instance C a => C a where ...
 990 </verb></tscreen>
 991
 992 There are two situations in which the rule is a bit of a pain. First,
 993 if one allows overlapping instance declarations then it's quite
 994 convenient to have a "default instance" declaration that applies if
 995 something more specific does not:
 996
 997 <tscreen><verb>
 998   instance C a where
 999     op = ... -- Default
1000 </verb></tscreen>
1001
1002 Second, sometimes you might want to use the following to get the
1003 effect of a "class synonym":
1004
1005 <tscreen><verb>
1006   class (C1 a, C2 a, C3 a) => C a where { }
1007
1008   instance (C1 a, C2 a, C3 a) => C a where { }
1009 </verb></tscreen>
1010
1011 This allows you to write shorter signatures:
1012
1013 <tscreen><verb>
1014   f :: C a => ...
1015 </verb></tscreen>
1016
1017 instead of
1018
1019 <tscreen><verb>
1020   f :: (C1 a, C2 a, C3 a) => ...
1021 </verb></tscreen>
1022
1023 I'm on the lookout for a simple rule that preserves decidability while
1024 allowing these idioms.  The experimental flag
1025 @-fallow-undecidable-instances@<nidx>-fallow-undecidable-instances
1026 option</nidx> lifts this restriction, allowing all the types in an
1027 instance head to be type variables.
1028
1029 <item> <bf>Unlike Haskell 1.4, instance heads may use type
1030 synonyms</bf>.  As always, using a type synonym is just shorthand for
1031 writing the RHS of the type synonym definition.  For example:
1032
1033 <tscreen><verb>
1034   type Point = (Int,Int)
1035   instance C Point   where ...
1036   instance C [Point] where ...
1037 </verb></tscreen>
1038
1039 is legal.  However, if you added
1040
1041 <tscreen><verb>
1042   instance C (Int,Int) where ...
1043 </verb></tscreen>
1044
1045 as well, then the compiler will complain about the overlapping
1046 (actually, identical) instance declarations.  As always, type synonyms
1047 must be fully applied.  You cannot, for example, write:
1048
1049 <tscreen><verb>
1050   type P a = [[a]]
1051   instance Monad P where ...
1052 </verb></tscreen>
1053
1054 This design decision is independent of all the others, and easily
1055 reversed, but it makes sense to me.
1056
1057 <item><bf>The types in an instance-declaration <em/context/ must all
1058 be type variables</bf>. Thus
1059
1060 <tscreen><verb>
1061   instance C a b => Eq (a,b) where ...
1062 </verb></tscreen>
1063
1064 is OK, but
1065
1066 <tscreen><verb>
1067   instance C Int b => Foo b where ...
1068 </verb></tscreen>
1069
1070 is not OK.  Again, the intent here is to make sure that context
1071 reduction terminates.
1072
1073 Voluminous correspondence on the Haskell mailing list has convinced me
1074 that it's worth experimenting with a more liberal rule.  If you use
1075 the flag <tt>-fallow-undecidable-instances</tt> you can use arbitrary
1076 types in an instance context.  Termination is ensured by having a
1077 fixed-depth recursion stack.  If you exceed the stack depth you get a
1078 sort of backtrace, and the opportunity to increase the stack depth
1079 with <tt>-fcontext-stack</tt><em/N/.
1080
1081 </enum>
1082
1083 % -----------------------------------------------------------------------------
1084 <sect1>Explicit universal quantification
1085 <label id="universal-quantification">
1086 <p>
1087
1088 GHC now allows you to write explicitly quantified types.  GHC's
1089 syntax for this now agrees with Hugs's, namely:
1090
1091 <tscreen><verb>
1092         forall a b. (Ord a, Eq  b) => a -> b -> a
1093 </verb></tscreen>
1094
1095 The context is, of course, optional.  You can't use <tt>forall</tt> as
1096 a type variable any more!
1097
1098 Haskell type signatures are implicitly quantified.  The <tt>forall</tt>
1099 allows us to say exactly what this means.  For example:
1100
1101 <tscreen><verb>
1102         g :: b -> b
1103 </verb></tscreen>
1104
1105 means this:
1106
1107 <tscreen><verb>
1108         g :: forall b. (b -> b)
1109 </verb></tscreen>
1110
1111 The two are treated identically.
1112
1113 <sect2>Universally-quantified data type fields
1114 <label id="univ">
1115 <p>
1116
1117 In a <tt>data</tt> or <tt>newtype</tt> declaration one can quantify
1118 the types of the constructor arguments.  Here are several examples:
1119
1120 <tscreen><verb>
1121 data T a = T1 (forall b. b -> b -> b) a
1122
1123 data MonadT m = MkMonad { return :: forall a. a -> m a,
1124                           bind   :: forall a b. m a -> (a -> m b) -> m b
1125                         }
1126
1127 newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
1128 </verb></tscreen>
1129
1130 The constructors now have so-called <em/rank 2/ polymorphic
1131 types, in which there is a for-all in the argument types.:
1132
1133 <tscreen><verb>
1134 T1 :: forall a. (forall b. b -> b -> b) -> a -> T1 a
1135 MkMonad :: forall m. (forall a. a -> m a)
1136                   -> (forall a b. m a -> (a -> m b) -> m b)
1137                   -> MonadT m
1138 MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
1139 </verb></tscreen>
1140
1141 Notice that you don't need to use a <tt>forall</tt> if there's an
1142 explicit context.  For example in the first argument of the
1143 constructor <tt>MkSwizzle</tt>, an implicit "<tt>forall a.</tt>" is
1144 prefixed to the argument type.  The implicit <tt>forall</tt>
1145 quantifies all type variables that are not already in scope, and are
1146 mentioned in the type quantified over.
1147
1148 As for type signatures, implicit quantification happens for non-overloaded
1149 types too.  So if you write this:
1150 <tscreen><verb>
1151   data T a = MkT (Either a b) (b -> b)
1152 </verb></tscreen>
1153 it's just as if you had written this:
1154 <tscreen><verb>
1155   data T a = MkT (forall b. Either a b) (forall b. b -> b)
1156 </verb></tscreen>
1157 That is, since the type variable <tt>b</tt> isn't in scope, it's
1158 implicitly universally quantified.  (Arguably, it would be better
1159 to <em>require</em> explicit quantification on constructor arguments
1160 where that is what is wanted.  Feedback welcomed.)
1161
1162 <sect2> Construction
1163 <p>
1164
1165 You construct values of types <tt>T1, MonadT, Swizzle</tt> by applying
1166 the constructor to suitable values, just as usual.  For example,
1167
1168 <tscreen><verb>
1169 (T1 (\xy->x) 3) :: T Int
1170
1171 (MkSwizzle sort)    :: Swizzle
1172 (MkSwizzle reverse) :: Swizzle
1173
1174 (let r x = Just x
1175      b m k = case m of
1176                 Just y -> k y
1177                 Nothing -> Nothing
1178   in
1179   MkMonad r b) :: MonadT Maybe
1180 </verb></tscreen>
1181
1182 The type of the argument can, as usual, be more general than the type
1183 required, as <tt>(MkSwizzle reverse)</tt> shows.  (<tt>reverse</tt>
1184 does not need the <tt>Ord</tt> constraint.)
1185
1186 <sect2>Pattern matching
1187 <p>
1188
1189 When you use pattern matching, the bound variables may now have
1190 polymorphic types.  For example:
1191
1192 <tscreen><verb>
1193         f :: T a -> a -> (a, Char)
1194         f (T1 f k) x = (f k x, f 'c' 'd')
1195
1196         g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
1197         g (MkSwizzle s) xs f = s (map f (s xs))
1198
1199         h :: MonadT m -> [m a] -> m [a]
1200         h m [] = return m []
1201         h m (x:xs) = bind m x           $ \y ->
1202                       bind m (h m xs)   $ \ys ->
1203                       return m (y:ys)
1204 </verb></tscreen>
1205
1206 In the function <tt>h</tt> we use the record selectors <tt>return</tt>
1207 and <tt>bind</tt> to extract the polymorphic bind and return functions
1208 from the <tt>MonadT</tt> data structure, rather than using pattern
1209 matching.
1210
1211 You cannot pattern-match against an argument that is polymorphic.
1212 For example:
1213 <tscreen><verb>
1214         newtype TIM s a = TIM (ST s (Maybe a))
1215
1216         runTIM :: (forall s. TIM s a) -> Maybe a
1217         runTIM (TIM m) = runST m
1218 </verb></tscreen>
1219
1220 Here the pattern-match fails, because you can't pattern-match against
1221 an argument of type <tt>(forall s. TIM s a)</tt>.  Instead you
1222 must bind the variable and pattern match in the right hand side:
1223 <tscreen><verb>
1224         runTIM :: (forall s. TIM s a) -> Maybe a
1225         runTIM tm = case tm of { TIM m -> runST m }
1226 </verb></tscreen>
1227 The <tt>tm</tt> on the right hand side is (invisibly) instantiated, like
1228 any polymorphic value at its occurrence site, and now you can pattern-match
1229 against it.
1230
1231 <sect2>The partial-application restriction
1232 <p>
1233
1234 There is really only one way in which data structures with polymorphic
1235 components might surprise you: you must not partially apply them.
1236 For example, this is illegal:
1237
1238 <tscreen><verb>
1239         map MkSwizzle [sort, reverse]
1240 </verb></tscreen>
1241
1242 The restriction is this: <em>every subexpression of the program must
1243 have a type that has no for-alls, except that in a function
1244 application (f e1 ... en) the partial applications are not subject to
1245 this rule</em>.  The restriction makes type inference feasible.
1246
1247 In the illegal example, the sub-expression <tt>MkSwizzle</tt> has the
1248 polymorphic type <tt>(Ord b => [b] -> [b]) -> Swizzle</tt> and is not
1249 a sub-expression of an enclosing application.  On the other hand, this
1250 expression is OK:
1251
1252 <tscreen><verb>
1253         map (T1 (\a b -> a)) [1,2,3]
1254 </verb></tscreen>
1255
1256 even though it involves a partial application of <tt>T1</tt>, because
1257 the sub-expression <tt>T1 (\a b -> a)</tt> has type <tt>Int -> T
1258 Int</tt>.
1259
1260 <sect2>Type signatures
1261 <label id="sigs">
1262 <p>
1263
1264 Once you have data constructors with universally-quantified fields, or
1265 constants such as <tt>runST</tt> that have rank-2 types, it isn't long
1266 before you discover that you need more!  Consider:
1267
1268 <tscreen><verb>
1269   mkTs f x y = [T1 f x, T1 f y]
1270 </verb></tscreen>
1271
1272 <tt>mkTs</tt> is a fuction that constructs some values of type
1273 <tt>T</tt>, using some pieces passed to it.  The trouble is that since
1274 <tt>f</tt> is a function argument, Haskell assumes that it is
1275 monomorphic, so we'll get a type error when applying <tt>T1</tt> to
1276 it.  This is a rather silly example, but the problem really bites in
1277 practice.  Lots of people trip over the fact that you can't make
1278 "wrappers functions" for <tt>runST</tt> for exactly the same reason.
1279 In short, it is impossible to build abstractions around functions with
1280 rank-2 types.
1281
1282 The solution is fairly clear.  We provide the ability to give a rank-2
1283 type signature for <em>ordinary</em> functions (not only data
1284 constructors), thus:
1285
1286 <tscreen><verb>
1287   mkTs :: (forall b. b -> b -> b) -> a -> [T a]
1288   mkTs f x y = [T1 f x, T1 f y]
1289 </verb></tscreen>
1290
1291 This type signature tells the compiler to attribute <tt>f</tt> with
1292 the polymorphic type <tt>(forall b. b -> b -> b)</tt> when type
1293 checking the body of <tt>mkTs</tt>, so now the application of
1294 <tt>T1</tt> is fine.
1295
1296 There are two restrictions:
1297
1298 <itemize>
1299 <item> You can only define a rank 2 type, specified by the following
1300 grammar:
1301
1302 <tscreen><verb>
1303    rank2type ::= [forall tyvars .] [context =>] funty
1304    funty     ::= ([forall tyvars .] [context =>] ty) -> funty
1305                | ty
1306    ty        ::= ...current Haskell monotype syntax...
1307 </verb></tscreen>
1308
1309 Informally, the universal quantification must all be right at the beginning,
1310 or at the top level of a function argument.
1311
1312 <item> There is a restriction on the definition of a function whose
1313 type signature is a rank-2 type: the polymorphic arguments must be
1314 matched on the left hand side of the "<tt>=</tt>" sign.  You can't
1315 define <tt>mkTs</tt> like this:
1316
1317 <tscreen><verb>
1318   mkTs :: (forall b. b -> b -> b) -> a -> [T a]
1319   mkTs = \ f x y -> [T1 f x, T1 f y]
1320 </verb></tscreen>
1321
1322
1323 The same partial-application rule applies to ordinary functions with
1324 rank-2 types as applied to data constructors.
1325
1326 </itemize>
1327
1328 % -----------------------------------------------------------------------------
1329 <sect1>Existentially quantified data constructors
1330 <label id="existential-quantification">
1331 <p>
1332
1333 The idea of using existential quantification in data type declarations
1334 was suggested by Laufer (I believe, thought doubtless someone will
1335 correct me), and implemented in Hope+. It's been in Lennart
1336 Augustsson's <tt>hbc</tt> Haskell compiler for several years, and
1337 proved very useful.  Here's the idea.  Consider the declaration:
1338
1339 <tscreen><verb>
1340   data Foo = forall a. MkFoo a (a -> Bool)
1341            | Nil
1342 </verb></tscreen>
1343
1344 The data type <tt>Foo</tt> has two constructors with types:
1345
1346 <tscreen><verb>
1347   MkFoo :: forall a. a -> (a -> Bool) -> Foo
1348   Nil   :: Foo
1349 </verb></tscreen>
1350
1351 Notice that the type variable <tt>a</tt> in the type of <tt>MkFoo</tt>
1352 does not appear in the data type itself, which is plain <tt>Foo</tt>.
1353 For example, the following expression is fine:
1354
1355 <tscreen><verb>
1356   [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
1357 </verb></tscreen>
1358
1359 Here, <tt>(MkFoo 3 even)</tt> packages an integer with a function
1360 <tt>even</tt> that maps an integer to <tt>Bool</tt>; and <tt>MkFoo 'c'
1361 isUpper</tt> packages a character with a compatible function.  These
1362 two things are each of type <tt>Foo</tt> and can be put in a list.
1363
1364 What can we do with a value of type <tt>Foo</tt>?.  In particular,
1365 what happens when we pattern-match on <tt>MkFoo</tt>?
1366
1367 <tscreen><verb>
1368   f (MkFoo val fn) = ???
1369 </verb></tscreen>
1370
1371 Since all we know about <tt>val</tt> and <tt>fn</tt> is that they
1372 are compatible, the only (useful) thing we can do with them is to
1373 apply <tt>fn</tt> to <tt>val</tt> to get a boolean.  For example:
1374
1375 <tscreen><verb>
1376   f :: Foo -> Bool
1377   f (MkFoo val fn) = fn val
1378 </verb></tscreen>
1379
1380 What this allows us to do is to package heterogenous values
1381 together with a bunch of functions that manipulate them, and then treat
1382 that collection of packages in a uniform manner.  You can express
1383 quite a bit of object-oriented-like programming this way.
1384
1385 <sect2>Why existential?
1386 <label id="existential">
1387 <p>
1388
1389 What has this to do with <em>existential</em> quantification?
1390 Simply that <tt>MkFoo</tt> has the (nearly) isomorphic type
1391
1392 <tscreen><verb>
1393   MkFoo :: (exists a . (a, a -> Bool)) -> Foo
1394 </verb></tscreen>
1395
1396 But Haskell programmers can safely think of the ordinary
1397 <em>universally</em> quantified type given above, thereby avoiding
1398 adding a new existential quantification construct.
1399
1400 <sect2>Type classes
1401 <p>
1402
1403 An easy extension (implemented in <tt>hbc</tt>) is to allow
1404 arbitrary contexts before the constructor.  For example:
1405
1406 <tscreen><verb>
1407   data Baz = forall a. Eq a => Baz1 a a
1408            | forall b. Show b => Baz2 b (b -> b)
1409 </verb></tscreen>
1410
1411 The two constructors have the types you'd expect:
1412
1413 <tscreen><verb>
1414   Baz1 :: forall a. Eq a => a -> a -> Baz
1415   Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
1416 </verb></tscreen>
1417
1418 But when pattern matching on <tt>Baz1</tt> the matched values can be compared
1419 for equality, and when pattern matching on <tt>Baz2</tt> the first matched
1420 value can be converted to a string (as well as applying the function to it).
1421 So this program is legal:
1422
1423 <tscreen><verb>
1424   f :: Baz -> String
1425   f (Baz1 p q) | p == q    = "Yes"
1426                | otherwise = "No"
1427   f (Baz1 v fn)            = show (fn v)
1428 </verb></tscreen>
1429
1430 Operationally, in a dictionary-passing implementation, the
1431 constructors <tt>Baz1</tt> and <tt>Baz2</tt> must store the
1432 dictionaries for <tt>Eq</tt> and <tt>Show</tt> respectively, and
1433 extract it on pattern matching.
1434
1435 Notice the way that the syntax fits smoothly with that used for
1436 universal quantification earlier.
1437
1438 <sect2>Restrictions
1439 <p>
1440
1441 There are several restrictions on the ways in which existentially-quantified
1442 constructors can be use.
1443
1444 <itemize>
1445
1446 <item> When pattern matching, each pattern match introduces a new,
1447 distinct, type for each existential type variable.  These types cannot
1448 be unified with any other type, nor can they escape from the scope of
1449 the pattern match.  For example, these fragments are incorrect:
1450
1451 <tscreen><verb>
1452   f1 (MkFoo a f) = a
1453 </verb></tscreen>
1454
1455 Here, the type bound by <tt>MkFoo</tt> "escapes", because <tt>a</tt>
1456 is the result of <tt>f1</tt>.  One way to see why this is wrong is to
1457 ask what type <tt>f1</tt> has:
1458
1459 <tscreen><verb>
1460   f1 :: Foo -> a             -- Weird!
1461 </verb></tscreen>
1462
1463 What is this "<tt>a</tt>" in the result type? Clearly we don't mean
1464 this:
1465
1466 <tscreen><verb>
1467   f1 :: forall a. Foo -> a   -- Wrong!
1468 </verb></tscreen>
1469
1470 The original program is just plain wrong.  Here's another sort of error
1471
1472 <tscreen><verb>
1473   f2 (Baz1 a b) (Baz1 p q) = a==q
1474 </verb></tscreen>
1475
1476 It's ok to say <tt>a==b</tt> or <tt>p==q</tt>, but
1477 <tt>a==q</tt> is wrong because it equates the two distinct types arising
1478 from the two <tt>Baz1</tt> constructors.
1479
1480
1481 <item>You can't pattern-match on an existentially quantified
1482 constructor in a <tt>let</tt> or <tt>where</tt> group of
1483 bindings. So this is illegal:
1484
1485 <tscreen><verb>
1486   f3 x = a==b where { Baz1 a b = x }
1487 </verb></tscreen>
1488
1489 You can only pattern-match
1490 on an existentially-quantified constructor in a <tt>case</tt> expression or
1491 in the patterns of a function definition.
1492
1493 The reason for this restriction is really an implementation one.
1494 Type-checking binding groups is already a nightmare without
1495 existentials complicating the picture.  Also an existential pattern
1496 binding at the top level of a module doesn't make sense, because it's
1497 not clear how to prevent the existentially-quantified type "escaping".
1498 So for now, there's a simple-to-state restriction.  We'll see how
1499 annoying it is.
1500
1501 <item>You can't use existential quantification for <tt>newtype</tt>
1502 declarations.  So this is illegal:
1503
1504 <tscreen><verb>
1505   newtype T = forall a. Ord a => MkT a
1506 </verb></tscreen>
1507
1508 Reason: a value of type <tt>T</tt> must be represented as a pair
1509 of a dictionary for <tt>Ord t</tt> and a value of type <tt>t</tt>.
1510 That contradicts the idea that <tt>newtype</tt> should have no
1511 concrete representation.  You can get just the same efficiency and effect
1512 by using <tt>data</tt> instead of <tt>newtype</tt>.  If there is no
1513 overloading involved, then there is more of a case for allowing
1514 an existentially-quantified <tt>newtype</tt>, because the <tt>data</tt>
1515 because the <tt>data</tt> version does carry an implementation cost,
1516 but single-field existentially quantified constructors aren't much
1517 use.  So the simple restriction (no existential stuff on <tt>newtype</tt>)
1518 stands, unless there are convincing reasons to change it.
1519 </itemize>
1520
1521
1522 <sect1> <idx/Assertions/
1523 <label id="sec:assertions">
1524 <p>
1525
1526 If you want to make use of assertions in your standard Haskell code, you
1527 could define a function like the following:
1528
1529 <tscreen><verb>
1530 assert :: Bool -> a -> a
1531 assert False x = error "assertion failed!"
1532 assert _     x = x
1533 </verb></tscreen>
1534
1535 which works, but gives you back a less than useful error message --
1536 an assertion failed, but which and where?
1537
1538 One way out is to define an extended <tt/assert/ function which also
1539 takes a descriptive string to include in the error message and
1540 perhaps combine this with the use of a pre-processor which inserts
1541 the source location where <tt/assert/ was used.
1542
1543 Ghc offers a helping hand here, doing all of this for you. For every
1544 use of <tt/assert/ in the user's source:
1545
1546 <tscreen><verb>
1547 kelvinToC :: Double -> Double
1548 kelvinToC k = assert (k &gt;= 0.0) (k+273.15)
1549 </verb></tscreen>
1550
1551 Ghc will rewrite this to also include the source location where the
1552 assertion was made,
1553
1554 <tscreen><verb>
1555 assert pred val ==> assertError "Main.hs|15" pred val
1556 </verb></tscreen>
1557
1558 The rewrite is only performed by the compiler when it spots
1559 applications of <tt>Exception.assert</tt>, so you can still define and
1560 use your own versions of <tt/assert/, should you so wish. If not,
1561 import <tt/Exception/ to make use <tt/assert/ in your code.
1562
1563 To have the compiler ignore uses of assert, use the compiler option
1564 @-fignore-asserts@. <nidx>-fignore-asserts option</nidx> That is,
1565 expressions of the form @assert pred e@ will be rewritten to @e@.
1566
1567 Assertion failures can be caught, see the documentation for the
1568 Hugs/GHC Exception library for information of how.
1569
1570 % -----------------------------------------------------------------------------
1571 <sect1>Scoped Type Variables
1572 <label id="scoped-type-variables">
1573 <p>
1574
1575 A <em/pattern type signature/ can introduce a <em/scoped type
1576 variable/.  For example
1577
1578 <tscreen><verb>
1579 f (xs::[a]) = ys ++ ys
1580            where
1581               ys :: [a]
1582               ys = reverse xs
1583 </verb></tscreen>
1584
1585 The pattern @(xs::[a])@ includes a type signature for @xs@.
1586 This brings the type variable @a@ into scope; it scopes over
1587 all the patterns and right hand sides for this equation for @f@.
1588 In particular, it is in scope at the type signature for @y@.
1589
1590 At ordinary type signatures, such as that for @ys@, any type variables
1591 mentioned in the type signature <em/that are not in scope/ are
1592 implicitly universally quantified.  (If there are no type variables in
1593 scope, all type variables mentioned in the signature are universally
1594 quantified, which is just as in Haskell 98.)  In this case, since @a@
1595 is in scope, it is not universally quantified, so the type of @ys@ is
1596 the same as that of @xs@.  In Haskell 98 it is not possible to declare
1597 a type for @ys@; a major benefit of scoped type variables is that
1598 it becomes possible to do so.
1599
1600 Scoped type variables are implemented in both GHC and Hugs.  Where the
1601 implementations differ from the specification below, those differences
1602 are noted.
1603
1604 So much for the basic idea.  Here are the details.
1605
1606 <sect2>Scope and implicit quantification
1607 <p>
1608
1609 <itemize>
1610 <item> All the type variables mentioned in the patterns for a single
1611 function definition equation, that are not already in scope,
1612 are brought into scope by the patterns.  We describe this set as
1613 the <em/type variables bound by the equation/.
1614
1615 <item> The type variables thus brought into scope may be mentioned
1616 in ordinary type signatures or pattern type signatures anywhere within
1617 their scope.
1618
1619 <item> In ordinary type signatures, any type variable mentioned in the
1620 signature that is in scope is <em/not/ universally quantified.
1621
1622 <item> Ordinary type signatures do not bring any new type variables
1623 into scope (except in the type signature itself!). So this is illegal:
1624
1625 <tscreen><verb>
1626   f :: a -> a
1627   f x = x::a
1628 </verb></tscreen>
1629
1630 It's illegal because @a@ is not in scope in the body of @f@,
1631 so the ordinary signature @x::a@ is equivalent to @x::forall a.a@;
1632 and that is an incorrect typing.
1633
1634 <item> There is no implicit universal quantification on pattern type
1635 signatures, nor may one write an explicit @forall@ type in a pattern
1636 type signature.  The pattern type signature is a monotype.
1637
1638 <item>
1639 The type variables in the head of a @class@ or @instance@ declaration
1640 scope over the methods defined in the @where@ part.  For example:
1641
1642 <tscreen><verb>
1643   class C a where
1644     op :: [a] -> a
1645
1646     op xs = let ys::[a]
1647                 ys = reverse xs
1648             in
1649             head ys
1650 </verb></tscreen>
1651
1652 (Not implemented in Hugs yet, Dec 98).
1653 </itemize>
1654
1655 <sect2>Polymorphism
1656 <p>
1657
1658 <itemize>
1659 <item> Pattern type signatures are completely orthogonal to ordinary, separate
1660 type signatures.  The two can be used independently or together.  There is
1661 no scoping associated with the names of the type variables in a separate type signature.
1662
1663 <tscreen><verb>
1664    f :: [a] -> [a]
1665    f (xs::[b]) = reverse xs
1666 </verb></tscreen>
1667
1668 <item> The function must be polymorphic in the type variables
1669 bound by all its equations.  Operationally, the type variables bound
1670 by one equation must not:
1671
1672 <itemize>
1673 <item> Be unified with a type (such as @Int@, or @[a]@).
1674 <item> Be unified with a type variable free in the environment.
1675 <item> Be unified with each other.  (They may unify with the type variables
1676 bound by another equation for the same function, of course.)
1677 </itemize>
1678
1679 For example, the following all fail to type check:
1680
1681 <tscreen><verb>
1682   f (x::a) (y::b) = [x,y]       -- a unifies with b
1683
1684   g (x::a) = x + 1::Int         -- a unifies with Int
1685
1686   h x = let k (y::a) = [x,y]    -- a is free in the
1687         in k x                  -- environment
1688
1689   k (x::a) True    = ...        -- a unifies with Int
1690   k (x::Int) False = ...
1691
1692   w :: [b] -> [b]
1693   w (x::a) = x                  -- a unifies with [b]
1694 </verb></tscreen>
1695
1696 <item> The pattern-bound type variable may, however, be constrained
1697 by the context of the principal type, thus:
1698
1699 <tscreen><verb>
1700   f (x::a) (y::a) = x+y*2
1701 </verb></tscreen>
1702
1703 gets the inferred type: @forall a. Num a => a -> a -> a@.
1704 </itemize>
1705
1706 <sect2>Result type signatures
1707 <p>
1708
1709 <itemize>
1710 <item> The result type of a function can be given a signature,
1711 thus:
1712
1713 <tscreen><verb>
1714   f (x::a) :: [a] = [x,x,x]
1715 </verb></tscreen>
1716
1717 The final @":: [a]"@ after all the patterns gives a signature to the
1718 result type.  Sometimes this is the only way of naming the type variable
1719 you want:
1720
1721 <tscreen><verb>
1722   f :: Int -> [a] -> [a]
1723   f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
1724                         in \xs -> map g (reverse xs `zip` xs)
1725 </verb></tscreen>
1726
1727 </itemize>
1728
1729 Result type signatures are not yet implemented in Hugs.
1730
1731 <sect2>Pattern signatures on other constructs
1732 <p>
1733
1734 <itemize>
1735 <item> A pattern type signature can be on an arbitrary sub-pattern, not
1736 just on a variable:
1737
1738 <tscreen><verb>
1739   f ((x,y)::(a,b)) = (y,x) :: (b,a)
1740 </verb></tscreen>
1741
1742 <item> Pattern type signatures, including the result part, can be used
1743 in lambda abstractions:
1744
1745 <tscreen><verb>
1746   (\ (x::a, y) :: a -> x)
1747 </verb></tscreen>
1748
1749 Type variables bound by these patterns must be polymorphic in
1750 the sense defined above.
1751 For example:
1752
1753 <tscreen><verb>
1754   f1 (x::c) = f1 x      -- ok
1755   f2 = \(x::c) -> f2 x  -- not ok
1756 </verb></tscreen>
1757
1758 Here, @f1@ is OK, but @f2@ is not, because @c@ gets unified
1759 with a type variable free in the environment, in this
1760 case, the type of @f2@, which is in the environment when
1761 the lambda abstraction is checked.
1762
1763 <item> Pattern type signatures, including the result part, can be used
1764 in @case@ expressions:
1765
1766 <tscreen><verb>
1767   case e of { (x::a, y) :: a -> x }
1768 </verb></tscreen>
1769
1770 The pattern-bound type variables must, as usual,
1771 be polymorphic in the following sense: each case alternative,
1772 considered as a lambda abstraction, must be polymorphic.
1773 Thus this is OK:
1774
1775 <tscreen><verb>
1776   case (True,False) of { (x::a, y) -> x }
1777 </verb></tscreen>
1778
1779 Even though the context is that of a pair of booleans,
1780 the alternative itself is polymorphic.  Of course, it is
1781 also OK to say:
1782
1783 <tscreen><verb>
1784   case (True,False) of { (x::Bool, y) -> x }
1785 </verb></tscreen>
1786
1787 <item>
1788 To avoid ambiguity, the type after the ``@::@'' in a result
1789 pattern signature on a lambda or @case@ must be atomic (i.e. a single
1790 token or a parenthesised type of some sort).  To see why,
1791 consider how one would parse this:
1792
1793 <tscreen><verb>
1794   \ x :: a -> b -> x
1795 </verb></tscreen>
1796
1797 <item> Pattern type signatures that bind new type variables
1798 may not be used in pattern bindings at all.
1799 So this is illegal:
1800
1801 <tscreen><verb>
1802   f x = let (y, z::a) = x in ...
1803 </verb></tscreen>
1804
1805 But these are OK, because they do not bind fresh type variables:
1806
1807 <tscreen><verb>
1808   f1 x            = let (y, z::Int) = x in ...
1809   f2 (x::(Int,a)) = let (y, z::a)   = x in ...
1810 </verb></tscreen>
1811
1812 However a single variable is considered a degenerate function binding,
1813 rather than a degerate pattern binding, so this is permitted, even
1814 though it binds a type variable:
1815
1816 <tscreen><verb>
1817   f :: (b->b) = \(x::b) -> x
1818 </verb></tscreen>
1819
1820 </itemize>
1821 Such degnerate function bindings do not fall under the monomorphism
1822 restriction.  Thus:
1823
1824 <tscreen><verb>
1825   g :: a -> a -> Bool = \x y. x==y
1826 </verb></tscreen>
1827
1828 Here @g@ has type @forall a. Eq a => a -> a -> Bool@, just as if
1829 @g@ had a separate type signature.  Lacking a type signature, @g@
1830 would get a monomorphic type.
1831
1832 <sect2>Existentials
1833 <p>
1834
1835 <itemize>
1836 <item> Pattern type signatures can bind existential type variables.
1837 For example:
1838
1839 <tscreen><verb>
1840   data T = forall a. MkT [a]
1841
1842   f :: T -> T
1843   f (MkT [t::a]) = MkT t3
1844                  where
1845                    t3::[a] = [t,t,t]
1846 </verb></tscreen>
1847
1848 </itemize>
1849
1850 %-----------------------------------------------------------------------------
1851 <sect1>Pragmas
1852 <label id="pragmas">
1853 <p>
1854
1855 GHC supports several pragmas, or instructions to the compiler placed
1856 in the source code.  Pragmas don't affect the meaning of the program,
1857 but they might affect the efficiency of the generated code.
1858
1859 <sect2>INLINE pragma
1860 <label id="inline-pragma">
1861 <nidx>INLINE pragma</nidx>
1862 <nidx>pragma, INLINE</nidx>
1863 <p>
1864
1865 GHC (with @-O@, as always) tries to inline (or ``unfold'')
1866 functions/values that are ``small enough,'' thus avoiding the call
1867 overhead and possibly exposing other more-wonderful optimisations.
1868
1869 You will probably see these unfoldings (in Core syntax) in your
1870 interface files.
1871
1872 Normally, if GHC decides a function is ``too expensive'' to inline, it
1873 will not do so, nor will it export that unfolding for other modules to
1874 use.
1875
1876 The sledgehammer you can bring to bear is the
1877 @INLINE@<nidx>INLINE pragma</nidx> pragma, used thusly:
1878 <tscreen><verb>
1879 key_function :: Int -> String -> (Bool, Double)
1880
1881 #ifdef __GLASGOW_HASKELL__
1882 {-# INLINE key_function #-}
1883 #endif
1884 </verb></tscreen>
1885 (You don't need to do the C pre-processor carry-on unless you're going
1886 to stick the code through HBC---it doesn't like @INLINE@ pragmas.)
1887
1888 The major effect of an @INLINE@ pragma is to declare a function's
1889 ``cost'' to be very low.  The normal unfolding machinery will then be
1890 very keen to inline it.
1891
1892 An @INLINE@ pragma for a function can be put anywhere its type
1893 signature could be put.
1894
1895 @INLINE@ pragmas are a particularly good idea for the
1896 @then@/@return@ (or @bind@/@unit@) functions in a monad.
1897 For example, in GHC's own @UniqueSupply@ monad code, we have:
1898 <tscreen><verb>
1899 #ifdef __GLASGOW_HASKELL__
1900 {-# INLINE thenUs #-}
1901 {-# INLINE returnUs #-}
1902 #endif
1903 </verb></tscreen>
1904
1905 <sect2>NOINLINE pragma
1906 <label id="noinline-pragma">
1907 <p>
1908 <nidx>NOINLINE pragma</nidx>
1909 <nidx>pragma, NOINLINE</nidx>
1910
1911 The @NOINLINE@ pragma does exactly what you'd expect: it stops the
1912 named function from being inlined by the compiler.  You shouldn't ever
1913 need to do this, unless you're very cautious about code size.
1914
1915 <sect2>SPECIALIZE pragma
1916 <label id="specialize-pragma">
1917 <p>
1918 <nidx>SPECIALIZE pragma</nidx>
1919 <nidx>pragma, SPECIALIZE</nidx>
1920 <nidx>overloading, death to</nidx>
1921
1922 (UK spelling also accepted.)  For key overloaded functions, you can
1923 create extra versions (NB: more code space) specialised to particular
1924 types.  Thus, if you have an overloaded function:
1925
1926 <tscreen><verb>
1927 hammeredLookup :: Ord key => [(key, value)] -> key -> value
1928 </verb></tscreen>
1929
1930 If it is heavily used on lists with @Widget@ keys, you could
1931 specialise it as follows:
1932 <tscreen><verb>
1933 {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
1934 </verb></tscreen>
1935
1936 To get very fancy, you can also specify a named function to use for
1937 the specialised value, by adding @= blah@, as in:
1938 <tscreen><verb>
1939 {-# SPECIALIZE hammeredLookup :: ...as before... = blah #-}
1940 </verb></tscreen>
1941 It's <em>Your Responsibility</em> to make sure that @blah@ really
1942 behaves as a specialised version of @hammeredLookup@!!!
1943
1944 NOTE: the @=blah@ feature isn't implemented in GHC 4.xx.
1945
1946 An example in which the @= blah@ form will Win Big:
1947 <tscreen><verb>
1948 toDouble :: Real a => a -> Double
1949 toDouble = fromRational . toRational
1950
1951 {-# SPECIALIZE toDouble :: Int -> Double = i2d #-}
1952 i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
1953 </verb></tscreen>
1954 The @i2d@ function is virtually one machine instruction; the
1955 default conversion---via an intermediate @Rational@---is obscenely
1956 expensive by comparison.
1957
1958 By using the US spelling, your @SPECIALIZE@ pragma will work with
1959 HBC, too.  Note that HBC doesn't support the @= blah@ form.
1960
1961 A @SPECIALIZE@ pragma for a function can be put anywhere its type
1962 signature could be put.
1963
1964 <sect2>SPECIALIZE instance pragma
1965 <label id="specialize-instance-pragma">
1966 <p>
1967 <nidx>SPECIALIZE pragma</nidx>
1968 <nidx>overloading, death to</nidx>
1969 Same idea, except for instance declarations.  For example:
1970 <tscreen><verb>
1971 instance (Eq a) => Eq (Foo a) where { ... usual stuff ... }
1972
1973 {-# SPECIALIZE instance Eq (Foo [(Int, Bar)] #-}
1974 </verb></tscreen>
1975 Compatible with HBC, by the way.
1976
1977 <sect2>LINE pragma
1978 <label id="line-pragma">
1979 <p>
1980 <nidx>LINE pragma</nidx>
1981 <nidx>pragma, LINE</nidx>
1982
1983 This pragma is similar to C's @#line@ pragma, and is mainly for use in
1984 automatically generated Haskell code.  It lets you specify the line
1985 number and filename of the original code; for example
1986
1987 <tscreen><verb>
1988 {-# LINE 42 "Foo.vhs" #-}
1989 </verb></tscreen>
1990
1991 if you'd generated the current file from something called @Foo.vhs@
1992 and this line corresponds to line 42 in the original.  GHC will adjust
1993 its error messages to refer to the line/file named in the @LINE@
1994 pragma.
1995
1996 <sect2>RULES pragma
1997 <p>
1998 The RULES pragma lets you specify rewrite rules.  It is described in
1999 Section <ref name="Rewrite Rules"
2000 id="rewrite-rules">.
2001
2002 %-----------------------------------------------------------------------------
2003 <sect1>Rewrite rules
2004 <label id="rewrite-rules">
2005 <nidx>RULES pagma</nidx>
2006 <nidx>pragma, RULES</nidx>
2007 <nidx>rewrite rules</nidx>
2008 <p>
2009
2010 The programmer can specify rewrite rules as part of the source program
2011 (in a pragma).  GHC applies these rewrite rules wherever it can.
2012
2013 Here is an example:
2014 <tscreen><verb>
2015   {-# RULES
2016         "map/map"       forall f,g,xs. map f (map g) xs = map (f.g) xs
2017   #-}
2018 </verb></tscreen>
2019
2020 <sect2>Syntax
2021 <p>
2022
2023 From a syntactic point of view:
2024 <itemize>
2025 <item> Each rule has a name, enclosed in double quotes.
2026 <item> There may be zero or more rules in a @RULES@ pragma.
2027 <item> Layout applies in a @RULES@ pragma.  Currently no new indentation level
2028 is set, so you must lay out your rules starting in the same column as the
2029 enclosing definitions.
2030 <item> Each variable mentioned in a rule must either be in scope (e.g. @map@),
2031 or bound by the @forall@ (e.g. @f@, @g@, @xs@).  The variables bound by
2032 the @forall@ are called the <em>pattern</em> variables.
2033 <item> A pattern variable may optionally have a type signature.
2034 If its type is polymorphic, it <em>must</em> have a type signature.
2035 For example, here is the @foldr/build@ rule:
2036 <tscreen><verb>
2037   "fold/build"  forall k,z,g::forall b. (a->b->b) -> b -> b .
2038                 foldr k z (build g) = g k z
2039 </verb></tscreen>
2040
2041
2042 <item> The left hand side of a rule must consist of a top-level variable applied
2043 to arbitrary expressions.  For example, this is <em>not</em> OK:
2044 <tscreen><verb>
2045   "wrong1"   forall e1,e2.  case True of { True -> e1; False -> e2 } = e1
2046   "wrong2"   forall f.      f True = True
2047 </verb></tscreen>
2048 In @"wrong1"@, the LHS is not an application; in @"wrong1"@, the LHS has a pattern variable
2049 in the head.
2050 <item> A rule does not need to be in the same module as (any of) the
2051 variables it mentions, though of course they need to be in scope.
2052 <item> Rules are automatically exported from a module, just as instance declarations are.
2053 </itemize>
2054
2055 <sect2>Semantics
2056 <p>
2057
2058 From a semantic point of view:
2059 <itemize>
2060 <item> Rules are only applied if you use the @-O@ flag.
2061
2062 <item> Rules are regarded as left-to-right rewrite rules.
2063 When GHC finds an expression that is a substitution instance of the LHS
2064 of a rule, it replaces the expression by the (appropriately-substituted) RHS.
2065 By "a substitution instance" we mean that the LHS can be made equal to the
2066 expression by substituting for the pattern variables.
2067
2068 <item> The LHS and RHS of a rule are typechecked, and must have the
2069 same type.
2070
2071 <item> GHC makes absolutely no attempt to verify that the LHS and RHS
2072 of a rule have the same meaning.  That is undecideable in general, and
2073 infeasible in most interesting cases.  The responsibility is entirely the programmer's!
2074
2075 <item> GHC makes no attempt to make sure that the rules are confluent or
2076 terminating.  For example:
2077 <tscreen><verb>
2078   "loop"        forall x,y.  f x y = f y x
2079 </verb></tscreen>
2080 This rule will cause the compiler to go into an infinite loop.
2081
2082 <item> GHC currently uses a very simple, syntactic, matching algorithm
2083 for matching a rule LHS with an expression.  It seeks a substitution
2084 which makes the LHS and expression syntactically equal modulo: alpha
2085 conversion, and (I think) eta conversion.  But not beta conversion (that's
2086 called higher-order matching).
2087
2088 <item> GHC keeps trying to apply the rules as it optimises the program.
2089 For example, consider:
2090 <tscreen><verb>
2091   let s = map f
2092       t = map g
2093   in
2094   s (t xs)
2095 </verb></tscreen>
2096 The expression @s (t xs)@ does not match the rule @"map/map"@, but GHC
2097 will substitute for @s@ and @t@, giving an expression which does match.
2098 If @s@ or @t@ was (a) used more than once, and (b) large or a redex, then it would
2099 not be substituted, and the rule would not fire.
2100
2101 <item> In the earlier phases of compilation, GHC inlines <em>nothing
2102 that appears on the LHS of a rule</em>, because once you have substituted
2103 for something you can't match against it (given the simple minded
2104 matching).  So if you write the rule
2105 <tscreen><verb>
2106         "map/map"       forall f,g.  map f . map g = map (f.g)
2107 </verb></tscreen>
2108 this <em>won't</em> match the expression @map f (map g xs)@.
2109 It will only match something written with explicit use of ".".
2110 Well, not quite.  It <em>will</em> match the expression
2111 <tscreen><verb>
2112         wibble f g xs
2113 </verb></tscreen>
2114 where @wibble@ is defined:
2115 <tscreen><verb>
2116         wibble f g = map f . map g
2117 </verb></tscreen>
2118 because @wibble@ will be inlined (it's small).
2119
2120 Later on in compilation, GHC starts inlining even things on the
2121 LHS of rules, but still leaves the rules enabled.  This inlining
2122 policy is controlled by the per-simplification-pass flag @-finline-phase@n.
2123 </itemize>
2124
2125
2126 <sect2>Controlling what's going on
2127 <p>
2128
2129 <itemize>
2130 <item> Use @-fddump-rules@ to see what transformation rules GHC is using.
2131 <item> Use @-fddump-simpl-stats@ to see what rules are being fired.
2132 <item> The defintion of (say) @build@ in @PrelBase.lhs@ looks llike this:
2133 <tscreen><verb>
2134         build   :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
2135         {-# INLINE build #-}
2136         build g = g (:) []
2137 </verb></tscreen>
2138 Notice the @INLINE@!  That prevents @(:)@ from being inlined when compiling
2139 @PrelBase@, so that an importing module will ``see'' the @(:)@, and can
2140 match it on the LHS of a rule.  @INLINE@ prevents any inlining happening
2141 in the RHS of the @INLINE@ thing.  I regret the delicacy of this.
2142
2143 <item> In @ghc/lib/std/PrelBase.lhs@ look at the rules for @map@ to
2144 see how to write rules that will do fusion and yet give an efficient
2145 program even if fusion doesn't happen.  More rules in @PrelList.lhs@.
2146 </itemize>