2 <IndexTerm><Primary>language, GHC</Primary></IndexTerm>
3 <IndexTerm><Primary>extensions, GHC</Primary></IndexTerm>
4 As with all known Haskell systems, GHC implements some extensions to
5 the language. To use them, you'll need to give a <Option>-fglasgow-exts</Option>
6 <IndexTerm><Primary>-fglasgow-exts option</Primary></IndexTerm> option.
10 Virtually all of the Glasgow extensions serve to give you access to
11 the underlying facilities with which we implement Haskell. Thus, you
12 can get at the Raw Iron, if you are willing to write some non-standard
13 code at a more primitive level. You need not be “stuck” on
14 performance because of the implementation costs of Haskell's
15 “high-level” features—you can always code “under” them. In an extreme case, you can write all your time-critical code in C, and then just glue it together with Haskell!
19 Executive summary of our extensions:
26 <Term>Unboxed types and primitive operations:</Term>
29 You can get right down to the raw machine types and operations;
30 included in this are “primitive arrays” (direct access to Big Wads
31 of Bytes). Please see <XRef LinkEnd="glasgow-unboxed"> and following.
37 <Term>Multi-parameter type classes:</Term>
40 GHC's type system supports extended type classes with multiple
41 parameters. Please see <XRef LinkEnd="multi-param-type-classes">.
47 <Term>Local universal quantification:</Term>
50 GHC's type system supports explicit universal quantification in
51 constructor fields and function arguments. This is useful for things
52 like defining <Literal>runST</Literal> from the state-thread world. See <XRef LinkEnd="universal-quantification">.
<Term>Existential quantification in data types:</Term>
61 Some or all of the type variables in a datatype declaration may be
62 <Emphasis>existentially quantified</Emphasis>. More details in <XRef LinkEnd="existential-quantification">.
68 <Term>Scoped type variables:</Term>
71 Scoped type variables enable the programmer to supply type signatures
72 for some nested declarations, where this would not be legal in Haskell
73 98. Details in <XRef LinkEnd="scoped-type-variables">.
<Term>Pattern guards:</Term>
82 Instead of being a boolean expression, a guard is a list of qualifiers, exactly as in a list comprehension. See <XRef LinkEnd="pattern-guards">.
88 <Term>Foreign calling:</Term>
91 Just what it sounds like. We provide <Emphasis>lots</Emphasis> of rope that you
92 can dangle around your neck. Please see <XRef LinkEnd="ffi">.
101 Pragmas are special instructions to the compiler placed in the source
102 file. The pragmas GHC supports are described in <XRef LinkEnd="pragmas">.
108 <Term>Rewrite rules:</Term>
111 The programmer can specify rewrite rules as part of the source program
112 (in a pragma). GHC applies these rewrite rules wherever it can.
113 Details in <XRef LinkEnd="rewrite-rules">.
119 <Term>Generic classes:</Term>
122 Generic class declarations allow you to define a class
123 whose methods say how to work over an arbitrary data type.
124 Then it's really easy to make any new type into an instance of
125 the class. This generalises the rather ad-hoc "deriving" feature
127 Details in <XRef LinkEnd="generic-classes">.
135 Before you get too carried away working at the lowest level (e.g.,
136 sloshing <Literal>MutableByteArray#</Literal>s around your
137 program), you may wish to check if there are libraries that provide a
138 “Haskellised veneer” over the features you want. See
139 <xref linkend="book-hslibs">.
142 <Sect1 id="language-options">
143 <Title>Language variations
<Para> There are several flags that control which variations of the language are permitted.
147 Leaving out all of them gives you standard Haskell 98.</Para>
152 <Term><Option>-fglasgow-exts</Option>:</Term>
154 <Para>This simultaneously enables all of the extensions to Haskell 98 described in this
155 chapter, except where otherwise noted. </Para>
156 </ListItem> </VarListEntry>
159 <Term><Option>-fno-monomorphism-restriction</Option>:</Term>
161 <Para> Switch off the Haskell 98 monomorphism restriction. Independent of the <Option>-fglasgow-exts</Option>
163 </ListItem> </VarListEntry>
166 <Term><Option>-fallow-overlapping-instances</Option>,
167 <Option>-fallow-undecidable-instances</Option>,
168 <Option>-fcontext-stack</Option>:</Term>
170 <Para> See <XRef LinkEnd="instance-decls">.
171 Only relevant if you also use <Option>-fglasgow-exts</Option>.
173 </ListItem> </VarListEntry>
176 <Term><Option>-fignore-asserts</Option>:</Term>
178 <Para> See <XRef LinkEnd="sec-assertions">.
179 Only relevant if you also use <Option>-fglasgow-exts</Option>.
181 </ListItem> </VarListEntry>
184 <Term> <Option>-finline-phase</Option>:</Term>
186 <Para> See <XRef LinkEnd="rewrite-rules">.
187 Only relevant if you also use <Option>-fglasgow-exts</Option>.</para>
188 </ListItem> </VarListEntry>
191 <Term> <Option>-fgenerics</Option>:</Term>
193 <Para> See <XRef LinkEnd="generic-classes">.
194 Independent of <Option>-fglasgow-exts</Option>.
196 </ListItem> </VarListEntry>
201 <Sect1 id="primitives">
202 <Title>Unboxed types and primitive operations
204 <IndexTerm><Primary>PrelGHC module</Primary></IndexTerm>
207 This module defines all the types which are primitive in Glasgow
208 Haskell, and the operations provided for them.
211 <Sect2 id="glasgow-unboxed">
216 <IndexTerm><Primary>Unboxed types (Glasgow extension)</Primary></IndexTerm>
219 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
220 that values of that type are represented by a pointer to a heap
221 object. The representation of a Haskell <literal>Int</literal>, for
222 example, is a two-word heap object. An <firstterm>unboxed</firstterm>
type, however, is represented by the value itself; no pointers or heap
allocation are involved.
228 Unboxed types correspond to the “raw machine” types you
229 would use in C: <Literal>Int#</Literal> (long int),
230 <Literal>Double#</Literal> (double), <Literal>Addr#</Literal>
231 (void *), etc. The <Emphasis>primitive operations</Emphasis>
232 (PrimOps) on these types are what you might expect; e.g.,
233 <Literal>(+#)</Literal> is addition on
234 <Literal>Int#</Literal>s, and is the machine-addition that we all
235 know and love—usually one instruction.
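As a minimal sketch of what this looks like in practice, the following unwraps boxed <Literal>Int</Literal>s and operates on the <Literal>Int#</Literal> values inside. It assumes a recent GHC, where the <Literal>MagicHash</Literal> pragma and the <Literal>GHC.Exts</Literal> module stand in for <Option>-fglasgow-exts</Option> and <Literal>PrelGHC</Literal>; the function names <Literal>plusInt</Literal> and <Literal>squarePlus</Literal> are made up for illustration.

```haskell
{-# LANGUAGE MagicHash #-}
-- Assumption: recent GHC; MagicHash and GHC.Exts replace -fglasgow-exts/PrelGHC.
import GHC.Exts (Int (I#), (*#), (+#))

-- Add two boxed Ints by unwrapping them and using the primitive (+#).
plusInt :: Int -> Int -> Int
plusInt (I# x) (I# y) = I# (x +# y)

-- Compute x*x + y entirely on unboxed values, boxing only the result.
squarePlus :: Int -> Int -> Int
squarePlus (I# x) (I# y) = I# ((x *# x) +# y)

main :: IO ()
main = do
  print (plusInt 2 3)
  print (squarePlus 4 1)
```

The <Literal>I#</Literal> constructor is the box: pattern matching on it exposes the raw value, and applying it allocates a new boxed result.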
239 Primitive (unboxed) types cannot be defined in Haskell, and are
240 therefore built into the language and compiler. Primitive types are
241 always unlifted; that is, a value of a primitive type cannot be
242 bottom. We use the convention that primitive types, values, and
243 operations have a <Literal>#</Literal> suffix.
247 Primitive values are often represented by a simple bit-pattern, such
248 as <Literal>Int#</Literal>, <Literal>Float#</Literal>,
249 <Literal>Double#</Literal>. But this is not necessarily the case:
250 a primitive value might be represented by a pointer to a
251 heap-allocated object. Examples include
252 <Literal>Array#</Literal>, the type of primitive arrays. A
253 primitive array is heap-allocated because it is too big a value to fit
254 in a register, and would be too expensive to copy around; in a sense,
255 it is accidental that it is represented by a pointer. If a pointer
256 represents a primitive value, then it really does point to that value:
no unevaluated thunks, no indirections…nothing can be at the
other end of the pointer other than the primitive value.
262 There are some restrictions on the use of primitive types, the main
263 one being that you can't pass a primitive value to a polymorphic
264 function or store one in a polymorphic data type. This rules out
265 things like <Literal>[Int#]</Literal> (i.e. lists of primitive
266 integers). The reason for this restriction is that polymorphic
267 arguments and constructor fields are assumed to be pointers: if an
268 unboxed integer is stored in one of these, the garbage collector would
269 attempt to follow it, leading to unpredictable space leaks. Or a
270 <Function>seq</Function> operation on the polymorphic component may
271 attempt to dereference the pointer, with disastrous results. Even
272 worse, the unboxed value might be larger than a pointer
273 (<Literal>Double#</Literal> for instance).
Nevertheless, a numerically-intensive program using unboxed types can
278 go a <Emphasis>lot</Emphasis> faster than its “standard”
279 counterpart—we saw a threefold speedup on one example.
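The kind of inner loop that tends to benefit can be sketched as follows. This is an illustration only, not the benchmark referred to above, and it assumes a recent GHC (<Literal>MagicHash</Literal>, <Literal>GHC.Exts</Literal>), where comparison primitives such as <Literal>(>#)</Literal> return an <Literal>Int#</Literal> that is turned into a <Literal>Bool</Literal> with <Literal>isTrue#</Literal>. The name <Literal>sumTo</Literal> is hypothetical.

```haskell
{-# LANGUAGE MagicHash #-}
-- Assumption: recent GHC, where (>#) yields Int# and isTrue# converts it to Bool.
import GHC.Exts (Int (I#), Int#, isTrue#, (+#), (>#))

-- Sum the integers 1..n with an accumulator and counter that are never boxed.
sumTo :: Int -> Int
sumTo (I# n) = I# (go 0# 1#)
  where
    go :: Int# -> Int# -> Int#
    go acc i
      | isTrue# (i ># n) = acc
      | otherwise        = go (acc +# i) (i +# 1#)

main :: IO ()
main = print (sumTo 100)
```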
284 <Sect2 id="unboxed-tuples">
285 <Title>Unboxed Tuples
Unboxed tuples aren't really exported by <Literal>PrelGHC</Literal>;
290 they're available by default with <Option>-fglasgow-exts</Option>. An
291 unboxed tuple looks like this:
303 where <Literal>e_1..e_n</Literal> are expressions of any
304 type (primitive or non-primitive). The type of an unboxed tuple looks
309 Unboxed tuples are used for functions that need to return multiple
310 values, but they avoid the heap allocation normally associated with
311 using fully-fledged tuples. When an unboxed tuple is returned, the
312 components are put directly into registers or on the stack; the
313 unboxed tuple itself does not have a composite representation. Many
314 of the primitive operations listed in this section return unboxed
319 There are some pretty stringent restrictions on the use of unboxed tuples:
328 Unboxed tuple types are subject to the same restrictions as
329 other unboxed types; i.e. they may not be stored in polymorphic data
330 structures or passed to polymorphic functions.
337 Unboxed tuples may only be constructed as the direct result of
338 a function, and may only be deconstructed with a <Literal>case</Literal> expression.
For example, the following are valid:
343 f x y = (# x+1, y-1 #)
344 g x = case f x x of { (# a, b #) -> a + b }
348 but the following are invalid:
362 No variable can have an unboxed tuple type. This is illegal:
366 f :: (# Int, Int #) -> (# Int, Int #)
371 because <VarName>x</VarName> has an unboxed tuple type.
381 Note: we may relax some of these restrictions in the future.
385 The <Literal>IO</Literal> and <Literal>ST</Literal> monads use unboxed
386 tuples to avoid unnecessary allocation during sequences of operations.
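A small self-contained sketch of the multiple-return-value use follows. It assumes a recent GHC with the <Literal>UnboxedTuples</Literal> pragma in place of <Option>-fglasgow-exts</Option>; the names <Literal>quotRemU</Literal> and <Literal>describe</Literal> are invented for the example.

```haskell
{-# LANGUAGE UnboxedTuples #-}
-- Assumption: recent GHC; UnboxedTuples replaces -fglasgow-exts here.

-- Return quotient and remainder together without building a heap tuple.
quotRemU :: Int -> Int -> (# Int, Int #)
quotRemU n d = (# n `quot` d, n `rem` d #)

-- The result is taken apart with a case expression, as required above.
describe :: Int -> Int -> String
describe n d =
  case quotRemU n d of
    (# q, r #) -> show q ++ " remainder " ++ show r

main :: IO ()
main = putStrLn (describe 17 5)
```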
392 <Title>Character and numeric types</Title>
394 <IndexTerm><Primary>character types, primitive</Primary></IndexTerm>
395 <IndexTerm><Primary>numeric types, primitive</Primary></IndexTerm>
396 <IndexTerm><Primary>integer types, primitive</Primary></IndexTerm>
397 <IndexTerm><Primary>floating point types, primitive</Primary></IndexTerm>
399 There are the following obvious primitive types:
413 <IndexTerm><Primary><literal>Char#</literal></Primary></IndexTerm>
414 <IndexTerm><Primary><literal>Int#</literal></Primary></IndexTerm>
415 <IndexTerm><Primary><literal>Word#</literal></Primary></IndexTerm>
416 <IndexTerm><Primary><literal>Addr#</literal></Primary></IndexTerm>
417 <IndexTerm><Primary><literal>Float#</literal></Primary></IndexTerm>
418 <IndexTerm><Primary><literal>Double#</literal></Primary></IndexTerm>
419 <IndexTerm><Primary><literal>Int64#</literal></Primary></IndexTerm>
420 <IndexTerm><Primary><literal>Word64#</literal></Primary></IndexTerm>
423 If you really want to know their exact equivalents in C, see
424 <Filename>ghc/includes/StgTypes.h</Filename> in the GHC source tree.
428 Literals for these types may be written as follows:
437 'a'# a Char#; for weird characters, use e.g. '\o<octal>'#
438 "a"# an Addr# (a `char *'); only characters '\0'..'\255' allowed
441 <IndexTerm><Primary>literals, primitive</Primary></IndexTerm>
442 <IndexTerm><Primary>constants, primitive</Primary></IndexTerm>
443 <IndexTerm><Primary>numbers, primitive</Primary></IndexTerm>
449 <Title>Comparison operations</Title>
452 <IndexTerm><Primary>comparisons, primitive</Primary></IndexTerm>
453 <IndexTerm><Primary>operators, comparison</Primary></IndexTerm>
459 {>,>=,==,/=,<,<=}# :: Int# -> Int# -> Bool
461 {gt,ge,eq,ne,lt,le}Char# :: Char# -> Char# -> Bool
462 -- ditto for Word# and Addr#
465 <IndexTerm><Primary><literal>>#</literal></Primary></IndexTerm>
466 <IndexTerm><Primary><literal>>=#</literal></Primary></IndexTerm>
467 <IndexTerm><Primary><literal>==#</literal></Primary></IndexTerm>
468 <IndexTerm><Primary><literal>/=#</literal></Primary></IndexTerm>
469 <IndexTerm><Primary><literal><#</literal></Primary></IndexTerm>
470 <IndexTerm><Primary><literal><=#</literal></Primary></IndexTerm>
471 <IndexTerm><Primary><literal>gt{Char,Word,Addr}#</literal></Primary></IndexTerm>
472 <IndexTerm><Primary><literal>ge{Char,Word,Addr}#</literal></Primary></IndexTerm>
473 <IndexTerm><Primary><literal>eq{Char,Word,Addr}#</literal></Primary></IndexTerm>
474 <IndexTerm><Primary><literal>ne{Char,Word,Addr}#</literal></Primary></IndexTerm>
475 <IndexTerm><Primary><literal>lt{Char,Word,Addr}#</literal></Primary></IndexTerm>
476 <IndexTerm><Primary><literal>le{Char,Word,Addr}#</literal></Primary></IndexTerm>
482 <Title>Primitive-character operations</Title>
485 <IndexTerm><Primary>characters, primitive operations</Primary></IndexTerm>
486 <IndexTerm><Primary>operators, primitive character</Primary></IndexTerm>
492 ord# :: Char# -> Int#
493 chr# :: Int# -> Char#
496 <IndexTerm><Primary><literal>ord#</literal></Primary></IndexTerm>
497 <IndexTerm><Primary><literal>chr#</literal></Primary></IndexTerm>
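For instance, the two conversions can be combined to step to the next character code. This is a sketch assuming a recent GHC (<Literal>MagicHash</Literal>, <Literal>GHC.Exts</Literal>); <Literal>nextChar</Literal> is a hypothetical helper, and <Literal>chr#</Literal> performs no range check.

```haskell
{-# LANGUAGE MagicHash #-}
-- Assumption: recent GHC; C# is the box around Char#.
import GHC.Exts (Char (C#), chr#, ord#, (+#))

-- Successor of a character, computed on the unboxed code point.
-- No range check: the caller must keep the result a valid character.
nextChar :: Char -> Char
nextChar (C# c) = C# (chr# (ord# c +# 1#))

main :: IO ()
main = print (nextChar 'a')
```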
503 <Title>Primitive-<Literal>Int</Literal> operations</Title>
506 <IndexTerm><Primary>integers, primitive operations</Primary></IndexTerm>
507 <IndexTerm><Primary>operators, primitive integer</Primary></IndexTerm>
513 {+,-,*,quotInt,remInt,gcdInt}# :: Int# -> Int# -> Int#
514 negateInt# :: Int# -> Int#
516 iShiftL#, iShiftRA#, iShiftRL# :: Int# -> Int# -> Int#
517 -- shift left, right arithmetic, right logical
519 addIntC#, subIntC#, mulIntC# :: Int# -> Int# -> (# Int#, Int# #)
520 -- add, subtract, multiply with carry
523 <IndexTerm><Primary><literal>+#</literal></Primary></IndexTerm>
524 <IndexTerm><Primary><literal>-#</literal></Primary></IndexTerm>
525 <IndexTerm><Primary><literal>*#</literal></Primary></IndexTerm>
526 <IndexTerm><Primary><literal>quotInt#</literal></Primary></IndexTerm>
527 <IndexTerm><Primary><literal>remInt#</literal></Primary></IndexTerm>
528 <IndexTerm><Primary><literal>gcdInt#</literal></Primary></IndexTerm>
529 <IndexTerm><Primary><literal>iShiftL#</literal></Primary></IndexTerm>
530 <IndexTerm><Primary><literal>iShiftRA#</literal></Primary></IndexTerm>
531 <IndexTerm><Primary><literal>iShiftRL#</literal></Primary></IndexTerm>
532 <IndexTerm><Primary><literal>addIntC#</literal></Primary></IndexTerm>
533 <IndexTerm><Primary><literal>subIntC#</literal></Primary></IndexTerm>
534 <IndexTerm><Primary><literal>mulIntC#</literal></Primary></IndexTerm>
535 <IndexTerm><Primary>shift operations, integer</Primary></IndexTerm>
539 <Emphasis>Note:</Emphasis> No error/overflow checking!
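Where overflow matters, the carry-reporting variants can be used to build a checked operation by hand. The sketch below assumes a recent GHC (<Literal>MagicHash</Literal>, <Literal>UnboxedTuples</Literal>, <Literal>GHC.Exts</Literal>), in which the second component of the result of <Literal>addIntC#</Literal> is non-zero when the addition overflowed; <Literal>checkedAdd</Literal> is a made-up name.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
-- Assumption: recent GHC; the carry component is non-zero on overflow.
import GHC.Exts (Int (I#), addIntC#, isTrue#, (/=#))

-- Addition that reports overflow instead of silently wrapping around.
checkedAdd :: Int -> Int -> Maybe Int
checkedAdd (I# x) (I# y) =
  case addIntC# x y of
    (# r, c #)
      | isTrue# (c /=# 0#) -> Nothing
      | otherwise          -> Just (I# r)

main :: IO ()
main = do
  print (checkedAdd 1 2)
  print (checkedAdd maxBound 1)
```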
545 <Title>Primitive-<Literal>Double</Literal> and <Literal>Float</Literal> operations</Title>
548 <IndexTerm><Primary>floating point numbers, primitive</Primary></IndexTerm>
549 <IndexTerm><Primary>operators, primitive floating point</Primary></IndexTerm>
555 {+,-,*,/}## :: Double# -> Double# -> Double#
556 {<,<=,==,/=,>=,>}## :: Double# -> Double# -> Bool
557 negateDouble# :: Double# -> Double#
558 double2Int# :: Double# -> Int#
559 int2Double# :: Int# -> Double#
{plus,minus,times,divide}Float# :: Float# -> Float# -> Float#
562 {gt,ge,eq,ne,lt,le}Float# :: Float# -> Float# -> Bool
563 negateFloat# :: Float# -> Float#
564 float2Int# :: Float# -> Int#
565 int2Float# :: Int# -> Float#
571 <IndexTerm><Primary><literal>+##</literal></Primary></IndexTerm>
572 <IndexTerm><Primary><literal>-##</literal></Primary></IndexTerm>
573 <IndexTerm><Primary><literal>*##</literal></Primary></IndexTerm>
574 <IndexTerm><Primary><literal>/##</literal></Primary></IndexTerm>
575 <IndexTerm><Primary><literal><##</literal></Primary></IndexTerm>
576 <IndexTerm><Primary><literal><=##</literal></Primary></IndexTerm>
577 <IndexTerm><Primary><literal>==##</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>/=##</literal></Primary></IndexTerm>
579 <IndexTerm><Primary><literal>>=##</literal></Primary></IndexTerm>
580 <IndexTerm><Primary><literal>>##</literal></Primary></IndexTerm>
581 <IndexTerm><Primary><literal>negateDouble#</literal></Primary></IndexTerm>
582 <IndexTerm><Primary><literal>double2Int#</literal></Primary></IndexTerm>
583 <IndexTerm><Primary><literal>int2Double#</literal></Primary></IndexTerm>
587 <IndexTerm><Primary><literal>plusFloat#</literal></Primary></IndexTerm>
588 <IndexTerm><Primary><literal>minusFloat#</literal></Primary></IndexTerm>
589 <IndexTerm><Primary><literal>timesFloat#</literal></Primary></IndexTerm>
590 <IndexTerm><Primary><literal>divideFloat#</literal></Primary></IndexTerm>
591 <IndexTerm><Primary><literal>gtFloat#</literal></Primary></IndexTerm>
592 <IndexTerm><Primary><literal>geFloat#</literal></Primary></IndexTerm>
593 <IndexTerm><Primary><literal>eqFloat#</literal></Primary></IndexTerm>
594 <IndexTerm><Primary><literal>neFloat#</literal></Primary></IndexTerm>
595 <IndexTerm><Primary><literal>ltFloat#</literal></Primary></IndexTerm>
596 <IndexTerm><Primary><literal>leFloat#</literal></Primary></IndexTerm>
597 <IndexTerm><Primary><literal>negateFloat#</literal></Primary></IndexTerm>
598 <IndexTerm><Primary><literal>float2Int#</literal></Primary></IndexTerm>
599 <IndexTerm><Primary><literal>int2Float#</literal></Primary></IndexTerm>
And a full complement of elementary and trigonometric functions:
609 expDouble# :: Double# -> Double#
610 logDouble# :: Double# -> Double#
611 sqrtDouble# :: Double# -> Double#
612 sinDouble# :: Double# -> Double#
613 cosDouble# :: Double# -> Double#
614 tanDouble# :: Double# -> Double#
615 asinDouble# :: Double# -> Double#
616 acosDouble# :: Double# -> Double#
617 atanDouble# :: Double# -> Double#
618 sinhDouble# :: Double# -> Double#
619 coshDouble# :: Double# -> Double#
620 tanhDouble# :: Double# -> Double#
621 powerDouble# :: Double# -> Double# -> Double#
624 <IndexTerm><Primary>trigonometric functions, primitive</Primary></IndexTerm>
628 similarly for <Literal>Float#</Literal>.
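As an illustration of the <Literal>Double#</Literal> operations working together, here is a sketch of a hypotenuse calculation. It assumes a recent GHC (<Literal>MagicHash</Literal>, <Literal>GHC.Exts</Literal>); <Literal>hypot</Literal> is a name chosen for the example, and the formula is the naive one with no protection against intermediate overflow.

```haskell
{-# LANGUAGE MagicHash #-}
-- Assumption: recent GHC; D# is the box around Double#.
import GHC.Exts (Double (D#), sqrtDouble#, (*##), (+##))

-- sqrt (a*a + b*b), evaluated without boxing any intermediate result.
hypot :: Double -> Double -> Double
hypot (D# a) (D# b) = D# (sqrtDouble# ((a *## a) +## (b *## b)))

main :: IO ()
main = print (hypot 3 4)
```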
632 There are two coercion functions for <Literal>Float#</Literal>/<Literal>Double#</Literal>:
638 float2Double# :: Float# -> Double#
639 double2Float# :: Double# -> Float#
642 <IndexTerm><Primary><literal>float2Double#</literal></Primary></IndexTerm>
643 <IndexTerm><Primary><literal>double2Float#</literal></Primary></IndexTerm>
647 The primitive version of <Function>decodeDouble</Function>
648 (<Function>encodeDouble</Function> is implemented as an external C
655 decodeDouble# :: Double# -> PrelNum.ReturnIntAndGMP
658 <IndexTerm><Primary><literal>encodeDouble#</literal></Primary></IndexTerm>
659 <IndexTerm><Primary><literal>decodeDouble#</literal></Primary></IndexTerm>
663 (And the same for <Literal>Float#</Literal>s.)
668 <Sect2 id="integer-operations">
669 <Title>Operations on/for <Literal>Integers</Literal> (interface to GMP)
673 <IndexTerm><Primary>arbitrary precision integers</Primary></IndexTerm>
674 <IndexTerm><Primary>Integer, operations on</Primary></IndexTerm>
678 We implement <Literal>Integers</Literal> (arbitrary-precision
679 integers) using the GNU multiple-precision (GMP) package (version
684 The data type for <Literal>Integer</Literal> is either a small
685 integer, represented by an <Literal>Int</Literal>, or a large integer
686 represented using the pieces required by GMP's
687 <Literal>MP_INT</Literal> in <Filename>gmp.h</Filename> (see
688 <Filename>gmp.info</Filename> in
689 <Filename>ghc/includes/runtime/gmp</Filename>). It comes out as:
695 data Integer = S# Int# -- small integers
696 | J# Int# ByteArray# -- large integers
699 <IndexTerm><Primary>Integer type</Primary></IndexTerm> The primitive
700 ops to support large <Literal>Integers</Literal> use the
701 “pieces” of the representation, and are as follows:
707 negateInteger# :: Int# -> ByteArray# -> Integer
709 {plus,minus,times}Integer#, gcdInteger#,
710 quotInteger#, remInteger#, divExactInteger#
711 :: Int# -> ByteArray#
712 -> Int# -> ByteArray#
713 -> (# Int#, ByteArray# #)
716 :: Int# -> ByteArray#
717 -> Int# -> ByteArray#
718 -> Int# -- -1 for <; 0 for ==; +1 for >
721 :: Int# -> ByteArray#
723 -> Int# -- -1 for <; 0 for ==; +1 for >
726 :: Int# -> ByteArray#
730 divModInteger#, quotRemInteger#
731 :: Int# -> ByteArray#
732 -> Int# -> ByteArray#
733 -> (# Int#, ByteArray#,
736 integer2Int# :: Int# -> ByteArray# -> Int#
738 int2Integer# :: Int# -> Integer -- NB: no error-checking on these two!
739 word2Integer# :: Word# -> Integer
741 addr2Integer# :: Addr# -> Integer
742 -- the Addr# is taken to be a `char *' string
743 -- to be converted into an Integer.
746 <IndexTerm><Primary><literal>negateInteger#</literal></Primary></IndexTerm>
747 <IndexTerm><Primary><literal>plusInteger#</literal></Primary></IndexTerm>
748 <IndexTerm><Primary><literal>minusInteger#</literal></Primary></IndexTerm>
749 <IndexTerm><Primary><literal>timesInteger#</literal></Primary></IndexTerm>
750 <IndexTerm><Primary><literal>quotInteger#</literal></Primary></IndexTerm>
751 <IndexTerm><Primary><literal>remInteger#</literal></Primary></IndexTerm>
752 <IndexTerm><Primary><literal>gcdInteger#</literal></Primary></IndexTerm>
753 <IndexTerm><Primary><literal>gcdIntegerInt#</literal></Primary></IndexTerm>
754 <IndexTerm><Primary><literal>divExactInteger#</literal></Primary></IndexTerm>
755 <IndexTerm><Primary><literal>cmpInteger#</literal></Primary></IndexTerm>
756 <IndexTerm><Primary><literal>divModInteger#</literal></Primary></IndexTerm>
757 <IndexTerm><Primary><literal>quotRemInteger#</literal></Primary></IndexTerm>
758 <IndexTerm><Primary><literal>integer2Int#</literal></Primary></IndexTerm>
759 <IndexTerm><Primary><literal>int2Integer#</literal></Primary></IndexTerm>
760 <IndexTerm><Primary><literal>word2Integer#</literal></Primary></IndexTerm>
761 <IndexTerm><Primary><literal>addr2Integer#</literal></Primary></IndexTerm>
767 <Title>Words and addresses</Title>
770 <IndexTerm><Primary>word, primitive type</Primary></IndexTerm>
771 <IndexTerm><Primary>address, primitive type</Primary></IndexTerm>
772 <IndexTerm><Primary>unsigned integer, primitive type</Primary></IndexTerm>
773 <IndexTerm><Primary>pointer, primitive type</Primary></IndexTerm>
777 A <Literal>Word#</Literal> is used for bit-twiddling operations.
It is the same size as an <Literal>Int#</Literal>, but has neither a sign
nor any arithmetic operations.
782 type Word# -- Same size/etc as Int# but *unsigned*
783 type Addr# -- A pointer from outside the "Haskell world" (from C, probably);
784 -- described under "arrays"
787 <IndexTerm><Primary><literal>Word#</literal></Primary></IndexTerm>
788 <IndexTerm><Primary><literal>Addr#</literal></Primary></IndexTerm>
792 <Literal>Word#</Literal>s and <Literal>Addr#</Literal>s have
793 the usual comparison operations. Other
794 unboxed-<Literal>Word</Literal> ops (bit-twiddling and coercions):
800 {gt,ge,eq,ne,lt,le}Word# :: Word# -> Word# -> Bool
802 and#, or#, xor# :: Word# -> Word# -> Word#
805 quotWord#, remWord# :: Word# -> Word# -> Word#
806 -- word (i.e. unsigned) versions are different from int
807 -- versions, so we have to provide these explicitly.
809 not# :: Word# -> Word#
811 shiftL#, shiftRL# :: Word# -> Int# -> Word#
812 -- shift left, right logical
814 int2Word# :: Int# -> Word# -- just a cast, really
815 word2Int# :: Word# -> Int#
818 <IndexTerm><Primary>bit operations, Word and Addr</Primary></IndexTerm>
819 <IndexTerm><Primary><literal>gtWord#</literal></Primary></IndexTerm>
820 <IndexTerm><Primary><literal>geWord#</literal></Primary></IndexTerm>
821 <IndexTerm><Primary><literal>eqWord#</literal></Primary></IndexTerm>
822 <IndexTerm><Primary><literal>neWord#</literal></Primary></IndexTerm>
823 <IndexTerm><Primary><literal>ltWord#</literal></Primary></IndexTerm>
824 <IndexTerm><Primary><literal>leWord#</literal></Primary></IndexTerm>
825 <IndexTerm><Primary><literal>and#</literal></Primary></IndexTerm>
826 <IndexTerm><Primary><literal>or#</literal></Primary></IndexTerm>
827 <IndexTerm><Primary><literal>xor#</literal></Primary></IndexTerm>
828 <IndexTerm><Primary><literal>not#</literal></Primary></IndexTerm>
829 <IndexTerm><Primary><literal>quotWord#</literal></Primary></IndexTerm>
830 <IndexTerm><Primary><literal>remWord#</literal></Primary></IndexTerm>
831 <IndexTerm><Primary><literal>shiftL#</literal></Primary></IndexTerm>
832 <IndexTerm><Primary><literal>shiftRA#</literal></Primary></IndexTerm>
833 <IndexTerm><Primary><literal>shiftRL#</literal></Primary></IndexTerm>
834 <IndexTerm><Primary><literal>int2Word#</literal></Primary></IndexTerm>
835 <IndexTerm><Primary><literal>word2Int#</literal></Primary></IndexTerm>
839 Unboxed-<Literal>Addr</Literal> ops (C casts, really):
842 {gt,ge,eq,ne,lt,le}Addr# :: Addr# -> Addr# -> Bool
844 int2Addr# :: Int# -> Addr#
845 addr2Int# :: Addr# -> Int#
846 addr2Integer# :: Addr# -> (# Int#, ByteArray# #)
849 <IndexTerm><Primary><literal>gtAddr#</literal></Primary></IndexTerm>
850 <IndexTerm><Primary><literal>geAddr#</literal></Primary></IndexTerm>
851 <IndexTerm><Primary><literal>eqAddr#</literal></Primary></IndexTerm>
852 <IndexTerm><Primary><literal>neAddr#</literal></Primary></IndexTerm>
853 <IndexTerm><Primary><literal>ltAddr#</literal></Primary></IndexTerm>
854 <IndexTerm><Primary><literal>leAddr#</literal></Primary></IndexTerm>
855 <IndexTerm><Primary><literal>int2Addr#</literal></Primary></IndexTerm>
856 <IndexTerm><Primary><literal>addr2Int#</literal></Primary></IndexTerm>
857 <IndexTerm><Primary><literal>addr2Integer#</literal></Primary></IndexTerm>
861 The casts between <Literal>Int#</Literal>,
862 <Literal>Word#</Literal> and <Literal>Addr#</Literal>
863 correspond to null operations at the machine level, but are required
864 to keep the Haskell type checker happy.
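A sketch of the casts in use: to apply a bit operation to an <Literal>Int#</Literal>, cast it to <Literal>Word#</Literal>, operate, and cast back. This assumes a recent GHC (<Literal>MagicHash</Literal>, <Literal>GHC.Exts</Literal>); <Literal>lowByte</Literal> is a hypothetical helper.

```haskell
{-# LANGUAGE MagicHash #-}
-- Assumption: recent GHC; the casts below cost nothing at run time.
import GHC.Exts (Int (I#), and#, int2Word#, word2Int#)

-- Keep only the least significant eight bits of an Int.
lowByte :: Int -> Int
lowByte (I# x) = I# (word2Int# (and# (int2Word# x) (int2Word# 255#)))

main :: IO ()
main = do
  print (lowByte 0x1234)
  print (lowByte (-1))
```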
868 Operations for indexing off of C pointers
869 (<Literal>Addr#</Literal>s) to snatch values are listed under
870 “arrays”.
876 <Title>Arrays</Title>
879 <IndexTerm><Primary>arrays, primitive</Primary></IndexTerm>
883 The type <Literal>Array# elt</Literal> is the type of primitive,
884 unpointed arrays of values of type <Literal>elt</Literal>.
893 <IndexTerm><Primary><literal>Array#</literal></Primary></IndexTerm>
897 <Literal>Array#</Literal> is more primitive than a Haskell
898 array—indeed, the Haskell <Literal>Array</Literal> interface is
899 implemented using <Literal>Array#</Literal>—in that an
900 <Literal>Array#</Literal> is indexed only by
901 <Literal>Int#</Literal>s, starting at zero. It is also more
902 primitive by virtue of being unboxed. That doesn't mean that it isn't
903 a heap-allocated object—of course, it is. Rather, being unboxed
904 means that it is represented by a pointer to the array itself, and not
905 to a thunk which will evaluate to the array (or to bottom). The
906 components of an <Literal>Array#</Literal> are themselves boxed.
910 The type <Literal>ByteArray#</Literal> is similar to
911 <Literal>Array#</Literal>, except that it contains just a string
912 of (non-pointer) bytes.
921 <IndexTerm><Primary><literal>ByteArray#</literal></Primary></IndexTerm>
925 Arrays of these types are useful when a Haskell program wishes to
926 construct a value to pass to a C procedure. It is also possible to use
927 them to build (say) arrays of unboxed characters for internal use in a
928 Haskell program. Given these uses, <Literal>ByteArray#</Literal>
929 is deliberately a bit vague about the type of its components.
930 Operations are provided to extract values of type
931 <Literal>Char#</Literal>, <Literal>Int#</Literal>,
932 <Literal>Float#</Literal>, <Literal>Double#</Literal>, and
933 <Literal>Addr#</Literal> from arbitrary offsets within a
934 <Literal>ByteArray#</Literal>. (For type
<Literal>Foo#</Literal>, the <Emphasis>i</Emphasis>th offset gets you the <Emphasis>i</Emphasis>th
<Literal>Foo#</Literal>, not the <Literal>Foo#</Literal> at
byte-position <Emphasis>i</Emphasis>. Mumble.) (If you want a
938 <Literal>Word#</Literal>, grab an <Literal>Int#</Literal>,
943 Lastly, we have static byte-arrays, of type
944 <Literal>Addr#</Literal> [mentioned previously]. (Remember
the duality between arrays and pointers in C.) Arrays of this type
946 are represented by a pointer to an array in the world outside Haskell,
947 so this pointer is not followed by the garbage collector. In other
948 respects they are just like <Literal>ByteArray#</Literal>. They
949 are only needed in order to pass values from C to Haskell.
955 <Title>Reading and writing</Title>
958 Primitive arrays are linear, and indexed starting at zero.
962 The size and indices of a <Literal>ByteArray#</Literal>, <Literal>Addr#</Literal>, and
963 <Literal>MutableByteArray#</Literal> are all in bytes. It's up to the program to
964 calculate the correct byte offset from the start of the array. This
965 allows a <Literal>ByteArray#</Literal> to contain a mixture of values of different
966 type, which is often needed when preparing data for and unpicking
967 results from C. (Umm…not true of indices…WDP 95/09)
971 <Emphasis>Should we provide some <Literal>sizeOfDouble#</Literal> constants?</Emphasis>
975 Out-of-range errors on indexing should be caught by the code which
976 uses the primitive operation; the primitive operations themselves do
977 <Emphasis>not</Emphasis> check for out-of-range indexes. The intention is that the
978 primitive ops compile to one machine instruction or thereabouts.
982 We use the terms “reading” and “writing” to refer to accessing
983 <Emphasis>mutable</Emphasis> arrays (see <XRef LinkEnd="sect-mutable">), and
984 “indexing” to refer to reading a value from an <Emphasis>immutable</Emphasis>
989 Immutable byte arrays are straightforward to index (all indices in bytes):
992 indexCharArray# :: ByteArray# -> Int# -> Char#
993 indexIntArray# :: ByteArray# -> Int# -> Int#
994 indexAddrArray# :: ByteArray# -> Int# -> Addr#
995 indexFloatArray# :: ByteArray# -> Int# -> Float#
996 indexDoubleArray# :: ByteArray# -> Int# -> Double#
998 indexCharOffAddr# :: Addr# -> Int# -> Char#
999 indexIntOffAddr# :: Addr# -> Int# -> Int#
1000 indexFloatOffAddr# :: Addr# -> Int# -> Float#
1001 indexDoubleOffAddr# :: Addr# -> Int# -> Double#
1002 indexAddrOffAddr# :: Addr# -> Int# -> Addr#
1003 -- Get an Addr# from an Addr# offset
1006 <IndexTerm><Primary><literal>indexCharArray#</literal></Primary></IndexTerm>
1007 <IndexTerm><Primary><literal>indexIntArray#</literal></Primary></IndexTerm>
1008 <IndexTerm><Primary><literal>indexAddrArray#</literal></Primary></IndexTerm>
1009 <IndexTerm><Primary><literal>indexFloatArray#</literal></Primary></IndexTerm>
1010 <IndexTerm><Primary><literal>indexDoubleArray#</literal></Primary></IndexTerm>
1011 <IndexTerm><Primary><literal>indexCharOffAddr#</literal></Primary></IndexTerm>
1012 <IndexTerm><Primary><literal>indexIntOffAddr#</literal></Primary></IndexTerm>
1013 <IndexTerm><Primary><literal>indexFloatOffAddr#</literal></Primary></IndexTerm>
1014 <IndexTerm><Primary><literal>indexDoubleOffAddr#</literal></Primary></IndexTerm>
1015 <IndexTerm><Primary><literal>indexAddrOffAddr#</literal></Primary></IndexTerm>
1019 The last of these, <Function>indexAddrOffAddr#</Function>, extracts an <Literal>Addr#</Literal> using an offset
1020 from another <Literal>Addr#</Literal>, thereby providing the ability to follow a chain of
1025 Something a bit more interesting goes on when indexing arrays of boxed
1026 objects, because the result is simply the boxed object. So presumably
1027 it should be entered—we never usually return an unevaluated
1028 object! This is a pain: primitive ops aren't supposed to do
1029 complicated things like enter objects. The current solution is to
return a single-element unboxed tuple (see <XRef LinkEnd="unboxed-tuples">).
1036 indexArray# :: Array# elt -> Int# -> (# elt #)
1039 <IndexTerm><Primary><literal>indexArray#</literal></Primary></IndexTerm>
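The single-element tuple can be seen in the following sketch, which builds a small boxed array and reads one element back. It assumes a recent GHC (<Literal>MagicHash</Literal>, <Literal>UnboxedTuples</Literal>, <Literal>GHC.Exts</Literal>, and the <Literal>ST</Literal> constructor from <Literal>GHC.ST</Literal>); the wrapper type <Literal>Arr</Literal> and the functions <Literal>replicateArr</Literal> and <Literal>indexArr</Literal> are invented for the example.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
-- Assumption: recent GHC; newArray# and unsafeFreezeArray# come from GHC.Exts.
import Control.Monad.ST (runST)
import GHC.Exts (Array#, Int (I#), indexArray#, newArray#, unsafeFreezeArray#)
import GHC.ST (ST (..))

-- A boxed wrapper, so the primitive array can be passed around like any value.
data Arr a = Arr (Array# a)

-- An array of n copies of x, built mutably and then frozen.
replicateArr :: Int -> a -> Arr a
replicateArr (I# n) x = runST (ST (\s0 ->
  case newArray# n x s0 of
    (# s1, marr #) ->
      case unsafeFreezeArray# marr s1 of
        (# s2, arr #) -> (# s2, Arr arr #)))

-- Matching on (# x #) performs the lookup without evaluating the element.
indexArr :: Arr a -> Int -> a
indexArr (Arr arr) (I# i) =
  case indexArray# arr i of
    (# x #) -> x

main :: IO ()
main = print (indexArr (replicateArr 3 'x') 2)
```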
1045 <Title>The state type</Title>
1048 <IndexTerm><Primary><literal>state, primitive type</literal></Primary></IndexTerm>
1049 <IndexTerm><Primary><literal>State#</literal></Primary></IndexTerm>
1053 The primitive type <Literal>State#</Literal> represents the state of a state
1054 transformer. It is parameterised on the desired type of state, which
1055 serves to keep states from distinct threads distinct from one another.
1056 But the <Emphasis>only</Emphasis> effect of this parameterisation is in the type
1057 system: all values of type <Literal>State#</Literal> are represented in the same way.
1058 Indeed, they are all represented by nothing at all! The code
1059 generator “knows” to generate no code, and allocate no registers
1060 etc, for primitive states.
1072 The type <Literal>GHC.RealWorld</Literal> is truly opaque: there are no values defined
1073 of this type, and no operations over it. It is “primitive” in that
1074 sense - but it is <Emphasis>not unlifted!</Emphasis> Its only role in life is to be
1075 the type which distinguishes the <Literal>IO</Literal> state transformer.
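To make the role of the state token concrete, the sketch below builds <Literal>IO</Literal> actions by hand from functions over <Literal>State# RealWorld</Literal>. It assumes a current GHC, where the <Literal>IO</Literal> constructor is exported from <Literal>GHC.IO</Literal> and unboxed tuples need the <Literal>UnboxedTuples</Literal> extension; those module and flag names are not taken from this section, and <Literal>returnIO'</Literal> and <Literal>thenPair</Literal> are illustrative names.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.IO (IO (..))

-- Return a value while passing the world token through untouched.
-- The wrapped function has type
--   State# RealWorld -> (# State# RealWorld, a #)
returnIO' :: a -> IO a
returnIO' x = IO (\s -> (# s, x #))

-- Sequence two actions by hand, threading the state token from the
-- first action into the second.
thenPair :: IO a -> IO b -> IO (a, b)
thenPair (IO m) (IO k) = IO (\s0 ->
  case m s0 of
    (# s1, a #) -> case k s1 of
      (# s2, b #) -> (# s2, (a, b) #))
```

Nothing is stored in the token itself; it only forces the two steps to happen in order.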
1089 <Title>State of the world</Title>
1092 A single, primitive, value of type <Literal>State# RealWorld</Literal> is provided.
1098 realWorld# :: State# RealWorld
1101 <IndexTerm><Primary>realWorld# state object</Primary></IndexTerm>
1105 (Note: in the compiler, not a <Literal>PrimOp</Literal>; just a mucho magic
1106 <Literal>Id</Literal>. Exported from <Literal>GHC</Literal>, though).
1111 <Sect2 id="sect-mutable">
1112 <Title>Mutable arrays</Title>
1115 <IndexTerm><Primary>mutable arrays</Primary></IndexTerm>
1116 <IndexTerm><Primary>arrays, mutable</Primary></IndexTerm>
1117 Corresponding to <Literal>Array#</Literal> and <Literal>ByteArray#</Literal>, we have the types of
1118 mutable versions of each. In each case, the representation is a
1119 pointer to a suitable block of (mutable) heap-allocated storage.
1125 type MutableArray# s elt
1126 type MutableByteArray# s
1129 <IndexTerm><Primary><literal>MutableArray#</literal></Primary></IndexTerm>
1130 <IndexTerm><Primary><literal>MutableByteArray#</literal></Primary></IndexTerm>
1134 <Title>Allocation</Title>
1137 <IndexTerm><Primary>mutable arrays, allocation</Primary></IndexTerm>
1138 <IndexTerm><Primary>arrays, allocation</Primary></IndexTerm>
1139 <IndexTerm><Primary>allocation, of mutable arrays</Primary></IndexTerm>
1143 Mutable arrays can be allocated. Only pointer-arrays are initialised;
1144 arrays of non-pointers are filled in by “user code” rather than by
1145 the array-allocation primitive. Reason: only the pointer case has to
1146 worry about GC striking with a partly-initialised array.
newArray#       :: Int# -> elt -> State# s -> (# State# s, MutableArray# s elt #)

newCharArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newIntArray#    :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newAddrArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newFloatArray#  :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newDoubleArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
1161 <IndexTerm><Primary><literal>newArray#</literal></Primary></IndexTerm>
1162 <IndexTerm><Primary><literal>newCharArray#</literal></Primary></IndexTerm>
1163 <IndexTerm><Primary><literal>newIntArray#</literal></Primary></IndexTerm>
1164 <IndexTerm><Primary><literal>newAddrArray#</literal></Primary></IndexTerm>
1165 <IndexTerm><Primary><literal>newFloatArray#</literal></Primary></IndexTerm>
1166 <IndexTerm><Primary><literal>newDoubleArray#</literal></Primary></IndexTerm>
1170 The size of a <Literal>ByteArray#</Literal> is given in bytes.
1176 <Title>Reading and writing</Title>
1179 <IndexTerm><Primary>arrays, reading and writing</Primary></IndexTerm>
1185 readArray# :: MutableArray# s elt -> Int# -> State# s -> (# State# s, elt #)
1186 readCharArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Char# #)
1187 readIntArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #)
1188 readAddrArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Addr# #)
1189 readFloatArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Float# #)
1190 readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Double# #)
1192 writeArray# :: MutableArray# s elt -> Int# -> elt -> State# s -> State# s
1193 writeCharArray# :: MutableByteArray# s -> Int# -> Char# -> State# s -> State# s
1194 writeIntArray# :: MutableByteArray# s -> Int# -> Int# -> State# s -> State# s
1195 writeAddrArray# :: MutableByteArray# s -> Int# -> Addr# -> State# s -> State# s
1196 writeFloatArray# :: MutableByteArray# s -> Int# -> Float# -> State# s -> State# s
1197 writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s
1200 <IndexTerm><Primary><literal>readArray#</literal></Primary></IndexTerm>
1201 <IndexTerm><Primary><literal>readCharArray#</literal></Primary></IndexTerm>
1202 <IndexTerm><Primary><literal>readIntArray#</literal></Primary></IndexTerm>
1203 <IndexTerm><Primary><literal>readAddrArray#</literal></Primary></IndexTerm>
1204 <IndexTerm><Primary><literal>readFloatArray#</literal></Primary></IndexTerm>
1205 <IndexTerm><Primary><literal>readDoubleArray#</literal></Primary></IndexTerm>
1206 <IndexTerm><Primary><literal>writeArray#</literal></Primary></IndexTerm>
1207 <IndexTerm><Primary><literal>writeCharArray#</literal></Primary></IndexTerm>
1208 <IndexTerm><Primary><literal>writeIntArray#</literal></Primary></IndexTerm>
1209 <IndexTerm><Primary><literal>writeAddrArray#</literal></Primary></IndexTerm>
1210 <IndexTerm><Primary><literal>writeFloatArray#</literal></Primary></IndexTerm>
1211 <IndexTerm><Primary><literal>writeDoubleArray#</literal></Primary></IndexTerm>
1217 <Title>Equality</Title>
1220 <IndexTerm><Primary>arrays, testing for equality</Primary></IndexTerm>
1224 One can take “equality” of mutable arrays. What is compared is the
1225 <Emphasis>name</Emphasis> or reference to the mutable array, not its contents.
1231 sameMutableArray# :: MutableArray# s elt -> MutableArray# s elt -> Bool
1232 sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
1235 <IndexTerm><Primary><literal>sameMutableArray#</literal></Primary></IndexTerm>
1236 <IndexTerm><Primary><literal>sameMutableByteArray#</literal></Primary></IndexTerm>
1242 <Title>Freezing mutable arrays</Title>
1245 <IndexTerm><Primary>arrays, freezing mutable</Primary></IndexTerm>
1246 <IndexTerm><Primary>freezing mutable arrays</Primary></IndexTerm>
1247 <IndexTerm><Primary>mutable arrays, freezing</Primary></IndexTerm>
1251 Only unsafe-freeze has a primitive. (Safe freeze is done directly in Haskell
1252 by copying the array and then using <Function>unsafeFreeze</Function>.)
unsafeFreezeArray#     :: MutableArray# s elt -> State# s -> (# State# s, Array# elt #)
unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> (# State# s, ByteArray# #)
1262 <IndexTerm><Primary><literal>unsafeFreezeArray#</literal></Primary></IndexTerm>
1263 <IndexTerm><Primary><literal>unsafeFreezeByteArray#</literal></Primary></IndexTerm>
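The allocate, write, freeze, index sequence described above can be sketched as follows. This is an illustration under stated assumptions rather than the library's own code: it targets a current GHC, where byte arrays are allocated with <Literal>newByteArray#</Literal> (the per-type allocators listed above are not assumed to exist), the primitives come from <Literal>GHC.Exts</Literal>, and the <Literal>ST</Literal> constructor comes from <Literal>GHC.ST</Literal>. The function names are invented for the example.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import Foreign.Storable (sizeOf)
import GHC.Exts
  ( ByteArray#, Int (I#), MutableByteArray#, State#
  , indexIntArray#, newByteArray#, unsafeFreezeByteArray#, writeIntArray# )
import GHC.ST (ST (..), runST)

-- Write i*i into slot i for every i in [0, n), threading the state token.
fillSquares :: MutableByteArray# s -> Int -> State# s -> State# s
fillSquares marr n = go 0
  where
    go i s
      | i >= n = s
      | otherwise =
          case (i, i * i) of
            (I# i#, I# v#) -> go (i + 1) (writeIntArray# marr i# v# s)

-- Sum the first n Ints of a frozen array using the pure indexing primitive.
sumFrozen :: ByteArray# -> Int -> Int
sumFrozen arr n = sum [I# (indexIntArray# arr i#) | I# i# <- [0 .. n - 1]]

-- Allocate room for n Ints (size given in bytes), fill, freeze, then read.
-- n is assumed to be non-negative.
sumSquares :: Int -> Int
sumSquares n = runST (ST (\s0 ->
  case n * sizeOf (0 :: Int) of
    I# nbytes -> case newByteArray# nbytes s0 of
      (# s1, marr #) ->
        case unsafeFreezeByteArray# marr (fillSquares marr n s1) of
          (# s2, arr #) -> (# s2, sumFrozen arr n #)))
```

The freeze is "unsafe" only in the sense that the mutable array must not be written again afterwards; here it goes out of scope immediately, so the frozen view can be read with pure code.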
<Title>Synchronising variables (M-vars)</Title>
1274 <IndexTerm><Primary>synchronising variables (M-vars)</Primary></IndexTerm>
1275 <IndexTerm><Primary>M-Vars</Primary></IndexTerm>
1279 Synchronising variables are the primitive type used to implement
1280 Concurrent Haskell's MVars (see the Concurrent Haskell paper for
1281 the operational behaviour of these operations).
type MVar# s elt    -- primitive

newMVar#    :: State# s -> (# State# s, MVar# s elt #)
takeMVar#   :: MVar# s elt -> State# s -> (# State# s, elt #)
putMVar#    :: MVar# s elt -> elt -> State# s -> State# s
<IndexTerm><Primary><literal>MVar#</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>newMVar#</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>takeMVar#</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>putMVar#</literal></Primary></IndexTerm>
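A small sketch of the three operations working together: create an empty variable, fill it, and empty it again. It assumes a current GHC, where the primitives are exported from <Literal>GHC.Exts</Literal>, the <Literal>IO</Literal> constructor from <Literal>GHC.IO</Literal>, and <Literal>putMVar#</Literal> takes the value to store as its second argument. The name <Literal>roundTrip</Literal> is illustrative.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (newMVar#, putMVar#, takeMVar#)
import GHC.IO (IO (..))

-- Put a value into a fresh MVar# and take it straight back out.
-- The put cannot block because the variable starts out empty, and the
-- take cannot block because the put has just filled it.
roundTrip :: a -> IO a
roundTrip x = IO (\s0 ->
  case newMVar# s0 of
    (# s1, mv #) -> case putMVar# mv x s1 of
      s2 -> takeMVar# mv s2)
```

Ordinary programs would use the boxed <Literal>MVar</Literal> interface instead; the point here is only to show how the state token orders the three steps.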
1304 <Sect1 id="glasgow-ST-monad">
1305 <Title>Primitive state-transformer monad
1309 <IndexTerm><Primary>state transformers (Glasgow extensions)</Primary></IndexTerm>
1310 <IndexTerm><Primary>ST monad (Glasgow extension)</Primary></IndexTerm>
1314 This monad underlies our implementation of arrays, mutable and
1315 immutable, and our implementation of I/O, including “C calls”.
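For readers who have not met the monad before, here is a minimal sketch of its use. It assumes the module names of a current <Literal>base</Literal> library (<Literal>Control.Monad.ST</Literal> and <Literal>Data.STRef</Literal>), which are not the names used in this manual; <Literal>sumST</Literal> is an illustrative function.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Sum a list with a mutable accumulator. The mutation is invisible from
-- outside: runST guarantees the reference cannot escape, so sumST is an
-- ordinary pure function.
sumST :: [Int] -> Int
sumST xs = runST (do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref)
```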
1319 The <Literal>ST</Literal> library, which provides access to the
1320 <Function>ST</Function> monad, is described in <xref
1326 <Sect1 id="glasgow-prim-arrays">
1327 <Title>Primitive arrays, mutable and otherwise
1331 <IndexTerm><Primary>primitive arrays (Glasgow extension)</Primary></IndexTerm>
1332 <IndexTerm><Primary>arrays, primitive (Glasgow extension)</Primary></IndexTerm>
1336 GHC knows about quite a few flavours of Large Swathes of Bytes.
1340 First, GHC distinguishes between primitive arrays of (boxed) Haskell
1341 objects (type <Literal>Array# obj</Literal>) and primitive arrays of bytes (type
1342 <Literal>ByteArray#</Literal>).
1346 Second, it distinguishes between…
1350 <Term>Immutable:</Term>
1353 Arrays that do not change (as with “standard” Haskell arrays); you
1354 can only read from them. Obviously, they do not need the care and
1355 attention of the state-transformer monad.
1360 <Term>Mutable:</Term>
1363 Arrays that may be changed or “mutated.” All the operations on them
1364 live within the state-transformer monad and the updates happen
1365 <Emphasis>in-place</Emphasis>.
1370 <Term>“Static” (in C land):</Term>
1373 A C routine may pass an <Literal>Addr#</Literal> pointer back into Haskell land. There
1374 are then primitive operations with which you may merrily grab values
1375 over in C land, by indexing off the “static” pointer.
1380 <Term>“Stable” pointers:</Term>
1383 If, for some reason, you wish to hand a Haskell pointer (i.e.,
1384 <Emphasis>not</Emphasis> an unboxed value) to a C routine, you first make the
1385 pointer “stable,” so that the garbage collector won't forget that it
1386 exists. That is, GHC provides a safe way to pass Haskell pointers to
1391 Please see <XRef LinkEnd="sec-stable-pointers"> for more details.
1396 <Term>“Foreign objects”:</Term>
1399 A “foreign object” is a safe way to pass an external object (a
1400 C-allocated pointer, say) to Haskell and have Haskell do the Right
1401 Thing when it no longer references the object. So, for example, C
1402 could pass a large bitmap over to Haskell and say “please free this
1403 memory when you're done with it.”
1407 Please see <XRef LinkEnd="sec-ForeignObj"> for more details.
The libraries documentation gives more details on all these
“primitive array” types and the operations on them.
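Of the flavours above, stable pointers are the easiest to try out without writing any C. The sketch below assumes the <Literal>Foreign.StablePtr</Literal> module of a current <Literal>base</Literal> library, which may differ from the interface described elsewhere in this manual; <Literal>roundTripStable</Literal> is an illustrative name.

```haskell
import Foreign.StablePtr (deRefStablePtr, freeStablePtr, newStablePtr)

-- Make a Haskell value stable, look it up again, and release it.
-- In real use the StablePtr would be handed to a C routine between the
-- first and second steps.
roundTripStable :: a -> IO a
roundTripStable x = do
  sp <- newStablePtr x     -- the garbage collector now keeps x alive
  y  <- deRefStablePtr sp  -- recover the Haskell value from the handle
  freeStablePtr sp         -- allow x to be collected again
  pure y
```

Forgetting the final <Literal>freeStablePtr</Literal> leaks the value, because the collector has no other way to learn that the foreign side is finished with it.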
1422 <Sect1 id="pattern-guards">
1423 <Title>Pattern guards</Title>
1426 <IndexTerm><Primary>Pattern guards (Glasgow extension)</Primary></IndexTerm>
1427 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ULink URL="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ULink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
1431 Suppose we have an abstract data type of finite maps, with a
1435 lookup :: FiniteMap -> Int -> Maybe Int
1438 The lookup returns <Function>Nothing</Function> if the supplied key is not in the domain of the mapping, and <Function>(Just v)</Function> otherwise,
1439 where <VarName>v</VarName> is the value that the key maps to. Now consider the following definition:
clunky env var1 var2 | ok1 && ok2 = val1 + val2
                     | otherwise  = var1 + var2
  where
    m1 = lookup env var1
    m2 = lookup env var2
    ok1 = maybeToBool m1
    ok2 = maybeToBool m2
    val1 = expectJust m1
    val2 = expectJust m2
1455 The auxiliary functions are
1459 maybeToBool :: Maybe a -> Bool
1460 maybeToBool (Just x) = True
1461 maybeToBool Nothing = False
1463 expectJust :: Maybe a -> a
1464 expectJust (Just x) = x
1465 expectJust Nothing = error "Unexpected Nothing"
What is <Function>clunky</Function> doing? The guard <Literal>ok1 &&
ok2</Literal> checks that both lookups succeed, using
<Function>maybeToBool</Function> to convert the <Function>Maybe</Function>
types to booleans. The (lazily evaluated) <Function>expectJust</Function>
calls extract the values from the results of the lookups, and bind the
returned values to <VarName>val1</VarName> and <VarName>val2</VarName>
respectively. If either lookup fails, then <Function>clunky</Function> takes the
<Literal>otherwise</Literal> case and returns the sum of its arguments.
1480 This is certainly legal Haskell, but it is a tremendously verbose and
1481 un-obvious way to achieve the desired effect. Arguably, a more direct way
1482 to write clunky would be to use case expressions:
clunky env var1 var2 = case lookup env var1 of
    Nothing -> fail
    Just val1 -> case lookup env var2 of
      Nothing -> fail
      Just val2 -> val1 + val2
  where fail = var1 + var2
1496 This is a bit shorter, but hardly better. Of course, we can rewrite any set
1497 of pattern-matching, guarded equations as case expressions; that is
1498 precisely what the compiler does when compiling equations! The reason that
1499 Haskell provides guarded equations is because they allow us to write down
1500 the cases we want to consider, one at a time, independently of each other.
1501 This structure is hidden in the case version. Two of the right-hand sides
1502 are really the same (<Function>fail</Function>), and the whole expression
1503 tends to become more and more indented.
1507 Here is how I would write clunky:
clunky env var1 var2
  | Just val1 <- lookup env var1
  , Just val2 <- lookup env var2
  = val1 + val2
...other equations for clunky...
The semantics should be clear enough. The qualifiers are matched in order.
For a <Literal><-</Literal> qualifier, which I call a pattern guard, the
right hand side is evaluated and matched against the pattern on the left.
If the match fails then the whole guard fails and the next equation is
tried. If it succeeds, then the appropriate binding takes place, and the
next qualifier is matched, in the augmented environment. Unlike list
comprehensions, however, the type of the expression to the right of the
<Literal><-</Literal> is the same as the type of the pattern to its
left. The bindings introduced by pattern guards scope over all the
remaining guard qualifiers, and over the right hand side of the equation.
Just as with list comprehensions, boolean expressions can be freely mixed
with the pattern guards. For example:
1544 Haskell's current guards therefore emerge as a special case, in which the
1545 qualifier list has just one element, a boolean expression.
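The pieces of this section can be put together into a complete, loadable sketch. The finite map is modelled here as a plain association list, and <Literal>lookupFM</Literal> and <Literal>sumIfSmall</Literal> are names invented for the example; the abstract <Literal>FiniteMap</Literal> type in the discussion above is not assumed to look like this.

```haskell
-- A stand-in for the abstract finite map: a list of key/value pairs.
type FiniteMap = [(Int, Int)]

-- Lookup with the argument order used in the text (map first, key second).
lookupFM :: FiniteMap -> Int -> Maybe Int
lookupFM env key = lookup key env

-- Both lookups must succeed for the first equation to fire; otherwise
-- matching falls through to the second equation.
clunky :: FiniteMap -> Int -> Int -> Int
clunky env var1 var2
  | Just val1 <- lookupFM env var1
  , Just val2 <- lookupFM env var2
  = val1 + val2
clunky _ var1 var2 = var1 + var2

-- A boolean condition mixed in between two pattern guards. It can use
-- val1 because earlier bindings scope over later qualifiers.
sumIfSmall :: FiniteMap -> Int -> Int -> Maybe Int
sumIfSmall env var1 var2
  | Just val1 <- lookupFM env var1
  , val1 < 100
  , Just val2 <- lookupFM env var2
  = Just (val1 + val2)
  | otherwise = Nothing
```

With <Literal>env = [(1,10),(2,20)]</Literal>, <Literal>clunky env 1 2</Literal> takes the first equation, while <Literal>clunky env 1 3</Literal> falls through because the second lookup fails.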
1549 <sect1 id="sec-ffi">
1550 <title>The foreign interface</title>
1552 <para>The foreign interface consists of the following components:</para>
1556 <para>The Foreign Function Interface language specification
1557 (included in this manual, in <xref linkend="ffi">).</para>
1561 <para>The <literal>Foreign</literal> module (see <xref
1562 linkend="sec-Foreign">) collects together several interfaces
1563 which are useful in specifying foreign language
1564 interfaces, including the following:</para>
1568 <para>The <literal>ForeignObj</literal> module (see <xref
1569 linkend="sec-ForeignObj">), for managing pointers from
1570 Haskell into the outside world.</para>
1574 <para>The <literal>StablePtr</literal> module (see <xref
1575 linkend="sec-stable-pointers">), for managing pointers
1576 into Haskell from the outside world.</para>
1580 <para>The <literal>CTypes</literal> module (see <xref
1581 linkend="sec-CTypes">) gives Haskell equivalents for the
1582 standard C datatypes, for use in making Haskell bindings
1583 to existing C libraries.</para>
1587 <para>The <literal>CTypesISO</literal> module (see <xref
1588 linkend="sec-CTypesISO">) gives Haskell equivalents for C
1589 types defined by the ISO C standard.</para>
1593 <para>The <literal>Storable</literal> library, for
1594 primitive marshalling of data types between Haskell and
1595 the foreign language.</para>
1602 <para>The following sections also give some hints and tips on the use
1603 of the foreign function interface in GHC.</para>
1605 <Sect2 id="glasgow-foreign-headers">
1606 <Title>Using function headers
1610 <IndexTerm><Primary>C calls, function headers</Primary></IndexTerm>
When generating C (using the <Option>-fvia-C</Option> option), one can assist the
C compiler in detecting type errors by using the <Command>-#include</Command> option
to provide <Filename>.h</Filename> files containing function headers.
1628 void initialiseEFS (HsInt size);
1629 HsInt terminateEFS (void);
1630 HsForeignObj emptyEFS(void);
1631 HsForeignObj updateEFS (HsForeignObj a, HsInt i, HsInt x);
1632 HsInt lookupEFS (HsForeignObj a, HsInt i);
1636 <para>The types <literal>HsInt</literal>,
1637 <literal>HsForeignObj</literal> etc. are described in <xref
1638 linkend="sec-mapping-table">.</Para>
1640 <Para>Note that this approach is only
1641 <Emphasis>essential</Emphasis> for returning
1642 <Literal>float</Literal>s (or if <Literal>sizeof(int) !=
1643 sizeof(int *)</Literal> on your architecture) but is a Good
1644 Thing for anyone who cares about writing solid code. You're
1645 crazy not to do it.</Para>
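On the Haskell side, the header can also be named in the import declaration itself. The following sketch assumes the <Literal>foreign import ccall</Literal> syntax of the Foreign Function Interface specification and the <Literal>Foreign.C.Types</Literal> module of a current <Literal>base</Literal> library, neither of which is described in this section. It binds the C library's <Literal>sin</Literal> rather than the <Literal>EFS</Literal> routines above, purely so that the example links without extra C code; <Literal>sinViaC</Literal> is an illustrative wrapper.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble (..))

-- The string names the header that declares the function, then the
-- C function itself. The Haskell type must agree with the C prototype
--   double sin(double);
foreign import ccall unsafe "math.h sin" c_sin :: CDouble -> CDouble

-- Convert to and from the C representation at the boundary.
sinViaC :: Double -> Double
sinViaC = realToFrac . c_sin . realToFrac
```

A mismatch between the Haskell type and the C prototype is not necessarily caught by either compiler, which is the reason for keeping the prototypes in a header and checking them where possible.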
1651 <Sect1 id="multi-param-type-classes">
1652 <Title>Multi-parameter type classes
1656 This section documents GHC's implementation of multi-parameter type
1657 classes. There's lots of background in the paper <ULink
1658 URL="http://research.microsoft.com/~simonpj/multi.ps.gz" >Type
1659 classes: exploring the design space</ULink > (Simon Peyton Jones, Mark
1660 Jones, Erik Meijer).
I'd like to thank people who reported shortcomings in the GHC 3.02
implementation. Our default decisions were all conservative ones, and
the experience of these heroic pioneers has given useful concrete
examples to support several generalisations. (These appear below as
design choices not implemented in 3.02.)
1672 I've discussed these notes with Mark Jones, and I believe that Hugs
1673 will migrate towards the same design choices as I outline here.
1674 Thanks to him, and to many others who have offered very useful
1679 <Title>Types</Title>
1682 There are the following restrictions on the form of a qualified
forall tv1..tvn. (c1, ..., cn) => type
1695 (Here, I write the "foralls" explicitly, although the Haskell source
1696 language omits them; in Haskell 1.4, all the free type variables of an
1697 explicit source-language type signature are universally quantified,
1698 except for the class type variables in a class declaration. However,
1699 in GHC, you can give the foralls if you want. See <XRef LinkEnd="universal-quantification">).
1708 <Emphasis>Each universally quantified type variable
1709 <Literal>tvi</Literal> must be mentioned (i.e. appear free) in <Literal>type</Literal></Emphasis>.
1711 The reason for this is that a value with a type that does not obey
1712 this restriction could not be used without introducing
1713 ambiguity. Here, for example, is an illegal type:
1717 forall a. Eq a => Int
1721 When a value with this type was used, the constraint <Literal>Eq tv</Literal>
1722 would be introduced where <Literal>tv</Literal> is a fresh type variable, and
1723 (in the dictionary-translation implementation) the value would be
1724 applied to a dictionary for <Literal>Eq tv</Literal>. The difficulty is that we
1725 can never know which instance of <Literal>Eq</Literal> to use because we never
1726 get any more information about <Literal>tv</Literal>.
1733 <Emphasis>Every constraint <Literal>ci</Literal> must mention at least one of the
1734 universally quantified type variables <Literal>tvi</Literal></Emphasis>.
For example, this type is OK because <Literal>C a b</Literal> mentions the
universally quantified type variable <Literal>a</Literal>:
1741 forall a. C a b => burble
1745 The next type is illegal because the constraint <Literal>Eq b</Literal> does not
1746 mention <Literal>a</Literal>:
1750 forall a. Eq b => burble
1754 The reason for this restriction is milder than the other one. The
1755 excluded types are never useful or necessary (because the offending
1756 context doesn't need to be witnessed at this point; it can be floated
1757 out). Furthermore, floating them out increases sharing. Lastly,
1758 excluding them is a conservative choice; it leaves a patch of
1759 territory free in case we need it later.
1769 These restrictions apply to all types, whether declared in a type signature
1774 Unlike Haskell 1.4, constraints in types do <Emphasis>not</Emphasis> have to be of
1775 the form <Emphasis>(class type-variables)</Emphasis>. Thus, these type signatures
1782 f :: Eq (m a) => [m a] -> [m a]
1789 This choice recovers principal types, a property that Haskell 1.4 does not have.
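A complete function with a constraint of this more general form might look like the sketch below. It assumes a current GHC, where the <Literal>FlexibleContexts</Literal> extension covers such signatures; that flag name is not taken from this section, and <Literal>dedup</Literal> is an illustrative name.

```haskell
{-# LANGUAGE FlexibleContexts #-}
import Data.List (nub)

-- The constraint is on the applied type (m a), not on a bare type
-- variable. Any m and a for which (m a) supports equality will do.
dedup :: Eq (m a) => [m a] -> [m a]
dedup = nub
```

At the call <Literal>dedup [Just 1, Just 1, Nothing]</Literal> the constraint becomes <Literal>Eq (Maybe Integer)</Literal>, which is solved in the usual way.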
1795 <Title>Class declarations</Title>
1803 <Emphasis>Multi-parameter type classes are permitted</Emphasis>. For example:
1807 class Collection c a where
1808 union :: c a -> c a -> c a
1819 <Emphasis>The class hierarchy must be acyclic</Emphasis>. However, the definition
1820 of "acyclic" involves only the superclass relationships. For example,
1826 op :: D b => a -> b -> b
1829 class C a => D a where { ... }
1833 Here, <Literal>C</Literal> is a superclass of <Literal>D</Literal>, but it's OK for a
1834 class operation <Literal>op</Literal> of <Literal>C</Literal> to mention <Literal>D</Literal>. (It
1835 would not be OK for <Literal>D</Literal> to be a superclass of <Literal>C</Literal>.)
1842 <Emphasis>There are no restrictions on the context in a class declaration
1843 (which introduces superclasses), except that the class hierarchy must
1844 be acyclic</Emphasis>. So these class declarations are OK:
1848 class Functor (m k) => FiniteMap m k where
1851 class (Monad m, Monad (t m)) => Transform t m where
1852 lift :: m a -> (t m) a
1861 <Emphasis>In the signature of a class operation, every constraint
1862 must mention at least one type variable that is not a class type
1863 variable</Emphasis>.
1869 class Collection c a where
1870 mapC :: Collection c b => (a->b) -> c a -> c b
is OK because the constraint <Literal>(Collection c b)</Literal> mentions
<Literal>b</Literal>, even though it also mentions the class variable
<Literal>c</Literal>. On the other hand:
1881 op :: Eq a => (a,b) -> (a,b)
is not OK because the constraint <Literal>(Eq a)</Literal> mentions only the class
type variable <Literal>a</Literal>, but not <Literal>b</Literal>. However, any such
example is easily fixed by moving the offending context up to the
1892 class Eq a => C a where
1897 A yet more relaxed rule would allow the context of a class-op signature
1898 to mention only class type variables. However, that conflicts with
1899 Rule 1(b) for types above.
1906 <Emphasis>The type of each class operation must mention <Emphasis>all</Emphasis> of
1907 the class type variables</Emphasis>. For example:
1911 class Coll s a where
1913 insert :: s -> a -> s
1917 is not OK, because the type of <Literal>empty</Literal> doesn't mention
1918 <Literal>a</Literal>. This rule is a consequence of Rule 1(a), above, for
1919 types, and has the same motivation.
1921 Sometimes, offending class declarations exhibit misunderstandings. For
1922 example, <Literal>Coll</Literal> might be rewritten
1926 class Coll s a where
1928 insert :: s a -> a -> s a
1932 which makes the connection between the type of a collection of
1933 <Literal>a</Literal>'s (namely <Literal>(s a)</Literal>) and the element type <Literal>a</Literal>.
1934 Occasionally this really doesn't work, in which case you can split the
1942 class CollE s => Coll s a where
1943 insert :: s -> a -> s
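To see the rules of this section working together, here is a small multi-parameter class with one instance. It is a sketch for a current GHC, where the relevant extensions are named <Literal>MultiParamTypeClasses</Literal> and <Literal>FlexibleInstances</Literal> (an assumption, since this manual uses <Option>-fglasgow-exts</Option>). The operations <Literal>member</Literal> and <Literal>fromList</Literal> are additions made for the example.

```haskell
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses #-}

-- Every operation mentions both class variables, c and a, as the last
-- rule above requires.
class Collection c a where
  empty  :: c a
  insert :: a -> c a -> c a
  member :: a -> c a -> Bool

-- Lists as duplicate-free collections of any element type with equality.
instance Eq a => Collection [] a where
  empty = []
  insert x xs
    | x `elem` xs = xs
    | otherwise   = x : xs
  member = elem

-- Code written against the class works for any instance.
fromList :: Collection c a => [a] -> c a
fromList = foldr insert empty
```

A use site has to fix both parameters, for instance with an annotation such as <Literal>fromList [1,2,1,3] :: [Int]</Literal>.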
1956 <Sect2 id="instance-decls">
1957 <Title>Instance declarations</Title>
1965 <Emphasis>Instance declarations may not overlap</Emphasis>. The two instance
1970 instance context1 => C type1 where ...
1971 instance context2 => C type2 where ...
1975 "overlap" if <Literal>type1</Literal> and <Literal>type2</Literal> unify
1977 However, if you give the command line option
1978 <Option>-fallow-overlapping-instances</Option><IndexTerm><Primary>-fallow-overlapping-instances
1979 option</Primary></IndexTerm> then two overlapping instance declarations are permitted
1987 EITHER <Literal>type1</Literal> and <Literal>type2</Literal> do not unify
1993 OR <Literal>type2</Literal> is a substitution instance of <Literal>type1</Literal>
1994 (but not identical to <Literal>type1</Literal>)
2007 Notice that these rules
2014 make it clear which instance decl to use
2015 (pick the most specific one that matches)
2022 do not mention the contexts <Literal>context1</Literal>, <Literal>context2</Literal>
2023 Reason: you can pick which instance decl
2024 "matches" based on the type.
2031 Regrettably, GHC doesn't guarantee to detect overlapping instance
2032 declarations if they appear in different modules. GHC can "see" the
2033 instance declarations in the transitive closure of all the modules
2034 imported by the one being compiled, so it can "see" all instance decls
2035 when it is compiling <Literal>Main</Literal>. However, it currently chooses not
2036 to look at ones that can't possibly be of use in the module currently
2037 being compiled, in the interests of efficiency. (Perhaps we should
2038 change that decision, at least for <Literal>Main</Literal>.)
2045 <Emphasis>There are no restrictions on the type in an instance
2046 <Emphasis>head</Emphasis>, except that at least one must not be a type variable</Emphasis>.
2047 The instance "head" is the bit after the "=>" in an instance decl. For
2048 example, these are OK:
2052 instance C Int a where ...
2054 instance D (Int, Int) where ...
2056 instance E [[a]] where ...
2060 Note that instance heads <Emphasis>may</Emphasis> contain repeated type variables.
2061 For example, this is OK:
2065 instance Stateful (ST s) (MutVar s) where ...
2069 The "at least one not a type variable" restriction is to ensure that
2070 context reduction terminates: each reduction step removes one type
2071 constructor. For example, the following would make the type checker
2072 loop if it wasn't excluded:
2076 instance C a => C a where ...
2080 There are two situations in which the rule is a bit of a pain. First,
2081 if one allows overlapping instance declarations then it's quite
2082 convenient to have a "default instance" declaration that applies if
2083 something more specific does not:
2092 Second, sometimes you might want to use the following to get the
2093 effect of a "class synonym":
2097 class (C1 a, C2 a, C3 a) => C a where { }
2099 instance (C1 a, C2 a, C3 a) => C a where { }
2103 This allows you to write shorter signatures:
2115 f :: (C1 a, C2 a, C3 a) => ...
2119 I'm on the lookout for a simple rule that preserves decidability while
2120 allowing these idioms. The experimental flag
2121 <Option>-fallow-undecidable-instances</Option><IndexTerm><Primary>-fallow-undecidable-instances
2122 option</Primary></IndexTerm> lifts this restriction, allowing all the types in an
2123 instance head to be type variables.
2130 <Emphasis>Unlike Haskell 1.4, instance heads may use type
2131 synonyms</Emphasis>. As always, using a type synonym is just shorthand for
2132 writing the RHS of the type synonym definition. For example:
2136 type Point = (Int,Int)
2137 instance C Point where ...
2138 instance C [Point] where ...
2142 is legal. However, if you added
2146 instance C (Int,Int) where ...
2150 as well, then the compiler will complain about the overlapping
2151 (actually, identical) instance declarations. As always, type synonyms
2152 must be fully applied. You cannot, for example, write:
2157 instance Monad P where ...
2161 This design decision is independent of all the others, and easily
2162 reversed, but it makes sense to me.
2169 <Emphasis>The types in an instance-declaration <Emphasis>context</Emphasis> must all
2170 be type variables</Emphasis>. Thus
2174 instance C a b => Eq (a,b) where ...
2182 instance C Int b => Foo b where ...
2186 is not OK. Again, the intent here is to make sure that context
2187 reduction terminates.
Voluminous correspondence on the Haskell mailing list has convinced me
that it's worth experimenting with a more liberal rule. If you use
the flag <Option>-fallow-undecidable-instances</Option> you can use arbitrary
types in an instance context. Termination is ensured by having a
fixed-depth recursion stack. If you exceed the stack depth you get a
sort of backtrace, and the opportunity to increase the stack depth
with <Option>-fcontext-stack</Option><Emphasis>N</Emphasis>.
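The "pick the most specific instance that matches" rule for overlapping instances can be tried out with the sketch below. It assumes a current GHC, which marks the overlap per instance with an <Literal>OVERLAPPING</Literal> pragma instead of the <Option>-fallow-overlapping-instances</Option> flag described above; the class <Literal>Describe</Literal> is invented for the example.

```haskell
{-# LANGUAGE FlexibleInstances #-}

class Describe a where
  describe :: a -> String

-- The general instance: matches every list type.
instance Describe [a] where
  describe _ = "a list"

-- A substitution instance of the one above ([a] with a := Char), so it
-- is strictly more specific and wins whenever it matches.
instance {-# OVERLAPPING #-} Describe [Char] where
  describe _ = "a string"
```

Note that the choice is made from the types alone, without looking at any instance context, as the rules above state.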
2208 <Sect1 id="universal-quantification">
2209 <Title>Explicit universal quantification
2213 GHC now allows you to write explicitly quantified types. GHC's
2214 syntax for this now agrees with Hugs's, namely:
2220 forall a b. (Ord a, Eq b) => a -> b -> a
2226 The context is, of course, optional. You can't use <Literal>forall</Literal> as
2227 a type variable any more!
2231 Haskell type signatures are implicitly quantified. The <Literal>forall</Literal>
2232 allows us to say exactly what this means. For example:
2250 g :: forall b. (b -> b)
2256 The two are treated identically.
2260 <Title>Universally-quantified data type fields
2264 In a <Literal>data</Literal> or <Literal>newtype</Literal> declaration one can quantify
2265 the types of the constructor arguments. Here are several examples:
2271 data T a = T1 (forall b. b -> b -> b) a
2273 data MonadT m = MkMonad { return :: forall a. a -> m a,
2274 bind :: forall a b. m a -> (a -> m b) -> m b
2277 newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
The constructors now have so-called <Emphasis>rank 2</Emphasis> polymorphic
types, in which there is a for-all in the argument types:
2290 T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
2291 MkMonad :: forall m. (forall a. a -> m a)
2292 -> (forall a b. m a -> (a -> m b) -> m b)
2294 MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
2300 Notice that you don't need to use a <Literal>forall</Literal> if there's an
2301 explicit context. For example in the first argument of the
2302 constructor <Function>MkSwizzle</Function>, an implicit "<Literal>forall a.</Literal>" is
2303 prefixed to the argument type. The implicit <Literal>forall</Literal>
2304 quantifies all type variables that are not already in scope, and are
2305 mentioned in the type quantified over.
2309 As for type signatures, implicit quantification happens for non-overloaded
2310 types too. So if you write this:
2313 data T a = MkT (Either a b) (b -> b)
2316 it's just as if you had written this:
2319 data T a = MkT (forall b. Either a b) (forall b. b -> b)
2322 That is, since the type variable <Literal>b</Literal> isn't in scope, it's
2323 implicitly universally quantified. (Arguably, it would be better
2324 to <Emphasis>require</Emphasis> explicit quantification on constructor arguments
2325 where that is what is wanted. Feedback welcomed.)
2331 <Title>Construction </Title>
2334 You construct values of types <Literal>T1, MonadT, Swizzle</Literal> by applying
2335 the constructor to suitable values, just as usual. For example,
(T1 (\x y -> x) 3) :: T Int

(MkSwizzle sort) :: Swizzle
(MkSwizzle reverse) :: Swizzle
2351 MkMonad r b) :: MonadT Maybe
2357 The type of the argument can, as usual, be more general than the type
2358 required, as <Literal>(MkSwizzle reverse)</Literal> shows. (<Function>reverse</Function>
2359 does not need the <Literal>Ord</Literal> constraint.)
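A self-contained version of the <Literal>Swizzle</Literal> example is sketched below. It assumes a current GHC with the <Literal>RankNTypes</Literal> extension, and it writes the <Literal>forall</Literal> in the constructor field explicitly rather than relying on the implicit quantification described above, since that is the form we are confident current compilers accept. <Literal>applyBoth</Literal> is an illustrative name.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.List (sort)

-- The field holds a function that must work at every ordered type.
newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

-- Because s is polymorphic, it can be used at two different element
-- types in the same right-hand side.
applyBoth :: Swizzle -> ([Int], String) -> ([Int], String)
applyBoth (MkSwizzle s) (ns, cs) = (s ns, s cs)

sorter, reverser :: Swizzle
sorter   = MkSwizzle sort     -- uses the Ord constraint
reverser = MkSwizzle reverse  -- more general than required; still accepted
```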
2365 <Title>Pattern matching</Title>
2368 When you use pattern matching, the bound variables may now have
2369 polymorphic types. For example:
2375 f :: T a -> a -> (a, Char)
2376 f (T1 f k) x = (f k x, f 'c' 'd')
2378 g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
2379 g (MkSwizzle s) xs f = s (map f (s xs))
2381 h :: MonadT m -> [m a] -> m [a]
2382 h m [] = return m []
2383 h m (x:xs) = bind m x $ \y ->
2384              bind m (h m xs) $ \ys ->
2385              return m (y:ys)
2391 In the function <Function>h</Function> we use the record selectors <Literal>return</Literal>
2392 and <Literal>bind</Literal> to extract the polymorphic bind and return functions
2393 from the <Literal>MonadT</Literal> data structure, rather than using pattern matching.
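The construction and pattern-matching examples above can be put together in one small module. This is a sketch only, assuming a recent GHC where the <Literal>RankNTypes</Literal> extension stands in for <Option>-fglasgow-exts</Option> and the <Literal>forall</Literal> on a constructor field has to be written out. The primed field names <Literal>return'</Literal> and <Literal>bind'</Literal> and the values <Literal>swizzles</Literal> and <Literal>maybeMonad</Literal> are names made up for the illustration.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.List (sort)

-- A constructor whose single field is polymorphic and overloaded.
-- (Assumption: the forall is written explicitly here.)
newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

-- A record of polymorphic monad operations. The fields are primed
-- (hypothetical names) so they do not clash with the Prelude.
data MonadT m = MkMonad
  { return' :: forall a. a -> m a
  , bind'   :: forall a b. m a -> (a -> m b) -> m b
  }

-- Construction: the argument may be more general than required,
-- as with reverse, which needs no Ord constraint.
swizzles :: [Swizzle]
swizzles = [MkSwizzle sort, MkSwizzle reverse]

maybeMonad :: MonadT Maybe
maybeMonad = MkMonad Just (\m k -> maybe Nothing k m)

-- Pattern matching: s is bound polymorphically, so it can be used
-- at element type a and again at element type b.
g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
g (MkSwizzle s) xs f = s (map f (s xs))

-- Record selectors extract the polymorphic operations.
h :: MonadT m -> [m a] -> m [a]
h m []     = return' m []
h m (x:xs) = bind' m x $ \y ->
             bind' m (h m xs) $ \ys ->
             return' m (y:ys)
```

Loading this into GHCi, <Literal>h maybeMonad [Just 1, Just 2]</Literal> should behave like <Literal>sequence</Literal> specialised to <Literal>Maybe</Literal>.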
2398 You cannot pattern-match against an argument that is polymorphic.
2402 newtype TIM s a = TIM (ST s (Maybe a))
2404 runTIM :: (forall s. TIM s a) -> Maybe a
2405 runTIM (TIM m) = runST m
2411 Here the pattern-match fails, because you can't pattern-match against
2412 an argument of type <Literal>(forall s. TIM s a)</Literal>. Instead you
2413 must bind the variable and pattern match in the right hand side:
2416 runTIM :: (forall s. TIM s a) -> Maybe a
2417 runTIM tm = case tm of { TIM m -> runST m }
2420 The <Literal>tm</Literal> on the right hand side is (invisibly) instantiated, like
2421 any polymorphic value at its occurrence site, and now you can pattern-match against it.
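Here is a self-contained sketch of the bind-then-match idiom, assuming a recent GHC with <Literal>RankNTypes</Literal> and the <Literal>Control.Monad.ST</Literal> and <Literal>Data.STRef</Literal> modules from <Literal>base</Literal>. The value <Literal>counter</Literal> is a made-up example. Note that the sketch places the <Literal>case</Literal> inside the argument of <Function>runST</Function>, so that <Literal>s</Literal> is still universally quantified at the point where <Function>runST</Function> demands it; that placement is an assumption about newer compilers, not something the text above states.

```haskell
{-# LANGUAGE RankNTypes #-}
import Control.Monad.ST (ST, runST)
import Data.STRef (modifySTRef, newSTRef, readSTRef)

newtype TIM s a = TIM (ST s (Maybe a))

-- Bind the polymorphic argument to a variable, then match on it.
-- The occurrence of tm is instantiated where it is used.
runTIM :: (forall s. TIM s a) -> Maybe a
runTIM tm = runST (case tm of { TIM m -> m })

-- A hypothetical computation that works for every state thread s.
counter :: TIM s Int
counter = TIM $ do
  ref <- newSTRef (0 :: Int)
  modifySTRef ref (+ 1)
  Just <$> readSTRef ref
```

With these definitions, <Literal>runTIM counter</Literal> runs the state thread and returns its <Literal>Maybe</Literal> result.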
2428 <Title>The partial-application restriction</Title>
2431 There is really only one way in which data structures with polymorphic
2432 components might surprise you: you must not partially apply them.
2433 For example, this is illegal:
2439 map MkSwizzle [sort, reverse]
2445 The restriction is this: <Emphasis>every subexpression of the program must
2446 have a type that has no for-alls, except that in a function
2447 application (f e1…en) the partial applications are not subject to
2448 this rule</Emphasis>. The restriction makes type inference feasible.
2452 In the illegal example, the sub-expression <Literal>MkSwizzle</Literal> has the
2453 polymorphic type <Literal>(Ord b => [b] -> [b]) -> Swizzle</Literal> and is not
2454 a sub-expression of an enclosing application. On the other hand, this expression is OK:
2461 map (T1 (\a b -> a)) [1,2,3]
2467 even though it involves a partial application of <Function>T1</Function>, because
2468 the sub-expression <Literal>T1 (\a b -> a)</Literal> has type <Literal>Int -> T Int</Literal>.
2475 <Title>Type signatures
2479 Once you have data constructors with universally-quantified fields, or
2480 constants such as <Constant>runST</Constant> that have rank-2 types, it isn't long
2481 before you discover that you need more! Consider:
2487 mkTs f x y = [T1 f x, T1 f y]
2493 <Function>mkTs</Function> is a function that constructs some values of type
2494 <Literal>T</Literal>, using some pieces passed to it. The trouble is that since
2495 <Literal>f</Literal> is a function argument, Haskell assumes that it is
2496 monomorphic, so we'll get a type error when applying <Function>T1</Function> to
2497 it. This is a rather silly example, but the problem really bites in
2498 practice. Lots of people trip over the fact that you can't make
2499 "wrapper functions" for <Constant>runST</Constant> for exactly the same reason.
2500 In short, it is impossible to build abstractions around functions with rank-2 types.
2505 The solution is fairly clear. We provide the ability to give a rank-2
2506 type signature for <Emphasis>ordinary</Emphasis> functions (not only data
2507 constructors), thus:
2513 mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
2514 mkTs f x y = [T1 f x, T1 f y]
2520 This type signature tells the compiler to attribute <Literal>f</Literal> with
2521 the polymorphic type <Literal>(forall b. b -> b -> b)</Literal> when type
2522 checking the body of <Function>mkTs</Function>, so now the application of
2523 <Function>T1</Function> is fine.
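The following sketch shows the rank-2 signature at work, assuming a recent GHC with <Literal>RankNTypes</Literal>. The declaration of <Literal>T</Literal> is restated so the fragment stands alone, and <Function>useT</Function> is a helper invented for the illustration. The signature takes two values of type <Literal>a</Literal>, matching the two arguments <Literal>x</Literal> and <Literal>y</Literal> of the definition.

```haskell
{-# LANGUAGE RankNTypes #-}

data T a = T1 (forall b. b -> b -> b) a

-- The signature makes f polymorphic inside the body, so it can be
-- passed to T1, whose first field must be polymorphic.
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
mkTs f x y = [T1 f x, T1 f y]

-- Hypothetical consumer: uses the stored function at type a and
-- again at type Char.
useT :: T a -> (a, Char)
useT (T1 f k) = (f k k, f 'c' 'd')
```

For instance, <Literal>map useT (mkTs (\a _ -> a) 1 2)</Literal> applies the same stored function at two different types in each element.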
2527 There are two restrictions:
2536 You can only define a rank-2 type, specified by the following grammar:
2541 rank2type ::= [forall tyvars .] [context =>] funty
2542 funty ::= ([forall tyvars .] [context =>] ty) -> funty
2543        |  ty
2544 ty ::= ...current Haskell monotype syntax...
2548 Informally, the universal quantification must all be right at the beginning,
2549 or at the top level of a function argument.
2556 There is a restriction on the definition of a function whose
2557 type signature is a rank-2 type: the polymorphic arguments must be
2558 matched on the left hand side of the "<Literal>=</Literal>" sign. You can't
2559 define <Function>mkTs</Function> like this:
2563 mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
2564 mkTs = \ f x y -> [T1 f x, T1 f y]
2569 The same partial-application rule applies to ordinary functions with
2570 rank-2 types as applies to data constructors.
2583 <Title>Type synonyms and hoisting
2587 GHC also allows you to write a <Literal>forall</Literal> in a type synonym, thus:
2589 type Discard a = forall b. a -> b -> a
2594 However, it is often convenient to use this sort of synonym at the right hand
2595 end of an arrow, thus:
2597 type Discard a = forall b. a -> b -> a
2599 g :: Int -> Discard Int
2602 Simply expanding the type synonym would give
2604 g :: Int -> (forall b. Int -> b -> Int)
2606 but GHC "hoists" the <Literal>forall</Literal> to give the isomorphic type
2608 g :: forall b. Int -> Int -> b -> Int
2610 In general, the rule is this: <Emphasis>to determine the type specified by any explicit
2611 user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly
2612 performs the transformation:</Emphasis>
2614 <Emphasis>type1</Emphasis> -> forall a. <Emphasis>type2</Emphasis>
2615 ==>
2616 forall a. <Emphasis>type1</Emphasis> -> <Emphasis>type2</Emphasis>
2618 (In fact, GHC tries to retain as much synonym information as possible for use in
2619 error messages, but that is a usability issue.) This rule applies, of course, whether
2620 or not the <Literal>forall</Literal> comes from a synonym. For example, here is another
2621 valid way to write <Literal>g</Literal>'s type signature:
2623 g :: Int -> Int -> forall b. b -> Int
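A small sketch of the synonym in use, assuming a recent GHC with <Literal>RankNTypes</Literal>. The bodies of <Function>g</Function> and <Function>g'</Function> are made up for the example. Whether the compiler hoists the <Literal>forall</Literal> internally cannot be observed here; the point is only that both ways of writing the signature accept a fully applied definition and a fully applied call.

```haskell
{-# LANGUAGE RankNTypes #-}

type Discard a = forall b. a -> b -> a

-- The synonym appears at the right hand end of an arrow.
g :: Int -> Discard Int
g x y _ = x + y

-- The same function with the forall written directly in the
-- signature. It is defined with all three arguments on the left,
-- rather than as g' = g, so that the example does not depend on
-- how the compiler relates the two forall placements.
g' :: Int -> Int -> forall b. b -> Int
g' x y z = g x y z
```

Both <Literal>g 1 2 'x'</Literal> and <Literal>g' 1 2 ()</Literal> discard their third argument, whatever its type.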
2630 <Sect1 id="existential-quantification">
2631 <Title>Existentially quantified data constructors
2635 The idea of using existential quantification in data type declarations
2636 was suggested by Laufer (I believe, though doubtless someone will
2637 correct me), and implemented in Hope+. It's been in Lennart
2638 Augustsson's <Command>hbc</Command> Haskell compiler for several years, and
2639 proved very useful. Here's the idea. Consider the declaration:
2645 data Foo = forall a. MkFoo a (a -> Bool)
2646          | Nil
2652 The data type <Literal>Foo</Literal> has two constructors with types:
2658 MkFoo :: forall a. a -> (a -> Bool) -> Foo
2659 Nil   :: Foo
2665 Notice that the type variable <Literal>a</Literal> in the type of <Function>MkFoo</Function>
2666 does not appear in the data type itself, which is plain <Literal>Foo</Literal>.
2667 For example, the following expression is fine:
2673 [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
2679 Here, <Literal>(MkFoo 3 even)</Literal> packages an integer with a function
2680 <Function>even</Function> that maps an integer to <Literal>Bool</Literal>; and <Function>MkFoo 'c'
2681 isUpper</Function> packages a character with a compatible function. These
2682 two things are each of type <Literal>Foo</Literal> and can be put in a list.
2686 What can we do with a value of type <Literal>Foo</Literal>? In particular,
2687 what happens when we pattern-match on <Function>MkFoo</Function>?
2693 f (MkFoo val fn) = ???
2699 Since all we know about <Literal>val</Literal> and <Function>fn</Function> is that they
2700 are compatible, the only (useful) thing we can do with them is to
2701 apply <Function>fn</Function> to <Literal>val</Literal> to get a boolean. For example:
2708 f (MkFoo val fn) = fn val
2714 What this allows us to do is to package heterogeneous values
2715 together with a bunch of functions that manipulate them, and then treat
2716 that collection of packages in a uniform manner. You can express
2717 quite a bit of object-oriented-like programming this way.
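To make this concrete, here is a minimal sketch, assuming a recent GHC where the <Literal>ExistentialQuantification</Literal> extension stands in for <Option>-fglasgow-exts</Option>. Only the <Function>MkFoo</Function> constructor is used, and the list <Literal>foos</Literal> is a made-up example value.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.Char (isUpper)

-- The type variable a does not appear in the type Foo itself.
data Foo = forall a. MkFoo a (a -> Bool)

-- Values with different hidden types live in the same list.
foos :: [Foo]
foos = [MkFoo (3 :: Int) even, MkFoo 'C' isUpper]

-- All we know about val and fn is that they are compatible,
-- so the only useful thing to do is to apply one to the other.
f :: Foo -> Bool
f (MkFoo val fn) = fn val
```

Mapping <Function>f</Function> over <Literal>foos</Literal> treats the whole collection uniformly, even though the packaged values have different types.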
2720 <Sect2 id="existential">
2721 <Title>Why existential?
2725 What has this to do with <Emphasis>existential</Emphasis> quantification?
2726 Simply that <Function>MkFoo</Function> has the (nearly) isomorphic type
2732 MkFoo :: (exists a . (a, a -> Bool)) -> Foo
2738 But Haskell programmers can safely think of the ordinary
2739 <Emphasis>universally</Emphasis> quantified type given above, thereby avoiding
2740 adding a new existential quantification construct.
2746 <Title>Type classes</Title>
2749 An easy extension (implemented in <Command>hbc</Command>) is to allow
2750 arbitrary contexts before the constructor. For example:
2756 data Baz = forall a. Eq a => Baz1 a a
2757 | forall b. Show b => Baz2 b (b -> b)
2763 The two constructors have the types you'd expect:
2769 Baz1 :: forall a. Eq a => a -> a -> Baz
2770 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
2776 But when pattern matching on <Function>Baz1</Function> the matched values can be compared
2777 for equality, and when pattern matching on <Function>Baz2</Function> the first matched
2778 value can be converted to a string (as well as applying the function to it).
2779 So this program is legal:
2786 f (Baz1 p q) | p == q = "Yes"
2787              | otherwise = "No"
2788 f (Baz2 v fn) = show (fn v)
2794 Operationally, in a dictionary-passing implementation, the
2795 constructors <Function>Baz1</Function> and <Function>Baz2</Function> must store the
2796 dictionaries for <Literal>Eq</Literal> and <Literal>Show</Literal> respectively, and
2797 extract them on pattern matching.
2801 Notice the way that the syntax fits smoothly with that used for
2802 universal quantification earlier.
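The class-constrained constructors can be tried out as follows. This is a sketch assuming a recent GHC with <Literal>ExistentialQuantification</Literal>; the list <Literal>bazzes</Literal> is a made-up example value.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

data Baz = forall a. Eq a   => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

-- Matching on Baz1 brings the Eq dictionary into scope;
-- matching on Baz2 brings the Show dictionary into scope.
f :: Baz -> String
f (Baz1 p q)
  | p == q    = "Yes"
  | otherwise = "No"
f (Baz2 v fn) = show (fn v)

bazzes :: [Baz]
bazzes = [Baz1 'a' 'a', Baz1 (1 :: Int) 2, Baz2 (3 :: Int) (+ 1)]
```

Each element of <Literal>bazzes</Literal> carries its own dictionary, so <Function>f</Function> needs no class constraint in its signature.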
2808 <Title>Restrictions</Title>
2811 There are several restrictions on the ways in which existentially-quantified
2812 constructors can be used.
2821 When pattern matching, each pattern match introduces a new,
2822 distinct, type for each existential type variable. These types cannot
2823 be unified with any other type, nor can they escape from the scope of
2824 the pattern match. For example, these fragments are incorrect:
2828 f1 (MkFoo a f) = a
2832 Here, the type bound by <Function>MkFoo</Function> "escapes", because <Literal>a</Literal>
2833 is the result of <Function>f1</Function>. One way to see why this is wrong is to
2834 ask what type <Function>f1</Function> has:
2838 f1 :: Foo -> a -- Weird!
2842 What is this "<Literal>a</Literal>" in the result type? Clearly we don't mean this:
2847 f1 :: forall a. Foo -> a -- Wrong!
2851 The original program is just plain wrong. Here's another sort of error
2855 f2 (Baz1 a b) (Baz1 p q) = a==q
2859 It's ok to say <Literal>a==b</Literal> or <Literal>p==q</Literal>, but
2860 <Literal>a==q</Literal> is wrong because it equates the two distinct types arising
2861 from the two <Function>Baz1</Function> constructors.
2869 You can't pattern-match on an existentially quantified
2870 constructor in a <Literal>let</Literal> or <Literal>where</Literal> group of
2871 bindings. So this is illegal:
2875 f3 x = a==b where { Baz1 a b = x }
2879 You can only pattern-match
2880 on an existentially-quantified constructor in a <Literal>case</Literal> expression or
2881 in the patterns of a function definition.
2883 The reason for this restriction is really an implementation one.
2884 Type-checking binding groups is already a nightmare without
2885 existentials complicating the picture. Also an existential pattern
2886 binding at the top level of a module doesn't make sense, because it's
2887 not clear how to prevent the existentially-quantified type "escaping".
2888 So for now, there's a simple-to-state restriction. We'll see how annoying it is.
2896 You can't use existential quantification for <Literal>newtype</Literal>
2897 declarations. So this is illegal:
2901 newtype T = forall a. Ord a => MkT a
2905 Reason: a value of type <Literal>T</Literal> must be represented as a pair
2906 of a dictionary for <Literal>Ord t</Literal> and a value of type <Literal>t</Literal>.
2907 That contradicts the idea that <Literal>newtype</Literal> should have no
2908 concrete representation. You can get just the same efficiency and effect
2909 by using <Literal>data</Literal> instead of <Literal>newtype</Literal>. If there is no
2910 overloading involved, then there is more of a case for allowing
2911 an existentially-quantified <Literal>newtype</Literal>,
2912 because the <Literal>data</Literal> version does carry an implementation cost,
2913 but single-field existentially quantified constructors aren't much
2914 use. So the simple restriction (no existential stuff on <Literal>newtype</Literal>)
2915 stands, unless there are convincing reasons to change it.
2923 You can't use <Literal>deriving</Literal> to define instances of a
2924 data type with existentially quantified data constructors.
2926 Reason: in most cases it would not make sense. For example:
2929 data T = forall a. MkT [a] deriving( Eq )
2932 To derive <Literal>Eq</Literal> in the standard way we would need to have equality
2933 between the single component of two <Function>MkT</Function> constructors:
2937 (MkT a) == (MkT b) = ???
2940 But <VarName>a</VarName> and <VarName>b</VarName> have distinct types, and so can't be compared.
2941 It's just about possible to imagine examples in which the derived instance
2942 would make sense, but it seems altogether simpler simply to prohibit such
2943 declarations. Define your own instances!
2955 <Sect1 id="sec-assertions">
2957 <IndexTerm><Primary>Assertions</Primary></IndexTerm>
2961 If you want to make use of assertions in your standard Haskell code, you
2962 could define a function like the following:
2968 assert :: Bool -> a -> a
2969 assert False x = error "assertion failed!"
2970 assert _     x = x
2976 which works, but gives you back a less than useful error message --
2977 an assertion failed, but which and where?
2981 One way out is to define an extended <Function>assert</Function> function which also
2982 takes a descriptive string to include in the error message and
2983 perhaps combine this with the use of a pre-processor which inserts
2984 the source location where <Function>assert</Function> was used.
2988 GHC offers a helping hand here, doing all of this for you. For every
2989 use of <Function>assert</Function> in the user's source:
2995 kelvinToC :: Double -> Double
2996 kelvinToC k = assert (k >= 0.0) (k-273.15)
3002 GHC will rewrite this to also include the source location where the assertion was made:
3009 assert pred val ==> assertError "Main.hs|15" pred val
3015 The rewrite is only performed by the compiler when it spots
3016 applications of <Function>Exception.assert</Function>, so you can still define and
3017 use your own versions of <Function>assert</Function>, should you so wish. If not,
3018 import <Literal>Exception</Literal> to make use of <Function>assert</Function> in your code.
3022 To have the compiler ignore uses of assert, use the compiler option
3023 <Option>-fignore-asserts</Option>. <IndexTerm><Primary>-fignore-asserts option</Primary></IndexTerm> That is,
3024 expressions of the form <Literal>assert pred e</Literal> will be rewritten to <Literal>e</Literal>.
3028 Assertion failures can be caught; see the documentation for the
3029 <literal>Exception</literal> library (<xref linkend="sec-Exception">).
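As a rough illustration, the sketch below uses <Function>assert</Function> and catches its failure. It assumes a recent GHC, where the function lives in <Literal>Control.Exception</Literal> rather than the <Literal>Exception</Literal> module named above, and it assumes assertions are enabled (they are ignored under <Option>-fignore-asserts</Option>, which optimised builds may imply). <Function>safeKelvinToC</Function> is a helper invented for the example.

```haskell
import Control.Exception (AssertionFailed, assert, evaluate, try)

-- Converts a temperature in kelvin to degrees Celsius,
-- asserting that the input is physically meaningful.
kelvinToC :: Double -> Double
kelvinToC k = assert (k >= 0.0) (k - 273.15)

-- Hypothetical wrapper: Nothing if the assertion fires,
-- Just the converted value otherwise.
safeKelvinToC :: Double -> IO (Maybe Double)
safeKelvinToC k = do
  r <- try (evaluate (kelvinToC k)) :: IO (Either AssertionFailed Double)
  pure (either (const Nothing) Just r)
```

When assertions are switched off, <Function>safeKelvinToC</Function> never sees a failure and simply returns the converted value, even for a negative input.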
3035 <Sect1 id="scoped-type-variables">
3036 <Title>Scoped Type Variables
3040 A <Emphasis>pattern type signature</Emphasis> can introduce a <Emphasis>scoped type
3041 variable</Emphasis>. For example
3047 f (xs::[a]) = ys ++ ys
3048             where
3049               ys :: [a]
3050               ys = reverse xs
3056 The pattern <Literal>(xs::[a])</Literal> includes a type signature for <VarName>xs</VarName>.
3057 This brings the type variable <Literal>a</Literal> into scope; it scopes over
3058 all the patterns and right hand sides for this equation for <Function>f</Function>.
3059 In particular, it is in scope at the type signature for <VarName>ys</VarName>.
3063 At ordinary type signatures, such as that for <VarName>ys</VarName>, any type variables
3064 mentioned in the type signature <Emphasis>that are not in scope</Emphasis> are
3065 implicitly universally quantified. (If there are no type variables in
3066 scope, all type variables mentioned in the signature are universally
3067 quantified, which is just as in Haskell 98.) In this case, since <VarName>a</VarName>
3068 is in scope, it is not universally quantified, so the type of <VarName>ys</VarName> is
3069 the same as that of <VarName>xs</VarName>. In Haskell 98 it is not possible to declare
3070 a type for <VarName>ys</VarName>; a major benefit of scoped type variables is that
3071 it becomes possible to do so.
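A minimal sketch of the idea, assuming a recent GHC where the <Literal>ScopedTypeVariables</Literal> extension stands in for <Option>-fglasgow-exts</Option>. A separate signature for <Function>f</Function> is added here (an assumption, not part of the example above); because it is written without an explicit <Literal>forall</Literal>, its variable <Literal>b</Literal> does not scope over the body, and it is the pattern signature that brings <Literal>a</Literal> into scope.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

f :: [b] -> [b]
f (xs :: [a]) = ys ++ ys
  where
    -- a is in scope here, so this signature says that ys has
    -- exactly the same type as xs; it is not generalised.
    ys :: [a]
    ys = reverse xs
```

Without the pattern signature, the annotation <Literal>ys :: [a]</Literal> would be read as <Literal>forall a. [a]</Literal> and rejected.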
3075 Scoped type variables are implemented in both GHC and Hugs. Where the
3076 implementations differ from the specification below, those differences are noted.
3081 So much for the basic idea. Here are the details.
3085 <Title>Scope and implicit quantification</Title>
3093 All the type variables mentioned in the patterns for a single
3094 function definition equation, that are not already in scope,
3095 are brought into scope by the patterns. We describe this set as
3096 the <Emphasis>type variables bound by the equation</Emphasis>.
3103 The type variables thus brought into scope may be mentioned
3104 in ordinary type signatures or pattern type signatures anywhere within their scope.
3112 In ordinary type signatures, any type variable mentioned in the
3113 signature that is in scope is <Emphasis>not</Emphasis> universally quantified.
3120 Ordinary type signatures do not bring any new type variables
3121 into scope (except in the type signature itself!). So this is illegal:
3125 f :: a -> a
3126 f x = x::a
3130 It's illegal because <VarName>a</VarName> is not in scope in the body of <Function>f</Function>,
3131 so the ordinary signature <Literal>x::a</Literal> is equivalent to <Literal>x::forall a.a</Literal>;
3132 and that is an incorrect typing.
3139 There is no implicit universal quantification on pattern type
3140 signatures, nor may one write an explicit <Literal>forall</Literal> type in a pattern
3141 type signature. The pattern type signature is a monotype.
3149 The type variables in the head of a <Literal>class</Literal> or <Literal>instance</Literal> declaration
3150 scope over the methods defined in the <Literal>where</Literal> part. For example:
3164 (Not implemented in Hugs yet, Dec 98).
3175 <Title>Polymorphism</Title>
3183 Pattern type signatures are completely orthogonal to ordinary, separate
3184 type signatures. The two can be used independently or together. There is
3185 no scoping associated with the names of the type variables in a separate type signature.
3190 f (xs::[b]) = reverse xs
3199 The function must be polymorphic in the type variables
3200 bound by all its equations. Operationally, the type variables bound
3201 by one equation must not:
3208 Be unified with a type (such as <Literal>Int</Literal>, or <Literal>[a]</Literal>).
3214 Be unified with a type variable free in the environment.
3220 Be unified with each other. (They may unify with the type variables
3221 bound by another equation for the same function, of course.)
3228 For example, the following all fail to type check:
3232 f (x::a) (y::b) = [x,y] -- a unifies with b
3234 g (x::a) = x + 1::Int -- a unifies with Int
3236 h x = let k (y::a) = [x,y] -- a is free in the
3237 in k x -- environment
3239 k (x::a) True = ... -- a unifies with Int
3240 k (x::Int) False = ...
3242 w :: [b] -> [b]
3243 w (x::a) = x -- a unifies with [b]
3252 The pattern-bound type variable may, however, be constrained
3253 by the context of the principal type, thus:
3257 f (x::a) (y::a) = x+y*2
3261 gets the inferred type: <Literal>forall a. Num a => a -> a -> a</Literal>.
3272 <Title>Result type signatures</Title>
3280 The result type of a function can be given a signature, thus:
3285 f (x::a) :: [a] = [x,x,x]
3289 The final <Literal>:: [a]</Literal> after all the patterns gives a signature to the
3290 result type. Sometimes this is the only way of naming the type variable you want:
3295 f :: Int -> [a] -> [a]
3296 f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
3297 in \xs -> map g (reverse xs `zip` xs)
3309 Result type signatures are not yet implemented in Hugs.
3315 <Title>Pattern signatures on other constructs</Title>
3323 A pattern type signature can be on an arbitrary sub-pattern, not just on a variable:
3328 f ((x,y)::(a,b)) = (y,x) :: (b,a)
3337 Pattern type signatures, including the result part, can be used
3338 in lambda abstractions:
3342 (\ (x::a, y) :: a -> x)
3346 Type variables bound by these patterns must be polymorphic in
3347 the sense defined above.
3352 f1 (x::c) = f1 x -- ok
3353 f2 = \(x::c) -> f2 x -- not ok
3357 Here, <Function>f1</Function> is OK, but <Function>f2</Function> is not, because <VarName>c</VarName> gets unified
3358 with a type variable free in the environment, in this
3359 case, the type of <Function>f2</Function>, which is in the environment when
3360 the lambda abstraction is checked.
3367 Pattern type signatures, including the result part, can be used
3368 in <Literal>case</Literal> expressions:
3372 case e of { (x::a, y) :: a -> x }
3376 The pattern-bound type variables must, as usual,
3377 be polymorphic in the following sense: each case alternative,
3378 considered as a lambda abstraction, must be polymorphic.
3383 case (True,False) of { (x::a, y) -> x }
3387 Even though the context is that of a pair of booleans,
3388 the alternative itself is polymorphic. Of course, it is also OK to say:
3393 case (True,False) of { (x::Bool, y) -> x }
3402 To avoid ambiguity, the type after the “<Literal>::</Literal>” in a result
3403 pattern signature on a lambda or <Literal>case</Literal> must be atomic (i.e. a single
3404 token or a parenthesised type of some sort). To see why,
3405 consider how one would parse this:
3409 \ x :: a -> b -> x
3418 Pattern type signatures that bind new type variables
3419 may not be used in pattern bindings at all.
3424 f x = let (y, z::a) = x in ...
3428 But these are OK, because they do not bind fresh type variables:
3432 f1 x = let (y, z::Int) = x in ...
3433 f2 (x::(Int,a)) = let (y, z::a) = x in ...
3437 However a single variable is considered a degenerate function binding,
3438 rather than a degenerate pattern binding, so this is permitted, even
3439 though it binds a type variable:
3443 f :: (b->b) = \(x::b) -> x
3452 Such degenerate function bindings do not fall under the monomorphism restriction. Thus:
3459 g :: a -> a -> Bool = \x y -> x==y
3465 Here <Function>g</Function> has type <Literal>forall a. Eq a => a -> a -> Bool</Literal>, just as if
3466 <Function>g</Function> had a separate type signature. Lacking a type signature, <Function>g</Function>
3467 would get a monomorphic type.
3473 <Title>Existentials</Title>
3481 Pattern type signatures can bind existential type variables.
3486 data T = forall a. MkT [a]
3489 f (MkT [t::a]) = MkT t3
3490                where
3491                  t3::[a] = [t,t,t]
3506 <Sect1 id="pragmas">
3511 GHC supports several pragmas, or instructions to the compiler placed
3512 in the source code. Pragmas don't affect the meaning of the program,
3513 but they might affect the efficiency of the generated code.
3516 <Sect2 id="inline-pragma">
3517 <Title>INLINE pragma
3519 <IndexTerm><Primary>INLINE pragma</Primary></IndexTerm>
3520 <IndexTerm><Primary>pragma, INLINE</Primary></IndexTerm></Title>
3523 GHC (with <Option>-O</Option>, as always) tries to inline (or “unfold”)
3524 functions/values that are “small enough,” thus avoiding the call
3525 overhead and possibly exposing other more-wonderful optimisations.
3529 You will probably see these unfoldings (in Core syntax) in your interface files.
3534 Normally, if GHC decides a function is “too expensive” to inline, it
3535 will not do so, nor will it export that unfolding for other modules to use.
3540 The sledgehammer you can bring to bear is the
3541 <Literal>INLINE</Literal><IndexTerm><Primary>INLINE pragma</Primary></IndexTerm> pragma, used thusly:
3544 key_function :: Int -> String -> (Bool, Double)
3546 #ifdef __GLASGOW_HASKELL__
3547 {-# INLINE key_function #-}
3548 #endif
3551 (You don't need to do the C pre-processor carry-on unless you're going
3552 to stick the code through HBC—it doesn't like <Literal>INLINE</Literal> pragmas.)
3556 The major effect of an <Literal>INLINE</Literal> pragma is to declare a function's
3557 “cost” to be very low. The normal unfolding machinery will then be
3558 very keen to inline it.
3562 An <Literal>INLINE</Literal> pragma for a function can be put anywhere its type
3563 signature could be put.
3567 <Literal>INLINE</Literal> pragmas are a particularly good idea for the
3568 <Literal>then</Literal>/<Literal>return</Literal> (or <Literal>bind</Literal>/<Literal>unit</Literal>) functions in a monad.
3569 For example, in GHC's own <Literal>UniqueSupply</Literal> monad code, we have:
3572 #ifdef __GLASGOW_HASKELL__
3573 {-# INLINE thenUs #-}
3574 {-# INLINE returnUs #-}
3575 #endif
3582 <Sect2 id="noinline-pragma">
3583 <Title>NOINLINE pragma
3587 <IndexTerm><Primary>NOINLINE pragma</Primary></IndexTerm>
3588 <IndexTerm><Primary>pragma, NOINLINE</Primary></IndexTerm>
3592 The <Literal>NOINLINE</Literal> pragma does exactly what you'd expect: it stops the
3593 named function from being inlined by the compiler. You shouldn't ever
3594 need to do this, unless you're very cautious about code size.
3599 <Sect2 id="specialize-pragma">
3600 <Title>SPECIALIZE pragma
3604 <IndexTerm><Primary>SPECIALIZE pragma</Primary></IndexTerm>
3605 <IndexTerm><Primary>pragma, SPECIALIZE</Primary></IndexTerm>
3606 <IndexTerm><Primary>overloading, death to</Primary></IndexTerm>
3610 (UK spelling also accepted.) For key overloaded functions, you can
3611 create extra versions (NB: more code space) specialised to particular
3612 types. Thus, if you have an overloaded function:
3618 hammeredLookup :: Ord key => [(key, value)] -> key -> value
3624 If it is heavily used on lists with <Literal>Widget</Literal> keys, you could
3625 specialise it as follows:
3628 {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
3634 To get very fancy, you can also specify a named function to use for
3635 the specialised value, by adding <Literal>= blah</Literal>, as in:
3638 {-# SPECIALIZE hammeredLookup :: ...as before... = blah #-}
3641 It's <Emphasis>Your Responsibility</Emphasis> to make sure that <Function>blah</Function> really
3642 behaves as a specialised version of <Function>hammeredLookup</Function>!!!
3646 NOTE: the <Literal>=blah</Literal> feature isn't implemented in GHC 4.xx.
3650 An example in which the <Literal>= blah</Literal> form will Win Big:
3653 toDouble :: Real a => a -> Double
3654 toDouble = fromRational . toRational
3656 {-# SPECIALIZE toDouble :: Int -> Double = i2d #-}
3657 i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
3660 The <Function>i2d</Function> function is virtually one machine instruction; the
3661 default conversion—via an intermediate <Literal>Rational</Literal>—is obscenely
3662 expensive by comparison.
3666 By using the US spelling, your <Literal>SPECIALIZE</Literal> pragma will work with
3667 HBC, too. Note that HBC doesn't support the <Literal>= blah</Literal> form.
3671 A <Literal>SPECIALIZE</Literal> pragma for a function can be put anywhere its type
3672 signature could be put.
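The sketch below shows the pragma placement in a complete module. The <Literal>Widget</Literal> type, the body of <Function>hammeredLookup</Function>, and the <Function>widgetName</Function> table are all invented for the illustration; only the two signatures come from the text above. The pragmas change how the code is compiled, not what it computes, so the example behaves the same with or without <Option>-O</Option>.

```haskell
-- Hypothetical key type for the example.
newtype Widget = Widget Int deriving (Eq, Ord, Show)

-- An overloaded association-list lookup (body is an assumption).
hammeredLookup :: Ord key => [(key, value)] -> key -> value
hammeredLookup [] _ = error "hammeredLookup: key not found"
hammeredLookup ((k, v) : rest) key
  | k == key  = v
  | otherwise = hammeredLookup rest key
-- Ask for an extra copy specialised to Widget keys. The pragma
-- sits where a type signature for the function could go.
{-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}

-- A small wrapper that we would like unfolded at its call sites.
widgetName :: Widget -> String
widgetName = hammeredLookup [(Widget 1, "sprocket"), (Widget 2, "gear")]
{-# INLINE widgetName #-}
```

Calls such as <Literal>widgetName (Widget 2)</Literal> can then use the specialised copy, which avoids passing the <Literal>Ord</Literal> dictionary at run time.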
3677 <Sect2 id="specialize-instance-pragma">
3678 <Title>SPECIALIZE instance pragma
3682 <IndexTerm><Primary>SPECIALIZE pragma</Primary></IndexTerm>
3683 <IndexTerm><Primary>overloading, death to</Primary></IndexTerm>
3684 Same idea, except for instance declarations. For example:
3687 instance (Eq a) => Eq (Foo a) where { ... usual stuff ... }
3689 {-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
3692 Compatible with HBC, by the way.
3697 <Sect2 id="line-pragma">
3702 <IndexTerm><Primary>LINE pragma</Primary></IndexTerm>
3703 <IndexTerm><Primary>pragma, LINE</Primary></IndexTerm>
3707 This pragma is similar to C's <Literal>#line</Literal> pragma, and is mainly for use in
3708 automatically generated Haskell code. It lets you specify the line
3709 number and filename of the original code; for example
3715 {-# LINE 42 "Foo.vhs" #-}
3721 if you'd generated the current file from something called <Filename>Foo.vhs</Filename>
3722 and this line corresponds to line 42 in the original. GHC will adjust
3723 its error messages to refer to the line/file named in the <Literal>LINE</Literal> pragma.
3730 <Title>RULES pragma</Title>
3733 The RULES pragma lets you specify rewrite rules. It is described in
3734 <XRef LinkEnd="rewrite-rules">.
3741 <Sect1 id="rewrite-rules">
3742 <Title>Rewrite rules
3744 <IndexTerm><Primary>RULES pragma</Primary></IndexTerm>
3745 <IndexTerm><Primary>pragma, RULES</Primary></IndexTerm>
3746 <IndexTerm><Primary>rewrite rules</Primary></IndexTerm></Title>
3749 The programmer can specify rewrite rules as part of the source program
3750 (in a pragma). GHC applies these rewrite rules wherever it can.
3758 "map/map" forall f g xs. map f (map g xs) = map (f.g) xs
3765 <Title>Syntax</Title>
3768 From a syntactic point of view:
3774 Each rule has a name, enclosed in double quotes. The name itself has
3775 no significance at all. It is only used when reporting how many times the rule fired.
3781 There may be zero or more rules in a <Literal>RULES</Literal> pragma.
3787 Layout applies in a <Literal>RULES</Literal> pragma. Currently no new indentation level
3788 is set, so you must lay out your rules starting in the same column as the
3789 enclosing definitions.
3795 Each variable mentioned in a rule must either be in scope (e.g. <Function>map</Function>),
3796 or bound by the <Literal>forall</Literal> (e.g. <Function>f</Function>, <Function>g</Function>, <Function>xs</Function>). The variables bound by
3797 the <Literal>forall</Literal> are called the <Emphasis>pattern</Emphasis> variables. They are separated
3798 by spaces, just like in a type <Literal>forall</Literal>.
3804 A pattern variable may optionally have a type signature.
3805 If the type of the pattern variable is polymorphic, it <Emphasis>must</Emphasis> have a type signature.
3806 For example, here is the <Literal>foldr/build</Literal> rule:
3809 "fold/build" forall k z (g::forall b. (a->b->b) -> b -> b) .
3810 foldr k z (build g) = g k z
3813 Since <Function>g</Function> has a polymorphic type, it must have a type signature.
3820 The left hand side of a rule must consist of a top-level variable applied
3821 to arbitrary expressions. For example, this is <Emphasis>not</Emphasis> OK:
3824 "wrong1" forall e1 e2. case True of { True -> e1; False -> e2 } = e1
3825 "wrong2" forall f. f True = True
3828 In <Literal>"wrong1"</Literal>, the LHS is not an application; in <Literal>"wrong2"</Literal>, the LHS has a pattern variable in the head.
3835 A rule does not need to be in the same module as (any of) the
3836 variables it mentions, though of course they need to be in scope.
3842 Rules are automatically exported from a module, just as instance declarations are.
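The syntactic points above can be seen together in a small module. This is a sketch under stated assumptions: <Function>mapList</Function> and <Function>twice</Function> are names invented for the example (a private copy of <Function>map</Function> keeps the rule away from the Prelude), and the <Literal>NOINLINE</Literal> pragma is there only so that the function is still visible by name when the rule is matched. Whether the rule actually fires depends on the optimisation level; the result of <Function>twice</Function> is the same either way.

```haskell
-- A private map, so the rule below mentions a top-level
-- function defined in this module.
mapList :: (a -> b) -> [a] -> [b]
mapList _ []     = []
mapList f (x:xs) = f x : mapList f xs
{-# NOINLINE mapList #-}

-- The rule has a name in double quotes, pattern variables bound
-- by forall and separated by spaces, and a left hand side that is
-- a top-level function applied to arguments.
{-# RULES
"mapList/mapList" forall f g xs. mapList f (mapList g xs) = mapList (f . g) xs
  #-}

-- A call that is a substitution instance of the rule's LHS.
twice :: [Int] -> [Int]
twice xs = mapList (+ 1) (mapList (* 2) xs)
```

Because GHC does not check that the two sides of a rule mean the same thing, it is worth testing a function like <Function>twice</Function> both with and without <Option>-O</Option>.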
3853 <Title>Semantics</Title>
3856 From a semantic point of view:
3862 Rules are only applied if you use the <Option>-O</Option> flag.
3868 Rules are regarded as left-to-right rewrite rules.
3869 When GHC finds an expression that is a substitution instance of the LHS
3870 of a rule, it replaces the expression by the (appropriately-substituted) RHS.
3871 By "a substitution instance" we mean that the LHS can be made equal to the
3872 expression by substituting for the pattern variables.
3879 The LHS and RHS of a rule are typechecked, and must have the same type.
3887 GHC makes absolutely no attempt to verify that the LHS and RHS
3888 of a rule have the same meaning. That is undecidable in general, and
3889 infeasible in most interesting cases. The responsibility is entirely the programmer's!
3896 GHC makes no attempt to make sure that the rules are confluent or
3897 terminating. For example:
3900 "loop" forall x y. f x y = f y x
3903 This rule will cause the compiler to go into an infinite loop.
3910 If more than one rule matches a call, GHC will choose one arbitrarily to apply.
3916 GHC currently uses a very simple, syntactic, matching algorithm
3917 for matching a rule LHS with an expression. It seeks a substitution
3918 which makes the LHS and expression syntactically equal modulo alpha
3919 conversion. The pattern (rule), but not the expression, is eta-expanded if
3920 necessary. (Eta-expanding the expression can lead to laziness bugs.)
3921 It does not perform beta conversion (that would be higher-order matching).
3925 Matching is carried out on GHC's intermediate language, which includes
3926 type abstractions and applications. So a rule only matches if the
3927 types match too. See <XRef LinkEnd="rule-spec"> below.
3933 GHC keeps trying to apply the rules as it optimises the program.
3934 For example, consider:
3943 The expression <Literal>s (t xs)</Literal> does not match the rule <Literal>"map/map"</Literal>, but GHC
3944 will substitute for <VarName>s</VarName> and <VarName>t</VarName>, giving an expression which does match.
3945 If <VarName>s</VarName> or <VarName>t</VarName> was (a) used more than once, and (b) large or a redex, then it would
3946 not be substituted, and the rule would not fire.
3953 In the earlier phases of compilation, GHC inlines <Emphasis>nothing
3954 that appears on the LHS of a rule</Emphasis>, because once you have substituted
3955 for something you can't match against it (given the simple-minded
3956 matching). So if you write the rule
3959 "map/map" forall f,g. map f . map g = map (f.g)
3962 this <Emphasis>won't</Emphasis> match the expression <Literal>map f (map g xs)</Literal>.
3963 It will only match something written with explicit use of ".".
3964 Well, not quite. It <Emphasis>will</Emphasis> match the expression
3970 where <Function>wibble</Function> is defined:
3973 wibble f g = map f . map g
3976 because <Function>wibble</Function> will be inlined (it's small).
3978 Later on in compilation, GHC starts inlining even things on the
3979 LHS of rules, but still leaves the rules enabled. This inlining
3980 policy is controlled by the per-simplification-pass flag <Option>-finline-phase</Option><Emphasis>n</Emphasis>.
3987 All rules are implicitly exported from the module, and are therefore
3988 in force in any module that imports the module that defined the rule, directly
3989 or indirectly. (That is, if A imports B, which imports C, then C's rules are
3990 in force when compiling A.) The situation is very similar to that for instance declarations.
4002 <Title>List fusion</Title>
4005 The RULES mechanism is used to implement fusion (deforestation) of common list functions.
4006 If a "good consumer" consumes an intermediate list constructed by a "good producer", the
4007 intermediate list should be eliminated entirely.
4011 The following are good producers:
4023 Enumerations of <Literal>Int</Literal> and <Literal>Char</Literal> (e.g. <Literal>['a'..'z']</Literal>).
4029 Explicit lists (e.g. <Literal>[True, False]</Literal>)
4035 The cons constructor (e.g. <Literal>3:4:[]</Literal>)
4041 <Function>++</Function>
4047 <Function>map</Function>
4053 <Function>filter</Function>
4059 <Function>iterate</Function>, <Function>repeat</Function>
4065 <Function>zip</Function>, <Function>zipWith</Function>
4074 The following are good consumers:
4086 <Function>array</Function> (on its second argument)
4092 <Function>length</Function>
4098 <Function>++</Function> (on its first argument)
4104 <Function>map</Function>
4110 <Function>filter</Function>
4116 <Function>concat</Function>
4122 <Function>unzip</Function>, <Function>unzip2</Function>, <Function>unzip3</Function>, <Function>unzip4</Function>
4128 <Function>zip</Function>, <Function>zipWith</Function> (but on one argument only; if both are good producers, <Function>zip</Function>
4129 will fuse with one but not the other)
4135 <Function>partition</Function>
4141 <Function>head</Function>
4147 <Function>and</Function>, <Function>or</Function>, <Function>any</Function>, <Function>all</Function>
4153 <Function>sequence_</Function>
4159 <Function>msum</Function>
4165 <Function>sortBy</Function>
4174 So, for example, the following should generate no intermediate lists:
4177 array (1,10) [(i,i*i) | i <- map (+ 1) [0..9]]
4183 This list could readily be extended; if there are Prelude functions that you use
4184 a lot which are not included, please tell us.
4188 If you want to write your own good consumers or producers, look at the
4189 Prelude definitions of the above functions to see how to do so.
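The general shape of such definitions can be sketched as follows. This is a minimal, self-contained illustration and not the Prelude's actual code: <Function>buildL</Function> and <Function>foldrL</Function> are local stand-ins for <Function>build</Function> and <Function>foldr</Function>, <Function>upTo</Function> and <Function>sumL</Function> are hypothetical names, the rule is restricted to <Literal>Int</Literal> elements, and the <Literal>NOINLINE</Literal> pragmas are an assumption made so that the rule can see both functions.

```haskell
{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}
module Main where

-- Local stand-in for build: the argument constructs a list using only
-- the abstract cons and nil it is given.
buildL :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
buildL g = g (:) []
{-# NOINLINE buildL #-}

-- Local stand-in for foldr on lists.
foldrL :: (a -> b -> b) -> b -> [a] -> b
foldrL k z = go
  where
    go []     = z
    go (x:xs) = k x (go xs)
{-# NOINLINE foldrL #-}

-- A hypothetical good producer: written in terms of buildL.
upTo :: Int -> Int -> [Int]
upTo lo hi = buildL (\cons nil ->
  let go i | i > hi    = nil
           | otherwise = i `cons` go (i + 1)
  in go lo)
{-# INLINE upTo #-}

-- A hypothetical good consumer: written in terms of foldrL.
sumL :: [Int] -> Int
sumL = foldrL (+) 0
{-# INLINE sumL #-}

-- When a consumer meets a producer, the intermediate list disappears.
{-# RULES
"foldrL/buildL" forall k z (g :: forall b. (Int -> b -> b) -> b -> b).
    foldrL k z (buildL g) = g k z
  #-}

main :: IO ()
main = print (sumL (upTo 1 10))
```

The design point is that the producer never mentions <Literal>(:)</Literal> or <Literal>[]</Literal> directly, so the rule can hand it the consumer's own combining function and starting value instead.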
4194 <Sect2 id="rule-spec">
4195 <Title>Specialisation
4199 Rewrite rules can be used to get the same effect as a feature
4200 present in earlier versions of GHC:
4203 {-# SPECIALIZE fromIntegral :: Int8 -> Int16 = int8ToInt16 #-}
4206 This told GHC to use <Function>int8ToInt16</Function> instead of <Function>fromIntegral</Function> whenever
4207 the latter was called with type <Literal>Int8 -> Int16</Literal>. That is, rather than
4208 specialising the original definition of <Function>fromIntegral</Function> the programmer is
4209 promising that it is safe to use <Function>int8ToInt16</Function> instead.
4213 This feature is no longer in GHC. But rewrite rules let you do the same thing:
4218 "fromIntegral/Int8/Int16" fromIntegral = int8ToInt16
4222 This slightly odd-looking rule instructs GHC to replace <Function>fromIntegral</Function>
4223 by <Function>int8ToInt16</Function> <Emphasis>whenever the types match</Emphasis>. Speaking more operationally,
4224 GHC adds the type and dictionary applications to get the typed rule
4227 forall (d1::Integral Int8) (d2::Num Int16) .
4228 fromIntegral Int8 Int16 d1 d2 = int8ToInt16
4232 this rule does not need to be in the same file as fromIntegral,
4233 unlike the <Literal>SPECIALISE</Literal> pragmas which currently do (so that they
4234 have an original definition available to specialise).
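A complete module using this technique might look like the sketch below. To stay self-contained it does not attach a rule to the real <Function>fromIntegral</Function>; instead <Function>convert</Function> is a hypothetical overloaded function playing that role, and <Function>int8ToInt16</Function> is written by hand here rather than taken from a library. The <Literal>NOINLINE</Literal> pragma is an assumption made so that the rule can match calls to <Function>convert</Function>.

```haskell
module Main where

import Data.Int (Int8, Int16)

-- convert is a hypothetical overloaded function standing in for
-- fromIntegral.
convert :: (Integral a, Num b) => a -> b
convert x = fromInteger (toInteger x)
{-# NOINLINE convert #-}

-- A hand-written monomorphic replacement. It deliberately does not call
-- convert, so the rule below cannot rewrite its body into a loop.
-- The programmer promises it agrees with convert at this type; GHC does
-- not check that.
int8ToInt16 :: Int8 -> Int16
int8ToInt16 x = fromInteger (toInteger x)

-- The rule's type is fixed by its RHS: it applies only where convert is
-- used at type Int8 -> Int16.
{-# RULES
"convert/Int8/Int16" convert = int8ToInt16
  #-}

main :: IO ()
main = print (convert (-5 :: Int8) :: Int16)
```

Uses of <Function>convert</Function> at any other type are left alone, because the type and dictionary arguments in the typed form of the rule do not match.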
4240 <Title>Controlling what's going on</Title>
4248 Use <Option>-ddump-rules</Option> to see what transformation rules GHC is using.
4254 Use <Option>-ddump-simpl-stats</Option> to see what rules are being fired.
4255 If you add <Option>-dppr-debug</Option> you get a more detailed listing.
4261 The definition of (say) <Function>build</Function> in <FileName>PrelBase.lhs</FileName> looks like this:
4264 build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
4265 {-# INLINE build #-}
4269 Notice the <Literal>INLINE</Literal>! That prevents <Literal>(:)</Literal> from being inlined when compiling
4270 <Literal>PrelBase</Literal>, so that an importing module will “see” the <Literal>(:)</Literal>, and can
4271 match it on the LHS of a rule. <Literal>INLINE</Literal> prevents any inlining happening
4272 in the RHS of the <Literal>INLINE</Literal> thing. I regret the delicacy of this.
4279 In <Filename>ghc/lib/std/PrelBase.lhs</Filename> look at the rules for <Function>map</Function> to
4280 see how to write rules that will do fusion and yet give an efficient
4281 program even if fusion doesn't happen. More rules in <Filename>PrelList.lhs</Filename>.
4293 <Sect1 id="generic-classes">
4294 <Title>Generic classes</Title>
4297 The ideas behind this extension are described in detail in "Derivable type classes",
4298 Ralf Hinze and Simon Peyton Jones, Haskell Workshop, Montreal Sept 2000, pp94-105.
4299 An example will give the idea:
4307 fromBin :: [Int] -> (a, [Int])
4309 toBin {| Unit |} Unit = []
4310 toBin {| a :+: b |} (Inl x) = 0 : toBin x
4311 toBin {| a :+: b |} (Inr y) = 1 : toBin y
4312 toBin {| a :*: b |} (x :*: y) = toBin x ++ toBin y
4314 fromBin {| Unit |} bs = (Unit, bs)
4315 fromBin {| a :+: b |} (0:bs) = (Inl x, bs') where (x,bs') = fromBin bs
4316 fromBin {| a :+: b |} (1:bs) = (Inr y, bs') where (y,bs') = fromBin bs
4317 fromBin {| a :*: b |} bs = (x :*: y, bs'') where (x,bs' ) = fromBin bs
4318 (y,bs'') = fromBin bs'
4321 This class declaration explains how <Literal>toBin</Literal> and <Literal>fromBin</Literal>
4322 work for arbitrary data types. They do so by giving cases for unit, product, and sum,
4323 which are defined thus in the library module <Literal>Generics</Literal>:
4327 data a :+: b = Inl a | Inr b
4328 data a :*: b = a :*: b
4331 Now you can make a data type into an instance of Bin like this:
4333 instance (Bin a, Bin b) => Bin (a,b)
4334 instance Bin a => Bin [a]
4336 That is, just leave off the "where" clause. Of course, you can put in the
4337 where clause and override whichever methods you please.
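To see what the generic equations compute, it can help to write them out as ordinary instances over the three representation types. The sketch below does this in plain Haskell with no generic-class syntax; it is an illustration of the idea, not a description of the code the compiler generates. The data types are redeclared locally rather than imported, and <Literal>BoolRep</Literal>, <Function>boolToRep</Function> and <Function>repToBool</Function> are hypothetical helpers that view <Literal>Bool</Literal> as a sum of two units.

```haskell
{-# LANGUAGE TypeOperators #-}
module Main where

-- Local copies of the representation types.
data Unit    = Unit
data a :+: b = Inl a | Inr b
data a :*: b = a :*: b

class Bin a where
  toBin   :: a -> [Int]
  fromBin :: [Int] -> (a, [Int])

instance Bin Unit where
  toBin Unit = []
  fromBin bs = (Unit, bs)

instance (Bin a, Bin b) => Bin (a :+: b) where
  toBin (Inl x) = 0 : toBin x
  toBin (Inr y) = 1 : toBin y
  fromBin (0:bs) = (Inl x, bs') where (x, bs') = fromBin bs
  fromBin (1:bs) = (Inr y, bs') where (y, bs') = fromBin bs
  fromBin _      = error "fromBin: malformed input"

instance (Bin a, Bin b) => Bin (a :*: b) where
  toBin (x :*: y) = toBin x ++ toBin y
  fromBin bs = (x :*: y, bs'')
    where (x, bs')  = fromBin bs
          (y, bs'') = fromBin bs'

-- Hypothetical view of Bool as a sum of two units.
type BoolRep = Unit :+: Unit

boolToRep :: Bool -> BoolRep
boolToRep False = Inl Unit
boolToRep True  = Inr Unit

repToBool :: BoolRep -> Bool
repToBool (Inl Unit) = False
repToBool (Inr Unit) = True

-- The instance converts to the representation and reuses the instances above.
instance Bin Bool where
  toBin      = toBin . boolToRep
  fromBin bs = (repToBool r, rest) where (r, rest) = fromBin bs

main :: IO ()
main = do
  print (toBin True)                              -- prints [1]
  print (fst (fromBin [0, 7] :: (Bool, [Int])))   -- prints False
```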
4341 <Title> Using generics </Title>
4342 <Para>To use generics you need to</para>
4345 <Para>Use the <Option>-fgenerics</Option> flag.</Para>
4348 <Para>Import the module <Literal>Generics</Literal> from the
4349 <Literal>lang</Literal> package. This import brings into
4350 scope the data types <Literal>Unit</Literal>,
4351 <Literal>:*:</Literal>, and <Literal>:+:</Literal>. (You
4352 don't need this import if you don't mention these types
4353 explicitly; for example, if you are simply giving instance
4354 declarations.)</Para>
4359 <Sect2> <Title> Changes wrt the paper </Title>
4361 Note that the type constructors <Literal>:+:</Literal> and <Literal>:*:</Literal>
4362 can be written infix (indeed, you can now use
4363 any operator starting in a colon as an infix type constructor). Also note that
4364 the type constructors are not exactly as in the paper (Unit instead of 1, etc).
4365 Finally, note that the syntax of the type patterns in the class declaration
4366 uses "<Literal>{|</Literal>" and "<Literal>|}</Literal>" brackets; curly braces
4367 alone would be ambiguous when they appear on right hand sides (an extension we
4368 anticipate wanting).
4372 <Sect2> <Title>Terminology and restrictions</Title>
4374 Terminology. A "generic default method" in a class declaration
4375 is one that is defined using type patterns as above.
4376 A "polymorphic default method" is a default method defined as in Haskell 98.
4377 A "generic class declaration" is a class declaration with at least one
4378 generic default method.
4386 Alas, we do not yet implement the stuff about constructor names and
4393 A generic class can have only one parameter; you can't have a generic
4394 multi-parameter class.
4400 A default method must be defined entirely using type patterns, or entirely
4401 without. So this is illegal:
4404 op :: a -> (a, Bool)
4405 op {| Unit |} Unit = (Unit, True)
4408 However it is perfectly OK for some methods of a generic class to have
4409 generic default methods and others to have polymorphic default methods.
4415 The type variable(s) in the type pattern for a generic method declaration
4416 scope over the right hand side. So this is legal (note the use of the type variable <Literal>p</Literal> in a type signature on the right hand side):
4420 op {| p :*: q |} (x :*: y) = op (x :: p)
4428 The type patterns in a generic default method must take one of the forms:
4434 where "a" and "b" are type variables. Furthermore, all the type patterns for
4435 a single type constructor (<Literal>:*:</Literal>, say) must be identical; they
4436 must use the same type variables. So this is illegal:
4440 op {| a :+: b |} (Inl x) = True
4441 op {| p :+: q |} (Inr y) = False
4443 The type patterns must be identical, even in equations for different methods of the class.
4444 So this too is illegal:
4448 op1 {| a :*: b |} (x :*: y) = True
4451 op2 {| p :*: q |} (x :*: y) = False
4453 (The reason for this restriction is that we gather all the equations for a particular type constructor
4454 into a single generic instance declaration.)
4460 A generic method declaration must give a case for each of the three type constructors.
4466 In an instance declaration for a generic class, the idea is that the compiler
4467 will fill in the methods for you, based on the generic templates. However it can only do so if
4472 The instance type is simple (a type constructor applied to type variables, as in Haskell 98).
4477 No constructor of the instance type has unboxed fields.
4481 (Of course, these things can only arise if you are already using GHC extensions.)
4482 However, you can still give instance declarations for types which break these rules,
4483 provided you give explicit code to override any generic default methods.
4491 The option <Option>-ddump-deriv</Option> dumps incomprehensible stuff giving details of
4492 what the compiler does with generic declarations.
4497 <Sect2> <Title> Another example </Title>
4499 Just to finish with, here's another example I rather like:
4503 nCons {| Unit |} _ = 1
4504 nCons {| a :*: b |} _ = 1
4505 nCons {| a :+: b |} _ = nCons (bot::a) + nCons (bot::b)
4508 tag {| Unit |} _ = 1
4509 tag {| a :*: b |} _ = 1
4510 tag {| a :+: b |} (Inl x) = tag x
4511 tag {| a :+: b |} (Inr y) = nCons (bot::a) + tag y
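The same computation can be written out with ordinary instances, which makes it easy to try by hand. In the sketch below the class name <Literal>Tag</Literal>, the definition of <Function>bot</Function>, and the type synonym <Literal>Three</Literal> are assumptions made for illustration; the representation types are redeclared locally.

```haskell
{-# LANGUAGE TypeOperators, ScopedTypeVariables #-}
module Main where

data Unit    = Unit
data a :+: b = Inl a | Inr b
data a :*: b = a :*: b

-- The class name Tag is an assumption of this sketch.
class Tag a where
  nCons :: a -> Int   -- number of constructors; the argument is never forced
  tag   :: a -> Int   -- 1-based position of the value's constructor

-- bot is only ever passed to nCons, which ignores its argument.
bot :: a
bot = error "bot: should never be evaluated"

instance Tag Unit where
  nCons _ = 1
  tag   _ = 1

instance Tag (a :*: b) where
  nCons _ = 1
  tag   _ = 1

-- ScopedTypeVariables lets a and b from the instance head be used in the
-- type annotations on bot.
instance (Tag a, Tag b) => Tag (a :+: b) where
  nCons _     = nCons (bot :: a) + nCons (bot :: b)
  tag (Inl x) = tag x
  tag (Inr y) = nCons (bot :: a) + tag y

-- A three-constructor type written directly as a nested sum.
type Three = Unit :+: (Unit :+: Unit)

main :: IO ()
main = do
  print (nCons (bot :: Three))             -- prints 3
  print (tag (Inl Unit :: Three))          -- prints 1
  print (tag (Inr (Inr Unit) :: Three))    -- prints 3
```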