2 <IndexTerm><Primary>language, GHC</Primary></IndexTerm>
3 <IndexTerm><Primary>extensions, GHC</Primary></IndexTerm>
4 As with all known Haskell systems, GHC implements some extensions to
5 the language. To use them, you'll need to give a <Option>-fglasgow-exts</Option>
6 <IndexTerm><Primary>-fglasgow-exts option</Primary></IndexTerm> option.
10 Virtually all of the Glasgow extensions serve to give you access to
11 the underlying facilities with which we implement Haskell. Thus, you
12 can get at the Raw Iron, if you are willing to write some non-standard
13 code at a more primitive level. You need not be “stuck” on
14 performance because of the implementation costs of Haskell's
15 “high-level” features—you can always code “under” them. In an extreme case, you can write all your time-critical code in C, and then just glue it together with Haskell!
19 Executive summary of our extensions:
26 <Term>Unboxed types and primitive operations:</Term>
29 You can get right down to the raw machine types and operations;
30 included in this are “primitive arrays” (direct access to Big Wads
31 of Bytes). Please see <XRef LinkEnd="glasgow-unboxed"> and following.
37 <Term>Multi-parameter type classes:</Term>
40 GHC's type system supports extended type classes with multiple
41 parameters. Please see <XRef LinkEnd="multi-param-type-classes">.
47 <Term>Local universal quantification:</Term>
50 GHC's type system supports explicit universal quantification in
51 constructor fields and function arguments. This is useful for things
52 like defining <Literal>runST</Literal> from the state-thread world. See <XRef LinkEnd="universal-quantification">.
<Term>Existential quantification in data types:</Term>
61 Some or all of the type variables in a datatype declaration may be
62 <Emphasis>existentially quantified</Emphasis>. More details in <XRef LinkEnd="existential-quantification">.
68 <Term>Scoped type variables:</Term>
71 Scoped type variables enable the programmer to supply type signatures
72 for some nested declarations, where this would not be legal in Haskell
73 98. Details in <XRef LinkEnd="scoped-type-variables">.
<Term>Pattern guards:</Term>
82 Instead of being a boolean expression, a guard is a list of qualifiers, exactly as in a list comprehension. See <XRef LinkEnd="pattern-guards">.
88 <Term>Foreign calling:</Term>
91 Just what it sounds like. We provide <Emphasis>lots</Emphasis> of rope that you
92 can dangle around your neck. Please see <XRef LinkEnd="ffi">.
101 Pragmas are special instructions to the compiler placed in the source
102 file. The pragmas GHC supports are described in <XRef LinkEnd="pragmas">.
108 <Term>Rewrite rules:</Term>
111 The programmer can specify rewrite rules as part of the source program
112 (in a pragma). GHC applies these rewrite rules wherever it can.
113 Details in <XRef LinkEnd="rewrite-rules">.
121 Before you get too carried away working at the lowest level (e.g.,
122 sloshing <Literal>MutableByteArray#</Literal>s around your
123 program), you may wish to check if there are libraries that provide a
124 “Haskellised veneer” over the features you want. See
125 <xref linkend="book-hslibs">.
128 <Sect1 id="primitives">
129 <Title>Unboxed types and primitive operations
131 <IndexTerm><Primary>PrelGHC module</Primary></IndexTerm>
134 This module defines all the types which are primitive in Glasgow
135 Haskell, and the operations provided for them.
138 <Sect2 id="glasgow-unboxed">
143 <IndexTerm><Primary>Unboxed types (Glasgow extension)</Primary></IndexTerm>
146 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
147 that values of that type are represented by a pointer to a heap
148 object. The representation of a Haskell <literal>Int</literal>, for
149 example, is a two-word heap object. An <firstterm>unboxed</firstterm>
type, however, is represented by the value itself; no pointers or heap
allocation are involved.
155 Unboxed types correspond to the “raw machine” types you
156 would use in C: <Literal>Int#</Literal> (long int),
157 <Literal>Double#</Literal> (double), <Literal>Addr#</Literal>
158 (void *), etc. The <Emphasis>primitive operations</Emphasis>
159 (PrimOps) on these types are what you might expect; e.g.,
160 <Literal>(+#)</Literal> is addition on
161 <Literal>Int#</Literal>s, and is the machine-addition that we all
162 know and love—usually one instruction.
166 Primitive (unboxed) types cannot be defined in Haskell, and are
167 therefore built into the language and compiler. Primitive types are
168 always unlifted; that is, a value of a primitive type cannot be
169 bottom. We use the convention that primitive types, values, and
170 operations have a <Literal>#</Literal> suffix.
174 Primitive values are often represented by a simple bit-pattern, such
175 as <Literal>Int#</Literal>, <Literal>Float#</Literal>,
176 <Literal>Double#</Literal>. But this is not necessarily the case:
177 a primitive value might be represented by a pointer to a
178 heap-allocated object. Examples include
179 <Literal>Array#</Literal>, the type of primitive arrays. A
180 primitive array is heap-allocated because it is too big a value to fit
181 in a register, and would be too expensive to copy around; in a sense,
182 it is accidental that it is represented by a pointer. If a pointer
183 represents a primitive value, then it really does point to that value:
184 no unevaluated thunks, no indirections…nothing can be at the
185 other end of the pointer than the primitive value.
189 There are some restrictions on the use of primitive types, the main
190 one being that you can't pass a primitive value to a polymorphic
191 function or store one in a polymorphic data type. This rules out
192 things like <Literal>[Int#]</Literal> (i.e. lists of primitive
193 integers). The reason for this restriction is that polymorphic
194 arguments and constructor fields are assumed to be pointers: if an
195 unboxed integer is stored in one of these, the garbage collector would
196 attempt to follow it, leading to unpredictable space leaks. Or a
197 <Function>seq</Function> operation on the polymorphic component may
198 attempt to dereference the pointer, with disastrous results. Even
199 worse, the unboxed value might be larger than a pointer
200 (<Literal>Double#</Literal> for instance).
Nevertheless, a numerically-intensive program using unboxed types can
205 go a <Emphasis>lot</Emphasis> faster than its “standard”
206 counterpart—we saw a threefold speedup on one example.
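As a rough illustration of the kind of inner loop that benefits, here is a minimal sketch of a summation with an unboxed accumulator and counter. It assumes a recent GHC, where the primitives are imported from <Literal>GHC.Exts</Literal> (not <Literal>PrelGHC</Literal>), the <Literal>MagicHash</Literal> extension enables the <Literal>#</Literal> suffix, and comparison primitives return an <Literal>Int#</Literal> that <Literal>isTrue#</Literal> converts to <Literal>Bool</Literal>. The name <Literal>sumTo</Literal> is purely illustrative.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), Int#, isTrue#, (+#), (>#))

-- Sum the integers 1..n. The accumulator and the counter are both
-- raw Int# values, so the loop itself allocates nothing on the heap.
sumTo :: Int -> Int
sumTo (I# n) = I# (go 0# 1#)
  where
    go :: Int# -> Int# -> Int#
    go acc i
      | isTrue# (i ># n) = acc                       -- counter passed n: done
      | otherwise        = go (acc +# i) (i +# 1#)   -- one machine add each
```

The boxed <Literal>Int</Literal> appears only at the boundary: it is unwrapped once on entry and the result is re-wrapped once on exit.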
211 <Sect2 id="unboxed-tuples">
212 <Title>Unboxed Tuples
Unboxed tuples aren't really exported by <Literal>PrelGHC</Literal>;
they're available by default with <Option>-fglasgow-exts</Option>. An
unboxed tuple looks like this:
230 where <Literal>e_1..e_n</Literal> are expressions of any
231 type (primitive or non-primitive). The type of an unboxed tuple looks
236 Unboxed tuples are used for functions that need to return multiple
237 values, but they avoid the heap allocation normally associated with
238 using fully-fledged tuples. When an unboxed tuple is returned, the
239 components are put directly into registers or on the stack; the
240 unboxed tuple itself does not have a composite representation. Many
241 of the primitive operations listed in this section return unboxed
246 There are some pretty stringent restrictions on the use of unboxed tuples:
255 Unboxed tuple types are subject to the same restrictions as
256 other unboxed types; i.e. they may not be stored in polymorphic data
257 structures or passed to polymorphic functions.
264 Unboxed tuples may only be constructed as the direct result of
265 a function, and may only be deconstructed with a <Literal>case</Literal> expression.
For example, the following are valid:
f x y = (# x+1, y-1 #)
g x = case f x x of { (# a, b #) -> a + b }
275 but the following are invalid:
289 No variable can have an unboxed tuple type. This is illegal:
f :: (# Int, Int #) -> (# Int, Int #)
298 because <VarName>x</VarName> has an unboxed tuple type.
308 Note: we may relax some of these restrictions in the future.
312 The <Literal>IO</Literal> and <Literal>ST</Literal> monads use unboxed tuples to avoid unnecessary
313 allocation during sequences of operations.
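The restrictions above are easier to see in a small complete program. The following is a sketch only, assuming a recent GHC where the <Literal>UnboxedTuples</Literal> extension (instead of <Option>-fglasgow-exts</Option>) enables the syntax; <Literal>quotRem'</Literal> and <Literal>digitSum</Literal> are hypothetical names chosen for the example.

```haskell
{-# LANGUAGE UnboxedTuples #-}

-- Return quotient and remainder together. The unboxed tuple is built
-- as the direct result of the function, as the rules require.
quotRem' :: Int -> Int -> (# Int, Int #)
quotRem' n d = (# n `quot` d, n `rem` d #)

-- Take the result apart with a case expression, binding the
-- components (not the tuple itself) to variables.
digitSum :: Int -> Int
digitSum n
  | n < 10    = n
  | otherwise = case quotRem' n 10 of
      (# q, r #) -> r + digitSum q
```

Note that no variable is ever bound to the tuple as a whole; only its components <Literal>q</Literal> and <Literal>r</Literal> are named.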
319 <Title>Character and numeric types</Title>
322 <IndexTerm><Primary>character types, primitive</Primary></IndexTerm>
323 <IndexTerm><Primary>numeric types, primitive</Primary></IndexTerm>
324 <IndexTerm><Primary>integer types, primitive</Primary></IndexTerm>
325 <IndexTerm><Primary>floating point types, primitive</Primary></IndexTerm>
326 There are the following obvious primitive types:
342 <IndexTerm><Primary><literal>Char#</literal></Primary></IndexTerm>
343 <IndexTerm><Primary><literal>Int#</literal></Primary></IndexTerm>
344 <IndexTerm><Primary><literal>Word#</literal></Primary></IndexTerm>
345 <IndexTerm><Primary><literal>Addr#</literal></Primary></IndexTerm>
346 <IndexTerm><Primary><literal>Float#</literal></Primary></IndexTerm>
347 <IndexTerm><Primary><literal>Double#</literal></Primary></IndexTerm>
348 <IndexTerm><Primary><literal>Int64#</literal></Primary></IndexTerm>
349 <IndexTerm><Primary><literal>Word64#</literal></Primary></IndexTerm>
353 If you really want to know their exact equivalents in C, see
354 <Filename>ghc/includes/StgTypes.h</Filename> in the GHC source tree.
358 Literals for these types may be written as follows:
'a'#    a Char#; for weird characters, use '\o<octal>'#
"a"#    an Addr# (a `char *')
371 <IndexTerm><Primary>literals, primitive</Primary></IndexTerm>
372 <IndexTerm><Primary>constants, primitive</Primary></IndexTerm>
373 <IndexTerm><Primary>numbers, primitive</Primary></IndexTerm>
379 <Title>Comparison operations</Title>
382 <IndexTerm><Primary>comparisons, primitive</Primary></IndexTerm>
383 <IndexTerm><Primary>operators, comparison</Primary></IndexTerm>
{>,>=,==,/=,<,<=}# :: Int# -> Int# -> Bool

{gt,ge,eq,ne,lt,le}Char# :: Char# -> Char# -> Bool
    -- ditto for Word# and Addr#
395 <IndexTerm><Primary><literal>>#</literal></Primary></IndexTerm>
396 <IndexTerm><Primary><literal>>=#</literal></Primary></IndexTerm>
397 <IndexTerm><Primary><literal>==#</literal></Primary></IndexTerm>
398 <IndexTerm><Primary><literal>/=#</literal></Primary></IndexTerm>
399 <IndexTerm><Primary><literal><#</literal></Primary></IndexTerm>
400 <IndexTerm><Primary><literal><=#</literal></Primary></IndexTerm>
401 <IndexTerm><Primary><literal>gt{Char,Word,Addr}#</literal></Primary></IndexTerm>
402 <IndexTerm><Primary><literal>ge{Char,Word,Addr}#</literal></Primary></IndexTerm>
403 <IndexTerm><Primary><literal>eq{Char,Word,Addr}#</literal></Primary></IndexTerm>
404 <IndexTerm><Primary><literal>ne{Char,Word,Addr}#</literal></Primary></IndexTerm>
405 <IndexTerm><Primary><literal>lt{Char,Word,Addr}#</literal></Primary></IndexTerm>
406 <IndexTerm><Primary><literal>le{Char,Word,Addr}#</literal></Primary></IndexTerm>
412 <Title>Primitive-character operations</Title>
415 <IndexTerm><Primary>characters, primitive operations</Primary></IndexTerm>
416 <IndexTerm><Primary>operators, primitive character</Primary></IndexTerm>
ord# :: Char# -> Int#
chr# :: Int# -> Char#
426 <IndexTerm><Primary><literal>ord#</literal></Primary></IndexTerm>
427 <IndexTerm><Primary><literal>chr#</literal></Primary></IndexTerm>
433 <Title>Primitive-<Literal>Int</Literal> operations</Title>
436 <IndexTerm><Primary>integers, primitive operations</Primary></IndexTerm>
437 <IndexTerm><Primary>operators, primitive integer</Primary></IndexTerm>
{+,-,*,quotInt,remInt,gcdInt}# :: Int# -> Int# -> Int#
negateInt# :: Int# -> Int#

iShiftL#, iShiftRA#, iShiftRL# :: Int# -> Int# -> Int#
        -- shift left, right arithmetic, right logical

addIntC#, subIntC#, mulIntC# :: Int# -> Int# -> (# Int#, Int# #)
        -- add, subtract, multiply with carry
453 <IndexTerm><Primary><literal>+#</literal></Primary></IndexTerm>
454 <IndexTerm><Primary><literal>-#</literal></Primary></IndexTerm>
455 <IndexTerm><Primary><literal>*#</literal></Primary></IndexTerm>
456 <IndexTerm><Primary><literal>quotInt#</literal></Primary></IndexTerm>
457 <IndexTerm><Primary><literal>remInt#</literal></Primary></IndexTerm>
458 <IndexTerm><Primary><literal>gcdInt#</literal></Primary></IndexTerm>
459 <IndexTerm><Primary><literal>iShiftL#</literal></Primary></IndexTerm>
460 <IndexTerm><Primary><literal>iShiftRA#</literal></Primary></IndexTerm>
461 <IndexTerm><Primary><literal>iShiftRL#</literal></Primary></IndexTerm>
462 <IndexTerm><Primary><literal>addIntC#</literal></Primary></IndexTerm>
463 <IndexTerm><Primary><literal>subIntC#</literal></Primary></IndexTerm>
464 <IndexTerm><Primary><literal>mulIntC#</literal></Primary></IndexTerm>
465 <IndexTerm><Primary>shift operations, integer</Primary></IndexTerm>
469 <Emphasis>Note:</Emphasis> No error/overflow checking!
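Since the plain operations do no checking, the carry-reporting variants are the place to build a check of your own. Here is a minimal sketch, assuming a recent GHC in which <Literal>addIntC#</Literal> is exported from <Literal>GHC.Exts</Literal> and its second result component is zero exactly when the true sum fits in an <Literal>Int#</Literal>. The wrapper name <Literal>checkedAdd</Literal> is hypothetical.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (Int (I#), addIntC#, isTrue#, (==#))

-- Add two Ints, returning Nothing if the machine addition overflowed.
checkedAdd :: Int -> Int -> Maybe Int
checkedAdd (I# x) (I# y) =
  case addIntC# x y of
    (# r, c #)
      | isTrue# (c ==# 0#) -> Just (I# r)   -- carry flag clear: r is exact
      | otherwise          -> Nothing       -- carry flag set: r was truncated
```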
475 <Title>Primitive-<Literal>Double</Literal> and <Literal>Float</Literal> operations</Title>
478 <IndexTerm><Primary>floating point numbers, primitive</Primary></IndexTerm>
479 <IndexTerm><Primary>operators, primitive floating point</Primary></IndexTerm>
{+,-,*,/}## :: Double# -> Double# -> Double#
{<,<=,==,/=,>=,>}## :: Double# -> Double# -> Bool
negateDouble# :: Double# -> Double#
double2Int# :: Double# -> Int#
int2Double# :: Int# -> Double#

{plus,minus,times,divide}Float# :: Float# -> Float# -> Float#
{gt,ge,eq,ne,lt,le}Float# :: Float# -> Float# -> Bool
negateFloat# :: Float# -> Float#
float2Int# :: Float# -> Int#
int2Float# :: Int# -> Float#
501 <IndexTerm><Primary><literal>+##</literal></Primary></IndexTerm>
502 <IndexTerm><Primary><literal>-##</literal></Primary></IndexTerm>
503 <IndexTerm><Primary><literal>*##</literal></Primary></IndexTerm>
504 <IndexTerm><Primary><literal>/##</literal></Primary></IndexTerm>
505 <IndexTerm><Primary><literal><##</literal></Primary></IndexTerm>
506 <IndexTerm><Primary><literal><=##</literal></Primary></IndexTerm>
507 <IndexTerm><Primary><literal>==##</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>/=##</literal></Primary></IndexTerm>
509 <IndexTerm><Primary><literal>>=##</literal></Primary></IndexTerm>
510 <IndexTerm><Primary><literal>>##</literal></Primary></IndexTerm>
511 <IndexTerm><Primary><literal>negateDouble#</literal></Primary></IndexTerm>
512 <IndexTerm><Primary><literal>double2Int#</literal></Primary></IndexTerm>
513 <IndexTerm><Primary><literal>int2Double#</literal></Primary></IndexTerm>
517 <IndexTerm><Primary><literal>plusFloat#</literal></Primary></IndexTerm>
518 <IndexTerm><Primary><literal>minusFloat#</literal></Primary></IndexTerm>
519 <IndexTerm><Primary><literal>timesFloat#</literal></Primary></IndexTerm>
520 <IndexTerm><Primary><literal>divideFloat#</literal></Primary></IndexTerm>
521 <IndexTerm><Primary><literal>gtFloat#</literal></Primary></IndexTerm>
522 <IndexTerm><Primary><literal>geFloat#</literal></Primary></IndexTerm>
523 <IndexTerm><Primary><literal>eqFloat#</literal></Primary></IndexTerm>
524 <IndexTerm><Primary><literal>neFloat#</literal></Primary></IndexTerm>
525 <IndexTerm><Primary><literal>ltFloat#</literal></Primary></IndexTerm>
526 <IndexTerm><Primary><literal>leFloat#</literal></Primary></IndexTerm>
527 <IndexTerm><Primary><literal>negateFloat#</literal></Primary></IndexTerm>
528 <IndexTerm><Primary><literal>float2Int#</literal></Primary></IndexTerm>
529 <IndexTerm><Primary><literal>int2Float#</literal></Primary></IndexTerm>
And a full complement of exponential, logarithmic and trigonometric functions:
expDouble#   :: Double# -> Double#
logDouble#   :: Double# -> Double#
sqrtDouble#  :: Double# -> Double#
sinDouble#   :: Double# -> Double#
cosDouble#   :: Double# -> Double#
tanDouble#   :: Double# -> Double#
asinDouble#  :: Double# -> Double#
acosDouble#  :: Double# -> Double#
atanDouble#  :: Double# -> Double#
sinhDouble#  :: Double# -> Double#
coshDouble#  :: Double# -> Double#
tanhDouble#  :: Double# -> Double#
powerDouble# :: Double# -> Double# -> Double#
554 <IndexTerm><Primary>trigonometric functions, primitive</Primary></IndexTerm>
558 similarly for <Literal>Float#</Literal>.
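To show how the arithmetic operators and the maths functions combine, here is a small sketch computing the length of a hypotenuse entirely on <Literal>Double#</Literal> values. It assumes a recent GHC exporting these names from <Literal>GHC.Exts</Literal>; the function name <Literal>hypot</Literal> is illustrative only.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Double (D#), sqrtDouble#, (*##), (+##))

-- sqrt (x*x + y*y), with every intermediate value held unboxed.
-- No overflow or range checking is done, as with all the PrimOps.
hypot :: Double -> Double -> Double
hypot (D# x) (D# y) = D# (sqrtDouble# ((x *## x) +## (y *## y)))
```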
562 There are two coercion functions for <Literal>Float#</Literal>/<Literal>Double#</Literal>:
float2Double# :: Float# -> Double#
double2Float# :: Double# -> Float#
572 <IndexTerm><Primary><literal>float2Double#</literal></Primary></IndexTerm>
573 <IndexTerm><Primary><literal>double2Float#</literal></Primary></IndexTerm>
577 The primitive version of <Function>decodeDouble</Function>
578 (<Function>encodeDouble</Function> is implemented as an external C
decodeDouble# :: Double# -> PrelNum.ReturnIntAndGMP
588 <IndexTerm><Primary><literal>encodeDouble#</literal></Primary></IndexTerm>
589 <IndexTerm><Primary><literal>decodeDouble#</literal></Primary></IndexTerm>
593 (And the same for <Literal>Float#</Literal>s.)
598 <Sect2 id="integer-operations">
599 <Title>Operations on/for <Literal>Integers</Literal> (interface to GMP)
603 <IndexTerm><Primary>arbitrary precision integers</Primary></IndexTerm>
604 <IndexTerm><Primary>Integer, operations on</Primary></IndexTerm>
608 We implement <Literal>Integers</Literal> (arbitrary-precision
609 integers) using the GNU multiple-precision (GMP) package (version
614 The data type for <Literal>Integer</Literal> is either a small
615 integer, represented by an <Literal>Int</Literal>, or a large integer
616 represented using the pieces required by GMP's
617 <Literal>MP_INT</Literal> in <Filename>gmp.h</Filename> (see
618 <Filename>gmp.info</Filename> in
619 <Filename>ghc/includes/runtime/gmp</Filename>). It comes out as:
data Integer = S# Int#             -- small integers
             | J# Int# ByteArray#  -- large integers
629 <IndexTerm><Primary>Integer type</Primary></IndexTerm> The primitive
630 ops to support large <Literal>Integers</Literal> use the
631 “pieces” of the representation, and are as follows:
637 negateInteger# :: Int# -> ByteArray# -> Integer
639 {plus,minus,times}Integer#, gcdInteger#,
640 quotInteger#, remInteger#, divExactInteger#
641 :: Int# -> ByteArray#
642 -> Int# -> ByteArray#
643 -> (# Int#, ByteArray# #)
646 :: Int# -> ByteArray#
647 -> Int# -> ByteArray#
648 -> Int# -- -1 for <; 0 for ==; +1 for >
651 :: Int# -> ByteArray#
653 -> Int# -- -1 for <; 0 for ==; +1 for >
656 :: Int# -> ByteArray#
660 divModInteger#, quotRemInteger#
661 :: Int# -> ByteArray#
662 -> Int# -> ByteArray#
663 -> (# Int#, ByteArray#,
666 integer2Int# :: Int# -> ByteArray# -> Int#
668 int2Integer# :: Int# -> Integer -- NB: no error-checking on these two!
669 word2Integer# :: Word# -> Integer
671 addr2Integer# :: Addr# -> Integer
672 -- the Addr# is taken to be a `char *' string
673 -- to be converted into an Integer.
676 <IndexTerm><Primary><literal>negateInteger#</literal></Primary></IndexTerm>
677 <IndexTerm><Primary><literal>plusInteger#</literal></Primary></IndexTerm>
678 <IndexTerm><Primary><literal>minusInteger#</literal></Primary></IndexTerm>
679 <IndexTerm><Primary><literal>timesInteger#</literal></Primary></IndexTerm>
680 <IndexTerm><Primary><literal>quotInteger#</literal></Primary></IndexTerm>
681 <IndexTerm><Primary><literal>remInteger#</literal></Primary></IndexTerm>
682 <IndexTerm><Primary><literal>gcdInteger#</literal></Primary></IndexTerm>
683 <IndexTerm><Primary><literal>gcdIntegerInt#</literal></Primary></IndexTerm>
684 <IndexTerm><Primary><literal>divExactInteger#</literal></Primary></IndexTerm>
685 <IndexTerm><Primary><literal>cmpInteger#</literal></Primary></IndexTerm>
686 <IndexTerm><Primary><literal>divModInteger#</literal></Primary></IndexTerm>
687 <IndexTerm><Primary><literal>quotRemInteger#</literal></Primary></IndexTerm>
688 <IndexTerm><Primary><literal>integer2Int#</literal></Primary></IndexTerm>
689 <IndexTerm><Primary><literal>int2Integer#</literal></Primary></IndexTerm>
690 <IndexTerm><Primary><literal>word2Integer#</literal></Primary></IndexTerm>
691 <IndexTerm><Primary><literal>addr2Integer#</literal></Primary></IndexTerm>
697 <Title>Words and addresses</Title>
700 <IndexTerm><Primary>word, primitive type</Primary></IndexTerm>
701 <IndexTerm><Primary>address, primitive type</Primary></IndexTerm>
702 <IndexTerm><Primary>unsigned integer, primitive type</Primary></IndexTerm>
703 <IndexTerm><Primary>pointer, primitive type</Primary></IndexTerm>
707 A <Literal>Word#</Literal> is used for bit-twiddling operations.
708 It is the same size as an <Literal>Int#</Literal>, but has no sign
709 nor any arithmetic operations.
type Word# -- Same size/etc as Int# but *unsigned*
type Addr# -- A pointer from outside the "Haskell world" (from C, probably);
           -- described under "arrays"
717 <IndexTerm><Primary><literal>Word#</literal></Primary></IndexTerm>
718 <IndexTerm><Primary><literal>Addr#</literal></Primary></IndexTerm>
722 <Literal>Word#</Literal>s and <Literal>Addr#</Literal>s have
723 the usual comparison operations. Other
724 unboxed-<Literal>Word</Literal> ops (bit-twiddling and coercions):
{gt,ge,eq,ne,lt,le}Word# :: Word# -> Word# -> Bool

and#, or#, xor# :: Word# -> Word# -> Word#

quotWord#, remWord# :: Word# -> Word# -> Word#
    -- word (i.e. unsigned) versions are different from int
    -- versions, so we have to provide these explicitly.

not# :: Word# -> Word#

shiftL#, shiftRL# :: Word# -> Int# -> Word#
    -- shift left, right logical

int2Word# :: Int# -> Word# -- just a cast, really
word2Int# :: Word# -> Int#
748 <IndexTerm><Primary>bit operations, Word and Addr</Primary></IndexTerm>
749 <IndexTerm><Primary><literal>gtWord#</literal></Primary></IndexTerm>
750 <IndexTerm><Primary><literal>geWord#</literal></Primary></IndexTerm>
751 <IndexTerm><Primary><literal>eqWord#</literal></Primary></IndexTerm>
752 <IndexTerm><Primary><literal>neWord#</literal></Primary></IndexTerm>
753 <IndexTerm><Primary><literal>ltWord#</literal></Primary></IndexTerm>
754 <IndexTerm><Primary><literal>leWord#</literal></Primary></IndexTerm>
755 <IndexTerm><Primary><literal>and#</literal></Primary></IndexTerm>
756 <IndexTerm><Primary><literal>or#</literal></Primary></IndexTerm>
757 <IndexTerm><Primary><literal>xor#</literal></Primary></IndexTerm>
758 <IndexTerm><Primary><literal>not#</literal></Primary></IndexTerm>
759 <IndexTerm><Primary><literal>quotWord#</literal></Primary></IndexTerm>
760 <IndexTerm><Primary><literal>remWord#</literal></Primary></IndexTerm>
761 <IndexTerm><Primary><literal>shiftL#</literal></Primary></IndexTerm>
762 <IndexTerm><Primary><literal>shiftRA#</literal></Primary></IndexTerm>
763 <IndexTerm><Primary><literal>shiftRL#</literal></Primary></IndexTerm>
764 <IndexTerm><Primary><literal>int2Word#</literal></Primary></IndexTerm>
765 <IndexTerm><Primary><literal>word2Int#</literal></Primary></IndexTerm>
769 Unboxed-<Literal>Addr</Literal> ops (C casts, really):
{gt,ge,eq,ne,lt,le}Addr# :: Addr# -> Addr# -> Bool

int2Addr# :: Int# -> Addr#
addr2Int# :: Addr# -> Int#
addr2Integer# :: Addr# -> (# Int#, ByteArray# #)
779 <IndexTerm><Primary><literal>gtAddr#</literal></Primary></IndexTerm>
780 <IndexTerm><Primary><literal>geAddr#</literal></Primary></IndexTerm>
781 <IndexTerm><Primary><literal>eqAddr#</literal></Primary></IndexTerm>
782 <IndexTerm><Primary><literal>neAddr#</literal></Primary></IndexTerm>
783 <IndexTerm><Primary><literal>ltAddr#</literal></Primary></IndexTerm>
784 <IndexTerm><Primary><literal>leAddr#</literal></Primary></IndexTerm>
785 <IndexTerm><Primary><literal>int2Addr#</literal></Primary></IndexTerm>
786 <IndexTerm><Primary><literal>addr2Int#</literal></Primary></IndexTerm>
787 <IndexTerm><Primary><literal>addr2Integer#</literal></Primary></IndexTerm>
791 The casts between <Literal>Int#</Literal>,
792 <Literal>Word#</Literal> and <Literal>Addr#</Literal>
793 correspond to null operations at the machine level, but are required
794 to keep the Haskell type checker happy.
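A typical use of these casts is to borrow the <Literal>Word#</Literal> bit operations for an <Literal>Int#</Literal>. The following is a minimal sketch, assuming a recent GHC where <Literal>and#</Literal>, <Literal>int2Word#</Literal> and <Literal>word2Int#</Literal> are exported from <Literal>GHC.Exts</Literal>; <Literal>lowByte</Literal> is a made-up name for the example.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts (Int (I#), and#, int2Word#, word2Int#)

-- Keep only the low eight bits of an Int. The two casts cost nothing
-- at run time; they exist only to satisfy the type checker.
lowByte :: Int -> Int
lowByte (I# i) = I# (word2Int# (and# (int2Word# i) (int2Word# 255#)))
```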
798 Operations for indexing off of C pointers
799 (<Literal>Addr#</Literal>s) to snatch values are listed under
800 “arrays”.
806 <Title>Arrays</Title>
809 <IndexTerm><Primary>arrays, primitive</Primary></IndexTerm>
813 The type <Literal>Array# elt</Literal> is the type of primitive,
814 unpointed arrays of values of type <Literal>elt</Literal>.
823 <IndexTerm><Primary><literal>Array#</literal></Primary></IndexTerm>
827 <Literal>Array#</Literal> is more primitive than a Haskell
828 array—indeed, the Haskell <Literal>Array</Literal> interface is
829 implemented using <Literal>Array#</Literal>—in that an
830 <Literal>Array#</Literal> is indexed only by
831 <Literal>Int#</Literal>s, starting at zero. It is also more
832 primitive by virtue of being unboxed. That doesn't mean that it isn't
833 a heap-allocated object—of course, it is. Rather, being unboxed
834 means that it is represented by a pointer to the array itself, and not
835 to a thunk which will evaluate to the array (or to bottom). The
836 components of an <Literal>Array#</Literal> are themselves boxed.
840 The type <Literal>ByteArray#</Literal> is similar to
841 <Literal>Array#</Literal>, except that it contains just a string
842 of (non-pointer) bytes.
851 <IndexTerm><Primary><literal>ByteArray#</literal></Primary></IndexTerm>
855 Arrays of these types are useful when a Haskell program wishes to
856 construct a value to pass to a C procedure. It is also possible to use
857 them to build (say) arrays of unboxed characters for internal use in a
858 Haskell program. Given these uses, <Literal>ByteArray#</Literal>
859 is deliberately a bit vague about the type of its components.
860 Operations are provided to extract values of type
861 <Literal>Char#</Literal>, <Literal>Int#</Literal>,
862 <Literal>Float#</Literal>, <Literal>Double#</Literal>, and
863 <Literal>Addr#</Literal> from arbitrary offsets within a
864 <Literal>ByteArray#</Literal>. (For type
<Literal>Foo#</Literal>, the <Emphasis>i</Emphasis>th offset gets you the <Emphasis>i</Emphasis>th
<Literal>Foo#</Literal>, not the <Literal>Foo#</Literal> at
byte-position <Emphasis>i</Emphasis>. Mumble.) (If you want a
868 <Literal>Word#</Literal>, grab an <Literal>Int#</Literal>,
873 Lastly, we have static byte-arrays, of type
874 <Literal>Addr#</Literal> [mentioned previously]. (Remember
the duality between arrays and pointers in C.) Arrays of this type
876 are represented by a pointer to an array in the world outside Haskell,
877 so this pointer is not followed by the garbage collector. In other
878 respects they are just like <Literal>ByteArray#</Literal>. They
879 are only needed in order to pass values from C to Haskell.
885 <Title>Reading and writing</Title>
888 Primitive arrays are linear, and indexed starting at zero.
892 The size and indices of a <Literal>ByteArray#</Literal>, <Literal>Addr#</Literal>, and
893 <Literal>MutableByteArray#</Literal> are all in bytes. It's up to the program to
894 calculate the correct byte offset from the start of the array. This
895 allows a <Literal>ByteArray#</Literal> to contain a mixture of values of different
896 type, which is often needed when preparing data for and unpicking
897 results from C. (Umm…not true of indices…WDP 95/09)
901 <Emphasis>Should we provide some <Literal>sizeOfDouble#</Literal> constants?</Emphasis>
905 Out-of-range errors on indexing should be caught by the code which
906 uses the primitive operation; the primitive operations themselves do
907 <Emphasis>not</Emphasis> check for out-of-range indexes. The intention is that the
908 primitive ops compile to one machine instruction or thereabouts.
912 We use the terms “reading” and “writing” to refer to accessing
913 <Emphasis>mutable</Emphasis> arrays (see <XRef LinkEnd="sect-mutable">), and
914 “indexing” to refer to reading a value from an <Emphasis>immutable</Emphasis>
919 Immutable byte arrays are straightforward to index (all indices in bytes):
indexCharArray#     :: ByteArray# -> Int# -> Char#
indexIntArray#      :: ByteArray# -> Int# -> Int#
indexAddrArray#     :: ByteArray# -> Int# -> Addr#
indexFloatArray#    :: ByteArray# -> Int# -> Float#
indexDoubleArray#   :: ByteArray# -> Int# -> Double#

indexCharOffAddr#   :: Addr# -> Int# -> Char#
indexIntOffAddr#    :: Addr# -> Int# -> Int#
indexFloatOffAddr#  :: Addr# -> Int# -> Float#
indexDoubleOffAddr# :: Addr# -> Int# -> Double#
indexAddrOffAddr#   :: Addr# -> Int# -> Addr#
    -- Get an Addr# from an Addr# offset
936 <IndexTerm><Primary><literal>indexCharArray#</literal></Primary></IndexTerm>
937 <IndexTerm><Primary><literal>indexIntArray#</literal></Primary></IndexTerm>
938 <IndexTerm><Primary><literal>indexAddrArray#</literal></Primary></IndexTerm>
939 <IndexTerm><Primary><literal>indexFloatArray#</literal></Primary></IndexTerm>
940 <IndexTerm><Primary><literal>indexDoubleArray#</literal></Primary></IndexTerm>
941 <IndexTerm><Primary><literal>indexCharOffAddr#</literal></Primary></IndexTerm>
942 <IndexTerm><Primary><literal>indexIntOffAddr#</literal></Primary></IndexTerm>
943 <IndexTerm><Primary><literal>indexFloatOffAddr#</literal></Primary></IndexTerm>
944 <IndexTerm><Primary><literal>indexDoubleOffAddr#</literal></Primary></IndexTerm>
945 <IndexTerm><Primary><literal>indexAddrOffAddr#</literal></Primary></IndexTerm>
949 The last of these, <Function>indexAddrOffAddr#</Function>, extracts an <Literal>Addr#</Literal> using an offset
950 from another <Literal>Addr#</Literal>, thereby providing the ability to follow a chain of
955 Something a bit more interesting goes on when indexing arrays of boxed
956 objects, because the result is simply the boxed object. So presumably
957 it should be entered—we never usually return an unevaluated
958 object! This is a pain: primitive ops aren't supposed to do
959 complicated things like enter objects. The current solution is to
960 return a single element unboxed tuple (see <XRef LinkEnd="unboxed-tuples">).
indexArray# :: Array# elt -> Int# -> (# elt #)
969 <IndexTerm><Primary><literal>indexArray#</literal></Primary></IndexTerm>
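To make the indexing operations on static byte arrays concrete, here is a sketch that walks a primitive string literal (an <Literal>Addr#</Literal>) with <Literal>indexCharOffAddr#</Literal> until it reaches the terminating NUL. It assumes a recent GHC with <Literal>MagicHash</Literal> and <Literal>GHC.Exts</Literal>, where <Literal>eqChar#</Literal> returns an <Literal>Int#</Literal> that <Literal>isTrue#</Literal> converts to <Literal>Bool</Literal>; <Literal>unpack</Literal> and <Literal>greeting</Literal> are hypothetical names.

```haskell
{-# LANGUAGE MagicHash #-}
import GHC.Exts
  (Addr#, Char (C#), Int#, eqChar#, indexCharOffAddr#, isTrue#, (+#))

-- Convert a NUL-terminated static byte array into a Haskell String.
-- There is no bounds checking: the loop relies entirely on the NUL.
unpack :: Addr# -> String
unpack addr = go 0#
  where
    go :: Int# -> String
    go i =
      case indexCharOffAddr# addr i of
        c | isTrue# (eqChar# c '\0'#) -> []
          | otherwise                 -> C# c : go (i +# 1#)

-- A primitive string literal has type Addr#.
greeting :: String
greeting = unpack "hello"#
```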
975 <Title>The state type</Title>
978 <IndexTerm><Primary><literal>state, primitive type</literal></Primary></IndexTerm>
979 <IndexTerm><Primary><literal>State#</literal></Primary></IndexTerm>
983 The primitive type <Literal>State#</Literal> represents the state of a state
984 transformer. It is parameterised on the desired type of state, which
985 serves to keep states from distinct threads distinct from one another.
986 But the <Emphasis>only</Emphasis> effect of this parameterisation is in the type
987 system: all values of type <Literal>State#</Literal> are represented in the same way.
988 Indeed, they are all represented by nothing at all! The code
989 generator “knows” to generate no code, and allocate no registers
990 etc, for primitive states.
1002 The type <Literal>GHC.RealWorld</Literal> is truly opaque: there are no values defined
1003 of this type, and no operations over it. It is “primitive” in that
1004 sense - but it is <Emphasis>not unlifted!</Emphasis> Its only role in life is to be
1005 the type which distinguishes the <Literal>IO</Literal> state transformer.
1019 <Title>State of the world</Title>
1022 A single, primitive, value of type <Literal>State# RealWorld</Literal> is provided.
realWorld# :: State# RealWorld
1031 <IndexTerm><Primary>realWorld# state object</Primary></IndexTerm>
1035 (Note: in the compiler, not a <Literal>PrimOp</Literal>; just a mucho magic
1036 <Literal>Id</Literal>. Exported from <Literal>GHC</Literal>, though).
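The way a state token is threaded can be seen by writing an <Literal>IO</Literal> action by hand. This is a sketch under assumptions, not a description of the library's own code: it relies on a recent GHC in which <Literal>GHC.IO</Literal> exports the <Literal>IO</Literal> constructor wrapping a function from <Literal>State# RealWorld</Literal> to an unboxed pair. The names <Literal>returnIO'</Literal> and <Literal>fortyTwo</Literal> are made up for the example.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import GHC.Exts (RealWorld, State#)
import GHC.IO (IO (IO))

-- A state transformer over the real world that leaves the state
-- untouched and returns its argument alongside it.
returnIO' :: a -> State# RealWorld -> (# State# RealWorld, a #)
returnIO' x s = (# s, x #)

-- Wrapping the transformer in the IO constructor gives an ordinary
-- IO action.
fortyTwo :: IO Int
fortyTwo = IO (returnIO' 42)
```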
1041 <Sect2 id="sect-mutable">
1042 <Title>Mutable arrays</Title>
1045 <IndexTerm><Primary>mutable arrays</Primary></IndexTerm>
1046 <IndexTerm><Primary>arrays, mutable</Primary></IndexTerm>
1047 Corresponding to <Literal>Array#</Literal> and <Literal>ByteArray#</Literal>, we have the types of
1048 mutable versions of each. In each case, the representation is a
1049 pointer to a suitable block of (mutable) heap-allocated storage.
type MutableArray# s elt
type MutableByteArray# s
1059 <IndexTerm><Primary><literal>MutableArray#</literal></Primary></IndexTerm>
1060 <IndexTerm><Primary><literal>MutableByteArray#</literal></Primary></IndexTerm>
1064 <Title>Allocation</Title>
1067 <IndexTerm><Primary>mutable arrays, allocation</Primary></IndexTerm>
1068 <IndexTerm><Primary>arrays, allocation</Primary></IndexTerm>
1069 <IndexTerm><Primary>allocation, of mutable arrays</Primary></IndexTerm>
1073 Mutable arrays can be allocated. Only pointer-arrays are initialised;
1074 arrays of non-pointers are filled in by “user code” rather than by
1075 the array-allocation primitive. Reason: only the pointer case has to
1076 worry about GC striking with a partly-initialised array.
1082 newArray# :: Int# -> elt -> State# s -> (# State# s, MutableArray# s elt #)
newCharArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newIntArray#    :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newAddrArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newFloatArray#  :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
newDoubleArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
1091 <IndexTerm><Primary><literal>newArray#</literal></Primary></IndexTerm>
1092 <IndexTerm><Primary><literal>newCharArray#</literal></Primary></IndexTerm>
1093 <IndexTerm><Primary><literal>newIntArray#</literal></Primary></IndexTerm>
1094 <IndexTerm><Primary><literal>newAddrArray#</literal></Primary></IndexTerm>
1095 <IndexTerm><Primary><literal>newFloatArray#</literal></Primary></IndexTerm>
1096 <IndexTerm><Primary><literal>newDoubleArray#</literal></Primary></IndexTerm>
1100 The size of a <Literal>ByteArray#</Literal> is given in bytes.
1106 <Title>Reading and writing</Title>
1109 <IndexTerm><Primary>arrays, reading and writing</Primary></IndexTerm>
1115 readArray# :: MutableArray# s elt -> Int# -> State# s -> (# State# s, elt #)
1116 readCharArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Char# #)
1117 readIntArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #)
1118 readAddrArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Addr# #)
1119 readFloatArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Float# #)
1120 readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Double# #)
1122 writeArray# :: MutableArray# s elt -> Int# -> elt -> State# s -> State# s
1123 writeCharArray# :: MutableByteArray# s -> Int# -> Char# -> State# s -> State# s
1124 writeIntArray# :: MutableByteArray# s -> Int# -> Int# -> State# s -> State# s
1125 writeAddrArray# :: MutableByteArray# s -> Int# -> Addr# -> State# s -> State# s
1126 writeFloatArray# :: MutableByteArray# s -> Int# -> Float# -> State# s -> State# s
1127 writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s
1130 <IndexTerm><Primary><literal>readArray#</literal></Primary></IndexTerm>
1131 <IndexTerm><Primary><literal>readCharArray#</literal></Primary></IndexTerm>
1132 <IndexTerm><Primary><literal>readIntArray#</literal></Primary></IndexTerm>
1133 <IndexTerm><Primary><literal>readAddrArray#</literal></Primary></IndexTerm>
1134 <IndexTerm><Primary><literal>readFloatArray#</literal></Primary></IndexTerm>
1135 <IndexTerm><Primary><literal>readDoubleArray#</literal></Primary></IndexTerm>
1136 <IndexTerm><Primary><literal>writeArray#</literal></Primary></IndexTerm>
1137 <IndexTerm><Primary><literal>writeCharArray#</literal></Primary></IndexTerm>
1138 <IndexTerm><Primary><literal>writeIntArray#</literal></Primary></IndexTerm>
1139 <IndexTerm><Primary><literal>writeAddrArray#</literal></Primary></IndexTerm>
1140 <IndexTerm><Primary><literal>writeFloatArray#</literal></Primary></IndexTerm>
1141 <IndexTerm><Primary><literal>writeDoubleArray#</literal></Primary></IndexTerm>
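As a rough sketch of how allocation, writing and reading fit together, the following threads the state token by hand through a two-element byte array. It assumes a present-day GHC, where the primitives are exported from <Literal>GHC.Exts</Literal> and a single <Literal>newByteArray#</Literal> (sized in bytes) stands in for the per-type allocators listed above; the name <Literal>sumViaByteArray</Literal> is purely illustrative.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import Control.Monad.ST (runST)
import Foreign.Storable (sizeOf)
import GHC.Exts
import GHC.ST (ST (..))

-- Write two Ints into a freshly allocated byte array, read them back, add them.
-- (sumViaByteArray is a hypothetical name, not part of any library.)
sumViaByteArray :: Int -> Int -> Int
sumViaByteArray (I# x) (I# y) =
  case sizeOf (0 :: Int) of
    I# w -> runST (ST (\s0 ->
      -- The size argument is in bytes, so two Ints need 2 * sizeOf Int.
      case newByteArray# (2# *# w) s0 of
        (# s1, arr #) ->
          -- Each write consumes one state token and yields the next.
          case writeIntArray# arr 0# x s1 of
            s2 -> case writeIntArray# arr 1# y s2 of
              s3 -> case readIntArray# arr 0# s3 of
                (# s4, a #) -> case readIntArray# arr 1# s4 of
                  (# s5, b #) -> (# s5, I# (a +# b) #)))
```

Because every operation takes the state produced by the previous one, the compiler cannot reorder the writes and reads, even though the state token itself occupies no space at run time.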
1147 <Title>Equality</Title>
1150 <IndexTerm><Primary>arrays, testing for equality</Primary></IndexTerm>
1154 One can take “equality” of mutable arrays. What is compared is the
1155 <Emphasis>name</Emphasis> or reference to the mutable array, not its contents.
1161 sameMutableArray# :: MutableArray# s elt -> MutableArray# s elt -> Bool
1162 sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
1165 <IndexTerm><Primary><literal>sameMutableArray#</literal></Primary></IndexTerm>
1166 <IndexTerm><Primary><literal>sameMutableByteArray#</literal></Primary></IndexTerm>
1172 <Title>Freezing mutable arrays</Title>
1175 <IndexTerm><Primary>arrays, freezing mutable</Primary></IndexTerm>
1176 <IndexTerm><Primary>freezing mutable arrays</Primary></IndexTerm>
1177 <IndexTerm><Primary>mutable arrays, freezing</Primary></IndexTerm>
1181 Only unsafe-freeze has a primitive. (Safe freeze is done directly in Haskell
1182 by copying the array and then using <Function>unsafeFreeze</Function>.)
unsafeFreezeArray#     :: MutableArray# s elt -> State# s -> (# State# s, Array# elt #)
1189 unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> (# State# s, ByteArray# #)
1192 <IndexTerm><Primary><literal>unsafeFreezeArray#</literal></Primary></IndexTerm>
1193 <IndexTerm><Primary><literal>unsafeFreezeByteArray#</literal></Primary></IndexTerm>
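The following sketch shows the usual pattern for unsafe freezing: fill a mutable byte array, freeze it, and never touch the mutable version again. It assumes a present-day GHC with the primitives exported from <Literal>GHC.Exts</Literal>; the wrapper type <Literal>Frozen</Literal> and the functions <Literal>fromPair</Literal> and <Literal>indexFrozen</Literal> are hypothetical names introduced only for this example.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
import Control.Monad.ST (runST)
import Foreign.Storable (sizeOf)
import GHC.Exts
import GHC.ST (ST (..))

-- A boxed wrapper, needed because ByteArray# is unlifted and cannot be
-- returned directly from an ST computation.
data Frozen = Frozen ByteArray#

-- Build a two-element immutable array by filling a mutable one and freezing it.
fromPair :: Int -> Int -> Frozen
fromPair (I# x) (I# y) =
  case sizeOf (0 :: Int) of
    I# w -> runST (ST (\s0 ->
      case newByteArray# (2# *# w) s0 of
        (# s1, marr #) ->
          case writeIntArray# marr 0# x s1 of
            s2 -> case writeIntArray# marr 1# y s2 of
              -- No copy is made: the frozen array shares storage with marr,
              -- which is safe only because marr is not used afterwards.
              s3 -> case unsafeFreezeByteArray# marr s3 of
                (# s4, arr #) -> (# s4, Frozen arr #)))

-- Reading an immutable array needs no state token (no bounds check here).
indexFrozen :: Frozen -> Int -> Int
indexFrozen (Frozen arr) (I# i) = I# (indexIntArray# arr i)
```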
1201 <Title>Synchronizing variables (M-vars)</Title>
1204 <IndexTerm><Primary>synchronising variables (M-vars)</Primary></IndexTerm>
1205 <IndexTerm><Primary>M-Vars</Primary></IndexTerm>
1209 Synchronising variables are the primitive type used to implement
1210 Concurrent Haskell's MVars (see the Concurrent Haskell paper for
1211 the operational behaviour of these operations).
1217 type MVar# s elt -- primitive
1219 newMVar# :: State# s -> (# State# s, MVar# s elt #)
takeMVar#   :: MVar# s elt -> State# s -> (# State# s, elt #)
putMVar#    :: MVar# s elt -> elt -> State# s -> State# s
<IndexTerm><Primary><literal>MVar#</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>newMVar#</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>takeMVar#</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>putMVar#</literal></Primary></IndexTerm>
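Most programs use these primitives only through the boxed <Literal>MVar</Literal> type. The sketch below assumes the <Literal>Control.Concurrent</Literal> library interface rather than the raw primitives, and <Literal>handOff</Literal> is an illustrative name: a forked thread fills an empty variable and the caller blocks until the value arrives.

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hand a value from a worker thread back to the caller through an MVar.
-- (handOff is a hypothetical helper, used only for illustration.)
handOff :: Int -> IO Int
handOff x = do
  box <- newEmptyMVar                  -- starts empty
  _ <- forkIO (putMVar box (x * 2))    -- the worker fills it
  takeMVar box                         -- blocks until the worker has put
```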
1234 <Sect1 id="glasgow-ST-monad">
1235 <Title>Primitive state-transformer monad
1239 <IndexTerm><Primary>state transformers (Glasgow extensions)</Primary></IndexTerm>
1240 <IndexTerm><Primary>ST monad (Glasgow extension)</Primary></IndexTerm>
1244 This monad underlies our implementation of arrays, mutable and
1245 immutable, and our implementation of I/O, including “C calls”.
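A small sketch of the monad in use, assuming the library modules <Literal>Control.Monad.ST</Literal> and <Literal>Data.STRef</Literal> of a present-day GHC (the function name <Literal>sumST</Literal> is illustrative): the accumulator is updated in place, yet the function as a whole is pure.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Sum a list with a mutable accumulator that never escapes runST.
sumST :: [Int] -> Int
sumST xs = runST $ do
  acc <- newSTRef 0                          -- allocate the mutable cell
  mapM_ (\x -> modifySTRef' acc (+ x)) xs    -- strict in-place update
  readSTRef acc                              -- final value is the result
```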
1249 The <Literal>ST</Literal> library, which provides access to the
1250 <Function>ST</Function> monad, is described in <xref
1256 <Sect1 id="glasgow-prim-arrays">
1257 <Title>Primitive arrays, mutable and otherwise
1261 <IndexTerm><Primary>primitive arrays (Glasgow extension)</Primary></IndexTerm>
1262 <IndexTerm><Primary>arrays, primitive (Glasgow extension)</Primary></IndexTerm>
1266 GHC knows about quite a few flavours of Large Swathes of Bytes.
1270 First, GHC distinguishes between primitive arrays of (boxed) Haskell
1271 objects (type <Literal>Array# obj</Literal>) and primitive arrays of bytes (type
1272 <Literal>ByteArray#</Literal>).
1276 Second, it distinguishes between…
1280 <Term>Immutable:</Term>
1283 Arrays that do not change (as with “standard” Haskell arrays); you
1284 can only read from them. Obviously, they do not need the care and
1285 attention of the state-transformer monad.
1290 <Term>Mutable:</Term>
1293 Arrays that may be changed or “mutated.” All the operations on them
1294 live within the state-transformer monad and the updates happen
1295 <Emphasis>in-place</Emphasis>.
1300 <Term>“Static” (in C land):</Term>
1303 A C routine may pass an <Literal>Addr#</Literal> pointer back into Haskell land. There
1304 are then primitive operations with which you may merrily grab values
1305 over in C land, by indexing off the “static” pointer.
1310 <Term>“Stable” pointers:</Term>
1313 If, for some reason, you wish to hand a Haskell pointer (i.e.,
1314 <Emphasis>not</Emphasis> an unboxed value) to a C routine, you first make the
1315 pointer “stable,” so that the garbage collector won't forget that it
exists. That is, GHC provides a safe way to pass Haskell pointers to C.
1321 Please see <XRef LinkEnd="glasgow-stablePtrs"> for more details.
1326 <Term>“Foreign objects”:</Term>
1329 A “foreign object” is a safe way to pass an external object (a
1330 C-allocated pointer, say) to Haskell and have Haskell do the Right
1331 Thing when it no longer references the object. So, for example, C
1332 could pass a large bitmap over to Haskell and say “please free this
1333 memory when you're done with it.”
1337 Please see <XRef LinkEnd="glasgow-foreignObjs"> for more details.
The libraries documentation gives more details on all these
1346 “primitive array” types and the operations on them.
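To make the mutable/immutable distinction concrete, here is a sketch that fills a mutable unboxed array inside the state-transformer monad and hands back an immutable one. It assumes the <Literal>array</Literal> package shipped with present-day GHC (<Literal>Data.Array.ST</Literal>, <Literal>Data.Array.Unboxed</Literal>) rather than the primitive types themselves; <Literal>squares</Literal> is an illustrative name.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (newArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, elems)

-- Build the table [0, 1, 4, 9, ...] by in-place updates, then freeze it.
squares :: Int -> UArray Int Int
squares n = runSTUArray $ do
  arr <- newArray (0, n - 1) 0                             -- mutable, zero-filled
  forM_ [0 .. n - 1] $ \i -> writeArray arr i (i * i)      -- in-place writes
  return arr                                               -- frozen by runSTUArray

-- Reading the immutable result needs no monad at all.
squaresList :: Int -> [Int]
squaresList = elems . squares
```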
1352 <Sect1 id="pattern-guards">
1353 <Title>Pattern guards</Title>
1356 <IndexTerm><Primary>Pattern guards (Glasgow extension)</Primary></IndexTerm>
1357 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ULink URL="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ULink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
Suppose we have an abstract data type of finite maps, with a lookup operation:
1365 lookup :: FiniteMap -> Int -> Maybe Int
1368 The lookup returns <Function>Nothing</Function> if the supplied key is not in the domain of the mapping, and <Function>(Just v)</Function> otherwise,
1369 where <VarName>v</VarName> is the value that the key maps to. Now consider the following definition:
clunky env var1 var2 | ok1 && ok2 = val1 + val2
                     | otherwise  = var1 + var2
  where
    m1 = lookup env var1
    m2 = lookup env var2
    ok1 = maybeToBool m1
    ok2 = maybeToBool m2
    val1 = expectJust m1
    val2 = expectJust m2
1385 The auxiliary functions are
1389 maybeToBool :: Maybe a -> Bool
1390 maybeToBool (Just x) = True
1391 maybeToBool Nothing = False
1393 expectJust :: Maybe a -> a
1394 expectJust (Just x) = x
1395 expectJust Nothing = error "Unexpected Nothing"
1399 What is <Function>clunky</Function> doing? The guard <Literal>ok1 &&
1400 ok2</Literal> checks that both lookups succeed, using
1401 <Function>maybeToBool</Function> to convert the <Function>Maybe</Function>
types to booleans. The (lazily evaluated) <Function>expectJust</Function>
calls extract the values from the results of the lookups and bind the
returned values to <VarName>val1</VarName> and <VarName>val2</VarName>
respectively. If either lookup fails, then <Function>clunky</Function> takes the
1406 <Literal>otherwise</Literal> case and returns the sum of its arguments.
1410 This is certainly legal Haskell, but it is a tremendously verbose and
1411 un-obvious way to achieve the desired effect. Arguably, a more direct way
1412 to write clunky would be to use case expressions:
clunky env var1 var2 = case lookup env var1 of
  Nothing -> fail
  Just val1 -> case lookup env var2 of
    Nothing -> fail
    Just val2 -> val1 + val2
  where fail = var1 + var2
1426 This is a bit shorter, but hardly better. Of course, we can rewrite any set
1427 of pattern-matching, guarded equations as case expressions; that is
precisely what the compiler does when compiling equations! Haskell
provides guarded equations because they allow us to write down
1430 the cases we want to consider, one at a time, independently of each other.
1431 This structure is hidden in the case version. Two of the right-hand sides
1432 are really the same (<Function>fail</Function>), and the whole expression
1433 tends to become more and more indented.
1437 Here is how I would write clunky:
clunky env var1 var2
  | Just val1 <- lookup env var1
  , Just val2 <- lookup env var2
  = val1 + val2
...other equations for clunky...
The semantics should be clear enough. The qualifiers are matched in order.
1450 For a <Literal><-</Literal> qualifier, which I call a pattern guard, the
1451 right hand side is evaluated and matched against the pattern on the left.
1452 If the match fails then the whole guard fails and the next equation is
1453 tried. If it succeeds, then the appropriate binding takes place, and the
1454 next qualifier is matched, in the augmented environment. Unlike list
1455 comprehensions, however, the type of the expression to the right of the
1456 <Literal><-</Literal> is the same as the type of the pattern to its
1457 left. The bindings introduced by pattern guards scope over all the
1458 remaining guard qualifiers, and over the right hand side of the equation.
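For readers who want something they can load and run, here is a self-contained sketch of the pattern-guard version. The association-list representation of the environment and the helper name <Literal>lookupEnv</Literal> are assumptions made for the example; the text above leaves the finite-map type abstract.

```haskell
-- An environment modelled as an association list (an assumption for this sketch).
type Env = [(Int, Int)]

-- Same argument order as the lookup used in the text: map first, then key.
lookupEnv :: Env -> Int -> Maybe Int
lookupEnv env k = lookup k env

clunky :: Env -> Int -> Int -> Int
clunky env var1 var2
  | Just val1 <- lookupEnv env var1   -- fails over to the next equation on Nothing
  , Just val2 <- lookupEnv env var2   -- val1 is already in scope here
  = val1 + val2
clunky _ var1 var2 = var1 + var2      -- fallback: sum the keys themselves
```

With the environment <Literal>[(1, 10), (2, 20)]</Literal>, looking up keys 1 and 2 takes the first equation; if either key is missing, the second equation applies.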
1462 Just as with list comprehensions, boolean expressions can be freely mixed
among the pattern guards. For example:
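One possible illustration (the function <Literal>halveIfEven</Literal> and its association-list argument are hypothetical, chosen only to show the mixing) interleaves two pattern guards with an ordinary boolean test:

```haskell
-- Halve the value bound to a key, but only when the key is present and
-- the value is positive and even; otherwise return 0.
halveIfEven :: [(String, Int)] -> String -> Int
halveIfEven env key
  | Just v <- lookup key env      -- pattern guard: binds v on success
  , v > 0                         -- boolean guard: may mention v
  , (q, 0) <- v `divMod` 2        -- pattern guard: succeeds only if v is even
  = q
  | otherwise = 0
```

Each qualifier is tried left to right, and the first one that fails sends control to the next guard alternative.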
1474 Haskell's current guards therefore emerge as a special case, in which the
1475 qualifier list has just one element, a boolean expression.
1479 <Sect1 id="sec-ffi">
1480 <Title>The foreign interface</Title>
1483 The foreign interface consists of language and library support. The former
1484 is described later in <XRef LinkEnd="ffi">; the latter is outlined below,
1485 and detailed in the hslibs documentation.
1488 <Sect2 id="glasgow-foreign-headers">
1489 <Title>Using function headers
1493 <IndexTerm><Primary>C calls, function headers</Primary></IndexTerm>
1497 When generating C (using the <Option>-fvia-C</Option> directive), one can assist the
1498 C compiler in detecting type errors by using the <Command>-#include</Command> directive
1499 to provide <Filename>.h</Filename> files containing function headers.
1509 typedef unsigned long *StgForeignObj;
1510 typedef long StgInt;
1512 void initialiseEFS (StgInt size);
1513 StgInt terminateEFS (void);
1514 StgForeignObj emptyEFS(void);
1515 StgForeignObj updateEFS (StgForeignObj a, StgInt i, StgInt x);
1516 StgInt lookupEFS (StgForeignObj a, StgInt i);
1522 You can find appropriate definitions for <Literal>StgInt</Literal>, <Literal>StgForeignObj</Literal>,
1523 etc using <Command>gcc</Command> on your architecture by consulting
1524 <Filename>ghc/includes/StgTypes.h</Filename>. The following table summarises the
1525 relationship between Haskell types and C types.
1532 <ColSpec Align="Left" Colsep="0">
1533 <ColSpec Align="Left" Colsep="0">
1536 <Entry><Emphasis>C type name</Emphasis> </Entry>
1537 <Entry> <Emphasis>Haskell Type</Emphasis> </Entry>
1542 <Literal>StgChar</Literal> </Entry>
1543 <Entry> <Literal>Char#</Literal> </Entry>
1547 <Literal>StgInt</Literal> </Entry>
1548 <Entry> <Literal>Int#</Literal> </Entry>
1552 <Literal>StgWord</Literal> </Entry>
1553 <Entry> <Literal>Word#</Literal> </Entry>
1557 <Literal>StgAddr</Literal> </Entry>
1558 <Entry> <Literal>Addr#</Literal> </Entry>
1562 <Literal>StgFloat</Literal> </Entry>
1563 <Entry> <Literal>Float#</Literal> </Entry>
1567 <Literal>StgDouble</Literal> </Entry>
1568 <Entry> <Literal>Double#</Literal> </Entry>
1572 <Literal>StgArray</Literal> </Entry>
1573 <Entry> <Literal>Array#</Literal> </Entry>
1577 <Literal>StgByteArray</Literal> </Entry>
1578 <Entry> <Literal>ByteArray#</Literal> </Entry>
1582 <Literal>StgArray</Literal> </Entry>
1583 <Entry> <Literal>MutableArray#</Literal> </Entry>
1587 <Literal>StgByteArray</Literal> </Entry>
1588 <Entry> <Literal>MutableByteArray#</Literal> </Entry>
1592 <Literal>StgStablePtr</Literal> </Entry>
1593 <Entry> <Literal>StablePtr#</Literal> </Entry>
1597 <Literal>StgForeignObj</Literal> </Entry>
1598 <Entry> <Literal>ForeignObj#</Literal></Entry>
1607 Note that this approach is only <Emphasis>essential</Emphasis> for returning
1608 <Literal>float</Literal>s (or if <Literal>sizeof(int) != sizeof(int *)</Literal> on your
1609 architecture) but is a Good Thing for anyone who cares about writing
1610 solid code. You're crazy not to do it.
1615 <Sect2 id="glasgow-stablePtrs">
1616 <Title>Subverting automatic unboxing with “stable pointers”
1620 <IndexTerm><Primary>stable pointers (Glasgow extension)</Primary></IndexTerm>
The arguments of a <Function>_ccall_</Function> are automatically unboxed before the
call. There are two reasons why this is usually the Right Thing to do:
1635 C is a strict language: it would be excessively tedious to pass
1636 unevaluated arguments and require the C programmer to force their
1637 evaluation before using them.
1644 Boxed values are stored on the Haskell heap and may be moved
1645 within the heap if a garbage collection occurs—that is, pointers
1646 to boxed objects are not <Emphasis>stable</Emphasis>.
1655 It is possible to subvert the unboxing process by creating a “stable
1656 pointer” to a value and passing the stable pointer instead. For
example, to pass/return an integer lazily to C functions <Function>storeC</Function> and
<Function>fetchC</Function>, one might write:
storeH :: Int -> IO ()
storeH x = makeStablePtr x >>= \ stable_x ->
           _ccall_ storeC stable_x

fetchH :: IO Int
fetchH = _ccall_ fetchC >>= \ stable_x ->
         deRefStablePtr stable_x >>= \ x ->
         freeStablePtr stable_x >>
         return x
1678 The garbage collector will refrain from throwing a stable pointer away
1679 until you explicitly call one of the following from C or Haskell.
1685 void freeStablePointer( StgStablePtr stablePtrToToss )
1686 freeStablePtr :: StablePtr a -> IO ()
1692 As with the use of <Function>free</Function> in C programs, GREAT CARE SHOULD BE
1693 EXERCISED to ensure these functions are called at the right time: too
1694 early and you get dangling references (and, if you're lucky, an error
1695 message from the runtime system); too late and you get space leaks.
1699 And to force evaluation of the argument within <Function>fooC</Function>, one would
1700 call one of the following C functions (according to type of argument).
1706 void performIO ( StgStablePtr stableIndex /* StablePtr s (IO ()) */ );
1707 StgInt enterInt ( StgStablePtr stableIndex /* StablePtr s Int */ );
1708 StgFloat enterFloat ( StgStablePtr stableIndex /* StablePtr s Float */ );
1714 <IndexTerm><Primary>performIO</Primary></IndexTerm>
1715 <IndexTerm><Primary>enterInt</Primary></IndexTerm>
1716 <IndexTerm><Primary>enterFloat</Primary></IndexTerm>
1720 Nota Bene: <Function>_ccall_GC_</Function><IndexTerm><Primary>_ccall_GC_</Primary></IndexTerm> must be used if any of
1721 these functions are used.
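The Haskell half of the stable-pointer life cycle can be sketched without any C at all. The example below assumes the <Literal>Foreign.StablePtr</Literal> module of a present-day GHC, where the creation function is called <Literal>newStablePtr</Literal> rather than <Function>makeStablePtr</Function>; <Literal>roundTrip</Literal> is an illustrative name.

```haskell
import Foreign.StablePtr (deRefStablePtr, freeStablePtr, newStablePtr)

-- Make a value stable, recover it, and release the stable pointer.
-- The value is never forced, so unevaluated thunks survive the trip.
roundTrip :: a -> IO a
roundTrip x = do
  sp <- newStablePtr x       -- the collector now keeps x alive
  y  <- deRefStablePtr sp    -- recover the original (still lazy) value
  freeStablePtr sp           -- too early: dangling; never: space leak
  return y
```

In a real program the stable pointer would be handed to C between the first and second steps; here it stays in Haskell so that the example is self-contained.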
1726 <Sect2 id="glasgow-foreignObjs">
1727 <Title>Foreign objects: pointing outside the Haskell heap
1731 <IndexTerm><Primary>foreign objects (Glasgow extension)</Primary></IndexTerm>
1735 There are two types that GHC programs can use to reference
1736 (heap-allocated) objects outside the Haskell world: <Literal>Addr</Literal> and
1737 <Literal>ForeignObj</Literal>.
If you use <Literal>Addr</Literal>, it is up to you, the programmer, to arrange
allocation and deallocation of the objects.
1746 If you use <Literal>ForeignObj</Literal>, GHC's garbage collector will call upon the
1747 user-supplied <Emphasis>finaliser</Emphasis> function to free the object when the
1748 Haskell world no longer can access the object. (An object is
1749 associated with a finaliser function when the abstract
1750 Haskell type <Literal>ForeignObj</Literal> is created). The finaliser function is
1751 expressed in C, and is passed as argument the object:
1757 void foreignFinaliser ( StgForeignObj fo )
The finaliser runs once the Haskell world can no longer access the object. Since
<Literal>ForeignObj</Literal>s only get released when a garbage collection occurs, we
provide ways of triggering a garbage collection from within C and from Haskell.
1772 void GarbageCollect()
1779 More information on the programmers' interface to <Literal>ForeignObj</Literal> can be
1780 found in the library documentation.
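As a sketch of the same idea in a present-day GHC, where the role of <Literal>ForeignObj</Literal> is played by <Literal>ForeignPtr</Literal> (an assumption about the reader's compiler, not something this section describes), the following attaches C's <Function>free</Function> as the finaliser of a <Function>malloc</Function>ed cell. The names <Literal>newCell</Literal> and <Literal>readCell</Literal> are illustrative.

```haskell
import Foreign.C.Types (CInt)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, malloc)
import Foreign.Storable (peek, poke)

-- Allocate one C int outside the Haskell heap and register free() as its
-- finaliser, so the memory is released after the last reference is dropped.
newCell :: CInt -> IO (ForeignPtr CInt)
newCell v = do
  p <- malloc
  poke p v
  newForeignPtr finalizerFree p

-- withForeignPtr keeps the object alive for the duration of the access.
readCell :: ForeignPtr CInt -> IO CInt
readCell fp = withForeignPtr fp peek
```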
1785 <Sect2 id="glasgow-avoiding-monads">
1786 <Title>Avoiding monads
1790 <IndexTerm><Primary>C calls to `pure C'</Primary></IndexTerm>
1791 <IndexTerm><Primary>unsafePerformIO</Primary></IndexTerm>
1795 The <Function>_ccall_</Function> construct is part of the <Literal>IO</Literal> monad because 9 out of 10
1796 uses will be to call imperative functions with side effects such as
1797 <Function>printf</Function>. Use of the monad ensures that these operations happen in a
1798 predictable order in spite of laziness and compiler optimisations.
1802 To avoid having to be in the monad to call a C function, it is
1803 possible to use <Function>unsafePerformIO</Function>, which is available from the
1804 <Literal>IOExts</Literal> module. There are three situations where one might like to
1805 call a C function from outside the IO world:
1814 Calling a function with no side-effects:
1817 atan2d :: Double -> Double -> Double
1818 atan2d y x = unsafePerformIO (_ccall_ atan2d y x)
1820 sincosd :: Double -> (Double, Double)
sincosd x = unsafePerformIO $ do
        da <- newDoubleArray (0, 1)
        _casm_ ``sincosd( %0, &((double *)%1[0]), &((double *)%1[1]) );'' x da
        s <- readDoubleArray da 0
        c <- readDoubleArray da 1
        return (s, c)
1835 Calling a set of functions which have side-effects but which can
1836 be used in a purely functional manner.
1838 For example, an imperative implementation of a purely functional
1839 lookup-table might be accessed using the following functions.
empty  :: EFS x
update :: EFS x -> Int -> x -> EFS x
lookup :: EFS a -> Int -> a

empty = unsafePerformIO (_ccall_ emptyEFS)

update a i x = unsafePerformIO $
        makeStablePtr x         >>= \ stable_x ->
        _ccall_ updateEFS a i stable_x

lookup a i = unsafePerformIO $
        _ccall_ lookupEFS a i   >>= \ stable_x ->
        deRefStablePtr stable_x
1859 You will almost always want to use <Literal>ForeignObj</Literal>s with this.
1866 Calling a side-effecting function even though the results will
1867 be unpredictable. For example the <Function>trace</Function> function is defined by:
trace :: String -> a -> a
trace string expr
  = unsafePerformIO (
        ((_ccall_ PreTraceHook sTDERR{-msg-}):: IO ()) >>
        fputs sTDERR string                            >>
        ((_ccall_ PostTraceHook sTDERR{-msg-}):: IO ()) >>
        return expr )
  where
    sTDERR = (``stderr'' :: Addr)
1883 (This kind of use is not highly recommended—it is only really
1884 useful in debugging code.)
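The first situation, a C function with no side effects, can be sketched with the standard <Literal>foreign import</Literal> syntax of a present-day GHC in place of <Function>_ccall_</Function> (an assumption about the reader's compiler). The names <Literal>c_atan2_io</Literal>, <Literal>c_atan2_pure</Literal> and <Literal>atan2d'</Literal> are illustrative; the C function is <Function>atan2</Function> from the maths library.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
import Foreign.C.Types (CDouble (..))
import System.IO.Unsafe (unsafePerformIO)

-- Imported in the IO monad, as a _ccall_ would be.
foreign import ccall unsafe "math.h atan2"
  c_atan2_io :: CDouble -> CDouble -> IO CDouble

-- Imported at a pure type: the programmer vouches that there are no side effects.
foreign import ccall unsafe "math.h atan2"
  c_atan2_pure :: CDouble -> CDouble -> CDouble

-- Escaping the monad by hand, in the style described above.
atan2d :: Double -> Double -> Double
atan2d y x =
  realToFrac (unsafePerformIO (c_atan2_io (realToFrac y) (realToFrac x)))

-- The same function through the pure import; no unsafePerformIO needed.
atan2d' :: Double -> Double -> Double
atan2d' y x = realToFrac (c_atan2_pure (realToFrac y) (realToFrac x))
```

Either form is only justified because <Function>atan2</Function> always returns the same result for the same arguments; neither is appropriate for a function such as <Function>printf</Function>.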
1894 <Sect2 id="ccall-gotchas">
1895 <Title>C-calling “gotchas” checklist
1899 <IndexTerm><Primary>C call dangers</Primary></IndexTerm>
1900 <IndexTerm><Primary>CCallable</Primary></IndexTerm>
1901 <IndexTerm><Primary>CReturnable</Primary></IndexTerm>
1905 And some advice, too.
1914 For modules that use <Function>_ccall_</Function>s, etc., compile with
1915 <Option>-fvia-C</Option>.<IndexTerm><Primary>-fvia-C option</Primary></IndexTerm> You don't have to, but you should.
1917 Also, use the <Option>-#include "prototypes.h"</Option> flag (hack) to inform the C
1918 compiler of the fully-prototyped types of all the C functions you
1919 call. (<XRef LinkEnd="glasgow-foreign-headers"> says more about this…)
1921 This scheme is the <Emphasis>only</Emphasis> way that you will get <Emphasis>any</Emphasis>
1922 typechecking of your <Function>_ccall_</Function>s. (It shouldn't be that way, but…).
1923 GHC will pass the flag <Option>-Wimplicit</Option> to <Command>gcc</Command> so that you'll get warnings
1924 if any <Function>_ccall_</Function>ed functions have no prototypes.
1931 Try to avoid <Function>_ccall_</Function>s to C functions that take <Literal>float</Literal>
1932 arguments or return <Literal>float</Literal> results. Reason: if you do, you will
1933 become entangled in (ANSI?) C's rules for when arguments/results are
1934 promoted to <Literal>doubles</Literal>. It's a nightmare and just not worth it.
1935 Use <Literal>doubles</Literal> if possible.
1937 If you do use <Literal>floats</Literal>, check and re-check that the right thing is
1938 happening. Perhaps compile with <Option>-keep-hc-file-too</Option> and look at
1939 the intermediate C (<Function>.hc</Function>).
1946 The compiler uses two non-standard type-classes when
1947 type-checking the arguments and results of <Function>_ccall_</Function>: the arguments
1948 (respectively result) of <Function>_ccall_</Function> must be instances of the class
1949 <Literal>CCallable</Literal> (respectively <Literal>CReturnable</Literal>). Both classes may be
1950 imported from the module <Literal>CCall</Literal>, but this should only be
1951 necessary if you want to define a new instance. (Neither class
1952 defines any methods—their only function is to keep the
1953 type-checker happy.)
1955 The type checker must be able to figure out just which of the
1956 C-callable/returnable types is being used. If it can't, you have to
1957 add type signatures. For example,
1965 is not good enough, because the compiler can't work out what type <VarName>x</VarName>
1966 is, nor what type the <Function>_ccall_</Function> returns. You have to write, say:
1970 f :: Int -> IO Double
1975 This table summarises the standard instances of these classes.
1979 <ColSpec Align="Left" Colsep="0">
1980 <ColSpec Align="Left" Colsep="0">
1981 <ColSpec Align="Left" Colsep="0">
1982 <ColSpec Align="Left" Colsep="0">
1985 <Entry><Emphasis>Type</Emphasis> </Entry>
1986 <Entry><Emphasis>CCallable</Emphasis></Entry>
1987 <Entry><Emphasis>CReturnable</Emphasis> </Entry>
1988 <Entry><Emphasis>Which is probably…</Emphasis> </Entry>
1992 <Literal>Char</Literal> </Entry>
1993 <Entry> Yes </Entry>
1994 <Entry> Yes </Entry>
1995 <Entry> <Literal>unsigned char</Literal> </Entry>
1999 <Literal>Int</Literal> </Entry>
2000 <Entry> Yes </Entry>
2001 <Entry> Yes </Entry>
2002 <Entry> <Literal>long int</Literal> </Entry>
2006 <Literal>Word</Literal> </Entry>
2007 <Entry> Yes </Entry>
2008 <Entry> Yes </Entry>
2009 <Entry> <Literal>unsigned long int</Literal> </Entry>
2013 <Literal>Addr</Literal> </Entry>
2014 <Entry> Yes </Entry>
2015 <Entry> Yes </Entry>
2016 <Entry> <Literal>void *</Literal> </Entry>
2020 <Literal>Float</Literal> </Entry>
2021 <Entry> Yes </Entry>
2022 <Entry> Yes </Entry>
2023 <Entry> <Literal>float</Literal> </Entry>
2027 <Literal>Double</Literal> </Entry>
2028 <Entry> Yes </Entry>
2029 <Entry> Yes </Entry>
2030 <Entry> <Literal>double</Literal> </Entry>
2034 <Literal>()</Literal> </Entry>
2036 <Entry> Yes </Entry>
2037 <Entry> <Literal>void</Literal> </Entry>
2041 <Literal>[Char]</Literal> </Entry>
2042 <Entry> Yes </Entry>
2044 <Entry> <Literal>char *</Literal> (null-terminated) </Entry>
2048 <Literal>Array</Literal> </Entry>
2049 <Entry> Yes </Entry>
2051 <Entry> <Literal>unsigned long *</Literal> </Entry>
2055 <Literal>ByteArray</Literal> </Entry>
2056 <Entry> Yes </Entry>
2058 <Entry> <Literal>unsigned long *</Literal> </Entry>
2062 <Literal>MutableArray</Literal> </Entry>
2063 <Entry> Yes </Entry>
2065 <Entry> <Literal>unsigned long *</Literal> </Entry>
2069 <Literal>MutableByteArray</Literal> </Entry>
2070 <Entry> Yes </Entry>
2072 <Entry> <Literal>unsigned long *</Literal> </Entry>
2076 <Literal>State</Literal> </Entry>
2077 <Entry> Yes </Entry>
2078 <Entry> Yes </Entry>
2079 <Entry> nothing!</Entry>
2083 <Literal>StablePtr</Literal> </Entry>
2084 <Entry> Yes </Entry>
2085 <Entry> Yes </Entry>
2086 <Entry> <Literal>unsigned long *</Literal> </Entry>
2090 <Literal>ForeignObjs</Literal> </Entry>
2091 <Entry> Yes </Entry>
2092 <Entry> Yes </Entry>
2093 <Entry> see later </Entry>
2101 Actually, the <Literal>Word</Literal> type is defined as being the same size as a
2102 pointer on the target architecture, which is <Emphasis>probably</Emphasis>
2103 <Literal>unsigned long int</Literal>.
2105 The brave and careful programmer can add their own instances of these
2106 classes for the following types:
2113 A <Emphasis>boxed-primitive</Emphasis> type may be made an instance of both
2114 <Literal>CCallable</Literal> and <Literal>CReturnable</Literal>.
2116 A boxed primitive type is any data type with a
2117 single unary constructor with a single primitive argument. For
2118 example, the following are all boxed primitive types:
2124 data XDisplay = XDisplay Addr#
2125 data EFS a = EFS# ForeignObj#
2131 instance CCallable (EFS a)
2132 instance CReturnable (EFS a)
2141 Any datatype with a single nullary constructor may be made an
2142 instance of <Literal>CReturnable</Literal>. For example:
2146 data MyVoid = MyVoid
2147 instance CReturnable MyVoid
2156 As at version 2.09, <Literal>String</Literal> (i.e., <Literal>[Char]</Literal>) is still
2157 not a <Literal>CReturnable</Literal> type.
2159 Also, the now-builtin type <Literal>PackedString</Literal> is neither
2160 <Literal>CCallable</Literal> nor <Literal>CReturnable</Literal>. (But there are functions in
2161 the PackedString interface to let you get at the necessary bits…)
2173 The code-generator will complain if you attempt to use <Literal>%r</Literal> in
2174 a <Literal>_casm_</Literal> whose result type is <Literal>IO ()</Literal>; or if you don't use <Literal>%r</Literal>
2175 <Emphasis>precisely</Emphasis> once for any other result type. These messages are
2176 supposed to be helpful and catch bugs—please tell us if they wreck
2184 If you call out to C code which may trigger the Haskell garbage
2185 collector or create new threads (examples of this later…), then you
2186 must use the <Function>_ccall_GC_</Function><IndexTerm><Primary>_ccall_GC_ primitive</Primary></IndexTerm> or
2187 <Function>_casm_GC_</Function><IndexTerm><Primary>_casm_GC_ primitive</Primary></IndexTerm> variant of C-calls. (This
2188 does not work with the native code generator—use <Option>-fvia-C</Option>.) This
2189 stuff is hairy with a capital H!
2201 <Sect1 id="multi-param-type-classes">
2202 <Title>Multi-parameter type classes
2206 This section documents GHC's implementation of multi-parameter type
2207 classes. There's lots of background in the paper <ULink
2208 URL="http://research.microsoft.com/~simonpj/multi.ps.gz" >Type
2209 classes: exploring the design space</ULink > (Simon Peyton Jones, Mark
2210 Jones, Erik Meijer).
I'd like to thank people who reported shortcomings in the GHC 3.02
2215 implementation. Our default decisions were all conservative ones, and
2216 the experience of these heroic pioneers has given useful concrete
2217 examples to support several generalisations. (These appear below as
2218 design choices not implemented in 3.02.)
2222 I've discussed these notes with Mark Jones, and I believe that Hugs
2223 will migrate towards the same design choices as I outline here.
2224 Thanks to him, and to many others who have offered very useful
2229 <Title>Types</Title>
There are the following restrictions on the form of a qualified type:
2239 forall tv1..tvn (c1, ...,cn) => type
2245 (Here, I write the "foralls" explicitly, although the Haskell source
2246 language omits them; in Haskell 1.4, all the free type variables of an
2247 explicit source-language type signature are universally quantified,
2248 except for the class type variables in a class declaration. However,
2249 in GHC, you can give the foralls if you want. See <XRef LinkEnd="universal-quantification">).
2258 <Emphasis>Each universally quantified type variable
2259 <Literal>tvi</Literal> must be mentioned (i.e. appear free) in <Literal>type</Literal></Emphasis>.
2261 The reason for this is that a value with a type that does not obey
2262 this restriction could not be used without introducing
2263 ambiguity. Here, for example, is an illegal type:
2267 forall a. Eq a => Int
2271 When a value with this type was used, the constraint <Literal>Eq tv</Literal>
2272 would be introduced where <Literal>tv</Literal> is a fresh type variable, and
2273 (in the dictionary-translation implementation) the value would be
2274 applied to a dictionary for <Literal>Eq tv</Literal>. The difficulty is that we
2275 can never know which instance of <Literal>Eq</Literal> to use because we never
2276 get any more information about <Literal>tv</Literal>.
2283 <Emphasis>Every constraint <Literal>ci</Literal> must mention at least one of the
2284 universally quantified type variables <Literal>tvi</Literal></Emphasis>.
For example, this type is OK because <Literal>C a b</Literal> mentions the
universally quantified type variable <Literal>a</Literal>:
2291 forall a. C a b => burble
2295 The next type is illegal because the constraint <Literal>Eq b</Literal> does not
2296 mention <Literal>a</Literal>:
2300 forall a. Eq b => burble
2304 The reason for this restriction is milder than the other one. The
2305 excluded types are never useful or necessary (because the offending
2306 context doesn't need to be witnessed at this point; it can be floated
2307 out). Furthermore, floating them out increases sharing. Lastly,
2308 excluding them is a conservative choice; it leaves a patch of
2309 territory free in case we need it later.
2319 These restrictions apply to all types, whether declared in a type signature
2324 Unlike Haskell 1.4, constraints in types do <Emphasis>not</Emphasis> have to be of
2325 the form <Emphasis>(class type-variables)</Emphasis>. Thus, these type signatures
2332 f :: Eq (m a) => [m a] -> [m a]
2339 This choice recovers principal types, a property that Haskell 1.4 does not have.
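As a minimal sketch of such a constraint in use: the function name <Literal>dropRepeats</Literal> and its test data are invented for illustration, and the <Literal>FlexibleContexts</Literal> pragma assumes a recent GHC, where language pragmas are the usual way to switch extensions on.

```haskell
{-# LANGUAGE FlexibleContexts #-}

-- Hypothetical helper, not part of any library.
-- The context Eq (m a) constrains a type application rather than a
-- bare type variable.
dropRepeats :: Eq (m a) => [m a] -> [m a]
dropRepeats (x:y:rest)
  | x == y    = dropRepeats (y:rest)      -- drop one of two equal neighbours
  | otherwise = x : dropRepeats (y:rest)
dropRepeats xs = xs                       -- zero or one element left

main :: IO ()
main = print (dropRepeats [Just 1, Just 1, Nothing, Just (2 :: Int)])
-- prints [Just 1,Nothing,Just 2]
```

Here <Literal>m</Literal> is instantiated to <Literal>Maybe</Literal> and <Literal>a</Literal> to <Literal>Int</Literal>, so the constraint that is actually solved is <Literal>Eq (Maybe Int)</Literal>.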
2345 <Title>Class declarations</Title>
2353 <Emphasis>Multi-parameter type classes are permitted</Emphasis>. For example:
2357 class Collection c a where
2358     union :: c a -> c a -> c a
2369 <Emphasis>The class hierarchy must be acyclic</Emphasis>. However, the definition
2370 of "acyclic" involves only the superclass relationships. For example,
2376 op :: D b => a -> b -> b
2379 class C a => D a where { ... }
2383 Here, <Literal>C</Literal> is a superclass of <Literal>D</Literal>, but it's OK for a
2384 class operation <Literal>op</Literal> of <Literal>C</Literal> to mention <Literal>D</Literal>. (It
2385 would not be OK for <Literal>D</Literal> to be a superclass of <Literal>C</Literal>.)
2392 <Emphasis>There are no restrictions on the context in a class declaration
2393 (which introduces superclasses), except that the class hierarchy must
2394 be acyclic</Emphasis>. So these class declarations are OK:
2398 class Functor (m k) => FiniteMap m k where
2401 class (Monad m, Monad (t m)) => Transform t m where
2402     lift :: m a -> (t m) a
2411 <Emphasis>In the signature of a class operation, every constraint
2412 must mention at least one type variable that is not a class type
2413 variable</Emphasis>.
2419 class Collection c a where
2420     mapC :: Collection c b => (a->b) -> c a -> c b
2424 is OK because the constraint <Literal>(Collection c b)</Literal> mentions
2425 <Literal>b</Literal>, even though it also mentions the class variable
2426 <Literal>c</Literal>. On the other hand:
2430 class C a where
2431     op :: Eq a => (a,b) -> (a,b)
2435 is not OK because the constraint <Literal>(Eq a)</Literal> mentions only the class
2436 type variable <Literal>a</Literal>, but not <Literal>b</Literal>. However, any such
2437 example is easily fixed by moving the offending context up to the
2442 class Eq a => C a where
2447 A yet more relaxed rule would allow the context of a class-op signature
2448 to mention only class type variables. However, that conflicts with
2449 Rule 1(b) for types above.
2456 <Emphasis>The type of each class operation must mention <Emphasis>all</Emphasis> of
2457 the class type variables</Emphasis>. For example:
2461 class Coll s a where
2462     empty  :: s
2463     insert :: s -> a -> s
2467 is not OK, because the type of <Literal>empty</Literal> doesn't mention
2468 <Literal>a</Literal>. This rule is a consequence of Rule 1(a), above, for
2469 types, and has the same motivation.
2471 Sometimes, offending class declarations exhibit misunderstandings. For
2472 example, <Literal>Coll</Literal> might be rewritten
2476 class Coll s a where
2477     empty  :: s a
2478     insert :: s a -> a -> s a
2482 which makes the connection between the type of a collection of
2483 <Literal>a</Literal>'s (namely <Literal>(s a)</Literal>) and the element type <Literal>a</Literal>.
2484 Occasionally this really doesn't work, in which case you can split the
2492 class CollE s => Coll s a where
2493 insert :: s -> a -> s
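A sketch of how the split classes might be used. I am assuming that <Literal>CollE</Literal> carries the <Literal>empty</Literal> operation, the instances for <Literal>[Int]</Literal> are invented for illustration, and the pragmas assume a recent GHC:

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

-- Assumed shape of the split: the element-independent operation
-- lives in its own single-parameter class.
class CollE s where
  empty :: s

class CollE s => Coll s a where
  insert :: s -> a -> s

-- Illustrative instances: a list of Int used as a collection of Int.
instance CollE [Int] where
  empty = []

instance Coll [Int] Int where
  insert xs x = x : xs

main :: IO ()
main = print (insert (insert (empty :: [Int]) (1 :: Int)) (2 :: Int))
-- prints [2,1]
```

The type of <Literal>empty</Literal> now mentions every variable of its own class, and <Literal>insert</Literal> mentions both <Literal>s</Literal> and <Literal>a</Literal>, so each operation obeys the rule.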
2507 <Title>Instance declarations</Title>
2515 <Emphasis>Instance declarations may not overlap</Emphasis>. The two instance
2520 instance context1 => C type1 where ...
2521 instance context2 => C type2 where ...
2525 "overlap" if <Literal>type1</Literal> and <Literal>type2</Literal> unify
2527 However, if you give the command line option
2528 <Option>-fallow-overlapping-instances</Option><IndexTerm><Primary>-fallow-overlapping-instances
2529 option</Primary></IndexTerm> then two overlapping instance declarations are permitted
2537 EITHER <Literal>type1</Literal> and <Literal>type2</Literal> do not unify
2543 OR <Literal>type2</Literal> is a substitution instance of <Literal>type1</Literal>
2544 (but not identical to <Literal>type1</Literal>)
2557 Notice that these rules
2564 make it clear which instance decl to use
2565 (pick the most specific one that matches)
2572 do not mention the contexts <Literal>context1</Literal>, <Literal>context2</Literal>
2573 Reason: you can pick which instance decl
2574 "matches" based on the type.
2581 Regrettably, GHC doesn't guarantee to detect overlapping instance
2582 declarations if they appear in different modules. GHC can "see" the
2583 instance declarations in the transitive closure of all the modules
2584 imported by the one being compiled, so it can "see" all instance decls
2585 when it is compiling <Literal>Main</Literal>. However, it currently chooses not
2586 to look at ones that can't possibly be of use in the module currently
2587 being compiled, in the interests of efficiency. (Perhaps we should
2588 change that decision, at least for <Literal>Main</Literal>.)
2595 <Emphasis>There are no restrictions on the type in an instance
2596 <Emphasis>head</Emphasis>, except that at least one must not be a type variable</Emphasis>.
2597 The instance "head" is the bit after the "=>" in an instance decl. For
2598 example, these are OK:
2602 instance C Int a where ...
2604 instance D (Int, Int) where ...
2606 instance E [[a]] where ...
2610 Note that instance heads <Emphasis>may</Emphasis> contain repeated type variables.
2611 For example, this is OK:
2615 instance Stateful (ST s) (MutVar s) where ...
2619 The "at least one not a type variable" restriction is to ensure that
2620 context reduction terminates: each reduction step removes one type
2621 constructor. For example, the following would make the type checker
2622 loop if it wasn't excluded:
2626 instance C a => C a where ...
2630 There are two situations in which the rule is a bit of a pain. First,
2631 if one allows overlapping instance declarations then it's quite
2632 convenient to have a "default instance" declaration that applies if
2633 something more specific does not:
2642 Second, sometimes you might want to use the following to get the
2643 effect of a "class synonym":
2647 class (C1 a, C2 a, C3 a) => C a where { }
2649 instance (C1 a, C2 a, C3 a) => C a where { }
2653 This allows you to write shorter signatures:
2665 f :: (C1 a, C2 a, C3 a) => ...
2669 I'm on the lookout for a simple rule that preserves decidability while
2670 allowing these idioms. The experimental flag
2671 <Option>-fallow-undecidable-instances</Option><IndexTerm><Primary>-fallow-undecidable-instances
2672 option</Primary></IndexTerm> lifts this restriction, allowing all the types in an
2673 instance head to be type variables.
2680 <Emphasis>Unlike Haskell 1.4, instance heads may use type
2681 synonyms</Emphasis>. As always, using a type synonym is just shorthand for
2682 writing the RHS of the type synonym definition. For example:
2686 type Point = (Int,Int)
2687 instance C Point where ...
2688 instance C [Point] where ...
2692 is legal. However, if you added
2696 instance C (Int,Int) where ...
2700 as well, then the compiler will complain about the overlapping
2701 (actually, identical) instance declarations. As always, type synonyms
2702 must be fully applied. You cannot, for example, write:
2707 instance Monad P where ...
2711 This design decision is independent of all the others, and easily
2712 reversed, but it makes sense to me.
2719 <Emphasis>The types in an instance-declaration <Emphasis>context</Emphasis> must all
2720 be type variables</Emphasis>. Thus
2724 instance C a b => Eq (a,b) where ...
2732 instance C Int b => Foo b where ...
2736 is not OK. Again, the intent here is to make sure that context
2737 reduction terminates.
2739 Voluminous correspondence on the Haskell mailing list has convinced me
2740 that it's worth experimenting with a more liberal rule. If you use
2741 the flag <Option>-fallow-undecidable-instances</Option>, you can use arbitrary
2742 types in an instance context. Termination is ensured by having a
2743 fixed-depth recursion stack. If you exceed the stack depth you get a
2744 sort of backtrace, and the opportunity to increase the stack depth
2745 with <Option>-fcontext-stack</Option><Emphasis>N</Emphasis>.
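The overlap rule and the "class synonym" idiom from this section can be sketched together as follows. This is an illustration only: the names <Literal>Describe</Literal>, <Literal>ShowOrd</Literal> and <Literal>largest</Literal> are invented, and I am assuming a recent GHC, where per-instance <Literal>OVERLAPPABLE</Literal>/<Literal>OVERLAPPING</Literal> pragmas and the <Literal>UndecidableInstances</Literal> language pragma do the job of the command-line flags described above.

```haskell
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}

class Describe a where
  describe :: a -> String

-- General instance: matches any list type.
instance {-# OVERLAPPABLE #-} Describe [a] where
  describe _ = "some list"

-- More specific instance: [Char] is a substitution instance of [a],
-- so it is preferred whenever it matches.
instance {-# OVERLAPPING #-} Describe [Char] where
  describe s = "string: " ++ s

-- "Class synonym": the instance head is a bare type variable, which is
-- why the undecidable-instances extension is needed.
class (Show a, Ord a) => ShowOrd a
instance (Show a, Ord a) => ShowOrd a

largest :: ShowOrd a => [a] -> String
largest = show . maximum

main :: IO ()
main = do
  putStrLn (describe [True, False])    -- some list
  putStrLn (describe "hi")             -- string: hi
  putStrLn (largest [3, 1, 2 :: Int])  -- 3
```

Note that the choice between the two <Literal>Describe</Literal> instances is made from the types alone; the instance contexts play no part.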
2758 <Sect1 id="universal-quantification">
2759 <Title>Explicit universal quantification
2763 GHC now allows you to write explicitly quantified types. GHC's
2764 syntax for this now agrees with Hugs's, namely:
2770 forall a b. (Ord a, Eq b) => a -> b -> a
2776 The context is, of course, optional. You can't use <Literal>forall</Literal> as
2777 a type variable any more!
2781 Haskell type signatures are implicitly quantified. The <Literal>forall</Literal>
2782 allows us to say exactly what this means. For example:
2800 g :: forall b. (b -> b)
2806 The two are treated identically.
2810 <Title>Universally-quantified data type fields
2814 In a <Literal>data</Literal> or <Literal>newtype</Literal> declaration one can quantify
2815 the types of the constructor arguments. Here are several examples:
2821 data T a = T1 (forall b. b -> b -> b) a
2823 data MonadT m = MkMonad { return :: forall a. a -> m a,
2824                          bind   :: forall a b. m a -> (a -> m b) -> m b
2825                        }
2827 newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
2833 The constructors now have so-called <Emphasis>rank 2</Emphasis> polymorphic
2834 types, in which there is a for-all in the argument types:
2840 T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
2841 MkMonad :: forall m. (forall a. a -> m a)
2842                   -> (forall a b. m a -> (a -> m b) -> m b)
2843                   -> MonadT m
2844 MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
2850 Notice that you don't need to use a <Literal>forall</Literal> if there's an
2851 explicit context. For example in the first argument of the
2852 constructor <Function>MkSwizzle</Function>, an implicit "<Literal>forall a.</Literal>" is
2853 prefixed to the argument type. The implicit <Literal>forall</Literal>
2854 quantifies all type variables that are not already in scope, and are
2855 mentioned in the type quantified over.
2859 As for type signatures, implicit quantification happens for non-overloaded
2860 types too. So if you write this:
2863 data T a = MkT (Either a b) (b -> b)
2866 it's just as if you had written this:
2869 data T a = MkT (forall b. Either a b) (forall b. b -> b)
2872 That is, since the type variable <Literal>b</Literal> isn't in scope, it's
2873 implicitly universally quantified. (Arguably, it would be better
2874 to <Emphasis>require</Emphasis> explicit quantification on constructor arguments
2875 where that is what is wanted. Feedback welcomed.)
2881 <Title>Construction </Title>
2884 You construct values of types <Literal>T, MonadT, Swizzle</Literal> by applying
2885 the constructor to suitable values, just as usual. For example,
2891 (T1 (\x y->x) 3) :: T Int
2893 (MkSwizzle sort) :: Swizzle
2894 (MkSwizzle reverse) :: Swizzle
2901 MkMonad r b) :: MonadT Maybe
2907 The type of the argument can, as usual, be more general than the type
2908 required, as <Literal>(MkSwizzle reverse)</Literal> shows. (<Function>reverse</Function>
2909 does not need the <Literal>Ord</Literal> constraint.)
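Here is a small self-contained sketch of building such values and then consuming them. The helper names <Literal>pairUp</Literal> and <Literal>applyTwice</Literal> are invented; I assume a recent GHC, which wants the <Literal>RankNTypes</Literal> pragma and an explicit <Literal>forall</Literal> in the <Literal>Swizzle</Literal> field.

```haskell
{-# LANGUAGE RankNTypes #-}
import Data.List (sort)

data T a = T1 (forall b. b -> b -> b) a

-- Written with an explicit forall, on the assumption that a recent GHC
-- does not quantify constructor fields implicitly.
newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

-- The polymorphic field can be used at two different types.
pairUp :: T a -> a -> (a, Char)
pairUp (T1 pick k) x = (pick k x, pick 'c' 'd')

-- The same swizzle is applied to an [a] and then to a [b].
applyTwice :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
applyTwice (MkSwizzle s) xs f = s (map f (s xs))

main :: IO ()
main = do
  print (pairUp (T1 (\x _ -> x) (3 :: Int)) 4)                 -- (3,'c')
  print (applyTwice (MkSwizzle sort) [3, 1, 2 :: Int] negate)    -- [-3,-2,-1]
  print (applyTwice (MkSwizzle reverse) [3, 1, 2 :: Int] negate) -- [-3,-1,-2]
```

<Literal>MkSwizzle reverse</Literal> is accepted even though <Function>reverse</Function> never uses the <Literal>Ord</Literal> dictionary it is offered.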
2915 <Title>Pattern matching</Title>
2918 When you use pattern matching, the bound variables may now have
2919 polymorphic types. For example:
2925 f :: T a -> a -> (a, Char)
2926 f (T1 f k) x = (f k x, f 'c' 'd')
2928 g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
2929 g (MkSwizzle s) xs f = s (map f (s xs))
2931 h :: MonadT m -> [m a] -> m [a]
2932 h m [] = return m []
2933 h m (x:xs) = bind m x $ \y ->
2934              bind m (h m xs) $ \ys ->
2935              return m (y:ys)
2941 In the function <Function>h</Function> we use the record selectors <Literal>return</Literal>
2942 and <Literal>bind</Literal> to extract the polymorphic bind and return functions
2943 from the <Literal>MonadT</Literal> data structure, rather than using pattern
2948 You cannot pattern-match against an argument that is polymorphic.
2952 newtype TIM s a = TIM (ST s (Maybe a))
2954 runTIM :: (forall s. TIM s a) -> Maybe a
2955 runTIM (TIM m) = runST m
2961 Here the pattern-match fails, because you can't pattern-match against
2962 an argument of type <Literal>(forall s. TIM s a)</Literal>. Instead you
2963 must bind the variable and pattern match in the right hand side:
2966 runTIM :: (forall s. TIM s a) -> Maybe a
2967 runTIM tm = case tm of { TIM m -> runST m }
2970 The <Literal>tm</Literal> on the right hand side is (invisibly) instantiated, like
2971 any polymorphic value at its occurrence site, and now you can pattern-match
2978 <Title>The partial-application restriction</Title>
2981 There is really only one way in which data structures with polymorphic
2982 components might surprise you: you must not partially apply them.
2983 For example, this is illegal:
2989 map MkSwizzle [sort, reverse]
2995 The restriction is this: <Emphasis>every subexpression of the program must
2996 have a type that has no for-alls, except that in a function
2997 application (f e1…en) the partial applications are not subject to
2998 this rule</Emphasis>. The restriction makes type inference feasible.
3002 In the illegal example, the sub-expression <Literal>MkSwizzle</Literal> has the
3003 polymorphic type <Literal>(Ord b => [b] -> [b]) -> Swizzle</Literal> and is not
3004 a sub-expression of an enclosing application. On the other hand, this
3011 map (T1 (\a b -> a)) [1,2,3]
3017 even though it involves a partial application of <Function>T1</Function>, because
3018 the sub-expression <Literal>T1 (\a b -> a)</Literal> has type <Literal>Int -> T
3025 <Title>Type signatures
3029 Once you have data constructors with universally-quantified fields, or
3030 constants such as <Constant>runST</Constant> that have rank-2 types, it isn't long
3031 before you discover that you need more! Consider:
3037 mkTs f x y = [T1 f x, T1 f y]
3043 <Function>mkTs</Function> is a function that constructs some values of type
3044 <Literal>T</Literal>, using some pieces passed to it. The trouble is that since
3045 <Literal>f</Literal> is a function argument, Haskell assumes that it is
3046 monomorphic, so we'll get a type error when applying <Function>T1</Function> to
3047 it. This is a rather silly example, but the problem really bites in
3048 practice. Lots of people trip over the fact that you can't make
3049 "wrapper functions" for <Constant>runST</Constant> for exactly the same reason.
3050 In short, it is impossible to build abstractions around functions with
3055 The solution is fairly clear. We provide the ability to give a rank-2
3056 type signature for <Emphasis>ordinary</Emphasis> functions (not only data
3057 constructors), thus:
3063 mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
3064 mkTs f x y = [T1 f x, T1 f y]
3070 This type signature tells the compiler to attribute <Literal>f</Literal> with
3071 the polymorphic type <Literal>(forall b. b -> b -> b)</Literal> when type
3072 checking the body of <Function>mkTs</Function>, so now the application of
3073 <Function>T1</Function> is fine.
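A runnable sketch of the same idea, assuming a recent GHC with the <Literal>RankNTypes</Literal> pragma. The accessor <Literal>useT</Literal> is invented so that the result can be printed, and the signature takes two arguments of type <Literal>a</Literal> because <Function>mkTs</Function> is applied to both <Literal>x</Literal> and <Literal>y</Literal>.

```haskell
{-# LANGUAGE RankNTypes #-}

data T a = T1 (forall b. b -> b -> b) a

-- The rank-2 signature is what lets f keep its polymorphic type
-- inside the body of mkTs.
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
mkTs f x y = [T1 f x, T1 f y]

-- Hypothetical accessor: apply the stored function to the stored
-- value and a second argument.
useT :: T a -> a -> a
useT (T1 pick k) other = pick k other

main :: IO ()
main = print (map (\t -> useT t 0) (mkTs (\x _ -> x) 1 (2 :: Int)))
-- prints [1,2]
```

Without the signature, <Literal>f</Literal> would be given a monomorphic type and the applications of <Function>T1</Function> would be rejected.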
3077 There are two restrictions:
3086 You can only define a rank 2 type, specified by the following
3091 rank2type ::= [forall tyvars .] [context =>] funty
3092 funty     ::= ([forall tyvars .] [context =>] ty) -> funty
3093             | ty
3094 ty        ::= ...current Haskell monotype syntax...
3098 Informally, the universal quantification must all be right at the beginning,
3099 or at the top level of a function argument.
3106 There is a restriction on the definition of a function whose
3107 type signature is a rank-2 type: the polymorphic arguments must be
3108 matched on the left hand side of the "<Literal>=</Literal>" sign. You can't
3109 define <Function>mkTs</Function> like this:
3113 mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
3114 mkTs = \ f x y -> [T1 f x, T1 f y]
3119 The same partial-application rule applies to ordinary functions with
3120 rank-2 types as applied to data constructors.
3133 <Title>Type synonyms and hoisting
3137 GHC also allows you to write a <Literal>forall</Literal> in a type synonym, thus:
3139 type Discard a = forall b. a -> b -> a
3144 However, it is often convenient to use this sort of synonym at the right hand
3145 end of an arrow, thus:
3147 type Discard a = forall b. a -> b -> a
3149 g :: Int -> Discard Int
3152 Simply expanding the type synonym would give
3154 g :: Int -> (forall b. Int -> b -> Int)
3156 but GHC "hoists" the <Literal>forall</Literal> to give the isomorphic type
3158 g :: forall b. Int -> Int -> b -> Int
3160 In general, the rule is this: <Emphasis>to determine the type specified by any explicit
3161 user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly
3162 performs the transformation:</Emphasis>
3164 <Emphasis>type1</Emphasis> -> forall a. <Emphasis>type2</Emphasis>
3165 ==>
3166 forall a. <Emphasis>type1</Emphasis> -> <Emphasis>type2</Emphasis>
3168 (In fact, GHC tries to retain as much synonym information as possible for use in
3169 error messages, but that is a usability issue.) This rule applies, of course, whether
3170 or not the <Literal>forall</Literal> comes from a synonym. For example, here is another
3171 valid way to write <Literal>g</Literal>'s type signature:
3173 g :: Int -> Int -> forall b. b -> Int
3180 <Sect1 id="existential-quantification">
3181 <Title>Existentially quantified data constructors
3185 The idea of using existential quantification in data type declarations
3186 was suggested by Laufer (I believe, though doubtless someone will
3187 correct me), and implemented in Hope+. It's been in Lennart
3188 Augustsson's <Command>hbc</Command> Haskell compiler for several years, and
3189 proved very useful. Here's the idea. Consider the declaration:
3195 data Foo = forall a. MkFoo a (a -> Bool)
3202 The data type <Literal>Foo</Literal> has two constructors with types:
3208 MkFoo :: forall a. a -> (a -> Bool) -> Foo
3215 Notice that the type variable <Literal>a</Literal> in the type of <Function>MkFoo</Function>
3216 does not appear in the data type itself, which is plain <Literal>Foo</Literal>.
3217 For example, the following expression is fine:
3223 [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
3229 Here, <Literal>(MkFoo 3 even)</Literal> packages an integer with a function
3230 <Function>even</Function> that maps an integer to <Literal>Bool</Literal>; and <Function>MkFoo 'c'
3231 isUpper</Function> packages a character with a compatible function. These
3232 two things are each of type <Literal>Foo</Literal> and can be put in a list.
3236 What can we do with a value of type <Literal>Foo</Literal>? In particular,
3237 what happens when we pattern-match on <Function>MkFoo</Function>?
3243 f (MkFoo val fn) = ???
3249 Since all we know about <Literal>val</Literal> and <Function>fn</Function> is that they
3250 are compatible, the only (useful) thing we can do with them is to
3251 apply <Function>fn</Function> to <Literal>val</Literal> to get a boolean. For example:
3258 f (MkFoo val fn) = fn val
3264 What this allows us to do is to package heterogeneous values
3265 together with a bunch of functions that manipulate them, and then treat
3266 that collection of packages in a uniform manner. You can express
3267 quite a bit of object-oriented-like programming this way.
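Put together as a complete program, the idea looks like this. It is a sketch assuming a recent GHC with the <Literal>ExistentialQuantification</Literal> pragma; the name <Literal>applyFoo</Literal> and the third list element are my own additions.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)

-- All we know about the two fields is that they fit together,
-- so applying one to the other is the only useful operation.
applyFoo :: Foo -> Bool
applyFoo (MkFoo val fn) = fn val

-- Three packages with three different hidden types in one list.
foos :: [Foo]
foos = [MkFoo (3 :: Int) even, MkFoo 'c' isUpper, MkFoo "" null]

main :: IO ()
main = print (map applyFoo foos)
-- prints [False,False,True]
```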
3270 <Sect2 id="existential">
3271 <Title>Why existential?
3275 What has this to do with <Emphasis>existential</Emphasis> quantification?
3276 Simply that <Function>MkFoo</Function> has the (nearly) isomorphic type
3282 MkFoo :: (exists a . (a, a -> Bool)) -> Foo
3288 But Haskell programmers can safely think of the ordinary
3289 <Emphasis>universally</Emphasis> quantified type given above, thereby avoiding
3290 adding a new existential quantification construct.
3296 <Title>Type classes</Title>
3299 An easy extension (implemented in <Command>hbc</Command>) is to allow
3300 arbitrary contexts before the constructor. For example:
3306 data Baz = forall a. Eq a => Baz1 a a
3307 | forall b. Show b => Baz2 b (b -> b)
3313 The two constructors have the types you'd expect:
3319 Baz1 :: forall a. Eq a => a -> a -> Baz
3320 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
3326 But when pattern matching on <Function>Baz1</Function> the matched values can be compared
3327 for equality, and when pattern matching on <Function>Baz2</Function> the first matched
3328 value can be converted to a string (as well as applying the function to it).
3329 So this program is legal:
3336 f (Baz1 p q) | p == q = "Yes"
3338 f (Baz2 v fn) = show (fn v)
3344 Operationally, in a dictionary-passing implementation, the
3345 constructors <Function>Baz1</Function> and <Function>Baz2</Function> must store the
3346 dictionaries for <Literal>Eq</Literal> and <Literal>Show</Literal> respectively, and
3347 extract them on pattern matching.
3351 Notice the way that the syntax fits smoothly with that used for
3352 universal quantification earlier.
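A complete version of this example, as a sketch assuming a recent GHC with the <Literal>ExistentialQuantification</Literal> pragma. The function name <Literal>render</Literal>, the <Literal>"No"</Literal> branch and the sample values are illustrative choices of mine.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

data Baz = forall a. Eq a   => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

render :: Baz -> String
render (Baz1 p q)
  | p == q    = "Yes"            -- uses the Eq dictionary stored in Baz1
  | otherwise = "No"
render (Baz2 v fn) = show (fn v) -- uses the Show dictionary stored in Baz2

main :: IO ()
main = mapM_ (putStrLn . render)
  [Baz1 'x' 'x', Baz1 True False, Baz2 (1 :: Int) (+ 1)]
-- prints Yes, No and 2 on separate lines
```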
3358 <Title>Restrictions</Title>
3361 There are several restrictions on the ways in which existentially-quantified
3362 constructors can be used.
3371 When pattern matching, each pattern match introduces a new,
3372 distinct, type for each existential type variable. These types cannot
3373 be unified with any other type, nor can they escape from the scope of
3374 the pattern match. For example, these fragments are incorrect:
3382 Here, the type bound by <Function>MkFoo</Function> "escapes", because <Literal>a</Literal>
3383 is the result of <Function>f1</Function>. One way to see why this is wrong is to
3384 ask what type <Function>f1</Function> has:
3388 f1 :: Foo -> a -- Weird!
3392 What is this "<Literal>a</Literal>" in the result type? Clearly we don't mean
3397 f1 :: forall a. Foo -> a -- Wrong!
3401 The original program is just plain wrong. Here's another sort of error
3405 f2 (Baz1 a b) (Baz1 p q) = a==q
3409 It's ok to say <Literal>a==b</Literal> or <Literal>p==q</Literal>, but
3410 <Literal>a==q</Literal> is wrong because it equates the two distinct types arising
3411 from the two <Function>Baz1</Function> constructors.
3419 You can't pattern-match on an existentially quantified
3420 constructor in a <Literal>let</Literal> or <Literal>where</Literal> group of
3421 bindings. So this is illegal:
3425 f3 x = a==b where { Baz1 a b = x }
3429 You can only pattern-match
3430 on an existentially-quantified constructor in a <Literal>case</Literal> expression or
3431 in the patterns of a function definition.
3433 The reason for this restriction is really an implementation one.
3434 Type-checking binding groups is already a nightmare without
3435 existentials complicating the picture. Also an existential pattern
3436 binding at the top level of a module doesn't make sense, because it's
3437 not clear how to prevent the existentially-quantified type "escaping".
3438 So for now, there's a simple-to-state restriction. We'll see how
3446 You can't use existential quantification for <Literal>newtype</Literal>
3447 declarations. So this is illegal:
3451 newtype T = forall a. Ord a => MkT a
3455 Reason: a value of type <Literal>T</Literal> must be represented as a pair
3456 of a dictionary for <Literal>Ord t</Literal> and a value of type <Literal>t</Literal>.
3457 That contradicts the idea that <Literal>newtype</Literal> should have no
3458 concrete representation. You can get just the same efficiency and effect
3459 by using <Literal>data</Literal> instead of <Literal>newtype</Literal>. If there is no
3460 overloading involved, then there is more of a case for allowing
3461 an existentially-quantified <Literal>newtype</Literal>,
3462 because the <Literal>data</Literal> version does carry an implementation cost,
3463 but single-field existentially quantified constructors aren't much
3464 use. So the simple restriction (no existential stuff on <Literal>newtype</Literal>)
3465 stands, unless there are convincing reasons to change it.
3473 You can't use <Literal>deriving</Literal> to define instances of a
3474 data type with existentially quantified data constructors.
3476 Reason: in most cases it would not make sense. For example:
3479 data T = forall a. MkT [a] deriving( Eq )
3482 To derive <Literal>Eq</Literal> in the standard way we would need to have equality
3483 between the single component of two <Function>MkT</Function> constructors:
3487 (MkT a) == (MkT b) = ???
3490 But <VarName>a</VarName> and <VarName>b</VarName> have distinct types, and so can't be compared.
3491 It's just about possible to imagine examples in which the derived instance
3492 would make sense, but it seems altogether simpler simply to prohibit such
3493 declarations. Define your own instances!
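For instance, one hand-written <Literal>Eq</Literal> instance for the type above could compare only the shapes of the two lists. This is just one possible meaning of equality, chosen for illustration, and the sketch assumes a recent GHC with the <Literal>ExistentialQuantification</Literal> pragma.

```haskell
{-# LANGUAGE ExistentialQuantification #-}

data T = forall a. MkT [a]

-- The element types of the two lists are unrelated, so the elements
-- cannot be compared; here only the lengths are (an arbitrary choice).
instance Eq T where
  MkT xs == MkT ys = length xs == length ys

main :: IO ()
main = do
  print (MkT [1, 2, 3 :: Int] == MkT "abc")  -- True
  print (MkT [True] == MkT "abc")            -- False
```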
3505 <Sect1 id="sec-assertions">
3507 <IndexTerm><Primary>Assertions</Primary></IndexTerm>
3511 If you want to make use of assertions in your standard Haskell code, you
3512 could define a function like the following:
3518 assert :: Bool -> a -> a
3519 assert False x = error "assertion failed!"
3526 which works, but gives you back a less than useful error message --
3527 an assertion failed, but which and where?
3531 One way out is to define an extended <Function>assert</Function> function which also
3532 takes a descriptive string to include in the error message and
3533 perhaps combine this with the use of a pre-processor which inserts
3534 the source location where <Function>assert</Function> was used.
3538 GHC offers a helping hand here, doing all of this for you. For every
3539 use of <Function>assert</Function> in the user's source:
3545 kelvinToC :: Double -> Double
3546 kelvinToC k = assert (k &gt;= 0.0) (k-273.15)
3552 GHC will rewrite this to also include the source location where the
3559 assert pred val ==> assertError "Main.hs|15" pred val
3565 The rewrite is only performed by the compiler when it spots
3566 applications of <Function>Exception.assert</Function>, so you can still define and
3567 use your own versions of <Function>assert</Function>, should you so wish. If not,
3568 import <Literal>Exception</Literal> to make use of <Function>assert</Function> in your code.
3572 To have the compiler ignore uses of assert, use the compiler option
3573 <Option>-fignore-asserts</Option>. <IndexTerm><Primary>-fignore-asserts option</Primary></IndexTerm> That is,
3574 expressions of the form <Literal>assert pred e</Literal> will be rewritten to <Literal>e</Literal>.
3578 Assertion failures can be caught; see the documentation for the
3579 <literal>Exception</literal> library (<xref linkend="sec-Exception">)
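A minimal sketch of the built-in <Function>assert</Function> in use. I am assuming a recent GHC, where the function is imported from <Literal>Control.Exception</Literal>; the conversion subtracts 273.15, since that is what turning Kelvin into Celsius requires.

```haskell
import Control.Exception (assert)

-- A negative absolute temperature is a programming error, so it is
-- guarded by an assertion rather than reported to the user.
kelvinToC :: Double -> Double
kelvinToC k = assert (k >= 0.0) (k - 273.15)

main :: IO ()
main = print (kelvinToC 300.0)
```

Calling <Literal>kelvinToC (-1)</Literal> in a build with assertions enabled raises an exception whose message includes the source location of the <Function>assert</Function>. With assertions ignored, the call simply returns the computed value.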
3585 <Sect1 id="scoped-type-variables">
3586 <Title>Scoped Type Variables
3590 A <Emphasis>pattern type signature</Emphasis> can introduce a <Emphasis>scoped type
3591 variable</Emphasis>. For example
3597 f (xs::[a]) = ys ++ ys
3606 The pattern <Literal>(xs::[a])</Literal> includes a type signature for <VarName>xs</VarName>.
3607 This brings the type variable <Literal>a</Literal> into scope; it scopes over
3608 all the patterns and right hand sides for this equation for <Function>f</Function>.
3609 In particular, it is in scope at the type signature for <VarName>ys</VarName>.
3613 At ordinary type signatures, such as that for <VarName>ys</VarName>, any type variables
3614 mentioned in the type signature <Emphasis>that are not in scope</Emphasis> are
3615 implicitly universally quantified. (If there are no type variables in
3616 scope, all type variables mentioned in the signature are universally
3617 quantified, which is just as in Haskell 98.) In this case, since <VarName>a</VarName>
3618 is in scope, it is not universally quantified, so the type of <VarName>ys</VarName> is
3619 the same as that of <VarName>xs</VarName>. In Haskell 98 it is not possible to declare
3620 a type for <VarName>ys</VarName>; a major benefit of scoped type variables is that
3621 it becomes possible to do so.
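Written out in full, the example might look like this. It is a sketch assuming a recent GHC with the <Literal>ScopedTypeVariables</Literal> pragma; the function name <Literal>twice</Literal> and the separate signature using <Literal>b</Literal> are my own additions.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

-- The separate signature uses b; its variable names have no scope
-- beyond the signature itself.
twice :: [b] -> [b]
twice (xs :: [a]) = ys ++ ys   -- the pattern signature brings a into scope
  where
    ys :: [a]                  -- the same a as in the pattern, not a fresh one
    ys = reverse xs

main :: IO ()
main = print (twice [1, 2, 3 :: Int])
-- prints [3,2,1,3,2,1]
```

If <Literal>a</Literal> were not in scope, the signature <Literal>ys :: [a]</Literal> would claim that <Literal>ys</Literal> is a list of any element type at all, which <Literal>reverse xs</Literal> cannot satisfy.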
3625 Scoped type variables are implemented in both GHC and Hugs. Where the
3626 implementations differ from the specification below, those differences
3631 So much for the basic idea. Here are the details.
3635 <Title>Scope and implicit quantification</Title>
3643 All the type variables mentioned in the patterns for a single
3644 function definition equation, that are not already in scope,
3645 are brought into scope by the patterns. We describe this set as
3646 the <Emphasis>type variables bound by the equation</Emphasis>.
3653 The type variables thus brought into scope may be mentioned
3654 in ordinary type signatures or pattern type signatures anywhere within
3662 In ordinary type signatures, any type variable mentioned in the
3663 signature that is in scope is <Emphasis>not</Emphasis> universally quantified.
3670 Ordinary type signatures do not bring any new type variables
3671 into scope (except in the type signature itself!). So this is illegal:
3680 It's illegal because <VarName>a</VarName> is not in scope in the body of <Function>f</Function>,
3681 so the ordinary signature <Literal>x::a</Literal> is equivalent to <Literal>x::forall a.a</Literal>;
3682 and that is an incorrect typing.
3689 There is no implicit universal quantification on pattern type
3690 signatures, nor may one write an explicit <Literal>forall</Literal> type in a pattern
3691 type signature. The pattern type signature is a monotype.
3699 The type variables in the head of a <Literal>class</Literal> or <Literal>instance</Literal> declaration
3700 scope over the methods defined in the <Literal>where</Literal> part. For example:
3714 (Not implemented in Hugs yet, Dec 98).
3725 <Title>Polymorphism</Title>
3733 Pattern type signatures are completely orthogonal to ordinary, separate
3734 type signatures. The two can be used independently or together. There is
3735 no scoping associated with the names of the type variables in a separate type signature.
3740 f (xs::[b]) = reverse xs
3749 The function must be polymorphic in the type variables
3750 bound by all its equations. Operationally, the type variables bound
3751 by one equation must not:
3758 Be unified with a type (such as <Literal>Int</Literal>, or <Literal>[a]</Literal>).
3764 Be unified with a type variable free in the environment.
3770 Be unified with each other. (They may unify with the type variables
3771 bound by another equation for the same function, of course.)
3778 For example, the following all fail to type check:
3782 f (x::a) (y::b) = [x,y] -- a unifies with b
3784 g (x::a) = x + 1::Int -- a unifies with Int
3786 h x = let k (y::a) = [x,y] -- a is free in the
3787       in k x                  -- environment
3789 k (x::a) True = ... -- a unifies with Int
3790 k (x::Int) False = ...
3793 w (x::a) = x -- a unifies with [b]
3802 The pattern-bound type variable may, however, be constrained
3803 by the context of the principal type, thus:
3807 f (x::a) (y::a) = x+y*2
3811 gets the inferred type: <Literal>forall a. Num a => a -> a -> a</Literal>.
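A quick way to check this, as a sketch assuming a recent GHC with the <Literal>ScopedTypeVariables</Literal> pragma, with the function renamed to the invented <Literal>addScaled</Literal>:

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

-- No separate signature: both arguments share the pattern-bound
-- variable a, and the inferred context is Num a.
addScaled (x :: a) (y :: a) = x + y * 2

main :: IO ()
main = do
  print (addScaled 1 (2 :: Int))       -- 5
  print (addScaled 1.5 (2 :: Double))  -- 5.5
```

The two calls use the function at <Literal>Int</Literal> and at <Literal>Double</Literal>, which is only possible because it remains polymorphic in <Literal>a</Literal>.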
3822 <Title>Result type signatures</Title>
3830 The result type of a function can be given a signature,
3835 f (x::a) :: [a] = [x,x,x]
3839 The final <Literal>:: [a]</Literal> after all the patterns gives a signature to the
3840 result type. Sometimes this is the only way of naming the type variable
3845 f :: Int -> [a] -> [a]
3846 f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
3847                       in \xs -> map g (reverse xs `zip` xs)
3859 Result type signatures are not yet implemented in Hugs.
3865 <Title>Pattern signatures on other constructs</Title>
3873 A pattern type signature can be on an arbitrary sub-pattern, not
3878 f ((x,y)::(a,b)) = (y,x) :: (b,a)
3887 Pattern type signatures, including the result part, can be used
3888 in lambda abstractions:
3892 (\ (x::a, y) :: a -> x)
3896 Type variables bound by these patterns must be polymorphic in
3897 the sense defined above.
3902 f1 (x::c) = f1 x -- ok
3903 f2 = \(x::c) -> f2 x -- not ok
3907 Here, <Function>f1</Function> is OK, but <Function>f2</Function> is not, because <VarName>c</VarName> gets unified
3908 with a type variable free in the environment, in this
3909 case, the type of <Function>f2</Function>, which is in the environment when
3910 the lambda abstraction is checked.
3917 Pattern type signatures, including the result part, can be used
3918 in <Literal>case</Literal> expressions:
3922 case e of { (x::a, y) :: a -> x }
3926 The pattern-bound type variables must, as usual,
3927 be polymorphic in the following sense: each case alternative,
3928 considered as a lambda abstraction, must be polymorphic.
3933 case (True,False) of { (x::a, y) -> x }
3937 Even though the context is that of a pair of booleans,
3938 the alternative itself is polymorphic. Of course, it is
3943 case (True,False) of { (x::Bool, y) -> x }
3952 To avoid ambiguity, the type after the “<Literal>::</Literal>” in a result
3953 pattern signature on a lambda or <Literal>case</Literal> must be atomic (i.e. a single
3954 token or a parenthesised type of some sort). To see why,
3955 consider how one would parse this:
3968 Pattern type signatures that bind new type variables
3969 may not be used in pattern bindings at all.
3974 f x = let (y, z::a) = x in ...
3978 But these are OK, because they do not bind fresh type variables:
3982 f1 x = let (y, z::Int) = x in ...
3983 f2 (x::(Int,a)) = let (y, z::a) = x in ...
3987 However a single variable is considered a degenerate function binding,
3988 rather than a degenerate pattern binding, so this is permitted, even
3989 though it binds a type variable:
3993 f :: (b->b) = \(x::b) -> x
4002 Such degenerate function bindings do not fall under the monomorphism
4009 g :: a -> a -> Bool = \x y -> x==y
4015 Here <Function>g</Function> has type <Literal>forall a. Eq a => a -> a -> Bool</Literal>, just as if
4016 <Function>g</Function> had a separate type signature. Lacking a type signature, <Function>g</Function>
4017 would get a monomorphic type.
4023 <Title>Existentials</Title>
4031 Pattern type signatures can bind existential type variables.
4036 data T = forall a. MkT [a]
4039 f (MkT [t::a]) = MkT t3
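As a rough sketch of how an existential pattern signature plays out in a complete module, the following assumes a recent GHC in which the extensions are switched on with <Literal>LANGUAGE</Literal> pragmas rather than <Option>-fglasgow-exts</Option>. The helper <Function>sizeT</Function>, the fallback equation and <Function>main</Function> are illustrative additions, not part of the example above.

```haskell
{-# LANGUAGE ExistentialQuantification, ScopedTypeVariables #-}

-- The element type 'a' is hidden inside the constructor.
data T = forall a. MkT [a]

-- The pattern signature (t :: a) names the existential type variable,
-- so it can be mentioned in the signature of the local binding t3.
f :: T -> T
f (MkT [t :: a]) = MkT t3
  where
    t3 :: [a]
    t3 = [t, t, t]
f other = other  -- hypothetical fallback for lists that are not singletons

-- Hypothetical helper: the length is about all we can observe from outside.
sizeT :: T -> Int
sizeT (MkT xs) = length xs

main :: IO ()
main = print (map sizeT [f (MkT "x"), f (MkT [True, False])])  -- prints [3,2]
```

Without the pattern signature there would be no way to write the type of <Function>t3</Function>, because the type variable is not mentioned in the type of <Function>f</Function>.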
4056 <Sect1 id="pragmas">
4061 GHC supports several pragmas, or instructions to the compiler placed
4062 in the source code. Pragmas don't affect the meaning of the program,
4063 but they might affect the efficiency of the generated code.
4066 <Sect2 id="inline-pragma">
4067 <Title>INLINE pragma
4069 <IndexTerm><Primary>INLINE pragma</Primary></IndexTerm>
4070 <IndexTerm><Primary>pragma, INLINE</Primary></IndexTerm></Title>
4073 GHC (with <Option>-O</Option>, as always) tries to inline (or “unfold”)
4074 functions/values that are “small enough,” thus avoiding the call
4075 overhead and possibly exposing other more-wonderful optimisations.
4079 You will probably see these unfoldings (in Core syntax) in your
4084 Normally, if GHC decides a function is “too expensive” to inline, it
4085 will not do so, nor will it export that unfolding for other modules to
4090 The sledgehammer you can bring to bear is the
4091 <Literal>INLINE</Literal><IndexTerm><Primary>INLINE pragma</Primary></IndexTerm> pragma, used thusly:
4094 key_function :: Int -> String -> (Bool, Double)
4096 #ifdef __GLASGOW_HASKELL__
4097 {-# INLINE key_function #-}
4101 (You don't need to do the C pre-processor carry-on unless you're going
4102 to stick the code through HBC—it doesn't like <Literal>INLINE</Literal> pragmas.)
4106 The major effect of an <Literal>INLINE</Literal> pragma is to declare a function's
4107 “cost” to be very low. The normal unfolding machinery will then be
4108 very keen to inline it.
4112 An <Literal>INLINE</Literal> pragma for a function can be put anywhere its type
4113 signature could be put.
4117 <Literal>INLINE</Literal> pragmas are a particularly good idea for the
4118 <Literal>then</Literal>/<Literal>return</Literal> (or <Literal>bind</Literal>/<Literal>unit</Literal>) functions in a monad.
4119 For example, in GHC's own <Literal>UniqueSupply</Literal> monad code, we have:
4122 #ifdef __GLASGOW_HASKELL__
4123 {-# INLINE thenUs #-}
4124 {-# INLINE returnUs #-}
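A self-contained sketch of the same idea, using a toy state-passing monad rather than GHC's real unique supply. Every name here (<Function>thenC</Function>, <Function>returnC</Function>, <Function>fresh</Function>) is made up for illustration.

```haskell
-- A tiny state-passing "monad" in the style of a unique supply.
type Counter a = Int -> (a, Int)

-- Each pragma sits next to the type signature of the function it names.
{-# INLINE thenC #-}
thenC :: Counter a -> (a -> Counter b) -> Counter b
thenC m k = \s -> case m s of (a, s') -> k a s'

{-# INLINE returnC #-}
returnC :: a -> Counter a
returnC a = \s -> (a, s)

-- Hand out the current counter value and advance the counter.
fresh :: Counter Int
fresh = \s -> (s, s + 1)

-- Two chained uses of thenC; with -O these are candidates for unfolding.
twoFresh :: Counter (Int, Int)
twoFresh = fresh `thenC` \a -> fresh `thenC` \b -> returnC (a, b)

main :: IO ()
main = print (fst (twoFresh 0))  -- prints (0,1)
```

The program's result is the same with or without the pragmas; they only influence how eagerly GHC unfolds the two combinators.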
4132 <Sect2 id="noinline-pragma">
4133 <Title>NOINLINE pragma
4137 <IndexTerm><Primary>NOINLINE pragma</Primary></IndexTerm>
4138 <IndexTerm><Primary>pragma, NOINLINE</Primary></IndexTerm>
4142 The <Literal>NOINLINE</Literal> pragma does exactly what you'd expect: it stops the
4143 named function from being inlined by the compiler. You shouldn't ever
4144 need to do this, unless you're very cautious about code size.
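A minimal sketch of the code-size case, with a hypothetical function <Function>describe</Function> whose body we would rather not see copied to every call site:

```haskell
-- A deliberately bulky function (illustrative only). NOINLINE asks GHC
-- to keep a single copy and call it, rather than unfolding it at use sites.
{-# NOINLINE describe #-}
describe :: Int -> String
describe n
  | n < 0     = "negative"
  | n == 0    = "zero"
  | n < 10    = "small"
  | n < 1000  = "medium"
  | otherwise = "large"

main :: IO ()
main = mapM_ (putStrLn . describe) [-5, 0, 7, 42, 100000]
```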
4149 <Sect2 id="specialize-pragma">
4150 <Title>SPECIALIZE pragma
4154 <IndexTerm><Primary>SPECIALIZE pragma</Primary></IndexTerm>
4155 <IndexTerm><Primary>pragma, SPECIALIZE</Primary></IndexTerm>
4156 <IndexTerm><Primary>overloading, death to</Primary></IndexTerm>
4160 (UK spelling also accepted.) For key overloaded functions, you can
4161 create extra versions (NB: more code space) specialised to particular
4162 types. Thus, if you have an overloaded function:
4168 hammeredLookup :: Ord key => [(key, value)] -> key -> value
4174 If it is heavily used on lists with <Literal>Widget</Literal> keys, you could
4175 specialise it as follows:
4178 {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
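Put together as a complete module, this might look as follows. The <Literal>Widget</Literal> type and the body of <Function>hammeredLookup</Function> are invented for the sketch; only the signature and the pragma come from the text above.

```haskell
-- Widget is a hypothetical key type.
newtype Widget = Widget Int
  deriving (Eq, Ord, Show)

-- Overloaded lookup in an association list sorted by ascending key.
-- The body is illustrative: it gives up once the keys exceed the target.
hammeredLookup :: Ord key => [(key, value)] -> key -> value
hammeredLookup [] _ = error "hammeredLookup: key not found"
hammeredLookup ((k, v) : rest) key =
  case compare key k of
    EQ -> v
    GT -> hammeredLookup rest key
    LT -> error "hammeredLookup: key not found"

-- Ask GHC for an extra copy with the Ord dictionary fixed to Widget's.
{-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}

main :: IO ()
main = putStrLn (hammeredLookup table (Widget 2))  -- prints two
  where
    table = [(Widget 1, "one"), (Widget 2, "two"), (Widget 3, "three")]
```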
4184 To get very fancy, you can also specify a named function to use for
4185 the specialised value, by adding <Literal>= blah</Literal>, as in:
4188 {-# SPECIALIZE hammeredLookup :: ...as before... = blah #-}
4191 It's <Emphasis>Your Responsibility</Emphasis> to make sure that <Function>blah</Function> really
4192 behaves as a specialised version of <Function>hammeredLookup</Function>!!!
4196 NOTE: the <Literal>=blah</Literal> feature isn't implemented in GHC 4.xx.
4200 An example in which the <Literal>= blah</Literal> form will Win Big:
4203 toDouble :: Real a => a -> Double
4204 toDouble = fromRational . toRational
4206 {-# SPECIALIZE toDouble :: Int -> Double = i2d #-}
4207 i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
4210 The <Function>i2d</Function> function is virtually one machine instruction; the
4211 default conversion—via an intermediate <Literal>Rational</Literal>—is obscenely
4212 expensive by comparison.
4216 By using the US spelling, your <Literal>SPECIALIZE</Literal> pragma will work with
4217 HBC, too. Note that HBC doesn't support the <Literal>= blah</Literal> form.
4221 A <Literal>SPECIALIZE</Literal> pragma for a function can be put anywhere its type
4222 signature could be put.
4227 <Sect2 id="specialize-instance-pragma">
4228 <Title>SPECIALIZE instance pragma
4232 <IndexTerm><Primary>SPECIALIZE pragma</Primary></IndexTerm>
4233 <IndexTerm><Primary>overloading, death to</Primary></IndexTerm>
4234 Same idea, except for instance declarations. For example:
4237 instance (Eq a) => Eq (Foo a) where { ... usual stuff ... }
4239 {-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
4242 Compatible with HBC, by the way.
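A small runnable sketch, assuming a recent GHC, which expects the pragma to be written inside the <Literal>where</Literal> part of the instance declaration. The definitions of <Literal>Foo</Literal> and <Literal>Bar</Literal> are placeholders.

```haskell
-- Placeholder types for the sketch.
data Bar = Bar deriving Eq
newtype Foo a = Foo a

instance Eq a => Eq (Foo a) where
  -- Request a copy of this instance specialised to one particular type.
  {-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
  Foo x == Foo y = x == y

main :: IO ()
main = print (Foo [(1 :: Int, Bar)] == Foo [(1, Bar)])  -- prints True
```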
4247 <Sect2 id="line-pragma">
4252 <IndexTerm><Primary>LINE pragma</Primary></IndexTerm>
4253 <IndexTerm><Primary>pragma, LINE</Primary></IndexTerm>
4257 This pragma is similar to C's <Literal>#line</Literal> directive, and is mainly for use in
4258 automatically generated Haskell code. It lets you specify the line
4259 number and filename of the original code; for example
4265 {-# LINE 42 "Foo.vhs" #-}
4271 if you'd generated the current file from something called <Filename>Foo.vhs</Filename>
4272 and this line corresponds to line 42 in the original. GHC will adjust
4273 its error messages to refer to the line/file named in the <Literal>LINE</Literal>
4280 <Title>RULES pragma</Title>
4283 The RULES pragma lets you specify rewrite rules. It is described in
4284 <XRef LinkEnd="rewrite-rules">.
4291 <Sect1 id="rewrite-rules">
4292 <Title>Rewrite rules
4294 <IndexTerm><Primary>RULES pragma</Primary></IndexTerm>
4295 <IndexTerm><Primary>pragma, RULES</Primary></IndexTerm>
4296 <IndexTerm><Primary>rewrite rules</Primary></IndexTerm></Title>
4299 The programmer can specify rewrite rules as part of the source program
4300 (in a pragma). GHC applies these rewrite rules wherever it can.
4308 "map/map" forall f g xs. map f (map g xs) = map (f.g) xs
4315 <Title>Syntax</Title>
4318 From a syntactic point of view:
4324 Each rule has a name, enclosed in double quotes. The name itself has
4325 no significance at all. It is only used when reporting how many times the rule fired.
4331 There may be zero or more rules in a <Literal>RULES</Literal> pragma.
4337 Layout applies in a <Literal>RULES</Literal> pragma. Currently no new indentation level
4338 is set, so you must lay out your rules starting in the same column as the
4339 enclosing definitions.
4345 Each variable mentioned in a rule must either be in scope (e.g. <Function>map</Function>),
4346 or bound by the <Literal>forall</Literal> (e.g. <Function>f</Function>, <Function>g</Function>, <Function>xs</Function>). The variables bound by
4347 the <Literal>forall</Literal> are called the <Emphasis>pattern</Emphasis> variables. They are separated
4348 by spaces, just like in a type <Literal>forall</Literal>.
4354 A pattern variable may optionally have a type signature.
4355 If the type of the pattern variable is polymorphic, it <Emphasis>must</Emphasis> have a type signature.
4356 For example, here is the <Literal>foldr/build</Literal> rule:
4359 "fold/build" forall k z (g::forall b. (a->b->b) -> b -> b) .
4360 foldr k z (build g) = g k z
4363 Since <Function>g</Function> has a polymorphic type, it must have a type signature.
4370 The left hand side of a rule must consist of a top-level variable applied
4371 to arbitrary expressions. For example, this is <Emphasis>not</Emphasis> OK:
4374 "wrong1" forall e1 e2. case True of { True -> e1; False -> e2 } = e1
4375 "wrong2" forall f. f True = True
4378 In <Literal>"wrong1"</Literal>, the LHS is not an application; in <Literal>"wrong2"</Literal>, the LHS has a pattern variable
4385 A rule does not need to be in the same module as (any of) the
4386 variables it mentions, though of course they need to be in scope.
4392 Rules are automatically exported from a module, just as instance declarations are.
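The syntactic points above can be seen together in a small module. This is only a sketch: <Function>myMap</Function> is a hypothetical stand-in for <Function>map</Function>, defined locally so that the example does not interfere with the Prelude's own rules, and it is marked <Literal>NOINLINE</Literal> on the assumption that this keeps calls to it visible to the rule matcher.

```haskell
-- A local copy of map, so that the rule below mentions a top-level
-- variable we own.
{-# NOINLINE myMap #-}
myMap :: (a -> b) -> [a] -> [b]
myMap _ []       = []
myMap f (x : xs) = f x : myMap f xs

-- One named rule. f, g and xs are pattern variables bound by the forall
-- and separated by spaces; myMap is simply in scope.
{-# RULES
"myMap/myMap" forall f g xs. myMap f (myMap g xs) = myMap (f . g) xs
  #-}

main :: IO ()
main = print (myMap (+ 1) (myMap (* 2) [1, 2, 3 :: Int]))  -- prints [3,5,7]
```

The output is the same whether or not the rule fires, which is exactly the property the programmer is promising when writing a rule.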
4403 <Title>Semantics</Title>
4406 From a semantic point of view:
4412 Rules are only applied if you use the <Option>-O</Option> flag.
4418 Rules are regarded as left-to-right rewrite rules.
4419 When GHC finds an expression that is a substitution instance of the LHS
4420 of a rule, it replaces the expression by the (appropriately-substituted) RHS.
4421 By "a substitution instance" we mean that the LHS can be made equal to the
4422 expression by substituting for the pattern variables.
4429 The LHS and RHS of a rule are typechecked, and must have the
4437 GHC makes absolutely no attempt to verify that the LHS and RHS
4438 of a rule have the same meaning. That is undecidable in general, and
4439 infeasible in most interesting cases. The responsibility is entirely the programmer's!
4446 GHC makes no attempt to make sure that the rules are confluent or
4447 terminating. For example:
4450 "loop" forall x y. f x y = f y x
4453 This rule will cause the compiler to go into an infinite loop.
4460 If more than one rule matches a call, GHC will choose one arbitrarily to apply.
4466 GHC currently uses a very simple, syntactic, matching algorithm
4467 for matching a rule LHS with an expression. It seeks a substitution
4468 which makes the LHS and expression syntactically equal modulo alpha
4469 conversion. The pattern (rule), but not the expression, is eta-expanded if
4470 necessary. (Eta-expanding the expression can lead to laziness bugs.)
4471 Matching is not done modulo beta conversion, however (that would be higher-order matching).
4475 Matching is carried out on GHC's intermediate language, which includes
4476 type abstractions and applications. So a rule only matches if the
4477 types match too. See <XRef LinkEnd="rule-spec"> below.
4483 GHC keeps trying to apply the rules as it optimises the program.
4484 For example, consider:
4493 The expression <Literal>s (t xs)</Literal> does not match the rule <Literal>"map/map"</Literal>, but GHC
4494 will substitute for <VarName>s</VarName> and <VarName>t</VarName>, giving an expression which does match.
4495 If <VarName>s</VarName> or <VarName>t</VarName> was (a) used more than once, and (b) large or a redex, then it would
4496 not be substituted, and the rule would not fire.
4503 In the earlier phases of compilation, GHC inlines <Emphasis>nothing
4504 that appears on the LHS of a rule</Emphasis>, because once you have substituted
4505 for something you can't match against it (given the simple-minded
4506 matching). So if you write the rule
4509 "map/map" forall f g. map f . map g = map (f.g)
4512 this <Emphasis>won't</Emphasis> match the expression <Literal>map f (map g xs)</Literal>.
4513 It will only match something written with explicit use of ".".
4514 Well, not quite. It <Emphasis>will</Emphasis> match the expression
4520 where <Function>wibble</Function> is defined:
4523 wibble f g = map f . map g
4526 because <Function>wibble</Function> will be inlined (it's small).
4528 Later on in compilation, GHC starts inlining even things on the
4529 LHS of rules, but still leaves the rules enabled. This inlining
4530 policy is controlled by the per-simplification-pass flag <Option>-finline-phase</Option><Emphasis>n</Emphasis>.
4537 All rules are implicitly exported from the module, and are therefore
4538 in force in any module that imports the module that defined the rule, directly
4539 or indirectly. (That is, if A imports B, which imports C, then C's rules are
4540 in force when compiling A.) The situation is very similar to that for instance
4552 <Title>List fusion</Title>
4555 The RULES mechanism is used to implement fusion (deforestation) of common list functions.
4556 If a "good consumer" consumes an intermediate list constructed by a "good producer", the
4557 intermediate list should be eliminated entirely.
4561 The following are good producers:
4573 Enumerations of <Literal>Int</Literal> and <Literal>Char</Literal> (e.g. <Literal>['a'..'z']</Literal>).
4579 Explicit lists (e.g. <Literal>[True, False]</Literal>)
4585 The cons constructor (e.g. <Literal>3:4:[]</Literal>)
4591 <Function>++</Function>
4597 <Function>map</Function>
4603 <Function>filter</Function>
4609 <Function>iterate</Function>, <Function>repeat</Function>
4615 <Function>zip</Function>, <Function>zipWith</Function>
4624 The following are good consumers:
4636 <Function>array</Function> (on its second argument)
4642 <Function>length</Function>
4648 <Function>++</Function> (on its first argument)
4654 <Function>map</Function>
4660 <Function>filter</Function>
4666 <Function>concat</Function>
4672 <Function>unzip</Function>, <Function>unzip2</Function>, <Function>unzip3</Function>, <Function>unzip4</Function>
4678 <Function>zip</Function>, <Function>zipWith</Function> (but on one argument only; if both are good producers, <Function>zip</Function>
4679 will fuse with one but not the other)
4685 <Function>partition</Function>
4691 <Function>head</Function>
4697 <Function>and</Function>, <Function>or</Function>, <Function>any</Function>, <Function>all</Function>
4703 <Function>sequence_</Function>
4709 <Function>msum</Function>
4715 <Function>sortBy</Function>
4724 So, for example, the following should generate no intermediate lists:
4727 array (1,10) [(i,i*i) | i <- map (+ 1) [0..9]]
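The expression can be tried in a complete program along these lines. The sketch assumes <Function>array</Function> is imported from <Literal>Data.Array</Literal>, as in current library layouts; whether the intermediate lists really disappear depends on compiling with <Option>-O</Option>, and the results are the same either way.

```haskell
import Data.Array (Array, array, elems, (!))

-- map (+ 1) [0 .. 9] produces the indices 1..10; each index is paired
-- with its square and handed to array, a good consumer.
squares :: Array Int Int
squares = array (1, 10) [(i, i * i) | i <- map (+ 1) [0 .. 9]]

main :: IO ()
main = do
  print (squares ! 3)          -- prints 9
  print (sum (elems squares))  -- prints 385
```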
4733 This list could readily be extended; if there are Prelude functions that you use
4734 a lot which are not included, please tell us.
4738 If you want to write your own good consumers or producers, look at the
4739 Prelude definitions of the above functions to see how to do so.
4744 <Sect2 id="rule-spec">
4745 <Title>Specialisation
4749 Rewrite rules can be used to get the same effect as a feature
4750 present in earlier versions of GHC:
4753 {-# SPECIALIZE fromIntegral :: Int8 -> Int16 = int8ToInt16 #-}
4756 This told GHC to use <Function>int8ToInt16</Function> instead of <Function>fromIntegral</Function> whenever
4757 the latter was called with type <Literal>Int8 -> Int16</Literal>. That is, rather than
4758 specialising the original definition of <Function>fromIntegral</Function> the programmer is
4759 promising that it is safe to use <Function>int8ToInt16</Function> instead.
4763 This feature is no longer in GHC. But rewrite rules let you do the
4768 "fromIntegral/Int8/Int16" fromIntegral = int8ToInt16
4772 This slightly odd-looking rule instructs GHC to replace <Function>fromIntegral</Function>
4773 by <Function>int8ToInt16</Function> <Emphasis>whenever the types match</Emphasis>. Speaking more operationally,
4774 GHC adds the type and dictionary applications to get the typed rule
4777 forall (d1::Integral Int8) (d2::Num Int16) .
4778 fromIntegral Int8 Int16 d1 d2 = int8ToInt16
4782 This rule does not need to be in the same file as <Function>fromIntegral</Function>,
4783 unlike the <Literal>SPECIALISE</Literal> pragmas which currently do (so that they
4784 have an original definition available to specialise).
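The same trick can be sketched with functions defined locally, which avoids depending on how the library's own <Function>fromIntegral</Function> is compiled. Here <Function>toDouble</Function> and <Function>intToDouble</Function> are illustrative names, and the <Literal>NOINLINE</Literal> pragma is an assumption made so that calls to <Function>toDouble</Function> stay visible to the rule.

```haskell
-- The general, expensive route goes via an intermediate Rational.
{-# NOINLINE toDouble #-}
toDouble :: Real a => a -> Double
toDouble = fromRational . toRational

-- A cheaper conversion that is only valid at type Int.
intToDouble :: Int -> Double
intToDouble = fromIntegral

-- No forall is needed: the rule applies wherever toDouble is used at
-- the type of intToDouble, that is, Int -> Double.
{-# RULES
"toDouble/Int" toDouble = intToDouble
  #-}

main :: IO ()
main = print (toDouble (3 :: Int), toDouble (0.5 :: Float))  -- prints (3.0,0.5)
```

As with the old <Literal>= blah</Literal> form, it is up to the programmer to make sure the two functions agree at that type.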
4790 <Title>Controlling what's going on</Title>
4798 Use <Option>-ddump-rules</Option> to see what transformation rules GHC is using.
4804 Use <Option>-ddump-simpl-stats</Option> to see what rules are being fired.
4805 If you add <Option>-dppr-debug</Option> you get a more detailed listing.
4811 The definition of (say) <Function>build</Function> in <FileName>PrelBase.lhs</FileName> looks like this:
4814 build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
4815 {-# INLINE build #-}
4819 Notice the <Literal>INLINE</Literal>! That prevents <Literal>(:)</Literal> from being inlined when compiling
4820 <Literal>PrelBase</Literal>, so that an importing module will “see” the <Literal>(:)</Literal>, and can
4821 match it on the LHS of a rule. <Literal>INLINE</Literal> prevents any inlining happening
4822 in the RHS of the <Literal>INLINE</Literal> thing. I regret the delicacy of this.
4829 In <Filename>ghc/lib/std/PrelBase.lhs</Filename> look at the rules for <Function>map</Function> to
4830 see how to write rules that will do fusion and yet give an efficient
4831 program even if fusion doesn't happen. More rules in <Filename>PrelList.lhs</Filename>.
4844 ;;; Local Variables: ***
4846 ;;; sgml-parent-document: ("users_guide.sgml" "book" "chapter" "sect1") ***