2 <IndexTerm><Primary>language, GHC</Primary></IndexTerm>
3 <IndexTerm><Primary>extensions, GHC</Primary></IndexTerm>
4 As with all known Haskell systems, GHC implements some extensions to
5 the language. To use them, you'll need to give a <Option>-fglasgow-exts</Option>
6 <IndexTerm><Primary>-fglasgow-exts option</Primary></IndexTerm> option.
10 Virtually all of the Glasgow extensions serve to give you access to
11 the underlying facilities with which we implement Haskell. Thus, you
12 can get at the Raw Iron, if you are willing to write some non-standard
13 code at a more primitive level. You need not be “stuck” on
14 performance because of the implementation costs of Haskell's
15 “high-level” features—you can always code “under” them. In an
16 extreme case, you can write all your time-critical code in C, and then
17 just glue it together with Haskell!
21 Executive summary of our extensions:
28 <Term>Unboxed types and primitive operations:</Term>
31 You can get right down to the raw machine types and operations;
32 included in this are “primitive arrays” (direct access to Big Wads
33 of Bytes). Please see <XRef LinkEnd="glasgow-unboxed"> and following.
39 <Term>Multi-parameter type classes:</Term>
42 GHC's type system supports extended type classes with multiple
43 parameters. Please see <XRef LinkEnd="multi-param-type-classes">.
49 <Term>Local universal quantification:</Term>
52 GHC's type system supports explicit universal quantification in
53 constructor fields and function arguments. This is useful for things
54 like defining <Literal>runST</Literal> from the state-thread world. See <XRef LinkEnd="universal-quantification">.
<Term>Existential quantification in data types:</Term>
63 Some or all of the type variables in a datatype declaration may be
64 <Emphasis>existentially quantified</Emphasis>. More details in <XRef LinkEnd="existential-quantification">.
70 <Term>Scoped type variables:</Term>
73 Scoped type variables enable the programmer to supply type signatures
74 for some nested declarations, where this would not be legal in Haskell
75 98. Details in <XRef LinkEnd="scoped-type-variables">.
81 <Term>Calling out to C:</Term>
84 Just what it sounds like. We provide <Emphasis>lots</Emphasis> of rope that you
85 can dangle around your neck. Please see <XRef LinkEnd="glasgow-ccalls">.
94 Pragmas are special instructions to the compiler placed in the source
95 file. The pragmas GHC supports are described in <XRef LinkEnd="pragmas">.
101 <Term>Rewrite rules:</Term>
104 The programmer can specify rewrite rules as part of the source program
105 (in a pragma). GHC applies these rewrite rules wherever it can.
106 Details in <XRef LinkEnd="rewrite-rules">.
114 Before you get too carried away working at the lowest level (e.g.,
115 sloshing <Literal>MutableByteArray#</Literal>s around your
116 program), you may wish to check if there are libraries that provide a
117 “Haskellised veneer” over the features you want. See the
118 accompanying library documentation.
121 <Sect1 id="primitives">
<Title>Unboxed types and primitive operations</Title>
124 <IndexTerm><Primary>PrelGHC module</Primary></IndexTerm>
127 This module defines all the types which are primitive in Glasgow
128 Haskell, and the operations provided for them.
131 <Sect2 id="glasgow-unboxed">
136 <IndexTerm><Primary>Unboxed types (Glasgow extension)</Primary></IndexTerm>
139 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
140 that values of that type are represented by a pointer to a heap
141 object. The representation of a Haskell <literal>Int</literal>, for
example, is a two-word heap object. An <firstterm>unboxed</firstterm>
type, however, is represented by the value itself; no pointers or heap
allocation are involved.
148 Unboxed types correspond to the “raw machine” types you
149 would use in C: <Literal>Int#</Literal> (long int),
150 <Literal>Double#</Literal> (double), <Literal>Addr#</Literal>
151 (void *), etc. The <Emphasis>primitive operations</Emphasis>
152 (PrimOps) on these types are what you might expect; e.g.,
153 <Literal>(+#)</Literal> is addition on
154 <Literal>Int#</Literal>s, and is the machine-addition that we all
155 know and love—usually one instruction.
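As a minimal sketch of what this looks like in source code, the function below unwraps two ordinary <Literal>Int</Literal>s, adds the raw <Literal>Int#</Literal> values with <Literal>(+#)</Literal>, and re-boxes the result. It assumes a recent GHC, where the primitives are imported from <Literal>GHC.Exts</Literal> and the <Literal>#</Literal> syntax is enabled by the <Literal>MagicHash</Literal> pragma rather than by <Option>-fglasgow-exts</Option>. The name <Literal>addUnboxed</Literal> is made up for illustration.

```haskell
{-# LANGUAGE MagicHash #-}

-- Assumption: a recent GHC, where the primitives are exported by GHC.Exts.
import GHC.Exts (Int (I#), (+#))

-- | Add two boxed Ints by operating directly on their unboxed payloads.
-- 'I#' is the constructor that wraps an 'Int#' into an 'Int'.
-- 'addUnboxed' is a hypothetical name used only for this example.
addUnboxed :: Int -> Int -> Int
addUnboxed (I# x) (I# y) = I# (x +# y)
```

Loaded into GHCi, <Literal>addUnboxed 2 3</Literal> should behave like ordinary <Literal>Int</Literal> addition.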
159 Primitive (unboxed) types cannot be defined in Haskell, and are
160 therefore built into the language and compiler. Primitive types are
161 always unlifted; that is, a value of a primitive type cannot be
162 bottom. We use the convention that primitive types, values, and
163 operations have a <Literal>#</Literal> suffix.
167 Primitive values are often represented by a simple bit-pattern, such
168 as <Literal>Int#</Literal>, <Literal>Float#</Literal>,
169 <Literal>Double#</Literal>. But this is not necessarily the case:
170 a primitive value might be represented by a pointer to a
171 heap-allocated object. Examples include
172 <Literal>Array#</Literal>, the type of primitive arrays. A
173 primitive array is heap-allocated because it is too big a value to fit
174 in a register, and would be too expensive to copy around; in a sense,
it is accidental that it is represented by a pointer. If a pointer
represents a primitive value, then it really does point to that value:
no unevaluated thunks, no indirections. Nothing other than the
primitive value can be at the other end of the pointer.
182 There are some restrictions on the use of primitive types, the main
183 one being that you can't pass a primitive value to a polymorphic
184 function or store one in a polymorphic data type. This rules out
185 things like <Literal>[Int#]</Literal> (i.e. lists of primitive
186 integers). The reason for this restriction is that polymorphic
187 arguments and constructor fields are assumed to be pointers: if an
188 unboxed integer is stored in one of these, the garbage collector would
189 attempt to follow it, leading to unpredictable space leaks. Or a
190 <Function>seq</Function> operation on the polymorphic component may
191 attempt to dereference the pointer, with disastrous results. Even
192 worse, the unboxed value might be larger than a pointer
193 (<Literal>Double#</Literal> for instance).
Nevertheless, a numerically-intensive program using unboxed types can
198 go a <Emphasis>lot</Emphasis> faster than its “standard”
199 counterpart—we saw a threefold speedup on one example.
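The kind of code meant here is typically a tight loop whose counters and accumulators never leave the unboxed world. The sketch below sums the integers from 1 to n that way. It assumes a recent GHC (<Literal>MagicHash</Literal>, <Literal>GHC.Exts</Literal>), in which comparisons such as <Literal>(>#)</Literal> return an <Literal>Int#</Literal> that is turned into a <Literal>Bool</Literal> with <Literal>isTrue#</Literal>. The names <Literal>sumTo</Literal> and <Literal>go</Literal> are illustrative only, and no particular speedup is claimed for this example, since the optimiser can often derive similar code from the boxed version by itself.

```haskell
{-# LANGUAGE MagicHash #-}

-- Assumption: a recent GHC, where (>#) yields an Int# (1# or 0#)
-- and isTrue# converts it to a Bool.
import GHC.Exts (Int (I#), Int#, isTrue#, (+#), (>#))

-- | Sum the integers 1..n with an unboxed accumulator and counter.
-- Hypothetical example function; overflow is not checked.
sumTo :: Int -> Int
sumTo (I# n) = I# (go 0# 1#)
  where
    -- acc: running total, i: next value to add
    go :: Int# -> Int# -> Int#
    go acc i
      | isTrue# (i ># n) = acc
      | otherwise        = go (acc +# i) (i +# 1#)
```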
204 <Sect2 id="unboxed-tuples">
<Title>Unboxed Tuples</Title>
Unboxed tuples aren't really exported by <Literal>PrelGHC</Literal>;
they're available by default with <Option>-fglasgow-exts</Option>. An
unboxed tuple looks like this:
223 where <Literal>e_1..e_n</Literal> are expressions of any
224 type (primitive or non-primitive). The type of an unboxed tuple looks
229 Unboxed tuples are used for functions that need to return multiple
230 values, but they avoid the heap allocation normally associated with
231 using fully-fledged tuples. When an unboxed tuple is returned, the
232 components are put directly into registers or on the stack; the
233 unboxed tuple itself does not have a composite representation. Many
234 of the primitive operations listed in this section return unboxed
239 There are some pretty stringent restrictions on the use of unboxed tuples:
248 Unboxed tuple types are subject to the same restrictions as
249 other unboxed types; i.e. they may not be stored in polymorphic data
250 structures or passed to polymorphic functions.
Unboxed tuples may only be constructed as the direct result of
a function, and may only be deconstructed with a <Literal>case</Literal> expression.
For example, the following are valid:
263 f x y = (# x+1, y-1 #)
264 g x = case f x x of { (# a, b #) -> a + b }
268 but the following are invalid:
282 No variable can have an unboxed tuple type. This is illegal:
286 f :: (# Int, Int #) -> (# Int, Int #)
291 because <VarName>x</VarName> has an unboxed tuple type.
301 Note: we may relax some of these restrictions in the future.
305 The <Literal>IO</Literal> and <Literal>ST</Literal> monads use unboxed tuples to avoid unnecessary
306 allocation during sequences of operations.
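To make the intended usage concrete, here is a small sketch of a function that returns two results in an unboxed tuple, together with a caller that takes the tuple apart with <Literal>case</Literal>. It assumes a recent GHC, where the syntax is enabled by the <Literal>UnboxedTuples</Literal> pragma. Both function names are invented for the example.

```haskell
{-# LANGUAGE UnboxedTuples #-}

-- Assumption: a recent GHC with the UnboxedTuples extension.

-- | Return quotient and remainder together without allocating a pair.
-- Hypothetical example function; the divisor must be non-zero.
divModPair :: Int -> Int -> (# Int, Int #)
divModPair n d = (# n `div` d, n `mod` d #)

-- | Consume the unboxed tuple directly in a case expression.
sumOfParts :: Int -> Int -> Int
sumOfParts n d =
  case divModPair n d of
    (# q, r #) -> q + r
```

The components of the tuple may be boxed, as here, or unboxed; the tuple itself is what avoids the heap allocation.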
312 <Title>Character and numeric types</Title>
315 <IndexTerm><Primary>character types, primitive</Primary></IndexTerm>
316 <IndexTerm><Primary>numeric types, primitive</Primary></IndexTerm>
317 <IndexTerm><Primary>integer types, primitive</Primary></IndexTerm>
318 <IndexTerm><Primary>floating point types, primitive</Primary></IndexTerm>
319 There are the following obvious primitive types:
335 <IndexTerm><Primary><literal>Char#</literal></Primary></IndexTerm>
336 <IndexTerm><Primary><literal>Int#</literal></Primary></IndexTerm>
337 <IndexTerm><Primary><literal>Word#</literal></Primary></IndexTerm>
338 <IndexTerm><Primary><literal>Addr#</literal></Primary></IndexTerm>
339 <IndexTerm><Primary><literal>Float#</literal></Primary></IndexTerm>
340 <IndexTerm><Primary><literal>Double#</literal></Primary></IndexTerm>
341 <IndexTerm><Primary><literal>Int64#</literal></Primary></IndexTerm>
342 <IndexTerm><Primary><literal>Word64#</literal></Primary></IndexTerm>
346 If you really want to know their exact equivalents in C, see
347 <Filename>ghc/includes/StgTypes.h</Filename> in the GHC source tree.
351 Literals for these types may be written as follows:
360 'a'# a Char#; for weird characters, use '\o<octal>'#
361 "a"# an Addr# (a `char *')
364 <IndexTerm><Primary>literals, primitive</Primary></IndexTerm>
365 <IndexTerm><Primary>constants, primitive</Primary></IndexTerm>
366 <IndexTerm><Primary>numbers, primitive</Primary></IndexTerm>
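The following sketch shows primitive literals in use, each wrapped back into its boxed counterpart so the value can be inspected. It assumes the literal syntax of a recent GHC with <Literal>MagicHash</Literal> enabled: a single <Literal>#</Literal> after an integer, character, or string literal, a single <Literal>#</Literal> after a fractional literal for <Literal>Float#</Literal>, and <Literal>##</Literal> for <Literal>Double#</Literal>. All top-level names are illustrative.

```haskell
{-# LANGUAGE MagicHash #-}

-- Assumption: literal syntax as accepted by a recent GHC with MagicHash.
import GHC.Exts
  (Char (C#), Double (D#), Float (F#), Int (I#), indexCharOffAddr#)

charLit :: Char
charLit = C# 'a'#        -- 'a'# is a Char#

intLit :: Int
intLit = I# 42#          -- 42# is an Int#

floatLit :: Float
floatLit = F# 2.5#       -- 2.5# is a Float#

doubleLit :: Double
doubleLit = D# 2.5##     -- 2.5## is a Double#

-- "hello"# is an Addr# pointing at static bytes; read the byte at offset 1.
secondByte :: Char
secondByte = C# (indexCharOffAddr# "hello"# 1#)
```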
372 <Title>Comparison operations</Title>
375 <IndexTerm><Primary>comparisons, primitive</Primary></IndexTerm>
376 <IndexTerm><Primary>operators, comparison</Primary></IndexTerm>
382 {>,>=,==,/=,<,<=}# :: Int# -> Int# -> Bool
384 {gt,ge,eq,ne,lt,le}Char# :: Char# -> Char# -> Bool
385 -- ditto for Word# and Addr#
388 <IndexTerm><Primary><literal>>#</literal></Primary></IndexTerm>
389 <IndexTerm><Primary><literal>>=#</literal></Primary></IndexTerm>
390 <IndexTerm><Primary><literal>==#</literal></Primary></IndexTerm>
391 <IndexTerm><Primary><literal>/=#</literal></Primary></IndexTerm>
392 <IndexTerm><Primary><literal><#</literal></Primary></IndexTerm>
393 <IndexTerm><Primary><literal><=#</literal></Primary></IndexTerm>
394 <IndexTerm><Primary><literal>gt{Char,Word,Addr}#</literal></Primary></IndexTerm>
395 <IndexTerm><Primary><literal>ge{Char,Word,Addr}#</literal></Primary></IndexTerm>
396 <IndexTerm><Primary><literal>eq{Char,Word,Addr}#</literal></Primary></IndexTerm>
397 <IndexTerm><Primary><literal>ne{Char,Word,Addr}#</literal></Primary></IndexTerm>
398 <IndexTerm><Primary><literal>lt{Char,Word,Addr}#</literal></Primary></IndexTerm>
399 <IndexTerm><Primary><literal>le{Char,Word,Addr}#</literal></Primary></IndexTerm>
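A short sketch of the comparison primitives in use follows. One caveat: the signatures above show a <Literal>Bool</Literal> result, whereas in recent GHC releases these operations return an <Literal>Int#</Literal> (1# for true, 0# for false) that is converted with <Literal>isTrue#</Literal>. The code below assumes that newer behaviour and the <Literal>GHC.Exts</Literal> module; the function names are hypothetical.

```haskell
{-# LANGUAGE MagicHash #-}

-- Assumption: a recent GHC, where comparison primops return Int#
-- and isTrue# turns that result into a Bool.
import GHC.Exts (Char (C#), Int (I#), geChar#, isTrue#, leChar#, (>#))

-- | Maximum of two Ints, compared on the unboxed values.
maxUnboxed :: Int -> Int -> Int
maxUnboxed (I# x) (I# y)
  | isTrue# (x ># y) = I# x
  | otherwise        = I# y

-- | True for the ASCII letters 'a' to 'z'.
isLowerAscii :: Char -> Bool
isLowerAscii (C# c) = isTrue# (geChar# c 'a'#) && isTrue# (leChar# c 'z'#)
```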
405 <Title>Primitive-character operations</Title>
408 <IndexTerm><Primary>characters, primitive operations</Primary></IndexTerm>
409 <IndexTerm><Primary>operators, primitive character</Primary></IndexTerm>
415 ord# :: Char# -> Int#
416 chr# :: Int# -> Char#
419 <IndexTerm><Primary><literal>ord#</literal></Primary></IndexTerm>
420 <IndexTerm><Primary><literal>chr#</literal></Primary></IndexTerm>
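As an illustration, the two conversions can be combined to step from one character to the next. This is only a sketch, assuming a recent GHC with <Literal>MagicHash</Literal> and <Literal>GHC.Exts</Literal>; the name <Literal>nextChar</Literal> is invented, and the code performs no range check, so it is the caller's job to stay within valid code points.

```haskell
{-# LANGUAGE MagicHash #-}

-- Assumption: a recent GHC, where the primitives are exported by GHC.Exts.
import GHC.Exts (Char (C#), chr#, ord#, (+#))

-- | The character with the next code point.
-- Hypothetical example; no check that the result is a valid character.
nextChar :: Char -> Char
nextChar (C# c) = C# (chr# (ord# c +# 1#))
```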
426 <Title>Primitive-<Literal>Int</Literal> operations</Title>
429 <IndexTerm><Primary>integers, primitive operations</Primary></IndexTerm>
430 <IndexTerm><Primary>operators, primitive integer</Primary></IndexTerm>
436 {+,-,*,quotInt,remInt,gcdInt}# :: Int# -> Int# -> Int#
437 negateInt# :: Int# -> Int#
439 iShiftL#, iShiftRA#, iShiftRL# :: Int# -> Int# -> Int#
440 -- shift left, right arithmetic, right logical
442 addIntC#, subIntC#, mulIntC# :: Int# -> Int# -> (# Int#, Int# #)
443 -- add, subtract, multiply with carry
446 <IndexTerm><Primary><literal>+#</literal></Primary></IndexTerm>
447 <IndexTerm><Primary><literal>-#</literal></Primary></IndexTerm>
448 <IndexTerm><Primary><literal>*#</literal></Primary></IndexTerm>
449 <IndexTerm><Primary><literal>quotInt#</literal></Primary></IndexTerm>
450 <IndexTerm><Primary><literal>remInt#</literal></Primary></IndexTerm>
451 <IndexTerm><Primary><literal>gcdInt#</literal></Primary></IndexTerm>
452 <IndexTerm><Primary><literal>iShiftL#</literal></Primary></IndexTerm>
453 <IndexTerm><Primary><literal>iShiftRA#</literal></Primary></IndexTerm>
454 <IndexTerm><Primary><literal>iShiftRL#</literal></Primary></IndexTerm>
455 <IndexTerm><Primary><literal>addIntC#</literal></Primary></IndexTerm>
456 <IndexTerm><Primary><literal>subIntC#</literal></Primary></IndexTerm>
457 <IndexTerm><Primary><literal>mulIntC#</literal></Primary></IndexTerm>
458 <IndexTerm><Primary>shift operations, integer</Primary></IndexTerm>
462 <Emphasis>Note:</Emphasis> No error/overflow checking!
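Since the plain operations do not check for overflow, the carry-reporting variants are the tool to use when overflow matters. The sketch below wraps <Literal>addIntC#</Literal> in a function that returns <Literal>Nothing</Literal> on overflow. It assumes the behaviour of a recent GHC, where the second component of the result is non-zero exactly when the addition overflowed; <Literal>safeAdd</Literal> is a made-up name.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

-- Assumption: a recent GHC, where addIntC# returns the truncated sum and
-- a flag that is non-zero when signed overflow occurred.
import GHC.Exts (Int (I#), addIntC#, isTrue#, (==#))

-- | Overflow-checked addition. Hypothetical example function.
safeAdd :: Int -> Int -> Maybe Int
safeAdd (I# x) (I# y) =
  case addIntC# x y of
    (# r, c #)
      | isTrue# (c ==# 0#) -> Just (I# r)
      | otherwise          -> Nothing
```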
468 <Title>Primitive-<Literal>Double</Literal> and <Literal>Float</Literal> operations</Title>
471 <IndexTerm><Primary>floating point numbers, primitive</Primary></IndexTerm>
472 <IndexTerm><Primary>operators, primitive floating point</Primary></IndexTerm>
478 {+,-,*,/}## :: Double# -> Double# -> Double#
479 {<,<=,==,/=,>=,>}## :: Double# -> Double# -> Bool
480 negateDouble# :: Double# -> Double#
481 double2Int# :: Double# -> Int#
482 int2Double# :: Int# -> Double#
{plus,minus,times,divide}Float# :: Float# -> Float# -> Float#
485 {gt,ge,eq,ne,lt,le}Float# :: Float# -> Float# -> Bool
486 negateFloat# :: Float# -> Float#
487 float2Int# :: Float# -> Int#
488 int2Float# :: Int# -> Float#
494 <IndexTerm><Primary><literal>+##</literal></Primary></IndexTerm>
495 <IndexTerm><Primary><literal>-##</literal></Primary></IndexTerm>
496 <IndexTerm><Primary><literal>*##</literal></Primary></IndexTerm>
497 <IndexTerm><Primary><literal>/##</literal></Primary></IndexTerm>
498 <IndexTerm><Primary><literal><##</literal></Primary></IndexTerm>
499 <IndexTerm><Primary><literal><=##</literal></Primary></IndexTerm>
500 <IndexTerm><Primary><literal>==##</literal></Primary></IndexTerm>
<IndexTerm><Primary><literal>/=##</literal></Primary></IndexTerm>
502 <IndexTerm><Primary><literal>>=##</literal></Primary></IndexTerm>
503 <IndexTerm><Primary><literal>>##</literal></Primary></IndexTerm>
504 <IndexTerm><Primary><literal>negateDouble#</literal></Primary></IndexTerm>
505 <IndexTerm><Primary><literal>double2Int#</literal></Primary></IndexTerm>
506 <IndexTerm><Primary><literal>int2Double#</literal></Primary></IndexTerm>
510 <IndexTerm><Primary><literal>plusFloat#</literal></Primary></IndexTerm>
511 <IndexTerm><Primary><literal>minusFloat#</literal></Primary></IndexTerm>
512 <IndexTerm><Primary><literal>timesFloat#</literal></Primary></IndexTerm>
513 <IndexTerm><Primary><literal>divideFloat#</literal></Primary></IndexTerm>
514 <IndexTerm><Primary><literal>gtFloat#</literal></Primary></IndexTerm>
515 <IndexTerm><Primary><literal>geFloat#</literal></Primary></IndexTerm>
516 <IndexTerm><Primary><literal>eqFloat#</literal></Primary></IndexTerm>
517 <IndexTerm><Primary><literal>neFloat#</literal></Primary></IndexTerm>
518 <IndexTerm><Primary><literal>ltFloat#</literal></Primary></IndexTerm>
519 <IndexTerm><Primary><literal>leFloat#</literal></Primary></IndexTerm>
520 <IndexTerm><Primary><literal>negateFloat#</literal></Primary></IndexTerm>
521 <IndexTerm><Primary><literal>float2Int#</literal></Primary></IndexTerm>
522 <IndexTerm><Primary><literal>int2Float#</literal></Primary></IndexTerm>
And a full complement of elementary and trigonometric functions:
532 expDouble# :: Double# -> Double#
533 logDouble# :: Double# -> Double#
534 sqrtDouble# :: Double# -> Double#
535 sinDouble# :: Double# -> Double#
536 cosDouble# :: Double# -> Double#
537 tanDouble# :: Double# -> Double#
538 asinDouble# :: Double# -> Double#
539 acosDouble# :: Double# -> Double#
540 atanDouble# :: Double# -> Double#
541 sinhDouble# :: Double# -> Double#
542 coshDouble# :: Double# -> Double#
543 tanhDouble# :: Double# -> Double#
544 powerDouble# :: Double# -> Double# -> Double#
547 <IndexTerm><Primary>trigonometric functions, primitive</Primary></IndexTerm>
551 similarly for <Literal>Float#</Literal>.
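The arithmetic operators and the functions listed above compose in the obvious way. As a sketch, the length of a hypotenuse can be computed entirely on <Literal>Double#</Literal> values. The example assumes a recent GHC with <Literal>MagicHash</Literal> and <Literal>GHC.Exts</Literal>, and the name <Literal>hypotenuse</Literal> is illustrative.

```haskell
{-# LANGUAGE MagicHash #-}

-- Assumption: a recent GHC, where the primitives are exported by GHC.Exts.
import GHC.Exts (Double (D#), sqrtDouble#, (*##), (+##))

-- | sqrt (x*x + y*y), computed on unboxed doubles.
-- Hypothetical example; no protection against intermediate overflow.
hypotenuse :: Double -> Double -> Double
hypotenuse (D# x) (D# y) = D# (sqrtDouble# ((x *## x) +## (y *## y)))
```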
555 There are two coercion functions for <Literal>Float#</Literal>/<Literal>Double#</Literal>:
561 float2Double# :: Float# -> Double#
562 double2Float# :: Double# -> Float#
565 <IndexTerm><Primary><literal>float2Double#</literal></Primary></IndexTerm>
566 <IndexTerm><Primary><literal>double2Float#</literal></Primary></IndexTerm>
570 The primitive version of <Function>decodeDouble</Function>
571 (<Function>encodeDouble</Function> is implemented as an external C
578 decodeDouble# :: Double# -> PrelNum.ReturnIntAndGMP
581 <IndexTerm><Primary><literal>encodeDouble#</literal></Primary></IndexTerm>
582 <IndexTerm><Primary><literal>decodeDouble#</literal></Primary></IndexTerm>
586 (And the same for <Literal>Float#</Literal>s.)
591 <Sect2 id="integer-operations">
<Title>Operations on/for <Literal>Integers</Literal> (interface to GMP)</Title>
596 <IndexTerm><Primary>arbitrary precision integers</Primary></IndexTerm>
597 <IndexTerm><Primary>Integer, operations on</Primary></IndexTerm>
601 We implement <Literal>Integers</Literal> (arbitrary-precision
602 integers) using the GNU multiple-precision (GMP) package (version
607 The data type for <Literal>Integer</Literal> is either a small
608 integer, represented by an <Literal>Int</Literal>, or a large integer
609 represented using the pieces required by GMP's
610 <Literal>MP_INT</Literal> in <Filename>gmp.h</Filename> (see
611 <Filename>gmp.info</Filename> in
612 <Filename>ghc/includes/runtime/gmp</Filename>). It comes out as:
618 data Integer = S# Int# -- small integers
619 | J# Int# ByteArray# -- large integers
622 <IndexTerm><Primary>Integer type</Primary></IndexTerm> The primitive
623 ops to support large <Literal>Integers</Literal> use the
624 “pieces” of the representation, and are as follows:
630 negateInteger# :: Int# -> ByteArray# -> Integer
632 {plus,minus,times}Integer#, gcdInteger#,
633 quotInteger#, remInteger#, divExactInteger#
634 :: Int# -> ByteArray#
635 -> Int# -> ByteArray#
636 -> (# Int#, ByteArray# #)
639 :: Int# -> ByteArray#
640 -> Int# -> ByteArray#
641 -> Int# -- -1 for <; 0 for ==; +1 for >
644 :: Int# -> ByteArray#
646 -> Int# -- -1 for <; 0 for ==; +1 for >
649 :: Int# -> ByteArray#
653 divModInteger#, quotRemInteger#
654 :: Int# -> ByteArray#
655 -> Int# -> ByteArray#
656 -> (# Int#, ByteArray#,
659 integer2Int# :: Int# -> ByteArray# -> Int#
661 int2Integer# :: Int# -> Integer -- NB: no error-checking on these two!
662 word2Integer# :: Word# -> Integer
664 addr2Integer# :: Addr# -> Integer
665 -- the Addr# is taken to be a `char *' string
666 -- to be converted into an Integer.
669 <IndexTerm><Primary><literal>negateInteger#</literal></Primary></IndexTerm>
670 <IndexTerm><Primary><literal>plusInteger#</literal></Primary></IndexTerm>
671 <IndexTerm><Primary><literal>minusInteger#</literal></Primary></IndexTerm>
672 <IndexTerm><Primary><literal>timesInteger#</literal></Primary></IndexTerm>
673 <IndexTerm><Primary><literal>quotInteger#</literal></Primary></IndexTerm>
674 <IndexTerm><Primary><literal>remInteger#</literal></Primary></IndexTerm>
675 <IndexTerm><Primary><literal>gcdInteger#</literal></Primary></IndexTerm>
676 <IndexTerm><Primary><literal>gcdIntegerInt#</literal></Primary></IndexTerm>
677 <IndexTerm><Primary><literal>divExactInteger#</literal></Primary></IndexTerm>
678 <IndexTerm><Primary><literal>cmpInteger#</literal></Primary></IndexTerm>
679 <IndexTerm><Primary><literal>divModInteger#</literal></Primary></IndexTerm>
680 <IndexTerm><Primary><literal>quotRemInteger#</literal></Primary></IndexTerm>
681 <IndexTerm><Primary><literal>integer2Int#</literal></Primary></IndexTerm>
682 <IndexTerm><Primary><literal>int2Integer#</literal></Primary></IndexTerm>
683 <IndexTerm><Primary><literal>word2Integer#</literal></Primary></IndexTerm>
684 <IndexTerm><Primary><literal>addr2Integer#</literal></Primary></IndexTerm>
690 <Title>Words and addresses</Title>
693 <IndexTerm><Primary>word, primitive type</Primary></IndexTerm>
694 <IndexTerm><Primary>address, primitive type</Primary></IndexTerm>
695 <IndexTerm><Primary>unsigned integer, primitive type</Primary></IndexTerm>
696 <IndexTerm><Primary>pointer, primitive type</Primary></IndexTerm>
A <Literal>Word#</Literal> is used for bit-twiddling operations.
It is the same size as an <Literal>Int#</Literal>, but has neither a sign
nor any arithmetic operations.
705 type Word# -- Same size/etc as Int# but *unsigned*
706 type Addr# -- A pointer from outside the "Haskell world" (from C, probably);
707 -- described under "arrays"
710 <IndexTerm><Primary><literal>Word#</literal></Primary></IndexTerm>
711 <IndexTerm><Primary><literal>Addr#</literal></Primary></IndexTerm>
715 <Literal>Word#</Literal>s and <Literal>Addr#</Literal>s have
716 the usual comparison operations. Other
717 unboxed-<Literal>Word</Literal> ops (bit-twiddling and coercions):
723 {gt,ge,eq,ne,lt,le}Word# :: Word# -> Word# -> Bool
725 and#, or#, xor# :: Word# -> Word# -> Word#
728 quotWord#, remWord# :: Word# -> Word# -> Word#
729 -- word (i.e. unsigned) versions are different from int
730 -- versions, so we have to provide these explicitly.
732 not# :: Word# -> Word#
734 shiftL#, shiftRL# :: Word# -> Int# -> Word#
735 -- shift left, right logical
737 int2Word# :: Int# -> Word# -- just a cast, really
738 word2Int# :: Word# -> Int#
741 <IndexTerm><Primary>bit operations, Word and Addr</Primary></IndexTerm>
742 <IndexTerm><Primary><literal>gtWord#</literal></Primary></IndexTerm>
743 <IndexTerm><Primary><literal>geWord#</literal></Primary></IndexTerm>
744 <IndexTerm><Primary><literal>eqWord#</literal></Primary></IndexTerm>
745 <IndexTerm><Primary><literal>neWord#</literal></Primary></IndexTerm>
746 <IndexTerm><Primary><literal>ltWord#</literal></Primary></IndexTerm>
747 <IndexTerm><Primary><literal>leWord#</literal></Primary></IndexTerm>
748 <IndexTerm><Primary><literal>and#</literal></Primary></IndexTerm>
749 <IndexTerm><Primary><literal>or#</literal></Primary></IndexTerm>
750 <IndexTerm><Primary><literal>xor#</literal></Primary></IndexTerm>
751 <IndexTerm><Primary><literal>not#</literal></Primary></IndexTerm>
752 <IndexTerm><Primary><literal>quotWord#</literal></Primary></IndexTerm>
753 <IndexTerm><Primary><literal>remWord#</literal></Primary></IndexTerm>
754 <IndexTerm><Primary><literal>shiftL#</literal></Primary></IndexTerm>
755 <IndexTerm><Primary><literal>shiftRA#</literal></Primary></IndexTerm>
756 <IndexTerm><Primary><literal>shiftRL#</literal></Primary></IndexTerm>
757 <IndexTerm><Primary><literal>int2Word#</literal></Primary></IndexTerm>
758 <IndexTerm><Primary><literal>word2Int#</literal></Primary></IndexTerm>
762 Unboxed-<Literal>Addr</Literal> ops (C casts, really):
765 {gt,ge,eq,ne,lt,le}Addr# :: Addr# -> Addr# -> Bool
767 int2Addr# :: Int# -> Addr#
768 addr2Int# :: Addr# -> Int#
769 addr2Integer# :: Addr# -> (# Int#, ByteArray# #)
772 <IndexTerm><Primary><literal>gtAddr#</literal></Primary></IndexTerm>
773 <IndexTerm><Primary><literal>geAddr#</literal></Primary></IndexTerm>
774 <IndexTerm><Primary><literal>eqAddr#</literal></Primary></IndexTerm>
775 <IndexTerm><Primary><literal>neAddr#</literal></Primary></IndexTerm>
776 <IndexTerm><Primary><literal>ltAddr#</literal></Primary></IndexTerm>
777 <IndexTerm><Primary><literal>leAddr#</literal></Primary></IndexTerm>
778 <IndexTerm><Primary><literal>int2Addr#</literal></Primary></IndexTerm>
779 <IndexTerm><Primary><literal>addr2Int#</literal></Primary></IndexTerm>
780 <IndexTerm><Primary><literal>addr2Integer#</literal></Primary></IndexTerm>
784 The casts between <Literal>Int#</Literal>,
785 <Literal>Word#</Literal> and <Literal>Addr#</Literal>
786 correspond to null operations at the machine level, but are required
787 to keep the Haskell type checker happy.
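A typical use of these casts is to move an <Literal>Int#</Literal> into the <Literal>Word#</Literal> world, do some bit-twiddling, and move the result back. The sketch below masks out the low byte of an integer that way. It assumes a recent GHC with <Literal>MagicHash</Literal>, where <Literal>255##</Literal> is a <Literal>Word#</Literal> literal; <Literal>lowByte</Literal> is a hypothetical name.

```haskell
{-# LANGUAGE MagicHash #-}

-- Assumption: a recent GHC; 255## is a Word# literal under MagicHash.
import GHC.Exts (Int (I#), and#, int2Word#, word2Int#)

-- | Keep only the least significant eight bits of an Int.
-- The two casts cost nothing at run time; they only change the type.
lowByte :: Int -> Int
lowByte (I# i) = I# (word2Int# (int2Word# i `and#` 255##))
```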
791 Operations for indexing off of C pointers
792 (<Literal>Addr#</Literal>s) to snatch values are listed under
793 “arrays”.
799 <Title>Arrays</Title>
802 <IndexTerm><Primary>arrays, primitive</Primary></IndexTerm>
806 The type <Literal>Array# elt</Literal> is the type of primitive,
807 unpointed arrays of values of type <Literal>elt</Literal>.
816 <IndexTerm><Primary><literal>Array#</literal></Primary></IndexTerm>
820 <Literal>Array#</Literal> is more primitive than a Haskell
821 array—indeed, the Haskell <Literal>Array</Literal> interface is
822 implemented using <Literal>Array#</Literal>—in that an
823 <Literal>Array#</Literal> is indexed only by
824 <Literal>Int#</Literal>s, starting at zero. It is also more
825 primitive by virtue of being unboxed. That doesn't mean that it isn't
826 a heap-allocated object—of course, it is. Rather, being unboxed
827 means that it is represented by a pointer to the array itself, and not
828 to a thunk which will evaluate to the array (or to bottom). The
829 components of an <Literal>Array#</Literal> are themselves boxed.
833 The type <Literal>ByteArray#</Literal> is similar to
834 <Literal>Array#</Literal>, except that it contains just a string
835 of (non-pointer) bytes.
844 <IndexTerm><Primary><literal>ByteArray#</literal></Primary></IndexTerm>
848 Arrays of these types are useful when a Haskell program wishes to
849 construct a value to pass to a C procedure. It is also possible to use
850 them to build (say) arrays of unboxed characters for internal use in a
851 Haskell program. Given these uses, <Literal>ByteArray#</Literal>
852 is deliberately a bit vague about the type of its components.
853 Operations are provided to extract values of type
854 <Literal>Char#</Literal>, <Literal>Int#</Literal>,
855 <Literal>Float#</Literal>, <Literal>Double#</Literal>, and
856 <Literal>Addr#</Literal> from arbitrary offsets within a
<Literal>ByteArray#</Literal>. (For type
<Literal>Foo#</Literal>, the <Emphasis>i</Emphasis>th offset gets you the <Emphasis>i</Emphasis>th
<Literal>Foo#</Literal>, not the <Literal>Foo#</Literal> at
byte-position <Emphasis>i</Emphasis>. Mumble.) (If you want a
861 <Literal>Word#</Literal>, grab an <Literal>Int#</Literal>,
866 Lastly, we have static byte-arrays, of type
867 <Literal>Addr#</Literal> [mentioned previously]. (Remember
the duality between arrays and pointers in C.) Arrays of this type
869 are represented by a pointer to an array in the world outside Haskell,
870 so this pointer is not followed by the garbage collector. In other
871 respects they are just like <Literal>ByteArray#</Literal>. They
872 are only needed in order to pass values from C to Haskell.
878 <Title>Reading and writing</Title>
881 Primitive arrays are linear, and indexed starting at zero.
885 The size and indices of a <Literal>ByteArray#</Literal>, <Literal>Addr#</Literal>, and
886 <Literal>MutableByteArray#</Literal> are all in bytes. It's up to the program to
887 calculate the correct byte offset from the start of the array. This
888 allows a <Literal>ByteArray#</Literal> to contain a mixture of values of different
889 type, which is often needed when preparing data for and unpicking
results from C. (Note: this is not true of indices. WDP 95/09)
894 <Emphasis>Should we provide some <Literal>sizeOfDouble#</Literal> constants?</Emphasis>
898 Out-of-range errors on indexing should be caught by the code which
899 uses the primitive operation; the primitive operations themselves do
900 <Emphasis>not</Emphasis> check for out-of-range indexes. The intention is that the
901 primitive ops compile to one machine instruction or thereabouts.
905 We use the terms “reading” and “writing” to refer to accessing
906 <Emphasis>mutable</Emphasis> arrays (see <XRef LinkEnd="sect-mutable">), and
907 “indexing” to refer to reading a value from an <Emphasis>immutable</Emphasis>
912 Immutable byte arrays are straightforward to index (all indices in bytes):
915 indexCharArray# :: ByteArray# -> Int# -> Char#
916 indexIntArray# :: ByteArray# -> Int# -> Int#
917 indexAddrArray# :: ByteArray# -> Int# -> Addr#
918 indexFloatArray# :: ByteArray# -> Int# -> Float#
919 indexDoubleArray# :: ByteArray# -> Int# -> Double#
921 indexCharOffAddr# :: Addr# -> Int# -> Char#
922 indexIntOffAddr# :: Addr# -> Int# -> Int#
923 indexFloatOffAddr# :: Addr# -> Int# -> Float#
924 indexDoubleOffAddr# :: Addr# -> Int# -> Double#
925 indexAddrOffAddr# :: Addr# -> Int# -> Addr#
926 -- Get an Addr# from an Addr# offset
929 <IndexTerm><Primary><literal>indexCharArray#</literal></Primary></IndexTerm>
930 <IndexTerm><Primary><literal>indexIntArray#</literal></Primary></IndexTerm>
931 <IndexTerm><Primary><literal>indexAddrArray#</literal></Primary></IndexTerm>
932 <IndexTerm><Primary><literal>indexFloatArray#</literal></Primary></IndexTerm>
933 <IndexTerm><Primary><literal>indexDoubleArray#</literal></Primary></IndexTerm>
934 <IndexTerm><Primary><literal>indexCharOffAddr#</literal></Primary></IndexTerm>
935 <IndexTerm><Primary><literal>indexIntOffAddr#</literal></Primary></IndexTerm>
936 <IndexTerm><Primary><literal>indexFloatOffAddr#</literal></Primary></IndexTerm>
937 <IndexTerm><Primary><literal>indexDoubleOffAddr#</literal></Primary></IndexTerm>
938 <IndexTerm><Primary><literal>indexAddrOffAddr#</literal></Primary></IndexTerm>
942 The last of these, <Function>indexAddrOffAddr#</Function>, extracts an <Literal>Addr#</Literal> using an offset
943 from another <Literal>Addr#</Literal>, thereby providing the ability to follow a chain of
Something a bit more interesting goes on when indexing arrays of boxed
objects, because the result is simply the boxed object. So presumably
it should be entered, since we do not usually return an unevaluated
object. This is a pain: primitive ops aren't supposed to do
complicated things like enter objects. The current solution is to
return a single-element unboxed tuple (see <XRef LinkEnd="unboxed-tuples">).
959 indexArray# :: Array# elt -> Int# -> (# elt #)
962 <IndexTerm><Primary><literal>indexArray#</literal></Primary></IndexTerm>
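The single-element unboxed tuple is taken apart with <Literal>case</Literal>, which performs the array lookup without forcing the element itself. The sketch below builds a small array, freezes it, and indexes it. It assumes a recent GHC (<Literal>MagicHash</Literal>, <Literal>UnboxedTuples</Literal>), uses <Literal>unsafeFreezeArray#</Literal> from <Literal>GHC.Exts</Literal> to obtain the immutable array, and wraps the state-passing code with the <Literal>ST</Literal> constructor from <Literal>GHC.ST</Literal>. The name <Literal>indexDemo</Literal> is invented for the example.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

-- Assumption: a recent GHC; newArray#, unsafeFreezeArray# and indexArray#
-- come from GHC.Exts, and GHC.ST exposes the ST constructor.
import GHC.Exts (indexArray#, newArray#, unsafeFreezeArray#)
import GHC.ST (ST (ST), runST)

-- | Fill a three-element array with x, freeze it, and fetch slot 1.
-- Hypothetical example function; indices are not range-checked.
indexDemo :: a -> a
indexDemo x = runST (ST (\s0 ->
  case newArray# 3# x s0 of
    (# s1, marr #) ->
      case unsafeFreezeArray# marr s1 of
        (# s2, arr #) ->
          -- The case on (# e #) does the lookup but leaves e unevaluated.
          case indexArray# arr 1# of
            (# e #) -> (# s2, e #)))
```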
968 <Title>The state type</Title>
971 <IndexTerm><Primary><literal>state, primitive type</literal></Primary></IndexTerm>
972 <IndexTerm><Primary><literal>State#</literal></Primary></IndexTerm>
976 The primitive type <Literal>State#</Literal> represents the state of a state
977 transformer. It is parameterised on the desired type of state, which
978 serves to keep states from distinct threads distinct from one another.
979 But the <Emphasis>only</Emphasis> effect of this parameterisation is in the type
980 system: all values of type <Literal>State#</Literal> are represented in the same way.
981 Indeed, they are all represented by nothing at all! The code
982 generator “knows” to generate no code, and allocate no registers
983 etc, for primitive states.
995 The type <Literal>GHC.RealWorld</Literal> is truly opaque: there are no values defined
996 of this type, and no operations over it. It is “primitive” in that
997 sense - but it is <Emphasis>not unlifted!</Emphasis> Its only role in life is to be
998 the type which distinguishes the <Literal>IO</Literal> state transformer.
1012 <Title>State of the world</Title>
1015 A single, primitive, value of type <Literal>State# RealWorld</Literal> is provided.
1021 realWorld# :: State# RealWorld
1024 <IndexTerm><Primary>realWorld# state object</Primary></IndexTerm>
1028 (Note: in the compiler, not a <Literal>PrimOp</Literal>; just a mucho magic
1029 <Literal>Id</Literal>. Exported from <Literal>GHC</Literal>, though).
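To see how the state token is used, it helps to write the <Literal>IO</Literal> plumbing by hand. The sketch below assumes the representation used by recent GHC releases, in which an <Literal>IO</Literal> action is a function from <Literal>State# RealWorld</Literal> to an unboxed tuple of a new state and a result, with the <Literal>IO</Literal> constructor imported from <Literal>GHC.IO</Literal>. The names <Literal>ioFunction</Literal>, <Literal>unitIO</Literal> and <Literal>bindIO</Literal> are made up here and are not part of any library.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}

-- Assumption: a recent GHC, where
--   newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))
-- and the constructor is exported by GHC.IO.
import GHC.Exts (RealWorld, State#)
import GHC.IO (IO (IO))

-- | The state-passing function hidden inside an IO action.
ioFunction :: IO a -> State# RealWorld -> (# State# RealWorld, a #)
ioFunction (IO m) = m

-- | An action that returns x and leaves the world untouched.
unitIO :: a -> IO a
unitIO x = IO (\s -> (# s, x #))

-- | Sequence two actions by threading the state token through both.
bindIO :: IO a -> (a -> IO b) -> IO b
bindIO m k = IO (\s0 ->
  case ioFunction m s0 of
    (# s1, a #) -> ioFunction (k a) s1)
```

The state token carries no data at run time; its only job is to force the second action to depend on the first.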
1034 <Sect2 id="sect-mutable">
1035 <Title>Mutable arrays</Title>
1038 <IndexTerm><Primary>mutable arrays</Primary></IndexTerm>
1039 <IndexTerm><Primary>arrays, mutable</Primary></IndexTerm>
Corresponding to <Literal>Array#</Literal> and <Literal>ByteArray#</Literal>, we have
mutable versions of each type. In each case, the representation is a
pointer to a suitable block of (mutable) heap-allocated storage.
1048 type MutableArray# s elt
1049 type MutableByteArray# s
1052 <IndexTerm><Primary><literal>MutableArray#</literal></Primary></IndexTerm>
1053 <IndexTerm><Primary><literal>MutableByteArray#</literal></Primary></IndexTerm>
1057 <Title>Allocation</Title>
1060 <IndexTerm><Primary>mutable arrays, allocation</Primary></IndexTerm>
1061 <IndexTerm><Primary>arrays, allocation</Primary></IndexTerm>
1062 <IndexTerm><Primary>allocation, of mutable arrays</Primary></IndexTerm>
1066 Mutable arrays can be allocated. Only pointer-arrays are initialised;
1067 arrays of non-pointers are filled in by “user code” rather than by
1068 the array-allocation primitive. Reason: only the pointer case has to
1069 worry about GC striking with a partly-initialised array.
1075 newArray# :: Int# -> elt -> State# s -> (# State# s, MutableArray# s elt #)
1077 newCharArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
1078 newIntArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
1079 newAddrArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
1080 newFloatArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
1081 newDoubleArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s #)
1084 <IndexTerm><Primary><literal>newArray#</literal></Primary></IndexTerm>
1085 <IndexTerm><Primary><literal>newCharArray#</literal></Primary></IndexTerm>
1086 <IndexTerm><Primary><literal>newIntArray#</literal></Primary></IndexTerm>
1087 <IndexTerm><Primary><literal>newAddrArray#</literal></Primary></IndexTerm>
1088 <IndexTerm><Primary><literal>newFloatArray#</literal></Primary></IndexTerm>
1089 <IndexTerm><Primary><literal>newDoubleArray#</literal></Primary></IndexTerm>
1093 The size of a <Literal>ByteArray#</Literal> is given in bytes.
1099 <Title>Reading and writing</Title>
1102 <IndexTerm><Primary>arrays, reading and writing</Primary></IndexTerm>
1108 readArray# :: MutableArray# s elt -> Int# -> State# s -> (# State# s, elt #)
1109 readCharArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Char# #)
1110 readIntArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #)
1111 readAddrArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Addr# #)
1112 readFloatArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Float# #)
1113 readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Double# #)
1115 writeArray# :: MutableArray# s elt -> Int# -> elt -> State# s -> State# s
1116 writeCharArray# :: MutableByteArray# s -> Int# -> Char# -> State# s -> State# s
1117 writeIntArray# :: MutableByteArray# s -> Int# -> Int# -> State# s -> State# s
1118 writeAddrArray# :: MutableByteArray# s -> Int# -> Addr# -> State# s -> State# s
1119 writeFloatArray# :: MutableByteArray# s -> Int# -> Float# -> State# s -> State# s
1120 writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s
1123 <IndexTerm><Primary><literal>readArray#</literal></Primary></IndexTerm>
1124 <IndexTerm><Primary><literal>readCharArray#</literal></Primary></IndexTerm>
1125 <IndexTerm><Primary><literal>readIntArray#</literal></Primary></IndexTerm>
1126 <IndexTerm><Primary><literal>readAddrArray#</literal></Primary></IndexTerm>
1127 <IndexTerm><Primary><literal>readFloatArray#</literal></Primary></IndexTerm>
1128 <IndexTerm><Primary><literal>readDoubleArray#</literal></Primary></IndexTerm>
1129 <IndexTerm><Primary><literal>writeArray#</literal></Primary></IndexTerm>
1130 <IndexTerm><Primary><literal>writeCharArray#</literal></Primary></IndexTerm>
1131 <IndexTerm><Primary><literal>writeIntArray#</literal></Primary></IndexTerm>
1132 <IndexTerm><Primary><literal>writeAddrArray#</literal></Primary></IndexTerm>
1133 <IndexTerm><Primary><literal>writeFloatArray#</literal></Primary></IndexTerm>
1134 <IndexTerm><Primary><literal>writeDoubleArray#</literal></Primary></IndexTerm>
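The allocate/write/read cycle can be sketched as follows. This is an illustration under assumptions, not the library's own code: it uses the names a recent GHC exports from <Literal>GHC.Exts</Literal> (in particular <Literal>newByteArray#</Literal>, which takes a size in bytes, in place of the per-type allocators listed above), and <Literal>sumSquares</Literal>, <Literal>fill</Literal> and <Literal>total</Literal> are hypothetical helper names.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main where

import Control.Monad.ST (runST)
import Foreign.Storable (sizeOf)
import GHC.Exts
import GHC.ST (ST (..))

-- Sum of the squares 0^2 .. (n-1)^2, computed through a mutable byte array.
-- Assumes n >= 0.
sumSquares :: Int -> Int
sumSquares (I# n) = case sizeOf (0 :: Int) of
  I# wordBytes -> runST (ST (\s0 ->
    case newByteArray# (n *# wordBytes) s0 of   -- the size is given in bytes
      (# s1, arr #) ->
        case total arr 0# 0# (fill arr 0# s1) of
          (# s2, acc #) -> (# s2, I# acc #)))
  where
    -- Write i*i into slot i; indices count Int-sized slots, not bytes.
    fill arr i s
      | isTrue# (i >=# n) = s
      | otherwise         = fill arr (i +# 1#) (writeIntArray# arr i (i *# i) s)
    -- Read every slot back, accumulating the sum.
    total arr i acc s
      | isTrue# (i >=# n) = (# s, acc #)
      | otherwise         = case readIntArray# arr i s of
          (# s', x #) -> total arr (i +# 1#) (acc +# x) s'

main :: IO ()
main = print (sumSquares 4)
```

Note that the write primitives return only a new state, while the read primitives return a state paired with the value read.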
1140 <Title>Equality</Title>
1143 <IndexTerm><Primary>arrays, testing for equality</Primary></IndexTerm>
1147 One can take “equality” of mutable arrays. What is compared is the
1148 <Emphasis>name</Emphasis> or reference to the mutable array, not its contents.
1154 sameMutableArray# :: MutableArray# s elt -> MutableArray# s elt -> Bool
1155 sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
1158 <IndexTerm><Primary><literal>sameMutableArray#</literal></Primary></IndexTerm>
1159 <IndexTerm><Primary><literal>sameMutableByteArray#</literal></Primary></IndexTerm>
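A minimal sketch of reference equality, assuming a recent GHC in which <Literal>sameMutableByteArray#</Literal> returns an <Literal>Int#</Literal> (converted here with <Literal>isTrue#</Literal>) rather than the <Literal>Bool</Literal> shown above. The name <Literal>identityCheck</Literal> is hypothetical.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main where

import Control.Monad.ST (runST)
import GHC.Exts
import GHC.ST (ST (..))

-- Allocate two separate arrays of the same size and compare references:
-- first an array with itself, then the two different arrays.
identityCheck :: (Bool, Bool)
identityCheck = runST (ST (\s0 ->
  case newByteArray# 8# s0 of
    (# s1, a #) -> case newByteArray# 8# s1 of
      (# s2, b #) ->
        (# s2, ( isTrue# (sameMutableByteArray# a a)
               , isTrue# (sameMutableByteArray# a b) ) #)))

main :: IO ()
main = print identityCheck
```

The contents of the two arrays are never inspected; only their identities are compared.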
1165 <Title>Freezing mutable arrays</Title>
1168 <IndexTerm><Primary>arrays, freezing mutable</Primary></IndexTerm>
1169 <IndexTerm><Primary>freezing mutable arrays</Primary></IndexTerm>
1170 <IndexTerm><Primary>mutable arrays, freezing</Primary></IndexTerm>
1174 Only unsafe-freeze has a primitive. (Safe freeze is done directly in Haskell
1175 by copying the array and then using <Function>unsafeFreeze</Function>.)
1181 unsafeFreezeArray# :: MutableArray# s elt -> State# s -> (# State# s, Array# elt #)
1182 unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> (# State# s, ByteArray# #)
1185 <IndexTerm><Primary><literal>unsafeFreezeArray#</literal></Primary></IndexTerm>
1186 <IndexTerm><Primary><literal>unsafeFreezeByteArray#</literal></Primary></IndexTerm>
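The build-then-freeze pattern might look like the sketch below. It assumes the primitives exported by a recent <Literal>GHC.Exts</Literal>; the wrapper type <Literal>Frozen</Literal> and the functions <Literal>pair</Literal> and <Literal>slot</Literal> are hypothetical names introduced only for this illustration.

```haskell
{-# LANGUAGE MagicHash, UnboxedTuples #-}
module Main where

import Control.Monad.ST (runST)
import GHC.Exts
import GHC.ST (ST (..))

-- ByteArray# is unlifted, so it is wrapped in a constructor to leave ST.
data Frozen = Frozen ByteArray#

-- Fill a two-slot array in place, then freeze it without copying.
-- 16 bytes is enough for two Ints on both 32-bit and 64-bit targets.
pair :: Int -> Int -> Frozen
pair (I# x) (I# y) = runST (ST (\s0 ->
  case newByteArray# 16# s0 of
    (# s1, marr #) ->
      case writeIntArray# marr 1# y (writeIntArray# marr 0# x s1) of
        s2 -> case unsafeFreezeByteArray# marr s2 of
          (# s3, arr #) -> (# s3, Frozen arr #)))

-- Reading the frozen array is pure: no state token is involved.
slot :: Frozen -> Int -> Int
slot (Frozen arr) (I# i) = I# (indexIntArray# arr i)

main :: IO ()
main = print (slot (pair 3 4) 0, slot (pair 3 4) 1)
```

The freeze is "unsafe" because nothing stops a program from writing to the mutable array after freezing it; the sketch is safe only because the mutable reference never escapes.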
1194 <Title>Synchronising variables (M-vars)</Title>
1197 <IndexTerm><Primary>synchronising variables (M-vars)</Primary></IndexTerm>
1198 <IndexTerm><Primary>M-Vars</Primary></IndexTerm>
1202 Synchronising variables are the primitive type used to implement
1203 Concurrent Haskell's MVars (see the Concurrent Haskell paper for
1204 the operational behaviour of these operations).
1210 type MVar# s elt -- primitive
1212 newMVar# :: State# s -> (# State# s, MVar# s elt #)
1213 takeMVar# :: MVar# s elt -> State# s -> (# State# s, elt #)
1214 putMVar# :: MVar# s elt -> elt -> State# s -> State# s
1217 <IndexTerm><Primary><literal>MVar#</literal></Primary></IndexTerm>
1218 <IndexTerm><Primary><literal>newMVar#</literal></Primary></IndexTerm>
1219 <IndexTerm><Primary><literal>takeMVar#</literal></Primary></IndexTerm>
1220 <IndexTerm><Primary><literal>putMVar#</literal></Primary></IndexTerm>
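Most programs never touch these primitives directly. As a sketch of the library-level view, assuming the standard <Literal>Control.Concurrent.MVar</Literal> interface, a worker thread can hand a value to its parent like this (<Literal>handOff</Literal> is a hypothetical name):

```haskell
module Main where

import Control.Concurrent (forkIO)
import Control.Concurrent.MVar (newEmptyMVar, putMVar, takeMVar)

-- Hand a value from a worker thread to the caller through an MVar.
-- takeMVar blocks until the worker has filled the variable.
handOff :: Int -> IO Int
handOff n = do
  box <- newEmptyMVar
  _ <- forkIO (putMVar box (n * 2))
  takeMVar box

main :: IO ()
main = handOff 21 >>= print
```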
1227 <Sect1 id="glasgow-ST-monad">
1228 <Title>Primitive state-transformer monad
1232 <IndexTerm><Primary>state transformers (Glasgow extensions)</Primary></IndexTerm>
1233 <IndexTerm><Primary>ST monad (Glasgow extension)</Primary></IndexTerm>
1237 This monad underlies our implementation of arrays, mutable and
1238 immutable, and our implementation of I/O, including “C calls”.
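As a small illustration of the monad in use, here is a sketch assuming the <Literal>Control.Monad.ST</Literal> and <Literal>Data.STRef</Literal> modules of a current base library (the function name <Literal>sumTo</Literal> is hypothetical). The mutable reference lives only inside <Function>runST</Function>, so the function is pure from the outside.

```haskell
module Main where

import Control.Monad.ST (runST)
import Data.STRef (modifySTRef', newSTRef, readSTRef)

-- Sum 1..n with an in-place accumulator that never escapes runST.
sumTo :: Int -> Int
sumTo n = runST $ do
  acc <- newSTRef 0
  mapM_ (\i -> modifySTRef' acc (+ i)) [1 .. n]
  readSTRef acc

main :: IO ()
main = print (sumTo 10)
```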
1242 The <Literal>ST</Literal> library, which provides access to the <Function>ST</Function> monad, is a
1243 GHC/Hugs extension library and is described in the separate <ULink
1245 >GHC/Hugs Extension Libraries</ULink
1251 <Sect1 id="glasgow-prim-arrays">
1252 <Title>Primitive arrays, mutable and otherwise
1256 <IndexTerm><Primary>primitive arrays (Glasgow extension)</Primary></IndexTerm>
1257 <IndexTerm><Primary>arrays, primitive (Glasgow extension)</Primary></IndexTerm>
1261 GHC knows about quite a few flavours of Large Swathes of Bytes.
1265 First, GHC distinguishes between primitive arrays of (boxed) Haskell
1266 objects (type <Literal>Array# obj</Literal>) and primitive arrays of bytes (type
1267 <Literal>ByteArray#</Literal>).
1271 Second, it distinguishes between…
1275 <Term>Immutable:</Term>
1278 Arrays that do not change (as with “standard” Haskell arrays); you
1279 can only read from them. Obviously, they do not need the care and
1280 attention of the state-transformer monad.
1285 <Term>Mutable:</Term>
1288 Arrays that may be changed or “mutated.” All the operations on them
1289 live within the state-transformer monad and the updates happen
1290 <Emphasis>in-place</Emphasis>.
1295 <Term>“Static” (in C land):</Term>
1298 A C routine may pass an <Literal>Addr#</Literal> pointer back into Haskell land. There
1299 are then primitive operations with which you may merrily grab values
1300 over in C land, by indexing off the “static” pointer.
1305 <Term>“Stable” pointers:</Term>
1308 If, for some reason, you wish to hand a Haskell pointer (i.e.,
1309 <Emphasis>not</Emphasis> an unboxed value) to a C routine, you first make the
1310 pointer “stable,” so that the garbage collector won't forget that it
1311 exists. That is, GHC provides a safe way to pass Haskell pointers to
1316 Please see <XRef LinkEnd="glasgow-stablePtrs"> for more details.
1321 <Term>“Foreign objects”:</Term>
1324 A “foreign object” is a safe way to pass an external object (a
1325 C-allocated pointer, say) to Haskell and have Haskell do the Right
1326 Thing when it no longer references the object. So, for example, C
1327 could pass a large bitmap over to Haskell and say “please free this
1328 memory when you're done with it.”
1332 Please see <XRef LinkEnd="glasgow-foreignObjs"> for more details.
1340 The libraries documentation gives more details on all these
1341 “primitive array” types and the operations on them.
1346 <Sect1 id="glasgow-ccalls">
1347 <Title>Calling C directly from Haskell
1351 <IndexTerm><Primary>C calls (Glasgow extension)</Primary></IndexTerm>
1352 <IndexTerm><Primary>_ccall_ (Glasgow extension)</Primary></IndexTerm>
1353 <IndexTerm><Primary>_casm_ (Glasgow extension)</Primary></IndexTerm>
1357 GOOD ADVICE: Because this stuff is not Entirely Stable as far as names
1358 and things go, you would be well-advised to keep your C-callery
1359 corraled in a few modules, rather than sprinkled all over your code.
1360 It will then be quite easy to update later on.
1363 <Sect2 id="ccall-intro">
1364 <Title><Function>_ccall_</Function> and <Function>_casm_</Function>: an introduction
1368 The simplest way to use a simple C function
1374 double fooC( FILE *in, char c, int i, double d, unsigned int u )
1380 is to provide a Haskell wrapper:
1386 fooH :: Char -> Int -> Double -> Word -> IO Double
1387 fooH c i d w = _ccall_ fooC (``stdin''::Addr) c i d w
1393 The function <Function>fooH</Function> unboxes all of its arguments, calls the C
1394 function <Function>fooC</Function>, and boxes the corresponding result.
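For comparison, the same wrap-and-convert pattern can be sketched with the standard <Literal>foreign import ccall</Literal> syntax of later Haskell systems instead of <Function>_ccall_</Function>. This is an assumption-laden illustration: it calls the C library's <Function>toupper</Function> rather than <Function>fooC</Function>, and <Literal>toUpperH</Literal> is a hypothetical name.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CInt (..))

-- The C prototype is: int toupper(int c);
foreign import ccall unsafe "ctype.h toupper"
  c_toupper :: CInt -> IO CInt

-- Convert the Haskell argument to its C representation, call C, and convert
-- the result back. Intended for ASCII characters only.
toUpperH :: Char -> IO Char
toUpperH c = (toEnum . fromIntegral) <$> c_toupper (fromIntegral (fromEnum c))

main :: IO ()
main = toUpperH 'a' >>= print
```

Here the conversions are written out explicitly, whereas <Function>_ccall_</Function> performs the unboxing and boxing for you.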
1398 One of the annoyances about <Function>_ccall_</Function>s is when the C types don't quite
1399 match the Haskell compiler's ideas. For this, the <Function>_casm_</Function> variant
1400 may be just the ticket (NB: <Emphasis>no chance</Emphasis> of such code going
1401 through a native-code generator):
1411 = _casm_ ``%r = getenv((char *) %0);'' name >>= \ litstring ->
1413 if (litstring == nullAddr) then
1414 Left ("Fail:oldGetEnv:"++name)
1416 Right (unpackCString litstring)
1423 The first literal-literal argument to a <Function>_casm_</Function> is like a <Function>printf</Function>
1424 format: <Literal>%r</Literal> is replaced with the “result,” <Literal>%0</Literal>–<Literal>%n-1</Literal> are
1425 replaced with the 1st–nth arguments. As you can see above, it is an
1426 easy way to do simple C casting. Everything said about <Function>_ccall_</Function> goes
1427 for <Function>_casm_</Function> as well.
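The same lookup can be sketched without <Function>_casm_</Function>, using the standard foreign function interface and doing the pointer handling on the Haskell side. The names <Literal>c_getenv</Literal> and <Literal>getEnvEither</Literal> are hypothetical, and the sketch assumes a C library that provides <Function>getenv</Function>.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.String (CString, peekCString, withCString)
import Foreign.Ptr (nullPtr)

-- The C prototype is: char *getenv(const char *name);
foreign import ccall unsafe "stdlib.h getenv"
  c_getenv :: CString -> IO CString

-- Marshal the name to a C string, call getenv, and turn a NULL result
-- into a Left value instead of a null address.
getEnvEither :: String -> IO (Either String String)
getEnvEither name =
  withCString name $ \cname -> do
    result <- c_getenv cname
    if result == nullPtr
      then return (Left ("Fail:getEnvEither:" ++ name))
      else Right <$> peekCString result

main :: IO ()
main = getEnvEither "PATH" >>= print
```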
1431 The use of <Function>_casm_</Function> in your code does pose a problem to the compiler
1432 when it comes to generating an interface file for a freshly compiled
1433 module. Included in an interface file is the unfolding (if any) of a
1434 declaration. However, if a declaration's unfolding happens to contain
1435 a <Function>_casm_</Function>, its unfolding will <Emphasis>not</Emphasis> be emitted into the interface
1436 file even if it qualifies by all the other criteria. The reason why
1437 the compiler prevents this from happening is that unfolding <Function>_casm_</Function>s
1438 into an interface file unduly constrains how code that imports your
1439 module has to be compiled. If an imported declaration is unfolded and
1440 it contains a <Function>_casm_</Function>, you now have to be using a compiler backend
1441 capable of dealing with it (i.e., the C compiler backend). If you are
1442 using the C compiler backend, the unfolded <Function>_casm_</Function> may still cause you
1443 problems since the C code snippet it contains may mention CPP symbols
1444 that were in scope when compiling the original module but are not when
1445 compiling the importing module.
1449 If you're willing to put up with the drawbacks of doing cross-module
1450 inlining of C code (GHC - A Better C Compiler :-), the option
1451 <Option>-funfold-casms-in-hi-file</Option> will turn off the default behaviour.
1452 <IndexTerm><Primary>-funfold-casms-in-hi-file option</Primary></IndexTerm>
1457 <Sect2 id="glasgow-literal-literals">
1458 <Title>Literal-literals</Title>
1461 <IndexTerm><Primary>Literal-literals</Primary></IndexTerm>
1462 The literal-literal argument to <Function>_casm_</Function> can be made use of separately
1463 from the <Function>_casm_</Function> construct itself. Indeed, we've already used it:
1469 fooH :: Char -> Int -> Double -> Word -> IO Double
1470 fooH c i d w = _ccall_ fooC (``stdin''::Addr) c i d w
1476 The first argument that's passed to <Function>fooC</Function> is given as a literal-literal,
1477 that is, a literal chunk of C code that will be inserted into the generated
1478 <Filename>.hc</Filename> code at the right place.
1482 A literal-literal is restricted to having a type that's an instance of
1483 the <Literal>CCallable</Literal> class, see <XRef LinkEnd="ccall-gotchas">
1484 for more information.
1488 Notice that literal-literals are by their very nature unfriendly to
1489 native code generators, so exercise judgement about whether or not to
1490 make use of them in your code.
1495 <Sect2 id="glasgow-foreign-headers">
1496 <Title>Using function headers
1500 <IndexTerm><Primary>C calls, function headers</Primary></IndexTerm>
1504 When generating C (using the <Option>-fvia-C</Option> directive), one can assist the
1505 C compiler in detecting type errors by using the <Command>-#include</Command> directive
1506 to provide <Filename>.h</Filename> files containing function headers.
1516 typedef unsigned long *StgForeignObj;
1517 typedef long StgInt;
1519 void initialiseEFS (StgInt size);
1520 StgInt terminateEFS (void);
1521 StgForeignObj emptyEFS(void);
1522 StgForeignObj updateEFS (StgForeignObj a, StgInt i, StgInt x);
1523 StgInt lookupEFS (StgForeignObj a, StgInt i);
1529 You can find appropriate definitions for <Literal>StgInt</Literal>, <Literal>StgForeignObj</Literal>,
1530 etc using <Command>gcc</Command> on your architecture by consulting
1531 <Filename>ghc/includes/StgTypes.h</Filename>. The following table summarises the
1532 relationship between Haskell types and C types.
1539 <ColSpec Align="Left" Colsep="0">
1540 <ColSpec Align="Left" Colsep="0">
1543 <Entry><Emphasis>C type name</Emphasis> </Entry>
1544 <Entry> <Emphasis>Haskell Type</Emphasis> </Entry>
1549 <Literal>StgChar</Literal> </Entry>
1550 <Entry> <Literal>Char#</Literal> </Entry>
1554 <Literal>StgInt</Literal> </Entry>
1555 <Entry> <Literal>Int#</Literal> </Entry>
1559 <Literal>StgWord</Literal> </Entry>
1560 <Entry> <Literal>Word#</Literal> </Entry>
1564 <Literal>StgAddr</Literal> </Entry>
1565 <Entry> <Literal>Addr#</Literal> </Entry>
1569 <Literal>StgFloat</Literal> </Entry>
1570 <Entry> <Literal>Float#</Literal> </Entry>
1574 <Literal>StgDouble</Literal> </Entry>
1575 <Entry> <Literal>Double#</Literal> </Entry>
1579 <Literal>StgArray</Literal> </Entry>
1580 <Entry> <Literal>Array#</Literal> </Entry>
1584 <Literal>StgByteArray</Literal> </Entry>
1585 <Entry> <Literal>ByteArray#</Literal> </Entry>
1589 <Literal>StgArray</Literal> </Entry>
1590 <Entry> <Literal>MutableArray#</Literal> </Entry>
1594 <Literal>StgByteArray</Literal> </Entry>
1595 <Entry> <Literal>MutableByteArray#</Literal> </Entry>
1599 <Literal>StgStablePtr</Literal> </Entry>
1600 <Entry> <Literal>StablePtr#</Literal> </Entry>
1604 <Literal>StgForeignObj</Literal> </Entry>
1605 <Entry> <Literal>ForeignObj#</Literal></Entry>
1614 Note that this approach is only <Emphasis>essential</Emphasis> for returning
1615 <Literal>float</Literal>s (or if <Literal>sizeof(int) != sizeof(int *)</Literal> on your
1616 architecture) but is a Good Thing for anyone who cares about writing
1617 solid code. You're crazy not to do it.
1622 <Sect2 id="glasgow-stablePtrs">
1623 <Title>Subverting automatic unboxing with “stable pointers”
1627 <IndexTerm><Primary>stable pointers (Glasgow extension)</Primary></IndexTerm>
1631 The arguments of a <Function>_ccall_</Function> are automatically unboxed before the
1632 call. There are two reasons why this is usually the Right Thing to
1642 C is a strict language: it would be excessively tedious to pass
1643 unevaluated arguments and require the C programmer to force their
1644 evaluation before using them.
1651 Boxed values are stored on the Haskell heap and may be moved
1652 within the heap if a garbage collection occurs—that is, pointers
1653 to boxed objects are not <Emphasis>stable</Emphasis>.
1662 It is possible to subvert the unboxing process by creating a “stable
1663 pointer” to a value and passing the stable pointer instead. For
1664 example, to pass/return an integer lazily to C functions <Function>storeC</Function> and
1665 <Function>fetchC</Function>, one might write:
1671 storeH :: Int -> IO ()
1672 storeH x = makeStablePtr x >>= \ stable_x ->
1673 _ccall_ storeC stable_x
1676 fetchH = _ccall_ fetchC >>= \ stable_x ->
1677 deRefStablePtr stable_x >>= \ x ->
1678 freeStablePtr stable_x >>
1685 The garbage collector will refrain from throwing a stable pointer away
1686 until you explicitly call one of the following from C or Haskell.
1692 void freeStablePointer( StgStablePtr stablePtrToToss )
1693 freeStablePtr :: StablePtr a -> IO ()
1699 As with the use of <Function>free</Function> in C programs, GREAT CARE SHOULD BE
1700 EXERCISED to ensure these functions are called at the right time: too
1701 early and you get dangling references (and, if you're lucky, an error
1702 message from the runtime system); too late and you get space leaks.
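The create/dereference/free life cycle can be sketched entirely on the Haskell side. This assumes the <Literal>Foreign.StablePtr</Literal> module of a current base library, where the creation function is called <Function>newStablePtr</Function> rather than <Function>makeStablePtr</Function>; <Literal>roundTrip</Literal> is a hypothetical name.

```haskell
module Main where

import Foreign.StablePtr (deRefStablePtr, freeStablePtr, newStablePtr)

-- Round-trip a value through a stable pointer. The value is not forced by
-- any of these steps, so it may still be an unevaluated thunk throughout.
roundTrip :: a -> IO a
roundTrip x = do
  sp <- newStablePtr x      -- from here on the collector must keep x alive
  y  <- deRefStablePtr sp   -- recover the Haskell value
  freeStablePtr sp          -- free exactly once; sp must not be used again
  return y

main :: IO ()
main = roundTrip (3 + 4 :: Int) >>= print
```

In a real program the stable pointer would be handed to C between the first and second steps, which is exactly why forgetting the last step leaks space.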
1706 And to force evaluation of the argument within <Function>fooC</Function>, one would
1707 call one of the following C functions (according to type of argument).
1713 void performIO ( StgStablePtr stableIndex /* StablePtr s (IO ()) */ );
1714 StgInt enterInt ( StgStablePtr stableIndex /* StablePtr s Int */ );
1715 StgFloat enterFloat ( StgStablePtr stableIndex /* StablePtr s Float */ );
1721 <IndexTerm><Primary>performIO</Primary></IndexTerm>
1722 <IndexTerm><Primary>enterInt</Primary></IndexTerm>
1723 <IndexTerm><Primary>enterFloat</Primary></IndexTerm>
1727 Nota Bene: <Function>_ccall_GC_</Function><IndexTerm><Primary>_ccall_GC_</Primary></IndexTerm> must be used if any of
1728 these functions are used.
1733 <Sect2 id="glasgow-foreignObjs">
1734 <Title>Foreign objects: pointing outside the Haskell heap
1738 <IndexTerm><Primary>foreign objects (Glasgow extension)</Primary></IndexTerm>
1742 There are two types that GHC programs can use to reference
1743 (heap-allocated) objects outside the Haskell world: <Literal>Addr</Literal> and
1744 <Literal>ForeignObj</Literal>.
1748 If you use <Literal>Addr</Literal>, it is up to you, the programmer, to arrange
1749 allocation and deallocation of the objects.
1753 If you use <Literal>ForeignObj</Literal>, GHC's garbage collector will call upon the
1754 user-supplied <Emphasis>finaliser</Emphasis> function to free the object when the
1755 Haskell world no longer can access the object. (An object is
1756 associated with a finaliser function when the abstract
1757 Haskell type <Literal>ForeignObj</Literal> is created). The finaliser function is
1758 expressed in C, and is passed as argument the object:
1764 void foreignFinaliser ( StgForeignObj fo )
1770 when the Haskell world can no longer access the object. Since
1771 <Literal>ForeignObj</Literal>s only get released when a garbage collection occurs, we
1772 provide ways of triggering a garbage collection from within C and from
1779 void GarbageCollect()
1786 More information on the programmers' interface to <Literal>ForeignObj</Literal> can be
1787 found in the library documentation.
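As a sketch of the finaliser idea, here is the analogous pattern in a current base library, which offers <Literal>ForeignPtr</Literal> in the role described here for <Literal>ForeignObj</Literal>. The substitution is an assumption of this example, and <Literal>newCell</Literal> and <Literal>readCell</Literal> are hypothetical names. The C function <Function>free</Function> is attached as the finaliser, so the block is released some time after the last Haskell reference disappears.

```haskell
module Main where

import Foreign.C.Types (CInt)
import Foreign.ForeignPtr (ForeignPtr, newForeignPtr, withForeignPtr)
import Foreign.Marshal.Alloc (finalizerFree, mallocBytes)
import Foreign.Storable (peek, poke, sizeOf)

-- Allocate a C-heap cell, initialise it, and attach free() as its finaliser.
newCell :: CInt -> IO (ForeignPtr CInt)
newCell v = do
  raw <- mallocBytes (sizeOf v)
  poke raw v
  newForeignPtr finalizerFree raw

-- withForeignPtr keeps the object alive for the duration of the access.
readCell :: ForeignPtr CInt -> IO CInt
readCell fp = withForeignPtr fp peek

main :: IO ()
main = newCell 42 >>= readCell >>= print
```

The finaliser runs only after a garbage collection notices that the object is unreachable, so a program that holds scarce external resources may need to trigger a collection explicitly.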
1792 <Sect2 id="glasgow-avoiding-monads">
1793 <Title>Avoiding monads
1797 <IndexTerm><Primary>C calls to `pure C'</Primary></IndexTerm>
1798 <IndexTerm><Primary>unsafePerformIO</Primary></IndexTerm>
1802 The <Function>_ccall_</Function> construct is part of the <Literal>IO</Literal> monad because 9 out of 10
1803 uses will be to call imperative functions with side effects such as
1804 <Function>printf</Function>. Use of the monad ensures that these operations happen in a
1805 predictable order in spite of laziness and compiler optimisations.
1809 To avoid having to be in the monad to call a C function, it is
1810 possible to use <Function>unsafePerformIO</Function>, which is available from the
1811 <Literal>IOExts</Literal> module. There are three situations where one might like to
1812 call a C function from outside the IO world:
1821 Calling a function with no side-effects:
1824 atan2d :: Double -> Double -> Double
1825 atan2d y x = unsafePerformIO (_ccall_ atan2d y x)
1827 sincosd :: Double -> (Double, Double)
1828 sincosd x = unsafePerformIO $ do
1829 da <- newDoubleArray (0, 1)
1830 _casm_ ``sincosd( %0, &((double *)%1[0]), &((double *)%1[1]) );'' x da
1831 s <- readDoubleArray da 0
1832 c <- readDoubleArray da 1
1842 Calling a set of functions which have side-effects but which can
1843 be used in a purely functional manner.
1845 For example, an imperative implementation of a purely functional
1846 lookup-table might be accessed using the following functions.
1851 update :: EFS x -> Int -> x -> EFS x
1852 lookup :: EFS a -> Int -> a
1854 empty = unsafePerformIO (_ccall_ emptyEFS)
1856 update a i x = unsafePerformIO $
1857 makeStablePtr x >>= \ stable_x ->
1858 _ccall_ updateEFS a i stable_x
1860 lookup a i = unsafePerformIO $
1861 _ccall_ lookupEFS a i >>= \ stable_x ->
1862 deRefStablePtr stable_x
1866 You will almost always want to use <Literal>ForeignObj</Literal>s with this.
1873 Calling a side-effecting function even though the results will
1874 be unpredictable. For example the <Function>trace</Function> function is defined by:
1878 trace :: String -> a -> a
1881 ((_ccall_ PreTraceHook sTDERR{-msg-}):: IO ()) >>
1882 fputs sTDERR string >>
1883 ((_ccall_ PostTraceHook sTDERR{-msg-}):: IO ()) >>
1886 sTDERR = (``stderr'' :: Addr)
1890 (This kind of use is not highly recommended—it is only really
1891 useful in debugging code.)
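The first situation above (a C function with no side-effects) can be sketched as follows, assuming the standard foreign function interface and <Literal>System.IO.Unsafe</Literal> from a current base library in place of <Function>_ccall_</Function> and <Literal>IOExts</Literal>. The name <Literal>absPure</Literal> is hypothetical.

```haskell
{-# LANGUAGE ForeignFunctionInterface #-}
module Main where

import Foreign.C.Types (CInt (..))
import System.IO.Unsafe (unsafePerformIO)

-- The C prototype is: int abs(int j);
foreign import ccall unsafe "stdlib.h abs"
  c_abs :: CInt -> IO CInt

-- abs() has no side-effects and depends only on its argument, so hiding the
-- IO type is harmless here. Doing the same for a function that reads or
-- writes external state would not be.
absPure :: Int -> Int
absPure i = fromIntegral (unsafePerformIO (c_abs (fromIntegral i)))

main :: IO ()
main = print (absPure (-5))
```

The burden of proof is on the programmer: the compiler will happily reorder, duplicate or drop calls made this way.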
1901 <Sect2 id="ccall-gotchas">
1902 <Title>C-calling “gotchas” checklist
1906 <IndexTerm><Primary>C call dangers</Primary></IndexTerm>
1907 <IndexTerm><Primary>CCallable</Primary></IndexTerm>
1908 <IndexTerm><Primary>CReturnable</Primary></IndexTerm>
1912 And some advice, too.
1921 For modules that use <Function>_ccall_</Function>s, etc., compile with
1922 <Option>-fvia-C</Option>.<IndexTerm><Primary>-fvia-C option</Primary></IndexTerm> You don't have to, but you should.
1924 Also, use the <Option>-#include "prototypes.h"</Option> flag (hack) to inform the C
1925 compiler of the fully-prototyped types of all the C functions you
1926 call. (<XRef LinkEnd="glasgow-foreign-headers"> says more about this…)
1928 This scheme is the <Emphasis>only</Emphasis> way that you will get <Emphasis>any</Emphasis>
1929 typechecking of your <Function>_ccall_</Function>s. (It shouldn't be that way, but…).
1930 GHC will pass the flag <Option>-Wimplicit</Option> to <Command>gcc</Command> so that you'll get warnings
1931 if any <Function>_ccall_</Function>ed functions have no prototypes.
1938 Try to avoid <Function>_ccall_</Function>s to C functions that take <Literal>float</Literal>
1939 arguments or return <Literal>float</Literal> results. Reason: if you do, you will
1940 become entangled in (ANSI?) C's rules for when arguments/results are
1941 promoted to <Literal>doubles</Literal>. It's a nightmare and just not worth it.
1942 Use <Literal>doubles</Literal> if possible.
1944 If you do use <Literal>floats</Literal>, check and re-check that the right thing is
1945 happening. Perhaps compile with <Option>-keep-hc-file-too</Option> and look at
1946 the intermediate C (<Function>.hc</Function>).
1953 The compiler uses two non-standard type-classes when
1954 type-checking the arguments and results of <Function>_ccall_</Function>: the arguments
1955 (respectively result) of <Function>_ccall_</Function> must be instances of the class
1956 <Literal>CCallable</Literal> (respectively <Literal>CReturnable</Literal>). Both classes may be
1957 imported from the module <Literal>CCall</Literal>, but this should only be
1958 necessary if you want to define a new instance. (Neither class
1959 defines any methods—their only function is to keep the
1960 type-checker happy.)
1962 The type checker must be able to figure out just which of the
1963 C-callable/returnable types is being used. If it can't, you have to
1964 add type signatures. For example,
1972 is not good enough, because the compiler can't work out what type <VarName>x</VarName>
1973 is, nor what type the <Function>_ccall_</Function> returns. You have to write, say:
1977 f :: Int -> IO Double
1982 This table summarises the standard instances of these classes.
1986 <ColSpec Align="Left" Colsep="0">
1987 <ColSpec Align="Left" Colsep="0">
1988 <ColSpec Align="Left" Colsep="0">
1989 <ColSpec Align="Left" Colsep="0">
1992 <Entry><Emphasis>Type</Emphasis> </Entry>
1993 <Entry><Emphasis>CCallable</Emphasis></Entry>
1994 <Entry><Emphasis>CReturnable</Emphasis> </Entry>
1995 <Entry><Emphasis>Which is probably…</Emphasis> </Entry>
1999 <Literal>Char</Literal> </Entry>
2000 <Entry> Yes </Entry>
2001 <Entry> Yes </Entry>
2002 <Entry> <Literal>unsigned char</Literal> </Entry>
2006 <Literal>Int</Literal> </Entry>
2007 <Entry> Yes </Entry>
2008 <Entry> Yes </Entry>
2009 <Entry> <Literal>long int</Literal> </Entry>
2013 <Literal>Word</Literal> </Entry>
2014 <Entry> Yes </Entry>
2015 <Entry> Yes </Entry>
2016 <Entry> <Literal>unsigned long int</Literal> </Entry>
2020 <Literal>Addr</Literal> </Entry>
2021 <Entry> Yes </Entry>
2022 <Entry> Yes </Entry>
2023 <Entry> <Literal>void *</Literal> </Entry>
2027 <Literal>Float</Literal> </Entry>
2028 <Entry> Yes </Entry>
2029 <Entry> Yes </Entry>
2030 <Entry> <Literal>float</Literal> </Entry>
2034 <Literal>Double</Literal> </Entry>
2035 <Entry> Yes </Entry>
2036 <Entry> Yes </Entry>
2037 <Entry> <Literal>double</Literal> </Entry>
2041 <Literal>()</Literal> </Entry>
2043 <Entry> Yes </Entry>
2044 <Entry> <Literal>void</Literal> </Entry>
2048 <Literal>[Char]</Literal> </Entry>
2049 <Entry> Yes </Entry>
2051 <Entry> <Literal>char *</Literal> (null-terminated) </Entry>
2055 <Literal>Array</Literal> </Entry>
2056 <Entry> Yes </Entry>
2058 <Entry> <Literal>unsigned long *</Literal> </Entry>
2062 <Literal>ByteArray</Literal> </Entry>
2063 <Entry> Yes </Entry>
2065 <Entry> <Literal>unsigned long *</Literal> </Entry>
2069 <Literal>MutableArray</Literal> </Entry>
2070 <Entry> Yes </Entry>
2072 <Entry> <Literal>unsigned long *</Literal> </Entry>
2076 <Literal>MutableByteArray</Literal> </Entry>
2077 <Entry> Yes </Entry>
2079 <Entry> <Literal>unsigned long *</Literal> </Entry>
2083 <Literal>State</Literal> </Entry>
2084 <Entry> Yes </Entry>
2085 <Entry> Yes </Entry>
2086 <Entry> nothing!</Entry>
2090 <Literal>StablePtr</Literal> </Entry>
2091 <Entry> Yes </Entry>
2092 <Entry> Yes </Entry>
2093 <Entry> <Literal>unsigned long *</Literal> </Entry>
2097 <Literal>ForeignObj</Literal> </Entry>
2098 <Entry> Yes </Entry>
2099 <Entry> Yes </Entry>
2100 <Entry> see later </Entry>
2108 Actually, the <Literal>Word</Literal> type is defined as being the same size as a
2109 pointer on the target architecture, which is <Emphasis>probably</Emphasis>
2110 <Literal>unsigned long int</Literal>.
2112 The brave and careful programmer can add their own instances of these
2113 classes for the following types:
2120 A <Emphasis>boxed-primitive</Emphasis> type may be made an instance of both
2121 <Literal>CCallable</Literal> and <Literal>CReturnable</Literal>.
2123 A boxed primitive type is any data type with a
2124 single unary constructor with a single primitive argument. For
2125 example, the following are all boxed primitive types:
2131 data XDisplay = XDisplay Addr#
2132 data EFS a = EFS# ForeignObj#
2138 instance CCallable (EFS a)
2139 instance CReturnable (EFS a)
2148 Any datatype with a single nullary constructor may be made an
2149 instance of <Literal>CReturnable</Literal>. For example:
2153 data MyVoid = MyVoid
2154 instance CReturnable MyVoid
2163 As at version 2.09, <Literal>String</Literal> (i.e., <Literal>[Char]</Literal>) is still
2164 not a <Literal>CReturnable</Literal> type.
2166 Also, the now-builtin type <Literal>PackedString</Literal> is neither
2167 <Literal>CCallable</Literal> nor <Literal>CReturnable</Literal>. (But there are functions in
2168 the PackedString interface to let you get at the necessary bits…)
2180 The code-generator will complain if you attempt to use <Literal>%r</Literal> in
2181 a <Literal>_casm_</Literal> whose result type is <Literal>IO ()</Literal>; or if you don't use <Literal>%r</Literal>
2182 <Emphasis>precisely</Emphasis> once for any other result type. These messages are
2183 supposed to be helpful and catch bugs—please tell us if they wreck
2191 If you call out to C code which may trigger the Haskell garbage
2192 collector or create new threads (examples of this later…), then you
2193 must use the <Function>_ccall_GC_</Function><IndexTerm><Primary>_ccall_GC_ primitive</Primary></IndexTerm> or
2194 <Function>_casm_GC_</Function><IndexTerm><Primary>_casm_GC_ primitive</Primary></IndexTerm> variant of C-calls. (This
2195 does not work with the native code generator—use <Option>-fvia-C</Option>.) This
2196 stuff is hairy with a capital H!
2208 <Sect1 id="multi-param-type-classes">
2209 <Title>Multi-parameter type classes
2213 This section documents GHC's implementation of multi-parameter type
2214 classes. There's lots of background in the paper <ULink
2215 URL="http://research.microsoft.com/~simonpj/multi.ps.gz" >Type
2216 classes: exploring the design space</ULink > (Simon Peyton Jones, Mark
2217 Jones, Erik Meijer).
2221 I'd like to thank people who reported shortcomings in the GHC 3.02
2222 implementation. Our default decisions were all conservative ones, and
2223 the experience of these heroic pioneers has given useful concrete
2224 examples to support several generalisations. (These appear below as
2225 design choices not implemented in 3.02.)
2229 I've discussed these notes with Mark Jones, and I believe that Hugs
2230 will migrate towards the same design choices as I outline here.
2231 Thanks to him, and to many others who have offered very useful
2236 <Title>Types</Title>
2239 There are the following restrictions on the form of a qualified
2246 forall tv1..tvn (c1, ...,cn) => type
2252 (Here, I write the "foralls" explicitly, although the Haskell source
2253 language omits them; in Haskell 1.4, all the free type variables of an
2254 explicit source-language type signature are universally quantified,
2255 except for the class type variables in a class declaration. However,
2256 in GHC, you can give the foralls if you want. See <XRef LinkEnd="universal-quantification">).
2265 <Emphasis>Each universally quantified type variable
2266 <Literal>tvi</Literal> must be mentioned (i.e. appear free) in <Literal>type</Literal></Emphasis>.
2268 The reason for this is that a value with a type that does not obey
2269 this restriction could not be used without introducing
2270 ambiguity. Here, for example, is an illegal type:
2274 forall a. Eq a => Int
2278 When a value with this type was used, the constraint <Literal>Eq tv</Literal>
2279 would be introduced where <Literal>tv</Literal> is a fresh type variable, and
2280 (in the dictionary-translation implementation) the value would be
2281 applied to a dictionary for <Literal>Eq tv</Literal>. The difficulty is that we
2282 can never know which instance of <Literal>Eq</Literal> to use because we never
2283 get any more information about <Literal>tv</Literal>.
2290 <Emphasis>Every constraint <Literal>ci</Literal> must mention at least one of the
2291 universally quantified type variables <Literal>tvi</Literal></Emphasis>.
2293 For example, this type is OK because <Literal>C a b</Literal> mentions the
2294 universally quantified type variable <Literal>a</Literal>:
2298 forall a. C a b => burble
2302 The next type is illegal because the constraint <Literal>Eq b</Literal> does not
2303 mention <Literal>a</Literal>:
2307 forall a. Eq b => burble
2311 The reason for this restriction is milder than the other one. The
2312 excluded types are never useful or necessary (because the offending
2313 context doesn't need to be witnessed at this point; it can be floated
2314 out). Furthermore, floating them out increases sharing. Lastly,
2315 excluding them is a conservative choice; it leaves a patch of
2316 territory free in case we need it later.
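The ambiguity restriction can be seen in a small sketch. It assumes a recent GHC, and the name <Literal>countEq</Literal> is hypothetical. The rejected signature is left as a comment because it would not compile.

```haskell
module Main where

-- Rejected by the first restriction: 'a' occurs only in the constraint, so
-- no use site could ever determine which Eq dictionary to pass.
--
--   ambiguous :: forall a. Eq a => Int

-- Accepted: the quantified variable 'a' is mentioned in the type itself,
-- so every call fixes the instance of Eq.
countEq :: Eq a => a -> [a] -> Int
countEq x = length . filter (== x)

main :: IO ()
main = print (countEq 'a' "banana")  -- prints 3
```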
2326 These restrictions apply to all types, whether declared in a type signature
2331 Unlike Haskell 1.4, constraints in types do <Emphasis>not</Emphasis> have to be of
2332 the form <Emphasis>(class type-variables)</Emphasis>. Thus, these type signatures
2339 f :: Eq (m a) => [m a] -> [m a]
2346 This choice recovers principal types, a property that Haskell 1.4 does not have.
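Here is a minimal sketch of a signature whose constraint is not of the form (class type-variable). It assumes a recent GHC, where the <Literal>FlexibleContexts</Literal> language pragma enables this form. The function <Literal>squeeze</Literal> is a hypothetical example, not part of any library.

```haskell
{-# LANGUAGE FlexibleContexts #-}
module Main where

-- The constraint Eq (m a) mentions a type application, not a bare variable.
-- squeeze drops each element that equals its immediate successor.
squeeze :: Eq (m a) => [m a] -> [m a]
squeeze (x : y : rest)
  | x == y    = squeeze (y : rest)
  | otherwise = x : squeeze (y : rest)
squeeze xs = xs

main :: IO ()
main = print (squeeze [Just 1, Just 1, Nothing, Just (2 :: Int)])
```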
2352 <Title>Class declarations</Title>
2360 <Emphasis>Multi-parameter type classes are permitted</Emphasis>. For example:
2364 class Collection c a where
2365 union :: c a -> c a -> c a
2376 <Emphasis>The class hierarchy must be acyclic</Emphasis>. However, the definition
2377 of "acyclic" involves only the superclass relationships. For example,
2383 op :: D b => a -> b -> b
2386 class C a => D a where { ... }
2390 Here, <Literal>C</Literal> is a superclass of <Literal>D</Literal>, but it's OK for a
2391 class operation <Literal>op</Literal> of <Literal>C</Literal> to mention <Literal>D</Literal>. (It
2392 would not be OK for <Literal>D</Literal> to be a superclass of <Literal>C</Literal>.)
2399 <Emphasis>There are no restrictions on the context in a class declaration
2400 (which introduces superclasses), except that the class hierarchy must
2401 be acyclic</Emphasis>. So these class declarations are OK:
2405 class Functor (m k) => FiniteMap m k where
2408 class (Monad m, Monad (t m)) => Transform t m where
2409 lift :: m a -> (t m) a
2418 <Emphasis>In the signature of a class operation, every constraint
2419 must mention at least one type variable that is not a class type
2420 variable</Emphasis>.
2426 class Collection c a where
2427 mapC :: Collection c b => (a->b) -> c a -> c b
2431 is OK because the constraint <Literal>(Collection c b)</Literal> mentions
2432 <Literal>b</Literal>, even though it also mentions the class variable
2433 <Literal>c</Literal>. On the other hand:
2438 op :: Eq a => (a,b) -> (a,b)
2442 is not OK because the constraint <Literal>(Eq a)</Literal> mentions only the class
2443 type variable <Literal>a</Literal>, but not <Literal>b</Literal>. However, any such
2444 example is easily fixed by moving the offending context up to the
2449 class Eq a => C a where
2454 A yet more relaxed rule would allow the context of a class-op signature
2455 to mention only class type variables. However, that conflicts with
2456 Rule 1(b) for types above.
2463 <Emphasis>The type of each class operation must mention <Emphasis>all</Emphasis> of
2464 the class type variables</Emphasis>. For example:
2468 class Coll s a where
2470 insert :: s -> a -> s
2474 is not OK, because the type of <Literal>empty</Literal> doesn't mention
2475 <Literal>a</Literal>. This rule is a consequence of Rule 1(a), above, for
2476 types, and has the same motivation.
2478 Sometimes, offending class declarations exhibit misunderstandings. For
2479 example, <Literal>Coll</Literal> might be rewritten
2483 class Coll s a where
2485 insert :: s a -> a -> s a
2489 which makes the connection between the type of a collection of
2490 <Literal>a</Literal>'s (namely <Literal>(s a)</Literal>) and the element type <Literal>a</Literal>.
2491 Occasionally this really doesn't work, in which case you can split the
2499 class CollE s => Coll s a where
2500 insert :: s -> a -> s
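The rewritten <Literal>Coll</Literal> class, in which every operation mentions both class variables, might be exercised as follows. This is a sketch for a recent GHC using the <Literal>MultiParamTypeClasses</Literal> and <Literal>FlexibleInstances</Literal> pragmas. The list instance is a hypothetical illustration.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
module Main where

-- 's' is a type constructor, so both operations mention 's' and 'a'.
class Coll s a where
  empty  :: s a
  insert :: s a -> a -> s a

-- Hypothetical instance: lists as collections of any element type.
instance Coll [] a where
  empty       = []
  insert xs x = x : xs

main :: IO ()
main = print (insert (insert empty 'a') 'b' :: String)
```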
2514 <Title>Instance declarations</Title>
2522 <Emphasis>Instance declarations may not overlap</Emphasis>. The two instance
2527 instance context1 => C type1 where ...
2528 instance context2 => C type2 where ...
2532 "overlap" if <Literal>type1</Literal> and <Literal>type2</Literal> unify
2534 However, if you give the command line option
2535 <Option>-fallow-overlapping-instances</Option><IndexTerm><Primary>-fallow-overlapping-instances
2536 option</Primary></IndexTerm> then two overlapping instance declarations are permitted
2544 EITHER <Literal>type1</Literal> and <Literal>type2</Literal> do not unify
2550 OR <Literal>type2</Literal> is a substitution instance of <Literal>type1</Literal>
2551 (but not identical to <Literal>type1</Literal>)
2564 Notice that these rules
2571 make it clear which instance decl to use
2572 (pick the most specific one that matches)
2579 do not mention the contexts <Literal>context1</Literal>, <Literal>context2</Literal>
2580 Reason: you can pick which instance decl
2581 "matches" based on the type.
2588 Regrettably, GHC doesn't guarantee to detect overlapping instance
2589 declarations if they appear in different modules. GHC can "see" the
2590 instance declarations in the transitive closure of all the modules
2591 imported by the one being compiled, so it can "see" all instance decls
2592 when it is compiling <Literal>Main</Literal>. However, it currently chooses not
2593 to look at ones that can't possibly be of use in the module currently
2594 being compiled, in the interests of efficiency. (Perhaps we should
2595 change that decision, at least for <Literal>Main</Literal>.)
2602 <Emphasis>There are no restrictions on the type in an instance
2603 <Emphasis>head</Emphasis>, except that at least one must not be a type variable</Emphasis>.
2604 The instance "head" is the bit after the "=>" in an instance decl. For
2605 example, these are OK:
2609 instance C Int a where ...
2611 instance D (Int, Int) where ...
2613 instance E [[a]] where ...
2617 Note that instance heads <Emphasis>may</Emphasis> contain repeated type variables.
2618 For example, this is OK:
2622 instance Stateful (ST s) (MutVar s) where ...
2626 The "at least one not a type variable" restriction is to ensure that
2627 context reduction terminates: each reduction step removes one type
2628 constructor. For example, the following would make the type checker
2629 loop if it wasn't excluded:
2633 instance C a => C a where ...
2637 There are two situations in which the rule is a bit of a pain. First,
2638 if one allows overlapping instance declarations then it's quite
2639 convenient to have a "default instance" declaration that applies if
2640 something more specific does not:
2649 Second, sometimes you might want to use the following to get the
2650 effect of a "class synonym":
2654 class (C1 a, C2 a, C3 a) => C a where { }
2656 instance (C1 a, C2 a, C3 a) => C a where { }
2660 This allows you to write shorter signatures:
2672 f :: (C1 a, C2 a, C3 a) => ...
2676 I'm on the lookout for a simple rule that preserves decidability while
2677 allowing these idioms. The experimental flag
2678 <Option>-fallow-undecidable-instances</Option><IndexTerm><Primary>-fallow-undecidable-instances
2679 option</Primary></IndexTerm> lifts this restriction, allowing all the types in an
2680 instance head to be type variables.
2687 <Emphasis>Unlike Haskell 1.4, instance heads may use type
2688 synonyms</Emphasis>. As always, using a type synonym is just shorthand for
2689 writing the RHS of the type synonym definition. For example:
2693 type Point = (Int,Int)
2694 instance C Point where ...
2695 instance C [Point] where ...
2699 is legal. However, if you added
2703 instance C (Int,Int) where ...
2707 as well, then the compiler will complain about the overlapping
2708 (actually, identical) instance declarations. As always, type synonyms
2709 must be fully applied. You cannot, for example, write:
2714 instance Monad P where ...
2718 This design decision is independent of all the others, and easily
2719 reversed, but it makes sense to me.
2726 <Emphasis>The types in an instance-declaration <Emphasis>context</Emphasis> must all
2727 be type variables</Emphasis>. Thus
2731 instance C a b => Eq (a,b) where ...
2739 instance C Int b => Foo b where ...
2743 is not OK. Again, the intent here is to make sure that context
2744 reduction terminates.
2746 Voluminous correspondence on the Haskell mailing list has convinced me
2747 that it's worth experimenting with a more liberal rule. If you use
2748 the flag <Option>-fallow-undecidable-instances</Option>, you can use arbitrary
2749 types in an instance context. Termination is ensured by having a
2750 fixed-depth recursion stack. If you exceed the stack depth you get a
2751 sort of backtrace, and the opportunity to increase the stack depth
2752 with <Option>-fcontext-stack</Option><Emphasis>N</Emphasis>.
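The "class synonym" idiom mentioned above can be tried out in a recent GHC, where the <Literal>FlexibleInstances</Literal> and <Literal>UndecidableInstances</Literal> pragmas lift the restriction on instance heads. The following is only a sketch, and the names <Literal>ShowOrd</Literal> and <Literal>showMax</Literal> are hypothetical.

```haskell
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}
module Main where

-- A "class synonym": ShowOrd a holds exactly when Show a and Ord a do.
class    (Show a, Ord a) => ShowOrd a
instance (Show a, Ord a) => ShowOrd a

-- The shorter signature stands in for (Show a, Ord a) => [a] -> String.
showMax :: ShowOrd a => [a] -> String
showMax = show . maximum

main :: IO ()
main = putStrLn (showMax [3, 1, 4 :: Int])
```

The instance head is a bare type variable, which is exactly the form the default rule excludes.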
2765 <Sect1 id="universal-quantification">
2766 <Title>Explicit universal quantification
2770 GHC now allows you to write explicitly quantified types. GHC's
2771 syntax for this now agrees with Hugs's, namely:
2777 forall a b. (Ord a, Eq b) => a -> b -> a
2783 The context is, of course, optional. You can't use <Literal>forall</Literal> as
2784 a type variable any more!
2788 Haskell type signatures are implicitly quantified. The <Literal>forall</Literal>
2789 allows us to say exactly what this means. For example:
2807 g :: forall b. (b -> b)
2813 The two are treated identically.
2817 <Title>Universally-quantified data type fields
2821 In a <Literal>data</Literal> or <Literal>newtype</Literal> declaration one can quantify
2822 the types of the constructor arguments. Here are several examples:
2828 data T a = T1 (forall b. b -> b -> b) a
2830 data MonadT m = MkMonad { return :: forall a. a -> m a,
2831 bind :: forall a b. m a -> (a -> m b) -> m b
2834 newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
2840 The constructors now have so-called <Emphasis>rank 2</Emphasis> polymorphic
2841 types, in which there is a for-all in the argument types:
2847 T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
2848 MkMonad :: forall m. (forall a. a -> m a)
2849 -> (forall a b. m a -> (a -> m b) -> m b)
2851 MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
2857 Notice that you don't need to use a <Literal>forall</Literal> if there's an
2858 explicit context. For example in the first argument of the
2859 constructor <Function>MkSwizzle</Function>, an implicit "<Literal>forall a.</Literal>" is
2860 prefixed to the argument type. The implicit <Literal>forall</Literal>
2861 quantifies all type variables that are not already in scope, and are
2862 mentioned in the type quantified over.
2866 As for type signatures, implicit quantification happens for non-overloaded
2867 types too. So if you write this:
2870 data T a = MkT (Either a b) (b -> b)
2873 it's just as if you had written this:
2876 data T a = MkT (forall b. Either a b) (forall b. b -> b)
2879 That is, since the type variable <Literal>b</Literal> isn't in scope, it's
2880 implicitly universally quantified. (Arguably, it would be better
2881 to <Emphasis>require</Emphasis> explicit quantification on constructor arguments
2882 where that is what is wanted. Feedback welcomed.)
2888 <Title>Construction </Title>
2891 You construct values of types <Literal>T, MonadT, Swizzle</Literal> by applying
2892 the constructor to suitable values, just as usual. For example,
2898 (T1 (\x y -> x) 3) :: T Int
2900 (MkSwizzle sort) :: Swizzle
2901 (MkSwizzle reverse) :: Swizzle
2908 MkMonad r b) :: MonadT Maybe
2914 The type of the argument can, as usual, be more general than the type
2915 required, as <Literal>(MkSwizzle reverse)</Literal> shows. (<Function>reverse</Function>
2916 does not need the <Literal>Ord</Literal> constraint.)
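The constructions above can be collected into one module. This is a sketch under some assumptions: it targets a recent GHC with the <Literal>RankNTypes</Literal> pragma, it writes the <Literal>forall</Literal> in <Literal>MkSwizzle</Literal> explicitly because recent compilers require it, and it renames the <Literal>return</Literal> field to the hypothetical <Literal>ret</Literal> to avoid a clash with the Prelude. The helper <Literal>applySwizzle</Literal> is also hypothetical.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

import Data.List (sort)

data T a = T1 (forall b. b -> b -> b) a

newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

data MonadT m = MkMonad
  { ret  :: forall a. a -> m a
  , bind :: forall a b. m a -> (a -> m b) -> m b
  }

t1 :: T Int
t1 = T1 (\x _ -> x) 3

-- reverse is more general than required: it needs no Ord constraint.
swizzles :: [Swizzle]
swizzles = [MkSwizzle sort, MkSwizzle reverse]

maybeMonad :: MonadT Maybe
maybeMonad = MkMonad Just (\m k -> maybe Nothing k m)

-- Unwrap a Swizzle and use it at the element type Char.
applySwizzle :: Swizzle -> String -> String
applySwizzle (MkSwizzle s) = s

main :: IO ()
main = do
  mapM_ (\sw -> putStrLn (applySwizzle sw "cab")) swizzles
  print (bind maybeMonad (ret maybeMonad 20) (\n -> Just (n + 1 :: Int)))
```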
2922 <Title>Pattern matching</Title>
2925 When you use pattern matching, the bound variables may now have
2926 polymorphic types. For example:
2932 f :: T a -> a -> (a, Char)
2933 f (T1 f k) x = (f k x, f 'c' 'd')
2935 g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
2936 g (MkSwizzle s) xs f = s (map f (s xs))
2938 h :: MonadT m -> [m a] -> m [a]
2939 h m [] = return m []
2940 h m (x:xs) = bind m x $ \y ->
2941 bind m (h m xs) $ \ys ->
2948 In the function <Function>h</Function> we use the record selectors <Literal>return</Literal>
2949 and <Literal>bind</Literal> to extract the polymorphic bind and return functions
2950 from the <Literal>MonadT</Literal> data structure, rather than using pattern
2955 You cannot pattern-match against an argument that is polymorphic.
2959 newtype TIM s a = TIM (ST s (Maybe a))
2961 runTIM :: (forall s. TIM s a) -> Maybe a
2962 runTIM (TIM m) = runST m
2968 Here the pattern-match fails, because you can't pattern-match against
2969 an argument of type <Literal>(forall s. TIM s a)</Literal>. Instead you
2970 must bind the variable and pattern match in the right hand side:
2973 runTIM :: (forall s. TIM s a) -> Maybe a
2974 runTIM tm = case tm of { TIM m -> runST m }
2977 The <Literal>tm</Literal> on the right hand side is (invisibly) instantiated, like
2978 any polymorphic value at its occurrence site, and now you can pattern-match
2985 <Title>The partial-application restriction</Title>
2988 There is really only one way in which data structures with polymorphic
2989 components might surprise you: you must not partially apply them.
2990 For example, this is illegal:
2996 map MkSwizzle [sort, reverse]
3002 The restriction is this: <Emphasis>every subexpression of the program must
3003 have a type that has no for-alls, except that in a function
3004 application (f e1…en) the partial applications are not subject to
3005 this rule</Emphasis>. The restriction makes type inference feasible.
3009 In the illegal example, the sub-expression <Literal>MkSwizzle</Literal> has the
3010 polymorphic type <Literal>(Ord b => [b] -> [b]) -> Swizzle</Literal> and is not
3011 a sub-expression of an enclosing application. On the other hand, this
3018 map (T1 (\a b -> a)) [1,2,3]
3024 even though it involves a partial application of <Function>T1</Function>, because
3025 the sub-expression <Literal>T1 (\a b -> a)</Literal> has type <Literal>Int -> T
3032 <Title>Type signatures
3036 Once you have data constructors with universally-quantified fields, or
3037 constants such as <Constant>runST</Constant> that have rank-2 types, it isn't long
3038 before you discover that you need more! Consider:
3044 mkTs f x y = [T1 f x, T1 f y]
3050 <Function>mkTs</Function> is a function that constructs some values of type
3051 <Literal>T</Literal>, using some pieces passed to it. The trouble is that since
3052 <Literal>f</Literal> is a function argument, Haskell assumes that it is
3053 monomorphic, so we'll get a type error when applying <Function>T1</Function> to
3054 it. This is a rather silly example, but the problem really bites in
3055 practice. Lots of people trip over the fact that you can't make
3056 "wrapper functions" for <Constant>runST</Constant> for exactly the same reason.
3057 In short, it is impossible to build abstractions around functions with
3062 The solution is fairly clear. We provide the ability to give a rank-2
3063 type signature for <Emphasis>ordinary</Emphasis> functions (not only data
3064 constructors), thus:
3070 mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
3071 mkTs f x y = [T1 f x, T1 f y]
3077 This type signature tells the compiler to attribute <Literal>f</Literal> with
3078 the polymorphic type <Literal>(forall b. b -> b -> b)</Literal> when type
3079 checking the body of <Function>mkTs</Function>, so now the application of
3080 <Function>T1</Function> is fine.
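A complete, compilable version of this example might look as follows. It is a sketch for a recent GHC with the <Literal>RankNTypes</Literal> pragma. The signature takes two arguments of type <Literal>a</Literal> so that it matches the parameters <Literal>x</Literal> and <Literal>y</Literal>, and the helper <Literal>choose</Literal> is hypothetical.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

data T a = T1 (forall b. b -> b -> b) a

-- Rank-2 signature: 'f' stays polymorphic inside the body, so it can be
-- handed to T1, whose first field must be polymorphic.
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
mkTs f x y = [T1 f x, T1 f y]

-- Apply the stored function to the stored value and a second value.
choose :: T a -> a -> a
choose (T1 f k) z = f k z

main :: IO ()
main = print [choose t 0 | t <- mkTs (\first _ -> first) 1 (2 :: Int)]
```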
3084 There are two restrictions:
3093 You can only define a rank 2 type, specified by the following
3098 rank2type ::= [forall tyvars .] [context =>] funty
3099 funty ::= ([forall tyvars .] [context =>] ty) -> funty
3101 ty ::= ...current Haskell monotype syntax...
3105 Informally, the universal quantification must all be right at the beginning,
3106 or at the top level of a function argument.
3113 There is a restriction on the definition of a function whose
3114 type signature is a rank-2 type: the polymorphic arguments must be
3115 matched on the left hand side of the "<Literal>=</Literal>" sign. You can't
3116 define <Function>mkTs</Function> like this:
3120 mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
3121 mkTs = \ f x y -> [T1 f x, T1 f y]
3126 The same partial-application rule applies to ordinary functions with
3127 rank-2 types as applied to data constructors.
3140 <Title>Type synonyms and hoisting
3144 GHC also allows you to write a <Literal>forall</Literal> in a type synonym, thus:
3146 type Discard a = forall b. a -> b -> a
3151 However, it is often convenient to use this sort of synonym at the right hand
3152 end of an arrow, thus:
3154 type Discard a = forall b. a -> b -> a
3156 g :: Int -> Discard Int
3159 Simply expanding the type synonym would give
3161 g :: Int -> (forall b. Int -> b -> Int)
3163 but GHC "hoists" the <Literal>forall</Literal> to give the isomorphic type
3165 g :: forall b. Int -> Int -> b -> Int
3167 In general, the rule is this: <Emphasis>to determine the type specified by any explicit
3168 user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly
3169 performs the transformation:</Emphasis>
3171 <Emphasis>type1</Emphasis> -> forall a. <Emphasis>type2</Emphasis>
3173 forall a. <Emphasis>type1</Emphasis> -> <Emphasis>type2</Emphasis>
3175 (In fact, GHC tries to retain as much synonym information as possible for use in
3176 error messages, but that is a usability issue.) This rule applies, of course, whether
3177 or not the <Literal>forall</Literal> comes from a synonym. For example, here is another
3178 valid way to write <Literal>g</Literal>'s type signature:
3180 g :: Int -> Int -> forall b. b -> Int
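The sketch below, for a recent GHC with the <Literal>RankNTypes</Literal> pragma, writes the same function once through the synonym and once with the quantifier at the front. It only shows that both spellings are accepted and behave alike. It does not demonstrate how a particular compiler version represents the type internally. The primed name <Literal>g'</Literal> is hypothetical.

```haskell
{-# LANGUAGE RankNTypes #-}
module Main where

type Discard a = forall b. a -> b -> a

-- Once the synonym is expanded, the forall sits to the right of an arrow.
g :: Int -> Discard Int
g x y _ = x + y

-- The same function with the quantifier written at the front.
g' :: forall b. Int -> Int -> b -> Int
g' x y _ = x + y

main :: IO ()
main = print (g 1 2 "ignored", g' 1 2 'c')
```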
3187 <Sect1 id="existential-quantification">
3188 <Title>Existentially quantified data constructors
3192 The idea of using existential quantification in data type declarations
3193 was suggested by Laufer (I believe, though doubtless someone will
3194 correct me), and implemented in Hope+. It's been in Lennart
3195 Augustsson's <Command>hbc</Command> Haskell compiler for several years, and
3196 proved very useful. Here's the idea. Consider the declaration:
3202 data Foo = forall a. MkFoo a (a -> Bool)
3209 The data type <Literal>Foo</Literal> has two constructors with types:
3215 MkFoo :: forall a. a -> (a -> Bool) -> Foo
3222 Notice that the type variable <Literal>a</Literal> in the type of <Function>MkFoo</Function>
3223 does not appear in the data type itself, which is plain <Literal>Foo</Literal>.
3224 For example, the following expression is fine:
3230 [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
3236 Here, <Literal>(MkFoo 3 even)</Literal> packages an integer with a function
3237 <Function>even</Function> that maps an integer to <Literal>Bool</Literal>; and <Function>MkFoo 'c'
3238 isUpper</Function> packages a character with a compatible function. These
3239 two things are each of type <Literal>Foo</Literal> and can be put in a list.
3243 What can we do with a value of type <Literal>Foo</Literal>? In particular,
3244 what happens when we pattern-match on <Function>MkFoo</Function>?
3250 f (MkFoo val fn) = ???
3256 Since all we know about <Literal>val</Literal> and <Function>fn</Function> is that they
3257 are compatible, the only (useful) thing we can do with them is to
3258 apply <Function>fn</Function> to <Literal>val</Literal> to get a boolean. For example:
3265 f (MkFoo val fn) = fn val
3271 What this allows us to do is to package heterogeneous values
3272 together with a bunch of functions that manipulate them, and then treat
3273 that collection of packages in a uniform manner. You can express
3274 quite a bit of object-oriented-like programming this way.
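Putting the pieces together gives a small runnable sketch. It assumes a recent GHC with the <Literal>ExistentialQuantification</Literal> pragma, declares only the <Function>MkFoo</Function> constructor, and uses the hypothetical name <Literal>runFoo</Literal> for the function called <Function>f</Function> above.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
module Main where

import Data.Char (isUpper)

data Foo = forall a. MkFoo a (a -> Bool)

-- The only useful thing to do with the packaged pair: apply fn to val.
runFoo :: Foo -> Bool
runFoo (MkFoo val fn) = fn val

main :: IO ()
main = print (map runFoo [MkFoo (3 :: Int) even, MkFoo 'C' isUpper])
```

The list holds an integer package and a character package side by side, and <Literal>runFoo</Literal> treats them uniformly.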
3277 <Sect2 id="existential">
3278 <Title>Why existential?
3282 What has this to do with <Emphasis>existential</Emphasis> quantification?
3283 Simply that <Function>MkFoo</Function> has the (nearly) isomorphic type
3289 MkFoo :: (exists a . (a, a -> Bool)) -> Foo
3295 But Haskell programmers can safely think of the ordinary
3296 <Emphasis>universally</Emphasis> quantified type given above, thereby avoiding
3297 adding a new existential quantification construct.
3303 <Title>Type classes</Title>
3306 An easy extension (implemented in <Command>hbc</Command>) is to allow
3307 arbitrary contexts before the constructor. For example:
3313 data Baz = forall a. Eq a => Baz1 a a
3314 | forall b. Show b => Baz2 b (b -> b)
3320 The two constructors have the types you'd expect:
3326 Baz1 :: forall a. Eq a => a -> a -> Baz
3327 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
3333 But when pattern matching on <Function>Baz1</Function> the matched values can be compared
3334 for equality, and when pattern matching on <Function>Baz2</Function> the first matched
3335 value can be converted to a string (as well as applying the function to it).
3336 So this program is legal:
3343 f (Baz1 p q) | p == q = "Yes"
3345 f (Baz2 v fn) = show (fn v)
3351 Operationally, in a dictionary-passing implementation, the
3352 constructors <Function>Baz1</Function> and <Function>Baz2</Function> must store the
3353 dictionaries for <Literal>Eq</Literal> and <Literal>Show</Literal> respectively, and
3354 extract them on pattern matching.
3358 Notice the way that the syntax fits smoothly with that used for
3359 universal quantification earlier.
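A compilable sketch of the <Literal>Baz</Literal> example, assuming a recent GHC with the <Literal>ExistentialQuantification</Literal> pragma. The function name <Literal>describe</Literal> and the extra <Literal>otherwise</Literal> branch are additions for illustration.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a   => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

-- Matching on Baz1 brings Eq into scope; matching on Baz2 brings Show.
describe :: Baz -> String
describe (Baz1 p q)
  | p == q    = "Yes"
  | otherwise = "No"
describe (Baz2 v fn) = show (fn v)

main :: IO ()
main = mapM_ (putStrLn . describe)
  [Baz1 'x' 'x', Baz1 True False, Baz2 (41 :: Int) (+ 1)]
```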
3365 <Title>Restrictions</Title>
3368 There are several restrictions on the ways in which existentially-quantified
3369 constructors can be used.
3378 When pattern matching, each pattern match introduces a new,
3379 distinct, type for each existential type variable. These types cannot
3380 be unified with any other type, nor can they escape from the scope of
3381 the pattern match. For example, these fragments are incorrect:
3389 Here, the type bound by <Function>MkFoo</Function> "escapes", because <Literal>a</Literal>
3390 is the result of <Function>f1</Function>. One way to see why this is wrong is to
3391 ask what type <Function>f1</Function> has:
3395 f1 :: Foo -> a -- Weird!
3399 What is this "<Literal>a</Literal>" in the result type? Clearly we don't mean
3404 f1 :: forall a. Foo -> a -- Wrong!
3408 The original program is just plain wrong. Here's another sort of error
3412 f2 (Baz1 a b) (Baz1 p q) = a==q
3416 It's ok to say <Literal>a==b</Literal> or <Literal>p==q</Literal>, but
3417 <Literal>a==q</Literal> is wrong because it equates the two distinct types arising
3418 from the two <Function>Baz1</Function> constructors.
3426 You can't pattern-match on an existentially quantified
3427 constructor in a <Literal>let</Literal> or <Literal>where</Literal> group of
3428 bindings. So this is illegal:
3432 f3 x = a==b where { Baz1 a b = x }
3436 You can only pattern-match
3437 on an existentially-quantified constructor in a <Literal>case</Literal> expression or
3438 in the patterns of a function definition.
3440 The reason for this restriction is really an implementation one.
3441 Type-checking binding groups is already a nightmare without
3442 existentials complicating the picture. Also an existential pattern
3443 binding at the top level of a module doesn't make sense, because it's
3444 not clear how to prevent the existentially-quantified type "escaping".
3445 So for now, there's a simple-to-state restriction. We'll see how
3453 You can't use existential quantification for <Literal>newtype</Literal>
3454 declarations. So this is illegal:
3458 newtype T = forall a. Ord a => MkT a
3462 Reason: a value of type <Literal>T</Literal> must be represented as a pair
3463 of a dictionary for <Literal>Ord t</Literal> and a value of type <Literal>t</Literal>.
3464 That contradicts the idea that <Literal>newtype</Literal> should have no
3465 concrete representation. You can get just the same efficiency and effect
3466 by using <Literal>data</Literal> instead of <Literal>newtype</Literal>. If there is no
3467 overloading involved, then there is more of a case for allowing
3468 an existentially-quantified <Literal>newtype</Literal>,
3469 because the <Literal>data</Literal> version does carry an implementation cost,
3470 but single-field existentially quantified constructors aren't much
3471 use. So the simple restriction (no existential stuff on <Literal>newtype</Literal>)
3472 stands, unless there are convincing reasons to change it.
3480 You can't use <Literal>deriving</Literal> to define instances of a
3481 data type with existentially quantified data constructors.
3483 Reason: in most cases it would not make sense. For example:
3486 data T = forall a. MkT [a] deriving( Eq )
3489 To derive <Literal>Eq</Literal> in the standard way we would need to have equality
3490 between the single component of two <Function>MkT</Function> constructors:
3494 (MkT a) == (MkT b) = ???
3497 But <VarName>a</VarName> and <VarName>b</VarName> have distinct types, and so can't be compared.
3498 It's just about possible to imagine examples in which the derived instance
3499 would make sense, but it seems altogether simpler simply to prohibit such
3500 declarations. Define your own instances!
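As an illustration of defining your own instance, here is one possible hand-written <Literal>Eq</Literal> for the type above. It is a sketch for a recent GHC with the <Literal>ExistentialQuantification</Literal> pragma. Comparing by length is an arbitrary choice made for the example, not a recommended definition of equality.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data T = forall a. MkT [a]

-- The hidden element types may differ, so the elements themselves cannot
-- be compared. This instance compares the list lengths only.
instance Eq T where
  MkT xs == MkT ys = length xs == length ys

main :: IO ()
main = print (MkT "ab" == MkT [True, False], MkT [()] == MkT "abc")
```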
3512 <Sect1 id="sec-assertions">
3514 <IndexTerm><Primary>Assertions</Primary></IndexTerm>
3518 If you want to make use of assertions in your standard Haskell code, you
3519 could define a function like the following:
3525 assert :: Bool -> a -> a
3526 assert False x = error "assertion failed!"
3533 which works, but gives you back a less than useful error message --
3534 an assertion failed, but which and where?
3538 One way out is to define an extended <Function>assert</Function> function which also
3539 takes a descriptive string to include in the error message and
3540 perhaps combine this with the use of a pre-processor which inserts
3541 the source location where <Function>assert</Function> was used.
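A hand-rolled version with a descriptive string could be sketched like this. The names <Literal>assertMsg</Literal> and <Literal>reciprocal</Literal> are hypothetical, and no pre-processor is involved, so the caller has to supply the location text by hand.

```haskell
module Main where

-- Variant of assert that carries a caller-supplied description.
assertMsg :: String -> Bool -> a -> a
assertMsg what False _ = error ("assertion failed: " ++ what)
assertMsg _    True  x = x

reciprocal :: Double -> Double
reciprocal x = assertMsg "reciprocal: zero argument" (x /= 0) (1 / x)

main :: IO ()
main = print (reciprocal 4)  -- prints 0.25
```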
3545 GHC offers a helping hand here, doing all of this for you. For every
3546 use of <Function>assert</Function> in the user's source:
3552 kelvinToC :: Double -> Double
3553 kelvinToC k = assert (k &gt;= 0.0) (k-273.15)
3559 GHC will rewrite this to also include the source location where the
3566 assert pred val ==> assertError "Main.hs|15" pred val
3572 The rewrite is only performed by the compiler when it spots
3573 applications of <Function>Exception.assert</Function>, so you can still define and
3574 use your own versions of <Function>assert</Function>, should you so wish. If not,
3575 import <Literal>Exception</Literal> to make use of <Function>assert</Function> in your code.
3579 To have the compiler ignore uses of assert, use the compiler option
3580 <Option>-fignore-asserts</Option>. <IndexTerm><Primary>-fignore-asserts option</Primary></IndexTerm> That is,
3581 expressions of the form <Literal>assert pred e</Literal> will be rewritten to <Literal>e</Literal>.
3585 Assertion failures can be caught; see the documentation for the
3586 Hugs/GHC Exception library for information on how.
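With a recent GHC, catching an assertion failure might look like the sketch below. It assumes the <Literal>Control.Exception</Literal> module of the current base library rather than the <Literal>Exception</Literal> module named above. Note that assertions are also ignored when optimisation is switched on, in which case the second branch is taken.

```haskell
module Main where

import Control.Exception (AssertionFailed, assert, evaluate, try)

kelvinToC :: Double -> Double
kelvinToC k = assert (k >= 0.0) (k - 273.15)

main :: IO ()
main = do
  -- evaluate forces the result inside IO so that try can see the failure.
  r <- try (evaluate (kelvinToC (-1.0))) :: IO (Either AssertionFailed Double)
  case r of
    Left _  -> putStrLn "assertion failed and was caught"
    Right c -> putStrLn ("assertions ignored, result: " ++ show c)
```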
3591 <Sect1 id="scoped-type-variables">
3592 <Title>Scoped Type Variables
3596 A <Emphasis>pattern type signature</Emphasis> can introduce a <Emphasis>scoped type
3597 variable</Emphasis>. For example
3603 f (xs::[a]) = ys ++ ys
3612 The pattern <Literal>(xs::[a])</Literal> includes a type signature for <VarName>xs</VarName>.
3613 This brings the type variable <Literal>a</Literal> into scope; it scopes over
3614 all the patterns and right hand sides for this equation for <Function>f</Function>.
3615 In particular, it is in scope at the type signature for <VarName>ys</VarName>.
3619 At ordinary type signatures, such as that for <VarName>ys</VarName>, any type variables
3620 mentioned in the type signature <Emphasis>that are not in scope</Emphasis> are
3621 implicitly universally quantified. (If there are no type variables in
3622 scope, all type variables mentioned in the signature are universally
3623 quantified, which is just as in Haskell 98.) In this case, since <VarName>a</VarName>
3624 is in scope, it is not universally quantified, so the type of <VarName>ys</VarName> is
3625 the same as that of <VarName>xs</VarName>. In Haskell 98 it is not possible to declare
3626 a type for <VarName>ys</VarName>; a major benefit of scoped type variables is that
3627 it becomes possible to do so.
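A complete version of this example, written for a recent GHC with the <Literal>ScopedTypeVariables</Literal> pragma, is sketched below. The body chosen for <VarName>ys</VarName> is an assumption made for illustration.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

-- The pattern signature binds 'a'. The signature for ys then refers to
-- that same 'a' instead of being implicitly quantified.
f (xs :: [a]) = ys ++ ys
  where
    ys :: [a]
    ys = reverse xs

main :: IO ()
main = print (f [1, 2, 3 :: Int])
```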
3631 Scoped type variables are implemented in both GHC and Hugs. Where the
3632 implementations differ from the specification below, those differences
3637 So much for the basic idea. Here are the details.
3641 <Title>Scope and implicit quantification</Title>
3649 All the type variables mentioned in the patterns for a single
3650 function definition equation, that are not already in scope,
3651 are brought into scope by the patterns. We describe this set as
3652 the <Emphasis>type variables bound by the equation</Emphasis>.
3659 The type variables thus brought into scope may be mentioned
3660 in ordinary type signatures or pattern type signatures anywhere within
3668 In ordinary type signatures, any type variable mentioned in the
3669 signature that is in scope is <Emphasis>not</Emphasis> universally quantified.
3676 Ordinary type signatures do not bring any new type variables
3677 into scope (except in the type signature itself!). So this is illegal:
3686 It's illegal because <VarName>a</VarName> is not in scope in the body of <Function>f</Function>,
3687 so the ordinary signature <Literal>x::a</Literal> is equivalent to <Literal>x::forall a.a</Literal>;
3688 and that is an incorrect typing.
3695 There is no implicit universal quantification on pattern type
3696 signatures, nor may one write an explicit <Literal>forall</Literal> type in a pattern
3697 type signature. The pattern type signature is a monotype.
3705 The type variables in the head of a <Literal>class</Literal> or <Literal>instance</Literal> declaration
3706 scope over the methods defined in the <Literal>where</Literal> part. For example:
3720 (Not implemented in Hugs yet, Dec 98).
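The last rule, scoping of instance-head variables over the methods, can be sketched as follows for a recent GHC with the <Literal>ScopedTypeVariables</Literal> pragma. The class <Literal>Pretty</Literal> and the local function <Literal>render</Literal> are hypothetical.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

class Pretty a where
  pretty :: a -> String

instance Pretty Bool where
  pretty flag = if flag then "yes" else "no"

instance Pretty b => Pretty [b] where
  pretty xs = concatMap render xs
    where
      -- 'b' is the variable from the instance head, so the Pretty b
      -- constraint of the instance context applies to this signature.
      render :: b -> String
      render x = "<" ++ pretty x ++ ">"

main :: IO ()
main = putStrLn (pretty [True, False])
```

Without the scoping rule, the signature for <Literal>render</Literal> would be read as universally quantified over a fresh <Literal>b</Literal>, and the call to <Literal>pretty</Literal> would have no instance to use.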
3731 <Title>Polymorphism</Title>
3739 Pattern type signatures are completely orthogonal to ordinary, separate
3740 type signatures. The two can be used independently or together. There is
3741 no scoping associated with the names of the type variables in a separate type signature.
3746 f (xs::[b]) = reverse xs
3755 The function must be polymorphic in the type variables
3756 bound by all its equations. Operationally, the type variables bound
3757 by one equation must not:
3764 Be unified with a type (such as <Literal>Int</Literal>, or <Literal>[a]</Literal>).
3770 Be unified with a type variable free in the environment.
3776 Be unified with each other. (They may unify with the type variables
3777 bound by another equation for the same function, of course.)
3784 For example, the following all fail to type check:
3788 f (x::a) (y::b) = [x,y] -- a unifies with b
3790 g (x::a) = x + 1::Int -- a unifies with Int
3792 h x = let k (y::a) = [x,y] -- a is free in the
3793 in k x -- environment
3795 k (x::a) True = ... -- a unifies with Int
3796 k (x::Int) False = ...
3799 w (x::a) = x -- a unifies with [b]
3808 The pattern-bound type variable may, however, be constrained
3809 by the context of the principal type, thus:
3813 f (x::a) (y::a) = x+y*2
3817 gets the inferred type: <Literal>forall a. Num a => a -> a -> a</Literal>.
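This last case can be checked with a short sketch, assuming a recent GHC with the <Literal>ScopedTypeVariables</Literal> pragma. The function is used at two different numeric types to show that it remains polymorphic.

```haskell
{-# LANGUAGE ScopedTypeVariables #-}
module Main where

-- Both pattern signatures name the same variable 'a'. The Num constraint
-- comes from the body; 'a' is constrained but still polymorphic.
f (x :: a) (y :: a) = x + y * 2

main :: IO ()
main = print (f (1 :: Int) 2, f 1.5 (2 :: Double))
```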
3828 <Title>Result type signatures</Title>
3836 The result type of a function can be given a signature,
3841 f (x::a) :: [a] = [x,x,x]
3845 The final <Literal>:: [a]</Literal> after all the patterns gives a signature to the
3846 result type. Sometimes this is the only way of naming the type variable
3851 f :: Int -> [a] -> [a]
3852 f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x)
3853 in \xs -> map g (reverse xs `zip` xs)
3865 Result type signatures are not yet implemented in Hugs.
3871 <Title>Pattern signatures on other constructs</Title>
3879 A pattern type signature can be on an arbitrary sub-pattern, not
3884 f ((x,y)::(a,b)) = (y,x) :: (b,a)
3893 Pattern type signatures, including the result part, can be used
3894 in lambda abstractions:
3898 (\ (x::a, y) :: a -> x)
3902 Type variables bound by these patterns must be polymorphic in
3903 the sense defined above.
3908 f1 (x::c) = f1 x -- ok
3909 f2 = \(x::c) -> f2 x -- not ok
3913 Here, <Function>f1</Function> is OK, but <Function>f2</Function> is not, because <VarName>c</VarName> gets unified
3914 with a type variable free in the environment, in this
3915 case, the type of <Function>f2</Function>, which is in the environment when
3916 the lambda abstraction is checked.
3923 Pattern type signatures, including the result part, can be used
3924 in <Literal>case</Literal> expressions:
3928 case e of { (x::a, y) :: a -> x }
3932 The pattern-bound type variables must, as usual,
3933 be polymorphic in the following sense: each case alternative,
3934 considered as a lambda abstraction, must be polymorphic.
3939 case (True,False) of { (x::a, y) -> x }
3943 Even though the context is that of a pair of booleans,
3944 the alternative itself is polymorphic. Of course, it is
3949 case (True,False) of { (x::Bool, y) -> x }
3958 To avoid ambiguity, the type after the “<Literal>::</Literal>” in a result
3959 pattern signature on a lambda or <Literal>case</Literal> must be atomic (i.e. a single
3960 token or a parenthesised type of some sort). To see why,
3961 consider how one would parse this:
3974 Pattern type signatures that bind new type variables
3975 may not be used in pattern bindings at all.
3980 f x = let (y, z::a) = x in ...
3984 But these are OK, because they do not bind fresh type variables:
3988 f1 x = let (y, z::Int) = x in ...
3989 f2 (x::(Int,a)) = let (y, z::a) = x in ...
3993 However, a single variable is considered a degenerate function binding,
3994 rather than a degenerate pattern binding, so this is permitted, even
3995 though it binds a type variable:
3999 f :: (b->b) = \(x::b) -> x
4008 Such degenerate function bindings do not fall under the monomorphism
4015 g :: a -> a -> Bool = \x y -> x==y
4021 Here <Function>g</Function> has type <Literal>forall a. Eq a => a -> a -> Bool</Literal>, just as if
4022 <Function>g</Function> had a separate type signature. Lacking a type signature, <Function>g</Function>
4023 would get a monomorphic type.
4029 <Title>Existentials</Title>
4037 Pattern type signatures can bind existential type variables.
4042 data T = forall a. MkT [a]
4045 f (MkT [t::a]) = MkT t3
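The following is a minimal sketch of the same idea, assuming a recent GHC with the <Literal>ExistentialQuantification</Literal> and <Literal>ScopedTypeVariables</Literal> language pragmas. It uses an ordinary expression signature where the original relies on a result type signature, and the names <Function>triple</Function> and <Function>sizeT</Function> are hypothetical, chosen only for this illustration.

```haskell
{-# LANGUAGE ExistentialQuantification, ScopedTypeVariables #-}
module Main where

data T = forall a. MkT [a]

-- The pattern signature gives the hidden element type a name, `a`,
-- which can then be mentioned in the signature on the right-hand side.
triple :: T -> T
triple (MkT [t :: a]) = MkT (replicate 3 t :: [a])
triple other          = other

-- The element type is hidden, but the length is still observable.
sizeT :: T -> Int
sizeT (MkT xs) = length xs

main :: IO ()
main = print (sizeT (triple (MkT "x")), sizeT (triple (MkT [1 :: Int, 2])))
```

Without the pattern signature there would be no way to write the annotation <Literal>[a]</Literal>, since the existential type has no other name in scope.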
4062 <Sect1 id="pragmas">
4067 GHC supports several pragmas, or instructions to the compiler placed
4068 in the source code. Pragmas don't affect the meaning of the program,
4069 but they might affect the efficiency of the generated code.
4072 <Sect2 id="inline-pragma">
4073 <Title>INLINE pragma
4075 <IndexTerm><Primary>INLINE pragma</Primary></IndexTerm>
4076 <IndexTerm><Primary>pragma, INLINE</Primary></IndexTerm></Title>
4079 GHC (with <Option>-O</Option>, as always) tries to inline (or “unfold”)
4080 functions/values that are “small enough,” thus avoiding the call
4081 overhead and possibly exposing other more-wonderful optimisations.
4085 You will probably see these unfoldings (in Core syntax) in your
4090 Normally, if GHC decides a function is “too expensive” to inline, it
4091 will not do so, nor will it export that unfolding for other modules to
4096 The sledgehammer you can bring to bear is the
4097 <Literal>INLINE</Literal><IndexTerm><Primary>INLINE pragma</Primary></IndexTerm> pragma, used thusly:
4100 key_function :: Int -> String -> (Bool, Double)
4102 #ifdef __GLASGOW_HASKELL__
4103 {-# INLINE key_function #-}
4107 (You don't need to do the C pre-processor carry-on unless you're going
4108 to stick the code through HBC—it doesn't like <Literal>INLINE</Literal> pragmas.)
4112 The major effect of an <Literal>INLINE</Literal> pragma is to declare a function's
4113 “cost” to be very low. The normal unfolding machinery will then be
4114 very keen to inline it.
4118 An <Literal>INLINE</Literal> pragma for a function can be put anywhere its type
4119 signature could be put.
4123 <Literal>INLINE</Literal> pragmas are a particularly good idea for the
4124 <Literal>then</Literal>/<Literal>return</Literal> (or <Literal>bind</Literal>/<Literal>unit</Literal>) functions in a monad.
4125 For example, in GHC's own <Literal>UniqueSupply</Literal> monad code, we have:
4128 #ifdef __GLASGOW_HASKELL__
4129 {-# INLINE thenUs #-}
4130 {-# INLINE returnUs #-}
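To make the monad case concrete, here is a small sketch. It is not GHC's actual <Literal>UniqueSupply</Literal> code: the state-passing type <Literal>UniqSM</Literal> and the helpers <Function>getUnique</Function> and <Function>twoUniques</Function> are hypothetical stand-ins, and the C pre-processor wrapper is left out on the assumption that only GHC will compile the file.

```haskell
module Main where

-- Hypothetical stand-in for a unique-supply monad: a function from the
-- next free Int to a result paired with the updated counter.
type UniqSM a = Int -> (a, Int)

{-# INLINE thenUs #-}
thenUs :: UniqSM a -> (a -> UniqSM b) -> UniqSM b
thenUs m k us = case m us of
  (a, us') -> k a us'

{-# INLINE returnUs #-}
returnUs :: a -> UniqSM a
returnUs a us = (a, us)

-- Hand out the current counter value and advance the counter.
getUnique :: UniqSM Int
getUnique us = (us, us + 1)

twoUniques :: UniqSM (Int, Int)
twoUniques =
  getUnique `thenUs` \a ->
  getUnique `thenUs` \b ->
  returnUs (a, b)

main :: IO ()
main = print (fst (twoUniques 0))
```

Each pragma sits next to the type signature of the function it names, which is one of the places a signature could go.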
4138 <Sect2 id="noinline-pragma">
4139 <Title>NOINLINE pragma
4143 <IndexTerm><Primary>NOINLINE pragma</Primary></IndexTerm>
4144 <IndexTerm><Primary>pragma, NOINLINE</Primary></IndexTerm>
4148 The <Literal>NOINLINE</Literal> pragma does exactly what you'd expect: it stops the
4149 named function from being inlined by the compiler. You shouldn't ever
4150 need to do this, unless you're very cautious about code size.
4155 <Sect2 id="specialize-pragma">
4156 <Title>SPECIALIZE pragma
4160 <IndexTerm><Primary>SPECIALIZE pragma</Primary></IndexTerm>
4161 <IndexTerm><Primary>pragma, SPECIALIZE</Primary></IndexTerm>
4162 <IndexTerm><Primary>overloading, death to</Primary></IndexTerm>
4166 (UK spelling also accepted.) For key overloaded functions, you can
4167 create extra versions (NB: more code space) specialised to particular
4168 types. Thus, if you have an overloaded function:
4174 hammeredLookup :: Ord key => [(key, value)] -> key -> value
4180 If it is heavily used on lists with <Literal>Widget</Literal> keys, you could
4181 specialise it as follows:
4184 {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
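Here is a self-contained sketch of that pragma in use. The <Literal>Widget</Literal> type and the body of <Function>hammeredLookup</Function> are assumptions made for the sake of the example; only the type signature and the pragma come from the text above.

```haskell
module Main where

-- Hypothetical key type, standing in for whatever Widget really is.
newtype Widget = Widget Int
  deriving (Eq, Ord, Show)

-- An overloaded linear lookup; the body is illustrative only.
hammeredLookup :: Ord key => [(key, value)] -> key -> value
hammeredLookup [] _ = error "hammeredLookup: key not found"
hammeredLookup ((k, v) : rest) key
  | compare k key == EQ = v
  | otherwise           = hammeredLookup rest key

-- Ask for an extra copy of the function with the Ord dictionary fixed
-- to the Widget instance.
{-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}

main :: IO ()
main = putStrLn (hammeredLookup [(Widget 1, "one"), (Widget 2, "two")] (Widget 2))
```

The pragma changes only how the code is compiled; the program gives the same answer with or without it.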
4190 To get very fancy, you can also specify a named function to use for
4191 the specialised value, by adding <Literal>= blah</Literal>, as in:
4194 {-# SPECIALIZE hammeredLookup :: ...as before... = blah #-}
4197 It's <Emphasis>Your Responsibility</Emphasis> to make sure that <Function>blah</Function> really
4198 behaves as a specialised version of <Function>hammeredLookup</Function>!!!
4202 NOTE: the <Literal>=blah</Literal> feature isn't implemented in GHC 4.xx.
4206 An example in which the <Literal>= blah</Literal> form will Win Big:
4209 toDouble :: Real a => a -> Double
4210 toDouble = fromRational . toRational
4212 {-# SPECIALIZE toDouble :: Int -> Double = i2d #-}
4213 i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
4216 The <Function>i2d</Function> function is virtually one machine instruction; the
4217 default conversion—via an intermediate <Literal>Rational</Literal>—is obscenely
4218 expensive by comparison.
4222 By using the US spelling, your <Literal>SPECIALIZE</Literal> pragma will work with
4223 HBC, too. Note that HBC doesn't support the <Literal>= blah</Literal> form.
4227 A <Literal>SPECIALIZE</Literal> pragma for a function can be put anywhere its type
4228 signature could be put.
4233 <Sect2 id="specialize-instance-pragma">
4234 <Title>SPECIALIZE instance pragma
4238 <IndexTerm><Primary>SPECIALIZE pragma</Primary></IndexTerm>
4239 <IndexTerm><Primary>overloading, death to</Primary></IndexTerm>
4240 Same idea, except for instance declarations. For example:
4243 instance (Eq a) => Eq (Foo a) where { ... usual stuff ... }
4245 {-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
4248 Compatible with HBC, by the way.
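A compilable sketch follows, with hypothetical definitions of <Literal>Foo</Literal> and <Literal>Bar</Literal> filled in. One assumption to flag: as far as we know, recent GHC releases expect the pragma inside the <Literal>where</Literal> part of the instance declaration, so that is where it is placed here.

```haskell
module Main where

-- Hypothetical types, invented for this example.
data Bar = Bar Int
  deriving Eq

data Foo a = Foo a a

instance Eq a => Eq (Foo a) where
  -- Request a copy of this instance specialised to one element type.
  {-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
  Foo a b == Foo c d = a == c && b == d

sample :: Foo [(Int, Bar)]
sample = Foo [(1, Bar 2)] []

main :: IO ()
main = print (sample == sample)
```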
4253 <Sect2 id="line-pragma">
4258 <IndexTerm><Primary>LINE pragma</Primary></IndexTerm>
4259 <IndexTerm><Primary>pragma, LINE</Primary></IndexTerm>
4263 This pragma is similar to C's <Literal>#line</Literal> directive, and is mainly for use in
4264 automatically generated Haskell code. It lets you specify the line
4265 number and filename of the original code; for example
4271 {-# LINE 42 "Foo.vhs" #-}
4277 if you'd generated the current file from something called <Filename>Foo.vhs</Filename>
4278 and this line corresponds to line 42 in the original. GHC will adjust
4279 its error messages to refer to the line/file named in the <Literal>LINE</Literal>
4286 <Title>RULES pragma</Title>
4289 The RULES pragma lets you specify rewrite rules. It is described in
4290 <XRef LinkEnd="rewrite-rules">.
4297 <Sect1 id="rewrite-rules">
4298 <Title>Rewrite rules
4300 <IndexTerm><Primary>RULES pragma</Primary></IndexTerm>
4301 <IndexTerm><Primary>pragma, RULES</Primary></IndexTerm>
4302 <IndexTerm><Primary>rewrite rules</Primary></IndexTerm></Title>
4305 The programmer can specify rewrite rules as part of the source program
4306 (in a pragma). GHC applies these rewrite rules wherever it can.
4314 "map/map" forall f g xs. map f (map g xs) = map (f.g) xs
4321 <Title>Syntax</Title>
4324 From a syntactic point of view:
4330 Each rule has a name, enclosed in double quotes. The name itself has
4331 no significance at all. It is only used when reporting how many times the rule fired.
4337 There may be zero or more rules in a <Literal>RULES</Literal> pragma.
4343 Layout applies in a <Literal>RULES</Literal> pragma. Currently no new indentation level
4344 is set, so you must lay out your rules starting in the same column as the
4345 enclosing definitions.
4351 Each variable mentioned in a rule must either be in scope (e.g. <Function>map</Function>),
4352 or bound by the <Literal>forall</Literal> (e.g. <Function>f</Function>, <Function>g</Function>, <Function>xs</Function>). The variables bound by
4353 the <Literal>forall</Literal> are called the <Emphasis>pattern</Emphasis> variables. They are separated
4354 by spaces, just like in a type <Literal>forall</Literal>.
4360 A pattern variable may optionally have a type signature.
4361 If the type of the pattern variable is polymorphic, it <Emphasis>must</Emphasis> have a type signature.
4362 For example, here is the <Literal>foldr/build</Literal> rule:
4365 "fold/build" forall k z (g::forall b. (a->b->b) -> b -> b) .
4366 foldr k z (build g) = g k z
4369 Since <Function>g</Function> has a polymorphic type, it must have a type signature.
4376 The left hand side of a rule must consist of a top-level variable applied
4377 to arbitrary expressions. For example, this is <Emphasis>not</Emphasis> OK:
4380 "wrong1" forall e1 e2. case True of { True -> e1; False -> e2 } = e1
4381 "wrong2" forall f. f True = True
4384 In <Literal>"wrong1"</Literal>, the LHS is not an application; in <Literal>"wrong2"</Literal>, the LHS has a pattern variable
4391 A rule does not need to be in the same module as (any of) the
4392 variables it mentions, though of course they need to be in scope.
4398 Rules are automatically exported from a module, just as instance declarations are.
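The syntactic points above can be seen together in a small module. This is a sketch only: <Function>myMap</Function> is a hypothetical stand-in for <Function>map</Function>, used so that the example does not interfere with any rules the library already attaches to the real function, and the <Literal>NOINLINE</Literal> pragma is our own precaution to keep calls to <Function>myMap</Function> visible for matching.

```haskell
module Main where

-- One named rule. The pattern variables f, g and xs are bound by the
-- forall and separated by spaces; myMap is an ordinary top-level variable.
{-# RULES
"myMap/myMap" forall f g xs. myMap f (myMap g xs) = myMap (f . g) xs
  #-}

myMap :: (a -> b) -> [a] -> [b]
myMap _ []       = []
myMap f (x : xs) = f x : myMap f xs
{-# NOINLINE myMap #-}

main :: IO ()
main = print (myMap (+ 1) (myMap (* 2) [1, 2, 3 :: Int]))
```

Both sides of the rule mean the same thing, so the output does not depend on whether the rule fires.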
4409 <Title>Semantics</Title>
4412 From a semantic point of view:
4418 Rules are only applied if you use the <Option>-O</Option> flag.
4424 Rules are regarded as left-to-right rewrite rules.
4425 When GHC finds an expression that is a substitution instance of the LHS
4426 of a rule, it replaces the expression by the (appropriately-substituted) RHS.
4427 By "a substitution instance" we mean that the LHS can be made equal to the
4428 expression by substituting for the pattern variables.
4435 The LHS and RHS of a rule are typechecked, and must have the
4443 GHC makes absolutely no attempt to verify that the LHS and RHS
4444 of a rule have the same meaning. That is undecidable in general, and
4445 infeasible in most interesting cases. The responsibility is entirely the programmer's!
4452 GHC makes no attempt to make sure that the rules are confluent or
4453 terminating. For example:
4456 "loop" forall x y. f x y = f y x
4459 This rule will cause the compiler to go into an infinite loop.
4466 If more than one rule matches a call, GHC will choose one arbitrarily to apply.
4472 GHC currently uses a very simple, syntactic, matching algorithm
4473 for matching a rule LHS with an expression. It seeks a substitution
4474 which makes the LHS and expression syntactically equal modulo alpha
4475 conversion. The pattern (rule), but not the expression, is eta-expanded if
4476 necessary. (Eta-expanding the expression can lead to laziness bugs.)
4477 Beta conversion is not performed (that would be higher-order matching).
4481 Matching is carried out on GHC's intermediate language, which includes
4482 type abstractions and applications. So a rule only matches if the
4483 types match too. See <XRef LinkEnd="rule-spec"> below.
4489 GHC keeps trying to apply the rules as it optimises the program.
4490 For example, consider:
4499 The expression <Literal>s (t xs)</Literal> does not match the rule <Literal>"map/map"</Literal>, but GHC
4500 will substitute for <VarName>s</VarName> and <VarName>t</VarName>, giving an expression which does match.
4501 If <VarName>s</VarName> or <VarName>t</VarName> was (a) used more than once, and (b) large or a redex, then it would
4502 not be substituted, and the rule would not fire.
4509 In the earlier phases of compilation, GHC inlines <Emphasis>nothing
4510 that appears on the LHS of a rule</Emphasis>, because once you have substituted
4511 for something you can't match against it (given the simple-minded
4512 matching). So if you write the rule
4515 "map/map" forall f g. map f . map g = map (f.g)
4518 this <Emphasis>won't</Emphasis> match the expression <Literal>map f (map g xs)</Literal>.
4519 It will only match something written with explicit use of ".".
4520 Well, not quite. It <Emphasis>will</Emphasis> match the expression
4526 where <Function>wibble</Function> is defined:
4529 wibble f g = map f . map g
4532 because <Function>wibble</Function> will be inlined (it's small).
4534 Later on in compilation, GHC starts inlining even things on the
4535 LHS of rules, but still leaves the rules enabled. This inlining
4536 policy is controlled by the per-simplification-pass flag <Option>-finline-phase</Option><Emphasis>n</Emphasis>.
4543 All rules are implicitly exported from the module, and are therefore
4544 in force in any module that imports the module that defined the rule, directly
4545 or indirectly. (That is, if A imports B, which imports C, then C's rules are
4546 in force when compiling A.) The situation is very similar to that for instance
4558 <Title>List fusion</Title>
4561 The RULES mechanism is used to implement fusion (deforestation) of common list functions.
4562 If a "good consumer" consumes an intermediate list constructed by a "good producer", the
4563 intermediate list should be eliminated entirely.
4567 The following are good producers:
4579 Enumerations of <Literal>Int</Literal> and <Literal>Char</Literal> (e.g. <Literal>['a'..'z']</Literal>).
4585 Explicit lists (e.g. <Literal>[True, False]</Literal>)
4591 The cons constructor (e.g. <Literal>3:4:[]</Literal>)
4597 <Function>++</Function>
4603 <Function>map</Function>
4609 <Function>filter</Function>
4615 <Function>iterate</Function>, <Function>repeat</Function>
4621 <Function>zip</Function>, <Function>zipWith</Function>
4630 The following are good consumers:
4642 <Function>array</Function> (on its second argument)
4648 <Function>length</Function>
4654 <Function>++</Function> (on its first argument)
4660 <Function>map</Function>
4666 <Function>filter</Function>
4672 <Function>concat</Function>
4678 <Function>unzip</Function>, <Function>unzip2</Function>, <Function>unzip3</Function>, <Function>unzip4</Function>
4684 <Function>zip</Function>, <Function>zipWith</Function> (but on one argument only; if both are good producers, <Function>zip</Function>
4685 will fuse with one but not the other)
4691 <Function>partition</Function>
4697 <Function>head</Function>
4703 <Function>and</Function>, <Function>or</Function>, <Function>any</Function>, <Function>all</Function>
4709 <Function>sequence_</Function>
4715 <Function>msum</Function>
4721 <Function>sortBy</Function>
4730 So, for example, the following should generate no intermediate lists:
4733 array (1,10) [(i,i*i) | i <- map (+ 1) [0..9]]
4739 This list could readily be extended; if there are Prelude functions that you use
4740 a lot which are not included, please tell us.
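For illustration, here are two pipelines built only from functions in the lists above. The names <Function>countEvens</Function> and <Function>allSquaresPositive</Function> are hypothetical. Whether the intermediate lists actually disappear depends on the GHC version and on compiling with <Option>-O</Option>, so treat this as a sketch of the intended shape, not a guarantee.

```haskell
module Main where

-- Producer: the enumeration [1 .. n]. map and filter act as both
-- consumer and producer; length is the final consumer.
countEvens :: Int -> Int
countEvens n = length (filter even (map (+ 1) [1 .. n]))

-- Another chain: enumeration, two maps, then the consumer `and`.
allSquaresPositive :: Int -> Bool
allSquaresPositive n = and (map (> 0) (map (\i -> i * i) [1 .. n]))

main :: IO ()
main = do
  print (countEvens 10)
  print (allSquaresPositive 5)
```

The results are the same whether or not fusion happens; only the amount of allocation differs.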
4744 If you want to write your own good consumers or producers, look at the
4745 Prelude definitions of the above functions to see how to do so.
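As a starting point, here is a sketch of a hand-written producer and consumer. It assumes a recent GHC, where <Function>build</Function> can be imported from <Literal>GHC.Exts</Literal>. The names <Function>upTo</Function> and <Function>sumList</Function> are hypothetical.

```haskell
module Main where

import GHC.Exts (build)

-- A producer written with build: the list is abstracted over its cons
-- and nil, so a consumer written with foldr can supply its own.
upTo :: Int -> [Int]
upTo n = build (\cons nil ->
  let go i | i > n     = nil
           | otherwise = i `cons` go (i + 1)
  in go 1)
{-# INLINE upTo #-}

-- A consumer written with foldr.
sumList :: [Int] -> Int
sumList = foldr (+) 0

main :: IO ()
main = print (sumList (upTo 4))
```

The shape is what matters: a producer expressed through <Function>build</Function> and a consumer expressed through <Function>foldr</Function> are what the <Literal>foldr/build</Literal> rule matches on.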
4750 <Sect2 id="rule-spec">
4751 <Title>Specialisation
4755 Rewrite rules can be used to get the same effect as a feature
4756 present in earlier versions of GHC:
4759 {-# SPECIALIZE fromIntegral :: Int8 -> Int16 = int8ToInt16 #-}
4762 This told GHC to use <Function>int8ToInt16</Function> instead of <Function>fromIntegral</Function> whenever
4763 the latter was called with type <Literal>Int8 -> Int16</Literal>. That is, rather than
4764 specialising the original definition of <Function>fromIntegral</Function> the programmer is
4765 promising that it is safe to use <Function>int8ToInt16</Function> instead.
4769 This feature is no longer in GHC. But rewrite rules let you do the
4774 "fromIntegral/Int8/Int16" fromIntegral = int8ToInt16
4778 This slightly odd-looking rule instructs GHC to replace <Function>fromIntegral</Function>
4779 by <Function>int8ToInt16</Function> <Emphasis>whenever the types match</Emphasis>. Speaking more operationally,
4780 GHC adds the type and dictionary applications to get the typed rule
4783 forall (d1::Integral Int8) (d2::Num Int16) .
4784 fromIntegral Int8 Int16 d1 d2 = int8ToInt16
4788 This rule does not need to be in the same file as <Function>fromIntegral</Function>,
4789 unlike the <Literal>SPECIALISE</Literal> pragmas which currently do (so that they
4790 have an original definition available to specialise).
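The same trick can be applied to the <Function>toDouble</Function> function from the <Literal>SPECIALIZE</Literal> section. In this sketch <Function>intToDouble</Function> is a hypothetical hand-written replacement, and the <Literal>NOINLINE</Literal> pragma is our own precaution so that calls to <Function>toDouble</Function> survive long enough to be matched. The replacement is deliberately not defined in terms of <Function>toDouble</Function> itself; otherwise the rule could rewrite its own right-hand side.

```haskell
module Main where

toDouble :: Real a => a -> Double
toDouble = fromRational . toRational
{-# NOINLINE toDouble #-}

-- Hypothetical cheap replacement for the Int case.
intToDouble :: Int -> Double
intToDouble = fromIntegral

-- The rule applies only where the types match, i.e. where toDouble is
-- used at type Int -> Double.
{-# RULES "toDouble/Int" toDouble = intToDouble #-}

main :: IO ()
main = print (toDouble (3 :: Int), toDouble (2.5 :: Float))
```

As with any rule, it is the programmer's job to make sure both sides agree; here they do, so the result is the same with or without <Option>-O</Option>.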
4796 <Title>Controlling what's going on</Title>
4804 Use <Option>-ddump-rules</Option> to see what transformation rules GHC is using.
4810 Use <Option>-ddump-simpl-stats</Option> to see what rules are being fired.
4811 If you add <Option>-dppr-debug</Option> you get a more detailed listing.
4817 The definition of (say) <Function>build</Function> in <Filename>PrelBase.lhs</Filename> looks like this:
4820 build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
4821 {-# INLINE build #-}
4825 Notice the <Literal>INLINE</Literal>! That prevents <Literal>(:)</Literal> from being inlined when compiling
4826 <Literal>PrelBase</Literal>, so that an importing module will “see” the <Literal>(:)</Literal>, and can
4827 match it on the LHS of a rule. <Literal>INLINE</Literal> prevents any inlining happening
4828 in the RHS of the <Literal>INLINE</Literal> thing. I regret the delicacy of this.
4835 In <Filename>ghc/lib/std/PrelBase.lhs</Filename> look at the rules for <Function>map</Function> to
4836 see how to write rules that will do fusion and yet give an efficient
4837 program even if fusion doesn't happen. More rules in <Filename>PrelList.lhs</Filename>.