1 <?xml version="1.0" encoding="iso-8859-1"?>
3 <indexterm><primary>language, GHC</primary></indexterm>
4 <indexterm><primary>extensions, GHC</primary></indexterm>
5 As with all known Haskell systems, GHC implements some extensions to
6 the language. They are all enabled by options; by default GHC
7 understands only plain Haskell 98.
11 Some of the Glasgow extensions serve to give you access to the
12 underlying facilities with which we implement Haskell. Thus, you can
13 get at the Raw Iron, if you are willing to write some non-portable
14 code at a more primitive level. You need not be “stuck”
15 on performance because of the implementation costs of Haskell's
16 “high-level” features—you can always code
17 “under” them. In an extreme case, you can write all your
18 time-critical code in C, and then just glue it together with Haskell!
22 Before you get too carried away working at the lowest level (e.g.,
23 sloshing <literal>MutableByteArray#</literal>s around your
24 program), you may wish to check if there are libraries that provide a
25 “Haskellised veneer” over the features you want. The
26 separate <ulink url="../libraries/index.html">libraries
27 documentation</ulink> describes all the libraries that come with GHC.
30 <!-- LANGUAGE OPTIONS -->
31 <sect1 id="options-language">
32 <title>Language options</title>
34 <indexterm><primary>language</primary><secondary>option</secondary>
36 <indexterm><primary>options</primary><secondary>language</secondary>
38 <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
<para>The language option flags control which variations of the language are
permitted. Leaving out all of them gives you standard Haskell 98.</para>
45 <para>Generally speaking, all the language options are introduced by "<option>-X</option>",
46 e.g. <option>-XTemplateHaskell</option>.
49 <para> All the language options can be turned off by using the prefix "<option>No</option>";
50 e.g. "<option>-XNoTemplateHaskell</option>".</para>
52 <para> Language options recognised by Cabal can also be enabled using the <literal>LANGUAGE</literal> pragma,
thus <literal>{-# LANGUAGE TemplateHaskell #-}</literal> (see <xref linkend="language-pragma"/>). </para>
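<para>As a minimal sketch of both mechanisms, the module below combines a
<literal>LANGUAGE</literal> pragma with the <option>No</option> prefix. The
extension name <literal>NoImplicitPrelude</literal> is chosen here purely for
illustration (it is not discussed in this section); it switches off the
automatic import of the Prelude, so only explicitly imported names are in
scope. The same effect should be obtainable on the command line with
<option>-XNoImplicitPrelude</option>.</para>
<programlisting>
{-# LANGUAGE NoImplicitPrelude #-}
module Main where

-- With the implicit Prelude import switched off, everything we use
-- has to be imported by hand.
import Prelude (IO, putStrLn)

main :: IO ()
main = putStrLn "hello from a module without the implicit Prelude"
</programlisting>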
55 <para>The flag <option>-fglasgow-exts</option>
56 <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
57 is equivalent to enabling the following extensions:
58 <option>-XPrintExplicitForalls</option>,
59 <option>-XForeignFunctionInterface</option>,
60 <option>-XUnliftedFFITypes</option>,
61 <option>-XGADTs</option>,
62 <option>-XImplicitParams</option>,
63 <option>-XScopedTypeVariables</option>,
64 <option>-XUnboxedTuples</option>,
65 <option>-XTypeSynonymInstances</option>,
66 <option>-XStandaloneDeriving</option>,
67 <option>-XDeriveDataTypeable</option>,
68 <option>-XFlexibleContexts</option>,
69 <option>-XFlexibleInstances</option>,
70 <option>-XConstrainedClassMethods</option>,
71 <option>-XMultiParamTypeClasses</option>,
72 <option>-XFunctionalDependencies</option>,
73 <option>-XMagicHash</option>,
74 <option>-XPolymorphicComponents</option>,
75 <option>-XExistentialQuantification</option>,
76 <option>-XUnicodeSyntax</option>,
77 <option>-XPostfixOperators</option>,
78 <option>-XPatternGuards</option>,
79 <option>-XLiberalTypeSynonyms</option>,
80 <option>-XRankNTypes</option>,
81 <option>-XImpredicativeTypes</option>,
82 <option>-XTypeOperators</option>,
83 <option>-XRecursiveDo</option>,
84 <option>-XParallelListComp</option>,
85 <option>-XEmptyDataDecls</option>,
86 <option>-XKindSignatures</option>,
87 <option>-XGeneralizedNewtypeDeriving</option>,
88 <option>-XTypeFamilies</option>.
89 Enabling these options is the <emphasis>only</emphasis>
effect of <option>-fglasgow-exts</option>.
91 We are trying to move away from this portmanteau flag,
92 and towards enabling features individually.</para>
96 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
97 <sect1 id="primitives">
98 <title>Unboxed types and primitive operations</title>
100 <para>GHC is built on a raft of primitive data types and operations;
101 "primitive" in the sense that they cannot be defined in Haskell itself.
102 While you really can use this stuff to write fast code,
103 we generally find it a lot less painful, and more satisfying in the
104 long run, to use higher-level language features and libraries. With
105 any luck, the code you write will be optimised to the efficient
unboxed version in any case. And if it isn't, we'd like to know about it.</para>
109 <para>All these primitive data types and operations are exported by the
110 library <literal>GHC.Prim</literal>, for which there is
111 <ulink url="../libraries/base/GHC.Prim.html">detailed online documentation</ulink>.
112 (This documentation is generated from the file <filename>compiler/prelude/primops.txt.pp</filename>.)
115 If you want to mention any of the primitive data types or operations in your
116 program, you must first import <literal>GHC.Prim</literal> to bring them
117 into scope. Many of them have names ending in "#", and to mention such
118 names you need the <option>-XMagicHash</option> extension (<xref linkend="magic-hash"/>).
121 <para>The primops make extensive use of <link linkend="glasgow-unboxed">unboxed types</link>
122 and <link linkend="unboxed-tuples">unboxed tuples</link>, which
123 we briefly summarise here. </para>
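<para>The following sketch shows what mentioning primitive types and
operations looks like in practice. The function <literal>sumSquares</literal>
is a made-up example, and the primitives are imported from
<literal>GHC.Exts</literal> (which re-exports them) rather than from
<literal>GHC.Prim</literal> directly; treat that choice as an assumption of
this example. Note that the <option>-XMagicHash</option> extension is needed
just to write the names ending in "#".</para>
<programlisting>
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), Int#, (+#), (-#), (*#))

-- Sum of the squares of 1..n, accumulated in unboxed Int# values.
sumSquares :: Int -> Int
sumSquares n@(I# n#)
  | n <= 0    = 0
  | otherwise = I# (go n# 0#)
  where
    -- A local function may take and return unboxed values; the loop
    -- counts down and stops when it matches the literal 0#.
    go :: Int# -> Int# -> Int#
    go 0# acc = acc
    go i  acc = go (i -# 1#) (acc +# (i *# i))

main :: IO ()
main = print (sumSquares 10)   -- prints 385
</programlisting>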
125 <sect2 id="glasgow-unboxed">
130 <indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm>
133 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
134 that values of that type are represented by a pointer to a heap
135 object. The representation of a Haskell <literal>Int</literal>, for
136 example, is a two-word heap object. An <firstterm>unboxed</firstterm>
type, however, is represented by the value itself; no pointers or heap
allocation are involved.
142 Unboxed types correspond to the “raw machine” types you
143 would use in C: <literal>Int#</literal> (long int),
144 <literal>Double#</literal> (double), <literal>Addr#</literal>
145 (void *), etc. The <emphasis>primitive operations</emphasis>
146 (PrimOps) on these types are what you might expect; e.g.,
147 <literal>(+#)</literal> is addition on
148 <literal>Int#</literal>s, and is the machine-addition that we all
149 know and love—usually one instruction.
153 Primitive (unboxed) types cannot be defined in Haskell, and are
154 therefore built into the language and compiler. Primitive types are
155 always unlifted; that is, a value of a primitive type cannot be
156 bottom. We use the convention (but it is only a convention)
157 that primitive types, values, and
158 operations have a <literal>#</literal> suffix (see <xref linkend="magic-hash"/>).
159 For some primitive types we have special syntax for literals, also
160 described in the <link linkend="magic-hash">same section</link>.
164 Primitive values are often represented by a simple bit-pattern, such
165 as <literal>Int#</literal>, <literal>Float#</literal>,
166 <literal>Double#</literal>. But this is not necessarily the case:
167 a primitive value might be represented by a pointer to a
168 heap-allocated object. Examples include
169 <literal>Array#</literal>, the type of primitive arrays. A
170 primitive array is heap-allocated because it is too big a value to fit
171 in a register, and would be too expensive to copy around; in a sense,
172 it is accidental that it is represented by a pointer. If a pointer
173 represents a primitive value, then it really does point to that value:
174 no unevaluated thunks, no indirections…nothing can be at the
175 other end of the pointer than the primitive value.
176 A numerically-intensive program using unboxed types can
177 go a <emphasis>lot</emphasis> faster than its “standard”
178 counterpart—we saw a threefold speedup on one example.
182 There are some restrictions on the use of primitive types:
184 <listitem><para>The main restriction
185 is that you can't pass a primitive value to a polymorphic
186 function or store one in a polymorphic data type. This rules out
187 things like <literal>[Int#]</literal> (i.e. lists of primitive
188 integers). The reason for this restriction is that polymorphic
189 arguments and constructor fields are assumed to be pointers: if an
190 unboxed integer is stored in one of these, the garbage collector would
191 attempt to follow it, leading to unpredictable space leaks. Or a
192 <function>seq</function> operation on the polymorphic component may
193 attempt to dereference the pointer, with disastrous results. Even
194 worse, the unboxed value might be larger than a pointer
195 (<literal>Double#</literal> for instance).
198 <listitem><para> You cannot define a newtype whose representation type
199 (the argument type of the data constructor) is an unboxed type. Thus,
205 <listitem><para> You cannot bind a variable with an unboxed type
206 in a <emphasis>top-level</emphasis> binding.
208 <listitem><para> You cannot bind a variable with an unboxed type
209 in a <emphasis>recursive</emphasis> binding.
211 <listitem><para> You may bind unboxed variables in a (non-recursive,
non-top-level) pattern binding, but any such variable causes the entire
pattern match to become strict. For example:
216 data Foo = Foo Int Int#
218 f x = let (Foo a b, w) = ..rhs.. in ..body..
Since <literal>b</literal> has type <literal>Int#</literal>, the entire pattern
match is strict, and the program behaves as if you had written
224 data Foo = Foo Int Int#
226 f x = case ..rhs.. of { (Foo a b, w) -> ..body.. }
235 <sect2 id="unboxed-tuples">
236 <title>Unboxed Tuples
Unboxed tuples aren't really exported by <literal>GHC.Exts</literal>;
they're available by default with <option>-fglasgow-exts</option>. An
242 unboxed tuple looks like this:
254 where <literal>e_1..e_n</literal> are expressions of any
255 type (primitive or non-primitive). The type of an unboxed tuple looks
260 Unboxed tuples are used for functions that need to return multiple
261 values, but they avoid the heap allocation normally associated with
262 using fully-fledged tuples. When an unboxed tuple is returned, the
263 components are put directly into registers or on the stack; the
264 unboxed tuple itself does not have a composite representation. Many
of the primitive operations listed in <literal>primops.txt.pp</literal> return unboxed tuples.
267 In particular, the <literal>IO</literal> and <literal>ST</literal> monads use unboxed
268 tuples to avoid unnecessary allocation during sequences of operations.
272 There are some pretty stringent restrictions on the use of unboxed tuples:
277 Values of unboxed tuple types are subject to the same restrictions as
278 other unboxed types; i.e. they may not be stored in polymorphic data
279 structures or passed to polymorphic functions.
286 No variable can have an unboxed tuple type, nor may a constructor or function
287 argument have an unboxed tuple type. The following are all illegal:
291 data Foo = Foo (# Int, Int #)
293 f :: (# Int, Int #) -> (# Int, Int #)
296 g :: (# Int, Int #) -> Int
299 h x = let y = (# x,x #) in ...
306 The typical use of unboxed tuples is simply to return multiple values,
307 binding those multiple results with a <literal>case</literal> expression, thus:
309 f x y = (# x+1, y-1 #)
310 g x = case f x x of { (# a, b #) -> a + b }
312 You can have an unboxed tuple in a pattern binding, thus
314 f x = let (# p,q #) = h x in ..body..
316 If the types of <literal>p</literal> and <literal>q</literal> are not unboxed,
317 the resulting binding is lazy like any other Haskell pattern binding. The
318 above example desugars like this:
f x = let t = case h x of { (# p,q #) -> (p,q) }
325 Indeed, the bindings can even be recursive.
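<para>A small self-contained sketch of the typical use, returning two results
and binding them with <literal>case</literal>, might look like this. The
functions <literal>minMax</literal> and <literal>range</literal> are
hypothetical names introduced only for this example, and the extension is
enabled with <option>-XUnboxedTuples</option> rather than the portmanteau
flag.</para>
<programlisting>
{-# LANGUAGE UnboxedTuples #-}
module Main where

-- Minimum and maximum of a non-empty list (given as head and tail),
-- returned as an unboxed tuple so that no pair is allocated.
minMax :: Int -> [Int] -> (# Int, Int #)
minMax x xs = go x x xs
  where
    go :: Int -> Int -> [Int] -> (# Int, Int #)
    go lo hi []     = (# lo, hi #)
    go lo hi (y:ys) = go (min lo y) (max hi y) ys

-- The two results are bound with a case expression.
range :: [Int] -> Int
range []     = 0
range (x:xs) = case minMax x xs of
                 (# lo, hi #) -> hi - lo

main :: IO ()
main = print (range [3, 9, 4, 1, 7])   -- prints 8
</programlisting>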
332 <!-- ====================== SYNTACTIC EXTENSIONS ======================= -->
334 <sect1 id="syntax-extns">
335 <title>Syntactic extensions</title>
337 <sect2 id="magic-hash">
338 <title>The magic hash</title>
339 <para>The language extension <option>-XMagicHash</option> allows "#" as a
340 postfix modifier to identifiers. Thus, "x#" is a valid variable, and "T#" is
341 a valid type constructor or data constructor.</para>
<para>The hash sign does not change semantics at all. We tend to use variable
344 names ending in "#" for unboxed values or types (e.g. <literal>Int#</literal>),
345 but there is no requirement to do so; they are just plain ordinary variables.
346 Nor does the <option>-XMagicHash</option> extension bring anything into scope.
347 For example, to bring <literal>Int#</literal> into scope you must
348 import <literal>GHC.Prim</literal> (see <xref linkend="primitives"/>);
349 the <option>-XMagicHash</option> extension
350 then allows you to <emphasis>refer</emphasis> to the <literal>Int#</literal>
351 that is now in scope.</para>
<para> The <option>-XMagicHash</option> extension also enables some new forms of literals (see <xref linkend="glasgow-unboxed"/>):
354 <listitem><para> <literal>'x'#</literal> has type <literal>Char#</literal></para> </listitem>
355 <listitem><para> <literal>"foo"#</literal> has type <literal>Addr#</literal></para> </listitem>
356 <listitem><para> <literal>3#</literal> has type <literal>Int#</literal>. In general,
357 any Haskell 98 integer lexeme followed by a <literal>#</literal> is an <literal>Int#</literal> literal, e.g.
<literal>-0x3A#</literal> as well as <literal>32#</literal>.</para></listitem>
359 <listitem><para> <literal>3##</literal> has type <literal>Word#</literal>. In general,
360 any non-negative Haskell 98 integer lexeme followed by <literal>##</literal>
361 is a <literal>Word#</literal>. </para> </listitem>
362 <listitem><para> <literal>3.2#</literal> has type <literal>Float#</literal>.</para> </listitem>
363 <listitem><para> <literal>3.2##</literal> has type <literal>Double#</literal></para> </listitem>
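<para>To see these literal forms in action, the sketch below wraps each one in
the corresponding boxed constructor so that it can be printed. It assumes that
the constructors <literal>I#</literal>, <literal>W#</literal>,
<literal>C#</literal>, <literal>F#</literal> and <literal>D#</literal> are
imported from <literal>GHC.Exts</literal>.</para>
<programlisting>
{-# LANGUAGE MagicHash #-}
module Main where

import GHC.Exts (Int (I#), Word (W#), Char (C#), Float (F#), Double (D#))

main :: IO ()
main = do
  print (I# 3#)      -- an Int#    literal, boxed as Int
  print (W# 3##)     -- a  Word#   literal, boxed as Word
  print (C# 'x'#)    -- a  Char#   literal, boxed as Char
  print (F# 3.2#)    -- a  Float#  literal, boxed as Float
  print (D# 3.2##)   -- a  Double# literal, boxed as Double
</programlisting>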
369 <title>New qualified operator syntax</title>
371 <para>A new syntax for referencing qualified operators is
planned to be introduced by Haskell', and is enabled in GHC with
374 the <option>-XNewQualifiedOperators</option><indexterm><primary><option>-XNewQualifiedOperators</option></primary></indexterm>
option. In the new syntax, the prefix form of a qualified operator is
377 written <literal><replaceable>module</replaceable>.(<replaceable>symbol</replaceable>)</literal>
378 (in Haskell 98 this would
379 be <literal>(<replaceable>module</replaceable>.<replaceable>symbol</replaceable>)</literal>),
380 and the infix form is
381 written <literal>`<replaceable>module</replaceable>.(<replaceable>symbol</replaceable>)`</literal>
382 (in Haskell 98 this would
be <literal>`<replaceable>module</replaceable>.<replaceable>symbol</replaceable>`</literal>).
386 add x y = Prelude.(+) x y
387 subtract y = (`Prelude.(-)` y)
389 The new form of qualified operators is intended to regularise
390 the syntax by eliminating odd cases
391 like <literal>Prelude..</literal>. For example,
392 when <literal>NewQualifiedOperators</literal> is on, it is possible to
write the enumerated sequence <literal>[Monday..]</literal>
without spaces, whereas in Haskell 98 this would be a
reference to the operator ‘<literal>.</literal>’
396 from module <literal>Monday</literal>.</para>
398 <para>When <option>-XNewQualifiedOperators</option> is on, the old Haskell
399 98 syntax for qualified operators is not accepted, so this
400 option may cause existing Haskell 98 code to break.</para>
405 <!-- ====================== HIERARCHICAL MODULES ======================= -->
408 <sect2 id="hierarchical-modules">
409 <title>Hierarchical Modules</title>
411 <para>GHC supports a small extension to the syntax of module
412 names: a module name is allowed to contain a dot
413 <literal>‘.’</literal>. This is also known as the
414 “hierarchical module namespace” extension, because
415 it extends the normally flat Haskell module namespace into a
416 more flexible hierarchy of modules.</para>
418 <para>This extension has very little impact on the language
itself; module names are <emphasis>always</emphasis> fully
420 qualified, so you can just think of the fully qualified module
421 name as <quote>the module name</quote>. In particular, this
422 means that the full module name must be given after the
423 <literal>module</literal> keyword at the beginning of the
module; for example, the module <literal>A.B.C</literal> must begin</para>
427 <programlisting>module A.B.C</programlisting>
430 <para>It is a common strategy to use the <literal>as</literal>
431 keyword to save some typing when using qualified names with
432 hierarchical modules. For example:</para>
435 import qualified Control.Monad.ST.Strict as ST
438 <para>For details on how GHC searches for source and interface
439 files in the presence of hierarchical modules, see <xref
440 linkend="search-path"/>.</para>
442 <para>GHC comes with a large collection of libraries arranged
443 hierarchically; see the accompanying <ulink
444 url="../libraries/index.html">library
documentation</ulink>. More libraries to install are available
from <ulink url="http://hackage.haskell.org/packages/hackage.html">HackageDB</ulink>.</para>
450 <!-- ====================== PATTERN GUARDS ======================= -->
452 <sect2 id="pattern-guards">
453 <title>Pattern guards</title>
456 <indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
457 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ulink url="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ulink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
461 Suppose we have an abstract data type of finite maps, with a
465 lookup :: FiniteMap -> Int -> Maybe Int
468 The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
469 where <varname>v</varname> is the value that the key maps to. Now consider the following definition:
473 clunky env var1 var2 | ok1 && ok2 = val1 + val2
474 | otherwise = var1 + var2
485 The auxiliary functions are
489 maybeToBool :: Maybe a -> Bool
490 maybeToBool (Just x) = True
491 maybeToBool Nothing = False
493 expectJust :: Maybe a -> a
494 expectJust (Just x) = x
495 expectJust Nothing = error "Unexpected Nothing"
499 What is <function>clunky</function> doing? The guard <literal>ok1 &&
500 ok2</literal> checks that both lookups succeed, using
501 <function>maybeToBool</function> to convert the <function>Maybe</function>
502 types to booleans. The (lazily evaluated) <function>expectJust</function>
503 calls extract the values from the results of the lookups, and binds the
504 returned values to <varname>val1</varname> and <varname>val2</varname>
505 respectively. If either lookup fails, then clunky takes the
506 <literal>otherwise</literal> case and returns the sum of its arguments.
510 This is certainly legal Haskell, but it is a tremendously verbose and
511 un-obvious way to achieve the desired effect. Arguably, a more direct way
512 to write clunky would be to use case expressions:
clunky env var1 var2 = case lookup env var1 of
  Nothing -> fail
  Just val1 -> case lookup env var2 of
    Nothing -> fail
    Just val2 -> val1 + val2
  where
    fail = var1 + var2
526 This is a bit shorter, but hardly better. Of course, we can rewrite any set
527 of pattern-matching, guarded equations as case expressions; that is
528 precisely what the compiler does when compiling equations! The reason that
Haskell provides guarded equations is that they allow us to write down
530 the cases we want to consider, one at a time, independently of each other.
531 This structure is hidden in the case version. Two of the right-hand sides
532 are really the same (<function>fail</function>), and the whole expression
533 tends to become more and more indented.
537 Here is how I would write clunky:
clunky env var1 var2
  | Just val1 <- lookup env var1
  , Just val2 <- lookup env var2
  = val1 + val2
545 ...other equations for clunky...
549 The semantics should be clear enough. The qualifiers are matched in order.
550 For a <literal><-</literal> qualifier, which I call a pattern guard, the
551 right hand side is evaluated and matched against the pattern on the left.
552 If the match fails then the whole guard fails and the next equation is
553 tried. If it succeeds, then the appropriate binding takes place, and the
554 next qualifier is matched, in the augmented environment. Unlike list
555 comprehensions, however, the type of the expression to the right of the
556 <literal><-</literal> is the same as the type of the pattern to its
557 left. The bindings introduced by pattern guards scope over all the
558 remaining guard qualifiers, and over the right hand side of the equation.
562 Just as with list comprehensions, boolean expressions can be freely mixed
among the pattern guards. For example:
574 Haskell's current guards therefore emerge as a special case, in which the
575 qualifier list has just one element, a boolean expression.
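<para>Putting the pieces together, a runnable sketch of
<function>clunky</function> with pattern guards could look as follows. The
abstract finite map is replaced here by an association list, and
<literal>Env</literal> and <literal>lookupEnv</literal> are stand-in names
invented for this example (they are not part of the proposal). A boolean
qualifier is mixed in among the pattern guards to show that the two forms
combine.</para>
<programlisting>
{-# LANGUAGE PatternGuards #-}
module Main where

-- Stand-in for the abstract finite map: a plain association list.
type Env = [(Int, Int)]

-- Same argument order as the lookup used in the text.
lookupEnv :: Env -> Int -> Maybe Int
lookupEnv env key = lookup key env

clunky :: Env -> Int -> Int -> Int
clunky env var1 var2
  | Just val1 <- lookupEnv env var1   -- pattern guard; binds val1
  , Just val2 <- lookupEnv env var2   -- val1 is already in scope here
  , val1 /= val2                      -- ordinary boolean qualifier
  = val1 + val2
  | otherwise = var1 + var2

main :: IO ()
main = do
  print (clunky [(1, 10), (2, 20)] 1 2)   -- both lookups succeed: 30
  print (clunky [(1, 10)] 1 2)            -- second lookup fails:  3
</programlisting>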
579 <!-- ===================== View patterns =================== -->
581 <sect2 id="view-patterns">
586 View patterns are enabled by the flag <literal>-XViewPatterns</literal>.
587 More information and examples of view patterns can be found on the
588 <ulink url="http://hackage.haskell.org/trac/ghc/wiki/ViewPatterns">Wiki
593 View patterns are somewhat like pattern guards that can be nested inside
594 of other patterns. They are a convenient way of pattern-matching
595 against values of abstract types. For example, in a programming language
596 implementation, we might represent the syntax of the types of the
view :: Typ -> TypeView
607 -- additional operations for constructing Typ's ...
610 The representation of Typ is held abstract, permitting implementations
611 to use a fancy representation (e.g., hash-consing to manage sharing).
Without view patterns, using this signature is a little inconvenient:
615 size :: Typ -> Integer
size t = case view t of
  Unit -> 1
  Arrow t1 t2 -> size t1 + size t2
621 It is necessary to iterate the case, rather than using an equational
622 function definition. And the situation is even worse when the matching
623 against <literal>t</literal> is buried deep inside another pattern.
627 View patterns permit calling the view function inside the pattern and
628 matching against the result:
630 size (view -> Unit) = 1
631 size (view -> Arrow t1 t2) = size t1 + size t2
634 That is, we add a new form of pattern, written
635 <replaceable>expression</replaceable> <literal>-></literal>
636 <replaceable>pattern</replaceable> that means "apply the expression to
637 whatever we're trying to match against, and then match the result of
638 that application against the pattern". The expression can be any Haskell
expression of function type, and view patterns can be used wherever patterns are used.
644 The semantics of a pattern <literal>(</literal>
645 <replaceable>exp</replaceable> <literal>-></literal>
646 <replaceable>pat</replaceable> <literal>)</literal> are as follows:
652 <para>The variables bound by the view pattern are the variables bound by
653 <replaceable>pat</replaceable>.
657 Any variables in <replaceable>exp</replaceable> are bound occurrences,
658 but variables bound "to the left" in a pattern are in scope. This
659 feature permits, for example, one argument to a function to be used in
660 the view of another argument. For example, the function
661 <literal>clunky</literal> from <xref linkend="pattern-guards" /> can be
662 written using view patterns as follows:
665 clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2
666 ...other equations for clunky...
671 More precisely, the scoping rules are:
675 In a single pattern, variables bound by patterns to the left of a view
676 pattern expression are in scope. For example:
678 example :: Maybe ((String -> Integer,Integer), String) -> Bool
example (Just ((f,_), f -> 4)) = True
682 Additionally, in function definitions, variables bound by matching earlier curried
683 arguments may be used in view pattern expressions in later arguments:
685 example :: (String -> Integer) -> String -> Bool
686 example f (f -> 4) = True
688 That is, the scoping is the same as it would be if the curried arguments
689 were collected into a tuple.
695 In mutually recursive bindings, such as <literal>let</literal>,
696 <literal>where</literal>, or the top level, view patterns in one
697 declaration may not mention variables bound by other declarations. That
698 is, each declaration must be self-contained. For example, the following
699 program is not allowed:
706 restriction in the future; the only cost is that type checking patterns
707 would get a little more complicated.)
717 <listitem><para> Typing: If <replaceable>exp</replaceable> has type
718 <replaceable>T1</replaceable> <literal>-></literal>
719 <replaceable>T2</replaceable> and <replaceable>pat</replaceable> matches
720 a <replaceable>T2</replaceable>, then the whole view pattern matches a
721 <replaceable>T1</replaceable>.
724 <listitem><para> Matching: To the equations in Section 3.17.3 of the
725 <ulink url="http://www.haskell.org/onlinereport/">Haskell 98
726 Report</ulink>, add the following:
case v of { (e -> p) -> e1 ; _ -> e2 }
 =
case (e v) of { p -> e1 ; _ -> e2 }
732 That is, to match a variable <replaceable>v</replaceable> against a pattern
733 <literal>(</literal> <replaceable>exp</replaceable>
734 <literal>-></literal> <replaceable>pat</replaceable>
735 <literal>)</literal>, evaluate <literal>(</literal>
736 <replaceable>exp</replaceable> <replaceable> v</replaceable>
737 <literal>)</literal> and match the result against
738 <replaceable>pat</replaceable>.
741 <listitem><para> Efficiency: When the same view function is applied in
742 multiple branches of a function definition or a case expression (e.g.,
743 in <literal>size</literal> above), GHC makes an attempt to collect these
744 applications into a single nested case expression, so that the view
745 function is only applied once. Pattern compilation in GHC follows the
746 matrix algorithm described in Chapter 4 of <ulink
747 url="http://research.microsoft.com/~simonpj/Papers/slpj-book-1987/">The
748 Implementation of Functional Programming Languages</ulink>. When the
749 top rows of the first column of a matrix are all view patterns with the
750 "same" expression, these patterns are transformed into a single nested
751 case. This includes, for example, adjacent view patterns that line up
754 f ((view -> A, p1), p2) = e1
755 f ((view -> B, p3), p4) = e2
759 <para> The current notion of when two view pattern expressions are "the
760 same" is very restricted: it is not even full syntactic equality.
761 However, it does include variables, literals, applications, and tuples;
762 e.g., two instances of <literal>view ("hi", "there")</literal> will be
763 collected. However, the current implementation does not compare up to
764 alpha-equivalence, so two instances of <literal>(x, view x ->
765 y)</literal> will not be coalesced.
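<para>For readers who want to experiment, here is a self-contained sketch of
the <literal>size</literal> function written with view patterns. The concrete
representation of <literal>Typ</literal> (constructors
<literal>TUnit</literal> and <literal>TArrow</literal>) and the name
<literal>TypView</literal> are assumptions made for this example; in real code
the representation would be hidden behind the module boundary.</para>
<programlisting>
{-# LANGUAGE ViewPatterns #-}
module Main where

-- Hypothetical concrete representation; clients are meant to see
-- only the view type and the view function below.
data Typ = TUnit | TArrow Typ Typ

data TypView = Unit | Arrow Typ Typ

view :: Typ -> TypView
view TUnit        = Unit
view (TArrow a b) = Arrow a b

-- Each equation applies view to the argument and matches the result.
size :: Typ -> Integer
size (view -> Unit)        = 1
size (view -> Arrow t1 t2) = size t1 + size t2

main :: IO ()
main = print (size (TArrow TUnit (TArrow TUnit TUnit)))   -- prints 3
</programlisting>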
775 <!-- ===================== Recursive do-notation =================== -->
777 <sect2 id="mdo-notation">
778 <title>The recursive do-notation
781 <para> The recursive do-notation (also known as mdo-notation) is implemented as described in
782 <ulink url="http://citeseer.ist.psu.edu/erk02recursive.html">A recursive do for Haskell</ulink>,
783 by Levent Erkok, John Launchbury,
784 Haskell Workshop 2002, pages: 29-37. Pittsburgh, Pennsylvania.
785 This paper is essential reading for anyone making non-trivial use of mdo-notation,
786 and we do not repeat it here.
789 The do-notation of Haskell does not allow <emphasis>recursive bindings</emphasis>,
790 that is, the variables bound in a do-expression are visible only in the textually following
791 code block. Compare this to a let-expression, where bound variables are visible in the entire binding
792 group. It turns out that several applications can benefit from recursive bindings in
793 the do-notation, and this extension provides the necessary syntactic support.
796 Here is a simple (yet contrived) example:
799 import Control.Monad.Fix
justOnes = mdo xs <- Just (1:xs)
               return xs
As you can guess, <literal>justOnes</literal> will evaluate to <literal>Just [1,1,1,...</literal>.
The Control.Monad.Fix library introduces the <literal>MonadFix</literal> class. Its definition is:
812 class Monad m => MonadFix m where
813 mfix :: (a -> m a) -> m a
816 The function <literal>mfix</literal>
817 dictates how the required recursion operation should be performed. For example,
818 <literal>justOnes</literal> desugars as follows:
justOnes = mfix (\xs' -> do { xs <- Just (1:xs'); return xs })
822 For full details of the way in which mdo is typechecked and desugared, see
823 the paper <ulink url="http://citeseer.ist.psu.edu/erk02recursive.html">A recursive do for Haskell</ulink>.
824 In particular, GHC implements the segmentation technique described in Section 3.2 of the paper.
827 If recursive bindings are required for a monad,
828 then that monad must be declared an instance of the <literal>MonadFix</literal> class.
829 The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO.
830 Furthermore, the Control.Monad.ST and Control.Monad.ST.Lazy modules provide the instances of the MonadFix class
831 for Haskell's internal state monad (strict and lazy, respectively).
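<para>The sketch below puts the <literal>mdo</literal> form and its
<literal>mfix</literal> desugaring side by side in the
<literal>Maybe</literal> monad, so that the two can be compared. The name
<literal>justOnes'</literal> and the type signatures are additions made for
this example.</para>
<programlisting>
{-# LANGUAGE RecursiveDo #-}
module Main where

import Control.Monad.Fix (mfix)

-- Recursive binding: xs is used on the right-hand side of its own binding.
justOnes :: Maybe [Int]
justOnes = mdo xs <- Just (1:xs)
               return xs

-- The same computation written directly with mfix.
justOnes' :: Maybe [Int]
justOnes' = mfix (\xs' -> do { xs <- Just (1:xs'); return xs })

main :: IO ()
main = do
  print (fmap (take 5) justOnes)    -- Just [1,1,1,1,1]
  print (fmap (take 5) justOnes')   -- Just [1,1,1,1,1]
</programlisting>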
834 Here are some important points in using the recursive-do notation:
837 The recursive version of the do-notation uses the keyword <literal>mdo</literal> (rather
838 than <literal>do</literal>).
842 It is enabled with the flag <literal>-XRecursiveDo</literal>, which is in turn implied by
843 <literal>-fglasgow-exts</literal>.
847 Unlike ordinary do-notation, but like <literal>let</literal> and <literal>where</literal> bindings,
848 name shadowing is not allowed; that is, all the names bound in a single <literal>mdo</literal> must
849 be distinct (Section 3.3 of the paper).
853 Variables bound by a <literal>let</literal> statement in an <literal>mdo</literal>
are monomorphic in the <literal>mdo</literal> (Section 3.1 of the paper). However,
855 GHC breaks the <literal>mdo</literal> into segments to enhance polymorphism,
856 and improve termination (Section 3.2 of the paper).
862 The web page: <ulink url="http://www.cse.ogi.edu/PacSoft/projects/rmb/">http://www.cse.ogi.edu/PacSoft/projects/rmb/</ulink>
863 contains up to date information on recursive monadic bindings.
867 Historical note: The old implementation of the mdo-notation (and most
868 of the existing documents) used the name
869 <literal>MonadRec</literal> for the class and the corresponding library.
870 This name is not supported by GHC.
876 <!-- ===================== PARALLEL LIST COMPREHENSIONS =================== -->
878 <sect2 id="parallel-list-comprehensions">
879 <title>Parallel List Comprehensions</title>
880 <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary>
882 <indexterm><primary>parallel list comprehensions</primary>
885 <para>Parallel list comprehensions are a natural extension to list
886 comprehensions. List comprehensions can be thought of as a nice
887 syntax for writing maps and filters. Parallel comprehensions
888 extend this to include the zipWith family.</para>
890 <para>A parallel list comprehension has multiple independent
891 branches of qualifier lists, each separated by a `|' symbol. For
892 example, the following zips together two lists:</para>
895 [ (x, y) | x <- xs | y <- ys ]
898 <para>The behavior of parallel list comprehensions follows that of
899 zip, in that the resulting list will have the same length as the
900 shortest branch.</para>
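<para>A short sketch of this behaviour follows; the names
<literal>pairs</literal>, <literal>viaZip</literal> and
<literal>triples</literal> are invented for the example. The first
comprehension agrees with <literal>zip</literal>, and the last one shows that
a branch may itself contain several qualifiers.</para>
<programlisting>
{-# LANGUAGE ParallelListComp #-}
module Main where

-- Two branches, stepped through in parallel; the shorter one decides
-- the length of the result.
pairs :: [(Int, Char)]
pairs = [ (x, y) | x <- [1, 2, 3] | y <- "ab" ]

-- The same list, written with zip.
viaZip :: [(Int, Char)]
viaZip = zip [1, 2, 3] "ab"

-- The first branch is an ordinary (nested) qualifier list; its results
-- are then zipped with the second branch.
triples :: [(Int, Char, Bool)]
triples = [ (x, y, z) | x <- [1, 2], y <- "ab" | z <- [True, False, True] ]

main :: IO ()
main = do
  print pairs     -- [(1,'a'),(2,'b')]
  print viaZip    -- [(1,'a'),(2,'b')]
  print triples   -- [(1,'a',True),(1,'b',False),(2,'a',True)]
</programlisting>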
902 <para>We can define parallel list comprehensions by translation to
903 regular comprehensions. Here's the basic idea:</para>
905 <para>Given a parallel comprehension of the form: </para>
908 [ e | p1 <- e11, p2 <- e12, ...
909 | q1 <- e21, q2 <- e22, ...
914 <para>This will be translated to: </para>
917 [ e | ((p1,p2), (q1,q2), ...) <- zipN [(p1,p2) | p1 <- e11, p2 <- e12, ...]
918 [(q1,q2) | q1 <- e21, q2 <- e22, ...]
<para>where `zipN' is the appropriate zip for the given number of branches.</para>
928 <!-- ===================== TRANSFORM LIST COMPREHENSIONS =================== -->
930 <sect2 id="generalised-list-comprehensions">
931 <title>Generalised (SQL-Like) List Comprehensions</title>
932 <indexterm><primary>list comprehensions</primary><secondary>generalised</secondary>
934 <indexterm><primary>extended list comprehensions</primary>
936 <indexterm><primary>group</primary></indexterm>
937 <indexterm><primary>sql</primary></indexterm>
940 <para>Generalised list comprehensions are a further enhancement to the
list comprehension syntactic sugar to allow operations such as sorting
942 and grouping which are familiar from SQL. They are fully described in the
943 paper <ulink url="http://research.microsoft.com/~simonpj/papers/list-comp">
944 Comprehensive comprehensions: comprehensions with "order by" and "group by"</ulink>,
945 except that the syntax we use differs slightly from the paper.</para>
946 <para>Here is an example:
948 employees = [ ("Simon", "MS", 80)
949 , ("Erik", "MS", 100)
951 , ("Gordon", "Ed", 45)
952 , ("Paul", "Yale", 60)]
954 output = [ (the dept, sum salary)
955 | (name, dept, salary) <- employees
957 , then sortWith by (sum salary)
960 In this example, the list <literal>output</literal> would take on
964 [("Yale", 60), ("Ed", 85), ("MS", 180)]
967 <para>There are three new keywords: <literal>group</literal>, <literal>by</literal>, and <literal>using</literal>.
968 (The function <literal>sortWith</literal> is not a keyword; it is an ordinary
969 function that is exported by <literal>GHC.Exts</literal>.)</para>
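<para>The following self-contained sketch combines grouping, sorting and truncation in a single
comprehension. The data set <literal>timesheet</literal> and the name <literal>summary</literal>
are hypothetical, and the <literal>TransformListComp</literal> pragma is assumed to be the way
the extension is switched on. The grouping function is spelled out with <literal>using</literal>
so that the example does not depend on any default.</para>

```haskell
{-# LANGUAGE TransformListComp #-}
module Main where

import GHC.Exts (groupWith, sortWith, the)

-- Hypothetical sample data: (name, team, hours).
timesheet :: [(String, String, Int)]
timesheet =
  [ ("Ann", "Ops", 30)
  , ("Bob", "Dev", 50)
  , ("Cat", "Ops", 25)
  , ("Dan", "Dev", 10)
  , ("Eve", "Docs", 40)
  ]

-- Total hours per team, smallest total first, keeping only two teams.
-- After the group statement, team and hours are bound to lists.
summary :: [(String, Int)]
summary =
  [ (the team, sum hours)
  | (_name, team, hours) <- timesheet
  , then group by team using groupWith
  , then sortWith by (sum hours)
  , then take 2
  ]

main :: IO ()
main = print summary  -- [("Docs",40),("Ops",55)]
```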
971 <para>There are five new forms of comprehension qualifier,
972 all introduced by the (existing) keyword <literal>then</literal>:
980 This statement requires that <literal>f</literal> have the type <literal>
forall a. [a] -> [a]</literal>. You can see an example of its use in the
982 motivating example, as this form is used to apply <literal>take 5</literal>.
993 This form is similar to the previous one, but allows you to create a function
994 which will be passed as the first argument to f. As a consequence f must have
995 the type <literal>forall a. (a -> t) -> [a] -> [a]</literal>. As you can see
996 from the type, this function lets f "project out" some information
997 from the elements of the list it is transforming.</para>
999 <para>An example is shown in the opening example, where <literal>sortWith</literal>
1000 is supplied with a function that lets it find out the <literal>sum salary</literal>
1001 for any item in the list comprehension it transforms.</para>
1009 then group by e using f
1012 <para>This is the most general of the grouping-type statements. In this form,
1013 f is required to have type <literal>forall a. (a -> t) -> [a] -> [[a]]</literal>.
1014 As with the <literal>then f by e</literal> case above, the first argument
1015 is a function supplied to f by the compiler which lets it compute e on every
1016 element of the list being transformed. However, unlike the non-grouping case,
1017 f additionally partitions the list into a number of sublists: this means that
1018 at every point after this statement, binders occurring before it in the comprehension
1019 refer to <emphasis>lists</emphasis> of possible values, not single values. To help understand
1020 this, let's look at an example:</para>
1023 -- This works similarly to groupWith in GHC.Exts, but doesn't sort its input first
1024 groupRuns :: Eq b => (a -> b) -> [a] -> [[a]]
1025 groupRuns f = groupBy (\x y -> f x == f y)
1027 output = [ (the x, y)
1028 | x <- ([1..3] ++ [1..2])
1030 , then group by x using groupRuns ]
1033 <para>This results in the variable <literal>output</literal> taking on the value below:</para>
1036 [(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])]
1039 <para>Note that we have used the <literal>the</literal> function to change the type
1040 of x from a list to its original numeric type. The variable y, in contrast, is left
1041 unchanged from the list form introduced by the grouping.</para>
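<para>A complete program in the same spirit is sketched below. It reuses the
<literal>groupRuns</literal> function to compute run lengths of a string. The name
<literal>runLengths</literal> and the <literal>TransformListComp</literal> pragma are
assumptions of this sketch.</para>

```haskell
{-# LANGUAGE TransformListComp #-}
module Main where

import Data.List (groupBy)
import GHC.Exts (the)

-- Groups adjacent elements with equal keys; the input is not sorted first.
groupRuns :: Eq b => (a -> b) -> [a] -> [[a]]
groupRuns f = groupBy (\x y -> f x == f y)

-- After the group statement, c is bound to a list of equal characters,
-- so 'the c' recovers the character and 'length c' counts the run.
runLengths :: String -> [(Char, Int)]
runLengths s =
  [ (the c, length c)
  | c <- s
  , then group by c using groupRuns
  ]

main :: IO ()
main = print (runLengths "aabccca")  -- [('a',2),('b',1),('c',3),('a',1)]
```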
1051 <para>This form of grouping is essentially the same as the one described above. However,
1052 since no function to use for the grouping has been supplied it will fall back on the
1053 <literal>groupWith</literal> function defined in
1054 <ulink url="../libraries/base/GHC-Exts.html"><literal>GHC.Exts</literal></ulink>. This
1055 is the form of the group statement that we made use of in the opening example.</para>
1066 <para>With this form of the group statement, f is required to simply have the type
1067 <literal>forall a. [a] -> [[a]]</literal>, which will be used to group up the
1068 comprehension so far directly. An example of this form is as follows:</para>
1074 , then group using inits]
1077 <para>This will yield a list containing every prefix of the word "hello" written out 5 times:</para>
1080 ["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...]
1088 <!-- ===================== REBINDABLE SYNTAX =================== -->
1090 <sect2 id="rebindable-syntax">
1091 <title>Rebindable syntax and the implicit Prelude import</title>
1093 <para><indexterm><primary>-XNoImplicitPrelude
1094 option</primary></indexterm> GHC normally imports
1095 <filename>Prelude.hi</filename> files for you. If you'd
1096 rather it didn't, then give it a
1097 <option>-XNoImplicitPrelude</option> option. The idea is
1098 that you can then import a Prelude of your own. (But don't
1099 call it <literal>Prelude</literal>; the Haskell module
1100 namespace is flat, and you must not conflict with any
1101 Prelude module.)</para>
1103 <para>Suppose you are importing a Prelude of your own
1104 in order to define your own numeric class
1105 hierarchy. It completely defeats that purpose if the
1106 literal "1" means "<literal>Prelude.fromInteger
1107 1</literal>", which is what the Haskell Report specifies.
1108 So the <option>-XNoImplicitPrelude</option>
1109 flag <emphasis>also</emphasis> causes
1110 the following pieces of built-in syntax to refer to
1111 <emphasis>whatever is in scope</emphasis>, not the Prelude
1115 <para>An integer literal <literal>368</literal> means
1116 "<literal>fromInteger (368::Integer)</literal>", rather than
1117 "<literal>Prelude.fromInteger (368::Integer)</literal>".
<listitem><para>Fractional literals are handled in just the same way,
1121 except that the translation is
1122 <literal>fromRational (3.68::Rational)</literal>.
1125 <listitem><para>The equality test in an overloaded numeric pattern
1126 uses whatever <literal>(==)</literal> is in scope.
1129 <listitem><para>The subtraction operation, and the
1130 greater-than-or-equal test, in <literal>n+k</literal> patterns
1131 use whatever <literal>(-)</literal> and <literal>(>=)</literal> are in scope.
1135 <para>Negation (e.g. "<literal>- (f x)</literal>")
1136 means "<literal>negate (f x)</literal>", both in numeric
1137 patterns, and expressions.
1141 <para>"Do" notation is translated using whatever
1142 functions <literal>(>>=)</literal>,
1143 <literal>(>>)</literal>, and <literal>fail</literal>,
1144 are in scope (not the Prelude
1145 versions). List comprehensions, mdo (<xref linkend="mdo-notation"/>), and parallel array
1146 comprehensions, are unaffected. </para></listitem>
1150 notation (see <xref linkend="arrow-notation"/>)
1151 uses whatever <literal>arr</literal>,
1152 <literal>(>>>)</literal>, <literal>first</literal>,
1153 <literal>app</literal>, <literal>(|||)</literal> and
1154 <literal>loop</literal> functions are in scope. But unlike the
1155 other constructs, the types of these functions must match the
1156 Prelude types very closely. Details are in flux; if you want
1160 In all cases (apart from arrow notation), the static semantics should be that of the desugared form,
1161 even if that is a little unexpected. For example, the
1162 static semantics of the literal <literal>368</literal>
1163 is exactly that of <literal>fromInteger (368::Integer)</literal>; it's fine for
1164 <literal>fromInteger</literal> to have any of the types:
1166 fromInteger :: Integer -> Integer
1167 fromInteger :: forall a. Foo a => Integer -> a
1168 fromInteger :: Num a => a -> Integer
1169 fromInteger :: Integer -> Bool -> Bool
1173 <para>Be warned: this is an experimental facility, with
1174 fewer checks than usual. Use <literal>-dcore-lint</literal>
1175 to typecheck the desugared program. If Core Lint is happy
1176 you should be all right.</para>
1180 <sect2 id="postfix-operators">
1181 <title>Postfix operators</title>
1184 GHC allows a small extension to the syntax of left operator sections, which
1185 allows you to define postfix operators. The extension is this: the left section
1189 is equivalent (from the point of view of both type checking and execution) to the expression
(for any expression <literal>e</literal> and operator <literal>(!)</literal>).
1194 The strict Haskell 98 interpretation is that the section is equivalent to
1198 That is, the operator must be a function of two arguments. GHC allows it to
1199 take only one argument, and that in turn allows you to write the function
1202 <para>Since this extension goes beyond Haskell 98, it should really be enabled
1203 by a flag; but in fact it is enabled all the time. (No Haskell 98 programs
1204 change their behaviour, of course.)
1206 <para>The extension does not extend to the left-hand side of function
1207 definitions; you must define such a function in prefix form.</para>
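<para>A minimal sketch of a postfix factorial operator follows. The operator is defined in
prefix form, as required, and used in a left section. The <literal>PostfixOperators</literal>
pragma is an assumption: the text above describes a compiler in which the behaviour is always
on, whereas GHC releases that place it behind a flag would need the pragma.</para>

```haskell
{-# LANGUAGE PostfixOperators #-}
module Main where

-- A one-argument operator, defined in prefix form.
(!) :: Integer -> Integer
(!) n = product [1 .. n]

main :: IO ()
main = print (5 !)  -- 120
```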
1211 <sect2 id="disambiguate-fields">
1212 <title>Record field disambiguation</title>
1214 In record construction and record pattern matching
1215 it is entirely unambiguous which field is referred to, even if there are two different
1216 data types in scope with a common field name. For example:
1219 data S = MkS { x :: Int, y :: Bool }
1224 data T = MkT { x :: Int }
1226 ok1 (MkS { x = n }) = n+1 -- Unambiguous
1228 ok2 n = MkT { x = n+1 } -- Unambiguous
1230 bad1 k = k { x = 3 } -- Ambiguous
1231 bad2 k = x k -- Ambiguous
1233 Even though there are two <literal>x</literal>'s in scope,
1234 it is clear that the <literal>x</literal> in the pattern in the
1235 definition of <literal>ok1</literal> can only mean the field
1236 <literal>x</literal> from type <literal>S</literal>. Similarly for
1237 the function <literal>ok2</literal>. However, in the record update
1238 in <literal>bad1</literal> and the record selection in <literal>bad2</literal>
1239 it is not clear which of the two types is intended.
1242 Haskell 98 regards all four as ambiguous, but with the
1243 <option>-fdisambiguate-record-fields</option> flag, GHC will accept
1244 the former two. The rules are precisely the same as those for instance
1245 declarations in Haskell 98, where the method names on the left-hand side
1246 of the method bindings in an instance declaration refer unambiguously
1247 to the method of that class (provided they are in scope at all), even
1248 if there are other variables in scope with the same name.
1249 This reduces the clutter of qualified names when you import two
1250 records from different modules that use the same field name.
1254 <!-- ===================== Record puns =================== -->
1256 <sect2 id="record-puns">
1261 Record puns are enabled by the flag <literal>-XNamedFieldPuns</literal>.
1265 When using records, it is common to write a pattern that binds a
1266 variable with the same name as a record field, such as:
1269 data C = C {a :: Int}
1275 Record punning permits the variable name to be elided, so one can simply
1282 to mean the same pattern as above. That is, in a record pattern, the
1283 pattern <literal>a</literal> expands into the pattern <literal>a =
1284 a</literal> for the same name <literal>a</literal>.
1288 Note that puns and other patterns can be mixed in the same record:
1290 data C = C {a :: Int, b :: Int}
1291 f (C {a, b = 4}) = a
1293 and that puns can be used wherever record patterns occur (e.g. in
1294 <literal>let</literal> bindings or at the top-level).
1298 Record punning can also be used in an expression, writing, for example,
1304 let a = 1 in C {a = a}
1307 Note that this expansion is purely syntactic, so the record pun
1308 expression refers to the nearest enclosing variable that is spelled the
1309 same as the field name.
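<para>The sketch below uses puns in both a pattern and an expression. The function names
<literal>sumFields</literal> and <literal>mk</literal> are hypothetical.</para>

```haskell
{-# LANGUAGE NamedFieldPuns #-}
module Main where

data C = C { a :: Int, b :: Int }

-- Pattern pun: C {a, b} means C {a = a, b = b}.
sumFields :: C -> Int
sumFields (C {a, b}) = a + b

-- Expression pun: the fields are filled from the let-bound a and b.
mk :: Int -> C
mk n = let a = n; b = n * 2 in C {a, b}

main :: IO ()
main = print (sumFields (mk 3))  -- 9
```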
1314 <!-- ===================== Record wildcards =================== -->
1316 <sect2 id="record-wildcards">
1317 <title>Record wildcards
1321 Record wildcards are enabled by the flag <literal>-XRecordWildCards</literal>.
1325 For records with many fields, it can be tiresome to write out each field
1326 individually in a record pattern, as in
1328 data C = C {a :: Int, b :: Int, c :: Int, d :: Int}
1329 f (C {a = 1, b = b, c = c, d = d}) = b + c + d
1334 Record wildcard syntax permits a (<literal>..</literal>) in a record
1335 pattern, where each elided field <literal>f</literal> is replaced by the
1336 pattern <literal>f = f</literal>. For example, the above pattern can be
1339 f (C {a = 1, ..}) = b + c + d
1344 Note that wildcards can be mixed with other patterns, including puns
1345 (<xref linkend="record-puns"/>); for example, in a pattern <literal>C {a
= 1, b, ..}</literal>. Additionally, record wildcards can be used
1347 wherever record patterns occur, including in <literal>let</literal>
1348 bindings and at the top-level. For example, the top-level binding
1352 defines <literal>b</literal>, <literal>c</literal>, and
1353 <literal>d</literal>.
1357 Record wildcards can also be used in expressions, writing, for example,
1360 let {a = 1; b = 2; c = 3; d = 4} in C {..}
1366 let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d}
1369 Note that this expansion is purely syntactic, so the record wildcard
1370 expression refers to the nearest enclosing variables that are spelled
1371 the same as the omitted field names.
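<para>Putting the pattern and expression forms together gives the following sketch. The names
<literal>sumRest</literal> and <literal>example</literal> are hypothetical, and the fallback
equation is added only to make the function total.</para>

```haskell
{-# LANGUAGE RecordWildCards #-}
module Main where

data C = C { a :: Int, b :: Int, c :: Int, d :: Int }

-- The wildcard binds b, c and d; a must match the literal 1.
sumRest :: C -> Int
sumRest (C {a = 1, ..}) = b + c + d
sumRest _               = 0

-- In an expression, the wildcard picks up the let-bound a, b, c and d.
example :: C
example = let {a = 1; b = 2; c = 3; d = 4} in C {..}

main :: IO ()
main = print (sumRest example)  -- 9
```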
1376 <!-- ===================== Local fixity declarations =================== -->
1378 <sect2 id="local-fixity-declarations">
1379 <title>Local Fixity Declarations
1382 <para>A careful reading of the Haskell 98 Report reveals that fixity
1383 declarations (<literal>infix</literal>, <literal>infixl</literal>, and
1384 <literal>infixr</literal>) are permitted to appear inside local bindings
such as those introduced by <literal>let</literal> and
1386 <literal>where</literal>. However, the Haskell Report does not specify
1387 the semantics of such bindings very precisely.
1390 <para>In GHC, a fixity declaration may accompany a local binding:
1397 and the fixity declaration applies wherever the binding is in scope.
1398 For example, in a <literal>let</literal>, it applies in the right-hand
1399 sides of other <literal>let</literal>-bindings and the body of the
<literal>let</literal>. Or, in recursive <literal>do</literal>
1401 expressions (<xref linkend="mdo-notation"/>), the local fixity
1402 declarations of a <literal>let</literal> statement scope over other
1403 statements in the group, just as the bound name does.
Moreover, a local fixity declaration <emphasis>must</emphasis> accompany a local binding of
that name: it is not possible to revise the fixity of a name bound
1411 let infixr 9 $ in ...
1414 Because local fixity declarations are technically Haskell 98, no flag is
1415 necessary to enable them.
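<para>To see a local fixity declaration take effect, the sketch below defines the same
hypothetical operator <literal>+++</literal> twice, once with a local
<literal>infixr</literal> declaration and once with the default fixity
(<literal>infixl 9</literal>).</para>

```haskell
module Main where

-- With the local declaration, 1 +++ 2 +++ 3 parses as 1 +++ (2 +++ 3).
rightNested :: Int
rightNested =
  let infixr 9 +++
      x +++ y = x * 10 + y
  in 1 +++ 2 +++ 3

-- Without it, the default fixity applies: (1 +++ 2) +++ 3.
leftNested :: Int
leftNested =
  let x +++ y = x * 10 + y
  in 1 +++ 2 +++ 3

main :: IO ()
main = print (rightNested, leftNested)  -- (33,123)
```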
1419 <sect2 id="package-imports">
1420 <title>Package-qualified imports</title>
1422 <para>With the <option>-XPackageImports</option> flag, GHC allows
1423 import declarations to be qualified by the package name that the
1424 module is intended to be imported from. For example:</para>
1427 import "network" Network.Socket
1430 <para>would import the module <literal>Network.Socket</literal> from
1431 the package <literal>network</literal> (any version). This may
1432 be used to disambiguate an import when the same module is
1433 available from multiple packages, or is present in both the
1434 current package being built and an external package.</para>
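<para>A self-contained sketch that should compile wherever the <literal>base</literal> package
is available is shown below. The choice of <literal>Data.List</literal> is only for
illustration, and the extension is assumed to be enabled through a
<literal>LANGUAGE</literal> pragma rather than on the command line.</para>

```haskell
{-# LANGUAGE PackageImports #-}
module Main where

-- The string literal names the package the module must come from.
import "base" Data.List (sort)

main :: IO ()
main = print (sort [3, 1, 2 :: Int])  -- [1,2,3]
```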
<para>Note: you probably don't need to use this feature; it was
1437 added mainly so that we can build backwards-compatible versions of
1438 packages when APIs change. It can lead to fragile dependencies in
1439 the common case: modules occasionally move from one package to
1440 another, rendering any package-qualified imports broken.</para>
1443 <sect2 id="syntax-stolen">
1444 <title>Summary of stolen syntax</title>
1446 <para>Turning on an option that enables special syntax
1447 <emphasis>might</emphasis> cause working Haskell 98 code to fail
1448 to compile, perhaps because it uses a variable name which has
1449 become a reserved word. This section lists the syntax that is
1450 "stolen" by language extensions.
1452 notation and nonterminal names from the Haskell 98 lexical syntax
1453 (see the Haskell 98 Report).
1454 We only list syntax changes here that might affect
1455 existing working programs (i.e. "stolen" syntax). Many of these
1456 extensions will also enable new context-free syntax, but in all
1457 cases programs written to use the new syntax would not be
1458 compilable without the option enabled.</para>
1460 <para>There are two classes of special
1465 <para>New reserved words and symbols: character sequences
1466 which are no longer available for use as identifiers in the
1470 <para>Other special syntax: sequences of characters that have
1471 a different meaning when this particular option is turned
1476 The following syntax is stolen:
1481 <literal>forall</literal>
1482 <indexterm><primary><literal>forall</literal></primary></indexterm>
1485 Stolen (in types) by: <option>-XScopedTypeVariables</option>,
1486 <option>-XLiberalTypeSynonyms</option>,
1487 <option>-XRank2Types</option>,
1488 <option>-XRankNTypes</option>,
1489 <option>-XPolymorphicComponents</option>,
1490 <option>-XExistentialQuantification</option>
1496 <literal>mdo</literal>
1497 <indexterm><primary><literal>mdo</literal></primary></indexterm>
1500 Stolen by: <option>-XRecursiveDo</option>,
1506 <literal>foreign</literal>
1507 <indexterm><primary><literal>foreign</literal></primary></indexterm>
1510 Stolen by: <option>-XForeignFunctionInterface</option>,
1516 <literal>rec</literal>,
1517 <literal>proc</literal>, <literal>-<</literal>,
1518 <literal>>-</literal>, <literal>-<<</literal>,
1519 <literal>>>-</literal>, and <literal>(|</literal>,
1520 <literal>|)</literal> brackets
1521 <indexterm><primary><literal>proc</literal></primary></indexterm>
1524 Stolen by: <option>-XArrows</option>,
1530 <literal>?<replaceable>varid</replaceable></literal>,
1531 <literal>%<replaceable>varid</replaceable></literal>
1532 <indexterm><primary>implicit parameters</primary></indexterm>
1535 Stolen by: <option>-XImplicitParams</option>,
1541 <literal>[|</literal>,
1542 <literal>[e|</literal>, <literal>[p|</literal>,
1543 <literal>[d|</literal>, <literal>[t|</literal>,
1544 <literal>$(</literal>,
1545 <literal>$<replaceable>varid</replaceable></literal>
1546 <indexterm><primary>Template Haskell</primary></indexterm>
1549 Stolen by: <option>-XTemplateHaskell</option>,
1555 <literal>[:<replaceable>varid</replaceable>|</literal>
1556 <indexterm><primary>quasi-quotation</primary></indexterm>
1559 Stolen by: <option>-XQuasiQuotes</option>,
1565 <replaceable>varid</replaceable>{<literal>#</literal>},
1566 <replaceable>char</replaceable><literal>#</literal>,
1567 <replaceable>string</replaceable><literal>#</literal>,
1568 <replaceable>integer</replaceable><literal>#</literal>,
1569 <replaceable>float</replaceable><literal>#</literal>,
1570 <replaceable>float</replaceable><literal>##</literal>,
1571 <literal>(#</literal>, <literal>#)</literal>,
1574 Stolen by: <option>-XMagicHash</option>,
1583 <!-- TYPE SYSTEM EXTENSIONS -->
1584 <sect1 id="data-type-extensions">
1585 <title>Extensions to data types and type synonyms</title>
1587 <sect2 id="nullary-types">
1588 <title>Data types with no constructors</title>
1590 <para>With the <option>-fglasgow-exts</option> flag, GHC lets you declare
1591 a data type with no constructors. For example:</para>
1595 data T a -- T :: * -> *
1598 <para>Syntactically, the declaration lacks the "= constrs" part. The
1599 type can be parameterised over types of any kind, but if the kind is
1600 not <literal>*</literal> then an explicit kind annotation must be used
1601 (see <xref linkend="kinding"/>).</para>
1603 <para>Such data types have only one value, namely bottom.
1604 Nevertheless, they can be useful when defining "phantom types".</para>
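<para>One possible use as phantom types is sketched below. All names
(<literal>Metres</literal>, <literal>Feet</literal>, <literal>Distance</literal> and the helper
functions) are hypothetical, and the <literal>EmptyDataDecls</literal> pragma is an assumption
standing in for the flag mentioned above.</para>

```haskell
{-# LANGUAGE EmptyDataDecls #-}
module Main where

-- Constructor-less types used only as compile-time tags.
data Metres
data Feet

-- The 'unit' parameter does not appear on the right-hand side.
newtype Distance unit = Distance Double

metres :: Double -> Distance Metres
metres = Distance

feet :: Double -> Distance Feet
feet = Distance

-- Both arguments must carry the same tag.
addDistance :: Distance u -> Distance u -> Distance u
addDistance (Distance x) (Distance y) = Distance (x + y)

showMetres :: Distance Metres -> String
showMetres (Distance x) = show x ++ " m"

main :: IO ()
main = putStrLn (showMetres (addDistance (metres 1.5) (metres 2)))  -- 3.5 m
```

<para>An expression such as <literal>addDistance (metres 1) (feet 3)</literal> would be
rejected by the type checker, because the two tags differ.</para>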
1607 <sect2 id="infix-tycons">
1608 <title>Infix type constructors, classes, and type variables</title>
1611 GHC allows type constructors, classes, and type variables to be operators, and
1612 to be written infix, very much like expressions. More specifically:
1615 A type constructor or class can be an operator, beginning with a colon; e.g. <literal>:*:</literal>.
1616 The lexical syntax is the same as that for data constructors.
1619 Data type and type-synonym declarations can be written infix, parenthesised
1620 if you want further arguments. E.g.
1622 data a :*: b = Foo a b
1623 type a :+: b = Either a b
1624 class a :=: b where ...
1626 data (a :**: b) x = Baz a b x
1627 type (a :++: b) y = Either (a,b) y
1631 Types, and class constraints, can be written infix. For example
1634 f :: (a :=: b) => a -> b
1638 A type variable can be an (unqualified) operator e.g. <literal>+</literal>.
1639 The lexical syntax is the same as that for variable operators, excluding "(.)",
1640 "(!)", and "(*)". In a binding position, the operator must be
1641 parenthesised. For example:
1643 type T (+) = Int + Int
1647 liftA2 :: Arrow (~>)
1648 => (a -> b -> c) -> (e ~> a) -> (e ~> b) -> (e ~> c)
1654 as for expressions, both for type constructors and type variables; e.g. <literal>Int `Either` Bool</literal>, or
1655 <literal>Int `a` Bool</literal>. Similarly, parentheses work the same; e.g. <literal>(:*:) Int Bool</literal>.
1658 Fixities may be declared for type constructors, or classes, just as for data constructors. However,
1659 one cannot distinguish between the two in a fixity declaration; a fixity declaration
1660 sets the fixity for a data constructor and the corresponding type constructor. For example:
1664 sets the fixity for both type constructor <literal>T</literal> and data constructor <literal>T</literal>,
1665 and similarly for <literal>:*:</literal>.
1669 Function arrow is <literal>infixr</literal> with fixity 0. (This might change; I'm not sure what it should be.)
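<para>The sketch below exercises infix type constructors, an infix type synonym, and a fixity
declaration shared by a type constructor and its data constructor. The
<literal>TypeOperators</literal> pragma and all function names are assumptions made for this
example.</para>

```haskell
{-# LANGUAGE TypeOperators #-}
module Main where

-- One declaration sets the fixity of both the type and the data constructor.
infixr 5 :*:
data a :*: b = a :*: b

type a :+: b = Either a b

swapPair :: (a :*: b) -> (b :*: a)
swapPair (x :*: y) = y :*: x

describe :: (Int :+: Bool) -> String
describe (Left n)     = "number " ++ show n
describe (Right flag) = "flag " ++ show flag

-- Because of infixr, this type reads as Int :*: (Bool :*: Char).
firstOfThree :: (Int :*: Bool :*: Char) -> Int
firstOfThree (n :*: _) = n

main :: IO ()
main = do
  putStrLn (describe (Left 3))                            -- number 3
  print (firstOfThree (swapPair ((True :*: 'x') :*: 1)))  -- 1
```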
1676 <sect2 id="type-synonyms">
1677 <title>Liberalised type synonyms</title>
1680 Type synonyms are like macros at the type level, but Haskell 98 imposes many rules
1681 on individual synonym declarations.
1682 With the <option>-XLiberalTypeSynonyms</option> extension,
1683 GHC does validity checking on types <emphasis>only after expanding type synonyms</emphasis>.
1684 That means that GHC can be very much more liberal about type synonyms than Haskell 98.
1687 <listitem> <para>You can write a <literal>forall</literal> (including overloading)
1688 in a type synonym, thus:
1690 type Discard a = forall b. Show b => a -> b -> (a, String)
1695 g :: Discard Int -> (Int,String) -- A rank-2 type
1702 If you also use <option>-XUnboxedTuples</option>,
1703 you can write an unboxed tuple in a type synonym:
1705 type Pr = (# Int, Int #)
1713 You can apply a type synonym to a forall type:
1715 type Foo a = a -> a -> Bool
1717 f :: Foo (forall b. b->b)
1719 After expanding the synonym, <literal>f</literal> has the legal (in GHC) type:
1721 f :: (forall b. b->b) -> (forall b. b->b) -> Bool
1726 You can apply a type synonym to a partially applied type synonym:
1728 type Generic i o = forall x. i x -> o x
1731 foo :: Generic Id []
1733 After expanding the synonym, <literal>foo</literal> has the legal (in GHC) type:
1735 foo :: forall x. x -> [x]
1743 GHC currently does kind checking before expanding synonyms (though even that
1747 After expanding type synonyms, GHC does validity checking on types, looking for
1748 the following mal-formedness which isn't detected simply by kind checking:
1751 Type constructor applied to a type involving for-alls.
1754 Unboxed tuple on left of an arrow.
1757 Partially-applied type synonym.
1761 this will be rejected:
1763 type Pr = (# Int, Int #)
1768 because GHC does not allow unboxed tuples on the left of a function arrow.
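<para>Two of the liberalisations can be tried out with the sketch below: a
<literal>forall</literal> inside a synonym, and a synonym applied to a partially applied
synonym. The function names are hypothetical, and the <literal>RankNTypes</literal> pragma is
an assumption needed for the explicit <literal>forall</literal>.</para>

```haskell
{-# LANGUAGE RankNTypes, LiberalTypeSynonyms #-}
module Main where

type Discard a = forall b. Show b => a -> b -> (a, String)

keepFirst :: Discard a
keepFirst x y = (x, show y)

-- A rank-2 type: the argument must itself be polymorphic in b.
useAtInt :: Discard Int -> (Int, String)
useAtInt d = d 3 True

type Generic i o = forall x. i x -> o x
type Id x = x

-- Id is partially applied here; after expansion the type is
-- forall x. x -> [x].
wrap :: Generic Id []
wrap x = [x]

main :: IO ()
main = do
  print (useAtInt keepFirst)  -- (3,"True")
  print (wrap 'q')            -- "q"
```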
1773 <sect2 id="existential-quantification">
1774 <title>Existentially quantified data constructors
1778 The idea of using existential quantification in data type declarations
1779 was suggested by Perry, and implemented in Hope+ (Nigel Perry, <emphasis>The Implementation
1780 of Practical Functional Programming Languages</emphasis>, PhD Thesis, University of
1781 London, 1991). It was later formalised by Laufer and Odersky
1782 (<emphasis>Polymorphic type inference and abstract data types</emphasis>,
1783 TOPLAS, 16(5), pp1411-1430, 1994).
1784 It's been in Lennart
1785 Augustsson's <command>hbc</command> Haskell compiler for several years, and
1786 proved very useful. Here's the idea. Consider the declaration:
1792 data Foo = forall a. MkFoo a (a -> Bool)
1799 The data type <literal>Foo</literal> has two constructors with types:
1805 MkFoo :: forall a. a -> (a -> Bool) -> Foo
1812 Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function>
1813 does not appear in the data type itself, which is plain <literal>Foo</literal>.
1814 For example, the following expression is fine:
1820 [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
1826 Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function
1827 <function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c'
1828 isUpper</function> packages a character with a compatible function. These
1829 two things are each of type <literal>Foo</literal> and can be put in a list.
What can we do with a value of type <literal>Foo</literal>? In particular,
1834 what happens when we pattern-match on <function>MkFoo</function>?
1840 f (MkFoo val fn) = ???
1846 Since all we know about <literal>val</literal> and <function>fn</function> is that they
1847 are compatible, the only (useful) thing we can do with them is to
1848 apply <function>fn</function> to <literal>val</literal> to get a boolean. For example:
1855 f (MkFoo val fn) = fn val
1861 What this allows us to do is to package heterogeneous values
1862 together with a bunch of functions that manipulate them, and then treat
1863 that collection of packages in a uniform manner. You can express
1864 quite a bit of object-oriented-like programming this way.
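<para>The following sketch shows the uniform treatment of such packages. It uses a
single-constructor variant of <literal>Foo</literal>, and the function name
<literal>apply</literal> and the <literal>ExistentialQuantification</literal> pragma are
assumptions of the example.</para>

```haskell
{-# LANGUAGE ExistentialQuantification #-}
module Main where

import Data.Char (isUpper)

-- Each value hides its own type a together with a test on that type.
data Foo = forall a. MkFoo a (a -> Bool)

-- The only useful thing to do with the contents: apply the function.
apply :: Foo -> Bool
apply (MkFoo val fn) = fn val

main :: IO ()
main = print (map apply [MkFoo (3 :: Int) even, MkFoo 'C' isUpper])
-- [False,True]
```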
1867 <sect3 id="existential">
1868 <title>Why existential?
1872 What has this to do with <emphasis>existential</emphasis> quantification?
1873 Simply that <function>MkFoo</function> has the (nearly) isomorphic type
1879 MkFoo :: (exists a . (a, a -> Bool)) -> Foo
1885 But Haskell programmers can safely think of the ordinary
1886 <emphasis>universally</emphasis> quantified type given above, thereby avoiding
1887 adding a new existential quantification construct.
1892 <sect3 id="existential-with-context">
1893 <title>Existentials and type classes</title>
1896 An easy extension is to allow
1897 arbitrary contexts before the constructor. For example:
1903 data Baz = forall a. Eq a => Baz1 a a
1904 | forall b. Show b => Baz2 b (b -> b)
1910 The two constructors have the types you'd expect:
1916 Baz1 :: forall a. Eq a => a -> a -> Baz
1917 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
1923 But when pattern matching on <function>Baz1</function> the matched values can be compared
1924 for equality, and when pattern matching on <function>Baz2</function> the first matched
1925 value can be converted to a string (as well as applying the function to it).
1926 So this program is legal:
1933 f (Baz1 p q) | p == q = "Yes"
1935 f (Baz2 v fn) = show (fn v)
1941 Operationally, in a dictionary-passing implementation, the
1942 constructors <function>Baz1</function> and <function>Baz2</function> must store the
1943 dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and
1944 extract it on pattern matching.
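<para>A runnable sketch built around the <literal>Baz</literal> declaration is given below.
The fallback guard and the sample values in <literal>main</literal> are additions made for
this example.</para>

```haskell
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Baz = forall a. Eq a => Baz1 a a
         | forall b. Show b => Baz2 b (b -> b)

-- Matching on Baz1 brings Eq a into scope; matching on Baz2 brings Show b.
f :: Baz -> String
f (Baz1 p q) | p == q    = "Yes"
             | otherwise = "No"
f (Baz2 v fn)            = show (fn v)

main :: IO ()
main = mapM_ (putStrLn . f)
  [ Baz1 'x' 'x'           -- Yes
  , Baz1 (1 :: Int) 2      -- No
  , Baz2 (20 :: Int) (+ 1) -- 21
  ]
```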
1949 <sect3 id="existential-records">
1950 <title>Record Constructors</title>
GHC allows existentials to be used with record syntax as well. For example:
1956 data Counter a = forall self. NewCounter
1958 , _inc :: self -> self
1959 , _display :: self -> IO ()
1963 Here <literal>tag</literal> is a public field, with a well-typed selector
1964 function <literal>tag :: Counter a -> a</literal>. The <literal>self</literal>
1965 type is hidden from the outside; any attempt to apply <literal>_this</literal>,
1966 <literal>_inc</literal> or <literal>_display</literal> as functions will raise a
1967 compile-time error. In other words, <emphasis>GHC defines a record selector function
1968 only for fields whose type does not mention the existentially-quantified variables</emphasis>.
1969 (This example used an underscore in the fields for which record selectors
1970 will not be defined, but that is only programming style; GHC ignores them.)
1974 To make use of these hidden fields, we need to create some helper functions:
1977 inc :: Counter a -> Counter a
1978 inc (NewCounter x i d t) = NewCounter
1979 { _this = i x, _inc = i, _display = d, tag = t }
1981 display :: Counter a -> IO ()
1982 display NewCounter{ _this = x, _display = d } = d x
1985 Now we can define counters with different underlying implementations:
1988 counterA :: Counter String
1989 counterA = NewCounter
1990 { _this = 0, _inc = (1+), _display = print, tag = "A" }
1992 counterB :: Counter String
1993 counterB = NewCounter
1994 { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" }
1997 display (inc counterA) -- prints "1"
1998 display (inc (inc counterB)) -- prints "##"
2001 At the moment, record update syntax is only supported for Haskell 98 data types,
2002 so the following function does <emphasis>not</emphasis> work:
2005 -- This is invalid; use explicit NewCounter instead for now
2006 setTag :: Counter a -> a -> Counter a
2007 setTag obj t = obj{ tag = t }
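<para>One way to write <literal>setTag</literal> with an explicit
<literal>NewCounter</literal> is sketched below. The field order in the data declaration is
inferred from the positional pattern used by <literal>inc</literal>, and the
<literal>ExistentialQuantification</literal> pragma is an assumption of the sketch.</para>

```haskell
{-# LANGUAGE ExistentialQuantification #-}
module Main where

data Counter a = forall self. NewCounter
  { _this    :: self
  , _inc     :: self -> self
  , _display :: self -> IO ()
  , tag      :: a
  }

inc :: Counter a -> Counter a
inc (NewCounter x i d t) =
  NewCounter { _this = i x, _inc = i, _display = d, tag = t }

display :: Counter a -> IO ()
display NewCounter{ _this = x, _display = d } = d x

-- Rebuild the record by hand instead of using record update syntax.
setTag :: Counter a -> a -> Counter a
setTag (NewCounter x i d _) t =
  NewCounter { _this = x, _inc = i, _display = d, tag = t }

counterA :: Counter String
counterA = NewCounter
  { _this = 0 :: Int, _inc = (1 +), _display = print, tag = "A" }

main :: IO ()
main = do
  let renamed = setTag (inc counterA) "renamed"
  putStrLn (tag renamed)  -- renamed
  display renamed         -- 1
```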
2016 <title>Restrictions</title>
2019 There are several restrictions on the ways in which existentially-quantified
constructors can be used.
2029 When pattern matching, each pattern match introduces a new,
2030 distinct, type for each existential type variable. These types cannot
2031 be unified with any other type, nor can they escape from the scope of
2032 the pattern match. For example, these fragments are incorrect:
2040 Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal>
2041 is the result of <function>f1</function>. One way to see why this is wrong is to
2042 ask what type <function>f1</function> has:
2046 f1 :: Foo -> a -- Weird!
2050 What is this "<literal>a</literal>" in the result type? Clearly we don't mean
2055 f1 :: forall a. Foo -> a -- Wrong!
2059 The original program is just plain wrong. Here's another sort of error
2063 f2 (Baz1 a b) (Baz1 p q) = a==q
2067 It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but
2068 <literal>a==q</literal> is wrong because it equates the two distinct types arising
2069 from the two <function>Baz1</function> constructors.
2077 You can't pattern-match on an existentially quantified
2078 constructor in a <literal>let</literal> or <literal>where</literal> group of
2079 bindings. So this is illegal:
2083 f3 x = a==b where { Baz1 a b = x }
2086 Instead, use a <literal>case</literal> expression:
2089 f3 x = case x of Baz1 a b -> a==b
2092 In general, you can only pattern-match
2093 on an existentially-quantified constructor in a <literal>case</literal> expression or
2094 in the patterns of a function definition.
2096 The reason for this restriction is really an implementation one.
2097 Type-checking binding groups is already a nightmare without
2098 existentials complicating the picture. Also an existential pattern
2099 binding at the top level of a module doesn't make sense, because it's
2100 not clear how to prevent the existentially-quantified type "escaping".
2101 So for now, there's a simple-to-state restriction. We'll see how
2109 You can't use existential quantification for <literal>newtype</literal>
2110 declarations. So this is illegal:
2114 newtype T = forall a. Ord a => MkT a
2118 Reason: a value of type <literal>T</literal> must be represented as a
2119 pair of a dictionary for <literal>Ord t</literal> and a value of type
2120 <literal>t</literal>. That contradicts the idea that
2121 <literal>newtype</literal> should have no concrete representation.
2122 You can get just the same efficiency and effect by using
2123 <literal>data</literal> instead of <literal>newtype</literal>. If
2124 there is no overloading involved, then there is more of a case for
2125 allowing an existentially-quantified <literal>newtype</literal>,
2126 because the <literal>data</literal> version does carry an
2127 implementation cost, but single-field existentially quantified
2128 constructors aren't much use. So the simple restriction (no
2129 existential stuff on <literal>newtype</literal>) stands, unless there
2130 are convincing reasons to change it.
2138 You can't use <literal>deriving</literal> to define instances of a
2139 data type with existentially quantified data constructors.
2141 Reason: in most cases it would not make sense. For example:
2144 data T = forall a. MkT [a] deriving( Eq )
2147 To derive <literal>Eq</literal> in the standard way we would need to have equality
2148 between the single component of two <function>MkT</function> constructors:
2152 (MkT a) == (MkT b) = ???
2155 But <varname>a</varname> and <varname>b</varname> have distinct types, and so can't be compared.
2156 It's just about possible to imagine examples in which the derived instance
2157 would make sense, but it seems altogether simpler simply to prohibit such
2158 declarations. Define your own instances!
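The restrictions above can be illustrated with a minimal sketch. It assumes a GHC with -XExistentialQuantification; the type Baz and the length-based Eq instance are illustrative choices, not something the manual prescribes. The existential is unpacked with a case expression rather than a let or where binding, and the instance for T is written by hand because it cannot be derived.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
module Main where

-- Hypothetical type, modelled on the Baz1 constructor discussed above.
data Baz = forall a. Eq a => Baz1 a a

-- Legal: the existential is unpacked in a case alternative, so the
-- hidden type stays inside the alternative and only a Bool escapes.
f3 :: Baz -> Bool
f3 x = case x of
  Baz1 a b -> a == b

-- The type for which "deriving( Eq )" is rejected.
data T = forall a. MkT [a]

-- A hand-written instance (an assumption about what equality should
-- mean here): the element types may differ, so only lengths are compared.
instance Eq T where
  MkT xs == MkT ys = length xs == length ys

main :: IO ()
main = do
  print (f3 (Baz1 'x' 'x'))               -- True
  print (f3 (Baz1 (1 :: Int) 2))          -- False
  print (MkT "ab" == MkT [True, False])   -- True
  print (MkT "ab" == MkT [()])            -- False
```

Comparing the two lists element by element would not type-check, because each MkT pattern introduces its own distinct element type.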
2169 <!-- ====================== Generalised algebraic data types ======================= -->
2171 <sect2 id="gadt-style">
2172 <title>Declaring data types with explicit constructor signatures</title>
2174 <para>GHC allows you to declare an algebraic data type by
2175 giving the type signatures of constructors explicitly. For example:
2179 Just :: a -> Maybe a
2181 The form is called a "GADT-style declaration"
2182 because Generalised Algebraic Data Types, described in <xref linkend="gadt"/>,
2183 can only be declared using this form.</para>
2184 <para>Notice that GADT-style syntax generalises existential types (<xref linkend="existential-quantification"/>).
2185 For example, these two declarations are equivalent:
2187 data Foo = forall a. MkFoo a (a -> Bool)
2188 data Foo' where { MkFoo' :: a -> (a->Bool) -> Foo' }
2191 <para>Any data type that can be declared in standard Haskell-98 syntax
2192 can also be declared using GADT-style syntax.
2193 The choice is largely stylistic, but GADT-style declarations differ in one important respect:
2194 they treat class constraints on the data constructors differently.
2195 Specifically, if the constructor is given a type-class context, that
2196 context is made available by pattern matching. For example:
2199 MkSet :: Eq a => [a] -> Set a
2201 makeSet :: Eq a => [a] -> Set a
2202 makeSet xs = MkSet (nub xs)
2204 insert :: a -> Set a -> Set a
2205 insert a (MkSet as) | a `elem` as = MkSet as
2206 | otherwise = MkSet (a:as)
2208 A use of <literal>MkSet</literal> as a constructor (e.g. in the definition of <literal>makeSet</literal>)
2209 gives rise to a <literal>(Eq a)</literal>
2210 constraint, as you would expect. The new feature is that pattern-matching on <literal>MkSet</literal>
2211 (as in the definition of <literal>insert</literal>) makes <emphasis>available</emphasis> an <literal>(Eq a)</literal>
2212 context. In implementation terms, the <literal>MkSet</literal> constructor has a hidden field that stores
2213 the <literal>(Eq a)</literal> dictionary that is passed to <literal>MkSet</literal>; so
2214 when pattern-matching that dictionary becomes available for the right-hand side of the match.
2215 In the example, the equality dictionary is used to satisfy the equality constraint
2216 generated by the call to <literal>elem</literal>, so that the type of
2217 <literal>insert</literal> itself has no <literal>Eq</literal> constraint.
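As a self-contained sketch of the Set example, assuming -XGADTs is enabled (the helper toList is hypothetical and only there to inspect the result):

```haskell
{-# LANGUAGE GADTs #-}
module Main where

import Data.List (nub)

data Set a where
  MkSet :: Eq a => [a] -> Set a

-- Constructing a Set requires an Eq dictionary.
makeSet :: Eq a => [a] -> Set a
makeSet xs = MkSet (nub xs)

-- No Eq constraint in the signature: matching on MkSet supplies it.
insert :: a -> Set a -> Set a
insert a (MkSet as)
  | a `elem` as = MkSet as
  | otherwise   = MkSet (a : as)

-- Hypothetical helper, used only to look inside the set.
toList :: Set a -> [a]
toList (MkSet as) = as

main :: IO ()
main = print (toList (insert 3 (insert 1 (makeSet [1, 2, 2 :: Int]))))
-- prints [3,1,2]
```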
2220 For example, one possible application is to reify dictionaries:
2222 data NumInst a where
2223 MkNumInst :: Num a => NumInst a
2225 intInst :: NumInst Int
2228 plus :: NumInst a -> a -> a -> a
2229 plus MkNumInst p q = p + q
2231 Here, a value of type <literal>NumInst a</literal> is equivalent
2232 to an explicit <literal>(Num a)</literal> dictionary.
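A runnable version of the dictionary-reification idea might look as follows. This is a sketch assuming -XGADTs; sumWith is a hypothetical extra function showing that the stored dictionary also covers numeric literals.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

data NumInst a where
  MkNumInst :: Num a => NumInst a

intInst :: NumInst Int
intInst = MkNumInst

-- The Num dictionary comes from the pattern match, not from the caller.
plus :: NumInst a -> a -> a -> a
plus MkNumInst p q = p + q

-- Hypothetical helper: the literal 0 is also resolved via the stored dictionary.
sumWith :: NumInst a -> [a] -> a
sumWith MkNumInst xs = foldr (+) 0 xs

main :: IO ()
main = do
  print (plus intInst 2 3)                                  -- 5
  print (sumWith (MkNumInst :: NumInst Double) [0.5, 1.5])  -- 2.0
```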
2235 All this applies to constructors declared using the syntax of <xref linkend="existential-with-context"/>.
2236 For example, the <literal>NumInst</literal> data type above could equivalently be declared
2240 = Num a => MkNumInst
2242 Notice that, unlike the situation when declaring an existential, there is
2243 no <literal>forall</literal>, because the <literal>Num</literal> constrains the
2244 data type's universally quantified type variable <literal>a</literal>.
2245 A constructor may have both universal and existential type variables: for example,
2246 the following two declarations are equivalent:
2249 = forall b. (Num a, Eq b) => MkT1 a b
2251 MkT2 :: (Num a, Eq b) => a -> b -> T2 a
2254 <para>All this behaviour contrasts with Haskell 98's peculiar treatment of
2255 contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report).
2256 In Haskell 98 the definition
2258 data Eq a => Set' a = MkSet' [a]
2260 gives <literal>MkSet'</literal> the same type as <literal>MkSet</literal> above. But instead of
2261 <emphasis>making available</emphasis> an <literal>(Eq a)</literal> constraint, pattern-matching
2262 on <literal>MkSet'</literal> <emphasis>requires</emphasis> an <literal>(Eq a)</literal> constraint!
2263 GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations,
2264 GHC's behaviour is much more useful, as well as much more intuitive.
2268 The rest of this section gives further details about GADT-style data
2273 The result type of each data constructor must begin with the type constructor being defined.
2274 If the result type of all constructors
2275 has the form <literal>T a1 ... an</literal>, where <literal>a1 ... an</literal>
2276 are distinct type variables, then the data type is <emphasis>ordinary</emphasis>;
2277 otherwise it is a <emphasis>generalised</emphasis> data type (<xref linkend="gadt"/>).
2281 The type signature of
2282 each constructor is independent, and is implicitly universally quantified as usual.
2283 Different constructors may have different universally-quantified type variables
2284 and different type-class constraints.
2285 For example, this is fine:
2288 T1 :: Eq b => b -> T b
2289 T2 :: (Show c, Ix c) => c -> [c] -> T c
2294 Unlike a Haskell-98-style
2295 data type declaration, the type variable(s) in the "<literal>data Set a where</literal>" header
2296 have no scope. Indeed, one can write a kind signature instead:
2298 data Set :: * -> * where ...
2300 or even a mixture of the two:
2302 data Foo a :: (* -> *) -> * where ...
2304 The type variables (if given) may be explicitly kinded, so we could also write the header for <literal>Foo</literal>
2307 data Foo a (b :: * -> *) where ...
2313 You can use strictness annotations, in the obvious places
2314 in the constructor type:
2317 Lit :: !Int -> Term Int
2318 If :: Term Bool -> !(Term a) -> !(Term a) -> Term a
2319 Pair :: Term a -> Term b -> Term (a,b)
2324 You can use a <literal>deriving</literal> clause on a GADT-style data type
2325 declaration. For example, these two declarations are equivalent
2327 data Maybe1 a where {
2328 Nothing1 :: Maybe1 a ;
2329 Just1 :: a -> Maybe1 a
2330 } deriving( Eq, Ord )
2332 data Maybe2 a = Nothing2 | Just2 a
2338 You can use record syntax on a GADT-style data type declaration:
2342 Adult { name :: String, children :: [Person] } :: Person
2343 Child { name :: String } :: Person
2345 As usual, for every constructor that has a field <literal>f</literal>, the type of
2346 field <literal>f</literal> must be the same (modulo alpha conversion).
2349 At the moment, record updates are not yet possible with GADT-style declarations,
2350 so support is limited to record construction, selection and pattern matching.
2353 aPerson = Adult { name = "Fred", children = [] }
2355 hasChildren :: Person -> Bool
2356 hasChildren (Adult { children = kids }) = not (null kids)
2357 hasChildren (Child {}) = False
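The following sketch shows record construction, selection and pattern matching on a GADT-style record type. One assumption should be stated loudly: the listing above writes the record form as Con { fields } :: T, whereas recent GHC releases expect Con :: { fields } -> T, and the sketch uses that newer form. The child named "Barney" is made up for illustration.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

-- Newer record syntax for GADT-style declarations (assumed here).
data Person where
  Adult :: { name :: String, children :: [Person] } -> Person
  Child :: { name :: String } -> Person

-- Record construction.
aPerson :: Person
aPerson = Adult { name = "Fred", children = [Child { name = "Barney" }] }

-- Record pattern matching.
hasChildren :: Person -> Bool
hasChildren (Adult { children = kids }) = not (null kids)
hasChildren (Child {}) = False

main :: IO ()
main = do
  putStrLn (name aPerson)              -- selection; prints Fred
  print (hasChildren aPerson)          -- True
  print (map name (children aPerson))  -- ["Barney"]
```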
2362 As in the case of existentials declared using the Haskell-98-like record syntax
2363 (<xref linkend="existential-records"/>),
2364 record-selector functions are generated only for those fields that have well-typed
2366 Here is the example of that section, in GADT-style syntax:
2368 data Counter a where
2369 NewCounter { _this :: self
2370 , _inc :: self -> self
2371 , _display :: self -> IO ()
2376 As before, only one selector function is generated here, that for <literal>tag</literal>.
2377 Nevertheless, you can still use all the field names in pattern matching and record construction.
2379 </itemizedlist></para>
2383 <title>Generalised Algebraic Data Types (GADTs)</title>
2385 <para>Generalised Algebraic Data Types generalise ordinary algebraic data types
2386 by allowing constructors to have richer return types. Here is an example:
2389 Lit :: Int -> Term Int
2390 Succ :: Term Int -> Term Int
2391 IsZero :: Term Int -> Term Bool
2392 If :: Term Bool -> Term a -> Term a -> Term a
2393 Pair :: Term a -> Term b -> Term (a,b)
2395 Notice that the return type of the constructors is not always <literal>Term a</literal>, as is the
2396 case with ordinary data types. This generality allows us to
2397 write a well-typed <literal>eval</literal> function
2398 for these <literal>Terms</literal>:
2402 eval (Succ t) = 1 + eval t
2403 eval (IsZero t) = eval t == 0
2404 eval (If b e1 e2) = if eval b then eval e1 else eval e2
2405 eval (Pair e1 e2) = (eval e1, eval e2)
2407 The key point about GADTs is that <emphasis>pattern matching causes type refinement</emphasis>.
2408 For example, in the right hand side of the equation
2413 the type <literal>a</literal> is refined to <literal>Int</literal>. That's the whole point!
2414 A precise specification of the type rules is beyond what this user manual aspires to,
2415 but the design closely follows that described in
2417 url="http://research.microsoft.com/%7Esimonpj/papers/gadt/">Simple
2418 unification-based type inference for GADTs</ulink>,
2420 The general principle is this: <emphasis>type refinement is only carried out
2421 based on user-supplied type annotations</emphasis>.
2422 So if no type signature is supplied for <literal>eval</literal>, no type refinement happens,
2423 and lots of obscure error messages will
2424 occur. However, the refinement is quite general. For example, if we had:
2426 eval :: Term a -> a -> a
2427 eval (Lit i) j = i+j
2429 the pattern match causes the type <literal>a</literal> to be refined to <literal>Int</literal> (because of the type
2430 of the constructor <literal>Lit</literal>), and that refinement also applies to the type of <literal>j</literal>, and
2431 the result type of the <literal>case</literal> expression. Hence the addition <literal>i+j</literal> is legal.
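Putting the pieces together, a complete program for the Term example could read as below. This is a sketch assuming -XGADTs; note the type signature on eval, without which no refinement takes place.

```haskell
{-# LANGUAGE GADTs #-}
module Main where

data Term a where
  Lit    :: Int -> Term Int
  Succ   :: Term Int -> Term Int
  IsZero :: Term Int -> Term Bool
  If     :: Term Bool -> Term a -> Term a -> Term a
  Pair   :: Term a -> Term b -> Term (a, b)

-- In each equation the type a is refined by the constructor matched:
-- to Int for Lit and Succ, to Bool for IsZero, to a pair for Pair.
eval :: Term a -> a
eval (Lit i)      = i
eval (Succ t)     = 1 + eval t
eval (IsZero t)   = eval t == 0
eval (If b e1 e2) = if eval b then eval e1 else eval e2
eval (Pair e1 e2) = (eval e1, eval e2)

main :: IO ()
main = do
  print (eval (Succ (Lit 41)))                        -- 42
  print (eval (If (IsZero (Lit 0))
                  (Pair (Lit 1) (IsZero (Lit 5)))
                  (Pair (Lit 0) (IsZero (Lit 0)))))   -- (1,False)
```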
2434 These and many other examples are given in papers by Hongwei Xi and
2435 Tim Sheard. There is a longer introduction
2436 <ulink url="http://www.haskell.org/haskellwiki/GADT">on the wiki</ulink>,
2438 <ulink url="http://www.informatik.uni-bonn.de/~ralf/publications/With.pdf">Fun with phantom types</ulink> also has a number of examples. Note that papers
2439 may use different notation to that implemented in GHC.
2442 The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with
2443 <option>-XGADTs</option>. The <option>-XGADTs</option> flag also sets <option>-XRelaxedPolyRec</option>.
2446 A GADT can only be declared using GADT-style syntax (<xref linkend="gadt-style"/>);
2447 the old Haskell-98 syntax for data declarations always declares an ordinary data type.
2448 The result type of each constructor must begin with the type constructor being defined,
2449 but for a GADT the arguments to the type constructor can be arbitrary monotypes.
2450 For example, in the <literal>Term</literal> data
2451 type above, the type of each constructor must end with <literal>Term ty</literal>, but
2452 the <literal>ty</literal> need not be a type variable (e.g. the <literal>Lit</literal>
2457 It is permitted to declare an ordinary algebraic data type using GADT-style syntax.
2458 What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors
2459 whose result type is not just <literal>T a b</literal>.
2463 You cannot use a <literal>deriving</literal> clause for a GADT; only for
2464 an ordinary data type.
2468 As mentioned in <xref linkend="gadt-style"/>, record syntax is supported.
2472 Lit { val :: Int } :: Term Int
2473 Succ { num :: Term Int } :: Term Int
2474 Pred { num :: Term Int } :: Term Int
2475 IsZero { arg :: Term Int } :: Term Bool
2476 Pair { arg1 :: Term a
2479 If { cnd :: Term Bool
2484 However, for GADTs there is the following additional constraint:
2485 every constructor that has a field <literal>f</literal> must have
2486 the same result type (modulo alpha conversion).
2487 Hence, in the above example, we cannot merge the <literal>num</literal>
2488 and <literal>arg</literal> fields into a
2489 single name. Although their field types are both <literal>Term Int</literal>,
2490 their selector functions actually have different types:
2493 num :: Term Int -> Term Int
2494 arg :: Term Bool -> Term Int
2499 When pattern-matching against data constructors drawn from a GADT,
2500 for example in a <literal>case</literal> expression, the following rules apply:
2502 <listitem><para>The type of the scrutinee must be rigid.</para></listitem>
2503 <listitem><para>The type of the entire <literal>case</literal> expression must be rigid.</para></listitem>
2504 <listitem><para>The type of any free variable mentioned in any of
2505 the <literal>case</literal> alternatives must be rigid.</para></listitem>
2507 A type is "rigid" if it is completely known to the compiler at its binding site. The easiest
2508 way to ensure that a variable has a rigid type is to give it a type signature.
2509 For more precise details see <ulink url="http://research.microsoft.com/%7Esimonpj/papers/gadt">
2510 Simple unification-based type inference for GADTs
2511 </ulink>. The criteria implemented by GHC are given in the Appendix.
2521 <!-- ====================== End of Generalised algebraic data types ======================= -->
2523 <sect1 id="deriving">
2524 <title>Extensions to the "deriving" mechanism</title>
2526 <sect2 id="deriving-inferred">
2527 <title>Inferred context for deriving clauses</title>
2530 The Haskell Report is vague about exactly when a <literal>deriving</literal> clause is
2533 data T0 f a = MkT0 a deriving( Eq )
2534 data T1 f a = MkT1 (f a) deriving( Eq )
2535 data T2 f a = MkT2 (f (f a)) deriving( Eq )
2537 The natural generated <literal>Eq</literal> code would result in these instance declarations:
2539 instance Eq a => Eq (T0 f a) where ...
2540 instance Eq (f a) => Eq (T1 f a) where ...
2541 instance Eq (f (f a)) => Eq (T2 f a) where ...
2543 The first of these is obviously fine. The second is still fine, although less obviously.
2544 The third is not Haskell 98, and risks losing termination of instances.
2547 GHC takes a conservative position: it accepts the first two, but not the third. The rule is this:
2548 each constraint in the inferred instance context must consist only of type variables,
2549 with no repetitions.
2552 This rule is applied regardless of flags. If you want a more exotic context, you can write
2553 it yourself, using the <link linkend="stand-alone-deriving">standalone deriving mechanism</link>.
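As a sketch of the rule and its escape hatch: T1 derives Eq through an ordinary clause, while T2 gets its more exotic context from a stand-alone declaration. The extra flags -XFlexibleContexts and -XUndecidableInstances are assumptions about what a recent GHC needs to accept the hand-written context.

```haskell
{-# LANGUAGE StandaloneDeriving, FlexibleContexts, UndecidableInstances #-}
module Main where

-- Accepted: the inferred context is Eq (f a).
data T1 f a = MkT1 (f a) deriving Eq

-- A deriving clause would be rejected here, so the context
-- Eq (f (f a)) is supplied explicitly.
data T2 f a = MkT2 (f (f a))

deriving instance Eq (f (f a)) => Eq (T2 f a)

main :: IO ()
main = do
  print (MkT1 (Just 'x') == MkT1 (Just 'x'))   -- True
  print (MkT2 [[1 :: Int]] == MkT2 [[2]])      -- False
```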
2557 <sect2 id="stand-alone-deriving">
2558 <title>Stand-alone deriving declarations</title>
2561 GHC now allows stand-alone <literal>deriving</literal> declarations, enabled by <literal>-XStandaloneDeriving</literal>:
2563 data Foo a = Bar a | Baz String
2565 deriving instance Eq a => Eq (Foo a)
2567 The syntax is identical to that of an ordinary instance declaration apart from (a) the keyword
2568 <literal>deriving</literal>, and (b) the absence of the <literal>where</literal> part.
2569 You must supply a context (in the example the context is <literal>(Eq a)</literal>),
2570 exactly as you would in an ordinary instance declaration.
2571 (In contrast the context is inferred in a <literal>deriving</literal> clause
2572 attached to a data type declaration.)
2574 A <literal>deriving instance</literal> declaration
2575 must obey the same rules concerning form and termination as ordinary instance declarations,
2576 controlled by the same flags; see <xref linkend="instance-decls"/>.
2579 Unlike a <literal>deriving</literal>
2580 declaration attached to a <literal>data</literal> declaration, the instance can be more specific
2581 than the data type (assuming you also use
2582 <literal>-XFlexibleInstances</literal>, <xref linkend="instance-rules"/>). Consider
2585 data Foo a = Bar a | Baz String
2587 deriving instance Eq a => Eq (Foo [a])
2588 deriving instance Eq a => Eq (Foo (Maybe a))
2590 This will generate a derived instance for <literal>(Foo [a])</literal> and <literal>(Foo (Maybe a))</literal>,
2591 but other types such as <literal>(Foo (Int,Bool))</literal> will not be an instance of <literal>Eq</literal>.
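A small program exercising these two instances, as a sketch assuming -XStandaloneDeriving and -XFlexibleInstances:

```haskell
{-# LANGUAGE StandaloneDeriving, FlexibleInstances #-}
module Main where

data Foo a = Bar a | Baz String

-- Eq is available only at these two shapes of the type parameter.
deriving instance Eq a => Eq (Foo [a])
deriving instance Eq a => Eq (Foo (Maybe a))

main :: IO ()
main = do
  print (Bar "abc" == Bar "abc")        -- uses Eq (Foo [Char]); True
  print (Bar (Just True) == Baz "x")    -- uses Eq (Foo (Maybe Bool)); False
```

An expression such as Bar (1 :: Int, True) == Bar (1, True) would be rejected, because no instance covers Foo (Int, Bool).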
2594 <para>The stand-alone syntax is generalised for newtypes in exactly the same
2595 way that ordinary <literal>deriving</literal> clauses are generalised (<xref linkend="newtype-deriving"/>).
2598 newtype Foo a = MkFoo (State Int a)
2600 deriving instance MonadState Int Foo
2602 GHC always treats the <emphasis>last</emphasis> parameter of the instance
2603 (<literal>Foo</literal> in this example) as the type whose instance is being derived.
2609 <sect2 id="deriving-typeable">
2610 <title>Deriving clause for classes <literal>Typeable</literal> and <literal>Data</literal></title>
2613 Haskell 98 allows the programmer to add "<literal>deriving( Eq, Ord )</literal>" to a data type
2614 declaration, to generate a standard instance declaration for classes specified in the <literal>deriving</literal> clause.
2615 In Haskell 98, the only classes that may appear in the <literal>deriving</literal> clause are the standard
2616 classes <literal>Eq</literal>, <literal>Ord</literal>,
2617 <literal>Enum</literal>, <literal>Ix</literal>, <literal>Bounded</literal>, <literal>Read</literal>, and <literal>Show</literal>.
2620 GHC extends this list with two more classes that may be automatically derived
2621 (provided the <option>-XDeriveDataTypeable</option> flag is specified):
2622 <literal>Typeable</literal>, and <literal>Data</literal>. These classes are defined in the library
2623 modules <literal>Data.Typeable</literal> and <literal>Data.Generics</literal> respectively, and the
2624 appropriate class must be in scope before it can be mentioned in the <literal>deriving</literal> clause.
2626 <para>An instance of <literal>Typeable</literal> can only be derived if the
2627 data type has seven or fewer type parameters, all of kind <literal>*</literal>.
2628 The reason for this is that the <literal>Typeable</literal> class is derived using the scheme
2630 <ulink url="http://research.microsoft.com/%7Esimonpj/papers/hmap/gmap2.ps">
2631 Scrap More Boilerplate: Reflection, Zips, and Generalised Casts
2633 (Section 7.4 of the paper describes the multiple <literal>Typeable</literal> classes that
2634 are used, and only <literal>Typeable1</literal> up to
2635 <literal>Typeable7</literal> are provided in the library.)
2636 In other cases, there is nothing to stop the programmer writing a <literal>TypeableX</literal>
2637 class, whose kind suits that of the data type constructor, and
2638 then writing the data type instance by hand.
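A minimal sketch of deriving both classes follows. It assumes a recent base library, in which Typeable and Data can both be imported from Data.Data; the type Colour is hypothetical.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}
module Main where

import Data.Data (Data, Typeable, dataTypeConstrs, dataTypeOf,
                  showConstr, toConstr, typeOf)

-- Hypothetical example type.
data Colour = Red | Green | Blue
  deriving (Show, Typeable, Data)

main :: IO ()
main = do
  print (typeOf Red)                                           -- Colour
  print (toConstr Green)                                       -- Green
  print (map showConstr (dataTypeConstrs (dataTypeOf Blue)))   -- ["Red","Green","Blue"]
```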
2642 <sect2 id="newtype-deriving">
2643 <title>Generalised derived instances for newtypes</title>
2646 When you define an abstract type using <literal>newtype</literal>, you may want
2647 the new type to inherit some instances from its representation. In
2648 Haskell 98, you can inherit instances of <literal>Eq</literal>, <literal>Ord</literal>,
2649 <literal>Enum</literal> and <literal>Bounded</literal> by deriving them, but for any
2650 other classes you have to write an explicit instance declaration. For
2651 example, if you define
2654 newtype Dollars = Dollars Int
2657 and you want to use arithmetic on <literal>Dollars</literal>, you have to
2658 explicitly define an instance of <literal>Num</literal>:
2661 instance Num Dollars where
2662 Dollars a + Dollars b = Dollars (a+b)
2665 All the instance does is apply and remove the <literal>newtype</literal>
2666 constructor. It is particularly galling that, since the constructor
2667 doesn't appear at run-time, this instance declaration defines a
2668 dictionary which is <emphasis>wholly equivalent</emphasis> to the <literal>Int</literal>
2669 dictionary, only slower!
2673 <sect3> <title> Generalising the deriving clause </title>
2675 GHC now permits such instances to be derived instead,
2676 using the flag <option>-XGeneralizedNewtypeDeriving</option>,
2679 newtype Dollars = Dollars Int deriving (Eq,Show,Num)
2682 and the implementation uses the <emphasis>same</emphasis> <literal>Num</literal> dictionary
2683 for <literal>Dollars</literal> as for <literal>Int</literal>. Notionally, the compiler
2684 derives an instance declaration of the form
2687 instance Num Int => Num Dollars
2690 which just adds or removes the <literal>newtype</literal> constructor according to the type.
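For instance, a complete program using the derived Num instance might look like this (a sketch assuming -XGeneralizedNewtypeDeriving):

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Main where

-- Eq and Num reuse the Int dictionaries; Show is derived in the usual way.
newtype Dollars = Dollars Int
  deriving (Eq, Show, Num)

main :: IO ()
main = do
  print (Dollars 2 + Dollars 3)            -- Dollars 5
  print (sum [Dollars 1, Dollars 2, 10])   -- Dollars 13
```

The literal 10 in the second line is itself a Dollars value, via the fromInteger method of the derived instance.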
2694 We can also derive instances of constructor classes in a similar
2695 way. For example, suppose we have implemented state and failure monad
2696 transformers, such that
2699 instance Monad m => Monad (State s m)
2700 instance Monad m => Monad (Failure m)
2702 In Haskell 98, we can define a parsing monad by
2704 type Parser tok m a = State [tok] (Failure m) a
2707 which is automatically a monad thanks to the instance declarations
2708 above. With the extension, we can make the parser type abstract,
2709 without needing to write an instance of class <literal>Monad</literal>, via
2712 newtype Parser tok m a = Parser (State [tok] (Failure m) a)
2715 In this case the derived instance declaration is of the form
2717 instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)
2720 Notice that, since <literal>Monad</literal> is a constructor class, the
2721 instance is a <emphasis>partial application</emphasis> of the new type, not the
2722 entire left hand side. We can imagine that the type declaration is
2723 "eta-converted" to generate the context of the instance
2728 We can even derive instances of multi-parameter classes, provided the
2729 newtype is the last class parameter. In this case, a "partial
2730 application" of the class appears in the <literal>deriving</literal>
2731 clause. For example, given the class
2734 class StateMonad s m | m -> s where ...
2735 instance Monad m => StateMonad s (State s m) where ...
2737 then we can derive an instance of <literal>StateMonad</literal> for <literal>Parser</literal>s by
2739 newtype Parser tok m a = Parser (State [tok] (Failure m) a)
2740 deriving (Monad, StateMonad [tok])
2743 The derived instance is obtained by completing the application of the
2744 class to the new type:
2747 instance StateMonad [tok] (State [tok] (Failure m)) =>
2748 StateMonad [tok] (Parser tok m)
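The same mechanism can be tried out on a smaller scale. The sketch below is an assumption-laden stand-in for the Parser example: it defines its own tiny State monad and StateMonad class instead of the transformers in the text, and the names Counter, tick and runCounter are hypothetical. It assumes a GHC whose base library has liftA2 as a method of Applicative.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving, MultiParamTypeClasses,
             FunctionalDependencies, FlexibleInstances #-}
module Main where

import Control.Applicative (Applicative (..))

-- A deliberately tiny state monad, standing in for the stack in the text.
newtype State s a = State { runState :: s -> (a, s) }

instance Functor (State s) where
  fmap f (State g) = State (\s -> let { (a, s1) = g s } in (f a, s1))

instance Applicative (State s) where
  pure a = State (\s -> (a, s))
  liftA2 f (State g) (State h) =
    State (\s -> let { (a, s1) = g s; (b, s2) = h s1 } in (f a b, s2))

instance Monad (State s) where
  State g >>= k = State (\s -> let { (a, s1) = g s } in runState (k a) s1)

class StateMonad s m | m -> s where
  get :: m s
  put :: s -> m ()

instance StateMonad s (State s) where
  get   = State (\s -> (s, s))
  put s = State (\_ -> ((), s))

-- The newtype is the last parameter of StateMonad, so the partial
-- application "StateMonad Int" may appear in the deriving clause.
newtype Counter a = Counter (State Int a)
  deriving (Functor, Applicative, Monad, StateMonad Int)

-- Return the current count and increment it.
tick :: Counter Int
tick = get >>= \n -> put (n + 1) >> return n

runCounter :: Counter a -> Int -> (a, Int)
runCounter (Counter m) = runState m

main :: IO ()
main = print (runCounter (tick >> tick >> tick) 0)   -- (2,3)
```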
2753 As a result of this extension, all derived instances in newtype
2754 declarations are treated uniformly (and implemented just by reusing
2755 the dictionary for the representation type), <emphasis>except</emphasis>
2756 <literal>Show</literal> and <literal>Read</literal>, which really behave differently for
2757 the newtype and its representation.
2761 <sect3> <title> A more precise specification </title>
2763 Derived instance declarations are constructed as follows. Consider the
2764 declaration (after expansion of any type synonyms)
2767 newtype T v1...vn = T' (t vk+1...vn) deriving (c1...cm)
2773 The <literal>ci</literal> are partial applications of
2774 classes of the form <literal>C t1'...tj'</literal>, where the arity of <literal>C</literal>
2775 is exactly <literal>j+1</literal>. That is, <literal>C</literal> lacks exactly one type argument.
2778 The <literal>k</literal> is chosen so that <literal>ci (T v1...vk)</literal> is well-kinded.
2781 The type <literal>t</literal> is an arbitrary type.
2784 The type variables <literal>vk+1...vn</literal> do not occur in <literal>t</literal>,
2785 nor in the <literal>ci</literal>, and
2788 None of the <literal>ci</literal> is <literal>Read</literal>, <literal>Show</literal>,
2789 <literal>Typeable</literal>, or <literal>Data</literal>. These classes
2790 should not "look through" the type or its constructor. You can still
2791 derive these classes for a newtype, but it happens in the usual way, not
2792 via this new mechanism.
2795 Then, for each <literal>ci</literal>, the derived instance
2798 instance ci t => ci (T v1...vk)
2800 As an example which does <emphasis>not</emphasis> work, consider
2802 newtype NonMonad m s = NonMonad (State s m s) deriving Monad
2804 Here we cannot derive the instance
2806 instance Monad (State s m) => Monad (NonMonad m)
2809 because the type variable <literal>s</literal> occurs in <literal>State s m</literal>,
2810 and so cannot be "eta-converted" away. It is a good thing that this
2811 <literal>deriving</literal> clause is rejected, because <literal>NonMonad m</literal> is
2812 not, in fact, a monad --- for the same reason. Try defining
2813 <literal>>>=</literal> with the correct type: you won't be able to.
2817 Notice also that the <emphasis>order</emphasis> of class parameters becomes
2818 important, since we can only derive instances for the last one. If the
2819 <literal>StateMonad</literal> class above were instead defined as
2822 class StateMonad m s | m -> s where ...
2825 then we would not have been able to derive an instance for the
2826 <literal>Parser</literal> type above. We hypothesise that multi-parameter
2827 classes usually have one "main" parameter for which deriving new
2828 instances is most interesting.
2830 <para>Lastly, all of this applies only for classes other than
2831 <literal>Read</literal>, <literal>Show</literal>, <literal>Typeable</literal>,
2832 and <literal>Data</literal>, for which the built-in derivation applies (Section
2833 4.3.3 of the Haskell Report).
2834 (For the standard classes <literal>Eq</literal>, <literal>Ord</literal>,
2835 <literal>Ix</literal>, and <literal>Bounded</literal> it is immaterial whether
2836 the standard method is used or the one described here.)
2843 <!-- TYPE SYSTEM EXTENSIONS -->
2844 <sect1 id="type-class-extensions">
2845 <title>Class and instance declarations</title>
2847 <sect2 id="multi-param-type-classes">
2848 <title>Class declarations</title>
2851 This section, and the next one, document GHC's type-class extensions.
2852 There's lots of background in the paper <ulink
2853 url="http://research.microsoft.com/~simonpj/Papers/type-class-design-space/">Type
2854 classes: exploring the design space</ulink> (Simon Peyton Jones, Mark
2855 Jones, Erik Meijer).
2858 All the extensions are enabled by the <option>-fglasgow-exts</option> flag.
2862 <title>Multi-parameter type classes</title>
2864 Multi-parameter type classes are permitted. For example:
2868 class Collection c a where
2869 union :: c a -> c a -> c a
2877 <title>The superclasses of a class declaration</title>
2880 There are no restrictions on the context in a class declaration
2881 (which introduces superclasses), except that the class hierarchy must
2882 be acyclic. So these class declarations are OK:
2886 class Functor (m k) => FiniteMap m k where
2889 class (Monad m, Monad (t m)) => Transform t m where
2890 lift :: m a -> (t m) a
2896 As in Haskell 98, the class hierarchy must be acyclic. However, the definition
2897 of "acyclic" involves only the superclass relationships. For example,
2903 op :: D b => a -> b -> b
2906 class C a => D a where { ... }
2910 Here, <literal>C</literal> is a superclass of <literal>D</literal>, but it's OK for a
2911 class operation <literal>op</literal> of <literal>C</literal> to mention <literal>D</literal>. (It
2912 would not be OK for <literal>D</literal> to be a superclass of <literal>C</literal>.)
2919 <sect3 id="class-method-types">
2920 <title>Class method types</title>
2923 Haskell 98 prohibits class method types from mentioning constraints on the
2924 class type variable, thus:
2927 fromList :: [a] -> s a
2928 elem :: Eq a => a -> s a -> Bool
2930 The type of <literal>elem</literal> is illegal in Haskell 98, because it
2931 contains the constraint <literal>Eq a</literal>, which constrains only the
2932 class type variable (in this case <literal>a</literal>).
2933 GHC lifts this restriction (flag <option>-XConstrainedClassMethods</option>).
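A minimal sketch of a constrained class method, assuming -XConstrainedClassMethods; the class Shape and the type Square are hypothetical. The method describe carries the constraint Show a, which mentions only the class type variable.

```haskell
{-# LANGUAGE ConstrainedClassMethods #-}
module Main where

class Shape a where
  area :: a -> Double
  -- Show a constrains only the class variable a: not Haskell 98.
  describe :: Show a => a -> String
  describe x = show x ++ " with area " ++ show (area x)

-- Hypothetical instance type.
data Square = Square Double deriving Show

instance Shape Square where
  area (Square w) = w * w

main :: IO ()
main = putStrLn (describe (Square 2))   -- Square 2.0 with area 4.0
```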
2940 <sect2 id="functional-dependencies">
2941 <title>Functional dependencies
2944 <para> Functional dependencies are implemented as described by Mark Jones
2945 in “<ulink url="http://citeseer.ist.psu.edu/jones00type.html">Type Classes with Functional Dependencies</ulink>”, Mark P. Jones,
2946 in Proceedings of the 9th European Symposium on Programming,
2947 ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782,
2951 Functional dependencies are introduced by a vertical bar in the syntax of a
2952 class declaration; e.g.
2954 class (Monad m) => MonadState s m | m -> s where ...
2956 class Foo a b c | a b -> c where ...
2958 There should be more documentation, but there isn't (yet). Yell if you need it.
2961 <sect3><title>Rules for functional dependencies </title>
2963 In a class declaration, all of the class type variables must be reachable (in the sense
2964 mentioned in <xref linkend="type-restrictions"/>)
2965 from the free variables of each method type.
2969 class Coll s a where
2971 insert :: s -> a -> s
2974 is not OK, because the type of <literal>empty</literal> doesn't mention
2975 <literal>a</literal>. Functional dependencies can make the type variable
2978 class Coll s a | s -> a where
2980 insert :: s -> a -> s
2983 Alternatively <literal>Coll</literal> might be rewritten
2986 class Coll s a where
2988 insert :: s a -> a -> s a
2992 which makes the connection between the type of a collection of
2993 <literal>a</literal>'s (namely <literal>(s a)</literal>) and the element type <literal>a</literal>.
2994 Occasionally this really doesn't work, in which case you can split the
3002 class CollE s => Coll s a where
3003 insert :: s -> a -> s
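To see the dependency at work, here is a sketch of the Coll class with a list instance. It assumes -XMultiParamTypeClasses, -XFunctionalDependencies and -XFlexibleInstances; the list instance is an illustrative choice. Because s determines a, the use of empty is unambiguous once the collection type is known.

```haskell
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}
module Main where

class Coll s a | s -> a where
  empty  :: s
  insert :: s -> a -> s

-- Illustrative instance: a list is a collection of its elements.
instance Coll [a] a where
  empty       = []
  insert xs x = x : xs

main :: IO ()
main = print (insert (insert empty 'a') 'b' :: String)   -- "ba"
```

The annotation fixes the collection type to String; the dependency then fixes the element type to Char, including at the occurrence of empty, whose type does not mention the element type at all.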
3010 <title>Background on functional dependencies</title>
3012 <para>The following description of the motivation and use of functional dependencies is taken
3013 from the Hugs user manual, reproduced here (with minor changes) by kind
3014 permission of Mark Jones.
3017 Consider the following class, intended as part of a
3018 library for collection types:
3020 class Collects e ce where
3022 insert :: e -> ce -> ce
3023 member :: e -> ce -> Bool
3025 The type variable e used here represents the element type, while ce is the type
3026 of the container itself. Within this framework, we might want to define
3027 instances of this class for lists or characteristic functions (both of which
3028 can be used to represent collections of any equality type), bit sets (which can
3029 be used to represent collections of characters), or hash tables (which can be
3030 used to represent any collection whose elements have a hash function). Omitting
3031 standard implementation details, this would lead to the following declarations:
3033 instance Eq e => Collects e [e] where ...
3034 instance Eq e => Collects e (e -> Bool) where ...
3035 instance Collects Char BitSet where ...
  instance (Hashable e, Collects e ce)
             => Collects e (Array Int ce) where ...
3039 All this looks quite promising; we have a class and a range of interesting
3040 implementations. Unfortunately, there are some serious problems with the class
3041 declaration. First, the empty function has an ambiguous type:
3043 empty :: Collects e ce => ce
3045 By "ambiguous" we mean that there is a type variable e that appears on the left
3046 of the <literal>=></literal> symbol, but not on the right. The problem with
3047 this is that, according to the theoretical foundations of Haskell overloading,
we cannot guarantee a well-defined semantics for any term with an ambiguous
type.
3052 We can sidestep this specific problem by removing the empty member from the
3053 class declaration. However, although the remaining members, insert and member,
3054 do not have ambiguous types, we still run into problems when we try to use
3055 them. For example, consider the following two functions:
  f x y = insert x . insert y
  g     = f True 'a'
3060 for which GHC infers the following types:
3062 f :: (Collects a c, Collects b c) => a -> b -> c -> c
3063 g :: (Collects Bool c, Collects Char c) => c -> c
3065 Notice that the type for f allows the two parameters x and y to be assigned
3066 different types, even though it attempts to insert each of the two values, one
3067 after the other, into the same collection. If we're trying to model collections
3068 that contain only one type of value, then this is clearly an inaccurate
3069 type. Worse still, the definition for g is accepted, without causing a type
3070 error. As a result, the error in this code will not be flagged at the point
3071 where it appears. Instead, it will show up only when we try to use g, which
3072 might even be in a different module.
3075 <sect4><title>An attempt to use constructor classes</title>
3078 Faced with the problems described above, some Haskell programmers might be
3079 tempted to use something like the following version of the class declaration:
  class Collects e c where
    empty  :: c e
    insert :: e -> c e -> c e
    member :: e -> c e -> Bool
3086 The key difference here is that we abstract over the type constructor c that is
3087 used to form the collection type c e, and not over that collection type itself,
3088 represented by ce in the original class declaration. This avoids the immediate
3089 problems that we mentioned above: empty has type <literal>Collects e c => c
3090 e</literal>, which is not ambiguous.
3093 The function f from the previous section has a more accurate type:
3095 f :: (Collects e c) => e -> e -> c e -> c e
The function g from the previous section is now rejected with a type error, as
we would hope, because the type of f does not allow the two arguments to have
different types.
3100 This, then, is an example of a multiple parameter class that does actually work
3101 quite well in practice, without ambiguity problems.
3102 There is, however, a catch. This version of the Collects class is nowhere near
3103 as general as the original class seemed to be: only one of the four instances
3104 for <literal>Collects</literal>
3105 given above can be used with this version of Collects because only one of
3106 them---the instance for lists---has a collection type that can be written in
3107 the form c e, for some type constructor c, and element type e.
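As a rough illustration of this trade-off, the sketch below spells out the constructor-class version together with the list instance, the one instance that still fits. The method bodies and <literal>main</literal> are assumptions added for the example; they are not part of the original text.

<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}

-- Constructor-class version: c is a type constructor, so a
-- collection of elements of type e has type (c e).
class Collects e c where
  empty  :: c e
  insert :: e -> c e -> c e
  member :: e -> c e -> Bool

-- Lists fit the (c e) shape with c = [].
instance Eq e => Collects e [] where
  empty  = []
  insert = (:)
  member = elem

-- Both inserted values now share the single element type e.
f :: Collects e c => e -> e -> c e -> c e
f x y = insert x . insert y

main :: IO ()
main = do
  let xs = f 'a' 'b' empty :: String
  print xs
  print (member 'a' xs)
</programlisting>

Run as a program, this should print <literal>"ab"</literal> followed by <literal>True</literal>.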
3111 <sect4><title>Adding functional dependencies</title>
3114 To get a more useful version of the Collects class, Hugs provides a mechanism
3115 that allows programmers to specify dependencies between the parameters of a
multiple parameter class. (For readers with an interest in theoretical
foundations and previous work: the use of dependency information can be seen
either as a generalization of the proposal for `parametric type classes' that was
put forward by Chen, Hudak, and Odersky, or as a special case of Mark Jones's
later framework for "improvement" of qualified types. The
underlying ideas are also discussed in a more theoretical and abstract setting
in a manuscript [implparam], where they are identified as one point in a
general design space for systems of implicit parameterization.)
3125 To start with an abstract example, consider a declaration such as:
3127 class C a b where ...
3129 which tells us simply that C can be thought of as a binary relation on types
3130 (or type constructors, depending on the kinds of a and b). Extra clauses can be
3131 included in the definition of classes to add information about dependencies
3132 between parameters, as in the following examples:
3134 class D a b | a -> b where ...
3135 class E a b | a -> b, b -> a where ...
3137 The notation <literal>a -> b</literal> used here between the | and where
3138 symbols --- not to be
3139 confused with a function type --- indicates that the a parameter uniquely
3140 determines the b parameter, and might be read as "a determines b." Thus D is
3141 not just a relation, but actually a (partial) function. Similarly, from the two
3142 dependencies that are included in the definition of E, we can see that E
3143 represents a (partial) one-one mapping between types.
3146 More generally, dependencies take the form <literal>x1 ... xn -> y1 ... ym</literal>,
where x1, ..., xn, and y1, ..., ym are type variables with n>0 and
3148 m>=0, meaning that the y parameters are uniquely determined by the x
3149 parameters. Spaces can be used as separators if more than one variable appears
3150 on any single side of a dependency, as in <literal>t -> a b</literal>. Note that a class may be
3151 annotated with multiple dependencies using commas as separators, as in the
3152 definition of E above. Some dependencies that we can write in this notation are
3153 redundant, and will be rejected because they don't serve any useful
3154 purpose, and may instead indicate an error in the program. Examples of
3155 dependencies like this include <literal>a -> a </literal>,
3156 <literal>a -> a a </literal>,
<literal>a -> </literal>, etc. There can also be
some redundancy if multiple dependencies are given, as in
<literal>a->b</literal>,
<literal>b->c </literal>, <literal>a->c </literal>,
in which some subset implies the remaining dependencies. Examples like this are
3162 not treated as errors. Note that dependencies appear only in class
3163 declarations, and not in any other part of the language. In particular, the
syntax for instance declarations, class constraints, and types is completely
unchanged.
3168 By including dependencies in a class declaration, we provide a mechanism for
3169 the programmer to specify each multiple parameter class more precisely. The
3170 compiler, on the other hand, is responsible for ensuring that the set of
3171 instances that are in scope at any given point in the program is consistent
3172 with any declared dependencies. For example, the following pair of instance
3173 declarations cannot appear together in the same scope because they violate the
3174 dependency for D, even though either one on its own would be acceptable:
3176 instance D Bool Int where ...
3177 instance D Bool Char where ...
3179 Note also that the following declaration is not allowed, even by itself:
3181 instance D [a] b where ...
3183 The problem here is that this instance would allow one particular choice of [a]
3184 to be associated with more than one choice for b, which contradicts the
3185 dependency specified in the definition of D. More generally, this means that,
3186 in any instance of the form:
3188 instance D t s where ...
3190 for some particular types t and s, the only variables that can appear in s are
3191 the ones that appear in t, and hence, if the type t is known, then s will be
3192 uniquely determined.
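The rule can be tried out with a small sketch. The method <literal>conv</literal> and the instance bodies below are hypothetical, chosen only to make the class <literal>D</literal> runnable; the commented-out instance is the conflicting one discussed above.

<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

-- a -> b: the first parameter determines the second.
class D a b | a -> b where
  conv :: a -> b

instance D Bool Int where
  conv b = if b then 1 else 0

-- Allowed: the second parameter mentions no variable that is
-- missing from the first parameter [a].
instance D [a] Int where
  conv = length

-- Rejected if uncommented: it conflicts with D Bool Int.
-- instance D Bool Char where conv _ = 'x'

main :: IO ()
main = print (conv True + conv "abc")
</programlisting>

No annotation is needed on the result: the dependency lets the compiler conclude that <literal>conv True</literal> has type <literal>Int</literal>, so the program should print <literal>4</literal>.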
3195 The benefit of including dependency information is that it allows us to define
3196 more general multiple parameter classes, without ambiguity problems, and with
3197 the benefit of more accurate types. To illustrate this, we return to the
3198 collection class example, and annotate the original definition of <literal>Collects</literal>
3199 with a simple dependency:
3201 class Collects e ce | ce -> e where
3203 insert :: e -> ce -> ce
3204 member :: e -> ce -> Bool
3206 The dependency <literal>ce -> e</literal> here specifies that the type e of elements is uniquely
3207 determined by the type of the collection ce. Note that both parameters of
3208 Collects are of kind *; there are no constructor classes here. Note too that
3209 all of the instances of Collects that we gave earlier can be used
3210 together with this new definition.
3213 What about the ambiguity problems that we encountered with the original
3214 definition? The empty function still has type Collects e ce => ce, but it is no
3215 longer necessary to regard that as an ambiguous type: Although the variable e
3216 does not appear on the right of the => symbol, the dependency for class
3217 Collects tells us that it is uniquely determined by ce, which does appear on
3218 the right of the => symbol. Hence the context in which empty is used can still
3219 give enough information to determine types for both ce and e, without
3220 ambiguity. More generally, we need only regard a type as ambiguous if it
3221 contains a variable on the left of the => that is not uniquely determined
3222 (either directly or indirectly) by the variables on the right.
3225 Dependencies also help to produce more accurate types for user defined
3226 functions, and hence to provide earlier detection of errors, and less cluttered
types for programmers to work with. Recall the previous definition for a
function f:
  f x y = insert x . insert y
3232 for which we originally obtained a type:
3234 f :: (Collects a c, Collects b c) => a -> b -> c -> c
3236 Given the dependency information that we have for Collects, however, we can
deduce that a and b must be equal because they both appear as the first
parameter in a Collects constraint with the same second parameter c. Hence we
3239 can infer a shorter and more accurate type for f:
3241 f :: (Collects a c) => a -> a -> c -> c
3243 In a similar way, the earlier definition of g will now be flagged as a type error.
3246 Although we have given only a few examples here, it should be clear that the
3247 addition of dependency information can help to make multiple parameter classes
3248 more useful in practice, avoiding ambiguity problems, and allowing more general
3249 sets of instance declarations.
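Putting the pieces together, a minimal sketch of the annotated <literal>Collects</literal> class might look as follows. Only the list instance is shown, and its method bodies and the <literal>main</literal> function are assumptions made for the example.

<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances #-}

-- ce -> e: the collection type determines the element type.
class Collects e ce | ce -> e where
  empty  :: ce
  insert :: e -> ce -> ce
  member :: e -> ce -> Bool

instance Eq e => Collects e [e] where
  empty  = []
  insert = (:)
  member = elem

-- The shorter type from the text: both elements have type a.
f :: Collects a c => a -> a -> c -> c
f x y = insert x . insert y

-- g = f True 'a'   -- rejected: Bool and Char cannot both be
--                  -- the element type of one collection

main :: IO ()
main = do
  let c = f 'x' 'y' empty :: String
  print c
  print (member 'x' c)
</programlisting>

Here <literal>empty</literal> is used at type <literal>String</literal>, and the dependency fixes the element type to <literal>Char</literal> without further annotation. The program should print <literal>"xy"</literal> and then <literal>True</literal>.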
3255 <sect2 id="instance-decls">
3256 <title>Instance declarations</title>
3258 <sect3 id="instance-rules">
3259 <title>Relaxed rules for instance declarations</title>
3261 <para>An instance declaration has the form
3263 instance ( <replaceable>assertion</replaceable><subscript>1</subscript>, ..., <replaceable>assertion</replaceable><subscript>n</subscript>) => <replaceable>class</replaceable> <replaceable>type</replaceable><subscript>1</subscript> ... <replaceable>type</replaceable><subscript>m</subscript> where ...
3265 The part before the "<literal>=></literal>" is the
3266 <emphasis>context</emphasis>, while the part after the
3267 "<literal>=></literal>" is the <emphasis>head</emphasis> of the instance declaration.
3271 In Haskell 98 the head of an instance declaration
3272 must be of the form <literal>C (T a1 ... an)</literal>, where
3273 <literal>C</literal> is the class, <literal>T</literal> is a type constructor,
3274 and the <literal>a1 ... an</literal> are distinct type variables.
3275 Furthermore, the assertions in the context of the instance declaration
3276 must be of the form <literal>C a</literal> where <literal>a</literal>
3277 is a type variable that occurs in the head.
3280 The <option>-XFlexibleInstances</option> flag loosens these restrictions
3281 considerably. Firstly, multi-parameter type classes are permitted. Secondly,
3282 the context and head of the instance declaration can each consist of arbitrary
(well-kinded) assertions <literal>(C t1 ... tn)</literal> subject only to the
following rules:
3287 The Paterson Conditions: for each assertion in the context
3289 <listitem><para>No type variable has more occurrences in the assertion than in the head</para></listitem>
3290 <listitem><para>The assertion has fewer constructors and variables (taken together
3291 and counting repetitions) than the head</para></listitem>
3295 <listitem><para>The Coverage Condition. For each functional dependency,
3296 <replaceable>tvs</replaceable><subscript>left</subscript> <literal>-></literal>
3297 <replaceable>tvs</replaceable><subscript>right</subscript>, of the class,
3298 every type variable in
3299 S(<replaceable>tvs</replaceable><subscript>right</subscript>) must appear in
3300 S(<replaceable>tvs</replaceable><subscript>left</subscript>), where S is the
3301 substitution mapping each type variable in the class declaration to the
3302 corresponding type in the instance declaration.
3305 These restrictions ensure that context reduction terminates: each reduction
3306 step makes the problem smaller by at least one
3307 constructor. Both the Paterson Conditions and the Coverage Condition are lifted
3308 if you give the <option>-XUndecidableInstances</option>
3309 flag (<xref linkend="undecidable-instances"/>).
3310 You can find lots of background material about the reason for these
3311 restrictions in the paper <ulink
3312 url="http://research.microsoft.com/%7Esimonpj/papers/fd%2Dchr/">
3313 Understanding functional dependencies via Constraint Handling Rules</ulink>.
3316 For example, these are OK:
3318 instance C Int [a] -- Multiple parameters
3319 instance Eq (S [a]) -- Structured type in head
3321 -- Repeated type variable in head
3322 instance C4 a a => C4 [a] [a]
3323 instance Stateful (ST s) (MutVar s)
3325 -- Head can consist of type variables only
3327 instance (Eq a, Show b) => C2 a b
3329 -- Non-type variables in context
3330 instance Show (s a) => Show (Sized s a)
3331 instance C2 Int a => C3 Bool [a]
3332 instance C2 Int a => C3 [a] b
  -- Not OK: context assertion no smaller than head
  instance C a => C a where ...
  -- Not OK: (C b b) has more occurrences of b than the head
  instance C b b => Foo [b] where ...
3344 The same restrictions apply to instances generated by
3345 <literal>deriving</literal> clauses. Thus the following is accepted:
  data MinHeap h a = H a (h a)
    deriving (Show)
3350 because the derived instance
3352 instance (Show a, Show (h a)) => Show (MinHeap h a)
3354 conforms to the above rules.
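A small program along these lines, with a hypothetical <literal>main</literal> that instantiates <literal>h</literal> to the list type constructor, shows the derived instance in use:

<programlisting>
-- h is a type-constructor variable, so the derived Show
-- instance needs the constraint Show (h a).
data MinHeap h a = H a (h a)
  deriving (Show)

main :: IO ()
main = print (H (1 :: Int) [2, 3])
</programlisting>

This should print <literal>H 1 [2,3]</literal>.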
3358 A useful idiom permitted by the above rules is as follows.
3359 If one allows overlapping instance declarations then it's quite
3360 convenient to have a "default instance" declaration that applies if
3361 something more specific does not:
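A minimal sketch of the idiom follows. The class <literal>Describe</literal> and its instances are hypothetical names introduced for illustration, and the example assumes a GHC that accepts the <option>-XOverlappingInstances</option> flag as described in <xref linkend="instance-overlap"/>.

<programlisting>
{-# LANGUAGE FlexibleInstances, OverlappingInstances #-}

class Describe a where
  describe :: a -> String

-- Default instance: used when no more specific instance matches.
instance Describe a where
  describe _ = "something"

-- More specific instance: preferred whenever the type is Bool.
instance Describe Bool where
  describe b = if b then "yes" else "no"

main :: IO ()
main = do
  putStrLn (describe True)
  putStrLn (describe 'x')
</programlisting>

The first call resolves to the <literal>Bool</literal> instance and the second falls back to the default, so the output should be <literal>yes</literal> followed by <literal>something</literal>.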
3369 <sect3 id="undecidable-instances">
3370 <title>Undecidable instances</title>
3373 Sometimes even the rules of <xref linkend="instance-rules"/> are too onerous.
3374 For example, sometimes you might want to use the following to get the
3375 effect of a "class synonym":
3377 class (C1 a, C2 a, C3 a) => C a where { }
3379 instance (C1 a, C2 a, C3 a) => C a where { }
This allows you to write shorter signatures such as
<literal>f :: (C a) => ...</literal> instead of
  f :: (C1 a, C2 a, C3 a) => ...
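As a concrete sketch of the "class synonym" idiom, the following uses two standard classes in place of <literal>C1</literal>, <literal>C2</literal> and <literal>C3</literal>. The names <literal>ShowOrd</literal> and <literal>largest</literal> are made up for the example.

<programlisting>
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}

-- ShowOrd stands for the pair of constraints (Show a, Ord a).
class (Show a, Ord a) => ShowOrd a
instance (Show a, Ord a) => ShowOrd a

-- One constraint in the signature instead of two.
largest :: ShowOrd a => [a] -> String
largest = show . maximum

main :: IO ()
main = putStrLn (largest [3, 1, 2 :: Int])
</programlisting>

The instance context is no smaller than the head, which is why the sketch turns on <option>-XUndecidableInstances</option>. It should print <literal>3</literal>.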
3389 The restrictions on functional dependencies (<xref
3390 linkend="functional-dependencies"/>) are particularly troublesome.
3391 It is tempting to introduce type variables in the context that do not appear in
3392 the head, something that is excluded by the normal rules. For example:
  class HasConverter a b | a -> b where
     convert :: a -> b
3397 data Foo a = MkFoo a
3399 instance (HasConverter a b,Show b) => Show (Foo a) where
3400 show (MkFoo value) = show (convert value)
This is dangerous territory, however. Here, for example, is a program that would make the
typechecker loop:
  class D a
  class F a b | a->b
  instance F [a] [[a]]
  instance (D c, F a c) => D [a]   -- 'c' is not mentioned in the head
3410 Similarly, it can be tempting to lift the coverage condition:
3412 class Mul a b c | a b -> c where
3413 (.*.) :: a -> b -> c
3415 instance Mul Int Int Int where (.*.) = (*)
3416 instance Mul Int Float Float where x .*. y = fromIntegral x * y
3417 instance Mul a b c => Mul a [b] [c] where x .*. v = map (x.*.) v
3419 The third instance declaration does not obey the coverage condition;
3420 and indeed the (somewhat strange) definition:
3422 f = \ b x y -> if b then x .*. [y] else y
3424 makes instance inference go into a loop, because it requires the constraint
3425 <literal>(Mul a [b] b)</literal>.
3428 Nevertheless, GHC allows you to experiment with more liberal rules. If you use
3429 the experimental flag <option>-XUndecidableInstances</option>
3430 <indexterm><primary>-XUndecidableInstances</primary></indexterm>,
3431 both the Paterson Conditions and the Coverage Condition
3432 (described in <xref linkend="instance-rules"/>) are lifted. Termination is ensured by having a
3433 fixed-depth recursion stack. If you exceed the stack depth you get a
3434 sort of backtrace, and the opportunity to increase the stack depth
3435 with <option>-fcontext-stack=</option><emphasis>N</emphasis>.
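For well-behaved uses, the liberal rules work as hoped. The sketch below repeats the <literal>Mul</literal> class from above with the flag enabled; the <literal>main</literal> function is an assumption added to give the instances something to do.

<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}

class Mul a b c | a b -> c where
  (.*.) :: a -> b -> c

instance Mul Int Int Int where (.*.) = (*)
instance Mul Int Float Float where x .*. y = fromIntegral x * y

-- Fails the Coverage Condition: c does not appear in a or [b].
-- Accepted only because UndecidableInstances is on.
instance Mul a b c => Mul a [b] [c] where x .*. v = map (x .*.) v

main :: IO ()
main = print ((2 :: Int) .*. [1.5, 2.5 :: Float])
</programlisting>

The result type is worked out from the dependency alone, and the program should print <literal>[3.0,5.0]</literal>.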
3441 <sect3 id="instance-overlap">
3442 <title>Overlapping instances</title>
In general, <emphasis>GHC requires that it be unambiguous which instance
3446 should be used to resolve a type-class constraint</emphasis>. This behaviour
3447 can be modified by two flags: <option>-XOverlappingInstances</option>
3448 <indexterm><primary>-XOverlappingInstances
3449 </primary></indexterm>
3450 and <option>-XIncoherentInstances</option>
3451 <indexterm><primary>-XIncoherentInstances
3452 </primary></indexterm>, as this section discusses. Both these
3453 flags are dynamic flags, and can be set on a per-module basis, using
3454 an <literal>OPTIONS_GHC</literal> pragma if desired (<xref linkend="source-file-options"/>).</para>
3456 When GHC tries to resolve, say, the constraint <literal>C Int Bool</literal>,
it tries to match every instance declaration against the constraint,
by instantiating the head of the instance declaration. For example, consider
these declarations:
3462 instance context1 => C Int a where ... -- (A)
3463 instance context2 => C a Bool where ... -- (B)
3464 instance context3 => C Int [a] where ... -- (C)
3465 instance context4 => C Int [Int] where ... -- (D)
3467 The instances (A) and (B) match the constraint <literal>C Int Bool</literal>,
3468 but (C) and (D) do not. When matching, GHC takes
3469 no account of the context of the instance declaration
3470 (<literal>context1</literal> etc).
3471 GHC's default behaviour is that <emphasis>exactly one instance must match the
3472 constraint it is trying to resolve</emphasis>.
3473 It is fine for there to be a <emphasis>potential</emphasis> of overlap (by
3474 including both declarations (A) and (B), say); an error is only reported if a
3475 particular constraint matches more than one.
3479 The <option>-XOverlappingInstances</option> flag instructs GHC to allow
3480 more than one instance to match, provided there is a most specific one. For
3481 example, the constraint <literal>C Int [Int]</literal> matches instances (A),
3482 (C) and (D), but the last is more specific, and hence is chosen. If there is no
3483 most-specific match, the program is rejected.
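The selection rule can be observed with a sketch that gives each of the four instances a body. The method <literal>which</literal> and the empty contexts are assumptions made for the example; the instance heads are those labelled (A) to (D) above.

<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances, OverlappingInstances #-}

class C a b where
  which :: a -> b -> String

instance C Int a     where which _ _ = "A"
instance C a Bool    where which _ _ = "B"
instance C Int [a]   where which _ _ = "C"
instance C Int [Int] where which _ _ = "D"

main :: IO ()
main = do
  putStrLn (which (1 :: Int) 'x')        -- only (A) matches
  putStrLn (which (1 :: Int) "hi")       -- (A) and (C) match
  putStrLn (which (1 :: Int) [2 :: Int]) -- (A), (C) and (D) match
</programlisting>

In each call the most specific matching instance is chosen, so the output should be <literal>A</literal>, <literal>C</literal> and <literal>D</literal>. A call at <literal>C Int Bool</literal> would be rejected, because neither (A) nor (B) is more specific than the other.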
However, GHC is conservative about committing to an overlapping instance. For example:
  f :: [b] -> [b]
  f x = ...
3491 Suppose that from the RHS of <literal>f</literal> we get the constraint
3492 <literal>C Int [b]</literal>. But
3493 GHC does not commit to instance (C), because in a particular
call of <literal>f</literal>, <literal>b</literal> might be instantiated
3495 to <literal>Int</literal>, in which case instance (D) would be more specific still.
3496 So GHC rejects the program.
3497 (If you add the flag <option>-XIncoherentInstances</option>,
3498 GHC will instead pick (C), without complaining about
3499 the problem of subsequent instantiations.)
3502 Notice that we gave a type signature to <literal>f</literal>, so GHC had to
3503 <emphasis>check</emphasis> that <literal>f</literal> has the specified type.
3504 Suppose instead we do not give a type signature, asking GHC to <emphasis>infer</emphasis>
3505 it instead. In this case, GHC will refrain from
3506 simplifying the constraint <literal>C Int [b]</literal> (for the same reason
3507 as before) but, rather than rejecting the program, it will infer the type
3509 f :: C Int [b] => [b] -> [b]
3511 That postpones the question of which instance to pick to the
3512 call site for <literal>f</literal>
3513 by which time more is known about the type <literal>b</literal>.
You can write this type signature yourself if you use the
<link linkend="flexible-contexts"><option>-XFlexibleContexts</option></link>
flag.
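A sketch of this postponement is given below. It is a variation on the example, not the original code: the method <literal>tag</literal> is hypothetical, and <literal>f</literal> returns a <literal>String</literal> so that the chosen instance is visible.

<programlisting>
{-# LANGUAGE MultiParamTypeClasses, FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts, OverlappingInstances #-}

class C a b where
  tag :: a -> b -> String

instance C Int [a] where      -- plays the role of (C)
  tag _ _ = "C Int [a]"

instance C Int [Int] where    -- plays the role of (D)
  tag _ _ = "C Int [Int]"

-- Keeping the constraint in the signature defers the choice of
-- instance to each call site, where b is known.
f :: C Int [b] => [b] -> String
f xs = tag (0 :: Int) xs

main :: IO ()
main = do
  putStrLn (f "abc")
  putStrLn (f [1 :: Int, 2])
</programlisting>

The first call should print <literal>C Int [a]</literal> and the second <literal>C Int [Int]</literal>.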
Exactly the same situation can arise in instance declarations themselves. Suppose we have
  class Foo a where
     f :: a -> a
  instance Foo [b] where
     f x = ...
3526 and, as before, the constraint <literal>C Int [b]</literal> arises from <literal>f</literal>'s
3527 right hand side. GHC will reject the instance, complaining as before that it does not know how to resolve
3528 the constraint <literal>C Int [b]</literal>, because it matches more than one instance
3529 declaration. The solution is to postpone the choice by adding the constraint to the context
3530 of the instance declaration, thus:
  instance C Int [b] => Foo [b] where
     f x = ...
3535 (You need <link linkend="instance-rules"><option>-XFlexibleInstances</option></link> to do this.)
3538 The willingness to be overlapped or incoherent is a property of
3539 the <emphasis>instance declaration</emphasis> itself, controlled by the
3540 presence or otherwise of the <option>-XOverlappingInstances</option>
and <option>-XIncoherentInstances</option> flags when the module containing the
instance declaration is compiled. Neither flag is required in a module that imports and uses the
3543 instance declaration. Specifically, during the lookup process:
3546 An instance declaration is ignored during the lookup process if (a) a more specific
3547 match is found, and (b) the instance declaration was compiled with
3548 <option>-XOverlappingInstances</option>. The flag setting for the
3549 more-specific instance does not matter.
3552 Suppose an instance declaration does not match the constraint being looked up, but
3553 does unify with it, so that it might match when the constraint is further
3554 instantiated. Usually GHC will regard this as a reason for not committing to
some other instance. But if the instance declaration was compiled with
3556 <option>-XIncoherentInstances</option>, GHC will skip the "does-it-unify?"
3557 check for that declaration.
3560 These rules make it possible for a library author to design a library that relies on
3561 overlapping instances without the library client having to know.
3564 If an instance declaration is compiled without
3565 <option>-XOverlappingInstances</option>,
3566 then that instance can never be overlapped. This could perhaps be
3567 inconvenient. Perhaps the rule should instead say that the
3568 <emphasis>overlapping</emphasis> instance declaration should be compiled in
3569 this way, rather than the <emphasis>overlapped</emphasis> one. Perhaps overlap
3570 at a usage site should be permitted regardless of how the instance declarations
3571 are compiled, if the <option>-XOverlappingInstances</option> flag is
3572 used at the usage site. (Mind you, the exact usage site can occasionally be
3573 hard to pin down.) We are interested to receive feedback on these points.
3575 <para>The <option>-XIncoherentInstances</option> flag implies the
3576 <option>-XOverlappingInstances</option> flag, but not vice versa.
3581 <title>Type synonyms in the instance head</title>
3584 <emphasis>Unlike Haskell 98, instance heads may use type
3585 synonyms</emphasis>. (The instance "head" is the bit after the "=>" in an instance decl.)
3586 As always, using a type synonym is just shorthand for
3587 writing the RHS of the type synonym definition. For example:
3591 type Point = (Int,Int)
3592 instance C Point where ...
3593 instance C [Point] where ...
3597 is legal. However, if you added
3601 instance C (Int,Int) where ...
3605 as well, then the compiler will complain about the overlapping
3606 (actually, identical) instance declarations. As always, type synonyms
3607 must be fully applied. You cannot, for example, write:
  type P a = [[a]]
  instance Monad P where ...
3616 This design decision is independent of all the others, and easily
3617 reversed, but it makes sense to me.
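A runnable sketch of the legal case follows. The class method <literal>name</literal> and its bodies are assumptions; the example also assumes the <option>-XTypeSynonymInstances</option> and <option>-XFlexibleInstances</option> flags are what enable these instance heads.

<programlisting>
{-# LANGUAGE TypeSynonymInstances, FlexibleInstances #-}

type Point = (Int, Int)

class C a where
  name :: a -> String

-- Same as writing: instance C (Int, Int)
instance C Point where
  name _ = "a point"

-- Same as writing: instance C [(Int, Int)]
instance C [Point] where
  name _ = "a list of points"

main :: IO ()
main = do
  putStrLn (name ((1, 2) :: Point))
  putStrLn (name [(0, 0), (1, 2) :: Point])
</programlisting>

The annotations are needed only to fix the numeric literals to <literal>Int</literal>.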
3625 <sect2 id="overloaded-strings">
3626 <title>Overloaded string literals
3630 GHC supports <emphasis>overloaded string literals</emphasis>. Normally a
3631 string literal has type <literal>String</literal>, but with overloaded string
3632 literals enabled (with <literal>-XOverloadedStrings</literal>)
3633 a string literal has type <literal>(IsString a) => a</literal>.
This means that the usual string syntax can be used, e.g., for packed strings
and other variations of string-like types. String literals behave very much
like integer literals, i.e., they can be used in both expressions and patterns.
If used in a pattern, the literal will be replaced by an equality test, in the same
way as an integer literal is.
3643 The class <literal>IsString</literal> is defined as:
3645 class IsString a where
3646 fromString :: String -> a
3648 The only predefined instance is the obvious one to make strings work as usual:
  instance IsString [Char] where
    fromString cs = cs
3653 The class <literal>IsString</literal> is not in scope by default. If you want to mention
3654 it explicitly (for example, to give an instance declaration for it), you can import it
3655 from module <literal>GHC.Exts</literal>.
3658 Haskell's defaulting mechanism is extended to cover string literals, when <option>-XOverloadedStrings</option> is specified.
3662 Each type in a default declaration must be an
3663 instance of <literal>Num</literal> <emphasis>or</emphasis> of <literal>IsString</literal>.
3667 The standard defaulting rule (<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.3.4">Haskell Report, Section 4.3.4</ulink>)
3668 is extended thus: defaulting applies when all the unresolved constraints involve standard classes
3669 <emphasis>or</emphasis> <literal>IsString</literal>; and at least one is a numeric class
3670 <emphasis>or</emphasis> <literal>IsString</literal>.
{-# LANGUAGE OverloadedStrings #-}

import GHC.Exts( IsString(..) )

newtype MyString = MyString String deriving (Eq, Show)
instance IsString MyString where
    fromString = MyString

greet :: MyString -> MyString
greet "hello" = "world"   -- the pattern is tested with (==)
greet other = other

main = do
    print $ greet "hello"
    print $ greet "fool"
3695 Note that deriving <literal>Eq</literal> is necessary for the pattern matching
3696 to work since it gets translated into an equality comparison.
3702 <sect1 id="other-type-extensions">
3703 <title>Other type system extensions</title>
3705 <sect2 id="type-restrictions">
3706 <title>Type signatures</title>
3708 <sect3 id="flexible-contexts"><title>The context of a type signature</title>
3710 The <option>-XFlexibleContexts</option> flag lifts the Haskell 98 restriction
3711 that the type-class constraints in a type signature must have the
3712 form <emphasis>(class type-variable)</emphasis> or
3713 <emphasis>(class (type-variable type-variable ...))</emphasis>.
3714 With <option>-XFlexibleContexts</option>
3715 these type signatures are perfectly OK
3718 g :: Ord (T a ()) => ...
GHC imposes the following restrictions on the constraints in a type signature.
Consider the type:
  forall tv1..tvn. (c1, ...,cn) => type
3729 (Here, we write the "foralls" explicitly, although the Haskell source
3730 language omits them; in Haskell 98, all the free type variables of an
3731 explicit source-language type signature are universally quantified,
3732 except for the class type variables in a class declaration. However,
3733 in GHC, you can give the foralls if you want. See <xref linkend="universal-quantification"/>).
3742 <emphasis>Each universally quantified type variable
3743 <literal>tvi</literal> must be reachable from <literal>type</literal></emphasis>.
3745 A type variable <literal>a</literal> is "reachable" if it appears
3746 in the same constraint as either a type variable free in
3747 <literal>type</literal>, or another reachable type variable.
3748 A value with a type that does not obey
3749 this reachability restriction cannot be used without introducing
3750 ambiguity; that is why the type is rejected.
3751 Here, for example, is an illegal type:
3755 forall a. Eq a => Int
3759 When a value with this type was used, the constraint <literal>Eq tv</literal>
3760 would be introduced where <literal>tv</literal> is a fresh type variable, and
3761 (in the dictionary-translation implementation) the value would be
3762 applied to a dictionary for <literal>Eq tv</literal>. The difficulty is that we
3763 can never know which instance of <literal>Eq</literal> to use because we never
3764 get any more information about <literal>tv</literal>.
Note that the reachability condition is weaker than saying that <literal>a</literal> is
3769 functionally dependent on a type variable free in
3770 <literal>type</literal> (see <xref
3771 linkend="functional-dependencies"/>). The reason for this is there
3772 might be a "hidden" dependency, in a superclass perhaps. So
3773 "reachable" is a conservative approximation to "functionally dependent".
3774 For example, consider:
3776 class C a b | a -> b where ...
3777 class C a b => D a b where ...
3778 f :: forall a b. D a b => a -> a
3780 This is fine, because in fact <literal>a</literal> does functionally determine <literal>b</literal>
3781 but that is not immediately apparent from <literal>f</literal>'s type.
3787 <emphasis>Every constraint <literal>ci</literal> must mention at least one of the
3788 universally quantified type variables <literal>tvi</literal></emphasis>.
For example, this type is OK because <literal>C a b</literal> mentions the
universally quantified type variable <literal>a</literal>:
3795 forall a. C a b => burble
3799 The next type is illegal because the constraint <literal>Eq b</literal> does not
3800 mention <literal>a</literal>:
3804 forall a. Eq b => burble
3808 The reason for this restriction is milder than the other one. The
3809 excluded types are never useful or necessary (because the offending
3810 context doesn't need to be witnessed at this point; it can be floated
3811 out). Furthermore, floating them out increases sharing. Lastly,
3812 excluding them is a conservative choice; it leaves a patch of
3813 territory free in case we need it later.
3827 <sect2 id="implicit-parameters">
3828 <title>Implicit parameters</title>
3830 <para> Implicit parameters are implemented as described in
3831 "Implicit parameters: dynamic scoping with static types",
3832 J Lewis, MB Shields, E Meijer, J Launchbury,
3833 27th ACM Symposium on Principles of Programming Languages (POPL'00),
3837 <para>(Most of the following, still rather incomplete, documentation is
3838 due to Jeff Lewis.)</para>
3840 <para>Implicit parameter support is enabled with the option
3841 <option>-XImplicitParams</option>.</para>
3844 A variable is called <emphasis>dynamically bound</emphasis> when it is bound by the calling
3845 context of a function and <emphasis>statically bound</emphasis> when bound by the callee's
3846 context. In Haskell, all variables are statically bound. Dynamic
3847 binding of variables is a notion that goes back to Lisp, but was later
3848 discarded in more modern incarnations, such as Scheme. Dynamic binding
3849 can be very confusing in an untyped language, and unfortunately, typed
3850 languages, in particular Hindley-Milner typed languages like Haskell,
3851 only support static scoping of variables.
3854 However, by a simple extension to the type class system of Haskell, we
3855 can support dynamic binding. Basically, we express the use of a
3856 dynamically bound variable as a constraint on the type. These
3857 constraints lead to types of the form <literal>(?x::t') => t</literal>, which says "this
3858 function uses a dynamically-bound variable <literal>?x</literal>
3859 of type <literal>t'</literal>". For
3860 example, the following expresses the type of a sort function,
3861 implicitly parameterized by a comparison function named <literal>cmp</literal>.
3863 sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]
3865 The dynamic binding constraints are just a new form of predicate in the type class system.
3868 An implicit parameter occurs in an expression using the special form <literal>?x</literal>,
3869 where <literal>x</literal> is
3870 any valid identifier (e.g. <literal>ord ?x</literal> is a valid expression).
3871 Use of this construct also introduces a new
3872 dynamic-binding constraint in the type of the expression.
3873 For example, the following definition
3874 shows how we can define an implicitly parameterized sort function in
3875 terms of an explicitly parameterized <literal>sortBy</literal> function:
3877 sortBy :: (a -> a -> Bool) -> [a] -> [a]
3879 sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]
3885 <title>Implicit-parameter type constraints</title>
3887 Dynamic binding constraints behave just like other type class
3888 constraints in that they are automatically propagated. Thus, when a
3889 function is used, its implicit parameters are inherited by the
3890 function that called it. For example, our <literal>sort</literal> function might be used
3891 to pick out the least value in a list:
3893 least :: (?cmp :: a -> a -> Bool) => [a] -> a
3894 least xs = head (sort xs)
3896 Without lifting a finger, the <literal>?cmp</literal> parameter is
3897 propagated to become a parameter of <literal>least</literal> as well. With explicit
parameters, the default is that parameters must always be explicitly
propagated. With implicit parameters, the default is to always propagate them.
3903 An implicit-parameter type constraint differs from other type class constraints in the
3904 following way: All uses of a particular implicit parameter must have
3905 the same type. This means that the type of <literal>(?x, ?x)</literal>
3906 is <literal>(?x::a) => (a,a)</literal>, and not
<literal>(?x::a, ?x::b) => (a, b)</literal>, as would be the case for type class constraints.
3911 <para> You can't have an implicit parameter in the context of a class or instance
3912 declaration. For example, both these declarations are illegal:
3914 class (?x::Int) => C a where ...
3915 instance (?x::a) => Foo [a] where ...
Reason: exactly which implicit parameter you pick up depends on exactly where
you invoke a function. But the “invocation” of instance declarations is done
behind the scenes by the compiler, so it is hard to tell exactly where it happens.
The easiest thing is to outlaw the offending types.</para>
3922 Implicit-parameter constraints do not cause ambiguity. For example, consider:
3924 f :: (?x :: [a]) => Int -> Int
3927 g :: (Read a, Show a) => String -> String
3930 Here, <literal>g</literal> has an ambiguous type, and is rejected, but <literal>f</literal>
3931 is fine. The binding for <literal>?x</literal> at <literal>f</literal>'s call site is
3932 quite unambiguous, and fixes the type <literal>a</literal>.
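The point about ambiguity can be sketched as a small module. The names <literal>viaString</literal> and <literal>viaBools</literal> are invented for this illustration, and the body of <literal>f</literal> is only an assumed example of a function that uses <literal>?x</literal>:

```haskell
{-# LANGUAGE ImplicitParams #-}

-- 'a' occurs only in the implicit-parameter constraint, yet the signature
-- is accepted: whoever binds ?x also fixes 'a'.
f :: (?x :: [a]) => Int -> Int
f n = n + length ?x

-- Two call sites, each binding ?x at a different element type.
viaString :: Int
viaString = let ?x = "abc" in f 1

viaBools :: Int
viaBools = let ?x = [True, False] in f 1

-- By contrast, a class-constrained signature such as
--   g :: (Read a, Show a) => String -> String
--   g s = show (read s)
-- is rejected, because no call site can ever determine 'a'.
```

Each binding of <literal>?x</literal> chooses its own element type, which is why no ambiguity arises for <literal>f</literal>.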
3937 <title>Implicit-parameter bindings</title>
3940 An implicit parameter is <emphasis>bound</emphasis> using the standard
3941 <literal>let</literal> or <literal>where</literal> binding forms.
3942 For example, we define the <literal>min</literal> function by binding
3943 <literal>cmp</literal>.
3946 min = let ?cmp = (<=) in least
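Putting the pieces so far together gives a minimal sketch like the following. The names <literal>sortBy'</literal>, <literal>sort'</literal>, <literal>smallest</literal> and <literal>largest</literal> are hypothetical, chosen to avoid clashing with the Prelude and with <literal>Data.List</literal>; the insertion sort merely stands in for whatever explicitly parameterised sort you have to hand:

```haskell
{-# LANGUAGE ImplicitParams #-}

-- Hypothetical explicitly parameterised sort (an insertion sort). It takes a
-- Bool-valued "less than or equal" test, unlike Data.List.sortBy.
sortBy' :: (a -> a -> Bool) -> [a] -> [a]
sortBy' le = foldr insert []
  where
    insert x [] = [x]
    insert x (y:ys)
      | le x y    = x : y : ys
      | otherwise = y : insert x ys

-- The comparison is picked up from the implicit parameter ?cmp.
sort' :: (?cmp :: a -> a -> Bool) => [a] -> [a]
sort' = sortBy' ?cmp

-- ?cmp is propagated: least needs it only because sort' does.
least :: (?cmp :: a -> a -> Bool) => [a] -> a
least xs = head (sort' xs)

-- Binding ?cmp discharges the constraint, so it no longer appears in the type.
smallest :: Ord a => [a] -> a
smallest = let ?cmp = (<=) in least

-- A different binding at a different call site changes the behaviour.
largest :: Ord a => [a] -> a
largest = let ?cmp = (>=) in least
```

Note that <literal>least</literal> is written once and reused with two different comparisons, purely by changing the binding in force where it is called.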
3950 A group of implicit-parameter bindings may occur anywhere a normal group of Haskell
3951 bindings can occur, except at top level. That is, they can occur in a <literal>let</literal>
3952 (including in a list comprehension, or do-notation, or pattern guards),
3953 or a <literal>where</literal> clause.
3954 Note the following points:
3957 An implicit-parameter binding group must be a
3958 collection of simple bindings to implicit-style variables (no
3959 function-style bindings, and no type signatures); these bindings are
neither polymorphic nor recursive.
3963 You may not mix implicit-parameter bindings with ordinary bindings in a
3964 single <literal>let</literal>
3965 expression; use two nested <literal>let</literal>s instead.
3966 (In the case of <literal>where</literal> you are stuck, since you can't nest <literal>where</literal> clauses.)
3970 You may put multiple implicit-parameter bindings in a
3971 single binding group; but they are <emphasis>not</emphasis> treated
3972 as a mutually recursive group (as ordinary <literal>let</literal> bindings are).
Instead they are treated as a non-recursive group, simultaneously binding all the implicit
parameters. The bindings are not nested, and may be re-ordered without changing
3975 the meaning of the program.
3976 For example, consider:
3978 f t = let { ?x = t; ?y = ?x+(1::Int) } in ?x + ?y
3980 The use of <literal>?x</literal> in the binding for <literal>?y</literal> does not "see"
3981 the binding for <literal>?x</literal>, so the type of <literal>f</literal> is
3983 f :: (?x::Int) => Int -> Int
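To see the non-recursive reading in action, here is a small sketch. The reordered variant <literal>f'</literal> and the name <literal>result</literal> are additions made for this illustration:

```haskell
{-# LANGUAGE ImplicitParams #-}

-- The ?x used on the right-hand side of ?y is the *outer* ?x, so f still
-- needs (?x :: Int) from its caller.
f :: (?x :: Int) => Int -> Int
f t = let { ?x = t; ?y = ?x + (1 :: Int) } in ?x + ?y

-- The same group with the bindings swapped; the meaning is unchanged.
f' :: (?x :: Int) => Int -> Int
f' t = let { ?y = ?x + (1 :: Int); ?x = t } in ?x + ?y

-- In the body, ?x is 1 (the inner binding) and ?y is 10 + 1 (outer ?x),
-- so the result is 12.
result :: Int
result = let ?x = 10 in f 1
```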
3991 <sect3><title>Implicit parameters and polymorphic recursion</title>
3994 Consider these two definitions:
3997 len1 xs = let ?acc = 0 in len_acc1 xs
4000 len_acc1 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc1 xs
4005 len2 xs = let ?acc = 0 in len_acc2 xs
4007 len_acc2 :: (?acc :: Int) => [a] -> Int
4009 len_acc2 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc2 xs
4011 The only difference between the two groups is that in the second group
<literal>len_acc2</literal> is given a type signature.
4013 In the former case, <literal>len_acc1</literal> is monomorphic in its own
4014 right-hand side, so the implicit parameter <literal>?acc</literal> is not
4015 passed to the recursive call. In the latter case, because <literal>len_acc2</literal>
4016 has a type signature, the recursive call is made to the
4017 <emphasis>polymorphic</emphasis> version, which takes <literal>?acc</literal>
4018 as an implicit parameter. So we get the following results in GHCi:
4025 Adding a type signature dramatically changes the result! This is a rather
4026 counter-intuitive phenomenon, worth watching out for.
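For experimentation, the two groups can be written out in full as below. This is a sketch: the base-case equations and the signatures on <literal>len1</literal> and <literal>len2</literal> are filled in here as assumptions about the intended program.

```haskell
{-# LANGUAGE ImplicitParams #-}

len1 :: [a] -> Int
len1 xs = let ?acc = 0 in len_acc1 xs

-- No type signature: inside its own definition len_acc1 is monomorphic, so
-- the recursive call keeps using the ?acc that was passed in originally.
len_acc1 [] = ?acc
len_acc1 (_:xs) = let ?acc = ?acc + (1 :: Int) in len_acc1 xs

len2 :: [a] -> Int
len2 xs = let ?acc = 0 in len_acc2 xs

-- With a signature, the recursive call goes to the polymorphic version and
-- picks up the freshly bound ?acc.
len_acc2 :: (?acc :: Int) => [a] -> Int
len_acc2 [] = ?acc
len_acc2 (_:xs) = let ?acc = ?acc + (1 :: Int) in len_acc2 xs
```

Loading this into GHCi and evaluating <literal>len1 "hello"</literal> and <literal>len2 "hello"</literal> gives 0 and 5 respectively.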
4030 <sect3><title>Implicit parameters and monomorphism</title>
4032 <para>GHC applies the dreaded Monomorphism Restriction (section 4.5.5 of the
4033 Haskell Report) to implicit parameters. For example, consider:
4041 Since the binding for <literal>y</literal> falls under the Monomorphism
4042 Restriction it is not generalised, so the type of <literal>y</literal> is
4043 simply <literal>Int</literal>, not <literal>(?x::Int) => Int</literal>.
4044 Hence, <literal>(f 9)</literal> returns result <literal>9</literal>.
4045 If you add a type signature for <literal>y</literal>, then <literal>y</literal>
4046 will get type <literal>(?x::Int) => Int</literal>, so the occurrence of
4047 <literal>y</literal> in the body of the <literal>let</literal> will see the
4048 inner binding of <literal>?x</literal>, so <literal>(f 9)</literal> will return
4049 <literal>14</literal>.
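Both variants can be placed side by side, as in the following sketch. The second function is named <literal>g</literal> here only so that the two can live in one module; it is <literal>f</literal> with a signature added for <literal>y</literal>.

```haskell
{-# LANGUAGE ImplicitParams #-}

-- y has no signature, so the Monomorphism Restriction applies: y :: Int,
-- and its ?x is resolved where y is defined (?x = 0).
f :: Int -> Int
f v = let ?x = 0 in
      let y = ?x + v in
      let ?x = 5 in
      y

-- y has a signature mentioning ?x, so ?x is resolved where y is used
-- (?x = 5).
g :: Int -> Int
g v = let ?x = 0 in
      let y :: (?x :: Int) => Int
          y = ?x + v
      in
      let ?x = 5 in
      y
```

Here <literal>f 9</literal> is <literal>9</literal> and <literal>g 9</literal> is <literal>14</literal>.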
4054 <!-- ======================= COMMENTED OUT ========================
4056 We intend to remove linear implicit parameters, so I'm at least removing
4057 them from the 6.6 user manual
4059 <sect2 id="linear-implicit-parameters">
4060 <title>Linear implicit parameters</title>
4062 Linear implicit parameters are an idea developed by Koen Claessen,
4063 Mark Shields, and Simon PJ. They address the long-standing
4064 problem that monads seem over-kill for certain sorts of problem, notably:
4067 <listitem> <para> distributing a supply of unique names </para> </listitem>
4068 <listitem> <para> distributing a supply of random numbers </para> </listitem>
4069 <listitem> <para> distributing an oracle (as in QuickCheck) </para> </listitem>
4073 Linear implicit parameters are just like ordinary implicit parameters,
4074 except that they are "linear"; that is, they cannot be copied, and
4075 must be explicitly "split" instead. Linear implicit parameters are
4076 written '<literal>%x</literal>' instead of '<literal>?x</literal>'.
4077 (The '/' in the '%' suggests the split!)
4082 import GHC.Exts( Splittable )
4084 data NameSupply = ...
4086 splitNS :: NameSupply -> (NameSupply, NameSupply)
4087 newName :: NameSupply -> Name
4089 instance Splittable NameSupply where
4093 f :: (%ns :: NameSupply) => Env -> Expr -> Expr
4094 f env (Lam x e) = Lam x' (f env e)
4097 env' = extend env x x'
4098 ...more equations for f...
4100 Notice that the implicit parameter %ns is consumed
4102 <listitem> <para> once by the call to <literal>newName</literal> </para> </listitem>
4103 <listitem> <para> once by the recursive call to <literal>f</literal> </para></listitem>
4107 So the translation done by the type checker makes
4108 the parameter explicit:
4110 f :: NameSupply -> Env -> Expr -> Expr
4111 f ns env (Lam x e) = Lam x' (f ns1 env e)
4113 (ns1,ns2) = splitNS ns
4115 env = extend env x x'
4117 Notice the call to 'split' introduced by the type checker.
4118 How did it know to use 'splitNS'? Because what it really did
4119 was to introduce a call to the overloaded function 'split',
4120 defined by the class <literal>Splittable</literal>:
4122 class Splittable a where
4125 The instance for <literal>Splittable NameSupply</literal> tells GHC how to implement
4126 split for name supplies. But we can simply write
4132 g :: (Splittable a, %ns :: a) => b -> (b,a,a)
4134 The <literal>Splittable</literal> class is built into GHC. It's exported by module
4135 <literal>GHC.Exts</literal>.
4140 <listitem> <para> '<literal>?x</literal>' and '<literal>%x</literal>'
4141 are entirely distinct implicit parameters: you
4142 can use them together and they won't interfere with each other. </para>
4145 <listitem> <para> You can bind linear implicit parameters in 'with' clauses. </para> </listitem>
4147 <listitem> <para>You cannot have implicit parameters (whether linear or not)
4148 in the context of a class or instance declaration. </para></listitem>
4152 <sect3><title>Warnings</title>
4155 The monomorphism restriction is even more important than usual.
4156 Consider the example above:
4158 f :: (%ns :: NameSupply) => Env -> Expr -> Expr
4159 f env (Lam x e) = Lam x' (f env e)
4162 env' = extend env x x'
4164 If we replaced the two occurrences of x' by (newName %ns), which is
4165 usually a harmless thing to do, we get:
4167 f :: (%ns :: NameSupply) => Env -> Expr -> Expr
4168 f env (Lam x e) = Lam (newName %ns) (f env e)
4170 env' = extend env x (newName %ns)
4172 But now the name supply is consumed in <emphasis>three</emphasis> places
4173 (the two calls to newName,and the recursive call to f), so
4174 the result is utterly different. Urk! We don't even have
4178 Well, this is an experimental change. With implicit
4179 parameters we have already lost beta reduction anyway, and
4180 (as John Launchbury puts it) we can't sensibly reason about
4181 Haskell programs without knowing their typing.
4186 <sect3><title>Recursive functions</title>
4187 <para>Linear implicit parameters can be particularly tricky when you have a recursive function
4190 foo :: %x::T => Int -> [Int]
4192 foo n = %x : foo (n-1)
4194 where T is some type in class Splittable.</para>
4196 Do you get a list of all the same T's or all different T's
4197 (assuming that split gives two distinct T's back)?
4199 If you supply the type signature, taking advantage of polymorphic
4200 recursion, you get what you'd probably expect. Here's the
4201 translated term, where the implicit param is made explicit:
4204 foo x n = let (x1,x2) = split x
4205 in x1 : foo x2 (n-1)
4207 But if you don't supply a type signature, GHC uses the Hindley
4208 Milner trick of using a single monomorphic instance of the function
4209 for the recursive calls. That is what makes Hindley Milner type inference
4210 work. So the translation becomes
4214 foom n = x : foom (n-1)
4218 Result: 'x' is not split, and you get a list of identical T's. So the
4219 semantics of the program depends on whether or not foo has a type signature.
4222 You may say that this is a good reason to dislike linear implicit parameters
4223 and you'd be right. That is why they are an experimental feature.
4229 ================ END OF Linear Implicit Parameters commented out -->
4231 <sect2 id="kinding">
4232 <title>Explicitly-kinded quantification</title>
4235 Haskell infers the kind of each type variable. Sometimes it is nice to be able
4236 to give the kind explicitly as (machine-checked) documentation,
4237 just as it is nice to give a type signature for a function. On some occasions,
4238 it is essential to do so. For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999)
4239 John Hughes had to define the data type:
4241 data Set cxt a = Set [a]
4242 | Unused (cxt a -> ())
4244 The only use for the <literal>Unused</literal> constructor was to force the correct
4245 kind for the type variable <literal>cxt</literal>.
4248 GHC now instead allows you to specify the kind of a type variable directly, wherever
4249 a type variable is explicitly bound, with the flag <option>-XKindSignatures</option>.
4252 This flag enables kind signatures in the following places:
4254 <listitem><para><literal>data</literal> declarations:
4256 data Set (cxt :: * -> *) a = Set [a]
4257 </screen></para></listitem>
4258 <listitem><para><literal>type</literal> declarations:
4260 type T (f :: * -> *) = f Int
4261 </screen></para></listitem>
4262 <listitem><para><literal>class</literal> declarations:
4264 class (Eq a) => C (f :: * -> *) a where ...
4265 </screen></para></listitem>
4266 <listitem><para><literal>forall</literal>'s in type signatures:
4268 f :: forall (cxt :: * -> *). Set cxt Int
4269 </screen></para></listitem>
4274 The parentheses are required. Some of the spaces are required too, to
4275 separate the lexemes. If you write <literal>(f::*->*)</literal> you
4276 will get a parse error, because "<literal>::*->*</literal>" is a
4277 single lexeme in Haskell.
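A short sketch of kind signatures in use follows. The helper names <literal>emptySet</literal>, <literal>size</literal> and <literal>justThree</literal> are made up for the example, and <literal>ExplicitForAll</literal> is switched on only so that the <literal>forall</literal> can be written out:

```haskell
{-# LANGUAGE KindSignatures, ExplicitForAll #-}

-- cxt is a phantom parameter; the annotation alone fixes its kind, so no
-- dummy constructor is needed.
data Set (cxt :: * -> *) a = Set [a]

type T (f :: * -> *) = f Int

-- Kind annotations on forall-bound variables need the parentheses and spaces.
emptySet :: forall (cxt :: * -> *) a. Set cxt a
emptySet = Set []

size :: Set cxt a -> Int
size (Set xs) = length xs

-- T Maybe expands to Maybe Int.
justThree :: T Maybe
justThree = Just 3
```

Because <literal>cxt</literal> has kind <literal>* -> *</literal>, a type such as <literal>Set Maybe Int</literal> is well-kinded even though <literal>cxt</literal> is never used in a constructor.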
4281 As part of the same extension, you can put kind annotations in types
4284 f :: (Int :: *) -> Int
4285 g :: forall a. a -> (a :: *)
atype ::= '(' ctype '::' kind ')'
4291 The parentheses are required.
4296 <sect2 id="universal-quantification">
4297 <title>Arbitrary-rank polymorphism
4301 Haskell type signatures are implicitly quantified. The new keyword <literal>forall</literal>
4302 allows us to say exactly what this means. For example:
4310 g :: forall b. (b -> b)
4312 The two are treated identically.
4316 However, GHC's type system supports <emphasis>arbitrary-rank</emphasis>
explicit universal quantification in types.
4319 For example, all the following types are legal:
4321 f1 :: forall a b. a -> b -> a
4322 g1 :: forall a b. (Ord a, Eq b) => a -> b -> a
4324 f2 :: (forall a. a->a) -> Int -> Int
4325 g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int
4327 f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool
4329 f4 :: Int -> (forall a. a -> a)
4331 Here, <literal>f1</literal> and <literal>g1</literal> are rank-1 types, and
4332 can be written in standard Haskell (e.g. <literal>f1 :: a->b->a</literal>).
4333 The <literal>forall</literal> makes explicit the universal quantification that
4334 is implicitly added by Haskell.
4337 The functions <literal>f2</literal> and <literal>g2</literal> have rank-2 types;
4338 the <literal>forall</literal> is on the left of a function arrow. As <literal>g2</literal>
4339 shows, the polymorphic type on the left of the function arrow can be overloaded.
4342 The function <literal>f3</literal> has a rank-3 type;
4343 it has rank-2 types on the left of a function arrow.
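To make these signatures concrete, here is one possible set of definitions for them. The function bodies are invented for illustration; only the types come from the list above.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Rank 2: the argument is itself polymorphic, so it can be used at two types.
f2 :: (forall a. a -> a) -> Int -> Int
f2 i n = i n + length (i "ab")

-- Rank 2 with an overloaded argument.
g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int
g2 member n
  | member [1, 2, 3] n && member "xyz" 'x' = n
  | otherwise                              = 0

-- Rank 3: the argument takes a polymorphic function as its own argument.
f3 :: ((forall a. a -> a) -> Int) -> Bool -> Bool
f3 k b = b && k id > 0

-- A forall to the right of an arrow.
f4 :: Int -> (forall a. a -> a)
f4 _ = id
```

For example, <literal>f2 id 3</literal> uses <literal>id</literal> at both <literal>Int</literal> and <literal>String</literal>, which would not typecheck if <literal>f2</literal> had the rank-1 type <literal>(a -> a) -> Int -> Int</literal>.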
4346 GHC has three flags to control higher-rank types:
4349 <option>-XPolymorphicComponents</option>: data constructors (only) can have polymorphic argument types.
4352 <option>-XRank2Types</option>: any function (including data constructors) can have a rank-2 type.
4355 <option>-XRankNTypes</option>: any function (including data constructors) can have an arbitrary-rank type.
4356 That is, you can nest <literal>forall</literal>s
4357 arbitrarily deep in function arrows.
4358 In particular, a forall-type (also called a "type scheme"),
including an optional type class context, is legal:
4361 <listitem> <para> On the left or right (see <literal>f4</literal>, for example)
4362 of a function arrow </para> </listitem>
4363 <listitem> <para> As the argument of a constructor, or type of a field, in a data type declaration. For
4364 example, any of the <literal>f1,f2,f3,g1,g2</literal> above would be valid
4365 field type signatures.</para> </listitem>
4366 <listitem> <para> As the type of an implicit parameter </para> </listitem>
4367 <listitem> <para> In a pattern type signature (see <xref linkend="scoped-type-variables"/>) </para> </listitem>
4371 Of course <literal>forall</literal> becomes a keyword; you can't use <literal>forall</literal> as
4372 a type variable any more!
4381 In a <literal>data</literal> or <literal>newtype</literal> declaration one can quantify
4382 the types of the constructor arguments. Here are several examples:
4388 data T a = T1 (forall b. b -> b -> b) a
4390 data MonadT m = MkMonad { return :: forall a. a -> m a,
4391 bind :: forall a b. m a -> (a -> m b) -> m b
4394 newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
4400 The constructors have rank-2 types:
4406 T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
4407 MkMonad :: forall m. (forall a. a -> m a)
4408 -> (forall a b. m a -> (a -> m b) -> m b)
4410 MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
4416 Notice that you don't need to use a <literal>forall</literal> if there's an
4417 explicit context. For example in the first argument of the
4418 constructor <function>MkSwizzle</function>, an implicit "<literal>forall a.</literal>" is
4419 prefixed to the argument type. The implicit <literal>forall</literal>
4420 quantifies all type variables that are not already in scope, and are
4421 mentioned in the type quantified over.
4425 As for type signatures, implicit quantification happens for non-overloaded
4426 types too. So if you write this:
4429 data T a = MkT (Either a b) (b -> b)
4432 it's just as if you had written this:
4435 data T a = MkT (forall b. Either a b) (forall b. b -> b)
4438 That is, since the type variable <literal>b</literal> isn't in scope, it's
4439 implicitly universally quantified. (Arguably, it would be better
4440 to <emphasis>require</emphasis> explicit quantification on constructor arguments
4441 where that is what is wanted. Feedback welcomed.)
You construct values of types <literal>T</literal>, <literal>MonadT</literal> and <literal>Swizzle</literal> by applying
4446 the constructor to suitable values, just as usual. For example,
4457 a3 = MkSwizzle reverse
4460 a4 = let r x = Just x
mkTs :: (forall b. b -> b -> b) -> a -> a -> [T a]
4468 mkTs f x y = [T1 f x, T1 f y]
4474 The type of the argument can, as usual, be more general than the type
4475 required, as <literal>(MkSwizzle reverse)</literal> shows. (<function>reverse</function>
4476 does not need the <literal>Ord</literal> constraint.)
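The following self-contained sketch builds values of all three types and then takes them apart again by pattern matching and record selection. Several details are assumptions made for the example: the field is called <literal>ret</literal> rather than <literal>return</literal> to avoid clashing with the Prelude, the <literal>forall</literal> in <literal>MkSwizzle</literal> is written explicitly, and <literal>a1</literal>, <literal>listMonad</literal> and the function bodies are invented.

```haskell
{-# LANGUAGE RankNTypes #-}

data T a = T1 (forall b. b -> b -> b) a

data MonadT m = MkMonad
  { ret  :: forall a. a -> m a
  , bind :: forall a b. m a -> (a -> m b) -> m b
  }

newtype Swizzle = MkSwizzle (forall a. Ord a => [a] -> [a])

-- Construction: each argument must be at least as polymorphic as the field.
a1 :: T Int
a1 = T1 (\x _ -> x) 3

a3 :: Swizzle
a3 = MkSwizzle reverse

listMonad :: MonadT []
listMonad = MkMonad { ret = \x -> [x], bind = \m k -> concatMap k m }

-- Pattern matching: w and s are bound with polymorphic types.
f :: T a -> a -> (a, Char)
f (T1 w k) x = (w k x, w 'c' 'd')

g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
g (MkSwizzle s) xs fn = s (map fn (s xs))

-- Record selectors extract the polymorphic fields.
h :: MonadT m -> [m a] -> m [a]
h m []     = ret m []
h m (x:xs) = bind m x $ \y ->
             bind m (h m xs) $ \ys ->
             ret m (y:ys)
```

In <literal>f</literal>, the variable <literal>w</literal> is used at type <literal>a</literal> and at type <literal>Char</literal> in the same right-hand side, which is only possible because the field is polymorphic.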
4480 When you use pattern matching, the bound variables may now have
4481 polymorphic types. For example:
4487 f :: T a -> a -> (a, Char)
4488 f (T1 w k) x = (w k x, w 'c' 'd')
4490 g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
4491 g (MkSwizzle s) xs f = s (map f (s xs))
4493 h :: MonadT m -> [m a] -> m [a]
4494 h m [] = return m []
4495 h m (x:xs) = bind m x $ \y ->
4496 bind m (h m xs) $ \ys ->
4503 In the function <function>h</function> we use the record selectors <literal>return</literal>
4504 and <literal>bind</literal> to extract the polymorphic bind and return functions
from the <literal>MonadT</literal> data structure, rather than using pattern matching.
4511 <title>Type inference</title>
4514 In general, type inference for arbitrary-rank types is undecidable.
4515 GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96)
4516 to get a decidable algorithm by requiring some help from the programmer.
4517 We do not yet have a formal specification of "some help" but the rule is this:
4520 <emphasis>For a lambda-bound or case-bound variable, x, either the programmer
4521 provides an explicit polymorphic type for x, or GHC's type inference will assume
4522 that x's type has no foralls in it</emphasis>.
4525 What does it mean to "provide" an explicit type for x? You can do that by
4526 giving a type signature for x directly, using a pattern type signature
4527 (<xref linkend="scoped-type-variables"/>), thus:
\ (f :: forall a. a->a) -> (f True, f 'c')
4531 Alternatively, you can give a type signature to the enclosing
4532 context, which GHC can "push down" to find the type for the variable:
4534 (\ f -> (f True, f 'c')) :: (forall a. a->a) -> (Bool,Char)
4536 Here the type signature on the expression can be pushed inwards
4537 to give a type signature for f. Similarly, and more commonly,
4538 one can give a type signature for the function itself:
4540 h :: (forall a. a->a) -> (Bool,Char)
4541 h f = (f True, f 'c')
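The three ways of supplying the annotation can be compared in one module. This is a sketch: the names <literal>viaPattern</literal> and <literal>viaExpression</literal> are invented, and each lambda is applied to <literal>id</literal> simply so that there is a result to inspect.

```haskell
{-# LANGUAGE RankNTypes, ScopedTypeVariables #-}

-- 1. A pattern type signature on the lambda-bound variable.
viaPattern :: (Bool, Char)
viaPattern = (\(f :: forall a. a -> a) -> (f True, f 'c')) id

-- 2. A signature on the enclosing expression, pushed inwards to f.
viaExpression :: (Bool, Char)
viaExpression =
  ((\f -> (f True, f 'c')) :: (forall a. a -> a) -> (Bool, Char)) id

-- 3. A signature on the function itself.
h :: (forall a. a -> a) -> (Bool, Char)
h f = (f True, f 'c')
```

Removing the annotation in any of the three leaves <literal>f</literal> with a type containing no <literal>forall</literal>, and the two uses at <literal>Bool</literal> and <literal>Char</literal> then fail to typecheck.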
4543 You don't need to give a type signature if the lambda bound variable
4544 is a constructor argument. Here is an example we saw earlier:
4546 f :: T a -> a -> (a, Char)
4547 f (T1 w k) x = (w k x, w 'c' 'd')
4549 Here we do not need to give a type signature to <literal>w</literal>, because
4550 it is an argument of constructor <literal>T1</literal> and that tells GHC all
4557 <sect3 id="implicit-quant">
4558 <title>Implicit quantification</title>
4561 GHC performs implicit quantification as follows. <emphasis>At the top level (only) of
4562 user-written types, if and only if there is no explicit <literal>forall</literal>,
4563 GHC finds all the type variables mentioned in the type that are not already
in scope, and universally quantifies them.</emphasis> For example, the following pairs are equivalent:
4568 f :: forall a. a -> a
4575 h :: forall b. a -> b -> b
Notice that GHC does <emphasis>not</emphasis> find the innermost possible quantification point. For example:
4584 f :: (a -> a) -> Int
4586 f :: forall a. (a -> a) -> Int
4588 f :: (forall a. a -> a) -> Int
4591 g :: (Ord a => a -> a) -> Int
4592 -- MEANS the illegal type
4593 g :: forall a. (Ord a => a -> a) -> Int
4595 g :: (forall a. Ord a => a -> a) -> Int
4597 The latter produces an illegal type, which you might think is silly,
4598 but at least the rule is simple. If you want the latter type, you
4599 can write your for-alls explicitly. Indeed, doing so is strongly advised
4606 <sect2 id="impredicative-polymorphism">
4607 <title>Impredicative polymorphism
4609 <para>GHC supports <emphasis>impredicative polymorphism</emphasis>,
4610 enabled with <option>-XImpredicativeTypes</option>.
This means that you can call a polymorphic function at a polymorphic type, and
4613 parameterise data structures over polymorphic types. For example:
4615 f :: Maybe (forall a. [a] -> [a]) -> Maybe ([Int], [Char])
4616 f (Just g) = Just (g [3], g "hello")
4619 Notice here that the <literal>Maybe</literal> type is parameterised by the
<emphasis>polymorphic</emphasis> type <literal>(forall a. [a] -> [a])</literal>.
4623 <para>The technical details of this extension are described in the paper
4624 <ulink url="http://research.microsoft.com/%7Esimonpj/papers/boxy/">Boxy types:
4625 type inference for higher-rank types and impredicativity</ulink>,
4626 which appeared at ICFP 2006.
4630 <sect2 id="scoped-type-variables">
4631 <title>Lexically scoped type variables
4635 GHC supports <emphasis>lexically scoped type variables</emphasis>, without
4636 which some type signatures are simply impossible to write. For example:
4638 f :: forall a. [a] -> [a]
4644 The type signature for <literal>f</literal> brings the type variable <literal>a</literal> into scope; it scopes over
4645 the entire definition of <literal>f</literal>.
4646 In particular, it is in scope at the type signature for <varname>ys</varname>.
4647 In Haskell 98 it is not possible to declare
4648 a type for <varname>ys</varname>; a major benefit of scoped type variables is that
4649 it becomes possible to do so.
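As a complete, compilable sketch of this idea (the body of <literal>f</literal> is chosen arbitrarily for the example):

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

f :: forall a. [a] -> [a]
f xs = ys ++ ys
  where
    ys :: [a]          -- this 'a' is the one bound by f's forall
    ys = reverse xs
```

Without the extension, or without the explicit <literal>forall</literal>, the signature for <literal>ys</literal> would mean <literal>forall a. [a]</literal>, which <literal>reverse xs</literal> does not have.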
4651 <para>Lexically-scoped type variables are enabled by
4652 <option>-XScopedTypeVariables</option>. This flag implies <option>-XRelaxedPolyRec</option>.
4654 <para>Note: GHC 6.6 contains substantial changes to the way that scoped type
variables work, compared to earlier releases. Read this section carefully!
4659 <title>Overview</title>
<para>The design follows these principles:
4663 <listitem><para>A scoped type variable stands for a type <emphasis>variable</emphasis>, and not for
4664 a <emphasis>type</emphasis>. (This is a change from GHC's earlier
4665 design.)</para></listitem>
4666 <listitem><para>Furthermore, distinct lexical type variables stand for distinct
4667 type variables. This means that every programmer-written type signature
4668 (including one that contains free scoped type variables) denotes a
4669 <emphasis>rigid</emphasis> type; that is, the type is fully known to the type
4670 checker, and no inference is involved.</para></listitem>
4671 <listitem><para>Lexical type variables may be alpha-renamed freely, without
4672 changing the program.</para></listitem>
4676 A <emphasis>lexically scoped type variable</emphasis> can be bound by:
4678 <listitem><para>A declaration type signature (<xref linkend="decl-type-sigs"/>)</para></listitem>
4679 <listitem><para>An expression type signature (<xref linkend="exp-type-sigs"/>)</para></listitem>
4680 <listitem><para>A pattern type signature (<xref linkend="pattern-type-sigs"/>)</para></listitem>
4681 <listitem><para>Class and instance declarations (<xref linkend="cls-inst-scoped-tyvars"/>)</para></listitem>
4685 In Haskell, a programmer-written type signature is implicitly quantified over
4686 its free type variables (<ulink
4687 url="http://www.haskell.org/onlinereport/decls.html#sect4.1.2">Section
4689 of the Haskell Report).
Lexically scoped type variables affect this implicit quantification rule
4691 as follows: any type variable that is in scope is <emphasis>not</emphasis> universally
4692 quantified. For example, if type variable <literal>a</literal> is in scope,
4695 (e :: a -> a) means (e :: a -> a)
4696 (e :: b -> b) means (e :: forall b. b->b)
4697 (e :: a -> b) means (e :: forall b. a->b)
4705 <sect3 id="decl-type-sigs">
4706 <title>Declaration type signatures</title>
4707 <para>A declaration type signature that has <emphasis>explicit</emphasis>
4708 quantification (using <literal>forall</literal>) brings into scope the
4709 explicitly-quantified
4710 type variables, in the definition of the named function. For example:
4712 f :: forall a. [a] -> [a]
4713 f (x:xs) = xs ++ [ x :: a ]
4715 The "<literal>forall a</literal>" brings "<literal>a</literal>" into scope in
4716 the definition of "<literal>f</literal>".
4718 <para>This only happens if:
4720 <listitem><para> The quantification in <literal>f</literal>'s type
4721 signature is explicit. For example:
4724 g (x:xs) = xs ++ [ x :: a ]
4726 This program will be rejected, because "<literal>a</literal>" does not scope
over the definition of "<literal>g</literal>", so "<literal>x::a</literal>"
4728 means "<literal>x::forall a. a</literal>" by Haskell's usual implicit
4729 quantification rules.
4731 <listitem><para> The signature gives a type for a function binding or a bare variable binding,
4732 not a pattern binding.
4735 f1 :: forall a. [a] -> [a]
4736 f1 (x:xs) = xs ++ [ x :: a ] -- OK
4738 f2 :: forall a. [a] -> [a]
4739 f2 = \(x:xs) -> xs ++ [ x :: a ] -- OK
4741 f3 :: forall a. [a] -> [a]
4742 Just f3 = Just (\(x:xs) -> xs ++ [ x :: a ]) -- Not OK!
4744 The binding for <literal>f3</literal> is a pattern binding, and so its type signature
4745 does not bring <literal>a</literal> into scope. However <literal>f1</literal> is a
4746 function binding, and <literal>f2</literal> binds a bare variable; in both cases
4747 the type signature brings <literal>a</literal> into scope.
4753 <sect3 id="exp-type-sigs">
4754 <title>Expression type signatures</title>
4756 <para>An expression type signature that has <emphasis>explicit</emphasis>
4757 quantification (using <literal>forall</literal>) brings into scope the
4758 explicitly-quantified
4759 type variables, in the annotated expression. For example:
4761 f = runST ( (op >>= \(x :: STRef s Int) -> g x) :: forall s. ST s Bool )
Here, the type signature <literal>forall s. ST s Bool</literal> brings the
4764 type variable <literal>s</literal> into scope, in the annotated expression
4765 <literal>(op >>= \(x :: STRef s Int) -> g x)</literal>.
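A runnable variant of the same pattern is sketched below. It replaces the unspecified <literal>op</literal> and <literal>g</literal> with concrete <literal>STRef</literal> operations, and the name <literal>counter</literal> is invented for the example:

```haskell
{-# LANGUAGE ScopedTypeVariables #-}

import Control.Monad.ST (ST, runST)
import Data.STRef (STRef, modifySTRef, newSTRef, readSTRef)

-- The 's' in the pattern signature for r is the one bound by the
-- "forall s." of the expression type signature.
counter :: Int
counter =
  runST ( (newSTRef 0 >>= \(r :: STRef s Int) ->
             modifySTRef r (+ 1) >> readSTRef r)
          :: forall s. ST s Int )
```

Without the explicit <literal>forall s</literal>, the <literal>s</literal> in <literal>STRef s Int</literal> would not be in scope and the pattern signature would be rejected.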
4770 <sect3 id="pattern-type-sigs">
4771 <title>Pattern type signatures</title>
4773 A type signature may occur in any pattern; this is a <emphasis>pattern type
4774 signature</emphasis>.
4777 -- f and g assume that 'a' is already in scope
4778 f = \(x::Int, y::a) -> x
4780 h ((x,y) :: (Int,Bool)) = (y,x)
4782 In the case where all the type variables in the pattern type signature are
4783 already in scope (i.e. bound by the enclosing context), matters are simple: the
4784 signature simply constrains the type of the pattern in the obvious way.
4787 Unlike expression and declaration type signatures, pattern type signatures are not implicitly generalised.
4788 The pattern in a <emphasis>pattern binding</emphasis> may only mention type variables
4789 that are already in scope. For example:
4791 f :: forall a. [a] -> (Int, [a])
4794 (ys::[a], n) = (reverse xs, length xs) -- OK
4795 zs::[a] = xs ++ ys -- OK
4797 Just (v::b) = ... -- Not OK; b is not in scope
4799 Here, the pattern signatures for <literal>ys</literal> and <literal>zs</literal>
4800 are fine, but the one for <literal>v</literal> is not because <literal>b</literal> is
4804 However, in all patterns <emphasis>other</emphasis> than pattern bindings, a pattern
4805 type signature may mention a type variable that is not in scope; in this case,
4806 <emphasis>the signature brings that type variable into scope</emphasis>.
4807 This is particularly important for existential data constructors. For example:
4809 data T = forall a. MkT [a]
4812 k (MkT [t::a]) = MkT t3
4816 Here, the pattern type signature <literal>(t::a)</literal> mentions a lexical type
4817 variable that is not already in scope. Indeed, it <emphasis>cannot</emphasis> already be in scope,
4818 because it is bound by the pattern match. GHC's rule is that in this situation
4819 (and only then), a pattern type signature can mention a type variable that is
4820 not already in scope; the effect is to bring it into scope, standing for the
4821 existentially-bound type variable.
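The example can be completed into a small module along these lines. The catch-all equation for <literal>k</literal>, the body of <literal>t3</literal> and the helper <literal>size</literal> are additions made for this sketch:

```haskell
{-# LANGUAGE ExistentialQuantification, ScopedTypeVariables #-}

data T = forall a. MkT [a]

-- The pattern signature (t :: a) names the existentially bound type, so that
-- it can be mentioned again in the signature for t3.
k :: T -> T
k (MkT [t :: a]) = MkT t3
  where
    t3 :: [a]
    t3 = [t, t, t]
k other = other

-- Counts the elements without knowing their type.
size :: T -> Int
size (MkT xs) = length xs
```

A singleton list is tripled by <literal>k</literal>, while any other list is returned unchanged.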
4824 When a pattern type signature binds a type variable in this way, GHC insists that the
4825 type variable is bound to a <emphasis>rigid</emphasis>, or fully-known, type variable.
4826 This means that any user-written type signature always stands for a completely known type.
4829 If all this seems a little odd, we think so too. But we must have
4830 <emphasis>some</emphasis> way to bring such type variables into scope, else we
4831 could not name existentially-bound type variables in subsequent type signatures.
4834 This is (now) the <emphasis>only</emphasis> situation in which a pattern type
signature is allowed to mention a lexical variable that is not already in scope.
4837 For example, both <literal>f</literal> and <literal>g</literal> would be
4838 illegal if <literal>a</literal> was not already in scope.
4844 <!-- ==================== Commented out part about result type signatures
4846 <sect3 id="result-type-sigs">
4847 <title>Result type signatures</title>
4850 The result type of a function, lambda, or case expression alternative can be given a signature, thus:
4853 {- f assumes that 'a' is already in scope -}
4854 f x y :: [a] = [x,y,x]
4856 g = \ x :: [Int] -> [3,4]
4858 h :: forall a. [a] -> a
4862 The final <literal>:: [a]</literal> after the patterns of <literal>f</literal> gives the type of
4863 the result of the function. Similarly, the body of the lambda in the RHS of
4864 <literal>g</literal> is <literal>[Int]</literal>, and the RHS of the case
4865 alternative in <literal>h</literal> is <literal>a</literal>.
4867 <para> A result type signature never brings new type variables into scope.</para>
4869 There are a couple of syntactic wrinkles. First, notice that all three
4870 examples would parse quite differently with parentheses:
4872 {- f assumes that 'a' is already in scope -}
4873 f x (y :: [a]) = [x,y,x]
4875 g = \ (x :: [Int]) -> [3,4]
4877 h :: forall a. [a] -> a
4881 Now the signature is on the <emphasis>pattern</emphasis>; and
4882 <literal>h</literal> would certainly be ill-typed (since the pattern
4883 <literal>(y:ys)</literal> cannot have the type <literal>a</literal>).
4885 Second, to avoid ambiguity, the type after the “<literal>::</literal>” in a result
4886 pattern signature on a lambda or <literal>case</literal> must be atomic (i.e. a single
4887 token or a parenthesised type of some sort). To see why,
4888 consider how one would parse this:
4897 <sect3 id="cls-inst-scoped-tyvars">
4898 <title>Class and instance declarations</title>
4901 The type variables in the head of a <literal>class</literal> or <literal>instance</literal> declaration
4902 scope over the methods defined in the <literal>where</literal> part. For example:
4920 <sect2 id="typing-binds">
4921 <title>Generalised typing of mutually recursive bindings</title>
4924 The Haskell Report specifies that a group of bindings (at top level, or in a
4925 <literal>let</literal> or <literal>where</literal>) should be sorted into
4926 strongly-connected components, and then type-checked in dependency order
4927 (<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.5.1">Haskell
4928 Report, Section 4.5.1</ulink>).
4929 As each group is type-checked, any binders of the group that
4931 an explicit type signature are put in the type environment with the specified
4933 and all others are monomorphic until the group is generalised
4934 (<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.5.2">Haskell Report, Section 4.5.2</ulink>).
4937 <para>Following a suggestion of Mark Jones, in his paper
4938 <ulink url="http://citeseer.ist.psu.edu/424440.html">Typing Haskell in
4940 GHC implements a more general scheme. If <option>-XRelaxedPolyRec</option> is
4942 <emphasis>the dependency analysis ignores references to variables that have an explicit
4943 type signature</emphasis>.
4944 As a result of this refined dependency analysis, the dependency groups are smaller, and more bindings will
4945 typecheck. For example, consider:
4947 f :: Eq a => a -> Bool
4948 f x = (x == x) || g True || g "Yes"
4950 g y = (y <= y) || f True
4952 This is rejected by Haskell 98, but under Jones's scheme the definition for
4953 <literal>g</literal> is typechecked first, separately from that for
4954 <literal>f</literal>,
4955 because the reference to <literal>f</literal> in <literal>g</literal>'s right
4956 hand side is ignored by the dependency analysis. Then <literal>g</literal>'s
4957 type is generalised, to get
4959 g :: Ord a => a -> Bool
4961 Now, the definition for <literal>f</literal> is typechecked, with this type for
4962 <literal>g</literal> in the type environment.
4966 The same refined dependency analysis also allows the type signatures of
4967 mutually-recursive functions to have different contexts, something that is illegal in
4968 Haskell 98 (Section 4.5.2, last sentence). With
4969 <option>-XRelaxedPolyRec</option>
4970 GHC only insists that the type signatures of a <emphasis>refined</emphasis> group have identical
4971 contexts; in practice this means that only variables bound by the same
4972 pattern binding must have the same context. For example, this is fine:
4974 f :: Eq a => a -> Bool
4975 f x = (x == x) || g True
4977 g :: Ord a => a -> Bool
4978 g y = (y <= y) || f True
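<para>Both examples can be placed in one module to check that they are accepted. This is a sketch only, assuming a compiler on which the refined dependency analysis is in effect (on the GHC described here, that means passing <option>-XRelaxedPolyRec</option>). The names <literal>f2</literal> and <literal>g2</literal> are renamings introduced so that the two examples can coexist.</para>

```haskell
-- First example: g has no signature.  Because f has one, the reference
-- to f is ignored by the dependency analysis, g is typechecked and
-- generalised on its own, and f may then use g at two different types.
f :: Eq a => a -> Bool
f x = (x == x) || g True || g "Yes"

g y = (y <= y) || f True

-- Second example: mutually recursive functions whose signatures have
-- different contexts (Eq versus Ord).
f2 :: Eq a => a -> Bool
f2 x = (x == x) || g2 True

g2 :: Ord a => a -> Bool
g2 y = (y <= y) || f2 True
```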
4983 <sect2 id="type-families">
4984 <title>Type families
4988 GHC supports the definition of type families indexed by types. They may be
4989 seen as an extension of Haskell 98's class-based overloading of values to
4990 types. When type families are declared in classes, they are also known as
4994 There are two forms of type families: data families and type synonym families.
4995 Currently, only the former are fully implemented, while we are still working
4996 on the latter. As a result, the specification of the language extension is
4997 also still to some degree in flux. Hence, a more detailed description of
4998 the language extension and its use is currently available
4999 from <ulink url="http://www.haskell.org/haskellwiki/GHC/Indexed_types">the Haskell
5000 wiki page on type families</ulink>. The material will be moved to this user's
5001 guide when it has stabilised.
5004 Type families are enabled by the flag <option>-XTypeFamilies</option>.
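<para>As a rough illustration of the two forms, the sketch below declares an associated type synonym family inside a class and a top-level data family. It assumes a GHC in which both forms are implemented; every name in it (<literal>Container</literal>, <literal>Elem</literal>, <literal>Table</literal>, <literal>lookupBool</literal>) is invented for the example.</para>

```haskell
{-# LANGUAGE TypeFamilies #-}

-- An associated type synonym family: each container type c determines
-- the type of its elements, Elem c.
class Container c where
  type Elem c
  empty  :: c
  insert :: Elem c -> c -> c
  toList :: c -> [Elem c]

instance Container [a] where
  type Elem [a] = a
  empty  = []
  insert = (:)
  toList = id

-- A top-level data family indexed by the key type: each instance is
-- free to choose its own representation.
data family Table k v
data instance Table Bool v = BoolTable (Maybe v) (Maybe v)
data instance Table ()   v = UnitTable (Maybe v)

lookupBool :: Bool -> Table Bool v -> Maybe v
lookupBool False (BoolTable f _) = f
lookupBool True  (BoolTable _ t) = t
```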
5011 <!-- ==================== End of type system extensions ================= -->
5013 <!-- ====================== TEMPLATE HASKELL ======================= -->
5015 <sect1 id="template-haskell">
5016 <title>Template Haskell</title>
5018 <para>Template Haskell allows you to do compile-time meta-programming in
5021 the main technical innovations is discussed in "<ulink
5022 url="http://research.microsoft.com/~simonpj/papers/meta-haskell/">
5023 Template Meta-programming for Haskell</ulink>" (Proc Haskell Workshop 2002).
5026 There is a Wiki page about
5027 Template Haskell at <ulink url="http://www.haskell.org/haskellwiki/Template_Haskell">
5028 http://www.haskell.org/haskellwiki/Template_Haskell</ulink>, and that is the best place to look for
5032 url="http://www.haskell.org/ghc/docs/latest/html/libraries/index.html">online
5033 Haskell library reference material</ulink>
5034 (look for module <literal>Language.Haskell.TH</literal>).
5035 Many changes to the original design are described in
5036 <ulink url="http://research.microsoft.com/~simonpj/papers/meta-haskell/notes2.ps">
5037 Notes on Template Haskell version 2</ulink>.
5038 Not all of these changes are in GHC, however.
5041 <para> The first example from that paper is set out below (<xref linkend="th-example"/>)
5042 as a worked example to help get you started.
5046 The documentation here describes the realisation of Template Haskell in GHC. On its own it is not detailed
5047 enough to learn Template Haskell from; see the <ulink url="http://haskell.org/haskellwiki/Template_Haskell">
5052 <title>Syntax</title>
5054 <para> Template Haskell has the following new syntactic
5055 constructions. You need to use the flag
5056 <option>-XTemplateHaskell</option>
5057 <indexterm><primary><option>-XTemplateHaskell</option></primary>
5058 </indexterm>to switch these syntactic extensions on
5059 (<option>-XTemplateHaskell</option> is no longer implied by
5060 <option>-fglasgow-exts</option>).</para>
5064 A splice is written <literal>$x</literal>, where <literal>x</literal> is an
5065 identifier, or <literal>$(...)</literal>, where the "..." is an arbitrary expression.
5066 There must be no space between the "$" and the identifier or parenthesis. This use
5067 of "$" overrides its meaning as an infix operator, just as "M.x" overrides the meaning
5068 of "." as an infix operator. If you want the infix operator, put spaces around it.
5070 <para> A splice can occur in place of
5072 <listitem><para> an expression; the spliced expression must
5073 have type <literal>Q Exp</literal></para></listitem>
5074 <listitem><para> a list of top-level declarations; the spliced expression must have type <literal>Q [Dec]</literal></para></listitem>
5077 Inside a splice you can only call functions defined in imported modules,
5078 not functions defined elsewhere in the same module.</listitem>
5082 An expression quotation is written in Oxford brackets, thus:
5084 <listitem><para> <literal>[| ... |]</literal>, where the "..." is an expression;
5085 the quotation has type <literal>Q Exp</literal>.</para></listitem>
5086 <listitem><para> <literal>[d| ... |]</literal>, where the "..." is a list of top-level declarations;
5087 the quotation has type <literal>Q [Dec]</literal>.</para></listitem>
5088 <listitem><para> <literal>[t| ... |]</literal>, where the "..." is a type;
5089 the quotation has type <literal>Q Type</literal>.</para></listitem>
5090 </itemizedlist></para></listitem>
5093 A quasi-quotation can appear in either a pattern context or an
5094 expression context and is also written in Oxford brackets:
5096 <listitem><para> <literal>[$<replaceable>varid</replaceable>| ... |]</literal>,
5097 where the "..." is an arbitrary string; a full description of the
5098 quasi-quotation facility is given in <xref linkend="th-quasiquotation"/>.</para></listitem>
5099 </itemizedlist></para></listitem>
5102 A name can be quoted with either one or two prefix single quotes:
5104 <listitem><para> <literal>'f</literal> has type <literal>Name</literal>, and names the function <literal>f</literal>.
5105 Similarly <literal>'C</literal> has type <literal>Name</literal> and names the data constructor <literal>C</literal>.
5106 In general <literal>'</literal><replaceable>thing</replaceable> interprets <replaceable>thing</replaceable> in an expression context.
5108 <listitem><para> <literal>''T</literal> has type <literal>Name</literal>, and names the type constructor <literal>T</literal>.
5109 That is, <literal>''</literal><replaceable>thing</replaceable> interprets <replaceable>thing</replaceable> in a type context.
5112 These <literal>Names</literal> can be used to construct Template Haskell expressions, patterns, declarations etc. They
5113 may also be given as an argument to the <literal>reify</literal> function.
5119 (Compared to the original paper, there are many differences of detail.
5120 The syntax for a declaration splice uses "<literal>$</literal>" not "<literal>splice</literal>".
5121 The type of the enclosed expression must be <literal>Q [Dec]</literal>, not <literal>[Q Dec]</literal>.
5122 Type splices are not implemented, and neither are pattern splices or quotations.
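<para>The constructions above can be exercised in a single module, provided that every splice calls only imported functions. The following is a minimal sketch, assuming the <literal>Language.Haskell.TH</literal> library and the <option>-XTemplateHaskell</option> flag; the names <literal>incrementQ</literal>, <literal>answerDecs</literal> and <literal>greeting</literal> are made up for the example.</para>

```haskell
{-# LANGUAGE TemplateHaskell #-}

import Language.Haskell.TH

-- An expression quotation; its type is Q Exp.
incrementQ :: Q Exp
incrementQ = [| \n -> n + (1 :: Int) |]

-- A declaration quotation; its type is Q [Dec].
answerDecs :: Q [Dec]
answerDecs = [d| answer :: Int; answer = 42 |]

-- Name quotes: one quote for functions and data constructors,
-- two quotes for type constructors.
mapName, justName, maybeName :: Name
mapName   = 'map
justName  = 'Just
maybeName = ''Maybe

-- An expression splice.  Its body calls only the imported function
-- stringE; splicing incrementQ here would break the rule that a splice
-- may not run functions defined in the same module.
greeting :: String
greeting = $(stringE "Hello")
```

<para>To splice <literal>incrementQ</literal> or <literal>answerDecs</literal>, they would have to be imported into a second module.</para>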
5126 <sect2> <title> Using Template Haskell </title>
5130 The data types and monadic constructor functions for Template Haskell are in the library
5131 <literal>Language.Haskell.TH</literal>.
5135 You can only run a function at compile time if it is imported from another module. That is,
5136 you can't define a function in a module, and call it from within a splice in the same module.
5137 (It would make sense to do so, but it's hard to implement.)
5141 You can only run a function at compile time if it is imported
5142 from another module <emphasis>that is not part of a mutually-recursive group of modules
5143 that includes the module currently being compiled</emphasis>. Furthermore, all of the modules of
5144 the mutually-recursive group must be reachable by non-SOURCE imports from the module where the
5145 splice is to be run.</para>
5147 For example, when compiling module A,
5148 you can only run Template Haskell functions imported from B if B does not import A (directly or indirectly).
5149 The reason should be clear: to run B we must compile and run A, but we are currently type-checking A.
5153 The flag <literal>-ddump-splices</literal> shows the expansion of all top-level splices as they happen.
5156 If you are building GHC from source, you need at least a stage-2 bootstrap compiler to
5157 run Template Haskell. A stage-1 compiler will reject the TH constructs. Reason: TH
5158 compiles and runs a program, and then looks at the result. So it's important that
5159 the program it compiles produces results whose representations are identical to
5160 those of the compiler itself.
5164 <para> Template Haskell works in any mode (<literal>--make</literal>, <literal>--interactive</literal>,
5165 or file-at-a-time). There used to be a restriction to the former two, but that restriction
5170 <sect2 id="th-example"> <title> A Template Haskell Worked Example </title>
5171 <para>To help you get over the confidence barrier, try out this skeletal worked example.
5172 First cut and paste the two modules below into "Main.hs" and "Printf.hs":</para>
5179 -- Import our template "pr"
5180 import Printf ( pr )
5182 -- The splice operator $ takes the Haskell source code
5183 -- generated at compile time by "pr" and splices it into
5184 -- the argument of "putStrLn".
5185 main = putStrLn ( $(pr "Hello") )
5191 -- Skeletal printf from the paper.
5192 -- It needs to be in a separate module to the one where
5193 -- you intend to use it.
5195 -- Import some Template Haskell syntax
5196 import Language.Haskell.TH
5198 -- Describe a format string
5199 data Format = D | S | L String
5201 -- Parse a format string. This is left largely to you
5202 -- as we are here interested in building our first ever
5203 -- Template Haskell program and not in building printf.
5204 parse :: String -> [Format]
5207 -- Generate Haskell source code from a parsed representation
5208 -- of the format string. This code will be spliced into
5209 -- the module which calls "pr", at compile time.
5210 gen :: [Format] -> Q Exp
5211 gen [D] = [| \n -> show n |]
5212 gen [S] = [| \s -> s |]
5213 gen [L s] = stringE s
5215 -- Here we generate the Haskell code for the splice
5216 -- from an input format string.
5217 pr :: String -> Q Exp
5218 pr s = gen (parse s)
5221 <para>Now run the compiler (here we are at a Cygwin prompt on Windows):
5224 $ ghc --make -XTemplateHaskell Main.hs -o main.exe
5227 <para>Run "main.exe" and here is your output:</para>
5237 <title>Using Template Haskell with Profiling</title>
5238 <indexterm><primary>profiling</primary><secondary>with Template Haskell</secondary></indexterm>
5240 <para>Template Haskell relies on GHC's built-in bytecode compiler and
5241 interpreter to run the splice expressions. The bytecode interpreter
5242 runs the compiled expression on top of the same runtime on which GHC
5243 itself is running; this means that the compiled code referred to by
5244 the interpreted expression must be compatible with this runtime, and
5245 in particular this means that object code that is compiled for
5246 profiling <emphasis>cannot</emphasis> be loaded and used by a splice
5247 expression, because profiled object code is only compatible with the
5248 profiling version of the runtime.</para>
5250 <para>This causes difficulties if you have a multi-module program
5251 containing Template Haskell code and you need to compile it for
5252 profiling, because GHC cannot load the profiled object code and use it
5253 when executing the splices. Fortunately GHC provides a workaround.
5254 The basic idea is to compile the program twice:</para>
5258 <para>Compile the program or library first the normal way, without
5259 <option>-prof</option><indexterm><primary><option>-prof</option></primary></indexterm>.</para>
5262 <para>Then compile it again with <option>-prof</option>, and
5263 additionally use <option>-osuf
5264 p_o</option><indexterm><primary><option>-osuf</option></primary></indexterm>
5265 to name the object files differently (you can choose any suffix
5266 that isn't the normal object suffix here). GHC will automatically
5267 load the object files built in the first step when executing splice
5268 expressions. If you omit the <option>-osuf</option> flag when
5269 building with <option>-prof</option> and Template Haskell is used,
5270 GHC will emit an error message. </para>
5275 <sect2 id="th-quasiquotation"> <title> Template Haskell Quasi-quotation </title>
5276 <para>Quasi-quotation allows patterns and expressions to be written using
5277 programmer-defined concrete syntax; the motivation behind the extension and
5278 several examples are documented in
5279 "<ulink url="http://www.eecs.harvard.edu/~mainland/ghc-quasiquoting/">Why It's
5280 Nice to be Quoted: Quasiquoting for Haskell</ulink>" (Proc Haskell Workshop
5281 2007). The example below shows how to write a quasiquoter for a simple
5282 expression language.</para>
5285 In the example, the quasiquoter <literal>expr</literal> is bound to a value of
5286 type <literal>Language.Haskell.TH.Quote.QuasiQuoter</literal> which contains two
5287 functions for quoting expressions and patterns, respectively. The first argument
5288 to each quoter is the (arbitrary) string enclosed in the Oxford brackets. The
5289 context of the quasi-quotation statement determines which of the two parsers is
5290 called: if the quasi-quotation occurs in an expression context, the expression
5291 parser is called, and if it occurs in a pattern context, the pattern parser is
5295 Note that in the example we make use of an antiquoted
5296 variable <literal>n</literal>, indicated by the syntax <literal>'int:n</literal>
5297 (this syntax for anti-quotation was defined by the parser's
5298 author, <emphasis>not</emphasis> by GHC). This binds <literal>n</literal> to the
5299 integer value argument of the constructor <literal>IntExpr</literal> when
5300 pattern matching. Please see the referenced paper for further details regarding
5301 anti-quotation as well as the description of a technique that uses SYB to
5302 leverage a single parser of type <literal>String -> a</literal> to generate both
5303 an expression parser that returns a value of type <literal>Q Exp</literal> and a
5304 pattern parser that returns a value of type <literal>Q Pat</literal>.
5307 <para>In general, a quasi-quote has the form
5308 <literal>[$<replaceable>quoter</replaceable>| <replaceable>string</replaceable> |]</literal>.
5309 The <replaceable>quoter</replaceable> must be the name of an imported quoter; it
5310 cannot be an arbitrary expression. The quoted <replaceable>string</replaceable>
5311 can be arbitrary, and may contain newlines.
5314 Quasiquoters must obey the same stage restrictions as Template Haskell, e.g., in
5315 the example, <literal>expr</literal> cannot be defined
5316 in <literal>Main.hs</literal> where it is used, but must be imported.
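<para>For a smaller illustration of what a quoter is, the sketch below defines one for natural-number literals. It is written under an assumption that differs from the two-argument constructor used in this section: that <literal>QuasiQuoter</literal> is a record with the fields <literal>quoteExp</literal>, <literal>quotePat</literal>, <literal>quoteType</literal> and <literal>quoteDec</literal>, as in recent versions of the <literal>template-haskell</literal> library. The names <literal>nat</literal> and <literal>parseNat</literal> are hypothetical.</para>

```haskell
import Data.Char (isDigit, isSpace)
import Language.Haskell.TH (integerL, litE, litP)
import Language.Haskell.TH.Quote (QuasiQuoter (..))

-- Hypothetical quoter: the quoted string must be a run of decimal
-- digits, optionally surrounded by white space.
nat :: QuasiQuoter
nat = QuasiQuoter
  { quoteExp  = \s -> litE (integerL (parseNat s))  -- expression context
  , quotePat  = \s -> litP (integerL (parseNat s))  -- pattern context
  , quoteType = \_ -> fail "nat: not usable in a type context"
  , quoteDec  = \_ -> fail "nat: not usable in a declaration context"
  }

-- The parser shared by the expression and pattern quoters.
parseNat :: String -> Integer
parseNat s
  | not (null digits) && all isDigit digits = read digits
  | otherwise = error ("nat: not a natural number: " ++ show s)
  where
    digits = filter (not . isSpace) s
```

<para>Because of the stage restriction, <literal>nat</literal> can only be used in a quasi-quote from a module that imports it.</para>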
5327 main = do { print $ eval [$expr|1 + 2|]
5329 { [$expr|'int:n|] -> print n
5338 import qualified Language.Haskell.TH as TH
5339 import Language.Haskell.TH.Quote
5341 data Expr = IntExpr Integer
5342 | AntiIntExpr String
5343 | BinopExpr BinOp Expr Expr
5345 deriving(Show, Typeable, Data)
5351 deriving(Show, Typeable, Data)
5353 eval :: Expr -> Integer
5354 eval (IntExpr n) = n
5355 eval (BinopExpr op x y) = (opToFun op) (eval x) (eval y)
5362 expr = QuasiQuoter parseExprExp parseExprPat
5364 -- Parse an Expr, returning its representation as
5365 -- either a Q Exp or a Q Pat. See the referenced paper
5366 -- for how to use SYB to do this by writing a single
5367 -- parser of type String -> Expr instead of two
5368 -- separate parsers.
5370 parseExprExp :: String -> Q Exp
5373 parseExprPat :: String -> Q Pat
5377 <para>Now run the compiler:
5380 $ ghc --make -XQuasiQuotes Main.hs -o main
5383 <para>Run "main" and here is your output:</para>
5395 <!-- ===================== Arrow notation =================== -->
5397 <sect1 id="arrow-notation">
5398 <title>Arrow notation
5401 <para>Arrows are a generalization of monads introduced by John Hughes.
5402 For more details, see
5407 “Generalising Monads to Arrows”,
5408 John Hughes, in <citetitle>Science of Computer Programming</citetitle> 37,
5409 pp67–111, May 2000.
5410 The paper that introduced arrows: a friendly introduction, motivated with
5411 programming examples.
5417 “<ulink url="http://www.soi.city.ac.uk/~ross/papers/notation.html">A New Notation for Arrows</ulink>”,
5418 Ross Paterson, in <citetitle>ICFP</citetitle>, Sep 2001.
5419 Introduced the notation described here.
5425 “<ulink url="http://www.soi.city.ac.uk/~ross/papers/fop.html">Arrows and Computation</ulink>”,
5426 Ross Paterson, in <citetitle>The Fun of Programming</citetitle>,
5433 “<ulink url="http://www.cs.chalmers.se/~rjmh/afp-arrows.pdf">Programming with Arrows</ulink>”,
5434 John Hughes, in <citetitle>5th International Summer School on
5435 Advanced Functional Programming</citetitle>,
5436 <citetitle>Lecture Notes in Computer Science</citetitle> vol. 3622,
5438 This paper includes another introduction to the notation,
5439 with practical examples.
5445 “<ulink url="http://www.haskell.org/ghc/docs/papers/arrow-rules.pdf">Type and Translation Rules for Arrow Notation in GHC</ulink>”,
5446 Ross Paterson and Simon Peyton Jones, September 16, 2004.
5447 A terse enumeration of the formal rules used
5448 (extracted from comments in the source code).
5454 The arrows web page at
5455 <ulink url="http://www.haskell.org/arrows/"><literal>http://www.haskell.org/arrows/</literal></ulink>.
5460 With the <option>-XArrows</option> flag, GHC supports the arrow
5461 notation described in the second of these papers,
5462 translating it using combinators from the
5463 <ulink url="../libraries/base/Control-Arrow.html"><literal>Control.Arrow</literal></ulink>
5465 What follows is a brief introduction to the notation;
5466 it won't make much sense unless you've read Hughes's paper.
5469 <para>The extension adds a new kind of expression for defining arrows:
5471 <replaceable>exp</replaceable><superscript>10</superscript> ::= ...
5472 | proc <replaceable>apat</replaceable> -> <replaceable>cmd</replaceable>
5474 where <literal>proc</literal> is a new keyword.
5475 The variables of the pattern are bound in the body of the
5476 <literal>proc</literal>-expression,
5477 which is a new sort of thing called a <firstterm>command</firstterm>.
5478 The syntax of commands is as follows:
5480 <replaceable>cmd</replaceable> ::= <replaceable>exp</replaceable><superscript>10</superscript> -< <replaceable>exp</replaceable>
5481 | <replaceable>exp</replaceable><superscript>10</superscript> -<< <replaceable>exp</replaceable>
5482 | <replaceable>cmd</replaceable><superscript>0</superscript>
5484 with <replaceable>cmd</replaceable><superscript>0</superscript> up to
5485 <replaceable>cmd</replaceable><superscript>9</superscript> defined using
5486 infix operators as for expressions, and
5488 <replaceable>cmd</replaceable><superscript>10</superscript> ::= \ <replaceable>apat</replaceable> ... <replaceable>apat</replaceable> -> <replaceable>cmd</replaceable>
5489 | let <replaceable>decls</replaceable> in <replaceable>cmd</replaceable>
5490 | if <replaceable>exp</replaceable> then <replaceable>cmd</replaceable> else <replaceable>cmd</replaceable>
5491 | case <replaceable>exp</replaceable> of { <replaceable>calts</replaceable> }
5492 | do { <replaceable>cstmt</replaceable> ; ... <replaceable>cstmt</replaceable> ; <replaceable>cmd</replaceable> }
5493 | <replaceable>fcmd</replaceable>
5495 <replaceable>fcmd</replaceable> ::= <replaceable>fcmd</replaceable> <replaceable>aexp</replaceable>
5496 | ( <replaceable>cmd</replaceable> )
5497 | (| <replaceable>aexp</replaceable> <replaceable>cmd</replaceable> ... <replaceable>cmd</replaceable> |)
5499 <replaceable>cstmt</replaceable> ::= let <replaceable>decls</replaceable>
5500 | <replaceable>pat</replaceable> <- <replaceable>cmd</replaceable>
5501 | rec { <replaceable>cstmt</replaceable> ; ... <replaceable>cstmt</replaceable> [;] }
5502 | <replaceable>cmd</replaceable>
5504 where <replaceable>calts</replaceable> are like <replaceable>alts</replaceable>
5505 except that the bodies are commands instead of expressions.
5509 Commands produce values, but (like monadic computations)
5510 may yield more than one value,
5511 or none, and may do other things as well.
5512 For the most part, familiarity with monadic notation is a good guide to
5514 However the values of expressions, even monadic ones,
5515 are determined by the values of the variables they contain;
5516 this is not necessarily the case for commands.
5520 A simple example of the new notation is the expression
5522 proc x -> f -< x+1
5524 We call this a <firstterm>procedure</firstterm> or
5525 <firstterm>arrow abstraction</firstterm>.
5526 As with a lambda expression, the variable <literal>x</literal>
5527 is a new variable bound within the <literal>proc</literal>-expression.
5528 It refers to the input to the arrow.
5529 In the above example, <literal>-<</literal> is not an identifier but a
5530 new reserved symbol used for building commands from an expression of arrow
5531 type and an expression to be fed as input to that arrow.
5532 (The weird look will make more sense later.)
5533 It may be read as the analogue of application for arrows.
5534 The above example is equivalent to the Haskell expression
5536 arr (\ x -> x+1) >>> f
5538 That would make no sense if the expression to the left of
5539 <literal>-<</literal> involves the bound variable <literal>x</literal>.
5540 More generally, the expression to the left of <literal>-<</literal>
5541 may not involve any <firstterm>local variable</firstterm>,
5542 i.e. a variable bound in the current arrow abstraction.
5543 For such a situation there is a variant <literal>-<<</literal>, as in
5545 proc x -> f x -<< x+1
5547 which is equivalent to
5549 arr (\ x -> (f x, x+1)) >>> app
5551 so in this case the arrow must belong to the <literal>ArrowApply</literal>
5553 Such an arrow is equivalent to a monad, so if you're using this form
5554 you may find a monadic formulation more convenient.
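<para>The two forms of arrow application can be compared with their combinator equivalents by instantiating the arrow to ordinary functions. This is a sketch, assuming the <option>-XArrows</option> flag; the function names are invented, and the arrow <literal>f</literal> is passed as a parameter so that it is not a local variable of the arrow abstraction.</para>

```haskell
{-# LANGUAGE Arrows #-}

import Control.Arrow

-- proc x -> f -< x+1, with the arrow f supplied from outside.
addOneThen :: Arrow a => a Int b -> a Int b
addOneThen f = proc x -> f -< x + 1

-- The combinator form that the text gives as its equivalent.
addOneThen' :: Arrow a => a Int b -> a Int b
addOneThen' f = arr (\x -> x + 1) >>> f

-- Here the arrow expression (f x) mentions the bound variable x, so the
-- -<< form is required, and with it the ArrowApply class.
applyAfter :: ArrowApply a => (Int -> a Int b) -> a Int b
applyAfter f = proc x -> f x -<< x + 1

-- Its combinator equivalent, using app.
applyAfter' :: ArrowApply a => (Int -> a Int b) -> a Int b
applyAfter' f = arr (\x -> (f x, x + 1)) >>> app
```

<para>With functions as the arrow, <literal>addOneThen (* 2) 3</literal> doubles <literal>3+1</literal>, and <literal>applyAfter (\x y -> x * y) 3</literal> multiplies <literal>3</literal> by <literal>3+1</literal>.</para>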
5558 <title>do-notation for commands</title>
5561 Another form of command is a form of <literal>do</literal>-notation.
5562 For example, you can write
5571 You can read this much like ordinary <literal>do</literal>-notation,
5572 but with commands in place of monadic expressions.
5573 The first line sends the value of <literal>x+1</literal> as an input to
5574 the arrow <literal>f</literal>, and matches its output against
5575 <literal>y</literal>.
5576 In the next line, the output is discarded.
5577 The arrow <function>returnA</function> is defined in the
5578 <ulink url="../libraries/base/Control-Arrow.html"><literal>Control.Arrow</literal></ulink>
5579 module as <literal>arr id</literal>.
5580 The above example is treated as an abbreviation for
5582 arr (\ x -> (x, x)) >>>
5583 first (arr (\ x -> x+1) >>> f) >>>
5584 arr (\ (y, x) -> (y, (x, y))) >>>
5585 first (arr (\ y -> 2*y) >>> g) >>>
5587 arr (\ (x, y) -> let z = x+y in ((x, z), z)) >>>
5588 first (arr (\ (x, z) -> x*z) >>> h) >>>
5589 arr (\ (t, z) -> t+z) >>>
5592 Note that variables not used later in the composition are projected out.
5593 After simplification using rewrite rules (see <xref linkend="rewrite-rules"/>)
5595 <ulink url="../libraries/base/Control-Arrow.html"><literal>Control.Arrow</literal></ulink>
5596 module, this reduces to
5598 arr (\ x -> (x+1, x)) >>>
5600 arr (\ (y, x) -> (2*y, (x, y))) >>>
5602 arr (\ (_, (x, y)) -> let z = x+y in (x*z, z)) >>>
5604 arr (\ (t, z) -> t+z)
5606 which is what you might have written by hand.
5607 With arrow notation, GHC keeps track of all those tuples of variables for you.
5611 Note that although the above translation suggests that
5612 <literal>let</literal>-bound variables like <literal>z</literal> must be
5613 monomorphic, the actual translation produces Core,
5614 so polymorphic variables are allowed.
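<para>The <literal>do</literal>-notation for commands and its simplified translation can be checked against each other. The sketch below is reconstructed from the prose description and the simplified translation, not copied from the listing; it assumes the <option>-XArrows</option> flag, takes the arrows <literal>f</literal>, <literal>g</literal> and <literal>h</literal> as parameters, and fixes all types to <literal>Int</literal> for simplicity. The names <literal>pipeline</literal> and <literal>pipeline'</literal> are invented.</para>

```haskell
{-# LANGUAGE Arrows #-}

import Control.Arrow

-- A command written in do-notation.
pipeline :: Arrow a => a Int Int -> a Int Int -> a Int Int -> a Int Int
pipeline f g h = proc x -> do
  y <- f -< x + 1      -- feed x+1 to f, bind the output to y
  g -< 2 * y           -- run g, discarding its output
  let z = x + y
  t <- h -< x * z
  returnA -< t + z

-- The same computation with the tuples of live variables managed by hand.
pipeline' :: Arrow a => a Int Int -> a Int Int -> a Int Int -> a Int Int
pipeline' f g h =
  arr (\x -> (x + 1, x)) >>>
  first f >>>
  arr (\(y, x) -> (2 * y, (x, y))) >>>
  first g >>>
  arr (\(_, (x, y)) -> let z = x + y in (x * z, z)) >>>
  first h >>>
  arr (\(t, z) -> t + z)
```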
5618 It's also possible to have mutually recursive bindings,
5619 using the new <literal>rec</literal> keyword, as in the following example:
5621 counter :: ArrowCircuit a => a Bool Int
5622 counter = proc reset -> do
5623 rec output <- returnA -< if reset then 0 else next
5624 next <- delay 0 -< output+1
5625 returnA -< output
5627 The translation of such forms uses the <function>loop</function> combinator,
5628 so the arrow concerned must belong to the <literal>ArrowLoop</literal> class.
5634 <title>Conditional commands</title>
5637 In the previous example, we used a conditional expression to construct the
5639 Sometimes we want to conditionally execute different commands, as in
5646 which is translated to
5648 arr (\ (x,y) -> if f x y then Left x else Right y) >>>
5649 (arr (\x -> x+1) >>> g) ||| (arr (\y -> y+2) >>> h)
5651 Since the translation uses <function>|||</function>,
5652 the arrow concerned must belong to the <literal>ArrowChoice</literal> class.
5656 There are also <literal>case</literal> commands, like
5662 y <- h -< (x1, x2)
5666 The syntax is the same as for <literal>case</literal> expressions,
5667 except that the bodies of the alternatives are commands rather than expressions.
5668 The translation is similar to that of <literal>if</literal> commands.
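<para>Conditional and <literal>case</literal> commands can be tried out in the same way. This is a sketch under the assumption that <option>-XArrows</option> is enabled; the predicate <literal>p</literal> and the arrows <literal>g</literal> and <literal>h</literal> are parameters, and all function names are invented for the example.</para>

```haskell
{-# LANGUAGE Arrows #-}

import Control.Arrow

-- A conditional command: the test is an ordinary expression, the
-- branches are commands.
choose :: ArrowChoice a
       => (Int -> Int -> Bool) -> a Int Int -> a Int Int -> a (Int, Int) Int
choose p g h = proc (x, y) ->
  if p x y
    then g -< x + 1
    else h -< y + 2

-- A hand-written equivalent using (|||).
choose' :: ArrowChoice a
        => (Int -> Int -> Bool) -> a Int Int -> a Int Int -> a (Int, Int) Int
choose' p g h =
  arr (\(x, y) -> if p x y then Left x else Right y) >>>
  ((arr (\x -> x + 1) >>> g) ||| (arr (\y -> y + 2) >>> h))

-- A case command: each alternative is a command.
classify :: ArrowChoice a => a Int String -> a () String -> a (Maybe Int) String
classify g h = proc m ->
  case m of
    Just n  -> g -< n * 2
    Nothing -> h -< ()
```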
5674 <title>Defining your own control structures</title>
5677 As we've seen, arrow notation provides constructs,
5678 modelled on those for expressions,
5679 for sequencing, value recursion and conditionals.
5680 But suitable combinators,
5681 which you can define in ordinary Haskell,
5682 may also be used to build new commands out of existing ones.
5683 The basic idea is that a command defines an arrow from environments to values.
5684 These environments assign values to the free local variables of the command.
5685 Thus combinators that produce arrows from arrows
5686 may also be used to build commands from commands.
5687 For example, the <literal>ArrowPlus</literal> class includes a combinator
5689 ArrowPlus a => (<+>) :: a e c -> a e c -> a e c
5691 so we can use it to build commands:
5693 expr' = proc x -> do
5696 symbol Plus -< ()
5697 y <- term -< ()
5700 symbol Minus -< ()
5701 y <- term -< ()
5704 (The <literal>do</literal> on the first line is needed to prevent the first
5705 <literal><+> ...</literal> from being interpreted as part of the
5706 expression on the previous line.)
5707 This is equivalent to
5709 expr' = (proc x -> returnA -< x)
5710 <+> (proc x -> do
5711 symbol Plus -< ()
5712 y <- term -< ()
5714 <+> (proc x -> do
5715 symbol Minus -< ()
5716 y <- term -< ()
5719 It is essential that this operator be polymorphic in <literal>e</literal>
5720 (representing the environment input to the command
5721 and thence to its subcommands)
5722 and satisfy the corresponding naturality property
5724 arr k >>> (f <+> g) = (arr k >>> f) <+> (arr k >>> g)
5726 at least for strict <literal>k</literal>.
5727 (This should be automatic if you're not using <function>seq</function>.)
5728 This ensures that environments seen by the subcommands are environments
5729 of the whole command,
5730 and also allows the translation to safely trim these environments.
5731 The operator must also not use any variable defined within the current
5736 We could define our own operator
5738 untilA :: ArrowChoice a => a e () -> a e Bool -> a e ()
5739 untilA body cond = proc x -> do
5740 b <- cond -< x
5741 if b then returnA -< ()
5744 untilA body cond -< x
5746 and use it in the same way.
5747 Of course this infix syntax only makes sense for binary operators;
5748 there is also a more general syntax involving special brackets:
5752 (|untilA (increment -< x+y) (within 0.5 -< x)|)
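<para>An ordinary combinator can be seen acting on commands with any arrow that supports it. The sketch below uses <function>&lt;+&gt;</function> from the <literal>ArrowPlus</literal> class at the arrow type <literal>Kleisli []</literal>, for which <literal>Control.Arrow</literal> provides an instance that concatenates the result lists. It assumes the <option>-XArrows</option> flag; the names <literal>alternatives</literal> and <literal>alternatives'</literal> are invented.</para>

```haskell
{-# LANGUAGE Arrows #-}

import Control.Arrow

-- (<+>) used as an infix operator between commands.  Each operand is a
-- command that sees the same environment, here just the variable x.
alternatives :: Kleisli [] Int Int
alternatives = proc x ->
      (returnA -< x)
  <+> (returnA -< x * 10)
  <+> (returnA -< x * 100)

-- The same arrow with (<+>) applied to complete arrow abstractions.
alternatives' :: Kleisli [] Int Int
alternatives' =
      (proc x -> returnA -< x)
  <+> (proc x -> returnA -< x * 10)
  <+> (proc x -> returnA -< x * 100)
```

<para>Running either arrow with <literal>runKleisli</literal> on the input <literal>3</literal> produces one result per alternative.</para>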
5759 <title>Primitive constructs</title>
5762 Some operators will need to pass additional inputs to their subcommands.
5763 For example, in an arrow type supporting exceptions,
5764 the operator that attaches an exception handler will wish to pass the
5765 exception that occurred to the handler.
5766 Such an operator might have a type
5768 handleA :: ... => a e c -> a (e,Ex) c -> a e c
5770 where <literal>Ex</literal> is the type of exceptions handled.
5771 You could then use this with arrow notation by writing a command
5773 body `handleA` \ ex -> handler
5775 so that if an exception is raised in the command <literal>body</literal>,
5776 the variable <literal>ex</literal> is bound to the value of the exception
5777 and the command <literal>handler</literal>,
5778 which typically refers to <literal>ex</literal>, is entered.
5779 Though the syntax here looks like a functional lambda,
5780 we are talking about commands, and something different is going on.
5781 The input to the arrow represented by a command consists of values for
5782 the free local variables in the command, plus a stack of anonymous values.
5783 In all the prior examples, this stack was empty.
5784 In the second argument to <function>handleA</function>,
5785 this stack consists of one value, the value of the exception.
5786 The command form of lambda merely gives this value a name.
5791 the values on the stack are paired to the right of the environment.
5792 So operators like <function>handleA</function> that pass
5793 extra inputs to their subcommands can be designed for use with the notation
5794 by pairing the values with the environment in this way.
5795 More precisely, the type of each argument of the operator (and its result)
5796 should have the form
5798 a (...(e,t1), ... tn) t
5800 where <replaceable>e</replaceable> is a polymorphic variable
5801 (representing the environment)
5802 and <replaceable>ti</replaceable> are the types of the values on the stack,
5803 with <replaceable>t1</replaceable> being the <quote>top</quote>.
5804 The polymorphic variable <replaceable>e</replaceable> must not occur in
5805 <replaceable>a</replaceable>, <replaceable>ti</replaceable> or
5806 <replaceable>t</replaceable>.
5807 However the arrows involved need not be the same.
5808 Here are some more examples of suitable operators:
5810 bracketA :: ... => a e b -> a (e,b) c -> a (e,c) d -> a e d
5811 runReader :: ... => a e c -> a' (e,State) c
5812 runState :: ... => a e c -> a' (e,State) (c,State)
5814 We can supply the extra input required by commands built with the last two
5815 by applying them to ordinary expressions, as in
5819 (|runReader (do { ... })|) s
5821 which adds <literal>s</literal> to the stack of inputs to the command
5822 built using <function>runReader</function>.
5826 The command versions of lambda abstraction and application are analogous to
5827 the expression versions.
5828 In particular, the beta and eta rules describe equivalences of commands.
5829 These three features (operators, lambda abstraction and application)
5830 are the core of the notation; everything else can be built using them,
5831 though the results would be somewhat clumsy.
5832 For example, we could simulate <literal>do</literal>-notation by defining
5834 bind :: Arrow a => a e b -> a (e,b) c -> a e c
5835 u `bind` f = returnA &&& u >>> f
5837 bind_ :: Arrow a => a e b -> a e c -> a e c
5838 u `bind_` f = u `bind` (arr fst >>> f)
5840 We could simulate <literal>if</literal> by defining
5842 cond :: ArrowChoice a => a e b -> a e b -> a (e,Bool) b
5843 cond f g = arr (\ (e,b) -> if b then Left e else Right e) >>> f ||| g
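<para>
As a rough illustration, the definitions of <function>bind</function> and
<function>cond</function> above can be tried out at the ordinary function
arrow, where the pairing of the environment with the extra value is easy to
see. The names <literal>addThenScale</literal> and
<literal>stepOrDouble</literal> are invented for this sketch and are not part
of any library.
</para>
<programlisting>
import Control.Arrow

bind :: Arrow a => a e b -> a (e,b) c -> a e c
u `bind` f = returnA &&& u >>> f

cond :: ArrowChoice a => a e b -> a e b -> a (e,Bool) b
cond f g = arr (\ (e,b) -> if b then Left e else Right e) >>> f ||| g

-- bind passes both the environment e and the result of u to f:
-- here e is the input and b is e + 1.
addThenScale :: Int -> Int
addThenScale = (+ 1) `bind` (\ (e, b) -> e * b)

-- cond picks the first arrow when the Bool paired with the
-- environment is True, and the second arrow otherwise.
stepOrDouble :: (Int, Bool) -> Int
stepOrDouble = cond (+ 1) (* 2)
</programlisting>
<para>
With these definitions, <literal>addThenScale 3</literal> pairs 3 with 4 and
multiplies them, and <literal>stepOrDouble (5, True)</literal> takes the
<literal>(+ 1)</literal> branch.
</para>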
5850 <title>Differences with the paper</title>
5855 <para>Instead of a single form of arrow application (arrow tail) with two
5856 translations, the implementation provides two forms
5857 <quote><literal>-<</literal></quote> (first-order)
5858 and <quote><literal>-<<</literal></quote> (higher-order).
5863 <para>User-defined operators are flagged with banana brackets instead of
5864 a new <literal>form</literal> keyword.
5873 <title>Portability</title>
5876 Although only GHC implements arrow notation directly,
5877 there is also a preprocessor
5879 <ulink url="http://www.haskell.org/arrows/">arrows web page</ulink>)
5880 that translates arrow notation into Haskell 98
5881 for use with other Haskell systems.
5882 You would still want to check arrow programs with GHC;
5883 tracing type errors in the preprocessor output is not easy.
5884 Modules intended for both GHC and the preprocessor must observe some
5885 additional restrictions:
5890 The module must import
5891 <ulink url="../libraries/base/Control-Arrow.html"><literal>Control.Arrow</literal></ulink>.
5897 The preprocessor cannot cope with other Haskell extensions.
5898 These would have to go in separate modules.
5904 Because the preprocessor targets Haskell (rather than Core),
5905 <literal>let</literal>-bound variables are monomorphic.
5916 <!-- ==================== BANG PATTERNS ================= -->
5918 <sect1 id="bang-patterns">
5919 <title>Bang patterns
5920 <indexterm><primary>Bang patterns</primary></indexterm>
5922 <para>GHC supports an extension of pattern matching called <emphasis>bang
5923 patterns</emphasis>. Bang patterns are under consideration for Haskell Prime.
5925 url="http://hackage.haskell.org/trac/haskell-prime/wiki/BangPatterns">Haskell
5926 Prime feature description</ulink> contains more discussion and examples
5927 than the material below.
5930 Bang patterns are enabled by the flag <option>-XBangPatterns</option>.
5933 <sect2 id="bang-patterns-informal">
5934 <title>Informal description of bang patterns
5937 The main idea is to add a single new production to the syntax of patterns:
5941 Matching an expression <literal>e</literal> against a pattern <literal>!p</literal> is done by first
5942 evaluating <literal>e</literal> (to WHNF) and then matching the result against <literal>p</literal>.
5947 This definition makes <literal>f1</literal> strict in <literal>x</literal>,
5948 whereas without the bang it would be lazy.
5949 Bang patterns can be nested of course:
5953 Here, <literal>f2</literal> is strict in <literal>x</literal> but not in
5954 <literal>y</literal>.
5955 A bang only really has an effect if it precedes a variable or wild-card pattern:
5960 Here, <literal>f3</literal> and <literal>f4</literal> are identical; putting a bang before a pattern that
5961 forces evaluation anyway does nothing.
5963 Bang patterns work in <literal>case</literal> expressions too, of course:
5965 g5 x = let y = f x in body
5966 g6 x = case f x of { y -> body }
5967 g7 x = case f x of { !y -> body }
5969 The functions <literal>g5</literal> and <literal>g6</literal> mean exactly the same thing.
5970 But <literal>g7</literal> evaluates <literal>(f x)</literal>, binds <literal>y</literal> to the
5971 result, and then evaluates <literal>body</literal>.
5973 Bang patterns work in <literal>let</literal> and <literal>where</literal>
5974 definitions too. For example:
5978 is a strict pattern: operationally, it evaluates <literal>e</literal>, matches
5979 it against the pattern <literal>[x,y]</literal>, and then evaluates <literal>b</literal>.
5980 The "<literal>!</literal>" should not be regarded as part of the pattern; after all,
5981 in a function argument <literal>![x,y]</literal> means the
5982 same as <literal>[x,y]</literal>. Rather, the "<literal>!</literal>"
5983 is part of the syntax of <literal>let</literal> bindings.
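<para>
The following is a small sketch of how bang patterns tend to be used in
practice, in function arguments and in <literal>let</literal> bindings.
The functions <literal>sumStrict</literal> and <literal>mean</literal> are
hypothetical examples, not library functions.
</para>
<programlisting>
{-# LANGUAGE BangPatterns #-}

-- A hand-written strict left fold: the bang on acc forces the
-- accumulator to WHNF at every step, so no chain of thunks builds up.
sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go !acc []       = acc
    go !acc (x : xs) = go (acc + x) xs

-- Banged let bindings: both total and count are evaluated
-- before the body (the division) is evaluated.
mean :: [Double] -> Double
mean xs =
  let !total = sum xs
      !count = fromIntegral (length xs)
  in total / count
</programlisting>
<para>
Without the bangs both functions return the same results; only the order of
evaluation, and hence the space behaviour, differs.
</para>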
5988 <sect2 id="bang-patterns-sem">
5989 <title>Syntax and semantics
5993 We add a single new production to the syntax of patterns:
5997 There is one problem with syntactic ambiguity. Consider:
6001 Is this a definition of the infix function "<literal>(!)</literal>",
6002 or of the function "<literal>f</literal>" with a bang pattern? GHC resolves this
6003 ambiguity in favour of the latter. If you want to define
6004 <literal>(!)</literal> with bang-patterns enabled, you have to do so using
6009 The semantics of Haskell pattern matching is described in <ulink
6010 url="http://www.haskell.org/onlinereport/exps.html#sect3.17.2">
6011 Section 3.17.2</ulink> of the Haskell Report. To this description add
6012 one extra item 10, saying:
6013 <itemizedlist><listitem><para>Matching
6014 the pattern <literal>!pat</literal> against a value <literal>v</literal> behaves as follows:
6015 <itemizedlist><listitem><para>if <literal>v</literal> is bottom, the match diverges</para></listitem>
6016 <listitem><para>otherwise, <literal>pat</literal> is matched against
6017 <literal>v</literal></para></listitem>
6019 </para></listitem></itemizedlist>
6020 Similarly, in Figure 4 of <ulink url="http://www.haskell.org/onlinereport/exps.html#sect3.17.3">
6021 Section 3.17.3</ulink>, add a new case (t):
6023 case v of { !pat -> e; _ -> e' }
6024 = v `seq` case v of { pat -> e; _ -> e' }
6027 That leaves let expressions, whose translation is given in
6028 <ulink url="http://www.haskell.org/onlinereport/exps.html#sect3.12">Section
6030 of the Haskell Report.
6031 In the translation box, first apply
6032 the following transformation: for each pattern <literal>pi</literal> that is of
6033 form <literal>!qi = ei</literal>, transform it to <literal>(xi,!qi) = ((),ei)</literal>, and replace <literal>e0</literal>
6034 by <literal>(xi `seq` e0)</literal>. Then, when none of the left-hand-side patterns
6035 have a bang at the top, apply the rules in the existing box.
6037 <para>The effect of the let rule is to force complete matching of the pattern
6038 <literal>qi</literal> before evaluation of the body is begun. The bang is
6039 retained in the translated form in case <literal>qi</literal> is a variable,
6047 The let-binding can be recursive. However, it is much more common for
6048 the let-binding to be non-recursive, in which case the following law holds:
6049 <literal>(let !p = rhs in body)</literal>
6051 <literal>(case rhs of !p -> body)</literal>
6054 A pattern with a bang at the outermost level is not allowed at the top level of
6060 <!-- ==================== ASSERTIONS ================= -->
6062 <sect1 id="assertions">
6064 <indexterm><primary>Assertions</primary></indexterm>
6068 If you want to make use of assertions in your standard Haskell code, you
6069 could define a function like the following:
6075 assert :: Bool -> a -> a
6076 assert False x = error "assertion failed!"
6083 which works, but gives you back a less than useful error message --
6084 an assertion failed, but which and where?
6088 One way out is to define an extended <function>assert</function> function which also
6089 takes a descriptive string to include in the error message and
6090 perhaps combine this with the use of a pre-processor which inserts
6091 the source location where <function>assert</function> was used.
6095 GHC offers a helping hand here, doing all of this for you. For every
6096 use of <function>assert</function> in the user's source:
6102 kelvinToC :: Double -> Double
6103 kelvinToC k = assert (k >= 0.0) (k-273.15)
6109 GHC will rewrite this to also include the source location where the
6116 assert pred val ==> assertError "Main.hs|15" pred val
6122 The rewrite is only performed by the compiler when it spots
6123 applications of <function>Control.Exception.assert</function>, so you
6124 can still define and use your own versions of
6125 <function>assert</function>, should you so wish. If not, import
6126 <literal>Control.Exception</literal> to make use
6127 of <function>assert</function> in your code.
6131 GHC ignores assertions when optimisation is turned on with the
6132 <option>-O</option><indexterm><primary><option>-O</option></primary></indexterm> flag. That is, expressions of the form
6133 <literal>assert pred e</literal> will be rewritten to
6134 <literal>e</literal>. You can also disable assertions using the
6135 <option>-fignore-asserts</option>
6136 option<indexterm><primary><option>-fignore-asserts</option></primary>
6137 </indexterm>.</para>
6140 Assertion failures can be caught; see the documentation for the
6141 <literal>Control.Exception</literal> library for the details.
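<para>
As a sketch of catching an assertion failure, the code below assumes the
<literal>AssertionFailed</literal> exception type and the
<function>try</function> and <function>evaluate</function> functions exported
by current versions of <literal>Control.Exception</literal>; the wrapper
<literal>checkedKelvinToC</literal> is a hypothetical name.
</para>
<programlisting>
import Control.Exception (AssertionFailed (..), assert, evaluate, try)

kelvinToC :: Double -> Double
kelvinToC k = assert (k >= 0.0) (k - 273.15)

-- Force the result inside try so that a failing assertion is raised
-- here, and report it as a Left carrying the message (which includes
-- the source location inserted by GHC).
checkedKelvinToC :: Double -> IO (Either String Double)
checkedKelvinToC k = do
  r <- try (evaluate (kelvinToC k))
  return $ case r of
    Left (AssertionFailed msg) -> Left msg
    Right c                    -> Right c
</programlisting>
<para>
When the module is compiled with <option>-O</option> or
<option>-fignore-asserts</option> the assertion is removed, so a negative
input then yields a <literal>Right</literal> value rather than a
<literal>Left</literal>.
</para>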
6147 <!-- =============================== PRAGMAS =========================== -->
6149 <sect1 id="pragmas">
6150 <title>Pragmas</title>
6152 <indexterm><primary>pragma</primary></indexterm>
6154 <para>GHC supports several pragmas, or instructions to the
6155 compiler placed in the source code. Pragmas don't normally affect
6156 the meaning of the program, but they might affect the efficiency
6157 of the generated code.</para>
6159 <para>Pragmas all take the form
6161 <literal>{-# <replaceable>word</replaceable> ... #-}</literal>
6163 where <replaceable>word</replaceable> indicates the type of
6164 pragma, and is followed optionally by information specific to that
6165 type of pragma. Case is ignored in
6166 <replaceable>word</replaceable>. The various values for
6167 <replaceable>word</replaceable> that GHC understands are described
6168 in the following sections; any pragma encountered with an
6169 unrecognised <replaceable>word</replaceable> is (silently)
6170 ignored. The layout rule applies in pragmas, so the closing <literal>#-}</literal>
6171 should start in a column to the right of the opening <literal>{-#</literal>. </para>
6173 <para>Certain pragmas are <emphasis>file-header pragmas</emphasis>. A file-header
6174 pragma must precede the <literal>module</literal> keyword in the file.
6175 There can be as many file-header pragmas as you please, and they can be
6176 preceded or followed by comments.</para>
6178 <sect2 id="language-pragma">
6179 <title>LANGUAGE pragma</title>
6181 <indexterm><primary>LANGUAGE</primary><secondary>pragma</secondary></indexterm>
6182 <indexterm><primary>pragma</primary><secondary>LANGUAGE</secondary></indexterm>
6184 <para>The <literal>LANGUAGE</literal> pragma allows language extensions to be enabled
6186 It is the intention that all Haskell compilers support the
6187 <literal>LANGUAGE</literal> pragma with the same syntax, although not
6188 all extensions are supported by all compilers, of
6189 course. The <literal>LANGUAGE</literal> pragma should be used instead
6190 of <literal>OPTIONS_GHC</literal>, if possible.</para>
6192 <para>For example, to enable the FFI and preprocessing with CPP:</para>
6194 <programlisting>{-# LANGUAGE ForeignFunctionInterface, CPP #-}</programlisting>
6196 <para><literal>LANGUAGE</literal> is a file-header pragma (see <xref linkend="pragmas"/>).</para>
6198 <para>Every language extension can also be turned into a command-line flag
6199 by prefixing it with "<literal>-X</literal>"; for example <option>-XForeignFunctionInterface</option>.
6200 (Similarly, all "<literal>-X</literal>" flags can be written as <literal>LANGUAGE</literal> pragmas.)
6203 <para>A list of all supported language extensions can be obtained by invoking
6204 <literal>ghc --supported-languages</literal> (see <xref linkend="modes"/>).</para>
6206 <para>Any extension from the <literal>Extension</literal> type defined in
6208 url="../libraries/Cabal/Language-Haskell-Extension.html"><literal>Language.Haskell.Extension</literal></ulink>
6209 may be used. GHC will report an error if any of the requested extensions are not supported.</para>
6213 <sect2 id="options-pragma">
6214 <title>OPTIONS_GHC pragma</title>
6215 <indexterm><primary>OPTIONS_GHC</primary>
6217 <indexterm><primary>pragma</primary><secondary>OPTIONS_GHC</secondary>
6220 <para>The <literal>OPTIONS_GHC</literal> pragma is used to specify
6221 additional options that are given to the compiler when compiling
6222 this source file. See <xref linkend="source-file-options"/> for
6225 <para>Previous versions of GHC accepted <literal>OPTIONS</literal> rather
6226 than <literal>OPTIONS_GHC</literal>, but that is now deprecated.</para>
6229 <para><literal>OPTIONS_GHC</literal> is a file-header pragma (see <xref linkend="pragmas"/>).</para>
6231 <sect2 id="include-pragma">
6232 <title>INCLUDE pragma</title>
6234 <para>The <literal>INCLUDE</literal> pragma is for specifying the names
6235 of C header files that should be <literal>#include</literal>'d into
6236 the C source code generated by the compiler for the current module (if
6237 compiling via C). For example:</para>
6240 {-# INCLUDE "foo.h" #-}
6241 {-# INCLUDE <stdio.h> #-}</programlisting>
6243 <para><literal>INCLUDE</literal> is a file-header pragma (see <xref linkend="pragmas"/>).</para>
6245 <para>An <literal>INCLUDE</literal> pragma is the preferred alternative
6246 to the <option>-#include</option> option (<xref
6247 linkend="options-C-compiler" />), because the
6248 <literal>INCLUDE</literal> pragma is understood by other
6249 compilers. Yet another alternative is to add the include file to each
6250 <literal>foreign import</literal> declaration in your code, but we
6251 don't recommend using this approach with GHC.</para>
6254 <sect2 id="warning-deprecated-pragma">
6255 <title>WARNING and DEPRECATED pragmas</title>
6256 <indexterm><primary>WARNING</primary></indexterm>
6257 <indexterm><primary>DEPRECATED</primary></indexterm>
6259 <para>The WARNING pragma allows you to attach an arbitrary warning
6260 to a particular function, class, or type.
6261 A DEPRECATED pragma lets you specify that
6262 a particular function, class, or type is deprecated.
6263 There are two ways of using these pragmas.
6267 <para>You can work on an entire module thus:</para>
6269 module Wibble {-# DEPRECATED "Use Wobble instead" #-} where
6274 module Wibble {-# WARNING "This is an unstable interface." #-} where
6277 <para>When you compile any module that imports
6278 <literal>Wibble</literal>, GHC will print the specified
6283 <para>You can attach a warning to a function, class, type, or data constructor, with the
6284 following top-level declarations:</para>
6286 {-# DEPRECATED f, C, T "Don't use these" #-}
6287 {-# WARNING unsafePerformIO "This is unsafe; I hope you know what you're doing" #-}
6289 <para>When you compile any module that imports and uses any
6290 of the specified entities, GHC will print the specified
6292 <para> You can only attach warnings to entities declared at top level in the module
6293 being compiled, and you can only use unqualified names in the list of
6294 entities. A capitalised name, such as <literal>T</literal>
6295 refers to <emphasis>either</emphasis> the type constructor <literal>T</literal>
6296 <emphasis>or</emphasis> the data constructor <literal>T</literal>, or both if
6297 both are in scope. If both are in scope, there is currently no way to
6298 specify one without the other (cf. fixities
6299 <xref linkend="infix-tycons"/>).</para>
6302 Warnings and deprecations are not reported for
6303 (a) uses within the defining module, and
6304 (b) uses in an export list.
6305 The latter reduces spurious complaints within a library
6306 in which one module gathers together and re-exports
6307 the exports of several others.
6309 <para>You can suppress the warnings with the flag
6310 <option>-fno-warn-warnings-deprecations</option>.</para>
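<para>
For illustration, here is a sketch of per-entity pragmas as they might appear
in a small library module. All the function names are invented for the
example.
</para>
<programlisting>
-- oldArea is kept for compatibility; importing modules that use it
-- get a deprecation message pointing them at circleArea.
{-# DEPRECATED oldArea "Use circleArea instead" #-}
oldArea :: Double -> Double
oldArea = circleArea

circleArea :: Double -> Double
circleArea r = pi * r * r

-- A WARNING pragma carries an arbitrary message rather than
-- marking the function as deprecated.
{-# WARNING unsafeHead "Partial: fails on the empty list" #-}
unsafeHead :: [a] -> a
unsafeHead (x : _) = x
unsafeHead []      = error "unsafeHead: empty list"
</programlisting>
<para>
Uses of <literal>oldArea</literal> or <literal>unsafeHead</literal> inside
this module itself are not reported, in line with rule (a) above; only
modules that import and use them see the messages.
</para>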
6313 <sect2 id="inline-noinline-pragma">
6314 <title>INLINE and NOINLINE pragmas</title>
6316 <para>These pragmas control the inlining of function
6319 <sect3 id="inline-pragma">
6320 <title>INLINE pragma</title>
6321 <indexterm><primary>INLINE</primary></indexterm>
6323 <para>GHC (with <option>-O</option>, as always) tries to
6324 inline (or “unfold”) functions/values that are
6325 “small enough,” thus avoiding the call overhead
6326 and possibly exposing other more-wonderful optimisations.
6327 Normally, if GHC decides a function is “too
6328 expensive” to inline, it will not do so, nor will it
6329 export that unfolding for other modules to use.</para>
6331 <para>The sledgehammer you can bring to bear is the
6332 <literal>INLINE</literal><indexterm><primary>INLINE
6333 pragma</primary></indexterm> pragma, used thusly:</para>
6336 key_function :: Int -> String -> (Bool, Double)
6337 {-# INLINE key_function #-}
6340 <para>The major effect of an <literal>INLINE</literal> pragma
6341 is to declare a function's “cost” to be very low.
6342 The normal unfolding machinery will then be very keen to
6343 inline it. However, an <literal>INLINE</literal> pragma for a
6344 function "<literal>f</literal>" has a number of other effects:
6347 No functions are inlined into <literal>f</literal>. Otherwise
6348 GHC might inline a big function into <literal>f</literal>'s right hand side,
6349 making <literal>f</literal> big; and then inline <literal>f</literal> blindly.
6352 The float-in, float-out, and common-sub-expression transformations are not
6353 applied to the body of <literal>f</literal>.
6356 An INLINE function is not worker/wrappered by strictness analysis.
6357 It's going to be inlined wholesale instead.
6360 All of these effects are aimed at ensuring that what gets inlined is
6361 exactly what you asked for, no more and no less.
6363 <para>GHC ensures that inlining cannot go on forever: every mutually-recursive
6364 group is cut by one or more <emphasis>loop breakers</emphasis> that are never inlined
6365 (see <ulink url="http://research.microsoft.com/%7Esimonpj/Papers/inlining/index.htm">
6366 Secrets of the GHC inliner, JFP 12(4) July 2002</ulink>).
6367 GHC tries not to select a function with an INLINE pragma as a loop breaker, but
6368 when there is no choice even an INLINE function can be selected, in which case
6369 the INLINE pragma is ignored.
6370 For example, for a self-recursive function, the loop breaker can only be the function
6371 itself, so an INLINE pragma is always ignored.</para>
6373 <para>Syntactically, an <literal>INLINE</literal> pragma for a
6374 function can be put anywhere its type signature could be
6377 <para><literal>INLINE</literal> pragmas are a particularly
6379 <literal>then</literal>/<literal>return</literal> (or
6380 <literal>bind</literal>/<literal>unit</literal>) functions in
6381 a monad. For example, in GHC's own
6382 <literal>UniqueSupply</literal> monad code, we have:</para>
6385 {-# INLINE thenUs #-}
6386 {-# INLINE returnUs #-}
6389 <para>See also the <literal>NOINLINE</literal> pragma (<xref
6390 linkend="noinline-pragma"/>).</para>
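<para>
To make the then/return case concrete, here is a minimal sketch of a
counter-passing computation with <literal>INLINE</literal> pragmas on its two
combinators. It is loosely modelled on the style described above; the type
<literal>Counter</literal> and all the function names are hypothetical and are
not GHC's own code.
</para>
<programlisting>
newtype Counter a = Counter { runCounter :: Int -> (a, Int) }

-- Sequencing: run m, then pass its result and the new counter on to k.
{-# INLINE thenC #-}
thenC :: Counter a -> (a -> Counter b) -> Counter b
thenC (Counter m) k = Counter $ \ n ->
  case m n of
    (a, n') -> runCounter (k a) n'

-- Return a value without touching the counter.
{-# INLINE returnC #-}
returnC :: a -> Counter a
returnC a = Counter $ \ n -> (a, n)

-- Hand out the current counter value and increment it.
tick :: Counter Int
tick = Counter $ \ n -> (n, n + 1)

twoTicks :: Counter (Int, Int)
twoTicks = tick `thenC` \ a -> tick `thenC` \ b -> returnC (a, b)
</programlisting>
<para>
Because <literal>thenC</literal> and <literal>returnC</literal> are small and
non-recursive, inlining them at each use gives the simplifier a chance to
remove the intermediate closures and pairs in code such as
<literal>twoTicks</literal>.
</para>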
6392 <para>Note: the HBC compiler doesn't like <literal>INLINE</literal> pragmas,
6393 so if you want your code to be HBC-compatible you'll have to surround
6394 the pragma with C pre-processor directives
6395 <literal>#ifdef __GLASGOW_HASKELL__</literal>...<literal>#endif</literal>.</para>
6399 <sect3 id="noinline-pragma">
6400 <title>NOINLINE pragma</title>
6402 <indexterm><primary>NOINLINE</primary></indexterm>
6403 <indexterm><primary>NOTINLINE</primary></indexterm>
6405 <para>The <literal>NOINLINE</literal> pragma does exactly what
6406 you'd expect: it stops the named function from being inlined
6407 by the compiler. You shouldn't ever need to do this, unless
6408 you're very cautious about code size.</para>
6410 <para><literal>NOTINLINE</literal> is a synonym for
6411 <literal>NOINLINE</literal> (<literal>NOINLINE</literal> is
6412 specified by Haskell 98 as the standard way to disable
6413 inlining, so it should be used if you want your code to be
6417 <sect3 id="phase-control">
6418 <title>Phase control</title>
6420 <para> Sometimes you want to control exactly when in GHC's
6421 pipeline the INLINE pragma is switched on. Inlining happens
6422 only during runs of the <emphasis>simplifier</emphasis>. Each
6423 run of the simplifier has a different <emphasis>phase
6424 number</emphasis>; the phase number decreases towards zero.
6425 If you use <option>-dverbose-core2core</option> you'll see the
6426 sequence of phase numbers for successive runs of the
6427 simplifier. In an INLINE pragma you can optionally specify a
6431 <para>"<literal>INLINE[k] f</literal>" means: do not inline
6432 <literal>f</literal>
6433 until phase <literal>k</literal>, but from phase
6434 <literal>k</literal> onwards be very keen to inline it.
6437 <para>"<literal>INLINE[~k] f</literal>" means: be very keen to inline
6438 <literal>f</literal>
6439 until phase <literal>k</literal>, but from phase
6440 <literal>k</literal> onwards do not inline it.
6443 <para>"<literal>NOINLINE[k] f</literal>" means: do not inline
6444 <literal>f</literal>
6445 until phase <literal>k</literal>, but from phase
6446 <literal>k</literal> onwards be willing to inline it (as if
6447 there was no pragma).
6450 <para>"<literal>NOINLINE[~k] f</literal>" means: be willing to inline
6451 <literal>f</literal>
6452 until phase <literal>k</literal>, but from phase
6453 <literal>k</literal> onwards do not inline it.
6456 The same information is summarised here:
6458                           -- Before phase 2     Phase 2 and later
6459 {-# INLINE   [2]  f #-}   --      No                 Yes
6460 {-# INLINE   [~2] f #-}   --      Yes                No
6461 {-# NOINLINE [2]  f #-}   --      No                 Maybe
6462 {-# NOINLINE [~2] f #-}   --      Maybe              No
6464 {-# INLINE   f #-}        --      Yes                Yes
6465 {-# NOINLINE f #-}        --      No                 No
6467 By "Maybe" we mean that the usual heuristic inlining rules apply (if the
6468 function body is small, or it is applied to interesting-looking arguments etc).
6469 Another way to understand the semantics is this:
6471 <listitem><para>For both INLINE and NOINLINE, the phase number says
6472 when inlining is allowed at all.</para></listitem>
6473 <listitem><para>The INLINE pragma has the additional effect of making the
6474 function body look small, so that when inlining is allowed it is very likely to
6479 <para>The same phase-numbering control is available for RULES
6480 (<xref linkend="rewrite-rules"/>).</para>
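<para>
A common reason to specify a phase is to keep a function from being inlined
until a rewrite rule that mentions it has had a chance to fire. The sketch
below uses two invented functions, <literal>double</literal> and
<literal>half</literal>, and a hypothetical rule; it is meant only to show
where the phase number goes.
</para>
<programlisting>
-- Do not inline double or half before phase 1, so that calls to them
-- are still visible to the rule in the earlier simplifier runs.
{-# NOINLINE [1] double #-}
double :: Int -> Int
double x = x + x

{-# NOINLINE [1] half #-}
half :: Int -> Int
half x = x `div` 2

-- Halving a doubled number gives the number back
-- (ignoring arithmetic overflow in double).
{-# RULES "half/double" forall x. half (double x) = x #-}

roundTrip :: Int -> Int
roundTrip n = half (double n)
</programlisting>
<para>
The rule is only applied when optimisation is on; for inputs that do not
overflow, <literal>roundTrip</literal> returns its argument either way.
</para>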
6484 <sect2 id="line-pragma">
6485 <title>LINE pragma</title>
6487 <indexterm><primary>LINE</primary><secondary>pragma</secondary></indexterm>
6488 <indexterm><primary>pragma</primary><secondary>LINE</secondary></indexterm>
6489 <para>This pragma is similar to C's <literal>#line</literal>
6490 directive, and is mainly for use in automatically generated Haskell
6491 code. It lets you specify the line number and filename of the
6492 original code; for example</para>
6494 <programlisting>{-# LINE 42 "Foo.vhs" #-}</programlisting>
6496 <para>if you'd generated the current file from something called
6497 <filename>Foo.vhs</filename> and this line corresponds to line
6498 42 in the original. GHC will adjust its error messages to refer
6499 to the line/file named in the <literal>LINE</literal>
6504 <title>RULES pragma</title>
6506 <para>The RULES pragma lets you specify rewrite rules. It is
6507 described in <xref linkend="rewrite-rules"/>.</para>
6510 <sect2 id="specialize-pragma">
6511 <title>SPECIALIZE pragma</title>
6513 <indexterm><primary>SPECIALIZE pragma</primary></indexterm>
6514 <indexterm><primary>pragma, SPECIALIZE</primary></indexterm>
6515 <indexterm><primary>overloading, death to</primary></indexterm>
6517 <para>(UK spelling also accepted.) For key overloaded
6518 functions, you can create extra versions (NB: more code space)
6519 specialised to particular types. Thus, if you have an
6520 overloaded function:</para>
6523 hammeredLookup :: Ord key => [(key, value)] -> key -> value
6526 <para>If it is heavily used on lists with
6527 <literal>Widget</literal> keys, you could specialise it as
6531 {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
6534 <para>A <literal>SPECIALIZE</literal> pragma for a function can
6535 be put anywhere its type signature could be put.</para>
6537 <para>A <literal>SPECIALIZE</literal> has the effect of generating
6538 (a) a specialised version of the function and (b) a rewrite rule
6539 (see <xref linkend="rewrite-rules"/>) that rewrites a call to the
6540 un-specialised function into a call to the specialised one.</para>
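<para>
Putting the pieces together, a complete definition might look like the
following sketch. Only the type signature and the pragma come from the text
above; the body of <literal>hammeredLookup</literal> and the definition of
<literal>Widget</literal> are assumptions made so that the example compiles.
</para>
<programlisting>
-- Hypothetical key type; any type with an Ord instance would do.
newtype Widget = Widget Int
  deriving (Eq, Ord)

-- Assumed body: a plain association-list lookup that fails
-- loudly when the key is absent.
hammeredLookup :: Ord key => [(key, value)] -> key -> value
hammeredLookup table k =
  case lookup k table of
    Just v  -> v
    Nothing -> error "hammeredLookup: key not found"
{-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
</programlisting>
<para>
Callers do not change: they keep calling <literal>hammeredLookup</literal>,
and with optimisation on the generated rule redirects calls at type
<literal>Widget</literal> to the specialised copy.
</para>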
6542 <para>The type in a SPECIALIZE pragma can be any type that is less
6543 polymorphic than the type of the original function. In concrete terms,
6544 if the original function is <literal>f</literal> then the pragma
6546 {-# SPECIALIZE f :: <type> #-}
6548 is valid if and only if the definition
6550 f_spec :: <type>
6553 is valid. Here are some examples (where we only give the type signature
6554 for the original function, not its code):
6556 f :: Eq a => a -> b -> b
6557 {-# SPECIALISE f :: Int -> b -> b #-}
6559 g :: (Eq a, Ix b) => a -> b -> b
6560 {-# SPECIALISE g :: (Eq a) => a -> Int -> Int #-}
6562 h :: Eq a => a -> a -> a
6563 {-# SPECIALISE h :: (Eq a) => [a] -> [a] -> [a] #-}
6565 The last of these examples will generate a
6566 RULE with a somewhat-complex left-hand side (try it yourself), so it might not fire very
6567 well. If you use this kind of specialisation, let us know how well it works.
6570 <para>A <literal>SPECIALIZE</literal> pragma can optionally be followed by an
6571 <literal>INLINE</literal> or <literal>NOINLINE</literal> pragma, optionally
6572 followed by a phase, as described in <xref linkend="inline-noinline-pragma"/>.
6573 The <literal>INLINE</literal> pragma affects the specialised version of the
6574 function (only), and applies even if the function is recursive. The motivating
6577 -- A GADT for arrays with type-indexed representation
6579 ArrInt :: !Int -> ByteArray# -> Arr Int
6580 ArrPair :: !Int -> Arr e1 -> Arr e2 -> Arr (e1, e2)
6582 (!:) :: Arr e -> Int -> e
6583 {-# SPECIALISE INLINE (!:) :: Arr Int -> Int -> Int #-}
6584 {-# SPECIALISE INLINE (!:) :: Arr (a, b) -> Int -> (a, b) #-}
6585 (ArrInt _ ba) !: (I# i) = I# (indexIntArray# ba i)
6586 (ArrPair _ a1 a2) !: i = (a1 !: i, a2 !: i)
6588 Here, <literal>(!:)</literal> is a recursive function that indexes arrays
6589 of type <literal>Arr e</literal>. Consider a call to <literal>(!:)</literal>
6590 at type <literal>(Int,Int)</literal>. The second specialisation will fire, and
6591 the specialised function will be inlined. It has two calls to
6592 <literal>(!:)</literal>,
6593 both at type <literal>Int</literal>. Both these calls fire the first
6594 specialisation, whose body is also inlined. The result is a type-based
6595 unrolling of the indexing function.</para>
6596 <para>Warning: you can make GHC diverge by using <literal>SPECIALISE INLINE</literal>
6597 on an ordinarily-recursive function.</para>
6599 <para>Note: In earlier versions of GHC, it was possible to provide your own
6600 specialised function for a given type:
6603 {-# SPECIALIZE hammeredLookup :: [(Int, value)] -> Int -> value = intLookup #-}
6606 This feature has been removed, as it is now subsumed by the
6607 <literal>RULES</literal> pragma (see <xref linkend="rule-spec"/>).</para>
6611 <sect2 id="specialize-instance-pragma">
6612 <title>SPECIALIZE instance pragma
6616 <indexterm><primary>SPECIALIZE pragma</primary></indexterm>
6617 <indexterm><primary>overloading, death to</primary></indexterm>
6618 Same idea, except for instance declarations. For example:
6621 instance (Eq a) => Eq (Foo a) where {
6622 {-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
6626 The pragma must occur inside the <literal>where</literal> part
6627 of the instance declaration.
6630 Compatible with HBC, by the way, except perhaps in the placement
6636 <sect2 id="unpack-pragma">
6637 <title>UNPACK pragma</title>
6639 <indexterm><primary>UNPACK</primary></indexterm>
6641 <para>The <literal>UNPACK</literal> pragma indicates to the compiler
6642 that it should unpack the contents of a constructor field into
6643 the constructor itself, removing a level of indirection. For
6647 data T = T {-# UNPACK #-} !Float
6648 {-# UNPACK #-} !Float
6651 <para>will create a constructor <literal>T</literal> containing
6652 two unboxed floats. This may not always be an optimisation: if
6653 the <function>T</function> constructor is scrutinised and the
6654 floats passed to a non-strict function for example, they will
6655 have to be reboxed (this is done automatically by the
6658 <para>Unpacking constructor fields should only be used in
6659 conjunction with <option>-O</option>, in order to expose
6660 unfoldings to the compiler so the reboxing can be removed as
6661 often as possible. For example:</para>
6665 f (T f1 f2) = f1 + f2
6668 <para>The compiler will avoid reboxing <function>f1</function>
6669 and <function>f2</function> by inlining <function>+</function>
6670 on floats, but only when <option>-O</option> is on.</para>
6672 <para>Any single-constructor data type is eligible for unpacking; for
6676 data T = T {-# UNPACK #-} !(Int,Int)
6679 <para>will store the two <literal>Int</literal>s directly in the
6680 <function>T</function> constructor, by flattening the pair.
6681 Multi-level unpacking is also supported:
6684 data T = T {-# UNPACK #-} !S
6685 data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int
6688 will store two unboxed <literal>Int#</literal>s
6689 directly in the <function>T</function> constructor. The
6690 unpacker can see through newtypes, too.</para>
6692 <para>If a field cannot be unpacked, you will not get a warning,
6693 so it might be a good idea to check the generated code with
6694 <option>-ddump-simpl</option>.</para>
6696 <para>See also the <option>-funbox-strict-fields</option> flag,
6697 which essentially has the effect of adding
6698 <literal>{-# UNPACK #-}</literal> to every strict
6699 constructor field.</para>
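<para>
As a combined sketch of the points above, the following hypothetical types use
<literal>UNPACK</literal> at two levels, so that a
<literal>Segment</literal> can hold its four coordinates directly. The type
and function names are invented for the example.
</para>
<programlisting>
-- Each Point stores its two coordinates unboxed.
data Point = Point {-# UNPACK #-} !Double
                   {-# UNPACK #-} !Double

-- Multi-level unpacking: the two Points are flattened into Segment.
data Segment = Segment {-# UNPACK #-} !Point
                       {-# UNPACK #-} !Point

-- Pattern matching looks the same as for ordinary boxed fields.
segmentLength :: Segment -> Double
segmentLength (Segment (Point x1 y1) (Point x2 y2)) =
  sqrt ((x2 - x1) ^ (2 :: Int) + (y2 - y1) ^ (2 :: Int))
</programlisting>
<para>
Whether the fields were actually unpacked cannot be seen from the source, so
checking the output of <option>-ddump-simpl</option> remains the way to
confirm it.
</para>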
6702 <sect2 id="source-pragma">
6703 <title>SOURCE pragma</title>
6705 <indexterm><primary>SOURCE</primary></indexterm>
6706 <para>The <literal>{-# SOURCE #-}</literal> pragma is used only in <literal>import</literal> declarations,
6707 to break a module loop. It is described in detail in <xref linkend="mutual-recursion"/>.
6713 <!-- ======================= REWRITE RULES ======================== -->
6715 <sect1 id="rewrite-rules">
6716 <title>Rewrite rules
6718 <indexterm><primary>RULES pragma</primary></indexterm>
6719 <indexterm><primary>pragma, RULES</primary></indexterm>
6720 <indexterm><primary>rewrite rules</primary></indexterm></title>
6723 The programmer can specify rewrite rules as part of the source program
6729 "map/map" forall f g xs. map f (map g xs) = map (f.g) xs
6734 Use the debug flag <option>-ddump-simpl-stats</option> to see what rules fired.
6735 If you need more information, then <option>-ddump-rule-firings</option> shows you
6736 each individual rule firing in detail.
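As a small self-contained sketch (the function <function>mapL</function> is a hypothetical stand-in for <function>map</function>, used so that the example does not depend on how the library's own rules are set up), a module like the following can be compiled with <option>-O -ddump-simpl-stats</option> to check whether the rule fired:

```haskell
module Main where

-- Hypothetical list map, kept out of line so that the rule can match calls to it.
mapL :: (a -> b) -> [a] -> [b]
mapL _ []       = []
mapL f (x : xs) = f x : mapL f xs
{-# NOINLINE mapL #-}

{-# RULES "mapL/mapL" forall f g xs. mapL f (mapL g xs) = mapL (f . g) xs #-}

-- If the rule is enabled and fires, the two traversals become one.
-- The result is the same either way, because the rule preserves meaning.
doubleThenInc :: [Int] -> [Int]
doubleThenInc xs = mapL (+ 1) (mapL (* 2) xs)

main :: IO ()
main = print (doubleThenInc [1, 2, 3])
```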
6740 <title>Syntax</title>
6743 From a syntactic point of view:
6749 There may be zero or more rules in a <literal>RULES</literal> pragma, separated by semicolons (which
6750 may be generated by the layout rule).
6756 The layout rule applies in a pragma.
6757 Currently no new indentation level
6758 is set, so if you put several rules in a single RULES pragma and wish to use layout to separate them,
6759 you must lay out the rules starting in the same column as the enclosing definitions.
6762 "map/map" forall f g xs. map f (map g xs) = map (f.g) xs
6763 "map/append" forall f xs ys. map f (xs ++ ys) = map f xs ++ map f ys
6766 Furthermore, the closing <literal>#-}</literal>
6767 should start in a column to the right of the opening <literal>{-#</literal>.
6773 Each rule has a name, enclosed in double quotes. The name itself has
6774 no significance at all. It is only used when reporting how many times the rule fired.
6780 A rule may optionally have a phase-control number (see <xref linkend="phase-control"/>),
6781 immediately after the name of the rule. Thus:
6784 "map/map" [2] forall f g xs. map f (map g xs) = map (f.g) xs
6787 The "[2]" means that the rule is active in Phase 2 and subsequent phases. The inverse
6788 notation "[~2]" is also accepted, meaning that the rule is active up to, but not including,
6797 Each variable mentioned in a rule must either be in scope (e.g. <function>map</function>),
6798 or bound by the <literal>forall</literal> (e.g. <function>f</function>, <function>g</function>, <function>xs</function>). The variables bound by
6799 the <literal>forall</literal> are called the <emphasis>pattern</emphasis> variables. They are separated
6800 by spaces, just like in a type <literal>forall</literal>.
6806 A pattern variable may optionally have a type signature.
6807 If the type of the pattern variable is polymorphic, it <emphasis>must</emphasis> have a type signature.
6808 For example, here is the <literal>foldr/build</literal> rule:
6811 "fold/build" forall k z (g::forall b. (a->b->b) -> b -> b) .
6812 foldr k z (build g) = g k z
6815 Since <function>g</function> has a polymorphic type, it must have a type signature.
6822 The left hand side of a rule must consist of a top-level variable applied
6823 to arbitrary expressions. For example, this is <emphasis>not</emphasis> OK:
6826 "wrong1" forall e1 e2. case True of { True -> e1; False -> e2 } = e1
6827 "wrong2" forall f. f True = True
6830 In <literal>"wrong1"</literal>, the LHS is not an application; in <literal>"wrong2"</literal>, the LHS has a pattern variable
6837 A rule does not need to be in the same module as (any of) the
6838 variables it mentions, though of course they need to be in scope.
6844 All rules are implicitly exported from the module, and are therefore
6845 in force in any module that imports the module that defined the rule, directly
6846 or indirectly. (That is, if A imports B, which imports C, then C's rules are
6847 in force when compiling A.) The situation is very similar to that for instance
6855 Inside a RULE "<literal>forall</literal>" is treated as a keyword, regardless of
6856 any other flag settings. Furthermore, inside a RULE, the language extension
6857 <option>-XScopedTypeVariables</option> is automatically enabled; see
6858 <xref linkend="scoped-type-variables"/>.
6864 Like other pragmas, RULE pragmas are always checked for scope errors, and
6865 are typechecked. Typechecking means that the LHS and RHS of a rule are typechecked,
6866 and must have the same type. However, rules are only <emphasis>enabled</emphasis>
6867 if the <option>-fenable-rewrite-rules</option> flag is
6868 on (see <xref linkend="rule-semantics"/>).
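The syntactic points in this list can be pulled together in a short sketch. The functions <function>scale</function> and <function>twice</function> are hypothetical, and the phase number is chosen arbitrarily; the first rule carries a phase-control number and the second gives a type signature to a pattern variable:

```haskell
module Main where

-- Hypothetical functions, present only to give the rules something to match.
scale :: Int -> Int -> Int
scale k x = k * x
{-# NOINLINE scale #-}

twice :: (a -> a) -> a -> a
twice f x = f (f x)
{-# NOINLINE twice #-}

-- Two rules in one pragma, separated by layout: each rule starts in the
-- same column as the enclosing definitions, and the closing bracket is
-- indented to the right of the opening one.
{-# RULES
"scale/scale" [2] forall j k x.       scale j (scale k x) = scale (j * k) x
"twice/id"        forall (x :: Int).  twice id x = x
  #-}

main :: IO ()
main = print (scale 2 (scale 3 4), twice id (5 :: Int))
```

Both rules preserve meaning, so the program prints the same result whether or not they fire.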
6877 <sect2 id="rule-semantics">
6878 <title>Semantics</title>
6881 From a semantic point of view:
6886 Rules are enabled (that is, used during optimisation)
6887 by the <option>-fenable-rewrite-rules</option> flag.
6888 This flag is implied by <option>-O</option>, and may be switched
6889 off (as usual) by <option>-fno-enable-rewrite-rules</option>.
6890 (NB: enabling <option>-fenable-rewrite-rules</option> without <option>-O</option>
6891 may not do what you expect, though, because without <option>-O</option> GHC
6892 ignores all optimisation information in interface files;
6893 see <option>-fignore-interface-pragmas</option>, <xref linkend="options-f"/>.)
6894 Note that <option>-fenable-rewrite-rules</option> is an <emphasis>optimisation</emphasis> flag, and
6895 has no effect on parsing or typechecking.
6901 Rules are regarded as left-to-right rewrite rules.
6902 When GHC finds an expression that is a substitution instance of the LHS
6903 of a rule, it replaces the expression by the (appropriately-substituted) RHS.
6904 By "a substitution instance" we mean that the LHS can be made equal to the
6905 expression by substituting for the pattern variables.
6912 GHC makes absolutely no attempt to verify that the LHS and RHS
6913 of a rule have the same meaning. That is undecidable in general, and
6914 infeasible in most interesting cases. The responsibility is entirely the programmer's!
6921 GHC makes no attempt to make sure that the rules are confluent or
6922 terminating. For example:
6925 "loop" forall x y. f x y = f y x
6928 This rule will cause the compiler to go into an infinite loop.
6935 If more than one rule matches a call, GHC will choose one arbitrarily to apply.
6941 GHC currently uses a very simple, syntactic, matching algorithm
6942 for matching a rule LHS with an expression. It seeks a substitution
6943 which makes the LHS and expression syntactically equal modulo alpha
6944 conversion. The pattern (rule), but not the expression, is eta-expanded if
6945 necessary. (Eta-expanding the expression can lead to laziness bugs.)
6946 Beta conversion is not performed (that would be higher-order matching).
6950 Matching is carried out on GHC's intermediate language, which includes
6951 type abstractions and applications. So a rule only matches if the
6952 types match too. See <xref linkend="rule-spec"/> below.
6958 GHC keeps trying to apply the rules as it optimises the program.
6959 For example, consider:
6968 The expression <literal>s (t xs)</literal> does not match the rule <literal>"map/map"</literal>, but GHC
6969 will substitute for <varname>s</varname> and <varname>t</varname>, giving an expression which does match.
6970 If <varname>s</varname> or <varname>t</varname> was (a) used more than once, and (b) large or a redex, then it would
6971 not be substituted, and the rule would not fire.
6978 Ordinary inlining happens at the same time as rule rewriting, which may lead to unexpected
6979 results. Consider this (artificial) example
6982 {-# RULES "f" f True = False #-}
6988 Since <literal>f</literal>'s right-hand side is small, it is inlined into <literal>g</literal>,
6993 Now <literal>g</literal> is inlined into <literal>h</literal>, but <literal>f</literal>'s RULE has
6995 If instead GHC had first inlined <literal>g</literal> into <literal>h</literal> then there
6996 would have been a better chance that <literal>f</literal>'s RULE might fire.
6999 The way to get predictable behaviour is to use a NOINLINE
7000 pragma on <literal>f</literal>, to ensure
7001 that it is not inlined until its RULEs have had a chance to fire.
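A sketch of that fix, using the same artificial <literal>f</literal>, <literal>g</literal> and <literal>h</literal> with type signatures added as an assumption. Note that this rule deliberately changes the meaning of <literal>f True</literal>, so what the program prints depends on whether the rule fires:

```haskell
module Main where

f :: Bool -> Bool
f x = x
-- Keep f out of line so that calls to it survive long enough to be
-- matched against the rule below.
{-# NOINLINE f #-}
{-# RULES "f" f True = False #-}

g :: Bool -> Bool
g y = f y

-- Once g is inlined here, the call becomes f True, which the rule can match.
h :: Int -> Bool
h _ = g True

main :: IO ()
main = print (h 0)
```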
7011 <title>List fusion</title>
7014 The RULES mechanism is used to implement fusion (deforestation) of common list functions.
7015 If a "good consumer" consumes an intermediate list constructed by a "good producer", the
7016 intermediate list should be eliminated entirely.
7020 The following are good producers:
7032 Enumerations of <literal>Int</literal> and <literal>Char</literal> (e.g. <literal>['a'..'z']</literal>).
7038 Explicit lists (e.g. <literal>[True, False]</literal>)
7044 The cons constructor (e.g. <literal>3:4:[]</literal>)
7050 <function>++</function>
7056 <function>map</function>
7062 <function>take</function>, <function>filter</function>
7068 <function>iterate</function>, <function>repeat</function>
7074 <function>zip</function>, <function>zipWith</function>
7083 The following are good consumers:
7095 <function>array</function> (on its second argument)
7101 <function>++</function> (on its first argument)
7107 <function>foldr</function>
7113 <function>map</function>
7119 <function>take</function>, <function>filter</function>
7125 <function>concat</function>
7131 <function>unzip</function>, <function>unzip2</function>, <function>unzip3</function>, <function>unzip4</function>
7137 <function>zip</function>, <function>zipWith</function> (but on one argument only; if both are good producers, <function>zip</function>
7138 will fuse with one but not the other)
7144 <function>partition</function>
7150 <function>head</function>
7156 <function>and</function>, <function>or</function>, <function>any</function>, <function>all</function>
7162 <function>sequence_</function>
7168 <function>msum</function>
7174 <function>sortBy</function>
7183 So, for example, the following should generate no intermediate lists:
7186 array (1,10) [(i,i*i) | i <- map (+ 1) [0..9]]
7192 This list could readily be extended; if there are Prelude functions that you use
7193 a lot which are not included, please tell us.
7197 If you want to write your own good consumers or producers, look at the
7198 Prelude definitions of the above functions to see how to do so.
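As a rough sketch of a hand-written good producer, the following uses <function>build</function> as exported by <literal>GHC.Exts</literal>. The functions <function>upTo</function> and <function>sumUpTo</function> are hypothetical, and whether the intermediate list is really eliminated depends on the GHC version and on compiling with <option>-O</option>:

```haskell
module Main where

import GHC.Exts (build)

-- Hypothetical producer: the list [1 .. n], written via build so that
-- the foldr/build rule has a chance to fuse it with a good consumer.
upTo :: Int -> [Int]
upTo n = build (\cons nil ->
  let go i = if i > n then nil else cons i (go (i + 1))
  in  go 1)
{-# INLINE upTo #-}

-- foldr is a good consumer, so after fusion no list need be allocated.
sumUpTo :: Int -> Int
sumUpTo n = foldr (+) 0 (upTo n)

main :: IO ()
main = print (sumUpTo 10)
```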
7203 <sect2 id="rule-spec">
7204 <title>Specialisation
7208 Rewrite rules can be used to get the same effect as a feature
7209 present in earlier versions of GHC.
7210 For example, suppose that:
7213 genericLookup :: Ord a => Table a b -> a -> b
7214 intLookup :: Table Int b -> Int -> b
7217 where <function>intLookup</function> is an implementation of
7218 <function>genericLookup</function> that works very fast for
7219 keys of type <literal>Int</literal>. You might wish
7220 to tell GHC to use <function>intLookup</function> instead of
7221 <function>genericLookup</function> whenever the latter was called with
7222 type <literal>Table Int b -> Int -> b</literal>.
7223 It used to be possible to write
7226 {-# SPECIALIZE genericLookup :: Table Int b -> Int -> b = intLookup #-}
7229 This feature is no longer in GHC, but rewrite rules let you do the same thing:
7232 {-# RULES "genericLookup/Int" genericLookup = intLookup #-}
7235 This slightly odd-looking rule instructs GHC to replace
7236 <function>genericLookup</function> by <function>intLookup</function>
7237 <emphasis>whenever the types match</emphasis>.
7238 What is more, this rule does not need to be in the same
7239 file as <function>genericLookup</function>, unlike the
7240 <literal>SPECIALIZE</literal> pragmas which currently do (so that they
7241 have an original definition available to specialise).
7244 <para>It is <emphasis>Your Responsibility</emphasis> to make sure that
7245 <function>intLookup</function> really behaves as a specialised version
7246 of <function>genericLookup</function>!!!</para>
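<para>A minimal, self-contained sketch of this technique follows. The <literal>Table</literal> type and both lookup functions are hypothetical (an association list is assumed, and the result is wrapped in <literal>Maybe</literal>, unlike the signatures above), so treat it as an illustration rather than a real library interface:</para>

```haskell
module Main where

-- Hypothetical table type: a plain association list.
newtype Table k v = Table [(k, v)]

-- General lookup, usable at any ordered key type.
genericLookup :: Ord k => Table k v -> k -> Maybe v
genericLookup (Table kvs) k = lookup k kvs
{-# NOINLINE genericLookup #-}

-- Hand-written version for Int keys. It must agree with genericLookup;
-- GHC makes no attempt to check that.
intLookup :: Table Int v -> Int -> Maybe v
intLookup (Table kvs) k = go kvs
  where
    go [] = Nothing
    go ((k', v) : rest)
      | k' == k   = Just v
      | otherwise = go rest

-- Replace genericLookup by intLookup wherever the types match.
{-# RULES "genericLookup/Int" genericLookup = intLookup #-}

main :: IO ()
main = print (genericLookup (Table [(1, "a"), (2, "b")]) (2 :: Int))
```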
7248 <para>An example in which using <literal>RULES</literal> for
7249 specialisation will Win Big:
7252 toDouble :: Real a => a -> Double
7253 toDouble = fromRational . toRational
7255 {-# RULES "toDouble/Int" toDouble = i2d #-}
7256 i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
7259 The <function>i2d</function> function is virtually one machine
7260 instruction; the default conversion—via an intermediate
7261 <literal>Rational</literal>—is obscenely expensive by
7268 <title>Controlling what's going on</title>
7276 Use <option>-ddump-rules</option> to see what transformation rules GHC is using.
7282 Use <option>-ddump-simpl-stats</option> to see what rules are being fired.
7283 If you add <option>-dppr-debug</option> you get a more detailed listing.
7289 The definition of (say) <function>build</function> in <filename>GHC/Base.lhs</filename> looks like this:
7292 build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
7293 {-# INLINE build #-}
7297 Notice the <literal>INLINE</literal>! That prevents <literal>(:)</literal> from being inlined when compiling
7298 <literal>GHC.Base</literal>, so that an importing module will “see” the <literal>(:)</literal>, and can
7299 match it on the LHS of a rule. <literal>INLINE</literal> prevents any inlining happening
7300 in the RHS of the <literal>INLINE</literal> thing. I regret the delicacy of this.
7307 In <filename>libraries/base/GHC/Base.lhs</filename> look at the rules for <function>map</function> to
7308 see how to write rules that will do fusion and yet give an efficient
7309 program even if fusion doesn't happen. More rules in <filename>GHC/List.lhs</filename>.
7319 <sect2 id="core-pragma">
7320 <title>CORE pragma</title>
7322 <indexterm><primary>CORE pragma</primary></indexterm>
7323 <indexterm><primary>pragma, CORE</primary></indexterm>
7324 <indexterm><primary>core, annotation</primary></indexterm>
7327 The external core format supports <quote>Note</quote> annotations;
7328 the <literal>CORE</literal> pragma gives a way to specify what these
7329 should be in your Haskell source code. Syntactically, core
7330 annotations are attached to expressions and take a Haskell string
7331 literal as an argument. The following function definition shows an
7335 f x = ({-# CORE "foo" #-} show) ({-# CORE "bar" #-} x)
7338 Semantically, this is equivalent to:
7346 However, when external core is generated (via
7347 <option>-fext-core</option>), there will be Notes attached to the
7348 expressions <function>show</function> and <varname>x</varname>.
7349 The core function declaration for <function>f</function> is:
7353 f :: %forall a . GHCziShow.ZCTShow a ->
7354 a -> GHCziBase.ZMZN GHCziBase.Char =
7355 \ @ a (zddShow::GHCziShow.ZCTShow a) (eta::a) ->
7357 %case zddShow %of (tpl::GHCziShow.ZCTShow a)
7359 (tpl1::GHCziBase.Int ->
7361 GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha
7363 (tpl2::a -> GHCziBase.ZMZN GHCziBase.Char)
7364 (tpl3::GHCziBase.ZMZN a ->
7365 GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha
7373 Here, we can see that the function <function>show</function> (which
7374 has been expanded out to a case expression over the Show dictionary)
7375 has a <literal>%note</literal> attached to it, as does the
7376 expression <varname>eta</varname> (which used to be called
7377 <varname>x</varname>).
7384 <sect1 id="special-ids">
7385 <title>Special built-in functions</title>
7386 <para>GHC has a few built-in functions with special behaviour. These
7387 are now described in the module <ulink
7388 url="../libraries/base/GHC-Prim.html"><literal>GHC.Prim</literal></ulink>
7389 in the library documentation.</para>
7393 <sect1 id="generic-classes">
7394 <title>Generic classes</title>
7397 The ideas behind this extension are described in detail in "Derivable type classes",
7398 Ralf Hinze and Simon Peyton Jones, Haskell Workshop, Montreal Sept 2000, pp94-105.
7399 An example will give the idea:
7407 fromBin :: [Int] -> (a, [Int])
7409 toBin {| Unit |} Unit = []
7410 toBin {| a :+: b |} (Inl x) = 0 : toBin x
7411 toBin {| a :+: b |} (Inr y) = 1 : toBin y
7412 toBin {| a :*: b |} (x :*: y) = toBin x ++ toBin y
7414 fromBin {| Unit |} bs = (Unit, bs)
7415 fromBin {| a :+: b |} (0:bs) = (Inl x, bs') where (x,bs') = fromBin bs
7416 fromBin {| a :+: b |} (1:bs) = (Inr y, bs') where (y,bs') = fromBin bs
7417 fromBin {| a :*: b |} bs = (x :*: y, bs'') where (x,bs' ) = fromBin bs
7418 (y,bs'') = fromBin bs'
7421 This class declaration explains how <literal>toBin</literal> and <literal>fromBin</literal>
7422 work for arbitrary data types. They do so by giving cases for unit, product, and sum,
7423 which are defined thus in the library module <literal>Generics</literal>:
7427 data a :+: b = Inl a | Inr b
7428 data a :*: b = a :*: b
7431 Now you can make a data type into an instance of Bin like this:
7433 instance (Bin a, Bin b) => Bin (a,b)
7434 instance Bin a => Bin [a]
7436 That is, just leave off the "where" clause. Of course, you can put in the
7437 where clause and over-ride whichever methods you please.
7441 <title> Using generics </title>
7442 <para>To use generics you need to</para>
7445 <para>Use the flags <option>-fglasgow-exts</option> (to enable the extra syntax),
7446 <option>-XGenerics</option> (to generate extra per-data-type code),
7447 and <option>-package lang</option> (to make the <literal>Generics</literal> library
7451 <para>Import the module <literal>Generics</literal> from the
7452 <literal>lang</literal> package. This import brings into
7453 scope the data types <literal>Unit</literal>,
7454 <literal>:*:</literal>, and <literal>:+:</literal>. (You
7455 don't need this import if you don't mention these types
7456 explicitly; for example, if you are simply giving instance
7457 declarations.)</para>
7462 <sect2> <title> Changes wrt the paper </title>
7464 Note that the type constructors <literal>:+:</literal> and <literal>:*:</literal>
7465 can be written infix (indeed, you can now use
7466 any operator starting in a colon as an infix type constructor). Also note that
7467 the type constructors are not exactly as in the paper (Unit instead of 1, etc).
7468 Finally, note that the syntax of the type patterns in the class declaration
7469 uses "<literal>{|</literal>" and "<literal>|}</literal>" brackets; curly braces
7470 alone would be ambiguous when they appear on right hand sides (an extension we
7471 anticipate wanting).
7475 <sect2> <title>Terminology and restrictions</title>
7477 Terminology. A "generic default method" in a class declaration
7478 is one that is defined using type patterns as above.
7479 A "polymorphic default method" is a default method defined as in Haskell 98.
7480 A "generic class declaration" is a class declaration with at least one
7481 generic default method.
7489 Alas, we do not yet implement the stuff about constructor names and
7496 A generic class can have only one parameter; you can't have a generic
7497 multi-parameter class.
7503 A default method must be defined entirely using type patterns, or entirely
7504 without. So this is illegal:
7507 op :: a -> (a, Bool)
7508 op {| Unit |} Unit = (Unit, True)
7511 However it is perfectly OK for some methods of a generic class to have
7512 generic default methods and others to have polymorphic default methods.
7518 The type variable(s) in the type pattern for a generic method declaration
7519 scope over the right hand side. So this is legal (note the use of the type variable <literal>p</literal> in a type signature on the right hand side):
7523 op {| p :*: q |} (x :*: y) = op (x :: p)
7531 The type patterns in a generic default method must take one of the forms:
7537 where "a" and "b" are type variables. Furthermore, all the type patterns for
7538 a single type constructor (<literal>:*:</literal>, say) must be identical; they
7539 must use the same type variables. So this is illegal:
7543 op {| a :+: b |} (Inl x) = True
7544 op {| p :+: q |} (Inr y) = False
7546 The type patterns must be identical, even in equations for different methods of the class.
7547 So this too is illegal:
7551 op1 {| a :*: b |} (x :*: y) = True
7554 op2 {| p :*: q |} (x :*: y) = False
7556 (The reason for this restriction is that we gather all the equations for a particular type constructor
7557 into a single generic instance declaration.)
7563 A generic method declaration must give a case for each of the three type constructors.
7569 The type for a generic method can be built only from:
7571 <listitem> <para> Function arrows </para> </listitem>
7572 <listitem> <para> Type variables </para> </listitem>
7573 <listitem> <para> Tuples </para> </listitem>
7574 <listitem> <para> Arbitrary types not involving type variables </para> </listitem>
7576 Here are some example type signatures for generic methods:
7579 op2 :: Bool -> (a,Bool)
7580 op3 :: [Int] -> a -> a
7583 Here, op1, op2, op3 are OK, but op4 is rejected, because it has a type variable
7587 This restriction is an implementation restriction: we just haven't got around to
7588 implementing the necessary bidirectional maps over arbitrary type constructors.
7589 It would be relatively easy to add specific type constructors, such as Maybe and list,
7590 to the ones that are allowed.</para>
7595 In an instance declaration for a generic class, the idea is that the compiler
7596 will fill in the methods for you, based on the generic templates. However it can only
7601 The instance type is simple (a type constructor applied to type variables, as in Haskell 98).
7606 No constructor of the instance type has unboxed fields.
7610 (Of course, these things can only arise if you are already using GHC extensions.)
7611 However, you can still give instance declarations for types which break these rules,
7612 provided you give explicit code to override any generic default methods.
7620 The option <option>-ddump-deriv</option> dumps incomprehensible stuff giving details of
7621 what the compiler does with generic declarations.
7626 <sect2> <title> Another example </title>
7628 Just to finish with, here's another example I rather like:
7632 nCons {| Unit |} _ = 1
7633 nCons {| a :*: b |} _ = 1
7634 nCons {| a :+: b |} _ = nCons (bot::a) + nCons (bot::b)
7637 tag {| Unit |} _ = 1
7638 tag {| a :*: b |} _ = 1
7639 tag {| a :+: b |} (Inl x) = tag x
7640 tag {| a :+: b |} (Inr y) = nCons (bot::a) + tag y
7646 <sect1 id="monomorphism">
7647 <title>Control over monomorphism</title>
7649 <para>GHC supports two flags that control the way in which generalisation is
7650 carried out at let and where bindings.
7654 <title>Switching off the dreaded Monomorphism Restriction</title>
7655 <indexterm><primary><option>-XNoMonomorphismRestriction</option></primary></indexterm>
7657 <para>Haskell's monomorphism restriction (see
7658 <ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.5.5">Section
7660 of the Haskell Report)
7661 can be completely switched off by
7662 <option>-XNoMonomorphismRestriction</option>.
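As a brief sketch of the effect (the binding <literal>plus</literal> is hypothetical), the module below has a binding with no arguments and no type signature. With the restriction switched off it is generalised and can be used at two different types; with the restriction in force, the two uses would not typecheck together:

```haskell
{-# LANGUAGE NoMonomorphismRestriction #-}
module Main where

-- No arguments and no type signature: with the restriction off, this is
-- inferred as Num a => a -> a -> a rather than being fixed to one type.
plus = (+)

main :: IO ()
main = do
  print (plus (1 :: Int) 2)
  print (plus (1.5 :: Double) 2)
```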
7667 <title>Monomorphic pattern bindings</title>
7668 <indexterm><primary><option>-XNoMonoPatBinds</option></primary></indexterm>
7669 <indexterm><primary><option>-XMonoPatBinds</option></primary></indexterm>
7671 <para> As an experimental change, we are exploring the possibility of
7672 making pattern bindings monomorphic; that is, not generalised at all.
7673 A pattern binding is a binding whose LHS has no function arguments,
7674 and is not a simple variable. For example:
7676 f x = x -- Not a pattern binding
7677 f = \x -> x -- Not a pattern binding
7678 f :: Int -> Int = \x -> x -- Not a pattern binding
7680 (g,h) = e -- A pattern binding
7681 (f) = e -- A pattern binding
7682 [x] = e -- A pattern binding
7684 Experimentally, GHC now makes pattern bindings monomorphic <emphasis>by
7685 default</emphasis>. Use <option>-XNoMonoPatBinds</option> to recover the