1 <?xml version="1.0" encoding="iso-8859-1"?>
3 <indexterm><primary>language, GHC</primary></indexterm>
4 <indexterm><primary>extensions, GHC</primary></indexterm>
5 As with all known Haskell systems, GHC implements some extensions to
6 the language. They are all enabled by options; by default GHC
7 understands only plain Haskell 98.
11 Some of the Glasgow extensions serve to give you access to the
12 underlying facilities with which we implement Haskell. Thus, you can
13 get at the Raw Iron, if you are willing to write some non-portable
14 code at a more primitive level. You need not be “stuck”
15 on performance because of the implementation costs of Haskell's
16 “high-level” features—you can always code
17 “under” them. In an extreme case, you can write all your
18 time-critical code in C, and then just glue it together with Haskell!
22 Before you get too carried away working at the lowest level (e.g.,
23 sloshing <literal>MutableByteArray#</literal>s around your
24 program), you may wish to check if there are libraries that provide a
25 “Haskellised veneer” over the features you want. The
26 separate <ulink url="../libraries/index.html">libraries
27 documentation</ulink> describes all the libraries that come with GHC.
30 <!-- LANGUAGE OPTIONS -->
31 <sect1 id="options-language">
32 <title>Language options</title>
34 <indexterm><primary>language</primary><secondary>option</secondary>
36 <indexterm><primary>options</primary><secondary>language</secondary>
38 <indexterm><primary>extensions</primary><secondary>options controlling</secondary>
41 <para>The language option flags control what variation of the language are
42 permitted. Leaving out all of them gives you standard Haskell
45 <para>Language options can be controlled in two ways:
47 <listitem><para>Every language option can switched on by a command-line flag "<option>-X...</option>"
48 (e.g. <option>-XTemplateHaskell</option>), and switched off by the flag "<option>-XNo...</option>";
49 (e.g. <option>-XNoTemplateHaskell</option>).</para></listitem>
51 Language options recognised by Cabal can also be enabled using the <literal>LANGUAGE</literal> pragma,
52 thus <literal>{-# LANGUAGE TemplateHaskell #-}</literal> (see <xref linkend="language-pragma"/>). </para>
54 </itemizedlist></para>
56 <para>The flag <option>-fglasgow-exts</option>
57 <indexterm><primary><option>-fglasgow-exts</option></primary></indexterm>
58 is equivalent to enabling the following extensions:
59 <option>-XPrintExplicitForalls</option>,
60 <option>-XForeignFunctionInterface</option>,
61 <option>-XUnliftedFFITypes</option>,
62 <option>-XGADTs</option>,
63 <option>-XImplicitParams</option>,
64 <option>-XScopedTypeVariables</option>,
65 <option>-XUnboxedTuples</option>,
66 <option>-XTypeSynonymInstances</option>,
67 <option>-XStandaloneDeriving</option>,
68 <option>-XDeriveDataTypeable</option>,
69 <option>-XFlexibleContexts</option>,
70 <option>-XFlexibleInstances</option>,
71 <option>-XConstrainedClassMethods</option>,
72 <option>-XMultiParamTypeClasses</option>,
73 <option>-XFunctionalDependencies</option>,
74 <option>-XMagicHash</option>,
75 <option>-XPolymorphicComponents</option>,
76 <option>-XExistentialQuantification</option>,
77 <option>-XUnicodeSyntax</option>,
78 <option>-XPostfixOperators</option>,
79 <option>-XPatternGuards</option>,
80 <option>-XLiberalTypeSynonyms</option>,
81 <option>-XExplicitForAll</option>,
82 <option>-XRankNTypes</option>,
83 <option>-XImpredicativeTypes</option>,
84 <option>-XTypeOperators</option>,
85 <option>-XDoRec</option>,
86 <option>-XParallelListComp</option>,
87 <option>-XEmptyDataDecls</option>,
88 <option>-XKindSignatures</option>,
89 <option>-XGeneralizedNewtypeDeriving</option>,
90 <option>-XTypeFamilies</option>.
91 Enabling these options is the <emphasis>only</emphasis>
92 effect of <option>-fglasgow-exts</option>.
93 We are trying to move away from this portmanteau flag,
94 and towards enabling features individually.</para>
98 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
99 <sect1 id="primitives">
100 <title>Unboxed types and primitive operations</title>
102 <para>GHC is built on a raft of primitive data types and operations;
103 "primitive" in the sense that they cannot be defined in Haskell itself.
104 While you really can use this stuff to write fast code,
105 we generally find it a lot less painful, and more satisfying in the
106 long run, to use higher-level language features and libraries. With
107 any luck, the code you write will be optimised to the efficient
108 unboxed version in any case. And if it isn't, we'd like to know
111 <para>All these primitive data types and operations are exported by the
112 library <literal>GHC.Prim</literal>, for which there is
113 <ulink url="../libraries/ghc-prim/GHC-Prim.html">detailed online documentation</ulink>.
114 (This documentation is generated from the file <filename>compiler/prelude/primops.txt.pp</filename>.)
117 If you want to mention any of the primitive data types or operations in your
118 program, you must first import <literal>GHC.Prim</literal> to bring them
119 into scope. Many of them have names ending in "#", and to mention such
120 names you need the <option>-XMagicHash</option> extension (<xref linkend="magic-hash"/>).
123 <para>The primops make extensive use of <link linkend="glasgow-unboxed">unboxed types</link>
124 and <link linkend="unboxed-tuples">unboxed tuples</link>, which
125 we briefly summarise here. </para>
127 <sect2 id="glasgow-unboxed">
132 <indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm>
135 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
136 that values of that type are represented by a pointer to a heap
137 object. The representation of a Haskell <literal>Int</literal>, for
138 example, is a two-word heap object. An <firstterm>unboxed</firstterm>
139 type, however, is represented by the value itself, no pointers or heap
140 allocation are involved.
144 Unboxed types correspond to the “raw machine” types you
145 would use in C: <literal>Int#</literal> (long int),
146 <literal>Double#</literal> (double), <literal>Addr#</literal>
147 (void *), etc. The <emphasis>primitive operations</emphasis>
148 (PrimOps) on these types are what you might expect; e.g.,
149 <literal>(+#)</literal> is addition on
150 <literal>Int#</literal>s, and is the machine-addition that we all
151 know and love—usually one instruction.
155 Primitive (unboxed) types cannot be defined in Haskell, and are
156 therefore built into the language and compiler. Primitive types are
157 always unlifted; that is, a value of a primitive type cannot be
158 bottom. We use the convention (but it is only a convention)
159 that primitive types, values, and
160 operations have a <literal>#</literal> suffix (see <xref linkend="magic-hash"/>).
161 For some primitive types we have special syntax for literals, also
162 described in the <link linkend="magic-hash">same section</link>.
166 Primitive values are often represented by a simple bit-pattern, such
167 as <literal>Int#</literal>, <literal>Float#</literal>,
168 <literal>Double#</literal>. But this is not necessarily the case:
169 a primitive value might be represented by a pointer to a
170 heap-allocated object. Examples include
171 <literal>Array#</literal>, the type of primitive arrays. A
172 primitive array is heap-allocated because it is too big a value to fit
173 in a register, and would be too expensive to copy around; in a sense,
174 it is accidental that it is represented by a pointer. If a pointer
175 represents a primitive value, then it really does point to that value:
176 no unevaluated thunks, no indirections…nothing can be at the
177 other end of the pointer than the primitive value.
178 A numerically-intensive program using unboxed types can
179 go a <emphasis>lot</emphasis> faster than its “standard”
180 counterpart—we saw a threefold speedup on one example.
184 There are some restrictions on the use of primitive types:
186 <listitem><para>The main restriction
187 is that you can't pass a primitive value to a polymorphic
188 function or store one in a polymorphic data type. This rules out
189 things like <literal>[Int#]</literal> (i.e. lists of primitive
190 integers). The reason for this restriction is that polymorphic
191 arguments and constructor fields are assumed to be pointers: if an
192 unboxed integer is stored in one of these, the garbage collector would
193 attempt to follow it, leading to unpredictable space leaks. Or a
194 <function>seq</function> operation on the polymorphic component may
195 attempt to dereference the pointer, with disastrous results. Even
196 worse, the unboxed value might be larger than a pointer
197 (<literal>Double#</literal> for instance).
200 <listitem><para> You cannot define a newtype whose representation type
201 (the argument type of the data constructor) is an unboxed type. Thus,
207 <listitem><para> You cannot bind a variable with an unboxed type
208 in a <emphasis>top-level</emphasis> binding.
210 <listitem><para> You cannot bind a variable with an unboxed type
211 in a <emphasis>recursive</emphasis> binding.
213 <listitem><para> You may bind unboxed variables in a (non-recursive,
214 non-top-level) pattern binding, but you must make any such pattern-match
215 strict. For example, rather than:
217 data Foo = Foo Int Int#
219 f x = let (Foo a b, w) = ..rhs.. in ..body..
223 data Foo = Foo Int Int#
225 f x = let !(Foo a b, w) = ..rhs.. in ..body..
227 since <literal>b</literal> has type <literal>Int#</literal>.
235 <sect2 id="unboxed-tuples">
236 <title>Unboxed Tuples
240 Unboxed tuples aren't really exported by <literal>GHC.Exts</literal>,
241 they're available by default with <option>-fglasgow-exts</option>. An
242 unboxed tuple looks like this:
254 where <literal>e_1..e_n</literal> are expressions of any
255 type (primitive or non-primitive). The type of an unboxed tuple looks
260 Unboxed tuples are used for functions that need to return multiple
261 values, but they avoid the heap allocation normally associated with
262 using fully-fledged tuples. When an unboxed tuple is returned, the
263 components are put directly into registers or on the stack; the
264 unboxed tuple itself does not have a composite representation. Many
265 of the primitive operations listed in <literal>primops.txt.pp</literal> return unboxed
267 In particular, the <literal>IO</literal> and <literal>ST</literal> monads use unboxed
268 tuples to avoid unnecessary allocation during sequences of operations.
272 There are some pretty stringent restrictions on the use of unboxed tuples:
277 Values of unboxed tuple types are subject to the same restrictions as
278 other unboxed types; i.e. they may not be stored in polymorphic data
279 structures or passed to polymorphic functions.
286 No variable can have an unboxed tuple type, nor may a constructor or function
287 argument have an unboxed tuple type. The following are all illegal:
291 data Foo = Foo (# Int, Int #)
293 f :: (# Int, Int #) -> (# Int, Int #)
296 g :: (# Int, Int #) -> Int
299 h x = let y = (# x,x #) in ...
306 The typical use of unboxed tuples is simply to return multiple values,
307 binding those multiple results with a <literal>case</literal> expression, thus:
309 f x y = (# x+1, y-1 #)
310 g x = case f x x of { (# a, b #) -> a + b }
312 You can have an unboxed tuple in a pattern binding, thus
314 f x = let (# p,q #) = h x in ..body..
316 If the types of <literal>p</literal> and <literal>q</literal> are not unboxed,
317 the resulting binding is lazy like any other Haskell pattern binding. The
318 above example desugars like this:
320 f x = let t = case h x o f{ (# p,q #) -> (p,q)
325 Indeed, the bindings can even be recursive.
332 <!-- ====================== SYNTACTIC EXTENSIONS ======================= -->
334 <sect1 id="syntax-extns">
335 <title>Syntactic extensions</title>
337 <sect2 id="unicode-syntax">
338 <title>Unicode syntax</title>
340 extension <option>-XUnicodeSyntax</option><indexterm><primary><option>-XUnicodeSyntax</option></primary></indexterm>
341 enables Unicode characters to be used to stand for certain ASCII
342 character sequences. The following alternatives are provided:</para>
345 <tgroup cols="2" align="left" colsep="1" rowsep="1">
349 <entry>Unicode alternative</entry>
350 <entry>Code point</entry>
356 to find the DocBook entities for these characters, find
357 the Unicode code point (e.g. 0x2237), and grep for it in
358 /usr/share/sgml/docbook/xml-dtd-*/ent/* (or equivalent on
359 your system. Some of these Unicode code points don't have
360 equivalent DocBook entities.
365 <entry><literal>::</literal></entry>
366 <entry>::</entry> <!-- no special char, apparently -->
367 <entry>0x2237</entry>
368 <entry>PROPORTION</entry>
373 <entry><literal>=></literal></entry>
374 <entry>⇒</entry>
375 <entry>0x21D2</entry>
376 <entry>RIGHTWARDS DOUBLE ARROW</entry>
381 <entry><literal>forall</literal></entry>
382 <entry>∀</entry>
383 <entry>0x2200</entry>
384 <entry>FOR ALL</entry>
389 <entry><literal>-></literal></entry>
390 <entry>→</entry>
391 <entry>0x2192</entry>
392 <entry>RIGHTWARDS ARROW</entry>
397 <entry><literal><-</literal></entry>
398 <entry>←</entry>
399 <entry>0x2190</entry>
400 <entry>LEFTWARDS ARROW</entry>
406 <entry>…</entry>
407 <entry>0x22EF</entry>
408 <entry>MIDLINE HORIZONTAL ELLIPSIS</entry>
415 <entry>↢</entry>
416 <entry>0x2919</entry>
417 <entry>LEFTWARDS ARROW-TAIL</entry>
424 <entry>↣</entry>
425 <entry>0x291A</entry>
426 <entry>RIGHTWARDS ARROW-TAIL</entry>
432 <entry>-<<</entry>
434 <entry>0x291B</entry>
435 <entry>LEFTWARDS DOUBLE ARROW-TAIL</entry>
441 <entry>>>-</entry>
443 <entry>0x291C</entry>
444 <entry>RIGHTWARDS DOUBLE ARROW-TAIL</entry>
451 <entry>★</entry>
452 <entry>0x2605</entry>
453 <entry>BLACK STAR</entry>
461 <sect2 id="magic-hash">
462 <title>The magic hash</title>
463 <para>The language extension <option>-XMagicHash</option> allows "#" as a
464 postfix modifier to identifiers. Thus, "x#" is a valid variable, and "T#" is
465 a valid type constructor or data constructor.</para>
467 <para>The hash sign does not change sematics at all. We tend to use variable
468 names ending in "#" for unboxed values or types (e.g. <literal>Int#</literal>),
469 but there is no requirement to do so; they are just plain ordinary variables.
470 Nor does the <option>-XMagicHash</option> extension bring anything into scope.
471 For example, to bring <literal>Int#</literal> into scope you must
472 import <literal>GHC.Prim</literal> (see <xref linkend="primitives"/>);
473 the <option>-XMagicHash</option> extension
474 then allows you to <emphasis>refer</emphasis> to the <literal>Int#</literal>
475 that is now in scope.</para>
476 <para> The <option>-XMagicHash</option> also enables some new forms of literals (see <xref linkend="glasgow-unboxed"/>):
478 <listitem><para> <literal>'x'#</literal> has type <literal>Char#</literal></para> </listitem>
479 <listitem><para> <literal>"foo"#</literal> has type <literal>Addr#</literal></para> </listitem>
480 <listitem><para> <literal>3#</literal> has type <literal>Int#</literal>. In general,
481 any Haskell 98 integer lexeme followed by a <literal>#</literal> is an <literal>Int#</literal> literal, e.g.
482 <literal>-0x3A#</literal> as well as <literal>32#</literal></para>.</listitem>
483 <listitem><para> <literal>3##</literal> has type <literal>Word#</literal>. In general,
484 any non-negative Haskell 98 integer lexeme followed by <literal>##</literal>
485 is a <literal>Word#</literal>. </para> </listitem>
486 <listitem><para> <literal>3.2#</literal> has type <literal>Float#</literal>.</para> </listitem>
487 <listitem><para> <literal>3.2##</literal> has type <literal>Double#</literal></para> </listitem>
492 <sect2 id="new-qualified-operators">
493 <title>New qualified operator syntax</title>
495 <para>A new syntax for referencing qualified operators is
496 planned to be introduced by Haskell', and is enabled in GHC
498 the <option>-XNewQualifiedOperators</option><indexterm><primary><option>-XNewQualifiedOperators</option></primary></indexterm>
499 option. In the new syntax, the prefix form of a qualified
501 written <literal><replaceable>module</replaceable>.(<replaceable>symbol</replaceable>)</literal>
502 (in Haskell 98 this would
503 be <literal>(<replaceable>module</replaceable>.<replaceable>symbol</replaceable>)</literal>),
504 and the infix form is
505 written <literal>`<replaceable>module</replaceable>.(<replaceable>symbol</replaceable>)`</literal>
506 (in Haskell 98 this would
507 be <literal>`<replaceable>module</replaceable>.<replaceable>symbol</replaceable>`</literal>.
510 add x y = Prelude.(+) x y
511 subtract y = (`Prelude.(-)` y)
513 The new form of qualified operators is intended to regularise
514 the syntax by eliminating odd cases
515 like <literal>Prelude..</literal>. For example,
516 when <literal>NewQualifiedOperators</literal> is on, it is possible to
517 write the enumerated sequence <literal>[Monday..]</literal>
518 without spaces, whereas in Haskell 98 this would be a
519 reference to the operator ‘<literal>.</literal>‘
520 from module <literal>Monday</literal>.</para>
522 <para>When <option>-XNewQualifiedOperators</option> is on, the old Haskell
523 98 syntax for qualified operators is not accepted, so this
524 option may cause existing Haskell 98 code to break.</para>
529 <!-- ====================== HIERARCHICAL MODULES ======================= -->
532 <sect2 id="hierarchical-modules">
533 <title>Hierarchical Modules</title>
535 <para>GHC supports a small extension to the syntax of module
536 names: a module name is allowed to contain a dot
537 <literal>‘.’</literal>. This is also known as the
538 “hierarchical module namespace” extension, because
539 it extends the normally flat Haskell module namespace into a
540 more flexible hierarchy of modules.</para>
542 <para>This extension has very little impact on the language
543 itself; modules names are <emphasis>always</emphasis> fully
544 qualified, so you can just think of the fully qualified module
545 name as <quote>the module name</quote>. In particular, this
546 means that the full module name must be given after the
547 <literal>module</literal> keyword at the beginning of the
548 module; for example, the module <literal>A.B.C</literal> must
551 <programlisting>module A.B.C</programlisting>
554 <para>It is a common strategy to use the <literal>as</literal>
555 keyword to save some typing when using qualified names with
556 hierarchical modules. For example:</para>
559 import qualified Control.Monad.ST.Strict as ST
562 <para>For details on how GHC searches for source and interface
563 files in the presence of hierarchical modules, see <xref
564 linkend="search-path"/>.</para>
566 <para>GHC comes with a large collection of libraries arranged
567 hierarchically; see the accompanying <ulink
568 url="../libraries/index.html">library
569 documentation</ulink>. More libraries to install are available
571 url="http://hackage.haskell.org/packages/hackage.html">HackageDB</ulink>.</para>
574 <!-- ====================== PATTERN GUARDS ======================= -->
576 <sect2 id="pattern-guards">
577 <title>Pattern guards</title>
580 <indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
581 The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ulink url="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ulink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
585 Suppose we have an abstract data type of finite maps, with a
589 lookup :: FiniteMap -> Int -> Maybe Int
592 The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
593 where <varname>v</varname> is the value that the key maps to. Now consider the following definition:
597 clunky env var1 var2 | ok1 && ok2 = val1 + val2
598 | otherwise = var1 + var2
609 The auxiliary functions are
613 maybeToBool :: Maybe a -> Bool
614 maybeToBool (Just x) = True
615 maybeToBool Nothing = False
617 expectJust :: Maybe a -> a
618 expectJust (Just x) = x
619 expectJust Nothing = error "Unexpected Nothing"
623 What is <function>clunky</function> doing? The guard <literal>ok1 &&
624 ok2</literal> checks that both lookups succeed, using
625 <function>maybeToBool</function> to convert the <function>Maybe</function>
626 types to booleans. The (lazily evaluated) <function>expectJust</function>
627 calls extract the values from the results of the lookups, and binds the
628 returned values to <varname>val1</varname> and <varname>val2</varname>
629 respectively. If either lookup fails, then clunky takes the
630 <literal>otherwise</literal> case and returns the sum of its arguments.
634 This is certainly legal Haskell, but it is a tremendously verbose and
635 un-obvious way to achieve the desired effect. Arguably, a more direct way
636 to write clunky would be to use case expressions:
640 clunky env var1 var2 = case lookup env var1 of
642 Just val1 -> case lookup env var2 of
644 Just val2 -> val1 + val2
650 This is a bit shorter, but hardly better. Of course, we can rewrite any set
651 of pattern-matching, guarded equations as case expressions; that is
652 precisely what the compiler does when compiling equations! The reason that
653 Haskell provides guarded equations is because they allow us to write down
654 the cases we want to consider, one at a time, independently of each other.
655 This structure is hidden in the case version. Two of the right-hand sides
656 are really the same (<function>fail</function>), and the whole expression
657 tends to become more and more indented.
661 Here is how I would write clunky:
666 | Just val1 <- lookup env var1
667 , Just val2 <- lookup env var2
669 ...other equations for clunky...
673 The semantics should be clear enough. The qualifiers are matched in order.
674 For a <literal><-</literal> qualifier, which I call a pattern guard, the
675 right hand side is evaluated and matched against the pattern on the left.
676 If the match fails then the whole guard fails and the next equation is
677 tried. If it succeeds, then the appropriate binding takes place, and the
678 next qualifier is matched, in the augmented environment. Unlike list
679 comprehensions, however, the type of the expression to the right of the
680 <literal><-</literal> is the same as the type of the pattern to its
681 left. The bindings introduced by pattern guards scope over all the
682 remaining guard qualifiers, and over the right hand side of the equation.
686 Just as with list comprehensions, boolean expressions can be freely mixed
687 with among the pattern guards. For example:
698 Haskell's current guards therefore emerge as a special case, in which the
699 qualifier list has just one element, a boolean expression.
703 <!-- ===================== View patterns =================== -->
705 <sect2 id="view-patterns">
710 View patterns are enabled by the flag <literal>-XViewPatterns</literal>.
711 More information and examples of view patterns can be found on the
712 <ulink url="http://hackage.haskell.org/trac/ghc/wiki/ViewPatterns">Wiki
717 View patterns are somewhat like pattern guards that can be nested inside
718 of other patterns. They are a convenient way of pattern-matching
719 against values of abstract types. For example, in a programming language
720 implementation, we might represent the syntax of the types of the
729 view :: Type -> TypeView
731 -- additional operations for constructing Typ's ...
734 The representation of Typ is held abstract, permitting implementations
735 to use a fancy representation (e.g., hash-consing to manage sharing).
737 Without view patterns, using this signature a little inconvenient:
739 size :: Typ -> Integer
740 size t = case view t of
742 Arrow t1 t2 -> size t1 + size t2
745 It is necessary to iterate the case, rather than using an equational
746 function definition. And the situation is even worse when the matching
747 against <literal>t</literal> is buried deep inside another pattern.
751 View patterns permit calling the view function inside the pattern and
752 matching against the result:
754 size (view -> Unit) = 1
755 size (view -> Arrow t1 t2) = size t1 + size t2
758 That is, we add a new form of pattern, written
759 <replaceable>expression</replaceable> <literal>-></literal>
760 <replaceable>pattern</replaceable> that means "apply the expression to
761 whatever we're trying to match against, and then match the result of
762 that application against the pattern". The expression can be any Haskell
763 expression of function type, and view patterns can be used wherever
768 The semantics of a pattern <literal>(</literal>
769 <replaceable>exp</replaceable> <literal>-></literal>
770 <replaceable>pat</replaceable> <literal>)</literal> are as follows:
776 <para>The variables bound by the view pattern are the variables bound by
777 <replaceable>pat</replaceable>.
781 Any variables in <replaceable>exp</replaceable> are bound occurrences,
782 but variables bound "to the left" in a pattern are in scope. This
783 feature permits, for example, one argument to a function to be used in
784 the view of another argument. For example, the function
785 <literal>clunky</literal> from <xref linkend="pattern-guards" /> can be
786 written using view patterns as follows:
789 clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2
790 ...other equations for clunky...
795 More precisely, the scoping rules are:
799 In a single pattern, variables bound by patterns to the left of a view
800 pattern expression are in scope. For example:
802 example :: Maybe ((String -> Integer,Integer), String) -> Bool
803 example Just ((f,_), f -> 4) = True
806 Additionally, in function definitions, variables bound by matching earlier curried
807 arguments may be used in view pattern expressions in later arguments:
809 example :: (String -> Integer) -> String -> Bool
810 example f (f -> 4) = True
812 That is, the scoping is the same as it would be if the curried arguments
813 were collected into a tuple.
819 In mutually recursive bindings, such as <literal>let</literal>,
820 <literal>where</literal>, or the top level, view patterns in one
821 declaration may not mention variables bound by other declarations. That
822 is, each declaration must be self-contained. For example, the following
823 program is not allowed:
830 restriction in the future; the only cost is that type checking patterns
831 would get a little more complicated.)
841 <listitem><para> Typing: If <replaceable>exp</replaceable> has type
842 <replaceable>T1</replaceable> <literal>-></literal>
843 <replaceable>T2</replaceable> and <replaceable>pat</replaceable> matches
844 a <replaceable>T2</replaceable>, then the whole view pattern matches a
845 <replaceable>T1</replaceable>.
848 <listitem><para> Matching: To the equations in Section 3.17.3 of the
849 <ulink url="http://www.haskell.org/onlinereport/">Haskell 98
850 Report</ulink>, add the following:
852 case v of { (e -> p) -> e1 ; _ -> e2 }
854 case (e v) of { p -> e1 ; _ -> e2 }
856 That is, to match a variable <replaceable>v</replaceable> against a pattern
857 <literal>(</literal> <replaceable>exp</replaceable>
858 <literal>-></literal> <replaceable>pat</replaceable>
859 <literal>)</literal>, evaluate <literal>(</literal>
860 <replaceable>exp</replaceable> <replaceable> v</replaceable>
861 <literal>)</literal> and match the result against
862 <replaceable>pat</replaceable>.
865 <listitem><para> Efficiency: When the same view function is applied in
866 multiple branches of a function definition or a case expression (e.g.,
867 in <literal>size</literal> above), GHC makes an attempt to collect these
868 applications into a single nested case expression, so that the view
869 function is only applied once. Pattern compilation in GHC follows the
870 matrix algorithm described in Chapter 4 of <ulink
871 url="http://research.microsoft.com/~simonpj/Papers/slpj-book-1987/">The
872 Implementation of Functional Programming Languages</ulink>. When the
873 top rows of the first column of a matrix are all view patterns with the
874 "same" expression, these patterns are transformed into a single nested
875 case. This includes, for example, adjacent view patterns that line up
878 f ((view -> A, p1), p2) = e1
879 f ((view -> B, p3), p4) = e2
883 <para> The current notion of when two view pattern expressions are "the
884 same" is very restricted: it is not even full syntactic equality.
885 However, it does include variables, literals, applications, and tuples;
886 e.g., two instances of <literal>view ("hi", "there")</literal> will be
887 collected. However, the current implementation does not compare up to
888 alpha-equivalence, so two instances of <literal>(x, view x ->
889 y)</literal> will not be coalesced.
899 <!-- ===================== n+k patterns =================== -->
901 <sect2 id="n-k-patterns">
902 <title>n+k patterns</title>
903 <indexterm><primary><option>-XNoNPlusKPatterns</option></primary></indexterm>
906 <literal>n+k</literal> pattern support is enabled by default. To disable
907 it, you can use the <option>-XNoNPlusKPatterns</option> flag.
912 <!-- ===================== Recursive do-notation =================== -->
914 <sect2 id="mdo-notation">
915 <title>The recursive do-notation
919 The do-notation of Haskell 98 does not allow <emphasis>recursive bindings</emphasis>,
920 that is, the variables bound in a do-expression are visible only in the textually following
921 code block. Compare this to a let-expression, where bound variables are visible in the entire binding
922 group. It turns out that several applications can benefit from recursive bindings in
923 the do-notation. The <option>-XDoRec</option> flag provides the necessary syntactic support.
926 Here is a simple (albeit contrived) example:
928 {-# LANGUAGE DoRec #-}
929 justOnes = do { rec { xs <- Just (1:xs) }
930 ; return (map negate xs) }
932 As you can guess <literal>justOnes</literal> will evaluate to <literal>Just [-1,-1,-1,...</literal>.
935 The background and motivation for recusrive do-notation is described in
936 <ulink url="http://sites.google.com/site/leventerkok/">A recursive do for Haskell</ulink>,
937 by Levent Erkok, John Launchbury,
938 Haskell Workshop 2002, pages: 29-37. Pittsburgh, Pennsylvania.
939 The theory behind monadic value recursion is explained further in Erkok's thesis
940 <ulink url="http://sites.google.com/site/leventerkok/erkok-thesis.pdf">Value Recursion in Monadic Computations</ulink>.
941 However, note that GHC uses a different syntax than the one described in these documents.
945 <title>Details of recursive do-notation</title>
947 The recursive do-notation is enabled with the flag <option>-XDoRec</option> or, equivalently,
948 the LANGUAGE pragma <option>DoRec</option>. It introduces the single new keyword "<literal>rec</literal>",
949 which wraps a mutually-recursive group of monadic statements,
950 producing a single statement.
952 <para>Similar to a <literal>let</literal>
953 statement, the variables bound in the <literal>rec</literal> are
954 visible throughout the <literal>rec</literal> group, and below it.
957 do { a <- getChar do { a <- getChar
958 ; let { r1 = f a r2 ; rec { r1 <- f a r2
959 ; r2 = g r1 } ; r2 <- g r1 }
960 ; return (r1 ++ r2) } ; return (r1 ++ r2) }
962 In both cases, <literal>r1</literal> and <literal>r2</literal> are
963 available both throughout the <literal>let</literal> or <literal>rec</literal> block, and
964 in the statements that follow it. The difference is that <literal>let</literal> is non-monadic,
965 while <literal>rec</literal> is monadic. (In Haskell <literal>let</literal> is
966 really <literal>letrec</literal>, of course.)
969 The static and dynamic semantics of <literal>rec</literal> can be described as follows:
973 similar to let-bindings, the <literal>rec</literal> is broken into
974 minimal recursive groups, a process known as <emphasis>segmentation</emphasis>.
977 rec { a <- getChar ===> a <- getChar
978 ; b <- f a c rec { b <- f a c
979 ; c <- f b a ; c <- f b a }
980 ; putChar c } putChar c
982 The details of segmentation are described in Section 3.2 of
983 <ulink url="http://sites.google.com/site/leventerkok/">A recursive do for Haskell</ulink>.
984 Segmentation improves polymorphism, reduces the size of the recursive "knot", and, as the paper
985 describes, also has a semantic effect (unless the monad satisfies the right-shrinking law).
988 Then each resulting <literal>rec</literal> is desugared, using a call to <literal>Control.Monad.Fix.mfix</literal>.
989 For example, the <literal>rec</literal> group in the preceding example is desugared like this:
991 rec { b <- f a c ===> (b,c) <- mfix (\~(b,c) -> do { b <- f a c
992 ; c <- f b a } ; c <- f b a
995 In general, the statment <literal>rec <replaceable>ss</replaceable></literal>
996 is desugared to the statement
998 <replaceable>vs</replaceable> <- mfix (\~<replaceable>vs</replaceable> -> do { <replaceable>ss</replaceable>; return <replaceable>vs</replaceable> })
1000 where <replaceable>vs</replaceable> is a tuple of the variables bound by <replaceable>ss</replaceable>.
1002 The original <literal>rec</literal> typechecks exactly
1003 when the above desugared version would do so. For example, this means that
1004 the variables <replaceable>vs</replaceable> are all monomorphic in the statements
1005 following the <literal>rec</literal>, because they are bound by a lambda.
1008 The <literal>mfix</literal> function is defined in the <literal>MonadFix</literal>
1009 class, in <literal>Control.Monad.Fix</literal>, thus:
1011 class Monad m => MonadFix m where
1012 mfix :: (a -> m a) -> m a
1019 Here are some other important points in using the recursive-do notation:
1022 It is enabled with the flag <literal>-XDoRec</literal>, which is in turn implied by
1023 <literal>-fglasgow-exts</literal>.
1027 If recursive bindings are required for a monad,
1028 then that monad must be declared an instance of the <literal>MonadFix</literal> class.
1032 The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO.
1033 Furthermore, the Control.Monad.ST and Control.Monad.ST.Lazy modules provide the instances of the MonadFix class
1034 for Haskell's internal state monad (strict and lazy, respectively).
1038 Like <literal>let</literal> and <literal>where</literal> bindings,
1039 name shadowing is not allowed within a <literal>rec</literal>;
1040 that is, all the names bound in a single <literal>rec</literal> must
1041 be distinct (Section 3.3 of the paper).
1044 It supports rebindable syntax (see <xref linkend="rebindable-syntax"/>).
1050 <sect3> <title> Mdo-notation (deprecated) </title>
1052 <para> GHC used to support the flag <option>-XREecursiveDo</option>,
1053 which enabled the keyword <literal>mdo</literal>, precisely as described in
1054 <ulink url="http://sites.google.com/site/leventerkok/">A recursive do for Haskell</ulink>,
1055 but this is now deprecated. Instead of <literal>mdo { Q; e }</literal>, write
1056 <literal>do { rec Q; e }</literal>.
1059 Historical note: The old implementation of the mdo-notation (and most
1060 of the existing documents) used the name
1061 <literal>MonadRec</literal> for the class and the corresponding library.
1062 This name is not supported by GHC.
1069 <!-- ===================== PARALLEL LIST COMPREHENSIONS =================== -->
1071 <sect2 id="parallel-list-comprehensions">
1072 <title>Parallel List Comprehensions</title>
1073 <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary>
1075 <indexterm><primary>parallel list comprehensions</primary>
1078 <para>Parallel list comprehensions are a natural extension to list
1079 comprehensions. List comprehensions can be thought of as a nice
1080 syntax for writing maps and filters. Parallel comprehensions
1081 extend this to include the zipWith family.</para>
1083 <para>A parallel list comprehension has multiple independent
1084 branches of qualifier lists, each separated by a `|' symbol. For
1085 example, the following zips together two lists:</para>
1088 [ (x, y) | x <- xs | y <- ys ]
1091 <para>The behavior of parallel list comprehensions follows that of
1092 zip, in that the resulting list will have the same length as the
1093 shortest branch.</para>
1095 <para>We can define parallel list comprehensions by translation to
1096 regular comprehensions. Here's the basic idea:</para>
1098 <para>Given a parallel comprehension of the form: </para>
1101 [ e | p1 <- e11, p2 <- e12, ...
1102 | q1 <- e21, q2 <- e22, ...
1107 <para>This will be translated to: </para>
1110 [ e | ((p1,p2), (q1,q2), ...) <- zipN [(p1,p2) | p1 <- e11, p2 <- e12, ...]
1111 [(q1,q2) | q1 <- e21, q2 <- e22, ...]
1116 <para>where `zipN' is the appropriate zip for the given number of
1121 <!-- ===================== TRANSFORM LIST COMPREHENSIONS =================== -->
1123 <sect2 id="generalised-list-comprehensions">
1124 <title>Generalised (SQL-Like) List Comprehensions</title>
1125 <indexterm><primary>list comprehensions</primary><secondary>generalised</secondary>
1127 <indexterm><primary>extended list comprehensions</primary>
1129 <indexterm><primary>group</primary></indexterm>
1130 <indexterm><primary>sql</primary></indexterm>
1133 <para>Generalised list comprehensions are a further enhancement to the
1134 list comprehension syntactic sugar to allow operations such as sorting
1135 and grouping which are familiar from SQL. They are fully described in the
1136 paper <ulink url="http://research.microsoft.com/~simonpj/papers/list-comp">
1137 Comprehensive comprehensions: comprehensions with "order by" and "group by"</ulink>,
1138 except that the syntax we use differs slightly from the paper.</para>
1139 <para>The extension is enabled with the flag <option>-XTransformListComp</option>.</para>
1140 <para>Here is an example:
1142 employees = [ ("Simon", "MS", 80)
1143 , ("Erik", "MS", 100)
1144 , ("Phil", "Ed", 40)
1145 , ("Gordon", "Ed", 45)
1146 , ("Paul", "Yale", 60)]
1148 output = [ (the dept, sum salary)
1149 | (name, dept, salary) <- employees
1150 , then group by dept
1151 , then sortWith by (sum salary)
1154 In this example, the list <literal>output</literal> would take on
1158 [("Yale", 60), ("Ed", 85), ("MS", 180)]
1161 <para>There are three new keywords: <literal>group</literal>, <literal>by</literal>, and <literal>using</literal>.
1162 (The function <literal>sortWith</literal> is not a keyword; it is an ordinary
1163 function that is exported by <literal>GHC.Exts</literal>.)</para>
1165 <para>There are five new forms of comprehension qualifier,
1166 all introduced by the (existing) keyword <literal>then</literal>:
1174 This statement requires that <literal>f</literal> have the type <literal>
1175 forall a. [a] -> [a]</literal>. You can see an example of its use in the
1176 motivating example, as this form is used to apply <literal>take 5</literal>.
1187 This form is similar to the previous one, but allows you to create a function
1188 which will be passed as the first argument to f. As a consequence f must have
1189 the type <literal>forall a. (a -> t) -> [a] -> [a]</literal>. As you can see
1190 from the type, this function lets f "project out" some information
1191 from the elements of the list it is transforming.</para>
1193 <para>An example is shown in the opening example, where <literal>sortWith</literal>
1194 is supplied with a function that lets it find out the <literal>sum salary</literal>
1195 for any item in the list comprehension it transforms.</para>
1203 then group by e using f
1206 <para>This is the most general of the grouping-type statements. In this form,
1207 f is required to have type <literal>forall a. (a -> t) -> [a] -> [[a]]</literal>.
1208 As with the <literal>then f by e</literal> case above, the first argument
1209 is a function supplied to f by the compiler which lets it compute e on every
1210 element of the list being transformed. However, unlike the non-grouping case,
1211 f additionally partitions the list into a number of sublists: this means that
1212 at every point after this statement, binders occurring before it in the comprehension
1213 refer to <emphasis>lists</emphasis> of possible values, not single values. To help understand
1214 this, let's look at an example:</para>
1217 -- This works similarly to groupWith in GHC.Exts, but doesn't sort its input first
1218 groupRuns :: Eq b => (a -> b) -> [a] -> [[a]]
1219 groupRuns f = groupBy (\x y -> f x == f y)
1221 output = [ (the x, y)
1222 | x <- ([1..3] ++ [1..2])
1224 , then group by x using groupRuns ]
1227 <para>This results in the variable <literal>output</literal> taking on the value below:</para>
1230 [(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])]
1233 <para>Note that we have used the <literal>the</literal> function to change the type
1234 of x from a list to its original numeric type. The variable y, in contrast, is left
1235 unchanged from the list form introduced by the grouping.</para>
1245 <para>This form of grouping is essentially the same as the one described above. However,
1246 since no function to use for the grouping has been supplied it will fall back on the
1247 <literal>groupWith</literal> function defined in
1248 <ulink url="../libraries/base/GHC-Exts.html"><literal>GHC.Exts</literal></ulink>. This
1249 is the form of the group statement that we made use of in the opening example.</para>
1260 <para>With this form of the group statement, f is required to simply have the type
1261 <literal>forall a. [a] -> [[a]]</literal>, which will be used to group up the
1262 comprehension so far directly. An example of this form is as follows:</para>
1268 , then group using inits]
1271 <para>This will yield a list containing every prefix of the word "hello" written out 5 times:</para>
1274 ["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...]
1282 <!-- ===================== REBINDABLE SYNTAX =================== -->
1284 <sect2 id="rebindable-syntax">
1285 <title>Rebindable syntax and the implicit Prelude import</title>
1287 <para><indexterm><primary>-XNoImplicitPrelude
1288 option</primary></indexterm> GHC normally imports
1289 <filename>Prelude.hi</filename> files for you. If you'd
1290 rather it didn't, then give it a
1291 <option>-XNoImplicitPrelude</option> option. The idea is
1292 that you can then import a Prelude of your own. (But don't
1293 call it <literal>Prelude</literal>; the Haskell module
1294 namespace is flat, and you must not conflict with any
1295 Prelude module.)</para>
1297 <para>Suppose you are importing a Prelude of your own
1298 in order to define your own numeric class
1299 hierarchy. It completely defeats that purpose if the
1300 literal "1" means "<literal>Prelude.fromInteger
1301 1</literal>", which is what the Haskell Report specifies.
1302 So the <option>-XNoImplicitPrelude</option>
1303 flag <emphasis>also</emphasis> causes
1304 the following pieces of built-in syntax to refer to
1305 <emphasis>whatever is in scope</emphasis>, not the Prelude
1309 <para>An integer literal <literal>368</literal> means
1310 "<literal>fromInteger (368::Integer)</literal>", rather than
1311 "<literal>Prelude.fromInteger (368::Integer)</literal>".
1314 <listitem><para>Fractional literals are handed in just the same way,
1315 except that the translation is
1316 <literal>fromRational (3.68::Rational)</literal>.
1319 <listitem><para>The equality test in an overloaded numeric pattern
1320 uses whatever <literal>(==)</literal> is in scope.
1323 <listitem><para>The subtraction operation, and the
1324 greater-than-or-equal test, in <literal>n+k</literal> patterns
1325 use whatever <literal>(-)</literal> and <literal>(>=)</literal> are in scope.
1329 <para>Negation (e.g. "<literal>- (f x)</literal>")
1330 means "<literal>negate (f x)</literal>", both in numeric
1331 patterns, and expressions.
1335 <para>"Do" notation is translated using whatever
1336 functions <literal>(>>=)</literal>,
1337 <literal>(>>)</literal>, and <literal>fail</literal>,
1338 are in scope (not the Prelude
1339 versions). List comprehensions, mdo (<xref linkend="mdo-notation"/>), and parallel array
1340 comprehensions, are unaffected. </para></listitem>
1344 notation (see <xref linkend="arrow-notation"/>)
1345 uses whatever <literal>arr</literal>,
1346 <literal>(>>>)</literal>, <literal>first</literal>,
1347 <literal>app</literal>, <literal>(|||)</literal> and
1348 <literal>loop</literal> functions are in scope. But unlike the
1349 other constructs, the types of these functions must match the
1350 Prelude types very closely. Details are in flux; if you want
1354 In all cases (apart from arrow notation), the static semantics should be that of the desugared form,
1355 even if that is a little unexpected. For example, the
1356 static semantics of the literal <literal>368</literal>
1357 is exactly that of <literal>fromInteger (368::Integer)</literal>; it's fine for
1358 <literal>fromInteger</literal> to have any of the types:
1360 fromInteger :: Integer -> Integer
1361 fromInteger :: forall a. Foo a => Integer -> a
1362 fromInteger :: Num a => a -> Integer
1363 fromInteger :: Integer -> Bool -> Bool
1367 <para>Be warned: this is an experimental facility, with
1368 fewer checks than usual. Use <literal>-dcore-lint</literal>
1369 to typecheck the desugared program. If Core Lint is happy
1370 you should be all right.</para>
1374 <sect2 id="postfix-operators">
1375 <title>Postfix operators</title>
1378 The <option>-XPostfixOperators</option> flag enables a small
1379 extension to the syntax of left operator sections, which allows you to
1380 define postfix operators. The extension is this: the left section
1384 is equivalent (from the point of view of both type checking and execution) to the expression
1388 (for any expression <literal>e</literal> and operator <literal>(!)</literal>.
1389 The strict Haskell 98 interpretation is that the section is equivalent to
1393 That is, the operator must be a function of two arguments. GHC allows it to
1394 take only one argument, and that in turn allows you to write the function
1397 <para>The extension does not extend to the left-hand side of function
1398 definitions; you must define such a function in prefix form.</para>
1402 <sect2 id="tuple-sections">
1403 <title>Tuple sections</title>
1406 The <option>-XTupleSections</option> flag enables Python-style partially applied
1407 tuple constructors. For example, the following program
1411 is considered to be an alternative notation for the more unwieldy alternative
1415 You can omit any combination of arguments to the tuple, as in the following
1417 (, "I", , , "Love", , 1337)
1421 \a b c d -> (a, "I", b, c, "Love", d, 1337)
1426 If you have <link linkend="unboxed-tuples">unboxed tuples</link> enabled, tuple sections
1427 will also be available for them, like so
1431 Because there is no unboxed unit tuple, the following expression
1435 continues to stand for the unboxed singleton tuple data constructor.
1440 <sect2 id="disambiguate-fields">
1441 <title>Record field disambiguation</title>
1443 In record construction and record pattern matching
1444 it is entirely unambiguous which field is referred to, even if there are two different
1445 data types in scope with a common field name. For example:
1448 data S = MkS { x :: Int, y :: Bool }
1453 data T = MkT { x :: Int }
1455 ok1 (MkS { x = n }) = n+1 -- Unambiguous
1456 ok2 n = MkT { x = n+1 } -- Unambiguous
1458 bad1 k = k { x = 3 } -- Ambiguous
1459 bad2 k = x k -- Ambiguous
1461 Even though there are two <literal>x</literal>'s in scope,
1462 it is clear that the <literal>x</literal> in the pattern in the
1463 definition of <literal>ok1</literal> can only mean the field
1464 <literal>x</literal> from type <literal>S</literal>. Similarly for
1465 the function <literal>ok2</literal>. However, in the record update
1466 in <literal>bad1</literal> and the record selection in <literal>bad2</literal>
1467 it is not clear which of the two types is intended.
1470 Haskell 98 regards all four as ambiguous, but with the
1471 <option>-XDisambiguateRecordFields</option> flag, GHC will accept
1472 the former two. The rules are precisely the same as those for instance
1473 declarations in Haskell 98, where the method names on the left-hand side
1474 of the method bindings in an instance declaration refer unambiguously
1475 to the method of that class (provided they are in scope at all), even
1476 if there are other variables in scope with the same name.
1477 This reduces the clutter of qualified names when you import two
1478 records from different modules that use the same field name.
1484 Field disambiguation can be combined with punning (see <xref linkend="record-puns"/>). For exampe:
1489 ok3 (MkS { x }) = x+1 -- Uses both disambiguation and punning
1494 With <option>-XDisambiguateRecordFields</option> you can use <emphasis>unqualifed</emphasis>
1495 field names even if the correponding selector is only in scope <emphasis>qualified</emphasis>
1496 For example, assuming the same module <literal>M</literal> as in our earlier example, this is legal:
1499 import qualified M -- Note qualified
1501 ok4 (M.MkS { x = n }) = n+1 -- Unambiguous
1503 Since the constructore <literal>MkS</literal> is only in scope qualified, you must
1504 name it <literal>M.MkS</literal>, but the field <literal>x</literal> does not need
1505 to be qualified even though <literal>M.x</literal> is in scope but <literal>x</literal>
1506 is not. (In effect, it is qualified by the constructor.)
1513 <!-- ===================== Record puns =================== -->
1515 <sect2 id="record-puns">
1520 Record puns are enabled by the flag <literal>-XNamedFieldPuns</literal>.
1524 When using records, it is common to write a pattern that binds a
1525 variable with the same name as a record field, such as:
1528 data C = C {a :: Int}
1534 Record punning permits the variable name to be elided, so one can simply
1541 to mean the same pattern as above. That is, in a record pattern, the
1542 pattern <literal>a</literal> expands into the pattern <literal>a =
1543 a</literal> for the same name <literal>a</literal>.
1550 Record punning can also be used in an expression, writing, for example,
1556 let a = 1 in C {a = a}
1558 The expansion is purely syntactic, so the expanded right-hand side
1559 expression refers to the nearest enclosing variable that is spelled the
1560 same as the field name.
1564 Puns and other patterns can be mixed in the same record:
1566 data C = C {a :: Int, b :: Int}
1567 f (C {a, b = 4}) = a
1572 Puns can be used wherever record patterns occur (e.g. in
1573 <literal>let</literal> bindings or at the top-level).
1577 A pun on a qualified field name is expanded by stripping off the module qualifier.
1584 f (M.C {M.a = a}) = a
1586 (This is useful if the field selector <literal>a</literal> for constructor <literal>M.C</literal>
1587 is only in scope in qualified form.)
1595 <!-- ===================== Record wildcards =================== -->
1597 <sect2 id="record-wildcards">
1598 <title>Record wildcards
1602 Record wildcards are enabled by the flag <literal>-XRecordWildCards</literal>.
1603 This flag implies <literal>-XDisambiguateRecordFields</literal>.
1607 For records with many fields, it can be tiresome to write out each field
1608 individually in a record pattern, as in
1610 data C = C {a :: Int, b :: Int, c :: Int, d :: Int}
1611 f (C {a = 1, b = b, c = c, d = d}) = b + c + d
1616 Record wildcard syntax permits a "<literal>..</literal>" in a record
1617 pattern, where each elided field <literal>f</literal> is replaced by the
1618 pattern <literal>f = f</literal>. For example, the above pattern can be
1621 f (C {a = 1, ..}) = b + c + d
1629 Wildcards can be mixed with other patterns, including puns
1630 (<xref linkend="record-puns"/>); for example, in a pattern <literal>C {a
1631 = 1, b, ..})</literal>. Additionally, record wildcards can be used
1632 wherever record patterns occur, including in <literal>let</literal>
1633 bindings and at the top-level. For example, the top-level binding
1637 defines <literal>b</literal>, <literal>c</literal>, and
1638 <literal>d</literal>.
1642 Record wildcards can also be used in expressions, writing, for example,
1644 let {a = 1; b = 2; c = 3; d = 4} in C {..}
1648 let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d}
1650 The expansion is purely syntactic, so the record wildcard
1651 expression refers to the nearest enclosing variables that are spelled
1652 the same as the omitted field names.
1656 The "<literal>..</literal>" expands to the missing
1657 <emphasis>in-scope</emphasis> record fields, where "in scope"
1658 includes both unqualified and qualified-only.
1659 Any fields that are not in scope are not filled in. For example
1662 data R = R { a,b,c :: Int }
1664 import qualified M( R(a,b) )
1667 The <literal>{..}</literal> expands to <literal>{M.a=a,M.b=b}</literal>,
1668 omitting <literal>c</literal> since it is not in scope at all.
1675 <!-- ===================== Local fixity declarations =================== -->
1677 <sect2 id="local-fixity-declarations">
1678 <title>Local Fixity Declarations
1681 <para>A careful reading of the Haskell 98 Report reveals that fixity
1682 declarations (<literal>infix</literal>, <literal>infixl</literal>, and
1683 <literal>infixr</literal>) are permitted to appear inside local bindings
1684 such those introduced by <literal>let</literal> and
1685 <literal>where</literal>. However, the Haskell Report does not specify
1686 the semantics of such bindings very precisely.
1689 <para>In GHC, a fixity declaration may accompany a local binding:
1696 and the fixity declaration applies wherever the binding is in scope.
1697 For example, in a <literal>let</literal>, it applies in the right-hand
1698 sides of other <literal>let</literal>-bindings and the body of the
1699 <literal>let</literal>C. Or, in recursive <literal>do</literal>
1700 expressions (<xref linkend="mdo-notation"/>), the local fixity
1701 declarations of a <literal>let</literal> statement scope over other
1702 statements in the group, just as the bound name does.
1706 Moreover, a local fixity declaration *must* accompany a local binding of
1707 that name: it is not possible to revise the fixity of name bound
1710 let infixr 9 $ in ...
1713 Because local fixity declarations are technically Haskell 98, no flag is
1714 necessary to enable them.
1718 <sect2 id="package-imports">
1719 <title>Package-qualified imports</title>
1721 <para>With the <option>-XPackageImports</option> flag, GHC allows
1722 import declarations to be qualified by the package name that the
1723 module is intended to be imported from. For example:</para>
1726 import "network" Network.Socket
1729 <para>would import the module <literal>Network.Socket</literal> from
1730 the package <literal>network</literal> (any version). This may
1731 be used to disambiguate an import when the same module is
1732 available from multiple packages, or is present in both the
1733 current package being built and an external package.</para>
1735 <para>Note: you probably don't need to use this feature, it was
1736 added mainly so that we can build backwards-compatible versions of
1737 packages when APIs change. It can lead to fragile dependencies in
1738 the common case: modules occasionally move from one package to
1739 another, rendering any package-qualified imports broken.</para>
1742 <sect2 id="syntax-stolen">
1743 <title>Summary of stolen syntax</title>
1745 <para>Turning on an option that enables special syntax
1746 <emphasis>might</emphasis> cause working Haskell 98 code to fail
1747 to compile, perhaps because it uses a variable name which has
1748 become a reserved word. This section lists the syntax that is
1749 "stolen" by language extensions.
1751 notation and nonterminal names from the Haskell 98 lexical syntax
1752 (see the Haskell 98 Report).
1753 We only list syntax changes here that might affect
1754 existing working programs (i.e. "stolen" syntax). Many of these
1755 extensions will also enable new context-free syntax, but in all
1756 cases programs written to use the new syntax would not be
1757 compilable without the option enabled.</para>
1759 <para>There are two classes of special
1764 <para>New reserved words and symbols: character sequences
1765 which are no longer available for use as identifiers in the
1769 <para>Other special syntax: sequences of characters that have
1770 a different meaning when this particular option is turned
1775 The following syntax is stolen:
1780 <literal>forall</literal>
1781 <indexterm><primary><literal>forall</literal></primary></indexterm>
1784 Stolen (in types) by: <option>-XExplicitForAll</option>, and hence by
1785 <option>-XScopedTypeVariables</option>,
1786 <option>-XLiberalTypeSynonyms</option>,
1787 <option>-XRank2Types</option>,
1788 <option>-XRankNTypes</option>,
1789 <option>-XPolymorphicComponents</option>,
1790 <option>-XExistentialQuantification</option>
1796 <literal>mdo</literal>
1797 <indexterm><primary><literal>mdo</literal></primary></indexterm>
1800 Stolen by: <option>-XRecursiveDo</option>,
1806 <literal>foreign</literal>
1807 <indexterm><primary><literal>foreign</literal></primary></indexterm>
1810 Stolen by: <option>-XForeignFunctionInterface</option>,
1816 <literal>rec</literal>,
1817 <literal>proc</literal>, <literal>-<</literal>,
1818 <literal>>-</literal>, <literal>-<<</literal>,
1819 <literal>>>-</literal>, and <literal>(|</literal>,
1820 <literal>|)</literal> brackets
1821 <indexterm><primary><literal>proc</literal></primary></indexterm>
1824 Stolen by: <option>-XArrows</option>,
1830 <literal>?<replaceable>varid</replaceable></literal>,
1831 <literal>%<replaceable>varid</replaceable></literal>
1832 <indexterm><primary>implicit parameters</primary></indexterm>
1835 Stolen by: <option>-XImplicitParams</option>,
1841 <literal>[|</literal>,
1842 <literal>[e|</literal>, <literal>[p|</literal>,
1843 <literal>[d|</literal>, <literal>[t|</literal>,
1844 <literal>$(</literal>,
1845 <literal>$<replaceable>varid</replaceable></literal>
1846 <indexterm><primary>Template Haskell</primary></indexterm>
1849 Stolen by: <option>-XTemplateHaskell</option>,
1855 <literal>[:<replaceable>varid</replaceable>|</literal>
1856 <indexterm><primary>quasi-quotation</primary></indexterm>
1859 Stolen by: <option>-XQuasiQuotes</option>,
1865 <replaceable>varid</replaceable>{<literal>#</literal>},
1866 <replaceable>char</replaceable><literal>#</literal>,
1867 <replaceable>string</replaceable><literal>#</literal>,
1868 <replaceable>integer</replaceable><literal>#</literal>,
1869 <replaceable>float</replaceable><literal>#</literal>,
1870 <replaceable>float</replaceable><literal>##</literal>,
1871 <literal>(#</literal>, <literal>#)</literal>,
1874 Stolen by: <option>-XMagicHash</option>,
1883 <!-- TYPE SYSTEM EXTENSIONS -->
1884 <sect1 id="data-type-extensions">
1885 <title>Extensions to data types and type synonyms</title>
1887 <sect2 id="nullary-types">
1888 <title>Data types with no constructors</title>
1890 <para>With the <option>-fglasgow-exts</option> flag, GHC lets you declare
1891 a data type with no constructors. For example:</para>
1895 data T a -- T :: * -> *
1898 <para>Syntactically, the declaration lacks the "= constrs" part. The
1899 type can be parameterised over types of any kind, but if the kind is
1900 not <literal>*</literal> then an explicit kind annotation must be used
1901 (see <xref linkend="kinding"/>).</para>
1903 <para>Such data types have only one value, namely bottom.
1904 Nevertheless, they can be useful when defining "phantom types".</para>
1907 <sect2 id="infix-tycons">
1908 <title>Infix type constructors, classes, and type variables</title>
1911 GHC allows type constructors, classes, and type variables to be operators, and
1912 to be written infix, very much like expressions. More specifically:
1915 A type constructor or class can be an operator, beginning with a colon; e.g. <literal>:*:</literal>.
1916 The lexical syntax is the same as that for data constructors.
1919 Data type and type-synonym declarations can be written infix, parenthesised
1920 if you want further arguments. E.g.
1922 data a :*: b = Foo a b
1923 type a :+: b = Either a b
1924 class a :=: b where ...
1926 data (a :**: b) x = Baz a b x
1927 type (a :++: b) y = Either (a,b) y
1931 Types, and class constraints, can be written infix. For example
1934 f :: (a :=: b) => a -> b
1938 A type variable can be an (unqualified) operator e.g. <literal>+</literal>.
1939 The lexical syntax is the same as that for variable operators, excluding "(.)",
1940 "(!)", and "(*)". In a binding position, the operator must be
1941 parenthesised. For example:
1943 type T (+) = Int + Int
1947 liftA2 :: Arrow (~>)
1948 => (a -> b -> c) -> (e ~> a) -> (e ~> b) -> (e ~> c)
1954 as for expressions, both for type constructors and type variables; e.g. <literal>Int `Either` Bool</literal>, or
1955 <literal>Int `a` Bool</literal>. Similarly, parentheses work the same; e.g. <literal>(:*:) Int Bool</literal>.
1958 Fixities may be declared for type constructors, or classes, just as for data constructors. However,
1959 one cannot distinguish between the two in a fixity declaration; a fixity declaration
1960 sets the fixity for a data constructor and the corresponding type constructor. For example:
1964 sets the fixity for both type constructor <literal>T</literal> and data constructor <literal>T</literal>,
1965 and similarly for <literal>:*:</literal>.
1966 <literal>Int `a` Bool</literal>.
1969 Function arrow is <literal>infixr</literal> with fixity 0. (This might change; I'm not sure what it should be.)
1976 <sect2 id="type-synonyms">
1977 <title>Liberalised type synonyms</title>
1980 Type synonyms are like macros at the type level, but Haskell 98 imposes many rules
1981 on individual synonym declarations.
1982 With the <option>-XLiberalTypeSynonyms</option> extension,
1983 GHC does validity checking on types <emphasis>only after expanding type synonyms</emphasis>.
1984 That means that GHC can be very much more liberal about type synonyms than Haskell 98.
1987 <listitem> <para>You can write a <literal>forall</literal> (including overloading)
1988 in a type synonym, thus:
1990 type Discard a = forall b. Show b => a -> b -> (a, String)
1995 g :: Discard Int -> (Int,String) -- A rank-2 type
2002 If you also use <option>-XUnboxedTuples</option>,
2003 you can write an unboxed tuple in a type synonym:
2005 type Pr = (# Int, Int #)
2013 You can apply a type synonym to a forall type:
2015 type Foo a = a -> a -> Bool
2017 f :: Foo (forall b. b->b)
2019 After expanding the synonym, <literal>f</literal> has the legal (in GHC) type:
2021 f :: (forall b. b->b) -> (forall b. b->b) -> Bool
2026 You can apply a type synonym to a partially applied type synonym:
2028 type Generic i o = forall x. i x -> o x
2031 foo :: Generic Id []
2033 After expanding the synonym, <literal>foo</literal> has the legal (in GHC) type:
2035 foo :: forall x. x -> [x]
2043 GHC currently does kind checking before expanding synonyms (though even that
2047 After expanding type synonyms, GHC does validity checking on types, looking for
2048 the following mal-formedness which isn't detected simply by kind checking:
2051 Type constructor applied to a type involving for-alls.
2054 Unboxed tuple on left of an arrow.
2057 Partially-applied type synonym.
2061 this will be rejected:
2063 type Pr = (# Int, Int #)
2068 because GHC does not allow unboxed tuples on the left of a function arrow.
2073 <sect2 id="existential-quantification">
2074 <title>Existentially quantified data constructors
2078 The idea of using existential quantification in data type declarations
2079 was suggested by Perry, and implemented in Hope+ (Nigel Perry, <emphasis>The Implementation
2080 of Practical Functional Programming Languages</emphasis>, PhD Thesis, University of
2081 London, 1991). It was later formalised by Laufer and Odersky
2082 (<emphasis>Polymorphic type inference and abstract data types</emphasis>,
2083 TOPLAS, 16(5), pp1411-1430, 1994).
2084 It's been in Lennart
2085 Augustsson's <command>hbc</command> Haskell compiler for several years, and
2086 proved very useful. Here's the idea. Consider the declaration:
2092 data Foo = forall a. MkFoo a (a -> Bool)
2099 The data type <literal>Foo</literal> has two constructors with types:
2105 MkFoo :: forall a. a -> (a -> Bool) -> Foo
2112 Notice that the type variable <literal>a</literal> in the type of <function>MkFoo</function>
2113 does not appear in the data type itself, which is plain <literal>Foo</literal>.
2114 For example, the following expression is fine:
2120 [MkFoo 3 even, MkFoo 'c' isUpper] :: [Foo]
2126 Here, <literal>(MkFoo 3 even)</literal> packages an integer with a function
2127 <function>even</function> that maps an integer to <literal>Bool</literal>; and <function>MkFoo 'c'
2128 isUpper</function> packages a character with a compatible function. These
2129 two things are each of type <literal>Foo</literal> and can be put in a list.
2133 What can we do with a value of type <literal>Foo</literal>?. In particular,
2134 what happens when we pattern-match on <function>MkFoo</function>?
2140 f (MkFoo val fn) = ???
2146 Since all we know about <literal>val</literal> and <function>fn</function> is that they
2147 are compatible, the only (useful) thing we can do with them is to
2148 apply <function>fn</function> to <literal>val</literal> to get a boolean. For example:
2155 f (MkFoo val fn) = fn val
2161 What this allows us to do is to package heterogeneous values
2162 together with a bunch of functions that manipulate them, and then treat
2163 that collection of packages in a uniform manner. You can express
2164 quite a bit of object-oriented-like programming this way.
2167 <sect3 id="existential">
2168 <title>Why existential?
2172 What has this to do with <emphasis>existential</emphasis> quantification?
2173 Simply that <function>MkFoo</function> has the (nearly) isomorphic type
2179 MkFoo :: (exists a . (a, a -> Bool)) -> Foo
2185 But Haskell programmers can safely think of the ordinary
2186 <emphasis>universally</emphasis> quantified type given above, thereby avoiding
2187 adding a new existential quantification construct.
2192 <sect3 id="existential-with-context">
2193 <title>Existentials and type classes</title>
2196 An easy extension is to allow
2197 arbitrary contexts before the constructor. For example:
2203 data Baz = forall a. Eq a => Baz1 a a
2204 | forall b. Show b => Baz2 b (b -> b)
2210 The two constructors have the types you'd expect:
2216 Baz1 :: forall a. Eq a => a -> a -> Baz
2217 Baz2 :: forall b. Show b => b -> (b -> b) -> Baz
2223 But when pattern matching on <function>Baz1</function> the matched values can be compared
2224 for equality, and when pattern matching on <function>Baz2</function> the first matched
2225 value can be converted to a string (as well as applying the function to it).
2226 So this program is legal:
2233 f (Baz1 p q) | p == q = "Yes"
2235 f (Baz2 v fn) = show (fn v)
2241 Operationally, in a dictionary-passing implementation, the
2242 constructors <function>Baz1</function> and <function>Baz2</function> must store the
2243 dictionaries for <literal>Eq</literal> and <literal>Show</literal> respectively, and
2244 extract it on pattern matching.
2249 <sect3 id="existential-records">
2250 <title>Record Constructors</title>
2253 GHC allows existentials to be used with records syntax as well. For example:
2256 data Counter a = forall self. NewCounter
2258 , _inc :: self -> self
2259 , _display :: self -> IO ()
2263 Here <literal>tag</literal> is a public field, with a well-typed selector
2264 function <literal>tag :: Counter a -> a</literal>. The <literal>self</literal>
2265 type is hidden from the outside; any attempt to apply <literal>_this</literal>,
2266 <literal>_inc</literal> or <literal>_display</literal> as functions will raise a
2267 compile-time error. In other words, <emphasis>GHC defines a record selector function
2268 only for fields whose type does not mention the existentially-quantified variables</emphasis>.
2269 (This example used an underscore in the fields for which record selectors
2270 will not be defined, but that is only programming style; GHC ignores them.)
2274 To make use of these hidden fields, we need to create some helper functions:
2277 inc :: Counter a -> Counter a
2278 inc (NewCounter x i d t) = NewCounter
2279 { _this = i x, _inc = i, _display = d, tag = t }
2281 display :: Counter a -> IO ()
2282 display NewCounter{ _this = x, _display = d } = d x
2285 Now we can define counters with different underlying implementations:
2288 counterA :: Counter String
2289 counterA = NewCounter
2290 { _this = 0, _inc = (1+), _display = print, tag = "A" }
2292 counterB :: Counter String
2293 counterB = NewCounter
2294 { _this = "", _inc = ('#':), _display = putStrLn, tag = "B" }
2297 display (inc counterA) -- prints "1"
2298 display (inc (inc counterB)) -- prints "##"
2301 Record update syntax is supported for existentials (and GADTs):
2303 setTag :: Counter a -> a -> Counter a
2304 setTag obj t = obj{ tag = t }
2306 The rule for record update is this: <emphasis>
2307 the types of the updated fields may
2308 mention only the universally-quantified type variables
2309 of the data constructor. For GADTs, the field may mention only types
2310 that appear as a simple type-variable argument in the constructor's result
2311 type</emphasis>. For example:
2313 data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existential
2314 upd1 t x = t { f1=x } -- OK: upd1 :: T a b -> a' -> T a' b
2315 upd2 t x = t { f3=x } -- BAD (f3's type mentions c, which is
2316 -- existentially quantified)
2318 data G a b where { G1 { g1::a, g2::c } :: G a [c] }
2319 upd3 g x = g { g1=x } -- OK: upd3 :: G a b -> c -> G c b
2320 upd4 g x = g { g2=x } -- BAD (f2's type mentions c, which is not a simple
2321 -- type-variable argument in G1's result type)
2329 <title>Restrictions</title>
2332 There are several restrictions on the ways in which existentially-quantified
2333 constructors can be use.
2342 When pattern matching, each pattern match introduces a new,
2343 distinct, type for each existential type variable. These types cannot
2344 be unified with any other type, nor can they escape from the scope of
2345 the pattern match. For example, these fragments are incorrect:
2353 Here, the type bound by <function>MkFoo</function> "escapes", because <literal>a</literal>
2354 is the result of <function>f1</function>. One way to see why this is wrong is to
2355 ask what type <function>f1</function> has:
2359 f1 :: Foo -> a -- Weird!
2363 What is this "<literal>a</literal>" in the result type? Clearly we don't mean
2368 f1 :: forall a. Foo -> a -- Wrong!
2372 The original program is just plain wrong. Here's another sort of error
2376 f2 (Baz1 a b) (Baz1 p q) = a==q
2380 It's ok to say <literal>a==b</literal> or <literal>p==q</literal>, but
2381 <literal>a==q</literal> is wrong because it equates the two distinct types arising
2382 from the two <function>Baz1</function> constructors.
2390 You can't pattern-match on an existentially quantified
2391 constructor in a <literal>let</literal> or <literal>where</literal> group of
2392 bindings. So this is illegal:
2396 f3 x = a==b where { Baz1 a b = x }
2399 Instead, use a <literal>case</literal> expression:
2402 f3 x = case x of Baz1 a b -> a==b
2405 In general, you can only pattern-match
2406 on an existentially-quantified constructor in a <literal>case</literal> expression or
2407 in the patterns of a function definition.
2409 The reason for this restriction is really an implementation one.
2410 Type-checking binding groups is already a nightmare without
2411 existentials complicating the picture. Also an existential pattern
2412 binding at the top level of a module doesn't make sense, because it's
2413 not clear how to prevent the existentially-quantified type "escaping".
2414 So for now, there's a simple-to-state restriction. We'll see how
2422 You can't use existential quantification for <literal>newtype</literal>
2423 declarations. So this is illegal:
2427 newtype T = forall a. Ord a => MkT a
2431 Reason: a value of type <literal>T</literal> must be represented as a
2432 pair of a dictionary for <literal>Ord t</literal> and a value of type
2433 <literal>t</literal>. That contradicts the idea that
2434 <literal>newtype</literal> should have no concrete representation.
2435 You can get just the same efficiency and effect by using
2436 <literal>data</literal> instead of <literal>newtype</literal>. If
2437 there is no overloading involved, then there is more of a case for
2438 allowing an existentially-quantified <literal>newtype</literal>,
2439 because the <literal>data</literal> version does carry an
2440 implementation cost, but single-field existentially quantified
2441 constructors aren't much use. So the simple restriction (no
2442 existential stuff on <literal>newtype</literal>) stands, unless there
2443 are convincing reasons to change it.
2451 You can't use <literal>deriving</literal> to define instances of a
2452 data type with existentially quantified data constructors.
2454 Reason: in most cases it would not make sense. For example:;
2457 data T = forall a. MkT [a] deriving( Eq )
2460 To derive <literal>Eq</literal> in the standard way we would need to have equality
2461 between the single component of two <function>MkT</function> constructors:
2465 (MkT a) == (MkT b) = ???
2468 But <varname>a</varname> and <varname>b</varname> have distinct types, and so can't be compared.
2469 It's just about possible to imagine examples in which the derived instance
2470 would make sense, but it seems altogether simpler simply to prohibit such
2471 declarations. Define your own instances!
2482 <!-- ====================== Generalised algebraic data types ======================= -->
2484 <sect2 id="gadt-style">
2485 <title>Declaring data types with explicit constructor signatures</title>
2487 <para>GHC allows you to declare an algebraic data type by
2488 giving the type signatures of constructors explicitly. For example:
2492 Just :: a -> Maybe a
2494 The form is called a "GADT-style declaration"
2495 because Generalised Algebraic Data Types, described in <xref linkend="gadt"/>,
2496 can only be declared using this form.</para>
2497 <para>Notice that GADT-style syntax generalises existential types (<xref linkend="existential-quantification"/>).
2498 For example, these two declarations are equivalent:
2500 data Foo = forall a. MkFoo a (a -> Bool)
2501 data Foo' where { MKFoo :: a -> (a->Bool) -> Foo' }
2504 <para>Any data type that can be declared in standard Haskell-98 syntax
2505 can also be declared using GADT-style syntax.
2506 The choice is largely stylistic, but GADT-style declarations differ in one important respect:
2507 they treat class constraints on the data constructors differently.
2508 Specifically, if the constructor is given a type-class context, that
2509 context is made available by pattern matching. For example:
2512 MkSet :: Eq a => [a] -> Set a
2514 makeSet :: Eq a => [a] -> Set a
2515 makeSet xs = MkSet (nub xs)
2517 insert :: a -> Set a -> Set a
2518 insert a (MkSet as) | a `elem` as = MkSet as
2519 | otherwise = MkSet (a:as)
2521 A use of <literal>MkSet</literal> as a constructor (e.g. in the definition of <literal>makeSet</literal>)
2522 gives rise to a <literal>(Eq a)</literal>
2523 constraint, as you would expect. The new feature is that pattern-matching on <literal>MkSet</literal>
2524 (as in the definition of <literal>insert</literal>) makes <emphasis>available</emphasis> an <literal>(Eq a)</literal>
2525 context. In implementation terms, the <literal>MkSet</literal> constructor has a hidden field that stores
2526 the <literal>(Eq a)</literal> dictionary that is passed to <literal>MkSet</literal>; so
2527 when pattern-matching that dictionary becomes available for the right-hand side of the match.
2528 In the example, the equality dictionary is used to satisfy the equality constraint
2529 generated by the call to <literal>elem</literal>, so that the type of
2530 <literal>insert</literal> itself has no <literal>Eq</literal> constraint.
2533 For example, one possible application is to reify dictionaries:
2535 data NumInst a where
2536 MkNumInst :: Num a => NumInst a
2538 intInst :: NumInst Int
2541 plus :: NumInst a -> a -> a -> a
2542 plus MkNumInst p q = p + q
2544 Here, a value of type <literal>NumInst a</literal> is equivalent
2545 to an explicit <literal>(Num a)</literal> dictionary.
2548 All this applies to constructors declared using the syntax of <xref linkend="existential-with-context"/>.
2549 For example, the <literal>NumInst</literal> data type above could equivalently be declared
2553 = Num a => MkNumInst (NumInst a)
2555 Notice that, unlike the situation when declaring an existential, there is
2556 no <literal>forall</literal>, because the <literal>Num</literal> constrains the
2557 data type's universally quantified type variable <literal>a</literal>.
2558 A constructor may have both universal and existential type variables: for example,
2559 the following two declarations are equivalent:
2562 = forall b. (Num a, Eq b) => MkT1 a b
2564 MkT2 :: (Num a, Eq b) => a -> b -> T2 a
2567 <para>All this behaviour contrasts with Haskell 98's peculiar treatment of
2568 contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report).
2569 In Haskell 98 the definition
2571 data Eq a => Set' a = MkSet' [a]
2573 gives <literal>MkSet'</literal> the same type as <literal>MkSet</literal> above. But instead of
2574 <emphasis>making available</emphasis> an <literal>(Eq a)</literal> constraint, pattern-matching
2575 on <literal>MkSet'</literal> <emphasis>requires</emphasis> an <literal>(Eq a)</literal> constraint!
2576 GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations,
2577 GHC's behaviour is much more useful, as well as much more intuitive.
2581 The rest of this section gives further details about GADT-style data
2586 The result type of each data constructor must begin with the type constructor being defined.
2587 If the result type of all constructors
2588 has the form <literal>T a1 ... an</literal>, where <literal>a1 ... an</literal>
2589 are distinct type variables, then the data type is <emphasis>ordinary</emphasis>;
2590 otherwise is a <emphasis>generalised</emphasis> data type (<xref linkend="gadt"/>).
2594 As with other type signatures, you can give a single signature for several data constructors.
2595 In this example we give a single signature for <literal>T1</literal> and <literal>T2</literal>:
2604 The type signature of
2605 each constructor is independent, and is implicitly universally quantified as usual.
2606 In particular, the type variable(s) in the "<literal>data T a where</literal>" header
2607 have no scope, and different constructors may have different universally-quantified type variables:
2609 data T a where -- The 'a' has no scope
2610 T1,T2 :: b -> T b -- Means forall b. b -> T b
2611 T3 :: T a -- Means forall a. T a
2616 A constructor signature may mention type class constraints, which can differ for
2617 different constructors. For example, this is fine:
2620 T1 :: Eq b => b -> b -> T b
2621 T2 :: (Show c, Ix c) => c -> [c] -> T c
2623 When patten matching, these constraints are made available to discharge constraints
2624 in the body of the match. For example:
2627 f (T1 x y) | x==y = "yes"
2631 Note that <literal>f</literal> is not overloaded; the <literal>Eq</literal> constraint arising
2632 from the use of <literal>==</literal> is discharged by the pattern match on <literal>T1</literal>
2633 and similarly the <literal>Show</literal> constraint arising from the use of <literal>show</literal>.
2637 Unlike a Haskell-98-style
2638 data type declaration, the type variable(s) in the "<literal>data Set a where</literal>" header
2639 have no scope. Indeed, one can write a kind signature instead:
2641 data Set :: * -> * where ...
2643 or even a mixture of the two:
2645 data Bar a :: (* -> *) -> * where ...
2647 The type variables (if given) may be explicitly kinded, so we could also write the header for <literal>Foo</literal>
2650 data Bar a (b :: * -> *) where ...
2656 You can use strictness annotations, in the obvious places
2657 in the constructor type:
2660 Lit :: !Int -> Term Int
2661 If :: Term Bool -> !(Term a) -> !(Term a) -> Term a
2662 Pair :: Term a -> Term b -> Term (a,b)
2667 You can use a <literal>deriving</literal> clause on a GADT-style data type
2668 declaration. For example, these two declarations are equivalent
2670 data Maybe1 a where {
2671 Nothing1 :: Maybe1 a ;
2672 Just1 :: a -> Maybe1 a
2673 } deriving( Eq, Ord )
2675 data Maybe2 a = Nothing2 | Just2 a
2681 The type signature may have quantified type variables that do not appear
2685 MkFoo :: a -> (a->Bool) -> Foo
2688 Here the type variable <literal>a</literal> does not appear in the result type
2689 of either constructor.
2690 Although it is universally quantified in the type of the constructor, such
2691 a type variable is often called "existential".
2692 Indeed, the above declaration declares precisely the same type as
2693 the <literal>data Foo</literal> in <xref linkend="existential-quantification"/>.
2695 The type may contain a class context too, of course:
2698 MkShowable :: Show a => a -> Showable
2703 You can use record syntax on a GADT-style data type declaration:
2707 Adult :: { name :: String, children :: [Person] } -> Person
2708 Child :: Show a => { name :: !String, funny :: a } -> Person
2710 As usual, for every constructor that has a field <literal>f</literal>, the type of
2711 field <literal>f</literal> must be the same (modulo alpha conversion).
2712 The <literal>Child</literal> constructor above shows that the signature
2713 may have a context, existentially-quantified variables, and strictness annotations,
2714 just as in the non-record case. (NB: the "type" that follows the double-colon
2715 is not really a type, because of the record syntax and strictness annotations.
2716 A "type" of this form can appear only in a constructor signature.)
2720 Record updates are allowed with GADT-style declarations,
2721 only fields that have the following property: the type of the field
2722 mentions no existential type variables.
2726 As in the case of existentials declared using the Haskell-98-like record syntax
2727 (<xref linkend="existential-records"/>),
2728 record-selector functions are generated only for those fields that have well-typed
2730 Here is the example of that section, in GADT-style syntax:
2732 data Counter a where
2733 NewCounter { _this :: self
2734 , _inc :: self -> self
2735 , _display :: self -> IO ()
2740 As before, only one selector function is generated here, that for <literal>tag</literal>.
2741 Nevertheless, you can still use all the field names in pattern matching and record construction.
2743 </itemizedlist></para>
2747 <title>Generalised Algebraic Data Types (GADTs)</title>
2749 <para>Generalised Algebraic Data Types generalise ordinary algebraic data types
2750 by allowing constructors to have richer return types. Here is an example:
2753 Lit :: Int -> Term Int
2754 Succ :: Term Int -> Term Int
2755 IsZero :: Term Int -> Term Bool
2756 If :: Term Bool -> Term a -> Term a -> Term a
2757 Pair :: Term a -> Term b -> Term (a,b)
2759 Notice that the return type of the constructors is not always <literal>Term a</literal>, as is the
2760 case with ordinary data types. This generality allows us to
2761 write a well-typed <literal>eval</literal> function
2762 for these <literal>Terms</literal>:
2766 eval (Succ t) = 1 + eval t
2767 eval (IsZero t) = eval t == 0
2768 eval (If b e1 e2) = if eval b then eval e1 else eval e2
2769 eval (Pair e1 e2) = (eval e1, eval e2)
2771 The key point about GADTs is that <emphasis>pattern matching causes type refinement</emphasis>.
2772 For example, in the right hand side of the equation
2777 the type <literal>a</literal> is refined to <literal>Int</literal>. That's the whole point!
2778 A precise specification of the type rules is beyond what this user manual aspires to,
2779 but the design closely follows that described in
2781 url="http://research.microsoft.com/%7Esimonpj/papers/gadt/">Simple
2782 unification-based type inference for GADTs</ulink>,
2784 The general principle is this: <emphasis>type refinement is only carried out
2785 based on user-supplied type annotations</emphasis>.
2786 So if no type signature is supplied for <literal>eval</literal>, no type refinement happens,
2787 and lots of obscure error messages will
2788 occur. However, the refinement is quite general. For example, if we had:
2790 eval :: Term a -> a -> a
2791 eval (Lit i) j = i+j
2793 the pattern match causes the type <literal>a</literal> to be refined to <literal>Int</literal> (because of the type
2794 of the constructor <literal>Lit</literal>), and that refinement also applies to the type of <literal>j</literal>, and
2795 the result type of the <literal>case</literal> expression. Hence the addition <literal>i+j</literal> is legal.
2798 These and many other examples are given in papers by Hongwei Xi, and
2799 Tim Sheard. There is a longer introduction
2800 <ulink url="http://www.haskell.org/haskellwiki/GADT">on the wiki</ulink>,
2802 <ulink url="http://www.informatik.uni-bonn.de/~ralf/publications/With.pdf">Fun with phantom types</ulink> also has a number of examples. Note that papers
2803 may use different notation to that implemented in GHC.
2806 The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with
2807 <option>-XGADTs</option>. The <option>-XGADTs</option> flag also sets <option>-XRelaxedPolyRec</option>.
2810 A GADT can only be declared using GADT-style syntax (<xref linkend="gadt-style"/>);
2811 the old Haskell-98 syntax for data declarations always declares an ordinary data type.
2812 The result type of each constructor must begin with the type constructor being defined,
2813 but for a GADT the arguments to the type constructor can be arbitrary monotypes.
2814 For example, in the <literal>Term</literal> data
2815 type above, the type of each constructor must end with <literal>Term ty</literal>, but
2816 the <literal>ty</literal> need not be a type variable (e.g. the <literal>Lit</literal>
2821 It is permitted to declare an ordinary algebraic data type using GADT-style syntax.
2822 What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors
2823 whose result type is not just <literal>T a b</literal>.
2827 You cannot use a <literal>deriving</literal> clause for a GADT; only for
2828 an ordinary data type.
2832 As mentioned in <xref linkend="gadt-style"/>, record syntax is supported.
2836 Lit { val :: Int } :: Term Int
2837 Succ { num :: Term Int } :: Term Int
2838 Pred { num :: Term Int } :: Term Int
2839 IsZero { arg :: Term Int } :: Term Bool
2840 Pair { arg1 :: Term a
2843 If { cnd :: Term Bool
2848 However, for GADTs there is the following additional constraint:
2849 every constructor that has a field <literal>f</literal> must have
2850 the same result type (modulo alpha conversion)
2851 Hence, in the above example, we cannot merge the <literal>num</literal>
2852 and <literal>arg</literal> fields above into a
2853 single name. Although their field types are both <literal>Term Int</literal>,
2854 their selector functions actually have different types:
2857 num :: Term Int -> Term Int
2858 arg :: Term Bool -> Term Int
2863 When pattern-matching against data constructors drawn from a GADT,
2864 for example in a <literal>case</literal> expression, the following rules apply:
2866 <listitem><para>The type of the scrutinee must be rigid.</para></listitem>
2867 <listitem><para>The type of the entire <literal>case</literal> expression must be rigid.</para></listitem>
2868 <listitem><para>The type of any free variable mentioned in any of
2869 the <literal>case</literal> alternatives must be rigid.</para></listitem>
2871 A type is "rigid" if it is completely known to the compiler at its binding site. The easiest
2872 way to ensure that a variable a rigid type is to give it a type signature.
2873 For more precise details see <ulink url="http://research.microsoft.com/%7Esimonpj/papers/gadt">
2874 Simple unification-based type inference for GADTs
2875 </ulink>. The criteria implemented by GHC are given in the Appendix.
2885 <!-- ====================== End of Generalised algebraic data types ======================= -->
2887 <sect1 id="deriving">
2888 <title>Extensions to the "deriving" mechanism</title>
2890 <sect2 id="deriving-inferred">
2891 <title>Inferred context for deriving clauses</title>
2894 The Haskell Report is vague about exactly when a <literal>deriving</literal> clause is
2897 data T0 f a = MkT0 a deriving( Eq )
2898 data T1 f a = MkT1 (f a) deriving( Eq )
2899 data T2 f a = MkT2 (f (f a)) deriving( Eq )
2901 The natural generated <literal>Eq</literal> code would result in these instance declarations:
2903 instance Eq a => Eq (T0 f a) where ...
2904 instance Eq (f a) => Eq (T1 f a) where ...
2905 instance Eq (f (f a)) => Eq (T2 f a) where ...
2907 The first of these is obviously fine. The second is still fine, although less obviously.
2908 The third is not Haskell 98, and risks losing termination of instances.
2911 GHC takes a conservative position: it accepts the first two, but not the third. The rule is this:
2912 each constraint in the inferred instance context must consist only of type variables,
2913 with no repetitions.
2916 This rule is applied regardless of flags. If you want a more exotic context, you can write
2917 it yourself, using the <link linkend="stand-alone-deriving">standalone deriving mechanism</link>.
2921 <sect2 id="stand-alone-deriving">
2922 <title>Stand-alone deriving declarations</title>
2925 GHC now allows stand-alone <literal>deriving</literal> declarations, enabled by <literal>-XStandaloneDeriving</literal>:
2927 data Foo a = Bar a | Baz String
2929 deriving instance Eq a => Eq (Foo a)
2931 The syntax is identical to that of an ordinary instance declaration apart from (a) the keyword
2932 <literal>deriving</literal>, and (b) the absence of the <literal>where</literal> part.
2933 Note the following points:
2936 You must supply an explicit context (in the example the context is <literal>(Eq a)</literal>),
2937 exactly as you would in an ordinary instance declaration.
2938 (In contrast, in a <literal>deriving</literal> clause
2939 attached to a data type declaration, the context is inferred.)
2943 A <literal>deriving instance</literal> declaration
2944 must obey the same rules concerning form and termination as ordinary instance declarations,
2945 controlled by the same flags; see <xref linkend="instance-decls"/>.
2949 Unlike a <literal>deriving</literal>
2950 declaration attached to a <literal>data</literal> declaration, the instance can be more specific
2951 than the data type (assuming you also use
2952 <literal>-XFlexibleInstances</literal>, <xref linkend="instance-rules"/>). Consider
2955 data Foo a = Bar a | Baz String
2957 deriving instance Eq a => Eq (Foo [a])
2958 deriving instance Eq a => Eq (Foo (Maybe a))
2960 This will generate a derived instance for <literal>(Foo [a])</literal> and <literal>(Foo (Maybe a))</literal>,
2961 but other types such as <literal>(Foo (Int,Bool))</literal> will not be an instance of <literal>Eq</literal>.
2965 Unlike a <literal>deriving</literal>
2966 declaration attached to a <literal>data</literal> declaration,
2967 GHC does not restrict the form of the data type. Instead, GHC simply generates the appropriate
2968 boilerplate code for the specified class, and typechecks it. If there is a type error, it is
2969 your problem. (GHC will show you the offending code if it has a type error.)
2970 The merit of this is that you can derive instances for GADTs and other exotic
2971 data types, providing only that the boilerplate code does indeed typecheck. For example:
2977 deriving instance Show (T a)
2979 In this example, you cannot say <literal>... deriving( Show )</literal> on the
2980 data type declaration for <literal>T</literal>,
2981 because <literal>T</literal> is a GADT, but you <emphasis>can</emphasis> generate
2982 the instance declaration using stand-alone deriving.
2987 <para>The stand-alone syntax is generalised for newtypes in exactly the same
2988 way that ordinary <literal>deriving</literal> clauses are generalised (<xref linkend="newtype-deriving"/>).
2991 newtype Foo a = MkFoo (State Int a)
2993 deriving instance MonadState Int Foo
2995 GHC always treats the <emphasis>last</emphasis> parameter of the instance
2996 (<literal>Foo</literal> in this example) as the type whose instance is being derived.
2998 </itemizedlist></para>
3003 <sect2 id="deriving-typeable">
3004 <title>Deriving clause for extra classes (<literal>Typeable</literal>, <literal>Data</literal>, etc)</title>
3007 Haskell 98 allows the programmer to add "<literal>deriving( Eq, Ord )</literal>" to a data type
3008 declaration, to generate a standard instance declaration for classes specified in the <literal>deriving</literal> clause.
3009 In Haskell 98, the only classes that may appear in the <literal>deriving</literal> clause are the standard
3010 classes <literal>Eq</literal>, <literal>Ord</literal>,
3011 <literal>Enum</literal>, <literal>Ix</literal>, <literal>Bounded</literal>, <literal>Read</literal>, and <literal>Show</literal>.
3014 GHC extends this list with several more classes that may be automatically derived:
3016 <listitem><para> With <option>-XDeriveDataTypeable</option>, you can derive instances of the classes
3017 <literal>Typeable</literal>, and <literal>Data</literal>, defined in the library
3018 modules <literal>Data.Typeable</literal> and <literal>Data.Generics</literal> respectively.
3020 <para>An instance of <literal>Typeable</literal> can only be derived if the
3021 data type has seven or fewer type parameters, all of kind <literal>*</literal>.
3022 The reason for this is that the <literal>Typeable</literal> class is derived using the scheme
3024 <ulink url="http://research.microsoft.com/%7Esimonpj/papers/hmap/gmap2.ps">
3025 Scrap More Boilerplate: Reflection, Zips, and Generalised Casts
3027 (Section 7.4 of the paper describes the multiple <literal>Typeable</literal> classes that
3028 are used, and only <literal>Typeable1</literal> up to
3029 <literal>Typeable7</literal> are provided in the library.)
3030 In other cases, there is nothing to stop the programmer writing a <literal>TypableX</literal>
3031 class, whose kind suits that of the data type constructor, and
3032 then writing the data type instance by hand.
3036 <listitem><para> With <option>-XDeriveFunctor</option>, you can derive instances of
3037 the class <literal>Functor</literal>,
3038 defined in <literal>GHC.Base</literal>.
3041 <listitem><para> With <option>-XDeriveFoldable</option>, you can derive instances of
3042 the class <literal>Foldable</literal>,
3043 defined in <literal>Data.Foldable</literal>.
3046 <listitem><para> With <option>-XDeriveTraversable</option>, you can derive instances of
3047 the class <literal>Traversable</literal>,
3048 defined in <literal>Data.Traversable</literal>.
3051 In each case the appropriate class must be in scope before it
3052 can be mentioned in the <literal>deriving</literal> clause.
3056 <sect2 id="newtype-deriving">
3057 <title>Generalised derived instances for newtypes</title>
3060 When you define an abstract type using <literal>newtype</literal>, you may want
3061 the new type to inherit some instances from its representation. In
3062 Haskell 98, you can inherit instances of <literal>Eq</literal>, <literal>Ord</literal>,
3063 <literal>Enum</literal> and <literal>Bounded</literal> by deriving them, but for any
3064 other classes you have to write an explicit instance declaration. For
3065 example, if you define
3068 newtype Dollars = Dollars Int
3071 and you want to use arithmetic on <literal>Dollars</literal>, you have to
3072 explicitly define an instance of <literal>Num</literal>:
3075 instance Num Dollars where
3076 Dollars a + Dollars b = Dollars (a+b)
3079 All the instance does is apply and remove the <literal>newtype</literal>
3080 constructor. It is particularly galling that, since the constructor
3081 doesn't appear at run-time, this instance declaration defines a
3082 dictionary which is <emphasis>wholly equivalent</emphasis> to the <literal>Int</literal>
3083 dictionary, only slower!
3087 <sect3> <title> Generalising the deriving clause </title>
3089 GHC now permits such instances to be derived instead,
3090 using the flag <option>-XGeneralizedNewtypeDeriving</option>,
3093 newtype Dollars = Dollars Int deriving (Eq,Show,Num)
3096 and the implementation uses the <emphasis>same</emphasis> <literal>Num</literal> dictionary
3097 for <literal>Dollars</literal> as for <literal>Int</literal>. Notionally, the compiler
3098 derives an instance declaration of the form
3101 instance Num Int => Num Dollars
3104 which just adds or removes the <literal>newtype</literal> constructor according to the type.
3108 We can also derive instances of constructor classes in a similar
3109 way. For example, suppose we have implemented state and failure monad
3110 transformers, such that
3113 instance Monad m => Monad (State s m)
3114 instance Monad m => Monad (Failure m)
3116 In Haskell 98, we can define a parsing monad by
3118 type Parser tok m a = State [tok] (Failure m) a
3121 which is automatically a monad thanks to the instance declarations
3122 above. With the extension, we can make the parser type abstract,
3123 without needing to write an instance of class <literal>Monad</literal>, via
3126 newtype Parser tok m a = Parser (State [tok] (Failure m) a)
3129 In this case the derived instance declaration is of the form
3131 instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)
3134 Notice that, since <literal>Monad</literal> is a constructor class, the
3135 instance is a <emphasis>partial application</emphasis> of the new type, not the
3136 entire left hand side. We can imagine that the type declaration is
3137 "eta-converted" to generate the context of the instance
3142 We can even derive instances of multi-parameter classes, provided the
3143 newtype is the last class parameter. In this case, a ``partial
3144 application'' of the class appears in the <literal>deriving</literal>
3145 clause. For example, given the class
3148 class StateMonad s m | m -> s where ...
3149 instance Monad m => StateMonad s (State s m) where ...
3151 then we can derive an instance of <literal>StateMonad</literal> for <literal>Parser</literal>s by
3153 newtype Parser tok m a = Parser (State [tok] (Failure m) a)
3154 deriving (Monad, StateMonad [tok])
3157 The derived instance is obtained by completing the application of the
3158 class to the new type:
3161 instance StateMonad [tok] (State [tok] (Failure m)) =>
3162 StateMonad [tok] (Parser tok m)
3167 As a result of this extension, all derived instances in newtype
3168 declarations are treated uniformly (and implemented just by reusing
3169 the dictionary for the representation type), <emphasis>except</emphasis>
3170 <literal>Show</literal> and <literal>Read</literal>, which really behave differently for
3171 the newtype and its representation.
3175 <sect3> <title> A more precise specification </title>
3177 Derived instance declarations are constructed as follows. Consider the
3178 declaration (after expansion of any type synonyms)
3181 newtype T v1...vn = T' (t vk+1...vn) deriving (c1...cm)
3187 The <literal>ci</literal> are partial applications of
3188 classes of the form <literal>C t1'...tj'</literal>, where the arity of <literal>C</literal>
3189 is exactly <literal>j+1</literal>. That is, <literal>C</literal> lacks exactly one type argument.
3192 The <literal>k</literal> is chosen so that <literal>ci (T v1...vk)</literal> is well-kinded.
3195 The type <literal>t</literal> is an arbitrary type.
3198 The type variables <literal>vk+1...vn</literal> do not occur in <literal>t</literal>,
3199 nor in the <literal>ci</literal>, and
3202 None of the <literal>ci</literal> is <literal>Read</literal>, <literal>Show</literal>,
3203 <literal>Typeable</literal>, or <literal>Data</literal>. These classes
3204 should not "look through" the type or its constructor. You can still
3205 derive these classes for a newtype, but it happens in the usual way, not
3206 via this new mechanism.
3209 Then, for each <literal>ci</literal>, the derived instance
3212 instance ci t => ci (T v1...vk)
3214 As an example which does <emphasis>not</emphasis> work, consider
3216 newtype NonMonad m s = NonMonad (State s m s) deriving Monad
3218 Here we cannot derive the instance
3220 instance Monad (State s m) => Monad (NonMonad m)
3223 because the type variable <literal>s</literal> occurs in <literal>State s m</literal>,
3224 and so cannot be "eta-converted" away. It is a good thing that this
3225 <literal>deriving</literal> clause is rejected, because <literal>NonMonad m</literal> is
3226 not, in fact, a monad --- for the same reason. Try defining
3227 <literal>>>=</literal> with the correct type: you won't be able to.
3231 Notice also that the <emphasis>order</emphasis> of class parameters becomes
3232 important, since we can only derive instances for the last one. If the
3233 <literal>StateMonad</literal> class above were instead defined as
3236 class StateMonad m s | m -> s where ...
3239 then we would not have been able to derive an instance for the
3240 <literal>Parser</literal> type above. We hypothesise that multi-parameter
3241 classes usually have one "main" parameter for which deriving new
3242 instances is most interesting.
3244 <para>Lastly, all of this applies only for classes other than
3245 <literal>Read</literal>, <literal>Show</literal>, <literal>Typeable</literal>,
3246 and <literal>Data</literal>, for which the built-in derivation applies (section
3247 4.3.3. of the Haskell Report).
3248 (For the standard classes <literal>Eq</literal>, <literal>Ord</literal>,
3249 <literal>Ix</literal>, and <literal>Bounded</literal> it is immaterial whether
3250 the standard method is used or the one described here.)
3257 <!-- TYPE SYSTEM EXTENSIONS -->
3258 <sect1 id="type-class-extensions">
3259 <title>Class and instances declarations</title>
3261 <sect2 id="multi-param-type-classes">
3262 <title>Class declarations</title>
3265 This section, and the next one, documents GHC's type-class extensions.
3266 There's lots of background in the paper <ulink
3267 url="http://research.microsoft.com/~simonpj/Papers/type-class-design-space/">Type
3268 classes: exploring the design space</ulink> (Simon Peyton Jones, Mark
3269 Jones, Erik Meijer).
3272 All the extensions are enabled by the <option>-fglasgow-exts</option> flag.
3276 <title>Multi-parameter type classes</title>
3278 Multi-parameter type classes are permitted, with flag <option>-XMultiParamTypeClasses</option>.
3283 class Collection c a where
3284 union :: c a -> c a -> c a
3291 <sect3 id="superclass-rules">
3292 <title>The superclasses of a class declaration</title>
3295 In Haskell 98 the context of a class declaration (which introduces superclasses)
3296 must be simple; that is, each predicate must consist of a class applied to
3297 type variables. The flag <option>-XFlexibleContexts</option>
3298 (<xref linkend="flexible-contexts"/>)
3299 lifts this restriction,
3300 so that the only restriction on the context in a class declaration is
3301 that the class hierarchy must be acyclic. So these class declarations are OK:
3305 class Functor (m k) => FiniteMap m k where
3308 class (Monad m, Monad (t m)) => Transform t m where
3309 lift :: m a -> (t m) a
3315 As in Haskell 98, The class hierarchy must be acyclic. However, the definition
3316 of "acyclic" involves only the superclass relationships. For example,
3322 op :: D b => a -> b -> b
3325 class C a => D a where { ... }
3329 Here, <literal>C</literal> is a superclass of <literal>D</literal>, but it's OK for a
3330 class operation <literal>op</literal> of <literal>C</literal> to mention <literal>D</literal>. (It
3331 would not be OK for <literal>D</literal> to be a superclass of <literal>C</literal>.)
3338 <sect3 id="class-method-types">
3339 <title>Class method types</title>
3342 Haskell 98 prohibits class method types to mention constraints on the
3343 class type variable, thus:
3346 fromList :: [a] -> s a
3347 elem :: Eq a => a -> s a -> Bool
3349 The type of <literal>elem</literal> is illegal in Haskell 98, because it
3350 contains the constraint <literal>Eq a</literal>, constrains only the
3351 class type variable (in this case <literal>a</literal>).
3352 GHC lifts this restriction (flag <option>-XConstrainedClassMethods</option>).
3359 <sect2 id="functional-dependencies">
3360 <title>Functional dependencies
3363 <para> Functional dependencies are implemented as described by Mark Jones
3364 in “<ulink url="http://citeseer.ist.psu.edu/jones00type.html">Type Classes with Functional Dependencies</ulink>”, Mark P. Jones,
3365 In Proceedings of the 9th European Symposium on Programming,
3366 ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782,
3370 Functional dependencies are introduced by a vertical bar in the syntax of a
3371 class declaration; e.g.
3373 class (Monad m) => MonadState s m | m -> s where ...
3375 class Foo a b c | a b -> c where ...
3377 There should be more documentation, but there isn't (yet). Yell if you need it.
3380 <sect3><title>Rules for functional dependencies </title>
3382 In a class declaration, all of the class type variables must be reachable (in the sense
3383 mentioned in <xref linkend="flexible-contexts"/>)
3384 from the free variables of each method type.
3388 class Coll s a where
3390 insert :: s -> a -> s
3393 is not OK, because the type of <literal>empty</literal> doesn't mention
3394 <literal>a</literal>. Functional dependencies can make the type variable
3397 class Coll s a | s -> a where
3399 insert :: s -> a -> s
3402 Alternatively <literal>Coll</literal> might be rewritten
3405 class Coll s a where
3407 insert :: s a -> a -> s a
3411 which makes the connection between the type of a collection of
3412 <literal>a</literal>'s (namely <literal>(s a)</literal>) and the element type <literal>a</literal>.
3413 Occasionally this really doesn't work, in which case you can split the
3421 class CollE s => Coll s a where
3422 insert :: s -> a -> s
3429 <title>Background on functional dependencies</title>
3431 <para>The following description of the motivation and use of functional dependencies is taken
3432 from the Hugs user manual, reproduced here (with minor changes) by kind
3433 permission of Mark Jones.
3436 Consider the following class, intended as part of a
3437 library for collection types:
3439 class Collects e ce where
3441 insert :: e -> ce -> ce
3442 member :: e -> ce -> Bool
3444 The type variable e used here represents the element type, while ce is the type
3445 of the container itself. Within this framework, we might want to define
3446 instances of this class for lists or characteristic functions (both of which
3447 can be used to represent collections of any equality type), bit sets (which can
3448 be used to represent collections of characters), or hash tables (which can be
3449 used to represent any collection whose elements have a hash function). Omitting
3450 standard implementation details, this would lead to the following declarations:
3452 instance Eq e => Collects e [e] where ...
3453 instance Eq e => Collects e (e -> Bool) where ...
3454 instance Collects Char BitSet where ...
3455 instance (Hashable e, Collects a ce)
3456 => Collects e (Array Int ce) where ...
3458 All this looks quite promising; we have a class and a range of interesting
3459 implementations. Unfortunately, there are some serious problems with the class
3460 declaration. First, the empty function has an ambiguous type:
3462 empty :: Collects e ce => ce
3464 By "ambiguous" we mean that there is a type variable e that appears on the left
3465 of the <literal>=></literal> symbol, but not on the right. The problem with
3466 this is that, according to the theoretical foundations of Haskell overloading,
3467 we cannot guarantee a well-defined semantics for any term with an ambiguous
3471 We can sidestep this specific problem by removing the empty member from the
3472 class declaration. However, although the remaining members, insert and member,
3473 do not have ambiguous types, we still run into problems when we try to use
3474 them. For example, consider the following two functions:
3476 f x y = insert x . insert y
3479 for which GHC infers the following types:
3481 f :: (Collects a c, Collects b c) => a -> b -> c -> c
3482 g :: (Collects Bool c, Collects Char c) => c -> c
3484 Notice that the type for f allows the two parameters x and y to be assigned
3485 different types, even though it attempts to insert each of the two values, one
3486 after the other, into the same collection. If we're trying to model collections
3487 that contain only one type of value, then this is clearly an inaccurate
3488 type. Worse still, the definition for g is accepted, without causing a type
3489 error. As a result, the error in this code will not be flagged at the point
3490 where it appears. Instead, it will show up only when we try to use g, which
3491 might even be in a different module.
3494 <sect4><title>An attempt to use constructor classes</title>
3497 Faced with the problems described above, some Haskell programmers might be
3498 tempted to use something like the following version of the class declaration:
3500 class Collects e c where
3502 insert :: e -> c e -> c e
3503 member :: e -> c e -> Bool
3505 The key difference here is that we abstract over the type constructor c that is
3506 used to form the collection type c e, and not over that collection type itself,
3507 represented by ce in the original class declaration. This avoids the immediate
3508 problems that we mentioned above: empty has type <literal>Collects e c => c
3509 e</literal>, which is not ambiguous.
3512 The function f from the previous section has a more accurate type:
3514 f :: (Collects e c) => e -> e -> c e -> c e
3516 The function g from the previous section is now rejected with a type error as
3517 we would hope because the type of f does not allow the two arguments to have
3519 This, then, is an example of a multiple parameter class that does actually work
3520 quite well in practice, without ambiguity problems.
3521 There is, however, a catch. This version of the Collects class is nowhere near
3522 as general as the original class seemed to be: only one of the four instances
3523 for <literal>Collects</literal>
3524 given above can be used with this version of Collects because only one of
3525 them---the instance for lists---has a collection type that can be written in
3526 the form c e, for some type constructor c, and element type e.
3530 <sect4><title>Adding functional dependencies</title>
3533 To get a more useful version of the Collects class, Hugs provides a mechanism
3534 that allows programmers to specify dependencies between the parameters of a
3535 multiple parameter class (For readers with an interest in theoretical
3536 foundations and previous work: The use of dependency information can be seen
3537 both as a generalization of the proposal for `parametric type classes' that was
3538 put forward by Chen, Hudak, and Odersky, or as a special case of Mark Jones's
3539 later framework for "improvement" of qualified types. The
3540 underlying ideas are also discussed in a more theoretical and abstract setting
3541 in a manuscript [implparam], where they are identified as one point in a
3542 general design space for systems of implicit parameterization.).
3544 To start with an abstract example, consider a declaration such as:
3546 class C a b where ...
3548 which tells us simply that C can be thought of as a binary relation on types
3549 (or type constructors, depending on the kinds of a and b). Extra clauses can be
3550 included in the definition of classes to add information about dependencies
3551 between parameters, as in the following examples:
3553 class D a b | a -> b where ...
3554 class E a b | a -> b, b -> a where ...
3556 The notation <literal>a -> b</literal> used here between the | and where
3557 symbols --- not to be
3558 confused with a function type --- indicates that the a parameter uniquely
3559 determines the b parameter, and might be read as "a determines b." Thus D is
3560 not just a relation, but actually a (partial) function. Similarly, from the two
3561 dependencies that are included in the definition of E, we can see that E
3562 represents a (partial) one-one mapping between types.
3565 More generally, dependencies take the form <literal>x1 ... xn -> y1 ... ym</literal>,
3566 where x1, ..., xn, and y1, ..., yn are type variables with n>0 and
3567 m>=0, meaning that the y parameters are uniquely determined by the x
3568 parameters. Spaces can be used as separators if more than one variable appears
3569 on any single side of a dependency, as in <literal>t -> a b</literal>. Note that a class may be
3570 annotated with multiple dependencies using commas as separators, as in the
3571 definition of E above. Some dependencies that we can write in this notation are
3572 redundant, and will be rejected because they don't serve any useful
3573 purpose, and may instead indicate an error in the program. Examples of
3574 dependencies like this include <literal>a -> a </literal>,
3575 <literal>a -> a a </literal>,
3576 <literal>a -> </literal>, etc. There can also be
3577 some redundancy if multiple dependencies are given, as in
3578 <literal>a->b</literal>,
3579 <literal>b->c </literal>, <literal>a->c </literal>, and
3580 in which some subset implies the remaining dependencies. Examples like this are
3581 not treated as errors. Note that dependencies appear only in class
3582 declarations, and not in any other part of the language. In particular, the
3583 syntax for instance declarations, class constraints, and types is completely
3587 By including dependencies in a class declaration, we provide a mechanism for
3588 the programmer to specify each multiple parameter class more precisely. The
3589 compiler, on the other hand, is responsible for ensuring that the set of
3590 instances that are in scope at any given point in the program is consistent
3591 with any declared dependencies. For example, the following pair of instance
3592 declarations cannot appear together in the same scope because they violate the
3593 dependency for D, even though either one on its own would be acceptable:
3595 instance D Bool Int where ...
3596 instance D Bool Char where ...
3598 Note also that the following declaration is not allowed, even by itself:
3600 instance D [a] b where ...
3602 The problem here is that this instance would allow one particular choice of [a]
3603 to be associated with more than one choice for b, which contradicts the
3604 dependency specified in the definition of D. More generally, this means that,
3605 in any instance of the form:
3607 instance D t s where ...
3609 for some particular types t and s, the only variables that can appear in s are
3610 the ones that appear in t, and hence, if the type t is known, then s will be
3611 uniquely determined.
3614 The benefit of including dependency information is that it allows us to define
3615 more general multiple parameter classes, without ambiguity problems, and with
3616 the benefit of more accurate types. To illustrate this, we return to the
3617 collection class example, and annotate the original definition of <literal>Collects</literal>
3618 with a simple dependency:
3620 class Collects e ce | ce -> e where
3622 insert :: e -> ce -> ce
3623 member :: e -> ce -> Bool
3625 The dependency <literal>ce -> e</literal> here specifies that the type e of elements is uniquely
3626 determined by the type of the collection ce. Note that both parameters of
3627 Collects are of kind *; there are no constructor classes here. Note too that
3628 all of the instances of Collects that we gave earlier can be used
3629 together with this new definition.
3632 What about the ambiguity problems that we encountered with the original
3633 definition? The empty function still has type Collects e ce => ce, but it is no
3634 longer necessary to regard that as an ambiguous type: Although the variable e
3635 does not appear on the right of the => symbol, the dependency for class
3636 Collects tells us that it is uniquely determined by ce, which does appear on
3637 the right of the => symbol. Hence the context in which empty is used can still
3638 give enough information to determine types for both ce and e, without
3639 ambiguity. More generally, we need only regard a type as ambiguous if it
3640 contains a variable on the left of the => that is not uniquely determined
3641 (either directly or indirectly) by the variables on the right.
3644 Dependencies also help to produce more accurate types for user defined
3645 functions, and hence to provide earlier detection of errors, and less cluttered
3646 types for programmers to work with. Recall the previous definition for a
3649 f x y = insert x y = insert x . insert y
3651 for which we originally obtained a type:
3653 f :: (Collects a c, Collects b c) => a -> b -> c -> c
3655 Given the dependency information that we have for Collects, however, we can
3656 deduce that a and b must be equal because they both appear as the second
3657 parameter in a Collects constraint with the same first parameter c. Hence we
3658 can infer a shorter and more accurate type for f:
3660 f :: (Collects a c) => a -> a -> c -> c
3662 In a similar way, the earlier definition of g will now be flagged as a type error.
3665 Although we have given only a few examples here, it should be clear that the
3666 addition of dependency information can help to make multiple parameter classes
3667 more useful in practice, avoiding ambiguity problems, and allowing more general
3668 sets of instance declarations.
3674 <sect2 id="instance-decls">
3675 <title>Instance declarations</title>
3677 <para>An instance declaration has the form
3679 instance ( <replaceable>assertion</replaceable><subscript>1</subscript>, ..., <replaceable>assertion</replaceable><subscript>n</subscript>) => <replaceable>class</replaceable> <replaceable>type</replaceable><subscript>1</subscript> ... <replaceable>type</replaceable><subscript>m</subscript> where ...
3681 The part before the "<literal>=></literal>" is the
3682 <emphasis>context</emphasis>, while the part after the
3683 "<literal>=></literal>" is the <emphasis>head</emphasis> of the instance declaration.
3686 <sect3 id="flexible-instance-head">
3687 <title>Relaxed rules for the instance head</title>
3690 In Haskell 98 the head of an instance declaration
3691 must be of the form <literal>C (T a1 ... an)</literal>, where
3692 <literal>C</literal> is the class, <literal>T</literal> is a data type constructor,
3693 and the <literal>a1 ... an</literal> are distinct type variables.
3694 GHC relaxes these rules in two ways.
3698 The <option>-XFlexibleInstances</option> flag allows the head of the instance
3699 declaration to mention arbitrary nested types.
3700 For example, this becomes a legal instance declaration
3702 instance C (Maybe Int) where ...
3704 See also the <link linkend="instance-overlap">rules on overlap</link>.
3707 With the <option>-XTypeSynonymInstances</option> flag, instance heads may use type
3708 synonyms. As always, using a type synonym is just shorthand for
3709 writing the RHS of the type synonym definition. For example:
3713 type Point = (Int,Int)
3714 instance C Point where ...
3715 instance C [Point] where ...
3719 is legal. However, if you added
3723 instance C (Int,Int) where ...
3727 as well, then the compiler will complain about the overlapping
3728 (actually, identical) instance declarations. As always, type synonyms
3729 must be fully applied. You cannot, for example, write:
3733 instance Monad P where ...
3741 <sect3 id="instance-rules">
3742 <title>Relaxed rules for instance contexts</title>
3744 <para>In Haskell 98, the assertions in the context of the instance declaration
3745 must be of the form <literal>C a</literal> where <literal>a</literal>
3746 is a type variable that occurs in the head.
3750 The <option>-XFlexibleContexts</option> flag relaxes this rule, as well
3751 as the corresponding rule for type signatures (see <xref linkend="flexible-contexts"/>).
3752 With this flag the context of the instance declaration can each consist of arbitrary
3753 (well-kinded) assertions <literal>(C t1 ... tn)</literal> subject only to the
3757 The Paterson Conditions: for each assertion in the context
3759 <listitem><para>No type variable has more occurrences in the assertion than in the head</para></listitem>
3760 <listitem><para>The assertion has fewer constructors and variables (taken together
3761 and counting repetitions) than the head</para></listitem>
3765 <listitem><para>The Coverage Condition. For each functional dependency,
3766 <replaceable>tvs</replaceable><subscript>left</subscript> <literal>-></literal>
3767 <replaceable>tvs</replaceable><subscript>right</subscript>, of the class,
3768 every type variable in
3769 S(<replaceable>tvs</replaceable><subscript>right</subscript>) must appear in
3770 S(<replaceable>tvs</replaceable><subscript>left</subscript>), where S is the
3771 substitution mapping each type variable in the class declaration to the
3772 corresponding type in the instance declaration.
3775 These restrictions ensure that context reduction terminates: each reduction
3776 step makes the problem smaller by at least one
3777 constructor. Both the Paterson Conditions and the Coverage Condition are lifted
3778 if you give the <option>-XUndecidableInstances</option>
3779 flag (<xref linkend="undecidable-instances"/>).
3780 You can find lots of background material about the reason for these
3781 restrictions in the paper <ulink
3782 url="http://research.microsoft.com/%7Esimonpj/papers/fd%2Dchr/">
3783 Understanding functional dependencies via Constraint Handling Rules</ulink>.
3786 For example, these are OK:
3788 instance C Int [a] -- Multiple parameters
3789 instance Eq (S [a]) -- Structured type in head
3791 -- Repeated type variable in head
3792 instance C4 a a => C4 [a] [a]
3793 instance Stateful (ST s) (MutVar s)
3795 -- Head can consist of type variables only
3797 instance (Eq a, Show b) => C2 a b
3799 -- Non-type variables in context
3800 instance Show (s a) => Show (Sized s a)
3801 instance C2 Int a => C3 Bool [a]
3802 instance C2 Int a => C3 [a] b
3806 -- Context assertion no smaller than head
3807 instance C a => C a where ...
3808 -- (C b b) has more more occurrences of b than the head
3809 instance C b b => Foo [b] where ...
3814 The same restrictions apply to instances generated by
3815 <literal>deriving</literal> clauses. Thus the following is accepted:
3817 data MinHeap h a = H a (h a)
3820 because the derived instance
3822 instance (Show a, Show (h a)) => Show (MinHeap h a)
3824 conforms to the above rules.
3828 A useful idiom permitted by the above rules is as follows.
3829 If one allows overlapping instance declarations then it's quite
3830 convenient to have a "default instance" declaration that applies if
3831 something more specific does not:
3839 <sect3 id="undecidable-instances">
3840 <title>Undecidable instances</title>
3843 Sometimes even the rules of <xref linkend="instance-rules"/> are too onerous.
3844 For example, sometimes you might want to use the following to get the
3845 effect of a "class synonym":
3847 class (C1 a, C2 a, C3 a) => C a where { }
3849 instance (C1 a, C2 a, C3 a) => C a where { }
3851 This allows you to write shorter signatures:
3857 f :: (C1 a, C2 a, C3 a) => ...
3859 The restrictions on functional dependencies (<xref
3860 linkend="functional-dependencies"/>) are particularly troublesome.
3861 It is tempting to introduce type variables in the context that do not appear in
3862 the head, something that is excluded by the normal rules. For example:
3864 class HasConverter a b | a -> b where
3867 data Foo a = MkFoo a
3869 instance (HasConverter a b,Show b) => Show (Foo a) where
3870 show (MkFoo value) = show (convert value)
3872 This is dangerous territory, however. Here, for example, is a program that would make the
3877 instance F [a] [[a]]
3878 instance (D c, F a c) => D [a] -- 'c' is not mentioned in the head
3880 Similarly, it can be tempting to lift the coverage condition:
3882 class Mul a b c | a b -> c where
3883 (.*.) :: a -> b -> c
3885 instance Mul Int Int Int where (.*.) = (*)
3886 instance Mul Int Float Float where x .*. y = fromIntegral x * y
3887 instance Mul a b c => Mul a [b] [c] where x .*. v = map (x.*.) v
3889 The third instance declaration does not obey the coverage condition;
3890 and indeed the (somewhat strange) definition:
3892 f = \ b x y -> if b then x .*. [y] else y
3894 makes instance inference go into a loop, because it requires the constraint
3895 <literal>(Mul a [b] b)</literal>.
3898 Nevertheless, GHC allows you to experiment with more liberal rules. If you use
3899 the experimental flag <option>-XUndecidableInstances</option>
3900 <indexterm><primary>-XUndecidableInstances</primary></indexterm>,
3901 both the Paterson Conditions and the Coverage Condition
3902 (described in <xref linkend="instance-rules"/>) are lifted. Termination is ensured by having a
3903 fixed-depth recursion stack. If you exceed the stack depth you get a
3904 sort of backtrace, and the opportunity to increase the stack depth
3905 with <option>-fcontext-stack=</option><emphasis>N</emphasis>.
3911 <sect3 id="instance-overlap">
3912 <title>Overlapping instances</title>
3914 In general, <emphasis>GHC requires that that it be unambiguous which instance
3916 should be used to resolve a type-class constraint</emphasis>. This behaviour
3917 can be modified by two flags: <option>-XOverlappingInstances</option>
3918 <indexterm><primary>-XOverlappingInstances
3919 </primary></indexterm>
3920 and <option>-XIncoherentInstances</option>
3921 <indexterm><primary>-XIncoherentInstances
3922 </primary></indexterm>, as this section discusses. Both these
3923 flags are dynamic flags, and can be set on a per-module basis, using
3924 an <literal>OPTIONS_GHC</literal> pragma if desired (<xref linkend="source-file-options"/>).</para>
3926 When GHC tries to resolve, say, the constraint <literal>C Int Bool</literal>,
3927 it tries to match every instance declaration against the
3929 by instantiating the head of the instance declaration. For example, consider
3932 instance context1 => C Int a where ... -- (A)
3933 instance context2 => C a Bool where ... -- (B)
3934 instance context3 => C Int [a] where ... -- (C)
3935 instance context4 => C Int [Int] where ... -- (D)
3937 The instances (A) and (B) match the constraint <literal>C Int Bool</literal>,
3938 but (C) and (D) do not. When matching, GHC takes
3939 no account of the context of the instance declaration
3940 (<literal>context1</literal> etc).
3941 GHC's default behaviour is that <emphasis>exactly one instance must match the
3942 constraint it is trying to resolve</emphasis>.
3943 It is fine for there to be a <emphasis>potential</emphasis> of overlap (by
3944 including both declarations (A) and (B), say); an error is only reported if a
3945 particular constraint matches more than one.
3949 The <option>-XOverlappingInstances</option> flag instructs GHC to allow
3950 more than one instance to match, provided there is a most specific one. For
3951 example, the constraint <literal>C Int [Int]</literal> matches instances (A),
3952 (C) and (D), but the last is more specific, and hence is chosen. If there is no
3953 most-specific match, the program is rejected.
3956 However, GHC is conservative about committing to an overlapping instance. For example:
3961 Suppose that from the RHS of <literal>f</literal> we get the constraint
3962 <literal>C Int [b]</literal>. But
3963 GHC does not commit to instance (C), because in a particular
3964 call of <literal>f</literal>, <literal>b</literal> might be instantiate
3965 to <literal>Int</literal>, in which case instance (D) would be more specific still.
3966 So GHC rejects the program.
3967 (If you add the flag <option>-XIncoherentInstances</option>,
3968 GHC will instead pick (C), without complaining about
3969 the problem of subsequent instantiations.)
3972 Notice that we gave a type signature to <literal>f</literal>, so GHC had to
3973 <emphasis>check</emphasis> that <literal>f</literal> has the specified type.
3974 Suppose instead we do not give a type signature, asking GHC to <emphasis>infer</emphasis>
3975 it instead. In this case, GHC will refrain from
3976 simplifying the constraint <literal>C Int [b]</literal> (for the same reason
3977 as before) but, rather than rejecting the program, it will infer the type
3979 f :: C Int [b] => [b] -> [b]
3981 That postpones the question of which instance to pick to the
3982 call site for <literal>f</literal>
3983 by which time more is known about the type <literal>b</literal>.
3984 You can write this type signature yourself if you use the
3985 <link linkend="flexible-contexts"><option>-XFlexibleContexts</option></link>
3989 Exactly the same situation can arise in instance declarations themselves. Suppose we have
3993 instance Foo [b] where
3996 and, as before, the constraint <literal>C Int [b]</literal> arises from <literal>f</literal>'s
3997 right hand side. GHC will reject the instance, complaining as before that it does not know how to resolve
3998 the constraint <literal>C Int [b]</literal>, because it matches more than one instance
3999 declaration. The solution is to postpone the choice by adding the constraint to the context
4000 of the instance declaration, thus:
4002 instance C Int [b] => Foo [b] where
4005 (You need <link linkend="instance-rules"><option>-XFlexibleInstances</option></link> to do this.)
4008 The willingness to be overlapped or incoherent is a property of
4009 the <emphasis>instance declaration</emphasis> itself, controlled by the
4010 presence or otherwise of the <option>-XOverlappingInstances</option>
4011 and <option>-XIncoherentInstances</option> flags when that module is
4012 being defined. Neither flag is required in a module that imports and uses the
4013 instance declaration. Specifically, during the lookup process:
4016 An instance declaration is ignored during the lookup process if (a) a more specific
4017 match is found, and (b) the instance declaration was compiled with
4018 <option>-XOverlappingInstances</option>. The flag setting for the
4019 more-specific instance does not matter.
4022 Suppose an instance declaration does not match the constraint being looked up, but
4023 does unify with it, so that it might match when the constraint is further
4024 instantiated. Usually GHC will regard this as a reason for not committing to
4025 some other constraint. But if the instance declaration was compiled with
4026 <option>-XIncoherentInstances</option>, GHC will skip the "does-it-unify?"
4027 check for that declaration.
4030 These rules make it possible for a library author to design a library that relies on
4031 overlapping instances without the library client having to know.
4034 If an instance declaration is compiled without
4035 <option>-XOverlappingInstances</option>,
4036 then that instance can never be overlapped. This could perhaps be
4037 inconvenient. Perhaps the rule should instead say that the
4038 <emphasis>overlapping</emphasis> instance declaration should be compiled in
4039 this way, rather than the <emphasis>overlapped</emphasis> one. Perhaps overlap
4040 at a usage site should be permitted regardless of how the instance declarations
4041 are compiled, if the <option>-XOverlappingInstances</option> flag is
4042 used at the usage site. (Mind you, the exact usage site can occasionally be
4043 hard to pin down.) We are interested to receive feedback on these points.
4045 <para>The <option>-XIncoherentInstances</option> flag implies the
4046 <option>-XOverlappingInstances</option> flag, but not vice versa.
4054 <sect2 id="overloaded-strings">
4055 <title>Overloaded string literals
4059 GHC supports <emphasis>overloaded string literals</emphasis>. Normally a
4060 string literal has type <literal>String</literal>, but with overloaded string
4061 literals enabled (with <literal>-XOverloadedStrings</literal>)
4062 a string literal has type <literal>(IsString a) => a</literal>.
4065 This means that the usual string syntax can be used, e.g., for packed strings
4066 and other variations of string like types. String literals behave very much
4067 like integer literals, i.e., they can be used in both expressions and patterns.
4068 If used in a pattern the literal with be replaced by an equality test, in the same
4069 way as an integer literal is.
4072 The class <literal>IsString</literal> is defined as:
4074 class IsString a where
4075 fromString :: String -> a
4077 The only predefined instance is the obvious one to make strings work as usual:
4079 instance IsString [Char] where
4082 The class <literal>IsString</literal> is not in scope by default. If you want to mention
4083 it explicitly (for example, to give an instance declaration for it), you can import it
4084 from module <literal>GHC.Exts</literal>.
4087 Haskell's defaulting mechanism is extended to cover string literals, when <option>-XOverloadedStrings</option> is specified.
4091 Each type in a default declaration must be an
4092 instance of <literal>Num</literal> <emphasis>or</emphasis> of <literal>IsString</literal>.
4096 The standard defaulting rule (<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.3.4">Haskell Report, Section 4.3.4</ulink>)
4097 is extended thus: defaulting applies when all the unresolved constraints involve standard classes
4098 <emphasis>or</emphasis> <literal>IsString</literal>; and at least one is a numeric class
4099 <emphasis>or</emphasis> <literal>IsString</literal>.
4108 import GHC.Exts( IsString(..) )
4110 newtype MyString = MyString String deriving (Eq, Show)
4111 instance IsString MyString where
4112 fromString = MyString
4114 greet :: MyString -> MyString
4115 greet "hello" = "world"
4119 print $ greet "hello"
4120 print $ greet "fool"
4124 Note that deriving <literal>Eq</literal> is necessary for the pattern matching
4125 to work since it gets translated into an equality comparison.
4131 <sect1 id="type-families">
4132 <title>Type families</title>
4135 <firstterm>Indexed type families</firstterm> are a new GHC extension to
4136 facilitate type-level
4137 programming. Type families are a generalisation of <firstterm>associated
4138 data types</firstterm>
4139 (“<ulink url="http://www.cse.unsw.edu.au/~chak/papers/CKPM05.html">Associated
4140 Types with Class</ulink>”, M. Chakravarty, G. Keller, S. Peyton Jones,
4141 and S. Marlow. In Proceedings of “The 32nd Annual ACM SIGPLAN-SIGACT
4142 Symposium on Principles of Programming Languages (POPL'05)”, pages
4143 1-13, ACM Press, 2005) and <firstterm>associated type synonyms</firstterm>
4144 (“<ulink url="http://www.cse.unsw.edu.au/~chak/papers/CKP05.html">Type
4145 Associated Type Synonyms</ulink>”. M. Chakravarty, G. Keller, and
4147 In Proceedings of “The Tenth ACM SIGPLAN International Conference on
4148 Functional Programming”, ACM Press, pages 241-253, 2005). Type families
4149 themselves are described in the paper “<ulink
4150 url="http://www.cse.unsw.edu.au/~chak/papers/SPCS08.html">Type
4151 Checking with Open Type Functions</ulink>”, T. Schrijvers,
4153 M. Chakravarty, and M. Sulzmann, in Proceedings of “ICFP 2008: The
4154 13th ACM SIGPLAN International Conference on Functional
4155 Programming”, ACM Press, pages 51-62, 2008. Type families
4156 essentially provide type-indexed data types and named functions on types,
4157 which are useful for generic programming and highly parameterised library
4158 interfaces as well as interfaces with enhanced static information, much like
4159 dependent types. They might also be regarded as an alternative to functional
4160 dependencies, but provide a more functional style of type-level programming
4161 than the relational style of functional dependencies.
4164 Indexed type families, or type families for short, are type constructors that
4165 represent sets of types. Set members are denoted by supplying the type family
4166 constructor with type parameters, which are called <firstterm>type
4167 indices</firstterm>. The
4168 difference between vanilla parametrised type constructors and family
4169 constructors is much like between parametrically polymorphic functions and
4170 (ad-hoc polymorphic) methods of type classes. Parametric polymorphic functions
4171 behave the same at all type instances, whereas class methods can change their
4172 behaviour in dependence on the class type parameters. Similarly, vanilla type
4173 constructors imply the same data representation for all type instances, but
4174 family constructors can have varying representation types for varying type
4178 Indexed type families come in two flavours: <firstterm>data
4179 families</firstterm> and <firstterm>type synonym
4180 families</firstterm>. They are the indexed family variants of algebraic
4181 data types and type synonyms, respectively. The instances of data families
4182 can be data types and newtypes.
4185 Type families are enabled by the flag <option>-XTypeFamilies</option>.
4186 Additional information on the use of type families in GHC is available on
4187 <ulink url="http://www.haskell.org/haskellwiki/GHC/Indexed_types">the
4188 Haskell wiki page on type families</ulink>.
4191 <sect2 id="data-families">
4192 <title>Data families</title>
4195 Data families appear in two flavours: (1) they can be defined on the
4197 or (2) they can appear inside type classes (in which case they are known as
4198 associated types). The former is the more general variant, as it lacks the
4199 requirement for the type-indexes to coincide with the class
4200 parameters. However, the latter can lead to more clearly structured code and
4201 compiler warnings if some type instances were - possibly accidentally -
4202 omitted. In the following, we always discuss the general toplevel form first
4203 and then cover the additional constraints placed on associated types.
4206 <sect3 id="data-family-declarations">
4207 <title>Data family declarations</title>
4210 Indexed data families are introduced by a signature, such as
4212 data family GMap k :: * -> *
4214 The special <literal>family</literal> distinguishes family from standard
4215 data declarations. The result kind annotation is optional and, as
4216 usual, defaults to <literal>*</literal> if omitted. An example is
4220 Named arguments can also be given explicit kind signatures if needed.
4222 [http://www.haskell.org/ghc/docs/latest/html/users_guide/gadt.html GADT
4223 declarations] named arguments are entirely optional, so that we can
4224 declare <literal>Array</literal> alternatively with
4226 data family Array :: * -> *
4230 <sect4 id="assoc-data-family-decl">
4231 <title>Associated data family declarations</title>
4233 When a data family is declared as part of a type class, we drop
4234 the <literal>family</literal> special. The <literal>GMap</literal>
4235 declaration takes the following form
4237 class GMapKey k where
4238 data GMap k :: * -> *
4241 In contrast to toplevel declarations, named arguments must be used for
4242 all type parameters that are to be used as type-indexes. Moreover,
4243 the argument names must be class parameters. Each class parameter may
4244 only be used at most once per associated type, but some may be omitted
4245 and they may be in an order other than in the class head. Hence, the
4246 following contrived example is admissible:
4255 <sect3 id="data-instance-declarations">
4256 <title>Data instance declarations</title>
4259 Instance declarations of data and newtype families are very similar to
4260 standard data and newtype declarations. The only two differences are
4261 that the keyword <literal>data</literal> or <literal>newtype</literal>
4262 is followed by <literal>instance</literal> and that some or all of the
4263 type arguments can be non-variable types, but may not contain forall
4264 types or type synonym families. However, data families are generally
4265 allowed in type parameters, and type synonyms are allowed as long as
4266 they are fully applied and expand to a type that is itself admissible -
4267 exactly as this is required for occurrences of type synonyms in class
4268 instance parameters. For example, the <literal>Either</literal>
4269 instance for <literal>GMap</literal> is
4271 data instance GMap (Either a b) v = GMapEither (GMap a v) (GMap b v)
4273 In this example, the declaration has only one variant. In general, it
4277 Data and newtype instance declarations are only permitted when an
4278 appropriate family declaration is in scope - just as a class instance declaratoin
4279 requires the class declaration to be visible. Moreover, each instance
4280 declaration has to conform to the kind determined by its family
4281 declaration. This implies that the number of parameters of an instance
4282 declaration matches the arity determined by the kind of the family.
4285 A data family instance declaration can use the full exprssiveness of
4286 ordinary <literal>data</literal> or <literal>newtype</literal> declarations:
4288 <listitem><para> Although, a data family is <emphasis>introduced</emphasis> with
4289 the keyword "<literal>data</literal>", a data family <emphasis>instance</emphasis> can
4290 use either <literal>data</literal> or <literal>newtype</literal>. For example:
4293 data instance T Int = T1 Int | T2 Bool
4294 newtype instance T Char = TC Bool
4297 <listitem><para> A <literal>data instance</literal> can use GADT syntax for the data constructors,
4298 and indeed can define a GADT. For example:
4301 data instance G [a] b where
4302 G1 :: c -> G [Int] b
4306 <listitem><para> You can use a <literal>deriving</literal> clause on a
4307 <literal>data instance</literal> or <literal>newtype instance</literal>
4314 Even if type families are defined as toplevel declarations, functions
4315 that perform different computations for different family instances may still
4316 need to be defined as methods of type classes. In particular, the
4317 following is not possible:
4320 data instance T Int = A
4321 data instance T Char = B
4323 foo A = 1 -- WRONG: These two equations together...
4324 foo B = 2 -- ...will produce a type error.
4326 Instead, you would have to write <literal>foo</literal> as a class operation, thus:
4330 instance Foo Int where
4332 instance Foo Char where
4335 (Given the functionality provided by GADTs (Generalised Algebraic Data
4336 Types), it might seem as if a definition, such as the above, should be
4337 feasible. However, type families are - in contrast to GADTs - are
4338 <emphasis>open;</emphasis> i.e., new instances can always be added,
4340 modules. Supporting pattern matching across different data instances
4341 would require a form of extensible case construct.)
4344 <sect4 id="assoc-data-inst">
4345 <title>Associated data instances</title>
4347 When an associated data family instance is declared within a type
4348 class instance, we drop the <literal>instance</literal> keyword in the
4349 family instance. So, the <literal>Either</literal> instance
4350 for <literal>GMap</literal> becomes:
4352 instance (GMapKey a, GMapKey b) => GMapKey (Either a b) where
4353 data GMap (Either a b) v = GMapEither (GMap a v) (GMap b v)
4356 The most important point about associated family instances is that the
4357 type indexes corresponding to class parameters must be identical to
4358 the type given in the instance head; here this is the first argument
4359 of <literal>GMap</literal>, namely <literal>Either a b</literal>,
4360 which coincides with the only class parameter. Any parameters to the
4361 family constructor that do not correspond to class parameters, need to
4362 be variables in every instance; here this is the
4363 variable <literal>v</literal>.
4366 Instances for an associated family can only appear as part of
4367 instances declarations of the class in which the family was declared -
4368 just as with the equations of the methods of a class. Also in
4369 correspondence to how methods are handled, declarations of associated
4370 types can be omitted in class instances. If an associated family
4371 instance is omitted, the corresponding instance type is not inhabited;
4372 i.e., only diverging expressions, such
4373 as <literal>undefined</literal>, can assume the type.
4377 <sect4 id="scoping-class-params">
4378 <title>Scoping of class parameters</title>
4380 In the case of multi-parameter type classes, the visibility of class
4381 parameters in the right-hand side of associated family instances
4382 depends <emphasis>solely</emphasis> on the parameters of the data
4383 family. As an example, consider the simple class declaration
4388 Only one of the two class parameters is a parameter to the data
4389 family. Hence, the following instance declaration is invalid:
4391 instance C [c] d where
4392 data T [c] = MkT (c, d) -- WRONG!! 'd' is not in scope
4394 Here, the right-hand side of the data instance mentions the type
4395 variable <literal>d</literal> that does not occur in its left-hand
4396 side. We cannot admit such data instances as they would compromise
4401 <sect4 id="family-class-inst">
4402 <title>Type class instances of family instances</title>
4404 Type class instances of instances of data families can be defined as
4405 usual, and in particular data instance declarations can
4406 have <literal>deriving</literal> clauses. For example, we can write
4408 data GMap () v = GMapUnit (Maybe v)
4411 which implicitly defines an instance of the form
4413 instance Show v => Show (GMap () v) where ...
4417 Note that class instances are always for
4418 particular <emphasis>instances</emphasis> of a data family and never
4419 for an entire family as a whole. This is for essentially the same
4420 reasons that we cannot define a toplevel function that performs
4421 pattern matching on the data constructors
4422 of <emphasis>different</emphasis> instances of a single type family.
4423 It would require a form of extensible case construct.
4427 <sect4 id="data-family-overlap">
4428 <title>Overlap of data instances</title>
4430 The instance declarations of a data family used in a single program
4431 may not overlap at all, independent of whether they are associated or
4432 not. In contrast to type class instances, this is not only a matter
4433 of consistency, but one of type safety.
4439 <sect3 id="data-family-import-export">
4440 <title>Import and export</title>
4443 The association of data constructors with type families is more dynamic
4444 than that is the case with standard data and newtype declarations. In
4445 the standard case, the notation <literal>T(..)</literal> in an import or
4446 export list denotes the type constructor and all the data constructors
4447 introduced in its declaration. However, a family declaration never
4448 introduces any data constructors; instead, data constructors are
4449 introduced by family instances. As a result, which data constructors
4450 are associated with a type family depends on the currently visible
4451 instance declarations for that family. Consequently, an import or
4452 export item of the form <literal>T(..)</literal> denotes the family
4453 constructor and all currently visible data constructors - in the case of
4454 an export item, these may be either imported or defined in the current
4455 module. The treatment of import and export items that explicitly list
4456 data constructors, such as <literal>GMap(GMapEither)</literal>, is
4460 <sect4 id="data-family-impexp-assoc">
4461 <title>Associated families</title>
4463 As expected, an import or export item of the
4464 form <literal>C(..)</literal> denotes all of the class' methods and
4465 associated types. However, when associated types are explicitly
4466 listed as subitems of a class, we need some new syntax, as uppercase
4467 identifiers as subitems are usually data constructors, not type
4468 constructors. To clarify that we denote types here, each associated
4469 type name needs to be prefixed by the keyword <literal>type</literal>.
4470 So for example, when explicitly listing the components of
4471 the <literal>GMapKey</literal> class, we write <literal>GMapKey(type
4472 GMap, empty, lookup, insert)</literal>.
4476 <sect4 id="data-family-impexp-examples">
4477 <title>Examples</title>
4479 Assuming our running <literal>GMapKey</literal> class example, let us
4480 look at some export lists and their meaning:
4483 <para><literal>module GMap (GMapKey) where...</literal>: Exports
4484 just the class name.</para>
4487 <para><literal>module GMap (GMapKey(..)) where...</literal>:
4488 Exports the class, the associated type <literal>GMap</literal>
4490 functions <literal>empty</literal>, <literal>lookup</literal>,
4491 and <literal>insert</literal>. None of the data constructors is
4495 <para><literal>module GMap (GMapKey(..), GMap(..))
4496 where...</literal>: As before, but also exports all the data
4497 constructors <literal>GMapInt</literal>,
4498 <literal>GMapChar</literal>,
4499 <literal>GMapUnit</literal>, <literal>GMapPair</literal>,
4500 and <literal>GMapUnit</literal>.</para>
4503 <para><literal>module GMap (GMapKey(empty, lookup, insert),
4504 GMap(..)) where...</literal>: As before.</para>
4507 <para><literal>module GMap (GMapKey, empty, lookup, insert, GMap(..))
4508 where...</literal>: As before.</para>
4513 Finally, you can write <literal>GMapKey(type GMap)</literal> to denote
4514 both the class <literal>GMapKey</literal> as well as its associated
4515 type <literal>GMap</literal>. However, you cannot
4516 write <literal>GMapKey(type GMap(..))</literal> — i.e.,
4517 sub-component specifications cannot be nested. To
4518 specify <literal>GMap</literal>'s data constructors, you have to list
4523 <sect4 id="data-family-impexp-instances">
4524 <title>Instances</title>
4526 Family instances are implicitly exported, just like class instances.
4527 However, this applies only to the heads of instances, not to the data
4528 constructors an instance defines.
4536 <sect2 id="synonym-families">
4537 <title>Synonym families</title>
4540 Type families appear in two flavours: (1) they can be defined on the
4541 toplevel or (2) they can appear inside type classes (in which case they
4542 are known as associated type synonyms). The former is the more general
4543 variant, as it lacks the requirement for the type-indexes to coincide with
4544 the class parameters. However, the latter can lead to more clearly
4545 structured code and compiler warnings if some type instances were -
4546 possibly accidentally - omitted. In the following, we always discuss the
4547 general toplevel form first and then cover the additional constraints
4548 placed on associated types.
4551 <sect3 id="type-family-declarations">
4552 <title>Type family declarations</title>
4555 Indexed type families are introduced by a signature, such as
4557 type family Elem c :: *
4559 The special <literal>family</literal> distinguishes family from standard
4560 type declarations. The result kind annotation is optional and, as
4561 usual, defaults to <literal>*</literal> if omitted. An example is
4565 Parameters can also be given explicit kind signatures if needed. We
4566 call the number of parameters in a type family declaration, the family's
4567 arity, and all applications of a type family must be fully saturated
4568 w.r.t. to that arity. This requirement is unlike ordinary type synonyms
4569 and it implies that the kind of a type family is not sufficient to
4570 determine a family's arity, and hence in general, also insufficient to
4571 determine whether a type family application is well formed. As an
4572 example, consider the following declaration:
4574 type family F a b :: * -> * -- F's arity is 2,
4575 -- although its overall kind is * -> * -> * -> *
4577 Given this declaration the following are examples of well-formed and
4580 F Char [Int] -- OK! Kind: * -> *
4581 F Char [Int] Bool -- OK! Kind: *
4582 F IO Bool -- WRONG: kind mismatch in the first argument
4583 F Bool -- WRONG: unsaturated application
4587 <sect4 id="assoc-type-family-decl">
4588 <title>Associated type family declarations</title>
4590 When a type family is declared as part of a type class, we drop
4591 the <literal>family</literal> special. The <literal>Elem</literal>
4592 declaration takes the following form
4594 class Collects ce where
4598 The argument names of the type family must be class parameters. Each
4599 class parameter may only be used at most once per associated type, but
4600 some may be omitted and they may be in an order other than in the
4601 class head. Hence, the following contrived example is admissible:
4606 These rules are exactly as for associated data families.
4611 <sect3 id="type-instance-declarations">
4612 <title>Type instance declarations</title>
4614 Instance declarations of type families are very similar to standard type
4615 synonym declarations. The only two differences are that the
4616 keyword <literal>type</literal> is followed
4617 by <literal>instance</literal> and that some or all of the type
4618 arguments can be non-variable types, but may not contain forall types or
4619 type synonym families. However, data families are generally allowed, and
4620 type synonyms are allowed as long as they are fully applied and expand
4621 to a type that is admissible - these are the exact same requirements as
4622 for data instances. For example, the <literal>[e]</literal> instance
4623 for <literal>Elem</literal> is
4625 type instance Elem [e] = e
4629 Type family instance declarations are only legitimate when an
4630 appropriate family declaration is in scope - just like class instances
4631 require the class declaration to be visible. Moreover, each instance
4632 declaration has to conform to the kind determined by its family
4633 declaration, and the number of type parameters in an instance
4634 declaration must match the number of type parameters in the family
4635 declaration. Finally, the right-hand side of a type instance must be a
4636 monotype (i.e., it may not include foralls) and after the expansion of
4637 all saturated vanilla type synonyms, no synonyms, except family synonyms
4638 may remain. Here are some examples of admissible and illegal type
4641 type family F a :: *
4642 type instance F [Int] = Int -- OK!
4643 type instance F String = Char -- OK!
4644 type instance F (F a) = a -- WRONG: type parameter mentions a type family
4645 type instance F (forall a. (a, b)) = b -- WRONG: a forall type appears in a type parameter
4646 type instance F Float = forall a.a -- WRONG: right-hand side may not be a forall type
4648 type family G a b :: * -> *
4649 type instance G Int = (,) -- WRONG: must be two type parameters
4650 type instance G Int Char Float = Double -- WRONG: must be two type parameters
4654 <sect4 id="assoc-type-instance">
4655 <title>Associated type instance declarations</title>
4657 When an associated family instance is declared within a type class
4658 instance, we drop the <literal>instance</literal> keyword in the family
4659 instance. So, the <literal>[e]</literal> instance
4660 for <literal>Elem</literal> becomes:
4662 instance (Eq (Elem [e])) => Collects ([e]) where
4666 The most important point about associated family instances is that the
4667 type indexes corresponding to class parameters must be identical to the
4668 type given in the instance head; here this is <literal>[e]</literal>,
4669 which coincides with the only class parameter.
4672 Instances for an associated family can only appear as part of instances
4673 declarations of the class in which the family was declared - just as
4674 with the equations of the methods of a class. Also in correspondence to
4675 how methods are handled, declarations of associated types can be omitted
4676 in class instances. If an associated family instance is omitted, the
4677 corresponding instance type is not inhabited; i.e., only diverging
4678 expressions, such as <literal>undefined</literal>, can assume the type.
4682 <sect4 id="type-family-overlap">
4683 <title>Overlap of type synonym instances</title>
4685 The instance declarations of a type family used in a single program
4686 may only overlap if the right-hand sides of the overlapping instances
4687 coincide for the overlapping types. More formally, two instance
4688 declarations overlap if there is a substitution that makes the
4689 left-hand sides of the instances syntactically the same. Whenever
4690 that is the case, the right-hand sides of the instances must also be
4691 syntactically equal under the same substitution. This condition is
4692 independent of whether the type family is associated or not, and it is
4693 not only a matter of consistency, but one of type safety.
4696 Here are two example to illustrate the condition under which overlap
4699 type instance F (a, Int) = [a]
4700 type instance F (Int, b) = [b] -- overlap permitted
4702 type instance G (a, Int) = [a]
4703 type instance G (Char, a) = [a] -- ILLEGAL overlap, as [Char] /= [Int]
4708 <sect4 id="type-family-decidability">
4709 <title>Decidability of type synonym instances</title>
4711 In order to guarantee that type inference in the presence of type
4712 families decidable, we need to place a number of additional
4713 restrictions on the formation of type instance declarations (c.f.,
4714 Definition 5 (Relaxed Conditions) of “<ulink
4715 url="http://www.cse.unsw.edu.au/~chak/papers/SPCS08.html">Type
4716 Checking with Open Type Functions</ulink>”). Instance
4717 declarations have the general form
4719 type instance F t1 .. tn = t
4721 where we require that for every type family application <literal>(G s1
4722 .. sm)</literal> in <literal>t</literal>,
4725 <para><literal>s1 .. sm</literal> do not contain any type family
4726 constructors,</para>
4729 <para>the total number of symbols (data type constructors and type
4730 variables) in <literal>s1 .. sm</literal> is strictly smaller than
4731 in <literal>t1 .. tn</literal>, and</para>
4734 <para>for every type
4735 variable <literal>a</literal>, <literal>a</literal> occurs
4736 in <literal>s1 .. sm</literal> at most as often as in <literal>t1
4737 .. tn</literal>.</para>
4740 These restrictions are easily verified and ensure termination of type
4741 inference. However, they are not sufficient to guarantee completeness
4742 of type inference in the presence of, so called, ''loopy equalities'',
4743 such as <literal>a ~ [F a]</literal>, where a recursive occurrence of
4744 a type variable is underneath a family application and data
4745 constructor application - see the above mentioned paper for details.
4748 If the option <option>-XUndecidableInstances</option> is passed to the
4749 compiler, the above restrictions are not enforced and it is on the
4750 programmer to ensure termination of the normalisation of type families
4751 during type inference.
4756 <sect3 id-="equality-constraints">
4757 <title>Equality constraints</title>
4759 Type context can include equality constraints of the form <literal>t1 ~
4760 t2</literal>, which denote that the types <literal>t1</literal>
4761 and <literal>t2</literal> need to be the same. In the presence of type
4762 families, whether two types are equal cannot generally be decided
4763 locally. Hence, the contexts of function signatures may include
4764 equality constraints, as in the following example:
4766 sumCollects :: (Collects c1, Collects c2, Elem c1 ~ Elem c2) => c1 -> c2 -> c2
4768 where we require that the element type of <literal>c1</literal>
4769 and <literal>c2</literal> are the same. In general, the
4770 types <literal>t1</literal> and <literal>t2</literal> of an equality
4771 constraint may be arbitrary monotypes; i.e., they may not contain any
4772 quantifiers, independent of whether higher-rank types are otherwise
4776 Equality constraints can also appear in class and instance contexts.
4777 The former enable a simple translation of programs using functional
4778 dependencies into programs using family synonyms instead. The general
4779 idea is to rewrite a class declaration of the form
4781 class C a b | a -> b
4785 class (F a ~ b) => C a b where
4788 That is, we represent every functional dependency (FD) <literal>a1 .. an
4789 -> b</literal> by an FD type family <literal>F a1 .. an</literal> and a
4790 superclass context equality <literal>F a1 .. an ~ b</literal>,
4791 essentially giving a name to the functional dependency. In class
4792 instances, we define the type instances of FD families in accordance
4793 with the class head. Method signatures are not affected by that
4797 NB: Equalities in superclass contexts are not fully implemented in
4802 <sect3 id-="ty-fams-in-instances">
4803 <title>Type families and instance declarations</title>
4804 <para>Type families require us to extend the rules for
4805 the form of instance heads, which are given
4806 in <xref linkend="flexible-instance-head"/>.
4809 <listitem><para>Data type families may appear in an instance head</para></listitem>
4810 <listitem><para>Type synonym families may not appear (at all) in an instance head</para></listitem>
4812 The reason for the latter restriction is that there is no way to check for. Consider
4815 type instance F Bool = Int
4822 Now a constraint <literal>(C (F Bool))</literal> would match both instances.
4823 The situation is especially bad because the type instance for <literal>F Bool</literal>
4824 might be in another module, or even in a module that is not yet written.
4831 <sect1 id="other-type-extensions">
4832 <title>Other type system extensions</title>
4834 <sect2 id="explicit-foralls"><title>Explicit universal quantification (forall)</title>
4836 Haskell type signatures are implicitly quantified. When the language option <option>-XExplicitForAll</option>
4837 is used, the keyword <literal>forall</literal>
4838 allows us to say exactly what this means. For example:
4846 g :: forall b. (b -> b)
4848 The two are treated identically.
4851 Of course <literal>forall</literal> becomes a keyword; you can't use <literal>forall</literal> as
4852 a type variable any more!
4857 <sect2 id="flexible-contexts"><title>The context of a type signature</title>
4859 The <option>-XFlexibleContexts</option> flag lifts the Haskell 98 restriction
4860 that the type-class constraints in a type signature must have the
4861 form <emphasis>(class type-variable)</emphasis> or
4862 <emphasis>(class (type-variable type-variable ...))</emphasis>.
4863 With <option>-XFlexibleContexts</option>
4864 these type signatures are perfectly OK
4867 g :: Ord (T a ()) => ...
4869 The flag <option>-XFlexibleContexts</option> also lifts the corresponding
4870 restriction on class declarations (<xref linkend="superclass-rules"/>) and instance declarations
4871 (<xref linkend="instance-rules"/>).
4875 GHC imposes the following restrictions on the constraints in a type signature.
4879 forall tv1..tvn (c1, ...,cn) => type
4882 (Here, we write the "foralls" explicitly, although the Haskell source
4883 language omits them; in Haskell 98, all the free type variables of an
4884 explicit source-language type signature are universally quantified,
4885 except for the class type variables in a class declaration. However,
4886 in GHC, you can give the foralls if you want. See <xref linkend="explicit-foralls"/>).
4895 <emphasis>Each universally quantified type variable
4896 <literal>tvi</literal> must be reachable from <literal>type</literal></emphasis>.
4898 A type variable <literal>a</literal> is "reachable" if it appears
4899 in the same constraint as either a type variable free in
4900 <literal>type</literal>, or another reachable type variable.
4901 A value with a type that does not obey
4902 this reachability restriction cannot be used without introducing
4903 ambiguity; that is why the type is rejected.
4904 Here, for example, is an illegal type:
4908 forall a. Eq a => Int
4912 When a value with this type was used, the constraint <literal>Eq tv</literal>
4913 would be introduced where <literal>tv</literal> is a fresh type variable, and
4914 (in the dictionary-translation implementation) the value would be
4915 applied to a dictionary for <literal>Eq tv</literal>. The difficulty is that we
4916 can never know which instance of <literal>Eq</literal> to use because we never
4917 get any more information about <literal>tv</literal>.
4921 that the reachability condition is weaker than saying that <literal>a</literal> is
4922 functionally dependent on a type variable free in
4923 <literal>type</literal> (see <xref
4924 linkend="functional-dependencies"/>). The reason for this is there
4925 might be a "hidden" dependency, in a superclass perhaps. So
4926 "reachable" is a conservative approximation to "functionally dependent".
4927 For example, consider:
4929 class C a b | a -> b where ...
4930 class C a b => D a b where ...
4931 f :: forall a b. D a b => a -> a
4933 This is fine, because in fact <literal>a</literal> does functionally determine <literal>b</literal>
4934 but that is not immediately apparent from <literal>f</literal>'s type.
4940 <emphasis>Every constraint <literal>ci</literal> must mention at least one of the
4941 universally quantified type variables <literal>tvi</literal></emphasis>.
4943 For example, this type is OK because <literal>C a b</literal> mentions the
4944 universally quantified type variable <literal>b</literal>:
4948 forall a. C a b => burble
4952 The next type is illegal because the constraint <literal>Eq b</literal> does not
4953 mention <literal>a</literal>:
4957 forall a. Eq b => burble
4961 The reason for this restriction is milder than the other one. The
4962 excluded types are never useful or necessary (because the offending
4963 context doesn't need to be witnessed at this point; it can be floated
4964 out). Furthermore, floating them out increases sharing. Lastly,
4965 excluding them is a conservative choice; it leaves a patch of
4966 territory free in case we need it later.
4977 <sect2 id="implicit-parameters">
4978 <title>Implicit parameters</title>
4980 <para> Implicit parameters are implemented as described in
4981 "Implicit parameters: dynamic scoping with static types",
4982 J Lewis, MB Shields, E Meijer, J Launchbury,
4983 27th ACM Symposium on Principles of Programming Languages (POPL'00),
4987 <para>(Most of the following, still rather incomplete, documentation is
4988 due to Jeff Lewis.)</para>
4990 <para>Implicit parameter support is enabled with the option
4991 <option>-XImplicitParams</option>.</para>
4994 A variable is called <emphasis>dynamically bound</emphasis> when it is bound by the calling
4995 context of a function and <emphasis>statically bound</emphasis> when bound by the callee's
4996 context. In Haskell, all variables are statically bound. Dynamic
4997 binding of variables is a notion that goes back to Lisp, but was later
4998 discarded in more modern incarnations, such as Scheme. Dynamic binding
4999 can be very confusing in an untyped language, and unfortunately, typed
5000 languages, in particular Hindley-Milner typed languages like Haskell,
5001 only support static scoping of variables.
5004 However, by a simple extension to the type class system of Haskell, we
5005 can support dynamic binding. Basically, we express the use of a
5006 dynamically bound variable as a constraint on the type. These
5007 constraints lead to types of the form <literal>(?x::t') => t</literal>, which says "this
5008 function uses a dynamically-bound variable <literal>?x</literal>
5009 of type <literal>t'</literal>". For
5010 example, the following expresses the type of a sort function,
5011 implicitly parameterized by a comparison function named <literal>cmp</literal>.
5013 sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]
5015 The dynamic binding constraints are just a new form of predicate in the type class system.
5018 An implicit parameter occurs in an expression using the special form <literal>?x</literal>,
5019 where <literal>x</literal> is
5020 any valid identifier (e.g. <literal>ord ?x</literal> is a valid expression).
5021 Use of this construct also introduces a new
5022 dynamic-binding constraint in the type of the expression.
5023 For example, the following definition
5024 shows how we can define an implicitly parameterized sort function in
5025 terms of an explicitly parameterized <literal>sortBy</literal> function:
5027 sortBy :: (a -> a -> Bool) -> [a] -> [a]
5029 sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]
5035 <title>Implicit-parameter type constraints</title>
5037 Dynamic binding constraints behave just like other type class
5038 constraints in that they are automatically propagated. Thus, when a
5039 function is used, its implicit parameters are inherited by the
5040 function that called it. For example, our <literal>sort</literal> function might be used
5041 to pick out the least value in a list:
5043 least :: (?cmp :: a -> a -> Bool) => [a] -> a
5044 least xs = head (sort xs)
5046 Without lifting a finger, the <literal>?cmp</literal> parameter is
5047 propagated to become a parameter of <literal>least</literal> as well. With explicit
5048 parameters, the default is that parameters must always be explicit
5049 propagated. With implicit parameters, the default is to always
5053 An implicit-parameter type constraint differs from other type class constraints in the
5054 following way: All uses of a particular implicit parameter must have
5055 the same type. This means that the type of <literal>(?x, ?x)</literal>
5056 is <literal>(?x::a) => (a,a)</literal>, and not
5057 <literal>(?x::a, ?x::b) => (a, b)</literal>, as would be the case for type
5061 <para> You can't have an implicit parameter in the context of a class or instance
5062 declaration. For example, both these declarations are illegal:
5064 class (?x::Int) => C a where ...
5065 instance (?x::a) => Foo [a] where ...
5067 Reason: exactly which implicit parameter you pick up depends on exactly where
5068 you invoke a function. But the ``invocation'' of instance declarations is done
5069 behind the scenes by the compiler, so it's hard to figure out exactly where it is done.
5070 Easiest thing is to outlaw the offending types.</para>
5072 Implicit-parameter constraints do not cause ambiguity. For example, consider:
5074 f :: (?x :: [a]) => Int -> Int
5077 g :: (Read a, Show a) => String -> String
5080 Here, <literal>g</literal> has an ambiguous type, and is rejected, but <literal>f</literal>
5081 is fine. The binding for <literal>?x</literal> at <literal>f</literal>'s call site is
5082 quite unambiguous, and fixes the type <literal>a</literal>.
5087 <title>Implicit-parameter bindings</title>
5090 An implicit parameter is <emphasis>bound</emphasis> using the standard
5091 <literal>let</literal> or <literal>where</literal> binding forms.
5092 For example, we define the <literal>min</literal> function by binding
5093 <literal>cmp</literal>.
5096 min = let ?cmp = (<=) in least
5100 A group of implicit-parameter bindings may occur anywhere a normal group of Haskell
5101 bindings can occur, except at top level. That is, they can occur in a <literal>let</literal>
5102 (including in a list comprehension, or do-notation, or pattern guards),
5103 or a <literal>where</literal> clause.
5104 Note the following points:
5107 An implicit-parameter binding group must be a
5108 collection of simple bindings to implicit-style variables (no
5109 function-style bindings, and no type signatures); these bindings are
5110 neither polymorphic or recursive.
5113 You may not mix implicit-parameter bindings with ordinary bindings in a
5114 single <literal>let</literal>
5115 expression; use two nested <literal>let</literal>s instead.
5116 (In the case of <literal>where</literal> you are stuck, since you can't nest <literal>where</literal> clauses.)
5120 You may put multiple implicit-parameter bindings in a
5121 single binding group; but they are <emphasis>not</emphasis> treated
5122 as a mutually recursive group (as ordinary <literal>let</literal> bindings are).
5123 Instead they are treated as a non-recursive group, simultaneously binding all the implicit
5124 parameter. The bindings are not nested, and may be re-ordered without changing
5125 the meaning of the program.
5126 For example, consider:
5128 f t = let { ?x = t; ?y = ?x+(1::Int) } in ?x + ?y
5130 The use of <literal>?x</literal> in the binding for <literal>?y</literal> does not "see"
5131 the binding for <literal>?x</literal>, so the type of <literal>f</literal> is
5133 f :: (?x::Int) => Int -> Int
5141 <sect3><title>Implicit parameters and polymorphic recursion</title>
5144 Consider these two definitions:
5147 len1 xs = let ?acc = 0 in len_acc1 xs
5150 len_acc1 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc1 xs
5155 len2 xs = let ?acc = 0 in len_acc2 xs
5157 len_acc2 :: (?acc :: Int) => [a] -> Int
5159 len_acc2 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc2 xs
5161 The only difference between the two groups is that in the second group
5162 <literal>len_acc</literal> is given a type signature.
5163 In the former case, <literal>len_acc1</literal> is monomorphic in its own
5164 right-hand side, so the implicit parameter <literal>?acc</literal> is not
5165 passed to the recursive call. In the latter case, because <literal>len_acc2</literal>
5166 has a type signature, the recursive call is made to the
5167 <emphasis>polymorphic</emphasis> version, which takes <literal>?acc</literal>
5168 as an implicit parameter. So we get the following results in GHCi:
5175 Adding a type signature dramatically changes the result! This is a rather
5176 counter-intuitive phenomenon, worth watching out for.
5180 <sect3><title>Implicit parameters and monomorphism</title>
5182 <para>GHC applies the dreaded Monomorphism Restriction (section 4.5.5 of the
5183 Haskell Report) to implicit parameters. For example, consider:
5191 Since the binding for <literal>y</literal> falls under the Monomorphism
5192 Restriction it is not generalised, so the type of <literal>y</literal> is
5193 simply <literal>Int</literal>, not <literal>(?x::Int) => Int</literal>.
5194 Hence, <literal>(f 9)</literal> returns result <literal>9</literal>.
5195 If you add a type signature for <literal>y</literal>, then <literal>y</literal>
5196 will get type <literal>(?x::Int) => Int</literal>, so the occurrence of
5197 <literal>y</literal> in the body of the <literal>let</literal> will see the
5198 inner binding of <literal>?x</literal>, so <literal>(f 9)</literal> will return
5199 <literal>14</literal>.
5204 <!-- ======================= COMMENTED OUT ========================
5206 We intend to remove linear implicit parameters, so I'm at least removing
5207 them from the 6.6 user manual
5209 <sect2 id="linear-implicit-parameters">
5210 <title>Linear implicit parameters</title>
5212 Linear implicit parameters are an idea developed by Koen Claessen,
5213 Mark Shields, and Simon PJ. They address the long-standing
5214 problem that monads seem over-kill for certain sorts of problem, notably:
5217 <listitem> <para> distributing a supply of unique names </para> </listitem>
5218 <listitem> <para> distributing a supply of random numbers </para> </listitem>
5219 <listitem> <para> distributing an oracle (as in QuickCheck) </para> </listitem>
5223 Linear implicit parameters are just like ordinary implicit parameters,
5224 except that they are "linear"; that is, they cannot be copied, and
5225 must be explicitly "split" instead. Linear implicit parameters are
5226 written '<literal>%x</literal>' instead of '<literal>?x</literal>'.
5227 (The '/' in the '%' suggests the split!)
5232 import GHC.Exts( Splittable )
5234 data NameSupply = ...
5236 splitNS :: NameSupply -> (NameSupply, NameSupply)
5237 newName :: NameSupply -> Name
5239 instance Splittable NameSupply where
5243 f :: (%ns :: NameSupply) => Env -> Expr -> Expr
5244 f env (Lam x e) = Lam x' (f env e)
5247 env' = extend env x x'
5248 ...more equations for f...
5250 Notice that the implicit parameter %ns is consumed
5252 <listitem> <para> once by the call to <literal>newName</literal> </para> </listitem>
5253 <listitem> <para> once by the recursive call to <literal>f</literal> </para></listitem>
5257 So the translation done by the type checker makes
5258 the parameter explicit:
5260 f :: NameSupply -> Env -> Expr -> Expr
5261 f ns env (Lam x e) = Lam x' (f ns1 env e)
5263 (ns1,ns2) = splitNS ns
5265 env = extend env x x'
5267 Notice the call to 'split' introduced by the type checker.
5268 How did it know to use 'splitNS'? Because what it really did
5269 was to introduce a call to the overloaded function 'split',
5270 defined by the class <literal>Splittable</literal>:
5272 class Splittable a where
5275 The instance for <literal>Splittable NameSupply</literal> tells GHC how to implement
5276 split for name supplies. But we can simply write
5282 g :: (Splittable a, %ns :: a) => b -> (b,a,a)
5284 The <literal>Splittable</literal> class is built into GHC. It's exported by module
5285 <literal>GHC.Exts</literal>.
5290 <listitem> <para> '<literal>?x</literal>' and '<literal>%x</literal>'
5291 are entirely distinct implicit parameters: you
5292 can use them together and they won't interfere with each other. </para>
5295 <listitem> <para> You can bind linear implicit parameters in 'with' clauses. </para> </listitem>
5297 <listitem> <para>You cannot have implicit parameters (whether linear or not)
5298 in the context of a class or instance declaration. </para></listitem>
5302 <sect3><title>Warnings</title>
5305 The monomorphism restriction is even more important than usual.
5306 Consider the example above:
5308 f :: (%ns :: NameSupply) => Env -> Expr -> Expr
5309 f env (Lam x e) = Lam x' (f env e)
5312 env' = extend env x x'
5314 If we replaced the two occurrences of x' by (newName %ns), which is
5315 usually a harmless thing to do, we get:
5317 f :: (%ns :: NameSupply) => Env -> Expr -> Expr
5318 f env (Lam x e) = Lam (newName %ns) (f env e)
5320 env' = extend env x (newName %ns)
5322 But now the name supply is consumed in <emphasis>three</emphasis> places
5323 (the two calls to newName,and the recursive call to f), so
5324 the result is utterly different. Urk! We don't even have
5328 Well, this is an experimental change. With implicit
5329 parameters we have already lost beta reduction anyway, and
5330 (as John Launchbury puts it) we can't sensibly reason about
5331 Haskell programs without knowing their typing.
5336 <sect3><title>Recursive functions</title>
5337 <para>Linear implicit parameters can be particularly tricky when you have a recursive function
5340 foo :: %x::T => Int -> [Int]
5342 foo n = %x : foo (n-1)
5344 where T is some type in class Splittable.</para>
5346 Do you get a list of all the same T's or all different T's
5347 (assuming that split gives two distinct T's back)?
5349 If you supply the type signature, taking advantage of polymorphic
5350 recursion, you get what you'd probably expect. Here's the
5351 translated term, where the implicit param is made explicit:
5354 foo x n = let (x1,x2) = split x
5355 in x1 : foo x2 (n-1)
5357 But if you don't supply a type signature, GHC uses the Hindley
5358 Milner trick of using a single monomorphic instance of the function
5359 for the recursive calls. That is what makes Hindley Milner type inference
5360 work. So the translation becomes
5364 foom n = x : foom (n-1)
5368 Result: 'x' is not split, and you get a list of identical T's. So the
5369 semantics of the program depends on whether or not foo has a type signature.
5372 You may say that this is a good reason to dislike linear implicit parameters
5373 and you'd be right. That is why they are an experimental feature.
5379 ================ END OF Linear Implicit Parameters commented out -->
5381 <sect2 id="kinding">
5382 <title>Explicitly-kinded quantification</title>
5385 Haskell infers the kind of each type variable. Sometimes it is nice to be able
5386 to give the kind explicitly as (machine-checked) documentation,
5387 just as it is nice to give a type signature for a function. On some occasions,
5388 it is essential to do so. For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999)
5389 John Hughes had to define the data type:
5391 data Set cxt a = Set [a]
5392 | Unused (cxt a -> ())
5394 The only use for the <literal>Unused</literal> constructor was to force the correct
5395 kind for the type variable <literal>cxt</literal>.
5398 GHC now instead allows you to specify the kind of a type variable directly, wherever
5399 a type variable is explicitly bound, with the flag <option>-XKindSignatures</option>.
5402 This flag enables kind signatures in the following places:
5404 <listitem><para><literal>data</literal> declarations:
5406 data Set (cxt :: * -> *) a = Set [a]
5407 </screen></para></listitem>
5408 <listitem><para><literal>type</literal> declarations:
5410 type T (f :: * -> *) = f Int
5411 </screen></para></listitem>
5412 <listitem><para><literal>class</literal> declarations:
5414 class (Eq a) => C (f :: * -> *) a where ...
5415 </screen></para></listitem>
5416 <listitem><para><literal>forall</literal>'s in type signatures:
5418 f :: forall (cxt :: * -> *). Set cxt Int
5419 </screen></para></listitem>
5424 The parentheses are required. Some of the spaces are required too, to
5425 separate the lexemes. If you write <literal>(f::*->*)</literal> you
5426 will get a parse error, because "<literal>::*->*</literal>" is a
5427 single lexeme in Haskell.
5431 As part of the same extension, you can put kind annotations in types
5434 f :: (Int :: *) -> Int
5435 g :: forall a. a -> (a :: *)
5439 atype ::= '(' ctype '::' kind ')
5441 The parentheses are required.
5446 <sect2 id="universal-quantification">
5447 <title>Arbitrary-rank polymorphism
5451 GHC's type system supports <emphasis>arbitrary-rank</emphasis>
5452 explicit universal quantification in
5454 For example, all the following types are legal:
5456 f1 :: forall a b. a -> b -> a
5457 g1 :: forall a b. (Ord a, Eq b) => a -> b -> a
5459 f2 :: (forall a. a->a) -> Int -> Int
5460 g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int
5462 f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool
5464 f4 :: Int -> (forall a. a -> a)
5466 Here, <literal>f1</literal> and <literal>g1</literal> are rank-1 types, and
5467 can be written in standard Haskell (e.g. <literal>f1 :: a->b->a</literal>).
5468 The <literal>forall</literal> makes explicit the universal quantification that
5469 is implicitly added by Haskell.
5472 The functions <literal>f2</literal> and <literal>g2</literal> have rank-2 types;
5473 the <literal>forall</literal> is on the left of a function arrow. As <literal>g2</literal>
5474 shows, the polymorphic type on the left of the function arrow can be overloaded.
5477 The function <literal>f3</literal> has a rank-3 type;
5478 it has rank-2 types on the left of a function arrow.
5481 GHC has three flags to control higher-rank types:
5484 <option>-XPolymorphicComponents</option>: data constructors (only) can have polymorphic argument types.
5487 <option>-XRank2Types</option>: any function (including data constructors) can have a rank-2 type.
5490 <option>-XRankNTypes</option>: any function (including data constructors) can have an arbitrary-rank type.
5491 That is, you can nest <literal>forall</literal>s
5492 arbitrarily deep in function arrows.
5493 In particular, a forall-type (also called a "type scheme"),
5494 including an operational type class context, is legal:
5496 <listitem> <para> On the left or right (see <literal>f4</literal>, for example)
5497 of a function arrow </para> </listitem>
5498 <listitem> <para> As the argument of a constructor, or type of a field, in a data type declaration. For
5499 example, any of the <literal>f1,f2,f3,g1,g2</literal> above would be valid
5500 field type signatures.</para> </listitem>
5501 <listitem> <para> As the type of an implicit parameter </para> </listitem>
5502 <listitem> <para> In a pattern type signature (see <xref linkend="scoped-type-variables"/>) </para> </listitem>
5514 In a <literal>data</literal> or <literal>newtype</literal> declaration one can quantify
5515 the types of the constructor arguments. Here are several examples:
5521 data T a = T1 (forall b. b -> b -> b) a
5523 data MonadT m = MkMonad { return :: forall a. a -> m a,
5524 bind :: forall a b. m a -> (a -> m b) -> m b
5527 newtype Swizzle = MkSwizzle (Ord a => [a] -> [a])
5533 The constructors have rank-2 types:
5539 T1 :: forall a. (forall b. b -> b -> b) -> a -> T a
5540 MkMonad :: forall m. (forall a. a -> m a)
5541 -> (forall a b. m a -> (a -> m b) -> m b)
5543 MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle
5549 Notice that you don't need to use a <literal>forall</literal> if there's an
5550 explicit context. For example in the first argument of the
5551 constructor <function>MkSwizzle</function>, an implicit "<literal>forall a.</literal>" is
5552 prefixed to the argument type. The implicit <literal>forall</literal>
5553 quantifies all type variables that are not already in scope, and are
5554 mentioned in the type quantified over.
5558 As for type signatures, implicit quantification happens for non-overloaded
5559 types too. So if you write this:
5562 data T a = MkT (Either a b) (b -> b)
5565 it's just as if you had written this:
5568 data T a = MkT (forall b. Either a b) (forall b. b -> b)
5571 That is, since the type variable <literal>b</literal> isn't in scope, it's
5572 implicitly universally quantified. (Arguably, it would be better
5573 to <emphasis>require</emphasis> explicit quantification on constructor arguments
5574 where that is what is wanted. Feedback welcomed.)
5578 You construct values of types <literal>T1, MonadT, Swizzle</literal> by applying
5579 the constructor to suitable values, just as usual. For example,
5590 a3 = MkSwizzle reverse
5593 a4 = let r x = Just x
5600 mkTs :: (forall b. b -> b -> b) -> a -> [T a]
5601 mkTs f x y = [T1 f x, T1 f y]
5607 The type of the argument can, as usual, be more general than the type
5608 required, as <literal>(MkSwizzle reverse)</literal> shows. (<function>reverse</function>
5609 does not need the <literal>Ord</literal> constraint.)
5613 When you use pattern matching, the bound variables may now have
5614 polymorphic types. For example:
5620 f :: T a -> a -> (a, Char)
5621 f (T1 w k) x = (w k x, w 'c' 'd')
5623 g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b]
5624 g (MkSwizzle s) xs f = s (map f (s xs))
5626 h :: MonadT m -> [m a] -> m [a]
5627 h m [] = return m []
5628 h m (x:xs) = bind m x $ \y ->
5629 bind m (h m xs) $ \ys ->
5636 In the function <function>h</function> we use the record selectors <literal>return</literal>
5637 and <literal>bind</literal> to extract the polymorphic bind and return functions
5638 from the <literal>MonadT</literal> data structure, rather than using pattern
5644 <title>Type inference</title>
5647 In general, type inference for arbitrary-rank types is undecidable.
5648 GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96)
5649 to get a decidable algorithm by requiring some help from the programmer.
5650 We do not yet have a formal specification of "some help" but the rule is this:
5653 <emphasis>For a lambda-bound or case-bound variable, x, either the programmer
5654 provides an explicit polymorphic type for x, or GHC's type inference will assume
5655 that x's type has no foralls in it</emphasis>.
5658 What does it mean to "provide" an explicit type for x? You can do that by
5659 giving a type signature for x directly, using a pattern type signature
5660 (<xref linkend="scoped-type-variables"/>), thus:
5662 \ f :: (forall a. a->a) -> (f True, f 'c')
5664 Alternatively, you can give a type signature to the enclosing
5665 context, which GHC can "push down" to find the type for the variable:
5667 (\ f -> (f True, f 'c')) :: (forall a. a->a) -> (Bool,Char)
5669 Here the type signature on the expression can be pushed inwards
5670 to give a type signature for f. Similarly, and more commonly,
5671 one can give a type signature for the function itself:
5673 h :: (forall a. a->a) -> (Bool,Char)
5674 h f = (f True, f 'c')
5676 You don't need to give a type signature if the lambda bound variable
5677 is a constructor argument. Here is an example we saw earlier:
5679 f :: T a -> a -> (a, Char)
5680 f (T1 w k) x = (w k x, w 'c' 'd')
5682 Here we do not need to give a type signature to <literal>w</literal>, because
5683 it is an argument of constructor <literal>T1</literal> and that tells GHC all
5690 <sect3 id="implicit-quant">
5691 <title>Implicit quantification</title>
5694 GHC performs implicit quantification as follows. <emphasis>At the top level (only) of
5695 user-written types, if and only if there is no explicit <literal>forall</literal>,
5696 GHC finds all the type variables mentioned in the type that are not already
5697 in scope, and universally quantifies them.</emphasis> For example, the following pairs are
5701 f :: forall a. a -> a
5708 h :: forall b. a -> b -> b
5714 Notice that GHC does <emphasis>not</emphasis> find the innermost possible quantification
5717 f :: (a -> a) -> Int
5719 f :: forall a. (a -> a) -> Int
5721 f :: (forall a. a -> a) -> Int
5724 g :: (Ord a => a -> a) -> Int
5725 -- MEANS the illegal type
5726 g :: forall a. (Ord a => a -> a) -> Int
5728 g :: (forall a. Ord a => a -> a) -> Int
5730 The latter produces an illegal type, which you might think is silly,
5731 but at least the rule is simple. If you want the latter type, you
5732 can write your for-alls explicitly. Indeed, doing so is strongly advised
5739 <sect2 id="impredicative-polymorphism">
5740 <title>Impredicative polymorphism
5742 <para><emphasis>NOTE: the impredicative-polymorphism feature is deprecated in GHC 6.12, and
5743 will be removed or replaced in GHC 6.14.</emphasis></para>
5745 <para>GHC supports <emphasis>impredicative polymorphism</emphasis>,
5746 enabled with <option>-XImpredicativeTypes</option>.
5748 that you can call a polymorphic function at a polymorphic type, and
5749 parameterise data structures over polymorphic types. For example:
5751 f :: Maybe (forall a. [a] -> [a]) -> Maybe ([Int], [Char])
5752 f (Just g) = Just (g [3], g "hello")
5755 Notice here that the <literal>Maybe</literal> type is parameterised by the
5756 <emphasis>polymorphic</emphasis> type <literal>(forall a. [a] ->
5759 <para>The technical details of this extension are described in the paper
5760 <ulink url="http://research.microsoft.com/%7Esimonpj/papers/boxy/">Boxy types:
5761 type inference for higher-rank types and impredicativity</ulink>,
5762 which appeared at ICFP 2006.
5766 <sect2 id="scoped-type-variables">
5767 <title>Lexically scoped type variables
5771 GHC supports <emphasis>lexically scoped type variables</emphasis>, without
5772 which some type signatures are simply impossible to write. For example:
5774 f :: forall a. [a] -> [a]
5780 The type signature for <literal>f</literal> brings the type variable <literal>a</literal> into scope,
5781 because of the explicit <literal>forall</literal> (<xref linkend="decl-type-sigs"/>).
5782 The type variables bound by a <literal>forall</literal> scope over
5783 the entire definition of the accompanying value declaration.
5784 In this example, the type variable <literal>a</literal> scopes over the whole
5785 definition of <literal>f</literal>, including over
5786 the type signature for <varname>ys</varname>.
5787 In Haskell 98 it is not possible to declare
5788 a type for <varname>ys</varname>; a major benefit of scoped type variables is that
5789 it becomes possible to do so.
5791 <para>Lexically-scoped type variables are enabled by
5792 <option>-XScopedTypeVariables</option>. This flag implies <option>-XRelaxedPolyRec</option>.
5794 <para>Note: GHC 6.6 contains substantial changes to the way that scoped type
5795 variables work, compared to earlier releases. Read this section
5799 <title>Overview</title>
5801 <para>The design follows the following principles
5803 <listitem><para>A scoped type variable stands for a type <emphasis>variable</emphasis>, and not for
5804 a <emphasis>type</emphasis>. (This is a change from GHC's earlier
5805 design.)</para></listitem>
5806 <listitem><para>Furthermore, distinct lexical type variables stand for distinct
5807 type variables. This means that every programmer-written type signature
5808 (including one that contains free scoped type variables) denotes a
5809 <emphasis>rigid</emphasis> type; that is, the type is fully known to the type
5810 checker, and no inference is involved.</para></listitem>
5811 <listitem><para>Lexical type variables may be alpha-renamed freely, without
5812 changing the program.</para></listitem>
5816 A <emphasis>lexically scoped type variable</emphasis> can be bound by:
5818 <listitem><para>A declaration type signature (<xref linkend="decl-type-sigs"/>)</para></listitem>
5819 <listitem><para>An expression type signature (<xref linkend="exp-type-sigs"/>)</para></listitem>
5820 <listitem><para>A pattern type signature (<xref linkend="pattern-type-sigs"/>)</para></listitem>
5821 <listitem><para>Class and instance declarations (<xref linkend="cls-inst-scoped-tyvars"/>)</para></listitem>
5825 In Haskell, a programmer-written type signature is implicitly quantified over
5826 its free type variables (<ulink
5827 url="http://www.haskell.org/onlinereport/decls.html#sect4.1.2">Section
5829 of the Haskell Report).
5830 Lexically scoped type variables affect this implicit quantification rules
5831 as follows: any type variable that is in scope is <emphasis>not</emphasis> universally
5832 quantified. For example, if type variable <literal>a</literal> is in scope,
5835 (e :: a -> a) means (e :: a -> a)
5836 (e :: b -> b) means (e :: forall b. b->b)
5837 (e :: a -> b) means (e :: forall b. a->b)
5845 <sect3 id="decl-type-sigs">
5846 <title>Declaration type signatures</title>
5847 <para>A declaration type signature that has <emphasis>explicit</emphasis>
5848 quantification (using <literal>forall</literal>) brings into scope the
5849 explicitly-quantified
5850 type variables, in the definition of the named function. For example:
5852 f :: forall a. [a] -> [a]
5853 f (x:xs) = xs ++ [ x :: a ]
5855 The "<literal>forall a</literal>" brings "<literal>a</literal>" into scope in
5856 the definition of "<literal>f</literal>".
5858 <para>This only happens if:
5860 <listitem><para> The quantification in <literal>f</literal>'s type
5861 signature is explicit. For example:
5864 g (x:xs) = xs ++ [ x :: a ]
5866 This program will be rejected, because "<literal>a</literal>" does not scope
5867 over the definition of "<literal>f</literal>", so "<literal>x::a</literal>"
5868 means "<literal>x::forall a. a</literal>" by Haskell's usual implicit
5869 quantification rules.
5871 <listitem><para> The signature gives a type for a function binding or a bare variable binding,
5872 not a pattern binding.
5875 f1 :: forall a. [a] -> [a]
5876 f1 (x:xs) = xs ++ [ x :: a ] -- OK
5878 f2 :: forall a. [a] -> [a]
5879 f2 = \(x:xs) -> xs ++ [ x :: a ] -- OK
5881 f3 :: forall a. [a] -> [a]
5882 Just f3 = Just (\(x:xs) -> xs ++ [ x :: a ]) -- Not OK!
5884 The binding for <literal>f3</literal> is a pattern binding, and so its type signature
5885 does not bring <literal>a</literal> into scope. However <literal>f1</literal> is a
5886 function binding, and <literal>f2</literal> binds a bare variable; in both cases
5887 the type signature brings <literal>a</literal> into scope.
5893 <sect3 id="exp-type-sigs">
5894 <title>Expression type signatures</title>
5896 <para>An expression type signature that has <emphasis>explicit</emphasis>
5897 quantification (using <literal>forall</literal>) brings into scope the
5898 explicitly-quantified
5899 type variables, in the annotated expression. For example:
5901 f = runST ( (op >>= \(x :: STRef s Int) -> g x) :: forall s. ST s Bool )
5903 Here, the type signature <literal>forall a. ST s Bool</literal> brings the
5904 type variable <literal>s</literal> into scope, in the annotated expression
5905 <literal>(op >>= \(x :: STRef s Int) -> g x)</literal>.
5910 <sect3 id="pattern-type-sigs">
5911 <title>Pattern type signatures</title>
5913 A type signature may occur in any pattern; this is a <emphasis>pattern type
5914 signature</emphasis>.
5917 -- f and g assume that 'a' is already in scope
5918 f = \(x::Int, y::a) -> x
5920 h ((x,y) :: (Int,Bool)) = (y,x)
5922 In the case where all the type variables in the pattern type signature are
5923 already in scope (i.e. bound by the enclosing context), matters are simple: the
5924 signature simply constrains the type of the pattern in the obvious way.
5927 Unlike expression and declaration type signatures, pattern type signatures are not implicitly generalised.
5928 The pattern in a <emphasis>pattern binding</emphasis> may only mention type variables
5929 that are already in scope. For example:
5931 f :: forall a. [a] -> (Int, [a])
5934 (ys::[a], n) = (reverse xs, length xs) -- OK
5935 zs::[a] = xs ++ ys -- OK
5937 Just (v::b) = ... -- Not OK; b is not in scope
5939 Here, the pattern signatures for <literal>ys</literal> and <literal>zs</literal>
5940 are fine, but the one for <literal>v</literal> is not because <literal>b</literal> is
5944 However, in all patterns <emphasis>other</emphasis> than pattern bindings, a pattern
5945 type signature may mention a type variable that is not in scope; in this case,
5946 <emphasis>the signature brings that type variable into scope</emphasis>.
5947 This is particularly important for existential data constructors. For example:
5949 data T = forall a. MkT [a]
5952 k (MkT [t::a]) = MkT t3
5956 Here, the pattern type signature <literal>(t::a)</literal> mentions a lexical type
5957 variable that is not already in scope. Indeed, it <emphasis>cannot</emphasis> already be in scope,
5958 because it is bound by the pattern match. GHC's rule is that in this situation
5959 (and only then), a pattern type signature can mention a type variable that is
5960 not already in scope; the effect is to bring it into scope, standing for the
5961 existentially-bound type variable.
5964 When a pattern type signature binds a type variable in this way, GHC insists that the
5965 type variable is bound to a <emphasis>rigid</emphasis>, or fully-known, type variable.
5966 This means that any user-written type signature always stands for a completely known type.
5969 If all this seems a little odd, we think so too. But we must have
5970 <emphasis>some</emphasis> way to bring such type variables into scope, else we
5971 could not name existentially-bound type variables in subsequent type signatures.
5974 This is (now) the <emphasis>only</emphasis> situation in which a pattern type
5975 signature is allowed to mention a lexical variable that is not already in
5977 For example, both <literal>f</literal> and <literal>g</literal> would be
5978 illegal if <literal>a</literal> was not already in scope.
5984 <!-- ==================== Commented out part about result type signatures
5986 <sect3 id="result-type-sigs">
5987 <title>Result type signatures</title>
5990 The result type of a function, lambda, or case expression alternative can be given a signature, thus:
5993 {- f assumes that 'a' is already in scope -}
5994 f x y :: [a] = [x,y,x]
5996 g = \ x :: [Int] -> [3,4]
5998 h :: forall a. [a] -> a
6002 The final <literal>:: [a]</literal> after the patterns of <literal>f</literal> gives the type of
6003 the result of the function. Similarly, the body of the lambda in the RHS of
6004 <literal>g</literal> is <literal>[Int]</literal>, and the RHS of the case
6005 alternative in <literal>h</literal> is <literal>a</literal>.
6007 <para> A result type signature never brings new type variables into scope.</para>
6009 There are a couple of syntactic wrinkles. First, notice that all three
6010 examples would parse quite differently with parentheses:
6012 {- f assumes that 'a' is already in scope -}
6013 f x (y :: [a]) = [x,y,x]
6015 g = \ (x :: [Int]) -> [3,4]
6017 h :: forall a. [a] -> a
6021 Now the signature is on the <emphasis>pattern</emphasis>; and
6022 <literal>h</literal> would certainly be ill-typed (since the pattern
6023 <literal>(y:ys)</literal> cannot have the type <literal>a</literal>.
6025 Second, to avoid ambiguity, the type after the “<literal>::</literal>” in a result
6026 pattern signature on a lambda or <literal>case</literal> must be atomic (i.e. a single
6027 token or a parenthesised type of some sort). To see why,
6028 consider how one would parse this:
6037 <sect3 id="cls-inst-scoped-tyvars">
6038 <title>Class and instance declarations</title>
6041 The type variables in the head of a <literal>class</literal> or <literal>instance</literal> declaration
6042 scope over the methods defined in the <literal>where</literal> part. For example:
6060 <sect2 id="typing-binds">
6061 <title>Generalised typing of mutually recursive bindings</title>
6064 The Haskell Report specifies that a group of bindings (at top level, or in a
6065 <literal>let</literal> or <literal>where</literal>) should be sorted into
6066 strongly-connected components, and then type-checked in dependency order
6067 (<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.5.1">Haskell
6068 Report, Section 4.5.1</ulink>).
6069 As each group is type-checked, any binders of the group that
6071 an explicit type signature are put in the type environment with the specified
6073 and all others are monomorphic until the group is generalised
6074 (<ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.5.2">Haskell Report, Section 4.5.2</ulink>).
6077 <para>Following a suggestion of Mark Jones, in his paper
6078 <ulink url="http://citeseer.ist.psu.edu/424440.html">Typing Haskell in
6080 GHC implements a more general scheme. If <option>-XRelaxedPolyRec</option> is
6082 <emphasis>the dependency analysis ignores references to variables that have an explicit
6083 type signature</emphasis>.
6084 As a result of this refined dependency analysis, the dependency groups are smaller, and more bindings will
6085 typecheck. For example, consider:
6087 f :: Eq a => a -> Bool
6088 f x = (x == x) || g True || g "Yes"
6090 g y = (y <= y) || f True
6092 This is rejected by Haskell 98, but under Jones's scheme the definition for
6093 <literal>g</literal> is typechecked first, separately from that for
6094 <literal>f</literal>,
6095 because the reference to <literal>f</literal> in <literal>g</literal>'s right
6096 hand side is ignored by the dependency analysis. Then <literal>g</literal>'s
6097 type is generalised, to get
6099 g :: Ord a => a -> Bool
6101 Now, the definition for <literal>f</literal> is typechecked, with this type for
6102 <literal>g</literal> in the type environment.
6106 The same refined dependency analysis also allows the type signatures of
6107 mutually-recursive functions to have different contexts, something that is illegal in
6108 Haskell 98 (Section 4.5.2, last sentence). With
6109 <option>-XRelaxedPolyRec</option>
6110 GHC only insists that the type signatures of a <emphasis>refined</emphasis> group have identical
6111 type signatures; in practice this means that only variables bound by the same
6112 pattern binding must have the same context. For example, this is fine:
6114 f :: Eq a => a -> Bool
6115 f x = (x == x) || g True
6117 g :: Ord a => a -> Bool
6118 g y = (y <= y) || f True
6123 <sect2 id="mono-local-binds">
6124 <title>Monomorphic local bindings</title>
6126 We are actively thinking of simplifying GHC's type system, by <emphasis>not generalising local bindings</emphasis>.
6127 The rationale is described in the paper
6128 <ulink url="http://research.microsoft.com/~simonpj/papers/constraints/index.htm">Let should not be generalised</ulink>.
6131 The experimental new behaviour is enabled by the flag <option>-XMonoLocalBinds</option>. The effect is
6132 that local (that is, non-top-level) bindings without a type signature are not generalised at all. You can
6133 think of it as an extreme (but much more predictable) version of the Monomorphism Restriction.
6134 If you supply a type signature, then the flag has no effect.
6139 <!-- ==================== End of type system extensions ================= -->
6141 <!-- ====================== TEMPLATE HASKELL ======================= -->
6143 <sect1 id="template-haskell">
6144 <title>Template Haskell</title>
6146 <para>Template Haskell allows you to do compile-time meta-programming in
6149 the main technical innovations is discussed in "<ulink
6150 url="http://research.microsoft.com/~simonpj/papers/meta-haskell/">
6151 Template Meta-programming for Haskell</ulink>" (Proc Haskell Workshop 2002).
6154 There is a Wiki page about
6155 Template Haskell at <ulink url="http://www.haskell.org/haskellwiki/Template_Haskell">
6156 http://www.haskell.org/haskellwiki/Template_Haskell</ulink>, and that is the best place to look for
6160 url="http://www.haskell.org/ghc/docs/latest/html/libraries/index.html">online
6161 Haskell library reference material</ulink>
6162 (look for module <literal>Language.Haskell.TH</literal>).
6163 Many changes to the original design are described in
6164 <ulink url="http://research.microsoft.com/~simonpj/papers/meta-haskell/notes2.ps">
6165 Notes on Template Haskell version 2</ulink>.
6166 Not all of these changes are in GHC, however.
6169 <para> The first example from that paper is set out below (<xref linkend="th-example"/>)
6170 as a worked example to help get you started.
6174 The documentation here describes the realisation of Template Haskell in GHC. It is not detailed enough to
6175 understand Template Haskell; see the <ulink url="http://haskell.org/haskellwiki/Template_Haskell">
6180 <title>Syntax</title>
6182 <para> Template Haskell has the following new syntactic
6183 constructions. You need to use the flag
6184 <option>-XTemplateHaskell</option>
6185 <indexterm><primary><option>-XTemplateHaskell</option></primary>
6186 </indexterm>to switch these syntactic extensions on
6187 (<option>-XTemplateHaskell</option> is no longer implied by
6188 <option>-fglasgow-exts</option>).</para>
6192 A splice is written <literal>$x</literal>, where <literal>x</literal> is an
6193 identifier, or <literal>$(...)</literal>, where the "..." is an arbitrary expression.
6194 There must be no space between the "$" and the identifier or parenthesis. This use
6195 of "$" overrides its meaning as an infix operator, just as "M.x" overrides the meaning
6196 of "." as an infix operator. If you want the infix operator, put spaces around it.
6198 <para> A splice can occur in place of
6200 <listitem><para> an expression; the spliced expression must
6201 have type <literal>Q Exp</literal></para></listitem>
6202 <listitem><para> an type; the spliced expression must
6203 have type <literal>Q Typ</literal></para></listitem>
6204 <listitem><para> a list of top-level declarations; the spliced expression
6205 must have type <literal>Q [Dec]</literal></para></listitem>
6207 Inside a splice you can can only call functions defined in imported modules,
6208 not functions defined elsewhere in the same module.</para></listitem>
6211 A expression quotation is written in Oxford brackets, thus:
6213 <listitem><para> <literal>[| ... |]</literal>, where the "..." is an expression;
6214 the quotation has type <literal>Q Exp</literal>.</para></listitem>
6215 <listitem><para> <literal>[d| ... |]</literal>, where the "..." is a list of top-level declarations;
6216 the quotation has type <literal>Q [Dec]</literal>.</para></listitem>
6217 <listitem><para> <literal>[t| ... |]</literal>, where the "..." is a type;
6218 the quotation has type <literal>Q Typ</literal>.</para></listitem>
6219 </itemizedlist></para></listitem>
6222 A quasi-quotation can appear in either a pattern context or an
6223 expression context and is also written in Oxford brackets:
6225 <listitem><para> <literal>[$<replaceable>varid</replaceable>| ... |]</literal>,
6226 where the "..." is an arbitrary string; a full description of the
6227 quasi-quotation facility is given in <xref linkend="th-quasiquotation"/>.</para></listitem>
6228 </itemizedlist></para></listitem>
6231 A name can be quoted with either one or two prefix single quotes:
6233 <listitem><para> <literal>'f</literal> has type <literal>Name</literal>, and names the function <literal>f</literal>.
6234 Similarly <literal>'C</literal> has type <literal>Name</literal> and names the data constructor <literal>C</literal>.
6235 In general <literal>'</literal><replaceable>thing</replaceable> interprets <replaceable>thing</replaceable> in an expression context.
6237 <listitem><para> <literal>''T</literal> has type <literal>Name</literal>, and names the type constructor <literal>T</literal>.
6238 That is, <literal>''</literal><replaceable>thing</replaceable> interprets <replaceable>thing</replaceable> in a type context.
6241 These <literal>Names</literal> can be used to construct Template Haskell expressions, patterns, declarations etc. They
6242 may also be given as an argument to the <literal>reify</literal> function.
6246 <listitem><para> You may omit the <literal>$(...)</literal> in a top-level declaration splice.
6247 Simply writing an expression (rather than a declaration) implies a splice. For example, you can write
6254 $(deriveStuff 'f) -- Uses the $(...) notation
6258 deriveStuff 'g -- Omits the $(...)
6262 This abbreviation makes top-level declaration slices quieter and less intimidating.
6267 (Compared to the original paper, there are many differences of detail.
6268 The syntax for a declaration splice uses "<literal>$</literal>" not "<literal>splice</literal>".
6269 The type of the enclosed expression must be <literal>Q [Dec]</literal>, not <literal>[Q Dec]</literal>.
6270 Pattern splices and quotations are not implemented.)
6274 <sect2> <title> Using Template Haskell </title>
6278 The data types and monadic constructor functions for Template Haskell are in the library
6279 <literal>Language.Haskell.THSyntax</literal>.
6283 You can only run a function at compile time if it is imported from another module. That is,
6284 you can't define a function in a module, and call it from within a splice in the same module.
6285 (It would make sense to do so, but it's hard to implement.)
6289 You can only run a function at compile time if it is imported
6290 from another module <emphasis>that is not part of a mutually-recursive group of modules
6291 that includes the module currently being compiled</emphasis>. Furthermore, all of the modules of
6292 the mutually-recursive group must be reachable by non-SOURCE imports from the module where the
6293 splice is to be run.</para>
6295 For example, when compiling module A,
6296 you can only run Template Haskell functions imported from B if B does not import A (directly or indirectly).
6297 The reason should be clear: to run B we must compile and run A, but we are currently type-checking A.
6301 The flag <literal>-ddump-splices</literal> shows the expansion of all top-level splices as they happen.
6304 If you are building GHC from source, you need at least a stage-2 bootstrap compiler to
6305 run Template Haskell. A stage-1 compiler will reject the TH constructs. Reason: TH
6306 compiles and runs a program, and then looks at the result. So it's important that
6307 the program it compiles produces results whose representations are identical to
6308 those of the compiler itself.
6312 <para> Template Haskell works in any mode (<literal>--make</literal>, <literal>--interactive</literal>,
6313 or file-at-a-time). There used to be a restriction to the former two, but that restriction
6318 <sect2 id="th-example"> <title> A Template Haskell Worked Example </title>
6319 <para>To help you get over the confidence barrier, try out this skeletal worked example.
6320 First cut and paste the two modules below into "Main.hs" and "Printf.hs":</para>
6327 -- Import our template "pr"
6328 import Printf ( pr )
6330 -- The splice operator $ takes the Haskell source code
6331 -- generated at compile time by "pr" and splices it into
6332 -- the argument of "putStrLn".
6333 main = putStrLn ( $(pr "Hello") )
6339 -- Skeletal printf from the paper.
6340 -- It needs to be in a separate module to the one where
6341 -- you intend to use it.
6343 -- Import some Template Haskell syntax
6344 import Language.Haskell.TH
6346 -- Describe a format string
6347 data Format = D | S | L String
6349 -- Parse a format string. This is left largely to you
6350 -- as we are here interested in building our first ever
6351 -- Template Haskell program and not in building printf.
6352 parse :: String -> [Format]
6355 -- Generate Haskell source code from a parsed representation
6356 -- of the format string. This code will be spliced into
6357 -- the module which calls "pr", at compile time.
6358 gen :: [Format] -> Q Exp
6359 gen [D] = [| \n -> show n |]
6360 gen [S] = [| \s -> s |]
6361 gen [L s] = stringE s
6363 -- Here we generate the Haskell code for the splice
6364 -- from an input format string.
6365 pr :: String -> Q Exp
6366 pr s = gen (parse s)
6369 <para>Now run the compiler (here we are a Cygwin prompt on Windows):
6372 $ ghc --make -XTemplateHaskell main.hs -o main.exe
6375 <para>Run "main.exe" and here is your output:</para>
6385 <title>Using Template Haskell with Profiling</title>
6386 <indexterm><primary>profiling</primary><secondary>with Template Haskell</secondary></indexterm>
6388 <para>Template Haskell relies on GHC's built-in bytecode compiler and
6389 interpreter to run the splice expressions. The bytecode interpreter
6390 runs the compiled expression on top of the same runtime on which GHC
6391 itself is running; this means that the compiled code referred to by
6392 the interpreted expression must be compatible with this runtime, and
6393 in particular this means that object code that is compiled for
6394 profiling <emphasis>cannot</emphasis> be loaded and used by a splice
6395 expression, because profiled object code is only compatible with the
6396 profiling version of the runtime.</para>
6398 <para>This causes difficulties if you have a multi-module program
6399 containing Template Haskell code and you need to compile it for
6400 profiling, because GHC cannot load the profiled object code and use it
6401 when executing the splices. Fortunately GHC provides a workaround.
6402 The basic idea is to compile the program twice:</para>
6406 <para>Compile the program or library first the normal way, without
6407 <option>-prof</option><indexterm><primary><option>-prof</option></primary></indexterm>.</para>
6410 <para>Then compile it again with <option>-prof</option>, and
6411 additionally use <option>-osuf
6412 p_o</option><indexterm><primary><option>-osuf</option></primary></indexterm>
6413 to name the object files differently (you can choose any suffix
6414 that isn't the normal object suffix here). GHC will automatically
6415 load the object files built in the first step when executing splice
6416 expressions. If you omit the <option>-osuf</option> flag when
6417 building with <option>-prof</option> and Template Haskell is used,
6418 GHC will emit an error message. </para>
6423 <sect2 id="th-quasiquotation"> <title> Template Haskell Quasi-quotation </title>
6424 <para>Quasi-quotation allows patterns and expressions to be written using
6425 programmer-defined concrete syntax; the motivation behind the extension and
6426 several examples are documented in
6427 "<ulink url="http://www.eecs.harvard.edu/~mainland/ghc-quasiquoting/">Why It's
6428 Nice to be Quoted: Quasiquoting for Haskell</ulink>" (Proc Haskell Workshop
6429 2007). The example below shows how to write a quasiquoter for a simple
6430 expression language.</para>
6433 In the example, the quasiquoter <literal>expr</literal> is bound to a value of
6434 type <literal>Language.Haskell.TH.Quote.QuasiQuoter</literal> which contains two
6435 functions for quoting expressions and patterns, respectively. The first argument
6436 to each quoter is the (arbitrary) string enclosed in the Oxford brackets. The
6437 context of the quasi-quotation statement determines which of the two parsers is
6438 called: if the quasi-quotation occurs in an expression context, the expression
6439 parser is called, and if it occurs in a pattern context, the pattern parser is
6443 Note that in the example we make use of an antiquoted
6444 variable <literal>n</literal>, indicated by the syntax <literal>'int:n</literal>
6445 (this syntax for anti-quotation was defined by the parser's
6446 author, <emphasis>not</emphasis> by GHC). This binds <literal>n</literal> to the
6447 integer value argument of the constructor <literal>IntExpr</literal> when
6448 pattern matching. Please see the referenced paper for further details regarding
6449 anti-quotation as well as the description of a technique that uses SYB to
6450 leverage a single parser of type <literal>String -> a</literal> to generate both
6451 an expression parser that returns a value of type <literal>Q Exp</literal> and a
6452 pattern parser that returns a value of type <literal>Q Pat</literal>.
6455 <para>In general, a quasi-quote has the form
6456 <literal>[$<replaceable>quoter</replaceable>| <replaceable>string</replaceable> |]</literal>.
6457 The <replaceable>quoter</replaceable> must be the name of an imported quoter; it
6458 cannot be an arbitrary expression. The quoted <replaceable>string</replaceable>
6459 can be arbitrary, and may contain newlines.
6462 Quasiquoters must obey the same stage restrictions as Template Haskell, e.g., in
6463 the example, <literal>expr</literal> cannot be defined
6464 in <literal>Main.hs</literal> where it is used, but must be imported.
6475 main = do { print $ eval [$expr|1 + 2|]
6477 { [$expr|'int:n|] -> print n
6486 import qualified Language.Haskell.TH as TH
6487 import Language.Haskell.TH.Quote
6489 data Expr = IntExpr Integer
6490 | AntiIntExpr String
6491 | BinopExpr BinOp Expr Expr
6493 deriving(Show, Typeable, Data)
6499 deriving(Show, Typeable, Data)
6501 eval :: Expr -> Integer
6502 eval (IntExpr n) = n
6503 eval (BinopExpr op x y) = (opToFun op) (eval x) (eval y)
6510 expr = QuasiQuoter parseExprExp parseExprPat
6512 -- Parse an Expr, returning its representation as
6513 -- either a Q Exp or a Q Pat. See the referenced paper
6514 -- for how to use SYB to do this by writing a single
6515 -- parser of type String -> Expr instead of two
6516 -- separate parsers.
6518 parseExprExp :: String -> Q Exp
6521 parseExprPat :: String -> Q Pat
6525 <para>Now run the compiler:
6528 $ ghc --make -XQuasiQuotes Main.hs -o main
6531 <para>Run "main" and here is your output:</para>
6543 <!-- ===================== Arrow notation =================== -->
6545 <sect1 id="arrow-notation">
6546 <title>Arrow notation
6549 <para>Arrows are a generalization of monads introduced by John Hughes.
6550 For more details, see
6555 “Generalising Monads to Arrows”,
6556 John Hughes, in <citetitle>Science of Computer Programming</citetitle> 37,
6557 pp67–111, May 2000.
6558 The paper that introduced arrows: a friendly introduction, motivated with
6559 programming examples.
6565 “<ulink url="http://www.soi.city.ac.uk/~ross/papers/notation.html">A New Notation for Arrows</ulink>”,
6566 Ross Paterson, in <citetitle>ICFP</citetitle>, Sep 2001.
6567 Introduced the notation described here.
6573 “<ulink url="http://www.soi.city.ac.uk/~ross/papers/fop.html">Arrows and Computation</ulink>”,
6574 Ross Paterson, in <citetitle>The Fun of Programming</citetitle>,
6581 “<ulink url="http://www.cs.chalmers.se/~rjmh/afp-arrows.pdf">Programming with Arrows</ulink>”,
6582 John Hughes, in <citetitle>5th International Summer School on
6583 Advanced Functional Programming</citetitle>,
6584 <citetitle>Lecture Notes in Computer Science</citetitle> vol. 3622,
6586 This paper includes another introduction to the notation,
6587 with practical examples.
6593 “<ulink url="http://www.haskell.org/ghc/docs/papers/arrow-rules.pdf">Type and Translation Rules for Arrow Notation in GHC</ulink>”,
6594 Ross Paterson and Simon Peyton Jones, September 16, 2004.
6595 A terse enumeration of the formal rules used
6596 (extracted from comments in the source code).
6602 The arrows web page at
6603 <ulink url="http://www.haskell.org/arrows/"><literal>http://www.haskell.org/arrows/</literal></ulink>.
6608 With the <option>-XArrows</option> flag, GHC supports the arrow
6609 notation described in the second of these papers,
6610 translating it using combinators from the
6611 <ulink url="../libraries/base/Control-Arrow.html"><literal>Control.Arrow</literal></ulink>
6613 What follows is a brief introduction to the notation;
6614 it won't make much sense unless you've read Hughes's paper.
6617 <para>The extension adds a new kind of expression for defining arrows:
6619 <replaceable>exp</replaceable><superscript>10</superscript> ::= ...
6620 | proc <replaceable>apat</replaceable> -> <replaceable>cmd</replaceable>
6622 where <literal>proc</literal> is a new keyword.
6623 The variables of the pattern are bound in the body of the
6624 <literal>proc</literal>-expression,
6625 which is a new sort of thing called a <firstterm>command</firstterm>.
6626 The syntax of commands is as follows:
6628 <replaceable>cmd</replaceable> ::= <replaceable>exp</replaceable><superscript>10</superscript> -< <replaceable>exp</replaceable>
6629 | <replaceable>exp</replaceable><superscript>10</superscript> -<< <replaceable>exp</replaceable>
6630 | <replaceable>cmd</replaceable><superscript>0</superscript>
6632 with <replaceable>cmd</replaceable><superscript>0</superscript> up to
6633 <replaceable>cmd</replaceable><superscript>9</superscript> defined using
6634 infix operators as for expressions, and
6636 <replaceable>cmd</replaceable><superscript>10</superscript> ::= \ <replaceable>apat</replaceable> ... <replaceable>apat</replaceable> -> <replaceable>cmd</replaceable>
6637 | let <replaceable>decls</replaceable> in <replaceable>cmd</replaceable>
6638 | if <replaceable>exp</replaceable> then <replaceable>cmd</replaceable> else <replaceable>cmd</replaceable>
6639 | case <replaceable>exp</replaceable> of { <replaceable>calts</replaceable> }
6640 | do { <replaceable>cstmt</replaceable> ; ... <replaceable>cstmt</replaceable> ; <replaceable>cmd</replaceable> }
6641 | <replaceable>fcmd</replaceable>
6643 <replaceable>fcmd</replaceable> ::= <replaceable>fcmd</replaceable> <replaceable>aexp</replaceable>
6644 | ( <replaceable>cmd</replaceable> )
6645 | (| <replaceable>aexp</replaceable> <replaceable>cmd</replaceable> ... <replaceable>cmd</replaceable> |)
6647 <replaceable>cstmt</replaceable> ::= let <replaceable>decls</replaceable>
6648 | <replaceable>pat</replaceable> <- <replaceable>cmd</replaceable>
6649 | rec { <replaceable>cstmt</replaceable> ; ... <replaceable>cstmt</replaceable> [;] }
6650 | <replaceable>cmd</replaceable>
6652 where <replaceable>calts</replaceable> are like <replaceable>alts</replaceable>
6653 except that the bodies are commands instead of expressions.
6657 Commands produce values, but (like monadic computations)
6658 may yield more than one value,
6659 or none, and may do other things as well.
6660 For the most part, familiarity with monadic notation is a good guide to
6662 However the values of expressions, even monadic ones,
6663 are determined by the values of the variables they contain;
6664 this is not necessarily the case for commands.
6668 A simple example of the new notation is the expression
6670 proc x -> f -< x+1
6672 We call this a <firstterm>procedure</firstterm> or
6673 <firstterm>arrow abstraction</firstterm>.
6674 As with a lambda expression, the variable <literal>x</literal>
6675 is a new variable bound within the <literal>proc</literal>-expression.
6676 It refers to the input to the arrow.
6677 In the above example, <literal>-<</literal> is not an identifier but an
6678 new reserved symbol used for building commands from an expression of arrow
6679 type and an expression to be fed as input to that arrow.
6680 (The weird look will make more sense later.)
6681 It may be read as analogue of application for arrows.
6682 The above example is equivalent to the Haskell expression
6684 arr (\ x -> x+1) >>> f
6686 That would make no sense if the expression to the left of
6687 <literal>-<</literal> involves the bound variable <literal>x</literal>.
6688 More generally, the expression to the left of <literal>-<</literal>
6689 may not involve any <firstterm>local variable</firstterm>,
6690 i.e. a variable bound in the current arrow abstraction.
6691 For such a situation there is a variant <literal>-<<</literal>, as in
6693 proc x -> f x -<< x+1
6695 which is equivalent to
6697 arr (\ x -> (f x, x+1)) >>> app
6699 so in this case the arrow must belong to the <literal>ArrowApply</literal>
6701 Such an arrow is equivalent to a monad, so if you're using this form
6702 you may find a monadic formulation more convenient.
6706 <title>do-notation for commands</title>
6709 Another form of command is a form of <literal>do</literal>-notation.
6710 For example, you can write
6719 You can read this much like ordinary <literal>do</literal>-notation,
6720 but with commands in place of monadic expressions.
6721 The first line sends the value of <literal>x+1</literal> as an input to
6722 the arrow <literal>f</literal>, and matches its output against
6723 <literal>y</literal>.
6724 In the next line, the output is discarded.
6725 The arrow <function>returnA</function> is defined in the
6726 <ulink url="../libraries/base/Control-Arrow.html"><literal>Control.Arrow</literal></ulink>
6727 module as <literal>arr id</literal>.
6728 The above example is treated as an abbreviation for
6730 arr (\ x -> (x, x)) >>>
6731 first (arr (\ x -> x+1) >>> f) >>>
6732 arr (\ (y, x) -> (y, (x, y))) >>>
6733 first (arr (\ y -> 2*y) >>> g) >>>
6735 arr (\ (x, y) -> let z = x+y in ((x, z), z)) >>>
6736 first (arr (\ (x, z) -> x*z) >>> h) >>>
6737 arr (\ (t, z) -> t+z) >>>
6740 Note that variables not used later in the composition are projected out.
6741 After simplification using rewrite rules (see <xref linkend="rewrite-rules"/>)
6743 <ulink url="../libraries/base/Control-Arrow.html"><literal>Control.Arrow</literal></ulink>
6744 module, this reduces to
6746 arr (\ x -> (x+1, x)) >>>
6748 arr (\ (y, x) -> (2*y, (x, y))) >>>
6750 arr (\ (_, (x, y)) -> let z = x+y in (x*z, z)) >>>
6752 arr (\ (t, z) -> t+z)
6754 which is what you might have written by hand.
6755 With arrow notation, GHC keeps track of all those tuples of variables for you.
6759 Note that although the above translation suggests that
6760 <literal>let</literal>-bound variables like <literal>z</literal> must be
6761 monomorphic, the actual translation produces Core,
6762 so polymorphic variables are allowed.
6766 It's also possible to have mutually recursive bindings,
6767 using the new <literal>rec</literal> keyword, as in the following example:
6769 counter :: ArrowCircuit a => a Bool Int
6770 counter = proc reset -> do
6771 rec output <- returnA -< if reset then 0 else next
6772 next <- delay 0 -< output+1
6773 returnA -< output
6775 The translation of such forms uses the <function>loop</function> combinator,
6776 so the arrow concerned must belong to the <literal>ArrowLoop</literal> class.
6782 <title>Conditional commands</title>
6785 In the previous example, we used a conditional expression to construct the
6787 Sometimes we want to conditionally execute different commands, as in
6794 which is translated to
6796 arr (\ (x,y) -> if f x y then Left x else Right y) >>>
6797 (arr (\x -> x+1) >>> f) ||| (arr (\y -> y+2) >>> g)
6799 Since the translation uses <function>|||</function>,
6800 the arrow concerned must belong to the <literal>ArrowChoice</literal> class.
6804 There are also <literal>case</literal> commands, like
6810 y <- h -< (x1, x2)
6814 The syntax is the same as for <literal>case</literal> expressions,
6815 except that the bodies of the alternatives are commands rather than expressions.
6816 The translation is similar to that of <literal>if</literal> commands.
6822 <title>Defining your own control structures</title>
6825 As we're seen, arrow notation provides constructs,
6826 modelled on those for expressions,
6827 for sequencing, value recursion and conditionals.
6828 But suitable combinators,
6829 which you can define in ordinary Haskell,
6830 may also be used to build new commands out of existing ones.
6831 The basic idea is that a command defines an arrow from environments to values.
6832 These environments assign values to the free local variables of the command.
6833 Thus combinators that produce arrows from arrows
6834 may also be used to build commands from commands.
6835 For example, the <literal>ArrowChoice</literal> class includes a combinator
6837 ArrowChoice a => (<+>) :: a e c -> a e c -> a e c
6839 so we can use it to build commands:
6841 expr' = proc x -> do
6844 symbol Plus -< ()
6845 y <- term -< ()
6848 symbol Minus -< ()
6849 y <- term -< ()
6852 (The <literal>do</literal> on the first line is needed to prevent the first
6853 <literal><+> ...</literal> from being interpreted as part of the
6854 expression on the previous line.)
6855 This is equivalent to
6857 expr' = (proc x -> returnA -< x)
6858 <+> (proc x -> do
6859 symbol Plus -< ()
6860 y <- term -< ()
6862 <+> (proc x -> do
6863 symbol Minus -< ()
6864 y <- term -< ()
6867 It is essential that this operator be polymorphic in <literal>e</literal>
6868 (representing the environment input to the command
6869 and thence to its subcommands)
6870 and satisfy the corresponding naturality property
6872 arr k >>> (f <+> g) = (arr k >>> f) <+> (arr k >>> g)
6874 at least for strict <literal>k</literal>.
6875 (This should be automatic if you're not using <function>seq</function>.)
6876 This ensures that environments seen by the subcommands are environments
6877 of the whole command,
6878 and also allows the translation to safely trim these environments.
6879 The operator must also not use any variable defined within the current
6884 We could define our own operator
6886 untilA :: ArrowChoice a => a e () -> a e Bool -> a e ()
6887 untilA body cond = proc x ->
6888 b <- cond -< x
6889 if b then returnA -< ()
6892 untilA body cond -< x
6894 and use it in the same way.
6895 Of course this infix syntax only makes sense for binary operators;
6896 there is also a more general syntax involving special brackets:
6900 (|untilA (increment -< x+y) (within 0.5 -< x)|)
6907 <title>Primitive constructs</title>
6910 Some operators will need to pass additional inputs to their subcommands.
6911 For example, in an arrow type supporting exceptions,
6912 the operator that attaches an exception handler will wish to pass the
6913 exception that occurred to the handler.
6914 Such an operator might have a type
6916 handleA :: ... => a e c -> a (e,Ex) c -> a e c
6918 where <literal>Ex</literal> is the type of exceptions handled.
6919 You could then use this with arrow notation by writing a command
6921 body `handleA` \ ex -> handler
6923 so that if an exception is raised in the command <literal>body</literal>,
6924 the variable <literal>ex</literal> is bound to the value of the exception
6925 and the command <literal>handler</literal>,
6926 which typically refers to <literal>ex</literal>, is entered.
6927 Though the syntax here looks like a functional lambda,
6928 we are talking about commands, and something different is going on.
6929 The input to the arrow represented by a command consists of values for
6930 the free local variables in the command, plus a stack of anonymous values.
6931 In all the prior examples, this stack was empty.
6932 In the second argument to <function>handleA</function>,
6933 this stack consists of one value, the value of the exception.
6934 The command form of lambda merely gives this value a name.
6939 the values on the stack are paired to the right of the environment.
6940 So operators like <function>handleA</function> that pass
6941 extra inputs to their subcommands can be designed for use with the notation
6942 by pairing the values with the environment in this way.
6943 More precisely, the type of each argument of the operator (and its result)
6944 should have the form
6946 a (...(e,t1), ... tn) t
6948 where <replaceable>e</replaceable> is a polymorphic variable
6949 (representing the environment)
6950 and <replaceable>ti</replaceable> are the types of the values on the stack,
6951 with <replaceable>t1</replaceable> being the <quote>top</quote>.
6952 The polymorphic variable <replaceable>e</replaceable> must not occur in
6953 <replaceable>a</replaceable>, <replaceable>ti</replaceable> or
6954 <replaceable>t</replaceable>.
6955 However the arrows involved need not be the same.
6956 Here are some more examples of suitable operators:
6958 bracketA :: ... => a e b -> a (e,b) c -> a (e,c) d -> a e d
6959 runReader :: ... => a e c -> a' (e,State) c
6960 runState :: ... => a e c -> a' (e,State) (c,State)
6962 We can supply the extra input required by commands built with the last two
6963 by applying them to ordinary expressions, as in
6967 (|runReader (do { ... })|) s
6969 which adds <literal>s</literal> to the stack of inputs to the command
6970 built using <function>runReader</function>.
6974 The command versions of lambda abstraction and application are analogous to
6975 the expression versions.
6976 In particular, the beta and eta rules describe equivalences of commands.
6977 These three features (operators, lambda abstraction and application)
6978 are the core of the notation; everything else can be built using them,
6979 though the results would be somewhat clumsy.
6980 For example, we could simulate <literal>do</literal>-notation by defining
6982 bind :: Arrow a => a e b -> a (e,b) c -> a e c
6983 u `bind` f = returnA &&& u >>> f
6985 bind_ :: Arrow a => a e b -> a e c -> a e c
6986 u `bind_` f = u `bind` (arr fst >>> f)
6988 We could simulate <literal>if</literal> by defining
6990 cond :: ArrowChoice a => a e b -> a e b -> a (e,Bool) b
6991 cond f g = arr (\ (e,b) -> if b then Left e else Right e) >>> f ||| g
6998 <title>Differences with the paper</title>
7003 <para>Instead of a single form of arrow application (arrow tail) with two
7004 translations, the implementation provides two forms
7005 <quote><literal>-<</literal></quote> (first-order)
7006 and <quote><literal>-<<</literal></quote> (higher-order).
7011 <para>User-defined operators are flagged with banana brackets instead of
7012 a new <literal>form</literal> keyword.
7021 <title>Portability</title>
7024 Although only GHC implements arrow notation directly,
7025 there is also a preprocessor
7027 <ulink url="http://www.haskell.org/arrows/">arrows web page</ulink>)
7028 that translates arrow notation into Haskell 98
7029 for use with other Haskell systems.
7030 You would still want to check arrow programs with GHC;
7031 tracing type errors in the preprocessor output is not easy.
7032 Modules intended for both GHC and the preprocessor must observe some
7033 additional restrictions:
7038 The module must import
7039 <ulink url="../libraries/base/Control-Arrow.html"><literal>Control.Arrow</literal></ulink>.
7045 The preprocessor cannot cope with other Haskell extensions.
7046 These would have to go in separate modules.
7052 Because the preprocessor targets Haskell (rather than Core),
7053 <literal>let</literal>-bound variables are monomorphic.
7064 <!-- ==================== BANG PATTERNS ================= -->
7066 <sect1 id="bang-patterns">
7067 <title>Bang patterns
7068 <indexterm><primary>Bang patterns</primary></indexterm>
7070 <para>GHC supports an extension of pattern matching called <emphasis>bang
7071 patterns</emphasis>, written <literal>!<replaceable>pat</replaceable></literal>.
7072 Bang patterns are under consideration for Haskell Prime.
7074 url="http://hackage.haskell.org/trac/haskell-prime/wiki/BangPatterns">Haskell
7075 prime feature description</ulink> contains more discussion and examples
7076 than the material below.
7079 The key change is the addition of a new rule to the
7080 <ulink url="http://haskell.org/onlinereport/exps.html#sect3.17.2">semantics of pattern matching in the Haskell 98 report</ulink>.
7081 Add new bullet 10, saying: Matching the pattern <literal>!</literal><replaceable>pat</replaceable>
7082 against a value <replaceable>v</replaceable> behaves as follows:
7084 <listitem><para>if <replaceable>v</replaceable> is bottom, the match diverges</para></listitem>
7085 <listitem><para>otherwise, <replaceable>pat</replaceable> is matched against <replaceable>v</replaceable> </para></listitem>
7089 Bang patterns are enabled by the flag <option>-XBangPatterns</option>.
7092 <sect2 id="bang-patterns-informal">
7093 <title>Informal description of bang patterns
7096 The main idea is to add a single new production to the syntax of patterns:
7100 Matching an expression <literal>e</literal> against a pattern <literal>!p</literal> is done by first
7101 evaluating <literal>e</literal> (to WHNF) and then matching the result against <literal>p</literal>.
7106 This definition makes <literal>f1</literal> is strict in <literal>x</literal>,
7107 whereas without the bang it would be lazy.
7108 Bang patterns can be nested of course:
7112 Here, <literal>f2</literal> is strict in <literal>x</literal> but not in
7113 <literal>y</literal>.
7114 A bang only really has an effect if it precedes a variable or wild-card pattern:
7119 Here, <literal>f3</literal> and <literal>f4</literal> are identical;
7120 putting a bang before a pattern that
7121 forces evaluation anyway does nothing.
7124 There is one (apparent) exception to this general rule that a bang only
7125 makes a difference when it precedes a variable or wild-card: a bang at the
7126 top level of a <literal>let</literal> or <literal>where</literal>
7127 binding makes the binding strict, regardless of the pattern. For example:
7131 is a strict binding: operationally, it evaluates <literal>e</literal>, matches
7132 it against the pattern <literal>[x,y]</literal>, and then evaluates <literal>b</literal>.
7133 (We say "apparent" exception because the Right Way to think of it is that the bang
7134 at the top of a binding is not part of the <emphasis>pattern</emphasis>; rather it
7135 is part of the syntax of the <emphasis>binding</emphasis>.)
7136 Nested bangs in a pattern binding behave uniformly with all other forms of
7137 pattern matching. For example
7139 let (!x,[y]) = e in b
7141 is equivalent to this:
7143 let { t = case e of (x,[y]) -> x `seq` (x,y)
7148 The binding is lazy, but when either <literal>x</literal> or <literal>y</literal> is
7149 evaluated by <literal>b</literal> the entire pattern is matched, including forcing the
7150 evaluation of <literal>x</literal>.
7153 Bang patterns work in <literal>case</literal> expressions too, of course:
7155 g5 x = let y = f x in body
7156 g6 x = case f x of { y -> body }
7157 g7 x = case f x of { !y -> body }
7159 The functions <literal>g5</literal> and <literal>g6</literal> mean exactly the same thing.
7160 But <literal>g7</literal> evaluates <literal>(f x)</literal>, binds <literal>y</literal> to the
7161 result, and then evaluates <literal>body</literal>.
7166 <sect2 id="bang-patterns-sem">
7167 <title>Syntax and semantics
7171 We add a single new production to the syntax of patterns:
7175 There is one problem with syntactic ambiguity. Consider:
7179 Is this a definition of the infix function "<literal>(!)</literal>",
7180 or of the "<literal>f</literal>" with a bang pattern? GHC resolves this
7181 ambiguity in favour of the latter. If you want to define
7182 <literal>(!)</literal> with bang-patterns enabled, you have to do so using
7187 The semantics of Haskell pattern matching is described in <ulink
7188 url="http://www.haskell.org/onlinereport/exps.html#sect3.17.2">
7189 Section 3.17.2</ulink> of the Haskell Report. To this description add
7190 one extra item 10, saying:
7191 <itemizedlist><listitem><para>Matching
7192 the pattern <literal>!pat</literal> against a value <literal>v</literal> behaves as follows:
7193 <itemizedlist><listitem><para>if <literal>v</literal> is bottom, the match diverges</para></listitem>
7194 <listitem><para>otherwise, <literal>pat</literal> is matched against
7195 <literal>v</literal></para></listitem>
7197 </para></listitem></itemizedlist>
7198 Similarly, in Figure 4 of <ulink url="http://www.haskell.org/onlinereport/exps.html#sect3.17.3">
7199 Section 3.17.3</ulink>, add a new case (t):
7201 case v of { !pat -> e; _ -> e' }
7202 = v `seq` case v of { pat -> e; _ -> e' }
7205 That leaves let expressions, whose translation is given in
7206 <ulink url="http://www.haskell.org/onlinereport/exps.html#sect3.12">Section
7208 of the Haskell Report.
7209 In the translation box, first apply
7210 the following transformation: for each pattern <literal>pi</literal> that is of
7211 form <literal>!qi = ei</literal>, transform it to <literal>(xi,!qi) = ((),ei)</literal>, and and replace <literal>e0</literal>
7212 by <literal>(xi `seq` e0)</literal>. Then, when none of the left-hand-side patterns
7213 have a bang at the top, apply the rules in the existing box.
7215 <para>The effect of the let rule is to force complete matching of the pattern
7216 <literal>qi</literal> before evaluation of the body is begun. The bang is
7217 retained in the translated form in case <literal>qi</literal> is a variable,
7225 The let-binding can be recursive. However, it is much more common for
7226 the let-binding to be non-recursive, in which case the following law holds:
7227 <literal>(let !p = rhs in body)</literal>
7229 <literal>(case rhs of !p -> body)</literal>
7232 A pattern with a bang at the outermost level is not allowed at the top level of
7238 <!-- ==================== ASSERTIONS ================= -->
7240 <sect1 id="assertions">
7242 <indexterm><primary>Assertions</primary></indexterm>
7246 If you want to make use of assertions in your standard Haskell code, you
7247 could define a function like the following:
7253 assert :: Bool -> a -> a
7254 assert False x = error "assertion failed!"
7261 which works, but gives you back a less than useful error message --
7262 an assertion failed, but which and where?
7266 One way out is to define an extended <function>assert</function> function which also
7267 takes a descriptive string to include in the error message and
7268 perhaps combine this with the use of a pre-processor which inserts
7269 the source location where <function>assert</function> was used.
7273 Ghc offers a helping hand here, doing all of this for you. For every
7274 use of <function>assert</function> in the user's source:
7280 kelvinToC :: Double -> Double
7281 kelvinToC k = assert (k >= 0.0) (k+273.15)
7287 Ghc will rewrite this to also include the source location where the
7294 assert pred val ==> assertError "Main.hs|15" pred val
7300 The rewrite is only performed by the compiler when it spots
7301 applications of <function>Control.Exception.assert</function>, so you
7302 can still define and use your own versions of
7303 <function>assert</function>, should you so wish. If not, import
7304 <literal>Control.Exception</literal> to make use
7305 <function>assert</function> in your code.
7309 GHC ignores assertions when optimisation is turned on with the
7310 <option>-O</option><indexterm><primary><option>-O</option></primary></indexterm> flag. That is, expressions of the form
7311 <literal>assert pred e</literal> will be rewritten to
7312 <literal>e</literal>. You can also disable assertions using the
7313 <option>-fignore-asserts</option>
7314 option<indexterm><primary><option>-fignore-asserts</option></primary>
7315 </indexterm>.</para>
7318 Assertion failures can be caught, see the documentation for the
7319 <literal>Control.Exception</literal> library for the details.
7325 <!-- =============================== PRAGMAS =========================== -->
7327 <sect1 id="pragmas">
7328 <title>Pragmas</title>
7330 <indexterm><primary>pragma</primary></indexterm>
7332 <para>GHC supports several pragmas, or instructions to the
7333 compiler placed in the source code. Pragmas don't normally affect
7334 the meaning of the program, but they might affect the efficiency
7335 of the generated code.</para>
7337 <para>Pragmas all take the form
7339 <literal>{-# <replaceable>word</replaceable> ... #-}</literal>
7341 where <replaceable>word</replaceable> indicates the type of
7342 pragma, and is followed optionally by information specific to that
7343 type of pragma. Case is ignored in
7344 <replaceable>word</replaceable>. The various values for
7345 <replaceable>word</replaceable> that GHC understands are described
7346 in the following sections; any pragma encountered with an
7347 unrecognised <replaceable>word</replaceable> is
7348 ignored. The layout rule applies in pragmas, so the closing <literal>#-}</literal>
7349 should start in a column to the right of the opening <literal>{-#</literal>. </para>
7351 <para>Certain pragmas are <emphasis>file-header pragmas</emphasis>:
7355 pragma must precede the <literal>module</literal> keyword in the file.
7358 There can be as many file-header pragmas as you please, and they can be
7359 preceded or followed by comments.
7362 File-header pragmas are read once only, before
7363 pre-processing the file (e.g. with cpp).
7366 The file-header pragmas are: <literal>{-# LANGUAGE #-}</literal>,
7367 <literal>{-# OPTIONS_GHC #-}</literal>, and
7368 <literal>{-# INCLUDE #-}</literal>.
7373 <sect2 id="language-pragma">
7374 <title>LANGUAGE pragma</title>
7376 <indexterm><primary>LANGUAGE</primary><secondary>pragma</secondary></indexterm>
7377 <indexterm><primary>pragma</primary><secondary>LANGUAGE</secondary></indexterm>
7379 <para>The <literal>LANGUAGE</literal> pragma allows language extensions to be enabled
7381 It is the intention that all Haskell compilers support the
7382 <literal>LANGUAGE</literal> pragma with the same syntax, although not
7383 all extensions are supported by all compilers, of
7384 course. The <literal>LANGUAGE</literal> pragma should be used instead
7385 of <literal>OPTIONS_GHC</literal>, if possible.</para>
7387 <para>For example, to enable the FFI and preprocessing with CPP:</para>
7389 <programlisting>{-# LANGUAGE ForeignFunctionInterface, CPP #-}</programlisting>
7391 <para><literal>LANGUAGE</literal> is a file-header pragma (see <xref linkend="pragmas"/>).</para>
7393 <para>Every language extension can also be turned into a command-line flag
7394 by prefixing it with "<literal>-X</literal>"; for example <option>-XForeignFunctionInterface</option>.
7395 (Similarly, all "<literal>-X</literal>" flags can be written as <literal>LANGUAGE</literal> pragmas.
7398 <para>A list of all supported language extensions can be obtained by invoking
7399 <literal>ghc --supported-languages</literal> (see <xref linkend="modes"/>).</para>
7401 <para>Any extension from the <literal>Extension</literal> type defined in
7403 url="../libraries/Cabal/Language-Haskell-Extension.html"><literal>Language.Haskell.Extension</literal></ulink>
7404 may be used. GHC will report an error if any of the requested extensions are not supported.</para>
7408 <sect2 id="options-pragma">
7409 <title>OPTIONS_GHC pragma</title>
7410 <indexterm><primary>OPTIONS_GHC</primary>
7412 <indexterm><primary>pragma</primary><secondary>OPTIONS_GHC</secondary>
7415 <para>The <literal>OPTIONS_GHC</literal> pragma is used to specify
7416 additional options that are given to the compiler when compiling
7417 this source file. See <xref linkend="source-file-options"/> for
7420 <para>Previous versions of GHC accepted <literal>OPTIONS</literal> rather
7421 than <literal>OPTIONS_GHC</literal>, but that is now deprecated.</para>
7424 <para><literal>OPTIONS_GHC</literal> is a file-header pragma (see <xref linkend="pragmas"/>).</para>
7426 <sect2 id="include-pragma">
7427 <title>INCLUDE pragma</title>
7429 <para>The <literal>INCLUDE</literal> used to be necessary for
7430 specifying header files to be included when using the FFI and
7431 compiling via C. It is no longer required for GHC, but is
7432 accepted (and ignored) for compatibility with other
7436 <sect2 id="warning-deprecated-pragma">
7437 <title>WARNING and DEPRECATED pragmas</title>
7438 <indexterm><primary>WARNING</primary></indexterm>
7439 <indexterm><primary>DEPRECATED</primary></indexterm>
7441 <para>The WARNING pragma allows you to attach an arbitrary warning
7442 to a particular function, class, or type.
7443 A DEPRECATED pragma lets you specify that
7444 a particular function, class, or type is deprecated.
7445 There are two ways of using these pragmas.
7449 <para>You can work on an entire module thus:</para>
7451 module Wibble {-# DEPRECATED "Use Wobble instead" #-} where
7456 module Wibble {-# WARNING "This is an unstable interface." #-} where
7459 <para>When you compile any module that import
7460 <literal>Wibble</literal>, GHC will print the specified
7465 <para>You can attach a warning to a function, class, type, or data constructor, with the
7466 following top-level declarations:</para>
7468 {-# DEPRECATED f, C, T "Don't use these" #-}
7469 {-# WARNING unsafePerformIO "This is unsafe; I hope you know what you're doing" #-}
7471 <para>When you compile any module that imports and uses any
7472 of the specified entities, GHC will print the specified
7474 <para> You can only attach to entities declared at top level in the module
7475 being compiled, and you can only use unqualified names in the list of
7476 entities. A capitalised name, such as <literal>T</literal>
7477 refers to <emphasis>either</emphasis> the type constructor <literal>T</literal>
7478 <emphasis>or</emphasis> the data constructor <literal>T</literal>, or both if
7479 both are in scope. If both are in scope, there is currently no way to
7480 specify one without the other (c.f. fixities
7481 <xref linkend="infix-tycons"/>).</para>
7484 Warnings and deprecations are not reported for
7485 (a) uses within the defining module, and
7486 (b) uses in an export list.
7487 The latter reduces spurious complaints within a library
7488 in which one module gathers together and re-exports
7489 the exports of several others.
7491 <para>You can suppress the warnings with the flag
7492 <option>-fno-warn-warnings-deprecations</option>.</para>
7495 <sect2 id="inline-noinline-pragma">
7496 <title>INLINE and NOINLINE pragmas</title>
7498 <para>These pragmas control the inlining of function
7501 <sect3 id="inline-pragma">
7502 <title>INLINE pragma</title>
7503 <indexterm><primary>INLINE</primary></indexterm>
7505 <para>GHC (with <option>-O</option>, as always) tries to
7506 inline (or “unfold”) functions/values that are
7507 “small enough,” thus avoiding the call overhead
7508 and possibly exposing other more-wonderful optimisations.
7509 Normally, if GHC decides a function is “too
7510 expensive” to inline, it will not do so, nor will it
7511 export that unfolding for other modules to use.</para>
7513 <para>The sledgehammer you can bring to bear is the
7514 <literal>INLINE</literal><indexterm><primary>INLINE
7515 pragma</primary></indexterm> pragma, used thusly:</para>
7518 key_function :: Int -> String -> (Bool, Double)
7519 {-# INLINE key_function #-}
7522 <para>The major effect of an <literal>INLINE</literal> pragma
7523 is to declare a function's “cost” to be very low.
7524 The normal unfolding machinery will then be very keen to
7525 inline it. However, an <literal>INLINE</literal> pragma for a
7526 function "<literal>f</literal>" has a number of other effects:
7529 No functions are inlined into <literal>f</literal>. Otherwise
7530 GHC might inline a big function into <literal>f</literal>'s right hand side,
7531 making <literal>f</literal> big; and then inline <literal>f</literal> blindly.
7534 The float-in, float-out, and common-sub-expression transformations are not
7535 applied to the body of <literal>f</literal>.
7538 An INLINE function is not worker/wrappered by strictness analysis.
7539 It's going to be inlined wholesale instead.
7542 All of these effects are aimed at ensuring that what gets inlined is
7543 exactly what you asked for, no more and no less.
7545 <para>GHC ensures that inlining cannot go on forever: every mutually-recursive
7546 group is cut by one or more <emphasis>loop breakers</emphasis> that is never inlined
7547 (see <ulink url="http://research.microsoft.com/%7Esimonpj/Papers/inlining/index.htm">
7548 Secrets of the GHC inliner, JFP 12(4) July 2002</ulink>).
7549 GHC tries not to select a function with an INLINE pragma as a loop breaker, but
7550 when there is no choice even an INLINE function can be selected, in which case
7551 the INLINE pragma is ignored.
7552 For example, for a self-recursive function, the loop breaker can only be the function
7553 itself, so an INLINE pragma is always ignored.</para>
7555 <para>Syntactically, an <literal>INLINE</literal> pragma for a
7556 function can be put anywhere its type signature could be
7559 <para><literal>INLINE</literal> pragmas are a particularly
7561 <literal>then</literal>/<literal>return</literal> (or
7562 <literal>bind</literal>/<literal>unit</literal>) functions in
7563 a monad. For example, in GHC's own
7564 <literal>UniqueSupply</literal> monad code, we have:</para>
7567 {-# INLINE thenUs #-}
7568 {-# INLINE returnUs #-}
7571 <para>See also the <literal>NOINLINE</literal> pragma (<xref
7572 linkend="noinline-pragma"/>).</para>
7574 <para>Note: the HBC compiler doesn't like <literal>INLINE</literal> pragmas,
7575 so if you want your code to be HBC-compatible you'll have to surround
7576 the pragma with C pre-processor directives
7577 <literal>#ifdef __GLASGOW_HASKELL__</literal>...<literal>#endif</literal>.</para>
7581 <sect3 id="noinline-pragma">
7582 <title>NOINLINE pragma</title>
7584 <indexterm><primary>NOINLINE</primary></indexterm>
7585 <indexterm><primary>NOTINLINE</primary></indexterm>
7587 <para>The <literal>NOINLINE</literal> pragma does exactly what
7588 you'd expect: it stops the named function from being inlined
7589 by the compiler. You shouldn't ever need to do this, unless
7590 you're very cautious about code size.</para>
7592 <para><literal>NOTINLINE</literal> is a synonym for
7593 <literal>NOINLINE</literal> (<literal>NOINLINE</literal> is
7594 specified by Haskell 98 as the standard way to disable
7595 inlining, so it should be used if you want your code to be
7599 <sect3 id="conlike-pragma">
7600 <title>CONLIKE modifier</title>
7601 <indexterm><primary>CONLIKE</primary></indexterm>
7602 <para>An INLINE or NOINLINE pragma may have a CONLIKE modifier,
7603 which affects matching in RULEs (only). See <xref linkend="conlike"/>.
7607 <sect3 id="phase-control">
7608 <title>Phase control</title>
7610 <para> Sometimes you want to control exactly when in GHC's
7611 pipeline the INLINE pragma is switched on. Inlining happens
7612 only during runs of the <emphasis>simplifier</emphasis>. Each
7613 run of the simplifier has a different <emphasis>phase
7614 number</emphasis>; the phase number decreases towards zero.
7615 If you use <option>-dverbose-core2core</option> you'll see the
7616 sequence of phase numbers for successive runs of the
7617 simplifier. In an INLINE pragma you can optionally specify a
7621 <para>"<literal>INLINE[k] f</literal>" means: do not inline
7622 <literal>f</literal>
7623 until phase <literal>k</literal>, but from phase
7624 <literal>k</literal> onwards be very keen to inline it.
7627 <para>"<literal>INLINE[~k] f</literal>" means: be very keen to inline
7628 <literal>f</literal>
7629 until phase <literal>k</literal>, but from phase
7630 <literal>k</literal> onwards do not inline it.
7633 <para>"<literal>NOINLINE[k] f</literal>" means: do not inline
7634 <literal>f</literal>
7635 until phase <literal>k</literal>, but from phase
7636 <literal>k</literal> onwards be willing to inline it (as if
7637 there was no pragma).
7640 <para>"<literal>NOINLINE[~k] f</literal>" means: be willing to inline
7641 <literal>f</literal>
7642 until phase <literal>k</literal>, but from phase
7643 <literal>k</literal> onwards do not inline it.
7646 The same information is summarised here:
7648 -- Before phase 2 Phase 2 and later
7649 {-# INLINE [2] f #-} -- No Yes
7650 {-# INLINE [~2] f #-} -- Yes No
7651 {-# NOINLINE [2] f #-} -- No Maybe
7652 {-# NOINLINE [~2] f #-} -- Maybe No
7654 {-# INLINE f #-} -- Yes Yes
7655 {-# NOINLINE f #-} -- No No
7657 By "Maybe" we mean that the usual heuristic inlining rules apply (if the
7658 function body is small, or it is applied to interesting-looking arguments etc).
7659 Another way to understand the semantics is this:
7661 <listitem><para>For both INLINE and NOINLINE, the phase number says
7662 when inlining is allowed at all.</para></listitem>
7663 <listitem><para>The INLINE pragma has the additional effect of making the
7664 function body look small, so that when inlining is allowed it is very likely to
7669 <para>The same phase-numbering control is available for RULES
7670 (<xref linkend="rewrite-rules"/>).</para>
7674 <sect2 id="annotation-pragmas">
7675 <title>ANN pragmas</title>
7677 <para>GHC offers the ability to annotate various code constructs with additional
7678 data by using three pragmas. This data can then be inspected at a later date by
7679 using GHC-as-a-library.</para>
7681 <sect3 id="ann-pragma">
7682 <title>Annotating values</title>
7684 <indexterm><primary>ANN</primary></indexterm>
7686 <para>Any expression that has both <literal>Typeable</literal> and <literal>Data</literal> instances may be attached to a top-level value
7687 binding using an <literal>ANN</literal> pragma. In particular, this means you can use <literal>ANN</literal>
7688 to annotate data constructors (e.g. <literal>Just</literal>) as well as normal values (e.g. <literal>take</literal>).
7689 By way of example, to annotate the function <literal>foo</literal> with the annotation <literal>Just "Hello"</literal>
7690 you would do this:</para>
7693 {-# ANN foo (Just "Hello") #-}
7698 A number of restrictions apply to use of annotations:
7700 <listitem><para>The binder being annotated must be at the top level (i.e. no nested binders)</para></listitem>
7701 <listitem><para>The binder being annotated must be declared in the current module</para></listitem>
7702 <listitem><para>The expression you are annotating with must have a type with <literal>Typeable</literal> and <literal>Data</literal> instances</para></listitem>
7703 <listitem><para>The <ulink linkend="using-template-haskell">Template Haskell staging restrictions</ulink> apply to the
7704 expression being annotated with, so for example you cannot run a function from the module being compiled.</para>
7706 <para>To be precise, the annotation <literal>{-# ANN x e #-}</literal> is well staged if and only if <literal>$(e)</literal> would be
7707 (disregarding the usual type restrictions of the splice syntax, and the usual restriction on splicing inside a splice - <literal>$([|1|])</literal> is fine as an annotation, albeit redundant).</para></listitem>
7710 If you feel strongly that any of these restrictions are too onerous, <ulink url="http://hackage.haskell.org/trac/ghc/wiki/MailingListsAndIRC">
7711 please give the GHC team a shout</ulink>.
7714 <para>However, apart from these restrictions, many things are allowed, including expressions which are not fully evaluated!
7715 Annotation expressions will be evaluated by the compiler just like Template Haskell splices are. So, this annotation is fine:</para>
7718 {-# ANN f SillyAnnotation { foo = (id 10) + $([| 20 |]), bar = 'f } #-}
7723 <sect3 id="typeann-pragma">
7724 <title>Annotating types</title>
7726 <indexterm><primary>ANN type</primary></indexterm>
7727 <indexterm><primary>ANN</primary></indexterm>
7729 <para>You can annotate types with the <literal>ANN</literal> pragma by using the <literal>type</literal> keyword. For example:</para>
7732 {-# ANN type Foo (Just "A `Maybe String' annotation") #-}
7737 <sect3 id="modann-pragma">
7738 <title>Annotating modules</title>
7740 <indexterm><primary>ANN module</primary></indexterm>
7741 <indexterm><primary>ANN</primary></indexterm>
7743 <para>You can annotate modules with the <literal>ANN</literal> pragma by using the <literal>module</literal> keyword. For example:</para>
7746 {-# ANN module (Just "A `Maybe String' annotation") #-}
7751 <sect2 id="line-pragma">
7752 <title>LINE pragma</title>
7754 <indexterm><primary>LINE</primary><secondary>pragma</secondary></indexterm>
7755 <indexterm><primary>pragma</primary><secondary>LINE</secondary></indexterm>
7756 <para>This pragma is similar to C's <literal>#line</literal>
7757 pragma, and is mainly for use in automatically generated Haskell
7758 code. It lets you specify the line number and filename of the
7759 original code; for example</para>
7761 <programlisting>{-# LINE 42 "Foo.vhs" #-}</programlisting>
7763 <para>if you'd generated the current file from something called
7764 <filename>Foo.vhs</filename> and this line corresponds to line
7765 42 in the original. GHC will adjust its error messages to refer
7766 to the line/file named in the <literal>LINE</literal>
7771 <title>RULES pragma</title>
7773 <para>The RULES pragma lets you specify rewrite rules. It is
7774 described in <xref linkend="rewrite-rules"/>.</para>
7777 <sect2 id="specialize-pragma">
7778 <title>SPECIALIZE pragma</title>
7780 <indexterm><primary>SPECIALIZE pragma</primary></indexterm>
7781 <indexterm><primary>pragma, SPECIALIZE</primary></indexterm>
7782 <indexterm><primary>overloading, death to</primary></indexterm>
7784 <para>(UK spelling also accepted.) For key overloaded
7785 functions, you can create extra versions (NB: more code space)
7786 specialised to particular types. Thus, if you have an
7787 overloaded function:</para>
7790 hammeredLookup :: Ord key => [(key, value)] -> key -> value
7793 <para>If it is heavily used on lists with
7794 <literal>Widget</literal> keys, you could specialise it as
7798 {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
7801 <para>A <literal>SPECIALIZE</literal> pragma for a function can
7802 be put anywhere its type signature could be put.</para>
7804 <para>A <literal>SPECIALIZE</literal> has the effect of generating
7805 (a) a specialised version of the function and (b) a rewrite rule
7806 (see <xref linkend="rewrite-rules"/>) that rewrites a call to the
7807 un-specialised function into a call to the specialised one.</para>
7809 <para>The type in a SPECIALIZE pragma can be any type that is less
7810 polymorphic than the type of the original function. In concrete terms,
7811 if the original function is <literal>f</literal> then the pragma
7813 {-# SPECIALIZE f :: <type> #-}
7815 is valid if and only if the definition
7817 f_spec :: <type>
7820 is valid. Here are some examples (where we only give the type signature
7821 for the original function, not its code):
7823 f :: Eq a => a -> b -> b
7824 {-# SPECIALISE f :: Int -> b -> b #-}
7826 g :: (Eq a, Ix b) => a -> b -> b
7827 {-# SPECIALISE g :: (Eq a) => a -> Int -> Int #-}
7829 h :: Eq a => a -> a -> a
7830 {-# SPECIALISE h :: (Eq a) => [a] -> [a] -> [a] #-}
7832 The last of these examples will generate a
7833 RULE with a somewhat-complex left-hand side (try it yourself), so it might not fire very
7834 well. If you use this kind of specialisation, let us know how well it works.
7837 <para>A <literal>SPECIALIZE</literal> pragma can optionally be followed with a
7838 <literal>INLINE</literal> or <literal>NOINLINE</literal> pragma, optionally
7839 followed by a phase, as described in <xref linkend="inline-noinline-pragma"/>.
7840 The <literal>INLINE</literal> pragma affects the specialised version of the
7841 function (only), and applies even if the function is recursive. The motivating
7844 -- A GADT for arrays with type-indexed representation
7846 ArrInt :: !Int -> ByteArray# -> Arr Int
7847 ArrPair :: !Int -> Arr e1 -> Arr e2 -> Arr (e1, e2)
7849 (!:) :: Arr e -> Int -> e
7850 {-# SPECIALISE INLINE (!:) :: Arr Int -> Int -> Int #-}
7851 {-# SPECIALISE INLINE (!:) :: Arr (a, b) -> Int -> (a, b) #-}
7852 (ArrInt _ ba) !: (I# i) = I# (indexIntArray# ba i)
7853 (ArrPair _ a1 a2) !: i = (a1 !: i, a2 !: i)
7855 Here, <literal>(!:)</literal> is a recursive function that indexes arrays
7856 of type <literal>Arr e</literal>. Consider a call to <literal>(!:)</literal>
7857 at type <literal>(Int,Int)</literal>. The second specialisation will fire, and
7858 the specialised function will be inlined. It has two calls to
7859 <literal>(!:)</literal>,
7860 both at type <literal>Int</literal>. Both these calls fire the first
7861 specialisation, whose body is also inlined. The result is a type-based
7862 unrolling of the indexing function.</para>
7863 <para>Warning: you can make GHC diverge by using <literal>SPECIALISE INLINE</literal>
7864 on an ordinarily-recursive function.</para>
7866 <para>Note: In earlier versions of GHC, it was possible to provide your own
7867 specialised function for a given type:
7870 {-# SPECIALIZE hammeredLookup :: [(Int, value)] -> Int -> value = intLookup #-}
7873 This feature has been removed, as it is now subsumed by the
7874 <literal>RULES</literal> pragma (see <xref linkend="rule-spec"/>).</para>
7878 <sect2 id="specialize-instance-pragma">
7879 <title>SPECIALIZE instance pragma
7883 <indexterm><primary>SPECIALIZE pragma</primary></indexterm>
7884 <indexterm><primary>overloading, death to</primary></indexterm>
7885 Same idea, except for instance declarations. For example:
7888 instance (Eq a) => Eq (Foo a) where {
7889 {-# SPECIALIZE instance Eq (Foo [(Int, Bar)]) #-}
7893 The pragma must occur inside the <literal>where</literal> part
7894 of the instance declaration.
7897 Compatible with HBC, by the way, except perhaps in the placement
7903 <sect2 id="unpack-pragma">
7904 <title>UNPACK pragma</title>
7906 <indexterm><primary>UNPACK</primary></indexterm>
7908 <para>The <literal>UNPACK</literal> indicates to the compiler
7909 that it should unpack the contents of a constructor field into
7910 the constructor itself, removing a level of indirection. For
7914 data T = T {-# UNPACK #-} !Float
7915 {-# UNPACK #-} !Float
7918 <para>will create a constructor <literal>T</literal> containing
7919 two unboxed floats. This may not always be an optimisation: if
7920 the <function>T</function> constructor is scrutinised and the
7921 floats passed to a non-strict function for example, they will
7922 have to be reboxed (this is done automatically by the
7925 <para>Unpacking constructor fields should only be used in
7926 conjunction with <option>-O</option>, in order to expose
7927 unfoldings to the compiler so the reboxing can be removed as
7928 often as possible. For example:</para>
7932 f (T f1 f2) = f1 + f2
7935 <para>The compiler will avoid reboxing <function>f1</function>
7936 and <function>f2</function> by inlining <function>+</function>
7937 on floats, but only when <option>-O</option> is on.</para>
7939 <para>Any single-constructor data is eligible for unpacking; for
7943 data T = T {-# UNPACK #-} !(Int,Int)
7946 <para>will store the two <literal>Int</literal>s directly in the
7947 <function>T</function> constructor, by flattening the pair.
7948 Multi-level unpacking is also supported:
7951 data T = T {-# UNPACK #-} !S
7952 data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int
7955 will store two unboxed <literal>Int#</literal>s
7956 directly in the <function>T</function> constructor. The
7957 unpacker can see through newtypes, too.</para>
7959 <para>If a field cannot be unpacked, you will not get a warning,
7960 so it might be an idea to check the generated code with
7961 <option>-ddump-simpl</option>.</para>
7963 <para>See also the <option>-funbox-strict-fields</option> flag,
7964 which essentially has the effect of adding
7965 <literal>{-# UNPACK #-}</literal> to every strict
7966 constructor field.</para>
7969 <sect2 id="source-pragma">
7970 <title>SOURCE pragma</title>
7972 <indexterm><primary>SOURCE</primary></indexterm>
7973 <para>The <literal>{-# SOURCE #-}</literal> pragma is used only in <literal>import</literal> declarations,
7974 to break a module loop. It is described in detail in <xref linkend="mutual-recursion"/>.
7980 <!-- ======================= REWRITE RULES ======================== -->
7982 <sect1 id="rewrite-rules">
7983 <title>Rewrite rules
7985 <indexterm><primary>RULES pragma</primary></indexterm>
7986 <indexterm><primary>pragma, RULES</primary></indexterm>
7987 <indexterm><primary>rewrite rules</primary></indexterm></title>
7990 The programmer can specify rewrite rules as part of the source program
7996 "map/map" forall f g xs. map f (map g xs) = map (f.g) xs
8001 Use the debug flag <option>-ddump-simpl-stats</option> to see what rules fired.
8002 If you need more information, then <option>-ddump-rule-firings</option> shows you
8003 each individual rule firing in detail.
8007 <title>Syntax</title>
8010 From a syntactic point of view:
8016 There may be zero or more rules in a <literal>RULES</literal> pragma, separated by semicolons (which
8017 may be generated by the layout rule).
8023 The layout rule applies in a pragma.
8024 Currently no new indentation level
8025 is set, so if you put several rules in single RULES pragma and wish to use layout to separate them,
8026 you must lay out the starting in the same column as the enclosing definitions.
8029 "map/map" forall f g xs. map f (map g xs) = map (f.g) xs
8030 "map/append" forall f xs ys. map f (xs ++ ys) = map f xs ++ map f ys
8033 Furthermore, the closing <literal>#-}</literal>
8034 should start in a column to the right of the opening <literal>{-#</literal>.
8040 Each rule has a name, enclosed in double quotes. The name itself has
8041 no significance at all. It is only used when reporting how many times the rule fired.
8047 A rule may optionally have a phase-control number (see <xref linkend="phase-control"/>),
8048 immediately after the name of the rule. Thus:
8051 "map/map" [2] forall f g xs. map f (map g xs) = map (f.g) xs
8054 The "[2]" means that the rule is active in Phase 2 and subsequent phases. The inverse
8055 notation "[~2]" is also accepted, meaning that the rule is active up to, but not including,
8064 Each variable mentioned in a rule must either be in scope (e.g. <function>map</function>),
8065 or bound by the <literal>forall</literal> (e.g. <function>f</function>, <function>g</function>, <function>xs</function>). The variables bound by
8066 the <literal>forall</literal> are called the <emphasis>pattern</emphasis> variables. They are separated
8067 by spaces, just like in a type <literal>forall</literal>.
8073 A pattern variable may optionally have a type signature.
8074 If the type of the pattern variable is polymorphic, it <emphasis>must</emphasis> have a type signature.
8075 For example, here is the <literal>foldr/build</literal> rule:
8078 "fold/build" forall k z (g::forall b. (a->b->b) -> b -> b) .
8079 foldr k z (build g) = g k z
8082 Since <function>g</function> has a polymorphic type, it must have a type signature.
8089 The left hand side of a rule must consist of a top-level variable applied
8090 to arbitrary expressions. For example, this is <emphasis>not</emphasis> OK:
8093 "wrong1" forall e1 e2. case True of { True -> e1; False -> e2 } = e1
8094 "wrong2" forall f. f True = True
8097 In <literal>"wrong1"</literal>, the LHS is not an application; in <literal>"wrong2"</literal>, the LHS has a pattern variable
8104 A rule does not need to be in the same module as (any of) the
8105 variables it mentions, though of course they need to be in scope.
8111 All rules are implicitly exported from the module, and are therefore
8112 in force in any module that imports the module that defined the rule, directly
8113 or indirectly. (That is, if A imports B, which imports C, then C's rules are
8114 in force when compiling A.) The situation is very similar to that for instance
8122 Inside a RULE "<literal>forall</literal>" is treated as a keyword, regardless of
8123 any other flag settings. Furthermore, inside a RULE, the language extension
8124 <option>-XScopedTypeVariables</option> is automatically enabled; see
8125 <xref linkend="scoped-type-variables"/>.
8131 Like other pragmas, RULE pragmas are always checked for scope errors, and
8132 are typechecked. Typechecking means that the LHS and RHS of a rule are typechecked,
8133 and must have the same type. However, rules are only <emphasis>enabled</emphasis>
8134 if the <option>-fenable-rewrite-rules</option> flag is
8135 on (see <xref linkend="rule-semantics"/>).
8144 <sect2 id="rule-semantics">
8145 <title>Semantics</title>
8148 From a semantic point of view:
8153 Rules are enabled (that is, used during optimisation)
8154 by the <option>-fenable-rewrite-rules</option> flag.
8155 This flag is implied by <option>-O</option>, and may be switched
8156 off (as usual) by <option>-fno-enable-rewrite-rules</option>.
8157 (NB: enabling <option>-fenable-rewrite-rules</option> without <option>-O</option>
8158 may not do what you expect, though, because without <option>-O</option> GHC
8159 ignores all optimisation information in interface files;
8160 see <option>-fignore-interface-pragmas</option>, <xref linkend="options-f"/>.)
8161 Note that <option>-fenable-rewrite-rules</option> is an <emphasis>optimisation</emphasis> flag, and
8162 has no effect on parsing or typechecking.
8168 Rules are regarded as left-to-right rewrite rules.
8169 When GHC finds an expression that is a substitution instance of the LHS
8170 of a rule, it replaces the expression by the (appropriately-substituted) RHS.
8171 By "a substitution instance" we mean that the LHS can be made equal to the
8172 expression by substituting for the pattern variables.
8179 GHC makes absolutely no attempt to verify that the LHS and RHS
8180 of a rule have the same meaning. That is undecidable in general, and
8181 infeasible in most interesting cases. The responsibility is entirely the programmer's!
8188 GHC makes no attempt to make sure that the rules are confluent or
8189 terminating. For example:
8192 "loop" forall x y. f x y = f y x
8195 This rule will cause the compiler to go into an infinite loop.
8202 If more than one rule matches a call, GHC will choose one arbitrarily to apply.
8208 GHC currently uses a very simple, syntactic, matching algorithm
8209 for matching a rule LHS with an expression. It seeks a substitution
8210 which makes the LHS and expression syntactically equal modulo alpha
8211 conversion. The pattern (rule), but not the expression, is eta-expanded if
8212 necessary. (Eta-expanding the expression can lead to laziness bugs.)
8213 But not beta conversion (that's called higher-order matching).
8217 Matching is carried out on GHC's intermediate language, which includes
8218 type abstractions and applications. So a rule only matches if the
8219 types match too. See <xref linkend="rule-spec"/> below.
8225 GHC keeps trying to apply the rules as it optimises the program.
8226 For example, consider:
8235 The expression <literal>s (t xs)</literal> does not match the rule <literal>"map/map"</literal>, but GHC
8236 will substitute for <varname>s</varname> and <varname>t</varname>, giving an expression which does match.
8237 If <varname>s</varname> or <varname>t</varname> was (a) used more than once, and (b) large or a redex, then it would
8238 not be substituted, and the rule would not fire.
8248 <sect2 id="conlike">
8249 <title>How rules interact with INLINE/NOINLINE and CONLIKE pragmas</title>
8252 Ordinary inlining happens at the same time as rule rewriting, which may lead to unexpected
8253 results. Consider this (artificial) example
8259 {-# RULES "f" f True = False #-}
8261 Since <literal>f</literal>'s right-hand side is small, it is inlined into <literal>g</literal>,
8266 Now <literal>g</literal> is inlined into <literal>h</literal>, but <literal>f</literal>'s RULE has
8268 If instead GHC had first inlined <literal>g</literal> into <literal>h</literal> then there
8269 would have been a better chance that <literal>f</literal>'s RULE might fire.
8272 The way to get predictable behaviour is to use a NOINLINE
8273 pragma, or an INLINE[<replaceable>phase</replaceable>] pragma, on <literal>f</literal>, to ensure
8274 that it is not inlined until its RULEs have had a chance to fire.
8277 GHC is very cautious about duplicating work. For example, consider
8279 f k z xs = let xs = build g
8280 in ...(foldr k z xs)...sum xs...
8281 {-# RULES "foldr/build" forall k z g. foldr k z (build g) = g k z #-}
8283 Since <literal>xs</literal> is used twice, GHC does not fire the foldr/build rule. Rightly
8284 so, because it might take a lot of work to compute <literal>xs</literal>, which would be
8285 duplicated if the rule fired.
8288 Sometimes, however, this approach is over-cautious, and we <emphasis>do</emphasis> want the
8289 rule to fire, even though doing so would duplicate redex. There is no way that GHC can work out
8290 when this is a good idea, so we provide the CONLIKE pragma to declare it, thus:
8292 {-# INLINE[1] CONLIKE f #-}
8293 f x = <replaceable>blah</replaceable>
8295 CONLIKE is a modifier to an INLINE or NOINLINE pragam. It specifies that an application
8296 of f to one argument (in general, the number of arguments to the left of the '=' sign)
8297 should be considered cheap enough to duplicate, if such a duplication would make rule
8298 fire. (The name "CONLIKE" is short for "constructor-like", because constructors certainly
8299 have such a property.)
8300 The CONLIKE pragam is a modifier to INLINE/NOINLINE because it really only makes sense to match
8301 <literal>f</literal> on the LHS of a rule if you are sure that <literal>f</literal> is
8302 not going to be inlined before the rule has a chance to fire.
8307 <title>List fusion</title>
8310 The RULES mechanism is used to implement fusion (deforestation) of common list functions.
8311 If a "good consumer" consumes an intermediate list constructed by a "good producer", the
8312 intermediate list should be eliminated entirely.
8316 The following are good producers:
8328 Enumerations of <literal>Int</literal> and <literal>Char</literal> (e.g. <literal>['a'..'z']</literal>).
8334 Explicit lists (e.g. <literal>[True, False]</literal>)
8340 The cons constructor (e.g <literal>3:4:[]</literal>)
8346 <function>++</function>
8352 <function>map</function>
8358 <function>take</function>, <function>filter</function>
8364 <function>iterate</function>, <function>repeat</function>
8370 <function>zip</function>, <function>zipWith</function>
8379 The following are good consumers:
8391 <function>array</function> (on its second argument)
8397 <function>++</function> (on its first argument)
8403 <function>foldr</function>
8409 <function>map</function>
8415 <function>take</function>, <function>filter</function>
8421 <function>concat</function>
8427 <function>unzip</function>, <function>unzip2</function>, <function>unzip3</function>, <function>unzip4</function>
8433 <function>zip</function>, <function>zipWith</function> (but on one argument only; if both are good producers, <function>zip</function>
8434 will fuse with one but not the other)
8440 <function>partition</function>
8446 <function>head</function>
8452 <function>and</function>, <function>or</function>, <function>any</function>, <function>all</function>
8458 <function>sequence_</function>
8464 <function>msum</function>
8470 <function>sortBy</function>
8479 So, for example, the following should generate no intermediate lists:
8482 array (1,10) [(i,i*i) | i <- map (+ 1) [0..9]]
8488 This list could readily be extended; if there are Prelude functions that you use
8489 a lot which are not included, please tell us.
8493 If you want to write your own good consumers or producers, look at the
8494 Prelude definitions of the above functions to see how to do so.
8499 <sect2 id="rule-spec">
8500 <title>Specialisation
8504 Rewrite rules can be used to get the same effect as a feature
8505 present in earlier versions of GHC.
8506 For example, suppose that:
8509 genericLookup :: Ord a => Table a b -> a -> b
8510 intLookup :: Table Int b -> Int -> b
8513 where <function>intLookup</function> is an implementation of
8514 <function>genericLookup</function> that works very fast for
8515 keys of type <literal>Int</literal>. You might wish
8516 to tell GHC to use <function>intLookup</function> instead of
8517 <function>genericLookup</function> whenever the latter was called with
8518 type <literal>Table Int b -> Int -> b</literal>.
8519 It used to be possible to write
8522 {-# SPECIALIZE genericLookup :: Table Int b -> Int -> b = intLookup #-}
8525 This feature is no longer in GHC, but rewrite rules let you do the same thing:
8528 {-# RULES "genericLookup/Int" genericLookup = intLookup #-}
8531 This slightly odd-looking rule instructs GHC to replace
8532 <function>genericLookup</function> by <function>intLookup</function>
8533 <emphasis>whenever the types match</emphasis>.
8534 What is more, this rule does not need to be in the same
8535 file as <function>genericLookup</function>, unlike the
8536 <literal>SPECIALIZE</literal> pragmas which currently do (so that they
8537 have an original definition available to specialise).
8540 <para>It is <emphasis>Your Responsibility</emphasis> to make sure that
8541 <function>intLookup</function> really behaves as a specialised version
8542 of <function>genericLookup</function>!!!</para>
8544 <para>An example in which using <literal>RULES</literal> for
8545 specialisation will Win Big:
8548 toDouble :: Real a => a -> Double
8549 toDouble = fromRational . toRational
8551 {-# RULES "toDouble/Int" toDouble = i2d #-}
8552 i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
8555 The <function>i2d</function> function is virtually one machine
8556 instruction; the default conversion—via an intermediate
8557 <literal>Rational</literal>—is obscenely expensive by
8564 <title>Controlling what's going on</title>
8572 Use <option>-ddump-rules</option> to see what transformation rules GHC is using.
8578 Use <option>-ddump-simpl-stats</option> to see what rules are being fired.
8579 If you add <option>-dppr-debug</option> you get a more detailed listing.
8585 Use <option>-ddump-rule-firings</option> to see in great detail what rules are being fired.
8586 If you add <option>-dppr-debug</option> you get a still more detailed listing.
8592 The definition of (say) <function>build</function> in <filename>GHC/Base.lhs</filename> looks like this:
8595 build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
8596 {-# INLINE build #-}
8600 Notice the <literal>INLINE</literal>! That prevents <literal>(:)</literal> from being inlined when compiling
8601 <literal>PrelBase</literal>, so that an importing module will “see” the <literal>(:)</literal>, and can
8602 match it on the LHS of a rule. <literal>INLINE</literal> prevents any inlining happening
8603 in the RHS of the <literal>INLINE</literal> thing. I regret the delicacy of this.
8610 In <filename>libraries/base/GHC/Base.lhs</filename> look at the rules for <function>map</function> to
8611 see how to write rules that will do fusion and yet give an efficient
8612 program even if fusion doesn't happen. More rules in <filename>GHC/List.lhs</filename>.
8622 <sect2 id="core-pragma">
8623 <title>CORE pragma</title>
8625 <indexterm><primary>CORE pragma</primary></indexterm>
8626 <indexterm><primary>pragma, CORE</primary></indexterm>
8627 <indexterm><primary>core, annotation</primary></indexterm>
8630 The external core format supports <quote>Note</quote> annotations;
8631 the <literal>CORE</literal> pragma gives a way to specify what these
8632 should be in your Haskell source code. Syntactically, core
8633 annotations are attached to expressions and take a Haskell string
8634 literal as an argument. The following function definition shows an
8638 f x = ({-# CORE "foo" #-} show) ({-# CORE "bar" #-} x)
8641 Semantically, this is equivalent to:
8649 However, when external core is generated (via
8650 <option>-fext-core</option>), there will be Notes attached to the
8651 expressions <function>show</function> and <varname>x</varname>.
8652 The core function declaration for <function>f</function> is:
8656 f :: %forall a . GHCziShow.ZCTShow a ->
8657 a -> GHCziBase.ZMZN GHCziBase.Char =
8658 \ @ a (zddShow::GHCziShow.ZCTShow a) (eta::a) ->
8660 %case zddShow %of (tpl::GHCziShow.ZCTShow a)
8662 (tpl1::GHCziBase.Int ->
8664 GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha
8666 (tpl2::a -> GHCziBase.ZMZN GHCziBase.Char)
8667 (tpl3::GHCziBase.ZMZN a ->
8668 GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha
8676 Here, we can see that the function <function>show</function> (which
8677 has been expanded out to a case expression over the Show dictionary)
8678 has a <literal>%note</literal> attached to it, as does the
8679 expression <varname>eta</varname> (which used to be called
8680 <varname>x</varname>).
8687 <sect1 id="special-ids">
8688 <title>Special built-in functions</title>
8689 <para>GHC has a few built-in functions with special behaviour. These
8690 are now described in the module <ulink
8691 url="../libraries/ghc-prim/GHC-Prim.html"><literal>GHC.Prim</literal></ulink>
8692 in the library documentation.</para>
8696 <sect1 id="generic-classes">
8697 <title>Generic classes</title>
8700 The ideas behind this extension are described in detail in "Derivable type classes",
8701 Ralf Hinze and Simon Peyton Jones, Haskell Workshop, Montreal Sept 2000, pp94-105.
8702 An example will give the idea:
8710 fromBin :: [Int] -> (a, [Int])
8712 toBin {| Unit |} Unit = []
8713 toBin {| a :+: b |} (Inl x) = 0 : toBin x
8714 toBin {| a :+: b |} (Inr y) = 1 : toBin y
8715 toBin {| a :*: b |} (x :*: y) = toBin x ++ toBin y
8717 fromBin {| Unit |} bs = (Unit, bs)
8718 fromBin {| a :+: b |} (0:bs) = (Inl x, bs') where (x,bs') = fromBin bs
8719 fromBin {| a :+: b |} (1:bs) = (Inr y, bs') where (y,bs') = fromBin bs
8720 fromBin {| a :*: b |} bs = (x :*: y, bs'') where (x,bs' ) = fromBin bs
8721 (y,bs'') = fromBin bs'
8724 This class declaration explains how <literal>toBin</literal> and <literal>fromBin</literal>
8725 work for arbitrary data types. They do so by giving cases for unit, product, and sum,
8726 which are defined thus in the library module <literal>Generics</literal>:
8730 data a :+: b = Inl a | Inr b
8731 data a :*: b = a :*: b
8734 Now you can make a data type into an instance of Bin like this:
8736 instance (Bin a, Bin b) => Bin (a,b)
8737 instance Bin a => Bin [a]
8739 That is, just leave off the "where" clause. Of course, you can put in the
8740 where clause and over-ride whichever methods you please.
8744 <title> Using generics </title>
8745 <para>To use generics you need to</para>
8748 <para>Use the flags <option>-fglasgow-exts</option> (to enable the extra syntax),
8749 <option>-XGenerics</option> (to generate extra per-data-type code),
8750 and <option>-package lang</option> (to make the <literal>Generics</literal> library
8754 <para>Import the module <literal>Generics</literal> from the
8755 <literal>lang</literal> package. This import brings into
8756 scope the data types <literal>Unit</literal>,
8757 <literal>:*:</literal>, and <literal>:+:</literal>. (You
8758 don't need this import if you don't mention these types
8759 explicitly; for example, if you are simply giving instance
8760 declarations.)</para>
8765 <sect2> <title> Changes wrt the paper </title>
8767 Note that the type constructors <literal>:+:</literal> and <literal>:*:</literal>
8768 can be written infix (indeed, you can now use
8769 any operator starting in a colon as an infix type constructor). Also note that
8770 the type constructors are not exactly as in the paper (Unit instead of 1, etc).
8771 Finally, note that the syntax of the type patterns in the class declaration
8772 uses "<literal>{|</literal>" and "<literal>|}</literal>" brackets; curly braces
8773 alone would ambiguous when they appear on right hand sides (an extension we
8774 anticipate wanting).
8778 <sect2> <title>Terminology and restrictions</title>
8780 Terminology. A "generic default method" in a class declaration
8781 is one that is defined using type patterns as above.
8782 A "polymorphic default method" is a default method defined as in Haskell 98.
8783 A "generic class declaration" is a class declaration with at least one
8784 generic default method.
8792 Alas, we do not yet implement the stuff about constructor names and
8799 A generic class can have only one parameter; you can't have a generic
8800 multi-parameter class.
8806 A default method must be defined entirely using type patterns, or entirely
8807 without. So this is illegal:
8810 op :: a -> (a, Bool)
8811 op {| Unit |} Unit = (Unit, True)
8814 However it is perfectly OK for some methods of a generic class to have
8815 generic default methods and others to have polymorphic default methods.
8821 The type variable(s) in the type pattern for a generic method declaration
8822 scope over the right hand side. So this is legal (note the use of the type variable ``p'' in a type signature on the right hand side:
8826 op {| p :*: q |} (x :*: y) = op (x :: p)
8834 The type patterns in a generic default method must take one of the forms:
8840 where "a" and "b" are type variables. Furthermore, all the type patterns for
8841 a single type constructor (<literal>:*:</literal>, say) must be identical; they
8842 must use the same type variables. So this is illegal:
8846 op {| a :+: b |} (Inl x) = True
8847 op {| p :+: q |} (Inr y) = False
8849 The type patterns must be identical, even in equations for different methods of the class.
8850 So this too is illegal:
8854 op1 {| a :*: b |} (x :*: y) = True
8857 op2 {| p :*: q |} (x :*: y) = False
8859 (The reason for this restriction is that we gather all the equations for a particular type constructor
8860 into a single generic instance declaration.)
8866 A generic method declaration must give a case for each of the three type constructors.
8872 The type for a generic method can be built only from:
8874 <listitem> <para> Function arrows </para> </listitem>
8875 <listitem> <para> Type variables </para> </listitem>
8876 <listitem> <para> Tuples </para> </listitem>
8877 <listitem> <para> Arbitrary types not involving type variables </para> </listitem>
8879 Here are some example type signatures for generic methods:
8882 op2 :: Bool -> (a,Bool)
8883 op3 :: [Int] -> a -> a
8886 Here, op1, op2, op3 are OK, but op4 is rejected, because it has a type variable
8890 This restriction is an implementation restriction: we just haven't got around to
8891 implementing the necessary bidirectional maps over arbitrary type constructors.
8892 It would be relatively easy to add specific type constructors, such as Maybe and list,
8893 to the ones that are allowed.</para>
8898 In an instance declaration for a generic class, the idea is that the compiler
8899 will fill in the methods for you, based on the generic templates. However it can only
8904 The instance type is simple (a type constructor applied to type variables, as in Haskell 98).
8909 No constructor of the instance type has unboxed fields.
8913 (Of course, these things can only arise if you are already using GHC extensions.)
8914 However, you can still give an instance declarations for types which break these rules,
8915 provided you give explicit code to override any generic default methods.
8923 The option <option>-ddump-deriv</option> dumps incomprehensible stuff giving details of
8924 what the compiler does with generic declarations.
8929 <sect2> <title> Another example </title>
8931 Just to finish with, here's another example I rather like:
8935 nCons {| Unit |} _ = 1
8936 nCons {| a :*: b |} _ = 1
8937 nCons {| a :+: b |} _ = nCons (bot::a) + nCons (bot::b)
8940 tag {| Unit |} _ = 1
8941 tag {| a :*: b |} _ = 1
8942 tag {| a :+: b |} (Inl x) = tag x
8943 tag {| a :+: b |} (Inr y) = nCons (bot::a) + tag y
8949 <sect1 id="monomorphism">
8950 <title>Control over monomorphism</title>
8952 <para>GHC supports two flags that control the way in which generalisation is
8953 carried out at let and where bindings.
8957 <title>Switching off the dreaded Monomorphism Restriction</title>
8958 <indexterm><primary><option>-XNoMonomorphismRestriction</option></primary></indexterm>
8960 <para>Haskell's monomorphism restriction (see
8961 <ulink url="http://www.haskell.org/onlinereport/decls.html#sect4.5.5">Section
8963 of the Haskell Report)
8964 can be completely switched off by
8965 <option>-XNoMonomorphismRestriction</option>.
8970 <title>Monomorphic pattern bindings</title>
8971 <indexterm><primary><option>-XNoMonoPatBinds</option></primary></indexterm>
8972 <indexterm><primary><option>-XMonoPatBinds</option></primary></indexterm>
8974 <para> As an experimental change, we are exploring the possibility of
8975 making pattern bindings monomorphic; that is, not generalised at all.
8976 A pattern binding is a binding whose LHS has no function arguments,
8977 and is not a simple variable. For example:
8979 f x = x -- Not a pattern binding
8980 f = \x -> x -- Not a pattern binding
8981 f :: Int -> Int = \x -> x -- Not a pattern binding
8983 (g,h) = e -- A pattern binding
8984 (f) = e -- A pattern binding
8985 [x] = e -- A pattern binding
8987 Experimentally, GHC now makes pattern bindings monomorphic <emphasis>by
8988 default</emphasis>. Use <option>-XNoMonoPatBinds</option> to recover the
8997 ;;; Local Variables: ***
8999 ;;; sgml-parent-document: ("users_guide.xml" "book" "chapter" "sect1") ***
9000 ;;; ispell-local-dictionary: "british" ***