[project @ 2005-02-23 13:46:43 by simonpj]

[ghc-hetmet.git] / ghc / docs / users_guide / glasgow_exts.xml
diff --git a/ghc/docs/users_guide/glasgow_exts.xml b/ghc/docs/users_guide/glasgow_exts.xml

index 5a22e7c..93237d0 100644 (file)
--- a/ghc/docs/users_guide/glasgow_exts.xml
+++ b/ghc/docs/users_guide/glasgow_exts.xml
@@ -219,6 +219,28 @@ documentation</ulink> describes all the libraries that come with GHC.
        </varlistentry>
  
        <varlistentry>
+       <term><option>-fimplicit-params</option></term>
+       <listitem>
+         <para>Enables implicit parameters (see <xref
+         linkend="implicit-parameters"/>).  Currently also implied by 
+         <option>-fglasgow-exts</option>.</para>
+
+         <para>Syntax stolen:
+         <literal>?<replaceable>varid</replaceable></literal>,
+         <literal>%<replaceable>varid</replaceable></literal>.</para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
+       <term><option>-fscoped-type-variables</option></term>
+       <listitem>
+         <para>Enables lexically-scoped type variables (see <xref
+         linkend="scoped-type-variables"/>).  Implied by
+         <option>-fglasgow-exts</option>.</para>
+       </listitem>
+      </varlistentry>
+
+      <varlistentry>
         <term><option>-fth</option></term>
         <listitem>
           <para>Enables Template Haskell (see <xref
@@ -233,19 +255,6 @@ documentation</ulink> describes all the libraries that come with GHC.
         </listitem>
        </varlistentry>
  
-      <varlistentry>
-       <term><option>-fimplicit-params</option></term>
-       <listitem>
-         <para>Enables implicit parameters (see <xref
-         linkend="implicit-parameters"/>).  Currently also implied by 
-         <option>-fglasgow-exts</option>.</para>
-
-         <para>Syntax stolen:
-         <literal>?<replaceable>varid</replaceable></literal>,
-         <literal>%<replaceable>varid</replaceable></literal>.</para>
-       </listitem>
-      </varlistentry>
-
      </variablelist>
    </sect1>
  
@@ -270,7 +279,7 @@ became out of date, and wrong information is worse than none.</para>
  
  <para>The Real Truth about what primitive types there are, and what operations
  work over those types, is held in the file
-<filename>fptools/ghc/compiler/prelude/primops.txt</filename>.
+<filename>fptools/ghc/compiler/prelude/primops.txt.pp</filename>.
  This file is used directly to generate GHC's primitive-operation definitions, so
  it is always correct!  It is also intended for processing into text.</para>
  
@@ -334,11 +343,16 @@ it is accidental that it is represented by a pointer.  If a pointer
  represents a primitive value, then it really does point to that value:
  no unevaluated thunks, no indirections&hellip;nothing can be at the
  other end of the pointer than the primitive value.
+A numerically-intensive program using unboxed types can
+go a <emphasis>lot</emphasis> faster than its &ldquo;standard&rdquo;
+counterpart&mdash;we saw a threefold speedup on one example.
  </para>
  
  <para>
-There are some restrictions on the use of primitive types, the main
-one being that you can't pass a primitive value to a polymorphic
+There are some restrictions on the use of primitive types:
+<itemizedlist>
+<listitem><para>The main restriction
+is that you can't pass a primitive value to a polymorphic
  function or store one in a polymorphic data type.  This rules out
  things like <literal>[Int&num;]</literal> (i.e. lists of primitive
  integers).  The reason for this restriction is that polymorphic
@@ -350,11 +364,33 @@ attempt to dereference the pointer, with disastrous results.  Even
  worse, the unboxed value might be larger than a pointer
  (<literal>Double&num;</literal> for instance).
  </para>
+</listitem>
+<listitem><para> You cannot bind a variable with an unboxed type
+in a <emphasis>top-level</emphasis> binding.
+</para></listitem>
+<listitem><para> You cannot bind a variable with an unboxed type
+in a <emphasis>recursive</emphasis> binding.
+</para></listitem>
+<listitem><para> You may bind unboxed variables in a (non-recursive,
+non-top-level) pattern binding, but any such variable causes the entire
+pattern-match
+to become strict.  For example:
+<programlisting>
+  data Foo = Foo Int Int#
  
-<para>
-Nevertheless, A numerically-intensive program using unboxed types can
-go a <emphasis>lot</emphasis> faster than its &ldquo;standard&rdquo;
-counterpart&mdash;we saw a threefold speedup on one example.
+  f x = let (Foo a b, w) = ..rhs.. in ..body..
+</programlisting>
+Since <literal>b</literal> has type <literal>Int#</literal>, the entire pattern
+match
+is strict, and the program behaves as if you had written
+<programlisting>
+  data Foo = Foo Int Int#
+
+  f x = case ..rhs.. of { (Foo a b, w) -> ..body.. }
+</programlisting>
+</para>
+</listitem>
+</itemizedlist>
  </para>
  
  </sect2>
@@ -389,21 +425,19 @@ values, but they avoid the heap allocation normally associated with
  using fully-fledged tuples.  When an unboxed tuple is returned, the
  components are put directly into registers or on the stack; the
  unboxed tuple itself does not have a composite representation.  Many
-of the primitive operations listed in this section return unboxed
+of the primitive operations listed in <literal>primops.txt.pp</literal> return unboxed
  tuples.
+In particular, the <literal>IO</literal> and <literal>ST</literal> monads use unboxed
+tuples to avoid unnecessary allocation during sequences of operations.
  </para>
  
  <para>
  There are some pretty stringent restrictions on the use of unboxed tuples:
-</para>
-
-<para>
-
  <itemizedlist>
  <listitem>
  
  <para>
- Unboxed tuple types are subject to the same restrictions as
+Values of unboxed tuple types are subject to the same restrictions as
  other unboxed types; i.e. they may not be stored in polymorphic data
  structures or passed to polymorphic functions.
  
@@ -412,56 +446,46 @@ structures or passed to polymorphic functions.
  <listitem>
  
  <para>
- Unboxed tuples may only be constructed as the direct result of
-a function, and may only be deconstructed with a <literal>case</literal> expression.
-eg. the following are valid:
-
-
-<programlisting>
-f x y = (# x+1, y-1 #)
-g x = case f x x of { (# a, b #) -&#62; a + b }
-</programlisting>
-
-
-but the following are invalid:
+No variable can have an unboxed tuple type, nor may a constructor or function
+argument have an unboxed tuple type.  The following are all illegal:
  
  
  <programlisting>
-f x y = g (# x, y #)
-g (# x, y #) = x + y
-</programlisting>
+  data Foo = Foo (# Int, Int #)
  
+  f :: (# Int, Int #) -&#62; (# Int, Int #)
+  f x = x
  
-</para>
-</listitem>
-<listitem>
-
-<para>
- No variable can have an unboxed tuple type.  This is illegal:
-
+  g :: (# Int, Int #) -&#62; Int
+  g (# a,b #) = a
  
-<programlisting>
-f :: (# Int, Int #) -&#62; (# Int, Int #)
-f x = x
+  h x = let y = (# x,x #) in ...
  </programlisting>
-
-
-because <literal>x</literal> has an unboxed tuple type.
-
  </para>
  </listitem>
-
  </itemizedlist>
-
  </para>
-
-<para>
-Note: we may relax some of these restrictions in the future.
-</para>
-
  <para>
-The <literal>IO</literal> and <literal>ST</literal> monads use unboxed
-tuples to avoid unnecessary allocation during sequences of operations.
+The typical use of unboxed tuples is simply to return multiple values,
+binding those multiple results with a <literal>case</literal> expression, thus:
+<programlisting>
+  f x y = (# x+1, y-1 #)
+  g x = case f x x of { (# a, b #) -&#62; a + b }
+</programlisting>
+You can have an unboxed tuple in a pattern binding, thus
+<programlisting>
+  f x = let (# p,q #) = h x in ..body..
+</programlisting>
+If the types of <literal>p</literal> and <literal>q</literal> are not unboxed,
+the resulting binding is lazy like any other Haskell pattern binding.  The 
+above example desugars like this:
+<programlisting>
+  f x = let t = case h x o f{ (# p,q #) -> (p,q)
+            p = fst t
+            q = snd t
+        in ..body..
+</programlisting>
+Indeed, the bindings can even be recursive.
  </para>
  
  </sect2>
@@ -619,7 +643,7 @@ clunky env var1 var1
  </programlisting>
  
  <para>
-The semantics should be clear enough.  The qualifers are matched in order. 
+The semantics should be clear enough.  The qualifiers are matched in order. 
  For a <literal>&lt;-</literal> qualifier, which I call a pattern guard, the
  right hand side is evaluated and matched against the pattern on the left. 
  If the match fails then the whole guard fails and the next equation is
@@ -835,26 +859,38 @@ This name is not supported by GHC.
               <literal>return</literal>, are in scope (not the Prelude
               versions).  List comprehensions, and parallel array
               comprehensions, are unaffected.  </para></listitem>
+
+             <listitem>
+               <para>Similarly recursive do notation (see
+               <xref linkend="mdo-notation"/>) uses whatever
+               <literal>mfix</literal> function is in scope, and arrow
+               notation (see <xref linkend="arrow-notation"/>)
+               uses whatever <literal>arr</literal>,
+               <literal>(>>>)</literal>, <literal>first</literal>,
+               <literal>app</literal>, <literal>(|||)</literal> and
+               <literal>loop</literal> functions are in scope.</para>
+             </listitem>
             </itemizedlist>
  
-            <para>Be warned: this is an experimental facility, with fewer checks than
-            usual.  In particular, it is essential that the functions GHC finds in scope
-            must have the appropriate types, namely:
+            <para>The functions with these names that GHC finds in scope
+            must have types matching those of the originals, namely:
              <screen>
-               fromInteger  :: forall a. (...) => Integer  -> a
-               fromRational :: forall a. (...) => Rational -> a
-               negate       :: forall a. (...) => a -> a
-               (-)          :: forall a. (...) => a -> a -> a
-               (>>=)        :: forall m a. (...) => m a -> (a -> m b) -> m b
-               (>>)         :: forall m a. (...) => m a -> m b -> m b
-               return       :: forall m a. (...) => a      -> m a
-               fail         :: forall m a. (...) => String -> m a
+               fromInteger  :: Integer  -> N
+               fromRational :: Rational -> N
+               negate       :: N -> N
+               (-)          :: N -> N -> N
+               (>>=)        :: forall a b. M a -> (a -> M b) -> M b
+               (>>)         :: forall a b. M a -> M b -> M b
+               return       :: forall a.   a      -> M a
+               fail         :: forall a.   String -> M a
              </screen>
-            (The (...) part can be any context including the empty context; that part 
-            is up to you.)
-            If the functions don't have the right type, very peculiar things may 
-            happen.  Use <literal>-dcore-lint</literal> to
-            typecheck the desugared program.  If Core Lint is happy you should be all right.</para>
+            (Here <literal>N</literal> may be any type,
+            and <literal>M</literal> any type constructor.)</para>
+
+            <para>Be warned: this is an experimental facility, with
+            fewer checks than usual.  Use <literal>-dcore-lint</literal>
+            to typecheck the desugared program.  If Core Lint is happy
+            you should be all right.</para>
  
  </sect2>
  </sect1>
@@ -889,18 +925,34 @@ Nevertheless, they can be useful when defining "phantom types".</para>
  </sect3>
  
  <sect3 id="infix-tycons">
-<title>Infix type constructors</title>
+<title>Infix type constructors and classes</title>
  
  <para>
-GHC allows type constructors to be operators, and to be written infix, very much 
+GHC allows type constructors and classes to be operators, and to be written infix, very much 
  like expressions.  More specifically:
  <itemizedlist>
  <listitem><para>
-  A type constructor can be an operator, beginning with a colon; e.g. <literal>:*:</literal>.
+  A type constructor or class can be an operator, beginning with a colon; e.g. <literal>:*:</literal>.
    The lexical syntax is the same as that for data constructors.
    </para></listitem>
  <listitem><para>
-  Types can be written infix.  For example <literal>Int :*: Bool</literal>.  
+  Data type and type-synonym declarations can be written infix, parenthesised
+  if you want further arguments.  E.g.
+<screen>
+  data a :*: b = Foo a b
+  type a :+: b = Either a b
+  class a :=: b where ...
+
+  data (a :**: b) x = Baz a b x
+  type (a :++: b) y = Either (a,b) y
+</screen>
+  </para></listitem>
+<listitem><para>
+  Types, and class constraints, can be written infix.  For example
+  <screen>
+       x :: Int :*: Bool
+        f :: (a :=: b) => a -> b
+  </screen>
    </para></listitem>
  <listitem><para>
    Back-quotes work
@@ -908,7 +960,7 @@ like expressions.  More specifically:
    <literal>Int `a` Bool</literal>.  Similarly, parentheses work the same; e.g.  <literal>(:*:) Int Bool</literal>.
    </para></listitem>
  <listitem><para>
-  Fixities may be declared for type constructors just as for data constructors.  However,
+  Fixities may be declared for type constructors, or classes, just as for data constructors.  However,
    one cannot distinguish between the two in a fixity declaration; a fixity declaration
    sets the fixity for a data constructor and the corresponding type constructor.  For example:
  <screen>
@@ -922,13 +974,6 @@ like expressions.  More specifically:
    Function arrow is <literal>infixr</literal> with fixity 0.  (This might change; I'm not sure what it should be.)
    </para></listitem>
  <listitem><para>
-  Data type and type-synonym declarations can be written infix.  E.g.
-<screen>
-  data a :*: b = Foo a b
-  type a :+: b = Either a b
-</screen>
-  </para></listitem>
-<listitem><para>
    The only thing that differs between operators in types and operators in expressions is that
    ordinary non-constructor operators, such as <literal>+</literal> and <literal>*</literal>
    are not allowed in types. Reason: the uniform thing to do would be to make them type
@@ -945,7 +990,7 @@ like expressions.  More specifically:
  <title>Liberalised type synonyms</title>
  
  <para>
-Type synonmys are like macros at the type level, and
+Type synonyms are like macros at the type level, and
  GHC does validity checking on types <emphasis>only after expanding type synonyms</emphasis>.
  That means that GHC can be very much more liberal about type synonyms than Haskell 98:
  <itemizedlist>
@@ -994,7 +1039,7 @@ You can apply a type synonym to a partially applied type synonym:
    
    foo :: Generic Id []
  </programlisting>
-After epxanding the synonym, <literal>foo</literal> has the legal (in GHC) type:
+After expanding the synonym, <literal>foo</literal> has the legal (in GHC) type:
  <programlisting>
    foo :: forall x. x -> [x]
  </programlisting>
@@ -1446,7 +1491,7 @@ be acyclic</emphasis>.  So these class declarations are OK:
  <para>
   <emphasis>All of the class type variables must be reachable (in the sense 
  mentioned in <xref linkend="type-restrictions"/>)
-from the free varibles of each method type
+from the free variables of each method type
  </emphasis>.  For example:
  
  
@@ -1577,7 +1622,7 @@ get any more information about <literal>tv</literal>.
  <para>
  Note
  that the reachability condition is weaker than saying that <literal>a</literal> is
-functionally dependendent on a type variable free in
+functionally dependent on a type variable free in
  <literal>type</literal> (see <xref
  linkend="functional-dependencies"/>).  The reason for this is there
  might be a "hidden" dependency, in a superclass perhaps.  So
@@ -1689,96 +1734,59 @@ means
  <sect3>
  <title>Overlapping instances</title>
  <para>
-In general, <emphasis>instance declarations may not overlap</emphasis>.  The two instance
-declarations
-
-
-<programlisting>
-  instance context1 => C type1 where ...
-  instance context2 => C type2 where ...
-</programlisting>
-
-"overlap" if <literal>type1</literal> and <literal>type2</literal> unify.
-</para>
-<para>
-However, if you give the command line option
-<option>-fallow-overlapping-instances</option><indexterm><primary>-fallow-overlapping-instances
-option</primary></indexterm> then overlapping instance declarations are permitted.
-However, GHC arranges never to commit to using an instance declaration
-if another instance declaration also applies, either now or later.
-
-<itemizedlist>
-<listitem>
-
-<para>
- EITHER <literal>type1</literal> and <literal>type2</literal> do not unify
-</para>
-</listitem>
-<listitem>
-
-<para>
- OR <literal>type2</literal> is a substitution instance of <literal>type1</literal>
-(but not identical to <literal>type1</literal>), or vice versa.
-</para>
-</listitem>
-</itemizedlist>
-Notice that these rules
-<itemizedlist>
-<listitem>
-
-<para>
- make it clear which instance decl to use
-(pick the most specific one that matches)
-
-</para>
-</listitem>
-<listitem>
-
-<para>
- do not mention the contexts <literal>context1</literal>, <literal>context2</literal>
-Reason: you can pick which instance decl
-"matches" based on the type.
-</para>
-</listitem>
-
-</itemizedlist>
-However the rules are over-conservative.  Two instance declarations can overlap,
-but it can still be clear in particular situations which to use.  For example:
-<programlisting>
-  instance C (Int,a) where ...
-  instance C (a,Bool) where ...
-</programlisting>
-These are rejected by GHC's rules, but it is clear what to do when trying
-to solve the constraint <literal>C (Int,Int)</literal> because the second instance
-cannot apply.  Yell if this restriction bites you.
-</para>
-<para>
-GHC is also conservative about committing to an overlapping instance.  For example:
-<programlisting>
-  class C a where { op :: a -> a }
-  instance C [Int] where ...
-  instance C a => C [a] where ...
-  
-  f :: C b => [b] -> [b]
-  f x = op x
-</programlisting>
-From the RHS of f we get the constraint <literal>C [b]</literal>.  But
-GHC does not commit to the second instance declaration, because in a paricular
-call of f, b might be instantiate to Int, so the first instance declaration
-would be appropriate.  So GHC rejects the program.  If you add <option>-fallow-incoherent-instances</option>
-GHC will instead silently pick the second instance, without complaining about 
+In general, <emphasis>GHC requires that that it be unambiguous which instance
+declaration
+should be used to resolve a type-class constraint</emphasis>. This behaviour
+can be modified by two flags: <option>-fallow-overlapping-instances</option>
+<indexterm><primary>-fallow-overlapping-instances
+</primary></indexterm> 
+and <option>-fallow-incoherent-instances</option>
+<indexterm><primary>-fallow-incoherent-instances
+</primary></indexterm>, as this section discusses.</para>
+<para>
+When GHC tries to resolve, say, the constraint <literal>C Int Bool</literal>,
+it tries to match every instance declaration against the
+constraint,
+by instantiating the head of the instance declaration.  For example, consider
+these declarations:
+<programlisting>
+  instance context1 => C Int a     where ...  -- (A)
+  instance context2 => C a   Bool  where ...  -- (B)
+  instance context3 => C Int [a]   where ...  -- (C)
+  instance context4 => C Int [Int] where ...  -- (D)
+</programlisting>
+The instances (A) and (B) match the constraint <literal>C Int Bool</literal>, but (C) and (D) do not.  When matching, GHC takes
+no account of the context of the instance declaration
+(<literal>context1</literal> etc).
+GHC's default behaviour is that <emphasis>exactly one instance must match the
+constraint it is trying to resolve</emphasis>.  
+It is fine for there to be a <emphasis>potential</emphasis> of overlap (by
+including both declarations (A) and (B), say); an error is only reported if a 
+particular constraint matches more than one.
+</para>
+
+<para>
+The <option>-fallow-overlapping-instances</option> flag instructs GHC to allow
+more than one instance to match, provided there is a most specific one.  For
+example, the constraint <literal>C Int [Int]</literal> matches instances (A),
+(C) and (D), but the last is more specific, and hence is chosen.  If there is no
+most-specific match, the program is rejected.
+</para>
+<para>
+However, GHC is conservative about committing to an overlapping instance.  For example:
+<programlisting>
+  f :: [b] -> [b]
+  f x = ...
+</programlisting>
+Suppose that from the RHS of <literal>f</literal> we get the constraint
+<literal>C Int [b]</literal>.  But
+GHC does not commit to instance (C), because in a particular
+call of <literal>f</literal>, <literal>b</literal> might be instantiate 
+to <literal>Int</literal>, in which case instance (D) would be more specific still.
+So GHC rejects the program.  If you add the flag <option>-fallow-incoherent-instances</option>,
+GHC will instead pick (C), without complaining about 
  the problem of subsequent instantiations.
  </para>
-<para>
-Regrettably, GHC doesn't guarantee to detect overlapping instance
-declarations if they appear in different modules.  GHC can "see" the
-instance declarations in the transitive closure of all the modules
-imported by the one being compiled, so it can "see" all instance decls
-when it is compiling <literal>Main</literal>.  However, it currently chooses not
-to look at ones that can't possibly be of use in the module currently
-being compiled, in the interests of efficiency.  (Perhaps we should
-change that decision, at least for <literal>Main</literal>.)
-</para>
  </sect3>
  
  <sect3>
@@ -1935,7 +1943,7 @@ allowing these idioms interesting idioms.
  <sect2 id="implicit-parameters">
  <title>Implicit parameters</title>
  
-<para> Implicit paramters are implemented as described in 
+<para> Implicit parameters are implemented as described in 
  "Implicit parameters: dynamic scoping with static types", 
  J Lewis, MB Shields, E Meijer, J Launchbury,
  27th ACM Symposium on Principles of Programming Languages (POPL'00),
@@ -2106,7 +2114,7 @@ problem that monads seem over-kill for certain sorts of problem, notably:
  </para>
  <itemizedlist>
  <listitem> <para> distributing a supply of unique names </para> </listitem>
-<listitem> <para> distributing a suppply of random numbers </para> </listitem>
+<listitem> <para> distributing a supply of random numbers </para> </listitem>
  <listitem> <para> distributing an oracle (as in QuickCheck) </para> </listitem>
  </itemizedlist>
  
@@ -2393,7 +2401,7 @@ is implicitly added by Haskell.
  </para>
  <para>
  The functions <literal>f2</literal> and <literal>g2</literal> have rank-2 types;
-the <literal>forall</literal> is on the left of a function arrrow.  As <literal>g2</literal>
+the <literal>forall</literal> is on the left of a function arrow.  As <literal>g2</literal>
  shows, the polymorphic type on the left of the function arrow can be overloaded.
  </para>
  <para>
@@ -2566,7 +2574,7 @@ matching.
  <title>Type inference</title>
  
  <para>
-In general, type inference for arbitrary-rank types is undecideable.
+In general, type inference for arbitrary-rank types is undecidable.
  GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96)
  to get a decidable algorithm by requiring some help from the programmer.
  We do not yet have a formal specification of "some help" but the rule is this:
@@ -2895,7 +2903,7 @@ A pattern type signature can occur in any pattern.  For example:
  <listitem>
  <para>
  A pattern type signature can be on an arbitrary sub-pattern, not
-ust on a variable:
+just on a variable:
  
  
  <programlisting>
@@ -3279,6 +3287,109 @@ instances is most interesting.
  </sect1>
  <!-- ==================== End of type system extensions =================  -->
    
+<!-- ====================== Generalised algebraic data types =======================  -->
+
+<sect1 id="gadt">
+<title>Generalised Algebraic Data Types</title>
+
+<para>Generalised Algebraic Data Types (GADTs) generalise ordinary algebraic data types by allowing you
+to give the type signatures of constructors explicitly.  For example:
+<programlisting>
+  data Term a where
+      Lit    :: Int -> Term Int
+      Succ   :: Term Int -> Term Int
+      IsZero :: Term Int -> Term Bool  
+      If     :: Term Bool -> Term a -> Term a -> Term a
+      Pair   :: Term a -> Term b -> Term (a,b)
+</programlisting>
+Notice that the return type of the constructors is not always <literal>Term a</literal>, as is the
+case with ordinary vanilla data types.  Now we can write a well-typed <literal>eval</literal> function
+for these <literal>Terms</literal>:
+<programlisting>
+  eval :: Term a -> a
+  eval (Lit i)             = i
+  eval (Succ t)     = 1 + eval t
+  eval (IsZero i)   = eval i == 0
+  eval (If b e1 e2) = if eval b then eval e1 else eval e2
+  eval (Pair e1 e2) = (eval e2, eval e2)
+</programlisting>
+These and many other examples are given in papers by Hongwei Xi, and Tim Sheard.
+</para>
+<para> The extensions to GHC are these:
+<itemizedlist>
+<listitem><para>
+  Data type declarations have a 'where' form, as exemplified above.  The type signature of
+each constructor is independent, and is implicitly universally quantified as usual. Unlike a normal
+Haskell data type declaration, the type variable(s) in the "<literal>data Term a where</literal>" header 
+have no scope.  Indeed, one can write a kind signature instead:
+<programlisting>
+  data Term :: * -> * where ...
+</programlisting>
+or even a mixture of the two:
+<programlisting>
+  data Foo a :: (* -> *) -> * where ...
+</programlisting>
+The type variables (if given) may be explicitly kinded, so we could also write the header for <literal>Foo</literal>
+like this:
+<programlisting>
+  data Foo a (b :: * -> *) where ...
+</programlisting>
+</para></listitem>
+
+<listitem><para>
+There are no restrictions on the type of the data constructor, except that the result
+type must begin with the type constructor being defined.  For example, in the <literal>Term</literal> data
+type above, the type of each constructor must end with <literal> ... -> Term ...</literal>.
+</para></listitem>
+
+<listitem><para>
+You cannot use a <literal>deriving</literal> clause on a GADT-style data type declaration,
+nor can you use record syntax.  (It's not clear what these constructs would mean.  For example,
+the record selectors might ill-typed.)  However, you can use strictness annotations, in the obvious places
+in the constructor type:
+<programlisting>
+  data Term a where
+      Lit    :: !Int -> Term Int
+      If     :: Term Bool -> !(Term a) -> !(Term a) -> Term a
+      Pair   :: Term a -> Term b -> Term (a,b)
+</programlisting>
+</para></listitem>
+
+<listitem><para>
+Pattern matching causes type refinement.  For example, in the right hand side of the equation
+<programlisting>
+  eval :: Term a -> a
+  eval (Lit i) =  ...
+</programlisting>
+the type <literal>a</literal> is refined to <literal>Int</literal>.  (That's the whole point!)
+A precise specification of the type rules is beyond what this user manual aspires to, but there is a paper
+about the ideas: "Wobbly types: practical type inference for generalised algebraic data types", on Simon PJ's home page.</para>
+
+<para> The general principle is this: <emphasis>type refinement is only carried out based on user-supplied type annotations</emphasis>.
+So if no type signature is supplied for <literal>eval</literal>, no type refinement happens, and lots of obscure error messages will
+occur.  However, the refinement is quite general.  For example, if we had:
+<programlisting>
+  eval :: Term a -> a -> a
+  eval (Lit i) j =  i+j
+</programlisting>
+the pattern match causes the type <literal>a</literal> to be refined to <literal>Int</literal> (because of the type
+of the constructor <literal>Lit</literal>, and that refinement also applies to the type of <literal>j</literal>, and
+the result type of the <literal>case</literal> expression.  Hence the addition <literal>i+j</literal> is legal.
+</para>
+</listitem>
+</itemizedlist>
+</para>
+
+<para>Notice that GADTs generalise existential types.  For example, these two declarations are equivalent:
+<programlisting>
+  data T a = forall b. MkT b (b->a)
+  data T' a where { MKT :: b -> (b->a) -> T a }
+</programlisting>
+</para>
+</sect1>
+
+<!-- ====================== End of Generalised algebraic data types =======================  -->
+
  <!-- ====================== TEMPLATE HASKELL =======================  -->
  
  <sect1 id="template-haskell">
@@ -3423,7 +3534,7 @@ module Printf where
  -- you intend to use it.
  
  -- Import some Template Haskell syntax
-import Language.Haskell.TH.Syntax
+import Language.Haskell.TH
  
  -- Describe a format string
  data Format = D | S | L String
@@ -4096,7 +4207,7 @@ Assertion failures can be caught, see the documentation for the
     {-# DEPRECATED f, C, T "Don't use these" #-}
  </programlisting>
           <para>When you compile any module that imports and uses any
-          of the specifed entities, GHC will print the specified
+          of the specified entities, GHC will print the specified
            message.</para>
         </listitem>
        </itemizedlist>
@@ -4113,6 +4224,31 @@ Assertion failures can be caught, see the documentation for the
        <option>-fno-warn-deprecations</option>.</para>
      </sect2>
  
+    <sect2 id="include-pragma">
+      <title>INCLUDE pragma</title>
+
+      <para>The <literal>INCLUDE</literal> pragma is for specifying the names
+       of C header files that should be <literal>#include</literal>'d into
+       the C source code generated by the compiler for the current module (if
+       compiling via C).  For example:</para>
+
+<programlisting>
+{-# INCLUDE "foo.h" #-}
+{-# INCLUDE &lt;stdio.h&gt; #-}</programlisting>
+
+      <para>The <literal>INCLUDE</literal> pragma(s) must appear at the top of
+       your source file with any <literal>OPTIONS_GHC</literal>
+       pragma(s).</para>
+
+      <para>An <literal>INCLUDE</literal> pragma is  the preferred alternative
+       to the <option>-#include</option> option (<xref
+         linkend="options-C-compiler" />), because the
+       <literal>INCLUDE</literal> pragma is understood by other
+       compilers.  Yet another alternative is to add the include file to each
+       <literal>foreign import</literal> declaration in your code, but we
+       don't recommend using this approach with GHC.</para>
+    </sect2>
+
      <sect2 id="inline-noinline-pragma">
        <title>INLINE and NOINLINE pragmas</title>
  
@@ -4152,7 +4288,7 @@ key_function :: Int -> String -> (Bool, Double)
          The normal unfolding machinery will then be very keen to
          inline it.</para>
  
-       <para>Syntactially, an <literal>INLINE</literal> pragma for a
+       <para>Syntactically, an <literal>INLINE</literal> pragma for a
          function can be put anywhere its type signature could be
          put.</para>
  
@@ -4186,7 +4322,7 @@ key_function :: Int -> String -> (Bool, Double)
          you're very cautious about code size.</para>
  
         <para><literal>NOTINLINE</literal> is a synonym for
-        <literal>NOINLINE</literal> (<literal>NOTINLINE</literal> is
+        <literal>NOINLINE</literal> (<literal>NOINLINE</literal> is
          specified by Haskell 98 as the standard way to disable
          inlining, so it should be used if you want your code to be
          portable).</para>
@@ -4202,7 +4338,7 @@ key_function :: Int -> String -> (Bool, Double)
          number</emphasis>; the phase number decreases towards zero.
          If you use <option>-dverbose-core2core</option> you'll see the
          sequence of phase numbers for successive runs of the
-        simpifier.  In an INLINE pragma you can optionally specify a
+        simplifier.  In an INLINE pragma you can optionally specify a
          phase number, thus:</para>
  
         <itemizedlist>
@@ -4286,16 +4422,19 @@ key_function :: Int -> String -> (Bool, Double)
      </sect2>
  
      <sect2 id="options-pragma">
-      <title>OPTIONS pragma</title>
-      <indexterm><primary>OPTIONS</primary>
+      <title>OPTIONS_GHC pragma</title>
+      <indexterm><primary>OPTIONS_GHC</primary>
        </indexterm>
-      <indexterm><primary>pragma</primary><secondary>OPTIONS</secondary>
+      <indexterm><primary>pragma</primary><secondary>OPTIONS_GHC</secondary>
        </indexterm>
  
-      <para>The <literal>OPTIONS</literal> pragma is used to specify
+      <para>The <literal>OPTIONS_GHC</literal> pragma is used to specify
        additional options that are given to the compiler when compiling
        this source file.  See <xref linkend="source-file-options"/> for
        details.</para>
+
+      <para>Previous versions of GHC accepted <literal>OPTIONS</literal> rather
+       than <literal>OPTIONS_GHC</literal>, but that is now deprecated.</para>
      </sect2>
  
      <sect2 id="rules">
@@ -4447,7 +4586,7 @@ data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int
  <sect1 id="rewrite-rules">
  <title>Rewrite rules
  
-<indexterm><primary>RULES pagma</primary></indexterm>
+<indexterm><primary>RULES pragma</primary></indexterm>
  <indexterm><primary>pragma, RULES</primary></indexterm>
  <indexterm><primary>rewrite rules</primary></indexterm></title>
  
@@ -4613,7 +4752,7 @@ same type.
  
  <para>
   GHC makes absolutely no attempt to verify that the LHS and RHS
-of a rule have the same meaning.  That is undecideable in general, and
+of a rule have the same meaning.  That is undecidable in general, and
  infeasible in most interesting cases.  The responsibility is entirely the programmer's!
  
  </para>
@@ -4645,7 +4784,7 @@ This rule will cause the compiler to go into an infinite loop.
  for matching a rule LHS with an expression.  It seeks a substitution
  which makes the LHS and expression syntactically equal modulo alpha
  conversion.  The pattern (rule), but not the expression, is eta-expanded if
-necessary.  (Eta-expanding the epression can lead to laziness bugs.)
+necessary.  (Eta-expanding the expression can lead to laziness bugs.)
  But not beta conversion (that's called higher-order matching).
  </para>
  
@@ -5011,7 +5150,7 @@ If you add <option>-dppr-debug</option> you get a more detailed listing.
  <listitem>
  
  <para>
- The defintion of (say) <function>build</function> in <filename>GHC/Base.lhs</filename> looks llike this:
+ The definition of (say) <function>build</function> in <filename>GHC/Base.lhs</filename> looks llike this:
  
  <programlisting>
          build   :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
@@ -5060,7 +5199,7 @@ program even if fusion doesn't happen.  More rules in <filename>GHC/List.lhs</fi
  f x = ({-# CORE "foo" #-} show) ({-# CORE "bar" #-} x)
  </programlisting>
  
-  Sematically, this is equivalent to:
+  Semantically, this is equivalent to:
  
  <programlisting>
  g x = show x