[project @ 2004-10-05 13:56:31 by simonmar]

[ghc-hetmet.git] / ghc / docs / users_guide / glasgow_exts.xml
diff --git a/ghc/docs/users_guide/glasgow_exts.xml b/ghc/docs/users_guide/glasgow_exts.xml

index 2412451..a3e1d01 100644 (file)
--- a/ghc/docs/users_guide/glasgow_exts.xml
+++ b/ghc/docs/users_guide/glasgow_exts.xml
@@ -619,7 +619,7 @@ clunky env var1 var1
  </programlisting>
  
  <para>
-The semantics should be clear enough.  The qualifers are matched in order. 
+The semantics should be clear enough.  The qualifiers are matched in order. 
  For a <literal>&lt;-</literal> qualifier, which I call a pattern guard, the
  right hand side is evaluated and matched against the pattern on the left. 
  If the match fails then the whole guard fails and the next equation is
@@ -835,26 +835,38 @@ This name is not supported by GHC.
               <literal>return</literal>, are in scope (not the Prelude
               versions).  List comprehensions, and parallel array
               comprehensions, are unaffected.  </para></listitem>
+
+             <listitem>
+               <para>Similarly recursive do notation (see
+               <xref linkend="mdo-notation"/>) uses whatever
+               <literal>mfix</literal> function is in scope, and arrow
+               notation (see <xref linkend="arrow-notation"/>)
+               uses whatever <literal>arr</literal>,
+               <literal>(>>>)</literal>, <literal>first</literal>,
+               <literal>app</literal>, <literal>(|||)</literal> and
+               <literal>loop</literal> functions are in scope.</para>
+             </listitem>
             </itemizedlist>
  
-            <para>Be warned: this is an experimental facility, with fewer checks than
-            usual.  In particular, it is essential that the functions GHC finds in scope
-            must have the appropriate types, namely:
+            <para>The functions with these names that GHC finds in scope
+            must have types matching those of the originals, namely:
              <screen>
-               fromInteger  :: forall a. (...) => Integer  -> a
-               fromRational :: forall a. (...) => Rational -> a
-               negate       :: forall a. (...) => a -> a
-               (-)          :: forall a. (...) => a -> a -> a
-               (>>=)        :: forall m a. (...) => m a -> (a -> m b) -> m b
-               (>>)         :: forall m a. (...) => m a -> m b -> m b
-               return       :: forall m a. (...) => a      -> m a
-               fail         :: forall m a. (...) => String -> m a
+               fromInteger  :: Integer  -> N
+               fromRational :: Rational -> N
+               negate       :: N -> N
+               (-)          :: N -> N -> N
+               (>>=)        :: forall a b. M a -> (a -> M b) -> M b
+               (>>)         :: forall a b. M a -> M b -> M b
+               return       :: forall a.   a      -> M a
+               fail         :: forall a.   String -> M a
              </screen>
-            (The (...) part can be any context including the empty context; that part 
-            is up to you.)
-            If the functions don't have the right type, very peculiar things may 
-            happen.  Use <literal>-dcore-lint</literal> to
-            typecheck the desugared program.  If Core Lint is happy you should be all right.</para>
+            (Here <literal>N</literal> may be any type,
+            and <literal>M</literal> any type constructor.)</para>
+
+            <para>Be warned: this is an experimental facility, with
+            fewer checks than usual.  Use <literal>-dcore-lint</literal>
+            to typecheck the desugared program.  If Core Lint is happy
+            you should be all right.</para>
  
  </sect2>
  </sect1>
@@ -945,7 +957,7 @@ like expressions.  More specifically:
  <title>Liberalised type synonyms</title>
  
  <para>
-Type synonmys are like macros at the type level, and
+Type synonyms are like macros at the type level, and
  GHC does validity checking on types <emphasis>only after expanding type synonyms</emphasis>.
  That means that GHC can be very much more liberal about type synonyms than Haskell 98:
  <itemizedlist>
@@ -994,7 +1006,7 @@ You can apply a type synonym to a partially applied type synonym:
    
    foo :: Generic Id []
  </programlisting>
-After epxanding the synonym, <literal>foo</literal> has the legal (in GHC) type:
+After expanding the synonym, <literal>foo</literal> has the legal (in GHC) type:
  <programlisting>
    foo :: forall x. x -> [x]
  </programlisting>
@@ -1446,7 +1458,7 @@ be acyclic</emphasis>.  So these class declarations are OK:
  <para>
   <emphasis>All of the class type variables must be reachable (in the sense 
  mentioned in <xref linkend="type-restrictions"/>)
-from the free varibles of each method type
+from the free variables of each method type
  </emphasis>.  For example:
  
  
@@ -1577,7 +1589,7 @@ get any more information about <literal>tv</literal>.
  <para>
  Note
  that the reachability condition is weaker than saying that <literal>a</literal> is
-functionally dependendent on a type variable free in
+functionally dependent on a type variable free in
  <literal>type</literal> (see <xref
  linkend="functional-dependencies"/>).  The reason for this is there
  might be a "hidden" dependency, in a superclass perhaps.  So
@@ -1763,7 +1775,7 @@ GHC is also conservative about committing to an overlapping instance.  For examp
    f x = op x
  </programlisting>
  From the RHS of f we get the constraint <literal>C [b]</literal>.  But
-GHC does not commit to the second instance declaration, because in a paricular
+GHC does not commit to the second instance declaration, because in a particular
  call of f, b might be instantiate to Int, so the first instance declaration
  would be appropriate.  So GHC rejects the program.  If you add <option>-fallow-incoherent-instances</option>
  GHC will instead silently pick the second instance, without complaining about 
@@ -1935,7 +1947,7 @@ allowing these idioms interesting idioms.
  <sect2 id="implicit-parameters">
  <title>Implicit parameters</title>
  
-<para> Implicit paramters are implemented as described in 
+<para> Implicit parameters are implemented as described in 
  "Implicit parameters: dynamic scoping with static types", 
  J Lewis, MB Shields, E Meijer, J Launchbury,
  27th ACM Symposium on Principles of Programming Languages (POPL'00),
@@ -2106,7 +2118,7 @@ problem that monads seem over-kill for certain sorts of problem, notably:
  </para>
  <itemizedlist>
  <listitem> <para> distributing a supply of unique names </para> </listitem>
-<listitem> <para> distributing a suppply of random numbers </para> </listitem>
+<listitem> <para> distributing a supply of random numbers </para> </listitem>
  <listitem> <para> distributing an oracle (as in QuickCheck) </para> </listitem>
  </itemizedlist>
  
@@ -2393,7 +2405,7 @@ is implicitly added by Haskell.
  </para>
  <para>
  The functions <literal>f2</literal> and <literal>g2</literal> have rank-2 types;
-the <literal>forall</literal> is on the left of a function arrrow.  As <literal>g2</literal>
+the <literal>forall</literal> is on the left of a function arrow.  As <literal>g2</literal>
  shows, the polymorphic type on the left of the function arrow can be overloaded.
  </para>
  <para>
@@ -2566,7 +2578,7 @@ matching.
  <title>Type inference</title>
  
  <para>
-In general, type inference for arbitrary-rank types is undecideable.
+In general, type inference for arbitrary-rank types is undecidable.
  GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96)
  to get a decidable algorithm by requiring some help from the programmer.
  We do not yet have a formal specification of "some help" but the rule is this:
@@ -2895,7 +2907,7 @@ A pattern type signature can occur in any pattern.  For example:
  <listitem>
  <para>
  A pattern type signature can be on an arbitrary sub-pattern, not
-ust on a variable:
+just on a variable:
  
  
  <programlisting>
@@ -3279,6 +3291,109 @@ instances is most interesting.
  </sect1>
  <!-- ==================== End of type system extensions =================  -->
    
+<!-- ====================== Generalised algebraic data types =======================  -->
+
+<sect1 id="gadt">
+<title>Generalised Algebraic Data Types</title>
+
+<para>Generalised Algebraic Data Types (GADTs) generalise ordinary algebraic data types by allowing you
+to give the type signatures of constructors explicitly.  For example:
+<programlisting>
+  data Term a where
+      Lit    :: Int -> Term Int
+      Succ   :: Term Int -> Term Int
+      IsZero :: Term Int -> Term Bool  
+      If     :: Term Bool -> Term a -> Term a -> Term a
+      Pair   :: Term a -> Term b -> Term (a,b)
+</programlisting>
+Notice that the return type of the constructors is not always <literal>Term a</literal>, as is the
+case with ordinary vanilla data types.  Now we can write a well-typed <literal>eval</literal> function
+for these <literal>Terms</literal>:
+<programlisting>
+  eval :: Term a -> a
+  eval (Lit i)             = i
+  eval (Succ t)     = 1 + eval t
+  eval (IsZero i)   = eval i == 0
+  eval (If b e1 e2) = if eval b then eval e1 else eval e2
+  eval (Pair e1 e2) = (eval e2, eval e2)
+</programlisting>
+These and many other examples are given in papers by Hongwei Xi, and Tim Sheard.
+</para>
+<para> The extensions to GHC are these:
+<itemizedlist>
+<listitem><para>
+  Data type declarations have a 'where' form, as exemplified above.  The type signature of
+each constructor is independent, and is implicitly universally quantified as usual. Unlike a normal
+Haskell data type declaration, the type variable(s) in the "<literal>data Term a where</literal>" header 
+have no scope.  Indeed, one can write a kind signature instead:
+<programlisting>
+  data Term :: * -> * where ...
+</programlisting>
+or even a mixture of the two:
+<programlisting>
+  data Foo a :: (* -> *) -> * where ...
+</programlisting>
+The type variables (if given) may be explicitly kinded, so we could also write the header for <literal>Foo</literal>
+like this:
+<programlisting>
+  data Foo a (b :: * -> *) where ...
+</programlisting>
+</para></listitem>
+
+<listitem><para>
+There are no restrictions on the type of the data constructor, except that the result
+type must begin with the type constructor being defined.  For example, in the <literal>Term</literal> data
+type above, the type of each constructor must end with <literal> ... -> Term ...</literal>.
+</para></listitem>
+
+<listitem><para>
+You cannot use a <literal>deriving</literal> clause on a GADT-style data type declaration,
+nor can you use record syntax.  (It's not clear what these constructs would mean.  For example,
+the record selectors might ill-typed.)  However, you can use strictness annotations, in the obvious places
+in the constructor type:
+<programlisting>
+  data Term a where
+      Lit    :: !Int -> Term Int
+      If     :: Term Bool -> !(Term a) -> !(Term a) -> Term a
+      Pair   :: Term a -> Term b -> Term (a,b)
+</programlisting>
+</para></listitem>
+
+<listitem><para>
+Pattern matching causes type refinement.  For example, in the right hand side of the equation
+<programlisting>
+  eval :: Term a -> a
+  eval (Lit i) =  ...
+</programlisting>
+the type <literal>a</literal> is refined to <literal>Int</literal>.  (That's the whole point!)
+A precise specification of the type rules is beyond what this user manual aspires to, but there is a paper
+about the ideas: "Wobbly types: practical type inference for generalised algebraic data types", on Simon PJ's home page.</para>
+
+<para> The general principle is this: <emphasis>type refinement is only carried out based on user-supplied type annotations</emphasis>.
+So if no type signature is supplied for <literal>eval</literal>, no type refinement happens, and lots of obscure error messages will
+occur.  However, the refinement is quite general.  For example, if we had:
+<programlisting>
+  eval :: Term a -> a -> a
+  eval (Lit i) j =  i+j
+</programlisting>
+the pattern match causes the type <literal>a</literal> to be refined to <literal>Int</literal> (because of the type
+of the constructor <literal>Lit</literal>, and that refinement also applies to the type of <literal>j</literal>, and
+the result type of the <literal>case</literal> expression.  Hence the addition <literal>i+j</literal> is legal.
+</para>
+</listitem>
+</itemizedlist>
+</para>
+
+<para>Notice that GADTs generalise existential types.  For example, these two declarations are equivalent:
+<programlisting>
+  data T a = forall b. MkT b (b->a)
+  data T' a where { MKT :: b -> (b->a) -> T a }
+</programlisting>
+</para>
+</sect1>
+
+<!-- ====================== End of Generalised algebraic data types =======================  -->
+
  <!-- ====================== TEMPLATE HASKELL =======================  -->
  
  <sect1 id="template-haskell">
@@ -4096,7 +4211,7 @@ Assertion failures can be caught, see the documentation for the
     {-# DEPRECATED f, C, T "Don't use these" #-}
  </programlisting>
           <para>When you compile any module that imports and uses any
-          of the specifed entities, GHC will print the specified
+          of the specified entities, GHC will print the specified
            message.</para>
         </listitem>
        </itemizedlist>
@@ -4152,7 +4267,7 @@ key_function :: Int -> String -> (Bool, Double)
          The normal unfolding machinery will then be very keen to
          inline it.</para>
  
-       <para>Syntactially, an <literal>INLINE</literal> pragma for a
+       <para>Syntactically, an <literal>INLINE</literal> pragma for a
          function can be put anywhere its type signature could be
          put.</para>
  
@@ -4202,7 +4317,7 @@ key_function :: Int -> String -> (Bool, Double)
          number</emphasis>; the phase number decreases towards zero.
          If you use <option>-dverbose-core2core</option> you'll see the
          sequence of phase numbers for successive runs of the
-        simpifier.  In an INLINE pragma you can optionally specify a
+        simplifier.  In an INLINE pragma you can optionally specify a
          phase number, thus:</para>
  
         <itemizedlist>
@@ -4447,7 +4562,7 @@ data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int
  <sect1 id="rewrite-rules">
  <title>Rewrite rules
  
-<indexterm><primary>RULES pagma</primary></indexterm>
+<indexterm><primary>RULES pragma</primary></indexterm>
  <indexterm><primary>pragma, RULES</primary></indexterm>
  <indexterm><primary>rewrite rules</primary></indexterm></title>
  
@@ -4613,7 +4728,7 @@ same type.
  
  <para>
   GHC makes absolutely no attempt to verify that the LHS and RHS
-of a rule have the same meaning.  That is undecideable in general, and
+of a rule have the same meaning.  That is undecidable in general, and
  infeasible in most interesting cases.  The responsibility is entirely the programmer's!
  
  </para>
@@ -4645,7 +4760,7 @@ This rule will cause the compiler to go into an infinite loop.
  for matching a rule LHS with an expression.  It seeks a substitution
  which makes the LHS and expression syntactically equal modulo alpha
  conversion.  The pattern (rule), but not the expression, is eta-expanded if
-necessary.  (Eta-expanding the epression can lead to laziness bugs.)
+necessary.  (Eta-expanding the expression can lead to laziness bugs.)
  But not beta conversion (that's called higher-order matching).
  </para>
  
@@ -5011,7 +5126,7 @@ If you add <option>-dppr-debug</option> you get a more detailed listing.
  <listitem>
  
  <para>
- The defintion of (say) <function>build</function> in <filename>GHC/Base.lhs</filename> looks llike this:
+ The definition of (say) <function>build</function> in <filename>GHC/Base.lhs</filename> looks llike this:
  
  <programlisting>
          build   :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
@@ -5060,7 +5175,7 @@ program even if fusion doesn't happen.  More rules in <filename>GHC/List.lhs</fi
  f x = ({-# CORE "foo" #-} show) ({-# CORE "bar" #-} x)
  </programlisting>
  
-  Sematically, this is equivalent to:
+  Semantically, this is equivalent to:
  
  <programlisting>
  g x = show x