</varlistentry>
<varlistentry>
+ <term><option>-farrows</option></term>
+ <indexterm><primary><option>-farrows</option></primary></indexterm>
+ <listitem>
+ <para>See <xref LinkEnd="arrow-notation">. Independent of
+ <option>-fglasgow-exts</option>.</para>
+ </listitem>
+ </varlistentry>
+
+ <varlistentry>
<term><option>-fgenerics</option></term>
<indexterm><primary><option>-fgenerics</option></primary></indexterm>
<listitem>
<!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
<!-- included from primitives.sgml -->
-&primitives;
-
-
-<!-- TYPE SYSTEM EXTENSIONS -->
-<sect1 id="type-extensions">
-<title>Type system extensions</title>
-
-<sect2 id="nullary-types">
-<title>Data types with no constructors</title>
-
-<para>With the <option>-fglasgow-exts</option> flag, GHC lets you declare
-a data type with no constructors. For example:</para>
-
-<programlisting>
- data S -- S :: *
- data T a -- T :: * -> *
-</programlisting>
-
-<para>Syntactically, the declaration lacks the "= constrs" part. The
-type can be parameterised over types of any kind, but if the kind is
-not <literal>*</literal> then an explicit kind annotation must be used
-(see <xref linkend="sec-kinding">).</para>
-
-<para>Such data types have only one value, namely bottom.
-Nevertheless, they can be useful when defining "phantom types".</para>
-</sect2>
-
-<sect2 id="infix-tycons">
-<title>Infix type constructors</title>
+<!-- &primitives; -->
+<sect1 id="primitives">
+ <title>Unboxed types and primitive operations</title>
+
+<para>GHC is built on a raft of primitive data types and operations.
+While you really can use this stuff to write fast code,
+ we generally find it a lot less painful, and more satisfying in the
+ long run, to use higher-level language features and libraries. With
+ any luck, the code you write will be optimised to the efficient
+ unboxed version in any case. And if it isn't, we'd like to know
+ about it.</para>
+
+<para>We do not currently have good, up-to-date documentation about the
+primitives, perhaps because they are mainly intended for internal use.
+There used to be a long section about them here in the User Guide, but it
+became out of date, and wrong information is worse than none.</para>
+
+<para>The Real Truth about what primitive types there are, and what operations
+work over those types, is held in the file
+<filename>fptools/ghc/compiler/prelude/primops.txt</filename>.
+This file is used directly to generate GHC's primitive-operation definitions, so
+it is always correct! It is also intended for processing into text.</para>
+
+<para> Indeed,
+the result of such processing is part of the description of the
+ <ulink
+ url="http://haskell.cs.yale.edu/ghc/docs/papers/core.ps.gz">External
+ Core language</ulink>.
+So that document is a good place to look for a type-set version.
+We would be very happy if someone wanted to volunteer to produce an SGML
+back end to the program that processes <filename>primops.txt</filename> so that
+we could include the results here in the User Guide.</para>
+
+<para>What follows here is a brief summary of some main points.</para>
+
+<sect2 id="glasgow-unboxed">
+<title>Unboxed types
+</title>
<para>
-GHC allows type constructors to be operators, and to be written infix, very much
-like expressions. More specifically:
-<itemizedlist>
-<listitem><para>
- A type constructor can be an operator, beginning with a colon; e.g. <literal>:*:</literal>.
- The lexical syntax is the same as that for data constructors.
- </para></listitem>
-<listitem><para>
- Types can be written infix. For example <literal>Int :*: Bool</literal>.
- </para></listitem>
-<listitem><para>
- Back-quotes work
- as for expressions, both for type constructors and type variables; e.g. <literal>Int `Either` Bool</literal>, or
- <literal>Int `a` Bool</literal>. Similarly, parentheses work the same; e.g. <literal>(:*:) Int Bool</literal>.
- </para></listitem>
-<listitem><para>
- Fixities may be declared for type constructors just as for data constructors. However,
- one cannot distinguish between the two in a fixity declaration; a fixity declaration
- sets the fixity for a data constructor and the corresponding type constructor. For example:
-<screen>
- infixl 7 T, :*:
-</screen>
- sets the fixity for both type constructor <literal>T</literal> and data constructor <literal>T</literal>,
- and similarly for <literal>:*:</literal>.
- <literal>Int `a` Bool</literal>.
- </para></listitem>
-<listitem><para>
- Function arrow is <literal>infixr</literal> with fixity 0. (This might change; I'm not sure what it should be.)
- </para></listitem>
-<listitem><para>
- Data type and type-synonym declarations can be written infix. E.g.
-<screen>
- data a :*: b = Foo a b
- type a :+: b = Either a b
-</screen>
- </para></listitem>
-<listitem><para>
- The only thing that differs between operators in types and operators in expressions is that
- ordinary non-constructor operators, such as <literal>+</literal> and <literal>*</literal>
- are not allowed in types. Reason: the uniform thing to do would be to make them type
- variables, but that's not very useful. A less uniform but more useful thing would be to
- allow them to be type <emphasis>constructors</emphasis>. But that gives trouble in export
- lists. So for now we just exclude them.
- </para></listitem>
+<indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm>
+</para>
-</itemizedlist>
+<para>Most types in GHC are <firstterm>boxed</firstterm>, which means
+that values of that type are represented by a pointer to a heap
+object. The representation of a Haskell <literal>Int</literal>, for
+example, is a two-word heap object. An <firstterm>unboxed</firstterm>
+type, however, is represented by the value itself, no pointers or heap
+allocation are involved.
</para>
-</sect2>
-<sect2 id="class-method-types">
-<title>Class method types
-</title>
<para>
-Haskell 98 prohibits class method types to mention constraints on the
-class type variable, thus:
-<programlisting>
- class Seq s a where
- fromList :: [a] -> s a
- elem :: Eq a => a -> s a -> Bool
-</programlisting>
-The type of <literal>elem</literal> is illegal in Haskell 98, because it
-contains the constraint <literal>Eq a</literal>, constrains only the
-class type variable (in this case <literal>a</literal>).
+Unboxed types correspond to the “raw machine” types you
+would use in C: <literal>Int#</literal> (long int),
+<literal>Double#</literal> (double), <literal>Addr#</literal>
+(void *), etc. The <emphasis>primitive operations</emphasis>
+(PrimOps) on these types are what you might expect; e.g.,
+<literal>(+#)</literal> is addition on
+<literal>Int#</literal>s, and is the machine-addition that we all
+know and love—usually one instruction.
</para>
+
<para>
-With the <option>-fglasgow-exts</option> GHC lifts this restriction.
+Primitive (unboxed) types cannot be defined in Haskell, and are
+therefore built into the language and compiler. Primitive types are
+always unlifted; that is, a value of a primitive type cannot be
+bottom. We use the convention that primitive types, values, and
+operations have a <literal>#</literal> suffix.
</para>
-</sect2>
-
-<sect2 id="multi-param-type-classes">
-<title>Multi-parameter type classes
-</title>
-
<para>
-This section documents GHC's implementation of multi-parameter type
-classes. There's lots of background in the paper <ULink
-URL="http://research.microsoft.com/~simonpj/multi.ps.gz" >Type
-classes: exploring the design space</ULink > (Simon Peyton Jones, Mark
-Jones, Erik Meijer).
+Primitive values are often represented by a simple bit-pattern, such
+as <literal>Int#</literal>, <literal>Float#</literal>,
+<literal>Double#</literal>. But this is not necessarily the case:
+a primitive value might be represented by a pointer to a
+heap-allocated object. Examples include
+<literal>Array#</literal>, the type of primitive arrays. A
+primitive array is heap-allocated because it is too big a value to fit
+in a register, and would be too expensive to copy around; in a sense,
+it is accidental that it is represented by a pointer. If a pointer
+represents a primitive value, then it really does point to that value:
+no unevaluated thunks, no indirections…nothing can be at the
+other end of the pointer than the primitive value.
</para>
<para>
-I'd like to thank people who reported shorcomings in the GHC 3.02
-implementation. Our default decisions were all conservative ones, and
-the experience of these heroic pioneers has given useful concrete
-examples to support several generalisations. (These appear below as
-design choices not implemented in 3.02.)
+There are some restrictions on the use of primitive types, the main
+one being that you can't pass a primitive value to a polymorphic
+function or store one in a polymorphic data type. This rules out
+things like <literal>[Int#]</literal> (i.e. lists of primitive
+integers). The reason for this restriction is that polymorphic
+arguments and constructor fields are assumed to be pointers: if an
+unboxed integer is stored in one of these, the garbage collector would
+attempt to follow it, leading to unpredictable space leaks. Or a
+<function>seq</function> operation on the polymorphic component may
+attempt to dereference the pointer, with disastrous results. Even
+worse, the unboxed value might be larger than a pointer
+(<literal>Double#</literal> for instance).
</para>
<para>
-I've discussed these notes with Mark Jones, and I believe that Hugs
-will migrate towards the same design choices as I outline here.
-Thanks to him, and to many others who have offered very useful
-feedback.
+Nevertheless, A numerically-intensive program using unboxed types can
+go a <emphasis>lot</emphasis> faster than its “standard”
+counterpart—we saw a threefold speedup on one example.
</para>
-<sect3>
-<title>Types</title>
+</sect2>
+
+<sect2 id="unboxed-tuples">
+<title>Unboxed Tuples
+</title>
<para>
-There are the following restrictions on the form of a qualified
-type:
+Unboxed tuples aren't really exported by <literal>GHC.Exts</literal>,
+they're available by default with <option>-fglasgow-exts</option>. An
+unboxed tuple looks like this:
</para>
<para>
<programlisting>
- forall tv1..tvn (c1, ...,cn) => type
+(# e_1, ..., e_n #)
</programlisting>
</para>
<para>
-(Here, I write the "foralls" explicitly, although the Haskell source
-language omits them; in Haskell 1.4, all the free type variables of an
-explicit source-language type signature are universally quantified,
-except for the class type variables in a class declaration. However,
-in GHC, you can give the foralls if you want. See <xref LinkEnd="universal-quantification">).
+where <literal>e_1..e_n</literal> are expressions of any
+type (primitive or non-primitive). The type of an unboxed tuple looks
+the same.
</para>
<para>
-
-<OrderedList>
-<listitem>
+Unboxed tuples are used for functions that need to return multiple
+values, but they avoid the heap allocation normally associated with
+using fully-fledged tuples. When an unboxed tuple is returned, the
+components are put directly into registers or on the stack; the
+unboxed tuple itself does not have a composite representation. Many
+of the primitive operations listed in this section return unboxed
+tuples.
+</para>
<para>
- <emphasis>Each universally quantified type variable
-<literal>tvi</literal> must be mentioned (i.e. appear free) in <literal>type</literal></emphasis>.
-
-The reason for this is that a value with a type that does not obey
-this restriction could not be used without introducing
-ambiguity. Here, for example, is an illegal type:
-
+There are some pretty stringent restrictions on the use of unboxed tuples:
+</para>
-<programlisting>
- forall a. Eq a => Int
-</programlisting>
+<para>
+<itemizedlist>
+<listitem>
-When a value with this type was used, the constraint <literal>Eq tv</literal>
-would be introduced where <literal>tv</literal> is a fresh type variable, and
-(in the dictionary-translation implementation) the value would be
-applied to a dictionary for <literal>Eq tv</literal>. The difficulty is that we
-can never know which instance of <literal>Eq</literal> to use because we never
-get any more information about <literal>tv</literal>.
+<para>
+ Unboxed tuple types are subject to the same restrictions as
+other unboxed types; i.e. they may not be stored in polymorphic data
+structures or passed to polymorphic functions.
</para>
</listitem>
<listitem>
<para>
- <emphasis>Every constraint <literal>ci</literal> must mention at least one of the
-universally quantified type variables <literal>tvi</literal></emphasis>.
+ Unboxed tuples may only be constructed as the direct result of
+a function, and may only be deconstructed with a <literal>case</literal> expression.
+eg. the following are valid:
-For example, this type is OK because <literal>C a b</literal> mentions the
-universally quantified type variable <literal>b</literal>:
+
+<programlisting>
+f x y = (# x+1, y-1 #)
+g x = case f x x of { (# a, b #) -> a + b }
+</programlisting>
+
+
+but the following are invalid:
<programlisting>
- forall a. C a b => burble
+f x y = g (# x, y #)
+g (# x, y #) = x + y
</programlisting>
-The next type is illegal because the constraint <literal>Eq b</literal> does not
-mention <literal>a</literal>:
+</para>
+</listitem>
+<listitem>
+
+<para>
+ No variable can have an unboxed tuple type. This is illegal:
<programlisting>
- forall a. Eq b => burble
+f :: (# Int, Int #) -> (# Int, Int #)
+f x = x
</programlisting>
-The reason for this restriction is milder than the other one. The
-excluded types are never useful or necessary (because the offending
-context doesn't need to be witnessed at this point; it can be floated
-out). Furthermore, floating them out increases sharing. Lastly,
-excluding them is a conservative choice; it leaves a patch of
-territory free in case we need it later.
+because <literal>x</literal> has an unboxed tuple type.
</para>
</listitem>
-</OrderedList>
+</itemizedlist>
</para>
<para>
-These restrictions apply to all types, whether declared in a type signature
-or inferred.
+Note: we may relax some of these restrictions in the future.
</para>
<para>
-Unlike Haskell 1.4, constraints in types do <emphasis>not</emphasis> have to be of
-the form <emphasis>(class type-variables)</emphasis>. Thus, these type signatures
-are perfectly OK
+The <literal>IO</literal> and <literal>ST</literal> monads use unboxed
+tuples to avoid unnecessary allocation during sequences of operations.
</para>
-<para>
+</sect2>
+</sect1>
-<programlisting>
- f :: Eq (m a) => [m a] -> [m a]
- g :: Eq [a] => ...
-</programlisting>
-</para>
+<!-- ====================== SYNTACTIC EXTENSIONS ======================= -->
-<para>
-This choice recovers principal types, a property that Haskell 1.4 does not have.
-</para>
+<sect1 id="syntax-extns">
+<title>Syntactic extensions</title>
+
+ <!-- ====================== HIERARCHICAL MODULES ======================= -->
-</sect3>
+ <sect2 id="hierarchical-modules">
+ <title>Hierarchical Modules</title>
-<sect3>
-<title>Class declarations</title>
+ <para>GHC supports a small extension to the syntax of module
+ names: a module name is allowed to contain a dot
+ <literal>‘.’</literal>. This is also known as the
+ “hierarchical module namespace” extension, because
+ it extends the normally flat Haskell module namespace into a
+ more flexible hierarchy of modules.</para>
-<para>
+ <para>This extension has very little impact on the language
+ itself; modules names are <emphasis>always</emphasis> fully
+ qualified, so you can just think of the fully qualified module
+ name as <quote>the module name</quote>. In particular, this
+ means that the full module name must be given after the
+ <literal>module</literal> keyword at the beginning of the
+ module; for example, the module <literal>A.B.C</literal> must
+ begin</para>
-<OrderedList>
-<listitem>
+<programlisting>module A.B.C</programlisting>
-<para>
- <emphasis>Multi-parameter type classes are permitted</emphasis>. For example:
+ <para>It is a common strategy to use the <literal>as</literal>
+ keyword to save some typing when using qualified names with
+ hierarchical modules. For example:</para>
<programlisting>
- class Collection c a where
- union :: c a -> c a -> c a
- ...etc.
+import qualified Control.Monad.ST.Strict as ST
</programlisting>
+ <para>Hierarchical modules have an impact on the way that GHC
+ searches for files. For a description, see <xref
+ linkend="finding-hierarchical-modules">.</para>
+ <para>GHC comes with a large collection of libraries arranged
+ hierarchically; see the accompanying library documentation.
+ There is an ongoing project to create and maintain a stable set
+ of <quote>core</quote> libraries used by several Haskell
+ compilers, and the libraries that GHC comes with represent the
+ current status of that project. For more details, see <ulink
+ url="http://www.haskell.org/~simonmar/libraries/libraries.html">Haskell
+ Libraries</ulink>.</para>
-</para>
-</listitem>
-<listitem>
-
-<para>
- <emphasis>The class hierarchy must be acyclic</emphasis>. However, the definition
-of "acyclic" involves only the superclass relationships. For example,
-this is OK:
-
-
-<programlisting>
- class C a where {
- op :: D b => a -> b -> b
- }
-
- class C a => D a where { ... }
-</programlisting>
+ </sect2>
+ <!-- ====================== PATTERN GUARDS ======================= -->
-Here, <literal>C</literal> is a superclass of <literal>D</literal>, but it's OK for a
-class operation <literal>op</literal> of <literal>C</literal> to mention <literal>D</literal>. (It
-would not be OK for <literal>D</literal> to be a superclass of <literal>C</literal>.)
+<sect2 id="pattern-guards">
+<title>Pattern guards</title>
+<para>
+<indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
+The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ULink URL="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ULink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
</para>
-</listitem>
-<listitem>
<para>
- <emphasis>There are no restrictions on the context in a class declaration
-(which introduces superclasses), except that the class hierarchy must
-be acyclic</emphasis>. So these class declarations are OK:
-
+Suppose we have an abstract data type of finite maps, with a
+lookup operation:
<programlisting>
- class Functor (m k) => FiniteMap m k where
- ...
-
- class (Monad m, Monad (t m)) => Transform t m where
- lift :: m a -> (t m) a
+lookup :: FiniteMap -> Int -> Maybe Int
</programlisting>
-
+The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
+where <VarName>v</VarName> is the value that the key maps to. Now consider the following definition:
</para>
-</listitem>
-<listitem>
-
-<para>
- <emphasis>In the signature of a class operation, every constraint
-must mention at least one type variable that is not a class type
-variable</emphasis>.
-
-Thus:
-
<programlisting>
- class Collection c a where
- mapC :: Collection c b => (a->b) -> c a -> c b
+clunky env var1 var2 | ok1 && ok2 = val1 + val2
+| otherwise = var1 + var2
+where
+ m1 = lookup env var1
+ m2 = lookup env var2
+ ok1 = maybeToBool m1
+ ok2 = maybeToBool m2
+ val1 = expectJust m1
+ val2 = expectJust m2
</programlisting>
-
-is OK because the constraint <literal>(Collection a b)</literal> mentions
-<literal>b</literal>, even though it also mentions the class variable
-<literal>a</literal>. On the other hand:
-
+<para>
+The auxiliary functions are
+</para>
<programlisting>
- class C a where
- op :: Eq a => (a,b) -> (a,b)
-</programlisting>
-
-
-is not OK because the constraint <literal>(Eq a)</literal> mentions on the class
-type variable <literal>a</literal>, but not <literal>b</literal>. However, any such
-example is easily fixed by moving the offending context up to the
-superclass context:
-
+maybeToBool :: Maybe a -> Bool
+maybeToBool (Just x) = True
+maybeToBool Nothing = False
-<programlisting>
- class Eq a => C a where
- op ::(a,b) -> (a,b)
+expectJust :: Maybe a -> a
+expectJust (Just x) = x
+expectJust Nothing = error "Unexpected Nothing"
</programlisting>
-
-A yet more relaxed rule would allow the context of a class-op signature
-to mention only class type variables. However, that conflicts with
-Rule 1(b) for types above.
-
+<para>
+What is <function>clunky</function> doing? The guard <literal>ok1 &&
+ok2</literal> checks that both lookups succeed, using
+<function>maybeToBool</function> to convert the <function>Maybe</function>
+types to booleans. The (lazily evaluated) <function>expectJust</function>
+calls extract the values from the results of the lookups, and binds the
+returned values to <VarName>val1</VarName> and <VarName>val2</VarName>
+respectively. If either lookup fails, then clunky takes the
+<literal>otherwise</literal> case and returns the sum of its arguments.
</para>
-</listitem>
-<listitem>
<para>
- <emphasis>The type of each class operation must mention <emphasis>all</emphasis> of
-the class type variables</emphasis>. For example:
-
-
-<programlisting>
- class Coll s a where
- empty :: s
- insert :: s -> a -> s
-</programlisting>
-
-
-is not OK, because the type of <literal>empty</literal> doesn't mention
-<literal>a</literal>. This rule is a consequence of Rule 1(a), above, for
-types, and has the same motivation.
-
-Sometimes, offending class declarations exhibit misunderstandings. For
-example, <literal>Coll</literal> might be rewritten
-
-
-<programlisting>
- class Coll s a where
- empty :: s a
- insert :: s a -> a -> s a
-</programlisting>
-
-
-which makes the connection between the type of a collection of
-<literal>a</literal>'s (namely <literal>(s a)</literal>) and the element type <literal>a</literal>.
-Occasionally this really doesn't work, in which case you can split the
-class like this:
-
+This is certainly legal Haskell, but it is a tremendously verbose and
+un-obvious way to achieve the desired effect. Arguably, a more direct way
+to write clunky would be to use case expressions:
+</para>
<programlisting>
- class CollE s where
- empty :: s
-
- class CollE s => Coll s a where
- insert :: s -> a -> s
+clunky env var1 var1 = case lookup env var1 of
+ Nothing -> fail
+ Just val1 -> case lookup env var2 of
+ Nothing -> fail
+ Just val2 -> val1 + val2
+where
+ fail = val1 + val2
</programlisting>
-
+<para>
+This is a bit shorter, but hardly better. Of course, we can rewrite any set
+of pattern-matching, guarded equations as case expressions; that is
+precisely what the compiler does when compiling equations! The reason that
+Haskell provides guarded equations is because they allow us to write down
+the cases we want to consider, one at a time, independently of each other.
+This structure is hidden in the case version. Two of the right-hand sides
+are really the same (<function>fail</function>), and the whole expression
+tends to become more and more indented.
</para>
-</listitem>
-
-</OrderedList>
+<para>
+Here is how I would write clunky:
</para>
-</sect3>
-
-<sect3 id="instance-decls">
-<title>Instance declarations</title>
+<programlisting>
+clunky env var1 var1
+ | Just val1 <- lookup env var1
+ , Just val2 <- lookup env var2
+ = val1 + val2
+...other equations for clunky...
+</programlisting>
<para>
-
-<OrderedList>
-<listitem>
+The semantics should be clear enough. The qualifers are matched in order.
+For a <literal><-</literal> qualifier, which I call a pattern guard, the
+right hand side is evaluated and matched against the pattern on the left.
+If the match fails then the whole guard fails and the next equation is
+tried. If it succeeds, then the appropriate binding takes place, and the
+next qualifier is matched, in the augmented environment. Unlike list
+comprehensions, however, the type of the expression to the right of the
+<literal><-</literal> is the same as the type of the pattern to its
+left. The bindings introduced by pattern guards scope over all the
+remaining guard qualifiers, and over the right hand side of the equation.
+</para>
<para>
- <emphasis>Instance declarations may not overlap</emphasis>. The two instance
-declarations
-
+Just as with list comprehensions, boolean expressions can be freely mixed
+with among the pattern guards. For example:
+</para>
<programlisting>
- instance context1 => C type1 where ...
- instance context2 => C type2 where ...
+f x | [y] <- x
+ , y > 3
+ , Just z <- h y
+ = ...
</programlisting>
+<para>
+Haskell's current guards therefore emerge as a special case, in which the
+qualifier list has just one element, a boolean expression.
+</para>
+</sect2>
-"overlap" if <literal>type1</literal> and <literal>type2</literal> unify
-
-However, if you give the command line option
-<option>-fallow-overlapping-instances</option><indexterm><primary>-fallow-overlapping-instances
-option</primary></indexterm> then overlapping instance declarations are permitted.
-However, GHC arranges never to commit to using an instance declaration
-if another instance declaration also applies, either now or later.
+ <!-- ===================== Recursive do-notation =================== -->
-<itemizedlist>
-<listitem>
+<sect2 id="mdo-notation">
+<title>The recursive do-notation
+</title>
+<para> The recursive do-notation (also known as mdo-notation) is implemented as described in
+"A recursive do for Haskell",
+Levent Erkok, John Launchbury",
+Haskell Workshop 2002, pages: 29-37. Pittsburgh, Pennsylvania.
+</para>
<para>
- EITHER <literal>type1</literal> and <literal>type2</literal> do not unify
+The do-notation of Haskell does not allow <emphasis>recursive bindings</emphasis>,
+that is, the variables bound in a do-expression are visible only in the textually following
+code block. Compare this to a let-expression, where bound variables are visible in the entire binding
+group. It turns out that several applications can benefit from recursive bindings in
+the do-notation, and this extension provides the necessary syntactic support.
</para>
-</listitem>
-<listitem>
-
<para>
- OR <literal>type2</literal> is a substitution instance of <literal>type1</literal>
-(but not identical to <literal>type1</literal>), or vice versa.
+Here is a simple (yet contrived) example:
</para>
-</listitem>
-</itemizedlist>
-Notice that these rules
-<itemizedlist>
-<listitem>
+<programlisting>
+import Control.Monad.Fix
+justOnes = mdo xs <- Just (1:xs)
+ return xs
+</programlisting>
<para>
- make it clear which instance decl to use
-(pick the most specific one that matches)
-
+As you can guess <literal>justOnes</literal> will evaluate to <literal>Just [1,1,1,...</literal>.
</para>
-</listitem>
-<listitem>
<para>
- do not mention the contexts <literal>context1</literal>, <literal>context2</literal>
-Reason: you can pick which instance decl
-"matches" based on the type.
+The Control.Monad.Fix library introduces the <literal>MonadFix</literal> class. It's definition is:
</para>
-</listitem>
-
-</itemizedlist>
-However the rules are over-conservative. Two instance declarations can overlap,
-but it can still be clear in particular situations which to use. For example:
<programlisting>
- instance C (Int,a) where ...
- instance C (a,Bool) where ...
+class Monad m => MonadFix m where
+ mfix :: (a -> m a) -> m a
</programlisting>
-These are rejected by GHC's rules, but it is clear what to do when trying
-to solve the constraint <literal>C (Int,Int)</literal> because the second instance
-cannot apply. Yell if this restriction bites you.
+<para>
+The function <literal>mfix</literal>
+dictates how the required recursion operation should be performed. If recursive bindings are required for a monad,
+then that monad must be declared an instance of the <literal>MonadFix</literal> class.
+For details, see the above mentioned reference.
</para>
<para>
-GHC is also conservative about committing to an overlapping instance. For example:
-<programlisting>
- class C a where { op :: a -> a }
- instance C [Int] where ...
- instance C a => C [a] where ...
-
- f :: C b => [b] -> [b]
- f x = op x
-</programlisting>
-From the RHS of f we get the constraint <literal>C [b]</literal>. But
-GHC does not commit to the second instance declaration, because in a paricular
-call of f, b might be instantiate to Int, so the first instance declaration
-would be appropriate. So GHC rejects the program. If you add <option>-fallow-incoherent-instances</option>
-GHC will instead silently pick the second instance, without complaining about
-the problem of subsequent instantiations.
+The following instances of <literal>MonadFix</literal> are automatically provided: List, Maybe, IO.
+Furthermore, the Control.Monad.ST and Control.Monad.ST.Lazy modules provide the instances of the MonadFix class
+for Haskell's internal state monad (strict and lazy, respectively).
</para>
<para>
-Regrettably, GHC doesn't guarantee to detect overlapping instance
-declarations if they appear in different modules. GHC can "see" the
-instance declarations in the transitive closure of all the modules
-imported by the one being compiled, so it can "see" all instance decls
-when it is compiling <literal>Main</literal>. However, it currently chooses not
-to look at ones that can't possibly be of use in the module currently
-being compiled, in the interests of efficiency. (Perhaps we should
-change that decision, at least for <literal>Main</literal>.)
+There are three important points in using the recursive-do notation:
+<itemizedlist>
+<listitem><para>
+The recursive version of the do-notation uses the keyword <literal>mdo</literal> (rather
+than <literal>do</literal>).
+</para></listitem>
+
+<listitem><para>
+You should <literal>import Control.Monad.Fix</literal>.
+(Note: Strictly speaking, this import is required only when you need to refer to the name
+<literal>MonadFix</literal> in your program, but the import is always safe, and the programmers
+are encouraged to always import this module when using the mdo-notation.)
+</para></listitem>
+<listitem><para>
+As with other extensions, ghc should be given the flag <literal>-fglasgow-exts</literal>
+</para></listitem>
+</itemizedlist>
</para>
-</listitem>
-<listitem>
<para>
- <emphasis>There are no restrictions on the type in an instance
-<emphasis>head</emphasis>, except that at least one must not be a type variable</emphasis>.
-The instance "head" is the bit after the "=>" in an instance decl. For
-example, these are OK:
+The web page: <ulink url="http://www.cse.ogi.edu/PacSoft/projects/rmb">http://www.cse.ogi.edu/PacSoft/projects/rmb</ulink>
+contains up to date information on recursive monadic bindings.
+</para>
+<para>
+Historical note: The old implementation of the mdo-notation (and most
+of the existing documents) used the name
+<literal>MonadRec</literal> for the class and the corresponding library.
+This name is not supported by GHC.
+</para>
-<programlisting>
- instance C Int a where ...
+</sect2>
- instance D (Int, Int) where ...
- instance E [[a]] where ...
-</programlisting>
+ <!-- ===================== PARALLEL LIST COMPREHENSIONS =================== -->
+ <sect2 id="parallel-list-comprehensions">
+ <title>Parallel List Comprehensions</title>
+ <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary>
+ </indexterm>
+ <indexterm><primary>parallel list comprehensions</primary>
+ </indexterm>
-Note that instance heads <emphasis>may</emphasis> contain repeated type variables.
-For example, this is OK:
+ <para>Parallel list comprehensions are a natural extension to list
+ comprehensions. List comprehensions can be thought of as a nice
+ syntax for writing maps and filters. Parallel comprehensions
+ extend this to include the zipWith family.</para>
+ <para>A parallel list comprehension has multiple independent
+ branches of qualifier lists, each separated by a `|' symbol. For
+ example, the following zips together two lists:</para>
<programlisting>
- instance Stateful (ST s) (MutVar s) where ...
+ [ (x, y) | x <- xs | y <- ys ]
+</programlisting>
+
+ <para>The behavior of parallel list comprehensions follows that of
+ zip, in that the resulting list will have the same length as the
+ shortest branch.</para>
+
+ <para>We can define parallel list comprehensions by translation to
+ regular comprehensions. Here's the basic idea:</para>
+
+ <para>Given a parallel comprehension of the form: </para>
+
+<programlisting>
+ [ e | p1 <- e11, p2 <- e12, ...
+ | q1 <- e21, q2 <- e22, ...
+ ...
+ ]
+</programlisting>
+
+ <para>This will be translated to: </para>
+
+<programlisting>
+ [ e | ((p1,p2), (q1,q2), ...) <- zipN [(p1,p2) | p1 <- e11, p2 <- e12, ...]
+ [(q1,q2) | q1 <- e21, q2 <- e22, ...]
+ ...
+ ]
+</programlisting>
+
+ <para>where `zipN' is the appropriate zip for the given number of
+ branches.</para>
+
+ </sect2>
+
+<sect2 id="rebindable-syntax">
+<title>Rebindable syntax</title>
+
+
+ <para>GHC allows most kinds of built-in syntax to be rebound by
+ the user, to facilitate replacing the <literal>Prelude</literal>
+ with a home-grown version, for example.</para>
+
+ <para>You may want to define your own numeric class
+ hierarchy. It completely defeats that purpose if the
+ literal "1" means "<literal>Prelude.fromInteger
+ 1</literal>", which is what the Haskell Report specifies.
+ So the <option>-fno-implicit-prelude</option> flag causes
+ the following pieces of built-in syntax to refer to
+ <emphasis>whatever is in scope</emphasis>, not the Prelude
+ versions:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>Integer and fractional literals mean
+ "<literal>fromInteger 1</literal>" and
+ "<literal>fromRational 3.2</literal>", not the
+ Prelude-qualified versions; both in expressions and in
+ patterns. </para>
+ <para>However, the standard Prelude <literal>Eq</literal> class
+ is still used for the equality test necessary for literal patterns.</para>
+ </listitem>
+
+ <listitem>
+ <para>Negation (e.g. "<literal>- (f x)</literal>")
+ means "<literal>negate (f x)</literal>" (not
+ <literal>Prelude.negate</literal>).</para>
+ </listitem>
+
+ <listitem>
+ <para>In an n+k pattern, the standard Prelude
+ <literal>Ord</literal> class is still used for comparison,
+ but the necessary subtraction uses whatever
+ "<literal>(-)</literal>" is in scope (not
+ "<literal>Prelude.(-)</literal>").</para>
+ </listitem>
+
+ <listitem>
+ <para>"Do" notation is translated using whatever
+ functions <literal>(>>=)</literal>,
+ <literal>(>>)</literal>, <literal>fail</literal>, and
+ <literal>return</literal>, are in scope (not the Prelude
+ versions). List comprehensions, and parallel array
+ comprehensions, are unaffected. </para></listitem>
+ </itemizedlist>
+
+ <para>Be warned: this is an experimental facility, with fewer checks than
+ usual. In particular, it is essential that the functions GHC finds in scope
+ must have the appropriate types, namely:
+ <screen>
+ fromInteger :: forall a. (...) => Integer -> a
+ fromRational :: forall a. (...) => Rational -> a
+ negate :: forall a. (...) => a -> a
+ (-) :: forall a. (...) => a -> a -> a
+ (>>=) :: forall m a. (...) => m a -> (a -> m b) -> m b
+ (>>) :: forall m a. (...) => m a -> m b -> m b
+ return :: forall m a. (...) => a -> m a
+ fail :: forall m a. (...) => String -> m a
+ </screen>
+ (The (...) part can be any context including the empty context; that part
+ is up to you.)
+ If the functions don't have the right type, very peculiar things may
+ happen. Use <literal>-dcore-lint</literal> to
+ typecheck the desugared program. If Core Lint is happy you should be all right.</para>
+
+</sect2>
+</sect1>
+
+
+<!-- TYPE SYSTEM EXTENSIONS -->
+<sect1 id="type-extensions">
+<title>Type system extensions</title>
+
+<sect2 id="nullary-types">
+<title>Data types with no constructors</title>
+
+<para>With the <option>-fglasgow-exts</option> flag, GHC lets you declare
+a data type with no constructors. For example:</para>
+
+<programlisting>
+ data S -- S :: *
+ data T a -- T :: * -> *
+</programlisting>
+
+<para>Syntactically, the declaration lacks the "= constrs" part. The
+type can be parameterised over types of any kind, but if the kind is
+not <literal>*</literal> then an explicit kind annotation must be used
+(see <xref linkend="sec-kinding">).</para>
+
+<para>Such data types have only one value, namely bottom.
+Nevertheless, they can be useful when defining "phantom types".</para>
+</sect2>
+
+<sect2 id="infix-tycons">
+<title>Infix type constructors</title>
+
+<para>
+GHC allows type constructors to be operators, and to be written infix, very much
+like expressions. More specifically:
+<itemizedlist>
+<listitem><para>
+ A type constructor can be an operator, beginning with a colon; e.g. <literal>:*:</literal>.
+ The lexical syntax is the same as that for data constructors.
+ </para></listitem>
+<listitem><para>
+ Types can be written infix. For example <literal>Int :*: Bool</literal>.
+ </para></listitem>
+<listitem><para>
+ Back-quotes work
+ as for expressions, both for type constructors and type variables; e.g. <literal>Int `Either` Bool</literal>, or
+ <literal>Int `a` Bool</literal>. Similarly, parentheses work the same; e.g. <literal>(:*:) Int Bool</literal>.
+ </para></listitem>
+<listitem><para>
+ Fixities may be declared for type constructors just as for data constructors. However,
+ one cannot distinguish between the two in a fixity declaration; a fixity declaration
+ sets the fixity for a data constructor and the corresponding type constructor. For example:
+<screen>
+ infixl 7 T, :*:
+</screen>
+ sets the fixity for both type constructor <literal>T</literal> and data constructor <literal>T</literal>,
+ and similarly for <literal>:*:</literal>.
+ <literal>Int `a` Bool</literal>.
+ </para></listitem>
+<listitem><para>
+ Function arrow is <literal>infixr</literal> with fixity 0. (This might change; I'm not sure what it should be.)
+ </para></listitem>
+<listitem><para>
+ Data type and type-synonym declarations can be written infix. E.g.
+<screen>
+ data a :*: b = Foo a b
+ type a :+: b = Either a b
+</screen>
+ </para></listitem>
+<listitem><para>
+ The only thing that differs between operators in types and operators in expressions is that
+ ordinary non-constructor operators, such as <literal>+</literal> and <literal>*</literal>
+ are not allowed in types. Reason: the uniform thing to do would be to make them type
+ variables, but that's not very useful. A less uniform but more useful thing would be to
+ allow them to be type <emphasis>constructors</emphasis>. But that gives trouble in export
+ lists. So for now we just exclude them.
+ </para></listitem>
+
+</itemizedlist>
+</para>
+</sect2>
+
+<sect2 id="sec-kinding">
+<title>Explicitly-kinded quantification</title>
+
+<para>
+Haskell infers the kind of each type variable. Sometimes it is nice to be able
+to give the kind explicitly as (machine-checked) documentation,
+just as it is nice to give a type signature for a function. On some occasions,
+it is essential to do so. For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999)
+John Hughes had to define the data type:
+<Screen>
+ data Set cxt a = Set [a]
+ | Unused (cxt a -> ())
+</Screen>
+The only use for the <literal>Unused</literal> constructor was to force the correct
+kind for the type variable <literal>cxt</literal>.
+</para>
+<para>
+GHC now instead allows you to specify the kind of a type variable directly, wherever
+a type variable is explicitly bound. Namely:
+<itemizedlist>
+<listitem><para><literal>data</literal> declarations:
+<Screen>
+ data Set (cxt :: * -> *) a = Set [a]
+</Screen></para></listitem>
+<listitem><para><literal>type</literal> declarations:
+<Screen>
+ type T (f :: * -> *) = f Int
+</Screen></para></listitem>
+<listitem><para><literal>class</literal> declarations:
+<Screen>
+ class (Eq a) => C (f :: * -> *) a where ...
+</Screen></para></listitem>
+<listitem><para><literal>forall</literal>'s in type signatures:
+<Screen>
+ f :: forall (cxt :: * -> *). Set cxt Int
+</Screen></para></listitem>
+</itemizedlist>
+</para>
+
+<para>
+The parentheses are required. Some of the spaces are required too, to
+separate the lexemes. If you write <literal>(f::*->*)</literal> you
+will get a parse error, because "<literal>::*->*</literal>" is a
+single lexeme in Haskell.
+</para>
+
+<para>
+As part of the same extension, you can put kind annotations in types
+as well. Thus:
+<Screen>
+ f :: (Int :: *) -> Int
+ g :: forall a. a -> (a :: *)
+</Screen>
+The syntax is
+<Screen>
+ atype ::= '(' ctype '::' kind ')
+</Screen>
+The parentheses are required.
+</para>
+</sect2>
+
+
+<sect2 id="class-method-types">
+<title>Class method types
+</title>
+<para>
+Haskell 98 prohibits class method types to mention constraints on the
+class type variable, thus:
+<programlisting>
+ class Seq s a where
+ fromList :: [a] -> s a
+ elem :: Eq a => a -> s a -> Bool
+</programlisting>
+The type of <literal>elem</literal> is illegal in Haskell 98, because it
+contains the constraint <literal>Eq a</literal>, constrains only the
+class type variable (in this case <literal>a</literal>).
+</para>
+<para>
+With the <option>-fglasgow-exts</option> GHC lifts this restriction.
+</para>
+
+</sect2>
+
+<sect2 id="multi-param-type-classes">
+<title>Multi-parameter type classes
+</title>
+
+<para>
+This section documents GHC's implementation of multi-parameter type
+classes. There's lots of background in the paper <ULink
+URL="http://research.microsoft.com/~simonpj/multi.ps.gz" >Type
+classes: exploring the design space</ULink > (Simon Peyton Jones, Mark
+Jones, Erik Meijer).
+</para>
+
+
+<sect3 id="type-restrictions">
+<title>Types</title>
+
+<para>
+GHC imposes the following restrictions on the form of a qualified
+type, whether declared in a type signature
+or inferred. Consider the type:
+
+<programlisting>
+ forall tv1..tvn (c1, ...,cn) => type
+</programlisting>
+
+(Here, I write the "foralls" explicitly, although the Haskell source
+language omits them; in Haskell 1.4, all the free type variables of an
+explicit source-language type signature are universally quantified,
+except for the class type variables in a class declaration. However,
+in GHC, you can give the foralls if you want. See <xref LinkEnd="universal-quantification">).
+</para>
+
+<para>
+
+<OrderedList>
+<listitem>
+
+<para>
+ <emphasis>Each universally quantified type variable
+<literal>tvi</literal> must be reachable from <literal>type</literal></emphasis>.
+
+A type variable is "reachable" if it it is functionally dependent
+(see <xref linkend="functional-dependencies">)
+on the type variables free in <literal>type</literal>.
+The reason for this is that a value with a type that does not obey
+this restriction could not be used without introducing
+ambiguity.
+Here, for example, is an illegal type:
+
+
+<programlisting>
+ forall a. Eq a => Int
+</programlisting>
+
+
+When a value with this type was used, the constraint <literal>Eq tv</literal>
+would be introduced where <literal>tv</literal> is a fresh type variable, and
+(in the dictionary-translation implementation) the value would be
+applied to a dictionary for <literal>Eq tv</literal>. The difficulty is that we
+can never know which instance of <literal>Eq</literal> to use because we never
+get any more information about <literal>tv</literal>.
+
+</para>
+</listitem>
+<listitem>
+
+<para>
+ <emphasis>Every constraint <literal>ci</literal> must mention at least one of the
+universally quantified type variables <literal>tvi</literal></emphasis>.
+
+For example, this type is OK because <literal>C a b</literal> mentions the
+universally quantified type variable <literal>b</literal>:
+
+
+<programlisting>
+ forall a. C a b => burble
+</programlisting>
+
+
+The next type is illegal because the constraint <literal>Eq b</literal> does not
+mention <literal>a</literal>:
+
+
+<programlisting>
+ forall a. Eq b => burble
+</programlisting>
+
+
+The reason for this restriction is milder than the other one. The
+excluded types are never useful or necessary (because the offending
+context doesn't need to be witnessed at this point; it can be floated
+out). Furthermore, floating them out increases sharing. Lastly,
+excluding them is a conservative choice; it leaves a patch of
+territory free in case we need it later.
+
+</para>
+</listitem>
+
+</OrderedList>
+
+</para>
+
+
+<para>
+Unlike Haskell 1.4, constraints in types do <emphasis>not</emphasis> have to be of
+the form <emphasis>(class type-variables)</emphasis>. Thus, these type signatures
+are perfectly OK
+</para>
+
+<para>
+
+<programlisting>
+ f :: Eq (m a) => [m a] -> [m a]
+ g :: Eq [a] => ...
+</programlisting>
+
+</para>
+
+<para>
+This choice recovers principal types, a property that Haskell 1.4 does not have.
+</para>
+
+</sect3>
+
+<sect3>
+<title>Class declarations</title>
+
+<para>
+
+<OrderedList>
+<listitem>
+
+<para>
+ <emphasis>Multi-parameter type classes are permitted</emphasis>. For example:
+
+
+<programlisting>
+ class Collection c a where
+ union :: c a -> c a -> c a
+ ...etc.
+</programlisting>
+
+
+
+</para>
+</listitem>
+<listitem>
+
+<para>
+ <emphasis>The class hierarchy must be acyclic</emphasis>. However, the definition
+of "acyclic" involves only the superclass relationships. For example,
+this is OK:
+
+
+<programlisting>
+ class C a where {
+ op :: D b => a -> b -> b
+ }
+
+ class C a => D a where { ... }
+</programlisting>
+
+
+Here, <literal>C</literal> is a superclass of <literal>D</literal>, but it's OK for a
+class operation <literal>op</literal> of <literal>C</literal> to mention <literal>D</literal>. (It
+would not be OK for <literal>D</literal> to be a superclass of <literal>C</literal>.)
+
+</para>
+</listitem>
+<listitem>
+
+<para>
+ <emphasis>There are no restrictions on the context in a class declaration
+(which introduces superclasses), except that the class hierarchy must
+be acyclic</emphasis>. So these class declarations are OK:
+
+
+<programlisting>
+ class Functor (m k) => FiniteMap m k where
+ ...
+
+ class (Monad m, Monad (t m)) => Transform t m where
+ lift :: m a -> (t m) a
</programlisting>
-The "at least one not a type variable" restriction is to ensure that
-context reduction terminates: each reduction step removes one type
-constructor. For example, the following would make the type checker
-loop if it wasn't excluded:
+</para>
+</listitem>
+
+<listitem>
+
+<para>
+ <emphasis>All of the class type variables must be reachable (in the sense
+mentioned in <xref linkend="type-restrictions">)
+from the free varibles of each method type
+</emphasis>. For example:
<programlisting>
- instance C a => C a where ...
+ class Coll s a where
+ empty :: s
+ insert :: s -> a -> s
</programlisting>
-There are two situations in which the rule is a bit of a pain. First,
-if one allows overlapping instance declarations then it's quite
-convenient to have a "default instance" declaration that applies if
-something more specific does not:
+is not OK, because the type of <literal>empty</literal> doesn't mention
+<literal>a</literal>. This rule is a consequence of Rule 1(a), above, for
+types, and has the same motivation.
+
+Sometimes, offending class declarations exhibit misunderstandings. For
+example, <literal>Coll</literal> might be rewritten
<programlisting>
- instance C a where
- op = ... -- Default
+ class Coll s a where
+ empty :: s a
+ insert :: s a -> a -> s a
</programlisting>
-Second, sometimes you might want to use the following to get the
-effect of a "class synonym":
+which makes the connection between the type of a collection of
+<literal>a</literal>'s (namely <literal>(s a)</literal>) and the element type <literal>a</literal>.
+Occasionally this really doesn't work, in which case you can split the
+class like this:
<programlisting>
- class (C1 a, C2 a, C3 a) => C a where { }
+ class CollE s where
+ empty :: s
- instance (C1 a, C2 a, C3 a) => C a where { }
+ class CollE s => Coll s a where
+ insert :: s -> a -> s
</programlisting>
-This allows you to write shorter signatures:
+</para>
+</listitem>
+
+</OrderedList>
+
+</para>
+
+</sect3>
+
+<sect3 id="instance-decls">
+<title>Instance declarations</title>
+
+<para>
+
+<OrderedList>
+<listitem>
+
+<para>
+ <emphasis>Instance declarations may not overlap</emphasis>. The two instance
+declarations
<programlisting>
- f :: C a => ...
+ instance context1 => C type1 where ...
+ instance context2 => C type2 where ...
</programlisting>
-instead of
+"overlap" if <literal>type1</literal> and <literal>type2</literal> unify
+
+However, if you give the command line option
+<option>-fallow-overlapping-instances</option><indexterm><primary>-fallow-overlapping-instances
+option</primary></indexterm> then overlapping instance declarations are permitted.
+However, GHC arranges never to commit to using an instance declaration
+if another instance declaration also applies, either now or later.
+
+<itemizedlist>
+<listitem>
+
+<para>
+ EITHER <literal>type1</literal> and <literal>type2</literal> do not unify
+</para>
+</listitem>
+<listitem>
+
+<para>
+ OR <literal>type2</literal> is a substitution instance of <literal>type1</literal>
+(but not identical to <literal>type1</literal>), or vice versa.
+</para>
+</listitem>
+</itemizedlist>
+Notice that these rules
+<itemizedlist>
+<listitem>
+
+<para>
+ make it clear which instance decl to use
+(pick the most specific one that matches)
+</para>
+</listitem>
+<listitem>
+
+<para>
+ do not mention the contexts <literal>context1</literal>, <literal>context2</literal>
+Reason: you can pick which instance decl
+"matches" based on the type.
+</para>
+</listitem>
+</itemizedlist>
+However the rules are over-conservative. Two instance declarations can overlap,
+but it can still be clear in particular situations which to use. For example:
<programlisting>
- f :: (C1 a, C2 a, C3 a) => ...
+ instance C (Int,a) where ...
+ instance C (a,Bool) where ...
</programlisting>
+These are rejected by GHC's rules, but it is clear what to do when trying
+to solve the constraint <literal>C (Int,Int)</literal> because the second instance
+cannot apply. Yell if this restriction bites you.
+</para>
+<para>
+GHC is also conservative about committing to an overlapping instance. For example:
+<programlisting>
+ class C a where { op :: a -> a }
+ instance C [Int] where ...
+ instance C a => C [a] where ...
+
+ f :: C b => [b] -> [b]
+ f x = op x
+</programlisting>
+From the RHS of f we get the constraint <literal>C [b]</literal>. But
+GHC does not commit to the second instance declaration, because in a paricular
+call of f, b might be instantiate to Int, so the first instance declaration
+would be appropriate. So GHC rejects the program. If you add <option>-fallow-incoherent-instances</option>
+GHC will instead silently pick the second instance, without complaining about
+the problem of subsequent instantiations.
+</para>
+<para>
+Regrettably, GHC doesn't guarantee to detect overlapping instance
+declarations if they appear in different modules. GHC can "see" the
+instance declarations in the transitive closure of all the modules
+imported by the one being compiled, so it can "see" all instance decls
+when it is compiling <literal>Main</literal>. However, it currently chooses not
+to look at ones that can't possibly be of use in the module currently
+being compiled, in the interests of efficiency. (Perhaps we should
+change that decision, at least for <literal>Main</literal>.)
+
+</para>
+</listitem>
+<listitem>
+
+<para>
+ <emphasis>There are no restrictions on the type in an instance
+<emphasis>head</emphasis>, except that at least one must not be a type variable</emphasis>.
+The instance "head" is the bit after the "=>" in an instance decl. For
+example, these are OK:
+
+
+<programlisting>
+ instance C Int a where ...
+
+ instance D (Int, Int) where ...
+
+ instance E [[a]] where ...
+</programlisting>
+
+
+Note that instance heads <emphasis>may</emphasis> contain repeated type variables.
+For example, this is OK:
-I'm on the lookout for a simple rule that preserves decidability while
-allowing these idioms. The experimental flag
-<option>-fallow-undecidable-instances</option><indexterm><primary>-fallow-undecidable-instances
-option</primary></indexterm> lifts this restriction, allowing all the types in an
-instance head to be type variables.
+<programlisting>
+ instance Stateful (ST s) (MutVar s) where ...
+</programlisting>
+See <xref linkend="undecidable-instances"> for an experimental
+extension to lift this restriction.
</para>
</listitem>
<listitem>
</programlisting>
-as well, then the compiler will complain about the overlapping
-(actually, identical) instance declarations. As always, type synonyms
-must be fully applied. You cannot, for example, write:
+as well, then the compiler will complain about the overlapping
+(actually, identical) instance declarations. As always, type synonyms
+must be fully applied. You cannot, for example, write:
+
+
+<programlisting>
+ type P a = [[a]]
+ instance Monad P where ...
+</programlisting>
+
+
+This design decision is independent of all the others, and easily
+reversed, but it makes sense to me.
+
+</para>
+</listitem>
+<listitem>
+
+<para>
+<emphasis>The types in an instance-declaration <emphasis>context</emphasis> must all
+be type variables</emphasis>. Thus
+
+
+<programlisting>
+instance C a b => Eq (a,b) where ...
+</programlisting>
+
+
+is OK, but
+
+
+<programlisting>
+instance C Int b => Foo b where ...
+</programlisting>
+
+
+is not OK. See <xref linkend="undecidable-instances"> for an experimental
+extension to lift this restriction.
+
+
+
+</para>
+</listitem>
+
+</OrderedList>
+
+</para>
+
+</sect3>
+
+</sect2>
+
+<sect2 id="undecidable-instances">
+<title>Undecidable instances</title>
+
+<para>The rules for instance declarations state that:
+<itemizedlist>
+<listitem><para>At least one of the types in the <emphasis>head</emphasis> of
+an instance declaration <emphasis>must not</emphasis> be a type variable.
+</para></listitem>
+<listitem><para>All of the types in the <emphasis>context</emphasis> of
+an instance declaration <emphasis>must</emphasis> be type variables.
+</para></listitem>
+</itemizedlist>
+These restrictions ensure that
+context reduction terminates: each reduction step removes one type
+constructor. For example, the following would make the type checker
+loop if it wasn't excluded:
+<programlisting>
+ instance C a => C a where ...
+</programlisting>
+There are two situations in which the rule is a bit of a pain. First,
+if one allows overlapping instance declarations then it's quite
+convenient to have a "default instance" declaration that applies if
+something more specific does not:
+
+
+<programlisting>
+ instance C a where
+ op = ... -- Default
+</programlisting>
+
+
+Second, sometimes you might want to use the following to get the
+effect of a "class synonym":
<programlisting>
- type P a = [[a]]
- instance Monad P where ...
-</programlisting>
-
+ class (C1 a, C2 a, C3 a) => C a where { }
-This design decision is independent of all the others, and easily
-reversed, but it makes sense to me.
+ instance (C1 a, C2 a, C3 a) => C a where { }
+</programlisting>
-</para>
-</listitem>
-<listitem>
-<para>
-<emphasis>The types in an instance-declaration <emphasis>context</emphasis> must all
-be type variables</emphasis>. Thus
+This allows you to write shorter signatures:
<programlisting>
-instance C a b => Eq (a,b) where ...
+ f :: C a => ...
</programlisting>
-is OK, but
+instead of
<programlisting>
-instance C Int b => Foo b where ...
+ f :: (C1 a, C2 a, C3 a) => ...
</programlisting>
-is not OK. Again, the intent here is to make sure that context
-reduction terminates.
-
Voluminous correspondence on the Haskell mailing list has convinced me
-that it's worth experimenting with a more liberal rule. If you use
-the flag <option>-fallow-undecidable-instances</option> can use arbitrary
-types in an instance context. Termination is ensured by having a
+that it's worth experimenting with more liberal rules. If you use
+the experimental flag <option>-fallow-undecidable-instances</option>
+<indexterm><primary>-fallow-undecidable-instances
+option</primary></indexterm>, you can use arbitrary
+types in both an instance context and instance head. Termination is ensured by having a
fixed-depth recursion stack. If you exceed the stack depth you get a
sort of backtrace, and the opportunity to increase the stack depth
with <option>-fcontext-stack</option><emphasis>N</emphasis>.
-
</para>
-</listitem>
-
-</OrderedList>
-
+<para>
+I'm on the lookout for a less brutal solution: a simple rule that preserves decidability while
+allowing these idioms interesting idioms.
</para>
-
-</sect3>
-
</sect2>
<sect2 id="implicit-parameters">
The dynamic binding constraints are just a new form of predicate in the type class system.
</para>
<para>
-An implicit parameter is introduced by the special form <literal>?x</literal>,
+An implicit parameter occurs in an expression using the special form <literal>?x</literal>,
where <literal>x</literal> is
-any valid identifier. Use if this construct also introduces new
-dynamic binding constraints. For example, the following definition
+any valid identifier (e.g. <literal>ord ?x</literal> is a valid expression).
+Use of this construct also introduces a new
+dynamic-binding constraint in the type of the expression.
+For example, the following definition
shows how we can define an implicitly parameterized sort function in
terms of an explicitly parameterized <literal>sortBy</literal> function:
<programlisting>
sort :: (?cmp :: a -> a -> Bool) => [a] -> [a]
sort = sortBy ?cmp
</programlisting>
+</para>
+
+<sect3>
+<title>Implicit-parameter type constraints</title>
+<para>
Dynamic binding constraints behave just like other type class
constraints in that they are automatically propagated. Thus, when a
function is used, its implicit parameters are inherited by the
propagate them.
</para>
<para>
-An implicit parameter differs from other type class constraints in the
+An implicit-parameter type constraint differs from other type class constraints in the
following way: All uses of a particular implicit parameter must have
the same type. This means that the type of <literal>(?x, ?x)</literal>
is <literal>(?x::a) => (a,a)</literal>, and not
<literal>(?x::a, ?x::b) => (a, b)</literal>, as would be the case for type
class constraints.
</para>
-<para>
-An implicit parameter is bound using the standard
-<literal>let</literal> binding form, where the bindings must be a
-collection of simple bindings to implicit-style variables (no
-function-style bindings, and no type signatures); these bindings are
-neither polymorphic or recursive. This form binds the implicit
-parameters arising in the body, not the free variables as a
-<literal>let</literal> or <literal>where</literal> would do. For
-example, we define the <literal>min</literal> function by binding
-<literal>cmp</literal>.</para>
-<programlisting>
- min :: [a] -> a
- min = let ?cmp = (<=) in least
-</programlisting>
-<para>
-Note the following additional constraints:
-<itemizedlist>
-<listitem>
+
<para> You can't have an implicit parameter in the context of a class or instance
declaration. For example, both these declarations are illegal:
<programlisting>
you invoke a function. But the ``invocation'' of instance declarations is done
behind the scenes by the compiler, so it's hard to figure out exactly where it is done.
Easiest thing is to outlaw the offending types.</para>
-</listitem>
+<para>
+Implicit-parameter constraints do not cause ambiguity. For example, consider:
+<programlisting>
+ f :: (?x :: [a]) => Int -> Int
+ f n = n + length ?x
+
+ g :: (Read a, Show a) => String -> String
+ g s = show (read s)
+</programlisting>
+Here, <literal>g</literal> has an ambiguous type, and is rejected, but <literal>f</literal>
+is fine. The binding for <literal>?x</literal> at <literal>f</literal>'s call site is
+quite unambiguous, and fixes the type <literal>a</literal>.
+</para>
+</sect3>
+
+<sect3>
+<title>Implicit-parameter bindings</title>
+
+<para>
+An implicit parameter is <emphasis>bound</emphasis> using the standard
+<literal>let</literal> or <literal>where</literal> binding forms.
+For example, we define the <literal>min</literal> function by binding
+<literal>cmp</literal>.
+<programlisting>
+ min :: [a] -> a
+ min = let ?cmp = (<=) in least
+</programlisting>
+</para>
+<para>
+A group of implicit-parameter bindings may occur anywhere a normal group of Haskell
+bindings can occur, except at top level. That is, they can occur in a <literal>let</literal>
+(including in a list comprehension, or do-notation, or pattern guards),
+or a <literal>where</literal> clause.
+Note the following points:
+<itemizedlist>
+<listitem><para>
+An implicit-parameter binding group must be a
+collection of simple bindings to implicit-style variables (no
+function-style bindings, and no type signatures); these bindings are
+neither polymorphic or recursive.
+</para></listitem>
+<listitem><para>
+You may not mix implicit-parameter bindings with ordinary bindings in a
+single <literal>let</literal>
+expression; use two nested <literal>let</literal>s instead.
+(In the case of <literal>where</literal> you are stuck, since you can't nest <literal>where</literal> clauses.)
+</para></listitem>
+
+<listitem><para>
+You may put multiple implicit-parameter bindings in a
+single binding group; but they are <emphasis>not</emphasis> treated
+as a mutually recursive group (as ordinary <literal>let</literal> bindings are).
+Instead they are treated as a non-recursive group, simultaneously binding all the implicit
+parameter. The bindings are not nested, and may be re-ordered without changing
+the meaning of the program.
+For example, consider:
+<programlisting>
+ f t = let { ?x = t; ?y = ?x+(1::Int) } in ?x + ?y
+</programlisting>
+The use of <literal>?x</literal> in the binding for <literal>?y</literal> does not "see"
+the binding for <literal>?x</literal>, so the type of <literal>f</literal> is
+<programlisting>
+ f :: (?x::Int) => Int -> Int
+</programlisting>
+</para></listitem>
</itemizedlist>
</para>
+</sect3>
</sect2>
<sect2 id="linear-implicit-parameters">
</sect3>
+<sect3><title>Recursive functions</title>
+<para>Linear implicit parameters can be particularly tricky when you have a recursive function
+Consider
+<programlisting>
+ foo :: %x::T => Int -> [Int]
+ foo 0 = []
+ foo n = %x : foo (n-1)
+</programlisting>
+where T is some type in class Splittable.</para>
+<para>
+Do you get a list of all the same T's or all different T's
+(assuming that split gives two distinct T's back)?
+</para><para>
+If you supply the type signature, taking advantage of polymorphic
+recursion, you get what you'd probably expect. Here's the
+translated term, where the implicit param is made explicit:
+<programlisting>
+ foo x 0 = []
+ foo x n = let (x1,x2) = split x
+ in x1 : foo x2 (n-1)
+</programlisting>
+But if you don't supply a type signature, GHC uses the Hindley
+Milner trick of using a single monomorphic instance of the function
+for the recursive calls. That is what makes Hindley Milner type inference
+work. So the translation becomes
+<programlisting>
+ foo x = let
+ foom 0 = []
+ foom n = x : foom (n-1)
+ in
+ foom
+</programlisting>
+Result: 'x' is not split, and you get a list of identical T's. So the
+semantics of the program depends on whether or not foo has a type signature.
+Yikes!
+</para><para>
+You may say that this is a good reason to dislike linear implicit parameters
+and you'd be right. That is why they are an experimental feature.
+</para>
+</sect3>
+
</sect2>
<sect2 id="functional-dependencies">
</title>
<para> Functional dependencies are implemented as described by Mark Jones
-in "Type Classes with Functional Dependencies", Mark P. Jones,
+in “<ulink url="http://www.cse.ogi.edu/~mpj/pubs/fundeps.html">Type Classes with Functional Dependencies</ulink>”, Mark P. Jones,
In Proceedings of the 9th European Symposium on Programming,
-ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782.
+ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782,
+.
</para>
-
<para>
+Functional dependencies are introduced by a vertical bar in the syntax of a
+class declaration; e.g.
+<programlisting>
+ class (Monad m) => MonadState s m | m -> s where ...
+
+ class Foo a b c | a b -> c where ...
+</programlisting>
There should be more documentation, but there isn't (yet). Yell if you need it.
</para>
</sect2>
g :: Int -> Int -> forall b. b -> Int
</programlisting>
</para>
+<para>
+When doing this hoisting operation, GHC eliminates duplicate constraints. For
+example:
+<programlisting>
+ type Foo a = (?x::Int) => Bool -> a
+ g :: Foo (Foo Int)
+</programlisting>
+means
+<programlisting>
+ g :: (?x::Int) => Bool -> Bool -> Int
+</programlisting>
+</para>
</sect2>
f3 x = a==b where { Baz1 a b = x }
</programlisting>
+Instead, use a <literal>case</literal> expression:
+
+<programlisting>
+ f3 x = case x of Baz1 a b -> a==b
+</programlisting>
-You can only pattern-match
+In general, you can only pattern-match
on an existentially-quantified constructor in a <literal>case</literal> expression or
in the patterns of a function definition.
</sect3>
<sect3>
-<title>Result type signatures</title>
+<title>Where a pattern type signature can occur</title>
<para>
-
+A pattern type signature can occur in any pattern. For example:
<itemizedlist>
+
+<listitem>
+<para>
+A pattern type signature can be on an arbitrary sub-pattern, not
+ust on a variable:
+
+
+<programlisting>
+ f ((x,y)::(a,b)) = (y,x) :: (b,a)
+</programlisting>
+
+
+</para>
+</listitem>
+<listitem>
+
+<para>
+ Pattern type signatures, including the result part, can be used
+in lambda abstractions:
+
+<programlisting>
+ (\ (x::a, y) :: a -> x)
+</programlisting>
+</para>
+</listitem>
+<listitem>
+
+<para>
+ Pattern type signatures, including the result part, can be used
+in <literal>case</literal> expressions:
+
+
+<programlisting>
+ case e of { (x::a, y) :: a -> x }
+</programlisting>
+
+</para>
+</listitem>
+
+<listitem>
+<para>
+To avoid ambiguity, the type after the “<literal>::</literal>” in a result
+pattern signature on a lambda or <literal>case</literal> must be atomic (i.e. a single
+token or a parenthesised type of some sort). To see why,
+consider how one would parse this:
+
+
+<programlisting>
+ \ x :: a -> b -> x
+</programlisting>
+
+
+</para>
+</listitem>
+
+<listitem>
+
+<para>
+ Pattern type signatures can bind existential type variables.
+For example:
+
+
+<programlisting>
+ data T = forall a. MkT [a]
+
+ f :: T -> T
+ f (MkT [t::a]) = MkT t3
+ where
+ t3::[a] = [t,t,t]
+</programlisting>
+
+
+</para>
+</listitem>
+
+
<listitem>
<para>
- The result type of a function can be given a signature,
-thus:
+Pattern type signatures
+can be used in pattern bindings:
+
+<programlisting>
+ f x = let (y, z::a) = x in ...
+ f1 x = let (y, z::Int) = x in ...
+ f2 (x::(Int,a)) = let (y, z::a) = x in ...
+ f3 :: (b->b) = \x -> x
+</programlisting>
+
+In all such cases, the binding is not generalised over the pattern-bound
+type variables. Thus <literal>f3</literal> is monomorphic; <literal>f3</literal>
+has type <literal>b -> b</literal> for some type <literal>b</literal>,
+and <emphasis>not</emphasis> <literal>forall b. b -> b</literal>.
+In contrast, the binding
+<programlisting>
+ f4 :: b->b
+ f4 = \x -> x
+</programlisting>
+makes a polymorphic function, but <literal>b</literal> is not in scope anywhere
+in <literal>f4</literal>'s scope.
+
+</para>
+</listitem>
+</itemizedlist>
+</para>
+
+</sect3>
+
+<sect3>
+<title>Result type signatures</title>
+
+<para>
+The result type of a function can be given a signature, thus:
<programlisting>
in \xs -> map g (reverse xs `zip` xs)
</programlisting>
-
</para>
-</listitem>
-
-</itemizedlist>
+<para>
+The type variables bound in a result type signature scope over the right hand side
+of the definition. However, consider this corner-case:
+<programlisting>
+ rev1 :: [a] -> [a] = \xs -> reverse xs
+ foo ys = rev (ys::[a])
+</programlisting>
+The signature on <literal>rev1</literal> is considered a pattern type signature, not a result
+type signature, and the type variables it binds have the same scope as <literal>rev1</literal>
+itself (i.e. the right-hand side of <literal>rev1</literal> and the rest of the module too).
+In particular, the expression <literal>(ys::[a])</literal> is OK, because the type variable <literal>a</literal>
+is in scope (otherwise it would mean <literal>(ys::forall a.[a])</literal>, which would be rejected).
+</para>
+<para>
+As mentioned above, <literal>rev1</literal> is made monomorphic by this scoping rule.
+For example, the following program would be rejected, because it claims that <literal>rev1</literal>
+is polymorphic:
+<programlisting>
+ rev1 :: [b] -> [b]
+ rev1 :: [a] -> [a] = \xs -> reverse xs
+</programlisting>
</para>
<para>
Result type signatures are not yet implemented in Hugs.
</para>
-</sect3>
-
-<sect3>
-<title>Where a pattern type signature can occur</title>
-
-<para>
-A pattern type signature can occur in any pattern. For example:
-<itemizedlist>
+</sect3>
+
+</sect2>
+
+<sect2 id="newtype-deriving">
+<title>Generalised derived instances for newtypes</title>
+
+<para>
+When you define an abstract type using <literal>newtype</literal>, you may want
+the new type to inherit some instances from its representation. In
+Haskell 98, you can inherit instances of <literal>Eq</literal>, <literal>Ord</literal>,
+<literal>Enum</literal> and <literal>Bounded</literal> by deriving them, but for any
+other classes you have to write an explicit instance declaration. For
+example, if you define
+
+<programlisting>
+ newtype Dollars = Dollars Int
+</programlisting>
+
+and you want to use arithmetic on <literal>Dollars</literal>, you have to
+explicitly define an instance of <literal>Num</literal>:
+
+<programlisting>
+ instance Num Dollars where
+ Dollars a + Dollars b = Dollars (a+b)
+ ...
+</programlisting>
+All the instance does is apply and remove the <literal>newtype</literal>
+constructor. It is particularly galling that, since the constructor
+doesn't appear at run-time, this instance declaration defines a
+dictionary which is <emphasis>wholly equivalent</emphasis> to the <literal>Int</literal>
+dictionary, only slower!
+</para>
+
-<listitem>
+<sect3> <title> Generalising the deriving clause </title>
<para>
-A pattern type signature can be on an arbitrary sub-pattern, not
-ust on a variable:
-
+GHC now permits such instances to be derived instead, so one can write
+<programlisting>
+ newtype Dollars = Dollars Int deriving (Eq,Show,Num)
+</programlisting>
-<programlisting>
- f ((x,y)::(a,b)) = (y,x) :: (b,a)
-</programlisting>
+and the implementation uses the <emphasis>same</emphasis> <literal>Num</literal> dictionary
+for <literal>Dollars</literal> as for <literal>Int</literal>. Notionally, the compiler
+derives an instance declaration of the form
+<programlisting>
+ instance Num Int => Num Dollars
+</programlisting>
+which just adds or removes the <literal>newtype</literal> constructor according to the type.
</para>
-</listitem>
-<listitem>
-
<para>
- Pattern type signatures, including the result part, can be used
-in lambda abstractions:
-<programlisting>
- (\ (x::a, y) :: a -> x)
-</programlisting>
-</para>
-</listitem>
-<listitem>
+We can also derive instances of constructor classes in a similar
+way. For example, suppose we have implemented state and failure monad
+transformers, such that
-<para>
- Pattern type signatures, including the result part, can be used
-in <literal>case</literal> expressions:
+<programlisting>
+ instance Monad m => Monad (State s m)
+ instance Monad m => Monad (Failure m)
+</programlisting>
+In Haskell 98, we can define a parsing monad by
+<programlisting>
+ type Parser tok m a = State [tok] (Failure m) a
+</programlisting>
+which is automatically a monad thanks to the instance declarations
+above. With the extension, we can make the parser type abstract,
+without needing to write an instance of class <literal>Monad</literal>, via
-<programlisting>
- case e of { (x::a, y) :: a -> x }
+<programlisting>
+ newtype Parser tok m a = Parser (State [tok] (Failure m) a)
+ deriving Monad
</programlisting>
+In this case the derived instance declaration is of the form
+<programlisting>
+ instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)
+</programlisting>
+Notice that, since <literal>Monad</literal> is a constructor class, the
+instance is a <emphasis>partial application</emphasis> of the new type, not the
+entire left hand side. We can imagine that the type declaration is
+``eta-converted'' to generate the context of the instance
+declaration.
</para>
-</listitem>
-
-<listitem>
<para>
-To avoid ambiguity, the type after the “<literal>::</literal>” in a result
-pattern signature on a lambda or <literal>case</literal> must be atomic (i.e. a single
-token or a parenthesised type of some sort). To see why,
-consider how one would parse this:
+We can even derive instances of multi-parameter classes, provided the
+newtype is the last class parameter. In this case, a ``partial
+application'' of the class appears in the <literal>deriving</literal>
+clause. For example, given the class
-<programlisting>
- \ x :: a -> b -> x
+<programlisting>
+ class StateMonad s m | m -> s where ...
+ instance Monad m => StateMonad s (State s m) where ...
+</programlisting>
+then we can derive an instance of <literal>StateMonad</literal> for <literal>Parser</literal>s by
+<programlisting>
+ newtype Parser tok m a = Parser (State [tok] (Failure m) a)
+ deriving (Monad, StateMonad [tok])
</programlisting>
+The derived instance is obtained by completing the application of the
+class to the new type:
+<programlisting>
+ instance StateMonad [tok] (State [tok] (Failure m)) =>
+ StateMonad [tok] (Parser tok m)
+</programlisting>
</para>
-</listitem>
+<para>
-<listitem>
+As a result of this extension, all derived instances in newtype
+declarations are treated uniformly (and implemented just by reusing
+the dictionary for the representation type), <emphasis>except</emphasis>
+<literal>Show</literal> and <literal>Read</literal>, which really behave differently for
+the newtype and its representation.
+</para>
+</sect3>
+<sect3> <title> A more precise specification </title>
<para>
- Pattern type signatures can bind existential type variables.
-For example:
-
+Derived instance declarations are constructed as follows. Consider the
+declaration (after expansion of any type synonyms)
-<programlisting>
- data T = forall a. MkT [a]
+<programlisting>
+ newtype T v1...vn = T' (S t1...tk vk+1...vn) deriving (c1...cm)
+</programlisting>
- f :: T -> T
- f (MkT [t::a]) = MkT t3
- where
- t3::[a] = [t,t,t]
+where
+ <itemizedlist>
+<listitem><para>
+ <literal>S</literal> is a type constructor,
+</para></listitem>
+<listitem><para>
+ <literal>t1...tk</literal> are types,
+</para></listitem>
+<listitem><para>
+ <literal>vk+1...vn</literal> are type variables which do not occur in any of
+ the <literal>ti</literal>, and
+</para></listitem>
+<listitem><para>
+ the <literal>ci</literal> are partial applications of
+ classes of the form <literal>C t1'...tj'</literal>, where the arity of <literal>C</literal>
+ is exactly <literal>j+1</literal>. That is, <literal>C</literal> lacks exactly one type argument.
+</para></listitem>
+</itemizedlist>
+Then, for each <literal>ci</literal>, the derived instance
+declaration is:
+<programlisting>
+ instance ci (S t1...tk vk+1...v) => ci (T v1...vp)
</programlisting>
-
-
+where <literal>p</literal> is chosen so that <literal>T v1...vp</literal> is of the
+right <emphasis>kind</emphasis> for the last parameter of class <literal>Ci</literal>.
</para>
-</listitem>
-
+<para>
-<listitem>
+As an example which does <emphasis>not</emphasis> work, consider
+<programlisting>
+ newtype NonMonad m s = NonMonad (State s m s) deriving Monad
+</programlisting>
+Here we cannot derive the instance
+<programlisting>
+ instance Monad (State s m) => Monad (NonMonad m)
+</programlisting>
+because the type variable <literal>s</literal> occurs in <literal>State s m</literal>,
+and so cannot be "eta-converted" away. It is a good thing that this
+<literal>deriving</literal> clause is rejected, because <literal>NonMonad m</literal> is
+not, in fact, a monad --- for the same reason. Try defining
+<literal>>>=</literal> with the correct type: you won't be able to.
+</para>
<para>
-Pattern type signatures
-can be used in pattern bindings:
-<programlisting>
- f x = let (y, z::a) = x in ...
- f1 x = let (y, z::Int) = x in ...
- f2 (x::(Int,a)) = let (y, z::a) = x in ...
- f3 :: (b->b) = \x -> x
-</programlisting>
+Notice also that the <emphasis>order</emphasis> of class parameters becomes
+important, since we can only derive instances for the last one. If the
+<literal>StateMonad</literal> class above were instead defined as
-In all such cases, the binding is not generalised over the pattern-bound
-type variables. Thus <literal>f3</literal> is monomorphic; <literal>f3</literal>
-has type <literal>b -> b</literal> for some type <literal>b</literal>,
-and <emphasis>not</emphasis> <literal>forall b. b -> b</literal>.
-In contrast, the binding
-<programlisting>
- f4 :: b->b
- f4 = \x -> x
+<programlisting>
+ class StateMonad m s | m -> s where ...
</programlisting>
-makes a polymorphic function, but <literal>b</literal> is not in scope anywhere
-in <literal>f4</literal>'s scope.
+then we would not have been able to derive an instance for the
+<literal>Parser</literal> type above. We hypothesise that multi-parameter
+classes usually have one "main" parameter for which deriving new
+instances is most interesting.
</para>
-</listitem>
-</itemizedlist>
-</para>
-
</sect3>
+
</sect2>
-<sect2 id="sec-kinding">
-<title>Explicitly-kinded quantification</title>
-<para>
-Haskell infers the kind of each type variable. Sometimes it is nice to be able
-to give the kind explicitly as (machine-checked) documentation,
-just as it is nice to give a type signature for a function. On some occasions,
-it is essential to do so. For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999)
-John Hughes had to define the data type:
-<Screen>
- data Set cxt a = Set [a]
- | Unused (cxt a -> ())
-</Screen>
-The only use for the <literal>Unused</literal> constructor was to force the correct
-kind for the type variable <literal>cxt</literal>.
+</sect1>
+<!-- ==================== End of type system extensions ================= -->
+
+<!-- ====================== TEMPLATE HASKELL ======================= -->
+
+<sect1 id="template-haskell">
+<title>Template Haskell</title>
+
+<para>Template Haskell allows you to do compile-time meta-programming in Haskell. There is a "home page" for
+Template Haskell at <ulink url="http://www.haskell.org/th/">
+http://www.haskell.org/th/</ulink>, while
+the background to
+the main technical innovations is discussed in "<ulink
+url="http://research.microsoft.com/~simonpj/papers/meta-haskell">
+Template Meta-programming for Haskell</ulink>" (Proc Haskell Workshop 2002).
+</para>
+
+<para> The first example from that paper is set out below as a worked example to help get you started.
+</para>
+
+<para>
+The documentation here describes the realisation in GHC. (It's rather sketchy just now;
+Tim Sheard is going to expand it.)
+</para>
+
+<sect2> <title> Syntax </title>
+<para>
+ Template Haskell has the following new syntactic constructions. You need to use the flag
+ <literal>-fglasgow-exts</literal> to switch these syntactic extensions on.
+
+ <itemizedlist>
+ <listitem><para>
+ A splice is written <literal>$x</literal>, where <literal>x</literal> is an
+ identifier, or <literal>$(...)</literal>, where the "..." is an arbitrary expression.
+ There must be no space between the "$" and the identifier or parenthesis. This use
+ of "$" overrides its meaning as an infix operator, just as "M.x" overrides the meaning
+ of "." as an infix operator. If you want the infix operator, put spaces around it.
+ </para>
+ <para> A splice can occur in place of
+ <itemizedlist>
+ <listitem><para> an expression; the spliced expression must have type <literal>Expr</literal></para></listitem>
+ <listitem><para> a list of top-level declarations; ; the spliced expression must have type <literal>Q [Dec]</literal></para></listitem>
+ <listitem><para> a type; the spliced expression must have type <literal>Type</literal>.</para></listitem>
+ </itemizedlist>
+ (Note that the syntax for a declaration splice uses "<literal>$</literal>" not "<literal>splice</literal>" as in
+ the paper. Also the type of the enclosed expression must be <literal>Q [Dec]</literal>, not <literal>[Q Dec]</literal>
+ as in the paper.)
+ </para></listitem>
+
+
+ <listitem><para>
+ A expression quotation is written in Oxford brackets, thus:
+ <itemizedlist>
+ <listitem><para> <literal>[| ... |]</literal>, where the "..." is an expression;
+ the quotation has type <literal>Expr</literal>.</para></listitem>
+ <listitem><para> <literal>[d| ... |]</literal>, where the "..." is a list of top-level declarations;
+ the quotation has type <literal>Q [Dec]</literal>.</para></listitem>
+ <listitem><para> <literal>[t| ... |]</literal>, where the "..." is a type;
+ the quotation has type <literal>Type</literal>.</para></listitem>
+ </itemizedlist></para></listitem>
+
+ <listitem><para>
+ Reification is written thus:
+ <itemizedlist>
+ <listitem><para> <literal>reifyDecl T</literal>, where <literal>T</literal> is a type constructor; this expression
+ has type <literal>Dec</literal>. </para></listitem>
+ <listitem><para> <literal>reifyDecl C</literal>, where <literal>C</literal> is a class; has type <literal>Dec</literal>.</para></listitem>
+ <listitem><para> <literal>reifyType f</literal>, where <literal>f</literal> is an identifier; has type <literal>Typ</literal>.</para></listitem>
+ <listitem><para> Still to come: fixities </para></listitem>
+
+ </itemizedlist></para>
+ </listitem>
+
+
+ </itemizedlist>
</para>
+</sect2>
+
+<sect2> <title> Using Template Haskell </title>
<para>
-GHC now instead allows you to specify the kind of a type variable directly, wherever
-a type variable is explicitly bound. Namely:
<itemizedlist>
-<listitem><para><literal>data</literal> declarations:
-<Screen>
- data Set (cxt :: * -> *) a = Set [a]
-</Screen></para></listitem>
-<listitem><para><literal>type</literal> declarations:
-<Screen>
- type T (f :: * -> *) = f Int
-</Screen></para></listitem>
-<listitem><para><literal>class</literal> declarations:
-<Screen>
- class (Eq a) => C (f :: * -> *) a where ...
-</Screen></para></listitem>
-<listitem><para><literal>forall</literal>'s in type signatures:
-<Screen>
- f :: forall (cxt :: * -> *). Set cxt Int
-</Screen></para></listitem>
+ <listitem><para>
+ The data types and monadic constructor functions for Template Haskell are in the library
+ <literal>Language.Haskell.THSyntax</literal>.
+ </para></listitem>
+
+ <listitem><para>
+ You can only run a function at compile time if it is imported from another module. That is,
+ you can't define a function in a module, and call it from within a splice in the same module.
+ (It would make sense to do so, but it's hard to implement.)
+ </para></listitem>
+
+ <listitem><para>
+ The flag <literal>-ddump-splices</literal> shows the expansion of all top-level splices as they happen.
+ </para></listitem>
+ <listitem><para>
+ If you are building GHC from source, you need at least a stage-2 bootstrap compiler to
+ run Template Haskell. A stage-1 compiler will reject the TH constructs. Reason: TH
+ compiles and runs a program, and then looks at the result. So it's important that
+ the program it compiles produces results whose representations are identical to
+ those of the compiler itself.
+ </para></listitem>
</itemizedlist>
</para>
-
-<para>
-The parentheses are required. Some of the spaces are required too, to
-separate the lexemes. If you write <literal>(f::*->*)</literal> you
-will get a parse error, because "<literal>::*->*</literal>" is a
-single lexeme in Haskell.
-</para>
-
-<para>
-As part of the same extension, you can put kind annotations in types
-as well. Thus:
-<Screen>
- f :: (Int :: *) -> Int
- g :: forall a. a -> (a :: *)
-</Screen>
-The syntax is
-<Screen>
- atype ::= '(' ctype '::' kind ')
-</Screen>
-The parentheses are required.
+<para> Template Haskell works in any mode (<literal>--make</literal>, <literal>--interactive</literal>,
+ or file-at-a-time). There used to be a restriction to the former two, but that restriction
+ has been lifted.
</para>
</sect2>
+
+<sect2> <title> A Template Haskell Worked Example </title>
+<para>To help you get over the confidence barrier, try out this skeletal worked example.
+ First cut and paste the two modules below into "Main.hs" and "Printf.hs":</para>
-</sect1>
-<!-- ==================== End of type system extensions ================= -->
-
-
-<!-- ==================== ASSERTIONS ================= -->
-
-<sect1 id="sec-assertions">
-<title>Assertions
-<indexterm><primary>Assertions</primary></indexterm>
-</title>
+<programlisting>
+{- Main.hs -}
+module Main where
-<para>
-If you want to make use of assertions in your standard Haskell code, you
-could define a function like the following:
-</para>
+-- Import our template "pr"
+import Printf ( pr )
-<para>
+-- The splice operator $ takes the Haskell source code
+-- generated at compile time by "pr" and splices it into
+-- the argument of "putStrLn".
+main = putStrLn ( $(pr "Hello") )
+</programlisting>
<programlisting>
-assert :: Bool -> a -> a
-assert False x = error "assertion failed!"
-assert _ x = x
-</programlisting>
+{- Printf.hs -}
+module Printf where
-</para>
+-- Skeletal printf from the paper.
+-- It needs to be in a separate module to the one where
+-- you intend to use it.
-<para>
-which works, but gives you back a less than useful error message --
-an assertion failed, but which and where?
-</para>
+-- Import some Template Haskell syntax
+import Language.Haskell.THSyntax
-<para>
-One way out is to define an extended <function>assert</function> function which also
-takes a descriptive string to include in the error message and
-perhaps combine this with the use of a pre-processor which inserts
-the source location where <function>assert</function> was used.
-</para>
+-- Describe a format string
+data Format = D | S | L String
-<para>
-Ghc offers a helping hand here, doing all of this for you. For every
-use of <function>assert</function> in the user's source:
-</para>
+-- Parse a format string. This is left largely to you
+-- as we are here interested in building our first ever
+-- Template Haskell program and not in building printf.
+parse :: String -> [Format]
+parse s = [ L s ]
-<para>
+-- Generate Haskell source code from a parsed representation
+-- of the format string. This code will be spliced into
+-- the module which calls "pr", at compile time.
+gen :: [Format] -> Expr
+gen [D] = [| \n -> show n |]
+gen [S] = [| \s -> s |]
+gen [L s] = string s
-<programlisting>
-kelvinToC :: Double -> Double
-kelvinToC k = assert (k >= 0.0) (k+273.15)
+-- Here we generate the Haskell code for the splice
+-- from an input format string.
+pr :: String -> Expr
+pr s = gen (parse s)
</programlisting>
+<para>Now run the compiler (here we are using a "stage three" build of GHC, at a Cygwin prompt on Windows):
</para>
+<programlisting>
+ghc/compiler/stage3/ghc-inplace --make -fglasgow-exts -package haskell-src main.hs -o main.exe
+</programlisting>
-<para>
-Ghc will rewrite this to also include the source location where the
-assertion was made,
+<para>Run "main.exe" and here is your output:
</para>
-<para>
-
<programlisting>
-assert pred val ==> assertError "Main.hs|15" pred val
+$ ./main
+Hello
</programlisting>
-</para>
+</sect2>
+
+</sect1>
+
+<!-- ===================== Arrow notation =================== -->
+
+<sect1 id="arrow-notation">
+<title>Arrow notation
+</title>
+<para>Arrows are a generalization of monads introduced by John Hughes.
+For more details, see
+<itemizedlist>
+
+<listitem>
<para>
-The rewrite is only performed by the compiler when it spots
-applications of <function>Control.Exception.assert</function>, so you
-can still define and use your own versions of
-<function>assert</function>, should you so wish. If not, import
-<literal>Control.Exception</literal> to make use
-<function>assert</function> in your code.
+“Generalising Monads to Arrows”,
+John Hughes, in <citetitle>Science of Computer Programming</citetitle> 37,
+pp67–111, May 2000.
</para>
+</listitem>
+<listitem>
<para>
-To have the compiler ignore uses of assert, use the compiler option
-<option>-fignore-asserts</option>. <indexterm><primary>-fignore-asserts
-option</primary></indexterm> That is, expressions of the form
-<literal>assert pred e</literal> will be rewritten to
-<literal>e</literal>.
+“<ulink url="http://www.soi.city.ac.uk/~ross/papers/notation.html">A New Notation for Arrows</ulink>”,
+Ross Paterson, in <citetitle>ICFP</citetitle>, Sep 2001.
</para>
+</listitem>
+<listitem>
<para>
-Assertion failures can be caught, see the documentation for the
-<literal>Control.Exception</literal> library for the details.
+“<ulink url="http://www.soi.city.ac.uk/~ross/papers/fop.html">Arrows and Computation</ulink>”,
+Ross Paterson, in <citetitle>The Fun of Programming</citetitle>,
+Palgrave, 2003.
</para>
+</listitem>
-</sect1>
-
-
-<sect1 id="syntax-extns">
-<title>Syntactic extensions</title>
-
-<!-- ====================== HIERARCHICAL MODULES ======================= -->
-
- <sect2 id="hierarchical-modules">
- <title>Hierarchical Modules</title>
+</itemizedlist>
+and the arrows web page at
+<ulink url="http://www.haskell.org/arrows/"><literal>http://www.haskell.org/arrows/</literal></ulink>.
+With the <option>-farrows</option> flag, GHC supports the arrow
+notation described in the second of these papers.
+What follows is a brief introduction to the notation;
+it won't make much sense unless you've read Hughes's paper.
+This notation is translated to ordinary Haskell,
+using combinators from the
+<ulink url="../base/Control.Arrow.html"><literal>Control.Arrow</literal></ulink>
+module.
+</para>
+
+<para>The extension adds a new kind of expression for defining arrows,
+of the form <literal>proc pat -> cmd</literal>,
+where <literal>proc</literal> is a new keyword.
+The variables of the pattern are bound in the body of the
+<literal>proc</literal>-expression,
+which is a new sort of thing called a <firstterm>command</firstterm>.
+The syntax of commands is as follows:
+<screen>
+cmd ::= exp1 -< exp2
+ | exp1 -<< exp2
+ | do { cstmt1 .. cstmtn ; cmd }
+ | let decls in cmd
+ | if exp then cmd1 else cmd2
+ | case exp of { calts }
+ | cmd1 qop cmd2
+ | (| aexp cmd1 .. cmdn |)
+ | \ pat1 .. patn -> cmd
+ | cmd aexp
+ | ( cmd )
+
+cstmt ::= let decls
+ | pat <- cmd
+ | rec { cstmt1 .. cstmtn }
+ | cmd
+</screen>
+Commands produce values, but (like monadic computations)
+may yield more than one value,
+or none, and may do other things as well.
+For the most part, familiarity with monadic notation is a good guide to
+using commands.
+However the values of expressions, even monadic ones,
+are determined by the values of the variables they contain;
+this is not necessarily the case for commands.
+</para>
- <para>GHC supports a small extension to the syntax of module
- names: a module name is allowed to contain a dot
- <literal>‘.’</literal>. This is also known as the
- “hierarchical module namespace” extension, because
- it extends the normally flat Haskell module namespace into a
- more flexible hierarchy of modules.</para>
+<para>
+A simple example of the new notation is the expression
+<screen>
+proc x -> f -< x+1
+</screen>
+We call this a <firstterm>procedure</firstterm> or
+<firstterm>arrow abstraction</firstterm>.
+As with a lambda expression, the variable <literal>x</literal>
+is a new variable bound within the <literal>proc</literal>-expression.
+It refers to the input to the arrow.
+In the above example, <literal>-<</literal> is not an identifier but an
+new reserved symbol used for building commands from an expression of arrow
+type and an expression to be fed as input to that arrow.
+(The weird look will make more sense later.)
+It may be read as analogue of application for arrows.
+The above example is equivalent to the Haskell expression
+<screen>
+arr (\ x -> x+1) >>> f
+</screen>
+That would make no sense if the expression to the left of
+<literal>-<</literal> involves the bound variable <literal>x</literal>.
+More generally, the expression to the left of <literal>-<</literal>
+may not involve any <firstterm>local variable</firstterm>,
+i.e. a variable bound in the current arrow abstraction.
+For such a situation there is a variant <literal>-<<</literal>, as in
+<screen>
+proc x -> f x -<< x+1
+</screen>
+which is equivalent to
+<screen>
+arr (\ x -> (f, x+1)) >>> app
+</screen>
+so in this case the arrow must belong to the <literal>ArrowApply</literal>
+class.
+Such an arrow is equivalent to a monad, so if you're using this form
+you may find a monadic formulation more convenient.
+</para>
- <para>A module name in the extended syntax consists of a
- sequence of components, each separated by a dot. When searching
- for an interface file (or a source file, in the case of GHCi or
- when using <option>--make</option>) for an imported module, GHC
- interprets the dot as a path separator. So for example, if a
- module <literal>A.B.C</literal> is imported, then for each
- directory <literal>D</literal> on the search path (see the
- <option>-i</option> option, <xref
- linkend="options-finding-imports">), GHC will look in the
- directory <literal>D/A/B</literal><footnote><para>On Windows,
- this would be <literal>D\A\B</literal>.</para></footnote> for an
- interface file called <filename>C.hi</filename> or a source file
- <filename>C.hs</filename> or <filename>C.lhs</filename>.</para>
-
- <para>Note that as far as the compiler is concerned, module
- names are always fully qualified; the hierarchy only has a
- special meaning when searching for interface files and source
- files in the filesystem. In particular, this means that the
- full module name must be given after the
- <literal>module</literal> keyword at the beginning of the
- module; for example, the module <literal>A.B.C</literal> must
- begin</para>
+<sect2>
+<title>do-notation for commands</title>
-<programlisting>module A.B.C</programlisting>
+<para>
+Another form of command is a form of <literal>do</literal>-notation.
+For example, you can write
+<screen>
+proc x -> do
+ y <- f -< x+1
+ g -< 2*y
+ let z = x+y
+ t <- h -< x*z
+ returnA -< t+z
+</screen>
+You can read this much like ordinary <literal>do</literal>-notation,
+but with commands in place of monadic expressions.
+The first line sends the value of <literal>x+1</literal> as an input to
+the arrow <literal>f</literal>, and matches its output against
+<literal>y</literal>.
+In the next line, the output is discarded.
+The arrow <literal>returnA</literal> is defined in the
+<ulink url="../base/Control.Arrow.html"><literal>Control.Arrow</literal></ulink>
+module as <literal>arr id</literal>.
+The above example is treated as an abbreviation for
+<screen>
+arr (\ x -> (x, x)) >>>
+ first (arr (\ x -> x+1) >>> f) >>>
+ arr (\ (y, x) -> (y, (x, y))) >>>
+ first (arr (\ y -> 2*y) >>> g) >>>
+ arr snd >>>
+ arr (\ (x, y) -> let z = x+y in ((x, z), z)) >>>
+ first (arr (\ (x, z) -> x*z) >>> h) >>>
+ arr (\ (t, z) -> t+z) >>>
+ returnA
+</screen>
+Note that variables not used later in the composition are projected out.
+After simplification using rewrite rules (see <xref linkEnd="rewrite-rules">)
+defined in the
+<ulink url="../base/Control.Arrow.html"><literal>Control.Arrow</literal></ulink>
+module, this reduces to
+<screen>
+arr (\ x -> (x+1, x)) >>>
+ first f >>>
+ arr (\ (y, x) -> (2*y, (x, y))) >>>
+ first g >>>
+ arr (\ (_, (x, y)) -> let z = x+y in (x*z, z)) >>>
+ first h >>>
+ arr (\ (t, z) -> t+z)
+</screen>
+which is what you might have written by hand.
+With arrow notation, GHC keeps track of all those tuples of variables for you.
+</para>
- <para>GHC comes with a large collection of libraries arranged
- hierarchically; see the accompanying library documentation.
- There is an ongoing project to create and maintain a stable set
- of <quote>core</quote> libraries used by several Haskell
- compilers, and the libraries that GHC comes with represent the
- current status of that project. For more details, see <ulink
- url="http://www.haskell.org/~simonmar/libraries/libraries.html">.</ulink></para>
+<para>
+Note that although the above translation suggests that
+<literal>let</literal>-bound variables like <literal>z</literal> must be
+monomorphic, the actual translation produces Core,
+so polymorphic variables are allowed.
+</para>
- </sect2>
+<para>
+It's also possible to have mutually recursive bindings,
+using the new <literal>rec</literal> keyword, as in the following example:
+<screen>
+counter :: ArrowCircuit a => a Bool Int
+counter = proc reset -> do
+ rec output <- returnA -< if reset then 0 else next
+ next <- delay 0 -< output+1
+ returnA -< output
+</screen>
+The translation of such forms uses the <literal>loop</literal> combinator,
+so the arrow concerned must belong to the <literal>ArrowLoop</literal> class.
+</para>
-<!-- ====================== PATTERN GUARDS ======================= -->
+</sect2>
-<sect2 id="pattern-guards">
-<title>Pattern guards</title>
+<sect2>
+<title>Conditional commands</title>
<para>
-<indexterm><primary>Pattern guards (Glasgow extension)</primary></indexterm>
-The discussion that follows is an abbreviated version of Simon Peyton Jones's original <ULink URL="http://research.microsoft.com/~simonpj/Haskell/guards.html">proposal</ULink>. (Note that the proposal was written before pattern guards were implemented, so refers to them as unimplemented.)
+In the previous example, we used a conditional expression to construct the
+input for an arrow.
+Sometimes we want to conditionally execute different commands, as in
+<screen>
+proc (x,y) ->
+ if f x y
+ then g -< x+1
+ else h -< y+2
+</screen>
+which is translated to
+<screen>
+arr (\ (x,y) -> if f x y then Left x else Right y) >>>
+ (arr (\x -> x+1) >>> f) ||| (arr (\y -> y+2) >>> g)
+</screen>
+Since the translation uses <literal>|||</literal>,
+the arrow concerned must belong to the <literal>ArrowChoice</literal> class.
</para>
<para>
-Suppose we have an abstract data type of finite maps, with a
-lookup operation:
+There are also <literal>case</literal> commands, like
+<screen>
+case input of
+ [] -> f -< ()
+ [x] -> g -< x+1
+ x1:x2:xs -> do
+ y <- h -< (x1, x2)
+ ys <- k -< xs
+ returnA -< y:ys
+</screen>
+The syntax is the same as for <literal>case</literal> expressions,
+except that the bodies of the alternatives are commands rather than expressions.
+The translation is similar to that of <literal>if</literal> commands.
+</para>
-<programlisting>
-lookup :: FiniteMap -> Int -> Maybe Int
-</programlisting>
+</sect2>
-The lookup returns <function>Nothing</function> if the supplied key is not in the domain of the mapping, and <function>(Just v)</function> otherwise,
-where <VarName>v</VarName> is the value that the key maps to. Now consider the following definition:
+<sect2>
+<title>Defining your own control structures</title>
+
+<para>
+As we're seen, arrow notation provides constructs,
+modelled on those for expressions,
+for sequencing, value recursion and conditionals.
+But suitable combinators,
+which you can define in ordinary Haskell,
+may also be used to build new commands out of existing ones.
+The basic idea is that a command defines an arrow from environments to values.
+These environments assign values to the free local variables of the command.
+Thus combinators that produce arrows from arrows
+may also be used to build commands from commands.
+For example, the <literal>ArrowChoice</literal> class includes a combinator
+<programlisting>
+ArrowChoice a => (<+>) :: a e c -> a e c -> a e c
+</programlisting>
+so we can use it to build commands:
+<programlisting>
+expr' = proc x ->
+ returnA -< x
+ <+> do
+ symbol Plus -< ()
+ y <- term -< ()
+ expr' -< x + y
+ <+> do
+ symbol Minus -< ()
+ y <- term -< ()
+ expr' -< x - y
+</programlisting>
+This is equivalent to
+<programlisting>
+expr' = (proc x -> returnA -< x)
+ <+> (proc x -> do
+ symbol Plus -< ()
+ y <- term -< ()
+ expr' -< x + y)
+ <+> (proc x -> do
+ symbol Minus -< ()
+ y <- term -< ()
+ expr' -< x - y)
+</programlisting>
+It is essential that this operator be polymorphic in <literal>e</literal>
+(representing the environment input to the command
+and thence to its subcommands)
+and satisfy the corresponding naturality property
+<screen>
+arr k >>> (f <+> g) = (arr k >>> f) <+> (arr k >>> g)
+</screen>
+at least for strict <literal>k</literal>.
+(This should be automatic if you're not using <literal>seq</literal>.)
+This ensures that environments seen by the subcommands are environments
+of the whole command,
+and also allows the translation to safely trim these environments.
+The operator must also not use any variable defined within the current
+arrow abstraction.
</para>
+<para>
+We could define our own operator
<programlisting>
-clunky env var1 var2 | ok1 && ok2 = val1 + val2
-| otherwise = var1 + var2
-where
- m1 = lookup env var1
- m2 = lookup env var2
- ok1 = maybeToBool m1
- ok2 = maybeToBool m2
- val1 = expectJust m1
- val2 = expectJust m2
+untilA :: ArrowChoice a => a e () -> a e Bool -> a e ()
+untilA body cond = proc x ->
+ if cond x then returnA -< ()
+ else do
+ body -< x
+ untilA body cond -< x
</programlisting>
-
-<para>
-The auxiliary functions are
+and use it in the same way.
+Of course this infix syntax only makes sense for binary operators;
+there is also a more general syntax involving special brackets:
+<screen>
+proc x -> do
+ y <- f -< x+1
+ (|untilA (increment -< x+y) (within 0.5 -< x)|)
+</screen>
</para>
-<programlisting>
-maybeToBool :: Maybe a -> Bool
-maybeToBool (Just x) = True
-maybeToBool Nothing = False
+</sect2>
-expectJust :: Maybe a -> a
-expectJust (Just x) = x
-expectJust Nothing = error "Unexpected Nothing"
-</programlisting>
+<sect2>
+<title>Primitive constructs</title>
<para>
-What is <function>clunky</function> doing? The guard <literal>ok1 &&
-ok2</literal> checks that both lookups succeed, using
-<function>maybeToBool</function> to convert the <function>Maybe</function>
-types to booleans. The (lazily evaluated) <function>expectJust</function>
-calls extract the values from the results of the lookups, and binds the
-returned values to <VarName>val1</VarName> and <VarName>val2</VarName>
-respectively. If either lookup fails, then clunky takes the
-<literal>otherwise</literal> case and returns the sum of its arguments.
+Some operators will need to pass additional inputs to their subcommands.
+For example, in an arrow type supporting exceptions,
+the operator that attaches an exception handler will wish to pass the
+exception that occurred to the handler.
+Such an operator might have a type
+<screen>
+handleA :: ... => a e c -> a (e,Ex) c -> a e c
+</screen>
+where <literal>Ex</literal> is the type of exceptions handled.
+You could then use this with arrow notation by writing a command
+<screen>
+body `handleA` \ ex -> handler
+</screen>
+so that if an exception is raised in the command <literal>body</literal>,
+the variable <literal>ex</literal> is bound to the value of the exception
+and the command <literal>handler</literal>,
+which typically refers to <literal>ex</literal>, is entered.
+Though the syntax here looks like a functional lambda,
+we are talking about commands, and something different is going on.
+The input to the arrow represented by a command consists of values for
+the free local variables in the command, plus a stack of anonymous values.
+In all the prior examples, this stack was empty.
+In the second argument to <literal>handleA</literal>,
+this stack consists of one value, the value of the exception.
+The command form of lambda merely gives this value a name.
+</para>
+
+<para>
+More concretely,
+the values on the stack are paired to the right of the environment.
+So when designing operators like <literal>handleA</literal> that pass
+extra inputs to their subcommands,
+More precisely, the type of each argument of the operator (and its result)
+should have the form
+<screen>
+a (...(e,t1), ... tn) t
+</screen>
+where <replaceable>e</replaceable> is a polymorphic variable
+(representing the environment)
+and <replaceable>ti</replaceable> are the types of the values on the stack,
+with <replaceable>t1</replaceable> being the <quote>top</quote>.
+The polymorphic variable <replaceable>e</replaceable> must not occur in
+<replaceable>a</replaceable>, <replaceable>ti</replaceable> or
+<replaceable>t</replaceable>.
+However the arrows involved need not be the same.
+Here are some more examples of suitable operators:
+<screen>
+bracketA :: ... => a e b -> a (e,b) c -> a (e,c) d -> a e d
+runReader :: ... => a e c -> a' (e,State) c
+runState :: ... => a e c -> a' (e,State) (c,State)
+</screen>
+We can supply the extra input required by commands built with the last two
+by applying them to ordinary expressions, as in
+<screen>
+proc x -> do
+ s <- ...
+ (|runReader (do { ... })|) s
+</screen>
+which adds <literal>s</literal> to the stack of inputs to the command
+built using <literal>runReader</literal>.
</para>
<para>
-This is certainly legal Haskell, but it is a tremendously verbose and
-un-obvious way to achieve the desired effect. Arguably, a more direct way
-to write clunky would be to use case expressions:
-</para>
+The command versions of lambda abstraction and application are analogous to
+the expression versions.
+In particular, the beta and eta rules describe equivalences of commands.
+These three features (operators, lambda abstraction and application)
+are the core of the notation; everything else can be built using them,
+though the results would be somewhat clumsy.
+For example, we could simulate <literal>do</literal>-notation by defining
+<programlisting>
+bind :: Arrow a => a e b -> a (e,b) c -> a e c
+u `bind` f = returnA &&& u >>> f
+bind_ :: Arrow a => a e b -> a e c -> a e c
+u `bind_` f = u `bind` (arr fst >>> f)
+</programlisting>
+We could simulate <literal>do</literal> by defining
<programlisting>
-clunky env var1 var1 = case lookup env var1 of
- Nothing -> fail
- Just val1 -> case lookup env var2 of
- Nothing -> fail
- Just val2 -> val1 + val2
-where
- fail = val1 + val2
+cond :: ArrowChoice a => a e b -> a e b -> a (e,Bool) b
+cond f g = arr (\ (e,b) -> if b then Left e else Right e) >>> f ||| g
</programlisting>
+</para>
-<para>
-This is a bit shorter, but hardly better. Of course, we can rewrite any set
-of pattern-matching, guarded equations as case expressions; that is
-precisely what the compiler does when compiling equations! The reason that
-Haskell provides guarded equations is because they allow us to write down
-the cases we want to consider, one at a time, independently of each other.
-This structure is hidden in the case version. Two of the right-hand sides
-are really the same (<function>fail</function>), and the whole expression
-tends to become more and more indented.
+</sect2>
+
+<sect2>
+<title>Differences with the paper</title>
+
+<itemizedlist>
+
+<listitem>
+<para>Instead of a single form of arrow application (arrow tail) with two
+translations, the implementation provides two forms
+<quote><literal>-<</literal></quote> (first-order)
+and <quote><literal>-<<</literal></quote> (higher-order).
</para>
+</listitem>
-<para>
-Here is how I would write clunky:
+<listitem>
+<para>User-defined operators are flagged with banana brackets instead of
+a new <literal>form</literal> keyword.
</para>
+</listitem>
-<programlisting>
-clunky env var1 var1
- | Just val1 <- lookup env var1
- , Just val2 <- lookup env var2
- = val1 + val2
-...other equations for clunky...
-</programlisting>
+</itemizedlist>
-<para>
-The semantics should be clear enough. The qualifers are matched in order.
-For a <literal><-</literal> qualifier, which I call a pattern guard, the
-right hand side is evaluated and matched against the pattern on the left.
-If the match fails then the whole guard fails and the next equation is
-tried. If it succeeds, then the appropriate binding takes place, and the
-next qualifier is matched, in the augmented environment. Unlike list
-comprehensions, however, the type of the expression to the right of the
-<literal><-</literal> is the same as the type of the pattern to its
-left. The bindings introduced by pattern guards scope over all the
-remaining guard qualifiers, and over the right hand side of the equation.
-</para>
+</sect2>
+
+<sect2>
+<title>Portability</title>
+
+<para>
+Although only GHC implements arrow notation directly,
+there is also a preprocessor
+(available from the
+<ulink url="http://www.haskell.org/arrows/">arrows web page></ulink>)
+that translates arrow notation into Haskell 98
+for use with other Haskell systems.
+You would still want to check arrow programs with GHC;
+tracing type errors in the preprocessor output is not easy.
+Modules intended for both GHC and the preprocessor must observe some
+additional restrictions:
+<itemizedlist>
+<listitem>
<para>
-Just as with list comprehensions, boolean expressions can be freely mixed
-with among the pattern guards. For example:
+The module must import
+<ulink url="../base/Control.Arrow.html"><literal>Control.Arrow</literal></ulink>.
</para>
+</listitem>
-<programlisting>
-f x | [y] <- x
- , y > 3
- , Just z <- h y
- = ...
-</programlisting>
-
+<listitem>
<para>
-Haskell's current guards therefore emerge as a special case, in which the
-qualifier list has just one element, a boolean expression.
+The preprocessor cannot cope with other Haskell extensions.
+These would have to go in separate modules.
</para>
-</sect2>
+</listitem>
-<!-- ===================== PARALLEL LIST COMPREHENSIONS =================== -->
+<listitem>
+<para>
+Because the preprocessor targets Haskell (rather than Core),
+<literal>let</literal>-bound variables are monomorphic.
+</para>
+</listitem>
- <sect2 id="parallel-list-comprehensions">
- <title>Parallel List Comprehensions</title>
- <indexterm><primary>list comprehensions</primary><secondary>parallel</secondary>
- </indexterm>
- <indexterm><primary>parallel list comprehensions</primary>
- </indexterm>
+</itemizedlist>
+</para>
- <para>Parallel list comprehensions are a natural extension to list
- comprehensions. List comprehensions can be thought of as a nice
- syntax for writing maps and filters. Parallel comprehensions
- extend this to include the zipWith family.</para>
+</sect2>
- <para>A parallel list comprehension has multiple independent
- branches of qualifier lists, each separated by a `|' symbol. For
- example, the following zips together two lists:</para>
+</sect1>
-<programlisting>
- [ (x, y) | x <- xs | y <- ys ]
-</programlisting>
+<!-- ==================== ASSERTIONS ================= -->
- <para>The behavior of parallel list comprehensions follows that of
- zip, in that the resulting list will have the same length as the
- shortest branch.</para>
+<sect1 id="sec-assertions">
+<title>Assertions
+<indexterm><primary>Assertions</primary></indexterm>
+</title>
- <para>We can define parallel list comprehensions by translation to
- regular comprehensions. Here's the basic idea:</para>
+<para>
+If you want to make use of assertions in your standard Haskell code, you
+could define a function like the following:
+</para>
- <para>Given a parallel comprehension of the form: </para>
+<para>
<programlisting>
- [ e | p1 <- e11, p2 <- e12, ...
- | q1 <- e21, q2 <- e22, ...
- ...
- ]
+assert :: Bool -> a -> a
+assert False x = error "assertion failed!"
+assert _ x = x
</programlisting>
- <para>This will be translated to: </para>
+</para>
-<programlisting>
- [ e | ((p1,p2), (q1,q2), ...) <- zipN [(p1,p2) | p1 <- e11, p2 <- e12, ...]
- [(q1,q2) | q1 <- e21, q2 <- e22, ...]
- ...
- ]
-</programlisting>
+<para>
+which works, but gives you back a less than useful error message --
+an assertion failed, but which and where?
+</para>
- <para>where `zipN' is the appropriate zip for the given number of
- branches.</para>
+<para>
+One way out is to define an extended <function>assert</function> function which also
+takes a descriptive string to include in the error message and
+perhaps combine this with the use of a pre-processor which inserts
+the source location where <function>assert</function> was used.
+</para>
- </sect2>
+<para>
+Ghc offers a helping hand here, doing all of this for you. For every
+use of <function>assert</function> in the user's source:
+</para>
-<sect2 id="rebindable-syntax">
-<title>Rebindable syntax</title>
+<para>
+<programlisting>
+kelvinToC :: Double -> Double
+kelvinToC k = assert (k >= 0.0) (k+273.15)
+</programlisting>
- <para>GHC allows most kinds of built-in syntax to be rebound by
- the user, to facilitate replacing the <literal>Prelude</literal>
- with a home-grown version, for example.</para>
+</para>
- <para>You may want to define your own numeric class
- hierarchy. It completely defeats that purpose if the
- literal "1" means "<literal>Prelude.fromInteger
- 1</literal>", which is what the Haskell Report specifies.
- So the <option>-fno-implicit-prelude</option> flag causes
- the following pieces of built-in syntax to refer to
- <emphasis>whatever is in scope</emphasis>, not the Prelude
- versions:</para>
+<para>
+Ghc will rewrite this to also include the source location where the
+assertion was made,
+</para>
- <itemizedlist>
- <listitem>
- <para>Integer and fractional literals mean
- "<literal>fromInteger 1</literal>" and
- "<literal>fromRational 3.2</literal>", not the
- Prelude-qualified versions; both in expressions and in
- patterns. </para>
- </listitem>
+<para>
- <listitem>
- <para>Negation (e.g. "<literal>- (f x)</literal>")
- means "<literal>negate (f x)</literal>" (not
- <literal>Prelude.negate</literal>).</para>
- </listitem>
+<programlisting>
+assert pred val ==> assertError "Main.hs|15" pred val
+</programlisting>
- <listitem>
- <para>In an n+k pattern, the standard Prelude
- <literal>Ord</literal> class is still used for comparison,
- but the necessary subtraction uses whatever
- "<literal>(-)</literal>" is in scope (not
- "<literal>Prelude.(-)</literal>").</para>
- </listitem>
+</para>
- <listitem>
- <para>"Do" notation is translated using whatever
- functions <literal>(>>=)</literal>,
- <literal>(>>)</literal>, <literal>fail</literal>, and
- <literal>return</literal>, are in scope (not the Prelude
- versions). List comprehensions, and parallel array
- comprehensions, are unaffected. </para></listitem>
- </itemizedlist>
+<para>
+The rewrite is only performed by the compiler when it spots
+applications of <function>Control.Exception.assert</function>, so you
+can still define and use your own versions of
+<function>assert</function>, should you so wish. If not, import
+<literal>Control.Exception</literal> to make use
+<function>assert</function> in your code.
+</para>
- <para>Be warned: this is an experimental facility, with fewer checks than
- usual. In particular, it is essential that the functions GHC finds in scope
- must have the appropriate types, namely:
- <screen>
- fromInteger :: forall a. (...) => Integer -> a
- fromRational :: forall a. (...) => Rational -> a
- negate :: forall a. (...) => a -> a
- (-) :: forall a. (...) => a -> a -> a
- (>>=) :: forall m a. (...) => m a -> (a -> m b) -> m b
- (>>) :: forall m a. (...) => m a -> m b -> m b
- return :: forall m a. (...) => a -> m a
- fail :: forall m a. (...) => String -> m a
- </screen>
- (The (...) part can be any context including the empty context; that part
- is up to you.)
- If the functions don't have the right type, very peculiar things may
- happen. Use <literal>-dcore-lint</literal> to
- typecheck the desugared program. If Core Lint is happy you should be all right.</para>
+<para>
+To have the compiler ignore uses of assert, use the compiler option
+<option>-fignore-asserts</option>. <indexterm><primary>-fignore-asserts
+option</primary></indexterm> That is, expressions of the form
+<literal>assert pred e</literal> will be rewritten to
+<literal>e</literal>.
+</para>
+
+<para>
+Assertion failures can be caught, see the documentation for the
+<literal>Control.Exception</literal> library for the details.
+</para>
-</sect2>
</sect1>
+
<!-- =============================== PRAGMAS =========================== -->
<sect1 id="pragmas">
unrecognised <replaceable>word</replaceable> is (silently)
ignored.</para>
-<sect2 id="inline-pragma">
-<title>INLINE pragma
+ <sect2 id="deprecated-pragma">
+ <title>DEPRECATED pragma</title>
+ <indexterm><primary>DEPRECATED</primary>
+ </indexterm>
-<indexterm><primary>INLINE pragma</primary></indexterm>
-<indexterm><primary>pragma, INLINE</primary></indexterm></title>
+ <para>The DEPRECATED pragma lets you specify that a particular
+ function, class, or type, is deprecated. There are two
+ forms.</para>
-<para>
-GHC (with <option>-O</option>, as always) tries to inline (or “unfold”)
-functions/values that are “small enough,” thus avoiding the call
-overhead and possibly exposing other more-wonderful optimisations.
-</para>
+ <itemizedlist>
+ <listitem>
+ <para>You can deprecate an entire module thus:</para>
+<programlisting>
+ module Wibble {-# DEPRECATED "Use Wobble instead" #-} where
+ ...
+</programlisting>
+ <para>When you compile any module that import
+ <literal>Wibble</literal>, GHC will print the specified
+ message.</para>
+ </listitem>
-<para>
-You will probably see these unfoldings (in Core syntax) in your
-interface files.
-</para>
+ <listitem>
+ <para>You can deprecate a function, class, or type, with the
+ following top-level declaration:</para>
+<programlisting>
+ {-# DEPRECATED f, C, T "Don't use these" #-}
+</programlisting>
+ <para>When you compile any module that imports and uses any
+ of the specifed entities, GHC will print the specified
+ message.</para>
+ </listitem>
+ </itemizedlist>
-<para>
-Normally, if GHC decides a function is “too expensive” to inline, it
-will not do so, nor will it export that unfolding for other modules to
-use.
-</para>
+ <para>You can suppress the warnings with the flag
+ <option>-fno-warn-deprecations</option>.</para>
+ </sect2>
-<para>
-The sledgehammer you can bring to bear is the
-<literal>INLINE</literal><indexterm><primary>INLINE pragma</primary></indexterm> pragma, used thusly:
+ <sect2 id="inline-noinline-pragma">
+ <title>INLINE and NOINLINE pragmas</title>
+
+ <para>These pragmas control the inlining of function
+ definitions.</para>
+
+ <sect3 id="inline-pragma">
+ <title>INLINE pragma</title>
+ <indexterm><primary>INLINE</primary></indexterm>
+
+ <para>GHC (with <option>-O</option>, as always) tries to
+ inline (or “unfold”) functions/values that are
+ “small enough,” thus avoiding the call overhead
+ and possibly exposing other more-wonderful optimisations.
+ Normally, if GHC decides a function is “too
+ expensive” to inline, it will not do so, nor will it
+ export that unfolding for other modules to use.</para>
+
+ <para>The sledgehammer you can bring to bear is the
+ <literal>INLINE</literal><indexterm><primary>INLINE
+ pragma</primary></indexterm> pragma, used thusly:</para>
<programlisting>
key_function :: Int -> String -> (Bool, Double)
#endif
</programlisting>
-(You don't need to do the C pre-processor carry-on unless you're going
-to stick the code through HBC—it doesn't like <literal>INLINE</literal> pragmas.)
-</para>
+ <para>(You don't need to do the C pre-processor carry-on
+ unless you're going to stick the code through HBC—it
+ doesn't like <literal>INLINE</literal> pragmas.)</para>
-<para>
-The major effect of an <literal>INLINE</literal> pragma is to declare a function's
-“cost” to be very low. The normal unfolding machinery will then be
-very keen to inline it.
-</para>
+ <para>The major effect of an <literal>INLINE</literal> pragma
+ is to declare a function's “cost” to be very low.
+ The normal unfolding machinery will then be very keen to
+ inline it.</para>
-<para>
-An <literal>INLINE</literal> pragma for a function can be put anywhere its type
-signature could be put.
-</para>
+ <para>Syntactially, an <literal>INLINE</literal> pragma for a
+ function can be put anywhere its type signature could be
+ put.</para>
-<para>
-<literal>INLINE</literal> pragmas are a particularly good idea for the
-<literal>then</literal>/<literal>return</literal> (or <literal>bind</literal>/<literal>unit</literal>) functions in a monad.
-For example, in GHC's own <literal>UniqueSupply</literal> monad code, we have:
+ <para><literal>INLINE</literal> pragmas are a particularly
+ good idea for the
+ <literal>then</literal>/<literal>return</literal> (or
+ <literal>bind</literal>/<literal>unit</literal>) functions in
+ a monad. For example, in GHC's own
+ <literal>UniqueSupply</literal> monad code, we have:</para>
<programlisting>
#ifdef __GLASGOW_HASKELL__
#endif
</programlisting>
-</para>
+ <para>See also the <literal>NOINLINE</literal> pragma (<xref
+ linkend="noinline-pragma">).</para>
+ </sect3>
+
+ <sect3 id="noinline-pragma">
+ <title>NOINLINE pragma</title>
+
+ <indexterm><primary>NOINLINE</primary></indexterm>
+ <indexterm><primary>NOTINLINE</primary></indexterm>
+
+ <para>The <literal>NOINLINE</literal> pragma does exactly what
+ you'd expect: it stops the named function from being inlined
+ by the compiler. You shouldn't ever need to do this, unless
+ you're very cautious about code size.</para>
+
+ <para><literal>NOTINLINE</literal> is a synonym for
+ <literal>NOINLINE</literal> (<literal>NOTINLINE</literal> is
+ specified by Haskell 98 as the standard way to disable
+ inlining, so it should be used if you want your code to be
+ portable).</para>
+ </sect3>
+
+ <sect3 id="phase-control">
+ <title>Phase control</title>
+
+ <para> Sometimes you want to control exactly when in GHC's
+ pipeline the INLINE pragma is switched on. Inlining happens
+ only during runs of the <emphasis>simplifier</emphasis>. Each
+ run of the simplifier has a different <emphasis>phase
+ number</emphasis>; the phase number decreases towards zero.
+ If you use <option>-dverbose-core2core</option> you'll see the
+ sequence of phase numbers for successive runs of the
+ simpifier. In an INLINE pragma you can optionally specify a
+ phase number, thus:</para>
+
+ <itemizedlist>
+ <listitem>
+ <para>You can say "inline <literal>f</literal> in Phase 2
+ and all subsequent phases":
+<programlisting>
+ {-# INLINE [2] f #-}
+</programlisting>
+ </para>
+ </listitem>
-</sect2>
+ <listitem>
+ <para>You can say "inline <literal>g</literal> in all
+ phases up to, but not including, Phase 3":
+<programlisting>
+ {-# INLINE [~3] g #-}
+</programlisting>
+ </para>
+ </listitem>
-<sect2 id="noinline-pragma">
-<title>NOINLINE pragma
-</title>
+ <listitem>
+ <para>If you omit the phase indicator, you mean "inline in
+ all phases".</para>
+ </listitem>
+ </itemizedlist>
-<indexterm><primary>NOINLINE pragma</primary></indexterm>
-<indexterm><primary>pragma</primary><secondary>NOINLINE</secondary></indexterm>
-<indexterm><primary>NOTINLINE pragma</primary></indexterm>
-<indexterm><primary>pragma</primary><secondary>NOTINLINE</secondary></indexterm>
+ <para>You can use a phase number on a NOINLINE pragma too:</para>
-<para>
-The <literal>NOINLINE</literal> pragma does exactly what you'd expect:
-it stops the named function from being inlined by the compiler. You
-shouldn't ever need to do this, unless you're very cautious about code
-size.
-</para>
+ <itemizedlist>
+ <listitem>
+ <para>You can say "do not inline <literal>f</literal>
+ until Phase 2; in Phase 2 and subsequently behave as if
+ there was no pragma at all":
+<programlisting>
+ {-# NOINLINE [2] f #-}
+</programlisting>
+ </para>
+ </listitem>
-<para><literal>NOTINLINE</literal> is a synonym for
-<literal>NOINLINE</literal> (<literal>NOTINLINE</literal> is specified
-by Haskell 98 as the standard way to disable inlining, so it should be
-used if you want your code to be portable).</para>
+ <listitem>
+ <para>You can say "do not inline <literal>g</literal> in
+ Phase 3 or any subsequent phase; before that, behave as if
+ there was no pragma":
+<programlisting>
+ {-# NOINLINE [~3] g #-}
+</programlisting>
+ </para>
+ </listitem>
-</sect2>
+ <listitem>
+ <para>If you omit the phase indicator, you mean "never
+ inline this function".</para>
+ </listitem>
+ </itemizedlist>
+
+ <para>The same phase-numbering control is available for RULES
+ (<xref LinkEnd="rewrite-rules">).</para>
+ </sect3>
+ </sect2>
+
+ <sect2 id="line-pragma">
+ <title>LINE pragma</title>
+
+ <indexterm><primary>LINE</primary><secondary>pragma</secondary></indexterm>
+ <indexterm><primary>pragma</primary><secondary>LINE</secondary></indexterm>
+ <para>This pragma is similar to C's <literal>#line</literal>
+ pragma, and is mainly for use in automatically generated Haskell
+ code. It lets you specify the line number and filename of the
+ original code; for example</para>
+
+<programlisting>
+{-# LINE 42 "Foo.vhs" #-}
+</programlisting>
+
+ <para>if you'd generated the current file from something called
+ <filename>Foo.vhs</filename> and this line corresponds to line
+ 42 in the original. GHC will adjust its error messages to refer
+ to the line/file named in the <literal>LINE</literal>
+ pragma.</para>
+ </sect2>
+
+ <sect2 id="options-pragma">
+ <title>OPTIONS pragma</title>
+ <indexterm><primary>OPTIONS</primary>
+ </indexterm>
+ <indexterm><primary>pragma</primary><secondary>OPTIONS</secondary>
+ </indexterm>
+
+ <para>The <literal>OPTIONS</literal> pragma is used to specify
+ additional options that are given to the compiler when compiling
+ this source file. See <xref linkend="source-file-options"> for
+ details.</para>
+ </sect2>
+
+ <sect2 id="rules">
+ <title>RULES pragma</title>
+
+ <para>The RULES pragma lets you specify rewrite rules. It is
+ described in <xref LinkEnd="rewrite-rules">.</para>
+ </sect2>
<sect2 id="specialize-pragma">
<title>SPECIALIZE pragma</title>
{-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-}
</programlisting>
- <para>To get very fancy, you can also specify a named function
- to use for the specialised value, as in:</para>
+ <para>A <literal>SPECIALIZE</literal> pragma for a function can
+ be put anywhere its type signature could be put.</para>
+<para>A <literal>SPECIALIZE</literal> has the effect of generating (a) a specialised
+version of the function and (b) a rewrite rule (see <xref linkend="rules">) that
+rewrites a call to the un-specialised function into a call to the specialised
+one. You can, instead, provide your own specialised function and your own rewrite rule.
+For example, suppose that:
<programlisting>
-{-# RULES hammeredLookup = blah #-}
+ genericLookup :: Ord a => Table a b -> a -> b
+ intLookup :: Table Int b -> Int -> b
</programlisting>
-
- <para>where <literal>blah</literal> is an implementation of
- <literal>hammerdLookup</literal> written specialy for
- <literal>Widget</literal> lookups. It's <emphasis>Your
+where <literal>intLookup</literal> is an implementation of <literal>genericLookup</literal>
+that works very fast for keys of type <literal>Int</literal>. Then you can write the rule
+<programlisting>
+ {-# RULES "intLookup" genericLookup = intLookup #-}
+</programlisting>
+(see <xref linkend="rule-spec">). It is <emphasis>Your
Responsibility</emphasis> to make sure that
- <function>blah</function> really behaves as a specialised
- version of <function>hammeredLookup</function>!!!</para>
-
- <para>Note we use the <literal>RULE</literal> pragma here to
- indicate that <literal>hammeredLookup</literal> applied at a
- certain type should be replaced by <literal>blah</literal>. See
- <xref linkend="rules"> for more information on
- <literal>RULES</literal>.</para>
+ <function>intLookup</function> really behaves as a specialised
+ version of <function>genericLookup</function>!!!</para>
<para>An example in which using <literal>RULES</literal> for
specialisation will Win Big:
<programlisting>
-toDouble :: Real a => a -> Double
-toDouble = fromRational . toRational
+ toDouble :: Real a => a -> Double
+ toDouble = fromRational . toRational
-{-# SPECIALIZE toDouble :: Int -> Double = i2d #-}
-i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
+ {-# RULES "toDouble/Int" toDouble = i2d #-}
+ i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly
</programlisting>
The <function>i2d</function> function is virtually one machine
instruction; the default conversion—via an intermediate
<literal>Rational</literal>—is obscenely expensive by
- comparison.</para>
-
- <para>A <literal>SPECIALIZE</literal> pragma for a function can
- be put anywhere its type signature could be put.</para>
+ comparison.</para>
</sect2>
</sect2>
-<sect2 id="line-pragma">
-<title>LINE pragma
-</title>
-
-<para>
-<indexterm><primary>LINE pragma</primary></indexterm>
-<indexterm><primary>pragma, LINE</primary></indexterm>
-</para>
-
-<para>
-This pragma is similar to C's <literal>#line</literal> pragma, and is mainly for use in
-automatically generated Haskell code. It lets you specify the line
-number and filename of the original code; for example
-</para>
-
-<para>
-
-<programlisting>
-{-# LINE 42 "Foo.vhs" #-}
-</programlisting>
-
-</para>
-
-<para>
-if you'd generated the current file from something called <filename>Foo.vhs</filename>
-and this line corresponds to line 42 in the original. GHC will adjust
-its error messages to refer to the line/file named in the <literal>LINE</literal>
-pragma.
-</para>
-
-</sect2>
-
-<sect2 id="rules">
-<title>RULES pragma</title>
-
-<para>
-The RULES pragma lets you specify rewrite rules. It is described in
-<xref LinkEnd="rewrite-rules">.
-</para>
-
-</sect2>
-
-<sect2 id="deprecated-pragma">
-<title>DEPRECATED pragma</title>
-
-<para>
-The DEPRECATED pragma lets you specify that a particular function, class, or type, is deprecated.
-There are two forms.
-</para>
-<itemizedlist>
-<listitem><para>
-You can deprecate an entire module thus:</para>
-<programlisting>
- module Wibble {-# DEPRECATED "Use Wobble instead" #-} where
- ...
-</programlisting>
-<para>
-When you compile any module that import <literal>Wibble</literal>, GHC will print
-the specified message.</para>
-</listitem>
-
-<listitem>
-<para>
-You can deprecate a function, class, or type, with the following top-level declaration:
-</para>
-<programlisting>
- {-# DEPRECATED f, C, T "Don't use these" #-}
-</programlisting>
-<para>
-When you compile any module that imports and uses any of the specifed entities,
-GHC will print the specified message.
-</para>
-</listitem>
-</itemizedlist>
-<para>You can suppress the warnings with the flag <option>-fno-warn-deprecations</option>.</para>
-</sect2>
</sect1>
<para>
The programmer can specify rewrite rules as part of the source program
-(in a pragma). GHC applies these rewrite rules wherever it can.
+(in a pragma). GHC applies these rewrite rules wherever it can, provided (a)
+the <option>-O</option> flag (<xref LinkEnd="options-optimise">) is on,
+and (b) the <option>-frules-off</option> flag
+(<xref LinkEnd="options-f">) is not specified.
</para>
<para>
<listitem>
<para>
+ There may be zero or more rules in a <literal>RULES</literal> pragma.
+</para>
+</listitem>
+
+<listitem>
+
+<para>
Each rule has a name, enclosed in double quotes. The name itself has
no significance at all. It is only used when reporting how many times the rule fired.
</para>
</listitem>
-<listitem>
+<listitem>
<para>
- There may be zero or more rules in a <literal>RULES</literal> pragma.
+A rule may optionally have a phase-control number (see <xref LinkEnd="phase-control">),
+immediately after the name of the rule. Thus:
+<programlisting>
+ {-# RULES
+ "map/map" [2] forall f g xs. map f (map g xs) = map (f.g) xs
+ #-}
+</programlisting>
+The "[2]" means that the rule is active in Phase 2 and subsequent phases. The inverse
+notation "[~2]" is also accepted, meaning that the rule is active up to, but not including,
+Phase 2.
</para>
</listitem>
+
+
<listitem>
<para>
enclosing definitions.
</para>
</listitem>
+
<listitem>
<para>
</para>
-<para>
+ <para>
So, for example, the following should generate no intermediate lists:
<programlisting>
<listitem>
<para>
- The defintion of (say) <function>build</function> in <FileName>PrelBase.lhs</FileName> looks llike this:
+ The defintion of (say) <function>build</function> in <FileName>GHC/Base.lhs</FileName> looks llike this:
<programlisting>
build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
<listitem>
<para>
- In <filename>ghc/lib/std/PrelBase.lhs</filename> look at the rules for <function>map</function> to
+ In <filename>libraries/base/GHC/Base.lhs</filename> look at the rules for <function>map</function> to
see how to write rules that will do fusion and yet give an efficient
-program even if fusion doesn't happen. More rules in <filename>PrelList.lhs</filename>.
+program even if fusion doesn't happen. More rules in <filename>GHC/List.lhs</filename>.
</para>
</listitem>
</sect2>
+<sect2 id="core-pragma">
+ <title>CORE pragma</title>
+
+ <indexterm><primary>CORE pragma</primary></indexterm>
+ <indexterm><primary>pragma, CORE</primary></indexterm>
+ <indexterm><primary>core, annotation</primary></indexterm>
+
+<para>
+ The external core format supports <quote>Note</quote> annotations;
+ the <literal>CORE</literal> pragma gives a way to specify what these
+ should be in your Haskell source code. Syntactically, core
+ annotations are attached to expressions and take a Haskell string
+ literal as an argument. The following function definition shows an
+ example:
+
+<programlisting>
+f x = ({-# CORE "foo" #-} show) ({-# CORE "bar" #-} x)
+</programlisting>
+
+ Sematically, this is equivalent to:
+
+<programlisting>
+g x = show x
+</programlisting>
+</para>
+
+<para>
+ However, when external for is generated (via
+ <option>-fext-core</option>), there will be Notes attached to the
+ expressions <function>show</function> and <VarName>x</VarName>.
+ The core function declaration for <function>f</function> is:
+</para>
+
+<programlisting>
+ f :: %forall a . GHCziShow.ZCTShow a ->
+ a -> GHCziBase.ZMZN GHCziBase.Char =
+ \ @ a (zddShow::GHCziShow.ZCTShow a) (eta::a) ->
+ (%note "foo"
+ %case zddShow %of (tpl::GHCziShow.ZCTShow a)
+ {GHCziShow.ZCDShow
+ (tpl1::GHCziBase.Int ->
+ a ->
+ GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha
+r)
+ (tpl2::a -> GHCziBase.ZMZN GHCziBase.Char)
+ (tpl3::GHCziBase.ZMZN a ->
+ GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha
+r) ->
+ tpl2})
+ (%note "foo"
+ eta);
+</programlisting>
+
+<para>
+ Here, we can see that the function <function>show</function> (which
+ has been expanded out to a case expression over the Show dictionary)
+ has a <literal>%note</literal> attached to it, as does the
+ expression <VarName>eta</VarName> (which used to be called
+ <VarName>x</VarName>).
+</para>
+
+</sect2>
+
</sect1>
<sect1 id="generic-classes">
instance (Bin a, Bin b) => Bin (a,b)
instance Bin a => Bin [a]
</programlisting>
-That is, just leave off the "where" clasuse. Of course, you can put in the
+That is, just leave off the "where" clause. Of course, you can put in the
where clause and over-ride whichever methods you please.
</para>
</sect2>
</sect1>
-<sect1 id="newtype-deriving">
-<title>Generalised derived instances for newtypes</title>
-
-<para>
-When you define an abstract type using <literal>newtype</literal>, you may want
-the new type to inherit some instances from its representation. In
-Haskell 98, you can inherit instances of <literal>Eq</literal>, <literal>Ord</literal>,
-<literal>Enum</literal> and <literal>Bounded</literal> by deriving them, but for any
-other classes you have to write an explicit instance declaration. For
-example, if you define
-
-<programlisting>
- newtype Dollars = Dollars Int
-</programlisting>
-
-and you want to use arithmetic on <literal>Dollars</literal>, you have to
-explicitly define an instance of <literal>Num</literal>:
-
-<programlisting>
- instance Num Dollars where
- Dollars a + Dollars b = Dollars (a+b)
- ...
-</programlisting>
-All the instance does is apply and remove the <literal>newtype</literal>
-constructor. It is particularly galling that, since the constructor
-doesn't appear at run-time, this instance declaration defines a
-dictionary which is <emphasis>wholly equivalent</emphasis> to the <literal>Int</literal>
-dictionary, only slower!
-</para>
-
-<sect2> <title> Generalising the deriving clause </title>
-<para>
-GHC now permits such instances to be derived instead, so one can write
-<programlisting>
- newtype Dollars = Dollars Int deriving (Eq,Show,Num)
-</programlisting>
-
-and the implementation uses the <emphasis>same</emphasis> <literal>Num</literal> dictionary
-for <literal>Dollars</literal> as for <literal>Int</literal>. Notionally, the compiler
-derives an instance declaration of the form
-
-<programlisting>
- instance Num Int => Num Dollars
-</programlisting>
-
-which just adds or removes the <literal>newtype</literal> constructor according to the type.
-</para>
-<para>
-
-We can also derive instances of constructor classes in a similar
-way. For example, suppose we have implemented state and failure monad
-transformers, such that
-
-<programlisting>
- instance Monad m => Monad (State s m)
- instance Monad m => Monad (Failure m)
-</programlisting>
-In Haskell 98, we can define a parsing monad by
-<programlisting>
- type Parser tok m a = State [tok] (Failure m) a
-</programlisting>
-
-which is automatically a monad thanks to the instance declarations
-above. With the extension, we can make the parser type abstract,
-without needing to write an instance of class <literal>Monad</literal>, via
-
-<programlisting>
- newtype Parser tok m a = Parser (State [tok] (Failure m) a)
- deriving Monad
-</programlisting>
-In this case the derived instance declaration is of the form
-<programlisting>
- instance Monad (State [tok] (Failure m)) => Monad (Parser tok m)
-</programlisting>
-
-Notice that, since <literal>Monad</literal> is a constructor class, the
-instance is a <emphasis>partial application</emphasis> of the new type, not the
-entire left hand side. We can imagine that the type declaration is
-``eta-converted'' to generate the context of the instance
-declaration.
-</para>
-<para>
-
-We can even derive instances of multi-parameter classes, provided the
-newtype is the last class parameter. In this case, a ``partial
-application'' of the class appears in the <literal>deriving</literal>
-clause. For example, given the class
-
-<programlisting>
- class StateMonad s m | m -> s where ...
- instance Monad m => StateMonad s (State s m) where ...
-</programlisting>
-then we can derive an instance of <literal>StateMonad</literal> for <literal>Parser</literal>s by
-<programlisting>
- newtype Parser tok m a = Parser (State [tok] (Failure m) a)
- deriving (Monad, StateMonad [tok])
-</programlisting>
-
-The derived instance is obtained by completing the application of the
-class to the new type:
-
-<programlisting>
- instance StateMonad [tok] (State [tok] (Failure m)) =>
- StateMonad [tok] (Parser tok m)
-</programlisting>
-</para>
-<para>
-
-As a result of this extension, all derived instances in newtype
-declarations are treated uniformly (and implemented just by reusing
-the dictionary for the representation type), <emphasis>except</emphasis>
-<literal>Show</literal> and <literal>Read</literal>, which really behave differently for
-the newtype and its representation.
-</para>
-</sect2>
-
-<sect2> <title> A more precise specification </title>
-<para>
-Derived instance declarations are constructed as follows. Consider the
-declaration (after expansion of any type synonyms)
-
-<programlisting>
- newtype T v1...vn = T' (S t1...tk vk+1...vn) deriving (c1...cm)
-</programlisting>
-
-where <literal>S</literal> is a type constructor, <literal>t1...tk</literal> are
-types,
-<literal>vk+1...vn</literal> are type variables which do not occur in any of
-the <literal>ti</literal>, and the <literal>ci</literal> are partial applications of
-classes of the form <literal>C t1'...tj'</literal>. The derived instance
-declarations are, for each <literal>ci</literal>,
-
-<programlisting>
- instance ci (S t1...tk vk+1...v) => ci (T v1...vp)
-</programlisting>
-where <literal>p</literal> is chosen so that <literal>T v1...vp</literal> is of the
-right <emphasis>kind</emphasis> for the last parameter of class <literal>Ci</literal>.
-</para>
-<para>
-
-As an example which does <emphasis>not</emphasis> work, consider
-<programlisting>
- newtype NonMonad m s = NonMonad (State s m s) deriving Monad
-</programlisting>
-Here we cannot derive the instance
-<programlisting>
- instance Monad (State s m) => Monad (NonMonad m)
-</programlisting>
-
-because the type variable <literal>s</literal> occurs in <literal>State s m</literal>,
-and so cannot be "eta-converted" away. It is a good thing that this
-<literal>deriving</literal> clause is rejected, because <literal>NonMonad m</literal> is
-not, in fact, a monad --- for the same reason. Try defining
-<literal>>>=</literal> with the correct type: you won't be able to.
-</para>
-<para>
-
-Notice also that the <emphasis>order</emphasis> of class parameters becomes
-important, since we can only derive instances for the last one. If the
-<literal>StateMonad</literal> class above were instead defined as
-
-<programlisting>
- class StateMonad m s | m -> s where ...
-</programlisting>
-
-then we would not have been able to derive an instance for the
-<literal>Parser</literal> type above. We hypothesise that multi-parameter
-classes usually have one "main" parameter for which deriving new
-instances is most interesting.
-</para>
-</sect2>
-</sect1>
-
<!-- Emacs stuff: