X-Git-Url: http://git.megacz.com/?a=blobdiff_plain;f=ghc%2Fdocs%2Fusers_guide%2Fglasgow_exts.sgml;h=ded55673a6a5bcbd8b26ee162832725620dad6a2;hb=b8554efd031fc4ff0d9b3950f8320733058358d8;hp=20d164d16f428646af963dc192827b89b6ab64a6;hpb=2ce9f3af8a25f0ff3488a14ae2883111447f1c9c;p=ghc-hetmet.git diff --git a/ghc/docs/users_guide/glasgow_exts.sgml b/ghc/docs/users_guide/glasgow_exts.sgml index 20d164d..ded5567 100644 --- a/ghc/docs/users_guide/glasgow_exts.sgml +++ b/ghc/docs/users_guide/glasgow_exts.sgml @@ -16,153 +16,15 @@ performance because of the implementation costs of Haskell's -Executive summary of our extensions: - - - - - - Unboxed types and primitive operations: - - You can get right down to the raw machine types and - operations; included in this are “primitive - arrays” (direct access to Big Wads of Bytes). Please - see and following. - - - - - Type system extensions: - - GHC supports a large number of extensions to Haskell's - type system. Specifically: - - - - Multi-parameter type classes: - - - - - - - Functional dependencies: - - - - - - - Implicit parameters: - - - - - - - Local universal quantification: - - - - - - - Extistentially quantification in data types: - - - - - - - Scoped type variables: - - Scoped type variables enable the programmer to - supply type signatures for some nested declarations, - where this would not be legal in Haskell 98. Details in - . - - - - - - - - Pattern guards - - Instead of being a boolean expression, a guard is a list - of qualifiers, exactly as in a list comprehension. See . - - - - - Data types with no constructors - - See . - - - - - Parallel list comprehensions - - An extension to the list comprehension syntax to support - zipWith-like functionality. See . - - - - - Foreign calling: - - Just what it sounds like. We provide - lots of rope that you can dangle around - your neck. Please see . - - - - - Pragmas - - Pragmas are special instructions to the compiler placed - in the source file. The pragmas GHC supports are described in - . - - - - - Rewrite rules: - - The programmer can specify rewrite rules as part of the - source program (in a pragma). GHC applies these rewrite rules - wherever it can. Details in . - - - - - Generic classes: - - (Note: support for generic classes is currently broken - in GHC 5.02). - - Generic class declarations allow you to define a class - whose methods say how to work over an arbitrary data type. - Then it's really easy to make any new type into an instance of - the class. This generalises the rather ad-hoc "deriving" - feature of Haskell 98. Details in . - - - - - Before you get too carried away working at the lowest level (e.g., sloshing MutableByteArray#s around your program), you may wish to check if there are libraries that provide a -“Haskellised veneer” over the features you want. See -. +“Haskellised veneer” over the features you want. The +separate libraries documentation describes all the libraries that come +with GHC. + Language options @@ -191,6 +53,30 @@ program), you may wish to check if there are libraries that provide a + and : + + + + This option enables the language extension defined in the + Haskell 98 Foreign Function Interface Addendum plus deprecated + syntax of previous versions of the FFI for backwards + compatibility. + + + + + : + + + This option enables the deprecated with + keyword for implicit parameters; it is merely provided for backwards + compatibility. + It is independent of the + flag. + + + + : @@ -203,6 +89,7 @@ program), you may wish to check if there are libraries that provide a + @@ -223,6 +110,15 @@ program), you may wish to check if there are libraries that provide a + + + + See . Independent of + . + + + + @@ -244,7 +140,7 @@ program), you may wish to check if there are libraries that provide a module namespace is flat, and you must not conflict with any Prelude module.) - Even though you have not imported the Prelude, all + Even though you have not imported the Prelude, most of the built-in syntax still refers to the built-in Haskell Prelude types and values, as specified by the Haskell Report. For example, the type [Int] @@ -253,51 +149,9 @@ program), you may wish to check if there are libraries that provide a translation for list comprehensions continues to use Prelude.map etc. - With one group of exceptions! You may want to - define your own numeric class hierarchy. It completely - defeats that purpose if the literal "1" means - "Prelude.fromInteger 1", which is what - the Haskell Report specifies. So the - flag causes the - following pieces of built-in syntax to refer to whatever - is in scope, not the Prelude versions: - - - - Integer and fractional literals mean - "fromInteger 1" and - "fromRational 3.2", not the - Prelude-qualified versions; both in expressions and in - patterns. - - - - Negation (e.g. "- (f x)") - means "negate (f x)" (not - Prelude.negate). - - - - In an n+k pattern, the standard Prelude - Ord class is still used for comparison, - but the necessary subtraction uses whatever - "(-)" is in scope (not - "Prelude.(-)"). - - - - Note: Negative literals, such as -3, are - specified by (a careful reading of) the Haskell Report as - meaning Prelude.negate (Prelude.fromInteger 3). - However, GHC deviates from this slightly, and treats them as meaning - fromInteger (-3). One particular effect of this - slightly-non-standard reading is that there is no difficulty with - the literal -2147483648 at type Int; - it means fromInteger (-2147483648). The strict interpretation - would be negate (fromInteger 2147483648), - and the call to fromInteger would overflow - (at type Int, remember). - + However, does + change the handling of certain built-in syntax: see + . @@ -306,144 +160,279 @@ program), you may wish to check if there are libraries that provide a -&primitives; + + + + Unboxed types and primitive operations + +GHC is built on a raft of primitive data types and operations. +While you really can use this stuff to write fast code, + we generally find it a lot less painful, and more satisfying in the + long run, to use higher-level language features and libraries. With + any luck, the code you write will be optimised to the efficient + unboxed version in any case. And if it isn't, we'd like to know + about it. + +We do not currently have good, up-to-date documentation about the +primitives, perhaps because they are mainly intended for internal use. +There used to be a long section about them here in the User Guide, but it +became out of date, and wrong information is worse than none. + +The Real Truth about what primitive types there are, and what operations +work over those types, is held in the file +fptools/ghc/compiler/prelude/primops.txt. +This file is used directly to generate GHC's primitive-operation definitions, so +it is always correct! It is also intended for processing into text. + + Indeed, +the result of such processing is part of the description of the + External + Core language. +So that document is a good place to look for a type-set version. +We would be very happy if someone wanted to volunteer to produce an SGML +back end to the program that processes primops.txt so that +we could include the results here in the User Guide. + +What follows here is a brief summary of some main points. + + +Unboxed types + + + +Unboxed types (Glasgow extension) + - -Primitive state-transformer monad +Most types in GHC are boxed, which means +that values of that type are represented by a pointer to a heap +object. The representation of a Haskell Int, for +example, is a two-word heap object. An unboxed +type, however, is represented by the value itself, no pointers or heap +allocation are involved. + -state transformers (Glasgow extensions) -ST monad (Glasgow extension) +Unboxed types correspond to the “raw machine” types you +would use in C: Int# (long int), +Double# (double), Addr# +(void *), etc. The primitive operations +(PrimOps) on these types are what you might expect; e.g., +(+#) is addition on +Int#s, and is the machine-addition that we all +know and love—usually one instruction. -This monad underlies our implementation of arrays, mutable and -immutable, and our implementation of I/O, including “C calls”. +Primitive (unboxed) types cannot be defined in Haskell, and are +therefore built into the language and compiler. Primitive types are +always unlifted; that is, a value of a primitive type cannot be +bottom. We use the convention that primitive types, values, and +operations have a # suffix. -The ST library, which provides access to the -ST monad, is described in . +Primitive values are often represented by a simple bit-pattern, such +as Int#, Float#, +Double#. But this is not necessarily the case: +a primitive value might be represented by a pointer to a +heap-allocated object. Examples include +Array#, the type of primitive arrays. A +primitive array is heap-allocated because it is too big a value to fit +in a register, and would be too expensive to copy around; in a sense, +it is accidental that it is represented by a pointer. If a pointer +represents a primitive value, then it really does point to that value: +no unevaluated thunks, no indirections…nothing can be at the +other end of the pointer than the primitive value. - + +There are some restrictions on the use of primitive types, the main +one being that you can't pass a primitive value to a polymorphic +function or store one in a polymorphic data type. This rules out +things like [Int#] (i.e. lists of primitive +integers). The reason for this restriction is that polymorphic +arguments and constructor fields are assumed to be pointers: if an +unboxed integer is stored in one of these, the garbage collector would +attempt to follow it, leading to unpredictable space leaks. Or a +seq operation on the polymorphic component may +attempt to dereference the pointer, with disastrous results. Even +worse, the unboxed value might be larger than a pointer +(Double# for instance). + + + +Nevertheless, A numerically-intensive program using unboxed types can +go a lot faster than its “standard” +counterpart—we saw a threefold speedup on one example. + + + - -Primitive arrays, mutable and otherwise +<sect2 id="unboxed-tuples"> +<title>Unboxed Tuples -primitive arrays (Glasgow extension) -arrays, primitive (Glasgow extension) +Unboxed tuples aren't really exported by GHC.Exts, +they're available by default with . An +unboxed tuple looks like this: -GHC knows about quite a few flavours of Large Swathes of Bytes. + + +(# e_1, ..., e_n #) + + -First, GHC distinguishes between primitive arrays of (boxed) Haskell -objects (type Array# obj) and primitive arrays of bytes (type -ByteArray#). +where e_1..e_n are expressions of any +type (primitive or non-primitive). The type of an unboxed tuple looks +the same. -Second, it distinguishes between… - +Unboxed tuples are used for functions that need to return multiple +values, but they avoid the heap allocation normally associated with +using fully-fledged tuples. When an unboxed tuple is returned, the +components are put directly into registers or on the stack; the +unboxed tuple itself does not have a composite representation. Many +of the primitive operations listed in this section return unboxed +tuples. + - -Immutable: - -Arrays that do not change (as with “standard” Haskell arrays); you -can only read from them. Obviously, they do not need the care and -attention of the state-transformer monad. +There are some pretty stringent restrictions on the use of unboxed tuples: - - - -Mutable: + + + + + -Arrays that may be changed or “mutated.” All the operations on them -live within the state-transformer monad and the updates happen -in-place. + Unboxed tuple types are subject to the same restrictions as +other unboxed types; i.e. they may not be stored in polymorphic data +structures or passed to polymorphic functions. + - - -“Static” (in C land): + -A C routine may pass an Addr# pointer back into Haskell land. There -are then primitive operations with which you may merrily grab values -over in C land, by indexing off the “static” pointer. + Unboxed tuples may only be constructed as the direct result of +a function, and may only be deconstructed with a case expression. +eg. the following are valid: + + + +f x y = (# x+1, y-1 #) +g x = case f x x of { (# a, b #) -> a + b } + + + +but the following are invalid: + + + +f x y = g (# x, y #) +g (# x, y #) = x + y + + + - - -“Stable” pointers: - -If, for some reason, you wish to hand a Haskell pointer (i.e., -not an unboxed value) to a C routine, you first make the -pointer “stable,” so that the garbage collector won't forget that it -exists. That is, GHC provides a safe way to pass Haskell pointers to -C. - -Please see for more details. + No variable can have an unboxed tuple type. This is illegal: + + + +f :: (# Int, Int #) -> (# Int, Int #) +f x = x + + + +because x has an unboxed tuple type. + - - -“Foreign objects”: - - -A “foreign object” is a safe way to pass an external object (a -C-allocated pointer, say) to Haskell and have Haskell do the Right -Thing when it no longer references the object. So, for example, C -could pass a large bitmap over to Haskell and say “please free this -memory when you're done with it.” + + + -Please see for more details. - - - - +Note: we may relax some of these restrictions in the future. -The libraries documentatation gives more details on all these -“primitive array” types and the operations on them. +The IO and ST monads use unboxed +tuples to avoid unnecessary allocation during sequences of operations. + - -Data types with no constructors + + + +Syntactic extensions + + + + + Hierarchical Modules + + GHC supports a small extension to the syntax of module + names: a module name is allowed to contain a dot + ‘.’. This is also known as the + “hierarchical module namespace” extension, because + it extends the normally flat Haskell module namespace into a + more flexible hierarchy of modules. + + This extension has very little impact on the language + itself; modules names are always fully + qualified, so you can just think of the fully qualified module + name as the module name. In particular, this + means that the full module name must be given after the + module keyword at the beginning of the + module; for example, the module A.B.C must + begin + +module A.B.C + + + It is a common strategy to use the as + keyword to save some typing when using qualified names with + hierarchical modules. For example: -With the flag, GHC lets you declare -a data type with no constructors. For example: - data S -- S :: * - data T a -- T :: * -> * +import qualified Control.Monad.ST.Strict as ST -Syntactically, the declaration lacks the "= constrs" part. The -type can be parameterised, but only over ordinary types, of kind *; since -Haskell does not have kind signatures, you cannot parameterise over higher-kinded -types. -Such data types have only one value, namely bottom. -Nevertheless, they can be useful when defining "phantom types". - + Hierarchical modules have an impact on the way that GHC + searches for files. For a description, see . + + GHC comes with a large collection of libraries arranged + hierarchically; see the accompanying library documentation. + There is an ongoing project to create and maintain a stable set + of core libraries used by several Haskell + compilers, and the libraries that GHC comes with represent the + current status of that project. For more details, see Haskell + Libraries. + + + + - + Pattern guards @@ -568,9 +557,96 @@ f x | [y] <- x Haskell's current guards therefore emerge as a special case, in which the qualifier list has just one element, a boolean expression. - + + + + + +The recursive do-notation + + + The recursive do-notation (also known as mdo-notation) is implemented as described in +"A recursive do for Haskell", +Levent Erkok, John Launchbury", +Haskell Workshop 2002, pages: 29-37. Pittsburgh, Pennsylvania. + + +The do-notation of Haskell does not allow recursive bindings, +that is, the variables bound in a do-expression are visible only in the textually following +code block. Compare this to a let-expression, where bound variables are visible in the entire binding +group. It turns out that several applications can benefit from recursive bindings in +the do-notation, and this extension provides the necessary syntactic support. + + +Here is a simple (yet contrived) example: + + +import Control.Monad.Fix + +justOnes = mdo xs <- Just (1:xs) + return xs + + +As you can guess justOnes will evaluate to Just [1,1,1,.... + + + +The Control.Monad.Fix library introduces the MonadFix class. It's definition is: + + +class Monad m => MonadFix m where + mfix :: (a -> m a) -> m a + + +The function mfix +dictates how the required recursion operation should be performed. If recursive bindings are required for a monad, +then that monad must be declared an instance of the MonadFix class. +For details, see the above mentioned reference. + + +The following instances of MonadFix are automatically provided: List, Maybe, IO. +Furthermore, the Control.Monad.ST and Control.Monad.ST.Lazy modules provide the instances of the MonadFix class +for Haskell's internal state monad (strict and lazy, respectively). + + +There are three important points in using the recursive-do notation: + + +The recursive version of the do-notation uses the keyword mdo (rather +than do). + + + +You should import Control.Monad.Fix. +(Note: Strictly speaking, this import is required only when you need to refer to the name +MonadFix in your program, but the import is always safe, and the programmers +are encouraged to always import this module when using the mdo-notation.) + + + +As with other extensions, ghc should be given the flag -fglasgow-exts + + + + + +The web page: http://www.cse.ogi.edu/PacSoft/projects/rmb +contains up to date information on recursive monadic bindings. + + + +Historical note: The old implementation of the mdo-notation (and most +of the existing documents) used the name +MonadRec for the class and the corresponding library. +This name is not supported by GHC. + - + + + + + + Parallel List Comprehensions list comprehensionsparallel @@ -618,95 +694,314 @@ qualifier list has just one element, a boolean expression. where `zipN' is the appropriate zip for the given number of branches. - - - -Multi-parameter type classes - + - -This section documents GHC's implementation of multi-parameter type -classes. There's lots of background in the paper Type -classes: exploring the design space (Simon Peyton Jones, Mark -Jones, Erik Meijer). - + +Rebindable syntax - -I'd like to thank people who reported shorcomings in the GHC 3.02 -implementation. Our default decisions were all conservative ones, and -the experience of these heroic pioneers has given useful concrete -examples to support several generalisations. (These appear below as -design choices not implemented in 3.02.) - - -I've discussed these notes with Mark Jones, and I believe that Hugs -will migrate towards the same design choices as I outline here. -Thanks to him, and to many others who have offered very useful -feedback. - + GHC allows most kinds of built-in syntax to be rebound by + the user, to facilitate replacing the Prelude + with a home-grown version, for example. - -Types + You may want to define your own numeric class + hierarchy. It completely defeats that purpose if the + literal "1" means "Prelude.fromInteger + 1", which is what the Haskell Report specifies. + So the flag causes + the following pieces of built-in syntax to refer to + whatever is in scope, not the Prelude + versions: - -There are the following restrictions on the form of a qualified -type: - + + + Integer and fractional literals mean + "fromInteger 1" and + "fromRational 3.2", not the + Prelude-qualified versions; both in expressions and in + patterns. + However, the standard Prelude Eq class + is still used for the equality test necessary for literal patterns. + - + + Negation (e.g. "- (f x)") + means "negate (f x)" (not + Prelude.negate). + - - forall tv1..tvn (c1, ...,cn) => type - + + In an n+k pattern, the standard Prelude + Ord class is still used for comparison, + but the necessary subtraction uses whatever + "(-)" is in scope (not + "Prelude.(-)"). + - + + "Do" notation is translated using whatever + functions (>>=), + (>>), fail, and + return, are in scope (not the Prelude + versions). List comprehensions, and parallel array + comprehensions, are unaffected. + - -(Here, I write the "foralls" explicitly, although the Haskell source -language omits them; in Haskell 1.4, all the free type variables of an -explicit source-language type signature are universally quantified, -except for the class type variables in a class declaration. However, -in GHC, you can give the foralls if you want. See ). - + Be warned: this is an experimental facility, with fewer checks than + usual. In particular, it is essential that the functions GHC finds in scope + must have the appropriate types, namely: + + fromInteger :: forall a. (...) => Integer -> a + fromRational :: forall a. (...) => Rational -> a + negate :: forall a. (...) => a -> a + (-) :: forall a. (...) => a -> a -> a + (>>=) :: forall m a. (...) => m a -> (a -> m b) -> m b + (>>) :: forall m a. (...) => m a -> m b -> m b + return :: forall m a. (...) => a -> m a + fail :: forall m a. (...) => String -> m a + + (The (...) part can be any context including the empty context; that part + is up to you.) + If the functions don't have the right type, very peculiar things may + happen. Use -dcore-lint to + typecheck the desugared program. If Core Lint is happy you should be all right. - + + - - - - Each universally quantified type variable -tvi must be mentioned (i.e. appear free) in type. + + +Type system extensions -The reason for this is that a value with a type that does not obey -this restriction could not be used without introducing -ambiguity. Here, for example, is an illegal type: + +Data types with no constructors +With the flag, GHC lets you declare +a data type with no constructors. For example: - forall a. Eq a => Int + data S -- S :: * + data T a -- T :: * -> * +Syntactically, the declaration lacks the "= constrs" part. The +type can be parameterised over types of any kind, but if the kind is +not * then an explicit kind annotation must be used +(see ). -When a value with this type was used, the constraint Eq tv -would be introduced where tv is a fresh type variable, and -(in the dictionary-translation implementation) the value would be -applied to a dictionary for Eq tv. The difficulty is that we -can never know which instance of Eq to use because we never -get any more information about tv. +Such data types have only one value, namely bottom. +Nevertheless, they can be useful when defining "phantom types". + - - - + +Infix type constructors - Every constraint ci must mention at least one of the -universally quantified type variables tvi. +GHC allows type constructors to be operators, and to be written infix, very much +like expressions. More specifically: + + + A type constructor can be an operator, beginning with a colon; e.g. :*:. + The lexical syntax is the same as that for data constructors. + + + Types can be written infix. For example Int :*: Bool. + + + Back-quotes work + as for expressions, both for type constructors and type variables; e.g. Int `Either` Bool, or + Int `a` Bool. Similarly, parentheses work the same; e.g. (:*:) Int Bool. + + + Fixities may be declared for type constructors just as for data constructors. However, + one cannot distinguish between the two in a fixity declaration; a fixity declaration + sets the fixity for a data constructor and the corresponding type constructor. For example: + + infixl 7 T, :*: + + sets the fixity for both type constructor T and data constructor T, + and similarly for :*:. + Int `a` Bool. + + + Function arrow is infixr with fixity 0. (This might change; I'm not sure what it should be.) + + + Data type and type-synonym declarations can be written infix. E.g. + + data a :*: b = Foo a b + type a :+: b = Either a b + + + + The only thing that differs between operators in types and operators in expressions is that + ordinary non-constructor operators, such as + and * + are not allowed in types. Reason: the uniform thing to do would be to make them type + variables, but that's not very useful. A less uniform but more useful thing would be to + allow them to be type constructors. But that gives trouble in export + lists. So for now we just exclude them. + -For example, this type is OK because C a b mentions the -universally quantified type variable b: + + + + + +Explicitly-kinded quantification + + +Haskell infers the kind of each type variable. Sometimes it is nice to be able +to give the kind explicitly as (machine-checked) documentation, +just as it is nice to give a type signature for a function. On some occasions, +it is essential to do so. For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999) +John Hughes had to define the data type: + + data Set cxt a = Set [a] + | Unused (cxt a -> ()) + +The only use for the Unused constructor was to force the correct +kind for the type variable cxt. + + +GHC now instead allows you to specify the kind of a type variable directly, wherever +a type variable is explicitly bound. Namely: + +data declarations: + + data Set (cxt :: * -> *) a = Set [a] + +type declarations: + + type T (f :: * -> *) = f Int + +class declarations: + + class (Eq a) => C (f :: * -> *) a where ... + +forall's in type signatures: + + f :: forall (cxt :: * -> *). Set cxt Int + + + + + +The parentheses are required. Some of the spaces are required too, to +separate the lexemes. If you write (f::*->*) you +will get a parse error, because "::*->*" is a +single lexeme in Haskell. + + + +As part of the same extension, you can put kind annotations in types +as well. Thus: + + f :: (Int :: *) -> Int + g :: forall a. a -> (a :: *) + +The syntax is + + atype ::= '(' ctype '::' kind ') + +The parentheses are required. + + + + + +Class method types + + +Haskell 98 prohibits class method types to mention constraints on the +class type variable, thus: + + class Seq s a where + fromList :: [a] -> s a + elem :: Eq a => a -> s a -> Bool + +The type of elem is illegal in Haskell 98, because it +contains the constraint Eq a, constrains only the +class type variable (in this case a). + + +With the GHC lifts this restriction. + + + + + +Multi-parameter type classes + + + +This section documents GHC's implementation of multi-parameter type +classes. There's lots of background in the paper Type +classes: exploring the design space (Simon Peyton Jones, Mark +Jones, Erik Meijer). + + + + +Types + + +GHC imposes the following restrictions on the form of a qualified +type, whether declared in a type signature +or inferred. Consider the type: + + + forall tv1..tvn (c1, ...,cn) => type + + +(Here, I write the "foralls" explicitly, although the Haskell source +language omits them; in Haskell 1.4, all the free type variables of an +explicit source-language type signature are universally quantified, +except for the class type variables in a class declaration. However, +in GHC, you can give the foralls if you want. See ). + + + + + + + + + Each universally quantified type variable +tvi must be reachable from type. + +A type variable is "reachable" if it it is functionally dependent +(see ) +on the type variables free in type. +The reason for this is that a value with a type that does not obey +this restriction could not be used without introducing +ambiguity. +Here, for example, is an illegal type: + + + + forall a. Eq a => Int + + + +When a value with this type was used, the constraint Eq tv +would be introduced where tv is a fresh type variable, and +(in the dictionary-translation implementation) the value would be +applied to a dictionary for Eq tv. The difficulty is that we +can never know which instance of Eq to use because we never +get any more information about tv. + + + + + + + Every constraint ci must mention at least one of the +universally quantified type variables tvi. + +For example, this type is OK because C a b mentions the +universally quantified type variable b: @@ -737,10 +1032,6 @@ territory free in case we need it later. - -These restrictions apply to all types, whether declared in a type signature -or inferred. - Unlike Haskell 1.4, constraints in types do not have to be of @@ -761,9 +1052,9 @@ are perfectly OK This choice recovers principal types, a property that Haskell 1.4 does not have. - + - + Class declarations @@ -827,56 +1118,14 @@ be acyclic. So these class declarations are OK: - - - In the signature of a class operation, every constraint -must mention at least one type variable that is not a class type -variable. - -Thus: - - - - class Collection c a where - mapC :: Collection c b => (a->b) -> c a -> c b - - - -is OK because the constraint (Collection a b) mentions -b, even though it also mentions the class variable -a. On the other hand: - - - - class C a where - op :: Eq a => (a,b) -> (a,b) - - - -is not OK because the constraint (Eq a) mentions on the class -type variable a, but not b. However, any such -example is easily fixed by moving the offending context up to the -superclass context: - - - - class Eq a => C a where - op ::(a,b) -> (a,b) - - - -A yet more relaxed rule would allow the context of a class-op signature -to mention only class type variables. However, that conflicts with -Rule 1(b) for types above. - - - - The type of each class operation must mention all of -the class type variables. For example: + All of the class type variables must be reachable (in the sense +mentioned in ) +from the free varibles of each method type +. For example: @@ -923,9 +1172,9 @@ class like this: - + - + Instance declarations @@ -948,9 +1197,9 @@ declarations However, if you give the command line option -fallow-overlapping-instances -option then two overlapping instance declarations are permitted -iff - +option then overlapping instance declarations are permitted. +However, GHC arranges never to commit to using an instance declaration +if another instance declaration also applies, either now or later. @@ -963,22 +1212,11 @@ iff OR type2 is a substitution instance of type1 -(but not identical to type1) - - - - - - OR vice versa +(but not identical to type1), or vice versa. - - - Notice that these rules - - @@ -998,8 +1236,34 @@ Reason: you can pick which instance decl - - +However the rules are over-conservative. Two instance declarations can overlap, +but it can still be clear in particular situations which to use. For example: + + instance C (Int,a) where ... + instance C (a,Bool) where ... + +These are rejected by GHC's rules, but it is clear what to do when trying +to solve the constraint C (Int,Int) because the second instance +cannot apply. Yell if this restriction bites you. + + +GHC is also conservative about committing to an overlapping instance. For example: + + class C a where { op :: a -> a } + instance C [Int] where ... + instance C a => C [a] where ... + + f :: C b => [b] -> [b] + f x = op x + +From the RHS of f we get the constraint C [b]. But +GHC does not commit to the second instance declaration, because in a paricular +call of f, b might be instantiate to Int, so the first instance declaration +would be appropriate. So GHC rejects the program. If you add +GHC will instead silently pick the second instance, without complaining about +the problem of subsequent instantiations. + + Regrettably, GHC doesn't guarantee to detect overlapping instance declarations if they appear in different modules. GHC can "see" the instance declarations in the transitive closure of all the modules @@ -1037,63 +1301,8 @@ For example, this is OK: instance Stateful (ST s) (MutVar s) where ... - -The "at least one not a type variable" restriction is to ensure that -context reduction terminates: each reduction step removes one type -constructor. For example, the following would make the type checker -loop if it wasn't excluded: - - - - instance C a => C a where ... - - - -There are two situations in which the rule is a bit of a pain. First, -if one allows overlapping instance declarations then it's quite -convenient to have a "default instance" declaration that applies if -something more specific does not: - - - - instance C a where - op = ... -- Default - - - -Second, sometimes you might want to use the following to get the -effect of a "class synonym": - - - - class (C1 a, C2 a, C3 a) => C a where { } - - instance (C1 a, C2 a, C3 a) => C a where { } - - - -This allows you to write shorter signatures: - - - - f :: C a => ... - - - -instead of - - - - f :: (C1 a, C2 a, C3 a) => ... - - - -I'm on the lookout for a simple rule that preserves decidability while -allowing these idioms. The experimental flag --fallow-undecidable-instances -option lifts this restriction, allowing all the types in an -instance head to be type variables. - +See for an experimental +extension to lift this restriction. @@ -1155,16 +1364,10 @@ instance C Int b => Foo b where ... -is not OK. Again, the intent here is to make sure that context -reduction terminates. +is not OK. See for an experimental +extension to lift this restriction. + -Voluminous correspondence on the Haskell mailing list has convinced me -that it's worth experimenting with a more liberal rule. If you use -the flag can use arbitrary -types in an instance context. Termination is ensured by having a -fixed-depth recursion stack. If you exceed the stack depth you get a -sort of backtrace, and the opportunity to increase the stack depth -with N. @@ -1173,114 +1376,515 @@ with N. - - - + - -Implicit parameters - + - Implicit paramters are implemented as described in -"Implicit parameters: dynamic scoping with static types", -J Lewis, MB Shields, E Meijer, J Launchbury, -27th ACM Symposium on Principles of Programming Languages (POPL'00), -Boston, Jan 2000. - + +Undecidable instances - -There should be more documentation, but there isn't (yet). Yell if you need it. - +The rules for instance declarations state that: - - You can't have an implicit parameter in the context of a class or instance -declaration. For example, both these declarations are illegal: +At least one of the types in the head of +an instance declaration must not be a type variable. + +All of the types in the context of +an instance declaration must be type variables. + + +These restrictions ensure that +context reduction terminates: each reduction step removes one type +constructor. For example, the following would make the type checker +loop if it wasn't excluded: - class (?x::Int) => C a where ... - instance (?x::a) => Foo [a] where ... + instance C a => C a where ... -Reason: exactly which implicit parameter you pick up depends on exactly where -you invoke a function. But the ``invocation'' of instance declarations is done -behind the scenes by the compiler, so it's hard to figure out exactly where it is done. -Easiest thing is to outlaw the offending types. - +There are two situations in which the rule is a bit of a pain. First, +if one allows overlapping instance declarations then it's quite +convenient to have a "default instance" declaration that applies if +something more specific does not: - - + + instance C a where + op = ... -- Default + - -Functional dependencies - +Second, sometimes you might want to use the following to get the +effect of a "class synonym": - Functional dependencies are implemented as described by Mark Jones -in "Type Classes with Functional Dependencies", Mark P. Jones, -In Proceedings of the 9th European Symposium on Programming, -ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782. - - -There should be more documentation, but there isn't (yet). Yell if you need it. - - + + class (C1 a, C2 a, C3 a) => C a where { } + instance (C1 a, C2 a, C3 a) => C a where { } + - -Explicit universal quantification - +This allows you to write shorter signatures: + + + + f :: C a => ... + + + +instead of + + + + f :: (C1 a, C2 a, C3 a) => ... + + + +Voluminous correspondence on the Haskell mailing list has convinced me +that it's worth experimenting with more liberal rules. If you use +the experimental flag +-fallow-undecidable-instances +option, you can use arbitrary +types in both an instance context and instance head. Termination is ensured by having a +fixed-depth recursion stack. If you exceed the stack depth you get a +sort of backtrace, and the opportunity to increase the stack depth +with N. + -GHC's type system supports explicit universal quantification in -constructor fields and function arguments. This is useful for things -like defining runST from the state-thread world. -GHC's syntax for this now agrees with Hugs's, namely: +I'm on the lookout for a less brutal solution: a simple rule that preserves decidability while +allowing these idioms interesting idioms. + - + +Implicit parameters + + Implicit paramters are implemented as described in +"Implicit parameters: dynamic scoping with static types", +J Lewis, MB Shields, E Meijer, J Launchbury, +27th ACM Symposium on Principles of Programming Languages (POPL'00), +Boston, Jan 2000. + +(Most of the following, stil rather incomplete, documentation is due to Jeff Lewis.) + +A variable is called dynamically bound when it is bound by the calling +context of a function and statically bound when bound by the callee's +context. In Haskell, all variables are statically bound. Dynamic +binding of variables is a notion that goes back to Lisp, but was later +discarded in more modern incarnations, such as Scheme. Dynamic binding +can be very confusing in an untyped language, and unfortunately, typed +languages, in particular Hindley-Milner typed languages like Haskell, +only support static scoping of variables. + + +However, by a simple extension to the type class system of Haskell, we +can support dynamic binding. Basically, we express the use of a +dynamically bound variable as a constraint on the type. These +constraints lead to types of the form (?x::t') => t, which says "this +function uses a dynamically-bound variable ?x +of type t'". For +example, the following expresses the type of a sort function, +implicitly parameterized by a comparison function named cmp. - forall a b. (Ord a, Eq b) => a -> b -> a + sort :: (?cmp :: a -> a -> Bool) => [a] -> [a] +The dynamic binding constraints are just a new form of predicate in the type class system. + + +An implicit parameter occurs in an expression using the special form ?x, +where x is +any valid identifier (e.g. ord ?x is a valid expression). +Use of this construct also introduces a new +dynamic-binding constraint in the type of the expression. +For example, the following definition +shows how we can define an implicitly parameterized sort function in +terms of an explicitly parameterized sortBy function: + + sortBy :: (a -> a -> Bool) -> [a] -> [a] + sort :: (?cmp :: a -> a -> Bool) => [a] -> [a] + sort = sortBy ?cmp + + +Implicit-parameter type constraints -The context is, of course, optional. You can't use forall as -a type variable any more! +Dynamic binding constraints behave just like other type class +constraints in that they are automatically propagated. Thus, when a +function is used, its implicit parameters are inherited by the +function that called it. For example, our sort function might be used +to pick out the least value in a list: + + least :: (?cmp :: a -> a -> Bool) => [a] -> a + least xs = fst (sort xs) + +Without lifting a finger, the ?cmp parameter is +propagated to become a parameter of least as well. With explicit +parameters, the default is that parameters must always be explicit +propagated. With implicit parameters, the default is to always +propagate them. + + +An implicit-parameter type constraint differs from other type class constraints in the +following way: All uses of a particular implicit parameter must have +the same type. This means that the type of (?x, ?x) +is (?x::a) => (a,a), and not +(?x::a, ?x::b) => (a, b), as would be the case for type +class constraints. + You can't have an implicit parameter in the context of a class or instance +declaration. For example, both these declarations are illegal: + + class (?x::Int) => C a where ... + instance (?x::a) => Foo [a] where ... + +Reason: exactly which implicit parameter you pick up depends on exactly where +you invoke a function. But the ``invocation'' of instance declarations is done +behind the scenes by the compiler, so it's hard to figure out exactly where it is done. +Easiest thing is to outlaw the offending types. -Haskell type signatures are implicitly quantified. The forall -allows us to say exactly what this means. For example: +Implicit-parameter constraints do not cause ambiguity. For example, consider: + + f :: (?x :: [a]) => Int -> Int + f n = n + length ?x + + g :: (Read a, Show a) => String -> String + g s = show (read s) + +Here, g has an ambiguous type, and is rejected, but f +is fine. The binding for ?x at f's call site is +quite unambiguous, and fixes the type a. + + + +Implicit-parameter bindings +An implicit parameter is bound using the standard +let or where binding forms. +For example, we define the min function by binding +cmp. + + min :: [a] -> a + min = let ?cmp = (<=) in least + + + +A group of implicit-parameter bindings may occur anywhere a normal group of Haskell +bindings can occur, except at top level. That is, they can occur in a let +(including in a list comprehension, or do-notation, or pattern guards), +or a where clause. +Note the following points: + + +An implicit-parameter binding group must be a +collection of simple bindings to implicit-style variables (no +function-style bindings, and no type signatures); these bindings are +neither polymorphic or recursive. + + +You may not mix implicit-parameter bindings with ordinary bindings in a +single let +expression; use two nested lets instead. +(In the case of where you are stuck, since you can't nest where clauses.) + + +You may put multiple implicit-parameter bindings in a +single binding group; but they are not treated +as a mutually recursive group (as ordinary let bindings are). +Instead they are treated as a non-recursive group, simultaneously binding all the implicit +parameter. The bindings are not nested, and may be re-ordered without changing +the meaning of the program. +For example, consider: - g :: b -> b + f t = let { ?x = t; ?y = ?x+(1::Int) } in ?x + ?y +The use of ?x in the binding for ?y does not "see" +the binding for ?x, so the type of f is + + f :: (?x::Int) => Int -> Int + + + + + + + + +Linear implicit parameters + + +Linear implicit parameters are an idea developed by Koen Claessen, +Mark Shields, and Simon PJ. They address the long-standing +problem that monads seem over-kill for certain sorts of problem, notably: + + distributing a supply of unique names + distributing a suppply of random numbers + distributing an oracle (as in QuickCheck) + -means this: +Linear implicit parameters are just like ordinary implicit parameters, +except that they are "linear" -- that is, they cannot be copied, and +must be explicitly "split" instead. Linear implicit parameters are +written '%x' instead of '?x'. +(The '/' in the '%' suggests the split!) + + +For example: + + import GHC.Exts( Splittable ) + + data NameSupply = ... + + splitNS :: NameSupply -> (NameSupply, NameSupply) + newName :: NameSupply -> Name + + instance Splittable NameSupply where + split = splitNS + + + f :: (%ns :: NameSupply) => Env -> Expr -> Expr + f env (Lam x e) = Lam x' (f env e) + where + x' = newName %ns + env' = extend env x x' + ...more equations for f... + +Notice that the implicit parameter %ns is consumed + + once by the call to newName + once by the recursive call to f + + + +So the translation done by the type checker makes +the parameter explicit: + + f :: NameSupply -> Env -> Expr -> Expr + f ns env (Lam x e) = Lam x' (f ns1 env e) + where + (ns1,ns2) = splitNS ns + x' = newName ns2 + env = extend env x x' + +Notice the call to 'split' introduced by the type checker. +How did it know to use 'splitNS'? Because what it really did +was to introduce a call to the overloaded function 'split', +defined by the class Splittable: + + class Splittable a where + split :: a -> (a,a) + +The instance for Splittable NameSupply tells GHC how to implement +split for name supplies. But we can simply write + + g x = (x, %ns, %ns) + +and GHC will infer + + g :: (Splittable a, %ns :: a) => b -> (b,a,a) + +The Splittable class is built into GHC. It's exported by module +GHC.Exts. + + +Other points: + + '?x' and '%x' +are entirely distinct implicit parameters: you + can use them together and they won't intefere with each other. + + + You can bind linear implicit parameters in 'with' clauses. + + You cannot have implicit parameters (whether linear or not) + in the context of a class or instance declaration. + +Warnings + + +The monomorphism restriction is even more important than usual. +Consider the example above: + + f :: (%ns :: NameSupply) => Env -> Expr -> Expr + f env (Lam x e) = Lam x' (f env e) + where + x' = newName %ns + env' = extend env x x' + +If we replaced the two occurrences of x' by (newName %ns), which is +usually a harmless thing to do, we get: + + f :: (%ns :: NameSupply) => Env -> Expr -> Expr + f env (Lam x e) = Lam (newName %ns) (f env e) + where + env' = extend env x (newName %ns) + +But now the name supply is consumed in three places +(the two calls to newName,and the recursive call to f), so +the result is utterly different. Urk! We don't even have +the beta rule. + +Well, this is an experimental change. With implicit +parameters we have already lost beta reduction anyway, and +(as John Launchbury puts it) we can't sensibly reason about +Haskell programs without knowing their typing. + + + +Recursive functions +Linear implicit parameters can be particularly tricky when you have a recursive function +Consider - g :: forall b. (b -> b) + foo :: %x::T => Int -> [Int] + foo 0 = [] + foo n = %x : foo (n-1) + +where T is some type in class Splittable. + +Do you get a list of all the same T's or all different T's +(assuming that split gives two distinct T's back)? + +If you supply the type signature, taking advantage of polymorphic +recursion, you get what you'd probably expect. Here's the +translated term, where the implicit param is made explicit: + + foo x 0 = [] + foo x n = let (x1,x2) = split x + in x1 : foo x2 (n-1) +But if you don't supply a type signature, GHC uses the Hindley +Milner trick of using a single monomorphic instance of the function +for the recursive calls. That is what makes Hindley Milner type inference +work. So the translation becomes + + foo x = let + foom 0 = [] + foom n = x : foom (n-1) + in + foom + +Result: 'x' is not split, and you get a list of identical T's. So the +semantics of the program depends on whether or not foo has a type signature. +Yikes! + +You may say that this is a good reason to dislike linear implicit parameters +and you'd be right. That is why they are an experimental feature. + + + + + + +Functional dependencies + + Functional dependencies are implemented as described by Mark Jones +in “Type Classes with Functional Dependencies”, Mark P. Jones, +In Proceedings of the 9th European Symposium on Programming, +ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782, +. + + +Functional dependencies are introduced by a vertical bar in the syntax of a +class declaration; e.g. + + class (Monad m) => MonadState s m | m -> s where ... + + class Foo a b c | a b -> c where ... + +There should be more documentation, but there isn't (yet). Yell if you need it. + + + + +Arbitrary-rank polymorphism + +Haskell type signatures are implicitly quantified. The new keyword forall +allows us to say exactly what this means. For example: + + + + g :: b -> b + +means this: + + g :: forall b. (b -> b) + The two are treated identically. - -Universally-quantified data type fields +<para> +However, GHC's type system supports <emphasis>arbitrary-rank</emphasis> +explicit universal quantification in +types. +For example, all the following types are legal: +<programlisting> + f1 :: forall a b. a -> b -> a + g1 :: forall a b. (Ord a, Eq b) => a -> b -> a + + f2 :: (forall a. a->a) -> Int -> Int + g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int + + f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool +</programlisting> +Here, <literal>f1</literal> and <literal>g1</literal> are rank-1 types, and +can be written in standard Haskell (e.g. <literal>f1 :: a->b->a</literal>). +The <literal>forall</literal> makes explicit the universal quantification that +is implicitly added by Haskell. +</para> +<para> +The functions <literal>f2</literal> and <literal>g2</literal> have rank-2 types; +the <literal>forall</literal> is on the left of a function arrrow. As <literal>g2</literal> +shows, the polymorphic type on the left of the function arrow can be overloaded. +</para> +<para> +The functions <literal>f3</literal> and <literal>g3</literal> have rank-3 types; +they have rank-2 types on the left of a function arrow. +</para> +<para> +GHC allows types of arbitrary rank; you can nest <literal>forall</literal>s +arbitrarily deep in function arrows. (GHC used to be restricted to rank 2, but +that restriction has now been lifted.) +In particular, a forall-type (also called a "type scheme"), +including an operational type class context, is legal: +<itemizedlist> +<listitem> <para> On the left of a function arrow </para> </listitem> +<listitem> <para> On the right of a function arrow (see <xref linkend="hoist">) </para> </listitem> +<listitem> <para> As the argument of a constructor, or type of a field, in a data type declaration. For +example, any of the <literal>f1,f2,f3,g1,g2,g3</literal> above would be valid +field type signatures.</para> </listitem> +<listitem> <para> As the type of an implicit parameter </para> </listitem> +<listitem> <para> In a pattern type signature (see <xref linkend="scoped-type-variables">) </para> </listitem> +</itemizedlist> +There is one place you cannot put a <literal>forall</literal>: +you cannot instantiate a type variable with a forall-type. So you cannot +make a forall-type the argument of a type constructor. So these types are illegal: +<programlisting> + x1 :: [forall a. a->a] + x2 :: (forall a. a->a, Int) + x3 :: Maybe (forall a. a->a) +</programlisting> +Of course <literal>forall</literal> becomes a keyword; you can't use <literal>forall</literal> as +a type variable any more! +</para> + + +<sect3 id="univ"> +<title>Examples @@ -1303,8 +1907,7 @@ newtype Swizzle = MkSwizzle (Ord a => [a] -> [a]) -The constructors now have so-called rank 2 polymorphic -types, in which there is a for-all in the argument types.: +The constructors have rank-2 types: @@ -1348,11 +1951,6 @@ to require explicit quantification on constructor arguments where that is what is wanted. Feedback welcomed.) - - - -Construction - You construct values of types T1, MonadT, Swizzle by applying the constructor to suitable values, just as usual. For example, @@ -1361,17 +1959,23 @@ the constructor to suitable values, just as usual. For example, -(T1 (\xy->x) 3) :: T Int - -(MkSwizzle sort) :: Swizzle -(MkSwizzle reverse) :: Swizzle + a1 :: T Int + a1 = T1 (\xy->x) 3 + + a2, a3 :: Swizzle + a2 = MkSwizzle sort + a3 = MkSwizzle reverse + + a4 :: MonadT Maybe + a4 = let r x = Just x + b m k = case m of + Just y -> k y + Nothing -> Nothing + in + MkMonad r b -(let r x = Just x - b m k = case m of - Just y -> k y - Nothing -> Nothing - in - MkMonad r b) :: MonadT Maybe + mkTs :: (forall b. b -> b -> b) -> a -> [T a] + mkTs f x y = [T1 f x, T1 f y] @@ -1382,11 +1986,6 @@ required, as (MkSwizzle reverse) shows. (reverseOrd constraint.) - - - -Pattern matching - When you use pattern matching, the bound variables may now have polymorphic types. For example: @@ -1395,17 +1994,17 @@ polymorphic types. For example: - f :: T a -> a -> (a, Char) - f (T1 f k) x = (f k x, f 'c' 'd') + f :: T a -> a -> (a, Char) + f (T1 w k) x = (w k x, w 'c' 'd') - g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b] - g (MkSwizzle s) xs f = s (map f (s xs)) + g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b] + g (MkSwizzle s) xs f = s (map f (s xs)) - h :: MonadT m -> [m a] -> m [a] - h m [] = return m [] - h m (x:xs) = bind m x $ \y -> - bind m (h m xs) $ \ys -> - return m (y:ys) + h :: MonadT m -> [m a] -> m [a] + h m [] = return m [] + h m (x:xs) = bind m x $ \y -> + bind m (h m xs) $ \ys -> + return m (y:ys) @@ -1416,205 +2015,200 @@ and bind to extract the polymorphic bind and return functions from the MonadT data structure, rather than using pattern matching. + - -You cannot pattern-match against an argument that is polymorphic. -For example: - - - newtype TIM s a = TIM (ST s (Maybe a)) - - runTIM :: (forall s. TIM s a) -> Maybe a - runTIM (TIM m) = runST m - + +Type inference + +In general, type inference for arbitrary-rank types is undecideable. +GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96) +to get a decidable algorithm by requiring some help from the programmer. +We do not yet have a formal specification of "some help" but the rule is this: - -Here the pattern-match fails, because you can't pattern-match against -an argument of type (forall s. TIM s a). Instead you -must bind the variable and pattern match in the right hand side: - +For a lambda-bound or case-bound variable, x, either the programmer +provides an explicit polymorphic type for x, or GHC's type inference will assume +that x's type has no foralls in it. + + +What does it mean to "provide" an explicit type for x? You can do that by +giving a type signature for x directly, using a pattern type signature +(), thus: - runTIM :: (forall s. TIM s a) -> Maybe a - runTIM tm = case tm of { TIM m -> runST m } + \ f :: (forall a. a->a) -> (f True, f 'c') - -The tm on the right hand side is (invisibly) instantiated, like -any polymorphic value at its occurrence site, and now you can pattern-match -against it. +Alternatively, you can give a type signature to the enclosing +context, which GHC can "push down" to find the type for the variable: + + (\ f -> (f True, f 'c')) :: (forall a. a->a) -> (Bool,Char) + +Here the type signature on the expression can be pushed inwards +to give a type signature for f. Similarly, and more commonly, +one can give a type signature for the function itself: + + h :: (forall a. a->a) -> (Bool,Char) + h f = (f True, f 'c') + +You don't need to give a type signature if the lambda bound variable +is a constructor argument. Here is an example we saw earlier: + + f :: T a -> a -> (a, Char) + f (T1 w k) x = (w k x, w 'c' 'd') + +Here we do not need to give a type signature to w, because +it is an argument of constructor T1 and that tells GHC all +it needs to know. - + - -The partial-application restriction - -There is really only one way in which data structures with polymorphic -components might surprise you: you must not partially apply them. -For example, this is illegal: - + +Implicit quantification - +GHC performs implicit quantification as follows. At the top level (only) of +user-written types, if and only if there is no explicit forall, +GHC finds all the type variables mentioned in the type that are not already +in scope, and universally quantifies them. For example, the following pairs are +equivalent: - map MkSwizzle [sort, reverse] - - - - - -The restriction is this: every subexpression of the program must -have a type that has no for-alls, except that in a function -application (f e1…en) the partial applications are not subject to -this rule. The restriction makes type inference feasible. - + f :: a -> a + f :: forall a. a -> a - -In the illegal example, the sub-expression MkSwizzle has the -polymorphic type (Ord b => [b] -> [b]) -> Swizzle and is not -a sub-expression of an enclosing application. On the other hand, this -expression is OK: + g (x::a) = let + h :: a -> b -> b + h x y = y + in ... + g (x::a) = let + h :: forall b. a -> b -> b + h x y = y + in ... + - - +Notice that GHC does not find the innermost possible quantification +point. For example: - map (T1 (\a b -> a)) [1,2,3] - + f :: (a -> a) -> Int + -- MEANS + f :: forall a. (a -> a) -> Int + -- NOT + f :: (forall a. a -> a) -> Int - - -even though it involves a partial application of T1, because -the sub-expression T1 (\a b -> a) has type Int -> T -Int. + g :: (Ord a => a -> a) -> Int + -- MEANS the illegal type + g :: forall a. (Ord a => a -> a) -> Int + -- NOT + g :: (forall a. Ord a => a -> a) -> Int + +The latter produces an illegal type, which you might think is silly, +but at least the rule is simple. If you want the latter type, you +can write your for-alls explicitly. Indeed, doing so is strongly advised +for rank-2 types. - + - -Type signatures +<sect2 id="type-synonyms"> +<title>Liberalised type synonyms -Once you have data constructors with universally-quantified fields, or -constants such as runST that have rank-2 types, it isn't long -before you discover that you need more! Consider: - +Type synonmys are like macros at the type level, and +GHC does validity checking on types only after expanding type synonyms. +That means that GHC can be very much more liberal about type synonyms than Haskell 98: + + You can write a forall (including overloading) +in a type synonym, thus: + + type Discard a = forall b. Show b => a -> b -> (a, String) - + f :: Discard a + f x y = (x, show y) - - mkTs f x y = [T1 f x, T1 f y] + g :: Discard Int -> (Int,Bool) -- A rank-2 type + g f = f Int True - + - -mkTs is a fuction that constructs some values of type -T, using some pieces passed to it. The trouble is that since -f is a function argument, Haskell assumes that it is -monomorphic, so we'll get a type error when applying T1 to -it. This is a rather silly example, but the problem really bites in -practice. Lots of people trip over the fact that you can't make -"wrappers functions" for runST for exactly the same reason. -In short, it is impossible to build abstractions around functions with -rank-2 types. - + +You can write an unboxed tuple in a type synonym: + + type Pr = (# Int, Int #) - -The solution is fairly clear. We provide the ability to give a rank-2 -type signature for ordinary functions (not only data -constructors), thus: - + h :: Int -> Pr + h x = (# x, x #) + + - + +You can apply a type synonym to a forall type: + + type Foo a = a -> a -> Bool + + f :: Foo (forall b. b->b) + +After expanding the synonym, f has the legal (in GHC) type: + + f :: (forall b. b->b) -> (forall b. b->b) -> Bool + + + +You can apply a type synonym to a partially applied type synonym: - mkTs :: (forall b. b -> b -> b) -> a -> [T a] - mkTs f x y = [T1 f x, T1 f y] + type Generic i o = forall x. i x -> o x + type Id x = x + + foo :: Generic Id [] + +After epxanding the synonym, foo has the legal (in GHC) type: + + foo :: forall x. x -> [x] + + -This type signature tells the compiler to attribute f with -the polymorphic type (forall b. b -> b -> b) when type -checking the body of mkTs, so now the application of -T1 is fine. +GHC currently does kind checking before expanding synonyms (though even that +could be changed.) - -There are two restrictions: +After expanding type synonyms, GHC does validity checking on types, looking for +the following mal-formedness which isn't detected simply by kind checking: + + +Type constructor applied to a type involving for-alls. + + +Unboxed tuple on left of an arrow. + + +Partially-applied type synonym. + + +So, for example, +this will be rejected: + + type Pr = (# Int, Int #) + + h :: Pr -> Int + h x = ... + +because GHC does not allow unboxed tuples on the left of a function arrow. + + +For-all hoisting - - - - - - You can only define a rank 2 type, specified by the following -grammar: - - - -rank2type ::= [forall tyvars .] [context =>] funty -funty ::= ([forall tyvars .] [context =>] ty) -> funty - | ty -ty ::= ...current Haskell monotype syntax... - - - -Informally, the universal quantification must all be right at the beginning, -or at the top level of a function argument. - - - - - - - There is a restriction on the definition of a function whose -type signature is a rank-2 type: the polymorphic arguments must be -matched on the left hand side of the "=" sign. You can't -define mkTs like this: - - - -mkTs :: (forall b. b -> b -> b) -> a -> [T a] -mkTs = \ f x y -> [T1 f x, T1 f y] - - - - -The same partial-application rule applies to ordinary functions with -rank-2 types as applied to data constructors. - - - - - - - - - - - - -Type synonyms and hoisting - - - -GHC also allows you to write a forall in a type synonym, thus: - - type Discard a = forall b. a -> b -> a - - f :: Discard a - f x y = x - -However, it is often convenient to use these sort of synonyms at the right hand +It is often convenient to use generalised type synonyms at the right hand end of an arrow, thus: type Discard a = forall b. a -> b -> a @@ -1634,9 +2228,9 @@ In general, the rule is this: to determine the type specified by any e user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly performs the transformation: - type1 -> forall a. type2 + type1 -> forall a1..an. context2 => type2 ==> - forall a. type1 -> type2 + forall a1..an. context2 => type1 -> type2 (In fact, GHC tries to retain as much synonym information as possible for use in error messages, but that is a usability issue.) This rule applies, of course, whether @@ -1646,11 +2240,22 @@ valid way to write g's type signature: g :: Int -> Int -> forall b. b -> Int + +When doing this hoisting operation, GHC eliminates duplicate constraints. For +example: + + type Foo a = (?x::Int) => Bool -> a + g :: Foo (Foo Int) + +means + + g :: (?x::Int) => Bool -> Bool -> Int + + - - + Existentially quantified data constructors @@ -1740,7 +2345,7 @@ that collection of packages in a uniform manner. You can express quite a bit of object-oriented-like programming this way. - + Why existential? @@ -1763,9 +2368,9 @@ But Haskell programmers can safely think of the ordinary adding a new existential quantification construct. - + - + Type classes @@ -1808,7 +2413,7 @@ So this program is legal: f :: Baz -> String f (Baz1 p q) | p == q = "Yes" | otherwise = "No" - f (Baz1 v fn) = show (fn v) + f (Baz2 v fn) = show (fn v) @@ -1825,9 +2430,9 @@ Notice the way that the syntax fits smoothly with that used for universal quantification earlier. - + - + Restrictions @@ -1898,8 +2503,13 @@ bindings. So this is illegal: f3 x = a==b where { Baz1 a b = x } +Instead, use a case expression: + + + f3 x = case x of Baz1 a b -> a==b + -You can only pattern-match +In general, you can only pattern-match on an existentially-quantified constructor in a case expression or in the patterns of a function definition. @@ -1971,92 +2581,12 @@ declarations. Define your own instances! - - - - - -Assertions -<indexterm><primary>Assertions</primary></indexterm> - - - -If you want to make use of assertions in your standard Haskell code, you -could define a function like the following: - - - - - -assert :: Bool -> a -> a -assert False x = error "assertion failed!" -assert _ x = x - - - - - -which works, but gives you back a less than useful error message -- -an assertion failed, but which and where? - - - -One way out is to define an extended assert function which also -takes a descriptive string to include in the error message and -perhaps combine this with the use of a pre-processor which inserts -the source location where assert was used. - - - -Ghc offers a helping hand here, doing all of this for you. For every -use of assert in the user's source: - - - - - -kelvinToC :: Double -> Double -kelvinToC k = assert (k >= 0.0) (k+273.15) - - - - - -Ghc will rewrite this to also include the source location where the -assertion was made, - - - - - -assert pred val ==> assertError "Main.hs|15" pred val - - - - - -The rewrite is only performed by the compiler when it spots -applications of Exception.assert, so you can still define and -use your own versions of assert, should you so wish. If not, -import Exception to make use assert in your code. - - - -To have the compiler ignore uses of assert, use the compiler option -. -fignore-asserts option That is, -expressions of the form assert pred e will be rewritten to e. - - - -Assertion failures can be caught, see the documentation for the -Exception library () -for the details. - + - + - -Scoped Type Variables +<sect2 id="scoped-type-variables"> +<title>Scoped type variables @@ -2106,7 +2636,7 @@ are noted. So much for the basic idea. Here are the details. - + What a pattern type signature means A type variable brought into scope by a pattern type signature is simply @@ -2144,9 +2674,9 @@ For example, all of these are legal: w (x::a) = x -- a unifies with [b] - + - + Scope and implicit quantification @@ -2278,19 +2808,127 @@ scope over the methods defined in the where part. For exampl - + - -Result type signatures + +Where a pattern type signature can occur - +A pattern type signature can occur in any pattern. For example: + + + +A pattern type signature can be on an arbitrary sub-pattern, not +ust on a variable: + + + + f ((x,y)::(a,b)) = (y,x) :: (b,a) + + + + + + + + + Pattern type signatures, including the result part, can be used +in lambda abstractions: + + + (\ (x::a, y) :: a -> x) + + + - The result type of a function can be given a signature, -thus: + Pattern type signatures, including the result part, can be used +in case expressions: + + + + case e of { (x::a, y) :: a -> x } + + + + + + + +To avoid ambiguity, the type after the “::” in a result +pattern signature on a lambda or case must be atomic (i.e. a single +token or a parenthesised type of some sort). To see why, +consider how one would parse this: + + + + \ x :: a -> b -> x + + + + + + + + + + Pattern type signatures can bind existential type variables. +For example: + + + + data T = forall a. MkT [a] + + f :: T -> T + f (MkT [t::a]) = MkT t3 + where + t3::[a] = [t,t,t] + + + + + + + + + + +Pattern type signatures +can be used in pattern bindings: + + + f x = let (y, z::a) = x in ... + f1 x = let (y, z::Int) = x in ... + f2 (x::(Int,a)) = let (y, z::a) = x in ... + f3 :: (b->b) = \x -> x + + +In all such cases, the binding is not generalised over the pattern-bound +type variables. Thus f3 is monomorphic; f3 +has type b -> b for some type b, +and not forall b. b -> b. +In contrast, the binding + + f4 :: b->b + f4 = \x -> x + +makes a polymorphic function, but b is not in scope anywhere +in f4's scope. + + + + + + + + + +Result type signatures + + +The result type of a function can be given a signature, thus: @@ -2309,137 +2947,968 @@ you want: in \xs -> map g (reverse xs `zip` xs) + + +The type variables bound in a result type signature scope over the right hand side +of the definition. However, consider this corner-case: + + rev1 :: [a] -> [a] = \xs -> reverse xs + foo ys = rev (ys::[a]) + +The signature on rev1 is considered a pattern type signature, not a result +type signature, and the type variables it binds have the same scope as rev1 +itself (i.e. the right-hand side of rev1 and the rest of the module too). +In particular, the expression (ys::[a]) is OK, because the type variable a +is in scope (otherwise it would mean (ys::forall a.[a]), which would be rejected). + + +As mentioned above, rev1 is made monomorphic by this scoping rule. +For example, the following program would be rejected, because it claims that rev1 +is polymorphic: + + rev1 :: [b] -> [b] + rev1 :: [a] -> [a] = \xs -> reverse xs + - - + +Result type signatures are not yet implemented in Hugs. + + + + + + +Generalised derived instances for newtypes + + +When you define an abstract type using newtype, you may want +the new type to inherit some instances from its representation. In +Haskell 98, you can inherit instances of Eq, Ord, +Enum and Bounded by deriving them, but for any +other classes you have to write an explicit instance declaration. For +example, if you define + + + newtype Dollars = Dollars Int + + +and you want to use arithmetic on Dollars, you have to +explicitly define an instance of Num: + + + instance Num Dollars where + Dollars a + Dollars b = Dollars (a+b) + ... + +All the instance does is apply and remove the newtype +constructor. It is particularly galling that, since the constructor +doesn't appear at run-time, this instance declaration defines a +dictionary which is wholly equivalent to the Int +dictionary, only slower! + + Generalising the deriving clause -Result type signatures are not yet implemented in Hugs. +GHC now permits such instances to be derived instead, so one can write + + newtype Dollars = Dollars Int deriving (Eq,Show,Num) + + +and the implementation uses the same Num dictionary +for Dollars as for Int. Notionally, the compiler +derives an instance declaration of the form + + + instance Num Int => Num Dollars + + +which just adds or removes the newtype constructor according to the type. + + + +We can also derive instances of constructor classes in a similar +way. For example, suppose we have implemented state and failure monad +transformers, such that + + + instance Monad m => Monad (State s m) + instance Monad m => Monad (Failure m) + +In Haskell 98, we can define a parsing monad by + + type Parser tok m a = State [tok] (Failure m) a + + +which is automatically a monad thanks to the instance declarations +above. With the extension, we can make the parser type abstract, +without needing to write an instance of class Monad, via + + + newtype Parser tok m a = Parser (State [tok] (Failure m) a) + deriving Monad + +In this case the derived instance declaration is of the form + + instance Monad (State [tok] (Failure m)) => Monad (Parser tok m) + + +Notice that, since Monad is a constructor class, the +instance is a partial application of the new type, not the +entire left hand side. We can imagine that the type declaration is +``eta-converted'' to generate the context of the instance +declaration. + + + +We can even derive instances of multi-parameter classes, provided the +newtype is the last class parameter. In this case, a ``partial +application'' of the class appears in the deriving +clause. For example, given the class + + + class StateMonad s m | m -> s where ... + instance Monad m => StateMonad s (State s m) where ... + +then we can derive an instance of StateMonad for Parsers by + + newtype Parser tok m a = Parser (State [tok] (Failure m) a) + deriving (Monad, StateMonad [tok]) + + +The derived instance is obtained by completing the application of the +class to the new type: + + + instance StateMonad [tok] (State [tok] (Failure m)) => + StateMonad [tok] (Parser tok m) + + + + +As a result of this extension, all derived instances in newtype +declarations are treated uniformly (and implemented just by reusing +the dictionary for the representation type), except +Show and Read, which really behave differently for +the newtype and its representation. + + + A more precise specification + +Derived instance declarations are constructed as follows. Consider the +declaration (after expansion of any type synonyms) + + + newtype T v1...vn = T' (S t1...tk vk+1...vn) deriving (c1...cm) + + +where + + + S is a type constructor, + + + t1...tk are types, + + + vk+1...vn are type variables which do not occur in any of + the ti, and + + + the ci are partial applications of + classes of the form C t1'...tj', where the arity of C + is exactly j+1. That is, C lacks exactly one type argument. + + +Then, for each ci, the derived instance +declaration is: + + instance ci (S t1...tk vk+1...v) => ci (T v1...vp) + +where p is chosen so that T v1...vp is of the +right kind for the last parameter of class Ci. + + + +As an example which does not work, consider + + newtype NonMonad m s = NonMonad (State s m s) deriving Monad + +Here we cannot derive the instance + + instance Monad (State s m) => Monad (NonMonad m) + + +because the type variable s occurs in State s m, +and so cannot be "eta-converted" away. It is a good thing that this +deriving clause is rejected, because NonMonad m is +not, in fact, a monad --- for the same reason. Try defining +>>= with the correct type: you won't be able to. + + + +Notice also that the order of class parameters becomes +important, since we can only derive instances for the last one. If the +StateMonad class above were instead defined as + + + class StateMonad m s | m -> s where ... + + +then we would not have been able to derive an instance for the +Parser type above. We hypothesise that multi-parameter +classes usually have one "main" parameter for which deriving new +instances is most interesting. + + - -Where a pattern type signature can occur + + + + + + +Template Haskell + +Template Haskell allows you to do compile-time meta-programming in Haskell. There is a "home page" for +Template Haskell at +http://www.haskell.org/th/, while +the background to +the main technical innovations is discussed in " +Template Meta-programming for Haskell" (Proc Haskell Workshop 2002). + + + The first example from that paper is set out below as a worked example to help get you started. + + + +The documentation here describes the realisation in GHC. (It's rather sketchy just now; +Tim Sheard is going to expand it.) + + + Syntax + + Template Haskell has the following new syntactic constructions. You need to use the flag + -fglasgow-exts to switch these syntactic extensions on. + + + + A splice is written $x, where x is an + identifier, or $(...), where the "..." is an arbitrary expression. + There must be no space between the "$" and the identifier or parenthesis. This use + of "$" overrides its meaning as an infix operator, just as "M.x" overrides the meaning + of "." as an infix operator. If you want the infix operator, put spaces around it. + + A splice can occur in place of + + an expression; the spliced expression must have type Expr + a list of top-level declarations; ; the spliced expression must have type Q [Dec] + a type; the spliced expression must have type Type. + + (Note that the syntax for a declaration splice uses "$" not "splice" as in + the paper. Also the type of the enclosed expression must be Q [Dec], not [Q Dec] + as in the paper.) + + + + + A expression quotation is written in Oxford brackets, thus: + + [| ... |], where the "..." is an expression; + the quotation has type Expr. + [d| ... |], where the "..." is a list of top-level declarations; + the quotation has type Q [Dec]. + [t| ... |], where the "..." is a type; + the quotation has type Type. + + + + Reification is written thus: + + reifyDecl T, where T is a type constructor; this expression + has type Dec. + reifyDecl C, where C is a class; has type Dec. + reifyType f, where f is an identifier; has type Typ. + Still to come: fixities + + + + + + + + + + Using Template Haskell -A pattern type signature can occur in any pattern. For example: + + The data types and monadic constructor functions for Template Haskell are in the library + Language.Haskell.THSyntax. + + + + You can only run a function at compile time if it is imported from another module. That is, + you can't define a function in a module, and call it from within a splice in the same module. + (It would make sense to do so, but it's hard to implement.) + + + + The flag -ddump-splices shows the expansion of all top-level splices as they happen. + + + If you are building GHC from source, you need at least a stage-2 bootstrap compiler to + run Template Haskell. A stage-1 compiler will reject the TH constructs. Reason: TH + compiles and runs a program, and then looks at the result. So it's important that + the program it compiles produces results whose representations are identical to + those of the compiler itself. + + + + Template Haskell works in any mode (--make, --interactive, + or file-at-a-time). There used to be a restriction to the former two, but that restriction + has been lifted. + + + + A Template Haskell Worked Example +To help you get over the confidence barrier, try out this skeletal worked example. + First cut and paste the two modules below into "Main.hs" and "Printf.hs": - - -A pattern type signature can be on an arbitrary sub-pattern, not -ust on a variable: + +{- Main.hs -} +module Main where +-- Import our template "pr" +import Printf ( pr ) + +-- The splice operator $ takes the Haskell source code +-- generated at compile time by "pr" and splices it into +-- the argument of "putStrLn". +main = putStrLn ( $(pr "Hello") ) + - f ((x,y)::(a,b)) = (y,x) :: (b,a) +{- Printf.hs -} +module Printf where + +-- Skeletal printf from the paper. +-- It needs to be in a separate module to the one where +-- you intend to use it. + +-- Import some Template Haskell syntax +import Language.Haskell.THSyntax + +-- Describe a format string +data Format = D | S | L String + +-- Parse a format string. This is left largely to you +-- as we are here interested in building our first ever +-- Template Haskell program and not in building printf. +parse :: String -> [Format] +parse s = [ L s ] + +-- Generate Haskell source code from a parsed representation +-- of the format string. This code will be spliced into +-- the module which calls "pr", at compile time. +gen :: [Format] -> Expr +gen [D] = [| \n -> show n |] +gen [S] = [| \s -> s |] +gen [L s] = string s + +-- Here we generate the Haskell code for the splice +-- from an input format string. +pr :: String -> Expr +pr s = gen (parse s) +Now run the compiler (here we are using a "stage three" build of GHC, at a Cygwin prompt on Windows): + + +ghc/compiler/stage3/ghc-inplace --make -fglasgow-exts -package haskell-src main.hs -o main.exe + +Run "main.exe" and here is your output: + + + +$ ./main +Hello + + + + + + + + + +Arrow notation + + +Arrows are a generalization of monads introduced by John Hughes. +For more details, see + + + + +“Generalising Monads to Arrows”, +John Hughes, in Science of Computer Programming 37, +pp67–111, May 2000. + + +“A New Notation for Arrows”, +Ross Paterson, in ICFP, Sep 2001. + + + - Pattern type signatures, including the result part, can be used -in lambda abstractions: +“Arrows and Computation”, +Ross Paterson, in The Fun of Programming, +Palgrave, 2003. + + - - (\ (x::a, y) :: a -> x) + +and the arrows web page at +http://www.haskell.org/arrows/. +With the flag, GHC supports the arrow +notation described in the second of these papers. +What follows is a brief introduction to the notation; +it won't make much sense unless you've read Hughes's paper. +This notation is translated to ordinary Haskell, +using combinators from the +Control.Arrow +module. + + +The extension adds a new kind of expression for defining arrows, +of the form proc pat -> cmd, +where proc is a new keyword. +The variables of the pattern are bound in the body of the +proc-expression, +which is a new sort of thing called a command. +The syntax of commands is as follows: + +cmd ::= exp1 -< exp2 + | exp1 -<< exp2 + | do { cstmt1 .. cstmtn ; cmd } + | let decls in cmd + | if exp then cmd1 else cmd2 + | case exp of { calts } + | cmd1 qop cmd2 + | (| aexp cmd1 .. cmdn |) + | \ pat1 .. patn -> cmd + | cmd aexp + | ( cmd ) + +cstmt ::= let decls + | pat <- cmd + | rec { cstmt1 .. cstmtn } + | cmd + +Commands produce values, but (like monadic computations) +may yield more than one value, +or none, and may do other things as well. +For the most part, familiarity with monadic notation is a good guide to +using commands. +However the values of expressions, even monadic ones, +are determined by the values of the variables they contain; +this is not necessarily the case for commands. + + + +A simple example of the new notation is the expression + +proc x -> f -< x+1 + +We call this a procedure or +arrow abstraction. +As with a lambda expression, the variable x +is a new variable bound within the proc-expression. +It refers to the input to the arrow. +In the above example, -< is not an identifier but an +new reserved symbol used for building commands from an expression of arrow +type and an expression to be fed as input to that arrow. +(The weird look will make more sense later.) +It may be read as analogue of application for arrows. +The above example is equivalent to the Haskell expression + +arr (\ x -> x+1) >>> f + +That would make no sense if the expression to the left of +-< involves the bound variable x. +More generally, the expression to the left of -< +may not involve any local variable, +i.e. a variable bound in the current arrow abstraction. +For such a situation there is a variant -<<, as in + +proc x -> f x -<< x+1 + +which is equivalent to + +arr (\ x -> (f, x+1)) >>> app + +so in this case the arrow must belong to the ArrowApply +class. +Such an arrow is equivalent to a monad, so if you're using this form +you may find a monadic formulation more convenient. + + + +do-notation for commands + + +Another form of command is a form of do-notation. +For example, you can write + +proc x -> do + y <- f -< x+1 + g -< 2*y + let z = x+y + t <- h -< x*z + returnA -< t+z + +You can read this much like ordinary do-notation, +but with commands in place of monadic expressions. +The first line sends the value of x+1 as an input to +the arrow f, and matches its output against +y. +In the next line, the output is discarded. +The arrow returnA is defined in the +Control.Arrow +module as arr id. +The above example is treated as an abbreviation for + +arr (\ x -> (x, x)) >>> + first (arr (\ x -> x+1) >>> f) >>> + arr (\ (y, x) -> (y, (x, y))) >>> + first (arr (\ y -> 2*y) >>> g) >>> + arr snd >>> + arr (\ (x, y) -> let z = x+y in ((x, z), z)) >>> + first (arr (\ (x, z) -> x*z) >>> h) >>> + arr (\ (t, z) -> t+z) >>> + returnA + +Note that variables not used later in the composition are projected out. +After simplification using rewrite rules (see ) +defined in the +Control.Arrow +module, this reduces to + +arr (\ x -> (x+1, x)) >>> + first f >>> + arr (\ (y, x) -> (2*y, (x, y))) >>> + first g >>> + arr (\ (_, (x, y)) -> let z = x+y in (x*z, z)) >>> + first h >>> + arr (\ (t, z) -> t+z) + +which is what you might have written by hand. +With arrow notation, GHC keeps track of all those tuples of variables for you. + + + +Note that although the above translation suggests that +let-bound variables like z must be +monomorphic, the actual translation produces Core, +so polymorphic variables are allowed. + + + +It's also possible to have mutually recursive bindings, +using the new rec keyword, as in the following example: + +counter :: ArrowCircuit a => a Bool Int +counter = proc reset -> do + rec output <- returnA -< if reset then 0 else next + next <- delay 0 -< output+1 + returnA -< output + +The translation of such forms uses the loop combinator, +so the arrow concerned must belong to the ArrowLoop class. + + + + + +Conditional commands + + +In the previous example, we used a conditional expression to construct the +input for an arrow. +Sometimes we want to conditionally execute different commands, as in + +proc (x,y) -> + if f x y + then g -< x+1 + else h -< y+2 + +which is translated to + +arr (\ (x,y) -> if f x y then Left x else Right y) >>> + (arr (\x -> x+1) >>> f) ||| (arr (\y -> y+2) >>> g) + +Since the translation uses |||, +the arrow concerned must belong to the ArrowChoice class. + + + +There are also case commands, like + +case input of + [] -> f -< () + [x] -> g -< x+1 + x1:x2:xs -> do + y <- h -< (x1, x2) + ys <- k -< xs + returnA -< y:ys + +The syntax is the same as for case expressions, +except that the bodies of the alternatives are commands rather than expressions. +The translation is similar to that of if commands. + + + + + +Defining your own control structures + + +As we're seen, arrow notation provides constructs, +modelled on those for expressions, +for sequencing, value recursion and conditionals. +But suitable combinators, +which you can define in ordinary Haskell, +may also be used to build new commands out of existing ones. +The basic idea is that a command defines an arrow from environments to values. +These environments assign values to the free local variables of the command. +Thus combinators that produce arrows from arrows +may also be used to build commands from commands. +For example, the ArrowChoice class includes a combinator + +ArrowChoice a => (<+>) :: a e c -> a e c -> a e c + +so we can use it to build commands: + +expr' = proc x -> + returnA -< x + <+> do + symbol Plus -< () + y <- term -< () + expr' -< x + y + <+> do + symbol Minus -< () + y <- term -< () + expr' -< x - y + +This is equivalent to + +expr' = (proc x -> returnA -< x) + <+> (proc x -> do + symbol Plus -< () + y <- term -< () + expr' -< x + y) + <+> (proc x -> do + symbol Minus -< () + y <- term -< () + expr' -< x - y) + +It is essential that this operator be polymorphic in e +(representing the environment input to the command +and thence to its subcommands) +and satisfy the corresponding naturality property + +arr k >>> (f <+> g) = (arr k >>> f) <+> (arr k >>> g) + +at least for strict k. +(This should be automatic if you're not using seq.) +This ensures that environments seen by the subcommands are environments +of the whole command, +and also allows the translation to safely trim these environments. +The operator must also not use any variable defined within the current +arrow abstraction. + + + +We could define our own operator + +untilA :: ArrowChoice a => a e () -> a e Bool -> a e () +untilA body cond = proc x -> + if cond x then returnA -< () + else do + body -< x + untilA body cond -< x + +and use it in the same way. +Of course this infix syntax only makes sense for binary operators; +there is also a more general syntax involving special brackets: + +proc x -> do + y <- f -< x+1 + (|untilA (increment -< x+y) (within 0.5 -< x)|) + + + + + + +Primitive constructs + + +Some operators will need to pass additional inputs to their subcommands. +For example, in an arrow type supporting exceptions, +the operator that attaches an exception handler will wish to pass the +exception that occurred to the handler. +Such an operator might have a type + +handleA :: ... => a e c -> a (e,Ex) c -> a e c + +where Ex is the type of exceptions handled. +You could then use this with arrow notation by writing a command + +body `handleA` \ ex -> handler + +so that if an exception is raised in the command body, +the variable ex is bound to the value of the exception +and the command handler, +which typically refers to ex, is entered. +Though the syntax here looks like a functional lambda, +we are talking about commands, and something different is going on. +The input to the arrow represented by a command consists of values for +the free local variables in the command, plus a stack of anonymous values. +In all the prior examples, this stack was empty. +In the second argument to handleA, +this stack consists of one value, the value of the exception. +The command form of lambda merely gives this value a name. + + + +More concretely, +the values on the stack are paired to the right of the environment. +So when designing operators like handleA that pass +extra inputs to their subcommands, +More precisely, the type of each argument of the operator (and its result) +should have the form + +a (...(e,t1), ... tn) t + +where e is a polymorphic variable +(representing the environment) +and ti are the types of the values on the stack, +with t1 being the top. +The polymorphic variable e must not occur in +a, ti or +t. +However the arrows involved need not be the same. +Here are some more examples of suitable operators: + +bracketA :: ... => a e b -> a (e,b) c -> a (e,c) d -> a e d +runReader :: ... => a e c -> a' (e,State) c +runState :: ... => a e c -> a' (e,State) (c,State) + +We can supply the extra input required by commands built with the last two +by applying them to ordinary expressions, as in + +proc x -> do + s <- ... + (|runReader (do { ... })|) s + +which adds s to the stack of inputs to the command +built using runReader. + + + +The command versions of lambda abstraction and application are analogous to +the expression versions. +In particular, the beta and eta rules describe equivalences of commands. +These three features (operators, lambda abstraction and application) +are the core of the notation; everything else can be built using them, +though the results would be somewhat clumsy. +For example, we could simulate do-notation by defining + +bind :: Arrow a => a e b -> a (e,b) c -> a e c +u `bind` f = returnA &&& u >>> f + +bind_ :: Arrow a => a e b -> a e c -> a e c +u `bind_` f = u `bind` (arr fst >>> f) + +We could simulate do by defining + +cond :: ArrowChoice a => a e b -> a e b -> a (e,Bool) b +cond f g = arr (\ (e,b) -> if b then Left e else Right e) >>> f ||| g + + + + +Differences with the paper + + + + +Instead of a single form of arrow application (arrow tail) with two +translations, the implementation provides two forms +-< (first-order) +and -<< (higher-order). + + +User-defined operators are flagged with banana brackets instead of +a new form keyword. + + + + + + + +Portability + + +Although only GHC implements arrow notation directly, +there is also a preprocessor +(available from the +arrows web page>) +that translates arrow notation into Haskell 98 +for use with other Haskell systems. +You would still want to check arrow programs with GHC; +tracing type errors in the preprocessor output is not easy. +Modules intended for both GHC and the preprocessor must observe some +additional restrictions: + + + - Pattern type signatures, including the result part, can be used -in case expressions: +The module must import +Control.Arrow. + + + + +The preprocessor cannot cope with other Haskell extensions. +These would have to go in separate modules. + + - - case e of { (x::a, y) :: a -> x } - + + +Because the preprocessor targets Haskell (rather than Core), +let-bound variables are monomorphic. + + + + + + + + + + + + + +Assertions +<indexterm><primary>Assertions</primary></indexterm> + + +If you want to make use of assertions in your standard Haskell code, you +could define a function like the following: - - -To avoid ambiguity, the type after the “::” in a result -pattern signature on a lambda or case must be atomic (i.e. a single -token or a parenthesised type of some sort). To see why, -consider how one would parse this: - - \ x :: a -> b -> x +assert :: Bool -> a -> a +assert False x = error "assertion failed!" +assert _ x = x + + +which works, but gives you back a less than useful error message -- +an assertion failed, but which and where? - - + +One way out is to define an extended assert function which also +takes a descriptive string to include in the error message and +perhaps combine this with the use of a pre-processor which inserts +the source location where assert was used. + - Pattern type signatures can bind existential type variables. -For example: +Ghc offers a helping hand here, doing all of this for you. For every +use of assert in the user's source: + + - data T = forall a. MkT [a] - - f :: T -> T - f (MkT [t::a]) = MkT t3 - where - t3::[a] = [t,t,t] +kelvinToC :: Double -> Double +kelvinToC k = assert (k >= 0.0) (k+273.15) - - - - - -Pattern type signatures -can be used in pattern bindings: +Ghc will rewrite this to also include the source location where the +assertion was made, + - - f x = let (y, z::a) = x in ... - f1 x = let (y, z::Int) = x in ... - f2 (x::(Int,a)) = let (y, z::a) = x in ... - f3 :: (b->b) = \x -> x - + -In all such cases, the binding is not generalised over the pattern-bound -type variables. Thus f3 is monomorphic; f3 -has type b -> b for some type b, -and not forall b. b -> b. -In contrast, the binding - f4 :: b->b - f4 = \x -> x +assert pred val ==> assertError "Main.hs|15" pred val -makes a polymorphic function, but b is not in scope anywhere -in f4's scope. - - + + +The rewrite is only performed by the compiler when it spots +applications of Control.Exception.assert, so you +can still define and use your own versions of +assert, should you so wish. If not, import +Control.Exception to make use +assert in your code. - + +To have the compiler ignore uses of assert, use the compiler option +. -fignore-asserts +option That is, expressions of the form +assert pred e will be rewritten to +e. + + +Assertion failures can be caught, see the documentation for the +Control.Exception library for the details. + + + + Pragmas @@ -2463,32 +3932,64 @@ in f4's scope. unrecognised word is (silently) ignored. - -INLINE pragma + <sect2 id="deprecated-pragma"> + <title>DEPRECATED pragma + DEPRECATED + -INLINE pragma -pragma, INLINE + The DEPRECATED pragma lets you specify that a particular + function, class, or type, is deprecated. There are two + forms. - -GHC (with , as always) tries to inline (or “unfold”) -functions/values that are “small enough,” thus avoiding the call -overhead and possibly exposing other more-wonderful optimisations. - + + + You can deprecate an entire module thus: + + module Wibble {-# DEPRECATED "Use Wobble instead" #-} where + ... + + When you compile any module that import + Wibble, GHC will print the specified + message. + - -You will probably see these unfoldings (in Core syntax) in your -interface files. - + + You can deprecate a function, class, or type, with the + following top-level declaration: + + {-# DEPRECATED f, C, T "Don't use these" #-} + + When you compile any module that imports and uses any + of the specifed entities, GHC will print the specified + message. + + - -Normally, if GHC decides a function is “too expensive” to inline, it -will not do so, nor will it export that unfolding for other modules to -use. - + You can suppress the warnings with the flag + . + - -The sledgehammer you can bring to bear is the -INLINEINLINE pragma pragma, used thusly: + + INLINE and NOINLINE pragmas + + These pragmas control the inlining of function + definitions. + + + INLINE pragma + INLINE + + GHC (with , as always) tries to + inline (or “unfold”) functions/values that are + “small enough,” thus avoiding the call overhead + and possibly exposing other more-wonderful optimisations. + Normally, if GHC decides a function is “too + expensive” to inline, it will not do so, nor will it + export that unfolding for other modules to use. + + The sledgehammer you can bring to bear is the + INLINEINLINE + pragma pragma, used thusly: key_function :: Int -> String -> (Bool, Double) @@ -2498,25 +3999,25 @@ key_function :: Int -> String -> (Bool, Double) #endif -(You don't need to do the C pre-processor carry-on unless you're going -to stick the code through HBC—it doesn't like INLINE pragmas.) - + (You don't need to do the C pre-processor carry-on + unless you're going to stick the code through HBC—it + doesn't like INLINE pragmas.) - -The major effect of an INLINE pragma is to declare a function's -“cost” to be very low. The normal unfolding machinery will then be -very keen to inline it. - + The major effect of an INLINE pragma + is to declare a function's “cost” to be very low. + The normal unfolding machinery will then be very keen to + inline it. - -An INLINE pragma for a function can be put anywhere its type -signature could be put. - + Syntactially, an INLINE pragma for a + function can be put anywhere its type signature could be + put. - -INLINE pragmas are a particularly good idea for the -then/return (or bind/unit) functions in a monad. -For example, in GHC's own UniqueSupply monad code, we have: + INLINE pragmas are a particularly + good idea for the + then/return (or + bind/unit) functions in + a monad. For example, in GHC's own + UniqueSupply monad code, we have: #ifdef __GLASGOW_HASKELL__ @@ -2525,32 +4026,140 @@ For example, in GHC's own UniqueSupply monad code, we have: #endif - + See also the NOINLINE pragma (). + + + + NOINLINE pragma + + NOINLINE + NOTINLINE + + The NOINLINE pragma does exactly what + you'd expect: it stops the named function from being inlined + by the compiler. You shouldn't ever need to do this, unless + you're very cautious about code size. + + NOTINLINE is a synonym for + NOINLINE (NOTINLINE is + specified by Haskell 98 as the standard way to disable + inlining, so it should be used if you want your code to be + portable). + + + + Phase control + + Sometimes you want to control exactly when in GHC's + pipeline the INLINE pragma is switched on. Inlining happens + only during runs of the simplifier. Each + run of the simplifier has a different phase + number; the phase number decreases towards zero. + If you use you'll see the + sequence of phase numbers for successive runs of the + simpifier. In an INLINE pragma you can optionally specify a + phase number, thus: + + + + You can say "inline f in Phase 2 + and all subsequent phases": + + {-# INLINE [2] f #-} + + + - + + You can say "inline g in all + phases up to, but not including, Phase 3": + + {-# INLINE [~3] g #-} + + + - -NOINLINE pragma - + + If you omit the phase indicator, you mean "inline in + all phases". + + -NOINLINE pragma -pragmaNOINLINE -NOTINLINE pragma -pragmaNOTINLINE + You can use a phase number on a NOINLINE pragma too: - -The NOINLINE pragma does exactly what you'd expect: -it stops the named function from being inlined by the compiler. You -shouldn't ever need to do this, unless you're very cautious about code -size. - + + + You can say "do not inline f + until Phase 2; in Phase 2 and subsequently behave as if + there was no pragma at all": + + {-# NOINLINE [2] f #-} + + + -NOTINLINE is a synonym for -NOINLINE (NOTINLINE is specified -by Haskell 98 as the standard way to disable inlining, so it should be -used if you want your code to be portable). + + You can say "do not inline g in + Phase 3 or any subsequent phase; before that, behave as if + there was no pragma": + + {-# NOINLINE [~3] g #-} + + + - + + If you omit the phase indicator, you mean "never + inline this function". + + + + The same phase-numbering control is available for RULES + (). + + + + + LINE pragma + + LINEpragma + pragmaLINE + This pragma is similar to C's #line + pragma, and is mainly for use in automatically generated Haskell + code. It lets you specify the line number and filename of the + original code; for example + + +{-# LINE 42 "Foo.vhs" #-} + + + if you'd generated the current file from something called + Foo.vhs and this line corresponds to line + 42 in the original. GHC will adjust its error messages to refer + to the line/file named in the LINE + pragma. + + + + OPTIONS pragma + OPTIONS + + pragmaOPTIONS + + + The OPTIONS pragma is used to specify + additional options that are given to the compiler when compiling + this source file. See for + details. + + + + RULES pragma + + The RULES pragma lets you specify rewrite rules. It is + described in . + SPECIALIZE pragma @@ -2576,35 +4185,37 @@ hammeredLookup :: Ord key => [(key, value)] -> key -> value {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-} - To get very fancy, you can also specify a named function - to use for the specialised value, as in: + A SPECIALIZE pragma for a function can + be put anywhere its type signature could be put. +A SPECIALIZE has the effect of generating (a) a specialised +version of the function and (b) a rewrite rule (see ) that +rewrites a call to the un-specialised function into a call to the specialised +one. You can, instead, provide your own specialised function and your own rewrite rule. +For example, suppose that: -{-# RULES hammeredLookup = blah #-} + genericLookup :: Ord a => Table a b -> a -> b + intLookup :: Table Int b -> Int -> b - - where blah is an implementation of - hammerdLookup written specialy for - Widget lookups. It's Your +where intLookup is an implementation of genericLookup +that works very fast for keys of type Int. Then you can write the rule + + {-# RULES "intLookup" genericLookup = intLookup #-} + +(see ). It is Your Responsibility to make sure that - blah really behaves as a specialised - version of hammeredLookup!!! - - Note we use the RULE pragma here to - indicate that hammeredLookup applied at a - certain type should be replaced by blah. See - for more information on - RULES. + intLookup really behaves as a specialised + version of genericLookup!!! An example in which using RULES for specialisation will Win Big: -toDouble :: Real a => a -> Double -toDouble = fromRational . toRational + toDouble :: Real a => a -> Double + toDouble = fromRational . toRational -{-# SPECIALIZE toDouble :: Int -> Double = i2d #-} -i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly + {-# RULES "toDouble/Int" toDouble = i2d #-} + i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly The i2d function is virtually one machine @@ -2612,9 +4223,6 @@ i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly Rational—is obscenely expensive by comparison. - A SPECIALIZE pragma for a function can - be put anywhere its type signature could be put. - @@ -2642,86 +4250,12 @@ of the pragma. - -LINE pragma - - - -LINE pragma -pragma, LINE - - - -This pragma is similar to C's #line pragma, and is mainly for use in -automatically generated Haskell code. It lets you specify the line -number and filename of the original code; for example - - - - - -{-# LINE 42 "Foo.vhs" #-} - - - - - -if you'd generated the current file from something called Foo.vhs -and this line corresponds to line 42 in the original. GHC will adjust -its error messages to refer to the line/file named in the LINE -pragma. - - - - - -RULES pragma - - -The RULES pragma lets you specify rewrite rules. It is described in -. - - - - - -DEPRECATED pragma - - -The DEPRECATED pragma lets you specify that a particular function, class, or type, is deprecated. -There are two forms. - - - -You can deprecate an entire module thus: - - module Wibble {-# DEPRECATED "Use Wobble instead" #-} where - ... - - -When you compile any module that import Wibble, GHC will print -the specified message. - - - - -You can deprecate a function, class, or type, with the following top-level declaration: - - - {-# DEPRECATED f, C, T "Don't use these" #-} - - -When you compile any module that imports and uses any of the specifed entities, -GHC will print the specified message. - - - -You can suppress the warnings with the flag . - + + Rewrite rules @@ -2731,7 +4265,10 @@ GHC will print the specified message. <para> The programmer can specify rewrite rules as part of the source program -(in a pragma). GHC applies these rewrite rules wherever it can. +(in a pragma). GHC applies these rewrite rules wherever it can, provided (a) +the <option>-O</option> flag (<xref LinkEnd="options-optimise">) is on, +and (b) the <option>-frules-off</option> flag +(<xref LinkEnd="options-f">) is not specified. </para> <para> @@ -2755,16 +4292,34 @@ From a syntactic point of view: <listitem> <para> + There may be zero or more rules in a <literal>RULES</literal> pragma. +</para> +</listitem> + +<listitem> + +<para> Each rule has a name, enclosed in double quotes. The name itself has no significance at all. It is only used when reporting how many times the rule fired. </para> </listitem> -<listitem> +<listitem> <para> - There may be zero or more rules in a <literal>RULES</literal> pragma. +A rule may optionally have a phase-control number (see <xref LinkEnd="phase-control">), +immediately after the name of the rule. Thus: +<programlisting> + {-# RULES + "map/map" [2] forall f g xs. map f (map g xs) = map (f.g) xs + #-} +</programlisting> +The "[2]" means that the rule is active in Phase 2 and subsequent phases. The inverse +notation "[~2]" is also accepted, meaning that the rule is active up to, but not including, +Phase 2. </para> </listitem> + + <listitem> <para> @@ -2773,6 +4328,7 @@ is set, so you must lay out your rules starting in the same column as the enclosing definitions. </para> </listitem> + <listitem> <para> @@ -3160,7 +4716,7 @@ will fuse with one but not the other) </para> -<para> + <para> So, for example, the following should generate no intermediate lists: <programlisting> @@ -3248,7 +4804,7 @@ If you add <option>-dppr-debug</option> you get a more detailed listing. <listitem> <para> - The defintion of (say) <function>build</function> in <FileName>PrelBase.lhs</FileName> looks llike this: + The defintion of (say) <function>build</function> in <FileName>GHC/Base.lhs</FileName> looks llike this: <programlisting> build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a] @@ -3266,9 +4822,9 @@ in the RHS of the <literal>INLINE</literal> thing. I regret the delicacy of thi <listitem> <para> - In <filename>ghc/lib/std/PrelBase.lhs</filename> look at the rules for <function>map</function> to + In <filename>libraries/base/GHC/Base.lhs</filename> look at the rules for <function>map</function> to see how to write rules that will do fusion and yet give an efficient -program even if fusion doesn't happen. More rules in <filename>PrelList.lhs</filename>. +program even if fusion doesn't happen. More rules in <filename>GHC/List.lhs</filename>. </para> </listitem> @@ -3278,6 +4834,69 @@ program even if fusion doesn't happen. More rules in <filename>PrelList.lhs</fi </sect2> +<sect2 id="core-pragma"> + <title>CORE pragma + + CORE pragma + pragma, CORE + core, annotation + + + The external core format supports Note annotations; + the CORE pragma gives a way to specify what these + should be in your Haskell source code. Syntactically, core + annotations are attached to expressions and take a Haskell string + literal as an argument. The following function definition shows an + example: + + +f x = ({-# CORE "foo" #-} show) ({-# CORE "bar" #-} x) + + + Sematically, this is equivalent to: + + +g x = show x + + + + + However, when external for is generated (via + ), there will be Notes attached to the + expressions show and x. + The core function declaration for f is: + + + + f :: %forall a . GHCziShow.ZCTShow a -> + a -> GHCziBase.ZMZN GHCziBase.Char = + \ @ a (zddShow::GHCziShow.ZCTShow a) (eta::a) -> + (%note "foo" + %case zddShow %of (tpl::GHCziShow.ZCTShow a) + {GHCziShow.ZCDShow + (tpl1::GHCziBase.Int -> + a -> + GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha +r) + (tpl2::a -> GHCziBase.ZMZN GHCziBase.Char) + (tpl3::GHCziBase.ZMZN a -> + GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha +r) -> + tpl2}) + (%note "foo" + eta); + + + + Here, we can see that the function show (which + has been expanded out to a case expression over the Show dictionary) + has a %note attached to it, as does the + expression eta (which used to be called + x). + + + + @@ -3326,7 +4945,7 @@ Now you can make a data type into an instance of Bin like this: instance (Bin a, Bin b) => Bin (a,b) instance Bin a => Bin [a] -That is, just leave off the "where" clasuse. Of course, you can put in the +That is, just leave off the "where" clause. Of course, you can put in the where clause and over-ride whichever methods you please. @@ -3536,6 +5155,8 @@ Just to finish with, here's another example I rather like: + +