X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=blobdiff_plain;f=docs%2Fusers_guide%2Fglasgow_exts.xml;h=7e78c7299c9a69de6afc5bc800e1e6bd85711f93;hp=24b049ec36fce3d66d4f1205eda9a4af222751ea;hb=6a05ec5ef5373f61b7f9f5bdc344483417fa801b;hpb=f416411e20d042bbec7946cd71ccefa7483e42a7 diff --git a/docs/users_guide/glasgow_exts.xml b/docs/users_guide/glasgow_exts.xml index 24b049e..7e78c72 100644 --- a/docs/users_guide/glasgow_exts.xml +++ b/docs/users_guide/glasgow_exts.xml @@ -38,11 +38,21 @@ documentation describes all the libraries that come with GHC. extensionsoptions controlling - These flags control what variation of the language are + The language option flag control what variation of the language are permitted. Leaving out all of them gives you standard Haskell 98. - NB. turning on an option that enables special syntax + Generally speaking, all the language options are introduced by "", + e.g. . + + + All the language options can be turned off by using the prefix ""; + e.g. "". + + Language options recognised by Cabal can also be enabled using the LANGUAGE pragma, + thus {-# LANGUAGE TemplateHaskell #-} (see >). + + Turning on an option that enables special syntax might cause working Haskell 98 code to fail to compile, perhaps because it uses a variable name which has become a reserved word. So, together with each option below, we @@ -81,7 +91,8 @@ documentation describes all the libraries that come with GHC. This simultaneously enables all of the extensions to Haskell 98 described in , except where otherwise - noted. + noted. We are trying to move away from this portmanteau flag, + and towards enabling features individaully. New reserved words: forall (only in types), mdo. @@ -95,20 +106,24 @@ documentation describes all the libraries that come with GHC. float##, (#, #), |), {|. + + Implies these specific language options: + , + , + , + , + . - and : - - + : + This option enables the language extension defined in the - Haskell 98 Foreign Function Interface Addendum plus deprecated - syntax of previous versions of the FFI for backwards - compatibility. + Haskell 98 Foreign Function Interface Addendum. New reserved words: foreign. @@ -116,7 +131,7 @@ documentation describes all the libraries that come with GHC. - ,: + ,: These two flags control how generalisation is done. @@ -127,8 +142,8 @@ documentation describes all the libraries that come with GHC. - : - + : + Use GHCi's extended default rules in a regular module (). @@ -139,16 +154,16 @@ documentation describes all the libraries that come with GHC. - - + + - - + + - - + + @@ -173,8 +188,8 @@ documentation describes all the libraries that come with GHC. - - + + See . Independent of @@ -192,8 +207,8 @@ documentation describes all the libraries that come with GHC. - - + + See . Independent of @@ -202,13 +217,13 @@ documentation describes all the libraries that come with GHC. - + - -fno-implicit-prelude + -XNoImplicitPrelude option GHC normally imports Prelude.hi files for you. If you'd rather it didn't, then give it a - option. The idea is + option. The idea is that you can then import a Prelude of your own. (But don't call it Prelude; the Haskell module namespace is flat, and you must not conflict with any @@ -223,14 +238,14 @@ documentation describes all the libraries that come with GHC. translation for list comprehensions continues to use Prelude.map etc. - However, does + However, does change the handling of certain built-in syntax: see . - + Enables implicit parameters (see ). Currently also implied by @@ -243,7 +258,15 @@ documentation describes all the libraries that come with GHC. - + + + Enables overloaded string literals (see ). + + + + + Enables lexically-scoped type variables (see ). Implied by @@ -252,7 +275,7 @@ documentation describes all the libraries that come with GHC. - + Enables Template Haskell (see ). This flag must @@ -271,8 +294,6 @@ documentation describes all the libraries that come with GHC. - - Unboxed types and primitive operations @@ -377,6 +398,13 @@ worse, the unboxed value might be larger than a pointer (Double# for instance). + You cannot define a newtype whose representation type +(the argument type of the data constructor) is an unboxed type. Thus, +this is illegal: + + newtype A = MkA Int# + + You cannot bind a variable with an unboxed type in a top-level binding. @@ -546,14 +574,11 @@ import qualified Control.Monad.ST.Strict as ST linkend="search-path"/>. GHC comes with a large collection of libraries arranged - hierarchically; see the accompanying library documentation. - There is an ongoing project to create and maintain a stable set - of core libraries used by several Haskell - compilers, and the libraries that GHC comes with represent the - current status of that project. For more details, see Haskell - Libraries. - + hierarchically; see the accompanying library + documentation. More libraries to install are available + from HackageDB. @@ -622,7 +647,7 @@ to write clunky would be to use case expressions: -clunky env var1 var1 = case lookup env var1 of +clunky env var1 var2 = case lookup env var1 of Nothing -> fail Just val1 -> case lookup env var2 of Nothing -> fail @@ -647,7 +672,7 @@ Here is how I would write clunky: -clunky env var1 var1 +clunky env var1 var2 | Just val1 <- lookup env var1 , Just val2 <- lookup env var2 = val1 + val2 @@ -685,6 +710,202 @@ qualifier list has just one element, a boolean expression. + + + +View patterns + + + +View patterns are enabled by the flag -XViewPatterns. +More information and examples of view patterns can be found on the +Wiki +page. + + + +View patterns are somewhat like pattern guards that can be nested inside +of other patterns. They are a convenient way of pattern-matching +against values of abstract types. For example, in a programming language +implementation, we might represent the syntax of the types of the +language as follows: + + +type Typ + +data TypView = Unit + | Arrow Typ Typ + +view :: Type -> TypeView + +-- additional operations for constructing Typ's ... + + +The representation of Typ is held abstract, permitting implementations +to use a fancy representation (e.g., hash-consing to managage sharing). + +Without view patterns, using this signature a little inconvenient: + +size :: Typ -> Integer +size t = case view t of + Unit -> 1 + Arrow t1 t2 -> size t1 + size t2 + + +It is necessary to iterate the case, rather than using an equational +function definition. And the situation is even worse when the matching +against t is buried deep inside another pattern. + + + +View patterns permit calling the view function inside the pattern and +matching against the result: + +size (view -> Unit) = 1 +size (view -> Arrow t1 t2) = size t1 + size t2 + + +That is, we add a new form of pattern, written +expression -> +pattern that means "apply the expression to +whatever we're trying to match against, and then match the result of +that application against the pattern". The expression can be any Haskell +expression of function type, and view patterns can be used wherever +patterns are used. + + + +The semantics of a pattern ( +exp -> +pat ) are as follows: + + + + Scoping: + +The variables bound by the view pattern are the variables bound by +pat. + + + +Any variables in exp are bound occurrences, +but variables bound "to the left" in a pattern are in scope. This +feature permits, for example, one argument to a function to be used in +the view of another argument. For example, the function +clunky from can be +written using view patterns as follows: + + +clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2 +...other equations for clunky... + + + + +More precisely, the scoping rules are: + + + +In a single pattern, variables bound by patterns to the left of a view +pattern expression are in scope. For example: + +example :: Maybe ((String -> Integer,Integer), String) -> Bool +example Just ((f,_), f -> 4) = True + + +Additionally, in function definitions, variables bound by matching earlier curried +arguments may be used in view pattern expressions in later arguments: + +example :: (String -> Integer) -> String -> Bool +example f (f -> 4) = True + +That is, the scoping is the same as it would be if the curried arguments +were collected into a tuple. + + + + + +In mutually recursive bindings, such as let, +where, or the top level, view patterns in one +declaration may not mention variables bound by other declarations. That +is, each declaration must be self-contained. For example, the following +program is not allowed: + +let {(x -> y) = e1 ; + (y -> x) = e2 } in x + + +(We may lift this +restriction in the future; the only cost is that type checking patterns +would get a little more complicated.) + + + + + + + + + + Typing: If exp has type +T1 -> +T2 and pat matches +a T2, then the whole view pattern matches a +T1. + + + Matching: To the equations in Section 3.17.3 of the +Haskell 98 +Report, add the following: + +case v of { (e -> p) -> e1 ; _ -> e2 } + = +case (e v) of { p -> e1 ; _ -> e2 } + +That is, to match a variable v against a pattern +( exp +-> pat +), evaluate ( +exp v +) and match the result against +pat. + + + Efficiency: When the same view function is applied in +multiple branches of a function definition or a case expression (e.g., +in size above), GHC makes an attempt to collect these +applications into a single nested case expression, so that the view +function is only applied once. Pattern compilation in GHC follows the +matrix algorithm described in Chapter 4 of The +Implementation of Functional Programming Languages. When the +top rows of the first column of a matrix are all view patterns with the +"same" expression, these patterns are transformed into a single nested +case. This includes, for example, adjacent view patterns that line up +in a tuple, as in + +f ((view -> A, p1), p2) = e1 +f ((view -> B, p3), p4) = e2 + + + + The current notion of when two view pattern expressions are "the +same" is very restricted: it is not even full syntactic equality. +However, it does include variables, literals, applications, and tuples; +e.g., two instances of view ("hi", "there") will be +collected. However, the current implementation does not compare up to +alpha-equivalence, so two instances of (x, view x -> +y) will not be coalesced. + + + + + + + + + @@ -692,9 +913,11 @@ qualifier list has just one element, a boolean expression. The recursive do-notation (also known as mdo-notation) is implemented as described in -"A recursive do for Haskell", -Levent Erkok, John Launchbury", +A recursive do for Haskell, +by Levent Erkok, John Launchbury, Haskell Workshop 2002, pages: 29-37. Pittsburgh, Pennsylvania. +This paper is essential reading for anyone making non-trivial use of mdo-notation, +and we do not repeat it here. The do-notation of Haskell does not allow recursive bindings, @@ -725,17 +948,24 @@ class Monad m => MonadFix m where The function mfix -dictates how the required recursion operation should be performed. If recursive bindings are required for a monad, -then that monad must be declared an instance of the MonadFix class. -For details, see the above mentioned reference. +dictates how the required recursion operation should be performed. For example, +justOnes desugars as follows: + +justOnes = mfix (\xs' -> do { xs <- Just (1:xs'); return xs } + +For full details of the way in which mdo is typechecked and desugared, see +the paper A recursive do for Haskell. +In particular, GHC implements the segmentation technique described in Section 3.2 of the paper. +If recursive bindings are required for a monad, +then that monad must be declared an instance of the MonadFix class. The following instances of MonadFix are automatically provided: List, Maybe, IO. Furthermore, the Control.Monad.ST and Control.Monad.ST.Lazy modules provide the instances of the MonadFix class for Haskell's internal state monad (strict and lazy, respectively). -There are three important points in using the recursive-do notation: +Here are some important points in using the recursive-do notation: The recursive version of the do-notation uses the keyword mdo (rather @@ -743,14 +973,21 @@ than do). -You should import Control.Monad.Fix. -(Note: Strictly speaking, this import is required only when you need to refer to the name -MonadFix in your program, but the import is always safe, and the programmers -are encouraged to always import this module when using the mdo-notation.) +It is enabled with the flag -XRecursiveDo, which is in turn implied by +-fglasgow-exts. -As with other extensions, ghc should be given the flag -fglasgow-exts +Unlike ordinary do-notation, but like let and where bindings, +name shadowing is not allowed; that is, all the names bound in a single mdo must +be distinct (Section 3.3 of the paper). + + + +Variables bound by a let statement in an mdo +are monomorphic in the mdo (Section 3.1 of the paper). However +GHC breaks the mdo into segments to enhance polymorphism, +and improve termination (Section 3.2 of the paper). @@ -822,10 +1059,11 @@ This name is not supported by GHC. + + Rebindable syntax - GHC allows most kinds of built-in syntax to be rebound by the user, to facilitate replacing the Prelude with a home-grown version, for example. @@ -834,7 +1072,7 @@ This name is not supported by GHC. hierarchy. It completely defeats that purpose if the literal "1" means "Prelude.fromInteger 1", which is what the Haskell Report specifies. - So the flag causes + So the flag causes the following pieces of built-in syntax to refer to whatever is in scope, not the Prelude versions: @@ -905,18 +1143,252 @@ fromInteger :: Integer -> Bool -> Bool you should be all right. - + +Postfix operators - - -Type system extensions + +GHC allows a small extension to the syntax of left operator sections, which +allows you to define postfix operators. The extension is this: the left section + + (e !) + +is equivalent (from the point of view of both type checking and execution) to the expression + + ((!) e) + +(for any expression e and operator (!). +The strict Haskell 98 interpretation is that the section is equivalent to + + (\y -> (!) e y) + +That is, the operator must be a function of two arguments. GHC allows it to +take only one argument, and that in turn allows you to write the function +postfix. + +Since this extension goes beyond Haskell 98, it should really be enabled +by a flag; but in fact it is enabled all the time. (No Haskell 98 programs +change their behaviour, of course.) + +The extension does not extend to the left-hand side of function +definitions; you must define such a function in prefix form. + - -Data types and type synonyms + +Record field disambiguation + +In record construction and record pattern matching +it is entirely unambiguous which field is referred to, even if there are two different +data types in scope with a common field name. For example: + +module M where + data S = MkS { x :: Int, y :: Bool } + +module Foo where + import M + + data T = MkT { x :: Int } + + ok1 (MkS { x = n }) = n+1 -- Unambiguous + + ok2 n = MkT { x = n+1 } -- Unambiguous + + bad1 k = k { x = 3 } -- Ambiguous + bad2 k = x k -- Ambiguous + +Even though there are two x's in scope, +it is clear that the x in the pattern in the +definition of ok1 can only mean the field +x from type S. Similarly for +the function ok2. However, in the record update +in bad1 and the record selection in bad2 +it is not clear which of the two types is intended. + + +Haskell 98 regards all four as ambiguous, but with the + flag, GHC will accept +the former two. The rules are precisely the same as those for instance +declarations in Haskell 98, where the method names on the left-hand side +of the method bindings in an instance declaration refer unambiguously +to the method of that class (provided they are in scope at all), even +if there are other variables in scope with the same name. +This reduces the clutter of qualified names when you import two +records from different modules that use the same field name. + + + + + + +Record puns + + + +Record puns are enabled by the flag -XRecordPuns. + + + +When using records, it is common to write a pattern that binds a +variable with the same name as a record field, such as: + + +data C = C {a :: Int} +f (C {a = a}) = a + + + + +Record punning permits the variable name to be elided, so one can simply +write + + +f (C {a}) = a + + +to mean the same pattern as above. That is, in a record pattern, the +pattern a expands into the pattern a = +a for the same name a. + + + +Note that puns and other patterns can be mixed in the same record: + +data C = C {a :: Int, b :: Int} +f (C {a, b = 4}) = a + +and that puns can be used wherever record patterns occur (e.g. in +let bindings or at the top-level). + + + +Record punning can also be used in an expression, writing, for example, + +let a = 1 in C {a} + +instead of + +let a = 1 in C {a = a} + + +Note that this expansion is purely syntactic, so the record pun +expression refers to the nearest enclosing variable that is spelled the +same as the field name. + + + + + + + +Record wildcards + + + +Record wildcards are enabled by the flag -XRecordWildCards. + + + +For records with many fields, it can be tiresome to write out each field +individually in a record pattern, as in + +data C = C {a :: Int, b :: Int, c :: Int, d :: Int} +f (C {a = 1, b = b, c = c, d = d}) = b + c + d + + + + +Record wildcard syntax permits a (..) in a record +pattern, where each elided field f is replaced by the +pattern f = f. For example, the above pattern can be +written as + +f (C {a = 1, ..}) = b + c + d + + + + +Note that wildcards can be mixed with other patterns, including puns +(); for example, in a pattern C {a += 1, b, ..}). Additionally, record wildcards can be used +wherever record patterns occur, including in let +bindings and at the top-level. For example, the top-level binding + +C {a = 1, ..} = e + +defines b, c, and +d. + + + +Record wildcards can also be used in expressions, writing, for example, + + +let {a = 1; b = 2; c = 3; d = 4} in C {..} + + +in place of + + +let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d} + + +Note that this expansion is purely syntactic, so the record wildcard +expression refers to the nearest enclosing variables that are spelled +the same as the omitted field names. + + + + + + + +Local Fixity Declarations + + +A careful reading of the Haskell 98 Report reveals that fixity +declarations (infix, infixl, and +infixr) are permitted to appear inside local bindings +such those introduced by let and +where. However, the Haskell Report does not specify +the semantics of such bindings very precisely. + + +In GHC, a fixity declaration may accompany a local binding: + +let f = ... + infixr 3 `f` +in + ... + +and the fixity declaration applies wherever the binding is in scope. +For example, in a let, it applies in the right-hand +sides of other let-bindings and the body of the +letC. Or, in recursive do +expressions (), the local fixity +declarations of aA let statement scope over other +statements in the group, just as the bound name does. + - +Moreover, a local fixity declatation *must* accompany a local binding of +that name: it is not possible to revise the fixity of name bound +elsewhere, as in + +let infixr 9 $ in ... + + +Because local fixity declarations are technically Haskell 98, no flag is +necessary to enable them. + + + + + + + +Extensions to data types and type synonyms + + Data types with no constructors With the flag, GHC lets you declare @@ -930,13 +1402,13 @@ a data type with no constructors. For example: Syntactically, the declaration lacks the "= constrs" part. The type can be parameterised over types of any kind, but if the kind is not * then an explicit kind annotation must be used -(see ). +(see ). Such data types have only one value, namely bottom. Nevertheless, they can be useful when defining "phantom types". - + - + Infix type constructors, classes, and type variables @@ -1003,9 +1475,9 @@ to be written infix, very much like expressions. More specifically: - + - + Liberalised type synonyms @@ -1095,10 +1567,10 @@ this will be rejected: because GHC does not allow unboxed tuples on the left of a function arrow. - + - + Existentially quantified data constructors @@ -1192,7 +1664,7 @@ that collection of packages in a uniform manner. You can express quite a bit of object-oriented-like programming this way. - + Why existential? @@ -1215,9 +1687,9 @@ But Haskell programmers can safely think of the ordinary adding a new existential quantification construct. - + - + Type classes @@ -1277,9 +1749,9 @@ Notice the way that the syntax fits smoothly with that used for universal quantification earlier. - + - + Record Constructors @@ -1296,7 +1768,7 @@ data Counter a = forall self. NewCounter Here tag is a public field, with a well-typed selector function tag :: Counter a -> a. The self type is hidden from the outside; any attempt to apply _this, -_inc or _output as functions will raise a +_inc or _display as functions will raise a compile-time error. In other words, GHC defines a record selector function only for fields whose type does not mention the existentially-quantified variables. (This example used an underscore in the fields for which record selectors @@ -1331,20 +1803,6 @@ main = do display (inc (inc counterB)) -- prints "##" -In GADT declarations (see ), the explicit -forall may be omitted. For example, we can express -the same Counter a using GADT: - - -data Counter a where - NewCounter { _this :: self - , _inc :: self -> self - , _display :: self -> IO () - , tag :: a - } - :: Counter a - - At the moment, record update syntax is only supported for Haskell 98 data types, so the following function does not work: @@ -1356,10 +1814,10 @@ setTag obj t = obj{ tag = t } - + - + Restrictions @@ -1485,7 +1943,7 @@ are convincing reasons to change it. You can't use deriving to define instances of a data type with existentially quantified data constructors. -Reason: in most cases it would not make sense. For example:# +Reason: in most cases it would not make sense. For example:; data T = forall a. MkT [a] deriving( Eq ) @@ -1510,2413 +1968,2709 @@ declarations. Define your own instances! - - + + +Declaring data types with explicit constructor signatures - -Class declarations +GHC allows you to declare an algebraic data type by +giving the type signatures of constructors explicitly. For example: + + data Maybe a where + Nothing :: Maybe a + Just :: a -> Maybe a + +The form is called a "GADT-style declaration" +because Generalised Algebraic Data Types, described in , +can only be declared using this form. +Notice that GADT-style syntax generalises existential types (). +For example, these two declarations are equivalent: + + data Foo = forall a. MkFoo a (a -> Bool) + data Foo' where { MKFoo :: a -> (a->Bool) -> Foo' } + + +Any data type that can be declared in standard Haskell-98 syntax +can also be declared using GADT-style syntax. +The choice is largely stylistic, but GADT-style declarations differ in one important respect: +they treat class constraints on the data constructors differently. +Specifically, if the constructor is given a type-class context, that +context is made available by pattern matching. For example: + + data Set a where + MkSet :: Eq a => [a] -> Set a - -This section, and the next one, documents GHC's type-class extensions. -There's lots of background in the paper Type -classes: exploring the design space (Simon Peyton Jones, Mark -Jones, Erik Meijer). + makeSet :: Eq a => [a] -> Set a + makeSet xs = MkSet (nub xs) + + insert :: a -> Set a -> Set a + insert a (MkSet as) | a `elem` as = MkSet as + | otherwise = MkSet (a:as) + +A use of MkSet as a constructor (e.g. in the definition of makeSet) +gives rise to a (Eq a) +constraint, as you would expect. The new feature is that pattern-matching on MkSet +(as in the definition of insert) makes available an (Eq a) +context. In implementation terms, the MkSet constructor has a hidden field that stores +the (Eq a) dictionary that is passed to MkSet; so +when pattern-matching that dictionary becomes available for the right-hand side of the match. +In the example, the equality dictionary is used to satisfy the equality constraint +generated by the call to elem, so that the type of +insert itself has no Eq constraint. +This behaviour contrasts with Haskell 98's peculiar treament of +contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report). +In Haskell 98 the defintion + + data Eq a => Set' a = MkSet' [a] + +gives MkSet' the same type as MkSet above. But instead of +making available an (Eq a) constraint, pattern-matching +on MkSet' requires an (Eq a) constraint! +GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations, +GHC's behaviour is much more useful, as well as much more intuitive. -All the extensions are enabled by the flag. +For example, a possible application of GHC's behaviour is to reify dictionaries: + + data NumInst a where + MkNumInst :: Num a => NumInst a + + intInst :: NumInst Int + intInst = MkNumInst + + plus :: NumInst a -> a -> a -> a + plus MkNumInst p q = p + q + +Here, a value of type NumInst a is equivalent +to an explicit (Num a) dictionary. - -Multi-parameter type classes -Multi-parameter type classes are permitted. For example: +The rest of this section gives further details about GADT-style data +type declarations. + + +The result type of each data constructor must begin with the type constructor being defined. +If the result type of all constructors +has the form T a1 ... an, where a1 ... an +are distinct type variables, then the data type is ordinary; +otherwise is a generalised data type (). + + +The type signature of +each constructor is independent, and is implicitly universally quantified as usual. +Different constructors may have different universally-quantified type variables +and different type-class constraints. +For example, this is fine: - class Collection c a where - union :: c a -> c a -> c a - ...etc. + data T a where + T1 :: Eq b => b -> T b + T2 :: (Show c, Ix c) => c -> [c] -> T c + - - - - -The superclasses of a class declaration + +Unlike a Haskell-98-style +data type declaration, the type variable(s) in the "data Set a where" header +have no scope. Indeed, one can write a kind signature instead: + + data Set :: * -> * where ... + +or even a mixture of the two: + + data Foo a :: (* -> *) -> * where ... + +The type variables (if given) may be explicitly kinded, so we could also write the header for Foo +like this: + + data Foo a (b :: * -> *) where ... + + - -There are no restrictions on the context in a class declaration -(which introduces superclasses), except that the class hierarchy must -be acyclic. So these class declarations are OK: + +You can use strictness annotations, in the obvious places +in the constructor type: + + data Term a where + Lit :: !Int -> Term Int + If :: Term Bool -> !(Term a) -> !(Term a) -> Term a + Pair :: Term a -> Term b -> Term (a,b) + + + +You can use a deriving clause on a GADT-style data type +declaration. For example, these two declarations are equivalent - class Functor (m k) => FiniteMap m k where - ... + data Maybe1 a where { + Nothing1 :: Maybe1 a ; + Just1 :: a -> Maybe1 a + } deriving( Eq, Ord ) - class (Monad m, Monad (t m)) => Transform t m where - lift :: m a -> (t m) a + data Maybe2 a = Nothing2 | Just2 a + deriving( Eq, Ord ) + + +You can use record syntax on a GADT-style data type declaration: + + data Person where + Adult { name :: String, children :: [Person] } :: Person + Child { name :: String } :: Person + +As usual, for every constructor that has a field f, the type of +field f must be the same (modulo alpha conversion). -As in Haskell 98, The class hierarchy must be acyclic. However, the definition -of "acyclic" involves only the superclass relationships. For example, -this is OK: +At the moment, record updates are not yet possible with GADT-style declarations, +so support is limited to record construction, selection and pattern matching. +For exmaple + + aPerson = Adult { name = "Fred", children = [] } + shortName :: Person -> Bool + hasChildren (Adult { children = kids }) = not (null kids) + hasChildren (Child {}) = False + + + +As in the case of existentials declared using the Haskell-98-like record syntax +(), +record-selector functions are generated only for those fields that have well-typed +selectors. +Here is the example of that section, in GADT-style syntax: - class C a where { - op :: D b => a -> b -> b - } - - class C a => D a where { ... } +data Counter a where + NewCounter { _this :: self + , _inc :: self -> self + , _display :: self -> IO () + , tag :: a + } + :: Counter a +As before, only one selector function is generated here, that for tag. +Nevertheless, you can still use all the field names in pattern matching and record construction. + + + + +Generalised Algebraic Data Types (GADTs) -Here, C is a superclass of D, but it's OK for a -class operation op of C to mention D. (It -would not be OK for D to be a superclass of C.) +Generalised Algebraic Data Types generalise ordinary algebraic data types +by allowing constructors to have richer return types. Here is an example: + + data Term a where + Lit :: Int -> Term Int + Succ :: Term Int -> Term Int + IsZero :: Term Int -> Term Bool + If :: Term Bool -> Term a -> Term a -> Term a + Pair :: Term a -> Term b -> Term (a,b) + +Notice that the return type of the constructors is not always Term a, as is the +case with ordinary data types. This generality allows us to +write a well-typed eval function +for these Terms: + + eval :: Term a -> a + eval (Lit i) = i + eval (Succ t) = 1 + eval t + eval (IsZero t) = eval t == 0 + eval (If b e1 e2) = if eval b then eval e1 else eval e2 + eval (Pair e1 e2) = (eval e1, eval e2) + +The key point about GADTs is that pattern matching causes type refinement. +For example, in the right hand side of the equation + + eval :: Term a -> a + eval (Lit i) = ... + +the type a is refined to Int. That's the whole point! +A precise specification of the type rules is beyond what this user manual aspires to, +but the design closely follows that described in +the paper Simple +unification-based type inference for GADTs, +(ICFP 2006). +The general principle is this: type refinement is only carried out +based on user-supplied type annotations. +So if no type signature is supplied for eval, no type refinement happens, +and lots of obscure error messages will +occur. However, the refinement is quite general. For example, if we had: + + eval :: Term a -> a -> a + eval (Lit i) j = i+j + +the pattern match causes the type a to be refined to Int (because of the type +of the constructor Lit), and that refinement also applies to the type of j, and +the result type of the case expression. Hence the addition i+j is legal. - - - + +These and many other examples are given in papers by Hongwei Xi, and +Tim Sheard. There is a longer introduction +on the wiki, +and Ralf Hinze's +Fun with phantom types also has a number of examples. Note that papers +may use different notation to that implemented in GHC. + + +The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with +. + + +A GADT can only be declared using GADT-style syntax (); +the old Haskell-98 syntax for data declarations always declares an ordinary data type. +The result type of each constructor must begin with the type constructor being defined, +but for a GADT the arguments to the type constructor can be arbitrary monotypes. +For example, in the Term data +type above, the type of each constructor must end with Term ty, but +the ty may not be a type variable (e.g. the Lit +constructor). + + +You cannot use a deriving clause for a GADT; only for +an ordianary data type. + - -Class method types + +As mentioned in , record syntax is supported. +For example: + + data Term a where + Lit { val :: Int } :: Term Int + Succ { num :: Term Int } :: Term Int + Pred { num :: Term Int } :: Term Int + IsZero { arg :: Term Int } :: Term Bool + Pair { arg1 :: Term a + , arg2 :: Term b + } :: Term (a,b) + If { cnd :: Term Bool + , tru :: Term a + , fls :: Term a + } :: Term a + +However, for GADTs there is the following additional constraint: +every constructor that has a field f must have +the same result type (modulo alpha conversion) +Hence, in the above example, we cannot merge the num +and arg fields above into a +single name. Although their field types are both Term Int, +their selector functions actually have different types: - -Haskell 98 prohibits class method types to mention constraints on the -class type variable, thus: - class Seq s a where - fromList :: [a] -> s a - elem :: Eq a => a -> s a -> Bool + num :: Term Int -> Term Int + arg :: Term Bool -> Term Int -The type of elem is illegal in Haskell 98, because it -contains the constraint Eq a, constrains only the -class type variable (in this case a). -GHC lifts this restriction. - + + + - + - -Functional dependencies - + - Functional dependencies are implemented as described by Mark Jones -in “Type Classes with Functional Dependencies”, Mark P. Jones, -In Proceedings of the 9th European Symposium on Programming, -ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782, -. - - -Functional dependencies are introduced by a vertical bar in the syntax of a -class declaration; e.g. - - class (Monad m) => MonadState s m | m -> s where ... + +Extensions to the "deriving" mechanism - class Foo a b c | a b -> c where ... - -There should be more documentation, but there isn't (yet). Yell if you need it. - + +Inferred context for deriving clauses -Rules for functional dependencies -In a class declaration, all of the class type variables must be reachable (in the sense -mentioned in ) -from the free variables of each method type. -For example: - +The Haskell Report is vague about exactly when a deriving clause is +legal. For example: - class Coll s a where - empty :: s - insert :: s -> a -> s + data T0 f a = MkT0 a deriving( Eq ) + data T1 f a = MkT1 (f a) deriving( Eq ) + data T2 f a = MkT2 (f (f a)) deriving( Eq ) - -is not OK, because the type of empty doesn't mention -a. Functional dependencies can make the type variable -reachable: +The natural generated Eq code would result in these instance declarations: - class Coll s a | s -> a where - empty :: s - insert :: s -> a -> s + instance Eq a => Eq (T0 f a) where ... + instance Eq (f a) => Eq (T1 f a) where ... + instance Eq (f (f a)) => Eq (T2 f a) where ... +The first of these is obviously fine. The second is still fine, although less obviously. +The third is not Haskell 98, and risks losing termination of instances. + + +GHC takes a conservative position: it accepts the first two, but not the third. The rule is this: +each constraint in the inferred instance context must consist only of type variables, +with no repititions. + + +This rule is applied regardless of flags. If you want a more exotic context, you can write +it yourself, using the standalone deriving mechanism. + + -Alternatively Coll might be rewritten + +Stand-alone deriving declarations + +GHC now allows stand-alone deriving declarations, enabled by -XStandaloneDeriving: - class Coll s a where - empty :: s a - insert :: s a -> a -> s a - - - -which makes the connection between the type of a collection of -a's (namely (s a)) and the element type a. -Occasionally this really doesn't work, in which case you can split the -class like this: + data Foo a = Bar a | Baz String + deriving instance Eq a => Eq (Foo a) + +The syntax is identical to that of an ordinary instance declaration apart from (a) the keyword +deriving, and (b) the absence of the where part. +You must supply a context (in the example the context is (Eq a)), +exactly as you would in an ordinary instance declaration. +(In contrast the context is inferred in a deriving clause +attached to a data type declaration.) These deriving instance +rules obey the same rules concerning form and termination as ordinary instance declarations, +controlled by the same flags; see . +The stand-alone syntax is generalised for newtypes in exactly the same +way that ordinary deriving clauses are generalised (). +For example: - class CollE s where - empty :: s + newtype Foo a = MkFoo (State Int a) - class CollE s => Coll s a where - insert :: s -> a -> s + deriving instance MonadState Int Foo +GHC always treats the last parameter of the instance +(Foo in this exmample) as the type whose instance is being derived. - + - -Background on functional dependencies -The following description of the motivation and use of functional dependencies is taken -from the Hugs user manual, reproduced here (with minor changes) by kind -permission of Mark Jones. - - -Consider the following class, intended as part of a -library for collection types: - - class Collects e ce where - empty :: ce - insert :: e -> ce -> ce - member :: e -> ce -> Bool - -The type variable e used here represents the element type, while ce is the type -of the container itself. Within this framework, we might want to define -instances of this class for lists or characteristic functions (both of which -can be used to represent collections of any equality type), bit sets (which can -be used to represent collections of characters), or hash tables (which can be -used to represent any collection whose elements have a hash function). Omitting -standard implementation details, this would lead to the following declarations: - - instance Eq e => Collects e [e] where ... - instance Eq e => Collects e (e -> Bool) where ... - instance Collects Char BitSet where ... - instance (Hashable e, Collects a ce) - => Collects e (Array Int ce) where ... - -All this looks quite promising; we have a class and a range of interesting -implementations. Unfortunately, there are some serious problems with the class -declaration. First, the empty function has an ambiguous type: - - empty :: Collects e ce => ce - -By "ambiguous" we mean that there is a type variable e that appears on the left -of the => symbol, but not on the right. The problem with -this is that, according to the theoretical foundations of Haskell overloading, -we cannot guarantee a well-defined semantics for any term with an ambiguous -type. + +Deriving clause for classes <literal>Typeable</literal> and <literal>Data</literal> + + +Haskell 98 allows the programmer to add "deriving( Eq, Ord )" to a data type +declaration, to generate a standard instance declaration for classes specified in the deriving clause. +In Haskell 98, the only classes that may appear in the deriving clause are the standard +classes Eq, Ord, +Enum, Ix, Bounded, Read, and Show. -We can sidestep this specific problem by removing the empty member from the -class declaration. However, although the remaining members, insert and member, -do not have ambiguous types, we still run into problems when we try to use -them. For example, consider the following two functions: - - f x y = insert x . insert y - g = f True 'a' - -for which GHC infers the following types: - - f :: (Collects a c, Collects b c) => a -> b -> c -> c - g :: (Collects Bool c, Collects Char c) => c -> c - -Notice that the type for f allows the two parameters x and y to be assigned -different types, even though it attempts to insert each of the two values, one -after the other, into the same collection. If we're trying to model collections -that contain only one type of value, then this is clearly an inaccurate -type. Worse still, the definition for g is accepted, without causing a type -error. As a result, the error in this code will not be flagged at the point -where it appears. Instead, it will show up only when we try to use g, which -might even be in a different module. +GHC extends this list with two more classes that may be automatically derived +(provided the flag is specified): +Typeable, and Data. These classes are defined in the library +modules Data.Typeable and Data.Generics respectively, and the +appropriate class must be in scope before it can be mentioned in the deriving clause. +An instance of Typeable can only be derived if the +data type has seven or fewer type parameters, all of kind *. +The reason for this is that the Typeable class is derived using the scheme +described in + +Scrap More Boilerplate: Reflection, Zips, and Generalised Casts +. +(Section 7.4 of the paper describes the multiple Typeable classes that +are used, and only Typeable1 up to +Typeable7 are provided in the library.) +In other cases, there is nothing to stop the programmer writing a TypableX +class, whose kind suits that of the data type constructor, and +then writing the data type instance by hand. + + -An attempt to use constructor classes + +Generalised derived instances for newtypes -Faced with the problems described above, some Haskell programmers might be -tempted to use something like the following version of the class declaration: - - class Collects e c where - empty :: c e - insert :: e -> c e -> c e - member :: e -> c e -> Bool - -The key difference here is that we abstract over the type constructor c that is -used to form the collection type c e, and not over that collection type itself, -represented by ce in the original class declaration. This avoids the immediate -problems that we mentioned above: empty has type Collects e c => c -e, which is not ambiguous. - - -The function f from the previous section has a more accurate type: - - f :: (Collects e c) => e -> e -> c e -> c e - -The function g from the previous section is now rejected with a type error as -we would hope because the type of f does not allow the two arguments to have -different types. -This, then, is an example of a multiple parameter class that does actually work -quite well in practice, without ambiguity problems. -There is, however, a catch. This version of the Collects class is nowhere near -as general as the original class seemed to be: only one of the four instances -for Collects -given above can be used with this version of Collects because only one of -them---the instance for lists---has a collection type that can be written in -the form c e, for some type constructor c, and element type e. - - +When you define an abstract type using newtype, you may want +the new type to inherit some instances from its representation. In +Haskell 98, you can inherit instances of Eq, Ord, +Enum and Bounded by deriving them, but for any +other classes you have to write an explicit instance declaration. For +example, if you define -Adding functional dependencies + + newtype Dollars = Dollars Int + - -To get a more useful version of the Collects class, Hugs provides a mechanism -that allows programmers to specify dependencies between the parameters of a -multiple parameter class (For readers with an interest in theoretical -foundations and previous work: The use of dependency information can be seen -both as a generalization of the proposal for `parametric type classes' that was -put forward by Chen, Hudak, and Odersky, or as a special case of Mark Jones's -later framework for "improvement" of qualified types. The -underlying ideas are also discussed in a more theoretical and abstract setting -in a manuscript [implparam], where they are identified as one point in a -general design space for systems of implicit parameterization.). +and you want to use arithmetic on Dollars, you have to +explicitly define an instance of Num: -To start with an abstract example, consider a declaration such as: - - class C a b where ... - -which tells us simply that C can be thought of as a binary relation on types -(or type constructors, depending on the kinds of a and b). Extra clauses can be -included in the definition of classes to add information about dependencies -between parameters, as in the following examples: - - class D a b | a -> b where ... - class E a b | a -> b, b -> a where ... + + instance Num Dollars where + Dollars a + Dollars b = Dollars (a+b) + ... -The notation a -> b used here between the | and where -symbols --- not to be -confused with a function type --- indicates that the a parameter uniquely -determines the b parameter, and might be read as "a determines b." Thus D is -not just a relation, but actually a (partial) function. Similarly, from the two -dependencies that are included in the definition of E, we can see that E -represents a (partial) one-one mapping between types. - - -More generally, dependencies take the form x1 ... xn -> y1 ... ym, -where x1, ..., xn, and y1, ..., yn are type variables with n>0 and -m>=0, meaning that the y parameters are uniquely determined by the x -parameters. Spaces can be used as separators if more than one variable appears -on any single side of a dependency, as in t -> a b. Note that a class may be -annotated with multiple dependencies using commas as separators, as in the -definition of E above. Some dependencies that we can write in this notation are -redundant, and will be rejected because they don't serve any useful -purpose, and may instead indicate an error in the program. Examples of -dependencies like this include a -> a , -a -> a a , -a -> , etc. There can also be -some redundancy if multiple dependencies are given, as in -a->b, - b->c , a->c , and -in which some subset implies the remaining dependencies. Examples like this are -not treated as errors. Note that dependencies appear only in class -declarations, and not in any other part of the language. In particular, the -syntax for instance declarations, class constraints, and types is completely -unchanged. +All the instance does is apply and remove the newtype +constructor. It is particularly galling that, since the constructor +doesn't appear at run-time, this instance declaration defines a +dictionary which is wholly equivalent to the Int +dictionary, only slower! + + + Generalising the deriving clause -By including dependencies in a class declaration, we provide a mechanism for -the programmer to specify each multiple parameter class more precisely. The -compiler, on the other hand, is responsible for ensuring that the set of -instances that are in scope at any given point in the program is consistent -with any declared dependencies. For example, the following pair of instance -declarations cannot appear together in the same scope because they violate the -dependency for D, even though either one on its own would be acceptable: - - instance D Bool Int where ... - instance D Bool Char where ... - -Note also that the following declaration is not allowed, even by itself: - - instance D [a] b where ... - -The problem here is that this instance would allow one particular choice of [a] -to be associated with more than one choice for b, which contradicts the -dependency specified in the definition of D. More generally, this means that, -in any instance of the form: - - instance D t s where ... - -for some particular types t and s, the only variables that can appear in s are -the ones that appear in t, and hence, if the type t is known, then s will be -uniquely determined. +GHC now permits such instances to be derived instead, +using the flag , +so one can write + + newtype Dollars = Dollars Int deriving (Eq,Show,Num) + + +and the implementation uses the same Num dictionary +for Dollars as for Int. Notionally, the compiler +derives an instance declaration of the form + + + instance Num Int => Num Dollars + + +which just adds or removes the newtype constructor according to the type. -The benefit of including dependency information is that it allows us to define -more general multiple parameter classes, without ambiguity problems, and with -the benefit of more accurate types. To illustrate this, we return to the -collection class example, and annotate the original definition of Collects -with a simple dependency: - - class Collects e ce | ce -> e where - empty :: ce - insert :: e -> ce -> ce - member :: e -> ce -> Bool + +We can also derive instances of constructor classes in a similar +way. For example, suppose we have implemented state and failure monad +transformers, such that + + + instance Monad m => Monad (State s m) + instance Monad m => Monad (Failure m) + +In Haskell 98, we can define a parsing monad by + + type Parser tok m a = State [tok] (Failure m) a + + +which is automatically a monad thanks to the instance declarations +above. With the extension, we can make the parser type abstract, +without needing to write an instance of class Monad, via + + + newtype Parser tok m a = Parser (State [tok] (Failure m) a) + deriving Monad -The dependency ce -> e here specifies that the type e of elements is uniquely -determined by the type of the collection ce. Note that both parameters of -Collects are of kind *; there are no constructor classes here. Note too that -all of the instances of Collects that we gave earlier can be used -together with this new definition. - - -What about the ambiguity problems that we encountered with the original -definition? The empty function still has type Collects e ce => ce, but it is no -longer necessary to regard that as an ambiguous type: Although the variable e -does not appear on the right of the => symbol, the dependency for class -Collects tells us that it is uniquely determined by ce, which does appear on -the right of the => symbol. Hence the context in which empty is used can still -give enough information to determine types for both ce and e, without -ambiguity. More generally, we need only regard a type as ambiguous if it -contains a variable on the left of the => that is not uniquely determined -(either directly or indirectly) by the variables on the right. +In this case the derived instance declaration is of the form + + instance Monad (State [tok] (Failure m)) => Monad (Parser tok m) + + +Notice that, since Monad is a constructor class, the +instance is a partial application of the new type, not the +entire left hand side. We can imagine that the type declaration is +"eta-converted" to generate the context of the instance +declaration. -Dependencies also help to produce more accurate types for user defined -functions, and hence to provide earlier detection of errors, and less cluttered -types for programmers to work with. Recall the previous definition for a -function f: - - f x y = insert x y = insert x . insert y - -for which we originally obtained a type: - - f :: (Collects a c, Collects b c) => a -> b -> c -> c + +We can even derive instances of multi-parameter classes, provided the +newtype is the last class parameter. In this case, a ``partial +application'' of the class appears in the deriving +clause. For example, given the class + + + class StateMonad s m | m -> s where ... + instance Monad m => StateMonad s (State s m) where ... + +then we can derive an instance of StateMonad for Parsers by + + newtype Parser tok m a = Parser (State [tok] (Failure m) a) + deriving (Monad, StateMonad [tok]) -Given the dependency information that we have for Collects, however, we can -deduce that a and b must be equal because they both appear as the second -parameter in a Collects constraint with the same first parameter c. Hence we -can infer a shorter and more accurate type for f: - - f :: (Collects a c) => a -> a -> c -> c + +The derived instance is obtained by completing the application of the +class to the new type: + + + instance StateMonad [tok] (State [tok] (Failure m)) => + StateMonad [tok] (Parser tok m) -In a similar way, the earlier definition of g will now be flagged as a type error. -Although we have given only a few examples here, it should be clear that the -addition of dependency information can help to make multiple parameter classes -more useful in practice, avoiding ambiguity problems, and allowing more general -sets of instance declarations. + +As a result of this extension, all derived instances in newtype + declarations are treated uniformly (and implemented just by reusing +the dictionary for the representation type), except +Show and Read, which really behave differently for +the newtype and its representation. - - - -Instance declarations + A more precise specification + +Derived instance declarations are constructed as follows. Consider the +declaration (after expansion of any type synonyms) - -Relaxed rules for instance declarations + + newtype T v1...vn = T' (t vk+1...vn) deriving (c1...cm) + -An instance declaration has the form - - instance ( assertion1, ..., assertionn) => class type1 ... typem where ... - -The part before the "=>" is the -context, while the part after the -"=>" is the head of the instance declaration. - +where + + + The ci are partial applications of + classes of the form C t1'...tj', where the arity of C + is exactly j+1. That is, C lacks exactly one type argument. + + + The k is chosen so that ci (T v1...vk) is well-kinded. + + + The type t is an arbitrary type. + + + The type variables vk+1...vn do not occur in t, + nor in the ci, and + + + None of the ci is Read, Show, + Typeable, or Data. These classes + should not "look through" the type or its constructor. You can still + derive these classes for a newtype, but it happens in the usual way, not + via this new mechanism. + + +Then, for each ci, the derived instance +declaration is: + + instance ci t => ci (T v1...vk) + +As an example which does not work, consider + + newtype NonMonad m s = NonMonad (State s m s) deriving Monad + +Here we cannot derive the instance + + instance Monad (State s m) => Monad (NonMonad m) + - -In Haskell 98 the head of an instance declaration -must be of the form C (T a1 ... an), where -C is the class, T is a type constructor, -and the a1 ... an are distinct type variables. -Furthermore, the assertions in the context of the instance declaration -must be of the form C a where a -is a type variable that occurs in the head. +because the type variable s occurs in State s m, +and so cannot be "eta-converted" away. It is a good thing that this +deriving clause is rejected, because NonMonad m is +not, in fact, a monad --- for the same reason. Try defining +>>= with the correct type: you won't be able to. -The flag loosens these restrictions -considerably. Firstly, multi-parameter type classes are permitted. Secondly, -the context and head of the instance declaration can each consist of arbitrary -(well-kinded) assertions (C t1 ... tn) subject only to the -following rules: - - -For each assertion in the context: - -No type variable has more occurrences in the assertion than in the head -The assertion has fewer constructors and variables (taken together - and counting repetitions) than the head - - -The coverage condition. For each functional dependency, -tvsleft -> -tvsright, of the class, -every type variable in -S(tvsright) must appear in -S(tvsleft), where S is the -substitution mapping each type variable in the class declaration to the -corresponding type in the instance declaration. - - -These restrictions ensure that context reduction terminates: each reduction -step makes the problem smaller by at least one -constructor. For example, the following would make the type checker -loop if it wasn't excluded: - - instance C a => C a where ... +Notice also that the order of class parameters becomes +important, since we can only derive instances for the last one. If the +StateMonad class above were instead defined as + + + class StateMonad m s | m -> s where ... -For example, these are OK: - - instance C Int [a] -- Multiple parameters - instance Eq (S [a]) -- Structured type in head - -- Repeated type variable in head - instance C4 a a => C4 [a] [a] - instance Stateful (ST s) (MutVar s) +then we would not have been able to derive an instance for the +Parser type above. We hypothesise that multi-parameter +classes usually have one "main" parameter for which deriving new +instances is most interesting. + +Lastly, all of this applies only for classes other than +Read, Show, Typeable, +and Data, for which the built-in derivation applies (section +4.3.3. of the Haskell Report). +(For the standard classes Eq, Ord, +Ix, and Bounded it is immaterial whether +the standard method is used or the one described here.) + + + + + - -- Head can consist of type variables only - instance C a - instance (Eq a, Show b) => C2 a b + + +Class and instances declarations - -- Non-type variables in context - instance Show (s a) => Show (Sized s a) - instance C2 Int a => C3 Bool [a] - instance C2 Int a => C3 [a] b - -But these are not: - - -- Context assertion no smaller than head - instance C a => C a where ... - -- (C b b) has more more occurrences of b than the head - instance C b b => Foo [b] where ... - - + +Class declarations -The same restrictions apply to instances generated by -deriving clauses. Thus the following is accepted: - - data MinHeap h a = H a (h a) - deriving (Show) - -because the derived instance - - instance (Show a, Show (h a)) => Show (MinHeap h a) - -conforms to the above rules. +This section, and the next one, documents GHC's type-class extensions. +There's lots of background in the paper Type +classes: exploring the design space (Simon Peyton Jones, Mark +Jones, Erik Meijer). + + +All the extensions are enabled by the flag. + +Multi-parameter type classes -A useful idiom permitted by the above rules is as follows. -If one allows overlapping instance declarations then it's quite -convenient to have a "default instance" declaration that applies if -something more specific does not: +Multi-parameter type classes are permitted. For example: + + - instance C a where - op = ... -- Default + class Collection c a where + union :: c a -> c a -> c a + ...etc. - -You can find lots of background material about the reason for these -restrictions in the paper -Understanding functional dependencies via Constraint Handling Rules. + - -Undecidable instances + +The superclasses of a class declaration -Sometimes even the rules of are too onerous. -For example, sometimes you might want to use the following to get the -effect of a "class synonym": - - class (C1 a, C2 a, C3 a) => C a where { } +There are no restrictions on the context in a class declaration +(which introduces superclasses), except that the class hierarchy must +be acyclic. So these class declarations are OK: - instance (C1 a, C2 a, C3 a) => C a where { } - -This allows you to write shorter signatures: - - f :: C a => ... - -instead of - - f :: (C1 a, C2 a, C3 a) => ... - -The restrictions on functional dependencies () are particularly troublesome. -It is tempting to introduce type variables in the context that do not appear in -the head, something that is excluded by the normal rules. For example: - - class HasConverter a b | a -> b where - convert :: a -> b - - data Foo a = MkFoo a - instance (HasConverter a b,Show b) => Show (Foo a) where - show (MkFoo value) = show (convert value) - -This is dangerous territory, however. Here, for example, is a program that would make the -typechecker loop: - - class D a - class F a b | a->b - instance F [a] [[a]] - instance (D c, F a c) => D [a] -- 'c' is not mentioned in the head - -Similarly, it can be tempting to lift the coverage condition: - class Mul a b c | a b -> c where - (.*.) :: a -> b -> c + class Functor (m k) => FiniteMap m k where + ... - instance Mul Int Int Int where (.*.) = (*) - instance Mul Int Float Float where x .*. y = fromIntegral x * y - instance Mul a b c => Mul a [b] [c] where x .*. v = map (x.*.) v - -The third instance declaration does not obey the coverage condition; -and indeed the (somewhat strange) definition: - - f = \ b x y -> if b then x .*. [y] else y + class (Monad m, Monad (t m)) => Transform t m where + lift :: m a -> (t m) a -makes instance inference go into a loop, because it requires the constraint -(Mul a [b] b). + + -Nevertheless, GHC allows you to experiment with more liberal rules. If you use -the experimental flag --fallow-undecidable-instances -option, you can use arbitrary -types in both an instance context and instance head. Termination is ensured by having a -fixed-depth recursion stack. If you exceed the stack depth you get a -sort of backtrace, and the opportunity to increase the stack depth -with N. - - - +As in Haskell 98, The class hierarchy must be acyclic. However, the definition +of "acyclic" involves only the superclass relationships. For example, +this is OK: - -Overlapping instances - -In general, GHC requires that that it be unambiguous which instance -declaration -should be used to resolve a type-class constraint. This behaviour -can be modified by two flags: --fallow-overlapping-instances - -and --fallow-incoherent-instances -, as this section discusses. Both these -flags are dynamic flags, and can be set on a per-module basis, using -an OPTIONS_GHC pragma if desired (). - -When GHC tries to resolve, say, the constraint C Int Bool, -it tries to match every instance declaration against the -constraint, -by instantiating the head of the instance declaration. For example, consider -these declarations: - instance context1 => C Int a where ... -- (A) - instance context2 => C a Bool where ... -- (B) - instance context3 => C Int [a] where ... -- (C) - instance context4 => C Int [Int] where ... -- (D) + class C a where { + op :: D b => a -> b -> b + } + + class C a => D a where { ... } -The instances (A) and (B) match the constraint C Int Bool, -but (C) and (D) do not. When matching, GHC takes -no account of the context of the instance declaration -(context1 etc). -GHC's default behaviour is that exactly one instance must match the -constraint it is trying to resolve. -It is fine for there to be a potential of overlap (by -including both declarations (A) and (B), say); an error is only reported if a -particular constraint matches more than one. - - -The flag instructs GHC to allow -more than one instance to match, provided there is a most specific one. For -example, the constraint C Int [Int] matches instances (A), -(C) and (D), but the last is more specific, and hence is chosen. If there is no -most-specific match, the program is rejected. + +Here, C is a superclass of D, but it's OK for a +class operation op of C to mention D. (It +would not be OK for D to be a superclass of C.) + + + + + + +Class method types + -However, GHC is conservative about committing to an overlapping instance. For example: +Haskell 98 prohibits class method types to mention constraints on the +class type variable, thus: - f :: [b] -> [b] - f x = ... + class Seq s a where + fromList :: [a] -> s a + elem :: Eq a => a -> s a -> Bool -Suppose that from the RHS of f we get the constraint -C Int [b]. But -GHC does not commit to instance (C), because in a particular -call of f, b might be instantiate -to Int, in which case instance (D) would be more specific still. -So GHC rejects the program. If you add the flag , -GHC will instead pick (C), without complaining about -the problem of subsequent instantiations. - - -The willingness to be overlapped or incoherent is a property of -the instance declaration itself, controlled by the -presence or otherwise of the -and flags when that mdodule is -being defined. Neither flag is required in a module that imports and uses the -instance declaration. Specifically, during the lookup process: - - -An instance declaration is ignored during the lookup process if (a) a more specific -match is found, and (b) the instance declaration was compiled with -. The flag setting for the -more-specific instance does not matter. - - -Suppose an instance declaration does not matche the constraint being looked up, but -does unify with it, so that it might match when the constraint is further -instantiated. Usually GHC will regard this as a reason for not committing to -some other constraint. But if the instance declaration was compiled with -, GHC will skip the "does-it-unify?" -check for that declaration. - - -These rules make it possible for a library author to design a library that relies on -overlapping instances without the library client having to know. - - -If an instance declaration is compiled without -, -then that instance can never be overlapped. This could perhaps be -inconvenient. Perhaps the rule should instead say that the -overlapping instance declaration should be compiled in -this way, rather than the overlapped one. Perhaps overlap -at a usage site should be permitted regardless of how the instance declarations -are compiled, if the flag is -used at the usage site. (Mind you, the exact usage site can occasionally be -hard to pin down.) We are interested to receive feedback on these points. +The type of elem is illegal in Haskell 98, because it +contains the constraint Eq a, constrains only the +class type variable (in this case a). +GHC lifts this restriction. -The flag implies the - flag, but not vice versa. + + + + + + +Functional dependencies + + + Functional dependencies are implemented as described by Mark Jones +in “Type Classes with Functional Dependencies”, Mark P. Jones, +In Proceedings of the 9th European Symposium on Programming, +ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782, +. - + +Functional dependencies are introduced by a vertical bar in the syntax of a +class declaration; e.g. + + class (Monad m) => MonadState s m | m -> s where ... - -Type synonyms in the instance head + class Foo a b c | a b -> c where ... + +There should be more documentation, but there isn't (yet). Yell if you need it. + +Rules for functional dependencies -Unlike Haskell 98, instance heads may use type -synonyms. (The instance "head" is the bit after the "=>" in an instance decl.) -As always, using a type synonym is just shorthand for -writing the RHS of the type synonym definition. For example: - +In a class declaration, all of the class type variables must be reachable (in the sense +mentioned in ) +from the free variables of each method type. +For example: - type Point = (Int,Int) - instance C Point where ... - instance C [Point] where ... + class Coll s a where + empty :: s + insert :: s -> a -> s +is not OK, because the type of empty doesn't mention +a. Functional dependencies can make the type variable +reachable: + + class Coll s a | s -> a where + empty :: s + insert :: s -> a -> s + -is legal. However, if you added - +Alternatively Coll might be rewritten - instance C (Int,Int) where ... + class Coll s a where + empty :: s a + insert :: s a -> a -> s a -as well, then the compiler will complain about the overlapping -(actually, identical) instance declarations. As always, type synonyms -must be fully applied. You cannot, for example, write: +which makes the connection between the type of a collection of +a's (namely (s a)) and the element type a. +Occasionally this really doesn't work, in which case you can split the +class like this: - type P a = [[a]] - instance Monad P where ... - - - -This design decision is independent of all the others, and easily -reversed, but it makes sense to me. + class CollE s where + empty :: s + class CollE s => Coll s a where + insert :: s -> a -> s + - - - -Type signatures + +Background on functional dependencies -The context of a type signature - -Unlike Haskell 98, constraints in types do not have to be of -the form (class type-variable) or -(class (type-variable type-variable ...)). Thus, -these type signatures are perfectly OK +The following description of the motivation and use of functional dependencies is taken +from the Hugs user manual, reproduced here (with minor changes) by kind +permission of Mark Jones. + + +Consider the following class, intended as part of a +library for collection types: - g :: Eq [a] => ... - g :: Ord (T a ()) => ... + class Collects e ce where + empty :: ce + insert :: e -> ce -> ce + member :: e -> ce -> Bool + +The type variable e used here represents the element type, while ce is the type +of the container itself. Within this framework, we might want to define +instances of this class for lists or characteristic functions (both of which +can be used to represent collections of any equality type), bit sets (which can +be used to represent collections of characters), or hash tables (which can be +used to represent any collection whose elements have a hash function). Omitting +standard implementation details, this would lead to the following declarations: + + instance Eq e => Collects e [e] where ... + instance Eq e => Collects e (e -> Bool) where ... + instance Collects Char BitSet where ... + instance (Hashable e, Collects a ce) + => Collects e (Array Int ce) where ... + +All this looks quite promising; we have a class and a range of interesting +implementations. Unfortunately, there are some serious problems with the class +declaration. First, the empty function has an ambiguous type: + + empty :: Collects e ce => ce +By "ambiguous" we mean that there is a type variable e that appears on the left +of the => symbol, but not on the right. The problem with +this is that, according to the theoretical foundations of Haskell overloading, +we cannot guarantee a well-defined semantics for any term with an ambiguous +type. -GHC imposes the following restrictions on the constraints in a type signature. -Consider the type: - +We can sidestep this specific problem by removing the empty member from the +class declaration. However, although the remaining members, insert and member, +do not have ambiguous types, we still run into problems when we try to use +them. For example, consider the following two functions: - forall tv1..tvn (c1, ...,cn) => type + f x y = insert x . insert y + g = f True 'a' - -(Here, we write the "foralls" explicitly, although the Haskell source -language omits them; in Haskell 98, all the free type variables of an -explicit source-language type signature are universally quantified, -except for the class type variables in a class declaration. However, -in GHC, you can give the foralls if you want. See ). +for which GHC infers the following types: + + f :: (Collects a c, Collects b c) => a -> b -> c -> c + g :: (Collects Bool c, Collects Char c) => c -> c + +Notice that the type for f allows the two parameters x and y to be assigned +different types, even though it attempts to insert each of the two values, one +after the other, into the same collection. If we're trying to model collections +that contain only one type of value, then this is clearly an inaccurate +type. Worse still, the definition for g is accepted, without causing a type +error. As a result, the error in this code will not be flagged at the point +where it appears. Instead, it will show up only when we try to use g, which +might even be in a different module. - - - - +An attempt to use constructor classes - Each universally quantified type variable -tvi must be reachable from type. +Faced with the problems described above, some Haskell programmers might be +tempted to use something like the following version of the class declaration: + + class Collects e c where + empty :: c e + insert :: e -> c e -> c e + member :: e -> c e -> Bool + +The key difference here is that we abstract over the type constructor c that is +used to form the collection type c e, and not over that collection type itself, +represented by ce in the original class declaration. This avoids the immediate +problems that we mentioned above: empty has type Collects e c => c +e, which is not ambiguous. + + +The function f from the previous section has a more accurate type: + + f :: (Collects e c) => e -> e -> c e -> c e + +The function g from the previous section is now rejected with a type error as +we would hope because the type of f does not allow the two arguments to have +different types. +This, then, is an example of a multiple parameter class that does actually work +quite well in practice, without ambiguity problems. +There is, however, a catch. This version of the Collects class is nowhere near +as general as the original class seemed to be: only one of the four instances +for Collects +given above can be used with this version of Collects because only one of +them---the instance for lists---has a collection type that can be written in +the form c e, for some type constructor c, and element type e. + + -A type variable a is "reachable" if it it appears -in the same constraint as either a type variable free in in -type, or another reachable type variable. -A value with a type that does not obey -this reachability restriction cannot be used without introducing -ambiguity; that is why the type is rejected. -Here, for example, is an illegal type: +Adding functional dependencies + +To get a more useful version of the Collects class, Hugs provides a mechanism +that allows programmers to specify dependencies between the parameters of a +multiple parameter class (For readers with an interest in theoretical +foundations and previous work: The use of dependency information can be seen +both as a generalization of the proposal for `parametric type classes' that was +put forward by Chen, Hudak, and Odersky, or as a special case of Mark Jones's +later framework for "improvement" of qualified types. The +underlying ideas are also discussed in a more theoretical and abstract setting +in a manuscript [implparam], where they are identified as one point in a +general design space for systems of implicit parameterization.). +To start with an abstract example, consider a declaration such as: + + class C a b where ... + +which tells us simply that C can be thought of as a binary relation on types +(or type constructors, depending on the kinds of a and b). Extra clauses can be +included in the definition of classes to add information about dependencies +between parameters, as in the following examples: + + class D a b | a -> b where ... + class E a b | a -> b, b -> a where ... + +The notation a -> b used here between the | and where +symbols --- not to be +confused with a function type --- indicates that the a parameter uniquely +determines the b parameter, and might be read as "a determines b." Thus D is +not just a relation, but actually a (partial) function. Similarly, from the two +dependencies that are included in the definition of E, we can see that E +represents a (partial) one-one mapping between types. + + +More generally, dependencies take the form x1 ... xn -> y1 ... ym, +where x1, ..., xn, and y1, ..., yn are type variables with n>0 and +m>=0, meaning that the y parameters are uniquely determined by the x +parameters. Spaces can be used as separators if more than one variable appears +on any single side of a dependency, as in t -> a b. Note that a class may be +annotated with multiple dependencies using commas as separators, as in the +definition of E above. Some dependencies that we can write in this notation are +redundant, and will be rejected because they don't serve any useful +purpose, and may instead indicate an error in the program. Examples of +dependencies like this include a -> a , +a -> a a , +a -> , etc. There can also be +some redundancy if multiple dependencies are given, as in +a->b, + b->c , a->c , and +in which some subset implies the remaining dependencies. Examples like this are +not treated as errors. Note that dependencies appear only in class +declarations, and not in any other part of the language. In particular, the +syntax for instance declarations, class constraints, and types is completely +unchanged. + + +By including dependencies in a class declaration, we provide a mechanism for +the programmer to specify each multiple parameter class more precisely. The +compiler, on the other hand, is responsible for ensuring that the set of +instances that are in scope at any given point in the program is consistent +with any declared dependencies. For example, the following pair of instance +declarations cannot appear together in the same scope because they violate the +dependency for D, even though either one on its own would be acceptable: + + instance D Bool Int where ... + instance D Bool Char where ... + +Note also that the following declaration is not allowed, even by itself: - forall a. Eq a => Int + instance D [a] b where ... - - -When a value with this type was used, the constraint Eq tv -would be introduced where tv is a fresh type variable, and -(in the dictionary-translation implementation) the value would be -applied to a dictionary for Eq tv. The difficulty is that we -can never know which instance of Eq to use because we never -get any more information about tv. +The problem here is that this instance would allow one particular choice of [a] +to be associated with more than one choice for b, which contradicts the +dependency specified in the definition of D. More generally, this means that, +in any instance of the form: + + instance D t s where ... + +for some particular types t and s, the only variables that can appear in s are +the ones that appear in t, and hence, if the type t is known, then s will be +uniquely determined. -Note -that the reachability condition is weaker than saying that a is -functionally dependent on a type variable free in -type (see ). The reason for this is there -might be a "hidden" dependency, in a superclass perhaps. So -"reachable" is a conservative approximation to "functionally dependent". -For example, consider: +The benefit of including dependency information is that it allows us to define +more general multiple parameter classes, without ambiguity problems, and with +the benefit of more accurate types. To illustrate this, we return to the +collection class example, and annotate the original definition of Collects +with a simple dependency: - class C a b | a -> b where ... - class C a b => D a b where ... - f :: forall a b. D a b => a -> a + class Collects e ce | ce -> e where + empty :: ce + insert :: e -> ce -> ce + member :: e -> ce -> Bool -This is fine, because in fact a does functionally determine b -but that is not immediately apparent from f's type. +The dependency ce -> e here specifies that the type e of elements is uniquely +determined by the type of the collection ce. Note that both parameters of +Collects are of kind *; there are no constructor classes here. Note too that +all of the instances of Collects that we gave earlier can be used +together with this new definition. - - - - Every constraint ci must mention at least one of the -universally quantified type variables tvi. - -For example, this type is OK because C a b mentions the -universally quantified type variable b: - - +What about the ambiguity problems that we encountered with the original +definition? The empty function still has type Collects e ce => ce, but it is no +longer necessary to regard that as an ambiguous type: Although the variable e +does not appear on the right of the => symbol, the dependency for class +Collects tells us that it is uniquely determined by ce, which does appear on +the right of the => symbol. Hence the context in which empty is used can still +give enough information to determine types for both ce and e, without +ambiguity. More generally, we need only regard a type as ambiguous if it +contains a variable on the left of the => that is not uniquely determined +(either directly or indirectly) by the variables on the right. + + +Dependencies also help to produce more accurate types for user defined +functions, and hence to provide earlier detection of errors, and less cluttered +types for programmers to work with. Recall the previous definition for a +function f: - forall a. C a b => burble + f x y = insert x y = insert x . insert y - - -The next type is illegal because the constraint Eq b does not -mention a: - - +for which we originally obtained a type: - forall a. Eq b => burble + f :: (Collects a c, Collects b c) => a -> b -> c -> c + +Given the dependency information that we have for Collects, however, we can +deduce that a and b must be equal because they both appear as the second +parameter in a Collects constraint with the same first parameter c. Hence we +can infer a shorter and more accurate type for f: + + f :: (Collects a c) => a -> a -> c -> c +In a similar way, the earlier definition of g will now be flagged as a type error. + + +Although we have given only a few examples here, it should be clear that the +addition of dependency information can help to make multiple parameter classes +more useful in practice, avoiding ambiguity problems, and allowing more general +sets of instance declarations. + + + + + +Instance declarations -The reason for this restriction is milder than the other one. The -excluded types are never useful or necessary (because the offending -context doesn't need to be witnessed at this point; it can be floated -out). Furthermore, floating them out increases sharing. Lastly, -excluding them is a conservative choice; it leaves a patch of -territory free in case we need it later. + +Relaxed rules for instance declarations +An instance declaration has the form + + instance ( assertion1, ..., assertionn) => class type1 ... typem where ... + +The part before the "=>" is the +context, while the part after the +"=>" is the head of the instance declaration. - + +In Haskell 98 the head of an instance declaration +must be of the form C (T a1 ... an), where +C is the class, T is a type constructor, +and the a1 ... an are distinct type variables. +Furthermore, the assertions in the context of the instance declaration +must be of the form C a where a +is a type variable that occurs in the head. + + +The flag loosens these restrictions +considerably. Firstly, multi-parameter type classes are permitted. Secondly, +the context and head of the instance declaration can each consist of arbitrary +(well-kinded) assertions (C t1 ... tn) subject only to the +following rules: + + +The Paterson Conditions: for each assertion in the context + +No type variable has more occurrences in the assertion than in the head +The assertion has fewer constructors and variables (taken together + and counting repetitions) than the head + +The Coverage Condition. For each functional dependency, +tvsleft -> +tvsright, of the class, +every type variable in +S(tvsright) must appear in +S(tvsleft), where S is the +substitution mapping each type variable in the class declaration to the +corresponding type in the instance declaration. + + +These restrictions ensure that context reduction terminates: each reduction +step makes the problem smaller by at least one +constructor. Both the Paterson Conditions and the Coverage Condition are lifted +if you give the +flag (). +You can find lots of background material about the reason for these +restrictions in the paper +Understanding functional dependencies via Constraint Handling Rules. - - - -For-all hoisting -It is often convenient to use generalised type synonyms (see ) at the right hand -end of an arrow, thus: +For example, these are OK: - type Discard a = forall b. a -> b -> a + instance C Int [a] -- Multiple parameters + instance Eq (S [a]) -- Structured type in head - g :: Int -> Discard Int - g x y z = x+y - -Simply expanding the type synonym would give - - g :: Int -> (forall b. Int -> b -> Int) + -- Repeated type variable in head + instance C4 a a => C4 [a] [a] + instance Stateful (ST s) (MutVar s) + + -- Head can consist of type variables only + instance C a + instance (Eq a, Show b) => C2 a b + + -- Non-type variables in context + instance Show (s a) => Show (Sized s a) + instance C2 Int a => C3 Bool [a] + instance C2 Int a => C3 [a] b -but GHC "hoists" the forall to give the isomorphic type +But these are not: - g :: forall b. Int -> Int -> b -> Int + -- Context assertion no smaller than head + instance C a => C a where ... + -- (C b b) has more more occurrences of b than the head + instance C b b => Foo [b] where ... -In general, the rule is this: to determine the type specified by any explicit -user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly -performs the transformation: + + + +The same restrictions apply to instances generated by +deriving clauses. Thus the following is accepted: - type1 -> forall a1..an. context2 => type2 -==> - forall a1..an. context2 => type1 -> type2 + data MinHeap h a = H a (h a) + deriving (Show) -(In fact, GHC tries to retain as much synonym information as possible for use in -error messages, but that is a usability issue.) This rule applies, of course, whether -or not the forall comes from a synonym. For example, here is another -valid way to write g's type signature: +because the derived instance - g :: Int -> Int -> forall b. b -> Int + instance (Show a, Show (h a)) => Show (MinHeap h a) +conforms to the above rules. + -When doing this hoisting operation, GHC eliminates duplicate constraints. For -example: - - type Foo a = (?x::Int) => Bool -> a - g :: Foo (Foo Int) - -means +A useful idiom permitted by the above rules is as follows. +If one allows overlapping instance declarations then it's quite +convenient to have a "default instance" declaration that applies if +something more specific does not: - g :: (?x::Int) => Bool -> Bool -> Int + instance C a where + op = ... -- Default + +Undecidable instances - - - -Implicit parameters - - Implicit parameters are implemented as described in -"Implicit parameters: dynamic scoping with static types", -J Lewis, MB Shields, E Meijer, J Launchbury, -27th ACM Symposium on Principles of Programming Languages (POPL'00), -Boston, Jan 2000. - - -(Most of the following, stil rather incomplete, documentation is -due to Jeff Lewis.) - -Implicit parameter support is enabled with the option -. - - -A variable is called dynamically bound when it is bound by the calling -context of a function and statically bound when bound by the callee's -context. In Haskell, all variables are statically bound. Dynamic -binding of variables is a notion that goes back to Lisp, but was later -discarded in more modern incarnations, such as Scheme. Dynamic binding -can be very confusing in an untyped language, and unfortunately, typed -languages, in particular Hindley-Milner typed languages like Haskell, -only support static scoping of variables. - -However, by a simple extension to the type class system of Haskell, we -can support dynamic binding. Basically, we express the use of a -dynamically bound variable as a constraint on the type. These -constraints lead to types of the form (?x::t') => t, which says "this -function uses a dynamically-bound variable ?x -of type t'". For -example, the following expresses the type of a sort function, -implicitly parameterized by a comparison function named cmp. +Sometimes even the rules of are too onerous. +For example, sometimes you might want to use the following to get the +effect of a "class synonym": - sort :: (?cmp :: a -> a -> Bool) => [a] -> [a] + class (C1 a, C2 a, C3 a) => C a where { } + + instance (C1 a, C2 a, C3 a) => C a where { } -The dynamic binding constraints are just a new form of predicate in the type class system. - - -An implicit parameter occurs in an expression using the special form ?x, -where x is -any valid identifier (e.g. ord ?x is a valid expression). -Use of this construct also introduces a new -dynamic-binding constraint in the type of the expression. -For example, the following definition -shows how we can define an implicitly parameterized sort function in -terms of an explicitly parameterized sortBy function: +This allows you to write shorter signatures: + + f :: C a => ... + +instead of - sortBy :: (a -> a -> Bool) -> [a] -> [a] + f :: (C1 a, C2 a, C3 a) => ... + +The restrictions on functional dependencies () are particularly troublesome. +It is tempting to introduce type variables in the context that do not appear in +the head, something that is excluded by the normal rules. For example: + + class HasConverter a b | a -> b where + convert :: a -> b + + data Foo a = MkFoo a - sort :: (?cmp :: a -> a -> Bool) => [a] -> [a] - sort = sortBy ?cmp + instance (HasConverter a b,Show b) => Show (Foo a) where + show (MkFoo value) = show (convert value) - +This is dangerous territory, however. Here, for example, is a program that would make the +typechecker loop: + + class D a + class F a b | a->b + instance F [a] [[a]] + instance (D c, F a c) => D [a] -- 'c' is not mentioned in the head + +Similarly, it can be tempting to lift the coverage condition: + + class Mul a b c | a b -> c where + (.*.) :: a -> b -> c - -Implicit-parameter type constraints - -Dynamic binding constraints behave just like other type class -constraints in that they are automatically propagated. Thus, when a -function is used, its implicit parameters are inherited by the -function that called it. For example, our sort function might be used -to pick out the least value in a list: + instance Mul Int Int Int where (.*.) = (*) + instance Mul Int Float Float where x .*. y = fromIntegral x * y + instance Mul a b c => Mul a [b] [c] where x .*. v = map (x.*.) v + +The third instance declaration does not obey the coverage condition; +and indeed the (somewhat strange) definition: - least :: (?cmp :: a -> a -> Bool) => [a] -> a - least xs = head (sort xs) + f = \ b x y -> if b then x .*. [y] else y -Without lifting a finger, the ?cmp parameter is -propagated to become a parameter of least as well. With explicit -parameters, the default is that parameters must always be explicit -propagated. With implicit parameters, the default is to always -propagate them. +makes instance inference go into a loop, because it requires the constraint +(Mul a [b] b). -An implicit-parameter type constraint differs from other type class constraints in the -following way: All uses of a particular implicit parameter must have -the same type. This means that the type of (?x, ?x) -is (?x::a) => (a,a), and not -(?x::a, ?x::b) => (a, b), as would be the case for type -class constraints. +Nevertheless, GHC allows you to experiment with more liberal rules. If you use +the experimental flag +-XUndecidableInstances, +both the Paterson Conditions and the Coverage Condition +(described in ) are lifted. Termination is ensured by having a +fixed-depth recursion stack. If you exceed the stack depth you get a +sort of backtrace, and the opportunity to increase the stack depth +with N. - You can't have an implicit parameter in the context of a class or instance -declaration. For example, both these declarations are illegal: + + + + +Overlapping instances + +In general, GHC requires that that it be unambiguous which instance +declaration +should be used to resolve a type-class constraint. This behaviour +can be modified by two flags: +-XOverlappingInstances + +and +-XIncoherentInstances +, as this section discusses. Both these +flags are dynamic flags, and can be set on a per-module basis, using +an OPTIONS_GHC pragma if desired (). + +When GHC tries to resolve, say, the constraint C Int Bool, +it tries to match every instance declaration against the +constraint, +by instantiating the head of the instance declaration. For example, consider +these declarations: - class (?x::Int) => C a where ... - instance (?x::a) => Foo [a] where ... + instance context1 => C Int a where ... -- (A) + instance context2 => C a Bool where ... -- (B) + instance context3 => C Int [a] where ... -- (C) + instance context4 => C Int [Int] where ... -- (D) -Reason: exactly which implicit parameter you pick up depends on exactly where -you invoke a function. But the ``invocation'' of instance declarations is done -behind the scenes by the compiler, so it's hard to figure out exactly where it is done. -Easiest thing is to outlaw the offending types. +The instances (A) and (B) match the constraint C Int Bool, +but (C) and (D) do not. When matching, GHC takes +no account of the context of the instance declaration +(context1 etc). +GHC's default behaviour is that exactly one instance must match the +constraint it is trying to resolve. +It is fine for there to be a potential of overlap (by +including both declarations (A) and (B), say); an error is only reported if a +particular constraint matches more than one. + + -Implicit-parameter constraints do not cause ambiguity. For example, consider: +The flag instructs GHC to allow +more than one instance to match, provided there is a most specific one. For +example, the constraint C Int [Int] matches instances (A), +(C) and (D), but the last is more specific, and hence is chosen. If there is no +most-specific match, the program is rejected. + + +However, GHC is conservative about committing to an overlapping instance. For example: - f :: (?x :: [a]) => Int -> Int - f n = n + length ?x - - g :: (Read a, Show a) => String -> String - g s = show (read s) + f :: [b] -> [b] + f x = ... -Here, g has an ambiguous type, and is rejected, but f -is fine. The binding for ?x at f's call site is -quite unambiguous, and fixes the type a. +Suppose that from the RHS of f we get the constraint +C Int [b]. But +GHC does not commit to instance (C), because in a particular +call of f, b might be instantiate +to Int, in which case instance (D) would be more specific still. +So GHC rejects the program. +(If you add the flag , +GHC will instead pick (C), without complaining about +the problem of subsequent instantiations.) - - - -Implicit-parameter bindings - -An implicit parameter is bound using the standard -let or where binding forms. -For example, we define the min function by binding -cmp. +Notice that we gave a type signature to f, so GHC had to +check that f has the specified type. +Suppose instead we do not give a type signature, asking GHC to infer +it instead. In this case, GHC will refrain from +simplifying the constraint C Int [Int] (for the same reason +as before) but, rather than rejecting the program, it will infer the type - min :: [a] -> a - min = let ?cmp = (<=) in least + f :: C Int b => [b] -> [b] +That postpones the question of which instance to pick to the +call site for f +by which time more is known about the type b. -A group of implicit-parameter bindings may occur anywhere a normal group of Haskell -bindings can occur, except at top level. That is, they can occur in a let -(including in a list comprehension, or do-notation, or pattern guards), -or a where clause. -Note the following points: +The willingness to be overlapped or incoherent is a property of +the instance declaration itself, controlled by the +presence or otherwise of the +and flags when that mdodule is +being defined. Neither flag is required in a module that imports and uses the +instance declaration. Specifically, during the lookup process: -An implicit-parameter binding group must be a -collection of simple bindings to implicit-style variables (no -function-style bindings, and no type signatures); these bindings are -neither polymorphic or recursive. - - -You may not mix implicit-parameter bindings with ordinary bindings in a -single let -expression; use two nested lets instead. -(In the case of where you are stuck, since you can't nest where clauses.) +An instance declaration is ignored during the lookup process if (a) a more specific +match is found, and (b) the instance declaration was compiled with +. The flag setting for the +more-specific instance does not matter. - -You may put multiple implicit-parameter bindings in a -single binding group; but they are not treated -as a mutually recursive group (as ordinary let bindings are). -Instead they are treated as a non-recursive group, simultaneously binding all the implicit -parameter. The bindings are not nested, and may be re-ordered without changing -the meaning of the program. -For example, consider: - - f t = let { ?x = t; ?y = ?x+(1::Int) } in ?x + ?y - -The use of ?x in the binding for ?y does not "see" -the binding for ?x, so the type of f is - - f :: (?x::Int) => Int -> Int - +Suppose an instance declaration does not match the constraint being looked up, but +does unify with it, so that it might match when the constraint is further +instantiated. Usually GHC will regard this as a reason for not committing to +some other constraint. But if the instance declaration was compiled with +, GHC will skip the "does-it-unify?" +check for that declaration. +These rules make it possible for a library author to design a library that relies on +overlapping instances without the library client having to know. + + +If an instance declaration is compiled without +, +then that instance can never be overlapped. This could perhaps be +inconvenient. Perhaps the rule should instead say that the +overlapping instance declaration should be compiled in +this way, rather than the overlapped one. Perhaps overlap +at a usage site should be permitted regardless of how the instance declarations +are compiled, if the flag is +used at the usage site. (Mind you, the exact usage site can occasionally be +hard to pin down.) We are interested to receive feedback on these points. + +The flag implies the + flag, but not vice versa. - -Implicit parameters and polymorphic recursion + +Type synonyms in the instance head -Consider these two definitions: +Unlike Haskell 98, instance heads may use type +synonyms. (The instance "head" is the bit after the "=>" in an instance decl.) +As always, using a type synonym is just shorthand for +writing the RHS of the type synonym definition. For example: + + - len1 :: [a] -> Int - len1 xs = let ?acc = 0 in len_acc1 xs + type Point = (Int,Int) + instance C Point where ... + instance C [Point] where ... + - len_acc1 [] = ?acc - len_acc1 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc1 xs - ------------ +is legal. However, if you added - len2 :: [a] -> Int - len2 xs = let ?acc = 0 in len_acc2 xs - len_acc2 :: (?acc :: Int) => [a] -> Int - len_acc2 [] = ?acc - len_acc2 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc2 xs - -The only difference between the two groups is that in the second group -len_acc is given a type signature. -In the former case, len_acc1 is monomorphic in its own -right-hand side, so the implicit parameter ?acc is not -passed to the recursive call. In the latter case, because len_acc2 -has a type signature, the recursive call is made to the -polymoprhic version, which takes ?acc -as an implicit parameter. So we get the following results in GHCi: - Prog> len1 "hello" - 0 - Prog> len2 "hello" - 5 + instance C (Int,Int) where ... -Adding a type signature dramatically changes the result! This is a rather -counter-intuitive phenomenon, worth watching out for. - - -Implicit parameters and monomorphism -GHC applies the dreaded Monomorphism Restriction (section 4.5.5 of the -Haskell Report) to implicit parameters. For example, consider: +as well, then the compiler will complain about the overlapping +(actually, identical) instance declarations. As always, type synonyms +must be fully applied. You cannot, for example, write: + + - f :: Int -> Int - f v = let ?x = 0 in - let y = ?x + v in - let ?x = 5 in - y + type P a = [[a]] + instance Monad P where ... -Since the binding for y falls under the Monomorphism -Restriction it is not generalised, so the type of y is -simply Int, not (?x::Int) => Int. -Hence, (f 9) returns result 9. -If you add a type signature for y, then y -will get type (?x::Int) => Int, so the occurrence of -y in the body of the let will see the -inner binding of ?x, so (f 9) will return -14. + + +This design decision is independent of all the others, and easily +reversed, but it makes sense to me. + - - + + - -Explicitly-kinded quantification - -Haskell infers the kind of each type variable. Sometimes it is nice to be able -to give the kind explicitly as (machine-checked) documentation, -just as it is nice to give a type signature for a function. On some occasions, -it is essential to do so. For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999) -John Hughes had to define the data type: - - data Set cxt a = Set [a] - | Unused (cxt a -> ()) - -The only use for the Unused constructor was to force the correct -kind for the type variable cxt. - - -GHC now instead allows you to specify the kind of a type variable directly, wherever -a type variable is explicitly bound. Namely: - -data declarations: - - data Set (cxt :: * -> *) a = Set [a] - -type declarations: - - type T (f :: * -> *) = f Int - -class declarations: - - class (Eq a) => C (f :: * -> *) a where ... - -forall's in type signatures: - - f :: forall (cxt :: * -> *). Set cxt Int - - - - -The parentheses are required. Some of the spaces are required too, to -separate the lexemes. If you write (f::*->*) you -will get a parse error, because "::*->*" is a -single lexeme in Haskell. - + - -As part of the same extension, you can put kind annotations in types -as well. Thus: - - f :: (Int :: *) -> Int - g :: forall a. a -> (a :: *) - -The syntax is - - atype ::= '(' ctype '::' kind ') - -The parentheses are required. + +Implicit parameters + + Implicit parameters are implemented as described in +"Implicit parameters: dynamic scoping with static types", +J Lewis, MB Shields, E Meijer, J Launchbury, +27th ACM Symposium on Principles of Programming Languages (POPL'00), +Boston, Jan 2000. - +(Most of the following, stil rather incomplete, documentation is +due to Jeff Lewis.) - -Arbitrary-rank polymorphism - +Implicit parameter support is enabled with the option +. -Haskell type signatures are implicitly quantified. The new keyword forall -allows us to say exactly what this means. For example: +A variable is called dynamically bound when it is bound by the calling +context of a function and statically bound when bound by the callee's +context. In Haskell, all variables are statically bound. Dynamic +binding of variables is a notion that goes back to Lisp, but was later +discarded in more modern incarnations, such as Scheme. Dynamic binding +can be very confusing in an untyped language, and unfortunately, typed +languages, in particular Hindley-Milner typed languages like Haskell, +only support static scoping of variables. +However, by a simple extension to the type class system of Haskell, we +can support dynamic binding. Basically, we express the use of a +dynamically bound variable as a constraint on the type. These +constraints lead to types of the form (?x::t') => t, which says "this +function uses a dynamically-bound variable ?x +of type t'". For +example, the following expresses the type of a sort function, +implicitly parameterized by a comparison function named cmp. - g :: b -> b - -means this: - - g :: forall b. (b -> b) + sort :: (?cmp :: a -> a -> Bool) => [a] -> [a] -The two are treated identically. +The dynamic binding constraints are just a new form of predicate in the type class system. - -However, GHC's type system supports arbitrary-rank -explicit universal quantification in -types. -For example, all the following types are legal: +An implicit parameter occurs in an expression using the special form ?x, +where x is +any valid identifier (e.g. ord ?x is a valid expression). +Use of this construct also introduces a new +dynamic-binding constraint in the type of the expression. +For example, the following definition +shows how we can define an implicitly parameterized sort function in +terms of an explicitly parameterized sortBy function: - f1 :: forall a b. a -> b -> a - g1 :: forall a b. (Ord a, Eq b) => a -> b -> a - - f2 :: (forall a. a->a) -> Int -> Int - g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int + sortBy :: (a -> a -> Bool) -> [a] -> [a] - f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool + sort :: (?cmp :: a -> a -> Bool) => [a] -> [a] + sort = sortBy ?cmp -Here, f1 and g1 are rank-1 types, and -can be written in standard Haskell (e.g. f1 :: a->b->a). -The forall makes explicit the universal quantification that -is implicitly added by Haskell. - - -The functions f2 and g2 have rank-2 types; -the forall is on the left of a function arrow. As g2 -shows, the polymorphic type on the left of the function arrow can be overloaded. - - -The function f3 has a rank-3 type; -it has rank-2 types on the left of a function arrow. + + +Implicit-parameter type constraints -GHC allows types of arbitrary rank; you can nest foralls -arbitrarily deep in function arrows. (GHC used to be restricted to rank 2, but -that restriction has now been lifted.) -In particular, a forall-type (also called a "type scheme"), -including an operational type class context, is legal: - - On the left of a function arrow - On the right of a function arrow (see ) - As the argument of a constructor, or type of a field, in a data type declaration. For -example, any of the f1,f2,f3,g1,g2 above would be valid -field type signatures. - As the type of an implicit parameter - In a pattern type signature (see ) - -There is one place you cannot put a forall: -you cannot instantiate a type variable with a forall-type. So you cannot -make a forall-type the argument of a type constructor. So these types are illegal: +Dynamic binding constraints behave just like other type class +constraints in that they are automatically propagated. Thus, when a +function is used, its implicit parameters are inherited by the +function that called it. For example, our sort function might be used +to pick out the least value in a list: - x1 :: [forall a. a->a] - x2 :: (forall a. a->a, Int) - x3 :: Maybe (forall a. a->a) + least :: (?cmp :: a -> a -> Bool) => [a] -> a + least xs = head (sort xs) -Of course forall becomes a keyword; you can't use forall as -a type variable any more! +Without lifting a finger, the ?cmp parameter is +propagated to become a parameter of least as well. With explicit +parameters, the default is that parameters must always be explicit +propagated. With implicit parameters, the default is to always +propagate them. - - - -Examples - - -In a data or newtype declaration one can quantify -the types of the constructor arguments. Here are several examples: +An implicit-parameter type constraint differs from other type class constraints in the +following way: All uses of a particular implicit parameter must have +the same type. This means that the type of (?x, ?x) +is (?x::a) => (a,a), and not +(?x::a, ?x::b) => (a, b), as would be the case for type +class constraints. + You can't have an implicit parameter in the context of a class or instance +declaration. For example, both these declarations are illegal: + + class (?x::Int) => C a where ... + instance (?x::a) => Foo [a] where ... + +Reason: exactly which implicit parameter you pick up depends on exactly where +you invoke a function. But the ``invocation'' of instance declarations is done +behind the scenes by the compiler, so it's hard to figure out exactly where it is done. +Easiest thing is to outlaw the offending types. - +Implicit-parameter constraints do not cause ambiguity. For example, consider: -data T a = T1 (forall b. b -> b -> b) a - -data MonadT m = MkMonad { return :: forall a. a -> m a, - bind :: forall a b. m a -> (a -> m b) -> m b - } + f :: (?x :: [a]) => Int -> Int + f n = n + length ?x -newtype Swizzle = MkSwizzle (Ord a => [a] -> [a]) + g :: (Read a, Show a) => String -> String + g s = show (read s) - +Here, g has an ambiguous type, and is rejected, but f +is fine. The binding for ?x at f's call site is +quite unambiguous, and fixes the type a. + - -The constructors have rank-2 types: - + +Implicit-parameter bindings - +An implicit parameter is bound using the standard +let or where binding forms. +For example, we define the min function by binding +cmp. -T1 :: forall a. (forall b. b -> b -> b) -> a -> T a -MkMonad :: forall m. (forall a. a -> m a) - -> (forall a b. m a -> (a -> m b) -> m b) - -> MonadT m -MkSwizzle :: (Ord a => [a] -> [a]) -> Swizzle + min :: [a] -> a + min = let ?cmp = (<=) in least - - - - -Notice that you don't need to use a forall if there's an -explicit context. For example in the first argument of the -constructor MkSwizzle, an implicit "forall a." is -prefixed to the argument type. The implicit forall -quantifies all type variables that are not already in scope, and are -mentioned in the type quantified over. - -As for type signatures, implicit quantification happens for non-overloaded -types too. So if you write this: +A group of implicit-parameter bindings may occur anywhere a normal group of Haskell +bindings can occur, except at top level. That is, they can occur in a let +(including in a list comprehension, or do-notation, or pattern guards), +or a where clause. +Note the following points: + + +An implicit-parameter binding group must be a +collection of simple bindings to implicit-style variables (no +function-style bindings, and no type signatures); these bindings are +neither polymorphic or recursive. + + +You may not mix implicit-parameter bindings with ordinary bindings in a +single let +expression; use two nested lets instead. +(In the case of where you are stuck, since you can't nest where clauses.) + + +You may put multiple implicit-parameter bindings in a +single binding group; but they are not treated +as a mutually recursive group (as ordinary let bindings are). +Instead they are treated as a non-recursive group, simultaneously binding all the implicit +parameter. The bindings are not nested, and may be re-ordered without changing +the meaning of the program. +For example, consider: - data T a = MkT (Either a b) (b -> b) + f t = let { ?x = t; ?y = ?x+(1::Int) } in ?x + ?y - -it's just as if you had written this: - +The use of ?x in the binding for ?y does not "see" +the binding for ?x, so the type of f is - data T a = MkT (forall b. Either a b) (forall b. b -> b) + f :: (?x::Int) => Int -> Int - -That is, since the type variable b isn't in scope, it's -implicitly universally quantified. (Arguably, it would be better -to require explicit quantification on constructor arguments -where that is what is wanted. Feedback welcomed.) + + - -You construct values of types T1, MonadT, Swizzle by applying -the constructor to suitable values, just as usual. For example, - + - +Implicit parameters and polymorphic recursion + +Consider these two definitions: - a1 :: T Int - a1 = T1 (\xy->x) 3 - - a2, a3 :: Swizzle - a2 = MkSwizzle sort - a3 = MkSwizzle reverse - - a4 :: MonadT Maybe - a4 = let r x = Just x - b m k = case m of - Just y -> k y - Nothing -> Nothing - in - MkMonad r b + len1 :: [a] -> Int + len1 xs = let ?acc = 0 in len_acc1 xs - mkTs :: (forall b. b -> b -> b) -> a -> [T a] - mkTs f x y = [T1 f x, T1 f y] - + len_acc1 [] = ?acc + len_acc1 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc1 xs - + ------------ - -The type of the argument can, as usual, be more general than the type -required, as (MkSwizzle reverse) shows. (reverse -does not need the Ord constraint.) - + len2 :: [a] -> Int + len2 xs = let ?acc = 0 in len_acc2 xs - -When you use pattern matching, the bound variables may now have -polymorphic types. For example: + len_acc2 :: (?acc :: Int) => [a] -> Int + len_acc2 [] = ?acc + len_acc2 (x:xs) = let ?acc = ?acc + (1::Int) in len_acc2 xs + +The only difference between the two groups is that in the second group +len_acc is given a type signature. +In the former case, len_acc1 is monomorphic in its own +right-hand side, so the implicit parameter ?acc is not +passed to the recursive call. In the latter case, because len_acc2 +has a type signature, the recursive call is made to the +polymoprhic version, which takes ?acc +as an implicit parameter. So we get the following results in GHCi: + + Prog> len1 "hello" + 0 + Prog> len2 "hello" + 5 + +Adding a type signature dramatically changes the result! This is a rather +counter-intuitive phenomenon, worth watching out for. + - +Implicit parameters and monomorphism +GHC applies the dreaded Monomorphism Restriction (section 4.5.5 of the +Haskell Report) to implicit parameters. For example, consider: - f :: T a -> a -> (a, Char) - f (T1 w k) x = (w k x, w 'c' 'd') - - g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b] - g (MkSwizzle s) xs f = s (map f (s xs)) - - h :: MonadT m -> [m a] -> m [a] - h m [] = return m [] - h m (x:xs) = bind m x $ \y -> - bind m (h m xs) $ \ys -> - return m (y:ys) + f :: Int -> Int + f v = let ?x = 0 in + let y = ?x + v in + let ?x = 5 in + y - - - - -In the function h we use the record selectors return -and bind to extract the polymorphic bind and return functions -from the MonadT data structure, rather than using pattern -matching. +Since the binding for y falls under the Monomorphism +Restriction it is not generalised, so the type of y is +simply Int, not (?x::Int) => Int. +Hence, (f 9) returns result 9. +If you add a type signature for y, then y +will get type (?x::Int) => Int, so the occurrence of +y in the body of the let will see the +inner binding of ?x, so (f 9) will return +14. + - -Type inference + + + +Explicitly-kinded quantification + + +Haskell infers the kind of each type variable. Sometimes it is nice to be able +to give the kind explicitly as (machine-checked) documentation, +just as it is nice to give a type signature for a function. On some occasions, +it is essential to do so. For example, in his paper "Restricted Data Types in Haskell" (Haskell Workshop 1999) +John Hughes had to define the data type: + + data Set cxt a = Set [a] + | Unused (cxt a -> ()) + +The only use for the Unused constructor was to force the correct +kind for the type variable cxt. -A lexically scoped type variable can be bound by: +GHC now instead allows you to specify the kind of a type variable directly, wherever +a type variable is explicitly bound, with the flag . + + +This flag enables kind signatures in the following places: -A declaration type signature () -A pattern type signature () -Class and instance declarations () +data declarations: + + data Set (cxt :: * -> *) a = Set [a] + +type declarations: + + type T (f :: * -> *) = f Int + +class declarations: + + class (Eq a) => C (f :: * -> *) a where ... + +forall's in type signatures: + + f :: forall (cxt :: * -> *). Set cxt Int + + -In Haskell, a programmer-written type signature is implicitly quantifed over -its free type variables (Section -4.1.2 -of the Haskel Report). -Lexically scoped type variables affect this implicit quantification rules -as follows: any type variable that is in scope is not universally -quantified. For example, if type variable a is in scope, -then - - (e :: a -> a) means (e :: a -> a) - (e :: b -> b) means (e :: forall b. b->b) - (e :: a -> b) means (e :: forall b. a->b) - +The parentheses are required. Some of the spaces are required too, to +separate the lexemes. If you write (f::*->*) you +will get a parse error, because "::*->*" is a +single lexeme in Haskell. + +As part of the same extension, you can put kind annotations in types +as well. Thus: + + f :: (Int :: *) -> Int + g :: forall a. a -> (a :: *) + +The syntax is + + atype ::= '(' ctype '::' kind ') + +The parentheses are required. + + - + +Arbitrary-rank polymorphism + - -Declaration type signatures -A declaration type signature that has explicit -quantification (using forall) brings into scope the -explicitly-quantified -type variables, in the definition of the named function(s). For example: - - f :: forall a. [a] -> [a] - f (x:xs) = xs ++ [ x :: a ] - -The "forall a" brings "a" into scope in -the definition of "f". + +Haskell type signatures are implicitly quantified. The new keyword forall +allows us to say exactly what this means. For example: -This only happens if the quantification in f's type -signature is explicit. For example: + - g :: [a] -> [a] - g (x:xs) = xs ++ [ x :: a ] + g :: b -> b -This program will be rejected, because "a" does not scope -over the definition of "f", so "x::a" -means "x::forall a. a" by Haskell's usual implicit -quantification rules. - - - - -Pattern type signatures - -A type signature may occur in any pattern; this is a pattern type -signature. -For example: +means this: - -- f and g assume that 'a' is already in scope - f = \(x::Int, y) -> x - g (x::a) = x - h ((x,y) :: (Int,Bool)) = (y,x) + g :: forall b. (b -> b) -In the case where all the type variables in the pattern type sigature are -already in scope (i.e. bound by the enclosing context), matters are simple: the -signature simply constrains the type of the pattern in the obvious way. +The two are treated identically. + -There is only one situation in which you can write a pattern type signature that -mentions a type variable that is not already in scope, namely in pattern match -of an existential data constructor. For example: +However, GHC's type system supports arbitrary-rank +explicit universal quantification in +types. +For example, all the following types are legal: - data T = forall a. MkT [a] + f1 :: forall a b. a -> b -> a + g1 :: forall a b. (Ord a, Eq b) => a -> b -> a - k :: T -> T - k (MkT [t::a]) = MkT t3 - where - t3::[a] = [t,t,t] + f2 :: (forall a. a->a) -> Int -> Int + g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int + + f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool + + f4 :: Int -> (forall a. a -> a) -Here, the pattern type signature (t::a) mentions a lexical type -variable that is not already in scope. Indeed, it cannot already be in scope, -because it is bound by the pattern match. GHC's rule is that in this situation -(and only then), a pattern type signature can mention a type variable that is -not already in scope; the effect is to bring it into scope, standing for the -existentially-bound type variable. +Here, f1 and g1 are rank-1 types, and +can be written in standard Haskell (e.g. f1 :: a->b->a). +The forall makes explicit the universal quantification that +is implicitly added by Haskell. -If this seems a little odd, we think so too. But we must have -some way to bring such type variables into scope, else we -could not name existentially-bound type variables in subequent type signatures. +The functions f2 and g2 have rank-2 types; +the forall is on the left of a function arrow. As g2 +shows, the polymorphic type on the left of the function arrow can be overloaded. -This is (now) the only situation in which a pattern type -signature is allowed to mention a lexical variable that is not already in -scope. -For example, both f and g would be -illegal if a was not already in scope. +The function f3 has a rank-3 type; +it has rank-2 types on the left of a function arrow. + + +GHC allows types of arbitrary rank; you can nest foralls +arbitrarily deep in function arrows. (GHC used to be restricted to rank 2, but +that restriction has now been lifted.) +In particular, a forall-type (also called a "type scheme"), +including an operational type class context, is legal: + + On the left or right (see f4, for example) +of a function arrow + As the argument of a constructor, or type of a field, in a data type declaration. For +example, any of the f1,f2,f3,g1,g2 above would be valid +field type signatures. + As the type of an implicit parameter + In a pattern type signature (see ) + +Of course forall becomes a keyword; you can't use forall as +a type variable any more! - - - + +Notice that you don't need to use a forall if there's an +explicit context. For example in the first argument of the +constructor MkSwizzle, an implicit "forall a." is +prefixed to the argument type. The implicit forall +quantifies all type variables that are not already in scope, and are +mentioned in the type quantified over. + - -Class and instance declarations +As for type signatures, implicit quantification happens for non-overloaded +types too. So if you write this: -The type variables in the head of a class or instance declaration -scope over the methods defined in the where part. For example: + + data T a = MkT (Either a b) (b -> b) + +it's just as if you had written this: - class C a where - op :: [a] -> a - - op xs = let ys::[a] - ys = reverse xs - in - head ys + data T a = MkT (forall b. Either a b) (forall b. b -> b) - - - - - -Deriving clause for classes <literal>Typeable</literal> and <literal>Data</literal> +That is, since the type variable b isn't in scope, it's +implicitly universally quantified. (Arguably, it would be better +to require explicit quantification on constructor arguments +where that is what is wanted. Feedback welcomed.) + -Haskell 98 allows the programmer to add "deriving( Eq, Ord )" to a data type -declaration, to generate a standard instance declaration for classes specified in the deriving clause. -In Haskell 98, the only classes that may appear in the deriving clause are the standard -classes Eq, Ord, -Enum, Ix, Bounded, Read, and Show. +You construct values of types T1, MonadT, Swizzle by applying +the constructor to suitable values, just as usual. For example, + -GHC extends this list with two more classes that may be automatically derived -(provided the flag is specified): -Typeable, and Data. These classes are defined in the library -modules Data.Typeable and Data.Generics respectively, and the -appropriate class must be in scope before it can be mentioned in the deriving clause. + + + a1 :: T Int + a1 = T1 (\xy->x) 3 + + a2, a3 :: Swizzle + a2 = MkSwizzle sort + a3 = MkSwizzle reverse + + a4 :: MonadT Maybe + a4 = let r x = Just x + b m k = case m of + Just y -> k y + Nothing -> Nothing + in + MkMonad r b + + mkTs :: (forall b. b -> b -> b) -> a -> [T a] + mkTs f x y = [T1 f x, T1 f y] + + -An instance of Typeable can only be derived if the -data type has seven or fewer type parameters, all of kind *. -The reason for this is that the Typeable class is derived using the scheme -described in - -Scrap More Boilerplate: Reflection, Zips, and Generalised Casts -. -(Section 7.4 of the paper describes the multiple Typeable classes that -are used, and only Typeable1 up to -Typeable7 are provided in the library.) -In other cases, there is nothing to stop the programmer writing a TypableX -class, whose kind suits that of the data type constructor, and -then writing the data type instance by hand. + + +The type of the argument can, as usual, be more general than the type +required, as (MkSwizzle reverse) shows. (reverse +does not need the Ord constraint.) - - -Generalised derived instances for newtypes + +When you use pattern matching, the bound variables may now have +polymorphic types. For example: + -When you define an abstract type using newtype, you may want -the new type to inherit some instances from its representation. In -Haskell 98, you can inherit instances of Eq, Ord, -Enum and Bounded by deriving them, but for any -other classes you have to write an explicit instance declaration. For -example, if you define - - newtype Dollars = Dollars Int - + + f :: T a -> a -> (a, Char) + f (T1 w k) x = (w k x, w 'c' 'd') -and you want to use arithmetic on Dollars, you have to -explicitly define an instance of Num: + g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b] + g (MkSwizzle s) xs f = s (map f (s xs)) - - instance Num Dollars where - Dollars a + Dollars b = Dollars (a+b) - ... + h :: MonadT m -> [m a] -> m [a] + h m [] = return m [] + h m (x:xs) = bind m x $ \y -> + bind m (h m xs) $ \ys -> + return m (y:ys) -All the instance does is apply and remove the newtype -constructor. It is particularly galling that, since the constructor -doesn't appear at run-time, this instance declaration defines a -dictionary which is wholly equivalent to the Int -dictionary, only slower! - + - Generalising the deriving clause -GHC now permits such instances to be derived instead, so one can write - - newtype Dollars = Dollars Int deriving (Eq,Show,Num) - - -and the implementation uses the same Num dictionary -for Dollars as for Int. Notionally, the compiler -derives an instance declaration of the form +In the function h we use the record selectors return +and bind to extract the polymorphic bind and return functions +from the MonadT data structure, rather than using pattern +matching. + + - - instance Num Int => Num Dollars - + +Type inference -which just adds or removes the newtype constructor according to the type. + +In general, type inference for arbitrary-rank types is undecidable. +GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96) +to get a decidable algorithm by requiring some help from the programmer. +We do not yet have a formal specification of "some help" but the rule is this: + + +For a lambda-bound or case-bound variable, x, either the programmer +provides an explicit polymorphic type for x, or GHC's type inference will assume +that x's type has no foralls in it. +What does it mean to "provide" an explicit type for x? You can do that by +giving a type signature for x directly, using a pattern type signature +(), thus: + + \ f :: (forall a. a->a) -> (f True, f 'c') + +Alternatively, you can give a type signature to the enclosing +context, which GHC can "push down" to find the type for the variable: + + (\ f -> (f True, f 'c')) :: (forall a. a->a) -> (Bool,Char) + +Here the type signature on the expression can be pushed inwards +to give a type signature for f. Similarly, and more commonly, +one can give a type signature for the function itself: + + h :: (forall a. a->a) -> (Bool,Char) + h f = (f True, f 'c') + +You don't need to give a type signature if the lambda bound variable +is a constructor argument. Here is an example we saw earlier: + + f :: T a -> a -> (a, Char) + f (T1 w k) x = (w k x, w 'c' 'd') + +Here we do not need to give a type signature to w, because +it is an argument of constructor T1 and that tells GHC all +it needs to know. + -We can also derive instances of constructor classes in a similar -way. For example, suppose we have implemented state and failure monad -transformers, such that + - - instance Monad m => Monad (State s m) - instance Monad m => Monad (Failure m) - -In Haskell 98, we can define a parsing monad by - - type Parser tok m a = State [tok] (Failure m) a - -which is automatically a monad thanks to the instance declarations -above. With the extension, we can make the parser type abstract, -without needing to write an instance of class Monad, via + +Implicit quantification - - newtype Parser tok m a = Parser (State [tok] (Failure m) a) - deriving Monad - -In this case the derived instance declaration is of the form - - instance Monad (State [tok] (Failure m)) => Monad (Parser tok m) - + +GHC performs implicit quantification as follows. At the top level (only) of +user-written types, if and only if there is no explicit forall, +GHC finds all the type variables mentioned in the type that are not already +in scope, and universally quantifies them. For example, the following pairs are +equivalent: + + f :: a -> a + f :: forall a. a -> a -Notice that, since Monad is a constructor class, the -instance is a partial application of the new type, not the -entire left hand side. We can imagine that the type declaration is -``eta-converted'' to generate the context of the instance -declaration. + g (x::a) = let + h :: a -> b -> b + h x y = y + in ... + g (x::a) = let + h :: forall b. a -> b -> b + h x y = y + in ... + +Notice that GHC does not find the innermost possible quantification +point. For example: + + f :: (a -> a) -> Int + -- MEANS + f :: forall a. (a -> a) -> Int + -- NOT + f :: (forall a. a -> a) -> Int -We can even derive instances of multi-parameter classes, provided the -newtype is the last class parameter. In this case, a ``partial -application'' of the class appears in the deriving -clause. For example, given the class - - class StateMonad s m | m -> s where ... - instance Monad m => StateMonad s (State s m) where ... - -then we can derive an instance of StateMonad for Parsers by - - newtype Parser tok m a = Parser (State [tok] (Failure m) a) - deriving (Monad, StateMonad [tok]) + g :: (Ord a => a -> a) -> Int + -- MEANS the illegal type + g :: forall a. (Ord a => a -> a) -> Int + -- NOT + g :: (forall a. Ord a => a -> a) -> Int +The latter produces an illegal type, which you might think is silly, +but at least the rule is simple. If you want the latter type, you +can write your for-alls explicitly. Indeed, doing so is strongly advised +for rank-2 types. + + + -The derived instance is obtained by completing the application of the -class to the new type: - - instance StateMonad [tok] (State [tok] (Failure m)) => - StateMonad [tok] (Parser tok m) + +Impredicative polymorphism + +GHC supports impredicative polymorphism. This means +that you can call a polymorphic function at a polymorphic type, and +parameterise data structures over polymorphic types. For example: + + f :: Maybe (forall a. [a] -> [a]) -> Maybe ([Int], [Char]) + f (Just g) = Just (g [3], g "hello") + f Nothing = Nothing +Notice here that the Maybe type is parameterised by the +polymorphic type (forall a. [a] -> +[a]). - - -As a result of this extension, all derived instances in newtype - declarations are treated uniformly (and implemented just by reusing -the dictionary for the representation type), except -Show and Read, which really behave differently for -the newtype and its representation. +The technical details of this extension are described in the paper +Boxy types: +type inference for higher-rank types and impredicativity, +which appeared at ICFP 2006. - - - A more precise specification - -Derived instance declarations are constructed as follows. Consider the -declaration (after expansion of any type synonyms) + - - newtype T v1...vn = T' (t vk+1...vn) deriving (c1...cm) - + +Lexically scoped type variables + -where - - - The type t is an arbitrary type - - - The vk+1...vn are type variables which do not occur in - t, and - - - The ci are partial applications of - classes of the form C t1'...tj', where the arity of C - is exactly j+1. That is, C lacks exactly one type argument. - - - None of the ci is Read, Show, - Typeable, or Data. These classes - should not "look through" the type or its constructor. You can still - derive these classes for a newtype, but it happens in the usual way, not - via this new mechanism. - - -Then, for each ci, the derived instance -declaration is: - - instance ci (t vk+1...v) => ci (T v1...vp) + +GHC supports lexically scoped type variables, without +which some type signatures are simply impossible to write. For example: + +f :: forall a. [a] -> [a] +f xs = ys ++ ys + where + ys :: [a] + ys = reverse xs -where p is chosen so that T v1...vp is of the -right kind for the last parameter of class Ci. +The type signature for f brings the type variable a into scope; it scopes over +the entire definition of f. +In particular, it is in scope at the type signature for ys. +In Haskell 98 it is not possible to declare +a type for ys; a major benefit of scoped type variables is that +it becomes possible to do so. - +Lexically-scoped type variables are enabled by +. + +Note: GHC 6.6 contains substantial changes to the way that scoped type +variables work, compared to earlier releases. Read this section +carefully! -As an example which does not work, consider - - newtype NonMonad m s = NonMonad (State s m s) deriving Monad - -Here we cannot derive the instance - - instance Monad (State s m) => Monad (NonMonad m) - + +Overview -because the type variable s occurs in State s m, -and so cannot be "eta-converted" away. It is a good thing that this -deriving clause is rejected, because NonMonad m is -not, in fact, a monad --- for the same reason. Try defining ->>= with the correct type: you won't be able to. +The design follows the following principles + +A scoped type variable stands for a type variable, and not for +a type. (This is a change from GHC's earlier +design.) +Furthermore, distinct lexical type variables stand for distinct +type variables. This means that every programmer-written type signature +(includin one that contains free scoped type variables) denotes a +rigid type; that is, the type is fully known to the type +checker, and no inference is involved. +Lexical type variables may be alpha-renamed freely, without +changing the program. + - -Notice also that the order of class parameters becomes -important, since we can only derive instances for the last one. If the -StateMonad class above were instead defined as - - - class StateMonad m s | m -> s where ... - - -then we would not have been able to derive an instance for the -Parser type above. We hypothesise that multi-parameter -classes usually have one "main" parameter for which deriving new -instances is most interesting. +A lexically scoped type variable can be bound by: + +A declaration type signature () +An expression type signature () +A pattern type signature () +Class and instance declarations () + -Lastly, all of this applies only for classes other than -Read, Show, Typeable, -and Data, for which the built-in derivation applies (section -4.3.3. of the Haskell Report). -(For the standard classes Eq, Ord, -Ix, and Bounded it is immaterial whether -the standard method is used or the one described here.) + +In Haskell, a programmer-written type signature is implicitly quantifed over +its free type variables (Section +4.1.2 +of the Haskel Report). +Lexically scoped type variables affect this implicit quantification rules +as follows: any type variable that is in scope is not universally +quantified. For example, if type variable a is in scope, +then + + (e :: a -> a) means (e :: a -> a) + (e :: b -> b) means (e :: forall b. b->b) + (e :: a -> b) means (e :: forall b. a->b) + - - - -Generalised typing of mutually recursive bindings + - -The Haskell Report specifies that a group of bindings (at top level, or in a -let or where) should be sorted into -strongly-connected components, and then type-checked in dependency order -(Haskell -Report, Section 4.5.1). -As each group is type-checked, any binders of the group that -have -an explicit type signature are put in the type environment with the specified -polymorphic type, -and all others are monomorphic until the group is generalised -(Haskell Report, Section 4.5.2). - -Following a suggestion of Mark Jones, in his paper -Typing Haskell in -Haskell, -GHC implements a more general scheme. If is -specified: -the dependency analysis ignores references to variables that have an explicit -type signature. -As a result of this refined dependency analysis, the dependency groups are smaller, and more bindings will -typecheck. For example, consider: + +Declaration type signatures +A declaration type signature that has explicit +quantification (using forall) brings into scope the +explicitly-quantified +type variables, in the definition of the named function(s). For example: - f :: Eq a => a -> Bool - f x = (x == x) || g True || g "Yes" - - g y = (y <= y) || f True + f :: forall a. [a] -> [a] + f (x:xs) = xs ++ [ x :: a ] -This is rejected by Haskell 98, but under Jones's scheme the definition for -g is typechecked first, separately from that for -f, -because the reference to f in g's right -hand side is ingored by the dependency analysis. Then g's -type is generalised, to get +The "forall a" brings "a" into scope in +the definition of "f". + +This only happens if the quantification in f's type +signature is explicit. For example: - g :: Ord a => a -> Bool + g :: [a] -> [a] + g (x:xs) = xs ++ [ x :: a ] -Now, the defintion for f is typechecked, with this type for -g in the type environment. +This program will be rejected, because "a" does not scope +over the definition of "f", so "x::a" +means "x::forall a. a" by Haskell's usual implicit +quantification rules. + - -The same refined dependency analysis also allows the type signatures of -mutually-recursive functions to have different contexts, something that is illegal in -Haskell 98 (Section 4.5.2, last sentence). With - -GHC only insists that the type signatures of a refined group have identical -type signatures; in practice this means that only variables bound by the same -pattern binding must have the same context. For example, this is fine: + +Expression type signatures + +An expression type signature that has explicit +quantification (using forall) brings into scope the +explicitly-quantified +type variables, in the annotated expression. For example: - f :: Eq a => a -> Bool - f x = (x == x) || g True - - g :: Ord a => a -> Bool - g y = (y <= y) || f True + f = runST ( (op >>= \(x :: STRef s Int) -> g x) :: forall s. ST s Bool ) +Here, the type signature forall a. ST s Bool brings the +type variable s into scope, in the annotated expression +(op >>= \(x :: STRef s Int) -> g x). - - - - - - - -Generalised Algebraic Data Types + -Generalised Algebraic Data Types (GADTs) generalise ordinary algebraic data types by allowing you -to give the type signatures of constructors explicitly. For example: + +Pattern type signatures + +A type signature may occur in any pattern; this is a pattern type +signature. +For example: - data Term a where - Lit :: Int -> Term Int - Succ :: Term Int -> Term Int - IsZero :: Term Int -> Term Bool - If :: Term Bool -> Term a -> Term a -> Term a - Pair :: Term a -> Term b -> Term (a,b) + -- f and g assume that 'a' is already in scope + f = \(x::Int, y::a) -> x + g (x::a) = x + h ((x,y) :: (Int,Bool)) = (y,x) -Notice that the return type of the constructors is not always Term a, as is the -case with ordinary vanilla data types. Now we can write a well-typed eval function -for these Terms: +In the case where all the type variables in the pattern type sigature are +already in scope (i.e. bound by the enclosing context), matters are simple: the +signature simply constrains the type of the pattern in the obvious way. + + +There is only one situation in which you can write a pattern type signature that +mentions a type variable that is not already in scope, namely in pattern match +of an existential data constructor. For example: - eval :: Term a -> a - eval (Lit i) = i - eval (Succ t) = 1 + eval t - eval (IsZero t) = eval t == 0 - eval (If b e1 e2) = if eval b then eval e1 else eval e2 - eval (Pair e1 e2) = (eval e1, eval e2) + data T = forall a. MkT [a] + + k :: T -> T + k (MkT [t::a]) = MkT t3 + where + t3::[a] = [t,t,t] -These and many other examples are given in papers by Hongwei Xi, and Tim Sheard. +Here, the pattern type signature (t::a) mentions a lexical type +variable that is not already in scope. Indeed, it cannot already be in scope, +because it is bound by the pattern match. GHC's rule is that in this situation +(and only then), a pattern type signature can mention a type variable that is +not already in scope; the effect is to bring it into scope, standing for the +existentially-bound type variable. -The rest of this section outlines the extensions to GHC that support GADTs. -It is far from comprehensive, but the design closely follows that described in -the paper Simple -unification-based type inference for GADTs, -which appeared in ICFP 2006. - - - Data type declarations have a 'where' form, as exemplified above. The type signature of -each constructor is independent, and is implicitly universally quantified as usual. Unlike a normal -Haskell data type declaration, the type variable(s) in the "data Term a where" header -have no scope. Indeed, one can write a kind signature instead: +If this seems a little odd, we think so too. But we must have +some way to bring such type variables into scope, else we +could not name existentially-bound type variables in subequent type signatures. + + +This is (now) the only situation in which a pattern type +signature is allowed to mention a lexical variable that is not already in +scope. +For example, both f and g would be +illegal if a was not already in scope. + + + + + + - -You can use record syntax on a GADT-style data type declaration: + +Class and instance declarations + + +The type variables in the head of a class or instance declaration +scope over the methods defined in the where part. For example: - - data Term a where - Lit { val :: Int } :: Term Int - Succ { num :: Term Int } :: Term Int - Pred { num :: Term Int } :: Term Int - IsZero { arg :: Term Int } :: Term Bool - Pair { arg1 :: Term a - , arg2 :: Term b - } :: Term (a,b) - If { cnd :: Term Bool - , tru :: Term a - , fls :: Term a - } :: Term a - -For every constructor that has a field f, (a) the type of -field f must be the same; and (b) the -result type of the constructor must be the same; both modulo alpha conversion. -Hence, in our example, we cannot merge the num and arg -fields above into a -single name. Although their field types are both Term Int, -their selector functions actually have different types: - num :: Term Int -> Term Int - arg :: Term Bool -> Term Int + class C a where + op :: [a] -> a + + op xs = let ys::[a] + ys = reverse xs + in + head ys + + -At the moment, record updates are not yet possible with GADT, so support is -limited to record construction, selection and pattern matching: + - - someTerm :: Term Bool - someTerm = IsZero { arg = Succ { num = Lit { val = 0 } } } - eval :: Term a -> a - eval Lit { val = i } = i - eval Succ { num = t } = eval t + 1 - eval Pred { num = t } = eval t - 1 - eval IsZero { arg = t } = eval t == 0 - eval Pair { arg1 = t1, arg2 = t2 } = (eval t1, eval t2) - eval t@If{} = if eval (cnd t) then eval (tru t) else eval (fls t) - + +Generalised typing of mutually recursive bindings - + +The Haskell Report specifies that a group of bindings (at top level, or in a +let or where) should be sorted into +strongly-connected components, and then type-checked in dependency order +(Haskell +Report, Section 4.5.1). +As each group is type-checked, any binders of the group that +have +an explicit type signature are put in the type environment with the specified +polymorphic type, +and all others are monomorphic until the group is generalised +(Haskell Report, Section 4.5.2). + - -You can use strictness annotations, in the obvious places -in the constructor type: +Following a suggestion of Mark Jones, in his paper +Typing Haskell in +Haskell, +GHC implements a more general scheme. If is +specified: +the dependency analysis ignores references to variables that have an explicit +type signature. +As a result of this refined dependency analysis, the dependency groups are smaller, and more bindings will +typecheck. For example, consider: - data Term a where - Lit :: !Int -> Term Int - If :: Term Bool -> !(Term a) -> !(Term a) -> Term a - Pair :: Term a -> Term b -> Term (a,b) + f :: Eq a => a -> Bool + f x = (x == x) || g True || g "Yes" + + g y = (y <= y) || f True - - - -You can use a deriving clause on a GADT-style data type -declaration, but only if the data type could also have been declared in -Haskell-98 syntax. For example, these two declarations are equivalent +This is rejected by Haskell 98, but under Jones's scheme the definition for +g is typechecked first, separately from that for +f, +because the reference to f in g's right +hand side is ingored by the dependency analysis. Then g's +type is generalised, to get - data Maybe1 a where { - Nothing1 :: Maybe a ; - Just1 :: a -> Maybe a - } deriving( Eq, Ord ) - - data Maybe2 a = Nothing2 | Just2 a - deriving( Eq, Ord ) + g :: Ord a => a -> Bool -This simply allows you to declare a vanilla Haskell-98 data type using the -where form without losing the deriving clause. - +Now, the defintion for f is typechecked, with this type for +g in the type environment. + - -Pattern matching causes type refinement. For example, in the right hand side of the equation + +The same refined dependency analysis also allows the type signatures of +mutually-recursive functions to have different contexts, something that is illegal in +Haskell 98 (Section 4.5.2, last sentence). With + +GHC only insists that the type signatures of a refined group have identical +type signatures; in practice this means that only variables bound by the same +pattern binding must have the same context. For example, this is fine: - eval :: Term a -> a - eval (Lit i) = ... + f :: Eq a => a -> Bool + f x = (x == x) || g True + + g :: Ord a => a -> Bool + g y = (y <= y) || f True -the type a is refined to Int. (That's the whole point!) -A precise specification of the type rules is beyond what this user manual aspires to, but there is a paper -about the ideas: "Wobbly types: practical type inference for generalised algebraic data types", on Simon PJ's home page. + + - The general principle is this: type refinement is only carried out based on user-supplied type annotations. -So if no type signature is supplied for eval, no type refinement happens, and lots of obscure error messages will -occur. However, the refinement is quite general. For example, if we had: - - eval :: Term a -> a -> a - eval (Lit i) j = i+j - -the pattern match causes the type a to be refined to Int (because of the type -of the constructor Lit, and that refinement also applies to the type of j, and -the result type of the case expression. Hence the addition i+j is legal. + +Type families + + + +GHC supports the definition of type families indexed by types. They may be +seen as an extension of Haskell 98's class-based overloading of values to +types. When type families are declared in classes, they are also known as +associated types. - - + +There are two forms of type families: data families and type synonym families. +Currently, only the former are fully implemented, while we are still working +on the latter. As a result, the specification of the language extension is +also still to some degree in flux. Hence, a more detailed description of +the language extension and its use is currently available +from the Haskell +wiki page on type families. The material will be moved to this user's +guide when it has stabilised. - -Notice that GADTs generalise existential types. For example, these two declarations are equivalent: - - data T a = forall b. MkT b (b->a) - data T' a where { MKT :: b -> (b->a) -> T' a } - + +Type families are enabled by the flag . - - + + + + + @@ -3932,24 +4686,27 @@ Template Meta-programming for Haskell" (Proc Haskell Workshop 2002). There is a Wiki page about Template Haskell at -http://www.haskell.org/th/, and that is the best place to look for +http://www.haskell.org/haskellwiki/Template_Haskell, and that is the best place to look for further details. You may also consult the online Haskell library reference material -(search for the type ExpQ). -[Temporary: many changes to the original design are described in - "http://research.microsoft.com/~simonpj/tmp/notes2.ps". -Not all of these changes are in GHC 6.6.] +(look for module Language.Haskell.TH). +Many changes to the original design are described in + +Notes on Template Haskell version 2. +Not all of these changes are in GHC, however. - The first example from that paper is set out below as a worked example to help get you started. + The first example from that paper is set out below () +as a worked example to help get you started. -The documentation here describes the realisation in GHC. (It's rather sketchy just now; -Tim Sheard is going to expand it.) +The documentation here describes the realisation of Template Haskell in GHC. It is not detailed enough to +understand Template Haskell; see the +Wiki page. @@ -3957,9 +4714,10 @@ Tim Sheard is going to expand it.) Template Haskell has the following new syntactic constructions. You need to use the flag - + + to switch these syntactic extensions on - ( is no longer implied by + ( is no longer implied by ). @@ -3974,41 +4732,47 @@ Tim Sheard is going to expand it.) an expression; the spliced expression must have type Q Exp - a list of top-level declarations; ; the spliced expression must have type Q [Dec] - [Planned, but not implemented yet.] a - type; the spliced expression must have type Q Typ. + a list of top-level declarations; the spliced expression must have type Q [Dec] - (Note that the syntax for a declaration splice uses "$" not "splice" as in - the paper. Also the type of the enclosed expression must be Q [Dec], not [Q Dec] - as in the paper.) - + + Inside a splice you can can only call functions defined in imported modules, + not functions defined elsewhere in the same module. A expression quotation is written in Oxford brackets, thus: [| ... |], where the "..." is an expression; - the quotation has type Expr. + the quotation has type Q Exp. [d| ... |], where the "..." is a list of top-level declarations; the quotation has type Q [Dec]. - [Planned, but not implemented yet.] [t| ... |], where the "..." is a type; - the quotation has type Type. + [t| ... |], where the "..." is a type; + the quotation has type Q Typ. - Reification is written thus: + A name can be quoted with either one or two prefix single quotes: - reifyDecl T, where T is a type constructor; this expression - has type Dec. - reifyDecl C, where C is a class; has type Dec. - reifyType f, where f is an identifier; has type Typ. - Still to come: fixities - - + 'f has type Name, and names the function f. + Similarly 'C has type Name and names the data constructor C. + In general 'thing interprets thing in an expression context. + + ''T has type Name, and names the type constructor T. + That is, ''thing interprets thing in a type context. + + + These Names can be used to construct Template Haskell expressions, patterns, delarations etc. They + may also be given as an argument to the reify function. + +(Compared to the original paper, there are many differnces of detail. +The syntax for a declaration splice uses "$" not "splice". +The type of the enclosed expression must be Q [Dec], not [Q Dec]. +Type splices are not implemented, and neither are pattern splices or quotations. + Using Template Haskell @@ -4025,6 +4789,14 @@ Tim Sheard is going to expand it.) (It would make sense to do so, but it's hard to implement.) + + Furthermore, you can only run a function at compile time if it is imported + from another module that is not part of a mutually-recursive group of modules + that includes the module currently being compiled. For example, when compiling module A, + you can only run Template Haskell functions imported from B if B does not import A (directly or indirectly). + The reason should be clear: to run B we must compile and run A, but we are currently type-checking A. + + The flag -ddump-splices shows the expansion of all top-level splices as they happen. @@ -4043,7 +4815,7 @@ Tim Sheard is going to expand it.) - A Template Haskell Worked Example + A Template Haskell Worked Example To help you get over the confidence barrier, try out this skeletal worked example. First cut and paste the two modules below into "Main.hs" and "Printf.hs": @@ -4083,21 +4855,21 @@ parse s = [ L s ] -- Generate Haskell source code from a parsed representation -- of the format string. This code will be spliced into -- the module which calls "pr", at compile time. -gen :: [Format] -> ExpQ +gen :: [Format] -> Q Exp gen [D] = [| \n -> show n |] gen [S] = [| \s -> s |] gen [L s] = stringE s -- Here we generate the Haskell code for the splice -- from an input format string. -pr :: String -> ExpQ -pr s = gen (parse s) +pr :: String -> Q Exp +pr s = gen (parse s) Now run the compiler (here we are a Cygwin prompt on Windows): -$ ghc --make -fth main.hs -o main.exe +$ ghc --make -XTemplateHaskell main.hs -o main.exe Run "main.exe" and here is your output: @@ -4186,7 +4958,7 @@ Palgrave, 2003. and the arrows web page at http://www.haskell.org/arrows/. -With the flag, GHC supports the arrow +With the flag, GHC supports the arrow notation described in the second of these papers. What follows is a brief introduction to the notation; it won't make much sense unless you've read Hughes's paper. @@ -4644,7 +5416,7 @@ Because the preprocessor targets Haskell (rather than Core), - + Bang patterns <indexterm><primary>Bang patterns</primary></indexterm> @@ -4656,10 +5428,10 @@ prime feature description contains more discussion and examples than the material below. -Bang patterns are enabled by the flag . +Bang patterns are enabled by the flag . - + Informal description of bang patterns @@ -4714,7 +5486,7 @@ is part of the syntax of let bindings. - + Syntax and semantics @@ -4729,7 +5501,7 @@ f !x = 3 Is this a definition of the infix function "(!)", or of the "f" with a bang pattern? GHC resolves this -ambiguity inf favour of the latter. If you want to define +ambiguity in favour of the latter. If you want to define (!) with bang-patterns enabled, you have to do so using prefix notation: @@ -4788,7 +5560,7 @@ a module. - + Assertions <indexterm><primary>Assertions</primary></indexterm> @@ -4898,6 +5670,87 @@ Assertion failures can be caught, see the documentation for the unrecognised word is (silently) ignored. + Certain pragmas are file-header pragmas. A file-header + pragma must precede the module keyword in the file. + There can be as many file-header pragmas as you please, and they can be + preceded or followed by comments. + + + LANGUAGE pragma + + LANGUAGEpragma + pragmaLANGUAGE + + The LANGUAGE pragma allows language extensions to be enabled + in a portable way. + It is the intention that all Haskell compilers support the + LANGUAGE pragma with the same syntax, although not + all extensions are supported by all compilers, of + course. The LANGUAGE pragma should be used instead + of OPTIONS_GHC, if possible. + + For example, to enable the FFI and preprocessing with CPP: + +{-# LANGUAGE ForeignFunctionInterface, CPP #-} + + LANGUAGE is a file-header pragma (see ). + + Every language extension can also be turned into a command-line flag + by prefixing it with "-X"; for example . + (Similarly, all "-X" flags can be written as LANGUAGE pragmas. + + + A list of all supported language extensions can be obtained by invoking + ghc --supported-languages (see ). + + Any extension from the Extension type defined in + Language.Haskell.Extension + may be used. GHC will report an error if any of the requested extensions are not supported. + + + + + OPTIONS_GHC pragma + OPTIONS_GHC + + pragmaOPTIONS_GHC + + + The OPTIONS_GHC pragma is used to specify + additional options that are given to the compiler when compiling + this source file. See for + details. + + Previous versions of GHC accepted OPTIONS rather + than OPTIONS_GHC, but that is now deprecated. + + + OPTIONS_GHC is a file-header pragma (see ). + + + INCLUDE pragma + + The INCLUDE pragma is for specifying the names + of C header files that should be #include'd into + the C source code generated by the compiler for the current module (if + compiling via C). For example: + + +{-# INCLUDE "foo.h" #-} +{-# INCLUDE <stdio.h> #-} + + INCLUDE is a file-header pragma (see ). + + An INCLUDE pragma is the preferred alternative + to the option (), because the + INCLUDE pragma is understood by other + compilers. Yet another alternative is to add the include file to each + foreign import declaration in your code, but we + don't recommend using this approach with GHC. + + DEPRECATED pragma DEPRECATED @@ -4950,31 +5803,6 @@ Assertion failures can be caught, see the documentation for the . - - INCLUDE pragma - - The INCLUDE pragma is for specifying the names - of C header files that should be #include'd into - the C source code generated by the compiler for the current module (if - compiling via C). For example: - - -{-# INCLUDE "foo.h" #-} -{-# INCLUDE <stdio.h> #-} - - The INCLUDE pragma(s) must appear at the top of - your source file with any OPTIONS_GHC - pragma(s). - - An INCLUDE pragma is the preferred alternative - to the option (), because the - INCLUDE pragma is understood by other - compilers. Yet another alternative is to add the include file to each - foreign import declaration in your code, but we - don't recommend using this approach with GHC. - - INLINE and NOINLINE pragmas @@ -5121,29 +5949,6 @@ happen. - - LANGUAGE pragma - - LANGUAGEpragma - pragmaLANGUAGE - - This allows language extensions to be enabled in a portable way. - It is the intention that all Haskell compilers support the - LANGUAGE pragma with the same syntax, although not - all extensions are supported by all compilers, of - course. The LANGUAGE pragma should be used instead - of OPTIONS_GHC, if possible. - - For example, to enable the FFI and preprocessing with CPP: - -{-# LANGUAGE ForeignFunctionInterface, CPP #-} - - Any extension from the Extension type defined in - Language.Haskell.Extension may be used. GHC will report an error if any of the requested extensions are not supported. - - - LINE pragma @@ -5163,22 +5968,6 @@ happen. pragma. - - OPTIONS_GHC pragma - OPTIONS_GHC - - pragmaOPTIONS_GHC - - - The OPTIONS_GHC pragma is used to specify - additional options that are given to the compiler when compiling - this source file. See for - details. - - Previous versions of GHC accepted OPTIONS rather - than OPTIONS_GHC, but that is now deprecated. - - RULES pragma @@ -5757,12 +6546,6 @@ The following are good consumers: - length - - - - - ++ (on its first argument) @@ -6030,7 +6813,7 @@ r) GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha r) -> tpl2}) - (%note "foo" + (%note "bar" eta); @@ -6048,84 +6831,10 @@ r) -> Special built-in functions -GHC has a few built-in funcions with special behaviour, -described in this section. All are exported by -GHC.Exts. - - The <literal>inline</literal> function - -The inline function is somewhat experimental. - - inline :: a -> a - -The call (inline f) arranges that f -is inlined, regardless of its size. More precisely, the call -(inline f) rewrites to the right-hand side of f's -definition. -This allows the programmer to control inlining from -a particular call site -rather than the definition site of the function -(c.f. INLINE pragmas ). - - -This inlining occurs regardless of the argument to the call -or the size of f's definition; it is unconditional. -The main caveat is that f's definition must be -visible to the compiler. That is, f must be -let-bound in the current scope. -If no inlining takes place, the inline function -expands to the identity function in Phase zero; so its use imposes -no overhead. - - If the function is defined in another -module, GHC only exposes its inlining in the interface file if the -function is sufficiently small that it might be -inlined by the automatic mechanism. There is currently no way to tell -GHC to expose arbitrarily-large functions in the interface file. (This -shortcoming is something that could be fixed, with some kind of pragma.) - - - - The <literal>lazy</literal> function - -The lazy function restrains strictness analysis a little: - - lazy :: a -> a - -The call (lazy e) means the same as e, -but lazy has a magical property so far as strictness -analysis is concerned: it is lazy in its first argument, -even though its semantics is strict. After strictness analysis has run, -calls to lazy are inlined to be the identity function. - - -This behaviour is occasionally useful when controlling evaluation order. -Notably, lazy is used in the library definition of -Control.Parallel.par: - - par :: a -> b -> b - par x y = case (par# x) of { _ -> lazy y } - -If lazy were not lazy, par would -look strict in y which would defeat the whole -purpose of par. - - - - The <literal>unsafeCoerce#</literal> function - -The function unsafeCoerce# allows you to side-step the -typechecker entirely. It has type - - unsafeCoerce# :: a -> b - -That is, it allows you to coerce any type into any other type. If you use this -function, you had better get it right, otherwise segmentation faults await. -It is generally used when you want to write a program that you know is -well-typed, but where Haskell's type system is not expressive enough to prove -that it is well typed. - - +GHC has a few built-in funcions with special behaviour. These +are now described in the module GHC.Prim +in the library documentation. @@ -6182,7 +6891,7 @@ where clause and over-ride whichever methods you please. Use the flags (to enable the extra syntax), - (to generate extra per-data-type code), + (to generate extra per-data-type code), and (to make the Generics library available. @@ -6391,21 +7100,21 @@ carried out at let and where bindings. Switching off the dreaded Monomorphism Restriction - + Haskell's monomorphism restriction (see Section 4.5.5 of the Haskell Report) can be completely switched off by -. +. Monomorphic pattern bindings - - + + As an experimental change, we are exploring the possibility of making pattern bindings monomorphic; that is, not generalised at all. @@ -6421,7 +7130,7 @@ can be completely switched off by [x] = e -- A pattern binding Experimentally, GHC now makes pattern bindings monomorphic by -default. Use to recover the +default. Use to recover the standard behaviour.