From 646ca6ffb55b1a9ee4c71e8573d87629dc7a4989 Mon Sep 17 00:00:00 2001 From: "simonpj@microsoft.com" Date: Fri, 22 Dec 2006 09:34:31 +0000 Subject: [PATCH] Documentation for class contexts in data-constructor declarations --- docs/users_guide/glasgow_exts.xml | 504 +++++++++++++++++++++++-------------- 1 file changed, 311 insertions(+), 193 deletions(-) diff --git a/docs/users_guide/glasgow_exts.xml b/docs/users_guide/glasgow_exts.xml index b2daa16..6f54d1e 100644 --- a/docs/users_guide/glasgow_exts.xml +++ b/docs/users_guide/glasgow_exts.xml @@ -1309,7 +1309,7 @@ universal quantification earlier. - + Record Constructors @@ -1361,20 +1361,6 @@ main = do display (inc (inc counterB)) -- prints "##" -In GADT declarations (see ), the explicit -forall may be omitted. For example, we can express -the same Counter a using GADT: - - -data Counter a where - NewCounter { _this :: self - , _inc :: self -> self - , _display :: self -> IO () - , tag :: a - } - :: Counter a - - At the moment, record update syntax is only supported for Haskell 98 data types, so the following function does not work: @@ -1515,7 +1501,7 @@ are convincing reasons to change it. You can't use deriving to define instances of a data type with existentially quantified data constructors. -Reason: in most cases it would not make sense. For example:# +Reason: in most cases it would not make sense. For example:; data T = forall a. MkT [a] deriving( Eq ) @@ -1543,6 +1529,315 @@ declarations. Define your own instances! + + + +Declaring data types with explicit constructor signatures + +GHC allows you to declare an algebraic data type by +giving the type signatures of constructors explicitly. For example: + + data Maybe a where + Nothing :: Maybe a + Just :: a -> Maybe a + +The form is called a "GADT-style declaration" +because Generalised Algebraic Data Types, described in , +can only be declared using this form. +Notice that GADT-style syntax generalises existential types (). +For example, these two declarations are equivalent: + + data Foo = forall a. MkFoo a (a -> Bool) + data Foo' where { MKFoo :: a -> (a->Bool) -> Foo' } + + +Any data type that can be declared in standard Haskell-98 syntax +can also be declared using GADT-style syntax. +The choice is largely stylistic, but GADT-style declarations differ in one important respect: +they treat class constraints on the data constructors differently. +Specifically, if the constructor is given a type-class context, that +context is made available by pattern matching. For example: + + data Set a where + MkSet :: Eq a => [a] -> Set a + + makeSet :: Eq a => [a] -> Set a + makeSet xs = MkSet (nub xs) + + insert :: a -> Set a -> Set a + insert a (MkSet as) | a `elem` as = MkSet as + | otherwise = MkSet (a:as) + +A use of MkSet as a constructor (e.g. in the definition of makeSet) +gives rise to a (Eq a) +constraint, as you would expect. The new feature is that pattern-matching on MkSet +(as in the definition of insert) makes available an (Eq a) +context. In implementation terms, the MkSet constructor has a hidden field that stores +the (Eq a) dictionary that is passed to MkSet; so +when pattern-matching that dictionary becomes available for the right-hand side of the match. +In the example, the equality dictionary is used to satisfy the equality constraint +generated by the call to elem, so that the type of +insert itself has no Eq constraint. + +This behaviour contrasts with Haskell 98's peculiar treament of +contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report). +In Haskell 98 the defintion + + data Eq a => Set' a = MkSet' [a] + +gives MkSet' the same type as MkSet above. But instead of +making available an (Eq a) constraint, pattern-matching +on MkSet' requires an (Eq a) constraint! +GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations, +GHC's behaviour is much more useful, as well as much more intuitive. + +For example, a possible application of GHC's behaviour is to reify dictionaries: + + data NumInst a where + MkNumInst :: Num a => NumInst a + + intInst :: NumInst Int + intInst = MkNumInst + + plus :: NumInst a -> a -> a -> a + plus MkNumInst p q = p + q + +Here, a value of type NumInst a is equivalent +to an explicit (Num a) dictionary. + + + +The rest of this section gives further details about GADT-style data +type declarations. + + + +The result type of each data constructor must begin with the type constructor being defined. +If the result type of all constructors +has the form T a1 ... an, where a1 ... an +are distinct type variables, then the data type is ordinary; +otherwise is a generalised data type (). + + + +The type signature of +each constructor is independent, and is implicitly universally quantified as usual. +Different constructors may have different universally-quantified type variables +and different type-class constraints. +For example, this is fine: + + data T a where + T1 :: Eq b => b -> T b + T2 :: (Show c, Ix c) => c -> [c] -> T c + + + + +Unlike a Haskell-98-style +data type declaration, the type variable(s) in the "data Set a where" header +have no scope. Indeed, one can write a kind signature instead: + + data Set :: * -> * where ... + +or even a mixture of the two: + + data Foo a :: (* -> *) -> * where ... + +The type variables (if given) may be explicitly kinded, so we could also write the header for Foo +like this: + + data Foo a (b :: * -> *) where ... + + + + + +You can use strictness annotations, in the obvious places +in the constructor type: + + data Term a where + Lit :: !Int -> Term Int + If :: Term Bool -> !(Term a) -> !(Term a) -> Term a + Pair :: Term a -> Term b -> Term (a,b) + + + + +You can use a deriving clause on a GADT-style data type +declaration. For example, these two declarations are equivalent + + data Maybe1 a where { + Nothing1 :: Maybe1 a ; + Just1 :: a -> Maybe1 a + } deriving( Eq, Ord ) + + data Maybe2 a = Nothing2 | Just2 a + deriving( Eq, Ord ) + + + + +You can use record syntax on a GADT-style data type declaration: + + + data Person where + Adult { name :: String, children :: [Person] } :: Person + Child { name :: String } :: Person + +As usual, for every constructor that has a field f, the type of +field f must be the same (modulo alpha conversion). + + +At the moment, record updates are not yet possible with GADT-style declarations, +so support is limited to record construction, selection and pattern matching. +For exmaple + + aPerson = Adult { name = "Fred", children = [] } + + shortName :: Person -> Bool + hasChildren (Adult { children = kids }) = not (null kids) + hasChildren (Child {}) = False + + + + +As in the case of existentials declared using the Haskell-98-like record syntax +(), +record-selector functions are generated only for those fields that have well-typed +selectors. +Here is the example of that section, in GADT-style syntax: + +data Counter a where + NewCounter { _this :: self + , _inc :: self -> self + , _display :: self -> IO () + , tag :: a + } + :: Counter a + +As before, only one selector function is generated here, that for tag. +Nevertheless, you can still use all the field names in pattern matching and record construction. + + + + + +Generalised Algebraic Data Types (GADTs) + +Generalised Algebraic Data Types generalise ordinary algebraic data types +by allowing constructors to have richer return types. Here is an example: + + data Term a where + Lit :: Int -> Term Int + Succ :: Term Int -> Term Int + IsZero :: Term Int -> Term Bool + If :: Term Bool -> Term a -> Term a -> Term a + Pair :: Term a -> Term b -> Term (a,b) + +Notice that the return type of the constructors is not always Term a, as is the +case with ordinary data types. This generality allows us to +write a well-typed eval function +for these Terms: + + eval :: Term a -> a + eval (Lit i) = i + eval (Succ t) = 1 + eval t + eval (IsZero t) = eval t == 0 + eval (If b e1 e2) = if eval b then eval e1 else eval e2 + eval (Pair e1 e2) = (eval e1, eval e2) + +The key point about GADTs is that pattern matching causes type refinement. +For example, in the right hand side of the equation + + eval :: Term a -> a + eval (Lit i) = ... + +the type a is refined to Int. That's the whole point! +A precise specification of the type rules is beyond what this user manual aspires to, +but the design closely follows that described in +the paper Simple +unification-based type inference for GADTs, +(ICFP 2006). +The general principle is this: type refinement is only carried out +based on user-supplied type annotations. +So if no type signature is supplied for eval, no type refinement happens, +and lots of obscure error messages will +occur. However, the refinement is quite general. For example, if we had: + + eval :: Term a -> a -> a + eval (Lit i) j = i+j + +the pattern match causes the type a to be refined to Int (because of the type +of the constructor Lit), and that refinement also applies to the type of j, and +the result type of the case expression. Hence the addition i+j is legal. + + +These and many other examples are given in papers by Hongwei Xi, and +Tim Sheard. There is a longer introduction +on the wiki, +and Ralf Hinze's +Fun with phantom types also has a number of examples. Note that papers +may use different notation to that implemented in GHC. + + +The rest of this section outlines the extensions to GHC that support GADTs. + + +A GADT can only be declared using GADT-style syntax (); +the old Haskell-98 syntax for data declarations always declares an ordinary data type. +The result type of each constructor must begin with the type constructor being defined, +but for a GADT the arguments to the type constructor can be arbitrary monotypes. +For example, in the Term data +type above, the type of each constructor must end with Term ty, but +the ty may not be a type variable (e.g. the Lit +constructor). + + + +You cannot use a deriving clause for a GADT; only for +an ordianary data type. + + + +As mentioned in , record syntax is supported. +For example: + + data Term a where + Lit { val :: Int } :: Term Int + Succ { num :: Term Int } :: Term Int + Pred { num :: Term Int } :: Term Int + IsZero { arg :: Term Int } :: Term Bool + Pair { arg1 :: Term a + , arg2 :: Term b + } :: Term (a,b) + If { cnd :: Term Bool + , tru :: Term a + , fls :: Term a + } :: Term a + +However, for GADTs there is the following additional constraint: +every constructor that has a field f must have +the same result type (modulo alpha conversion) +Hence, in the above example, we cannot merge the num +and arg fields above into a +single name. Although their field types are both Term Int, +their selector functions actually have different types: + + + num :: Term Int -> Term Int + arg :: Term Bool -> Term Int + + + + + + + + + + + @@ -3818,183 +4113,6 @@ pattern binding must have the same context. For example, this is fine: - - - -Generalised Algebraic Data Types (GADTs) - -Generalised Algebraic Data Types generalise ordinary algebraic data types by allowing you -to give the type signatures of constructors explicitly. For example: - - data Term a where - Lit :: Int -> Term Int - Succ :: Term Int -> Term Int - IsZero :: Term Int -> Term Bool - If :: Term Bool -> Term a -> Term a -> Term a - Pair :: Term a -> Term b -> Term (a,b) - -Notice that the return type of the constructors is not always Term a, as is the -case with ordinary vanilla data types. Now we can write a well-typed eval function -for these Terms: - - eval :: Term a -> a - eval (Lit i) = i - eval (Succ t) = 1 + eval t - eval (IsZero t) = eval t == 0 - eval (If b e1 e2) = if eval b then eval e1 else eval e2 - eval (Pair e1 e2) = (eval e1, eval e2) - -These and many other examples are given in papers by Hongwei Xi, and -Tim Sheard. There is a longer introduction -on the wiki, -and Ralf Hinze's -Fun with phantom types also has a number of examples. Note that papers -may use different notation to that implemented in GHC. - - -The rest of this section outlines the extensions to GHC that support GADTs. -It is far from comprehensive, but the design closely follows that described in -the paper Simple -unification-based type inference for GADTs, -which appeared in ICFP 2006. - - - Data type declarations have a 'where' form, as exemplified above. The type signature of -each constructor is independent, and is implicitly universally quantified as usual. Unlike a normal -Haskell data type declaration, the type variable(s) in the "data Term a where" header -have no scope. Indeed, one can write a kind signature instead: - - data Term :: * -> * where ... - -or even a mixture of the two: - - data Foo a :: (* -> *) -> * where ... - -The type variables (if given) may be explicitly kinded, so we could also write the header for Foo -like this: - - data Foo a (b :: * -> *) where ... - - - - -There are no restrictions on the type of the data constructor, except that the result -type must begin with the type constructor being defined. For example, in the Term data -type above, the type of each constructor must end with ... -> Term .... - - - -You can use record syntax on a GADT-style data type declaration: - - - data Term a where - Lit { val :: Int } :: Term Int - Succ { num :: Term Int } :: Term Int - Pred { num :: Term Int } :: Term Int - IsZero { arg :: Term Int } :: Term Bool - Pair { arg1 :: Term a - , arg2 :: Term b - } :: Term (a,b) - If { cnd :: Term Bool - , tru :: Term a - , fls :: Term a - } :: Term a - -For every constructor that has a field f, (a) the type of -field f must be the same; and (b) the -result type of the constructor must be the same; both modulo alpha conversion. -Hence, in our example, we cannot merge the num and arg -fields above into a -single name. Although their field types are both Term Int, -their selector functions actually have different types: - - - num :: Term Int -> Term Int - arg :: Term Bool -> Term Int - - -At the moment, record updates are not yet possible with GADT, so support is -limited to record construction, selection and pattern matching: - - - someTerm :: Term Bool - someTerm = IsZero { arg = Succ { num = Lit { val = 0 } } } - - eval :: Term a -> a - eval Lit { val = i } = i - eval Succ { num = t } = eval t + 1 - eval Pred { num = t } = eval t - 1 - eval IsZero { arg = t } = eval t == 0 - eval Pair { arg1 = t1, arg2 = t2 } = (eval t1, eval t2) - eval t@If{} = if eval (cnd t) then eval (tru t) else eval (fls t) - - - - - -You can use strictness annotations, in the obvious places -in the constructor type: - - data Term a where - Lit :: !Int -> Term Int - If :: Term Bool -> !(Term a) -> !(Term a) -> Term a - Pair :: Term a -> Term b -> Term (a,b) - - - - -You can use a deriving clause on a GADT-style data type -declaration, but only if the data type could also have been declared in -Haskell-98 syntax. For example, these two declarations are equivalent - - data Maybe1 a where { - Nothing1 :: Maybe1 a ; - Just1 :: a -> Maybe1 a - } deriving( Eq, Ord ) - - data Maybe2 a = Nothing2 | Just2 a - deriving( Eq, Ord ) - -This simply allows you to declare a vanilla Haskell-98 data type using the -where form without losing the deriving clause. - - - -Pattern matching causes type refinement. For example, in the right hand side of the equation - - eval :: Term a -> a - eval (Lit i) = ... - -the type a is refined to Int. (That's the whole point!) -A precise specification of the type rules is beyond what this user manual aspires to, but there is a paper -about the ideas: "Wobbly types: practical type inference for generalised algebraic data types", on Simon PJ's home page. - - The general principle is this: type refinement is only carried out based on user-supplied type annotations. -So if no type signature is supplied for eval, no type refinement happens, and lots of obscure error messages will -occur. However, the refinement is quite general. For example, if we had: - - eval :: Term a -> a -> a - eval (Lit i) j = i+j - -the pattern match causes the type a to be refined to Int (because of the type -of the constructor Lit, and that refinement also applies to the type of j, and -the result type of the case expression. Hence the addition i+j is legal. - - - - - -Notice that GADTs generalise existential types. For example, these two declarations are equivalent: - - data T a = forall b. MkT b (b->a) - data T' a where { MKT :: b -> (b->a) -> T' a } - - - - - - -- 1.7.10.4