From 6eccfc1e938b81819eead0f53418aefae970b4a6 Mon Sep 17 00:00:00 2001 From: simonpj Date: Fri, 21 Dec 2001 10:05:55 +0000 Subject: [PATCH] [project @ 2001-12-21 10:05:55 by simonpj] Docment newtype deriving stuff --- ghc/docs/users_guide/glasgow_exts.sgml | 555 ++++++++++++++++++-------------- 1 file changed, 313 insertions(+), 242 deletions(-) diff --git a/ghc/docs/users_guide/glasgow_exts.sgml b/ghc/docs/users_guide/glasgow_exts.sgml index 3669baa..4d8e450 100644 --- a/ghc/docs/users_guide/glasgow_exts.sgml +++ b/ghc/docs/users_guide/glasgow_exts.sgml @@ -210,6 +210,7 @@ program), you may wish to check if there are libraries that provide a + @@ -955,9 +956,9 @@ declarations However, if you give the command line option -fallow-overlapping-instances -option then two overlapping instance declarations are permitted -iff - +option then overlapping instance declarations are permitted. +However, GHC arranges never to commit to using an instance declaration +if another instance declaration also applies, either now or later. @@ -970,22 +971,11 @@ iff OR type2 is a substitution instance of type1 -(but not identical to type1) +(but not identical to type1), or vice versa. - - - - OR vice versa - - - - - Notice that these rules - - @@ -1005,8 +995,34 @@ Reason: you can pick which instance decl - - +However the rules are over-conservative. Two instance declarations can overlap, +but it can still be clear in particular situations which to use. For example: + + instance C (Int,a) where ... + instance C (a,Bool) where ... + +These are rejected by GHC's rules, but it is clear what to do when trying +to solve the constraint C (Int,Int) because the second instance +cannot apply. Yell if this restriction bites you. + + +GHC is also conservative about committing to an overlapping instance. For example: + + class C a where { op :: a -> a } + instance C [Int] where ... + instance C a => C [a] where ... + + f :: C b => [b] -> [b] + f x = op x + +From the RHS of f we get the constraint C [b]. But +GHC does not commit to the second instance declaration, because in a paricular +call of f, b might be instantiate to Int, so the first instance declaration +would be appropriate. So GHC rejects the program. If you add +GHC will instead silently pick the second instance, without complaining about +the problem of subsequent instantiations. + + Regrettably, GHC doesn't guarantee to detect overlapping instance declarations if they appear in different modules. GHC can "see" the instance declarations in the transitive closure of all the modules @@ -1365,56 +1381,78 @@ There should be more documentation, but there isn't (yet). Yell if you need it. -GHC's type system supports explicit universal quantification in -constructor fields and function arguments. This is useful for things -like defining runST from the state-thread world. -GHC's syntax for this now agrees with Hugs's, namely: +Haskell type signatures are implicitly quantified. The new keyword forall +allows us to say exactly what this means. For example: - - - forall a b. (Ord a, Eq b) => a -> b -> a + g :: b -> b - - - - -The context is, of course, optional. You can't use forall as -a type variable any more! +means this: + + g :: forall b. (b -> b) + +The two are treated identically. -Haskell type signatures are implicitly quantified. The forall -allows us to say exactly what this means. For example: - +However, GHC's type system supports arbitrary-rank +explicit universal quantification in +types. +For example, all the following types are legal: + + f1 :: forall a b. a -> b -> a + g1 :: forall a b. (Ord a, Eq b) => a -> b -> a - + f2 :: (forall a. a->a) -> Int -> Int + g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int - - g :: b -> b + f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool - +Here, f1 and g1 are rank-1 types, and +can be written in standard Haskell (e.g. f1 :: a->b->a). +The forall makes explicit the universal quantification that +is implicitly added by Haskell. - -means this: +The functions f2 and g2 have rank-2 types; +the forall is on the left of a function arrrow. As g2 +shows, the polymorphic type on the left of the function arrow can be overloaded. - - +The functions f3 and g3 have rank-3 types; +they have rank-2 types on the left of a function arrow. + + +GHC allows types of arbitrary rank; you can nest foralls +arbitrarily deep in function arrows. (GHC used to be restricted to rank 2, but +that restriction has now been lifted.) +In particular, a forall-type (also called a "type scheme"), +including an operational type class context, is legal: + + On the left of a function arrow + On the right of a function arrow (see ) + As the argument of a constructor, or type of a field, in a data type declaration. For +example, any of the f1,f2,f3,g1,g2,g3 above would be valid +field type signatures. + As the type of an implicit parameter + In a pattern type signature (see ) + +There is one place you cannot put a forall: +you cannot instantiate a type variable with a forall-type. So you cannot +make a forall-type the argument of a type constructor. So these types are illegal: - g :: forall b. (b -> b) + x1 :: [forall a. a->a] + x2 :: (forall a. a->a, Int) + x3 :: Maybe (forall a. a->a) - +Of course forall becomes a keyword; you can't use forall as +a type variable any more! - -The two are treated identically. - -Universally-quantified data type fields +<title>Examples @@ -1437,8 +1475,7 @@ newtype Swizzle = MkSwizzle (Ord a => [a] -> [a]) -The constructors now have so-called rank 2 polymorphic -types, in which there is a for-all in the argument types.: +The constructors have rank-2 types: @@ -1482,11 +1519,6 @@ to require explicit quantification on constructor arguments where that is what is wanted. Feedback welcomed.) - - - -Construction - You construct values of types T1, MonadT, Swizzle by applying the constructor to suitable values, just as usual. For example, @@ -1495,17 +1527,23 @@ the constructor to suitable values, just as usual. For example, -(T1 (\xy->x) 3) :: T Int - -(MkSwizzle sort) :: Swizzle -(MkSwizzle reverse) :: Swizzle + a1 :: T Int + a1 = T1 (\xy->x) 3 + + a2, a3 :: Swizzle + a2 = MkSwizzle sort + a3 = MkSwizzle reverse + + a4 :: MonadT Maybe + a4 = let r x = Just x + b m k = case m of + Just y -> k y + Nothing -> Nothing + in + MkMonad r b -(let r x = Just x - b m k = case m of - Just y -> k y - Nothing -> Nothing - in - MkMonad r b) :: MonadT Maybe + mkTs :: (forall b. b -> b -> b) -> a -> [T a] + mkTs f x y = [T1 f x, T1 f y] @@ -1516,11 +1554,6 @@ required, as (MkSwizzle reverse) shows. (reverseOrd constraint.) - - - -Pattern matching - When you use pattern matching, the bound variables may now have polymorphic types. For example: @@ -1529,17 +1562,17 @@ polymorphic types. For example: - f :: T a -> a -> (a, Char) - f (T1 f k) x = (f k x, f 'c' 'd') + f :: T a -> a -> (a, Char) + f (T1 w k) x = (w k x, w 'c' 'd') - g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b] - g (MkSwizzle s) xs f = s (map f (s xs)) + g :: (Ord a, Ord b) => Swizzle -> [a] -> (a -> b) -> [b] + g (MkSwizzle s) xs f = s (map f (s xs)) - h :: MonadT m -> [m a] -> m [a] - h m [] = return m [] - h m (x:xs) = bind m x $ \y -> - bind m (h m xs) $ \ys -> - return m (y:ys) + h :: MonadT m -> [m a] -> m [a] + h m [] = return m [] + h m (x:xs) = bind m x $ \y -> + bind m (h m xs) $ \ys -> + return m (y:ys) @@ -1550,187 +1583,50 @@ and bind to extract the polymorphic bind and return functions from the MonadT data structure, rather than using pattern matching. - - -You cannot pattern-match against an argument that is polymorphic. -For example: - - - newtype TIM s a = TIM (ST s (Maybe a)) - - runTIM :: (forall s. TIM s a) -> Maybe a - runTIM (TIM m) = runST m - - - - - -Here the pattern-match fails, because you can't pattern-match against -an argument of type (forall s. TIM s a). Instead you -must bind the variable and pattern match in the right hand side: - - - runTIM :: (forall s. TIM s a) -> Maybe a - runTIM tm = case tm of { TIM m -> runST m } - - -The tm on the right hand side is (invisibly) instantiated, like -any polymorphic value at its occurrence site, and now you can pattern-match -against it. - - -The partial-application restriction +Type inference -There is really only one way in which data structures with polymorphic -components might surprise you: you must not partially apply them. -For example, this is illegal: - - - - - - map MkSwizzle [sort, reverse] - - +In general, type inference for arbitrary-rank types is undecideable. +GHC uses an algorithm proposed by Odersky and Laufer ("Putting type annotations to work", POPL'96) +to get a decidable algorithm by requiring some help from the programmer. +We do not yet have a formal specification of "some help" but the rule is this: - -The restriction is this: every subexpression of the program must -have a type that has no for-alls, except that in a function -application (f e1…en) the partial applications are not subject to -this rule. The restriction makes type inference feasible. +For a lambda-bound or case-bound variable, x, either the programmer +provides an explicit polymorphic type for x, or GHC's type inference will assume +that x's type has no foralls in it. - -In the illegal example, the sub-expression MkSwizzle has the -polymorphic type (Ord b => [b] -> [b]) -> Swizzle and is not -a sub-expression of an enclosing application. On the other hand, this -expression is OK: - - - - +What does it mean to "provide" an explicit type for x? You can do that by +giving a type signature for x directly, using a pattern type signature +(), thus: - map (T1 (\a b -> a)) [1,2,3] + \ f :: (forall a. a->a) -> (f True, f 'c') - - - - -even though it involves a partial application of T1, because -the sub-expression T1 (\a b -> a) has type Int -> T -Int. - - - - - -Type signatures - - - -Once you have data constructors with universally-quantified fields, or -constants such as runST that have rank-2 types, it isn't long -before you discover that you need more! Consider: - - - - +Alternatively, you can give a type signature to the enclosing +context, which GHC can "push down" to find the type for the variable: - mkTs f x y = [T1 f x, T1 f y] + (\ f -> (f True, f 'c')) :: (forall a. a->a) -> (Bool,Char) - - - - -mkTs is a fuction that constructs some values of type -T, using some pieces passed to it. The trouble is that since -f is a function argument, Haskell assumes that it is -monomorphic, so we'll get a type error when applying T1 to -it. This is a rather silly example, but the problem really bites in -practice. Lots of people trip over the fact that you can't make -"wrappers functions" for runST for exactly the same reason. -In short, it is impossible to build abstractions around functions with -rank-2 types. - - - -The solution is fairly clear. We provide the ability to give a rank-2 -type signature for ordinary functions (not only data -constructors), thus: - - - - +Here the type signature on the expression can be pushed inwards +to give a type signature for f. Similarly, and more commonly, +one can give a type signature for the function itself: - mkTs :: (forall b. b -> b -> b) -> a -> [T a] - mkTs f x y = [T1 f x, T1 f y] + h :: (forall a. a->a) -> (Bool,Char) + h f = (f True, f 'c') - - - - -This type signature tells the compiler to attribute f with -the polymorphic type (forall b. b -> b -> b) when type -checking the body of mkTs, so now the application of -T1 is fine. - - - -There are two restrictions: - - - - - - - - - You can only define a rank 2 type, specified by the following -grammar: - - +You don't need to give a type signature if the lambda bound variable +is a constructor argument. Here is an example we saw earlier: -rank2type ::= [forall tyvars .] [context =>] funty -funty ::= ([forall tyvars .] [context =>] ty) -> funty - | ty -ty ::= ...current Haskell monotype syntax... + f :: T a -> a -> (a, Char) + f (T1 w k) x = (w k x, w 'c' 'd') - - -Informally, the universal quantification must all be right at the beginning, -or at the top level of a function argument. - - - - - - - There is a restriction on the definition of a function whose -type signature is a rank-2 type: the polymorphic arguments must be -matched on the left hand side of the "=" sign. You can't -define mkTs like this: - - - -mkTs :: (forall b. b -> b -> b) -> a -> [T a] -mkTs = \ f x y -> [T1 f x, T1 f y] - - - - -The same partial-application rule applies to ordinary functions with -rank-2 types as applied to data constructors. - - - - - - +Here we do not need to give a type signature to w, because +it is an argument of constructor T1 and that tells GHC all +it needs to know. @@ -3670,6 +3566,181 @@ Just to finish with, here's another example I rather like: + +Generalised derived instances for newtypes + + +When you define an abstract type using newtype, you may want +the new type to inherit some instances from its representation. In +Haskell 98, you can inherit instances of Eq, Ord, +Enum and Bounded by deriving them, but for any +other classes you have to write an explicit instance declaration. For +example, if you define + + + newtype Dollars = Dollars Int + + +and you want to use arithmetic on Dollars, you have to +explicitly define an instance of Num: + + + instance Num Dollars where + Dollars a + Dollars b = Dollars (a+b) + ... + +All the instance does is apply and remove the newtype +constructor. It is particularly galling that, since the constructor +doesn't appear at run-time, this instance declaration defines a +dictionary which is wholly equivalent to the Int +dictionary, only slower! + + + Generalising the deriving clause + +GHC now permits such instances to be derived instead, so one can write + + newtype Dollars = Dollars Int deriving (Eq,Show,Num) + + +and the implementation uses the same Num dictionary +for Dollars as for Int. Notionally, the compiler +derives an instance declaration of the form + + + instance Num Int => Num Dollars + + +which just adds or removes the newtype constructor according to the type. + + + +We can also derive instances of constructor classes in a similar +way. For example, suppose we have implemented state and failure monad +transformers, such that + + + instance Monad m => Monad (State s m) + instance Monad m => Monad (Failure m) + +In Haskell 98, we can define a parsing monad by + + type Parser tok m a = State [tok] (Failure m) a + + +which is automatically a monad thanks to the instance declarations +above. With the extension, we can make the parser type abstract, +without needing to write an instance of class Monad, via + + + newtype Parser tok m a = Parser (State [tok] (Failure m) a) + deriving Monad + +In this case the derived instance declaration is of the form + + instance Monad (State [tok] (Failure m)) => Monad (Parser tok m) + + +Notice that, since Monad is a constructor class, the +instance is a partial application of the new type, not the +entire left hand side. We can imagine that the type declaration is +``eta-converted'' to generate the context of the instance +declaration. + + + +We can even derive instances of multi-parameter classes, provided the +newtype is the last class parameter. In this case, a ``partial +application'' of the class appears in the deriving +clause. For example, given the class + + + class StateMonad s m | m -> s where ... + instance Monad m => StateMonad s (State s m) where ... + +then we can derive an instance of StateMonad for Parsers by + + newtype Parser tok m a = Parser (State [tok] (Failure m) a) + deriving (Monad, StateMonad [tok]) + + +The derived instance is obtained by completing the application of the +class to the new type: + + + instance StateMonad [tok] (State [tok] (Failure m)) => + StateMonad [tok] (Parser tok m) + + + + +As a result of this extension, all derived instances in newtype +declarations are treated uniformly (and implemented just by reusing +the dictionary for the representation type), except +Show and Read, which really behave differently for +the newtype and its representation. + + + + A more precise specification + +Derived instance declarations are constructed as follows. Consider the +declaration (after expansion of any type synonyms) + + + newtype T v1...vn = T' (S t1...tk vk+1...vn) deriving (c1...cm) + + +where S is a type constructor, t1...tk are +types, +vk+1...vn are type variables which do not occur in any of +the ti, and the ci are partial applications of +classes of the form C t1'...tj'. The derived instance +declarations are, for each ci, + + + instance ci (S t1...tk vk+1...v) => ci (T v1...vp) + +where p is chosen so that T v1...vp is of the +right kind for the last parameter of class Ci. + + + +As an example which does not work, consider + + newtype NonMonad m s = NonMonad (State s m s) deriving Monad + +Here we cannot derive the instance + + instance Monad (State s m) => Monad (NonMonad m) + + +because the type variable s occurs in State s m, +and so cannot be "eta-converted" away. It is a good thing that this +deriving clause is rejected, because NonMonad m is +not, in fact, a monad --- for the same reason. Try defining +>>= with the correct type: you won't be able to. + + + +Notice also that the order of class parameters becomes +important, since we can only derive instances for the last one. If the +StateMonad class above were instead defined as + + + class StateMonad m s | m -> s where ... + + +then we would not have been able to derive an instance for the +Parser type above. We hypothesise that multi-parameter +classes usually have one "main" parameter for which deriving new +instances is most interesting. + + + + + +