X-Git-Url: http://git.megacz.com/?a=blobdiff_plain;f=ghc%2Fdocs%2Fusers_guide%2Fglasgow_exts.xml;h=beaaad616a393caf66fc20809d502363266087fb;hb=a0f46309637779ccc141ec531e9b128596a5bba0;hp=a5ac2b3b9784bcfd64cef599324e65690ced9ebb;hpb=59c9c122f942f348008d4ed8ba088286343d63d3;p=ghc-hetmet.git diff --git a/ghc/docs/users_guide/glasgow_exts.xml b/ghc/docs/users_guide/glasgow_exts.xml index a5ac2b3..beaaad6 100644 --- a/ghc/docs/users_guide/glasgow_exts.xml +++ b/ghc/docs/users_guide/glasgow_exts.xml @@ -1596,9 +1596,9 @@ GHC lifts this restriction. + - - + Functional dependencies @@ -1618,6 +1618,8 @@ class declaration; e.g. There should be more documentation, but there isn't (yet). Yell if you need it. + +Rules for functional dependencies In a class declaration, all of the class type variables must be reachable (in the sense mentioned in ) @@ -1665,9 +1667,250 @@ class like this: + +Background on functional dependencies +The following description of the motivation and use of functional dependencies is taken +from the Hugs user manual, reproduced here (with minor changes) by kind +permission of Mark Jones. + + +Consider the following class, intended as part of a +library for collection types: + + class Collects e ce where + empty :: ce + insert :: e -> ce -> ce + member :: e -> ce -> Bool + +The type variable e used here represents the element type, while ce is the type +of the container itself. Within this framework, we might want to define +instances of this class for lists or characteristic functions (both of which +can be used to represent collections of any equality type), bit sets (which can +be used to represent collections of characters), or hash tables (which can be +used to represent any collection whose elements have a hash function). Omitting +standard implementation details, this would lead to the following declarations: + + instance Eq e => Collects e [e] where ... + instance Eq e => Collects e (e -> Bool) where ... + instance Collects Char BitSet where ... + instance (Hashable e, Collects a ce) + => Collects e (Array Int ce) where ... + +All this looks quite promising; we have a class and a range of interesting +implementations. Unfortunately, there are some serious problems with the class +declaration. First, the empty function has an ambiguous type: + + empty :: Collects e ce => ce + +By "ambiguous" we mean that there is a type variable e that appears on the left +of the => symbol, but not on the right. The problem with +this is that, according to the theoretical foundations of Haskell overloading, +we cannot guarantee a well-defined semantics for any term with an ambiguous +type. + + +We can sidestep this specific problem by removing the empty member from the +class declaration. However, although the remaining members, insert and member, +do not have ambiguous types, we still run into problems when we try to use +them. For example, consider the following two functions: + + f x y = insert x . insert y + g = f True 'a' + +for which GHC infers the following types: + + f :: (Collects a c, Collects b c) => a -> b -> c -> c + g :: (Collects Bool c, Collects Char c) => c -> c + +Notice that the type for f allows the two parameters x and y to be assigned +different types, even though it attempts to insert each of the two values, one +after the other, into the same collection. If we're trying to model collections +that contain only one type of value, then this is clearly an inaccurate +type. Worse still, the definition for g is accepted, without causing a type +error. As a result, the error in this code will not be flagged at the point +where it appears. Instead, it will show up only when we try to use g, which +might even be in a different module. + +An attempt to use constructor classes + +Faced with the problems described above, some Haskell programmers might be +tempted to use something like the following version of the class declaration: + + class Collects e c where + empty :: c e + insert :: e -> c e -> c e + member :: e -> c e -> Bool + +The key difference here is that we abstract over the type constructor c that is +used to form the collection type c e, and not over that collection type itself, +represented by ce in the original class declaration. This avoids the immediate +problems that we mentioned above: empty has type Collects e c => c +e, which is not ambiguous. + + +The function f from the previous section has a more accurate type: + + f :: (Collects e c) => e -> e -> c e -> c e + +The function g from the previous section is now rejected with a type error as +we would hope because the type of f does not allow the two arguments to have +different types. +This, then, is an example of a multiple parameter class that does actually work +quite well in practice, without ambiguity problems. +There is, however, a catch. This version of the Collects class is nowhere near +as general as the original class seemed to be: only one of the four instances +for Collects +given above can be used with this version of Collects because only one of +them---the instance for lists---has a collection type that can be written in +the form c e, for some type constructor c, and element type e. + + + +Adding functional dependencies + + +To get a more useful version of the Collects class, Hugs provides a mechanism +that allows programmers to specify dependencies between the parameters of a +multiple parameter class (For readers with an interest in theoretical +foundations and previous work: The use of dependency information can be seen +both as a generalization of the proposal for `parametric type classes' that was +put forward by Chen, Hudak, and Odersky, or as a special case of Mark Jones's +later framework for "improvement" of qualified types. The +underlying ideas are also discussed in a more theoretical and abstract setting +in a manuscript [implparam], where they are identified as one point in a +general design space for systems of implicit parameterization.). + +To start with an abstract example, consider a declaration such as: + + class C a b where ... + +which tells us simply that C can be thought of as a binary relation on types +(or type constructors, depending on the kinds of a and b). Extra clauses can be +included in the definition of classes to add information about dependencies +between parameters, as in the following examples: + + class D a b | a -> b where ... + class E a b | a -> b, b -> a where ... + +The notation a -> b used here between the | and where +symbols --- not to be +confused with a function type --- indicates that the a parameter uniquely +determines the b parameter, and might be read as "a determines b." Thus D is +not just a relation, but actually a (partial) function. Similarly, from the two +dependencies that are included in the definition of E, we can see that E +represents a (partial) one-one mapping between types. + + +More generally, dependencies take the form x1 ... xn -> y1 ... ym, +where x1, ..., xn, and y1, ..., yn are type variables with n>0 and +m>=0, meaning that the y parameters are uniquely determined by the x +parameters. Spaces can be used as separators if more than one variable appears +on any single side of a dependency, as in t -> a b. Note that a class may be +annotated with multiple dependencies using commas as separators, as in the +definition of E above. Some dependencies that we can write in this notation are +redundant, and will be rejected because they don't serve any useful +purpose, and may instead indicate an error in the program. Examples of +dependencies like this include a -> a , +a -> a a , +a -> , etc. There can also be +some redundancy if multiple dependencies are given, as in +a->b, + b->c , a->c , and +in which some subset implies the remaining dependencies. Examples like this are +not treated as errors. Note that dependencies appear only in class +declarations, and not in any other part of the language. In particular, the +syntax for instance declarations, class constraints, and types is completely +unchanged. + + +By including dependencies in a class declaration, we provide a mechanism for +the programmer to specify each multiple parameter class more precisely. The +compiler, on the other hand, is responsible for ensuring that the set of +instances that are in scope at any given point in the program is consistent +with any declared dependencies. For example, the following pair of instance +declarations cannot appear together in the same scope because they violate the +dependency for D, even though either one on its own would be acceptable: + + instance D Bool Int where ... + instance D Bool Char where ... + +Note also that the following declaration is not allowed, even by itself: + + instance D [a] b where ... + +The problem here is that this instance would allow one particular choice of [a] +to be associated with more than one choice for b, which contradicts the +dependency specified in the definition of D. More generally, this means that, +in any instance of the form: + + instance D t s where ... + +for some particular types t and s, the only variables that can appear in s are +the ones that appear in t, and hence, if the type t is known, then s will be +uniquely determined. + + +The benefit of including dependency information is that it allows us to define +more general multiple parameter classes, without ambiguity problems, and with +the benefit of more accurate types. To illustrate this, we return to the +collection class example, and annotate the original definition of Collects +with a simple dependency: + + class Collects e ce | ce -> e where + empty :: ce + insert :: e -> ce -> ce + member :: e -> ce -> Bool + +The dependency ce -> e here specifies that the type e of elements is uniquely +determined by the type of the collection ce. Note that both parameters of +Collects are of kind *; there are no constructor classes here. Note too that +all of the instances of Collects that we gave earlier can be used +together with this new definition. + + +What about the ambiguity problems that we encountered with the original +definition? The empty function still has type Collects e ce => ce, but it is no +longer necessary to regard that as an ambiguous type: Although the variable e +does not appear on the right of the => symbol, the dependency for class +Collects tells us that it is uniquely determined by ce, which does appear on +the right of the => symbol. Hence the context in which empty is used can still +give enough information to determine types for both ce and e, without +ambiguity. More generally, we need only regard a type as ambiguous if it +contains a variable on the left of the => that is not uniquely determined +(either directly or indirectly) by the variables on the right. + + +Dependencies also help to produce more accurate types for user defined +functions, and hence to provide earlier detection of errors, and less cluttered +types for programmers to work with. Recall the previous definition for a +function f: + + f x y = insert x y = insert x . insert y + +for which we originally obtained a type: + + f :: (Collects a c, Collects b c) => a -> b -> c -> c + +Given the dependency information that we have for Collects, however, we can +deduce that a and b must be equal because they both appear as the second +parameter in a Collects constraint with the same first parameter c. Hence we +can infer a shorter and more accurate type for f: + + f :: (Collects a c) => a -> a -> c -> c + +In a similar way, the earlier definition of g will now be flagged as a type error. + + +Although we have given only a few examples here, it should be clear that the +addition of dependency information can help to make multiple parameter classes +more useful in practice, avoiding ambiguity problems, and allowing more general +sets of instance declarations. + + + @@ -1690,22 +1933,38 @@ In Haskell 98 the head of an instance declaration must be of the form C (T a1 ... an), where C is the class, T is a type constructor, and the a1 ... an are distinct type variables. -Furthermore, the assertions in the context of the instance declaration be of -the form C a where a is a type variable. +Furthermore, the assertions in the context of the instance declaration +must be of the form C a where a +is a type variable that occurs in the head. The flag loosens these restrictions considerably. Firstly, multi-parameter type classes are permitted. Secondly, the context and head of the instance declaration can each consist of arbitrary -(well-kinded) assertions (C t1 ... tn) subject only to the following rule: -for each assertion in the context +(well-kinded) assertions (C t1 ... tn) subject only to the +following rules: + + +For each assertion in the context: No type variable has more occurrences in the assertion than in the head -Tthe assertion has fewer constructors and variables (taken together +The assertion has fewer constructors and variables (taken together and counting repetitions) than the head + + +The coverage condition. For each functional dependency, +tvsleft -> +tvsright, of the class, +every type variable in +S(tvsright) must appear in +S(tvsleft), where S is the +substitution mapping each type variable in the class declaration to the +corresponding type in the instance declaration. + + These restrictions ensure that context reduction terminates: each reduction -step makes the problem smaller +step makes the problem smaller by at least one constructor. For example, the following would make the type checker loop if it wasn't excluded: @@ -1731,34 +1990,52 @@ For example, these are OK: But these are not: - instance C a => C a where ... -- Context assertion no smaller than head - instance C b b => Foo [b] where ... + instance C a => C a where ... -- (C b b) has more more occurrences of b than the head + instance C b b => Foo [b] where ... + + + + +The same restrictions apply to instances generated by +deriving clauses. Thus the following is accepted: + + data MinHeap h a = H a (h a) + deriving (Show) + +because the derived instance + + instance (Show a, Show (h a)) => Show (MinHeap h a) +conforms to the above rules. -A couple of useful idioms are these. First, -if one allows overlapping instance declarations then it's quite +A useful idiom permitted by the above rules is as follows. +If one allows overlapping instance declarations then it's quite convenient to have a "default instance" declaration that applies if something more specific does not: instance C a where op = ... -- Default + + -Second, sometimes you might want to use the following to get the -effect of a "class synonym": - + +Undecidable instances + +Sometimes even the rules of are too onerous. +For example, sometimes you might want to use the following to get the +effect of a "class synonym": class (C1 a, C2 a, C3 a) => C a where { } instance (C1 a, C2 a, C3 a) => C a where { } This allows you to write shorter signatures: - f :: C a => ... @@ -1766,16 +2043,46 @@ instead of f :: (C1 a, C2 a, C3 a) => ... - - +The restrictions on functional dependencies () are particularly troublesome. +It is tempting to introduce type variables in the context that do not appear in +the head, something that is excluded by the normal rules. For example: + + class HasConverter a b | a -> b where + convert :: a -> b + + data Foo a = MkFoo a - -Undecidable instances + instance (HasConverter a b,Show b) => Show (Foo a) where + show (MkFoo value) = show (convert value) + +This is dangerous territory, however. Here, for example, is a program that would make the +typechecker loop: + + class D a + class F a b | a->b + instance F [a] [[a]] + instance (D c, F a c) => D [a] -- 'c' is not mentioned in the head + +Similarly, it can be tempting to lift the coverage condition: + + class Mul a b c | a b -> c where + (.*.) :: a -> b -> c + instance Mul Int Int Int where (.*.) = (*) + instance Mul Int Float Float where x .*. y = fromIntegral x * y + instance Mul a b c => Mul a [b] [c] where x .*. v = map (x.*.) v + +The third instance declaration does not obey the coverage condition; +and indeed the (somewhat strange) definition: + + f = \ b x y -> if b then x .*. [y] else y + +makes instance inference go into a loop, because it requires the constraint +(Mul a [b] b). + -Sometimes even the rules of are too onerous. -Voluminous correspondence on the Haskell mailing list has convinced me -that it's worth experimenting with more liberal rules. If you use +Nevertheless, GHC allows you to experiment with more liberal rules. If you use the experimental flag -fallow-undecidable-instances option, you can use arbitrary @@ -1784,10 +2091,7 @@ fixed-depth recursion stack. If you exceed the stack depth you get a sort of backtrace, and the opportunity to increase the stack depth with N. - -I'm on the lookout for a less brutal solution: a simple rule that preserves decidability while -allowing these idioms interesting idioms. - +