X-Git-Url: http://git.megacz.com/?a=blobdiff_plain;f=ghc%2Fdocs%2Fusers_guide%2Fglasgow_exts.xml;h=beaaad616a393caf66fc20809d502363266087fb;hb=a0f46309637779ccc141ec531e9b128596a5bba0;hp=a5ac2b3b9784bcfd64cef599324e65690ced9ebb;hpb=59c9c122f942f348008d4ed8ba088286343d63d3;p=ghc-hetmet.git
diff --git a/ghc/docs/users_guide/glasgow_exts.xml b/ghc/docs/users_guide/glasgow_exts.xml
index a5ac2b3..beaaad6 100644
--- a/ghc/docs/users_guide/glasgow_exts.xml
+++ b/ghc/docs/users_guide/glasgow_exts.xml
@@ -1596,9 +1596,9 @@ GHC lifts this restriction.
+
-
-
+Functional dependencies
@@ -1618,6 +1618,8 @@ class declaration; e.g.
There should be more documentation, but there isn't (yet). Yell if you need it.
+
+Rules for functional dependencies
In a class declaration, all of the class type variables must be reachable (in the sense
mentioned in )
@@ -1665,9 +1667,250 @@ class like this:
+
+Background on functional dependencies
+The following description of the motivation and use of functional dependencies is taken
+from the Hugs user manual, reproduced here (with minor changes) by kind
+permission of Mark Jones.
+
+
+Consider the following class, intended as part of a
+library for collection types:
+
+ class Collects e ce where
+ empty :: ce
+ insert :: e -> ce -> ce
+ member :: e -> ce -> Bool
+
+The type variable e used here represents the element type, while ce is the type
+of the container itself. Within this framework, we might want to define
+instances of this class for lists or characteristic functions (both of which
+can be used to represent collections of any equality type), bit sets (which can
+be used to represent collections of characters), or hash tables (which can be
+used to represent any collection whose elements have a hash function). Omitting
+standard implementation details, this would lead to the following declarations:
+
+ instance Eq e => Collects e [e] where ...
+ instance Eq e => Collects e (e -> Bool) where ...
+ instance Collects Char BitSet where ...
+ instance (Hashable e, Collects a ce)
+ => Collects e (Array Int ce) where ...
+
+All this looks quite promising; we have a class and a range of interesting
+implementations. Unfortunately, there are some serious problems with the class
+declaration. First, the empty function has an ambiguous type:
+
+ empty :: Collects e ce => ce
+
+By "ambiguous" we mean that there is a type variable e that appears on the left
+of the => symbol, but not on the right. The problem with
+this is that, according to the theoretical foundations of Haskell overloading,
+we cannot guarantee a well-defined semantics for any term with an ambiguous
+type.
+
+
+We can sidestep this specific problem by removing the empty member from the
+class declaration. However, although the remaining members, insert and member,
+do not have ambiguous types, we still run into problems when we try to use
+them. For example, consider the following two functions:
+
+ f x y = insert x . insert y
+ g = f True 'a'
+
+for which GHC infers the following types:
+
+ f :: (Collects a c, Collects b c) => a -> b -> c -> c
+ g :: (Collects Bool c, Collects Char c) => c -> c
+
+Notice that the type for f allows the two parameters x and y to be assigned
+different types, even though it attempts to insert each of the two values, one
+after the other, into the same collection. If we're trying to model collections
+that contain only one type of value, then this is clearly an inaccurate
+type. Worse still, the definition for g is accepted, without causing a type
+error. As a result, the error in this code will not be flagged at the point
+where it appears. Instead, it will show up only when we try to use g, which
+might even be in a different module.
+
+An attempt to use constructor classes
+
+Faced with the problems described above, some Haskell programmers might be
+tempted to use something like the following version of the class declaration:
+
+ class Collects e c where
+ empty :: c e
+ insert :: e -> c e -> c e
+ member :: e -> c e -> Bool
+
+The key difference here is that we abstract over the type constructor c that is
+used to form the collection type c e, and not over that collection type itself,
+represented by ce in the original class declaration. This avoids the immediate
+problems that we mentioned above: empty has type Collects e c => c
+e, which is not ambiguous.
+
+
+The function f from the previous section has a more accurate type:
+
+ f :: (Collects e c) => e -> e -> c e -> c e
+
+The function g from the previous section is now rejected with a type error as
+we would hope because the type of f does not allow the two arguments to have
+different types.
+This, then, is an example of a multiple parameter class that does actually work
+quite well in practice, without ambiguity problems.
+There is, however, a catch. This version of the Collects class is nowhere near
+as general as the original class seemed to be: only one of the four instances
+for Collects
+given above can be used with this version of Collects because only one of
+them---the instance for lists---has a collection type that can be written in
+the form c e, for some type constructor c, and element type e.
+
+
+
+Adding functional dependencies
+
+
+To get a more useful version of the Collects class, Hugs provides a mechanism
+that allows programmers to specify dependencies between the parameters of a
+multiple parameter class (For readers with an interest in theoretical
+foundations and previous work: The use of dependency information can be seen
+both as a generalization of the proposal for `parametric type classes' that was
+put forward by Chen, Hudak, and Odersky, or as a special case of Mark Jones's
+later framework for "improvement" of qualified types. The
+underlying ideas are also discussed in a more theoretical and abstract setting
+in a manuscript [implparam], where they are identified as one point in a
+general design space for systems of implicit parameterization.).
+
+To start with an abstract example, consider a declaration such as:
+
+ class C a b where ...
+
+which tells us simply that C can be thought of as a binary relation on types
+(or type constructors, depending on the kinds of a and b). Extra clauses can be
+included in the definition of classes to add information about dependencies
+between parameters, as in the following examples:
+
+ class D a b | a -> b where ...
+ class E a b | a -> b, b -> a where ...
+
+The notation a -> b used here between the | and where
+symbols --- not to be
+confused with a function type --- indicates that the a parameter uniquely
+determines the b parameter, and might be read as "a determines b." Thus D is
+not just a relation, but actually a (partial) function. Similarly, from the two
+dependencies that are included in the definition of E, we can see that E
+represents a (partial) one-one mapping between types.
+
+
+More generally, dependencies take the form x1 ... xn -> y1 ... ym,
+where x1, ..., xn, and y1, ..., yn are type variables with n>0 and
+m>=0, meaning that the y parameters are uniquely determined by the x
+parameters. Spaces can be used as separators if more than one variable appears
+on any single side of a dependency, as in t -> a b. Note that a class may be
+annotated with multiple dependencies using commas as separators, as in the
+definition of E above. Some dependencies that we can write in this notation are
+redundant, and will be rejected because they don't serve any useful
+purpose, and may instead indicate an error in the program. Examples of
+dependencies like this include a -> a ,
+a -> a a ,
+a -> , etc. There can also be
+some redundancy if multiple dependencies are given, as in
+a->b,
+ b->c , a->c , and
+in which some subset implies the remaining dependencies. Examples like this are
+not treated as errors. Note that dependencies appear only in class
+declarations, and not in any other part of the language. In particular, the
+syntax for instance declarations, class constraints, and types is completely
+unchanged.
+
+
+By including dependencies in a class declaration, we provide a mechanism for
+the programmer to specify each multiple parameter class more precisely. The
+compiler, on the other hand, is responsible for ensuring that the set of
+instances that are in scope at any given point in the program is consistent
+with any declared dependencies. For example, the following pair of instance
+declarations cannot appear together in the same scope because they violate the
+dependency for D, even though either one on its own would be acceptable:
+
+ instance D Bool Int where ...
+ instance D Bool Char where ...
+
+Note also that the following declaration is not allowed, even by itself:
+
+ instance D [a] b where ...
+
+The problem here is that this instance would allow one particular choice of [a]
+to be associated with more than one choice for b, which contradicts the
+dependency specified in the definition of D. More generally, this means that,
+in any instance of the form:
+
+ instance D t s where ...
+
+for some particular types t and s, the only variables that can appear in s are
+the ones that appear in t, and hence, if the type t is known, then s will be
+uniquely determined.
+
+
+The benefit of including dependency information is that it allows us to define
+more general multiple parameter classes, without ambiguity problems, and with
+the benefit of more accurate types. To illustrate this, we return to the
+collection class example, and annotate the original definition of Collects
+with a simple dependency:
+
+ class Collects e ce | ce -> e where
+ empty :: ce
+ insert :: e -> ce -> ce
+ member :: e -> ce -> Bool
+
+The dependency ce -> e here specifies that the type e of elements is uniquely
+determined by the type of the collection ce. Note that both parameters of
+Collects are of kind *; there are no constructor classes here. Note too that
+all of the instances of Collects that we gave earlier can be used
+together with this new definition.
+
+
+What about the ambiguity problems that we encountered with the original
+definition? The empty function still has type Collects e ce => ce, but it is no
+longer necessary to regard that as an ambiguous type: Although the variable e
+does not appear on the right of the => symbol, the dependency for class
+Collects tells us that it is uniquely determined by ce, which does appear on
+the right of the => symbol. Hence the context in which empty is used can still
+give enough information to determine types for both ce and e, without
+ambiguity. More generally, we need only regard a type as ambiguous if it
+contains a variable on the left of the => that is not uniquely determined
+(either directly or indirectly) by the variables on the right.
+
+
+Dependencies also help to produce more accurate types for user defined
+functions, and hence to provide earlier detection of errors, and less cluttered
+types for programmers to work with. Recall the previous definition for a
+function f:
+
+ f x y = insert x y = insert x . insert y
+
+for which we originally obtained a type:
+
+ f :: (Collects a c, Collects b c) => a -> b -> c -> c
+
+Given the dependency information that we have for Collects, however, we can
+deduce that a and b must be equal because they both appear as the second
+parameter in a Collects constraint with the same first parameter c. Hence we
+can infer a shorter and more accurate type for f:
+
+ f :: (Collects a c) => a -> a -> c -> c
+
+In a similar way, the earlier definition of g will now be flagged as a type error.
+
+
+Although we have given only a few examples here, it should be clear that the
+addition of dependency information can help to make multiple parameter classes
+more useful in practice, avoiding ambiguity problems, and allowing more general
+sets of instance declarations.
+
+
+
@@ -1690,22 +1933,38 @@ In Haskell 98 the head of an instance declaration
must be of the form C (T a1 ... an), where
C is the class, T is a type constructor,
and the a1 ... an are distinct type variables.
-Furthermore, the assertions in the context of the instance declaration be of
-the form C a where a is a type variable.
+Furthermore, the assertions in the context of the instance declaration
+must be of the form C a where a
+is a type variable that occurs in the head.
The flag loosens these restrictions
considerably. Firstly, multi-parameter type classes are permitted. Secondly,
the context and head of the instance declaration can each consist of arbitrary
-(well-kinded) assertions (C t1 ... tn) subject only to the following rule:
-for each assertion in the context
+(well-kinded) assertions (C t1 ... tn) subject only to the
+following rules:
+
+
+For each assertion in the context:
No type variable has more occurrences in the assertion than in the head
-Tthe assertion has fewer constructors and variables (taken together
+The assertion has fewer constructors and variables (taken together
and counting repetitions) than the head
+
+
+The coverage condition. For each functional dependency,
+tvsleft->
+tvsright, of the class,
+every type variable in
+S(tvsright) must appear in
+S(tvsleft), where S is the
+substitution mapping each type variable in the class declaration to the
+corresponding type in the instance declaration.
+
+
These restrictions ensure that context reduction terminates: each reduction
-step makes the problem smaller
+step makes the problem smaller by at least one
constructor. For example, the following would make the type checker
loop if it wasn't excluded:
@@ -1731,34 +1990,52 @@ For example, these are OK:
But these are not:
- instance C a => C a where ...
-- Context assertion no smaller than head
- instance C b b => Foo [b] where ...
+ instance C a => C a where ...
-- (C b b) has more more occurrences of b than the head
+ instance C b b => Foo [b] where ...
+
+
+
+
+The same restrictions apply to instances generated by
+deriving clauses. Thus the following is accepted:
+
+ data MinHeap h a = H a (h a)
+ deriving (Show)
+
+because the derived instance
+
+ instance (Show a, Show (h a)) => Show (MinHeap h a)
+conforms to the above rules.
-A couple of useful idioms are these. First,
-if one allows overlapping instance declarations then it's quite
+A useful idiom permitted by the above rules is as follows.
+If one allows overlapping instance declarations then it's quite
convenient to have a "default instance" declaration that applies if
something more specific does not:
instance C a where
op = ... -- Default
+
+
-Second, sometimes you might want to use the following to get the
-effect of a "class synonym":
-
+
+Undecidable instances
+
+Sometimes even the rules of are too onerous.
+For example, sometimes you might want to use the following to get the
+effect of a "class synonym":
class (C1 a, C2 a, C3 a) => C a where { }
instance (C1 a, C2 a, C3 a) => C a where { }
This allows you to write shorter signatures:
-
f :: C a => ...
@@ -1766,16 +2043,46 @@ instead of
f :: (C1 a, C2 a, C3 a) => ...
-
-
+The restrictions on functional dependencies () are particularly troublesome.
+It is tempting to introduce type variables in the context that do not appear in
+the head, something that is excluded by the normal rules. For example:
+
+ class HasConverter a b | a -> b where
+ convert :: a -> b
+
+ data Foo a = MkFoo a
-
-Undecidable instances
+ instance (HasConverter a b,Show b) => Show (Foo a) where
+ show (MkFoo value) = show (convert value)
+
+This is dangerous territory, however. Here, for example, is a program that would make the
+typechecker loop:
+
+ class D a
+ class F a b | a->b
+ instance F [a] [[a]]
+ instance (D c, F a c) => D [a] -- 'c' is not mentioned in the head
+
+Similarly, it can be tempting to lift the coverage condition:
+
+ class Mul a b c | a b -> c where
+ (.*.) :: a -> b -> c
+ instance Mul Int Int Int where (.*.) = (*)
+ instance Mul Int Float Float where x .*. y = fromIntegral x * y
+ instance Mul a b c => Mul a [b] [c] where x .*. v = map (x.*.) v
+
+The third instance declaration does not obey the coverage condition;
+and indeed the (somewhat strange) definition:
+
+ f = \ b x y -> if b then x .*. [y] else y
+
+makes instance inference go into a loop, because it requires the constraint
+(Mul a [b] b).
+
-Sometimes even the rules of are too onerous.
-Voluminous correspondence on the Haskell mailing list has convinced me
-that it's worth experimenting with more liberal rules. If you use
+Nevertheless, GHC allows you to experiment with more liberal rules. If you use
the experimental flag
-fallow-undecidable-instances
option, you can use arbitrary
@@ -1784,10 +2091,7 @@ fixed-depth recursion stack. If you exceed the stack depth you get a
sort of backtrace, and the opportunity to increase the stack depth
with N.
-
-I'm on the lookout for a less brutal solution: a simple rule that preserves decidability while
-allowing these idioms interesting idioms.
-
+