X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=blobdiff_plain;f=docs%2Fusers_guide%2Fglasgow_exts.xml;h=0f37953d5dcde24079a9c8c0daf8864f65a89b7b;hp=b7d43c9f3fd80a77247c11acec0527835e47182b;hb=b2524b3960999fffdb3767900f58825903f6560f;hpb=661c97c65e5fa47177502e592bb763f752b487ac diff --git a/docs/users_guide/glasgow_exts.xml b/docs/users_guide/glasgow_exts.xml index b7d43c9..0f37953 100644 --- a/docs/users_guide/glasgow_exts.xml +++ b/docs/users_guide/glasgow_exts.xml @@ -3,8 +3,9 @@ language, GHC extensions, GHC As with all known Haskell systems, GHC implements some extensions to -the language. They are all enabled by options; by default GHC -understands only plain Haskell 98. +the language. They can all be enabled or disabled by commandline flags +or language pragmas. By default GHC understands the most recent Haskell +version it supports, plus a handful of extensions. @@ -39,8 +40,7 @@ documentation describes all the libraries that come with GHC. The language option flags control what variation of the language are - permitted. Leaving out all of them gives you standard Haskell - 98. + permitted. Language options can be controlled in two ways: @@ -56,38 +56,7 @@ documentation describes all the libraries that come with GHC. The flag is equivalent to enabling the following extensions: - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - , - . + &what_glasgow_exts_does; Enabling these options is the only effect of . We are trying to move away from this portmanteau flag, @@ -470,10 +439,10 @@ Indeed, the bindings can even be recursive. 'x'# has type Char# "foo"# has type Addr# 3# has type Int#. In general, - any Haskell 98 integer lexeme followed by a # is an Int# literal, e.g. + any Haskell integer lexeme followed by a # is an Int# literal, e.g. -0x3A# as well as 32#. 3## has type Word#. In general, - any non-negative Haskell 98 integer lexeme followed by ## + any non-negative Haskell integer lexeme followed by ## is a Word#. 3.2# has type Float#. 3.2## has type Double# @@ -481,43 +450,6 @@ Indeed, the bindings can even be recursive. - - New qualified operator syntax - - A new syntax for referencing qualified operators is - planned to be introduced by Haskell', and is enabled in GHC - with - the - option. In the new syntax, the prefix form of a qualified - operator is - written module.(symbol) - (in Haskell 98 this would - be (module.symbol)), - and the infix form is - written `module.(symbol)` - (in Haskell 98 this would - be `module.symbol`. - For example: - - add x y = Prelude.(+) x y - subtract y = (`Prelude.(-)` y) - - The new form of qualified operators is intended to regularise - the syntax by eliminating odd cases - like Prelude... For example, - when NewQualifiedOperators is on, it is possible to - write the enumerated sequence [Monday..] - without spaces, whereas in Haskell 98 this would be a - reference to the operator ‘.‘ - from module Monday. - - When is on, the old Haskell - 98 syntax for qualified operators is not accepted, so this - option may cause existing Haskell 98 code to break. - - - - @@ -818,10 +750,8 @@ let {(x -> y) = e1 ; (y -> x) = e2 } in x -(We may lift this -restriction in the future; the only cost is that type checking patterns -would get a little more complicated.) - +(For some amplification on this design choice see +Trac #4061.) @@ -1271,6 +1201,234 @@ output = [ x + + + + Monad comprehensions + monad comprehensions + + + Monad comprehesions generalise the list comprehension notation, + including parallel comprehensions + () and + transform comprenensions () + to work for any monad. + + + Monad comprehensions support: + + + + + Bindings: + + + +[ x + y | x <- Just 1, y <- Just 2 ] + + + + Bindings are translated with the (>>=) and + return functions to the usual do-notation: + + + +do x <- Just 1 + y <- Just 2 + return (x+y) + + + + + + Guards: + + + +[ x | x <- [1..10], x <= 5 ] + + + + Guards are translated with the guard function, + which requires a MonadPlus instance: + + + +do x <- [1..10] + guard (x <= 5) + return x + + + + + + Transform statements (as with -XTransformListComp): + + + +[ x+y | x <- [1..10], y <- [1..x], then take 2 ] + + + + This translates to: + + + +do (x,y) <- take 2 (do x <- [1..10] + y <- [1..x] + return (x,y)) + return (x+y) + + + + + + Group statements (as with -XTransformListComp): + + + +[ x | x <- [1,1,2,2,3], then group by x ] +[ x | x <- [1,1,2,2,3], then group by x using GHC.Exts.groupWith ] +[ x | x <- [1,1,2,2,3], then group using myGroup ] + + + + The basic then group by e statement is + translated using the mgroupWith function, which + requires a MonadGroup instance, defined in + Control.Monad.Group: + + + +do x <- mgroupWith (do x <- [1,1,2,2,3] + return x) + return x + + + + Note that the type of x is changed by the + grouping statement. + + + + The grouping function can also be defined with the + using keyword. + + + + + + Parallel statements (as with -XParallelListComp): + + + +[ (x+y) | x <- [1..10] + | y <- [11..20] + ] + + + + Parallel statements are translated using the + mzip function, which requires a + MonadZip instance defined in + Control.Monad.Zip: + + + +do (x,y) <- mzip (do x <- [1..10] + return x) + (do y <- [11..20] + return y) + return (x+y) + + + + + + + All these features are enabled by default if the + MonadComprehensions extension is enabled. The types + and more detailed examples on how to use comprehensions are explained + in the previous chapters and . In general you just have + to replace the type [a] with the type + Monad m => m a for monad comprehensions. + + + + Note: Even though most of these examples are using the list monad, + monad comprehensions work for any monad. + The base package offers all necessary instances for + lists, which make MonadComprehensions backward + compatible to built-in, transform and parallel list comprehensions. + + More formally, the desugaring is as follows. We write D[ e | Q] +to mean the desugaring of the monad comprehension [ e | Q]: + +Expressions: e +Declarations: d +Lists of qualifiers: Q,R,S + +-- Basic forms +D[ e | ] = return e +D[ e | p <- e, Q ] = e >>= \p -> D[ e | Q ] +D[ e | e, Q ] = guard e >> \p -> D[ e | Q ] +D[ e | let d, Q ] = let d in D[ e | Q ] + +-- Parallel comprehensions (iterate for multiple parallel branches) +D[ e | (Q | R), S ] = mzip D[ Qv | Q ] D[ Rv | R ] >>= \(Qv,Rv) -> D[ e | S ] + +-- Transform comprehensions +D[ e | Q then f, R ] = f D[ Qv | Q ] >>= \Qv -> D[ e | R ] + +D[ e | Q then f by b, R ] = f b D[ Qv | Q ] >>= \Qv -> D[ e | R ] + +D[ e | Q then group using f, R ] = f D[ Qv | Q ] >>= \ys -> + case (fmap selQv1 ys, ..., fmap selQvn ys) of + Qv -> D[ e | R ] + +D[ e | Q then group by b using f, R ] = f b D[ Qv | Q ] >>= \ys -> + case (fmap selQv1 ys, ..., fmap selQvn ys) of + Qv -> D[ e | R ] + +where Qv is the tuple of variables bound by Q (and used subsequently) + selQvi is a selector mapping Qv to the ith component of Qv + +Operator Standard binding Expected type +-------------------------------------------------------------------- +return GHC.Base t1 -> m t2 +(>>=) GHC.Base m1 t1 -> (t2 -> m2 t3) -> m3 t3 +(>>) GHC.Base m1 t1 -> m2 t2 -> m3 t3 +guard Control.Monad t1 -> m t2 +fmap GHC.Base forall a b. (a->b) -> n a -> n b +mgroupWith Control.Monad.Group forall a. (a -> t) -> m1 a -> m2 (n a) +mzip Control.Monad.Zip forall a b. m a -> m b -> m (a,b) + +The comprehension should typecheck when its desugaring would typecheck. + + +Monad comprehensions support rebindable syntax (). +Without rebindable +syntax, the operators from the "standard binding" module are used; with +rebindable syntax, the operators are looked up in the current lexical scope. +For example, parallel comprehensions will be typechecked and desugared +using whatever "mzip" is in scope. + + +The rebindable operators must have the "Expected type" given in the +table above. These types are surprisingly general. For example, you can +use a bind operator with the type + +(>>=) :: T x y a -> (a -> T y z b) -> T x z b + +In the case of transform comprehensions, notice that the groups are +parameterised over some arbitrary type n (provided it +has an fmap, as well as +the comprehension being over an arbitrary monad. + + + @@ -1291,8 +1449,8 @@ output = [ x hierarchy. It completely defeats that purpose if the literal "1" means "Prelude.fromInteger 1", which is what the Haskell Report specifies. - So the - flag also causes + So the + flag causes the following pieces of built-in syntax to refer to whatever is in scope, not the Prelude versions: @@ -1324,6 +1482,11 @@ output = [ x + Conditionals (e.g. "if e1 then e2 else e3") + means "ifThenElse e1 e2 e3". However case expressions are unaffected. + + + "Do" notation is translated using whatever functions (>>=), (>>), and fail, @@ -1343,6 +1506,9 @@ output = [ x to use this, ask! + implies . + + In all cases (apart from arrow notation), the static semantics should be that of the desugared form, even if that is a little unexpected. For example, the static semantics of the literal 368 @@ -1896,6 +2062,26 @@ not * then an explicit kind annotation must be used Nevertheless, they can be useful when defining "phantom types". + +Data type contexts + +Haskell allows datatypes to be given contexts, e.g. + + +data Eq a => Set a = NilSet | ConsSet a (Set a) + + +give constructors with types: + + +NilSet :: Set a +ConsSet :: Eq a => a -> Set a -> Set a + + +In GHC this feature is an extension called +DatatypeContexts, and on by default. + + Infix type constructors, classes, and type variables @@ -2476,7 +2662,8 @@ declarations. Define your own instances! Declaring data types with explicit constructor signatures -GHC allows you to declare an algebraic data type by +When the GADTSyntax extension is enabled, +GHC allows you to declare an algebraic data type by giving the type signatures of constructors explicitly. For example: data Maybe a where @@ -3025,6 +3212,12 @@ then writing the data type instance by hand. + With , you can derive +instances of the class Generic, defined in +GHC.Generics. You can use these to define generic functions, +as described in . + + With , you can derive instances of the class Functor, defined in GHC.Base. @@ -3346,6 +3539,47 @@ GHC lifts this restriction (flag ). + + + +Default signatures + + +Haskell 98 allows you to define a default implementation when declaring a class: + + class Enum a where + enum :: [a] + enum = [] + +The type of the enum method is [a], and +this is also the type of the default method. You can lift this restriction +and give another type to the default method using the flag +. For instance, if you have written a +generic implementation of enumeration in a class GEnum +with method genum in terms of GHC.Generics, +you can specify a default method that uses that generic implementation: + + class Enum a where + enum :: [a] + default enum :: (Generic a, GEnum (Rep a)) => [a] + enum = map to genum + +We reuse the keyword default to signal that a signature +applies to the default method only; when defining instances of the +Enum class, the original type [a] of +enum still applies. When giving an empty instance, however, +the default implementation map to0 genum is filled-in, +and type-checked with the type +(Generic a, GEnum (Rep a)) => [a]. + + + +We use default signatures to simplify generic programming in GHC +(). + + + + @@ -4046,18 +4280,21 @@ The willingness to be overlapped or incoherent is a property of the instance declaration itself, controlled by the presence or otherwise of the and flags when that module is -being defined. Neither flag is required in a module that imports and uses the -instance declaration. Specifically, during the lookup process: +being defined. Specifically, during the lookup process: -An instance declaration is ignored during the lookup process if (a) a more specific -match is found, and (b) the instance declaration was compiled with -. The flag setting for the -more-specific instance does not matter. +If the constraint being looked up matches two instance declarations IA and IB, +and + +IB is a substitution instance of IA (but not vice versa); +that is, IB is strictly more specific than IA +either IA or IB was compiled with + +then the less-specific instance IA is ignored. Suppose an instance declaration does not match the constraint being looked up, but -does unify with it, so that it might match when the constraint is further +does unify with it, so that it might match when the constraint is further instantiated. Usually GHC will regard this as a reason for not committing to some other constraint. But if the instance declaration was compiled with , GHC will skip the "does-it-unify?" @@ -4067,18 +4304,6 @@ check for that declaration. These rules make it possible for a library author to design a library that relies on overlapping instances without the library client having to know. - -If an instance declaration is compiled without -, -then that instance can never be overlapped. This could perhaps be -inconvenient. Perhaps the rule should instead say that the -overlapping instance declaration should be compiled in -this way, rather than the overlapped one. Perhaps overlap -at a usage site should be permitted regardless of how the instance declarations -are compiled, if the flag is -used at the usage site. (Mind you, the exact usage site can occasionally be -hard to pin down.) We are interested to receive feedback on these points. - The flag implies the flag, but not vice versa. @@ -5776,9 +6001,6 @@ for rank-2 types. Impredicative polymorphism -NOTE: the impredicative-polymorphism feature is deprecated in GHC 6.12, and -will be removed or replaced in GHC 6.14. - GHC supports impredicative polymorphism, enabled with . This means @@ -5901,7 +6123,7 @@ signature is explicit. For example: g (x:xs) = xs ++ [ x :: a ] This program will be rejected, because "a" does not scope -over the definition of "f", so "x::a" +over the definition of "g", so "x::a" means "x::forall a. a" by Haskell's usual implicit quantification rules. @@ -5937,7 +6159,7 @@ type variables, in the annotated expression. For example: f = runST ( (op >>= \(x :: STRef s Int) -> g x) :: forall s. ST s Bool ) -Here, the type signature forall a. ST s Bool brings the +Here, the type signature forall s. ST s Bool brings the type variable s into scope, in the annotated expression (op >>= \(x :: STRef s Int) -> g x). @@ -7205,16 +7427,23 @@ forces evaluation anyway does nothing. There is one (apparent) exception to this general rule that a bang only makes a difference when it precedes a variable or wild-card: a bang at the top level of a let or where -binding makes the binding strict, regardless of the pattern. For example: +binding makes the binding strict, regardless of the pattern. +(We say "apparent" exception because the Right Way to think of it is that the bang +at the top of a binding is not part of the pattern; rather it +is part of the syntax of the binding, +creating a "bang-pattern binding".) +For example: let ![x,y] = e in b -is a strict binding: operationally, it evaluates e, matches -it against the pattern [x,y], and then evaluates b. -(We say "apparent" exception because the Right Way to think of it is that the bang -at the top of a binding is not part of the pattern; rather it -is part of the syntax of the binding.) -Nested bangs in a pattern binding behave uniformly with all other forms of +is a bang-pattern binding. Operationally, it behaves just like a case expression: + +case e of [x,y] -> b + +Like a case expression, a bang-pattern binding must be non-recursive, and +is monomorphic. + +However, nested bangs in a pattern binding behave uniformly with all other forms of pattern matching. For example let (!x,[y]) = e in b @@ -7477,7 +7706,7 @@ Assertion failures can be caught, see the documentation for the A list of all supported language extensions can be obtained by invoking - ghc --supported-languages (see ). + ghc --supported-extensions (see ). Any extension from the Extension type defined in String -> (Bool, Double) function "f" has a number of other effects: -No functions are inlined into f. Otherwise -GHC might inline a big function into f's right hand side, -making f big; and then inline f blindly. +While GHC is keen to inline the function, it does not do so +blindly. For example, if you write + +map key_function xs + +there really isn't any point in inlining key_function to get + +map (\x -> body) xs + +In general, GHC only inlines the function if there is some reason (no matter +how slight) to supose that it is useful to do so. + -The float-in, float-out, and common-sub-expression transformations are not -applied to the body of f. +Moreover, GHC will only inline the function if it is fully applied, +where "fully applied" +means applied to as many arguments as appear (syntactically) +on the LHS of the function +definition. For example: + +comp1 :: (b -> c) -> (a -> b) -> a -> c +{-# INLINE comp1 #-} +comp1 f g = \x -> f (g x) + +comp2 :: (b -> c) -> (a -> b) -> a -> c +{-# INLINE comp2 #-} +comp2 f g x = f (g x) + +The two functions comp1 and comp2 have the +same semantics, but comp1 will be inlined when applied +to two arguments, while comp2 requires +three. This might make a big difference if you say + +map (not `comp1` not) xs + +which will optimise better than the corresponding use of `comp2`. + + + +It is useful for GHC to optimise the definition of an +INLINE function f just like any other non-INLINE function, +in case the non-inlined version of f is +ultimately called. But we don't want to inline +the optimised version +of f; +a major reason for INLINE pragmas is to expose functions +in f's RHS that have +rewrite rules, and it's no good if those functions have been optimised +away. + + +So GHC guarantees to inline precisely the code that you wrote, no more +and no less. It does this by capturing a copy of the definition of the function to use +for inlining (we call this the "inline-RHS"), which it leaves untouched, +while optimising the ordinarly RHS as usual. For externally-visible functions +the inline-RHS (not the optimised RHS) is recorded in the interface file. An INLINE function is not worker/wrappered by strictness analysis. It's going to be inlined wholesale instead. -All of these effects are aimed at ensuring that what gets inlined is -exactly what you asked for, no more and no less. GHC ensures that inlining cannot go on forever: every mutually-recursive group is cut by one or more loop breakers that is never inlined @@ -7649,8 +7925,9 @@ itself, so an INLINE pragma is always ignored. {-# INLINE returnUs #-} - See also the NOINLINE pragma (). + See also the NOINLINE () + and INLINABLE () + pragmas. Note: the HBC compiler doesn't like INLINE pragmas, so if you want your code to be HBC-compatible you'll have to surround @@ -7659,6 +7936,57 @@ itself, so an INLINE pragma is always ignored. + + INLINABLE pragma + +An {-# INLINABLE f #-} pragma on a +function f has the following behaviour: + + +While INLINE says "please inline me", the INLINABLE +says "feel free to inline me; use your +discretion". In other words the choice is left to GHC, which uses the same +rules as for pragma-free functions. Unlike INLINE, that decision is made at +the call site, and +will therefore be affected by the inlining threshold, optimisation level etc. + + +Like INLINE, the INLINABLE pragma retains a +copy of the original RHS for +inlining purposes, and persists it in the interface file, regardless of +the size of the RHS. + + + +One way to use INLINABLE is in conjunction with +the special function inline (). +The call inline f tries very hard to inline f. +To make sure that f can be inlined, +it is a good idea to mark the definition +of f as INLINABLE, +so that GHC guarantees to expose an unfolding regardless of how big it is. +Moreover, by annotating f as INLINABLE, +you ensure that f's original RHS is inlined, rather than +whatever random optimised version of f GHC's optimiser +has produced. + + + +The INLINABLE pragma also works with SPECIALISE: +if you mark function f as INLINABLE, then +you can subsequently SPECIALISE in another module +(see ). + + +Unlike INLINE, it is OK to use +an INLINABLE pragma on a recursive function. +The principal reason do to so to allow later use of SPECIALISE + + + + + + NOINLINE pragma @@ -7915,6 +8243,9 @@ RULE with a somewhat-complex left-hand side (try it yourself), so it might not f well. If you use this kind of specialisation, let us know how well it works. + + SPECIALIZE INLINE + A SPECIALIZE pragma can optionally be followed with a INLINE or NOINLINE pragma, optionally followed by a phase, as described in . @@ -7943,6 +8274,66 @@ specialisation, whose body is also inlined. The result is a type-based unrolling of the indexing function. Warning: you can make GHC diverge by using SPECIALISE INLINE on an ordinarily-recursive function. + + +SPECIALIZE for imported functions + + +Generally, you can only give a SPECIALIZE pragma +for a function defined in the same module. +However if a function f is given an INLINABLE +pragma at its definition site, then it can subequently be specialised by +importing modules (see ). +For example + +module Map( lookup, blah blah ) where + lookup :: Ord key => [(key,a)] -> key -> Maybe a + lookup = ... + {-# INLINABLE lookup #-} + +module Client where + import Map( lookup ) + + data T = T1 | T2 deriving( Eq, Ord ) + {-# SPECIALISE lookup :: [(T,a)] -> T -> Maybe a + +Here, lookup is declared INLINABLE, but +it cannot be specialised for type T at its definition site, +because that type does not exist yet. Instead a client module can define T +and then specialise lookup at that type. + + +Moreover, every module that imports Client (or imports a module +that imports Client, transitively) will "see", and make use of, +the specialised version of lookup. You don't need to put +a SPECIALIZE pragma in every module. + + +Moreover you often don't even need the SPECIALIZE pragma in the +first place. When compiling a module M, +GHC's optimiser (with -O) automatically considers each top-level +overloaded function declared in M, and specialises it +for the different types at which it is called in M. The optimiser +also considers each imported +INLINABLE overloaded function, and specialises it +for the different types at which it is called in M. +So in our example, it would be enough for lookup to +be called at type T: + +module Client where + import Map( lookup ) + + data T = T1 | T2 deriving( Eq, Ord ) + + findT1 :: [(T,a)] -> Maybe a + findT1 m = lookup m T1 -- A call of lookup at type T + +However, sometimes there are no such calls, in which case the +pragma can be useful. + + + +Obselete SPECIALIZE syntax Note: In earlier versions of GHC, it was possible to provide your own specialised function for a given type: @@ -7953,6 +8344,7 @@ on an ordinarily-recursive function. This feature has been removed, as it is now subsumed by the RULES pragma (see ). + @@ -8037,10 +8429,6 @@ data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int directly in the T constructor. The unpacker can see through newtypes, too. - If a field cannot be unpacked, you will not get a warning, - so it might be an idea to check the generated code with - . - See also the flag, which essentially has the effect of adding {-# UNPACK #-} to every strict @@ -8081,7 +8469,7 @@ Here is an example: Use the debug flag to see what rules fired. If you need more information, then shows you -each individual rule firing in detail. +each individual rule firing and also shows what the code looks like before and after the rewrite. @@ -8666,7 +9054,8 @@ If you add you get a more detailed listing. - Use to see in great detail what rules are being fired. + Use or +to see in great detail what rules are being fired. If you add you get a still more detailed listing. @@ -8773,7 +9162,23 @@ r) -> GHC has a few built-in functions with special behaviour. These are now described in the module GHC.Prim -in the library documentation. +in the library documentation. +In particular: + + +inline +allows control over inlining on a per-call-site basis. + + +lazy +restrains the strictness analyser. + + +unsafeCoerce# +allows you to fool the type checker. + + + @@ -8781,255 +9186,185 @@ in the library documentation. Generic classes -The ideas behind this extension are described in detail in "Derivable type classes", -Ralf Hinze and Simon Peyton Jones, Haskell Workshop, Montreal Sept 2000, pp94-105. -An example will give the idea: +GHC used to have an implementation of generic classes as defined in the paper +"Derivable type classes", Ralf Hinze and Simon Peyton Jones, Haskell Workshop, +Montreal Sept 2000, pp94-105. These have been removed and replaced by the more +general support for generic programming. - - import Generics - - class Bin a where - toBin :: a -> [Int] - fromBin :: [Int] -> (a, [Int]) - - toBin {| Unit |} Unit = [] - toBin {| a :+: b |} (Inl x) = 0 : toBin x - toBin {| a :+: b |} (Inr y) = 1 : toBin y - toBin {| a :*: b |} (x :*: y) = toBin x ++ toBin y - - fromBin {| Unit |} bs = (Unit, bs) - fromBin {| a :+: b |} (0:bs) = (Inl x, bs') where (x,bs') = fromBin bs - fromBin {| a :+: b |} (1:bs) = (Inr y, bs') where (y,bs') = fromBin bs - fromBin {| a :*: b |} bs = (x :*: y, bs'') where (x,bs' ) = fromBin bs - (y,bs'') = fromBin bs' - - -This class declaration explains how toBin and fromBin -work for arbitrary data types. They do so by giving cases for unit, product, and sum, -which are defined thus in the library module Generics: - - - data Unit = Unit - data a :+: b = Inl a | Inr b - data a :*: b = a :*: b - - -Now you can make a data type into an instance of Bin like this: - - instance (Bin a, Bin b) => Bin (a,b) - instance Bin a => Bin [a] - -That is, just leave off the "where" clause. Of course, you can put in the -where clause and over-ride whichever methods you please. - + - - Using generics - To use generics you need to - - - Use the flags (to enable the extra syntax), - (to generate extra per-data-type code), - and (to make the Generics library - available. - - - Import the module Generics from the - lang package. This import brings into - scope the data types Unit, - :*:, and :+:. (You - don't need this import if you don't mention these types - explicitly; for example, if you are simply giving instance - declarations.) - - - - Changes wrt the paper - -Note that the type constructors :+: and :*: -can be written infix (indeed, you can now use -any operator starting in a colon as an infix type constructor). Also note that -the type constructors are not exactly as in the paper (Unit instead of 1, etc). -Finally, note that the syntax of the type patterns in the class declaration -uses "{|" and "|}" brackets; curly braces -alone would ambiguous when they appear on right hand sides (an extension we -anticipate wanting). - - + +Generic programming - Terminology and restrictions -Terminology. A "generic default method" in a class declaration -is one that is defined using type patterns as above. -A "polymorphic default method" is a default method defined as in Haskell 98. -A "generic class declaration" is a class declaration with at least one -generic default method. +Using a combination of +() and + (), +you can easily do datatype-generic +programming using the GHC.Generics framework. This section +gives a very brief overview of how to do it. For more detail please refer to the +HaskellWiki page +or the original paper: - -Restrictions: -Alas, we do not yet implement the stuff about constructor names and -field labels. +José Pedro Magalhães, Atze Dijkstra, Johan Jeuring, and Andres Löh. + + A generic deriving mechanism for Haskell. +Proceedings of the third ACM Haskell symposium on Haskell +(Haskell'2010), pp. 37-48, ACM, 2010. + - - -A generic class can have only one parameter; you can't have a generic -multi-parameter class. - - +Note: the current support for generic programming in GHC +is preliminary. In particular, we only allow deriving instances for the +Generic class. Support for deriving +Generic1 (and thus enabling generic functions of kind +* -> * such as fmap) will come at a +later stage. - - -A default method must be defined entirely using type patterns, or entirely -without. So this is illegal: - - class Foo a where - op :: a -> (a, Bool) - op {| Unit |} Unit = (Unit, True) - op x = (x, False) - -However it is perfectly OK for some methods of a generic class to have -generic default methods and others to have polymorphic default methods. - - - - -The type variable(s) in the type pattern for a generic method declaration -scope over the right hand side. So this is legal (note the use of the type variable ``p'' in a type signature on the right hand side: - - class Foo a where - op :: a -> Bool - op {| p :*: q |} (x :*: y) = op (x :: p) - ... - - - + +Deriving representations - -The type patterns in a generic default method must take one of the forms: - - a :+: b - a :*: b - Unit - -where "a" and "b" are type variables. Furthermore, all the type patterns for -a single type constructor (:*:, say) must be identical; they -must use the same type variables. So this is illegal: +The first thing we need is generic representations. The +GHC.Generics module defines a couple of primitive types +that can be used to represent most Haskell datatypes: + - class Foo a where - op :: a -> Bool - op {| a :+: b |} (Inl x) = True - op {| p :+: q |} (Inr y) = False +-- | Unit: used for constructors without arguments +data U1 p = U1 + +-- | Constants, additional parameters and recursion of kind * +newtype K1 i c p = K1 { unK1 :: c } + +-- | Meta-information (constructor names, etc.) +newtype M1 i c f p = M1 { unM1 :: f p } + +-- | Sums: encode choice between constructors +infixr 5 :+: +data (:+:) f g p = L1 (f p) | R1 (g p) + +-- | Products: encode multiple arguments to constructors +infixr 6 :*: +data (:*:) f g p = f p :*: g p + + +For example, a user-defined datatype of trees data UserTree a = Node a +(UserTree a) (UserTree a) | Leaf gets the following representation: + + +instance Generic (UserTree a) where + -- Representation type + type Rep (UserTree a) = + M1 D D1UserTree ( + M1 C C1_0UserTree ( + M1 S NoSelector (K1 P a) + :*: M1 S NoSelector (K1 R (UserTree a)) + :*: M1 S NoSelector (K1 R (UserTree a))) + :+: M1 C C1_1UserTree U1) + + -- Conversion functions + from (Node x l r) = M1 (L1 (M1 (M1 (K1 x) :*: M1 (K1 l) :*: M1 (K1 r)))) + from Leaf = M1 (R1 (M1 U1)) + to (M1 (L1 (M1 (M1 (K1 x) :*: M1 (K1 l) :*: M1 (K1 r))))) = Node x l r + to (M1 (R1 (M1 U1))) = Leaf + +-- Meta-information +data D1UserTree +data C1_0UserTree +data C1_1UserTree + +instance Datatype D1UserTree where + datatypeName _ = "UserTree" + moduleName _ = "Main" + +instance Constructor C1_0UserTree where + conName _ = "Node" + +instance Constructor C1_1UserTree where + conName _ = "Leaf" -The type patterns must be identical, even in equations for different methods of the class. -So this too is illegal: - - class Foo a where - op1 :: a -> Bool - op1 {| a :*: b |} (x :*: y) = True - op2 :: a -> Bool - op2 {| p :*: q |} (x :*: y) = False - -(The reason for this restriction is that we gather all the equations for a particular type constructor -into a single generic instance declaration.) +This representation is generated automatically if a +deriving Generic clause is attached to the datatype. +Standalone deriving can also be +used. - + - - -A generic method declaration must give a case for each of the three type constructors. - - + +Writing generic functions - -The type for a generic method can be built only from: - - Function arrows - Type variables - Tuples - Arbitrary types not involving type variables - -Here are some example type signatures for generic methods: +A generic function is defined by creating a class and giving instances for +each of the representation types of GHC.Generics. As an +example we show generic serialization: - op1 :: a -> Bool - op2 :: Bool -> (a,Bool) - op3 :: [Int] -> a -> a - op4 :: [a] -> Bool - -Here, op1, op2, op3 are OK, but op4 is rejected, because it has a type variable -inside a list. - - -This restriction is an implementation restriction: we just haven't got around to -implementing the necessary bidirectional maps over arbitrary type constructors. -It would be relatively easy to add specific type constructors, such as Maybe and list, -to the ones that are allowed. - +data Bin = O | I - - -In an instance declaration for a generic class, the idea is that the compiler -will fill in the methods for you, based on the generic templates. However it can only -do so if - - - - The instance type is simple (a type constructor applied to type variables, as in Haskell 98). - - - - - No constructor of the instance type has unboxed fields. - - - -(Of course, these things can only arise if you are already using GHC extensions.) -However, you can still give an instance declarations for types which break these rules, -provided you give explicit code to override any generic default methods. - - +class GSerialize f where + gput :: f a -> [Bin] - - +instance GSerialize U1 where + gput U1 = [] - -The option dumps incomprehensible stuff giving details of -what the compiler does with generic declarations. - +instance (GSerialize a, GSerialize b) => GSerialize (a :*: b) where + gput (a :*: b) = gput a ++ gput b + +instance (GSerialize a, GSerialize b) => GSerialize (a :+: b) where + gput (L1 x) = O : gput x + gput (R1 x) = I : gput x +instance (GSerialize a) => GSerialize (M1 i c a) where + gput (M1 x) = gput x + +instance (Serialize a) => GSerialize (K1 i c a) where + gput (K1 x) = put x + + +Typically this class will not be exported, as it only makes sense to have +instances for the representation types. + - Another example + +Generic defaults + -Just to finish with, here's another example I rather like: +The only thing left to do now is to define a "front-end" class, which is +exposed to the user: - class Tag a where - nCons :: a -> Int - nCons {| Unit |} _ = 1 - nCons {| a :*: b |} _ = 1 - nCons {| a :+: b |} _ = nCons (bot::a) + nCons (bot::b) +class Serialize a where + put :: a -> [Bin] - tag :: a -> Int - tag {| Unit |} _ = 1 - tag {| a :*: b |} _ = 1 - tag {| a :+: b |} (Inl x) = tag x - tag {| a :+: b |} (Inr y) = nCons (bot::a) + tag y + default put :: (Generic a, GSerialize (Rep a)) => a -> [Bit] + put = gput . from + +Here we use a default signature +to specify that the user does not have to provide an implementation for +put, as long as there is a Generic +instance for the type to instantiate. For the UserTree type, +for instance, the user can just write: + + +instance (Serialize a) => Serialize (UserTree a) + +The default method for put is then used, corresponding to the +generic implementation of serialization. + + Control over monomorphism @@ -9079,7 +9414,6 @@ standard behaviour.