X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=blobdiff_plain;f=docs%2Fusers_guide%2Fglasgow_exts.xml;h=5d1b5cf5f053bf3a3435740f51ee0a7a1c88d3f0;hp=f22e6c9edf85ebae04c672517f5e254c3cbbf966;hb=432b9c9322181a3644083e3c19b7e240d90659e7;hpb=f3399c446c7507d46d6cc550aa2fe7027dbc1b5b diff --git a/docs/users_guide/glasgow_exts.xml b/docs/users_guide/glasgow_exts.xml index f22e6c9..5d1b5cf 100644 --- a/docs/users_guide/glasgow_exts.xml +++ b/docs/users_guide/glasgow_exts.xml @@ -38,277 +38,68 @@ documentation describes all the libraries that come with GHC. extensionsoptions controlling - The language option flag control what variation of the language are + The language option flags control what variation of the language are permitted. Leaving out all of them gives you standard Haskell 98. - Generally speaking, all the language options are introduced by "", - e.g. . - - - All the language options can be turned off by using the prefix ""; - e.g. "". - - Language options recognised by Cabal can also be enabled using the LANGUAGE pragma, - thus {-# LANGUAGE TemplateHaskell #-} (see >). - - Turning on an option that enables special syntax - might cause working Haskell 98 code to fail - to compile, perhaps because it uses a variable name which has - become a reserved word. So, together with each option below, we - list the special syntax which is enabled by this option. We use - notation and nonterminal names from the Haskell 98 lexical syntax - (see the Haskell 98 Report). There are two classes of special - syntax: - + Language options can be controlled in two ways: - - New reserved words and symbols: character sequences - which are no longer available for use as identifiers in the - program. - - - Other special syntax: sequences of characters that have - a different meaning when this particular option is turned - on. - - - - We are only listing syntax changes here that might affect - existing working programs (i.e. "stolen" syntax). Many of these - extensions will also enable new context-free syntax, but in all - cases programs written to use the new syntax would not be - compilable without the option enabled. - - - - - - : + Every language option can switched on by a command-line flag "" + (e.g. ), and switched off by the flag ""; + (e.g. ). + + Language options recognised by Cabal can also be enabled using the LANGUAGE pragma, + thus {-# LANGUAGE TemplateHaskell #-} (see ). + + + + The flag - - - This simultaneously enables all of the extensions to - Haskell 98 described in , except where otherwise - noted. We are trying to move away from this portmanteau flag, - and towards enabling features individaully. - - New reserved words: forall (only in - types), mdo. - - Other syntax stolen: - varid{#}, - char#, - string#, - integer#, - float#, - float##, - (#, #), - |), {|. - - Implies these specific language options: - , - , - , - , - . - - - - - - : - - - - This option enables the language extension defined in the - Haskell 98 Foreign Function Interface Addendum. - - New reserved words: foreign. - - - - - - ,: - - - These two flags control how generalisation is done. - See . - - - - - - - : - - - - Use GHCi's extended default rules in a regular module (). - Independent of the - flag. - - - - - - - - - - - - - - - - - - - - - - See . Only relevant - if you also use . - - - - - - - - - - See . Only relevant if - you also use . - - - - - - - - - - See . Independent of - . - - New reserved words/symbols: rec, - proc, -<, - >-, -<<, - >>-. - - Other syntax stolen: (|, - |). - - - - - - - - - - See . Independent of - . - - - - - - - -XNoImplicitPrelude - option GHC normally imports - Prelude.hi files for you. If you'd - rather it didn't, then give it a - option. The idea is - that you can then import a Prelude of your own. (But don't - call it Prelude; the Haskell module - namespace is flat, and you must not conflict with any - Prelude module.) - - Even though you have not imported the Prelude, most of - the built-in syntax still refers to the built-in Haskell - Prelude types and values, as specified by the Haskell - Report. For example, the type [Int] - still means Prelude.[] Int; tuples - continue to refer to the standard Prelude tuples; the - translation for list comprehensions continues to use - Prelude.map etc. - - However, does - change the handling of certain built-in syntax: see . - - - - - - - Enables implicit parameters (see ). Currently also implied by - . - - Syntax stolen: - ?varid, - %varid. - - - - - - - Enables overloaded string literals (see ). - - - - - - - Enables lexically-scoped type variables (see ). Implied by - . - - - - - - - Enables Template Haskell (see ). This flag must - be given explicitly; it is no longer implied by - . - - Syntax stolen: [|, - [e|, [p|, - [d|, [t|, - $(, - $varid. - - - - - - - Enables quasiquotation (see ). - - Syntax stolen: - [:varid|. - - + is equivalent to enabling the following extensions: + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + . + Enabling these options is the only + effect of . + We are trying to move away from this portmanteau flag, + and towards enabling features individually. - Unboxed types and primitive operations -GHC is built on a raft of primitive data types and operations. +GHC is built on a raft of primitive data types and operations; +"primitive" in the sense that they cannot be defined in Haskell itself. While you really can use this stuff to write fast code, we generally find it a lot less painful, and more satisfying in the long run, to use higher-level language features and libraries. With @@ -316,28 +107,21 @@ While you really can use this stuff to write fast code, unboxed version in any case. And if it isn't, we'd like to know about it. -We do not currently have good, up-to-date documentation about the -primitives, perhaps because they are mainly intended for internal use. -There used to be a long section about them here in the User Guide, but it -became out of date, and wrong information is worse than none. - -The Real Truth about what primitive types there are, and what operations -work over those types, is held in the file -fptools/ghc/compiler/prelude/primops.txt.pp. -This file is used directly to generate GHC's primitive-operation definitions, so -it is always correct! It is also intended for processing into text. - - Indeed, -the result of such processing is part of the description of the - External - Core language. -So that document is a good place to look for a type-set version. -We would be very happy if someone wanted to volunteer to produce an SGML -back end to the program that processes primops.txt so that -we could include the results here in the User Guide. - -What follows here is a brief summary of some main points. +All these primitive data types and operations are exported by the +library GHC.Prim, for which there is +detailed online documentation. +(This documentation is generated from the file compiler/prelude/primops.txt.pp.) + + +If you want to mention any of the primitive data types or operations in your +program, you must first import GHC.Prim to bring them +into scope. Many of them have names ending in "#", and to mention such +names you need the extension (). + + +The primops make extensive use of unboxed types +and unboxed tuples, which +we briefly summarise here. Unboxed types @@ -370,8 +154,11 @@ know and love—usually one instruction. Primitive (unboxed) types cannot be defined in Haskell, and are therefore built into the language and compiler. Primitive types are always unlifted; that is, a value of a primitive type cannot be -bottom. We use the convention that primitive types, values, and -operations have a <literal>#</literal> suffix. +bottom. We use the convention (but it is only a convention) +that primitive types, values, and +operations have a <literal>#</literal> suffix (see <xref linkend="magic-hash"/>). +For some primitive types we have special syntax for literals, also +described in the <link linkend="magic-hash">same section</link>. </para> <para> @@ -423,22 +210,20 @@ in a <emphasis>top-level</emphasis> binding. in a <emphasis>recursive</emphasis> binding. </para></listitem> <listitem><para> You may bind unboxed variables in a (non-recursive, -non-top-level) pattern binding, but any such variable causes the entire -pattern-match -to become strict. For example: +non-top-level) pattern binding, but you must make any such pattern-match +strict. For example, rather than: <programlisting> data Foo = Foo Int Int# f x = let (Foo a b, w) = ..rhs.. in ..body.. </programlisting> -Since <literal>b</literal> has type <literal>Int#</literal>, the entire pattern -match -is strict, and the program behaves as if you had written +you must write: <programlisting> data Foo = Foo Int Int# - f x = case ..rhs.. of { (Foo a b, w) -> ..body.. } + f x = let !(Foo a b, w) = ..rhs.. in ..body.. </programlisting> +since <literal>b</literal> has type <literal>Int#</literal>. </para> </listitem> </itemizedlist> @@ -548,8 +333,146 @@ Indeed, the bindings can even be recursive. <sect1 id="syntax-extns"> <title>Syntactic extensions + + Unicode syntax + The language + extension + enables Unicode characters to be used to stand for certain ASCII + character sequences. The following alternatives are provided: + + + + + + ASCII + Unicode alternative + Code point + Name + + + + + :: + :: + 0x2237 + PROPORTION + + + + + => + + 0x21D2 + RIGHTWARDS DOUBLE ARROW + + + + + forall + + 0x2200 + FOR ALL + + + + + -> + + 0x2192 + RIGHTWARDS ARROW + + + + + <- + + 0x2190 + LEFTWARDS ARROW + + + + + .. + + 0x22EF + MIDLINE HORIZONTAL ELLIPSIS + + + + + + + + The magic hash + The language extension allows "#" as a + postfix modifier to identifiers. Thus, "x#" is a valid variable, and "T#" is + a valid type constructor or data constructor. + + The hash sign does not change sematics at all. We tend to use variable + names ending in "#" for unboxed values or types (e.g. Int#), + but there is no requirement to do so; they are just plain ordinary variables. + Nor does the extension bring anything into scope. + For example, to bring Int# into scope you must + import GHC.Prim (see ); + the extension + then allows you to refer to the Int# + that is now in scope. + The also enables some new forms of literals (see ): + + 'x'# has type Char# + "foo"# has type Addr# + 3# has type Int#. In general, + any Haskell 98 integer lexeme followed by a # is an Int# literal, e.g. + -0x3A# as well as 32#. + 3## has type Word#. In general, + any non-negative Haskell 98 integer lexeme followed by ## + is a Word#. + 3.2# has type Float#. + 3.2## has type Double# + + + + + + New qualified operator syntax + + A new syntax for referencing qualified operators is + planned to be introduced by Haskell', and is enabled in GHC + with + the + option. In the new syntax, the prefix form of a qualified + operator is + written module.(symbol) + (in Haskell 98 this would + be (module.symbol)), + and the infix form is + written `module.(symbol)` + (in Haskell 98 this would + be `module.symbol`. + For example: + + add x y = Prelude.(+) x y + subtract y = (`Prelude.(-)` y) + + The new form of qualified operators is intended to regularise + the syntax by eliminating odd cases + like Prelude... For example, + when NewQualifiedOperators is on, it is possible to + write the enumerated sequence [Monday..] + without spaces, whereas in Haskell 98 this would be a + reference to the operator ‘.‘ + from module Monday. + + When is on, the old Haskell + 98 syntax for qualified operators is not accepted, so this + option may cause existing Haskell 98 code to break. + + + + + Hierarchical Modules @@ -753,7 +676,7 @@ view :: Type -> TypeView The representation of Typ is held abstract, permitting implementations -to use a fancy representation (e.g., hash-consing to managage sharing). +to use a fancy representation (e.g., hash-consing to manage sharing). Without view patterns, using this signature a little inconvenient: @@ -1004,11 +927,6 @@ and improve termination (Section 3.2 of the paper). -The web page: http://www.cse.ogi.edu/PacSoft/projects/rmb/ -contains up to date information on recursive monadic bindings. - - - Historical note: The old implementation of the mdo-notation (and most of the existing documents) used the name MonadRec for the class and the corresponding library. @@ -1088,6 +1006,7 @@ This name is not supported by GHC. paper Comprehensive comprehensions: comprehensions with "order by" and "group by", except that the syntax we use differs slightly from the paper. +The extension is enabled with the flag . Here is an example: employees = [ ("Simon", "MS", 80) @@ -1113,7 +1032,7 @@ In this example, the list output would take on (The function sortWith is not a keyword; it is an ordinary function that is exported by GHC.Exts.) -There are five new forms of compehension qualifier, +There are five new forms of comprehension qualifier, all introduced by the (existing) keyword then: @@ -1160,7 +1079,7 @@ then group by e using f is a function supplied to f by the compiler which lets it compute e on every element of the list being transformed. However, unlike the non-grouping case, f additionally partitions the list into a number of sublists: this means that - at every point after this statement, binders occuring before it in the comprehension + at every point after this statement, binders occurring before it in the comprehension refer to lists of possible values, not single values. To help understand this, let's look at an example: @@ -1233,21 +1152,28 @@ output = [ x -Rebindable syntax - - GHC allows most kinds of built-in syntax to be rebound by - the user, to facilitate replacing the Prelude - with a home-grown version, for example. - - You may want to define your own numeric class +Rebindable syntax and the implicit Prelude import + + -XNoImplicitPrelude + option GHC normally imports + Prelude.hi files for you. If you'd + rather it didn't, then give it a + option. The idea is + that you can then import a Prelude of your own. (But don't + call it Prelude; the Haskell module + namespace is flat, and you must not conflict with any + Prelude module.) + + Suppose you are importing a Prelude of your own + in order to define your own numeric class hierarchy. It completely defeats that purpose if the literal "1" means "Prelude.fromInteger 1", which is what the Haskell Report specifies. - So the flag causes + So the + flag also causes the following pieces of built-in syntax to refer to whatever is in scope, not the Prelude versions: - An integer literal 368 means @@ -1296,7 +1222,7 @@ output = [ x In all cases (apart from arrow notation), the static semantics should be that of the desugared form, -even if that is a little unexpected. For emample, the +even if that is a little unexpected. For example, the static semantics of the literal 368 is exactly that of fromInteger (368::Integer); it's fine for fromInteger to have any of the types: @@ -1319,8 +1245,9 @@ fromInteger :: Integer -> Bool -> Bool Postfix operators -GHC allows a small extension to the syntax of left operator sections, which -allows you to define postfix operators. The extension is this: the left section + The flag enables a small +extension to the syntax of left operator sections, which allows you to +define postfix operators. The extension is this: the left section (e !) @@ -1337,10 +1264,6 @@ That is, the operator must be a function of two arguments. GHC allows it to take only one argument, and that in turn allows you to write the function postfix. -Since this extension goes beyond Haskell 98, it should really be enabled -by a flag; but in fact it is enabled all the time. (No Haskell 98 programs -change their behaviour, of course.) - The extension does not extend to the left-hand side of function definitions; you must define such a function in prefix form. @@ -1378,7 +1301,7 @@ it is not clear which of the two types is intended. Haskell 98 regards all four as ambiguous, but with the - flag, GHC will accept + flag, GHC will accept the former two. The rules are precisely the same as those for instance declarations in Haskell 98, where the method names on the left-hand side of the method bindings in an instance declaration refer unambiguously @@ -1396,7 +1319,7 @@ records from different modules that use the same field name. -Record puns are enabled by the flag -XRecordPuns. +Record puns are enabled by the flag -XNamedFieldPuns. @@ -1537,11 +1460,12 @@ For example, in a let, it applies in the right-hand sides of other let-bindings and the body of the letC. Or, in recursive do expressions (), the local fixity -declarations of aA let statement scope over other +declarations of a let statement scope over other statements in the group, just as the bound name does. -Moreover, a local fixity declatation *must* accompany a local binding of + +Moreover, a local fixity declaration *must* accompany a local binding of that name: it is not possible to revise the fixity of name bound elsewhere, as in @@ -1550,8 +1474,170 @@ let infixr 9 $ in ... Because local fixity declarations are technically Haskell 98, no flag is necessary to enable them. + + + + + Package-qualified imports + + With the flag, GHC allows + import declarations to be qualified by the package name that the + module is intended to be imported from. For example: + + +import "network" Network.Socket + + + would import the module Network.Socket from + the package network (any version). This may + be used to disambiguate an import when the same module is + available from multiple packages, or is present in both the + current package being built and an external package. + + Note: you probably don't need to use this feature, it was + added mainly so that we can build backwards-compatible versions of + packages when APIs change. It can lead to fragile dependencies in + the common case: modules occasionally move from one package to + another, rendering any package-qualified imports broken. + +Summary of stolen syntax + + Turning on an option that enables special syntax + might cause working Haskell 98 code to fail + to compile, perhaps because it uses a variable name which has + become a reserved word. This section lists the syntax that is + "stolen" by language extensions. + We use + notation and nonterminal names from the Haskell 98 lexical syntax + (see the Haskell 98 Report). + We only list syntax changes here that might affect + existing working programs (i.e. "stolen" syntax). Many of these + extensions will also enable new context-free syntax, but in all + cases programs written to use the new syntax would not be + compilable without the option enabled. + +There are two classes of special + syntax: + + + + New reserved words and symbols: character sequences + which are no longer available for use as identifiers in the + program. + + + Other special syntax: sequences of characters that have + a different meaning when this particular option is turned + on. + + + +The following syntax is stolen: + + + + + forall + forall + + + Stolen (in types) by: , + , + , + , + , + + + + + + + mdo + mdo + + + Stolen by: , + + + + + + foreign + foreign + + + Stolen by: , + + + + + + rec, + proc, -<, + >-, -<<, + >>-, and (|, + |) brackets + proc + + + Stolen by: , + + + + + + ?varid, + %varid + implicit parameters + + + Stolen by: , + + + + + + [|, + [e|, [p|, + [d|, [t|, + $(, + $varid + Template Haskell + + + Stolen by: , + + + + + + [:varid| + quasi-quotation + + + Stolen by: , + + + + + + varid{#}, + char#, + string#, + integer#, + float#, + float##, + (#, #), + + + Stolen by: , + + + + + @@ -1652,9 +1738,12 @@ to be written infix, very much like expressions. More specifically: Liberalised type synonyms -Type synonyms are like macros at the type level, and +Type synonyms are like macros at the type level, but Haskell 98 imposes many rules +on individual synonym declarations. +With the extension, GHC does validity checking on types only after expanding type synonyms. -That means that GHC can be very much more liberal about type synonyms than Haskell 98: +That means that GHC can be very much more liberal about type synonyms than Haskell 98. + You can write a forall (including overloading) in a type synonym, thus: @@ -1671,7 +1760,8 @@ in a type synonym, thus: -You can write an unboxed tuple in a type synonym: +If you also use , +you can write an unboxed tuple in a type synonym: type Pr = (# Int, Int #) @@ -1829,7 +1919,7 @@ apply fn to val to get a boolean. For e -What this allows us to do is to package heterogenous values +What this allows us to do is to package heterogeneous values together with a bunch of functions that manipulate them, and then treat that collection of packages in a uniform manner. You can express quite a bit of object-oriented-like programming this way. @@ -1969,15 +2059,28 @@ main = do display (inc (inc counterB)) -- prints "##" -At the moment, record update syntax is only supported for Haskell 98 data types, -so the following function does not work: - +Record update syntax is supported for existentials (and GADTs): --- This is invalid; use explicit NewCounter instead for now setTag :: Counter a -> a -> Counter a setTag obj t = obj{ tag = t } +The rule for record update is this: +the types of the updated fields may +mention only the universally-quantified type variables +of the data constructor. For GADTs, the field may mention only types +that appear as a simple type-variable argument in the constructor's result +type. For example: + +data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existential +upd1 t x = t { f1=x } -- OK: upd1 :: T a b -> a' -> T a' b +upd2 t x = t { f3=x } -- BAD (f3's type mentions c, which is + -- existentially quantified) +data G a b where { G1 { g1::a, g2::c } :: G a [c] } +upd3 g x = g { g1=x } -- OK: upd3 :: G a b -> c -> G c b +upd4 g x = g { g2=x } -- BAD (f2's type mentions c, which is not a simple + -- type-variable argument in G1's result type) + @@ -2210,9 +2313,9 @@ like this: data NumInst a = Num a => MkNumInst (NumInst a) -Notice that, unlike the situation when declaring an existental, there is +Notice that, unlike the situation when declaring an existential, there is no forall, because the Num constrains the -data type's univerally quantified type variable a. +data type's universally quantified type variable a. A constructor may have both universal and existential type variables: for example, the following two declarations are equivalent: @@ -2249,16 +2352,46 @@ otherwise is a generalised data type ( +As with other type signatures, you can give a single signature for several data constructors. +In this example we give a single signature for T1 and T2: + + data T a where + T1,T2 :: a -> T a + T3 :: T a + + + + The type signature of each constructor is independent, and is implicitly universally quantified as usual. -Different constructors may have different universally-quantified type variables -and different type-class constraints. -For example, this is fine: +In particular, the type variable(s) in the "data T a where" header +have no scope, and different constructors may have different universally-quantified type variables: + + data T a where -- The 'a' has no scope + T1,T2 :: b -> T b -- Means forall b. b -> T b + T3 :: T a -- Means forall a. T a + + + + +A constructor signature may mention type class constraints, which can differ for +different constructors. For example, this is fine: data T a where - T1 :: Eq b => b -> T b + T1 :: Eq b => b -> b -> T b T2 :: (Show c, Ix c) => c -> [c] -> T c +When patten matching, these constraints are made available to discharge constraints +in the body of the match. For example: + + f :: T a -> String + f (T1 x y) | x==y = "yes" + | otherwise = "no" + f (T2 a b) = show a + +Note that f is not overloaded; the Eq constraint arising +from the use of == is discharged by the pattern match on T1 +and similarly the Show constraint arising from the use of show. @@ -2270,12 +2403,12 @@ have no scope. Indeed, one can write a kind signature instead: or even a mixture of the two: - data Foo a :: (* -> *) -> * where ... + data Bar a :: (* -> *) -> * where ... The type variables (if given) may be explicitly kinded, so we could also write the header for Foo like this: - data Foo a (b :: * -> *) where ... + data Bar a (b :: * -> *) where ... @@ -2306,27 +2439,48 @@ declaration. For example, these two declarations are equivalent +The type signature may have quantified type variables that do not appear +in the result type: + + data Foo where + MkFoo :: a -> (a->Bool) -> Foo + Nil :: Foo + +Here the type variable a does not appear in the result type +of either constructor. +Although it is universally quantified in the type of the constructor, such +a type variable is often called "existential". +Indeed, the above declaration declares precisely the same type as +the data Foo in . + +The type may contain a class context too, of course: + + data Showable where + MkShowable :: Show a => a -> Showable + + + + You can use record syntax on a GADT-style data type declaration: data Person where - Adult { name :: String, children :: [Person] } :: Person - Child { name :: String } :: Person + Adult :: { name :: String, children :: [Person] } -> Person + Child :: Show a => { name :: !String, funny :: a } -> Person As usual, for every constructor that has a field f, the type of field f must be the same (modulo alpha conversion). - - -At the moment, record updates are not yet possible with GADT-style declarations, -so support is limited to record construction, selection and pattern matching. -For example - - aPerson = Adult { name = "Fred", children = [] } +The Child constructor above shows that the signature +may have a context, existentially-quantified variables, and strictness annotations, +just as in the non-record case. (NB: the "type" that follows the double-colon +is not really a type, because of the record syntax and strictness annotations. +A "type" of this form can appear only in a constructor signature.) + - shortName :: Person -> Bool - hasChildren (Adult { children = kids }) = not (null kids) - hasChildren (Child {}) = False - + +Record updates are allowed with GADT-style declarations, +only fields that have the following property: the type of the field +mentions no existential type variables. @@ -2420,11 +2574,17 @@ The result type of each constructor must begin with the type constructor being d but for a GADT the arguments to the type constructor can be arbitrary monotypes. For example, in the Term data type above, the type of each constructor must end with Term ty, but -the ty may not be a type variable (e.g. the Lit +the ty need not be a type variable (e.g. the Lit constructor). +It's is permitted to declare an ordinary algebraic data type using GADT-style syntax. +What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors +whose result type is not just T a b. + + + You cannot use a deriving clause for a GADT; only for an ordinary data type. @@ -2460,6 +2620,23 @@ their selector functions actually have different types: + +When pattern-matching against data constructors drawn from a GADT, +for example in a case expression, the following rules apply: + +The type of the scrutinee must be rigid. +The type of the entire case expression must be rigid. +The type of any free variable mentioned in any of +the case alternatives must be rigid. + +A type is "rigid" if it is completely known to the compiler at its binding site. The easiest +way to ensure that a variable a rigid type is to give it a type signature. +For more precise details see +Simple unification-based type inference for GADTs +. The criteria implemented by GHC are given in the Appendix. + + + @@ -2517,9 +2694,27 @@ The syntax is identical to that of an ordinary instance declaration apart from ( You must supply a context (in the example the context is (Eq a)), exactly as you would in an ordinary instance declaration. (In contrast the context is inferred in a deriving clause -attached to a data type declaration.) These deriving instance -rules obey the same rules concerning form and termination as ordinary instance declarations, -controlled by the same flags; see . +attached to a data type declaration.) + +A deriving instance declaration +must obey the same rules concerning form and termination as ordinary instance declarations, +controlled by the same flags; see . + + +Unlike a deriving +declaration attached to a data declaration, the instance can be more specific +than the data type (assuming you also use +-XFlexibleInstances, ). Consider +for example + + data Foo a = Bar a | Baz String + + deriving instance Eq a => Eq (Foo [a]) + deriving instance Eq a => Eq (Foo (Maybe a)) + +This will generate a derived instance for (Foo [a]) and (Foo (Maybe a)), +but other types such as (Foo (Int,Bool)) will not be an instance of Eq. + The stand-alone syntax is generalised for newtypes in exactly the same way that ordinary deriving clauses are generalised (). @@ -2537,7 +2732,7 @@ GHC always treats the last parameter of the instance -Deriving clause for classes <literal>Typeable</literal> and <literal>Data</literal> +Deriving clause for extra classes (<literal>Typeable</literal>, <literal>Data</literal>, etc) Haskell 98 allows the programmer to add "deriving( Eq, Ord )" to a data type @@ -2547,11 +2742,11 @@ classes Eq, Ord, Enum, Ix, Bounded, Read, and Show. -GHC extends this list with two more classes that may be automatically derived -(provided the flag is specified): -Typeable, and Data. These classes are defined in the library -modules Data.Typeable and Data.Generics respectively, and the -appropriate class must be in scope before it can be mentioned in the deriving clause. +GHC extends this list with several more classes that may be automatically derived: + + With , you can derive instances of the classes +Typeable, and Data, defined in the library +modules Data.Typeable and Data.Generics respectively. An instance of Typeable can only be derived if the data type has seven or fewer type parameters, all of kind *. @@ -2567,6 +2762,26 @@ In other cases, there is nothing to stop the programmer writing a Typab class, whose kind suits that of the data type constructor, and then writing the data type instance by hand. + + + With , you can derive instances of +the class Functor, +defined in GHC.Base. + + + With , you can derive instances of +the class Foldable, +defined in Data.Foldable. + + + With , you can derive instances of +the class Traversable, +defined in Data.Traversable. + + +In each case the appropriate class must be in scope before it +can be mentioned in the deriving clause. + @@ -3185,9 +3400,6 @@ sets of instance declarations. Instance declarations - -Relaxed rules for instance declarations - An instance declaration has the form instance ( assertion1, ..., assertionn) => class type1 ... typem where ... @@ -3197,19 +3409,73 @@ The part before the "=>" is the "=>" is the head of the instance declaration. + +Relaxed rules for the instance head + In Haskell 98 the head of an instance declaration must be of the form C (T a1 ... an), where -C is the class, T is a type constructor, +C is the class, T is a data type constructor, and the a1 ... an are distinct type variables. -Furthermore, the assertions in the context of the instance declaration +GHC relaxes these rules in two ways. + + + +The flag allows the head of the instance +declaration to mention arbitrary nested types. +For example, this becomes a legal instance declaration + + instance C (Maybe Int) where ... + +See also the rules on overlap. + + +With the flag, instance heads may use type +synonyms. As always, using a type synonym is just shorthand for +writing the RHS of the type synonym definition. For example: + + + + type Point = (Int,Int) + instance C Point where ... + instance C [Point] where ... + + + +is legal. However, if you added + + + + instance C (Int,Int) where ... + + + +as well, then the compiler will complain about the overlapping +(actually, identical) instance declarations. As always, type synonyms +must be fully applied. You cannot, for example, write: + + + type P a = [[a]] + instance Monad P where ... + + + + + + + + +Relaxed rules for instance contexts + +In Haskell 98, the assertions in the context of the instance declaration must be of the form C a where a is a type variable that occurs in the head. + -The flag loosens these restrictions -considerably. Firstly, multi-parameter type classes are permitted. Secondly, -the context and head of the instance declaration can each consist of arbitrary +The flag relaxes this rule, as well +as the corresponding rule for type signatures (see ). +With this flag the context of the instance declaration can each consist of arbitrary (well-kinded) assertions (C t1 ... tn) subject only to the following rules: @@ -3235,7 +3501,7 @@ corresponding type in the instance declaration. These restrictions ensure that context reduction terminates: each reduction step makes the problem smaller by at least one constructor. Both the Paterson Conditions and the Coverage Condition are lifted -if you give the +if you give the flag (). You can find lots of background material about the reason for these restrictions in the paper f, so GHC had to check that f has the specified type. Suppose instead we do not give a type signature, asking GHC to infer it instead. In this case, GHC will refrain from -simplifying the constraint C Int [Int] (for the same reason +simplifying the constraint C Int [b] (for the same reason as before) but, rather than rejecting the program, it will infer the type - f :: C Int b => [b] -> [b] + f :: C Int [b] => [b] -> [b] That postpones the question of which instance to pick to the call site for f by which time more is known about the type b. +You can write this type signature yourself if you use the + +flag. + + +Exactly the same situation can arise in instance declarations themselves. Suppose we have + + class Foo a where + f :: a -> a + instance Foo [b] where + f x = ... + +and, as before, the constraint C Int [b] arises from f's +right hand side. GHC will reject the instance, complaining as before that it does not know how to resolve +the constraint C Int [b], because it matches more than one instance +declaration. The solution is to postpone the choice by adding the constraint to the context +of the instance declaration, thus: + + instance C Int [b] => Foo [b] where + f x = ... + +(You need to do this.) The willingness to be overlapped or incoherent is a property of @@ -3477,57 +3765,16 @@ inconvenient. Perhaps the rule should instead say that the this way, rather than the overlapped one. Perhaps overlap at a usage site should be permitted regardless of how the instance declarations are compiled, if the flag is -used at the usage site. (Mind you, the exact usage site can occasionally be -hard to pin down.) We are interested to receive feedback on these points. - -The flag implies the - flag, but not vice versa. - - - - -Type synonyms in the instance head - - -Unlike Haskell 98, instance heads may use type -synonyms. (The instance "head" is the bit after the "=>" in an instance decl.) -As always, using a type synonym is just shorthand for -writing the RHS of the type synonym definition. For example: - - - - type Point = (Int,Int) - instance C Point where ... - instance C [Point] where ... - - - -is legal. However, if you added - - - - instance C (Int,Int) where ... - - - -as well, then the compiler will complain about the overlapping -(actually, identical) instance declarations. As always, type synonyms -must be fully applied. You cannot, for example, write: - - - - type P a = [[a]] - instance Monad P where ... - - - -This design decision is independent of all the others, and easily -reversed, but it makes sense to me. - +used at the usage site. (Mind you, the exact usage site can occasionally be +hard to pin down.) We are interested to receive feedback on these points. + +The flag implies the + flag, but not vice versa. + @@ -3607,6 +3854,706 @@ to work since it gets translated into an equality comparison. + +Type families + + + Indexed type families are a new GHC extension to + facilitate type-level + programming. Type families are a generalisation of associated + data types + (“Associated + Types with Class”, M. Chakravarty, G. Keller, S. Peyton Jones, + and S. Marlow. In Proceedings of “The 32nd Annual ACM SIGPLAN-SIGACT + Symposium on Principles of Programming Languages (POPL'05)”, pages + 1-13, ACM Press, 2005) and associated type synonyms + (“Type + Associated Type Synonyms”. M. Chakravarty, G. Keller, and + S. Peyton Jones. + In Proceedings of “The Tenth ACM SIGPLAN International Conference on + Functional Programming”, ACM Press, pages 241-253, 2005). Type families + themselves are described in the paper “Type + Checking with Open Type Functions”, T. Schrijvers, + S. Peyton-Jones, + M. Chakravarty, and M. Sulzmann, in Proceedings of “ICFP 2008: The + 13th ACM SIGPLAN International Conference on Functional + Programming”, ACM Press, pages 51-62, 2008. Type families + essentially provide type-indexed data types and named functions on types, + which are useful for generic programming and highly parameterised library + interfaces as well as interfaces with enhanced static information, much like + dependent types. They might also be regarded as an alternative to functional + dependencies, but provide a more functional style of type-level programming + than the relational style of functional dependencies. + + + Indexed type families, or type families for short, are type constructors that + represent sets of types. Set members are denoted by supplying the type family + constructor with type parameters, which are called type + indices. The + difference between vanilla parametrised type constructors and family + constructors is much like between parametrically polymorphic functions and + (ad-hoc polymorphic) methods of type classes. Parametric polymorphic functions + behave the same at all type instances, whereas class methods can change their + behaviour in dependence on the class type parameters. Similarly, vanilla type + constructors imply the same data representation for all type instances, but + family constructors can have varying representation types for varying type + indices. + + + Indexed type families come in two flavours: data + families and type synonym + families. They are the indexed family variants of algebraic + data types and type synonyms, respectively. The instances of data families + can be data types and newtypes. + + + Type families are enabled by the flag . + Additional information on the use of type families in GHC is available on + the + Haskell wiki page on type families. + + + + Data families + + + Data families appear in two flavours: (1) they can be defined on the + toplevel + or (2) they can appear inside type classes (in which case they are known as + associated types). The former is the more general variant, as it lacks the + requirement for the type-indexes to coincide with the class + parameters. However, the latter can lead to more clearly structured code and + compiler warnings if some type instances were - possibly accidentally - + omitted. In the following, we always discuss the general toplevel form first + and then cover the additional constraints placed on associated types. + + + + Data family declarations + + + Indexed data families are introduced by a signature, such as + +data family GMap k :: * -> * + + The special family distinguishes family from standard + data declarations. The result kind annotation is optional and, as + usual, defaults to * if omitted. An example is + +data family Array e + + Named arguments can also be given explicit kind signatures if needed. + Just as with + [http://www.haskell.org/ghc/docs/latest/html/users_guide/gadt.html GADT + declarations] named arguments are entirely optional, so that we can + declare Array alternatively with + +data family Array :: * -> * + + + + + Associated data family declarations + + When a data family is declared as part of a type class, we drop + the family special. The GMap + declaration takes the following form + +class GMapKey k where + data GMap k :: * -> * + ... + + In contrast to toplevel declarations, named arguments must be used for + all type parameters that are to be used as type-indexes. Moreover, + the argument names must be class parameters. Each class parameter may + only be used at most once per associated type, but some may be omitted + and they may be in an order other than in the class head. Hence, the + following contrived example is admissible: + + class C a b c where + data T c a :: * + + + + + + + Data instance declarations + + + Instance declarations of data and newtype families are very similar to + standard data and newtype declarations. The only two differences are + that the keyword data or newtype + is followed by instance and that some or all of the + type arguments can be non-variable types, but may not contain forall + types or type synonym families. However, data families are generally + allowed in type parameters, and type synonyms are allowed as long as + they are fully applied and expand to a type that is itself admissible - + exactly as this is required for occurrences of type synonyms in class + instance parameters. For example, the Either + instance for GMap is + +data instance GMap (Either a b) v = GMapEither (GMap a v) (GMap b v) + + In this example, the declaration has only one variant. In general, it + can be any number. + + + Data and newtype instance declarations are only permitted when an + appropriate family declaration is in scope - just as a class instance declaratoin + requires the class declaration to be visible. Moreover, each instance + declaration has to conform to the kind determined by its family + declaration. This implies that the number of parameters of an instance + declaration matches the arity determined by the kind of the family. + + + A data family instance declaration can use the full exprssiveness of + ordinary data or newtype declarations: + + Although, a data family is introduced with + the keyword "data", a data family instance can + use either data or newtype. For example: + +data family T a +data instance T Int = T1 Int | T2 Bool +newtype instance T Char = TC Bool + + + A data instance can use GADT syntax for the data constructors, + and indeed can define a GADT. For example: + +data family G a b +data instance G [a] b where + G1 :: c -> G [Int] b + G2 :: G [a] Bool + + + You can use a deriving clause on a + data instance or newtype instance + declaration. + + + + + + Even if type families are defined as toplevel declarations, functions + that perform different computations for different family instances may still + need to be defined as methods of type classes. In particular, the + following is not possible: + +data family T a +data instance T Int = A +data instance T Char = B +foo :: T a -> Int +foo A = 1 -- WRONG: These two equations together... +foo B = 2 -- ...will produce a type error. + +Instead, you would have to write foo as a class operation, thus: + +class C a where + foo :: T a -> Int +instance Foo Int where + foo A = 1 +instance Foo Char where + foo B = 2 + + (Given the functionality provided by GADTs (Generalised Algebraic Data + Types), it might seem as if a definition, such as the above, should be + feasible. However, type families are - in contrast to GADTs - are + open; i.e., new instances can always be added, + possibly in other + modules. Supporting pattern matching across different data instances + would require a form of extensible case construct.) + + + + Associated data instances + + When an associated data family instance is declared within a type + class instance, we drop the instance keyword in the + family instance. So, the Either instance + for GMap becomes: + +instance (GMapKey a, GMapKey b) => GMapKey (Either a b) where + data GMap (Either a b) v = GMapEither (GMap a v) (GMap b v) + ... + + The most important point about associated family instances is that the + type indexes corresponding to class parameters must be identical to + the type given in the instance head; here this is the first argument + of GMap, namely Either a b, + which coincides with the only class parameter. Any parameters to the + family constructor that do not correspond to class parameters, need to + be variables in every instance; here this is the + variable v. + + + Instances for an associated family can only appear as part of + instances declarations of the class in which the family was declared - + just as with the equations of the methods of a class. Also in + correspondence to how methods are handled, declarations of associated + types can be omitted in class instances. If an associated family + instance is omitted, the corresponding instance type is not inhabited; + i.e., only diverging expressions, such + as undefined, can assume the type. + + + + + Scoping of class parameters + + In the case of multi-parameter type classes, the visibility of class + parameters in the right-hand side of associated family instances + depends solely on the parameters of the data + family. As an example, consider the simple class declaration + +class C a b where + data T a + + Only one of the two class parameters is a parameter to the data + family. Hence, the following instance declaration is invalid: + +instance C [c] d where + data T [c] = MkT (c, d) -- WRONG!! 'd' is not in scope + + Here, the right-hand side of the data instance mentions the type + variable d that does not occur in its left-hand + side. We cannot admit such data instances as they would compromise + type safety. + + + + + Type class instances of family instances + + Type class instances of instances of data families can be defined as + usual, and in particular data instance declarations can + have deriving clauses. For example, we can write + +data GMap () v = GMapUnit (Maybe v) + deriving Show + + which implicitly defines an instance of the form + +instance Show v => Show (GMap () v) where ... + + + + Note that class instances are always for + particular instances of a data family and never + for an entire family as a whole. This is for essentially the same + reasons that we cannot define a toplevel function that performs + pattern matching on the data constructors + of different instances of a single type family. + It would require a form of extensible case construct. + + + + + Overlap of data instances + + The instance declarations of a data family used in a single program + may not overlap at all, independent of whether they are associated or + not. In contrast to type class instances, this is not only a matter + of consistency, but one of type safety. + + + + + + + Import and export + + + The association of data constructors with type families is more dynamic + than that is the case with standard data and newtype declarations. In + the standard case, the notation T(..) in an import or + export list denotes the type constructor and all the data constructors + introduced in its declaration. However, a family declaration never + introduces any data constructors; instead, data constructors are + introduced by family instances. As a result, which data constructors + are associated with a type family depends on the currently visible + instance declarations for that family. Consequently, an import or + export item of the form T(..) denotes the family + constructor and all currently visible data constructors - in the case of + an export item, these may be either imported or defined in the current + module. The treatment of import and export items that explicitly list + data constructors, such as GMap(GMapEither), is + analogous. + + + + Associated families + + As expected, an import or export item of the + form C(..) denotes all of the class' methods and + associated types. However, when associated types are explicitly + listed as subitems of a class, we need some new syntax, as uppercase + identifiers as subitems are usually data constructors, not type + constructors. To clarify that we denote types here, each associated + type name needs to be prefixed by the keyword type. + So for example, when explicitly listing the components of + the GMapKey class, we write GMapKey(type + GMap, empty, lookup, insert). + + + + + Examples + + Assuming our running GMapKey class example, let us + look at some export lists and their meaning: + + + module GMap (GMapKey) where...: Exports + just the class name. + + + module GMap (GMapKey(..)) where...: + Exports the class, the associated type GMap + and the member + functions empty, lookup, + and insert. None of the data constructors is + exported. + + + module GMap (GMapKey(..), GMap(..)) + where...: As before, but also exports all the data + constructors GMapInt, + GMapChar, + GMapUnit, GMapPair, + and GMapUnit. + + + module GMap (GMapKey(empty, lookup, insert), + GMap(..)) where...: As before. + + + module GMap (GMapKey, empty, lookup, insert, GMap(..)) + where...: As before. + + + + + Finally, you can write GMapKey(type GMap) to denote + both the class GMapKey as well as its associated + type GMap. However, you cannot + write GMapKey(type GMap(..)) — i.e., + sub-component specifications cannot be nested. To + specify GMap's data constructors, you have to list + it separately. + + + + + Instances + + Family instances are implicitly exported, just like class instances. + However, this applies only to the heads of instances, not to the data + constructors an instance defines. + + + + + + + + + Synonym families + + + Type families appear in two flavours: (1) they can be defined on the + toplevel or (2) they can appear inside type classes (in which case they + are known as associated type synonyms). The former is the more general + variant, as it lacks the requirement for the type-indexes to coincide with + the class parameters. However, the latter can lead to more clearly + structured code and compiler warnings if some type instances were - + possibly accidentally - omitted. In the following, we always discuss the + general toplevel form first and then cover the additional constraints + placed on associated types. + + + + Type family declarations + + + Indexed type families are introduced by a signature, such as + +type family Elem c :: * + + The special family distinguishes family from standard + type declarations. The result kind annotation is optional and, as + usual, defaults to * if omitted. An example is + +type family Elem c + + Parameters can also be given explicit kind signatures if needed. We + call the number of parameters in a type family declaration, the family's + arity, and all applications of a type family must be fully saturated + w.r.t. to that arity. This requirement is unlike ordinary type synonyms + and it implies that the kind of a type family is not sufficient to + determine a family's arity, and hence in general, also insufficient to + determine whether a type family application is well formed. As an + example, consider the following declaration: + +type family F a b :: * -> * -- F's arity is 2, + -- although it's overall kind is * -> * -> * -> * + + Given this declaration the following are examples of well-formed and + malformed types: + +F Char [Int] -- OK! Kind: * -> * +F Char [Int] Bool -- OK! Kind: * +F IO Bool -- WRONG: kind mismatch in the first argument +F Bool -- WRONG: unsaturated application + + + + + Associated type family declarations + + When a type family is declared as part of a type class, we drop + the family special. The Elem + declaration takes the following form + +class Collects ce where + type Elem ce :: * + ... + + The argument names of the type family must be class parameters. Each + class parameter may only be used at most once per associated type, but + some may be omitted and they may be in an order other than in the + class head. Hence, the following contrived example is admissible: + +class C a b c where + type T c a :: * + + These rules are exactly as for associated data families. + + + + + + Type instance declarations + + Instance declarations of type families are very similar to standard type + synonym declarations. The only two differences are that the + keyword type is followed + by instance and that some or all of the type + arguments can be non-variable types, but may not contain forall types or + type synonym families. However, data families are generally allowed, and + type synonyms are allowed as long as they are fully applied and expand + to a type that is admissible - these are the exact same requirements as + for data instances. For example, the [e] instance + for Elem is + +type instance Elem [e] = e + + + + Type family instance declarations are only legitimate when an + appropriate family declaration is in scope - just like class instances + require the class declaration to be visible. Moreover, each instance + declaration has to conform to the kind determined by its family + declaration, and the number of type parameters in an instance + declaration must match the number of type parameters in the family + declaration. Finally, the right-hand side of a type instance must be a + monotype (i.e., it may not include foralls) and after the expansion of + all saturated vanilla type synonyms, no synonyms, except family synonyms + may remain. Here are some examples of admissible and illegal type + instances: + +type family F a :: * +type instance F [Int] = Int -- OK! +type instance F String = Char -- OK! +type instance F (F a) = a -- WRONG: type parameter mentions a type family +type instance F (forall a. (a, b)) = b -- WRONG: a forall type appears in a type parameter +type instance F Float = forall a.a -- WRONG: right-hand side may not be a forall type + +type family G a b :: * -> * +type instance G Int = (,) -- WRONG: must be two type parameters +type instance G Int Char Float = Double -- WRONG: must be two type parameters + + + + + Associated type instance declarations + + When an associated family instance is declared within a type class + instance, we drop the instance keyword in the family + instance. So, the [e] instance + for Elem becomes: + +instance (Eq (Elem [e])) => Collects ([e]) where + type Elem [e] = e + ... + + The most important point about associated family instances is that the + type indexes corresponding to class parameters must be identical to the + type given in the instance head; here this is [e], + which coincides with the only class parameter. + + + Instances for an associated family can only appear as part of instances + declarations of the class in which the family was declared - just as + with the equations of the methods of a class. Also in correspondence to + how methods are handled, declarations of associated types can be omitted + in class instances. If an associated family instance is omitted, the + corresponding instance type is not inhabited; i.e., only diverging + expressions, such as undefined, can assume the type. + + + + + Overlap of type synonym instances + + The instance declarations of a type family used in a single program + may only overlap if the right-hand sides of the overlapping instances + coincide for the overlapping types. More formally, two instance + declarations overlap if there is a substitution that makes the + left-hand sides of the instances syntactically the same. Whenever + that is the case, the right-hand sides of the instances must also be + syntactically equal under the same substitution. This condition is + independent of whether the type family is associated or not, and it is + not only a matter of consistency, but one of type safety. + + + Here are two example to illustrate the condition under which overlap + is permitted. + +type instance F (a, Int) = [a] +type instance F (Int, b) = [b] -- overlap permitted + +type instance G (a, Int) = [a] +type instance G (Char, a) = [a] -- ILLEGAL overlap, as [Char] /= [Int] + + + + + + Decidability of type synonym instances + + In order to guarantee that type inference in the presence of type + families decidable, we need to place a number of additional + restrictions on the formation of type instance declarations (c.f., + Definition 5 (Relaxed Conditions) of “Type + Checking with Open Type Functions”). Instance + declarations have the general form + +type instance F t1 .. tn = t + + where we require that for every type family application (G s1 + .. sm) in t, + + + s1 .. sm do not contain any type family + constructors, + + + the total number of symbols (data type constructors and type + variables) in s1 .. sm is strictly smaller than + in t1 .. tn, and + + + for every type + variable a, a occurs + in s1 .. sm at most as often as in t1 + .. tn. + + + These restrictions are easily verified and ensure termination of type + inference. However, they are not sufficient to guarantee completeness + of type inference in the presence of, so called, ''loopy equalities'', + such as a ~ [F a], where a recursive occurrence of + a type variable is underneath a family application and data + constructor application - see the above mentioned paper for details. + + + If the option is passed to the + compiler, the above restrictions are not enforced and it is on the + programmer to ensure termination of the normalisation of type families + during type inference. + + + + + + Equality constraints + + Type context can include equality constraints of the form t1 ~ + t2, which denote that the types t1 + and t2 need to be the same. In the presence of type + families, whether two types are equal cannot generally be decided + locally. Hence, the contexts of function signatures may include + equality constraints, as in the following example: + +sumCollects :: (Collects c1, Collects c2, Elem c1 ~ Elem c2) => c1 -> c2 -> c2 + + where we require that the element type of c1 + and c2 are the same. In general, the + types t1 and t2 of an equality + constraint may be arbitrary monotypes; i.e., they may not contain any + quantifiers, independent of whether higher-rank types are otherwise + enabled. + + + Equality constraints can also appear in class and instance contexts. + The former enable a simple translation of programs using functional + dependencies into programs using family synonyms instead. The general + idea is to rewrite a class declaration of the form + +class C a b | a -> b + + to + +class (F a ~ b) => C a b where + type F a + + That is, we represent every functional dependency (FD) a1 .. an + -> b by an FD type family F a1 .. an and a + superclass context equality F a1 .. an ~ b, + essentially giving a name to the functional dependency. In class + instances, we define the type instances of FD families in accordance + with the class head. Method signatures are not affected by that + process. + + + NB: Equalities in superclass contexts are not fully implemented in + GHC 6.10. + + + + + Type families and instance declarations + Type families require us to extend the rules for + the form of instance heads, which are given + in . + Specifically: + + Data type families may appear in an instance head + Type synonym families may not appear (at all) in an instance head + +The reason for the latter restriction is that there is no way to check for. Consider + + type family F a + type instance F Bool = Int + + class C a + + instance C Int + instance C (F a) + +Now a constraint (C (F Bool)) would match both instances. +The situation is especially bad because the type instance for F Bool +might be in another module, or even in a module that is not yet written. + + + + + + Other type system extensions @@ -3615,9 +4562,11 @@ to work since it gets translated into an equality comparison. The context of a type signature -Unlike Haskell 98, constraints in types do not have to be of -the form (class type-variable) or -(class (type-variable type-variable ...)). Thus, +The flag lifts the Haskell 98 restriction +that the type-class constraints in a type signature must have the +form (class type-variable) or +(class (type-variable type-variable ...)). +With these type signatures are perfectly OK g :: Eq [a] => ... @@ -3648,8 +4597,8 @@ in GHC, you can give the foralls if you want. See a is "reachable" if it appears +in the same constraint as either a type variable free in type, or another reachable type variable. A value with a type that does not obey this reachability restriction cannot be used without introducing @@ -4252,7 +5201,7 @@ it has rank-2 types on the left of a function arrow. GHC has three flags to control higher-rank types: - : data constructors (only) can have polymorphic argment types. + : data constructors (only) can have polymorphic argument types. : any function (including data constructors) can have a rank-2 type. @@ -4547,9 +5496,13 @@ f xs = ys ++ ys ys :: [a] ys = reverse xs -The type signature for f brings the type variable a into scope; it scopes over -the entire definition of f. -In particular, it is in scope at the type signature for ys. +The type signature for f brings the type variable a into scope, +because of the explicit forall (). +The type variables bound by a forall scope over +the entire definition of the accompanying value declaration. +In this example, the type variable a scopes over the whole +definition of f, including over +the type signature for ys. In Haskell 98 it is not possible to declare a type for ys; a major benefit of scoped type variables is that it becomes possible to do so. @@ -4592,7 +5545,7 @@ In Haskell, a programmer-written type signature is implicitly quantified over its free type variables (Section 4.1.2 -of the Haskel Report). +of the Haskell Report). Lexically scoped type variables affect this implicit quantification rules as follows: any type variable that is in scope is not universally quantified. For example, if type variable a is in scope, @@ -4690,8 +5643,8 @@ already in scope (i.e. bound by the enclosing context), matters are simple: the signature simply constrains the type of the pattern in the obvious way. -Unlike expression and declaration type signatures, pattern type signatures are not implictly generalised. -The pattern in a patterm binding may only mention type variables +Unlike expression and declaration type signatures, pattern type signatures are not implicitly generalised. +The pattern in a pattern binding may only mention type variables that are already in scope. For example: f :: forall a. [a] -> (Int, [a]) @@ -4886,33 +5839,6 @@ pattern binding must have the same context. For example, this is fine: - -Type families - - - -GHC supports the definition of type families indexed by types. They may be -seen as an extension of Haskell 98's class-based overloading of values to -types. When type families are declared in classes, they are also known as -associated types. - - -There are two forms of type families: data families and type synonym families. -Currently, only the former are fully implemented, while we are still working -on the latter. As a result, the specification of the language extension is -also still to some degree in flux. Hence, a more detailed description of -the language extension and its use is currently available -from the Haskell -wiki page on type families. The material will be moved to this user's -guide when it has stabilised. - - -Type families are enabled by the flag . - - - - - @@ -4977,6 +5903,8 @@ Wiki page. an expression; the spliced expression must have type Q Exp + an type; the spliced expression must + have type Q Typ a list of top-level declarations; the spliced expression must have type Q [Dec] @@ -5015,17 +5943,17 @@ Wiki page. That is, ''thing interprets thing in a type context. - These Names can be used to construct Template Haskell expressions, patterns, delarations etc. They + These Names can be used to construct Template Haskell expressions, patterns, declarations etc. They may also be given as an argument to the reify function. -(Compared to the original paper, there are many differnces of detail. +(Compared to the original paper, there are many differences of detail. The syntax for a declaration splice uses "$" not "splice". The type of the enclosed expression must be Q [Dec], not [Q Dec]. -Type splices are not implemented, and neither are pattern splices or quotations. +Pattern splices and quotations are not implemented.) @@ -5242,7 +6170,7 @@ main = do { print $ eval [$expr|1 + 2|] module Expr where import qualified Language.Haskell.TH as TH -import Language.Haskell.TH.Quasi +import Language.Haskell.TH.Quote data Expr = IntExpr Integer | AntiIntExpr String @@ -5313,6 +6241,8 @@ For more details, see “Generalising Monads to Arrows”, John Hughes, in Science of Computer Programming 37, pp67–111, May 2000. +The paper that introduced arrows: a friendly introduction, motivated with +programming examples. @@ -5320,6 +6250,7 @@ pp67–111, May 2000. A New Notation for Arrows”, Ross Paterson, in ICFP, Sep 2001. +Introduced the notation described here. @@ -5331,17 +6262,42 @@ Palgrave, 2003. - -and the arrows web page at + + +“Programming with Arrows”, +John Hughes, in 5th International Summer School on +Advanced Functional Programming, +Lecture Notes in Computer Science vol. 3622, +Springer, 2004. +This paper includes another introduction to the notation, +with practical examples. + + + + + +“Type and Translation Rules for Arrow Notation in GHC”, +Ross Paterson and Simon Peyton Jones, September 16, 2004. +A terse enumeration of the formal rules used +(extracted from comments in the source code). + + + + + +The arrows web page at http://www.haskell.org/arrows/. + + + + With the flag, GHC supports the arrow -notation described in the second of these papers. -What follows is a brief introduction to the notation; -it won't make much sense unless you've read Hughes's paper. -This notation is translated to ordinary Haskell, -using combinators from the +notation described in the second of these papers, +translating it using combinators from the Control.Arrow module. +What follows is a brief introduction to the notation; +it won't make much sense unless you've read Hughes's paper. The extension adds a new kind of expression for defining arrows: @@ -5615,7 +6571,8 @@ We could define our own operator untilA :: ArrowChoice a => a e () -> a e Bool -> a e () untilA body cond = proc x -> - if cond x then returnA -< () + b <- cond -< x + if b then returnA -< () else do body -< x untilA body cond -< x @@ -5797,13 +6754,24 @@ Because the preprocessor targets Haskell (rather than Core), Bang patterns GHC supports an extension of pattern matching called bang -patterns. Bang patterns are under consideration for Haskell Prime. +patterns, written !pat. +Bang patterns are under consideration for Haskell Prime. The Haskell prime feature description contains more discussion and examples than the material below. +The key change is the addition of a new rule to the +semantics of pattern matching in the Haskell 98 report. +Add new bullet 10, saying: Matching the pattern !pat +against a value v behaves as follows: + +if v is bottom, the match diverges +otherwise, pat is matched against v + + + Bang patterns are enabled by the flag . @@ -5834,9 +6802,40 @@ A bang only really has an effect if it precedes a variable or wild-card pattern: f3 !(x,y) = [x,y] f4 (x,y) = [x,y] -Here, f3 and f4 are identical; putting a bang before a pattern that +Here, f3 and f4 are identical; +putting a bang before a pattern that forces evaluation anyway does nothing. - + + +There is one (apparent) exception to this general rule that a bang only +makes a difference when it precedes a variable or wild-card: a bang at the +top level of a let or where +binding makes the binding strict, regardless of the pattern. For example: + +let ![x,y] = e in b + +is a strict binding: operationally, it evaluates e, matches +it against the pattern [x,y], and then evaluates b. +(We say "apparent" exception because the Right Way to think of it is that the bang +at the top of a binding is not part of the pattern; rather it +is part of the syntax of the binding.) +Nested bangs in a pattern binding behave uniformly with all other forms of +pattern matching. For example + +let (!x,[y]) = e in b + +is equivalent to this: + +let { t = case e of (x,[y]) -> x `seq` (x,y) + x = fst t + y = snd t } +in b + +The binding is lazy, but when either x or y is +evaluated by b the entire pattern is matched, including forcing the +evaluation of x. + + Bang patterns work in case expressions too, of course: g5 x = let y = f x in body @@ -5846,18 +6845,6 @@ g7 x = case f x of { !y -> body } The functions g5 and g6 mean exactly the same thing. But g7 evaluates (f x), binds y to the result, and then evaluates body. - -Bang patterns work in let and where -definitions too. For example: - -let ![x,y] = e in b - -is a strict pattern: operationally, it evaluates e, matches -it against the pattern [x,y], and then evaluates b -The "!" should not be regarded as part of the pattern; after all, -in a function argument ![x,y] means the -same as [x,y]. Rather, the "!" -is part of the syntax of let bindings. @@ -6043,13 +7030,31 @@ Assertion failures can be caught, see the documentation for the word. The various values for word that GHC understands are described in the following sections; any pragma encountered with an - unrecognised word is (silently) - ignored. + unrecognised word is + ignored. The layout rule applies in pragmas, so the closing #-} + should start in a column to the right of the opening {-#. - Certain pragmas are file-header pragmas. A file-header - pragma must precede the module keyword in the file. + Certain pragmas are file-header pragmas: + + + A file-header + pragma must precede the module keyword in the file. + + There can be as many file-header pragmas as you please, and they can be - preceded or followed by comments. + preceded or followed by comments. + + + File-header pragmas are read once only, before + pre-processing the file (e.g. with cpp). + + + The file-header pragmas are: {-# LANGUAGE #-}, + {-# OPTIONS_GHC #-}, and + {-# INCLUDE #-}. + + + LANGUAGE pragma @@ -6127,56 +7132,63 @@ Assertion failures can be caught, see the documentation for the don't recommend using this approach with GHC. - - DEPRECATED pragma - DEPRECATED - + + WARNING and DEPRECATED pragmas + WARNING + DEPRECATED - The DEPRECATED pragma lets you specify that a particular - function, class, or type, is deprecated. There are two - forms. + The WARNING pragma allows you to attach an arbitrary warning + to a particular function, class, or type. + A DEPRECATED pragma lets you specify that + a particular function, class, or type is deprecated. + There are two ways of using these pragmas. - You can deprecate an entire module thus: + You can work on an entire module thus: module Wibble {-# DEPRECATED "Use Wobble instead" #-} where ... + Or: + + module Wibble {-# WARNING "This is an unstable interface." #-} where + ... + When you compile any module that import Wibble, GHC will print the specified message. - You can deprecate a function, class, type, or data constructor, with the - following top-level declaration: + You can attach a warning to a function, class, type, or data constructor, with the + following top-level declarations: {-# DEPRECATED f, C, T "Don't use these" #-} + {-# WARNING unsafePerformIO "This is unsafe; I hope you know what you're doing" #-} When you compile any module that imports and uses any of the specified entities, GHC will print the specified message. - You can only deprecate entities declared at top level in the module + You can only attach to entities declared at top level in the module being compiled, and you can only use unqualified names in the list of - entities being deprecated. A capitalised name, such as T + entities. A capitalised name, such as T refers to either the type constructor T or the data constructor T, or both if - both are in scope. If both are in scope, there is currently no way to deprecate - one without the other (c.f. fixities ). + both are in scope. If both are in scope, there is currently no way to + specify one without the other (c.f. fixities + ). - Any use of the deprecated item, or of anything from a deprecated - module, will be flagged with an appropriate message. However, - deprecations are not reported for - (a) uses of a deprecated function within its defining module, and - (b) uses of a deprecated function in an export list. + Warnings and deprecations are not reported for + (a) uses within the defining module, and + (b) uses in an export list. The latter reduces spurious complaints within a library in which one module gathers together and re-exports the exports of several others. You can suppress the warnings with the flag - . + . @@ -6203,16 +7215,9 @@ Assertion failures can be caught, see the documentation for the key_function :: Int -> String -> (Bool, Double) - -#ifdef __GLASGOW_HASKELL__ {-# INLINE key_function #-} -#endif - (You don't need to do the C pre-processor carry-on - unless you're going to stick the code through HBC—it - doesn't like INLINE pragmas.) - The major effect of an INLINE pragma is to declare a function's “cost” to be very low. The normal unfolding machinery will then be very keen to @@ -6220,7 +7225,7 @@ key_function :: Int -> String -> (Bool, Double) function "f" has a number of other effects: -No funtions are inlined into f. Otherwise +No functions are inlined into f. Otherwise GHC might inline a big function into f's right hand side, making f big; and then inline f blindly. @@ -6236,6 +7241,16 @@ It's going to be inlined wholesale instead. All of these effects are aimed at ensuring that what gets inlined is exactly what you asked for, no more and no less. +GHC ensures that inlining cannot go on forever: every mutually-recursive +group is cut by one or more loop breakers that is never inlined +(see +Secrets of the GHC inliner, JFP 12(4) July 2002). +GHC tries not to select a function with an INLINE pragma as a loop breaker, but +when there is no choice even an INLINE function can be selected, in which case +the INLINE pragma is ignored. +For example, for a self-recursive function, the loop breaker can only be the function +itself, so an INLINE pragma is always ignored. + Syntactically, an INLINE pragma for a function can be put anywhere its type signature could be put. @@ -6248,14 +7263,18 @@ exactly what you asked for, no more and no less. UniqueSupply monad code, we have: -#ifdef __GLASGOW_HASKELL__ {-# INLINE thenUs #-} {-# INLINE returnUs #-} -#endif See also the NOINLINE pragma (). + + Note: the HBC compiler doesn't like INLINE pragmas, + so if you want your code to be HBC-compatible you'll have to surround + the pragma with C pre-processor directives + #ifdef __GLASGOW_HASKELL__...#endif. + @@ -6309,7 +7328,7 @@ exactly what you asked for, no more and no less. there was no pragma). - "INLINE[~k] f" means: be willing to inline + "NOINLINE[~k] f" means: be willing to inline f until phase k, but from phase k onwards do not inline it. @@ -6343,6 +7362,83 @@ happen. + + ANN pragmas + + GHC offers the ability to annotate various code constructs with additional + data by using three pragmas. This data can then be inspected at a later date by + using GHC-as-a-library. + + + Annotating values + + ANN + + Any expression that has both Typeable and Data instances may be attached to a top-level value + binding using an ANN pragma. In particular, this means you can use ANN + to annotate data constructors (e.g. Just) as well as normal values (e.g. take). + By way of example, to annotate the function foo with the annotation Just "Hello" + you would do this: + + +{-# ANN foo (Just "Hello") #-} +foo = ... + + + + A number of restrictions apply to use of annotations: + + The binder being annotated must be at the top level (i.e. no nested binders) + The binder being annotated must be declared in the current module + The expression you are annotating with must have a type with Typeable and Data instances + The Template Haskell staging restrictions apply to the + expression being annotated with, so for example you cannot run a function from the module being compiled. + + To be precise, the annotation {-# ANN x e #-} is well staged if and only if $(e) would be + (disregarding the usual type restrictions of the splice syntax, and the usual restriction on splicing inside a splice - $([|1|]) is fine as an annotation, albeit redundant). + + + If you feel strongly that any of these restrictions are too onerous, + please give the GHC team a shout. + + + However, apart from these restrictions, many things are allowed, including expressions which are not fully evaluated! + Annotation expressions will be evaluated by the compiler just like Template Haskell splices are. So, this annotation is fine: + + +{-# ANN f SillyAnnotation { foo = (id 10) + $([| 20 |]), bar = 'f } #-} +f = ... + + + + + Annotating types + + ANN type + ANN + + You can annotate types with the ANN pragma by using the type keyword. For example: + + +{-# ANN type Foo (Just "A `Maybe String' annotation") #-} +data Foo = ... + + + + + Annotating modules + + ANN module + ANN + + You can annotate modules with the ANN pragma by using the module keyword. For example: + + +{-# ANN module (Just "A `Maybe String' annotation") #-} + + + + LINE pragma @@ -6583,23 +7679,19 @@ data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int The programmer can specify rewrite rules as part of the source program -(in a pragma). GHC applies these rewrite rules wherever it can, provided (a) -the flag () is on, -and (b) the flag -() is not specified, and (c) the - () -flag is active. - - - +(in a pragma). Here is an example: {-# RULES - "map/map" forall f g xs. map f (map g xs) = map (f.g) xs - #-} + "map/map" forall f g xs. map f (map g xs) = map (f.g) xs + #-} - + + +Use the debug flag to see what rules fired. +If you need more information, then shows you +each individual rule firing in detail. @@ -6609,15 +7701,32 @@ Here is an example: From a syntactic point of view: - + - There may be zero or more rules in a RULES pragma. + There may be zero or more rules in a RULES pragma, separated by semicolons (which + may be generated by the layout rule). + +The layout rule applies in a pragma. +Currently no new indentation level +is set, so if you put several rules in single RULES pragma and wish to use layout to separate them, +you must lay out the starting in the same column as the enclosing definitions. + + {-# RULES + "map/map" forall f g xs. map f (map g xs) = map (f.g) xs + "map/append" forall f xs ys. map f (xs ++ ys) = map f xs ++ map f ys + #-} + +Furthermore, the closing #-} +should start in a column to the right of the opening {-#. + + + Each rule has a name, enclosed in double quotes. The name itself has no significance at all. It is only used when reporting how many times the rule fired. @@ -6631,7 +7740,7 @@ immediately after the name of the rule. Thus: {-# RULES "map/map" [2] forall f g xs. map f (map g xs) = map (f.g) xs - #-} + #-} The "[2]" means that the rule is active in Phase 2 and subsequent phases. The inverse notation "[~2]" is also accepted, meaning that the rule is active up to, but not including, @@ -6640,17 +7749,8 @@ Phase 2. - - - - Layout applies in a RULES pragma. Currently no new indentation level -is set, so you must lay out your rules starting in the same column as the -enclosing definitions. - - - Each variable mentioned in a rule must either be in scope (e.g. map), or bound by the forall (e.g. f, g, xs). The variables bound by @@ -6699,17 +7799,40 @@ variables it mentions, though of course they need to be in scope. - Rules are automatically exported from a module, just as instance declarations are. + All rules are implicitly exported from the module, and are therefore +in force in any module that imports the module that defined the rule, directly +or indirectly. (That is, if A imports B, which imports C, then C's rules are +in force when compiling A.) The situation is very similar to that for instance +declarations. + + + + + + +Inside a RULE "forall" is treated as a keyword, regardless of +any other flag settings. Furthermore, inside a RULE, the language extension + is automatically enabled; see +. + + +Like other pragmas, RULE pragmas are always checked for scope errors, and +are typechecked. Typechecking means that the LHS and RHS of a rule are typechecked, +and must have the same type. However, rules are only enabled +if the flag is +on (see ). + + - + Semantics @@ -6717,9 +7840,17 @@ From a semantic point of view: - -Rules are only applied if you use the flag. +Rules are enabled (that is, used during optimisation) +by the flag. +This flag is implied by , and may be switched +off (as usual) by . +(NB: enabling without +may not do what you expect, though, because without GHC +ignores all optimisation information in interface files; +see , .) +Note that is an optimisation flag, and +has no effect on parsing or typechecking. @@ -6736,14 +7867,6 @@ expression by substituting for the pattern variables. - The LHS and RHS of a rule are typechecked, and must have the -same type. - - - - - - GHC makes absolutely no attempt to verify that the LHS and RHS of a rule have the same meaning. That is undecidable in general, and infeasible in most interesting cases. The responsibility is entirely the programmer's! @@ -6757,7 +7880,7 @@ infeasible in most interesting cases. The responsibility is entirely the progra terminating. For example: - "loop" forall x,y. f x y = f y x + "loop" forall x y. f x y = f y x This rule will cause the compiler to go into an infinite loop. @@ -6810,48 +7933,32 @@ not be substituted, and the rule would not fire. - In the earlier phases of compilation, GHC inlines nothing -that appears on the LHS of a rule, because once you have substituted -for something you can't match against it (given the simple minded -matching). So if you write the rule - +Ordinary inlining happens at the same time as rule rewriting, which may lead to unexpected +results. Consider this (artificial) example - "map/map" forall f,g. map f . map g = map (f.g) - +f x = x +{-# RULES "f" f True = False #-} -this won't match the expression map f (map g xs). -It will only match something written with explicit use of ".". -Well, not quite. It will match the expression +g y = f y - -wibble f g xs +h z = g True - -where wibble is defined: - +Since f's right-hand side is small, it is inlined into g, +to give -wibble f g = map f . map g +g y = y - -because wibble will be inlined (it's small). - -Later on in compilation, GHC starts inlining even things on the -LHS of rules, but still leaves the rules enabled. This inlining -policy is controlled by the per-simplification-pass flag n. - +Now g is inlined into h, but f's RULE has +no chance to fire. +If instead GHC had first inlined g into h then there +would have been a better chance that f's RULE might fire. - - - - All rules are implicitly exported from the module, and are therefore -in force in any module that imports the module that defined the rule, directly -or indirectly. (That is, if A imports B, which imports C, then C's rules are -in force when compiling A.) The situation is very similar to that for instance -declarations. +The way to get predictable behaviour is to use a NOINLINE +pragma on f, to ensure +that it is not inlined until its RULEs have had a chance to fire. - @@ -7194,7 +8301,7 @@ g x = show x - However, when external for is generated (via + However, when external core is generated (via ), there will be Notes attached to the expressions show and x. The core function declaration for f is: @@ -7236,7 +8343,7 @@ r) -> Special built-in functions GHC has a few built-in functions with special behaviour. These are now described in the module GHC.Prim +url="../libraries/ghc-prim/GHC-Prim.html">GHC.Prim in the library documentation. @@ -7545,6 +8652,7 @@ standard behaviour. ;;; Local Variables: *** ;;; mode: xml *** ;;; sgml-parent-document: ("users_guide.xml" "book" "chapter" "sect1") *** + ;;; ispell-local-dictionary: "british" *** ;;; End: *** -->