X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=blobdiff_plain;f=docs%2Fusers_guide%2Fglasgow_exts.xml;h=5d1b5cf5f053bf3a3435740f51ee0a7a1c88d3f0;hp=f30b9f72019630090df29a92f40d5e1300f87100;hb=432b9c9322181a3644083e3c19b7e240d90659e7;hpb=7a0cab9b9934adf2afd95058149bcce90f699007 diff --git a/docs/users_guide/glasgow_exts.xml b/docs/users_guide/glasgow_exts.xml index f30b9f7..5d1b5cf 100644 --- a/docs/users_guide/glasgow_exts.xml +++ b/docs/users_guide/glasgow_exts.xml @@ -38,266 +38,68 @@ documentation describes all the libraries that come with GHC. extensionsoptions controlling - The language option flag control what variation of the language are + The language option flags control what variation of the language are permitted. Leaving out all of them gives you standard Haskell 98. - Generally speaking, all the language options are introduced by "", - e.g. . - - - All the language options can be turned off by using the prefix ""; - e.g. "". - - Language options recognised by Cabal can also be enabled using the LANGUAGE pragma, - thus {-# LANGUAGE TemplateHaskell #-} (see >). - - Turning on an option that enables special syntax - might cause working Haskell 98 code to fail - to compile, perhaps because it uses a variable name which has - become a reserved word. So, together with each option below, we - list the special syntax which is enabled by this option. We use - notation and nonterminal names from the Haskell 98 lexical syntax - (see the Haskell 98 Report). There are two classes of special - syntax: - + Language options can be controlled in two ways: - - New reserved words and symbols: character sequences - which are no longer available for use as identifiers in the - program. - - - Other special syntax: sequences of characters that have - a different meaning when this particular option is turned - on. - - - - We are only listing syntax changes here that might affect - existing working programs (i.e. "stolen" syntax). Many of these - extensions will also enable new context-free syntax, but in all - cases programs written to use the new syntax would not be - compilable without the option enabled. - - - - - - : + Every language option can switched on by a command-line flag "" + (e.g. ), and switched off by the flag ""; + (e.g. ). + + Language options recognised by Cabal can also be enabled using the LANGUAGE pragma, + thus {-# LANGUAGE TemplateHaskell #-} (see ). + + + + The flag - - - This simultaneously enables all of the extensions to - Haskell 98 described in , except where otherwise - noted. We are trying to move away from this portmanteau flag, - and towards enabling features individaully. - - New reserved words: forall (only in - types), mdo. - - Other syntax stolen: - varid{#}, - char#, - string#, - integer#, - float#, - float##, - (#, #), - |), {|. - - Implies these specific language options: - , - , - , - , - . - - - - - - : - - - - This option enables the language extension defined in the - Haskell 98 Foreign Function Interface Addendum. - - New reserved words: foreign. - - - - - - ,: - - - These two flags control how generalisation is done. - See . - - - - - - - : - - - - Use GHCi's extended default rules in a regular module (). - Independent of the - flag. - - - - - - - - - - - - - - - - - - - - - - See . Only relevant - if you also use . - - - - - - - - - - See . Only relevant if - you also use . - - - - - - - - - - See . Independent of - . - - New reserved words/symbols: rec, - proc, -<, - >-, -<<, - >>-. - - Other syntax stolen: (|, - |). - - - - - - - - - - See . Independent of - . - - - - - - - -XNoImplicitPrelude - option GHC normally imports - Prelude.hi files for you. If you'd - rather it didn't, then give it a - option. The idea is - that you can then import a Prelude of your own. (But don't - call it Prelude; the Haskell module - namespace is flat, and you must not conflict with any - Prelude module.) - - Even though you have not imported the Prelude, most of - the built-in syntax still refers to the built-in Haskell - Prelude types and values, as specified by the Haskell - Report. For example, the type [Int] - still means Prelude.[] Int; tuples - continue to refer to the standard Prelude tuples; the - translation for list comprehensions continues to use - Prelude.map etc. - - However, does - change the handling of certain built-in syntax: see . - - - - - - - Enables implicit parameters (see ). Currently also implied by - . + is equivalent to enabling the following extensions: + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + , + . + Enabling these options is the only + effect of . + We are trying to move away from this portmanteau flag, + and towards enabling features individually. - Syntax stolen: - ?varid, - %varid. - - - - - - - Enables overloaded string literals (see ). - - - - - - - Enables lexically-scoped type variables (see ). Implied by - . - - - - - - - Enables Template Haskell (see ). This flag must - be given explicitly; it is no longer implied by - . - - Syntax stolen: [|, - [e|, [p|, - [d|, [t|, - $(, - $varid. - - - - Unboxed types and primitive operations -GHC is built on a raft of primitive data types and operations. +GHC is built on a raft of primitive data types and operations; +"primitive" in the sense that they cannot be defined in Haskell itself. While you really can use this stuff to write fast code, we generally find it a lot less painful, and more satisfying in the long run, to use higher-level language features and libraries. With @@ -305,28 +107,21 @@ While you really can use this stuff to write fast code, unboxed version in any case. And if it isn't, we'd like to know about it. -We do not currently have good, up-to-date documentation about the -primitives, perhaps because they are mainly intended for internal use. -There used to be a long section about them here in the User Guide, but it -became out of date, and wrong information is worse than none. - -The Real Truth about what primitive types there are, and what operations -work over those types, is held in the file -fptools/ghc/compiler/prelude/primops.txt.pp. -This file is used directly to generate GHC's primitive-operation definitions, so -it is always correct! It is also intended for processing into text. - - Indeed, -the result of such processing is part of the description of the - External - Core language. -So that document is a good place to look for a type-set version. -We would be very happy if someone wanted to volunteer to produce an SGML -back end to the program that processes primops.txt so that -we could include the results here in the User Guide. - -What follows here is a brief summary of some main points. +All these primitive data types and operations are exported by the +library GHC.Prim, for which there is +detailed online documentation. +(This documentation is generated from the file compiler/prelude/primops.txt.pp.) + + +If you want to mention any of the primitive data types or operations in your +program, you must first import GHC.Prim to bring them +into scope. Many of them have names ending in "#", and to mention such +names you need the extension (). + + +The primops make extensive use of unboxed types +and unboxed tuples, which +we briefly summarise here. Unboxed types @@ -359,8 +154,11 @@ know and love—usually one instruction. Primitive (unboxed) types cannot be defined in Haskell, and are therefore built into the language and compiler. Primitive types are always unlifted; that is, a value of a primitive type cannot be -bottom. We use the convention that primitive types, values, and -operations have a <literal>#</literal> suffix. +bottom. We use the convention (but it is only a convention) +that primitive types, values, and +operations have a <literal>#</literal> suffix (see <xref linkend="magic-hash"/>). +For some primitive types we have special syntax for literals, also +described in the <link linkend="magic-hash">same section</link>. </para> <para> @@ -412,22 +210,20 @@ in a <emphasis>top-level</emphasis> binding. in a <emphasis>recursive</emphasis> binding. </para></listitem> <listitem><para> You may bind unboxed variables in a (non-recursive, -non-top-level) pattern binding, but any such variable causes the entire -pattern-match -to become strict. For example: +non-top-level) pattern binding, but you must make any such pattern-match +strict. For example, rather than: <programlisting> data Foo = Foo Int Int# f x = let (Foo a b, w) = ..rhs.. in ..body.. </programlisting> -Since <literal>b</literal> has type <literal>Int#</literal>, the entire pattern -match -is strict, and the program behaves as if you had written +you must write: <programlisting> data Foo = Foo Int Int# - f x = case ..rhs.. of { (Foo a b, w) -> ..body.. } + f x = let !(Foo a b, w) = ..rhs.. in ..body.. </programlisting> +since <literal>b</literal> has type <literal>Int#</literal>. </para> </listitem> </itemizedlist> @@ -537,8 +333,146 @@ Indeed, the bindings can even be recursive. <sect1 id="syntax-extns"> <title>Syntactic extensions + + Unicode syntax + The language + extension + enables Unicode characters to be used to stand for certain ASCII + character sequences. The following alternatives are provided: + + + + + + ASCII + Unicode alternative + Code point + Name + + + + + :: + :: + 0x2237 + PROPORTION + + + + + => + + 0x21D2 + RIGHTWARDS DOUBLE ARROW + + + + + forall + + 0x2200 + FOR ALL + + + + + -> + + 0x2192 + RIGHTWARDS ARROW + + + + + <- + + 0x2190 + LEFTWARDS ARROW + + + + + .. + + 0x22EF + MIDLINE HORIZONTAL ELLIPSIS + + + + + + + + The magic hash + The language extension allows "#" as a + postfix modifier to identifiers. Thus, "x#" is a valid variable, and "T#" is + a valid type constructor or data constructor. + + The hash sign does not change sematics at all. We tend to use variable + names ending in "#" for unboxed values or types (e.g. Int#), + but there is no requirement to do so; they are just plain ordinary variables. + Nor does the extension bring anything into scope. + For example, to bring Int# into scope you must + import GHC.Prim (see ); + the extension + then allows you to refer to the Int# + that is now in scope. + The also enables some new forms of literals (see ): + + 'x'# has type Char# + "foo"# has type Addr# + 3# has type Int#. In general, + any Haskell 98 integer lexeme followed by a # is an Int# literal, e.g. + -0x3A# as well as 32#. + 3## has type Word#. In general, + any non-negative Haskell 98 integer lexeme followed by ## + is a Word#. + 3.2# has type Float#. + 3.2## has type Double# + + + + + + New qualified operator syntax + + A new syntax for referencing qualified operators is + planned to be introduced by Haskell', and is enabled in GHC + with + the + option. In the new syntax, the prefix form of a qualified + operator is + written module.(symbol) + (in Haskell 98 this would + be (module.symbol)), + and the infix form is + written `module.(symbol)` + (in Haskell 98 this would + be `module.symbol`. + For example: + + add x y = Prelude.(+) x y + subtract y = (`Prelude.(-)` y) + + The new form of qualified operators is intended to regularise + the syntax by eliminating odd cases + like Prelude... For example, + when NewQualifiedOperators is on, it is possible to + write the enumerated sequence [Monday..] + without spaces, whereas in Haskell 98 this would be a + reference to the operator ‘.‘ + from module Monday. + + When is on, the old Haskell + 98 syntax for qualified operators is not accepted, so this + option may cause existing Haskell 98 code to break. + + + + + Hierarchical Modules @@ -742,7 +676,7 @@ view :: Type -> TypeView The representation of Typ is held abstract, permitting implementations -to use a fancy representation (e.g., hash-consing to managage sharing). +to use a fancy representation (e.g., hash-consing to manage sharing). Without view patterns, using this signature a little inconvenient: @@ -993,11 +927,6 @@ and improve termination (Section 3.2 of the paper). -The web page: http://www.cse.ogi.edu/PacSoft/projects/rmb -contains up to date information on recursive monadic bindings. - - - Historical note: The old implementation of the mdo-notation (and most of the existing documents) used the name MonadRec for the class and the corresponding library. @@ -1058,25 +987,193 @@ This name is not supported by GHC. branches. + + - + + Generalised (SQL-Like) List Comprehensions + list comprehensionsgeneralised + + extended list comprehensions + + group + sql + + + Generalised list comprehensions are a further enhancement to the + list comprehension syntatic sugar to allow operations such as sorting + and grouping which are familiar from SQL. They are fully described in the + paper + Comprehensive comprehensions: comprehensions with "order by" and "group by", + except that the syntax we use differs slightly from the paper. +The extension is enabled with the flag . +Here is an example: + +employees = [ ("Simon", "MS", 80) +, ("Erik", "MS", 100) +, ("Phil", "Ed", 40) +, ("Gordon", "Ed", 45) +, ("Paul", "Yale", 60)] + +output = [ (the dept, sum salary) +| (name, dept, salary) <- employees +, then group by dept +, then sortWith by (sum salary) +, then take 5 ] + +In this example, the list output would take on + the value: + + +[("Yale", 60), ("Ed", 85), ("MS", 180)] + + +There are three new keywords: group, by, and using. +(The function sortWith is not a keyword; it is an ordinary +function that is exported by GHC.Exts.) - -Rebindable syntax +There are five new forms of comprehension qualifier, +all introduced by the (existing) keyword then: + + + + +then f + + + This statement requires that f have the type + forall a. [a] -> [a]. You can see an example of it's use in the + motivating example, as this form is used to apply take 5. + + + + + + + +then f by e + + + This form is similar to the previous one, but allows you to create a function + which will be passed as the first argument to f. As a consequence f must have + the type forall a. (a -> t) -> [a] -> [a]. As you can see + from the type, this function lets f "project out" some information + from the elements of the list it is transforming. + + An example is shown in the opening example, where sortWith + is supplied with a function that lets it find out the sum salary + for any item in the list comprehension it transforms. + + + + + + + +then group by e using f + + + This is the most general of the grouping-type statements. In this form, + f is required to have type forall a. (a -> t) -> [a] -> [[a]]. + As with the then f by e case above, the first argument + is a function supplied to f by the compiler which lets it compute e on every + element of the list being transformed. However, unlike the non-grouping case, + f additionally partitions the list into a number of sublists: this means that + at every point after this statement, binders occurring before it in the comprehension + refer to lists of possible values, not single values. To help understand + this, let's look at an example: + + +-- This works similarly to groupWith in GHC.Exts, but doesn't sort its input first +groupRuns :: Eq b => (a -> b) -> [a] -> [[a]] +groupRuns f = groupBy (\x y -> f x == f y) + +output = [ (the x, y) +| x <- ([1..3] ++ [1..2]) +, y <- [4..6] +, then group by x using groupRuns ] + + + This results in the variable output taking on the value below: + + +[(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])] + + + Note that we have used the the function to change the type + of x from a list to its original numeric type. The variable y, in contrast, is left + unchanged from the list form introduced by the grouping. + + + + + + +then group by e + + + This form of grouping is essentially the same as the one described above. However, + since no function to use for the grouping has been supplied it will fall back on the + groupWith function defined in + GHC.Exts. This + is the form of the group statement that we made use of in the opening example. + + + + + + + +then group using f + + + With this form of the group statement, f is required to simply have the type + forall a. [a] -> [[a]], which will be used to group up the + comprehension so far directly. An example of this form is as follows: + + +output = [ x +| y <- [1..5] +, x <- "hello" +, then group using inits] + - GHC allows most kinds of built-in syntax to be rebound by - the user, to facilitate replacing the Prelude - with a home-grown version, for example. + This will yield a list containing every prefix of the word "hello" written out 5 times: - You may want to define your own numeric class + +["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...] + + + + + + + + + + +Rebindable syntax and the implicit Prelude import + + -XNoImplicitPrelude + option GHC normally imports + Prelude.hi files for you. If you'd + rather it didn't, then give it a + option. The idea is + that you can then import a Prelude of your own. (But don't + call it Prelude; the Haskell module + namespace is flat, and you must not conflict with any + Prelude module.) + + Suppose you are importing a Prelude of your own + in order to define your own numeric class hierarchy. It completely defeats that purpose if the literal "1" means "Prelude.fromInteger 1", which is what the Haskell Report specifies. - So the flag causes + So the + flag also causes the following pieces of built-in syntax to refer to whatever is in scope, not the Prelude versions: - An integer literal 368 means @@ -1125,7 +1222,7 @@ This name is not supported by GHC. In all cases (apart from arrow notation), the static semantics should be that of the desugared form, -even if that is a little unexpected. For emample, the +even if that is a little unexpected. For example, the static semantics of the literal 368 is exactly that of fromInteger (368::Integer); it's fine for fromInteger to have any of the types: @@ -1148,8 +1245,9 @@ fromInteger :: Integer -> Bool -> Bool Postfix operators -GHC allows a small extension to the syntax of left operator sections, which -allows you to define postfix operators. The extension is this: the left section + The flag enables a small +extension to the syntax of left operator sections, which allows you to +define postfix operators. The extension is this: the left section (e !) @@ -1166,10 +1264,6 @@ That is, the operator must be a function of two arguments. GHC allows it to take only one argument, and that in turn allows you to write the function postfix. -Since this extension goes beyond Haskell 98, it should really be enabled -by a flag; but in fact it is enabled all the time. (No Haskell 98 programs -change their behaviour, of course.) - The extension does not extend to the left-hand side of function definitions; you must define such a function in prefix form. @@ -1207,7 +1301,7 @@ it is not clear which of the two types is intended. Haskell 98 regards all four as ambiguous, but with the - flag, GHC will accept + flag, GHC will accept the former two. The rules are precisely the same as those for instance declarations in Haskell 98, where the method names on the left-hand side of the method bindings in an instance declaration refer unambiguously @@ -1225,7 +1319,7 @@ records from different modules that use the same field name. -Record puns are enabled by the flag -XRecordPuns. +Record puns are enabled by the flag -XNamedFieldPuns. @@ -1366,11 +1460,12 @@ For example, in a let, it applies in the right-hand sides of other let-bindings and the body of the letC. Or, in recursive do expressions (), the local fixity -declarations of aA let statement scope over other +declarations of a let statement scope over other statements in the group, just as the bound name does. -Moreover, a local fixity declatation *must* accompany a local binding of + +Moreover, a local fixity declaration *must* accompany a local binding of that name: it is not possible to revise the fixity of name bound elsewhere, as in @@ -1379,8 +1474,170 @@ let infixr 9 $ in ... Because local fixity declarations are technically Haskell 98, no flag is necessary to enable them. + + + + + Package-qualified imports + + With the flag, GHC allows + import declarations to be qualified by the package name that the + module is intended to be imported from. For example: + + +import "network" Network.Socket + + + would import the module Network.Socket from + the package network (any version). This may + be used to disambiguate an import when the same module is + available from multiple packages, or is present in both the + current package being built and an external package. + + Note: you probably don't need to use this feature, it was + added mainly so that we can build backwards-compatible versions of + packages when APIs change. It can lead to fragile dependencies in + the common case: modules occasionally move from one package to + another, rendering any package-qualified imports broken. + +Summary of stolen syntax + + Turning on an option that enables special syntax + might cause working Haskell 98 code to fail + to compile, perhaps because it uses a variable name which has + become a reserved word. This section lists the syntax that is + "stolen" by language extensions. + We use + notation and nonterminal names from the Haskell 98 lexical syntax + (see the Haskell 98 Report). + We only list syntax changes here that might affect + existing working programs (i.e. "stolen" syntax). Many of these + extensions will also enable new context-free syntax, but in all + cases programs written to use the new syntax would not be + compilable without the option enabled. + +There are two classes of special + syntax: + + + + New reserved words and symbols: character sequences + which are no longer available for use as identifiers in the + program. + + + Other special syntax: sequences of characters that have + a different meaning when this particular option is turned + on. + + + +The following syntax is stolen: + + + + + forall + forall + + + Stolen (in types) by: , + , + , + , + , + + + + + + + mdo + mdo + + + Stolen by: , + + + + + + foreign + foreign + + + Stolen by: , + + + + + + rec, + proc, -<, + >-, -<<, + >>-, and (|, + |) brackets + proc + + + Stolen by: , + + + + + + ?varid, + %varid + implicit parameters + + + Stolen by: , + + + + + + [|, + [e|, [p|, + [d|, [t|, + $(, + $varid + Template Haskell + + + Stolen by: , + + + + + + [:varid| + quasi-quotation + + + Stolen by: , + + + + + + varid{#}, + char#, + string#, + integer#, + float#, + float##, + (#, #), + + + Stolen by: , + + + + + @@ -1481,9 +1738,12 @@ to be written infix, very much like expressions. More specifically: Liberalised type synonyms -Type synonyms are like macros at the type level, and +Type synonyms are like macros at the type level, but Haskell 98 imposes many rules +on individual synonym declarations. +With the extension, GHC does validity checking on types only after expanding type synonyms. -That means that GHC can be very much more liberal about type synonyms than Haskell 98: +That means that GHC can be very much more liberal about type synonyms than Haskell 98. + You can write a forall (including overloading) in a type synonym, thus: @@ -1500,7 +1760,8 @@ in a type synonym, thus: -You can write an unboxed tuple in a type synonym: +If you also use , +you can write an unboxed tuple in a type synonym: type Pr = (# Int, Int #) @@ -1658,7 +1919,7 @@ apply fn to val to get a boolean. For e -What this allows us to do is to package heterogenous values +What this allows us to do is to package heterogeneous values together with a bunch of functions that manipulate them, and then treat that collection of packages in a uniform manner. You can express quite a bit of object-oriented-like programming this way. @@ -1689,8 +1950,8 @@ adding a new existential quantification construct. - -Type classes + +Existentials and type classes An easy extension is to allow @@ -1798,15 +2059,28 @@ main = do display (inc (inc counterB)) -- prints "##" -At the moment, record update syntax is only supported for Haskell 98 data types, -so the following function does not work: - +Record update syntax is supported for existentials (and GADTs): --- This is invalid; use explicit NewCounter instead for now setTag :: Counter a -> a -> Counter a setTag obj t = obj{ tag = t } +The rule for record update is this: +the types of the updated fields may +mention only the universally-quantified type variables +of the data constructor. For GADTs, the field may mention only types +that appear as a simple type-variable argument in the constructor's result +type. For example: + +data T a b where { T1 { f1::a, f2::b, f3::(b,c) } :: T a b } -- c is existential +upd1 t x = t { f1=x } -- OK: upd1 :: T a b -> a' -> T a' b +upd2 t x = t { f3=x } -- BAD (f3's type mentions c, which is + -- existentially quantified) +data G a b where { G1 { g1::a, g2::c } :: G a [c] } +upd3 g x = g { g1=x } -- OK: upd3 :: G a b -> c -> G c b +upd4 g x = g { g2=x } -- BAD (f2's type mentions c, which is not a simple + -- type-variable argument in G1's result type) + @@ -2016,19 +2290,8 @@ In the example, the equality dictionary is used to satisfy the equality constrai generated by the call to elem, so that the type of insert itself has no Eq constraint. -This behaviour contrasts with Haskell 98's peculiar treatment of -contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report). -In Haskell 98 the definition - - data Eq a => Set' a = MkSet' [a] - -gives MkSet' the same type as MkSet above. But instead of -making available an (Eq a) constraint, pattern-matching -on MkSet' requires an (Eq a) constraint! -GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations, -GHC's behaviour is much more useful, as well as much more intuitive. -For example, a possible application of GHC's behaviour is to reify dictionaries: +For example, one possible application is to reify dictionaries: data NumInst a where MkNumInst :: Num a => NumInst a @@ -2042,6 +2305,38 @@ For example, a possible application of GHC's behaviour is to reify dictionaries: Here, a value of type NumInst a is equivalent to an explicit (Num a) dictionary. + +All this applies to constructors declared using the syntax of . +For example, the NumInst data type above could equivalently be declared +like this: + + data NumInst a + = Num a => MkNumInst (NumInst a) + +Notice that, unlike the situation when declaring an existential, there is +no forall, because the Num constrains the +data type's universally quantified type variable a. +A constructor may have both universal and existential type variables: for example, +the following two declarations are equivalent: + + data T1 a + = forall b. (Num a, Eq b) => MkT1 a b + data T2 a where + MkT2 :: (Num a, Eq b) => a -> b -> T2 a + + +All this behaviour contrasts with Haskell 98's peculiar treatment of +contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report). +In Haskell 98 the definition + + data Eq a => Set' a = MkSet' [a] + +gives MkSet' the same type as MkSet above. But instead of +making available an (Eq a) constraint, pattern-matching +on MkSet' requires an (Eq a) constraint! +GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations, +GHC's behaviour is much more useful, as well as much more intuitive. + The rest of this section gives further details about GADT-style data @@ -2057,16 +2352,46 @@ otherwise is a generalised data type ( +As with other type signatures, you can give a single signature for several data constructors. +In this example we give a single signature for T1 and T2: + + data T a where + T1,T2 :: a -> T a + T3 :: T a + + + + The type signature of each constructor is independent, and is implicitly universally quantified as usual. -Different constructors may have different universally-quantified type variables -and different type-class constraints. -For example, this is fine: +In particular, the type variable(s) in the "data T a where" header +have no scope, and different constructors may have different universally-quantified type variables: + + data T a where -- The 'a' has no scope + T1,T2 :: b -> T b -- Means forall b. b -> T b + T3 :: T a -- Means forall a. T a + + + + +A constructor signature may mention type class constraints, which can differ for +different constructors. For example, this is fine: data T a where - T1 :: Eq b => b -> T b + T1 :: Eq b => b -> b -> T b T2 :: (Show c, Ix c) => c -> [c] -> T c +When patten matching, these constraints are made available to discharge constraints +in the body of the match. For example: + + f :: T a -> String + f (T1 x y) | x==y = "yes" + | otherwise = "no" + f (T2 a b) = show a + +Note that f is not overloaded; the Eq constraint arising +from the use of == is discharged by the pattern match on T1 +and similarly the Show constraint arising from the use of show. @@ -2078,12 +2403,12 @@ have no scope. Indeed, one can write a kind signature instead: or even a mixture of the two: - data Foo a :: (* -> *) -> * where ... + data Bar a :: (* -> *) -> * where ... The type variables (if given) may be explicitly kinded, so we could also write the header for Foo like this: - data Foo a (b :: * -> *) where ... + data Bar a (b :: * -> *) where ... @@ -2114,27 +2439,48 @@ declaration. For example, these two declarations are equivalent +The type signature may have quantified type variables that do not appear +in the result type: + + data Foo where + MkFoo :: a -> (a->Bool) -> Foo + Nil :: Foo + +Here the type variable a does not appear in the result type +of either constructor. +Although it is universally quantified in the type of the constructor, such +a type variable is often called "existential". +Indeed, the above declaration declares precisely the same type as +the data Foo in . + +The type may contain a class context too, of course: + + data Showable where + MkShowable :: Show a => a -> Showable + + + + You can use record syntax on a GADT-style data type declaration: data Person where - Adult { name :: String, children :: [Person] } :: Person - Child { name :: String } :: Person + Adult :: { name :: String, children :: [Person] } -> Person + Child :: Show a => { name :: !String, funny :: a } -> Person As usual, for every constructor that has a field f, the type of -field f must be the same (modulo alpha conversion). - - -At the moment, record updates are not yet possible with GADT-style declarations, -so support is limited to record construction, selection and pattern matching. -For example - - aPerson = Adult { name = "Fred", children = [] } +field f must be the same (modulo alpha conversion). +The Child constructor above shows that the signature +may have a context, existentially-quantified variables, and strictness annotations, +just as in the non-record case. (NB: the "type" that follows the double-colon +is not really a type, because of the record syntax and strictness annotations. +A "type" of this form can appear only in a constructor signature.) + - shortName :: Person -> Bool - hasChildren (Adult { children = kids }) = not (null kids) - hasChildren (Child {}) = False - + +Record updates are allowed with GADT-style declarations, +only fields that have the following property: the type of the field +mentions no existential type variables. @@ -2193,7 +2539,7 @@ the type a is refined to Int. That's the A precise specification of the type rules is beyond what this user manual aspires to, but the design closely follows that described in the paper Simple +url="http://research.microsoft.com/%7Esimonpj/papers/gadt/">Simple unification-based type inference for GADTs, (ICFP 2006). The general principle is this: type refinement is only carried out @@ -2212,14 +2558,14 @@ the result type of the case expression. Hence the addition < These and many other examples are given in papers by Hongwei Xi, and Tim Sheard. There is a longer introduction -on the wiki, +on the wiki, and Ralf Hinze's Fun with phantom types also has a number of examples. Note that papers may use different notation to that implemented in GHC. The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with -. +. The flag also sets . A GADT can only be declared using GADT-style syntax (); @@ -2228,11 +2574,17 @@ The result type of each constructor must begin with the type constructor being d but for a GADT the arguments to the type constructor can be arbitrary monotypes. For example, in the Term data type above, the type of each constructor must end with Term ty, but -the ty may not be a type variable (e.g. the Lit +the ty need not be a type variable (e.g. the Lit constructor). +It's is permitted to declare an ordinary algebraic data type using GADT-style syntax. +What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors +whose result type is not just T a b. + + + You cannot use a deriving clause for a GADT; only for an ordinary data type. @@ -2268,6 +2620,23 @@ their selector functions actually have different types: + +When pattern-matching against data constructors drawn from a GADT, +for example in a case expression, the following rules apply: + +The type of the scrutinee must be rigid. +The type of the entire case expression must be rigid. +The type of any free variable mentioned in any of +the case alternatives must be rigid. + +A type is "rigid" if it is completely known to the compiler at its binding site. The easiest +way to ensure that a variable a rigid type is to give it a type signature. +For more precise details see +Simple unification-based type inference for GADTs +. The criteria implemented by GHC are given in the Appendix. + + + @@ -2325,9 +2694,27 @@ The syntax is identical to that of an ordinary instance declaration apart from ( You must supply a context (in the example the context is (Eq a)), exactly as you would in an ordinary instance declaration. (In contrast the context is inferred in a deriving clause -attached to a data type declaration.) These deriving instance -rules obey the same rules concerning form and termination as ordinary instance declarations, -controlled by the same flags; see . +attached to a data type declaration.) + +A deriving instance declaration +must obey the same rules concerning form and termination as ordinary instance declarations, +controlled by the same flags; see . + + +Unlike a deriving +declaration attached to a data declaration, the instance can be more specific +than the data type (assuming you also use +-XFlexibleInstances, ). Consider +for example + + data Foo a = Bar a | Baz String + + deriving instance Eq a => Eq (Foo [a]) + deriving instance Eq a => Eq (Foo (Maybe a)) + +This will generate a derived instance for (Foo [a]) and (Foo (Maybe a)), +but other types such as (Foo (Int,Bool)) will not be an instance of Eq. + The stand-alone syntax is generalised for newtypes in exactly the same way that ordinary deriving clauses are generalised (). @@ -2345,7 +2732,7 @@ GHC always treats the last parameter of the instance -Deriving clause for classes <literal>Typeable</literal> and <literal>Data</literal> +Deriving clause for extra classes (<literal>Typeable</literal>, <literal>Data</literal>, etc) Haskell 98 allows the programmer to add "deriving( Eq, Ord )" to a data type @@ -2355,11 +2742,11 @@ classes Eq, Ord, Enum, Ix, Bounded, Read, and Show. -GHC extends this list with two more classes that may be automatically derived -(provided the flag is specified): -Typeable, and Data. These classes are defined in the library -modules Data.Typeable and Data.Generics respectively, and the -appropriate class must be in scope before it can be mentioned in the deriving clause. +GHC extends this list with several more classes that may be automatically derived: + + With , you can derive instances of the classes +Typeable, and Data, defined in the library +modules Data.Typeable and Data.Generics respectively. An instance of Typeable can only be derived if the data type has seven or fewer type parameters, all of kind *. @@ -2375,6 +2762,26 @@ In other cases, there is nothing to stop the programmer writing a Typab class, whose kind suits that of the data type constructor, and then writing the data type instance by hand. + + + With , you can derive instances of +the class Functor, +defined in GHC.Base. + + + With , you can derive instances of +the class Foldable, +defined in Data.Foldable. + + + With , you can derive instances of +the class Traversable, +defined in Data.Traversable. + + +In each case the appropriate class must be in scope before it +can be mentioned in the deriving clause. + @@ -2588,8 +2995,8 @@ the standard method is used or the one described here.) This section, and the next one, documents GHC's type-class extensions. There's lots of background in the paper Type -classes: exploring the design space (Simon Peyton Jones, Mark +url="http://research.microsoft.com/~simonpj/Papers/type-class-design-space/">Type +classes: exploring the design space (Simon Peyton Jones, Mark Jones, Erik Meijer). @@ -2668,7 +3075,7 @@ class type variable, thus: The type of elem is illegal in Haskell 98, because it contains the constraint Eq a, constrains only the class type variable (in this case a). -GHC lifts this restriction. +GHC lifts this restriction (flag ). @@ -2680,7 +3087,7 @@ GHC lifts this restriction. Functional dependencies are implemented as described by Mark Jones -in “Type Classes with Functional Dependencies”, Mark P. Jones, +in “Type Classes with Functional Dependencies”, Mark P. Jones, In Proceedings of the 9th European Symposium on Programming, ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782, . @@ -2993,9 +3400,6 @@ sets of instance declarations. Instance declarations - -Relaxed rules for instance declarations - An instance declaration has the form instance ( assertion1, ..., assertionn) => class type1 ... typem where ... @@ -3005,19 +3409,73 @@ The part before the "=>" is the "=>" is the head of the instance declaration. + +Relaxed rules for the instance head + In Haskell 98 the head of an instance declaration must be of the form C (T a1 ... an), where -C is the class, T is a type constructor, +C is the class, T is a data type constructor, and the a1 ... an are distinct type variables. -Furthermore, the assertions in the context of the instance declaration +GHC relaxes these rules in two ways. + + + +The flag allows the head of the instance +declaration to mention arbitrary nested types. +For example, this becomes a legal instance declaration + + instance C (Maybe Int) where ... + +See also the rules on overlap. + + +With the flag, instance heads may use type +synonyms. As always, using a type synonym is just shorthand for +writing the RHS of the type synonym definition. For example: + + + + type Point = (Int,Int) + instance C Point where ... + instance C [Point] where ... + + + +is legal. However, if you added + + + + instance C (Int,Int) where ... + + + +as well, then the compiler will complain about the overlapping +(actually, identical) instance declarations. As always, type synonyms +must be fully applied. You cannot, for example, write: + + + type P a = [[a]] + instance Monad P where ... + + + + + + + + +Relaxed rules for instance contexts + +In Haskell 98, the assertions in the context of the instance declaration must be of the form C a where a is a type variable that occurs in the head. + -The flag loosens these restrictions -considerably. Firstly, multi-parameter type classes are permitted. Secondly, -the context and head of the instance declaration can each consist of arbitrary +The flag relaxes this rule, as well +as the corresponding rule for type signatures (see ). +With this flag the context of the instance declaration can each consist of arbitrary (well-kinded) assertions (C t1 ... tn) subject only to the following rules: @@ -3043,7 +3501,7 @@ corresponding type in the instance declaration. These restrictions ensure that context reduction terminates: each reduction step makes the problem smaller by at least one constructor. Both the Paterson Conditions and the Coverage Condition are lifted -if you give the +if you give the flag (). You can find lots of background material about the reason for these restrictions in the paper f, so GHC had to check that f has the specified type. Suppose instead we do not give a type signature, asking GHC to infer it instead. In this case, GHC will refrain from -simplifying the constraint C Int [Int] (for the same reason +simplifying the constraint C Int [b] (for the same reason as before) but, rather than rejecting the program, it will infer the type - f :: C Int b => [b] -> [b] + f :: C Int [b] => [b] -> [b] That postpones the question of which instance to pick to the call site for f by which time more is known about the type b. +You can write this type signature yourself if you use the + +flag. + + +Exactly the same situation can arise in instance declarations themselves. Suppose we have + + class Foo a where + f :: a -> a + instance Foo [b] where + f x = ... + +and, as before, the constraint C Int [b] arises from f's +right hand side. GHC will reject the instance, complaining as before that it does not know how to resolve +the constraint C Int [b], because it matches more than one instance +declaration. The solution is to postpone the choice by adding the constraint to the context +of the instance declaration, thus: + + instance C Int [b] => Foo [b] where + f x = ... + +(You need to do this.) The willingness to be overlapped or incoherent is a property of @@ -3293,47 +3773,6 @@ hard to pin down.) We are interested to receive feedback on these points. - -Type synonyms in the instance head - - -Unlike Haskell 98, instance heads may use type -synonyms. (The instance "head" is the bit after the "=>" in an instance decl.) -As always, using a type synonym is just shorthand for -writing the RHS of the type synonym definition. For example: - - - - type Point = (Int,Int) - instance C Point where ... - instance C [Point] where ... - - - -is legal. However, if you added - - - - instance C (Int,Int) where ... - - - -as well, then the compiler will complain about the overlapping -(actually, identical) instance declarations. As always, type synonyms -must be fully applied. You cannot, for example, write: - - - - type P a = [[a]] - instance Monad P where ... - - - -This design decision is independent of all the others, and easily -reversed, but it makes sense to me. - - - @@ -3366,51 +3805,751 @@ The only predefined instance is the obvious one to make strings work as usual: instance IsString [Char] where fromString cs = cs -The class IsString is not in scope by default. If you want to mention -it explicitly (for example, to give an instance declaration for it), you can import it -from module GHC.Exts. - - -Haskell's defaulting mechanism is extended to cover string literals, when is specified. -Specifically: +The class IsString is not in scope by default. If you want to mention +it explicitly (for example, to give an instance declaration for it), you can import it +from module GHC.Exts. + + +Haskell's defaulting mechanism is extended to cover string literals, when is specified. +Specifically: + + +Each type in a default declaration must be an +instance of Num or of IsString. + + + +The standard defaulting rule (Haskell Report, Section 4.3.4) +is extended thus: defaulting applies when all the unresolved constraints involve standard classes +or IsString; and at least one is a numeric class +or IsString. + + + + +A small example: + +module Main where + +import GHC.Exts( IsString(..) ) + +newtype MyString = MyString String deriving (Eq, Show) +instance IsString MyString where + fromString = MyString + +greet :: MyString -> MyString +greet "hello" = "world" +greet other = other + +main = do + print $ greet "hello" + print $ greet "fool" + + + +Note that deriving Eq is necessary for the pattern matching +to work since it gets translated into an equality comparison. + + + + + + +Type families + + + Indexed type families are a new GHC extension to + facilitate type-level + programming. Type families are a generalisation of associated + data types + (“Associated + Types with Class”, M. Chakravarty, G. Keller, S. Peyton Jones, + and S. Marlow. In Proceedings of “The 32nd Annual ACM SIGPLAN-SIGACT + Symposium on Principles of Programming Languages (POPL'05)”, pages + 1-13, ACM Press, 2005) and associated type synonyms + (“Type + Associated Type Synonyms”. M. Chakravarty, G. Keller, and + S. Peyton Jones. + In Proceedings of “The Tenth ACM SIGPLAN International Conference on + Functional Programming”, ACM Press, pages 241-253, 2005). Type families + themselves are described in the paper “Type + Checking with Open Type Functions”, T. Schrijvers, + S. Peyton-Jones, + M. Chakravarty, and M. Sulzmann, in Proceedings of “ICFP 2008: The + 13th ACM SIGPLAN International Conference on Functional + Programming”, ACM Press, pages 51-62, 2008. Type families + essentially provide type-indexed data types and named functions on types, + which are useful for generic programming and highly parameterised library + interfaces as well as interfaces with enhanced static information, much like + dependent types. They might also be regarded as an alternative to functional + dependencies, but provide a more functional style of type-level programming + than the relational style of functional dependencies. + + + Indexed type families, or type families for short, are type constructors that + represent sets of types. Set members are denoted by supplying the type family + constructor with type parameters, which are called type + indices. The + difference between vanilla parametrised type constructors and family + constructors is much like between parametrically polymorphic functions and + (ad-hoc polymorphic) methods of type classes. Parametric polymorphic functions + behave the same at all type instances, whereas class methods can change their + behaviour in dependence on the class type parameters. Similarly, vanilla type + constructors imply the same data representation for all type instances, but + family constructors can have varying representation types for varying type + indices. + + + Indexed type families come in two flavours: data + families and type synonym + families. They are the indexed family variants of algebraic + data types and type synonyms, respectively. The instances of data families + can be data types and newtypes. + + + Type families are enabled by the flag . + Additional information on the use of type families in GHC is available on + the + Haskell wiki page on type families. + + + + Data families + + + Data families appear in two flavours: (1) they can be defined on the + toplevel + or (2) they can appear inside type classes (in which case they are known as + associated types). The former is the more general variant, as it lacks the + requirement for the type-indexes to coincide with the class + parameters. However, the latter can lead to more clearly structured code and + compiler warnings if some type instances were - possibly accidentally - + omitted. In the following, we always discuss the general toplevel form first + and then cover the additional constraints placed on associated types. + + + + Data family declarations + + + Indexed data families are introduced by a signature, such as + +data family GMap k :: * -> * + + The special family distinguishes family from standard + data declarations. The result kind annotation is optional and, as + usual, defaults to * if omitted. An example is + +data family Array e + + Named arguments can also be given explicit kind signatures if needed. + Just as with + [http://www.haskell.org/ghc/docs/latest/html/users_guide/gadt.html GADT + declarations] named arguments are entirely optional, so that we can + declare Array alternatively with + +data family Array :: * -> * + + + + + Associated data family declarations + + When a data family is declared as part of a type class, we drop + the family special. The GMap + declaration takes the following form + +class GMapKey k where + data GMap k :: * -> * + ... + + In contrast to toplevel declarations, named arguments must be used for + all type parameters that are to be used as type-indexes. Moreover, + the argument names must be class parameters. Each class parameter may + only be used at most once per associated type, but some may be omitted + and they may be in an order other than in the class head. Hence, the + following contrived example is admissible: + + class C a b c where + data T c a :: * + + + + + + + Data instance declarations + + + Instance declarations of data and newtype families are very similar to + standard data and newtype declarations. The only two differences are + that the keyword data or newtype + is followed by instance and that some or all of the + type arguments can be non-variable types, but may not contain forall + types or type synonym families. However, data families are generally + allowed in type parameters, and type synonyms are allowed as long as + they are fully applied and expand to a type that is itself admissible - + exactly as this is required for occurrences of type synonyms in class + instance parameters. For example, the Either + instance for GMap is + +data instance GMap (Either a b) v = GMapEither (GMap a v) (GMap b v) + + In this example, the declaration has only one variant. In general, it + can be any number. + + + Data and newtype instance declarations are only permitted when an + appropriate family declaration is in scope - just as a class instance declaratoin + requires the class declaration to be visible. Moreover, each instance + declaration has to conform to the kind determined by its family + declaration. This implies that the number of parameters of an instance + declaration matches the arity determined by the kind of the family. + + + A data family instance declaration can use the full exprssiveness of + ordinary data or newtype declarations: + + Although, a data family is introduced with + the keyword "data", a data family instance can + use either data or newtype. For example: + +data family T a +data instance T Int = T1 Int | T2 Bool +newtype instance T Char = TC Bool + + + A data instance can use GADT syntax for the data constructors, + and indeed can define a GADT. For example: + +data family G a b +data instance G [a] b where + G1 :: c -> G [Int] b + G2 :: G [a] Bool + + + You can use a deriving clause on a + data instance or newtype instance + declaration. + + + + + + Even if type families are defined as toplevel declarations, functions + that perform different computations for different family instances may still + need to be defined as methods of type classes. In particular, the + following is not possible: + +data family T a +data instance T Int = A +data instance T Char = B +foo :: T a -> Int +foo A = 1 -- WRONG: These two equations together... +foo B = 2 -- ...will produce a type error. + +Instead, you would have to write foo as a class operation, thus: + +class C a where + foo :: T a -> Int +instance Foo Int where + foo A = 1 +instance Foo Char where + foo B = 2 + + (Given the functionality provided by GADTs (Generalised Algebraic Data + Types), it might seem as if a definition, such as the above, should be + feasible. However, type families are - in contrast to GADTs - are + open; i.e., new instances can always be added, + possibly in other + modules. Supporting pattern matching across different data instances + would require a form of extensible case construct.) + + + + Associated data instances + + When an associated data family instance is declared within a type + class instance, we drop the instance keyword in the + family instance. So, the Either instance + for GMap becomes: + +instance (GMapKey a, GMapKey b) => GMapKey (Either a b) where + data GMap (Either a b) v = GMapEither (GMap a v) (GMap b v) + ... + + The most important point about associated family instances is that the + type indexes corresponding to class parameters must be identical to + the type given in the instance head; here this is the first argument + of GMap, namely Either a b, + which coincides with the only class parameter. Any parameters to the + family constructor that do not correspond to class parameters, need to + be variables in every instance; here this is the + variable v. + + + Instances for an associated family can only appear as part of + instances declarations of the class in which the family was declared - + just as with the equations of the methods of a class. Also in + correspondence to how methods are handled, declarations of associated + types can be omitted in class instances. If an associated family + instance is omitted, the corresponding instance type is not inhabited; + i.e., only diverging expressions, such + as undefined, can assume the type. + + + + + Scoping of class parameters + + In the case of multi-parameter type classes, the visibility of class + parameters in the right-hand side of associated family instances + depends solely on the parameters of the data + family. As an example, consider the simple class declaration + +class C a b where + data T a + + Only one of the two class parameters is a parameter to the data + family. Hence, the following instance declaration is invalid: + +instance C [c] d where + data T [c] = MkT (c, d) -- WRONG!! 'd' is not in scope + + Here, the right-hand side of the data instance mentions the type + variable d that does not occur in its left-hand + side. We cannot admit such data instances as they would compromise + type safety. + + + + + Type class instances of family instances + + Type class instances of instances of data families can be defined as + usual, and in particular data instance declarations can + have deriving clauses. For example, we can write + +data GMap () v = GMapUnit (Maybe v) + deriving Show + + which implicitly defines an instance of the form + +instance Show v => Show (GMap () v) where ... + + + + Note that class instances are always for + particular instances of a data family and never + for an entire family as a whole. This is for essentially the same + reasons that we cannot define a toplevel function that performs + pattern matching on the data constructors + of different instances of a single type family. + It would require a form of extensible case construct. + + + + + Overlap of data instances + + The instance declarations of a data family used in a single program + may not overlap at all, independent of whether they are associated or + not. In contrast to type class instances, this is not only a matter + of consistency, but one of type safety. + + + + + + + Import and export + + + The association of data constructors with type families is more dynamic + than that is the case with standard data and newtype declarations. In + the standard case, the notation T(..) in an import or + export list denotes the type constructor and all the data constructors + introduced in its declaration. However, a family declaration never + introduces any data constructors; instead, data constructors are + introduced by family instances. As a result, which data constructors + are associated with a type family depends on the currently visible + instance declarations for that family. Consequently, an import or + export item of the form T(..) denotes the family + constructor and all currently visible data constructors - in the case of + an export item, these may be either imported or defined in the current + module. The treatment of import and export items that explicitly list + data constructors, such as GMap(GMapEither), is + analogous. + + + + Associated families + + As expected, an import or export item of the + form C(..) denotes all of the class' methods and + associated types. However, when associated types are explicitly + listed as subitems of a class, we need some new syntax, as uppercase + identifiers as subitems are usually data constructors, not type + constructors. To clarify that we denote types here, each associated + type name needs to be prefixed by the keyword type. + So for example, when explicitly listing the components of + the GMapKey class, we write GMapKey(type + GMap, empty, lookup, insert). + + + + + Examples + + Assuming our running GMapKey class example, let us + look at some export lists and their meaning: + + + module GMap (GMapKey) where...: Exports + just the class name. + + + module GMap (GMapKey(..)) where...: + Exports the class, the associated type GMap + and the member + functions empty, lookup, + and insert. None of the data constructors is + exported. + + + module GMap (GMapKey(..), GMap(..)) + where...: As before, but also exports all the data + constructors GMapInt, + GMapChar, + GMapUnit, GMapPair, + and GMapUnit. + + + module GMap (GMapKey(empty, lookup, insert), + GMap(..)) where...: As before. + + + module GMap (GMapKey, empty, lookup, insert, GMap(..)) + where...: As before. + + + + + Finally, you can write GMapKey(type GMap) to denote + both the class GMapKey as well as its associated + type GMap. However, you cannot + write GMapKey(type GMap(..)) — i.e., + sub-component specifications cannot be nested. To + specify GMap's data constructors, you have to list + it separately. + + + + + Instances + + Family instances are implicitly exported, just like class instances. + However, this applies only to the heads of instances, not to the data + constructors an instance defines. + + + + + + + + + Synonym families + + + Type families appear in two flavours: (1) they can be defined on the + toplevel or (2) they can appear inside type classes (in which case they + are known as associated type synonyms). The former is the more general + variant, as it lacks the requirement for the type-indexes to coincide with + the class parameters. However, the latter can lead to more clearly + structured code and compiler warnings if some type instances were - + possibly accidentally - omitted. In the following, we always discuss the + general toplevel form first and then cover the additional constraints + placed on associated types. + + + + Type family declarations + + + Indexed type families are introduced by a signature, such as + +type family Elem c :: * + + The special family distinguishes family from standard + type declarations. The result kind annotation is optional and, as + usual, defaults to * if omitted. An example is + +type family Elem c + + Parameters can also be given explicit kind signatures if needed. We + call the number of parameters in a type family declaration, the family's + arity, and all applications of a type family must be fully saturated + w.r.t. to that arity. This requirement is unlike ordinary type synonyms + and it implies that the kind of a type family is not sufficient to + determine a family's arity, and hence in general, also insufficient to + determine whether a type family application is well formed. As an + example, consider the following declaration: + +type family F a b :: * -> * -- F's arity is 2, + -- although it's overall kind is * -> * -> * -> * + + Given this declaration the following are examples of well-formed and + malformed types: + +F Char [Int] -- OK! Kind: * -> * +F Char [Int] Bool -- OK! Kind: * +F IO Bool -- WRONG: kind mismatch in the first argument +F Bool -- WRONG: unsaturated application + + + + + Associated type family declarations + + When a type family is declared as part of a type class, we drop + the family special. The Elem + declaration takes the following form + +class Collects ce where + type Elem ce :: * + ... + + The argument names of the type family must be class parameters. Each + class parameter may only be used at most once per associated type, but + some may be omitted and they may be in an order other than in the + class head. Hence, the following contrived example is admissible: + +class C a b c where + type T c a :: * + + These rules are exactly as for associated data families. + + + + + + Type instance declarations + + Instance declarations of type families are very similar to standard type + synonym declarations. The only two differences are that the + keyword type is followed + by instance and that some or all of the type + arguments can be non-variable types, but may not contain forall types or + type synonym families. However, data families are generally allowed, and + type synonyms are allowed as long as they are fully applied and expand + to a type that is admissible - these are the exact same requirements as + for data instances. For example, the [e] instance + for Elem is + +type instance Elem [e] = e + + + + Type family instance declarations are only legitimate when an + appropriate family declaration is in scope - just like class instances + require the class declaration to be visible. Moreover, each instance + declaration has to conform to the kind determined by its family + declaration, and the number of type parameters in an instance + declaration must match the number of type parameters in the family + declaration. Finally, the right-hand side of a type instance must be a + monotype (i.e., it may not include foralls) and after the expansion of + all saturated vanilla type synonyms, no synonyms, except family synonyms + may remain. Here are some examples of admissible and illegal type + instances: + +type family F a :: * +type instance F [Int] = Int -- OK! +type instance F String = Char -- OK! +type instance F (F a) = a -- WRONG: type parameter mentions a type family +type instance F (forall a. (a, b)) = b -- WRONG: a forall type appears in a type parameter +type instance F Float = forall a.a -- WRONG: right-hand side may not be a forall type + +type family G a b :: * -> * +type instance G Int = (,) -- WRONG: must be two type parameters +type instance G Int Char Float = Double -- WRONG: must be two type parameters + + + + + Associated type instance declarations + + When an associated family instance is declared within a type class + instance, we drop the instance keyword in the family + instance. So, the [e] instance + for Elem becomes: + +instance (Eq (Elem [e])) => Collects ([e]) where + type Elem [e] = e + ... + + The most important point about associated family instances is that the + type indexes corresponding to class parameters must be identical to the + type given in the instance head; here this is [e], + which coincides with the only class parameter. + + + Instances for an associated family can only appear as part of instances + declarations of the class in which the family was declared - just as + with the equations of the methods of a class. Also in correspondence to + how methods are handled, declarations of associated types can be omitted + in class instances. If an associated family instance is omitted, the + corresponding instance type is not inhabited; i.e., only diverging + expressions, such as undefined, can assume the type. + + + + + Overlap of type synonym instances + + The instance declarations of a type family used in a single program + may only overlap if the right-hand sides of the overlapping instances + coincide for the overlapping types. More formally, two instance + declarations overlap if there is a substitution that makes the + left-hand sides of the instances syntactically the same. Whenever + that is the case, the right-hand sides of the instances must also be + syntactically equal under the same substitution. This condition is + independent of whether the type family is associated or not, and it is + not only a matter of consistency, but one of type safety. + + + Here are two example to illustrate the condition under which overlap + is permitted. + +type instance F (a, Int) = [a] +type instance F (Int, b) = [b] -- overlap permitted + +type instance G (a, Int) = [a] +type instance G (Char, a) = [a] -- ILLEGAL overlap, as [Char] /= [Int] + + + + + + Decidability of type synonym instances + + In order to guarantee that type inference in the presence of type + families decidable, we need to place a number of additional + restrictions on the formation of type instance declarations (c.f., + Definition 5 (Relaxed Conditions) of “Type + Checking with Open Type Functions”). Instance + declarations have the general form + +type instance F t1 .. tn = t + + where we require that for every type family application (G s1 + .. sm) in t, + + + s1 .. sm do not contain any type family + constructors, + + + the total number of symbols (data type constructors and type + variables) in s1 .. sm is strictly smaller than + in t1 .. tn, and + + + for every type + variable a, a occurs + in s1 .. sm at most as often as in t1 + .. tn. + + + These restrictions are easily verified and ensure termination of type + inference. However, they are not sufficient to guarantee completeness + of type inference in the presence of, so called, ''loopy equalities'', + such as a ~ [F a], where a recursive occurrence of + a type variable is underneath a family application and data + constructor application - see the above mentioned paper for details. + + + If the option is passed to the + compiler, the above restrictions are not enforced and it is on the + programmer to ensure termination of the normalisation of type families + during type inference. + + + + + + Equality constraints + + Type context can include equality constraints of the form t1 ~ + t2, which denote that the types t1 + and t2 need to be the same. In the presence of type + families, whether two types are equal cannot generally be decided + locally. Hence, the contexts of function signatures may include + equality constraints, as in the following example: + +sumCollects :: (Collects c1, Collects c2, Elem c1 ~ Elem c2) => c1 -> c2 -> c2 + + where we require that the element type of c1 + and c2 are the same. In general, the + types t1 and t2 of an equality + constraint may be arbitrary monotypes; i.e., they may not contain any + quantifiers, independent of whether higher-rank types are otherwise + enabled. + + + Equality constraints can also appear in class and instance contexts. + The former enable a simple translation of programs using functional + dependencies into programs using family synonyms instead. The general + idea is to rewrite a class declaration of the form + +class C a b | a -> b + + to + +class (F a ~ b) => C a b where + type F a + + That is, we represent every functional dependency (FD) a1 .. an + -> b by an FD type family F a1 .. an and a + superclass context equality F a1 .. an ~ b, + essentially giving a name to the functional dependency. In class + instances, we define the type instances of FD families in accordance + with the class head. Method signatures are not affected by that + process. + + + NB: Equalities in superclass contexts are not fully implemented in + GHC 6.10. + + + + + Type families and instance declarations + Type families require us to extend the rules for + the form of instance heads, which are given + in . + Specifically: - -Each type in a default declaration must be an -instance of Num or of IsString. - - - -The standard defaulting rule (Haskell Report, Section 4.3.4) -is extended thus: defaulting applies when all the unresolved constraints involve standard classes -or IsString; and at least one is a numeric class -or IsString. - + Data type families may appear in an instance head + Type synonym families may not appear (at all) in an instance head - - -A small example: +The reason for the latter restriction is that there is no way to check for. Consider -module Main where - -import GHC.Exts( IsString(..) ) - -newtype MyString = MyString String deriving (Eq, Show) -instance IsString MyString where - fromString = MyString + type family F a + type instance F Bool = Int -greet :: MyString -> MyString -greet "hello" = "world" -greet other = other + class C a -main = do - print $ greet "hello" - print $ greet "fool" + instance C Int + instance C (F a) +Now a constraint (C (F Bool)) would match both instances. +The situation is especially bad because the type instance for F Bool +might be in another module, or even in a module that is not yet written. - -Note that deriving Eq is necessary for the pattern matching -to work since it gets translated into an equality comparison. - + @@ -3421,11 +4560,13 @@ to work since it gets translated into an equality comparison. Type signatures -The context of a type signature +The context of a type signature -Unlike Haskell 98, constraints in types do not have to be of -the form (class type-variable) or -(class (type-variable type-variable ...)). Thus, +The flag lifts the Haskell 98 restriction +that the type-class constraints in a type signature must have the +form (class type-variable) or +(class (type-variable type-variable ...)). +With these type signatures are perfectly OK g :: Eq [a] => ... @@ -3456,8 +4597,8 @@ in GHC, you can give the foralls if you want. See a is "reachable" if it appears +in the same constraint as either a type variable free in type, or another reachable type variable. A value with a type that does not obey this reachability restriction cannot be used without introducing @@ -4057,9 +5198,18 @@ The function f3 has a rank-3 type; it has rank-2 types on the left of a function arrow. -GHC allows types of arbitrary rank; you can nest foralls -arbitrarily deep in function arrows. (GHC used to be restricted to rank 2, but -that restriction has now been lifted.) +GHC has three flags to control higher-rank types: + + + : data constructors (only) can have polymorphic argument types. + + + : any function (including data constructors) can have a rank-2 type. + + + : any function (including data constructors) can have an arbitrary-rank type. +That is, you can nest foralls +arbitrarily deep in function arrows. In particular, a forall-type (also called a "type scheme"), including an operational type class context, is legal: @@ -4071,6 +5221,8 @@ field type signatures. As the type of an implicit parameter In a pattern type signature (see ) + + Of course forall becomes a keyword; you can't use forall as a type variable any more! @@ -4309,7 +5461,9 @@ for rank-2 types. Impredicative polymorphism -GHC supports impredicative polymorphism. This means +GHC supports impredicative polymorphism, +enabled with . +This means that you can call a polymorphic function at a polymorphic type, and parameterise data structures over polymorphic types. For example: @@ -4322,7 +5476,7 @@ Notice here that the Maybe type is parameterised by the [a]). The technical details of this extension are described in the paper -Boxy types: +Boxy types: type inference for higher-rank types and impredicativity, which appeared at ICFP 2006. @@ -4342,15 +5496,19 @@ f xs = ys ++ ys ys :: [a] ys = reverse xs -The type signature for f brings the type variable a into scope; it scopes over -the entire definition of f. -In particular, it is in scope at the type signature for ys. +The type signature for f brings the type variable a into scope, +because of the explicit forall (). +The type variables bound by a forall scope over +the entire definition of the accompanying value declaration. +In this example, the type variable a scopes over the whole +definition of f, including over +the type signature for ys. In Haskell 98 it is not possible to declare a type for ys; a major benefit of scoped type variables is that it becomes possible to do so. Lexically-scoped type variables are enabled by -. +. This flag implies . Note: GHC 6.6 contains substantial changes to the way that scoped type variables work, compared to earlier releases. Read this section @@ -4385,9 +5543,9 @@ A lexically scoped type variable can be bound by: In Haskell, a programmer-written type signature is implicitly quantified over its free type variables (Section +url="http://www.haskell.org/onlinereport/decls.html#sect4.1.2">Section 4.1.2 -of the Haskel Report). +of the Haskell Report). Lexically scoped type variables affect this implicit quantification rules as follows: any type variable that is in scope is not universally quantified. For example, if type variable a is in scope, @@ -4408,7 +5566,7 @@ then A declaration type signature that has explicit quantification (using forall) brings into scope the explicitly-quantified -type variables, in the definition of the named function(s). For example: +type variables, in the definition of the named function. For example: f :: forall a. [a] -> [a] f (x:xs) = xs ++ [ x :: a ] @@ -4416,7 +5574,9 @@ type variables, in the definition of the named function(s). For example: The "forall a" brings "a" into scope in the definition of "f". -This only happens if the quantification in f's type +This only happens if: + + The quantification in f's type signature is explicit. For example: g :: [a] -> [a] @@ -4426,6 +5586,26 @@ This program will be rejected, because "a" does not scope over the definition of "f", so "x::a" means "x::forall a. a" by Haskell's usual implicit quantification rules. + + The signature gives a type for a function binding or a bare variable binding, +not a pattern binding. +For example: + + f1 :: forall a. [a] -> [a] + f1 (x:xs) = xs ++ [ x :: a ] -- OK + + f2 :: forall a. [a] -> [a] + f2 = \(x:xs) -> xs ++ [ x :: a ] -- OK + + f3 :: forall a. [a] -> [a] + Just f3 = Just (\(x:xs) -> xs ++ [ x :: a ]) -- Not OK! + +The binding for f3 is a pattern binding, and so its type signature +does not bring a into scope. However f1 is a +function binding, and f2 binds a bare variable; in both cases +the type signature brings a into scope. + + @@ -4463,8 +5643,8 @@ already in scope (i.e. bound by the enclosing context), matters are simple: the signature simply constrains the type of the pattern in the obvious way. -Unlike expression and declaration type signatures, pattern type signatures are not implictly generalised. -The pattern in a patterm binding may only mention type variables +Unlike expression and declaration type signatures, pattern type signatures are not implicitly generalised. +The pattern in a pattern binding may only mention type variables that are already in scope. For example: f :: forall a. [a] -> (Int, [a]) @@ -4603,18 +5783,18 @@ scope over the methods defined in the where part. For exampl The Haskell Report specifies that a group of bindings (at top level, or in a let or where) should be sorted into strongly-connected components, and then type-checked in dependency order -(Haskell +(Haskell Report, Section 4.5.1). As each group is type-checked, any binders of the group that have an explicit type signature are put in the type environment with the specified polymorphic type, and all others are monomorphic until the group is generalised -(Haskell Report, Section 4.5.2). +(Haskell Report, Section 4.5.2). Following a suggestion of Mark Jones, in his paper -Typing Haskell in +Typing Haskell in Haskell, GHC implements a more general scheme. If is specified: @@ -4659,33 +5839,6 @@ pattern binding must have the same context. For example, this is fine: - -Type families - - - -GHC supports the definition of type families indexed by types. They may be -seen as an extension of Haskell 98's class-based overloading of values to -types. When type families are declared in classes, they are also known as -associated types. - - -There are two forms of type families: data families and type synonym families. -Currently, only the former are fully implemented, while we are still working -on the latter. As a result, the specification of the language extension is -also still to some degree in flux. Hence, a more detailed description of -the language extension and its use is currently available -from the Haskell -wiki page on type families. The material will be moved to this user's -guide when it has stabilised. - - -Type families are enabled by the flag . - - - - - @@ -4698,12 +5851,12 @@ Type families are enabled by the flag . Haskell. The background to the main technical innovations is discussed in " +url="http://research.microsoft.com/~simonpj/papers/meta-haskell/"> Template Meta-programming for Haskell" (Proc Haskell Workshop 2002). There is a Wiki page about -Template Haskell at +Template Haskell at http://www.haskell.org/haskellwiki/Template_Haskell, and that is the best place to look for further details. You may also @@ -4750,6 +5903,8 @@ Wiki page. an expression; the spliced expression must have type Q Exp + an type; the spliced expression must + have type Q Typ a list of top-level declarations; the spliced expression must have type Q [Dec] @@ -4769,6 +5924,15 @@ Wiki page. + A quasi-quotation can appear in either a pattern context or an + expression context and is also written in Oxford brackets: + + [:varid| ... |], + where the "..." is an arbitrary string; a full description of the + quasi-quotation facility is given in . + + + A name can be quoted with either one or two prefix single quotes: 'f has type Name, and names the function f. @@ -4779,17 +5943,17 @@ Wiki page. That is, ''thing interprets thing in a type context. - These Names can be used to construct Template Haskell expressions, patterns, delarations etc. They + These Names can be used to construct Template Haskell expressions, patterns, declarations etc. They may also be given as an argument to the reify function. -(Compared to the original paper, there are many differnces of detail. +(Compared to the original paper, there are many differences of detail. The syntax for a declaration splice uses "$" not "splice". The type of the enclosed expression must be Q [Dec], not [Q Dec]. -Type splices are not implemented, and neither are pattern splices or quotations. +Pattern splices and quotations are not implemented.) @@ -4808,9 +5972,13 @@ Type splices are not implemented, and neither are pattern splices or quotations. - Furthermore, you can only run a function at compile time if it is imported + You can only run a function at compile time if it is imported from another module that is not part of a mutually-recursive group of modules - that includes the module currently being compiled. For example, when compiling module A, + that includes the module currently being compiled. Furthermore, all of the modules of + the mutually-recursive group must be reachable by non-SOURCE imports from the module where the + splice is to be run. + + For example, when compiling module A, you can only run Template Haskell functions imported from B if B does not import A (directly or indirectly). The reason should be clear: to run B we must compile and run A, but we are currently type-checking A. @@ -4938,6 +6106,124 @@ The basic idea is to compile the program twice: + Template Haskell Quasi-quotation +Quasi-quotation allows patterns and expressions to be written using +programmer-defined concrete syntax; the motivation behind the extension and +several examples are documented in +"Why It's +Nice to be Quoted: Quasiquoting for Haskell" (Proc Haskell Workshop +2007). The example below shows how to write a quasiquoter for a simple +expression language. + + +In the example, the quasiquoter expr is bound to a value of +type Language.Haskell.TH.Quote.QuasiQuoter which contains two +functions for quoting expressions and patterns, respectively. The first argument +to each quoter is the (arbitrary) string enclosed in the Oxford brackets. The +context of the quasi-quotation statement determines which of the two parsers is +called: if the quasi-quotation occurs in an expression context, the expression +parser is called, and if it occurs in a pattern context, the pattern parser is +called. + + +Note that in the example we make use of an antiquoted +variable n, indicated by the syntax 'int:n +(this syntax for anti-quotation was defined by the parser's +author, not by GHC). This binds n to the +integer value argument of the constructor IntExpr when +pattern matching. Please see the referenced paper for further details regarding +anti-quotation as well as the description of a technique that uses SYB to +leverage a single parser of type String -> a to generate both +an expression parser that returns a value of type Q Exp and a +pattern parser that returns a value of type Q Pat. + + +In general, a quasi-quote has the form +[$quoter| string |]. +The quoter must be the name of an imported quoter; it +cannot be an arbitrary expression. The quoted string +can be arbitrary, and may contain newlines. + + +Quasiquoters must obey the same stage restrictions as Template Haskell, e.g., in +the example, expr cannot be defined +in Main.hs where it is used, but must be imported. + + + + +{- Main.hs -} +module Main where + +import Expr + +main :: IO () +main = do { print $ eval [$expr|1 + 2|] + ; case IntExpr 1 of + { [$expr|'int:n|] -> print n + ; _ -> return () + } + } + + +{- Expr.hs -} +module Expr where + +import qualified Language.Haskell.TH as TH +import Language.Haskell.TH.Quote + +data Expr = IntExpr Integer + | AntiIntExpr String + | BinopExpr BinOp Expr Expr + | AntiExpr String + deriving(Show, Typeable, Data) + +data BinOp = AddOp + | SubOp + | MulOp + | DivOp + deriving(Show, Typeable, Data) + +eval :: Expr -> Integer +eval (IntExpr n) = n +eval (BinopExpr op x y) = (opToFun op) (eval x) (eval y) + where + opToFun AddOp = (+) + opToFun SubOp = (-) + opToFun MulOp = (*) + opToFun DivOp = div + +expr = QuasiQuoter parseExprExp parseExprPat + +-- Parse an Expr, returning its representation as +-- either a Q Exp or a Q Pat. See the referenced paper +-- for how to use SYB to do this by writing a single +-- parser of type String -> Expr instead of two +-- separate parsers. + +parseExprExp :: String -> Q Exp +parseExprExp ... + +parseExprPat :: String -> Q Pat +parseExprPat ... + + +Now run the compiler: + + +$ ghc --make -XQuasiQuotes Main.hs -o main + + +Run "main" and here is your output: + + +$ ./main +3 +1 + + + + @@ -4955,6 +6241,8 @@ For more details, see “Generalising Monads to Arrows”, John Hughes, in Science of Computer Programming 37, pp67–111, May 2000. +The paper that introduced arrows: a friendly introduction, motivated with +programming examples. @@ -4962,6 +6250,7 @@ pp67–111, May 2000. A New Notation for Arrows”, Ross Paterson, in ICFP, Sep 2001. +Introduced the notation described here. @@ -4973,17 +6262,42 @@ Palgrave, 2003. - -and the arrows web page at + + +“Programming with Arrows”, +John Hughes, in 5th International Summer School on +Advanced Functional Programming, +Lecture Notes in Computer Science vol. 3622, +Springer, 2004. +This paper includes another introduction to the notation, +with practical examples. + + + + + +“Type and Translation Rules for Arrow Notation in GHC”, +Ross Paterson and Simon Peyton Jones, September 16, 2004. +A terse enumeration of the formal rules used +(extracted from comments in the source code). + + + + + +The arrows web page at http://www.haskell.org/arrows/. + + + + With the flag, GHC supports the arrow -notation described in the second of these papers. -What follows is a brief introduction to the notation; -it won't make much sense unless you've read Hughes's paper. -This notation is translated to ordinary Haskell, -using combinators from the +notation described in the second of these papers, +translating it using combinators from the Control.Arrow module. +What follows is a brief introduction to the notation; +it won't make much sense unless you've read Hughes's paper. The extension adds a new kind of expression for defining arrows: @@ -5257,7 +6571,8 @@ We could define our own operator untilA :: ArrowChoice a => a e () -> a e Bool -> a e () untilA body cond = proc x -> - if cond x then returnA -< () + b <- cond -< x + if b then returnA -< () else do body -< x untilA body cond -< x @@ -5439,13 +6754,24 @@ Because the preprocessor targets Haskell (rather than Core), Bang patterns GHC supports an extension of pattern matching called bang -patterns. Bang patterns are under consideration for Haskell Prime. +patterns, written !pat. +Bang patterns are under consideration for Haskell Prime. The Haskell prime feature description contains more discussion and examples than the material below. +The key change is the addition of a new rule to the +semantics of pattern matching in the Haskell 98 report. +Add new bullet 10, saying: Matching the pattern !pat +against a value v behaves as follows: + +if v is bottom, the match diverges +otherwise, pat is matched against v + + + Bang patterns are enabled by the flag . @@ -5476,9 +6802,40 @@ A bang only really has an effect if it precedes a variable or wild-card pattern: f3 !(x,y) = [x,y] f4 (x,y) = [x,y] -Here, f3 and f4 are identical; putting a bang before a pattern that +Here, f3 and f4 are identical; +putting a bang before a pattern that forces evaluation anyway does nothing. - + + +There is one (apparent) exception to this general rule that a bang only +makes a difference when it precedes a variable or wild-card: a bang at the +top level of a let or where +binding makes the binding strict, regardless of the pattern. For example: + +let ![x,y] = e in b + +is a strict binding: operationally, it evaluates e, matches +it against the pattern [x,y], and then evaluates b. +(We say "apparent" exception because the Right Way to think of it is that the bang +at the top of a binding is not part of the pattern; rather it +is part of the syntax of the binding.) +Nested bangs in a pattern binding behave uniformly with all other forms of +pattern matching. For example + +let (!x,[y]) = e in b + +is equivalent to this: + +let { t = case e of (x,[y]) -> x `seq` (x,y) + x = fst t + y = snd t } +in b + +The binding is lazy, but when either x or y is +evaluated by b the entire pattern is matched, including forcing the +evaluation of x. + + Bang patterns work in case expressions too, of course: g5 x = let y = f x in body @@ -5488,18 +6845,6 @@ g7 x = case f x of { !y -> body } The functions g5 and g6 mean exactly the same thing. But g7 evaluates (f x), binds y to the result, and then evaluates body. - -Bang patterns work in let and where -definitions too. For example: - -let ![x,y] = e in b - -is a strict pattern: operationally, it evaluates e, matches -it against the pattern [x,y], and then evaluates b -The "!" should not be regarded as part of the pattern; after all, -in a function argument ![x,y] means the -same as [x,y]. Rather, the "!" -is part of the syntax of let bindings. @@ -5526,7 +6871,7 @@ prefix notation: (!) f x = 3 The semantics of Haskell pattern matching is described in +url="http://www.haskell.org/onlinereport/exps.html#sect3.17.2"> Section 3.17.2 of the Haskell Report. To this description add one extra item 10, saying: Matching @@ -5536,7 +6881,7 @@ the pattern !pat against a value v behaves v -Similarly, in Figure 4 of +Similarly, in Figure 4 of Section 3.17.3, add a new case (t): case v of { !pat -> e; _ -> e' } @@ -5544,7 +6889,7 @@ case v of { !pat -> e; _ -> e' } That leaves let expressions, whose translation is given in -Section +Section 3.12 of the Haskell Report. In the translation box, first apply @@ -5685,13 +7030,31 @@ Assertion failures can be caught, see the documentation for the word. The various values for word that GHC understands are described in the following sections; any pragma encountered with an - unrecognised word is (silently) - ignored. + unrecognised word is + ignored. The layout rule applies in pragmas, so the closing #-} + should start in a column to the right of the opening {-#. - Certain pragmas are file-header pragmas. A file-header - pragma must precede the module keyword in the file. + Certain pragmas are file-header pragmas: + + + A file-header + pragma must precede the module keyword in the file. + + There can be as many file-header pragmas as you please, and they can be - preceded or followed by comments. + preceded or followed by comments. + + + File-header pragmas are read once only, before + pre-processing the file (e.g. with cpp). + + + The file-header pragmas are: {-# LANGUAGE #-}, + {-# OPTIONS_GHC #-}, and + {-# INCLUDE #-}. + + + LANGUAGE pragma @@ -5769,56 +7132,63 @@ Assertion failures can be caught, see the documentation for the don't recommend using this approach with GHC. - - DEPRECATED pragma - DEPRECATED - + + WARNING and DEPRECATED pragmas + WARNING + DEPRECATED - The DEPRECATED pragma lets you specify that a particular - function, class, or type, is deprecated. There are two - forms. + The WARNING pragma allows you to attach an arbitrary warning + to a particular function, class, or type. + A DEPRECATED pragma lets you specify that + a particular function, class, or type is deprecated. + There are two ways of using these pragmas. - You can deprecate an entire module thus: + You can work on an entire module thus: module Wibble {-# DEPRECATED "Use Wobble instead" #-} where ... + Or: + + module Wibble {-# WARNING "This is an unstable interface." #-} where + ... + When you compile any module that import Wibble, GHC will print the specified message. - You can deprecate a function, class, type, or data constructor, with the - following top-level declaration: + You can attach a warning to a function, class, type, or data constructor, with the + following top-level declarations: {-# DEPRECATED f, C, T "Don't use these" #-} + {-# WARNING unsafePerformIO "This is unsafe; I hope you know what you're doing" #-} When you compile any module that imports and uses any of the specified entities, GHC will print the specified message. - You can only deprecate entities declared at top level in the module + You can only attach to entities declared at top level in the module being compiled, and you can only use unqualified names in the list of - entities being deprecated. A capitalised name, such as T + entities. A capitalised name, such as T refers to either the type constructor T or the data constructor T, or both if - both are in scope. If both are in scope, there is currently no way to deprecate - one without the other (c.f. fixities ). + both are in scope. If both are in scope, there is currently no way to + specify one without the other (c.f. fixities + ). - Any use of the deprecated item, or of anything from a deprecated - module, will be flagged with an appropriate message. However, - deprecations are not reported for - (a) uses of a deprecated function within its defining module, and - (b) uses of a deprecated function in an export list. + Warnings and deprecations are not reported for + (a) uses within the defining module, and + (b) uses in an export list. The latter reduces spurious complaints within a library in which one module gathers together and re-exports the exports of several others. You can suppress the warnings with the flag - . + . @@ -5845,20 +7215,41 @@ Assertion failures can be caught, see the documentation for the key_function :: Int -> String -> (Bool, Double) - -#ifdef __GLASGOW_HASKELL__ {-# INLINE key_function #-} -#endif - (You don't need to do the C pre-processor carry-on - unless you're going to stick the code through HBC—it - doesn't like INLINE pragmas.) - The major effect of an INLINE pragma is to declare a function's “cost” to be very low. The normal unfolding machinery will then be very keen to - inline it. + inline it. However, an INLINE pragma for a + function "f" has a number of other effects: + + +No functions are inlined into f. Otherwise +GHC might inline a big function into f's right hand side, +making f big; and then inline f blindly. + + +The float-in, float-out, and common-sub-expression transformations are not +applied to the body of f. + + +An INLINE function is not worker/wrappered by strictness analysis. +It's going to be inlined wholesale instead. + + +All of these effects are aimed at ensuring that what gets inlined is +exactly what you asked for, no more and no less. + +GHC ensures that inlining cannot go on forever: every mutually-recursive +group is cut by one or more loop breakers that is never inlined +(see +Secrets of the GHC inliner, JFP 12(4) July 2002). +GHC tries not to select a function with an INLINE pragma as a loop breaker, but +when there is no choice even an INLINE function can be selected, in which case +the INLINE pragma is ignored. +For example, for a self-recursive function, the loop breaker can only be the function +itself, so an INLINE pragma is always ignored. Syntactically, an INLINE pragma for a function can be put anywhere its type signature could be @@ -5872,14 +7263,18 @@ key_function :: Int -> String -> (Bool, Double) UniqueSupply monad code, we have: -#ifdef __GLASGOW_HASKELL__ {-# INLINE thenUs #-} {-# INLINE returnUs #-} -#endif See also the NOINLINE pragma (). + + Note: the HBC compiler doesn't like INLINE pragmas, + so if you want your code to be HBC-compatible you'll have to surround + the pragma with C pre-processor directives + #ifdef __GLASGOW_HASKELL__...#endif. + @@ -5933,7 +7328,7 @@ key_function :: Int -> String -> (Bool, Double) there was no pragma). - "INLINE[~k] f" means: be willing to inline + "NOINLINE[~k] f" means: be willing to inline f until phase k, but from phase k onwards do not inline it. @@ -5967,6 +7362,83 @@ happen. + + ANN pragmas + + GHC offers the ability to annotate various code constructs with additional + data by using three pragmas. This data can then be inspected at a later date by + using GHC-as-a-library. + + + Annotating values + + ANN + + Any expression that has both Typeable and Data instances may be attached to a top-level value + binding using an ANN pragma. In particular, this means you can use ANN + to annotate data constructors (e.g. Just) as well as normal values (e.g. take). + By way of example, to annotate the function foo with the annotation Just "Hello" + you would do this: + + +{-# ANN foo (Just "Hello") #-} +foo = ... + + + + A number of restrictions apply to use of annotations: + + The binder being annotated must be at the top level (i.e. no nested binders) + The binder being annotated must be declared in the current module + The expression you are annotating with must have a type with Typeable and Data instances + The Template Haskell staging restrictions apply to the + expression being annotated with, so for example you cannot run a function from the module being compiled. + + To be precise, the annotation {-# ANN x e #-} is well staged if and only if $(e) would be + (disregarding the usual type restrictions of the splice syntax, and the usual restriction on splicing inside a splice - $([|1|]) is fine as an annotation, albeit redundant). + + + If you feel strongly that any of these restrictions are too onerous, + please give the GHC team a shout. + + + However, apart from these restrictions, many things are allowed, including expressions which are not fully evaluated! + Annotation expressions will be evaluated by the compiler just like Template Haskell splices are. So, this annotation is fine: + + +{-# ANN f SillyAnnotation { foo = (id 10) + $([| 20 |]), bar = 'f } #-} +f = ... + + + + + Annotating types + + ANN type + ANN + + You can annotate types with the ANN pragma by using the type keyword. For example: + + +{-# ANN type Foo (Just "A `Maybe String' annotation") #-} +data Foo = ... + + + + + Annotating modules + + ANN module + ANN + + You can annotate modules with the ANN pragma by using the module keyword. For example: + + +{-# ANN module (Just "A `Maybe String' annotation") #-} + + + + LINE pragma @@ -6164,14 +7636,14 @@ data T = T {-# UNPACK #-} !(Int,Int) will store the two Ints directly in the T constructor, by flattening the pair. - Multi-level unpacking is also supported: + Multi-level unpacking is also supported: data T = T {-# UNPACK #-} !S data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int - will store two unboxed Int#s + will store two unboxed Int#s directly in the T constructor. The unpacker can see through newtypes, too. @@ -6185,6 +7657,15 @@ data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int constructor field. + + SOURCE pragma + + SOURCE + The {-# SOURCE #-} pragma is used only in import declarations, + to break a module loop. It is described in detail in . + + + @@ -6198,23 +7679,19 @@ data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int The programmer can specify rewrite rules as part of the source program -(in a pragma). GHC applies these rewrite rules wherever it can, provided (a) -the flag () is on, -and (b) the flag -() is not specified, and (c) the - () -flag is active. - - - +(in a pragma). Here is an example: {-# RULES - "map/map" forall f g xs. map f (map g xs) = map (f.g) xs - #-} + "map/map" forall f g xs. map f (map g xs) = map (f.g) xs + #-} - + + +Use the debug flag to see what rules fired. +If you need more information, then shows you +each individual rule firing in detail. @@ -6224,15 +7701,32 @@ Here is an example: From a syntactic point of view: - + - There may be zero or more rules in a RULES pragma. + There may be zero or more rules in a RULES pragma, separated by semicolons (which + may be generated by the layout rule). + +The layout rule applies in a pragma. +Currently no new indentation level +is set, so if you put several rules in single RULES pragma and wish to use layout to separate them, +you must lay out the starting in the same column as the enclosing definitions. + + {-# RULES + "map/map" forall f g xs. map f (map g xs) = map (f.g) xs + "map/append" forall f xs ys. map f (xs ++ ys) = map f xs ++ map f ys + #-} + +Furthermore, the closing #-} +should start in a column to the right of the opening {-#. + + + Each rule has a name, enclosed in double quotes. The name itself has no significance at all. It is only used when reporting how many times the rule fired. @@ -6246,7 +7740,7 @@ immediately after the name of the rule. Thus: {-# RULES "map/map" [2] forall f g xs. map f (map g xs) = map (f.g) xs - #-} + #-} The "[2]" means that the rule is active in Phase 2 and subsequent phases. The inverse notation "[~2]" is also accepted, meaning that the rule is active up to, but not including, @@ -6255,17 +7749,8 @@ Phase 2. - - - - Layout applies in a RULES pragma. Currently no new indentation level -is set, so you must lay out your rules starting in the same column as the -enclosing definitions. - - - Each variable mentioned in a rule must either be in scope (e.g. map), or bound by the forall (e.g. f, g, xs). The variables bound by @@ -6314,17 +7799,40 @@ variables it mentions, though of course they need to be in scope. - Rules are automatically exported from a module, just as instance declarations are. + All rules are implicitly exported from the module, and are therefore +in force in any module that imports the module that defined the rule, directly +or indirectly. (That is, if A imports B, which imports C, then C's rules are +in force when compiling A.) The situation is very similar to that for instance +declarations. + + + + + + +Inside a RULE "forall" is treated as a keyword, regardless of +any other flag settings. Furthermore, inside a RULE, the language extension + is automatically enabled; see +. + + +Like other pragmas, RULE pragmas are always checked for scope errors, and +are typechecked. Typechecking means that the LHS and RHS of a rule are typechecked, +and must have the same type. However, rules are only enabled +if the flag is +on (see ). + + - + Semantics @@ -6332,9 +7840,17 @@ From a semantic point of view: - -Rules are only applied if you use the flag. +Rules are enabled (that is, used during optimisation) +by the flag. +This flag is implied by , and may be switched +off (as usual) by . +(NB: enabling without +may not do what you expect, though, because without GHC +ignores all optimisation information in interface files; +see , .) +Note that is an optimisation flag, and +has no effect on parsing or typechecking. @@ -6351,14 +7867,6 @@ expression by substituting for the pattern variables. - The LHS and RHS of a rule are typechecked, and must have the -same type. - - - - - - GHC makes absolutely no attempt to verify that the LHS and RHS of a rule have the same meaning. That is undecidable in general, and infeasible in most interesting cases. The responsibility is entirely the programmer's! @@ -6372,7 +7880,7 @@ infeasible in most interesting cases. The responsibility is entirely the progra terminating. For example: - "loop" forall x,y. f x y = f y x + "loop" forall x y. f x y = f y x This rule will cause the compiler to go into an infinite loop. @@ -6425,48 +7933,32 @@ not be substituted, and the rule would not fire. - In the earlier phases of compilation, GHC inlines nothing -that appears on the LHS of a rule, because once you have substituted -for something you can't match against it (given the simple minded -matching). So if you write the rule - +Ordinary inlining happens at the same time as rule rewriting, which may lead to unexpected +results. Consider this (artificial) example - "map/map" forall f,g. map f . map g = map (f.g) - +f x = x +{-# RULES "f" f True = False #-} -this won't match the expression map f (map g xs). -It will only match something written with explicit use of ".". -Well, not quite. It will match the expression +g y = f y - -wibble f g xs +h z = g True - -where wibble is defined: - +Since f's right-hand side is small, it is inlined into g, +to give -wibble f g = map f . map g +g y = y - -because wibble will be inlined (it's small). - -Later on in compilation, GHC starts inlining even things on the -LHS of rules, but still leaves the rules enabled. This inlining -policy is controlled by the per-simplification-pass flag n. - +Now g is inlined into h, but f's RULE has +no chance to fire. +If instead GHC had first inlined g into h then there +would have been a better chance that f's RULE might fire. - - - - All rules are implicitly exported from the module, and are therefore -in force in any module that imports the module that defined the rule, directly -or indirectly. (That is, if A imports B, which imports C, then C's rules are -in force when compiling A.) The situation is very similar to that for instance -declarations. +The way to get predictable behaviour is to use a NOINLINE +pragma on f, to ensure +that it is not inlined until its RULEs have had a chance to fire. - @@ -6809,7 +8301,7 @@ g x = show x - However, when external for is generated (via + However, when external core is generated (via ), there will be Notes attached to the expressions show and x. The core function declaration for f is: @@ -6851,7 +8343,7 @@ r) -> Special built-in functions GHC has a few built-in functions with special behaviour. These are now described in the module GHC.Prim +url="../libraries/ghc-prim/GHC-Prim.html">GHC.Prim in the library documentation. @@ -7121,7 +8613,7 @@ carried out at let and where bindings. Haskell's monomorphism restriction (see -Section +Section 4.5.5 of the Haskell Report) can be completely switched off by @@ -7148,7 +8640,7 @@ can be completely switched off by [x] = e -- A pattern binding Experimentally, GHC now makes pattern bindings monomorphic by -default. Use to recover the +default. Use to recover the standard behaviour. @@ -7160,6 +8652,7 @@ standard behaviour. ;;; Local Variables: *** ;;; mode: xml *** ;;; sgml-parent-document: ("users_guide.xml" "book" "chapter" "sect1") *** + ;;; ispell-local-dictionary: "british" *** ;;; End: *** -->