X-Git-Url: http://git.megacz.com/?p=ghc-hetmet.git;a=blobdiff_plain;f=docs%2Fusers_guide%2Fglasgow_exts.xml;h=cb5033a0909eff1463e4f759c91d1860ae5cf952;hp=9c1b2c7a789e9524afdf68ac03cbe9e0226deb74;hb=c0f542271da944d540faf91678c48ff856174b57;hpb=d5019cfeeaa33967d19146e235722b953e1d800f diff --git a/docs/users_guide/glasgow_exts.xml b/docs/users_guide/glasgow_exts.xml index 9c1b2c7..cb5033a 100644 --- a/docs/users_guide/glasgow_exts.xml +++ b/docs/users_guide/glasgow_exts.xml @@ -38,11 +38,21 @@ documentation describes all the libraries that come with GHC. extensionsoptions controlling - These flags control what variation of the language are + The language option flag control what variation of the language are permitted. Leaving out all of them gives you standard Haskell 98. - NB. turning on an option that enables special syntax + Generally speaking, all the language options are introduced by "", + e.g. . + + + All the language options can be turned off by using the prefix ""; + e.g. "". + + Language options recognised by Cabal can also be enabled using the LANGUAGE pragma, + thus {-# LANGUAGE TemplateHaskell #-} (see >). + + Turning on an option that enables special syntax might cause working Haskell 98 code to fail to compile, perhaps because it uses a variable name which has become a reserved word. So, together with each option below, we @@ -81,7 +91,8 @@ documentation describes all the libraries that come with GHC. This simultaneously enables all of the extensions to Haskell 98 described in , except where otherwise - noted. + noted. We are trying to move away from this portmanteau flag, + and towards enabling features individually. New reserved words: forall (only in types), mdo. @@ -95,20 +106,24 @@ documentation describes all the libraries that come with GHC. float##, (#, #), |), {|. + + Implies these specific language options: + , + , + , + , + . - and : - - + : + This option enables the language extension defined in the - Haskell 98 Foreign Function Interface Addendum plus deprecated - syntax of previous versions of the FFI for backwards - compatibility. + Haskell 98 Foreign Function Interface Addendum. New reserved words: foreign. @@ -116,11 +131,32 @@ documentation describes all the libraries that come with GHC. - : - + : + + + Allow "#" as a postfix modifier on identifiers. + + + + + + + ,: + + + These two flags control how generalisation is done. + See . + + + + + + + : + - Switch off the Haskell 98 monomorphism restriction. + Use GHCi's extended default rules in a regular module (). Independent of the flag. @@ -128,24 +164,55 @@ documentation describes all the libraries that come with GHC. - - + + + + + + + + + + + + + These flags control higher-rank polymorphism. + See . + New reserved words: forall. + + + + + + + + + + Allow more liberal type synonyms. + See . + New reserved words: forall. + + + + + + + - - + + - - + + - + - See . Only relevant - if you also use . + See . @@ -162,8 +229,8 @@ documentation describes all the libraries that come with GHC. - - + + See . Independent of @@ -181,8 +248,8 @@ documentation describes all the libraries that come with GHC. - - + + See . Independent of @@ -191,13 +258,13 @@ documentation describes all the libraries that come with GHC. - + - -fno-implicit-prelude + -XNoImplicitPrelude option GHC normally imports Prelude.hi files for you. If you'd rather it didn't, then give it a - option. The idea is + option. The idea is that you can then import a Prelude of your own. (But don't call it Prelude; the Haskell module namespace is flat, and you must not conflict with any @@ -212,14 +279,14 @@ documentation describes all the libraries that come with GHC. translation for list comprehensions continues to use Prelude.map etc. - However, does + However, does change the handling of certain built-in syntax: see . - + Enables implicit parameters (see ). Currently also implied by @@ -232,7 +299,15 @@ documentation describes all the libraries that come with GHC. - + + + Enables overloaded string literals (see ). + + + + + Enables lexically-scoped type variables (see ). Implied by @@ -241,7 +316,7 @@ documentation describes all the libraries that come with GHC. - + Enables Template Haskell (see ). This flag must @@ -256,16 +331,26 @@ documentation describes all the libraries that come with GHC. + + + + Enables quasiquotation (see ). + + Syntax stolen: + [:varid|. + + + - - Unboxed types and primitive operations -GHC is built on a raft of primitive data types and operations. +GHC is built on a raft of primitive data types and operations; +"primitive" in the sense that they cannot be defined in Haskell itself. While you really can use this stuff to write fast code, we generally find it a lot less painful, and more satisfying in the long run, to use higher-level language features and libraries. With @@ -273,28 +358,21 @@ While you really can use this stuff to write fast code, unboxed version in any case. And if it isn't, we'd like to know about it. -We do not currently have good, up-to-date documentation about the -primitives, perhaps because they are mainly intended for internal use. -There used to be a long section about them here in the User Guide, but it -became out of date, and wrong information is worse than none. - -The Real Truth about what primitive types there are, and what operations -work over those types, is held in the file -fptools/ghc/compiler/prelude/primops.txt.pp. -This file is used directly to generate GHC's primitive-operation definitions, so -it is always correct! It is also intended for processing into text. - - Indeed, -the result of such processing is part of the description of the - External - Core language. -So that document is a good place to look for a type-set version. -We would be very happy if someone wanted to volunteer to produce an SGML -back end to the program that processes primops.txt so that -we could include the results here in the User Guide. - -What follows here is a brief summary of some main points. +All these primitive data types and operations are exported by the +library GHC.Prim, for which there is +detailed online documentation. +(This documentation is generated from the file compiler/prelude/primops.txt.pp.) + + +If you want to mention any of the primitive data types or operations in your +program, you must first import GHC.Prim to bring them +into scope. Many of them have names ending in "#", and to mention such +names you need the extension (). + + +The primops make extensive use of unboxed types +and unboxed tuples, which +we briefly summarise here. Unboxed types @@ -327,8 +405,11 @@ know and love—usually one instruction. Primitive (unboxed) types cannot be defined in Haskell, and are therefore built into the language and compiler. Primitive types are always unlifted; that is, a value of a primitive type cannot be -bottom. We use the convention that primitive types, values, and -operations have a <literal>#</literal> suffix. +bottom. We use the convention (but it is only a convention) +that primitive types, values, and +operations have a <literal>#</literal> suffix (see <xref linkend="magic-hash"/>). +For some primitive types we have special syntax for literals, also +described in the <link linkend="magic-hash">same section</link>. </para> <para> @@ -366,6 +447,13 @@ worse, the unboxed value might be larger than a pointer (<literal>Double#</literal> for instance). </para> </listitem> +<listitem><para> You cannot define a newtype whose representation type +(the argument type of the data constructor) is an unboxed type. Thus, +this is illegal: +<programlisting> + newtype A = MkA Int# +</programlisting> +</para></listitem> <listitem><para> You cannot bind a variable with an unboxed type in a <emphasis>top-level</emphasis> binding. </para></listitem> @@ -498,8 +586,40 @@ Indeed, the bindings can even be recursive. <sect1 id="syntax-extns"> <title>Syntactic extensions + + The magic hash + The language extension allows "#" as a + postfix modifier to identifiers. Thus, "x#" is a valid variable, and "T#" is + a valid type constructor or data constructor. + + The hash sign does not change sematics at all. We tend to use variable + names ending in "#" for unboxed values or types (e.g. Int#), + but there is no requirement to do so; they are just plain ordinary variables. + Nor does the extension bring anything into scope. + For example, to bring Int# into scope you must + import GHC.Prim (see ); + the extension + then allows you to refer to the Int# + that is now in scope. + The also enables some new forms of literals (see ): + + 'x'# has type Char# + "foo"# has type Addr# + 3# has type Int#. In general, + any Haskell 98 integer lexeme followed by a # is an Int# literal, e.g. + -0x3A# as well as 32#. + 3## has type Word#. In general, + any non-negative Haskell 98 integer lexeme followed by ## + is a Word#. + 3.2# has type Float#. + 3.2## has type Double# + + + + + Hierarchical Modules @@ -535,14 +655,11 @@ import qualified Control.Monad.ST.Strict as ST linkend="search-path"/>. GHC comes with a large collection of libraries arranged - hierarchically; see the accompanying library documentation. - There is an ongoing project to create and maintain a stable set - of core libraries used by several Haskell - compilers, and the libraries that GHC comes with represent the - current status of that project. For more details, see Haskell - Libraries. - + hierarchically; see the accompanying library + documentation. More libraries to install are available + from HackageDB. @@ -611,7 +728,7 @@ to write clunky would be to use case expressions: -clunky env var1 var1 = case lookup env var1 of +clunky env var1 var2 = case lookup env var1 of Nothing -> fail Just val1 -> case lookup env var2 of Nothing -> fail @@ -636,7 +753,7 @@ Here is how I would write clunky: -clunky env var1 var1 +clunky env var1 var2 | Just val1 <- lookup env var1 , Just val2 <- lookup env var2 = val1 + val2 @@ -674,6 +791,202 @@ qualifier list has just one element, a boolean expression. + + + +View patterns + + + +View patterns are enabled by the flag -XViewPatterns. +More information and examples of view patterns can be found on the +Wiki +page. + + + +View patterns are somewhat like pattern guards that can be nested inside +of other patterns. They are a convenient way of pattern-matching +against values of abstract types. For example, in a programming language +implementation, we might represent the syntax of the types of the +language as follows: + + +type Typ + +data TypView = Unit + | Arrow Typ Typ + +view :: Type -> TypeView + +-- additional operations for constructing Typ's ... + + +The representation of Typ is held abstract, permitting implementations +to use a fancy representation (e.g., hash-consing to manage sharing). + +Without view patterns, using this signature a little inconvenient: + +size :: Typ -> Integer +size t = case view t of + Unit -> 1 + Arrow t1 t2 -> size t1 + size t2 + + +It is necessary to iterate the case, rather than using an equational +function definition. And the situation is even worse when the matching +against t is buried deep inside another pattern. + + + +View patterns permit calling the view function inside the pattern and +matching against the result: + +size (view -> Unit) = 1 +size (view -> Arrow t1 t2) = size t1 + size t2 + + +That is, we add a new form of pattern, written +expression -> +pattern that means "apply the expression to +whatever we're trying to match against, and then match the result of +that application against the pattern". The expression can be any Haskell +expression of function type, and view patterns can be used wherever +patterns are used. + + + +The semantics of a pattern ( +exp -> +pat ) are as follows: + + + + Scoping: + +The variables bound by the view pattern are the variables bound by +pat. + + + +Any variables in exp are bound occurrences, +but variables bound "to the left" in a pattern are in scope. This +feature permits, for example, one argument to a function to be used in +the view of another argument. For example, the function +clunky from can be +written using view patterns as follows: + + +clunky env (lookup env -> Just val1) (lookup env -> Just val2) = val1 + val2 +...other equations for clunky... + + + + +More precisely, the scoping rules are: + + + +In a single pattern, variables bound by patterns to the left of a view +pattern expression are in scope. For example: + +example :: Maybe ((String -> Integer,Integer), String) -> Bool +example Just ((f,_), f -> 4) = True + + +Additionally, in function definitions, variables bound by matching earlier curried +arguments may be used in view pattern expressions in later arguments: + +example :: (String -> Integer) -> String -> Bool +example f (f -> 4) = True + +That is, the scoping is the same as it would be if the curried arguments +were collected into a tuple. + + + + + +In mutually recursive bindings, such as let, +where, or the top level, view patterns in one +declaration may not mention variables bound by other declarations. That +is, each declaration must be self-contained. For example, the following +program is not allowed: + +let {(x -> y) = e1 ; + (y -> x) = e2 } in x + + +(We may lift this +restriction in the future; the only cost is that type checking patterns +would get a little more complicated.) + + + + + + + + + + Typing: If exp has type +T1 -> +T2 and pat matches +a T2, then the whole view pattern matches a +T1. + + + Matching: To the equations in Section 3.17.3 of the +Haskell 98 +Report, add the following: + +case v of { (e -> p) -> e1 ; _ -> e2 } + = +case (e v) of { p -> e1 ; _ -> e2 } + +That is, to match a variable v against a pattern +( exp +-> pat +), evaluate ( +exp v +) and match the result against +pat. + + + Efficiency: When the same view function is applied in +multiple branches of a function definition or a case expression (e.g., +in size above), GHC makes an attempt to collect these +applications into a single nested case expression, so that the view +function is only applied once. Pattern compilation in GHC follows the +matrix algorithm described in Chapter 4 of The +Implementation of Functional Programming Languages. When the +top rows of the first column of a matrix are all view patterns with the +"same" expression, these patterns are transformed into a single nested +case. This includes, for example, adjacent view patterns that line up +in a tuple, as in + +f ((view -> A, p1), p2) = e1 +f ((view -> B, p3), p4) = e2 + + + + The current notion of when two view pattern expressions are "the +same" is very restricted: it is not even full syntactic equality. +However, it does include variables, literals, applications, and tuples; +e.g., two instances of view ("hi", "there") will be +collected. However, the current implementation does not compare up to +alpha-equivalence, so two instances of (x, view x -> +y) will not be coalesced. + + + + + + + + + @@ -681,9 +994,11 @@ qualifier list has just one element, a boolean expression. The recursive do-notation (also known as mdo-notation) is implemented as described in -"A recursive do for Haskell", -Levent Erkok, John Launchbury", +A recursive do for Haskell, +by Levent Erkok, John Launchbury, Haskell Workshop 2002, pages: 29-37. Pittsburgh, Pennsylvania. +This paper is essential reading for anyone making non-trivial use of mdo-notation, +and we do not repeat it here. The do-notation of Haskell does not allow recursive bindings, @@ -714,17 +1029,24 @@ class Monad m => MonadFix m where The function mfix -dictates how the required recursion operation should be performed. If recursive bindings are required for a monad, -then that monad must be declared an instance of the MonadFix class. -For details, see the above mentioned reference. +dictates how the required recursion operation should be performed. For example, +justOnes desugars as follows: + +justOnes = mfix (\xs' -> do { xs <- Just (1:xs'); return xs } + +For full details of the way in which mdo is typechecked and desugared, see +the paper A recursive do for Haskell. +In particular, GHC implements the segmentation technique described in Section 3.2 of the paper. +If recursive bindings are required for a monad, +then that monad must be declared an instance of the MonadFix class. The following instances of MonadFix are automatically provided: List, Maybe, IO. Furthermore, the Control.Monad.ST and Control.Monad.ST.Lazy modules provide the instances of the MonadFix class for Haskell's internal state monad (strict and lazy, respectively). -There are three important points in using the recursive-do notation: +Here are some important points in using the recursive-do notation: The recursive version of the do-notation uses the keyword mdo (rather @@ -732,20 +1054,27 @@ than do). -You should import Control.Monad.Fix. -(Note: Strictly speaking, this import is required only when you need to refer to the name -MonadFix in your program, but the import is always safe, and the programmers -are encouraged to always import this module when using the mdo-notation.) +It is enabled with the flag -XRecursiveDo, which is in turn implied by +-fglasgow-exts. + + + +Unlike ordinary do-notation, but like let and where bindings, +name shadowing is not allowed; that is, all the names bound in a single mdo must +be distinct (Section 3.3 of the paper). -As with other extensions, ghc should be given the flag -fglasgow-exts +Variables bound by a let statement in an mdo +are monomorphic in the mdo (Section 3.1 of the paper). However +GHC breaks the mdo into segments to enhance polymorphism, +and improve termination (Section 3.2 of the paper). -The web page: http://www.cse.ogi.edu/PacSoft/projects/rmb +The web page: http://www.cse.ogi.edu/PacSoft/projects/rmb/ contains up to date information on recursive monadic bindings. @@ -810,11 +1139,172 @@ This name is not supported by GHC. branches. + + + + + Generalised (SQL-Like) List Comprehensions + list comprehensionsgeneralised + + extended list comprehensions + + group + sql + + + Generalised list comprehensions are a further enhancement to the + list comprehension syntatic sugar to allow operations such as sorting + and grouping which are familiar from SQL. They are fully described in the + paper + Comprehensive comprehensions: comprehensions with "order by" and "group by", + except that the syntax we use differs slightly from the paper. +Here is an example: + +employees = [ ("Simon", "MS", 80) +, ("Erik", "MS", 100) +, ("Phil", "Ed", 40) +, ("Gordon", "Ed", 45) +, ("Paul", "Yale", 60)] + +output = [ (the dept, sum salary) +| (name, dept, salary) <- employees +, then group by dept +, then sortWith by (sum salary) +, then take 5 ] + +In this example, the list output would take on + the value: + + +[("Yale", 60), ("Ed", 85), ("MS", 180)] + + +There are three new keywords: group, by, and using. +(The function sortWith is not a keyword; it is an ordinary +function that is exported by GHC.Exts.) + +There are five new forms of comprehension qualifier, +all introduced by the (existing) keyword then: + + + + +then f + + + This statement requires that f have the type + forall a. [a] -> [a]. You can see an example of it's use in the + motivating example, as this form is used to apply take 5. + + + + + + + +then f by e + + + This form is similar to the previous one, but allows you to create a function + which will be passed as the first argument to f. As a consequence f must have + the type forall a. (a -> t) -> [a] -> [a]. As you can see + from the type, this function lets f "project out" some information + from the elements of the list it is transforming. + + An example is shown in the opening example, where sortWith + is supplied with a function that lets it find out the sum salary + for any item in the list comprehension it transforms. + + + + + + + +then group by e using f + + + This is the most general of the grouping-type statements. In this form, + f is required to have type forall a. (a -> t) -> [a] -> [[a]]. + As with the then f by e case above, the first argument + is a function supplied to f by the compiler which lets it compute e on every + element of the list being transformed. However, unlike the non-grouping case, + f additionally partitions the list into a number of sublists: this means that + at every point after this statement, binders occurring before it in the comprehension + refer to lists of possible values, not single values. To help understand + this, let's look at an example: + + +-- This works similarly to groupWith in GHC.Exts, but doesn't sort its input first +groupRuns :: Eq b => (a -> b) -> [a] -> [[a]] +groupRuns f = groupBy (\x y -> f x == f y) + +output = [ (the x, y) +| x <- ([1..3] ++ [1..2]) +, y <- [4..6] +, then group by x using groupRuns ] + + + This results in the variable output taking on the value below: + + +[(1, [4, 5, 6]), (2, [4, 5, 6]), (3, [4, 5, 6]), (1, [4, 5, 6]), (2, [4, 5, 6])] + + + Note that we have used the the function to change the type + of x from a list to its original numeric type. The variable y, in contrast, is left + unchanged from the list form introduced by the grouping. + + + + + + +then group by e + + + This form of grouping is essentially the same as the one described above. However, + since no function to use for the grouping has been supplied it will fall back on the + groupWith function defined in + GHC.Exts. This + is the form of the group statement that we made use of in the opening example. + + + + + + + +then group using f + + + With this form of the group statement, f is required to simply have the type + forall a. [a] -> [[a]], which will be used to group up the + comprehension so far directly. An example of this form is as follows: + + +output = [ x +| y <- [1..5] +, x <- "hello" +, then group using inits] + + + This will yield a list containing every prefix of the word "hello" written out 5 times: + + +["","h","he","hel","hell","hello","helloh","hellohe","hellohel","hellohell","hellohello","hellohelloh",...] + + + + + + + + Rebindable syntax - GHC allows most kinds of built-in syntax to be rebound by the user, to facilitate replacing the Prelude with a home-grown version, for example. @@ -823,7 +1313,7 @@ This name is not supported by GHC. hierarchy. It completely defeats that purpose if the literal "1" means "Prelude.fromInteger 1", which is what the Haskell Report specifies. - So the flag causes + So the flag causes the following pieces of built-in syntax to refer to whatever is in scope, not the Prelude versions: @@ -876,7 +1366,7 @@ This name is not supported by GHC. In all cases (apart from arrow notation), the static semantics should be that of the desugared form, -even if that is a little unexpected. For emample, the +even if that is a little unexpected. For example, the static semantics of the literal 368 is exactly that of fromInteger (368::Integer); it's fine for fromInteger to have any of the types: @@ -894,47 +1384,306 @@ fromInteger :: Integer -> Bool -> Bool you should be all right. - + +Postfix operators - - -Type system extensions + +GHC allows a small extension to the syntax of left operator sections, which +allows you to define postfix operators. The extension is this: the left section + + (e !) + +is equivalent (from the point of view of both type checking and execution) to the expression + + ((!) e) + +(for any expression e and operator (!). +The strict Haskell 98 interpretation is that the section is equivalent to + + (\y -> (!) e y) + +That is, the operator must be a function of two arguments. GHC allows it to +take only one argument, and that in turn allows you to write the function +postfix. + +Since this extension goes beyond Haskell 98, it should really be enabled +by a flag; but in fact it is enabled all the time. (No Haskell 98 programs +change their behaviour, of course.) + +The extension does not extend to the left-hand side of function +definitions; you must define such a function in prefix form. + - -Data types and type synonyms + +Record field disambiguation + +In record construction and record pattern matching +it is entirely unambiguous which field is referred to, even if there are two different +data types in scope with a common field name. For example: + +module M where + data S = MkS { x :: Int, y :: Bool } - -Data types with no constructors +module Foo where + import M -With the flag, GHC lets you declare -a data type with no constructors. For example: + data T = MkT { x :: Int } + + ok1 (MkS { x = n }) = n+1 -- Unambiguous - - data S -- S :: * - data T a -- T :: * -> * + ok2 n = MkT { x = n+1 } -- Unambiguous + + bad1 k = k { x = 3 } -- Ambiguous + bad2 k = x k -- Ambiguous +Even though there are two x's in scope, +it is clear that the x in the pattern in the +definition of ok1 can only mean the field +x from type S. Similarly for +the function ok2. However, in the record update +in bad1 and the record selection in bad2 +it is not clear which of the two types is intended. + + +Haskell 98 regards all four as ambiguous, but with the + flag, GHC will accept +the former two. The rules are precisely the same as those for instance +declarations in Haskell 98, where the method names on the left-hand side +of the method bindings in an instance declaration refer unambiguously +to the method of that class (provided they are in scope at all), even +if there are other variables in scope with the same name. +This reduces the clutter of qualified names when you import two +records from different modules that use the same field name. + + -Syntactically, the declaration lacks the "= constrs" part. The -type can be parameterised over types of any kind, but if the kind is -not * then an explicit kind annotation must be used -(see ). + -Such data types have only one value, namely bottom. -Nevertheless, they can be useful when defining "phantom types". - + +Record puns + - -Infix type constructors, classes, and type variables + +Record puns are enabled by the flag -XNamedFieldPuns. + -GHC allows type constructors, classes, and type variables to be operators, and -to be written infix, very much like expressions. More specifically: - - - A type constructor or class can be an operator, beginning with a colon; e.g. :*:. - The lexical syntax is the same as that for data constructors. +When using records, it is common to write a pattern that binds a +variable with the same name as a record field, such as: + + +data C = C {a :: Int} +f (C {a = a}) = a + + + + +Record punning permits the variable name to be elided, so one can simply +write + + +f (C {a}) = a + + +to mean the same pattern as above. That is, in a record pattern, the +pattern a expands into the pattern a = +a for the same name a. + + + +Note that puns and other patterns can be mixed in the same record: + +data C = C {a :: Int, b :: Int} +f (C {a, b = 4}) = a + +and that puns can be used wherever record patterns occur (e.g. in +let bindings or at the top-level). + + + +Record punning can also be used in an expression, writing, for example, + +let a = 1 in C {a} + +instead of + +let a = 1 in C {a = a} + + +Note that this expansion is purely syntactic, so the record pun +expression refers to the nearest enclosing variable that is spelled the +same as the field name. + + + + + + + +Record wildcards + + + +Record wildcards are enabled by the flag -XRecordWildCards. + + + +For records with many fields, it can be tiresome to write out each field +individually in a record pattern, as in + +data C = C {a :: Int, b :: Int, c :: Int, d :: Int} +f (C {a = 1, b = b, c = c, d = d}) = b + c + d + + + + +Record wildcard syntax permits a (..) in a record +pattern, where each elided field f is replaced by the +pattern f = f. For example, the above pattern can be +written as + +f (C {a = 1, ..}) = b + c + d + + + + +Note that wildcards can be mixed with other patterns, including puns +(); for example, in a pattern C {a += 1, b, ..}). Additionally, record wildcards can be used +wherever record patterns occur, including in let +bindings and at the top-level. For example, the top-level binding + +C {a = 1, ..} = e + +defines b, c, and +d. + + + +Record wildcards can also be used in expressions, writing, for example, + + +let {a = 1; b = 2; c = 3; d = 4} in C {..} + + +in place of + + +let {a = 1; b = 2; c = 3; d = 4} in C {a=a, b=b, c=c, d=d} + + +Note that this expansion is purely syntactic, so the record wildcard +expression refers to the nearest enclosing variables that are spelled +the same as the omitted field names. + + + + + + + +Local Fixity Declarations + + +A careful reading of the Haskell 98 Report reveals that fixity +declarations (infix, infixl, and +infixr) are permitted to appear inside local bindings +such those introduced by let and +where. However, the Haskell Report does not specify +the semantics of such bindings very precisely. + + +In GHC, a fixity declaration may accompany a local binding: + +let f = ... + infixr 3 `f` +in + ... + +and the fixity declaration applies wherever the binding is in scope. +For example, in a let, it applies in the right-hand +sides of other let-bindings and the body of the +letC. Or, in recursive do +expressions (), the local fixity +declarations of a let statement scope over other +statements in the group, just as the bound name does. + + + +Moreover, a local fixity declaration *must* accompany a local binding of +that name: it is not possible to revise the fixity of name bound +elsewhere, as in + +let infixr 9 $ in ... + + +Because local fixity declarations are technically Haskell 98, no flag is +necessary to enable them. + + + + + Package-qualified imports + + With the flag, GHC allows + import declarations to be qualified by the package name that the + module is intended to be imported from. For example: + + +import "network" Network.Socket + + + would import the module Network.Socket from + the package network (any version). This may + be used to disambiguate an import when the same module is + available from multiple packages, or is present in both the + current package being built and an external package. + + Note: you probably don't need to use this feature, it was + added mainly so that we can build backwards-compatible versions of + packages when APIs change. It can lead to fragile dependencies in + the common case: modules occasionally move from one package to + another, rendering any package-qualified imports broken. + + + + + + +Extensions to data types and type synonyms + + +Data types with no constructors + +With the flag, GHC lets you declare +a data type with no constructors. For example: + + + data S -- S :: * + data T a -- T :: * -> * + + +Syntactically, the declaration lacks the "= constrs" part. The +type can be parameterised over types of any kind, but if the kind is +not * then an explicit kind annotation must be used +(see ). + +Such data types have only one value, namely bottom. +Nevertheless, they can be useful when defining "phantom types". + + + +Infix type constructors, classes, and type variables + + +GHC allows type constructors, classes, and type variables to be operators, and +to be written infix, very much like expressions. More specifically: + + + A type constructor or class can be an operator, beginning with a colon; e.g. :*:. + The lexical syntax is the same as that for data constructors. Data type and type-synonym declarations can be written infix, parenthesised @@ -992,15 +1741,18 @@ to be written infix, very much like expressions. More specifically: - + - + Liberalised type synonyms -Type synonyms are like macros at the type level, and +Type synonyms are like macros at the type level, but Haskell 98 imposes many rules +on individual synonym declarations. +With the extension, GHC does validity checking on types only after expanding type synonyms. -That means that GHC can be very much more liberal about type synonyms than Haskell 98: +That means that GHC can be very much more liberal about type synonyms than Haskell 98. + You can write a forall (including overloading) in a type synonym, thus: @@ -1017,7 +1769,8 @@ in a type synonym, thus: -You can write an unboxed tuple in a type synonym: +If you also use , +you can write an unboxed tuple in a type synonym: type Pr = (# Int, Int #) @@ -1084,10 +1837,10 @@ this will be rejected: because GHC does not allow unboxed tuples on the left of a function arrow. - + - + Existentially quantified data constructors @@ -1175,13 +1928,13 @@ apply fn to val to get a boolean. For e -What this allows us to do is to package heterogenous values +What this allows us to do is to package heterogeneous values together with a bunch of functions that manipulate them, and then treat that collection of packages in a uniform manner. You can express quite a bit of object-oriented-like programming this way. - + Why existential? @@ -1204,10 +1957,10 @@ But Haskell programmers can safely think of the ordinary adding a new existential quantification construct. - + - -Type classes + +Existentials and type classes An easy extension is to allow @@ -1261,14 +2014,9 @@ dictionaries for Eq and Show respectively, extract it on pattern matching. - -Notice the way that the syntax fits smoothly with that used for -universal quantification earlier. - - - + - + Record Constructors @@ -1285,7 +2033,7 @@ data Counter a = forall self. NewCounter Here tag is a public field, with a well-typed selector function tag :: Counter a -> a. The self type is hidden from the outside; any attempt to apply _this, -_inc or _output as functions will raise a +_inc or _display as functions will raise a compile-time error. In other words, GHC defines a record selector function only for fields whose type does not mention the existentially-quantified variables. (This example used an underscore in the fields for which record selectors @@ -1320,20 +2068,6 @@ main = do display (inc (inc counterB)) -- prints "##" -In GADT declarations (see ), the explicit -forall may be omitted. For example, we can express -the same Counter a using GADT: - - -data Counter a where - NewCounter { _this :: self - , _inc :: self -> self - , _display :: self -> IO () - , tag :: a - } - :: Counter a - - At the moment, record update syntax is only supported for Haskell 98 data types, so the following function does not work: @@ -1345,10 +2079,10 @@ setTag obj t = obj{ tag = t } - + - + Restrictions @@ -1474,7 +2208,7 @@ are convincing reasons to change it. You can't use deriving to define instances of a data type with existentially quantified data constructors. -Reason: in most cases it would not make sense. For example:# +Reason: in most cases it would not make sense. For example:; data T = forall a. MkT [a] deriving( Eq ) @@ -1499,80 +2233,750 @@ declarations. Define your own instances! - - + + +Declaring data types with explicit constructor signatures - -Class declarations +GHC allows you to declare an algebraic data type by +giving the type signatures of constructors explicitly. For example: + + data Maybe a where + Nothing :: Maybe a + Just :: a -> Maybe a + +The form is called a "GADT-style declaration" +because Generalised Algebraic Data Types, described in , +can only be declared using this form. +Notice that GADT-style syntax generalises existential types (). +For example, these two declarations are equivalent: + + data Foo = forall a. MkFoo a (a -> Bool) + data Foo' where { MKFoo :: a -> (a->Bool) -> Foo' } + + +Any data type that can be declared in standard Haskell-98 syntax +can also be declared using GADT-style syntax. +The choice is largely stylistic, but GADT-style declarations differ in one important respect: +they treat class constraints on the data constructors differently. +Specifically, if the constructor is given a type-class context, that +context is made available by pattern matching. For example: + + data Set a where + MkSet :: Eq a => [a] -> Set a + makeSet :: Eq a => [a] -> Set a + makeSet xs = MkSet (nub xs) + + insert :: a -> Set a -> Set a + insert a (MkSet as) | a `elem` as = MkSet as + | otherwise = MkSet (a:as) + +A use of MkSet as a constructor (e.g. in the definition of makeSet) +gives rise to a (Eq a) +constraint, as you would expect. The new feature is that pattern-matching on MkSet +(as in the definition of insert) makes available an (Eq a) +context. In implementation terms, the MkSet constructor has a hidden field that stores +the (Eq a) dictionary that is passed to MkSet; so +when pattern-matching that dictionary becomes available for the right-hand side of the match. +In the example, the equality dictionary is used to satisfy the equality constraint +generated by the call to elem, so that the type of +insert itself has no Eq constraint. + -This section, and the next one, documents GHC's type-class extensions. -There's lots of background in the paper Type -classes: exploring the design space (Simon Peyton Jones, Mark -Jones, Erik Meijer). +For example, one possible application is to reify dictionaries: + + data NumInst a where + MkNumInst :: Num a => NumInst a + + intInst :: NumInst Int + intInst = MkNumInst + + plus :: NumInst a -> a -> a -> a + plus MkNumInst p q = p + q + +Here, a value of type NumInst a is equivalent +to an explicit (Num a) dictionary. -All the extensions are enabled by the flag. +All this applies to constructors declared using the syntax of . +For example, the NumInst data type above could equivalently be declared +like this: + + data NumInst a + = Num a => MkNumInst (NumInst a) + +Notice that, unlike the situation when declaring an existential, there is +no forall, because the Num constrains the +data type's universally quantified type variable a. +A constructor may have both universal and existential type variables: for example, +the following two declarations are equivalent: + + data T1 a + = forall b. (Num a, Eq b) => MkT1 a b + data T2 a where + MkT2 :: (Num a, Eq b) => a -> b -> T2 a + + +All this behaviour contrasts with Haskell 98's peculiar treatment of +contexts on a data type declaration (Section 4.2.1 of the Haskell 98 Report). +In Haskell 98 the definition + + data Eq a => Set' a = MkSet' [a] + +gives MkSet' the same type as MkSet above. But instead of +making available an (Eq a) constraint, pattern-matching +on MkSet' requires an (Eq a) constraint! +GHC faithfully implements this behaviour, odd though it is. But for GADT-style declarations, +GHC's behaviour is much more useful, as well as much more intuitive. - -Multi-parameter type classes -Multi-parameter type classes are permitted. For example: +The rest of this section gives further details about GADT-style data +type declarations. + + +The result type of each data constructor must begin with the type constructor being defined. +If the result type of all constructors +has the form T a1 ... an, where a1 ... an +are distinct type variables, then the data type is ordinary; +otherwise is a generalised data type (). + + +The type signature of +each constructor is independent, and is implicitly universally quantified as usual. +Different constructors may have different universally-quantified type variables +and different type-class constraints. +For example, this is fine: - class Collection c a where - union :: c a -> c a -> c a - ...etc. + data T a where + T1 :: Eq b => b -> T b + T2 :: (Show c, Ix c) => c -> [c] -> T c + - - - - -The superclasses of a class declaration + +Unlike a Haskell-98-style +data type declaration, the type variable(s) in the "data Set a where" header +have no scope. Indeed, one can write a kind signature instead: + + data Set :: * -> * where ... + +or even a mixture of the two: + + data Foo a :: (* -> *) -> * where ... + +The type variables (if given) may be explicitly kinded, so we could also write the header for Foo +like this: + + data Foo a (b :: * -> *) where ... + + - -There are no restrictions on the context in a class declaration -(which introduces superclasses), except that the class hierarchy must -be acyclic. So these class declarations are OK: + +You can use strictness annotations, in the obvious places +in the constructor type: + + data Term a where + Lit :: !Int -> Term Int + If :: Term Bool -> !(Term a) -> !(Term a) -> Term a + Pair :: Term a -> Term b -> Term (a,b) + + + +You can use a deriving clause on a GADT-style data type +declaration. For example, these two declarations are equivalent - class Functor (m k) => FiniteMap m k where - ... + data Maybe1 a where { + Nothing1 :: Maybe1 a ; + Just1 :: a -> Maybe1 a + } deriving( Eq, Ord ) - class (Monad m, Monad (t m)) => Transform t m where - lift :: m a -> (t m) a + data Maybe2 a = Nothing2 | Just2 a + deriving( Eq, Ord ) + + +You can use record syntax on a GADT-style data type declaration: + + data Person where + Adult { name :: String, children :: [Person] } :: Person + Child { name :: String } :: Person + +As usual, for every constructor that has a field f, the type of +field f must be the same (modulo alpha conversion). -As in Haskell 98, The class hierarchy must be acyclic. However, the definition -of "acyclic" involves only the superclass relationships. For example, -this is OK: +At the moment, record updates are not yet possible with GADT-style declarations, +so support is limited to record construction, selection and pattern matching. +For example + + aPerson = Adult { name = "Fred", children = [] } + shortName :: Person -> Bool + hasChildren (Adult { children = kids }) = not (null kids) + hasChildren (Child {}) = False + + + +As in the case of existentials declared using the Haskell-98-like record syntax +(), +record-selector functions are generated only for those fields that have well-typed +selectors. +Here is the example of that section, in GADT-style syntax: - class C a where { - op :: D b => a -> b -> b - } - - class C a => D a where { ... } +data Counter a where + NewCounter { _this :: self + , _inc :: self -> self + , _display :: self -> IO () + , tag :: a + } + :: Counter a +As before, only one selector function is generated here, that for tag. +Nevertheless, you can still use all the field names in pattern matching and record construction. + + + + +Generalised Algebraic Data Types (GADTs) -Here, C is a superclass of D, but it's OK for a -class operation op of C to mention D. (It -would not be OK for D to be a superclass of C.) - +Generalised Algebraic Data Types generalise ordinary algebraic data types +by allowing constructors to have richer return types. Here is an example: + + data Term a where + Lit :: Int -> Term Int + Succ :: Term Int -> Term Int + IsZero :: Term Int -> Term Bool + If :: Term Bool -> Term a -> Term a -> Term a + Pair :: Term a -> Term b -> Term (a,b) + +Notice that the return type of the constructors is not always Term a, as is the +case with ordinary data types. This generality allows us to +write a well-typed eval function +for these Terms: + + eval :: Term a -> a + eval (Lit i) = i + eval (Succ t) = 1 + eval t + eval (IsZero t) = eval t == 0 + eval (If b e1 e2) = if eval b then eval e1 else eval e2 + eval (Pair e1 e2) = (eval e1, eval e2) + +The key point about GADTs is that pattern matching causes type refinement. +For example, in the right hand side of the equation + + eval :: Term a -> a + eval (Lit i) = ... + +the type a is refined to Int. That's the whole point! +A precise specification of the type rules is beyond what this user manual aspires to, +but the design closely follows that described in +the paper Simple +unification-based type inference for GADTs, +(ICFP 2006). +The general principle is this: type refinement is only carried out +based on user-supplied type annotations. +So if no type signature is supplied for eval, no type refinement happens, +and lots of obscure error messages will +occur. However, the refinement is quite general. For example, if we had: + + eval :: Term a -> a -> a + eval (Lit i) j = i+j + +the pattern match causes the type a to be refined to Int (because of the type +of the constructor Lit), and that refinement also applies to the type of j, and +the result type of the case expression. Hence the addition i+j is legal. + + +These and many other examples are given in papers by Hongwei Xi, and +Tim Sheard. There is a longer introduction +on the wiki, +and Ralf Hinze's +Fun with phantom types also has a number of examples. Note that papers +may use different notation to that implemented in GHC. + + +The rest of this section outlines the extensions to GHC that support GADTs. The extension is enabled with +. The flag also sets . + + +A GADT can only be declared using GADT-style syntax (); +the old Haskell-98 syntax for data declarations always declares an ordinary data type. +The result type of each constructor must begin with the type constructor being defined, +but for a GADT the arguments to the type constructor can be arbitrary monotypes. +For example, in the Term data +type above, the type of each constructor must end with Term ty, but +the ty need not be a type variable (e.g. the Lit +constructor). + + + +It's is permitted to declare an ordinary algebraic data type using GADT-style syntax. +What makes a GADT into a GADT is not the syntax, but rather the presence of data constructors +whose result type is not just T a b. + + + +You cannot use a deriving clause for a GADT; only for +an ordinary data type. + + + +As mentioned in , record syntax is supported. +For example: + + data Term a where + Lit { val :: Int } :: Term Int + Succ { num :: Term Int } :: Term Int + Pred { num :: Term Int } :: Term Int + IsZero { arg :: Term Int } :: Term Bool + Pair { arg1 :: Term a + , arg2 :: Term b + } :: Term (a,b) + If { cnd :: Term Bool + , tru :: Term a + , fls :: Term a + } :: Term a + +However, for GADTs there is the following additional constraint: +every constructor that has a field f must have +the same result type (modulo alpha conversion) +Hence, in the above example, we cannot merge the num +and arg fields above into a +single name. Although their field types are both Term Int, +their selector functions actually have different types: + + + num :: Term Int -> Term Int + arg :: Term Bool -> Term Int + + + + +When pattern-matching against data constructors drawn from a GADT, +for example in a case expression, the following rules apply: + +The type of the scrutinee must be rigid. +The type of the result of the case expression must be rigid. +The type of any free variable mentioned in any of +the case alternatives must be rigid. + +A type is "rigid" if it is completely known to the compiler at its binding site. The easiest +way to ensure that a variable a rigid type is to give it a type signature. + + + + + + + + + + + +Extensions to the "deriving" mechanism + + +Inferred context for deriving clauses + + +The Haskell Report is vague about exactly when a deriving clause is +legal. For example: + + data T0 f a = MkT0 a deriving( Eq ) + data T1 f a = MkT1 (f a) deriving( Eq ) + data T2 f a = MkT2 (f (f a)) deriving( Eq ) + +The natural generated Eq code would result in these instance declarations: + + instance Eq a => Eq (T0 f a) where ... + instance Eq (f a) => Eq (T1 f a) where ... + instance Eq (f (f a)) => Eq (T2 f a) where ... + +The first of these is obviously fine. The second is still fine, although less obviously. +The third is not Haskell 98, and risks losing termination of instances. + + +GHC takes a conservative position: it accepts the first two, but not the third. The rule is this: +each constraint in the inferred instance context must consist only of type variables, +with no repetitions. + + +This rule is applied regardless of flags. If you want a more exotic context, you can write +it yourself, using the standalone deriving mechanism. + + + + +Stand-alone deriving declarations + + +GHC now allows stand-alone deriving declarations, enabled by -XStandaloneDeriving: + + data Foo a = Bar a | Baz String + + deriving instance Eq a => Eq (Foo a) + +The syntax is identical to that of an ordinary instance declaration apart from (a) the keyword +deriving, and (b) the absence of the where part. +You must supply a context (in the example the context is (Eq a)), +exactly as you would in an ordinary instance declaration. +(In contrast the context is inferred in a deriving clause +attached to a data type declaration.) + +A deriving instance declaration +must obey the same rules concerning form and termination as ordinary instance declarations, +controlled by the same flags; see . + + +Unlike a deriving +declaration attached to a data declaration, the instance can be more specific +than the data type (assuming you also use +-XFlexibleInstances, ). Consider +for example + + data Foo a = Bar a | Baz String + + deriving instance Eq a => Eq (Foo [a]) + deriving instance Eq a => Eq (Foo (Maybe a)) + +This will generate a derived instance for (Foo [a]) and (Foo (Maybe a)), +but other types such as (Foo (Int,Bool)) will not be an instance of Eq. + + +The stand-alone syntax is generalised for newtypes in exactly the same +way that ordinary deriving clauses are generalised (). +For example: + + newtype Foo a = MkFoo (State Int a) + + deriving instance MonadState Int Foo + +GHC always treats the last parameter of the instance +(Foo in this example) as the type whose instance is being derived. + + + + + + +Deriving clause for classes <literal>Typeable</literal> and <literal>Data</literal> + + +Haskell 98 allows the programmer to add "deriving( Eq, Ord )" to a data type +declaration, to generate a standard instance declaration for classes specified in the deriving clause. +In Haskell 98, the only classes that may appear in the deriving clause are the standard +classes Eq, Ord, +Enum, Ix, Bounded, Read, and Show. + + +GHC extends this list with two more classes that may be automatically derived +(provided the flag is specified): +Typeable, and Data. These classes are defined in the library +modules Data.Typeable and Data.Generics respectively, and the +appropriate class must be in scope before it can be mentioned in the deriving clause. + +An instance of Typeable can only be derived if the +data type has seven or fewer type parameters, all of kind *. +The reason for this is that the Typeable class is derived using the scheme +described in + +Scrap More Boilerplate: Reflection, Zips, and Generalised Casts +. +(Section 7.4 of the paper describes the multiple Typeable classes that +are used, and only Typeable1 up to +Typeable7 are provided in the library.) +In other cases, there is nothing to stop the programmer writing a TypableX +class, whose kind suits that of the data type constructor, and +then writing the data type instance by hand. + + + + +Generalised derived instances for newtypes + + +When you define an abstract type using newtype, you may want +the new type to inherit some instances from its representation. In +Haskell 98, you can inherit instances of Eq, Ord, +Enum and Bounded by deriving them, but for any +other classes you have to write an explicit instance declaration. For +example, if you define + + + newtype Dollars = Dollars Int + + +and you want to use arithmetic on Dollars, you have to +explicitly define an instance of Num: + + + instance Num Dollars where + Dollars a + Dollars b = Dollars (a+b) + ... + +All the instance does is apply and remove the newtype +constructor. It is particularly galling that, since the constructor +doesn't appear at run-time, this instance declaration defines a +dictionary which is wholly equivalent to the Int +dictionary, only slower! + + + + Generalising the deriving clause + +GHC now permits such instances to be derived instead, +using the flag , +so one can write + + newtype Dollars = Dollars Int deriving (Eq,Show,Num) + + +and the implementation uses the same Num dictionary +for Dollars as for Int. Notionally, the compiler +derives an instance declaration of the form + + + instance Num Int => Num Dollars + + +which just adds or removes the newtype constructor according to the type. + + + +We can also derive instances of constructor classes in a similar +way. For example, suppose we have implemented state and failure monad +transformers, such that + + + instance Monad m => Monad (State s m) + instance Monad m => Monad (Failure m) + +In Haskell 98, we can define a parsing monad by + + type Parser tok m a = State [tok] (Failure m) a + + +which is automatically a monad thanks to the instance declarations +above. With the extension, we can make the parser type abstract, +without needing to write an instance of class Monad, via + + + newtype Parser tok m a = Parser (State [tok] (Failure m) a) + deriving Monad + +In this case the derived instance declaration is of the form + + instance Monad (State [tok] (Failure m)) => Monad (Parser tok m) + + +Notice that, since Monad is a constructor class, the +instance is a partial application of the new type, not the +entire left hand side. We can imagine that the type declaration is +"eta-converted" to generate the context of the instance +declaration. + + + +We can even derive instances of multi-parameter classes, provided the +newtype is the last class parameter. In this case, a ``partial +application'' of the class appears in the deriving +clause. For example, given the class + + + class StateMonad s m | m -> s where ... + instance Monad m => StateMonad s (State s m) where ... + +then we can derive an instance of StateMonad for Parsers by + + newtype Parser tok m a = Parser (State [tok] (Failure m) a) + deriving (Monad, StateMonad [tok]) + + +The derived instance is obtained by completing the application of the +class to the new type: + + + instance StateMonad [tok] (State [tok] (Failure m)) => + StateMonad [tok] (Parser tok m) + + + + +As a result of this extension, all derived instances in newtype + declarations are treated uniformly (and implemented just by reusing +the dictionary for the representation type), except +Show and Read, which really behave differently for +the newtype and its representation. + + + + A more precise specification + +Derived instance declarations are constructed as follows. Consider the +declaration (after expansion of any type synonyms) + + + newtype T v1...vn = T' (t vk+1...vn) deriving (c1...cm) + + +where + + + The ci are partial applications of + classes of the form C t1'...tj', where the arity of C + is exactly j+1. That is, C lacks exactly one type argument. + + + The k is chosen so that ci (T v1...vk) is well-kinded. + + + The type t is an arbitrary type. + + + The type variables vk+1...vn do not occur in t, + nor in the ci, and + + + None of the ci is Read, Show, + Typeable, or Data. These classes + should not "look through" the type or its constructor. You can still + derive these classes for a newtype, but it happens in the usual way, not + via this new mechanism. + + +Then, for each ci, the derived instance +declaration is: + + instance ci t => ci (T v1...vk) + +As an example which does not work, consider + + newtype NonMonad m s = NonMonad (State s m s) deriving Monad + +Here we cannot derive the instance + + instance Monad (State s m) => Monad (NonMonad m) + + +because the type variable s occurs in State s m, +and so cannot be "eta-converted" away. It is a good thing that this +deriving clause is rejected, because NonMonad m is +not, in fact, a monad --- for the same reason. Try defining +>>= with the correct type: you won't be able to. + + + +Notice also that the order of class parameters becomes +important, since we can only derive instances for the last one. If the +StateMonad class above were instead defined as + + + class StateMonad m s | m -> s where ... + + +then we would not have been able to derive an instance for the +Parser type above. We hypothesise that multi-parameter +classes usually have one "main" parameter for which deriving new +instances is most interesting. + +Lastly, all of this applies only for classes other than +Read, Show, Typeable, +and Data, for which the built-in derivation applies (section +4.3.3. of the Haskell Report). +(For the standard classes Eq, Ord, +Ix, and Bounded it is immaterial whether +the standard method is used or the one described here.) + + + + + + + + +Class and instances declarations + + +Class declarations + + +This section, and the next one, documents GHC's type-class extensions. +There's lots of background in the paper Type +classes: exploring the design space (Simon Peyton Jones, Mark +Jones, Erik Meijer). + + +All the extensions are enabled by the flag. + + + +Multi-parameter type classes + +Multi-parameter type classes are permitted. For example: + + + + class Collection c a where + union :: c a -> c a -> c a + ...etc. + + + + + + +The superclasses of a class declaration + + +There are no restrictions on the context in a class declaration +(which introduces superclasses), except that the class hierarchy must +be acyclic. So these class declarations are OK: + + + + class Functor (m k) => FiniteMap m k where + ... + + class (Monad m, Monad (t m)) => Transform t m where + lift :: m a -> (t m) a + + + + + +As in Haskell 98, The class hierarchy must be acyclic. However, the definition +of "acyclic" involves only the superclass relationships. For example, +this is OK: + + + + class C a where { + op :: D b => a -> b -> b + } + + class C a => D a where { ... } + + + +Here, C is a superclass of D, but it's OK for a +class operation op of C to mention D. (It +would not be OK for D to be a superclass of C.) + @@ -1592,7 +2996,7 @@ class type variable, thus: The type of elem is illegal in Haskell 98, because it contains the constraint Eq a, constrains only the class type variable (in this case a). -GHC lifts this restriction. +GHC lifts this restriction (flag ). @@ -1604,7 +3008,7 @@ GHC lifts this restriction. Functional dependencies are implemented as described by Mark Jones -in “Type Classes with Functional Dependencies”, Mark P. Jones, +in “Type Classes with Functional Dependencies”, Mark P. Jones, In Proceedings of the 9th European Symposium on Programming, ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782, . @@ -1939,14 +3343,14 @@ must be of the form C a where a is a type variable that occurs in the head. -The flag loosens these restrictions +The flag loosens these restrictions considerably. Firstly, multi-parameter type classes are permitted. Secondly, the context and head of the instance declaration can each consist of arbitrary (well-kinded) assertions (C t1 ... tn) subject only to the following rules: -For each assertion in the context: +The Paterson Conditions: for each assertion in the context No type variable has more occurrences in the assertion than in the head The assertion has fewer constructors and variables (taken together @@ -1954,7 +3358,7 @@ For each assertion in the context: -The coverage condition. For each functional dependency, +The Coverage Condition. For each functional dependency, tvsleft -> tvsright, of the class, every type variable in @@ -1966,11 +3370,15 @@ corresponding type in the instance declaration. These restrictions ensure that context reduction terminates: each reduction step makes the problem smaller by at least one -constructor. For example, the following would make the type checker -loop if it wasn't excluded: - - instance C a => C a where ... - +constructor. Both the Paterson Conditions and the Coverage Condition are lifted +if you give the +flag (). +You can find lots of background material about the reason for these +restrictions in the paper +Understanding functional dependencies via Constraint Handling Rules. + + For example, these are OK: instance C Int [a] -- Multiple parameters @@ -2064,7 +3472,7 @@ typechecker loop: class F a b | a->b instance F [a] [[a]] instance (D c, F a c) => D [a] -- 'c' is not mentioned in the head - + Similarly, it can be tempting to lift the coverage condition: class Mul a b c | a b -> c where @@ -2084,13 +3492,13 @@ makes instance inference go into a loop, because it requires the constraint Nevertheless, GHC allows you to experiment with more liberal rules. If you use -the experimental flag --fallow-undecidable-instances -option, you can use arbitrary -types in both an instance context and instance head. Termination is ensured by having a +the experimental flag +-XUndecidableInstances, +both the Paterson Conditions and the Coverage Condition +(described in ) are lifted. Termination is ensured by having a fixed-depth recursion stack. If you exceed the stack depth you get a sort of backtrace, and the opportunity to increase the stack depth -with N. +with N. @@ -2102,11 +3510,11 @@ with N. In general, GHC requires that that it be unambiguous which instance declaration should be used to resolve a type-class constraint. This behaviour -can be modified by two flags: --fallow-overlapping-instances +can be modified by two flags: +-XOverlappingInstances -and --fallow-incoherent-instances +and +-XIncoherentInstances , as this section discusses. Both these flags are dynamic flags, and can be set on a per-module basis, using an OPTIONS_GHC pragma if desired (). @@ -2134,7 +3542,7 @@ particular constraint matches more than one. -The flag instructs GHC to allow +The flag instructs GHC to allow more than one instance to match, provided there is a most specific one. For example, the constraint C Int [Int] matches instances (A), (C) and (D), but the last is more specific, and hence is chosen. If there is no @@ -2151,38 +3559,65 @@ Suppose that from the RHS of f we get the constraint GHC does not commit to instance (C), because in a particular call of f, b might be instantiate to Int, in which case instance (D) would be more specific still. -So GHC rejects the program. If you add the flag , +So GHC rejects the program. +(If you add the flag , GHC will instead pick (C), without complaining about -the problem of subsequent instantiations. +the problem of subsequent instantiations.) + + +Notice that we gave a type signature to f, so GHC had to +check that f has the specified type. +Suppose instead we do not give a type signature, asking GHC to infer +it instead. In this case, GHC will refrain from +simplifying the constraint C Int [Int] (for the same reason +as before) but, rather than rejecting the program, it will infer the type + + f :: C Int b => [b] -> [b] + +That postpones the question of which instance to pick to the +call site for f +by which time more is known about the type b. The willingness to be overlapped or incoherent is a property of the instance declaration itself, controlled by the -presence or otherwise of the -and flags when that mdodule is +presence or otherwise of the +and flags when that module is being defined. Neither flag is required in a module that imports and uses the instance declaration. Specifically, during the lookup process: An instance declaration is ignored during the lookup process if (a) a more specific match is found, and (b) the instance declaration was compiled with -. The flag setting for the +. The flag setting for the more-specific instance does not matter. -Suppose an instance declaration does not matche the constraint being looked up, but +Suppose an instance declaration does not match the constraint being looked up, but does unify with it, so that it might match when the constraint is further instantiated. Usually GHC will regard this as a reason for not committing to some other constraint. But if the instance declaration was compiled with -, GHC will skip the "does-it-unify?" +, GHC will skip the "does-it-unify?" check for that declaration. -All this makes it possible for a library author to design a library that relies on -overlapping instances without the library client having to know. +These rules make it possible for a library author to design a library that relies on +overlapping instances without the library client having to know. + + +If an instance declaration is compiled without +, +then that instance can never be overlapped. This could perhaps be +inconvenient. Perhaps the rule should instead say that the +overlapping instance declaration should be compiled in +this way, rather than the overlapped one. Perhaps overlap +at a usage site should be permitted regardless of how the instance declarations +are compiled, if the flag is +used at the usage site. (Mind you, the exact usage site can occasionally be +hard to pin down.) We are interested to receive feedback on these points. -The flag implies the - flag, but not vice versa. +The flag implies the + flag, but not vice versa. @@ -2231,10 +3666,90 @@ reversed, but it makes sense to me. + +Overloaded string literals + + + +GHC supports overloaded string literals. Normally a +string literal has type String, but with overloaded string +literals enabled (with -XOverloadedStrings) + a string literal has type (IsString a) => a. + + +This means that the usual string syntax can be used, e.g., for packed strings +and other variations of string like types. String literals behave very much +like integer literals, i.e., they can be used in both expressions and patterns. +If used in a pattern the literal with be replaced by an equality test, in the same +way as an integer literal is. + + +The class IsString is defined as: + +class IsString a where + fromString :: String -> a + +The only predefined instance is the obvious one to make strings work as usual: + +instance IsString [Char] where + fromString cs = cs + +The class IsString is not in scope by default. If you want to mention +it explicitly (for example, to give an instance declaration for it), you can import it +from module GHC.Exts. + + +Haskell's defaulting mechanism is extended to cover string literals, when is specified. +Specifically: + + +Each type in a default declaration must be an +instance of Num or of IsString. + + + +The standard defaulting rule (Haskell Report, Section 4.3.4) +is extended thus: defaulting applies when all the unresolved constraints involve standard classes +or IsString; and at least one is a numeric class +or IsString. + + + + +A small example: + +module Main where + +import GHC.Exts( IsString(..) ) + +newtype MyString = MyString String deriving (Eq, Show) +instance IsString MyString where + fromString = MyString + +greet :: MyString -> MyString +greet "hello" = "world" +greet other = other + +main = do + print $ greet "hello" + print $ greet "fool" + + + +Note that deriving Eq is necessary for the pattern matching +to work since it gets translated into an equality comparison. + + + + + + +Other type system extensions + Type signatures -The context of a type signature +The context of a type signature Unlike Haskell 98, constraints in types do not have to be of the form (class type-variable) or @@ -2269,8 +3784,8 @@ in GHC, you can give the foralls if you want. See a is "reachable" if it appears +in the same constraint as either a type variable free in type, or another reachable type variable. A value with a type that does not obey this reachability restriction cannot be used without introducing @@ -2347,54 +3862,6 @@ territory free in case we need it later. - -For-all hoisting - -It is often convenient to use generalised type synonyms (see ) at the right hand -end of an arrow, thus: - - type Discard a = forall b. a -> b -> a - - g :: Int -> Discard Int - g x y z = x+y - -Simply expanding the type synonym would give - - g :: Int -> (forall b. Int -> b -> Int) - -but GHC "hoists" the forall to give the isomorphic type - - g :: forall b. Int -> Int -> b -> Int - -In general, the rule is this: to determine the type specified by any explicit -user-written type (e.g. in a type signature), GHC expands type synonyms and then repeatedly -performs the transformation: - - type1 -> forall a1..an. context2 => type2 -==> - forall a1..an. context2 => type1 -> type2 - -(In fact, GHC tries to retain as much synonym information as possible for use in -error messages, but that is a usability issue.) This rule applies, of course, whether -or not the forall comes from a synonym. For example, here is another -valid way to write g's type signature: - - g :: Int -> Int -> forall b. b -> Int - - - -When doing this hoisting operation, GHC eliminates duplicate constraints. For -example: - - type Foo a = (?x::Int) => Bool -> a - g :: Foo (Foo Int) - -means - - g :: (?x::Int) => Bool -> Bool -> Int - - - @@ -2409,11 +3876,11 @@ J Lewis, MB Shields, E Meijer, J Launchbury, Boston, Jan 2000. -(Most of the following, stil rather incomplete, documentation is +(Most of the following, still rather incomplete, documentation is due to Jeff Lewis.) Implicit parameter support is enabled with the option -. +. A variable is called dynamically bound when it is bound by the calling @@ -2466,7 +3933,7 @@ function that called it. For example, our sort function might to pick out the least value in a list: least :: (?cmp :: a -> a -> Bool) => [a] -> a - least xs = fst (sort xs) + least xs = head (sort xs) Without lifting a finger, the ?cmp parameter is propagated to become a parameter of least as well. With explicit @@ -2589,7 +4056,7 @@ In the former case, len_acc1 is monomorphic in its own right-hand side, so the implicit parameter ?acc is not passed to the recursive call. In the latter case, because len_acc2 has a type signature, the recursive call is made to the -polymoprhic version, which takes ?acc +polymorphic version, which takes ?acc as an implicit parameter. So we get the following results in GHCi: Prog> len1 "hello" @@ -2626,6 +4093,11 @@ inner binding of ?x, so (f 9) will return + + + Explicitly-kinded quantification @@ -2814,7 +4288,10 @@ kind for the type variable cxt. GHC now instead allows you to specify the kind of a type variable directly, wherever -a type variable is explicitly bound. Namely: +a type variable is explicitly bound, with the flag . + + +This flag enables kind signatures in the following places: data declarations: @@ -2890,6 +4367,8 @@ For example, all the following types are legal: g2 :: (forall a. Eq a => [a] -> a -> Bool) -> Int -> Int f3 :: ((forall a. a->a) -> Int) -> Bool -> Bool + + f4 :: Int -> (forall a. a -> a) Here, f1 and g1 are rank-1 types, and can be written in standard Haskell (e.g. f1 :: a->b->a). @@ -2906,28 +4385,31 @@ The function f3 has a rank-3 type; it has rank-2 types on the left of a function arrow. -GHC allows types of arbitrary rank; you can nest foralls -arbitrarily deep in function arrows. (GHC used to be restricted to rank 2, but -that restriction has now been lifted.) +GHC has three flags to control higher-rank types: + + + : data constructors (only) can have polymorphic argument types. + + + : any function (including data constructors) can have a rank-2 type. + + + : any function (including data constructors) can have an arbitrary-rank type. +That is, you can nest foralls +arbitrarily deep in function arrows. In particular, a forall-type (also called a "type scheme"), including an operational type class context, is legal: - On the left of a function arrow - On the right of a function arrow (see ) + On the left or right (see f4, for example) +of a function arrow As the argument of a constructor, or type of a field, in a data type declaration. For example, any of the f1,f2,f3,g1,g2 above would be valid field type signatures. As the type of an implicit parameter In a pattern type signature (see ) -There is one place you cannot put a forall: -you cannot instantiate a type variable with a forall-type. So you cannot -make a forall-type the argument of a type constructor. So these types are illegal: - - x1 :: [forall a. a->a] - x2 :: (forall a. a->a, Int) - x3 :: Maybe (forall a. a->a) - + + Of course forall becomes a keyword; you can't use forall as a type variable any more! @@ -3163,665 +4645,320 @@ for rank-2 types. - - - -Scoped type variables +<sect2 id="impredicative-polymorphism"> +<title>Impredicative polymorphism - - -A lexically scoped type variable can be bound by: - -A declaration type signature () -A pattern type signature () -A result type signature () - -For example: - -f (xs::[a]) = ys ++ ys - where - ys :: [a] - ys = reverse xs - -The pattern (xs::[a]) includes a type signature for xs. -This brings the type variable a into scope; it scopes over -all the patterns and right hand sides for this equation for f. -In particular, it is in scope at the type signature for y. - - - -At ordinary type signatures, such as that for ys, any type variables -mentioned in the type signature that are not in scope are -implicitly universally quantified. (If there are no type variables in -scope, all type variables mentioned in the signature are universally -quantified, which is just as in Haskell 98.) In this case, since a -is in scope, it is not universally quantified, so the type of ys is -the same as that of xs. In Haskell 98 it is not possible to declare -a type for ys; a major benefit of scoped type variables is that -it becomes possible to do so. - - - -Scoped type variables are implemented in both GHC and Hugs. Where the -implementations differ from the specification below, those differences -are noted. - - - -So much for the basic idea. Here are the details. - - - -What a scoped type variable means - -A lexically-scoped type variable is simply -the name for a type. The restriction it expresses is that all occurrences -of the same name mean the same type. For example: - - f :: [Int] -> Int -> Int - f (xs::[a]) (y::a) = (head xs + y) :: a - -The pattern type signatures on the left hand side of -f express the fact that xs -must be a list of things of some type a; and that y -must have this same type. The type signature on the expression (head xs) -specifies that this expression must have the same type a. -There is no requirement that the type named by "a" is -in fact a type variable. Indeed, in this case, the type named by "a" is -Int. (This is a slight liberalisation from the original rather complex -rules, which specified that a pattern-bound type variable should be universally quantified.) -For example, all of these are legal: - - - t (x::a) (y::a) = x+y*2 - - f (x::a) (y::b) = [x,y] -- a unifies with b - - g (x::a) = x + 1::Int -- a unifies with Int - - h x = let k (y::a) = [x,y] -- a is free in the - in k x -- environment - - k (x::a) True = ... -- a unifies with Int - k (x::Int) False = ... - - w :: [b] -> [b] - w (x::a) = x -- a unifies with [b] - - - - - -Scope and implicit quantification - - - - - - - -All the type variables mentioned in a pattern, -that are not already in scope, -are brought into scope by the pattern. We describe this set as -the type variables bound by the pattern. -For example: - - f (x::a) = let g (y::(a,b)) = fst y - in - g (x,True) - -The pattern (x::a) brings the type variable -a into scope, as well as the term -variable x. The pattern (y::(a,b)) -contains an occurrence of the already-in-scope type variable a, -and brings into scope the type variable b. - - - - - -The type variable(s) bound by the pattern have the same scope -as the term variable(s) bound by the pattern. For example: - - let - f (x::a) = <...rhs of f...> - (p::b, q::b) = (1,2) - in <...body of let...> - -Here, the type variable a scopes over the right hand side of f, -just like x does; while the type variable b scopes over the -body of the let, and all the other definitions in the let, -just like p and q do. -Indeed, the newly bound type variables also scope over any ordinary, separate -type signatures in the let group. - - - - - - -The type variables bound by the pattern may be -mentioned in ordinary type signatures or pattern -type signatures anywhere within their scope. - - - - - - - In ordinary type signatures, any type variable mentioned in the -signature that is in scope is not universally quantified. - - - - - - - - Ordinary type signatures do not bring any new type variables -into scope (except in the type signature itself!). So this is illegal: - - - f :: a -> a - f x = x::a - - -It's illegal because a is not in scope in the body of f, -so the ordinary signature x::a is equivalent to x::forall a.a; -and that is an incorrect typing. - - - - - - -The pattern type signature is a monotype: - - - - -A pattern type signature cannot contain any explicit forall quantification. - - - -The type variables bound by a pattern type signature can only be instantiated to monotypes, -not to type schemes. - - - -There is no implicit universal quantification on pattern type signatures (in contrast to -ordinary type signatures). - - - - - - - - - -The type variables in the head of a class or instance declaration -scope over the methods defined in the where part. For example: - - - - class C a where - op :: [a] -> a - - op xs = let ys::[a] - ys = reverse xs - in - head ys - - - -(Not implemented in Hugs yet, Dec 98). - - - - - - - - - - -Declaration type signatures -A declaration type signature that has explicit -quantification (using forall) brings into scope the -explicitly-quantified -type variables, in the definition of the named function(s). For example: - - f :: forall a. [a] -> [a] - f (x:xs) = xs ++ [ x :: a ] - -The "forall a" brings "a" into scope in -the definition of "f". - -This only happens if the quantification in f's type -signature is explicit. For example: - - g :: [a] -> [a] - g (x:xs) = xs ++ [ x :: a ] - -This program will be rejected, because "a" does not scope -over the definition of "f", so "x::a" -means "x::forall a. a" by Haskell's usual implicit -quantification rules. - - - - -Where a pattern type signature can occur - - -A pattern type signature can occur in any pattern. For example: - - - - -A pattern type signature can be on an arbitrary sub-pattern, not -just on a variable: - - - - f ((x,y)::(a,b)) = (y,x) :: (b,a) - - - - - - - - - Pattern type signatures, including the result part, can be used -in lambda abstractions: - - - (\ (x::a, y) :: a -> x) - - - - - - - Pattern type signatures, including the result part, can be used -in case expressions: - +GHC supports impredicative polymorphism, +enabled with . +This means +that you can call a polymorphic function at a polymorphic type, and +parameterise data structures over polymorphic types. For example: - case e of { ((x::a, y) :: (a,b)) -> x } + f :: Maybe (forall a. [a] -> [a]) -> Maybe ([Int], [Char]) + f (Just g) = Just (g [3], g "hello") + f Nothing = Nothing - -Note that the -> symbol in a case alternative -leads to difficulties when parsing a type signature in the pattern: in -the absence of the extra parentheses in the example above, the parser -would try to interpret the -> as a function -arrow and give a parse error later. - +Notice here that the Maybe type is parameterised by the +polymorphic type (forall a. [a] -> +[a]). +The technical details of this extension are described in the paper +Boxy types: +type inference for higher-rank types and impredicativity, +which appeared at ICFP 2006. + + - + +Lexically scoped type variables + - -To avoid ambiguity, the type after the “::” in a result -pattern signature on a lambda or case must be atomic (i.e. a single -token or a parenthesised type of some sort). To see why, -consider how one would parse this: - - +GHC supports lexically scoped type variables, without +which some type signatures are simply impossible to write. For example: - \ x :: a -> b -> x +f :: forall a. [a] -> [a] +f xs = ys ++ ys + where + ys :: [a] + ys = reverse xs - - +The type signature for f brings the type variable a into scope; it scopes over +the entire definition of f. +In particular, it is in scope at the type signature for ys. +In Haskell 98 it is not possible to declare +a type for ys; a major benefit of scoped type variables is that +it becomes possible to do so. - +Lexically-scoped type variables are enabled by +. This flag implies . + +Note: GHC 6.6 contains substantial changes to the way that scoped type +variables work, compared to earlier releases. Read this section +carefully! - + +Overview +The design follows the following principles + +A scoped type variable stands for a type variable, and not for +a type. (This is a change from GHC's earlier +design.) +Furthermore, distinct lexical type variables stand for distinct +type variables. This means that every programmer-written type signature +(including one that contains free scoped type variables) denotes a +rigid type; that is, the type is fully known to the type +checker, and no inference is involved. +Lexical type variables may be alpha-renamed freely, without +changing the program. + + - Pattern type signatures can bind existential type variables. -For example: - - +A lexically scoped type variable can be bound by: + +A declaration type signature () +An expression type signature () +A pattern type signature () +Class and instance declarations () + + + +In Haskell, a programmer-written type signature is implicitly quantified over +its free type variables (Section +4.1.2 +of the Haskell Report). +Lexically scoped type variables affect this implicit quantification rules +as follows: any type variable that is in scope is not universally +quantified. For example, if type variable a is in scope, +then - data T = forall a. MkT [a] - - f :: T -> T - f (MkT [t::a]) = MkT t3 - where - t3::[a] = [t,t,t] + (e :: a -> a) means (e :: a -> a) + (e :: b -> b) means (e :: forall b. b->b) + (e :: a -> b) means (e :: forall b. a->b) - - - - + - -Pattern type signatures -can be used in pattern bindings: + +Declaration type signatures +A declaration type signature that has explicit +quantification (using forall) brings into scope the +explicitly-quantified +type variables, in the definition of the named function. For example: - f x = let (y, z::a) = x in ... - f1 x = let (y, z::Int) = x in ... - f2 (x::(Int,a)) = let (y, z::a) = x in ... - f3 :: (b->b) = \x -> x + f :: forall a. [a] -> [a] + f (x:xs) = xs ++ [ x :: a ] - -In all such cases, the binding is not generalised over the pattern-bound -type variables. Thus f3 is monomorphic; f3 -has type b -> b for some type b, -and not forall b. b -> b. -In contrast, the binding +The "forall a" brings "a" into scope in +the definition of "f". + +This only happens if: + + The quantification in f's type +signature is explicit. For example: - f4 :: b->b - f4 = \x -> x + g :: [a] -> [a] + g (x:xs) = xs ++ [ x :: a ] -makes a polymorphic function, but b is not in scope anywhere -in f4's scope. +This program will be rejected, because "a" does not scope +over the definition of "f", so "x::a" +means "x::forall a. a" by Haskell's usual implicit +quantification rules. + + The signature gives a type for a function binding or a bare variable binding, +not a pattern binding. +For example: + + f1 :: forall a. [a] -> [a] + f1 (x:xs) = xs ++ [ x :: a ] -- OK - - + f2 :: forall a. [a] -> [a] + f2 = \(x:xs) -> xs ++ [ x :: a ] -- OK + + f3 :: forall a. [a] -> [a] + Just f3 = Just (\(x:xs) -> xs ++ [ x :: a ]) -- Not OK! + +The binding for f3 is a pattern binding, and so its type signature +does not bring a into scope. However f1 is a +function binding, and f2 binds a bare variable; in both cases +the type signature brings a into scope. + -Pattern type signatures are completely orthogonal to ordinary, separate -type signatures. The two can be used independently or together. - - -Result type signatures - - -The result type of a function can be given a signature, thus: - + +Expression type signatures +An expression type signature that has explicit +quantification (using forall) brings into scope the +explicitly-quantified +type variables, in the annotated expression. For example: - f (x::a) :: [a] = [x,x,x] + f = runST ( (op >>= \(x :: STRef s Int) -> g x) :: forall s. ST s Bool ) +Here, the type signature forall a. ST s Bool brings the +type variable s into scope, in the annotated expression +(op >>= \(x :: STRef s Int) -> g x). + + -The final :: [a] after all the patterns gives a signature to the -result type. Sometimes this is the only way of naming the type variable -you want: - - + +Pattern type signatures + +A type signature may occur in any pattern; this is a pattern type +signature. +For example: - f :: Int -> [a] -> [a] - f n :: ([a] -> [a]) = let g (x::a, y::a) = (y,x) - in \xs -> map g (reverse xs `zip` xs) + -- f and g assume that 'a' is already in scope + f = \(x::Int, y::a) -> x + g (x::a) = x + h ((x,y) :: (Int,Bool)) = (y,x) - +In the case where all the type variables in the pattern type signature are +already in scope (i.e. bound by the enclosing context), matters are simple: the +signature simply constrains the type of the pattern in the obvious way. -The type variables bound in a result type signature scope over the right hand side -of the definition. However, consider this corner-case: +Unlike expression and declaration type signatures, pattern type signatures are not implicitly generalised. +The pattern in a pattern binding may only mention type variables +that are already in scope. For example: - rev1 :: [a] -> [a] = \xs -> reverse xs + f :: forall a. [a] -> (Int, [a]) + f xs = (n, zs) + where + (ys::[a], n) = (reverse xs, length xs) -- OK + zs::[a] = xs ++ ys -- OK - foo ys = rev (ys::[a]) + Just (v::b) = ... -- Not OK; b is not in scope -The signature on rev1 is considered a pattern type signature, not a result -type signature, and the type variables it binds have the same scope as rev1 -itself (i.e. the right-hand side of rev1 and the rest of the module too). -In particular, the expression (ys::[a]) is OK, because the type variable a -is in scope (otherwise it would mean (ys::forall a.[a]), which would be rejected). +Here, the pattern signatures for ys and zs +are fine, but the one for v is not because b is +not in scope. -As mentioned above, rev1 is made monomorphic by this scoping rule. -For example, the following program would be rejected, because it claims that rev1 -is polymorphic: +However, in all patterns other than pattern bindings, a pattern +type signature may mention a type variable that is not in scope; in this case, +the signature brings that type variable into scope. +This is particularly important for existential data constructors. For example: - rev1 :: [b] -> [b] - rev1 :: [a] -> [a] = \xs -> reverse xs - - + data T = forall a. MkT [a] - -Result type signatures are not yet implemented in Hugs. + k :: T -> T + k (MkT [t::a]) = MkT t3 + where + t3::[a] = [t,t,t] + +Here, the pattern type signature (t::a) mentions a lexical type +variable that is not already in scope. Indeed, it cannot already be in scope, +because it is bound by the pattern match. GHC's rule is that in this situation +(and only then), a pattern type signature can mention a type variable that is +not already in scope; the effect is to bring it into scope, standing for the +existentially-bound type variable. - - - - - - -Deriving clause for classes <literal>Typeable</literal> and <literal>Data</literal> - -Haskell 98 allows the programmer to add "deriving( Eq, Ord )" to a data type -declaration, to generate a standard instance declaration for classes specified in the deriving clause. -In Haskell 98, the only classes that may appear in the deriving clause are the standard -classes Eq, Ord, -Enum, Ix, Bounded, Read, and Show. +When a pattern type signature binds a type variable in this way, GHC insists that the +type variable is bound to a rigid, or fully-known, type variable. +This means that any user-written type signature always stands for a completely known type. -GHC extends this list with two more classes that may be automatically derived -(provided the flag is specified): -Typeable, and Data. These classes are defined in the library -modules Data.Typeable and Data.Generics respectively, and the -appropriate class must be in scope before it can be mentioned in the deriving clause. +If all this seems a little odd, we think so too. But we must have +some way to bring such type variables into scope, else we +could not name existentially-bound type variables in subsequent type signatures. -An instance of Typeable can only be derived if the -data type has seven or fewer type parameters, all of kind *. -The reason for this is that the Typeable class is derived using the scheme -described in - -Scrap More Boilerplate: Reflection, Zips, and Generalised Casts -. -(Section 7.4 of the paper describes the multiple Typeable classes that -are used, and only Typeable1 up to -Typeable7 are provided in the library.) -In other cases, there is nothing to stop the programmer writing a TypableX -class, whose kind suits that of the data type constructor, and -then writing the data type instance by hand. - - - - -Generalised derived instances for newtypes - -When you define an abstract type using newtype, you may want -the new type to inherit some instances from its representation. In -Haskell 98, you can inherit instances of Eq, Ord, -Enum and Bounded by deriving them, but for any -other classes you have to write an explicit instance declaration. For -example, if you define - - - newtype Dollars = Dollars Int - - -and you want to use arithmetic on Dollars, you have to -explicitly define an instance of Num: - - - instance Num Dollars where - Dollars a + Dollars b = Dollars (a+b) - ... - -All the instance does is apply and remove the newtype -constructor. It is particularly galling that, since the constructor -doesn't appear at run-time, this instance declaration defines a -dictionary which is wholly equivalent to the Int -dictionary, only slower! +This is (now) the only situation in which a pattern type +signature is allowed to mention a lexical variable that is not already in +scope. +For example, both f and g would be +illegal if a was not already in scope. - Generalising the deriving clause - -GHC now permits such instances to be derived instead, so one can write - - newtype Dollars = Dollars Int deriving (Eq,Show,Num) - + -and the implementation uses the same Num dictionary -for Dollars as for Int. Notionally, the compiler -derives an instance declaration of the form + + + +Class and instance declarations -Notice also that the order of class parameters becomes -important, since we can only derive instances for the last one. If the -StateMonad class above were instead defined as +The type variables in the head of a class or instance declaration +scope over the methods defined in the where part. For example: - - class StateMonad m s | m -> s where ... - -then we would not have been able to derive an instance for the -Parser type above. We hypothesise that multi-parameter -classes usually have one "main" parameter for which deriving new -instances is most interesting. - -Lastly, all of this applies only for classes other than -Read, Show, Typeable, -and Data, for which the built-in derivation applies (section -4.3.3. of the Haskell Report). -(For the standard classes Eq, Ord, -Ix, and Bounded it is immaterial whether -the standard method is used or the one described here.) + + class C a where + op :: [a] -> a + + op xs = let ys::[a] + ys = reverse xs + in + head ys + + Generalised typing of mutually recursive bindings @@ -3829,20 +4966,20 @@ the standard method is used or the one described here.) The Haskell Report specifies that a group of bindings (at top level, or in a let or where) should be sorted into strongly-connected components, and then type-checked in dependency order -(Haskell +(Haskell Report, Section 4.5.1). As each group is type-checked, any binders of the group that have an explicit type signature are put in the type environment with the specified polymorphic type, and all others are monomorphic until the group is generalised -(Haskell Report, Section 4.5.2). +(Haskell Report, Section 4.5.2). Following a suggestion of Mark Jones, in his paper -Typing Haskell in +Typing Haskell in Haskell, -GHC implements a more general scheme. If is +GHC implements a more general scheme. If is specified: the dependency analysis ignores references to variables that have an explicit type signature. @@ -3858,12 +4995,12 @@ This is rejected by Haskell 98, but under Jones's scheme the definition for g is typechecked first, separately from that for f, because the reference to f in g's right -hand side is ingored by the dependency analysis. Then g's +hand side is ignored by the dependency analysis. Then g's type is generalised, to get g :: Ord a => a -> Bool -Now, the defintion for f is typechecked, with this type for +Now, the definition for f is typechecked, with this type for g in the type environment. @@ -3871,7 +5008,7 @@ Now, the defintion for f is typechecked, with this type for The same refined dependency analysis also allows the type signatures of mutually-recursive functions to have different contexts, something that is illegal in Haskell 98 (Section 4.5.2, last sentence). With - + GHC only insists that the type signatures of a refined group have identical type signatures; in practice this means that only variables bound by the same pattern binding must have the same context. For example, this is fine: @@ -3885,201 +5022,72 @@ pattern binding must have the same context. For example, this is fine: - - - - - - -Generalised Algebraic Data Types + +Type families + -Generalised Algebraic Data Types (GADTs) generalise ordinary algebraic data types by allowing you -to give the type signatures of constructors explicitly. For example: - - data Term a where - Lit :: Int -> Term Int - Succ :: Term Int -> Term Int - IsZero :: Term Int -> Term Bool - If :: Term Bool -> Term a -> Term a -> Term a - Pair :: Term a -> Term b -> Term (a,b) - -Notice that the return type of the constructors is not always Term a, as is the -case with ordinary vanilla data types. Now we can write a well-typed eval function -for these Terms: - - eval :: Term a -> a - eval (Lit i) = i - eval (Succ t) = 1 + eval t - eval (IsZero t) = eval t == 0 - eval (If b e1 e2) = if eval b then eval e1 else eval e2 - eval (Pair e1 e2) = (eval e1, eval e2) - -These and many other examples are given in papers by Hongwei Xi, and Tim Sheard. + +GHC supports the definition of type families indexed by types. They may be +seen as an extension of Haskell 98's class-based overloading of values to +types. When type families are declared in classes, they are also known as +associated types. - The extensions to GHC are these: - - - Data type declarations have a 'where' form, as exemplified above. The type signature of -each constructor is independent, and is implicitly universally quantified as usual. Unlike a normal -Haskell data type declaration, the type variable(s) in the "data Term a where" header -have no scope. Indeed, one can write a kind signature instead: - - data Term :: * -> * where ... - -or even a mixture of the two: - - data Foo a :: (* -> *) -> * where ... - -The type variables (if given) may be explicitly kinded, so we could also write the header for Foo -like this: - - data Foo a (b :: * -> *) where ... - - - - -There are no restrictions on the type of the data constructor, except that the result -type must begin with the type constructor being defined. For example, in the Term data -type above, the type of each constructor must end with ... -> Term .... - - - -You can use record syntax on a GADT-style data type declaration: - - - data Term a where - Lit { val :: Int } :: Term Int - Succ { num :: Term Int } :: Term Int - Pred { num :: Term Int } :: Term Int - IsZero { arg :: Term Int } :: Term Bool - Pair { arg1 :: Term a - , arg2 :: Term b - } :: Term (a,b) - If { cnd :: Term Bool - , tru :: Term a - , fls :: Term a - } :: Term a - -For every constructor that has a field f, (a) the type of -field f must be the same; and (b) the -result type of the constructor must be the same; both modulo alpha conversion. -Hence, in our example, we cannot merge the num and arg -fields above into a -single name. Although their field types are both Term Int, -their selector functions actually have different types: - - - num :: Term Int -> Term Int - arg :: Term Bool -> Term Int - - -At the moment, record updates are not yet possible with GADT, so support is -limited to record construction, selection and pattern matching: - - - someTerm :: Term Bool - someTerm = IsZero { arg = Succ { num = Lit { val = 0 } } } - - eval :: Term a -> a - eval Lit { val = i } = i - eval Succ { num = t } = eval t + 1 - eval Pred { num = t } = eval t - 1 - eval IsZero { arg = t } = eval t == 0 - eval Pair { arg1 = t1, arg2 = t2 } = (eval t1, eval t2) - eval t@If{} = if eval (cnd t) then eval (tru t) else eval (fls t) - - - - - -You can use strictness annotations, in the obvious places -in the constructor type: - - data Term a where - Lit :: !Int -> Term Int - If :: Term Bool -> !(Term a) -> !(Term a) -> Term a - Pair :: Term a -> Term b -> Term (a,b) - - - - -You can use a deriving clause on a GADT-style data type -declaration, but only if the data type could also have been declared in -Haskell-98 syntax. For example, these two declarations are equivalent - - data Maybe1 a where { - Nothing1 :: Maybe a ; - Just1 :: a -> Maybe a - } deriving( Eq, Ord ) - - data Maybe2 a = Nothing2 | Just2 a - deriving( Eq, Ord ) - -This simply allows you to declare a vanilla Haskell-98 data type using the -where form without losing the deriving clause. - - - -Pattern matching causes type refinement. For example, in the right hand side of the equation - - eval :: Term a -> a - eval (Lit i) = ... - -the type a is refined to Int. (That's the whole point!) -A precise specification of the type rules is beyond what this user manual aspires to, but there is a paper -about the ideas: "Wobbly types: practical type inference for generalised algebraic data types", on Simon PJ's home page. - - The general principle is this: type refinement is only carried out based on user-supplied type annotations. -So if no type signature is supplied for eval, no type refinement happens, and lots of obscure error messages will -occur. However, the refinement is quite general. For example, if we had: - - eval :: Term a -> a -> a - eval (Lit i) j = i+j - -the pattern match causes the type a to be refined to Int (because of the type -of the constructor Lit, and that refinement also applies to the type of j, and -the result type of the case expression. Hence the addition i+j is legal. + +There are two forms of type families: data families and type synonym families. +Currently, only the former are fully implemented, while we are still working +on the latter. As a result, the specification of the language extension is +also still to some degree in flux. Hence, a more detailed description of +the language extension and its use is currently available +from the Haskell +wiki page on type families. The material will be moved to this user's +guide when it has stabilised. - - + +Type families are enabled by the flag . -Notice that GADTs generalise existential types. For example, these two declarations are equivalent: - - data T a = forall b. MkT b (b->a) - data T' a where { MKT :: b -> (b->a) -> T' a } - - - - + + + + Template Haskell -Template Haskell allows you to do compile-time meta-programming in Haskell. There is a "home page" for -Template Haskell at -http://www.haskell.org/th/, while -the background to +Template Haskell allows you to do compile-time meta-programming in +Haskell. +The background to the main technical innovations is discussed in " +url="http://research.microsoft.com/~simonpj/papers/meta-haskell/"> Template Meta-programming for Haskell" (Proc Haskell Workshop 2002). -The details of the Template Haskell design are still in flux. Make sure you -consult the online library reference material -(search for the type ExpQ). -[Temporary: many changes to the original design are described in - "http://research.microsoft.com/~simonpj/tmp/notes2.ps". -Not all of these changes are in GHC 6.2.] + + +There is a Wiki page about +Template Haskell at +http://www.haskell.org/haskellwiki/Template_Haskell, and that is the best place to look for +further details. +You may also +consult the online +Haskell library reference material +(look for module Language.Haskell.TH). +Many changes to the original design are described in + +Notes on Template Haskell version 2. +Not all of these changes are in GHC, however. - The first example from that paper is set out below as a worked example to help get you started. + The first example from that paper is set out below () +as a worked example to help get you started. -The documentation here describes the realisation in GHC. (It's rather sketchy just now; -Tim Sheard is going to expand it.) +The documentation here describes the realisation of Template Haskell in GHC. It is not detailed enough to +understand Template Haskell; see the +Wiki page. @@ -4087,9 +5095,10 @@ Tim Sheard is going to expand it.) Template Haskell has the following new syntactic constructions. You need to use the flag - + + to switch these syntactic extensions on - ( is no longer implied by + ( is no longer implied by ). @@ -4104,41 +5113,56 @@ Tim Sheard is going to expand it.) an expression; the spliced expression must have type Q Exp - a list of top-level declarations; ; the spliced expression must have type Q [Dec] - [Planned, but not implemented yet.] a - type; the spliced expression must have type Q Typ. + a list of top-level declarations; the spliced expression must have type Q [Dec] - (Note that the syntax for a declaration splice uses "$" not "splice" as in - the paper. Also the type of the enclosed expression must be Q [Dec], not [Q Dec] - as in the paper.) - + + Inside a splice you can can only call functions defined in imported modules, + not functions defined elsewhere in the same module. A expression quotation is written in Oxford brackets, thus: [| ... |], where the "..." is an expression; - the quotation has type Expr. + the quotation has type Q Exp. [d| ... |], where the "..." is a list of top-level declarations; the quotation has type Q [Dec]. - [Planned, but not implemented yet.] [t| ... |], where the "..." is a type; - the quotation has type Type. + [t| ... |], where the "..." is a type; + the quotation has type Q Typ. + + + + A quasi-quotation can appear in either a pattern context or an + expression context and is also written in Oxford brackets: + + [:varid| ... |], + where the "..." is an arbitrary string; a full description of the + quasi-quotation facility is given in . - Reification is written thus: + A name can be quoted with either one or two prefix single quotes: - reifyDecl T, where T is a type constructor; this expression - has type Dec. - reifyDecl C, where C is a class; has type Dec. - reifyType f, where f is an identifier; has type Typ. - Still to come: fixities - - + 'f has type Name, and names the function f. + Similarly 'C has type Name and names the data constructor C. + In general 'thing interprets thing in an expression context. + + ''T has type Name, and names the type constructor T. + That is, ''thing interprets thing in a type context. + + + These Names can be used to construct Template Haskell expressions, patterns, declarations etc. They + may also be given as an argument to the reify function. + +(Compared to the original paper, there are many differences of detail. +The syntax for a declaration splice uses "$" not "splice". +The type of the enclosed expression must be Q [Dec], not [Q Dec]. +Type splices are not implemented, and neither are pattern splices or quotations. + Using Template Haskell @@ -4155,6 +5179,18 @@ Tim Sheard is going to expand it.) (It would make sense to do so, but it's hard to implement.) + + You can only run a function at compile time if it is imported + from another module that is not part of a mutually-recursive group of modules + that includes the module currently being compiled. Furthermore, all of the modules of + the mutually-recursive group must be reachable by non-SOURCE imports from the module where the + splice is to be run. + + For example, when compiling module A, + you can only run Template Haskell functions imported from B if B does not import A (directly or indirectly). + The reason should be clear: to run B we must compile and run A, but we are currently type-checking A. + + The flag -ddump-splices shows the expansion of all top-level splices as they happen. @@ -4173,7 +5209,7 @@ Tim Sheard is going to expand it.) - A Template Haskell Worked Example + A Template Haskell Worked Example To help you get over the confidence barrier, try out this skeletal worked example. First cut and paste the two modules below into "Main.hs" and "Printf.hs": @@ -4213,21 +5249,21 @@ parse s = [ L s ] -- Generate Haskell source code from a parsed representation -- of the format string. This code will be spliced into -- the module which calls "pr", at compile time. -gen :: [Format] -> ExpQ +gen :: [Format] -> Q Exp gen [D] = [| \n -> show n |] gen [S] = [| \s -> s |] gen [L s] = stringE s -- Here we generate the Haskell code for the splice -- from an input format string. -pr :: String -> ExpQ -pr s = gen (parse s) +pr :: String -> Q Exp +pr s = gen (parse s) Now run the compiler (here we are a Cygwin prompt on Windows): -$ ghc --make -fth main.hs -o main.exe +$ ghc --make -XTemplateHaskell main.hs -o main.exe Run "main.exe" and here is your output: @@ -4268,7 +5304,7 @@ The basic idea is to compile the program twice: Then compile it again with , and additionally use - to name the object files differentliy (you can choose any suffix + to name the object files differently (you can choose any suffix that isn't the normal object suffix here). GHC will automatically load the object files built in the first step when executing splice expressions. If you omit the flag when @@ -4278,6 +5314,124 @@ The basic idea is to compile the program twice: + Template Haskell Quasi-quotation +Quasi-quotation allows patterns and expressions to be written using +programmer-defined concrete syntax; the motivation behind the extension and +several examples are documented in +"Why It's +Nice to be Quoted: Quasiquoting for Haskell" (Proc Haskell Workshop +2007). The example below shows how to write a quasiquoter for a simple +expression language. + + +In the example, the quasiquoter expr is bound to a value of +type Language.Haskell.TH.Quote.QuasiQuoter which contains two +functions for quoting expressions and patterns, respectively. The first argument +to each quoter is the (arbitrary) string enclosed in the Oxford brackets. The +context of the quasi-quotation statement determines which of the two parsers is +called: if the quasi-quotation occurs in an expression context, the expression +parser is called, and if it occurs in a pattern context, the pattern parser is +called. + + +Note that in the example we make use of an antiquoted +variable n, indicated by the syntax 'int:n +(this syntax for anti-quotation was defined by the parser's +author, not by GHC). This binds n to the +integer value argument of the constructor IntExpr when +pattern matching. Please see the referenced paper for further details regarding +anti-quotation as well as the description of a technique that uses SYB to +leverage a single parser of type String -> a to generate both +an expression parser that returns a value of type Q Exp and a +pattern parser that returns a value of type Q Pat. + + +In general, a quasi-quote has the form +[$quoter| string |]. +The quoter must be the name of an imported quoter; it +cannot be an arbitrary expression. The quoted string +can be arbitrary, and may contain newlines. + + +Quasiquoters must obey the same stage restrictions as Template Haskell, e.g., in +the example, expr cannot be defined +in Main.hs where it is used, but must be imported. + + + + +{- Main.hs -} +module Main where + +import Expr + +main :: IO () +main = do { print $ eval [$expr|1 + 2|] + ; case IntExpr 1 of + { [$expr|'int:n|] -> print n + ; _ -> return () + } + } + + +{- Expr.hs -} +module Expr where + +import qualified Language.Haskell.TH as TH +import Language.Haskell.TH.Quasi + +data Expr = IntExpr Integer + | AntiIntExpr String + | BinopExpr BinOp Expr Expr + | AntiExpr String + deriving(Show, Typeable, Data) + +data BinOp = AddOp + | SubOp + | MulOp + | DivOp + deriving(Show, Typeable, Data) + +eval :: Expr -> Integer +eval (IntExpr n) = n +eval (BinopExpr op x y) = (opToFun op) (eval x) (eval y) + where + opToFun AddOp = (+) + opToFun SubOp = (-) + opToFun MulOp = (*) + opToFun DivOp = div + +expr = QuasiQuoter parseExprExp parseExprPat + +-- Parse an Expr, returning its representation as +-- either a Q Exp or a Q Pat. See the referenced paper +-- for how to use SYB to do this by writing a single +-- parser of type String -> Expr instead of two +-- separate parsers. + +parseExprExp :: String -> Q Exp +parseExprExp ... + +parseExprPat :: String -> Q Pat +parseExprPat ... + + +Now run the compiler: + + +$ ghc --make -XQuasiQuotes Main.hs -o main + + +Run "main" and here is your output: + + +$ ./main +3 +1 + + + + @@ -4316,7 +5470,7 @@ Palgrave, 2003. and the arrows web page at http://www.haskell.org/arrows/. -With the flag, GHC supports the arrow +With the flag, GHC supports the arrow notation described in the second of these papers. What follows is a brief introduction to the notation; it won't make much sense unless you've read Hughes's paper. @@ -4756,25 +5910,169 @@ The module must import The preprocessor cannot cope with other Haskell extensions. These would have to go in separate modules. - + + + + +Because the preprocessor targets Haskell (rather than Core), +let-bound variables are monomorphic. + + + + + + + + + + + + + +Bang patterns +<indexterm><primary>Bang patterns</primary></indexterm> + +GHC supports an extension of pattern matching called bang +patterns. Bang patterns are under consideration for Haskell Prime. +The Haskell +prime feature description contains more discussion and examples +than the material below. + + +Bang patterns are enabled by the flag . + + + +Informal description of bang patterns + + +The main idea is to add a single new production to the syntax of patterns: + + pat ::= !pat + +Matching an expression e against a pattern !p is done by first +evaluating e (to WHNF) and then matching the result against p. +Example: + +f1 !x = True + +This definition makes f1 is strict in x, +whereas without the bang it would be lazy. +Bang patterns can be nested of course: + +f2 (!x, y) = [x,y] + +Here, f2 is strict in x but not in +y. +A bang only really has an effect if it precedes a variable or wild-card pattern: + +f3 !(x,y) = [x,y] +f4 (x,y) = [x,y] + +Here, f3 and f4 are identical; putting a bang before a pattern that +forces evaluation anyway does nothing. + +Bang patterns work in case expressions too, of course: + +g5 x = let y = f x in body +g6 x = case f x of { y -> body } +g7 x = case f x of { !y -> body } + +The functions g5 and g6 mean exactly the same thing. +But g7 evaluates (f x), binds y to the +result, and then evaluates body. + +Bang patterns work in let and where +definitions too. For example: + +let ![x,y] = e in b + +is a strict pattern: operationally, it evaluates e, matches +it against the pattern [x,y], and then evaluates b +The "!" should not be regarded as part of the pattern; after all, +in a function argument ![x,y] means the +same as [x,y]. Rather, the "!" +is part of the syntax of let bindings. + + + + + +Syntax and semantics + + + +We add a single new production to the syntax of patterns: + + pat ::= !pat + +There is one problem with syntactic ambiguity. Consider: + +f !x = 3 + +Is this a definition of the infix function "(!)", +or of the "f" with a bang pattern? GHC resolves this +ambiguity in favour of the latter. If you want to define +(!) with bang-patterns enabled, you have to do so using +prefix notation: + +(!) f x = 3 + +The semantics of Haskell pattern matching is described in +Section 3.17.2 of the Haskell Report. To this description add +one extra item 10, saying: +Matching +the pattern !pat against a value v behaves as follows: +if v is bottom, the match diverges + otherwise, pat is matched against + v + + +Similarly, in Figure 4 of +Section 3.17.3, add a new case (t): + +case v of { !pat -> e; _ -> e' } + = v `seq` case v of { pat -> e; _ -> e' } + + +That leaves let expressions, whose translation is given in +Section +3.12 +of the Haskell Report. +In the translation box, first apply +the following transformation: for each pattern pi that is of +form !qi = ei, transform it to (xi,!qi) = ((),ei), and and replace e0 +by (xi `seq` e0). Then, when none of the left-hand-side patterns +have a bang at the top, apply the rules in the existing box. + +The effect of the let rule is to force complete matching of the pattern +qi before evaluation of the body is begun. The bang is +retained in the translated form in case qi is a variable, +thus: + + let !y = f x in b + - + -Because the preprocessor targets Haskell (rather than Core), -let-bound variables are monomorphic. +The let-binding can be recursive. However, it is much more common for +the let-binding to be non-recursive, in which case the following law holds: +(let !p = rhs in body) + is equivalent to +(case rhs of !p -> body) - - - + +A pattern with a bang at the outermost level is not allowed at the top level of +a module. - - - + Assertions <indexterm><primary>Assertions</primary></indexterm> @@ -4882,83 +6180,147 @@ Assertion failures can be caught, see the documentation for the word that GHC understands are described in the following sections; any pragma encountered with an unrecognised word is (silently) - ignored. + ignored. The layout rule applies in pragmas, so the closing #-} + should start in a column to the right of the opening {-#. + + Certain pragmas are file-header pragmas. A file-header + pragma must precede the module keyword in the file. + There can be as many file-header pragmas as you please, and they can be + preceded or followed by comments. + + + LANGUAGE pragma + + LANGUAGEpragma + pragmaLANGUAGE + + The LANGUAGE pragma allows language extensions to be enabled + in a portable way. + It is the intention that all Haskell compilers support the + LANGUAGE pragma with the same syntax, although not + all extensions are supported by all compilers, of + course. The LANGUAGE pragma should be used instead + of OPTIONS_GHC, if possible. + + For example, to enable the FFI and preprocessing with CPP: - - DEPRECATED pragma - DEPRECATED +{-# LANGUAGE ForeignFunctionInterface, CPP #-} + + LANGUAGE is a file-header pragma (see ). + + Every language extension can also be turned into a command-line flag + by prefixing it with "-X"; for example . + (Similarly, all "-X" flags can be written as LANGUAGE pragmas. + + + A list of all supported language extensions can be obtained by invoking + ghc --supported-languages (see ). + + Any extension from the Extension type defined in + Language.Haskell.Extension + may be used. GHC will report an error if any of the requested extensions are not supported. + + + + + OPTIONS_GHC pragma + OPTIONS_GHC + + pragmaOPTIONS_GHC - The DEPRECATED pragma lets you specify that a particular - function, class, or type, is deprecated. There are two - forms. + The OPTIONS_GHC pragma is used to specify + additional options that are given to the compiler when compiling + this source file. See for + details. + + Previous versions of GHC accepted OPTIONS rather + than OPTIONS_GHC, but that is now deprecated. + + + OPTIONS_GHC is a file-header pragma (see ). + + + INCLUDE pragma + + The INCLUDE pragma is for specifying the names + of C header files that should be #include'd into + the C source code generated by the compiler for the current module (if + compiling via C). For example: + + +{-# INCLUDE "foo.h" #-} +{-# INCLUDE <stdio.h> #-} + + INCLUDE is a file-header pragma (see ). + + An INCLUDE pragma is the preferred alternative + to the option (), because the + INCLUDE pragma is understood by other + compilers. Yet another alternative is to add the include file to each + foreign import declaration in your code, but we + don't recommend using this approach with GHC. + + + + WARNING and DEPRECATED pragmas + WARNING + DEPRECATED + + The WARNING pragma allows you to attach an arbitrary warning + to a particular function, class, or type. + A DEPRECATED pragma lets you specify that + a particular function, class, or type is deprecated. + There are two ways of using these pragmas. - You can deprecate an entire module thus: + You can work on an entire module thus: module Wibble {-# DEPRECATED "Use Wobble instead" #-} where ... + Or: + + module Wibble {-# WARNING "This is an unstable interface." #-} where + ... + When you compile any module that import Wibble, GHC will print the specified message. - You can deprecate a function, class, type, or data constructor, with the - following top-level declaration: + You can attach a warning to a function, class, type, or data constructor, with the + following top-level declarations: {-# DEPRECATED f, C, T "Don't use these" #-} + {-# WARNING unsafePerformIO "This is unsafe; I hope you know what you're doing" #-} When you compile any module that imports and uses any of the specified entities, GHC will print the specified message. - You can only depecate entities declared at top level in the module + You can only attach to entities declared at top level in the module being compiled, and you can only use unqualified names in the list of - entities being deprecated. A capitalised name, such as T + entities. A capitalised name, such as T refers to either the type constructor T or the data constructor T, or both if - both are in scope. If both are in scope, there is currently no way to deprecate - one without the other (c.f. fixities ). + both are in scope. If both are in scope, there is currently no way to + specify one without the other (c.f. fixities + ). - Any use of the deprecated item, or of anything from a deprecated - module, will be flagged with an appropriate message. However, - deprecations are not reported for - (a) uses of a deprecated function within its defining module, and - (b) uses of a deprecated function in an export list. + Warnings and deprecations are not reported for + (a) uses within the defining module, and + (b) uses in an export list. The latter reduces spurious complaints within a library in which one module gathers together and re-exports the exports of several others. You can suppress the warnings with the flag - . - - - - INCLUDE pragma - - The INCLUDE pragma is for specifying the names - of C header files that should be #include'd into - the C source code generated by the compiler for the current module (if - compiling via C). For example: - - -{-# INCLUDE "foo.h" #-} -{-# INCLUDE <stdio.h> #-} - - The INCLUDE pragma(s) must appear at the top of - your source file with any OPTIONS_GHC - pragma(s). - - An INCLUDE pragma is the preferred alternative - to the option (), because the - INCLUDE pragma is understood by other - compilers. Yet another alternative is to add the include file to each - foreign import declaration in your code, but we - don't recommend using this approach with GHC. + . @@ -4985,20 +6347,41 @@ Assertion failures can be caught, see the documentation for the key_function :: Int -> String -> (Bool, Double) - -#ifdef __GLASGOW_HASKELL__ {-# INLINE key_function #-} -#endif - (You don't need to do the C pre-processor carry-on - unless you're going to stick the code through HBC—it - doesn't like INLINE pragmas.) - The major effect of an INLINE pragma is to declare a function's “cost” to be very low. The normal unfolding machinery will then be very keen to - inline it. + inline it. However, an INLINE pragma for a + function "f" has a number of other effects: + + +No functions are inlined into f. Otherwise +GHC might inline a big function into f's right hand side, +making f big; and then inline f blindly. + + +The float-in, float-out, and common-sub-expression transformations are not +applied to the body of f. + + +An INLINE function is not worker/wrappered by strictness analysis. +It's going to be inlined wholesale instead. + + +All of these effects are aimed at ensuring that what gets inlined is +exactly what you asked for, no more and no less. + +GHC ensures that inlining cannot go on forever: every mutually-recursive +group is cut by one or more loop breakers that is never inlined +(see +Secrets of the GHC inliner, JFP 12(4) July 2002). +GHC tries not to select a function with an INLINE pragma as a loop breaker, but +when there is no choice even an INLINE function can be selected, in which case +the INLINE pragma is ignored. +For example, for a self-recursive function, the loop breaker can only be the function +itself, so an INLINE pragma is always ignored. Syntactically, an INLINE pragma for a function can be put anywhere its type signature could be @@ -5012,14 +6395,18 @@ key_function :: Int -> String -> (Bool, Double) UniqueSupply monad code, we have: -#ifdef __GLASGOW_HASKELL__ {-# INLINE thenUs #-} {-# INLINE returnUs #-} -#endif See also the NOINLINE pragma (). + + Note: the HBC compiler doesn't like INLINE pragmas, + so if you want your code to be HBC-compatible you'll have to surround + the pragma with C pre-processor directives + #ifdef __GLASGOW_HASKELL__...#endif. + @@ -5073,7 +6460,7 @@ key_function :: Int -> String -> (Bool, Double) there was no pragma). - "INLINE[~k] f" means: be willing to inline + "NOINLINE[~k] f" means: be willing to inline f until phase k, but from phase k onwards do not inline it. @@ -5107,29 +6494,6 @@ happen. - - LANGUAGE pragma - - LANGUAGEpragma - pragmaLANGUAGE - - This allows language extensions to be enabled in a portable way. - It is the intention that all Haskell compilers support the - LANGUAGE pragma with the same syntax, although not - all extensions are supported by all compilers, of - course. The LANGUAGE pragma should be used instead - of OPTIONS_GHC, if possible. - - For example, to enable the FFI and preprocessing with CPP: - -{-# LANGUAGE ForeignFunctionInterface, CPP #-} - - Any extension from the Extension type defined in - Language.Haskell.Extension may be used. GHC will report an error if any of the requested extensions are not supported. - - - LINE pragma @@ -5149,22 +6513,6 @@ happen. pragma. - - OPTIONS_GHC pragma - OPTIONS_GHC - - pragmaOPTIONS_GHC - - - The OPTIONS_GHC pragma is used to specify - additional options that are given to the compiler when compiling - this source file. See for - details. - - Previous versions of GHC accepted OPTIONS rather - than OPTIONS_GHC, but that is now deprecated. - - RULES pragma @@ -5210,7 +6558,7 @@ happen. {-# SPECIALIZE f :: <type> #-} - is valid if and only if the defintion + is valid if and only if the definition f_spec :: <type> f_spec = f @@ -5226,7 +6574,7 @@ happen. h :: Eq a => a -> a -> a {-# SPECIALISE h :: (Eq a) => [a] -> [a] -> [a] #-} - + The last of these examples will generate a RULE with a somewhat-complex left-hand side (try it yourself), so it might not fire very well. If you use this kind of specialisation, let us know how well it works. @@ -5235,7 +6583,7 @@ well. If you use this kind of specialisation, let us know how well it works. A SPECIALIZE pragma can optionally be followed with a INLINE or NOINLINE pragma, optionally followed by a phase, as described in . -The INLINE pragma affects the specialised verison of the +The INLINE pragma affects the specialised version of the function (only), and applies even if the function is recursive. The motivating example is this: @@ -5343,14 +6691,14 @@ data T = T {-# UNPACK #-} !(Int,Int) will store the two Ints directly in the T constructor, by flattening the pair. - Multi-level unpacking is also supported: + Multi-level unpacking is also supported: data T = T {-# UNPACK #-} !S data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int - will store two unboxed Int#s + will store two unboxed Int#s directly in the T constructor. The unpacker can see through newtypes, too. @@ -5364,6 +6712,15 @@ data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int constructor field. + + SOURCE pragma + + SOURCE + The {-# SOURCE #-} pragma is used only in import declarations, + to break a module loop. It is described in detail in . + + + @@ -5377,23 +6734,19 @@ data S = S {-# UNPACK #-} !Int {-# UNPACK #-} !Int The programmer can specify rewrite rules as part of the source program -(in a pragma). GHC applies these rewrite rules wherever it can, provided (a) -the flag () is on, -and (b) the flag -() is not specified, and (c) the - () -flag is active. - - - +(in a pragma). Here is an example: {-# RULES - "map/map" forall f g xs. map f (map g xs) = map (f.g) xs - #-} + "map/map" forall f g xs. map f (map g xs) = map (f.g) xs + #-} - + + +Use the debug flag to see what rules fired. +If you need more information, then shows you +each individual rule firing in detail. @@ -5403,15 +6756,32 @@ Here is an example: From a syntactic point of view: - + - There may be zero or more rules in a RULES pragma. + There may be zero or more rules in a RULES pragma, separated by semicolons (which + may be generated by the layout rule). + +The layout rule applies in a pragma. +Currently no new indentation level +is set, so if you put several rules in single RULES pragma and wish to use layout to separate them, +you must lay out the starting in the same column as the enclosing definitions. + + {-# RULES + "map/map" forall f g xs. map f (map g xs) = map (f.g) xs + "map/append" forall f xs ys. map f (xs ++ ys) = map f xs ++ map f ys + #-} + +Furthermore, the closing #-} +should start in a column to the right of the opening {-#. + + + Each rule has a name, enclosed in double quotes. The name itself has no significance at all. It is only used when reporting how many times the rule fired. @@ -5425,7 +6795,7 @@ immediately after the name of the rule. Thus: {-# RULES "map/map" [2] forall f g xs. map f (map g xs) = map (f.g) xs - #-} + #-} The "[2]" means that the rule is active in Phase 2 and subsequent phases. The inverse notation "[~2]" is also accepted, meaning that the rule is active up to, but not including, @@ -5434,17 +6804,8 @@ Phase 2. - - - - Layout applies in a RULES pragma. Currently no new indentation level -is set, so you must lay out your rules starting in the same column as the -enclosing definitions. - - - Each variable mentioned in a rule must either be in scope (e.g. map), or bound by the forall (e.g. f, g, xs). The variables bound by @@ -5493,17 +6854,40 @@ variables it mentions, though of course they need to be in scope. - Rules are automatically exported from a module, just as instance declarations are. + All rules are implicitly exported from the module, and are therefore +in force in any module that imports the module that defined the rule, directly +or indirectly. (That is, if A imports B, which imports C, then C's rules are +in force when compiling A.) The situation is very similar to that for instance +declarations. + + + + + + +Inside a RULE "forall" is treated as a keyword, regardless of +any other flag settings. Furthermore, inside a RULE, the language extension + is automatically enabled; see +. + + +Like other pragmas, RULE pragmas are always checked for scope errors, and +are typechecked. Typechecking means that the LHS and RHS of a rule are typechecked, +and must have the same type. However, rules are only enabled +if the flag is +on (see ). + + - + Semantics @@ -5511,9 +6895,17 @@ From a semantic point of view: - -Rules are only applied if you use the flag. +Rules are enabled (that is, used during optimisation) +by the flag. +This flag is implied by , and may be switched +off (as usual) by . +(NB: enabling without +may not do what you expect, though, because without GHC +ignores all optimisation information in interface files; +see , .) +Note that is an optimisation flag, and +has no effect on parsing or typechecking. @@ -5530,14 +6922,6 @@ expression by substituting for the pattern variables. - The LHS and RHS of a rule are typechecked, and must have the -same type. - - - - - - GHC makes absolutely no attempt to verify that the LHS and RHS of a rule have the same meaning. That is undecidable in general, and infeasible in most interesting cases. The responsibility is entirely the programmer's! @@ -5551,7 +6935,7 @@ infeasible in most interesting cases. The responsibility is entirely the progra terminating. For example: - "loop" forall x,y. f x y = f y x + "loop" forall x y. f x y = f y x This rule will cause the compiler to go into an infinite loop. @@ -5604,48 +6988,32 @@ not be substituted, and the rule would not fire. - In the earlier phases of compilation, GHC inlines nothing -that appears on the LHS of a rule, because once you have substituted -for something you can't match against it (given the simple minded -matching). So if you write the rule - +Ordinary inlining happens at the same time as rule rewriting, which may lead to unexpected +results. Consider this (artificial) example - "map/map" forall f,g. map f . map g = map (f.g) - +f x = x +{-# RULES "f" f True = False #-} -this won't match the expression map f (map g xs). -It will only match something written with explicit use of ".". -Well, not quite. It will match the expression +g y = f y - -wibble f g xs +h z = g True - -where wibble is defined: - +Since f's right-hand side is small, it is inlined into g, +to give -wibble f g = map f . map g +g y = y - -because wibble will be inlined (it's small). - -Later on in compilation, GHC starts inlining even things on the -LHS of rules, but still leaves the rules enabled. This inlining -policy is controlled by the per-simplification-pass flag n. - +Now g is inlined into h, but f's RULE has +no chance to fire. +If instead GHC had first inlined g into h then there +would have been a better chance that f's RULE might fire. - - - - All rules are implicitly exported from the module, and are therefore -in force in any module that imports the module that defined the rule, directly -or indirectly. (That is, if A imports B, which imports C, then C's rules are -in force when compiling A.) The situation is very similar to that for instance -declarations. +The way to get predictable behaviour is to use a NOINLINE +pragma on f, to ensure +that it is not inlined until its RULEs have had a chance to fire. - @@ -5743,12 +7111,6 @@ The following are good consumers: - length - - - - - ++ (on its first argument) @@ -5937,7 +7299,7 @@ If you add you get a more detailed listing. - The definition of (say) build in GHC/Base.lhs looks llike this: + The definition of (say) build in GHC/Base.lhs looks like this: build :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a] @@ -5994,7 +7356,7 @@ g x = show x - However, when external for is generated (via + However, when external core is generated (via ), there will be Notes attached to the expressions show and x. The core function declaration for f is: @@ -6016,7 +7378,7 @@ r) GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha r) -> tpl2}) - (%note "foo" + (%note "bar" eta); @@ -6034,78 +7396,16 @@ r) -> Special built-in functions -GHC has a few built-in funcions with special behaviour, -described in this section. All are exported by -GHC.Exts. - - The <literal>inline</literal> function - -The inline function is somewhat experimental. - - inline :: a -> a - -The call (inline f) arranges that f -is inlined, regardless of its size. More precisely, the call -(inline f) rewrites to the right-hand side of f's -definition. -This allows the programmer to control inlining from -a particular call site -rather than the definition site of the function -(c.f. INLINE pragmas ). - - -This inlining occurs regardless of the argument to the call -or the size of f's definition; it is unconditional. -The main caveat is that f's definition must be -visible to the compiler. That is, f must be -let-bound in the current scope. -If no inlining takes place, the inline function -expands to the identity function in Phase zero; so its use imposes -no overhead. - - If the function is defined in another -module, GHC only exposes its inlining in the interface file if the -function is sufficiently small that it might be -inlined by the automatic mechanism. There is currently no way to tell -GHC to expose arbitrarily-large functions in the interface file. (This -shortcoming is something that could be fixed, with some kind of pragma.) - - - - The <literal>inline</literal> function - -The lazy function restrains strictness analysis a little: - - lazy :: a -> a - -The call (lazy e) means the same as e, -but lazy has a magical property so far as strictness -analysis is concerned: it is lazy in its first argument, -even though its semantics is strict. After strictness analysis has run, -calls to lazy are inlined to be the identity function. - - -This behaviour is occasionally useful when controlling evaluation order. -Notably, lazy is used in the library definition of -Control.Parallel.par: - - par :: a -> b -> b - par x y = case (par# x) of { _ -> lazy y } - -If lazy were not lazy, par would -look strict in y which would defeat the whole -purpose of par. - - +GHC has a few built-in functions with special behaviour. These +are now described in the module GHC.Prim +in the library documentation. Generic classes - (Note: support for generic classes is currently broken in - GHC 5.02). - The ideas behind this extension are described in detail in "Derivable type classes", Ralf Hinze and Simon Peyton Jones, Haskell Workshop, Montreal Sept 2000, pp94-105. @@ -6156,7 +7456,7 @@ where clause and over-ride whichever methods you please. Use the flags (to enable the extra syntax), - (to generate extra per-data-type code), + (to generate extra per-data-type code), and (to make the Generics library available. @@ -6266,7 +7566,7 @@ So this too is illegal: op2 :: a -> Bool op2 {| p :*: q |} (x :*: y) = False -(The reason for this restriction is that we gather all the equations for a particular type consructor +(The reason for this restriction is that we gather all the equations for a particular type constructor into a single generic instance declaration.) @@ -6297,7 +7597,7 @@ Here, op1, op2, op3 are OK, but op4 is rejected, because it has a type variable inside a list. -This restriction is an implementation restriction: we just havn't got around to +This restriction is an implementation restriction: we just haven't got around to implementing the necessary bidirectional maps over arbitrary type constructors. It would be relatively easy to add specific type constructors, such as Maybe and list, to the ones that are allowed. @@ -6356,12 +7656,58 @@ Just to finish with, here's another example I rather like: + +Control over monomorphism + +GHC supports two flags that control the way in which generalisation is +carried out at let and where bindings. + + + +Switching off the dreaded Monomorphism Restriction + + +Haskell's monomorphism restriction (see +Section +4.5.5 +of the Haskell Report) +can be completely switched off by +. + + + + +Monomorphic pattern bindings + + + + As an experimental change, we are exploring the possibility of + making pattern bindings monomorphic; that is, not generalised at all. + A pattern binding is a binding whose LHS has no function arguments, + and is not a simple variable. For example: + + f x = x -- Not a pattern binding + f = \x -> x -- Not a pattern binding + f :: Int -> Int = \x -> x -- Not a pattern binding + + (g,h) = e -- A pattern binding + (f) = e -- A pattern binding + [x] = e -- A pattern binding + +Experimentally, GHC now makes pattern bindings monomorphic by +default. Use to recover the +standard behaviour. + + + +