X-Git-Url: http://git.megacz.com/?a=blobdiff_plain;f=ghc%2Fdocs%2Fusers_guide%2Fglasgow_exts.sgml;h=6b3bd135160094f51c4a82c3459b4b8640eb17fb;hb=1a4238bccc8be8a71f8ebec15f25d8edf8d084ad;hp=5c3e74eab34986010c098b92bb862730cd413361;hpb=d3b0b334c261f982390edf37213edc35c5992e30;p=ghc-hetmet.git diff --git a/ghc/docs/users_guide/glasgow_exts.sgml b/ghc/docs/users_guide/glasgow_exts.sgml index 5c3e74e..6b3bd13 100644 --- a/ghc/docs/users_guide/glasgow_exts.sgml +++ b/ghc/docs/users_guide/glasgow_exts.sgml @@ -152,7 +152,222 @@ with GHC. -&primitives; + + + Unboxed types and primitive operations + +GHC is built on a raft of primitive data types and operations. +While you really can use this stuff to write fast code, + we generally find it a lot less painful, and more satisfying in the + long run, to use higher-level language features and libraries. With + any luck, the code you write will be optimised to the efficient + unboxed version in any case. And if it isn't, we'd like to know + about it. + +We do not currently have good, up-to-date documentation about the +primitives, perhaps because they are mainly intended for internal use. +There used to be a long section about them here in the User Guide, but it +became out of date, and wrong information is worse than none. + +The Real Truth about what primitive types there are, and what operations +work over those types, is held in the file +fptools/ghc/compiler/prelude/primops.txt. +This file is used directly to generate GHC's primitive-operation definitions, so +it is always correct! It is also intended for processing into text. + + Indeed, +the result of such processing is part of the description of the + External + Core language. +So that document is a good place to look for a type-set version. +We would be very happy if someone wanted to volunteer to produce an SGML +back end to the program that processes primops.txt so that +we could include the results here in the User Guide. + +What follows here is a brief summary of some main points. + + +Unboxed types + + + +Unboxed types (Glasgow extension) + + +Most types in GHC are boxed, which means +that values of that type are represented by a pointer to a heap +object. The representation of a Haskell Int, for +example, is a two-word heap object. An unboxed +type, however, is represented by the value itself, no pointers or heap +allocation are involved. + + + +Unboxed types correspond to the “raw machine” types you +would use in C: Int# (long int), +Double# (double), Addr# +(void *), etc. The primitive operations +(PrimOps) on these types are what you might expect; e.g., +(+#) is addition on +Int#s, and is the machine-addition that we all +know and love—usually one instruction. + + + +Primitive (unboxed) types cannot be defined in Haskell, and are +therefore built into the language and compiler. Primitive types are +always unlifted; that is, a value of a primitive type cannot be +bottom. We use the convention that primitive types, values, and +operations have a # suffix. + + + +Primitive values are often represented by a simple bit-pattern, such +as Int#, Float#, +Double#. But this is not necessarily the case: +a primitive value might be represented by a pointer to a +heap-allocated object. Examples include +Array#, the type of primitive arrays. A +primitive array is heap-allocated because it is too big a value to fit +in a register, and would be too expensive to copy around; in a sense, +it is accidental that it is represented by a pointer. If a pointer +represents a primitive value, then it really does point to that value: +no unevaluated thunks, no indirections…nothing can be at the +other end of the pointer than the primitive value. + + + +There are some restrictions on the use of primitive types, the main +one being that you can't pass a primitive value to a polymorphic +function or store one in a polymorphic data type. This rules out +things like [Int#] (i.e. lists of primitive +integers). The reason for this restriction is that polymorphic +arguments and constructor fields are assumed to be pointers: if an +unboxed integer is stored in one of these, the garbage collector would +attempt to follow it, leading to unpredictable space leaks. Or a +seq operation on the polymorphic component may +attempt to dereference the pointer, with disastrous results. Even +worse, the unboxed value might be larger than a pointer +(Double# for instance). + + + +Nevertheless, A numerically-intensive program using unboxed types can +go a lot faster than its “standard” +counterpart—we saw a threefold speedup on one example. + + + + + +Unboxed Tuples + + + +Unboxed tuples aren't really exported by GHC.Exts, +they're available by default with . An +unboxed tuple looks like this: + + + + + +(# e_1, ..., e_n #) + + + + + +where e_1..e_n are expressions of any +type (primitive or non-primitive). The type of an unboxed tuple looks +the same. + + + +Unboxed tuples are used for functions that need to return multiple +values, but they avoid the heap allocation normally associated with +using fully-fledged tuples. When an unboxed tuple is returned, the +components are put directly into registers or on the stack; the +unboxed tuple itself does not have a composite representation. Many +of the primitive operations listed in this section return unboxed +tuples. + + + +There are some pretty stringent restrictions on the use of unboxed tuples: + + + + + + + + + Unboxed tuple types are subject to the same restrictions as +other unboxed types; i.e. they may not be stored in polymorphic data +structures or passed to polymorphic functions. + + + + + + + Unboxed tuples may only be constructed as the direct result of +a function, and may only be deconstructed with a case expression. +eg. the following are valid: + + + +f x y = (# x+1, y-1 #) +g x = case f x x of { (# a, b #) -> a + b } + + + +but the following are invalid: + + + +f x y = g (# x, y #) +g (# x, y #) = x + y + + + + + + + + + No variable can have an unboxed tuple type. This is illegal: + + + +f :: (# Int, Int #) -> (# Int, Int #) +f x = x + + + +because x has an unboxed tuple type. + + + + + + + + + +Note: we may relax some of these restrictions in the future. + + + +The IO and ST monads use unboxed +tuples to avoid unnecessary allocation during sequences of operations. + + + + + @@ -357,6 +572,8 @@ the do-notation, and this extension provides the necessary syntactic support. Here is a simple (yet contrived) example: +import Control.Monad.Fix + justOnes = mdo xs <- Just (1:xs) return xs @@ -378,8 +595,9 @@ then that monad must be declared an instance of the MonadFix For details, see the above mentioned reference. -The following instances of MonadFix are automatically provided: List, Maybe, IO, and -state monads (both lazy and strict). +The following instances of MonadFix are automatically provided: List, Maybe, IO. +Furthermore, the Control.Monad.ST and Control.Monad.ST.Lazy modules provide the instances of the MonadFix class +for Haskell's internal state monad (strict and lazy, respectively). There are three important points in using the recursive-do notation: @@ -390,14 +608,10 @@ than do). -If you want to declare an instance of the MonadFix class for one of -your own monads, or you need to refer to the class name MonadFix in any other way (for -instance when writing a type constraint), then your program should -import Control.Monad.MonadFix. -Otherwise, you don't need to import any special libraries to use the mdo-notation. That is, -as long as you only use the predefined instances mentioned above, the mdo-notation will -be automatically available. -To be on the safe side, of course, you can simply import it in all cases. +You should import Control.Monad.Fix. +(Note: Strictly speaking, this import is required only when you need to refer to the name +MonadFix in your program, but the import is always safe, and the programmers +are encouraged to always import this module when using the mdo-notation.) @@ -407,15 +621,15 @@ As with other extensions, ghc should be given the flag -fglasgow-exts -Historical note: The old implementation of the mdo-notation (and most -of the existing documents) used the name -MonadRec for the class and the corresponding library. -This name is no longer supported. +The web page: http://www.cse.ogi.edu/PacSoft/projects/rmb +contains up to date information on recursive monadic bindings. -The web page: http://www.cse.ogi.edu/PacSoft/projects/rmb -contains up to date information on recursive monadic bindings. +Historical note: The old implementation of the mdo-notation (and most +of the existing documents) used the name +MonadRec for the class and the corresponding library. +This name is not supported by GHC. @@ -1341,10 +1555,12 @@ implicitly parameterized by a comparison function named cmp. The dynamic binding constraints are just a new form of predicate in the type class system. -An implicit parameter is introduced by the special form ?x, +An implicit parameter occurs in an expression using the special form ?x, where x is -any valid identifier. Use if this construct also introduces new -dynamic binding constraints. For example, the following definition +any valid identifier (e.g. ord ?x is a valid expression). +Use of this construct also introduces a new +dynamic-binding constraint in the type of the expression. +For example, the following definition shows how we can define an implicitly parameterized sort function in terms of an explicitly parameterized sortBy function: @@ -1353,6 +1569,11 @@ terms of an explicitly parameterized sortBy function: sort :: (?cmp :: a -> a -> Bool) => [a] -> [a] sort = sortBy ?cmp + + + +Implicit-parameter type constraints + Dynamic binding constraints behave just like other type class constraints in that they are automatically propagated. Thus, when a function is used, its implicit parameters are inherited by the @@ -1369,68 +1590,93 @@ propagated. With implicit parameters, the default is to always propagate them. -An implicit parameter differs from other type class constraints in the +An implicit-parameter type constraint differs from other type class constraints in the following way: All uses of a particular implicit parameter must have the same type. This means that the type of (?x, ?x) is (?x::a) => (a,a), and not (?x::a, ?x::b) => (a, b), as would be the case for type class constraints. + + You can't have an implicit parameter in the context of a class or instance +declaration. For example, both these declarations are illegal: + + class (?x::Int) => C a where ... + instance (?x::a) => Foo [a] where ... + +Reason: exactly which implicit parameter you pick up depends on exactly where +you invoke a function. But the ``invocation'' of instance declarations is done +behind the scenes by the compiler, so it's hard to figure out exactly where it is done. +Easiest thing is to outlaw the offending types. + +Implicit-parameter constraints do not cause ambiguity. For example, consider: + + f :: (?x :: [a]) => Int -> Int + f n = n + length ?x + + g :: (Read a, Show a) => String -> String + g s = show (read s) + +Here, g has an ambiguous type, and is rejected, but f +is fine. The binding for ?x at f's call site is +quite unambiguous, and fixes the type a. + + + + +Implicit-parameter bindings + -An implicit parameter is bound using the standard -let binding form, where the bindings must be a -collection of simple bindings to implicit-style variables (no -function-style bindings, and no type signatures); these bindings are -neither polymorphic or recursive. This form binds the implicit -parameters arising in the body, not the free variables as a -let or where would do. For -example, we define the min function by binding -cmp. +An implicit parameter is bound using the standard +let or where binding forms. +For example, we define the min function by binding +cmp. min :: [a] -> a min = let ?cmp = (<=) in least + +A group of implicit-parameter bindings may occur anywhere a normal group of Haskell +bindings can occur, except at top level. That is, they can occur in a let +(including in a list comprehension, or do-notation, or pattern guards), +or a where clause. Note the following points: +An implicit-parameter binding group must be a +collection of simple bindings to implicit-style variables (no +function-style bindings, and no type signatures); these bindings are +neither polymorphic or recursive. + + You may not mix implicit-parameter bindings with ordinary bindings in a single let expression; use two nested lets instead. +(In the case of where you are stuck, since you can't nest where clauses.) You may put multiple implicit-parameter bindings in a -single let expression; they are not treated +single binding group; but they are not treated as a mutually recursive group (as ordinary let bindings are). -Instead they are treated as a non-recursive group, each scoping over the bindings that -follow. For example, consider: +Instead they are treated as a non-recursive group, simultaneously binding all the implicit +parameter. The bindings are not nested, and may be re-ordered without changing +the meaning of the program. +For example, consider: - f y = let { ?x = y; ?x = ?x+1 } in ?x + f t = let { ?x = t; ?y = ?x+(1::Int) } in ?x + ?y -This function adds one to its argument. - - - -You may not have an implicit-parameter binding in a where clause, -only in a let binding. - - - - You can't have an implicit parameter in the context of a class or instance -declaration. For example, both these declarations are illegal: +The use of ?x in the binding for ?y does not "see" +the binding for ?x, so the type of f is - class (?x::Int) => C a where ... - instance (?x::a) => Foo [a] where ... + f :: (?x::Int) => Int -> Int -Reason: exactly which implicit parameter you pick up depends on exactly where -you invoke a function. But the ``invocation'' of instance declarations is done -behind the scenes by the compiler, so it's hard to figure out exactly where it is done. -Easiest thing is to outlaw the offending types. - + + @@ -2972,6 +3218,9 @@ Template Meta-programming for Haskell", in Proc Haskell Workshop 2002. + The first example from that paper is set out below as a worked example to help get you started. + + The documentation here describes the realisation in GHC. (It's rather sketchy just now; Tim Sheard is going to expand it.) @@ -2992,21 +3241,25 @@ Tim Sheard is going to expand it.) A splice can occur in place of - an expression; - a list of top-level declarations; - a pattern; - a type; + an expression; the spliced expression must have type Expr + a list of top-level declarations; ; the spliced expression must have type Q [Dec] + a type; the spliced expression must have type Type. + (Note that the syntax for a declaration splice uses "$" not "splice" as in + the paper. Also the type of the enclosed expression must be Q [Dec], not [Q Dec] + as in the paper.) A expression quotation is written in Oxford brackets, thus: - [| ... |], where the "..." is an expression; - [d| ... |], where the "..." is a list of top-level declarations; - [p| ... |], where the "..." is a pattern; - [t| ... |], where the "..." is a type; + [| ... |], where the "..." is an expression; + the quotation has type Expr. + [d| ... |], where the "..." is a list of top-level declarations; + the quotation has type Q [Dec]. + [t| ... |], where the "..." is a type; + the quotation has type Type. @@ -3035,12 +3288,6 @@ Tim Sheard is going to expand it.) - If the module contains any top-level splices that must be run, you must use GHC with - --make or --interactive flags. (Reason: that - means it walks the dependency tree and knows what modules must be linked etc.) - - - You can only run a function at compile time if it is imported from another module. That is, you can't define a function in a module, and call it from within a splice in the same module. (It would make sense to do so, but it's hard to implement.) @@ -3049,8 +3296,86 @@ Tim Sheard is going to expand it.) The flag -ddump-splices shows the expansion of all top-level splices as they happen. + + If you are building GHC from source, you need at least a stage-2 bootstrap compiler to + run Template Haskell. A stage-1 compiler will reject the TH constructs. Reason: TH + compiles and runs a program, and then looks at the result. So it's important that + the program it compiles produces results whose representations are identical to + those of the compiler itself. + + Template Haskell works in any mode (--make, --interactive, + or file-at-a-time). There used to be a restriction to the former two, but that restriction + has been lifted. + + + + A Template Haskell Worked Example +To help you get over the confidence barrier, try out this skeletal worked example. + First cut and paste the two modules below into "Main.hs" and "Printf.hs": + + +{- Main.hs -} +module Main where + +-- Import our template "pr" +import Printf ( pr ) + +-- The splice operator $ takes the Haskell source code +-- generated at compile time by "pr" and splices it into +-- the argument of "putStrLn". +main = putStrLn ( $(pr "Hello") ) + + + +{- Printf.hs -} +module Printf where + +-- Skeletal printf from the paper. +-- It needs to be in a separate module to the one where +-- you intend to use it. + +-- Import some Template Haskell syntax +import Language.Haskell.THSyntax + +-- Describe a format string +data Format = D | S | L String + +-- Parse a format string. This is left largely to you +-- as we are here interested in building our first ever +-- Template Haskell program and not in building printf. +parse :: String -> [Format] +parse s = [ L s ] + +-- Generate Haskell source code from a parsed representation +-- of the format string. This code will be spliced into +-- the module which calls "pr", at compile time. +gen :: [Format] -> Expr +gen [D] = [| \n -> show n |] +gen [S] = [| \s -> s |] +gen [L s] = string s + +-- Here we generate the Haskell code for the splice +-- from an input format string. +pr :: String -> Expr +pr s = gen (parse s) + + +Now run the compiler (here we are using a "stage three" build of GHC, at a Cygwin prompt on Windows): + + +ghc/compiler/stage3/ghc-inplace --make -fglasgow-exts -package haskell-src main.hs -o main.exe + + +Run "main.exe" and here is your output: + + + +$ ./main +Hello + + @@ -3279,11 +3604,14 @@ hammeredLookup :: Ord key => [(key, value)] -> key -> value {-# SPECIALIZE hammeredLookup :: [(Widget, value)] -> Widget -> value #-} + A SPECIALIZE pragma for a function can + be put anywhere its type signature could be put. + To get very fancy, you can also specify a named function to use for the specialised value, as in: -{-# RULES hammeredLookup = blah #-} +{-# RULES "hammeredLookup" hammeredLookup = blah #-} where blah is an implementation of @@ -3306,7 +3634,7 @@ hammeredLookup :: Ord key => [(key, value)] -> key -> value toDouble :: Real a => a -> Double toDouble = fromRational . toRational -{-# SPECIALIZE toDouble :: Int -> Double = i2d #-} +{-# RULES "toDouble/Int" toDouble = i2d #-} i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly @@ -3315,9 +3643,6 @@ i2d (I# i) = D# (int2Double# i) -- uses Glasgow prim-op directly Rational—is obscenely expensive by comparison. - A SPECIALIZE pragma for a function can - be put anywhere its type signature could be put. -