From 8704ccbf31ade8864835228e6436cad2665591ee Mon Sep 17 00:00:00 2001 From: simonmar Date: Tue, 20 Mar 2001 15:36:40 +0000 Subject: [PATCH] [project @ 2001-03-20 15:36:40 by simonmar] Split the primitives section out into a separate file, and add a paragraph discouraging its use. --- ghc/docs/users_guide/glasgow_exts.sgml | 1107 +------------------------------ ghc/docs/users_guide/primitives.sgml | 1115 ++++++++++++++++++++++++++++++++ ghc/docs/users_guide/ug-ent.sgml | 1 + 3 files changed, 1119 insertions(+), 1104 deletions(-) create mode 100644 ghc/docs/users_guide/primitives.sgml diff --git a/ghc/docs/users_guide/glasgow_exts.sgml b/ghc/docs/users_guide/glasgow_exts.sgml index 824a007..8589b04 100644 --- a/ghc/docs/users_guide/glasgow_exts.sgml +++ b/ghc/docs/users_guide/glasgow_exts.sgml @@ -273,1112 +273,11 @@ program), you may wish to check if there are libraries that provide a - -Unboxed types and primitive operations - -PrelGHC module - - -This module defines all the types which are primitive in Glasgow -Haskell, and the operations provided for them. - - - -Unboxed types - - - -Unboxed types (Glasgow extension) - - -Most types in GHC are boxed, which means -that values of that type are represented by a pointer to a heap -object. The representation of a Haskell Int, for -example, is a two-word heap object. An unboxed -type, however, is represented by the value itself, no pointers or heap -allocation are involved. - - - -Unboxed types correspond to the “raw machine” types you -would use in C: Int# (long int), -Double# (double), Addr# -(void *), etc. The primitive operations -(PrimOps) on these types are what you might expect; e.g., -(+#) is addition on -Int#s, and is the machine-addition that we all -know and love—usually one instruction. - - - -Primitive (unboxed) types cannot be defined in Haskell, and are -therefore built into the language and compiler. Primitive types are -always unlifted; that is, a value of a primitive type cannot be -bottom. We use the convention that primitive types, values, and -operations have a # suffix. - - - -Primitive values are often represented by a simple bit-pattern, such -as Int#, Float#, -Double#. But this is not necessarily the case: -a primitive value might be represented by a pointer to a -heap-allocated object. Examples include -Array#, the type of primitive arrays. A -primitive array is heap-allocated because it is too big a value to fit -in a register, and would be too expensive to copy around; in a sense, -it is accidental that it is represented by a pointer. If a pointer -represents a primitive value, then it really does point to that value: -no unevaluated thunks, no indirections…nothing can be at the -other end of the pointer than the primitive value. - - - -There are some restrictions on the use of primitive types, the main -one being that you can't pass a primitive value to a polymorphic -function or store one in a polymorphic data type. This rules out -things like [Int#] (i.e. lists of primitive -integers). The reason for this restriction is that polymorphic -arguments and constructor fields are assumed to be pointers: if an -unboxed integer is stored in one of these, the garbage collector would -attempt to follow it, leading to unpredictable space leaks. Or a -seq operation on the polymorphic component may -attempt to dereference the pointer, with disastrous results. Even -worse, the unboxed value might be larger than a pointer -(Double# for instance). - - - -Nevertheless, A numerically-intensive program using unboxed types can -go a lot faster than its “standard” -counterpart—we saw a threefold speedup on one example. - - - - - -Unboxed Tuples - - - -Unboxed tuples aren't really exported by PrelGHC, -they're available by default with . An -unboxed tuple looks like this: - - - - - -(# e_1, ..., e_n #) - - - - - -where e_1..e_n are expressions of any -type (primitive or non-primitive). The type of an unboxed tuple looks -the same. - - - -Unboxed tuples are used for functions that need to return multiple -values, but they avoid the heap allocation normally associated with -using fully-fledged tuples. When an unboxed tuple is returned, the -components are put directly into registers or on the stack; the -unboxed tuple itself does not have a composite representation. Many -of the primitive operations listed in this section return unboxed -tuples. - - - -There are some pretty stringent restrictions on the use of unboxed tuples: - - - - - - - - - Unboxed tuple types are subject to the same restrictions as -other unboxed types; i.e. they may not be stored in polymorphic data -structures or passed to polymorphic functions. - - - - - - - Unboxed tuples may only be constructed as the direct result of -a function, and may only be deconstructed with a case expression. -eg. the following are valid: - - - -f x y = (# x+1, y-1 #) -g x = case f x x of { (# a, b #) -> a + b } - - - -but the following are invalid: - - - -f x y = g (# x, y #) -g (# x, y #) = x + y - - - - - - - - - No variable can have an unboxed tuple type. This is illegal: - - - -f :: (# Int, Int #) -> (# Int, Int #) -f x = x - - - -because x has an unboxed tuple type. - - - - - - - - - -Note: we may relax some of these restrictions in the future. - - - -The IO and ST monads use unboxed -tuples to avoid unnecessary allocation during sequences of operations. - - - - - -Character and numeric types - -character types, primitive -numeric types, primitive -integer types, primitive -floating point types, primitive - -There are the following obvious primitive types: - - - -type Char# -type Int# -type Word# -type Addr# -type Float# -type Double# -type Int64# -type Word64# - - -Char# -Int# -Word# -Addr# -Float# -Double# -Int64# -Word64# - - -If you really want to know their exact equivalents in C, see -ghc/includes/StgTypes.h in the GHC source tree. - - - -Literals for these types may be written as follows: - - - - - -1# an Int# -1.2# a Float# -1.34## a Double# -'a'# a Char#; for weird characters, use e.g. '\o<octal>'# -"a"# an Addr# (a `char *'); only characters '\0'..'\255' allowed - - -literals, primitive -constants, primitive -numbers, primitive - - - - - -Comparison operations - - -comparisons, primitive -operators, comparison - - - - - -{>,>=,==,/=,<,<=}# :: Int# -> Int# -> Bool - -{gt,ge,eq,ne,lt,le}Char# :: Char# -> Char# -> Bool - -- ditto for Word# and Addr# - - -># ->=# -==# -/=# -<# -<=# -gt{Char,Word,Addr}# -ge{Char,Word,Addr}# -eq{Char,Word,Addr}# -ne{Char,Word,Addr}# -lt{Char,Word,Addr}# -le{Char,Word,Addr}# - - - - - -Primitive-character operations - - -characters, primitive operations -operators, primitive character - - - - - -ord# :: Char# -> Int# -chr# :: Int# -> Char# - - -ord# -chr# - - - - - -Primitive-<literal>Int</literal> operations - - -integers, primitive operations -operators, primitive integer - - - - - -{+,-,*,quotInt,remInt,gcdInt}# :: Int# -> Int# -> Int# -negateInt# :: Int# -> Int# - -iShiftL#, iShiftRA#, iShiftRL# :: Int# -> Int# -> Int# - -- shift left, right arithmetic, right logical - -addIntC#, subIntC#, mulIntC# :: Int# -> Int# -> (# Int#, Int# #) - -- add, subtract, multiply with carry - - -+# --# -*# -quotInt# -remInt# -gcdInt# -iShiftL# -iShiftRA# -iShiftRL# -addIntC# -subIntC# -mulIntC# -shift operations, integer - - - -Note: No error/overflow checking! - - - - - -Primitive-<literal>Double</literal> and <literal>Float</literal> operations - - -floating point numbers, primitive -operators, primitive floating point - - - - - -{+,-,*,/}## :: Double# -> Double# -> Double# -{<,<=,==,/=,>=,>}## :: Double# -> Double# -> Bool -negateDouble# :: Double# -> Double# -double2Int# :: Double# -> Int# -int2Double# :: Int# -> Double# - -{plus,minux,times,divide}Float# :: Float# -> Float# -> Float# -{gt,ge,eq,ne,lt,le}Float# :: Float# -> Float# -> Bool -negateFloat# :: Float# -> Float# -float2Int# :: Float# -> Int# -int2Float# :: Int# -> Float# - - - - - -+## --## -*## -/## -<## -<=## -==## -=/## ->=## ->## -negateDouble# -double2Int# -int2Double# - - - -plusFloat# -minusFloat# -timesFloat# -divideFloat# -gtFloat# -geFloat# -eqFloat# -neFloat# -ltFloat# -leFloat# -negateFloat# -float2Int# -int2Float# - - - -And a full complement of trigonometric functions: - - - - - -expDouble# :: Double# -> Double# -logDouble# :: Double# -> Double# -sqrtDouble# :: Double# -> Double# -sinDouble# :: Double# -> Double# -cosDouble# :: Double# -> Double# -tanDouble# :: Double# -> Double# -asinDouble# :: Double# -> Double# -acosDouble# :: Double# -> Double# -atanDouble# :: Double# -> Double# -sinhDouble# :: Double# -> Double# -coshDouble# :: Double# -> Double# -tanhDouble# :: Double# -> Double# -powerDouble# :: Double# -> Double# -> Double# - - -trigonometric functions, primitive - - - -similarly for Float#. - - - -There are two coercion functions for Float#/Double#: - - - - - -float2Double# :: Float# -> Double# -double2Float# :: Double# -> Float# - - -float2Double# -double2Float# - - - -The primitive version of decodeDouble -(encodeDouble is implemented as an external C -function): - - - - - -decodeDouble# :: Double# -> PrelNum.ReturnIntAndGMP - - -encodeDouble# -decodeDouble# - - - -(And the same for Float#s.) - - - - - -Operations on/for <literal>Integers</literal> (interface to GMP) - - - -arbitrary precision integers -Integer, operations on - - - -We implement Integers (arbitrary-precision -integers) using the GNU multiple-precision (GMP) package (version -2.0.2). - - - -The data type for Integer is either a small -integer, represented by an Int, or a large integer -represented using the pieces required by GMP's -MP_INT in gmp.h (see -gmp.info in -ghc/includes/runtime/gmp). It comes out as: - - - - - -data Integer = S# Int# -- small integers - | J# Int# ByteArray# -- large integers - - -Integer type The primitive -ops to support large Integers use the -“pieces” of the representation, and are as follows: - - - - - -negateInteger# :: Int# -> ByteArray# -> Integer - -{plus,minus,times}Integer#, gcdInteger#, - quotInteger#, remInteger#, divExactInteger# - :: Int# -> ByteArray# - -> Int# -> ByteArray# - -> (# Int#, ByteArray# #) - -cmpInteger# - :: Int# -> ByteArray# - -> Int# -> ByteArray# - -> Int# -- -1 for <; 0 for ==; +1 for > - -cmpIntegerInt# - :: Int# -> ByteArray# - -> Int# - -> Int# -- -1 for <; 0 for ==; +1 for > - -gcdIntegerInt# :: - :: Int# -> ByteArray# - -> Int# - -> Int# - -divModInteger#, quotRemInteger# - :: Int# -> ByteArray# - -> Int# -> ByteArray# - -> (# Int#, ByteArray#, - Int#, ByteArray# #) - -integer2Int# :: Int# -> ByteArray# -> Int# - -int2Integer# :: Int# -> Integer -- NB: no error-checking on these two! -word2Integer# :: Word# -> Integer - -addr2Integer# :: Addr# -> Integer - -- the Addr# is taken to be a `char *' string - -- to be converted into an Integer. - - -negateInteger# -plusInteger# -minusInteger# -timesInteger# -quotInteger# -remInteger# -gcdInteger# -gcdIntegerInt# -divExactInteger# -cmpInteger# -divModInteger# -quotRemInteger# -integer2Int# -int2Integer# -word2Integer# -addr2Integer# - - - - - -Words and addresses - - -word, primitive type -address, primitive type -unsigned integer, primitive type -pointer, primitive type - - - -A Word# is used for bit-twiddling operations. -It is the same size as an Int#, but has no sign -nor any arithmetic operations. - - -type Word# -- Same size/etc as Int# but *unsigned* -type Addr# -- A pointer from outside the "Haskell world" (from C, probably); - -- described under "arrays" - - -Word# -Addr# - - - -Word#s and Addr#s have -the usual comparison operations. Other -unboxed-Word ops (bit-twiddling and coercions): - - - - - -{gt,ge,eq,ne,lt,le}Word# :: Word# -> Word# -> Bool - -and#, or#, xor# :: Word# -> Word# -> Word# - -- standard bit ops. - -quotWord#, remWord# :: Word# -> Word# -> Word# - -- word (i.e. unsigned) versions are different from int - -- versions, so we have to provide these explicitly. - -not# :: Word# -> Word# - -shiftL#, shiftRL# :: Word# -> Int# -> Word# - -- shift left, right logical - -int2Word# :: Int# -> Word# -- just a cast, really -word2Int# :: Word# -> Int# - - -bit operations, Word and Addr -gtWord# -geWord# -eqWord# -neWord# -ltWord# -leWord# -and# -or# -xor# -not# -quotWord# -remWord# -shiftL# -shiftRA# -shiftRL# -int2Word# -word2Int# - - - -Unboxed-Addr ops (C casts, really): - - -{gt,ge,eq,ne,lt,le}Addr# :: Addr# -> Addr# -> Bool - -int2Addr# :: Int# -> Addr# -addr2Int# :: Addr# -> Int# -addr2Integer# :: Addr# -> (# Int#, ByteArray# #) - - -gtAddr# -geAddr# -eqAddr# -neAddr# -ltAddr# -leAddr# -int2Addr# -addr2Int# -addr2Integer# - - - -The casts between Int#, -Word# and Addr# -correspond to null operations at the machine level, but are required -to keep the Haskell type checker happy. - - - -Operations for indexing off of C pointers -(Addr#s) to snatch values are listed under -“arrays”. - - - - - -Arrays - - -arrays, primitive - - - -The type Array# elt is the type of primitive, -unpointed arrays of values of type elt. - - - - - -type Array# elt - - -Array# - - - -Array# is more primitive than a Haskell -array—indeed, the Haskell Array interface is -implemented using Array#—in that an -Array# is indexed only by -Int#s, starting at zero. It is also more -primitive by virtue of being unboxed. That doesn't mean that it isn't -a heap-allocated object—of course, it is. Rather, being unboxed -means that it is represented by a pointer to the array itself, and not -to a thunk which will evaluate to the array (or to bottom). The -components of an Array# are themselves boxed. - - - -The type ByteArray# is similar to -Array#, except that it contains just a string -of (non-pointer) bytes. - - - - - -type ByteArray# - - -ByteArray# - - - -Arrays of these types are useful when a Haskell program wishes to -construct a value to pass to a C procedure. It is also possible to use -them to build (say) arrays of unboxed characters for internal use in a -Haskell program. Given these uses, ByteArray# -is deliberately a bit vague about the type of its components. -Operations are provided to extract values of type -Char#, Int#, -Float#, Double#, and -Addr# from arbitrary offsets within a -ByteArray#. (For type -Foo#, the $i$th offset gets you the $i$th -Foo#, not the Foo# at -byte-position $i$. Mumble.) (If you want a -Word#, grab an Int#, -then coerce it.) - - - -Lastly, we have static byte-arrays, of type -Addr# [mentioned previously]. (Remember -the duality between arrays and pointers in C.) Arrays of this types -are represented by a pointer to an array in the world outside Haskell, -so this pointer is not followed by the garbage collector. In other -respects they are just like ByteArray#. They -are only needed in order to pass values from C to Haskell. - - - - - -Reading and writing - - -Primitive arrays are linear, and indexed starting at zero. - - - -The size and indices of a ByteArray#, Addr#, and -MutableByteArray# are all in bytes. It's up to the program to -calculate the correct byte offset from the start of the array. This -allows a ByteArray# to contain a mixture of values of different -type, which is often needed when preparing data for and unpicking -results from C. (Umm…not true of indices…WDP 95/09) - - - -Should we provide some sizeOfDouble# constants? - - - -Out-of-range errors on indexing should be caught by the code which -uses the primitive operation; the primitive operations themselves do -not check for out-of-range indexes. The intention is that the -primitive ops compile to one machine instruction or thereabouts. - - - -We use the terms “reading” and “writing” to refer to accessing -mutable arrays (see ), and -“indexing” to refer to reading a value from an immutable -array. - - - -Immutable byte arrays are straightforward to index (all indices in bytes): - - -indexCharArray# :: ByteArray# -> Int# -> Char# -indexIntArray# :: ByteArray# -> Int# -> Int# -indexAddrArray# :: ByteArray# -> Int# -> Addr# -indexFloatArray# :: ByteArray# -> Int# -> Float# -indexDoubleArray# :: ByteArray# -> Int# -> Double# - -indexCharOffAddr# :: Addr# -> Int# -> Char# -indexIntOffAddr# :: Addr# -> Int# -> Int# -indexFloatOffAddr# :: Addr# -> Int# -> Float# -indexDoubleOffAddr# :: Addr# -> Int# -> Double# -indexAddrOffAddr# :: Addr# -> Int# -> Addr# - -- Get an Addr# from an Addr# offset - - -indexCharArray# -indexIntArray# -indexAddrArray# -indexFloatArray# -indexDoubleArray# -indexCharOffAddr# -indexIntOffAddr# -indexFloatOffAddr# -indexDoubleOffAddr# -indexAddrOffAddr# - - - -The last of these, indexAddrOffAddr#, extracts an Addr# using an offset -from another Addr#, thereby providing the ability to follow a chain of -C pointers. - - - -Something a bit more interesting goes on when indexing arrays of boxed -objects, because the result is simply the boxed object. So presumably -it should be entered—we never usually return an unevaluated -object! This is a pain: primitive ops aren't supposed to do -complicated things like enter objects. The current solution is to -return a single element unboxed tuple (see ). - - - - - -indexArray# :: Array# elt -> Int# -> (# elt #) - - -indexArray# - - - - - -The state type - - -state, primitive type -State# - - - -The primitive type State# represents the state of a state -transformer. It is parameterised on the desired type of state, which -serves to keep states from distinct threads distinct from one another. -But the only effect of this parameterisation is in the type -system: all values of type State# are represented in the same way. -Indeed, they are all represented by nothing at all! The code -generator “knows” to generate no code, and allocate no registers -etc, for primitive states. - - - - - -type State# s - - - - - -The type GHC.RealWorld is truly opaque: there are no values defined -of this type, and no operations over it. It is “primitive” in that -sense - but it is not unlifted! Its only role in life is to be -the type which distinguishes the IO state transformer. - - - - - -data RealWorld - - - - - - - -State of the world - - -A single, primitive, value of type State# RealWorld is provided. - - - - - -realWorld# :: State# RealWorld - - -realWorld# state object - - - -(Note: in the compiler, not a PrimOp; just a mucho magic -Id. Exported from GHC, though). - - - - - -Mutable arrays - - -mutable arrays -arrays, mutable -Corresponding to Array# and ByteArray#, we have the types of -mutable versions of each. In each case, the representation is a -pointer to a suitable block of (mutable) heap-allocated storage. - - - - - -type MutableArray# s elt -type MutableByteArray# s - - -MutableArray# -MutableByteArray# - - - -Allocation - - -mutable arrays, allocation -arrays, allocation -allocation, of mutable arrays - - - -Mutable arrays can be allocated. Only pointer-arrays are initialised; -arrays of non-pointers are filled in by “user code” rather than by -the array-allocation primitive. Reason: only the pointer case has to -worry about GC striking with a partly-initialised array. - - - - - -newArray# :: Int# -> elt -> State# s -> (# State# s, MutableArray# s elt #) - -newCharArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #) -newIntArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #) -newAddrArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #) -newFloatArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #) -newDoubleArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #) - - -newArray# -newCharArray# -newIntArray# -newAddrArray# -newFloatArray# -newDoubleArray# - - - -The size of a ByteArray# is given in bytes. - - - - - -Reading and writing - - -arrays, reading and writing - - - - - -readArray# :: MutableArray# s elt -> Int# -> State# s -> (# State# s, elt #) -readCharArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Char# #) -readIntArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #) -readAddrArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Addr# #) -readFloatArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Float# #) -readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Double# #) - -writeArray# :: MutableArray# s elt -> Int# -> elt -> State# s -> State# s -writeCharArray# :: MutableByteArray# s -> Int# -> Char# -> State# s -> State# s -writeIntArray# :: MutableByteArray# s -> Int# -> Int# -> State# s -> State# s -writeAddrArray# :: MutableByteArray# s -> Int# -> Addr# -> State# s -> State# s -writeFloatArray# :: MutableByteArray# s -> Int# -> Float# -> State# s -> State# s -writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s - - -readArray# -readCharArray# -readIntArray# -readAddrArray# -readFloatArray# -readDoubleArray# -writeArray# -writeCharArray# -writeIntArray# -writeAddrArray# -writeFloatArray# -writeDoubleArray# - - - - - -Equality - - -arrays, testing for equality - - - -One can take “equality” of mutable arrays. What is compared is the -name or reference to the mutable array, not its contents. - - - - - -sameMutableArray# :: MutableArray# s elt -> MutableArray# s elt -> Bool -sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool - - -sameMutableArray# -sameMutableByteArray# - - - - - -Freezing mutable arrays - - -arrays, freezing mutable -freezing mutable arrays -mutable arrays, freezing - - - -Only unsafe-freeze has a primitive. (Safe freeze is done directly in Haskell -by copying the array and then using unsafeFreeze.) - - - - - -unsafeFreezeArray# :: MutableArray# s elt -> State# s -> (# State# s, Array# s elt #) -unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> (# State# s, ByteArray# #) - - -unsafeFreezeArray# -unsafeFreezeByteArray# - - - - - - - -Synchronizing variables (M-vars) - - -synchronising variables (M-vars) -M-Vars - - - -Synchronising variables are the primitive type used to implement -Concurrent Haskell's MVars (see the Concurrent Haskell paper for -the operational behaviour of these operations). - - - - - -type MVar# s elt -- primitive - -newMVar# :: State# s -> (# State# s, MVar# s elt #) -takeMVar# :: SynchVar# s elt -> State# s -> (# State# s, elt #) -putMVar# :: SynchVar# s elt -> State# s -> State# s - - -SynchVar# -newSynchVar# -takeMVar -putMVar - - - - - + +&primitives; -Primitive state-transformer monad - +Primitive state-transformer monad state transformers (Glasgow extensions) diff --git a/ghc/docs/users_guide/primitives.sgml b/ghc/docs/users_guide/primitives.sgml new file mode 100644 index 0000000..2fe9df7 --- /dev/null +++ b/ghc/docs/users_guide/primitives.sgml @@ -0,0 +1,1115 @@ + + + + Unboxed types and primitive operations + PrelGHC module + + This module defines all the types which are primitive in + Glasgow Haskell, and the operations provided for them. + + Note: while you really can use this stuff to write fast code, + we generally find it a lot less painful, and more satisfying in the + long run, to use higher-level language features and libraries. With + any luck, the code you write will be optimised to the efficient + unboxed version in any case. And if it isn't, we'd like to know + about it. + + +Unboxed types + + + +Unboxed types (Glasgow extension) + + +Most types in GHC are boxed, which means +that values of that type are represented by a pointer to a heap +object. The representation of a Haskell Int, for +example, is a two-word heap object. An unboxed +type, however, is represented by the value itself, no pointers or heap +allocation are involved. + + + +Unboxed types correspond to the “raw machine” types you +would use in C: Int# (long int), +Double# (double), Addr# +(void *), etc. The primitive operations +(PrimOps) on these types are what you might expect; e.g., +(+#) is addition on +Int#s, and is the machine-addition that we all +know and love—usually one instruction. + + + +Primitive (unboxed) types cannot be defined in Haskell, and are +therefore built into the language and compiler. Primitive types are +always unlifted; that is, a value of a primitive type cannot be +bottom. We use the convention that primitive types, values, and +operations have a # suffix. + + + +Primitive values are often represented by a simple bit-pattern, such +as Int#, Float#, +Double#. But this is not necessarily the case: +a primitive value might be represented by a pointer to a +heap-allocated object. Examples include +Array#, the type of primitive arrays. A +primitive array is heap-allocated because it is too big a value to fit +in a register, and would be too expensive to copy around; in a sense, +it is accidental that it is represented by a pointer. If a pointer +represents a primitive value, then it really does point to that value: +no unevaluated thunks, no indirections…nothing can be at the +other end of the pointer than the primitive value. + + + +There are some restrictions on the use of primitive types, the main +one being that you can't pass a primitive value to a polymorphic +function or store one in a polymorphic data type. This rules out +things like [Int#] (i.e. lists of primitive +integers). The reason for this restriction is that polymorphic +arguments and constructor fields are assumed to be pointers: if an +unboxed integer is stored in one of these, the garbage collector would +attempt to follow it, leading to unpredictable space leaks. Or a +seq operation on the polymorphic component may +attempt to dereference the pointer, with disastrous results. Even +worse, the unboxed value might be larger than a pointer +(Double# for instance). + + + +Nevertheless, A numerically-intensive program using unboxed types can +go a lot faster than its “standard” +counterpart—we saw a threefold speedup on one example. + + + + + +Unboxed Tuples + + + +Unboxed tuples aren't really exported by PrelGHC, +they're available by default with . An +unboxed tuple looks like this: + + + + + +(# e_1, ..., e_n #) + + + + + +where e_1..e_n are expressions of any +type (primitive or non-primitive). The type of an unboxed tuple looks +the same. + + + +Unboxed tuples are used for functions that need to return multiple +values, but they avoid the heap allocation normally associated with +using fully-fledged tuples. When an unboxed tuple is returned, the +components are put directly into registers or on the stack; the +unboxed tuple itself does not have a composite representation. Many +of the primitive operations listed in this section return unboxed +tuples. + + + +There are some pretty stringent restrictions on the use of unboxed tuples: + + + + + + + + + Unboxed tuple types are subject to the same restrictions as +other unboxed types; i.e. they may not be stored in polymorphic data +structures or passed to polymorphic functions. + + + + + + + Unboxed tuples may only be constructed as the direct result of +a function, and may only be deconstructed with a case expression. +eg. the following are valid: + + + +f x y = (# x+1, y-1 #) +g x = case f x x of { (# a, b #) -> a + b } + + + +but the following are invalid: + + + +f x y = g (# x, y #) +g (# x, y #) = x + y + + + + + + + + + No variable can have an unboxed tuple type. This is illegal: + + + +f :: (# Int, Int #) -> (# Int, Int #) +f x = x + + + +because x has an unboxed tuple type. + + + + + + + + + +Note: we may relax some of these restrictions in the future. + + + +The IO and ST monads use unboxed +tuples to avoid unnecessary allocation during sequences of operations. + + + + + +Character and numeric types + +character types, primitive +numeric types, primitive +integer types, primitive +floating point types, primitive + +There are the following obvious primitive types: + + + +type Char# +type Int# +type Word# +type Addr# +type Float# +type Double# +type Int64# +type Word64# + + +Char# +Int# +Word# +Addr# +Float# +Double# +Int64# +Word64# + + +If you really want to know their exact equivalents in C, see +ghc/includes/StgTypes.h in the GHC source tree. + + + +Literals for these types may be written as follows: + + + + + +1# an Int# +1.2# a Float# +1.34## a Double# +'a'# a Char#; for weird characters, use e.g. '\o<octal>'# +"a"# an Addr# (a `char *'); only characters '\0'..'\255' allowed + + +literals, primitive +constants, primitive +numbers, primitive + + + + + +Comparison operations + + +comparisons, primitive +operators, comparison + + + + + +{>,>=,==,/=,<,<=}# :: Int# -> Int# -> Bool + +{gt,ge,eq,ne,lt,le}Char# :: Char# -> Char# -> Bool + -- ditto for Word# and Addr# + + +># +>=# +==# +/=# +<# +<=# +gt{Char,Word,Addr}# +ge{Char,Word,Addr}# +eq{Char,Word,Addr}# +ne{Char,Word,Addr}# +lt{Char,Word,Addr}# +le{Char,Word,Addr}# + + + + + +Primitive-character operations + + +characters, primitive operations +operators, primitive character + + + + + +ord# :: Char# -> Int# +chr# :: Int# -> Char# + + +ord# +chr# + + + + + +Primitive-<literal>Int</literal> operations + + +integers, primitive operations +operators, primitive integer + + + + + +{+,-,*,quotInt,remInt,gcdInt}# :: Int# -> Int# -> Int# +negateInt# :: Int# -> Int# + +iShiftL#, iShiftRA#, iShiftRL# :: Int# -> Int# -> Int# + -- shift left, right arithmetic, right logical + +addIntC#, subIntC#, mulIntC# :: Int# -> Int# -> (# Int#, Int# #) + -- add, subtract, multiply with carry + + ++# +-# +*# +quotInt# +remInt# +gcdInt# +iShiftL# +iShiftRA# +iShiftRL# +addIntC# +subIntC# +mulIntC# +shift operations, integer + + + +Note: No error/overflow checking! + + + + + +Primitive-<literal>Double</literal> and <literal>Float</literal> operations + + +floating point numbers, primitive +operators, primitive floating point + + + + + +{+,-,*,/}## :: Double# -> Double# -> Double# +{<,<=,==,/=,>=,>}## :: Double# -> Double# -> Bool +negateDouble# :: Double# -> Double# +double2Int# :: Double# -> Int# +int2Double# :: Int# -> Double# + +{plus,minux,times,divide}Float# :: Float# -> Float# -> Float# +{gt,ge,eq,ne,lt,le}Float# :: Float# -> Float# -> Bool +negateFloat# :: Float# -> Float# +float2Int# :: Float# -> Int# +int2Float# :: Int# -> Float# + + + + + ++## +-## +*## +/## +<## +<=## +==## +=/## +>=## +>## +negateDouble# +double2Int# +int2Double# + + + +plusFloat# +minusFloat# +timesFloat# +divideFloat# +gtFloat# +geFloat# +eqFloat# +neFloat# +ltFloat# +leFloat# +negateFloat# +float2Int# +int2Float# + + + +And a full complement of trigonometric functions: + + + + + +expDouble# :: Double# -> Double# +logDouble# :: Double# -> Double# +sqrtDouble# :: Double# -> Double# +sinDouble# :: Double# -> Double# +cosDouble# :: Double# -> Double# +tanDouble# :: Double# -> Double# +asinDouble# :: Double# -> Double# +acosDouble# :: Double# -> Double# +atanDouble# :: Double# -> Double# +sinhDouble# :: Double# -> Double# +coshDouble# :: Double# -> Double# +tanhDouble# :: Double# -> Double# +powerDouble# :: Double# -> Double# -> Double# + + +trigonometric functions, primitive + + + +similarly for Float#. + + + +There are two coercion functions for Float#/Double#: + + + + + +float2Double# :: Float# -> Double# +double2Float# :: Double# -> Float# + + +float2Double# +double2Float# + + + +The primitive version of decodeDouble +(encodeDouble is implemented as an external C +function): + + + + + +decodeDouble# :: Double# -> PrelNum.ReturnIntAndGMP + + +encodeDouble# +decodeDouble# + + + +(And the same for Float#s.) + + + + + +Operations on/for <literal>Integers</literal> (interface to GMP) + + + +arbitrary precision integers +Integer, operations on + + + +We implement Integers (arbitrary-precision +integers) using the GNU multiple-precision (GMP) package (version +2.0.2). + + + +The data type for Integer is either a small +integer, represented by an Int, or a large integer +represented using the pieces required by GMP's +MP_INT in gmp.h (see +gmp.info in +ghc/includes/runtime/gmp). It comes out as: + + + + + +data Integer = S# Int# -- small integers + | J# Int# ByteArray# -- large integers + + +Integer type The primitive +ops to support large Integers use the +“pieces” of the representation, and are as follows: + + + + + +negateInteger# :: Int# -> ByteArray# -> Integer + +{plus,minus,times}Integer#, gcdInteger#, + quotInteger#, remInteger#, divExactInteger# + :: Int# -> ByteArray# + -> Int# -> ByteArray# + -> (# Int#, ByteArray# #) + +cmpInteger# + :: Int# -> ByteArray# + -> Int# -> ByteArray# + -> Int# -- -1 for <; 0 for ==; +1 for > + +cmpIntegerInt# + :: Int# -> ByteArray# + -> Int# + -> Int# -- -1 for <; 0 for ==; +1 for > + +gcdIntegerInt# :: + :: Int# -> ByteArray# + -> Int# + -> Int# + +divModInteger#, quotRemInteger# + :: Int# -> ByteArray# + -> Int# -> ByteArray# + -> (# Int#, ByteArray#, + Int#, ByteArray# #) + +integer2Int# :: Int# -> ByteArray# -> Int# + +int2Integer# :: Int# -> Integer -- NB: no error-checking on these two! +word2Integer# :: Word# -> Integer + +addr2Integer# :: Addr# -> Integer + -- the Addr# is taken to be a `char *' string + -- to be converted into an Integer. + + +negateInteger# +plusInteger# +minusInteger# +timesInteger# +quotInteger# +remInteger# +gcdInteger# +gcdIntegerInt# +divExactInteger# +cmpInteger# +divModInteger# +quotRemInteger# +integer2Int# +int2Integer# +word2Integer# +addr2Integer# + + + + + +Words and addresses + + +word, primitive type +address, primitive type +unsigned integer, primitive type +pointer, primitive type + + + +A Word# is used for bit-twiddling operations. +It is the same size as an Int#, but has no sign +nor any arithmetic operations. + + +type Word# -- Same size/etc as Int# but *unsigned* +type Addr# -- A pointer from outside the "Haskell world" (from C, probably); + -- described under "arrays" + + +Word# +Addr# + + + +Word#s and Addr#s have +the usual comparison operations. Other +unboxed-Word ops (bit-twiddling and coercions): + + + + + +{gt,ge,eq,ne,lt,le}Word# :: Word# -> Word# -> Bool + +and#, or#, xor# :: Word# -> Word# -> Word# + -- standard bit ops. + +quotWord#, remWord# :: Word# -> Word# -> Word# + -- word (i.e. unsigned) versions are different from int + -- versions, so we have to provide these explicitly. + +not# :: Word# -> Word# + +shiftL#, shiftRL# :: Word# -> Int# -> Word# + -- shift left, right logical + +int2Word# :: Int# -> Word# -- just a cast, really +word2Int# :: Word# -> Int# + + +bit operations, Word and Addr +gtWord# +geWord# +eqWord# +neWord# +ltWord# +leWord# +and# +or# +xor# +not# +quotWord# +remWord# +shiftL# +shiftRA# +shiftRL# +int2Word# +word2Int# + + + +Unboxed-Addr ops (C casts, really): + + +{gt,ge,eq,ne,lt,le}Addr# :: Addr# -> Addr# -> Bool + +int2Addr# :: Int# -> Addr# +addr2Int# :: Addr# -> Int# +addr2Integer# :: Addr# -> (# Int#, ByteArray# #) + + +gtAddr# +geAddr# +eqAddr# +neAddr# +ltAddr# +leAddr# +int2Addr# +addr2Int# +addr2Integer# + + + +The casts between Int#, +Word# and Addr# +correspond to null operations at the machine level, but are required +to keep the Haskell type checker happy. + + + +Operations for indexing off of C pointers +(Addr#s) to snatch values are listed under +“arrays”. + + + + + +Arrays + + +arrays, primitive + + + +The type Array# elt is the type of primitive, +unpointed arrays of values of type elt. + + + + + +type Array# elt + + +Array# + + + +Array# is more primitive than a Haskell +array—indeed, the Haskell Array interface is +implemented using Array#—in that an +Array# is indexed only by +Int#s, starting at zero. It is also more +primitive by virtue of being unboxed. That doesn't mean that it isn't +a heap-allocated object—of course, it is. Rather, being unboxed +means that it is represented by a pointer to the array itself, and not +to a thunk which will evaluate to the array (or to bottom). The +components of an Array# are themselves boxed. + + + +The type ByteArray# is similar to +Array#, except that it contains just a string +of (non-pointer) bytes. + + + + + +type ByteArray# + + +ByteArray# + + + +Arrays of these types are useful when a Haskell program wishes to +construct a value to pass to a C procedure. It is also possible to use +them to build (say) arrays of unboxed characters for internal use in a +Haskell program. Given these uses, ByteArray# +is deliberately a bit vague about the type of its components. +Operations are provided to extract values of type +Char#, Int#, +Float#, Double#, and +Addr# from arbitrary offsets within a +ByteArray#. (For type +Foo#, the $i$th offset gets you the $i$th +Foo#, not the Foo# at +byte-position $i$. Mumble.) (If you want a +Word#, grab an Int#, +then coerce it.) + + + +Lastly, we have static byte-arrays, of type +Addr# [mentioned previously]. (Remember +the duality between arrays and pointers in C.) Arrays of this types +are represented by a pointer to an array in the world outside Haskell, +so this pointer is not followed by the garbage collector. In other +respects they are just like ByteArray#. They +are only needed in order to pass values from C to Haskell. + + + + + +Reading and writing + + +Primitive arrays are linear, and indexed starting at zero. + + + +The size and indices of a ByteArray#, Addr#, and +MutableByteArray# are all in bytes. It's up to the program to +calculate the correct byte offset from the start of the array. This +allows a ByteArray# to contain a mixture of values of different +type, which is often needed when preparing data for and unpicking +results from C. (Umm…not true of indices…WDP 95/09) + + + +Should we provide some sizeOfDouble# constants? + + + +Out-of-range errors on indexing should be caught by the code which +uses the primitive operation; the primitive operations themselves do +not check for out-of-range indexes. The intention is that the +primitive ops compile to one machine instruction or thereabouts. + + + +We use the terms “reading” and “writing” to refer to accessing +mutable arrays (see ), and +“indexing” to refer to reading a value from an immutable +array. + + + +Immutable byte arrays are straightforward to index (all indices in bytes): + + +indexCharArray# :: ByteArray# -> Int# -> Char# +indexIntArray# :: ByteArray# -> Int# -> Int# +indexAddrArray# :: ByteArray# -> Int# -> Addr# +indexFloatArray# :: ByteArray# -> Int# -> Float# +indexDoubleArray# :: ByteArray# -> Int# -> Double# + +indexCharOffAddr# :: Addr# -> Int# -> Char# +indexIntOffAddr# :: Addr# -> Int# -> Int# +indexFloatOffAddr# :: Addr# -> Int# -> Float# +indexDoubleOffAddr# :: Addr# -> Int# -> Double# +indexAddrOffAddr# :: Addr# -> Int# -> Addr# + -- Get an Addr# from an Addr# offset + + +indexCharArray# +indexIntArray# +indexAddrArray# +indexFloatArray# +indexDoubleArray# +indexCharOffAddr# +indexIntOffAddr# +indexFloatOffAddr# +indexDoubleOffAddr# +indexAddrOffAddr# + + + +The last of these, indexAddrOffAddr#, extracts an Addr# using an offset +from another Addr#, thereby providing the ability to follow a chain of +C pointers. + + + +Something a bit more interesting goes on when indexing arrays of boxed +objects, because the result is simply the boxed object. So presumably +it should be entered—we never usually return an unevaluated +object! This is a pain: primitive ops aren't supposed to do +complicated things like enter objects. The current solution is to +return a single element unboxed tuple (see ). + + + + + +indexArray# :: Array# elt -> Int# -> (# elt #) + + +indexArray# + + + + + +The state type + + +state, primitive type +State# + + + +The primitive type State# represents the state of a state +transformer. It is parameterised on the desired type of state, which +serves to keep states from distinct threads distinct from one another. +But the only effect of this parameterisation is in the type +system: all values of type State# are represented in the same way. +Indeed, they are all represented by nothing at all! The code +generator “knows” to generate no code, and allocate no registers +etc, for primitive states. + + + + + +type State# s + + + + + +The type GHC.RealWorld is truly opaque: there are no values defined +of this type, and no operations over it. It is “primitive” in that +sense - but it is not unlifted! Its only role in life is to be +the type which distinguishes the IO state transformer. + + + + + +data RealWorld + + + + + + + +State of the world + + +A single, primitive, value of type State# RealWorld is provided. + + + + + +realWorld# :: State# RealWorld + + +realWorld# state object + + + +(Note: in the compiler, not a PrimOp; just a mucho magic +Id. Exported from GHC, though). + + + + + +Mutable arrays + + +mutable arrays +arrays, mutable +Corresponding to Array# and ByteArray#, we have the types of +mutable versions of each. In each case, the representation is a +pointer to a suitable block of (mutable) heap-allocated storage. + + + + + +type MutableArray# s elt +type MutableByteArray# s + + +MutableArray# +MutableByteArray# + + + +Allocation + + +mutable arrays, allocation +arrays, allocation +allocation, of mutable arrays + + + +Mutable arrays can be allocated. Only pointer-arrays are initialised; +arrays of non-pointers are filled in by “user code” rather than by +the array-allocation primitive. Reason: only the pointer case has to +worry about GC striking with a partly-initialised array. + + + + + +newArray# :: Int# -> elt -> State# s -> (# State# s, MutableArray# s elt #) + +newCharArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #) +newIntArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #) +newAddrArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #) +newFloatArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #) +newDoubleArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #) + + +newArray# +newCharArray# +newIntArray# +newAddrArray# +newFloatArray# +newDoubleArray# + + + +The size of a ByteArray# is given in bytes. + + + + + +Reading and writing + + +arrays, reading and writing + + + + + +readArray# :: MutableArray# s elt -> Int# -> State# s -> (# State# s, elt #) +readCharArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Char# #) +readIntArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #) +readAddrArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Addr# #) +readFloatArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Float# #) +readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Double# #) + +writeArray# :: MutableArray# s elt -> Int# -> elt -> State# s -> State# s +writeCharArray# :: MutableByteArray# s -> Int# -> Char# -> State# s -> State# s +writeIntArray# :: MutableByteArray# s -> Int# -> Int# -> State# s -> State# s +writeAddrArray# :: MutableByteArray# s -> Int# -> Addr# -> State# s -> State# s +writeFloatArray# :: MutableByteArray# s -> Int# -> Float# -> State# s -> State# s +writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s + + +readArray# +readCharArray# +readIntArray# +readAddrArray# +readFloatArray# +readDoubleArray# +writeArray# +writeCharArray# +writeIntArray# +writeAddrArray# +writeFloatArray# +writeDoubleArray# + + + + + +Equality + + +arrays, testing for equality + + + +One can take “equality” of mutable arrays. What is compared is the +name or reference to the mutable array, not its contents. + + + + + +sameMutableArray# :: MutableArray# s elt -> MutableArray# s elt -> Bool +sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool + + +sameMutableArray# +sameMutableByteArray# + + + + + +Freezing mutable arrays + + +arrays, freezing mutable +freezing mutable arrays +mutable arrays, freezing + + + +Only unsafe-freeze has a primitive. (Safe freeze is done directly in Haskell +by copying the array and then using unsafeFreeze.) + + + + + +unsafeFreezeArray# :: MutableArray# s elt -> State# s -> (# State# s, Array# s elt #) +unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> (# State# s, ByteArray# #) + + +unsafeFreezeArray# +unsafeFreezeByteArray# + + + + + + + +Synchronizing variables (M-vars) + + +synchronising variables (M-vars) +M-Vars + + + +Synchronising variables are the primitive type used to implement +Concurrent Haskell's MVars (see the Concurrent Haskell paper for +the operational behaviour of these operations). + + + + + +type MVar# s elt -- primitive + +newMVar# :: State# s -> (# State# s, MVar# s elt #) +takeMVar# :: SynchVar# s elt -> State# s -> (# State# s, elt #) +putMVar# :: SynchVar# s elt -> State# s -> State# s + + +SynchVar# +newSynchVar# +takeMVar +putMVar + + + + + + + diff --git a/ghc/docs/users_guide/ug-ent.sgml b/ghc/docs/users_guide/ug-ent.sgml index 0c56c2f..64ebcb7 100644 --- a/ghc/docs/users_guide/ug-ent.sgml +++ b/ghc/docs/users_guide/ug-ent.sgml @@ -14,6 +14,7 @@ + -- 1.7.10.4