\begin{onlystandalone}
\documentstyle[a4wide,grasp]{article}
\begin{rawlatex}
\renewcommand{\textfraction}{0.1}
\renewcommand{\floatpagefraction}{0.9}
\renewcommand{\dblfloatpagefraction}{0.9}
\sloppy
\renewcommand{\today}{November 1997}
\end{rawlatex}
\begin{document}
\title{The GHC Prelude and Libraries}
\author{Simon L Peyton Jones \and Simon Marlow \and Will Partain}
\maketitle
\begin{rawlatex}
\tableofcontents
\end{rawlatex}
\end{onlystandalone}

\section[ghc-prelude]{The GHC prelude and libraries}

This document describes GHC's prelude and libraries. The basic story is that of the Haskell 1.4 Report and Libraries document (which we do not reproduce here), but this document describes in addition:
\begin{itemize}
\item GHC's additional non-standard libraries and types, such as state transformers, packed strings, foreign objects, stable pointers, and so on.
\item GHC's primitive types and operations. The standard Haskell functions are implemented on top of these, and it is sometimes useful to use them directly.
\item The organisation of these libraries into directories.
\item A short description of the programmer interface to the non-standard, `built-in' libraries provided in addition to the standard prelude and libraries.
\end{itemize}

In addition to the GHC prelude libraries, GHC comes with a number of system libraries, which are presented in Section \ref{syslibs}.

\subsection{Prelude library organisation}

{\em Probably only of interest to implementors...}

The prelude libraries are organised into the following four groups, each of which is kept in a separate sub-directory of GHC's source @lib/@ directory:
\begin{description}
\item[@lib/required/@] These are the libraries {\em required} by the Haskell definition. All are defined by the Haskell Report, or by the Haskell Libraries Report. They currently comprise:
\begin{itemize}
\item @Array@: monolithic arrays.
\item @Char@: more functions on characters.
\item @Complex@: interface defining complex number type and functions over it. \item @CPUTime@: get the CPU time used by the program. \item @Directory@: basic functions for accessing the file system. \item @Ix@: the @Ix@ class of indexing operations. \item @IO@: additional input/output functions. \item @List@: more functions on lists. \item @Locale@: localisation functions. \item @Maybe@: more functions on @Maybe@ types. \item @Monad@: functions on monads. \item @Numeric@: operations for reading and showing number values. \item @Prelude@: standard prelude interface. \item @Random@: pseudo-random number generator. \item @Ratio@: functions on rational numbers. \item @System@: basic operating-system interface functions. \item @Time@: operations on time. \end{itemize} \item[@lib/glaExts@] Extension libraries, currently comprising: \begin{itemize} \item @Addr@: primitive pointer type. \item @Bits@: a class of bitwise operations. \item @ByteArray@: operations over immutable chunks of (heap allocated) bytes. \item @CCall@: classes @CCallable@ and @CReturnable@ for calling C. \item @Foreign@: types and operations for GHC's foreign-language interface. \item @GlaExts@: interface for extensions that are only implemented in GHC: namely unboxed types and primitive operations. \item @IOExts@: extensions to the @IO@ library. \item @Int@: 8, 16, 32 and 64-bit integers with bit operations. \item @LazyST@: a lazy version of the @ST@ monad. \item @MutableArray@: operations over mutable arrays. \item @ST@: the state transformer monad, @STRef@s and @STArray@s. \item @Word@: 8, 16, 32 and 64-bit naturals with bit operations. \end{itemize} \item[@lib/concurrent@] GHC extension libraries to support Concurrent Haskell, currently comprising: \begin{itemize} \item @Concurrent@: main library. \item @Parallel@: stuff for multi-processor parallelism. 
\item @Channel@ \item @ChannelVar@ \item @Merge@ \item @SampleVar@ \item @Semaphore@ \end{itemize} \item[@lib/ghc@] These libraries are the pieces on which all the others are built. They aren't typically imported by Joe Programmer, but there's nothing to stop you doing so if you want. In general, the modules prefixed by @Prel@ are pieces that go towards building @Prelude@. \begin{itemize} \item @GHC@: this ``library'' brings into scope all the primitive types and operations, such as @Int#@, @+#@, @encodeFloat#@, etc etc. It is unique in that there is no Haskell source code for it. Details in Section \ref{sect:ghc}. \item @PrelBase@: defines the basic types and classes without which very few Haskell programs can work. The classes are: @Eq@, @Ord@, @Enum@, @Bounded@, @Num@, @Show@, @Eval@, @Monad@, @MonadZero@, @MonadPlus@. The types are: list, @Bool@, @Char@, @Ordering@, @String@, @Int@, @Integer@. \item @PrelMaybe@: defines the @Maybe@ type. \item @PrelEither@: defines the @Either@ type. \item @PrelTup@: defines tuples and their instances. \item @PrelList@: defines most of the list operations required by @Prelude@. (A few are in @PrelBase@, to avoid gratuitous mutual recursion between modules.) \item @PrelNum@ defines: the numeric classes beyond @Num@, namely @Real@, @Integral@, @Fractional@, @Floating@, @RealFrac@, @RealFloat@; instances for appropriate classes for @Int@ and @Integer@; the types @Float@, @Double@, and @Ratio@ and their instances. \item @PrelRead@: the @Read@ class and all its instances. It's kept separate because many programs don't use @Read@ at all, so we don't even want to link in its code. (If the prelude libraries are built by splitting the object files, this is all a non-issue) \item @ConcBase@: substrate stuff for Concurrent Haskell. \item @IOBase@: substrate stuff for the main I/O libraries. \item @IOHandle@: large blob of code for doing I/O on handles. 
\item @PrelIO@: the remaining small pieces to produce the I/O stuff needed by @Prelude@. \item @STBase@: substrate stuff for @ST@. \item @ArrBase@: substrate stuff for @Array@. \item @GHCerr@: error reporting code, called from code that the compiler plants in compiled programs. \item @GHCmain@: the definition of @mainIO@, which is what {\em really} gets called by the runtime system. @mainIO@ in turn calls @main@. \item @PackBase@: low-level packing/unpacking operations. \item @Error@: the definition of @error@, placed in its own module with a hand-written @.hi-boot@ file in order to break recursive dependencies in the libraries (everything needs @error@, but the definition of @error@ itself needs a few things...). \end{itemize} \end{description} The @...Base@ modules generally export representation information that is hidden from the public interface. For example the module @STBase@ exports the type @ST@ including its representation, whereas the module @ST@ exports @ST@ abstractly. None of these modules are involved in any mutual recursion, with the sole exception that many modules import @Error.error@. \subsection[ghc-libs-ghc]{The module @GHC@: really primitive stuff} \label{sect:ghc} This section defines all the types which are primitive in Glasgow Haskell, and the operations provided for them. A primitive type is one which cannot be defined in Haskell, and which is therefore built into the language and compiler. Primitive types are always unboxed; that is, a value of primitive type cannot be bottom. Primitive values are often represented by a simple bit-pattern, such as @Int#@, @Float#@, @Double#@. But this is not necessarily the case: a primitive value might be represented by a pointer to a heap-allocated object. Examples include @Array#@, the type of primitive arrays. You might think this odd: doesn't being heap-allocated mean that it has a box? No, it does not. 
A primitive array is heap-allocated because it is too big a value to fit in a register, and would be too expensive to copy around; in a sense, it is accidental that it is represented by a pointer. If a pointer represents a primitive value, then it really does point to that value: no unevaluated thunks, no indirections... nothing other than the primitive value can be at the other end of the pointer.

This section also describes a few non-primitive types, which are needed to express the result types of some primitive operations.

\subsubsection{Character and numeric types}

There are the following obvious primitive types:
\begin{verbatim}
type Char#
type Int#       -- see also Word# and Addr#, later
type Float#
type Double#
\end{verbatim}
If you really want to know their exact equivalents in C, see @ghc/includes/StgTypes.lh@ in the GHC source tree.

Literals for these types may be written as follows:
\begin{verbatim}
1#              an Int#
1.2#            a Float#
1.34##          a Double#
'a'#            a Char#; for weird characters, use '\o'#
"a"#            an Addr# (a `char *')
\end{verbatim}

\subsubsubsection{Comparison operations}
\begin{verbatim}
{>,>=,==,/=,<,<=}# :: Int# -> Int# -> Bool

{gt,ge,eq,ne,lt,le}Char# :: Char# -> Char# -> Bool
    -- ditto for Word#, Float#, Double#, and Addr#
\end{verbatim}

\subsubsubsection{Unboxed-character operations}
\begin{verbatim}
ord# :: Char# -> Int#
chr# :: Int# -> Char#
\end{verbatim}

\subsubsubsection{Unboxed-@Int@ operations}
\begin{verbatim}
{+,-,*,quotInt,remInt}# :: Int# -> Int# -> Int#
negateInt# :: Int# -> Int#

iShiftL#, iShiftRA#, iShiftRL# :: Int# -> Int# -> Int#
        -- shift left, right arithmetic, right logical
\end{verbatim}

{\bf Note:} No error/overflow checking!

\subsubsubsection{Unboxed-@Double@ and @Float@ operations}
\begin{verbatim}
{plus,minus,times,divide}Double# :: Double# -> Double# -> Double#
negateDouble# :: Double# -> Double#

double2Int#     :: Double# -> Int#   -- just a cast, no checking!
int2Double# :: Int# -> Double# expDouble# :: Double# -> Double# logDouble# :: Double# -> Double# sqrtDouble# :: Double# -> Double# sinDouble# :: Double# -> Double# cosDouble# :: Double# -> Double# tanDouble# :: Double# -> Double# asinDouble# :: Double# -> Double# acosDouble# :: Double# -> Double# atanDouble# :: Double# -> Double# sinhDouble# :: Double# -> Double# coshDouble# :: Double# -> Double# tanhDouble# :: Double# -> Double# powerDouble# :: Double# -> Double# -> Double# \end{verbatim} There's an exactly-matching set of unboxed-@Float@ ops; replace @Double#@ with @Float#@ in the list above. There are two coercion functions for @Float#@/@Double#@: \begin{verbatim} float2Double# :: Float# -> Double# double2Float# :: Double# -> Float# \end{verbatim} The primitive versions of @encodeDouble@/@decodeDouble@: \begin{verbatim} encodeDouble# :: Int# -> Int# -> ByteArray# -- Integer mantissa -> Int# -- Int exponent -> Double# decodeDouble# :: Double# -> PrelNum.ReturnIntAndGMP \end{verbatim} (And the same for @Float#@s.) \subsubsection{Operations on/for @Integers@ (interface to GMP)} \label{sect:horrid-Integer-pairing-types} We implement @Integers@ (arbitrary-precision integers) using the GNU multiple-precision (GMP) package (version 1.3.2). {\bf Note:} some of this might change when we upgrade to using GMP~2.x. The data type for @Integer@ must mirror that for @MP_INT@ in @gmp.h@ (see @gmp.info@ in \tr{ghc/includes/runtime/gmp}). It comes out as: \begin{verbatim} data Integer = J# Int# Int# ByteArray# \end{verbatim} So, @Integer@ is really just a ``pairing'' type for a particular collection of primitive types. The operations in the GMP return other combinations of GMP-plus-something, so we need ``pairing'' types for those, too: \begin{verbatim} data Return2GMPs = Return2GMPs Int# Int# ByteArray# Int# Int# ByteArray# data ReturnIntAndGMP = ReturnIntAndGMP Int# Int# Int# ByteArray# -- ????? something to return a string of bytes (in the heap?) 
\end{verbatim} The primitive ops to support @Integers@ use the ``pieces'' of the representation, and are as follows: \begin{verbatim} negateInteger# :: Int# -> Int# -> ByteArray# -> Integer {plus,minus,times}Integer# :: Int# -> Int# -> ByteArray# -> Int# -> Int# -> ByteArray# -> Integer cmpInteger# :: Int# -> Int# -> ByteArray# -> Int# -> Int# -> ByteArray# -> Int# -- -1 for <; 0 for ==; +1 for > divModInteger#, quotRemInteger# :: Int# -> Int# -> ByteArray# -> Int# -> Int# -> ByteArray# -> PrelNum.Return2GMPs integer2Int# :: Int# -> Int# -> ByteArray# -> Int# int2Integer# :: Int# -> Integer -- NB: no error-checking on these two! word2Integer# :: Word# -> Integer addr2Integer# :: Addr# -> Integer -- the Addr# is taken to be a `char *' string -- to be converted into an Integer. \end{verbatim} \subsubsection{Words and addresses} A @Word#@ is used for bit-twiddling operations. It is the same size as an @Int#@, but has no sign nor any arithmetic operations. \begin{verbatim} type Word# -- Same size/etc as Int# but *unsigned* type Addr# -- A pointer from outside the "Haskell world" (from C, probably); -- described under "arrays" \end{verbatim} @Word#@s and @Addr#@s have the usual comparison operations. Other unboxed-@Word@ ops (bit-twiddling and coercions): \begin{verbatim} and#, or#, xor# :: Word# -> Word# -> Word# -- standard bit ops. quotWord#, remWord# :: Word# -> Word# -> Word# -- word (i.e. unsigned) versions are different from int -- versions, so we have to provide these explicitly. 
not# :: Word# -> Word# shiftL#, shiftRA#, shiftRL# :: Word# -> Int# -> Word# -- shift left, right arithmetic, right logical int2Word# :: Int# -> Word# -- just a cast, really word2Int# :: Word# -> Int# \end{verbatim} Unboxed-@Addr@ ops (C casts, really): \begin{verbatim} int2Addr# :: Int# -> Addr# addr2Int# :: Addr# -> Int# \end{verbatim} The casts between @Int#@, @Word#@ and @Addr#@ correspond to null operations at the machine level, but are required to keep the Haskell type checker happy. Operations for indexing off of C pointers (@Addr#@s) to snatch values are listed under ``arrays''. \subsubsection{Arrays} The type @Array# elt@ is the type of primitive, unboxed arrays of values of type @elt@. \begin{verbatim} type Array# elt \end{verbatim} @Array#@ is more primitive than a Haskell array --- indeed, the Haskell @Array@ interface is implemented using @Array#@ --- in that an @Array#@ is indexed only by @Int#@s, starting at zero. It is also more primitive by virtue of being unboxed. That doesn't mean that it isn't a heap-allocated object --- of course, it is. Rather, being unboxed means that it is represented by a pointer to the array itself, and not to a thunk which will evaluate to the array (or to bottom). The components of an @Array#@ are themselves boxed. The type @ByteArray#@ is similar to @Array#@, except that it contains just a string of (non-pointer) bytes. \begin{verbatim} type ByteArray# \end{verbatim} Arrays of these types are useful when a Haskell program wishes to construct a value to pass to a C procedure. It is also possible to use them to build (say) arrays of unboxed characters for internal use in a Haskell program. Given these uses, @ByteArray#@ is deliberately a bit vague about the type of its components. Operations are provided to extract values of type @Char#@, @Int#@, @Float#@, @Double#@, and @Addr#@ from arbitrary offsets within a @ByteArray#@. (For type @Foo#@, the $i$th offset gets you the $i$th @Foo#@, not the @Foo#@ at byte-position $i$. 
Mumble.) (If you want a @Word#@, grab an @Int#@, then coerce it.)

Lastly, we have static byte-arrays, of type @Addr#@ [mentioned previously]. (Remember the duality between arrays and pointers in C.) Arrays of this type are represented by a pointer to an array in the world outside Haskell, so this pointer is not followed by the garbage collector. In other respects they are just like @ByteArray#@. They are only needed in order to pass values from C to Haskell.

\subsubsubsection{Reading and writing.}

Primitive arrays are linear, and indexed starting at zero.

The size and indices of a @ByteArray#@, @Addr#@, and @MutableByteArray#@ are all in bytes. It's up to the program to calculate the correct byte offset from the start of the array. This allows a @ByteArray#@ to contain a mixture of values of different types, which is often needed when preparing data for and unpicking results from C. (Umm... not true of indices... WDP 95/09) {\em Should we provide some @sizeOfDouble#@ constants?}

Out-of-range errors on indexing should be caught by the code which uses the primitive operation; the primitive operations themselves do {\em not} check for out-of-range indexes. The intention is that the primitive ops compile to one machine instruction or thereabouts.

We use the terms ``reading'' and ``writing'' to refer to accessing {\em mutable} arrays (see Section~\ref{sect:mutable}), and ``indexing'' to refer to reading a value from an {\em immutable} array.
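
The convention that the caller, not the primitive, performs the bounds check can be sketched as follows. This is only an illustration and does not use the primitives of this section: it assumes a present-day GHC with the @array@ package, where @Data.Array.Base.unsafeAt@ stands in for an unchecked indexing operation. The name @checkedAt@ is hypothetical.

\begin{verbatim}
module Main (main) where

import Data.Array (Array, bounds, listArray)
import Data.Array.Base (unsafeAt)   -- unchecked, zero-based indexing
import Data.Ix (inRange)

-- checkedAt (a hypothetical name) does the range test itself and only
-- then calls the unchecked operation with a zero-based offset.
checkedAt :: Array Int e -> Int -> Maybe e
checkedAt arr i
  | inRange (lo, hi) i = Just (unsafeAt arr (i - lo))
  | otherwise          = Nothing
  where
    (lo, hi) = bounds arr

main :: IO ()
main = do
  let arr = listArray (0, 3) "abcd" :: Array Int Char
  print (checkedAt arr 2)   -- Just 'c'
  print (checkedAt arr 7)   -- Nothing
\end{verbatim}

Keeping the test outside the unchecked operation means that code which already knows its index is in range need not pay for the check.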
Immutable byte arrays are straightforward to index (all indices in bytes): \begin{verbatim} indexCharArray# :: ByteArray# -> Int# -> Char# indexIntArray# :: ByteArray# -> Int# -> Int# indexAddrArray# :: ByteArray# -> Int# -> Addr# indexFloatArray# :: ByteArray# -> Int# -> Float# indexDoubleArray# :: ByteArray# -> Int# -> Double# indexCharOffAddr# :: Addr# -> Int# -> Char# indexIntOffAddr# :: Addr# -> Int# -> Int# indexFloatOffAddr# :: Addr# -> Int# -> Float# indexDoubleOffAddr# :: Addr# -> Int# -> Double# indexAddrOffAddr# :: Addr# -> Int# -> Addr# -- Get an Addr# from an Addr# offset \end{verbatim} \index{indexCharArray# function} \index{indexIntArray# function} \index{indexAddrArray# function} \index{indexFloatArray# function} \index{indexDoubleArray# function} \index{indexCharOffAddr# function} \index{indexIntOffAddr# function} \index{indexFloatOffAddr# function} \index{indexDoubleOffAddr# function} \index{indexAddrOffAddr# function} The last of these, @indexAddrOffAddr#@, extracts an @Addr#@ using an offset from another @Addr#@, thereby providing the ability to follow a chain of C pointers. Something a bit more interesting goes on when indexing arrays of boxed objects, because the result is simply the boxed object. So presumably it should be entered --- we never usually return an unevaluated object! This is a pain: primitive ops aren't supposed to do complicated things like enter objects. The current solution is to return a lifted value, but I don't like it! \begin{verbatim} indexArray# :: Array# elt -> Int# -> PrelBase.Lift elt -- Yuk! \end{verbatim} \subsubsection{The state type} \index{State# type} The primitive type @State#@ represents the state of a state transformer. It is parameterised on the desired type of state, which serves to keep states from distinct threads distinct from one another. But the {\em only} effect of this parameterisation is in the type system: all values of type @State#@ are represented in the same way. 
Indeed, they are all represented by nothing at all! The code generator ``knows'' to generate no code, and allocate no registers etc, for primitive states. \begin{verbatim} type State# s \end{verbatim} The type @GHC.RealWorld@ is truly opaque: there are no values defined of this type, and no operations over it. It is ``primitive'' in that sense---but it is {\em not unboxed!} Its only role in life is to be the type which distinguishes the @IO@ state transformer. \begin{verbatim} data RealWorld \end{verbatim} \subsubsubsection{State of the world} A single, primitive, value of type @State# RealWorld@ is provided. \begin{verbatim} realWorld# :: State# GHC.RealWorld \end{verbatim} \index{realWorld# state object} (Note: in the compiler, not a @PrimOp@; just a mucho magic @Id@. Exported from @GHC@, though). \subsubsection{State pairing types} \label{sect:horrid-pairing-types} This subsection defines some types which, while they aren't quite primitive because we can define them in Haskell, are very nearly so. They define constructors which pair a primitive state with a value of each primitive type. They are required to express the result type of the primitive operations in the state monad. 
\begin{verbatim} data StateAndPtr# s elt = StateAndPtr# (State# s) elt data StateAndChar# s = StateAndChar# (State# s) Char# data StateAndInt# s = StateAndInt# (State# s) Int# data StateAndWord# s = StateAndWord# (State# s) Word# data StateAndFloat# s = StateAndFloat# (State# s) Float# data StateAndDouble# s = StateAndDouble# (State# s) Double# data StateAndAddr# s = StateAndAddr# (State# s) Addr# data StateAndStablePtr# s a = StateAndStablePtr# (State# s) (StablePtr# a) data StateAndForeignObj# s = StateAndForeignObj# (State# s) ForeignObj# data StateAndSynchVar# s a = StateAndSynchVar# (State# s) (SynchVar# a) data StateAndArray# s elt = StateAndArray# (State# s) (Array# elt) data StateAndMutableArray# s elt = StateAndMutableArray# (State# s) (MutableArray# s elt) data StateAndByteArray# s = StateAndByteArray# (State# s) ByteArray# data StateAndMutableByteArray# s = StateAndMutableByteArray# (State# s) (MutableByteArray# s) \end{verbatim} Hideous. \subsubsection{Mutable arrays} \label{sect:mutable} \index{Mutable arrays} Corresponding to @Array#@ and @ByteArray#@, we have the types of mutable versions of each. In each case, the representation is a pointer to a suitable block of (mutable) heap-allocated storage. \begin{verbatim} type MutableArray# s elt type MutableByteArray# s \end{verbatim} \subsubsubsection{Allocation} \index{Mutable arrays, allocation} \index{Allocation, of mutable arrays} Mutable arrays can be allocated. Only pointer-arrays are initialised; arrays of non-pointers are filled in by ``user code'' rather than by the array-allocation primitive. Reason: only the pointer case has to worry about GC striking with a partly-initialised array. 
\begin{verbatim}
newArray#       :: Int# -> elt -> State# s -> StateAndMutableArray# s elt

newCharArray#   :: Int# -> State# s -> StateAndMutableByteArray# s
newIntArray#    :: Int# -> State# s -> StateAndMutableByteArray# s
newAddrArray#   :: Int# -> State# s -> StateAndMutableByteArray# s
newFloatArray#  :: Int# -> State# s -> StateAndMutableByteArray# s
newDoubleArray# :: Int# -> State# s -> StateAndMutableByteArray# s
\end{verbatim}

The size of a @ByteArray#@ is given in bytes.

\subsubsubsection{Reading and writing}

%OLD: Remember, offsets in a @MutableByteArray#@ are in bytes.
\begin{verbatim}
readArray#       :: MutableArray# s elt -> Int# -> State# s -> StateAndPtr# s elt
readCharArray#   :: MutableByteArray# s -> Int# -> State# s -> StateAndChar# s
readIntArray#    :: MutableByteArray# s -> Int# -> State# s -> StateAndInt# s
readAddrArray#   :: MutableByteArray# s -> Int# -> State# s -> StateAndAddr# s
readFloatArray#  :: MutableByteArray# s -> Int# -> State# s -> StateAndFloat# s
readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndDouble# s

writeArray#       :: MutableArray# s elt -> Int# -> elt     -> State# s -> State# s
writeCharArray#   :: MutableByteArray# s -> Int# -> Char#   -> State# s -> State# s
writeIntArray#    :: MutableByteArray# s -> Int# -> Int#    -> State# s -> State# s
writeAddrArray#   :: MutableByteArray# s -> Int# -> Addr#   -> State# s -> State# s
writeFloatArray#  :: MutableByteArray# s -> Int# -> Float#  -> State# s -> State# s
writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s
\end{verbatim}

\subsubsubsection{Equality.}

One can take ``equality'' of mutable arrays. What is compared is the {\em name} or reference to the mutable array, not its contents.
\begin{verbatim}
sameMutableArray#     :: MutableArray# s elt -> MutableArray# s elt -> Bool
sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
\end{verbatim}

\subsubsubsection{Freezing mutable arrays}

Only unsafe-freeze has a primitive.
(Safe freeze is done directly in Haskell by copying the array and then using @unsafeFreeze@.)
\begin{verbatim}
unsafeFreezeArray#     :: MutableArray# s elt -> State# s -> StateAndArray# s elt
unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> StateAndByteArray# s
\end{verbatim}

\subsubsection{Stable pointers}
\index{Stable pointers}

A stable pointer is a name for a Haskell object which can be passed to the external world. It is ``stable'' in the sense that the name does not change when the Haskell garbage collector runs --- in contrast to the address of the object which may well change.

The stable pointer type is parameterised by the type of the thing which is named.
\begin{verbatim}
type StablePtr# a
\end{verbatim}

A stable pointer is represented by an index into the (static) @StablePointerTable@. The Haskell garbage collector treats the @StablePointerTable@ as a source of roots for GC.

The @makeStablePointer@ function converts a value into a stable pointer. It is part of the @IO@ monad, because we want to be sure we don't allocate one twice by accident, and then only free one of the copies.
\begin{verbatim}
makeStablePointer#  :: a -> State# RealWorld -> StateAndStablePtr# RealWorld a
freeStablePointer#  :: StablePtr# a -> State# RealWorld -> State# RealWorld
deRefStablePointer# :: StablePtr# a -> State# RealWorld -> StateAndPtr# RealWorld a
\end{verbatim}

There is also a C procedure @FreeStablePtr@ which frees a stable pointer.

%{\em Andy's comment.} {\bf Errors:} The following is not strictly true: the current
%implementation is not as polymorphic as claimed. The reason for this
%is that the C programmer will have to use a different entry-routine
%for each type of stable pointer. At present, we only supply a very
%limited number (3) of these routines.
%It might be possible to
%increase the range of these routines by providing general purpose
%entry points to apply stable pointers to (stable pointers to)
%arguments and to enter (stable pointers to) boxed primitive values.
%{\em End of Andy's comment.}

%
% Rewritten and updated for MallocPtr++ -- 4/96 SOF
%
\subsubsection{Foreign objects}
\index{Foreign objects}
\index{ForeignObj type}

A @ForeignObj#@ is a reference to an object outside the Haskell world (i.e., from the C world, or a reference to an object on another machine completely), where the Haskell world has been told ``Let me know when you're finished with this ...''.
\begin{verbatim}
type ForeignObj#
\end{verbatim}

GHC provides two primitives on @ForeignObj#@:
\begin{verbatim}
makeForeignObj#  :: Addr#               -- foreign reference
                 -> Addr#               -- pointer to finalisation routine
                 -> State# RealWorld
                 -> StateAndForeignObj# RealWorld

writeForeignObj# :: ForeignObj#         -- foreign object
                 -> Addr#               -- datum
                 -> State# RealWorld
                 -> State# RealWorld
\end{verbatim}

The module @Foreign@ (Section \ref{sec:foreign-obj}) provides a more programmer-friendly interface to foreign objects.

\subsubsection{Synchronising variables (M-vars)}
\index{Synchronising variables (M-vars)}
\index{M-Vars}

Synchronising variables are the primitive type used to implement Concurrent Haskell's MVars (see the Concurrent Haskell paper for the operational behaviour of these operations).
\begin{verbatim}
type SynchVar# s elt    -- primitive

newSynchVar# :: State# s -> StateAndSynchVar# s elt
takeMVar#    :: SynchVar# s elt -> State# s -> StateAndPtr# s elt
putMVar#     :: SynchVar# s elt -> elt -> State# s -> State# s
\end{verbatim}

\subsubsection{@spark#@ primitive operation (for parallel execution)}

{\em ToDo: say something} It's used in the unfolding for @par@.

\subsubsection{The @errorIO#@ primitive operation}

The @errorIO#@ primitive takes an argument much like @IO@.
It aborts execution of the current program, and continues instead by performing the given @IO@-like value on the current state of the world.
\begin{verbatim}
errorIO# :: (State# RealWorld -> a) -> a
\end{verbatim}

\subsection{GHC/Hugs Extension Libraries}

The extension libraries provided by both GHC and Hugs are described in a separate document ``The Hugs-GHC Extension Libraries''.

\subsection{GHC-only Extension Libraries}

If you rely on the implicit @import Prelude@ that GHC normally does for you, and if you don't use any weird flags (notably @-fglasgow-exts@), and if you don't import the Glasgow extensions interface, @GlaExts@, then GHC should work {\em exactly} as the Haskell report says, and the full user namespaces should be available to you.

If you mess about with @import Prelude@... innocent hiding, e.g.,
\begin{verbatim}
import Prelude hiding ( fromIntegral )
\end{verbatim}
should work just fine.

% this should work now?? -- SDM
%There are some things you can do that will make GHC crash, e.g.,
%hiding a standard class:
%\begin{verbatim}
%import Prelude hiding ( Eq(..) )
%\end{verbatim}
%
%Don't do that.

If you turn on @-fglasgow-exts@, the compiler will recognise and parse unboxed values properly. To get at the primitive operations described herein, import the relevant interfaces.

\subsubsection{The @GlaExts@ interface}
\index{GlaExts interface (GHC extensions)}

The @GlaExts@ interface provides access to extensions that only GHC implements. These currently are: unboxed types, including the representations of the primitive types (Int, Float, etc.), and the GHC primitive operations (@+#@, @==#@, etc.).

This module used to provide access to all the Glasgow extensions, but these have since been moved into separate libraries for compatibility with Hugs (version 2.09: in fact, you can still get at this stuff via @GlaExts@ for compatibility, but this facility will likely be removed in the future).
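
As a small, hedged sketch of what working with unboxed values looks like, the following function unpacks two boxed @Int@s, combines their unboxed payloads with primitive operations, and boxes the result again. It is written against a present-day GHC, which is an assumption: there the syntax is enabled by the @MagicHash@ pragma rather than @-fglasgow-exts@, and the names are imported from @GHC.Exts@ rather than @GlaExts@. The function @squarePlus@ is a hypothetical example.

\begin{verbatim}
{-# LANGUAGE MagicHash #-}
module Main (main) where

import GHC.Exts (Int (I#), (*#), (+#))

-- squarePlus a b computes a*a + b directly on the unboxed Int# payloads.
-- As with all the primitive arithmetic, there is no overflow checking.
squarePlus :: Int -> Int -> Int
squarePlus (I# a) (I# b) = I# ((a *# a) +# b)

main :: IO ()
main = print (squarePlus 3 4)   -- 13
\end{verbatim}

The pattern @I# a@ relies on the boxed representation of @Int@ being visible, which is exactly what an interface of this kind exposes.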
\begin{verbatim} -- the representation of some basic types: data Char = C# Char# data Int = I# Int# data Addr = A# Addr# data Word = W# Word# data Float = F# Float# data Double = D# Double# data Integer = J# Int# Int# ByteArray# module GHC -- all primops and primitive types. \end{verbatim} \subsubsection{The @MutableArray@ interface} \label{sec:mutable-array} \index{MutableArray interface (GHC extensions)} The @MutableArray@ interface defines a general set of operations over mutable arrays (@MutableArray@) and mutable chunks of memory (@MutableByteArray@): \begin{verbatim} data MutableArray s ix elt -- abstract data MutableByteArray s ix -- abstract -- instance of : CCallable -- Creators: newArray :: Ix ix => (ix,ix) -> elt -> ST s (MutableArray s ix elt) newCharArray :: Ix ix => (ix,ix) -> ST s (MutableByteArray s ix) newAddrArray :: Ix ix => (ix,ix) -> ST s (MutableByteArray s ix) newIntArray :: Ix ix => (ix,ix) -> ST s (MutableByteArray s ix) newFloatArray :: Ix ix => (ix,ix) -> ST s (MutableByteArray s ix) newDoubleArray :: Ix ix => (ix,ix) -> ST s (MutableByteArray s ix) boundsOfArray :: Ix ix => MutableArray s ix elt -> (ix, ix) boundsOfByteArray :: Ix ix => MutableByteArray s ix -> (ix, ix) readArray :: Ix ix => MutableArray s ix elt -> ix -> ST s elt readCharArray :: Ix ix => MutableByteArray s ix -> ix -> ST s Char readIntArray :: Ix ix => MutableByteArray s ix -> ix -> ST s Int readAddrArray :: Ix ix => MutableByteArray s ix -> ix -> ST s Addr readFloatArray :: Ix ix => MutableByteArray s ix -> ix -> ST s Float readDoubleArray :: Ix ix => MutableByteArray s ix -> ix -> ST s Double writeArray :: Ix ix => MutableArray s ix elt -> ix -> elt -> ST s () writeCharArray :: Ix ix => MutableByteArray s ix -> ix -> Char -> ST s () writeIntArray :: Ix ix => MutableByteArray s ix -> ix -> Int -> ST s () writeAddrArray :: Ix ix => MutableByteArray s ix -> ix -> Addr -> ST s () writeFloatArray :: Ix ix => MutableByteArray s ix -> ix -> Float -> ST s () writeDoubleArray 
                      :: Ix ix => MutableByteArray s ix -> ix -> Double -> ST s ()

freezeArray           :: Ix ix => MutableArray s ix elt -> ST s (Array ix elt)
freezeCharArray       :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
freezeIntArray        :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
freezeAddrArray       :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
freezeFloatArray      :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
freezeDoubleArray     :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)

unsafeFreezeArray     :: Ix ix => MutableArray s ix elt -> ST s (Array ix elt)
unsafeFreezeByteArray :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
thawArray             :: Ix ix => Array ix elt -> ST s (MutableArray s ix elt)
\end{verbatim}

\subsubsection{The @ByteArray@ interface}
\label{sec:byte-array}
\index{ByteArray interface (GHC extensions)}

@ByteArray@s are chunks of immutable Haskell heap:
\begin{verbatim}
data ByteArray ix -- abstract

-- instance of: CCallable

indexCharArray     :: Ix ix => ByteArray ix -> ix -> Char
indexIntArray      :: Ix ix => ByteArray ix -> ix -> Int
indexAddrArray     :: Ix ix => ByteArray ix -> ix -> Addr
indexFloatArray    :: Ix ix => ByteArray ix -> ix -> Float
indexDoubleArray   :: Ix ix => ByteArray ix -> ix -> Double

indexCharOffAddr   :: Addr -> Int -> Char
indexIntOffAddr    :: Addr -> Int -> Int
indexAddrOffAddr   :: Addr -> Int -> Addr
indexFloatOffAddr  :: Addr -> Int -> Float
indexDoubleOffAddr :: Addr -> Int -> Double
\end{verbatim}

\subsubsection{Stable pointers}

Nothing exciting here, just simple boxing up.
\begin{verbatim}
data StablePtr a = StablePtr (StablePtr# a)

makeStablePointer :: a -> IO (StablePtr a)
freeStablePointer :: StablePtr a -> IO ()
\end{verbatim}

\subsubsection{Foreign objects}
\label{sec:foreign-obj}
\index{Foreign objects}

This module provides the @ForeignObj@ type and wrappers around the primitive operations on foreign objects.
\begin{verbatim}
data ForeignObj = ForeignObj ForeignObj#

makeForeignObj  :: Addr        -- object to be boxed up as a ForeignObj
                -> Addr        -- finaliser
                -> IO ForeignObj

writeForeignObj :: ForeignObj  -- previously created foreign object
                -> Addr        -- new value
                -> IO ()
\end{verbatim}
\index{ForeignObj type}
\index{makeForeignObj function}
\index{writeForeignObj function}

A typical use of @ForeignObj@ is in constructing Haskell bindings to external libraries. A good example is that of writing a binding to an image-processing library (which was actually the main motivation for implementing @ForeignObj@'s precursor, @MallocPtr#@). The images manipulated are not stored in the Haskell heap, either because the library insists on allocating them internally or because we (sensibly) decide to spare the GC from having to heave heavy images around.
\begin{verbatim}
data Image = Image ForeignObj
\end{verbatim}

The @ForeignObj@ type is then used to refer to the externally allocated image, and to achieve some type safety, the Haskell binding defines the @Image@ data type. So, a value of type @ForeignObj@ is used to ``box'' up an external reference into a Haskell heap object that we can then indirectly reference:
\begin{verbatim}
createImage :: (Int,Int) -> IO Image
\end{verbatim}

So far, this looks just like an @Addr@ type, but @ForeignObj@ offers a bit more, namely that we can specify a {\em finalisation routine} to invoke when the @ForeignObj@ is discarded by the GC. The garbage collector invokes the finalisation routine associated with the @ForeignObj@, saying ``Thanks, I'm through with this now...'' For the image-processing library, the finalisation routine could free the memory allocated for the images. The finalisation routine currently has to be written in C (the finalisation routine can in turn call on @FreeStablePtr@ to deallocate a stable pointer).

Associating a finalisation routine with an external object is done by calling @makeForeignObj@.
{\bf Note:} the foreign object value and its finaliser are contained in the @ForeignObj@, so there's no danger of an aggressive optimiser somehow separating the two (with the result that the foreign reference would not be freed). (Implementation: a linked list of all @ForeignObj#@s is maintained to allow the garbage collector to detect when a @ForeignObj#@ becomes garbage.) Like @Array@, @ForeignObj#@s are represented by heap objects. Upon controlled termination of the Haskell program, all @ForeignObjs@ are freed, invoking their respective finalisers before terminating. \subsubsection{The @CCall@ module} The @CCall@ module defines the classes @CCallable@ and @CReturnable@, along with instances for the primitive types (@Int@, @Int#@, @Float@, @Float#@ etc.) GHC knows to import this module if you use @_ccall_@, but if you need to define your own instances of these classes, you will need to import @CCall@ explicitly. More information on how to use @_ccall_@ can be found in Section \ref{glasgow-ccalls}. \begin{onlystandalone} \end{document} \end{onlystandalone}