[project @ 1996-01-11 14:06:51 by partain]

[ghc-hetmet.git] / ghc / docs / state_interface / state-interface.verb
diff --git a/ghc/docs/state_interface/state-interface.verb b/ghc/docs/state_interface/state-interface.verb

new file mode 100644 (file)

index 0000000..3767205
--- /dev/null
+++ b/ghc/docs/state_interface/state-interface.verb
@@ -0,0 +1,1046 @@
+\documentstyle[a4wide,grasp]{article}
+\renewcommand{\textfraction}{0.1}
+\renewcommand{\floatpagefraction}{0.9}
+\renewcommand{\dblfloatpagefraction}{0.9}
+
+\sloppy
+
+
+\begin{document}
+
+\title{GHC prelude: types and operations}
+\author{Simon L Peyton Jones \and John Launchbury \and Will Partain}
+
+\maketitle
+\tableofcontents
+
+This ``state interface document'' corresponds to Glasgow Haskell
+version~0.23.
+
+\section{Really primitive stuff}
+
+This section defines all the types which are primitive in Glasgow Haskell, and the
+operations provided for them.
+
+A primitive type is one which cannot be defined in Haskell, and which is 
+therefore built into the language and compiler.
+Primitive types are always unboxed; that is, a value of primitive type cannot be 
+bottom.
+
+Primitive values are often represented by a simple bit-pattern, such as @Int#@, 
+@Float#@, @Double#@.  But this is not necessarily the case: a primitive value 
+might be represented by a pointer to a heap-allocated object.  Examples include 
+@Array#@, the type of primitive arrays.  You might think this odd: doesn't being 
+heap-allocated mean that it has a box?  No, it does not.  A primitive array is 
+heap-allocated because it is too big a value to fit in a register, and would be 
+too expensive to copy around; in a sense, it is accidental that it is represented 
+by a pointer.  If a pointer represents a primitive value, then it really does 
+point to that value: no unevaluated thunks, no indirections...nothing can be at 
+the other end of the pointer than the primitive value.
+
+This section also describes a few non-primitive types, which are needed 
+to express the result types of some primitive operations.
+
+\subsection{Character and numeric types}
+
+There are the following obvious primitive types:
+@
+type Char#
+type Int#      -- see also Word# and Addr#, later
+type Float#
+type Double#
+@
+If you want to know their exact equivalents in C, see
+@ghc/includes/StgTypes.lh@ in the GHC source.
+
+Literals for these types may be written as follows:
+@
+1#             an Int#
+1.2#           a Float#
+1.34##         a Double#
+'a'#           a Char#; for weird characters, use '\o<octal>'#
+"a"#           an Addr# (a `char *')
+@
+
+\subsubsection{Comparison operations}
+@
+{gt,ge,eq,ne,lt,le}Char# :: Char# -> Char# -> Bool
+    -- ditto for Int#, Word#, Float#, Double#, and Addr#
+@
+
+\subsubsection{Unboxed-character operations}
+@
+ord# :: Char# -> Int#
+chr# :: Int# -> Char#
+@
+
+\subsubsection{Unboxed-@Int@ operations}
+@
+{plus,minus,times,quot,div,rem}Int# :: Int# -> Int# -> Int#
+negateInt# :: Int# -> Int#
+@
+NB: No error/overflow checking!
+
+\subsubsection{Unboxed-@Float@ and @Double@ operations}
+@
+{plus,minus,times,divide}Float# :: Float# -> Float# -> Float#
+negateFloat# :: Float# -> Float#
+
+float2Int#     :: Float# -> Int#   -- just a cast, no checking!
+int2Float#     :: Int# -> Float#
+
+expFloat#      :: Float# -> Float#
+logFloat#      :: Float# -> Float#
+sqrtFloat#     :: Float# -> Float#
+sinFloat#      :: Float# -> Float#
+cosFloat#      :: Float# -> Float#
+tanFloat#      :: Float# -> Float#
+asinFloat#     :: Float# -> Float#
+acosFloat#     :: Float# -> Float#
+atanFloat#     :: Float# -> Float#
+sinhFloat#     :: Float# -> Float#
+coshFloat#     :: Float# -> Float#
+tanhFloat#     :: Float# -> Float#
+powerFloat#    :: Float# -> Float# -> Float#
+@
+There's an exactly-matching set of unboxed-@Double@ ops; replace
+@Float#@ with @Double#@ in the list above.  There are two
+coercion functions for @Float#@/@Double#@:
+@
+float2Double#  :: Float# -> Double#
+double2Float#  :: Double# -> Float#
+@
+The primitive versions of @encodeFloat@/@decodeFloat@:
+@
+encodeFloat#   :: Int# -> Int# -> ByteArray#   -- Integer mantissa
+               -> Int#                         -- Int exponent
+               -> Float#
+
+decodeFloat#   :: Float#
+               -> _ReturnIntAndGMP
+@
+(And the same for @Double#@s.)
+
+\subsection{Operations on/for @Integers@ (interface to GMP)}
+\label{sect:horrid-Integer-pairing-types}
+
+We implement @Integers@ (arbitrary-precision integers) using the GNU
+multiple-precision (GMP) package.
+
+The data type for @Integer@ must mirror that for @MP_INT@ in @gmp.h@
+(see @gmp.info@).  It comes out as:
+@
+data Integer = J# Int# Int# ByteArray#
+@
+So, @Integer@ is really just a ``pairing'' type for a particular
+collection of primitive types.
+
+The operations in the GMP return other combinations of
+GMP-plus-something, so we need ``pairing'' types for those, too:
+@
+type _ReturnGMP       = Integer        -- synonym
+data _Return2GMPs     = _Return2GMPs Int# Int# ByteArray#
+                                    Int# Int# ByteArray#
+data _ReturnIntAndGMP = _ReturnIntAndGMP Int#
+                                        Int# Int# ByteArray#
+
+-- ????? something to return a string of bytes (in the heap?)
+@
+The primitive ops to support @Integers@ use the ``pieces'' of the
+representation, and are as follows:
+@
+negateInteger# :: Int# -> Int# -> ByteArray# -> Integer
+
+{plus,minus,times}Integer# :: Int# -> Int# -> ByteArray#
+                          -> Int# -> Int# -> ByteArray#
+                          -> Integer
+
+cmpInteger# :: Int# -> Int# -> ByteArray#
+           -> Int# -> Int# -> ByteArray#
+           -> Int# -- -1 for <; 0 for ==; +1 for >
+
+divModInteger#, quotRemInteger#
+       :: Int# -> Int# -> ByteArray#
+       -> Int# -> Int# -> ByteArray#
+       -> _Return2GMPs
+
+integer2Int# :: Int# -> Int# -> ByteArray#
+            -> Int# 
+
+int2Integer#  :: Int#  -> Integer -- NB: no error-checking on these two!
+word2Integer# :: Word# -> Integer
+
+addr2Integer# :: Addr# -> Integer
+       -- the Addr# is taken to be a `char *' string
+       -- to be converted into an Integer
+@
+
+
+\subsection{Words and addresses}
+
+A @Word#@ is used for bit-twiddling operations.  It is the same size as
+an @Int#@, but has no sign nor any arithmetic operations.
+@
+type Word#     -- Same size/etc as Int# but *unsigned*
+type Addr#     -- A pointer from outside the "Haskell world" (from C, probably);
+               -- described under "arrays"
+@
+@Word#@s and @Addr#@s have the usual comparison operations.
+Other unboxed-@Word@ ops (bit-twiddling and coercions):
+@
+and#, or# :: Word# -> Word# -> Word#
+
+not# :: Word# -> Word#
+
+shiftL#, shiftRA#, shiftRL# :: Word# -> Int# -> Word#
+       -- shift left, right arithmetic, right logical
+
+iShiftL#, iShiftRA#, iShiftRL# :: Int# -> Int# -> Int#
+       -- same shift ops, but on Int#s
+
+int2Word#      :: Int#  -> Word# -- just a cast, really
+word2Int#      :: Word# -> Int#
+@
+
+Unboxed-@Addr@ ops (C casts, really):
+@
+int2Addr#      :: Int#  -> Addr#
+addr2Int#      :: Addr# -> Int#
+@
+Operations for indexing off of C pointers (@Addr#@s) to snatch values
+are listed under ``arrays''.
+
+\subsection{Arrays}
+
+The type @Array# elt@ is the type of primitive,
+unboxed arrays of values of type @elt@.  
+@
+type Array# elt
+@
+@Array#@ is more primitive than a Haskell
+array --- indeed, Haskell arrays are implemented using @Array#@ ---
+in that an @Array#@ is indexed only by @Int#@s, starting at zero.  It is also
+more primitive by virtue of being unboxed.  That doesn't mean that it isn't
+a heap-allocated object --- of course, it is.  Rather, being unboxed means
+that it is represented by a pointer to the array itself, and not to a thunk
+which will evaluate to the array (or to bottom).
+The components of an @Array#@ are themselves boxed.
+
+The type @ByteArray#@ is similar to @Array#@, except that it contains
+just a string of (non-pointer) bytes.
+@
+type ByteArray#
+@
+Arrays of these types are useful when a Haskell program wishes to
+construct a value to pass to a C procedure.  It is also possible to
+use them to build (say) arrays of unboxed characters for internal use
+in a Haskell program.  Given these uses, @ByteArray#@ is deliberately
+a bit vague about the type of its components.  Operations are provided
+to extract values of type @Char#@, @Int#@, @Float#@, @Double#@, and
+@Addr#@ from arbitrary offsets within a @ByteArray#@.  (For type @Foo#@,
+the $i$th offset gets you the $i$th @Foo#@, not the @Foo#@ at byte-position $i$.  Mumble.)
+(If you want a @Word#@, grab an @Int#@, then coerce it.)
+
+Lastly, we have static byte-arrays, of type @Addr#@ [mentioned
+previously].  (Remember the duality between arrays and pointers in C.)
+Arrays of this types are represented by a pointer to an array in the
+world outside Haskell, so this pointer is not followed by the garbage
+collector.  In other respects they are just like @ByteArray#@.  They
+are only needed in order to pass values from C to Haskell.
+
+\subsubsection{Reading and writing.}
+
+Primitive arrays are linear, and indexed starting at zero.
+
+The size and indices of a @ByteArray#@, @Addr#@, and
+@MutableByteArray#@ are all in bytes.  It's up to the program to
+calculate the correct byte offset from the start of the array.  This
+allows a @ByteArray#@ to contain a mixture of values of different
+type, which is often needed when preparing data for and unpicking
+results from C.  (Umm... not true of indices... WDP 95/09)
+
+{\em Should we provide some @sizeOfDouble#@ constants?}
+
+Out-of-range errors on indexing should be caught by the code which
+uses the primitive operation; the primitive operations themselves do
+{\em not} check for out-of-range indexes. The intention is that the
+primitive ops compile to one machine instruction or thereabouts.
+
+We use the terms ``reading'' and ``writing'' to refer to accessing {\em mutable} 
+arrays (see Section~\ref{sect:mutable}), and ``indexing'' 
+to refer to reading a value from an {\em immutable} 
+array.
+
+If you want to read/write a @Word#@, read an @Int#@ and coerce.
+
+Immutable byte arrays are straightforward to index (all indices in bytes):
+@
+indexCharArray#   :: ByteArray# -> Int# -> Char#
+indexIntArray#    :: ByteArray# -> Int# -> Int#
+indexAddrArray#   :: ByteArray# -> Int# -> Addr#
+indexFloatArray#  :: ByteArray# -> Int# -> Float#
+indexDoubleArray# :: ByteArray# -> Int# -> Double#
+
+indexCharOffAddr#   :: Addr# -> Int# -> Char#
+indexIntOffAddr#    :: Addr# -> Int# -> Int#
+indexFloatOffAddr#  :: Addr# -> Int# -> Float#
+indexDoubleOffAddr# :: Addr# -> Int# -> Double#
+indexAddrOffAddr#   :: Addr# -> Int# -> Addr#  -- Get an Addr# from an Addr# offset
+@
+The last of these, @indexAddrOffAddr#@, extracts an @Addr#@ using an offset
+from another @Addr#@, thereby providing the ability to follow a chain of
+C pointers.
+
+Something a bit more interesting goes on when indexing arrays of boxed
+objects, because the result is simply the boxed object. So presumably
+it should be entered --- we never usually return an unevaluated
+object!  This is a pain: primitive ops aren't supposed to do
+complicated things like enter objects.  The current solution is to
+return a lifted value, but I don't like it!
+@
+indexArray#       :: Array# elt -> Int# -> _Lift elt   -- Yuk!
+@
+
+\subsubsection{The state type}
+
+The primitive type @State#@ represents the state of a state transformer.
+It is parameterised on the desired type of state, which serves to keep
+states from distinct threads distinct from one another.  But the {\em only}
+effect of this parameterisation is in the type system: all values of type
+@State#@ are represented in the same way.  Indeed, they are all 
+represented by nothing at all!  The code generator ``knows'' to generate no 
+code, and allocate no registers etc, for primitive states.
+@
+type State# s
+@
+
+The type @_RealWorld@ is truly opaque: there are no values defined
+of this type, and no operations over it.  It is ``primitive'' in that
+sense---but it is {\em not unboxed!} Its only role in life is to be the type
+which distinguishes the @PrimIO@ state transformer (see
+Section~\ref{sect:io-spec}).
+@
+data _RealWorld
+@
+
+\subsubsection{States}
+
+A single, primitive, value of type @State# _RealWorld@ is provided.
+@
+realWorld# :: State# _RealWorld
+@
+(Note: in the compiler, not a @PrimOp@; just a mucho magic @Id@.)
+
+\subsection{State pairing types}
+\label{sect:horrid-pairing-types}
+
+This subsection defines some types which, while they aren't quite primitive 
+because we can define them in Haskell, are very nearly so.  They define 
+constructors which pair a primitive state with a value of each primitive type.
+They are required to express the result type of the primitive operations in the 
+state monad.
+@
+data StateAndPtr#    s elt = StateAndPtr#    (State# s) elt 
+
+data StateAndChar#   s     = StateAndChar#   (State# s) Char# 
+data StateAndInt#    s     = StateAndInt#    (State# s) Int# 
+data StateAndWord#   s     = StateAndWord#   (State# s) Word#
+data StateAndFloat#  s     = StateAndFloat#  (State# s) Float# 
+data StateAndDouble# s     = StateAndDouble# (State# s) Double#  
+data StateAndAddr#   s     = StateAndAddr#   (State# s) Addr#
+
+data StateAndStablePtr# s a = StateAndStablePtr# (State# s) (StablePtr# a)
+data StateAndMallocPtr# s   = StateAndMallocPtr# (State# s) MallocPtr#
+data StateAndSynchVar#  s a = StateAndSynchVar#  (State# s) (SynchVar# a)
+
+data StateAndArray#            s elt = StateAndArray#        (State# s) (Array# elt) 
+data StateAndMutableArray#     s elt = StateAndMutableArray# (State# s) (MutableArray# s elt)  
+data StateAndByteArray#        s = StateAndByteArray#        (State# s) ByteArray# 
+data StateAndMutableByteArray# s = StateAndMutableByteArray# (State# s) (MutableByteArray# s)
+@
+
+
+\subsection{Mutable arrays}
+\label{sect:mutable}
+
+Corresponding to @Array#@ and @ByteArray#@,
+we have the types of mutable versions of each.  
+In each case, the representation is a pointer
+to a suitable block of (mutable) heap-allocated storage.
+@
+type MutableArray# s elt
+type MutableByteArray# s
+@
+\subsubsection{Allocation.}
+
+Mutable arrays can be allocated.
+Only pointer-arrays are initialised; arrays of non-pointers are filled
+in by ``user code'' rather than by the array-allocation primitive.
+Reason: only the pointer case has to worry about GC striking with a
+partly-initialised array.
+@
+newArray#       :: Int# -> elt -> State# s -> StateAndMutableArray# s elt 
+
+newCharArray#   :: Int# -> State# s -> StateAndMutableByteArray# s 
+newIntArray#    :: Int# -> State# s -> StateAndMutableByteArray# s 
+newAddrArray#   :: Int# -> State# s -> StateAndMutableByteArray# s 
+newFloatArray#  :: Int# -> State# s -> StateAndMutableByteArray# s 
+newDoubleArray# :: Int# -> State# s -> StateAndMutableByteArray# s 
+@
+The size of a @ByteArray#@ is given in bytes.
+
+\subsubsection{Reading and writing}
+
+%OLD: Remember, offsets in a @MutableByteArray#@ are in bytes.
+@
+readArray#       :: MutableArray# s elt -> Int# -> State# s -> StateAndPtr#    s elt
+readCharArray#   :: MutableByteArray# s -> Int# -> State# s -> StateAndChar#   s
+readIntArray#    :: MutableByteArray# s -> Int# -> State# s -> StateAndInt#    s
+readAddrArray#  :: MutableByteArray# s -> Int# -> State# s -> StateAndAddr#   s 
+readFloatArray#  :: MutableByteArray# s -> Int# -> State# s -> StateAndFloat#  s 
+readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndDouble# s 
+
+writeArray#       :: MutableArray# s elt -> Int# -> elt     -> State# s -> State# s 
+writeCharArray#   :: MutableByteArray# s -> Int# -> Char#   -> State# s -> State# s 
+writeIntArray#    :: MutableByteArray# s -> Int# -> Int#    -> State# s -> State# s 
+writeAddrArray#   :: MutableByteArray# s -> Int# -> Addr#   -> State# s -> State# s 
+writeFloatArray#  :: MutableByteArray# s -> Int# -> Float#  -> State# s -> State# s 
+writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s 
+@
+
+\subsubsection{Equality.}
+
+One can take ``equality'' of mutable arrays.  What is compared is the
+{\em name} or reference to the mutable array, not its contents.
+@
+sameMutableArray#     :: MutableArray# s elt -> MutableArray# s elt -> Bool
+sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
+@
+
+\subsubsection{Freezing mutable arrays}
+
+Only unsafe-freeze has a primitive.  (Safe freeze is done directly in Haskell 
+by copying the array and then using @unsafeFreeze@.) 
+@
+unsafeFreezeArray#     :: MutableArray# s elt -> State# s -> StateAndArray#     s elt
+unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> StateAndByteArray# s
+@
+
+\subsubsection{Stable pointers}
+
+{\em Andy's comment.} {\bf Errors:} The following is not strictly true: the current
+implementation is not as polymorphic as claimed.  The reason for this
+is that the C programmer will have to use a different entry-routine
+for each type of stable pointer.  At present, we only supply a very
+limited number (1) of these routines.  It might be possible to
+increase the range of these routines by providing general purpose
+entry points to apply stable pointers to (stable pointers to)
+arguments and to enter (stable pointers to) boxed primitive values.
+{\em End of Andy's comment.}
+
+A stable pointer is a name for a Haskell object which can be passed to the 
+external world.  It is ``stable'' in the sense that the name does not change when 
+the Haskell garbage collector runs --- in contrast to the address of the object 
+which may well change.
+
+The stable pointer type is parameterised by the type of the thing which is named.
+@
+type StablePtr# a
+@
+A stable pointer is represented by an index into the (static) 
+@StablePointerTable@.  The Haskell garbage collector treats the 
+@StablePointerTable@ as a source of roots for GC.
+
+The @makeStablePointer@ function converts a value into a stable pointer.
+It is part of the @PrimIO@ monad, because we want to be sure we don't
+allocate one twice by accident, and then only free one of the copies.
+@
+makeStablePointer#  :: a -> State# _RealWorld -> StateAndStablePtr# _RealWorld a
+freeStablePointer#  :: StablePtr# a -> State# _RealWorld -> State# _RealWorld
+deRefStablePointer# :: StablePtr# a -> State# _RealWorld -> StateAndPtr _RealWorld a
+@
+There is also a C procedure @FreeStablePtr@ which frees a stable pointer.
+
+\subsubsection{``Malloc'' pointers}
+
+A ``malloc'' pointer is an ordinary pointer from outside the Haskell world
+(i.e., from the C world) where the Haskell world has been told ``Let me
+know when you're finished with this ...''.
+
+The ``malloc'' pointer type is just a special @Addr#@ ({\em not} parameterised).
+@
+type MallocPtr#
+@
+{\em ToDo: say more about this and how it's used...}
+
+The main point is that when Haskell discards a 
+value of type @MallocPtr#@, it calls the procedure @FreeMallocPtr@, which
+must be provided by the C world.  @FreeMallocPtr@ might in turn call
+the GHC-provided procedure @FreeStablePtr@, to deallocate a stable pointer.
+No other GHC runtime system procedures should be called by @FreeMallocPtr@.
+
+(Implementation: a linked list of all @MallocPtr#@s is maintained to allow the
+garbage collector to detect when a @MallocPtr#@ becomes garbage.)
+
+Like @Array@, @MallocPtr#@s are represented by heap objects.
+
+{\bf ToDo --- Important:} Ian Poole reports a need for functions to return a list of
+CHPs.  Should we add a @CHeapPtrArray@ type too? or just
+hack something up?
+
+The only Haskell operation we might want on @MallocPtr#@s is an
+equality test.  However, this is easily implemented if desired:
+@
+>      eqCHP x y = (_runST (_ccall_ equal x y) == 1::Int)
+
+C>     equal (x, y)
+C>     {
+C>     return (x == y ? 1 : 0);
+C>     }
+@
+
+The C world must provide a function @FreeCHeapPointer@ which
+will be called (with a C Heap pointer as argument) when the garbage
+collector releases a CHP.
+
+{\bf ToDo:} Decide whether @FreeCHeapPointer@ is allowed to call on a
+stable pointer. (I sincerely hope not since we will still be in the
+GC at this point.)
+
+\subsubsection{Synchronizing variables (I-vars, M-vars)}
+
+ToDo ToDo ToDo
+
+@
+type SynchVar# s elt   -- primitive
+
+newSynchVar#:: State# s -> StateAndSynchVar# s elt
+
+takeMVar#   :: SynchVar# s elt -> State# s -> StateAndPtr# s elt
+putMVar#    :: SynchVar# s elt -> State# s -> State# s
+
+readIVar#   :: SynchVar# s elt -> State# s -> StateAndPtr# s elt
+writeIVar#  :: SynchVar# s elt -> State# s -> State# s
+@
+
+\subsubsection{Controlling the garbage collector}
+
+The C function {\tt PerformGC\/}, allows the C world to force Haskell
+to do a garbage collection.  It can only be called while Haskell
+is performing a C Call.
+
+Note that this function can be used to define a Haskell IO operation
+with the same effect:
+@
+>      performGCIO :: PrimIO ()
+>      performGCIO = _ccall_gc_ PerformGC
+@
+
+{\bf ToDo:} Is there any need for abnormal/normal termination to force
+a GC too?  Is there any need for a function that provides finer
+control over GC: argument = amount of space required; result = amount
+of space recovered.
+
+\subsection{@spark#@ primitive operation (for parallel execution)}
+
+{\em ToDo: say something}  It's used in the unfolding for @par@.
+
+\subsection{The @errorIO#@ primitive operation}
+
+The @errorIO#@ primitive takes an argument of type @PrimIO@.  It aborts execution of
+the current program, and continues instead by performing the given @PrimIO@ value
+on the current state of the world.
+@
+errorIO# :: PrimIO () -> a
+@
+
+\subsection{C Calls}
+
+{\bf ToDo:} current implementation has state variable as second
+argument not last argument.
+
+The @ccall#@ primitive can't be given an ordinary type, because it has
+a variable number of arguments.  The nearest we can get is:
+@
+ccall# :: CRoutine -> a1# -> ... -> an# -> State# _RealWorld -> StateAndR# _RealWorld
+@
+where the type variables @a1#@\ldots@an#@ and @r#@ can be instantiated by any
+primitive type, and @StateAndR#@ is the appropriate pairing type from 
+Section~\ref{sect:horrid-pairing-types}.  The @CRoutine@ 
+isn't a proper Haskell type at all; it just reminds us that @ccall#@ needs to 
+know what C routine to call.
+
+This notation is really short for a massive family of @ccall#@ primitives, one 
+for each combination of types.  (Of course, the compiler simply remembers the 
+types involved, and generates appropriate code when it finally spits out the C.)
+
+Unlike all the other primitive operators, @ccall#@ is not bound to an in-scope 
+identifier.  The only way it is possible to generate a @ccall#@ is via the 
+@_ccall_@ construct.
+
+All this applies equally to @casm#@:
+@
+casm#  :: CAsmString -> a1# -> ... -> an# -> State# _RealWorld -> StateAndR# _RealWorld
+@
+
+%------------------------------------------------------------
+\section{Library stuff built with the Really Primitive Stuff}
+
+\subsection{The state transformer monad}
+
+\subsubsection{Types}
+
+A state transformer is a function from a state to a pair of a result and a new 
+state.  
+@
+type _ST s a = _State s -> (a, _State s)
+@
+The @_ST@ type is {\em abstract}, so that the programmer cannot see its 
+representation.  If he could, he could write bad things like:
+@
+bad :: _ST s a
+bad = \s -> ...(f s)...(g s)...
+@
+Here, @s@ is duplicated, which would be bad news.
+
+A state is represented by a primitive state value, of type @State# s@, 
+wrapped up in a @_State@ constructor.  The reason for boxing it in this
+way is so that we can be strict or lazy in the state.  (Remember, all 
+primitive types are unboxed, and hence can't be bottom; but types built
+with @data@ are all boxed.)
+@
+data _State s = S# (State# s)
+@ 
+
+\subsubsection{The state transformer combinators}
+
+Now for the combinators, all of which live inside the @_ST@
+abstraction.  Notice that @returnST@ and @thenST@ are lazy in the
+state.
+@
+returnST :: a -> _ST s a
+returnST a s = (a, s)
+
+thenST :: _ST s a -> (a -> _ST s b) -> _ST s b
+thenST m k s = let (r,new_s) = m s
+               in 
+               k r new_s
+
+fixST :: (a -> _ST s a) -> _ST s a
+fixST k s = let ans = k r s
+                (r,new_s) = ans
+            in
+            ans
+@
+The interesting one is, of course, @_runST@.  We can't infer its type!
+(It has a funny name because it must be wired into the compiler.)
+@
+-- _runST :: forall a. (forall s. _ST s a) -> a
+_runST m = case m (S# realWorld#) of
+           (r,_) -> r
+@
+
+\subsubsection{Other useful combinators}
+
+There are various other standard combinators, all defined in terms the
+fundamental combinators above. The @seqST@ combinator is like
+@thenST@, except that it discards the result of the first state
+transformer:
+@
+seqST :: _ST s a -> _ST s b -> _ST s b
+seqST m1 m2 = m1 `thenST` (\_ -> m2)
+@
+
+We also have {\em strict} (... in the state...) variants of the
+then/return combinators (same types as their pals):
+@
+returnStrictlyST a s@(S# _) = (a, s)
+
+thenStrictlyST m k s@(S# _)
+  = case (m s) of { (r, new_s@(S# _)) ->
+    k r new_s }
+
+seqStrictlyST m k = ... ditto, for seqST ...
+@
+
+The combinator @listST@ takes a list of state transformers, and
+composes them in sequence, returning a list of their results:
+@
+listST :: [_ST s a] -> _ST s [a]
+listST []     = returnST []
+listST (m:ms) = m              `thenST` \ r ->
+               listST ms       `thenST` \ rs ->
+               returnST (r:rs)
+@
+The @mapST@ combinator ``lifts'' a function from a value to state
+transformers to one which works over a list of values:
+@
+mapST :: (a -> _ST s b) -> [a] -> _ST s [b]
+mapST f ms = listST (map f ms)
+@
+The @mapAndUnzipST@ combinator is similar to @mapST@, except that here the
+function returns a pair:
+@
+mapAndUnzipST :: (a -> _ST s (b,c)) -> [a] -> _ST s ([b],[c])
+mapAndUnzipST f (m:ms)
+  = f m                        `thenST` \ ( r1,  r2) ->
+    mapAndUnzipST f ms `thenST` \ (rs1, rs2) ->
+    returnST (r1:rs1, r2:rs2)
+@
+
+\subsubsection{The @PrimIO@ monad}
+\label{sect:io-spec}
+
+The @PrimIO@ type is defined in as a state transformer which manipulates the 
+@_RealWorld@.
+@
+type PrimIO a = _ST _RealWorld a      -- Transparent
+@
+The @PrimIO@ type is an ordinary type synonym, transparent to the programmer.
+
+The type @_RealWorld@ and value @realWorld#@ do not need to be hidden (although 
+there is no particular point in exposing them).  Even having a value of type 
+@realWorld#@ does not compromise safety, since the type @_ST@ is hidden. 
+
+It is type-correct to use @returnST@ in an I/O context, but it is a
+bit more efficient to use @returnPrimIO@.  The latter is strict in the
+state, which propagates backwards to all the earlier combinators
+(provided they are unfolded).  Why is it safe for @returnPrimIO@ to be
+strict in the state?  Because every context in which an I/O state
+transformer is used will certainly evaluate the resulting state; it is
+the state of the real world!
+@
+returnPrimIO :: a -> PrimIO a
+returnPrimIO a s@(S# _) -> (a, s)
+@
+We provide strict versions of the other combinators too.
+@
+thenPrimIO m k s = case m s of
+                    (r,s) -> k r s
+@
+@fixPrimIO@ has to be lazy, though!
+@
+fixPrimIO  = fixST
+@
+The other combinators are just the same as before, but use the strict
+@thenPrimIO@ and @returnPrimIO@ for efficiency.
+@
+foldrPrimIO f z []     = z
+foldrPrimIO f z (m:ms) = foldrPrimIO f z ms `thenPrimIO` \ ms' ->
+                        f m ms'
+
+listPrimIO ms = foldrPrimIO (\ a xs -> a `thenPrimIO` \ x -> returnPrimIO (x : xs))
+               (returnPrimIO []) ms
+
+mapPrimIO f ms = listPrimIO (map f ms)
+
+mapAndUnzipPrimIO f (m:ms)
+  = f m                            `thenPrimIO` \ ( r1,  r2) ->
+    mapAndUnzipPrimIO f ms  `thenPrimIO` \ (rs1, rs2) ->
+    returnPrimIO (r1:rs1, r2:rs2)
+@
+
+\subsection{Arrays}
+
+\subsubsection{Types}
+
+@
+data Array      ix elt = _Array     (ix,ix) (Array# elt)
+data _ByteArray ix     = _ByteArray (ix,ix) ByteArray#
+
+data _MutableArray     s ix elt = _MutableArray     (ix,ix) (MutableArray# s elt)
+data _MutableByteArray s ix     = _MutableByteArray (ix,ix) (MutableByteArray# s)
+@
+
+\subsubsection{Operations on immutable arrays}
+
+Ordinary array indexing is straightforward.
+@
+(!) :: Ix ix => Array ix elt -> ix -> elt
+@
+QUESTIONs: should @_ByteArray@s be indexed by Ints or ix?  With byte offsets
+or sized ones? (sized ones [WDP])
+@
+indexCharArray   :: Ix ix => _ByteArray ix -> ix -> Char
+indexIntArray    :: Ix ix => _ByteArray ix -> ix -> Int
+indexAddrArray   :: Ix ix => _ByteArray ix -> ix -> _Addr
+indexFloatArray  :: Ix ix => _ByteArray ix -> ix -> Float
+indexDoubleArray :: Ix ix => _ByteArray ix -> ix -> Double
+@
+@Addr@s are indexed straightforwardly by @Int@s.  Unlike the primitive
+operations, though, the offsets assume that the array consists entirely of the
+type of value being indexed, and so there's an implicit multiplication by
+the size of that value.  To access @Addr@s with mixed values requires
+you to do a DIY job using the primitives.
+@
+indexAddrChar :: Addr -> Int -> Char
+...etc...
+indexStaticCharArray   :: Addr -> Int -> Char
+indexStaticIntArray    :: Addr -> Int -> Int
+indexStaticFloatArray  :: Addr -> Int -> Float
+indexStaticDoubleArray :: Addr -> Int -> Double
+indexStaticArray       :: Addr -> Int -> Addr
+@
+
+\subsubsection{Operations on mutable arrays}
+@
+newArray     :: Ix ix => (ix,ix) -> elt -> _ST s (_MutableArray s ix elt)
+newCharArray :: Ix ix => (ix,ix) -> _ST s (_MutableByteArray s ix) 
+...
+@
+
+@
+readArray   :: Ix ix => _MutableArray s ix elt -> ix -> _ST s elt 
+readCharArray   :: Ix ix => _MutableByteArray s ix -> ix -> _ST s Char 
+...
+@
+
+@
+writeArray  :: Ix ix => _MutableArray s ix elt -> ix -> elt -> _ST s () 
+writeCharArray  :: Ix ix => _MutableByteArray s ix -> ix -> Char -> _ST s () 
+...
+@
+
+@
+freezeArray :: Ix ix => _MutableArray s ix elt -> _ST s (Array ix elt)
+freezeCharArray :: Ix ix => _MutableByteArray s ix -> _ST s (_ByteArray ix Char)
+...
+@
+
+We have no need on one-function-per-type for unsafe freezing:
+@
+unsafeFreezeArray :: Ix ix => _MutableArray s ix elt -> _ST s (Array ix elt)  
+unsafeFreezeByteArray :: Ix ix => _MutableByteArray s ix -> _ST s (_ByteArray ix elt)
+@
+
+Sometimes we want to snaffle the bounds of one of these beasts:
+@
+boundsOfArray     :: Ix ix => _MutableArray s ix elt -> (ix, ix)  
+boundsOfByteArray :: Ix ix => _MutableByteArray s ix -> (ix, ix)
+@
+
+Lastly, ``equality'':
+@
+sameMutableArray     :: _MutableArray s ix elt -> _MutableArray s ix elt -> Bool
+sameMutableByteArray :: _MutableByteArray s ix -> _MutableByteArray s ix -> Bool
+@
+
+
+\subsection{Variables}
+
+\subsubsection{Types}
+
+Mutable variables are (for now anyway) implemented as arrays.  The @MutableVar@ type
+is opaque, so we can change the implementation later if we want.
+@
+type MutableVar s a = _MutableArray s Int a
+@
+
+\subsubsection{Operations}
+@
+newVar   :: a -> _ST s (MutableVar s a)
+readVar  :: MutableVar s a -> _ST s a
+writeVar :: MutableVar s a -> a -> _ST s ()
+sameVar  :: MutableVar s a -> MutableVar s a -> Bool
+@
+
+\subsection{Stable pointers}
+
+Nothing exciting here, just simple boxing up.
+@
+data _StablePtr a = _StablePtr (StablePtr# a)
+
+makeStablePointer :: a -> _StablePtr a
+freeStablePointer :: _StablePtr a -> PrimIO ()
+@
+
+\subsection{``Malloc'' pointers}
+
+Again, just boxing up.
+@
+data _MallocPtr = _MallocPtr MallocPtr#
+@
+
+\subsection{C calls}
+
+Everything in this section goes for @_casm_@ too.
+
+{\em ToDo: mention @_ccall_gc_@ and @_casm_gc_@...}
+
+The @_ccall_@ construct has the following form:
+$$@_ccall_@~croutine~a_1~\ldots~a_n$$
+This whole construct has type $@PrimIO@~res$.
+The rules are these:
+\begin{itemize}
+\item
+The first ``argument'', $croutine$, must be the literal name of a C procedure.
+It cannot be a Haskell expression which evaluates to a string, etc; it must be 
+simply the name of the procedure.
+\item
+The arguments $a_1, \ldots,a_n$ must be of {\em C-callable} type.
+\item
+The whole construct has type $@PrimIO@~ty$, where $ty$ is a {\em C-returnable} type.
+\end{itemize}
+A {\em boxed-primitive} type is both C-callable and C-returnable.
+A boxed primitive type is anything declared by:
+@
+data T = C# t
+@
+where @t@ is a primitive type.  Note that
+programmer-defined boxed-primitive types are perfectly OK:
+@
+data Widget = W# Int#
+data Screen = S# CHeapPtr#
+@
+
+There are other types that can be passed to C (C-callable).  This
+table summarises (including the standard boxed-primitive types):
+@
+Boxed              Type of transferd   Corresp.     Which is
+Type               Prim. component     C type       *probably*...
+------             ---------------     ------       -------------
+Char               Char#               StgChar      unsigned char
+Int                Int#                StgInt       long int
+_Word              Word#               StgWord      unsigned long int
+_Addr              Addr#               StgAddr      char *
+Float              Float#              StgFloat     float
+Double             Double#             StgDouble    double
+
+Array              Array#              StgArray     StgPtr
+_ByteArray         ByteArray#          StgByteArray StgPtr
+_MutableArray      MutableArray#       StgArray     StgPtr
+_MutableByteArray   MutableByteArray#  StgByteArray StgPtr
+
+_State             State#              nothing!
+
+_StablePtr         StablePtr#          StgStablePtr StgPtr
+_MallocPtr         MallocPtr#          StgMallocPtr StgPtr
+@
+
+All of the above are {\em C-returnable} except:
+@
+       Array, _ByteArray, _MutableArray, _MutableByteArray
+@
+
+{\bf ToDo:} I'm pretty wary of @Array@ and @_MutableArray@ being in
+this list, and not too happy about @_State@ [WDP].
+
+{\bf ToDo:} Can code generator pass all the primitive types?  Should this be
+extended to include {\tt Bool\/} (or any enumeration type?)
+
+The type checker must be able to figure out just which of the C-callable/returnable
+types is being used.  If it can't, you have to add type signatures. For example,
+@
+f x = _ccall_ foo x
+@
+is not good enough, because the compiler can't work out what type @x@ is, nor 
+what type the @_ccall_@ returns.  You have to write, say:
+@
+f :: Int -> PrimIO Float
+f x = _ccall_ foo x
+@
+
+\subsubsection{Implementation}
+
+The desugarer unwraps the @_ccall_@ construct by inserting the necessary 
+evaluations etc to unbox the arguments.  For example, the body of the definition 
+of @f@ above would become:
+@
+        (\ s -> case x of { I# x# -> 
+                case s of { S# s# ->
+                case ccall# [Int#,Float#] x# s# of { StateAndFloat# f# new_s# ->
+                (F# f#, S# new_s#)
+                }}})
+@
+Notice that the state, too, is unboxed.
+
+The code generator must deal specially with primitive objects which
+are stored on the heap.
+
+... details omitted ...
+
+More importantly, it must construct a C Heap Pointer heap-object after
+a @_ccall_@ which returns a @MallocPtr#@.
+
+%--------------------------------------------------------
+\section{Non-primitive stuff that must be wired into GHC}
+
+@
+data Char    = C# Char#
+data Int     = I# Int#
+data _Word   = W# Word#
+data _Addr   = A# Addr#
+
+data Float   = F# Float#
+data Double  = D# Double#
+data Integer = J# Int# Int# ByteArray#
+
+-- and the other boxed-primitive types:
+    Array, _ByteArray, _MutableArray, _MutableByteArray,
+    _StablePtr, _MallocPtr
+
+data Bool     = False | True
+data CMP_TAG# = LT# | EQ# | GT#  -- used in derived comparisons
+
+data List a = [] | a : (List a)
+-- tuples...
+
+data Ratio a  = a :% a
+type Rational = Ratio Integer
+
+data {Request,Response,etc} -- so we can check the type of "main"
+
+data _Lift a = _Lift a    -- used Yukkily as described elsewhere
+
+type String  = [Char]    -- convenience, only
+@
+
+%------------------------------------------------------------
+\section{Programmer interface(s)}
+
+\subsection{The bog-standard interface}
+
+If you rely on the implicit @import Prelude@ that GHC normally does
+for you, and if you don't use any weird flags (notably
+@-fglasgow-exts@), and if you don't import one of the fairly-magic
+@PreludeGla*@ interfaces, then GHC should work {\em exactly} as the
+Haskell report says, and the full user namespaces should be available
+to you.
+
+Exception: until we burn in the new names @_scc_@ and @_ccall_@, the
+names @scc@ and @ccall@ are still available.
+
+\subsection{If you mess about with @import Prelude@...}
+
+Innocent renaming and hiding, e.g.,
+@
+import Prelude hiding ( fromIntegral ) renaming (map to mop)
+@
+should work just fine (even it {\em is} atrocious programming practice).
+
+There are some things you can do that will make GHC crash, e.g.,
+hiding a standard class:
+@
+import Prelude hiding ( Eq(..) )
+@
+Don't do that.
+
+\subsection{Turning on Glasgow extensions with @-fglasgow-exts@}
+
+If you turn on @-fglasgow-exts@, then all the primitive types and
+operations described herein are available.
+
+It is possible that some name conflicts between your code and the
+wired-in things might spring to life (though we doubt it...).
+Change your names :-)
+
+\subsection{@import PreludeGlaST@}
+
+@
+type ST s a = _ST s a  -- so you don't need -fglasgow-exts...
+@
+
+\subsection{@import PreludeGlaMisc@}
+
+\end{document}
+