\documentstyle[a4wide,grasp]{article}
\renewcommand{\textfraction}{0.1}
\renewcommand{\floatpagefraction}{0.9}
\renewcommand{\dblfloatpagefraction}{0.9}

\sloppy
\renewcommand{\today}{July 1996}

\begin{document}

\title{GHC prelude: types and operations}
\author{Simon L Peyton Jones \and John Launchbury \and Will Partain}

\maketitle
\tableofcontents

This ``state interface document'' corresponds to Glasgow Haskell
version~2.01.

\section{Really primitive stuff}

This section defines all the types which are primitive in Glasgow Haskell, and the
operations provided for them.

A primitive type is one which cannot be defined in Haskell, and which is 
therefore built into the language and compiler.
Primitive types are always unboxed; that is, a value of primitive type cannot be 
bottom.

Primitive values are often represented by a simple bit-pattern, such as @Int#@, 
@Float#@, @Double#@.  But this is not necessarily the case: a primitive value 
might be represented by a pointer to a heap-allocated object.  Examples include 
@Array#@, the type of primitive arrays.  You might think this odd: doesn't being 
heap-allocated mean that it has a box?  No, it does not.  A primitive array is 
heap-allocated because it is too big a value to fit in a register, and would be 
too expensive to copy around; in a sense, it is accidental that it is represented 
by a pointer.  If a pointer represents a primitive value, then it really does 
point to that value: no unevaluated thunks, no indirections...nothing can be at 
the other end of the pointer than the primitive value.

This section also describes a few non-primitive types, which are needed 
to express the result types of some primitive operations.

\subsection{Character and numeric types}

There are the following obvious primitive types:
@
type Char#
type Int#	-- see also Word# and Addr#, later
type Float#
type Double#
@
If you want to know their exact equivalents in C, see
@ghc/includes/StgTypes.lh@ in the GHC source.

Literals for these types may be written as follows:
@
1#		an Int#
1.2#		a Float#
1.34##		a Double#
'a'#		a Char#; for weird characters, use '\o<octal>'#
"a"#		an Addr# (a `char *')
@

\subsubsection{Comparison operations}
@
{gt,ge,eq,ne,lt,le}Char# :: Char# -> Char# -> Bool
    -- ditto for Int#, Word#, Float#, Double#, and Addr#
@

\subsubsection{Unboxed-character operations}
@
ord# :: Char# -> Int#
chr# :: Int# -> Char#
@

\subsubsection{Unboxed-@Int@ operations}
@
{plus,minus,times,quot,div,rem}Int# :: Int# -> Int# -> Int#
negateInt# :: Int# -> Int#
@
NB: No error/overflow checking!

\subsubsection{Unboxed-@Double@ and @Float@ operations}
@
{plus,minus,times,divide}Double# :: Double# -> Double# -> Double#
negateDouble# :: Double# -> Double#

float2Int#	:: Double# -> Int#   -- just a cast, no checking!
int2Double#	:: Int# -> Double#

expDouble#	:: Double# -> Double#
logDouble#	:: Double# -> Double#
sqrtDouble#	:: Double# -> Double#
sinDouble#	:: Double# -> Double#
cosDouble#	:: Double# -> Double#
tanDouble#	:: Double# -> Double#
asinDouble#	:: Double# -> Double#
acosDouble#	:: Double# -> Double#
atanDouble#	:: Double# -> Double#
sinhDouble#	:: Double# -> Double#
coshDouble#	:: Double# -> Double#
tanhDouble#	:: Double# -> Double#
powerDouble#	:: Double# -> Double# -> Double#
@
There's an exactly-matching set of unboxed-@Float@ ops; replace
@Double#@ with @Float#@ in the list above.  There are two
coercion functions for @Float#@/@Double#@:
@
float2Double#	:: Float# -> Double#
double2Float#	:: Double# -> Float#
@
The primitive versions of @encodeDouble@/@decodeDouble@:
@
encodeDouble#	:: Int# -> Int# -> ByteArray#	-- Integer mantissa
		-> Int#				-- Int exponent
		-> Double#

decodeDouble#	:: Double#
		-> GHCbase.ReturnIntAndGMP
@
(And the same for @Float#@s.)

\subsection{Operations on/for @Integers@ (interface to GMP)}
\label{sect:horrid-Integer-pairing-types}

We implement @Integers@ (arbitrary-precision integers) using the GNU
multiple-precision (GMP) package.

NB: some of this might change if we upgrade to using GMP~2.x.

The data type for @Integer@ must mirror that for @MP_INT@ in @gmp.h@
(see @gmp.info@).  It comes out as:
@
data Integer = J# Int# Int# ByteArray#
@
So, @Integer@ is really just a ``pairing'' type for a particular
collection of primitive types.

The operations in the GMP return other combinations of
GMP-plus-something, so we need ``pairing'' types for those, too:
@
data Return2GMPs     = Return2GMPs Int# Int# ByteArray# Int# Int# ByteArray#
data ReturnIntAndGMP = ReturnIntAndGMP Int# Int# Int# ByteArray#

-- ????? something to return a string of bytes (in the heap?)
@
The primitive ops to support @Integers@ use the ``pieces'' of the
representation, and are as follows:
@
negateInteger#	:: Int# -> Int# -> ByteArray# -> Integer

{plus,minus,times}Integer# :: Int# -> Int# -> ByteArray#
			   -> Int# -> Int# -> ByteArray#
			   -> Integer

cmpInteger# :: Int# -> Int# -> ByteArray#
	    -> Int# -> Int# -> ByteArray#
	    -> Int# -- -1 for <; 0 for ==; +1 for >

divModInteger#, quotRemInteger#
	:: Int# -> Int# -> ByteArray#
	-> Int# -> Int# -> ByteArray#
	-> GHCbase.Return2GMPs

integer2Int# :: Int# -> Int# -> ByteArray#
	     -> Int# 

int2Integer#  :: Int#  -> Integer -- NB: no error-checking on these two!
word2Integer# :: Word# -> Integer

addr2Integer# :: Addr# -> Integer
	-- the Addr# is taken to be a `char *' string
	-- to be converted into an Integer
@


\subsection{Words and addresses}

A @Word#@ is used for bit-twiddling operations.  It is the same size as
an @Int#@, but has no sign nor any arithmetic operations.
@
type Word#	-- Same size/etc as Int# but *unsigned*
type Addr#	-- A pointer from outside the "Haskell world" (from C, probably);
		-- described under "arrays"
@
@Word#@s and @Addr#@s have the usual comparison operations.
Other unboxed-@Word@ ops (bit-twiddling and coercions):
@
and#, or# :: Word# -> Word# -> Word#

not# :: Word# -> Word#

shiftL#, shiftRA#, shiftRL# :: Word# -> Int# -> Word#
	-- shift left, right arithmetic, right logical

iShiftL#, iShiftRA#, iShiftRL# :: Int# -> Int# -> Int#
	-- same shift ops, but on Int#s

int2Word#	:: Int#  -> Word# -- just a cast, really
word2Int#	:: Word# -> Int#
@

Unboxed-@Addr@ ops (C casts, really):
@
int2Addr#	:: Int#  -> Addr#
addr2Int#	:: Addr# -> Int#
@
Operations for indexing off of C pointers (@Addr#@s) to snatch values
are listed under ``arrays''.

\subsection{Arrays}

The type @Array# elt@ is the type of primitive,
unboxed arrays of values of type @elt@.  
@
type Array# elt
@
@Array#@ is more primitive than a Haskell
array --- indeed, Haskell arrays are implemented using @Array#@ ---
in that an @Array#@ is indexed only by @Int#@s, starting at zero.  It is also
more primitive by virtue of being unboxed.  That doesn't mean that it isn't
a heap-allocated object --- of course, it is.  Rather, being unboxed means
that it is represented by a pointer to the array itself, and not to a thunk
which will evaluate to the array (or to bottom).
The components of an @Array#@ are themselves boxed.

The type @ByteArray#@ is similar to @Array#@, except that it contains
just a string of (non-pointer) bytes.
@
type ByteArray#
@
Arrays of these types are useful when a Haskell program wishes to
construct a value to pass to a C procedure.  It is also possible to
use them to build (say) arrays of unboxed characters for internal use
in a Haskell program.  Given these uses, @ByteArray#@ is deliberately
a bit vague about the type of its components.  Operations are provided
to extract values of type @Char#@, @Int#@, @Float#@, @Double#@, and
@Addr#@ from arbitrary offsets within a @ByteArray#@.  (For type @Foo#@,
the $i$th offset gets you the $i$th @Foo#@, not the @Foo#@ at byte-position $i$.  Mumble.)
(If you want a @Word#@, grab an @Int#@, then coerce it.)

Lastly, we have static byte-arrays, of type @Addr#@ [mentioned
previously].  (Remember the duality between arrays and pointers in C.)
Arrays of this types are represented by a pointer to an array in the
world outside Haskell, so this pointer is not followed by the garbage
collector.  In other respects they are just like @ByteArray#@.  They
are only needed in order to pass values from C to Haskell.

\subsubsection{Reading and writing.}

Primitive arrays are linear, and indexed starting at zero.

The size and indices of a @ByteArray#@, @Addr#@, and
@MutableByteArray#@ are all in bytes.  It's up to the program to
calculate the correct byte offset from the start of the array.  This
allows a @ByteArray#@ to contain a mixture of values of different
type, which is often needed when preparing data for and unpicking
results from C.  (Umm... not true of indices... WDP 95/09)

{\em Should we provide some @sizeOfDouble#@ constants?}

Out-of-range errors on indexing should be caught by the code which
uses the primitive operation; the primitive operations themselves do
{\em not} check for out-of-range indexes. The intention is that the
primitive ops compile to one machine instruction or thereabouts.

We use the terms ``reading'' and ``writing'' to refer to accessing {\em mutable} 
arrays (see Section~\ref{sect:mutable}), and ``indexing'' 
to refer to reading a value from an {\em immutable} 
array.

If you want to read/write a @Word#@, read an @Int#@ and coerce.

Immutable byte arrays are straightforward to index (all indices in bytes):
@
indexCharArray#   :: ByteArray# -> Int# -> Char#
indexIntArray#    :: ByteArray# -> Int# -> Int#
indexAddrArray#   :: ByteArray# -> Int# -> Addr#
indexFloatArray#  :: ByteArray# -> Int# -> Float#
indexDoubleArray# :: ByteArray# -> Int# -> Double#

indexCharOffAddr#   :: Addr# -> Int# -> Char#
indexIntOffAddr#    :: Addr# -> Int# -> Int#
indexFloatOffAddr#  :: Addr# -> Int# -> Float#
indexDoubleOffAddr# :: Addr# -> Int# -> Double#
indexAddrOffAddr#   :: Addr# -> Int# -> Addr#	-- Get an Addr# from an Addr# offset
@
The last of these, @indexAddrOffAddr#@, extracts an @Addr#@ using an offset
from another @Addr#@, thereby providing the ability to follow a chain of
C pointers.

Something a bit more interesting goes on when indexing arrays of boxed
objects, because the result is simply the boxed object. So presumably
it should be entered --- we never usually return an unevaluated
object!  This is a pain: primitive ops aren't supposed to do
complicated things like enter objects.  The current solution is to
return a lifted value, but I don't like it!
@
indexArray#       :: Array# elt -> Int# -> GHCbase.Lift elt  -- Yuk!
@

\subsubsection{The state type}

The primitive type @State#@ represents the state of a state transformer.
It is parameterised on the desired type of state, which serves to keep
states from distinct threads distinct from one another.  But the {\em only}
effect of this parameterisation is in the type system: all values of type
@State#@ are represented in the same way.  Indeed, they are all 
represented by nothing at all!  The code generator ``knows'' to generate no 
code, and allocate no registers etc, for primitive states.
@
type State# s
@

The type @GHCbuiltins.RealWorld@ is truly opaque: there are no values defined
of this type, and no operations over it.  It is ``primitive'' in that
sense---but it is {\em not unboxed!} Its only role in life is to be the type
which distinguishes the @PrimIO@ state transformer (see
Section~\ref{sect:io-spec}).
@
data RealWorld
@

\subsubsection{States}

A single, primitive, value of type @State# RealWorld@ is provided.
@
realWorld# :: State# GHCbuiltins.RealWorld
@
(Note: in the compiler, not a @PrimOp@; just a mucho magic @Id@.)

\subsection{State pairing types}
\label{sect:horrid-pairing-types}

This subsection defines some types which, while they aren't quite primitive 
because we can define them in Haskell, are very nearly so.  They define 
constructors which pair a primitive state with a value of each primitive type.
They are required to express the result type of the primitive operations in the 
state monad.
@
data StateAndPtr#    s elt = StateAndPtr#    (State# s) elt 

data StateAndChar#   s     = StateAndChar#   (State# s) Char# 
data StateAndInt#    s     = StateAndInt#    (State# s) Int# 
data StateAndWord#   s     = StateAndWord#   (State# s) Word#
data StateAndFloat#  s     = StateAndFloat#  (State# s) Float# 
data StateAndDouble# s     = StateAndDouble# (State# s) Double#  
data StateAndAddr#   s     = StateAndAddr#   (State# s) Addr#

data StateAndStablePtr# s a = StateAndStablePtr#  (State# s) (StablePtr# a)
data StateAndForeignObj# s  = StateAndForeignObj# (State# s) ForeignObj#
data StateAndSynchVar#  s a = StateAndSynchVar#  (State# s) (SynchVar# a)

data StateAndArray#            s elt = StateAndArray#        (State# s) (Array# elt) 
data StateAndMutableArray#     s elt = StateAndMutableArray# (State# s) (MutableArray# s elt)  
data StateAndByteArray#        s = StateAndByteArray#        (State# s) ByteArray# 
data StateAndMutableByteArray# s = StateAndMutableByteArray# (State# s) (MutableByteArray# s)
@


\subsection{Mutable arrays}
\label{sect:mutable}

Corresponding to @Array#@ and @ByteArray#@,
we have the types of mutable versions of each.  
In each case, the representation is a pointer
to a suitable block of (mutable) heap-allocated storage.
@
type MutableArray# s elt
type MutableByteArray# s
@
\subsubsection{Allocation.}

Mutable arrays can be allocated.
Only pointer-arrays are initialised; arrays of non-pointers are filled
in by ``user code'' rather than by the array-allocation primitive.
Reason: only the pointer case has to worry about GC striking with a
partly-initialised array.
@
newArray#       :: Int# -> elt -> State# s -> StateAndMutableArray# s elt 

newCharArray#   :: Int# -> State# s -> StateAndMutableByteArray# s 
newIntArray#    :: Int# -> State# s -> StateAndMutableByteArray# s 
newAddrArray#   :: Int# -> State# s -> StateAndMutableByteArray# s 
newFloatArray#  :: Int# -> State# s -> StateAndMutableByteArray# s 
newDoubleArray# :: Int# -> State# s -> StateAndMutableByteArray# s 
@
The size of a @ByteArray#@ is given in bytes.

\subsubsection{Reading and writing}

%OLD: Remember, offsets in a @MutableByteArray#@ are in bytes.
@
readArray#       :: MutableArray# s elt -> Int# -> State# s -> StateAndPtr#    s elt
readCharArray#   :: MutableByteArray# s -> Int# -> State# s -> StateAndChar#   s
readIntArray#    :: MutableByteArray# s -> Int# -> State# s -> StateAndInt#    s
readAddrArray#	 :: MutableByteArray# s -> Int# -> State# s -> StateAndAddr#   s 
readFloatArray#  :: MutableByteArray# s -> Int# -> State# s -> StateAndFloat#  s 
readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndDouble# s 

writeArray#       :: MutableArray# s elt -> Int# -> elt     -> State# s -> State# s 
writeCharArray#   :: MutableByteArray# s -> Int# -> Char#   -> State# s -> State# s 
writeIntArray#    :: MutableByteArray# s -> Int# -> Int#    -> State# s -> State# s 
writeAddrArray#   :: MutableByteArray# s -> Int# -> Addr#   -> State# s -> State# s 
writeFloatArray#  :: MutableByteArray# s -> Int# -> Float#  -> State# s -> State# s 
writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s 
@

\subsubsection{Equality.}

One can take ``equality'' of mutable arrays.  What is compared is the
{\em name} or reference to the mutable array, not its contents.
@
sameMutableArray#     :: MutableArray# s elt -> MutableArray# s elt -> Bool
sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
@

\subsubsection{Freezing mutable arrays}

Only unsafe-freeze has a primitive.  (Safe freeze is done directly in Haskell 
by copying the array and then using @unsafeFreeze@.) 
@
unsafeFreezeArray#     :: MutableArray# s elt -> State# s -> StateAndArray#     s elt
unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> StateAndByteArray# s
@

\subsubsection{Stable pointers}

{\em Andy's comment.} {\bf Errors:} The following is not strictly true: the current
implementation is not as polymorphic as claimed.  The reason for this
is that the C programmer will have to use a different entry-routine
for each type of stable pointer.  At present, we only supply a very
limited number (1) of these routines.  It might be possible to
increase the range of these routines by providing general purpose
entry points to apply stable pointers to (stable pointers to)
arguments and to enter (stable pointers to) boxed primitive values.
{\em End of Andy's comment.}

A stable pointer is a name for a Haskell object which can be passed to the 
external world.  It is ``stable'' in the sense that the name does not change when 
the Haskell garbage collector runs --- in contrast to the address of the object 
which may well change.

The stable pointer type is parameterised by the type of the thing which is named.
@
type StablePtr# a
@
A stable pointer is represented by an index into the (static) 
@StablePointerTable@.  The Haskell garbage collector treats the 
@StablePointerTable@ as a source of roots for GC.

The @makeStablePointer@ function converts a value into a stable pointer.
It is part of the @PrimIO@ monad, because we want to be sure we don't
allocate one twice by accident, and then only free one of the copies.
@
makeStablePointer#  :: a -> State# RealWorld -> StateAndStablePtr# RealWorld a
freeStablePointer#  :: StablePtr# a -> State# RealWorld -> State# RealWorld
deRefStablePointer# :: StablePtr# a -> State# RealWorld -> StateAndPtr RealWorld a
@
There is also a C procedure @FreeStablePtr@ which frees a stable pointer.

%
% Rewritten and updated for MallocPtr++ -- 4/96 SOF
%
\subsubsection{Foreign objects}

A @ForeignObj@ is a reference to an object outside the Haskell
world (i.e., from the C world, or a reference to an object on another
machine completely.), where the Haskell world has been told ``Let me
know when you're finished with this ...''.

The @ForeignObj@ type is just a special @Addr#@ ({\em not} parameterised).
@
type ForeignObj#
@

A typical use of @ForeignObj@ is in constructing Haskell bindings
to external libraries. A good example is that of writing a binding to
an image-processing library (which was actually the main motivation
for implementing @ForeignObj@'s precursor, @MallocPtr@). The
images manipulated are not stored in the Haskell heap, either because
the library insist on allocating them internally or we (sensibly)
decide to spare the GC from having to heave heavy images around.

@
data Image = Image ForeignObj#

instance CCallable Image
@

The @ForeignObj#@ type is then used to refer to the externally
allocated image, and to acheive some type safety, the Haskell binding
defines the @Image@ data type. So, a value of type @ForeignObj#@ is
used to ``box'' up an external reference into a Haskell heap object
that we can then indirectly reference:

@
createImage :: (Int,Int) -> PrimIO Image
@

So far, this looks just like an @Addr#@ type, but @ForeignObj#@
offers a bit more, namely that we can specify a {\em finalisation
routine} to invoke when the @ForeignObj#@ is discarded by the
GC. The garbage collector invokes the finalisation routine associated
with the @ForeignObj#@, saying `` Thanks, I'm through with this
now..'' For the image-processing library, the finalisation routine could for
the images free up memory allocated for them. The finalisation routine has
currently to be written in C (the finalisation routine can in turn call on
@FreeStablePtr@ to deallocate a stable pointer.).

Associating a finalisation routine with an external object is done by 
@makeForeignObj#@:

@
makeForeignObj# :: Addr# -- foreign reference
                -> Addr# -- pointer to finalisation routine
		-> StateAndForeignObj# RealWorld ForeignObj#
@

(Implementation: a linked list of all @ForeignObj#@s is maintained to allow the
 garbage collector to detect when a @ForeignObj#@ becomes garbage.)

Like @Array@, @ForeignObj#@s are represented by heap objects.

{\bf ToDo:} Decide whether @FreeCHeapPointer@ is allowed to call on a
stable pointer. (I sincerely hope not since we will still be in the
GC at this point.)

\subsubsection{Synchronizing variables (I-vars, M-vars)}

ToDo ToDo ToDo

@
type SynchVar# s elt	-- primitive

newSynchVar#:: State# s -> StateAndSynchVar# s elt

takeMVar#   :: SynchVar# s elt -> State# s -> StateAndPtr# s elt
putMVar#    :: SynchVar# s elt -> State# s -> State# s

readIVar#   :: SynchVar# s elt -> State# s -> StateAndPtr# s elt
writeIVar#  :: SynchVar# s elt -> State# s -> State# s
@

\subsubsection{Controlling the garbage collector}

The C function {\tt PerformGC\/}, allows the C world to force Haskell
to do a garbage collection.  It can only be called while Haskell
is performing a C Call.

Note that this function can be used to define a Haskell IO operation
with the same effect:
@
>	performGCIO :: PrimIO ()
>	performGCIO = _ccall_gc_ PerformGC
@

{\bf ToDo:} Is there any need for abnormal/normal termination to force
a GC too?  Is there any need for a function that provides finer
control over GC: argument = amount of space required; result = amount
of space recovered.

\subsection{@spark#@ primitive operation (for parallel execution)}

{\em ToDo: say something}  It's used in the unfolding for @par@.

\subsection{The @errorIO#@ primitive operation}

The @errorIO#@ primitive takes an argument much like @PrimIO@.  It aborts execution of
the current program, and continues instead by performing the given @PrimIO@-like value
on the current state of the world.
@
errorIO# :: (State RealWorld -> ((), State RealWorld)) -> a
@

\subsection{C Calls}

{\bf ToDo:} current implementation has state variable as second
argument not last argument.

The @ccall#@ primitive can't be given an ordinary type, because it has
a variable number of arguments.  The nearest we can get is:
@
ccall# :: CRoutine -> a1# -> ... -> an# -> State# RealWorld -> StateAndR# RealWorld
@
where the type variables @a1#@\ldots@an#@ and @r#@ can be instantiated by any
primitive type, and @StateAndR#@ is the appropriate pairing type from 
Section~\ref{sect:horrid-pairing-types}.  The @CRoutine@ 
isn't a proper Haskell type at all; it just reminds us that @ccall#@ needs to 
know what C routine to call.

This notation is really short for a massive family of @ccall#@ primitives, one 
for each combination of types.  (Of course, the compiler simply remembers the 
types involved, and generates appropriate code when it finally spits out the C.)

Unlike all the other primitive operators, @ccall#@ is not bound to an in-scope 
identifier.  The only way it is possible to generate a @ccall#@ is via the 
@_ccall_@ construct.

All this applies equally to @casm#@:
@
casm#  :: CAsmString -> a1# -> ... -> an# -> State# RealWorld -> StateAndR# RealWorld
@

%------------------------------------------------------------
\section{Library stuff built with the Really Primitive Stuff}

\subsection{The state transformer monad}

\subsubsection{Types}

A state transformer is a function from a state to a pair of a result and a new 
state.  
@
newtype ST s a = ST (State s -> (a, State s))
@
The @ST@ type is {\em abstract}, so that the programmer cannot see its 
representation.  If he could, he could write bad things like:
@
bad :: ST s a
bad = ST $ \ s -> ...(f s)...(g s)...
@
Here, @s@ is duplicated, which would be bad news.

A state is represented by a primitive state value, of type @State# s@, 
wrapped up in a @State@ constructor.  The reason for boxing it in this
way is so that we can be strict or lazy in the state.  (Remember, all 
primitive types are unboxed, and hence can't be bottom; but types built
with @data@ are all boxed.)
@
data State s = S# (State# s)
@

\subsubsection{The state transformer combinators}

Now for the combinators, all of which live inside the @ST@
abstraction.  Notice that @returnST@ and @thenST@ are lazy in the
state.
@
returnST :: a -> ST s a
returnST a s = (a, s)

thenST :: ST s a -> (a -> ST s b) -> ST s b
thenST m k s = let (r,new_s) = m s
               in 
               k r new_s

fixST :: (a -> ST s a) -> ST s a
fixST k s = let ans = k r s
                (r,new_s) = ans
            in
            ans
@
The interesting one is, of course, @runST@.  We can't infer its type!
(It has a funny name because it must be wired into the compiler.)
@
-- runST :: forall a. (forall s. ST s a) -> a
runST m = case m (S# realWorld#) of
           (r,_) -> r
@

\subsubsection{Other useful combinators}

There are various other standard combinators, all defined in terms the
fundamental combinators above. The @seqST@ combinator is like
@thenST@, except that it discards the result of the first state
transformer:
@
seqST :: ST s a -> ST s b -> ST s b
seqST m1 m2 = m1 `thenST` (\_ -> m2)
@

We also have {\em strict} (... in the state...) variants of the
then/return combinators (same types as their pals):
@
returnStrictlyST a s@(S# _) = (a, s)

thenStrictlyST m k s@(S# _)
  = case (m s) of { (r, new_s@(S# _)) ->
    k r new_s }

seqStrictlyST m k = ... ditto, for seqST ...
@

The combinator @listST@ takes a list of state transformers, and
composes them in sequence, returning a list of their results:
@
listST :: [ST s a] -> ST s [a]
listST []     = returnST []
listST (m:ms) = m		`thenST` \ r ->
		listST ms	`thenST` \ rs ->
		returnST (r:rs)
@
The @mapST@ combinator ``lifts'' a function from a value to state
transformers to one which works over a list of values:
@
mapST :: (a -> ST s b) -> [a] -> ST s [b]
mapST f ms = listST (map f ms)
@
The @mapAndUnzipST@ combinator is similar to @mapST@, except that here the
function returns a pair:
@
mapAndUnzipST :: (a -> ST s (b,c)) -> [a] -> ST s ([b],[c])
mapAndUnzipST f (m:ms)
  = f m			`thenST` \ ( r1,  r2) ->
    mapAndUnzipST f ms	`thenST` \ (rs1, rs2) ->
    returnST (r1:rs1, r2:rs2)
@

\subsubsection{The @PrimIO@ monad}
\label{sect:io-spec}

The @PrimIO@ type is defined in as a state transformer which manipulates the 
@RealWorld@.
@
type PrimIO a = ST RealWorld a      -- Transparent
@
The @PrimIO@ type is an ordinary type synonym, transparent to the programmer.

The type @RealWorld@ and value @realWorld#@ do not need to be hidden (although 
there is no particular point in exposing them).  Even having a value of type 
@realWorld#@ does not compromise safety, since the type @ST@ is hidden. 

It is type-correct to use @returnST@ in an I/O context, but it is a
bit more efficient to use @returnPrimIO@.  The latter is strict in the
state, which propagates backwards to all the earlier combinators
(provided they are unfolded).  Why is it safe for @returnPrimIO@ to be
strict in the state?  Because every context in which an I/O state
transformer is used will certainly evaluate the resulting state; it is
the state of the real world!
@
returnPrimIO :: a -> PrimIO a
returnPrimIO a s@(S# _) -> (a, s)
@
We provide strict versions of the other combinators too.
@
thenPrimIO m k s = case m s of
		     (r,s) -> k r s
@
@fixPrimIO@ has to be lazy, though!
@
fixPrimIO  = fixST
@
The other combinators are just the same as before, but use the strict
@thenPrimIO@ and @returnPrimIO@ for efficiency.
@
foldrPrimIO f z []     = z
foldrPrimIO f z (m:ms) = foldrPrimIO f z ms `thenPrimIO` \ ms' ->
			 f m ms'

listPrimIO ms = foldrPrimIO (\ a xs -> a `thenPrimIO` \ x -> returnPrimIO (x : xs))
		(returnPrimIO []) ms

mapPrimIO f ms = listPrimIO (map f ms)

mapAndUnzipPrimIO f (m:ms)
  = f m			    `thenPrimIO` \ ( r1,  r2) ->
    mapAndUnzipPrimIO f ms  `thenPrimIO` \ (rs1, rs2) ->
    returnPrimIO (r1:rs1, r2:rs2)
@

\subsection{Arrays}

\subsubsection{Types}

@
data Array      ix elt = Array     (ix,ix) (Array# elt)
data ByteArray ix      = ByteArray (ix,ix) ByteArray#

data MutableArray     s ix elt = MutableArray     (ix,ix) (MutableArray# s elt)
data MutableByteArray s ix     = MutableByteArray (ix,ix) (MutableByteArray# s)
@

\subsubsection{Operations on immutable arrays}

Ordinary array indexing is straightforward.
@
(!) :: Ix ix => Array ix elt -> ix -> elt
@
QUESTIONs: should @ByteArray@s be indexed by Ints or ix?  With byte offsets
or sized ones? (sized ones [WDP])
@
indexCharArray   :: Ix ix => ByteArray ix -> ix -> Char
indexIntArray    :: Ix ix => ByteArray ix -> ix -> Int
indexAddrArray   :: Ix ix => ByteArray ix -> ix -> Addr
indexFloatArray  :: Ix ix => ByteArray ix -> ix -> Float
indexDoubleArray :: Ix ix => ByteArray ix -> ix -> Double
@
@Addr@s are indexed straightforwardly by @Int@s.  Unlike the primitive
operations, though, the offsets assume that the array consists entirely of the
type of value being indexed, and so there's an implicit multiplication by
the size of that value.  To access @Addr@s with mixed values requires
you to do a DIY job using the primitives.
@
indexAddrChar :: Addr -> Int -> Char
...etc...
indexStaticCharArray   :: Addr -> Int -> Char
indexStaticIntArray    :: Addr -> Int -> Int
indexStaticFloatArray  :: Addr -> Int -> Float
indexStaticDoubleArray :: Addr -> Int -> Double
indexStaticArray       :: Addr -> Int -> Addr
@

\subsubsection{Operations on mutable arrays}
@
newArray     :: Ix ix => (ix,ix) -> elt -> ST s (MutableArray s ix elt)
newCharArray :: Ix ix => (ix,ix) -> ST s (MutableByteArray s ix) 
...
@

@
readArray   :: Ix ix => MutableArray s ix elt -> ix -> ST s elt 
readCharArray   :: Ix ix => MutableByteArray s ix -> ix -> ST s Char 
...
@

@
writeArray  :: Ix ix => MutableArray s ix elt -> ix -> elt -> ST s () 
writeCharArray  :: Ix ix => MutableByteArray s ix -> ix -> Char -> ST s () 
...
@

@
freezeArray :: Ix ix => MutableArray s ix elt -> ST s (Array ix elt)
freezeCharArray :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
...
@

We have no need on one-function-per-type for unsafe freezing:
@
unsafeFreezeArray :: Ix ix => MutableArray s ix elt -> ST s (Array ix elt)  
unsafeFreezeByteArray :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
@

Sometimes we want to snaffle the bounds of one of these beasts:
@
boundsOfArray     :: Ix ix => MutableArray s ix elt -> (ix, ix)  
boundsOfByteArray :: Ix ix => MutableByteArray s ix -> (ix, ix)
@

Lastly, ``equality'':
@
sameMutableArray     :: MutableArray s ix elt -> MutableArray s ix elt -> Bool
sameMutableByteArray :: MutableByteArray s ix -> MutableByteArray s ix -> Bool
@


\subsection{Variables}

\subsubsection{Types}

Mutable variables are (for now anyway) implemented as arrays.  The @MutableVar@ type
is opaque, so we can change the implementation later if we want.
@
type MutableVar s a = MutableArray s Int a
@

\subsubsection{Operations}
@
newVar   :: a -> ST s (MutableVar s a)
readVar  :: MutableVar s a -> ST s a
writeVar :: MutableVar s a -> a -> ST s ()
sameVar  :: MutableVar s a -> MutableVar s a -> Bool
@

\subsection{Stable pointers}

Nothing exciting here, just simple boxing up.
@
data StablePtr a = StablePtr (StablePtr# a)

makeStablePointer :: a -> StablePtr a
freeStablePointer :: StablePtr a -> PrimIO ()
@

\subsection{Foreign objects}

Again, just boxing up.
@
data ForeignObj = ForeignObj ForeignObj#

makeForeignObj :: Addr -> Addr -> PrimIO ForeignObj
@

\subsection{C calls}

Everything in this section goes for @_casm_@ too.

{\em ToDo: mention @_ccall_gc_@ and @_casm_gc_@...}

The @_ccall_@ construct has the following form:
$$@_ccall_@~croutine~a_1~\ldots~a_n$$
This whole construct has type $@PrimIO@~res$.
The rules are these:
\begin{itemize}
\item
The first ``argument'', $croutine$, must be the literal name of a C procedure.
It cannot be a Haskell expression which evaluates to a string, etc; it must be 
simply the name of the procedure.
\item
The arguments $a_1, \ldots,a_n$ must be of {\em C-callable} type.
\item
The whole construct has type $@PrimIO@~ty$, where $ty$ is a {\em C-returnable} type.
\end{itemize}
A {\em boxed-primitive} type is both C-callable and C-returnable.
A boxed primitive type is anything declared by:
@
data T = C# t
@
where @t@ is a primitive type.  Note that
programmer-defined boxed-primitive types are perfectly OK:
@
data Widget = W# Int#
data Screen = S# CHeapPtr#
@

There are other types that can be passed to C (C-callable).  This
table summarises (including the standard boxed-primitive types):
@
Boxed	    	  Type of transferd   Corresp.     Which is
Type	    	  Prim. component     C type       *probably*...
------	    	  ---------------     ------       -------------
Char	    	  Char#		      StgChar	    unsigned char
Int 	    	  Int#		      StgInt	    long int
Word	    	  Word#		      StgWord	    unsigned long int
Addr	    	  Addr#		      StgAddr	    char *
Float	    	  Float#	      StgFloat      float
Double		  Double#	      StgDouble     double
		  		       
Array		  Array#	      StgArray      StgPtr
ByteArray	  ByteArray#	      StgByteArray  StgPtr
MutableArray	  MutableArray#	      StgArray      StgPtr
MutableByteArray  MutableByteArray#   StgByteArray  StgPtr
		  		      
State		  State#	      nothing!
		  		      
StablePtr	  StablePtr#	      StgStablePtr  StgPtr
ForeignObj	  ForeignObj#	      StgForeignObj StgPtr
@

All of the above are {\em C-returnable} except:
@
	Array, ByteArray, MutableArray, MutableByteArray
@

{\bf ToDo:} I'm pretty wary of @Array@ and @MutableArray@ being in
this list, and not too happy about @State@ [WDP].

{\bf ToDo:} Can code generator pass all the primitive types?  Should this be
extended to include {\tt Bool\/} (or any enumeration type?)

The type checker must be able to figure out just which of the C-callable/returnable
types is being used.  If it can't, you have to add type signatures. For example,
@
f x = _ccall_ foo x
@
is not good enough, because the compiler can't work out what type @x@ is, nor 
what type the @_ccall_@ returns.  You have to write, say:
@
f :: Int -> PrimIO Float
f x = _ccall_ foo x
@

\subsubsection{Implementation}

The desugarer unwraps the @_ccall_@ construct by inserting the necessary 
evaluations etc to unbox the arguments.  For example, the body of the definition 
of @f@ above would become:
@
        (\ s -> case x of { I# x# -> 
                case s of { S# s# ->
                case ccall# [Int#,Float#] x# s# of { StateAndFloat# f# new_s# ->
                (F# f#, S# new_s#)
                }}})
@
Notice that the state, too, is unboxed.

The code generator must deal specially with primitive objects which
are stored on the heap.

... details omitted ...

%
%More importantly, it must construct a C Heap Pointer heap-object after
%a @_ccall_@ which returns a @MallocPtr#@.
%

%--------------------------------------------------------
\section{Non-primitive stuff that must be wired into GHC}

@
data Char    = C# Char#
data Int     = I# Int#
data Word    = W# Word#
data Addr    = A# Addr#

data Float   = F# Float#
data Double  = D# Double#
data Integer = J# Int# Int# ByteArray#

-- and the other boxed-primitive types:
    Array, ByteArray, MutableArray, MutableByteArray,
    StablePtr, ForeignObj

data Bool     = False | True
data Ordering = LT | EQ | GT  -- used in derived comparisons

data List a = [] | a : (List a)
-- tuples...

data Lift a = Lift a    -- used Yukkily as described elsewhere

type String  = [Char]    -- convenience, only
@

%------------------------------------------------------------
\section{Programmer interface(s)}

\subsection{The bog-standard interface}

If you rely on the implicit @import Prelude@ that GHC normally does
for you, and if you don't use any weird flags (notably
@-fglasgow-exts@), and if you don't import one of the fairly-magic
@PreludeGla*@ interfaces, then GHC should work {\em exactly} as the
Haskell report says, and the full user namespaces should be available
to you.

\subsection{If you mess about with @import Prelude@...}

Innocent hiding, e.g.,
@
import Prelude hiding ( fromIntegral )
@
should work just fine.

There are some things you can do that will make GHC crash, e.g.,
hiding a standard class:
@
import Prelude hiding ( Eq(..) )
@
Don't do that.

\subsection{Turning on Glasgow extensions with @-fglasgow-exts@}

If you turn on @-fglasgow-exts@, then all the primitive types and
operations described herein are available.

It is possible that some name conflicts between your code and the
wired-in things might spring to life (though we doubt it...).
Change your names :-)

\end{document}