ghc/docs/state_interface/state-interface.verb

   1 \documentstyle[a4wide,grasp]{article}
   2 \renewcommand{\textfraction}{0.1}
   3 \renewcommand{\floatpagefraction}{0.9}
   4 \renewcommand{\dblfloatpagefraction}{0.9}
   5
   6 \sloppy
   7 \renewcommand{\today}{July 1996}
   8
   9 \begin{document}
  10
  11 \title{GHC prelude: types and operations}
  12 \author{Simon L Peyton Jones \and John Launchbury \and Will Partain}
  13
  14 \maketitle
  15 \tableofcontents
  16
  17 This ``state interface document'' corresponds to Glasgow Haskell
  18 version~2.01.
  19
  20 \section{Really primitive stuff}
  21
  22 This section defines all the types which are primitive in Glasgow Haskell, and the
  23 operations provided for them.
  24
  25 A primitive type is one which cannot be defined in Haskell, and which is
  26 therefore built into the language and compiler.
  27 Primitive types are always unboxed; that is, a value of primitive type cannot be
  28 bottom.
  29
  30 Primitive values are often represented by a simple bit-pattern, such as @Int#@,
  31 @Float#@, @Double#@.  But this is not necessarily the case: a primitive value
  32 might be represented by a pointer to a heap-allocated object.  Examples include
  33 @Array#@, the type of primitive arrays.  You might think this odd: doesn't being
  34 heap-allocated mean that it has a box?  No, it does not.  A primitive array is
  35 heap-allocated because it is too big a value to fit in a register, and would be
  36 too expensive to copy around; in a sense, it is accidental that it is represented
  37 by a pointer.  If a pointer represents a primitive value, then it really does
  38 point to that value: no unevaluated thunks, no indirections...nothing can be at
  39 the other end of the pointer than the primitive value.
  40
  41 This section also describes a few non-primitive types, which are needed
  42 to express the result types of some primitive operations.
  43
  44 \subsection{Character and numeric types}
  45
  46 There are the following obvious primitive types:
  47 @
  48 type Char#
  49 type Int#       -- see also Word# and Addr#, later
  50 type Float#
  51 type Double#
  52 @
  53 If you want to know their exact equivalents in C, see
  54 @ghc/includes/StgTypes.lh@ in the GHC source.
  55
  56 Literals for these types may be written as follows:
  57 @
  58 1#              an Int#
  59 1.2#            a Float#
  60 1.34##          a Double#
  61 'a'#            a Char#; for weird characters, use '\o<octal>'#
  62 "a"#            an Addr# (a `char *')
  63 @
  64
  65 \subsubsection{Comparison operations}
  66 @
  67 {gt,ge,eq,ne,lt,le}Char# :: Char# -> Char# -> Bool
  68     -- ditto for Int#, Word#, Float#, Double#, and Addr#
  69 @
  70
  71 \subsubsection{Unboxed-character operations}
  72 @
  73 ord# :: Char# -> Int#
  74 chr# :: Int# -> Char#
  75 @
  76
  77 \subsubsection{Unboxed-@Int@ operations}
  78 @
  79 {plus,minus,times,quot,div,rem}Int# :: Int# -> Int# -> Int#
  80 negateInt# :: Int# -> Int#
  81 @
  82 NB: No error/overflow checking!
  83
  84 \subsubsection{Unboxed-@Double@ and @Float@ operations}
  85 @
  86 {plus,minus,times,divide}Double# :: Double# -> Double# -> Double#
  87 negateDouble# :: Double# -> Double#
  88
  89 float2Int#      :: Double# -> Int#   -- just a cast, no checking!
  90 int2Double#     :: Int# -> Double#
  91
  92 expDouble#      :: Double# -> Double#
  93 logDouble#      :: Double# -> Double#
  94 sqrtDouble#     :: Double# -> Double#
  95 sinDouble#      :: Double# -> Double#
  96 cosDouble#      :: Double# -> Double#
  97 tanDouble#      :: Double# -> Double#
  98 asinDouble#     :: Double# -> Double#
  99 acosDouble#     :: Double# -> Double#
 100 atanDouble#     :: Double# -> Double#
 101 sinhDouble#     :: Double# -> Double#
 102 coshDouble#     :: Double# -> Double#
 103 tanhDouble#     :: Double# -> Double#
 104 powerDouble#    :: Double# -> Double# -> Double#
 105 @
 106 There's an exactly-matching set of unboxed-@Float@ ops; replace
 107 @Double#@ with @Float#@ in the list above.  There are two
 108 coercion functions for @Float#@/@Double#@:
 109 @
 110 float2Double#   :: Float# -> Double#
 111 double2Float#   :: Double# -> Float#
 112 @
 113 The primitive versions of @encodeDouble@/@decodeDouble@:
 114 @
 115 encodeDouble#   :: Int# -> Int# -> ByteArray#   -- Integer mantissa
 116                 -> Int#                         -- Int exponent
 117                 -> Double#
 118
 119 decodeDouble#   :: Double#
 120                 -> GHCbase.ReturnIntAndGMP
 121 @
 122 (And the same for @Float#@s.)
 123
 124 \subsection{Operations on/for @Integers@ (interface to GMP)}
 125 \label{sect:horrid-Integer-pairing-types}
 126
 127 We implement @Integers@ (arbitrary-precision integers) using the GNU
 128 multiple-precision (GMP) package.
 129
 130 NB: some of this might change if we upgrade to using GMP~2.x.
 131
 132 The data type for @Integer@ must mirror that for @MP_INT@ in @gmp.h@
 133 (see @gmp.info@).  It comes out as:
 134 @
 135 data Integer = J# Int# Int# ByteArray#
 136 @
 137 So, @Integer@ is really just a ``pairing'' type for a particular
 138 collection of primitive types.
 139
 140 The operations in the GMP return other combinations of
 141 GMP-plus-something, so we need ``pairing'' types for those, too:
 142 @
 143 data Return2GMPs     = Return2GMPs Int# Int# ByteArray# Int# Int# ByteArray#
 144 data ReturnIntAndGMP = ReturnIntAndGMP Int# Int# Int# ByteArray#
 145
 146 -- ????? something to return a string of bytes (in the heap?)
 147 @
 148 The primitive ops to support @Integers@ use the ``pieces'' of the
 149 representation, and are as follows:
 150 @
 151 negateInteger#  :: Int# -> Int# -> ByteArray# -> Integer
 152
 153 {plus,minus,times}Integer# :: Int# -> Int# -> ByteArray#
 154                            -> Int# -> Int# -> ByteArray#
 155                            -> Integer
 156
 157 cmpInteger# :: Int# -> Int# -> ByteArray#
 158             -> Int# -> Int# -> ByteArray#
 159             -> Int# -- -1 for <; 0 for ==; +1 for >
 160
 161 divModInteger#, quotRemInteger#
 162         :: Int# -> Int# -> ByteArray#
 163         -> Int# -> Int# -> ByteArray#
 164         -> GHCbase.Return2GMPs
 165
 166 integer2Int# :: Int# -> Int# -> ByteArray#
 167              -> Int#
 168
 169 int2Integer#  :: Int#  -> Integer -- NB: no error-checking on these two!
 170 word2Integer# :: Word# -> Integer
 171
 172 addr2Integer# :: Addr# -> Integer
 173         -- the Addr# is taken to be a `char *' string
 174         -- to be converted into an Integer
 175 @
 176
 177
 178 \subsection{Words and addresses}
 179
 180 A @Word#@ is used for bit-twiddling operations.  It is the same size as
 181 an @Int#@, but has no sign nor any arithmetic operations.
 182 @
 183 type Word#      -- Same size/etc as Int# but *unsigned*
 184 type Addr#      -- A pointer from outside the "Haskell world" (from C, probably);
 185                 -- described under "arrays"
 186 @
 187 @Word#@s and @Addr#@s have the usual comparison operations.
 188 Other unboxed-@Word@ ops (bit-twiddling and coercions):
 189 @
 190 and#, or# :: Word# -> Word# -> Word#
 191
 192 not# :: Word# -> Word#
 193
 194 shiftL#, shiftRA#, shiftRL# :: Word# -> Int# -> Word#
 195         -- shift left, right arithmetic, right logical
 196
 197 iShiftL#, iShiftRA#, iShiftRL# :: Int# -> Int# -> Int#
 198         -- same shift ops, but on Int#s
 199
 200 int2Word#       :: Int#  -> Word# -- just a cast, really
 201 word2Int#       :: Word# -> Int#
 202 @
 203
 204 Unboxed-@Addr@ ops (C casts, really):
 205 @
 206 int2Addr#       :: Int#  -> Addr#
 207 addr2Int#       :: Addr# -> Int#
 208 @
 209 Operations for indexing off of C pointers (@Addr#@s) to snatch values
 210 are listed under ``arrays''.
 211
 212 \subsection{Arrays}
 213
 214 The type @Array# elt@ is the type of primitive,
 215 unboxed arrays of values of type @elt@.
 216 @
 217 type Array# elt
 218 @
 219 @Array#@ is more primitive than a Haskell
 220 array --- indeed, Haskell arrays are implemented using @Array#@ ---
 221 in that an @Array#@ is indexed only by @Int#@s, starting at zero.  It is also
 222 more primitive by virtue of being unboxed.  That doesn't mean that it isn't
 223 a heap-allocated object --- of course, it is.  Rather, being unboxed means
 224 that it is represented by a pointer to the array itself, and not to a thunk
 225 which will evaluate to the array (or to bottom).
 226 The components of an @Array#@ are themselves boxed.
 227
 228 The type @ByteArray#@ is similar to @Array#@, except that it contains
 229 just a string of (non-pointer) bytes.
 230 @
 231 type ByteArray#
 232 @
 233 Arrays of these types are useful when a Haskell program wishes to
 234 construct a value to pass to a C procedure.  It is also possible to
 235 use them to build (say) arrays of unboxed characters for internal use
 236 in a Haskell program.  Given these uses, @ByteArray#@ is deliberately
 237 a bit vague about the type of its components.  Operations are provided
 238 to extract values of type @Char#@, @Int#@, @Float#@, @Double#@, and
 239 @Addr#@ from arbitrary offsets within a @ByteArray#@.  (For type @Foo#@,
 240 the $i$th offset gets you the $i$th @Foo#@, not the @Foo#@ at byte-position $i$.  Mumble.)
 241 (If you want a @Word#@, grab an @Int#@, then coerce it.)
 242
 243 Lastly, we have static byte-arrays, of type @Addr#@ [mentioned
 244 previously].  (Remember the duality between arrays and pointers in C.)
 245 Arrays of this types are represented by a pointer to an array in the
 246 world outside Haskell, so this pointer is not followed by the garbage
 247 collector.  In other respects they are just like @ByteArray#@.  They
 248 are only needed in order to pass values from C to Haskell.
 249
 250 \subsubsection{Reading and writing.}
 251
 252 Primitive arrays are linear, and indexed starting at zero.
 253
 254 The size and indices of a @ByteArray#@, @Addr#@, and
 255 @MutableByteArray#@ are all in bytes.  It's up to the program to
 256 calculate the correct byte offset from the start of the array.  This
 257 allows a @ByteArray#@ to contain a mixture of values of different
 258 type, which is often needed when preparing data for and unpicking
 259 results from C.  (Umm... not true of indices... WDP 95/09)
 260
 261 {\em Should we provide some @sizeOfDouble#@ constants?}
 262
 263 Out-of-range errors on indexing should be caught by the code which
 264 uses the primitive operation; the primitive operations themselves do
 265 {\em not} check for out-of-range indexes. The intention is that the
 266 primitive ops compile to one machine instruction or thereabouts.
 267
 268 We use the terms ``reading'' and ``writing'' to refer to accessing {\em mutable}
 269 arrays (see Section~\ref{sect:mutable}), and ``indexing''
 270 to refer to reading a value from an {\em immutable}
 271 array.
 272
 273 If you want to read/write a @Word#@, read an @Int#@ and coerce.
 274
 275 Immutable byte arrays are straightforward to index (all indices in bytes):
 276 @
 277 indexCharArray#   :: ByteArray# -> Int# -> Char#
 278 indexIntArray#    :: ByteArray# -> Int# -> Int#
 279 indexAddrArray#   :: ByteArray# -> Int# -> Addr#
 280 indexFloatArray#  :: ByteArray# -> Int# -> Float#
 281 indexDoubleArray# :: ByteArray# -> Int# -> Double#
 282
 283 indexCharOffAddr#   :: Addr# -> Int# -> Char#
 284 indexIntOffAddr#    :: Addr# -> Int# -> Int#
 285 indexFloatOffAddr#  :: Addr# -> Int# -> Float#
 286 indexDoubleOffAddr# :: Addr# -> Int# -> Double#
 287 indexAddrOffAddr#   :: Addr# -> Int# -> Addr#   -- Get an Addr# from an Addr# offset
 288 @
 289 The last of these, @indexAddrOffAddr#@, extracts an @Addr#@ using an offset
 290 from another @Addr#@, thereby providing the ability to follow a chain of
 291 C pointers.
 292
 293 Something a bit more interesting goes on when indexing arrays of boxed
 294 objects, because the result is simply the boxed object. So presumably
 295 it should be entered --- we never usually return an unevaluated
 296 object!  This is a pain: primitive ops aren't supposed to do
 297 complicated things like enter objects.  The current solution is to
 298 return a lifted value, but I don't like it!
 299 @
 300 indexArray#       :: Array# elt -> Int# -> GHCbase.Lift elt  -- Yuk!
 301 @
 302
 303 \subsubsection{The state type}
 304
 305 The primitive type @State#@ represents the state of a state transformer.
 306 It is parameterised on the desired type of state, which serves to keep
 307 states from distinct threads distinct from one another.  But the {\em only}
 308 effect of this parameterisation is in the type system: all values of type
 309 @State#@ are represented in the same way.  Indeed, they are all
 310 represented by nothing at all!  The code generator ``knows'' to generate no
 311 code, and allocate no registers etc, for primitive states.
 312 @
 313 type State# s
 314 @
 315
 316 The type @GHCbuiltins.RealWorld@ is truly opaque: there are no values defined
 317 of this type, and no operations over it.  It is ``primitive'' in that
 318 sense---but it is {\em not unboxed!} Its only role in life is to be the type
 319 which distinguishes the @PrimIO@ state transformer (see
 320 Section~\ref{sect:io-spec}).
 321 @
 322 data RealWorld
 323 @
 324
 325 \subsubsection{States}
 326
 327 A single, primitive, value of type @State# RealWorld@ is provided.
 328 @
 329 realWorld# :: State# GHCbuiltins.RealWorld
 330 @
 331 (Note: in the compiler, not a @PrimOp@; just a mucho magic @Id@.)
 332
 333 \subsection{State pairing types}
 334 \label{sect:horrid-pairing-types}
 335
 336 This subsection defines some types which, while they aren't quite primitive
 337 because we can define them in Haskell, are very nearly so.  They define
 338 constructors which pair a primitive state with a value of each primitive type.
 339 They are required to express the result type of the primitive operations in the
 340 state monad.
 341 @
 342 data StateAndPtr#    s elt = StateAndPtr#    (State# s) elt
 343
 344 data StateAndChar#   s     = StateAndChar#   (State# s) Char#
 345 data StateAndInt#    s     = StateAndInt#    (State# s) Int#
 346 data StateAndWord#   s     = StateAndWord#   (State# s) Word#
 347 data StateAndFloat#  s     = StateAndFloat#  (State# s) Float#
 348 data StateAndDouble# s     = StateAndDouble# (State# s) Double#
 349 data StateAndAddr#   s     = StateAndAddr#   (State# s) Addr#
 350
 351 data StateAndStablePtr# s a = StateAndStablePtr#  (State# s) (StablePtr# a)
 352 data StateAndForeignObj# s  = StateAndForeignObj# (State# s) ForeignObj#
 353 data StateAndSynchVar#  s a = StateAndSynchVar#  (State# s) (SynchVar# a)
 354
 355 data StateAndArray#            s elt = StateAndArray#        (State# s) (Array# elt)
 356 data StateAndMutableArray#     s elt = StateAndMutableArray# (State# s) (MutableArray# s elt)
 357 data StateAndByteArray#        s = StateAndByteArray#        (State# s) ByteArray#
 358 data StateAndMutableByteArray# s = StateAndMutableByteArray# (State# s) (MutableByteArray# s)
 359 @
 360
 361
 362 \subsection{Mutable arrays}
 363 \label{sect:mutable}
 364
 365 Corresponding to @Array#@ and @ByteArray#@,
 366 we have the types of mutable versions of each.
 367 In each case, the representation is a pointer
 368 to a suitable block of (mutable) heap-allocated storage.
 369 @
 370 type MutableArray# s elt
 371 type MutableByteArray# s
 372 @
 373 \subsubsection{Allocation.}
 374
 375 Mutable arrays can be allocated.
 376 Only pointer-arrays are initialised; arrays of non-pointers are filled
 377 in by ``user code'' rather than by the array-allocation primitive.
 378 Reason: only the pointer case has to worry about GC striking with a
 379 partly-initialised array.
 380 @
 381 newArray#       :: Int# -> elt -> State# s -> StateAndMutableArray# s elt
 382
 383 newCharArray#   :: Int# -> State# s -> StateAndMutableByteArray# s
 384 newIntArray#    :: Int# -> State# s -> StateAndMutableByteArray# s
 385 newAddrArray#   :: Int# -> State# s -> StateAndMutableByteArray# s
 386 newFloatArray#  :: Int# -> State# s -> StateAndMutableByteArray# s
 387 newDoubleArray# :: Int# -> State# s -> StateAndMutableByteArray# s
 388 @
 389 The size of a @ByteArray#@ is given in bytes.
 390
 391 \subsubsection{Reading and writing}
 392
 393 %OLD: Remember, offsets in a @MutableByteArray#@ are in bytes.
 394 @
 395 readArray#       :: MutableArray# s elt -> Int# -> State# s -> StateAndPtr#    s elt
 396 readCharArray#   :: MutableByteArray# s -> Int# -> State# s -> StateAndChar#   s
 397 readIntArray#    :: MutableByteArray# s -> Int# -> State# s -> StateAndInt#    s
 398 readAddrArray#   :: MutableByteArray# s -> Int# -> State# s -> StateAndAddr#   s
 399 readFloatArray#  :: MutableByteArray# s -> Int# -> State# s -> StateAndFloat#  s
 400 readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndDouble# s
 401
 402 writeArray#       :: MutableArray# s elt -> Int# -> elt     -> State# s -> State# s
 403 writeCharArray#   :: MutableByteArray# s -> Int# -> Char#   -> State# s -> State# s
 404 writeIntArray#    :: MutableByteArray# s -> Int# -> Int#    -> State# s -> State# s
 405 writeAddrArray#   :: MutableByteArray# s -> Int# -> Addr#   -> State# s -> State# s
 406 writeFloatArray#  :: MutableByteArray# s -> Int# -> Float#  -> State# s -> State# s
 407 writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s
 408 @
 409
 410 \subsubsection{Equality.}
 411
 412 One can take ``equality'' of mutable arrays.  What is compared is the
 413 {\em name} or reference to the mutable array, not its contents.
 414 @
 415 sameMutableArray#     :: MutableArray# s elt -> MutableArray# s elt -> Bool
 416 sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
 417 @
 418
 419 \subsubsection{Freezing mutable arrays}
 420
 421 Only unsafe-freeze has a primitive.  (Safe freeze is done directly in Haskell
 422 by copying the array and then using @unsafeFreeze@.)
 423 @
 424 unsafeFreezeArray#     :: MutableArray# s elt -> State# s -> StateAndArray#     s elt
 425 unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> StateAndByteArray# s
 426 @
 427
 428 \subsubsection{Stable pointers}
 429
 430 {\em Andy's comment.} {\bf Errors:} The following is not strictly true: the current
 431 implementation is not as polymorphic as claimed.  The reason for this
 432 is that the C programmer will have to use a different entry-routine
 433 for each type of stable pointer.  At present, we only supply a very
 434 limited number (1) of these routines.  It might be possible to
 435 increase the range of these routines by providing general purpose
 436 entry points to apply stable pointers to (stable pointers to)
 437 arguments and to enter (stable pointers to) boxed primitive values.
 438 {\em End of Andy's comment.}
 439
 440 A stable pointer is a name for a Haskell object which can be passed to the
 441 external world.  It is ``stable'' in the sense that the name does not change when
 442 the Haskell garbage collector runs --- in contrast to the address of the object
 443 which may well change.
 444
 445 The stable pointer type is parameterised by the type of the thing which is named.
 446 @
 447 type StablePtr# a
 448 @
 449 A stable pointer is represented by an index into the (static)
 450 @StablePointerTable@.  The Haskell garbage collector treats the
 451 @StablePointerTable@ as a source of roots for GC.
 452
 453 The @makeStablePointer@ function converts a value into a stable pointer.
 454 It is part of the @PrimIO@ monad, because we want to be sure we don't
 455 allocate one twice by accident, and then only free one of the copies.
 456 @
 457 makeStablePointer#  :: a -> State# RealWorld -> StateAndStablePtr# RealWorld a
 458 freeStablePointer#  :: StablePtr# a -> State# RealWorld -> State# RealWorld
 459 deRefStablePointer# :: StablePtr# a -> State# RealWorld -> StateAndPtr RealWorld a
 460 @
 461 There is also a C procedure @FreeStablePtr@ which frees a stable pointer.
 462
 463 %
 464 % Rewritten and updated for MallocPtr++ -- 4/96 SOF
 465 %
 466 \subsubsection{Foreign objects}
 467
 468 A @ForeignObj@ is a reference to an object outside the Haskell
 469 world (i.e., from the C world, or a reference to an object on another
 470 machine completely.), where the Haskell world has been told ``Let me
 471 know when you're finished with this ...''.
 472
 473 The @ForeignObj@ type is just a special @Addr#@ ({\em not} parameterised).
 474 @
 475 type ForeignObj#
 476 @
 477
 478 A typical use of @ForeignObj@ is in constructing Haskell bindings
 479 to external libraries. A good example is that of writing a binding to
 480 an image-processing library (which was actually the main motivation
 481 for implementing @ForeignObj@'s precursor, @MallocPtr@). The
 482 images manipulated are not stored in the Haskell heap, either because
 483 the library insist on allocating them internally or we (sensibly)
 484 decide to spare the GC from having to heave heavy images around.
 485
 486 @
 487 data Image = Image ForeignObj#
 488
 489 instance CCallable Image
 490 @
 491
 492 The @ForeignObj#@ type is then used to refer to the externally
 493 allocated image, and to acheive some type safety, the Haskell binding
 494 defines the @Image@ data type. So, a value of type @ForeignObj#@ is
 495 used to ``box'' up an external reference into a Haskell heap object
 496 that we can then indirectly reference:
 497
 498 @
 499 createImage :: (Int,Int) -> PrimIO Image
 500 @
 501
 502 So far, this looks just like an @Addr#@ type, but @ForeignObj#@
 503 offers a bit more, namely that we can specify a {\em finalisation
 504 routine} to invoke when the @ForeignObj#@ is discarded by the
 505 GC. The garbage collector invokes the finalisation routine associated
 506 with the @ForeignObj#@, saying `` Thanks, I'm through with this
 507 now..'' For the image-processing library, the finalisation routine could for
 508 the images free up memory allocated for them. The finalisation routine has
 509 currently to be written in C (the finalisation routine can in turn call on
 510 @FreeStablePtr@ to deallocate a stable pointer.).
 511
 512 Associating a finalisation routine with an external object is done by
 513 @makeForeignObj#@:
 514
 515 @
 516 makeForeignObj# :: Addr# -- foreign reference
 517                 -> Addr# -- pointer to finalisation routine
 518                 -> StateAndForeignObj# RealWorld ForeignObj#
 519 @
 520
 521 (Implementation: a linked list of all @ForeignObj#@s is maintained to allow the
 522  garbage collector to detect when a @ForeignObj#@ becomes garbage.)
 523
 524 Like @Array@, @ForeignObj#@s are represented by heap objects.
 525
 526 {\bf ToDo:} Decide whether @FreeCHeapPointer@ is allowed to call on a
 527 stable pointer. (I sincerely hope not since we will still be in the
 528 GC at this point.)
 529
 530 \subsubsection{Synchronizing variables (I-vars, M-vars)}
 531
 532 ToDo ToDo ToDo
 533
 534 @
 535 type SynchVar# s elt    -- primitive
 536
 537 newSynchVar#:: State# s -> StateAndSynchVar# s elt
 538
 539 takeMVar#   :: SynchVar# s elt -> State# s -> StateAndPtr# s elt
 540 putMVar#    :: SynchVar# s elt -> State# s -> State# s
 541
 542 readIVar#   :: SynchVar# s elt -> State# s -> StateAndPtr# s elt
 543 writeIVar#  :: SynchVar# s elt -> State# s -> State# s
 544 @
 545
 546 \subsubsection{Controlling the garbage collector}
 547
 548 The C function {\tt PerformGC\/}, allows the C world to force Haskell
 549 to do a garbage collection.  It can only be called while Haskell
 550 is performing a C Call.
 551
 552 Note that this function can be used to define a Haskell IO operation
 553 with the same effect:
 554 @
 555 >       performGCIO :: PrimIO ()
 556 >       performGCIO = _ccall_gc_ PerformGC
 557 @
 558
 559 {\bf ToDo:} Is there any need for abnormal/normal termination to force
 560 a GC too?  Is there any need for a function that provides finer
 561 control over GC: argument = amount of space required; result = amount
 562 of space recovered.
 563
 564 \subsection{@spark#@ primitive operation (for parallel execution)}
 565
 566 {\em ToDo: say something}  It's used in the unfolding for @par@.
 567
 568 \subsection{The @errorIO#@ primitive operation}
 569
 570 The @errorIO#@ primitive takes an argument much like @PrimIO@.  It aborts execution of
 571 the current program, and continues instead by performing the given @PrimIO@-like value
 572 on the current state of the world.
 573 @
 574 errorIO# :: (State RealWorld -> ((), State RealWorld)) -> a
 575 @
 576
 577 \subsection{C Calls}
 578
 579 {\bf ToDo:} current implementation has state variable as second
 580 argument not last argument.
 581
 582 The @ccall#@ primitive can't be given an ordinary type, because it has
 583 a variable number of arguments.  The nearest we can get is:
 584 @
 585 ccall# :: CRoutine -> a1# -> ... -> an# -> State# RealWorld -> StateAndR# RealWorld
 586 @
 587 where the type variables @a1#@\ldots@an#@ and @r#@ can be instantiated by any
 588 primitive type, and @StateAndR#@ is the appropriate pairing type from
 589 Section~\ref{sect:horrid-pairing-types}.  The @CRoutine@
 590 isn't a proper Haskell type at all; it just reminds us that @ccall#@ needs to
 591 know what C routine to call.
 592
 593 This notation is really short for a massive family of @ccall#@ primitives, one
 594 for each combination of types.  (Of course, the compiler simply remembers the
 595 types involved, and generates appropriate code when it finally spits out the C.)
 596
 597 Unlike all the other primitive operators, @ccall#@ is not bound to an in-scope
 598 identifier.  The only way it is possible to generate a @ccall#@ is via the
 599 @_ccall_@ construct.
 600
 601 All this applies equally to @casm#@:
 602 @
 603 casm#  :: CAsmString -> a1# -> ... -> an# -> State# RealWorld -> StateAndR# RealWorld
 604 @
 605
 606 %------------------------------------------------------------
 607 \section{Library stuff built with the Really Primitive Stuff}
 608
 609 \subsection{The state transformer monad}
 610
 611 \subsubsection{Types}
 612
 613 A state transformer is a function from a state to a pair of a result and a new
 614 state.
 615 @
 616 newtype ST s a = ST (State s -> (a, State s))
 617 @
 618 The @ST@ type is {\em abstract}, so that the programmer cannot see its
 619 representation.  If he could, he could write bad things like:
 620 @
 621 bad :: ST s a
 622 bad = ST $ \ s -> ...(f s)...(g s)...
 623 @
 624 Here, @s@ is duplicated, which would be bad news.
 625
 626 A state is represented by a primitive state value, of type @State# s@,
 627 wrapped up in a @State@ constructor.  The reason for boxing it in this
 628 way is so that we can be strict or lazy in the state.  (Remember, all
 629 primitive types are unboxed, and hence can't be bottom; but types built
 630 with @data@ are all boxed.)
 631 @
 632 data State s = S# (State# s)
 633 @
 634
 635 \subsubsection{The state transformer combinators}
 636
 637 Now for the combinators, all of which live inside the @ST@
 638 abstraction.  Notice that @returnST@ and @thenST@ are lazy in the
 639 state.
 640 @
 641 returnST :: a -> ST s a
 642 returnST a s = (a, s)
 643
 644 thenST :: ST s a -> (a -> ST s b) -> ST s b
 645 thenST m k s = let (r,new_s) = m s
 646                in
 647                k r new_s
 648
 649 fixST :: (a -> ST s a) -> ST s a
 650 fixST k s = let ans = k r s
 651                 (r,new_s) = ans
 652             in
 653             ans
 654 @
 655 The interesting one is, of course, @runST@.  We can't infer its type!
 656 (It has a funny name because it must be wired into the compiler.)
 657 @
 658 -- runST :: forall a. (forall s. ST s a) -> a
 659 runST m = case m (S# realWorld#) of
 660            (r,_) -> r
 661 @
 662
 663 \subsubsection{Other useful combinators}
 664
 665 There are various other standard combinators, all defined in terms the
 666 fundamental combinators above. The @seqST@ combinator is like
 667 @thenST@, except that it discards the result of the first state
 668 transformer:
 669 @
 670 seqST :: ST s a -> ST s b -> ST s b
 671 seqST m1 m2 = m1 `thenST` (\_ -> m2)
 672 @
 673
 674 We also have {\em strict} (... in the state...) variants of the
 675 then/return combinators (same types as their pals):
 676 @
 677 returnStrictlyST a s@(S# _) = (a, s)
 678
 679 thenStrictlyST m k s@(S# _)
 680   = case (m s) of { (r, new_s@(S# _)) ->
 681     k r new_s }
 682
 683 seqStrictlyST m k = ... ditto, for seqST ...
 684 @
 685
 686 The combinator @listST@ takes a list of state transformers, and
 687 composes them in sequence, returning a list of their results:
 688 @
 689 listST :: [ST s a] -> ST s [a]
 690 listST []     = returnST []
 691 listST (m:ms) = m               `thenST` \ r ->
 692                 listST ms       `thenST` \ rs ->
 693                 returnST (r:rs)
 694 @
 695 The @mapST@ combinator ``lifts'' a function from a value to state
 696 transformers to one which works over a list of values:
 697 @
 698 mapST :: (a -> ST s b) -> [a] -> ST s [b]
 699 mapST f ms = listST (map f ms)
 700 @
 701 The @mapAndUnzipST@ combinator is similar to @mapST@, except that here the
 702 function returns a pair:
 703 @
 704 mapAndUnzipST :: (a -> ST s (b,c)) -> [a] -> ST s ([b],[c])
 705 mapAndUnzipST f (m:ms)
 706   = f m                 `thenST` \ ( r1,  r2) ->
 707     mapAndUnzipST f ms  `thenST` \ (rs1, rs2) ->
 708     returnST (r1:rs1, r2:rs2)
 709 @
 710
 711 \subsubsection{The @PrimIO@ monad}
 712 \label{sect:io-spec}
 713
 714 The @PrimIO@ type is defined in as a state transformer which manipulates the
 715 @RealWorld@.
 716 @
 717 type PrimIO a = ST RealWorld a      -- Transparent
 718 @
 719 The @PrimIO@ type is an ordinary type synonym, transparent to the programmer.
 720
 721 The type @RealWorld@ and value @realWorld#@ do not need to be hidden (although
 722 there is no particular point in exposing them).  Even having a value of type
 723 @realWorld#@ does not compromise safety, since the type @ST@ is hidden.
 724
 725 It is type-correct to use @returnST@ in an I/O context, but it is a
 726 bit more efficient to use @returnPrimIO@.  The latter is strict in the
 727 state, which propagates backwards to all the earlier combinators
 728 (provided they are unfolded).  Why is it safe for @returnPrimIO@ to be
 729 strict in the state?  Because every context in which an I/O state
 730 transformer is used will certainly evaluate the resulting state; it is
 731 the state of the real world!
 732 @
 733 returnPrimIO :: a -> PrimIO a
 734 returnPrimIO a s@(S# _) -> (a, s)
 735 @
 736 We provide strict versions of the other combinators too.
 737 @
 738 thenPrimIO m k s = case m s of
 739                      (r,s) -> k r s
 740 @
 741 @fixPrimIO@ has to be lazy, though!
 742 @
 743 fixPrimIO  = fixST
 744 @
 745 The other combinators are just the same as before, but use the strict
 746 @thenPrimIO@ and @returnPrimIO@ for efficiency.
 747 @
 748 foldrPrimIO f z []     = z
 749 foldrPrimIO f z (m:ms) = foldrPrimIO f z ms `thenPrimIO` \ ms' ->
 750                          f m ms'
 751
 752 listPrimIO ms = foldrPrimIO (\ a xs -> a `thenPrimIO` \ x -> returnPrimIO (x : xs))
 753                 (returnPrimIO []) ms
 754
 755 mapPrimIO f ms = listPrimIO (map f ms)
 756
 757 mapAndUnzipPrimIO f (m:ms)
 758   = f m                     `thenPrimIO` \ ( r1,  r2) ->
 759     mapAndUnzipPrimIO f ms  `thenPrimIO` \ (rs1, rs2) ->
 760     returnPrimIO (r1:rs1, r2:rs2)
 761 @
 762
 763 \subsection{Arrays}
 764
 765 \subsubsection{Types}
 766
 767 @
 768 data Array      ix elt = Array     (ix,ix) (Array# elt)
 769 data ByteArray ix      = ByteArray (ix,ix) ByteArray#
 770
 771 data MutableArray     s ix elt = MutableArray     (ix,ix) (MutableArray# s elt)
 772 data MutableByteArray s ix     = MutableByteArray (ix,ix) (MutableByteArray# s)
 773 @
 774
 775 \subsubsection{Operations on immutable arrays}
 776
 777 Ordinary array indexing is straightforward.
 778 @
 779 (!) :: Ix ix => Array ix elt -> ix -> elt
 780 @
 781 QUESTIONs: should @ByteArray@s be indexed by Ints or ix?  With byte offsets
 782 or sized ones? (sized ones [WDP])
 783 @
 784 indexCharArray   :: Ix ix => ByteArray ix -> ix -> Char
 785 indexIntArray    :: Ix ix => ByteArray ix -> ix -> Int
 786 indexAddrArray   :: Ix ix => ByteArray ix -> ix -> Addr
 787 indexFloatArray  :: Ix ix => ByteArray ix -> ix -> Float
 788 indexDoubleArray :: Ix ix => ByteArray ix -> ix -> Double
 789 @
 790 @Addr@s are indexed straightforwardly by @Int@s.  Unlike the primitive
 791 operations, though, the offsets assume that the array consists entirely of the
 792 type of value being indexed, and so there's an implicit multiplication by
 793 the size of that value.  To access @Addr@s with mixed values requires
 794 you to do a DIY job using the primitives.
 795 @
 796 indexAddrChar :: Addr -> Int -> Char
 797 ...etc...
 798 indexStaticCharArray   :: Addr -> Int -> Char
 799 indexStaticIntArray    :: Addr -> Int -> Int
 800 indexStaticFloatArray  :: Addr -> Int -> Float
 801 indexStaticDoubleArray :: Addr -> Int -> Double
 802 indexStaticArray       :: Addr -> Int -> Addr
 803 @
 804
 805 \subsubsection{Operations on mutable arrays}
 806 @
 807 newArray     :: Ix ix => (ix,ix) -> elt -> ST s (MutableArray s ix elt)
 808 newCharArray :: Ix ix => (ix,ix) -> ST s (MutableByteArray s ix)
 809 ...
 810 @
 811
 812 @
 813 readArray   :: Ix ix => MutableArray s ix elt -> ix -> ST s elt
 814 readCharArray   :: Ix ix => MutableByteArray s ix -> ix -> ST s Char
 815 ...
 816 @
 817
 818 @
 819 writeArray  :: Ix ix => MutableArray s ix elt -> ix -> elt -> ST s ()
 820 writeCharArray  :: Ix ix => MutableByteArray s ix -> ix -> Char -> ST s ()
 821 ...
 822 @
 823
 824 @
 825 freezeArray :: Ix ix => MutableArray s ix elt -> ST s (Array ix elt)
 826 freezeCharArray :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
 827 ...
 828 @
 829
 830 We have no need on one-function-per-type for unsafe freezing:
 831 @
 832 unsafeFreezeArray :: Ix ix => MutableArray s ix elt -> ST s (Array ix elt)
 833 unsafeFreezeByteArray :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
 834 @
 835
 836 Sometimes we want to snaffle the bounds of one of these beasts:
 837 @
 838 boundsOfArray     :: Ix ix => MutableArray s ix elt -> (ix, ix)
 839 boundsOfByteArray :: Ix ix => MutableByteArray s ix -> (ix, ix)
 840 @
 841
 842 Lastly, ``equality'':
 843 @
 844 sameMutableArray     :: MutableArray s ix elt -> MutableArray s ix elt -> Bool
 845 sameMutableByteArray :: MutableByteArray s ix -> MutableByteArray s ix -> Bool
 846 @
 847
 848
 849 \subsection{Variables}
 850
 851 \subsubsection{Types}
 852
 853 Mutable variables are (for now anyway) implemented as arrays.  The @MutableVar@ type
 854 is opaque, so we can change the implementation later if we want.
 855 @
 856 type MutableVar s a = MutableArray s Int a
 857 @
 858
 859 \subsubsection{Operations}
 860 @
 861 newVar   :: a -> ST s (MutableVar s a)
 862 readVar  :: MutableVar s a -> ST s a
 863 writeVar :: MutableVar s a -> a -> ST s ()
 864 sameVar  :: MutableVar s a -> MutableVar s a -> Bool
 865 @
 866
 867 \subsection{Stable pointers}
 868
 869 Nothing exciting here, just simple boxing up.
 870 @
 871 data StablePtr a = StablePtr (StablePtr# a)
 872
 873 makeStablePointer :: a -> StablePtr a
 874 freeStablePointer :: StablePtr a -> PrimIO ()
 875 @
 876
 877 \subsection{Foreign objects}
 878
 879 Again, just boxing up.
 880 @
 881 data ForeignObj = ForeignObj ForeignObj#
 882
 883 makeForeignObj :: Addr -> Addr -> PrimIO ForeignObj
 884 @
 885
 886 \subsection{C calls}
 887
 888 Everything in this section goes for @_casm_@ too.
 889
 890 {\em ToDo: mention @_ccall_gc_@ and @_casm_gc_@...}
 891
 892 The @_ccall_@ construct has the following form:
 893 $$@_ccall_@~croutine~a_1~\ldots~a_n$$
 894 This whole construct has type $@PrimIO@~res$.
 895 The rules are these:
 896 \begin{itemize}
 897 \item
 898 The first ``argument'', $croutine$, must be the literal name of a C procedure.
 899 It cannot be a Haskell expression which evaluates to a string, etc; it must be
 900 simply the name of the procedure.
 901 \item
 902 The arguments $a_1, \ldots,a_n$ must be of {\em C-callable} type.
 903 \item
 904 The whole construct has type $@PrimIO@~ty$, where $ty$ is a {\em C-returnable} type.
 905 \end{itemize}
 906 A {\em boxed-primitive} type is both C-callable and C-returnable.
 907 A boxed primitive type is anything declared by:
 908 @
 909 data T = C# t
 910 @
 911 where @t@ is a primitive type.  Note that
 912 programmer-defined boxed-primitive types are perfectly OK:
 913 @
 914 data Widget = W# Int#
 915 data Screen = S# CHeapPtr#
 916 @
 917
 918 There are other types that can be passed to C (C-callable).  This
 919 table summarises (including the standard boxed-primitive types):
 920 @
 921 Boxed             Type of transferd   Corresp.     Which is
 922 Type              Prim. component     C type       *probably*...
 923 ------            ---------------     ------       -------------
 924 Char              Char#               StgChar       unsigned char
 925 Int               Int#                StgInt        long int
 926 Word              Word#               StgWord       unsigned long int
 927 Addr              Addr#               StgAddr       char *
 928 Float             Float#              StgFloat      float
 929 Double            Double#             StgDouble     double
 930
 931 Array             Array#              StgArray      StgPtr
 932 ByteArray         ByteArray#          StgByteArray  StgPtr
 933 MutableArray      MutableArray#       StgArray      StgPtr
 934 MutableByteArray  MutableByteArray#   StgByteArray  StgPtr
 935
 936 State             State#              nothing!
 937
 938 StablePtr         StablePtr#          StgStablePtr  StgPtr
 939 ForeignObj        ForeignObj#         StgForeignObj StgPtr
 940 @
 941
 942 All of the above are {\em C-returnable} except:
 943 @
 944         Array, ByteArray, MutableArray, MutableByteArray
 945 @
 946
 947 {\bf ToDo:} I'm pretty wary of @Array@ and @MutableArray@ being in
 948 this list, and not too happy about @State@ [WDP].
 949
 950 {\bf ToDo:} Can code generator pass all the primitive types?  Should this be
 951 extended to include {\tt Bool\/} (or any enumeration type?)
 952
 953 The type checker must be able to figure out just which of the C-callable/returnable
 954 types is being used.  If it can't, you have to add type signatures. For example,
 955 @
 956 f x = _ccall_ foo x
 957 @
 958 is not good enough, because the compiler can't work out what type @x@ is, nor
 959 what type the @_ccall_@ returns.  You have to write, say:
 960 @
 961 f :: Int -> PrimIO Float
 962 f x = _ccall_ foo x
 963 @
 964
 965 \subsubsection{Implementation}
 966
 967 The desugarer unwraps the @_ccall_@ construct by inserting the necessary
 968 evaluations etc to unbox the arguments.  For example, the body of the definition
 969 of @f@ above would become:
 970 @
 971         (\ s -> case x of { I# x# ->
 972                 case s of { S# s# ->
 973                 case ccall# [Int#,Float#] x# s# of { StateAndFloat# f# new_s# ->
 974                 (F# f#, S# new_s#)
 975                 }}})
 976 @
 977 Notice that the state, too, is unboxed.
 978
 979 The code generator must deal specially with primitive objects which
 980 are stored on the heap.
 981
 982 ... details omitted ...
 983
 984 %
 985 %More importantly, it must construct a C Heap Pointer heap-object after
 986 %a @_ccall_@ which returns a @MallocPtr#@.
 987 %
 988
 989 %--------------------------------------------------------
 990 \section{Non-primitive stuff that must be wired into GHC}
 991
 992 @
 993 data Char    = C# Char#
 994 data Int     = I# Int#
 995 data Word    = W# Word#
 996 data Addr    = A# Addr#
 997
 998 data Float   = F# Float#
 999 data Double  = D# Double#
1000 data Integer = J# Int# Int# ByteArray#
1001
1002 -- and the other boxed-primitive types:
1003     Array, ByteArray, MutableArray, MutableByteArray,
1004     StablePtr, ForeignObj
1005
1006 data Bool     = False | True
1007 data Ordering = LT | EQ | GT  -- used in derived comparisons
1008
1009 data List a = [] | a : (List a)
1010 -- tuples...
1011
1012 data Lift a = Lift a    -- used Yukkily as described elsewhere
1013
1014 type String  = [Char]    -- convenience, only
1015 @
1016
1017 %------------------------------------------------------------
1018 \section{Programmer interface(s)}
1019
1020 \subsection{The bog-standard interface}
1021
1022 If you rely on the implicit @import Prelude@ that GHC normally does
1023 for you, and if you don't use any weird flags (notably
1024 @-fglasgow-exts@), and if you don't import one of the fairly-magic
1025 @PreludeGla*@ interfaces, then GHC should work {\em exactly} as the
1026 Haskell report says, and the full user namespaces should be available
1027 to you.
1028
1029 \subsection{If you mess about with @import Prelude@...}
1030
1031 Innocent hiding, e.g.,
1032 @
1033 import Prelude hiding ( fromIntegral )
1034 @
1035 should work just fine.
1036
1037 There are some things you can do that will make GHC crash, e.g.,
1038 hiding a standard class:
1039 @
1040 import Prelude hiding ( Eq(..) )
1041 @
1042 Don't do that.
1043
1044 \subsection{Turning on Glasgow extensions with @-fglasgow-exts@}
1045
1046 If you turn on @-fglasgow-exts@, then all the primitive types and
1047 operations described herein are available.
1048
1049 It is possible that some name conflicts between your code and the
1050 wired-in things might spring to life (though we doubt it...).
1051 Change your names :-)
1052
1053 \end{document}
1054