1 \documentstyle[a4wide,grasp]{article}
2 \renewcommand{\textfraction}{0.1}
3 \renewcommand{\floatpagefraction}{0.9}
4 \renewcommand{\dblfloatpagefraction}{0.9}
7 \renewcommand{\today}{March 1997}
11 \title{The GHC Prelude and Libraries}
12 \author{Simon L Peyton Jones \and Will Partain}
17 \section{Introduction}
19 This document describes GHC's prelude and libraries. The basic story is that of
20 the Haskell 1.3 Report and Libraries document (which we do not reproduce here),
21 but this document describes in addition:
23 \item GHC's additional non-standard libraries and types, such as state transformers,
24 packed strings, foreign objects, stable pointers, and so on.
26 \item GHC's primitive types and operations. The standard Haskell functions are implemented
27 on top of these, and it is sometimes useful to use them directly.
29 \item The organsiation of these libraries into directories.
34 The libraries are organised into the following three groups, each of which
35 is kept in a separate sub-directory of GHC's installed @lib/@ directory:
37 \item[@lib/required/@] These are the libraries {\em required} by the Haskell
38 definition. All are defined by the Haskell Report, or by the Haskell Libraries Report.
39 They currently comprise:
42 \item @List@: more functions on lists.
43 \item @Char@: more functions on characters.
44 \item @Maybe@: more functions on @Maybe@ types.
45 \item @Complex@: functions on complex numbers.
46 \item @Ratio@: functions on rational numbers.
47 \item @Monad@: functions on characters.
48 \item @Ix@: the @Ix@ class of indexing operations.
49 \item @Array@: monolithic arrays.
50 \item @IO@: basic input/output functions.
51 \item @Directory@: basic functions for accessing the file system.
52 \item @System@: basic operating-system interface functions.
55 \item[@lib/glaExts@] GHC extension libraries, currently comprising:
57 \item @PackedString@: functions that manipulate strings packed efficiently, one character per byte.
58 \item @ST@: the state transformer monad.
59 \item @Foreign@: types and operations for GHC's foreign-language interface.
62 \item[@lib/concurrent@] GHC extension libraries to support Concurrent Haskell, currently comprising:
64 \item @Concurrent.hs@: main library.
65 \item @Parallel.hs@: stuff for multi-processor parallelism.
73 \item[@lib/ghc@] These libraries are the pieces on which all the others are built.
74 They aren't typically imported by Joe Programmer, but there's nothing to stop you
75 doing so if you want. In general, the modules prefixed by @Prel@ are pieces that go
76 towards building @Prelude@.
79 \item @GHC@: this ``library'' brings into scope all the primitive types and operations, such as
80 @Int#@, @+#@, @encodeFloat#@, etc etc. It is unique in that there is no Haskell
81 source code for it. Details in Section \ref{sect:ghc}.
83 \item @PrelBase@: defines the basic types and classes without which very few Haskell programs can work.
84 The classes are: @Eq@, @Ord@, @Enum@, @Bounded@, @Num@, @Show@, @Eval@, @Monad@, @MonadZero@, @MonadPlus@.
85 The types are: list, @Bool@, @Char@, @Ordering@, @String@, @Int@, @Integer@, @Maybe@, @Either@.
87 \item @PrelTup@: defines tuples and their instances.
88 \item @PrelList@: defines most of the list operations required by @Prelude@. (A few are in @PrelBase@,
89 to avoid gratuitous mutual recursion between modules.)
91 \item @PrelNum@ defines: the numeric classes beyond @Num@ (namely @Real@, @Integral@,
92 @Fractional@, @Floating@, @RealFrac@, @RealFloat@; instances for appropriate classes
93 for @Int@ and @Integer@; the types @Float@, @Double@, and @Ratio@ and their instances.
95 \item @PrelRead@: the @Read@ class and all its instances. It's kept separate because many programs
96 don't use @Read@ at all, so we don't even want to link in its code.
98 \item @ConcBase@: substrate stuff for Concurrent Haskell.
100 \item @IOBase@: substrate stuff for the main I/O libraries.
101 \item @IOHandle@: large blob of code for doing I/O on handles.
102 \item @PrelIO@: the remaining small pieces to produce the I/O stuff needed by @Prelude@.
104 \item @STBase@: substrate stuff for @ST@.
105 \item @ArrBase@: substrate stuff for @Array@.
107 \item @GHCerr@: error reporting code, called from code that the compiler plants in compiled programs.
108 \item @GHCmain@: the definition of @mainPrimIO@, which is what {\em really} gets
109 called by the runtime system. @mainPrimIO@ in turn calls @main@.
113 The @...Base@ modules generally export representation information that
114 is hidden from the public interface. For example the module @STBase@
115 exports the type @ST@ including its representation, whereas the module
116 @ST@ exports @ST@ abstractly.
118 None of these modules are involved in any mutual recursion, with the sole exception that
119 many modules import @IOBase.error@.
121 \section{The module @GHC@: really primitive stuff}
124 This section defines all the types which are primitive in Glasgow Haskell, and the
125 operations provided for them.
127 A primitive type is one which cannot be defined in Haskell, and which
128 is therefore built into the language and compiler. Primitive types
129 are always unboxed; that is, a value of primitive type cannot be
132 Primitive values are often represented by a simple bit-pattern, such as @Int#@,
133 @Float#@, @Double#@. But this is not necessarily the case: a primitive value
134 might be represented by a pointer to a heap-allocated object. Examples include
135 @Array#@, the type of primitive arrays. You might think this odd: doesn't being
136 heap-allocated mean that it has a box? No, it does not. A primitive array is
137 heap-allocated because it is too big a value to fit in a register, and would be
138 too expensive to copy around; in a sense, it is accidental that it is represented
139 by a pointer. If a pointer represents a primitive value, then it really does
140 point to that value: no unevaluated thunks, no indirections...nothing can be at
141 the other end of the pointer than the primitive value.
143 This section also describes a few non-primitive types, which are needed
144 to express the result types of some primitive operations.
146 \subsection{Character and numeric types}
148 There are the following obvious primitive types:
151 type Int# -- see also Word# and Addr#, later
155 If you want to know their exact equivalents in C, see
156 @ghc/includes/StgTypes.lh@ in the GHC source.
158 Literals for these types may be written as follows:
163 'a'# a Char#; for weird characters, use '\o<octal>'#
164 "a"# an Addr# (a `char *')
167 \subsubsection{Comparison operations}
169 {gt,ge,eq,ne,lt,le}Char# :: Char# -> Char# -> Bool
170 -- ditto for Int#, Word#, Float#, Double#, and Addr#
173 \subsubsection{Unboxed-character operations}
175 ord# :: Char# -> Int#
176 chr# :: Int# -> Char#
179 \subsubsection{Unboxed-@Int@ operations}
181 {plus,minus,times,quot,div,rem}Int# :: Int# -> Int# -> Int#
182 negateInt# :: Int# -> Int#
184 NB: No error/overflow checking!
186 \subsubsection{Unboxed-@Double@ and @Float@ operations}
188 {plus,minus,times,divide}Double# :: Double# -> Double# -> Double#
189 negateDouble# :: Double# -> Double#
191 float2Int# :: Double# -> Int# -- just a cast, no checking!
192 int2Double# :: Int# -> Double#
194 expDouble# :: Double# -> Double#
195 logDouble# :: Double# -> Double#
196 sqrtDouble# :: Double# -> Double#
197 sinDouble# :: Double# -> Double#
198 cosDouble# :: Double# -> Double#
199 tanDouble# :: Double# -> Double#
200 asinDouble# :: Double# -> Double#
201 acosDouble# :: Double# -> Double#
202 atanDouble# :: Double# -> Double#
203 sinhDouble# :: Double# -> Double#
204 coshDouble# :: Double# -> Double#
205 tanhDouble# :: Double# -> Double#
206 powerDouble# :: Double# -> Double# -> Double#
208 There's an exactly-matching set of unboxed-@Float@ ops; replace
209 @Double#@ with @Float#@ in the list above. There are two
210 coercion functions for @Float#@/@Double#@:
212 float2Double# :: Float# -> Double#
213 double2Float# :: Double# -> Float#
215 The primitive versions of @encodeDouble@/@decodeDouble@:
217 encodeDouble# :: Int# -> Int# -> ByteArray# -- Integer mantissa
218 -> Int# -- Int exponent
221 decodeDouble# :: Double#
222 -> GHCbase.ReturnIntAndGMP
224 (And the same for @Float#@s.)
226 \subsection{Operations on/for @Integers@ (interface to GMP)}
227 \label{sect:horrid-Integer-pairing-types}
229 We implement @Integers@ (arbitrary-precision integers) using the GNU
230 multiple-precision (GMP) package.
232 NB: some of this might change if we upgrade to using GMP~2.x.
234 The data type for @Integer@ must mirror that for @MP_INT@ in @gmp.h@
235 (see @gmp.info@). It comes out as:
237 data Integer = J# Int# Int# ByteArray#
239 So, @Integer@ is really just a ``pairing'' type for a particular
240 collection of primitive types.
242 The operations in the GMP return other combinations of
243 GMP-plus-something, so we need ``pairing'' types for those, too:
245 data Return2GMPs = Return2GMPs Int# Int# ByteArray# Int# Int# ByteArray#
246 data ReturnIntAndGMP = ReturnIntAndGMP Int# Int# Int# ByteArray#
248 -- ????? something to return a string of bytes (in the heap?)
250 The primitive ops to support @Integers@ use the ``pieces'' of the
251 representation, and are as follows:
253 negateInteger# :: Int# -> Int# -> ByteArray# -> Integer
255 {plus,minus,times}Integer# :: Int# -> Int# -> ByteArray#
256 -> Int# -> Int# -> ByteArray#
259 cmpInteger# :: Int# -> Int# -> ByteArray#
260 -> Int# -> Int# -> ByteArray#
261 -> Int# -- -1 for <; 0 for ==; +1 for >
263 divModInteger#, quotRemInteger#
264 :: Int# -> Int# -> ByteArray#
265 -> Int# -> Int# -> ByteArray#
266 -> GHCbase.Return2GMPs
268 integer2Int# :: Int# -> Int# -> ByteArray#
271 int2Integer# :: Int# -> Integer -- NB: no error-checking on these two!
272 word2Integer# :: Word# -> Integer
274 addr2Integer# :: Addr# -> Integer
275 -- the Addr# is taken to be a `char *' string
276 -- to be converted into an Integer
280 \subsection{Words and addresses}
282 A @Word#@ is used for bit-twiddling operations. It is the same size as
283 an @Int#@, but has no sign nor any arithmetic operations.
285 type Word# -- Same size/etc as Int# but *unsigned*
286 type Addr# -- A pointer from outside the "Haskell world" (from C, probably);
287 -- described under "arrays"
289 @Word#@s and @Addr#@s have the usual comparison operations.
290 Other unboxed-@Word@ ops (bit-twiddling and coercions):
292 and#, or# :: Word# -> Word# -> Word#
294 not# :: Word# -> Word#
296 shiftL#, shiftRA#, shiftRL# :: Word# -> Int# -> Word#
297 -- shift left, right arithmetic, right logical
299 iShiftL#, iShiftRA#, iShiftRL# :: Int# -> Int# -> Int#
300 -- same shift ops, but on Int#s
302 int2Word# :: Int# -> Word# -- just a cast, really
303 word2Int# :: Word# -> Int#
306 Unboxed-@Addr@ ops (C casts, really):
308 int2Addr# :: Int# -> Addr#
309 addr2Int# :: Addr# -> Int#
311 Operations for indexing off of C pointers (@Addr#@s) to snatch values
312 are listed under ``arrays''.
316 The type @Array# elt@ is the type of primitive,
317 unboxed arrays of values of type @elt@.
321 @Array#@ is more primitive than a Haskell
322 array --- indeed, Haskell arrays are implemented using @Array#@ ---
323 in that an @Array#@ is indexed only by @Int#@s, starting at zero. It is also
324 more primitive by virtue of being unboxed. That doesn't mean that it isn't
325 a heap-allocated object --- of course, it is. Rather, being unboxed means
326 that it is represented by a pointer to the array itself, and not to a thunk
327 which will evaluate to the array (or to bottom).
328 The components of an @Array#@ are themselves boxed.
330 The type @ByteArray#@ is similar to @Array#@, except that it contains
331 just a string of (non-pointer) bytes.
335 Arrays of these types are useful when a Haskell program wishes to
336 construct a value to pass to a C procedure. It is also possible to
337 use them to build (say) arrays of unboxed characters for internal use
338 in a Haskell program. Given these uses, @ByteArray#@ is deliberately
339 a bit vague about the type of its components. Operations are provided
340 to extract values of type @Char#@, @Int#@, @Float#@, @Double#@, and
341 @Addr#@ from arbitrary offsets within a @ByteArray#@. (For type @Foo#@,
342 the $i$th offset gets you the $i$th @Foo#@, not the @Foo#@ at byte-position $i$. Mumble.)
343 (If you want a @Word#@, grab an @Int#@, then coerce it.)
345 Lastly, we have static byte-arrays, of type @Addr#@ [mentioned
346 previously]. (Remember the duality between arrays and pointers in C.)
347 Arrays of this types are represented by a pointer to an array in the
348 world outside Haskell, so this pointer is not followed by the garbage
349 collector. In other respects they are just like @ByteArray#@. They
350 are only needed in order to pass values from C to Haskell.
352 \subsubsection{Reading and writing.}
354 Primitive arrays are linear, and indexed starting at zero.
356 The size and indices of a @ByteArray#@, @Addr#@, and
357 @MutableByteArray#@ are all in bytes. It's up to the program to
358 calculate the correct byte offset from the start of the array. This
359 allows a @ByteArray#@ to contain a mixture of values of different
360 type, which is often needed when preparing data for and unpicking
361 results from C. (Umm... not true of indices... WDP 95/09)
363 {\em Should we provide some @sizeOfDouble#@ constants?}
365 Out-of-range errors on indexing should be caught by the code which
366 uses the primitive operation; the primitive operations themselves do
367 {\em not} check for out-of-range indexes. The intention is that the
368 primitive ops compile to one machine instruction or thereabouts.
370 We use the terms ``reading'' and ``writing'' to refer to accessing {\em mutable}
371 arrays (see Section~\ref{sect:mutable}), and ``indexing''
372 to refer to reading a value from an {\em immutable}
375 If you want to read/write a @Word#@, read an @Int#@ and coerce.
377 Immutable byte arrays are straightforward to index (all indices in bytes):
379 indexCharArray# :: ByteArray# -> Int# -> Char#
380 indexIntArray# :: ByteArray# -> Int# -> Int#
381 indexAddrArray# :: ByteArray# -> Int# -> Addr#
382 indexFloatArray# :: ByteArray# -> Int# -> Float#
383 indexDoubleArray# :: ByteArray# -> Int# -> Double#
385 indexCharOffAddr# :: Addr# -> Int# -> Char#
386 indexIntOffAddr# :: Addr# -> Int# -> Int#
387 indexFloatOffAddr# :: Addr# -> Int# -> Float#
388 indexDoubleOffAddr# :: Addr# -> Int# -> Double#
389 indexAddrOffAddr# :: Addr# -> Int# -> Addr# -- Get an Addr# from an Addr# offset
391 The last of these, @indexAddrOffAddr#@, extracts an @Addr#@ using an offset
392 from another @Addr#@, thereby providing the ability to follow a chain of
395 Something a bit more interesting goes on when indexing arrays of boxed
396 objects, because the result is simply the boxed object. So presumably
397 it should be entered --- we never usually return an unevaluated
398 object! This is a pain: primitive ops aren't supposed to do
399 complicated things like enter objects. The current solution is to
400 return a lifted value, but I don't like it!
402 indexArray# :: Array# elt -> Int# -> GHCbase.Lift elt -- Yuk!
405 \subsubsection{The state type}
407 The primitive type @State#@ represents the state of a state transformer.
408 It is parameterised on the desired type of state, which serves to keep
409 states from distinct threads distinct from one another. But the {\em only}
410 effect of this parameterisation is in the type system: all values of type
411 @State#@ are represented in the same way. Indeed, they are all
412 represented by nothing at all! The code generator ``knows'' to generate no
413 code, and allocate no registers etc, for primitive states.
418 The type @GHCbuiltins.RealWorld@ is truly opaque: there are no values defined
419 of this type, and no operations over it. It is ``primitive'' in that
420 sense---but it is {\em not unboxed!} Its only role in life is to be the type
421 which distinguishes the @PrimIO@ state transformer (see
422 Section~\ref{sect:io-spec}).
427 \subsubsection{States}
429 A single, primitive, value of type @State# RealWorld@ is provided.
431 realWorld# :: State# GHCbuiltins.RealWorld
433 (Note: in the compiler, not a @PrimOp@; just a mucho magic @Id@.)
435 \subsection{State pairing types}
436 \label{sect:horrid-pairing-types}
438 This subsection defines some types which, while they aren't quite primitive
439 because we can define them in Haskell, are very nearly so. They define
440 constructors which pair a primitive state with a value of each primitive type.
441 They are required to express the result type of the primitive operations in the
444 data StateAndPtr# s elt = StateAndPtr# (State# s) elt
446 data StateAndChar# s = StateAndChar# (State# s) Char#
447 data StateAndInt# s = StateAndInt# (State# s) Int#
448 data StateAndWord# s = StateAndWord# (State# s) Word#
449 data StateAndFloat# s = StateAndFloat# (State# s) Float#
450 data StateAndDouble# s = StateAndDouble# (State# s) Double#
451 data StateAndAddr# s = StateAndAddr# (State# s) Addr#
453 data StateAndStablePtr# s a = StateAndStablePtr# (State# s) (StablePtr# a)
454 data StateAndForeignObj# s = StateAndForeignObj# (State# s) ForeignObj#
455 data StateAndSynchVar# s a = StateAndSynchVar# (State# s) (SynchVar# a)
457 data StateAndArray# s elt = StateAndArray# (State# s) (Array# elt)
458 data StateAndMutableArray# s elt = StateAndMutableArray# (State# s) (MutableArray# s elt)
459 data StateAndByteArray# s = StateAndByteArray# (State# s) ByteArray#
460 data StateAndMutableByteArray# s = StateAndMutableByteArray# (State# s) (MutableByteArray# s)
464 \subsection{Mutable arrays}
467 Corresponding to @Array#@ and @ByteArray#@,
468 we have the types of mutable versions of each.
469 In each case, the representation is a pointer
470 to a suitable block of (mutable) heap-allocated storage.
472 type MutableArray# s elt
473 type MutableByteArray# s
475 \subsubsection{Allocation.}
477 Mutable arrays can be allocated.
478 Only pointer-arrays are initialised; arrays of non-pointers are filled
479 in by ``user code'' rather than by the array-allocation primitive.
480 Reason: only the pointer case has to worry about GC striking with a
481 partly-initialised array.
483 newArray# :: Int# -> elt -> State# s -> StateAndMutableArray# s elt
485 newCharArray# :: Int# -> State# s -> StateAndMutableByteArray# s
486 newIntArray# :: Int# -> State# s -> StateAndMutableByteArray# s
487 newAddrArray# :: Int# -> State# s -> StateAndMutableByteArray# s
488 newFloatArray# :: Int# -> State# s -> StateAndMutableByteArray# s
489 newDoubleArray# :: Int# -> State# s -> StateAndMutableByteArray# s
491 The size of a @ByteArray#@ is given in bytes.
493 \subsubsection{Reading and writing}
495 %OLD: Remember, offsets in a @MutableByteArray#@ are in bytes.
497 readArray# :: MutableArray# s elt -> Int# -> State# s -> StateAndPtr# s elt
498 readCharArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndChar# s
499 readIntArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndInt# s
500 readAddrArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndAddr# s
501 readFloatArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndFloat# s
502 readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndDouble# s
504 writeArray# :: MutableArray# s elt -> Int# -> elt -> State# s -> State# s
505 writeCharArray# :: MutableByteArray# s -> Int# -> Char# -> State# s -> State# s
506 writeIntArray# :: MutableByteArray# s -> Int# -> Int# -> State# s -> State# s
507 writeAddrArray# :: MutableByteArray# s -> Int# -> Addr# -> State# s -> State# s
508 writeFloatArray# :: MutableByteArray# s -> Int# -> Float# -> State# s -> State# s
509 writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s
512 \subsubsection{Equality.}
514 One can take ``equality'' of mutable arrays. What is compared is the
515 {\em name} or reference to the mutable array, not its contents.
517 sameMutableArray# :: MutableArray# s elt -> MutableArray# s elt -> Bool
518 sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
521 \subsubsection{Freezing mutable arrays}
523 Only unsafe-freeze has a primitive. (Safe freeze is done directly in Haskell
524 by copying the array and then using @unsafeFreeze@.)
526 unsafeFreezeArray# :: MutableArray# s elt -> State# s -> StateAndArray# s elt
527 unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> StateAndByteArray# s
530 \subsubsection{Stable pointers}
532 {\em Andy's comment.} {\bf Errors:} The following is not strictly true: the current
533 implementation is not as polymorphic as claimed. The reason for this
534 is that the C programmer will have to use a different entry-routine
535 for each type of stable pointer. At present, we only supply a very
536 limited number (1) of these routines. It might be possible to
537 increase the range of these routines by providing general purpose
538 entry points to apply stable pointers to (stable pointers to)
539 arguments and to enter (stable pointers to) boxed primitive values.
540 {\em End of Andy's comment.}
542 A stable pointer is a name for a Haskell object which can be passed to the
543 external world. It is ``stable'' in the sense that the name does not change when
544 the Haskell garbage collector runs --- in contrast to the address of the object
545 which may well change.
547 The stable pointer type is parameterised by the type of the thing which is named.
551 A stable pointer is represented by an index into the (static)
552 @StablePointerTable@. The Haskell garbage collector treats the
553 @StablePointerTable@ as a source of roots for GC.
555 The @makeStablePointer@ function converts a value into a stable pointer.
556 It is part of the @PrimIO@ monad, because we want to be sure we don't
557 allocate one twice by accident, and then only free one of the copies.
559 makeStablePointer# :: a -> State# RealWorld -> StateAndStablePtr# RealWorld a
560 freeStablePointer# :: StablePtr# a -> State# RealWorld -> State# RealWorld
561 deRefStablePointer# :: StablePtr# a -> State# RealWorld -> StateAndPtr RealWorld a
563 There is also a C procedure @FreeStablePtr@ which frees a stable pointer.
566 % Rewritten and updated for MallocPtr++ -- 4/96 SOF
568 \subsubsection{Foreign objects}
570 A @ForeignObj@ is a reference to an object outside the Haskell
571 world (i.e., from the C world, or a reference to an object on another
572 machine completely.), where the Haskell world has been told ``Let me
573 know when you're finished with this ...''.
575 The @ForeignObj@ type is just a special @Addr#@ ({\em not} parameterised).
580 A typical use of @ForeignObj@ is in constructing Haskell bindings
581 to external libraries. A good example is that of writing a binding to
582 an image-processing library (which was actually the main motivation
583 for implementing @ForeignObj@'s precursor, @MallocPtr@). The
584 images manipulated are not stored in the Haskell heap, either because
585 the library insist on allocating them internally or we (sensibly)
586 decide to spare the GC from having to heave heavy images around.
589 data Image = Image ForeignObj#
591 instance CCallable Image
594 The @ForeignObj#@ type is then used to refer to the externally
595 allocated image, and to acheive some type safety, the Haskell binding
596 defines the @Image@ data type. So, a value of type @ForeignObj#@ is
597 used to ``box'' up an external reference into a Haskell heap object
598 that we can then indirectly reference:
601 createImage :: (Int,Int) -> PrimIO Image
604 So far, this looks just like an @Addr#@ type, but @ForeignObj#@
605 offers a bit more, namely that we can specify a {\em finalisation
606 routine} to invoke when the @ForeignObj#@ is discarded by the
607 GC. The garbage collector invokes the finalisation routine associated
608 with the @ForeignObj#@, saying `` Thanks, I'm through with this
609 now..'' For the image-processing library, the finalisation routine could for
610 the images free up memory allocated for them. The finalisation routine has
611 currently to be written in C (the finalisation routine can in turn call on
612 @FreeStablePtr@ to deallocate a stable pointer.).
614 Associating a finalisation routine with an external object is done by
618 makeForeignObj# :: Addr# -- foreign reference
619 -> Addr# -- pointer to finalisation routine
620 -> StateAndForeignObj# RealWorld ForeignObj#
623 (Implementation: a linked list of all @ForeignObj#@s is maintained to allow the
624 garbage collector to detect when a @ForeignObj#@ becomes garbage.)
626 Like @Array@, @ForeignObj#@s are represented by heap objects.
628 {\bf ToDo:} Decide whether @FreeCHeapPointer@ is allowed to call on a
629 stable pointer. (I sincerely hope not since we will still be in the
632 \subsubsection{Synchronizing variables (I-vars, M-vars)}
637 type SynchVar# s elt -- primitive
639 newSynchVar#:: State# s -> StateAndSynchVar# s elt
641 takeMVar# :: SynchVar# s elt -> State# s -> StateAndPtr# s elt
642 putMVar# :: SynchVar# s elt -> State# s -> State# s
644 readIVar# :: SynchVar# s elt -> State# s -> StateAndPtr# s elt
645 writeIVar# :: SynchVar# s elt -> State# s -> State# s
648 \subsubsection{Controlling the garbage collector}
650 The C function {\tt PerformGC\/}, allows the C world to force Haskell
651 to do a garbage collection. It can only be called while Haskell
652 is performing a C Call.
654 Note that this function can be used to define a Haskell IO operation
655 with the same effect:
657 > performGCIO :: PrimIO ()
658 > performGCIO = _ccall_gc_ PerformGC
661 {\bf ToDo:} Is there any need for abnormal/normal termination to force
662 a GC too? Is there any need for a function that provides finer
663 control over GC: argument = amount of space required; result = amount
666 \subsection{@spark#@ primitive operation (for parallel execution)}
668 {\em ToDo: say something} It's used in the unfolding for @par@.
670 \subsection{The @errorIO#@ primitive operation}
672 The @errorIO#@ primitive takes an argument much like @PrimIO@. It aborts execution of
673 the current program, and continues instead by performing the given @PrimIO@-like value
674 on the current state of the world.
676 errorIO# :: (State RealWorld -> ((), State RealWorld)) -> a
681 {\bf ToDo:} current implementation has state variable as second
682 argument not last argument.
684 The @ccall#@ primitive can't be given an ordinary type, because it has
685 a variable number of arguments. The nearest we can get is:
687 ccall# :: CRoutine -> a1# -> ... -> an# -> State# RealWorld -> StateAndR# RealWorld
689 where the type variables @a1#@\ldots@an#@ and @r#@ can be instantiated by any
690 primitive type, and @StateAndR#@ is the appropriate pairing type from
691 Section~\ref{sect:horrid-pairing-types}. The @CRoutine@
692 isn't a proper Haskell type at all; it just reminds us that @ccall#@ needs to
693 know what C routine to call.
695 This notation is really short for a massive family of @ccall#@ primitives, one
696 for each combination of types. (Of course, the compiler simply remembers the
697 types involved, and generates appropriate code when it finally spits out the C.)
699 Unlike all the other primitive operators, @ccall#@ is not bound to an in-scope
700 identifier. The only way it is possible to generate a @ccall#@ is via the
703 All this applies equally to @casm#@:
705 casm# :: CAsmString -> a1# -> ... -> an# -> State# RealWorld -> StateAndR# RealWorld
708 %------------------------------------------------------------
709 \section{Library stuff built with the Really Primitive Stuff}
711 \subsection{The state transformer monad}
713 \subsubsection{Types}
715 A state transformer is a function from a state to a pair of a result and a new
718 newtype ST s a = ST (State s -> (a, State s))
720 The @ST@ type is {\em abstract}, so that the programmer cannot see its
721 representation. If he could, he could write bad things like:
724 bad = ST $ \ s -> ...(f s)...(g s)...
726 Here, @s@ is duplicated, which would be bad news.
728 A state is represented by a primitive state value, of type @State# s@,
729 wrapped up in a @State@ constructor. The reason for boxing it in this
730 way is so that we can be strict or lazy in the state. (Remember, all
731 primitive types are unboxed, and hence can't be bottom; but types built
732 with @data@ are all boxed.)
734 data State s = S# (State# s)
737 \subsubsection{The state transformer combinators}
739 Now for the combinators, all of which live inside the @ST@
740 abstraction. Notice that @returnST@ and @thenST@ are lazy in the
743 returnST :: a -> ST s a
744 returnST a s = (a, s)
746 thenST :: ST s a -> (a -> ST s b) -> ST s b
747 thenST m k s = let (r,new_s) = m s
751 fixST :: (a -> ST s a) -> ST s a
752 fixST k s = let ans = k r s
757 The interesting one is, of course, @runST@. We can't infer its type!
758 (It has a funny name because it must be wired into the compiler.)
760 -- runST :: forall a. (forall s. ST s a) -> a
761 runST m = case m (S# realWorld#) of
765 \subsubsection{Other useful combinators}
767 There are various other standard combinators, all defined in terms the
768 fundamental combinators above. The @seqST@ combinator is like
769 @thenST@, except that it discards the result of the first state
772 seqST :: ST s a -> ST s b -> ST s b
773 seqST m1 m2 = m1 `thenST` (\_ -> m2)
776 We also have {\em strict} (... in the state...) variants of the
777 then/return combinators (same types as their pals):
779 returnStrictlyST a s@(S# _) = (a, s)
781 thenStrictlyST m k s@(S# _)
782 = case (m s) of { (r, new_s@(S# _)) ->
785 seqStrictlyST m k = ... ditto, for seqST ...
788 The combinator @listST@ takes a list of state transformers, and
789 composes them in sequence, returning a list of their results:
791 listST :: [ST s a] -> ST s [a]
792 listST [] = returnST []
793 listST (m:ms) = m `thenST` \ r ->
794 listST ms `thenST` \ rs ->
797 The @mapST@ combinator ``lifts'' a function from a value to state
798 transformers to one which works over a list of values:
800 mapST :: (a -> ST s b) -> [a] -> ST s [b]
801 mapST f ms = listST (map f ms)
803 The @mapAndUnzipST@ combinator is similar to @mapST@, except that here the
804 function returns a pair:
806 mapAndUnzipST :: (a -> ST s (b,c)) -> [a] -> ST s ([b],[c])
807 mapAndUnzipST f (m:ms)
808 = f m `thenST` \ ( r1, r2) ->
809 mapAndUnzipST f ms `thenST` \ (rs1, rs2) ->
810 returnST (r1:rs1, r2:rs2)
813 \subsubsection{The @PrimIO@ monad}
816 The @PrimIO@ type is defined in as a state transformer which manipulates the
819 type PrimIO a = ST RealWorld a -- Transparent
821 The @PrimIO@ type is an ordinary type synonym, transparent to the programmer.
823 The type @RealWorld@ and value @realWorld#@ do not need to be hidden (although
824 there is no particular point in exposing them). Even having a value of type
825 @realWorld#@ does not compromise safety, since the type @ST@ is hidden.
827 It is type-correct to use @returnST@ in an I/O context, but it is a
828 bit more efficient to use @returnPrimIO@. The latter is strict in the
829 state, which propagates backwards to all the earlier combinators
830 (provided they are unfolded). Why is it safe for @returnPrimIO@ to be
831 strict in the state? Because every context in which an I/O state
832 transformer is used will certainly evaluate the resulting state; it is
833 the state of the real world!
835 returnPrimIO :: a -> PrimIO a
836 returnPrimIO a s@(S# _) -> (a, s)
838 We provide strict versions of the other combinators too.
840 thenPrimIO m k s = case m s of
843 @fixPrimIO@ has to be lazy, though!
847 The other combinators are just the same as before, but use the strict
848 @thenPrimIO@ and @returnPrimIO@ for efficiency.
850 foldrPrimIO f z [] = z
851 foldrPrimIO f z (m:ms) = foldrPrimIO f z ms `thenPrimIO` \ ms' ->
854 listPrimIO ms = foldrPrimIO (\ a xs -> a `thenPrimIO` \ x -> returnPrimIO (x : xs))
857 mapPrimIO f ms = listPrimIO (map f ms)
859 mapAndUnzipPrimIO f (m:ms)
860 = f m `thenPrimIO` \ ( r1, r2) ->
861 mapAndUnzipPrimIO f ms `thenPrimIO` \ (rs1, rs2) ->
862 returnPrimIO (r1:rs1, r2:rs2)
867 \subsubsection{Types}
870 data Array ix elt = Array (ix,ix) (Array# elt)
871 data ByteArray ix = ByteArray (ix,ix) ByteArray#
873 data MutableArray s ix elt = MutableArray (ix,ix) (MutableArray# s elt)
874 data MutableByteArray s ix = MutableByteArray (ix,ix) (MutableByteArray# s)
877 \subsubsection{Operations on immutable arrays}
879 Ordinary array indexing is straightforward.
881 (!) :: Ix ix => Array ix elt -> ix -> elt
883 QUESTIONs: should @ByteArray@s be indexed by Ints or ix? With byte offsets
884 or sized ones? (sized ones [WDP])
886 indexCharArray :: Ix ix => ByteArray ix -> ix -> Char
887 indexIntArray :: Ix ix => ByteArray ix -> ix -> Int
888 indexAddrArray :: Ix ix => ByteArray ix -> ix -> Addr
889 indexFloatArray :: Ix ix => ByteArray ix -> ix -> Float
890 indexDoubleArray :: Ix ix => ByteArray ix -> ix -> Double
892 @Addr@s are indexed straightforwardly by @Int@s. Unlike the primitive
893 operations, though, the offsets assume that the array consists entirely of the
894 type of value being indexed, and so there's an implicit multiplication by
895 the size of that value. To access @Addr@s with mixed values requires
896 you to do a DIY job using the primitives.
898 indexAddrChar :: Addr -> Int -> Char
900 indexStaticCharArray :: Addr -> Int -> Char
901 indexStaticIntArray :: Addr -> Int -> Int
902 indexStaticFloatArray :: Addr -> Int -> Float
903 indexStaticDoubleArray :: Addr -> Int -> Double
904 indexStaticArray :: Addr -> Int -> Addr
907 \subsubsection{Operations on mutable arrays}
909 newArray :: Ix ix => (ix,ix) -> elt -> ST s (MutableArray s ix elt)
910 newCharArray :: Ix ix => (ix,ix) -> ST s (MutableByteArray s ix)
915 readArray :: Ix ix => MutableArray s ix elt -> ix -> ST s elt
916 readCharArray :: Ix ix => MutableByteArray s ix -> ix -> ST s Char
921 writeArray :: Ix ix => MutableArray s ix elt -> ix -> elt -> ST s ()
922 writeCharArray :: Ix ix => MutableByteArray s ix -> ix -> Char -> ST s ()
927 freezeArray :: Ix ix => MutableArray s ix elt -> ST s (Array ix elt)
928 freezeCharArray :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
932 We have no need on one-function-per-type for unsafe freezing:
934 unsafeFreezeArray :: Ix ix => MutableArray s ix elt -> ST s (Array ix elt)
935 unsafeFreezeByteArray :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
938 Sometimes we want to snaffle the bounds of one of these beasts:
940 boundsOfArray :: Ix ix => MutableArray s ix elt -> (ix, ix)
941 boundsOfByteArray :: Ix ix => MutableByteArray s ix -> (ix, ix)
944 Lastly, ``equality'':
946 sameMutableArray :: MutableArray s ix elt -> MutableArray s ix elt -> Bool
947 sameMutableByteArray :: MutableByteArray s ix -> MutableByteArray s ix -> Bool
951 \subsection{Variables}
953 \subsubsection{Types}
955 Mutable variables are (for now anyway) implemented as arrays. The @MutableVar@ type
956 is opaque, so we can change the implementation later if we want.
958 type MutableVar s a = MutableArray s Int a
961 \subsubsection{Operations}
963 newVar :: a -> ST s (MutableVar s a)
964 readVar :: MutableVar s a -> ST s a
965 writeVar :: MutableVar s a -> a -> ST s ()
966 sameVar :: MutableVar s a -> MutableVar s a -> Bool
969 \subsection{Stable pointers}
971 Nothing exciting here, just simple boxing up.
973 data StablePtr a = StablePtr (StablePtr# a)
975 makeStablePointer :: a -> StablePtr a
976 freeStablePointer :: StablePtr a -> PrimIO ()
979 \subsection{Foreign objects}
981 Again, just boxing up.
983 data ForeignObj = ForeignObj ForeignObj#
985 makeForeignObj :: Addr -> Addr -> PrimIO ForeignObj
990 Everything in this section goes for @_casm_@ too.
992 {\em ToDo: mention @_ccall_gc_@ and @_casm_gc_@...}
994 The @_ccall_@ construct has the following form:
995 $$@_ccall_@~croutine~a_1~\ldots~a_n$$
996 This whole construct has type $@PrimIO@~res$.
1000 The first ``argument'', $croutine$, must be the literal name of a C procedure.
1001 It cannot be a Haskell expression which evaluates to a string, etc; it must be
1002 simply the name of the procedure.
1004 The arguments $a_1, \ldots,a_n$ must be of {\em C-callable} type.
1006 The whole construct has type $@PrimIO@~ty$, where $ty$ is a {\em C-returnable} type.
1008 A {\em boxed-primitive} type is both C-callable and C-returnable.
1009 A boxed primitive type is anything declared by:
1013 where @t@ is a primitive type. Note that
1014 programmer-defined boxed-primitive types are perfectly OK:
1016 data Widget = W# Int#
1017 data Screen = S# CHeapPtr#
1020 There are other types that can be passed to C (C-callable). This
1021 table summarises (including the standard boxed-primitive types):
1023 Boxed Type of transferd Corresp. Which is
1024 Type Prim. component C type *probably*...
1025 ------ --------------- ------ -------------
1026 Char Char# StgChar unsigned char
1027 Int Int# StgInt long int
1028 Word Word# StgWord unsigned long int
1029 Addr Addr# StgAddr char *
1030 Float Float# StgFloat float
1031 Double Double# StgDouble double
1033 Array Array# StgArray StgPtr
1034 ByteArray ByteArray# StgByteArray StgPtr
1035 MutableArray MutableArray# StgArray StgPtr
1036 MutableByteArray MutableByteArray# StgByteArray StgPtr
1038 State State# nothing!
1040 StablePtr StablePtr# StgStablePtr StgPtr
1041 ForeignObj ForeignObj# StgForeignObj StgPtr
1044 All of the above are {\em C-returnable} except:
1046 Array, ByteArray, MutableArray, MutableByteArray
1049 {\bf ToDo:} I'm pretty wary of @Array@ and @MutableArray@ being in
1050 this list, and not too happy about @State@ [WDP].
1052 {\bf ToDo:} Can code generator pass all the primitive types? Should this be
1053 extended to include {\tt Bool\/} (or any enumeration type?)
1055 The type checker must be able to figure out just which of the C-callable/returnable
1056 types is being used. If it can't, you have to add type signatures. For example,
1060 is not good enough, because the compiler can't work out what type @x@ is, nor
1061 what type the @_ccall_@ returns. You have to write, say:
1063 f :: Int -> PrimIO Float
1067 \subsubsection{Implementation}
1069 The desugarer unwraps the @_ccall_@ construct by inserting the necessary
1070 evaluations etc to unbox the arguments. For example, the body of the definition
1071 of @f@ above would become:
1073 (\ s -> case x of { I# x# ->
1074 case s of { S# s# ->
1075 case ccall# [Int#,Float#] x# s# of { StateAndFloat# f# new_s# ->
1079 Notice that the state, too, is unboxed.
1081 The code generator must deal specially with primitive objects which
1082 are stored on the heap.
1084 ... details omitted ...
1087 %More importantly, it must construct a C Heap Pointer heap-object after
1088 %a @_ccall_@ which returns a @MallocPtr#@.
1091 %--------------------------------------------------------
1092 \section{Non-primitive stuff that must be wired into GHC}
1095 data Char = C# Char#
1097 data Word = W# Word#
1098 data Addr = A# Addr#
1100 data Float = F# Float#
1101 data Double = D# Double#
1102 data Integer = J# Int# Int# ByteArray#
1104 -- and the other boxed-primitive types:
1105 Array, ByteArray, MutableArray, MutableByteArray,
1106 StablePtr, ForeignObj
1108 data Bool = False | True
1109 data Ordering = LT | EQ | GT -- used in derived comparisons
1111 data List a = [] | a : (List a)
1114 data Lift a = Lift a -- used Yukkily as described elsewhere
1116 type String = [Char] -- convenience, only
1119 %------------------------------------------------------------
1120 \section{Programmer interface(s)}
1122 \subsection{The bog-standard interface}
1124 If you rely on the implicit @import Prelude@ that GHC normally does
1125 for you, and if you don't use any weird flags (notably
1126 @-fglasgow-exts@), and if you don't import one of the fairly-magic
1127 @PreludeGla*@ interfaces, then GHC should work {\em exactly} as the
1128 Haskell report says, and the full user namespaces should be available
1131 \subsection{If you mess about with @import Prelude@...}
1133 Innocent hiding, e.g.,
1135 import Prelude hiding ( fromIntegral )
1137 should work just fine.
1139 There are some things you can do that will make GHC crash, e.g.,
1140 hiding a standard class:
1142 import Prelude hiding ( Eq(..) )
1146 \subsection{Turning on Glasgow extensions with @-fglasgow-exts@}
1148 If you turn on @-fglasgow-exts@, then all the primitive types and
1149 operations described herein are available.
1151 It is possible that some name conflicts between your code and the
1152 wired-in things might spring to life (though we doubt it...).
1153 Change your names :-)