2 \documentstyle[a4wide,grasp]{article}
4 \renewcommand{\textfraction}{0.1}
5 \renewcommand{\floatpagefraction}{0.9}
6 \renewcommand{\dblfloatpagefraction}{0.9}
9 \renewcommand{\today}{March 1997}
13 \title{The GHC Prelude and Libraries}
14 \author{Simon L Peyton Jones \and Will Partain}
22 \section[syslibs]{Introduction}
24 This document describes GHC's prelude and libraries. The basic story is that of
25 the Haskell 1.3 Report and Libraries document (which we do not reproduce here),
26 but this document describes in addition:
28 \item GHC's additional non-standard libraries and types, such as state transformers,
29 packed strings, foreign objects, stable pointers, and so on.
31 \item GHC's primitive types and operations. The standard Haskell functions are implemented
32 on top of these, and it is sometimes useful to use them directly.
34 \item The organsiation of these libraries into directories.
39 The libraries are organised into the following three groups, each of which
40 is kept in a separate sub-directory of GHC's installed @lib/@ directory:
42 \item[@lib/required/@] These are the libraries {\em required} by the Haskell
43 definition. All are defined by the Haskell Report, or by the Haskell Libraries Report.
44 They currently comprise:
47 \item @List@: more functions on lists.
48 \item @Char@: more functions on characters.
49 \item @Maybe@: more functions on @Maybe@ types.
50 \item @Complex@: functions on complex numbers.
51 \item @Ratio@: functions on rational numbers.
52 \item @Monad@: functions on characters.
53 \item @Ix@: the @Ix@ class of indexing operations.
54 \item @Array@: monolithic arrays.
55 \item @IO@: basic input/output functions.
56 \item @Directory@: basic functions for accessing the file system.
57 \item @System@: basic operating-system interface functions.
60 \item[@lib/glaExts@] GHC extension libraries, currently comprising:
62 \item @PackedString@: functions that manipulate strings packed efficiently, one character per byte.
63 \item @ST@: the state transformer monad.
64 \item @Foreign@: types and operations for GHC's foreign-language interface.
67 \item[@lib/concurrent@] GHC extension libraries to support Concurrent Haskell, currently comprising:
69 \item @Concurrent.hs@: main library.
70 \item @Parallel.hs@: stuff for multi-processor parallelism.
78 \item[@lib/ghc@] These libraries are the pieces on which all the others are built.
79 They aren't typically imported by Joe Programmer, but there's nothing to stop you
80 doing so if you want. In general, the modules prefixed by @Prel@ are pieces that go
81 towards building @Prelude@.
84 \item @GHC@: this ``library'' brings into scope all the primitive types and operations, such as
85 @Int#@, @+#@, @encodeFloat#@, etc etc. It is unique in that there is no Haskell
86 source code for it. Details in Section \ref{sect:ghc}.
88 \item @PrelBase@: defines the basic types and classes without which very few Haskell programs can work.
89 The classes are: @Eq@, @Ord@, @Enum@, @Bounded@, @Num@, @Show@, @Eval@, @Monad@, @MonadZero@, @MonadPlus@.
90 The types are: list, @Bool@, @Char@, @Ordering@, @String@, @Int@, @Integer@, @Maybe@, @Either@.
92 \item @PrelTup@: defines tuples and their instances.
93 \item @PrelList@: defines most of the list operations required by @Prelude@. (A few are in @PrelBase@,
94 to avoid gratuitous mutual recursion between modules.)
96 \item @PrelNum@ defines: the numeric classes beyond @Num@ (namely @Real@, @Integral@,
97 @Fractional@, @Floating@, @RealFrac@, @RealFloat@; instances for appropriate classes
98 for @Int@ and @Integer@; the types @Float@, @Double@, and @Ratio@ and their instances.
100 \item @PrelRead@: the @Read@ class and all its instances. It's kept separate because many programs
101 don't use @Read@ at all, so we don't even want to link in its code.
103 \item @ConcBase@: substrate stuff for Concurrent Haskell.
105 \item @IOBase@: substrate stuff for the main I/O libraries.
106 \item @IOHandle@: large blob of code for doing I/O on handles.
107 \item @PrelIO@: the remaining small pieces to produce the I/O stuff needed by @Prelude@.
109 \item @STBase@: substrate stuff for @ST@.
110 \item @ArrBase@: substrate stuff for @Array@.
112 \item @GHCerr@: error reporting code, called from code that the compiler plants in compiled programs.
113 \item @GHCmain@: the definition of @mainPrimIO@, which is what {\em really} gets
114 called by the runtime system. @mainPrimIO@ in turn calls @main@.
118 The @...Base@ modules generally export representation information that
119 is hidden from the public interface. For example the module @STBase@
120 exports the type @ST@ including its representation, whereas the module
121 @ST@ exports @ST@ abstractly.
123 None of these modules are involved in any mutual recursion, with the sole exception that
124 many modules import @IOBase.error@.
126 \section{The module @GHC@: really primitive stuff}
129 This section defines all the types which are primitive in Glasgow Haskell, and the
130 operations provided for them.
132 A primitive type is one which cannot be defined in Haskell, and which
133 is therefore built into the language and compiler. Primitive types
134 are always unboxed; that is, a value of primitive type cannot be
137 Primitive values are often represented by a simple bit-pattern, such as @Int#@,
138 @Float#@, @Double#@. But this is not necessarily the case: a primitive value
139 might be represented by a pointer to a heap-allocated object. Examples include
140 @Array#@, the type of primitive arrays. You might think this odd: doesn't being
141 heap-allocated mean that it has a box? No, it does not. A primitive array is
142 heap-allocated because it is too big a value to fit in a register, and would be
143 too expensive to copy around; in a sense, it is accidental that it is represented
144 by a pointer. If a pointer represents a primitive value, then it really does
145 point to that value: no unevaluated thunks, no indirections...nothing can be at
146 the other end of the pointer than the primitive value.
148 This section also describes a few non-primitive types, which are needed
149 to express the result types of some primitive operations.
151 \subsection{Character and numeric types}
153 There are the following obvious primitive types:
156 type Int# -- see also Word# and Addr#, later
160 If you want to know their exact equivalents in C, see
161 @ghc/includes/StgTypes.lh@ in the GHC source.
163 Literals for these types may be written as follows:
168 'a'# a Char#; for weird characters, use '\o<octal>'#
169 "a"# an Addr# (a `char *')
172 \subsubsection{Comparison operations}
174 {gt,ge,eq,ne,lt,le}Char# :: Char# -> Char# -> Bool
175 -- ditto for Int#, Word#, Float#, Double#, and Addr#
178 \subsubsection{Unboxed-character operations}
180 ord# :: Char# -> Int#
181 chr# :: Int# -> Char#
185 \subsubsection{Unboxed-@Int@ operations}
187 {plus,minus,times,quot,div,rem}Int# :: Int# -> Int# -> Int#
188 negateInt# :: Int# -> Int#
191 NB: No error/overflow checking!
193 \subsubsection{Unboxed-@Double@ and @Float@ operations}
195 {plus,minus,times,divide}Double# :: Double# -> Double# -> Double#
196 negateDouble# :: Double# -> Double#
198 float2Int# :: Double# -> Int# -- just a cast, no checking!
199 int2Double# :: Int# -> Double#
201 expDouble# :: Double# -> Double#
202 logDouble# :: Double# -> Double#
203 sqrtDouble# :: Double# -> Double#
204 sinDouble# :: Double# -> Double#
205 cosDouble# :: Double# -> Double#
206 tanDouble# :: Double# -> Double#
207 asinDouble# :: Double# -> Double#
208 acosDouble# :: Double# -> Double#
209 atanDouble# :: Double# -> Double#
210 sinhDouble# :: Double# -> Double#
211 coshDouble# :: Double# -> Double#
212 tanhDouble# :: Double# -> Double#
213 powerDouble# :: Double# -> Double# -> Double#
216 There's an exactly-matching set of unboxed-@Float@ ops; replace
217 @Double#@ with @Float#@ in the list above. There are two
218 coercion functions for @Float#@/@Double#@:
220 float2Double# :: Float# -> Double#
221 double2Float# :: Double# -> Float#
224 The primitive versions of @encodeDouble@/@decodeDouble@:
226 encodeDouble# :: Int# -> Int# -> ByteArray# -- Integer mantissa
227 -> Int# -- Int exponent
230 decodeDouble# :: Double#
231 -> GHCbase.ReturnIntAndGMP
234 (And the same for @Float#@s.)
236 \subsection{Operations on/for @Integers@ (interface to GMP)}
237 \label{sect:horrid-Integer-pairing-types}
239 We implement @Integers@ (arbitrary-precision integers) using the GNU
240 multiple-precision (GMP) package.
242 NB: some of this might change if we upgrade to using GMP~2.x.
244 The data type for @Integer@ must mirror that for @MP_INT@ in @gmp.h@
245 (see @gmp.info@). It comes out as:
247 data Integer = J# Int# Int# ByteArray#
250 So, @Integer@ is really just a ``pairing'' type for a particular
251 collection of primitive types.
253 The operations in the GMP return other combinations of
254 GMP-plus-something, so we need ``pairing'' types for those, too:
256 data Return2GMPs = Return2GMPs Int# Int# ByteArray# Int# Int# ByteArray#
257 data ReturnIntAndGMP = ReturnIntAndGMP Int# Int# Int# ByteArray#
259 -- ????? something to return a string of bytes (in the heap?)
262 The primitive ops to support @Integers@ use the ``pieces'' of the
263 representation, and are as follows:
265 negateInteger# :: Int# -> Int# -> ByteArray# -> Integer
267 {plus,minus,times}Integer# :: Int# -> Int# -> ByteArray#
268 -> Int# -> Int# -> ByteArray#
271 cmpInteger# :: Int# -> Int# -> ByteArray#
272 -> Int# -> Int# -> ByteArray#
273 -> Int# -- -1 for <; 0 for ==; +1 for >
275 divModInteger#, quotRemInteger#
276 :: Int# -> Int# -> ByteArray#
277 -> Int# -> Int# -> ByteArray#
278 -> GHCbase.Return2GMPs
280 integer2Int# :: Int# -> Int# -> ByteArray#
283 int2Integer# :: Int# -> Integer -- NB: no error-checking on these two!
284 word2Integer# :: Word# -> Integer
286 addr2Integer# :: Addr# -> Integer
287 -- the Addr# is taken to be a `char *' string
288 -- to be converted into an Integer
293 \subsection{Words and addresses}
295 A @Word#@ is used for bit-twiddling operations. It is the same size as
296 an @Int#@, but has no sign nor any arithmetic operations.
298 type Word# -- Same size/etc as Int# but *unsigned*
299 type Addr# -- A pointer from outside the "Haskell world" (from C, probably);
300 -- described under "arrays"
303 @Word#@s and @Addr#@s have the usual comparison operations.
304 Other unboxed-@Word@ ops (bit-twiddling and coercions):
306 and#, or# :: Word# -> Word# -> Word#
308 not# :: Word# -> Word#
310 shiftL#, shiftRA#, shiftRL# :: Word# -> Int# -> Word#
311 -- shift left, right arithmetic, right logical
313 iShiftL#, iShiftRA#, iShiftRL# :: Int# -> Int# -> Int#
314 -- same shift ops, but on Int#s
316 int2Word# :: Int# -> Word# -- just a cast, really
317 word2Int# :: Word# -> Int#
321 Unboxed-@Addr@ ops (C casts, really):
323 int2Addr# :: Int# -> Addr#
324 addr2Int# :: Addr# -> Int#
327 Operations for indexing off of C pointers (@Addr#@s) to snatch values
328 are listed under ``arrays''.
332 The type @Array# elt@ is the type of primitive,
333 unboxed arrays of values of type @elt@.
338 @Array#@ is more primitive than a Haskell
339 array --- indeed, Haskell arrays are implemented using @Array#@ ---
340 in that an @Array#@ is indexed only by @Int#@s, starting at zero. It is also
341 more primitive by virtue of being unboxed. That doesn't mean that it isn't
342 a heap-allocated object --- of course, it is. Rather, being unboxed means
343 that it is represented by a pointer to the array itself, and not to a thunk
344 which will evaluate to the array (or to bottom).
345 The components of an @Array#@ are themselves boxed.
347 The type @ByteArray#@ is similar to @Array#@, except that it contains
348 just a string of (non-pointer) bytes.
353 Arrays of these types are useful when a Haskell program wishes to
354 construct a value to pass to a C procedure. It is also possible to
355 use them to build (say) arrays of unboxed characters for internal use
356 in a Haskell program. Given these uses, @ByteArray#@ is deliberately
357 a bit vague about the type of its components. Operations are provided
358 to extract values of type @Char#@, @Int#@, @Float#@, @Double#@, and
359 @Addr#@ from arbitrary offsets within a @ByteArray#@. (For type @Foo#@,
360 the $i$th offset gets you the $i$th @Foo#@, not the @Foo#@ at byte-position $i$. Mumble.)
361 (If you want a @Word#@, grab an @Int#@, then coerce it.)
363 Lastly, we have static byte-arrays, of type @Addr#@ [mentioned
364 previously]. (Remember the duality between arrays and pointers in C.)
365 Arrays of this types are represented by a pointer to an array in the
366 world outside Haskell, so this pointer is not followed by the garbage
367 collector. In other respects they are just like @ByteArray#@. They
368 are only needed in order to pass values from C to Haskell.
370 \subsubsection{Reading and writing.}
372 Primitive arrays are linear, and indexed starting at zero.
374 The size and indices of a @ByteArray#@, @Addr#@, and
375 @MutableByteArray#@ are all in bytes. It's up to the program to
376 calculate the correct byte offset from the start of the array. This
377 allows a @ByteArray#@ to contain a mixture of values of different
378 type, which is often needed when preparing data for and unpicking
379 results from C. (Umm... not true of indices... WDP 95/09)
381 {\em Should we provide some @sizeOfDouble#@ constants?}
383 Out-of-range errors on indexing should be caught by the code which
384 uses the primitive operation; the primitive operations themselves do
385 {\em not} check for out-of-range indexes. The intention is that the
386 primitive ops compile to one machine instruction or thereabouts.
388 We use the terms ``reading'' and ``writing'' to refer to accessing {\em mutable}
389 arrays (see Section~\ref{sect:mutable}), and ``indexing''
390 to refer to reading a value from an {\em immutable}
393 If you want to read/write a @Word#@, read an @Int#@ and coerce.
395 Immutable byte arrays are straightforward to index (all indices in bytes):
397 indexCharArray# :: ByteArray# -> Int# -> Char#
398 indexIntArray# :: ByteArray# -> Int# -> Int#
399 indexAddrArray# :: ByteArray# -> Int# -> Addr#
400 indexFloatArray# :: ByteArray# -> Int# -> Float#
401 indexDoubleArray# :: ByteArray# -> Int# -> Double#
403 indexCharOffAddr# :: Addr# -> Int# -> Char#
404 indexIntOffAddr# :: Addr# -> Int# -> Int#
405 indexFloatOffAddr# :: Addr# -> Int# -> Float#
406 indexDoubleOffAddr# :: Addr# -> Int# -> Double#
407 indexAddrOffAddr# :: Addr# -> Int# -> Addr# -- Get an Addr# from an Addr# offset
410 The last of these, @indexAddrOffAddr#@, extracts an @Addr#@ using an offset
411 from another @Addr#@, thereby providing the ability to follow a chain of
414 Something a bit more interesting goes on when indexing arrays of boxed
415 objects, because the result is simply the boxed object. So presumably
416 it should be entered --- we never usually return an unevaluated
417 object! This is a pain: primitive ops aren't supposed to do
418 complicated things like enter objects. The current solution is to
419 return a lifted value, but I don't like it!
421 indexArray# :: Array# elt -> Int# -> GHCbase.Lift elt -- Yuk!
425 \subsubsection{The state type}
427 The primitive type @State#@ represents the state of a state transformer.
428 It is parameterised on the desired type of state, which serves to keep
429 states from distinct threads distinct from one another. But the {\em only}
430 effect of this parameterisation is in the type system: all values of type
431 @State#@ are represented in the same way. Indeed, they are all
432 represented by nothing at all! The code generator ``knows'' to generate no
433 code, and allocate no registers etc, for primitive states.
439 The type @GHCbuiltins.RealWorld@ is truly opaque: there are no values defined
440 of this type, and no operations over it. It is ``primitive'' in that
441 sense---but it is {\em not unboxed!} Its only role in life is to be the type
442 which distinguishes the @PrimIO@ state transformer (see
443 Section~\ref{sect:io-spec}).
448 \subsubsection{States}
450 A single, primitive, value of type @State# RealWorld@ is provided.
452 realWorld# :: State# GHCbuiltins.RealWorld
455 (Note: in the compiler, not a @PrimOp@; just a mucho magic @Id@.)
457 \subsection{State pairing types}
458 \label{sect:horrid-pairing-types}
460 This subsection defines some types which, while they aren't quite primitive
461 because we can define them in Haskell, are very nearly so. They define
462 constructors which pair a primitive state with a value of each primitive type.
463 They are required to express the result type of the primitive operations in the
466 data StateAndPtr# s elt = StateAndPtr# (State# s) elt
468 data StateAndChar# s = StateAndChar# (State# s) Char#
469 data StateAndInt# s = StateAndInt# (State# s) Int#
470 data StateAndWord# s = StateAndWord# (State# s) Word#
471 data StateAndFloat# s = StateAndFloat# (State# s) Float#
472 data StateAndDouble# s = StateAndDouble# (State# s) Double#
473 data StateAndAddr# s = StateAndAddr# (State# s) Addr#
475 data StateAndStablePtr# s a = StateAndStablePtr# (State# s) (StablePtr# a)
476 data StateAndForeignObj# s = StateAndForeignObj# (State# s) ForeignObj#
477 data StateAndSynchVar# s a = StateAndSynchVar# (State# s) (SynchVar# a)
479 data StateAndArray# s elt = StateAndArray# (State# s) (Array# elt)
480 data StateAndMutableArray# s elt = StateAndMutableArray# (State# s) (MutableArray# s elt)
481 data StateAndByteArray# s = StateAndByteArray# (State# s) ByteArray#
482 data StateAndMutableByteArray# s = StateAndMutableByteArray# (State# s) (MutableByteArray# s)
487 \subsection{Mutable arrays}
490 Corresponding to @Array#@ and @ByteArray#@,
491 we have the types of mutable versions of each.
492 In each case, the representation is a pointer
493 to a suitable block of (mutable) heap-allocated storage.
495 type MutableArray# s elt
496 type MutableByteArray# s
499 \subsubsection{Allocation.}
501 Mutable arrays can be allocated.
502 Only pointer-arrays are initialised; arrays of non-pointers are filled
503 in by ``user code'' rather than by the array-allocation primitive.
504 Reason: only the pointer case has to worry about GC striking with a
505 partly-initialised array.
507 newArray# :: Int# -> elt -> State# s -> StateAndMutableArray# s elt
509 newCharArray# :: Int# -> State# s -> StateAndMutableByteArray# s
510 newIntArray# :: Int# -> State# s -> StateAndMutableByteArray# s
511 newAddrArray# :: Int# -> State# s -> StateAndMutableByteArray# s
512 newFloatArray# :: Int# -> State# s -> StateAndMutableByteArray# s
513 newDoubleArray# :: Int# -> State# s -> StateAndMutableByteArray# s
516 The size of a @ByteArray#@ is given in bytes.
518 \subsubsection{Reading and writing}
520 %OLD: Remember, offsets in a @MutableByteArray#@ are in bytes.
522 readArray# :: MutableArray# s elt -> Int# -> State# s -> StateAndPtr# s elt
523 readCharArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndChar# s
524 readIntArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndInt# s
525 readAddrArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndAddr# s
526 readFloatArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndFloat# s
527 readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> StateAndDouble# s
529 writeArray# :: MutableArray# s elt -> Int# -> elt -> State# s -> State# s
530 writeCharArray# :: MutableByteArray# s -> Int# -> Char# -> State# s -> State# s
531 writeIntArray# :: MutableByteArray# s -> Int# -> Int# -> State# s -> State# s
532 writeAddrArray# :: MutableByteArray# s -> Int# -> Addr# -> State# s -> State# s
533 writeFloatArray# :: MutableByteArray# s -> Int# -> Float# -> State# s -> State# s
534 writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s
538 \subsubsection{Equality.}
540 One can take ``equality'' of mutable arrays. What is compared is the
541 {\em name} or reference to the mutable array, not its contents.
543 sameMutableArray# :: MutableArray# s elt -> MutableArray# s elt -> Bool
544 sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
548 \subsubsection{Freezing mutable arrays}
550 Only unsafe-freeze has a primitive. (Safe freeze is done directly in Haskell
551 by copying the array and then using @unsafeFreeze@.)
553 unsafeFreezeArray# :: MutableArray# s elt -> State# s -> StateAndArray# s elt
554 unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> StateAndByteArray# s
558 \subsubsection{Stable pointers}
560 {\em Andy's comment.} \bf{Errors:} The following is not strictly true: the current
561 implementation is not as polymorphic as claimed. The reason for this
562 is that the C programmer will have to use a different entry-routine
563 for each type of stable pointer. At present, we only supply a very
564 limited number (1) of these routines. It might be possible to
565 increase the range of these routines by providing general purpose
566 entry points to apply stable pointers to (stable pointers to)
567 arguments and to enter (stable pointers to) boxed primitive values.
568 {\em End of Andy's comment.}
570 A stable pointer is a name for a Haskell object which can be passed to the
571 external world. It is ``stable'' in the sense that the name does not change when
572 the Haskell garbage collector runs --- in contrast to the address of the object
573 which may well change.
575 The stable pointer type is parameterised by the type of the thing which is named.
580 A stable pointer is represented by an index into the (static)
581 @StablePointerTable@. The Haskell garbage collector treats the
582 @StablePointerTable@ as a source of roots for GC.
584 The @makeStablePointer@ function converts a value into a stable pointer.
585 It is part of the @PrimIO@ monad, because we want to be sure we don't
586 allocate one twice by accident, and then only free one of the copies.
588 makeStablePointer# :: a -> State# RealWorld -> StateAndStablePtr# RealWorld a
589 freeStablePointer# :: StablePtr# a -> State# RealWorld -> State# RealWorld
590 deRefStablePointer# :: StablePtr# a -> State# RealWorld -> StateAndPtr RealWorld a
593 There is also a C procedure @FreeStablePtr@ which frees a stable pointer.
596 % Rewritten and updated for MallocPtr++ -- 4/96 SOF
598 \subsubsection{Foreign objects}
600 A @ForeignObj@ is a reference to an object outside the Haskell
601 world (i.e., from the C world, or a reference to an object on another
602 machine completely.), where the Haskell world has been told ``Let me
603 know when you're finished with this ...''.
605 The @ForeignObj@ type is just a special @Addr#@ ({\em not} parameterised).
611 A typical use of @ForeignObj@ is in constructing Haskell bindings
612 to external libraries. A good example is that of writing a binding to
613 an image-processing library (which was actually the main motivation
614 for implementing @ForeignObj@'s precursor, @MallocPtr@). The
615 images manipulated are not stored in the Haskell heap, either because
616 the library insist on allocating them internally or we (sensibly)
617 decide to spare the GC from having to heave heavy images around.
620 data Image = Image ForeignObj#
622 instance CCallable Image
626 The @ForeignObj#@ type is then used to refer to the externally
627 allocated image, and to acheive some type safety, the Haskell binding
628 defines the @Image@ data type. So, a value of type @ForeignObj#@ is
629 used to ``box'' up an external reference into a Haskell heap object
630 that we can then indirectly reference:
633 createImage :: (Int,Int) -> PrimIO Image
637 So far, this looks just like an @Addr#@ type, but @ForeignObj#@
638 offers a bit more, namely that we can specify a {\em finalisation
639 routine} to invoke when the @ForeignObj#@ is discarded by the
640 GC. The garbage collector invokes the finalisation routine associated
641 with the @ForeignObj#@, saying `` Thanks, I'm through with this
642 now..'' For the image-processing library, the finalisation routine could for
643 the images free up memory allocated for them. The finalisation routine has
644 currently to be written in C (the finalisation routine can in turn call on
645 @FreeStablePtr@ to deallocate a stable pointer.).
647 Associating a finalisation routine with an external object is done by
651 makeForeignObj# :: Addr# -- foreign reference
652 -> Addr# -- pointer to finalisation routine
653 -> StateAndForeignObj# RealWorld ForeignObj#
657 (Implementation: a linked list of all @ForeignObj#@s is maintained to allow the
658 garbage collector to detect when a @ForeignObj#@ becomes garbage.)
660 Like @Array@, @ForeignObj#@s are represented by heap objects.
662 \bf{ToDo:} Decide whether @FreeCHeapPointer@ is allowed to call on a
663 stable pointer. (I sincerely hope not since we will still be in the
666 \subsubsection{Synchronizing variables (I-vars, M-vars)}
671 type SynchVar# s elt -- primitive
673 newSynchVar#:: State# s -> StateAndSynchVar# s elt
675 takeMVar# :: SynchVar# s elt -> State# s -> StateAndPtr# s elt
676 putMVar# :: SynchVar# s elt -> State# s -> State# s
678 readIVar# :: SynchVar# s elt -> State# s -> StateAndPtr# s elt
679 writeIVar# :: SynchVar# s elt -> State# s -> State# s
683 \subsubsection{Controlling the garbage collector}
685 The C function {\tt PerformGC\/}, allows the C world to force Haskell
686 to do a garbage collection. It can only be called while Haskell
687 is performing a C Call.
689 Note that this function can be used to define a Haskell IO operation
690 with the same effect:
692 > performGCIO :: PrimIO ()
693 > performGCIO = _ccall_gc_ PerformGC
697 \bf{ToDo:} Is there any need for abnormal/normal termination to force
698 a GC too? Is there any need for a function that provides finer
699 control over GC: argument = amount of space required; result = amount
702 \subsection{@spark#@ primitive operation (for parallel execution)}
704 {\em ToDo: say something} It's used in the unfolding for @par@.
706 \subsection{The @errorIO#@ primitive operation}
708 The @errorIO#@ primitive takes an argument much like @PrimIO@. It aborts execution of
709 the current program, and continues instead by performing the given @PrimIO@-like value
710 on the current state of the world.
712 errorIO# :: (State RealWorld -> ((), State RealWorld)) -> a
718 \bf{ToDo:} current implementation has state variable as second
719 argument not last argument.
721 The @ccall#@ primitive can't be given an ordinary type, because it has
722 a variable number of arguments. The nearest we can get is:
724 ccall# :: CRoutine -> a1# -> ... -> an# -> State# RealWorld -> StateAndR# RealWorld
727 where the type variables @a1#@\ldots@an#@ and @r#@ can be instantiated by any
728 primitive type, and @StateAndR#@ is the appropriate pairing type from
729 Section~\ref{sect:horrid-pairing-types}. The @CRoutine@
730 isn't a proper Haskell type at all; it just reminds us that @ccall#@ needs to
731 know what C routine to call.
733 This notation is really short for a massive family of @ccall#@ primitives, one
734 for each combination of types. (Of course, the compiler simply remembers the
735 types involved, and generates appropriate code when it finally spits out the C.)
737 Unlike all the other primitive operators, @ccall#@ is not bound to an in-scope
738 identifier. The only way it is possible to generate a @ccall#@ is via the
741 All this applies equally to @casm#@:
743 casm# :: CAsmString -> a1# -> ... -> an# -> State# RealWorld -> StateAndR# RealWorld
746 %------------------------------------------------------------
747 \section{Library stuff built with the Really Primitive Stuff}
749 \subsection{The state transformer monad}
751 \subsubsection{Types}
753 A state transformer is a function from a state to a pair of a result and a new
756 newtype ST s a = ST (State s -> (a, State s))
759 The @ST@ type is {\em abstract}, so that the programmer cannot see its
760 representation. If he could, he could write bad things like:
763 bad = ST $ \ s -> ...(f s)...(g s)...
766 Here, @s@ is duplicated, which would be bad news.
768 A state is represented by a primitive state value, of type @State# s@,
769 wrapped up in a @State@ constructor. The reason for boxing it in this
770 way is so that we can be strict or lazy in the state. (Remember, all
771 primitive types are unboxed, and hence can't be bottom; but types built
772 with @data@ are all boxed.)
774 data State s = S# (State# s)
778 \subsubsection{The state transformer combinators}
780 Now for the combinators, all of which live inside the @ST@
781 abstraction. Notice that @returnST@ and @thenST@ are lazy in the
784 returnST :: a -> ST s a
785 returnST a s = (a, s)
787 thenST :: ST s a -> (a -> ST s b) -> ST s b
788 thenST m k s = let (r,new_s) = m s
792 fixST :: (a -> ST s a) -> ST s a
793 fixST k s = let ans = k r s
799 The interesting one is, of course, @runST@. We can't infer its type!
800 (It has a funny name because it must be wired into the compiler.)
802 -- runST :: forall a. (forall s. ST s a) -> a
803 runST m = case m (S# realWorld#) of
808 \subsubsection{Other useful combinators}
810 There are various other standard combinators, all defined in terms the
811 fundamental combinators above. The @seqST@ combinator is like
812 @thenST@, except that it discards the result of the first state
815 seqST :: ST s a -> ST s b -> ST s b
816 seqST m1 m2 = m1 `thenST` (\_ -> m2)
820 We also have {\em strict} (... in the state...) variants of the
821 then/return combinators (same types as their pals):
823 returnStrictlyST a s@(S# _) = (a, s)
825 thenStrictlyST m k s@(S# _)
826 = case (m s) of { (r, new_s@(S# _)) ->
829 seqStrictlyST m k = ... ditto, for seqST ...
832 The combinator @listST@ takes a list of state transformers, and
833 composes them in sequence, returning a list of their results:
835 listST :: [ST s a] -> ST s [a]
836 listST [] = returnST []
837 listST (m:ms) = m `thenST` \ r ->
838 listST ms `thenST` \ rs ->
842 The @mapST@ combinator ``lifts'' a function from a value to state
843 transformers to one which works over a list of values:
845 mapST :: (a -> ST s b) -> [a] -> ST s [b]
846 mapST f ms = listST (map f ms)
849 The @mapAndUnzipST@ combinator is similar to @mapST@, except that here the
850 function returns a pair:
852 mapAndUnzipST :: (a -> ST s (b,c)) -> [a] -> ST s ([b],[c])
853 mapAndUnzipST f (m:ms)
854 = f m `thenST` \ ( r1, r2) ->
855 mapAndUnzipST f ms `thenST` \ (rs1, rs2) ->
856 returnST (r1:rs1, r2:rs2)
860 \subsubsection{The @PrimIO@ monad}
863 The @PrimIO@ type is defined in as a state transformer which manipulates the
866 type PrimIO a = ST RealWorld a -- Transparent
869 The @PrimIO@ type is an ordinary type synonym, transparent to the programmer.
871 The type @RealWorld@ and value @realWorld#@ do not need to be hidden (although
872 there is no particular point in exposing them). Even having a value of type
873 @realWorld#@ does not compromise safety, since the type @ST@ is hidden.
875 It is type-correct to use @returnST@ in an I/O context, but it is a
876 bit more efficient to use @returnPrimIO@. The latter is strict in the
877 state, which propagates backwards to all the earlier combinators
878 (provided they are unfolded). Why is it safe for @returnPrimIO@ to be
879 strict in the state? Because every context in which an I/O state
880 transformer is used will certainly evaluate the resulting state; it is
881 the state of the real world!
883 returnPrimIO :: a -> PrimIO a
884 returnPrimIO a s@(S# _) = (a, s)
886 We provide strict versions of the other combinators too.
888 thenPrimIO m k s = case m s of
892 @fixPrimIO@ has to be lazy, though!
897 The other combinators are just the same as before, but use the strict
898 @thenPrimIO@ and @returnPrimIO@ for efficiency.
900 foldrPrimIO f z [] = z
901 foldrPrimIO f z (m:ms) = foldrPrimIO f z ms `thenPrimIO` \ ms' ->
904 listPrimIO ms = foldrPrimIO (\ a xs -> a `thenPrimIO` \ x -> returnPrimIO (x : xs))
907 mapPrimIO f ms = listPrimIO (map f ms)
909 mapAndUnzipPrimIO f (m:ms)
910 = f m `thenPrimIO` \ ( r1, r2) ->
911 mapAndUnzipPrimIO f ms `thenPrimIO` \ (rs1, rs2) ->
912 returnPrimIO (r1:rs1, r2:rs2)
918 \subsubsection{Types}
921 data Array ix elt = Array (ix,ix) (Array# elt)
922 data ByteArray ix = ByteArray (ix,ix) ByteArray#
924 data MutableArray s ix elt = MutableArray (ix,ix) (MutableArray# s elt)
925 data MutableByteArray s ix = MutableByteArray (ix,ix) (MutableByteArray# s)
929 \subsubsection{Operations on immutable arrays}
931 Ordinary array indexing is straightforward.
933 (!) :: Ix ix => Array ix elt -> ix -> elt
936 QUESTIONs: should @ByteArray@s be indexed by Ints or ix? With byte offsets
937 or sized ones? (sized ones [WDP])
939 indexCharArray :: Ix ix => ByteArray ix -> ix -> Char
940 indexIntArray :: Ix ix => ByteArray ix -> ix -> Int
941 indexAddrArray :: Ix ix => ByteArray ix -> ix -> Addr
942 indexFloatArray :: Ix ix => ByteArray ix -> ix -> Float
943 indexDoubleArray :: Ix ix => ByteArray ix -> ix -> Double
946 @Addr@s are indexed straightforwardly by @Int@s. Unlike the primitive
947 operations, though, the offsets assume that the array consists entirely of the
948 type of value being indexed, and so there's an implicit multiplication by
949 the size of that value. To access @Addr@s with mixed values requires
950 you to do a DIY job using the primitives.
952 indexAddrChar :: Addr -> Int -> Char
954 indexStaticCharArray :: Addr -> Int -> Char
955 indexStaticIntArray :: Addr -> Int -> Int
956 indexStaticFloatArray :: Addr -> Int -> Float
957 indexStaticDoubleArray :: Addr -> Int -> Double
958 indexStaticArray :: Addr -> Int -> Addr
962 \subsubsection{Operations on mutable arrays}
964 newArray :: Ix ix => (ix,ix) -> elt -> ST s (MutableArray s ix elt)
965 newCharArray :: Ix ix => (ix,ix) -> ST s (MutableByteArray s ix)
971 readArray :: Ix ix => MutableArray s ix elt -> ix -> ST s elt
972 readCharArray :: Ix ix => MutableByteArray s ix -> ix -> ST s Char
977 writeArray :: Ix ix => MutableArray s ix elt -> ix -> elt -> ST s ()
978 writeCharArray :: Ix ix => MutableByteArray s ix -> ix -> Char -> ST s ()
984 freezeArray :: Ix ix => MutableArray s ix elt -> ST s (Array ix elt)
985 freezeCharArray :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
990 We have no need on one-function-per-type for unsafe freezing:
992 unsafeFreezeArray :: Ix ix => MutableArray s ix elt -> ST s (Array ix elt)
993 unsafeFreezeByteArray :: Ix ix => MutableByteArray s ix -> ST s (ByteArray ix)
997 Sometimes we want to snaffle the bounds of one of these beasts:
999 boundsOfArray :: Ix ix => MutableArray s ix elt -> (ix, ix)
1000 boundsOfByteArray :: Ix ix => MutableByteArray s ix -> (ix, ix)
1004 Lastly, ``equality'':
1006 sameMutableArray :: MutableArray s ix elt -> MutableArray s ix elt -> Bool
1007 sameMutableByteArray :: MutableByteArray s ix -> MutableByteArray s ix -> Bool
1012 \subsection{Variables}
1014 \subsubsection{Types}
1016 Mutable variables are (for now anyway) implemented as arrays. The @MutableVar@ type
1017 is opaque, so we can change the implementation later if we want.
1019 type MutableVar s a = MutableArray s Int a
1023 \subsubsection{Operations}
1025 newVar :: a -> ST s (MutableVar s a)
1026 readVar :: MutableVar s a -> ST s a
1027 writeVar :: MutableVar s a -> a -> ST s ()
1028 sameVar :: MutableVar s a -> MutableVar s a -> Bool
1032 \subsection{Stable pointers}
1034 Nothing exciting here, just simple boxing up.
1036 data StablePtr a = StablePtr (StablePtr# a)
1038 makeStablePointer :: a -> StablePtr a
1039 freeStablePointer :: StablePtr a -> PrimIO ()
1042 \subsection{Foreign objects}
1044 Again, just boxing up.
1046 data ForeignObj = ForeignObj ForeignObj#
1048 makeForeignObj :: Addr -> Addr -> PrimIO ForeignObj
1051 \subsection{C calls}
1053 Everything in this section goes for @_casm_@ too.
1055 {\em ToDo: mention @_ccall_gc_@ and @_casm_gc_@...}
1057 The @_ccall_@ construct has the following form:
1058 $$@_ccall_@~croutine~a_1~\ldots~a_n$$
1059 This whole construct has type @PrimIO@~$res$.
1060 The rules are these:
1063 The first ``argument'', $croutine$, must be the literal name of a C procedure.
1064 It cannot be a Haskell expression which evaluates to a string, etc; it must be
1065 simply the name of the procedure.
1067 The arguments $a_1, \ldots,a_n$ must be of {\em C-callable} type.
1069 The whole construct has type @PrimIO@~$ty$, where $ty$ is a {\em C-returnable} type.
1071 A {\em boxed-primitive} type is both C-callable and C-returnable.
1072 A boxed primitive type is anything declared by:
1077 where @t@ is a primitive type. Note that
1078 programmer-defined boxed-primitive types are perfectly OK:
1080 data Widget = W# Int#
1081 data Screen = S# CHeapPtr#
1085 There are other types that can be passed to C (C-callable). This
1086 table summarises (including the standard boxed-primitive types):
1088 Boxed Type of transferd Corresp. Which is
1089 Type Prim. component C type *probably*...
1090 ------ --------------- ------ -------------
1091 Char Char# StgChar unsigned char
1092 Int Int# StgInt long int
1093 Word Word# StgWord unsigned long int
1094 Addr Addr# StgAddr char *
1095 Float Float# StgFloat float
1096 Double Double# StgDouble double
1098 Array Array# StgArray StgPtr
1099 ByteArray ByteArray# StgByteArray StgPtr
1100 MutableArray MutableArray# StgArray StgPtr
1101 MutableByteArray MutableByteArray# StgByteArray StgPtr
1103 State State# nothing!
1105 StablePtr StablePtr# StgStablePtr StgPtr
1106 ForeignObj ForeignObj# StgForeignObj StgPtr
1110 All of the above are {\em C-returnable} except:
1112 Array, ByteArray, MutableArray, MutableByteArray
1116 \bf{ToDo:} I'm pretty wary of @Array@ and @MutableArray@ being in
1117 this list, and not too happy about @State@ [WDP].
1119 \bf{ToDo:} Can code generator pass all the primitive types? Should this be
1120 extended to include {\tt Bool\/} (or any enumeration type?)
1122 The type checker must be able to figure out just which of the C-callable/returnable
1123 types is being used. If it can't, you have to add type signatures. For example,
1128 is not good enough, because the compiler can't work out what type @x@ is, nor
1129 what type the @_ccall_@ returns. You have to write, say:
1131 f :: Int -> PrimIO Float
1136 \subsubsection{Implementation}
1138 The desugarer unwraps the @_ccall_@ construct by inserting the necessary
1139 evaluations etc to unbox the arguments. For example, the body of the definition
1140 of @f@ above would become:
1142 (\ s -> case x of { I# x# ->
1143 case s of { S# s# ->
1144 case ccall# [Int#,Float#] x# s# of { StateAndFloat# f# new_s# ->
1149 Notice that the state, too, is unboxed.
1151 The code generator must deal specially with primitive objects which
1152 are stored on the heap.
1155 ... details omitted ...
1159 %More importantly, it must construct a C Heap Pointer heap-object after
1160 %a @_ccall_@ which returns a @MallocPtr#@.
1163 %--------------------------------------------------------
1164 \section{Non-primitive stuff that must be wired into GHC}
1167 data Char = C# Char#
1169 data Word = W# Word#
1170 data Addr = A# Addr#
1172 data Float = F# Float#
1173 data Double = D# Double#
1174 data Integer = J# Int# Int# ByteArray#
1176 -- and the other boxed-primitive types:
1177 Array, ByteArray, MutableArray, MutableByteArray,
1178 StablePtr, ForeignObj
1180 data Bool = False | True
1181 data Ordering = LT | EQ | GT -- used in derived comparisons
1183 data List a = [] | a : (List a)
1186 data Lift a = Lift a -- used Yukkily as described elsewhere
1188 type String = [Char] -- convenience, only
1192 %------------------------------------------------------------
1193 \section{Programmer interface(s)}
1195 \subsection{The bog-standard interface}
1197 If you rely on the implicit @import Prelude@ that GHC normally does
1198 for you, and if you don't use any weird flags (notably
1199 @-fglasgow-exts@), and if you don't import one of the fairly-magic
1200 @PreludeGla*@ interfaces, then GHC should work {\em exactly} as the
1201 Haskell report says, and the full user namespaces should be available
1204 \subsection{If you mess about with @import Prelude@...}
1206 Innocent hiding, e.g.,
1208 import Prelude hiding ( fromIntegral )
1211 should work just fine.
1213 There are some things you can do that will make GHC crash, e.g.,
1214 hiding a standard class:
1216 import Prelude hiding ( Eq(..) )
1221 \subsection{Turning on Glasgow extensions with @-fglasgow-exts@}
1223 If you turn on @-fglasgow-exts@, then all the primitive types and
1224 operations described herein are available.
1226 It is possible that some name conflicts between your code and the
1227 wired-in things might spring to life (though we doubt it...).
1228 Change your names :-)