[project @ 2001-08-27 11:45:23 by simonmar]
[ghc-hetmet.git] / ghc / docs / users_guide / primitives.sgml
1 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
2
3 <sect1 id="primitives">
4   <title>Unboxed types and primitive operations</title>
5   <indexterm><primary>PrelGHC module</primary></indexterm>
6
7   <para>This module defines all the types which are primitive in
8   Glasgow Haskell, and the operations provided for them.</para>
9
10   <para>Note: while you really can use this stuff to write fast code,
11   we generally find it a lot less painful, and more satisfying in the
12   long run, to use higher-level language features and libraries.  With
13   any luck, the code you write will be optimised to the efficient
14   unboxed version in any case.  And if it isn't, we'd like to know
15   about it.</para>
16   
17 <sect2 id="glasgow-unboxed">
18 <title>Unboxed types
19 </title>
20
21 <para>
22 <indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm>
23 </para>
24
25 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
26 that values of that type are represented by a pointer to a heap
27 object.  The representation of a Haskell <literal>Int</literal>, for
28 example, is a two-word heap object.  An <firstterm>unboxed</firstterm>
29 type, however, is represented by the value itself, no pointers or heap
30 allocation are involved.
31 </para>
32
33 <para>
34 Unboxed types correspond to the &ldquo;raw machine&rdquo; types you
35 would use in C: <literal>Int&num;</literal> (long int),
36 <literal>Double&num;</literal> (double), <literal>Addr&num;</literal>
37 (void *), etc.  The <emphasis>primitive operations</emphasis>
38 (PrimOps) on these types are what you might expect; e.g.,
39 <literal>(+&num;)</literal> is addition on
40 <literal>Int&num;</literal>s, and is the machine-addition that we all
41 know and love&mdash;usually one instruction.
42 </para>
43
44 <para>
45 Primitive (unboxed) types cannot be defined in Haskell, and are
46 therefore built into the language and compiler.  Primitive types are
47 always unlifted; that is, a value of a primitive type cannot be
48 bottom.  We use the convention that primitive types, values, and
49 operations have a <literal>&num;</literal> suffix.
50 </para>
51
52 <para>
53 Primitive values are often represented by a simple bit-pattern, such
54 as <literal>Int&num;</literal>, <literal>Float&num;</literal>,
55 <literal>Double&num;</literal>.  But this is not necessarily the case:
56 a primitive value might be represented by a pointer to a
57 heap-allocated object.  Examples include
58 <literal>Array&num;</literal>, the type of primitive arrays.  A
59 primitive array is heap-allocated because it is too big a value to fit
60 in a register, and would be too expensive to copy around; in a sense,
61 it is accidental that it is represented by a pointer.  If a pointer
62 represents a primitive value, then it really does point to that value:
63 no unevaluated thunks, no indirections&hellip;nothing can be at the
64 other end of the pointer than the primitive value.
65 </para>
66
67 <para>
68 There are some restrictions on the use of primitive types, the main
69 one being that you can't pass a primitive value to a polymorphic
70 function or store one in a polymorphic data type.  This rules out
71 things like <literal>[Int&num;]</literal> (i.e. lists of primitive
72 integers).  The reason for this restriction is that polymorphic
73 arguments and constructor fields are assumed to be pointers: if an
74 unboxed integer is stored in one of these, the garbage collector would
75 attempt to follow it, leading to unpredictable space leaks.  Or a
76 <function>seq</function> operation on the polymorphic component may
77 attempt to dereference the pointer, with disastrous results.  Even
78 worse, the unboxed value might be larger than a pointer
79 (<literal>Double&num;</literal> for instance).
80 </para>
81
82 <para>
83 Nevertheless, A numerically-intensive program using unboxed types can
84 go a <emphasis>lot</emphasis> faster than its &ldquo;standard&rdquo;
85 counterpart&mdash;we saw a threefold speedup on one example.
86 </para>
87
88 </sect2>
89
90 <sect2 id="unboxed-tuples">
91 <title>Unboxed Tuples
92 </title>
93
94 <para>
95 Unboxed tuples aren't really exported by <literal>PrelGHC</literal>,
96 they're available by default with <option>-fglasgow-exts</option>.  An
97 unboxed tuple looks like this:
98 </para>
99
100 <para>
101
102 <programlisting>
103 (# e_1, ..., e_n #)
104 </programlisting>
105
106 </para>
107
108 <para>
109 where <literal>e&lowbar;1..e&lowbar;n</literal> are expressions of any
110 type (primitive or non-primitive).  The type of an unboxed tuple looks
111 the same.
112 </para>
113
114 <para>
115 Unboxed tuples are used for functions that need to return multiple
116 values, but they avoid the heap allocation normally associated with
117 using fully-fledged tuples.  When an unboxed tuple is returned, the
118 components are put directly into registers or on the stack; the
119 unboxed tuple itself does not have a composite representation.  Many
120 of the primitive operations listed in this section return unboxed
121 tuples.
122 </para>
123
124 <para>
125 There are some pretty stringent restrictions on the use of unboxed tuples:
126 </para>
127
128 <para>
129
130 <itemizedlist>
131 <listitem>
132
133 <para>
134  Unboxed tuple types are subject to the same restrictions as
135 other unboxed types; i.e. they may not be stored in polymorphic data
136 structures or passed to polymorphic functions.
137
138 </para>
139 </listitem>
140 <listitem>
141
142 <para>
143  Unboxed tuples may only be constructed as the direct result of
144 a function, and may only be deconstructed with a <literal>case</literal> expression.
145 eg. the following are valid:
146
147
148 <programlisting>
149 f x y = (# x+1, y-1 #)
150 g x = case f x x of { (# a, b #) -&#62; a + b }
151 </programlisting>
152
153
154 but the following are invalid:
155
156
157 <programlisting>
158 f x y = g (# x, y #)
159 g (# x, y #) = x + y
160 </programlisting>
161
162
163 </para>
164 </listitem>
165 <listitem>
166
167 <para>
168  No variable can have an unboxed tuple type.  This is illegal:
169
170
171 <programlisting>
172 f :: (# Int, Int #) -&#62; (# Int, Int #)
173 f x = x
174 </programlisting>
175
176
177 because <literal>x</literal> has an unboxed tuple type.
178
179 </para>
180 </listitem>
181
182 </itemizedlist>
183
184 </para>
185
186 <para>
187 Note: we may relax some of these restrictions in the future.
188 </para>
189
190 <para>
191 The <literal>IO</literal> and <literal>ST</literal> monads use unboxed
192 tuples to avoid unnecessary allocation during sequences of operations.
193 </para>
194
195 </sect2>
196
197 <sect2>
198 <title>Character and numeric types</title>
199
200 <indexterm><primary>character types, primitive</primary></indexterm>
201 <indexterm><primary>numeric types, primitive</primary></indexterm>
202 <indexterm><primary>integer types, primitive</primary></indexterm>
203 <indexterm><primary>floating point types, primitive</primary></indexterm>
204 <para>
205 There are the following obvious primitive types:
206 </para>
207
208 <programlisting>
209 type Char#
210 type Int#
211 type Word#
212 type Addr#
213 type Float#
214 type Double#
215 type Int64#
216 type Word64#
217 </programlisting>
218
219 <indexterm><primary><literal>Char&num;</literal></primary></indexterm>
220 <indexterm><primary><literal>Int&num;</literal></primary></indexterm>
221 <indexterm><primary><literal>Word&num;</literal></primary></indexterm>
222 <indexterm><primary><literal>Addr&num;</literal></primary></indexterm>
223 <indexterm><primary><literal>Float&num;</literal></primary></indexterm>
224 <indexterm><primary><literal>Double&num;</literal></primary></indexterm>
225 <indexterm><primary><literal>Int64&num;</literal></primary></indexterm>
226 <indexterm><primary><literal>Word64&num;</literal></primary></indexterm>
227
228 <para>
229 If you really want to know their exact equivalents in C, see
230 <filename>ghc/includes/StgTypes.h</filename> in the GHC source tree.
231 </para>
232
233 <para>
234 Literals for these types may be written as follows:
235 </para>
236
237 <para>
238
239 <programlisting>
240 1#              an Int#
241 1.2#            a Float#
242 1.34##          a Double#
243 'a'#            a Char#; for weird characters, use e.g. '\o&#60;octal&#62;'#
244 "a"#            an Addr# (a `char *'); only characters '\0'..'\255' allowed
245 </programlisting>
246
247 <indexterm><primary>literals, primitive</primary></indexterm>
248 <indexterm><primary>constants, primitive</primary></indexterm>
249 <indexterm><primary>numbers, primitive</primary></indexterm>
250 </para>
251
252 </sect2>
253
254 <sect2>
255 <title>Comparison operations</title>
256
257 <para>
258 <indexterm><primary>comparisons, primitive</primary></indexterm>
259 <indexterm><primary>operators, comparison</primary></indexterm>
260 </para>
261
262 <para>
263
264 <programlisting>
265 {&#62;,&#62;=,==,/=,&#60;,&#60;=}# :: Int# -&#62; Int# -&#62; Bool
266
267 {gt,ge,eq,ne,lt,le}Char# :: Char# -&#62; Char# -&#62; Bool
268     -- ditto for Word# and Addr#
269 </programlisting>
270
271 <indexterm><primary><literal>&#62;&num;</literal></primary></indexterm>
272 <indexterm><primary><literal>&#62;=&num;</literal></primary></indexterm>
273 <indexterm><primary><literal>==&num;</literal></primary></indexterm>
274 <indexterm><primary><literal>/=&num;</literal></primary></indexterm>
275 <indexterm><primary><literal>&#60;&num;</literal></primary></indexterm>
276 <indexterm><primary><literal>&#60;=&num;</literal></primary></indexterm>
277 <indexterm><primary><literal>gt&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
278 <indexterm><primary><literal>ge&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
279 <indexterm><primary><literal>eq&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
280 <indexterm><primary><literal>ne&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
281 <indexterm><primary><literal>lt&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
282 <indexterm><primary><literal>le&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
283 </para>
284
285 </sect2>
286
287 <sect2>
288 <title>Primitive-character operations</title>
289
290 <para>
291 <indexterm><primary>characters, primitive operations</primary></indexterm>
292 <indexterm><primary>operators, primitive character</primary></indexterm>
293 </para>
294
295 <para>
296
297 <programlisting>
298 ord# :: Char# -&#62; Int#
299 chr# :: Int# -&#62; Char#
300 </programlisting>
301
302 <indexterm><primary><literal>ord&num;</literal></primary></indexterm>
303 <indexterm><primary><literal>chr&num;</literal></primary></indexterm>
304 </para>
305
306 </sect2>
307
308 <sect2>
309 <title>Primitive-<literal>Int</literal> operations</title>
310
311 <para>
312 <indexterm><primary>integers, primitive operations</primary></indexterm>
313 <indexterm><primary>operators, primitive integer</primary></indexterm>
314 </para>
315
316 <para>
317
318 <programlisting>
319 {+,-,*,quotInt,remInt,gcdInt}# :: Int# -&#62; Int# -&#62; Int#
320 negateInt# :: Int# -&#62; Int#
321
322 iShiftL#, iShiftRA#, iShiftRL# :: Int# -&#62; Int# -&#62; Int#
323         -- shift left, right arithmetic, right logical
324
325 addIntC#, subIntC#, mulIntC# :: Int# -> Int# -> (# Int#, Int# #)
326         -- add, subtract, multiply with carry
327 </programlisting>
328
329 <indexterm><primary><literal>+&num;</literal></primary></indexterm>
330 <indexterm><primary><literal>-&num;</literal></primary></indexterm>
331 <indexterm><primary><literal>*&num;</literal></primary></indexterm>
332 <indexterm><primary><literal>quotInt&num;</literal></primary></indexterm>
333 <indexterm><primary><literal>remInt&num;</literal></primary></indexterm>
334 <indexterm><primary><literal>gcdInt&num;</literal></primary></indexterm>
335 <indexterm><primary><literal>iShiftL&num;</literal></primary></indexterm>
336 <indexterm><primary><literal>iShiftRA&num;</literal></primary></indexterm>
337 <indexterm><primary><literal>iShiftRL&num;</literal></primary></indexterm>
338 <indexterm><primary><literal>addIntC&num;</literal></primary></indexterm>
339 <indexterm><primary><literal>subIntC&num;</literal></primary></indexterm>
340 <indexterm><primary><literal>mulIntC&num;</literal></primary></indexterm>
341 <indexterm><primary>shift operations, integer</primary></indexterm>
342 </para>
343
344 <para>
345 <emphasis>Note:</emphasis> No error/overflow checking!
346 </para>
347
348 </sect2>
349
350 <sect2>
351 <title>Primitive-<literal>Double</literal> and <literal>Float</literal> operations</title>
352
353 <para>
354 <indexterm><primary>floating point numbers, primitive</primary></indexterm>
355 <indexterm><primary>operators, primitive floating point</primary></indexterm>
356 </para>
357
358 <para>
359
360 <programlisting>
361 {+,-,*,/}##         :: Double# -&#62; Double# -&#62; Double#
362 {&#60;,&#60;=,==,/=,&#62;=,&#62;}## :: Double# -&#62; Double# -&#62; Bool
363 negateDouble#       :: Double# -&#62; Double#
364 double2Int#         :: Double# -&#62; Int#
365 int2Double#         :: Int#    -&#62; Double#
366
367 {plus,minux,times,divide}Float# :: Float# -&#62; Float# -&#62; Float#
368 {gt,ge,eq,ne,lt,le}Float# :: Float# -&#62; Float# -&#62; Bool
369 negateFloat#        :: Float# -&#62; Float#
370 float2Int#          :: Float# -&#62; Int#
371 int2Float#          :: Int#   -&#62; Float#
372 </programlisting>
373
374 </para>
375
376 <para>
377 <indexterm><primary><literal>+&num;&num;</literal></primary></indexterm>
378 <indexterm><primary><literal>-&num;&num;</literal></primary></indexterm>
379 <indexterm><primary><literal>*&num;&num;</literal></primary></indexterm>
380 <indexterm><primary><literal>/&num;&num;</literal></primary></indexterm>
381 <indexterm><primary><literal>&#60;&num;&num;</literal></primary></indexterm>
382 <indexterm><primary><literal>&#60;=&num;&num;</literal></primary></indexterm>
383 <indexterm><primary><literal>==&num;&num;</literal></primary></indexterm>
384 <indexterm><primary><literal>=/&num;&num;</literal></primary></indexterm>
385 <indexterm><primary><literal>&#62;=&num;&num;</literal></primary></indexterm>
386 <indexterm><primary><literal>&#62;&num;&num;</literal></primary></indexterm>
387 <indexterm><primary><literal>negateDouble&num;</literal></primary></indexterm>
388 <indexterm><primary><literal>double2Int&num;</literal></primary></indexterm>
389 <indexterm><primary><literal>int2Double&num;</literal></primary></indexterm>
390 </para>
391
392 <para>
393 <indexterm><primary><literal>plusFloat&num;</literal></primary></indexterm>
394 <indexterm><primary><literal>minusFloat&num;</literal></primary></indexterm>
395 <indexterm><primary><literal>timesFloat&num;</literal></primary></indexterm>
396 <indexterm><primary><literal>divideFloat&num;</literal></primary></indexterm>
397 <indexterm><primary><literal>gtFloat&num;</literal></primary></indexterm>
398 <indexterm><primary><literal>geFloat&num;</literal></primary></indexterm>
399 <indexterm><primary><literal>eqFloat&num;</literal></primary></indexterm>
400 <indexterm><primary><literal>neFloat&num;</literal></primary></indexterm>
401 <indexterm><primary><literal>ltFloat&num;</literal></primary></indexterm>
402 <indexterm><primary><literal>leFloat&num;</literal></primary></indexterm>
403 <indexterm><primary><literal>negateFloat&num;</literal></primary></indexterm>
404 <indexterm><primary><literal>float2Int&num;</literal></primary></indexterm>
405 <indexterm><primary><literal>int2Float&num;</literal></primary></indexterm>
406 </para>
407
408 <para>
409 And a full complement of trigonometric functions:
410 </para>
411
412 <para>
413
414 <programlisting>
415 expDouble#      :: Double# -&#62; Double#
416 logDouble#      :: Double# -&#62; Double#
417 sqrtDouble#     :: Double# -&#62; Double#
418 sinDouble#      :: Double# -&#62; Double#
419 cosDouble#      :: Double# -&#62; Double#
420 tanDouble#      :: Double# -&#62; Double#
421 asinDouble#     :: Double# -&#62; Double#
422 acosDouble#     :: Double# -&#62; Double#
423 atanDouble#     :: Double# -&#62; Double#
424 sinhDouble#     :: Double# -&#62; Double#
425 coshDouble#     :: Double# -&#62; Double#
426 tanhDouble#     :: Double# -&#62; Double#
427 powerDouble#    :: Double# -&#62; Double# -&#62; Double#
428 </programlisting>
429
430 <indexterm><primary>trigonometric functions, primitive</primary></indexterm>
431 </para>
432
433 <para>
434 similarly for <literal>Float&num;</literal>.
435 </para>
436
437 <para>
438 There are two coercion functions for <literal>Float&num;</literal>/<literal>Double&num;</literal>:
439 </para>
440
441 <para>
442
443 <programlisting>
444 float2Double#   :: Float# -&#62; Double#
445 double2Float#   :: Double# -&#62; Float#
446 </programlisting>
447
448 <indexterm><primary><literal>float2Double&num;</literal></primary></indexterm>
449 <indexterm><primary><literal>double2Float&num;</literal></primary></indexterm>
450 </para>
451
452 <para>
453 The primitive version of <function>decodeDouble</function>
454 (<function>encodeDouble</function> is implemented as an external C
455 function):
456 </para>
457
458 <para>
459
460 <programlisting>
461 decodeDouble#   :: Double# -&#62; PrelNum.ReturnIntAndGMP
462 </programlisting>
463
464 <indexterm><primary><literal>encodeDouble&num;</literal></primary></indexterm>
465 <indexterm><primary><literal>decodeDouble&num;</literal></primary></indexterm>
466 </para>
467
468 <para>
469 (And the same for <literal>Float&num;</literal>s.)
470 </para>
471
472 </sect2>
473
474 <sect2 id="integer-operations">
475 <title>Operations on/for <literal>Integers</literal> (interface to GMP)
476 </title>
477
478 <para>
479 <indexterm><primary>arbitrary precision integers</primary></indexterm>
480 <indexterm><primary>Integer, operations on</primary></indexterm>
481 </para>
482
483 <para>
484 We implement <literal>Integers</literal> (arbitrary-precision
485 integers) using the GNU multiple-precision (GMP) package (version
486 2.0.2).
487 </para>
488
489 <para>
490 The data type for <literal>Integer</literal> is either a small
491 integer, represented by an <literal>Int</literal>, or a large integer
492 represented using the pieces required by GMP's
493 <literal>MP&lowbar;INT</literal> in <filename>gmp.h</filename> (see
494 <filename>gmp.info</filename> in
495 <filename>ghc/includes/runtime/gmp</filename>).  It comes out as:
496 </para>
497
498 <para>
499
500 <programlisting>
501 data Integer = S# Int#             -- small integers
502              | J# Int# ByteArray#  -- large integers
503 </programlisting>
504
505 <indexterm><primary>Integer type</primary></indexterm> The primitive
506 ops to support large <literal>Integers</literal> use the
507 &ldquo;pieces&rdquo; of the representation, and are as follows:
508 </para>
509
510 <para>
511
512 <programlisting>
513 negateInteger#  :: Int# -&#62; ByteArray# -&#62; Integer
514
515 {plus,minus,times}Integer#, gcdInteger#, 
516   quotInteger#, remInteger#, divExactInteger#
517         :: Int# -> ByteArray#
518         -> Int# -> ByteArray#
519         -> (# Int#, ByteArray# #)
520
521 cmpInteger# 
522         :: Int# -> ByteArray#
523         -> Int# -> ByteArray#
524         -> Int# -- -1 for &#60;; 0 for ==; +1 for >
525
526 cmpIntegerInt# 
527         :: Int# -> ByteArray#
528         -> Int#
529         -> Int# -- -1 for &#60;; 0 for ==; +1 for >
530
531 gcdIntegerInt# :: 
532         :: Int# -> ByteArray#
533         -> Int#
534         -> Int#
535
536 divModInteger#, quotRemInteger#
537         :: Int# -> ByteArray#
538         -> Int# -> ByteArray#
539         -> (# Int#, ByteArray#,
540                   Int#, ByteArray# #)
541
542 integer2Int# :: Int# -> ByteArray# -> Int#
543
544 int2Integer#  :: Int#  -> Integer -- NB: no error-checking on these two!
545 word2Integer# :: Word# -> Integer
546
547 addr2Integer# :: Addr# -> Integer
548         -- the Addr# is taken to be a `char *' string
549         -- to be converted into an Integer.
550 </programlisting>
551
552 <indexterm><primary><literal>negateInteger&num;</literal></primary></indexterm>
553 <indexterm><primary><literal>plusInteger&num;</literal></primary></indexterm>
554 <indexterm><primary><literal>minusInteger&num;</literal></primary></indexterm>
555 <indexterm><primary><literal>timesInteger&num;</literal></primary></indexterm>
556 <indexterm><primary><literal>quotInteger&num;</literal></primary></indexterm>
557 <indexterm><primary><literal>remInteger&num;</literal></primary></indexterm>
558 <indexterm><primary><literal>gcdInteger&num;</literal></primary></indexterm>
559 <indexterm><primary><literal>gcdIntegerInt&num;</literal></primary></indexterm>
560 <indexterm><primary><literal>divExactInteger&num;</literal></primary></indexterm>
561 <indexterm><primary><literal>cmpInteger&num;</literal></primary></indexterm>
562 <indexterm><primary><literal>divModInteger&num;</literal></primary></indexterm>
563 <indexterm><primary><literal>quotRemInteger&num;</literal></primary></indexterm>
564 <indexterm><primary><literal>integer2Int&num;</literal></primary></indexterm>
565 <indexterm><primary><literal>int2Integer&num;</literal></primary></indexterm>
566 <indexterm><primary><literal>word2Integer&num;</literal></primary></indexterm>
567 <indexterm><primary><literal>addr2Integer&num;</literal></primary></indexterm>
568 </para>
569
570 </sect2>
571
572 <sect2>
573 <title>Words and addresses</title>
574
575 <para>
576 <indexterm><primary>word, primitive type</primary></indexterm>
577 <indexterm><primary>address, primitive type</primary></indexterm>
578 <indexterm><primary>unsigned integer, primitive type</primary></indexterm>
579 <indexterm><primary>pointer, primitive type</primary></indexterm>
580 </para>
581
582 <para>
583 A <literal>Word&num;</literal> is used for bit-twiddling operations.
584 It is the same size as an <literal>Int&num;</literal>, but has no sign
585 nor any arithmetic operations.
586
587 <programlisting>
588 type Word#      -- Same size/etc as Int# but *unsigned*
589 type Addr#      -- A pointer from outside the "Haskell world" (from C, probably);
590                 -- described under "arrays"
591 </programlisting>
592
593 <indexterm><primary><literal>Word&num;</literal></primary></indexterm>
594 <indexterm><primary><literal>Addr&num;</literal></primary></indexterm>
595 </para>
596
597 <para>
598 <literal>Word&num;</literal>s and <literal>Addr&num;</literal>s have
599 the usual comparison operations.  Other
600 unboxed-<literal>Word</literal> ops (bit-twiddling and coercions):
601 </para>
602
603 <para>
604
605 <programlisting>
606 {gt,ge,eq,ne,lt,le}Word# :: Word# -> Word# -> Bool
607
608 and#, or#, xor# :: Word# -> Word# -> Word#
609         -- standard bit ops.
610
611 quotWord#, remWord# :: Word# -> Word# -> Word#
612         -- word (i.e. unsigned) versions are different from int
613         -- versions, so we have to provide these explicitly.
614
615 not# :: Word# -> Word#
616
617 shiftL#, shiftRL# :: Word# -> Int# -> Word#
618         -- shift left, right logical
619
620 int2Word#       :: Int#  -> Word# -- just a cast, really
621 word2Int#       :: Word# -> Int#
622 </programlisting>
623
624 <indexterm><primary>bit operations, Word and Addr</primary></indexterm>
625 <indexterm><primary><literal>gtWord&num;</literal></primary></indexterm>
626 <indexterm><primary><literal>geWord&num;</literal></primary></indexterm>
627 <indexterm><primary><literal>eqWord&num;</literal></primary></indexterm>
628 <indexterm><primary><literal>neWord&num;</literal></primary></indexterm>
629 <indexterm><primary><literal>ltWord&num;</literal></primary></indexterm>
630 <indexterm><primary><literal>leWord&num;</literal></primary></indexterm>
631 <indexterm><primary><literal>and&num;</literal></primary></indexterm>
632 <indexterm><primary><literal>or&num;</literal></primary></indexterm>
633 <indexterm><primary><literal>xor&num;</literal></primary></indexterm>
634 <indexterm><primary><literal>not&num;</literal></primary></indexterm>
635 <indexterm><primary><literal>quotWord&num;</literal></primary></indexterm>
636 <indexterm><primary><literal>remWord&num;</literal></primary></indexterm>
637 <indexterm><primary><literal>shiftL&num;</literal></primary></indexterm>
638 <indexterm><primary><literal>shiftRA&num;</literal></primary></indexterm>
639 <indexterm><primary><literal>shiftRL&num;</literal></primary></indexterm>
640 <indexterm><primary><literal>int2Word&num;</literal></primary></indexterm>
641 <indexterm><primary><literal>word2Int&num;</literal></primary></indexterm>
642 </para>
643
644 <para>
645 Unboxed-<literal>Addr</literal> ops (C casts, really):
646
647 <programlisting>
648 {gt,ge,eq,ne,lt,le}Addr# :: Addr# -> Addr# -> Bool
649
650 int2Addr#       :: Int#  -> Addr#
651 addr2Int#       :: Addr# -> Int#
652 addr2Integer#   :: Addr# -> (# Int#, ByteArray# #)
653 </programlisting>
654
655 <indexterm><primary><literal>gtAddr&num;</literal></primary></indexterm>
656 <indexterm><primary><literal>geAddr&num;</literal></primary></indexterm>
657 <indexterm><primary><literal>eqAddr&num;</literal></primary></indexterm>
658 <indexterm><primary><literal>neAddr&num;</literal></primary></indexterm>
659 <indexterm><primary><literal>ltAddr&num;</literal></primary></indexterm>
660 <indexterm><primary><literal>leAddr&num;</literal></primary></indexterm>
661 <indexterm><primary><literal>int2Addr&num;</literal></primary></indexterm>
662 <indexterm><primary><literal>addr2Int&num;</literal></primary></indexterm>
663 <indexterm><primary><literal>addr2Integer&num;</literal></primary></indexterm>
664 </para>
665
666 <para>
667 The casts between <literal>Int&num;</literal>,
668 <literal>Word&num;</literal> and <literal>Addr&num;</literal>
669 correspond to null operations at the machine level, but are required
670 to keep the Haskell type checker happy.
671 </para>
672
673 <para>
674 Operations for indexing off of C pointers
675 (<literal>Addr&num;</literal>s) to snatch values are listed under
676 &ldquo;arrays&rdquo;.
677 </para>
678
679 </sect2>
680
681 <sect2>
682 <title>Arrays</title>
683
684 <para>
685 <indexterm><primary>arrays, primitive</primary></indexterm>
686 </para>
687
688 <para>
689 The type <literal>Array&num; elt</literal> is the type of primitive,
690 unpointed arrays of values of type <literal>elt</literal>.
691 </para>
692
693 <para>
694
695 <programlisting>
696 type Array# elt
697 </programlisting>
698
699 <indexterm><primary><literal>Array&num;</literal></primary></indexterm>
700 </para>
701
702 <para>
703 <literal>Array&num;</literal> is more primitive than a Haskell
704 array&mdash;indeed, the Haskell <literal>Array</literal> interface is
705 implemented using <literal>Array&num;</literal>&mdash;in that an
706 <literal>Array&num;</literal> is indexed only by
707 <literal>Int&num;</literal>s, starting at zero.  It is also more
708 primitive by virtue of being unboxed.  That doesn't mean that it isn't
709 a heap-allocated object&mdash;of course, it is.  Rather, being unboxed
710 means that it is represented by a pointer to the array itself, and not
711 to a thunk which will evaluate to the array (or to bottom).  The
712 components of an <literal>Array&num;</literal> are themselves boxed.
713 </para>
714
715 <para>
716 The type <literal>ByteArray&num;</literal> is similar to
717 <literal>Array&num;</literal>, except that it contains just a string
718 of (non-pointer) bytes.
719 </para>
720
721 <para>
722
723 <programlisting>
724 type ByteArray#
725 </programlisting>
726
727 <indexterm><primary><literal>ByteArray&num;</literal></primary></indexterm>
728 </para>
729
730 <para>
731 Arrays of these types are useful when a Haskell program wishes to
732 construct a value to pass to a C procedure. It is also possible to use
733 them to build (say) arrays of unboxed characters for internal use in a
734 Haskell program.  Given these uses, <literal>ByteArray&num;</literal>
735 is deliberately a bit vague about the type of its components.
736 Operations are provided to extract values of type
737 <literal>Char&num;</literal>, <literal>Int&num;</literal>,
738 <literal>Float&num;</literal>, <literal>Double&num;</literal>, and
739 <literal>Addr&num;</literal> from arbitrary offsets within a
740 <literal>ByteArray&num;</literal>.  (For type
741 <literal>Foo&num;</literal>, the $i$th offset gets you the $i$th
742 <literal>Foo&num;</literal>, not the <literal>Foo&num;</literal> at
743 byte-position $i$.  Mumble.)  (If you want a
744 <literal>Word&num;</literal>, grab an <literal>Int&num;</literal>,
745 then coerce it.)
746 </para>
747
748 <para>
749 Lastly, we have static byte-arrays, of type
750 <literal>Addr&num;</literal> &lsqb;mentioned previously].  (Remember
751 the duality between arrays and pointers in C.)  Arrays of this types
752 are represented by a pointer to an array in the world outside Haskell,
753 so this pointer is not followed by the garbage collector.  In other
754 respects they are just like <literal>ByteArray&num;</literal>.  They
755 are only needed in order to pass values from C to Haskell.
756 </para>
757
758 </sect2>
759
760 <sect2>
761 <title>Reading and writing</title>
762
763 <para>
764 Primitive arrays are linear, and indexed starting at zero.
765 </para>
766
767 <para>
768 The size and indices of a <literal>ByteArray&num;</literal>, <literal>Addr&num;</literal>, and
769 <literal>MutableByteArray&num;</literal> are all in bytes.  It's up to the program to
770 calculate the correct byte offset from the start of the array.  This
771 allows a <literal>ByteArray&num;</literal> to contain a mixture of values of different
772 type, which is often needed when preparing data for and unpicking
773 results from C.  (Umm&hellip;not true of indices&hellip;WDP 95/09)
774 </para>
775
776 <para>
777 <emphasis>Should we provide some <literal>sizeOfDouble&num;</literal> constants?</emphasis>
778 </para>
779
780 <para>
781 Out-of-range errors on indexing should be caught by the code which
782 uses the primitive operation; the primitive operations themselves do
783 <emphasis>not</emphasis> check for out-of-range indexes. The intention is that the
784 primitive ops compile to one machine instruction or thereabouts.
785 </para>
786
787 <para>
788 We use the terms &ldquo;reading&rdquo; and &ldquo;writing&rdquo; to refer to accessing
789 <emphasis>mutable</emphasis> arrays (see <xref LinkEnd="sect-mutable">), and
790 &ldquo;indexing&rdquo; to refer to reading a value from an <emphasis>immutable</emphasis>
791 array.
792 </para>
793
794 <para>
795 Immutable byte arrays are straightforward to index (all indices are in
796 units of the size of the object being read):
797
798 <programlisting>
799 indexCharArray#   :: ByteArray# -> Int# -> Char#
800 indexIntArray#    :: ByteArray# -> Int# -> Int#
801 indexAddrArray#   :: ByteArray# -> Int# -> Addr#
802 indexFloatArray#  :: ByteArray# -> Int# -> Float#
803 indexDoubleArray# :: ByteArray# -> Int# -> Double#
804
805 indexCharOffAddr#   :: Addr# -> Int# -> Char#
806 indexIntOffAddr#    :: Addr# -> Int# -> Int#
807 indexFloatOffAddr#  :: Addr# -> Int# -> Float#
808 indexDoubleOffAddr# :: Addr# -> Int# -> Double#
809 indexAddrOffAddr#   :: Addr# -> Int# -> Addr#
810  -- Get an Addr# from an Addr# offset
811 </programlisting>
812
813 <indexterm><primary><literal>indexCharArray&num;</literal></primary></indexterm>
814 <indexterm><primary><literal>indexIntArray&num;</literal></primary></indexterm>
815 <indexterm><primary><literal>indexAddrArray&num;</literal></primary></indexterm>
816 <indexterm><primary><literal>indexFloatArray&num;</literal></primary></indexterm>
817 <indexterm><primary><literal>indexDoubleArray&num;</literal></primary></indexterm>
818 <indexterm><primary><literal>indexCharOffAddr&num;</literal></primary></indexterm>
819 <indexterm><primary><literal>indexIntOffAddr&num;</literal></primary></indexterm>
820 <indexterm><primary><literal>indexFloatOffAddr&num;</literal></primary></indexterm>
821 <indexterm><primary><literal>indexDoubleOffAddr&num;</literal></primary></indexterm>
822 <indexterm><primary><literal>indexAddrOffAddr&num;</literal></primary></indexterm>
823 </para>
824
825 <para>
826 The last of these, <function>indexAddrOffAddr&num;</function>, extracts an <literal>Addr&num;</literal> using an offset
827 from another <literal>Addr&num;</literal>, thereby providing the ability to follow a chain of
828 C pointers.
829 </para>
830
831 <para>
832 Something a bit more interesting goes on when indexing arrays of boxed
833 objects, because the result is simply the boxed object. So presumably
834 it should be entered&mdash;we never usually return an unevaluated
835 object!  This is a pain: primitive ops aren't supposed to do
836 complicated things like enter objects.  The current solution is to
837 return a single element unboxed tuple (see <xref LinkEnd="unboxed-tuples">).
838 </para>
839
840 <para>
841
842 <programlisting>
843 indexArray#       :: Array# elt -> Int# -> (# elt #)
844 </programlisting>
845
846 <indexterm><primary><literal>indexArray&num;</literal></primary></indexterm>
847 </para>
848
849 </sect2>
850
851 <sect2>
852 <title>The state type</title>
853
854 <para>
855 <indexterm><primary><literal>state, primitive type</literal></primary></indexterm>
856 <indexterm><primary><literal>State&num;</literal></primary></indexterm>
857 </para>
858
859 <para>
860 The primitive type <literal>State&num;</literal> represents the state of a state
861 transformer.  It is parameterised on the desired type of state, which
862 serves to keep states from distinct threads distinct from one another.
863 But the <emphasis>only</emphasis> effect of this parameterisation is in the type
864 system: all values of type <literal>State&num;</literal> are represented in the same way.
865 Indeed, they are all represented by nothing at all!  The code
866 generator &ldquo;knows&rdquo; to generate no code, and allocate no registers
867 etc, for primitive states.
868 </para>
869
870 <para>
871
872 <programlisting>
873 type State# s
874 </programlisting>
875
876 </para>
877
878 <para>
879 The type <literal>GHC.RealWorld</literal> is truly opaque: there are no values defined
880 of this type, and no operations over it.  It is &ldquo;primitive&rdquo; in that
881 sense - but it is <emphasis>not unlifted!</emphasis> Its only role in life is to be
882 the type which distinguishes the <literal>IO</literal> state transformer.
883 </para>
884
885 <para>
886
887 <programlisting>
888 data RealWorld
889 </programlisting>
890
891 </para>
892
893 </sect2>
894
895 <sect2>
896 <title>State of the world</title>
897
898 <para>
899 A single, primitive, value of type <literal>State&num; RealWorld</literal> is provided.
900 </para>
901
902 <para>
903
904 <programlisting>
905 realWorld# :: State# RealWorld
906 </programlisting>
907
908 <indexterm><primary>realWorld&num; state object</primary></indexterm>
909 </para>
910
911 <para>
912 (Note: in the compiler, not a <literal>PrimOp</literal>; just a mucho magic
913 <literal>Id</literal>. Exported from <literal>GHC</literal>, though).
914 </para>
915
916 </sect2>
917
918 <sect2 id="sect-mutable">
919 <title>Mutable arrays</title>
920
921 <para>
922 <indexterm><primary>mutable arrays</primary></indexterm>
923 <indexterm><primary>arrays, mutable</primary></indexterm>
924 Corresponding to <literal>Array&num;</literal> and <literal>ByteArray&num;</literal>, we have the types of
925 mutable versions of each.  In each case, the representation is a
926 pointer to a suitable block of (mutable) heap-allocated storage.
927 </para>
928
929 <para>
930
931 <programlisting>
932 type MutableArray# s elt
933 type MutableByteArray# s
934 </programlisting>
935
936 <indexterm><primary><literal>MutableArray&num;</literal></primary></indexterm>
937 <indexterm><primary><literal>MutableByteArray&num;</literal></primary></indexterm>
938 </para>
939
940 <sect3>
941 <title>Allocation</title>
942
943 <para>
944 <indexterm><primary>mutable arrays, allocation</primary></indexterm>
945 <indexterm><primary>arrays, allocation</primary></indexterm>
946 <indexterm><primary>allocation, of mutable arrays</primary></indexterm>
947 </para>
948
949 <para>
950 Mutable arrays can be allocated. Only pointer-arrays are initialised;
951 arrays of non-pointers are filled in by &ldquo;user code&rdquo; rather than by
952 the array-allocation primitive.  Reason: only the pointer case has to
953 worry about GC striking with a partly-initialised array.
954 </para>
955
956 <para>
957
958 <programlisting>
959 newArray#       :: Int# -> elt -> State# s -> (# State# s, MutableArray# s elt #)
960
961 newCharArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
962 newIntArray#    :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
963 newAddrArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
964 newFloatArray#  :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
965 newDoubleArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
966 </programlisting>
967
968 <indexterm><primary><literal>newArray&num;</literal></primary></indexterm>
969 <indexterm><primary><literal>newCharArray&num;</literal></primary></indexterm>
970 <indexterm><primary><literal>newIntArray&num;</literal></primary></indexterm>
971 <indexterm><primary><literal>newAddrArray&num;</literal></primary></indexterm>
972 <indexterm><primary><literal>newFloatArray&num;</literal></primary></indexterm>
973 <indexterm><primary><literal>newDoubleArray&num;</literal></primary></indexterm>
974 </para>
975
976 <para>
977 The size of a <literal>ByteArray&num;</literal> is given in bytes.
978 </para>
979
980 </sect3>
981
982 <sect3>
983 <title>Reading and writing</title>
984
985 <para>
986 <indexterm><primary>arrays, reading and writing</primary></indexterm>
987 </para>
988
989 <para>
990
991 <programlisting>
992 readArray#       :: MutableArray# s elt -> Int# -> State# s -> (# State# s, elt #)
993 readCharArray#   :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Char# #)
994 readIntArray#    :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #)
995 readAddrArray#   :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Addr# #)
996 readFloatArray#  :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Float# #)
997 readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Double# #)
998
999 writeArray#       :: MutableArray# s elt -> Int# -> elt     -> State# s -> State# s
1000 writeCharArray#   :: MutableByteArray# s -> Int# -> Char#   -> State# s -> State# s
1001 writeIntArray#    :: MutableByteArray# s -> Int# -> Int#    -> State# s -> State# s
1002 writeAddrArray#   :: MutableByteArray# s -> Int# -> Addr#   -> State# s -> State# s
1003 writeFloatArray#  :: MutableByteArray# s -> Int# -> Float#  -> State# s -> State# s
1004 writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s
1005 </programlisting>
1006
1007 <indexterm><primary><literal>readArray&num;</literal></primary></indexterm>
1008 <indexterm><primary><literal>readCharArray&num;</literal></primary></indexterm>
1009 <indexterm><primary><literal>readIntArray&num;</literal></primary></indexterm>
1010 <indexterm><primary><literal>readAddrArray&num;</literal></primary></indexterm>
1011 <indexterm><primary><literal>readFloatArray&num;</literal></primary></indexterm>
1012 <indexterm><primary><literal>readDoubleArray&num;</literal></primary></indexterm>
1013 <indexterm><primary><literal>writeArray&num;</literal></primary></indexterm>
1014 <indexterm><primary><literal>writeCharArray&num;</literal></primary></indexterm>
1015 <indexterm><primary><literal>writeIntArray&num;</literal></primary></indexterm>
1016 <indexterm><primary><literal>writeAddrArray&num;</literal></primary></indexterm>
1017 <indexterm><primary><literal>writeFloatArray&num;</literal></primary></indexterm>
1018 <indexterm><primary><literal>writeDoubleArray&num;</literal></primary></indexterm>
1019 </para>
1020
1021 </sect3>
1022
1023 <sect3>
1024 <title>Equality</title>
1025
1026 <para>
1027 <indexterm><primary>arrays, testing for equality</primary></indexterm>
1028 </para>
1029
1030 <para>
1031 One can take &ldquo;equality&rdquo; of mutable arrays.  What is compared is the
1032 <emphasis>name</emphasis> or reference to the mutable array, not its contents.
1033 </para>
1034
1035 <para>
1036
1037 <programlisting>
1038 sameMutableArray#     :: MutableArray# s elt -> MutableArray# s elt -> Bool
1039 sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
1040 </programlisting>
1041
1042 <indexterm><primary><literal>sameMutableArray&num;</literal></primary></indexterm>
1043 <indexterm><primary><literal>sameMutableByteArray&num;</literal></primary></indexterm>
1044 </para>
1045
1046 </sect3>
1047
1048 <sect3>
1049 <title>Freezing mutable arrays</title>
1050
1051 <para>
1052 <indexterm><primary>arrays, freezing mutable</primary></indexterm>
1053 <indexterm><primary>freezing mutable arrays</primary></indexterm>
1054 <indexterm><primary>mutable arrays, freezing</primary></indexterm>
1055 </para>
1056
1057 <para>
1058 Only unsafe-freeze has a primitive.  (Safe freeze is done directly in Haskell
1059 by copying the array and then using <function>unsafeFreeze</function>.)
1060 </para>
1061
1062 <para>
1063
1064 <programlisting>
1065 unsafeFreezeArray#     :: MutableArray# s elt -> State# s -> (# State# s, Array# s elt #)
1066 unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> (# State# s, ByteArray# #)
1067 </programlisting>
1068
1069 <indexterm><primary><literal>unsafeFreezeArray&num;</literal></primary></indexterm>
1070 <indexterm><primary><literal>unsafeFreezeByteArray&num;</literal></primary></indexterm>
1071 </para>
1072
1073 </sect3>
1074
1075 </sect2>
1076
1077 <sect2>
1078 <title>Synchronizing variables (M-vars)</title>
1079
1080 <para>
1081 <indexterm><primary>synchronising variables (M-vars)</primary></indexterm>
1082 <indexterm><primary>M-Vars</primary></indexterm>
1083 </para>
1084
1085 <para>
1086 Synchronising variables are the primitive type used to implement
1087 Concurrent Haskell's MVars (see the Concurrent Haskell paper for
1088 the operational behaviour of these operations).
1089 </para>
1090
1091 <para>
1092
1093 <programlisting>
1094 type MVar# s elt        -- primitive
1095
1096 newMVar#    :: State# s -> (# State# s, MVar# s elt #)
1097 takeMVar#   :: SynchVar# s elt -> State# s -> (# State# s, elt #)
1098 putMVar#    :: SynchVar# s elt -> State# s -> State# s
1099 </programlisting>
1100
1101 <indexterm><primary><literal>SynchVar&num;</literal></primary></indexterm>
1102 <indexterm><primary><literal>newSynchVar&num;</literal></primary></indexterm>
1103 <indexterm><primary><literal>takeMVar</literal></primary></indexterm>
1104 <indexterm><primary><literal>putMVar</literal></primary></indexterm>
1105 </para>
1106
1107 </sect2>
1108
1109 </sect1>
1110
1111 <!-- Emacs stuff:
1112      ;;; Local Variables: ***
1113      ;;; mode: sgml ***
1114      ;;; sgml-parent-document: ("users_guide.sgml" "book" "chapter" "sect1") ***
1115      ;;; End: ***
1116  -->