ghc/docs/users_guide/primitives.sgml

   1 <!-- UNBOXED TYPES AND PRIMITIVE OPERATIONS -->
   2
   3 <sect1 id="primitives">
   4   <title>Unboxed types and primitive operations</title>
   5   <indexterm><primary>PrelGHC module</primary></indexterm>
   6
   7   <para>This module defines all the types which are primitive in
   8   Glasgow Haskell, and the operations provided for them.</para>
   9
  10   <para>Note: while you really can use this stuff to write fast code,
  11   we generally find it a lot less painful, and more satisfying in the
  12   long run, to use higher-level language features and libraries.  With
  13   any luck, the code you write will be optimised to the efficient
  14   unboxed version in any case.  And if it isn't, we'd like to know
  15   about it.</para>
  16
  17 <sect2 id="glasgow-unboxed">
  18 <title>Unboxed types
  19 </title>
  20
  21 <para>
  22 <indexterm><primary>Unboxed types (Glasgow extension)</primary></indexterm>
  23 </para>
  24
  25 <para>Most types in GHC are <firstterm>boxed</firstterm>, which means
  26 that values of that type are represented by a pointer to a heap
  27 object.  The representation of a Haskell <literal>Int</literal>, for
  28 example, is a two-word heap object.  An <firstterm>unboxed</firstterm>
  29 type, however, is represented by the value itself, no pointers or heap
  30 allocation are involved.
  31 </para>
  32
  33 <para>
  34 Unboxed types correspond to the &ldquo;raw machine&rdquo; types you
  35 would use in C: <literal>Int&num;</literal> (long int),
  36 <literal>Double&num;</literal> (double), <literal>Addr&num;</literal>
  37 (void *), etc.  The <emphasis>primitive operations</emphasis>
  38 (PrimOps) on these types are what you might expect; e.g.,
  39 <literal>(+&num;)</literal> is addition on
  40 <literal>Int&num;</literal>s, and is the machine-addition that we all
  41 know and love&mdash;usually one instruction.
  42 </para>
  43
  44 <para>
  45 Primitive (unboxed) types cannot be defined in Haskell, and are
  46 therefore built into the language and compiler.  Primitive types are
  47 always unlifted; that is, a value of a primitive type cannot be
  48 bottom.  We use the convention that primitive types, values, and
  49 operations have a <literal>&num;</literal> suffix.
  50 </para>
  51
  52 <para>
  53 Primitive values are often represented by a simple bit-pattern, such
  54 as <literal>Int&num;</literal>, <literal>Float&num;</literal>,
  55 <literal>Double&num;</literal>.  But this is not necessarily the case:
  56 a primitive value might be represented by a pointer to a
  57 heap-allocated object.  Examples include
  58 <literal>Array&num;</literal>, the type of primitive arrays.  A
  59 primitive array is heap-allocated because it is too big a value to fit
  60 in a register, and would be too expensive to copy around; in a sense,
  61 it is accidental that it is represented by a pointer.  If a pointer
  62 represents a primitive value, then it really does point to that value:
  63 no unevaluated thunks, no indirections&hellip;nothing can be at the
  64 other end of the pointer than the primitive value.
  65 </para>
  66
  67 <para>
  68 There are some restrictions on the use of primitive types, the main
  69 one being that you can't pass a primitive value to a polymorphic
  70 function or store one in a polymorphic data type.  This rules out
  71 things like <literal>[Int&num;]</literal> (i.e. lists of primitive
  72 integers).  The reason for this restriction is that polymorphic
  73 arguments and constructor fields are assumed to be pointers: if an
  74 unboxed integer is stored in one of these, the garbage collector would
  75 attempt to follow it, leading to unpredictable space leaks.  Or a
  76 <function>seq</function> operation on the polymorphic component may
  77 attempt to dereference the pointer, with disastrous results.  Even
  78 worse, the unboxed value might be larger than a pointer
  79 (<literal>Double&num;</literal> for instance).
  80 </para>
  81
  82 <para>
  83 Nevertheless, A numerically-intensive program using unboxed types can
  84 go a <emphasis>lot</emphasis> faster than its &ldquo;standard&rdquo;
  85 counterpart&mdash;we saw a threefold speedup on one example.
  86 </para>
  87
  88 </sect2>
  89
  90 <sect2 id="unboxed-tuples">
  91 <title>Unboxed Tuples
  92 </title>
  93
  94 <para>
  95 Unboxed tuples aren't really exported by <literal>PrelGHC</literal>,
  96 they're available by default with <option>-fglasgow-exts</option>.  An
  97 unboxed tuple looks like this:
  98 </para>
  99
 100 <para>
 101
 102 <programlisting>
 103 (# e_1, ..., e_n #)
 104 </programlisting>
 105
 106 </para>
 107
 108 <para>
 109 where <literal>e&lowbar;1..e&lowbar;n</literal> are expressions of any
 110 type (primitive or non-primitive).  The type of an unboxed tuple looks
 111 the same.
 112 </para>
 113
 114 <para>
 115 Unboxed tuples are used for functions that need to return multiple
 116 values, but they avoid the heap allocation normally associated with
 117 using fully-fledged tuples.  When an unboxed tuple is returned, the
 118 components are put directly into registers or on the stack; the
 119 unboxed tuple itself does not have a composite representation.  Many
 120 of the primitive operations listed in this section return unboxed
 121 tuples.
 122 </para>
 123
 124 <para>
 125 There are some pretty stringent restrictions on the use of unboxed tuples:
 126 </para>
 127
 128 <para>
 129
 130 <itemizedlist>
 131 <listitem>
 132
 133 <para>
 134  Unboxed tuple types are subject to the same restrictions as
 135 other unboxed types; i.e. they may not be stored in polymorphic data
 136 structures or passed to polymorphic functions.
 137
 138 </para>
 139 </listitem>
 140 <listitem>
 141
 142 <para>
 143  Unboxed tuples may only be constructed as the direct result of
 144 a function, and may only be deconstructed with a <literal>case</literal> expression.
 145 eg. the following are valid:
 146
 147
 148 <programlisting>
 149 f x y = (# x+1, y-1 #)
 150 g x = case f x x of { (# a, b #) -&#62; a + b }
 151 </programlisting>
 152
 153
 154 but the following are invalid:
 155
 156
 157 <programlisting>
 158 f x y = g (# x, y #)
 159 g (# x, y #) = x + y
 160 </programlisting>
 161
 162
 163 </para>
 164 </listitem>
 165 <listitem>
 166
 167 <para>
 168  No variable can have an unboxed tuple type.  This is illegal:
 169
 170
 171 <programlisting>
 172 f :: (# Int, Int #) -&#62; (# Int, Int #)
 173 f x = x
 174 </programlisting>
 175
 176
 177 because <literal>x</literal> has an unboxed tuple type.
 178
 179 </para>
 180 </listitem>
 181
 182 </itemizedlist>
 183
 184 </para>
 185
 186 <para>
 187 Note: we may relax some of these restrictions in the future.
 188 </para>
 189
 190 <para>
 191 The <literal>IO</literal> and <literal>ST</literal> monads use unboxed
 192 tuples to avoid unnecessary allocation during sequences of operations.
 193 </para>
 194
 195 </sect2>
 196
 197 <sect2>
 198 <title>Character and numeric types</title>
 199
 200 <indexterm><primary>character types, primitive</primary></indexterm>
 201 <indexterm><primary>numeric types, primitive</primary></indexterm>
 202 <indexterm><primary>integer types, primitive</primary></indexterm>
 203 <indexterm><primary>floating point types, primitive</primary></indexterm>
 204 <para>
 205 There are the following obvious primitive types:
 206 </para>
 207
 208 <programlisting>
 209 type Char#
 210 type Int#
 211 type Word#
 212 type Addr#
 213 type Float#
 214 type Double#
 215 type Int64#
 216 type Word64#
 217 </programlisting>
 218
 219 <indexterm><primary><literal>Char&num;</literal></primary></indexterm>
 220 <indexterm><primary><literal>Int&num;</literal></primary></indexterm>
 221 <indexterm><primary><literal>Word&num;</literal></primary></indexterm>
 222 <indexterm><primary><literal>Addr&num;</literal></primary></indexterm>
 223 <indexterm><primary><literal>Float&num;</literal></primary></indexterm>
 224 <indexterm><primary><literal>Double&num;</literal></primary></indexterm>
 225 <indexterm><primary><literal>Int64&num;</literal></primary></indexterm>
 226 <indexterm><primary><literal>Word64&num;</literal></primary></indexterm>
 227
 228 <para>
 229 If you really want to know their exact equivalents in C, see
 230 <filename>ghc/includes/StgTypes.h</filename> in the GHC source tree.
 231 </para>
 232
 233 <para>
 234 Literals for these types may be written as follows:
 235 </para>
 236
 237 <para>
 238
 239 <programlisting>
 240 1#              an Int#
 241 1.2#            a Float#
 242 1.34##          a Double#
 243 'a'#            a Char#; for weird characters, use e.g. '\o&#60;octal&#62;'#
 244 "a"#            an Addr# (a `char *'); only characters '\0'..'\255' allowed
 245 </programlisting>
 246
 247 <indexterm><primary>literals, primitive</primary></indexterm>
 248 <indexterm><primary>constants, primitive</primary></indexterm>
 249 <indexterm><primary>numbers, primitive</primary></indexterm>
 250 </para>
 251
 252 </sect2>
 253
 254 <sect2>
 255 <title>Comparison operations</title>
 256
 257 <para>
 258 <indexterm><primary>comparisons, primitive</primary></indexterm>
 259 <indexterm><primary>operators, comparison</primary></indexterm>
 260 </para>
 261
 262 <para>
 263
 264 <programlisting>
 265 {&#62;,&#62;=,==,/=,&#60;,&#60;=}# :: Int# -&#62; Int# -&#62; Bool
 266
 267 {gt,ge,eq,ne,lt,le}Char# :: Char# -&#62; Char# -&#62; Bool
 268     -- ditto for Word# and Addr#
 269 </programlisting>
 270
 271 <indexterm><primary><literal>&#62;&num;</literal></primary></indexterm>
 272 <indexterm><primary><literal>&#62;=&num;</literal></primary></indexterm>
 273 <indexterm><primary><literal>==&num;</literal></primary></indexterm>
 274 <indexterm><primary><literal>/=&num;</literal></primary></indexterm>
 275 <indexterm><primary><literal>&#60;&num;</literal></primary></indexterm>
 276 <indexterm><primary><literal>&#60;=&num;</literal></primary></indexterm>
 277 <indexterm><primary><literal>gt&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
 278 <indexterm><primary><literal>ge&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
 279 <indexterm><primary><literal>eq&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
 280 <indexterm><primary><literal>ne&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
 281 <indexterm><primary><literal>lt&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
 282 <indexterm><primary><literal>le&lcub;Char,Word,Addr&rcub;&num;</literal></primary></indexterm>
 283 </para>
 284
 285 </sect2>
 286
 287 <sect2>
 288 <title>Primitive-character operations</title>
 289
 290 <para>
 291 <indexterm><primary>characters, primitive operations</primary></indexterm>
 292 <indexterm><primary>operators, primitive character</primary></indexterm>
 293 </para>
 294
 295 <para>
 296
 297 <programlisting>
 298 ord# :: Char# -&#62; Int#
 299 chr# :: Int# -&#62; Char#
 300 </programlisting>
 301
 302 <indexterm><primary><literal>ord&num;</literal></primary></indexterm>
 303 <indexterm><primary><literal>chr&num;</literal></primary></indexterm>
 304 </para>
 305
 306 </sect2>
 307
 308 <sect2>
 309 <title>Primitive-<literal>Int</literal> operations</title>
 310
 311 <para>
 312 <indexterm><primary>integers, primitive operations</primary></indexterm>
 313 <indexterm><primary>operators, primitive integer</primary></indexterm>
 314 </para>
 315
 316 <para>
 317
 318 <programlisting>
 319 {+,-,*,quotInt,remInt,gcdInt}# :: Int# -&#62; Int# -&#62; Int#
 320 negateInt# :: Int# -&#62; Int#
 321
 322 iShiftL#, iShiftRA#, iShiftRL# :: Int# -&#62; Int# -&#62; Int#
 323         -- shift left, right arithmetic, right logical
 324
 325 addIntC#, subIntC#, mulIntC# :: Int# -> Int# -> (# Int#, Int# #)
 326         -- add, subtract, multiply with carry
 327 </programlisting>
 328
 329 <indexterm><primary><literal>+&num;</literal></primary></indexterm>
 330 <indexterm><primary><literal>-&num;</literal></primary></indexterm>
 331 <indexterm><primary><literal>*&num;</literal></primary></indexterm>
 332 <indexterm><primary><literal>quotInt&num;</literal></primary></indexterm>
 333 <indexterm><primary><literal>remInt&num;</literal></primary></indexterm>
 334 <indexterm><primary><literal>gcdInt&num;</literal></primary></indexterm>
 335 <indexterm><primary><literal>iShiftL&num;</literal></primary></indexterm>
 336 <indexterm><primary><literal>iShiftRA&num;</literal></primary></indexterm>
 337 <indexterm><primary><literal>iShiftRL&num;</literal></primary></indexterm>
 338 <indexterm><primary><literal>addIntC&num;</literal></primary></indexterm>
 339 <indexterm><primary><literal>subIntC&num;</literal></primary></indexterm>
 340 <indexterm><primary><literal>mulIntC&num;</literal></primary></indexterm>
 341 <indexterm><primary>shift operations, integer</primary></indexterm>
 342 </para>
 343
 344 <para>
 345 <emphasis>Note:</emphasis> No error/overflow checking!
 346 </para>
 347
 348 </sect2>
 349
 350 <sect2>
 351 <title>Primitive-<literal>Double</literal> and <literal>Float</literal> operations</title>
 352
 353 <para>
 354 <indexterm><primary>floating point numbers, primitive</primary></indexterm>
 355 <indexterm><primary>operators, primitive floating point</primary></indexterm>
 356 </para>
 357
 358 <para>
 359
 360 <programlisting>
 361 {+,-,*,/}##         :: Double# -&#62; Double# -&#62; Double#
 362 {&#60;,&#60;=,==,/=,&#62;=,&#62;}## :: Double# -&#62; Double# -&#62; Bool
 363 negateDouble#       :: Double# -&#62; Double#
 364 double2Int#         :: Double# -&#62; Int#
 365 int2Double#         :: Int#    -&#62; Double#
 366
 367 {plus,minux,times,divide}Float# :: Float# -&#62; Float# -&#62; Float#
 368 {gt,ge,eq,ne,lt,le}Float# :: Float# -&#62; Float# -&#62; Bool
 369 negateFloat#        :: Float# -&#62; Float#
 370 float2Int#          :: Float# -&#62; Int#
 371 int2Float#          :: Int#   -&#62; Float#
 372 </programlisting>
 373
 374 </para>
 375
 376 <para>
 377 <indexterm><primary><literal>+&num;&num;</literal></primary></indexterm>
 378 <indexterm><primary><literal>-&num;&num;</literal></primary></indexterm>
 379 <indexterm><primary><literal>*&num;&num;</literal></primary></indexterm>
 380 <indexterm><primary><literal>/&num;&num;</literal></primary></indexterm>
 381 <indexterm><primary><literal>&#60;&num;&num;</literal></primary></indexterm>
 382 <indexterm><primary><literal>&#60;=&num;&num;</literal></primary></indexterm>
 383 <indexterm><primary><literal>==&num;&num;</literal></primary></indexterm>
 384 <indexterm><primary><literal>=/&num;&num;</literal></primary></indexterm>
 385 <indexterm><primary><literal>&#62;=&num;&num;</literal></primary></indexterm>
 386 <indexterm><primary><literal>&#62;&num;&num;</literal></primary></indexterm>
 387 <indexterm><primary><literal>negateDouble&num;</literal></primary></indexterm>
 388 <indexterm><primary><literal>double2Int&num;</literal></primary></indexterm>
 389 <indexterm><primary><literal>int2Double&num;</literal></primary></indexterm>
 390 </para>
 391
 392 <para>
 393 <indexterm><primary><literal>plusFloat&num;</literal></primary></indexterm>
 394 <indexterm><primary><literal>minusFloat&num;</literal></primary></indexterm>
 395 <indexterm><primary><literal>timesFloat&num;</literal></primary></indexterm>
 396 <indexterm><primary><literal>divideFloat&num;</literal></primary></indexterm>
 397 <indexterm><primary><literal>gtFloat&num;</literal></primary></indexterm>
 398 <indexterm><primary><literal>geFloat&num;</literal></primary></indexterm>
 399 <indexterm><primary><literal>eqFloat&num;</literal></primary></indexterm>
 400 <indexterm><primary><literal>neFloat&num;</literal></primary></indexterm>
 401 <indexterm><primary><literal>ltFloat&num;</literal></primary></indexterm>
 402 <indexterm><primary><literal>leFloat&num;</literal></primary></indexterm>
 403 <indexterm><primary><literal>negateFloat&num;</literal></primary></indexterm>
 404 <indexterm><primary><literal>float2Int&num;</literal></primary></indexterm>
 405 <indexterm><primary><literal>int2Float&num;</literal></primary></indexterm>
 406 </para>
 407
 408 <para>
 409 And a full complement of trigonometric functions:
 410 </para>
 411
 412 <para>
 413
 414 <programlisting>
 415 expDouble#      :: Double# -&#62; Double#
 416 logDouble#      :: Double# -&#62; Double#
 417 sqrtDouble#     :: Double# -&#62; Double#
 418 sinDouble#      :: Double# -&#62; Double#
 419 cosDouble#      :: Double# -&#62; Double#
 420 tanDouble#      :: Double# -&#62; Double#
 421 asinDouble#     :: Double# -&#62; Double#
 422 acosDouble#     :: Double# -&#62; Double#
 423 atanDouble#     :: Double# -&#62; Double#
 424 sinhDouble#     :: Double# -&#62; Double#
 425 coshDouble#     :: Double# -&#62; Double#
 426 tanhDouble#     :: Double# -&#62; Double#
 427 powerDouble#    :: Double# -&#62; Double# -&#62; Double#
 428 </programlisting>
 429
 430 <indexterm><primary>trigonometric functions, primitive</primary></indexterm>
 431 </para>
 432
 433 <para>
 434 similarly for <literal>Float&num;</literal>.
 435 </para>
 436
 437 <para>
 438 There are two coercion functions for <literal>Float&num;</literal>/<literal>Double&num;</literal>:
 439 </para>
 440
 441 <para>
 442
 443 <programlisting>
 444 float2Double#   :: Float# -&#62; Double#
 445 double2Float#   :: Double# -&#62; Float#
 446 </programlisting>
 447
 448 <indexterm><primary><literal>float2Double&num;</literal></primary></indexterm>
 449 <indexterm><primary><literal>double2Float&num;</literal></primary></indexterm>
 450 </para>
 451
 452 <para>
 453 The primitive version of <function>decodeDouble</function>
 454 (<function>encodeDouble</function> is implemented as an external C
 455 function):
 456 </para>
 457
 458 <para>
 459
 460 <programlisting>
 461 decodeDouble#   :: Double# -&#62; PrelNum.ReturnIntAndGMP
 462 </programlisting>
 463
 464 <indexterm><primary><literal>encodeDouble&num;</literal></primary></indexterm>
 465 <indexterm><primary><literal>decodeDouble&num;</literal></primary></indexterm>
 466 </para>
 467
 468 <para>
 469 (And the same for <literal>Float&num;</literal>s.)
 470 </para>
 471
 472 </sect2>
 473
 474 <sect2 id="integer-operations">
 475 <title>Operations on/for <literal>Integers</literal> (interface to GMP)
 476 </title>
 477
 478 <para>
 479 <indexterm><primary>arbitrary precision integers</primary></indexterm>
 480 <indexterm><primary>Integer, operations on</primary></indexterm>
 481 </para>
 482
 483 <para>
 484 We implement <literal>Integers</literal> (arbitrary-precision
 485 integers) using the GNU multiple-precision (GMP) package (version
 486 2.0.2).
 487 </para>
 488
 489 <para>
 490 The data type for <literal>Integer</literal> is either a small
 491 integer, represented by an <literal>Int</literal>, or a large integer
 492 represented using the pieces required by GMP's
 493 <literal>MP&lowbar;INT</literal> in <filename>gmp.h</filename> (see
 494 <filename>gmp.info</filename> in
 495 <filename>ghc/includes/runtime/gmp</filename>).  It comes out as:
 496 </para>
 497
 498 <para>
 499
 500 <programlisting>
 501 data Integer = S# Int#             -- small integers
 502              | J# Int# ByteArray#  -- large integers
 503 </programlisting>
 504
 505 <indexterm><primary>Integer type</primary></indexterm> The primitive
 506 ops to support large <literal>Integers</literal> use the
 507 &ldquo;pieces&rdquo; of the representation, and are as follows:
 508 </para>
 509
 510 <para>
 511
 512 <programlisting>
 513 negateInteger#  :: Int# -&#62; ByteArray# -&#62; Integer
 514
 515 {plus,minus,times}Integer#, gcdInteger#,
 516   quotInteger#, remInteger#, divExactInteger#
 517         :: Int# -> ByteArray#
 518         -> Int# -> ByteArray#
 519         -> (# Int#, ByteArray# #)
 520
 521 cmpInteger#
 522         :: Int# -> ByteArray#
 523         -> Int# -> ByteArray#
 524         -> Int# -- -1 for &#60;; 0 for ==; +1 for >
 525
 526 cmpIntegerInt#
 527         :: Int# -> ByteArray#
 528         -> Int#
 529         -> Int# -- -1 for &#60;; 0 for ==; +1 for >
 530
 531 gcdIntegerInt# ::
 532         :: Int# -> ByteArray#
 533         -> Int#
 534         -> Int#
 535
 536 divModInteger#, quotRemInteger#
 537         :: Int# -> ByteArray#
 538         -> Int# -> ByteArray#
 539         -> (# Int#, ByteArray#,
 540                   Int#, ByteArray# #)
 541
 542 integer2Int# :: Int# -> ByteArray# -> Int#
 543
 544 int2Integer#  :: Int#  -> Integer -- NB: no error-checking on these two!
 545 word2Integer# :: Word# -> Integer
 546
 547 addr2Integer# :: Addr# -> Integer
 548         -- the Addr# is taken to be a `char *' string
 549         -- to be converted into an Integer.
 550 </programlisting>
 551
 552 <indexterm><primary><literal>negateInteger&num;</literal></primary></indexterm>
 553 <indexterm><primary><literal>plusInteger&num;</literal></primary></indexterm>
 554 <indexterm><primary><literal>minusInteger&num;</literal></primary></indexterm>
 555 <indexterm><primary><literal>timesInteger&num;</literal></primary></indexterm>
 556 <indexterm><primary><literal>quotInteger&num;</literal></primary></indexterm>
 557 <indexterm><primary><literal>remInteger&num;</literal></primary></indexterm>
 558 <indexterm><primary><literal>gcdInteger&num;</literal></primary></indexterm>
 559 <indexterm><primary><literal>gcdIntegerInt&num;</literal></primary></indexterm>
 560 <indexterm><primary><literal>divExactInteger&num;</literal></primary></indexterm>
 561 <indexterm><primary><literal>cmpInteger&num;</literal></primary></indexterm>
 562 <indexterm><primary><literal>divModInteger&num;</literal></primary></indexterm>
 563 <indexterm><primary><literal>quotRemInteger&num;</literal></primary></indexterm>
 564 <indexterm><primary><literal>integer2Int&num;</literal></primary></indexterm>
 565 <indexterm><primary><literal>int2Integer&num;</literal></primary></indexterm>
 566 <indexterm><primary><literal>word2Integer&num;</literal></primary></indexterm>
 567 <indexterm><primary><literal>addr2Integer&num;</literal></primary></indexterm>
 568 </para>
 569
 570 </sect2>
 571
 572 <sect2>
 573 <title>Words and addresses</title>
 574
 575 <para>
 576 <indexterm><primary>word, primitive type</primary></indexterm>
 577 <indexterm><primary>address, primitive type</primary></indexterm>
 578 <indexterm><primary>unsigned integer, primitive type</primary></indexterm>
 579 <indexterm><primary>pointer, primitive type</primary></indexterm>
 580 </para>
 581
 582 <para>
 583 A <literal>Word&num;</literal> is used for bit-twiddling operations.
 584 It is the same size as an <literal>Int&num;</literal>, but has no sign
 585 nor any arithmetic operations.
 586
 587 <programlisting>
 588 type Word#      -- Same size/etc as Int# but *unsigned*
 589 type Addr#      -- A pointer from outside the "Haskell world" (from C, probably);
 590                 -- described under "arrays"
 591 </programlisting>
 592
 593 <indexterm><primary><literal>Word&num;</literal></primary></indexterm>
 594 <indexterm><primary><literal>Addr&num;</literal></primary></indexterm>
 595 </para>
 596
 597 <para>
 598 <literal>Word&num;</literal>s and <literal>Addr&num;</literal>s have
 599 the usual comparison operations.  Other
 600 unboxed-<literal>Word</literal> ops (bit-twiddling and coercions):
 601 </para>
 602
 603 <para>
 604
 605 <programlisting>
 606 {gt,ge,eq,ne,lt,le}Word# :: Word# -> Word# -> Bool
 607
 608 and#, or#, xor# :: Word# -> Word# -> Word#
 609         -- standard bit ops.
 610
 611 quotWord#, remWord# :: Word# -> Word# -> Word#
 612         -- word (i.e. unsigned) versions are different from int
 613         -- versions, so we have to provide these explicitly.
 614
 615 not# :: Word# -> Word#
 616
 617 shiftL#, shiftRL# :: Word# -> Int# -> Word#
 618         -- shift left, right logical
 619
 620 int2Word#       :: Int#  -> Word# -- just a cast, really
 621 word2Int#       :: Word# -> Int#
 622 </programlisting>
 623
 624 <indexterm><primary>bit operations, Word and Addr</primary></indexterm>
 625 <indexterm><primary><literal>gtWord&num;</literal></primary></indexterm>
 626 <indexterm><primary><literal>geWord&num;</literal></primary></indexterm>
 627 <indexterm><primary><literal>eqWord&num;</literal></primary></indexterm>
 628 <indexterm><primary><literal>neWord&num;</literal></primary></indexterm>
 629 <indexterm><primary><literal>ltWord&num;</literal></primary></indexterm>
 630 <indexterm><primary><literal>leWord&num;</literal></primary></indexterm>
 631 <indexterm><primary><literal>and&num;</literal></primary></indexterm>
 632 <indexterm><primary><literal>or&num;</literal></primary></indexterm>
 633 <indexterm><primary><literal>xor&num;</literal></primary></indexterm>
 634 <indexterm><primary><literal>not&num;</literal></primary></indexterm>
 635 <indexterm><primary><literal>quotWord&num;</literal></primary></indexterm>
 636 <indexterm><primary><literal>remWord&num;</literal></primary></indexterm>
 637 <indexterm><primary><literal>shiftL&num;</literal></primary></indexterm>
 638 <indexterm><primary><literal>shiftRA&num;</literal></primary></indexterm>
 639 <indexterm><primary><literal>shiftRL&num;</literal></primary></indexterm>
 640 <indexterm><primary><literal>int2Word&num;</literal></primary></indexterm>
 641 <indexterm><primary><literal>word2Int&num;</literal></primary></indexterm>
 642 </para>
 643
 644 <para>
 645 Unboxed-<literal>Addr</literal> ops (C casts, really):
 646
 647 <programlisting>
 648 {gt,ge,eq,ne,lt,le}Addr# :: Addr# -> Addr# -> Bool
 649
 650 int2Addr#       :: Int#  -> Addr#
 651 addr2Int#       :: Addr# -> Int#
 652 addr2Integer#   :: Addr# -> (# Int#, ByteArray# #)
 653 </programlisting>
 654
 655 <indexterm><primary><literal>gtAddr&num;</literal></primary></indexterm>
 656 <indexterm><primary><literal>geAddr&num;</literal></primary></indexterm>
 657 <indexterm><primary><literal>eqAddr&num;</literal></primary></indexterm>
 658 <indexterm><primary><literal>neAddr&num;</literal></primary></indexterm>
 659 <indexterm><primary><literal>ltAddr&num;</literal></primary></indexterm>
 660 <indexterm><primary><literal>leAddr&num;</literal></primary></indexterm>
 661 <indexterm><primary><literal>int2Addr&num;</literal></primary></indexterm>
 662 <indexterm><primary><literal>addr2Int&num;</literal></primary></indexterm>
 663 <indexterm><primary><literal>addr2Integer&num;</literal></primary></indexterm>
 664 </para>
 665
 666 <para>
 667 The casts between <literal>Int&num;</literal>,
 668 <literal>Word&num;</literal> and <literal>Addr&num;</literal>
 669 correspond to null operations at the machine level, but are required
 670 to keep the Haskell type checker happy.
 671 </para>
 672
 673 <para>
 674 Operations for indexing off of C pointers
 675 (<literal>Addr&num;</literal>s) to snatch values are listed under
 676 &ldquo;arrays&rdquo;.
 677 </para>
 678
 679 </sect2>
 680
 681 <sect2>
 682 <title>Arrays</title>
 683
 684 <para>
 685 <indexterm><primary>arrays, primitive</primary></indexterm>
 686 </para>
 687
 688 <para>
 689 The type <literal>Array&num; elt</literal> is the type of primitive,
 690 unpointed arrays of values of type <literal>elt</literal>.
 691 </para>
 692
 693 <para>
 694
 695 <programlisting>
 696 type Array# elt
 697 </programlisting>
 698
 699 <indexterm><primary><literal>Array&num;</literal></primary></indexterm>
 700 </para>
 701
 702 <para>
 703 <literal>Array&num;</literal> is more primitive than a Haskell
 704 array&mdash;indeed, the Haskell <literal>Array</literal> interface is
 705 implemented using <literal>Array&num;</literal>&mdash;in that an
 706 <literal>Array&num;</literal> is indexed only by
 707 <literal>Int&num;</literal>s, starting at zero.  It is also more
 708 primitive by virtue of being unboxed.  That doesn't mean that it isn't
 709 a heap-allocated object&mdash;of course, it is.  Rather, being unboxed
 710 means that it is represented by a pointer to the array itself, and not
 711 to a thunk which will evaluate to the array (or to bottom).  The
 712 components of an <literal>Array&num;</literal> are themselves boxed.
 713 </para>
 714
 715 <para>
 716 The type <literal>ByteArray&num;</literal> is similar to
 717 <literal>Array&num;</literal>, except that it contains just a string
 718 of (non-pointer) bytes.
 719 </para>
 720
 721 <para>
 722
 723 <programlisting>
 724 type ByteArray#
 725 </programlisting>
 726
 727 <indexterm><primary><literal>ByteArray&num;</literal></primary></indexterm>
 728 </para>
 729
 730 <para>
 731 Arrays of these types are useful when a Haskell program wishes to
 732 construct a value to pass to a C procedure. It is also possible to use
 733 them to build (say) arrays of unboxed characters for internal use in a
 734 Haskell program.  Given these uses, <literal>ByteArray&num;</literal>
 735 is deliberately a bit vague about the type of its components.
 736 Operations are provided to extract values of type
 737 <literal>Char&num;</literal>, <literal>Int&num;</literal>,
 738 <literal>Float&num;</literal>, <literal>Double&num;</literal>, and
 739 <literal>Addr&num;</literal> from arbitrary offsets within a
 740 <literal>ByteArray&num;</literal>.  (For type
 741 <literal>Foo&num;</literal>, the $i$th offset gets you the $i$th
 742 <literal>Foo&num;</literal>, not the <literal>Foo&num;</literal> at
 743 byte-position $i$.  Mumble.)  (If you want a
 744 <literal>Word&num;</literal>, grab an <literal>Int&num;</literal>,
 745 then coerce it.)
 746 </para>
 747
 748 <para>
 749 Lastly, we have static byte-arrays, of type
 750 <literal>Addr&num;</literal> &lsqb;mentioned previously].  (Remember
 751 the duality between arrays and pointers in C.)  Arrays of this types
 752 are represented by a pointer to an array in the world outside Haskell,
 753 so this pointer is not followed by the garbage collector.  In other
 754 respects they are just like <literal>ByteArray&num;</literal>.  They
 755 are only needed in order to pass values from C to Haskell.
 756 </para>
 757
 758 </sect2>
 759
 760 <sect2>
 761 <title>Reading and writing</title>
 762
 763 <para>
 764 Primitive arrays are linear, and indexed starting at zero.
 765 </para>
 766
 767 <para>
 768 The size and indices of a <literal>ByteArray&num;</literal>, <literal>Addr&num;</literal>, and
 769 <literal>MutableByteArray&num;</literal> are all in bytes.  It's up to the program to
 770 calculate the correct byte offset from the start of the array.  This
 771 allows a <literal>ByteArray&num;</literal> to contain a mixture of values of different
 772 type, which is often needed when preparing data for and unpicking
 773 results from C.  (Umm&hellip;not true of indices&hellip;WDP 95/09)
 774 </para>
 775
 776 <para>
 777 <emphasis>Should we provide some <literal>sizeOfDouble&num;</literal> constants?</emphasis>
 778 </para>
 779
 780 <para>
 781 Out-of-range errors on indexing should be caught by the code which
 782 uses the primitive operation; the primitive operations themselves do
 783 <emphasis>not</emphasis> check for out-of-range indexes. The intention is that the
 784 primitive ops compile to one machine instruction or thereabouts.
 785 </para>
 786
 787 <para>
 788 We use the terms &ldquo;reading&rdquo; and &ldquo;writing&rdquo; to refer to accessing
 789 <emphasis>mutable</emphasis> arrays (see <xref LinkEnd="sect-mutable">), and
 790 &ldquo;indexing&rdquo; to refer to reading a value from an <emphasis>immutable</emphasis>
 791 array.
 792 </para>
 793
 794 <para>
 795 Immutable byte arrays are straightforward to index (all indices are in
 796 units of the size of the object being read):
 797
 798 <programlisting>
 799 indexCharArray#   :: ByteArray# -> Int# -> Char#
 800 indexIntArray#    :: ByteArray# -> Int# -> Int#
 801 indexAddrArray#   :: ByteArray# -> Int# -> Addr#
 802 indexFloatArray#  :: ByteArray# -> Int# -> Float#
 803 indexDoubleArray# :: ByteArray# -> Int# -> Double#
 804
 805 indexCharOffAddr#   :: Addr# -> Int# -> Char#
 806 indexIntOffAddr#    :: Addr# -> Int# -> Int#
 807 indexFloatOffAddr#  :: Addr# -> Int# -> Float#
 808 indexDoubleOffAddr# :: Addr# -> Int# -> Double#
 809 indexAddrOffAddr#   :: Addr# -> Int# -> Addr#
 810  -- Get an Addr# from an Addr# offset
 811 </programlisting>
 812
 813 <indexterm><primary><literal>indexCharArray&num;</literal></primary></indexterm>
 814 <indexterm><primary><literal>indexIntArray&num;</literal></primary></indexterm>
 815 <indexterm><primary><literal>indexAddrArray&num;</literal></primary></indexterm>
 816 <indexterm><primary><literal>indexFloatArray&num;</literal></primary></indexterm>
 817 <indexterm><primary><literal>indexDoubleArray&num;</literal></primary></indexterm>
 818 <indexterm><primary><literal>indexCharOffAddr&num;</literal></primary></indexterm>
 819 <indexterm><primary><literal>indexIntOffAddr&num;</literal></primary></indexterm>
 820 <indexterm><primary><literal>indexFloatOffAddr&num;</literal></primary></indexterm>
 821 <indexterm><primary><literal>indexDoubleOffAddr&num;</literal></primary></indexterm>
 822 <indexterm><primary><literal>indexAddrOffAddr&num;</literal></primary></indexterm>
 823 </para>
 824
 825 <para>
 826 The last of these, <function>indexAddrOffAddr&num;</function>, extracts an <literal>Addr&num;</literal> using an offset
 827 from another <literal>Addr&num;</literal>, thereby providing the ability to follow a chain of
 828 C pointers.
 829 </para>
 830
 831 <para>
 832 Something a bit more interesting goes on when indexing arrays of boxed
 833 objects, because the result is simply the boxed object. So presumably
 834 it should be entered&mdash;we never usually return an unevaluated
 835 object!  This is a pain: primitive ops aren't supposed to do
 836 complicated things like enter objects.  The current solution is to
 837 return a single element unboxed tuple (see <xref LinkEnd="unboxed-tuples">).
 838 </para>
 839
 840 <para>
 841
 842 <programlisting>
 843 indexArray#       :: Array# elt -> Int# -> (# elt #)
 844 </programlisting>
 845
 846 <indexterm><primary><literal>indexArray&num;</literal></primary></indexterm>
 847 </para>
 848
 849 </sect2>
 850
 851 <sect2>
 852 <title>The state type</title>
 853
 854 <para>
 855 <indexterm><primary><literal>state, primitive type</literal></primary></indexterm>
 856 <indexterm><primary><literal>State&num;</literal></primary></indexterm>
 857 </para>
 858
 859 <para>
 860 The primitive type <literal>State&num;</literal> represents the state of a state
 861 transformer.  It is parameterised on the desired type of state, which
 862 serves to keep states from distinct threads distinct from one another.
 863 But the <emphasis>only</emphasis> effect of this parameterisation is in the type
 864 system: all values of type <literal>State&num;</literal> are represented in the same way.
 865 Indeed, they are all represented by nothing at all!  The code
 866 generator &ldquo;knows&rdquo; to generate no code, and allocate no registers
 867 etc, for primitive states.
 868 </para>
 869
 870 <para>
 871
 872 <programlisting>
 873 type State# s
 874 </programlisting>
 875
 876 </para>
 877
 878 <para>
 879 The type <literal>GHC.RealWorld</literal> is truly opaque: there are no values defined
 880 of this type, and no operations over it.  It is &ldquo;primitive&rdquo; in that
 881 sense - but it is <emphasis>not unlifted!</emphasis> Its only role in life is to be
 882 the type which distinguishes the <literal>IO</literal> state transformer.
 883 </para>
 884
 885 <para>
 886
 887 <programlisting>
 888 data RealWorld
 889 </programlisting>
 890
 891 </para>
 892
 893 </sect2>
 894
 895 <sect2>
 896 <title>State of the world</title>
 897
 898 <para>
 899 A single, primitive, value of type <literal>State&num; RealWorld</literal> is provided.
 900 </para>
 901
 902 <para>
 903
 904 <programlisting>
 905 realWorld# :: State# RealWorld
 906 </programlisting>
 907
 908 <indexterm><primary>realWorld&num; state object</primary></indexterm>
 909 </para>
 910
 911 <para>
 912 (Note: in the compiler, not a <literal>PrimOp</literal>; just a mucho magic
 913 <literal>Id</literal>. Exported from <literal>GHC</literal>, though).
 914 </para>
 915
 916 </sect2>
 917
 918 <sect2 id="sect-mutable">
 919 <title>Mutable arrays</title>
 920
 921 <para>
 922 <indexterm><primary>mutable arrays</primary></indexterm>
 923 <indexterm><primary>arrays, mutable</primary></indexterm>
 924 Corresponding to <literal>Array&num;</literal> and <literal>ByteArray&num;</literal>, we have the types of
 925 mutable versions of each.  In each case, the representation is a
 926 pointer to a suitable block of (mutable) heap-allocated storage.
 927 </para>
 928
 929 <para>
 930
 931 <programlisting>
 932 type MutableArray# s elt
 933 type MutableByteArray# s
 934 </programlisting>
 935
 936 <indexterm><primary><literal>MutableArray&num;</literal></primary></indexterm>
 937 <indexterm><primary><literal>MutableByteArray&num;</literal></primary></indexterm>
 938 </para>
 939
 940 <sect3>
 941 <title>Allocation</title>
 942
 943 <para>
 944 <indexterm><primary>mutable arrays, allocation</primary></indexterm>
 945 <indexterm><primary>arrays, allocation</primary></indexterm>
 946 <indexterm><primary>allocation, of mutable arrays</primary></indexterm>
 947 </para>
 948
 949 <para>
 950 Mutable arrays can be allocated. Only pointer-arrays are initialised;
 951 arrays of non-pointers are filled in by &ldquo;user code&rdquo; rather than by
 952 the array-allocation primitive.  Reason: only the pointer case has to
 953 worry about GC striking with a partly-initialised array.
 954 </para>
 955
 956 <para>
 957
 958 <programlisting>
 959 newArray#       :: Int# -> elt -> State# s -> (# State# s, MutableArray# s elt #)
 960
 961 newCharArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
 962 newIntArray#    :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
 963 newAddrArray#   :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
 964 newFloatArray#  :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
 965 newDoubleArray# :: Int# -> State# s -> (# State# s, MutableByteArray# s elt #)
 966 </programlisting>
 967
 968 <indexterm><primary><literal>newArray&num;</literal></primary></indexterm>
 969 <indexterm><primary><literal>newCharArray&num;</literal></primary></indexterm>
 970 <indexterm><primary><literal>newIntArray&num;</literal></primary></indexterm>
 971 <indexterm><primary><literal>newAddrArray&num;</literal></primary></indexterm>
 972 <indexterm><primary><literal>newFloatArray&num;</literal></primary></indexterm>
 973 <indexterm><primary><literal>newDoubleArray&num;</literal></primary></indexterm>
 974 </para>
 975
 976 <para>
 977 The size of a <literal>ByteArray&num;</literal> is given in bytes.
 978 </para>
 979
 980 </sect3>
 981
 982 <sect3>
 983 <title>Reading and writing</title>
 984
 985 <para>
 986 <indexterm><primary>arrays, reading and writing</primary></indexterm>
 987 </para>
 988
 989 <para>
 990
 991 <programlisting>
 992 readArray#       :: MutableArray# s elt -> Int# -> State# s -> (# State# s, elt #)
 993 readCharArray#   :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Char# #)
 994 readIntArray#    :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Int# #)
 995 readAddrArray#   :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Addr# #)
 996 readFloatArray#  :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Float# #)
 997 readDoubleArray# :: MutableByteArray# s -> Int# -> State# s -> (# State# s, Double# #)
 998
 999 writeArray#       :: MutableArray# s elt -> Int# -> elt     -> State# s -> State# s
1000 writeCharArray#   :: MutableByteArray# s -> Int# -> Char#   -> State# s -> State# s
1001 writeIntArray#    :: MutableByteArray# s -> Int# -> Int#    -> State# s -> State# s
1002 writeAddrArray#   :: MutableByteArray# s -> Int# -> Addr#   -> State# s -> State# s
1003 writeFloatArray#  :: MutableByteArray# s -> Int# -> Float#  -> State# s -> State# s
1004 writeDoubleArray# :: MutableByteArray# s -> Int# -> Double# -> State# s -> State# s
1005 </programlisting>
1006
1007 <indexterm><primary><literal>readArray&num;</literal></primary></indexterm>
1008 <indexterm><primary><literal>readCharArray&num;</literal></primary></indexterm>
1009 <indexterm><primary><literal>readIntArray&num;</literal></primary></indexterm>
1010 <indexterm><primary><literal>readAddrArray&num;</literal></primary></indexterm>
1011 <indexterm><primary><literal>readFloatArray&num;</literal></primary></indexterm>
1012 <indexterm><primary><literal>readDoubleArray&num;</literal></primary></indexterm>
1013 <indexterm><primary><literal>writeArray&num;</literal></primary></indexterm>
1014 <indexterm><primary><literal>writeCharArray&num;</literal></primary></indexterm>
1015 <indexterm><primary><literal>writeIntArray&num;</literal></primary></indexterm>
1016 <indexterm><primary><literal>writeAddrArray&num;</literal></primary></indexterm>
1017 <indexterm><primary><literal>writeFloatArray&num;</literal></primary></indexterm>
1018 <indexterm><primary><literal>writeDoubleArray&num;</literal></primary></indexterm>
1019 </para>
1020
1021 </sect3>
1022
1023 <sect3>
1024 <title>Equality</title>
1025
1026 <para>
1027 <indexterm><primary>arrays, testing for equality</primary></indexterm>
1028 </para>
1029
1030 <para>
1031 One can take &ldquo;equality&rdquo; of mutable arrays.  What is compared is the
1032 <emphasis>name</emphasis> or reference to the mutable array, not its contents.
1033 </para>
1034
1035 <para>
1036
1037 <programlisting>
1038 sameMutableArray#     :: MutableArray# s elt -> MutableArray# s elt -> Bool
1039 sameMutableByteArray# :: MutableByteArray# s -> MutableByteArray# s -> Bool
1040 </programlisting>
1041
1042 <indexterm><primary><literal>sameMutableArray&num;</literal></primary></indexterm>
1043 <indexterm><primary><literal>sameMutableByteArray&num;</literal></primary></indexterm>
1044 </para>
1045
1046 </sect3>
1047
1048 <sect3>
1049 <title>Freezing mutable arrays</title>
1050
1051 <para>
1052 <indexterm><primary>arrays, freezing mutable</primary></indexterm>
1053 <indexterm><primary>freezing mutable arrays</primary></indexterm>
1054 <indexterm><primary>mutable arrays, freezing</primary></indexterm>
1055 </para>
1056
1057 <para>
1058 Only unsafe-freeze has a primitive.  (Safe freeze is done directly in Haskell
1059 by copying the array and then using <function>unsafeFreeze</function>.)
1060 </para>
1061
1062 <para>
1063
1064 <programlisting>
1065 unsafeFreezeArray#     :: MutableArray# s elt -> State# s -> (# State# s, Array# s elt #)
1066 unsafeFreezeByteArray# :: MutableByteArray# s -> State# s -> (# State# s, ByteArray# #)
1067 </programlisting>
1068
1069 <indexterm><primary><literal>unsafeFreezeArray&num;</literal></primary></indexterm>
1070 <indexterm><primary><literal>unsafeFreezeByteArray&num;</literal></primary></indexterm>
1071 </para>
1072
1073 </sect3>
1074
1075 </sect2>
1076
1077 <sect2>
1078 <title>Synchronizing variables (M-vars)</title>
1079
1080 <para>
1081 <indexterm><primary>synchronising variables (M-vars)</primary></indexterm>
1082 <indexterm><primary>M-Vars</primary></indexterm>
1083 </para>
1084
1085 <para>
1086 Synchronising variables are the primitive type used to implement
1087 Concurrent Haskell's MVars (see the Concurrent Haskell paper for
1088 the operational behaviour of these operations).
1089 </para>
1090
1091 <para>
1092
1093 <programlisting>
1094 type MVar# s elt        -- primitive
1095
1096 newMVar#    :: State# s -> (# State# s, MVar# s elt #)
1097 takeMVar#   :: SynchVar# s elt -> State# s -> (# State# s, elt #)
1098 putMVar#    :: SynchVar# s elt -> State# s -> State# s
1099 </programlisting>
1100
1101 <indexterm><primary><literal>SynchVar&num;</literal></primary></indexterm>
1102 <indexterm><primary><literal>newSynchVar&num;</literal></primary></indexterm>
1103 <indexterm><primary><literal>takeMVar</literal></primary></indexterm>
1104 <indexterm><primary><literal>putMVar</literal></primary></indexterm>
1105 </para>
1106
1107 </sect2>
1108
1109 </sect1>
1110
1111 <!-- Emacs stuff:
1112      ;;; Local Variables: ***
1113      ;;; mode: sgml ***
1114      ;;; sgml-parent-document: ("users_guide.sgml" "book" "chapter" "sect1") ***
1115      ;;; End: ***
1116  -->