From: simonpj Date: Tue, 15 Oct 2002 08:56:55 +0000 (+0000) Subject: [project @ 2002-10-15 08:56:50 by simonpj] X-Git-Tag: Approx_11550_changesets_converted~1550 X-Git-Url: http://git.megacz.com/?a=commitdiff_plain;h=0c0a693f3bc05a7c31f53c7b27f55479c1bc0ee9;p=ghc-hetmet.git [project @ 2002-10-15 08:56:50 by simonpj] Some reorganising. Simon and I agreed to leave out most of the (inaccurate) section on primitive operations, pointing people instead to the Real Truth in primops.txt and the External Core document. Also: bugs section added --- diff --git a/ghc/docs/users_guide/vs_haskell.sgml b/ghc/docs/users_guide/bugs.sgml similarity index 92% rename from ghc/docs/users_guide/vs_haskell.sgml rename to ghc/docs/users_guide/bugs.sgml index 9eb771e..3967634 100644 --- a/ghc/docs/users_guide/vs_haskell.sgml +++ b/ghc/docs/users_guide/bugs.sgml @@ -1,3 +1,7 @@ + + Known bugs and infelicities + + Haskell 98 vs. Glasgow Haskell: language non-compliance @@ -265,6 +269,37 @@ main = print (array (1,1) [(1,2), (1,3)]) + + + Known bugs + +GHC has the following known bugs: + + + +GHC's inliner can be persuaded into non-termination using the standard way to encode +recursion via a data type: + + data U = MkU (U -> Bool) + + russel :: U -> Bool + russel u@(MkU p) = not $ p u + + x :: Bool + x = russel (MkU russel) + +We have never found another program, other than this contrived one, that makes GHC +diverge, and fixing the problem would impose an extra overhead on every compilation. So the +bug remains un-fixed. There is more background in + +Secrets of the GHC inliner. + + + + + + -&primitives; + + + Unboxed types and primitive operations + +GHC is built on a raft of primitive data types and operations. +While you really can use this stuff to write fast code, + we generally find it a lot less painful, and more satisfying in the + long run, to use higher-level language features and libraries. With + any luck, the code you write will be optimised to the efficient + unboxed version in any case. And if it isn't, we'd like to know + about it. + +We do not currently have good, up-to-date documentation about the +primitives, perhaps because they are mainly intended for internal use. +There used to be a long section about them here in the User Guide, but it +became out of date, and wrong information is worse than none. + +The Real Truth about what primitive types there are, and what operations +work over those types, is held in the file +fptools/ghc/compiler/prelude/primops.txt. +This file is used directly to generate GHC's primitive-operation definitions, so +it is always correct! It is also intended for processing into text. + + Indeed, +the result of such processing is part of the description of the + External + Core language. +So that document is a good place to look for a type-set version. +We would be very happy if someone wanted to volunteer to produce an SGML +back end to the program that processes primops.txt so that +we could include the results here in the User Guide. + +What follows here is a brief summary of some main points. + + +Unboxed types + + + +Unboxed types (Glasgow extension) + + +Most types in GHC are boxed, which means +that values of that type are represented by a pointer to a heap +object. The representation of a Haskell Int, for +example, is a two-word heap object. An unboxed +type, however, is represented by the value itself, no pointers or heap +allocation are involved. + + + +Unboxed types correspond to the “raw machine” types you +would use in C: Int# (long int), +Double# (double), Addr# +(void *), etc. The primitive operations +(PrimOps) on these types are what you might expect; e.g., +(+#) is addition on +Int#s, and is the machine-addition that we all +know and love—usually one instruction. + + + +Primitive (unboxed) types cannot be defined in Haskell, and are +therefore built into the language and compiler. Primitive types are +always unlifted; that is, a value of a primitive type cannot be +bottom. We use the convention that primitive types, values, and +operations have a # suffix. + + + +Primitive values are often represented by a simple bit-pattern, such +as Int#, Float#, +Double#. But this is not necessarily the case: +a primitive value might be represented by a pointer to a +heap-allocated object. Examples include +Array#, the type of primitive arrays. A +primitive array is heap-allocated because it is too big a value to fit +in a register, and would be too expensive to copy around; in a sense, +it is accidental that it is represented by a pointer. If a pointer +represents a primitive value, then it really does point to that value: +no unevaluated thunks, no indirections…nothing can be at the +other end of the pointer than the primitive value. + + + +There are some restrictions on the use of primitive types, the main +one being that you can't pass a primitive value to a polymorphic +function or store one in a polymorphic data type. This rules out +things like [Int#] (i.e. lists of primitive +integers). The reason for this restriction is that polymorphic +arguments and constructor fields are assumed to be pointers: if an +unboxed integer is stored in one of these, the garbage collector would +attempt to follow it, leading to unpredictable space leaks. Or a +seq operation on the polymorphic component may +attempt to dereference the pointer, with disastrous results. Even +worse, the unboxed value might be larger than a pointer +(Double# for instance). + + + +Nevertheless, A numerically-intensive program using unboxed types can +go a lot faster than its “standard” +counterpart—we saw a threefold speedup on one example. + + + + + +Unboxed Tuples + + + +Unboxed tuples aren't really exported by GHC.Exts, +they're available by default with . An +unboxed tuple looks like this: + + + + + +(# e_1, ..., e_n #) + + + + + +where e_1..e_n are expressions of any +type (primitive or non-primitive). The type of an unboxed tuple looks +the same. + + + +Unboxed tuples are used for functions that need to return multiple +values, but they avoid the heap allocation normally associated with +using fully-fledged tuples. When an unboxed tuple is returned, the +components are put directly into registers or on the stack; the +unboxed tuple itself does not have a composite representation. Many +of the primitive operations listed in this section return unboxed +tuples. + + + +There are some pretty stringent restrictions on the use of unboxed tuples: + + + + + + + + + Unboxed tuple types are subject to the same restrictions as +other unboxed types; i.e. they may not be stored in polymorphic data +structures or passed to polymorphic functions. + + + + + + + Unboxed tuples may only be constructed as the direct result of +a function, and may only be deconstructed with a case expression. +eg. the following are valid: + + + +f x y = (# x+1, y-1 #) +g x = case f x x of { (# a, b #) -> a + b } + + + +but the following are invalid: + + + +f x y = g (# x, y #) +g (# x, y #) = x + y + + + + + + + + + No variable can have an unboxed tuple type. This is illegal: + + + +f :: (# Int, Int #) -> (# Int, Int #) +f x = x + + + +because x has an unboxed tuple type. + + + + + + + + + +Note: we may relax some of these restrictions in the future. + + + +The IO and ST monads use unboxed +tuples to avoid unnecessary allocation during sequences of operations. + + + + + diff --git a/ghc/docs/users_guide/lang.sgml b/ghc/docs/users_guide/lang.sgml index b0bab75..bae8c86 100644 --- a/ghc/docs/users_guide/lang.sgml +++ b/ghc/docs/users_guide/lang.sgml @@ -3,7 +3,6 @@ &glasgowexts ¶llel -&vs-hs diff --git a/ghc/docs/users_guide/ug-book.sgml b/ghc/docs/users_guide/ug-book.sgml index cf5dd38..afb72a8 100644 --- a/ghc/docs/users_guide/ug-book.sgml +++ b/ghc/docs/users_guide/ug-book.sgml @@ -18,5 +18,6 @@ &wrong; &utils; &win32-dll; +&bugs; &faq; diff --git a/ghc/docs/users_guide/ug-ent.sgml b/ghc/docs/users_guide/ug-ent.sgml index ba5c617..c08d307 100644 --- a/ghc/docs/users_guide/ug-ent.sgml +++ b/ghc/docs/users_guide/ug-ent.sgml @@ -17,7 +17,7 @@ - +