\documentstyle[11pt,literate,a4wide]{article}

%--------------------
\begin{rawlatex}
%\input{transfig}
%\newcommand{\folks}[1]{$\spadesuit$ {\em #1} $\spadesuit$}
%\newcommand{\ToDo}[1]{$\spadesuit$ {\bf ToDo:} {\em #1} $\spadesuit$}

% to avoid src-location marginpars, comment in/out this defn.
%\newcommand{\srcloc}[1]{{\tt #1}}
%\newcommand{\srclocnote}[1]{}
%\newcommand{\srclocnote}[1]{\marginpar{\small\srcloc{#1}}}

\setcounter{secnumdepth}{6}
\setcounter{tocdepth}{6}
\end{rawlatex}
%--------------------

\begin{document}
\title{Basic types and the standard Prelude: OBSOLETE}
\author{The AQUA team}
\date{November 1992 (obsolete February 1994)}
\maketitle
\begin{rawlatex}
\tableofcontents
\pagebreak
\end{rawlatex}

% added to keep DPH stuff happy:
\begin{rawlatex}
\def\DPHaskell{DPHaskell}
\def\POD{POD}
\end{rawlatex}

This document describes how we deal with Haskell's standard prelude,
notably what the compiler itself ``knows'' about it.  There's nothing
intellectually difficult here; it's just vast and occasionally
delicate.

First, some introduction, mostly terminology.  Second, the actual
compiler source code which defines what the compiler knows about the
prelude.  Finally, something about how we compile the prelude code
(with GHC, of course) to produce the executable bits for the prelude.

%************************************************************************
%*                                                                      *
\section{Introduction and terminology}
%*                                                                      *
%************************************************************************

The standard prelude is made of many, many pieces.  The GHC system
must deal with these pieces in different ways.  For example, the
compiler must obviously do different things for primitive operations
(e.g., addition on machine-level @Ints@) and for plain
written-in-Haskell functions (e.g., @tail@).

In this section, the main thing we do is explain the various ways
that we categorise prelude thingies, most notably types.
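
As a hedged illustration of that difference (ours, not taken from the
compiler source), the sketch below sets a plain written-in-Haskell
function beside a boxed addition that bottoms out in a primitive.  It
assumes a present-day GHC, in which the @Int@ constructor is spelled
@I#@ and primitive addition is @+#@ (both imported from @GHC.Exts@
under the @MagicHash@ extension).  The names @myTail@, @plusInt@ and
@PreludeSketch@ are invented for this example.

\begin{verbatim}
{-# LANGUAGE MagicHash #-}
module PreludeSketch (myTail, plusInt) where

import GHC.Exts (Int (I#), (+#))

-- Plain written-in-Haskell function: nothing primitive is needed.
-- (Renamed so that it does not clash with the Prelude's own tail.)
myTail :: [a] -> [a]
myTail (_ : xs) = xs
myTail []       = error "myTail: empty list"

-- Boxed addition: unwrap both boxes, apply the primitive, rebox.
-- Only (+#) itself lies outside what Haskell can express.
plusInt :: Int -> Int -> Int
plusInt (I# x#) (I# y#) = I# (x# +# y#)
\end{verbatim}

Loaded into GHCi, @plusInt 2 3@ evaluates to @5@.  The point of the
sketch is only that @myTail@ needs no special treatment from the
compiler, whereas @+#@ cannot be defined in Haskell at all.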
%************************************************************************
%*                                                                      *
\subsection{Background information}
%*                                                                      *
%************************************************************************

%************************************************************************
%*                                                                      *
\subsubsection{Background terms: Heap objects}
%*                                                                      *
%************************************************************************

A {\em heap object} (equivalently {\em closure}) is always a
contiguous block of memory, starting with an info pointer.

{\em Dynamic} heap objects are allocated by a sequence of instructions
in the usual way.

In contrast, {\em static heap objects} are statically allocated at
fixed, labelled locations outside the dynamic heap, but we still call
them heap objects!  Their GC code does not evacuate them, and they are
never scavenged since they never appear in to-space.  Note: the
``staticness'' does {\em not} mean they are read-only; they may be
updatable.

(Much) more on this stuff in the STG paper.

%************************************************************************
%*                                                                      *
\subsection{Categorising the prelude bits}
%*                                                                      *
%************************************************************************

Here are four different ways in which we might categorise prelude
things generally.  Note, also, the {\em simplifying assumptions} that
we make so that we can have a ``Prelude onion,'' in which each
``layer'' includes the preceding ones.

\begin{description}
%------------------------------------------------------------------
\item[Primitive vs Haskell-able:]

Some parts of the prelude cannot be expressed in Haskell ({\em
primitive}), whereas most of it can be ({\em Haskell-able}).

BIG NOTE: Because of our non-standard support for unboxed numbers and
operations thereon, some of the things in @PreludeBuiltin@ in the
report {\em are} Haskell-able.
For example, the @negate@ operation on an @Int@ is just:

\begin{verbatim}
negateInt i
  = case i of
      MkInt i# -> case (negateInt# i#) of
                    j# -> MkInt j#
\end{verbatim}

Of course, this just moves the goalposts: @negateInt#@ is now the
primitive, non-Haskell-able thingy...

So: something is ``primitive'' if we cannot define it in our
GHC-extended Haskell.

For further discussion of types in the Prelude, see
\sectionref{prelude-more-on-types}.

%------------------------------------------------------------------
\item[From (exported by) PreludeCore or not:]
The module @PreludeCore@ exports all the types, classes, and instances
in the prelude.  These entities are ``immutable;'' they can't be
hidden, renamed, or really fiddled with in any way.

(NB: The entities {\em exported by} @PreludeCore@ may {\em originally}
be from another module.  For example, the @Complex@ datatype is
defined in @PreludeComplex@; nonetheless, it is exported by
@PreludeCore@ and falls into the category under discussion here.)

{\em Simplifying assumption:} We take everything primitive (see
previous classification) to be ``from PreludeCore''.

{\em Simplifying assumption:} We take all {\em values} from
@PreludeBuiltin@ to be ``from PreludeCore.''  This includes @error@
and the various \tr{prim*} functions (which may or may not be
``primitive'' in our system [because of our extensions for unboxery]).
It shouldn't be hard to believe that something from @PreludeBuiltin@
is (at least) slightly magic and not just another value...

{\em Simplifying assumption:} The GHC compiler has ``wired in''
information about {\em all} @fromPreludeCore@ things.  The fact that
they are ``immutable'' means we don't have to worry about ``unwiring''
them in the face of renaming, etc., (which would be pretty bizarre,
anyway).

Not-exported-by-PreludeCore things (non-@PreludeBuiltin@ values) can
be renamed, hidden, etc.
%------------------------------------------------------------------
\item[Compiler-must-know vs compiler-chooses-to-know vs compiler-unknown:]

There are some prelude things that the compiler has to ``know about.''
For example, it must know about the @Bool@ data type, because (for one
reason) it needs it to typecheck guards.

{\em Simplifying assumption:} By decree, the compiler ``must know''
about everything exported from @PreludeCore@ (see previous
classification).  This is only slight overkill: there are a few types
(e.g., @Request@), classes (e.g., @RealFrac@), and instances (e.g.,
anything for @RealFrac@), all @fromPreludeCore@, that the compiler
could, strictly speaking, get away with not knowing about.  However,
it is a {\em pain} to maintain the distinction...

On the other hand, the compiler really {\em doesn't} need to know
about the non-@fromPreludeCore@ stuff (as defined above).  It can read
the relevant information out of a \tr{.hi} interface file, just as it
would for a user-defined module (and, indeed, that's what it does).
An example of something the compiler doesn't need to know about is the
@tail@ function, defined in @PreludeList@, exported by @Prelude@.

There are some non-@fromPreludeCore@ things that the compiler may {\em
choose} to clutch to its bosom: this is so it can do unfolding on the
use of a function.  For example, we always want to unfold uses of @&&@
and @||@, so we wire info about them into the compiler.  (We won't
need this when we are able to pass unfolding info via interface
files.)

%------------------------------------------------------------------
\item[Per-report vs Glasgow-extension:]
Some of our prelude stuff is not strictly as per the Haskell report,
notably the support for monadic I/O, and our different notion of what
is truly primitive in Haskell (cf.\ @PreludeBuiltin@'s ideas).
In this document, ``Haskell'' always means ``Glasgow-extended
Haskell.''
\end{description}

%************************************************************************
%*                                                                      *
\subsection[prelude-more-on-types]{More about the Prelude datatypes}
%*                                                                      *
%************************************************************************

The previous section explained how we categorise the prelude as a
whole.  In this section, we home in on prelude datatypes.

%************************************************************************
%*                                                                      *
\subsubsection{Boxed vs unboxed types}
%*                                                                      *
%************************************************************************

Objects of a particular type are all represented the same way.
We recognise two kinds of types:
\begin{description}

\item[Boxed types.]
The domain of a boxed type includes bottom.  Values of boxed type are
always represented by a pointer to a heap object, which may or may not
be evaluated.  Anyone needing to scrutinise a value of boxed type must
evaluate it first by entering it.  Values of boxed type can be passed
to polymorphic functions.

\item[Unboxed types.]
The domain of an unboxed type does not include bottom, so values of
unboxed type do not need a representation which accommodates the
possibility that they are not yet evaluated.

Unboxed values are represented by one or more words.  At present, if a
value is represented by more than one word then none of the words are
pointers, but we plan to lift this restriction eventually.  (At
present, the only multi-word values are @Double#@s.)

An unboxed value may be represented by a pointer to a heap object:
primitive strings and arbitrary-precision integers are examples (see
Section~\ref{sect-primitive}).
\end{description}

%************************************************************************
%*                                                                      *
\subsubsection{Primitive vs algebraic types}
%*                                                                      *
%************************************************************************

There is a second classification of types, which is not quite
orthogonal:
\begin{description}

\item[Primitive types.]
A type is called {\em primitive} if it cannot be defined in
(Glasgow-extended) Haskell, and the only operations which manipulate
its representation are primitive ones.  It follows that the domain
corresponding to a primitive type has no bottom element; that is, all
primitive data types are unboxed.

By convention, the names of all primitive types end with @#@.

\item[Algebraic data types.]
These are built with Haskell's @data@ declaration.  Currently, @data@
declarations can {\em only} build boxed types (and hence {\em all
unboxed types are also primitive}), but we plan to lift this
restriction in due course.

\end{description}

%************************************************************************
%*                                                                      *
\subsection[prelude-onion]{Summary of the ``Prelude onion''}
%*                                                                      *
%************************************************************************

Summarising:
\begin{enumerate}
\item
{\em Primitive} types, and operations thereon (@PrimitiveOps@), are at
the core of the onion.

\item
Everything exported @fromPreludeCore@ (with all noted provisos) makes
up the next layer of the onion; and, by decree, the compiler has
built-in knowledge of all of it.  All the primitive stuff is included
in this category.

\item
The compiler {\em chooses to know} about a few of the
non-@fromPreludeCore@ values in the @Prelude@.  This is (exclusively)
for access to their unfoldings.

\item
The rest of the @Prelude@ is ``unknown'' to the compiler itself; it
gets its information from a \tr{Prelude.hi} file, exactly as it does
for user-defined modules.
\end{enumerate}

%************************************************************************
%*                                                                      *
\section{What the compiler knows about the prelude}
%*                                                                      *
%************************************************************************

This is essentially the stuff in the directory \tr{ghc/compiler/prelude}.

%************************************************************************
%*                                                                      *
\subsection{What the compiler knows about prelude types (and ops thereon)}
%*                                                                      *
%************************************************************************

The compiler has wired into it knowledge of all the types in the
standard prelude, all of which are exported by @PreludeCore@.
Strictly speaking, it needn't know about some types (e.g., the
@Request@ and @Response@ datatypes), but it's tidier in the end to
wire in everything.

Primitive types, and related stuff, are covered first.  Then the more
ordinary prelude types.  The more turgid parts may be arranged
alphabetically...

\downsection
\downsection
% pretty ugly, no?
%************************************************************************
%*                                                                      *
\section{Primitive types (and ``kinds'') {\em and} operations thereon}
\label{sect-primitive}
%*                                                                      *
%************************************************************************

There are the following primitive types.
%partain:\begin{center}
\begin{tabular}{|llll|}
\hline
Type            & Represents               & Size (32|64-bit words) & Pointer? \\
\hline
@Void#@         & zero-element type        & 1 & No \\
@Char#@         & characters               & 1 & No \\
@Int#@          & 32|64-bit integers       & 1 & No \\
@Float#@        & 32|64-bit floats         & 1 & No \\
@Double#@       & 64|128-bit floats        & 2 & No \\
@Arr#@          & array of pointers        & ? & Yes \\
@Arr# Char#@    & array of @Char#@s        & ? & No \\
@Arr# Int#@     & array of @Int#@s         & ? & No \\
@Arr# Float#@   & array of @Float#@s       & ? & No \\
@Arr# Double#@  & array of @Double#@s      & ?
& No \\
@Integer#@      & arbitrary-precision integers & 1 & Yes \\
@LitString#@    & literal C-style strings  & 1 & No \\
\hline
\end{tabular}
%partain:\end{center}

Notes: (a)~@Integer#@s have a pointer in them, to an @Arr# Int#@; see
the discussion in @TyInteger@.  (b)~@LitString#@ is a magical type
used {\em only} to handle literal C-strings; this is a convenience; we
could use an @Arr# Char#@ instead.

What the compiler knows about these primitive types is either
(a)~given with the corresponding algebraic type (e.g., @Int#@ stuff is
with @Int@ stuff), or (b)~in a module of its own (e.g., @Void#@).

\downsection
\input{PrimKind.lhs}

\section{Details about ``Glasgow-special'' types}
\downsection
\input{TysPrim.lhs}
\input{TyPod.lhs}
\input{TyProcs.lhs}
\upsection

\input{PrimOps.lhs}
\upsection

%************************************************************************
%*                                                                      *
\section{Details (mostly) about non-primitive Prelude types}
\label{sect-nonprim-tys}
%*                                                                      *
%************************************************************************

\downsection
\input{TysWiredIn.lhs}
\upsection

%************************************************************************
%*                                                                      *
%\subsection{What the compiler knows about prelude values}
%*                                                                      *
%************************************************************************
\downsection
\input{PrelVals.lhs}
\upsection

%************************************************************************
%*                                                                      *
\subsection{Uniquifiers and utility bits for this prelude stuff}
%*                                                                      *
%************************************************************************
\downsection
\downsection
\input{PrelFuns.lhs}
\upsection
\upsection

%************************************************************************
%*                                                                      *
%\subsection{The @AbsPrel@ interface to the compiler's prelude knowledge}
%*                                                                      *
%************************************************************************
\downsection
\input{AbsPrel.lhs}
\upsection

%************************************************************************
%*                                                                      *
\section{The executable code for prelude bits}
%*                                                                      *
%************************************************************************

This essentially describes what happens in the directories
\tr{ghc/lib/{io,prelude}}; the former supports the (non-standard)
Glasgow I/O; the latter holds the regular prelude things.

ToDo: more.

\printindex
\end{document}