ghc/docs/users_guide/vs_haskell.sgml

   1 <sect1 id="vs-Haskell-defn">
   2   <title>Haskell&nbsp;98 vs.&nbsp;Glasgow Haskell: language non-compliance
   3 </title>
   4
   5   <indexterm><primary>GHC vs the Haskell 98 language</primary></indexterm>
   6   <indexterm><primary>Haskell 98 language vs GHC</primary></indexterm>
   7
   8   <para>This section lists Glasgow Haskell infelicities in its
   9   implementation of Haskell&nbsp;98.  See also the &ldquo;when things
  10   go wrong&rdquo; section (<XRef LinkEnd="wrong">) for information
  11   about crashes, space leaks, and other undesirable phenomena.</para>
  12
  13   <para>The limitations here are listed in Haskell Report order
  14   (roughly).</para>
  15
  16   <sect2 id="haskell98-divergence">
  17     <title>Divergence from Haskell&nbsp;98</title>
  18
  19
  20     <sect3 id="infelicities-lexical">
  21       <title>Lexical syntax</title>
  22
  23       <itemizedlist>
  24         <listitem>
  25           <para>The Haskell report specifies that programs may be
  26           written using Unicode.  GHC only accepts the ISO-8859-1
  27           character set at the moment.</para>
  28         </listitem>
  29
  30         <listitem>
  31           <para>Certain lexical rules regarding qualified identifiers
  32           are slightly different in GHC compared to the Haskell
  33           report.  When you have
  34           <replaceable>module</replaceable><literal>.</literal><replaceable>reservedop</replaceable>,
  35           such as <literal>M.\</literal>, GHC will interpret it as a
  36           single qualified operator rather than the two lexemes
  37           <literal>M</literal> and <literal>.\</literal>.</para>
  38         </listitem>
  39
  40         <listitem>
  41           <para>When <option>-fglasgow-exts</option> is on, GHC
  42           reserves several keywords beginning with two underscores.
  43           This is because GHC uses the same lexical analyser for
  44           parsing interface files as it does for parsing source
  45           files, and these keywords are used in interface files.
  46           Do not use any identifiers beginning with a double
  47           underscore in <option>-fglasgow-exts</option> mode.</para>
  48         </listitem>
  49       </itemizedlist>
  50     </sect3>
  51
  52       <sect3 id="infelicities-syntax">
  53         <title>Context-free syntax</title>
  54
  55       <itemizedlist>
  56         <listitem>
  57           <para>GHC doesn't do fixity resolution in expressions during
  58           parsing.  For example, according to the Haskell report, the
  59           following expression is legal Haskell:
  60 <programlisting>
  61     let x = 42 in x == 42 == True</programlisting>
  62         and parses as:
  63 <programlisting>
  64     (let x = 42 in x == 42) == True</programlisting>
  65
  66           because according to the report, the <literal>let</literal>
  67           expression <quote>extends as far to the right as
  68           possible</quote>.  Since it can't extend past the second
  69           equals sign without causing a parse error
  70           (<literal>==</literal> is non-associative), the
  71           <literal>let</literal>-expression must terminate there.  GHC
  72           simply gobbles up the whole expression, parsing like this:
  73 <programlisting>
  74     (let x = 42 in x == 42 == True)</programlisting>
  75
  76           The Haskell report is arguably wrong here, but nevertheless
  77           it's a difference between GHC and Haskell&nbsp;98.</para>
  78         </listitem>
  79       </itemizedlist>
  80     </sect3>
  81
  82   <sect3 id="infelicities-exprs-pats">
  83       <title>Expressions and patterns</title>
  84
  85       <variablelist>
  86         <varlistentry>
  87           <term>Very long <literal>String</literal> constants:</term>
  88           <listitem>
  89             <para>May not be accepted by GHC.  If you add a &ldquo;string
  90             gap&rdquo; every few thousand characters, then the strings
  91             can be as long as you like.</para>
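
                 <para>As a minimal sketch of what a string gap looks like
                 (the name <literal>longString</literal> is hypothetical, and
                 a real constant would be far longer): a backslash at the end
                 of a line, any amount of white space, and a second backslash
                 are together ignored inside a string literal.</para>

     <programlisting>
     longString :: String
     longString = "the first part of a long string, \
                  \continued after the gap"

     main :: IO ()
     main = putStrLn longString</programlisting>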
  92
  93             <para>Bear in mind that string gaps and the
  94             <option>-cpp</option><indexterm><primary><option>-cpp</option>
  95             </primary></indexterm> option don't mix very well (see
  96             <xref linkend="c-pre-processor">).</para>
  97           </listitem>
  98         </varlistentry>
  99       </variablelist>
 100
 101     </sect3>
 102
 103     <sect3 id="infelicities-decls">
 104       <title>Declarations and bindings</title>
 105
 106       <para>None known.</para>
 107
 108     </sect3>
 109
 110     <sect3 id="infelicities-Modules">
 111       <title>Module system and interface files</title>
 112
 113       <variablelist>
 114
 115         <varlistentry>
 116           <term> Namespace pollution </term>
 117           <listitem>
 118             <para>Several modules internal to GHC are visible in the
 119             standard namespace.  All of these modules begin with
 120             <literal>Prel</literal>, so the rule is: don't use any
 121             modules beginning with <literal>Prel</literal> in your
 122             program, or you may be comprehensively screwed.</para>
 123           </listitem>
 124         </varlistentry>
 125       </variablelist>
 126
 127     </sect3>
 128
 129     <sect3 id="infelicities-numbers">
 130       <title>Numbers, basic types, and built-in classes</title>
 131
 132       <variablelist>
 133         <varlistentry>
 134           <term>Multiply-defined array elements&mdash;not checked:</term>
 135           <listitem>
 136             <para>This code fragment <emphasis>should</emphasis>
 137             elicit a fatal error, but it does not:
 138
 139 <programlisting>
 140 main = print (array (1,1) [(1,2), (1,3)])</programlisting>
 141
 142             </para>
 143           </listitem>
 144         </varlistentry>
 145       </variablelist>
 146
 147     </sect3>
 148
 149       <sect3 id="infelicities-Prelude">
 150         <title>In Prelude support</title>
 151
 152       <variablelist>
 153         <varlistentry>
 154           <term>The <literal>Char</literal> type</term>
 155           <indexterm><primary><literal>Char</literal></primary><secondary>size
 156           of</secondary></indexterm>
 157           <listitem>
 158             <para>The Haskell report says that the
 159             <literal>Char</literal> type holds 16 bits.  GHC follows
 160             the ISO-10646 standard a little more closely:
 161             <literal>maxBound :: Char</literal> in GHC is
 162             <literal>0x10FFFF</literal>.</para>
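
                 <para>A one-line check of this bound, using only Prelude
                 functions; it is intended as an illustration rather than a
                 test of the implementation:</para>

     <programlisting>
     main :: IO ()
     main = print (fromEnum (maxBound :: Char) == 0x10FFFF)   -- True</programlisting>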
 163           </listitem>
 164         </varlistentry>
 165
 166         <varlistentry>
 167           <term>Arbitrary-sized tuples:</term>
 168           <listitem>
 169             <para>Tuples are currently limited to size 61.  However,
 170             standard instances for tuples (<literal>Eq</literal>,
 171             <literal>Ord</literal>, <literal>Bounded</literal>,
 172             <literal>Ix</literal>, <literal>Read</literal>, and
 173             <literal>Show</literal>) are available
 174             <emphasis>only</emphasis> up to 5-tuples.</para>
 175
 176             <para>This limitation is easily subvertible, so please ask
 177             if you get stuck on it.</para>
 178             </listitem>
 179           </varlistentry>
 180         </variablelist>
 181     </sect3>
 182   </sect2>
 183
 184   <sect2 id="haskell98-undefined">
 185     <title>GHC's interpretation of undefined behaviour in
 186     Haskell&nbsp;98</title>
 187
 188     <para>This section documents GHC's take on various issues that are
 189     left undefined or implementation-specific in Haskell&nbsp;98.</para>
 190
 191     <variablelist>
 192       <varlistentry>
 193         <term>Sized integral types</term>
 194         <indexterm><primary><literal>Int</literal></primary><secondary>size of</secondary>
 195         </indexterm>
 196
 197         <listitem>
 198           <para>In GHC the <literal>Int</literal> type follows the
 199           size of an address on the host architecture; in other words
 200           it holds 32 bits on a 32-bit machine, and 64 bits on a
 201           64-bit machine.</para>
 202
 203           <para>Arithmetic on <literal>Int</literal> is unchecked for
 204           overflow<indexterm><primary>overflow</primary><secondary><literal>Int</literal></secondary>
 205             </indexterm>, so all operations on <literal>Int</literal> happen
 206           modulo
 207           2<superscript><replaceable>n</replaceable></superscript>
 208           where <replaceable>n</replaceable> is the size in bits of
 209           the <literal>Int</literal> type.</para>
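
               <para>A small sketch of this wrap-around, using only Prelude
               functions.  The values in the comments follow from the
               modulo rule described above:</para>

     <programlisting>
     main :: IO ()
     main = do
       -- Adding 1 to the largest Int wraps round to the smallest Int.
       print (maxBound + 1 == (minBound :: Int))   -- True
       -- Subtracting 1 from the smallest Int wraps round to the largest.
       print (minBound - 1 == (maxBound :: Int))   -- True</programlisting>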
 210
 211           <para>The <literal>fromInteger</literal><indexterm><primary><literal>fromInteger</literal></primary>
 212             </indexterm> function (and hence
 213           also <literal>fromIntegral</literal><indexterm><primary><literal>fromIntegral</literal></primary>
 214             </indexterm>) is a special case when
 215           converting to <literal>Int</literal>.  The value of
 216           <literal>fromIntegral x :: Int</literal> is given by taking
 217           the lower <replaceable>n</replaceable> bits of <literal>(abs
 218           x)</literal>, multiplied by the sign of <literal>x</literal>
 219           (in 2's complement <replaceable>n</replaceable>-bit
 220           arithmetic).  This behaviour was chosen so that, for example,
 221           writing <literal>0xffffffff :: Int</literal> preserves the
 222           bit-pattern in the resulting <literal>Int</literal>.</para>
 223
 224
 225            <para>Negative literals, such as <literal>-3</literal>, are
 226              specified by (a careful reading of) the Haskell Report as
 227              meaning <literal>Prelude.negate (Prelude.fromInteger 3)</literal>.
 228              So <literal>-2147483648</literal> means <literal>negate (fromInteger 2147483648)</literal>.
 229              Since, on a 32-bit machine, <literal>fromInteger</literal> takes the lower 32 bits of the representation,
 230              <literal>fromInteger (2147483648::Integer)</literal>, computed at type <literal>Int</literal>, is
 231              <literal>-2147483648::Int</literal>.  The <literal>negate</literal> operation then
 232              overflows, but it is unchecked, so <literal>negate (-2147483648::Int)</literal> is just
 233              <literal>-2147483648</literal>.  In short, one can write <literal>minBound::Int</literal> as
 234              a literal with the expected meaning (but that is not in general guaranteed).
 235              </para>
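
               <para>The same reasoning can be sketched independently of the
               host word size by using the 32-bit <literal>Int32</literal>
               type in place of <literal>Int</literal>.  The module name
               <literal>Data.Int</literal> is an assumption about the
               libraries in use and may differ in your installation.</para>

     <programlisting>
     import Data.Int (Int32)   -- assumed module name for the sized types

     main :: IO ()
     main = do
       -- fromInteger keeps the lower 32 bits, giving the most negative Int32.
       let n = fromInteger (2147483648 :: Integer) :: Int32
       print (n == minBound)          -- True
       -- negate overflows, unchecked, and yields the same value again.
       print (negate n == minBound)   -- True</programlisting>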
 236
 237           <para>The <literal>fromIntegral</literal> function also
 238           preserves bit-patterns when converting between the sized
 239           integral types (<literal>Int8</literal>,
 240           <literal>Int16</literal>, <literal>Int32</literal>,
 241           <literal>Int64</literal> and the unsigned
 242           <literal>Word</literal> variants), see <xref
 243           linkend="sec-Int"> and <xref linkend="sec-Word">.</para>
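
               <para>A short sketch of bit-pattern preservation between two
               8-bit types.  The module names <literal>Data.Int</literal>
               and <literal>Data.Word</literal> are assumptions about the
               libraries in use and may differ in your installation.</para>

     <programlisting>
     import Data.Int  (Int8)    -- assumed module name
     import Data.Word (Word8)   -- assumed module name

     main :: IO ()
     main = do
       -- The bit-pattern 11111111 is -1 as an Int8 and 255 as a Word8.
       print (fromIntegral (-1 :: Int8) == (255 :: Word8))   -- True
       -- Converting back recovers the original value.
       print (fromIntegral (255 :: Word8) == (-1 :: Int8))   -- True</programlisting>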
 244
 245         </listitem>
 246       </varlistentry>
 247
 248       <varlistentry>
 249         <term>Unchecked float arithmetic</term>
 250         <listitem>
 251           <para>Operations on <literal>Float</literal> and
 252           <literal>Double</literal> numbers are
 253           <emphasis>unchecked</emphasis> for overflow, underflow, and
 254           other sad occurrences.  (Note, however, that some
 255           architectures trap floating-point overflow and
 256           loss-of-precision and report a floating-point exception,
 257           probably terminating the
 258           program.)<indexterm><primary>floating-point
 259           exceptions</primary></indexterm></para>
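
               <para>A minimal sketch of what unchecked arithmetic looks
               like in practice.  It assumes IEEE&nbsp;754 double-precision
               hardware that does not trap; on a trapping architecture the
               program may instead be terminated.</para>

     <programlisting>
     main :: IO ()
     main = do
       let big = 1.0e308 :: Double
       -- Overflow is not reported; the result is silently an infinity.
       print (isInfinite (big * 10))                -- True
       -- Underflow is not reported either; the result is silently zero.
       print (1.0e-308 / big == (0 :: Double))      -- True</programlisting>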
 260         </listitem>
 261       </varlistentry>
 262     </variablelist>
 263
 264   </sect2>
 265
 266 </sect1>
 267
 268 <!-- Emacs stuff:
 269      ;;; Local Variables: ***
 270      ;;; mode: sgml ***
 271      ;;; sgml-parent-document: ("users_guide.sgml" "book" "chapter" "sect1") ***
 272      ;;; End: ***
 273  -->