[project @ 2003-05-21 15:49:54 by simonpj]

[ghc-hetmet.git] / ghc / docs / users_guide / glasgow_exts.sgml
diff --git a/ghc/docs/users_guide/glasgow_exts.sgml b/ghc/docs/users_guide/glasgow_exts.sgml

index 2d53f08..ddd522f 100644 (file)
--- a/ghc/docs/users_guide/glasgow_exts.sgml
+++ b/ghc/docs/users_guide/glasgow_exts.sgml
@@ -959,38 +959,19 @@ classes: exploring the design space</ULink > (Simon Peyton Jones, Mark
  Jones, Erik Meijer).
  </para>
  
-<para>
-I'd like to thank people who reported shorcomings in the GHC 3.02
-implementation.  Our default decisions were all conservative ones, and
-the experience of these heroic pioneers has given useful concrete
-examples to support several generalisations.  (These appear below as
-design choices not implemented in 3.02.)
-</para>
-
-<para>
-I've discussed these notes with Mark Jones, and I believe that Hugs
-will migrate towards the same design choices as I outline here.
-Thanks to him, and to many others who have offered very useful
-feedback.
-</para>
  
-<sect3>
+<sect3 id="type-restrictions">
  <title>Types</title>
  
  <para>
-There are the following restrictions on the form of a qualified
-type:
-</para>
-
-<para>
+GHC imposes the following restrictions on the form of a qualified
+type, whether declared in a type signature
+or inferred. Consider the type:
  
  <programlisting>
    forall tv1..tvn (c1, ...,cn) => type
  </programlisting>
  
-</para>
-
-<para>
  (Here, I write the "foralls" explicitly, although the Haskell source
  language omits them; in Haskell 1.4, all the free type variables of an
  explicit source-language type signature are universally quantified,
@@ -1005,11 +986,15 @@ in GHC, you can give the foralls if you want.  See <xref LinkEnd="universal-quan
  
  <para>
   <emphasis>Each universally quantified type variable
-<literal>tvi</literal> must be mentioned (i.e. appear free) in <literal>type</literal></emphasis>.
+<literal>tvi</literal> must be reachable from <literal>type</literal></emphasis>.
  
+A type variable is "reachable" if it it is functionally dependent
+(see <xref linkend="functional-dependencies">)
+on the type variables free in <literal>type</literal>.
  The reason for this is that a value with a type that does not obey
  this restriction could not be used without introducing
-ambiguity. Here, for example, is an illegal type:
+ambiguity. 
+Here, for example, is an illegal type:
  
  
  <programlisting>
@@ -1064,10 +1049,6 @@ territory free in case we need it later.
  
  </para>
  
-<para>
-These restrictions apply to all types, whether declared in a type signature
-or inferred.
-</para>
  
  <para>
  Unlike Haskell 1.4, constraints in types do <emphasis>not</emphasis> have to be of
@@ -1154,56 +1135,14 @@ be acyclic</emphasis>.  So these class declarations are OK:
  
  </para>
  </listitem>
-<listitem>
-
-<para>
- <emphasis>In the signature of a class operation, every constraint
-must mention at least one type variable that is not a class type
-variable</emphasis>.
-
-Thus:
-
-
-<programlisting>
-  class Collection c a where
-    mapC :: Collection c b => (a->b) -> c a -> c b
-</programlisting>
-
-
-is OK because the constraint <literal>(Collection a b)</literal> mentions
-<literal>b</literal>, even though it also mentions the class variable
-<literal>a</literal>.  On the other hand:
-
-
-<programlisting>
-  class C a where
-    op :: Eq a => (a,b) -> (a,b)
-</programlisting>
-
-
-is not OK because the constraint <literal>(Eq a)</literal> mentions on the class
-type variable <literal>a</literal>, but not <literal>b</literal>.  However, any such
-example is easily fixed by moving the offending context up to the
-superclass context:
-
  
-<programlisting>
-  class Eq a => C a where
-    op ::(a,b) -> (a,b)
-</programlisting>
-
-
-A yet more relaxed rule would allow the context of a class-op signature
-to mention only class type variables.  However, that conflicts with
-Rule 1(b) for types above.
-
-</para>
-</listitem>
  <listitem>
  
  <para>
- <emphasis>The type of each class operation must mention <emphasis>all</emphasis> of
-the class type variables</emphasis>.  For example:
+ <emphasis>All of the class type variables must be reachable (in the sense 
+mentioned in <xref linkend="type-restrictions">)
+from the free varibles of each method type
+</emphasis>.  For example:
  
  
  <programlisting>
@@ -1379,63 +1318,8 @@ For example, this is OK:
    instance Stateful (ST s) (MutVar s) where ...
  </programlisting>
  
-
-The "at least one not a type variable" restriction is to ensure that
-context reduction terminates: each reduction step removes one type
-constructor.  For example, the following would make the type checker
-loop if it wasn't excluded:
-
-
-<programlisting>
-  instance C a => C a where ...
-</programlisting>
-
-
-There are two situations in which the rule is a bit of a pain. First,
-if one allows overlapping instance declarations then it's quite
-convenient to have a "default instance" declaration that applies if
-something more specific does not:
-
-
-<programlisting>
-  instance C a where
-    op = ... -- Default
-</programlisting>
-
-
-Second, sometimes you might want to use the following to get the
-effect of a "class synonym":
-
-
-<programlisting>
-  class (C1 a, C2 a, C3 a) => C a where { }
-
-  instance (C1 a, C2 a, C3 a) => C a where { }
-</programlisting>
-
-
-This allows you to write shorter signatures:
-
-
-<programlisting>
-  f :: C a => ...
-</programlisting>
-
-
-instead of
-
-
-<programlisting>
-  f :: (C1 a, C2 a, C3 a) => ...
-</programlisting>
-
-
-I'm on the lookout for a simple rule that preserves decidability while
-allowing these idioms.  The experimental flag
-<option>-fallow-undecidable-instances</option><indexterm><primary>-fallow-undecidable-instances
-option</primary></indexterm> lifts this restriction, allowing all the types in an
-instance head to be type variables.
-
+See <xref linkend="undecidable-instances"> for an experimental
+extension to lift this restriction.
  </para>
  </listitem>
  <listitem>
@@ -1497,16 +1381,10 @@ instance C Int b => Foo b where ...
  </programlisting>
  
  
-is not OK.  Again, the intent here is to make sure that context
-reduction terminates.
+is not OK.  See <xref linkend="undecidable-instances"> for an experimental
+extension to lift this restriction.
+
  
-Voluminous correspondence on the Haskell mailing list has convinced me
-that it's worth experimenting with a more liberal rule.  If you use
-the flag <option>-fallow-undecidable-instances</option> can use arbitrary
-types in an instance context.  Termination is ensured by having a
-fixed-depth recursion stack.  If you exceed the stack depth you get a
-sort of backtrace, and the opportunity to increase the stack depth
-with <option>-fcontext-stack</option><emphasis>N</emphasis>.
  
  </para>
  </listitem>
@@ -1519,6 +1397,80 @@ with <option>-fcontext-stack</option><emphasis>N</emphasis>.
  
  </sect2>
  
+<sect2 id="undecidable-instances">
+<title>Undecidable instances</title>
+
+<para>The rules for instance declarations state that:
+<itemizedlist>
+<listitem><para>At least one of the types in the <emphasis>head</emphasis> of
+an instance declaration <emphasis>must not</emphasis> be a type variable.
+</para></listitem>
+<listitem><para>All of the types in the <emphasis>context</emphasis> of
+an instance declaration <emphasis>must</emphasis> be type variables.
+</para></listitem>
+</itemizedlist>
+These restrictions ensure that 
+context reduction terminates: each reduction step removes one type
+constructor.  For example, the following would make the type checker
+loop if it wasn't excluded:
+<programlisting>
+  instance C a => C a where ...
+</programlisting>
+There are two situations in which the rule is a bit of a pain. First,
+if one allows overlapping instance declarations then it's quite
+convenient to have a "default instance" declaration that applies if
+something more specific does not:
+
+
+<programlisting>
+  instance C a where
+    op = ... -- Default
+</programlisting>
+
+
+Second, sometimes you might want to use the following to get the
+effect of a "class synonym":
+
+
+<programlisting>
+  class (C1 a, C2 a, C3 a) => C a where { }
+
+  instance (C1 a, C2 a, C3 a) => C a where { }
+</programlisting>
+
+
+This allows you to write shorter signatures:
+
+
+<programlisting>
+  f :: C a => ...
+</programlisting>
+
+
+instead of
+
+
+<programlisting>
+  f :: (C1 a, C2 a, C3 a) => ...
+</programlisting>
+
+
+Voluminous correspondence on the Haskell mailing list has convinced me
+that it's worth experimenting with more liberal rules.  If you use
+the experimental flag <option>-fallow-undecidable-instances</option>
+<indexterm><primary>-fallow-undecidable-instances
+option</primary></indexterm>, you can use arbitrary
+types in both an instance context and instance head.  Termination is ensured by having a
+fixed-depth recursion stack.  If you exceed the stack depth you get a
+sort of backtrace, and the opportunity to increase the stack depth
+with <option>-fcontext-stack</option><emphasis>N</emphasis>.
+</para>
+<para>
+I'm on the lookout for a less brutal solution: a simple rule that preserves decidability while
+allowing these idioms interesting idioms.  
+</para>
+</sect2>
+
  <sect2 id="implicit-parameters">
  <title>Implicit parameters
  </title>
@@ -1860,8 +1812,14 @@ In Proceedings of the 9th European Symposium on Programming,
  ESOP 2000, Berlin, Germany, March 2000, Springer-Verlag LNCS 1782,
  .
  </para>
-
  <para>
+Functional dependencies are introduced by a vertical bar in the syntax of a 
+class declaration;  e.g. 
+<programlisting>
+  class (Monad m) => MonadState s m | m -> s where ...
+
+  class Foo a b c | a b -> c where ...
+</programlisting>
  There should be more documentation, but there isn't (yet).  Yell if you need it.
  </para>
  </sect2>
@@ -3519,21 +3477,13 @@ Assertion failures can be caught, see the documentation for the
  <sect2 id="inline-pragma">
  <title>INLINE pragma
  
-<indexterm><primary>INLINE pragma</primary></indexterm>
+<indexterm><primary>INLINE and NOINLINE pragmas</primary></indexterm>
  <indexterm><primary>pragma, INLINE</primary></indexterm></title>
  
  <para>
  GHC (with <option>-O</option>, as always) tries to inline (or &ldquo;unfold&rdquo;)
  functions/values that are &ldquo;small enough,&rdquo; thus avoiding the call
  overhead and possibly exposing other more-wonderful optimisations.
-</para>
-
-<para>
-You will probably see these unfoldings (in Core syntax) in your
-interface files.
-</para>
-
-<para>
  Normally, if GHC decides a function is &ldquo;too expensive&rdquo; to inline, it
  will not do so, nor will it export that unfolding for other modules to
  use.
@@ -3550,7 +3500,6 @@ key_function :: Int -> String -> (Bool, Double)
  {-# INLINE key_function #-}
  #endif
  </programlisting>
-
  (You don't need to do the C pre-processor carry-on unless you're going
  to stick the code through HBC&mdash;it doesn't like <literal>INLINE</literal> pragmas.)
  </para>
@@ -3562,7 +3511,7 @@ very keen to inline it.
  </para>
  
  <para>
-An <literal>INLINE</literal> pragma for a function can be put anywhere its type
+Syntactially, an <literal>INLINE</literal> pragma for a function can be put anywhere its type
  signature could be put.
  </para>
  
@@ -3580,11 +3529,8 @@ For example, in GHC's own <literal>UniqueSupply</literal> monad code, we have:
  
  </para>
  
-</sect2>
-
-<sect2 id="noinline-pragma">
-<title>NOINLINE pragma
-</title>
+<sect3 id="noinline-pragma">
+<title>The NOINLINE pragma </title>
  
  <indexterm><primary>NOINLINE pragma</primary></indexterm>
  <indexterm><primary>pragma</primary><secondary>NOINLINE</secondary></indexterm>
@@ -3602,9 +3548,75 @@ size.
  <literal>NOINLINE</literal> (<literal>NOTINLINE</literal> is specified
  by Haskell 98 as the standard way to disable inlining, so it should be
  used if you want your code to be portable).</para>
+</sect3>
+
+
+<sect3 id="phase-control">
+<title>Phase control</title>
+
+<para> Sometimes you want to control exactly when in GHC's pipeline
+the INLINE pragma is switched on.  Inlining happens only during runs of
+the <emphasis>simplifier</emphasis>.  Each run of the simplifier has a different
+<emphasis>phase number</emphasis>; the phase number decreases towards zero.
+If you use <option>-dverbose-core2core</option>
+you'll see the sequence of phase numbers for successive runs of the simpifier.
+In an INLINE pragma you can optionally specify a phase number, thus:
+<itemizedlist>
+<listitem> <para>You can say "inline <literal>f</literal> in Phase 2 and all subsequent
+phases":
+<programlisting>
+  {-# INLINE [2] f #-}
+</programlisting>
+</para></listitem>
+
+<listitem> <para>You can say "inline <literal>g</literal> in all phases up to, but
+not including, Phase 3":
+<programlisting>
+  {-# INLINE [~3] g #-}
+</programlisting>
+</para></listitem>
+
+<listitem> <para>If you omit the phase indicator, you mean "inline in all phases".
+</para></listitem>
+</itemizedlist>
+You can use a phase number on a NOINLINE pragma too:
+<itemizedlist>
+<listitem> <para>You can say "do not inline <literal>f</literal> until Phase 2; in 
+Phase 2 and subsequently behave as if there was no pragma at all":
+<programlisting>
+  {-# NOINLINE [2] f #-}
+</programlisting>
+</para></listitem>
+
+<listitem> <para>You can say "do not inline <literal>g</literal> in Phase 3 or any subsequent phase; 
+before that, behave as if there was no pragma":
+<programlisting>
+  {-# NOINLINE [~3] g #-}
+</programlisting>
+</para></listitem>
+
+<listitem> <para>If you omit the phase indicator, you mean "never inline this function".
+</para></listitem>
+</itemizedlist>
+</para>
+<para>The same phase-numbering control is available for RULES (<xref LinkEnd="rewrite-rules">).</para>
+</sect3>
+
+
  
  </sect2>
  
+<sect2 id="rules">
+<title>RULES pragma</title>
+
+<para>
+The RULES pragma lets you specify rewrite rules.  It is described in
+<xref LinkEnd="rewrite-rules">.
+</para>
+
+</sect2>
+
+
      <sect2 id="specialize-pragma">
        <title>SPECIALIZE pragma</title>
  
@@ -3727,16 +3739,6 @@ pragma.
  
  </sect2>
  
-<sect2 id="rules">
-<title>RULES pragma</title>
-
-<para>
-The RULES pragma lets you specify rewrite rules.  It is described in
-<xref LinkEnd="rewrite-rules">.
-</para>
-
-</sect2>
-
  <sect2 id="deprecated-pragma">
  <title>DEPRECATED pragma</title>
  
@@ -3810,16 +3812,34 @@ From a syntactic point of view:
  <listitem>
  
  <para>
+ There may be zero or more rules in a <literal>RULES</literal> pragma.
+</para>
+</listitem>
+
+<listitem>
+
+<para>
   Each rule has a name, enclosed in double quotes.  The name itself has
  no significance at all.  It is only used when reporting how many times the rule fired.
  </para>
  </listitem>
-<listitem>
  
+<listitem>
  <para>
- There may be zero or more rules in a <literal>RULES</literal> pragma.
+A rule may optionally have a phase-control number (see <xref LinkEnd="phase-control">),
+immediately after the name of the rule.  Thus:
+<programlisting>
+  {-# RULES
+        "map/map" [2]  forall f g xs. map f (map g xs) = map (f.g) xs
+  #-}
+</programlisting>
+The "[2]" means that the rule is active in Phase 2 and subsequent phases.  The inverse
+notation "[~2]" is also accepted, meaning that the rule is active up to, but not including,
+Phase 2.
  </para>
  </listitem>
+
+
  <listitem>
  
  <para>
@@ -3828,6 +3848,7 @@ is set, so you must lay out your rules starting in the same column as the
  enclosing definitions.
  </para>
  </listitem>
+
  <listitem>
  
  <para>
@@ -4303,7 +4324,7 @@ If you add <option>-dppr-debug</option> you get a more detailed listing.
  <listitem>
  
  <para>
- The defintion of (say) <function>build</function> in <FileName>PrelBase.lhs</FileName> looks llike this:
+ The defintion of (say) <function>build</function> in <FileName>GHC/Base.lhs</FileName> looks llike this:
  
  <programlisting>
          build   :: forall a. (forall b. (a -> b -> b) -> b -> b) -> [a]
@@ -4321,9 +4342,9 @@ in the RHS of the <literal>INLINE</literal> thing.  I regret the delicacy of thi
  <listitem>
  
  <para>
- In <filename>ghc/lib/std/PrelBase.lhs</filename> look at the rules for <function>map</function> to
+ In <filename>libraries/base/GHC/Base.lhs</filename> look at the rules for <function>map</function> to
  see how to write rules that will do fusion and yet give an efficient
-program even if fusion doesn't happen.  More rules in <filename>PrelList.lhs</filename>.
+program even if fusion doesn't happen.  More rules in <filename>GHC/List.lhs</filename>.
  </para>
  </listitem>
  
@@ -4333,6 +4354,69 @@ program even if fusion doesn't happen.  More rules in <filename>PrelList.lhs</fi
  
  </sect2>
  
+<sect2 id="core-pragma">
+  <title>CORE pragma</title>
+
+  <indexterm><primary>CORE pragma</primary></indexterm>
+  <indexterm><primary>pragma, CORE</primary></indexterm>
+  <indexterm><primary>core, annotation</primary></indexterm>
+
+<para>
+  The external core format supports <quote>Note</quote> annotations;
+  the <literal>CORE</literal> pragma gives a way to specify what these
+  should be in your Haskell source code.  Syntactically, core
+  annotations are attached to expressions and take a Haskell string
+  literal as an argument.  The following function definition shows an
+  example:
+
+<programlisting>
+f x = ({-# CORE "foo" #-} show) ({-# CORE "bar" #-} x)
+</programlisting>
+
+  Sematically, this is equivalent to:
+
+<programlisting>
+g x = show x
+</programlisting>
+</para>
+
+<para>
+  However, when external for is generated (via
+  <option>-fext-core</option>), there will be Notes attached to the
+  expressions <function>show</function> and <VarName>x</VarName>.
+  The core function declaration for <function>f</function> is:
+</para>
+
+<programlisting>
+  f :: %forall a . GHCziShow.ZCTShow a ->
+                   a -> GHCziBase.ZMZN GHCziBase.Char =
+    \ @ a (zddShow::GHCziShow.ZCTShow a) (eta::a) ->
+        (%note "foo"
+         %case zddShow %of (tpl::GHCziShow.ZCTShow a)
+           {GHCziShow.ZCDShow
+            (tpl1::GHCziBase.Int ->
+                   a ->
+                   GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha
+r)
+            (tpl2::a -> GHCziBase.ZMZN GHCziBase.Char)
+            (tpl3::GHCziBase.ZMZN a ->
+                   GHCziBase.ZMZN GHCziBase.Char -> GHCziBase.ZMZN GHCziBase.Cha
+r) ->
+              tpl2})
+        (%note "foo"
+         eta);
+</programlisting>
+
+<para>
+  Here, we can see that the function <function>show</function> (which
+  has been expanded out to a case expression over the Show dictionary)
+  has a <literal>%note</literal> attached to it, as does the
+  expression <VarName>eta</VarName> (which used to be called
+  <VarName>x</VarName>).
+</para>
+
+</sect2>
+
  </sect1>
  
  <sect1 id="generic-classes">
@@ -4381,7 +4465,7 @@ Now you can make a data type into an instance of Bin like this:
    instance (Bin a, Bin b) => Bin (a,b)
    instance Bin a => Bin [a]
  </programlisting>
-That is, just leave off the "where" clasuse.  Of course, you can put in the
+That is, just leave off the "where" clause.  Of course, you can put in the
  where clause and over-ride whichever methods you please.
  </para>