[project @ 2001-10-26 11:53:34 by sewardj]

[ghc-hetmet.git] / ghc / docs / ghci / ghci.tex
diff --git a/ghc/docs/ghci/ghci.tex b/ghc/docs/ghci/ghci.tex

index 8939956..c4638a6 100644 (file)
--- a/ghc/docs/ghci/ghci.tex
+++ b/ghc/docs/ghci/ghci.tex
@@ -57,13 +57,6 @@
  %%%\newpage
  
  %%-----------------------------------------------------------------%%
-\section*{Misc text looking for a home}
-
-@compile@ is supplied with, and checks PIT (inside PCS) before
-reading package interfaces, so it doesn't read and add duplicate
-@ModIFace@s to PIT.
-
-
  \section{Details}
  
  \subsection{Outline of the design}
@@ -124,7 +117,7 @@ visibility.  Subsequent sections elaborate who can see what.
        unlinked translations of home modules only.
  \item {\bf Module Graph (MG)} (owner: CM) is the current module graph.
  \item {\bf Static Info (SI)} (owner: CM) is the package configuration
-      information and compiler flags.
+      information (PCI) and compiler flags (FLAGS).
  \item {\bf Persistent Compiler State (PCS)} (owner: @compile@)
        is @compile@'s private cache of information about package
        modules.
@@ -154,7 +147,7 @@ maps, so they are given a @Unique@.
  \end{verbatim}
  
  A @ModLocation@ says where a module is, what it's called and in what
-form it it.
+form it is.
  \begin{verbatim}
     data ModLocation = SourceOnly Module Path         -- .hs
                      | ObjectCode Module Path Path    -- .o, .hi
@@ -187,9 +180,9 @@ can be created quickly.
  \begin{verbatim}
     data ModSummary = ModSummary 
                          ModLocation   -- location and kind
-                        Maybe (String, Fingerprint)
+                        (Maybe (String, Fingerprint))
                                        -- source and fingerprint if .hs
-                        [ModName]     -- imports
+                        (Maybe [ModName])     -- imports if .hs or .hi
  
     type Fingerprint = ...  -- file timestamp, or source checksum?
  
@@ -261,7 +254,7 @@ inspecting them.
     to other entities, regardless of module boundaries.  HST only holds
     information for home modules; the corresponding wired-up details
     for package (non-home) modules are created on demand in the package
-   symbol table (PST) inside the persistent compiler's state (PST).
+   symbol table (PST) inside the persistent compiler's state (PCS).
  
     CM maintains the HST, which is passed to, but not modified by,
     @compile@.  If compilation of a module is successful, @compile@
@@ -311,16 +304,16 @@ inspecting them.
     single @Linkable@ -- as is the case for any module from a
     multi-module package.  For these reasons it seems appropriate to
     keep the two concepts distinct.  @Linkable@s also provide
-   information about the sequence in which individual package package
-   components should be linked, and that insn't the business of any
+   information about the sequence in which individual package
+   components should be linked, and that isn't the business of any
     specific module to know.
  
     CM passes @compile@ a module's old @ModIFace@, if it has one, in
     the hope that the module won't need recompiling.  If so, @compile@
-   can just return the @ModIFace@ along with a new @ModDetails@
-   created from it.  Similarly, CM passes in a module's old
-   @Linkable@, if it has one, and that's returned unchanged if the
-   module isn't recompiled.
+   can just return the new @ModDetails@ created from it, and CM will
+   re-use the old @ModIFace@.  If the module {\em is} recompiled (or 
+   scheduled to be loaded from disk), @compile@ returns both the 
+   new @ModIFace@ and new @Linkable@.
  
  \item 
     {\bf Module Graph (MG)} @:: known-only-to-CM@
@@ -341,15 +334,15 @@ It pretty much implements the HEP interface.  First, though, define a
  containing structure for the state of the entire CM system and its
  subsystems @compile@ and @link@:
  \begin{verbatim}
-   data SysState 
-      = SysState PCMS      -- CM's stuff
-                 PCS       -- compile's stuff
-                 PLS       -- link's stuff
-                 SI        -- the static info, never changes
-                 Finder    -- the finder
+   data CmState 
+      = CmState PCMS      -- CM's stuff
+                PCS       -- compile's stuff
+                PLS       -- link's stuff
+                SI        -- the static info, never changes
+                Finder    -- the finder
  \end{verbatim}
  
-Then @SysState@ is threaded through the HEP interface.  In reality
+The @CmState@ is threaded through the HEP interface.  In reality
  this might be done using @IORef@s, but for clarity:
  \begin{verbatim}
    type ModHandle = ... (opaque to CM/HEP clients) ...
@@ -357,79 +350,80 @@ this might be done using @IORef@s, but for clarity:
  
    cmInit       :: FLAGS 
                 -> [PkgInfo]
-               -> IO SysState
+               -> IO CmState
  
-  cmLoadModule :: SysState 
+  cmLoadModule :: CmState 
                 -> ModName 
-               -> IO (SysState, Either [SDoc] ModHandle)
+               -> IO (CmState, Either [SDoc] ModHandle)
  
    cmGetExpr    :: ModHandle 
-               -> SysState 
-               -> String -> IO (SysState, Either [SDoc] HValue)
+               -> CmState 
+               -> String -> IO (CmState, Either [SDoc] HValue)
  
-  cmRunExpr    :: HValue -> IO ()   -- don't need SysState here
+  cmRunExpr    :: HValue -> IO ()   -- don't need CmState here
  \end{verbatim}
  Almost all the huff and puff in this document pertains to @cmLoadModule@.
  
  
  \subsubsection{Implementing \mbox{\tt cmInit}}
-@cmInit@ creates an empty @SysState@ using @emptyPCMS@, @emptyPCS@,
+@cmInit@ creates an empty @CmState@ using @emptyPCMS@, @emptyPCS@,
  @emptyPLS@, making SI from the supplied flags and package info, and 
  by supplying the package info the @newFinder@.
  
  
  \subsubsection{Implementing \mbox{\tt cmLoadModule}}
  
-\begin{itemize}
+\begin{enumerate}
  \item {\bf Downsweep:} using @finder@ and @summarise@, chase from 
        the given module to
        establish the new home module graph (MG).  Do not chase into
        package modules.
-\item Remove from the old module graph the upward closure of 
-      all modules which are now out of date.  Also remove any 
-      modules which wouldn't be reachable from the new root module.
-      \ToDo{De-woollify this.}
-
-      Removing a module means also removing it from HST, UI and PLS,
-      but not from HIT.
-
-      Result: an up-to-date module graph, partitioned into 
-      modules which are in-date, and those needing recompiling.
-
-      Burble burble.
-\end{itemize}
-
-Pretty much as before.  \ToDo{... and what was Before?}
-
-Plus: detect module cycles during the downsweep.  During the upsweep,
-ensure that compilation failures for modules in cycles do not leave
-any of the global structures in an inconsistent state.  
-\begin{itemize}
-\item 
-   For PCS, that's never a problem because PCS doesn't hold any
-   information pertaining to home modules.
-\item 
-   HST and HIT: CM knows that these are mappings from @Module@ to
-   whatever, and can throw away entries from failed cycles, or,
-   equivalently, not commit updates to them until cycles succeed,
-   remembering of course to synthesise appropriate HSTs during
-   compilation of a cycle.
-\item 
-   UI -- a collection of @Linkable@s, between which there are no
-   direct refererences, so CM can remove additions from failed cycles
-   with no difficulty.
-\item 
-   OST -- linking is not carried out until the upsweep has
-   succeeded, so there's no problem here.
-\end{itemize}
-
-Plus: clear out the global data structures after the downsweep but
-before the upsweep.
-
-\ToDo{CM needs to supply a way for @compile@ to know which modules in
-      HST are in its downwards closure, and which not, so it can
-      correctly construct its instance environment.}
-
+\item Remove from HIT, HST, UI any modules in the old MG which are
+      not in the new one.  The old MG is then replaced by the new one.
+\item Topologically sort MG to generate a bottom-to-top traversal
+      order, giving a worklist.
+\item {\bf Upsweep:} call @compile@ on each module in the worklist in 
+      turn, passing it
+      the ``correct'' HST, PCS, the old @ModIFace@ if
+      available, and the summary.  ``Correct'' HST in the sense that
+      HST contains only the modules in the this module's downward
+      closure, so that @compile@ can construct the correct instance
+      and rule environments simply as the union of those in 
+      the module's downward closure.
+
+      If @compile@ doesn't return a new interface/linkable pair,
+      compilation wasn't necessary.  Either way, update HST with
+      the new @ModDetails@, and UI and HIT respectively if a 
+      compilation {\em did} occur.
+
+      Keep going until the root module is successfully done, or
+      compilation fails.
+      
+\item If the previous step terminated because compilation failed,
+      define the successful set as those modules in successfully
+      completed SCCs, i.e. all @Linkable@s returned by @compile@ excluding
+      those from modules in any cycle which includes the module which failed.
+      Remove from HST, HIT, UI and MG all modules mentioned in MG which 
+      are not in the successful set.  Call @link@ with the successful
+      set,
+      which should succeed.  The net effect is to back off to a point
+      in which those modules which are still aboard are correctly
+      compiled and linked.
+
+      If the previous step terminated successfully, 
+      call @link@ passing it the @Linkable@s in the upward closure of
+      all those modules for which @compile@ produced a new @Linkable@.
+\end{enumerate}
+As a small optimisation, do this:
+\begin{enumerate}
+\item[3a.] Remove from the worklist any module M where M's source
+     hasn't changed and neither has the source of any module in M's
+     downward closure.  This has the effect of not starting the upsweep
+     right at the bottom of the graph when that's not needed.
+     Source-change checking can be done quickly by CM by comparing
+     summaries of modules in MG against corresponding 
+     summaries from the old MG.
+\end{enumerate}
  
  
  %%-- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --%%
@@ -452,19 +446,14 @@ date.  There are three parts to it:
     caches them in the PIT.  Subsequent imports of the same module get
     them directly out of the PIT, avoiding slow lexing/parsing phases.
     Because foreign packages are assumed never to become out of date,
-   all contents of PIT remain valid forever.
-
-   Successful runs of @compile@ can add arbitrary numbers of new
-   interfaces to the PIT.  Failed runs could also contribute any new
-   interfaces read, but this could create inconsistencies between the
-   PIT and the unlinked images (UI).  Specifically, we don't want the
-   PIT to acquire interfaces for which UI hasn't got a corresponding
-   @Linkable@, and we don't want @Linkable@s from failed compilation
-   runs to enter UI, because we can't be sure that they are actually
-   necessary for a successful link.  So it seems simplest, albeit at a
-   small compilation speed loss, for @compile@ not to update PCS at
-   all following a failed compile.  We may revisit this
-   decision later.
+   all contents of PIT remain valid forever.  @compile@ of course
+   tries to find package interfaces in PIT in preference to reading
+   them from files.  
+
+   Both successful and failed runs of @compile@ can add arbitrary
+   numbers of new interfaces to the PIT.  The failed runs don't matter
+   because we assume that packages are static, so the data cached even
+   by a failed run is valid forever (ie for the rest of the session).
  
     \item
        {\bf Package Symbol Table (PST)} @:: FiniteMap Module ModDetails@
@@ -478,16 +467,18 @@ date.  There are three parts to it:
     interfaces, and we don't want to do that unnecessarily.
  
     The PST avoids these problems by allowing incremental wiring-in to
-   happen.  Pieces of foreign interfaces are renamed and placed in the
-   PST, but only as @compile@ discovers it needs them.  In the process
-   of incremental renaming, @compile@ may need to read more package
-   interfaces, which are returned to CM to add to the PIT.
+   happen.  Pieces of foreign interfaces are copied out of the holding
+   pen (HP), renamed, typechecked, and placed in the PST, but only as
+   @compile@ discovers it needs them.  In the process of incremental
+   renaming/typechecking, @compile@ may need to read more package
+   interfaces, which are added to the PIT and hence to 
+   HP.~\ToDo{How? When?}
  
     CM passes the PST to @compile@ and is returned an updated version
     on both success and failure.
  
     \item 
-      {\bf Holding Pen (HP)} @:: Ifaces@ 
+      {\bf Holding Pen (HP)} @:: HoldingPen@ 
  
     HP holds parsed but not-yet renamed-or-typechecked fragments of
     package interfaces.  As typechecking of other modules progresses,
@@ -515,26 +506,25 @@ date.  There are three parts to it:
  
  \subsubsection{What {\tt compile} does}
  @compile@ is necessarily somewhat complex.  We've decided to do away
-with private global variables -- they make the design harder to
-understand and may interfere with CM's need to roll the system back
-to a consistent state following compilation failure for modules in 
-a cycle.  Without further ado:
+with private global variables -- they make the design specification
+less clear, although the implementation might use them.  Without
+further ado:
  \begin{verbatim}
-   compile :: FLAGS       -- obvious
+   compile :: SI          -- obvious
             -> Finder      -- to find modules
             -> ModSummary  -- summary, including source
-           -> Maybe (ModIFace, Linkable)
-                          -- former summary and code, if avail
+           -> Maybe ModIFace
+                          -- former summary, if avail
             -> HST         -- for home module ModDetails
             -> PCS         -- IN: the persistent compiler state
  
-           -> CompResult
+           -> IO CompResult
  
     data CompResult
        = CompOK  ModDetails   -- new details (== HST additions)
-                (ModIFace, Linkable)
-                             -- summary and code; same as went in if
-                             -- compilation was not needed
+                (Maybe (ModIFace, Linkable))
+                             -- summary and code; Nothing => compilation
+                             -- not needed (old summary and code are still valid)
                  PCS          -- updated PCS
                  [SDoc]       -- warnings
  
@@ -543,7 +533,8 @@ a cycle.  Without further ado:
  
     data PCS
        = MkPCS PIT         -- package interfaces
-              PST         -- rename cache/global symtab contents
+              PST         -- post slurping global symtab contribs
+              HoldingPen  -- pre slurping interface bits and pieces
  
     emptyPCS :: IO PCS     -- since CM has no other way to make one
  \end{verbatim}
@@ -573,10 +564,10 @@ What @compile@ does: \ToDo{A bit vague ... needs refining.  How does
  
  \item
     If recompilation is not needed, create a new @ModDetails@ from the
-   old @ModIFace@, looking up information in HST and PCS.PST as necessary.
-   Return the new details, the old @ModIFace@ and @Linkable@, the PCS
-   \ToDo{I don't think the PCS should be updated, but who knows?}, and
-   an empty warning list.
+   old @ModIFace@, looking up information in HST and PCS.PST as
+   necessary.  Return the new details, a @Nothing@ denoting
+   compilation was not needed, the PCS \ToDo{I don't think the PCS
+   should be updated, but who knows?}, and an empty warning list.
  
  \item
     Otherwise, compilation is needed.  
@@ -599,7 +590,7 @@ What @compile@ does: \ToDo{A bit vague ... needs refining.  How does
  
  
  \subsubsection{Contents of \mbox{\tt ModDetails}, 
-               \mbox{\tt ModIFace} and \mbox{\tt Ifaces}}
+               \mbox{\tt ModIFace} and \mbox{\tt HoldingPen}}
  Only @compile@ can see inside these three types -- they are opaque to
  everyone else.  @ModDetails@ holds the post-renaming,
  post-typechecking environment created by compiling a module.
@@ -635,11 +626,11 @@ post-typechecking environment created by compiling a module.
         }
  \end{verbatim}
  
-@Ifaces@ is a cleaned-up version of that found in @RnMonad.lhs@, 
+@HoldingPen@ is a cleaned-up version of that found in @RnMonad.lhs@, 
  retaining just the 3 pieces actually comprising the holding pen:
  \begin{verbatim}
-   data Ifaces 
-      = Ifaces {
+   data HoldingPen 
+      = HoldingPen {
             iDecls :: DeclsMap,     -- A single, global map of Names to decls
  
             iInsts :: IfaceInsts,
@@ -767,10 +758,10 @@ Anyway:
  \begin{verbatim}
     data PLS -- as described above; opaque to everybody except the linker
  
-   link :: PCI -> ??? -> [[Linkable]] -> LinkState -> IO LinkResult
+   link :: PCI -> ??? -> [[Linkable]] -> PLS -> IO LinkResult
  
-   data LinkResult = LinkOK   LinkState
-                   | LinkErrs LinkState [SDoc]
+   data LinkResult = LinkOK   PLS
+                   | LinkErrs PLS [SDoc]
  
     emptyPLS :: IO PLS     -- since CM has no other way to make one
  \end{verbatim}
@@ -812,7 +803,7 @@ it may not be possible to link recursive groups containing libraries.
  \end{itemize}
  
  If linking in of a group should fail for some reason, @link@ should
-not modify its @LinkState@ at all.  In other words, linking each group
+not modify its PLS at all.  In other words, linking each group
  is atomic; it either succeeds or fails.
  
  \subsubsection*{\mbox{\tt Unlinked} and \mbox{\tt Linkable}}