%%%\newpage
%%-----------------------------------------------------------------%%
-\section*{Misc text looking for a home}
-
-@compile@ is supplied with, and checks PIT (inside PCS) before
-reading package interfaces, so it doesn't read and add duplicate
-@ModIFace@s to PIT.
-
-
\section{Details}
\subsection{Outline of the design}
unlinked translations of home modules only.
\item {\bf Module Graph (MG)} (owner: CM) is the current module graph.
\item {\bf Static Info (SI)} (owner: CM) is the package configuration
- information and compiler flags.
+ information (PCI) and compiler flags (FLAGS).
\item {\bf Persistent Compiler State (PCS)} (owner: @compile@)
is @compile@'s private cache of information about package
modules.
\end{verbatim}
A @ModLocation@ says where a module is, what it's called and in what
-form it it.
+form it is.
\begin{verbatim}
data ModLocation = SourceOnly Module Path -- .hs
| ObjectCode Module Path Path -- .o, .hi
\begin{verbatim}
data ModSummary = ModSummary
ModLocation -- location and kind
- Maybe (String, Fingerprint)
+ (Maybe (String, Fingerprint))
-- source and fingerprint if .hs
- [ModName] -- imports
+ (Maybe [ModName]) -- imports if .hs or .hi
type Fingerprint = ... -- file timestamp, or source checksum?
to other entities, regardless of module boundaries. HST only holds
information for home modules; the corresponding wired-up details
for package (non-home) modules are created on demand in the package
- symbol table (PST) inside the persistent compiler's state (PST).
+ symbol table (PST) inside the persistent compiler's state (PCS).
CM maintains the HST, which is passed to, but not modified by,
@compile@. If compilation of a module is successful, @compile@
single @Linkable@ -- as is the case for any module from a
multi-module package. For these reasons it seems appropriate to
keep the two concepts distinct. @Linkable@s also provide
- information about the sequence in which individual package package
- components should be linked, and that insn't the business of any
+ information about the sequence in which individual package
+ components should be linked, and that isn't the business of any
specific module to know.
CM passes @compile@ a module's old @ModIFace@, if it has one, in
the hope that the module won't need recompiling. If so, @compile@
- can just return the @ModIFace@ along with a new @ModDetails@
- created from it. Similarly, CM passes in a module's old
- @Linkable@, if it has one, and that's returned unchanged if the
- module isn't recompiled.
+ can just return the new @ModDetails@ created from it, and CM will
+ re-use the old @ModIFace@. If the module {\em is} recompiled (or
+ scheduled to be loaded from disk), @compile@ returns both the
+ new @ModIFace@ and new @Linkable@.
\item
{\bf Module Graph (MG)} @:: known-only-to-CM@
containing structure for the state of the entire CM system and its
subsystems @compile@ and @link@:
\begin{verbatim}
- data SysState
- = SysState PCMS -- CM's stuff
- PCS -- compile's stuff
- PLS -- link's stuff
- SI -- the static info, never changes
- Finder -- the finder
+ data CmState
+ = CmState PCMS -- CM's stuff
+ PCS -- compile's stuff
+ PLS -- link's stuff
+ SI -- the static info, never changes
+ Finder -- the finder
\end{verbatim}
-Then @SysState@ is threaded through the HEP interface. In reality
+The @CmState@ is threaded through the HEP interface. In reality
this might be done using @IORef@s, but for clarity:
\begin{verbatim}
type ModHandle = ... (opaque to CM/HEP clients) ...
cmInit :: FLAGS
-> [PkgInfo]
- -> IO SysState
+ -> IO CmState
- cmLoadModule :: SysState
+ cmLoadModule :: CmState
-> ModName
- -> IO (SysState, Either [SDoc] ModHandle)
+ -> IO (CmState, Either [SDoc] ModHandle)
cmGetExpr :: ModHandle
- -> SysState
- -> String -> IO (SysState, Either [SDoc] HValue)
+ -> CmState
+ -> String -> IO (CmState, Either [SDoc] HValue)
- cmRunExpr :: HValue -> IO () -- don't need SysState here
+ cmRunExpr :: HValue -> IO () -- don't need CmState here
\end{verbatim}
Almost all the huff and puff in this document pertains to @cmLoadModule@.
\subsubsection{Implementing \mbox{\tt cmInit}}
-@cmInit@ creates an empty @SysState@ using @emptyPCMS@, @emptyPCS@,
+@cmInit@ creates an empty @CmState@ using @emptyPCMS@, @emptyPCS@,
@emptyPLS@, making SI from the supplied flags and package info, and
by supplying the package info to @newFinder@.
\subsubsection{Implementing \mbox{\tt cmLoadModule}}
-\begin{itemize}
+\begin{enumerate}
\item {\bf Downsweep:} using @finder@ and @summarise@, chase from
the given module to
establish the new home module graph (MG). Do not chase into
package modules.
-\item Remove from the old module graph the upward closure of
- all modules which are now out of date. Also remove any
- modules which wouldn't be reachable from the new root module.
- \ToDo{De-woollify this.}
-
- Removing a module means also removing it from HST, UI and PLS,
- but not from HIT.
-
- Result: an up-to-date module graph, partitioned into
- modules which are in-date, and those needing recompiling.
-
- Burble burble.
-\end{itemize}
-
-Pretty much as before. \ToDo{... and what was Before?}
-
-Plus: detect module cycles during the downsweep. During the upsweep,
-ensure that compilation failures for modules in cycles do not leave
-any of the global structures in an inconsistent state.
-\begin{itemize}
-\item
- For PCS, that's never a problem because PCS doesn't hold any
- information pertaining to home modules.
-\item
- HST and HIT: CM knows that these are mappings from @Module@ to
- whatever, and can throw away entries from failed cycles, or,
- equivalently, not commit updates to them until cycles succeed,
- remembering of course to synthesise appropriate HSTs during
- compilation of a cycle.
-\item
- UI -- a collection of @Linkable@s, between which there are no
- direct refererences, so CM can remove additions from failed cycles
- with no difficulty.
-\item
- OST -- linking is not carried out until the upsweep has
- succeeded, so there's no problem here.
-\end{itemize}
-
-Plus: clear out the global data structures after the downsweep but
-before the upsweep.
-
-\ToDo{CM needs to supply a way for @compile@ to know which modules in
- HST are in its downwards closure, and which not, so it can
- correctly construct its instance environment.}
-
+\item Remove from HIT, HST, UI any modules in the old MG which are
+ not in the new one. The old MG is then replaced by the new one.
+\item Topologically sort MG to generate a bottom-to-top traversal
+ order, giving a worklist.
+\item {\bf Upsweep:} call @compile@ on each module in the worklist in
+ turn, passing it
+ the ``correct'' HST, PCS, the old @ModIFace@ if
+ available, and the summary. ``Correct'' HST in the sense that
+         HST contains only the modules in this module's downward
+ closure, so that @compile@ can construct the correct instance
+ and rule environments simply as the union of those in
+ the module's downward closure.
+
+ If @compile@ doesn't return a new interface/linkable pair,
+ compilation wasn't necessary. Either way, update HST with
+ the new @ModDetails@, and UI and HIT respectively if a
+ compilation {\em did} occur.
+
+ Keep going until the root module is successfully done, or
+ compilation fails.
+
+\item If the previous step terminated because compilation failed,
+ define the successful set as those modules in successfully
+ completed SCCs, i.e. all @Linkable@s returned by @compile@ excluding
+ those from modules in any cycle which includes the module which failed.
+ Remove from HST, HIT, UI and MG all modules mentioned in MG which
+ are not in the successful set. Call @link@ with the successful
+ set,
+ which should succeed. The net effect is to back off to a point
+      at which those modules which are still aboard are correctly
+ compiled and linked.
+
+ If the previous step terminated successfully,
+ call @link@ passing it the @Linkable@s in the upward closure of
+ all those modules for which @compile@ produced a new @Linkable@.
+\end{enumerate}
+As a small optimisation, do this:
+\begin{enumerate}
+\item[3a.] Remove from the worklist any module M where M's source
+ hasn't changed and neither has the source of any module in M's
+ downward closure. This has the effect of not starting the upsweep
+ right at the bottom of the graph when that's not needed.
+ Source-change checking can be done quickly by CM by comparing
+ summaries of modules in MG against corresponding
+ summaries from the old MG.
+\end{enumerate}
%%-- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- --%%
caches them in the PIT. Subsequent imports of the same module get
them directly out of the PIT, avoiding slow lexing/parsing phases.
Because foreign packages are assumed never to become out of date,
- all contents of PIT remain valid forever.
-
- Successful runs of @compile@ can add arbitrary numbers of new
- interfaces to the PIT. Failed runs could also contribute any new
- interfaces read, but this could create inconsistencies between the
- PIT and the unlinked images (UI). Specifically, we don't want the
- PIT to acquire interfaces for which UI hasn't got a corresponding
- @Linkable@, and we don't want @Linkable@s from failed compilation
- runs to enter UI, because we can't be sure that they are actually
- necessary for a successful link. So it seems simplest, albeit at a
- small compilation speed loss, for @compile@ not to update PCS at
- all following a failed compile. We may revisit this
- decision later.
+ all contents of PIT remain valid forever. @compile@ of course
+ tries to find package interfaces in PIT in preference to reading
+ them from files.
+
+ Both successful and failed runs of @compile@ can add arbitrary
+ numbers of new interfaces to the PIT. The failed runs don't matter
+ because we assume that packages are static, so the data cached even
+    by a failed run is valid forever (i.e. for the rest of the session).
\item
{\bf Package Symbol Table (PST)} @:: FiniteMap Module ModDetails@
interfaces, and we don't want to do that unnecessarily.
The PST avoids these problems by allowing incremental wiring-in to
- happen. Pieces of foreign interfaces are renamed and placed in the
- PST, but only as @compile@ discovers it needs them. In the process
- of incremental renaming, @compile@ may need to read more package
- interfaces, which are returned to CM to add to the PIT.
+ happen. Pieces of foreign interfaces are copied out of the holding
+ pen (HP), renamed, typechecked, and placed in the PST, but only as
+ @compile@ discovers it needs them. In the process of incremental
+ renaming/typechecking, @compile@ may need to read more package
+ interfaces, which are added to the PIT and hence to
+ HP.~\ToDo{How? When?}
CM passes the PST to @compile@ and is returned an updated version
on both success and failure.
\item
- {\bf Holding Pen (HP)} @:: Ifaces@
+ {\bf Holding Pen (HP)} @:: HoldingPen@
HP holds parsed but not-yet renamed-or-typechecked fragments of
package interfaces. As typechecking of other modules progresses,
\subsubsection{What {\tt compile} does}
@compile@ is necessarily somewhat complex. We've decided to do away
-with private global variables -- they make the design harder to
-understand and may interfere with CM's need to roll the system back
-to a consistent state following compilation failure for modules in
-a cycle. Without further ado:
+with private global variables -- they make the design specification
+less clear, although the implementation might use them. Without
+further ado:
\begin{verbatim}
- compile :: FLAGS -- obvious
+ compile :: SI -- obvious
-> Finder -- to find modules
-> ModSummary -- summary, including source
- -> Maybe (ModIFace, Linkable)
- -- former summary and code, if avail
+ -> Maybe ModIFace
+ -- former summary, if avail
-> HST -- for home module ModDetails
-> PCS -- IN: the persistent compiler state
- -> CompResult
+ -> IO CompResult
data CompResult
= CompOK ModDetails -- new details (== HST additions)
- (ModIFace, Linkable)
- -- summary and code; same as went in if
- -- compilation was not needed
+ (Maybe (ModIFace, Linkable))
+ -- summary and code; Nothing => compilation
+ -- not needed (old summary and code are still valid)
PCS -- updated PCS
[SDoc] -- warnings
data PCS
= MkPCS PIT -- package interfaces
- PST -- rename cache/global symtab contents
+ PST -- post slurping global symtab contribs
+ HoldingPen -- pre slurping interface bits and pieces
emptyPCS :: IO PCS -- since CM has no other way to make one
\end{verbatim}
\item
If recompilation is not needed, create a new @ModDetails@ from the
- old @ModIFace@, looking up information in HST and PCS.PST as necessary.
- Return the new details, the old @ModIFace@ and @Linkable@, the PCS
- \ToDo{I don't think the PCS should be updated, but who knows?}, and
- an empty warning list.
+ old @ModIFace@, looking up information in HST and PCS.PST as
+ necessary. Return the new details, a @Nothing@ denoting
+ compilation was not needed, the PCS \ToDo{I don't think the PCS
+ should be updated, but who knows?}, and an empty warning list.
\item
Otherwise, compilation is needed.
\subsubsection{Contents of \mbox{\tt ModDetails},
- \mbox{\tt ModIFace} and \mbox{\tt Ifaces}}
+ \mbox{\tt ModIFace} and \mbox{\tt HoldingPen}}
Only @compile@ can see inside these three types -- they are opaque to
everyone else. @ModDetails@ holds the post-renaming,
post-typechecking environment created by compiling a module.
}
\end{verbatim}
-@Ifaces@ is a cleaned-up version of that found in @RnMonad.lhs@,
+@HoldingPen@ is a cleaned-up version of that found in @RnMonad.lhs@,
retaining just the 3 pieces actually comprising the holding pen:
\begin{verbatim}
- data Ifaces
- = Ifaces {
+ data HoldingPen
+ = HoldingPen {
iDecls :: DeclsMap, -- A single, global map of Names to decls
iInsts :: IfaceInsts,
\begin{verbatim}
data PLS -- as described above; opaque to everybody except the linker
- link :: PCI -> ??? -> [[Linkable]] -> LinkState -> IO LinkResult
+ link :: PCI -> ??? -> [[Linkable]] -> PLS -> IO LinkResult
- data LinkResult = LinkOK LinkState
- | LinkErrs LinkState [SDoc]
+ data LinkResult = LinkOK PLS
+ | LinkErrs PLS [SDoc]
emptyPLS :: IO PLS -- since CM has no other way to make one
\end{verbatim}
\end{itemize}
If linking in of a group should fail for some reason, @link@ should
-not modify its @LinkState@ at all. In other words, linking each group
+not modify its PLS at all. In other words, linking each group
is atomic; it either succeeds or fails.
\subsubsection*{\mbox{\tt Unlinked} and \mbox{\tt Linkable}}