change build machine, use /usr/bin/tclsh instead of /usr/bin/tcl

[fleet.git] / doc / archman.tex
diff --git a/doc/archman.tex b/doc/archman.tex

index 9617822..7019dd8 100644 (file)
--- a/doc/archman.tex
+++ b/doc/archman.tex
@@ -1,8 +1,18 @@
  \documentclass[10pt,oneside]{article}
  \reversemarginpar 
+\usepackage[titles]{tocloft}
  \usepackage{emp}
+\usepackage{amsmath}
  \DeclareGraphicsRule{*}{mps}{*}{}
  
+\newcommand{\footnoteremember}[2]{
+  \footnote{#2}
+  \newcounter{#1}
+  \setcounter{#1}{\value{footnote}}
+} \newcommand{\footnoterecall}[1]{
+  \footnotemark[\value{#1}]
+}
+
  \ifx\pdftexversion\undefined
  \usepackage[dvips]{graphicx}
  \DeclareGraphicsExtensions{.eps}\else
@@ -19,7 +29,7 @@
  \usepackage{bytefield}
  \usepackage[colorlinks=true, pdfstartview=FitV, linkcolor=blue, citecolor=blue, urlcolor=blue]{hyperref}
  \renewcommand{\ttdefault}{cmtt}
-\title{The FleetTwo Architecture Manual\\{\normalsize A Programmer's View of Fleet}}
+\title{The FleetTwo Architecture Manual\\{\normalsize A Programmer's View of FleetTwo}}
  \begin{document}
  \maketitle
  
@@ -67,8 +77,21 @@ necessary to write software and compilers for Fleet.
  \end{abstract}
  
  
+%\vfill
+%\fbox{\parbox{5in}{\footnotesize
+%\begin{center}
+%The software described in this memo may be obtained using
+%\href{http://www.darcs.net/}{\tt darcs}, with the command:
+%
+%{\tt darcs get http://research.cs.berkeley.edu/class/fleet/repos/fleet/}
+%\end{center}
+%}}
+
  \pagebreak
-\section*{Introduction (from \cite{ies02})}
+\tableofcontents
+
+\pagebreak
+\section{Introduction (from \cite{ies02})}
  
  During the past two and a half years, the Fleet architecture has gone
  through massive changes and improvements.  Nonetheless, the
@@ -127,7 +150,7 @@ logic makes concurrency available if we can figure out how to use it.
  
  
  \pagebreak
-\section*{The Programmer's View of The Ship-Fabric Interface}
+\section{The Programmer's View of The Ship-Fabric Interface}
  
  The role of the Fleet switch fabric is to carry {\it packets} from
  {\it sources} to {\it destinations}.
@@ -142,7 +165,7 @@ payload is a data item, it includes one machine word of data.
  The diagram below represents a {\it programmer's} conceptual view of
  the interface between ships and the switch fabric.  Actual
  implementation circuitry may differ substantially.  Sources and
-destinations which can send and recieve only tokens -- not data items
+destinations which can send and receive only tokens -- not data items
  -- are drawn as dashed lines.
  
  \vspace{0.3cm}
@@ -152,25 +175,25 @@ destinations which can send and recieve only tokens -- not data items
  \vspace{0.3cm}
  
  The term {\it port} refers to an interface to the ship, the {\it
-  valve} connecting it to the switch fabric, and the corresponding
+  dock} connecting it to the switch fabric, and the corresponding
  sources and destinations on the switch fabric.
  
-Each valve consists of a {\it data latch}, which is as wide as a
+Each dock consists of a {\it data latch}, which is as wide as a
  single machine word and a {\it pump}, which is a circular fifo of
  instruction-width latches.  The values in the instruction fifo
  control the data latch, as future chapters will explain.
  
-Note that the pump in each valve has a destination of its own, and
+Note that the pump in each dock has a destination of its own, and
  that there is no buffering fifo guarding this destination.  Note that
  all other destinations are guarded by a buffering fifo; the size of
-this fifo is exposed to the software programmer so she can ensure that
-deadlock does not occur.
+this fifo is exposed to the software programmer so she can avoid
+deadlock.
  
  
  \pagebreak
-\section*{Data Formats}
+\section{Data Formats}
  
-\subsection*{Path (12 bits)}
+\subsection{Path (12 bits)}
  
  These bits appear physically within the switch fabric. \footnote{In
    the Sun Labs implementation, these bits have ``address bit
@@ -188,7 +211,7 @@ packet represents a token\footnote{In the Sun Labs
  \end{bytefield}
  }
  
-\subsection*{Data Word In Memory (37 bits)}
+\subsection{Data Word In Memory (37 bits)}
  
  A word of data is 37 bits wide.
  
@@ -200,7 +223,7 @@ A word of data is 37 bits wide.
  \end{bytefield}
  }
  
-\subsection*{Packet In Flight (49 bits)}
+\subsection{Packet In Flight (49 bits)}
  
  {\tt\footnotesize
  \begin{bytefield}{49}
@@ -211,7 +234,7 @@ A word of data is 37 bits wide.
  \end{bytefield}
  }
  
-\subsection*{Instruction In Memory (37 bits)}
+\subsection{Instruction In Memory (37 bits)}
  
  An instruction must be no wider than a memory word.  The next section
  explains the bits in greater detail.  The {\tt Instruction Path} is
@@ -222,7 +245,7 @@ the instruction.
  
  {\tt\tiny
  \begin{bytefield}{49}
-  \bitheader[b]{0,10,11,17,18-26,36}\\
+  \bitheader[b]{0,6,7,17,18-26,36}\\
    \bitbox[r]{12}{}
    \bitbox{11}{Instruction Path}
    \bitbox{1}{1} 
@@ -238,7 +261,7 @@ the instruction.
  \end{bytefield}
  }
  
-\subsection*{Instruction Packet In Flight (49 bits)}
+\subsection{Instruction Packet In Flight (49 bits)}
  
  Note that Fleet simply copies the {\tt Instruction Path} field, bit
  for bit, into the packet {\tt Path} field in order to ``dispatch'' an
@@ -246,7 +269,7 @@ instruction.
  
  {\tt\tiny
  \begin{bytefield}{49}
-  \bitheader[b]{0,10,11,17,18-25,37,47,48}\\
+  \bitheader[b]{0,6,7,17,18-25,37,47,48}\\
    \bitbox[r]{1}{0} 
    \bitbox{11}{Instruction Path}
    \bitbox{11}{Instruction Path}
@@ -266,7 +289,7 @@ instruction.
  \setlength{\bitwidth}{5mm}
  \pagebreak
  
-\section*{Instruction Formats}
+\section{Instruction Formats}
  \begin{wrapfigure}{R}{2in}
  \epsfig{file=locations,width=2in}
  \vspace{-0.4in}
@@ -283,6 +306,7 @@ successor of the execution point.
  
  \vspace{0.3cm}
  {\bf Kill}
+\addcontentsline{toc}{subsection}{Kill}
  
  {\tt
  \begin{bytefield}{26}
@@ -292,7 +316,7 @@ successor of the execution point.
    \bitbox{1}{1} 
    \bitbox{1}{0} 
    \bitbox{1}{1} 
-  \bitbox{14}{}
+  \bitbox{14}{don't care}
    \bitbox{7}{Count} 
  \end{bytefield}}
  When a {\tt kill} instruction reaches the insertion point, it will
@@ -301,11 +325,12 @@ point.  When this occurs, the instruction at the execution point is
  retired and the count field of the {\tt kill} instruction is
  decremented.  If the {\tt kill} instruction's count was {\tt 0} before
  decrementing, the {\tt kill} instruction is retired.  The programmer
-is assured that a kill instruction with a count of $n$ will kill $n$
+is assured that a kill instruction with a count of $n$ will kill $n+1$
  {\it consecutive} instructions.
  
  \vspace{0.3cm}
  {\bf Clog}
+\addcontentsline{toc}{subsection}{Clog}
  
  {\tt
  \begin{bytefield}{26}
@@ -315,13 +340,15 @@ is assured that a kill instruction with a count of $n$ will kill $n$
    \bitbox{1}{1} 
    \bitbox{1}{0} 
    \bitbox{1}{0} 
-  \bitbox{21}{}
+  \bitbox{21}{don't care}
  \end{bytefield}}
-When a {\tt clog} instruction reaches the execution point, no more
-instructions will be executed until an {\tt unclog} is performed.
+When a {\tt clog} instruction reaches the execution point, it remains
+there and no more instructions will be executed until an {\tt unclog}
+is performed.
  
  \vspace{0.3cm}
  {\bf UnClog}
+\addcontentsline{toc}{subsection}{UnClog}
  
  {\tt
  \begin{bytefield}{26}
@@ -331,7 +358,7 @@ instructions will be executed until an {\tt unclog} is performed.
    \bitbox{1}{0} 
    \bitbox{1}{1} 
    \bitbox{1}{0} 
-  \bitbox{21}{}
+  \bitbox{21}{don't care}
  \end{bytefield}}
  When an {\tt unclog} instruction reaches the insertion point, it will wait
  there until a {\tt clog} instruction is at the execution point.  When
@@ -339,6 +366,7 @@ this occurs, both instructions retire.
  
  \vspace{0.3cm}
  {\bf Literal}
+\addcontentsline{toc}{subsection}{Literal}
  
  {\tt
  \begin{bytefield}{26}
@@ -355,6 +383,7 @@ in the data latch.
  
  \pagebreak
  {\bf Normal Instruction}
+\addcontentsline{toc}{subsection}{Normal Instruction}
  
  {\tt
  \begin{bytefield}{26}
@@ -372,27 +401,28 @@ in the data latch.
  \end{bytefield}
  }
  
+\footnoteremember{tidi}{{\tt Ti}=1,{\tt Di}=1 is invalid on input port.}
+\footnoteremember{didc}{Note that {\tt Di}=0,{\tt Dc}=1
+                 is meaningless and therefore reserved for other
+                 uses.}
+\footnoteremember{todo}{ {\tt To}=1,{\tt Do}=1 have special meaning on an output port.}
+\footnoteremember{toig}{{\tt To}=0,{\tt Ig}=1 is invalid}
  \begin{itemize}
-
-  \item [\tt Ti] ({\bf Token Input}) wait for a token and acknowledge
-                 it\footnote{{\tt Ti}=1,{\tt Di}=1 is invalid on input port.}
+  \item [\tt Ti] ({\bf Token Input}) wait for a token and acknowledge it.
  
    \item [\tt Di] ({\bf Data Input}) wait for a datum and acknowledge it.
+                 \footnoterecall{tidi}\footnoterecall{didc}
  
    \item [\tt Dc] ({\bf Data Capture}) capture (latch) the accepted
                   datum in the data latch.  This bit is ignored if the incoming packet is
-                 a token. \footnote{ Note that {\tt Di}=0,{\tt Dc}=1
-                 is meaningless and therefore reserved for other
-                 uses.}
+                 a token. \footnoterecall{didc}
  
-  \item [\tt Do] ({\bf Data Output}) emit a datum.
+  \item [\tt Do] ({\bf Data Output}) emit a datum.\footnoterecall{todo}
  
-  \item [\tt To] ({\bf Token Output}) emit a token.\footnote{ {\tt To}=1,{\tt
-                 Do}=1 have special meaning on an output port.}
+  \item [\tt To] ({\bf Token Output}) emit a token.\footnoterecall{todo}\footnoterecall{toig}
  
    \item [\tt Ig] ({\bf Ignore {\tt To} Until Last Iteration}) ignore
-                 the {\tt To} bit unless {\tt Count=0} \footnote{{\tt
-                 To}=0,{\tt Ig}=1 is invalid}
+                 the {\tt To} bit unless {\tt Count=0}\footnoterecall{toig}
  
    \item [\tt Rq] ({\bf ReQueue}) if set, instructions having nonzero
                   count are ``Re-Queued'' rather than RePeated.  See
@@ -418,12 +448,12 @@ if (Count==0) {
  Note how a ``standing'' instruction is encoded as {\tt Count=MAX\_COUNT}       
  
  \item [\tt Dest] ({\bf Data/Token Destination})
-   Normally, this field is copied into the address portion of any
+   Normally, this field is copied into the path portion of any
     outgoing packet ({\tt Do} on an output port or {\tt To} on either).
  
     However, in the special case of an output port, if {\tt Do=1,To=1}, then
     the {\it most significant} {\tt 11} bits of the value in the {\it
-   data register} are used as a destination address instead.
+   data register} are used as a destination path instead.
     %\footnote{This
     %functionality eliminates the need for any sort of ``{\tt Execute}''
     %ship, and lets a {\tt Fifo} ship act as an extension of the
@@ -433,23 +463,25 @@ Note how a ``standing'' instruction is encoded as {\tt Count=MAX\_COUNT}
  
  
  \pagebreak
-\section*{Opcodes}
+\section{Opcodes}
  
  Opcodes, as specified in \cite{am25}, are not yet implemented, but
  should be.
  
-\section*{Bypasses}
+\section{Bypasses}
  
  Bypasses, as specified in \cite{am27}, are not yet implemented, but
  should be.
  
-\section*{Sequencing Guarantees}
+\section{Sequencing Guarantees}
  
+\addcontentsline{toc}{subsection}{Source Sequence Guarantee}
  If two packets leave the same source and have the same path, they will
  arrive at their common destination {\it in the same order in which
    they left the source}.  This assurance is referred to as the {\it
    source sequence guarantee}.
  
+\addcontentsline{toc}{subsection}{Instruction Sequence Guarantee}
  CodeBags in memory consist of an unordered set of {\it fibers}.  A
  fiber is an ordered set of instructions which all share the same
  instruction path, and therefore will be executed by the same pump.
@@ -460,7 +492,7 @@ they appear in the fiber.  This is known as the {\it instruction
  will take advantage of the source sequence guarantee in order to
  fulfill the instruction sequence guarantee.
  
-\section*{Code Bags}
+\section{Code Bags}
  
  Code bags may overlap in memory.  If the assembler determines that two
  code bags share a set of fibers, it may feel free to re-order those
@@ -469,15 +501,15 @@ for these fibers.  The codebag descriptors for the two bags will then
  have different addresses and/or sizes, but the ranges which they
  encompass will overlap.
  
-\subsection*{Dispatching Code Bags}
+\subsection{Dispatching Code Bags}
  
-A codebag is ``dispatched'' by causing its constitutent words to
+A codebag is ``dispatched'' by causing its constituent words to
  arrive at the data latch of some output port, and then causing that
  output port's pump to execute a {\tt send} (not {\tt sendto})
  instruction, which is encoded as {\tt Do=1,To=1}.  Because
  instructions are encoded in such a way that their {\tt Instruction
    Path} appears in the upper 11 bits, the {\tt send} mechanism will
-correctly route the instruction to the pump which ought to execute it.
+correctly route the instruction to a pump will execute it.
  
  Currently the {\tt inCBD} port of the {\tt Memory} ship is ideally
  suited for this purpose, although a similar effect can easily be
@@ -496,16 +528,16 @@ Note that instructions are encoded relative to a particular dispatch
  output port.  That is, the {\tt Instruction Path} field of the
  instruction specifies a path to the desired execution pump -- but this
  path is also specified with respect to some source.  Therefore the
-instruction can only be correctly dispatched from that particular
-source.  However, it is a fairly simple matter two write a software
-program which can ``relocate'' a codebag by replacing the {\tt
-  Instruction Path} fields as needed.
+instruction can be correctly dispatched only from that particular
+source.  However, it is fairly simple to write software which can
+``relocate'' a codebag by replacing the {\tt Instruction Path} fields
+as needed.
  
-\subsection*{Tokens and Data Items}
+\subsection{Tokens and Data Items}
  
  When a data item is sent to a destination which expects a token, such
  as an output port destination, the effect will be the same as if a token
-had been sent to that destination.
+had been sent to that destination, and the data value will be lost.
  
  When a token is sent to a destination which expects a data item, such
  as a pump destination or an input port destination, the effect will be
@@ -514,9 +546,9 @@ to that destination.  In other words, the programmer has no completely
  reliable mechanism for distinguishing between tokens and data items.
  
  
-\section*{Future Directions}
+\section{Future Directions}
  
-\subsection*{The Role of the Count Field}
+\subsection{The Role of the Count Field}
  
  Looking back on the design of the pump, several things are now
  apparent which were not initially.  In particular, it seems that it
@@ -529,7 +561,7 @@ other types of instructions.  This would have a few advantages:
  
  \item It would be possible to have finitely-repeating,
        infinitely-requeueing instructions.  Various implementation
-      details\footnote{in many implementations, the count field of an
+      details\footnote{In many implementations, the count field of an
        instruction is destroyed as it counts down; another register and
        mux would be needed to ``save'' this value so it could be
        restored prior to being requeued} make it awkward to support
@@ -544,12 +576,12 @@ particular instruction:
  \begin{itemize}
  \item await and acknowledge a token
  \item await and acknowledge a datum
+\item load the awaited datum into the path latch (top 11 bits)
  \item load the awaited datum into the data latch
-\item load the a literal into the data latch
-\item load a value indicated by the instruction into the path latch
-\item load the top 11 bits of the data latch into the path latch
-\item treat the values in the path latch and data latch as a packet and transmit it
-\item treat the value in the path latch as a token and transmit it
+\item load a literal into the data latch
+\item load a literal into the path latch
+\item send the value in the data latch along the path in the path latch
+\item send a token along the path in the path latch
  \item set the count register to a literal value
  \item decrement the count register
  \item requeue this instruction if {\tt count>0}
@@ -566,19 +598,19 @@ latter feature makes it possible to load the data and destination
  latches from two distinct data items, allowing ships to create,
  handle, and consume {\it packets} in the form of a pair of data items.
  
-At this point, it boils down to a question of instruction bit
-budgeting.  Currently we have a separate instruction form for
-literals, so that the literal can use bits which are not otherwise
-relevant to literal instructions (for example, {\tt Di}).  Adding an
-additional instruction form for loading the count register would have
-similar advantages.
+At this point, the remaining decisions are simply a question of
+instruction bit budgeting.  Currently we have a separate instruction
+form for literals, so that the literal can use bits which are not
+otherwise relevant to literal instructions (for example, {\tt Di}).
+Adding an additional instruction form for loading the count register
+would have similar advantages.
  
  
-\subsection*{For Just a Little Bit More...}
+\subsection{For Just a Little Bit More...}
  
-Compared to early revisions of the Fleet architecture, the valves of
+Compared to early revisions of the Fleet architecture, the docks of
  FleetTwo are substantially more powerful.  One might ask, why not add
-additional power to the valves, perhaps up to the point of making them
+additional power to the docks, perhaps up to the point of making them
  tiny von Neumann machines in their own right?  Indeed, doing so would
  risk reliving the {\it Wheel of Reincarnation}
  \cite{WheelOfReincarnation}.
@@ -590,26 +622,26 @@ founded in the geometry of VLSI, and indeed, Euclidean space.
  Ultimately, the instruction stream for any component of a processor
  must exist in some physical incarnation -- some geometric extent.  If
  the control flow of a program is linear, with no branching, then the
-instruction stream has an optimal spatial mapping which is quite
-efficient: it is simply laid out in FIFO fashion.
+instruction stream has an optimal spatial mapping which is 
+efficient: simply lay it out as a FIFO.
  
  By contrast, however, a program whose instruction stream has
-unrestricted branching is logically a {\it tree} structure.  Quite
-unfortunately, general trees have no efficient mapping onto
+unrestricted branching is logically a {\it tree} structure.  
+Unfortunately, however, general trees have no efficient mapping onto
  two-dimensional surfaces.  By this, I mean that in every possible
  mapping, there exists some pair of {\it logically adjacent}
  instructions whose {\it physical distance} is proportional to the size
  of the entire program, rather than some constant factor irrespective
  of the program size.  Therefore I conclude that the spatial mapping of
  a program with unrestricted branching is in an entirely different
-class than that of programs with linear control flow.  I choose this
-distinction as the basis for limiting our valves to linear control
+class than that of programs with linear control flow, and I choose this
+distinction as the basis for limiting the docks to linear control
  flow sequences.  A similar argument can be made for looping constructs
  in which two or more loops may be nested within a single parent loop.
  
  With this distinction in mind, it would not be objectionable for
  future incarnations of Fleet to have {\it predicated} instructions at
-the valves.  This might take the form of:
+the docks.  This might take the form of:
  
  \begin{itemize}
  \item Instructions to unconditionally set some ``state flags''
@@ -619,9 +651,11 @@ the valves.  This might take the form of:
  
  I feel that this addition would not risk falling down the slippery
  slope -- instruction streams with predicated instructions (but not
-unrestricted branching) preserve the crucial property that they have
-an efficient spatial mapping.  Any steps beyond this, however, cross a
-fairly fundamental line.
+unrestricted branching or nested looping) preserve the crucial
+property that they have an efficient spatial mapping.  Any steps
+beyond this, however, cross a fairly fundamental line.
+
  
+\addcontentsline{toc}{section}{Ship Descriptions}