From b3d6596170bd4c060a61093860e868a46d68daa1 Mon Sep 17 00:00:00 2001 From: adam Date: Sun, 26 Aug 2007 11:50:10 +0100 Subject: [PATCH] major updates to manual --- doc/archman.tex | 303 +++++++++++++++++++++++++++++++++++++++-------------- doc/locations.svg | 271 +++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 498 insertions(+), 76 deletions(-) create mode 100644 doc/locations.svg diff --git a/doc/archman.tex b/doc/archman.tex index c969280..7e320e2 100644 --- a/doc/archman.tex +++ b/doc/archman.tex @@ -1,6 +1,7 @@ \documentclass[10pt,oneside]{article} \reversemarginpar \usepackage{palatino} +\usepackage{wrapfig} \usepackage{epsfig} \usepackage{verbatim} \usepackage{parskip} @@ -22,6 +23,9 @@ \bibitem[IES44]{ies44} Sutherland, Ivan, \href{http://research.cs.berkeley.edu/class/fleet/docs/people/ivan.e.sutherland/ies44-Fleet.Definition.pdf}{\it UCB-IES44: The Fleet Definition} +\bibitem[MS68]{WheelOfReincarnation} + Myer, T. and Sutherland, Ivan, {\it On the Design of Display Processors} + Communications of the ACM, Vol 11, No. 6, June 1968 . \end{thebibliography} \vspace{3cm} @@ -44,7 +48,10 @@ necessary to write software and compilers for Fleet. \pagebreak \section*{Introduction (from \cite{ies02})} -In \cite{ies02}, Sutherland writes: +During the past two and a half years, the Fleet architecture has gone +through massive changes and improvements. In spite of this, however, +the fundamental idea behind Fleet remains unchanged. As Sutherland +writes in the original Berkeley Fleet description \cite{ies02}, {\it When computers were new, logic and storage were expensive and wires @@ -194,16 +201,16 @@ the instruction. \bitheader[b]{0,10,11,17,18-26,36}\\ \bitbox[r]{12}{} \bitbox{11}{Instruction Path} - \bitbox{1}{K} - \bitbox{1}{L} + \bitbox{1}{1} \bitbox{1}{Ti} \bitbox{1}{Di} \bitbox{1}{Dc} \bitbox{1}{Do} \bitbox{1}{To} + \bitbox{1}{Ig} \bitbox{1}{Rq} + \bitbox{11}{Data/Token Path} \bitbox{7}{Count} - \bitbox{11}{Data/Token Path} \end{bytefield} } @@ -219,42 +226,39 @@ instruction. \bitbox[r]{1}{0} \bitbox{11}{Instruction Path} \bitbox{11}{Instruction Path} - \bitbox{1}{K} - \bitbox{1}{L} + \bitbox{1}{1} \bitbox{1}{Ti} \bitbox{1}{Di} \bitbox{1}{Dc} \bitbox{1}{Do} \bitbox{1}{To} + \bitbox{1}{Ig} \bitbox{1}{Rq} + \bitbox{11}{Data/Token Path} \bitbox{7}{Count} - \bitbox{11}{Data/Token Path} \end{bytefield} } - +\setlength{\bitwidth}{5mm} \pagebreak \section*{Instruction Formats} +\begin{wrapfigure}{R}{2in} +\epsfig{file=locations,width=2in} +\vspace{-0.4in} +\end{wrapfigure} +Within the pump's instruction fifo, there are two points of particular +interest. The first point is the {\it insertion point}, which is the +point in the ring where new instructions enter from the switch fabric. +The other point of interest is the {\it execution point}, which is the +point in the ring where instructions other than {\tt kill} and {\tt + unclog} are performed. + +The programmer may assume that the insertion point is the immediate +successor of the insertion point. -Instructions can be grouped into two categories: {\it killing} -instructions, which are acted upon as soon as they leave the -instruction horn, and {\it executing} instructions, which pass through -the instruction queue before being acted upon. - -Blank fields below are reserved for future use and must be set to -zero. - -Note that the arbiter is requested whenever {\it any of the first - three bits is {\tt 1}}. If the arbiter is not requested, - - - -\setlength{\bitwidth}{5mm} - -\subsection*{Killing Instructions} - -Kill (kill anything other than a Clog) +\vspace{0.3cm} +{\bf Kill} {\tt \begin{bytefield}{26} @@ -267,35 +271,51 @@ Kill (kill anything other than a Clog) \bitbox{14}{} \bitbox{7}{Count} \end{bytefield}} +When a {\tt kill} instruction reaches the insertion point, it will +wait there for a non-{\tt clog} instruction to reach the execution +point. When this occurs, the instruction at the execution point is +retired and the count field of the {\tt kill} instruction is +decremented. If the {\tt kill} instruction's count was {\tt 0} before +decrementing, the {\tt kill} instruction is retired. -UnClog (kill a Clog) +\vspace{0.3cm} +{\bf Clog} {\tt \begin{bytefield}{26} \bitheader[b]{0,20-25}\\ \bitbox{1}{0} \bitbox{1}{0} - \bitbox{1}{0} \bitbox{1}{1} \bitbox{1}{0} + \bitbox{1}{0} \bitbox{21}{} \end{bytefield}} +When a {\tt clog} instruction reaches the execution point, no more +instructions will be executed until an {\tt unclog} is performed. -\subsection*{Executing Instructions} -Clog +\vspace{0.3cm} +{\bf UnClog} {\tt \begin{bytefield}{26} \bitheader[b]{0,20-25}\\ \bitbox{1}{0} \bitbox{1}{0} - \bitbox{1}{1} \bitbox{1}{0} + \bitbox{1}{1} \bitbox{1}{0} \bitbox{21}{} \end{bytefield}} +When an {\tt unclog} instruction reaches the insertion point, it will wait +there until a {\tt clog} instruction is at the execution point. When +this occurs, both instructions retire. {\it Note:} this means +that if an {\tt unclog} instruction enters the instruction fifo +without a {\tt clog} ahead of it, the pump's insertion point will become +irretrievably jammed. -Literal (sign extended, implicit {\tt Rq=1}) +\vspace{0.3cm} +{\bf Literal} {\tt \begin{bytefield}{26} @@ -305,34 +325,18 @@ Literal (sign extended, implicit {\tt Rq=1}) \bitbox{17}{Literal} \bitbox{7}{Count} \end{bytefield}} - - -Normal - -{\tt -\begin{bytefield}{26} - \bitheader[b]{0,6,7,17-25}\\ - \bitbox{1}{1} - \bitbox{1}{Ti} - \bitbox{1}{Di} - \bitbox{1}{Dc} - \bitbox{1}{Do} - \bitbox{1}{To} - \bitbox{1}{Ig} - \bitbox{1}{Rq} - \bitbox{11}{Dest} - \bitbox{7}{Count} -\end{bytefield}} - - +When a literal instruction reaches the execution point, its {\tt + Literal} field is sign-extended to a full word length and captured +in the data latch. \pagebreak -\subsection*{Field Descriptions} +{\bf Normal Instruction (at Execution Point)} + {\tt \begin{bytefield}{26} \bitheader[b]{0,6,7,16-25}\\ - \bitbox{1}{0} + \bitbox{1}{1} \bitbox{1}{Ti} \bitbox{1}{Di} \bitbox{1}{Dc} @@ -396,13 +400,15 @@ Note how a ``standing'' instruction is encoded as {\tt Count=1111111} However, in the special case of an outbox, if {\tt Do=1,To=1}, then the {\it most significant} {\tt 11} bits of the value in the {\it - data register} are used as a destination address instead. \footnote{This - functionality eliminates the need for any sort of ``{\tt Execute}'' - ship, and lets a {\tt Fifo} ship act as an extension of the - instruction queue in the pump.} + data register} are used as a destination address instead. + %\footnote{This + %functionality eliminates the need for any sort of ``{\tt Execute}'' + %ship, and lets a {\tt Fifo} ship act as an extension of the + %instruction queue in the pump.} \end{itemize} + \pagebreak \section*{Fleet Assembly Language} @@ -413,6 +419,83 @@ syntax is given below: \verbatiminput{fleet.g} \pagebreak +\section*{FleetDoc} + +Inspired by JavaDoc, the Fleet software toolchain includes a tool +called {\tt FleetDoc}, which processes {\it ship description files} +(with the extension {\tt .ship}). These files contain sections for: + +\begin{itemize} +\item The name of the ship +\item A list of its ports +\item A prose description of the ship +\item A list of the constants associated with a ship +\item Java source code for a software simulation of the ship +\item Verilog source code for an FPGA simulation of the ship +\item A test case for the ship +\item A list of the contributors to all of the above +\end{itemize} + +Keeping all of this information in a single file ensures that changes +to one item (such as the list of ports) are propagated to the other +items (such as the verilog code for simulation purposes). + +Keeping this information in {\it machine-readable} format makes it +possible to automatically generate tools which depend on the precise +details of ship ports (such as the Fleet assembler) without the risk +inherent in manual synchronization of the spec with the +implementation. + +The latter half of this manual is generated automatically from the +contents of the {\tt .ship} files. + +\pagebreak +\subsection*{An Example FleetDoc File} +\begin{verbatim} +ship: BitFifo + +== Ports =========================================================== +in: in +in: inOp + constant lsbFirst: ....................................1 + constant msbFirst: ....................................0 + constant drop: .............................uuuuuu.. + constant take: .......................uuuuuu........ +... + +== TeX ============================================================== +The BitFifo internally stores some number of bits in a fifo structure. +Its capacity is guaranteed to be at least two full machine words or +more. +... + +== Fleeterpreter ==================================================== +public void service() { + if (box_inOp.dataReadyForShip() && box_in.dataReadyForShip()) { + ... + +== FPGA ============================================================== +always @(posedge clk) begin + if (!in_r && in_a) in_a <= 0; + if (!inOp_r && inOp_a) inOp_a <= 0; + ... + +== Test ============================================================== +#expect 1 +#expect 68719476736 +#ship debug : Debug +#ship bitfifo : BitFifo + +bitfifo.outOp: literal BitFifo.outOp[take=37]; [*] deliver; +... + +== Contributors ========================================================= +Amir Kamil +Adam Megacz +\end{verbatim} + + +\pagebreak \section*{Java APIs} \subsection*{Representing Code} \subsection*{Representing Ships} @@ -430,6 +513,8 @@ syntax is given below: \pagebreak \section*{Future Directions} +\subsection*{The Role of the Count Field} + Looking back on the design of the pump, several things are now apparent which were not initially. In particular, it seems that it could be useful to separate {\it loading the count register} from @@ -438,31 +523,97 @@ other types of instructions. This would have a few advantages: \begin{itemize} \item The size of the count field would not be a consideration in the ``instruction budget'' of normal execution instructions + \item It would be possible to have finitely-repeating, - infinitely-requeueing instructions (FIXME). + infinitely-requeueing instructions. Various implementation + details\footnote{in many implementations, the count field of an + instruction is destroyed as it counts down; another register and + mux would be needed to ``save'' this value so it could be + restored prior to being requeued} make it awkward to support + this unless the count register can be manipulated independently + of instruction execution. \end{itemize} -\begin{verbatim} -essence of the pump: - -ti -di +With this in mind, we can refactor the ``essence of the pump'' into +the following fields, which are basically orthogonal: -dc (data capture) -dl (data literal, take from instruction bits) +\begin{itemize} +\item wait for a token +\item wait for a datum +\item load the data latch from the awaited datum +\item load the data latch from a literal in the instruction +\item send a datum to a destination indicated by the instruction +\item send a datum to a destination indicated by the data latch value +\item send a token to a destination indicated by the instruction +\item send a token to a destination indicated by the data latch value +\item set the count register to a literal value +\item decrement the count register +\item requeue this instruction if {\tt count>0} +\item requeue this instruction unconditionally +\item repeat this instruction if {\tt count>0} +\item repeat this instruction unconditionally +\end{itemize} -di -doi (data output, destination taken from instruction) -dod (data output, destination taken from Data register) +The crucial change here is the decoupling of the act of {\it loading + the count register} from the act of {\it loading the next + instruction}. + +At this point, it boils down to a question of instruction bit +budgeting. Currently we have a separate instruction form for +literals, so that the literal can use bits which are not otherwise +relevant to literal instructions (for example, {\tt Di}). Adding an +additional instruction form for loading the count register would have +similar advantages. + + +\subsection*{For Just a Little Bit More...} + +Compared to early revisions of the Fleet architecture, the valves of +FleetTwo are substantially more powerful. One might ask, why not add +additional power to the valves, perhaps up to the point of making them +tiny von Neumann machines in their own right? Indeed, doing so would +risk reliving the {\it Wheel of Reincarnation} +\cite{WheelOfReincarnation}. + +I believe, however, that there is a fundamental distinction to be +made between devices whose control flow {\it can branch} and those +whose control flow {\it cannot branch}, and that this distinction is +founded in the geometry of VLSI, and indeed, Euclidean space. +Ultimately, the instruction stream for any component of a processor +must exist in some physical incarnation -- some geometric extent. If +the control flow of a program is linear, with no branching, then the +instruction stream has an optimal spatial mapping which is quite +efficient: it is simply laid out in FIFO fashion. + +By contrast, however, a program whose instruction stream has +unrestricted branching is logically a {\it tree} structure. Quite +unfortunately, general trees have no efficient mapping onto +two-dimensional surfaces. By this, I mean that in every possible +mapping, there exists some pair of {\it logically adjacent} +instructions whose {\it physical distance} is proportional to the size +of the entire program, rather than some constant factor irrespective +of the program size. Therefore I conclude that the spatial mapping of +a program with unrestricted branching is in an entirely different +class than that of programs with linear control flow. I choose this +distinction as the basis for limiting our valves to linear control +flow sequences. A similar argument can be made for looping constructs +in which two or more loops may be nested within a single parent loop. + +With this distinction in mind, it would not be objectionable for +future incarnations of Fleet to have {\it predicated} instructions at +the valves. This might take the form of: -set count -decrement count +\begin{itemize} +\item Instructions to unconditionally set some ``state flags'' +\item Instructions to conditionally set some ``state flags'' based on the contents of the data latch +\item Bits which cause an instruction to be ignored or retired based on the state flags +\end{itemize} -requeue if count>0 -requeue unconditionally -repeat if count>0 -repeat unconditionally +I feel that this addition would not risk falling down the slippery +slope -- instruction streams with predicated instructions (but not +unrestricted branching) preserve the crucial property that they have +an efficient spatial mapping. Any steps beyond this, however, cross a +fairly fundamental line. -\end{verbatim} diff --git a/doc/locations.svg b/doc/locations.svg new file mode 100644 index 0000000..1563a5e --- /dev/null +++ b/doc/locations.svg @@ -0,0 +1,271 @@ + + + + + + + + + + + + + + + + + + + image/svg+xml + + + + + + + + + + + + + + + + + + DataLatch + + + + + + + + Pump + + + Execution Point + InsertionPoint + + + + -- 1.7.10.4