1 \documentclass[10pt,oneside]{article}
9 \usepackage[colorlinks=true, pdfstartview=FitV, linkcolor=blue, citecolor=blue, urlcolor=blue]{hyperref}
10 \renewcommand{\ttdefault}{cmtt}
11 \title{The FleetTwo Architecture Manual\\{\normalsize A Programmer's View of Fleet}}
15 \begin{thebibliography}{[GDG01]}
16 \bibitem[IES02]{ies02}
18 \href{http://research.cs.berkeley.edu/class/fleet/docs/people/ivan.e.sutherland/ies02-FLEET-A.Once.Instruction.Computer.pdf}{\it UCB-IES02: Fleet -- A One Instruction Computer}
19 \bibitem[IES14]{ies14}
21 \href{http://research.cs.berkeley.edu/class/fleet/docs/people/ivan.e.sutherland/ies14-Fleet.Definition.pdf}{\it UCB-IES14: The Fleet Definition}
22 \bibitem[IES44]{ies44}
24 \href{http://research.cs.berkeley.edu/class/fleet/docs/people/ivan.e.sutherland/ies44-Fleet.Definition.pdf}{\it UCB-IES44: The Fleet Definition}
29 This manual attempts to detail all those aspects of Fleet which are
30 visible to a software programmer or compiler backend.
32 This document assumes that the reader is broadly familiar with the
33 Fleet architecture and the motivation behind it. For a far better
34 introduction to these topics, the reader is referred to \cite{ies02}
35 \cite{ies14} and \cite{ies44}. Please bear in mind that many of the
36 details in these documents have been changed, but they still serve as
37 the authoritative introduction to the architecture. After reviewing
38 these introductory papers, the reader is invited to digest this
39 architecture manual for the latest information on the details
40 necessary to write software and compilers for Fleet.
45 \section*{Introduction (from \cite{ies02})}
47 In \cite{ies02}, Sutherland writes:
50 When computers were new, logic and storage were expensive and wires
51 were relatively cheap. Early computer designers avoided using logic
52 wherever possible but were not greatly concerned with communication.
53 They made design choices consistent with the costs of the day. My
54 favorite example is the jump instruction. Early designers put jump
55 instructions at the end of each block of code to avoid the expense of
56 storing the address of the next block while executing the present one.
58 Today's chip fabrication methods invert the older cost balance. In
59 today's integrated circuits logic and memory are now almost free but
60 communication costs dominate chip area, energy consumption and delay.
61 In spite of these changes in the stuff from which we make computers,
62 vestiges of the past remain in many of today's common microprocessor
63 designs. For example, jump instructions still appear at the end of
64 each basic block of code in spite of the need to pre-fetch the next
65 block while executing this one. It seems better today to store a
66 pointer to the next block and the length of the current block early in
69 Instead of following the path of history, I'd like to listen carefully
70 to what modern chip structures have to teach about how one might build
71 a modern computer. I see three major lessons. First, simplicity can
72 reduce cost. Second, moving data will consume most of the time,
73 energy and chip area of a modern computer. And third, the low cost of
74 logic makes concurrency available if we can figure out how to use it.
78 \item {\bf Simplicity}: The Fleet architecture seeks simplicity by treating
79 all processing devices alike...
81 \item {\bf Communication}: The Fleet architecture seeks to control the
82 cost of communication by putting it under direct programmer
83 control. Thus Fleet avoids instructions like {\sc ADD} or {\sc STORE} that
84 include concealed communication to and from a register file...
86 \item {\bf Concurrency}: The Fleet architecture assumes concurrency
89 \item {\bf Asynchrony}: This provides enormous flexibility for
90 implementers to improve the performance or reduce the cost of
91 the processing devices in a Fleet system. -- Synchronous
92 implementations both of ships and of the switch fabric are
93 possible provided they use validity or occupancy bits to achieve
94 the arbitrary delays required at sources and destinations...
101 \section*{The Programmer's View of The Ship-Fabric Interface}
103 The role of the Fleet switch fabric is to carry {\it packets} from
104 {\it sources} to {\it destinations}.
106 A packet consists of a {\it path} and a {\it payload}. The path is
107 some string of bits which tells the switch fabric how to route from
108 the desired source to the desired destination. The distinction
109 between a {\it path} and a {\it destination} is extremely important!
110 The payload is either a {\it token} or a {\it data item}. If the
111 payload is a data item, it includes one machine word of data.
113 The diagram below represents a {\it programmer's} conceptual view of
114 the interface between ships and the switch fabric. Actual
115 implementation circuitry may differ substantially. Sources and
116 destinations which can send and recieve only tokens -- not data items
117 -- are drawn as dashed lines.
121 \epsfig{file=ports,width=4in}
125 The term {\it port} refers to each interface to the ship, the {\it
126 valve} connecting it to the switch fabric, and the corresponding
127 sources and destinations on the switch fabric.
129 Each valve consists of a {\it data latch}, which is as wide as a
130 single machine word and a {\it pump}, which is a circular fifo of
131 instruction-width latches. The contents of the instruction fifo
132 control the data latch, as future chapters will explain.
134 Note that the pump in each valve has a destination of its own, and
135 that there is no buffering fifo guarding this destination. Note that
136 all other destinations are guarded by a buffering fifo; the size of
137 this fifo is exposed to the software programmer so she can ensure that
138 deadlock does not occur.
142 \section*{Data Formats}
144 \subsection*{Packet Path (12 bits)}
146 These bits appear physically within the switch fabric, and have
147 ``address bit timing.'' The {\tt T} bit is the ``tokenhood'' bit; if
148 set, this packet represents a token and it does not cause the switch
149 fabric data latches to fire.
152 \begin{bytefield}{49}
153 \bitheader[b]{37,47,48}\\
160 \subsection*{Data Word In Memory (37 bits)}
162 A word of data is 37 bits wide.
165 \begin{bytefield}{49}
166 \bitheader[b]{0,36}\\
168 \bitbox{37}{Data Word}
172 \subsection*{Packet In Flight (49 bits)}
175 \begin{bytefield}{49}
176 \bitheader[b]{0,36,37,47,48}\\
178 \bitbox{11}{Packet Path}
179 \bitbox{37}{Data Word}
183 \subsection*{Instruction In Memory (37 bits)}
185 An instruction must be no wider than a memory word. The next section
186 explains the bits in greater detail. The {\tt Instruction Path} is
187 the path which the instruction itself will take in order to arrive at
188 the desired pump destination. The {\it Data/Token Path} is placed on
189 any packet inserted into the switch fabric as a result of executing
193 \begin{bytefield}{49}
194 \bitheader[b]{0,10,11,17,18-26,36}\\
196 \bitbox{11}{Instruction Path}
206 \bitbox{11}{Data/Token Path}
210 \subsection*{Instruction Packet In Flight (49 bits)}
212 Note that Fleet simply copies the {\tt Instruction Path} field, bit
213 for bit, into the packet {\tt Path} field in order to ``dispatch'' an
217 \begin{bytefield}{49}
218 \bitheader[b]{0,10,11,17,18-25,37,47,48}\\
220 \bitbox{11}{Instruction Path}
221 \bitbox{11}{Instruction Path}
231 \bitbox{11}{Data/Token Path}
238 \section*{Instruction Formats}
240 Instructions can be grouped into two categories: {\it killing}
241 instructions, which are acted upon as soon as they leave the
242 instruction horn, and {\it executing} instructions, which pass through
243 the instruction queue before being acted upon.
245 Blank fields below are reserved for future use and must be set to
248 Note that the arbiter is requested whenever {\it any of the first
249 three bits is {\tt 1}}. If the arbiter is not requested,
253 \setlength{\bitwidth}{5mm}
255 \subsection*{Killing Instructions}
257 Kill (kill anything other than a Clog)
260 \begin{bytefield}{26}
261 \bitheader[b]{0,6,7,20-25}\\
274 \begin{bytefield}{26}
275 \bitheader[b]{0,20-25}\\
284 \subsection*{Executing Instructions}
288 \begin{bytefield}{26}
289 \bitheader[b]{0,20-25}\\
298 Literal (sign extended, implicit {\tt Rq=1})
301 \begin{bytefield}{26}
302 \bitheader[b]{0,6,7,23-25}\\
313 \begin{bytefield}{26}
314 \bitheader[b]{0,6,7,17-25}\\
331 \subsection*{Field Descriptions}
333 \begin{bytefield}{26}
334 \bitheader[b]{0,6,7,16-25}\\
350 \item [\tt Ti] ({\bf Token Input}) wait for a token and accept
351 it\footnote{{\tt Ti}=1,{\tt Di}=1 is invalid on inbox.}
353 \item [\tt Di] ({\bf Data Input}) wait for a datum and accept it.
355 \item [\tt Dc] ({\bf Data Capture}) capture (latch) the accepted
356 datum. This bit is ignored if the incoming packet is
357 a token. \footnote{ Note that {\tt Di}=0,{\tt Dc}=1
358 is meaningless and therefore reserved for other
361 \item [\tt Do] ({\bf Data Output}) emit a datum.
363 \item [\tt To] ({\bf Token Output}) emit a token.\footnote{ {\tt To}=1,{\tt
364 Do}=1 have special meaning on an outbox.}
366 \item [\tt Ig] ({\bf Ignore {\tt To} Until Last Iteration}) ignore
367 the {\tt To} bit unless {\tt Count=0} \footnote{{\tt
368 To}=0,{\tt Ig}=1 is invalid}
370 \item [\tt Rq] ({\bf ReQueue}) if set, instructions having nonzero
371 count are ``Re-Queued'' rather than RePeated. See
372 {\tt Count} for more detail. \footnote{ An
373 instruction {\it in memory} may not have {\tt
374 Rq=1,Count=0} (use {\tt Rq=0,Count=0})}
376 \item [\tt Count] ({\bf Count}) {\it After} executing:
379 discard this instruction
381 if Count < MAX_COUNT {
385 put this instruction back into the instruction fifo
387 execute this instruction again
391 Note how a ``standing'' instruction is encoded as {\tt Count=1111111}
393 \item [\tt Dest] ({\bf Data/Token Destination})
394 Normally, this field is copied into the address portion of any
395 outgoing packet ({\tt Do} on an outbox or {\tt To}).
397 However, in the special case of an outbox, if {\tt Do=1,To=1}, then
398 the {\it most significant} {\tt 11} bits of the value in the {\it
399 data register} are used as a destination address instead. \footnote{This
400 functionality eliminates the need for any sort of ``{\tt Execute}''
401 ship, and lets a {\tt Fifo} ship act as an extension of the
402 instruction queue in the pump.}
407 \section*{Fleet Assembly Language}
409 The Fleet Assembly language is designed to be a human-readable
410 depiction of the bits which comprise a Fleet program. The formal
411 syntax is given below:
413 \verbatiminput{fleet.g}
417 \subsection*{Representing Code}
418 \subsection*{Representing Ships}
424 - sequencing guarantees
426 - behavior when token arrives at data port, vice versa
427 - overlapping codebags
431 \section*{Future Directions}
433 Looking back on the design of the pump, several things are now
434 apparent which were not initially. In particular, it seems that it
435 could be useful to separate {\it loading the count register} from
436 other types of instructions. This would have a few advantages:
439 \item The size of the count field would not be a consideration in the
440 ``instruction budget'' of normal execution instructions
441 \item It would be possible to have finitely-repeating,
442 infinitely-requeueing instructions (FIXME).
452 dl (data literal, take from instruction bits)
455 doi (data output, destination taken from instruction)
456 dod (data output, destination taken from Data register)
462 requeue unconditionally
464 repeat unconditionally