am33.tex

   1 \documentclass[10pt]{article}
   2 \usepackage{palatino}
   3 \usepackage{amsmath}
   4 \usepackage{epsfig}
   5 \usepackage{color}
   6 \usepackage{bytefield1}
   7 \usepackage{wrapfig}
   8 \usepackage{stmaryrd}
   9 \usepackage{subfigure}
  10 \usepackage{syntax}
  11 \usepackage{comment}
  12 \usepackage{fancyhdr}
  13 \usepackage{lastpage}
  14 \usepackage{multirow}
  15 \usepackage{multicol}
  16 \usepackage{rotating}
  17 \include{megacz}
  18 \bibliographystyle{alpha}
  19 \pagestyle{fancyplain}
  20
  21 \definecolor{light}{gray}{0.7}
  22
  23 \newcommand{\footnoteremember}[2]{
  24   \footnote{#2}
  25   \newcounter{#1}
  26   \setcounter{#1}{\value{footnote}}
  27 } \newcommand{\footnoterecall}[1]{
  28   \footnotemark[\value{#1}]
  29 }
  30
  31 %\pdfpagewidth 8.5in
  32 %\pdfpageheight 11in
  33 %\topmargin 0in
  34 %\textheight 7.5in
  35 %\textwidth 6.0in
  36 %\oddsidemargin 0.25in
  37 %\evensidemargin 0.25in
  38 %\headwidth 6.0in
  39 \def\to{\ $\rightarrow$\ }
  40
  41 \def\docnum{AM33}
  42
  43 \author{
  44 \normalsize{
  45 \begin{tabular}{c}
  46 \end{tabular}}
  47 }
  48
  49 \title{\vspace{-1cm}The FleetTwo Dock}
  50
  51 \begin{document}
  52
  53 \maketitle
  54
  55 \begin{abstract}
  56 Changes:
  57
  58 \begin{tabular}{rl}
  59 \color{red}
  60 ??-Apr
  61 \color{black}
  62 & noted that {\tt setFlags} can be used as {\tt nop} \\
  63 29-Mar
  64 & removed the {\tt L} flag (epilogues can now do this) \\
  65 & removed {\tt take\{Inner|Outer\}LoopCounter} instructions \\
  66 & renamed {\tt data} instruction to {\tt literal} \\
  67 & renamed {\tt send} instruction to {\tt move} \\
  68 23-Mar
  69 & added ``if its predicate is true'' to repeat count \\
  70 & added note that red wires do not contact ships \\
  71 & changed name of {\tt flags} instruction to {\tt setFlags} \\
  72 & removed black dot from diagrams \\
  73 & changed {\tt OL} (Outer Loop participant) to {\tt OS} (One Shot) and inverted polarity \\
  74 & indicated that the death of the {\tt tail} instruction is what causes the hatch to be unsealed \\
  75 & indicated that only {\tt send} instructions which wait for data are torpedoable \\
  76 & added section ``Torpedo Details'' \\
  77 & removed {\tt torpedo} instruction \\
  78 12-Mar
  79 \color{black}
  80 & renamed loop+repeat to outer+inner (not in red) \\
  81 & renamed {\tt Z} flag to {\tt L} flag (not in red) \\
  82 & rewrote ``inner and outer loops'' section \\
  83 & updated all diagrams \\
  84 \color{black}
  85 7-Mar
  86 & Moved address bits to the LSB-side of a 37-bit instruction \\
  87 & Added {\it micro-instruction} and {\it composite instruction} terms \\
  88 & Removed the {\tt DL} field, added {\tt decrement} mode to {\tt loop} \\
  89 & Created the {\tt Hold} field \\
  90 & Changed how ReLooping works \\
  91 & Removed {\tt clog}, {\tt unclog}, {\tt interrupt}, and {\tt massacre} \\
  92 \end{tabular}
  93 \end{abstract}
  94
  95 \vfill
  96
  97 \begin{center}
  98 \epsfig{file=overview,width=1.5in}
  99 \epsfig{file=indock,width=3in}
 100 \end{center}
 101
 102 \pagebreak
 103
 104 \section{Overview of Fleet}
 105
 106 A Fleet processor consists of a {\it switch fabric} with several
 107 functional units called {\it ships} connected to it.  At each
 108 connection between a ship and the switch fabric lies a programmable
 109 element known as the {\it dock}.
 110
 111 A {\it path} specifies a route through the switch fabric from a
 112 particular {\it source} to a particular {\it destination}.  The
 113 combination of a path and a single word {\it payload} is called a {\it packet}.  The
 114 switch fabric carries packets from their sources to their
 115 destinations.  Each dock has two destinations: one for {\it
 116   instructions} and one for {\it data}.  A Fleet is programmed by
 117 depositing packets into the switch fabric; these packets' paths lead
 118 them to the instruction destinations of the docks.
 119
 120 When a packet arrives at the instruction destination of a dock, it is
 121 enqueued for execution.  Before the instruction executes, it may cause
 122 the dock to wait for a packet to arrive at the dock's data destination
 123 or for a value to be presented by the ship.  It may present a data
 124 value to the ship or transmit it for transmission to some other
 125 destination.
 126
 127 When an instruction sends a packet into the switch fabric, it may
 128 specify that the payload of the packet is irrelevant.  Such packets
 129 are known as {\it tokens}, and consume less energy than data packets.
 130 From a programmer's perspective, a token packet is indistinguishable
 131 from a data packet with a unknown payload.
 132
 133 In the diagram below, the red wires carry instructions and the blue
 134 wires carry data; the switch fabric (gray area) carries both.  Notice
 135 that the red (instruction) wires do not contact the ships.  This is an
 136 advantage: ships are designed without any consideration for the
 137 instructions used to program their docks.
 138
 139 \begin{center}
 140 \epsfig{file=overview,width=2.5in}\\
 141 {\it Overview of a Fleet processor; gray shading represents a
 142   packet-switched network fabric; blue lines carry data, red lines
 143   carry instructions.}
 144 \end{center}
 145 \color{black}
 146
 147 \pagebreak
 148
 149 \section{The FleetTwo Pump}
 150
 151 The diagram below represents a {\it programmer's} conceptual view of
 152 the interface between ships and the switch fabric.  Actual
 153 implementation circuitry may differ substantially.  Sources and
 154 destinations that can send and receive only tokens -- not data items
 155 -- are drawn as dashed lines.
 156
 157 \begin{center}
 158 \epsfig{file=indock,width=3.5in}\\
 159 {\it an ``input'' dock}
 160
 161 \epsfig{file=outdock,width=3.5in}\\
 162 {\it an ``output'' dock}
 163 \end{center}
 164
 165 The term {\it port} refers to an interface to the ship, the {\it
 166   dock} connecting it to the switch fabric, and the corresponding
 167 sources and destinations on the switch fabric.
 168
 169 Each dock consists of a {\it data latch}, which is as wide as a single
 170 machine word and a {\it pump}, which is a circular fifo of
 171 instruction-width latches.  The values in the pump control the data
 172 latch.
 173
 174 Note that the pump in each dock has a destination of its own; this is
 175 the {\it instruction destination} mentioned in the previous section.
 176 Note that unlike all other destinations, there is no buffering fifo
 177 guarding this one.  The size of these fifos are exposed to the
 178 software programmer so he can avoid deadlock.
 179
 180 \pagebreak
 181 \section{Instructions}
 182
 183 In order to cause an instruction to execute, the programmer must first
 184 cause that instruction word to arrive in the data latch of some output
 185 dock.  For example, this might be the ``data read'' output dock of the
 186 memory access ship or the output of a fifo ship.  Once an instruction
 187 has arrived at this output dock, it is {\it dispatched} by sending it
 188 to the {\it instruction port} of the dock at which it is to execute.
 189
 190 Each instruction is 26 bits long, which makes it possible for an
 191 instruction and an 11-bit path to fit in a single word of memory.
 192 This path is the path from the {\it dispatching} dock to the {\it
 193   executing} dock.
 194
 195 \setlength{\bitwidth}{3.5mm}
 196 {\tt \footnotesize
 197 \begin{bytefield}{37}
 198   \bitheader[b]{0,10,11,36}\\
 199   \bitbox{26}{instruction}
 200   \bitbox{11}{dispatch path}
 201 \end{bytefield}}
 202
 203 {\bf Note:} the instruction encodings below are simply ``something to
 204 shoot at'' and a sanity check to make sure we haven't overrun our bit
 205 budget.  The final instruction encodings will probably be
 206 different.
 207
 208 All instruction words have the following format:
 209
 210 \setlength{\bitwidth}{3.5mm}
 211 {\tt \footnotesize
 212 \begin{bytefield}{37}
 213   \bitheader[b]{0,10,11,36}\\
 214 \color{black}
 215   \bitbox{1}{I}
 216   \bitbox{1}{OS}
 217   \bitbox{2}{P}
 218 \color{light}
 219   \bitbox[tbr]{22}{}
 220   \bitbox{11}{dispatch path}
 221 \color{black}
 222 \end{bytefield}}
 223
 224 Each instruction word is called a {\it micro instruction}.
 225 Collections of one or more micro instruction are known as {\it
 226   composite instructions}.
 227
 228 The {\tt I} bit stands for {\tt Interruptible}.  The {\tt OS} (``One
 229 Shot'') bit indicates whether or not this instruction is part of an
 230 outer loop.  Both of the preceding bits are explained in the next
 231 section.
 232
 233 \color{black}
 234
 235 The abbreviation {\tt P} stands for {\it predicate}; this is a two-bit
 236 code that indicates if the instruction should be executed or ignored.
 237
 238
 239
 240 \pagebreak
 241 \subsection{Life Cycle of an Instruction}
 242
 243 The diagram below shows an input dock for purposes of illustration
 244 (behavior at an output dock is identical).
 245
 246 \begin{center}
 247 \epsfig{file=indock,width=3in}\\
 248 {\it an input dock}
 249 \end{center}
 250
 251 Note the circle on the path between ``instr horn'' and ``instr fifo'';
 252 this is known as ``the hatch''.  The hatch has two states: sealed and
 253 unsealed.  When the machine powers up, the hatch is unsealed; it is
 254 sealed by the {\tt tail} instruction and unsealed whenever the outer
 255 loop counter is set to zero (for any reason\footnote{this
 256   includes {\tt OLC} being decremented to zero, a {\tt setOuter} with
 257   a literal field of zero, a {\tt setOuter} which copies a zero from
 258   the data register to {\tt OLC}, or the occurrence of a
 259   torpedo}).
 260
 261 When an instruction arrives at the instruction horn, it waits there
 262 until the hatch is in the unsealed state.  The instruction then enters
 263 the instruction fifo.  When an instruction emerges from the
 264 instruction fifo, it arrives at the ``on deck'' stage, where it may
 265 execute.
 266
 267 \subsubsection{Inner and Outer Loops}
 268
 269 A programmer can perform two types of loops: {\it inner} loops of only
 270 one micro-instruction and {\it outer} loops of multiple
 271 micro-instructions.  Inner loops may be nested within an outer loop,
 272 but no other nesting of loops is allowed.  The paths used by inner
 273 loops and outer loops are shown below:
 274
 275 \begin{center}
 276 \begin{minipage}{2in}
 277 \begin{center}
 278 \epsfig{file=inner-loop,width=2in}\\
 279 {\it inner loop (in red)}
 280 \end{center}
 281 \end{minipage}
 282 \begin{minipage}{2in}
 283 \begin{center}
 284 \epsfig{file=outer-loop,width=2in}\\
 285 {\it outer loop (in red)}
 286 \end{center}
 287 \end{minipage}
 288 \end{center}
 289
 290 Each type of loop has a counter associated with it: the {\tt ILC}
 291 counter for inner loops and the {\tt OLC} counter for outer loops.
 292 The inner loop counter applies only to certain ``inner-looping''
 293 instructions (see the table below for details).  When such an
 294 instruction reaches On Deck, if its predicate is true it will execute
 295 a number of times equal to {\tt ILC+1}, and leave {\tt ILC=0} after
 296 executing.  Non-inner-looping instructions and instructions whose
 297 predicate is false do not decrement {\tt ILC}.
 298
 299 The outer loop counter applies to all instructions {\it except} the
 300 instruction {\tt setOuter} with {\tt OS=1}, because such instructions
 301 are needed to reset the outer loop counter after it becomes zero.
 302 However, predicated {\tt setOuter} with {\tt OS=0} is useful for
 303 resetting the loop counter in the middle of the execution of a loop.
 304
 305 \subsubsection{On Deck}
 306
 307 The table below lists the actions which may be taken when an
 308 instruction arrives on deck:
 309
 310 \begin{center}
 311 \def\side#1{\begin{sideways}\parbox{15mm}{#1}\end{sideways}}
 312 \begin{tabular}{|r|ccccc|cccccc|}\hline
 313 %&\multicolumn{10}{c}{Predicate}&\\
 314 %&\multicolumn{10}{c}{True}&\\\hline
 315 &\multicolumn{5}{c}{Outer-Looping} &\multicolumn{5}{c}{One-Shot}&\\
 316 &\multicolumn{5}{c}{{\tt (OS=0)}} &\multicolumn{5}{c}{{\tt (OS=1)}}&\\
 317 &\side{{\tt move}}
 318 &\side{{\tt literal}}
 319 &\side{{\tt setFlags}}
 320 &\side{{\tt setInner}}
 321 &\side{{\tt setOuter}}
 322 &\side{{\tt move}}
 323 &\side{{\tt literal}}
 324 &\side{{\tt setFlags}}
 325 &\side{{\tt setInner}}
 326 &\side{{\tt setOuter}}
 327 &
 328 \\\hline
 329 Wait for hatch sealed         & +        & + & + & + & +  &          &   &   &   &   &  \\
 330 Fill IF0 w/ copy of self      & +        & + & + & + & +  &          &   &   &   &   &  \\\hline
 331 Request arbiter               & P+$\star$ &   &   &   &    & P+$\star$ &   &   &   &   &  \\
 332 Potentially torpedoable       & P+$\star$ &   &   &   &    & P+$\star$ &   &   &   &   &  \\\hline
 333 Execute                       & P+       & P+& P+& P+& P+ & ?        & ? & ? & ? & P &  \\
 334 Inner-looping                 & P+       &   &   &   & ?  & P+       &   &   &   & ? &  \\
 335 \hline
 336 \end{tabular}
 337
 338 \begin{tabular}{|r|l|}\hline
 339 +       & Only if {\tt OLC>0} (ie {\tt OLC} is positive) \\
 340 P       & Only if predicate is true \\
 341 P+      & Only if predicate is true and {\tt OLC>0} \\
 342 P+$\star$ & Only if predicate is true and {\tt OLC>0} and {\tt I=1} and one of {\tt Ti},{\tt Di},{\tt Do} true. \\
 343 ?       & to discuss \\\hline
 344 \end{tabular}
 345 \end{center}
 346
 347 \subsubsection{Torpedo}
 348
 349 There is a small fifo (not shown) before the latch marked
 350 ``Instruction Horn''; after the {\tt tail} instruction seals the
 351 hatch, any subsequent instructions will queue up in this fifo until
 352 the hatch is unsealed.  This is typically used as storage for a ``loop
 353 epilogue'' -- a sequence of instructions to be executed after a
 354 torpedo arrives or the outer loop counter expires.
 355
 356 Each dock has a fourth connection to the switch fabric (not shown),
 357 called its {\it torpedo destination}.  Anything (even a token) sent to
 358 this destination is treated as a torpedo.  Note that because this is a
 359 distinct destination, instructions or data queued up in the other
 360 destination fifos will not prevent a torpedo from occuring.
 361
 362 When a data item or token arrives at the torpedo destination, it lies
 363 there in wait until On Deck holds a potentially torpedoable
 364 instruction (see previous table).  Once this is the case, the torpedo
 365 causes the inner and outer loop counters to be set to zero (and
 366 therefore also unseals the hatch).\footnote{it is unspecified whether
 367   the torpedoed instruction is requeued or not; this may or may not
 368   occur, nondeterministically.  It is the programmer's responsibility
 369   to ensure that the program behaves the same whether this happens or
 370   not.  We think that this will not matter in most situations.}
 371
 372 \color{black}
 373
 374
 375 \subsection{Flags}
 376
 377 The pump has three flags: {\tt A}, {\tt B}, and {\tt S}.
 378
 379 \begin{itemize}
 380 \item The {\tt A} and {\tt B} flags are general-purpose flags which
 381       may be set and cleared by the programmer.
 382
 383 %\item
 384 %
 385 % The {\tt L} flag, known as the {\it last} flag, is set whenever
 386 %      the value in the outer counter ({\tt OLC}) is one,
 387 \color{black}
 388 % indicating
 389 %      that the dock is in the midst of the last iteration of an
 390 %      outer loop.  This flag can be used to perform certain
 391 %      operations (such as sending a completion token) only on the last
 392 %      iteration of an outer loop.
 393
 394 \item The {\tt S} flag, known as the {\it summary} flag.  Its value is
 395       determined by the ship, but unless stated otherwise, it should
 396       be assumed that whenever the 37th bit of the data ({\tt D})
 397       latch is loaded, that same bit is also loaded into the {\tt S}
 398       flag.  This lets the ship make decisions based on whether or not
 399       the top bit of the data latch is set; if two's complement
 400       numbers are in use, this will indicate whether or not the
 401       latched value is negative.
 402 \end{itemize}
 403
 404 Many instruction fields are specified as two-bit {\it predicates}.
 405 These fields contain one of four values, indicating if an action
 406 should be taken unconditionally or conditionally on one of the {\tt A}
 407 or {\tt B} flags:
 408
 409 \begin{itemize}
 410 \item {\tt 00:} if {\tt A} is set
 411 \item {\tt 10:} if {\tt B} is set
 412 \item {\tt 01:} TBD
 413 \item {\tt 11:} always
 414 \end{itemize}
 415
 416
 417 \pagebreak
 418 \section{Instructions}
 419
 420 Here is a list of the instructions supported by the dock:
 421
 422 \begin{center}
 423 \begin{tabular}{|l|}\hline
 424 {\tt move} (variants: {\tt moveto}, {\tt dispatch}) \\
 425 {\tt literal} (variants: {\tt literalhi}, {\tt literallo})\\
 426 {\tt setFlags} \\
 427 {\tt setInner} \\
 428 {\tt setOuter} \\
 429 %{\tt torpedo} \\
 430 {\tt tail} \\\hline
 431 \end{tabular}
 432 \end{center}
 433
 434 {\tt tail} {\it will probably become a bit on every instruction rather than
 435   its own instruction}
 436
 437 \color{black}
 438
 439
 440 \subsection{{\tt move} (variants: {\tt moveto}, {\tt dispatch})}
 441
 442 \setlength{\bitwidth}{5mm}
 443 {\tt
 444 \begin{bytefield}{26}
 445   \bitheader[b]{12-16,19,21}\\
 446 \color{light}
 447   \bitbox{1}{I}
 448   \bitbox{1}{OS}
 449   \bitbox{2}{P}
 450 \color{black}
 451    \bitbox{3}{001}
 452 \color{light}
 453   \bitbox[trb]{2}{}
 454 \color{black}
 455   \bitbox{1}{\tt Ti}
 456   \bitbox{1}{\tt Di}
 457   \bitbox{1}{\tt Dc}
 458   \bitbox{1}{\tt Do}
 459   \bitbox{1}{\tt To}
 460   \bitbox[l]{17}{}
 461 \end{bytefield}}
 462
 463 %\begin{bytefield}{26}
 464 %  \bitheader[b]{12-18}\\
 465 %  \bitbox[]{8}{\raggedleft Input Dock:}
 466 %  \bitbox[r]{2}{}
 467 %  \bitbox{1}{\tt So}
 468 %  \bitbox{1}{\tt Dc}
 469 %  \bitbox[l]{15}{}
 470 %\end{bytefield}
 471 %
 472 %\begin{bytefield}{26}
 473 %  \bitheader[b]{12-18}\\
 474 %  \bitbox[]{8}{\raggedleft Output Dock:}
 475 %  \bitbox[r]{2}{}
 476 %  \bitbox{1}{\tt Si}
 477 %  \bitbox{1}{\tt To}
 478 %  \bitbox[l]{15}{}
 479 %\end{bytefield}
 480
 481 \begin{bytefield}{26}
 482   \bitheader[b]{0,10,11}\\
 483   \bitbox[1]{13}{\raggedleft {\tt moveto} ({\tt LiteralPath\to Path})}
 484   \bitbox[r]{1}{}
 485   \bitbox{1}{\tt 1}
 486   \bitbox{11}{\tt LiteralPath}
 487 \end{bytefield}
 488
 489 \begin{bytefield}{26}
 490   \bitheader[b]{10,11}\\
 491   \bitbox[1]{13}{\raggedleft {\tt dispatch} ({\tt DP[37:27]\to Path})\ \ }
 492   \bitbox[r]{1}{}
 493   \bitbox{1}{\tt 0}
 494   \bitbox{1}{\tt 1}
 495 \color{light}
 496   \bitbox[trb]{10}{}
 497 \color{black}
 498 \end{bytefield}
 499
 500 \begin{bytefield}{26}
 501   \bitheader[b]{10,11}\\
 502   \bitbox[1]{13}{\raggedleft {\tt move} ({\tt Path} unchanged):}
 503   \bitbox[r]{1}{}
 504   \bitbox{1}{\tt 0}
 505   \bitbox{1}{\tt 0}
 506 \color{light}
 507   \bitbox[trb]{10}{}
 508 \color{black}
 509 \end{bytefield}
 510
 511 \begin{itemize}
 512 \item {\tt Ti} - Token Input: wait for the token predecessor to be full and drain it.
 513 \item {\tt Di} - Data Input: wait for the data predecessor to be full and drain it.
 514 \item {\tt Dc} - Data Capture: pulse the data latch.
 515 \item {\tt Do} - Data Output: fill the data successor.
 516 \item {\tt To} - Token Output: fill the token successor.
 517 \end{itemize}
 518
 519 The data successor and token successor must both be empty in order for
 520 a {\tt move} instruction to attempt execution.
 521
 522 The inner loop counter can hold a number {\tt 0..MAX} or a special
 523 value $\infty$.  If {\tt ILC} is nonzero after execution of a {\tt
 524   move} instruction, the instruction will execute again, and {\tt ILC}
 525 will be latched with {\tt (ILC==$\infty$?$\infty$:max(ILC-1, 0))}.  When
 526 the inner loop counter reaches zero, the instruction ceases executing.
 527
 528
 529 \pagebreak
 530 \subsection{{\tt literal}, {\tt literalhi}, {\tt literallo}}
 531
 532 These instructions load part or all of the data latch ({\tt D}).
 533
 534 {\tt literalhi: Literal[18:1]\to D[37:20]} (and {\tt Literal[18]\to S})
 535
 536 \setlength{\bitwidth}{5mm}
 537 {\tt
 538 \begin{bytefield}{26}
 539   \bitheader[b]{0,18,19,21}\\
 540 \color{light}
 541   \bitbox{1}{I}
 542   \bitbox{1}{OS}
 543   \bitbox{2}{P}
 544 \color{black}
 545   \bitbox{1}{0}
 546   \bitbox{2}{11}
 547 \color{light}
 548   \bitbox[trb]{1}{}
 549 \color{black}
 550   \bitbox{18}{Literal}
 551 \end{bytefield}}
 552
 553 {\tt literallo: Literal[19:1]\to D[19:1]}
 554
 555 \setlength{\bitwidth}{5mm}
 556 {\tt
 557 \begin{bytefield}{26}
 558   \bitheader[b]{0,18,19,21}\\
 559 \color{light}
 560   \bitbox{1}{I}
 561   \bitbox{1}{OS}
 562   \bitbox{2}{P}
 563 \color{black}
 564   \bitbox{1}{0}
 565   \bitbox{2}{10}
 566   \bitbox{19}{Literal}
 567 \end{bytefield}}
 568
 569 {\tt literal:}
 570
 571 \setlength{\bitwidth}{5mm}
 572 {\tt
 573 \begin{bytefield}{26}
 574   \bitheader[b]{0,18,19,21}\\
 575 \color{light}
 576   \bitbox{1}{I}
 577   \bitbox{1}{OS}
 578   \bitbox{2}{P}
 579 \color{black}
 580   \bitbox{1}{1}
 581   \bitbox{2}{SEL}
 582   \bitbox{19}{Literal}
 583 \end{bytefield}}
 584
 585 {\tt
 586 \begin{tabular}{|r|c|c|c|}\hline
 587 sel  & D[37:20]      & D[19:1]       \\\hline
 588 00  & Literal[18:1] & all 0         \\
 589 01  & Literal[18:1] & all 1         \\
 590 10  & all 0         & Literal[19:1] \\
 591 11  & all 1         & Literal[19:1] \\
 592 \hline
 593 \end{tabular}}
 594
 595
 596
 597
 598 \subsection{{\tt setFlags}}
 599
 600 \setlength{\bitwidth}{5mm}
 601 {\tt
 602 \begin{bytefield}{26}
 603   \bitheader[b]{0,7,8,15,16-19,21}\\
 604 \color{light}
 605   \bitbox{1}{I}
 606   \bitbox{1}{OS}
 607   \bitbox{2}{P}
 608 \color{black}
 609   \bitbox{3}{000}
 610   \bitbox{1}{0}
 611   \bitbox{6}{nextA}
 612   \bitbox{6}{nextB}
 613   \bitbox{6}{nextS}
 614 \end{bytefield}}
 615
 616 The {\tt P} field is a predicate; if it does not hold, the instruction
 617 is ignored.  Otherwise the flags are updated according to the {\tt
 618   nextA}, {\tt nextB}, and {\tt nextS} fields; each specifies the new
 619 value as the logical {\tt OR} of zero or more inputs:
 620
 621 \begin{center}
 622 {\tt
 623 \begin{bytefield}{6}
 624   \bitheader[b]{0-5}\\
 625   \bitbox{1}{${\text{\tt A}}$}
 626   \bitbox{1}{$\overline{\text{\tt A}}$}
 627   \bitbox{1}{${\text{\tt B}}$}
 628   \bitbox{1}{$\overline{\text{\tt B}}$}
 629   \bitbox{1}{${\text{\tt S}}$}
 630   \bitbox{1}{$\overline{\text{\tt S}}$}
 631 \end{bytefield}}
 632 \end{center}
 633
 634 Each bit corresponds to one possible input; all inputs whose bits are
 635 set are {\tt OR}ed together, and the resulting value is assigned to
 636 the flag.  Note that if none of the bits are set, the value assigned
 637 is zero.  Note also that it is possible to produce a {\tt 1} by {\tt
 638   OR}ing any flag with its complement.
 639 \color{red}
 640 Note that {\tt setFlags} can be used to create a {\tt nop} (no-op) by
 641 setting each flag to itself.
 642 \color{black}
 643
 644
 645 \pagebreak
 646
 647 \subsection{{\tt setInner}}
 648
 649 This instruction loads the inner loop counter with either a literal
 650 number, the special value $\infty$, or the contents of the {\tt data}
 651 register.
 652
 653 \setlength{\bitwidth}{5mm}
 654 {\tt
 655 \begin{bytefield}{26}
 656   \bitheader[b]{16-19,21}\\
 657 \color{light}
 658   \bitbox{1}{I}
 659   \bitbox{1}{OS}
 660   \bitbox{2}{P}
 661 \color{black}
 662   \bitbox{3}{000}
 663   \bitbox{1}{1}
 664   \bitbox{2}{01}
 665 \color{light}
 666   \bitbox[tbr]{8}{}
 667   \bitbox[l]{8}{}
 668 \color{black}
 669 \end{bytefield}}\\
 670
 671 \begin{bytefield}{26}
 672   \bitbox[r]{18}{\raggedleft from data latch:\hspace{0.2cm}\ }
 673   \bitbox{2}{\tt 00}
 674 \color{light}
 675   \bitbox[tbr]{6}{}
 676 \color{black}
 677 \end{bytefield}
 678
 679 \begin{bytefield}{26}
 680   \bitheader[b]{0,5,6,7}\\
 681   \bitbox[r]{18}{\raggedleft from literal:\hspace{0.2cm}\ }
 682   \bitbox{2}{\tt 10}
 683   \bitbox{6}{\tt Literal}
 684 \end{bytefield}
 685
 686 \begin{bytefield}{26}
 687   \bitheader[b]{0,5,6,7}\\
 688   \bitbox[r]{18}{\raggedleft with $\infty$\ \ }
 689   \bitbox{2}{\tt 11}
 690 \color{light}
 691   \bitbox[tbr]{6}{}
 692 \color{black}
 693 \end{bytefield}
 694
 695
 696 \subsection{{\tt setOuter}}
 697
 698 This instruction loads the outer loop counter {\tt OLC} with either
 699 {\tt max(0,OLC-1)}, a literal or the contents of the {\tt data}
 700 register.
 701
 702 \setlength{\bitwidth}{5mm}
 703 {\tt
 704 \begin{bytefield}{26}
 705   \bitheader[b]{16-19,21,24}\\
 706 \color{light}
 707   \bitbox{1}{I}
 708   \bitbox{1}{OS}
 709 \color{light}
 710   \bitbox[tbr]{2}{P}
 711 \color{black}
 712   \bitbox{3}{000}
 713   \bitbox{1}{1}
 714   \bitbox{2}{10}
 715 \color{light}
 716   \bitbox[tbr]{9}{}
 717   \bitbox[l]{7}{}
 718 \color{black}
 719 \end{bytefield}}\\
 720
 721 \begin{bytefield}{26}
 722   \bitbox[r]{19}{\raggedleft {\tt max(0,OLC-1)}:\hspace{0.2cm}\ }
 723   \bitbox{2}{\tt 00}
 724 %\color{light}
 725   \bitbox[tbr]{5}{}
 726 %\color{black}
 727 \color{black}
 728 \end{bytefield}
 729
 730 \begin{bytefield}{26}
 731   \bitbox[r]{19}{\raggedleft from data latch:\hspace{0.2cm}\ }
 732   \bitbox{2}{\tt 01}
 733 \color{light}
 734   \bitbox[tbr]{5}{}
 735 \color{black}
 736 \end{bytefield}
 737
 738 \begin{bytefield}{26}
 739   \bitheader[b]{0,5,6}\\
 740   \bitbox[r]{19}{\raggedleft from literal:\hspace{0.2cm}\ }
 741   \bitbox{1}{\tt 1}
 742   \bitbox{6}{\tt Literal}
 743 \end{bytefield}
 744
 745
 746 %\subsection{{\tt torpedo}}
 747 %
 748 %\setlength{\bitwidth}{5mm}
 749 %{\tt
 750 %\begin{bytefield}{26}
 751 %  \bitheader[b]{0,5,16-19,21}\\
 752 %\color{light}
 753 %  \bitbox{4}{}
 754 %\color{black}
 755 %  \bitbox{3}{000}
 756 %  \bitbox{1}{1}
 757 %  \bitbox{2}{00}
 758 %\color{light}
 759 %  \bitbox[tbr]{16}{}
 760 %\end{bytefield}}
 761 %
 762 %
 763 %When a {\tt torpedo} instruction reaches the instruction horn, it will
 764 %wait there until an instruction is on deck whose {\tt A}rmor bit is
 765 %not set.  The {\tt torpedo} will then cause ``Process \#2'' of the on
 766 %deck instruction to terminate and will set the outer loop counter to zero.
 767
 768 \subsection{{\tt tail}}
 769
 770 {\it This will probably become a bit on every instruction rather than
 771   its own instruction.  The only problem is that we have run out of bits in the {\tt literal} instruction.  Two possible solutions: (a) declare that {\tt literal} cannot be the last instruction in a loop or (b) because {\tt literal} instructions cannot be torpedoed anyways, re-use its {\tt I} bit for this purpose.}
 772
 773 \setlength{\bitwidth}{5mm}
 774 {\tt
 775 \begin{bytefield}{26}
 776   \bitheader[b]{0,5,16-19,21}\\
 777 \color{light}
 778   \bitbox{4}{}
 779 \color{black}
 780   \bitbox{3}{000}
 781   \bitbox{1}{1}
 782   \bitbox{2}{01}
 783 \color{light}
 784   \bitbox[tbr]{16}{}
 785 \end{bytefield}}
 786
 787 When a {\tt tail} instruction reaches {\tt IH}, it seals the hatch.
 788 The {\tt tail} instruction does not enter the instruction fifo.
 789
 790 \color{black}
 791 %\pagebreak
 792 %\subsection{{\tt takeOuterLoopCounter}}
 793 %
 794 %\setlength{\bitwidth}{5mm}
 795 %{\tt
 796 %\begin{bytefield}{26}
 797 %  \bitheader[b]{16-19,21}\\
 798 %\color{light}
 799 %  \bitbox{1}{A}
 800 %  \bitbox{1}{OS}
 801 %  \bitbox{2}{P}
 802 %\color{black}
 803 %  \bitbox{3}{000}
 804 %  \bitbox{1}{0}
 805 %  \bitbox{2}{11}
 806 %\color{light}
 807 %  \bitbox[tbr]{16}{}
 808 %\color{black}
 809 %\end{bytefield}}
 810 %
 811 %This instruction copies the value in the outer loop counter {\tt OLC}
 812 %into the least significant bits of the data latch and leaves all other
 813 %bits of the data latch unchanged.
 814 %
 815 %\subsection{{\tt takeInnerLoopCounter}}
 816 %
 817 %\setlength{\bitwidth}{5mm}
 818 %{\tt
 819 %\begin{bytefield}{26}
 820 %  \bitheader[b]{16-19,21}\\
 821 %\color{light}
 822 %  \bitbox{1}{A}
 823 %  \bitbox{1}{OS}
 824 %  \bitbox{2}{P}
 825 %\color{black}
 826 %  \bitbox{3}{???}
 827 %  \bitbox{1}{?}
 828 %  \bitbox{2}{??}
 829 %\color{light}
 830 %  \bitbox[tbr]{16}{}
 831 %\color{black}
 832 %\end{bytefield}}
 833 %
 834 %This instruction copies the value in the inner loop counter {\tt ILC}
 835 %into the least significant bits of the data latch and leaves all other
 836 %bits of the data latch unchanged.
 837 %
 838 %
 839 %
 840 %%\pagebreak
 841 %%\subsection{{\tt interrupt}}
 842 %%
 843 %%\setlength{\bitwidth}{5mm}
 844 %{\tt
 845 %\begin{bytefield}{26}
 846 %  \bitheader[b]{0,5,16-19,21}\\
 847 %\color{light}
 848 %  \bitbox{4}{}
 849 %\color{black}
 850 %  \bitbox{3}{000}
 851 %  \bitbox{1}{1}
 852 %  \bitbox{2}{00}
 853 %\color{light}
 854 %  \bitbox[tbr]{16}{}
 855 %\end{bytefield}}
 856 %
 857 %When an {\tt interrupt} instruction reaches {\tt IH}, it will wait
 858 %there for the {\tt OD} stage to be full with an instruction that has
 859 %the {\tt IM} bit set.  When this occurs, the instruction at {\tt OD}
 860 %{\it will not execute}, but {\it may reloop} if the conditions for
 861 %relooping are met.
 862 %\footnote{The ability to interrupt an instruction yet have it reloop is very
 863 %useful for processing chunks of data with a fixed size header and/or
 864 %footer and a variable length body.}
 865 %
 866 %
 867 %\subsection{{\tt massacre}}
 868 %
 869 %\setlength{\bitwidth}{5mm}
 870 %{\tt
 871 %\begin{bytefield}{26}
 872 %  \bitheader[b]{16-19,21}\\
 873 %\color{light}
 874 %  \bitbox{4}{}
 875 %\color{black}
 876 %  \bitbox{3}{000}
 877 %  \bitbox{1}{1}
 878 %  \bitbox{2}{01}
 879 %\color{light}
 880 %  \bitbox[tbr]{16}{}
 881 %\color{black}
 882 %\end{bytefield}}
 883 %
 884 %When a {\tt massacre} instruction reaches {\tt IH}, it will wait there
 885 %for the {\tt OD} stage to be full with an instruction that has the
 886 %{\tt IM} bit set.  When this occurs, all instructions in the
 887 %instruction fifo (including {\tt OD}) are retired.
 888 %
 889 %\subsection{{\tt clog}}
 890 %
 891 %\setlength{\bitwidth}{5mm}
 892 %{\tt
 893 %\begin{bytefield}{26}
 894 %  \bitheader[b]{16-19,21}\\
 895 %\color{light}
 896 %  \bitbox{4}{}
 897 %\color{black}
 898 %  \bitbox{3}{000}
 899 %  \bitbox{1}{1}
 900 %  \bitbox{2}{10}
 901 %\color{light}
 902 %  \bitbox[tbr]{16}{}
 903 %\color{black}
 904 %\end{bytefield}}
 905 %
 906 %When a {\tt clog} instruction reaches {\tt OD}, it remains there and
 907 %no more instructions will be executed until an {\tt unclog} is
 908 %performed.
 909 %
 910 %\subsection{{\tt unclog}}
 911 %
 912 %\setlength{\bitwidth}{5mm}
 913 %{\tt
 914 %\begin{bytefield}{26}
 915 %  \bitheader[b]{16-19,21}\\
 916 %\color{light}
 917 %  \bitbox{4}{}
 918 %\color{black}
 919 %  \bitbox{3}{000}
 920 %  \bitbox{1}{1}
 921 %  \bitbox[lrtb]{2}{11}
 922 %\color{light}
 923 %  \bitbox[tbr]{16}{}
 924 %\color{black}
 925 %\end{bytefield}}
 926 %
 927 %When an {\tt unclog} instruction reaches {\tt IH}, it will wait there
 928 %until a {\tt clog} instruction is at {\tt OD}.  When this occurs, both
 929 %instructions retire.
 930 %
 931 %Note that issuing an {\tt unclog} instruction to a dock which is not
 932 %clogged and whose instruction fifo contains no {\tt clog} instructions
 933 %will cause the dock to deadlock.
 934
 935
 936
 937 \pagebreak
 938 \epsfig{file=overview,height=5in,angle=90}
 939
 940 \pagebreak
 941 \subsection*{Input Dock}
 942 \epsfig{file=indock,width=7in,angle=90}
 943
 944 \pagebreak
 945 \subsection*{Output Dock}
 946 \epsfig{file=outdock,width=6.5in,angle=90}
 947
 948
 949 %\pagebreak
 950 %\epsfig{file=ports,height=5in,angle=90}
 951
 952 %\pagebreak
 953 %\epsfig{file=best,height=5in,angle=90}
 954
 955
 956 \end{document}