+The interpreter was the first part of Mips2Java to be written. It was the most straightforward way to run MIPS binaries inside the JVM, and its simplicity made it easy to debug, whereas debugging machine-generated code is painful. Most importantly, the interpreter served as a reference while the compiler was being developed. With a known working Java implementation of each MIPS instruction, writing the compiler became a matter of simply doing the same thing ahead of time.
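+
+To illustrate why such an interpreter is easy to reason about, the following is a minimal sketch of a decode-and-execute step for two MIPS instructions, ADDIU and ADDU. The class MiniInterpreter and its members are hypothetical and are not taken from Mips2Java; only the field layout of the instruction words follows the standard MIPS32 encoding.
+
```java
// Hypothetical sketch, not Mips2Java's actual interpreter.
final class MiniInterpreter {
    final int[] regs = new int[32]; // general purpose registers; regs[0] must stay 0
    int pc;                         // byte offset of the next instruction

    // Decode and execute a single 32-bit instruction word.
    void step(int insn) {
        int opcode = insn >>> 26;
        int rs = (insn >>> 21) & 0x1f;
        int rt = (insn >>> 16) & 0x1f;
        int rd = (insn >>> 11) & 0x1f;
        int funct = insn & 0x3f;
        int simm = (short) insn;    // low 16 bits, sign-extended

        switch (opcode) {
            case 0x00:              // SPECIAL: operation selected by funct
                if (funct != 0x21) {
                    throw new IllegalStateException("unsupported funct " + funct);
                }
                regs[rd] = regs[rs] + regs[rt];   // ADDU rd, rs, rt
                break;
            case 0x09:              // ADDIU rt, rs, imm
                regs[rt] = regs[rs] + simm;
                break;
            default:
                throw new IllegalStateException("unsupported opcode " + opcode);
        }
        regs[0] = 0;                // writes to r0 are discarded
        pc += 4;
    }

    // Execute straight-line code; branches are left out of this sketch.
    void run(int[] program) {
        for (int insn : program) {
            step(insn);
        }
    }

    public static void main(String[] args) {
        MiniInterpreter m = new MiniInterpreter();
        // addiu r4, r0, 5 ; addiu r5, r0, -2 ; addu r2, r4, r5
        m.run(new int[] {0x24040005, 0x2405FFFE, 0x00851021});
        System.out.println("r2 = " + m.regs[2]);
    }
}
```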
+
+With the completion of the compiler, the interpreter in Mips2Java has become less useful. However, it may still have some life left in it. One possible use is remote debugging with GDB. Although debugging the compiler-generated JVM code is theoretically possible, it would be far easier to do in the interpreter. The interpreter may also be useful where size matters far more than speed: it is very small, and the interpreter and MIPS binary combined are smaller than the compiled classfiles.
+
+
+\subsection{Compiling to Java Source}
+
+The next major step in Mips2Java's development was the Java source compiler. It generated Java source code (compilable with javac or Jikes) from a MIPS binary. Generating Java source code was initially preferred over JVM bytecode for two reasons. First, the authors weren't very familiar with JVM bytecode, so generating Java source code was simpler. Second, generating source code eliminated the need to do trivial optimizations in the Mips2Java compiler that javac and Jikes already do, mainly constant folding of the 2+2=4 kind. For example, the MIPS register r0 is immutable and always 0. This register is represented by a static final int in the Java source compiler, and javac and Jikes automatically optimize it away when possible. In the JVM bytecode compiler these optimizations need to be done in Mips2Java itself.
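+
+The effect of declaring r0 as a constant can be seen in a small sketch. The class ZeroRegisterDemo and its accessors are hypothetical; the point is only that an expression built from r0 and a literal is a compile-time constant expression in Java, so the compiler can emit the folded value, while an expression involving a mutable register field is not.
+
```java
// Hypothetical sketch of how a constant r0 interacts with javac's constant folding.
final class ZeroRegisterDemo {
    private static final int r0 = 0; // compile-time constant, like the MIPS zero register
    private int r4, r5;

    // Rough translation of: addiu r4, r0, 42 ; addu r5, r4, r0
    void run() {
        r4 = r0 + 42; // constant expression: can be compiled as a plain load of 42
        r5 = r4 + r0; // not a constant expression, because r4 is a mutable field
    }

    int getR4() { return r4; }
    int getR5() { return r5; }

    public static void main(String[] args) {
        ZeroRegisterDemo d = new ZeroRegisterDemo();
        d.run();
        System.out.println(d.getR4() + " " + d.getR5());
    }
}
```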
+
+Early versions of the Mips2Java compiler were very simple. All 32 MIPS GPRs and a special PC register were fields in the generated Java class. A single run() method contained all the instructions in the .text segment, converted to Java source code. A switch statement allowed jumps from instruction to instruction. The generated code looked something like this:
+
+%private final static int r0 = 0;
+%private int r1, r2, r3,...,r30;
+%private int r31 = 0xdeadbeef;
+%private int pc = ENTRY_POINT;
+%
+%public void run() {
+%    for(;;) {
+%        switch(pc) {
+%            case 0x10000:
+%                r29 = r29 - 32;
+%            case 0x10004:
+%                r1 = r4 + r5;
+%            case 0x10008:
+%                if(r1 == r6) {
+%                    /* delay slot */
+%                    r1 = r1 + 1;
+%                    pc = 0x10018;
+%                    continue;
+%                }
+%            case 0x1000C:
+%                r1 = r1 + 1;
+%            case 0x10010:
+%                /* call: save the return address, then jump */
+%                r31 = 0x10018;
+%                pc = 0x10210;
+%                continue;
+%            case 0x10014:
+%                /* nop */
+%            case 0x10018:
+%                /* return to the address in r31 */
+%                pc = r31;
+%                continue;
+%            ...
+%            case 0xdeadbeef:
+%                System.err.println("Exited from ENTRY_POINT");
+%                System.err.println("R2: " + r2);
+%                System.exit(1);
+%        }
+%    }
+%}
+
+This worked fine for small binaries, but anything substantial quickly hit the JVM's 64KB limit on the size of a method. The solution was to break the code up into many smaller methods, which required a trampoline to dispatch jumps to the appropriate method. With the addition of the trampoline the generated code looked something like this:
+
+%public void run_0x10000() {
+%    for(;;) {
+%        switch(pc) {
+%            case 0x10000:
+%                ...
+%            case 0x10004:
+%                ...
+%            ...
+%            case 0x10010:
+%                r31 = 0x10018;
+%                pc = 0x10210;
+%                /* target is in another method: go back to the trampoline */
+%                return;
+%            ...
+%        }
+%    }
+%}
+%
+%public void run_0x10200() {
+%    for(;;) {
+%        switch(pc) {
+%            case 0x10200:
+%                ...
+%            case 0x10204:
+%                ...
+%        }
+%    }
+%}
+%
+%public void trampoline() {
+%    for(;;) {
+%        switch(pc&0xfffffe00) {
+%            case 0x10000: run_0x10000(); break;
+%            case 0x10200: run_0x10200(); break;
+%            case 0xdeadbe00:
+%                ...
+%        }
+%    }
+%}
+
+With this trampoline in place, somewhat large binaries could be handled without much difficulty. There is no limit on the size of a classfile as a whole, only on individual methods, so this approach should scale well. However, there are other classfile limitations that will limit the size of compiled binaries.
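+
+The dispatch scheme can be exercised end to end with a small runnable sketch. Everything in it is hypothetical: the class TrampolineDemo, the toy program (a loop that calls a routine in a second method three times), and the choice of 128 instructions per method, from which the mask 0xfffffe00 follows at 4 bytes per instruction. Delay slots are left out for brevity.
+
```java
// Hypothetical sketch of trampoline dispatch; not actual Mips2Java output.
final class TrampolineDemo {
    static final int INSNS_PER_METHOD = 128;                    // assumed chunk size
    static final int CHUNK_MASK = ~(INSNS_PER_METHOD * 4 - 1);  // 0xfffffe00
    static final int EXIT = 0xdeadbeef;                         // sentinel return address

    int r2, r4;
    int r31 = EXIT;
    int pc = 0x10000;

    void run_0x10000() {
        for (;;) {
            switch (pc) {
                case 0x10000:
                    r4 = 3;
                case 0x10004:
                    // call into the other method: set the return address and leave
                    r31 = 0x1000c;
                    pc = 0x10200;
                    return;
                case 0x1000c:
                    r4 = r4 - 1;
                case 0x10010:
                    // branch within this method: stay in the local switch
                    if (r4 != 0) {
                        pc = 0x10004;
                        continue;
                    }
                case 0x10014:
                    pc = EXIT;
                    return;
                default:
                    throw new IllegalStateException("bad pc 0x" + Integer.toHexString(pc));
            }
        }
    }

    void run_0x10200() {
        for (;;) {
            switch (pc) {
                case 0x10200:
                    r2 = r2 + r4;
                case 0x10204:
                    // return to the caller through the trampoline
                    pc = r31;
                    return;
                default:
                    throw new IllegalStateException("bad pc 0x" + Integer.toHexString(pc));
            }
        }
    }

    // Select the method that contains pc until the program exits.
    void trampoline() {
        while (pc != EXIT) {
            switch (pc & CHUNK_MASK) {
                case 0x10000: run_0x10000(); break;
                case 0x10200: run_0x10200(); break;
                default:
                    throw new IllegalStateException("no method for pc 0x" + Integer.toHexString(pc));
            }
        }
    }

    public static void main(String[] args) {
        TrampolineDemo d = new TrampolineDemo();
        d.trampoline();
        System.out.println("r2 = " + d.r2); // 3 + 2 + 1
    }
}
```
+
+Jumps that stay inside a method only update pc and continue the local loop, while jumps that leave it return to the trampoline, which is why every cross-method call or return costs an extra dispatch.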
+
+Another interesting problem discovered while creating the trampoline method was the inability of javac and Jikes to properly optimize switch statements. The following code fragment gets compiled into a lookupswitch by javac:
+
+%switch(pc&0xffffff00) {
+%    case 0x00000100: run_100(); break;
+%    case 0x00000200: run_200(); break;
+%    case 0x00000300: run_300(); break;
+%}
+
+while a nearly identical fragment that shifts the value right, so that the case labels become consecutive small integers, gets compiled to a tableswitch.
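+
+A sketch of the two forms side by side, assuming a shift of 8 bits to match the mask above. The class SwitchShapes and the chunk numbers it returns are hypothetical stand-ins for the calls to the run methods. Disassembling the compiled class with javap -c should show which switch instruction javac chose for each method.
+
```java
// Hypothetical sketch comparing masked and shifted dispatch.
final class SwitchShapes {
    // Sparse case labels (0x100 apart): javac is expected to emit a lookupswitch.
    static int dispatchMasked(int pc) {
        switch (pc & 0xffffff00) {
            case 0x00000100: return 1;
            case 0x00000200: return 2;
            case 0x00000300: return 3;
            default: return -1;
        }
    }

    // Dense case labels (1, 2, 3): javac is expected to emit a tableswitch.
    static int dispatchShifted(int pc) {
        switch (pc >>> 8) {
            case 0x1: return 1;
            case 0x2: return 2;
            case 0x3: return 3;
            default: return -1;
        }
    }

    public static void main(String[] args) {
        // Both forms select the same chunk for every pc in the covered range.
        for (int pc = 0; pc < 0x500; pc += 4) {
            if (dispatchMasked(pc) != dispatchShifted(pc)) {
                throw new AssertionError("mismatch at pc 0x" + Integer.toHexString(pc));
            }
        }
        System.out.println("masked and shifted dispatch agree");
    }
}
```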
+
+Javac isn't smart enough to see the pattern in the case values and generates very suboptimal bytecode. Doing the shift manually convinces javac to emit a tableswitch instruction, which is significantly faster. This change alone nearly doubled the speed of the compiled binary.
+
+Finding the optimal method size led to the next big performance increase. Experimentation showed that the optimal number of MIPS instructions per method is 128 (considering only powers of two). Going above or below that led to performance decreases. This is most likely due to a combination of two factors.
+