1 \documentclass[letter]{seminar}
2 \usepackage{calc} % Simple computations with LaTeX variables
3 \usepackage[hang]{caption2} % Improved captions
4 \usepackage{fancybox} % To have several backgrounds
5 % (must be loaded before `fancyvrb')
6 \usepackage{fancyhdr} % Headers and footers definitions
7 \usepackage{fancyvrb} % Fancy verbatim environments
12 \usepackage{pdftricks}
14 \usepackage{pstcol} % PSTricks with the standard color package
15 % (before `graphicx' for the \scalebox macro)
16 \usepackage{graphicx} % Standard graphics package
17 \usepackage{multido} % General loop macro
18 \usepackage{pifont} % Ding symbols (mainly for lists)
19 \usepackage{pst-fr3d} % PSTricks 3D framed boxes
20 \usepackage{pst-grad} % PSTricks gradient mode
21 \usepackage{pst-node} % PSTricks nodes
22 \usepackage{pst-slpe} % Improved PSTricks gradients
25 \definecolor{gray}{rgb}{0.5,0.5,0.5}
26 \definecolor{CodeBorder}{rgb}{0.6,0.6,0.6}
27 \definecolor{CodeBackground}{rgb}{0.93,0.93,0.93}
32 \usepackage{bold-extra}
34 \usepackage{amssymb,amsmath,epsfig,alltt}
35 \usepackage{semcolor} % Seminar colored slides
36 \usepackage{semhelv} % Seminar helvetica fonts
37 \usepackage{semlayer} % Seminar overlays
38 \usepackage{slidesec} % Seminar sections and list of slides
39 \usepackage{url} % Convenient URL typesetting
40 \usepackage[pdftex,letterpaper,pdffitwindow=true,colorlinks=true,pdfpagemode=UseNone,
41 bookmarks=true]{hyperref} % Hyperlinks for PDF versions
43 \slidepagestyle{fancy}
45 \slidesmag{4} % Set magnification of slide
46 \def\SeminarPaperWidth{\paperwidth / 2}
47 \def\SeminarPaperHeight{\paperheight / 2}
48 \slideframe{none} % No default frame
52 % General size parameters
53 \renewcommand{\slidetopmargin}{-25mm}
54 \renewcommand{\slidebottommargin}{-10mm}
55 % \renewcommand{\slideleftmargin}{4mm}
56 % \renewcommand{\sliderightmargin}{4mm}
57 % To adjust the frame length to the header and footer ones
58 % \autoslidemarginstrue
59 % We suppress the header and footer `fancyhdr' rules
60 \fancyhf{} % Clear all fields
61 \renewcommand{\headrule}{}
62 \renewcommand{\footrule}{}
64 % \usepackage{nohyperref} % To deactivate the `hyperref' features
65 % \overlaysfalse % To suppress overlays
66 % \def\special@paper{}% Needed to avoid `hyperref' to collapse with ``dvips''
67 \newslideframe{IMAGE}{%
68 \boxput{\rput(0,-9mm){%
69 \includegraphics[width=\SeminarPaperHeight,height=\SeminarPaperWidth]{background.pdf}}}{#1}}
71 \RequirePackage[T1]{fontenc}
72 \RequirePackage{textcomp}
73 \renewcommand{\rmdefault}{trebuchet}
74 \renewcommand{\slidefonts}{%
75 \renewcommand{\rmdefault}{trebuchet}%
76 \renewcommand{\ttdefault}{cmtt}}%
77 \newcommand{\ParagraphTitle}[2][black]{%
78 \noindent\psshadowbox[fillstyle=solid,fillcolor=#1]{\large{#2}}}
79 \newcommand{\CenteredParagraphTitle}[2][black]{%
80 \centerline{\psshadowbox[fillstyle=solid,fillcolor=#1]{\large{#2}}}}
81 \renewcommand{\makeslideheading}[1]{%
82 \CenteredParagraphTitle[black]{%
83 \textcolor{black}{\huge\textbf{#1}}}}
84 \renewcommand{\makeslidesubheading}[1]{%
85 \CenteredParagraphTitle{\Large\theslidesubsection{} -- #1}}
86 \renewenvironment{dinglist}[2][black]
87 {\begin{list}{\ding{#2}}{}}{\end{list}}
88 \newcommand{\DingListSymbolA}{43}
89 \newcommand{\DingListSymbolB}{243}
90 \newcommand{\DingListSymbolC}{224}
91 \newcommand{\DingListSymbolD}{219}
92 \newcommand{\eqbox}[2][0.6]{%
93 \centerline{\psshadowbox[fillstyle=solid,fillcolor=gray]{%
96 \textcolor{black} {#2}
98 \renewcommand{\labelitemi}{\footnotesize$\blacktriangleright$}
99 \renewcommand{\labelitemii}{\footnotesize$\pmb\vartriangleright$}
100 \renewcommand{\labelitemiii}{$\pmb\rightsquigarrow$}
104 \ParagraphTitle{\bf \Large Complete Translation of}
106 {\bf \Large Unsafe Native Code to Safe Bytecode} \\
108 \textit{Brian Alliet, RIT\\Adam Megacz, UC Berkeley} \
115 \begin{slide}\raggedright
116 \renewcommand{\leftmargini}{5mm}
117 {\Large{\textcolor{black}{The Problem: Specifics}}}
118 \\\rule{\textwidth}{0.1pt}\\
124 XWT project (pure Java)
132 TrueType font rasterization
135 {\texttt{.cab}} extraction
149 \begin{slide}\raggedright
150 \renewcommand{\leftmargini}{5mm}
151 {\Large{\textcolor{black}{The Problem: Generally}}}
152 \\\rule{\textwidth}{0.1pt}\\
158 Lots of high-quality, complex, code out there in unsafe
178 Need to integrate with Java code
181 JNI isn't always an option
200 \begin{slide}\raggedright
201 \renewcommand{\leftmargini}{5mm}
202 {\Large{\textcolor{black}{Existing Approaches: Source-to-Source}}}
203 \\\rule{\textwidth}{0.1pt}\\
207 \item \begin{figure}[H]
209 \epsfig{file=source-to-source,width=\textwidth-1in}
219 \begin{slide}\raggedright
220 \renewcommand{\leftmargini}{5mm}
221 {\Large{\textcolor{black}{Existing Approaches: Source-to-Source}}}
222 \\\rule{\textwidth}{0.1pt}\\
260 \begin{slide}\raggedright
261 \renewcommand{\leftmargini}{5mm}
262 {\Large{\textcolor{black}{Existing Approaches: Source-to-Binary}}}
263 \\\rule{\textwidth}{0.1pt}\\
267 \item \begin{figure}[H]
269 \epsfig{file=source-to-binary,width=\textwidth-1in}
279 \begin{slide}\raggedright
280 \renewcommand{\leftmargini}{5mm}
281 {\Large{\textcolor{black}{Existing Approaches: Source-to-Binary}}}
282 \\\rule{\textwidth}{0.1pt}\\
300 \begin{slide}\raggedright
301 \renewcommand{\leftmargini}{5mm}
302 {\Large{\textcolor{black}{NestedVM: Binary-to-Source}}}
303 \\\rule{\textwidth}{0.1pt}\\
307 \item \begin{figure}[H]
309 \epsfig{file=binary-to-source,width=\textwidth-1in}
319 \begin{slide}\raggedright
320 \renewcommand{\leftmargini}{5mm}
321 {\Large{\textcolor{black}{Advantages}}}
322 \\\rule{\textwidth}{0.1pt}\\
328 No modifications to build process
334 Bug-for-bug compiler compatability
337 Zero post-translation human intervention
346 \begin{slide}\raggedright
347 \renewcommand{\leftmargini}{5mm}
348 {\Large{\textcolor{black}{Why MIPS?}}}
349 \\\rule{\textwidth}{0.1pt}\\
355 Simple, clean, regular semantics
358 Well-defined ABI (caller save, callee save)
361 Many similarities to JVM
366 32-bit aligned memory access
369 32x32 => 64bit multiply and divide
372 Unsigned math is rare
375 Easy to determine likely jump targets
386 \begin{slide}\raggedright
387 \renewcommand{\leftmargini}{5mm}
388 {\Large{\textcolor{black}{Memory Representation}}}
389 \\\rule{\textwidth}{0.1pt}\\
393 \item \begin{figure}[H]
395 \epsfig{file=memory,width=\textwidth-1in}
405 \begin{slide}\raggedright
406 \renewcommand{\leftmargini}{5mm}
407 {\Large{\textcolor{black}{Code Representation}}}
408 \\\rule{\textwidth}{0.1pt}\\
416 \begin{Verbatim}[fontsize=\tiny,frame=single,rulecolor=\color{CodeBorder},resetmargins=true,gobble=0]
417 private final static int r0 = 0;
418 private int r1, r2, r3, /* ... */ r30;
419 private int r31 = 0xdeadbeef;
424 case 0x10000: r29 = r29 - 32;
425 case 0x10004: r1 = r4 + r5;
426 case 0x10008: if (r1 == r6) {
431 case 0x1000C: r1 = r1 + 1;
432 case 0x10010: r31 = 0x10018;
435 case 0x10014: /* nop */
436 case 0x10018: pc = r31; continue;
446 \begin{slide}\raggedright
447 \renewcommand{\leftmargini}{5mm}
448 {\Large{\textcolor{black}{Trampoline Insertion}}}
449 \\\rule{\textwidth}{0.1pt}\\
457 \begin{Verbatim}[fontsize=\tiny,frame=single,rulecolor=\color{CodeBorder},resetmargins=true,gobble=0]
458 public void run_0x10000() {
459 while (true) switch(pc) {
462 case 0x10010: r31 = 0x10018;
467 pubic void run_0x10200() {
468 while (true) switch(pc) {
473 public void trampoline() {
474 while (true) switch(pc&0xfffffe00) {
475 case 0x10000: run_0x10000(); break;
476 case 0x10200: run_0x10200(); break;
487 \begin{slide}\raggedright
488 \renewcommand{\leftmargini}{5mm}
489 {\Large{\textcolor{black}{Binary-to-Binary Translation}}}
490 \\\rule{\textwidth}{0.1pt}\\
494 \item \begin{figure}[H]
496 \epsfig{file=binary-to-binary,width=\textwidth-1in}
506 \begin{slide}\raggedright
507 \renewcommand{\leftmargini}{5mm}
508 {\Large{\textcolor{black}{Binary-to-Binary Translation}}}
509 \\\rule{\textwidth}{0.1pt}\\
515 Allows the use of more optimal bytecode sequences that cannot be expressed
519 Smaller Bytecode - Javac doesn't emit space efficient bytecode
522 Translation is Much faster - doesn't need to go though the extra java
523 source compilation step
526 Compiled code can be loaded into the JVM at Runtime.
535 \begin{slide}\raggedright
536 \renewcommand{\leftmargini}{5mm}
537 {\Large{\textcolor{black}{Binary-to-Binary: Performance}}}
538 \\\rule{\textwidth}{0.1pt}\\
542 \item \begin{figure}[H]
544 \epsfig{file=chart7,width=\textwidth-1in}
554 \begin{slide}\raggedright
555 \renewcommand{\leftmargini}{5mm}
556 {\Large{\textcolor{black}{Binary-to-Binary: Performance}}}
557 \\\rule{\textwidth}{0.1pt}\\
563 Performance increased from 10-30\% after compiling to bytecode
567 Increase was primarily the result of two things
572 More optimal handing of branch instructions
575 Use if the JVM stack in favor of temporary variables
586 \begin{slide}\raggedright
587 \renewcommand{\leftmargini}{5mm}
588 {\Large{\textcolor{black}{Optimizations: Optimal Method Size}}}
589 \\\rule{\textwidth}{0.1pt}\\
593 \item \begin{figure}[H]
595 \epsfig{file=chart6,width=\textwidth-1in}
605 \begin{slide}\raggedright
606 \renewcommand{\leftmargini}{5mm}
607 {\Large{\textcolor{black}{Optimizations: Optimal Method Size}}}
608 \\\rule{\textwidth}{0.1pt}\\
612 \item \begin{figure}[H]
614 \epsfig{file=chart5,width=\textwidth-1in}
624 \begin{slide}\raggedright
625 \renewcommand{\leftmargini}{5mm}
626 {\Large{\textcolor{black}{Better Branching}}}
627 \\\rule{\textwidth}{0.1pt}\\
635 \begin{Verbatim}[fontsize=\tiny,frame=single,rulecolor=\color{CodeBorder},resetmargins=true,gobble=0]
640 case 0x10004: // JMP 0x10000
652 \begin{slide}\raggedright
653 \renewcommand{\leftmargini}{5mm}
654 {\Large{\textcolor{black}{Better Branching}}}
655 \\\rule{\textwidth}{0.1pt}\\
663 \begin{Verbatim}[fontsize=\tiny,frame=single,rulecolor=\color{CodeBorder},resetmargins=true,gobble=0]
668 12: GOTO 10 // JMP 0x10000
678 \begin{slide}\raggedright
679 \renewcommand{\leftmargini}{5mm}
680 {\Large{\textcolor{black}{Better Branching}}}
681 \\\rule{\textwidth}{0.1pt}\\
687 Conditional branches were also not handled optimally in the Binary-
695 \begin{Verbatim}[fontsize=\tiny,frame=single,rulecolor=\color{CodeBorder},resetmargins=true,gobble=0]
709 \begin{slide}\raggedright
710 \renewcommand{\leftmargini}{5mm}
711 {\Large{\textcolor{black}{Better Branching}}}
712 \\\rule{\textwidth}{0.1pt}\\
718 Binary-to-binary compiler emits JVM branch instructions that
719 branch directly to the destination
726 \begin{Verbatim}[fontsize=\tiny,frame=single,rulecolor=\color{CodeBorder},resetmargins=true,gobble=0]
740 \begin{slide}\raggedright
741 \renewcommand{\leftmargini}{5mm}
742 {\Large{\textcolor{black}{More efficient delay slots}}}
743 \\\rule{\textwidth}{0.1pt}\\
749 The MIPS ISA has ``delay slots'', that is, instruction after a branch instruction
750 that are always executed even if the branch is taken.
753 The modification the delay slot makes to the registers used in the branch
754 instruction cannot effect the branch. Therefore, the delay slot cannot just be
755 output before the branch
758 In the binary-to-source compiler delay slots were simply output twice (once
759 before setting the new pc and once after the branch)
768 \begin{slide}\raggedright
769 \renewcommand{\leftmargini}{5mm}
770 {\Large{\textcolor{black}{More efficient delay slots}}}
771 \\\rule{\textwidth}{0.1pt}\\
777 The JVM stack can be used to eliminate this inefficiency.
780 The variables used in the comparison are pushed to the stack, the
781 delay slot is executed, then the comparison/branch is done
784 Delay slot never needs to be output more than once
793 \begin{slide}\raggedright
794 \renewcommand{\leftmargini}{5mm}
795 {\Large{\textcolor{black}{Smaller Bytecode}}}
796 \\\rule{\textwidth}{0.1pt}\\
802 Javac doesn't emit the most space efficient bytecode possible
809 \begin{Verbatim}[fontsize=\tiny,frame=single,rulecolor=\color{CodeBorder},resetmargins=true,gobble=0]
811 Case 0x1: run_0x100(); break;
812 Case 0x2: run_0x200(); break;
819 Javac emits an ALOAD\_0 and an INVOKESPECIAL instruction for each case arm.
820 The ALOAD\_0 can be hoisted out of the switch statement to create smaller bytecode.
830 \begin{slide}\raggedright
831 \renewcommand{\leftmargini}{5mm}
832 {\Large{\textcolor{black}{Smaller Bytecode}}}
833 \\\rule{\textwidth}{0.1pt}\\
839 Smaller bytecode also results from many of the performance
840 increasing optimizations
843 He more efficient branch instructions also take up less space in bytecode
846 Using the stack in favor of local variables leads to less LOADs and
850 Better handling of delay slots avoids outputting each delay slot twice
859 \begin{slide}\raggedright
860 \renewcommand{\leftmargini}{5mm}
861 {\Large{\textcolor{black}{GCC Optimizations}}}
862 \\\rule{\textwidth}{0.1pt}\\
868 NestedVM emulates a MIPS R2000 CPU but its performance profile
869 is very different from the actual CPU.
872 GCC makes assumptions about the CPU that don't hold true for
876 This leads to suboptimal code generation
885 \begin{slide}\raggedright
886 \renewcommand{\leftmargini}{5mm}
887 {\Large{\textcolor{black}{GCC Optimizations}}}
888 \\\rule{\textwidth}{0.1pt}\\
894 Fortunately, GCC provides a wealth of options to compensate for
898 {\texttt{-falign-functions}} - Used to place functions on 512 byte boundaries to
899 reduce the chances of a single function spanning several java
909 \begin{slide}\raggedright
910 \renewcommand{\leftmargini}{5mm}
911 {\Large{\textcolor{black}{GCC Optimizations}}}
912 \\\rule{\textwidth}{0.1pt}\\
918 {\texttt{-fno-rename-registers}}
921 Normally GCC tries to use as many registers as possible make it easier to
925 This creates more work for the binary translator and the JVM. Using few
926 registers simplifies the resulting bytecode and allows the JIT compiler to
927 better optimize the resulting code.
936 \begin{slide}\raggedright
937 \renewcommand{\leftmargini}{5mm}
938 {\Large{\textcolor{black}{GCC Optimizations}}}
939 \\\rule{\textwidth}{0.1pt}\\
945 {\texttt{-fno-schedule-insns}}
948 The results of some MIPS operations are not immediately available. Many
949 instructions have delay slots that are executed before the results are
953 GCC tries to take advantage of these delay slots by using them to do useful
957 Since these results are immediately available in NestedVM this makes the
958 code more complicated with no benefit.
967 \begin{slide}\raggedright
968 \renewcommand{\leftmargini}{5mm}
969 {\Large{\textcolor{black}{Impact of Optimizations}}}
970 \\\rule{\textwidth}{0.1pt}\\
974 \item \begin{figure}[H]
976 \epsfig{file=chart4,width=\textwidth-1in}
986 \begin{slide}\raggedright
987 \renewcommand{\leftmargini}{5mm}
988 {\Large{\textcolor{black}{Overall Performance}}}
989 \\\rule{\textwidth}{0.1pt}\\
993 \item \begin{figure}[H]
995 \epsfig{file=chart3,width=\textwidth-1in}
1005 \begin{slide}\raggedright
1006 \renewcommand{\leftmargini}{5mm}
1007 {\Large{\textcolor{black}{Implementation}}}
1008 \\\rule{\textwidth}{0.1pt}\\
1014 {\texttt{http://nestedvm.ibex.org/}}