X-Git-Url: http://git.megacz.com/?p=nestedvm.git;a=blobdiff_plain;f=doc%2Fivme04.tex;h=13500be0f279e043dc99920f0f03c274cc6e64fe;hp=d248144ff67443ab1e3cbeb79488ddd159b37f08;hb=fd9029a7104f769a45dc56eeff9a3ce14febf5dc;hpb=ee8899b5e156e02c02e696eb41398c71c6272916 diff --git a/doc/ivme04.tex b/doc/ivme04.tex index d248144..13500be 100644 --- a/doc/ivme04.tex +++ b/doc/ivme04.tex @@ -4,11 +4,12 @@ \usepackage{amssymb,amsmath,epsfig,alltt} \sloppy \usepackage{palatino} -\usepackage{pdftricks} -\begin{psinputs} - \usepackage{pstricks} - \usepackage{pst-node} -\end{psinputs} +\usepackage{url} +%\usepackage{pdftricks} +%\begin{psinputs} +\usepackage{pstricks} +\usepackage{pst-node} +%\end{psinputs} \usepackage{parskip} \usepackage{tabularx} \usepackage{alltt} @@ -45,7 +46,7 @@ one source code language (such as C, Pascal, or Fortran) to another We present an alternative approach which translate MIPS binaries produced by any compiler into safe virtual machine bytecodes. This approach offers four key advantages over existing techniques: it is -language agnostic, it offers bug-for-bug compiler compatability, +language agnostic, it offers bug-for-bug compiler compatibility, requires no post-translation human intervention, and introduces no build process modifications. @@ -65,9 +66,8 @@ attractive solution for code which is not performance-critical. \section{Introduction} -Unsafe languages such as C \cite{KR} and C++ \cite{soustroup} have -been in use much longer than any of today's widely accepted safe -languages such as Java \cite{java} and C\# \cite{csharp}. +Unsafe languages such as C and C++ have been in use much longer than +any of today's widely accepted safe languages such as Java and C\# Consequently, there is a huge library of software written in these languages. Although safe languages offer substantial benefits, their comparatively young age often puts them at a disadvantage when breadth @@ -109,7 +109,7 @@ unsafe source, binary, and bytecode) and survey existing work for performing transformations between them. In the third section we introduce NestedVM and cover its two primary translation modes in detail. Section four describes the NestedVM runtime, which plays the -role of the OS kernel. Section five adresses the optimizations we +role of the OS kernel. Section five addresses the optimizations we employ and quantifies NestedVM's performance. Section six reviews our experiences in applying NestedVM to various popular software packages. We conclude with an analysis of NestedVM's weaknesses and potential @@ -122,7 +122,7 @@ The four program representations of interest in this context, along with their specific types in the C-to-JVM instantiation of the problem are shown in the following diagram: -\begin{pdfpic} +%\begin{pdfpic} \newlength{\MyLength} \settowidth{\MyLength}{machine code} \newcommand{\MyBox}[1]{\makebox[\MyLength][c]{#1}} @@ -138,15 +138,15 @@ problem are shown in the following diagram: & \\[0pt] \psset{nodesep=5pt,arrows=->} \end{psmatrix} -\end{pdfpic} +%\end{pdfpic} To illustrate the context of this diagram, the following arcs show the translations performed by a few familiar tools: -\begin{pdfpic} -\newlength{\MyLength} -\settowidth{\MyLength}{xmachine codex} -\newcommand{\MyBox}[1]{\makebox[\MyLength]{#1}} +%\begin{pdfpic} +%\newlength{\MyLength} +%\settowidth{\MyLength}{xmachine codex} +%\newcommand{\MyBox}[1]{\makebox[\MyLength]{#1}} \psmatrix[colsep=2,rowsep=0,nrot=:D] & \\[0pt] [name=s0]\MyBox{unsafe source} & [name=s1]\MyBox{safe source} \\[0pt] @@ -163,7 +163,7 @@ translations performed by a few familiar tools: \ncline{s1}{b1}\aput{:U}{\tt javac} \ncline{b1}{b0}\aput{:D}{\tt gcj}\bput{:D}{JITs} \endpsmatrix -\end{pdfpic} +%\end{pdfpic} Existing techniques for translating unsafe code into VM bytecode generally fall into two categories, which we expand upon in the @@ -175,10 +175,10 @@ source-to-binary translation. The most common technique employed is partial translation from unsafe source code to safe source code: -\begin{pdfpic} -\newlength{\MyLength} -\settowidth{\MyLength}{xmachine codex} -\newcommand{\MyBox}[1]{\makebox[\MyLength]{#1}} +%\begin{pdfpic} +%\newlength{\MyLength} +%\settowidth{\MyLength}{xmachine codex} +%\newcommand{\MyBox}[1]{\makebox[\MyLength]{#1}} \psmatrix[colsep=2,rowsep=0,nrot=:U] & \\[0pt] & \\[0pt] @@ -194,7 +194,7 @@ source code to safe source code: \ncline{s0}{s1}\aput{:U}{source-to}\bput{:U}{source} \ncline{s1}{b1}\aput{:U}{\tt javac} \endpsmatrix -\end{pdfpic} +%\end{pdfpic} A number of existing systems employ this technique; they can be divided into two categories: those which perform a partial @@ -205,12 +205,12 @@ programs. \subsubsection{Human-Assisted Translation} Jazillian \cite{jazillian} is a commercial solution which produces -extremely readable Java source code from C source code, but ony +extremely readable Java source code from C source code, but only translates a small portion of the C language. Jazillian is unique in that in addition to {\it language migration}, it also performs {\it API migration}; for example, Jazillian is intelligent enough to translate ``{\tt char*~s1~=~strcpy(s2)}'' into ``{\tt -String~s1~=~s2}''. Unfortunately such deep analysis is intractible +String~s1~=~s2}''. Unfortunately such deep analysis is intractable for most of the C language and standard library; indeed, Jazillian's documentation notes that {\it ``This is not your father's language translator... Jazillian does not always produce code that works @@ -226,7 +226,7 @@ for all of the C++ programs which it accepts. \subsubsection{Partial-Domain Translation} -The {\tt c2j} \cite{c2j}, {\tt c2j++} \cite{c2jpp}, Cappucinno +The {\tt c2j} \cite{c2j}, {\tt c2j++}, Cappucinno \cite{capp}, and Ephedra \cite{ephedra} systems each provide support for complete translation of a {\it subset} of the source language (C or C++). Each of the four tools supports a progressively greater @@ -243,7 +243,7 @@ recently written. Unfortunately, when the program being translated is large and complex, it is quite likely that it will use an unsupported feature in at least one place. In the absence of a programmer who understands the source -program, a single anomoly is often enough to render the entire +program, a single anomaly is often enough to render the entire translation process useless. As a result, these tools are mainly useful as an {\it aid} to programmers who could normally perform the conversion themselves, but want to save time by automating most of the @@ -255,10 +255,10 @@ process. Source-to-binary translation involves a compiler for the unsafe language which has been modified to emit safe bytecode. -\begin{pdfpic} -\newlength{\MyLength} -\settowidth{\MyLength}{xmachine codex} -\newcommand{\MyBox}[1]{\makebox[\MyLength]{#1}} +%\begin{pdfpic} +%\newlength{\MyLength} +%\settowidth{\MyLength}{xmachine codex} +%\newcommand{\MyBox}[1]{\makebox[\MyLength]{#1}} \psmatrix[colsep=2,rowsep=0,nrot=:U] & \\[0pt] [name=s0]\MyBox{unsafe source} & [name=s1]\MyBox{safe source} \\[0pt] @@ -272,7 +272,7 @@ language which has been modified to emit safe bytecode. \psset{nodesep=5pt,arrows=->} \ncline{s0}{b1}\bput{:U}{source-to-binary} \endpsmatrix -\end{pdfpic} +%\end{pdfpic} An experimental ``JVM backend'' for the {\tt gcc} compiler, known as {\tt egcs-jvm} \cite{egcsjvm}, attempts this approach. Since {\tt @@ -283,8 +283,8 @@ fundamental assumptions (such as the availability of pointer math) which cannot be directly supported; thus the compiler still fails for a substantial class of input programs. -A Java backend for the {\tt lcc} [CITE] compiler, known as {\tt -lcc-java} [CITE], is also available. Although this system is quite +A Java backend for the {\tt lcc} compiler \cite{lcc}, known as {\tt +lcc-java}, is also available. Although this system is quite clean and elegantly designed, it lacks any form of system library ({\tt libc}), so very few C programs will run without custom modification (which would cause them to diverge from the upstream @@ -312,10 +312,10 @@ translation. This offers three immediate advantages: code generation steps of compilation, it is freed from the extremely complex task of faithfully implementing languages which are often not fully or formally specified (such as C and - C++), and is able to support any langage for which a + C++), and is able to support any language for which a MIPS-targeted compiler exists. -\item {\bf Bug-for-bug compiler compatability} +\item {\bf Bug-for-bug compiler compatibility} Since NestedVM uses the compiler's {\it output} as its own {\it input}, it ensures that programs which are inadvertently @@ -382,7 +382,7 @@ carry a performance penalty on physical MIPS implementations. Our choice of a paged representation for memory carries only a small performance disadvantage: -\epsfig{file=charts/chart5,width=3in} +\epsfig{file=charts/chart5,height=2.7in,angle=-90} Additionally, this representation lets us to take advantage of the fact that on most JVM's, checking for a {\tt NullPointerException} @@ -437,10 +437,10 @@ translation. In this mode, NestedVM translates MIPS binaries into Java source code, which is then fed to a Java compiler in order to produce bytecode files: -\begin{pdfpic} -\newlength{\MyLength} -\settowidth{\MyLength}{xmachine codex} -\newcommand{\MyBox}[1]{\makebox[\MyLength]{#1}} +%\begin{pdfpic} +%\newlength{\MyLength} +%\settowidth{\MyLength}{xmachine codex} +%\newcommand{\MyBox}[1]{\makebox[\MyLength]{#1}} \psmatrix[colsep=2,rowsep=0,nrot=:U] & \\[0pt] & \\[0pt] @@ -456,7 +456,7 @@ produce bytecode files: \ncline{s1}{b1}\aput{:U}{\tt javac} \ncline{b0}{s1}\naput{\tt NestedVM} \endpsmatrix -\end{pdfpic} +%\end{pdfpic} \begin{figure*}[t] \begin{minipage}[c]{7in}% @@ -526,7 +526,7 @@ Translating unsafe code for use within a JVM proceeds as follows: emits a {\tt .java} file. \item The resulting {\tt .java} code is compiled into a {\tt .class} - file using {\tt jikes} \cite{jikes} or {\tt javac}. + file using {\tt jikes} or {\tt javac}. \item At runtime, the host Java code invokes the {\tt run()} method on the generated class. This is equivalent to the {\tt main()} @@ -539,10 +539,10 @@ Translating unsafe code for use within a JVM proceeds as follows: After implementing the binary-to-source compiler, a binary-to-binary translation mode was added. -\begin{pdfpic} -\newlength{\MyLength} -\settowidth{\MyLength}{xmachine codex} -\newcommand{\MyBox}[1]{\makebox[\MyLength]{#1}} +%\begin{pdfpic} +%\newlength{\MyLength} +%\settowidth{\MyLength}{xmachine codex} +%\newcommand{\MyBox}[1]{\makebox[\MyLength]{#1}} \psmatrix[colsep=2,rowsep=0,nrot=:U] & \\[0pt] [name=s0]\MyBox{unsafe source} & [name=s1]\MyBox{safe source} \\[0pt] @@ -557,7 +557,7 @@ translation mode was added. \ncline{s0}{b0}\bput{:U}{\tt gcc} \ncline{b0}{b1}\naput{\tt NestedVM} \endpsmatrix -\end{pdfpic} +%\end{pdfpic} This mode has several advantages: @@ -580,12 +580,12 @@ This mode has several advantages: \section{The NestedVM Runtime} -The NestedVM runtime fills the role typically assmed by an OS Kernel. +The NestedVM runtime fills the role typically assumed by an OS Kernel. Communication between MIPS code and the runtime is mediated by the {\tt SYSCALL} instruction, just as the {\tt libc}-kernel interface is on other MIPS implementations. -Two implemenations of the runtime are offered; a simple runtime with +Two implementations of the runtime are offered; a simple runtime with the minimum support required to comply with ANSI C, and a more sophisticated runtime which emulates a large portion of the POSIX API. @@ -707,7 +707,7 @@ local variables for registers did not offer much of a performance advantage, presumably since the JVM's JIT is intelligent enough to register-allocate temporaries for fields. -\epsfig{file=charts/chart4,width=3in} +\epsfig{file=charts/chart4,height=2.7in,angle=-90} Unfortunately, Java imposes a 64kb limit on the size of the bytecode for a single method. This presents a problem for NestedVM, and @@ -749,10 +749,10 @@ This change alone nearly doubled the speed of the compiled binary. The next performance improvement came from tuning the size of the methods invoked from the trampoline. Trial and error led to the -onclusion that HotSpot \cite{hotspot} -- the most widely deployed JVM +conclusion that HotSpot \cite{hotspot} -- the most widely deployed JVM -- performs best when 128 MIPS instructions are mapped to each method. -\epsfig{file=charts/chart1,width=3in} +\epsfig{file=charts/chart1,height=2.7in,angle=-90} This phenomenon is due to two factors: @@ -770,7 +770,7 @@ This phenomenon is due to two factors: \end{itemize} After tuning method sizes, our next performance boost came from -eliminating exraneous case branches, which yielded approximately a +eliminating extraneous case branches, which yielded approximately a 10\%-25\% performance improvement. Having case statements before each instruction prevents JIT compilers from being able to optimize across instruction boundaries, since control flow can enter the body of a @@ -823,7 +823,7 @@ central values; for example ``{\tt pc~=~N\_0x00010000~+~0x10}'' (where {\tt N\_0x000100000} is a non-final field to prevent {\tt javac} from inlining it). This was sufficient to get reasonably large binaries to compile, and caused only a small (approximately 5\%) performance -degredation and a similarly small increase in the size of the {\tt +degradation and a similarly small increase in the size of the {\tt .class} file. However, as we will see in the next section, compiling directly to {\tt .class} files (without the intermediate {\tt .java} file) eliminates this problem entirely. @@ -832,7 +832,7 @@ file) eliminates this problem entirely. Compiling directly to bytecode offers a substantial performance gain: -\epsfig{file=charts/chart3,width=3in} +\epsfig{file=charts/chart3,height=2.7in,angle=-90} Most of this improvement comes from the handling of branch instructions and from taking advantage of the JVM stack to eliminate @@ -918,7 +918,7 @@ case} arm shrinks by one instruction. The net result is quite reasonably sized {\tt .class} files: -\epsfig{file=charts/chart9,width=3in} +\epsfig{file=charts/chart9,height=2.7in,angle=-90} \subsection{Compiler Flags} @@ -978,13 +978,33 @@ source-to-MIPS compiler: \end{itemize} +The following chart quantifies the performance gain conferred by each +of the major optimizations outlined in this section: + +\epsfig{file=charts/chart10,height=2.7in,angle=-90} + + + \subsection{Overall Performance} -\epsfig{file=charts/chart8,width=3in} +All times are measured in seconds. All tests were performed on a dual +1Ghz Macintosh G4 running Apple's latest JVM (Sun HotSpot JDK +1.4.1). Each test was run 8 times within a single VM. The highest and +lowest times were removed and the remaining 6 were averaged. In each +case only the first run differed significantly from the rest. -\epsfig{file=charts/chart7,width=3in} +The {\tt libjpeg} test consisted of decoding a 1280x1024 jpeg and +writing a tga. The {\tt mspack} test consisted of extracting all +members from {\tt arial32.exe}, {\tt comic32.exe}, {\tt times32.exe}, +and {\tt verdan32.exe}. The {\tt libfreetype} test consisted of +rendering ASCII characters 32-127 of {\tt Comic.TTF} at sizes from 8 +to 48 incrementing by 4 for a total of 950 glyphs. + +\epsfig{file=charts/chart8,height=2.7in,angle=-90} + +\epsfig{file=charts/chart7,height=2.7in,angle=-90} -\epsfig{file=charts/chart6,width=3in} +\epsfig{file=charts/chart6,height=2.7in,angle=-90} \section{Sample Applications} @@ -994,7 +1014,7 @@ The Ibex Project utilizes three libraries for which no Java-only equivalent exists. The first is the FreeType font library, which parses, hints, and rasterizes TrueType and Postscript fonts with exceptional quality. The project also needed an open source JPEG -decompressor; surprisingly, none exist for Java. While encoders are +decompresser; surprisingly, none exist for Java. While encoders are plentiful, decoders are rare, since Sun's J2SE VM includes a native method to invoke {\tt libjpeg}. @@ -1049,6 +1069,9 @@ for checking the ``cpu time'' of a process. Unfortunately we had to substitute wall-clock time on an otherwise-quiescent machine as an approximation. +\epsfig{file=charts/chart11,height=2.7in,angle=-90} + + \section{Conclusion} @@ -1062,7 +1085,7 @@ applications. Although we have only implemented it for the Java Virtual Machine, our technique generalizes to other safe bytecode architectures. In -particular we would like to demonstrate this generality by retargeting +particular we would like to demonstrate this generality by re-targeting the translator to the Microsoft Intermediate Language \cite{msil}. Additionally, we would like to explore other uses for dynamic loading @@ -1077,23 +1100,8 @@ obtained from http://nestedvm.ibex.org \end{verbatim} -\appendix -\section{Appendix: Testing Methodology} - -All times are measured in seconds. All tests were performed on a dual -1Ghz Macintosh G4 running Apple's latest JVM (Sun HotSpot JDK -1.4.1). Each test was run 8 times within a single VM. The highest and -lowest times were removed and the remaining 6 were averaged. In each -case only the first run differed significantly from the rest. - -The {\tt libjpeg} test consisted of decoding a 1280x1024 jpeg and -writing a tga. The {\tt mspack} test consisted of extracting all -members from {\tt arial32.exe}, {\tt comic32.exe}, {\tt times32.exe}, -and {\tt verdan32.exe}. The {\tt libfreetype} test consisted of -rendering ASCII characters 32-127 of {\tt Comic.TTF} at sizes from 8 -to 48 incrementing by 4 for a total of 950 glyphs. -\bibliography{nestedvm} +\bibliography{ivme04} \end{document}