2 <p style="border: 1px red solid; width: 50%; padding: 10px; background-color: white; margin-left: auto; margin-right: auto">
4 The public APIs in this package are <b>stable</b>; package-private
5 APIs and all other packages are subject to change in future
6 releases.<br><br>Be sure to read <a
7 href=../../../../jargon.txt>doc/jargon.txt</a> and the <a
8 href=#package_description>description</a> below.
13 This package forms the stable core of the SBP API Classes fall into
16 <ul> <li> <font color=green>Elements of the grammar</font> -- the
17 pieces from which a grammar is composed.
19 <li> <font color=purple>Input, Location, and Region</font> -- the
20 input to be parsed, as well as classes for describing
21 locations and regions of that input.
23 <li> Parser -- the engine that actually performs the parsing
26 <li> <font color=blue>Trees and Forests</font> -- used to
27 represent the output of the parsing process.
33 <h2>Theory of Operation</h2>
37 The input that you parse is considered to be a stream of
38 <tt>Tokens</tt>; this stream is represented by an
39 <tt>Input<Token></tt>. In order to create this <tt>Input</tt>,
40 you must first decide what kind of tokens you want to parse. Based on
41 this decision, you should then implement subclasses of <tt>Input</tt>,
42 <tt>Parser</tt>, and <tt>Atom</tt> for that token type. If you are
43 parsing characters (which you usually are), these subclasses are
44 provided in the <tt>edu.berkeley.sbp.chr.*</tt> package so you don't
45 have to write them yourself.
49 You then create a grammar by instantiating objects belonging to your
50 subclass of <tt>Atom</tt> and forming them into sequences using
51 <tt>Sequence.create___()</tt> and <tt>new Union()</tt>.
55 Ultimately you will wind up with an instance of <tt>Union</tt>
56 corresponding to the "start nonterminal" of your grammar. You can
57 then provide this <tt>Union</tt> to the constructor of your
58 <tt>Parser</tt> subclass and invoke the <tt>Parser.parse(Input)</tt>
59 method on the <tt>Input</tt> to be parsed.
63 The result will be a <tt>Forest</tt>, which is an efficient
64 representation of a set of one or more trees that may share subtrees.
68 If the parse was ambiguous, you can use
69 <tt>Forest.expand(HashSet)</tt> to expand the Forest into all the
70 possible trees (there is not yet a stable API for inspecting the
71 <tt>Forest</tt> directly).
75 If the parse was <i>not</i> ambiguous, you can call
76 <tt>Forest.expand1()</tt> to return the single possible parsing as a
77 <tt>Tree</tt>. You would then typically use the methods of the
78 <tt>Tree</tt> class to examine the parse tree.
82 <h2>Guide to the API</h2>
87 <font color=cyan>package</font> <font color=#f0f>edu.berkeley.sbp.misc</font>;<br>
89 <font color=cyan>import</font> <font color=#f0f>edu.berkeley.sbp.*</font>;<br>
91 <font color=cyan>public</font> <font color=cyan>class</font> Demo2 {<br>
93 <font color=cyan>private</font> <font color=cyan>static</font> <font color=orange>Atom</font> <font color=#00f>atom</font>(<font color=orange>char</font> <font color=yellow>c</font>) {<br>
94 <font color=cyan>return</font> <font color=cyan>new</font> edu.berkeley.sbp.chr.<font color=orange>CharAtom</font>(c); }<br>
95 <font color=cyan>private</font> <font color=cyan>static</font> <font color=orange>Atom</font> <font color=#00f>atom</font>(<font color=orange>char</font> <font color=yellow>c1</font>, <font color=orange>char</font> <font color=yellow>c2</font>) {<br>
96 <font color=cyan>return</font> <font color=cyan>new</font> edu.berkeley.sbp.chr.<font color=orange>CharAtom</font>(c1, c2); }<br>
98 <font color=cyan>public</font> <font color=cyan>static</font> <font color=cyan>void</font> <font color=#00f>main</font>(<font color=orange>String[]</font> s) throws Exception {<br>
100 <font color=orange>Union</font> <font color=yellow>expr</font> = <font color=cyan>new</font> <font color=orange>Union</font>(<font color=#0f0>"Expr"</font>);<br>
102 <font color=orange>Element[]</font> <font color=yellow>add</font> <font color=yellow></font> = <font color=cyan>new</font> <font color=orange>Element</font>[] { expr, atom(<font color=#0f0>'+'</font>), expr };<br>
103 <font color=orange>Element[]</font> <font color=yellow>mult</font> <font color=yellow></font> = <font color=cyan>new</font> <font color=orange>Element</font>[] { expr, atom(<font color=#0f0>'*'</font>), expr };<br>
104 <font color=orange>Element[]</font> <font color=yellow>paren</font> = <font color=cyan>new</font> <font color=orange>Element</font>[] { atom(<font color=#0f0>'('</font>), expr, atom(<font color=#0f0>')'</font>) };<br>
106 <font color=orange>Sequence</font> <font color=yellow>addSequence</font> = <font color=orange>Sequence</font>.create(<font color=#0f0>"add"</font>, add, <font color=#f0f>null</font>, <font color=#f0f>false</font>);<br>
107 <font color=orange>Sequence</font> <font color=yellow>multSequence</font> = <font color=orange>Sequence</font>.create(<font color=#0f0>"mult"</font>, mult, <font color=#f0f>null</font>, <font color=#f0f>false</font>);<br>
110 // uncomment this line to disambiguate<br>
111 //multSequence = multSequence.andnot(Sequence.create("add", add, null, false));<br>
114 expr.add(<font color=orange>Sequence</font>.create(paren, 1));<br>
115 expr.add(addSequence);<br>
116 expr.add(multSequence);<br>
117 expr.add(<font color=orange>Sequence</font>.create(atom(<font color=#0f0>'0'</font>, <font color=#0f0>'9'</font>)));<br>
119 <font color=orange>String</font> <font color=yellow>input</font> = <font color=#0f0>"8+(1+3)*7"</font>;<br>
121 System.out.println(<font color=#0f0>"input: \""+input+"\""</font>);<br>
123 <font color=orange>StringBuffer</font> <font color=yellow>sb</font> = <font color=cyan>new</font> <font color=orange>StringBuffer</font>();<br>
124 expr.toString(sb);<br>
125 System.out.println(<font color=#0f0>"grammar: \n"</font>+sb);<br>
127 <font color=orange>Forest</font> <font color=yellow>f</font> = <font color=cyan>new</font> edu.berkeley.sbp.chr.<font color=orange>CharParser</font>(expr).parse(input);<br>
128 System.out.println(<font color=#0f0>"output: "</font>+f.expand1().toPrettyString());<br>
129 }<br>
135 Executing this code gives the following:
139 java -Xmx900m -cp edu.berkeley.sbp.jar edu.berkeley.sbp.misc.Demo2<br>
140 input: "8+(1+3)*7"<br>
142 Expr = [(] Expr [)]<br>
143 | "add":: Expr [+] Expr<br>
144 | "mult":: Expr [*] Expr<br>
145 | [0-9]<br>
147 Exception in thread "main" unresolved ambiguity; shared subtrees are shown as "*"<br>
148 possibility: mult:{add:{* * *} * *}<br>
149 possibility: add:{* * mult:{* * *}}<br>
153 If we uncomment the line in the example, the result is:
157 java -Xmx900m -cp edu.berkeley.sbp.jar edu.berkeley.sbp.misc.Demo2<br>
158 input: "8+(1+3)*7"<br>
160 Expr = [(] Expr [)] <br>
161 | "add":: Expr [+] Expr <br>
162 | "mult":: Expr [*] Expr &~ "add":: Expr [+] Expr <br>
163 | [0-9] <br>
165 output: add:{8 + mult:{add:{1 + 3} * 7}}<br>