2 <p style="border: 1px red solid; width: 50%; padding: 10px; background-color: white; margin-left: auto; margin-right: auto">
4 The public APIs in this package are <b>stable</b>; package-private
5 APIs and all other packages are subject to change in future
6 releases.<br><br>Be sure to read <a
7 href=../../../../jargon.txt>doc/jargon.txt</a> and the <a
8 href=#package_description>description</a> below; there is also a
9 <a href=#faq>faq</a> a the end of this document.
14 This package forms the stable core of the SBP API Classes fall into
17 <ul> <li> <font color=green>Elements of the grammar</font> -- the
18 pieces from which a grammar is composed.
20 <li> <font color=purple>Input, Location, and Region</font> -- the
21 input to be parsed, as well as classes for describing
22 locations and regions of that input.
24 <li> Parser -- the engine that actually performs the parsing
27 <li> <font color=blue>Trees and Forests</font> -- used to
28 represent the output of the parsing process.
34 <h2>Theory of Operation</h2>
38 The input that you parse is considered to be a stream of
39 <tt>Tokens</tt>; this stream is represented by an
40 <tt>Input<Token></tt>. In order to create this <tt>Input</tt>,
41 you must first decide what kind of tokens you want to parse. Based on
42 this decision, you should then implement subclasses of <tt>Input</tt>,
43 <tt>Parser</tt>, and <tt>Atom</tt> for that token type. If you are
44 parsing characters (which you usually are), these subclasses are
45 provided in the <tt>edu.berkeley.sbp.chr.*</tt> package so you don't
46 have to write them yourself.
50 You then create a grammar by instantiating objects belonging to your
51 subclass of <tt>Atom</tt> and forming them into sequences using
52 <tt>Sequence.create___()</tt> and <tt>new Union()</tt>.
56 Ultimately you will wind up with an instance of <tt>Union</tt>
57 corresponding to the "start nonterminal" of your grammar. You can
58 then provide this <tt>Union</tt> to the constructor of your
59 <tt>Parser</tt> subclass and invoke the <tt>Parser.parse(Input)</tt>
60 method on the <tt>Input</tt> to be parsed.
64 The result will be a <tt>Forest</tt>, which is an efficient
65 representation of a set of one or more trees that may share subtrees.
69 If the parse was ambiguous, you can use
70 <tt>Forest.expand(HashSet)</tt> to expand the Forest into all the
71 possible trees (there is not yet a stable API for inspecting the
72 <tt>Forest</tt> directly).
76 If the parse was <i>not</i> ambiguous, you can call
77 <tt>Forest.expand1()</tt> to return the single possible parsing as a
78 <tt>Tree</tt>. You would then typically use the methods of the
79 <tt>Tree</tt> class to examine the parse tree.
84 <h2>Guide to the API</h2>
89 <font color=cyan>package</font> <font color=#f0f>edu.berkeley.sbp.misc</font>;<br>
91 <font color=cyan>import</font> <font color=#f0f>edu.berkeley.sbp.*</font>;<br>
93 <font color=cyan>public</font> <font color=cyan>class</font> Demo2 {<br>
95 <font color=cyan>private</font> <font color=cyan>static</font> <font color=orange>Atom</font> <font color=#00f>atom</font>(<font color=orange>char</font> <font color=yellow>c</font>) {<br>
96 <font color=cyan>return</font> <font color=cyan>new</font> edu.berkeley.sbp.chr.<font color=orange>CharAtom</font>(c); }<br>
97 <font color=cyan>private</font> <font color=cyan>static</font> <font color=orange>Atom</font> <font color=#00f>atom</font>(<font color=orange>char</font> <font color=yellow>c1</font>, <font color=orange>char</font> <font color=yellow>c2</font>) {<br>
98 <font color=cyan>return</font> <font color=cyan>new</font> edu.berkeley.sbp.chr.<font color=orange>CharAtom</font>(c1, c2); }<br>
100 <font color=cyan>public</font> <font color=cyan>static</font> <font color=cyan>void</font> <font color=#00f>main</font>(<font color=orange>String[]</font> s) throws Exception {<br>
102 <font color=orange>Union</font> <font color=yellow>expr</font> = <font color=cyan>new</font> <font color=orange>Union</font>(<font color=#0f0>"Expr"</font>);<br>
104 <font color=orange>Element[]</font> <font color=yellow>add</font> <font color=yellow></font> = <font color=cyan>new</font> <font color=orange>Element</font>[] { expr, atom(<font color=#0f0>'+'</font>), expr };<br>
105 <font color=orange>Element[]</font> <font color=yellow>mult</font> <font color=yellow></font> = <font color=cyan>new</font> <font color=orange>Element</font>[] { expr, atom(<font color=#0f0>'*'</font>), expr };<br>
106 <font color=orange>Element[]</font> <font color=yellow>paren</font> = <font color=cyan>new</font> <font color=orange>Element</font>[] { atom(<font color=#0f0>'('</font>), expr, atom(<font color=#0f0>')'</font>) };<br>
108 <font color=orange>Sequence</font> <font color=yellow>addSequence</font> = <font color=orange>Sequence</font>.create(<font color=#0f0>"add"</font>, add, <font color=#f0f>null</font>, <font color=#f0f>false</font>);<br>
109 <font color=orange>Sequence</font> <font color=yellow>multSequence</font> = <font color=orange>Sequence</font>.create(<font color=#0f0>"mult"</font>, mult, <font color=#f0f>null</font>, <font color=#f0f>false</font>);<br>
112 // uncomment this line to disambiguate<br>
113 //multSequence = multSequence.andnot(Sequence.create("add", add, null, false));<br>
116 expr.add(<font color=orange>Sequence</font>.create(paren, 1));<br>
117 expr.add(addSequence);<br>
118 expr.add(multSequence);<br>
119 expr.add(<font color=orange>Sequence</font>.create(atom(<font color=#0f0>'0'</font>, <font color=#0f0>'9'</font>)));<br>
121 <font color=orange>String</font> <font color=yellow>input</font> = <font color=#0f0>"8+(1+3)*7"</font>;<br>
123 System.out.println(<font color=#0f0>"input: \""+input+"\""</font>);<br>
125 <font color=orange>StringBuffer</font> <font color=yellow>sb</font> = <font color=cyan>new</font> <font color=orange>StringBuffer</font>();<br>
126 expr.toString(sb);<br>
127 System.out.println(<font color=#0f0>"grammar: \n"</font>+sb);<br>
129 <font color=orange>Forest</font> <font color=yellow>f</font> = <font color=cyan>new</font> edu.berkeley.sbp.chr.<font color=orange>CharParser</font>(expr).parse(input);<br>
130 System.out.println(<font color=#0f0>"output: "</font>+f.expand1().toPrettyString());<br>
131 }<br>
137 Executing this code gives the following:
141 java -Xmx900m -cp edu.berkeley.sbp.jar edu.berkeley.sbp.misc.Demo2<br>
142 input: "8+(1+3)*7"<br>
144 Expr = [(] Expr [)]<br>
145 | "add":: Expr [+] Expr<br>
146 | "mult":: Expr [*] Expr<br>
147 | [0-9]<br>
149 Exception in thread "main" unresolved ambiguity; shared subtrees are shown as "*"<br>
150 possibility: mult:{add:{* * *} * *}<br>
151 possibility: add:{* * mult:{* * *}}<br>
155 If we uncomment the line in the example, the result is:
159 java -Xmx900m -cp edu.berkeley.sbp.jar edu.berkeley.sbp.misc.Demo2<br>
160 input: "8+(1+3)*7"<br>
162 Expr = [(] Expr [)] <br>
163 | "add":: Expr [+] Expr <br>
164 | "mult":: Expr [*] Expr &~ "add":: Expr [+] Expr <br>
165 | [0-9] <br>
167 output: add:{8 + mult:{add:{1 + 3} * 7}}<br>
175 <b>I get the error <tt>java.lang.Error: multiple non-dropped elements
176 in sequence</tt>, what does this mean?</b>
180 <font color=red><b>Note</b></font>: this question deals with the
181 package <tt>edu.berkeley.sbp.<b>meta</b></tt>, which is not considered
186 When using the class <tt>edu.berkeley.sbp.meta.Grammar</tt>, you must
187 supply an instance of <tt>Grammar.Bindings</tt>; this instance tells
188 SBP how to create a parse tree for an expression using the parse trees
189 of its subexpressions.
193 SBP has no trouble determining what to do when parsing an expression
194 that drops all of its subexpressions, or all but one -- for example:
202 ... in this example, only <tt>C</tt> is "non-dropped". In this case,
203 the result of parsing <tt>A</tt> is simply the result of parsing
208 However, if we were to leave more than one element un-dropped, SBP
209 needs to know how to form a single tree out of the two non-dropped
210 subtrees. There are two ways to do this. The simplest is to provide
211 a tag -- a string which becomes the common parent of the two subtrees:
215 Expr = <b>Mult::</b> Expr "*" Expr
219 If you are using <tt>AnnotationGrammarBindings</tt>, you can also deal
220 with this situation by declaring a method/inner-class whose name
221 matches the nonterminal (<tt>Expr</tt>) and has appropriate
222 annotations. This is fairly advanced stuff, and the code it uses
223 isn't quite as mature as the rest of the code.
227 <h2>Reporting Bugs</h2>
231 Bug reports are especially appreciated when you submit them as a test
233 <a href=../../../../../tests/testcase.g>grammar</a> and some
234 <a href=../../../../../tests/regression.tc>examples</a>).
236 This way we can add your bug report as part of the regression suite,
237 and be sure we never release a new version in which the bug has crept
243 For now, please send bug reports to <a
244 href=http://research.cs.berkeley.edu/project/sbp/list/>the mailing