The public APIs in this package are stable; package-private APIs and all other packages are subject to change in future releases.

Be sure to read doc/jargon.txt and the description below; there is also a FAQ at the end of this document.

This package forms the stable core of the SBP API. Classes fall into five categories:

Theory of Operation

The input that you parse is considered to be a stream of Tokens; this stream is represented by an Input<Token>. In order to create this Input, you must first decide what kind of tokens you want to parse. Based on this decision, you should then implement subclasses of Input, Parser, and Atom for that token type. If you are parsing characters (which you usually are), these subclasses are provided in the edu.berkeley.sbp.chr.* package so you don't have to write them yourself.

You then create a grammar by instantiating objects belonging to your subclass of Atom and forming them into sequences using Sequence.create___() and new Union().

Ultimately you will wind up with an instance of Union corresponding to the "start nonterminal" of your grammar. You can then provide this Union to the constructor of your Parser subclass and invoke the Parser.parse(Input) method on the Input to be parsed.

The result will be a Forest, which is an efficient representation of a set of one or more trees that may share subtrees.

If the parse was ambiguous, you can use Forest.expand(HashSet) to expand the Forest into all the possible trees (there is not yet a stable API for inspecting the Forest directly).

If the parse was not ambiguous, you can call Forest.expand1() to return the single possible parsing as a Tree. You would then typically use the methods of the Tree class to examine the parse tree.
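To make the idea of a forest with shared subtrees more concrete, here is a minimal, self-contained sketch. It does not use SBP at all: ToyForest, Node, Alt, expandAll, expandOne, and ambiguousExample are hypothetical names invented for this illustration, and the real Forest class may be organized quite differently. The sketch only shows how a few shared nodes can stand for several trees, and why asking for "the" tree fails when more than one exists.

```java
import java.util.ArrayList;
import java.util.List;

// Toy illustration only; none of these names are part of the SBP API.
public class ToyForest {

    /** One way of deriving a node: a tag plus an ordered list of children. */
    static final class Alt {
        final String tag;
        final List<Node> children;
        Alt(String tag, Node... children) { this.tag = tag; this.children = List.of(children); }
    }

    /** Either a leaf (text != null) or a choice among one or more alternatives. */
    static final class Node {
        final String text;
        final List<Alt> alts;
        private Node(String text, List<Alt> alts) { this.text = text; this.alts = alts; }
        static Node leaf(String text) { return new Node(text, List.of()); }
        static Node choice(Alt... alts) { return new Node(null, List.of(alts)); }
    }

    /** Expands a node into every tree it stands for, rendered as strings. */
    static List<String> expandAll(Node n) {
        if (n.text != null) return List.of(n.text);
        List<String> out = new ArrayList<>();
        for (Alt a : n.alts) {
            // Cartesian product of the expansions of each child, left to right.
            List<String> partial = new ArrayList<>(List.of(""));
            for (Node child : a.children) {
                List<String> next = new ArrayList<>();
                for (String prefix : partial)
                    for (String s : expandAll(child))
                        next.add(prefix.isEmpty() ? s : prefix + " " + s);
                partial = next;
            }
            for (String body : partial) out.add(a.tag + ":{" + body + "}");
        }
        return out;
    }

    /** Returns the single tree, or fails if the node is ambiguous. */
    static String expandOne(Node n) {
        List<String> trees = expandAll(n);
        if (trees.size() != 1)
            throw new IllegalStateException("unresolved ambiguity: " + trees.size() + " trees");
        return trees.get(0);
    }

    /** Hand-built forest for "8+(1+3)*7"; the leaves and the group node are shared. */
    static Node ambiguousExample() {
        Node eight = Node.leaf("8"), plus = Node.leaf("+"), star = Node.leaf("*"), seven = Node.leaf("7");
        Node group = Node.choice(new Alt("add", Node.leaf("1"), plus, Node.leaf("3")));
        Node left  = Node.choice(new Alt("add", eight, plus, group));
        Node right = Node.choice(new Alt("mult", group, star, seven));
        return Node.choice(new Alt("mult", left, star, seven), new Alt("add", eight, plus, right));
    }

    public static void main(String[] args) {
        for (String tree : expandAll(ambiguousExample())) System.out.println(tree);
    }
}
```

In this toy, the root node holds two alternatives that reuse the same leaf and group objects, so the two trees cost only a handful of extra nodes. Calling expandOne on the root throws, which loosely mirrors the "unresolved ambiguity" failure described for ambiguous parses.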

Guide to the API

Example

package edu.berkeley.sbp.misc;

import edu.berkeley.sbp.*;

public class Demo2 {

    private static Atom atom(char c) {
        return new edu.berkeley.sbp.chr.CharAtom(c); }
    private static Atom atom(char c1, char c2) {
        return new edu.berkeley.sbp.chr.CharAtom(c1, c2); }

    public static void main(String[] s) throws Exception {

        Union expr = new Union("Expr");

        Element[] add   = new Element[] { expr, atom('+'), expr };
        Element[] mult  = new Element[] { expr, atom('*'), expr };
        Element[] paren = new Element[] { atom('('), expr, atom(')') };

        Sequence addSequence = Sequence.create("add", add, null, false);
        Sequence multSequence = Sequence.create("mult", mult, null, false);

        // uncomment this line to disambiguate
        //multSequence = multSequence.andnot(Sequence.create("add", add, null, false));

        expr.add(Sequence.create(paren, 1));
        expr.add(addSequence);
        expr.add(multSequence);
        expr.add(Sequence.create(atom('0', '9')));

        String input = "8+(1+3)*7";

        System.out.println("input:  \""+input+"\"");

        StringBuffer sb = new StringBuffer();
        expr.toString(sb);
        System.out.println("grammar: \n"+sb);

        Forest f = new edu.berkeley.sbp.chr.CharParser(expr).parse(input);
        System.out.println("output: "+f.expand1().toPrettyString());
    }

}

java -Xmx900m -cp edu.berkeley.sbp.jar edu.berkeley.sbp.misc.Demo2
input:  "8+(1+3)*7"
grammar:
Expr            = [(] Expr [)]
                | "add":: Expr [+] Expr
                | "mult":: Expr [*] Expr
                | [0-9]

Exception in thread "main" unresolved ambiguity; shared subtrees are shown as "*"
  possibility: mult:{add:{* * *} * *}
  possibility: add:{* * mult:{* * *}}

java -Xmx900m -cp edu.berkeley.sbp.jar edu.berkeley.sbp.misc.Demo2
input:  "8+(1+3)*7"
grammar:
Expr            = [(] Expr [)]
                | "add":: Expr [+] Expr
                | "mult":: Expr [*] Expr &~ "add":: Expr [+] Expr
                | [0-9]

output: add:{8 + mult:{add:{1 + 3} * 7}}
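The effect of the andnot line can be imitated without SBP. The sketch below is a toy enumerator for the same four-rule grammar, written under one assumption that the example output suggests but that this document does not spell out: a "mult" is accepted for a span of input only when that same span cannot also be read as an "add". ToyAndNot, parses, and the multAndNotAdd flag are hypothetical names for this illustration and say nothing about how SBP implements the operator.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy illustration only; this is not how SBP parses, and none of these names are SBP API.
public class ToyAndNot {
    private final String in;
    private final boolean multAndNotAdd;
    private final Map<Integer, List<String>> memo = new HashMap<>();

    private ToyAndNot(String in, boolean multAndNotAdd) {
        this.in = in;
        this.multAndNotAdd = multAndNotAdd;
    }

    /** All trees for the whole input, rendered as strings. */
    static List<String> parses(String input, boolean multAndNotAdd) {
        return new ToyAndNot(input, multAndNotAdd).expr(0, input.length());
    }

    /** Every way to read in[i,j) as "Expr op Expr", labelled with the given tag. */
    private List<String> binary(String tag, char op, int i, int j) {
        List<String> out = new ArrayList<>();
        for (int k = i + 1; k < j - 1; k++) {
            if (in.charAt(k) != op) continue;
            for (String l : expr(i, k))
                for (String r : expr(k + 1, j))
                    out.add(tag + ":{" + l + " " + op + " " + r + "}");
        }
        return out;
    }

    /** Every way to read in[i,j) as an Expr; results are memoized per span. */
    private List<String> expr(int i, int j) {
        int key = i * (in.length() + 1) + j;
        List<String> cached = memo.get(key);
        if (cached != null) return cached;
        List<String> out = new ArrayList<>();
        // Expr = [0-9]
        if (j - i == 1 && in.charAt(i) >= '0' && in.charAt(i) <= '9') out.add(in.substring(i, j));
        // Expr = [(] Expr [)], keeping only the inner Expr
        if (j - i >= 3 && in.charAt(i) == '(' && in.charAt(j - 1) == ')')
            out.addAll(expr(i + 1, j - 1));
        // Expr = "add":: Expr [+] Expr
        List<String> adds = binary("add", '+', i, j);
        out.addAll(adds);
        // Expr = "mult":: Expr [*] Expr, optionally rejected when the span is also an "add"
        if (!multAndNotAdd || adds.isEmpty()) out.addAll(binary("mult", '*', i, j));
        memo.put(key, out);
        return out;
    }

    public static void main(String[] args) {
        System.out.println(parses("8+(1+3)*7", false));
        System.out.println(parses("8+(1+3)*7", true));
    }
}
```

With the flag off, the toy finds two trees for "8+(1+3)*7", one rooted at "mult" and one at "add". With the flag on, the whole input can be read as an "add", so the top-level "mult" reading is discarded, while "(1+3)*7" has no "add" reading and is still accepted as a "mult".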

FAQs

I get the error java.lang.Error: multiple non-dropped elements in sequence. What does this mean?

Note: this question deals with the package edu.berkeley.sbp.meta, which is not considered stable.

When using the class edu.berkeley.sbp.meta.Grammar, you must supply an instance of Grammar.Bindings; this instance tells SBP how to create a parse tree for an expression using the parse trees of its subexpressions.

SBP has no trouble determining what to do when parsing an expression that drops all of its subexpressions, or all but one -- for example:

... in this example, only C is "non-dropped". In this case, the result of parsing A is simply the result of parsing C.

However, if we were to leave more than one element un-dropped, SBP needs to know how to form a single tree out of the two non-dropped subtrees. There are two ways to do this. The simplest is to provide a tag -- a string which becomes the common parent of the two subtrees:

Expr = Mult:: Expr "*" Expr

If you are using AnnotationGrammarBindings, you can also deal with this situation by declaring a method/inner-class whose name matches the nonterminal (Expr) and which has appropriate annotations. This is fairly advanced stuff, and the code it uses isn't quite as mature as the rest of the code.

Reporting Bugs

Bug reports are especially appreciated when you submit them as a test case (here's the grammar and some examples). This way we can add your bug report as part of the regression suite, and be sure we never release a new version in which the bug has crept back in!