From: adam Date: Sun, 27 May 2007 20:41:54 +0000 (-0400) Subject: cleanup of doc/ subdirectory X-Git-Url: http://git.megacz.com/?p=sbp.git;a=commitdiff_plain;h=a8482cb26663a5842b90f940b0c7b4c5ec23fadb;ds=sidebyside cleanup of doc/ subdirectory darcs-hash:20070527204154-5007d-b5949d2a049c6823c846f421cccd3dba574c8db8.gz --- diff --git a/doc/osq.lunch.talk.pdf b/doc/osq.lunch.talk.pdf deleted file mode 100644 index 31ca8d2..0000000 Binary files a/doc/osq.lunch.talk.pdf and /dev/null differ diff --git a/doc/preprint.pdf b/doc/preprint.pdf deleted file mode 100644 index 6600afd..0000000 Binary files a/doc/preprint.pdf and /dev/null differ diff --git a/doc/sbp.html b/doc/sbp.html deleted file mode 100644 index fa933e6..0000000 --- a/doc/sbp.html +++ /dev/null @@ -1,306 +0,0 @@ - -The Scannerless Boolean Parser (SBP) - - - -
- -
- -SBP: the Scannerless Boolean Parser -
- -

-

-16-Aug: A new snapshot is here. -
-

- -

-
-
-

Update: [29-July-2006]

- -

- - - -Error handling has been massively improved. Here's an example parsing -from a substantial portion of the Java 1.5 -grammar. The input is missing a -closing angle-bracket on a generic type definition. Click on the -image to view full size. Type -make java15 after a checkout to try it yourself. -

- -
-
- -

-
-
-Update: [22-July-2006]

- -The API has been finalized and includes a decent example/mini-tutorial. -
-
-
- -

-
-
- -Update: [17-July-2006]

- -There is now a mailing list. - - -
-
- -

-
-
-Update: [05-July-2006]

- -The reflective grammar-to-java bindings are complete, so SBP is now -vastly easier to use. You can find example code here -and the companion grammar here. - -
-
- -

What is it?

- -The Scannerless Boolean Parser (SBP) is a scannerless parser for boolean -grammars (a superset of context-free grammars). It is written in -Java and emits Java source code. - -

What is interesting about it?

- -SBP deliberately sacrifices performance in favor of ease of extensibility. -

- -Since it is an implementation of the (modified) Lang-Tomita -GLR algorithm, SBP supports all context-free languages. -

- -It is scannerless -(does not require a lexer). This allows it to easily handle languages -which have non-regular lexical structure or lack a clear lexer-parser -distinction, such as TeX, XML, RFC1738 (URLs), ASN.1, SMTP headers, -and Wiki markup. -

- -In addition to the juxtaposition and union operators provided in -context-free languages, SBP supports grammars which use the -intersection operator (conjunctive -grammars) and the complement operator (boolean -grammars). - -

What features does it have?

- -Features fully implemented are in green; -those partially implemented are in orange; -those unimplemented (but planned) are in red. - -
  • An implementation of the Lang-Tomita GLR parsing algorithm -
      -
    • Including Johnstone & Scott's RNGLR algorithm for epsilon-productions - -
    • Visser's extensions - for scannerless parsing -
      • Follow, Avoid, Prefer, Reject constraints -
      • Character ranges -
      • Automatic insertion of whitespace/comments -
      - -
    • Any topological space can be - used as an alphabet (need not be discrete) -
      • Unicode -
      • Trees -
      - -
    • Associativity constraints on n-ary operators - -
    - -
  • Ability to parse a wide variety of grammars in - O(n3) time: - -
      -
    • all context-free grammars - -
    • epsilon productions, included in the parse forest - -
    • circularities, included in the parse forest. - -
    • Regular expression operators ( - *, - ?, - + - ) - -
    • conjunctive grammars - (intersection operator) - -
    • boolean grammars (intersection, intersect-with-complement, and - generalized-complement) -
    - - -
  • Facilitates experimenting with grammars - -
      -
    • Interpreted mode, in which the - parse table is interpreted directly, eliminating the - need for a compiler and making it easier for grammars - to operate on grammars. - -
    • Simple - API - makes it easy to generate, analyze, and modify grammars - programmatically. - -
        -
      • Components of a grammar (nonterminals, - productions, etc) represented as objects -
      • composite elements implement Iterable<T> -
      - -
    • Compiled mode, in which Java - source code is emitted; compiling this code yields a - parser. The resulting parser is much faster. -
    - - -
- -

What is it deliberately missing?

- -
  • Semantic actions; the only option is to return a parse forest. -
      -
    • This keeps the grammar specification language-neutral. -
    • A grammar can, however, indicate that certain parts of the parse tree should be dropped. -
    -
- -

What features would be nice to have?

- -
    -
  • Drop Farshi's algorithm and use GRMLR. - Done! - -
  • An implementation of the McPeak-Necula - optimization for bounded-depth determinism. - -
  • Lazy parse trees, to decrease the space requirements from - o(n) to o(1) [but still O(n)]. - -
  • Consider implementing - Aycock-Horspool unrolling. Improves performance with - only highly localized increase in algorithmic complexity. - Subsumes many other optimizations. - -
- -

What are the long term goals?

- -As we come to a more mature understanding of the pragmatic aspects of -boolean grammars, a long-term goal is to migrate support for these -features to existing high-performance GLR implementations (Elkhound, bison-glr). - -

Where can I read more about it?

- -
  • The README file is the best place to start -
  • After that, be sure to read jargon.txt -
  • The javadoc - is the best description of the API -
  • There's a tentative metagrammar, - written in itself. -
  • You can also get slides - from my talk at the OSQ Lunch on 02-Nov-2005, though some of - the stuff (specifically what SBP can and cannot do) is - outdated. -
  • A preprint of one of my conference - submissions. -
- -

Where can I get it?

- -The color coding above accurately reflects the state of the -implementation (11-Dec-2005). However, in its current state it is a -bit messy, and may require a bit of fiddling to get it to do what you -want. This situation should improve in the next few weeks as I am -done adding features (for now) and am currently focusing on -reliability, cleanliness, and performance. -

- -SBP is available under the BSD license. -

- -You can download a snapshot (11-Dec-2005) here. The parser-generator -requires Java 1.5 or later; the Java code it emits should run on any Java 1.1+ JVM. After unpacking -the archive, simply type make to compile SBP and run the -regression tests. - - - -

- -