TODO

   1 _____________________________________________________________________________
   2 Immediately
   3
   4   - keywordification (ie globally reject from all productions?
   5   - generalized follow-by?)
   6
   7   - clean up util package
   8
   9   - currently we GC the doomed stack when the parent dies... but
  10     [ideally] we should also GC the parent when the doomed stack dies
  11     if it was a positive conjunct.
  12
  13   - single-tree-return assumption
  14      - have a way to let question marks not be tagged?
  15      - "flat" sequences (no subtrees contain "::"s?) -- stringifiable
  16      - make it so that we can have multi-result nonterminals
  17        so long as they always appear under double-colons?
  18        auto-insert the unwrap?
  19
  20   - get rid of Sequence.Singleton if possible
  21
  22   - use 'a'-'z' or 'a-z' instead of [a-z]?
  23   - de-genericize?
  24   - foo.add(x)
  25     foo.add(y.andnot(x)) ==> this is broken
  26       - distinguish Conjunct from Sequence?
  27             => !(Conjunct instanceof Reducible)
  28   - avoid building the parts of the tree that end up getting dropped
  29       - is it worth adding an additional class of states for these?
  30          - or perhaps just a runtime node marker (hasNonDroppedParent)
  31       - "ambiguity modulo dropped fragments"?
  32          - this may conceal highly inefficient grammars...
  33   - double-check all the region logic
  34   - automatically collect time statistics and display
  35
  36 ______________________________________________________________________________
  37 v1.1
  38
  39 - use regression/least-squares/trend-prof to look for reductions whose
  40   behavior is O(n^2)? (ie performed a number of times proportional to
  41   the input consumed so far).
  42
  43 - Optimizations:
  44     - (x &~ y) => (x & z)
  45     - string lookahead
  46     - don't form result forests for negated productions
  47     - early kills: (a &~ ... b ...) -> once "... b" is seen, "a" is dead
  48
  49   - MUST HAVE BETTER ERROR MESSAGES
  50      - use for developing java15.g
  51      - better ambiguity reporting
  52         - colorized tree-diffs?
  53         - graphviz?
  54      - better toString() methods all around...
  55
  56   - Treewalker code compiler?
  57   - detect and reject circular gramars
  58   - skeleton generator?
  59   - precedes restrictions ("<-")
  60   - More topology untangling [later]
  61   - grammar highlighting?
  62   - Forest needs a "manual access" API (lifting makes this hard)
  63   - rewriting language? multiple passes?
  64   - write some grammars
  65       - Java grammar (java15.g)
  66       - TeX (math?)
  67       - RFC2822 (email message/headers)
  68       - Wikipedia grammar (needs to be both lexerless and boolean)
  69
  70
  71 ______________________________________________________________________________
  72 Features
  73
  74   - try harder to "fuse" states together along two dimensions:
  75      - identical (equivalent) states, or states that subsume each other
  76      - unnecessary intermediate states ("short cut" GLR)
  77
  78   - substring parsing
  79       - better error messages
  80       - Rekers & Koorn note that this can do really elegant and generalized "autocompletion".
  81   - Parameterized LR
  82   - "Regular Right Part" grammars (NP Chapman, etc)
  83   - Attribute unification
  84       - Partly-Linear-PATR? (O(n^6) unification grammar)
  85
  86   - optional "prefer whitespace higher up" disambiguation heuristic
  87
  88   - Incremental parse table construction
  89
  90   - "lazy GLR" and "lazy trees" -> language with first-class CF matching
  91       - perhaps linear boolean grammars instead? (linear time, quad space)
  92
  93   - Followed-by and not-followed-by predicates of arbitrary length
  94       - expands the grammar beyond Boolean LR...
  95       - requires *very* smart garbage collection
  96
  97 ______________________________________________________________________________
  98 Optimizations
  99
 100   - understand and implement the RNGLR "kernel state" optimization.
 101     The _Practical Early Parsing_ paper may help.
 102
 103   - implement Johnstone's algorithm for "reduced, resolved LR
 104     tables" to eliminate superfluous reductions on
 105     epsilon-transitions.
 106
 107   - Implement a k-token peek buffer (for each state, see if it "dead
 108     ends" during the next k Phases based solely on state -- ignoring
 109     result SPPF)
 110
 111   - Is there any way we can avoid creating a GSS.Node instance for
 112     nodes which are transient in the sense that they have only one
 113     eligible reduction?
 114
 115   - Re-read Rekers, particularly the stuff on optimal sharing
 116
 117   - bring back in parse-table phase resolution of precedence (just
 118     like associativity).  This can be inferred from the use of ">"
 119     when the rules are in one of these special forms:
 120
 121        E ::=  E     _
 122            >  _     E
 123
 124        E ::=  _     E
 125            >  E  _  E
 126
 127        E ::=  E  _  E
 128            >  E  _  E
 129
 130     where "_" is anything and "E" is the defining nonterminal.
 131     Essentially what we're looking for is the situation where the
 132     leftmost portion of one rule produces another rule, and the
 133     rightmost portion of the latter produces the former.
 134
 135     I'm not 100% certain that this is as "strong" as the prefer/avoid
 136     form (try to prove this, you probably can), but it's "what people
 137     intend" most of the time.
 138