We found serious problems with ambiguities in Grammatica. Because Grammatica does not resolve ambiguous productions, we rewrote our grammar file to reduce the time the parser generator takes, from 30 minutes to less than 1 second, at some cost in readability. The known issues listed below are responsible for many of the changes.

Parser: Some grammar ambiguities may go undetected

When the last element of a production is optional, some ambiguities may go undetected. This happens because the tokens in the optional element are not properly compared with the tokens that can follow the production. The ambiguous grammar A = B ["x"]; B = "y" ["x"]; illustrates this. In some cases this may also lead to parse errors, because the ambiguities have not been resolved with the appropriate number of look-ahead tokens. This error is present in all versions of Grammatica, but occurs infrequently. (Bug #4117)
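To make the ambiguity concrete, the following minimal sketch enumerates every way the input "y x" can be matched against the example grammar. It is a hand-written enumeration for this one grammar, not Grammatica code; the function names and the tuple format are assumptions made for illustration.

```python
def parses_of_B(tokens, pos):
    """Yield (end_pos, b_took_x) for each way B = "y" ["x"] matches at pos."""
    if pos < len(tokens) and tokens[pos] == "y":
        # Alternative 1: B skips its optional "x".
        yield pos + 1, False
        # Alternative 2: B consumes the optional "x" if one follows.
        if pos + 1 < len(tokens) and tokens[pos + 1] == "x":
            yield pos + 2, True


def parses_of_A(tokens):
    """Return all full parses of A = B ["x"] as (b_took_x, a_took_x) pairs."""
    results = []
    for end, b_took_x in parses_of_B(tokens, 0):
        if end == len(tokens):
            # A skips its own optional "x".
            results.append((b_took_x, False))
        elif end + 1 == len(tokens) and tokens[end] == "x":
            # A consumes the trailing "x" itself.
            results.append((b_took_x, True))
    return results


if __name__ == "__main__":
    for parse in parses_of_A(["y", "x"]):
        print(parse)
```

For the input "y x" the enumeration finds two distinct parses: one where the "x" belongs to B and one where it belongs to A. That is exactly the conflict the ambiguity check should report for this grammar.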

Grammar: Production representation should be improved

The internal representation of a production makes it hard for several alternatives to share their leading (left-hand side) elements. This may cause inherent ambiguities to be reported, because the number of look-ahead tokens needed to separate the alternatives is infinite. If these alternatives could share their first elements, the ambiguity would not require a rewrite of the grammar. (Bug #4322)
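The manual rewrite that this limitation currently forces is usually a left factoring of the grammar: the shared leading elements are pulled out, and the differing tails move into a new production. The sketch below shows the idea on a grammar represented as plain Python lists. The representation, the function names, and the "_tail" naming scheme are all hypothetical and have nothing to do with Grammatica's internal data structures.

```python
def common_prefix(alternatives):
    """Return the longest list of symbols that starts every alternative."""
    prefix = []
    for symbols in zip(*alternatives):
        if all(s == symbols[0] for s in symbols):
            prefix.append(symbols[0])
        else:
            break
    return prefix


def left_factor(name, alternatives):
    """Left-factor one production; returns a dict of resulting productions.

    If the alternatives share no leading symbols, the production is
    returned unchanged.
    """
    prefix = common_prefix(alternatives)
    if len(alternatives) < 2 or not prefix:
        return {name: alternatives}
    tail_name = name + "_tail"  # hypothetical naming scheme
    tails = [alt[len(prefix):] for alt in alternatives]
    return {name: [prefix + [tail_name]], tail_name: tails}


if __name__ == "__main__":
    # Stmt = Expr "=" Expr | Expr ";"
    factored = left_factor("Stmt", [["Expr", "=", "Expr"], ["Expr", ";"]])
    for production, alts in factored.items():
        print(production, "=", alts)
```

After factoring, the parser only has to choose between the alternatives once the shared prefix has been consumed, so a single look-ahead token is enough even when the prefix can be arbitrarily long.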

Grammar: Identical productions should be unified

Identical synthetic productions are not identified as such. Instead, a new synthetic production is added in each case. This will cause problems for LR parsers, and is generally inefficient. Identical synthetic productions should be unified.
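One common way to unify structurally identical productions is to key them by their contents, so that a second request for the same body returns the existing production instead of creating a new one. The sketch below illustrates that technique in general terms. The class, its method, and the "Synthetic" naming scheme are hypothetical and are not part of Grammatica.

```python
class ProductionTable:
    """Creates synthetic productions, reusing any with an identical body."""

    def __init__(self):
        self._by_body = {}     # body (tuple of tuples) -> production name
        self.productions = {}  # production name -> body

    def synthetic(self, alternatives):
        """Return the name of a synthetic production with these alternatives."""
        # Convert to nested tuples so the body is hashable and order-sensitive.
        key = tuple(tuple(alt) for alt in alternatives)
        name = self._by_body.get(key)
        if name is None:
            name = f"Synthetic{len(self._by_body) + 1}"
            self._by_body[key] = name
            self.productions[name] = key
        return name


if __name__ == "__main__":
    table = ProductionTable()
    first = table.synthetic([["x"], []])   # optional "x"
    second = table.synthetic([["x"], []])  # same body, reused
    other = table.synthetic([["y"]])
    print(first, second, other, len(table.productions))
```

This sketch only treats productions as identical when their alternatives appear in the same order. Whether reordered alternatives should also be unified depends on how the parser resolves conflicts between them.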

Tokenizer: Improve processing speed

There are probably still substantial parsing speed improvements to gain by improving the tokenizer performance. All the obvious optimizations have already been done, however. The next step is probably to create a DFA. (Bug #3603)
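As an illustration of what a DFA-based tokenizer looks like, the sketch below drives a small hand-written transition table and always emits the longest match. The token set (IDENT and NUMBER), the state names, and the table itself are assumptions chosen for the example; a real implementation would generate the table from the token definitions in the grammar file.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    if ch.isalpha():
        return "letter"
    if ch.isdigit():
        return "digit"
    return "other"


# Hypothetical DFA: IDENT = letter (letter | digit)*, NUMBER = digit+
TRANSITIONS = {
    ("start", "letter"): "ident",
    ("ident", "letter"): "ident",
    ("ident", "digit"): "ident",
    ("start", "digit"): "number",
    ("number", "digit"): "number",
}
ACCEPTING = {"ident": "IDENT", "number": "NUMBER"}


def tokenize(text):
    """Split text into (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        state, end, last_accept = "start", pos, None
        # Run the DFA as far as possible, remembering the last accepting point.
        while end < len(text):
            state = TRANSITIONS.get((state, char_class(text[end])))
            if state is None:
                break
            end += 1
            if state in ACCEPTING:
                last_accept = (ACCEPTING[state], end)
        if last_accept is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        kind, stop = last_accept
        tokens.append((kind, text[pos:stop]))
        pos = stop
    return tokens


if __name__ == "__main__":
    print(tokenize("abc1 42 x"))
```

Each input character costs one table lookup regardless of how many token patterns exist, which is where the speed-up over testing each pattern in turn would come from.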

For the full list of open Grammatica issues, see: http://grammatica.percederberg.net/doc/release/bugs.html

Last edited Feb 7, 2008 at 7:17 PM by vfpamp, version 4

