I'm not sure whether this is a problem with the Nearley.js library, the Moo tokenizer/lexer, or my own code, so I might need to submit this as an issue to the Nearley repo. All the referenced files can be found in this Gist.
I am attempting to write a Nearley grammar that will parse a list of homework problems for one of my classes. The problems are in problems.txt and look like this:
Section 5.2 (Due 4/23)- #3, 5*, 8*, 9, 11, 14*, 15, 17*, 18*, 20, 21*, 22*, 24*, 25 (see example 5, not discussed in class)
Section 5.3 (Due 4/30)- #1, 3*, 4, 5, 6*, 7, 9*, 11, 13*, 16, 20*, 21*, 22*, 23, 24*, 25*, 27, 28*, 31, 32
That's just two lines as an example; the whole file is larger.
The Nearley grammar I wrote is in problems-grammar.ne here, and it isn't entirely finished yet. I'm using the Moo tokenizer/lexer according to these instructions in the Nearley docs.
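To give a rough idea of the tokenizing step without opening the Gist, here is a minimal sketch of the kind of token rules involved. It is not the lexer from problems-grammar.ne: plain JavaScript sticky regular expressions stand in for Moo so the snippet runs without dependencies, and the token names (`keyword`, `section`, `date`, `number`, `note`, `punct`) and the `tokenize` helper are hypothetical names chosen for illustration.

```javascript
// Hypothetical token rules for lines like the ones in problems.txt.
// Names and patterns are illustrative assumptions, not the real grammar.
// Order matters: more specific patterns (date, section) come before number.
const rules = [
  ["ws", /[ \t]+/y],
  ["newline", /\r?\n/y],
  ["date", /\d{1,2}\/\d{1,2}/y],      // e.g. 4/23
  ["section", /\d+\.\d+/y],           // e.g. 5.2
  ["number", /\d+/y],                 // e.g. 14
  ["note", /\((?!Due )[^)\n]*\)/y],   // parenthesized remark that is not "(Due ...)"
  ["keyword", /Section|Due/y],
  ["punct", /[()#,*-]/y],
];

// Walk the input left to right, trying each rule at the current offset.
function tokenize(input) {
  const tokens = [];
  let pos = 0;
  while (pos < input.length) {
    let matched = false;
    for (const [type, regex] of rules) {
      regex.lastIndex = pos; // sticky flag: match only at this exact offset
      const m = regex.exec(input);
      if (m) {
        if (type !== "ws") tokens.push({ type, value: m[0] }); // drop whitespace
        pos = regex.lastIndex;
        matched = true;
        break;
      }
    }
    if (!matched) {
      throw new Error(`Unexpected character at offset ${pos}: ${input[pos]}`);
    }
  }
  return tokens;
}

console.log(tokenize("Section 5.2 (Due 4/23)- #3, 5*"));
```

The point of the sketch is only that most of the interesting content of each line (section numbers, dates, problem numbers) is matched by a pattern rather than by a fixed string.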
I'm currently testing my grammar with the nearley-unparse command, as explained here, using the following invocation, where problems-grammar.js is the parser compiled by Nearley:
nearley-unparse problems-grammar.js -o test.txt
Unfortunately, the unparser doesn't seem to be generating example text for the tokens, apart from the newline token. Here is one output of nearley-unparse:
Section (Due )- #*, ,
Section (Due )- #, *,
Section (Due )- #*, , , *,
Section (Due )- #*, *
Section (Due )- #*, *, *, *
I'm wondering whether this is a flaw in my grammar or a flaw with Nearley/Moo itself. If it's a problem with my code, how can I fix it?