I'm trying to write a simple int expression parser using tatsu, a PEG-based Python parser generator. Here is my code:
import tatsu
grammar = r'''
start = expression $ ;
expression = add | sub | term ;
add = expression '+' term ;
sub = expression '-' term ;
term = mul | div | number ;
mul = term '*' number ;
div = term '/' number ;
number = [ '-' ] /\d+/ ;
'''
parser = tatsu.compile(grammar)
print(parser.parse('2-1'))
The output of this program is ['-', '1']
instead of the expected ['2', '-', '1']
.
I get the correct output if I either:
- Remove support for unary minus, i.e. change the last rule to
number = /\d+/ ;
- Remove the term, mul and div rules, and support only addition and subtraction
- Replace the second rule with
expresssion = add | sub | mul | div | number ;
The last option actually works without leaving any feature out, but I don't understand why it works. What is going on?
EDIT: If I just flip the add/sub/mul/div rules to get rid of left recursion, it also works. But then evaluating the expressions becomes a problem, since the parse tree is flipped. (3-2-1
becomes 3-(2-1)
)