How do I handle negative numbers in a PEG grammar?

Question

I'm trying to write a simple int expression parser using tatsu, a PEG-based Python parser generator. Here is my code:

import tatsu

grammar = r'''
    start = expression $ ;
    expression = add | sub | term ;
    add = expression '+' term ;
    sub = expression '-' term ;
    term = mul | div | number ;
    mul = term '*' number ;
    div = term '/' number ;
    number = [ '-' ] /\d+/ ;
'''
parser = tatsu.compile(grammar)
print(parser.parse('2-1'))

The output of this program is ['-', '1'] instead of the expected ['2', '-', '1'].

I get the correct output if I either:

Remove support for unary minus, i.e. change the last rule to number = /\d+/ ;
Remove the term, mul and div rules, and support only addition and subtraction
Replace the second rule with expresssion = add | sub | mul | div | number ;

The last option actually works without leaving any feature out, but I don't understand why it works. What is going on?

EDIT: If I just flip the add/sub/mul/div rules to get rid of left recursion, it also works. But then evaluating the expressions becomes a problem, since the parse tree is flipped. (3-2-1 becomes 3-(2-1))

Most programming language grammars I've seen don't actually make the negative sign part of the number - it's just parsed as a unary minus operator applied to a positive number literal. — user2357112, Aug 19 '17 at 22:20
Have you tried eliminating left recursion from your grammar? The tatsu docs say left recursion support is experimental. — user2357112, Aug 19 '17 at 22:24
Looks like it works without left recursion, but then I lose left associativity. So this is just a bug in Tatsu? — itsadok, Aug 19 '17 at 22:31
I have much the same problem. I'd love to know why the first part (2) is disappearing from the AST. — Michael Grazebrook, Aug 23 '18 at 14:39

score 1 · Answer 1 · answered Nov 17 '18 at 08:16

1

There are left recursion cases that TatSu doesn't handle, and work on fixing that is currently on hold.

You can use left/right join/gather operators to control the associativity of parsed expressions in a non-left-recursive grammar.

answered Nov 17 '18 at 08:16

Apalala

9,017
3
30
48

How do I handle negative numbers in a PEG grammar?

1 Answers1