5

I'm trying to write a simple int expression parser using tatsu, a PEG-based Python parser generator. Here is my code:

import tatsu

grammar = r'''
    start = expression $ ;
    expression = add | sub | term ;
    add = expression '+' term ;
    sub = expression '-' term ;
    term = mul | div | number ;
    mul = term '*' number ;
    div = term '/' number ;
    number = [ '-' ] /\d+/ ;
'''
parser = tatsu.compile(grammar)
print(parser.parse('2-1'))

The output of this program is ['-', '1'] instead of the expected ['2', '-', '1'].

I get the correct output if I either:

  • Remove support for unary minus, i.e. change the last rule to number = /\d+/ ;
  • Remove the term, mul and div rules, and support only addition and subtraction
  • Replace the second rule with expresssion = add | sub | mul | div | number ;

The last option actually works without leaving any feature out, but I don't understand why it works. What is going on?

EDIT: If I just flip the add/sub/mul/div rules to get rid of left recursion, it also works. But then evaluating the expressions becomes a problem, since the parse tree is flipped. (3-2-1 becomes 3-(2-1))

itsadok
  • 28,822
  • 30
  • 126
  • 171
  • 2
    Most programming language grammars I've seen don't actually make the negative sign part of the number - it's just parsed as a unary minus operator applied to a positive number literal. – user2357112 Aug 19 '17 at 22:20
  • 1
    Have you tried eliminating left recursion from your grammar? The tatsu docs say left recursion support is experimental. – user2357112 Aug 19 '17 at 22:24
  • Looks like it works without left recursion, but then I lose left associativity. So this is just a bug in Tatsu? – itsadok Aug 19 '17 at 22:31
  • I have much the same problem. I'd love to know why the first part (2) is disappearing from the AST. – Michael Grazebrook Aug 23 '18 at 14:39

1 Answers1

1

There are left recursion cases that TatSu doesn't handle, and work on fixing that is currently on hold.

You can use left/right join/gather operators to control the associativity of parsed expressions in a non-left-recursive grammar.

Apalala
  • 9,017
  • 3
  • 30
  • 48