1

I am making a simple parser for expressions and this is my code:

import parsimonious as parmon

parser = parmon.Grammar(r"""
            E = E "+" E / id
            id = "0"/"1"/"2"/"3"/"4"/"5"/"6"/"7"/"8"/"9"
    """)

code = "2+2"

print(parser.parse(code))

I get this error:

IncompleteParseError(text, node.end, self)
parsimonious.exceptions.IncompleteParseError: Rule 'rules' matched in its entirety, but it didn't consume all the text. The non-matching portion of the text begins with '/ id
            id = "0"/"1"' (line 2, column 16).

I have also tried Lark-parser but couldn't get to work on that either. Help appreciated.

sophros
  • 14,672
  • 11
  • 46
  • 75
ParmuTownley
  • 957
  • 2
  • 14
  • 34
  • Parsimonious' github page lists "Maybe support left-recursive rules" under "Future Directions", so I don't think left-recursion is currently supported. – sepp2k Jan 16 '18 at 16:19
  • Try putting spaces around the `/` operators. Or write the rule as a regex: `id = ~"[0-9]+"` (remove the `+` if you really only want to accept single-digit numbers.) Also, what @sepp2k said. – rici Jan 16 '18 at 17:00
  • @rici that didn't work. – ParmuTownley Jan 16 '18 at 17:06
  • Which "that"? The spaces or the regex? And in what way did it "not work"? – rici Jan 16 '18 at 17:08
  • neither space nor regex worked. Both showed the same error. – ParmuTownley Jan 16 '18 at 17:09
  • I can't reproduce the error message. I downloaded parsimonious 0.8 which gave me an error on the `E = E "+" E / id` line, at the `/`. Changing it to `E = ( E "+" E ) / id` got it to compile the grammar. I know next to nothing about parsimonious. Unless you have some emotional attachment to PEG parsing, take a look at PLY, which is a mature project. – rici Jan 16 '18 at 17:41
  • @ParamdeepSinghObheroi: could you please mark one of the answers as answering the question if it does it for you? – sophros Nov 09 '20 at 12:15

2 Answers2

1

I can't offer anything wrt any of the parsers you mentioned. Have you considered pyparsing?

  • id is defined to be a one-digit numerical token.
  • Forward indicates that E will be defined later in the code. (It's analogous to the use of 'forward' in procedural languages.)
  • The << operator inserts the definition of E into itself. The parentheses call for a 'match first,' meaning that the first expression in the 'or' will be applied, if possible.
  • The parser is exercised within the two print functions.

Here's a simple parser for that kind of expression.

from pyparsing import *

id = Word(nums, min=1, max=1)
E = Forward()
E << (id + '+' + E | id)

code = '2 + 2'

print (E.parseString(code))

print (E.parseString('3+4+5'))

This codes yields this result.

['2', '+', '2']
['3', '+', '4', '+', '5']
Bill Bell
  • 21,021
  • 5
  • 43
  • 58
1

Maybe it is worth to elaborate on the @rici's comment with a solution to your problem:

E = E "+" E / id means in fact: E = E "+" (E / id) which is a non-ending recursive definition:

E = E "+" E / id when E is substituted for the right side:

E = (E "+" (E / id)) "+" (E / id), etc.

This means that although the right operand of + is matched right away in your example expression (picking id production which is the terminal character 2) there would still be (neverending) doubts how to match the left side.

That is why the EBNF you provided is wrong and changing it to:

E = ( E "+" E ) / id

solves the problem.

sophros
  • 14,672
  • 11
  • 46
  • 75