parsimonious - Rule 'rules' matched in its entirety, but it didn't consume all the text

Question

I am making a simple parser for expressions and this is my code:

import parsimonious as parmon

parser = parmon.Grammar(r"""
            E = E "+" E / id
            id = "0"/"1"/"2"/"3"/"4"/"5"/"6"/"7"/"8"/"9"
    """)

code = "2+2"

print(parser.parse(code))

I get this error:

IncompleteParseError(text, node.end, self)
parsimonious.exceptions.IncompleteParseError: Rule 'rules' matched in its entirety, but it didn't consume all the text. The non-matching portion of the text begins with '/ id
            id = "0"/"1"' (line 2, column 16).

I have also tried Lark-parser but couldn't get to work on that either. Help appreciated.

Parsimonious' github page lists "Maybe support left-recursive rules" under "Future Directions", so I don't think left-recursion is currently supported. — sepp2k, Jan 16 '18 at 16:19
Try putting spaces around the `/` operators. Or write the rule as a regex: `id = ~"[0-9]+"` (remove the `+` if you really only want to accept single-digit numbers.) Also, what @sepp2k said. — rici, Jan 16 '18 at 17:00
Which "that"? The spaces or the regex? And in what way did it "not work"? — rici, Jan 16 '18 at 17:08
I can't reproduce the error message. I downloaded parsimonious 0.8 which gave me an error on the `E = E "+" E / id` line, at the `/`. Changing it to `E = ( E "+" E ) / id` got it to compile the grammar. I know next to nothing about parsimonious. Unless you have some emotional attachment to PEG parsing, take a look at PLY, which is a mature project. — rici, Jan 16 '18 at 17:41
@ParamdeepSinghObheroi: could you please mark one of the answers as answering the question if it does it for you? — sophros, Nov 09 '20 at 12:15

score 1 · Answer 1 · answered Jan 17 '18 at 20:16

I can't offer anything wrt any of the parsers you mentioned. Have you considered pyparsing?

id is defined to be a one-digit numerical token.
Forward indicates that E will be defined later in the code. (It's analogous to the use of 'forward' in procedural languages.)
The << operator inserts the definition of E into itself. The parentheses call for a 'match first,' meaning that the first expression in the 'or' will be applied, if possible.
The parser is exercised within the two print functions.

Here's a simple parser for that kind of expression.

from pyparsing import *

id = Word(nums, min=1, max=1)
E = Forward()
E << (id + '+' + E | id)

code = '2 + 2'

print (E.parseString(code))

print (E.parseString('3+4+5'))

This codes yields this result.

['2', '+', '2']
['3', '+', '4', '+', '5']

score 1 · Answer 2 · answered Mar 28 '19 at 20:55

Maybe it is worth to elaborate on the @rici's comment with a solution to your problem:

E = E "+" E / id means in fact: E = E "+" (E / id) which is a non-ending recursive definition:

E = E "+" E / id when E is substituted for the right side:

E = (E "+" (E / id)) "+" (E / id), etc.

This means that although the right operand of + is matched right away in your example expression (picking id production which is the terminal character 2) there would still be (neverending) doubts how to match the left side.

That is why the EBNF you provided is wrong and changing it to:

E = ( E "+" E ) / id

solves the problem.

parsimonious - Rule 'rules' matched in its entirety, but it didn't consume all the text

2 Answers2