I've been experimenting with Lark and came across a little problem. Suppose I have the following grammar:
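from lark import Lark

# Raw string so that the \d in the DIGIT regex is not treated as a string escape.
parser = Lark(r'''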
?start: value
| start "or" value -> or
?value: DIGIT -> digit
| ID -> id
DIGIT: /[1-9]\d*/
%import common.CNAME -> ID
%import common.WS
%ignore WS
''', parser='lalr')
Let's say I want to parse "1orfoo":
print(parser.parse("1orfoo").pretty())
I would expect Lark to see it as the digit 1 followed by the identifier orfoo (and thus throw an error, because the grammar does not accept this kind of expression).
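To make that expectation concrete, here is a minimal sketch of the longest-match tokenization I had in mind. It uses only Python's re module, not Lark's lexer; TOKEN_SPEC and tokenize are hypothetical names, and the ID pattern is my approximation of common.CNAME.

```python
import re

# Hypothetical token table mirroring the grammar's terminals.
# This is plain `re`, not Lark's lexer; ID approximates common.CNAME.
TOKEN_SPEC = [
    ("DIGIT", re.compile(r"[1-9]\d*")),
    ("OR", re.compile(r"or")),
    ("ID", re.compile(r"[A-Za-z_][A-Za-z_0-9]*")),
]


def tokenize(text):
    """At each position, pick the terminal that consumes the most characters."""
    pos, tokens = 0, []
    while pos < len(text):
        if text[pos].isspace():  # mirrors `%ignore WS`
            pos += 1
            continue
        best_name, best_match = None, None
        for name, pattern in TOKEN_SPEC:
            m = pattern.match(text, pos)
            # Strictly longer wins; on a tie the earlier table entry is kept.
            if m and (best_match is None or m.end() > best_match.end()):
                best_name, best_match = name, m
        if best_match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        tokens.append((best_name, best_match.group()))
        pos = best_match.end()
    return tokens


print(tokenize("1orfoo"))    # [('DIGIT', '1'), ('ID', 'orfoo')]
print(tokenize("1 or foo"))  # [('DIGIT', '1'), ('OR', 'or'), ('ID', 'foo')]
```

With this kind of tokenization, "1orfoo" never produces an "or" token, so I assumed the parse would fail.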
However, the parser runs without error and outputs this:
or
  digit 1
  id foo
As you can see, Lark splits the identifier and parses the expression as an "or" statement.
Why is this happening? Am I missing something? How can I prevent this kind of behavior?
Thank you in advance.