Question
TLDR: I want to match anything, but /.+?/ doesn't seem to work. Why?
I have the following super simple grammar and code:
from lark import Lark, Tree
parser: Lark = Lark(r"""
rterm: "(___hole 0" anything ")"
anything: /.+?/
%import common.ESCAPED_STRING
%import common.SIGNED_NUMBER
%import common.WS
%ignore WS
""", start='rterm')
test_strings: list[str] = ["(___hole 0 (fun n : nat => ___hole 1 (___hole 2 eq_refl : 0 + n = n)))"]
for test_string in test_strings:
    print(f'{test_string=}')
    tree: Tree = parser.parse(test_string)
    print(tree.pretty())
When I try to parse my only test string, it gives me an error:
Traceback (most recent call last):
File "/Users/brandomiranda/miniconda/envs/iit_term_synthesis/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3398, in run_code
exec(code_obj, self.user_global_ns, self.user_ns)
File "<ipython-input-18-352bf581b4ee>", line 19, in <cell line: 17>
tree: Tree = parser.parse(test_string)
File "/Users/brandomiranda/miniconda/envs/iit_term_synthesis/lib/python3.9/site-packages/lark/lark.py", line 581, in parse
return self.parser.parse(text, start=start, on_error=on_error)
File "/Users/brandomiranda/miniconda/envs/iit_term_synthesis/lib/python3.9/site-packages/lark/parser_frontends.py", line 106, in parse
return self.parser.parse(stream, chosen_start, **kw)
File "/Users/brandomiranda/miniconda/envs/iit_term_synthesis/lib/python3.9/site-packages/lark/parsers/earley.py", line 297, in parse
to_scan = self._parse(lexer, columns, to_scan, start_symbol)
File "/Users/brandomiranda/miniconda/envs/iit_term_synthesis/lib/python3.9/site-packages/lark/parsers/xearley.py", line 144, in _parse
to_scan = scan(i, to_scan)
File "/Users/brandomiranda/miniconda/envs/iit_term_synthesis/lib/python3.9/site-packages/lark/parsers/xearley.py", line 118, in scan
raise UnexpectedCharacters(stream, i, text_line, text_column, {item.expect.name for item in to_scan},
lark.exceptions.UnexpectedCharacters: No terminal matches 'f' in the current parser context, at line 1 col 13
(___hole 0 (fun n : nat => ___hole 1 (___hole 2 eq_r
^
Expected one of:
* RPAR
Focus on the last line:
lark.exceptions.UnexpectedCharacters: No terminal matches 'f' in the current parser context, at line 1 col 13
(___hole 0 (fun n : nat => ___hole 1 (___hole 2 eq_r
^
Expected one of:
* RPAR
This surprises me, because I would have expected .+? to match any character, yet it claims it can't match the f. Does anyone know why?
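As a sanity check outside of Lark, here is a minimal sketch using Python's built-in re module. The sample string and the variable names are mine, purely for illustration. It only shows how the lazy quantifier behaves on its own in plain re; whether Lark's terminal matching behaves the same way is an assumption on my part, not something I have confirmed.

```python
import re

# Hypothetical sample input, shortened from my test string.
text = "(fun n : nat => n)"

# Lazy quantifier: matches as few characters as possible (at least one).
lazy = re.match(r".+?", text)

# Greedy quantifier: matches as many characters as possible.
greedy = re.match(r".+", text)

print(repr(lazy.group(0)))    # only the first character
print(repr(greedy.group(0)))  # the whole string
```

If the same thing happens inside the grammar, then anything would consume a single character and the parser would expect the closing ")" right after it, which would be consistent with the error above.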
Research
I searched and found these two relevant questions, but their contents didn't help:
- https://github.com/lark-parser/lark/issues/257
- Lark parser can't parse characters, even though they are defined in regex of rule (this one seems relevant because of the type of error: it's not matching the character f for some reason, but the dot . should have captured that, no?)
(nearly) cross posted here: https://github.com/lark-parser/lark/discussions/1163