4

I would like to generate a Python parser for a custom language. As I am new to parsing, my only requirement so far is that the generated module shall not depend on the generator.

I looked into Tatsu, since it can generate the parser as a Python module. But when I review the generated module, it still begins with:

from tatsu.buffering import Buffer
from tatsu.parsing import Parser
from tatsu.parsing import tatsumasu, leftrec, nomem
...

Is there a way to generate a standalone parser module (one that depends only on the Python standard library) using Tatsu? If not, do I have any other options?

Adam Trhon

2 Answers

1

Lark, a pure Python parsing library, can generate a small standalone parser for any LALR(1) grammar you throw at it:

The resulting parser is much smaller than Lark itself, it loads much faster because the grammar is pre-compiled, and it's just as easy to use.
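A rough sketch of how the generation step could be scripted, assuming Lark's `python -m lark.tools.standalone` entry point, which prints the generated module to stdout. The helper names and the file names `my_language.lark` and `my_parser.py` are hypothetical. To my knowledge the generated module exposes a `Lark_StandAlone` class, but please check that against the Lark version you install.

```python
import subprocess
import sys
from pathlib import Path


def standalone_command(grammar_path: str) -> list[str]:
    """Build the command line that invokes Lark's standalone generator."""
    return [sys.executable, "-m", "lark.tools.standalone", grammar_path]


def generate_standalone(grammar_path: str, out_path: str) -> None:
    """Run the generator and save the module it prints to stdout.

    Lark must be installed for this build step only; the written module
    is meant to be importable without it.
    """
    result = subprocess.run(
        standalone_command(grammar_path),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the generator fails
    )
    Path(out_path).write_text(result.stdout, encoding="utf-8")


# Hypothetical usage (not executed here):
#   generate_standalone("my_language.lark", "my_parser.py")
#   from my_parser import Lark_StandAlone   # assumed class name
#   tree = Lark_StandAlone().parse(source_text)
```

The point of the split is that Lark becomes a build-time dependency only: you commit or ship `my_parser.py`, and the runtime environment needs nothing beyond the standard library.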


Edit: I first wrote "any context-free grammar", but it is just LALR(1). Thanks Erez for the correction.

Aristide
  • Lark can generate a stand-alone parser for LALR(1) grammars only, not every CFG. It can parse every CFG using Earley, but then you need to import the entire library. – Erez May 17 '23 at 15:07
  • I have modified my answer, but now I have a doubt. LALR(1) is a class of parsers; is it a class of languages too? I am fairly new to language theory, and just started this morning to write a grammar in Lark for a small rational language. – Aristide May 17 '23 at 19:46
  • LALR(1) is a class of parsers, and any grammar it can parse is an LALR(1) grammar. LALR(1) grammars are a subset of context-free grammars. Good luck with your grammar! – Erez May 18 '23 at 09:29
-1

Take a look at pegen, by Guido van Rossum, Pablo Galindo, et al.

It is the basis for the Python parser in Python 3.9.

Apalala
  • `pegen` does not generate a stand-alone parser. Running `python -m pegen data/expr.gram -o parser.py` (one of the provided examples) results in the following import: `from pegen.parser import memoize, memoize_left_rec, logger, Parser` – bramhaag May 24 '22 at 14:15