22

I know there are some vaguely similar questions already relating to BNF (Backus-Naur Form) grammars in Python, but none of them help me much in terms of my application.

I have multiple BNFs that I need to write code for. The code should be able to both generate and recognize legal strings using the BNF grammar.

The first BNF I'm working with is for all real numbers in Python. It is as follows:

<real number>    ::= <sign><natural number> |
                     <sign><natural number>'.'<digit sequence> |
                     <sign>'.'<digit><digit sequence> |
                     <sign><real number>'e'<natural number>
<sign>           ::= ‘’ | ‘+’ | ‘-‘
<natural number> ::= ‘0’ | <nonzero digit><digit sequence>
<nonzero digit>  ::= 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
<digit sequence> ::= ‘’ | <digit><digit sequence>
<digit>          ::= 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

Any BNF parsers I've found for Python seem extraordinarily complex, or use outside libraries. Is there any simpler way to check against and generate using BNF grammar in Python?

Jakemmarsh
  • 4,601
  • 12
  • 43
  • 70
  • 5
    BNF == Backus Normal Form? For those of us who don't play around with grammar parsers every day. – Ben Dec 23 '12 at 19:07
  • @Ben yes, you're correct. Sorry for not clarifying, I'll edit the post – Jakemmarsh Dec 23 '12 at 19:11
  • Are you looking for something that will parse a BNF file to generate a grammar/lexer or something you can write in Python to describe to it an equivalent of BNF? – Jon Clements Dec 23 '12 at 19:16
  • @JonClements I suppose I'm more looking for something that I can write in Python to describe to it an equivalent of a BNF, if I'm understanding you correctly. – Jakemmarsh Dec 23 '12 at 19:25
  • 2
    Okay - the friendliest and most versatile library that's still in active development I've used which uses Python based objects to describe grammars is http://pyparsing.wikispaces.com/ – Jon Clements Dec 23 '12 at 19:30
  • @JonClements I have come across pyparsing multiple times while researching this. However, from what I've seen, it doesn't quite seem able to implement something like the BNF above? – Jakemmarsh Dec 23 '12 at 20:52
  • @Jakemmarsh it would - have you tried? Also it's worth noting that if your BNF is typical to Python's expression of types, then you can "cheat" and use its `ast` and `tokenize` libraries – Jon Clements Dec 23 '12 at 20:56
  • 1
    What libraries and tools have you looked at, and what is wrong with them? – Marcin Dec 23 '12 at 20:58

3 Answers3

10

This post contains an example of a lexical scanner which doesn't need third-party libraries. It may not do all you want, but you should be able to use it as a basis for something that fits your needs.

I don't know if your applications all relate to lexical scanning - but if not, ply is a fairly easy to use parser (given that you need to know broadly how parsers work).

jcomeau_ictx
  • 37,688
  • 6
  • 92
  • 107
Vinay Sajip
  • 95,872
  • 14
  • 179
  • 191
  • 1
    I appreciate the response. I looked into your links, but I'm not quite sure they're what I'm looking for in this case. – Jakemmarsh Dec 23 '12 at 20:52
  • 5
    It would go a long way if you bothered to mention WHY it's not what you're looking for. You know, so the next guy can help. – OmnipotentEntity Dec 23 '12 at 20:54
  • 6
    the link is dead. it's really helpful to copy paste the most important part of it into the answer, or even all of them. – Jason Hu Jun 22 '15 at 21:06
  • 3
    Re: the dead link. Googling the strings `sexy lexing python` finds a Reddit thread https://www.reddit.com/r/programming/comments/7ztdm/sexy_lexing_in_python/ and perhaps an archive here: http://www.itpub.net/archiver/tid-1794484.html – Fuhrmanator Apr 03 '16 at 18:03
  • Archive link is empty? – Ekrem Dinçel May 27 '20 at 20:22
6

have a look at https://github.com/erikrose/parsimonious

Parsimonious aims to be the fastest arbitrary-lookahead parser written in pure Python—and the most usable. It's based on parsing expression grammars (PEGs), which means you feed it a simplified sort of EBNF notation.

cleder
  • 1,637
  • 14
  • 15
4

I had good experiences with grako. Note that grako is no longer maintained, and 竜 TatSu is recommended instead.

I used grako for parseWKT.

It takes a EBNF as input and generates a PEG parser from it.

I think it would be reasonable simple to write a BNF to EBNF Parser in tatsu, which would then generate a parser from the EBNF

cleder
  • 1,637
  • 14
  • 15
  • 1
    Note that development for grako has stopped. – Björn Lindqvist Nov 13 '22 at 14:15
  • Does Tatsu satisfy this requirement: "The code should be able to **both generate** and recognize legal strings using the BNF grammar."? Not that any other answer does, including the accepted one. If this question were recent, I would VTC as requesting opinions, but I guess that's not appropriate for a legacy question. – rici Nov 14 '22 at 17:19
  • No tatsu dos not generate BNF grammars, This is more a 1 out of 2 is better than nothing answer – cleder Nov 15 '22 at 11:20