4

I need to write a parser for propositional logic. I intend to implement it by hand as a recursive descent parser in Java.

My question is about the lexer: is it really needed for this job? I mean, do I have to define a finite state machine to recognize tokens, etc.? I have seen some examples of simple parsers for arithmetic that handle everything in a "single parser", relying only on the grammar rules. They don't seem to bother with a separate, independent lexer that provides the tokens to the parser.

Since I want to do this in the most correct way, I'm asking for advice. Any link to related info is welcome.

Abdulaziz
Wyvern666

2 Answers

3

A bit more information would be useful, e.g. the grammar you want to use and an example input string. I don't know how much you know about the Chomsky hierarchy of grammars, but that's the key. Simply put, a lexer can parse at the word level (Type 3: regular grammars), while a parser can also analyze syntax (Type 2: context-free grammars). (More information here: lexers vs parsers)

It's possible to use a scannerless parser, but if you write your program without deliberately trying to avoid a lexer, I think you would just end up integrating the lexer into your parser. In other words, you would call the part that tokenizes the raw input string the lexer, and the part that applies the grammar rules the parser, if that's how you want to name them. But you shouldn't worry too much about the terms. If you write a parser and don't use a separate lexer, chances are high that the lexer is already somewhere in your code, but who cares ;) Hope that helps, but feel free to ask if it's still unclear!
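To illustrate that split (a minimal sketch; the class names, the `expr`/`term`/`factor` grammar, and the single-character tokens are my own assumptions, not from the question): the lexer hands out one token at a time, and the parser applies the grammar rules on top of it.

```java
// Assumed grammar:  expr   := term ( '|' term )*
//                   term   := factor ( '&' factor )*
//                   factor := '!' factor | '(' expr ')' | 'a'..'z'
class Lexer {
    private final String input;
    private int pos = 0;

    Lexer(String input) { this.input = input; }

    // Skip whitespace and return the next character token, or '\0' at end.
    char next() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
        return pos < input.length() ? input.charAt(pos++) : '\0';
    }
}

class Parser {
    private final Lexer lexer;
    private char lookahead;

    Parser(Lexer lexer) { this.lexer = lexer; this.lookahead = lexer.next(); }

    private void match(char expected) {
        if (lookahead != expected)
            throw new IllegalArgumentException("expected '" + expected + "'");
        lookahead = lexer.next();
    }

    // True if the whole input is a well-formed formula.
    boolean wellFormed() {
        try { expr(); return lookahead == '\0'; }
        catch (IllegalArgumentException e) { return false; }
    }

    private void expr() { term();   while (lookahead == '|') { match('|'); term();   } }
    private void term() { factor(); while (lookahead == '&') { match('&'); factor(); } }

    private void factor() {
        if (lookahead == '!')      { match('!'); factor(); }
        else if (lookahead == '(') { match('('); expr(); match(')'); }
        else if (lookahead >= 'a' && lookahead <= 'z') lookahead = lexer.next();
        else throw new IllegalArgumentException("unexpected '" + lookahead + "'");
    }
}
```

If you later deleted the `Lexer` class and inlined `next()` into `Parser`, nothing about the grammar handling would change; that is the sense in which the lexer "is already in your code".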

Johannes Stadler
    Nice link. Well, in theory the lexer will deal with the regular grammar of propositional logic: the operators NOT, OR, AND, etc. And the parser will deal with well-formed formulas of the logic (the context-free grammar). Maybe in my case the lexer doesn't have much work to do, but it can be useful for practice and learning, since I'm new to these topics. I'm thinking that having a separate lexer would be more convenient and easier to extend if I want to parse different languages later. I really don't want to avoid anything; this is just for learning. – Wyvern666 Dec 12 '13 at 18:26
    Does the parser get tokens one by one from the lexer, or does the lexer first do its job and send all the tokens at once to the parser? I assume it could be done both ways, but which is better? – Wyvern666 Dec 12 '13 at 18:36
    If it's for learning, a separate lexer is definitely better for understanding the mechanics. IMHO it would then be clearer to first tokenize the whole input and then parse the tokens, especially if it's not about parsing as fast as you can; additionally, you can see the boundaries between plain text, tokens, and the result. – Johannes Stadler Dec 13 '13 at 11:20
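The tokenize-everything-first variant from the comment could look like this (a hypothetical sketch; the `Token`/`Kind` names and the operator characters are my own choices): the lexer runs over the complete input once and returns a token list, which the parser would then walk index by index instead of calling the lexer on demand.

```java
import java.util.ArrayList;
import java.util.List;

public class TwoPhase {
    enum Kind { VAR, NOT, AND, OR, LPAREN, RPAREN, END }

    record Token(Kind kind, String text) { }

    // Phase 1: turn the whole raw input into a token list up front.
    static List<Token> lex(String input) {
        List<Token> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) { i++; continue; }
            switch (c) {
                case '!' -> tokens.add(new Token(Kind.NOT, "!"));
                case '&' -> tokens.add(new Token(Kind.AND, "&"));
                case '|' -> tokens.add(new Token(Kind.OR, "|"));
                case '(' -> tokens.add(new Token(Kind.LPAREN, "("));
                case ')' -> tokens.add(new Token(Kind.RPAREN, ")"));
                default -> {
                    if (!Character.isLetter(c))
                        throw new IllegalArgumentException("bad char '" + c + "'");
                    int start = i;  // variable names may span several letters
                    while (i + 1 < input.length() && Character.isLetter(input.charAt(i + 1))) i++;
                    tokens.add(new Token(Kind.VAR, input.substring(start, i + 1)));
                }
            }
            i++;
        }
        tokens.add(new Token(Kind.END, ""));  // sentinel so the parser never runs off the list
        return tokens;
    }
}
```

Phase 2 (the parser) then reads `tokens.get(k)` one after another, exactly as it would call `lexer.next()` in the streaming variant; the only difference is that all lexing errors surface before parsing starts.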
1

You don't need a "real" lexer for this kind of parser. You do need something that picks off the atoms of your language (identifiers, operators, parentheses). For simple languages, you can do this directly in the parser.

See my SO answer on how to code a simple parser: Is there an alternative for flex/bison that is usable on 8-bit embedded systems?
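As a sketch of what "directly in the parser" can look like for propositional logic (the grammar, names, and operator characters are assumptions of mine, not from the linked answer): the parser peeks at raw characters and picks off the atoms in place, with no token type and no lexer class at all.

```java
// Assumed grammar:  expr   := term ( '|' term )*
//                   term   := factor ( '&' factor )*
//                   factor := '!' factor | '(' expr ')' | 'a'..'z'
public class PropParser {
    private final String input;
    private int pos = 0;

    public PropParser(String input) { this.input = input; }

    // Atom picking, in place of a lexer: skip whitespace, look at the next char.
    private char peek() {
        while (pos < input.length() && Character.isWhitespace(input.charAt(pos))) pos++;
        return pos < input.length() ? input.charAt(pos) : '\0';
    }

    private void consume(char expected) {
        if (peek() != expected)
            throw new IllegalArgumentException("expected '" + expected + "' at " + pos);
        pos++;
    }

    // True if the whole input is a well-formed formula.
    public boolean isWellFormed() {
        try { expr(); return peek() == '\0'; }
        catch (IllegalArgumentException e) { return false; }
    }

    private void expr() { term();   while (peek() == '|') { consume('|'); term();   } }
    private void term() { factor(); while (peek() == '&') { consume('&'); factor(); } }

    private void factor() {
        char c = peek();
        if (c == '!')      { consume('!'); factor(); }
        else if (c == '(') { consume('('); expr(); consume(')'); }
        else if (c >= 'a' && c <= 'z') pos++;  // a variable; peek() already skipped spaces
        else throw new IllegalArgumentException("unexpected '" + c + "' at " + pos);
    }
}
```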

Ira Baxter