Combining lexer with many parsers

Question

I know a typical configuration of lexer and parser, where the lexer reads the source code and generates tokens, which are then directed to the parser, and the parser uses them as terminal symbols in its grammar productions. In a typical recursive-descent parser, you start by calling some top-level function representing the starting nonterminal, and this function call others and reads token by token from the lexer.

But what if I need two different parsers on top of the same lexer?

I mean, both of them reading from the same place, because I don't want to read the same source multiple times, that is, no multiple passes allowed, to avoid unnecessary duplicating work in the lexer. I just want it that when the next token in sequence have just been generated, both parsers consume it at the same time.

But I can call only one top-level function in one of these parsers; can't call both at the same time :/

Is there some way to run these parsers in some kind of a step mode? That is, when I've got a new token, I want to pass it to both parsers one after another, but only to advance them by that one token, update their internal states and data structures as far as they can, and return immediately to wait for another token.

I haven't seen any configuration of this kind never before. Is it possible at all to build a parser that way? Are there some materials about how this kind of parser could be structured in code? Has it any name?

EDIT 1: I don't want to use any parser generator tool, but write the code myself, because I want to learn how this kind of stuff works internally.

You must be talking about some *specific* implementation (e.g. Lex/Yacc); if you were implementing your own, you could simply design it with this in mind. So which implementation are you talking about? — Oliver Charlesworth, Feb 22 '11 at 20:53
Not using any parser generator. Writing my own. But I don't know how the code should be structured to make it well. — SasQ, Feb 22 '11 at 22:10
And what's the problem? Let your lexer return a lazy list, and both your parsers will consume it. You can stack as many parsers and transformers as you like (and a lexer should not be any different from the other parsers). — SK-logic, Feb 22 '11 at 22:42
The problem lies not in the lexer, but in the parser. How to write the code to let it be used step-wise instead of a one call from top to bottom? Is the push parser still a top-down parser? Or some other animal? (bottoms-up?) — SasQ, Feb 22 '11 at 23:44

score 3 · Answer 1 · answered Feb 22 '11 at 21:58

3

You described typical flow of a pull parser. It is called once and it takes control until all its input is completely parsed. Parser calls lexer by itself to get next token. A push parser, on the other hand, is called each time a new token is made available. So you can call several parsers for every new token. Classical Bison can be used in push mode (details are there). Lemon parser generator generates push parsers.

answered Feb 22 '11 at 21:58

Artem Zankovich

2,319
20
36

Thanks. Now I know at least the terminology. So I'm interested now how the push parser works. I'm not using any parser generator, since I want to understand the technique of its inner workings. – SasQ Feb 22 '11 at 22:14

Combining lexer with many parsers

1 Answers1