
Has anyone seen a good comparison of parser generators' performance?

I'm particularly interested in: 1) recursive-ascent parser generators for LALR(1) grammars; 2) parser generators that produce C/C++-based parsers.

asked by skvadrik

1 Answer


Are you interested in how fast the parser generators run? That depends on the type of parsing technology the engine supports, and on the care of the guy who implemented the parser generator. See this answer for some numbers on LALR/GLR parser generators for real languages: https://stackoverflow.com/a/14151966/120163 IMHO, this isn't very important; parser generators are mostly a lot faster than the guy using them.

If the question is how fast the generated parsers are, you get different answers. LALR parsers can be implemented with a few machine instructions per GOTO transition (using directly-indexed GOTO tables), and a few more per reduction. That's pretty hard to beat.
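
For concreteness, here is a minimal sketch of the table-driven core loop that claim refers to. It is not taken from any particular generator; every name and convention below (action, goto_tab, rule_len, rule_lhs, "rule 1 accepts") is a hypothetical stand-in for what a generator like bison would emit.

    /* Minimal sketch of a table-driven LALR core loop: one
       directly-indexed load per GOTO transition, a few instructions
       per reduction.  All tables and conventions are hypothetical. */
    #include <stdint.h>

    enum { N_STATES = 256, N_TOKENS = 64, N_NONTERMS = 32, N_RULES = 128,
           MAX_DEPTH = 1024 };

    /* Generated tables: action[s][t] > 0 shifts to that state, < 0
       reduces by rule -action[s][t], 0 is a syntax error. */
    extern const int16_t action[N_STATES][N_TOKENS];
    extern const int16_t goto_tab[N_STATES][N_NONTERMS];
    extern const int8_t  rule_len[N_RULES];  /* states popped per reduction */
    extern const int8_t  rule_lhs[N_RULES];  /* nonterminal produced */

    int parse(const int *tokens)
    {
        int16_t stack[MAX_DEPTH];
        int sp = 0;
        stack[0] = 0;                     /* start state */
        for (;;) {
            int16_t a = action[stack[sp]][*tokens];
            if (a > 0) {                  /* shift: push state, advance input */
                stack[++sp] = a;
                ++tokens;
            } else if (a < 0) {           /* reduce, then directly-indexed GOTO */
                int r = -a;
                if (r == 1) return 0;     /* accept (hypothetical convention) */
                sp -= rule_len[r];
                stack[sp + 1] = goto_tab[stack[sp]][rule_lhs[r]];
                ++sp;
            } else {
                return -1;                /* syntax error */
            }
        }
    }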

answered by Ira Baxter
  • Thank you! I'm not interested in how fast parser generators run; I'm interested in how fast the generated parser is. I recently wrote an LALR(1) recursive ascent generator (it produces hard-coded, not table-based, parsers), and the parsers it generates seem to be about 2 times faster than bison's (though very large). But bison is table-based LALR(1), and I would like to compare my generator to a similar one. – skvadrik Jan 22 '13 at 00:00
  • If you are generating code to represent the GOTO table (as a kind of giant switch), you are trading lots of conditional branches for an indirect access. Have you considered generating the set of branches for each state as a balanced tree? (Might cut the number of branches executed significantly.) ... or as a token-frequency-biased tree, to take advantage of the fact that certain tokens ( ) = { } ; NNNN ID are very common and other tokens are less common? (See the first sketch after this thread.) – Ira Baxter Jan 22 '13 at 00:06
  • ... there's been work by Bill McKeeman on running the parser in parallel, e.g., break the input stream into N chunks, (partial) parse each chunk, glue the results together. I think he claimed pretty near linear speedup on big files. – Ira Baxter Jan 22 '13 at 00:07
  • An interesting thought about token frequency! I generate plain switches and currently don't do any optimization of that kind. (My generator is plain and rather stupid.) Instead, I rely on the C compiler (gcc -O2), and my profiler shows that my parsers have ~5 times fewer mispredictions than bison's. My generator is based on this work: http://130.203.133.150/viewdoc/similar;jsessionid=08D34138469832FA8FB4303DA3E158E5?doi=10.1.1.50.4539 If you could post a link to Bill McKeeman's work (where I can download it for free), I'd be very glad. :) – skvadrik Jan 22 '13 at 05:29
  • http://www.cs.dartmouth.edu/~mckeeman/references/ParallelLR82.pdf – Ira Baxter Jan 22 '13 at 07:31
  • Your parsers probably have fewer mispredicts because you have a separate jump instruction for each state, so the machine can associate state-specific prediction information with each state. Bison probably has a core loop with one instruction that does the dispatch, so the machine has to associate all the prediction information from all states with that one branch. Might be interesting to bend Bison to have a separate branch per state to see what the mispredicts do. However, after you have your parser tuned to death, you're likely to discover the lexer and the semantic processing are the bottlenecks. – Ira Baxter Jan 22 '13 at 07:40
  • Thank you for the link. I tried to use the same approach with lexer generation (hard-coded instead of table-based, as in RE2C). Semantic processing is a different kind of thing: it seems that memory management is the bottleneck there. – skvadrik Jan 22 '13 at 07:57
  • If all you are doing is allocating, memory management can be cheap. You can always preallocate a large chunk of RAM, set a pointer to the start, and add the size of each allocation to the pointer (see the allocator sketch after this thread). – Ira Baxter Jan 23 '13 at 01:51
  • Thanks, that's exactly what I finally had to do. – skvadrik Jan 25 '13 at 12:30
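
To make the dispatch discussion in the thread concrete, here is a hypothetical sketch of one hard-coded recursive-ascent state, first with the plain per-state switch skvadrik describes, then with the token-frequency-biased branch tree Ira Baxter suggests. All token codes and state functions are invented for illustration. Either form gives each state its own branch, so the predictor can learn per-state token statistics, which matches Baxter's explanation for the lower misprediction rate.

    /* Hypothetical recursive-ascent state; names are illustrative only. */
    enum tok { TOK_ID, TOK_NUM, TOK_SEMI, TOK_LPAREN, TOK_EQ, TOK_OTHER };

    extern enum tok peek(void);                 /* next lookahead token */
    extern void state7(void), state12(void), state3(void), parse_error(void);

    /* Plain per-state switch: the compiler typically emits an indirect
       jump through a small table, but the branch is private to this
       state, so prediction history is state-specific. */
    static void state5_switch(void)
    {
        switch (peek()) {
        case TOK_ID:     state7();      break;
        case TOK_SEMI:   state12();     break;
        case TOK_LPAREN: state3();      break;
        default:         parse_error(); break;
        }
    }

    /* Token-frequency-biased tree: test the most common tokens first,
       so the hot path retires only one or two well-predicted branches. */
    static void state5_biased(void)
    {
        enum tok t = peek();
        if (t == TOK_ID)          state7();    /* identifiers dominate */
        else if (t == TOK_SEMI)   state12();   /* ';' next most common */
        else if (t == TOK_LPAREN) state3();
        else                      parse_error();
    }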
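
And a minimal sketch of the bump ("arena") allocator described in the last exchange: grab one large block up front and serve each allocation by advancing a pointer. The names, the fixed 16-byte alignment, and the fail-on-exhaustion policy are assumptions, not anything stated in the thread.

    /* Minimal bump allocator: preallocate once, then allocation is a
       pointer increment.  Details below are assumptions. */
    #include <stddef.h>
    #include <stdlib.h>

    static char  *arena;
    static size_t arena_used, arena_size;

    int arena_init(size_t size)
    {
        arena = malloc(size);                  /* one big chunk up front */
        if (!arena) return -1;
        arena_size = size;
        arena_used = 0;
        return 0;
    }

    void *arena_alloc(size_t n)
    {
        n = (n + 15) & ~(size_t)15;            /* keep 16-byte alignment */
        if (arena_used + n > arena_size)
            return NULL;                       /* exhausted: grow or abort */
        void *p = arena + arena_used;
        arena_used += n;
        return p;
    }

    void arena_reset(void) { arena_used = 0; } /* release everything at once */

Individual objects cannot be freed in this scheme; the whole arena is reset at once, which suits a parser that builds a tree and then discards it wholesale.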