2

I want to write Java code to build a LALR parser for my grammar. Can someone please suggest some books or some links where I can learn how to write Java code for a LALR parser?

johnnyRose
  • 7,310
  • 17
  • 40
  • 61
Chirag Tayal
  • 459
  • 1
  • 6
  • 14
  • 1
    http://en.wikipedia.org/wiki/Compilers:_Principles,_Techniques,_and_Tools http://java-source.net/open-source/parser-generators – Krab Mar 23 '11 at 05:17
  • 1
    Don't do it. Top-down, recursive descent (backtracking) parsers are easy to write and maintain. LR parsers are not, even when they are supported by parser generators. – Apalala Mar 23 '11 at 21:29

4 Answers4

3

Writing a LALR parser by hand is difficult, but it can he done. If you want to learn the theory behind constructing parsers for them by hand, consider looking into "Parsing Techniques: A Practical Guide" by Grune and Jacobs. It's an excellent book on general parsing techniques, and the chapter on LR parsing is particularly good.

If you're more interested in just getting a LALR parser that is written in Java, consider looking into Java CUP, which is a general purpose parser generator for Java.

Hope this helps!

templatetypedef
  • 362,284
  • 104
  • 897
  • 1,065
  • 1
    Actually, an LALR *parser* engine (to intepret the tables, do reductions,etc.) is pretty easy to write. An LALR parser *generator* to produced the parse tables is pretty hard to write, and you certainly don't want to simulate it by hand. If OP's task is to simply parse a grammar he has, using an existing generator system is by far the easiest. Agreed, Grune's books are pretty good. – Ira Baxter Mar 24 '11 at 22:09
  • I totally agree that using a parser generator is the way to go here. From what I remember, though building LALR tables is really hard because any interesting grammar is bound to have a huge number of states in it, even if those states aren't very complicated. I once tried building a parser for simple regular expressions this way and it took hours of effort. – templatetypedef Mar 24 '11 at 22:22
1

You can split the LALR functionality in two parts: preparation of the tables and parsing the input.

The first part is complex and errorprone, so even if you like knowing how it works I suggest to use a proven working table generator for the LALR states (and for the tokenizer DFA as well).

The second part consists of consuming those tables using some quite simple algorithms to tokenize and process the input into a parse tree/concrete syntax tree. This is easier to implement yourself if you like to do so, and you still have full control over how it works and what it does.

When doing parsing tasks, I personally use the free GOLD Parsing System, which has a nice UI for creating and debugging the grammar and it does also generate table files which can then be loaded and processed by an existing engine or your own implementation (the file format for these CGT files is well documented).

Lucero
  • 59,176
  • 9
  • 122
  • 152
1

As previously stated, you would always use a parser-generator to produce an LALAR parser. A few such tools for Java are:

R Campbell
  • 160
  • 1
  • 8
0

Just want to mention that my project CookCC ( http://coconut2015.github.io/cookcc/ ) is a LALR(1) parser + Lexer (much like flex).

The unique feature of CookCC is that you can write your lexer and parser in Java using Java annotations. See the calculator example here: https://github.com/coconut2015/cookcc/blob/master/tests/javaap/calc/Calculator.java

user1456982
  • 125
  • 6