Antlr for multiple language generation

Question

This post about the ANTLR simple example shows how to create and use a grammar for Java.

However, this intermixes the grammar and the Java source code in the Exp.g source.

My question is: is it possible to decouple the grammar file from the target language, so that a single grammar file can be used to generate lexers/parsers for multiple targets (Java, Scala, C++, etc.)?

In general, .g grammars are indeed decoupled from the target language, which seems to be the case with this Exp.g. There are, for example, targets for C#: https://github.com/tunnelvisionlabs/antlr4cs, or C++: http://www.soft-gems.net/index.php/tools/49-the-antlr4-c-target-is-here . Writing a target other than Java is outside the ANTLR project itself, I believe — Simon Mourier, Sep 06 '17 at 21:23
@SimonMourier - I think you misinterpret the question. The referenced Exp.g IS intermixed with Java. It is that which I am trying to avoid. — NWS, Sep 07 '17 at 18:30

score 2 · Accepted Answer · answered Sep 07 '17 at 06:44

It depends mostly on why target code is used in the grammar. If it is only action code that does something with the found tokens (e.g. building a symbol table or an alternative tree representation), then it is indeed no problem to remove such native code and do the processing afterwards (using a parse tree walker or visitor).
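
To illustrate the first point, here is a minimal Java sketch of doing the processing after parsing instead of inside grammar actions. It assumes Exp.g describes simple arithmetic and has had its embedded actions removed. `Node`, `Num`, `BinOp`, `Visitor` and `EvalVisitor` are hypothetical stand-ins so the example runs without the ANTLR runtime; with the ANTLR 4 Java target, the tree classes and a base visitor would typically be generated by the tool, and only the evaluator would be written by hand.

```java
// Hypothetical stand-ins for a parse tree and its visitor; this is NOT ANTLR API.
interface Visitor<T> {
    T visitNum(Num n);
    T visitBinOp(BinOp b);
}

interface Node {
    <T> T accept(Visitor<T> v);
}

final class Num implements Node {
    final double value;
    Num(double value) { this.value = value; }
    @Override public <T> T accept(Visitor<T> v) { return v.visitNum(this); }
}

final class BinOp implements Node {
    final char op;
    final Node left;
    final Node right;
    BinOp(char op, Node left, Node right) {
        this.op = op;
        this.left = left;
        this.right = right;
    }
    @Override public <T> T accept(Visitor<T> v) { return v.visitBinOp(this); }
}

// All target-language logic lives here, outside the grammar.
final class EvalVisitor implements Visitor<Double> {
    @Override public Double visitNum(Num n) { return n.value; }

    @Override public Double visitBinOp(BinOp b) {
        double l = b.left.accept(this);
        double r = b.right.accept(this);
        switch (b.op) {
            case '+': return l + r;
            case '-': return l - r;
            case '*': return l * r;
            case '/': return l / r;
            default: throw new IllegalArgumentException("unknown operator: " + b.op);
        }
    }
}

class ExpDemo {
    public static void main(String[] args) {
        // Hand-built tree for 1 + 2 * 3; a real parser would produce this.
        Node tree = new BinOp('+', new Num(1), new BinOp('*', new Num(2), new Num(3)));
        double result = tree.accept(new EvalVisitor());
        System.out.println(result);
    }
}
```

Only the evaluator is tied to Java here. A Scala or C++ consumer would supply its own equivalent while the grammar stays untouched.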

However, predicates are different. They are used to guide the parser and also require native code. What you can do is move all the native code into a base class from which your generated parser derives. You then only need to rewrite this base class in each target language and can keep the grammar mostly free of native code (except for a single function call, which invokes the native code).

This approach has the advantage that no additional library reference is necessary (#include in C/C++, import in other languages), which is itself native code that would prevent use with multiple targets.
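
The base-class idea could look roughly like the Java sketch below. The names `ExpParserBase`, `isIdentifierAllowed`, `nextTokenText` and the reserved-word list are all hypothetical, and the classes are stand-ins so the example compiles without the ANTLR runtime. In a real ANTLR 4 project the base class would extend `org.antlr.v4.runtime.Parser`, the grammar would name it with `options { superClass = ExpParserBase; }`, and `ExpParser` would be generated by the tool.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Grammar side (ANTLR 4), shown as a comment because it is not Java:
//   options { superClass = ExpParserBase; }
//   id : {isIdentifierAllowed()}? ID ;
// The exact call syntax inside the predicate can differ between targets.

// Hypothetical hand-written base class; one version per target language.
abstract class ExpParserBase {
    private static final Set<String> RESERVED =
            new HashSet<>(Arrays.asList("if", "else", "while"));

    // Stand-in for the lookahead that a real ANTLR parser would provide.
    protected abstract String nextTokenText();

    // The single call the grammar makes; all native logic stays in here.
    protected boolean isIdentifierAllowed() {
        return !RESERVED.contains(nextTokenText());
    }
}

// Stand-in for the generated parser, which would derive from the base class.
class ExpParser extends ExpParserBase {
    private final String lookahead;

    ExpParser(String lookahead) { this.lookahead = lookahead; }

    @Override protected String nextTokenText() { return lookahead; }

    // Mimics the generated rule code evaluating the predicate.
    boolean idRuleApplies() { return isIdentifierAllowed(); }
}

class PredicateDemo {
    public static void main(String[] args) {
        System.out.println(new ExpParser("total").idRuleApplies());
        System.out.println(new ExpParser("while").idRuleApplies());
    }
}
```

Porting to another target would then mean rewriting only `ExpParserBase` in that language, under the assumptions above.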

Moving target-dependent logic to a base class for the parser does not remove target dependency from the grammar, because the superclass will still need to be imported by the parser (in the @header section) in a target language dependent fashion. — markt1964, May 31 '20 at 03:03
