I want to write a translator between two languages, and after some reading on the Internet I've decided to go with ANTLR. I had to learn it from scratch, but apart from some trouble eliminating left recursion, everything has gone fine so far.

However, today someone told me to check out Happy, a Haskell-based parser generator. I have no Haskell knowledge, so I could use some advice on whether Happy is indeed better than ANTLR and whether it's worth learning.

Specifically, what concerns me is that my translator needs to support macro substitution, which I don't yet know how to do in ANTLR. Maybe this is easier to do in Happy?

Or if you think other parser generators are even better, I'd be glad to hear about them.

Gabriel
  • If you are able to say, the most useful piece of information you could provide right now is an answer to "What are the source and target languages?" – Sam Harwell Sep 03 '09 at 03:09
  • @280Z28 They are in-house created languages. They are somewhat similar to Java, with the difference that a class can contain macro definitions and then inside the methods the macros need to be expanded. – Gabriel Sep 04 '09 at 07:04
  • Meanwhile I figured out that my problem is simpler than I initially thought. I managed to do it with ANTLR, in the lexer, so no need to urgently learn Happy or other generator now. – Gabriel Sep 04 '09 at 07:06
  • Well, it would have been great if you had shared your solution or at least a link to how you got on the track of using the lexer to do this. – Engineer Apr 17 '16 at 06:40

1 Answer

People keep believing that if they just get a parser, they've got it made when building language tools. That's just wrong. Parsers get you to the foothills of the Himalayas; then you need to start climbing seriously.

If you want industrial-strength support for building language translators, see our DMS Software Reengineering Toolkit. DMS provides

  • Unicode-based lexers
  • full context-free parsers (left recursion? No problem! Arbitrary lookahead? No problem. Ambiguous grammars? No problem)
  • full front ends for C, C#, COBOL, Java, C++, JavaScript, ... (including full preprocessors for C and C++)
  • automatic construction of ASTs
  • support for building symbol tables with arbitrary scoping rules
  • attribute grammar evaluation, to build analyzers that leverage the tree structure
  • support for control and data flow analysis (as well as realization of this for full C, Java and COBOL)
  • source-to-source transformations using the syntax of the source AND the target language
  • AST to source code prettyprinting, to reproduce target language text

Regarding the OP's request to handle macros: our C, COBOL and C++ front ends handle their respective language preprocessing by a) the traditional method of full expansion or b) non-expansion (where practical) to enable post-parsing transformation of the macros themselves. While DMS as a foundation doesn't specifically implement macro processing, it can support the construction and transformation of same.
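
To make the "full expansion" option concrete, here is a minimal sketch of token-level macro substitution. It is only an illustration under stated assumptions, not how DMS or ANTLR implement it: the `tokenize` and `expand` helpers, the macro table, and the restriction to parameterless (object-like) macros are all hypothetical.

```python
import re

def tokenize(text):
    """Split text into identifiers, integers, and single non-space symbols."""
    return re.findall(r"[A-Za-z_]\w*|\d+|\S", text)

def expand(tokens, macros, active=frozenset()):
    """Fully expand parameterless macros in a token list.

    macros maps a macro name to its replacement token list (hypothetical format).
    active holds the macros currently being expanded, so a macro that mentions
    itself is left unexpanded instead of recursing forever.
    """
    out = []
    for tok in tokens:
        if tok in macros and tok not in active:
            # Rescan the replacement so nested macros are expanded too.
            out.extend(expand(macros[tok], macros, active | {tok}))
        else:
            out.append(tok)
    return out

# Hypothetical macro definitions, as they might be collected from a class body.
macros = {
    "MAX": tokenize("100"),
    "LIMIT": tokenize("MAX * 2"),
    "LOOP": tokenize("LOOP + 1"),  # self-referential on purpose
}

expanded = expand(tokenize("x = LIMIT + LOOP;"), macros)
print(" ".join(expanded))
```

The expanded token list would then be handed to the parser. The non-expansion option (b) would instead keep the macro invocation as a node in the tree, which this sketch does not attempt.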

As an example of a translator built with DMS, see the discussion of converting JOVIAL to C for the B-2 bomber. This was a 100% translation of > 1 MSLOC of hard real-time code. [It may amuse you to know that we were never allowed to see the actual program being translated (top secret).] And yes, JOVIAL has a preprocessor, and yes, we translated most JOVIAL macros into equivalent C versions.

[Haskell is a cool programming language, but it doesn't do anything like this by itself. This isn't about what's expressible in the language. It's about figuring out what machinery is required to support the task of manipulating programs, and spending 100 man-years building it.]

Ira Baxter
  • @Ira Baxter - it's a small world, you are walking distance from me. :o – Sam Harwell Sep 03 '09 at 03:05
  • Oops, hit the "up" button on "this is a great comment". You benefit from my hiccup. Find my email address from my user registration page and send me an introductory note; might be some fun conversation in here. – Ira Baxter Sep 03 '09 at 03:36
  • This is awesome. However I assume you can't find anything like this in the open source community. – Gabriel Sep 04 '09 at 07:08
  • Other program transformation systems: TXL is free but I don't think open source. Stratego is probably both. Both have pretty strong parsing technology. Neither directly supports building symbol tables, doing attribute grammars or doing control/data flow analysis. Dunno about Unicode. YMMV. – Ira Baxter Sep 06 '09 at 08:53