People keep believing that if they just get a parser, they've got it made
when building language tools. Thats just wrong. Parsers get you to the foothills
of the Himalayas then you need start climbing seriously.
If you want industrial-strength support for building language translators, see our
DMS Software Reengineering Toolkit. DMS provides
- Unicode-based lexers
- full context-free parsers (left recursion? No problem! Arbitrary lookahead? No problem. Ambiguous grammars? No problem)
- full front ends for C, C#, COBOL, Java, C++, JavaScript, ...
(including full preprocessors for C and C++)
- automatic construction of ASTs
- support for building symbol tables with arbitrary scoping rules
- attribute grammar evaluation, to build analyzers that leverage the tree structure
- support for control and data flow analysis (as well realization of this for full C, Java and COBOL),
- source-to-source transformations using the syntax of the source AND the target language
- AST to source code prettyprinting, to reproduce target language text
Regarding the OP's request to handle macros: our C, COBOL and C++ front ends handle their respective language preprocessing by a) the traditional method of full expansion or b) non-expansion (where practical) to enable post-parsing transformation of the macros themselves. While DMS as a foundation doesn't specifically implement macro processing, it can support the construction and transformation of same.
As an example of a translator built with DMS, see the discussion of
converting
JOVIAL to C for the B-2 bomber. This is 100% translation for > 1 MSLOC of hard
real time code. [It may amuse you to know that we were never allowed to see the actual program being translated (top secret).]. And yes, JOVIAL has a preprocessor, and yes we translated most JOVIAL macros into equivalent C versions.
[Haskell is a cool programming language but it doesn't do anything like this by itself.
This isn't about what's expressible in the language. Its about figuring out what machinery is required to support the task of manipulating programs, and
spending 100 man-years building it.]