I have developed a working translator that uses an ANTLR4 grammar and parse tree listeners. I'm very pleased with how quickly ANTLR got me to a working result on this project, and I have just started comparing the binary output of my ANTLR4-based solution against a slower legacy C++ code base. (My solution is designed for speed, so even though it is implemented in Java, it could end up being the faster of the two.)
However, when I started testing it with larger ASCII input files (110 MB), I ran out of heap space. This occurs during the ANTLRInputStream instantiation, which I believe I can fix by using UnbufferedCharStream and UnbufferedTokenStream instead. This Stack Overflow question also suggests that parse tree generation should be turned off, because the parse tree consumes a significant amount of memory.
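For a rough sense of scale, here is a back-of-the-envelope sketch. It assumes the input stream holds the entire file in a Java char array at 2 bytes per character, so an ASCII file roughly doubles in size once loaded. The class and method names are my own and purely illustrative; they are not part of ANTLR.

```java
// Rough estimate only; HeapEstimate and its members are illustrative names.
class HeapEstimate {
    // A Java char is a 16-bit UTF-16 code unit, i.e. 2 bytes.
    static final int BYTES_PER_CHAR = 2;

    // Assumption: every ASCII byte in the file becomes one char in memory.
    static long estimateCharArrayBytes(long asciiFileBytes) {
        return asciiFileBytes * BYTES_PER_CHAR;
    }

    public static void main(String[] args) {
        long fileBytes = 110L * 1024 * 1024; // the 110 MB test file
        long needed = estimateCharArrayBytes(fileBytes);
        long maxHeap = Runtime.getRuntime().maxMemory();
        long mb = 1024L * 1024;
        System.out.printf("char[] for input: ~%d MB, max heap: %d MB%n",
                needed / mb, maxHeap / mb);
    }
}
```

This only counts the character buffer. Any copying while the buffer grows, plus the tokens and the parse tree, would come on top of that, which is why I expect a 1 GB input to be out of reach with a fully buffered approach.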
If I turn off parse tree generation, my parse tree listeners won't be called; at least, that's how I understand it. On the other hand, I suspect I won't be able to translate 1 GB files with parse tree generation turned on. What is the solution?
I'd like to avoid moving my ParseTreeListener code to the grammar files if I can.