
I am trying to parse a log file with ANTLR version 4. The file currently consists of 10703 lines and can grow to millions of lines. Java throws the heap exception shown below. When I reduced the file to 300 lines, the program ran successfully, but at 400 lines it started throwing the heap exception again. I have also increased the Java heap size, but no luck. I don't know whether the problem lies with Java or with ANTLR.

Here is the program that runs the grammar:

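import java.io.File;
import java.io.FileInputStream;
import java.io.IOException;

// Assumes the ANTLR 4 runtime named in the question (package org.antlr.v4.runtime)
import org.antlr.v4.runtime.ANTLRInputStream;
import org.antlr.v4.runtime.CommonTokenStream;
import org.antlr.v4.runtime.RecognitionException;
import org.antlr.v4.runtime.TokenStream;

public class parser {
    public static void main(String[] args) {
        System.out.println("Start");
        String fileName = "D:\\folder\\logs.out";
        File file = new File(fileName);

        // Open the input file; try-with-resources closes it automatically
        try (FileInputStream fis = new FileInputStream(file)) {
            // Create a CharStream that reads the whole file into memory
            ANTLRInputStream input = new ANTLRInputStream(fis);

            GrammarOSBLexer lexer = new GrammarOSBLexer(input);

            // CommonTokenStream keeps every token it fetches in a list
            TokenStream tokenStream = new CommonTokenStream(lexer);
            GrammarOSBParser parsr = new GrammarOSBParser(tokenStream);

            try {
                // Start parsing at the top-level rule
                parsr.logs();
            } catch (RecognitionException e) {
                e.printStackTrace();
            }

            System.out.println("done!");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}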


Here is the exception:

Exception in thread "main" java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.eclipse.jdt.internal.jarinjarloader.JarRsrcLoader.main(JarRsrcLoader.java:58)
Caused by: java.lang.OutOfMemoryError: Java heap space
    at java.util.Arrays.copyOf(Unknown Source)
    at java.util.Arrays.copyOf(Unknown Source)
    at java.util.ArrayList.grow(Unknown Source)
    at java.util.ArrayList.ensureExplicitCapacity(Unknown Source)
    at java.util.ArrayList.ensureCapacityInternal(Unknown Source)
    at java.util.ArrayList.add(Unknown Source)
    at org.antlr.runtime.BufferedTokenStream.fetch(BufferedTokenStream.java:146)
    at org.antlr.runtime.BufferedTokenStream.sync(BufferedTokenStream.java:137)
    at org.antlr.runtime.CommonTokenStream.consume(CommonTokenStream.java:68)
    at org.antlr.runtime.BaseRecognizer.match(BaseRecognizer.java:106)
    at com.javadude.antlr.tutorial.GrammarOSBParser.string(GrammarOSBParser.java:1269)
    at com.javadude.antlr.tutorial.GrammarOSBParser.random_messageText(GrammarOSBParser.java:350)
    at com.javadude.antlr.tutorial.GrammarOSBParser.messageTextTag(GrammarOSBParser.java:219)
    at com.javadude.antlr.tutorial.GrammarOSBParser.log(GrammarOSBParser.java:173)
    at com.javadude.antlr.tutorial.GrammarOSBParser.logs(GrammarOSBParser.java:111)
    at com.javadude.antlr.tutorial.parser.main(parser.java:91)
    ... 5 more


Rabia Naz khan

1 Answer


It looks like buffering all tokens in memory will not work well for input of this size, so you should look for a solution that avoids the buffering. ANTLR 4 comes with an UnbufferedTokenStream class that is a much better fit for you. See also this question for a discussion of how to use such a stream and what drawbacks it has (Sam Harwell and Terence Parr contributed there).
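To make the difference concrete, here is a minimal, library-free sketch of the idea behind an unbuffered stream: instead of appending every token to an ever-growing list (the `ArrayList.add` call in the stack trace), only a small fixed window of recent tokens is kept. The class `SlidingTokenWindow`, its methods, and the whitespace tokenization are hypothetical and only for illustration; this is not ANTLR's implementation.

```java
import java.io.StringReader;
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Scanner;

/** Hypothetical illustration of bounded token buffering; not part of ANTLR. */
final class SlidingTokenWindow {
    private final Scanner scanner;                 // pulls tokens from the source on demand
    private final int capacity;                    // maximum number of tokens retained
    private final Deque<String> window = new ArrayDeque<>();
    private long consumed;                         // total tokens handed out so far

    SlidingTokenWindow(Readable source, int capacity) {
        if (capacity < 1) {
            throw new IllegalArgumentException("capacity must be >= 1");
        }
        this.scanner = new Scanner(source);
        this.capacity = capacity;
    }

    /** Returns the next whitespace-separated token, or null at end of input. */
    String next() {
        if (!scanner.hasNext()) {
            return null;
        }
        String token = scanner.next();
        if (window.size() == capacity) {
            window.removeFirst();                  // drop the oldest token instead of growing
        }
        window.addLast(token);
        consumed++;
        return token;
    }

    /** Number of tokens currently retained in the window. */
    int held() {
        return window.size();
    }

    /** Total number of tokens returned by next() so far. */
    long consumed() {
        return consumed;
    }

    public static void main(String[] args) {
        // Build a small synthetic "log" so the demo needs no external file.
        StringBuilder log = new StringBuilder();
        for (int i = 0; i < 1000; i++) {
            log.append("INFO message ").append(i).append('\n');
        }
        SlidingTokenWindow tokens =
                new SlidingTokenWindow(new StringReader(log.toString()), 4);
        while (tokens.next() != null) {
            // A real consumer would act on each token here.
        }
        System.out.println("consumed=" + tokens.consumed() + ", held=" + tokens.held());
    }
}
```

With ANTLR 4 itself, the corresponding wiring would roughly be an `UnbufferedCharStream` over the file, `lexer.setTokenFactory(new CommonTokenFactory(true))` so that tokens carry their own text, `new UnbufferedTokenStream<CommonToken>(lexer)` in place of `CommonTokenStream`, and `parser.setBuildParseTree(false)` so the parse tree does not hold on to every token. The trade-off is that anything needing the full token list or the complete tree afterwards is no longer available.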

Mike Lischke