I'm trying to read each file in a directory that contains 4000 json.gz files. I'm running out of heap space during execution. I'm unsure how to solve this.

        File folder = new File(directoryPath);
        String[] files = folder.list();

        // list() returns null if directoryPath is not a directory or cannot be read
        assert files != null;

        for (String file : files) {
            String filePath = directoryPath + "/" + file;

            // Only handle files whose names end in "gz" (case-insensitive)
            if (filePath.substring(filePath.length() - 2).equalsIgnoreCase("gz")) {

                try (GZIPInputStream gzipInputStream = new GZIPInputStream(new FileInputStream(filePath))) {
                    InputStreamReader reader = new InputStreamReader(gzipInputStream);

                    Object obj = jsonParser.parse(reader);

                    // Every parsed model is kept in the list for the whole run
                    TextFileModel textFileModel = processObject(obj, controller);
                    textFileModelList.add(textFileModel);

                } catch (IOException | ParseException e) {
                    e.printStackTrace();
                }
            }
        }

UPDATE:

I've tried a few of the suggestions, but I'm still getting the same error:

        try (InputStreamReader reader = new InputStreamReader(new GZIPInputStream(new FileInputStream(filePath)))) {

            Object obj = jsonParser.parse(reader);

            reader.close();

            TextFileModel textFileModel = processObject(obj, controller);
            obj = null;
            textFileModelList.add(textFileModel);

        } catch (IOException | ParseException e) {
            e.printStackTrace();
        }

Error Message:

Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
    at org.json.simple.parser.Yylex.yylex(Yylex.java:668)
    at org.json.simple.parser.JSONParser.nextToken(JSONParser.java:269)
    at org.json.simple.parser.JSONParser.parse(JSONParser.java:118)
    at org.json.simple.parser.JSONParser.parse(JSONParser.java:92)
    at com.sbl.Main.main(Main.java:45)
Sean
  • Does this answer your question? [How to deal with "java.lang.OutOfMemoryError: Java heap space" error?](https://stackoverflow.com/questions/37335/how-to-deal-with-java-lang-outofmemoryerror-java-heap-space-error) – Turamarth Jun 04 '21 at 14:25
  • Do you need to keep all those objects in memory (simultaneously)? If yes -> increase heap. If not -> dereference all objects that are no longer needed – Dietmar Höhmann Jun 04 '21 at 14:26
  • How do you dereference input streams (reader)? – Sean Jun 04 '21 at 14:39
  • You should close the stream when you're finished with it. – NomadMaker Jun 04 '21 at 15:13

2 Answers

Try to process the files as you read them, perhaps writing them to an alternate file or directory, rather than holding every result in memory. You can also raise the heap size using one of the java -Xm* options, assuming your platform has sufficient memory to support it.

java -Xmn<size>        sets the initial and maximum size (in bytes) of the heap
                      for the young generation (nursery)
java -Xms<size>        set initial Java heap size
java -Xmx<size>        set maximum Java heap size

If you're using an IDE, you should be able to configure it there as well. In Eclipse, you can edit the Run Configuration for your application and set the options there.
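
As a rough illustration of the "process as you read" suggestion, here is a minimal sketch that handles one .gz file at a time and writes its result to an output directory immediately, so nothing accumulates across the 4000 files. The class name StreamingGzProcessor, the summarize method, and the ".summary.txt" naming are all made up for this example. summarize only counts decompressed characters and is a stand-in for your own parsing and processObject call, which I can't see.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPInputStream;

public class StreamingGzProcessor {

    /**
     * Hypothetical stand-in for the real per-file work (parse + processObject).
     * It reads the decompressed text in 8 KiB chunks and returns the character count,
     * so the whole file is never held in memory at once.
     */
    public static long summarize(Path gzFile) throws IOException {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(gzFile)), StandardCharsets.UTF_8))) {
            char[] buffer = new char[8192];
            long total = 0;
            int read;
            while ((read = reader.read(buffer)) != -1) {
                total += read;
            }
            return total;
        }
    }

    /**
     * Processes every file matching *.gz in inputDir and writes each result to
     * outputDir right away instead of adding it to a list.
     * Returns the number of files handled.
     */
    public static int processDirectory(Path inputDir, Path outputDir) throws IOException {
        Files.createDirectories(outputDir);
        int count = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(inputDir, "*.gz")) {
            for (Path gzFile : files) {
                long chars = summarize(gzFile);
                // Assumed naming scheme: <original name>.summary.txt
                Path out = outputDir.resolve(gzFile.getFileName() + ".summary.txt");
                try (BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
                    writer.write(gzFile.getFileName() + ": " + chars + " characters");
                    writer.newLine();
                }
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // args[0] = input directory, args[1] = output directory
        int handled = processDirectory(Paths.get(args[0]), Paths.get(args[1]));
        System.out.println("Processed " + handled + " files");
    }
}
```

The point of the structure is that each result goes out of scope at the end of its loop iteration. If a single file is too large to parse on its own, this won't help by itself and you would still need a bigger heap or a streaming parser.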

WJS

In general, to increase the heap memory, you can specify the -Xmx option to set the maximum heap size available to the JVM and -Xms to set the initial (minimum) heap size.

You can set these as specified in the answer here - https://stackoverflow.com/a/14763095/8175739
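
To check whether the option actually reached the JVM, something like the following may help. It is only a sketch: HeapInfo and toMiB are names I made up, and the value reported by Runtime.maxMemory() can be somewhat lower than the -Xmx figure you passed, depending on the JVM and garbage collector.

```java
public class HeapInfo {

    /** Converts a byte count to whole mebibytes (rounds down). */
    public static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        // maxMemory: upper limit the JVM will try to use for the heap
        System.out.println("Max heap (MiB):   " + toMiB(rt.maxMemory()));
        // totalMemory: heap currently reserved by the JVM
        System.out.println("Total heap (MiB): " + toMiB(rt.totalMemory()));
        // freeMemory: unused part of the currently reserved heap
        System.out.println("Free heap (MiB):  " + toMiB(rt.freeMemory()));
    }
}
```

Run it with the same options as your application, for example java -Xmx2g HeapInfo, and compare the first line with what you expected.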

Also, try defining the InputStreamReader in the try block so it can be closed too:

try (GZIPInputStream gzipInputStream = new GZIPInputStream(new FileInputStream(filePath));
     InputStreamReader reader = new InputStreamReader(gzipInputStream)) {
    Object obj = jsonParser.parse(reader);
    ...
}
Yatharth Ranjan