0

I'm working on a web application using Java, where I have a method that should read a .xlsx file using apache-poi:

public static void xlsx(String arquivo) throws IOException{

     try {
        FileInputStream file = new FileInputStream(new File(arquivo));
        XSSFWorkbook workbook = new XSSFWorkbook(file);
        XSSFSheet sheet = workbook.getSheetAt(0);
        Iterator<Row> rowIterator = sheet.iterator();
        while (rowIterator.hasNext()) {
            Row row = rowIterator.next();
            Iterator<Cell> cellIterator = row.cellIterator();
            while (cellIterator.hasNext()) {
                Cell celula = cellIterator.next();
                /*here do the reading for each cell,*/
            }
        }
        file.close();

    } catch (IOException e) {
        e.printStackTrace();
        throw new IOException("Erro ao processar arquivo.",e.getCause());
    }
}

The method works correctly, however how likely this method will process files with thousands of lines of records, for example, about 25-300 thousand lines. When processing a large file I take the following exception:

(http-localhost-127.0.0.1-8080-4) Servlet.service() for servlet RestServlet threw exception: org.jboss.resteasy.spi.UnhandledException: java.lang.OutOfMemoryError: Java heap space

I need to know how can I avoid this type of error. If you have for example, read and process the file .xlsx 1000 to 1000 lines, or some other solution.

Erico Souza
  • 276
  • 4
  • 18
  • Did you try the [Apache POI SAX interface for .xlsx files](http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api)? From the [docs](http://poi.apache.org/spreadsheet/how-to.html#xssf_sax_api) `If memory footprint is an issue, then for XSSF, you can get at the underlying XML data, and process it yourself` – Gagravarr Jul 31 '14 at 15:49
  • thanks for help @Gagravarr, I will make these changes. – Erico Souza Jul 31 '14 at 16:51
  • I know that the developer doesn't often get a lot of input into the process, but it needs to be said: a 300k line spreadsheet should be transitioned into a database ASAP. – Gus Jul 31 '14 at 17:05
  • thanks for help @Gus but I don't know "database ASAP". you have any reference, aboaut database ASAP for me? – Erico Souza Jul 31 '14 at 18:08
  • ASAP is an acronym meaning "As Soon As Possible". – Gus Jul 31 '14 at 18:13
  • thanks for help dude. @Gus, you have any idea that I can do it? – Erico Souza Jul 31 '14 at 19:48

1 Answers1

1

If you want to keep existing implementation, simply increase JVM maximum heap size. Please see How to increase heap size for jBoss server for instructions and a related discussion.

Community
  • 1
  • 1
  • Thanks for help @VasylTrtiakov, but another problem I have is that my development machine is low on memory. Only 4gb. – Erico Souza Jul 31 '14 at 18:10
  • Then you may want to consider processing that large spreadsheet in parts saving intermediate results in some place (file, database, etc.) if that approach is applicable to your processing algorithm. – Vasyl Tretiakov Jul 31 '14 at 18:32