2

I am using Apache POI to read excel file. My file is about 100MB, and i got this java.lang.OutOfMemoryError: Java heap while loading workbook

XSSFWorkbook workbook = new XSSFWorkbook(excelFilePath);

My computer RAM is 8GB so its can't afford to use -Xmx too large. So my question is how can i handle this error?

EDIT

Samples of my data (My data is about 1087490 rows):

enter image description here

Fliko Genso
  • 35
  • 1
  • 11
  • 2
    For example by doing some experimenting? We can't tell you which number for -Xmx will work for **your** machine and **your** workload. You on the other hand can easily test that yourself?! – GhostCat Dec 19 '17 at 09:12
  • What data makes your excel file 100 mb, is it text or image or what..be specific – KishanCS Dec 19 '17 at 09:13
  • I already experimented it. its look like my machine cannot afford this XSSFWorkbook but is there any other library or ways to make my machine can read 100MB excel file? – Fliko Genso Dec 19 '17 at 09:14
  • What does `jvisualvm` say about this? – Tamas Rev Dec 19 '17 at 09:19
  • In the long past I used a JDBC-ODBC bridge too (one needs to create an ODBC data source). It no longer is supported for java 8. There might be some alternative, or you would need to import the data with java 7, say into an embedded database like H2. – Joop Eggen Dec 19 '17 at 10:18

1 Answers1

5

First of all - you have to make sure that your files are within the well defined limitations of the tool - see here.

Beyond that, it seems to be a pretty common problem to run out of memory for larger files, and one solution is to process XML data "manually":

If memory footprint is an issue, then for XSSF, you can get at the underlying XML data, and process it yourself. This is intended for intermediate developers who are willing to learn a little bit of low level structure of .xlsx files, and who are happy processing XML in java. Its relatively simple to use, but requires a basic understanding of the file structure. The advantage provided is that you can read a XLSX file with a relatively small memory footprint.

( quoted from here )

GhostCat
  • 137,827
  • 25
  • 176
  • 248
  • 1
    Also good to mention in this case are the [default limitations](https://poi.apache.org/spreadsheet/limitations.html) – XtremeBaumer Dec 19 '17 at 09:17
  • My data is about 1087490 records (rows). i will try test this solution – Fliko Genso Dec 19 '17 at 09:21
  • 2
    The [SXSSF (Streaming Usermodel API)](https://poi.apache.org/spreadsheet/how-to.html#SXSSF+%28Streaming+Usermodel+API%29) can only be used "when very large spreadsheets have to be **produced**". For **reading** huge files [XSSF and SAX (Event API)](https://poi.apache.org/spreadsheet/how-to.html#XSSF+and+SAX+%28Event+API%29) is needed instead. – Axel Richter Dec 19 '17 at 10:02
  • @AxelRichter Thanks, too ;-) ... will show my gratitude elsewhere – GhostCat Dec 19 '17 at 10:05
  • 1
    Sory for late reply. Thanks a lot! Its working! but little bit slow, maybe because too many data – Fliko Genso Dec 21 '17 at 01:03