4

I'm working on reading data from an excel document that gets updated on a bi-weekly basis with around 50,000 rows of data, and will probably reach about 120,000 before beginning a new sheet. I am using Apache POI to get the data. I get this exception below, but I believe the most important one exception is Caused by: java.lang.OutOfMemoryError: Java heap space. I've tested my code on a different excel sheet with less data and was able to read the data.

If you have any methods or suggestions on how to read a large excel document please share.

org.apache.poi.POIXMLException: java.lang.reflect.InvocationTargetException
    at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:62)
    at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:456)
    at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:461)
    at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:461)
    at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:461)
    at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:162)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:271)
    at excelreader.readExcelFile(excelreader.java:28)
    at excelreader.main(excelreader.java:18)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140)
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:60)
    ... 13 more
Caused by: java.lang.OutOfMemoryError: Java heap space
    at org.apache.xmlbeans.impl.store.Cur.createElementXobj(Cur.java:260)
    at org.apache.xmlbeans.impl.store.Cur$CurLoadContext.startElement(Cur.java:2995)
    at org.apache.xmlbeans.impl.store.Locale$SaxHandler.startElement(Locale.java:3207)
    at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.reportStartTag(Piccolo.java:1082)
    at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseAttributesNS(PiccoloLexer.java:1822)
    at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseOpenTagNS(PiccoloLexer.java:1521)
    at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.parseTagNS(PiccoloLexer.java:1362)
    at org.apache.xmlbeans.impl.piccolo.xml.PiccoloLexer.yylex(PiccoloLexer.java:4678)
    at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yylex(Piccolo.java:1290)
    at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.yyparse(Piccolo.java:1400)
    at org.apache.xmlbeans.impl.piccolo.xml.Piccolo.parse(Piccolo.java:714)
    at org.apache.xmlbeans.impl.store.Locale$SaxLoader.load(Locale.java:3454)
    at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1276)
    at org.apache.xmlbeans.impl.store.Locale.parseToXmlObject(Locale.java:1263)
    at org.apache.xmlbeans.impl.schema.SchemaTypeLoaderBase.parse(SchemaTypeLoaderBase.java:345)
    at org.openxmlformats.schemas.spreadsheetml.x2006.main.CTPivotCacheRecords$Factory.parse(Unknown Source)
    at org.apache.poi.xssf.usermodel.XSSFPivotCacheRecords.readFrom(XSSFPivotCacheRecords.java:62)
    at org.apache.poi.xssf.usermodel.XSSFPivotCacheRecords.<init>(XSSFPivotCacheRecords.java:53)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:422)
    at org.apache.poi.xssf.usermodel.XSSFFactory.createDocumentPart(XSSFFactory.java:60)
    at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:456)
    at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:461)
    at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:461)
    at org.apache.poi.POIXMLDocumentPart.read(POIXMLDocumentPart.java:461)
    at org.apache.poi.POIXMLDocument.load(POIXMLDocument.java:162)
    at org.apache.poi.xssf.usermodel.XSSFWorkbook.<init>(XSSFWorkbook.java:271)
    at excelreader.readExcelFile(excelreader.java:28)
    at excelreader.main(excelreader.java:18)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
trincot
  • 317,000
  • 35
  • 244
  • 286
Allee Clark
  • 149
  • 1
  • 14
  • 3
    Well... have you tried raising the maximum heap allocated? – Mena Jul 02 '15 at 13:24
  • Yes, I am on Java 32-bit. I raised it to 512M and still got an the same exception. – Allee Clark Jul 02 '15 at 13:33
  • You can set the max up to 4GB on 32 bit. Tweak a little more? – Mena Jul 02 '15 at 13:37
  • @Mena I think this issue is because of memory leakage if its not working with 1GB of heap size so I dont think it will help much to just increase the heap size because of.. http://www.azulsystems.com/technology/java-heap-size and http://javarevisited.blogspot.in/2011/05/java-heap-space-memory-size-jvm.html – Viraj Nalawade Jul 02 '15 at 13:40
  • @VirajNalawade there could be memory leaks, but it'd be very hard to help the OP with that, since they'd need to scan the entire application. I would be skeptical to hear there are memory leaks in apache XmlBeans but you never know - a look at the versions and bug fixes might help. The easiest way to play with this, to **start** with, is still raising the heap to a higher level - we're talking large documents here after all. – Mena Jul 02 '15 at 13:45
  • 1
    @Mena That was very hepful.. thanks for the _detailed_ explanation – Viraj Nalawade Jul 02 '15 at 13:49

3 Answers3

2

1 .Easy way to solve OutOfMemoryError in java is to increase the maximum heap size by using JVM options "-Xmx512M", this will immediately solve your OutOfMemoryError.

The Java virtual machine (JVM) runs with a fixed upper memory limit, which you can modify thus:

-Xms<size> - Set initial Java heap size
-Xmx<size> - Set maximum Java heap size

$ java -Xms512m -Xmx1024m JavaApp

2.Second way to resolve OutOfMemoryError in Java is rather hard and comes when you don't have much memory and even after increase maximum heap size you are still getting java.lang.OutOfMemoryError, in this case you probably want to profile your application and look for any memory leak.

So to check memory leask you can use some tools.Its very difficult without it. Some of popular one are:

Jmap is a command line utility comes with JDK6 and allows you to take a memory dump of heap in a file. It’s easy to use as shwon below:

jmap -dump:format=b,file=heapdump 6054

Eclipse memory analyzer (MAT)
is a tool from eclipse foundation to analyze java heap dump. It helps to find classloader leaks and memory leaks and helps to minimize memory consumption.you can use MAT to analyze heap dump carrying millions of object and it also helps you to extract suspect of memory leak.

Community
  • 1
  • 1
Viraj Nalawade
  • 3,137
  • 3
  • 28
  • 44
1

It does have to do with heap size. I set it to 2g and still getting out of memory erroy. The excel document I am working with is in a pivot table, which Apache POI has limitations for. Which causes the first exception

Allee Clark
  • 149
  • 1
  • 14
0

try java.exe -Xmx512m -jar yourjar.jar

It will set maximum heap size to 512 MB. Or try bigger values if 512 is not enough.

talex
  • 17,973
  • 3
  • 29
  • 66
chris
  • 1,685
  • 3
  • 18
  • 28
  • I increased the space before and still the same error. Appreciate the input. – Allee Clark Jul 02 '15 at 13:36
  • @talex: Thx for the edit. (at)Allee: what about the progress? Does the app run longer or does it even affect somehow? Try using java 64 bit and setting -Xmx2048m. Does this change something? – chris Jul 02 '15 at 15:40
  • 2
    @AlleeClark 50.000 rows will use lots of memory. try to put maximum value (for 32bit version of java it is somewhere between 1.5 and 2 GB) – talex Jul 02 '15 at 15:57
  • It's compiling at the same speed. I believe its because I was reading from a pivot table. I gave 2 GB a try and still received the same exception. – Allee Clark Jul 02 '15 at 16:59
  • Ok, so i would - to get any further - reduce the data size. After that you may be able to estimate what's possible. And what not. And maybe you have to choose another strategy to walk on... – chris Jul 03 '15 at 05:03