I have a Microsoft Excel XLS file (2003 format). When I try to open it with Apache POI 3.17 (and also 4.1.2), I get the stack trace below. The same file opens without any errors in Excel 2016.

I noticed that once the file has been re-saved with Excel 2016, Apache POI can open it. Re-saving also shrinks the file from 15 MB to 11 MB.

The original file has many blank columns that extend very far to the right. Reducing the number of lines in the file eventually lets Apache POI open it, but the failure is not tied to any specific content in a line. I cannot post the original file: it contains sensitive information, and editing the file to remove that information ends up fixing the issue.

Code to open XLS file

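    import java.io.File;

    import org.apache.poi.hssf.usermodel.HSSFWorkbook;
    import org.apache.poi.poifs.filesystem.POIFSFileSystem;

    // Open the container read-only (second argument); per the stack trace, this constructor is where it fails.
    final POIFSFileSystem fs = new POIFSFileSystem(new File("/home/dev/Downloads/test.xls"), true);
    // Parse the workbook from the opened container.
    final HSSFWorkbook xlsToAppendWorkbook = new HSSFWorkbook(fs);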

Stacktrace

    java.lang.ArrayIndexOutOfBoundsException: Index -2 out of bounds for length 46410
        at org.apache.poi.poifs.filesystem.BlockStore$ChainLoopDetector.claim(BlockStore.java:99)
        at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.readCoreContents(NPOIFSFileSystem.java:414)
        at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:234)
        at org.apache.poi.poifs.filesystem.NPOIFSFileSystem.<init>(NPOIFSFileSystem.java:167)
        at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:98)
        at
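
A note for anyone who wants to inspect such a file without sharing it: the -2 in the exception is the same value the OLE2 container format uses as its end-of-chain marker (0xFFFFFFFE read as a signed int; POI names it `POIFSConstants.END_OF_CHAIN`). One possible, unconfirmed reading is that POI was told to follow a sector chain further than the chain actually goes. The sketch below is a hypothetical stand-alone helper (`Ole2HeaderCheck` and `checkDifat` are made-up names, not POI API) that uses only the JDK. It reads the first DIFAT sector and the declared DIFAT sector count from the header, at the offsets given in the MS-CFB specification, and counts how many DIFAT sectors are actually reachable. That the DIFAT chain is the broken one is an assumption; nothing in the question confirms it.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical diagnostic helper; not part of Apache POI.
class Ole2HeaderCheck {
    // OLE2 signature bytes D0 CF 11 E0 A1 B1 1A E1, read as a little-endian long.
    static final long SIGNATURE = 0xE11AB1A1E011CFD0L;

    /**
     * Returns {declared, reachable}: the DIFAT sector count stored in the header
     * and the number of DIFAT sectors found by actually following the chain.
     */
    static int[] checkDifat(byte[] file) {
        ByteBuffer buf = ByteBuffer.wrap(file).order(ByteOrder.LITTLE_ENDIAN);
        if (file.length < 512 || buf.getLong(0) != SIGNATURE) {
            throw new IllegalArgumentException("Not an OLE2 compound file");
        }
        int sectorShift = buf.getShort(0x1E); // 9 = 512-byte sectors, 12 = 4096-byte sectors
        if (sectorShift != 9 && sectorShift != 12) {
            throw new IllegalArgumentException("Unexpected sector shift: " + sectorShift);
        }
        int sectorSize = 1 << sectorShift;
        // The header occupies the first sector-sized slot, so it is not counted.
        int totalSectors = file.length / sectorSize - 1;
        int next = buf.getInt(0x44);     // first DIFAT sector
        int declared = buf.getInt(0x48); // number of DIFAT sectors

        int reachable = 0;
        // The second bound guards against a chain that loops back on itself.
        for (int i = 0; i < declared && i < totalSectors; i++) {
            // Negative values are markers (-2 = end of chain, -1 = free sector), not sector indexes.
            if (next < 0 || next >= totalSectors) {
                break;
            }
            reachable++;
            // Sector n starts at (n + 1) * sectorSize; its last 4 bytes point to the next DIFAT sector.
            long sectorStart = (next + 1L) * sectorSize;
            next = buf.getInt((int) (sectorStart + sectorSize - 4));
        }
        return new int[] {declared, reachable};
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java Ole2HeaderCheck <file.xls>");
            return;
        }
        int[] r = checkDifat(Files.readAllBytes(Paths.get(args[0])));
        System.out.println("DIFAT sectors declared: " + r[0] + ", reachable: " + r[1]);
    }
}
```

If the two numbers differ, the header declares more DIFAT sectors than the chain contains. With 512-byte sectors the header alone can address 109 × 128 × 512 = 7,143,424 bytes, so DIFAT sectors only come into play above roughly 7 MB. That would at least be consistent with smaller versions of the file opening fine, but it does not prove this is the cause.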
dukethrash
  • The error suggests that the file is at least partly corrupt - where did it come from / what generated it? – Gagravarr Jun 21 '21 at 21:01
  • A Talend job created it by converting some source csv files to xls. The xls file has 3 tabs. The problem is that the xls does open within Apache POI when the number of lines in the original source csv is reduced before it gets converted to xls. I agree it is corrupt but Excel doesn't seem to complain about it. – dukethrash Jun 22 '21 at 12:50
  • The "Talend job" does something wrong, as it creates a corrupt `Excel` file. But `Excel` itself is tolerant enough to open it nevertheless. So to find out what exactly is corrupt, one would need the file. You say you cannot post the file. So this is a dead end. No help possible. – Axel Richter Jun 22 '21 at 13:46

0 Answers