1

Can I read only the first row of an Excel file with Apache POI? I don't want to read the whole file because it has 50,000 rows and takes up to 10 minutes to read (performance is a disaster). I am getting the bytes via file upload. My options are a byte array or an InputStream. Right now I am doing this:

Workbook workbook = new XSSFWorkbook(excelByteStream); //This line takes a lot of time, while debugging up to 10 minutes
Sheet firstSheet = workbook.getSheetAt(0);
DataFormatter df = new DataFormatter();
List<ColumnPanel> columnPanels = new ArrayList<>();
int i = 0;

for (Cell cell : firstSheet.getRow(0))
{
    columnPanels.add(new ColumnPanel(df.formatCellValue(cell), i++));
}
Impulse The Fox
  • 2,638
  • 2
  • 27
  • 52
  • Can you save the file as `.xlsb` and check again? – Vityata Jun 22 '18 at 09:27
  • 1
    use [SXSSFWorkbook](https://poi.apache.org/apidocs/org/apache/poi/xssf/streaming/SXSSFWorkbook.html) with constructor `new SXSSFWorkbook(workbook, 1)` – XtremeBaumer Jun 22 '18 at 09:32
  • 2
    https://stackoverflow.com/questions/11891851/how-to-load-a-large-xlsx-file-with-apache-poi: take a look at monitorjbl's answer. I think it could provide the perfect solution for this. – Andrea Jun 22 '18 at 09:32
  • @Vityata Tried to create an XSSFWorkbook with it: XLSBUnsupportedException: .XLSB Binary Workbooks are not supported – Impulse The Fox Jun 22 '18 at 09:44
  • 1
    Don't read from a stream, but load a file. If the stream comes from outside, write the stream to a file first if you need to. See https://stackoverflow.com/questions/11154678/xssfworkbook-takes-a-lot-of-time-to-load – Thomas Timbul Jun 22 '18 at 09:55
  • @XtremeBaumer How does that help? I mean I still have to create the workbook, which takes lots of time because it will read everything. – Impulse The Fox Jun 22 '18 at 09:55
  • Possible duplicate of [XSSFWorkbook takes a lot of time to load](https://stackoverflow.com/questions/11154678/xssfworkbook-takes-a-lot-of-time-to-load) – Thomas Timbul Jun 22 '18 at 09:55
  • 1
    @XtremeBaumer SXSSF is for writing only, it doesn't support reading. – Thomas Timbul Jun 22 '18 at 09:56
  • 1
    [Excel Streaming Reader](https://github.com/monitorjbl/excel-streaming-reader) was suggested already. But if you do not wants an additionally library, you could try using `StAX` as shown here https://stackoverflow.com/questions/46601782/read-rows-sequentially-but-on-demand/46617086#46617086. – Axel Richter Jun 22 '18 at 10:11
  • @ThomasTimbul on the docs I can find it anywhere that it says its for writing only. Also I would assume that you can also read data from it since you can access rows inside a sheet (I have never tried it so could be wrong) – XtremeBaumer Jun 22 '18 at 10:13
  • @XtremeBaumer: [public SXSSFWorkbook(XSSFWorkbook workbook)](https://poi.apache.org/apidocs/org/apache/poi/xssf/streaming/SXSSFWorkbook.html#SXSSFWorkbook-org.apache.poi.xssf.usermodel.XSSFWorkbook-): " What is not supported: Access initial cells and rows in the template. After constructing all internal windows are empty and SXSSFSheet.getRow(int) and SXSSFRow.getCell(int) return null." – Axel Richter Jun 22 '18 at 10:18

1 Answers1

1

InputStream is = new FileInputStream(new File("/path/to/workbook.xlsx"));

Workbook workbook = StreamingReader.builder() .rowCacheSize(100) .bufferSize(4096) .open(is);

Sheet sheet = workBook.getSheetAt(0);

Row firstRow = sheet.rowIterator().next();

You can use this lib: https://github.com/monitorjbl/excel-streaming-reader

Good guide: Apache POI Streaming (SXSSF) for Reading

Quan Tang
  • 11
  • 2
  • 1
    While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. [Answers that are little more than a link may be deleted](https://stackoverflow.com/help/deleted-answers) – Rajesh Pandya Sep 06 '18 at 06:50