0

I've looked at several posts here on stockoverflow but I'm not finding what I'm looking for and ones I find that are close don't seem to work for me. Using OpenXmlReader doesn't read rows as it appears to never open the sheet data to a element type of row is never found.

I need to use the SAX method of using OpenXmlPartReader. What I'm looking to do is open up the Excel document and read row by row. For each row I need to grab a couple of cells either by index number or id and the value in that cell. So I don't need to loop through every cell I know which cells I need.

I want to use the SAX method to avoid running out of memory as I anticipate needing to read large files.

Community
  • 1
  • 1
Chris Ward
  • 771
  • 1
  • 9
  • 23

1 Answers1

1

I had a similar problem to solve. After combining several of the related answers found in SO, I coded a custom sax reader which is able to extract all rows from the 1st sheet.

Take a look at my xlsx-sax-exporter github project and let me know how it works for you.

The reader is very fast to open the worksheet and it calculates its dimensions by checking SheetDimensions reference when constructing.

It's also able to format numbers and dates after checking the workbook's cell and numbering formats as well. It even tries to cover the default numbering format cases.

The reader works with paging because I wanted to support extracting paged results on the server and it performs best if all pages are retrieved sequentially. It's blazing fast as expected because of sax streaming and memory usage is kept low. Benchmarking this without keeping rows in memory, it never uses more than 10 MB (release build).

Feel free to fork and send a pull request if you have a change to propose, or just open an issue and I'll try to take care of it when I have time.

theofanis
  • 119
  • 1
  • 3
  • UPDATE: The above link doesn't work anymore since I had to take it down for NDA reasons. So this answer here doesn't really offer anything anymore. – theofanis Dec 11 '20 at 10:13