2

I have a huge excel file with tons of columns which looks like this :-

Column1 Column2 Column3 Column4 Column5
abc             def             ghi
        mno             pqr
......

The output generated by my code when I print all the values in excel is :-

abc;def;ghi;null;null

mno;pqr;null;null;null

So, If we look at the output above we can note that the cells where I left blank values were not picked up by the POI library. Is there a way in which I can get these values as null? Or a way to recognize that the values presented skipped blank cells?

Please note: I am not using the usermodel (org.apache.poi.ss.usermodel) but an Event API to process xls and xlsx files.

I am implementing HSSFListener and overriding its processRecord(Record record) method for xls files. For xlsx files I am using javax.xml.parsers.SAXParser and org.xml.sax.XMLReader.

I am using JDK7 with Apache POI 3.7. Can someone please help?

I have already seen this possible duplicate How to get an Excel Blank Cell Value in Apache POI? But this doesn't answer my question as I am using Event API.

Community
  • 1
  • 1
ParagJ
  • 1,566
  • 10
  • 38
  • 56

1 Answers1

2

Yes, it can be done, and there are several examples of it which ship with Apache POI. They all relate to Event based xls / xlsx -> CSV, which looks very close to what you're doing. That makes me worry you may be re-inventing the wheel...

For HSSF event model processing, the example you want to look at is XLS2CSVmra. That is powered by MissingRecordAwareHSSFListener

For XSSF event model, the example you need is XLSX2CSV

Gagravarr
  • 47,320
  • 10
  • 111
  • 156
  • I knew that you will reply and save me as always :) Thanks a lot, I will take a look. – ParagJ Dec 31 '12 at 06:30
  • I checked the https://svn.apache.org/repos/asf/poi/trunk/src/examples/src/org/apache/poi/xssf/eventusermodel/XLSX2CSV.java example however in that example it is adding a comma for any blank value. How can I do it in my example? Below is my code: – ParagJ Jan 22 '13 at 12:27
  • public void endElement(String uri, String localName, String name) throws SAXException { // Process the last contents as required. // Do now, as characters() may be called more than once if(nextIsString) { try { int idx = Integer.parseInt(lastContents); lastContents = new XSSFRichTextString(sst.getEntryAt(idx)).toString(); } catch (NumberFormatException e) { } } // v => contents of a cell // Output after we've seen the string contents if(name.equals("v")) { values.add(lastContents); } if(name.equals("row")) { ... do something ... }} – ParagJ Jan 22 '13 at 12:29
  • You'd probably be best off asking a new question, as it's impossible to work out from that comment what you're doing and what's wrong... – Gagravarr Jan 23 '13 at 16:24
  • Thanks Gagravarr, but I solved it myself by maintaining current column number. Thanks a lot. – ParagJ Jan 24 '13 at 06:26
  • @Gagravarr I am having a similar issue, lets say I am using the XLSX2CSV example and its parsing a large excel sheet and i want to output that stream to text file just for specific sheets in workbook, how would I do that ? – Eddie Martinez Nov 11 '13 at 14:38
  • @EduardoDennis Step one would be asking a new question for that, since it really isn't part of this one... – Gagravarr Nov 11 '13 at 15:02
  • The xlxs example is not good, you need to know in advance the minimum row number. – Jonathan Drapeau Sep 25 '14 at 13:31