1

I use streaming POI API and would like to read the real value of a cell instead of the formatted one. My code which is below works fine but if the user doesn't display all the digit of a value in the excel sheet which is readed by my code, I've got the same truncated value in my result. I didn't find any solution in the streaming API - which is needed in my case to solve memory issue I had using the POI API without streaming.

    /**
     * @see org.apache.poi.xssf.eventusermodel.XSSFSheetXMLHandler.SheetContentsHandler cell(java.lang.String,
     *      java.lang.String)
     */
    @Override
    void cell(String cellReference, String formattedValue, XSSFComment comment) { 
       useTheCellValue(formattedValue) 
    }

2 Answers2

3

If you are constructing the XSSFSheetXMLHandler you can provide a DataFormatter. So if you are creating your own DataFormatter this DataFormatter could give you fully access to the formatting issues.

Example of how this could look like by changing the public void processSheet of the XLSX2CSV example in svn:

...
public void processSheet(
        StylesTable styles,
        ReadOnlySharedStringsTable strings,
        SheetContentsHandler sheetHandler, 
        InputStream sheetInputStream) throws IOException, SAXException {
    //DataFormatter formatter = new DataFormatter();
    DataFormatter formatter = new DataFormatter(java.util.Locale.US) {
        //do never formatting double values but do formatting dates
        public java.lang.String formatRawCellContents(double value, int formatIndex, java.lang.String formatString) {
            if (org.apache.poi.ss.usermodel.DateUtil.isADateFormat(formatIndex, formatString)) {
                return super.formatRawCellContents(value, formatIndex, formatString);
            } else {
                //return java.lang.String.valueOf(value);
                return super.formatRawCellContents(value, 0, "General");
            }
        }
    };
    InputSource sheetSource = new InputSource(sheetInputStream);
    try {
        XMLReader sheetParser = SAXHelper.newXMLReader();
        ContentHandler handler = new XSSFSheetXMLHandler(
              styles, null, strings, sheetHandler, formatter, false);
        sheetParser.setContentHandler(handler);
        sheetParser.parse(sheetSource);
     } catch(ParserConfigurationException e) {
        throw new RuntimeException("SAX parser appears to be broken - " + e.getMessage());
     }
}
...
Axel Richter
  • 56,077
  • 6
  • 60
  • 87
0

I've seen a ticket on POI about this point : https://bz.apache.org/bugzilla/show_bug.cgi?id=61858

It provides a first solution by changing the existing class.

This could be an interesting workaround even if the ideal solution should be to use a standard one.

  • 1
    To cite [PJ Fanning](https://bz.apache.org/bugzilla/show_bug.cgi?id=61858#c1): "If you look at the XLSX2CSV example in Poi, you will see that you can provide your own DataFormatter and with a custom DataFormatter, you could omit the number formatting and just represent the numbers in plain format." – Axel Richter May 29 '18 at 11:50