15

I am currently working on a project that reads an excel file using Apache POI.

My task seems to be simple, I just need to get the cell value as it was display in the excel file. I am aware of performing a switch statement based on the cell type of a cell. But if the data is something like

9,000.00

POI gives me 9000.0 when I do getNumericCellValue(). When I force the cell to be a string type and do getStringCellValue() it then gives me 9000. What I need is the data as how it was presented in excel.

I found some post telling to use DataFormat class but as per my understanding, it requires your code to be aware of the format that the cell has. In my case, I am not aware of the format that the cell might have.

So, how can I retrieve the cell value as how it was presented in excel?

Bnrdo
  • 5,325
  • 3
  • 35
  • 63

3 Answers3

29

Excel stores some cells as strings, but most as numbers with special formatting rules applied to them. What you'll need to do is have those formatting rules run against the numeric cells, to produce strings that look like they do in Excel.

Luckily, Apache POI has a class to do just that - DataFormatter

All you need to do is something like:

 Workbook wb = WorkbookFactory.create(new File("myfile.xls"));
 DataFormatter df = new DataFormatter();

 Sheet s = wb.getSheetAt(0);
 Row r1 = s.getRow(0);
 Cell cA1 = r1.getCell(0);

 String asItLooksInExcel = df.formatCellValue(cA1);

Doesn't matter what the cell type is, DataFormatter will format it as best it can for you, using the rules applied in Excel

Gagravarr
  • 47,320
  • 10
  • 111
  • 156
1

In fact, it's never possible to obtain exactly the formatted cell value with the locale defined when the cell was written. it's due to there locale observer and the fact that the internal locale excel prefix is never reused when formatting afterwards;

Analysis for POI 3.17 (may change since looking how the component is internally done)

for instance : the dateConverted format of the cell style (from CellStyle.getDataFormatString()) for Locale.US with the format dd MMM yyyy hh:mm:ss is :
"[$-0409]dd MMM yyyy hh:mm:ss;@" where the local excel internal prefix = [$-0409]

It's obtained from DateFormatConverter.localPrefixes private static map.

here is some code to work around this issue :

/**
 * Missing method in POI to enable the visualisation asIs of an cell with a
 * different locale in a xls document.
 *
 * @param style
 *            the cell style localized.
 * @return the Locale found using internal locationPrefixes.
 */
private final Locale extractLocaleFromDateCellStyle(final CellStyle style) {

    final String reOpenedFormat = style.getDataFormatString();

    LOGGER.info("Data Format of CellStyle : " + reOpenedFormat);
    Locale locale = getLocaleFromPrefixes(extractPrefixeFromPattern(reOpenedFormat));
    LOGGER.info("Found locale : " + locale);
    return locale;
}

/**
 * Extracts the internal prefix that represent the local of the style.
 *
 * @param pattern
 *            the DataFormatString of the cell style.
 * @return the prefix found.
 */
private final String extractPrefixeFromPattern(final String pattern) {

    Pattern regex = Pattern.compile(REGEX_PREFIX_PATTERN);
    Matcher match = regex.matcher(pattern);

    if (match.find()) {
        LOGGER.info("Found prefix: " + match.group(1));
        // return only the prefix
        return match.group(1);
    }
    return null;
}

/**
 * Reverse obtain the locale from the internal prefix from
 * <code>DateFormatConverter.localePrefixes</code> private static field.
 * <p>
 * Uses reflection API.
 *
 * @param prefixes
 *            the prefixes
 * @return the local corresponding tho the prefixes.
 */
public static Locale getLocaleFromPrefixes(final String prefixes) {

    try {

        @SuppressWarnings("unchecked")
        Map<String, String> map = getStaticPrivateInternalMapLocalePrefix();

        String localPrefix = null;

        // find the language_Country value matching the internal excel
        // prefix.
        for (Map.Entry<String, String> entry : map.entrySet()) {

            LOGGER.info("value : " + entry.getValue() + ", key :"
                    + entry.getKey());

            if (entry.getValue().equals(prefixes)
                    && !StringUtils.isBlank(entry.getKey())) {
                localPrefix = entry.getKey();
                break;
            }
        }

        // Generate a Locale with language, uppercase(country) info.
        LOGGER.info(localPrefix);
        if (localPrefix.indexOf('_') > 0) {
            String[] languageCountry = localPrefix.split("_");
            return new Locale(languageCountry[0],
                    StringUtils.defaultString(languageCountry[1]
                            .toUpperCase()));
        }

        // nothing found.
        return null;

        // factorized the multiples exceptions.
    } catch (Exception e) {
        throw new UnsupportedOperationException(e);
    }

}

/**
 * gets the internal code map for locale used by Excel.
 * 
 * @return the internal map.
 * @throws NoSuchFieldException
 *             if the private field name changes.
 * @throws IllegalAccessException
 *             if the accessible is restricted.
 */
private static Map<String, String> getStaticPrivateInternalMapLocalePrefix()
        throws NoSuchFieldException, IllegalAccessException {
    // REFLECTION
    Class<?> clazz = DateFormatConverter.class;
    Field fieldlocalPrefixes = (Field) clazz
            .getDeclaredField(DATE_CONVERTER_PRIVATE_PREFIXES_MAP);

    // change from private to public.
    fieldlocalPrefixes.setAccessible(true);

    @SuppressWarnings("unchecked")
    Map<String, String> map = (Map<String, String>) fieldlocalPrefixes
            .get(clazz);

    LOGGER.info("MAP localPrefixes : " + map);
    return map;
}

Thus, the following simple code should do the trick. Note the code is not fully tested with against null values and depends on the version of POI you use until they changed there LOCALE OBSERVER MADNESS :)

    ....
    final CellStyle cellStyle = reopenedCell.getCellStyle();

    Locale locale = extractLocaleFromDateCellStyle(cellStyle);
    LOGGER.info("FOUND LOCAL : " + locale);

    // use the same local from the cell style during writing.
    DataFormatter df = new DataFormatter(locale);

    String reOpenValue = df.formatCellValue(reopenedCell);

Regards.

-1

Use the CellType you can check every thing

if (cellValue.getCellType() == Cell.CELL_TYPE_NUMERIC)

// the cellValue is Numeric

if (cellValue.getCellType() == Cell.CELL_TYPE_STRING)

// the cellValue is a String

date is also given as numeric, at that time check the given cell is date or not by using dateUtil

if (DateUtil.isCellDateFormatted(cellData))

after you can convert the cellvalue into the date

newuser
  • 8,338
  • 2
  • 25
  • 33
  • 1
    I already did this, my problem is, the original format of the data is not restored. – Bnrdo Oct 16 '13 at 11:04
  • I tried to get the numeric value as String by getStringCellValue() i got exception. for that only i gave this. Is this possible to get the String as numeric. – newuser Oct 16 '13 at 11:07
  • Take a look http://stackoverflow.com/questions/1072561/how-can-i-read-numeric-strings-in-excel-cells-as-string-not-numbers-with-apach – newuser Oct 16 '13 at 11:09
  • Yes I set the cell type of the cell into string but I get inaccurate result. I already said this in my question – Bnrdo Oct 16 '13 at 11:25
  • I am also waiting for the answer – newuser Oct 16 '13 at 11:33