1

I'm working on a function where I have to store numbers received via uploading an excel file. It's written in Java, using apache poi library, under Spring framework (which is irrelevant).

The file which I'm trying to upload (Note that the column has already been set to Text) :

enter image description here

Code is as follow :

// function accepts "MultipartFile inputFile"
InputStream is = inputFile.getInputStream();
StreamingReader reader = StreamingReader.builder().rowCacheSize(100).bufferSize(4096).sheetIndex(0)
                        .read(is);
for (Row row : reader) {
    System.out.println("Reading next row.");
    System.out.println("row[0] is of type " + row.getCell(0).getCellType());
    Cell cell = row.getCell(0);
    String value = "";
    if (cell.getCellType() == Cell.CELL_TYPE_STRING) {
        value = cell.getStringCellValue().replaceAll("[\\D]", "");
    } else if (cell.getCellType() == Cell.CELL_TYPE_NUMERIC) {
        value = NumberToTextConverter.toText(cell.getNumericCellValue());
    }
    System.out.println("Value is " + value);
}

And I'm getting below output :

Reading next row.
row[0] is of type 0 // Which is equals to Cell.CELL_TYPE_NUMERIC
Value is 166609999

Issue is that I'd require to read it as '0166609999' instead of '166609999', strangely this only happens to xlsx file, if I save it as xls format and re-upload the file, I have no issue with detecting its cell type. Any ideas?

Edited for being marked as duplication:

  • Given answer https://stackoverflow.com/a/19401902/1131470 uses DataFormatter class which is not supported when we are streaming the sheet, as the Cell object retrieved is going to be a StreamingCell object, which throws an exception if we make a call to getCellStyle() function.

As of 29th Mar 2016

Seems that the class StreamingCell does not support DataFormatter, where DataFormatter is the only class available at the moment to grab what Excel displays. So current solution would be sticking back to reading the whole Excel file into memory. If anyone found an answer in future you may please post an answer here, I'd greatly appreciate that, as current solution is an absolutely terrible one.

As of 31st Mar 2016

Special thanks to Axel who pointed out its library version issue, updating streamer jar file to 0.2.12 solves the problem. Thanks!

Chor Wai Chun
  • 3,226
  • 25
  • 41
  • I suspect it may lie at the https://github.com/monitorjbl/excel-streaming-reader. So make sure you are using the latest version of this. – Axel Richter Mar 28 '16 at 10:25
  • @AxelRichter Thank you so much for pointing it out, yes when I look into the class of StreamingCell https://github.com/monitorjbl/excel-streaming-reader/blob/master/src/main/java/com/monitorjbl/xlsx/impl/StreamingCell.java, it seems that most functions throw UnsupportedException. No choice but I'd have to go back to reading the entire file while limiting upload size. – Chor Wai Chun Mar 29 '16 at 08:42
  • What you say can't be true. If the cell is formatted as `Text`, then with OpenXML the cell's type is `t="s"` and the cell's value points to the `sharedStrings.xml`. So `row.getCell(0).getCellType()` cannot be 0 and the value cannot be 166609647. This only can be if the cell is **not** formatted as `Text`. Then, and only then, you would need a formatter. – Axel Richter Mar 29 '16 at 13:56
  • I haven't go into extracting it as a zip and look into the Xml files yet. But I've hosted the file at https://www.dropbox.com/s/hjz0dletdq3qk1i/test_formatted_number.xlsx?dl=0 which you may grab and have a look at the type, of you wish to, appreciate your help =) – Chor Wai Chun Mar 30 '16 at 05:13

2 Answers2

2

Downloaded your file. Downloaded xlsx-streamer-0.2.12.jar, slf4j-api-1.7.20.jar and slf4j-nop-1.7.20.jar and placed in class path.

Having following code:

import com.monitorjbl.xlsx.*;
import org.apache.poi.ss.usermodel.*;

import java.io.*;

class StreamingReaderTest {

 public static void main(String[] args) throws Exception {

  try (
   InputStream is = new FileInputStream(new File("/home/axel/Downloads/test_formatted_number.xlsx"));
   StreamingReader reader = StreamingReader.builder()
           .rowCacheSize(100)
           .bufferSize(4096)
           .sheetIndex(0)
           .read(is);
  ) {
   for (Row row : reader) {
    System.out.println("row[0] is of type " + row.getCell(0).getCellType());
    Cell cell = row.getCell(0);
    String value = "";
    if (cell.getCellType() == Cell.CELL_TYPE_STRING) {
        value = cell.getStringCellValue();
    } else if (cell.getCellType() == Cell.CELL_TYPE_NUMERIC) {
        value = "" + cell.getNumericCellValue();
    }
    System.out.println("Value is " + value);
   }
  }
 }
}

Gives me:

enter image description here

So no problem here.

If I use DataFormatter with numeric cell values:

import com.monitorjbl.xlsx.*;
import org.apache.poi.ss.usermodel.*;

import java.io.*;

class StreamingReaderTest {

 public static void main(String[] args) throws Exception {

  try (
   InputStream is = new FileInputStream(new File("/home/axel/Downloads/test_formatted_number.xlsx"));
   StreamingReader reader = StreamingReader.builder()
           .rowCacheSize(100)
           .bufferSize(4096)
           .sheetIndex(0)
           .read(is);
  ) {
   for (Row row : reader) {
    System.out.println("row[0] is of type " + row.getCell(0).getCellType());
    Cell cell = row.getCell(0);
    String value = "";
    if (cell.getCellType() == Cell.CELL_TYPE_STRING) {
        value = cell.getStringCellValue();
    } else if (cell.getCellType() == Cell.CELL_TYPE_NUMERIC) {
        value = new DataFormatter().formatCellValue(cell);
    }
    System.out.println("Value is " + value);
   }
  }
 }
}

and put the number 166609647 in A2 formatted as 0000000000. then I get:

enter image description here

Axel Richter
  • 56,077
  • 6
  • 60
  • 87
  • Wow, absolutely awesome, let me try again later with your version of streamer jar file, hope it'd solves my issue.. – Chor Wai Chun Mar 31 '16 at 00:28
  • It was indeed version issue. My project was using an outdated 0.2.3 version, updating it solves my problem instantly, thanks! – Chor Wai Chun Mar 31 '16 at 04:11
0

Why .replaceAll("[\D]", "");

Works fine for me, just tried. Wonder what the issue is here.

fis = new FileInputStream(inputFile);
XSSFWorkbook inputWorkBook = new XSSFWorkbook (fis);
XSSFSheet inputSheet = inputWorkBook.getSheetAt(0);
Iterator<Row> rowIterator = inputSheet.iterator();

while(rowIterator.hasNext())
{
    Row row = rowIterator.next();
    Iterator<Cell> cellIterator = row.cellIterator();

    while (cellIterator.hasNext()) 
    {
        Cell cell = cellIterator.next();

        switch (cell.getCellType())
        {
            case Cell.CELL_TYPE_STRING:
                System.out.print(cell.getStringCellValue() + "\t");
                break;
            case Cell.CELL_TYPE_NUMERIC:
                System.out.print(cell.getNumericCellValue() + "\t");
                break;
            case Cell.CELL_TYPE_BOOLEAN:
                System.out.print(cell.getBooleanCellValue() + "\t");
                break;
            default :  
        }
    }
    System.out.println(""); 
}  
Tshilidzi Mudau
  • 7,373
  • 6
  • 36
  • 49
Cozimetzer
  • 662
  • 4
  • 12
  • Er, thanks for pointing out the weird stuffs =) but anyway it wouldn't affect what my question asks, hence I'll just leave it at the moment.. – Chor Wai Chun Mar 28 '16 at 04:15
  • Hi there, thanks again for the effort, and sorry for not being very clear on the question. If I'm accepting a XLS file, I'd use the similar approach as yours to process the file, and when I do that, '0166609647' appears alright. Problem happens when I accept XLSX file, which could be extremely large, hence streaming is preferred instead of reading the whole file into memory, and that's where issue happens. – Chor Wai Chun Mar 28 '16 at 04:44
  • I've added the portion on how I declare the streaming object up on my question, you have have a look again there. – Chor Wai Chun Mar 28 '16 at 04:45