
After looking through some resources, I managed to load an Excel file that contains 1,000,000 rows of data, but I don't know how to read the individual values. Here is my code so far:

public void create(MultipartFile file) throws Exception {
    // try-with-resources closes the stream and the package even when an exception is thrown
    try (InputStream fileStream = new BufferedInputStream(file.getInputStream());
         OPCPackage opc = OPCPackage.open(fileStream)) {
        XSSFReader xssf = new XSSFReader(opc);
        SharedStringsTable sst = xssf.getSharedStringsTable();
        XSSFReader.SheetIterator itr = (XSSFReader.SheetIterator) xssf.getSheetsData();

        // I just realized that the loop below only prints the strings, and in
        // random order, not in the same order as in the Excel file.
        // 20 is just an example.
        for (int i = 0; i < 20; i++) {
            System.out.println(sst.getEntryAt(i).getT().toString());
        }

        boolean sheetFound = false;
        while (itr.hasNext()) {
            try (InputStream is = itr.next()) {
                if (itr.getSheetName().equals("MY_SHEET_NAME")) {
                    sheetFound = true;
                    // This is the part I don't know how to write: loop over the rows
                    // of the sheet (something like hasNext() for rows), read the data
                    // in each cell, then store it in the DB.
                }
            }
        }

        if (!sheetFound) {
            throw new Exception("Sheet not found");
        }
    } catch (Exception e) {
        throw new Exception("Error is: " + e.getMessage(), e);
    }
}

I've looked at the example here for processing the sheet, but I couldn't work out how to get the data in each cell. Any help would be appreciated.

Update

According to the Apache POI documentation (here), the part of the code that would process my Excel file is:

public void processOneSheet(String filename) throws Exception {
    OPCPackage pkg = OPCPackage.open(filename);
    XSSFReader r = new XSSFReader(pkg);
    SharedStringsTable sst = r.getSharedStringsTable();

    XMLReader parser = fetchSheetParser(sst);

    // To look up the Sheet Name / Sheet Order / rID,
    //  you need to process the core Workbook stream.
    // Normally it's of the form rId# or rSheet#
    InputStream sheet2 = r.getSheet("rId2");
    InputSource sheetSource = new InputSource(sheet2);
    parser.parse(sheetSource);
    sheet2.close();
}

But after calling parser.parse(sheetSource), how can I get the data from each row and column? I want to validate the value in each cell and then store it in the database.
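With the SAX approach, parse() does not return the data; the values arrive through the callbacks of the handler attached to the parser. The following is a minimal sketch of such a handler using only the JDK's SAX classes. The names SheetRowCollector and SheetRowCollectorDemo are hypothetical, and a plain List<String> stands in for POI's SharedStringsTable so the example runs without POI. It assumes the usual sheet XML layout, where each row element holds c (cell) elements, each with a v (value) child, and t="s" marks a shared-string index.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects every row as a list of cell values, in document order.
class SheetRowCollector extends DefaultHandler {
    private final List<String> sharedStrings; // assumption: stand-in for SharedStringsTable
    final List<List<String>> rows = new ArrayList<>();
    private List<String> currentRow;
    private final StringBuilder text = new StringBuilder();
    private boolean isSharedString;
    private boolean inValue;

    SheetRowCollector(List<String> sharedStrings) {
        this.sharedStrings = sharedStrings;
    }

    @Override
    public void startElement(String uri, String localName, String qName, Attributes attrs) {
        if (qName.equals("row")) {
            currentRow = new ArrayList<>();
        } else if (qName.equals("c")) {
            // t="s" means the <v> content is an index into the shared strings
            isSharedString = "s".equals(attrs.getValue("t"));
        } else if (qName.equals("v")) {
            inValue = true;
            text.setLength(0);
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called several times per value, so append instead of assigning
        if (inValue) {
            text.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if (qName.equals("v")) {
            inValue = false;
            String raw = text.toString();
            currentRow.add(isSharedString ? sharedStrings.get(Integer.parseInt(raw)) : raw);
        } else if (qName.equals("row")) {
            // A complete row is available here: validate it or hand it to storage
            rows.add(currentRow);
        }
    }
}

class SheetRowCollectorDemo {
    static List<List<String>> parse(String sheetXml, List<String> sharedStrings) throws Exception {
        SheetRowCollector handler = new SheetRowCollector(sharedStrings);
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(sheetXml)), handler);
        return handler.rows;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<worksheet><sheetData>"
                + "<row r=\"1\"><c r=\"A1\" t=\"s\"><v>1</v></c><c r=\"B1\"><v>42</v></c></row>"
                + "<row r=\"2\"><c r=\"A2\" t=\"s\"><v>0</v></c><c r=\"B2\"><v>7</v></c></row>"
                + "</sheetData></worksheet>";
        // prints [[alpha, 42], [beta, 7]]
        System.out.println(parse(xml, Arrays.asList("beta", "alpha")));
    }
}
```

For a file with a million rows, keeping all rows in a list would bring back the memory problem, so the endElement branch for row is where each row would be validated and passed on instead. The sketch also ignores inline strings and does not pad empty cells, which are simply absent from the XML, so column positions can shift unless the r attribute of each cell is checked.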

Update 2

I tried this answer, https://stackoverflow.com/a/51818500/10454516, and I can get the data. I then tried calling myObjectRepo.save(result) or myObjectRepo.save(myObject), first inside the void endRow method and then right after the switch but inside the if (lineNumber > 0) block, but it always throws a NullPointerException. If I don't call save and just print the result to the console, the result is printed.
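The stack trace is not shown, so the cause here is only a guess: one common reason for this symptom is that the handler object is created with new, so a repository field inside it is never injected and stays null. A possible way around that is to give the handler a callback and keep the save call in the class that already holds the repository. The sketch below mirrors the startRow/cell/endRow shape of the handler in the linked answer without depending on POI or Spring; ForwardingRowHandler, rowSink, and the demo class are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Consumer;

// Hypothetical handler that forwards each finished row to a callback
// instead of touching a repository directly.
class ForwardingRowHandler {
    private final Consumer<List<String>> rowSink; // assumption: e.g. row -> repo.save(toEntity(row))
    private List<String> cells;

    ForwardingRowHandler(Consumer<List<String>> rowSink) {
        // Fail at construction time rather than with a late NullPointerException
        this.rowSink = Objects.requireNonNull(rowSink, "rowSink");
    }

    void startRow(int lineNumber) {
        cells = new ArrayList<>();
    }

    void cell(String value) {
        cells.add(value);
    }

    void endRow(int lineNumber) {
        if (lineNumber > 0) { // line 0 is treated as the header row
            rowSink.accept(cells);
        }
    }
}

class ForwardingRowHandlerDemo {
    static List<List<String>> run() {
        List<List<String>> saved = new ArrayList<>(); // stands in for the database
        ForwardingRowHandler handler = new ForwardingRowHandler(saved::add);

        for (int line = 0; line < 2; line++) {
            handler.startRow(line);
            for (String value : line == 0 ? Arrays.asList("name", "age") : Arrays.asList("Alice", "30")) {
                handler.cell(value);
            }
            handler.endRow(line);
        }
        return saved;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [[Alice, 30]]
    }
}
```

In a Spring service the callback would be built where the repository is already injected, so the handler itself never needs to be a managed bean.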

Akza
  • https://github.com/pjfanning/poi-shared-strings-sample has a runnable version of POI's XLSX2CSV sample that uses the streamed data to generate CSV. – PJ Fanning Dec 03 '18 at 21:30
  • https://github.com/monitorjbl/excel-streaming-reader also streams the xlsx input, but it's probably easier to use than the SAX parser approach. – PJ Fanning Dec 03 '18 at 21:32

2 Answers


One way to read the Excel data is:

// try-with-resources closes the stream and the workbook automatically.
// Note: XSSFWorkbook loads the whole workbook into memory.
try (InputStream excelFile = new FileInputStream(mFileName);
     XSSFWorkbook wb = new XSSFWorkbook(excelFile)) {
    XSSFSheet sheet = wb.getSheetAt(0);
    int col = 0; // index of the column to read

    Iterator<Row> rows = sheet.rowIterator();
    while (rows.hasNext()) {
        Row row = rows.next();
        Cell cell = row.getCell(col);
        if (cell == null) {
            continue; // the row has no cell in this column
        }
        String url = cell.getStringCellValue();
        System.out.println(url);
        System.out.println("--------------------------");
    }
} catch (Exception e) {
    e.printStackTrace();
}
Vebbie
  • I've tried using XSSFWorkbook, but it consumes too much memory, because my Excel file will contain approximately 700k rows of data. That's why I did not use XSSFWorkbook and went with my current approach. – Akza Dec 03 '18 at 07:08

I'd like to answer my own question. After several experiments, and based on this answer, I managed to upload an Excel file with at least 1,000,000 rows of data and store it in a database (PostgreSQL). Uploading, reading, and inserting 1,000,000 rows takes 5 minutes. Here is the link to the project. I hope this helps anybody who needs it.
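The linked project is not reproduced here, so the following is not necessarily what it does. As a general illustration of how a streamed import of this size can avoid one database round trip per row, here is a minimal sketch of a buffer that collects rows and hands them on in fixed-size batches. BatchBuffer, its sink callback, and the demo class are hypothetical names; in a real import the sink might wrap something like a repository's bulk save or a JDBC batch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: buffers items and passes them to a sink in batches.
class BatchBuffer<T> {
    private final int batchSize;
    private final Consumer<List<T>> sink; // assumption: e.g. batch -> repo.saveAll(batch)
    private final List<T> pending = new ArrayList<>();

    BatchBuffer(int batchSize, Consumer<List<T>> sink) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        this.batchSize = batchSize;
        this.sink = sink;
    }

    // Called once per parsed row; flushes automatically when the batch is full
    void add(T item) {
        pending.add(item);
        if (pending.size() >= batchSize) {
            flush();
        }
    }

    // Must also be called once after the last row to write the remainder
    void flush() {
        if (pending.isEmpty()) {
            return;
        }
        sink.accept(new ArrayList<>(pending)); // copy so the sink owns its list
        pending.clear();
    }
}

class BatchBufferDemo {
    static List<Integer> run() {
        List<Integer> batchSizes = new ArrayList<>();
        BatchBuffer<String> buffer = new BatchBuffer<>(2, batch -> batchSizes.add(batch.size()));
        for (int i = 0; i < 5; i++) {
            buffer.add("row-" + i);
        }
        buffer.flush(); // write the final partial batch
        return batchSizes;
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints [2, 2, 1]
    }
}
```

Only one batch is held in memory at a time, which keeps memory use flat regardless of how many rows the sheet contains.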

Akza