0

I have a java code which fetches data from excel document. I want to calculate the number of columns and total number of rows(in a particular column). How can I achieve this? Java code and desired o/p is provided below

(edit): what modification I should make to get the desired o/p for e.g. I should write a loop to get the count of columns and rows or there is a method to do the same

Desired O/P

ColumnA ColumnB ColumnC
Vinayak James   Dan
India   US      Denmark

 Total number of Columns: 3
number of data in ColumnA:2
number of data in ColumnB:2
number of data in ColumnC:2  

(EDIT):- Answered here-- Count number of rows in a column of Excel sheet(Java code provided)

My Java Code:

import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import java.util.Iterator;

import org.apache.poi.ss.formula.functions.Column;
import org.apache.poi.ss.usermodel.Cell;
import org.apache.poi.ss.usermodel.Row;
import org.apache.poi.xssf.usermodel.XSSFSheet;
import org.apache.poi.xssf.usermodel.XSSFWorkbook;

public class ExcelRead {
    public static void main(String[] args) {
        int count=0;
    try {
        FileInputStream file = new FileInputStream(new File("C:/Users/vinayakp/Desktop/Book.xlsx"));
        XSSFWorkbook workbook = new XSSFWorkbook(file);
        XSSFSheet sheet = workbook.getSheetAt(0);
        Iterator<Row> rowIterator = sheet.iterator();
        while(rowIterator.hasNext()) {
            Row row = rowIterator.next();
            Iterator<Cell> cellIterator = row.cellIterator();
            while(cellIterator.hasNext()) {
                Cell cell = cellIterator.next();
                switch(cell.getCellType()) {
                    case Cell.CELL_TYPE_BOOLEAN:
                        System.out.print(cell.getBooleanCellValue() + "\t\t");
                        break;
                    case Cell.CELL_TYPE_NUMERIC:
                        System.out.print(cell.getNumericCellValue() + "\t\t");
                        break;
                    case Cell.CELL_TYPE_STRING:
                        System.out.print(cell.getStringCellValue() + "\t\t");
                        break;
                }
            }
            System.out.println("");
        }

        file.close();    
    } catch (FileNotFoundException e) {
        e.printStackTrace();
    } catch (IOException ae) {
        ae.printStackTrace();
    }
}
}

Output I'm getting is:

ColumnA ColumnB ColumnC
Vinayak James   Dan
India   US      Denmark

I need to get the desired o/p as shown above. Code is working fine however I need to get the count values of column and rows. Kindly provide me the solution for the same. I had problems with the code earlier which was resolved in this question: Issue while reading Excel document (Java code)

Community
  • 1
  • 1
Vinayak Pahalwan
  • 2,915
  • 4
  • 26
  • 34

3 Answers3

3

I think you can use the code suggested here to get the columns and then for each row in a column (though it's more for each column in the row concerning POI's approach), just count the values you need.

So your code would probably follows something as:

for(Row row : sheet) {
   short minColIx = row.getFirstCellNum();
   short maxColIx = row.getLastCellNum();
   for(short colIx = minColIx; colIx<maxColIx; colIx++) {
     Cell c = row.getCell(colIx);
     if(c != null) {
        if(c.getCellType() == Cell.CELL_TYPE_NUMERIC) {
           // add c.getNumericCellValue()
        }
     }
   }
}

Also nice ideas from poi api docs for working with column numbers.

Community
  • 1
  • 1
acostache
  • 2,177
  • 7
  • 23
  • 48
  • this helped me to get the total number of columns..thanks :) i verified by adding more columns in my sheet and it worked. now i need to know the _count_ of number of rows in a given column – Vinayak Pahalwan Dec 13 '12 at 12:07
  • should i use `getPhysicalNumberOfRows` and iterate for every column to find total count of data of each row? – Vinayak Pahalwan Dec 13 '12 at 12:11
  • No, i don't think so, because that "returns the number of physically defined rows (NOT the number of rows in the sheet)" (http://poi.apache.org/apidocs/org/apache/poi/hssf/usermodel/HSSFSheet.html#getPhysicalNumberOfRows%28%29) – acostache Dec 13 '12 at 12:16
  • 1
    But i think you have enough api available; maybe you can try using a matrix or something like that, to "simulate" the number of rows and columns and on which you can operate whatever you need. – acostache Dec 13 '12 at 12:17
  • lemme see, anyway half of my prblm is solved..than you very much – Vinayak Pahalwan Dec 13 '12 at 12:21
  • Ok, - go ahead try that; and whatever you find helpful do post as it will help others! All the best! – acostache Dec 13 '12 at 12:22
  • looks a bit tricky...anyway i will post as i get the desired row data count result – Vinayak Pahalwan Dec 13 '12 at 12:23
2

The code uses Iterator to parse each row and column value. It doesn't know how many items it's got left to iterate over, so you can't query it for that result.

However you can get the last row index using API getLastRowNum (reference).

Gets the number last row on the sheet. Owing to idiosyncrasies in the excel file format, if the result of calling this method is zero, you can't tell if that means there are zero rows on the sheet, or one at position zero. For that case, additionally call getPhysicalNumberOfRows() to tell if there is a row at position zero or not.

Similarly for each cell you have API getLastCellNum in HSSFRow class (reference).

You need to alter your code to take advantage of these APIs.

Tutorial: Read/Write Excel in Java

Viral Patel
  • 8,506
  • 3
  • 32
  • 29
  • If i'm having columnA with 2 data and columbB with 3 dont u think the _getLastRowNum_ will give 3 as a result for each case? btw ur articles are sometimes a big help. – Vinayak Pahalwan Dec 13 '12 at 11:54
  • instead i should use `getPhysicalNumberOfRows` ?? – Vinayak Pahalwan Dec 13 '12 at 12:10
  • 1
    Glad you liked viralpatel.net. I think what you looking for is column > row approach of iterating excel file instead of traditional row > column. The later is more intuitive and that's why used by APIs like POI. I suggest you to manually parse the sheet and create 2D array to store count. You may need to check for null cells while counting. – Viral Patel Dec 13 '12 at 12:31
1

I'm not familiar with working with Excel workbooks in java code, but you have an iterator right there. Couldn't you simply add an int to it and increment it every iteration?

int counter = 0;
while(rowIterator.hasNext()) {
...
counter++;
}

With your cascaded iterators, you can keep track of the different numbers you want using several counters;

int[] counters = new int[4];
counters[0] = 0; // number of columns
counters[1] = 0; // number of rows in colA
counters[2] = 0; // number of rows in colB
counters[3] = 0; // number of rows in colC

while(rowIterator.hasNext()) {
    /* 
     * keep track of columns. Useful if colD and more may exist too, but
     * you don't want them counted
     */
    int col = 0;
    while(cellIterator.hasNext()) {
        switch(col) {
        case 0 : counters[1]++; break; // add row to colA counter
        case 1 : counters[2]++; break; // add row to colB counter
        case 2 : counters[3]++; break; // add row to colC counter
        }
        col++;
    }
    counters[0] = col; // save number of columns
}

This may not be the best way mind you, but I think it should work well enough. Bare in mind that I have no experience in this either so if anyone answers with a solution working with the xssf classes, it's likely better. Also, I have not tested this, but it should at least offer you one approach.

I don't know how xssf handle's this, but you may want to check if the cells are empty before counting them. Judging by your posted code however, I guess that's not necessary.

If you have a variable number of columns (likely, otherwise, why count?) you should make a list of variable length with counters in them, and add a new counter int for every column. Hope this helps you!

Mark Tielemans
  • 1,528
  • 3
  • 20
  • 40
  • i did, what happened is it was printing the total number of rows in the sheet. For e.g. if in columnA i'm having 2 data and column b is having 3 it will show the o/p as 3 for both the columns. may be i 'm doing it wrong. im new to this as well, anyway thanks – Vinayak Pahalwan Dec 13 '12 at 11:47
  • 1
    Editted answer to better address your needs, but I see an answer using the api has been posted. Better use that. – Mark Tielemans Dec 13 '12 at 12:26
  • well half of the job is done, i was able to iterate and count number of rows however, counting the rows for a given column is not yet finished. what acostache and viral said was right, have to work with matrix to get the solution – Vinayak Pahalwan Dec 13 '12 at 12:42
  • You can count the number of rows per column using my answer, `counters[1]` is the amount of rows in colA, `counters[2]` in colB, `counters[3]` in colC. The matrix answers are just more correct/neat. – Mark Tielemans Dec 13 '12 at 12:54