0

How do I read a .csv excel file with x number of rows and y number of columns, ignore irrelevant cells (things like names), then compute an average of the numbers in each column?

The Excel I have is something like this (, indicates new cell):

ID, week 1, week 2, week 3, .... , week 7
0 ,   1   ,   0.5 ,   0   ,      ,  1.2
1 ,  0.5  ,   1   ,   0.5 ,      ,  0.5
y , ......

so, how do I make it so it reads that kind of .csv file then computes an average in the format Week 1 = (Week 1 average), Week 2 = (week2 average) for all weeks?

Also am I correct in assuming I need to use a 2D Array for this?

Edit Here's my code so far, it's very crude and I'm not sure if it does things properly yet:

import java.io.File;
import java.io.FileNotFoundException;
import java.util.Scanner;

public class ClassAverage {
    public static void main(String[] args){
        readFile2Array("attendance.csv");
    }
    public static double[][] readFile2Array(String fileName){
        try {
            int rowCount = 0;
            int colCount = 0;

            Scanner rc = new Scanner(new File("attendance.csv"));
            while (rc.hasNextLine()) {
                rowCount++;
                rc.nextLine();
            }
            rc.close();
            System.out.println(rowCount);

            Scanner cc = new Scanner(new File("attendance.csv"));
            while (cc.hasNext()) { 
                colCount++;
                cc.next();
            }
            cc.close();

            colCount = colCount/rowCount;

            System.out.println(colCount);

            Scanner sc = new Scanner(new File("attendance.csv"));
            double[][] spreadSheet = new double[rowCount][colCount];
            while (sc.hasNext()) {
                for (int i=0; i<spreadSheet.length; ++i){
                    for (int j=0; j<spreadSheet[i].length; ++j){
                        spreadSheet[i][j] = Double.parseDouble(sc.next());
                    }
                }
            }
            sc.close();
            return spreadSheet;

        } catch (FileNotFoundException e) {
            System.out.println("File cannot be opened");
            e.printStackTrace();
        }
        return null;
    }

    public static double weeklyAvg(double[][] a){

    }
}

So a summary of what it's intended to do

readFile2Array: read the csv file and count the number of rows, then count the total number of cells, divide total number of cells by number of rows to find number of columns. Read again and put each cell into the correct place in a 2D array.

weeklyAvg: I haven't thought up a way to do this yet, but it's supposed to read the array column by column and compute an average for each column, then print out the result.

PS. I'm very new at Java so I have no idea what some suggestions mean so I'd really appreciate suggestions that are pure java based without addons and stuff (I'm not sure if that's what some people are suggesting even). I hope it's not too much to ask for (if it's even possible).

Kara
  • 6,115
  • 16
  • 50
  • 57
Win
  • 217
  • 6
  • 13

2 Answers2

0

You can use a Java library to handle your CSV file. For example opencsv ( you can find the latest maven version here http://mvnrepository.com/artifact/com.opencsv/opencsv/3.5)

And then you can parse your file like this :

CSVReader reader = new CSVReader(new FileReader("PATH_TO_YOUR_FILE"));
    String[] nextLine;
    int counter = 0;
    while ((nextLine = reader.readNext()) != null) {
        // nextLine[] is an array of values from the line
        System.out.println(nextLine[0] + nextLine[1]);
    }

You have to ignore the header line, you can simply do this by incrementing a counter and skipping the zero value. To compute the average you can use a hashmap where the key is the column header name (example week 1). Then you increment with the current line value and after the loop is completed you divide by the number of lines (don't forget to substract the ignored lines like header line)

  • To check if your value is a double you can simply use Double.parseDouble static method and catch the NumberFormatException. A more elegant way is to define a regex for the acceptable values – Abdelhamid Bakhta Nov 06 '15 at 15:09
  • Is there a way to do it without the CSVReader? I have no idea what maven is :\ – Win Nov 06 '15 at 15:51
0

To parse simple CSV files, it's pretty simple to just manually parse through it, as long as you know the format is the same throughout the file and it does not contain errors

  1. Create a storage data structure for each column you wish to compute (use a LinkedList<String>)
  2. Read through the CSV file line by line with a BufferedReader
  3. Use String.split(',') on each line and add the specific columns in the returned array to the correct LinkedList
  4. Loop through the LinkedLists at the end and compute your averages (using Double.parseDouble() to convert the Strings to doubles)

To make sure that the String you're attempting to parse is a double, you can either use a try-catch statement or use a regex. Check Java: how to check that a string is parsable to a double? for more information

Community
  • 1
  • 1
phflack
  • 2,729
  • 1
  • 9
  • 21