
I have over 300 CSV files and I am supposed to read all of them and then perform various operations on them. First I am supposed to read only the first half of each file and then, later on, only the first third.

This is how the CSV files look. I am only supposed to read the avg_rss12 column, and of that column only the first half at first and the first third later on. Each file has over 500 lines, and the number of lines differs from file to file.

Suppose a file has 500 lines: then I am supposed to read the first 250 lines (for half the file) or the first 167 lines (for one-third of the file). The number of lines differs between files and there are over 300 files, so I cannot manually modify each one.

# Task: bending1                        
# Frequency (Hz): 20                        
# Clock (millisecond): 250                      
# Duration (seconds): 120                       
# Columns: time avg_rss12   var_rss12   avg_rss13   var_rss13   avg_rss23   var_rss23
0   39.25   0.43    22.75   0.43    33.75   1.3
250 39.25   0.43    23  0   33  0
500 39.25   0.43    23.25   0.43    33  0
750 39.5    0.5 23  0.71    33  0
1000    39.5    0.5 24  0   33  0
1250    39.25   0.43    24  0   33  0
1500    39.25   0.43    24  0   33  0

Here is my code. For some reason, it is not reading the file at all. Also, is my approach correct, or am I doing something wrong?

public static void main(String args[])
        {
            String path_Test = "E:\\DTW-KNN\\Dataset\\Test\\bending1\\dataset1.csv";

            File dataFile = new File(path_Test);
            long data_size = dataFile.length();
            String[] test = null;
            int count = 0;
            int i;

            try {
                BufferedReader reader = new BufferedReader(new FileReader(dataFile));
                for (i = 0; i <= data_size/2.0; i++) {

                test[i] = reader.readLine();
                    System.out.println(test[i]);
                    count++;
                }

            }
            catch (Exception e)
            {
                e.printStackTrace();
            }

            System.out.println(count);

        }
Sam
  • Please clarify what you mean by "half" and "one-third" of the file. It would be helpful if you provided an example of what output you expect with the sample input you've given. – Daniel Centore Apr 03 '18 at 20:25
  • If a file has 500 lines, then I am supposed to read the first 250 lines (for half the file) or the first 167 lines (for one-third of the file). I have also edited this into the question. – Sam Apr 03 '18 at 20:27
  • Do you need to ignore lines starting with a hash (#)? – Daniel Centore Apr 03 '18 at 20:29
  • Yes. I am supposed to start reading from the line just below the header line that names the avg_rss12 column. – Sam Apr 03 '18 at 20:30
  • I think your code would make sense if you were reading from a .txt or some other text-based file. Since this is a CSV, I don't think you can read the data line by line. – Ninja Apr 03 '18 at 20:30
  • There's nothing stopping you from reading a CSV as a text file. – Tripp Kinetics Apr 03 '18 at 20:31
  • `File.length();` will give you the size of the file in bytes. Reading half of the bytes does not necessarily mean reading half of the lines. You should instead read all the data in the file while counting the lines and at the end (once you know how many lines there are) just throw away the lines you don't need. – MatheM Apr 03 '18 at 20:31
  • The number of lines in the file appears to be `Duration * 1000 / Clock` (not counting the file header) – Sean Bright Apr 03 '18 at 20:32
  • If data_size is n BYTES, you are trying to read n/2 LINES. If the file has one line of 300 bytes, you are trying to read 150 lines. What exactly is the meaning of "it is not reading the file at all"? – Chuidiang Apr 03 '18 at 20:32
  • Doing the `for` loop's check as a floating-point operation gives you nothing and reduces performance. – Tripp Kinetics Apr 03 '18 at 20:33
  • @MatheM So you mean I am supposed to read the entire file, store it in an array, and then discard the second half of the array to get the first half of the file? – Sam Apr 03 '18 at 20:34
  • Possible duplicate of [What is a NullPointerException, and how do I fix it?](https://stackoverflow.com/q/218384/5221149) – Andreas Apr 03 '18 at 20:34
  • You don't need to read the whole file, but there are ways of counting the lines in the file and finding out how many bytes are in those lines. – Tripp Kinetics Apr 03 '18 at 20:35
  • @TrippKinetics Something like `long lineCount = Files.lines(path).count();` in Java 8? – Sam Apr 03 '18 at 20:37
  • @Samvid Kulkarni Pretty much, yes. Unless you know beforehand how many lines there are in the file, you have to read it all. The file doesn't "know" how many lines it contains. – MatheM Apr 03 '18 at 20:37
  • I see. Thanks, everyone. I will make changes accordingly. – Sam Apr 03 '18 at 20:39

3 Answers


You appear to be attempting to store your rows in a null array (test). You have to allocate the array to (at least) half the number of lines in the file before you try to put any data in it.
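
A minimal sketch of that fix, assuming the number of lines to read is already known. The `FirstLines` class and `readFirstLines` method are hypothetical names used only for illustration. The array is allocated before the loop, and the loop also stops when `readLine()` returns `null` at the end of the input:

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.Arrays;

class FirstLines {
    // Reads at most maxLines lines from the reader.
    static String[] readFirstLines(BufferedReader reader, int maxLines) throws IOException {
        // Allocated up front, unlike `String[] test = null` in the question.
        String[] lines = new String[maxLines];
        int count = 0;
        String line;
        // Stop at the requested count or at end of input, whichever comes first.
        while (count < maxLines && (line = reader.readLine()) != null) {
            lines[count++] = line;
        }
        // Trim the array if the input had fewer lines than requested.
        return Arrays.copyOf(lines, count);
    }

    public static void main(String[] args) throws IOException {
        // In-memory input stands in for a file here.
        try (BufferedReader reader = new BufferedReader(new StringReader("a\nb\nc\nd\n"))) {
            System.out.println(Arrays.toString(readFirstLines(reader, 2))); // prints [a, b]
        }
    }
}
```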

Tripp Kinetics

I would do something like this. Just read all the valid lines into a list; then you can iterate over as many of them as you want and parse whatever data you need out of each one.

    Scanner scan = new Scanner(new File("dataset1.csv"));
    List<String> lines = new ArrayList<>();
    while (scan.hasNextLine()) {
        String line = scan.nextLine();
        // If line is not a comment or empty
        if (!line.startsWith("#") && !line.trim().isEmpty()) {
            lines.add(line);
        }
    }
    scan.close();
    // Go through half the lines
    for (int i = 0; i < lines.size() / 2; ++i) {
        String line = lines.get(i);
        String[] split = line.split("\\s+");  // split on whitespace
        double avg_rss12 = Double.parseDouble(split[1]);
        System.out.println(avg_rss12);
    }
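
The same idea can be generalized to both fractions the question asks for. This is only a sketch under stated assumptions: the `ColumnPrefix` class and `readAvgRss12` method are hypothetical names, the fraction is taken over the data rows (comment and blank lines excluded), and the row count is rounded to the nearest integer so that 500 rows give 250 for a half and 167 for a third, matching the numbers in the question.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

class ColumnPrefix {
    // Returns the avg_rss12 values (second column) from the first 1/divisor of the data rows.
    static List<Double> readAvgRss12(Path file, int divisor) throws IOException {
        if (divisor < 1) {
            throw new IllegalArgumentException("divisor must be >= 1");
        }
        List<String> rows = new ArrayList<>();
        for (String line : Files.readAllLines(file, StandardCharsets.UTF_8)) {
            // Skip the '#' header lines and blank lines.
            if (!line.startsWith("#") && !line.trim().isEmpty()) {
                rows.add(line);
            }
        }
        // Assumption: round to nearest, so 500 rows / 3 gives 167.
        int wanted = (int) Math.round(rows.size() / (double) divisor);
        List<Double> values = new ArrayList<>();
        for (String row : rows.subList(0, wanted)) {
            // Assumption: columns are whitespace-separated, as in the sample.
            values.add(Double.parseDouble(row.trim().split("\\s+")[1]));
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java ColumnPrefix <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        System.out.println("half:  " + readAvgRss12(file, 2));
        System.out.println("third: " + readAvgRss12(file, 3));
    }
}
```

If the files turn out to be comma-separated rather than whitespace-separated, the split pattern would need to change accordingly.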
Daniel Centore

Your code is not working, but with a few modifications you can read the entire file and print it to the console (modified minimally so that it still looks like your code):

public static void main(String args[]) {
    String pathTest = "E:\\DTW-KNN\\Dataset\\Test\\bending1\\dataset1.csv";
    File file = new File(pathTest);
    BufferedReader reader = null;
    try {
        reader = new BufferedReader(new FileReader(file));

        String line = reader.readLine();
        while (line != null) {
            System.out.println(line);
            line = reader.readLine();
        }
    } catch (Exception e) {
        e.printStackTrace();
    } finally {
        if (reader != null) {
            try {
                reader.close(); //always close it, or use try-with-resources from Java7 :)
            } catch (IOException e) {
            }
        }
    }
}

It's not clear whether you need to read half and one-third of the 300 files or of each file, but I believe it is the latter, so my strategy would be:

  • Get the number of lines in the file (note that file.length() returns the file size in bytes, not the line count)
  • Calculate how many lines you need to read from that line count (a half or one third)
  • Read that many lines

In Java 8 you can get the line count from a file using:

Files.lines(Paths.get(fileName)).count();
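
One caveat with the one-liner above: the stream returned by `Files.lines` holds the file open until it is closed, so with 300 files it is safer to wrap it in try-with-resources. Below is a sketch of the count-then-read strategy done in two passes over the file. The `LineCounter` class and its method names are hypothetical, and it assumes that lines starting with `#` and blank lines should not count toward the total.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LineCounter {
    // A data line is anything that is neither a '#' header line nor blank.
    static boolean isData(String line) {
        return !line.startsWith("#") && !line.trim().isEmpty();
    }

    // First pass: count the data lines; the stream is closed when the block exits.
    static long countDataLines(Path file) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.filter(LineCounter::isData).count();
        }
    }

    // Second pass: keep only the first `limit` data lines.
    static List<String> firstDataLines(Path file, long limit) throws IOException {
        try (Stream<String> lines = Files.lines(file)) {
            return lines.filter(LineCounter::isData)
                        .limit(limit)
                        .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java LineCounter <file>");
            return;
        }
        Path file = Paths.get(args[0]);
        long total = countDataLines(file);
        // Integer division rounds down; adjust if a different rounding is required.
        firstDataLines(file, total / 2).forEach(System.out::println);
    }
}
```

Reading each file twice costs extra I/O but avoids holding the whole file in memory. For files of a few hundred lines, either approach should be fine.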
mrdc