-4

What function should I use in Java to get the total number of colons in a CSV file?

PS: not a Java developer.

Amir Hassan Azimi

4 Answers

3

Read the file char by char (using a BufferedReader to make it fast), and count each colon you meet:

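import java.io.BufferedReader;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStreamReader;

int countColons() throws IOException {
    // try-with-resources closes the reader even if read() throws
    try (BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream("file.txt"), "UTF-8"))) {
        int count = 0;
        int c;
        // read() returns -1 at the end of the stream
        while ((c = in.read()) >= 0) {
            if (c == ':') {
                count++;
            }
        }
        return count;
    }
}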

Of course, you should use the appropriate encoding for your file, which is not necessarily UTF-8.

JB Nizet
1

Read the file line by line. For every line, use replaceAll to get rid of every character that isn't a colon, then take the length of the resulting String. Keep a running total of these lengths.
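The steps above can be sketched roughly as follows. This is only an illustration: the class name ColonLineCounter and the path and charset parameters are assumptions made for the example, not part of the original answer.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, named here only for illustration.
class ColonLineCounter {
    // Counts colons by stripping every non-colon character from each line
    // and adding up the lengths of what remains.
    static long countColons(Path path, Charset charset) throws IOException {
        long total = 0;
        try (BufferedReader in = Files.newBufferedReader(path, charset)) {
            String line;
            // readLine() returns null at the end of the file
            while ((line = in.readLine()) != null) {
                // "[^:]" matches any character that is not a colon
                total += line.replaceAll("[^:]", "").length();
            }
        }
        return total;
    }
}
```

Only one line is held in memory at a time, so this should cope with large files as long as individual lines are of reasonable length.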

Dawood ibn Kareem
  • replaceAll to count characters? Why not read the file char by char and count the ones equal to a colon? Sounds much more logical and efficient to me. – JB Nizet Feb 18 '14 at 22:27
  • @JBNizet you could post that as an answer. I'll upvote it if you do. – Dawood ibn Kareem Feb 18 '14 at 22:28
  • @DavidWallace Why line by line when `org.apache.commons.io.FileUtils` reads the entire file to a String? :-) – corsiKa Feb 18 '14 at 22:30
  • @corsiKa Because very often, one must deal with files that are really big. If you're only processing a file in start-to-finish order, you really don't want to use up a huge amount of heap space doing so. I am a bit dismayed by how many of the answers here start by loading the entire file into memory - it's really not good practice. – Dawood ibn Kareem Feb 18 '14 at 22:33
  • @DavidWallace I find that very hard to swallow. I can't remember the last time I had to deal with something that big. If it really is that big, it shouldn't be in a file, it should be in a database. – corsiKa Feb 18 '14 at 22:38
  • @corsiKa Happens *all* the time. You can argue that you should read it into a DB or key/value store, but that's an additional, potentially unnecessary step. There are a bunch of reasons why passing around files is reasonable under a variety of conditions. – Dave Newton Feb 18 '14 at 22:42
  • @corsiKa Well, in my current job, I frequently have to deal with enormous files. Like hundreds of megabytes of XML. It would seem a bit rich to assume that the OP doesn't. But whether he does or not, we should be promoting best practices in our answers. And loading a file into memory when you only need to read it from start to finish is absolutely _not_ a best practice. – Dawood ibn Kareem Feb 18 '14 at 22:45
  • @DavidWallace I disagree with your last assertion. I think doing extra work because it might save some heap space, at the expense of making the code much less readable and straightforward, is a bad practice. And it can be argued both ways, but personally I'd say leave it simple (as per wypieprz's solution) unless you've identified there's a problem. – corsiKa Feb 18 '14 at 23:16
1

If you don't want to reinvent the wheel:

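import java.io.File;
import java.nio.charset.StandardCharsets;
import org.apache.commons.io.FileUtils;
import org.apache.commons.lang3.StringUtils;

// Pass the charset explicitly (UTF-8 is an assumption here; use your file's encoding)
int count = StringUtils.countMatches(FileUtils.readFileToString(new File("file.csv"), StandardCharsets.UTF_8), ":");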
wypieprz
  • This is excellent - knowing these kinds of utilities makes mundane tasks just disappear so we can focus on more important development work. – corsiKa Feb 18 '14 at 22:31
0

One cool trick I found a while ago for counting the number of occurrences is to take the length of the string and subtract from it the length of the same string with every occurrence of the desired value removed.

Example

// Assume fileStr contains everything in the file

int numberOfColons = fileStr.length() - fileStr.replaceAll(":", "").length();

This will give you the number of colons in the file.
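For completeness, here is a minimal sketch of how fileStr might be obtained, assuming the file is small enough to fit in memory. The class name ColonLengthDiff and the path and charset parameters are hypothetical. It also uses String.replace, which treats its argument as a literal rather than a regular expression, in place of replaceAll; for a single colon both give the same result.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper class, named here only for illustration.
class ColonLengthDiff {
    static int countColons(Path path, Charset charset) throws IOException {
        // Loads the entire file into memory, so this suits small files only
        String fileStr = new String(Files.readAllBytes(path), charset);
        // Each removed colon shortens the string by exactly one char
        return fileStr.length() - fileStr.replace(":", "").length();
    }
}
```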

Edit

Just remembered where I got it from. It is from this question.

The reason why I like this approach

Obviously, it's extremely short, which is always nice. It does cost some extra CPU time and memory, since replaceAll builds a second copy of the string, but it avoids all loops (in your code at least), and it seems like a very elegant solution to the problem.

christopher