
I'm trying to write a function to count specific Strings. The Strings to count look like the following:

first, any character except a comma, at least once - then the comma - then any character, at least once

example string: test, test, test,
should count to 3

I've tried to do that with the following:

int countSubstrings = 0;
final Pattern pattern = Pattern.compile("[^,]*,.+");
final Matcher matcher = pattern.matcher(commaString);
while (matcher.find()) {
    countSubstrings++;
}

My solution doesn't work, though. It always ends up counting to one and no further.

3 Answers


Try this pattern instead: [^,]+

As you can see in the API, find() will give you the next subsequence that matches the pattern. So this will find your sequences of "non-commas" one after the other.
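As a rough sketch of that idea (the class and method names here are made up for illustration), counting the runs of non-comma characters could look like this:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question.
class NonCommaRunCounter {
    // One or more characters that are not a comma.
    private static final Pattern NON_COMMA_RUN = Pattern.compile("[^,]+");

    static int countRuns(String input) {
        Matcher matcher = NON_COMMA_RUN.matcher(input);
        int count = 0;
        // Each find() continues where the previous match ended.
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countRuns("test, test, test,")); // prints 3
        System.out.println(countRuns("test,,test"));        // prints 2
    }
}
```

Note that this counts any non-empty text between commas, including whitespace-only text and text after the last comma.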

Tomas
  • +1 That would be the most elegant solution for counting non-empty text between commas - if the requirement is to do so (the question is quite unclear about that, i.e. what about empty text or a missing comma at the end?) – Thomas Oct 30 '15 at 10:40
  • @Thomas The question is *very* unclear :) I'm interpreting it in the most likely way here, but of course your solution is equally valid. And +1 for nice name :D – Tomas Oct 30 '15 at 10:41

The .+ part of your regex is greedy: it matches every remaining character up to the end of the line, so the first match consumes the whole input and find() never finds a second one. You want the match to be reluctant/lazy, so add a ?: [^,]*,.+?

Note that .+? will still match a comma that directly follows a comma, so you might want to replace .+? with [^,]+ instead (since that cannot match commas, laziness is not needed).
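To see the difference between the variants, here is a small comparison sketch. The countMatches helper and the class name are invented for this illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical comparison harness, not part of the original question.
class QuantifierComparison {
    static int countMatches(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String input = "test, test, test,";
        // Greedy .+ swallows the rest of the line in the first match.
        System.out.println(countMatches("[^,]*,.+", input));    // prints 1
        // Reluctant .+? takes just one character after each comma.
        System.out.println(countMatches("[^,]*,.+?", input));   // prints 2
        // [^,]+ stops at the next comma without needing laziness.
        System.out.println(countMatches("[^,]*,[^,]+", input)); // prints 2
    }
}
```

None of these reaches 3 for this input, because every variant requires at least one character after the comma and nothing follows the last one.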

Besides that, an easier solution might be to split the string and take the length of the resulting array (or loop over the elements and check them if you don't want to count empty strings):

countSubstrings = commaString.split(",").length;
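A minimal sketch of both variants, with illustrative names that are not from the question. Keep in mind that String.split with a single argument drops trailing empty strings but keeps empty strings in the middle, and that splitting an empty string yields one (empty) element:

```java
// Hypothetical helper, not part of the original question.
class SplitCounter {
    // Counts all parts, including empty ones between consecutive commas.
    static int countParts(String input) {
        return input.split(",").length;
    }

    // Counts only the parts that contain at least one character.
    static int countNonEmptyParts(String input) {
        int count = 0;
        for (String part : input.split(",")) {
            if (!part.isEmpty()) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countParts("test, test, test,"));  // prints 3
        System.out.println(countParts("test,,test"));         // prints 3
        System.out.println(countNonEmptyParts("test,,test")); // prints 2
        System.out.println(countParts(""));                   // prints 1
    }
}
```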

Edit:

Since you added an example that clarifies your expectations, you need to adjust your regex. You seem to want to count the number of strings followed by a comma, so your regex can be simplified to `[^,]+,` (the trailing comma is part of the pattern). This matches any sequence of non-comma chars that is followed by a comma.

Note that this wouldn't count empty entries between consecutive commas or text at the end of the input that has no comma after it, e.g. test,,test would result in a count of 1. If you have that requirement you need to adjust your regex.
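Plugged into the loop from the question, that could look roughly like the following (the class and method names are only for illustration):

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the original question.
class FollowedByCommaCounter {
    // One or more non-comma characters, then a comma.
    private static final Pattern ENTRY = Pattern.compile("[^,]+,");

    static int countEntries(String commaString) {
        Matcher matcher = ENTRY.matcher(commaString);
        int countSubstrings = 0;
        while (matcher.find()) {
            countSubstrings++;
        }
        return countSubstrings;
    }

    public static void main(String[] args) {
        System.out.println(countEntries("test, test, test,")); // prints 3
        // Only the first "test," matches; the second "test" has no comma after it.
        System.out.println(countEntries("test,,test"));        // prints 1
    }
}
```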

Thomas
  • The sequence `[^,]+,.+?` is not going to give three matches since there is nothing following the third comma. – Tomas Oct 30 '15 at 10:31
  • @Tomas I see there was an update containing an example. Will adjust to that. – Thomas Oct 30 '15 at 10:32

Good answers have already been given. Something like this should also work. Beware, it's not clean and probably not the fastest way to do this, but it is quite readable. :)

    public int countComma(String lots_of_words) {
        int count = 0;

        for (int x = 0; x < lots_of_words.length(); x++) {
            if (lots_of_words.charAt(x) == ',') {
                count++;
            }
        }
        return count;
    }

Or even better:

public int countChar(String lots_of_words, char the_chosen_char) {
    int count = 0;
    for (int x = 0; x < lots_of_words.length(); x++) {
        if (lots_of_words.charAt(x) == the_chosen_char) {
            count++;
        }
    }
    return count;
}