0

Given the following:

String s = "The The The the the the";

How can I find how many instances of "The" are in the string s?

s.matches("The") only tells me whether at least one is there, and s.contains("The") is the same.

Is there some simple way?

mix
  • 1
    Dupe of http://stackoverflow.com/questions/8975019/java-find-the-number-of-times-a-word-is-present-in-a-string-is-there-something – sgowd Mar 26 '12 at 07:01

6 Answers

4

As far as I know, the Matcher.find() method attempts to find the next subsequence of the input sequence that matches the pattern. That means you can iterate through the matches by calling this method repeatedly:

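// Needs java.util.regex.Matcher and java.util.regex.Pattern.
Matcher matcher = Pattern.compile("The").matcher(s);
int count = 0;
while (matcher.find()) {
  count++;
}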

If you also need the position of each matching subsequence, use Matcher.start() and Matcher.end() inside the loop.
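For completeness, here is a minimal self-contained sketch of the Matcher.find() loop. The class name RegexCount and the method name countMatches are made up for illustration. Pattern.quote is used on the assumption that the search word should be treated as literal text rather than as a regular expression.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexCount {
    // Hypothetical helper: counts non-overlapping, case-sensitive matches of a literal word.
    static int countMatches(String text, String word) {
        // Pattern.quote escapes regex metacharacters so the word is matched literally.
        Matcher matcher = Pattern.compile(Pattern.quote(word)).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String s = "The The The the the the";
        System.out.println(countMatches(s, "The")); // prints 3
    }
}
```

Note that the match is case-sensitive, so the three lowercase "the" words are not counted.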

kant
2

You can use indexOf(str, fromIndex), where the second argument is the index to start searching from:

int count = 0;
String s = "The The The the the the";
String match = "The";
int searchStart = 0;

while ((searchStart = s.indexOf(match, searchStart)) != -1)
{
    count++;
    searchStart += match.length();
}
npinti
  • This goes into an infinite loop. – Chandra Sekhar Mar 26 '12 at 07:13
  • Indeed, this needs a `searchStart += match.length()` in the loop body. – bezmax Mar 26 '12 at 07:16
  • @ChandraSekhar: I think you are right. It should now be fixed. Unfortunately I can't test the code at the moment. – npinti Mar 26 '12 at 07:17
  • @Max: Incrementing it by 1 should be enough. But you do have a point. It has been fixed. – npinti Mar 26 '12 at 07:23
  • @npinti: I'd rather increment it by the length of the string, otherwise it will check the same symbols several times. Imagine the `match` string is 1000 symbols long: it would waste 1000 CPU cycles every iteration recomparing what has already been compared. – bezmax Mar 26 '12 at 07:27
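The choice of increment discussed in these comments also changes the result when matches can overlap. The sketch below is only an illustration: the class IndexOfStep and its count method are hypothetical names, and the step parameter is assumed to be at least 1 with a non-empty match string.

```java
public class IndexOfStep {
    // Hypothetical helper: counts matches, advancing by `step` characters after each hit.
    // Assumes step >= 1 and a non-empty match string.
    static int count(String text, String match, int step) {
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(match, from)) != -1) {
            count++;
            from += step;
        }
        return count;
    }

    public static void main(String[] args) {
        String text = "aaaa";
        String match = "aa";
        System.out.println(count(text, match, match.length())); // prints 2 (non-overlapping)
        System.out.println(count(text, match, 1));              // prints 3 (overlapping)
    }
}
```

For the string in the question both increments give 3, because "The" cannot overlap with itself.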
0

You can use s.indexOf("The", index). If it returns an index (anything other than -1), increment the count, move index past the match, and repeat in a loop until no further match is found.

NOTE: Initially the value of index is 0
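The loop described above could be sketched roughly as follows. The class and method names are hypothetical, and returning 0 for an empty search word is an assumption made here: without that guard, indexOf would keep returning the same index and the loop would never end.

```java
public class IndexOfCount {
    // Hypothetical helper implementing the indexOf loop described above.
    static int countOccurrences(String text, String word) {
        if (word.isEmpty()) {
            return 0; // assumption: an empty search word counts as zero matches
        }
        int count = 0;
        int index = 0; // initially the value of index is 0
        while (true) {
            index = text.indexOf(word, index);
            if (index == -1) {
                break; // no further match
            }
            count++;
            index += word.length(); // continue searching after this match
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countOccurrences("The The The the the the", "The")); // prints 3
    }
}
```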

Chandra Sekhar
0

Give this a try:

String test = "The The The the the the";
System.out.println(test.split("The").length);
Jiří Šitina
  • Why does this return the count + 1? E.g. test.split("blah").length = 1 – mix Mar 26 '12 at 07:28
  • 1
    Thanks for the remark. The reason is that this call actually splits the string using the provided delimiter (the word, in our case), so there are (number of occurrences of the separator) + 1 parts. – Jiří Šitina Mar 26 '12 at 07:34
  • 2
    see my split answer which has this fixed. – krystan honour Mar 26 '12 at 07:42
  • This won't necessarily work due to the weird behavior of `String.split`. `String test = "The The The the the theTheTheThe"; System.out.println(test.split("The").length);` prints out `4`, which is _definitely_ not the right answer. `String.split` has weird behavior on trailing delimiters which will make this just not work if you have occurrences of the search string at the end of the searched string. – Louis Wasserman Mar 26 '12 at 12:30
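The trailing-delimiter behavior described in the last comment can be reproduced with a short sketch. The class and method names are made up for illustration. It compares the default split with a split that passes a negative limit, which keeps trailing empty strings.

```java
public class SplitTrailing {
    // Number of pieces from the default split, which discards trailing empty strings.
    static int partsDefault(String text, String word) {
        return text.split(word).length;
    }

    // Occurrence count using a negative limit, so trailing empty strings are kept.
    // Assumes `word` contains no regex metacharacters, since split takes a regex.
    static int countWithLimit(String text, String word) {
        return text.split(word, -1).length - 1;
    }

    public static void main(String[] args) {
        String test = "The The The the the theTheTheThe";
        System.out.println(partsDefault(test, "The"));   // prints 4
        System.out.println(countWithLimit(test, "The")); // prints 6
    }
}
```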
0

Simply split the string on the word to be counted.

String text = "the the water the the";
System.out.println(text.split("the", -1).length - 1);
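One caveat worth sketching: String.split treats its first argument as a regular expression, so a search word containing characters such as "." or "*" will not be matched literally. The example below wraps the word in Pattern.quote. The class and method names are hypothetical.

```java
import java.util.regex.Pattern;

public class SplitLiteral {
    // Hypothetical helper: counts literal occurrences of `word` via split with a negative limit.
    static int countLiteral(String text, String word) {
        // Pattern.quote makes split treat the word as plain text, not as a regex.
        return text.split(Pattern.quote(word), -1).length - 1;
    }

    public static void main(String[] args) {
        String text = "a.b.c";
        // Unquoted, "." is a regex that matches every character.
        System.out.println(text.split(".", -1).length - 1); // prints 5
        System.out.println(countLiteral(text, "."));        // prints 2
    }
}
```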

Also, if you are already using Apache Commons Lang, you could use the countMatches function from StringUtils:

String text = "the the water the the";
int count = StringUtils.countMatches(text, "the");
System.out.println("count is " + count);

However, don't bring the library in just for that one function; that's a bit of overkill :)

krystan honour
-1
import java.util.*;

String s = "The The The The The sdfadsfdas";

// Split into space-separated words, then print how often each distinct word occurs.
List<String> list = Arrays.asList(s.split(" "));

Set<String> unique = new HashSet<String>(list);
for (String key : unique) {
    System.out.println(key + ": " + Collections.frequency(list, key));
}
kandarp