165

Why is the following algorithm not halting for me?

In the code below, str is the string I am searching in, and findStr is the string occurrences of which I'm trying to find.

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;
    
while (lastIndex != -1) {
    lastIndex = str.indexOf(findStr,lastIndex);
    
    if( lastIndex != -1)
        count++;
           
    lastIndex += findStr.length();
}

System.out.println(count);
Alexander Ivanchenko
  • 25,667
  • 5
  • 22
  • 46
  • 19
    We did a really good one in Udacity: we used newSTR = str.replace(findStr, ""); and returned count = ((str.length() - newSTR.length())/findStr.length()); – SolarLunix Sep 14 '15 at 17:46
  • Similar question for characters: http://stackoverflow.com/q/275944/873282 – koppor Apr 16 '17 at 19:42
  • Don't you also want to account for the case where the prefix of the search string is its suffix? In that case I don't think any of the suggested answers would work. [here](https://gist.github.com/itissid/c6b2dd2f3ffd55874dc8c2a721716a08) is an example. In that case you would need a more elaborate algorithm, like the Knuth Morris Pratt(KMP) which is coded up in the CLRS book – Sid Jul 25 '17 at 15:42
  • 2
    it is not halting for you, because after reaching your 'halt' condition (lastIndex == -1) you reset it by incrementing the value of lastIndex (lastIndex += findStr.length();) – Legna Aug 01 '17 at 22:30
  • @Sid if you wanted that behaviour you could just increment lastIndex by only 1 each time rather than findStr.length. In my case for example, I only need to know if a character matched or not, don't mind about counting multiple overlaps. so just depends on each individual use case – Adam Burley Jan 30 '21 at 02:44
  • take a look: https://www.geeksforgeeks.org/frequency-substring-string/ – modos Nov 03 '22 at 22:05

28 Answers28

232

How about using StringUtils.countMatches from Apache Commons Lang?

String str = "helloslkhellodjladfjhello";
String findStr = "hello";

System.out.println(StringUtils.countMatches(str, findStr));

That outputs:

3
demongolem
  • 9,474
  • 36
  • 90
  • 105
A_M
  • 7,693
  • 6
  • 33
  • 37
137

Your lastIndex += findStr.length(); was placed outside the brackets, causing an infinite loop (when no occurence was found, lastIndex was always to findStr.length()).

Here is the fixed version :

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;

while (lastIndex != -1) {

    lastIndex = str.indexOf(findStr, lastIndex);

    if (lastIndex != -1) {
        count++;
        lastIndex += findStr.length();
    }
}
System.out.println(count);
ndsmyter
  • 6,535
  • 3
  • 22
  • 37
Olivier
  • 3,465
  • 2
  • 23
  • 26
  • This will fail for string "aaa" and substring "aa". It will return 1 while the count is two. The indexes of the occurrences are [0,1] – Niko Oct 14 '21 at 21:31
  • @Niko The correct result depends on whether or not overlapping matches are allowed. – Unmitigated May 14 '22 at 16:36
107

A shorter version. ;)

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
System.out.println(str.split(findStr, -1).length-1);
Peter Lawrey
  • 525,659
  • 79
  • 751
  • 1,130
  • 9
    `return haystack.split(Pattern.quote(needle), -1).length - 1;` if for instance `needle=":)"` – Mr_and_Mrs_D Dec 16 '12 at 16:01
  • You can even remove second argument of `split()` for a shorter one-liner – Laurent Grégoire Dec 28 '12 at 11:46
  • 2
    @lOranger Without the `,-1` it will drop trailing matches. – Peter Lawrey Dec 28 '12 at 12:02
  • 4
    Ouch, thanks, good to know! This will teach me to read the small lines in the javadoc... – Laurent Grégoire Dec 28 '12 at 12:05
  • 4
    Nice! But it includes only non-overlapping matches, no? E.g. matching "aa" in "aaa" will return 1, not 2? Of course including overlapping or non-overlapping matches are both valid and dependent on user requirements (perhaps a flag to indicate count overlaps, yes/no)? – Cornel Masson Apr 26 '13 at 09:24
  • 3
    -1 .. try running this on "aaaa" and "aa".. the correct answer is 3 not 2. – Kalyanaraman Santhanam Sep 15 '14 at 06:06
  • 1
    Possibly this answer could be improved two ways: (1) Use `Splitter` from Google Guava, as it can work directly from a `String`, or (2) Metaquote the `String` via `Pattern.quote(String)` before calling `String.find(String, int)`. Without protective regex metaquoting, you may introduce subtle bugs. – kevinarpe Nov 03 '15 at 01:57
91

The last line was creating a problem. lastIndex would never be at -1, so there would be an infinite loop. This can be fixed by moving the last line of code into the if block.

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;

while(lastIndex != -1){

    lastIndex = str.indexOf(findStr,lastIndex);

    if(lastIndex != -1){
        count ++;
        lastIndex += findStr.length();
    }
}
System.out.println(count);
Cryptoclysm
  • 117
  • 13
codebreach
  • 2,155
  • 17
  • 30
  • 135
    This reply is the exact copy of the post I made an hour before ;) – Olivier Apr 20 '09 at 14:16
  • 11
    Note that this might or might not return the result expected. With substring "aa" and string to search "aaa" the number of occurences expected may be one (returned by this code), but may be two as well (in this case you'll need "lastIndex++" instead of "lastIndex += findStr.length()") depending on what you are looking for. – Stanislav Kniazev Apr 23 '09 at 12:52
  • 1
    @olivier didnt see that... :( @stan thats absolutely correct... i was just fixing the code in the problem... guess it depends on what bobcom means by number of occurrences in the string... – codebreach Apr 24 '09 at 20:52
  • do{ ... } while(lastIndex != -1) would be a better fit, esp. when processing multiple lines. – yanchenko Jul 11 '09 at 15:03
  • 2
    When are people going to learn to wrap stuff like this in a copy and paste static method? See my answer below, it's also more optimized. – mjs Apr 12 '15 at 09:54
  • if you didnt see oliver's post, means that both of you copied the same source from somewhere else :) – Baby Apr 15 '15 at 01:36
  • or both of us wrote the same thing, IIRC I wrote the whole thing in a text editor... how many different ways can one write such a simple program in anyways. – codebreach May 04 '15 at 03:33
  • 4
    The moral here is that if you're intending to write an answer, check _first_ whether or not someone else has already written the exact same answer. There's really no benefit in having the same answer appear twice, regardless of whether your answer was copied, or written independently. – Dawood ibn Kareem Jul 11 '18 at 04:03
  • plus one for everybody that ended up here from leet code :) – Niko Oct 15 '21 at 11:23
83

Do you really have to handle the matching yourself ? Especially if all you need is the number of occurences, regular expressions are tidier :

String str = "helloslkhellodjladfjhello";
Pattern p = Pattern.compile("hello");
Matcher m = p.matcher(str);
int count = 0;
while (m.find()){
    count +=1;
}
System.out.println(count);     
ndsmyter
  • 6,535
  • 3
  • 22
  • 37
Jean
  • 21,329
  • 5
  • 46
  • 64
  • 1
    This does NOT find special characters, it will find 0 count for strings below: `String str = "hel+loslkhel+lodjladfjhel+lo"; Pattern p = Pattern.compile("hel+lo");` – Ben Feb 02 '14 at 04:09
  • 13
    yes it will if you express your regex correctly. try with `Pattern.compile("hel\\+lo");` the `+` sign has a special meaning in a regex and needs to be escaped. – Jean Feb 02 '14 at 09:42
  • 4
    If what you are looking for is to take an arbitrary String and use it as an exact match with all special regular expression characters ignored, `Pattern.quote(str)` is your friend! – Mike Furtak Jan 10 '15 at 18:11
  • 2
    this does not work for "aaa" when str = "aaaaaa". There are 4 answer but yours giving 2 – Pujan Oct 29 '16 at 12:02
  • This solution doesn't work for this case: str = "This is a test \\n\\r string", subStr = "\\r", it shows 0 occurrences. – Maksym Ovsianikov Dec 01 '17 at 23:21
  • Can modern Java (such as 1.8) reduce that line count with some clever iterator? – Phlip Jan 01 '20 at 20:27
57

I'm very surprised no one has mentioned this one liner. It's simple, concise and performs slightly better than str.split(target, -1).length-1

public static int count(String str, String target) {
    return (str.length() - str.replace(target, "").length()) / target.length();
}
kmecpp
  • 2,371
  • 1
  • 23
  • 38
14

Here it is, wrapped up in a nice and reusable method:

public static int count(String text, String find) {
        int index = 0, count = 0, length = find.length();
        while( (index = text.indexOf(find, index)) != -1 ) {                
                index += length; count++;
        }
        return count;
}
mjs
  • 21,431
  • 31
  • 118
  • 200
8
public int countOfOccurrences(String str, String subStr) {
  return (str.length() - str.replaceAll(Pattern.quote(subStr), "").length()) / subStr.length();
}
  • Good answer. Can you mind adding some notes on how does it work? – santhosh kumar Sep 27 '17 at 20:07
  • Sure, str - is our source string, subStr - is a substring. The goal is to calculate amount of occurrences of subStr in str. To do this, we use the formula: (a-b)/c, where a - length of str, b - length of str without all occurrences of subStr (we remove all occurrences of subStr from str for this), c - length of subStr. So, basically we extract from the length of str - length of str without all subStr, and then we divide result on the length of subStr. Please let me know if you have any other questions. – Maksym Ovsianikov Oct 17 '17 at 00:35
  • Santhosh, you are welcome! The important part is to use Pattern.quote for subStr, otherwise in may fail in some cases, like this one: str = "This is a test \\n\\r string", subStr = "\\r". Some similar answers provided here don't use Pattern, so they will fail in such cases. – Maksym Ovsianikov Dec 01 '17 at 23:17
  • 2
    There is no reason for regex, use `replace`, not `replaceAll`. – NateS May 13 '20 at 13:14
8
String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int lastIndex = 0;
int count = 0;

while((lastIndex = str.indexOf(findStr, lastIndex)) != -1) {
     count++;
     lastIndex += findStr.length() - 1;
}
System.out.println(count);

at the end of the loop count is 3; hope it helps

dumbPotato21
  • 5,669
  • 5
  • 21
  • 34
dfa
  • 114,442
  • 31
  • 189
  • 228
  • 7
    The code contains an error. If we search for a single character, the `findStr.length() - 1` returns 0 and we are in an endless cycle. – Jan Bodnar Sep 26 '14 at 10:58
7

A lot of the given answers fail on one or more of:

  • Patterns of arbitrary length
  • Overlapping matches (such as counting "232" in "23232" or "aa" in "aaa")
  • Regular expression meta-characters

Here's what I wrote:

static int countMatches(Pattern pattern, String string)
{
    Matcher matcher = pattern.matcher(string);

    int count = 0;
    int pos = 0;
    while (matcher.find(pos))
    {
        count++;
        pos = matcher.start() + 1;
    }

    return count;
}

Example call:

Pattern pattern = Pattern.compile("232");
int count = countMatches(pattern, "23232"); // Returns 2

If you want a non-regular-expression search, just compile your pattern appropriately with the LITERAL flag:

Pattern pattern = Pattern.compile("1+1", Pattern.LITERAL);
int count = countMatches(pattern, "1+1+1"); // Returns 2
benkc
  • 3,292
  • 1
  • 28
  • 37
6

You can number of occurrences using inbuilt library function:

import org.springframework.util.StringUtils;
StringUtils.countOccurrencesOf(result, "R-")
Victor
  • 761
  • 8
  • 7
3

Increment lastIndex whenever you look for next occurrence.

Otherwise it's always finding the first substring (at position 0).

Alexander Farber
  • 21,519
  • 75
  • 241
  • 416
Stanislav Kniazev
  • 5,386
  • 3
  • 35
  • 44
2

The answer given as correct is no good for counting things like line returns and is far too verbose. Later answers are better but all can be achieved simply with

str.split(findStr).length

It does not drop trailing matches using the example in the question.

Mark
  • 131
  • 1
  • 7
2
public int indexOf(int ch,
                   int fromIndex)

Returns the index within this string of the first occurrence of the specified character, starting the search at the specified index.

So your lastindex value is always 0 and it always finds hello in the string.

Alexander Farber
  • 21,519
  • 75
  • 241
  • 416
Bhushan Bhangale
  • 10,921
  • 5
  • 43
  • 71
  • lastIndex is set to the return value and then incremented, the only way it is 0 after an iteration of the loop is if the length of the substring is 1 – Ivo Sep 02 '21 at 09:33
1

Try this one. It replaces all the matches with a -.

String str = "helloslkhellodjladfjhello";
String findStr = "hello";
int numberOfMatches = 0;
while (str.contains(findStr)){
    str = str.replaceFirst(findStr, "-");
    numberOfMatches++;
}

And if you don't want to destroy your str you can create a new string with the same content:

String str = "helloslkhellodjladfjhello";
String strDestroy = str;
String findStr = "hello";
int numberOfMatches = 0;
while (strDestroy.contains(findStr)){
    strDestroy = strDestroy.replaceFirst(findStr, "-");
    numberOfMatches++;
}

After executing this block these will be your values:

str = "helloslkhellodjladfjhello"
strDestroy = "-slk-djladfj-"
findStr = "hello"
numberOfMatches = 3
Xander
  • 5,487
  • 14
  • 49
  • 77
1

Here is the advanced version for counting how many times the token occurred in a user entered string:

public class StringIndexOf {

    public static void main(String[] args) {

        Scanner scanner = new Scanner(System.in);

        System.out.println("Enter a sentence please: \n");
        String string = scanner.nextLine();

        int atIndex = 0;
        int count = 0;

        while (atIndex != -1)
        {
            atIndex = string.indexOf("hello", atIndex);

            if(atIndex != -1)
            {
                count++;
                atIndex += 5;
            }
        }

        System.out.println(count);
    }

}
Alexander Farber
  • 21,519
  • 75
  • 241
  • 416
Venzentx
  • 487
  • 3
  • 6
  • 17
1

This below method show how many time substring repeat on ur whole string. Hope use full to you:-

    String searchPattern="aaa"; // search string
    String str="aaaaaababaaaaaa"; // whole string
    int searchLength = searchPattern.length(); 
    int totalLength = str.length(); 
    int k = 0;
    for (int i = 0; i < totalLength - searchLength + 1; i++) {
        String subStr = str.substring(i, searchLength + i);
        if (subStr.equals(searchPattern)) {
           k++;
        }

    }
duggu
  • 37,851
  • 12
  • 116
  • 113
1

As @Mr_and_Mrs_D suggested:

String haystack = "hellolovelyworld";
String needle = "lo";
return haystack.split(Pattern.quote(needle), -1).length - 1;
Ron Tesler
  • 1,146
  • 1
  • 11
  • 24
1

Based on the existing answer(s) I'd like to add a "shorter" version without the if:

String str = "helloslkhellodjladfjhello";
String findStr = "hello";

int count = 0, lastIndex = 0;
while((lastIndex = str.indexOf(findStr, lastIndex)) != -1) {
    lastIndex += findStr.length() - 1;
    count++;
}

System.out.println(count); // output: 3
sjkm
  • 3,887
  • 2
  • 25
  • 43
  • this one takes into account if the string repeats, for instance if you are looking for the string 'xx' in a string 'xxx'. – tCoe Sep 29 '16 at 21:23
1

This solution prints the total number of occurrence of a given substring throughout the string, also includes the cases where overlapping matches do exist.

class SubstringMatch{
    public static void main(String []args){
        //String str = "aaaaabaabdcaa";
        //String sub = "aa";
        //String str = "caaab";
        //String sub = "aa";
        String str="abababababaabb";
        String sub = "bab";

        int n = str.length();
        int m = sub.length();

        // index=-1 in case of no match, otherwise >=0(first match position)
        int index=str.indexOf(sub), i=index+1, count=(index>=0)?1:0;
        System.out.println(i+" "+index+" "+count);

        // i will traverse up to only (m-n) position
        while(index!=-1 && i<=(n-m)){   
            index=str.substring(i, n).indexOf(sub);
            count=(index>=0)?count+1:count;
            i=i+index+1;  
            System.out.println(i+" "+index);
        }
        System.out.println("count: "+count);
    }
}
Anubhav Singh
  • 8,321
  • 4
  • 25
  • 43
1

Matcher.results()

You can find the number of occurrences of a substring in a string using Java 9 method Matcher.results() with a single line of code.

It produces a Stream of MatchResult objects which correspond to captured substrings, and the only thing needed is to apply Stream.count() to obtain the number of elements in the stream.

public static long countOccurrences(String source, String find) {
    
    return Pattern.compile(find) // Pattern
        .matcher(source) // Mather
        .results()       // Stream<MatchResults>
        .count();
}

main()

public static void main(String[] args) {
    System.out.println(countOccurrences("helloslkhellodjladfjhello", "hello"));
}

Output:

3
Alexander Ivanchenko
  • 25,667
  • 5
  • 22
  • 46
1

try adding lastIndex+=findStr.length() to the end of your loop, otherwise you will end up in an endless loop because once you found the substring, you are trying to find it again and again from the same last position.

0

here is the other solution without using regexp/patterns/matchers or even not using StringUtils.

String str = "helloslkhellodjladfjhelloarunkumarhelloasdhelloaruhelloasrhello";
        String findStr = "hello";
        int count =0;
        int findStrLength = findStr.length();
        for(int i=0;i<str.length();i++){
            if(findStr.startsWith(Character.toString(str.charAt(i)))){
                if(str.substring(i).length() >= findStrLength){
                    if(str.substring(i, i+findStrLength).equals(findStr)){
                        count++;
                    }
                }
            }
        }
        System.out.println(count);
0

If you need the index of each substring within the original string, you can do something with indexOf like this:

 private static List<Integer> getAllIndexesOfSubstringInString(String fullString, String substring) {
    int pointIndex = 0;
    List<Integer> allOccurences = new ArrayList<Integer>();
    while(fullPdfText.indexOf(substring,pointIndex) >= 0){
       allOccurences.add(fullPdfText.indexOf(substring, pointIndex));
       pointIndex = fullPdfText.indexOf(substring, pointIndex) + substring.length();
    }
    return allOccurences;
}
Rhino
  • 101
  • 8
0
public static int getCountSubString(String str , String sub){
int n = 0, m = 0, counter = 0, counterSub = 0;
while(n < str.length()){
  counter = 0;
  m = 0;
  while(m < sub.length() && str.charAt(n) == sub.charAt(m)){
    counter++;
    m++; n++;
  }
  if (counter == sub.length()){
    counterSub++;
    continue;
  }
  else if(counter > 0){
    continue;
  }
  n++;
}

return  counterSub;

}

Nikolai Nechai
  • 111
  • 1
  • 3
  • this question is 8 years old, and without any indication of why this is a better solution than the 22 other solutions posted, it should probably be removed – Jason Wheeler Nov 29 '17 at 23:36
0

Just a little more peachy answer

    public int countOccurrences(String str, String sub) {
        if (str == null || str.length() == 0 || sub == null || sub.length() == 0) return 0;
        int count = 0;
        int i = 0;
        while ((i = str.indexOf(sub, i)) != -1) {
            count++;
            i += sub.length();
        }
        return count;
    }
ucMedia
  • 4,105
  • 4
  • 38
  • 46
0

I was asked this question in an interview just now and I went completely blank. (Like always, I told myself that the moment the interview ends ill get the solution) which I did, 5 mins after the call ended :(

    int subCounter=0;
    int count =0;
    for(int i=0; i<str.length(); i++) {
        if((subCounter==0 && "a".equals(str.substring(i,i+1))) 
                || (subCounter==1 && "b".equals(str.substring(i,i+1)))
                || (subCounter==2 && "c".equals(str.substring(i,i+1)))) {
            ++subCounter;
        }
        if(subCounter==3) {
            count = count+1;
            subCounter=0;
        }
    }
    System.out.println(count);
Shahid Sarwar
  • 1,209
  • 14
  • 29
0

The best solution to this problem you can find in org.springframework.util.StringUtils.countOccurrencesOf(string, substring):

// IndexOfWithJumpSubstringCounterImpl (countOccurrencesOf after refactoring)
public static int count(String string, String substring) {
    if (string == null || string.length() == 0 
        || substring == null || substring.length() == 0) {
        return 0;
    }

    int count = 0;
    int idx;
    for(int pos = 0; (idx = string.indexOf(substring, pos)) != -1; pos = idx + substring.length()) {
        ++count;
    }

    return count;
}

There is performance comparison based on JMH (full report: https://medium.com/p/d924cf933fc3):

(impl)                                Mode  Cnt      Score     Error   Units
IndexOfWithJumpSubstringCounterImpl  thrpt   10  86171.752 ± 225.064  ops/ms
IndexOfSubstringCounterImpl          thrpt   10  77560.418 ± 154.745  ops/ms
ReplaceBasedSubstringCounterImpl     thrpt   10  29758.761 ±  35.899  ops/ms
RegExSubstringCounterImpl            thrpt   10   5121.197 ±  10.030  ops/ms
mrkandreev
  • 31
  • 1
  • 4