59

How can I get the number of times an NSString (for example, @"cake") appears in a larger NSString (for example, @"Cheesecake, apple cake, and cherry pie")?

I need to do this on a lot of strings, so whatever method I use would need to be relatively fast.

Thanks!

igul222

13 Answers

103

This isn't tested, but should be a good start.

NSUInteger count = 0, length = [str length];
NSRange range = NSMakeRange(0, length); 
while(range.location != NSNotFound)
{
  range = [str rangeOfString: @"cake" options:0 range:range];
  if(range.location != NSNotFound)
  {
    range = NSMakeRange(range.location + range.length, length - (range.location + range.length));
    count++; 
  }
}
Matthew Flaschen
  • 1
    The line range = [str rangeOfString: @"cake" options:0 range:range); must be replaced by the following: range = [str rangeOfString: @"cake" options:0 range:range]; A parenthesis was used instead of a bracket. – necixy Jun 17 '11 at 11:32
71

A regex like the one below should do the job without an explicit loop...

Edited

NSString *string = @"Lots of cakes, with a piece of cake.";
NSError *error = nil;
// For an arbitrary search string, escape it first with +[NSRegularExpression escapedPatternForString:].
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"cake" options:NSRegularExpressionCaseInsensitive error:&error];
NSUInteger numberOfMatches = [regex numberOfMatchesInString:string options:0 range:NSMakeRange(0, [string length])];
// NSUInteger is not an int, so cast it and use %lu instead of %i.
NSLog(@"Found %lu", (unsigned long)numberOfMatches);

Only available on iOS 4.0 and later.

gwdp
  • 1
    [string length] should be [searchText length] I think? – Joris Mans Sep 21 '12 at 18:45
  • This is a great way to do this. – Keith Smiley Apr 09 '13 at 20:01
  • 2
    This should be the accepted answer. Honestly, this is much better than for-loops. Use a RegEx if you need more than just the first or last occurrence of a string. – auco Oct 05 '13 at 16:29
  • 6
    Even though this implementation is more compact, the accepted answer using a NSRange loop performs faster. In a quick test in a text document with 30 pages the loop search for a single word took 9ms while the regex implementation took 60ms. – Lars Blumberg Feb 25 '14 at 14:19
  • @LarsBlumberg loop search time will increase with each occurrence you have; the regex time will not. – gwdp Jan 26 '15 at 23:02
  • @LarsBlumberg could you please post a link to a text document – Aditya Aggarwal Apr 07 '15 at 06:53
  • @AdityaAggarwal just take any text document, such as a scientific paper and perform both search algorithms – Lars Blumberg Apr 07 '15 at 07:15
  • @gwdp even a regex engine works by iterating and backtracking internally. Not sure why you think that regex processing time won't increase with a larger subject text. Try any document in https://regex101.com/#pcre and you may see how the number of iterative steps increases with the amount of text. IMHO, regex is never a good choice for a performance-critical operation. – Ayan Sengupta Aug 17 '16 at 21:40
45

I was searching for a better method than mine, but here's another example:

NSString *find = @"cake";
NSString *text = @"Cheesecake, apple cake, and cherry pie";

NSInteger strCount = [text length] - [[text stringByReplacingOccurrencesOfString:find withString:@""] length];
strCount /= [find length];

I would like to know which one is more efficient.

I also wrapped it in an NSString category for convenience:

// NSString+CountString.m

@interface NSString (CountString)
- (NSInteger)countOccurencesOfString:(NSString*)searchString;
@end

@implementation NSString (CountString)
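- (NSInteger)countOccurencesOfString:(NSString*)searchString {
    // Guard against dividing by zero when the search string is empty.
    if ([searchString length] == 0) return 0;
    NSInteger strCount = [self length] - [[self stringByReplacingOccurrencesOfString:searchString withString:@""] length];
    return strCount / (NSInteger)[searchString length];
}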
@end

Simply call it with:

[text countOccurencesOfString:find];

Optional: you can make the search case-insensitive by calling stringByReplacingOccurrencesOfString:withString:options:range: with the NSCaseInsensitiveSearch option.

Rodrigo
24

There are a couple of ways you could do it. You could iteratively call rangeOfString:options:range:, or you could do something like:

NSArray * portions = [aString componentsSeparatedByString:@"cake"];
NSUInteger cakeCount = [portions count] - 1;

EDIT: I was thinking about this question again and wrote a linear-time algorithm to do the searching (linear in the length of the haystack string):

+ (NSUInteger) numberOfOccurrencesOfString:(NSString *)needle inString:(NSString *)haystack {
    const char * rawNeedle = [needle UTF8String];
    NSUInteger needleLength = strlen(rawNeedle);

    const char * rawHaystack = [haystack UTF8String];
    NSUInteger haystackLength = strlen(rawHaystack);

    NSUInteger needleCount = 0;
    NSUInteger needleIndex = 0;
    for (NSUInteger index = 0; index < haystackLength; ++index) {
        const char thisCharacter = rawHaystack[index];
        if (thisCharacter != rawNeedle[needleIndex]) {
            needleIndex = 0; //they don't match; reset the needle index
        }

        //resetting the needle might be the beginning of another match
        if (thisCharacter == rawNeedle[needleIndex]) {
            needleIndex++; //char match
            if (needleIndex >= needleLength) {
                needleCount++; //we completed finding the needle
                needleIndex = 0;
            }
        }
    }

    return needleCount;
}
Stunner
Dave DeLong
  • 3
    The componentsSeparatedByString solution causes quite a lot of unnecessary memory allocation. – Matthew Flaschen Jan 30 '10 at 04:28
  • 7
    @Matthew true, but it's a two-line solution. – Dave DeLong Jan 30 '10 at 04:36
  • `numberOfOccurrencesOfString:inString:` fails when the search string begins with the same chars as the needle, but then no longer matches while still being inside of a successful match. This is because needleIndex is always being reset to 0, when in reality it requires more complex logic. Take a simple example: `[self numberOfOccurrencesOfString:@"aab" inString:@"aaab"]` the return value is 0, when it should clearly be 1. – Senseful Jun 04 '14 at 21:02
  • See the [Boyer-Moore](http://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string_search_algorithm) and [Knuth-Morris-Pratt](http://en.wikipedia.org/wiki/Knuth%E2%80%93Morris%E2%80%93Pratt_algorithm) algorithms for all the intricacies involved in an efficient substring matching algorithm. – Senseful Jul 07 '14 at 17:48
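
As a rough illustration of the fix the comments above point to, here is a minimal Knuth-Morris-Pratt sketch in plain C (which also compiles inside an Objective-C file). Instead of resetting the needle index to 0 on a mismatch, it falls back through a precomputed table, so the `@"aab"` in `@"aaab"` case is found. The function name `kmp_count` is made up for this example, and it assumes both strings were converted to bytes the same way (for instance with UTF8String).

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts non-overlapping occurrences of needle in
   haystack using Knuth-Morris-Pratt. Works on bytes, not characters.
   Returns 0 for an empty needle or if the table cannot be allocated. */
size_t kmp_count(const char *haystack, const char *needle)
{
    size_t n = strlen(haystack);
    size_t m = strlen(needle);
    if (m == 0 || m > n) {
        return 0;
    }

    /* fail[i] = length of the longest proper prefix of needle[0..i]
       that is also a suffix of needle[0..i]. */
    size_t *fail = malloc(m * sizeof *fail);
    if (fail == NULL) {
        return 0;
    }
    fail[0] = 0;
    for (size_t i = 1, k = 0; i < m; i++) {
        while (k > 0 && needle[i] != needle[k]) {
            k = fail[k - 1];
        }
        if (needle[i] == needle[k]) {
            k++;
        }
        fail[i] = k;
    }

    size_t count = 0;
    size_t matched = 0; /* number of needle bytes currently matched */
    for (size_t i = 0; i < n; i++) {
        while (matched > 0 && haystack[i] != needle[matched]) {
            matched = fail[matched - 1]; /* fall back instead of resetting to 0 */
        }
        if (haystack[i] == needle[matched]) {
            matched++;
        }
        if (matched == m) {
            count++;
            matched = 0; /* start fresh so that matches do not overlap */
        }
    }

    free(fail);
    return count;
}
```

Each haystack byte is visited once and the fallback steps are bounded by the number of bytes matched so far, so the scan stays linear in the haystack length.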
12

A quicker to type, but probably less efficient solution.

- (int)numberOfOccurencesOfSubstring:(NSString *)substring inString:(NSString*)string
{
    NSArray *components = [string componentsSeparatedByString:substring];
    return (int)components.count - 1; // Two occurrences of the substring produce three components in the array.
}
Community
Dash
3

Here's another version as a category on NSString:

-(NSUInteger) countOccurrencesOfSubstring:(NSString *) substring {
    if ([self length] == 0 || [substring length] == 0)
        return 0;

    NSInteger result = -1;
    NSRange range = NSMakeRange(0, 0);
    do {
        ++result;
        range = NSMakeRange(range.location + range.length,
                            self.length - (range.location + range.length));
        range = [self rangeOfString:substring options:0 range:range];
    } while (range.location != NSNotFound);
    return result;
}
Senseful
paulmelnikow
3

Here is a version done as an extension to NSString (same idea as Matthew Flaschen's answer):

@interface NSString (my_substring_search)
- (unsigned) countOccurencesOf: (NSString *)subString;
@end
@implementation NSString (my_substring_search)
- (unsigned) countOccurencesOf: (NSString *)subString {
    unsigned count = 0;
    unsigned myLength = [self length];
    NSRange uncheckedRange = NSMakeRange(0, myLength);
    for(;;) {
        NSRange foundAtRange = [self rangeOfString:subString
                                           options:0
                                             range:uncheckedRange];
        if (foundAtRange.location == NSNotFound) return count;
        unsigned newLocation = NSMaxRange(foundAtRange); 
        uncheckedRange = NSMakeRange(newLocation, myLength-newLocation);
        count++;
    }
}
@end
<somewhere> {
    NSString *haystack = @"Cheesecake, apple cake, and cherry pie";
    NSString *needle = @"cake";
    unsigned count = [haystack countOccurencesOf: needle];
    NSLog(@"found %u time%@", count, count == 1 ? @"" : @"s");
}
Chris Johnsen
3

If you want to count words, not just substrings, then use CFStringTokenizer.
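
To make the word/substring distinction concrete, here is a small sketch in plain C. It is not CFStringTokenizer, which handles locale-aware word boundaries; it is a simplified stand-in that treats only ASCII letters and digits as word characters and is case-sensitive. The names `is_word_char` and `count_whole_words` are invented for this example.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Returns nonzero if c can be part of a word (ASCII letters and digits only). */
static int is_word_char(char c)
{
    return isalnum((unsigned char)c);
}

/* Hypothetical helper: counts occurrences of word in text that are not
   directly preceded or followed by a letter or digit. */
size_t count_whole_words(const char *text, const char *word)
{
    size_t word_len = strlen(word);
    size_t count = 0;

    if (word_len == 0) {
        return 0;
    }

    const char *cursor = text;
    while ((cursor = strstr(cursor, word)) != NULL) {
        int starts_word = (cursor == text) || !is_word_char(cursor[-1]);
        /* cursor[word_len] is at worst the terminating NUL, which is not a word character. */
        int ends_word = !is_word_char(cursor[word_len]);
        if (starts_word && ends_word) {
            count++;
            cursor += word_len; /* skip past the whole match */
        } else {
            cursor += 1;        /* not a whole word; try the next position */
        }
    }
    return count;
}
```

With the question's example text, this counts the standalone "cake" but not the one inside "Cheesecake".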

Peter Hosey
3

A Swift solution (written in Swift 1.2 syntax) would be:

var numberOfSubstringAppearance = 0
let length = count(text)
var range: Range? = Range(start: text.startIndex, end: advance(text.startIndex, length))

while range != nil {

    range = text.rangeOfString(substring, options: NSStringCompareOptions.allZeros, range: range, locale: nil)

    if let rangeUnwrapped = range {

        let remainingLength = length - distance(text.startIndex, rangeUnwrapped.endIndex)
        range = Range(start: rangeUnwrapped.endIndex, end: advance(rangeUnwrapped.endIndex, remainingLength))
        numberOfSubstringAppearance++
     }
}
riik
1

Matthew Flaschen's answer was a good start for me. Here is what I ended up using, in the form of a method. I took a slightly different approach to the loop. It has been tested with empty strings passed as stringToCount and text, and with stringToCount occurring as the first and/or last characters in text.

I use this method regularly to count paragraphs in the passed text (i.e. stringToCount = @"\r").

Hope this is of use to someone.

    // Declared as a class method (+) so that it can be called on the class itself, as in the example call.
    + (int)countString:(NSString *)stringToCount inText:(NSString *)text {
        int foundCount = 0;
        NSRange range = NSMakeRange(0, text.length);
        range = [text rangeOfString:stringToCount options:NSCaseInsensitiveSearch range:range locale:nil];
        while (range.location != NSNotFound) {
            foundCount++;
            // Continue searching in the part of the text after this match.
            range = NSMakeRange(range.location + range.length, text.length - (range.location + range.length));
            range = [text rangeOfString:stringToCount options:NSCaseInsensitiveSearch range:range locale:nil];
        }

        return foundCount;
    }

Example call assuming the method is in a class named myHelperClass...

int foundCount = [myHelperClass countString:@"n" inText:@"Now is the time for all good men to come to the aid of their country"];
user278859
0
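// Counts every position where search occurs in htmlsource1 (overlapping matches included).
NSUInteger count = 0;
// The guard avoids unsigned underflow when search is longer than htmlsource1.
if (search.length > 0 && search.length <= htmlsource1.length) {
    // Use <= so that a match at the very end of the string is checked too.
    for (NSUInteger i = 0; i <= htmlsource1.length - search.length; i++) {
        NSRange range = NSMakeRange(i, search.length);
        NSString *checker = [htmlsource1 substringWithRange:range];

        if ([search isEqualToString:checker]) {
            count++;
        }
    }
}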
Dave DeLong
-1
- (IBAction)search:(id)sender {
    NSString *mainStr = self.txtfmainStr.text;
    NSString *subStr = self.substr.text;
    int maincount = 0;

    // An empty substring has no first character to compare, so skip the search.
    if (subStr.length > 0) {
        for (NSUInteger i = 0; i + subStr.length <= mainStr.length; i++) {
            // Compare subStr against mainStr character by character, starting at i.
            // characterAtIndex: returns a unichar, so no narrowing to char is needed.
            NSUInteger j = 0;
            while (j < subStr.length &&
                   [mainStr characterAtIndex:i + j] == [subStr characterAtIndex:j]) {
                j++;
            }
            if (j == subStr.length) {
                maincount++; // every character matched
            }
        }
    }

    // Log the total once, after the whole string has been scanned.
    NSLog(@"%d", maincount);
}
h.kishan
-1

There is no built-in method. If you really need this to be fast, I'd suggest getting a C string and using a common C-string algorithm for substring counting.

If you want to stay in Objective-C, this link might help. It describes the basic substring search for NSString. If you work with the ranges, adjust and count, then you'll have a "pure" Objective-C solution... albeit a slow one.
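
A minimal sketch of what such a C-string approach could look like, using the standard strstr function. The function name `count_occurrences` is made up for illustration, and the sketch assumes both strings are NUL-terminated and in the same encoding (for example, obtained via UTF8String). It makes no claim about being faster than rangeOfString: in practice; the comments below discuss the conversion cost.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: counts non-overlapping occurrences of needle in
   haystack. An empty needle is reported as 0 matches. */
size_t count_occurrences(const char *haystack, const char *needle)
{
    size_t needle_len = strlen(needle);
    size_t count = 0;

    if (needle_len == 0) {
        return 0;
    }

    const char *cursor = haystack;
    while ((cursor = strstr(cursor, needle)) != NULL) {
        count++;
        cursor += needle_len; /* resume the search just past this match */
    }
    return count;
}
```

From Objective-C it could be called as `count_occurrences([text UTF8String], [find UTF8String])`.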

pestilence669
  • Doesn't calling e.g. NSString.UTF8String cause a new string to be allocated? It seems like it would be faster to use NSString's methods, such as rangeOfString. – Matthew Flaschen Jan 30 '10 at 04:25
  • Yes it does. Twice, if you decide to copy it for later use. Creating a c-string *once* and looking for k substrings is minimal in impact, compared to delving into NSString methods and allocating a substring after each hit. – pestilence669 Jan 30 '10 at 05:15
  • 2
    Not necessarily. If you start with an immutable string, substrings won't require allocations. As well and as Chris demonstrates, there is also no need to extract the substrings at all. Note also that converting a string to UTF8 can be extremely expensive if the string is, say, UTF-16. – bbum Jan 30 '10 at 05:35