1

I am a bit new to Objective-C and was wondering if there is a better way to count words in a string.

For example:

NSString *str = @"this is a string";

// return should be 4 words ..

The way I know how to do it is to split the string into an array of words on the space (' ') character and count the array.

Any advice will be appreciated! Thanks!! :)

EDIT: For those of you who came here looking for an answer: I found a similar post with an excellent reply.

How to count words within a text string?

Cœur
Unikorn
  • Why ask "without using regex"? Asking for the "most efficient way" would have been enough. – Cœur Sep 23 '15 at 03:55
  • possible duplicate of [How to count words within a text string?](http://stackoverflow.com/questions/2266434/how-to-count-words-within-a-text-string) – Cœur Sep 23 '15 at 03:57

7 Answers

6

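There are two ways that don't involve collecting an array of words, and should be smarter than just breaking on spaces:

  • NSString's enumerateSubstringsInRange:options:usingBlock: with the NSStringEnumerationByWords option
  • CFStringTokenizer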

I would use one of these, even if I did want to collect or otherwise use the words.

Peter Hosey
  • Thanks Peter! I didn't know there was a CFStringTokenizer; that is good to know. The reason I didn't want to use NSRegularExpression was its iOS 4-only limitation. – Unikorn Oct 20 '10 at 07:18
  • This answer is the best one. Check *WWDC 2012 Session 215: Text and Linguistic Analysis by Douglas Davidson*. A bit of example code: `__block int wordCount =0; NSRange range = {0,self.text.length }; [self.text enumerateSubstringsInRange:range options:NSStringEnumerationByWords usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) { wordCount++; }];` – José Manuel Sánchez Feb 01 '13 at 11:19
5

Are you sure you have a bottleneck in that part of code? If not (which is quite probable), then splitting on spaces seems perfectly acceptable to me. You could create a C string and count the spaces instead, but a lot of times such an “optimized” version is actually slower than the original one. That is, assuming that your current code looks like this:

NSUInteger wordCount = [[someString componentsSeparatedByString:@" "] count];

This is not exactly correct (see @"___" where underscore is a space), but maybe you really use a regex and split on \s+?
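To make that caveat concrete, here is a minimal C sketch (plain C also compiles inside an Objective-C file) of what counting single-space separators amounts to. The function name is hypothetical, and the sketch assumes the usual split-on-separator behaviour in which n separators always yield n + 1 components, some of them possibly empty.

```c
#include <stddef.h>

/* Hypothetical helper: number of components you get when splitting on
   every single space. n spaces always give n + 1 components, and runs of
   spaces contribute empty components that are not really words. */
size_t count_space_separated_components(const char *s)
{
    size_t components = 1;

    for (; *s != '\0'; s++) {
        if (*s == ' ') {
            components++;   /* each space starts a new (maybe empty) component */
        }
    }
    return components;
}
```

For "this is a string" this gives the expected 4, but for a string of three spaces it also gives 4, and "a  b" (two spaces) gives 3, which is why the naive split over-counts.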

zoul
5

Unless you're going to be doing it hundreds of times a second, I would just opt for the readable solution, something like the following pseudocode:

def count (str):
    lastchar = " "
    count = 0
    for char as every character in string:
        if char is not whitespace and lastchar is whitespace:
            count = count + 1
        lastchar = char
    return count

It seems a bit of a waste to create a whole array of other strings just so you can count them and throw them away.

And if, for some reason, it becomes an issue, you can just replace the function body with a faster version. Make sure it is a problem first however. Optimisation of code that's fast enough already is wasted effort.
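The pseudocode above could be sketched in C roughly as follows. This is only one possible rendering: the name count_words is made up for illustration, "whitespace" is taken to mean whatever isspace accepts, and the input is assumed to be a NUL-terminated C string (for an NSString, something like the result of UTF8String).

```c
#include <ctype.h>
#include <stddef.h>

/* Counts words in a NUL-terminated C string, where a "word" is a maximal
   run of non-whitespace bytes (whitespace as defined by isspace). */
size_t count_words(const char *s)
{
    size_t count = 0;
    int last_was_space = 1;   /* treat the start as if preceded by whitespace */

    for (; *s != '\0'; s++) {
        /* Cast to unsigned char: passing a negative char to isspace is undefined. */
        int is_space = isspace((unsigned char)*s) != 0;

        if (!is_space && last_was_space) {
            count++;          /* whitespace -> non-whitespace marks a new word */
        }
        last_was_space = is_space;
    }
    return count;
}
```

With the question's sample, count_words("this is a string") gives 4, and leading, trailing, or repeated whitespace does not change the result. Note that this byte-wise check does not recognise non-ASCII whitespace characters, so it is a sketch for simple input rather than a general solution.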

paxdiablo
  • Thanks for the detailed reply! I think I will grab the char pointer and count the chars. – Unikorn Oct 20 '10 at 07:16
  • Not only is this easy to understand, it's probably far more efficient than using REs or a generalized scanner. The only tricky part is defining "whitespace" and coming up with a good algorithm to check for it, if you want a generalized solution (vs just checking for blank). – Hot Licks Feb 20 '14 at 18:39
5

In this situation, I'd use an NSScanner like so:

NSString *str = @"this is a string";
NSScanner *scanner = [NSScanner scannerWithString:str];
NSCharacterSet *whiteSpace = [NSCharacterSet whitespaceAndNewlineCharacterSet];
NSCharacterSet *nonWhitespace = [whiteSpace invertedSet];
int wordcount = 0;

while (![scanner isAtEnd])
{
    // Skip any leading whitespace, then consume one word.
    [scanner scanUpToCharactersFromSet:nonWhitespace intoString:nil];
    [scanner scanUpToCharactersFromSet:whiteSpace intoString:nil];
    wordcount++;
}

This only creates two additional objects, no matter how long the string is.

paxdiablo
NSResponder
1

This code will count the number of words (i.e., non-empty substrings) contained in a string that are separated by any number of space or line break characters:

NSUInteger wordCount = 0;

NSCharacterSet *separators = [NSCharacterSet characterSetWithCharactersInString:@" \n"];

for (NSString *word in [someString componentsSeparatedByCharactersInSet:separators]) {
    // Consecutive separators produce empty components; skip them.
    if (word.length > 0) {
        wordCount++;
    }
}

It's a slight improvement on zoul's answer without resorting to regexes.

Thomas C. G. de Vilhena
1

To split the string into an array of words:

NSArray *yourArray = [str componentsSeparatedByString:@" "];

Update:

To count the number of words, you can use:

[yourArray count]
abatishchev
Gyani
0

An accurate one-liner (assuming `self` is the NSString, for instance inside a category method):

return [[self componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"length > 0"]].count;
Cœur