8

What's a simple implementation of the following NSString category method that returns the number of words in self, where words are separated by any number of consecutive spaces or newline characters? Also, the string will be less than 140 characters, so in this case, I prefer simplicity & readability at the sacrifice of a bit of performance.

@interface NSString (Additions)
- (NSUInteger)wordCount;
@end

I found the following solutions:

But, isn't there a simpler way?

ma11hew28
  • I don't see how it's possible to do better than a linear search here. Depending on the implementation of scanUpToCharactersFromSet, this might fare better than O(n) in most cases. – tjarratt May 30 '11 at 00:58
  • @tjarratt: I think the OP wants the "simplest" method, not necessarily the fastest. – Aidan Steele May 30 '11 at 01:01
  • How about enumerating by word and counting by using NSStringEnumerationByWords in a string enumeration? – Alex Zavatone Oct 30 '14 at 20:44
  • possible duplicate of [How to count words within a text string?](http://stackoverflow.com/questions/2266434/how-to-count-words-within-a-text-string) – Cœur Sep 23 '15 at 03:57

7 Answers

16

Why not just do the following?

- (NSUInteger)wordCount {
    NSCharacterSet *separators = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSArray *words = [self componentsSeparatedByCharactersInSet:separators];

    NSIndexSet *separatorIndexes = [words indexesOfObjectsPassingTest:^BOOL(id obj, NSUInteger idx, BOOL *stop) {
        return [obj isEqualToString:@""];
    }];

    return [words count] - [separatorIndexes count];
}
Aidan Steele
  • Thanks! That seems exactly correct & simple. I wonder if it's efficient. Good enough in my case though as I'm building an iOS app, not an operating system. :) I like it! – ma11hew28 May 30 '11 at 01:03
  • Hmm.. I don't think this is exactly correct. According to the Xcode documentation: "Adjacent occurrences of the separator characters produce empty strings in the result. Similarly, if the string begins or ends with separator characters, the first or last substring, respectively, is empty." I do not want to count empty strings as words. E.g., the method should return 1 for `@" hello "`, not 3. – ma11hew28 May 30 '11 at 01:16
  • Nice! It works! I also confirmed the correctness of another solution I found on the net, and it seems about twice as fast as your implementation and still fairly simple. So, [I posted it as an answer](http://stackoverflow.com/questions/6171422/objective-c-nsstring-wordcount/6171849#6171849). – ma11hew28 May 30 '11 at 02:51
11

I believe you have identified the 'simplest'. Nevertheless, to answer your original question ("a simple implementation of the following NSString category...") and to have it posted directly here for posterity:

@implementation NSString (GSBString)

- (NSUInteger)wordCount
{
    // Use NSUInteger so the counter matches the method's return type.
    __block NSUInteger words = 0;
    [self enumerateSubstringsInRange:NSMakeRange(0,self.length)
                             options:NSStringEnumerationByWords
                          usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {words++;}];
    return words;
}

@end
tiritea
9

There are a number of simpler implementations, but they all have tradeoffs. For example, Cocoa (but not Cocoa Touch) has word-counting baked in:

- (NSUInteger)wordCount {
    return [[NSSpellChecker sharedSpellChecker] countWordsInString:self language:nil];
}

It's also trivial to count words using `[[self componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] count]`, though that only matches the scanner's count once the empty strings produced by adjacent, leading, or trailing separators are filtered out. I've also found the performance of that method degrades a lot for longer strings.

So it depends on the tradeoffs you want to make. I've found the absolute fastest is just to go straight-up ICU. If you want simplest, using existing code is probably simpler than writing any code at all.
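
The scanner-based code referred to above is not reproduced in this thread. As a minimal sketch, assuming it is built on NSScanner's `scanUpToCharactersFromSet:intoString:` (the method named in the comments on the question), it might look like the following. The category name `ScannerWordCount` and the method name `scannerWordCount` are hypothetical, chosen only for illustration.

```objc
#import <Foundation/Foundation.h>

// Hypothetical category and method names; the linked scanner code is not shown in the thread.
@interface NSString (ScannerWordCount)
- (NSUInteger)scannerWordCount;
@end

@implementation NSString (ScannerWordCount)

- (NSUInteger)scannerWordCount {
    NSCharacterSet *separators = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSScanner *scanner = [NSScanner scannerWithString:self];
    NSUInteger count = 0;

    // By default NSScanner skips whitespace and newlines before each scan,
    // so every successful scan consumes exactly one run of non-separator characters.
    while ([scanner scanUpToCharactersFromSet:separators intoString:NULL]) {
        count++;
    }
    return count;
}

@end
```

With this sketch, `[@" hello " scannerWordCount]` evaluates to 1, because leading, trailing, and repeated separators never produce a successful scan.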

Chuck
4
- (NSUInteger) wordCount
{
   NSArray *words = [self componentsSeparatedByString:@" "];
   return [words count];
}
Andres C
  • This over counts if you have runs of spaces or newlines. – Obliquely Dec 17 '12 at 19:43
  • Wrong count for multiple spaces or multiple newlines `@"\n\n\n"`. Please see above correct solution: http://stackoverflow.com/a/6171439/1033581 – Cœur Sep 23 '15 at 01:41
1

An Objective-C one-liner version:

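// Split on whitespace and newlines, then drop the empty strings left by runs of separators.
NSInteger wordCount = [[word componentsSeparatedByCharactersInSet:NSCharacterSet.whitespaceAndNewlineCharacterSet] filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"length > 0"]].count;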
baguIO
1

Looks like the second link I gave in my question still reigns as not only the fastest but also, in hindsight, a relatively simple implementation of -[NSString wordCount].

ma11hew28
0

Swift 3:

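import Foundation
// Split on any whitespace or newline, then drop the empty strings left by runs of separators.
let words = string.components(separatedBy: .whitespacesAndNewlines).filter { !$0.isEmpty }
let count = words.count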
Josh O'Connor