2

On iOS, how can I count words within a specific text string?

Cœur
Nqeuew

4 Answers

7

A more efficient method than splitting is to check the string character by character.

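#import <Foundation/Foundation.h>

// Counts runs of alphanumeric characters without allocating any substrings.
int word_count(NSString* s) {
  CFCharacterSetRef alpha = CFCharacterSetGetPredefined(kCFCharacterSetAlphaNumeric);
  CFStringInlineBuffer buf;
  // __bridge is required under ARC and is a no-op without it.
  CFStringRef str = (__bridge CFStringRef)s;
  CFIndex len = CFStringGetLength(str);
  CFStringInitInlineBuffer(str, &buf, CFRangeMake(0, len));
  int count = 0;
  Boolean was_alpha = false;
  // Loop by index so an embedded NUL character does not end the scan early.
  for (CFIndex i = 0; i < len; ++i) {
    UniChar c = CFStringGetCharacterFromInlineBuffer(&buf, i);
    Boolean is_alpha = CFCharacterSetIsCharacterMember(alpha, c);
    // A word ends where an alphanumeric run stops.
    if (!is_alpha && was_alpha)
      ++count;
    was_alpha = is_alpha;
  }
  // Count a word that runs up to the end of the string.
  if (was_alpha)
    ++count;
  return count;
}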

Compared with @ennuikiller's solution, counting a 1,000,000-word string takes:

  • 0.19 seconds to build the string
  • 0.39 seconds to build the string + counting using my method.
  • 1.34 seconds to build the string + counting using ennuikiller's method.

The big disadvantage of my method is that it's not a one-liner.

Community
kennytm
  • not a one-liner is a bit of an understatement!! :) The op didn't ask for the most efficient solution. I would suspect that most methods in the NSString class can be coded more efficiently. I guess the determining factor would be how large the "text string" is. – ennuikiller Feb 16 '10 at 13:25
  • Thanks Kenny! I was just asking a similar question and your answer is excellent! +1 ...i am gonna quietly borrow that code of yours. – Unikorn Oct 20 '10 at 07:26
  • The solution is (slightly) broken. Not all characters fit in one unichar. – JeremyP Oct 20 '10 at 09:03
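To make JeremyP's point concrete: a UniChar is a 16-bit UTF-16 code unit, so a character outside the Basic Multilingual Plane (for example U+1D400, MATHEMATICAL BOLD CAPITAL A) arrives as two units, 0xD835 followed by 0xDC00. Testing each unit on its own never sees the real character. Below is a minimal sketch in plain C (which also compiles inside an Objective-C file) of how a pair could be combined before classification. The name next_code_point and its signature are hypothetical, made up for this illustration; they are not part of CoreFoundation.

```c
#include <stdint.h>
#include <stddef.h>

static int is_high_surrogate(uint16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
static int is_low_surrogate(uint16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

/* Hypothetical helper: reads one code point starting at units[*i]
   and advances *i past it. The caller must ensure *i < len.
   An unpaired surrogate is returned unchanged. */
uint32_t next_code_point(const uint16_t *units, size_t len, size_t *i) {
    uint16_t hi = units[(*i)++];
    if (is_high_surrogate(hi) && *i < len && is_low_surrogate(units[*i])) {
        uint16_t lo = units[(*i)++];
        /* Standard UTF-16 decoding of a surrogate pair. */
        return 0x10000u + (((uint32_t)hi - 0xD800u) << 10)
                        + ((uint32_t)lo - 0xDC00u);
    }
    return hi;
}
```

In the CoreFoundation version, the decoding step could probably be delegated to CFStringGetLongCharacterForSurrogatePair and the membership test to CFCharacterSetIsLongCharacterMember, which takes a 32-bit character. I have not timed that variant.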
4
 [[stringToCount componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]] count]
ennuikiller
  • 2
    `[[stringToCount componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet] count]`. – erikprice Feb 15 '10 at 14:25
  • Fixed again. [[stringToCOunt componentsSeparatedByCharactersInSet: [NSCharacterSet whitespaceCharacterSet]] count] I was looking through documents by searching "counting words" or something like that, but I couldn't find a good way. This solution seems to be OK to me. Thank you all. (You guys are really quick!) – Nqeuew Feb 15 '10 at 14:50
  • 1
    This is not the most efficient way to count words, especially memory-wise, since it splits the whole string into a temporary array that is then discarded. It is much better to simply look at runs of whitespace and punctuation in the text. That can't be done in one line, but it will be much faster and will not use at least double the memory of the text. – Stefan Arentz Feb 15 '10 at 15:13
  • Not accurate solution as it does not handle consecutive spaces correctly. – Cœur Sep 23 '15 at 07:23
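The two objections above (the consecutive-spaces miscount and the suggestion to look at runs instead) can be illustrated with a small sketch. It is plain C on ASCII input only, so it compiles inside an Objective-C file but ignores Unicode entirely. Both function names are hypothetical. count_by_split only imitates the arithmetic of a separator-based split (n separators give n + 1 pieces, empty ones included); it does not call NSString.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical: number of pieces a split on every single space
   would return, empty pieces included. */
size_t count_by_split(const char *s) {
    size_t pieces = 1;               /* n separators give n + 1 pieces */
    for (; *s != '\0'; ++s) {
        if (*s == ' ')
            ++pieces;
    }
    return pieces;
}

/* Hypothetical: number of runs of non-whitespace characters. */
size_t count_by_runs(const char *s) {
    size_t words = 0;
    int in_word = 0;
    for (; *s != '\0'; ++s) {
        int is_space = isspace((unsigned char)*s) != 0;
        if (!is_space && !in_word)
            ++words;                 /* a new run starts here */
        in_word = !is_space;
    }
    return words;
}
```

For "one  two" (two spaces), count_by_split gives 3 while count_by_runs gives 2. For an empty string the split count is 1 and the run count is 0. The run-based version also needs no temporary array.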
2

I think this method is better:

__block int wordCount = 0;
NSRange range = {0,self.text.length };
[self.text enumerateSubstringsInRange:range options:NSStringEnumerationByWords usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop) {
    wordCount++;
}];

For reference, see the video of WWDC 2012 session 215, Text and Linguistic Analysis, by Douglas Davidson.

José Manuel Sánchez
1

An accurate one-liner:

return [[self componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceAndNewlineCharacterSet]] filteredArrayUsingPredicate:[NSPredicate predicateWithFormat:@"length > 0"]].count;

This solution handles consecutive spaces correctly.

Cœur