3

I have several NSStrings with a format similar to the one below:

"Hello, how       are   you?"

How can I break the string into an array of words? For example, for the above sentence I would expect an array consisting of "Hello,", "how", "are", "you?"

Usually I would split a string into words with -[NSString componentsSeparatedByCharactersInSet:], passing an NSCharacterSet of separator characters.

However, this won't work in this situation because the gaps between the words vary in length. Note that I don't know in advance the length of each word or of the gaps between them.

How can I accomplish this? I am working on an app for OS X, not iOS.

EDIT: My eventual goal is to retrieve the second word in the sentence. If there is an easier way to do this without breaking the string into an array, please feel free to suggest it.

Hot Licks
fdh
  • possible duplicate of [NSString tokenize in Objective-C](http://stackoverflow.com/questions/259956/nsstring-tokenize-in-objective-c) – SirDarius Sep 05 '12 at 21:07
  • @SirDarius -- That doesn't appear to be a duplicate at all, since most of the answers suggest using componentsSeparatedByString/CharactersInSet. – Hot Licks Sep 05 '12 at 21:13
  • Tokenization is the process of breaking a stream of text up into words, phrases, symbols, or other meaningful elements called tokens. The question here is therefore equivalent to asking how to tokenize an NSString. – SirDarius Sep 05 '12 at 21:19
  • @SirDarius - Except that the answer given in the other thread is not appropriate in this case. – Hot Licks Sep 05 '12 at 21:21

3 Answers

5

Try this:

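NSMutableArray *parts = [NSMutableArray arrayWithArray:[str componentsSeparatedByCharactersInSet:[NSCharacterSet whitespaceCharacterSet]]];
// Runs of spaces produce empty strings. removeObject: compares with isEqual:,
// whereas removeObjectIdenticalTo: compares pointers and can miss them.
[parts removeObject:@""];
NSString *res = [parts count] > 1 ? [parts objectAtIndex:1] : nil; // The second word, or nil if there is none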
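
The same approach can be wrapped in a small reusable helper. This is only a sketch: the name `WordsInString` is made up for illustration, and it uses `whitespaceAndNewlineCharacterSet` on the assumption that tabs and line breaks should count as separators too.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not part of the original answer): splits on whitespace
// and newlines, then filters out the empty strings that appear between
// consecutive separator characters.
static NSArray *WordsInString(NSString *str) {
    NSCharacterSet *separators = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSArray *pieces = [str componentsSeparatedByCharactersInSet:separators];
    NSPredicate *nonEmpty = [NSPredicate predicateWithFormat:@"length > 0"];
    return [pieces filteredArrayUsingPredicate:nonEmpty];
}
```

For the sentence in the question, `WordsInString(@"Hello, how       are   you?")` should give the four words "Hello,", "how", "are", "you?", so the second word is the element at index 1. Check the array's count before indexing, since a string with fewer than two words would otherwise raise an exception.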
Sergey Kalinichenko
  • I was not able to get this to work as expected using `[parts removeObjectIdenticalTo:@""];` but when I replaced that with `[words removeObject:@""];` things worked. – mah Mar 08 '17 at 22:36
1

Well, you could write a loop that iterates through the characters to find the first non-blank after the first blank, then iterates further to find the ending blank (or the end of the line). It would probably be about 5x faster (with far fewer object allocations) than the other methods, and could be done in about 10 lines.
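
One way such a loop might look is sketched below. The function name `SecondWordInString` is hypothetical, and two details are assumptions beyond the description above: leading blanks are skipped before the first word, and `nil` is returned when there is no second word.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the second whitespace-delimited word of str,
// or nil if str contains fewer than two words.
static NSString *SecondWordInString(NSString *str) {
    NSCharacterSet *blanks = [NSCharacterSet whitespaceAndNewlineCharacterSet];
    NSUInteger length = [str length];
    NSUInteger i = 0;

    // Skip any leading blanks, then skip over the first word.
    while (i < length && [blanks characterIsMember:[str characterAtIndex:i]]) i++;
    while (i < length && ![blanks characterIsMember:[str characterAtIndex:i]]) i++;
    // Skip the blanks separating the first word from the second.
    while (i < length && [blanks characterIsMember:[str characterAtIndex:i]]) i++;
    if (i == length) return nil;

    // Advance to the blank (or end of string) that ends the second word.
    NSUInteger start = i;
    while (i < length && ![blanks characterIsMember:[str characterAtIndex:i]]) i++;
    return [str substringWithRange:NSMakeRange(start, i - start)];
}
```

The only object this sketch allocates per call is the returned substring. No timing was measured for it here.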

Hot Licks
1

If you don't want to use an NSCharacterSet, try this to remove the extra spaces:

NSString *string = @"word1, word2          word3                        word4";
BOOL done = NO;
do {
    // Replace each pair of spaces with a single space; repeat until nothing changes.
    NSString *tempStr = [string stringByReplacingOccurrencesOfString:@"  " withString:@" "];
    done = [string isEqualToString:tempStr];
    string = tempStr;
} while (!done);
NSLog(@"%@", string);

This will output "word1, word2 word3 word4".
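
To get from the collapsed string to the second word, which is the goal stated in the question, the result can then be split on single spaces. The following is a rough sketch with a made-up function name, `SecondWordAfterCollapsing`. It assumes the words are separated by spaces only, since the replacement loop does not touch tabs or newlines.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: collapses runs of spaces, then splits on single spaces
// and returns the second word, or nil if there are fewer than two words.
static NSString *SecondWordAfterCollapsing(NSString *string) {
    NSString *previous;
    do {
        previous = string;
        string = [string stringByReplacingOccurrencesOfString:@"  " withString:@" "];
    } while (![string isEqualToString:previous]);

    // Trim a leading or trailing space so the split yields no empty strings.
    string = [string stringByTrimmingCharactersInSet:[NSCharacterSet whitespaceCharacterSet]];
    NSArray *words = [string componentsSeparatedByString:@" "];
    return [words count] > 1 ? [words objectAtIndex:1] : nil;
}
```

With the sample string above, this should return "word2".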

Kyle