9

I've created a string tokenizer like this:

stringTokenizer = CFStringTokenizerCreate(
                         NULL
                         , (CFStringRef)str
                         , CFRangeMake(0, [str length])
                         , kCFStringTokenizerUnitSentence
                         , userLocale);

But how do I now obtain those sentences from the tokenizer? The CF String Programming Guide doesn't mention CFStringTokenizer or tokens (I did a full-text search of the PDF).

stefanB
openfrog

2 Answers

18

Here is an example of CFStringTokenizer usage:

CFStringRef string; // Get string from somewhere
CFLocaleRef locale = CFLocaleCopyCurrent();

CFStringTokenizerRef tokenizer = 
    CFStringTokenizerCreate(
        kCFAllocatorDefault
        , string
        , CFRangeMake(0, CFStringGetLength(string))
        , kCFStringTokenizerUnitSentence
        , locale);

CFStringTokenizerTokenType tokenType = kCFStringTokenizerTokenNone;
unsigned tokensFound = 0;

while(kCFStringTokenizerTokenNone !=
    (tokenType = CFStringTokenizerAdvanceToNextToken(tokenizer))) {
    CFRange tokenRange = CFStringTokenizerGetCurrentTokenRange(tokenizer);
    CFStringRef tokenValue =
        CFStringCreateWithSubstring(
            kCFAllocatorDefault
            , string
            , tokenRange);

    // Do something with the token
    CFShow(tokenValue);
    CFRelease(tokenValue);
    ++tokensFound;
}

// Clean up
CFRelease(tokenizer);
CFRelease(locale);
stefanB
sbooth
    +1. It's also possible to do the same using the higher-level `-[NSString enumerateSubstringsInRange:options:usingBlock:]` with the option `NSStringEnumerationBySentences`, though the Core Foundation solution is a little more powerful. For example, you can specify any locale using `CFStringTokenizerCreate`, whereas `enumerateSubstringsInRange:options:usingBlock:` uses the current user locale. – David Snabel-Caunt Sep 20 '13 at 18:13
0

You may also use:

    [mutstri enumerateSubstringsInRange:NSMakeRange(0, [mutstri length])
                                options:NSStringEnumerationBySentences
                             usingBlock:^(NSString *substring, NSRange substringRange, NSRange enclosingRange, BOOL *stop){

                                 NSLog(@"%@", substring);

                             }];
mirap