My goal is to use stringByReplacingOccurrencesOfString to replace occurrences of words or phrases with replacements. The words and their replacements come from a dictionary in which the words or phrases are the keys and the values are their replacements:

{"is fun" : "foo",
 "funny" : "bar"}

Because stringByReplacingOccurrencesOfString is literal and disregards "words" in the conventional Western-language sense, I am running into trouble with the following sentence:

"He is funny and is fun",

Here the phrase "is fun" is detected twice: first as part of "is funny", and second as the standalone "is fun". The literal match is used for replacement without recognizing that the first occurrence is actually part of another word.

I was wondering if there is a way to use stringByReplacingOccurrencesOfString that takes word boundaries into consideration, so that a phrase like "is funny" is treated as a whole and "is fun" is not detected inside it.

By the way, this is the code I am using for replacement when iterating across all the keys in the dictionary:

NSString *newText = [wholeSentence stringByReplacingOccurrencesOfString:wordKey withString:wordValue options:NSLiteralSearch range:NSMakeRange(0, wholeSentence.length)];
        iteratedTranslatedText = newText;

Edit 1: Using the suggested solutions, this is what I have done:

NSString *string = @"Harry is fun. Shilp is his fun pet dog";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"\bis fun\b" options:0 error:nil];
if (regex != nil) {
    NSTextCheckingResult *firstMatch = [regex firstMatchInString:string options:0 range:NSMakeRange(0, string.length)];
    //firstMatch is returning null
    if (firstMatch) {
        NSRange resultRange = [firstMatch rangeAtIndex:0];
        NSLog(@"first match at index:%lu", (unsigned long)resultRange.location);

    }
}

However, this returns firstMatch as null. According to the regex tutorial on word boundaries, this is how to anchor a word or phrase, so I am unsure why it's not returning anything. Help is appreciated!

daspianist
  • For this you need to venture into the territory of *regular expressions* (`NSRegularExpression`) – borrrden Oct 15 '14 at 03:46
  • Thanks for the tip @borrrden. Would you say that the answer recommended here is the suggested way to go about it? http://stackoverflow.com/questions/9661690/user-regular-expression-to-find-replace-substring-in-nsstring – daspianist Oct 15 '14 at 03:50
  • You would do better to use `NSScanner` for this, especially if you have a lot of replacements. Looping over the string once for each replaceable will be time-consuming. With a scanner, you only go through the string once. See e.g. http://stackoverflow.com/a/21100435/ or https://github.com/woolsweater/NSString-WSSHTMLEntityConversion (compare the performance of that code with the repo linked in the header to see what I'm talking about). – jscs Oct 15 '14 at 03:50
  • Yes, that answer shows it pretty well. – borrrden Oct 15 '14 at 03:53
  • @JoshCaswell this is great advice about `NSScanner`, and it's something that I haven't heard about before. Thanks for pointing it out! – daspianist Oct 15 '14 at 03:59
  • Oh, actually, looking at that GitHub code for the first time in quite a while, I think the major performance benefit may have been using `NSMutableString` instead of `NSString`. But it does have some `NSScanner` code for demonstration. – jscs Oct 15 '14 at 04:05

1 Answer

As suggested in the comments, you can use NSRegularExpression in your project. For example:

NSString *string = @"He is funny and is fun";
NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:@"is fun([^a-zA-Z]+|$)" options:0 error:nil];
if (regex != nil) {
    NSTextCheckingResult *firstMatch = [regex firstMatchInString:string options:0 range:NSMakeRange(0, string.length)];
    if (firstMatch) {
        NSRange resultRange = [firstMatch rangeAtIndex:0];
        // location is an NSUInteger, so log it with %lu and an explicit cast
        NSLog(@"first match at index:%lu", (unsigned long)resultRange.location);
    }
}

This prints: first match at index:16
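
To apply the whole dictionary from the question rather than a single pattern, something along these lines might work. This is only a sketch: `ReplaceWholePhrases` is a hypothetical helper name, and it assumes every key starts and ends with a word character so that `\b` (written `\\b` inside an Objective-C string literal) can anchor it. Each key is escaped with `escapedPatternForString:` so that any regex metacharacters in it are matched literally.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): replaces each key of
// `replacements` with its value, but only where the key stands alone as a
// whole word or phrase.
static NSString *ReplaceWholePhrases(NSString *text,
                                     NSDictionary<NSString *, NSString *> *replacements) {
    // Process longer keys first so the replacement order is deterministic.
    NSArray<NSString *> *keys = [replacements.allKeys sortedArrayUsingComparator:
        ^NSComparisonResult(NSString *a, NSString *b) {
            if (a.length > b.length) return NSOrderedAscending;
            if (a.length < b.length) return NSOrderedDescending;
            return [a compare:b];
        }];

    NSString *result = text;
    for (NSString *key in keys) {
        // Anchor the escaped key between word boundaries.
        NSString *pattern = [NSString stringWithFormat:@"\\b%@\\b",
                             [NSRegularExpression escapedPatternForString:key]];
        NSError *error = nil;
        NSRegularExpression *regex =
            [NSRegularExpression regularExpressionWithPattern:pattern
                                                      options:0
                                                        error:&error];
        if (regex == nil) {
            NSLog(@"Skipping key \"%@\": %@", key, error);
            continue;
        }
        // Escape the value so "$" and "\" in it are not treated as template syntax.
        NSString *replacementTemplate =
            [NSRegularExpression escapedTemplateForString:replacements[key]];
        result = [regex stringByReplacingMatchesInString:result
                                                 options:0
                                                   range:NSMakeRange(0, result.length)
                                            withTemplate:replacementTemplate];
    }
    return result;
}
```

Called as `ReplaceWholePhrases(@"He is funny and is fun", @{ @"is fun" : @"foo", @"funny" : @"bar" })`, this should return "He is bar and foo". Note that it still makes one pass over the string per key, and a replacement value that happens to contain a later key would itself be replaced.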

Pandara
  • This would also match "his fun pet dog" etc. A better regex would probably be `/bis fun/b` – borrrden Oct 15 '14 at 05:59
  • Thanks for chiming in. @borrrden I tried the suggestion of just using `/bis fun/b` as the pattern but am not getting any range matches using the code Pandara provided. Should I be using `/bis fun/b([^a-zA-Z]+|$)`? – daspianist Oct 15 '14 at 16:58
  • Thanks for answering @Pandara. This solution is very close. Per the comment, `is fun([^a-zA-Z]+|$)` doesn't work for sentences like "his fun pet", but when I tried `/bis fun/b`, `firstMatch` returned nil. How should I modify the regex to improve the accuracy? Thanks! – daspianist Oct 15 '14 at 17:10
  • FYI I realized that the "anchor" word is `\b` according to http://www.regular-expressions.info/wordboundaries.html but unfortunately it still did not work. – daspianist Oct 15 '14 at 18:31
  • Oh man, I typed forward slash instead of backslash. Be aware that you need to escape it when you use it in Xcode (i.e. type `\\b` instead of `\b`) – borrrden Oct 16 '14 at 00:44