I have a simple regex search-and-replace method. Everything works as expected, but when I was hammer testing yesterday, the string I entered had "????" in it. This caused the regex to fail with the following error:

error   NSError *   domain: @"NSCocoaErrorDomain" - code: 2048  0x0fd3e970

Upon further research, I believe it might be treating the question marks as a "trigraph". Chuck has a good explanation in this post: What does the \? (backslash question mark) escape sequence mean?

I tried to escape the sequence prior to creating the regex with this

string = [string stringByReplacingOccurrencesOfString:@"\?\?" withString:@"\?\\?"];

and it seems to stop the error, but the search and replace no longer works. Here is the method I am using:

- (NSString *)searchAndReplaceText:(NSString *)searchString withText:(NSString *)replacementString inString:(NSString *)text {

    NSRegularExpression *regex = [self regularExpressionWithString:searchString];
    NSRange range = [regex rangeOfFirstMatchInString:text options:0 range:NSMakeRange(0, text.length)];   

    NSString *newText = [regex stringByReplacingMatchesInString:text options:0 range:range withTemplate:replacementString];

    return newText;
}

- (NSRegularExpression *)regularExpressionWithString:(NSString *)string {

    NSError *error = NULL;
    NSString *pattern = [NSString stringWithFormat:@"\\b%@\\b", string];

    NSRegularExpression *regex = [NSRegularExpression regularExpressionWithPattern:pattern options:NSRegularExpressionCaseInsensitive error:&error];

    if (error)
        NSLog(@"Couldn't create regex with given string and options");

    return regex;
}

My questions are: is there a better way of escaping this sequence? Is this a case of trigraphs, or something else? Or is there a way in code to ignore trigraphs or turn them off?

Thanks

Ron Myschuk
  • Do you really want a regular expression search, or do you use it only to restrict the result to word boundaries? – Martin R Nov 28 '13 at 16:14
  • honestly I switched to using the regex for the word boundaries and because I wanted to limit the replace to only the first instance in the sentence, but it has ended up being nothing but a hassle. – Ron Myschuk Nov 28 '13 at 16:18
  • the gist of this portion of the app is to replace blocks of text with acronyms. "Hey, what's up? how are you?" matches(magically) "hey whats up" (with an associated acronym of "HWU") and would swap it out to become "HWU? how are you?" – Ron Myschuk Nov 28 '13 at 16:33
  • What is the `text` and what the `searchString` in your last comment? – Martin R Nov 28 '13 at 16:41
  • I've a more detailed example below but the text is "Hey, what's up? how are you?" and the search string would be "hey, what's up" – Ron Myschuk Nov 28 '13 at 16:49

1 Answer

My questions are; is there a better way of escaping this sequence?

Yes, you can properly escape any sequence of characters for a regular expression like this:

NSString* escapedExpression = [NSRegularExpression escapedPatternForString: aStringToEscapeCharactersIn];

EDIT

You don't have to run this on the whole expression. You can use NSString's stringWithFormat: to insert escaped strings into regular expressions that also contain pattern syntax, e.g.

pattern = [NSString stringWithFormat: @"^%@(.*)", [NSRegularExpression escapedPatternForString: @"????"]];

will give you the pattern ^\?\?\?\?(.*)
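Applied to the method in the question, the same idea might look like the sketch below. It is only an illustration under a few assumptions: `ReplaceFirstLiteralMatch` is a hypothetical helper name, not something from the original code; the `\b` anchors are swapped for lookarounds, because `\b` only matches between a word character and a non-word character and so would not match around a search string that starts or ends with "?"; and the replacement is inserted as plain text rather than as a regex template.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper (not from the original post): replaces only the first
// case-insensitive, whole-phrase occurrence of `search` in `text`.
static NSString *ReplaceFirstLiteralMatch(NSString *text, NSString *search, NSString *replacement) {
    // Escape regex metacharacters such as "?" so the search text is matched literally.
    NSString *escaped = [NSRegularExpression escapedPatternForString:search];

    // Assumption: lookarounds stand in for \b. They only require that the match
    // is not directly preceded or followed by a word character.
    NSString *pattern = [NSString stringWithFormat:@"(?<!\\w)%@(?!\\w)", escaped];

    NSError *error = nil;
    NSRegularExpression *regex =
        [NSRegularExpression regularExpressionWithPattern:pattern
                                                  options:NSRegularExpressionCaseInsensitive
                                                    error:&error];
    if (regex == nil) {
        // Check the return value rather than `error` to detect failure.
        NSLog(@"Couldn't create regex: %@", error);
        return text;
    }

    NSRange first = [regex rangeOfFirstMatchInString:text
                                             options:0
                                               range:NSMakeRange(0, text.length)];
    if (first.location == NSNotFound) {
        return text; // no match, so nothing to replace
    }

    // Replace just the matched range with the literal replacement text.
    return [text stringByReplacingCharactersInRange:first withString:replacement];
}
```

With the example from the comments, calling `ReplaceFirstLiteralMatch(@"Hey, what's up? how are you?", @"hey, what's up", @"HWU")` should return "HWU? how are you?", and a search string of "????" no longer produces an invalid pattern.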

JeremyP
  • Good to know, I wasn't aware of that method. – Martin R Nov 28 '13 at 16:52
  • Cool method... Technically you have answered my question. Unfortunately it has left other undesirable results, like not being able to match the range afterwards. I might look into scrapping regex in this instance and look into NSScanner as an alternative. Cheers – Ron Myschuk Nov 28 '13 at 17:36
  • @RonMyschuk Check out my amended answer to see if that fixes the problem. – JeremyP Nov 29 '13 at 10:57