I am using the following extension method to get NSRange array of a substring:

extension String {
  func nsRangesOfString(findStr:String) -> [NSRange] {
    let ranges: [NSRange]
    do {
      // Create the regular expression.
      let regex = try NSRegularExpression(pattern: findStr, options: [])

      // Use the regular expression to get an array of NSTextCheckingResult.
      // Use map to extract the range from each result.
      ranges = regex.matches(in: self, options: [], range: NSMakeRange(0, self.characters.count)).map {$0.range}
    }
    catch {
      // There was a problem creating the regular expression
      ranges = []
    }
    return ranges
  }
}

However, I can't figure out why it sometimes doesn't work. Here are two similar cases; one works and the other doesn't.

This one works:

self(String):

"וצפן (קרי: יִצְפֹּ֣ן) לַ֭יְשָׁרִים תּוּשִׁיָּ֑ה מָ֝גֵ֗ן לְהֹ֣לְכֵי תֹֽם׃"

findStr:

"קרי:"

And this one doesn't:

self(String):

"לִ֭נְצֹר אָרְח֣וֹת מִשְׁפָּ֑ט וְדֶ֖רֶךְ חסידו (קרי: חֲסִידָ֣יו) יִשְׁמֹֽר׃"

findStr:

"קרי:"

(An alternative, more reliable method would also be an acceptable answer.)

Dorad
  • I'm sorry but would you kindly convert sample strings to English? – sCha Sep 19 '17 at 06:26
  • I could, but those aren't just random strings; they are the real strings being matched in my app, and I want to figure out why the second returns nothing. – Dorad Sep 19 '17 at 06:29
  • is it mandatory to use regex to do such a task? – Ahmad F Sep 19 '17 at 06:31
  • Negative. Another suggestion is welcome. – Dorad Sep 19 '17 at 06:32
  • What is it that you are actually trying to do here? What is the end goal? Happy to suggest another way but I don’t know what your requirement is for this. – Fogmeister Sep 19 '17 at 06:41
  • I am locating special pieces in strings and formatting their ranges specially (size, color) in attributed strings. – Dorad Sep 19 '17 at 06:51
  • See if this might be helpful if not with regex. https://stackoverflow.com/questions/40413218/swift-find-all-occurrences-of-a-substring – Rishi Sep 19 '17 at 06:53

1 Answer

NSRange values are specified in terms of UTF-16 code units (which is what NSString uses internally), so the length must be self.utf16.count:

        ranges = regex.matches(in: self, options: [],
                               range: NSRange(location: 0, length: self.utf16.count))
            .map {$0.range}

In the case of your second string we have

let s2 = "לִ֭נְצֹר אָרְח֣וֹת מִשְׁפָּ֑ט וְדֶ֖רֶךְ חסידו (קרי: חֲסִידָ֣יו) יִשְׁמֹֽר׃"
print(s2.characters.count) // 46
print(s2.utf16.count)      // 74

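and that's why the pattern is not found with your code: the search range covers only the first 46 UTF-16 code units, and the match starts after that point.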
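
The same effect can be reproduced in isolation. The following is only a minimal sketch using a short, hypothetical string (not one from the question), written with Unicode escapes so that the combining marks are visible. It assumes Swift 4 or later, where `count` replaces `characters.count`.

```swift
import Foundation

// Hypothetical sample: three times "e" + U+0301 (combining acute accent),
// then a space and the text we search for. Each "e\u{301}" is one
// Character but two UTF-16 code units.
let sample = "e\u{301}e\u{301}e\u{301} abc"

let regex = try! NSRegularExpression(pattern: "abc", options: [])

// Range sized by the Character count: too short, it ends before "abc".
let shortRange = NSRange(location: 0, length: sample.count)
let missed = regex.matches(in: sample, options: [], range: shortRange)

// Range sized by the UTF-16 count: covers the whole string.
let fullRange = NSRange(location: 0, length: sample.utf16.count)
let found = regex.matches(in: sample, options: [], range: fullRange)

print(sample.count, sample.utf16.count) // 7 10
print(missed.count, found.count)        // 0 1
```

Because the short range ends at offset 7, exactly where "abc" begins, the regular expression never gets to see the text it is looking for.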

Starting with Swift 4, you can also compute an NSRange for the entire string as

NSRange(self.startIndex..., in: self)
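
Putting the pieces together, the extension from the question might look roughly like this in Swift 4 or later. This is a sketch rather than a definitive version: the unlabeled parameter and the `guard`/`try?` structure are stylistic choices, and escaping the search string with `NSRegularExpression.escapedPattern(for:)` is an assumption that `findStr` is meant as literal text rather than as a pattern. Drop that line if a real regular expression is intended. The sample string is hypothetical.

```swift
import Foundation

extension String {
    /// Returns the NSRange (in UTF-16 code units) of every occurrence of `findStr`.
    func nsRangesOfString(_ findStr: String) -> [NSRange] {
        // Assumption: findStr is literal text, so regex metacharacters are escaped.
        let pattern = NSRegularExpression.escapedPattern(for: findStr)
        guard let regex = try? NSRegularExpression(pattern: pattern, options: []) else {
            // The regular expression could not be created.
            return []
        }
        // Whole-string range, measured in UTF-16 code units.
        let whole = NSRange(startIndex..., in: self)
        return regex.matches(in: self, options: [], range: whole).map { $0.range }
    }
}

// Usage with a hypothetical string that contains combining marks.
let text = "e\u{301} (x) e\u{301} (x)"
let ranges = text.nsRangesOfString("(x)")
print(ranges.map { ($0.location, $0.length) })
```

The resulting ranges can be passed directly to NSAttributedString APIs. If a Swift `Range<String.Index>` is needed instead, `Range(nsRange, in: text)` performs the reverse conversion and returns an optional.
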
Martin R
  • Also possibly helpful: https://stackoverflow.com/questions/27880650/swift-extract-regex-matches. – Martin R Sep 19 '17 at 07:45
  • Awesome. Is it safe to use `utf16.count` for all other types of string (including English)? – sCha Sep 19 '17 at 08:11
  • @sCha: `s.utf16.count` is the number of UTF-16 code units in a string, and always the same as `(s as NSString).length`, no matter what language. *If* you need a NSRange/NSString compatible count then that's the correct method. – Martin R Sep 19 '17 at 08:18