I am using a regex to remove certain substrings from a string. When the string contains emojis, the range I pass to the regex ends up shorter than the string's actual length.
For example:
// Original string
1.30pm. Let’s eat https://somedomain.com/vDYEvkSFqa
// Expected end result
1.30pm. Let’s eat
However, what I'm getting is this result:
1.30pm. Let’s eat a
Notice the extra "a" left behind.
I am using the code below, and it works well for strings without any emoji. With emoji, however, the range obtained is too short.
func removeTwitterUrlInTweet(text: String) -> String {
    let pattern = "https://somedomain.com/.*"
    do {
        let regex = try NSRegularExpression(pattern: pattern, options: .caseInsensitive)
        let range = NSMakeRange(0, text.count)
        let modifiedText = regex.stringByReplacingMatches(in: text, options: [], range: range, withTemplate: "")
        return modifiedText
    } catch {
        Log("Tweet regex err: \(error.localizedDescription)")
        return text
    }
}
My second attempt converts the string to UTF-8 before computing the range, but it crashes with an index-out-of-range error.
func removeTwitterUrlInTweet(text: String) -> String {
    let pattern = "https://somedomain.com/.*"
    let string = String(describing: text.cString(using: .utf8))
    do {
        let regex = try NSRegularExpression(pattern: pattern, options: .caseInsensitive)
        let range = NSMakeRange(0, string.count)
        let modifiedText = regex.stringByReplacingMatches(in: text, options: [], range: range, withTemplate: "")
        return modifiedText
    } catch {
        Log("Tweet regex err: \(error.localizedDescription)")
        return text
    }
}
How do I get the full range from the text that includes emojis?
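For reference, here is a minimal sketch of one direction that seems likely to work. It assumes the mismatch comes from NSRegularExpression measuring ranges in UTF-16 code units, while String.count counts Characters (grapheme clusters), so an emoji that takes two UTF-16 units is counted only once. The sample tweet and its pizza emoji are illustrative assumptions, and print stands in for the Log helper, which is not shown in the question.

```swift
import Foundation

// Sketch only: same function as above, but the range is built from the
// string's own indices instead of from text.count.
func removeTwitterUrlInTweet(text: String) -> String {
    let pattern = "https://somedomain.com/.*"
    do {
        let regex = try NSRegularExpression(pattern: pattern, options: .caseInsensitive)
        // NSRange(_:in:) converts a Range<String.Index> into UTF-16 offsets,
        // which is the unit NSRegularExpression expects.
        let range = NSRange(text.startIndex..<text.endIndex, in: text)
        return regex.stringByReplacingMatches(in: text, options: [], range: range, withTemplate: "")
    } catch {
        // print is a stand-in for the Log helper used in the question.
        print("Tweet regex err: \(error.localizedDescription)")
        return text
    }
}

// Hypothetical input containing an emoji.
let tweet = "1.30pm. Let’s eat 🍕 https://somedomain.com/vDYEvkSFqa"
print(tweet.count, tweet.utf16.count)   // the two lengths differ when emoji are present
print(removeTwitterUrlInTweet(text: tweet))
```

NSMakeRange(0, text.utf16.count) should give the same range. The crash in the second attempt is probably a separate issue: String(describing:) applied to the optional [CChar] array produces a textual description of that array, which is much longer than the original text, so the resulting range runs past the end of the string.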