The pattern "\\w+" works fine with plain text. However, the goal is for emoji characters to be returned as matches too, instead of being skipped as if they were whitespace.
Example:
"This is some text ".regex("\\w+")
Desired output:
["This","is","some","text",""]
Code:
extension String {
    func regex (pattern: String) -> [String] {
        do {
            let regex = try NSRegularExpression(pattern: pattern, options: NSRegularExpressionOptions(rawValue: 0))
            let nsstr = self as NSString
            let all = NSRange(location: 0, length: nsstr.length)
            var matches : [String] = [String]()
            regex.enumerateMatchesInString(self, options: NSMatchingOptions(rawValue: 0), range: all) {
                (result : NSTextCheckingResult?, _, _) in
                if let r = result {
                    let result = nsstr.substringWithRange(r.range) as String
                    matches.append(result)
                }
            }
            return matches
        } catch {
            return [String]()
        }
    }
}
The code above gives the following output:
"This is some text ".regex("\\w+")
// Yields: ["This", "is", "some", "text"]
// Note that the emoji are missing.
Is it a coding issue, regex issue, or both? Other answers seem to show the same problem.
func matchesForRegexInText(regex: String!, text: String!) -> [String] {
    do {
        let regex = try NSRegularExpression(pattern: regex, options: [])
        let nsString = text as NSString
        let results = regex.matchesInString(text,
            options: [], range: NSMakeRange(0, nsString.length))
        return results.map { nsString.substringWithRange($0.range)}
    } catch let error as NSError {
        print("invalid regex: \(error.localizedDescription)")
        return []
    }
}
let string = "This is some text "
let matches = matchesForRegexInText("\\w+", text: string)
// Also yields ["This", "is", "some", "text"]
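My mistake.
\w+ matches only runs of word characters (letters, digits, underscore), so it never matches emoji. It is not a word boundary; that is \b. Matching runs of anything that is not a space or a tab works:
"This is some text \t ".regex("[^ \t]+")
// Gives the correct answer: ["This", "is", "some", "text", ""]
// Inside [...], the space and \t are simply listed after the leading ^; a | or a second ^ there would be treated as literal characters.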
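For comparison, here is a minimal sketch of the same idea on a current Swift toolchain with Foundation. The helper name allMatches(of:) and the sample emoji (U+1F600) are illustrative assumptions, not part of the code above. It contrasts \w+ with \S+, which matches any run of non-whitespace characters and therefore keeps the emoji.

```swift
import Foundation

extension String {
    // Hypothetical helper (name chosen for this sketch): returns every
    // substring matching `pattern`, or [] if the pattern is invalid.
    func allMatches(of pattern: String) -> [String] {
        guard let regex = try? NSRegularExpression(pattern: pattern) else {
            return []
        }
        // NSRegularExpression counts in UTF-16 units, so derive the NSRange
        // from the String's own indices rather than from `count`.
        let fullRange = NSRange(startIndex..<endIndex, in: self)
        return regex.matches(in: self, range: fullRange).compactMap { match in
            // Convert each NSRange back to a Range<String.Index> before slicing.
            Range(match.range, in: self).map { String(self[$0]) }
        }
    }
}

// Sample input for illustration only; U+1F600 stands in for the emoji.
let sample = "This is some text \u{1F600}"
let words = sample.allMatches(of: "\\w+")   // word characters only
let tokens = sample.allMatches(of: "\\S+")  // any run of non-whitespace
print(words)
print(tokens)
```

With this input, words contains only the four plain-text words, while tokens also contains the emoji as a fifth element. Note that \S+ also keeps punctuation attached to words, so it may need tightening if the text is not space-separated tokens.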