rmaddy's answer is correct (+1). A Swift 3 implementation is:
var sentences = [String]()
string.enumerateSubstrings(in: string.startIndex ..< string.endIndex, options: .bySentences) { substring, _, _, _ in
    // substring is optional, so unwrap it safely instead of force-unwrapping
    if let substring = substring {
        sentences.append(substring)
    }
}
You can also use a regular expression (NSRegularExpression), though it's much hairier than rmaddy's .bySentences solution. In Swift 3:
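var sentences = [String]()
let regex = try! NSRegularExpression(pattern: "(^|\\s+)(\\w.*?[.!?]+)(?=(\\s+|$))")
// NSRange counts UTF-16 code units, so use utf16.count rather than characters.count
regex.enumerateMatches(in: string, range: NSMakeRange(0, string.utf16.count)) { match, flags, stop in
    // Capture group 2 is the sentence itself, without any leading whitespace
    sentences.append((string as NSString).substring(with: match!.rangeAt(2)))
}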
Or Swift 2:
let regex = try! NSRegularExpression(pattern: "(^|\\s+)(\\w.*?[.!?]+)(?=(\\s+|$))", options: [])
var sentences = [String]()
// NSRange counts UTF-16 code units, so use utf16.count rather than characters.count
regex.enumerateMatchesInString(string, options: [], range: NSMakeRange(0, string.utf16.count)) { match, flags, stop in
    // Capture group 2 is the sentence itself, without any leading whitespace
    sentences.append((string as NSString).substringWithRange(match!.rangeAtIndex(2)))
}
The [.!?] syntax matches any one of those three characters. The | means "or". The ^ matches the start of the string. The $ matches the end of the string. The \\s matches a whitespace character and the \\w matches a "word" character (the backslashes are doubled only because they sit inside a Swift string literal). The . matches any character. The * matches zero or more of the preceding element, and the ? after it (as in .*?) makes that match lazy, so it consumes as little as possible. The + matches one or more of the preceding element. The (?=) is a look-ahead assertion (i.e. check whether something is there, but don't advance through that match).
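To make the role of the capture groups concrete, here is a small sketch (assuming Swift 4 or later for the NSRange/Range conversions; the sample text and variable names are mine, purely for illustration). It collects both the whole match and capture group 2 for each hit. The whole match can include the leading whitespace consumed by the first group, which is why the snippets above pull out group 2 instead:

```swift
import Foundation

// Same pattern as above; `text`, `wholeMatches` and `groupTwo` are illustrative names.
let text = "Hi. Bye."
let pattern = "(^|\\s+)(\\w.*?[.!?]+)(?=(\\s+|$))"
let regex = try! NSRegularExpression(pattern: pattern)

var wholeMatches = [String]()
var groupTwo = [String]()

// NSRange(_:in:) converts using UTF-16 offsets, which is what NSRegularExpression expects.
let fullRange = NSRange(text.startIndex..<text.endIndex, in: text)
for match in regex.matches(in: text, options: [], range: fullRange) {
    // Range 0 is the entire match, including whitespace consumed by group 1.
    if let whole = Range(match.range, in: text) {
        wholeMatches.append(String(text[whole]))
    }
    // Range 2 is just the sentence text.
    if let sentence = Range(match.range(at: 2), in: text) {
        groupTwo.append(String(text[sentence]))
    }
}

print(wholeMatches) // ["Hi.", " Bye."]
print(groupTwo)     // ["Hi.", "Bye."]
```

Note that the look-ahead leaves the space after "Hi." unconsumed, so it is available as the leading whitespace of the next match.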
I've tried to simplify this a bit, and it's still pretty complicated. Regular expressions offer rich text pattern matching but are, admittedly, a little dense when you first use them. Still, this rendition handles (a) repeated punctuation (e.g. "Thank you!!!"), (b) leading spaces, and (c) trailing spaces, too.
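If you're on a newer toolchain, the same idea might look something like the following. This is only a sketch, assuming Swift 4 or later; the function name splitIntoSentences and the sample string are hypothetical, not part of any API. It exercises the three cases just mentioned (repeated punctuation, leading spaces, trailing spaces):

```swift
import Foundation

// Hypothetical helper wrapping the regex approach from this answer.
func splitIntoSentences(_ string: String) -> [String] {
    let pattern = "(^|\\s+)(\\w.*?[.!?]+)(?=(\\s+|$))"
    guard let regex = try? NSRegularExpression(pattern: pattern) else { return [] }

    // Build the NSRange from String indices so the UTF-16 length is always right.
    let fullRange = NSRange(string.startIndex..<string.endIndex, in: string)
    var result = [String]()
    regex.enumerateMatches(in: string, options: [], range: fullRange) { match, _, _ in
        // Skip anything we cannot map back to a Swift range instead of force-unwrapping.
        guard let match = match,
              let range = Range(match.range(at: 2), in: string) else { return }
        result.append(String(string[range]))
    }
    return result
}

// Leading spaces, repeated punctuation, and trailing spaces in one sample.
let sample = "  Thank you!!! How are you? I am fine.  "
let found = splitIntoSentences(sample)
print(found) // ["Thank you!!!", "How are you?", "I am fine."]
```

Like the original, this does nothing clever about abbreviations such as "e.g." followed by a space, which is one more reason to prefer .bySentences when it does what you need.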