Use NSLinguisticTagger. It gets the sentences right for your input because it analyzes the text in actual linguistic terms instead of just looking for periods.
Here's a rough draft (Swift 1.2; this won't compile in Swift 2.0):

    import Foundation

    let s = "I want to split a paragraph into sentences. But, there is a problem. My paragraph includes dates like Jan.13, 2014 , words like U.A.E and numbers like 2.2. How do i split this."
    // r receives the range of each token; t holds the matching lexical-class tags
    var r = [Range<String.Index>]()
    let t = s.linguisticTagsInRange(
        indices(s), scheme: NSLinguisticTagSchemeLexicalClass,
        options: nil, tokenRanges: &r)
    var result = [String]()
    // start index of every token tagged as a sentence terminator
    let ixs = Array(enumerate(t)).filter {
        $0.1 == "SentenceTerminator"
    }.map {r[$0.0].startIndex}
    // cut the string after each terminator and trim the surrounding spaces
    var prev = s.startIndex
    for ix in ixs {
        let r = prev...ix
        result.append(
            s[r].stringByTrimmingCharactersInSet(
                NSCharacterSet.whitespaceCharacterSet()))
        prev = advance(ix,1)
    }

Here is a Swift 2.0 version (updated to Xcode 7 beta 6):

    import Foundation

    let s = "I want to split a paragraph into sentences. But, there is a problem. My paragraph includes dates like Jan.13, 2014 , words like U.A.E and numbers like 2.2. How do i split this."
    // r receives the range of each token; t holds the matching lexical-class tags
    var r = [Range<String.Index>]()
    let t = s.linguisticTagsInRange(
        s.characters.indices, scheme: NSLinguisticTagSchemeLexicalClass,
        tokenRanges: &r)
    var result = [String]()
    // start index of every token tagged as a sentence terminator
    let ixs = t.enumerate().filter {
        $0.1 == "SentenceTerminator"
    }.map {r[$0.0].startIndex}
    // cut the string after each terminator and trim the surrounding spaces
    var prev = s.startIndex
    for ix in ixs {
        let r = prev...ix
        result.append(
            s[r].stringByTrimmingCharactersInSet(
                NSCharacterSet.whitespaceCharacterSet()))
        prev = ix.advancedBy(1)
    }

And here it is updated for Swift 3:

    import Foundation

    let s = "I want to split a paragraph into sentences. But, there is a problem. My paragraph includes dates like Jan.13, 2014 , words like U.A.E and numbers like 2.2. How do i split this."
    // r receives the range of each token; t holds the matching lexical-class tags
    var r = [Range<String.Index>]()
    let t = s.linguisticTags(
        in: s.startIndex..<s.endIndex,
        scheme: NSLinguisticTagSchemeLexicalClass,
        tokenRanges: &r)
    var result = [String]()
    // start index of every token tagged as a sentence terminator
    let ixs = t.enumerated().filter {
        $0.1 == "SentenceTerminator"
    }.map {r[$0.0].lowerBound}
    // cut the string after each terminator and trim the surrounding spaces
    var prev = s.startIndex
    for ix in ixs {
        let r = prev...ix
        result.append(
            s[r].trimmingCharacters(
                in: CharacterSet.whitespaces))
        prev = s.index(after: ix)
    }

result is an array of four strings, one sentence per string:

    ["I want to split a paragraph into sentences.",
     "But, there is a problem.",
     "My paragraph includes dates like Jan.13, 2014 , words like U.A.E and numbers like 2.2.",
     "How do i split this."]