I ran into a very strange problem today with Swift 2.

I have this simple method to extract a substring based on NSRange:

func substringWithRange(string: String, range: NSRange) -> String {
    let startIndex = string.startIndex.advancedBy(range.location)
    let endIndex = startIndex.advancedBy(range.length)
    let substringRange = Range<String.Index>(start: startIndex, end: endIndex)

    return string.substringWithRange(substringRange)
}

With ordinary strings, or strings containing Unicode characters, everything works fine. But when the string contains the newline sequence "\r\n", the result of

let startIndex = string.startIndex.advancedBy(range.location)

is always 1 greater than it should be.

let string = "<html>\r\n var info={};</html>"
let range = NSMakeRange(9, 12)

let substring = substringWithRange(string, range: range)

//Expected: var info={};
//Actual: ar info={};<

//string.startIndex = 0
//range.location = 9
//startIndex after advancedBy = 10

Does anyone know why advancedBy is acting that way and how I can solve this problem?

iONsky
  • I don't see any problem here. `\r\n` is one character. Why don't you decrease your range to `(8,11)`? – t4nhpt Oct 23 '15 at 10:22
  • @t4nhpt: I can't because I get the NSRange from a NSRegularExpression match and I don't know if there is `\r\n` or not – iONsky Oct 23 '15 at 11:24
  • Your method to create a Swift range from NSRange is not correct, you would have problems with other special characters (such as Emojis) as well. Compare [Swift extract regex matches](http://stackoverflow.com/questions/27880650/swift-extract-regex-matches) and [NSRange to Range](http://stackoverflow.com/questions/25138339/nsrange-to-rangestring-index). – Martin R Oct 23 '15 at 11:41
  • @MartinR: Thank you for the links! I used your method `rangeFromNSRange` from [NSRange to Range](http://stackoverflow.com/questions/25138339/nsrange-to-rangestring-index) and it works now. But I still have one question: When I take your example from the link above and create NSRange programmatically `let str = "abc" let n1 = NSMakeRange(0, 3) // NSRange back to String range: let r2 = str.rangeFromNSRange(n1)! print(str.substringWithRange(r2))` I get "a" but I would expect "ab" – iONsky Oct 23 '15 at 13:46
  • @iONsky: NSString stores the Emoji character as two UTF-16 characters, therefore an NSRange with length 3 covers only the letter "a" and the Emoji. – Martin R Oct 23 '15 at 14:04
  • @MartinR This question is really about a (potential?) bug in Swift 2.x, where Swift doesn't count \r characters (but, for example, NSString does). I do not think this is a duplicate question. – Brian Stewart Jul 07 '16 at 21:15
  • @BrianStewart: I think it is. Swift counts `"\r\n"` as a single grapheme cluster (intentionally, not a bug), but NSString counts it as two UTF-16 code units. Swift `String` ranges and `NSString` ranges are different. Also OP confirmed that the linked-to answer solved the problem. – Martin R Jul 07 '16 at 21:33

1 Answer

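The reason is that Swift treats "\r\n" as a single Character (one grapheme cluster), whereas an NSRange counts UTF-16 code units, in which "\r\n" is two. Advancing a String.Index by an NSRange location therefore lands one position too far for every "\r\n" that precedes it: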

let cr = "\r"
cr.characters.count // 1
let lf = "\n"
lf.characters.count // 1
let crlf = "\r\n"
crlf.characters.count // 1
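
To avoid the mismatch, the NSRange has to be converted in terms of UTF-16 offsets rather than Characters. The following is a minimal sketch, assuming a current toolchain (Swift 4 or later with Foundation) rather than the Swift 2 used in the question. It relies on Foundation's `Range(_:in:)` initializer; the helper name `substring(of:in:)` is hypothetical and only serves the illustration.

```swift
import Foundation

let string = "<html>\r\n var info={};</html>"
let nsRange = NSRange(location: 9, length: 12)

// The Character view counts "\r\n" once, the UTF-16 view counts it twice.
let characterCount = string.count       // 27
let utf16Count = string.utf16.count     // 28

// Hypothetical helper: Range(_:in:) interprets the NSRange as UTF-16
// offsets into `string` and yields nil when it cannot be mapped.
func substring(of string: String, in nsRange: NSRange) -> String? {
    guard let range = Range(nsRange, in: string) else { return nil }
    return String(string[range])
}

print(substring(of: string, in: nsRange) ?? "nil")  // var info={};
```

Because the conversion goes through UTF-16 offsets, the same helper should also work for ranges returned by NSRegularExpression matches, regardless of whether the text contains "\r\n" or Emoji.
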
vadian