TL;DR
The documentation for NSString.length specifies: "The number of UTF-16 code units in the receiver."
Thus, if you want to interop between String and NSString:
- You should use string.utf16.count, and it will match up perfectly with (string as NSString).length.
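A minimal sketch of that claim, assuming a few sample strings of my own (they are not from the playground below). The helper name lengthsAgree is hypothetical, and NSString(string:) is used instead of the `as` cast only so the snippet also builds outside Apple platforms:

```swift
import Foundation

// Returns true when the UTF-16 view and NSString report the same length.
// On Apple platforms `(s as NSString).length` should give the same value.
func lengthsAgree(_ s: String) -> Bool {
    return s.utf16.count == NSString(string: s).length
}

// Hypothetical samples, written with escapes so the scalars are explicit:
// plain ASCII, a precomposed accent, an emoji outside the BMP, and a flag.
let samples = ["abc", "caf\u{E9}", "\u{1F468}", "\u{1F1FA}\u{1F1F8}"]
for s in samples {
    // Prints the UTF-16 length and whether NSString agrees with it.
    print(s.utf16.count, lengthsAgree(s))
}
```

Both sides count UTF-16 code units, so the second column should be true for every string you try.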
If you want to count the number of visible characters:
- You should use string.count, and it will match the number of times you need to press the → (right arrow) key to move from the beginning of the string to the end.
Note: This is not always 100% accurate, but Apple appears to keep improving the implementation, so the result for the same string can differ between Swift versions.
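To make the difference between the views concrete, here is a small sketch with two sample strings of my own choosing (an "e" plus a combining accent, and a flag built from two regional indicators). Each one is a single visible character, yet the other views report larger numbers:

```swift
// "e" followed by U+0301 COMBINING ACUTE ACCENT: one visible character.
let accented = "e\u{301}"
print(accented.count)                 // 1 (grapheme clusters)
print(accented.unicodeScalars.count)  // 2 (Unicode scalars)
print(accented.utf16.count)           // 2 (UTF-16 code units, same as NSString.length)
print(accented.utf8.count)            // 3 (UTF-8 bytes)

// A flag is a pair of regional indicator scalars, each outside the BMP.
let flag = "\u{1F1FA}\u{1F1F8}"
print(flag.count)                     // 1
print(flag.unicodeScalars.count)      // 2
print(flag.utf16.count)               // 4
print(flag.utf8.count)                // 8
```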
Here's a Swift 4.0 playground to test a bunch of strings and functions:
import Foundation // needed for NSString, NSRange and String(format:)

// Column titles for the printed table.
let header = "NSString .utf16❔ encodedOffset❔ NSRange❔ .count❔ .characters❔ distance❔ .unicodeScalars❔ .utf8❔ Description"
var format = " %3d %3d ❓ %3d ❓ %3d ❓ %3d ❓ %3d ❓ %3d ❓ %3d ❓ %3d ❓ %@"
format = format.replacingOccurrences(of: "❓", with: "%@") // "❓" acts as a placeholder for "%@" to align the text perfectly
print(header)
test("")
test("abc")
test("❌")
test("")
test("☾test")
test("")
test("\u{200d}\u{200d}\u{200d}")
test("")
test("\u{1F468}")
test("♀️♂️")
test("你好吗")
test("مرحبا", "Arabic word")
test("م", "Arabic letter")
test("שלום", "Hebrew word")
test("ם", "Hebrew letter")
func test(_ s: String, _ description: String? = nil) {
    // ✅ if the given length equals NSString.length for the same string, ❌ otherwise.
    func icon(for length: Int) -> String {
        return length == (s as NSString).length ? "✅" : "❌"
    }
    let description = description ?? "'" + s + "'"
    // End of the full-string NSRange, measured in UTF-16 code units.
    let nsRangeEnd = NSRange(s.startIndex..<s.endIndex, in: s).upperBound
    // Distance between the first and last index, measured in Characters.
    let distance = s.distance(from: s.startIndex, to: s.endIndex)
    let string = String(
        format: format,
        (s as NSString).length,
        s.utf16.count, icon(for: s.utf16.count),
        s.endIndex.encodedOffset, icon(for: s.endIndex.encodedOffset),
        nsRangeEnd, icon(for: nsRangeEnd),
        s.count, icon(for: s.count),
        s.characters.count, icon(for: s.characters.count), // .characters is deprecated
        distance, icon(for: distance),
        s.unicodeScalars.count, icon(for: s.unicodeScalars.count),
        s.utf8.count, icon(for: s.utf8.count),
        description)
    print(string)
}
And here is the output:
NSString  .utf16❔  encodedOffset❔  NSRange❔  .count❔  .characters❔  distance❔  .unicodeScalars❔  .utf8❔  Description
  0         0 ✅      0 ✅             0 ✅       0 ✅      0 ✅           0 ✅        0 ✅               0 ✅   ''
  3         3 ✅      3 ✅             3 ✅       3 ✅      3 ✅           3 ✅        3 ✅               3 ✅   'abc'
  1         1 ✅      1 ✅             1 ✅       1 ✅      1 ✅           1 ✅        1 ✅               3 ❌   '❌'
  4         4 ✅      4 ✅             4 ✅       1 ❌      1 ❌           1 ❌        2 ❌               8 ❌   ''
  5         5 ✅      5 ✅             5 ✅       5 ✅      5 ✅           5 ✅        5 ✅               7 ❌   '☾test'
 11        11 ✅     11 ✅            11 ✅       1 ❌      1 ❌           1 ❌        7 ❌              25 ❌   ''
 11        11 ✅     11 ✅            11 ✅       1 ❌      1 ❌           1 ❌        7 ❌              25 ❌   ''
  8         8 ✅      8 ✅             8 ✅       4 ❌      4 ❌           4 ❌        4 ❌              16 ❌   ''
  2         2 ✅      2 ✅             2 ✅       1 ❌      1 ❌           1 ❌        1 ❌               4 ❌   ''
 58        58 ✅     58 ✅            58 ✅      13 ❌     13 ❌          13 ❌       32 ❌             122 ❌   '♀️♂️'
  3         3 ✅      3 ✅             3 ✅       3 ✅      3 ✅           3 ✅        3 ✅               9 ❌   '你好吗'
  5         5 ✅      5 ✅             5 ✅       5 ✅      5 ✅           5 ✅        5 ✅              10 ❌   Arabic word
  1         1 ✅      1 ✅             1 ✅       1 ✅      1 ✅           1 ✅        1 ✅               2 ❌   Arabic letter
  4         4 ✅      4 ✅             4 ✅       4 ✅      4 ✅           4 ✅        4 ✅               8 ❌   Hebrew word
  1         1 ✅      1 ✅             1 ✅       1 ✅      1 ✅           1 ✅        1 ✅               2 ❌   Hebrew letter
Conclusions:
- To get a length that is compatible with NSString/NSRange, use either (s as NSString).length, s.utf16.count (preferred), s.endIndex.encodedOffset, or NSRange(s.startIndex..<s.endIndex, in: s).upperBound.
- To get the number of visible characters, use either s.count (preferred), s.characters.count (deprecated), or s.distance(from: s.startIndex, to: s.endIndex).
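One caveat if you are on Swift 5 or later: as far as I know, encodedOffset is deprecated there, and utf16Offset(in:) is the closest replacement. A minimal sketch, assuming a sample string of my own:

```swift
// Hypothetical sample: "a" + an emoji outside the BMP + "b",
// which is 1 + 2 + 1 = 4 UTF-16 code units.
let s = "a\u{1F468}b"

// Offset of endIndex measured in UTF-16 code units.
let endOffset = s.endIndex.utf16Offset(in: s)
print(endOffset)                    // 4
print(endOffset == s.utf16.count)   // true

// The reverse direction: build a String.Index from a UTF-16 offset.
let index = String.Index(utf16Offset: 1, in: s)
print(s[index...].count)            // 2 (the emoji and "b")
```

This keeps the NSString-compatible arithmetic in UTF-16 units while the rest of the code keeps working with String.Index.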
A helpful extension to get the full range of a String as an NSRange:
public extension String {
    /// The whole string as an NSRange, measured in UTF-16 code units.
    var nsrange: NSRange {
        return NSRange(startIndex..<endIndex, in: self)
    }
}
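Here is a rough sketch of how the extension might be used with Foundation APIs that expect an NSRange. The sample text and the regular expression pattern are my own assumptions, and the extension is repeated only so the snippet compiles on its own:

```swift
import Foundation

// Repeated from the answer so this snippet is self-contained.
extension String {
    var nsrange: NSRange {
        return NSRange(startIndex..<endIndex, in: self)
    }
}

// Hypothetical sample: 3 visible characters, 4 UTF-16 code units.
let text = "a\u{1F468}b"
let full = text.nsrange
print(full.location, full.length)   // 0 4

// Converting back yields a Range<String.Index> covering the whole string.
if let swiftRange = Range(full, in: text) {
    print(String(text[swiftRange]) == text)   // true
}

// NSRegularExpression takes an NSRange, so the full range can be passed directly.
let regex = try! NSRegularExpression(pattern: "b")
let matches = regex.numberOfMatches(in: text, options: [], range: text.nsrange)
print(matches)   // 1
```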
Thus, you can call the original method like so:
replace("", characterAtIndex: "".utf16.count - 1) // �!