
I've read elsewhere that Swift's string comparisons use NFD normalization.

However, running in the iSwift playground I've found that

print("\u{0071}\u{0307}\u{0323}" == "\u{0071}\u{0323}\u{0307}");

prints false, even though this pair is an example of "Canonical Equivalence" taken straight from the Unicode standard, which Swift's documentation claims to follow.

So, what kind of canonicalization is performed by Swift, and is this a bug?

Veedrac
  • Good question! Also, as I understand the documentation, `"\u{0071}\u{0307}\u{0323}".precomposedStringWithCanonicalMapping` should return `"\u{0071}\u{0323}\u{0307}"`, i.e. the NFC form with the combining marks in a defined order. But it doesn't, as one can verify with `print(Array(string.unicodeScalars))`. – Martin R Jan 31 '16 at 16:20
  • Does the [source code](https://github.com/apple/swift/blob/master/stdlib/public/core/String.swift) give a clue? "*The strings which are equivalent according to their NFD form are considered equal. ...*" – As I understand it, your strings have the same NFC form, but different NFD form. – Martin R Jan 31 '16 at 16:27
  • @MartinR It's not NFD vs. NFC, since NFC is just NFD followed by "Canonical Composition", which happens after the reordering (which is deduced from "The fully decomposed and canonically ordered string is processed by another subpart of the Unicode Normalization Algorithm known as the Canonical Composition Algorithm.") I've checked the behaviour against Python's `unicodedata.normalize`, and Python seems to agree that NFD should reorder. – Veedrac Jan 31 '16 at 16:34
  • I assume an answer can be found by digging deeper into the Swift source code... Ultimately, if I see it correctly, the ICU library is used for string comparisons. – You could also ask at https://lists.swift.org/mailman/listinfo/swift-users. – Martin R Jan 31 '16 at 16:40
  • @Veedrac You can use bugreport.apple.com with a free Apple ID; you don't have to be a member of a paid developer program. And anyone can report Swift issues at https://bugs.swift.org/. – rickster Jan 31 '16 at 23:57
  • @rickster bugreport.apple.com is the site that was giving me trouble. Thanks for the other link; I'm not sure how I missed that. [I've submitted a bug report.](https://bugs.swift.org/browse/SR-649) – Veedrac Feb 01 '16 at 00:15

1 Answer


It seems that this was a bug in Swift that has since been fixed. With Swift 3 and Xcode 8.0,

print("\u{0071}\u{0307}\u{0323}" == "\u{0071}\u{0323}\u{0307}")

now prints true.
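For anyone who wants to check this on their own toolchain, here is a minimal sketch that uses only the standard library. The constant names are mine, not from the question, and the `canonicalCombiningClass` property assumes Swift 5 or later. It shows that the two strings really are different scalar sequences, that `==` nevertheless treats them as equal, and why canonical ordering puts the dot below first: U+0323 has a lower combining class than U+0307.

```swift
// Two spellings of "q" with a dot above (U+0307) and a dot below (U+0323).
// The constant names are illustrative and do not come from the original post.
let dotAboveFirst = "\u{0071}\u{0307}\u{0323}"
let dotBelowFirst = "\u{0071}\u{0323}\u{0307}"

// Compare the raw Unicode scalar sequences, with no normalization involved.
let sameScalars = dotAboveFirst.unicodeScalars
    .elementsEqual(dotBelowFirst.unicodeScalars)

// Compare as Strings; == is expected to respect canonical equivalence.
let canonicallyEqual = dotAboveFirst == dotBelowFirst

// Canonical combining classes of the two marks (assumes Swift 5 or later).
// Canonical ordering sorts adjacent marks by ascending combining class.
let dotAbove: Unicode.Scalar = "\u{0307}"
let dotBelow: Unicode.Scalar = "\u{0323}"
let aboveClass = dotAbove.properties.canonicalCombiningClass.rawValue
let belowClass = dotBelow.properties.canonicalCombiningClass.rawValue

print("same scalars:", sameScalars)
print("equal as Strings:", canonicallyEqual)
print("combining class of U+0307:", aboveClass)
print("combining class of U+0323:", belowClass)
```

If the scalar comparison is false while the `String` comparison is true, the toolchain is comparing under canonical equivalence as documented. The combining classes (230 for the dot above, 220 for the dot below) are what make `q`, U+0323, U+0307 the canonically ordered form of both strings.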

AthanasiusOfAlex