I was trying to use CharacterSet to check whether a user input string contains any non-decimal-digit characters. I use CharacterSet.decimalDigits and take the intersection of that with the user input. If this intersection is empty, it presumably means the user hasn't entered valid input. Yet the intersection is not empty.

let digits = CharacterSet.decimalDigits
let letters = CharacterSet(charactersIn: "abcd")

let intersection = digits.intersection(letters)
for c in "abcd".characters {
    if intersection.contains(UnicodeScalar(String(c))!) {
        print("contains \(c)") // never prints
    }
}

for i in 0...9 {
    if intersection.contains(UnicodeScalar(String(i))!) {
        print("contains \(i)")
    }
}

print("intersection is empty: \(intersection.isEmpty)") // prints false

I even tried looping over every Unicode scalar in the range 0x0000...0xFFFF to test for membership, and that doesn't print anything either.

for i in 0x0000...0xFFFF {
    guard let c = UnicodeScalar(i) else {
        continue
    }
    if intersection.contains(c) {
        print("contains \(c)")
    }
}

Why is the set non-empty?

Note: using let digits = CharacterSet(charactersIn: "1234567890") instead works as expected. I know that decimalDigits contains more than just 0-9, but the intersection with "abcd" should still be empty.

nteissler
  • There are several bug reports with respect to character sets at https://bugs.swift.org/. – Martin R Jan 10 '17 at 13:10
  • Nice.. I just enumerated over `0x0000...0xFFFF` to see the other characters in .decimalDigits, I had no idea it was more than 0...9 – MathewS Jan 10 '17 at 15:47
  • @MathewS: If you are interested: [here](http://stackoverflow.com/q/15741631/1187415) are some (Objective-C and) Swift methods to get all characters from a CharacterSet. There are 550 "digits", such as "꩓" (CHAM DIGIT THREE), "໔" (LAO DIGIT FOUR) or "𝟙" (MATHEMATICAL DOUBLE-STRUCK DIGIT ONE) – Martin R Jan 10 '17 at 19:54
  • @MartinR thanks.. the Cham alphabet is beautiful! – MathewS Jan 10 '17 at 20:01

1 Answer

I browsed the CharacterSet bugs and didn't see one about intersections incorrectly reporting isEmpty. If you have the time, you should file a bug, since this is a nice reproducible example.

In the meantime, you could try this to check whether the input contains any characters from the .decimalDigits character set. The check returns true when the input contains no decimal digits:

let digits = CharacterSet.decimalDigits // same set as in the question

let letterInput = CharacterSet(charactersIn: "abcd")
digits.isSubset(of: letterInput.inverted)
// -> true

let letterAndDigitInput = CharacterSet(charactersIn: "abcd 1234")
digits.isSubset(of: letterAndDigitInput.inverted)
// -> false

let digitInput = CharacterSet(charactersIn: "1234")
digits.isSubset(of: digitInput.inverted)
// -> false
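
If the goal is only to validate the user's input string, another option is to skip the set algebra altogether and test each Unicode scalar of the string against CharacterSet.decimalDigits. The following is a minimal sketch of that idea, not a fix for the isEmpty behaviour itself; the helper names containsDecimalDigit and isAllDecimalDigits are made up for illustration.

```swift
import Foundation

// Hypothetical helper: true if at least one scalar in `input`
// is a member of CharacterSet.decimalDigits.
func containsDecimalDigit(_ input: String) -> Bool {
    return input.unicodeScalars.contains { CharacterSet.decimalDigits.contains($0) }
}

// Hypothetical helper: true if `input` is non-empty and every scalar
// is a member of CharacterSet.decimalDigits.
func isAllDecimalDigits(_ input: String) -> Bool {
    let hasNonDigit = input.unicodeScalars.contains { !CharacterSet.decimalDigits.contains($0) }
    return !input.isEmpty && !hasNonDigit
}

print(containsDecimalDigit("abcd"))      // false
print(containsDecimalDigit("abcd 1234")) // true
print(isAllDecimalDigits("1234"))        // true
print(isAllDecimalDigits("12a4"))        // false
```

Keep in mind that .decimalDigits covers digits from many scripts, so isAllDecimalDigits would also accept input such as "໔". If only 0-9 should pass, the same loop can test against CharacterSet(charactersIn: "0123456789") instead.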
MathewS