Foundation has a CharacterSet struct (bridged to NSCharacterSet) for managing sets of characters, for example when working with Formatter instances. Surprisingly, CharacterSet is not a Set, although its functionality and purpose look the same. Unfortunately, CharacterSet is not a Collection either, so right now I have no idea how to retrieve its elements.

// We can initialize a CharacterSet from a String
let wrongCharacterSet = CharacterSet(charactersIn: "0123456789").inverted
// but how can we get the characters back?
var chSet = CharacterSet.decimalDigits
let chString = String(chSet) // doesn't work
let chS = Set(chSet) // doesn't work
let chArr = Array(chSet) // doesn't work
Dávid Pásztor
Paul B
  • Why would you need to access all elements of the `CharacterSet` as a `Collection`? What problem are you trying to solve? As for the "functionality" of `CharacterSet` "being the same" as that of `Set`, that's because both conform to the `SetAlgebra` protocol. – Dávid Pásztor Apr 02 '19 at 10:58
  • https://stackoverflow.com/questions/15741631/nsarray-from-nscharacterset ? But as stated by Dávid Pásztor, what's your goal? There might be a better way. – Larme Apr 02 '19 at 10:58
  • See https://stackoverflow.com/questions/42252492/strange-string-unicodescalars-and-characterset-behaviour/42252675#42252675 – vadian Apr 02 '19 at 11:04
  • @Dávid Pásztor, thanks for the reminder about `SetAlgebra`. I was not going to manipulate the elements using `Collection`'s methods. `CharacterSet` has everything for inserting, removing, searching and more. I just wanted to **view** it to make sure everything works as expected, and maybe show it to the user in some cases. @Larme and @vadian actually answered my question in the comments. – Paul B Apr 02 '19 at 14:20
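
The `SetAlgebra` point raised in the comments can be illustrated with a minimal sketch. The variable names below are made up for the example; only `union`, `subtracting`, and `contains` are used, all of which `CharacterSet` provides. Note that membership is tested per Unicode scalar, not per `Character`.

```swift
import Foundation

// Two small sets built from literal strings (names are illustrative only).
let digits = CharacterSet(charactersIn: "0123456789")
let hexLetters = CharacterSet(charactersIn: "abcdefABCDEF")

// SetAlgebra operations return new CharacterSet values.
let hexDigits = digits.union(hexLetters)
let lettersOnly = hexDigits.subtracting(digits)

// contains(_:) takes a Unicode.Scalar; the literals are inferred as scalars.
print(hexDigits.contains("f"))    // true
print(hexDigits.contains("g"))    // false
print(lettersOnly.contains("a"))  // true
print(lettersOnly.contains("7"))  // false
```

None of these operations enumerates the members, which is why viewing the contents needs a separate approach.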

1 Answer

I've slightly modified the solution found in the answers pointed out by @Larme and @vadian. Both answers end up with the same algorithm. All I wanted was to look at the contents of the set. Yes, that is not a common thing to want. It turns out that the only way to get all the elements of a CharacterSet is to loop through all possible Unicode scalars and check whether each one belongs to the set. That feels strange to me in a world where we can switch between Sets, Arrays and even Dictionaries so easily. The reason for the modification is an attempt to speed up the function. My rough experiments show that working with scalars is about 30% faster, even if we create a string in the end.

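import Foundation

extension CharacterSet {
    /// Returns every Unicode scalar contained in the set, in ascending order.
    func allUnicodeScalars() -> [UnicodeScalar] {
        var result: [UnicodeScalar] = []
        // Unicode has 17 planes (0...16); skip planes that hold no members.
        for plane in Unicode.UTF8.CodeUnit.min...16 where self.hasMember(inPlane: plane) {
            // Each plane covers 0x10000 consecutive code points.
            for unicode in Unicode.UTF32.CodeUnit(plane) << 16 ..< Unicode.UTF32.CodeUnit(plane + 1) << 16 {
                // UnicodeScalar(_:) returns nil for values that are not valid scalars (surrogates).
                if let uniChar = UnicodeScalar(unicode), self.contains(uniChar) {
                    result.append(uniChar)
                }
            }
        }
        return result
    }
}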

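// Testing and timing (chSet is CharacterSet.decimalDigits from the question)
printTimeElapsedWhenRunningCode(title: "allUnicodeScalars()") {
    print(String.UnicodeScalarView(chSet.allUnicodeScalars()))
}
// Time elapsed for allUnicodeScalars(): 1.936843991279602 s.

// allCharacters() is not defined in this answer; it is the Character-based
// version from the linked answers, timed here for comparison.
printTimeElapsedWhenRunningCode(title: "allCharacters()") {
    print(String(chSet.allCharacters()))
}
// Time elapsed for allCharacters(): 2.9846099615097046 s.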

// Timing functions (for reference):
private func printTimeElapsedWhenRunningCode(title:String, operation:()->()) {
    let startTime = CFAbsoluteTimeGetCurrent()
    operation()
    let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
    print("Time elapsed for \(title): \(timeElapsed) s.")
}

private func timeElapsedInSecondsWhenRunningCode(operation: ()->()) -> Double {
    let startTime = CFAbsoluteTimeGetCurrent()
    operation()
    let timeElapsed = CFAbsoluteTimeGetCurrent() - startTime
    return Double(timeElapsed)
}
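
As a quick sanity check of the plane-by-plane scan, here is a self-contained sketch that runs the same idea on a tiny set whose contents are known in advance. The helper name `scalarString()` is hypothetical (it is not part of Foundation or of the linked answers); it builds the String directly instead of returning an array.

```swift
import Foundation

extension CharacterSet {
    // Hypothetical helper: same scan as above, but collects the members
    // into a String through its UnicodeScalarView.
    func scalarString() -> String {
        var view = String.UnicodeScalarView()
        // Visit only the planes (0...16) that contain at least one member.
        for plane: UInt8 in 0...16 where hasMember(inPlane: plane) {
            let start = UInt32(plane) << 16
            let end = UInt32(plane + 1) << 16
            for value in start..<end {
                // Skip values that are not valid scalars, then test membership.
                if let scalar = Unicode.Scalar(value), contains(scalar) {
                    view.append(scalar)
                }
            }
        }
        return String(view)
    }
}

let small = CharacterSet(charactersIn: "cba")
print(small.scalarString())  // prints "abc"
```

The result comes back in code point order rather than insertion order, because the scan walks the scalar values from low to high.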

UPD: Yes, the question is a duplicate, and a better answer exists.

Paul B