I'm working on an application that eventually reads and prints arbitrary and untrustable Unicode characters to the screen.
There are a number of ways to wreck havoc using Unicode strings, and I would like my program to behave correctly for "dangerous" strings. For instance, the RTL override character will make strings look like they're backwards.
Since the audience is mostly programmers, my solution would be to, first, get the type C canonical form of the string, and then replace anything that's not a printable character on its own with the Unicode code point in the form \uXXXXXX
. (The intent is not to have a perfectly accurate representation of the string, it is to have a mostly good representation. The full string data is still available.)
My problem, then, is determining what's an actual printable character and what's a non-printable character. Swift has a Character
class, but contrary to, say, Java's Character
class, the Swift one doesn't seem to have any method to find out the classification of a character.
How could I carry that plan? Is there anything else I should consider?