Background
I am writing a DFA based regex parser, for performance reasons, I need to use a dictionary [Unicode.Scalar : State]
to map the next states. Now I need a bunch of special unicode values to represent special character expressions like .
, \w
, \d
...
My Question
Which of the unicode values are safe to use for this purpose?
I was using U+0000
for .
, but I need more now. I checked the unicode documentation, the Noncharacters seems promising, but in swift, those are considered invalid unicode. For example, the following code gives me a compiler error Invalid unicode scalar
.
let c = "\u{FDD0}"