I know that I can escape a basic Unicode character in Ruby with the \uNNNN
escape sequence. For example, for a smiling face U+263A (☺) I can use the string literal "\u263A".
How do I escape Unicode characters greater than U+FFFF that fall outside the basic multilingual plane, like a winking face: U+1F609 (😉)?
Using the surrogate pair form like in Java doesn't work; it results in an invalid string that contains the individual surrogate code points:
s = "\uD83D\uDE09" # => "\xED\xA0\xBD\xED\xB8\x89"
s.valid_encoding? # => false
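
For reference, this is roughly how the surrogate pair above is derived from the code point, following the standard UTF-16 encoding rule. It is only a sketch to show where \uD83D and \uDE09 come from; `surrogate_pair` is a hypothetical helper name of my own, not something from Ruby's standard library.

```ruby
# Hypothetical helper (not part of Ruby): split a supplementary-plane
# code point into its UTF-16 high and low surrogates.
def surrogate_pair(code_point)
  unless (0x10000..0x10FFFF).cover?(code_point)
    raise ArgumentError, "code point must be in U+10000..U+10FFFF"
  end

  offset = code_point - 0x10000     # 20-bit value
  high = 0xD800 + (offset >> 10)    # top 10 bits
  low  = 0xDC00 + (offset & 0x3FF)  # bottom 10 bits
  [high, low]
end

high, low = surrogate_pair(0x1F609)
puts format("%04X %04X", high, low) # prints "D83D DE09"
```

So the two \u escapes in my attempt are the correct UTF-16 code units for U+1F609; the problem seems to be that Ruby treats each one as a separate code point rather than combining them.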