How do I find 0xD801 and 0xDC00 from 0x10400?
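JavaScript exposes strings as sequences of 16-bit code units (UCS-2/UTF-16), so a non-BMP character such as U+10400 is stored as a surrogate pair. That’s why String#charCodeAt() doesn’t work the way you’d want it to: it returns one code unit at a time, not the full code point.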
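To answer the literal question, here is a minimal sketch of the standard UTF-16 surrogate arithmetic. The helper name `toSurrogatePair` is made up for illustration and isn’t part of any library:

```javascript
// Hypothetical helper: split a code point above U+FFFF into its UTF-16 surrogate pair.
function toSurrogatePair(codePoint) {
  if (codePoint < 0x10000 || codePoint > 0x10FFFF) {
    throw new RangeError('Expected a code point between U+10000 and U+10FFFF');
  }
  const offset = codePoint - 0x10000;    // 20-bit value
  const high = 0xD800 + (offset >> 10);  // top 10 bits go into the high surrogate
  const low = 0xDC00 + (offset & 0x3FF); // bottom 10 bits go into the low surrogate
  return [high, low];
}

const [high, low] = toSurrogatePair(0x10400);
console.log(high.toString(16), low.toString(16)); // d801 dc00
```

For 0x10400 the offset is 0x400, so the high surrogate is 0xD800 + 1 and the low surrogate is 0xDC00 + 0.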
If you want to get the code point of every Unicode character (including non-BMP characters) in a string, you could use Punycode.js’s utility functions to convert between UCS-2 strings and Unicode code points:
// String#charCodeAt() replacement that only considers full Unicode characters
punycode.ucs2.decode('\uD834\uDF06'); // [119558] (U+1D306)
punycode.ucs2.decode('abc'); // [97, 98, 99]
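If you’d rather not pull in a library, the same kind of decoding can be sketched by hand. This is only an illustration of the idea, not Punycode.js’s actual source, and `decodeCodePoints` is a made-up name:

```javascript
// Hypothetical helper: collect the code points of a string,
// combining each high/low surrogate pair into a single value.
function decodeCodePoints(string) {
  const output = [];
  let i = 0;
  while (i < string.length) {
    const unit = string.charCodeAt(i++);
    // A high surrogate followed by a low surrogate forms one code point.
    if (unit >= 0xD800 && unit <= 0xDBFF && i < string.length) {
      const next = string.charCodeAt(i);
      if (next >= 0xDC00 && next <= 0xDFFF) {
        output.push(((unit - 0xD800) << 10) + (next - 0xDC00) + 0x10000);
        i++;
        continue;
      }
    }
    // BMP characters and unpaired surrogates are passed through unchanged.
    output.push(unit);
  }
  return output;
}

console.log(decodeCodePoints('abc'));          // [ 97, 98, 99 ]
console.log(decodeCodePoints('\uD801\uDC00')); // [ 66560 ], i.e. 0x10400
```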
If you don’t need to do it programmatically though, and you’ve already got the character, just use mothereff.in/js-escapes. It will tell you how to escape any character in JavaScript.