Suppose we have a UTF-8 string, given as a string of hex digits for its bytes, that might include an emoji or any other Unicode character. How do we represent that string as a literal in JavaScript, for example to pass it to the alert function? In PHP, there are two easy ways: (1) "\xE2\x96\xB6" and (2) hex2bin('E296B6'). I'm having trouble doing the same thing in pure JavaScript: '\xE2\x96\xB6' doesn't seem to work (in an alert, it displays a paragraph mark instead of a solid right-pointing triangle).
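To make the symptom concrete, here is a small sketch of what the literal seems to contain. As far as I can tell, each \xHH escape in a JavaScript string produces one UTF-16 code unit with that value rather than one byte of a UTF-8 sequence; the variable names are mine, just for illustration.

```javascript
// Each \xHH escape becomes one UTF-16 code unit, not one UTF-8 byte.
const s = '\xE2\x96\xB6';

// Collect the code units as uppercase hex to see what the string really holds.
const codeUnits = Array.from(s, (ch) => ch.charCodeAt(0).toString(16).toUpperCase());

console.log(s.length);   // 3
console.log(codeUnits);  // [ 'E2', '96', 'B6' ]
```

So the string holds three separate characters, U+00E2, U+0096, and U+00B6 (the paragraph mark), instead of the single character U+25B6.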
I thought of writing a 'hex2bin' function that converts its hex argument to a byte string, but as far as I can tell JavaScript has no such string datatype. In PHP, strings can contain any bit patterns, but I don't think this is true for JavaScript.
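For reference, one possible sketch of such a helper. It assumes the Uint8Array and TextDecoder built-ins are available (modern browsers and Node.js), and it returns an ordinary JavaScript string rather than a byte string. The name hexToUtf8String is made up for illustration; it is not a built-in.

```javascript
// Hypothetical helper: interpret a string of hex digits as UTF-8 bytes
// and decode them into a regular JavaScript string.
function hexToUtf8String(hex) {
  if (hex.length % 2 !== 0 || /[^0-9a-fA-F]/.test(hex)) {
    throw new Error('expected an even number of hex digits');
  }
  // Turn each pair of hex digits into one byte.
  const bytes = new Uint8Array(hex.length / 2);
  for (let i = 0; i < bytes.length; i++) {
    bytes[i] = parseInt(hex.slice(2 * i, 2 * i + 2), 16);
  }
  // fatal: true makes malformed UTF-8 throw instead of yielding U+FFFD.
  return new TextDecoder('utf-8', { fatal: true }).decode(bytes);
}

const triangle = hexToUtf8String('E296B6');
console.log(triangle === '\u25B6'); // true
```

In a browser, the result could then be passed straight to alert, as in alert(hexToUtf8String('E296B6')).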
I know that JavaScript is a modern language that supports Unicode, so there must be an easy way to do this.
Note that any answer that relies on the \u construct is wrong, since \u does not accept UTF-8 bytes. UTF-8 is currently the standard, and it is recommended for most storage of character strings, yet most programming languages do not yet offer a simple literal syntax for UTF-8 byte strings.
When programmers talk about low-level representations for Unicode, they are frequently interested in UTF-8, since it is the standard and an efficient encoding. UTF-16 and Unicode code points (and the many odd encodings) are of interest, particularly for naming characters (U+HHHH notation) and for representing them in fixed widths. But it is UTF-8 that is the standard, and we need more answers on Stack Overflow to help us with UTF-8.