
Why does this work:

<p id="emoji">&#x1f604;</p>

And this doesn't:

document.getElementById("emoji").innerHTML = String.fromCharCode(parseInt('1f604', 16));
Tom Söderlund

2 Answers


A 'char' in JS terms is actually a UTF-16 code unit, not a full Unicode character. (This sad state of affairs stems from ancient times when there wasn't a difference*.) To use a character outside of the Basic Multilingual Plane you have to write it in the UTF-16-encoded form of a surrogate pair of two 16-bit code units:

String.fromCharCode(0xD83D, 0xDE04)
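
(For reference, a minimal sketch of how that surrogate pair is derived from the code point; this is not part of the original answer. The arithmetic assumes a code point above U+FFFF, and the element id is the one from the question.)

var cp = 0x1F604;
var offset = cp - 0x10000;
var high = 0xD800 + (offset >> 10);   // lead surrogate: 0xD83D
var low = 0xDC00 + (offset & 0x3FF);  // trail surrogate: 0xDE04
document.getElementById("emoji").innerHTML = String.fromCharCode(high, low);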

In ECMAScript 6 we will get some interfaces that let us deal with strings as if they were full Unicode code points, though they are incomplete and are only a façade over the String type, which is still stored as a sequence of 16-bit code units. Then we'll be able to do:

String.fromCodePoint(0x1F604)

See this question for some polyfill code to let you use this feature in today's browsers.
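
(As a rough sketch, not part of the original answer: you can feature-detect fromCodePoint and fall back to the surrogate-pair form shown above. The helper name toEmoji is made up for illustration, and it only handles a single code point above U+FFFF.)

function toEmoji(cp) {
    if (String.fromCodePoint) {
        return String.fromCodePoint(cp);
    }
    // fall back to building the surrogate pair by hand
    var offset = cp - 0x10000;
    return String.fromCharCode(0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF));
}

document.getElementById("emoji").innerHTML = toEmoji(parseInt('1f604', 16));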

(*: When I get access to a time machine I'm leaving Hitler alone and going back to invent UTF-8 earlier. UTF-16 must never have been!)

bobince

You can also use a hacky method if you don't want to include a String.fromCodePoint() polyfill in your code. It consists of creating a virtual element ...

elem = document.createElement('p')

... filling it with the working HTML from the question ...

elem.innerHTML = "&#x1f604;"

... and finally reading its value back:

value = elem.innerHTML

In short, this works because, as soon as you set the innerHTML of an HTML container, the markup is parsed and the numeric character reference is converted into the corresponding character.
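
(Putting the steps together, a small sketch; the function name decodeEntity is made up for illustration, and it assumes a hex code point string like the question's '1f604'.)

function decodeEntity(hex) {
    var elem = document.createElement('p');
    elem.innerHTML = '&#x' + hex + ';';   // the parser decodes the reference...
    return elem.innerHTML;                // ...and we read the character back
}

document.getElementById("emoji").innerHTML = decodeEntity('1f604');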

Hope I could help.

Thomas P.