
I use "".charCodeAt(pos) to get the Unicode number for a strange character, and then String.fromCharCode for the reverse.

But I'm having problems with characters that have a Unicode number greater than 55349. For example, take the Blackboard Bold characters. If I want the lowercase Blackboard Bold X (𝕩), which has a Unicode number of 120169, and I alert the code from JavaScript:

alert(String.fromCharCode(120169));

I get another character. The same thing happens if I log an uppercase Blackboard Bold X (𝕏), which has a Unicode number of 120143, from directly within JavaScript:

s="𝕏";
alert(s.charCodeAt(0));
alert(s.charCodeAt(1));

Output:

55349
56655

Is there a method to work with these kinds of characters?

Nat Riddle
gialloporpora
    possible duplicate of [JavaScript strings outside of the BMP](http://stackoverflow.com/questions/3744721/javascript-strings-outside-of-the-bmp) – Tim Down Jan 22 '13 at 16:11

1 Answer


Internally, JavaScript stores strings in a 16-bit encoding resembling UCS-2 and UTF-16. (I say resembling, since it’s really neither of those two.) Because each unit is 16 bits wide, a character outside the BMP, with a code point above 65535, is split into two 16-bit units (a surrogate pair), each of which JavaScript treats as a separate character. If you store the two units separately and recombine them later, you should get the original character back without a problem.
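As a rough sketch of that splitting and recombining, using the uppercase Blackboard Bold X from the question: the helper names `toSurrogatePair` and `fromSurrogatePair` are hypothetical (they are not built-ins and do not come from the blog post), and the arithmetic is the standard UTF-16 surrogate calculation.

```javascript
// Hypothetical helpers; the names are illustrative only.

// Split a code point above 0xFFFF into its two 16-bit code units.
function toSurrogatePair(codePoint) {
  var offset = codePoint - 0x10000;
  var high = 0xD800 + (offset >> 10);   // top 10 bits
  var low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits
  return [high, low];
}

// Recombine a high/low surrogate pair into the original code point.
function fromSurrogatePair(high, low) {
  return ((high - 0xD800) << 10) + (low - 0xDC00) + 0x10000;
}

var pair = toSurrogatePair(120143);             // [55349, 56655]
var s = String.fromCharCode(pair[0], pair[1]);  // the two units together form 𝕏
var codePoint = fromSurrogatePair(s.charCodeAt(0), s.charCodeAt(1)); // 120143
```

Passing both units to `String.fromCharCode` in one call is what produces the intended character; passing 120143 directly does not work because the argument is truncated to 16 bits.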

Recognizing that you have such a character can be rather tricky, though.
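One way to recognize such a character, sketched under the assumption that you only have `charCodeAt` available, is to check whether a unit falls in the high-surrogate range (0xD800 to 0xDBFF) and is followed by a unit in the low-surrogate range (0xDC00 to 0xDFFF). The function name `codePointsOf` is hypothetical.

```javascript
// Hypothetical helper: list the code points in a string, pairing up surrogates.
function codePointsOf(str) {
  var result = [];
  for (var i = 0; i < str.length; i++) {
    var unit = str.charCodeAt(i);
    // A high surrogate followed by a low surrogate encodes one
    // character outside the BMP.
    if (unit >= 0xD800 && unit <= 0xDBFF && i + 1 < str.length) {
      var next = str.charCodeAt(i + 1);
      if (next >= 0xDC00 && next <= 0xDFFF) {
        result.push(((unit - 0xD800) << 10) + (next - 0xDC00) + 0x10000);
        i++; // skip the low surrogate that was just consumed
        continue;
      }
    }
    result.push(unit); // BMP character, or an unpaired surrogate left as-is
  }
  return result;
}

var points = codePointsOf("a\uD835\uDD4Fb"); // [97, 120143, 98]
```

Unpaired surrogates are passed through unchanged here; whether to reject them instead depends on the application.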

Mathias Bynens has written a blog post about this: JavaScript’s internal character encoding: UCS-2 or UTF-16?. It’s very interesting (though a bit arcane at times), and concludes with several references to code libraries that support the conversion from UCS-2 to UTF-16 and vice versa. You might be able to find what you need in there.

Martijn
    +1 also you can use `s.codePointAt(0)` in modern browsers now! https://developer.mozilla.org/en/docs/Web/JavaScript/Reference/Global_Objects/String/codePointAt – Joël Jul 29 '16 at 09:19
  • @Joël you should submit that as an answer – ricka Feb 19 '20 at 00:53
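
A short sketch of the `codePointAt` approach mentioned in the comments, assuming an ES2015-capable environment; `String.fromCodePoint` is its counterpart for the reverse direction.

```javascript
// Assumes an ES2015+ environment (modern browsers, Node.js).
var s = String.fromCodePoint(120143); // uppercase Blackboard Bold X
var first = s.codePointAt(0);         // 120143, read across both code units
var length = s.length;                // 2, since length still counts 16-bit units
var chars = Array.from(s);            // one element: iteration goes by code point
```

Note that `s.codePointAt(1)` still returns the lone low surrogate (56655), so indexes continue to refer to 16-bit units.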