19

In JavaScript I can do this:

foo = "\u2669" // 1/4 note

But I can't do this:

foo = "\u1D15D" // whole note (five hex digits)

It will be interpreted as "\u1D15" followed by "D"
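A quick console check confirms that reading. This is only a sketch, and the variable name `parsed` is made up for illustration:

```javascript
// "\uXXXX" consumes exactly four hex digits, so the trailing "D" is a
// separate, ordinary character.
const parsed = "\u1D15D";

console.log(parsed.length);              // 2
console.log(parsed === "\u1D15" + "D");  // true
console.log(parsed.charAt(1));           // "D"
```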

Are there any workarounds for this?

UPDATE 2012-07-09: The proposal for ECMAScript Harmony now includes support for all Unicode characters.

itpastorn
  • The context is this small application about musical notes in Canvas that a student of mine is attempting: http://keryx.se/dev/html5/noter/noter.html (Nothing fancy. He has been programming for only 2 months, a few hours a week.) – itpastorn May 16 '12 at 09:22
  • Possible duplicate of [JavaScript strings outside of the BMP](http://stackoverflow.com/questions/3744721/javascript-strings-outside-of-the-bmp). – Frédéric Hamidi May 16 '12 at 09:22
  • [Got this on Twitter](https://twitter.com/#!/zcorpan/status/202752353986813952): "You have to use surrogate pairs" – itpastorn May 16 '12 at 22:07
  • Javascript is still stuck in Unicode 1 from before 1995. It is miserable for modern text processing. – tchrist May 26 '12 at 07:02

5 Answers

32

Try putting the code point between curly braces: '\u{1D15D}'. (This escape syntax was added in ES2015.)
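As a rough sketch of what the braces buy you, assuming an engine with ES2015 support (the variable names are illustrative only): the escape produces one code point, which is still stored as two UTF-16 code units. The `u` flag on the regular expression is what enables the same escape inside a pattern.

```javascript
const wholeNote = "\u{1D15D}";

// One code point, but two UTF-16 code units (a surrogate pair).
console.log(wholeNote.length);                      // 2
console.log(wholeNote === "\uD834\uDD5D");          // true
console.log(wholeNote.codePointAt(0).toString(16)); // "1d15d"

// With the `u` flag, the braces form is also understood in a regex.
const matchesWithFlag = /^\u{1D15D}$/u.test(wholeNote);
console.log(matchesWithFlag);                       // true
```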

Manuel Dipre
  • This is the way to go! More read up here: https://dmitripavlutin.com/what-every-javascript-developer-should-know-about-unicode/ – b00t Jan 09 '19 at 07:17
  • This doesn't work in regular expressions I don't think, unfortunately. I wish! – Lance Sep 02 '23 at 21:00
7

The MDN documentation for String.fromCharCode notes that it only handles values up to 0xFFFF, so a character above that has to be built from a surrogate pair. MDN also gives a fixed version of fromCharCode that may do what you want (reproduced below):

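function fixedFromCharCode (codePt) {
    if (codePt > 0xFFFF) {
        // Outside the BMP: encode as a UTF-16 surrogate pair.
        codePt -= 0x10000;
        // High surrogate carries the top 10 bits, low surrogate the bottom 10.
        return String.fromCharCode(0xD800 + (codePt >> 10), 0xDC00 + (codePt & 0x3FF));
    }
    else {
        // Inside the BMP: a single code unit is enough.
        return String.fromCharCode(codePt);
    }
}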

foo = fixedFromCharCode(0x1D15D);
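
To see the arithmetic concretely, here is the same split worked through for 0x1D15D. The helper name `toSurrogatePair` is made up for this sketch, and it assumes the input is a code point in the range 0x10000 to 0x10FFFF:

```javascript
// Hypothetical helper: returns the two UTF-16 code units for a code point
// outside the BMP.
function toSurrogatePair(codePoint) {
    const offset = codePoint - 0x10000;     // 0x1D15D -> 0x0D15D
    const high = 0xD800 + (offset >> 10);   // top 10 bits    -> 0xD834
    const low = 0xDC00 + (offset & 0x3FF);  // bottom 10 bits -> 0xDD5D
    return [high, low];
}

const pair = toSurrogatePair(0x1D15D);
console.log(pair[0].toString(16), pair[1].toString(16)); // d834 dd5d

const noteFromPair = String.fromCharCode(pair[0], pair[1]);
console.log(noteFromPair === "\uD834\uDD5D");            // true
```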
mpdaugherty
  • If you have the correct font installed doing `String.fromCharCode( 0x01D15D )` will render you that char. – Andre Steenveld May 16 '12 at 09:36
  • Do you have an example of a font for which that works? I'd love to see that fromCharCode now supports codes outside the BMP, but everything I read today seemed to indicate it doesn't. (Also see Frédéric Hamidi's comment above) – mpdaugherty May 16 '12 at 10:14
  • I just did a quick test in Opera and it spat out the "can't render char" character. That led me to believe that you need the font to actually render the character. – Andre Steenveld May 16 '12 at 13:38
  • That’s a horrible hack. Not being able to specify the full range of Unicode code points is a severe hardship. Javascript is still stuck in the pre-1995 world. It’s a scandal. – tchrist May 16 '12 at 16:20
  • Sorry for being late answering this. I need to get my student to install an appropriate font first, for him to try it. I will get back in a day or two and give credit if it works. – itpastorn May 16 '12 at 22:09
  • @Dre String.fromCharCode( 0x01D15D ) by itself does not work. At least not in Firefox - which is my only tested browser so far. – itpastorn May 16 '12 at 22:45
  • Downloaded the [Code2001 font](http://font.downloadatoz.com/download,33552,code2001-font-for-linux.html) and ran tests. fixedFromCharCode works. – itpastorn May 16 '12 at 22:47
  • Ref: [The algorithm explained at Wikipedia](http://en.wikipedia.org/wiki/UTF-16#Code_points_U.2B10000_to_U.2B10FFFF) – itpastorn May 17 '12 at 11:09
1

Nowadays, you can simply use String.fromCodePoint(), as documented in MDN. For instance:

> String.fromCodePoint(0x1f0a1)
"🂡"
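
A small sketch of the round trip, assuming an ES2015-capable engine (the variable names are only for illustration). `codePointAt` reads a whole code point back, and iterating a string, here via `Array.from`, walks code points rather than UTF-16 code units:

```javascript
const note = String.fromCodePoint(0x1D15D);

console.log(note.length);                      // 2 (UTF-16 code units)
console.log(note.codePointAt(0) === 0x1D15D);  // true

const text = "a" + note + "b";
const symbols = Array.from(text);              // splits by code point
console.log(text.length);                      // 4
console.log(symbols.length);                   // 3
```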
Pedro Asad
0

I did a little checking and it appears that there is no whole note near 0x2669. (table of unicode chars)

Using 0x01D15D does give me an unknown-character glyph, but that could be because I don't have a music font installed. JavaScript will try to parse as many bytes as it can, and 0x1D15D is 2.5 bytes; padding it with a 0 makes it 3 bytes and parsable.

Also this was quite handy: unicode.org/charts/PDF/U1D100.pdf
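
The padding idea can be checked directly. Per the ECMAScript specification, `String.fromCharCode` keeps only the low 16 bits of each argument, and a leading zero does not change the value of a hex literal, so a sketch like the following (variable names are illustrative) should report true for both comparisons:

```javascript
const padded = String.fromCharCode(0x01D15D);
const unpadded = String.fromCharCode(0x1D15D);

console.log(padded === unpadded);  // true: the leading zero changes nothing
console.log(padded === "\uD15D");  // true: the value is truncated to 0xD15D
console.log(padded.length);        // 1
```

The result is a single BMP character rather than the musical symbol, which is consistent with the comments above reporting that fromCharCode alone does not work.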

Andre Steenveld
0

You can use this:

function fromOutsideBMP(cp) {
  // e.g. cp = 0x01D120
  var x = cp - 0x10000;
  var top10 = 0xFFC00; // binary 11111111110000000000: upper 10 bits
  var end10 = 0x3FF;   // binary 1111111111: lower 10 bits
  var part1 = (x & top10) / 1024 + 0xD800; // high surrogate
  var part2 = (x & end10) + 0xDC00;        // low surrogate
  return String.fromCharCode(part1, part2);
}

Example:

> fromOutsideBMP(0x01d122)
  "𝄢"
dda