195

I need to insert an Omega (Ω) onto my HTML page. I am using its HTML escape code to do that, so I can write &#937; and get Ω. That's all fine and well when I put it into an HTML element; however, when I try to put it into my JS, e.g. var Omega = &#937;, it parses that code as JS and the whole thing doesn't work. Anyone know how to go about this?

Greg Hewgill
Bluefire
  • 16
    `var Omega = "Ω";` too simple? – Heretic Monkey Oct 26 '12 at 19:24
  • 8
    Notepad doesn't accept that, it just writes an O :P – Bluefire Oct 26 '12 at 19:26
  • 14
    @MikeMcCaughan Yes, but if another developer messes with the encoding of the source file, you have lost... There will always be someone who says "OOOOps, I did not know that uses CP1250 as the default encoding and I did not notice that small change when committing" or "character enco-what?" ;=) – Samuel Rossille Oct 26 '12 at 19:28
  • 3
    @Bluefire Switch to a better text editor that supports setting the character encoding (e.g. Notepad++) and set it to UTF-8. Then you can write Chinese in your source code if you want... Or stay in the category of people targeted by my first comment ;=) http://en.wikipedia.org/wiki/Character_encoding – Samuel Rossille Oct 26 '12 at 19:31
  • @SamuelRossille; too true, I had forgotten that we should increase code complexity to work around lack of tool knowledge. I did copy and paste my comment into Notepad (on Win7, mind you), and it worked fine. You can set the Encoding, as you so rightly pointed out. – Heretic Monkey Oct 26 '12 at 19:35
  • @MikeMcCaughan I meant Notepad++ – Bluefire Oct 26 '12 at 19:43
  • 4
    @Bluefire, Notepad++ should handle it fine, you just need to change the Encoding in the menu to UTF-8 or UCS-2. – Heretic Monkey Oct 26 '12 at 19:46
  • @SamuelRossille I have done that to myself more than once :( Not sure if it was my diff viewer or some random text editor. Completely killed my SVN Blame history as well when I did it. – basher Jan 13 '16 at 20:55
  • 3
    Stick to what @SamuelRossille wrote: use only escaped Unicode chars. They are plain ASCII and can be opened/edited in every editor and command line, sent with any protocol, etc. It's the safest way. If you have a lot of Unicode text, you can use a tool like https://mothereff.in/js-escapes to escape it before pasting it into the file. – StanE Nov 27 '17 at 02:39

6 Answers

279

I'm guessing that you actually want Omega to be a string containing an uppercase omega? In that case, you can write:

var Omega = '\u03A9';

(Because Ω is the Unicode character with codepoint U+03A9; that is, 03A9 is 937 written as four hexadecimal digits.)

Edited to add (in 2022): There now exists an alternative form that better supports codepoints above U+FFFF:

let Omega = '\u{03A9}';
let desertIslandEmoji = '\u{1F3DD}';

Judging from https://caniuse.com/mdn-javascript_builtins_string_unicode_code_point_escapes, most or all browsers added support for it in 2015, so it should be reasonably safe to use.
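Both forms are easy to sanity-check in a browser console or Node:

```javascript
// The old four-digit form and the newer braced form produce the same string.
var Omega = '\u03A9';
var OmegaBraced = '\u{03A9}';

console.log(Omega === 'Ω');                      // true
console.log(Omega === OmegaBraced);              // true
console.log(Omega.codePointAt(0).toString(16));  // "3a9"
```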

ruakh
  • 6
    And if one wants to find out what the hexadecimal value for a unicode string is: https://mothereff.in/js-escapes – StanE Nov 27 '17 at 02:47
  • 1
    Another way of deriving the hexadecimal value for a unicode string from within JavaScript is: "Ω".codePointAt(0).toString(16); – Kostas Minaidis Jun 05 '20 at 11:39
  • Is there no way to use the original code, 937? – Richard Jul 11 '22 at 14:36
  • @Richard: I'm not sure why you describe the decimal representation as "original"; the hexadecimal representation is, and has always been, the usual one. But if you want to use the decimal representation . . . there's no specific notation for that, *but* you can use `String.fromCharCode`, which takes its arguments as numbers, so doesn't know or care how they're represented: `var Omega = String.fromCharCode(937);` – ruakh Jul 11 '22 at 16:04
  • @ruakh Because the original code format in the question provides the full range of values in unicode afaik, and is what is generally provided on sites like : https://unicode.org/emoji/charts/full-emoji-list.html#1f636. The slash notation Javascript uses seems more limited and complicated. – Richard Jul 11 '22 at 16:10
  • @Richard: You seem to have misunderstood something. The site that you link to provides the hexadecimal representation, *not* the decimal representation, of the codepoint. – ruakh Jul 11 '22 at 16:53
  • Correct. And it doesn't work in Javascript if you use , just like in this question. – Richard Jul 11 '22 at 17:08
  • 1
    @Richard: Sorry, I don't understand what you're getting at. In what way is the JavaScript notation, `\u03A9`, "more limited and complicated" than the HTML and XML notation, `Ω`? I think it's just that you've gotten used to the latter, so it seems simple, and anything else seems different and strange and complicated. – ruakh Jul 11 '22 at 17:23
  • Thank you. Maybe I misunderstood what I read, but I thought the slashes limit you to low planes. – Richard Jul 11 '22 at 17:45
  • 1
    @Richard: Oh, I see what you're getting at! Yes, the original `\u03A9` notation only supports U+0000 through U+FFFF. But there's a newer notation, `\u{03A9}`, that supports arbitrary codepoints, and has been supported in all browsers since 2015. I'll edit this answer to mention that. – ruakh Jul 11 '22 at 17:52
60

Although @ruakh gave a good answer, I will add some alternatives for completeness:

You could in fact even use var Omega = '&#937;' in JavaScript, but only if your JavaScript code is:

  • inside an event attribute, as in onclick="var Omega = '&#937;'; alert(Omega)", or
  • in a script element inside an XHTML (or XHTML + XML) document served with an XML content type.

In these cases, the code will first be parsed by an HTML parser (before being passed to the JavaScript interpreter), so that character references like &#937; are recognized. These restrictions make this an impractical approach in most cases.
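If you have such a character reference in a string at runtime, you can also decode it yourself rather than relying on an HTML-parsed context. A sketch (decodeNumericRef is a hypothetical helper, not a standard API):

```javascript
// Decode numeric character references like "&#937;" or "&#x3A9;"
// without relying on the HTML parser (works in Node and browsers).
function decodeNumericRef(ref) {
  var m = /^&#(x?)([0-9a-fA-F]+);$/.exec(ref);
  if (!m) throw new Error('not a numeric character reference: ' + ref);
  return String.fromCodePoint(parseInt(m[2], m[1] ? 16 : 10));
}

console.log(decodeNumericRef('&#937;'));   // "Ω" (decimal form)
console.log(decodeNumericRef('&#x3A9;'));  // "Ω" (hexadecimal form)
```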

You can also enter the Ω character as such, as in var Omega = 'Ω', but then the character encoding must allow it, the encoding must be properly declared, and you need software that lets you enter such characters. This is a clean solution and quite feasible if you use UTF-8 encoding for everything and are prepared to deal with the issues that creates. The source code will be readable, and reading it, you immediately see the character itself instead of a code notation. On the other hand, it may cause surprises if other people start working with your code.

Using the \u notation, as in var Omega = '\u03A9', works independently of character encoding, and it is in practice almost universal. However, it can be used as such only up to U+FFFF, i.e. up to \uffff, though most characters that most people have ever heard of fall into that range. (If you need “higher” characters, you need to use either surrogate pairs or one of the two approaches above.)

You can also construct a character using the String.fromCharCode() method, passing the Unicode number as a parameter, either in decimal, as in var Omega = String.fromCharCode(937), or in hexadecimal, as in var Omega = String.fromCharCode(0x3A9). This works up to U+FFFF. This approach can be used even when you have the Unicode number in a variable.
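The surrogate-pair route for “higher” characters, and the variable-driven use of String.fromCharCode, can be sketched like this:

```javascript
// A character above U+FFFF must be written as two \u escapes
// (a UTF-16 surrogate pair) in the old notation. U+1F3DD as an example:
var island = '\uD83C\uDFDD';
console.log(island === '\u{1F3DD}');  // true
console.log(island.length);           // 2 (two UTF-16 code units)

// String.fromCharCode accepts the code in a variable, decimal or hex.
var cp = 0x3A9;  // 937
var Omega = String.fromCharCode(cp);
console.log(Omega === '\u03A9');      // true
```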

Seanny123
Jukka K. Korpela
  • 31
    Times have changed now, 5 years later, people use these things called "emoji" outside of the `U+FFFF` range. JavaScript has too, so you can do this. `var poop = '\u{1F4A9}';` – ReinstateMonica3167040 Oct 14 '17 at 00:40
  • 2
    @Userthatisnotauser and _that_ should be the accepted answer! – Marten Koetsier Apr 12 '18 at 10:19
  • How you can insert the 'open lock' character '\uD83D\uDD13' using the one code that is '0x1F512' in JavaScript? And why we need two codes to insert one character? – tarekahf May 14 '18 at 17:34
  • 8
    @tarekahf Here's a brief lesson on Unicode. UTF-16 only spanned Unicode points U+0000 to U+FFFF. Then Unicode grew and surrogates were invented so UTF-16 could access those points. But JavaScript can just do this now: `var lock = '\u{1F512}'` And you get this: – ReinstateMonica3167040 Jul 17 '18 at 01:26
14

One option is to put the character literally in your script, e.g.:

const omega = 'Ω';

This requires that you let the browser know the correct source encoding; see Unicode in JavaScript.

However, if you can't or don't want to do this (e.g. because the character is too exotic and can't be expected to be available in the code editor font), the safest option may be to use new-style string escape or String.fromCodePoint:

const omega = '\u{3a9}';

// or:

const omega = String.fromCodePoint(0x3a9);

This is not restricted to the BMP (the first 65,536 code points) but works for all Unicode code points. In comparison, the other approaches mentioned here have the following downsides:

  • HTML escapes (const omega = '&#937';): only work when rendered unescaped in an HTML element
  • old-style string escapes (const omega = '\u03A9';): restricted to the BMP (up to U+FFFF)
  • String.fromCharCode: restricted to the BMP (up to U+FFFF), unless you pass surrogate pairs yourself
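The difference shows up as soon as you leave the BMP; a quick comparison (runnable in Node or any modern browser):

```javascript
// fromCodePoint accepts any code point; fromCharCode only takes
// UTF-16 code units, so astral characters need a surrogate pair.
const lock = String.fromCodePoint(0x1F512);

console.log(lock === '\u{1F512}');                          // true
console.log(String.fromCharCode(0x1F512) === lock);         // false (value truncated to 16 bits)
console.log(String.fromCharCode(0xD83D, 0xDD12) === lock);  // true
```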
coldfix
5

The answer is correct, but you don't need to declare a variable. A string can contain your character:

"This string contains omega, that looks like this: \u03A9"

Unfortunately, these ASCII escape codes are still needed to display Unicode characters reliably, but I am still waiting (after too many years...) for the day when UTF-8 is as universal as ASCII once was, and ASCII is just a remembrance of the past.

fresko
1

I found this question when trying to implement a font-awesome style icon system in html. I have an API that provides me with a hex string and I need to convert it to unicode to match with the font-family.

Say I have the string const code = 'f004'; from my API. I can't do simple string concatenation (const unicode = '\u' + code;), since escape sequences are resolved at parse time; this will in fact cause a syntax error if you try.

@coldfix mentioned using String.fromCodePoint but it takes a number as an argument, not a string.

To finally cross the finish line, just add parseInt, passing 16 (since hex is base 16) as its second argument. You'll get a Unicode string from a simple hex string.

This is what I did:

const code = 'f004';
const toUnicode = code => String.fromCodePoint(parseInt(code, 16));

toUnicode(code);
// => '\uf004'
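Since the helper uses String.fromCodePoint, it also handles hex codes above FFFF; for example (assuming the same toUnicode as above):

```javascript
const toUnicode = code => String.fromCodePoint(parseInt(code, 16));

console.log(toUnicode('f004') === '\uf004');      // true
console.log(toUnicode('3a9'));                    // "Ω"
console.log(toUnicode('1f512') === '\u{1F512}');  // true
```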
benja2729
0

Try using Function(), like this:

var code = "2710"
var char = Function("return '\\u"+code+"';")()

It works well, as long as code contains nothing but hex digits (no quotes or spaces).

In the example, char is "✐".
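The same result can be had without generating code at runtime, reusing the parseInt approach from the previous answer (a sketch):

```javascript
// Parse the hex string and build the character directly,
// avoiding Function()/eval-style code generation.
var code = "2710";
var ch = String.fromCharCode(parseInt(code, 16));

console.log(ch === '\u2710');  // true; ch is "✐"
```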