
Let's say I have the following basic HTML page

<html>
  <head>
    <script src="http://ajax.googleapis.com/ajax/libs/jquery/1/jquery.min.js"></script>    
    <meta charset=utf-8 />
    <title>JS Bin</title>
  </head>
  <body>
    \u00f2
  </body>
</html>

When the page renders, what I see is \u00f2, whereas I was expecting ò. And here comes the big "but": with the following JavaScript code, what I see is the ò character (2 seconds later).

$(function(){
  window.setTimeout(function(){
    $("body").html("\u00f2");
  }, 2000);
});

My question is: why is this happening? I am aware that rather than rendering the Unicode code points, I could convert them to HTML entities and render the correct character directly. The question is more for learning purposes.

Here is the jsbin

dda
Bahadir Cambel

2 Answers


This is because \u00f2 is not valid HTML markup for Unicode characters. The proper HTML markup is &#x00f2;. All you need to do is replace \u with &#x (and append a semicolon) and you should be fine.

If you want to know why jQuery uses \u, it is because JavaScript uses \u to designate Unicode characters. You can read more here: jquery .text() and unicode.

In short, use \u in JavaScript and &#x…; in HTML, and don't switch them around or you'll run into problems (such as what is happening here).
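The difference is easy to see in plain JavaScript, without any DOM involved (a small sketch):

```javascript
// In a JavaScript string literal, \u00f2 is an escape sequence:
// the parser replaces it with the single character "ò".
const fromEscape = "\u00f2";
console.log(fromEscape);          // ò
console.log(fromEscape.length);   // 1

// In HTML source, the same six characters \u00f2 are just text.
// Markup needs a character reference instead, and to JavaScript
// that reference is nothing but eight ordinary characters.
const htmlRef = "&#x00f2;";
console.log(htmlRef.length);      // 8
```

Only the HTML parser gives `&#x00f2;` its special meaning, and only the JavaScript parser gives `\u00f2` its special meaning.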

Community
cegfault

It happens because in HTML, \u00f2 is just a sequence of six characters; the backslash \ never has any special meaning in HTML. In JavaScript strings, \u00f2 has a special meaning: it denotes the Unicode code unit with hexadecimal number 00f2, i.e. the character “ò”.

Conversely, although you can use &#x00f2; in HTML to denote “ò”, you cannot do that in JavaScript, though you could use functions that convert &#x00f2; (which is just a sequence of eight characters from the JavaScript point of view) to “ò”. Moreover, if your JavaScript code appears as embedded in HTML in a script element or in an event attribute, then browsers may, depending on certain rules, first interpret &#x00f2; by HTML rules before invoking the JavaScript interpreter.

In HTML documents, the modern, generally recommendable method is to enter the characters directly, using the UTF-8 encoding. You can do the same in JavaScript, too, e.g. $("body").html("ò"). However, this is sometimes avoided due to assumed or real complications in specifying the character encoding.

Jukka K. Korpela