5

Supposing I have a string of code like so:

\u00e5\u00b1\u00b1\u00e4\u00b8\u008a\u00e7\u009a\u0084\u00e4\u00ba\u00ba

How would I convert these back into Chinese characters using JavaScript:

山上的人

This is so that I can actually display Chinese on my web page. Right now it comes out as å±±ä¸ç人.

This website manages to do this, but it does so with server-side PHP that isn't exposed.

I'm not at all familiar with how character encoding works, so I don't even know the terminology to search for a proper solution.

dthree
  • Possible duplicate of https://stackoverflow.com/questions/3745666/how-to-convert-from-hex-to-ascii-in-javascript – Devan Buggay Jan 15 '18 at 02:20
  • The code from that post doesn't do the trick. – dthree Jan 15 '18 at 02:21
  • Ah perhaps this one is more relevant: https://stackoverflow.com/questions/21647928/javascript-unicode-string-to-hex – Devan Buggay Jan 15 '18 at 02:24
  • That makes it come out as `å±±ä¸ç人` in the browser. I'm trying to get this unicode to render on the page as Chinese. – dthree Jan 15 '18 at 02:27
  • 2
    @dthree, that is the result of literally converting each escape sequence (\u00e5) to a code point (229) and then to a character (å). – RobG Jan 15 '18 at 03:40
  • 1
    JavaScript internally only works with UCS-2 (where 山 is `\u5C71`), and doesn't understand UTF-8 (`\xE5\xB1\xB1`). Moreover, `\u00e5\u00b1\u00b1` is likely wrong because of the extra zeroes. It would be better to give it data in the proper form in the first place rather than try to transform it (but if you absolutely need to do that, Steven Tang's answer seems to be good). Where is your data coming from? – Amadan Jan 15 '18 at 05:27
  • @Amadan thanks. Yes, Steven's answer worked. Unfortunately I can't change the source data. – dthree Jan 15 '18 at 06:25

2 Answers

2

The string appears to be UTF-8, with each byte of the encoded text written as a separate \u00XX escape.

https://github.com/mathiasbynens/utf8.js is a helpful JavaScript library that saves you the headache of learning the UTF-8 standard and will decode the UTF-8 into text.

Here's a demo: https://mothereff.in/utf-8

Paste \u00e5\u00b1\u00b1\u00e4\u00b8\u008a\u00e7\u009a\u0084\u00e4\u00ba\u00ba into the "UTF-8-encoded" textarea to decode it.
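If adding a library isn't an option, the same decoding can probably be done with the built-in `TextDecoder`. The following is only a minimal sketch, not the library's implementation. It assumes the escapes have already been interpreted into a JavaScript string, so that every character has a code between 0 and 255 and stands for one UTF-8 byte. The names `decodeUtf8Mojibake` and `mojibake` are made up for illustration.

```javascript
// Each \u00XX escape here holds one UTF-8 byte rather than a real code point.
const mojibake = "\u00e5\u00b1\u00b1\u00e4\u00b8\u008a\u00e7\u009a\u0084\u00e4\u00ba\u00ba";

// Hypothetical helper: reinterpret a "one character per byte" string as UTF-8.
function decodeUtf8Mojibake(str) {
  // Turn every character back into the byte value it stands for.
  const bytes = Uint8Array.from(str, (ch) => {
    const code = ch.charCodeAt(0);
    if (code > 0xff) {
      // Assumption violated: this character cannot be a single byte.
      throw new RangeError("Character outside byte range: " + ch);
    }
    return code;
  });
  // fatal: true makes invalid UTF-8 throw instead of silently producing U+FFFD.
  return new TextDecoder("utf-8", { fatal: true }).decode(bytes);
}

console.log(decodeUtf8Mojibake(mojibake)); // 山上的人
```

If the data arrives as literal text containing backslashes (for example, read from a file or an API response), the escapes would need to be turned into characters first, for instance by wrapping the text in double quotes and passing it to `JSON.parse`, before calling the helper.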

Steven Tang
-1

Add <meta charset="UTF-8"> inside the <head></head> tag of your HTML file so that it will display Chinese properly. Then put the Chinese characters directly in your HTML file.

gldanoob