At the moment I have the following Unicode string stored in a database:

\u4ece\u4e0b\u5468\u4e00\u8d77\uff0c\u5947\u5f02

If I send it out through PHP to the Apple APNS server, it is displayed correctly on iPhones. However, if I want to display these (Chinese) characters on a website, they are not shown as Chinese characters but literally as \u4ece\u4e0b\u5468\u4e00\u8d77\uff0c\u5947\u5f02

Can anyone help me display them correctly?

HansStam
    You need to add more detail. How are you outputting the characters on your web site, what encoding is your web site using... – Pekka Nov 01 '12 at 07:46

1 Answer

That's not "Unicode"; those are Unicode escape sequences. This: "下" is a Unicode character. This: "\u4e0b" is the string "backslash you four ee zero bee" [1]. If you put that escape sequence exactly like that into JSON, it resolves to the correct character when the JSON is decoded, because JSON happens to use that escape syntax. That hints at another problem though, which is that you are creating your JSON by hand like this:

$apns = "{\"message\":\"$unicodeEscape\"}";

Don't do that. Make a native array in your programming language of choice and JSON-encode it:

$apns = json_encode(array('message' => '从下周一起，奇异'));

If you did this with the data as it is currently stored, the string would show up as "\u4ece..." on the iPhone as well, because json_encode escapes the backslashes so that the original string content is preserved exactly.
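To make the difference concrete, here is a small sketch comparing hand-built JSON with json_encode. The variable names and the shortened sample value are made up for illustration; it assumes the database value is literal backslash-u text, as in the question.

```php
<?php
// Hypothetical value as it currently sits in the database:
// literal backslash-u text, not real Chinese characters.
$stored = '\u4ece\u4e0b';

// Hand-built JSON: the escapes land in the JSON source untouched,
// so a JSON parser turns them into real characters.
$handBuilt = "{\"message\":\"$stored\"}";
$fromHandBuilt = json_decode($handBuilt, true);

// json_encode() escapes the backslashes so the text survives verbatim.
$encoded = json_encode(array('message' => $stored));
$fromEncoded = json_decode($encoded, true);

echo $fromHandBuilt['message'], "\n"; // the two Chinese characters
echo $encoded, "\n";                  // {"message":"\\u4ece\\u4e0b"}
echo $fromEncoded['message'], "\n";   // \u4ece\u4e0b
```

So the hand-built version only "works" because the escapes are being interpreted as JSON syntax rather than as data.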

To HTML, those escape sequences don't mean anything special in the first place; they certainly don't stand for Chinese characters.

Store the actual Chinese characters in your database encoded in, for example, UTF-8, not an escape sequence which is only relevant in certain contexts.
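As a rough sketch of the web side, assuming the column really holds UTF-8 text and the database connection is set to UTF-8 as well: the page only has to declare its encoding and HTML-escape the value. renderPage is a made-up helper, not something from your code.

```php
<?php
// Hypothetical value read from the database, stored as real UTF-8 text.
$message = '从下周一起，奇异';

// Build a minimal page that declares UTF-8 so the browser
// decodes the bytes as Chinese characters.
function renderPage($text) {
    // Escape HTML special characters; the Chinese characters pass through as-is.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    return "<!DOCTYPE html>\n"
         . "<html>\n"
         . "<head><meta charset=\"UTF-8\"></head>\n"
         . "<body><p>$safe</p></body>\n"
         . "</html>\n";
}

echo renderPage($message);
```

Sending a matching Content-Type header with charset=UTF-8 in addition to the meta tag is a sensible extra step.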

I'd recommend you read most articles on http://kunststube.net for more detailed information.


Since they're apparently JSON escapes, the easiest way to convert them back from the format they're currently in should be to parse them as JSON:

$string = json_decode("\"$string\"");

Of course, that only works if the string doesn't contain anything that would make the JSON syntax invalid, such as an unescaped " character. Otherwise, you can adapt this solution.
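One possible way around the quoting problem, as a sketch rather than a definitive fix: decode only the runs of \uXXXX escapes and leave everything else alone. decodeUnicodeEscapes is a hypothetical helper name, and it assumes the stored text never contains an escaped backslash in front of a "u" that is meant literally.

```php
<?php
// Hypothetical helper: decode literal \uXXXX escapes without requiring
// the whole string to be valid JSON.
function decodeUnicodeEscapes($text) {
    return preg_replace_callback(
        '/(?:\\\\u[0-9a-fA-F]{4})+/',
        function ($match) {
            // Decode the whole run at once so surrogate pairs stay together.
            $decoded = json_decode('"' . $match[0] . '"');
            // Leave the text untouched if the run cannot be decoded
            // (for example, an unpaired surrogate).
            return $decoded === null ? $match[0] : $decoded;
        },
        $text
    );
}

// Sample value with a quote character that would break the plain json_decode approach.
$stored = '\u4ece\u4e0b\u5468\u4e00\u8d77\uff0c\u5947\u5f02 "quoted"';
echo decodeUnicodeEscapes($stored), "\n";
```

This lets you convert on output without touching the live data, though storing real UTF-8 remains the cleaner long-term option.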


[1] (That string is also made up of "Unicode characters", because each of these characters can be represented by Unicode.)

deceze
  • Thanks deceze, but I already have the characters stored in the DB this way, and the system is live. I do not want to change the data in the database (it would create a big mess and more problems with the client). Is there no way I can convert these characters back into Unicode? – HansStam Nov 02 '12 at 00:34