Having trouble getting foreign characters and Emoji to display.

Edited to clarify

A user types an emoji character into a text field, which is then sent to the server (PHP) and saved into the database (MySQL). When displaying the text, we fetch a JSON-encoded string from the server, which is parsed and displayed on the client side.

QUESTION: the character for a "trophy" emoji saved in the DB reads as

%uD83C%uDFC6

When that is sent back to the client we don't see the emoji picture; instead we see the raw encoded text.

How would we get the client side to read that text as an emoji character and display the image?

(all on an iPhone / Mobile Safari)

Thanks!

PotatoFro
  • @hakre Sorry for the confusion! I've edited the question to clearly state my meaning. Thanks! – PotatoFro Aug 08 '12 at 02:38
  • If that's what gets put into your database then something in the pipe is broken, and you need to trace it to find out what. Also, `utf8mb4`. – Ignacio Vazquez-Abrams Aug 08 '12 at 02:40
  • @IgnacioVazquez-Abrams I just stumbled onto a [Stack Overflow post](http://stackoverflow.com/questions/2708958/differences-between-utf8-and-latin1) about utf8 and utf8mb4 which says utf8mb4 is only available as of MySQL 5.5 (I'm on 5.1). Any way to do this on 5.1? – PotatoFro Aug 08 '12 at 02:46
  • @PotatoFro: Try with binary strings. But **first** try to display that character independently of the database. Do a mock, see if you *can* display these emoji images at all with plain text. – hakre Aug 08 '12 at 10:47

3 Answers

1

Check the encodings used by your client, your web server, and your database table. Make sure they are all using encodings that can handle the characters you are concerned about.

Trott
1

Looks like the problem is my MySQL encoding. utf8mb4 would allow it, but unfortunately it's unavailable before MySQL 5.5.

PotatoFro
0

the character for a "trophy" emoji saved in the DB reads as %uD83C%uDFC6

Then your data are already mangled. `%u` escapes are specific to the JavaScript `escape()` function, which should generally never be used. Make sure the path from your textarea to PHP uses standards-compliant encoding, e.g. `encodeURIComponent()` if you need to get a JS variable into a URL query.
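To make the difference concrete, here is a minimal sketch comparing the two functions on the trophy emoji from the question. The helper `decodeLegacyEscape` is a hypothetical name, not part of any library; it is only an assumption about how one might repair values that were already stored in `%uXXXX` form.

```javascript
// The trophy emoji, U+1F3C6, is a surrogate pair in a JavaScript string.
const trophy = "\uD83C\uDFC6";

// Legacy escape() emits one non-standard %uXXXX token per UTF-16 code unit.
const legacy = escape(trophy);

// encodeURIComponent() percent-encodes the character's UTF-8 bytes instead.
const standard = encodeURIComponent(trophy);

// Hypothetical helper: turn stored %uXXXX tokens back into real characters.
// It only touches %uXXXX tokens and leaves all other text unchanged.
function decodeLegacyEscape(text) {
  return text.replace(/%u([0-9a-fA-F]{4})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

console.log(legacy);   // %uD83C%uDFC6
console.log(standard); // %F0%9F%8F%86
console.log(decodeLegacyEscape(legacy) === trophy); // true
```

The `%F0%9F%8F%86` form is ordinary percent-encoding, so standard URL decoding on the server turns it back into the raw UTF-8 bytes; the `%u` form is not part of that standard, which is why it ends up stored as literal text.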

Then, having proper raw UTF-8 strings in your PHP layer, you can worry about getting MySQL to store characters like the emoji that are outside of the Basic Multilingual Plane. The best way is columns using the `utf8mb4` character set; if that is not available, try binary columns, which will allow you to store any byte sequence (treating it as UTF-8 when it comes back out). With binary columns, however, you won't get case-insensitive comparisons.
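As a rough illustration of why these characters are a problem for MySQL's older `utf8` character set, which stores at most three bytes per character, the sketch below counts the UTF-8 bytes per code point. The function names `utf8ByteLengths` and `needsUtf8mb4` are hypothetical, and the code assumes an environment where `TextEncoder` is available (current browsers and recent Node.js versions).

```javascript
// Count the UTF-8 bytes needed for each code point in a string.
function utf8ByteLengths(text) {
  const encoder = new TextEncoder(); // TextEncoder always produces UTF-8
  // Array.from iterates by code point, so a surrogate pair stays together.
  return Array.from(text, (ch) => encoder.encode(ch).length);
}

// True if the string contains any code point outside the
// Basic Multilingual Plane (above U+FFFF), i.e. a 4-byte UTF-8 sequence.
function needsUtf8mb4(text) {
  return Array.from(text).some((ch) => ch.codePointAt(0) > 0xffff);
}

// "a", "é" (U+00E9), "€" (U+20AC), trophy emoji (U+1F3C6)
console.log(utf8ByteLengths("a\u00E9\u20AC\uD83C\uDFC6")); // [ 1, 2, 3, 4 ]
console.log(needsUtf8mb4("caf\u00E9"));                    // false
console.log(needsUtf8mb4("\uD83C\uDFC6"));                 // true
```

A check like `needsUtf8mb4` could be used to flag input that a three-byte `utf8` column cannot hold, before deciding between `utf8mb4` and a binary column.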

bobince
  • Awesome... switched from escape() to encodeURIComponent() and all is well. Seems like the "text" datatype in MySQL 5.1 works fine. – PotatoFro Aug 10 '12 at 20:51