9

We are developing an iPhone app that would send emoticons from iPhone to server-side PHP and insert into MySQL tables. I am doing the server-side work.

But after insert statement executed successfully, the inserted value become blank.
What I could insert into the field(varchar) correctly is text, but once including emoticons,
just the text could be inserted and the emoticons would be cut automatically.

Someone give me advice about set the field type to Blog so that it could store image data.
But the inserted value is not always including emoticons case and size is small.

*I am using mysql_real_escape_string for inserting value.

Mathias Bynens
  • 144,855
  • 52
  • 216
  • 248
tech_me
  • 540
  • 2
  • 7
  • 20

2 Answers2

23

Most iOS emojis use code points above the Basic Multilingual Plane of the Unicode table. For example, (SMILING FACE WITH OPEN MOUTH AND SMILING EYES) is at U+1F604.

OS X character viewer

Now, see http://dev.mysql.com/doc/refman/5.5/en/charset-unicode.html.

MySQL before version 5.5 only supports UTF-8 for the BMP, which includes characters between U+0000 and U+FFFF (i.e. only a subset of actual UTF-8; MySQL's utf8 is not real UTF-8). It cannot store the character at code point U+1F604 or other similar "high characters". MySQL 5.5+ supports utf8mb4 (actual UTF-8), utf16 and utf32, which are able to encode these characters. If you're using MySQL 5.5+, use one of these column character sets and make sure you're using the same charset for your connection encoding to/from PHP. If you are on MySQL < 5.5, you'll have to use a BLOB column type. That type stores raw bytes without caring about the "characters" in it. The downside is that you won't be able to efficiently search or index the text.

deceze
  • 510,633
  • 85
  • 743
  • 889
  • Thanks for your reply. The MySQL version that we are using now is _5.1.54_... So, I would try the blob data type or upgrade to _5.5_ and use `utf8mb4` for my field. I will be back later;) – tech_me Mar 01 '12 at 04:51
  • Thank you @deceze. What I have just confirmed now is that emoticons would be stored correctly after changed the field type from varchar to blob in our MySQL _5.1.54_. We are discussing what should apply to our server-side as some configuration would be changed in php code(or CakePHP framework). – tech_me Mar 01 '12 at 05:40
  • 1
    Finally, we upgraded to **5.5+** and emoticons are displayed normally, so that it could be compatible with our future development. Thank you again, @deceze. – tech_me Mar 02 '12 at 01:10
  • @deceze How would one choose between `utf8mb4`, `utf16`, and `utf32`? Is there an advantage to using one over the other? – John Erck May 22 '14 at 14:03
  • 1
    @John Mostly storage efficiency. UTF-8 is most efficient if you're using mostly ASCII characters. If you're using *a lot* of Asian character or emoji which are high up in the Unicode table, UTF-16 may use fewer bytes. There's almost no reason to use UTF-32 really, as it's *always* the most inefficient. – deceze May 22 '14 at 14:07
  • After lots of testing I'm realizing that `utf8mb4` is required for MySQL to be able to handle the emoji characters I see people using such as: http://i.stack.imgur.com/uhyOk.png – John Erck May 26 '14 at 15:01
2

Some of the emoji characters work with older non-blobed mysql configurations because they are encoded using a 3 byte codepoint and mysql can store a 3 byte character. If you cannot upgrade mysql nor use blobs for whatever reason, you can scrub out 4 byte codepoints and keep the 3 byte ones.

If your computer has emoji capabilities, here is a list of the 3 byte iOS emoji characters:

☺❤✨❕❔✊✌✋☝☀☔☁⛄⚡☎➿✂⚽⚾⛳♠♥♣♦〽☕⛪⛺⛲⛵✈⛽⚠♨1⃣2⃣3⃣4⃣5⃣6⃣7⃣8⃣9⃣0⃣#⃣⬆⬇⬅➡↗↖↘↙◀▶⏪⏩♿㊙㊗✳✴♈♉♊♋♌♍♎♏♐♑♒♓⛎⭕❌©®™

Jake
  • 1,135
  • 1
  • 12
  • 26