0

After reading this tutorial on how to change mysql data structure to support four bytes emojis and this here, I've set the following:

  • node server receive the emoticon as a surrogate pair, like String.charCodeAt(0).toString(16) and string.charCodeAt(1).toString(16) : d83d, de01, which is the surrogate pair for UTF16: U+1F601.
  • I changed the column of my table to text varchar(3000) CHARACTER SET utf8mb4 NOT NULL
  • For every transaction with emoticons, I run conn.query('set character set utf8mb4'). This set prevent me from having an exception. [EDIT] After this, command, if I run SHOW VARIABLES LIKE 'character_set_%' I got:

  {Variable_name: "character_set_client", Value: "utf8mb4"} 
>         {Variable_name: "character_set_connection", Value: "utf8"} 
>         {Variable_name: "character_set_database", Value: "utf8"} 
>         {Variable_name: "character_set_filesystem", Value: "binary"} 
>         {Variable_name: "character_set_results", Value: "utf8mb4"} 
>         {Variable_name: "character_set_server", Value: "utf8"} 
>         {Variable_name: "character_set_system", Value: "utf8"} 
>         {Variable_name: "character_sets_dir", Value: "/usr/share/mysql/charsets/"}

But the result is that all emoticon are always saved as '?', with the possible exception of three bytes emoticons in the 3rd line.

select text, hex(text) from table_text;
+-------------------------------+------------------------------------------------------------+
| text                          | hex(text)                                                  |
+-------------------------------+------------------------------------------------------------+
| ?                             | 3F                                                         |
| Eita ?     ?                  | 45697461203F20202020203F20                                 |
| 1                             | 31                                                         |
| ? ??☺️??                      | 3F203F3FE298BAEFB88F3F3F                                   |
| ???????                       | 3F3F3F3F3F3F3F                                             |
| ???????   Crazy Stuff ??????? | 3F3F3F3F3F3F3F2020204372617A79205374756666203F3F3F3F3F3F3F |
| Crazy Stuff ???????           | 4372617A79205374756666203F3F3F3F3F3F3F                     |
+-------------------------------+------------------------------------------------------------+

[EDIT] So I found the problem. For some reason, if you only run set character set utf8mb4 is not enough as it only changes character_set_client. To set character_set_client and character_set_connection, you must run 'SET NAMES utf8mb4'

Victor M Perez
  • 2,185
  • 3
  • 19
  • 22
JLCDev
  • 609
  • 1
  • 5
  • 18
  • Generally it's a good idea to answer your own question if you solve your problem, then accept the answer. That way somebody else may be able to find your answer. – O. Jones Feb 28 '18 at 19:01

1 Answers1

0

So I found the problem. For some reason, if you only run set character set utf8mb4 is not enough as it only changes character_set_client. To set character_set_client and character_set_connection, you must run

'SET NAMES utf8mb4'

before run the statement. So:

conn.query('SET NAMES utf8mb4');
conn.query('Your statement inserting/updating emojis')
JLCDev
  • 609
  • 1
  • 5
  • 18