After reading this tutorial on how to change mysql data structure to support four bytes emojis and this here, I've set the following:
- node server receive the emoticon as a surrogate pair, like
String.charCodeAt(0).toString(16) and string.charCodeAt(1).toString(16) : d83d, de01,
which is the surrogate pair forUTF16: U+1F601
. - I changed the column of my table to
text varchar(3000) CHARACTER SET utf8mb4 NOT NULL
- For every transaction with emoticons, I run
conn.query('set character set utf8mb4')
. This set prevent me from having an exception. [EDIT] After this, command, if I runSHOW VARIABLES LIKE 'character_set_%'
I got:
{Variable_name: "character_set_client", Value: "utf8mb4"}
> {Variable_name: "character_set_connection", Value: "utf8"}
> {Variable_name: "character_set_database", Value: "utf8"}
> {Variable_name: "character_set_filesystem", Value: "binary"}
> {Variable_name: "character_set_results", Value: "utf8mb4"}
> {Variable_name: "character_set_server", Value: "utf8"}
> {Variable_name: "character_set_system", Value: "utf8"}
> {Variable_name: "character_sets_dir", Value: "/usr/share/mysql/charsets/"}
But the result is that all emoticon are always saved as '?'
, with the possible exception of three bytes emoticons in the 3rd line.
select text, hex(text) from table_text;
+-------------------------------+------------------------------------------------------------+
| text | hex(text) |
+-------------------------------+------------------------------------------------------------+
| ? | 3F |
| Eita ? ? | 45697461203F20202020203F20 |
| 1 | 31 |
| ? ??☺️?? | 3F203F3FE298BAEFB88F3F3F |
| ??????? | 3F3F3F3F3F3F3F |
| ??????? Crazy Stuff ??????? | 3F3F3F3F3F3F3F2020204372617A79205374756666203F3F3F3F3F3F3F |
| Crazy Stuff ??????? | 4372617A79205374756666203F3F3F3F3F3F3F |
+-------------------------------+------------------------------------------------------------+
[EDIT] So I found the problem. For some reason, if you only run set character set utf8mb4
is not enough as it only changes character_set_client
. To set character_set_client
and character_set_connection
, you must run 'SET NAMES utf8mb4'