4

I'm losing my mind on this issue since yesterday.

I'm trying to convert my MySQL database from utf8 to utf8mb4. To do so, I followed those sites/threads : https://mathiasbynens.be/notes/mysql-utf8mb4#utf8-to-utf8mb4 , Cannot store emoji in database , Change MySQL default character set to UTF-8 in my.cnf? , etc.

My database seems to have a utf8mb4_unicode_ci collation as expected, and all her tables too.

Nevertheless, when I proceed SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR Variable_name LIKE 'collation%'; I obtain :

#With a Root access
Variable_name              Value
character_set_client        utf8                #expecting utf8mb4
character_set_connection    utf8                #expecting utf8mb4
character_set_database      utf8mb4             #good
character_set_filesystem    binary              #good
character_set_results       utf8                #expecting utf8mb4
character_set_server        utf8mb4             #good
character_set_system        utf8                #expecting utf8mb4
collation_connection        utf8_general_ci     #expecting utf8mb4_unicode_ci
collation_database          utf8mb4_unicode_ci  #good
collation_server            utf8mb4_unicode_ci  #good

#With a standard user access
Variable_name               Value
character_set_client        utf8                #expecting utf8mb4
character_set_connection    utf8mb4             #good 
character_set_database      utf8mb4             #good
character_set_filesystem    binary              #good
character_set_results       utf8                #expecting utf8mb4
character_set_server        utf8mb4             #good
character_set_system        utf8                #expecting utf8mb4
collation_connection        utf8mb4_unicode_ci  #good
collation_database          utf8mb4_unicode_ci  #good
collation_server            utf8mb4_unicode_ci  #good

I set a /etc/mysql/conf.d/90-my.cnf file like this :

[client]
default-character-set = utf8mb4

[mysql]
default-character-set = utf8mb4

[mysqld]
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci
init-connect = 'SET collation_connection = utf8mb4_unicode_ci'
init-connect = 'SET NAMES utf8mb4'

My MySQL version is 5.5.54, all set in a debian 7.

Does anyone have a clue to help me ?

Thx for your help, and sorry for my bad english...

EDIT

Fun fact : when I check variables in the in-line command, I got this :

mysql> show variables like "%character%";
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8mb4                    |
| character_set_connection | utf8mb4                    |
| character_set_database   | utf8mb4                    |
| character_set_filesystem | binary                     |
| character_set_results    | utf8mb4                    |
| character_set_server     | utf8mb4                    |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

While I still have utf8 for character_set_client and character_set_results when I check with phpmyadmin (with a non-superuser access)

Community
  • 1
  • 1
Vae
  • 636
  • 1
  • 8
  • 16

1 Answers1

2

When connecting as user root, init-connect is ignored.

Give your application its own login without SUPER privilege.

Then, to make extra sure, whenever establishing a connection from your app, do the language-specific method of providing the character set. (Or execute SET NAMES utf8mb4.) What app language are you using?

In case something else is going wrong, see "Best Practice" in Trouble with utf8 characters; what I see is not what I stored

Community
  • 1
  • 1
Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Thx a lot for your concern. I just edited the thread to add character/collation stuff with a standard user access to the database. Results are better but still not perfect. I took a quick look to your linked post, man your explanations are very good. I'll try those as soon as I come back from work. I'll tell you if anything work. (Btw I tried the `HEX(txt)` thing and the result is good, everything seems "hex-encoded" as expected.) Anyway, I keep you in touch quickly. Thx again – Vae May 08 '17 at 04:50
  • Hey Rick, here are the news : I set everything you mention in the "What you should do" part. The test of the data is good (no double encoding and every character is correctly recognize). My problem is : I have question mark in the database instead of emojis (however they are correctly shown in my website). You wrote "The bytes to be stored are not encoded as utf8/utf8mb4. Fix this." How can I fix this ? FYI, I'm using php. Thx again for your interest. See you soon. – Vae May 08 '17 at 12:01
  • "explanations are very good" -- I hate to admin how many years I have been honing those explanations. – Rick James May 08 '17 at 13:54
  • An Emoji went into the table as `?`? That implies that you were not connected as `utf8mb4` _when `INSERTing`_. It is not recoverable. What do I need to clarify in "Question Marks"? – Rick James May 08 '17 at 13:56
  • Did you see my last edit ? It appears everything is saved as expected in the database. I think phpmyadmin is misconfigured : when i log-in database with it, I have some `utf8` . When I check on my website with php or when I log in mysql with an inline command, everything is set at `utf8mb4`.... – Vae May 08 '17 at 14:07