12

So my goal is to set all the character sets and collations to utf8 and utf8_unicode_ci.

Im using an AWS RDS to host the MySQL server.

Ive set the collation_connection variable to utf8_unicode_ci in the Parameter group for the RDS. This is how I set the variable in the parameter group .

Ive then rebooted my RDS and when going to MySQL console it shows the following values for variables.

mysql> show variables like "%character%";show variables like "%collation%";
+--------------------------+-------------------------------------------+
| Variable_name            | Value                                     |
+--------------------------+-------------------------------------------+
| character_set_client     | utf8                                      |
| character_set_connection | utf8                                      |
| character_set_database   | utf8                                      |
| character_set_filesystem | binary                                    |
| character_set_results    | utf8                                      |
| character_set_server     | utf8                                      |
| character_set_system     | utf8                                      |
| character_sets_dir       | /rdsdbbin/mysql-5.6.22.R1/share/charsets/ |
+--------------------------+-------------------------------------------+
8 rows in set (0.01 sec)

+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database   | utf8_unicode_ci |
| collation_server     | utf8_unicode_ci |
+----------------------+-----------------+

Im not sure why the collation_connection is set as utf8_general_ci

Parampal Pooni
  • 2,958
  • 8
  • 34
  • 40
  • Changing the table to general does fix the problem i was having (in regards to my update not working - http://www.web-technology-experts-notes.in/2014/07/illegal-mix-of-collations-utf8-unicode-ci-implicit-and-utf8-general-ci-implicit.html but still i dont know why collation_connection isnt changing. – Parampal Pooni Mar 13 '16 at 20:41
  • Did you end up fixing this issue? As I am also facing the same with Amazon RDS parameter group. – Iqbal S Nov 17 '16 at 01:22
  • nope, still interested in a potential solution – Parampal Pooni Nov 17 '16 at 01:39
  • I have this same issue too and would be interested in a solution – JamieD Nov 30 '16 at 10:54
  • I wrote an answer with a solution to this. I guess this problem will be seen more frequently nowadays, since MySQL 8.0 has changed the default utf8mb4 collation. – Marcus Jun 11 '19 at 15:52

5 Answers5

7

The variables which you see here are the current client session's variables:

+----------------------+-----------------+
| Variable_name        | Value           |
+----------------------+-----------------+
| collation_connection | utf8_general_ci |
| collation_database   | utf8_unicode_ci |
| collation_server     | utf8_unicode_ci |
+----------------------+-----------------+

So, you need to run the following commands, when you initiate the connection, on the client from which you are connecting:

SET collation_connection = 'utf8_unicode_ci';
Braiam
  • 1
  • 11
  • 47
  • 78
Iqbal S
  • 1,156
  • 10
  • 16
3

The collation_connection variable is set by the client; clients may be compiled with different settings - in this case, the default collation for utf8 appears to be utf8_general_ci.

The problem is that the MySQL client allows the user to set the charset (default-character-set), but not the collation. If one needs a specific collation, this can't be directly worked around.

In this case, what one can do (besides runtime changes) is to set the server character-set-client-handshake variable off (0). This will skip the character set (and implicitly, collation) negotiation with the client(s), and will impose the server charset/collation settings on any client connection.

Marcus
  • 5,104
  • 2
  • 28
  • 24
0

In the Parameter group, set the variable init_connect to

SET COLLATION_CONNECTION=utf8_unicode_ci

Thiago
  • 81
  • 10
0

I did it - prescribed in /etc/mysql/mariadb.conf.d/50-server.cnf

character-set-client-handshake = FALSE

I don’t know/don’t understand how identical the two parameters (--skip... and FALSE), but if they are identical, then in the official documentation it says

Don't ignore client side character set value sent during handshake. (Defaults to on; use --skip-character-set-client-handshake to disable.)

Also in the documentation, I did not find anything about the value FALSE, but it works.

-4

If you have any existing tables with different CHARACTER SET or COLLATION, they need converting; see ALTER TABLE ... CONVERT TO ....

If you code does not connect as utf8, that needs changing. The details are very language specific (Java vs PHP vs ...).

And, if you are doing html pages, they need to start with

<meta charset=UTF-8>

As long as you are going through this exercise, you should go all the way, so that you can handle all of Chinese and Emoji: utf8mb4 and utf8mb4_unicode_520_ci.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Thx, but do you know why the collation_connection is still set as utf8_general_ci? (even after I set collation_connection variable to utf8_unicode_ci in the Parameter group). – Parampal Pooni Mar 13 '16 at 19:44
  • Sorry, I am unfamiliar with "parameter group"; I don't even know what product has that. There are too many 3rd party packages adding too little value to MySQL – Rick James Mar 13 '16 at 23:59