I have a weird problem with MySQL and the Cyrillic alphabet. The database was created with utf8_unicode_ci from the start, but the tables were not. Right now, table data supplied in Cyrillic looks like this: ????????. If I create a new table in utf8 from scratch there is no problem, but if I try to change the encoding of an existing table by using

ALTER TABLE <table_name> CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;

which is supposed to convert the existing data, or

ALTER TABLE Strategies
  CHARACTER SET utf8,
  COLLATE utf8_unicode_ci;

which is supposed to affect only future data, neither one works.

I have also changed the my.cnf file and added:

[mysqld]
#
#default-character-set=utf8 this one breaks mysql restart
character-set-server=utf8
skip-character-set-client-handshake
collation-server=utf8_unicode_ci
init-connect='SET NAMES utf8'
init_connect='SET collation_connection = utf8_general_ci'

If I run SHOW VARIABLES WHERE Variable_name LIKE 'character_set_%' OR Variable_name LIKE 'collation%'; I get:

[image: output of the SHOW VARIABLES query]

I also changed the encoding to utf8 directly in phpMyAdmin, and it does show that the table is in utf8, but nothing changes for the existing ????????? or for future Cyrillic input.

Hopefully someone else has experienced this kind of issue. I would be really grateful for any help or suggestions. Thank you.

Yuri Zolotarev

1 Answer

If a table starts out as latin1 and has latin1-encoded characters in it, use ALTER TABLE ... CONVERT TO CHARACTER SET utf8 (as you did).

Before converting, check the old encoding by doing two things:

SHOW CREATE TABLE ... -- to see that the columns say latin1
SELECT HEX(col) ... -- to see what the encoding looks like: é should show E9

I say "before" because it is possible to cram utf8 into latin1 incorrectly. In that case é would show C3A9 in the latin1 column; this is "double-encoding".

Do likewise after the conversion:

SHOW CREATE TABLE ... -- to see that the columns say utf8
SELECT HEX(col) ... -- to see what the encoding looks like; é should show C3A9

C383C2A9 would indicate double-encoding. A mess.
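The three hex values mentioned above can be reproduced outside MySQL. The following is a minimal Python sketch, not part of the original answer; it only shows which bytes é produces under each scenario, and it assumes Python's "latin-1" codec is a close enough stand-in for MySQL's latin1 for this character.

```python
# Bytes for the character é under each scenario discussed above.
ch = "é"

# Correct latin1 storage: one byte.
latin1_hex = ch.encode("latin-1").hex().upper()

# Correct utf8 storage: two bytes.
utf8_hex = ch.encode("utf-8").hex().upper()

# Double-encoding: the UTF-8 bytes are misread as latin1 characters
# and then encoded to UTF-8 a second time.
double_hex = ch.encode("utf-8").decode("latin-1").encode("utf-8").hex().upper()

print(latin1_hex)  # E9
print(utf8_hex)    # C3A9
print(double_hex)  # C383C2A9
```

Comparing the output of SELECT HEX(col) against these values tells you which of the three states a column is in.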

Do not depend on init-connect='SET NAMES utf8' if you connect as root; init-connect is ignored for root and any other SUPER user.

But... You say you put Cyrillic text into a latin1 column? That is "impossible" since latin1 cannot represent anything other than latin-based Western European characters. So... You probably have "double-encoding".

For more debugging, see Trouble with utf8. Note especially "question mark".
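As a rough illustration of where question marks can come from, the Python sketch below forces Cyrillic text into latin1, replacing every character that has no latin1 equivalent. This only mimics the effect and is an assumption about the cause in this particular case; the actual conversion happens inside MySQL, and the sample string is made up.

```python
# Hypothetical sample input; any Cyrillic string behaves the same way.
text = "Привет"

# latin1 has no Cyrillic characters, so each one is replaced by "?".
stored = text.encode("latin-1", errors="replace")

print(stored)                # b'??????'
print(stored.hex().upper())  # 3F3F3F3F3F3F
```

If HEX(col) shows only 3F bytes, the original characters are gone and no later conversion can bring them back; the data has to be reloaded.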

To repair double-encoding, see Fixes, and pick the appropriate case. This link also says what you should have done (the 2-step ALTER) instead of what you did.
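To show what undoing one round of double-encoding means at the byte level, here is a small Python sketch. It is not the SQL fix described in the linked page, and the function name fix_double_encoding is hypothetical. It also assumes plain latin-1; MySQL's latin1 is closer to cp1252, so bytes in the 80 to 9F range may need different handling in a real repair.

```python
def fix_double_encoding(s: str) -> str:
    """Undo one round of 'UTF-8 bytes read as latin1' (illustrative only)."""
    # Turn the mis-decoded characters back into the original bytes,
    # then decode those bytes as the UTF-8 they really were.
    return s.encode("latin-1").decode("utf-8")


original = "Привет"  # made-up sample text

# Simulate the damage: UTF-8 bytes interpreted as latin1 characters.
mangled = original.encode("utf-8").decode("latin-1")

repaired = fix_double_encoding(mangled)
print(repaired == original)  # True
```

Test any repair on a copy of the table first, and confirm with HEX(col) that the result is plain utf8.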

Rick James