0

I'm having an issue with encoding of a MySQL table.

What I've Done

  1. The table is set to be utf8_unicode_ci
  2. The columns are also set to be utf8_unicode_ci where needed
  3. Setting "SET NAMES utf8" and the meta before viewing

The Data/Problem

The column in question (review) is copied and pasted from MS word.

  • Have strings such as ’ instead of '
  • Prêt à Manger which should read Pret a Manger
  • other similar issues

Ive Tried:

UPDATE table_name SET col_name = CONVERT(CONVERT(CONVERT(col_name USING latin1) USING binary) USING utf8);

This cut's off the data. E.g "a nice review costing £10 and more". After running the above query, it gets turned into "a ncie review costing".

Any help/advice would be appreciated greatly. Thanks, Matt

mattwt
  • 11
  • 3
  • I've got [a solution](http://stackoverflow.com/questions/9304485/how-to-detect-utf-8-characters-in-a-latin1-encoded-column-mysql) that might work for you. You mean "Prêt à Manger", and ‘ right? – tadman Jul 30 '13 at 16:17
  • thanks for your help but: MySQL returned an empty result set (i.e. zero rows), with no changes – mattwt Jul 30 '13 at 17:30
  • I think you have UTF-8 data that's being interpreted as Latin1, not the other way around. – tadman Jul 30 '13 at 17:52
  • php says it's utf8 but then why would it not display correctly in the browser if i've set all the headers? – mattwt Jul 30 '13 at 19:57
  • PHP only says what it thinks it is, not what it actually is. When you see single characters expanding into multiple characters it's almost always a case of multi-byte UTF-8 characters being mis-interpreted as Latin1. – tadman Jul 30 '13 at 21:06

1 Answers1

0

To double check you have the correct utf8 settings try this:

  1. Login to mysql and run mysql> show variables like 'char%';

The output should look like:


     +--------------------------+---------------------------------+
     | Variable_name            | Value                           |
     +--------------------------+---------------------------------+
     | character_set_client     | utf8                            |
     | character_set_connection | utf8                            |
     | character_set_database   | utf8                            |
     | character_set_filesystem | binary                          |
     | character_set_results    | utf8                            |
     | character_set_server     | utf8                            |
     | character_set_system     | utf8                            |
     | character_sets_dir       | /usr/local/mysql/share/charsets/|
     +--------------------------+---------------------------------+

  1. Then run mysql> show variables like 'collation%';

This should give you:


     +----------------------+-----------------+
     | Variable_name        | Value           |
     +----------------------+-----------------+
     | collation_connection | utf8_general_ci |
     | collation_database   | utf8_unicode_ci |
     | collation_server     | utf8_unicode_ci |
     +----------------------+-----------------+

If not edit your /etc/my.cnf or /etc/mysql/my.cnf if you an not find it just run sudo find / -name my.cnf

After you do find it do vim /etc/my.cnf press i to insert text and add


    [mysqld]
    collation-server = utf8_unicode_ci
    init-connect='SET NAMES utf8'
    character-set-server = utf8

Feel free to comment out the previous/old [mysqld] block or lines.

Save the file by pressing esc then type :wq and press enter

  1. Restart mysql service mysql restart or service mysqld restart

Hope that helps.

Eric Semwenda
  • 436
  • 6
  • 16