21

I have a staging Rails site up that's running on MySQL 5.0.32-Debian.

On this particular site, all of my tables are using utf8 / utf8_general_ci encoding.

Inside that database, I have some data that looks like so:

mysql> select * from currency_types limit 1,10;
+------+-----------------+---------+
| code | name            | symbol  |
+------+-----------------+---------+
| CAD  | Canadian Dollar | $       |
| CNY  | Chinese Yuan    | å…ƒ     |
| EUR  | Euro            | €     |
| GBP  | Pound           | £      |
| INR  | Indian Rupees   | ₨     |
| JPY  | Yen             | ¥      |
| MXN  | Mexican Peso    | $       |
| USD  | US Dollar       | $       |
| PHP  | Philippine Peso | ₱     |
| DKK  | Denmark Kroner  | kr      |
+------+-----------------+---------+

Here's the issue I'm having

On staging (with the db and Rails site running on the debian box), the characters for symbols are appearing correctly when displayed from Rails. For instance, the Chinese Yuan is appearing as 元 in my browser, not å…ƒ as it shows inside the database.

When I download that data to my local OS X development machine and run the db and Rails locally, I see the representation from inside the DB (å…ƒ) on my browser, not the character 元 as I see in staging.

Debugging I've done

I've ensured all headers for Content-Type are coming back as utf8 from each webserver (local, staging).

My local MySQL server and the staging server are both set up to use utf8 as the default charset. I'm issuing "set names 'utf8'" before I make any calls.

I can even connect to my staging db from my OS X Rails host, and I still see the characters å…ƒ representing the yuan. So I'm guessing there's an issue with my local MySQL client, but I can't figure out what it is.

Perhaps this might lend a clue

To make it even more confusing, if I paste the character 元 into the db on my local machine, it displays fine in the web browser. Yet if I paste that same character into my staging db, I get a ? in its place on the page from my staging Rails site.

Also, locally on my OS X rails machine if I use "set names 'latin1'" before my queries, the characters all come back properly. I did have these tables set as latin1 before - could this be the issue?

Someone please help me out here, I'm going crazy trying to figure out what's wrong!

mat
Subimage

7 Answers

29

AHA! It seems I had some table data encoded in latin1 before, and stupidly changed the databases to utf8 without converting it.

Running the following fixed that currency_types table:

mysqldump -u root -p --opt --default-character-set=latin1 --skip-set-charset  DBNAME > DBNAME.sql

mysql -u root -p --default-character-set=utf8  DBNAME < DBNAME.sql

Now I just have to ensure that the other content generated after the latin1 > utf8 switch isn't messed up by that :(
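
For anyone who wants to see the mix-up outside MySQL, here is a minimal Ruby sketch of what seems to have happened. It assumes MySQL's latin1 behaves like Windows-1252 for these characters, and the method names `double_encode` and `repair` are made up for illustration; this is not the code path MySQL itself uses.

```ruby
# Hypothetical helpers illustrating the latin1/utf8 mix-up; not part of Rails or MySQL.

# The UTF-8 bytes are misread as Windows-1252 characters and then stored as UTF-8.
def double_encode(text)
  text.dup.force_encoding("Windows-1252").encode("UTF-8")
end

# Reverse the process: map each character back to its Windows-1252 byte,
# then relabel the resulting bytes as the UTF-8 they originally were.
def repair(garbled)
  garbled.encode("Windows-1252").force_encoding("UTF-8")
end

garbled = double_encode("元")
puts garbled          # the garbled form seen in the mysql client
puts repair(garbled)  # the original character again
```

The dump with --default-character-set=latin1 followed by the reload as utf8 plays roughly the role of `repair` here. Text that was written correctly after the switch may not survive the same round trip, which is why the newer content needs checking.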

Subimage
  • Yes, that is the problem. When you set your connection to latin1 it appeared normal because it did the same translation. I had this problem but could not re-create the database. So I changed phpMyAdmin to use a latin1 connection, then exported (so the exported data was now correct), then removed that hack and re-imported. Data fixed. Details here: http://omegadelta.net/2010/11/23/when-you-thought-the-db-was-utf-8-but-it-wasnt/ – William Denniss Nov 23 '10 at 04:15
  • Thank you! I was scratching my head quite a bit this morning and this turned out to be the solution for a database created before my time! – Cymen Jan 14 '11 at 17:38
  • Windows users who only receive 'access denied' messages should change **DBNAME.sql** to **%homepath%\DBNAME.sql** in both the mysqldump and the mysql calls. And thanks to Subimage! – Alex B. Jan 11 '12 at 00:13
  • Pure genius. Well done @Subimage. Your diagnosis that the database was converted to utf8 without converting the data in individual tables is exactly what had happened in our case. Your solution works very well as well. – Sujoy Gupta Jul 11 '12 at 01:41
  • I read about so many of the other utf8-rails3-mysql issues before discovering this ticket which turned out to be exactly what had happened with the site I am working on. – Jesse Clark Apr 08 '13 at 02:57
  • Saved the day for me. Thanks. – NM Pennypacker Aug 06 '14 at 16:22
22

Do you have these two lines in your database.yml under the proper section?

encoding: utf8
collation: utf8_general_ci
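
If there are several environments, a quick script can confirm that each one declares both keys. This is only a sketch: the sample configuration, the environment names, and the helper `missing_charset_keys` are invented for illustration, and a real config/database.yml that uses ERB tags or YAML aliases would need extra handling.

```ruby
require "yaml"

# Sample configuration for illustration only (not taken from the question).
SAMPLE_DATABASE_YML = <<~CONFIG
  development:
    adapter: mysql2
    encoding: utf8
    collation: utf8_general_ci
  staging:
    adapter: mysql2
    encoding: utf8
CONFIG

# Returns a hash mapping each environment name to the required keys it lacks.
def missing_charset_keys(yaml_text, required = %w[encoding collation])
  config = YAML.safe_load(yaml_text)
  config.each_with_object({}) do |(env, settings), missing|
    absent = required.reject { |key| settings.key?(key) }
    missing[env] = absent unless absent.empty?
  end
end

p missing_charset_keys(SAMPLE_DATABASE_YML)
```

With the sample above, only the staging entry is reported, because it has no collation line.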
Can Berk Güder
  • didn't know the yml file could have a collation line, but yes i do have the encoding one... – Subimage Dec 09 '08 at 00:23
  • Why is utf8_unicode_ci recommended over utf8_general_ci? – mauriciomdea Nov 27 '15 at 14:02
  • @mauriciomdea Good question: essentially `utf8_general_ci` is a little faster but it's not as accurate. Best practice is to use `utf8_unicode_ci`. Read more here: https://stackoverflow.com/questions/766809/whats-the-difference-between-utf8-general-ci-and-utf8-unicode-ci – Joshua Pinter Mar 04 '18 at 20:33
2
  1. The problem could be that the MySQL client on staging does not support UTF-8.
  2. Your local OS X Ruby installation might not be configured properly. You should have "encoding: utf8" in "config/database.yml" for the MySQL database, and "$KCODE = 'u'" in "config/environment.rb" for the Ruby environment (this applies to Ruby 1.8; later versions ignore $KCODE).
yrcjaya
  • I don't have the $KCODE part, but I do have "encoding: utf8" in all config files. It seems my issue was that I had mixed-encoded content inside the database. So I was storing latin-1 characters, but trying to read them as utf8 – Subimage Dec 06 '08 at 09:10
  • I saw your answer after posting this. Anyway, thanks for pointing out this error. I have seen it happen in many cases – yrcjaya Dec 06 '08 at 10:35
1

Another simple approach is to set the encoding with an SQL ALTER statement. You can do this with the bash script below.

for t in $(mysql -N --user=root --password=admin --database=DBNAME -e "SHOW TABLES"); do echo "Altering $t"; mysql --user=root --password=admin --database=DBNAME -e "ALTER TABLE \`$t\` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;"; done

prettified

  # -N (--skip-column-names) keeps the "Tables_in_DBNAME" header row out of the loop.
  for t in $(mysql -N --user=root --password=admin --database=DBNAME -e "SHOW TABLES");
    do
       echo "Altering $t";
       mysql --user=root --password=admin --database=DBNAME -e "ALTER TABLE \`$t\` CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;";
    done
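
One caveat before running a bulk conversion like this: CONVERT TO CHARACTER SET transcodes the characters a column claims to hold. If a column is declared latin1 but actually holds UTF-8 bytes, transcoding produces the garbled form rather than the intended text. The Ruby sketch below only imitates that difference with string encodings, using Windows-1252 as an assumed stand-in for MySQL's latin1; it does not talk to a database.

```ruby
# Bytes of "元" as UTF-8, but labelled Windows-1252, like a mislabelled latin1 column.
mislabelled = "元".b.force_encoding("Windows-1252")

# Transcoding (roughly what CONVERT TO does): every claimed character is re-encoded.
transcoded = mislabelled.encode("UTF-8")

# Relabelling: the bytes are kept and only the declared encoding changes.
relabelled = mislabelled.dup.force_encoding("UTF-8")

puts transcoded   # still garbled
puts relabelled   # the intended character
```

For mislabelled columns, the approach usually suggested is to pass the column through a binary type first, so that the bytes are relabelled instead of transcoded.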
RameshVel
0

My DB was already set by default to utf8, but I encountered the same problem.

Also after adding the following usual meta tag, the problem was still there:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

Then I created a dedicated connection.php to ensure all communication with MySQL uses the utf8 charset. Note that there is no hyphen in utf8 in mysqli_set_charset($bd, 'utf8')!

Here is my connection.php:

<?php
    // Connection settings (placeholders; replace with your own)
    $mysql_hostname = "localhost";
    $mysql_user = "username";
    $mysql_password = "password";
    $mysql_database = "dbname";
    $prefix = "";
    $bd = mysqli_connect($mysql_hostname, $mysql_user, $mysql_password) or die("Could not connect database");
    mysqli_select_db($bd, $mysql_database) or die("Could not select database");
    // Use utf8 for everything sent to and received from MySQL on this connection.
    if (!mysqli_set_charset($bd, 'utf8')) {
        exit("Could not set connection charset to utf8");
    }
?>

Another php file:

<?php
    //Include database connection details
    require_once('connection.php');

    //Enter code here...

    //Create query
    $qry = "SELECT * FROM subject";
    $result = mysqli_query($bd, $qry);
?>

//Other stuff
honk
mak arthur
0

For Rails, run the following snippet in the Rails console. It prints an ALTER TABLE statement for every table in db/schema.rb. Then log in to MySQL and execute the SQL copied from the console; it will change the encoding of all tables.

schema = File.read('db/schema.rb')

# Print one ALTER TABLE statement per create_table call found in the schema.
schema.each_line do |row|
  match = row.match(/create_table "([^"]+)"/)
  next unless match

  puts "ALTER TABLE `#{match[1]}` CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci;"
end
Rokibul Hasan
0

You can generate a migration, the Rails way, to change the character set and collation of your tables:

rails generate migration ChangeDatabaseCollation

Then you can edit the generated file and paste:

def up
  # Run one statement like this for each table that should use the new collation:
  execute "ALTER TABLE my_table CONVERT TO CHARACTER SET utf8 COLLATE utf8_general_ci"
end

def down
  raise ActiveRecord::IrreversibleMigration
end
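
If there are many tables, the statements can be built from a list instead of being typed by hand. The following is a sketch under assumptions: the helper name `conversion_statements` and the example table names are hypothetical, and inside a migration the list could come from `ActiveRecord::Base.connection.tables` (which may also include Rails' own bookkeeping tables).

```ruby
# Hypothetical helper: builds one ALTER TABLE statement per table name.
# Names are wrapped in backticks; a name containing a backtick is rejected
# rather than escaped, to keep the sketch simple.
def conversion_statements(tables, charset: "utf8", collation: "utf8_general_ci")
  tables.map do |table|
    raise ArgumentError, "unexpected table name: #{table.inspect}" if table.include?("`")

    "ALTER TABLE `#{table}` CONVERT TO CHARACTER SET #{charset} COLLATE #{collation}"
  end
end

# Example table names for illustration only.
conversion_statements(%w[currency_types users]).each { |sql| puts sql }
```

In the migration, each returned string would be passed to `execute`.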

And run the migration:

rake db:migrate

You can also enforce the new collation in your database.yml:

development:
  adapter: mysql2
  encoding: utf8
  collation: utf8_general_ci

For more information on Rails migrations:

http://edgeguides.rubyonrails.org/active_record_migrations.html

For more information on collation types:

http://collation-charts.org/

mauriciomdea