64

I have a rails application running on production mode, but all of the sudden this error came up today when a user tried to save a record.

Mysql2::Error: Incorrect string value

More details (from production log):

Parameters: {"utf8"=>"â<9c><93>" ... 

Mysql2::Error: Incorrect string value: '\xC5\x99\xC3\xA1k 

Mysql2::Error: Incorrect string value: '\xC5\x99\xC3\xA1k 

Now I saw some solutions that required dropping the databases and recreating it, but I cannot do that.

Now mysql shows this:

mysql> show variables like 'char%';
+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+
8 rows in set (0.04 sec)

What is wrong and how can I change it so I do not have any problems with any characters?

Also: Is this problem solvable with javascript? Convert it before sending it ?

Thanks

Jordan Running
  • 102,619
  • 17
  • 182
  • 182
Trt Trt
  • 5,330
  • 13
  • 53
  • 86
  • possible duplicate of [Mysql - Mysql2::Error: Incorrect string value:](http://stackoverflow.com/questions/16350310/mysql-mysql2error-incorrect-string-value) – Sam Sep 18 '15 at 11:50
  • Also see this: https://dev.mysql.com/doc/refman/5.5/en/faqs-cjk.html#qandaitem-A-11-1-16 – Carson Reinke Sep 18 '15 at 16:42

7 Answers7

86

the problem is caused by charset of your mysql server side. You can config manually like:

ALTER TABLE your_database_name.your_table CONVERT TO CHARACTER SET utf8

or drop the table and recreate it like:

rake db:drop
rake db:create
rake db:migrate

references:

https://stackoverflow.com/a/18498210/2034097

https://stackoverflow.com/a/16934647/2034097

UPDATE

the first command only affect specified table, if you want to change all the tables in a database, you can do like

ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_general_ci;

reference:

https://stackoverflow.com/a/6115705/2034097

Community
  • 1
  • 1
Chuanpin Zhu
  • 2,226
  • 1
  • 21
  • 21
  • I run this ALTER TABLE your_database_name.your_table CONVERT TO CHARACTER SET utf8 against all my tables. – Trt Trt Mar 17 '14 at 20:38
  • My records are still there, I can use them and create new ones. Great. Now I will have to run this again if I want to create other tables right? – Trt Trt Mar 17 '14 at 20:38
  • congrats, to answer your question, I updated my answer, please take a look. – Chuanpin Zhu Mar 17 '14 at 20:41
  • 4
    Running the ALTER DATABASE command did not change the character set of my problem table. Running ALTER TABLE did work. I was able to check the character set name using the following command. SELECT CCSA.character_set_name FROM information_schema.`TABLES` T, information_schema.`COLLATION_CHARACTER_SET_APPLICABILITY` CCSA WHERE CCSA.collation_name = T.table_collation AND T.table_schema = "your_databasename" AND T.table_name = "your_tablename"; reference: http://stackoverflow.com/questions/1049728/how-do-i-see-what-character-set-a-database-table-column-is-in-mysql – E L Rayle Mar 09 '15 at 12:45
  • I was already in utf8, but looks like I needed utf8mb4 http://stackoverflow.com/questions/13653712/java-sql-sqlexception-incorrect-string-value-xf0-x9f-x91-xbd-xf0-x9f – chris finne Nov 10 '15 at 16:53
  • I think dropping & recreating the database from migration would only fix the problem if you had edited the MySQL database in some underlying way to fix the problem, like set the Charset and/or Collation in the database.yml config files – Jason FB Apr 05 '20 at 11:44
  • For Rails specifically https://blog.jasonfleetwoodboldt.com/2020/04/05/how-to-get-emojis-in-mysql-fixing-incorrect-string-value/ – Jason FB Apr 05 '20 at 12:41
55

I managed to store emojis (which take up 4 bytes) by following this blog post:

Rails 4, MySQL, and Emoji (Mysql2::Error: Incorrect string value error.)

You might think that you’re safe inserting most utf8 data in to mysql when you’ve specified that the charset is utf-8. Sadly, however, you’d be wrong. The problem is that the utf8 character set takes up 3 bytes when stored in a VARCHAR column. Emoji characters, on the other hand, take up 4 bytes.

The solution is in 2 parts:

Change the encoding of your table and fields:

ALTER TABLE `[table]` 
  CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_bin,
MODIFY [column] VARCHAR(250)
    CHARACTER SET utf8mb4 COLLATE utf8mb4_bin

Tell the mysql2 adapter about it:

development:
  adapter: mysql2
  database: db
  username: 
  password:
  encoding: utf8mb4
  collation: utf8mb4_unicode_ci

Hope this helps someone!

Then I had to restart my app and it worked. Please note that some emojis will work without this fix, while some won't:

  • ➡️ Did work
  • Did not work until I applied the fix described above.
Undo
  • 25,519
  • 37
  • 106
  • 129
Besi
  • 22,579
  • 24
  • 131
  • 223
  • to access mysql via terminal: sudo mysql -u root – Guilherme Nunes Apr 22 '19 at 07:42
  • 2
    Why did you set the collation to `utf8mb4_unicode_ci` in your `database.yml` rather than `utf8mb4_bin`? – Cannon Moyer Sep 18 '19 at 14:45
  • This finally works for me with Chinese character set, thanks! – Sunding Wei Oct 10 '19 at 09:00
  • Works with Rails 6. Not sure why this is not the default. – Sylar Sep 09 '20 at 06:50
  • @CannonMoyer i think it's because ActiveRecord doesn't recognize ```utf8mb4_unicode_bin```: ```ActiveRecord::StatementInvalid (Mysql2::Error: Unknown collation: 'utf8mb4_unicode_bin': SET NAMES utf8mb4 COLLATE utf8mb4_unicode_bin```. I'm going with ```utf8mb4_unicode_ci``` in the db and in database.yml, looks like it's working for the emojis i've tested so far. – ilasno May 18 '22 at 23:49
28

You can use a migration like this to convert your tables to utf8:

class ConvertTablesToUtf8 < ActiveRecord::Migration
  def change_encoding(encoding,collation)
    connection = ActiveRecord::Base.connection
    tables = connection.tables
    dbname =connection.current_database
    execute <<-SQL
      ALTER DATABASE #{dbname} CHARACTER SET #{encoding} COLLATE #{collation};
    SQL
    tables.each do |tablename|
      execute <<-SQL
        ALTER TABLE #{dbname}.#{tablename} CONVERT TO CHARACTER SET #{encoding} COLLATE #{collation};
      SQL
    end
  end

  def change
    reversible do |dir|
      dir.up do
        change_encoding('utf8','utf8_general_ci')
      end
      dir.down do
        change_encoding('latin1','latin1_swedish_ci')
      end
    end
  end
end
mfazekas
  • 5,589
  • 1
  • 34
  • 25
4

If you want to the store emoji, you need to do the following:

  1. Create a migration (thanks @mfazekas)

    class ConvertTablesToUtf8 < ActiveRecord::Migration def change_encoding(encoding,collation) connection = ActiveRecord::Base.connection tables = connection.tables dbname =connection.current_database execute <<-SQL ALTER DATABASE #{dbname} CHARACTER SET #{encoding} COLLATE #{collation}; SQL tables.each do |tablename| execute <<-SQL ALTER TABLE #{dbname}.#{tablename} CONVERT TO CHARACTER SET #{encoding} COLLATE #{collation}; SQL end end

    def change reversible do |dir| dir.up do change_encoding('utf8mb4','utf8mb4_bin') end dir.down do change_encoding('latin1','latin1_swedish_ci') end end end end

  2. Change rails charset to utf8mb4 (thanks @selvamani-p)

    production: encoding: utf8mb4

References:

https://stackoverflow.com/a/39465494/1058096

https://stackoverflow.com/a/26273185/1058096

Kalpesh Patel
  • 2,772
  • 2
  • 25
  • 52
dtelaroli
  • 1,227
  • 1
  • 13
  • 20
  • Thanks for this. I think you could just change the character set and collation for specific columns which require emoji support - if for example changing all tables to utf8mb4 means existing index keys are too long. The migration could execute SQL like: execute <<-SQL ALTER TABLE #{table} MODIFY #{column} text CHARACTER SET #{character_set} COLLATE #{collation}; SQL – Ollie H-M Apr 24 '18 at 09:48
  • Great, but if you forget to change collation you will get errors. It happens if you use a tool to manage the database. Everything depends of the case, if you need performance you should pay attention on it and your solution is better than mine. – dtelaroli Apr 25 '18 at 22:02
  • This answer looks more like a comment on @mfazekas' answer to me. See ["Why and how are some answers deleted?"](https://stackoverflow.com/help/deleted-answers) – Hirurg103 Nov 20 '20 at 01:03
1

Need to change CHARACTER SET and COLLATE for already created database:

ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;

Or it was necessary to create a database with pre-set parameters:

CREATE DATABASE databasename CHARACTER SET utf8 COLLATE utf8_general_ci;
shilovk
  • 11,718
  • 17
  • 75
  • 74
0

It seems like an encoding problem while getting data from database. Try adding the below to your database.yml file

   encoding: utf8 

Hope this solves your issue

Dhepthi
  • 453
  • 4
  • 11
  • 1
    MySQL's utf8 character set only allows for a subset of unicode. Additionally, changing the encoding in the database.yml file does not convert the database tables themselves to the specified encoding. – Omegalen Nov 05 '15 at 20:32
0

Also, if you don't want to do changes in your database structure, you could opt by serializing the field in question.

class MyModel < ActiveRecord::Base
  serialize :content

  attr_accessible :content, :title
end
  • Works in a pinch but definitely not the long-term solution you're looking for. Multiple drawbacks to serializing content. Much better to get your database properly configured at some point. – Joshua Pinter Mar 04 '18 at 23:57
  • @JoshuaPinter Could you expand upon why using serialization isn't a good long-term solution? Currently trying to decide how to handle this situation. – Brandon Sturgeon Jul 25 '18 at 05:44
  • @BrandonSturgeon There's lots but ones that I've run up against: as the value of the field grows, it can become unweildly and very inefficient. And it might hit a character limit in the field database. We were storing an array of id values that we were serializing instead of having a proper join table and it means you can't utilize database queries properly when retrieving those ids. Lots of stuff that is obvious and lots that might not be. Just use it sparingly and be aware of the implications. – Joshua Pinter Jul 26 '18 at 16:37
  • @JoshuaPinter Thank you for the response! I decided to collate all tables that needed collation thru a migration. This seems to function just as well. Another consideration, I believe, is if you need to access the database outside of a rails project, you'd have to implement a deserialization method separately. – Brandon Sturgeon Jul 26 '18 at 18:43
  • @BrandonSturgeon Good point! Again, great in a pinch but generally not a long-term, robust solution. – Joshua Pinter Jul 27 '18 at 20:15