
I have some Unicode (Arabic) text stored in a Mongoid model and I want to insert it into a MySQL database. I had to use gsub to escape single quotes, because they were causing SQL errors on insert.

text = model.text.squish().gsub("'", %q(\\\'))
db_con.query("insert into table (text) values ('#{text}')")

Now my problem is that when I view the data in phpMyAdmin, this is what I see:

اليوم.. ملايين الهوات٠تودع "واتساب" للأبد

I tried adding force_encoding('UTF-8'), but that didn't change anything. I also tried escaping with str.dump, but that turned the data into Unicode code point escapes such as u{243} when viewed in phpMyAdmin. How can this be fixed?
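For context on why neither attempt helped, here is a minimal sketch in plain Ruby (no database involved). It assumes the model text is already a valid UTF-8 string, which the question suggests but does not confirm; the variable names are illustrative only. force_encoding only changes the encoding label on the string and leaves the bytes alone, while dump returns a quoted, ASCII-only literal with \u escapes. MySQL drops the backslash of an escape sequence it does not recognise inside a string literal, which would explain the bare u{...} text seen in phpMyAdmin.

```ruby
# Illustrative value: the first Arabic word of the sample text, written
# with \u escapes so the script does not depend on the file's encoding.
text = "\u0627\u0644\u064A\u0648\u0645"

# force_encoding relabels the string; it does not transcode any bytes.
# On a string that is already UTF-8 it is effectively a no-op.
relabelled = text.dup.force_encoding("UTF-8")
same_bytes = relabelled.bytes == text.bytes

# dump produces a quoted, ASCII-only representation with \u escapes.
# The exact escape format (\u0627 vs \u{627}) depends on the Ruby version.
dumped = text.dump

puts same_bytes
puts dumped
puts dumped.ascii_only?
```

Neither call touches the character set the MySQL connection declares, so the bytes sent to the server are interpreted the same way as before.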

STF
user2968505
  • Make sure all your encodings agree _before_ you put the data in. If you use a non-Unicode default charset and then insert Unicode data, MySQL won't know the data is Unicode; it will store the same bytes without converting them (because you told it what charset they were already in). – erik258 Jan 02 '17 at 18:05
  • In the database structure the Collation is "utf8_general_ci". – user2968505 Jan 02 '17 at 18:09
  • Collation is just for alphabetical order. What is the default _charset_? Or more accurately, what charset was your unicode data inserted with? – erik258 Jan 02 '17 at 18:10
  • Ah, I just checked the information_schema SCHEMATA table: the default charset is UTF-8, so the data should have been inserted as UTF-8 as well. It's important to note that this data is displayed correctly in the Rails console, in the views of my Rails app, and in the mongo shell. – user2968505 Jan 02 '17 at 18:21
  • http://stackoverflow.com/a/38363567/1766831 -- look for "Mojibake" – Rick James Jan 03 '17 at 04:10
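The mismatch described in the comments above can be reproduced in plain Ruby. This is a sketch under one assumption: ISO-8859-1 stands in for whatever single-byte charset the connection declared (MySQL's latin1 is close to it but not identical). Reading the UTF-8 bytes as single-byte characters and then storing the result as UTF-8 yields text that starts with the same "Ø§Ù" sequence as the phpMyAdmin output in the question.

```ruby
# First word of the sample text, written with \u escapes.
original = "\u0627\u0644\u064A\u0648\u0645"

# Step 1: treat each UTF-8 byte as one ISO-8859-1 character (assumed
# stand-in for the connection charset), then transcode that to UTF-8.
garbled = original.dup.force_encoding("ISO-8859-1").encode("UTF-8")

# Step 2: the damage is reversible as long as no bytes were lost:
# encode back to single bytes and relabel them as UTF-8.
repaired = garbled.encode("ISO-8859-1").force_encoding("UTF-8")

puts garbled
puts garbled.length        # one character per original byte
puts repaired == original
```

Each Arabic letter occupies two bytes in UTF-8, so the garbled string has twice as many characters as the original.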

1 Answer


I fixed it by executing this query before the insert: SET CHARACTER SET 'UTF8'

text = model.text.squish().gsub("'", %q(\\\'))
db_con.query("SET CHARACTER SET 'UTF8'")
db_con.query("insert into table (text) values ('#{text}')")
user2968505