
I have a latin1_swedish_ci table that contains Cyrillic characters. When I look at the data through phpMyAdmin, it looks something like:

шòõùцðрÑÂúøõ ñðýúø þтúð÷ыòðютÑÂѠþт ÑÂþñûюôõýøѠтрðôøцøþýýþù тðùýы ø

My goal is to use PHP to fetch each string, send it to Google Translate, and store the result in the MySQL database in a new column.

My problem is that all I get when I run the query is garbage like what you see above. I know that I have to adjust the headers, the MySQL connection, and the character encoding, but I have yet to find a combination that works. What can I do to get the strings back in Cyrillic?

Cœur
  • Can you take a look at the original application that uses this data? – Vatev Nov 18 '13 at 11:47
  • 1
    Why not use a proper encoding on the database in the first place? – cen Nov 18 '13 at 11:49
  • 1
    You can't reliably store Cyrillic characters in a latin1-table. [Latin1 stands for "Latin alphabet No. 1"](http://en.wikipedia.org/wiki/Latin1) which does not include Cyrillic letters. You should store the information in a Unicode table/column. Do that first and you save yourself a lot of headaches. **For any further questions please check http://stackoverflow.com/questions/279170/utf-8-all-the-way-through first!** – feeela Nov 18 '13 at 11:50
  • The problem is that I was given an SQL file and I am trying to fix the problem now without having any access to the original data. Any suggestion as to what to do now? – user3004549 Nov 18 '13 at 11:59
  • Should I try and fix the problem in the SQL file itself? Should I change the encoding of that file? Should I change the connection encoding when I load the file? Right now, the SQL file creates latin1_swedish_ci tables. Should I change that as well? – user3004549 Nov 18 '13 at 12:02
  • The encoding has been severely mistreated at one or several points here. There's no straightforward answer for how to fix this without knowing exactly how it got into the state it's in. You'd have to experiment with various ways of converting between different encodings to figure out what exactly works to hopefully restore the original characters. – deceze Nov 18 '13 at 13:17
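
One experiment of the kind described in the last comment can be sketched in Python. This is a minimal sketch, assuming the simplest failure mode: UTF-8 bytes were read back as latin1 exactly once. The sample word and the helper names `garble` and `repair` are hypothetical. Python's `latin-1` codec maps every byte, whereas MySQL's latin1 is closer to Windows-1252, so a few characters may behave differently on real data.

```python
def garble(text: str) -> str:
    """Simulate the damage: UTF-8 bytes misread as latin1."""
    return text.encode("utf-8").decode("latin-1")


def repair(garbled: str) -> str:
    """Undo one round of that damage.

    Raises UnicodeError if the text does not fit this failure mode.
    """
    return garbled.encode("latin-1").decode("utf-8")


original = "тайны"  # hypothetical sample word, not taken from the table
broken = garble(original)
print(repr(broken))    # mojibake, including invisible control characters
print(repair(broken))  # the Cyrillic word again, if the assumption holds
```

If `repair` raises an error or returns more garbage, the data was probably converted more than once or through a different encoding, and the step may need to be repeated or swapped for another encoding pair.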

1 Answer

If you have a .sql file, open it in an editor, find the CREATE TABLE statement for the table in question, and change latin1 to utf8 in the table options at the end of the statement. Then re-import the file into your database.

EXAMPLE

Before:
ENGINE=InnoDB DEFAULT CHARSET=latin1;

After:
ENGINE=InnoDB DEFAULT CHARSET=utf8;
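
For a large dump, the same edit can be scripted. The following is a minimal Python sketch, assuming the table options appear as `DEFAULT CHARSET=latin1`, as in the example; the function name `switch_charset` and the sample statement are hypothetical. It changes only the declared charset, not the bytes of the data, so on its own it may not repair rows that are already garbled, and a real dump may also carry column-level charset or collation clauses that it does not touch.

```python
import re


def switch_charset(sql: str, old: str = "latin1", new: str = "utf8") -> str:
    """Replace the DEFAULT CHARSET table option in a SQL dump."""
    pattern = re.compile(r"(DEFAULT\s+CHARSET\s*=\s*)" + re.escape(old) + r"\b")
    # Keep the matched "DEFAULT CHARSET=" prefix and swap only the charset name.
    return pattern.sub(lambda m: m.group(1) + new, sql)


# Hypothetical one-table dump used only for illustration.
dump = "CREATE TABLE t (name VARCHAR(255)) ENGINE=InnoDB DEFAULT CHARSET=latin1;"
print(switch_charset(dump))
```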
MentalRay
  • I tried that and I still get garbage. Should I change the encoding of the file? Of the connection? What encoding should I choose for the database itself? – user3004549 Nov 18 '13 at 12:16
  • After selecting the database and before inserting any data from your app, try mysql_query("SET NAMES 'utf8'", $con); where $con is your connection, e.g. $con = mysql_connect("hostname","username","password") or die('Could not connect to DB: ' . mysql_error()); The database encoding should be fine if it is utf8_general_ci. – MentalRay Nov 18 '13 at 12:21
  • So, there is no need to change the encoding of the SQL file itself? Any idea as to what collation/charset I should be using at the database, table and field levels? – user3004549 Nov 18 '13 at 12:29
  • utf8_general_ci covers almost everything when you are going to deal with "strange" characters. – MentalRay Nov 18 '13 at 12:40