
I tried to display some Russian words in HTML, and they showed up like the following.

Полный кадр

The result is:

Полный кадр

I tried to convert the original string to UTF-8 and display it in HTML. It still showed up like above.

Some information:

Collation of the stored field: latin1_swedish_ci

The SQL result is:

ÐолнÑй кадÑ

I queried the database to get the value, and then converted it to UTF-8:

mb_convert_encoding($value, 'utf-8', 'windows-1251');

Could you please take a look at this link? Someone fixed the same problem there:

http://stackoverflow.com/questions/13765242/need-help-determining-encoding-of-the-text

The problem is that I'm rebuilding a website, and some of the data comes from the old website. I cannot enter it manually because there is a lot of it, so I wrote a PHP script to fetch the data from the old website and insert it into my new website. The old website displays Russian in HTML perfectly, but I didn't write it and I cannot see its code. I have already compared the data in the old table and the new table. It's the same.

  • try to build on sql fiddle – SagarPPanchal Feb 10 '15 at 04:12
  • Are the characters entities or the actual character? Latin1 doesn't support russian characters. "The major drawback of using Latin-1 is that it has only 256 possible characters, and therefore has no way of effectively representing characters in languages whose alphabets differ significantly from English (this includes languages like Russian, Greek, Hebrew, Arabic, almost all Asian languages and just about every language ever invented by humans)." https://www.bluebox.net/insight/blog-article/getting-out-of-mysql-character-set-hell You won't be able to convert what isn't there, e.g. the char. – chris85 Feb 10 '15 at 04:19
  • Someone fixed it here in another language, not PHP, and I cannot find PHP functions like the ones in the code there. http://stackoverflow.com/questions/13765242/need-help-determining-encoding-of-the-text – Tien Nguyen Feb 10 '15 at 04:22
  • I'm sorry, but without knowledge of at least either what the actual bytes are, and/or what text these bytes are supposed to display as, there are simply too many variables. Please update your question with this information. – tripleee Feb 10 '15 at 04:27
  • I edited. Please help me. – Tien Nguyen Feb 10 '15 at 05:01
  • Do you generally have problems to support Russian in your app? Then look at the duplicate. Or do you already have a ton of irreproducible data which is corrupted using the wrong charset and you're trying to rescue it? Then we do indeed need more information. Read [What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text](http://kunststube.net/encoding/) and [Handling Unicode Front To Back In A Web App](http://kunststube.net/frontback/) and maybe that'll give you enough information to figure it out yourself. – deceze Feb 10 '15 at 05:09
  • I think I had the same problem as yours. I inherited a system where MySQL fields show something like ÐолнÑй ÐºÐ°Ð´Ñ but the HTML output is correctly in Russian/Cyrillic. Please see http://stackoverflow.com/questions/9407834/mysql-convert-latin1-characters-on-a-utf8-table-into-utf8 this might answer your problem as it had mine. Pls don't downvote if it doesnt answer your question as I was also looking for the answer myself. Good luck! – alds Aug 12 '15 at 12:32

1 Answer

You should change the collation of your table to, e.g., utf8_unicode_ci, or koir... for Russian, so you can get the right result. latin1_swedish_ci does not support Cyrillic characters.
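Changing the collation may not repair rows that are already stored. The Ð/Ñ pattern in the SQL result suggests, though the question does not confirm it, that the latin1 column holds UTF-8 bytes which are then decoded a second time as latin1. Under that assumption, converting from windows-1251 cannot work, and the conversion has to go the other way. Below is a minimal sketch of that idea; `repair_double_encoded` is a hypothetical helper name, and the damage is simulated in PHP rather than read from a real table.

```php
<?php
// Assumption: the value PHP receives is UTF-8 text whose characters are
// really the individual bytes of the original UTF-8 string ("mojibake").

/**
 * Hypothetical helper: undo one round of "UTF-8 bytes read as $legacy".
 * Returns the input unchanged when it does not look double-encoded.
 */
function repair_double_encoded(string $value, string $legacy = 'ISO-8859-1'): string
{
    // Map every character back to the single byte it was decoded from.
    $bytes = mb_convert_encoding($value, $legacy, 'UTF-8');

    // If the mapping was lossy (e.g. real Cyrillic became "?"), the input
    // was not mojibake of this kind, so leave it alone.
    if (mb_convert_encoding($bytes, 'UTF-8', $legacy) !== $value) {
        return $value;
    }

    // Accept the recovered bytes only if they form valid UTF-8 text.
    return mb_check_encoding($bytes, 'UTF-8') ? $bytes : $value;
}

// Simulate the damage: decode correct UTF-8 as if it were ISO-8859-1.
$original = 'Полный кадр';
$mojibake = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

$repaired = repair_double_encoded($mojibake);
var_dump($repaired === $original);
```

MySQL's latin1 is based on Windows-1252 rather than strict ISO-8859-1, so with real data the `$legacy` argument may need to be 'Windows-1252'. It is worth testing on a copy of a few rows before running it over the whole table.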

ClassyPimp
  • He/she probably needs to do charset and collation, presuming charset is latin1 as well. http://stackoverflow.com/questions/341273/what-does-character-set-and-collation-mean-exactly – chris85 Feb 10 '15 at 04:30
  • Thanks for your help, but it is the same as before. – Tien Nguyen Feb 10 '15 at 04:30
  • Write something new to the table and see if it comes out as you expect. If not please post what you inputted and what it came out as. – chris85 Feb 10 '15 at 04:34
  • Try dropping the db/table, recreating the tables with a Unicode charset and collation, and test-running your scraping script again. – ClassyPimp Feb 10 '15 at 10:15