This is really doing my nut in...

All relevant PHP output scripts set the header (in this case only one file, the main PHP script):

header("Content-type: text/html; charset=utf-8");

HTML meta is set in head:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

All MySQL tables and related columns are set to:

utf8_unicode_ci     Unicode (multilingual), case-insensitive

I have been writing a class to do some translation. When the class writes to a file using fopen, fputs etc., everything works great: the correct chars appear in my output files (which are written as PHP arrays and saved to the filesystem as .php or .htm files). eval() brings back the .htm files correctly, as does just including the .php files when I want to use them. All good.

The problem is when I try to create translation entries in my DB. My DB connection class has the following line added directly after the initial connection:

 mysql_query("SET NAMES utf8, character_set_results = 'utf8', character_set_client = 'utf8', character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8'");

Instead of seeing the correct chars, I get the usual crud you would expect from using the wrong charset in the DB. E.g.:

Propriétés

instead of:

propriétés

Don't even get me started on Russian, Japanese, etc. chars! But then, using UTF-8 should mean that no single language's charset is an issue...

What have I missed? I know it's not the PHP, as the site shows the correct chars from the included translation .php or .htm files; it's only when I am dealing with the MySQL DB that I have these issues. phpMyAdmin shows the entries with the wrong chars, so I assume it's happening when the PHP "writes" to MySQL. I have checked similar questions here on Stack Overflow, but none of the answers (all of which I had already taken care of) give me any clues...

Also, does anyone have thoughts on the speed difference between include $filename and eval(file_get_contents($filename))?

Nick
  • You should use only `SET NAMES utf8` as initial query, as it already sets up `character_set_results`, `character_set_client` and `character_set_connection` options. – galymzhan Apr 06 '12 at 17:52
  • @galymzhan yes, have tried that in several ways... so you suggest: mysql_query("SET character_set_results = 'utf8', character_set_client = 'utf8', character_set_connection = 'utf8', character_set_database = 'utf8', character_set_server = 'utf8'"); – Nick Apr 06 '12 at 17:54
  • @galymzhan tried that, no effect... – Nick Apr 06 '12 at 17:56

3 Answers

Here is all you need to make sure those chars display correctly:

/* HTTP charset */
header("Content-Type:text/html; charset=UTF-8");

/* Set MySQL communication encoding */
mysql_set_charset("UTF8");

You also need to set the DB encoding to the correct one, as well as each table's encoding AND each field's encoding.

Last but not least, your PHP file's encoding should also match.
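
The mysql_* functions used above were removed in PHP 7, so on a newer setup the same step would have to go through mysqli or PDO. A minimal sketch with mysqli, assuming that extension is available; the helper name connectUtf8 and its parameters are hypothetical and not from this thread:

```php
<?php
/**
 * Hypothetical helper (not from the thread): open a mysqli connection and
 * set the connection charset before any other query runs.
 */
function connectUtf8(string $host, string $user, string $pass, string $db): mysqli
{
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $link = new mysqli($host, $user, $pass, $db);

    // Same role as mysql_set_charset("UTF8"): sets the connection charset
    // and tells the client library about it, unlike a raw SET NAMES query.
    $link->set_charset('utf8');

    return $link;
}
```

It would be called once, right after connecting, e.g. connectUtf8('localhost', 'user', 'secret', 'mydb') with your own credentials in place of these placeholders.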

Dany Khalife

You say that you are seeing "the usual crud you would expect using the wrong charset". But that crud is in fact what you get from calling utf8_encode() on a string that is already UTF-8, so chances are you are not using the "wrong encoding" anywhere; you are encoding into UTF-8 more times than needed.
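
To illustrate the double-encoding effect, here is a small sketch. It assumes the mbstring extension is loaded and uses mb_convert_encoding() from ISO-8859-1 as a stand-in for utf8_encode(); the variable names are only for illustration:

```php
<?php
// "Propriétés" as UTF-8 bytes, written with escapes so this file's own
// encoding does not affect the result.
$original = "Propri\xC3\xA9t\xC3\xA9s";

// The mistake: treat the UTF-8 bytes as ISO-8859-1 and encode to UTF-8 again.
// Each 2-byte "é" (c3 a9) becomes 4 bytes (c3 83 c2 a9).
$double = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Undo exactly one round of the extra encoding.
$repaired = mb_convert_encoding($double, 'ISO-8859-1', 'UTF-8');

echo bin2hex($original), "\n"; // 50726f707269c3a974c3a973
echo bin2hex($double), "\n";   // 50726f707269c383c2a974c383c2a973
var_dump($repaired === $original); // bool(true)
```

Comparing the hex dumps of a stored value against the expected bytes is one way to see how many extra rounds of encoding were applied.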

You may want to take a look at a library I made to fix this kind of problem:

https://stackoverflow.com/a/3521340/290221

Sebastián Grignoli
  • Have just tried implementing your brilliant class. Now I see this as the "crud" output: Propriétés. Note that it is different to the original "crud". I think you are correct; could you deduce from the 2 examples how many times this might have been encoded? FYI I am not running any utf8_encode() – Nick Apr 06 '12 at 18:41
  • Scrub that last comment! This works perfectly! Excellent PHP class! Thank you! – Nick Apr 06 '12 at 18:49
  • Beware! Use it to fix your strings (ideally, only once), and try to set your environment in a way that whatever you put into your database comes out the exact same way. – Sebastián Grignoli Apr 06 '12 at 19:13

There is a mysql_set_charset('utf8'); function in PHP's mysql extension for that. Call it right after connecting, before running any other query.

Starx