
I have a simple PHP script that searches a MySQL database and outputs the result to the user. I used to use ISO-8859-1 as my charset but was advised to switch to UTF-8, and I'm having trouble moving from the old charset to the new one.

To clarify some things, I have:

  • Created a database and table encoded in UTF-8 with collation utf8_unicode_ci.
  • Encoded my PHP-file in UTF-8.
  • Set meta charset to UTF-8.
  • Set all text mime-types to UTF-8 through create-mime.assign.pl in Lighty (Lighttpd).

The problem arises when I retrieve data containing characters like ö or ü from the database. If I just do echo "ö"; without retrieving it from the database, it works fine. So I guess something must be wrong on the database side?

I've tried the following, and each of them solved my problem:

  • Setting meta charset to ISO-8859-1 (which, for some strange reason, works, but breaks the echo'd "ö").
  • Wrapping the output in utf8_decode().
  • Calling mysql_set_charset('utf8'); after mysql_select_db().

I know that I've found multiple solutions, but I don't understand why it won't work without them. And is it bad practice to use utf8_decode() on output, or the mysql_set_charset() function?

AnonymousJ
  • database connector as well? .. ah, nope ... yeah, you need to set the charset on the connector (`mysql_set_charset` - although this is deprecated and you should at least be using MySQLi), otherwise you'll sort of be converting the utf-8 in the database to iso 8859-1 (the default for the MySQL connector) before passing it to PHP and ultimately the screen... which will then result in '?' – CD001 Apr 22 '13 at 15:41
  • @CD001 Why the hell does the MySQL connector default to ISO-8859-1? Oh well, guess that's the problem then. Will try to use fullybaked's answer below. – AnonymousJ Apr 22 '13 at 15:46
  • Heheh - yeah, I know ... try http://php.net/manual/en/mysqli.set-charset.php or http://uk3.php.net/manual/en/function.mysql-set-charset.php although, the whole of the `mysql_*` functions are deprecated now. – CD001 Apr 22 '13 at 15:55
  • I know that they're deprecated. Just have to find some time to convert my script to use `mysqli_*`. :-P – AnonymousJ Apr 22 '13 at 15:58

1 Answer

MySQL is funny with UTF-8. You need to ensure that the server is running in UTF-8 and that the connection is as well.
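To see why the connection encoding matters, here is a minimal sketch that compares the bytes of "ö" in the two encodings. It assumes the mbstring extension is available, and `to_latin1` is a hypothetical helper name used only for illustration; for this character it does the same job as the utf8_decode() call mentioned in the question.

```php
<?php
// Hypothetical helper: convert a UTF-8 string to ISO-8859-1.
// Assumes the mbstring extension is loaded.
function to_latin1(string $utf8): string
{
    return mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');
}

// "ö" written as explicit bytes, so the example does not depend
// on the encoding this file happens to be saved in.
$utf8 = "\xC3\xB6";

echo bin2hex($utf8), "\n";            // c3b6 (two bytes in UTF-8)
echo bin2hex(to_latin1($utf8)), "\n"; // f6   (one byte in ISO-8859-1)
```

If, as the comments on the question suggest, the connection hands results to PHP as ISO-8859-1, the page receives the single byte f6. That byte is not valid UTF-8 on its own, which is consistent with the broken output on a UTF-8 page. Declaring the page as ISO-8859-1 hides the mismatch, but then the literal "ö" stored in the UTF-8 PHP file (c3b6) is the one that breaks.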

If you can modify the my.cnf file on your server, you can add these to the [mysqld] section and restart the server:

character-set-server = utf8
skip-character-set-client-handshake

You could alternatively (or additionally) use

query("SET NAMES utf8");

before sending or retrieving data, to ensure the database expects UTF-8 data on that connection.
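As the comments on the question point out, the connection charset can also be set from PHP through the connector instead of a raw query. The following is a rough sketch using mysqli, not the setup from this answer: `connect_utf8` is a hypothetical helper, and the host, credentials and database name in the usage lines are placeholders.

```php
<?php
// Hypothetical helper: open a mysqli connection and set its charset.
function connect_utf8(string $host, string $user, string $pass, string $db): mysqli
{
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $link = new mysqli($host, $user, $pass, $db);

    // Set the encoding used for data sent to and received from the server.
    // Unlike a raw "SET NAMES" query, this also informs the client library,
    // which real_escape_string() relies on.
    $link->set_charset('utf8');

    return $link;
}

// Usage (placeholder credentials, adjust to your setup):
// $link = connect_utf8('localhost', 'user', 'secret', 'mydb');
// echo $link->character_set_name(), "\n";
```

Calling `character_set_name()` afterwards is a quick way to check which charset the connection actually ended up with.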

fullybaked
  • Will try to see if it works, if I edit the my.cnf file. By the way, the SET NAMES solution is not [recommended (or that's what I've read in multiple stackoverflow posts)](http://stackoverflow.com/questions/5288953/is-mysql-real-escape-string-broken/5289141#5289141). – AnonymousJ Apr 22 '13 at 15:48
  • There are better ways to achieve the encoding than SET NAMES, but if all else fails (or you are constrained by your hosting provider) it is a possible solution. That's why I included it. My recommendation would be to modify the my.cnf file to properly set up the connection though. – fullybaked Apr 22 '13 at 15:50
  • But would `character-set-server = utf8` and `skip-character-set-client-handshake` be enough? Shouldn't I also do `default-character-set=utf8` in both [mysqld] and [client], and `character-set-client=utf8` under [mysqld]? – AnonymousJ Apr 22 '13 at 16:00
  • We only use the 2 I mentioned and pass UTF8 back and forth just fine. – fullybaked Apr 22 '13 at 16:02
  • The my.cnf change worked for me, thanks! I had utf8 working fully on my local host, but it wouldn't work on the production server. The connector was the missing piece. – Jeremy Goodell Sep 10 '14 at 06:49