4

After going through about two dozen posts I'm officially stumped. I have a database with utf8_general_ci collated columns. Using phpMyAdmin I can view the UTF-8 data in the table correctly (at least as far as I can tell). What I want to do seemed simple enough: I have queried for the data in many ways, and I just want to echo the UTF-8 value. These are the bytes I expect to see:

echo bin2hex("more…"); //note "…" is a special character
6d 6f 72 65 e2 80 a6 (Hex Value)

However if I just echo $row->value I get:

6d 6f 72 65 85

Running it through utf8_encode() gives:

6d 6f 72 65 c2 85
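
One possible reading of these two dumps, offered as a sketch and not a diagnosis: if the connection was using latin1 (an assumption; MySQL treats latin1 as cp1252), the ellipsis would arrive as the single byte 0x85. utf8_encode() assumes ISO-8859-1, where 0x85 is a control character, so it produces c2 85 instead of e2 80 a6. The snippet below reproduces both conversions with iconv():

```php
<?php
// Assumption: the connection charset was latin1, which MySQL maps to cp1252.
$fromDb = "more\x85";   // bytes seen when echoing $row->value

// Same interpretation utf8_encode() uses: 0x85 becomes U+0085 (c2 85).
$asLatin1 = iconv('ISO-8859-1', 'UTF-8', $fromDb);

// Reading the byte as cp1252 recovers the ellipsis U+2026 (e2 80 a6).
$asCp1252 = iconv('CP1252', 'UTF-8', $fromDb);

echo bin2hex($asLatin1), "\n"; // 6d6f7265c285
echo bin2hex($asCp1252), "\n"; // 6d6f7265e280a6
```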

Most posts I've read have said to use mysql_set_charset("utf8") but this really screws things up:

6d 6f 72 65 26 61 63 69 72 63 3b 80 26 62 72 76 62 61 72 3b

And finally, using both mysql_set_charset("utf8") and utf8_encode($var):

6d 6f 72 65 26 61 63 69 72 63 3b c2 80 26 62 72 76 62 61 72 3b

I have also tried setting the UTF-8 settings in PHP. GoDaddy makes this a bit more difficult, so I've done it using ini_set. However, mbstring.encoding_translation will not turn on.

// UTF8 settings
ini_set('mbstring.language',            'Neutral');
ini_set('mbstring.internal_encoding',       'UTF-8');
ini_set('mbstring.http_input',          'UTF-8');
ini_set('mbstring.http_output',         'UTF-8');
ini_set('mbstring.encoding_translation',    'On');
ini_set('mbstring.detect_order',        'auto');
ini_set('mbstring.substitute_character',    'long');
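
A possible explanation for that last setting, given as a hedged sketch: the PHP manual lists mbstring.encoding_translation as PHP_INI_PERDIR, meaning it can only be set in php.ini, .htaccess or .user.ini (depending on the server setup), not at runtime. ini_set() returns false when it cannot change a setting, so the return value can be checked:

```php
<?php
// ini_set() returns the old value on success and false on failure.
$result = ini_set('mbstring.encoding_translation', '1');

if ($result === false) {
    // Expected here: a PHP_INI_PERDIR directive cannot be changed at runtime.
    echo "mbstring.encoding_translation cannot be changed with ini_set()\n";
}
```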

Any tips on what I need to do?

mouser58907
  • Have you tried the "checklist" at [How to handle UTF-8 in a web app](http://kunststube.net/frontback)? Show us some code of how exactly you insert and retrieve data from the database. – deceze Apr 11 '12 at 02:23
  • @deceze I have done most of those. I populated the data by copy pasting the characters into PHPMyAdmin. I can run a select hex(field) and I get the correct value from mysql. I don't do anything special for retrieving data from the database either. Is there something else I can test? – mouser58907 Apr 11 '12 at 02:45
  • Does the complete test script in the aforelinked article work for you? – deceze Apr 11 '12 at 02:47
  • @deceze It seems to be working fine. This should help me narrow down my search. – mouser58907 Apr 11 '12 at 03:07

5 Answers

4

My bet is that your actual data might be stored with something other than utf8.

First make sure that your database is properly set, meaning that everything is really stored with UTF-8 encoding.

This is what I have done when facing a similar problem:

Always test in a clean table: create a new database and table for testing purposes, and make sure from the beginning that all data stored in it really is UTF-8 encoded.

Make sure that the database encoding is utf8:

CREATE DATABASE `test` CHARACTER SET `utf8` COLLATE `utf8_general_ci`; 

Make sure that fields containing text are encoded with utf8:

CREATE TABLE `test`
(`id` INT AUTO_INCREMENT PRIMARY KEY,
`name` VARCHAR(512) COLLATE `utf8_general_ci`)
CHARACTER SET `utf8` COLLATE `utf8_general_ci`;

Make sure that the connection used to retrieve data returns unmodified UTF-8 strings.

$connection = mysql_connect( ... );
// Make sure that connection does not change encoding:
mysql_set_charset('utf8', $connection);
// Insert some test data (string literals take single quotes; backticks quote identifiers):
mysql_query("INSERT INTO `test` (`name`) VALUES ('Ab✓cdÄö')", $connection);

After that, try to read the value back and check that it comes out as it should. If it does, you know the problem lies in your existing database, table structure or connection, and that it should be set up the way we just set up the test environment.

If you are using phpMyAdmin, just set everything to utf8 and select a suitable utf8 collation that is the same at every point. Then try to add some data to the tables using phpMyAdmin and try to read it with your PHP application. utf8_general_ci should work well.
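
The "check if it works" step can be made concrete with a byte-level comparison. This is a minimal sketch, not part of the original answer: is_valid_utf8() and hex_dump() are hypothetical helpers, and $fetched stands in for whatever value is read back from the test table.

```php
<?php
// Hypothetical helper: true if $bytes is well-formed UTF-8.
function is_valid_utf8(string $bytes): bool
{
    // With the /u modifier, preg_match() fails on a malformed UTF-8 subject.
    return preg_match('//u', $bytes) === 1;
}

// Hypothetical helper: spaced hex dump, in the style used in the question.
function hex_dump(string $bytes): string
{
    return implode(' ', str_split(bin2hex($bytes), 2));
}

$expected = "Ab\xE2\x9C\x93cd\xC3\x84\xC3\xB6"; // "Ab✓cdÄö" as UTF-8 bytes
$fetched  = $expected; // stand-in for the value read back from the table

var_dump(is_valid_utf8($fetched)); // well-formed UTF-8?
var_dump($fetched === $expected);  // byte-for-byte identical?
echo hex_dump($fetched), "\n";
```

Note that well-formed is not the same as correct: a value such as c2 85 passes the validity check but still differs from the expected bytes, which is why the comparison is done as well.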

Some information here: MySQL Connection Character Sets and Collations

2

With PDO you can easily set the charset, and it also supports prepared statements, transactions, etc. You just have to set the charset when creating the PDO object and there you go.

From the PHP Manual Comments:

$db = new PDO('mysql:host=your_hostname;dbname=your_db;charset=utf8', $user, $pass);
theiNaD
  • It has been `charset=utf8` in my case. You can use `show create table ...` to find the exact charset in your setup. – Enric Mieza Nov 19 '17 at 19:04
1

You can try this:

SET NAMES utf8;
SET CHARACTER SET utf8;

See here, here and here.

Ilia Frenkel
  • According to the PHP documentation, that has been replaced with mysql_set_charset("utf8"). [Source](http://php.net/manual/en/function.mysql-set-charset.php) – mouser58907 Apr 11 '12 at 02:46
0

Thanks Deceze, the culprit ended up being an htmlentities call that needed to be replaced with:

htmlspecialchars($row['col'], ENT_QUOTES, "UTF-8");

In the end I just misread my own code. After all this time it was something so trivial. Frustrating, but glad to have found the solution.
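
For anyone hitting the same symptom, here is a small sketch of why the swap matters. It assumes the original htmlentities call was treating the string as ISO-8859-1 (its default charset before PHP 5.4); under that assumption it reproduces the bytes from the question exactly:

```php
<?php
$value = "more\xE2\x80\xA6"; // "more…" as UTF-8 bytes

// Read as ISO-8859-1, each UTF-8 byte is its own character:
// 0xE2 becomes &acirc;, 0x80 has no entity, 0xA6 becomes &brvbar;.
$mangled = htmlentities($value, ENT_QUOTES, 'ISO-8859-1');

// Declaring the real encoding leaves the multi-byte character untouched.
$intact = htmlspecialchars($value, ENT_QUOTES, 'UTF-8');

echo bin2hex($mangled), "\n"; // 6d6f72652661636972633b80266272766261723b
echo bin2hex($intact), "\n";  // 6d6f7265e280a6
```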

Thanks for all your help.

mouser58907
0

This post explains all aspects of working with UTF-8 in PHP and MySQL:

Hope that helps and saves you time.

Dmitry Pavlov