Problem:

When I retrieve values from a MySQL table in my PHP Zend Framework application, the characters ä and ö arrive in my browser looking like this: �.

Any ideas where the problem might be? I have tried setting some properties to UTF-8, but it still happens, so I think I have missed something.

Could you suggest how I can reliably get rid of this? Which configuration settings do I need to change?

Thank you :)

jjepsuomi
  • Can you post your `Zend_DB` factory code? Are you sure that it isn't an escaping problem? Did you set the charset in the HTTP header? – Fabio Mora Jul 31 '12 at 14:57
  • What is the collation of the table this data is being pulled from? Make sure you look at the table and not the database, because tables can easily have a different collation than the database. – p0lar_bear Jul 31 '12 at 14:58
  • My application is in the very beginning at the moment and my database code is very minimal and handles only the setup of the connection so far, but I will post the code I use (P.S. I'm a beginner so bear with me ;D) – jjepsuomi Jul 31 '12 at 15:02
  • require_once(APPLICATION_PATH . "/models/Properties.php"); class Db { public static function conn() { $connProp = Properties::getProperties("database.properties"); $connParams = array("host" => $connProp['host'], "port" => $connProp['port'], "username" => $connProp['username'], "password" => $connProp['password'], "dbname" => $connProp['dbname']); $db = new Zend_Db_Adapter_Pdo_Mysql($connParams); return $db; } } – jjepsuomi Jul 31 '12 at 15:02
  • The collation was latin1_swedish_ci as suspected below :) Thanx everyone – jjepsuomi Jul 31 '12 at 15:14
  • Then please mark an answer correct / upvote the ones that helped you. – Brendan Aug 01 '12 at 18:08

4 Answers

1

When characters in the wrong encoding are sent to the browser, they show up as � or ?. There is an encoding mismatch somewhere:

  • In your browser
  • In the database table collation
  • In the HTML charset
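
To see why a mismatch produces �, here is a minimal PHP sketch. It assumes the mbstring extension is available and that the database hands back "ä" as a single latin1 byte, which is an assumption about this particular setup rather than something confirmed in the question.

```php
<?php
// "ä" as one ISO-8859-1 (latin1) byte, as a latin1 table or connection would deliver it.
$latin1 = "\xE4";

// A lone 0xE4 byte is not a valid UTF-8 sequence, so a browser that reads
// the page as UTF-8 shows the replacement character instead.
$isValidUtf8 = mb_check_encoding($latin1, 'UTF-8');

// Converting from the real source encoding yields the two-byte UTF-8 form.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

var_dump($isValidUtf8);    // bool(false)
echo bin2hex($utf8), "\n"; // c3a4
```

Converting in PHP like this only illustrates the problem; the cleaner fix is to make every layer agree on UTF-8 so that no conversion is needed.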

Both characters can be represented in UTF-8, so verify that your DB table uses the utf8 character set with a collation such as utf8_bin. If you want Swedish sorting rules for UTF-8 data, you might use utf8_swedish_ci.

In your browser, ensure that your content encoding is set to auto-detect.

To create a UTF-8 / utf8_bin table in MySQL, here is an example:

CREATE TABLE `sample` (
  `id` int(11) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE=utf8_bin;

To ALTER an existing table to use UTF-8, use the following command:

ALTER TABLE table_name CONVERT TO CHARACTER SET utf8 COLLATE utf8_bin;

This post has a good explanation of whether or not to use CONVERT TO CHARACTER SET utf8.

Note that CHARSET and COLLATE must both be set; otherwise you will automatically get COLLATE=utf8_general_ci.

For HTML, charset can be set using:

<meta charset="UTF-8">
Brendan
1

I would suggest you have a look at the "website encoding" setting in your browser.

If the website does not declare an encoding, this can happen even when MySQL and PHP are handling the data "correctly" (e.g. the browser falls back to "Western 1252" while the page requires UTF-8/Unicode).

In Firefox you can see the encoding via view -> encoding (or similar)

Najzero
1

You just have to properly set your encoding to UTF-8. Here are the basic steps for a minimal working solution:

  1. Make sure your DB connection uses UTF-8
  2. Make sure your HTTP Content-Type header declares UTF-8, e.g. header('Content-Type: text/html; charset=utf-8'); (you may also specify the encoding in the document itself)
  3. Save your file as UTF-8. On Windows, editors often save files in a legacy code page such as Windows-1252 (CP-1252), which means accented characters are not stored as UTF-8 byte sequences and throw off what is output to the browser. You have to set the encoding explicitly when saving your files. On Linux/Mac this is usually not a problem, as UTF-8 is the default.
  4. Make sure your php.ini does not override the encoding. If you have foreign .htaccess files, check that they do not do so either.
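
Steps 1 and 2 could be sketched in PHP roughly as follows. The `utf8ConnParams` helper is hypothetical, and the top-level `charset` key reflects how I understand the Zend Framework 1 Pdo_Mysql adapter to read its parameters, so check it against your ZF version.

```php
<?php
// Hypothetical helper: copies the connection properties and adds a charset
// entry so the MySQL connection itself uses UTF-8 (step 1).
function utf8ConnParams(array $connProp)
{
    return array(
        'host'     => $connProp['host'],
        'port'     => $connProp['port'],
        'username' => $connProp['username'],
        'password' => $connProp['password'],
        'dbname'   => $connProp['dbname'],
        'charset'  => 'utf8', // assumption: read by the ZF1 Pdo_Mysql adapter
    );
}

// Step 2: declare the encoding to the browser before any output is sent.
ini_set('default_charset', 'UTF-8');
if (!headers_sent()) {
    header('Content-Type: text/html; charset=utf-8');
}

// With Zend Framework 1 loaded, the parameters would be used like this:
// $db = new Zend_Db_Adapter_Pdo_Mysql(utf8ConnParams($connProp));
```

Setting `default_charset` at runtime also guards against a php.ini value that would otherwise override the header (step 4).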

If you handle UTF-8 correctly, you won't need utf8_encode/utf8_decode, htmlentities etc., and you will be able to output accented characters properly without special treatment.

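Note: UTF-8 itself can encode every Unicode character, Thai included; problems with particular languages usually come from locale or collation support rather than from the encoding.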

Yanick Rochon
1

Regarding Zend_Db_Adapter_Pdo_Mysql, make sure that your constructor is using UTF-8, like this:

$your_pdo = Zend_Db::factory('Pdo_Mysql', array(
    'host'     => DATABASE_HOST,
    'username' => DATABASE_USERNAME,
    'password' => DATABASE_PASSWORD,
    'dbname'   => DATABASE_DBNAME,
    'charset'  => 'utf8', // top-level adapter parameter, not an entry under 'options'
));

And before using the data in HTML, escape it:

return htmlspecialchars( $string, ENT_QUOTES, 'UTF-8' );
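
As a small sketch of how that return statement might be wrapped and used (the `escapeHtml` name is hypothetical, and the source file is assumed to be saved as UTF-8):

```php
<?php
// Hypothetical helper wrapping the htmlspecialchars() call shown above.
function escapeHtml($string)
{
    // ENT_QUOTES escapes both single and double quotes; passing 'UTF-8'
    // tells PHP how to interpret the bytes in $string.
    return htmlspecialchars($string, ENT_QUOTES, 'UTF-8');
}

// Accented characters pass through untouched; only HTML-special ones change.
echo escapeHtml('Määrä & "hinta" <b>'), "\n";
// Määrä &amp; &quot;hinta&quot; &lt;b&gt;
```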

Check out all other answers for Content-Type, file encoding and more.

Fabio Mora