43

I use the PDO library with a MySQL database in PHP, but if I insert any data encoded in UTF-8, like Arabic words, it’s inserted into the database, but as ?????????.

In my own framework, after I create the PDO connection, I send two queries – SET NAMES utf8 and SET CHARACTER SET utf8. It still doesn’t work.

Example:

loadclass('PDO', array(
    sprintf(
        'mysql:host=%s;port=%s;dbname=%s',
        confitem('database', 'host'),
        confitem('database', 'port'),
        confitem('database', 'name')
    ),
    confitem('database', 'username'),
    confitem('database', 'password'),
    array('PDO::ATTR_PERSISTENT' => confitem('database', 'pconnect'))
));
$this->query('SET NAMES ' . confitem('database', 'charset'));
$this->query('SET CHARACTER SET ' . confitem('database', 'charset'));

Workaround: Use the json_encode function to convert data before inserting it to the database, and use json_decode to decode it after fetching. This is how I do it now.

Peter Mortensen
  • 30,738
  • 21
  • 105
  • 131
Jason4Ever
  • 1,439
  • 4
  • 23
  • 43

5 Answers5

149

Warning: This answer applies to PHP 5.3.5 and lower. Do not use it for PHP version 5.3.6 (released in March 2011) or later.

Compare with Palec's answer here.


Use:

$pdo = new PDO( 
    'mysql:host=hostname;dbname=defaultDbName', 
    'username', 
    'password', 
    array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8") 
); 

It forces UTF-8 on the PDO connection. It worked for me.

hakre
  • 193,403
  • 52
  • 435
  • 836
shark555
  • 2,562
  • 1
  • 20
  • 10
  • 2
    if you have PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES 'UTF8'' -- make sure you do not have added : ->exec("SET CHARACTER SET utf8") ... – Jeffz May 23 '15 at 01:20
  • You have to enable it using extension=intl or extension=php_intl.dll – M at Jul 15 '19 at 08:17
  • Nice. If you're getting NULL results in PHP but perfectly fine results in your MySQL client using the same SQL it's likely this. Problem solved the moment I added your solution . – Art Geigel Mar 29 '21 at 22:57
  • 1
    See the answer from Palec, charset=utf8 usually is the better method. – netAction Apr 14 '21 at 14:03
  • I now also put a warning in there. It might have been good in 2011, but not the last couple of years. /cc @ArtGeigel – hakre Jul 18 '21 at 15:06
  • What about php 8.1, there I get "Uncaught TypeError: Attribute value must be of type int for selected attribute, string given" for "array(PDO::MYSQL_ATTR_INIT_COMMAND => 'SET NAMES utf8')". – LarS Jan 20 '23 at 10:40
73

You have to set the correct character set for the connection. Add the charset=utf8 option to the DSN (this is MySQL-specific!)

$pdo = new PDO(
    'mysql:host=hostname;dbname=defaultDbName;charset=utf8',
    'username',
    'password'
);

However, in PHP versions before 5.3.6 (released in March 2011), you must use a workaround as the charset option in the DSN is not supported.

$pdo = new PDO(
    'mysql:host=hostname;dbname=defaultDbName',
    'username',
    'password',
    array(PDO::MYSQL_ATTR_INIT_COMMAND => "SET NAMES utf8")
);
hakre
  • 193,403
  • 52
  • 435
  • 836
Palec
  • 12,743
  • 8
  • 69
  • 138
  • 5
    Wanted to update the existing, top-voted answer, but [my edit was rejected](https://stackoverflow.com/review/suggested-edits/3904725). Therefore posting it as a new answer. – Palec Jan 27 '14 at 05:46
  • Thank you! `charset=utf8'` Work for me on Datatable server side – Milad Ghiravani May 08 '14 at 15:03
  • Thanks for this! charset=utf8 was what sorted this very annoying issue for me as well. – Brigante Jun 06 '17 at 11:15
  • Could you just do both, or would that cause side effects? In other words why not just always set the charset in the connection string and pass that SET NAMES option as well? – still_dreaming_1 Apr 19 '19 at 18:03
4

All attempts like:

PDO::MYSQL_ATTR_INIT_COMMAND =>"SET NAMES 'utf8mb4' COLLATE 'utf8mb4_unicode_ci' "

or

$this->connection = new PDO('mysql:host='.DBHOST.';dbname='.DBNAME.';charset=utf8', DBUSER, DBPASS, self::$opt);

or

$this->connection->exec("set names utf8");

still generated unreadable text mess.

In my case, the cause of the problem was: htmlentities used prior to inserting data into a database. Cyrillic letters were destroyed completely.

bbe
  • 344
  • 3
  • 16
  • You can pass an encoding of 'UTF-8' to htmlentities as the third parameter: https://www.php.net/manual/en/function.htmlentities.php – still_dreaming_1 Jun 02 '22 at 12:53
1

Try setting the default_charset value in php.ini to UTF-8. Or you can set it using the ini_set function.

Also, if the input is coming through form submissions, make sure your web pages are set to UTF-8 using the meta tag.

Burak Guzel
  • 1,267
  • 1
  • 12
  • 12
  • hello , ok i will try , but i don't neeed this solve , i want solve it from my script :) ? – Jason4Ever Dec 18 '10 at 08:55
  • yes .. i tested data coming from web pages , yes it coming with utf-8 enconding , and there is a utf-8 encoding meta tag – Jason4Ever Dec 18 '10 at 08:56
  • I just had the same issue on the Windows machine, and this was the solution to the problem. Everything was talking UTF-8... except PHP. – v010dya Jan 26 '14 at 15:11
0

When interacting with mysql or mariadb, charset 'utf8' is not correct. You need to use utf8mb4. Due to historical issues, utf8 in mysql/mariadb is an alias to utf8mb3 which can only contain a subset of utf8 (3 bytes instead of 4) So adding 'charset=utf8mb4' to your PDO DSN connection string is the correct thing to do for recent PHP versions, NOT charset=utf8.

For example, the poop emoji in full color: cannot be stored if the column and PDO are not utf8mb4.

Rubin
  • 1
  • 2