I have almost the same issue as this one. I have this:

$name = "Ирина";

When I insert it into my DB, I get this: Ирина.

This function:

print_r(mb_detect_encoding($name));

gives me UTF-8.

Next thing:

SHOW VARIABLES LIKE 'char%';

Returns me this:

+--------------------------+----------------------------+
| Variable_name            | Value                      |
+--------------------------+----------------------------+
| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
+--------------------------+----------------------------+

And as the result of:

SHOW CREATE TABLE my_db.main;

I get: ENGINE=InnoDB DEFAULT CHARSET=utf8

The DB was created with this statement:

CREATE DATABASE IF NOT EXISTS my_db DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_general_ci;

When I run the query SET NAMES 'utf8'; first, the data is stored correctly. And, finally, the question: why do I need to execute this query at the beginning of my connection?

P.S. The --skip-character-set-client-handshake option isn't specified for MySQL.

Sardis
  • Try to use utf8_unicode_ci instead of utf8_general_ci – Stefan Cvetkovic Jun 16 '16 at 16:16
  • You need to specify what encoding you'll be sending the data in. If you don't the database will use its default encoding (which may be different than the encoding your table is in). In PHP you should use [`mysqli_set_charset`](http://php.net/manual/en/mysqli.set-charset.php) or with PDO you need to put it as part of the connection string. – apokryfos Jun 16 '16 at 16:32
  • @apokryfos But defaults are set to UTF-8, aren't they? – Sardis Jun 17 '16 at 08:09
  • What is the value of the php config variable [`default_charset`](http://php.net/manual/en/ini.core.php#ini.default-charset)? – Ulrich Thomas Gabor Jun 17 '16 at 09:00
  • @StefanCvetkovic, it isn't working. – Sardis Jun 17 '16 at 09:10
  • @GhostGambler in my php.ini file, here's the "Data Handling" section: `variables_order = "GPCS"` `request_order = "GP"` `register_argc_argv = Off` `auto_globals_jit = On` `post_max_size = 64M` `auto_prepend_file =` `auto_append_file =` `default_mimetype = "text/html"` And there's no info about default_charset. – Sardis Jun 17 '16 at 09:18
  • That's not telling us what the value is. Use [`ini_get`](http://php.net/ini_get) – Ulrich Thomas Gabor Jun 17 '16 at 09:20
  • @GhostGambler it's UTF-8 – Sardis Jun 17 '16 at 09:21
  • You should also read this answer: http://stackoverflow.com/a/2662826/3340665, although it's not the explanation for your case. – Ulrich Thomas Gabor Jun 17 '16 at 09:21
  • You are using mysqli? – Ulrich Thomas Gabor Jun 17 '16 at 09:23
  • @GhostGambler I'm using this [tool](https://github.com/fulldecent/thin-pdo) – Sardis Jun 17 '16 at 09:27
  • You are saying that it doesn't work until you do `set names` which means the default of your mysql server is NOT utf-8. This means that the database tries to interpret UTF-8 as Latin1 (the server charset) and then convert it to UTF-8 to put it in that table. You can see how that could go wrong. – apokryfos Jun 17 '16 at 09:49
  • @apokryfos so in order to get rid of "set names" I have to change character_set_server from latin1 to UTF-8? – Sardis Jun 17 '16 at 09:52
  • Either server or database (probably both). – apokryfos Jun 17 '16 at 09:53
  • This one internally uses the original PDO. I looked around in the source code, but did not find a piece of code which sets the character set (see e.g. [here](https://github.com/php/php-src/blob/PHP-5.4/ext/pdo_mysql/mysql_driver.c#L544)). I think they just call the basic methods of [`mysql.h`](https://dev.mysql.com/downloads/connector/c/). I would have expected it to respect your `character_set_client` setting, but maybe it does not. You could look into the source code of that lib… – Ulrich Thomas Gabor Jun 17 '16 at 09:54
  • @apokryfos but why does `set names` actually work? According to [this page](http://dev.mysql.com/doc/refman/5.7/en/charset-connection.html), the `set` statement changes `character_set_client`, `character_set_results` and `character_set_connection`, which are already 'UTF-8' (as we can see from the `show variables like` query) – Sardis Jun 17 '16 at 11:30
  • Did you execute the `show variables like char` query through your PDO connection? – apokryfos Jun 17 '16 at 11:31

1 Answer

Ирина is Mojibake for Ирина.
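
The effect can be reproduced without a database. Below is a minimal sketch in Python (not the PHP from the question; it is used only because the byte-level steps are short to write). It assumes the bytes were misread as Windows-1252, which is what MySQL calls latin1 and which matches the € and ˜ characters in the garbled output.

```python
name = "Ирина"

# The UTF-8 bytes the client actually sends over the connection.
utf8_bytes = name.encode("utf-8")

# Assumption: the receiving side treats those bytes as cp1252 (MySQL's latin1).
# Each byte then becomes its own character, which is the Mojibake.
mojibake = utf8_bytes.decode("cp1252")
print(mojibake)

# Reversing the misreading recovers the text, provided no byte was dropped
# or replaced along the way.
repaired = mojibake.encode("cp1252").decode("utf-8")
print(repaired)
```

Five Cyrillic letters turn into ten characters because every letter takes two bytes in UTF-8 and each byte is shown as a separate cp1252 character.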

When trying to use utf8/utf8mb4, if you see Mojibake, check the following. This discussion also applies to Double Encoding, which is not necessarily visible.

  • The bytes to be stored need to be utf8-encoded.
  • The connection when INSERTing and SELECTing text needs to specify utf8 or utf8mb4. (new PDO('...;charset=UTF8', ...);)
  • The column needs to be declared CHARACTER SET utf8 (or utf8mb4).
  • HTML should start with <meta charset=UTF-8>.

To check that the data was stored correctly, run SELECT col, HEX(col) FROM .... The hex for the utf8 encoding of Ирина is D098 D180 D0B8 D0BD D0B0. If, instead, you get C390 CB9C C391 E282AC C390 C2B8 C390 C2BD C390 C2B0, then the text was garbled during the INSERT.
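
Both hex strings can be derived rather than memorised. The sketch below is again in Python and again assumes the misreading is cp1252. `hex_per_char` is a hypothetical helper that only mimics the per-character grouping used above; it is not a MySQL or PHP function.

```python
def hex_per_char(text: str) -> str:
    """Return the UTF-8 bytes of text as uppercase hex, one group per character."""
    return " ".join(ch.encode("utf-8").hex().upper() for ch in text)


name = "Ирина"

# What HEX(col) should show when the text was stored correctly.
correct = hex_per_char(name)

# What it shows when the UTF-8 bytes were misread as cp1252 (assumed here)
# and then encoded to UTF-8 a second time on the way into the column.
garbled = hex_per_char(name.encode("utf-8").decode("cp1252"))

print(correct)
print(garbled)
```

Comparing the output of HEX(col) against these two patterns shows whether the stored bytes are fine and only the display is wrong, or whether the stored bytes themselves are damaged.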

Rick James