I am trying to convert a PHP website from utf8 to utf8mb4. It works on localhost but fails on the test server.

localhost:

  • PHP/5.3.29 (brew install php53) or PHP/5.6.8 (XAMPP)
  • Apache/2.4.16

test server:

  • PHP/5.6.14
  • nginx/1.6.0

Both connect to the same MySQL database (encoding: utf8mb4, collation: utf8mb4_unicode_ci).
Data tables: encoding utf8, collation utf8_unicode_ci
Some data tables: encoding utf8mb4, collation utf8mb4_unicode_ci

The website uses the CodeIgniter framework. Its current config:

$db['default']['char_set'] = 'utf8';
$db['default']['dbcollat'] = 'utf8_general_ci';

With this config, everything works. When I change the config to:

$db['default']['char_set'] = 'utf8mb4';
$db['default']['dbcollat'] = 'utf8mb4_general_ci';

everything works on my localhost server but not on the test server.
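One way to compare the two servers is to ask MySQL which character sets each connection actually negotiated. The sketch below is only an illustration: `mismatched_connection_charsets` is a hypothetical helper, and the sample rows are made up rather than captured from either server.

```php
<?php
// Hypothetical helper: given rows from SHOW VARIABLES LIKE 'character_set_%',
// return the connection-level variables that differ from the expected charset.
function mismatched_connection_charsets(array $rows, $expected = 'utf8mb4')
{
    // These are the three session variables that SET NAMES changes.
    $relevant = array(
        'character_set_client',
        'character_set_connection',
        'character_set_results',
    );
    $mismatches = array();
    foreach ($rows as $row) {
        if (in_array($row['Variable_name'], $relevant, true) && $row['Value'] !== $expected) {
            $mismatches[$row['Variable_name']] = $row['Value'];
        }
    }
    return $mismatches;
}

// In CodeIgniter the rows could come from:
//   $rows = $this->db->query("SHOW VARIABLES LIKE 'character_set_%'")->result_array();
// The rows below are made-up sample data (assumption), not real server output.
$sampleRows = array(
    array('Variable_name' => 'character_set_client', 'Value' => 'latin1'),
    array('Variable_name' => 'character_set_connection', 'Value' => 'latin1'),
    array('Variable_name' => 'character_set_database', 'Value' => 'utf8mb4'),
    array('Variable_name' => 'character_set_results', 'Value' => 'latin1'),
);
print_r(mismatched_connection_charsets($sampleRows));
```

An empty result would suggest the connection charset is what the config asked for. Any entries point at a connection that did not switch to utf8mb4.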

On the test server, page text loaded from the DB shows ???. When I insert a comment containing an emoji from the page, the comment text displays correctly.

When I connect to the database with the MySQL client Sequel Pro, I find that the comment I just inserted is stored with the wrong encoding (windows-1252). PHP is saving windows-1252 text to a utf8 database.
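This kind of symptom can be reproduced without a database. The sketch below assumes the mbstring extension is available, and the sample string (the UTF-8 bytes of U+1F602 followed by `haha`) is chosen only for illustration. It shows what happens when UTF-8 bytes are treated as windows-1252 and then converted to UTF-8 again.

```php
<?php
// UTF-8 bytes of U+1F602 followed by "haha" (illustrative sample, an assumption).
$utf8 = "\xF0\x9F\x98\x82haha";

// Misread each UTF-8 byte as a windows-1252 character and re-encode as UTF-8.
// This mimics a connection that declares a single-byte charset while sending UTF-8.
$doubleEncoded = mb_convert_encoding($utf8, 'UTF-8', 'Windows-1252');

echo strtoupper(bin2hex($utf8)), "\n";          // bytes as originally sent
echo strtoupper(bin2hex($doubleEncoded)), "\n"; // bytes after the wrong conversion

// Reversing the conversion recovers the original bytes for this sample.
$repaired = mb_convert_encoding($doubleEncoded, 'Windows-1252', 'UTF-8');
var_dump($repaired === $utf8);
```

Comparing the two hex strings makes the double encoding visible: each original byte at or above 0x80 becomes a sequence of two or three bytes.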

T_T, please help me. Is some PHP extension required?

iMumu
  • What are the MySQL versions? – deceze Dec 22 '15 at 08:51
  • The MySQL version is 5.6.16 – iMumu Dec 22 '15 at 08:53
  • Sounds like your webapp is using an incorrect connection encoding. – eggyal Dec 22 '15 at 08:56
  • Look at http://stackoverflow.com/a/279279/476, confirm every step, debug your values by outputting them as `bin2hex($str)` and figure out at which point they are changed how. This doesn't really contain enough information to diagnose anything specifically for you. – deceze Dec 22 '15 at 08:58
  • @deceze Thank you. I will debug it step by step~ – iMumu Dec 22 '15 at 09:25
  • Also do `SELECT col, hex(col) FROM tbl WHERE ...` to see what is in the table already. – Rick James Dec 22 '15 at 17:55
  • This is _not_ a duplicate of _that_ thread. This involves doing `ALTER TABLE ... CONVERT TO utf8mb4` on each table. It _may_ involve dealing with any indexes on `VARCHAR(255)`; let us know if you have any such. – Rick James Dec 22 '15 at 17:56
  • @RickJames Thanks. I tried saving `😂haha` to the db from the website page and then ran `SELECT col, hex(col) FROM tbl WHERE ...`. Local server: `😂haha F09F988268616861`, test server: `ðŸ˜haha C3B0C5B8CB9CC28168616861`. – iMumu Dec 23 '15 at 01:24
  • There are some mistakes at the php level, I think. Both local and test server are connecting to one database, and it works well on local server but not on test server. They have the same code and file encoding(utf-8). – iMumu Dec 23 '15 at 02:01
  • You have `latin1` somewhere on the test server, probably when connecting the client to the mysql server. So, maybe this is a dup thread. – Rick James Dec 23 '15 at 06:09
  • latin1 and cp1252 are virtually the same. – Rick James Dec 23 '15 at 06:25
  • By all means, please post that answer as answer... – deceze Jan 08 '16 at 08:47

1 Answer

I have resolved it.

On the test server, PHP uses the mysql extension without mysqlnd, so the available charsets depend on the definitions in /usr/share/mysql/charsets/. If utf8mb4 is not defined there, the mysql_set_charset function fails: 2019: Can't initialize character set utf8mb4 (path: /usr/share/mysql/charsets/).
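A rough way to check which client library a PHP build uses is to inspect the client info string. The sketch below is an assumption-laden illustration: `uses_mysqlnd` is a hypothetical helper, and the sample strings are made up to resemble typical values rather than taken from the servers in the question.

```php
<?php
// Hypothetical helper: report whether a client info string names mysqlnd.
// With mysqlnd the string normally starts with "mysqlnd"; with libmysqlclient
// it is usually just a version number.
function uses_mysqlnd($clientInfo)
{
    return stripos($clientInfo, 'mysqlnd') !== false;
}

// On PHP 5.x with the legacy mysql extension, the real value would come from:
//   $clientInfo = mysql_get_client_info();
// extension_loaded('mysqlnd') is another check that needs no connection.
// The strings below are illustrative samples (assumption).
var_dump(uses_mysqlnd('mysqlnd 5.0.11-dev - 20120503'));
var_dump(uses_mysqlnd('5.5.46'));
```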

Just use mysql_query('SET NAMES utf8mb4') instead.
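A minimal sketch of that fallback, under some assumptions: `ensure_connection_charset` is a hypothetical helper, and it takes callables so the logic can be run with stubs instead of a live connection. The legacy mysql_* functions only exist on PHP 5.x.

```php
<?php
// Hypothetical helper: try the client-library call first, then fall back to
// a plain SET NAMES query if the library cannot initialize the charset.
function ensure_connection_charset($setCharset, $runQuery, $charset = 'utf8mb4')
{
    if (call_user_func($setCharset, $charset)) {
        return 'set_charset';
    }
    // The charset name is interpolated into SQL, so accept only simple identifiers.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $charset)) {
        throw new InvalidArgumentException('Unexpected charset name');
    }
    return call_user_func($runQuery, 'SET NAMES ' . $charset) ? 'set_names' : 'failed';
}

// With the legacy mysql extension (PHP 5.x only) this might be wired up as:
//   ensure_connection_charset('mysql_set_charset', 'mysql_query');
// Stub callables simulating a client library that lacks the utf8mb4 definition:
$executed = array();
$result = ensure_connection_charset(
    function ($charset) { return false; },
    function ($sql) use (&$executed) { $executed[] = $sql; return true; }
);
echo $result, "\n";
```

One caveat: SET NAMES only changes the server-side session variables. The client library is not told about the new charset, so mysql_real_escape_string keeps assuming the old one.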

iMumu