
I have tried every solution I could find on Stack Overflow, but none of them works for me (maybe I am doing something wrong). I recently upgraded my infrastructure from PHP 5 to PHP 7, and that is when the problem started. The old infrastructure still displays all Chinese characters correctly, but on PHP 7 the output is mostly question marks with only a few Chinese characters, e.g. 广?????运货运代?????????????????司

MariaDB

'character_set_client','utf8'
'character_set_connection','utf8'
'character_set_database','utf8'
'character_set_filesystem','binary'
'character_set_results','utf8'
'character_set_server','utf8'
'character_set_system','utf8'
'character_sets_dir','c:\\mariadb\\share\\charsets\\'

MariaDB Table Data

'4181','é“甲兵户外','TB0001',NULL,'2016-06-04 18:21:35',NULL,NULL
'4188','é“甲兵户外','TB0001',NULL,'2016-06-04 18:24:20',NULL,NULL
'4221','é“甲兵户外(TB0001)','TB0001',NULL,'2016-06-05 05:09:49','2016-08-24 06:54:57',NULL
'204424','广州凌è¿è´§è¿ä»£ç†æœåŠ¡æœ‰é™å…¬å¸',NULL,NULL,'2019-07-09 00:13:43','2020-02-19 10:08:21',NULL

MariaDB Table Definition

CREATE TABLE `companies` (
  `entity_id` int(11) NOT NULL,
  `name` varchar(100) NOT NULL,
  `reg_no` varchar(30) DEFAULT NULL,
  `website_url` varchar(100) DEFAULT NULL,
  `created` datetime NOT NULL,
  `updated` datetime DEFAULT NULL,
  `external_id` varchar(100) DEFAULT NULL,
  PRIMARY KEY (`entity_id`),
  KEY `fk_companies_1_idx` (`entity_id`),
  FULLTEXT KEY `ft_1` (`reg_no`),
  CONSTRAINT `FK_8244AA3A81257D5D` FOREIGN KEY (`entity_id`) REFERENCES `entities` (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

PHP 7 Input

header('Content-type: text/html; charset=utf-8');

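$conn = new mysqli($host, $username, $password, $dbname);
// `new mysqli` always returns an object, so check connect_error explicitly
if ($conn->connect_error) {
    die("Connect failed: " . $conn->connect_error . "\n");
}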

$query = "SELECT * FROM companies where entity_id = 4188";
$result = $conn->query($query);
$row = mysqli_fetch_assoc($result);

$name = $row["name"];
echo "\n";
echo $name;
echo "\n";
echo utf8_decode($name);
echo "\n";
echo iconv('UTF-8', 'ISO-8859-1', $name);
echo "\n";
echo mb_convert_encoding($name, 'ISO-8859-1', 'UTF-8');
echo "\n";
echo  utf8_decode($name);

PHP 7 Output


é“甲兵户外
??????????????

??????????????
??????????????

This is my old infra

PHP 5 Input

header('Content-type: text/html; charset=utf-8');
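$conn = new mysqli($host, $username, $password, $dbname);
// `new mysqli` always returns an object, so check connect_error explicitly
if ($conn->connect_error) {
    die("Connect failed: " . $conn->connect_error . "\n");
}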

$query = "SELECT * FROM companies where entity_id = 4188";//204424";
$result = $conn->query($query);
$row = mysqli_fetch_assoc($result);
$name = $row["name"];
echo $name;

PHP 5 Output

铁甲兵户外

Both environments use the same database, yet the old infrastructure prints the name correctly without any UTF-8 or Latin-1 conversion.

  • In MariaDB, switch from utf8 (3 bytes per character by default) to utf8mb4 (4 bytes per character). The table definition is important too: `show create table companies`. – danblack Mar 03 '20 at 02:47

1 Answer


é“甲兵户外 is Mojibake for 铁甲兵户外

Mojibake occurs when something is incorrectly indicating latin1 (or some wrong character set).
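
As a rough illustration of the mechanism (not a fix), the sketch below reproduces this kind of double encoding in plain PHP. It assumes the script file itself is saved as UTF-8 and that the mbstring extension is loaded; the sample string is the name from the question.

```php
<?php
// Illustration only: how UTF-8 text turns into Mojibake when its bytes
// are wrongly labelled as latin1 (ISO-8859-1) and re-encoded as UTF-8.
// Assumes this file is saved as UTF-8 and mbstring is available.

$original = "铁甲兵户外";

// Each UTF-8 byte is treated as one ISO-8859-1 character and encoded again.
$mojibake = mb_convert_encoding($original, 'UTF-8', 'ISO-8859-1');

// Undoing the mislabelled step restores the original bytes.
$recovered = mb_convert_encoding($mojibake, 'ISO-8859-1', 'UTF-8');

echo strlen($original), " bytes: ", bin2hex($original), "\n";
// All 15 original bytes are >= 0x80, so each one becomes 2 bytes here.
echo strlen($mojibake), " bytes: ", bin2hex($mojibake), "\n";
echo $recovered === $original ? "round trip OK\n" : "round trip failed\n";
```

The doubled byte count is the telltale sign: text that should take 3 bytes per Chinese character takes 6.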

For Chinese, you need utf8mb4, not simply utf8.

Do not use any encoders/decoders; they only make things worse.
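
A minimal sketch of what that could look like on the PHP side, assuming the mysqli extension from the question. The helper name `open_utf8mb4()` is hypothetical, and the credentials are whatever the application already uses.

```php
<?php
// Sketch: open a mysqli connection that declares utf8mb4 explicitly.
// open_utf8mb4() is a hypothetical helper, not part of mysqli.

function open_utf8mb4(string $host, string $user, string $pass, string $db): mysqli
{
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $conn = new mysqli($host, $user, $pass, $db);

    // State the client encoding rather than relying on a default,
    // which can differ between PHP builds and client libraries.
    $conn->set_charset('utf8mb4');

    return $conn;
}
```

With a connection opened this way, the value would be echoed as-is (`echo $row["name"];`) with no `utf8_decode`, `iconv`, or `mb_convert_encoding` step. If a hex dump of the column shows the stored bytes are already double-encoded, the connection setting alone will not repair them.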

For debugging, use hex. In MySQL, use SELECT col, hex(col) ...
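
One way to run that check from PHP is sketched below. It uses the table and column from the question; `fetch_name_hex()` is a hypothetical helper, and `get_result()` assumes the mysqlnd driver.

```php
<?php
// Sketch: compare the bytes stored in the table with the bytes PHP receives.
// fetch_name_hex() is a hypothetical helper; get_result() needs mysqlnd.

function fetch_name_hex(mysqli $conn, int $entityId): ?array
{
    $stmt = $conn->prepare(
        'SELECT name, HEX(name) AS name_hex FROM companies WHERE entity_id = ?'
    );
    if ($stmt === false) {
        throw new RuntimeException('prepare failed: ' . $conn->error);
    }
    $stmt->bind_param('i', $entityId);
    $stmt->execute();

    $row = $stmt->get_result()->fetch_assoc();
    if (!$row) {
        return null; // no such row
    }

    return [
        'server_hex' => strtolower($row['name_hex']), // bytes stored in the column
        'client_hex' => bin2hex($row['name']),        // bytes delivered to PHP
    ];
}
```

For a correctly stored 铁, `server_hex` should start with `e99381`. A value starting with `c3a9` suggests the text was double-encoded when it was written. If `server_hex` and `client_hex` differ, the connection character set is probably not the same as the column's.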

More on "best practice", Mojibake, etc: Trouble with UTF-8 characters; what I see is not what I stored

Rick James