I'm importing emails from a Gmail account and storing them in a database. Apparently my PHP logic to convert everything to UTF-8 has failed with one particular email and when I run the query with that email I receive the following MariaDB error:
Incorrect string value: '\x96 conv...'
Here is the minimized version of the SQL query:
INSERT INTO mail (subject) VALUES ('orkut – convite...');
I eventually tracked this down to a header subject string. In my general MariaDB query log the string appears as intended with the en dash (–
): orkut – conv...
. When PHP attempts to run the query I get the error Incorrect string value
as mentioned above however when I manually run the exact same query via HeidiSQL the query executes just fine!
The header is split between the key and value and then the value is processed as follows:
$p = explode(': ', $header, 2);
$s = mb_convert_encoding(trim($p[1]), 'UTF-8', mb_detect_encoding($p[1]))
Now I've tried many things and the closest I could get was cmbuckly's answer on another thread however using $s = iconv('utf-8', 'windows-1252', $s);
did not resolve the problem in this circumstance. I have heaps of other attempts like using utf8_encode($s)
afterwards though I'm just at a loss at this point.
- My MariaDB database has the proper
utf8mb4
character set andutf8mb4_unicode_520_ci
database collation. - I am not going to try to use MariaDB to encode/re-encode text, this is explicitly intended as a PHP question.
What is causing MariaDB to not work with the query via PHP though allows the query to work via HeidiSQL? Have I properly decoded the string?