I have encountered a scenario where an email from someone in Europe keeps failing to execute. After minimizing the query I've determined that after all special characters like å
and é
are removed the query works fine in PHP / mysqli_query
. The queries also don't work in MariaDB's command line though they do work in HeidiSQL, I imagine whatever HeidiSQL uses it internally adjusts strings used in the Query tabs.
Let's get the following out of the way:
- Database Character Set:
utf8mb4
. - Database Collation:
utf8mb4_unicode_520_ci
. - Database column collation:
utf8mb4_unicode_520_ci
. - The correct query for the request method
SET CHARACTER SET 'utf8mb4'
is being correctly executed.
Here is the query:
INSERT INTO example_table (example_column) VALUES ('Håko');
I should note that I tried the following (which also failed) even though I firmly believe that this issue occurs from and should be resolved via PHP:
INSERT INTO example_table (example_column) VALUES (CONVERT('Håko' USING utf8));
Here is the MariaDB error:
Incorrect string value: '\xE9rard ...'
Like I said this string is originating from an email message so I'm pretty sure that the issue is with PHP, not MariaDB. So let's go backwards to that code that seems to otherwise work. Please keep in mind that this has taken at least two days to put together in the correct order to even get the strings to appear correctly in the MariaDB query log without being incorrectly converted to UTF-8 and corrupting the special Latin characters:
<?php
$s1 = '=?iso-8859-1?Q?=22G=E9rd_Tabt=22?= <berbs@example.com>';//"Gérd Tabt" <berbs@example.com>
if (strlen($s1) > 0)
{
if (substr_count($s1, '=?') && substr_count($s1, '?= '))
{
$p = explode('?= ', $s1);
$p[0] = $p[0].'?=';
$s2 = imap_mime_header_decode($p[0])[0]->text.' '.$p[1];
}
else {$s2 = imap_mime_header_decode($s1)[0]->text;}
if (strpos($s1, '=96') !== false) {$s2 = mb_convert_encoding($s2, 'UTF-8', 'CP1252');}
else if (mb_convert_encoding($s2, 'UTF-8') == substr_count($s1, '?')) {$s2 = mb_convert_encoding($s2, 'UTF-8');}
}
else {$s2 = $s1;}
?>
There isn't any other relevant code handling this header string.
What is causing what I presume to be UTF-8 encoded strings to break PHP's mysqli_query
and the MariaDB command line from working with this query?