We have a software written in perl which retrieves data from a mysql database. For this we use the DBD::mysql interface
We can retrieve all data correctly, the db is UTF8MB4 and the perl application uses UTF-8.
The code to retrieve the sql result is:
use utf8;
use encoding 'utf8';
...
my $dsn = "DBI:mysql:database=mydatabase;mysql_enable_utf8=1";
my $dbh = DBI->connect($dsn, $userid, $password, { mysql_enable_utf8 => 1 } ) or die $DBI::errstr;
...
my $sth = $dbh->prepare("SELECT addressid,
company, firstname, lastname,
address, zip, city, country,
phone, mobile, home,
speeddial_phone, speeddial_mobile, speeddial_home,
fax, email
FROM address
WHERE (firstname like ? or lastname like ? or company like ?)
LIMIT $sizeLimit
");
$sth->execute( $searchExpression, $searchExpression, $searchExpression) or die $DBI::errstr;
As long as the $searchExpression is containing normal characters, it works fine. But as soon as we query with special characters notin ASCII, for example é ö ä ü and similar, we don't get a empty result set back.
According to this post, this is due to a bug in the dbd::mysql drivers before version 4.041_01
http://blogs.perl.org/users/mike_b/2016/12/dbdmysql-all-your-utf-8-bugs-are-belong-to-us.html
I have tested different things, but to no avail.
I did turn on request logging in the mysql server and there I see that the parameters with special characters are coming in in a wrong encoding.
Here the output of the mysql log to file:
Time Id Command Argument
180905 9:17:06 403 Connect inno-ldap-db@localhost on phonebook_innovaphone
403 Query SELECT addressid,
company, firstname, lastname,
address, zip, city, country,
phone, mobile, home,
speeddial_phone, speeddial_mobile, speeddial_home,
fax, email
FROM address
WHERE companyid='1' and (firstname like 'andré%' or lastname like 'andré%' or company like 'andré%' )
LIMIT 25
403 Quit
Since we can't currently upgrade the system (it's debian 7, which includes only older packages like 4.021-1+deb7u3) I would need to implement a work around the problem.
Either some magic to pre-encode/decode the parameters, or would the odbc driver perhaps not experience this bug?