I'm connecting to a MySQL database from Python 3.5 and querying VARCHAR values that contain Unicode characters. The behavior seems to have changed when we switched from Python 2.7 to 3.x.
When connecting, I pass charset="utf8" and use_unicode=True. The query results come back as str objects, but with the wrong Unicode characters.
However, I can reproduce the old behavior by setting use_unicode to False and then calling decode('utf-8') on the byte sequences returned by the query.
Also, when I connect via the command line client and select values manually, the results are the same "mangled" values the Python script generates when use_unicode is set to "True".
As an example, here's a "correct" value that I get when use_unicode is False and I decode the byte sequence myself in Python:
Hēroïne
...and here is the string object I get back when use_unicode is True:
HÄroïne
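For reference, the manual decoding step mentioned above can be sketched roughly as follows. The post does not name the driver, so no connection code is shown; `decode_row` is a hypothetical helper, and the sample bytes are assumed to be the UTF-8 encoding of the "correct" value, which is consistent with decode('utf-8') succeeding on them.

```python
def decode_row(row, encoding="utf-8"):
    """Decode every bytes value in a result row; leave other types untouched."""
    return tuple(
        value.decode(encoding) if isinstance(value, (bytes, bytearray)) else value
        for value in row
    )


# Assumed sample: the bytes a driver might hand back with use_unicode=False.
# These are the UTF-8 bytes of the "correct" value shown above.
raw_row = (b"H\xc4\x93ro\xc3\xafne",)

decoded_row = decode_row(raw_row)
print(decoded_row[0])
```

Non-bytes values (integers, None, already-decoded strings) pass through unchanged, so the helper can be applied to whole rows.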
Any thoughts, or pointers on where I should start looking?
EDITED TO ADD:
I can get the command line client to display the values correctly by setting character_set_results to "latin1". When character_set_results is "utf8" I get the mangled version. This is weird, though, because the table and column I'm querying are both configured to use "utf8" as their character set. The database itself, however, looks like it's latin1:
mysql> show variables like 'character_set%';
+--------------------------+-------------------------------------------+
| Variable_name            | Value                                     |
+--------------------------+-------------------------------------------+
| character_set_client     | utf8                                      |
| character_set_connection | utf8                                      |
| character_set_database   | latin1                                    |
| character_set_filesystem | binary                                    |
| character_set_results    | latin1                                    |
| character_set_server     | latin1                                    |
| character_set_system     | utf8                                      |
| character_sets_dir       | /rdsdbbin/mysql-5.6.41.R4/share/charsets/ |
+--------------------------+-------------------------------------------+
8 rows in set (0.04 sec)
mysql> select title from site where title like 'Maison H%';
+------------------+
| title            |
+------------------+
| Maison Hēroïne   |
+------------------+
1 row in set (0.21 sec)
mysql> show create table site;
| site | CREATE TABLE `site` (
...
`title` varchar(255) CHARACTER SET utf8 NOT NULL DEFAULT '',
...
) ENGINE=InnoDB AUTO_INCREMENT=524892 DEFAULT CHARSET=utf8 COLLATE=utf8_bin |
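One hypothesis that would be consistent with the character_set_results observations above (a guess, not a confirmed diagnosis) is that the text was written through a connection declared as latin1 while the client was actually sending UTF-8 bytes, so the utf8 column holds one character per original byte. The pure-Python sketch below simulates that assumed write path and the two result settings. It uses cp1252 as a stand-in for MySQL's latin1, and the exact mangled characters it produces may not match the ones shown in the question.

```python
correct = "Hēroïne"

# What a UTF-8 client sends over the wire.
utf8_bytes = correct.encode("utf-8")

# Assumed write path: the server believes the bytes are latin1 and stores
# one character per byte in the utf8 column (cp1252 approximates MySQL latin1).
stored_text = utf8_bytes.decode("cp1252")

# character_set_results = utf8: the stored characters are sent as UTF-8.
sent_as_utf8 = stored_text.encode("utf-8")

# character_set_results = latin1: the stored characters are converted back
# to single bytes, which happen to be the original UTF-8 bytes.
sent_as_latin1 = stored_text.encode("cp1252")

# A UTF-8 terminal or client then decodes whatever it receives as UTF-8.
print(sent_as_utf8.decode("utf-8"))    # mangled text
print(sent_as_latin1.decode("utf-8"))  # original text
```

If this hypothesis holds, the latin1 result setting only appears correct because it undoes the earlier misinterpretation; the stored data itself would still be double-encoded.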