We've recently started receiving the error 'utf8' codec can't decode bytes in position 30-31: unexpected end of data
when collecting data from the database (using Python and SQLAlchemy). We've traced the error to some special Japanese characters.
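For reference, the same message can be reproduced in plain Python by decoding a UTF-8 byte string whose last multi-byte character has been cut short. This is only a sketch of the failure mode, not a diagnosis of our rows: the sample text and the helper name `decode_error_reason` are made up for illustration.

```python
def decode_error_reason(raw: bytes) -> str:
    """Return the UTF-8 decoder's complaint for `raw`, or "" if it decodes cleanly."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.reason
    return ""


# Hypothetical sample: three Japanese characters, 3 bytes each in UTF-8.
complete = "日本語".encode("utf-8")
# Drop the final byte so the last character is left incomplete.
truncated = complete[:-1]

print(repr(decode_error_reason(complete)))
print(repr(decode_error_reason(truncated)))
```

The second call reports the same reason as the error above, which suggests the driver is handed byte strings that end in the middle of a character.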
All tables have CHARSET=utf8
and these are the settings we have on the server when I run show variables;
| character_set_client     | latin1                     |
| character_set_connection | latin1                     |
| character_set_database   | latin1                     |
| character_set_filesystem | binary                     |
| character_set_results    | latin1                     |
| character_set_server     | latin1                     |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |
| collation_connection     | latin1_swedish_ci          |
| collation_database       | latin1_swedish_ci          |
| collation_server         | latin1_swedish_ci          |
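To illustrate why these values matter: if the application sends UTF-8 bytes over a connection the server believes is latin1, the server converts each byte to a separate character before storing it in a utf8 column. The sketch below simulates that in plain Python. It assumes this is how the data was written, which I have not confirmed, and it uses Python's `latin-1` codec as a stand-in for MySQL's latin1 (which is closer to Windows-1252). The helper name is hypothetical.

```python
def store_via_latin1_connection(text: str) -> bytes:
    """Simulate the bytes that land in a utf8 column (hypothetical helper)."""
    sent = text.encode("utf-8")        # bytes the application puts on the wire
    misread = sent.decode("latin-1")   # server reads each byte as one latin1 character
    return misread.encode("utf-8")     # and converts those characters to utf8 for storage


original = "é"
stored = store_via_latin1_connection(original)

print(original.encode("utf-8"))  # what a correctly stored value would look like
print(stored)                    # the double-encoded value
```

Under these assumptions every non-ASCII character takes up twice as many bytes as it should, and reading the column back as utf8 yields text such as "Ã©" instead of "é".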
If we want to move our environment to utf8, which settings are recommended, and how should we export and import our current data so that it works with the new settings?
I've read some posts about exporting the data as latin1
by adding --default-character-set=latin1
to the mysqldump command and then importing it into a database that has the new settings, but since our original tables are already in utf8, this won't work.
I tried changing the connection's character set after reading this thread: SQLAlchemy and UnicodeDecodeError
This stops the application from crashing, but all the old data is now damaged.
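If the damage is the usual double encoding (UTF-8 bytes that were read as latin1 and then stored as utf8), a single value can in principle be repaired by reversing that step. The following is a minimal sketch under that assumption, not a tested migration: `repair_double_encoded` is a hypothetical helper, and real rows may need `cp1252` rather than `latin-1` because MySQL's latin1 is closer to Windows-1252. It would have to be tried on a copy of the data first.

```python
def repair_double_encoded(text: str, wrong_codec: str = "latin-1") -> str:
    """Undo one round of 'UTF-8 read as wrong_codec'; return text unchanged if that fails."""
    try:
        return text.encode(wrong_codec).decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Not double-encoded (or not in the way assumed here): leave it alone.
        return text


# Hypothetical damaged value: what "日本語" looks like after the wrong round trip.
damaged = "日本語".encode("utf-8").decode("latin-1")

print(repair_double_encoded(damaged))
print(repair_double_encoded("日本語"))  # already correct, returned as-is
```

Returning the input unchanged on failure means rows written correctly after the connection change are not touched a second time.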