Due to some unfortunate prior events, one of our legacy databases has columns with `latin1` and `utf8` encoding mixed together. That is, several columns hold data that may be in either encoding, while the column default encoding is `latin1`. Is there a way to detect the encoding and convert everything to `utf8`?

randomor
-
Dump the database, run `sed 's/DEFAULT CHARACTER SET latin1/DEFAULT CHARACTER SET utf8 COLLATE utf8_general_ci/'` on the dump, and import the resulting file. Note that the conversion may be lossy if any of the columns contain characters that are not in *both* character sets. – Amal Murali Oct 21 '14 at 16:47
-
This should help: http://stackoverflow.com/a/1049958/4099592. Run the queries there to find out which encoding is used where, then run these to convert: `ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci; ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;` – slapyo Oct 21 '14 at 16:48
-
Thanks! I'm gonna look into these solutions one by one, also saw this script: http://stackoverflow.com/questions/910793/detect-encoding-and-make-everything-utf-8 – randomor Oct 21 '14 at 16:49
-
There's a script on GitHub, https://github.com/nicjansma/mysql-convert-latin1-to-utf8, which you can use to do the conversion. I found it through Patrick James McDougle's answer at http://stackoverflow.com/questions/9304485/how-to-detect-utf-8-characters-in-a-latin1-encoded-column-mysql – theark Oct 21 '14 at 17:10
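
For the "detect the encoding" part of the question, a common heuristic is to try strict UTF-8 decoding on the raw bytes of each value and fall back to latin1 when that fails. Below is a minimal Python sketch of that idea. It is not taken from any of the linked scripts: the function name `to_utf8_text` and the sample values are hypothetical, and it assumes you can read the raw column bytes (for example over a binary or latin1 connection) without the driver transcoding them.

```python
def to_utf8_text(raw: bytes) -> str:
    """Decode one column value whose bytes may be UTF-8 or latin1 (hypothetical helper)."""
    try:
        # Strict UTF-8 decoding rejects most latin1 byte sequences that
        # contain non-ASCII characters, so success is a strong hint of UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Every byte sequence is valid ISO-8859-1, so this fallback cannot fail.
        return raw.decode("latin-1")


# Illustrative values only: the same word stored in each encoding, plus ASCII.
samples = [
    "café".encode("utf-8"),    # b'caf\xc3\xa9'
    "café".encode("latin-1"),  # b'caf\xe9'
    b"plain ascii",
]
decoded = [to_utf8_text(s) for s in samples]
```

This is only a heuristic: a genuine latin1 value such as `Ã©` is also valid UTF-8 and would be read as `é`, so spot-check the results before writing anything back. MySQL's `latin1` is also closer to Windows-1252 than to ISO-8859-1, so `cp1252` may be a better fallback for bytes in the 0x80-0x9F range.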