I have a MySQL database in UTF-8 format, but the characters inside the database are ISO-8859-1 (ISO-8859-1 strings are stored in UTF-8). I've tried recode, but it only converted e.g. ü to ü. Does anybody have a solution?
- The easiest way would be to re-import the data with the correct character set specified. Any way to do that? – Pekka Jun 14 '11 at 11:04
- Here is a duplicate with good answers: [I need help fixing Broken UTF8 encoding](http://stackoverflow.com/questions/1344692/i-need-help-fixing-broken-utf8-encoding) – Pekka Jun 14 '11 at 11:07
2 Answers
I just went through this. The biggest part of my solution was exporting the database to .csv and using Find / Replace on the characters in question. The offending character may look like a space, so copy it directly from the cell to use as your Find parameter.
Once this is done - and missing this is what took me all morning:
- Save the file as CSV ( MS-DOS )
Excellent post on the issue
Source of MS-DOS idea

If you stored ISO-8859-1 characters in a database that is set to UTF-8, you have corrupted your "special characters": MySQL retrieves the bytes from the database and tries to assemble them as UTF-8 rather than ISO-8859-1. The only way to read the data correctly is to use a script that does something like:
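ResultSet rs = ...
// Read the raw bytes instead of letting the driver decode them as UTF-8
byte[] b = rs.getBytes( COLUMN_NAME );
// Reassemble the bytes using the charset they were originally written in
String s = new String( b, "ISO-8859-1" );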
This ensures you get the raw bytes (which, from what you said, came from an ISO-8859-1 string), which you can then assemble back into an ISO-8859-1 string. One other question: what do you use to view the strings in the database? Could it be that your console doesn't have the right charset to display those characters, rather than the characters being stored wrongly?
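To illustrate why the charset used when reassembling the bytes matters, here is a minimal, self-contained sketch. It assumes the column holds single-byte ISO-8859-1 data; the class name `CharsetDemo`, its two helper methods, and the sample bytes are hypothetical and stand in for whatever `rs.getBytes(...)` would return.

```java
import java.nio.charset.StandardCharsets;

public class CharsetDemo {
    // Decode raw bytes as ISO-8859-1: every byte maps to exactly one character.
    public static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Decode the same bytes as UTF-8: malformed sequences are replaced
    // with U+FFFD, so the original character is lost.
    public static String decodeUtf8(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Hypothetical sample: "f", "ü", "r" encoded as ISO-8859-1,
        // where 'ü' is the single byte 0xFC.
        byte[] raw = { 0x66, (byte) 0xFC, 0x72 };

        String asLatin1 = decodeLatin1(raw);
        String asUtf8 = decodeUtf8(raw);

        // Compare code points rather than printed glyphs, since the console
        // charset may itself distort the output.
        System.out.println("ISO-8859-1 decode kept U+00FC: " + (asLatin1.indexOf('\u00fc') >= 0));
        System.out.println("UTF-8 decode produced U+FFFD: " + (asUtf8.indexOf('\uFFFD') >= 0));
    }
}
```

The check is done on code points on purpose: if the console's charset is wrong, printing the strings directly could make correctly stored data look broken.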
NOTE: Updated the above after the last comment

- The database is set to utf-8; the strings stored in the db are iso-8859-1 – niklas Jun 14 '11 at 11:12
- I've just updated the code: it's just a matter of using ISO-8859-1 when re-assembling the bytes into a String. – Liv Jun 14 '11 at 11:16