I have a MySQL database in UTF-8 format, but the characters inside the database are ISO-8859-1 (ISO-8859-1 strings are stored in UTF-8). I've tried recode, but it only converted e.g. ü to ü. Does anybody out there have a solution?

niklas
  • The easiest way would be to re-import the data with the correct character set specified. Any way to do that? – Pekka Jun 14 '11 at 11:04
  • Here is a duplicate with good answers: [I need help fixing Broken UTF8 encoding](http://stackoverflow.com/questions/1344692/i-need-help-fixing-broken-utf8-encoding) – Pekka Jun 14 '11 at 11:07

2 Answers

I just went through this. The biggest part of my solution was exporting the database to .csv and using Find / Replace on the characters in question. The character at issue may look like a space, so copy it directly from the cell to use as your Find parameter.

Once this is done - and missing this is what took me all morning:

  • Save the file as CSV ( MS-DOS )

Excellent post on the issue

Source of MS-DOS idea

Jordan Reddick

If you tried to store ISO-8859-1 characters in a database which is set to UTF-8, you have corrupted your "special characters", because MySQL retrieves the bytes from the database and tries to assemble them as UTF-8 rather than ISO-8859-1. The only way to read the data correctly is to use a script which does something like:

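ResultSet rs = ...
// Read the raw bytes so the driver does not decode them as UTF-8.
byte[] b = rs.getBytes( COLUMN_NAME );
// Reassemble the bytes using the charset they were originally written in.
String s = new String( b, "ISO-8859-1" );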

This ensures you get the raw bytes (which, from what you said, came from an ISO-8859-1 string), and you can then assemble them back into an ISO-8859-1 string. There is another question as well: what do you use to "view" the strings in the database? Could it be that your console doesn't have the right charset to display those characters, rather than the characters being stored wrongly?
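To make the byte-level idea concrete, here is a minimal, self-contained sketch that needs no database. The class and method names (`CharsetDemo`, `decodeLatin1`, `repairDoubleEncoded`) are hypothetical and not part of JDBC or MySQL. The second helper covers a related case that is only an assumption about the asker's data: text that was UTF-8 but got decoded as ISO-8859-1 somewhere along the way, so that "ü" shows up as "Ã¼".

```java
import java.nio.charset.StandardCharsets;

public class CharsetDemo {

    // Decode raw column bytes as ISO-8859-1, mirroring the snippet above.
    public static String decodeLatin1(byte[] raw) {
        return new String(raw, StandardCharsets.ISO_8859_1);
    }

    // Hypothetical repair for the assumed double-encoded case:
    // turn the garbled characters back into their original bytes,
    // then decode those bytes as UTF-8.
    public static String repairDoubleEncoded(String garbled) {
        byte[] originalBytes = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(originalBytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xFC is "ü" (U+00FC) in ISO-8859-1.
        byte[] latin1Bytes = { (byte) 0xFC };
        System.out.println(decodeLatin1(latin1Bytes).equals("\u00FC")); // prints true

        // "\u00C3\u00BC" is "Ã¼", i.e. the UTF-8 bytes of "ü" read as ISO-8859-1.
        System.out.println(repairDoubleEncoded("\u00C3\u00BC").equals("\u00FC")); // prints true
    }
}
```

The example compares strings using Unicode escapes instead of printing the characters, so the result does not depend on the console's own charset. Which of the two helpers applies depends on what bytes are actually stored, so it is worth inspecting them first (for example with MySQL's `HEX()` function) before changing any data.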

NOTE: Updated the above after the last comment

Liv
  • The database is set to UTF-8; the strings stored in the DB are ISO-8859-1. – niklas Jun 14 '11 at 11:12
  • I've just updated the code. It's just a matter of using ISO-8859-1 when re-assembling the bytes into a String. – Liv Jun 14 '11 at 11:16