15

I have a scenario with two MySQL databases (both UTF-8), a Java Timer Service that synchronizes them (reading from the first and writing/updating the second), and a web application that lets users modify the data loaded into the second database.

All database access goes through iBATIS (but I see the same problem with plain JDBC, PreparedStatements and ResultSets).

When my Java code reads data from the first database, I get characters like 'Ã³' where there should be 'ó'. This data is written unmodified to the second database.

Later, when I view the loaded data in my web application, I see the strange characters despite the <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />.

If I decode the data using ...

new String(data.getBytes("UTF-8"));

... the character displays correctly (ó). But I cannot use this as a general fix, because when I modify data through the web application form, the data is not stored as UTF-8 in my second database (even though the database is UTF-8 and my connection string uses the characterEncoding, characterSetResults and useUnicode parameters).

From my Java code I obtain the following Database settings:

character_set_client-->utf8 
character_set_connection-->utf8 
character_set_database-->utf8 
character_set_filesystem-->binary 
character_set_results-->utf8 
character_set_server-->latin1 
character_set_system-->utf8 
character_sets_dir-->/usr/local/mysql51/share/mysql/charsets/ 

The character_set_server setting can't be changed, and I don't know what I am doing wrong!

How can I read UTF-8 data from MySQL using the JDBC connector (mysql-connector-java-5.1.5-bin.jar)?

Is the problem with reading data from the first database or writing to the second database?

Andrew Tobilko

3 Answers

37

A little late but this will help you:

DriverManager.getConnection(
           "jdbc:mysql://" + host + "/" + dbName 
           + "?useUnicode=true&characterEncoding=UTF-8", user, pass);
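
To check whether these parameters actually take effect, one option is to open a connection with the same URL and print the session's character set variables. The following is only a sketch under assumptions: the class name `Utf8ConnectionSketch`, the helper names, and the `localhost`/`mydb` placeholders are hypothetical, and `printCharsetVariables` needs a reachable MySQL server and the Connector/J driver on the classpath.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class Utf8ConnectionSketch {

    // Builds the same URL as in the answer above (hypothetical helper).
    static String buildUrl(String host, String dbName) {
        return "jdbc:mysql://" + host + "/" + dbName
                + "?useUnicode=true&characterEncoding=UTF-8";
    }

    // Opens a connection and prints the character_set_* variables that the
    // server reports for this session. Requires a running MySQL server.
    static void printCharsetVariables(String host, String dbName,
                                      String user, String pass) throws SQLException {
        try (Connection conn = DriverManager.getConnection(buildUrl(host, dbName), user, pass);
             Statement stmt = conn.createStatement();
             ResultSet rs = stmt.executeQuery("SHOW VARIABLES LIKE 'character_set%'")) {
            while (rs.next()) {
                System.out.println(rs.getString(1) + "-->" + rs.getString(2));
            }
        }
    }

    public static void main(String[] args) {
        // Placeholder host and database name, for illustration only.
        System.out.println(buildUrl("localhost", "mydb"));
    }
}
```

If character_set_client, character_set_connection and character_set_results all report utf8 for the session, the connection itself is probably not where the data gets garbled.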
Doua Beri
  • In a typical application you don't create the connection yourself, so you can't pass this parameter. Is there any other way to change it after reading? – Kulbhushan Singh Feb 13 '17 at 10:04
5

You can set the JVM's file.encoding system property to UTF-8, so that encoding-sensitive APIs that fall back on the platform default charset decode bytes as UTF-8.

For example, you can set it in your command line that launches your Java app:

java -Dfile.encoding=UTF-8 ....
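
To see what the JVM actually ended up with, you can print the property and the resulting default charset at startup. This is a minimal sketch; the class name `DefaultCharsetCheck` and its `describe` helper are made up for illustration.

```java
import java.nio.charset.Charset;

class DefaultCharsetCheck {

    // Reports the file.encoding property and the charset the JVM uses
    // when an API is called without an explicit encoding.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
                + ", defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Note that on Java 18 and later the default charset is UTF-8 unless it is overridden, so the flag mainly matters on older JVMs.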

You can also refer to this SO question for a complete explanation of Tomcat setup.

chburd
5

At some point in the chain, UTF-8–encoded bytes are being decoded with Latin1. From the list of your settings, it appears this is happening at "character_set_server". Without knowing how these values were obtained, it is hard to interpret them.
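
The effect is easy to reproduce in isolation. The sketch below is only an illustration of the mechanism, not a recommended fix: it uses ISO-8859-1 as a stand-in for MySQL's latin1, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class MojibakeDemo {

    // Simulates the fault: text is encoded as UTF-8, but the bytes are
    // then decoded as if they were Latin-1.
    static String misdecode(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the fault: recover the raw bytes via Latin-1, then decode
    // them as the UTF-8 they really were.
    static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String original = "\u00f3";            // 'ó'
        String garbled = misdecode(original);  // two chars: U+00C3 U+00B3
        System.out.println(garbled.length());                   // prints 2
        System.out.println(repair(garbled).equals(original));   // prints true
    }
}
```

Repairing strings after the fact like this only hides the problem; it is better to find the step that decodes with the wrong charset.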

Check the value of the system property "file.encoding". If that is not "UTF-8", then you need to explicitly specify "UTF-8" as the character encoding whenever you decode bytes to characters. For example, when you call a String constructor with a byte[], or use an InputStreamReader.

It is best to explicitly specify character encodings, rather than rely on the default platform encoding.
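
As a rough illustration of the two cases mentioned above (a String constructor with a byte[], and an InputStreamReader), here is a sketch that always names the charset. The class and method names are made up for the example.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ExplicitCharsetDemo {

    // Decodes bytes with an explicit charset instead of the platform default.
    static String decode(byte[] utf8Bytes) {
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    // Reads the first line of a byte stream, naming the charset in the reader.
    static String readFirstLine(byte[] utf8Bytes) {
        try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                new ByteArrayInputStream(utf8Bytes), StandardCharsets.UTF_8))) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0xC3, (byte) 0xB3};  // UTF-8 encoding of 'ó'
        System.out.println(decode(bytes).equals("\u00f3"));         // prints true
        System.out.println(readFirstLine(bytes).equals("\u00f3"));  // prints true
    }
}
```

Both calls give the same result regardless of the file.encoding value the JVM was started with.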

erickson