UTF-8 not working in Servlets with MySQL via JDBC

Question

I have a Java servlet running on a Tomcat server, with a MySQL database connection via JDBC. If I include the following line of code, the hard-coded HTML is displayed correctly, but everything that comes from the database is displayed incorrectly.

response.setContentType("text/html;charset=UTF-8");


If I remove the line, the text from the database is displayed correctly, but the basic HTML is not.

In the database and in Eclipse, everything is set to UTF-8.

What if you specify encoding in DB connection URL jdbc:mysql://127.0.0.1:2222/theDB?characterEncoding=utf8? — StanislavL, Mar 06 '15 at 11:18
Are you sure your HTML files themselves are properly encoded? You don't tell where the HTML comes from so that's hard to tell — fge, Mar 06 '15 at 11:23
Please refer to this thread for setting requests with UTF8 encoding http://stackoverflow.com/questions/11002827/html-forms-and-jsp-problems-with-utf-8-character-encoding — zawhtut, Mar 06 '15 at 12:13
@eggyal: for a Java EE geared article, see [Unicode - How to get the characters right?](http://balusc.blogspot.nl/2009/05/unicode-how-to-get-characters-right.html). In OP's particular case it's likely a broken JDBC connection setting, which is pretty classic in case of MySQL as its JDBC driver doesn't use the server-specified encoding, but instead the client-specified one. — BalusC, Mar 06 '15 at 12:23

score 2 · Answer 1 · answered Mar 06 '15 at 11:40

At first sight it looks as if the text from the database is being converted one time too many.

So the first check is the database: is the data stored correctly, and is it read correctly? For instance, the length of "löl" should be 3. As @StanislavL mentioned, it is not only the database that needs the proper encoding; with MySQL, the Java driver that communicates with it also needs to be told the encoding, with ?useUnicode=yes&characterEncoding=UTF-8. Maybe write or debug a small piece of code that reads from the database.
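Such a check could look roughly like the sketch below. It assumes the host, port and database name from the comment above; the class name, the helper names, and the user, password and query passed on the command line are made up for illustration. It prints the length and code points of every value in the first column, so a correctly read "löl" shows up as length=3 with U+00F6 in the middle.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;

class DbEncodingCheck {
    // Hypothetical helper: appends the driver-side encoding parameters to a JDBC URL.
    static String withUnicodeParams(String baseUrl) {
        String separator = baseUrl.contains("?") ? "&" : "?";
        return baseUrl + separator + "useUnicode=yes&characterEncoding=UTF-8";
    }

    // Renders the length and the code points of a value, so mojibake is visible
    // even when the console itself has encoding problems.
    static String describe(String value) {
        StringBuilder sb = new StringBuilder("length=" + value.length() + ":");
        value.codePoints().forEach(cp -> sb.append(String.format(" U+%04X", cp)));
        return sb.toString();
    }

    public static void main(String[] args) throws SQLException {
        // URL taken from the comment on the question; adjust to your setup.
        String url = withUnicodeParams("jdbc:mysql://127.0.0.1:2222/theDB");
        if (args.length < 3) {
            System.out.println("usage: java DbEncodingCheck <user> <password> <query>");
            System.out.println("would connect to: " + url);
            return;
        }
        // Requires the MySQL JDBC driver on the classpath.
        try (Connection con = DriverManager.getConnection(url, args[0], args[1]);
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(args[2])) {
            while (rs.next()) {
                String value = rs.getString(1);
                System.out.println(value == null ? "NULL" : describe(value));
            }
        }
    }
}
```

If a value that should be "löl" comes back with length 4 or 5, it was already converted once too often before it reached the servlet.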

If the data is stored correctly, the culprit might be String.getBytes() or new String(bytes), both of which use the JVM's default charset when none is specified.
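To illustrate what such a conversion does, here is a small sketch (class and method names are mine, not from the question). It decodes UTF-8 bytes as ISO-8859-1 on purpose, which produces the typical "Ã¶" garbage, and then reverses the mistake. Passing the charset explicitly avoids depending on the JVM default.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetRoundTrip {
    // Simulates the classic bug: UTF-8 bytes decoded with a single-byte charset.
    static String misdecode(String s) {
        return new String(s.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Reverses misdecode(): recovers the bytes and decodes them as UTF-8.
    static String repair(String s) {
        return new String(s.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The charset used by getBytes() and new String(bytes) without arguments.
        System.out.println("default charset: " + Charset.defaultCharset());

        String original = "l\u00F6l";          // "löl"
        String broken = misdecode(original);
        System.out.println(original.length()); // 3
        System.out.println(broken.length());   // 4, the ö became two characters
        System.out.println(repair(broken).equals(original)); // true
    }
}
```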

In the browser, inspect the encoding or save the pages. Then inspect the HTML with a programmer's editor like Notepad++ or jEdit. These tools allow reloading a file with a different encoding, so you can see what the encodings are.

I would expect the first page to be in UTF-8 and the second in Windows-1252 or something else.
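If you prefer to check a saved page programmatically, something like the following sketch might do (the class name and the file argument are hypothetical). It uses a strict decoder that reports malformed input instead of silently replacing it. Note that a page containing only ASCII passes as UTF-8 too, so test with a page that actually contains an umlaut.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

class Utf8Probe {
    // Returns true if the bytes form a valid UTF-8 sequence.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java Utf8Probe <saved-page.html>");
            return;
        }
        byte[] page = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(isValidUtf8(page) ? "valid UTF-8" : "not UTF-8 (maybe Windows-1252)");
    }
}
```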

Ensure that the HTML source text is correct: you might use "\u00FC" for ü in a JSP.
