I am building a Java web app. Everything works on my local Tomcat 7 server. I have a JSP file:

<%@ page language="java" contentType="text/html; charset=UTF-8"
    pageEncoding="UTF-8"%>

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

In this file I submit a form to my servlet via POST, and in the servlet I call:

request.setCharacterEncoding("UTF-8");

This works locally. On the Jelastic Tomcat server, however, it doesn't: the Turkish characters 'ş', 'ğ' and 'ı' are inserted into the MySQL database as '?'.

If I update the cells by hand, the values are shown correctly on the page.

What can I do? I have tried everything I could find on the internet, but nothing changes.

ROOT
  • What your JSP file contains is a *promise* that it uses UTF-8, but are you sure it really does? Is the file encoding in your project set to UTF-8? – Pshemo Jan 25 '15 at 19:19
  • The response encoding is not UTF-8. The request does not deliver the correct characters, but on localhost I fix that in the servlet with the setCharacterEncoding() method. Static strings in the JSP page are displayed correctly. – ROOT Jan 25 '15 at 19:29
  • Also, my project's text file encoding is set to UTF-8 in the application properties. – ROOT Jan 25 '15 at 19:39

2 Answers

Double-check the following settings, making sure every layer knows it's a UTF-8 party.

<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Page Title</title>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
  <meta name="format-detection" content="telephone=no" />
</head>
<body>
your html content goes here....
</body>
</html>

The database tables use the utf8 charset. I don't trust database defaults, which is why the CREATE statements set it explicitly.

CREATE DATABASE mydb DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_swedish_ci;

CREATE TABLE tMyTable (
  id int(11) NOT NULL auto_increment,
  code VARCHAR(20) NOT NULL,
  name VARCHAR(20) NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB DEFAULT CHARACTER SET utf8 DEFAULT COLLATE utf8_swedish_ci;

Let the JDBC connection know about the UTF-8 charset.

<Resource name="jdbc/mydb" auth="Container" type="javax.sql.DataSource"
  maxActive="10" maxIdle="2" maxWait="10000"
  username="myuid" password="mypwd"
  driverClassName="com.mysql.jdbc.Driver"
  url="jdbc:mysql://localhost:3306/mydb?useUnicode=true&amp;characterEncoding=utf8"
  validationQuery="SELECT 1"
/>
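
If the connection is opened from Java code rather than from a container Resource, the same two parameters go on the URL, but with a plain & as separator, since &amp; is only the XML escape. A minimal sketch, assuming the host, port and database name from the example above; the class and method names are made up for illustration:

```java
// Hypothetical helper; it only builds the URL string and does not open a connection.
class JdbcUrlSketch {
    static String mysqlUrl(String host, int port, String database) {
        // In Java source the separator is a plain '&'; "&amp;" is needed only inside XML files.
        return "jdbc:mysql://" + host + ":" + port + "/" + database
                + "?useUnicode=true&characterEncoding=utf8";
    }

    public static void main(String[] args) {
        System.out.println(mysqlUrl("localhost", 3306, "mydb"));
    }
}
```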

Some Tomcat versions don't use the same charset source for GET and POST form requests, so add the useBodyEncodingForURI attribute to make the GET query-string parser obey the setCharacterEncoding value.

<Connector port="8080"
           maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
           enableLookups="false" redirectPort="8443" acceptCount="100"
           debug="0" connectionTimeout="20000"
           disableUploadTimeout="true" useBodyEncodingForURI="true"
/>

The setCharacterEncoding call below must happen before any filter or other code reads parameters from the request, so call it as early as possible.

if (req.getCharacterEncoding() == null)
      req.setCharacterEncoding("UTF-8");

Be careful with whitespace characters in a .jsp page. I use this technique to set multiple directives; note how each closing tag is immediately followed by the next opening tag.

<%@ taglib prefix="c" uri="http://java.sun.com/jsp/jstl/core" %><%@ 
   page contentType="text/html; charset=UTF-8" pageEncoding="ISO-8859-1"
   import="java.util.*, 
             java.io.*"
%><%
   request.setCharacterEncoding("UTF-8");
   String myvalue = "hello all and ÅÄÖ";
   String param = request.getParameter("fieldName");
   myvalue += " " + param;
%><!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <title>Page Title</title>
  <meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1">
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
  <meta name="format-detection" content="telephone=no" />
</head>
<body>
your html content goes here.... <%= myvalue %>
</body>
</html>

The contentType attribute of the JSP page is what gets set on the HTTP response object, while pageEncoding is the encoding of the file on disk. They don't need to match, and I usually use ISO-8859-1 if the page contains only safe US-ASCII characters. Don't save the file as UTF-8 with BOM, because the hidden leading BOM bytes may cause problems in some J2EE servers.
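
To check whether a JSP file was saved with a BOM, one can look at its first three bytes; a UTF-8 BOM is the byte sequence EF BB BF. A minimal sketch with a hypothetical helper class, taking the file path from the command line:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of any servlet or Tomcat API.
class BomCheck {
    // Returns true if the data starts with the UTF-8 byte order mark EF BB BF.
    static boolean hasUtf8Bom(byte[] data) {
        return data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF;
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java BomCheck <file>");
            return;
        }
        byte[] data = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(args[0]
                + (hasUtf8Bom(data) ? " starts with a UTF-8 BOM" : " has no UTF-8 BOM"));
    }
}
```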

The last thing is how you write strings to the response stream: if you convert strings to bytes, make sure the conversion uses UTF-8 and let the client know.

response.setContentType("text/html; charset=UTF-8");
response.getOutputStream().write( myData.getBytes("UTF-8") );
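
Why the charset matters at this step: the Turkish letters from the question have no representation in ISO-8859-1, and String.getBytes replaces unmappable characters with '?'. That would match the symptom in the question, although the lossy conversion could just as well be happening elsewhere, for example in the JDBC driver. A small standalone sketch (the class name is made up for illustration) that needs no servlet container:

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodeSketch {
    // Encodes the text to bytes and decodes it again with the same charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Unicode escapes for ş, ğ, ı so the result does not depend on the source file encoding.
        String text = "\u015F\u011F\u0131";
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // true
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1));         // ???
    }
}
```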

This was a long post, but it covers most of the corner cases.

Whome
  • Thank you for your reply. It is a very good answer, but I tried everything you suggested and it didn't work. – ROOT Jan 26 '15 at 19:04
  • Are you sure the servlet/JSP page gets the proper characters? If you dump the byte[] bytes=mystr.getBytes("UTF-8") array as numbers in the reply, do you see the proper UTF-8 byte sequence for the Turkish letters? – Whome Jan 26 '15 at 19:13
  • The bytes come back like this: [B@3f7d8bac, which I don't understand. And getCharacterEncoding() returns null – ROOT Jan 26 '15 at 19:32
  • See this question on converting bytes to a hex string, then check whether the sequence is right. http://stackoverflow.com/questions/9655181/convert-from-byte-array-to-hex-string-in-java – Whome Jan 27 '15 at 00:37
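
The value [B@3f7d8bac in the comment above is only the default toString() of a byte array, not its content. A minimal sketch of the suggested hex dump, using a hypothetical helper class; for correctly received input, the letters ş, ğ and ı should come out as C5 9F, C4 9F and C4 B1 in UTF-8:

```java
import java.nio.charset.StandardCharsets;

class HexDump {
    // Formats each byte as two uppercase hex digits separated by spaces.
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF)); // mask so negative bytes print as 80..FF
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "\u015F\u011F\u0131"; // ş ğ ı
        System.out.println(toHex(text.getBytes(StandardCharsets.UTF_8))); // C5 9F C4 9F C4 B1
    }
}
```

In a servlet the same helper could be applied to the value returned by request.getParameter to see what actually arrived. A dump of 3F would mean the character had already been replaced by '?' before that point.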

The advice in Whome's answer above to call it as early as possible hit the spot.

protected void doPost(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    if (request.getCharacterEncoding() == null) {
        request.setCharacterEncoding("UTF-8");
    }
    String command = request.getParameter("command");
    ...

works. However,

protected void doPost(HttpServletRequest request, HttpServletResponse response)
        throws ServletException, IOException {
    String command = request.getParameter("command");
    if (request.getCharacterEncoding() == null) {
        request.setCharacterEncoding("UTF-8");
    }
    ...

doesn't work.
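
The likely reason is that the container parses the request parameters once, on the first getParameter call, and the servlet specification says setCharacterEncoding has no effect after that. The following is a toy model of that behaviour, not Tomcat's actual implementation: LazyRequestSketch and its fields are invented for illustration, and ISO-8859-1 stands in for the default body encoding.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical stand-in for a servlet request; not part of any servlet API.
class LazyRequestSketch {
    private final String rawValue; // URL-encoded parameter value as sent by the browser
    private String encoding;       // null until setCharacterEncoding is called
    private String parsed;         // cached on the first getParameter call

    LazyRequestSketch(String rawValue) {
        this.rawValue = rawValue;
    }

    void setCharacterEncoding(String enc) {
        if (parsed == null) {      // too late once the parameter has been parsed
            encoding = enc;
        }
    }

    String getParameter() {
        if (parsed == null) {
            String enc = (encoding != null) ? encoding : "ISO-8859-1";
            try {
                parsed = URLDecoder.decode(rawValue, enc);
            } catch (UnsupportedEncodingException e) {
                throw new IllegalStateException(e);
            }
        }
        return parsed;
    }

    public static void main(String[] args) {
        String raw = "%C5%9F";     // UTF-8 bytes of the letter ş (U+015F)

        LazyRequestSketch early = new LazyRequestSketch(raw);
        early.setCharacterEncoding("UTF-8");
        System.out.println(early.getParameter().equals("\u015F")); // true

        LazyRequestSketch late = new LazyRequestSketch(raw);
        late.getParameter();       // first read fixes the decoding
        late.setCharacterEncoding("UTF-8");
        System.out.println(late.getParameter().equals("\u015F"));  // false
    }
}
```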

Park JongBum