
Possible Duplicate:
UTF-8 encoding and http parameters

I have a JSP that declares UTF-8 in its page directive and in its meta header (the file itself is also saved as UTF-8), with a form inside that page:

<?xml version="1.0" encoding="UTF-8" ?>
<%@ page language="java" contentType="text/html; charset=UTF-8"  pageEncoding="UTF-8"%>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>   <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> </head>
<body>
This is a funny German character: ß
<form action="utf.do" method="post">
<input type="text" name="p" value="${p}" />
<input type="submit" value="OK"/>
</form>
</body>
</html>

Then I have a nice Spring-backed @Controller on the backend:

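import org.springframework.stereotype.Controller;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.servlet.ModelAndView;

@Controller
public class UTFCtl {
    // Echoes the submitted parameter "p" back into the "utf" view.
    @RequestMapping("/utf.do")
    public ModelAndView handleUTF(@RequestParam(value="p", required=false) String anUTFString) {
        ModelAndView ret = new ModelAndView("utf");
        ret.addObject("p", anUTFString);
        return ret;
    }
}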

As you can see, the form sends its data via POST. Typing German umlauts into the form field yields garbled characters at the backend: submitting hähöhü in the form field produces hÃ¤hÃ¶hÃ¼ as the value. I used the debugger, and the variable's value is already scrambled when it reaches the controller, meaning that either Spring/Tomcat/the servlet layer hasn't detected the encoding correctly or the browser didn't encode my input correctly. My colleagues' usual response is: encode in ISO-8859-1 for German, or encode using JavaScript before transmitting. This shouldn't be necessary, should it? I mean, this is 2011, and that's exactly what UTF-8 is good for!

[EDIT] I think this proves that the input arrives decoded as ISO-8859-1 even though I asked for UTF-8:

byte[] in = anUTFString.getBytes("iso-8859-1");
String out = new String(in,"UTF-8");

out is then displayed correctly in the JSP!
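
The round trip in the two lines above can be reproduced outside the container. This is only a minimal sketch, not Tomcat's actual code path: the class MojibakeDemo and its helpers are made up for illustration, it assumes the UTF-8 bytes sent by the browser were decoded as ISO-8859-1 somewhere on the server, and it uses StandardCharsets, which needs a newer Java than the one from 2011.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not part of the application in the question.
public class MojibakeDemo {
    // Simulates the suspected fault: UTF-8 bytes read as ISO-8859-1.
    public static String garble(String original) {
        byte[] utf8 = original.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, StandardCharsets.ISO_8859_1);
    }

    // Reverses the fault, same idea as the two-line workaround above.
    public static String repair(String garbled) {
        byte[] raw = garbled.getBytes(StandardCharsets.ISO_8859_1);
        return new String(raw, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Unicode escapes keep the demo independent of the source file encoding.
        String typed = "h\u00e4h\u00f6h\u00fc"; // hähöhü
        String garbled = garble(typed);
        System.out.println(garbled.length());              // 9: each umlaut became two chars
        System.out.println(repair(garbled).equals(typed)); // true
    }
}
```

The repair only works because ISO-8859-1 maps every byte to exactly one character, so no information is lost in the wrong decoding step. It is a diagnostic, not a fix; the encoding should be set correctly before the parameters are read.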

I'm using Spring 2.5 on Tomcat 5.5 with Firefox 4 beta 11 on a Windows XP SP3 box. I already set URIEncoding="utf-8" on the Connector in Tomcat's server.xml, but that doesn't change anything. I analysed the Firefox transmissions using Firebug, and it seems to transmit UTF-8. I also checked the current Spring WebMVC setup, and as far as I can tell nothing else changes the encoding anywhere, neither in the config nor in the web.xml (no listeners, nothing). I have read and understood most of the UTF-8 related docs, and I worked like this in a PHP environment without any problems (simply switch PHP to UTF-8, done).

Stefan
  • http://ibnaziz.wordpress.com/2008/06/10/spring-utf-8-conversion-using-characterencodingfilter/ was really helpful and fixed the problem for once. Seen that earlier already, thanks for pointing me there again. – Stefan Feb 21 '11 at 09:59

1 Answer

So it is indeed a matter of the server settings, too. Please note the duplication comment beneath the question. You have to tell both your server and your deployment to use UTF-8, and then everything is fine (pretty much like in PHP). Note that I'm repeating the answer from http://ibnaziz.wordpress.com/2008/06/10/spring-utf-8-conversion-using-characterencodingfilter/ here.

This works in a Tomcat environment:

Edit the Connectors in your Tomcat's server.xml so that request URIs (including query strings) are decoded as UTF-8:

<Connector URIEncoding="utf-8" port="8080" blabla="blabla" ... >

Then add to your web.xml:

<filter>
    <filter-name>encodingFilter</filter-name>
    <filter-class>org.springframework.web.filter.CharacterEncodingFilter</filter-class>
    <init-param>
        <param-name>encoding</param-name>
        <param-value>UTF-8</param-value>
    </init-param>
    <init-param>
        <param-name>forceEncoding</param-name>
        <param-value>true</param-value>
    </init-param>
</filter>

<filter-mapping>
    <filter-name>encodingFilter</filter-name>
    <url-pattern>/*</url-pattern>
</filter-mapping>

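This registers Spring's CharacterEncodingFilter for all kinds of requests (/*). The filter sets the request character encoding to UTF-8 before the parameters are read, which is what fixes the POSTed form data. Together with the URIEncoding setting on the Connector, you can even have links in the format ?q=äöüß, which will be transported correctly. It is still better to encode parameters for request transport: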

URLEncoder.encode(aParameterWithUmlaut,"UTF-8")
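
As a small self-contained sketch of what that call produces, and of why the decoding side has to use the same charset: the class QueryParamDemo and the sample value are hypothetical, and URLDecoder merely stands in for the container's own parameter parsing, which this does not reproduce exactly.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical demo class; names are for illustration only.
public class QueryParamDemo {
    // Percent-encodes a parameter value as UTF-8, as in the call above.
    public static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // Decodes with an explicit charset; stands in for the server side.
    public static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String value = "\u00e4\u00f6\u00fc\u00df"; // äöüß
        String encoded = encode(value);
        System.out.println("?q=" + encoded); // ?q=%C3%A4%C3%B6%C3%BC%C3%9F
        System.out.println(decode(encoded, "UTF-8").equals(value));      // true
        System.out.println(decode(encoded, "ISO-8859-1").equals(value)); // false
    }
}
```

The encoded link contains only ASCII, so it survives transport regardless of how intermediaries treat non-ASCII characters. The server still has to decode the percent-escapes as UTF-8, which is where the Connector and filter settings come in.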

Stefan