
When I read the Tomcat source code at http://grepcode.com/file/repo1.maven.org/maven2/org.apache.tomcat/tomcat-catalina/7.0.0/org/apache/catalina/connector/Request.java#Request.parseParameters%28%29, I can't find where the encoding is set for the query string of a GET request, or how the URIEncoding="UTF-8" setting in server.xml takes effect in this method.

melo
  • Webserver should not be utf-8 aware. Your application should be aware of it. – Shiplu Mokaddim Dec 27 '12 at 12:11
  • @shiplu.mokadd.im: The `URIEncoding` parameter configures the servlet container, not the web server itself. And since the servlet container is responsible for decoding the query string and splitting it into separate parameters, it will need to be properly configured. – Codo Dec 27 '12 at 12:20

2 Answers

3

The URIEncoding parameter is what you're looking for. It sets the character encoding to be used when URI decoding the query string.
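To see why the decoding charset matters, here is a minimal sketch using only the JDK's `java.net.URLDecoder`. It is not Tomcat's code; the class name `QueryDecodingDemo` and the helper `decode` are made up for illustration. It decodes the same percent-encoded value once as UTF-8 and once as ISO-8859-1:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Illustration only: hypothetical class, not part of Tomcat.
public class QueryDecodingDemo {

    // Percent-decodes a value, interpreting the resulting bytes in the given charset.
    static String decode(String encoded, String charset) {
        try {
            return URLDecoder.decode(encoded, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        // The bytes C3 A9 are the UTF-8 encoding of a single accented "e".
        String raw = "caf%C3%A9";

        // Read as UTF-8, the two bytes become one character (4 characters total).
        System.out.println(decode(raw, "UTF-8").length());

        // Read as ISO-8859-1, each byte becomes its own character (5 characters total).
        System.out.println(decode(raw, "ISO-8859-1").length());
    }
}
```

A browser that sends UTF-8 encoded parameters to a connector decoding with ISO-8859-1 produces the second, garbled result, which is the symptom `URIEncoding` is meant to fix.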

You use it in server.xml as an attribute of the Connector element.

I've successfully used it in the past.

Codo
  • Do you mean URI contains the query string? – melo Dec 27 '12 at 14:48
  • In a GET request, the query string is part of the URI. In a POST request, the form parameters are sent in the request body instead (URL-encoded in the same format). – Codo Dec 27 '12 at 14:53
  • For what part are you interested in documentation? Whether the query string is in the URL? – Codo Dec 27 '12 at 15:34
  • I've found it at http://tomcat.apache.org/tomcat-7.0-doc/servletapi/index.html, in the method `HttpServletRequest.getRequestURI`. But it says `Returns the part of this request's URL from the protocol name up to the query string in the first line of the HTTP request. The web container does not decode this String.` That seems to say the URI doesn't contain the query string. – melo Dec 28 '12 at 01:34
1

First, let's use a more recent version of Tomcat. 7.0.0 is years old; look at Request.java from Tomcat 7.0.34 instead.

Second, the parseParameters method does not set the encoding: it fetches the encoding which has been set by other components. Some places where the content encoding might have been set:

  1. The URIEncoding of the connector (defaults to ISO-8859-1, as per the HTTP RFC)
  2. The request body encoding (from the HTTP request's Content-Type header)
  3. Another component -- perhaps sniffing the encoding by looking at a parameter's value
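The first two sources in the list can be pictured with a small, simplified model. This is not Tomcat's actual code: the class `EncodingResolutionSketch` and both helper methods are hypothetical, and the `useBodyEncodingForURI` flag is assumed to behave like the Connector attribute of the same name. It only shows the idea that the encoding is chosen before parameter parsing and that ISO-8859-1 is the fallback:

```java
// Simplified model for illustration; hypothetical names, not Tomcat internals.
public class EncodingResolutionSketch {

    // Fallback used when nothing else has been configured.
    static final String DEFAULT_ENCODING = "ISO-8859-1";

    // Chooses the charset for decoding the query string.
    // uriEncoding: the connector's URIEncoding value, or null if unset.
    // bodyEncoding: the charset from the Content-Type header, or null if absent.
    // useBodyEncodingForURI: assumed flag that lets the body charset override the connector's.
    static String queryStringEncoding(String uriEncoding,
                                      String bodyEncoding,
                                      boolean useBodyEncodingForURI) {
        if (useBodyEncodingForURI && bodyEncoding != null) {
            return bodyEncoding;
        }
        if (uriEncoding != null) {
            return uriEncoding;
        }
        return DEFAULT_ENCODING;
    }

    // Chooses the charset for decoding parameters sent in the request body.
    static String bodyParameterEncoding(String bodyEncoding) {
        return bodyEncoding != null ? bodyEncoding : DEFAULT_ENCODING;
    }

    public static void main(String[] args) {
        // Connector configured with URIEncoding="UTF-8", no body charset.
        System.out.println(queryStringEncoding("UTF-8", null, false));
        // Nothing configured anywhere: the default applies.
        System.out.println(queryStringEncoding(null, null, false));
        // POST body without a charset in Content-Type.
        System.out.println(bodyParameterEncoding(null));
    }
}
```

In this model the parsing step never decides the encoding itself; it only consumes whatever value was resolved earlier, which matches the point made above about parseParameters.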

If you just want to set the URI encoding to UTF-8 unconditionally across your site, then just use the URIEncoding attribute in your <Connector>.

The direct answer to your question is that server.xml's URIEncoding attribute does not work in this method: it works elsewhere.

Christopher Schultz