
This has been driving me nuts.

I have a very simple, vanilla Servlet 3 web app. When I run it in Eclipse everything works: among other things, I can register an account with a Unicode (Greek) username, then log in as site admin and view that user's profile without trouble. When I export the WAR to $CATALINA_HOME\webapps, launch $CATALINA_HOME\bin\startup.bat, open the site in the browser, log in as admin and try to view the user profile, the username and the other fields display as blank.
The files in ...\apache-tomcat-7.0.32\conf and the ones in ...\eclipse_workspaces\javaEE\Servers\Tomcat v7.0 Server at localhost-config differ only in this line (in server.xml):

<Context docBase="ted2012" path="/ted2012" reloadable="true" 
source="org.eclipse.jst.jee.server:ted2012"/>

which is an Eclipse-specific entry.

The doGet method (slimmed) in the profile servlet :

protected void doGet(HttpServletRequest request,
        HttpServletResponse response) throws ServletException, IOException {

    final String username = Helpers.decodeRequest(request
            .getParameter("user"));
    if (username != null) {
        User user = null;
        try {
            System.out
                    .println("ProfileController.doGet() user name DECODED : "
                            + username);
            user = userService.getUserWithUsername(username); // THIS FAILS
            System.out.println("ProfileController.doGet() user : " + user);
            request.setAttribute("userToShow", user);
        } catch (ServiceExDBFailure e) {
            log.debug("ProfileController::doGet", e);
            request.setAttribute("ErrorString", e.getMessage());
        }
        sc.getRequestDispatcher(OTHERPROFILE_JSP)
                .forward(request, response);
        return;
    } else {
        //does not apply
    }
}

The decode method is :

public static String decodeRequest(String parameter)
        throws UnsupportedEncodingException {
    if (parameter == null)
        return null;
    System.out.println("decode - request.getBytes(\"iso-8859-1\"):"
            + new String(parameter.getBytes("iso-8859-1")));
    System.out.println("decode - request.getBytes(\"iso-8859-1\") BYTES:"
            + parameter.getBytes("iso-8859-1"));
    for (byte iterable_element : parameter.getBytes("iso-8859-1")) {
        System.out.println(iterable_element);
    }
    System.out.println("decode - request.getBytes(\"UTF-8\"):"
            + new String(parameter.getBytes(CHARSET_FOR_URL_ENCODING))); // UTF-8
    return URLDecoder.decode(new String(parameter.getBytes("iso-8859-1")),
            CHARSET_FOR_URL_ENCODING);
}

While the db call is :

            statement = conn.prepareStatement(query);
            statement.setString(1, username);
            System.out.println("ελληναρα");
            System.out.println(statement);
            set = statement.executeQuery();
            if (set.next()) {
                User user = new User();
                // user.setId(set.getInt("ID"));
                user.setUsername(set.getString("username"));
                user.setName(set.getString("name"));
                user.setSurname(set.getString("surname"));
                user.setPassword(set.getString("password"));
                user.setEmail(set.getString("email"));
                user.setRole(RolesENUM.values()[set.getInt("role")]);
                return user; // if the set is empty null is returned
            }

Tomcat prints :

decode - request.getBytes("iso-8859-1"):╧à╧â╧ä╬╡╧?╬╣╬▒
decode - request.getBytes("iso-8859-1") BYTES:[B@529b9ed
-49
-123
-49
-125
-49
-124
-50
-75
-49
-127
-50
-71
-50
-79
decode - request.getBytes("UTF-8"):├?┬à├?┬â├?┬ä├Ä┬╡├?┬?├Ä┬╣├Ä┬▒
ProfileController.doGet() user name DECODED : ╧à╧â╧ä╬╡╧?╬╣╬▒
com.mysql.jdbc.JDBC4PreparedStatement@766d7940: SELECT * FROM users WHERE username='╧à╧â╧ä╬╡╧?╬╣╬▒'
????????
ProfileController.doGet() user : null

while Eclipse prints :

decode - request.getBytes("iso-8859-1"):υστερια
decode - request.getBytes("iso-8859-1") BYTES:[B@4b6a6bdf
-49
-123
-49
-125
-49
-124
-50
-75
-49
-127
-50
-71
-50
-79
decode - request.getBytes("UTF-8"):ÏÏÏεÏια
ProfileController.doGet() user name DECODED : υστερια
com.mysql.jdbc.JDBC4PreparedStatement@37d02427: SELECT * FROM users WHERE username='υστερια'
ελληναρα
ProfileController.doGet() user : com.ted.domain.User@63144ceb

I believe that, for some reason, the query that reaches the DB is garbage. Notice that where Eclipse prints ελληναρα, Tomcat prints ????????, while the Unicode username (υστερια) prints as ╧à╧â╧ä╬╡╧?╬╣╬▒ and not as ???????.

So the question is: what changes between the Eclipse deployment and the Tomcat deployment? Why the hell does the DB lookup return null? I have really tried to debug this, in vain.

Help

EDIT: if I replace the line statement.setString(1, username); with statement.setString(1, "υστερια"); there is NO failure. So by the time this line runs the username is already mangled. Notice, though, that the bytes printed above are the same, one by one, in both environments.

EDIT2 : Tomcat v7.0 Server at localhost Eclipse launch VM args (split for readability) :

-Dcatalina.base="C:\Dropbox\eclipse_workspaces\javaEE\.metadata\.plugins
\org.eclipse.wst.server.core\tmp1" 
-Dcatalina.home="C:\_\apache-tomcat-7.0.32" 
-Dwtp.deploy="C:\Dropbox\eclipse_workspaces\javaEE\.metadata\.plugins\org.eclipse.wst.server.core\tmp1\wtpwebapps" 
-Djava.endorsed.dirs="C:\_\apache-tomcat-7.0.32\endorsed"

NB : the launch for the app is created dynamically

EDIT 2013.03.30 : this is now on github - and see my more general question here

Mr_and_Mrs_D
  • What are you trying to do with those request decodings and encodings? – JB Nizet Feb 17 '13 at 13:38
  • @JB Nizet : See : http://stackoverflow.com/a/12764128/281545. I am trying to avoid having to configure Tomcat to have URIEncoding as UTF-8. Whatever I am trying to do works when running the app in Eclipse but not when running it in Tomcat. – Mr_and_Mrs_D Feb 17 '13 at 14:03
  • When you receive a parameter, it's already a String, so Tomcat has already transformed the bytes into a String using its default encoding. It's too late. – JB Nizet Feb 17 '13 at 14:05
  • @JB Nizet : the exact same code runs in eclipse fine (which launches the same install of Tomcat) - can you elaborate ? – Mr_and_Mrs_D Feb 17 '13 at 14:14
  • Check how Eclipse starts Tomcat. But this decoding/encoding is wrong. It makes no sense. Read the first two answers in the thread you linked to. It explains it very well. – JB Nizet Feb 17 '13 at 14:15
  • @JBNizet : It does make sense - please read it carefully. Instead of adding `URIEncoding="UTF-8"` in the `Context` I get the bytes in the default Tomcat encoding and decode them myself. That's why I add the `request.getBytes(CHARSET_FOR_URL_ENCODING)) // UTF-8` call - to show that the bytes are encoded in iso-8859-1. The subtle thing is that `getParameter()` actually does call `URLDecoder.decode()` - with the default (`iso-8859-1`) charset. The advantage of how I do it is that it needs no XML edits. To be done properly I should query the server for its current encoding. Will edit with VM args – Mr_and_Mrs_D Feb 17 '13 at 14:24
  • No, it doesn't make sense. Suppose the tomcat encoding is ASCII, and you send parameters encoded in UTF8, containing only bytes that make no sense in ASCII. This means that the String containing the parameter value would only contain question marks (or whatever character is used when decoding a byte that isn't valid for ASCII). And transforming these question marks to bytes and then re-decoding the bytes to a String using UTF8 won't magically find the original bytes. That's like reducing the size of a bitmap image and enlarging it afterwards: it's a lossy operation. – JB Nizet Feb 17 '13 at 17:36
  • @JB Nizet: you may very well be right, but notice that I print the bytes and they are the same bytes in both Tomcat and Eclipse. Then I call `URLDecoder.decode(new String(request.getBytes("iso-8859-1")), "UTF-8")` - it is passed the same bytes in (!?). The only part that may be fishy is `final String username = Helpers.decodeRequest(request.getParameter("user"));` where iso may be used (in getParameter("user")). Still, one important part of the question is _why the hell does it run in Eclipse and not in Tomcat (with identical config)?_ – Mr_and_Mrs_D Feb 17 '13 at 18:01

1 Answer


This was finally answered here.

The gist of the answer is that Eclipse had UTF-8 as its default encoding and Tomcat had windows-1252, so when I call new String() without specifying the encoding, those defaults are used to translate the byte[] to chars. Doing

new String(parameter.getBytes("iso-8859-1"), "UTF-8");

solves the problem. It would not, though, if Tomcat, in

request.getParameter("user") // URL decoding is performed by Tomcat, using the
// URIEncoding from server.xml or, by default, ISO-8859-1

did not use ISO-8859-1 by default. Another encoding (say ASCII) would probably replace undecodable characters with ? (the behavior is unspecified for new String(byte[], String); java.nio.charset.CharsetDecoder gives explicit control), so the parameter String would be corrupted (see ISO-8859-1 encoding and binary data preservation).
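To make the byte-preservation point concrete, here is a minimal sketch (the class and method names are made up for illustration, and it assumes Java 7+ for StandardCharsets). It round-trips the UTF-8 bytes of the username through ISO-8859-1 and through US-ASCII. It uses the new String(byte[], Charset) overload, which is documented to substitute a replacement character for bytes it cannot decode.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of the original project.
final class RoundTripDemo {

    // Decode raw bytes into a String with the given charset, then encode
    // that String back to bytes with the same charset.
    static byte[] roundTrip(byte[] raw, Charset charset) {
        return new String(raw, charset).getBytes(charset);
    }

    public static void main(String[] args) {
        // UTF-8 bytes of the Greek username from the question, written with
        // Unicode escapes so the source file encoding does not matter.
        byte[] utf8 = "\u03c5\u03c3\u03c4\u03b5\u03c1\u03b9\u03b1"
                .getBytes(StandardCharsets.UTF_8);

        byte[] viaLatin1 = roundTrip(utf8, StandardCharsets.ISO_8859_1);
        byte[] viaAscii = roundTrip(utf8, StandardCharsets.US_ASCII);

        // ISO-8859-1 maps every byte value to exactly one char, so nothing is lost.
        System.out.println("ISO-8859-1 preserves bytes: " + Arrays.equals(utf8, viaLatin1));
        // US-ASCII cannot represent bytes above 0x7F, so the originals are gone.
        System.out.println("US-ASCII preserves bytes:   " + Arrays.equals(utf8, viaAscii));
    }
}
```

The first line should report true and the second false, which is why the re-encode trick only works when the container decoded with ISO-8859-1.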

So bravo to Tomcat for performing the conversion with ISO-8859-1 by default in its request.getParameter(), and boo to the Java EE spec guys, who do not even mention in the docs that getParameter will perform URL decoding, let alone let us specify the encoding, overriding server.xml.
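For reference, a charset-explicit version of the decode helper might look like the sketch below. The class and method names are hypothetical, it assumes Java 7+ for StandardCharsets, and it assumes the container decoded the query string as ISO-8859-1. The main method simulates what getParameter() would hand over by decoding the raw bytes printed in the question.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original project.
final class DecodeDemo {

    // Re-interpret a parameter that the container decoded as ISO-8859-1 as
    // UTF-8. Both charsets are named explicitly, so the JVM default
    // (file.encoding) plays no part in the result.
    static String decodeParameter(String parameter) {
        if (parameter == null) {
            return null;
        }
        return new String(parameter.getBytes(StandardCharsets.ISO_8859_1),
                StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The bytes printed in the question (UTF-8 for the Greek username).
        byte[] raw = {-49, -123, -49, -125, -49, -124, -50, -75,
                      -49, -127, -50, -71, -50, -79};

        // Simulates the String the container would produce with ISO-8859-1.
        String delivered = new String(raw, StandardCharsets.ISO_8859_1);

        // This default is what differed between the Eclipse launch and startup.bat.
        System.out.println("JVM default charset: " + Charset.defaultCharset());
        // One char per byte before the fix, one char per Greek letter after it.
        System.out.println("chars as delivered: " + delivered.length());
        System.out.println("chars after decode: " + decodeParameter(delivered).length());
    }
}
```

The lengths should come out as 14 and 7, whatever the first line reports for the default charset.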

Mr_and_Mrs_D