
I have a problem sending special characters, such as Cyrillic letters or umlauts, from a JSP to a servlet. I would greatly appreciate your help.

Here is what I have done:

  1. Defined the UTF-8 charset in the JSP:

    <%@ page language="java" contentType="text/html; charset=utf-8" 
        pageEncoding="utf-8"%>
    <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" 
    "http://www.w3.org/TR/html4/loose.dtd">
    <html>
     <head>
      <meta http-equiv="content-type" content="text/html; charset=utf-8" />
     ...
    
    <div class="authentication">
      <form name="registerForm" action="register" method="get" accept-charset="UTF-8">
        <input type="text" name="newUser" size="17"/>
        <input type="submit" value="senden" />
      </form>
    </div>
     ...
    
  2. Set URIEncoding on the Tomcat Connector in server.xml:

    <Connector URIEncoding="UTF-8" ...
    
  3. Inside the servlet, set the character encoding to UTF-8 and read the request parameter:

    public void doGet(HttpServletRequest request, HttpServletResponse response)
            throws IOException, ServletException {

        // Applies to the request body only; for GET, Tomcat decodes the
        // query string according to the Connector's URIEncoding.
        request.setCharacterEncoding("UTF-8");
        String username = request.getParameter("newUser");
        System.out.println("encoding: " + request.getCharacterEncoding());

        System.out.println("received: " + username);
    }
    
  4. Here is what gets printed when I submit, for example, Однако:

    encoding: UTF-8
    received: ??????
    

Am I missing something? I suppose the servlet cannot decode the string properly, but I have no idea why this is happening. I've followed all the advice on this topic, but had no luck.

Thanks in advance.

BalusC
cucicov

1 Answer


Everything looks fine. The only thing left is that the console which System.out.println() writes to also needs to be configured to interpret the byte stream as UTF-8.
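To illustrate how a correctly decoded string can still show up as `??????`, here is a minimal sketch. It assumes the console stream encodes its output with a charset that has no Cyrillic letters; ISO-8859-1 is picked purely as an example, and the class and method names are made up for this illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class ConsoleEncodingDemo {

    // Encodes the text the way an output stream using the given charset would,
    // then decodes the bytes again so the effect is visible as a string.
    static String roundTrip(String text, Charset consoleCharset) {
        // String.getBytes(Charset) replaces unmappable characters with the
        // charset's replacement byte, which is '?' for ISO-8859-1.
        byte[] bytes = text.getBytes(consoleCharset);
        return new String(bytes, consoleCharset);
    }

    public static void main(String[] args) {
        // "Однако" written with escapes so the source file encoding does not matter.
        String username = "\u041e\u0434\u043d\u0430\u043a\u043e";

        // The string itself is intact: six characters.
        System.out.println(username.length()); // prints 6

        // A charset without Cyrillic turns every character into '?'.
        System.out.println(roundTrip(username, StandardCharsets.ISO_8859_1)); // prints ??????

        // UTF-8 can represent every character, so nothing is lost.
        System.out.println(roundTrip(username, StandardCharsets.UTF_8).equals(username)); // prints true
    }
}
```

Six question marks for a six-letter word is the typical sign that the characters were lost while writing the output, not while reading the request parameter.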

If you're using an IDE like Eclipse, you can do that by setting Window > Preferences > General > Workspace > Text file encoding to UTF-8. For other environments, please be more specific so that we can tell you how to configure it.
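Until the console is configured, one way to check what the servlet actually received is to print the code points as plain ASCII, which no console can garble. The following is only a sketch, assuming Java 8 or newer; `CodePointDump` and `toCodePoints` are hypothetical names, and the hard-coded string stands in for the value returned by `request.getParameter("newUser")`.

```java
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class CodePointDump {

    // Renders each code point as U+XXXX, separated by spaces, so the result
    // contains only ASCII and survives any console encoding.
    static String toCodePoints(String text) {
        StringBuilder sb = new StringBuilder();
        text.codePoints().forEach(cp -> {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("U+%04X", cp));
        });
        return sb.toString();
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        // Stand-in for the request parameter; "Однако" written with escapes.
        String username = "\u041e\u0434\u043d\u0430\u043a\u043e";

        System.out.println("received: " + toCodePoints(username));
        // prints: received: U+041E U+0434 U+043D U+0430 U+043A U+043E

        // Optionally write UTF-8 bytes explicitly. This only displays correctly
        // if the console itself interprets the byte stream as UTF-8.
        PrintStream utf8Out = new PrintStream(System.out, true, "UTF-8");
        utf8Out.println("received: " + username);
    }
}
```

If the dump shows the expected Cyrillic code points, the request was decoded correctly and only the console setting remains. If it shows U+003F, the characters were already lost before the servlet printed them.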

BalusC
  • I have already set this, but it does not work for me. Are there any other settings? – Kaushik Lele Jul 22 '15 at 10:11
  • @KaushikLele: then you don't have the same question as the OP. Look for a different one or ask a new one. I can't do much more than give you this link, so that you can finally understand the world of bytes and characters and nail it down further on your own: http://balusc.blogspot.com/2009/05/unicode-how-to-get-characters-right.html – BalusC Jul 22 '15 at 10:15