4

I am on Windows, using Tomcat 8, NetBeans 8 as the IDE, and JDK 1.8.0_05.

I am trying to specify a Hebrew URL pattern for certain servlets. I have tried both setting the urlPatterns attribute of the @WebServlet annotation and declaring the mapping in the web.xml file.

The Hebrew mapping doesn't work. I checked what the mappings look like while Tomcat is running (using the MBeans tab of JConsole), and the Hebrew URL is displayed as gibberish (specifically, question marks).
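One general way Hebrew text can end up as question marks is an encoding step through a charset that has no Hebrew letters. The sketch below only illustrates that effect in plain Java; it is not a diagnosis of this particular Tomcat setup, and the class name `QuestionMarkDemo` and the sample string are assumptions made for the example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: shows how unmappable characters become '?'.
class QuestionMarkDemo {

    // Encodes the text with the given charset, then decodes the bytes with the same charset.
    static String roundTrip(String text, Charset charset) {
        byte[] bytes = text.getBytes(charset); // unmappable characters are replaced here
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        String hebrew = "\u05E2\u05D1"; // the two Hebrew letters from the example URL

        // ISO-8859-1 has no Hebrew letters, so each one is replaced by '?'.
        System.out.println(roundTrip(hebrew, StandardCharsets.ISO_8859_1).equals("??"));

        // UTF-8 can represent them, so the round trip is lossless.
        System.out.println(roundTrip(hebrew, StandardCharsets.UTF_8).equals(hebrew));

        // The JVM default charset is what applies when no charset is given explicitly.
        System.out.println(Charset.defaultCharset());
    }
}
```

If the last line prints a charset without Hebrew support inside the JVM that runs Tomcat, that would be one place to look.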

I have tried:

  • Adding -J-Dfile.encoding=UTF-8 to the netbeans.conf file.
  • Changing the Windows locale to Hebrew.
  • Using the URL-encoded version of the URL in the pattern (this is also displayed as gibberish symbols in JConsole).
  • Entering the URL in its encoded form into the address bar (e.g. localhost:8080/test/%D7%A2%D7%91).
  • Checking the encoding of the servlet files in Notepad; they are saved as UTF-8 (after making the first change described in this list).
  • Adding a filter on all URL patterns (i.e. "/*") that sets the character encoding of the request to UTF-8 (I also tried Apache's SetCharacterEncodingFilter).
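To make the encoded form in the list above concrete, here is a minimal sketch of how the same percent-encoded bytes decode under two different charsets. It uses `java.net.URLDecoder` purely as an illustration (it also treats `+` as a space, and it is not claimed to be how Tomcat decodes request paths); the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Hypothetical demo class: decodes a percent-encoded string with a chosen charset.
class PathDecodingDemo {

    static String decode(String encoded, String charsetName) {
        try {
            return URLDecoder.decode(encoded, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        String encoded = "%D7%A2%D7%91"; // taken from the URL in the question

        // Decoded as UTF-8, the four bytes form the two Hebrew letters U+05E2 U+05D1.
        System.out.println(decode(encoded, "UTF-8").equals("\u05E2\u05D1"));

        // Decoded as ISO-8859-1, each byte becomes its own (wrong) character.
        System.out.println(decode(encoded, "ISO-8859-1").equals("\u00D7\u00A2\u00D7\u0091"));
    }
}
```

Both lines print `true`: the bytes are identical, and only the charset used for decoding determines whether a Hebrew pattern can match.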

Any suggestions on how I can map a servlet to a Hebrew (UTF-8) URL with a Tomcat, NetBeans, Java, and Windows setup?

Thanks.

theyuv
  • Try configuring Tomcat to understand UTF-8 (I think the default charset is ISO-8859-1): http://stackoverflow.com/questions/138948/how-to-get-utf-8-working-in-java-webapps – simar Mar 09 '16 at 13:53
  • You can also try configuring the JVM for Tomcat: set -J-Dfile.encoding=UTF-8 in startup.bat. – simar Mar 09 '16 at 13:56

2 Answers

0

You need to configure your application server to decode request URIs as UTF-8. Since you are using Tomcat, that means setting URIEncoding="UTF-8" on the connector in your conf/server.xml file. Here's what it should look like:

<Connector port="8080" maxHttpHeaderSize="8192"
 maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
 .......
 URIEncoding="UTF-8"
/>
MA--
  • URIEncoding defaults to UTF-8 in Tomcat 8. – theyuv Mar 09 '16 at 14:09
  • If the server is configured with "strict servlet compliance" on, the default value of the connectors' URIEncoding attribute is "ISO-8859-1", the same as in older versions of Tomcat. I am guessing that's not the case. Still, you might want to make sure that the org.apache.catalina.STRICT_SERVLET_COMPLIANCE system property is NOT set to true. – MA-- Mar 09 '16 at 14:20
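The check suggested in the comment above can be sketched in a few lines of Java. This is only an assumption-laden illustration: the class name `StrictComplianceCheck` is hypothetical, and the result is meaningful only when the code runs inside the same JVM as Tomcat (for example, called from a servlet), since a standalone run reports its own JVM's properties.

```java
// Hypothetical helper: reports whether the strict-compliance system property is set to true.
class StrictComplianceCheck {

    // Property name as quoted in the comment above.
    static final String PROPERTY = "org.apache.catalina.STRICT_SERVLET_COMPLIANCE";

    static boolean isStrict() {
        // Boolean.getBoolean returns true only if the property exists and equals "true" (ignoring case).
        return Boolean.getBoolean(PROPERTY);
    }

    public static void main(String[] args) {
        System.out.println(PROPERTY + " = " + isStrict());
        // file.encoding is the other JVM setting mentioned in this thread.
        System.out.println("file.encoding = " + System.getProperty("file.encoding"));
    }
}
```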
-1

Follow these steps:

  1. Write a Charset filter that will control all requests and responses:
    Reference: https://github.com/edendramis/freemarker-example/blob/master/src/main/java/com/edendramis/config/CharsetFilter.java

    package charsetFilter.classes;

    import java.io.IOException;
    import javax.servlet.Filter;
    import javax.servlet.FilterChain;
    import javax.servlet.FilterConfig;
    import javax.servlet.ServletException;
    import javax.servlet.ServletRequest;
    import javax.servlet.ServletResponse;

    public class CharsetFilter implements Filter {
        private String encoding;

        public void init(FilterConfig config) throws ServletException {
            encoding = config.getInitParameter("requestEncoding");
            if (encoding == null) encoding = "UTF-8";
        }

        public void doFilter(ServletRequest request, ServletResponse response, FilterChain next)
                throws IOException, ServletException {
            // Respect the client-specified character encoding
            // (see HTTP specification section 3.4.1)
            if (null == request.getCharacterEncoding())
                request.setCharacterEncoding(encoding);
            /**
             * Set the default response content type and encoding
             */
            response.setContentType("text/html; charset=UTF-8");
            response.setCharacterEncoding("UTF-8");
            next.doFilter(request, response);
        }

        public void destroy() {}
    }
    
  2. Add this filter to web.xml

    <filter>
        <filter-name>CharsetFilter</filter-name>
        <filter-class>charsetFilter.classes.CharsetFilter</filter-class>
        <init-param>
            <param-name>requestEncoding</param-name>
            <param-value>UTF-8</param-value>
        </init-param>
    </filter>

    <filter-mapping>
        <filter-name>CharsetFilter</filter-name>
        <url-pattern>/*</url-pattern>
    </filter-mapping>
    
  3. Write some HTML code, e.g.:

    <?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN" "http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
    <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="fi">
    <head>
    <meta http-equiv='Content-Type' content='text/html; charset=UTF-8' />
    </head>
    </html>

  4. Use the following in your servlet:

    request.setCharacterEncoding("UTF-8");
    response.setContentType("text/html; charset=UTF-8");
    response.setCharacterEncoding("UTF-8");

  5. For getting a String:

    // Re-decode a parameter that was decoded as ISO-8859-1:
    String input = new String(request.getParameter("foo").getBytes("iso-8859-1"), "utf-8");
    // Or URL-decode a parameter value:
    String keyWord = URLDecoder.decode(request.getParameter("keyWord"), "UTF-8");
    System.out.println(input);
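The `getBytes("iso-8859-1")` step in the last list item can be reproduced without a servlet container. The sketch below simulates a value whose UTF-8 bytes were wrongly decoded as ISO-8859-1 and then repairs it; the class and method names are hypothetical, and this shows a workaround for an already mis-decoded string rather than a fix for the decoding itself.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: simulates and repairs an ISO-8859-1 mis-decoding of UTF-8 bytes.
class RedecodeDemo {

    // What a value looks like if its UTF-8 bytes are decoded as ISO-8859-1.
    static String misdecode(String original) {
        return new String(original.getBytes(StandardCharsets.UTF_8), StandardCharsets.ISO_8859_1);
    }

    // Recovers the original bytes and decodes them as UTF-8.
    static String repair(String garbled) {
        return new String(garbled.getBytes(StandardCharsets.ISO_8859_1), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String hebrew = "\u05E2\u05D1";
        String garbled = misdecode(hebrew);

        System.out.println(garbled.equals(hebrew));         // false: the text is garbled
        System.out.println(repair(garbled).equals(hebrew)); // true: the bytes were preserved
    }
}
```

The repair works only because ISO-8859-1 maps every byte to a character, so no information is lost in the wrong decoding step.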

For URL:

String str = "$ome UTF-8 text £900";
String url = "http://your-domain.com/url?p=" + URLEncoder.encode(str, "UTF-8");
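`URLEncoder` produces form encoding, which suits query parameter values; a path segment such as a servlet path is encoded differently (a space becomes `%20`, not `+`). A minimal sketch of both follows, assuming the host, port, and path from the question; the class and method names are hypothetical.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

// Hypothetical demo class: contrasts query-value encoding with path encoding.
class UrlBuildingDemo {

    // Form-encodes a query parameter value as UTF-8.
    static String encodeQueryValue(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    // Builds an http URL and percent-encodes non-ASCII path characters as UTF-8.
    static String buildUrl(String host, int port, String path) {
        try {
            return new URI("http", null, host, port, path, null, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Invalid URL parts", e);
        }
    }

    public static void main(String[] args) {
        // prints %24ome+UTF-8+text+%C2%A3900
        System.out.println(encodeQueryValue("$ome UTF-8 text \u00A3900"));
        // prints http://localhost:8080/test/%D7%A2%D7%91
        System.out.println(buildUrl("localhost", 8080, "/test/\u05E2\u05D1"));
    }
}
```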

Cheers!!

Ghayel
  • I edited the question. I had already set up a filter that sets the character encoding of the `HttpServletRequest`. – theyuv Mar 09 '16 at 14:19
  • The rest of these deal with character encoding for displaying content. No? Not URLs. My UTF-8 content displays fine (including query strings). – theyuv Mar 11 '16 at 09:37
  • Have you tried URLEncoder.encode? I have updated my answer. – Ghayel Mar 11 '16 at 17:52
  • You are encoding the parameters. I have no issue with the parameters. I have an issue with the url itself, specifically, the servlet path. – theyuv Mar 11 '16 at 18:11
  • @BalusC you are wrong this time. I never plagiarized an answer from any post. Compare my answer and the answer you referred to with open eyes. – Ghayel Mar 12 '16 at 11:13
  • @BalusC remove your bad comments against my post, and please do a little research before accusing someone else's work of plagiarism. – Ghayel Mar 12 '16 at 11:58
  • None of this all explains/solves OP's concrete problem. Point 5 is absolutely not recommended as it's a workaround not a solution. Next time try reproducing OP's concrete problem yourself instead of making guesses. – BalusC Mar 12 '16 at 12:43
  • Read the comments the OP left on your post before saying something to someone else. A fat downvote for your post. – Ghayel Mar 12 '16 at 19:49