2

I am facing a problem with Jetty character encoding. When the Jetty server is installed on Mac (OS X), it works fine. But when it is installed on Ubuntu (10.10), some characters are not rendered properly.

The text on the page (not in the URL) that has the problem is: The New York Times® Bestsellers

It is shown as "The New York Times� Bestsellers" on the page served by the server on Linux

and it is shown as "The New York Times® Bestsellers" on the page served by the server on Mac (This is correct)

The jetty server version is: hightide-7.0.2.v20100331

The character encoding of the served file is: UTF-8

Can you please let me know if any settings need to be changed to overcome this problem?

Thanks in advance!

skaffman
Ravi

4 Answers

5

I had a similar problem with jetty 8 and solved it by adding this line to bin/jetty.sh:

JAVA_OPTIONS+=("-Dfile.encoding=UTF-8")
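To illustrate why the default encoding can matter, here is a minimal Java sketch. It is not Jetty's actual code, and the class and method names are made up for illustration. It decodes the same UTF-8 bytes once as UTF-8 and once as US-ASCII, where US-ASCII is only an assumed stand-in for whatever a JVM falls back to when no UTF-8 default is configured.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of Jetty.
class CharsetMismatchDemo {

    // Decode raw bytes with an explicitly chosen charset.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // \u00AE is the registered sign; the escape keeps this source file ASCII-only.
        byte[] utf8Bytes = "The New York Times\u00AE Bestsellers"
                .getBytes(StandardCharsets.UTF_8);

        // Matching charset: the registered sign survives.
        System.out.println(decode(utf8Bytes, StandardCharsets.UTF_8));

        // Mismatched charset (assumed fallback): the non-ASCII bytes cannot be
        // mapped and are replaced with U+FFFD, the replacement character.
        System.out.println(decode(utf8Bytes, StandardCharsets.US_ASCII));
    }
}
```

Passing -Dfile.encoding=UTF-8 changes the JVM's default, so code that does not name a charset explicitly behaves like the first call rather than the second.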
3

I also had a problem like this and I want to thank aditsu for his answer.

I am using Restlet on top of a Jetty server on Ubuntu 12.04 (and 14.04). The Restlet application sits behind an Apache server that acts as a reverse proxy (ProxyPass).

All files are UTF-8.
All HTTP-responses have Content-Type text/html; charset=UTF-8.
All files contain <meta content="text/html; charset=UTF-8" http-equiv="content-type"/>

The strange thing was that when the server booted and I visited the site, the character encoding was not UTF-8, so I got all those funny characters, even though every signal was telling the server, the agents and everything in between that UTF-8 is the encoding used.

When I restarted the service manually after the server had booted, all characters were fine. Because I could not easily find an answer and did not know what was causing the wrong encoding, I kept restarting the service manually.

My candidates at that time were: Apache, Ubuntu service boot order, Restlet framework, file encoding actually used, HTTP headers, HTML meta tags. But all of them were as they were supposed to be.

So in the end it was Jetty, which I only considered just now after having revisited this issue several times.

I still do not have a clue why starting at boot time makes the character encoding all wrong while after a manual restart of the service the encoding is correct. Adding the extra Java argument '-Dfile.encoding=UTF-8' made it all go away. Thanks to aditsu again for sharing his solution!!

Cheers

Edit: Setting the LANG environment variable in the startup script also solves the problem, i.e.

export LANG=en_US.UTF-8

This is actually the difference between starting the Jetty server at boot time (LANG is not defined out of the box) and starting it from a shell. So there are two solutions for the same problem.
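To check which of the two situations a given JVM is in, a small diagnostic like the following can be run the same way the service is started (at boot and from a shell) and the output compared. This is only a sketch; the class and method names are hypothetical. Note that on Java 18 and later the default charset is UTF-8 regardless of LANG, so the difference may not show up there.

```java
import java.nio.charset.Charset;

// Hypothetical diagnostic class for comparing boot-time and shell environments.
class EncodingReport {

    // Collect the values that influence the JVM's default text encoding.
    static String report() {
        String lang = System.getenv("LANG"); // null when LANG is not defined
        return "LANG=" + (lang == null ? "(not set)" : lang) + "\n"
             + "file.encoding=" + System.getProperty("file.encoding") + "\n"
             + "defaultCharset=" + Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```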

Marc Verkerk
3

Got it; for me, it was a missing encoding header in the JSP:

<%@ page contentType="text/html;charset=UTF-8" language="java" %>
Arnaud
Petar Tahchiev
0

You are probably reading the raw, URL-encoded HTTP data directly, and you need to decode it as UTF-8 with a decoder.

Use java.net.URLDecoder:

line = URLDecoder.decode(line, "UTF-8");

To go the other way, for example when putting a Java String directly into a POST body, use URLEncoder:

line = URLEncoder.encode(line, "UTF-8");
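As a rough illustration of what these two calls do, the sketch below round-trips the string from the question. The wrapper class and its method names are hypothetical. Keep in mind that this is URL (form) percent-encoding, which is a separate matter from the charset a page is served in, so it only helps if the data really is URL-encoded.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical wrapper around the JDK's URL form encoding helpers.
class UrlCodecDemo {

    // Percent-encode a string using UTF-8 bytes.
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    // Reverse the percent-encoding, interpreting the bytes as UTF-8.
    static String decode(String text) {
        try {
            return URLDecoder.decode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // \u00AE is the registered sign from the question.
        String original = "The New York Times\u00AE Bestsellers";
        String encoded = encode(original);
        System.out.println(encoded); // spaces become '+', the sign becomes %C2%AE
        System.out.println(decode(encoded).equals(original));
    }
}
```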