I am trying to retrieve some UTF-8 encoded Chinese characters from a database using a Java program. When I do this, the characters are printed as question marks.

However, when I display the characters from the database (using select * from ...) the characters are displayed normally. When I print a String in a Java file consisting of Chinese characters, they are also printed normally.

I had this problem in Eclipse: when I ran the program, the characters were being printed as question marks. However this problem was solved when I saved the Java file in UTF-8 format.

Running "locale" in the terminal currently returns this:

LANG="en_GB.UTF-8"
LC_COLLATE="en_GB.UTF-8"
LC_CTYPE="en_GB.UTF-8"
LC_MESSAGES="en_GB.UTF-8"
LC_MONETARY="en_GB.UTF-8"
LC_NUMERIC="en_GB.UTF-8"
LC_TIME="en_GB.UTF-8"
LC_ALL=

I have also tried compiling my Java file like this:

 javac -encoding UTF-8 [java file] 

But still, the output is question marks.

It's quite strange that the characters are only displayed correctly some of the time. Does anyone have an explanation for this? Or even better, a way to fix it so that the characters are displayed correctly?

gtgaxiola
201403540
  • What encoding is the text returned in from the Java app? – deceze Mar 19 '13 at 08:13
  • @deceze How would I check the encoding from the Java app? I did try to compile the file using `javac -encoding UTF-8 [java file]`, but to no avail. – 201403540 Mar 19 '13 at 11:18
  • No idea, I'm no Java guy. If characters do not show up properly, it simply means the characters are being interpreted in a different encoding than they are output, that's all. The console may expect UTF-8 but may actually be receiving UTF-16, for instance. – deceze Mar 19 '13 at 11:23
  • What's your `LESSCHARSET` like? Set it to `export LESSCHARSET=utf-8` to make sure the Terminal displays UTF-8. – dda Mar 19 '13 at 15:13
  • @dda I just tried your suggestion, still getting question marks. – 201403540 Mar 19 '13 at 16:00

1 Answer

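The System.out print stream is not necessarily created as a UTF-8 print stream; it typically encodes output using the platform's default charset. You can wrap it in a print stream that does use UTF-8, like this: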

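import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class JavaTest {

    public static void main(String[] args) {
        try {
            // Wrap System.out so that characters are encoded as UTF-8 bytes.
            PrintStream out = new PrintStream(System.out, true, "UTF-8");

            out.println("Hello");
            out.println("施华洛世奇");
            out.println("World");
        } catch (UnsupportedEncodingException e) {
            // Not expected in practice: every Java platform must support UTF-8.
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }
}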
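If the rest of the program already prints through System.out, one option is to replace it once at startup with System.setOut, so existing println calls pick up the UTF-8 stream. This is only a sketch: the class name Utf8Console and the helper utf8Stream are made up for illustration, and it assumes the terminal itself expects UTF-8 (as the locale output in the question suggests).

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class Utf8Console {

    // Hypothetical helper: wraps any output stream in an auto-flushing
    // PrintStream that encodes characters as UTF-8.
    public static PrintStream utf8Stream(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }

    public static void main(String[] args) {
        // Replace the global System.out once; later println calls use UTF-8.
        System.setOut(utf8Stream(System.out));

        System.out.println("Hello");
        System.out.println("施华洛世奇");
        System.out.println("World");
    }
}
```

The source file still has to be compiled with the right encoding (for example javac -encoding UTF-8), otherwise the string literal is already damaged before it reaches the stream.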

You can also set the JVM's default encoding at launch with the file.encoding system property:

java -Dfile.encoding=UTF-8 -jar JavaTest.jar
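To check which encoding the JVM is actually using, and to see where the question marks come from, something like the following may help. It is a diagnostic sketch, not a fix: the class EncodingCheck and the method roundTrip are hypothetical names. It relies on the documented behaviour of String.getBytes(Charset), which replaces characters the charset cannot represent with that charset's replacement byte ('?' for US-ASCII).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class EncodingCheck {

    // Hypothetical helper: encode text with the given charset, then decode
    // the resulting bytes with the same charset.
    public static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // What the JVM considers its default charset for this run.
        System.out.println("Default charset: " + Charset.defaultCharset());
        System.out.println("file.encoding:   " + System.getProperty("file.encoding"));

        String text = "施华洛世奇";

        // US-ASCII cannot represent these characters, so each becomes '?'.
        System.out.println(roundTrip(text, StandardCharsets.US_ASCII)); // prints "?????"

        // UTF-8 can represent them, so the text survives the round trip.
        System.out.println(roundTrip(text, StandardCharsets.UTF_8).equals(text)); // prints "true"
    }
}
```

If the first line reports something other than UTF-8, output written through a default-encoded stream can lose the Chinese characters in the same way as the US-ASCII round trip above.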
Community
Danack