3

Here's the problem: I have some Chinese text in my code, and I'm writing it to SQLite.

When I run my program in Eclipse and read the string back from SQLite, it works fine. But when I package the project as a jar and run it from the command line, the Chinese string read from SQLite is unreadable.

After some trials, I found that the problem lies in the file.encoding system property. When I run the jar with the command java -Dfile.encoding=UTF-8 -jar TK.jar, Chinese text works fine. But if I set the system property in code, like System.setProperty("file.encoding", "UTF-8");, it doesn't work.

So, what is the difference between setting the system property on the command line and in code? And could anyone tell me how to set the file.encoding system property in code?

Thanks a lot!

To summarize:

Remember to specify a charset when calling String.getBytes() as well as new String(), to avoid unreadable output for Chinese or Japanese text when the program runs in different environments.

Judking
  • 1
    Is the property set before the DB driver is initialized? It's possible that the DB driver is initialized with the default system value and you are setting the property after that. – Vikdor Oct 14 '13 at 14:24
  • 2
    It should not be necessary to set `file.encoding` (not on the command line, nor in code). Somewhere in your code you are probably relying on the default character set being set to `UTF-8` (for example, you might be calling `String.getBytes()` without specifying the charset, or reading from a text file without specifying the charset). Fix that problem in your code. – Jesper Oct 14 '13 at 14:26
  • Yes, I set the property on the first line of my main method, and I call `System.getProperty` to make sure it has already changed to "UTF-8". @Vikdor – Judking Oct 15 '13 at 00:29
  • You bet, I didn't pass a charset to `String.getBytes()`. Thanks a lot! (Could you post this as an answer below?) @Jesper – Judking Oct 16 '13 at 00:50

3 Answers

4

Somewhere in your code, you are probably relying on the default character set being UTF-8. For example, when you call String.getBytes() without specifying a character set, Java will use the default character set. If you always want UTF-8, then specify this when calling String.getBytes():

byte[] utf8bytes = text.getBytes("UTF-8");
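To make the effect concrete, here is a minimal sketch (not taken from the asker's program) that encodes a string as UTF-8 and then decodes it once with the matching charset and once with a mismatched one. The class and method names are hypothetical, and it assumes Java 7 or later for `StandardCharsets`, which also avoids the checked `UnsupportedEncodingException` that the string-named overload declares.

```java
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetDemo {

    // Encode and decode with the same explicit charset: the text survives.
    static String roundTripUtf8(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    // Encode as UTF-8 but decode as ISO-8859-1: this mimics what happens
    // when the platform default charset differs from the one the bytes use.
    static String decodeUtf8AsLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // Two Chinese characters, written as escapes so the example does not
        // depend on the encoding of the source file itself.
        String text = "\u4e2d\u6587";

        System.out.println(roundTripUtf8(text).equals(text));      // true
        System.out.println(decodeUtf8AsLatin1(text).equals(text)); // false
    }
}
```

The second call returns six unrelated characters instead of two, which is the kind of unreadable output described in the question.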

Also, if you want to read a file with a specific character encoding, then specify the character encoding rather than relying on the default setting. For example:

InputStream is = new FileInputStream("utf8file.txt");
BufferedReader in = new BufferedReader(new InputStreamReader(is, "UTF-8"));

// Read text, file will be read as UTF-8
String line = in.readLine();

in.close();
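
A self-contained variant of the same idea, as a sketch only: it writes a temporary file as UTF-8 and reads it back with an explicit charset, using try-with-resources so the reader is closed even if reading fails. The class name, method name, and temp-file prefix are made up for illustration, and it assumes Java 8 or later (for `UncheckedIOException`).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class Utf8FileDemo {

    // Writes the text to a temporary file as UTF-8, then reads the first
    // line back, naming the charset explicitly on both sides.
    static String writeThenReadFirstLine(String text) {
        try {
            Path tmp = Files.createTempFile("utf8demo", ".txt");
            try {
                Files.write(tmp, text.getBytes(StandardCharsets.UTF_8));
                try (BufferedReader in =
                         Files.newBufferedReader(tmp, StandardCharsets.UTF_8)) {
                    return in.readLine();
                }
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String text = "\u4e2d\u6587"; // two Chinese characters
        System.out.println(writeThenReadFirstLine(text).equals(text)); // true
    }
}
```

Because neither the write nor the read relies on the platform default, the result should be the same in Eclipse and from the command line.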
Jesper
0

If you specify system properties on the command line, they are available as soon as the program loads.

If you set them inside your program, they only take effect from that point on. By the time the class containing your main method has loaded, the JVM may already have loaded other classes that read those system properties, before you had any chance to set them.

GerritCap
  • I'm still not clear after reading your post. I call `System.setProperty("file.encoding", "UTF-8")` on the first line of the main method. How is that different from setting it on the command line with `java -Dfile.encoding=UTF-8 -jar TK.jar`? – Judking Oct 15 '13 at 00:33
0

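Not all properties can be changed at runtime. If you don't pass the value as a command-line argument and instead set it with System.setProperty, the change has no effect. "file.encoding" is one of those: the JVM determines the default charset during startup, so later changes to the property are not picked up. Possible duplicate.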
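
A small sketch that illustrates this, assuming a HotSpot-style JVM (Java 6 or later, as far as I know) where `Charset.defaultCharset()` is fixed once it has been determined. The class and method names are hypothetical. It records the default charset, changes the `file.encoding` property to a different value, and checks whether the default charset followed.

```java
import java.nio.charset.Charset;

public class DefaultCharsetDemo {

    // Returns true if the default charset is unchanged after file.encoding
    // has been overwritten at runtime.
    static boolean defaultCharsetSurvivesPropertyChange() {
        Charset before = Charset.defaultCharset();
        String original = System.getProperty("file.encoding");

        // Pick a value that differs from the current default.
        String other = "UTF-8".equalsIgnoreCase(before.name()) ? "ISO-8859-1" : "UTF-8";
        try {
            System.setProperty("file.encoding", other);
            // The property value changes, but the charset the JVM uses does not.
            return Charset.defaultCharset().equals(before);
        } finally {
            // Restore the property so the demo leaves no side effects.
            if (original != null) {
                System.setProperty("file.encoding", original);
            } else {
                System.clearProperty("file.encoding");
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(defaultCharsetSurvivesPropertyChange());
    }
}
```

This also explains the observation in the comments: `System.getProperty("file.encoding")` does report the new value, yet `String.getBytes()` without arguments keeps using the charset chosen at startup.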

Community
Mikhail