I'm running my Java program from the command line (Windows 7). To keep things simple, I show only the relevant part.

public static void main(String[] args) {
    System.out.println("Árpád");
}

My output is garbage. It is obviously a character-encoding problem: the Hungarian characters Á and á are not showing up correctly. I've tried the following:

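public static void main(String[] args) throws UnsupportedEncodingException {
    // Requires java.io.PrintStream and java.io.UnsupportedEncodingException
    PrintStream ps = new PrintStream(System.out, true, "UTF-8");
    ps.println("Árpád");
}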

But my output is still garbage. How can I resolve this character-encoding issue on the Windows 7 command line? Thanks

weston
Lajos Arpad
  • Are you sure it's not a compilation problem? How are you compiling, and what encoding is your source code in? – Jon Skeet Dec 25 '12 at 12:56
  • Does your display actually support displaying such characters to start with? I.e., can you type them at your keyboard on this display and do they appear correctly? – fge Dec 25 '12 at 12:58
  • I'm compiling with NetBeans and the character-encoding of the sources is UTF-8 – Lajos Arpad Dec 25 '12 at 12:58
  • Yes, my display supports such characters and I can type them correctly. If I run my program from NetBeans, it shows the output correctly. The problem only appears when I show the result on the command line. This project will be used from the command line by clients who might have Hungarian results. – Lajos Arpad Dec 25 '12 at 13:00
  • Could you test if `Cp852` encoding helps you? In my (Polish) version of Win7 console it is working fine. – Pshemo Dec 25 '12 at 13:04
  • After running chcp 852 my output was the same. I guess this encoding is good for Polish characters, but not for Hungarian. Anyway, thanks for the tip. – Lajos Arpad Dec 25 '12 at 13:07
  • As shown here: http://www.microsoft.com/resources/documentation/windows/xp/all/proddocs/en-us/country.mspx?mfr=true both Hungarian and Polish character sets are encoded in 850 and 852. But I can't see the big Á in the result, there is a random character instead. – Lajos Arpad Dec 25 '12 at 13:27

1 Answer

I got your code to work by finding the right encoding from the command line, and then either using the PrintStream version with that encoding, or by specifying it on the command line and just using System.out.println.

To find the encoding on the command line, run chcp. Here's the output I got:

Active code page: 850

That corresponds to the Java charset name of "IBM850". So this then creates the right output on the command line:

java -Dfile.encoding=IBM850 Test
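
For the PrintStream variant mentioned above, a minimal sketch might look like the following. The class name `ConsoleEncodingDemo` and the helper `wrap` are made up for illustration, and `IBM850` is only an assumption that should be replaced with whatever `chcp` reports on your console. The string uses `\u` escapes so that the encoding of the source file cannot influence the result.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

public class ConsoleEncodingDemo {

    // Hypothetical helper: wraps a stream so that text written to it is
    // encoded with the named charset (for example "IBM850" for code page 850).
    static PrintStream wrap(OutputStream out, String charsetName) {
        try {
            return new PrintStream(out, true, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unsupported charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Assumption: the console uses code page 850 unless another
        // charset name is passed as the first argument.
        String charsetName = args.length > 0 ? args[0] : "IBM850";
        PrintStream ps = wrap(System.out, charsetName);
        ps.println("\u00C1rp\u00E1d"); // "Árpád" written with Unicode escapes
    }
}
```

Running `java ConsoleEncodingDemo IBM852` would try code page 852 instead, which makes it easy to compare code pages without recompiling.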
Jon Skeet
  • +1 Thank you, this almost resolved the problem. The lowercase á showed up correctly, but the uppercase Á did not. – Lajos Arpad Dec 25 '12 at 13:10
  • @LajosArpad: Both work for me - which code page does your console use? – Jon Skeet Dec 25 '12 at 13:20
  • It's using 437 by default. I've tried by changing it to 850 and 852, but unfortunately the results were incorrect. – Lajos Arpad Dec 25 '12 at 13:25
  • @LajosArpad: Looking at http://en.wikipedia.org/wiki/Code_page_437 I suspect it just doesn't support all the characters you need :( It's possible that you can set this elsewhere in Windows - it's not something I've ever done... – Jon Skeet Dec 25 '12 at 13:28
  • @LajosArpad: Actually, `chcp` will let you *set* the code page as well. So try running `chcp 850` and then the command line I showed in my answer... – Jon Skeet Dec 25 '12 at 13:29
  • I've tried to run chcp and then the command, without success. Anyway, I thank you for this nice answer, I will start researching the issue based on your ideas and will let you know about the results. Thanks again. – Lajos Arpad Dec 25 '12 at 13:33
  • Unfortunately I didn't find a solution to the problem, your answer was the closest to a solution, as it allowed small Hungarian characters. I accept this answer as the solution until I find a better solution. Thanks again. – Lajos Arpad Dec 27 '12 at 20:45
  • My solution was to run `chcp 65001` and to stop using raster fonts in the command prompt. – ılǝ Jan 24 '13 at 12:58
  • How can I do the same thing on UNIX? I did it in the Command Prompt and it works fine for me. – user1912935 Sep 23 '13 at 09:06
  • @user1912935: if the encoding is supported, the exact same command line should work fine. – Jon Skeet Sep 23 '13 at 09:55
  • @JonSkeet Is the default code page always 850 for windows cmd.exe environments? If not, does a portable approach exist to resolve this issue for every possible code page set? – MRalwasser Feb 18 '15 at 10:04
  • @MRalwasser: No, it isn't - it depends on your regional settings, as far as I'm aware. I don't know of an easy way of detecting it, unfortunately. (I'm sure there is a solution, but it may not be easy and I don't know it anyway!) – Jon Skeet Feb 18 '15 at 10:05
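
One way to see why the lowercase á survived under code page 437 while the uppercase Á did not is to ask each charset whether it can encode the characters at all. The sketch below is not from the answer; the class name `CodePageCoverage` and the method `canEncodeAll` are hypothetical, and it assumes the JDK in use ships the IBM437, IBM850 and IBM852 charsets.

```java
import java.nio.charset.Charset;

public class CodePageCoverage {

    // Hypothetical helper: true if every character of the text can be
    // represented in the named charset.
    static boolean canEncodeAll(String charsetName, String text) {
        return Charset.forName(charsetName).newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String name = "\u00C1rp\u00E1d"; // "Árpád" written with Unicode escapes
        for (String codePage : new String[] {"IBM437", "IBM850", "IBM852"}) {
            System.out.println(codePage + " can encode the name: "
                    + canEncodeAll(codePage, name));
        }
    }
}
```

Code page 437 should report that it cannot encode the name, because it contains á but not Á. That matches the symptom described in the comments and suggests that both the console code page (set with `chcp`) and the charset Java writes with need to be one that covers the characters.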