
I have to pass some arguments to my Java app, which is launched from a .bat file. The arguments arrive in the system's charset encoding, so some characters are displayed wrongly. I tried this:

     String titulo;

     titulo = new String (args[1].getBytes(),"Cp1252");

I also tried a few others from this list http://docs.oracle.com/javase/1.4.2/docs/guide/intl/encoding.doc.html and none of them worked. How else can I convert a String from the Windows charset to Java's UTF-8? Thanks a lot in advance!

Regards, Rodrigo.

EDIT: The argument I pass in the .bat is Martín, and the output (displayed in a JLabel) shows MartÝn.

Joni
rMaero
  • Did you already try "UTF-8" instead of "Cp1252"? – Hamed Feb 24 '12 at 18:42
  • You said "some characters displayed wrongly" but did not show how you are displaying the string. My guess is that the problem is on the output side, while the input parameters are probably correct. – Jörn Horstmann Feb 24 '12 at 18:45
  • Yes, I did... It gave me different (still wrong) characters. Thank you for reminding me though – rMaero Feb 24 '12 at 18:46
  • use "UTF8", no scoreline – Alfabravo Feb 24 '12 at 19:11
  • For anyone else wanting to find the list of possible encodings, the new URL seems to be: http://docs.oracle.com/javase/8/docs/technotes/guides/intl/encoding.doc.html – prewett Apr 29 '14 at 11:39

1 Answer


The Windows command prompt cmd.exe doesn't actually use CP1252. The code page it uses depends on the system; on Western European systems it's most likely CP850. So you can try this:

     titulo = new String(args[1].getBytes(), "Cp850");

You can look at the code table for cp850 to check what should happen: í is the byte ED in Latin-1 (and, by extension, in cp1252), while the byte ED in cp850 is Ý. Therefore, if you print "í" from Java to cmd.exe using cp1252, it will show up as "Ý".
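The code-table lookup can also be checked from Java itself. This is only a minimal sketch, assuming the JDK in use ships the extended charsets so that `IBM850` (Java's canonical name for cp850) is available; the names `CodePageLookup` and `decode` are made up for illustration.

```java
import java.nio.charset.Charset;

final class CodePageLookup {
    // Hypothetical helper: decode a single byte value under the named charset.
    static String decode(int byteValue, String charsetName) {
        byte[] single = { (byte) byteValue };
        return new String(single, Charset.forName(charsetName));
    }

    public static void main(String[] args) {
        // Print code points instead of glyphs, so the console's own
        // code page cannot distort what is shown.
        for (String name : new String[] { "windows-1252", "IBM850" }) {
            for (int b : new int[] { 0xED, 0xA1 }) {
                int codePoint = decode(b, name).codePointAt(0);
                System.out.printf("%s byte %02X -> U+%04X%n", name, b, codePoint);
            }
        }
    }
}
```

For byte ED this should report U+00ED (í) under windows-1252 and U+00DD (Ý) under IBM850, matching the code tables.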

(But you seem to be seeing the reverse: "í" from the terminal shows up as "Ý" in a GUI. That doesn't make sense: cmd.exe should pass the byte A1 to Java, which should interpret it as "¡".)
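Both directions of the mix-up can be reproduced with explicit charsets instead of the platform default that a bare `getBytes()` uses. This is a sketch under the same assumption that `IBM850` is available in the JDK; `CodePageReinterpret` and `reinterpret` are hypothetical names, and the string is written with a Unicode escape so the source file's encoding does not matter.

```java
import java.nio.charset.Charset;

final class CodePageReinterpret {
    // Encode text with one charset, then decode the same bytes with another.
    static String reinterpret(String text, Charset encodeWith, Charset decodeWith) {
        return new String(text.getBytes(encodeWith), decodeWith);
    }

    public static void main(String[] args) {
        Charset cp1252 = Charset.forName("windows-1252");
        Charset cp850 = Charset.forName("IBM850"); // assumption: extended charsets installed

        String name = "Mart\u00EDn"; // the argument from the question, with an escaped i-acute

        // cp1252 bytes read as cp850: the i-acute (byte ED) becomes Y-acute (U+00DD).
        String readAsCp850 = reinterpret(name, cp1252, cp850);
        // cp850 bytes read as cp1252: the i-acute (byte A1) becomes an inverted exclamation mark (U+00A1).
        String readAsCp1252 = reinterpret(name, cp850, cp1252);

        System.out.println(readAsCp850.equals("Mart\u00DDn"));
        System.out.println(readAsCp1252.equals("Mart\u00A1n"));
    }
}
```

The first case matches the output reported in the question, the second is what the code tables predict for input typed in a cp850 console.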

Joni
  • Then the proper solution would be to encode the .bat file as CP850 or change the console encoding, not this horrible hack. – Jörn Horstmann Feb 24 '12 at 23:35
  • Sure.. the problem is that most text editors on Windows don't support CP850, and there's no way to change the console encoding. **Edit** Just read you can use `chcp 1252` to change the console code page, but that may break some console apps – Joni Feb 24 '12 at 23:47