I've been searching for ages and haven't found anyone with the same problem. When I run my program in Eclipse, everything looks fine. As soon as I run it in the Windows command prompt (cmd), all non-ASCII characters in my ArrayList are replaced with a ?. The Dog class has two String attributes, "name" and "race".

Here's the code that prints the list in my main program:

System.out.println("\r\nLista på hundar i hundregistret: " + viewDogList.toString() + "\r\n");

Here are the relevant attributes and methods from my Dog class:

private String name; // attribute for the dog's name
private String race; // attribute for the dog's race

// Constructor signature (body not shown)
public Dog(String name, String race, int age, double weight)

public String getName() { // get the dog's name
    return name;
}

public void setName(String name) { // set the dog's name
    this.name = name;
}

public String getRace() {
    return race;
}

public void setRace(String race) { // set the dog's breed
    this.race = race;
}

This is how the Dog list is constructed and the Dog object added:

ArrayList<Dog> viewDogList = new ArrayList<Dog>();
Dog dogInstance = new Dog("", "", 0, 0.0);
viewDogList.add(dogInstance);

When I print out the list in Eclipse after I've added a Dog object it is displayed as:

[Bjäbbis Schäfer 12 år 12.0 kg svans=14.4]

However, if I compile and run the program in CMD the same line is displayed as:

[Bj?bbis Sch?fer 12 år 12.0 kg svans=14.4]

Is there any way to get this to work? I have read something about byte, string, and character conversions, but I don't think that's what I'm looking for.

EDIT: I forgot to mention that all strings unrelated to the ArrayList are displayed properly in the Windows cmd. So it's strange that only the ArrayList contents are displayed incorrectly.

I have also overridden the toString method in the Dog class like so:

public String toString() {
    return name + " " + race + " " + getAge() + " år " + getWeight() + " kg " + "svans="+ getTailLength();
}

Any help appreciated! TIA

CLTX
  • This is a problem with a bad setup of your Windows console; configure it so that it uses a font which is able to display such characters. Or make a jump into the 21st century and use a Cygwin terminal :p – fge Nov 26 '15 at 13:47
  • Possible duplicate of [Output non-utf8 symbols to console](http://stackoverflow.com/questions/9822039/output-non-utf8-symbols-to-console) – Jiri Tousek Nov 26 '15 at 13:48
  • Other possibly related posts: http://stackoverflow.com/questions/5965195/utf-8-cjk-characters-not-displaying-in-java and http://stackoverflow.com/questions/54952/java-utf-8-and-windows-console – Jiri Tousek Nov 26 '15 at 13:49
  • Check these 3: http://stackoverflow.com/questions/1259084/what-encoding-code-page-is-cmd-exe-using and http://stackoverflow.com/questions/19955385/utf-8-in-windows-7-cmd and http://stackoverflow.com/questions/388490/unicode-characters-in-windows-command-line-how/388500#388500. **Basically**, you need to make sure that your command prompt understands (encodes and decodes) those characters. For that, use the Lucida Console font, which supports Unicode characters from the BMP, and run this command in the command prompt: `chcp 65001` – hagrawal7777 Nov 26 '15 at 13:51

1 Answer

EDIT: I have revised my answer so that its information is now correct.

The reason normal strings containing special characters such as å, ä, ö worked is that they are encoded in a way that cmd can read.

When you use the Scanner, the input strings end up encoded in a way that cmd cannot read. You therefore have to make sure all Scanner input is read with the right encoding so that cmd can display it.
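To illustrate how a ? can appear, here is a minimal sketch. It rests on assumptions the question does not confirm: that cmd delivers keyboard input as code page 850 bytes, that the JVM decodes them with a default charset of windows-1252, and that output is encoded back to code page 850. The class name MojibakeDemo and the helper roundTrip are made up for this example.

```java
import java.nio.charset.Charset;

public class MojibakeDemo {

    // Decodes bytes with one charset, then re-encodes the text with another.
    static byte[] roundTrip(byte[] input, String decodeWith, String encodeWith) {
        String text = new String(input, Charset.forName(decodeWith));
        // getBytes(Charset) substitutes the charset's replacement byte
        // for characters that the target charset cannot represent.
        return text.getBytes(Charset.forName(encodeWith));
    }

    public static void main(String[] args) {
        // Assumption: cmd sends the letter a-with-diaeresis as the single byte 0x84 (cp850).
        byte[] typed = {(byte) 0x84};

        // Assumed mismatch: decoded as windows-1252, printed as cp850.
        byte[] mismatched = roundTrip(typed, "windows-1252", "cp850");
        // Same charset on both sides: the byte survives unchanged.
        byte[] matched = roundTrip(typed, "cp850", "cp850");

        System.out.println("mismatched byte is '?': " + (mismatched[0] == (byte) '?'));
        System.out.println("matched byte is 0x84: " + (matched[0] == (byte) 0x84));
    }
}
```

If the assumptions hold, this would also explain why string literals in the source code print fine: they never pass through the mismatched input decoding step.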

It's possible to set the character encoding of input by modifying Scanner:

new Scanner(System.in, "UTF-8")
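As a rough sketch of what the charset argument changes, the example below feeds the same bytes to two Scanners. A ByteArrayInputStream stands in for System.in so it can run anywhere; the class name ScannerCharsetDemo and the helper readFirstLine are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Scanner;

public class ScannerCharsetDemo {

    // Reads the first line of the stream, decoding bytes with the named charset.
    static String readFirstLine(InputStream in, String charsetName) {
        Scanner scanner = new Scanner(in, charsetName);
        return scanner.hasNextLine() ? scanner.nextLine() : "";
    }

    public static void main(String[] args) {
        // The name from the question plus a newline, encoded as UTF-8
        // (the a-with-diaeresis is the two bytes 0xC3 0xA4).
        byte[] utf8Input = {'B', 'j', (byte) 0xC3, (byte) 0xA4, 'b', 'b', 'i', 's', '\n'};

        String decoded = readFirstLine(new ByteArrayInputStream(utf8Input), "UTF-8");
        String misread = readFirstLine(new ByteArrayInputStream(utf8Input), "windows-1252");

        // \u00e4 is the a-with-diaeresis; the escape keeps this source file pure ASCII.
        System.out.println("UTF-8 decode matches: " + decoded.equals("Bj\u00e4bbis"));
        System.out.println("windows-1252 decode matches: " + misread.equals("Bj\u00e4bbis"));
    }
}
```

The charset passed to Scanner only helps if it matches the bytes the console actually sends, which is why the choice between "UTF-8" and a code page name matters here.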

EDIT: A separate problem on Windows kept cmd from accepting chcp changes; see "'chcp' is not recognized as an internal or external command, operable program or batch file" on a Windows PC.

EDIT: Setting cmd to chcp 65001/UTF-8 did not work.

Conclusion: cmd does not use UTF-8 by default, and setting cmd to UTF-8 (chcp 65001) does not work with Java. The output is still incorrect, and the program crashes if you input non-ASCII characters anyway.

EDIT:

I could not find any way to make cmd work with UTF-8 in my setup. I had to change the scanner to:

new Scanner(System.in, "cp850")

Of course, this made Eclipse display the å, ä, ö characters incorrectly, so I had to manually set the Eclipse console encoding to cp850, which is what my Windows cmd uses by default.
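One possible way to avoid hard-coding "cp850" is to read the charset name from a system property and use it for both input and output. This is only a sketch, not something I tested in the original program: the property name dogregister.console.encoding, the class ConsoleEncodingConfig, and its helpers are all invented for illustration, and byte-array streams stand in for System.in and System.out.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.util.Scanner;

public class ConsoleEncodingConfig {

    // Hypothetical property name, invented for this sketch; not a JVM setting.
    static final String PROPERTY = "dogregister.console.encoding";

    // Returns the configured charset name, or the JVM default if none is set.
    static String consoleCharsetName() {
        return System.getProperty(PROPERTY, Charset.defaultCharset().name());
    }

    // Copies one line from in to out, using the same charset in both directions.
    static void echoLine(InputStream in, OutputStream out, String charsetName) {
        try {
            Scanner scanner = new Scanner(in, charsetName);
            PrintStream printer = new PrintStream(out, true, charsetName);
            if (scanner.hasNextLine()) {
                printer.print(scanner.nextLine());
            }
            printer.flush();
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // Stand-ins for System.in and System.out; the input is cp850 bytes.
        byte[] typed = {'B', 'j', (byte) 0x84, 'b', 'b', 'i', 's', '\n'};
        ByteArrayOutputStream shown = new ByteArrayOutputStream();

        // Normally this would be passed on the command line with -D instead.
        System.setProperty(PROPERTY, "cp850");
        echoLine(new ByteArrayInputStream(typed), shown, consoleCharsetName());

        System.out.println("bytes written: " + shown.size());
    }
}
```

With something like this, cmd could be started with -Ddogregister.console.encoding=cp850 while Eclipse keeps its own console setting, so the same code would run in both without changing the Eclipse console by hand.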

Microsoft is at fault for all of this. It makes no sense that cmd doesn't support UTF-8 and never has. It's so stupid. I bet it has to do with greedy M$ wanting $$$.

CLTX