Why '?' appears as output while printing Unicode characters in Java

Question

When printing certain Unicode characters in Java, the output is '?'. Why does this happen, and is there a way to print these characters?

This is my code

String symbol1 = "\u200d"; // U+200D ZERO WIDTH JOINER
StringBuilder strg = new StringBuilder("unicodecharacter");
strg.insert(5, symbol1);
System.out.println("After insertion...");
System.out.println(strg.toString());

Output is:

After insertion...
unico?decharacter

Which certain characters, and printed how? Please share some code. This could be an encoding problem, a problem in processing character data, or a font problem. That’s about all one can say without real information about the situation. — Jukka K. Korpela, Sep 26 '13 at 20:49
You are printing with a non-Unicode encoding (as Unicode has all). If the encoding is ISO-8859-1 (Latin-1) you could try Windows-1252 (Windows Latin-1, a bit more). `new OutputStreamWriter(outputStream, "Windows-1252")`. — Joop Eggen, Sep 26 '13 at 20:57
What characters and where are you printing them? If you are trying to get arbitrary Unicode out to the Windows console just give up now, it's unresolvably broken. — bobince, Sep 27 '13 at 10:31

score 3 · Answer 1 · answered Sep 26 '13 at 20:49

Here's a great article, written by Joel Spolsky, on the topic. It won't directly help you solve your problem, but it will help you understand what's going on. It'll also show you how involved the situation really is.

Daniel Kaplan

Thanks for the article; it improved my understanding. – user2821099 Sep 30 '13 at 08:47

score 2 · Answer 2 · answered Sep 26 '13 at 20:46

The character encoding you are using doesn't match the characters you have, or the characters aren't supported by the display.

I would check which encoding you are using throughout, and try to determine whether you are reading, storing, and printing the value correctly.
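One way to run that check is sketched below. It asks the JVM for its default charset and tests whether a given charset can represent the string from the question. The class and method names (`EncodingCheck`, `canEncode`) are made up for illustration, and the default charset is only an approximation of what the console uses, since on some Java versions and platforms the two differ.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingCheck {
    // Hypothetical helper: true if the charset can represent every character in text.
    static boolean canEncode(Charset charset, String text) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        String text = "unico\u200ddecharacter";
        Charset defaultCharset = Charset.defaultCharset();

        // The default charset is an assumption about the console; it may not match it.
        System.out.println("Default charset: " + defaultCharset);
        System.out.println("Default can encode text: " + canEncode(defaultCharset, text));
        System.out.println("ISO-8859-1 can encode text: "
                + canEncode(StandardCharsets.ISO_8859_1, text));
        System.out.println("UTF-8 can encode text: "
                + canEncode(StandardCharsets.UTF_8, text));
    }
}
```

If the charset in use reports that it cannot encode the text, a '?' in the output is the expected symptom.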

score 0 · Answer 3 · answered Sep 26 '13 at 20:51

Do you know which encoding you need? You may need to explicitly encode your output as UTF-8, or as ISO 8859-1 if you are dealing only with Western European characters.
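A minimal sketch of explicitly encoding output as UTF-8 follows, assuming the terminal or file that receives the bytes is itself set up to interpret UTF-8. The names `Utf8Output` and `utf8Stream` are hypothetical. Note that U+200D is a zero-width character, so even when it is encoded correctly there is nothing visible to see at that position.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8Output {
    // Hypothetical helper: wraps a stream in an auto-flushing PrintStream that writes UTF-8.
    static PrintStream utf8Stream(OutputStream out) {
        try {
            return new PrintStream(out, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8Stream(System.out);
        StringBuilder strg = new StringBuilder("unicodecharacter");
        strg.insert(5, "\u200d");
        out.println("After insertion...");
        out.println(strg);
    }
}
```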

It Grunt

score 0 · Answer 4 · answered Sep 26 '13 at 23:52

When Java decodes bytes that are not valid in the expected encoding, its default behaviour is to substitute the replacement character (\uFFFD). This character is often rendered as a question mark.

In your case, the text you're reading is not encoded as Unicode; it's encoded as something else (Windows-1252 and ISO-8859-1 are probably the most common alternatives if your text is in English).
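The substitution can be observed in both directions. The sketch below uses only standard `String` methods; the class and helper names are illustrative. Decoding a stray ISO-8859-1 byte as UTF-8 yields U+FFFD, while encoding U+200D into ISO-8859-1, which has no such character, yields a literal '?' byte. The second case is the one that matches the code in the question, where the string is being written out rather than read in.

```java
import java.nio.charset.StandardCharsets;

class ReplacementDemo {
    // Decodes bytes as UTF-8; malformed sequences become U+FFFD.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Encodes text as ISO-8859-1; unmappable characters become the byte for '?'.
    static byte[] encodeLatin1(String text) {
        return text.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // 0xE9 is a valid ISO-8859-1 byte but not a complete UTF-8 sequence.
        String decoded = decodeUtf8(new byte[] {'c', 'a', 'f', (byte) 0xE9});
        System.out.println(decoded.charAt(3) == '\uFFFD'); // prints true

        byte[] encoded = encodeLatin1("unico\u200ddecharacter");
        System.out.println((char) encoded[5]); // prints ?
    }
}
```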

Aurand

Thank you very much. I changed the encoding and now the code is working fine. – user2821099 Sep 30 '13 at 08:41

score 0 · Answer 5 · answered Oct 29 '17 at 15:26

I wrote an open source library that has a utility for converting any String to a Unicode escape sequence and vice versa. It helps to diagnose such issues. For instance, to print your String you can use something like this:

String str = StringUnicodeEncoderDecoder.decodeUnicodeSequenceToString("\\u0197" +
    StringUnicodeEncoderDecoder.encodeStringToUnicodeSequence("Test"));

You can read about the library, where to download it, and how to use it in the article "Open Source Java library with stack trace filtering, Silent String parsing Unicode converter and Version comparison". See the paragraph "String Unicode converter".
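For a quick diagnosis without any extra dependency, the same idea can be sketched in plain Java. This is not the library's implementation, only an illustration with hypothetical names (`UnicodeEscapes`, `toEscapes`). It writes out each UTF-16 code unit as an escape, so an invisible character such as U+200D shows up even when the console cannot display it.

```java
class UnicodeEscapes {
    // Hypothetical helper: renders each UTF-16 code unit as a backslash-u escape
    // with four lowercase hex digits.
    static String toEscapes(String text) {
        StringBuilder sb = new StringBuilder();
        for (char c : text.toCharArray()) {
            sb.append(String.format("\\u%04x", (int) c));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // The output is pure ASCII, so it survives any console encoding.
        System.out.println(toEscapes("unico\u200dde"));
    }
}
```

If the escape for U+200D appears in this output, the string itself is intact and the '?' is introduced only when the text is encoded for the console.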
