I am learning Java and found out that a Java char ranges from 0 to 65535 and that Java uses Unicode to represent characters. So I ran the following code to see what all the characters are:

class A {
    public static void main(String[] args) {
        char x = 0;
        for (int i = 0; i < 65536; i++) {
            // x is incremented before printing, so line i shows character i + 1
            x++;
            System.out.println(i + "th character is: " + x);
        }
    }
}

What I found:

  1. The first 126 characters are the same as the ASCII characters.

  2. After the 126th character, it just shows a '?' mark.

Output:

...
127th character is: ?
128th character is: ?
129th character is: ?
130th character is: ?
131th character is: ?
132th character is: ?
133th character is: ?
...
65534th character is: ?

My question is: why does it show a '?' mark instead of the Unicode characters?

Remy Lebeau
  • 1
    I'd say they just can't be displayed – XtremeBaumer Feb 08 '17 at 14:11
  • Well how are you running this? In an IDE? From a command line? Which operating system? Note that not every character *is* printable. – Jon Skeet Feb 08 '17 at 14:11
  • 5
    The encoding of your console is not correctly set – Reimeus Feb 08 '17 at 14:12
  • @JonSkeet I am running this from the command line on Windows 10 – Abhinav Kumar Feb 08 '17 at 14:13
  • 1
    @AbhinavKumar: you need to configure a font in the console that is capable of displaying those characters (and you probably also need to change the command line encoding to UTF8 using `chcp 65001`) –  Feb 08 '17 at 14:16
  • Please check your encoding configuration. The encoding is not set correctly, which is why special characters and symbols are not displayed. – Ashraf.Shk786 Feb 08 '17 at 14:16
  • http://superuser.com/questions/269818/change-default-code-page-of-windows-console-to-utf-8 You will need UTF-8; I am not sure whether this guide works on Windows 10. – XtremeBaumer Feb 08 '17 at 14:16
  • And then make sure you're only looking for printable characters. All the ones you've listed are non-printable. – Jon Skeet Feb 08 '17 at 14:24
  • Try to use `Character x` instead of `char x` – Absolut Feb 08 '17 at 14:35
  • 1
    Some Unicode [characters](http://docs.oracle.com/javase/8/docs/api/java/lang/Character.html) require two UTF-16 code units (`char`). So, to run through them all (including unassigned, private use, etc), go `Character.MIN_CODE_POINT` to `Character.MAX_CODE_POINT` except where ` – Tom Blodget Feb 08 '17 at 19:11
  • 1
    @TomBlodget: this also means that Unicode codepoints that require UTF-16 surrogates (U+10000 to U+10FFFF) must be output using a `String` or `char[]` instead of a single `char`. Use `Character.toChars(int)` to convert a Unicode codepoint into a valid UTF-16 `char[]` sequence, and then you can convert the `char[]` to a `String` if needed. – Remy Lebeau Feb 09 '17 at 23:00
  • See this SO answer: http://stackoverflow.com/questions/217237/what-does-t-mean-when-my-text-is-displayed-as-question-marks – O.Badr Feb 10 '17 at 05:11
  • From oracle (http://www.oracle.com/us/technologies/java/supplementary-142654.html): "`The Unicode standard therefore has been extended to allow up to 1,112,064 characters. Those characters that go beyond the original 16-bit limit are called supplementary characters`" and : "`Supplementary character support in the Java platform was designed by the JSR-204 expert group within the Java Community Process.`" – O.Badr Feb 10 '17 at 05:18
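
The comments above suggest walking code points rather than `char` values, skipping non-printable ones, and using `Character.toChars(int)` for anything beyond U+FFFF. A minimal sketch of that idea follows. The names `CodePointLister` and `render` are illustrative and do not come from the thread, and the filter is only one plausible reading of "printable". Whether a glyph actually appears still depends on the console encoding and font.

```java
class CodePointLister {
    // Returns the text for one code point, or null if it should be skipped.
    // Assumption: "printable" here means defined, not a control character,
    // not a lone surrogate, and not a private-use code point.
    static String render(int codePoint) {
        if (!Character.isDefined(codePoint) || Character.isISOControl(codePoint)) {
            return null;
        }
        int type = Character.getType(codePoint);
        if (type == Character.SURROGATE || type == Character.PRIVATE_USE) {
            return null;
        }
        // toChars yields one char for U+0000..U+FFFF and a surrogate pair above that.
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        for (int cp = Character.MIN_CODE_POINT; cp <= Character.MAX_CODE_POINT; cp++) {
            String s = render(cp);
            if (s != null) {
                System.out.println(String.format("U+%04X: %s", cp, s));
            }
        }
    }
}
```

Looping over an `int` also avoids the overflow that a `char` counter hits at 65535.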

1 Answer

-2

Check your file encoding with the following line and see what it prints. If it is not 'UTF-8', set it correctly. You still won't see every char printed, so decide which characters you need and set the file encoding accordingly.

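// Print the encoding the JVM picked up as its default.
System.out.println(System.getProperty("file.encoding"));
// Caveat: the JVM reads file.encoding at startup, so setting it here does not re-encode System.out.
System.setProperty("file.encoding", "UTF-8");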
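
One possible alternative to changing `file.encoding` is to wrap the output stream in a `PrintStream` with an explicit charset. This is a sketch only. The names `Utf8Output` and `utf8` are hypothetical, and it assumes the console has been switched to UTF-8 (for example with `chcp 65001`, as suggested in the comments) and uses a font that contains the glyphs.

```java
import java.io.OutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;

class Utf8Output {
    // Wraps a stream so text is encoded as UTF-8 regardless of the JVM default charset.
    static PrintStream utf8(OutputStream target) {
        try {
            return new PrintStream(target, true, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not happen.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        PrintStream out = utf8(System.out);
        // A few non-ASCII samples: Latin-1, Greek, and CJK.
        out.println("\u00e9 \u03a9 \u4e2d");
    }
}
```

If the console is not decoding UTF-8, the same bytes will still show up as '?' or as garbled characters, so the console setting and this wrapper have to agree.
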
Paresh