
I have found that the default value of a char instance variable is '\u0000' (the Unicode null character). But when I tried the piece of code below, I could only see an empty print line. Please give me clarification.

public class Basics {

    char c; // default value '\u0000'
    int x;  // default value 0

    public static void main(String[] args) {
        Basics s = new Basics();

        System.out.println(s.c);
        System.out.println(s.x);
    }
}

Console output is as follows:

(empty line)
0
  • google yields: stackoverflow.com/questions/44878530/print-unicode-character-in-java – theRiley Oct 07 '18 at 10:38
  • There are [control picture](http://www.unicode.org/charts/nameslist/index.html) characters that you could substitute in—that makes them human-readable but of course changes the text. ␀. Note: the line might look empty to you but it is not. Send it to a file and look at the bytes. – Tom Blodget Oct 07 '18 at 14:41

3 Answers


'\u0000' (char c = 0;) is a Unicode control character. You are not supposed to see it.

System.out.println(Character.isISOControl(s.c) ? "<control>" : s.c);
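If you would rather see something visible, one option (a small sketch of my own, not part of the original answer) is to print the character's numeric escape instead:

char c = '\u0000'; // same value as the field's default
System.out.println(String.format("\\u%04x", (int) c)); // prints \u0000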
– Andrew Tobilko

Try

System.out.println((int) s.c);

if you want to see the numeric value of the default char (which is 0).

Otherwise, it just prints a blank (not an empty line).

You can see that it's not an empty line if you add visible characters before and after s.c:

System.out.print("--->");
System.out.print(s.c);
System.out.println("<---");

will print:

---> <---
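
Another way to convince yourself (a rough sketch of my own, reusing the Basics class from the question) is to check the length and the code point of what was printed:

Basics s = new Basics();
String out = "--->" + s.c + "<---";
System.out.println(out.length());        // 9, not 8 -- the NUL really is there
System.out.println((int) out.charAt(4)); // 0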
– Eran

Could you please provide more information about why Unicode was selected as the default value for the char data type? Is there any specific reason behind this?

It was recognized that the language that was to become Java needed to support multilingual character sets by default. At that time Unicode was the new standard way of doing it¹. When Java first adopted Unicode, Unicode used 16-bit codes exclusively. That caused the Java designers to specify char as an unsigned 16-bit integral type. Unfortunately, Unicode rapidly expanded beyond 16 bits, and Java had to adapt ... by switching to UTF-16 as Java's native in-memory text encoding scheme.

But note that:

  • In the latest versions of Java, you have the option of enabling a more compact representation for text data.
  • The width of char is so hard-wired that it would be impossible to change. In fact, if you want to represent a Unicode code point, you should use an int rather than a char (see the sketch below the footnote).

¹ It is still the standard way. AFAIK there are no credible alternatives to Unicode at this time.
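
As a rough sketch of that last point (my own illustration; the class name and value are arbitrary), a supplementary character does not fit in a single 16-bit char and has to be handled as an int code point:

public class CodePointDemo {
    public static void main(String[] args) {
        int codePoint = 0x1F600; // a code point above U+FFFF

        System.out.println(Character.isSupplementaryCodePoint(codePoint)); // true

        // In a String it is stored as a surrogate pair: two chars, one code point.
        String s = new String(Character.toChars(codePoint));
        System.out.println(s.length());                      // 2
        System.out.println(s.codePointCount(0, s.length())); // 1
    }
}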


The specific reason that \u0000 was chosen as the default initial value for char is that it is zero. Objects are default-initialized by writing all zero bytes to all fields, irrespective of their types. This maps to zero for integral and floating-point types, false for boolean, and null for reference types.

It so happens that the \u0000 character maps to the ASCII NUL control character, which is a non-printing character.
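
A minimal sketch of that zero-initialization (the class and field names here are mine, purely for illustration):

public class Defaults {
    char c;     // '\u0000'
    int i;      // 0
    double d;   // 0.0
    boolean b;  // false
    Object o;   // null

    public static void main(String[] args) {
        Defaults x = new Defaults();
        System.out.println((int) x.c); // 0
        System.out.println(x.i);       // 0
        System.out.println(x.d);       // 0.0
        System.out.println(x.b);       // false
        System.out.println(x.o);       // null
    }
}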

– Stephen C