
Why does the Java Language Specification allow a Unicode escape sequence to contain one or more 'u' characters before the four hex digits?

How do we print the 'A' character?

char c = '\u0041';
System.out.println("c: " + c);

// Prints: 
// c: A

Ok, but this prints 'A' too:

char c = '\uu0041';
System.out.println("c: " + c);

// Prints: 
// c: A

As does this:

char c = '\uuuuuuuuuuuuu0041';
System.out.println("c: " + c);

// Prints: 
// c: A

So there seems to be an unlimited number of Unicode escape sequences that all represent the same Unicode character (ha!).

Question: What purpose does this serve in the language?

DataDino
  • See the duplicate. The reason (straight from the JLS) is: "The Java programming language specifies a standard way of transforming a program written in Unicode into ASCII that changes a program into a form that can be processed by ASCII-based tools. The transformation involves converting any Unicode escapes in the source text of the program to ASCII by adding an extra u - for example, \uxxxx becomes \uuxxxx - while simultaneously converting non-ASCII characters in the source text to Unicode escapes containing a single u each." – Erwin Bolwidt Sep 20 '16 at 05:38
  • @ErwinBolwidt, good find. Thanks! – DataDino Sep 20 '16 at 05:40
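
The transformation quoted from the JLS can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the JDK's own tooling: the class name UnicodeEscapeRoundTrip and the methods toAscii and fromAscii are hypothetical, and the input is assumed to contain only well-formed escapes (a backslash, one or more 'u' characters, then exactly four hex digits).

```java
public class UnicodeEscapeRoundTrip {

    // Hypothetical helper: turns Unicode source text into pure ASCII.
    // Existing escapes gain one extra 'u'; every non-ASCII char becomes
    // a fresh escape with a single 'u'.
    public static String toAscii(String source) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < source.length()) {
            char ch = source.charAt(i);
            if (ch == '\\') {
                // Copy the whole run of backslashes.
                int start = i;
                while (i < source.length() && source.charAt(i) == '\\') {
                    i++;
                }
                out.append(source, start, i);
                // An odd-length run followed by 'u' starts an escape:
                // mark it as pre-existing by adding one more 'u'.
                boolean oddRun = (i - start) % 2 == 1;
                if (oddRun && i < source.length() && source.charAt(i) == 'u') {
                    out.append('u');
                }
            } else if (ch > 0x7F) {
                // Non-ASCII UTF-16 code unit: emit a single-u escape.
                out.append(String.format("\\u%04x", (int) ch));
                i++;
            } else {
                out.append(ch);
                i++;
            }
        }
        return out.toString();
    }

    // Hypothetical inverse: escapes with several 'u's lose one 'u';
    // escapes with a single 'u' are decoded back to the character.
    // Assumes well-formed input (four hex digits after the 'u' run).
    public static String fromAscii(String ascii) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < ascii.length()) {
            char ch = ascii.charAt(i);
            if (ch != '\\') {
                out.append(ch);
                i++;
                continue;
            }
            int start = i;
            while (i < ascii.length() && ascii.charAt(i) == '\\') {
                i++;
            }
            int slashes = i - start;
            int uStart = i;
            while (i < ascii.length() && ascii.charAt(i) == 'u') {
                i++;
            }
            int uCount = i - uStart;
            if (slashes % 2 == 1 && uCount == 1) {
                // Single u: drop the escaping backslash and decode.
                out.append(ascii, start, start + slashes - 1);
                out.append((char) Integer.parseInt(ascii.substring(i, i + 4), 16));
                i += 4;
            } else if (slashes % 2 == 1 && uCount > 1) {
                // Several u's: keep the escape but with one fewer 'u'.
                out.append(ascii, start, uStart);
                out.append(ascii, uStart, i - 1);
            } else {
                // Not an escape: copy everything scanned so far unchanged.
                out.append(ascii, start, i);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The literal holds an escaped 'A' (kept as text via a doubled
        // backslash) plus one real non-ASCII character, e-acute.
        String original = "char c = '\\u0041'; // caf\u00e9";
        String ascii = toAscii(original);
        System.out.println(ascii);
        System.out.println(fromAscii(ascii).equals(original));
    }
}
```

Run as written, the first line printed should be the ASCII form, in which the pre-existing escape has become \uu0041 and the é has become \u00e9; the second line should be true. The extra 'u' is what makes the round trip lossless: on the way back, a single 'u' means "this was a real non-ASCII character", while two or more mean "this was already an escape in the original source".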
