Why does the Java Language Specification allow this: \uuuu0041?

Asked Sep 20 '16 at 05:28

Active Sep 20 '16 at 05:28

Viewed 63 times

Why does the Java Language Specification allow a Unicode escape sequence to contain one or more 'u' characters before the four hex digits?

How do we print the 'A' character?

char c = '\u0041';
System.out.println("c: " + c);

// Prints: 
// c: A

Ok, but this prints 'A' too:

char c = '\uu0041';
System.out.println("c: " + c);

// Prints: 
// c: A

As does this:

char c = '\uuuuuuuuuuuuu0041';
System.out.println("c: " + c);

// Prints: 
// c: A

So there seems to be an unlimited number of Unicode escape sequences that represent any given Unicode character (ha!).

Question: What purpose does this serve in the language?


DataDino


See the duplicate. The reason (straight from the JLS) is: "The Java programming language specifies a standard way of transforming a program written in Unicode into ASCII that changes a program into a form that can be processed by ASCII-based tools. The transformation involves converting any Unicode escapes in the source text of the program to ASCII by adding an extra u - for example, \uxxxx becomes \uuxxxx - while simultaneously converting non-ASCII characters in the source text to Unicode escapes containing a single u each." – Erwin Bolwidt Sep 20 '16 at 05:38
@ErwinBolwidt, good find. Thanks! – DataDino Sep 20 '16 at 05:40
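
To make the quoted JLS passage concrete, here is a minimal sketch of the round trip it describes, assuming well-formed input. The class and method names (UnicodeEscapeRoundTrip, toAscii, toUnicode) are hypothetical and not part of any JDK API, and corner cases such as an escape that itself encodes a backslash are ignored. Going to ASCII, every existing escape gains one extra 'u' and every non-ASCII character becomes a single-'u' escape. Going back, multi-'u' escapes lose one 'u' and single-'u' escapes are decoded to the character they name.

```java
final class UnicodeEscapeRoundTrip {

    // Unicode source text to ASCII: existing escapes gain one extra 'u',
    // and each non-ASCII UTF-16 code unit becomes a single-'u' escape.
    static String toAscii(String source) {
        StringBuilder out = new StringBuilder();
        int backslashes = 0; // contiguous backslashes seen just before position i
        for (int i = 0; i < source.length(); i++) {
            char ch = source.charAt(i);
            if (ch == '\\') {
                // A backslash can start an escape only after an even number of backslashes.
                boolean startsEscape = backslashes % 2 == 0
                        && i + 1 < source.length()
                        && source.charAt(i + 1) == 'u';
                out.append(ch);
                if (startsEscape) {
                    out.append('u'); // the extra 'u' marks this escape as present in the original
                }
                backslashes++;
            } else {
                backslashes = 0;
                if (ch > 0x7F) {
                    out.append(String.format("\\u%04x", (int) ch));
                } else {
                    out.append(ch);
                }
            }
        }
        return out.toString();
    }

    // ASCII form back to Unicode source text: multi-'u' escapes lose one 'u',
    // single-'u' escapes are decoded. Assumes four hex digits follow the 'u' run.
    static String toUnicode(String ascii) {
        StringBuilder out = new StringBuilder();
        int backslashes = 0;
        int i = 0;
        while (i < ascii.length()) {
            char ch = ascii.charAt(i);
            boolean startsEscape = ch == '\\'
                    && backslashes % 2 == 0
                    && i + 1 < ascii.length()
                    && ascii.charAt(i + 1) == 'u';
            if (!startsEscape) {
                out.append(ch);
                backslashes = (ch == '\\') ? backslashes + 1 : 0;
                i++;
                continue;
            }
            int j = i + 1;
            while (j < ascii.length() && ascii.charAt(j) == 'u') {
                j++;
            }
            int uCount = j - (i + 1);
            String hex = ascii.substring(j, j + 4);
            if (uCount == 1) {
                out.append((char) Integer.parseInt(hex, 16));
            } else {
                out.append('\\');
                for (int k = 1; k < uCount; k++) {
                    out.append('u');
                }
                out.append(hex);
            }
            backslashes = 0;
            i = j + 4;
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The text holds a literal six-character escape for 'A' plus one real non-ASCII character.
        String original = "char c = '\\u0041'; // caf\u00e9";
        String ascii = toAscii(original);
        System.out.println(ascii);
        System.out.println(toUnicode(ascii).equals(original));
    }
}
```

With this sample input, the ASCII form should read char c = '\uu0041'; // caf\u00e9, and converting it back restores the original text. The differing 'u' counts are what let the reverse step tell an escape the programmer wrote from one that was generated for a non-ASCII character.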
