5

Select the three correct answers (valid declarations).

(a) char a = '\u0061';

(b) char 'a' = 'a';

(c) char \u0061 = 'a';

(d) ch\u0061r a = 'a';

(e) ch'a'r a = 'a';

Answer: (a), (c) and (d)

Book:

A Programmer's Guide to Java SCJP Certification (Third Edition)

Can someone please explain options (c) and (d)? The IDE (IntelliJ IDEA) shows them in red with the error:

Cannot resolve symbol 'u0061'

(Screenshot: the error as shown in IntelliJ IDEA.)

  • 2
    Java allows unicode characters in source code: https://stackoverflow.com/questions/4448180/why-does-java-permit-escaped-unicode-characters-in-the-source-code – rdas Oct 08 '19 at 08:56
  • 3
    Why are (c) and (d) not working for you? What error do you get? – Thilo Oct 08 '19 at 08:56
  • 1
    Java converts the `\u0061` to the character `a`, that makes (c) valid. Since it is allowed and possible as a variable name, it is also allowed and possible in a type name, so `ch\u0061r` is just `char`, which makes (d) valid as well. – deHaar Oct 08 '19 at 08:59
  • 1
    "not working for me" is not very helpful (e.g. do *you* convert Unicode sequences?) If using a (standard) Java compiler this must work, is one of the first operations a compiler has to do. See: [Java Language Specification 3.3. Unicode Escapes](https://docs.oracle.com/javase/specs/jls/se13/html/jls-3.html#jls-3.3) – user85421 Oct 08 '19 at 09:13
  • 1
    I was wrong to say "not working for me": I had not compiled the code, because the IDE was showing 'Cannot resolve symbol'. It works once compiled. The issue is just that this is not yet supported by the IDE. Thanks – Raj Rajeshwar Singh Rathore Oct 08 '19 at 09:30
  • 1
    Not true, the IDE is not showing that... at least not mine; it also works fine with the `javac` compiler... maybe some settings or so, but no idea WHICH IDE you are using. – user85421 Oct 08 '19 at 10:06
  • @Carlos - I have updated the IDE name and also attached a screenshot of the code in the question. Thanks for asking. – Raj Rajeshwar Singh Rathore Oct 09 '19 at 04:44
  • 1
    Perfect! I don't use IntelliJ, but it must have an option to set the encoding, though I do not think that would help. Maybe some IntelliJ users can help; it really seems like a bug, so eventually search for or raise an issue at [JetBrains](https://youtrack.jetbrains.com/issues/IDEA) – user85421 Oct 09 '19 at 06:17
  • 4
    It is a known error (15 years!): [IDEABKL-89](https://youtrack.jetbrains.com/issue/IDEABKL-89) and [IDEA-65898](https://youtrack.jetbrains.com/issue/IDEA-65898) – user85421 Oct 09 '19 at 06:33
  • I wonder why this is still not corrected... there are plenty of duplicates of that ticket – user85421 Oct 09 '19 at 13:53

2 Answers

4

The compiler recognizes Unicode escapes in its input and translates them to UTF-16 code units before tokenization. ch\u0061r therefore becomes char, a valid primitive type, which makes option (d) correct.

3.3. Unicode Escapes

A compiler for the Java programming language ("Java compiler") first recognizes Unicode escapes in its input, translating the ASCII characters \u followed by four hexadecimal digits to the UTF-16 code unit (§3.1) for the indicated hexadecimal value, and passing all other characters unchanged.
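The translation happens before tokenization, so an escape may even sit inside a keyword. As a minimal sketch (the class name EscapeBeforeTokens is made up for illustration), the following should compile with a standard javac:

public class EscapeBeforeTokens {
    public static void main(String[] args) {
        // After escape translation this line reads: char c = 'a';
        ch\u0061r c = '\u0061';
        System.out.println(c); // prints: a
    }
}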

\u0061 is translated to a, which is a valid Java letter and can therefore be used to form an identifier. That makes option (c) correct.

3.8. Identifiers

An identifier is an unlimited-length sequence of Java letters and Java digits, the first of which must be a Java letter.

Identifier:
    IdentifierChars but not a Keyword or BooleanLiteral or NullLiteral
IdentifierChars:
    JavaLetter {JavaLetterOrDigit}
JavaLetter:
    any Unicode character that is a "Java letter"
JavaLetterOrDigit:
    any Unicode character that is a "Java letter-or-digit"

A "Java letter" is a character for which the method Character.isJavaIdentifierStart(int) returns true.

A "Java letter-or-digit" is a character for which the method Character.isJavaIdentifierPart(int) returns true.

The "Java letters" include uppercase and lowercase ASCII Latin letters A-Z (\u0041-\u005a), and a-z (\u0061-\u007a), and, for historical reasons, the ASCII dollar sign ($, or \u0024) and underscore (_, or \u005f). The dollar sign should be used only in mechanically generated source code or, rarely, to access pre-existing names on legacy systems. The underscore may be used in identifiers formed of two or more characters, but it cannot be used as a one-character identifier due to being a keyword.

Andrew Tobilko
2

\u0061 means a. You can use \u0061 instead of a, therefore:

char \u0061 = 'a';

is the same as

char a = 'a';

and

ch\u0061r a = 'a';

is the same as

char a = 'a';
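
Putting the three valid options together, something like the following should compile and print a three times (the class name UnicodeEscapeDemo is made up for illustration; each declaration gets its own block because all three end up declaring the same variable a):

public class UnicodeEscapeDemo {
    public static void main(String[] args) {
        { char a = '\u0061'; System.out.println(a); }  // option (a)
        { char \u0061 = 'a'; System.out.println(a); }  // option (c): \u0061 is the identifier a
        { ch\u0061r a = 'a'; System.out.println(a); }  // option (d): ch\u0061r is the keyword char
    }
}
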
Lajos Arpad