Typecasting from int to char and ASCII values

Question

int a1 = 65535;

char ch2 = (char) a1;

System.out.println("ASCII value corresponding to 65535 after being typecasted : "+ch2);// prints?
char ch3 = 65535;
System.out.println("ASCII value corresponding to 65535 : "+ch3);// again prints?

I quote from Herbert Schildt, Chapter 3: Data Types, Variables, and Arrays:

The range of a char is 0 to 65535. There are no negative chars. The standard set of characters known as ASCII still ranges from 0 to 127 as always, and the extended 8-bit character set, ISO-Latin-1, ranges from 0 to 255. Since Java is designed to allow programs to be written for worldwide use, it makes sense that it would use Unicode to represent characters. An integer can also be assigned to a char as long as it is within range.
// char ch33 = 65536; // compilation error, of course, since it is out of char range (0 - 65535)

int a11 = 65536;
char ch22 = (char) a11;
System.out.println("ASCII value corresponding to 65536 after being typecasted : " + ch22);
// non-printing character (appears as a small square in the Eclipse console)

The question is: why is there no compilation error for the line `char ch22 = (char) a11`, even though `char ch33 = 65536` does not compile? Also, why is the output different from the case where `int a1 = 65535` was used?

By casting the `int` to `char`, you are telling the compiler that you promise you know that the value will be in range – or just don't care. Typecasts bypass the compiler's type checks and are thus best avoided if not needed. — 5gon12eder, Sep 07 '14 at 10:51
@OliCharlesworth - your comment is wrong. char ch22 = (char) 65536; would not give a compilation error. — BatScream, Sep 07 '14 at 11:02
@ShirgillAnsari - This type of question, where you need to know how the compiler evaluates expressions based on the kinds of expressions involved, has been dealt with well here. Have a look and you should find your explanation: http://stackoverflow.com/questions/21317631/java-char-int-conversions — BatScream, Sep 07 '14 at 11:32

score 2 · Accepted Answer · answered Sep 07 '14 at 11:41

Okay, you have a couple of quite distinct questions there.

The first question, I think, is:

Why do you see ? when you output `ch2` and `ch3`

Because you're outputting a value that has no character assigned to it. Java chars represent UTF-16 code units, not actual characters. Some Unicode characters, in UTF-16, require two Java chars for storage. More about UTF-16 in the Unicode FAQ. The value 0xFFFF (which is what your ch2 and ch3 contain) is the code point U+FFFF, which Unicode permanently reserves as a noncharacter, so there is no U+FFFF character to display.
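To make the point about storage concrete, here is a minimal sketch that counts how many Java chars one Unicode character needs. The class name `CodeUnitDemo` and the helper `codeUnitsFor` are made up for illustration; only the `Character` and `String` calls are standard JDK API.

```java
class CodeUnitDemo {
    // Number of UTF-16 code units (Java chars) needed to store one code point.
    static int codeUnitsFor(int codePoint) {
        return Character.toChars(codePoint).length;
    }

    public static void main(String[] args) {
        System.out.println(codeUnitsFor(0x41));     // 1: 'A' fits in a single char
        System.out.println(codeUnitsFor(0x1F600));  // 2: this emoji needs a surrogate pair

        String s = new String(Character.toChars(0x1F600));
        System.out.println(s.length());                       // 2: length() counts chars
        System.out.println(s.codePointCount(0, s.length()));  // 1: one actual character
    }
}
```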

Re the output of ch22: The reason you're seeing a little box is that you're outputting character 0 (the result of (char)65536 is 0, see below), which is a "control character" (all the characters below 32 — the normal space character — are various control characters). Character 0 is the "null" character, for which there's no generally-accepted glyph that I'm aware of.
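If you want to confirm what `ch22` actually holds rather than guess from the console glyph, something along these lines should do. This is only a sketch; `NullCharDemo` and `narrow` are illustrative names, not anything from the question.

```java
class NullCharDemo {
    // Same cast as in the question, wrapped in a method for reuse.
    static char narrow(int value) {
        return (char) value;
    }

    public static void main(String[] args) {
        char ch22 = narrow(65536);
        System.out.println((int) ch22);                    // 0: the numeric value of the char
        System.out.println(Character.isISOControl(ch22));  // true: U+0000 is a control character
        System.out.println(Character.isISOControl(' '));   // false: space (32) is not
    }
}
```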

Why no error when doing `int a11 = 65536; char ch22 = (char) a11;`?

Because that's how Java's narrowing primitive conversions are defined. No error is thrown; instead, only the relevant bits are used:

A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
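As a rough illustration of "discards all but the n lowest order bits" for `char` (n = 16), the sketch below compares the cast with an explicit mask. `NarrowingDemo`, `castToChar` and `maskLow16` are names I made up for the example.

```java
class NarrowingDemo {
    // The cast from the question; the char is widened back to int for printing.
    static int castToChar(int value) {
        return (char) value;
    }

    // Keeps only the 16 lowest-order bits, which is what the spec describes.
    static int maskLow16(int value) {
        return value & 0xFFFF;
    }

    public static void main(String[] args) {
        int[] samples = {65535, 65536, 65537, -1, 0x12345};
        for (int v : samples) {
            // Both columns print the same number for every sample.
            System.out.println(v + " -> " + castToChar(v) + " / " + maskLow16(v));
        }
    }
}
```

With these inputs the results are 65535, 0, 1, 65535 and 9029 (0x2345) respectively.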

T.J. Crowder

Casting an int value to a char has effectively the result `intval modulo 65536`. – MC Emperor Sep 07 '14 at 12:02
@MCEmperor: Yeah, pretty much, depending on which definition of modulo you want to apply. :-) I think the spec's language about "disregarding" the high-order bits is clearest, though, avoiding any ambiguity around negative numbers. – T.J. Crowder Sep 07 '14 at 12:16
@TJCrowder 'Which' definition of modulo? Are there multiple definitions of the modulo operator out there? – MC Emperor Sep 07 '14 at 13:36
@MCEmperor: The *operator*? No. (Technically Java has no modulo operator; it has a [*remainder* operator](http://docs.oracle.com/javase/specs/jls/se8/html/jls-15.html#jls-15.17.3).) The *operation*? [Yes](http://en.wikipedia.org/wiki/Modulo_operation). :-) – T.J. Crowder Sep 07 '14 at 13:50
@TJCrowder Ah, let me guess: *"When either a or n is negative, the naive definition breaks down and programming languages differ in how these values are defined."* – MC Emperor Sep 07 '14 at 14:05
@MCEmperor: Right -- that's why I referred to ambiguity re negative numbers. :-) I don't think it's just programming languages, I think there's a difference with the math concept. But I'm not much of a math guy, so... – T.J. Crowder Sep 07 '14 at 14:10
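
For anyone following the modulo discussion in the comments above: a small sketch of where the two definitions part ways for negative input. Java's `%` is the remainder operator, while `Math.floorMod` (Java 8+) gives the floored modulo. The class and method names here are illustrative only.

```java
class ModuloDemo {
    static int remainder(int v) { return v % 65536; }               // sign follows the dividend
    static int floored(int v)   { return Math.floorMod(v, 65536); } // result is always 0..65535
    static int cast(int v)      { return (char) v; }                // keeps the low 16 bits

    public static void main(String[] args) {
        System.out.println(remainder(65537) + " " + floored(65537) + " " + cast(65537)); // 1 1 1
        System.out.println(remainder(-1) + " " + floored(-1) + " " + cast(-1));          // -1 65535 65535
    }
}
```

So the cast agrees with the floored definition, not with `%`, once negative values are involved.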

Volune · Answer 2 · 2014-09-07T11:33:32.240

About why `char ch22 = (char) a11` works

From the Java specification:

A narrowing primitive conversion may lose information about the overall magnitude of a numeric value and may also lose precision and range.

[...]

A narrowing conversion of a signed integer to an integral type T simply discards all but the n lowest order bits, where n is the number of bits used to represent type T. In addition to a possible loss of information about the magnitude of the numeric value, this may cause the sign of the resulting value to differ from the sign of the input value.
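
The last sentence of that quote (the sign may change) does not show up with `char`, which is unsigned, but it does with `byte` and `short`. A minimal sketch, with made-up names `SignFlipDemo`, `toByte` and `toShort`:

```java
class SignFlipDemo {
    static byte toByte(int value)   { return (byte) value; }   // keeps the low 8 bits
    static short toShort(int value) { return (short) value; }  // keeps the low 16 bits

    public static void main(String[] args) {
        System.out.println(toByte(200));     // -56: bit 7 is set, so the byte is negative
        System.out.println(toByte(128));     // -128
        System.out.println(toShort(65535));  // -1: all 16 bits set
        System.out.println((int) (char) 65535); // 65535: char has no sign bit
    }
}
```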

About why `char c = 65536` doesn't work

From the Java specification:

A narrowing primitive conversion followed by a boxing conversion may be used if the type of the variable is:

Byte and the value of the constant expression is representable in the type byte.

Short and the value of the constant expression is representable in the type short.

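Character and the value of the constant expression is representable in the type char.

(That passage covers the boxed types. The same section of the specification has the equivalent rule for primitive targets: a narrowing primitive conversion may be used if the type of the variable is byte, short, or char, and the value of the constant expression is representable in the type of the variable.)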

65536 is not inherently a char value

For example

1 is at the same time a byte, a short, a char, an int and a long value.
256 is a short, char, int and long value, but not a byte value.
65535 is a char, int and long value, but neither byte nor short value.
-1 is a byte, short, int, long value, but not a char value.
65536 is only an int and long value.

`char c = (char)65536;` will work.
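
Putting the cases side by side, as a sketch only: the lines that would be rejected by the compiler are left commented out so the class still compiles. `ConstantAssignDemo` and its members are hypothetical names for the example.

```java
class ConstantAssignDemo {
    static final int FITS = 65535;  // compile-time constant whose value is representable as a char

    static char fromConstant() {
        char c = FITS;           // allowed: constant expression that fits in char
        return c;
    }

    static char fromVariable(int value) {
        // char c = value;       // would not compile: value is not a constant expression
        return (char) value;     // an explicit cast is required
    }

    static char fromOutOfRangeConstant() {
        // char c = 65536;       // would not compile: the constant does not fit in char
        return (char) 65536;     // the cast compiles; only the low 16 bits are kept
    }

    public static void main(String[] args) {
        System.out.println((int) fromConstant());            // 65535
        System.out.println(fromVariable(66));                // B
        System.out.println((int) fromOutOfRangeConstant());  // 0
    }
}
```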

Per http://docs.oracle.com/javase/specs/jls/se8/html/jls-3.html#jls-3.10.1, all of these literals are of type `int`. — Oliver Charlesworth, Sep 07 '14 at 11:04
Of course I know these ranges. That doesn't answer my question. `char c = (char)65536;` works, and I am asking why. — Farhan stands with Palestine, Sep 07 '14 at 11:06
@ShirgillAnsari because the java specification says so. I updated my answer with a quote of the relevant part of the specification. — Volune, Sep 07 '14 at 11:17
@ShirgillAnsari Your question is anything but clear. Answer updated with _why char ch22 = (char) a11 works_ — Volune, Sep 07 '14 at 11:34

score 0 · Answer 3 · answered Sep 07 '14 at 20:35

Java's char type holds a Unicode UTF-16 code unit, one or two of which encode a code point. Not every 16-bit value stands for a character by itself: surrogate code units are only meaningful in pairs. And, since you want to deal with char instead of String, you'll want to restrict the values to code points encoded with only one code unit.

65535 (U+FFFF) is a single code unit, but Unicode reserves it as a noncharacter, so it is never assigned to a character.
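
One way to probe how the JDK's `Character` class classifies 65535 is sketched below. Note that `Character.isValidCodePoint` is only a range check (0 to 0x10FFFF), so it says nothing about whether a character is assigned; `Character.isDefined` is the call that reports that. `NoncharacterDemo` is an illustrative name.

```java
class NoncharacterDemo {
    public static void main(String[] args) {
        char ch = (char) 65535;
        // Range check only: is the number between 0 and 0x10FFFF?
        System.out.println(Character.isValidCodePoint(ch));
        // Is it one half of a surrogate pair?
        System.out.println(Character.isSurrogate(ch));
        // Does Unicode assign a character to this value?
        System.out.println(Character.isDefined(ch));
        // For comparison, an ordinary letter.
        System.out.println(Character.isDefined('A'));
    }
}
```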

As to your question, why you don't get an exception, I can only compare with other integer-like operations where you don't get an exception for overflows and similar exceptional outcomes. Languages vary in their design compromises.
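
For comparison, a sketch of the silent-overflow behaviour next to an opt-in checked version. `Math.addExact` is standard JDK API (Java 8+); `toCharExact` is a hypothetical helper written for this example, not something the JDK provides.

```java
class CheckedCastDemo {
    // Hypothetical helper: like (char) value, but rejects out-of-range input
    // instead of silently dropping the high-order bits.
    static char toCharExact(int value) {
        if (value < Character.MIN_VALUE || value > Character.MAX_VALUE) {
            throw new ArithmeticException("char overflow: " + value);
        }
        return (char) value;
    }

    public static void main(String[] args) {
        System.out.println(Integer.MAX_VALUE + 1);  // wraps silently to Integer.MIN_VALUE

        try {
            Math.addExact(Integer.MAX_VALUE, 1);    // the checked variant throws instead
        } catch (ArithmeticException e) {
            System.out.println("addExact threw: " + e.getMessage());
        }

        System.out.println(toCharExact(65));        // A
        try {
            toCharExact(65536);
        } catch (ArithmeticException e) {
            System.out.println("toCharExact threw: " + e.getMessage());
        }
    }
}
```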

I'll submit, if you are doing the right thing—the right way—with char or Character or String, you won't run into problems like this. Forget about "ASCII still ranges from 0 to 127 as always, and the extended 8-bit character set, ISO-Latin-1." Java uses Unicode; Embrace it.
