0

"0123".charAt(0) - '0' returns an integer. What is this integer (and why is it this integer), and in general why does char - char = int?

Cornelius

3 Answers

4

why is it this integer

Because the compiler can't guarantee that the result of the operation is a valid character value. In Java, char and byte values widen to int implicitly, and int is the only one of the three types whose range is a superset of the other two, so arithmetic on them is carried out in int and yields an int.

@wero illustrates this in a comment on the question above:

what character is '0' - '1'?

That is, if you subtract a higher character from a lower one, you would get a negative character, which isn't valid. A negative integer, however, is valid.
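
A minimal sketch of that point. The class and method names are made up for illustration. It relies only on Java's rule that char operands are promoted to int before an arithmetic operator is applied.

```java
final class CharDifference {
    // Both char operands are promoted to int, so the result type is int.
    static int difference(char a, char b) {
        return a - b;
    }

    public static void main(String[] args) {
        System.out.println(difference('0', '1')); // -1, not representable as a char
        System.out.println(difference('9', '0')); // 9
        // char c = '0' - '1'; // would not compile: -1 is outside the char range
    }
}
```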

Or consider a byte: two values that each fit in a byte can have a sum that does not. With unsigned bytes, 250 + 250 = 500 needs more than eight bits. With Java's signed byte (-128 to 127), 100 + 100 = 200 already overflows. So an integer is needed to contain the result.
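
The same effect can be sketched for bytes. Java's byte is signed (-128 to 127), so this example uses 100 + 100 rather than 250 + 250. The class and method names are hypothetical.

```java
final class ByteSum {
    // The byte operands are promoted to int, so the sum is computed as an int.
    static int sum(byte a, byte b) {
        return a + b;
    }

    // An explicit cast keeps only the low 8 bits of the int result.
    static byte narrowed(byte a, byte b) {
        return (byte) (a + b);
    }

    public static void main(String[] args) {
        byte x = 100;
        byte y = 100;
        System.out.println(sum(x, y));      // 200, which fits in an int
        System.out.println(narrowed(x, y)); // -56, the sum wrapped around in 8 bits
    }
}
```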

For constant expressions, where the compiler can determine the value being produced, a compiler can be smart enough to do better, and Java's is: the expression still has type int, but it is folded to a compile-time constant and may be assigned to a char without a cast as long as the value fits. For example:

'1' - '0'

However, it wouldn't surprise me if even such a compiler fell back to an integer for something like .charAt(), since that is no longer a constant expression. We, as humans, can intuitively infer the constant result of that expression, but a compiler needs strict and simple rules, and introducing a method invocation would complicate those rules considerably.
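
For Java specifically, the distinction can be sketched as follows. The names are hypothetical. The sketch relies on Java's assignment rule that a constant int expression may be narrowed to char implicitly when its value fits, while a non-constant expression needs an explicit cast.

```java
final class ConstantNarrowing {
    static char constantDifference() {
        // Compiles without a cast: '1' - '0' is a constant expression
        // whose value (1) is representable as a char.
        char c = '1' - '0';
        return c;
    }

    static char runtimeDifference(String s) {
        // char c = s.charAt(0) - '0'; // does not compile: int to char may lose data
        return (char) (s.charAt(0) - '0'); // an explicit cast is required here
    }

    public static void main(String[] args) {
        System.out.println((int) constantDifference());      // 1
        System.out.println((int) runtimeDifference("0123")); // 0
    }
}
```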

David
1

Because chars are encoded as integers: each char is stored as a numeric code. See the ASCII table for the first 128 of them.

For example:

int code = 99;
char c = 'c';
int cc = (int) c;            // cc == 99, the code of 'c'
char fromCode = (char) code; // fromCode == 'c', the char with code 99

Here is what "0123".charAt(0) - '0' does: charAt(index) takes the char at the given index of the string "0123", in this case the first one, '0'. The ASCII table gives '0' the code 48, so the subtraction is 48 - 48 = 0.
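
The arithmetic can be checked with a small sketch. The helper name digitAt is made up for illustration, and the result is only meaningful when the character is one of the ASCII digits '0' to '9'.

```java
final class DigitValue {
    // Subtracting the code of '0' (48) from the code of an ASCII digit
    // leaves the digit's numeric value.
    static int digitAt(String s, int index) {
        return s.charAt(index) - '0';
    }

    public static void main(String[] args) {
        String s = "0123";
        System.out.println((int) s.charAt(0)); // 48, the code of '0'
        for (int i = 0; i < s.length(); i++) {
            System.out.println(digitAt(s, i)); // 0, 1, 2, 3 on separate lines
        }
    }
}
```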

This is possible because Java is very much based on C, where there is no String type (in Java, String is an object) and "strings" are char arrays. In C, char is an integer type, so ints and chars are interchangeable within the char range.

Dimitar
1

Java char holds one Unicode/UTF-16 code unit. Many Unicode codepoints need only one code unit in their UTF-16 encoding and these code units are in broad ranges. Therefore, within those ranges, a char value can be considered a codepoint.
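
As a rough illustration of code units versus codepoints, the sketch below compares a Basic Latin letter with U+1F600, which was picked here as an example of a codepoint outside the single-code-unit ranges. The class and method names are hypothetical.

```java
final class CodeUnits {
    // Number of UTF-16 code units, that is, the number of char values.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode codepoints encoded by those code units.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String letter = "A";            // U+0041, one code unit
        String face = "\uD83D\uDE00";   // U+1F600, a surrogate pair
        System.out.println(codeUnits(letter) + " " + codePoints(letter)); // 1 1
        System.out.println(codeUnits(face) + " " + codePoints(face));     // 2 1
    }
}
```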

Subtraction between char values gives the "distance" between them, which is mostly useful when dealing with the few "well-organized" sequences of codepoints, such as the Basic Latin letters (A-Z or a-z). It's not generally useful to know the "distance" between a quarter note and a half note, or between a smiling face and a winking face.

The result is an integer because, well, distance on a list of consecutive integers is measured by integers.

Again, this technique is only valid under very strict circumstances. There is always a less limited way to do it.
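
One candidate for such a less limited way is Character.digit from java.lang.Character. Choosing it here is an assumption, since the answer does not name a specific method. The sketch compares it with plain subtraction; the class and method names are hypothetical.

```java
final class SaferDigits {
    // Only meaningful for the ASCII digits '0' to '9'.
    static int viaSubtraction(char c) {
        return c - '0';
    }

    // Returns the digit's value in base 10, or -1 if c is not a digit.
    static int viaCharacterDigit(char c) {
        return Character.digit(c, 10);
    }

    public static void main(String[] args) {
        System.out.println(viaSubtraction('7'));         // 7
        System.out.println(viaCharacterDigit('7'));      // 7
        System.out.println(viaSubtraction('x'));         // 72, a meaningless "digit"
        System.out.println(viaCharacterDigit('x'));      // -1, flagged as not a digit
        System.out.println(viaSubtraction('\u0663'));    // 1587 for ARABIC-INDIC DIGIT THREE
        System.out.println(viaCharacterDigit('\u0663')); // 3
    }
}
```

The subtraction silently produces a number for any input, while Character.digit signals non-digits with -1 and handles digits outside the ASCII range.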

See the Unicode Code Charts.

Also see the documentation for the java.lang.Character class.

Tom Blodget