11

Is it safe to use ==/!= while comparing Character?

Since Character is a boxed type, is it safe to use ==/!= when comparing Character values?

public static void main(String[] args) {

    Character c1 = 'd';
    Character c2 = (char) getInt();

    System.out.println(c1 == c2);
}

public static int getInt() {

    return 100;
}

The above works as expected (prints true). However, are there cases where comparing Character objects holding the same value using == would yield false? (Hence, do we have to use .equals() when comparing boxed primitive types?)

Mark Rotteveel
user7858768
    Trivially `new Character('a') == new Character('a')` is false. Yes you should use `equals` when comparing reference types. – Sweeper Oct 08 '21 at 17:19
    Does this answer your question? [Why is 128==128 false but 127==127 is true when comparing Integer wrappers in Java?](https://stackoverflow.com/questions/1700081/why-is-128-128-false-but-127-127-is-true-when-comparing-integer-wrappers-in-ja) – Progman Oct 08 '21 at 17:20
    It is never a good idea to use `==` with objects. – Mark Rotteveel Oct 08 '21 at 17:21

2 Answers

15

No, it's not safe. You must use equals().

Demonstration:

System.out.println(Character.valueOf('Ü') == Character.valueOf('Ü'));
// -> false

Note that if you use autoboxing or Character.valueOf(), then some characters (ASCII characters) are cached and the same Character instance is reused, so == may return true for the same value:

System.out.println(Character.valueOf('A') == Character.valueOf('A'));
// -> true (on my machine)

But it doesn't work for all characters, and it won't work if you call the deprecated new Character(...) explicitly.
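To make the contrast concrete, here is a small self-contained sketch (the class name is mine) showing == against equals() for both a cached and an uncached character. Note that the false result for the uncached pair is what typical JDKs do; the spec only guarantees caching for '\u0000' through '\u007F'.

```java
// Sketch: == is identity, equals() is value comparison.
public class CharacterEqualsDemo {
    public static void main(String[] args) {
        Character cached1 = Character.valueOf('A');   // within the guaranteed cache range
        Character cached2 = Character.valueOf('A');
        Character uncached1 = Character.valueOf('Ü'); // U+00DC, outside the cache range
        Character uncached2 = Character.valueOf('Ü');

        System.out.println(cached1 == cached2);          // true — same cached instance
        System.out.println(uncached1 == uncached2);      // false on typical JDKs — distinct instances
        System.out.println(cached1.equals(cached2));     // true — compares the char value
        System.out.println(uncached1.equals(uncached2)); // true — equals() is always safe
    }
}
```

So equals() gives the right answer in every case, while == only happens to work inside the cached range.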

Alex Shesterov
  • Quote from `java.lang.Character#valueOf(char)` comment: "This method will always cache values in the range `\u0000` to `\u007F`, inclusive, and may cache other values outside of this range". The code itself, however, does not include any "other values". – Vasily Liaskovsky Oct 08 '21 at 17:26
1

tl;dr

Use code points, not char/Character.

"d".codePointAt( 0 ) == 100  // true.

Details

The Answer by Alex Shesterov is correct. But bigger picture, you should not be using Character objects.

Character is broken

The Character class is a wrapper class for the primitive type char. The char/Character type is legacy as of Java 2, and is essentially broken. As a 16-bit value, it is physically incapable of representing most characters.

For example, the following line will not even compile, because an emoji (here 😷, an illustrative choice) lies beyond the 16-bit range and cannot fit in a single char literal:

 System.out.println( Character.valueOf( '😷' ) ) ;
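A related sketch (my example, not part of the original answer) of the same limitation: a single emoji occupies two char units (a surrogate pair), so char-based counting goes wrong while code-point methods get it right.

```java
// Sketch: char counts vs. code-point counts for a character outside the BMP.
public class SurrogateDemo {
    public static void main(String[] args) {
        String thumbsUp = "\uD83D\uDC4D"; // 👍 THUMBS UP SIGN, U+1F44D, as a surrogate pair

        System.out.println(thumbsUp.length());                             // 2 — two char units
        System.out.println(thumbsUp.codePointCount(0, thumbsUp.length())); // 1 — one actual character
        System.out.println(thumbsUp.codePointAt(0));                       // 128077 — the real code point, 0x1F44D
    }
}
```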

Code points

Instead, when working with individual characters, use code point integer numbers. In Java that means using the int/Integer type.

If you look around classes such as String, StringBuilder, and Character you will find codePoint methods.

Let's revise your code snippet. We will change the names to be more descriptive. We switch out Character and char usage for mere int primitive integers. As such, we can compare our int values using == or !=.

package work.basil.text;

public class App7
{
    public static void main ( String[] args )
    {
        int codePointOf_LATIN_SMALL_LETTER_D = "d".codePointAt( 0 ); // Annoying zero-based index counting, not ordinal.
        int codePoint2 = getInt();

        boolean sameCharacter = ( codePointOf_LATIN_SMALL_LETTER_D == codePoint2 );  // Comparing `int` primitives with double-equals. 
        System.out.println( sameCharacter );
    }

    public static int getInt ()
    {
        return 100;  // Code point 100 is LATIN SMALL LETTER D, `d`. 
    }
}

When run:

true

Of course, if you use auto-boxing or otherwise mix the wrapper class Integer with the primitive int, then the same explanation in that other Answer applies here too.
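To sketch that caveat (this snippet is mine): the JLS guarantees that autoboxing caches Integer values from -128 to 127, so == happens to work there but fails just outside, on default JVM settings (the upper bound is tunable via -XX:AutoBoxCacheMax).

```java
// Sketch: the Integer cache makes == unreliable for boxed values.
public class IntegerCacheDemo {
    public static void main(String[] args) {
        Integer small1 = 127, small2 = 127; // autoboxing calls Integer.valueOf
        Integer big1 = 128, big2 = 128;

        System.out.println(small1 == small2);  // true — within the guaranteed cache range
        System.out.println(big1 == big2);      // false on default JVM settings — distinct instances
        System.out.println(big1.equals(big2)); // true — equals() always compares the value
    }
}
```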

Basil Bourque
  • There is nothing broken about `char` and `Character`. The legacy of Java 2 is that they are 16-bit. If Java were to be designed today, they would be 8-bit as in Go. In that case Java Strings would be UTF-8 encoded instead of UTF-16 encoded. UTF-8 encoding is more compact than UTF-16. – Alexey Veleshko Dec 19 '21 at 10:23
  • The concern for memory efficiency even made Java developers devise an alternative internal representation for the String class. Traditionally it was always UTF-16 but now String will try to use an 8-bit Latin-1 encoding if possible. – Alexey Veleshko Dec 19 '21 at 10:26