16

My Custom class that will be contained by HashSet

public class Person {
    String name;
    int age;

    public Person(String name, int age) {
        this.name = name;
        this.age = age;
    }

    @Override
    public String toString() {
        return "Person{" +
                "hashcode='" + this.hashCode() + '\'' +
                "name='" + name + '\'' +
                ", age=" + age +
                '}';
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Person)) return false;

        Person person = (Person) o;

        if (age != person.age) return false;
        if (!name.equals(person.name)) return false;

        return true;
    }

    @Override
    public int hashCode() {
        int result = name.hashCode();
        result = 31 * result + age;
        return result;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    public int getAge() {
        return age;
    }

    public void setAge(int age) {
        this.age = age;
    }
}

My HashSet test that fails

   public void hashSetTest() {
        Set<Person>  personSet = new HashSet<Person>();
        Person p1 = new Person("raghu", 12);
        Person p2 = new Person("rimmu", 21);

        personSet.add(p1);
        personSet.add(p2);


       p1.setName("raghus");
       p1.setAge(13);

       int i2 =p1.hashCode();
       System.out.println(personSet.size() + ": "+ p1.hashCode()+" : "+personSet.contains(p1)+ " : "+i2);
    }

Iam expecting personSet.contains(p1) to pass. Why is it returning false? Thanks sri

Fred Foo
  • 355,277
  • 75
  • 744
  • 836
sridhar
  • 169
  • 1
  • 1
  • 3

4 Answers4

31

Because p1.hashCode() changes when you modify p1, so it can't be found at its original index in the hash table anymore. Never let a hash value depend on a mutable field.

(You're quite lucky that it fails during testing; it might just as well have succeeded, only to fail in production.)

Fred Foo
  • 355,277
  • 75
  • 744
  • 836
  • 5
    Well "never" is a bit harsh because there are situations you'll need it. Just remember that you've got to remove such an object, modify it and add anew to the set. But obviously it's preferred to just use immutable fields if possible. – Voo Feb 24 '11 at 21:02
  • 3
    @Voo: this may be a matter of style and preference, but I'd still say **never**. Hash functions should be defined on the basis of the immutable parts of an object. (I'd make a lightweight object such as this completely immutable, but then I don't like setters at all.) – Fred Foo Feb 24 '11 at 21:06
  • 1
    It surely is a pitfall you've got to be aware of, but if you need efficient access (~O(1)) to mutable objects there's not much you can do. I mean you could make the objects immuteable and create a new one and add that one and delete the old one but then is that any better than the other approach? – Voo Feb 24 '11 at 21:10
  • 3
    You should mark his answer as correct (the green flag besides the up/downvote). – Voo Feb 24 '11 at 21:15
  • 1
    Thanks for quick responses. Are you guys saying that oringinal p1 object with HashCode xxxx at location say '1' in HashTable is still indexed at the same location '1' in HashTable with a different HashCode yyyy after going through a modification? That is why contains method is actually not finding the object p1 in its set because now HashCode yyyy is showing a different index? --sri – sridhar Feb 24 '11 at 21:21
  • @Voo: There's often an immutable part in an otherwise mutable object that can serve as the base of its `hashCode()`. If not, it's not hashable. Harsh, maybe, but I prefer to call it "preventing hard-to-find bugs" :) – Fred Foo Feb 24 '11 at 21:30
  • @sri: How would the hashtable know that the `hashCode()` value has changed? Who would notify it? – Fred Foo Feb 24 '11 at 21:31
  • @larsmans: I agree that avoiding it is usually an extremely good idea, but then that's the usual tradeof between performance and maintainability/readability. I'd still use it in cases where the extra performance is necessary (after benchmarking other solutions obviously) but surely not without a good reason (and limiting the complexity with some clever abstraction so I call setters of those methods only in one specific place). – Voo Feb 24 '11 at 21:50
5

HashSet implements Set. The ApiDoc specifies:

Note: Great care must be exercised if mutable objects are used as set elements. The behavior of a set is not specified if the value of an object is changed in a manner that affects equals comparisons while the object is an element in the set.

In your example this is the case, because changing name or age on p1 affects equal-comparison. So according the ApiDoc, the behavior of Set in your case is unspecified.

Michael Konietzka
  • 5,419
  • 2
  • 28
  • 29
1

Hashes are simple pairing of key and values. Here's how the state of your code would look before and after the renaming in pseudo-code:

Before:

personSet => {
    SOME_NUM1 => Person(name=>"raghu", 12),
    SOME_NUM2 => Person(name=>"rimmu", 21)
}

p1.setName("raghus"); #p1.hashcode() = SOME_NEW_NUM
p1.setAge(13);#p1.hashcode() = SOME_OTHER_NEW_NUM

After:

personSet => {
    SOME_NUM1 => Person(name=>"raghu", 13),
    SOME_NUM2 => Person(name=>"rimmu", 21)
}

Since you have direct access to the p1 the object within the HashSet is updated correctly, but HashSet does not pay attention to contained objects hashcodes being updated. When a call to personSet.contains(p1) is called, the HashSet looks for an entry with the new value of p1.hashcode().

The p1 object is associated with its previous hashcode at the time when it was added to the HashSet.

Zsw
  • 3,920
  • 4
  • 29
  • 43
trisrael
  • 11
  • 1
1

I think you need to have hashCode depends on mutable fields quite often indeed: when you override equals that depends on mutable fields.

From hashCode's contract: "If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result."

So, if you create two objects such that A.equals(B) is true, and then modify A such a way that you get A.equals(B) became false, you need to have hashCodes change too.

It's true that in hashCode's documentation is stated that "It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results.", but I don't know how this can help.

think01
  • 681
  • 1
  • 10
  • 22