4

I have a class that will be used in a HashSet. It only contains two members, and both are of the same type interface. This is what it looks like:

class MyClass{
    MyInterface a;
    MyInterface b;

    public int hashCode(){
         return a.hashCode() + b.hashCode();
    }

    public boolean equals(Object obj){
         if(!(obj instanceof MyClass)
              return false;
         MyClass other (MyClass) obj;
         return (this.a == other.a && this.b == other.b) || (this.a == other.b && this.b == other.a);
    }
}

As you can see, two instances of MyClass are "equal" if they contain the same two instances of MyInterface.

Now, I was thinking that for hashCode(), I could just add up the default hashcodes of its members. Is this good enough? If not, what is a proper implementation of hashCode() for this case?

AxiomaticNexus
  • 6,190
  • 3
  • 41
  • 61
  • how is a and b's hashcode defined ? – jmj Jun 11 '15 at 19:19
  • 1
    This will provide an unequal hashcode distribution: elements around `Integer.MaxValue / 2` will be disproportionally represented compared to those at the extremes, leading to more hash collisions. – Jeroen Vannevel Jun 11 '15 at 19:19
  • Xor might be a better option than addition. Shouldn't have the uneven distribution problem that @JeroenVannevel mentioned. – lyngvi Jun 11 '15 at 19:22
  • @JeroenVannevel: I don't believe that's true. If the hash codes of the elements are uniformly distributed, then for overflow reasons the hash codes of the sums of the hash codes will be uniformly distributed. – Louis Wasserman Jun 11 '15 at 19:22
  • 2
    You could also use: `return Objects.hash(a, b);` – assylias Jun 11 '15 at 19:24
  • 2
    @assylias no, that would give two different hashcodes for equal objects, because [a, b] is equal to [b, a] – JB Nizet Jun 11 '15 at 19:26
  • 2
    @assylias the OP appears to require something order-agnostic which `Objects.hash` is not – Louis Wasserman Jun 11 '15 at 19:26
  • @LouisWasserman: good point, I disregarded overflowing. – Jeroen Vannevel Jun 11 '15 at 19:27
  • I believe what you're using is also the default `hashCode()` function supplied by certain IDE's such as eclipse. (Suggesting that while you could probably do better with a specially designed hash function, this one is a decent lazy/backup option.) – River Jun 11 '15 at 19:28
  • See http://stackoverflow.com/a/3613764/99248 and http://stackoverflow.com/a/3613423/99248 – raisercostin Jun 11 '15 at 19:38

3 Answers3

4

Yes, this is perfectly fine. It's equivalent to the implementation of Set.hashCode() for two-element sets.

Louis Wasserman
  • 191,574
  • 25
  • 345
  • 413
  • @River: Like I commented elsewhere, assuming `a.hashCode()` and `b.hashCode()` have proper hash code implementations that offer proper value distribution to begin with, I really don't see how the extra calculation will help reduce the likelihood of collisions. OP's implementation is fine. – sstan Jun 11 '15 at 19:46
  • That being said, you could almost certainly do better (less-collisions) with a finer-tuned hash function. The real question is whether it's worth your time to optimize this function rather than simply stick with one that is "good enough". – River Jun 11 '15 at 19:50
1

I would say no. Wouldn't this mean that these two instances of MyClass would hash to the same value:

MyClass {
  a.hashCode = 2;
  b.hashCode = 3;
}

and

MyClass {
  a.hashCode = 1;
  b.hashCode = 4;
}
River
  • 8,585
  • 14
  • 54
  • 67
Dan Forbes
  • 2,734
  • 3
  • 30
  • 60
  • 3
    So you have a hash collision. So what? Do you have a better suggestion? – Louis Wasserman Jun 11 '15 at 19:23
  • 1
    This is bound to happen... Perfect hash functions are near impossible – River Jun 11 '15 at 19:23
  • Maybe my example is overly simplistic, but it seems like a more complete implementation could minimize the number of collision. Do I have a better suggestion? No. – Dan Forbes Jun 11 '15 at 19:25
  • 3
    You could do a combination of + and *, which are both associative and commutative (which are required): `(a.hashCode() + b.hashCode()) * 31 + (a.hashCode() * b.hashCode())`. – Clashsoft Jun 11 '15 at 19:26
  • 2
    @Clashsoft: Assuming `a.hashCode()` and `b.hashCode()` have proper hashcode implementations that offer proper value distribution to begin with, I don't see any compelling reason to complicate the calculation. – sstan Jun 11 '15 at 19:40
  • I know, I wouldn't do that either if the hashCode of the two objects is reasonably well-distributed. This was just a response to the case represented by @DanForbes. – Clashsoft Jun 11 '15 at 19:42
0

Yes it is.

Because even with hashCode collision, the docs state:

It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results.

Luca Fagioli
  • 12,722
  • 5
  • 59
  • 57