7

Let's say we have a hashCode() method, and we use it inside our equals() method to determine whether two objects are equal. Is this an allowed/accepted approach?

Assume that we use a simple implementation of a hash code. (For example a few instance variables multiplied by prime numbers.)

Daniel Compton
k4kuz0
  • Is comparing hashCode the only thing you plan to do in there? Or just one of several steps? – Thilo Dec 28 '15 at 08:17
  • I am not planning to use it, I just wanted to throw the idea out there since my instructor at uni simply said "don't do it" but couldn't explain why. – k4kuz0 Dec 28 '15 at 10:02

4 Answers

9

This is a terrible way to check for equality, mainly because objects don't have to be equal to return the same hash code.
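A minimal sketch of such a collision, using only the standard String class (the class name HashCollisionDemo is made up for illustration): the strings "Aa" and "BB" are clearly different, yet String.hashCode() gives both the value 2112.

```java
final class HashCollisionDemo {
    public static void main(String[] args) {
        String a = "Aa";
        String b = "BB";
        // 'A' * 31 + 'a' = 65 * 31 + 97 = 2112
        // 'B' * 31 + 'B' = 66 * 31 + 66 = 2112
        System.out.println(a.hashCode() == b.hashCode()); // true
        System.out.println(a.equals(b));                  // false
    }
}
```

An equals method that relied on the hash alone would report these two strings as equal.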

You should always use the equals method for this.

The general rule is:

If the equals method returns true for Objects a and b, the hashCode method must return the same value for a and b.

This does not mean that if the hashCode method returns the same value for a and b, the equals method has to return true for these two instances.

For instance:

public int hashCode(){
  return 5;
}

is a valid, albeit inefficient, hashCode implementation.
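To illustrate why a constant hash code is still valid, here is a small sketch with a hypothetical class ConstantHashName (not part of the original answer). A HashSet keeps working correctly because it falls back on equals within a bucket; it just loses its speed, since every element ends up in the same bucket.

```java
import java.util.HashSet;
import java.util.Set;

final class ConstantHashName {
    private final String name;

    ConstantHashName(String name) { this.name = name; }

    @Override
    public int hashCode() { return 5; } // every instance lands in the same bucket

    @Override
    public boolean equals(Object o) {
        return o instanceof ConstantHashName
                && name.equals(((ConstantHashName) o).name);
    }

    public static void main(String[] args) {
        Set<ConstantHashName> set = new HashSet<>();
        set.add(new ConstantHashName("Ann"));
        set.add(new ConstantHashName("Bob"));
        set.add(new ConstantHashName("Ann")); // equal to the first entry, so not added
        System.out.println(set.size()); // 2
    }
}
```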

EDIT:

Using it within an equals method would look something like this:

import java.util.Objects;

public class Person {

  private String name;

  public Person(String name) { this.name = name; }

  public String getName() { return this.name; }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof Person)) { return false; }
    Person p = (Person) o;
    // null-safe comparison of the actual field
    boolean nameE = Objects.equals(this.name, p.getName());
    // equal names guarantee equal hashes; unequal names may or may not collide
    boolean hashE = this.hashCode() == p.hashCode();
    // The only moment you're sure hashE is true is when the previous check returned true.
    // In any other case its value doesn't matter, because nameE is already false,
    // so at best the hash check is redundant.
    return nameE && hashE;
  }

  @Override
  public int hashCode() {
    // derived from the same field that equals compares
    return Objects.hashCode(name);
  }

}
Stultuske
  • True, two objects that are not equal might have the same hash, but chances are slim by a wide margin. Say SHA-256 is used to find the hash. SHA-256 provides 2^256 possible values of the hash of any given object, right? This implies there is a 1/2^256 chance of having the same hash. – the_prole Jul 17 '21 at 21:32
  • @the_prole you are now assuming SHA-256 is commonly used in hashCode methods. I've worked on projects as professional developer for over a decade, I've never seen it in a production environment. – Stultuske Jul 19 '21 at 05:36
  • I see. So for hash code method, we cannot assume same collision resistance as SHA-256? – the_prole Aug 05 '21 at 03:45
2

It is a very bad practice. A hash function should produce as few collisions as possible, but there are usually more possible objects than possible hash values, so by the pigeonhole principle some distinct objects must share a hash.

When comparing hashes, you have a certain chance of getting "false positives".
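As a sketch of such a false positive, assume a hypothetical class HashOnlyPoint whose hash is built from instance variables and a prime multiplier, as the question describes, and whose equals (deliberately wrong) compares only hashes:

```java
final class HashOnlyPoint {
    private final int x;
    private final int y;

    HashOnlyPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public int hashCode() {
        return 31 * x + y; // simple prime-multiplier hash
    }

    // Deliberately wrong: treats equal hashes as equal objects.
    @Override
    public boolean equals(Object o) {
        return o instanceof HashOnlyPoint && o.hashCode() == this.hashCode();
    }

    public static void main(String[] args) {
        HashOnlyPoint p = new HashOnlyPoint(0, 31);
        HashOnlyPoint q = new HashOnlyPoint(1, 0);
        // 31 * 0 + 31 == 31 * 1 + 0 == 31, so two distinct points compare equal.
        System.out.println(p.equals(q)); // true, a false positive
    }
}
```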

matanso
1

Actually, it is not a bad idea!

But make sure you use this method to determine inequality, not equality. Comparing hash codes may be faster than checking equality, especially when the hash code is cached (as it is in java.lang.String).

If two objects have different hash codes, they must be different; otherwise they may or may not be equal. For example, you could use this method as follows:

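// Illustrative helper (the name is made up); assumes a and b are non-null.
static boolean quickEquals(Object a, Object b) {
    // Equal hash codes prove nothing, so equals() still has to confirm.
    if (a.hashCode() == b.hashCode()) {
        if (a.equals(b)) return true;
    }
    // Different hash codes prove the objects are not equal.
    return false;
}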

Be aware that in some cases the code above may be slower than using equals() alone, especially when a and b are equal most of the time.
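Inside an equals implementation itself, the idea might look like the following sketch. The class CachedKey is hypothetical; it assumes an immutable object that computes its hash once in the constructor, so the hash comparison is a cheap way to reject unequal objects, while the field comparison still makes the final decision.

```java
import java.util.Arrays;

final class CachedKey {
    private final int[] values;
    private final int hash; // computed once, so comparing hashes is cheap

    CachedKey(int... values) {
        this.values = values.clone();
        this.hash = Arrays.hashCode(this.values);
    }

    @Override
    public int hashCode() {
        return hash;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof CachedKey)) return false;
        CachedKey other = (CachedKey) o;
        // Different hashes prove inequality, so skip the element-by-element scan.
        if (hash != other.hash) return false;
        // Equal hashes prove nothing: the field comparison still decides.
        return Arrays.equals(values, other.values);
    }
}
```

Note that equals compares the fields directly instead of calling equals again, which avoids recursion, and that two keys with colliding hashes, such as CachedKey(0, 31) and CachedKey(1, 0), are still correctly reported as unequal.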

From the documentation of java.lang.Object:

  • If two objects are equal according to the equals(Object) method, then calling the hashCode method on each of the two objects must produce the same integer result.
  • It is not required that if two objects are unequal according to the equals(java.lang.Object) method, then calling the hashCode method on each of the two objects must produce distinct integer results. However, the programmer should be aware that producing distinct integer results for unequal objects may improve the performance of hashtables.
Kamil Jarosz
  • If you are caching hashCodes this is a good optimization. Otherwise, calculating the hashCode is probably slower than comparing all the fields that go into it. – Thilo Dec 28 '15 at 11:09
  • Since this might cause false positives, I wouldn't really call it "not a bad idea". It would only be valid if you add it as an additional check, but then basically what you are doing is: if ( allIsTrue && (allIsTrue || somethingSimilar)) so, basically: allIsTrue already answered the question – Stultuske Dec 30 '15 at 10:14
  • If this is really an optimisation then you should expect the `equals` method to already be doing it. So either way you should just call `equals`. – kaya3 Jun 22 '21 at 09:50
  • @kaya3 Note that the question was stated from the point of actually _writing_ the `equals` method. Also sometimes optimizations happen under some specific circumstances, so it might not make sense to add them to `equals` for everyone – Kamil Jarosz Jun 22 '21 at 16:50
  • If you wrote this inside the `equals` method then it would have unbounded recursion for objects with the same hashCode, since it calls the `equals` method itself. – kaya3 Jun 22 '21 at 17:48
  • The code snippet from this answer was meant to give an idea of how this might work and should be treated rather as a pseudo code (you could also argue that `a` and `b` are uninitialized). When using this idea specifically inside `equals`, of course you would have to change the equality check to the actual implementation (or other method call), the same way you would change `a` to `this` and `b` to `other`. – Kamil Jarosz Jun 23 '21 at 07:07
0

Don't do this

While it is correct that you need to override equals() and hashCode() in pairs, having the same hash is not the same as having the same values.

Put some effort into really thinking the equality logic through. Don't take shortcuts here; they will bite you later.

Jan