1

I recently have to override the equals and hashCode methods in Java. I hence looked for a fast and efficient way to compute hash codes.

Java developers seem to aggree on the following method :

    int hash = 23;
    hash = hash * 37 + paramOne;
    hash = hash * 37 + paramTwo;
    // And so on...

It might be simple arithmetic but I don't really get it. What are the garantees ? What are the corner cases ? Is there a better (rather simple) way to do it ?

Thank you !

Maxime
  • 1,776
  • 1
  • 16
  • 29
  • 3
    Where does "Java developers seem to agree on the following method" come from? – nhahtdh Jun 13 '12 at 09:45
  • 1
    hash functions are magical. You can analyse them and see if they work well, but its tough to explain why any particular numbers work. – Oliver Jun 13 '12 at 09:46
  • @nhahtdh For instance : http://stackoverflow.com/questions/113511/hash-code-implementation, http://stackoverflow.com/questions/299304/why-does-javas-hashcode-in-string-use-31-as-a-multiplier (and accordingly to one answer bellow), http://stackoverflow.com/questions/3121524/understanding-of-hash-code, and so on, without counting my coworker's opinion. – Maxime Jun 13 '12 at 09:51

3 Answers3

2

It's about prime factors. Have a look at this answer.

If you're just looking for a fast way and pragmatic way to do it and you have no major concerns about performance, have a look at the Apache Commons Lang HashCodeBuilder or a similar library function. There is an equivalent builder for equals.

Community
  • 1
  • 1
Urs Reupke
  • 6,791
  • 3
  • 35
  • 49
  • I actually tried several version of hash code generation. Using the standard hard coded `31 * i` or `<< N - 1`, using `Objects.hash` and `HashCodeBuilder` and I found that `HashCodeBuilder` is about the same speed as the standard hard-coded one mentioned by Bloch (and internally it does the same calculations). I was surprised, but I assume this has to do with the JVM optimizing the code by inlining it all after some amount of initial calls. `Objects.hash` was one of the slowest, probably due to the behind-the-scenes array generation inherent in the `Object... objects` ellipsis parameter. – Shadow Man Aug 01 '18 at 23:45
2

In words of Joshua Bloch (explaining the default implementation of hashCode() method in String class , that is : s[0]*31^(n-1) + s[1]*31^(n-2) + ... + s[n-1]):

The value 31 was chosen because it is an odd prime. If it were even and the multiplication overflowed, information would be lost, as multiplication by 2 is equivalent to shifting. The advantage of using a prime is less clear, but it is traditional. A nice property of 31 is that the multiplication can be replaced by a shift and a subtraction for better performance: 31 * i == (i << 5) - i. Modern VMs do this sort of optimization automatically.

For further reading , refer this and this.

Community
  • 1
  • 1
Kazekage Gaara
  • 14,972
  • 14
  • 61
  • 108
1

Joshua Bloch tells you how to override equals and hashCode properly in chapter 3 of his "Effective Java". Google for it and read it.

He's the guy who wrote the collections API and is now the chief Java architect for Google. That's authoritative enough for me.

duffymo
  • 305,152
  • 44
  • 369
  • 561