
I've been developing in Java with Netbeans for some time now, and there are some things I just rely on working without really questioning how. Among these are the automatically generated hashCode() and equals() methods.

The equals method is straightforward to follow, but I find the hashCode method somewhat enigmatic. I don't understand why it chooses the multipliers and applies the operations it does.

import java.util.Arrays;
import java.util.Objects;

public class Foo {

    int id;
    String bar;
    byte[] things;

    @Override
    public int hashCode() {
        int hash = 7;
        hash = 89 * hash + this.id;
        hash = 89 * hash + Objects.hashCode(this.bar);
        hash = 89 * hash + Arrays.hashCode(this.things);
        return hash;
    }    
}

Searching the documentation, this site, and Google for things like "netbeans generate hashcode" turned up nothing that seemed relevant. Is anyone here familiar with what this generation strategy is and why Netbeans uses it?

Edit:
Thanks for the answers so far! Thanks especially to this answer on the linked SO question, I now understand the logic behind using primes in a hashCode method much better. However, the other aspect of my question, which nobody has really addressed so far, is how and why Netbeans chooses the prime numbers that it does for its generated methods. The initial value of hash and the multiplier (89 in my example) seem to vary depending on various factors of the class.

For example, if I add a second String to the class, hashCode() becomes

public int hashCode() {
    int hash = 7;
    hash = 13 * hash + this.id;
    hash = 13 * hash + Objects.hashCode(this.bar);
    hash = 13 * hash + Objects.hashCode(this.baz);
    hash = 13 * hash + Arrays.hashCode(this.things);
    return hash;
}

So, why does Netbeans choose these specific primes, as opposed to any other ones?

Kerrigan Joseph
  • I think it's just a way to do it, IntelliJ IDEA uses 31 – Marco Acierno Feb 20 '14 at 17:16
  • Multiplying by 31 is simple to optimize to a shift and a minus, not sure about 89. Both should be primes though. – zapl Feb 20 '14 at 17:17
  • A good alternative is: `return Objects.hash(id, bar, baz, things);` which does more or less the same thing. – assylias Feb 21 '14 at 11:53
  • Saw your edits this morning as I was thinking about how I skirted the issue with my answer. :-) It's a good question, so added my upvote for you. – unigeek Feb 21 '14 at 11:54
  • Regarding why Netbeans chooses those primes, zapl gave a good hint in his comment. If you really want to be sure, check out their code. General tip: if you like the answers you got so far, just upvote them instead of saying thanks in your question. – atamanroman Feb 21 '14 at 12:07
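
One caveat about the `Objects.hash(...)` suggestion above: for an array field such as `things`, passing the array directly makes `Objects.hash` use the array's identity-based `hashCode()`, not its contents. A minimal sketch of the difference follows; the class and method names (`ObjectsHashDemo`, `contentHash`, `naiveHash`) are made up for illustration.

```java
import java.util.Arrays;
import java.util.Objects;

class ObjectsHashDemo {

    // Hypothetical helper: hashes the array by content via Arrays.hashCode.
    static int contentHash(int id, String bar, String baz, byte[] things) {
        return Objects.hash(id, bar, baz, Arrays.hashCode(things));
    }

    // Hypothetical helper: passes the array as-is, so its identity hash is used.
    static int naiveHash(int id, String bar, String baz, byte[] things) {
        return Objects.hash(id, bar, baz, things);
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3};
        byte[] b = {1, 2, 3}; // same contents, different array object

        // Content-based hashing agrees for equal contents.
        System.out.println(contentHash(1, "x", "y", a) == contentHash(1, "x", "y", b));

        // Identity-based hashing will almost always differ between a and b.
        System.out.println(naiveHash(1, "x", "y", a) == naiveHash(1, "x", "y", b));
    }
}
```

If `equals()` compares `things` with `Arrays.equals`, the content-based variant is the one that keeps `hashCode()` consistent with it.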

3 Answers


This is an optimization aiming to better distribute the hash values. Eclipse does it similarly. Have a look at "Why use a prime number in hashCode?" and "Why does Java's hashCode() in String use 31 as a multiplier?".

This is in no way required. Even `return 0;` is sufficient to fulfill the equals/hashCode contract. The only reason is that hash-based data structures perform better with well-distributed hash values.
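
As a rough illustration of that point, here is a sketch with a hypothetical `Key` class whose `hashCode()` always returns 0. The set still behaves correctly; the cost is that every element lands in the same bucket, so lookups degrade.

```java
import java.util.HashSet;
import java.util.Set;

class ConstantHashDemo {

    // Hypothetical key class: a legal but degenerate hashCode.
    static final class Key {
        final int id;

        Key(int id) {
            this.id = id;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Key && ((Key) o).id == id;
        }

        @Override
        public int hashCode() {
            return 0; // equal objects trivially get equal hash codes
        }
    }

    public static void main(String[] args) {
        Set<Key> keys = new HashSet<>();
        for (int i = 0; i < 1000; i++) {
            keys.add(new Key(i));
        }
        // Correctness is preserved: distinct keys stay distinct...
        System.out.println(keys.size());                // 1000
        // ...and an equal key is still found, just via a single crowded bucket.
        System.out.println(keys.contains(new Key(42))); // true
    }
}
```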

Some would call this premature optimization. I guess it's OK since it's a) free (generated) and b) widely recognized (almost every IDE does it).

atamanroman
  • It is technically not required to produce a good hash but `return 0` would result in absolute worst case performance for any `HashMap` and no sane person should ever use such a broken hash. I wouldn't call that premature optimization. – zapl Feb 20 '14 at 17:25
  • @zapl I did not say that zero is a good hash ("_Even_ ..."), just that it is a valid one. Repeatedly multiplying by primes on top of perfectly valid and sane hashes (int, string, byte[]) may be considered premature optimization, and I would stand by that statement. – atamanroman Feb 20 '14 at 19:56
  • You're right, adjusting prime factors and such is easily premature optimization. Creating a basic implementation beyond `return 0` is IMO not just optimization, it's required to make hashCode work as intended; if not, you're kind of breaking the contract. – zapl Feb 20 '14 at 22:08
  • That first SO question you linked to pointed me in the right direction. I guess if I want to get more Netbeans-specific answers I'll go spelunking through their code :) – Kerrigan Joseph Feb 21 '14 at 15:12

IBM has an article on how to write your own equals() and hashCode() methods. What they're doing is fine, though 31 tends to be a better prime because the multiplication can be optimized better.

Also have a look at how String.hashCode() works. It's exactly that, but with different primes and homogeneous types.
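
To make the comparison concrete, the sketch below recomputes a string's hash with the same multiply-and-add loop and checks it against `String.hashCode()`, whose Javadoc specifies the formula `s[0]*31^(n-1) + ... + s[n-1]`. The class and method names are made up for illustration.

```java
class StringHashDemo {

    // Hypothetical helper: the multiply-and-add scheme applied to each char.
    static int polynomialHash(String s) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = 31 * h + s.charAt(i); // same shape as "hash = 89 * hash + field"
        }
        return h;
    }

    public static void main(String[] args) {
        String sample = "netbeans";
        // Matches the documented String.hashCode() formula.
        System.out.println(polynomialHash(sample) == sample.hashCode()); // true
    }
}
```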

David Ehrmann
  • @davidehrmann Would this be recommended when returning the hash code of a String property in your hashCode() method, considering that (as you say) the String class already does something similar (using 31) to generate its hash code? For instance, if one had a class with a String property 'name' which was used to define equality, would it be 'good enough' to simply use *return Objects.hashCode(this.name);* or is it worth writing something as shown above or letting the IDE generate the method? It seems like unnecessary duplication; is there an advantage? If not, where would it be recommended? – Zippy Dec 14 '20 at 02:21
  • If you have a class with a single string property that you'd expect to be used in a `HashMap` or `HashSet` alongside `Strings`, it might be worth multiplying by something or passing a second arg to `Objects.hash()`, but this feels like an edge case. Almost all the times I've used a custom class as a map key, it's just that class and its subclasses in the map; I don't mix it with `Strings`. – David Ehrmann Dec 15 '20 at 05:02
  • Thanks @davidehrmann, what I mean is, say I had a HashMap that took a custom 'House' object as the key and then something else as the value, if the House object took a single String property (say 'houseName') and used that to determine equality, would it be OK to simply return the hashCode of the houseName property (since the String class creates a pretty good hash), or would it be necessary to write our own hash as shown above? Hope that's clear. Cheers – Zippy Dec 15 '20 at 19:45

From Item 9 of Joshua Bloch's Effective Java, 2nd ed., the important thing to remember is to always override hashCode() when you override equals(), to ensure that equal objects have equal hash codes; otherwise you might easily violate this contract. While he says that a state-of-the-art hash function is a topic for doctoral research, the recipe he gives for a good general-purpose hashCode might, in your case, yield:

@Override
public int hashCode() {
    int result = 17;
    result = 31 * result + id;
    // Bloch's recipe uses 0 for a null object reference, which avoids an NPE here
    result = 31 * result + (bar == null ? 0 : bar.hashCode());
    result = 31 * result + Arrays.hashCode(things);
    return result;
}

As mentioned by @zapl and David Ehrmann, multiplication by 31 can easily be optimized into a bit shift and a subtraction, since 31 * i == (i << 5) - i, so that may work out to be a little faster if that's important.
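
A quick way to convince yourself of the identity behind that optimization is to check it directly. This is only a sketch with made-up names, and it says nothing about what a particular JIT actually emits; it just shows that the two expressions agree for all the sampled ints, including values that overflow.

```java
class ShiftDemo {

    // Plain multiplication by 31.
    static int times31(int i) {
        return 31 * i;
    }

    // Equivalent form: 32 * i minus i, with the multiplication done as a shift.
    static int shiftAndSubtract(int i) {
        return (i << 5) - i;
    }

    public static void main(String[] args) {
        int[] samples = {0, 1, -1, 7, 17, 89, Integer.MAX_VALUE, Integer.MIN_VALUE};
        boolean allEqual = true;
        for (int i : samples) {
            // int arithmetic wraps around, so the identity also holds on overflow.
            allEqual &= times31(i) == shiftAndSubtract(i);
        }
        System.out.println(allEqual); // true
    }
}
```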

unigeek