Why would one add a constant to hashCode()?

Question

I'm new to Java, and I've recently learned about hashCode(). On the wikipedia article about Java hashCode(), there is the following example of a hashCode() method:

public class Employee {
    int        employeeId;
    String     name;
    Department dept;

    // other methods would be in here

    @Override
    public int hashCode() {
        int hash = 1;
        hash = hash * 17 + employeeId;
        hash = hash * 31 + name.hashCode();
        hash = hash * 13 + (dept == null ? 0 : dept.hashCode());
        return hash;
    }
}

I understand that multiplying by 31 and 13 decreases the chance of collision, but I don't see why hash is initialized to 1 rather than to employeeId. In the end this simply has the effect of adding 17*31*13 to the hashCode(), which is not going to change whether two hashCode() values are equal or not.

Bloch's "Effective Java (Second Edition)" has a very similar example in Item 9 (pages 47 and 48), but his explanation of this additive constant is quite mysterious to me.

Edit: This question was marked as a duplicate of the question Why does Java's hashCode() in String use 31 as a multiplier? That question is not the same: it is asking whether there is any reason to prefer the number 31 to any other number in the formula for the hashCode() of a String. My question is why it is the case that in many examples of hashCode() which I have found online there is a single constant added to the hashCode() of all objects.

In fact, the example of hashCode() of a String is relevant here, because in that example there is no constant added. If adding 17*31*13 has any effect in the example I gave above, why not add such a constant when computing hashCode() of a String?

No reason. As pointed out, it's the same as starting with `int hash = 17 * employeeId;` and then continuing on, it just makes the code look nice and uniform, and the compiler will optimize it out of existence anyway, so it's purely cosmetic. — Mike 'Pomax' Kamermans, Oct 03 '18 at 04:57

score 0 · Answer 1 · answered Oct 03 '18 at 07:00

0

Starting with non zero value helps notionally when the hashCode overflows or is % by a non power of two which results in more bits being different comparing values which overflow compared to those which don't.

Small constants are not as effective as larger ones but can use less bytes and be faster. E.g. * 31 is faster but might not be as effective as * 109.

The difference it make depends on your use case.

Note: even ensuring you have unique hashCodes doesn't ensure you have no collisions after the number has been used to select a bucket.

answered Oct 03 '18 at 07:00

Peter Lawrey

525,659
79
751
1,130

Could you please expand a little more on your first sentence? Are you saying that shifting hashCode by a constant results in more differing bits when comparing two hashCodes modulo some number, making it easier to check that they are indeed different? – ExoticSphere Oct 03 '18 at 17:46
@ExoticSphere yes, if you add a constant this will cause more bits to be different. While the constant is small it gets multiplied by a factor, even so a bigger constant (and factors) might be better. – Peter Lawrey Oct 03 '18 at 19:14

Why would one add a constant to hashCode()?

1 Answers1

Linked