The reason for that is in the comments:
Computes key.hashCode() and spreads (XORs) higher bits of hash to lower. Because the table uses power-of-two masking, sets of hashes that vary only in bits above the current mask will always collide. (Among known examples are sets of Float keys holding consecutive whole numbers in small tables.)
What that is saying in simple words is that Key#hashcode
(last bits that we care about) would be the same for Keys that are actually different. And that would create collisions since these entries will end-up in the same bucket.
Where an Entry goes is decided based on the number of existing buckets or from the last n - bits as you have already seen from :
int index = (n - 1) & hash
If hashmap would not re-hash again - it means that entries that don't differ in those last bits would end up in the same bucket, search time == slower.
The reason that XOR
is used - because it has 50/50% distribution of 1
and 0
(as opposed to |
or &
that have 75/25 or 25/75).
And the &
operation is used instead of %
, not for speed only, but because hashcodes are int
s and can be negative. modulo on a negative number will be negative - meaning a negative bucket... Thus &
is used which will generate a positive index.