As the title suggests this is a question about an implementation detail from HashMap#resize
- that's when the inner array is doubled in size.
It's a bit wordy, but I've really tried to prove that I did my best understanding this...
This happens at a point when entries in this particular bucket/bin are stored in a Linked
fashion - thus having an exact order and in the context of the question this is important.
Generally the resize
could be called from other places as well, but let's look at this case only.
Suppose you put these strings as keys in a HashMap
(on the right there's the hashcode
after HashMap#hash
- that's the internal re-hashing.) Yes, these are carefully generated, not random.
DFHXR - 11111
YSXFJ - 01111
TUDDY - 11111
AXVUH - 01111
RUTWZ - 11111
DEDUC - 01111
WFCVW - 11111
ZETCU - 01111
GCVUR - 11111
There's a simple pattern to notice here - the last 4 bits are the same for all of them - which means that when we insert 8 of these keys (there are 9 total), they will end-up in the same bucket; and on the 9-th HashMap#put
, the resize
will be called.
So if currently there are 8 entries (having one of the keys above) in the HashMap
- it means there are 16 buckets in this map and the last 4 bits of they key decided where the entries end-up.
We put the nine-th key. At this point TREEIFY_THRESHOLD
is hit and resize
is called. The bins are doubled to 32
and one more bit from the keys decides where that entry will go (so, 5 bits now).
Ultimately this piece of code is reached (when resize
happens):
Node<K,V> loHead = null, loTail = null;
Node<K,V> hiHead = null, hiTail = null;
Node<K,V> next;
do {
next = e.next;
if ((e.hash & oldCap) == 0) {
if (loTail == null)
loHead = e;
else
loTail.next = e;
loTail = e;
}
else {
if (hiTail == null)
hiHead = e;
else
hiTail.next = e;
hiTail = e;
}
} while ((e = next) != null);
if (loTail != null) {
loTail.next = null;
newTab[j] = loHead;
}
if (hiTail != null) {
hiTail.next = null;
newTab[j + oldCap] = hiHead;
}
It's actually not that complicated... what it does it splits the current bin into entries that will move to other bins and to entries that will not move to other bins - but will remain into this one for sure.
And it's actually pretty smart how it does that - it's via this piece of code:
if ((e.hash & oldCap) == 0)
What this does is check if the next bit (the 5-th in our case) is actually zero - if it is, it means that this entry will stay where it is; if it's not it will move with a power of two offset in the new bin.
And now finally the question: that piece of code in the resize is carefully made so that it preserves the order of the entries there was in that bin.
So after you put those 9 keys in the HashMap
the order is going to be :
DFHXR -> TUDDY -> RUTWZ -> WFCVW -> GCVUR (one bin)
YSXFJ -> AXVUH -> DEDUC -> ZETCU (another bin)
Why would you want to preserve order of some entries in the HashMap
. Order in a Map
is really bad as detailed here or here.