The initial capacity and load factor are two parameters that affect the performance of a HashMap.
The default load factor (0.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost.
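Both parameters can be set explicitly through the public HashMap(int initialCapacity, float loadFactor) constructor; the values below are purely illustrative:

```java
import java.util.HashMap;
import java.util.Map;

public class HashMapTuning {
    public static void main(String[] args) {
        // Explicit initial capacity (number of buckets) and load factor.
        // With 64 buckets and a 0.75 load factor, the map resizes once
        // roughly 48 entries have been inserted.
        Map<String, Integer> map = new HashMap<>(64, 0.75f);
        map.put("a", 1);
        System.out.println(map.get("a")); // 1
    }
}
```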
When an item is added to the HashMap, it is assigned to a bucket based on a value derived from its hashCode and the number of buckets in the HashMap.
To identify the bucket for a given key, HashMap uses key.hashCode() and performs an operation along the lines of:

bucket index = HashMap.indexFor(HashMap.hash(key.hashCode()), entryArray.length)
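As a rough sketch of what those two steps look like, modeled on the pre-Java-8 OpenJDK source (the exact bit-spreading in hash() differs between JDK versions, so treat this as illustrative rather than the actual implementation):

```java
public class BucketIndexSketch {
    // Supplemental hash, roughly as in pre-Java-8 HashMap: spreads the
    // bits of hashCode() so the high bits also influence the bucket index.
    static int hash(int h) {
        h ^= (h >>> 20) ^ (h >>> 12);
        return h ^ (h >>> 7) ^ (h >>> 4);
    }

    // The table length is always a power of two, so the modulo
    // reduces to a bit mask: h % length == h & (length - 1).
    static int indexFor(int h, int length) {
        return h & (length - 1);
    }

    public static void main(String[] args) {
        String key = "example";
        int tableLength = 16; // default capacity
        int bucket = indexFor(hash(key.hashCode()), tableLength);
        System.out.println("bucket index = " + bucket);
    }
}
```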
When the number of entries in the hash map exceeds the product of the load factor and the current capacity, the hash map is rehashed (internal data structures are rebuilt), so that the hash map has approximately twice the number of buckets.
When the map is rehashed and everything is moved to a new location (bucket, etc.), the existing elements are rehashed as well and stored in new buckets according to their recomputed indices. The old array that held the elements is then garbage collected.
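Here is a toy illustration of the threshold arithmetic described above (the variable names are mine, not the internal field names):

```java
public class ResizeThresholdSketch {
    public static void main(String[] args) {
        int capacity = 16;          // current number of buckets
        float loadFactor = 0.75f;   // default load factor
        int threshold = (int) (capacity * loadFactor); // 12

        // Once the entry count exceeds the threshold, the table is
        // rebuilt with twice as many buckets and a new threshold.
        int size = 13;
        if (size > threshold) {
            capacity *= 2;                               // 32 buckets
            threshold = (int) (capacity * loadFactor);   // 24
        }
        System.out.println(capacity + " buckets, next resize after " + threshold + " entries");
    }
}
```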
If two threads simultaneously find that the HashMap needs resizing and both try to resize it, this can cause a race condition in the HashMap.
During the resizing of a HashMap, the elements stored in a bucket's linked list get reversed in order when they migrate to the new bucket, because the Java HashMap doesn't append new elements at the tail; it inserts them at the head to avoid traversing to the tail. If a race condition happens, you end up with an infinite loop.
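A self-contained sketch of that head-insertion behaviour, loosely modeled on the old pre-Java-8 transfer() loop (the Node class and method here are simplified stand-ins, not the real internal types):

```java
public class TransferSketch {
    // Simplified entry node; the real pre-Java-8 HashMap.Entry also
    // carries the value and the key's cached hash.
    static class Node {
        final String key;
        Node next;
        Node(String key, Node next) { this.key = key; this.next = next; }
    }

    // Head insertion, as in the old transfer() loop: each entry is
    // unlinked from the old bucket and pushed onto the FRONT of its
    // new bucket, so a chain A -> B -> C comes out as C -> B -> A
    // when all three land in the same new bucket.
    static Node[] transfer(Node[] oldTable, int newCapacity) {
        Node[] newTable = new Node[newCapacity];
        for (Node e : oldTable) {
            while (e != null) {
                Node next = e.next;
                int i = e.key.hashCode() & (newCapacity - 1); // new bucket index
                e.next = newTable[i];   // link in front of the current head
                newTable[i] = e;        // entry becomes the new head
                e = next;
            }
        }
        return newTable;
    }

    public static void main(String[] args) {
        // One bucket containing the chain A -> B -> C.
        Node chain = new Node("A", new Node("B", new Node("C", null)));
        Node[] oldTable = { chain };

        // Keep the new capacity at 1 so all entries collide again,
        // which makes the reversal easy to see.
        Node[] newTable = transfer(oldTable, 1);
        for (Node e = newTable[0]; e != null; e = e.next)
            System.out.print(e.key + " ");   // prints: C B A
    }
}
```

Because each entry is pushed onto the front of its new chain, entries that still collide after the resize come out in reverse order, which is what my first question below is about.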
I have the following questions:
- Why does the linked list for each bucket get reversed in order during migration to a new bucket?
- How can a race condition lead to an infinite loop?
- How can increasing the number of buckets reduce the lookup time?
- Will elements that are in the same bucket still be together in the same bucket after rehashing?