
I was reading the source code of HashMap.java and was confused by the default value of loadFactor. As the authors of this class mention:

Ideally, under random hashCodes, the frequency of nodes in bins follows a Poisson distribution with a parameter of 0.5 on average for the default resizing threshold of 0.75f ...

What I don't understand is the relationship between the resizing threshold of 0.75f and the frequency calculated by the Poisson distribution.

If there is some kind of relation between the two, what exactly is it?

And if there isn't, where did the default threshold of 0.75f come from?
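
For reference, the frequencies that comment talks about are just the probabilities of a Poisson distribution with parameter λ = 0.5; here is a minimal sketch that prints them (my own arithmetic, not code taken from HashMap.java):

    public class PoissonBins {
        public static void main(String[] args) {
            double lambda = 0.5;
            // P(a bin holds k entries) = e^(-lambda) * lambda^k / k!
            double p = Math.exp(-lambda);   // k = 0: probability of an empty bin
            for (int k = 0; k <= 8; k++) {
                System.out.printf("k = %d: %.8f%n", k, p);
                p *= lambda / (k + 1);      // next term of the Poisson pmf
            }
        }
    }

It prints roughly 0.607 for an empty bin and 0.303 for exactly one entry, so under this model a bin holding more than a handful of entries should be extremely rare.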

spike
  • this might help - https://stackoverflow.com/a/10901821/7505731 – Hemant Apr 21 '20 at 10:28
  • I have read this before, but I think what I am looking for is the theoretical reason or support behind the value 0.75f. In other words, why is the default loadFactor set to 0.75f rather than 0.5 or any other value? Or is my search for such a reason of no significance at all? Anyway, thanks for answering :) – spike Apr 21 '20 at 10:41

1 Answer


Even with a very good hashCode implementation there will be collisions (a collision means two elements ending up in the same bucket). The more elements in the map, the greater the chance of collisions. Once the map is about 75% full, the frequency of collisions rises noticeably. Keeping the load factor around 0.75 is advisable so that put and get stay around O(1).
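
To make that concrete, here is a rough simulation rather than the real HashMap internals: it throws random hash codes into a fixed number of buckets and counts how many entries land in an already occupied bucket at a few fill levels (the bucket count and the load factors tried are arbitrary example values):

    import java.util.Random;

    public class CollisionDemo {
        public static void main(String[] args) {
            int buckets = 1 << 16;                  // 65536 buckets, a stand-in for the table capacity
            double[] loadFactors = {0.50, 0.75, 1.00};
            Random rnd = new Random(42);

            for (double lf : loadFactors) {
                int n = (int) (buckets * lf);       // number of entries at this fill level
                int[] counts = new int[buckets];
                for (int i = 0; i < n; i++) {
                    // spread a random hash over the table, like index = hash & (capacity - 1)
                    counts[rnd.nextInt() & (buckets - 1)]++;
                }
                long collided = 0;
                for (int c : counts) {
                    if (c > 1) collided += c - 1;   // entries sharing a bucket with an earlier one
                }
                System.out.printf("load factor %.2f: %d of %d entries hit an occupied bucket%n",
                        lf, collided, n);
            }
        }
    }

The collided count grows faster than the entry count does, which is the intuition for resizing before the table is completely full.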

nbalodi
  • Thanks, so you think that 0.75 may have come from a large number of test results? – spike Apr 21 '20 at 14:03
  • Yes. As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. https://docs.oracle.com/javase/8/docs/api/java/util/HashMap.html – nbalodi Apr 21 '20 at 17:08
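
To put a number on the space/time tradeoff mentioned in that last comment: the resize point is simply capacity * loadFactor. A small sketch using the documented defaults of 16 and 0.75f (it only reproduces the arithmetic, it does not read HashMap's internal fields):

    public class ThresholdDemo {
        public static void main(String[] args) {
            int capacity = 16;          // documented default initial capacity
            float loadFactor = 0.75f;   // documented default load factor
            for (int i = 0; i < 4; i++) {
                int threshold = (int) (capacity * loadFactor);
                System.out.printf("capacity %d -> resize after %d entries%n", capacity, threshold);
                capacity <<= 1;         // the table doubles once the threshold is crossed
            }
        }
    }

So with the defaults the table doubles once the 13th entry is added; a larger load factor would save memory but let bins grow longer before a resize happens.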