0

Well I understand most of working of hashmap, placing of key-value pairs according to hashcode. I was reading this question https://stackoverflow.com/questions/10901752/what-is-the-significance-of-load-factor-in-hashmap#=

Here one of the comment and accepted answer says The probability of the hash collision is less, if the size of the map is bigger. For example elements with hash codes 4, 8, 16 and 32 will be placed in the same bucket, if the size of the map is 4, but every item will get an own bucket, if the size of the map is more than 32. The map with initial size 4 and load factor 1.0 (4 buckets, but all the 4 element in a single bucket) will be in this example in average two times slower than another one with the load factor 0.75 (8 buckets, two buckets filled - with element "4" and with elements "8", "16", "32")..

So according to it different hashcodes can be assigned to same bucket, Is it correct??

Q1. Can somebody explain this in more detail?

Q2. I am not able to understand how initial size of hashmap impacts this?

I dont think It's duplicate, Nowhere I found the clear answer to my questions above. But the first comment explains it. Adding that as answer.

Community
  • 1
  • 1
Vikash
  • 2,046
  • 3
  • 23
  • 30
  • While the answer to this question is to be found in the answers to that, it doesn't give the straight answer to it anywhere, viz; hash codes are reduced further to match the number of buckets, so that hash-using containers can have less than 2^32 buckets. This is generally done by modulo remainder on the hash code, so if we had a container with 4 buckets using simple modulo remainder to get the index, 4, 8, 16 and 32 would all have index 0 (`4 % 4 == 0`, `8 % 4 == 0` etc). – Jon Hanna May 20 '15 at 10:58
  • @JonHanna Great, It clears both doubts of mine. It is possible that different hashcode go into same bucket. And that way it impacts performance in get and put. – Vikash May 21 '15 at 05:48
  • Yes. The practical take-away from that is that a good hashcode mixes up the bits a lot, so different objects don't just ideally have different hashcodes, but *very* different hashcodes (but note that if you've a sequence then that is *sometimes* a perfect hash so just returning a numeric ID will *sometimes* be the best approach possible). Most hashcodes in Java (and .NET for that matter, which has a similar approach to all objects having a hash) aren't very good, but aren't dreadful either. – Jon Hanna May 21 '15 at 09:13

0 Answers0