1

When we are adding an entry or retrieving an entry to the HashMap, the equals() on the key would have been sufficient to find it in the particular index of the bucket. Why is hash also stored and checked?

Sean
  • 212
  • 1
  • 10
Priya
  • 59
  • 4

2 Answers2

3

You are correct that there is no need to recalculate the hash values for keys (in entries) when doing operations like lookup and removal. They are stored for performance related reasons.

Calculating the hash value for a key can be expensive.

When the ratio of entries to the size of a main array exceeds a certain value (the "load factor") the HashMap implementation expands the array. When this happens, all existing entries need to be redistributed to new hash chains ... based on the entry keys' has values. The hash values are stored in the Entry objects so that they doesn't have to be recalculated each time the entries need to be redistributed.

Having stored the hashcode values, they could also be used to accelerate lookup ...

// Version #1
if (node.key.equals(keyToTest)) {
    ...
}

// Versions #2
if (node.hashValue == hashValueToTest && node.key.equals(keyToTest)) {
    ...
}

If the key.equals method is expensive, then you can save some time (on average) by avoiding calling it when the hash values don't match. (But when they do match, the call must be made anyway!)


So, really, there are TWO reasons why the hash values are stored.

Stephen C
  • 698,415
  • 94
  • 811
  • 1,216
1

HashMap is an implementation of Map which maintains a table of entries, with references to the associated keys and values, organized according to their hash code. This implementation provides constant-time performance for the basic operations (get and put), assuming the hash function disperses the elements properly among the buckets.

Please see:

Hashmap and how this works behind the scene

While it is correct that equals() would be sufficient to (ultimately) find the key, this would not provide constant-time performance.

There are other implementations of Map which do not use hash codes.

lexicore
  • 42,748
  • 17
  • 132
  • 221
  • You are correct that the reason is performance. From a CS viewpoint there is no difference between using „equals“ and „first test hashCode, then equals“ in the big-O notation. – Jens Mar 31 '18 at 12:52
  • @Jens The main reason for hash codes is the hash table. – lexicore Mar 31 '18 at 13:02
  • Right you are – Jens Mar 31 '18 at 13:03
  • @Jens There may be no difference between using „equals“ and „first test hashCode, then equals“ in the big-O notation, but checking hash code first will normally perform better. – lexicore Mar 31 '18 at 13:03