5

I'm not able to understand on working pattern of HashMap. Kindly help to understand it.

Say we have two objects Obj1 and Obj2 having same Hashcode as 1212. Now when we run "==" and equals it returns false.

Now i use ValueObj1 and Valueobj2 as value in a HashMap with Keys Obj1 and Obj2 respectively. I believe both the values will be save in same bucket making as List.

My question how HashMap picks Valueobj2 for Obj2 and ValueObj1 for Obj1. Say there are n.. such objects and values. How this key--> value association works internally even though hashcode is same but values are different.

Assuming both condition equals is not overridden and overridden.

Cœur
  • 37,241
  • 25
  • 195
  • 267
Pawan Kumar
  • 154
  • 1
  • 1
  • 11

5 Answers5

13

A HashMap/HashSet implements per bucket a list of keys (on the case of a Map together with the values). If multiple key's share the same hashCode value, they are placed in this list.

So the search method first extracts the hashCode of the queried key and then iterates over the corresponding list until an equals method succeeds. In case of a HashSet it means the key is found, in case of a HashMap, it returns the other side of the tuple: the value.

The memory of a HashMap thus works like:

+--------+--------+--------+--------+
|   00   |   01   |   10   |   11   |
+--------+--------+--------+--------+
    |        |        |        |
 k00/v00     _     k06/v06     _
    |                 |
 k08/v08           k14/v14
    |                 |
 k04/v04              _
    |
    _

What you see is on top the four buckets. Each bucket has a list (the items underneath), that stores tuples of keys (k) and values (v). Since there are here only four buckets, the hash algorithm uses a modulo 4 operation, so a key k06 with value v06 would be placed in bucket 06 mod 4 = 02 thus 10. In case a second key k14 is added with 14 mod 4 = 02 thus 10, it is simply added to the list.

Since the values are stored with it as well, one can perform a fast lookup operation. The key is thus stored together with the value.

You noticed, that iterating over the (linked) list is an expensive operation. But the point of a HashMap is that one hopes, that the number of hash-collisions to use the correct term (number of keys sharing the same bucket) is very low. In general one might expect two or three elements per bucket. The performance boost is thus achieved by selecting the correct bucket in constant time, searching the bucket requires however linear time (or in case there is a complete ordering on the key's, one might implement a (balanced) binary tree to search in logarithmic time). Worst-case however, a HashMap can achieve, the same performance as an ArrayList/LinkedList of entries, but given the hash-function has been designed decently, the odds are very low.

Willem Van Onsem
  • 443,496
  • 30
  • 428
  • 555
  • Hi, so you mean key is also stored with value in the bucket? – Pawan Kumar Jan 08 '15 at 09:32
  • Hi, you mean if we say map.get(key5) it will pick fifth value from the bucket....then why to maintain a linkedlist??? although the complexity is o(5) "for fifth value" – Pawan Kumar Jan 08 '15 at 09:42
  • 1
    @PawanKumar: no since one generally maintains a large number of buckets, picking the correct bucket will in general result in an extreme performance boost. But indeed, if the hash-function is chosen poorly and all items reside in the same bucket, a `HashMap` behaves as linear search. – Willem Van Onsem Jan 08 '15 at 09:45
  • 1
    Very nicely explained answer. Thanks. – Mandar Pandit Apr 29 '15 at 13:46
11

You can always look at the code.

 public V get(Object key) {
     if (key == null)
         return getForNullKey();
     int hash = hash(key.hashCode());
     for (Entry<K,V> e = table[indexFor(hash, table.length)];
          e != null;
          e = e.next) {
         Object k;
         if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
             return e.value;
     }
     return null;
 }
  1. So it first gets the hash for the given key.
  2. Using that hash it locates the table (referred to in other answers as a bucket).
  3. For every entry in the bucket it tests if the key equals the table entries key and if so it has found the correct item.

Partitioning keys by hashes in to buckets reduces the size of the linear search using equals comparisons. So you can see how harmful it is to return a fixed value for hashcode. See this for tips on good hashcode calculation.

Community
  • 1
  • 1
weston
  • 54,145
  • 21
  • 145
  • 203
  • So same question you mean key is also stored with value in the bucket so the key can be compared with key.... – Pawan Kumar Jan 08 '15 at 09:36
  • @PawanKumar Yes, they are kept together in the `Entry` objects in the bucket. – weston Jan 08 '15 at 09:37
  • 3
    @PawanKumar That is why you must have a good hash code calculation. See http://stackoverflow.com/questions/113511/hash-code-implementation – weston Jan 08 '15 at 09:52
3

HashMap works by dividing its content into buckets based on the hash value of the key. Each bucket in turn contains a list of entries, an entry consisting of the key and the value.

Let's say we want to look up x in the map. We calculate x.hashCode() and pick the appropriate bucket. We then iterate through the bucket's list and pick the entry e where e.key equals x. We then return e.value.

Pseudocode:

class Map {
    class Entry {
        Object key, value;
    }

    List<List<Entry>> buckets;

    Object get(Object key) {
        List<Entry> bucket = buckets.get(key.hashCode() % buckets.size());
        for (Entry entry : bucket) {
            if (Object.equals(key, entry.key) return entry.value;
        }
        return null;
    }
}

(Disclaimer: Using % to calculate a bucket index is an oversimplification and wouldn't work as-is; it's just there to convey the general idea)

BambooleanLogic
  • 7,530
  • 3
  • 30
  • 56
0

hashcode() method is called and hash code is calculated. This hashcode is used to find index of array for storing Entry object.

indexFor(hash,table.length) is used to calculate exact index in table array for storing the Entry object.

two key objects having same hashcode(which is known as collision)

In hash map, bucket used simple linked list to store objects .

if two keys have same hashcode, then store the key-value pair in the same bucket as that of the existing key.

How do you retrieve value object when two keys with same hashcode are stored in hashmap? Using hashcode wo go to the right bucket and using equals we find the right element in the bucket and then return it.

HashMap get() function

If key is not null then , it will call hashfunction on the key object.

int hash = hash(hashValue)

hashvalue is used to find the bucket location at which the Entry object is stored . Entry object stores in the bucket like this (hash,key,value,bucketindex) .

detail read here and here

Ankur Singhal
  • 26,012
  • 16
  • 82
  • 116
0

Comparing two objects with == is not a good idea since it checks if two objects are actually links to the same objects in memory.

There is a good article on Wikipedia about hashtables. Hashmap's in java has inside an array of "buckets".

When you put a new pair <key, value> (or in your case <obj1, valueObj1>) the bucket number is calculated depending on obj1.hashcode(). This pair is added into the selected bucket which is a LinkedList inside to store actual pairs <key, value>.

When you try to search for an valueObj1 with key-object obj1 hashmap calculates a bucket number where that pair is located and iterates over all that LinkedList's elements comparing keys with equals(). If instantly equals() returns true, it means that the element we are looking for is found.

darkled
  • 257
  • 2
  • 11