1

I know following thing about Java HashMap put():

  1. Calling put() starts by inserting key-values into HashMap.
  2. Initially we have linked list per bin, but once we have more than `TREEFY_THRESHOLD entries in a bin, bin contents are rearranged from linked list into balanced trees.

Similarly, get() should work as follows:

  1. If the bin in which the key hashes to is in the form of linked list, traverse it to check if the input key exists in the list or not. If it does then return its value.
  2. If the bin is in the form balanced tree, search the key in the tree and return its value

This is very high level abstract understanding after reading some articles, but I am unable to see how exactly this is happening in code. I tried to google it a bit and found these good articles: 1, 2, 3. But they target HashMap from java 7 (which does not involve balanced tree logic) and not java 8 (which involves balanced tree logic). Actually I am not struggling with balanced tree logic, but with the rest. Java 7 implementation was quite easy, but not the java 8 implementation. The whole code is quite involved and difficult to understand. It will be great if someone explain important methods of HashMap. If not, below are some specific doubts:

  1. The Java 8 HashMap.get() source:

    1.     public V get(Object key) {
    2.         Node<K,V> e;
    3.         return (e = getNode(hash(key), key)) == null ? null : e.value;
    4.     }
    5. 
    6.     /**
    7.      * Implements Map.get and related methods
    8.      *
    9.      * @param hash hash for key
    10.      * @param key the key
    11.      * @return the node, or null if none
    12.      */
    13.     final Node<K,V> getNode(int hash, Object key) {
    14.         Node<K,V>[] tab; Node<K,V> first, e; int n; K k;
    15.         if ((tab = table) != null && (n = tab.length) > 0 &&
    16.             (first = tab[(n - 1) & hash]) != null) {
    17.             if (first.hash == hash && // always check first node
    18.                 ((k = first.key) == key || (key != null && key.equals(k))))
    19.                 return first;
    20.             if ((e = first.next) != null) {
    21.                 if (first instanceof TreeNode)
    22.                     return ((TreeNode<K,V>)first).getTreeNode(hash, key);
    23.                 do {
    24.                     if (e.hash == hash &&
    25.                         ((k = e.key) == key || (key != null && key.equals(k))))
    26.                         return e;
    27.                 } while ((e = e.next) != null);
    28.             }
    29.         }
    30.         return null;
    31.     }
    

    (a) Why to always check first node explicitly on line 17?
    (b) What does tab[(n - 1) & hash] do on line 16?
    (c) May I will need to understance hash()source also:

    static final int hash(Object key) {
       int h;
       return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
    }
    
  2. Java 8 HashMap.put() source:

    1.     public V put(K key, V value) {
    2.         return putVal(hash(key), key, value, false, true);
    3.     }
    4. 
    5.     final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
    6.                    boolean evict) {
    7.         Node<K,V>[] tab; Node<K,V> p; int n, i;
    8.         if ((tab = table) == null || (n = tab.length) == 0)
    9.             n = (tab = resize()).length;
    10.         if ((p = tab[i = (n - 1) & hash]) == null)
    11.             tab[i] = newNode(hash, key, value, null);
    12.         else {
    13.             Node<K,V> e; K k;
    14.             if (p.hash == h ash &&
    15.                 ((k = p.key) == key || (key != null && key.equals(k))))
    16.                 e = p;
    17.             else if (p instanceof TreeNode)
    18.                 e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
    19.             else {
    20.                 for (int binCount = 0; ; ++binCount) {
    21.                     if ((e = p.next) == null) {
    22.                         p.next = newNode(hash, key, value, null);
    23.                         if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
    24.                             treeifyBin(tab, hash);
    25.                         break;
    26.                     }
    27.                     if (e.hash == hash &&
    28.                         ((k = e.key) == key || (key != null && key.equals(k))))
    29.                         break;
    30.                     p = e;
    31.                 }
    32.             }
    33.             if (e != null) { // existing mapping for key
    34.                 V oldValue = e.value;
    35.                 if (!onlyIfAbsent || oldValue == null)
    36.                     e.value = value;
    37.                 afterNodeAccess(e);
    38.                 return oldValue;
    39.             }
    40.         }
    41.         ++modCount;
    42.         if (++size > threshold)
    43.             resize();
    44.         afterNodeInsertion(evict);
    45.         return null;
    46.     }
    

    (a) I understand putTreeVal() in line 18 inserts key-value in balanced tree. But I dont find logic to put key-value in the linked list as in line 27 of getNode(), which involves traversing till end of the linked list.
    (b) What does if on lines 10 and 14 do?
    (c) I also dont get full logic completely clear. Can you provide comments for each important line. In fact I want to know if there is any documentation / source code repo having comment for on each important line explaining what it is trying to understand, because there are many tricky complex methods in this class like, resize(), tableSizeFor(int)

MsA
  • 2,599
  • 3
  • 22
  • 47
  • 3
    If someone does answer this, it is going to be a looong answer. If you on the other hand ask one thing _at a time_, you will find out that the majority of your questions are duplicates. Here are some, [1](https://stackoverflow.com/questions/46056113/java-hashmap-resizing/46069815#46069815), [2](https://stackoverflow.com/questions/47921663/when-and-how-does-hashmap-convert-the-bucket-from-linked-list-to-red-black-trees/47922079#47922079), [3](https://stackoverflow.com/questions/42941286/java-hashmap-array-size/42982856#42982856). – Eugene Mar 21 '20 at 19:16
  • 2
    *"This is very high level abstract understanding after reading some articles"* ... No. No, not high level at all. What you are asking about deals with a low level understanding of the Java Library's HashMap implementation. Maybe this is important to you because you are interested in using details of the implementation in some code of your own. However, for most users of the Collections Framework, these details are hardly very important. – scottb Mar 21 '20 at 19:19
  • I do agree with you on one question, though. [I do find the resize interesting too, but I still have no answer to that](https://stackoverflow.com/questions/45404580/hashmap-resize-method-implementation-detail) – Eugene Mar 21 '20 at 19:25
  • 2
    One of the best ways to understand code is to write unit test scenarios, debug and step through that code. – Andrew S Mar 21 '20 at 19:50
  • Does that mean apart from the developers of this class (which are some countable people), no one has complete clear understanding of how does every line of this class works? I was having feeling that every line of framework with substantial thinking / concepts behind it should have comments explaining them for world to know and evaluate what are they doing. I am surprised to see no comments in these methods explaining logic. When I wrote such tricky methods, I put comments for almost every line...not for the world but for myself because I knew I will waste substantial time when coming back. – MsA Mar 22 '20 at 06:53
  • @scottb fact that "HashMap uses linked list and balanced tree" is high level, how exactly it implements those data structures is low level. I want to know this because I was asked how HashMap handles collisions once in an interview and I was not able to answer. I was unaware that it uses linked list and balanced tree. Now I want to know more than that. – MsA Mar 22 '20 at 07:32
  • 1
    **a** when you are looking for a traversal, you are looking for a loop. There is only one loop, so it's actually easy to find. **b** line 10: if there is no entry at the array position, create one, line 14: if the node found at the array position has the right hashcode (shortcut) and the key matches, this is the target entry. **c** we surely don't do that for you. This is not a code review site. The previous questions were already stretching, but this is clearly off topic. – Holger Mar 24 '20 at 08:36

0 Answers0