40

I read that HashMap has the following implementation:

main array 
   ↓
[Entry] → Entry → Entry      ← linked-list implementation
[Entry]
[Entry] → Entry
[Entry]
[null ]

So, it has an array of Entry objects.

Questions:

  1. I was wondering how an index of this array can store multiple Entry objects in case of the same hashCode but different objects.

  2. How is this different from the LinkedHashMap implementation? It's a doubly linked list implementation of a map, but does it maintain an array like the one above, and how does it store pointers to the next and previous elements?

Mercenary
  • Just a guess, but internally every key->value pair's value stores a next and a previous pointer to other entries to maintain the order of insertion; this preserves order and still allows for constant-time deletion, insertion and retrieval – aaronman Nov 02 '13 at 04:39
  • ok. I was just thinking the same. So, each `Entry` object in the array has a link to the next and the previous Entry object? – Mercenary Nov 02 '13 at 04:47
  • I'm adding an answer to explain it better – aaronman Nov 02 '13 at 04:48

5 Answers

89

HashMap does not maintain insertion order, hence it does not maintain any doubly linked list.

The most salient feature of LinkedHashMap is that it maintains the insertion order of key-value pairs. LinkedHashMap uses a doubly linked list to do so.

The Entry of LinkedHashMap looks like this:

  static class Entry<K, V> {
      K key;
      V value;
      Entry<K, V> next;                // chain within a hash bucket
      Entry<K, V> before, after;       // for maintaining insertion order

      public Entry(K key, V value, Entry<K, V> next) {
          this.key = key;
          this.value = value;
          this.next = next;
      }
  }

By using before and after, we keep track of each newly added entry in the LinkedHashMap, which helps us maintain insertion order.

before refers to the previous entry and after refers to the next entry in the LinkedHashMap.
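
To make that concrete, here is a minimal sketch of how each newly added entry could be hooked onto the tail of that list (illustrative only, not the actual JDK code; the head, tail and linkLast names are made up for this sketch):

    class InsertionOrderSketch<K, V> {
        // Re-using the shape of the Entry class shown above.
        static class Entry<K, V> {
            K key;
            V value;
            Entry<K, V> next;              // hash-bucket chain
            Entry<K, V> before, after;     // insertion-order list
        }

        Entry<K, V> head, tail;            // ends of the insertion-order list

        // Called for every entry added to the map: append it at the tail, so
        // following head -> after -> after -> ... yields insertion order.
        void linkLast(Entry<K, V> e) {
            Entry<K, V> last = tail;
            tail = e;
            if (last == null) {
                head = e;                  // the very first entry in the map
            } else {
                e.before = last;           // new entry points back to the old tail
                last.after = e;            // old tail points forward to the new entry
            }
        }
    }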

LinkedHashMap

For diagrams and a step-by-step explanation, please refer to http://www.javamadesoeasy.com/2015/02/linkedhashmap-custom-implementation.html


Nicholas K
Ankit Mittal
53

So, it has an array of Entry objects.

Not exactly. It has an array of Entry object chains. A HashMap.Entry object has a next field allowing the Entry objects to be chained as a linked list.

I was wondering how an index of this array can store multiple Entry objects in case of the same hashCode but different objects.

Because (as the picture in your question shows) the Entry objects are chained.
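
To illustrate what "chained" means in code, here is a stripped-down sketch (this is not the JDK source; the class, field and method names are simplified assumptions):

    class BucketSketch<K, V> {
        static class Entry<K, V> {
            final int hash;
            final K key;
            V value;
            Entry<K, V> next;                       // next entry in the same bucket
            Entry(int hash, K key, V value, Entry<K, V> next) {
                this.hash = hash;
                this.key = key;
                this.value = value;
                this.next = next;
            }
        }

        @SuppressWarnings("unchecked")
        Entry<K, V>[] table = (Entry<K, V>[]) new Entry[16];   // the "main array"

        V get(Object key) {
            int hash = (key == null) ? 0 : key.hashCode();
            int index = (table.length - 1) & hash;              // pick a bucket
            // Walk the chain hanging off that bucket until the key matches.
            for (Entry<K, V> e = table[index]; e != null; e = e.next) {
                if (e.hash == hash
                        && (e.key == key || (key != null && key.equals(e.key)))) {
                    return e.value;
                }
            }
            return null;                                        // no entry with this key
        }
    }

Several keys whose hash codes map to the same index simply end up in the same chain, which is why one array slot can "hold" multiple entries.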

How is this different from the LinkedHashMap implementation? It's a doubly linked list implementation of a map, but does it maintain an array like the one above, and how does it store pointers to the next and previous elements?

In the LinkedHashMap implementation, the LinkedHashMap.Entry class extends the HashMap.Entry class by adding before and after fields. These fields are used to assemble the LinkedHashMap.Entry objects into an independent doubly-linked list that records the insertion order. So, in the LinkedHashMap class, each entry object is in two distinct chains (see the sketch after this list):

  • There are a number of singly linked hash chains that are accessed via the main hash array. These are used for (regular) hashmap lookups.

  • There is a separate doubly linked list that contains all of the entry objects. It is kept in entry insertion order, and is used when you iterate the entries, keys or values in the hashmap.
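
As a rough sketch of that two-chain layout (the names here are illustrative; the real JDK classes are HashMap's internal entry/node class and LinkedHashMap.Entry):

    // Each LinkedHashMap entry sits in two chains at once:
    //  * its bucket's singly linked chain, via the inherited next field;
    //  * the map-wide doubly linked insertion-order list, via before/after.
    class HashEntry<K, V> {
        K key;
        V value;
        HashEntry<K, V> next;                 // chain within one hash bucket
    }

    class LinkedEntry<K, V> extends HashEntry<K, V> {
        LinkedEntry<K, V> before, after;      // global insertion-order list
    }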

Stephen C
  • Thanks for all your answers! Reading all of them did confuse me a little, but I feel this one was a little clearer! Hence accepting this as the answer! – Mercenary Nov 02 '13 at 05:17
14

Take a look for yourself. For future reference, you can just google:

java LinkedHashMap source

HashMap uses a linked list of entries to handle collisions, but the difference between HashMap and LinkedHashMap is that LinkedHashMap has a predictable iteration order, achieved through an additional doubly linked list, which usually maintains the insertion order of the keys. The exception is when a key is re-inserted, in which case it stays in its original position in the list.

For reference, iterating through a LinkedHashMap is more efficient than iterating through a HashMap, but LinkedHashMap is less memory efficient.

In case it wasn't clear from the explanation above: the hashing process is the same, so you get the benefits of a normal hash map, but you also get the iteration benefits described above, because the doubly linked list that maintains the ordering of your Entry objects is independent of the linked list used for collision chaining.
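
A quick demo of the difference in iteration order (the HashMap half of the output depends on the keys' hash codes, so treat it as illustrative):

    import java.util.HashMap;
    import java.util.LinkedHashMap;
    import java.util.Map;

    public class IterationOrder {
        public static void main(String[] args) {
            Map<String, Integer> hash = new HashMap<>();
            Map<String, Integer> linked = new LinkedHashMap<>();
            for (String k : new String[] {"banana", "apple", "cherry"}) {
                hash.put(k, k.length());
                linked.put(k, k.length());
            }
            // Re-insertion: the value is updated, but the position is unchanged.
            linked.put("banana", 42);

            System.out.println("HashMap:       " + hash.keySet());   // order not guaranteed
            System.out.println("LinkedHashMap: " + linked.keySet()); // [banana, apple, cherry]
        }
    }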

EDIT: (in response to OP's comment):
A HashMap is backed by an array, in which some slots contain chains of Entry objects to handle collisions. To iterate through all of the (key, value) pairs, you would need to go through all of the slots in the array and then walk each chain; hence, the overall time is proportional to the capacity.

When using a LinkedHashMap, all you need to do is traverse through the doubly-linked list, so the overall time is proportional to the size.
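
As a rough, informal illustration of that capacity-versus-size point (not a proper benchmark; JIT warm-up and timer noise make single measurements unreliable): both maps below hold the same ten entries, but the HashMap is created with a huge capacity, so iterating it has to scan roughly a million mostly-empty buckets, while the LinkedHashMap only walks its ten list nodes.

    import java.util.HashMap;
    import java.util.LinkedHashMap;
    import java.util.Map;

    public class IterationCost {
        public static void main(String[] args) {
            Map<Integer, String> hash = new HashMap<>(1 << 20);    // ~1M buckets
            Map<Integer, String> linked = new LinkedHashMap<>();
            for (int i = 0; i < 10; i++) {
                hash.put(i, "v" + i);
                linked.put(i, "v" + i);
            }

            long t0 = System.nanoTime();
            for (Map.Entry<Integer, String> e : hash.entrySet()) { /* scans every bucket */ }
            long hashNanos = System.nanoTime() - t0;

            long t1 = System.nanoTime();
            for (Map.Entry<Integer, String> e : linked.entrySet()) { /* walks 10 nodes */ }
            long linkedNanos = System.nanoTime() - t1;

            System.out.println("HashMap iteration:       " + hashNanos + " ns");
            System.out.println("LinkedHashMap iteration: " + linkedNanos + " ns");
        }
    }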

Steve P.
  • Can someone please explain the downvote. Nothing that I said here is incorrect. – Steve P. Nov 02 '13 at 05:03
  • I got one too, I bet it's that guy that just answered – aaronman Nov 02 '13 at 05:04
  • @PaulBellora I doubt it was him, either. Frustrating, though, as I know that my answer is correct and our answers pretty much say the exact same thing (and mine was first). – Steve P. Nov 02 '13 at 05:07
  • @PaulBellora I was just kidding, but I rarely answer java questions cause this stuff happens, the only reason I did this one is because coincidentally I was thinking about how one would implement this type of structure today – aaronman Nov 02 '13 at 05:07
  • I didn't downvote this Answer. For the record, I decided to write my own because I thought yours didn't address the OP's questions *directly*, and didn't really explain the implementation details clearly. – Stephen C Nov 02 '13 at 05:07
  • @aaronman It's not about the rep, it's about providing the best answer possible. His answer looks nice and is clearer than mine. Doesn't explain the random downvote in any way, but, again that wasn't his doing... – Steve P. Nov 02 '13 at 05:11
  • @SteveP. well if we're going down that road, you could theoretically just edit other people's answers – aaronman Nov 02 '13 at 05:12
  • @aaronman - Well you could do that. But that's not the way "it is done". People might be offended if you "interfere" with their Answers. (I do it rarely ... when I see an accepted answer that is dangerously wrong.) – Stephen C Nov 02 '13 at 05:16
  • @aaronman **but at this point do you really need anymore rep** this is so funny. Just chill. – Trying Nov 02 '13 at 05:45
  • @Trying just messing around, I'm not the one whose "trying" so hard – aaronman Nov 02 '13 at 05:46
  • @SteveP. Explanation of how overall time would be proportional to the capacity is good. I was just reading the doc and was wondering why it was so. After reading your explanation it became clear – nantitv Aug 11 '17 at 08:06
  • @SteveP. The attached link within "Take a look for yourself" is broken. Could you please update the link? Thanks – william cage May 25 '20 at 15:23
1

Since none of the other answers actually explain how something like this could be implemented, I'll give it a shot.

One way would be to keep some extra information in the value (of the key->value pair), not visible to the user, that holds references to the previous and next elements inserted into the hash map. The benefits are that you can still delete elements in constant time: removing from a hash map is constant time, and removing from a linked list is too in this case, because you have a reference to the entry. You can still insert in constant time: a hash map insert is constant, and while a linked list insert normally isn't, here you have constant-time access to a spot in the linked list, so you can insert in constant time there as well. Lastly, retrieval is constant time because you only have to deal with the hash map part of the structure.
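
A sketch of the constant-time removal idea (the names are made up for this sketch, not JDK code): once the hash lookup has located the entry, detaching it from the insertion-order list is just two pointer updates.

    class UnlinkSketch<K, V> {
        static class Entry<K, V> {
            K key;
            V value;
            Entry<K, V> before, after;      // insertion-order links
        }

        Entry<K, V> head, tail;

        // O(1): rewire the neighbours of an entry we already hold a reference to.
        void unlink(Entry<K, V> e) {
            if (e.before == null) head = e.after;      // e was the first entry
            else e.before.after = e.after;
            if (e.after == null) tail = e.before;      // e was the last entry
            else e.after.before = e.before;
            e.before = e.after = null;                 // drop stale links
        }
    }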


Keep in mind that a data structure like this does not come without costs. The size of the hash map will rise significantly because of all the extra references. Each of the main methods will be slightly slower (which could matter if they are called repeatedly). And the indirection of the data structure (not sure if that's a real term :P) is increased, though this might not be as big a deal because the references are guaranteed to point to stuff inside the hash map.


Since the only advantage of this type of structure is that it preserves order, be careful when you use it. Also, when reading this answer, keep in mind that I don't know for sure that this is the way it's implemented, but it is how I would do it if given the task.


The Oracle docs contain a quote confirming some of my guesses.

This implementation differs from HashMap in that it maintains a doubly-linked list running through all of its entries.

Another relevant quote from the same website.

This class provides all of the optional Map operations, and permits null elements. Like HashMap, it provides constant-time performance for the basic operations (add, contains and remove), assuming the hash function disperses elements properly among the buckets. Performance is likely to be just slightly below that of HashMap, due to the added expense of maintaining the linked list, with one exception: Iteration over the collection-views of a LinkedHashMap requires time proportional to the size of the map, regardless of its capacity. Iteration over a HashMap is likely to be more expensive, requiring time proportional to its capacity.

aaronman
  • Removing from a `LinkedList` is not done in constant time. It can only be done in constant time if you have a reference to the element that you want to remove, otherwise, you would have to traverse through the list to find it. Also, iterating through a `LinkedHashMap` is more efficient than iterating through a `HashMap`, but `LinkedHashMap` is less memory efficient. – Steve P. Nov 02 '13 at 04:52
  • @SteveP. I see the problem I only made it clear for insertion – aaronman Nov 02 '13 at 04:53
  • Yes. I was wondering the same! Constant-time operations will be same since it extends `HashMap` right? But I did not understand the part where it says : `LinkedHashMap requires time proportional to the size of the map, regardless of its capacity` – Mercenary Nov 02 '13 at 05:11
-1

The hashCode is mapped to a bucket by the hash function. If there is a collision in hashCode, then HashMap resolves the collision by chaining, i.e. it adds the entry to a linked list. Below is the code which does this:

    for (Entry<K,V> e = table[i]; e != null; e = e.next) {
        Object k;
        if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
            V oldValue = e.value;
            e.value = value;
            e.recordAccess(this);
            return oldValue;
        }
    }

You can clearly see that it traverses the linked list, and if it finds the key it replaces the old value with the new one; otherwise it appends the new entry to the linked list.
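
The "append" branch is not shown in the fragment above; in the JDK 7 era it looked roughly like this (a simplified sketch, not the exact source): when no existing key matches, a new Entry becomes the new head of the bucket's chain.

    class AppendSketch<K, V> {
        static class Entry<K, V> {
            final int hash;
            final K key;
            V value;
            Entry<K, V> next;
            Entry(int hash, K key, V value, Entry<K, V> next) {
                this.hash = hash;
                this.key = key;
                this.value = value;
                this.next = next;
            }
        }

        @SuppressWarnings("unchecked")
        Entry<K, V>[] table = (Entry<K, V>[]) new Entry[16];
        int size;

        // The new entry is linked in at the head of the bucket's chain,
        // pointing at whatever entry (or null) was there before.
        void addEntry(int hash, K key, V value, int bucketIndex) {
            table[bucketIndex] = new Entry<>(hash, key, value, table[bucketIndex]);
            size++;
            // The real code also resizes the table once size crosses a threshold.
        }
    }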

But the difference between LinkedHashMap and HashMap is that LinkedHashMap maintains the insertion order. From the docs:

This linked list defines the iteration ordering, which is normally the order in which keys were inserted into the map (insertion-order). Note that insertion order is not affected if a key is re-inserted into the map. (A key k is reinserted into a map m if m.put(k, v) is invoked when m.containsKey(k) would return true immediately prior to the invocation).

Trying