0

I cannot understand the use of HashFunction in LinkedHashMap.

In the HashMap implementation, the use of hashFunction is to find the index of the internal array, which can be justified, following the hashfunction contract (same key will must have same hashcode, but distinct key can have same hashcode).

My questions are:

1) What is the use of hashfunction in LinkedHashMap?

2) How does the put and get method works for LinkedHashMap?

3) Why does it maintains the doublylinkedlist internally? Whats wrong in using the HashMap as internal implementation(just like HashSet) and maintain a separate Array/List of indexes of the Entry array in the sequence of insertion?

Appreciate useful response and references.

Sashi Kant
  • 13,277
  • 9
  • 44
  • 71

4 Answers4

2

1) LinkedHashMap extends HashMap so the hashfunction is the same of HashMap (if you check the code the hash function is inherited from HashMap), i.e. the function computes a the hash of the object inserted and it use to store in a data structure together with the elements with the same key hash; the hasfunction is used in the get method to retrieve the object with the key specified as a param.

2)Put and Get method are behave the same way as HashMap plus the track the insertion order of the elements so when you iterate over the the keyset you get the key values in the order you inserted into the map (see here for more details)

3)the LinkedHashMap uses a double linked list instead of an Array because it's more compact; a double linked list is the the most efficient data structure for list where you insert and remove items; if you mostly insert/append elements then an array based implementation may be better. Since the map sematic is a key-value implementation and removing elements from the map could be a frequent operation a double linked list is a better fit. The internal implmentation could be made with a LinkedList but my opionion is that using a low level data stucture is more efficient and decouples LinkedHashMap from other classes.

Giovanni
  • 3,951
  • 2
  • 24
  • 30
1

A LinkedHashMap does use a HashMap (in fact it extends from it), so the hashCode is used to identify the right hash bucket in the array of hash buckets, just as for HashMap. put and get work just as for HashMap (except that the before and after references for iterating over the entries are updated differently for the two implementations).

The reason insertion order is not kept by keeping an Array or ArrayList is that addition or removal in the middle of an ArrayList is an O(n) operation because you have to move all subsequent items along one place. You could do this with a LinkedList because addition and removal in the middle of a LinkedList is O(1) (all you have to do is break a few links and make a few new ones). However there's no point using a separate LinkedList because you may as well make the Map.Entry objects reference the previous and next Entry objects, which is exactly how LinkedHashMap works.

Paul Boddington
  • 37,127
  • 10
  • 65
  • 116
  • Thanks for your answer, so for optimizing the time complexity they have compromised with the space complexity. My point was, since a user cannot specify the location of insert, the put method will insert the object at the tail. So for that case, ArrayList is too having the same complexity – Sashi Kant Apr 15 '15 at 06:38
  • 1
    @SashiKant You're right about space complexity. I read that `LinkedHashMap` is one of the most memory-hungry of the standard collections. It's *really* fast though. You can iterate over a `LinkedHashSet` much faster than an ordinary `HashSet`. If you want to be able to control the order yourself, you would indeed need to maintain a separate `List`, but as @TimBiegeleisen points out, trying to keep two data structures in sync is often a very bad idea. – Paul Boddington Apr 15 '15 at 06:42
  • Yes, there is a pros and a cons in both the approach. They chose LinkedList. Appreciate if you can also answer my Question 1 and 2 :) – Sashi Kant Apr 15 '15 at 06:44
  • @SashiKant I thought I had answered 1) and 2)? – Paul Boddington Apr 15 '15 at 06:45
  • So please correct me if I am wrong, `put` and `get` method is same as in `HashMap`. The difference is present in the Entry Object which has 2 more references as its instance variables, pointing to its previous and next elements(sequence). I would be very grateful to you, if you can highlight me what happens, when `put`, `get` and `entrySet` methods are called. I think the downvoter dont have the courage to defend his/her action. Thanks once again for replying – Sashi Kant Apr 15 '15 at 06:58
1

LinkedHashMap is a good choice for a data structure where you want to be able to put and get entries with O(1) running time, but you also need the behavior of a LinkedList. The internal hashing function is what allows you put and get entries with constant-time.

Here is how you use LinkedHashMap:

Map<String, Double> linkedHashMap = new LinkedHashMap<String, String>();
linkedHashMap.put("today", "Wednesday");
linkedHashMap.put("tomorrow", "Thursday");
String today = linkedHashMap.get("today"); // today is 'Wednesday'

There are several arguments against using a simple HashMap and maintaining a separate List for the insertion order. First, if you go this route it means you will have to maintain 2 data structures instead of one. This is error prone, and makes maintaining your code more difficult. Second, if you have to make your data structure Thread-safe, this would be complex for 2 data structures. On the other hand, if you use a LinkedHashMap you only would have to worry about making this single data structure thread-safe.

As for implementation details, when you do a put into a LinkedHashMap, the JVM will take your key and use a cryptographic mapping function to ultimately convert that key into a memory address where your value will be stored. Doing a get with a given key will also use this mapping function to find the exact location in memory where the value be stored. The entrySet() method returns a Set consisting of all the keys and values in the LinkedHashMap. By definition, sets are not ordered. The entrySet() is not guaranteed to be Thread-safe.

Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
  • Yes, there is a pros and a cons in both the approach. They chose LinkedList. Appreciate if you can also answer my Question 1 and 2 :) – Sashi Kant Apr 15 '15 at 06:44
  • @SashiKant I answered parts 1 and 2...what I am leaving out if I may know? – Tim Biegeleisen Apr 15 '15 at 06:51
  • Your answer is very helpful, So please correct me if I am wrong, `put` and `get` method is same as in `HashMap`. The difference is present in the Entry Object which has 2 more references as its instance variables, pointing to its previous and next elements(sequence). I would be very grateful to you, if you can highlight me what happens, when `put`, `get` and `entrySet` methods are called.Thanks once again for replying :) – Sashi Kant Apr 15 '15 at 06:58
  • @SashiKant - Not exactly. `put()` call on `LinkedHashMap` is actually directly delegated to the underlying `HashMap` (which `LinkedHashMap extends`). `get()` is a little complex because the `after` and `before` references have to be *updated* – TheLostMind Apr 15 '15 at 07:02
  • @TheLostMind Don't you mean that the other way around? – Paul Boddington Apr 15 '15 at 07:05
  • @pbabcdefp - That's the *funny* part. LHM doesn't define a `put()` but defines a `get()` – TheLostMind Apr 15 '15 at 07:07
  • @TheLostMind Ah yes, I always forget that a `LinkedHashSet` can be ordered by access (why would anyone do that?!). Even though `LinkedHashMap` doesn't override `put`, the method `newNode` of `HashMap.Entry` that `put` uses is overridden by `LinkedHashMap.Entry`, so `before` and `after` references are updated by `put` too. – Paul Boddington Apr 15 '15 at 07:17
  • 2
    @TimBiegeleisen Now your answer has been downvoted! This is absurd. – Paul Boddington Apr 15 '15 at 07:23
  • So if it doesnt do anything different in `put`, so how the `next` and `previous` references gets set? – Sashi Kant Apr 15 '15 at 07:25
  • Setting `next` and `previous` should not be any different than with a standard linked list. The address of the new node is known from the mapping function, and the address of the end node is also available. – Tim Biegeleisen Apr 15 '15 at 07:29
  • 1
    @SashiKant You need to read the source code. It's a work of genius (but too complex to explain in comments). `LinkedHashMap` actually only overrides 4 methods from HashMap despite behaving completely differently. It's the `Map.Entry` objects that maintain `before` and `after` entries. – Paul Boddington Apr 15 '15 at 07:29
  • Any references other than the source code will be very helpful. And as @pbabcdefp said, it is a work of genius and too complex. Why dont you both write a blog on LinkedHashMap, I bet this will be very helpful to folks like us :) – Sashi Kant Apr 15 '15 at 07:44
  • @SashiKant I don't know what you mean by that. I My comment about it being too complex wasn't meant to be patronising. In fact, I don't understand `HashMap` at all. I don't understand how the index in the array can possibly be a function of the `hashCode`. I'm in awe of the people who worked all this stuff out and made it work so unbelievably fast. The only reason I didn't say much about 1) and 2) is that your question implies you understand the way `hashCode` is used to find the right hash bucket, so I just said it works exactly the same for `LinkedHashMap` (which it does). – Paul Boddington Apr 15 '15 at 07:57
  • There is no cryptographic hash function in `HashMap`. It simply uses the `hashCode()` of the passed object. See e.g. [How HashTable and HashMap key-value are stored in the memory?](http://stackoverflow.com/questions/10894558/how-hashtable-and-hashmap-key-value-are-stored-in-the-memory/10895054#10895054) and then [HashMap implementation in Java. How does the bucket index calculation work?](http://stackoverflow.com/questions/10879302/hashmap-implementation-in-java-how-does-the-bucket-index-calculation-work/10879475#10879475) – Petr Janeček Apr 15 '15 at 10:47
-2

Ans. 2) when we call put(map,key) of linkedhashmap. Internally it calls createEntry

void createEntry(int hash, K key, V value, int bucketIndex) {
HashMap.Entry<K,V> old = table[bucketIndex];
Entry<K,V> e = new Entry<K,V>(hash, key, value, old);
table[bucketIndex] = e;
e.addBefore(header);
size++;

Ans 3) To efficiently maintain a linkedHashmap, you actually need a doubly linked list.

Consider three entries in order

A ---> B ---> C

Suppose you want to remove B. Obviously A should now point to C. But unless you know the entry before B you cannot efficiently say which entry should now point to C. To fix this, you need entries to point in both the directions Like this

---> --->

A B C

<--- <---

This way, when you remove B you can look at the entries before and after B (A and C) and update so that A and C point to each other. similar post in this link discussed earlier

why linkedhashmap maintains doubly linked list for iteration

Community
  • 1
  • 1
Shayan Ghosh
  • 882
  • 6
  • 14