31

A few answers on SO mention that the get method in a HashMap can fall into an infinite loop (e.g. this one or this one) if not synchronized properly (and usually the bottom line is "don't use a HashMap in a multi-threaded environment, use a ConcurrentHashMap").

While I can easily see why concurrent calls to the HashMap.put(Object) method can cause an infinite loop, I can't quite see why the get(Object) method can get stuck when it tries to read a HashMap that is being resized at that very moment. I had a look at the implementation in openjdk and it contains a cycle, but the exit condition e != null should be fulfilled sooner or later. How can it loop forever? A piece of code that is mentioned explicitly to be vulnerable to this issue is:

public class MyCache {
    private Map<String,Object> map = new HashMap<String,Object>();

    public synchronized void put(String key, Object value){
        map.put(key,value);
    }

    public Object get(String key){
        // can cause in an infinite loop in some JDKs!!
        return map.get(key);
    }
}

Can someone explain how a thread putting an object into the HashMap and another reading from it can interleave in such a way that an infinite loop is generated? Is it the case that it has to do with some cache coherency issue or CPU instruction reordering (so the problem can only happen on a multi-processor machine)?

Community
  • 1
  • 1
UndefinedBehavior
  • 836
  • 1
  • 11
  • 20
  • Can you actually compile it and make it run forever? seems like an exception will be thrown much more than infinite loop – David Haim Feb 21 '16 at 10:28
  • Why don't your just "lock" your map using `AtomicReference`? You will get rest of not-thread-safe issues. –  Feb 21 '16 at 10:31
  • 7
    This exercise is pointless. A HashMap is not thread safe, and getting an object out of it while another thread writes to it, even if it never goes into an infinite loop, might return a wrong result, corrupt the HashMap, throw an exception, or whatever. Why would you want any of that to happen? Just synchronize the get method: it's necessary to make the code thread-safe. – JB Nizet Feb 21 '16 at 10:49
  • 1
    @DavidHaim I didn't try to run it: I am interested to know if the claim is true that a call to "put" by one thread and a simultaneous call to "get" by another thread CAN cause an infinite loop, and if it is true I wonder how it is possible: e.g. how must the instructions in the two threads interleave? On what architecture can this happen? By looking at the implementation in openjdk I don't see how this Java code can be translated to machine instructions in a way that can cause an infinite loop. Can anybody explain this? – UndefinedBehavior Feb 21 '16 at 13:38
  • @Kilanny I don't want to "fix" this code: if I wanted I could just use a `SynchronizedHashMap` or synchronize the `get` method. I want to understand if it is true that a call to "get" can end up in an infinite loop because for what I can see there is no way that this Java code can be translated to machine code in a way that can create an infinite loop. – UndefinedBehavior Feb 21 '16 at 13:45
  • 5
    @JBNizet this is not an "exercise", it is a precise question: can a call to the "get" method end up in an infinite loop under some jdk implementation on some architecture or it is impossible? The answer I mentioned claim that this CAN happen but don't explain how. I know well that no shared object should be read while it can be modified by another thread (which means: always synchronize accesses to your modifiable shared objects, or make them atomic). I just want to understand why people claim this can cause an infinite loop, why is it pointless? Understanding is never pointless. – UndefinedBehavior Feb 21 '16 at 13:56
  • The [infinite loop](http://mailinator.blogspot.com/2009/06/beautiful-race-condition.html) is due to a resize and a corrupted view of the link lists. – Ben Manes Feb 21 '16 at 22:53
  • @BenManes I read that. It explains in detail how a call to (unsynchronized) "put" method can end up in an infinite loop, if another thread invokes "put" at the same time. I would like to understand how a call to "get" method can end up looping forever if another thread calls "put" at the same time. – UndefinedBehavior Feb 22 '16 at 12:17
  • @UndefinedBehavior The same issue of a loop when searching a bucket. That might require concurrent puts to trigger it. As the behavior is undefined and depends on the hardware's memory model, it is difficult to provide a conclusive answer. – Ben Manes Feb 22 '16 at 17:44
  • @BenManes _"As the behavior is undefined and depends on the hardware's memory model, it is difficult to provide a conclusive answer."_ I agree, it probably depends on the underlying hardware architecture (unlike the "put"/"put" race condition, which is purely a "software" issue due to threads interleaving in a given sequence). – UndefinedBehavior Feb 23 '16 at 10:08
  • @BenManes _"That might require concurrent puts to trigger it."_ Yes, after trying to reproduce this issue myself, I also tend to think that concurrent "puts" are needed in order to see that behavior: without concurrent "puts", the worse think that can happen to a "get" apparently is to retrieve bogus data, but not to end up in an infinite loop. – UndefinedBehavior Feb 23 '16 at 10:16
  • @UndefinedBehavior - Have you gone through this blog (https://javabypatel.blogspot.co.uk/2016/01/infinite-loop-in-hashmap.html) related to map and get and put? I believe this answers your question. – chitresh sirohi Apr 03 '17 at 09:40
  • @chitreshsirohi The situation that Patel describes in his blog has two concurrent puts, both of which trigger resizes. So the OP's question still stands. – Eyal May 21 '17 at 13:36

3 Answers3

18

You link is for the HashMap in Java 6. It was rewritten in Java 8. Prior to this rewrite an infinite loop on get(Object) was possible if there were two writing threads. I am not aware of a way the infinite loop on get can occur with a single writer.

Specifically, the infinite loop occurs when there are two simultaneous calls to resize(int) which calls transfer:

 void transfer(Entry[] newTable, boolean rehash) {
    int newCapacity = newTable.length;
    for (Entry<K,V> e : table) {
         while(null != e) {
             Entry<K,V> next = e.next;
             if (rehash) {
                 e.hash = null == e.key ? 0 : hash(e.key);
             }
             int i = indexFor(e.hash, newCapacity);
             e.next = newTable[i];
             newTable[i] = e;
             e = next;
         }
     }
 }

This logic reverses the ordering of the nodes in the hash bucket. Two simultaneous reversals can make a loop.

Look at:

             e.next = newTable[i];
             newTable[i] = e;

If two threads are processing the same node e, then first thread executes normally but the second thread sets e.next = e, because newTable[i] has already been set to e by the first thread. The node e now points to itself, and when get(Object) is called it enters an infinite loop.

In Java 8, the resize maintains the node ordering so a loop cannot occur in this fashion. You can lose data though.

The Iterators for LinkedHashMap class can get stuck in an infinite loop when there are multiple readers and no writers when access ordering is being maintained. With multiple readers and access order every read removes and then inserts the accessed node from a double linked list of nodes. Multiple readers can lead to the same node being reinserted more than once into the list, causing a loop. Again the class has been rewritten for Java 8 and I do not know if this issue still exists or not.

Simon G.
  • 6,587
  • 25
  • 30
9

Situation:

The default capacity of HashMap is 16 and Load factor is 0.75, which means HashMap will double its capacity when 12th Key-Value pair enters in the map (16 * 0.75 = 12).

When 2 thread tries to access HashMap simultaneously, then you may encounter infinite loop. Thread 1 and Thread 2 tries to put 12th key-value pair.

Thread 1 got execution chance:

  1. Thread 1 tries to put 12th key-value pair,
  2. Thread 1 founds that Threshold limit is reached and it creates new Buckets of increased capacity. So map's capacity is increased from 16 to 32.
  3. Thread 1 now transfers all existing key-value pairs to new buckets.
  4. Thread 1 points to first key-value pair and next(second) key-value pair to start transfer process.

Thread 1 after pointing to key-value pairs and before starting the transfer process, loose the control and Thread 2 got a chance for execution.

Thread 2 got execution chance:

  1. Thread 2 tries to put 12th key-value pair,
  2. Thread 2 founds that Threshold limit is reached and it creates new Buckets of increased capacity. So map's capacity is increased from 16 to 32.
  3. Thread 2 now transfers all existing key-value pairs to new buckets.
  4. Thread 2 points to first key-value pair and next(second) key-value pair to start transfer process.
  5. While transferring key-value pairs from old buckets to new buckets, key-value pairs will be reversed in new buckets because hashmap will add key-value pairs at the start and not at the end. Hashmap adds new key-value pairs at start to avoid traversing linked list every time and keep constant performance.
  6. Thread 2 will transfer all key-value pairs from old buckets to new buckets and Thread 1 will get chance for execution.

Thread 1 got execution chance:

  1. Thread 1 before leaving control was pointing to first element and next element of old bucket.
  2. Now when Thread 1 started putting key-value pairs from old bucket to new bucket. It successfully puts (90, val) and (1, val) in new Bucket.
  3. When it tries to add next element of (1, val) which is (90, val) into new Bucket, it will end up in infinite loop.

Solution:

To solve this either use a Collections.synchronizedMap or ConcurrentHashMap.

ConcurrentHashMap is thread-safe that is the code can be accessed by single thread at a time.

HashMap can be synchronized by using Collections.synchronizedMap(hashMap) method. By using this method we get a HashMap object which is equivalent to the HashTable object. So every modification is performed on Map is locked on Map object.

Avijit Karmakar
  • 8,890
  • 6
  • 44
  • 59
  • Which version of HashMap is this describing? The rewritten version in Java8? I’m curious if the infinite loop can still happen with HashMap in Java8+. – Kohei Nozaki Jan 11 '22 at 00:02
1

Given that the only possibility I see for an infinite loop would be e.next = e within the get method:

for (Entry<K,V> e = table[indexFor(hash, table.length)]; e != null; e = e.next)

And that could only happen in the transfer method during a resizing:

 do {
     Entry<K,V> next = e.next;
     int i = indexFor(e.hash, newCapacity);
     e.next = newTable[i]; //here e.next could point on e if the table is modified by another thread
     newTable[i] = e;
     e = next;
 } while (e != null);

If only one thread is modifying the Map, I believe it is quite impossible to have an infinite loop with just one thread. It was more obvious with the old implementation of get before the jdk 6 (or 5):

public Object get(Object key) {
        Object k = maskNull(key);
        int hash = hash(k);
        int i = indexFor(hash, table.length);
        Entry e = table[i]; 
        while (true) {
            if (e == null)
                return e;
            if (e.hash == hash && eq(k, e.key)) 
                return e.value;
            e = e.next;
        }
    }

Even then the case still seems pretty improbable except if there are a lot of collisions.

P.S: I'd love to be proven wrong though!

Adonis
  • 4,670
  • 3
  • 37
  • 57