2

Given the ConcurrentHashMap javadocs state:

"Iterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator"

I think I am guarranteed that in the example below one or both threads will call fireAllFinished(). Can there ever be a case where neither calls fireAllFinished()?

ConcurrentHashMap<String, Boolean> taskToFinished = new ConcurrentHashMap();
    taskToFinished.put("taskA", false);
    taskToFinished.put("taskB", false);

public void checkForAllFinished() {
    boolean allFinished = true;
    for (Boolean taskFinished = tasksToFinished.values()) {
        if (!taskFinished) {
            allFinished = false;
            break;
        }
    }
    if (allFinished) {
       fireAllFinished()
    }
}

//Thread1
public void run() {
    taskToFinished.put("taskA", true);
    checkForAllFinished();
}

//Thread1
public void run() {
    taskToFinished.put("taskB", true);
    checkForAllFinished();
}

(I've omitted some of the thread creation code. I hope the intent is clear)

update: I already saw this more general question: Is iterating ConcurrentHashMap values thread safe?, but wanted to confirm my specific point as

"at some point"

is generally an imprecise concept when dealing with multi-core machines running code out of order, two threads may update different segments of the map 'at the same time', and there is by design no way to lock the entire ConcurrentHashMap.

Community
  • 1
  • 1
barclar
  • 523
  • 4
  • 11
  • This program won't work. The method `checkForAllFinished` has a local variable `allFinished` which is local. Local variables are always thread safe because they are on the stack and can never be seen by other threads. For that reason, your `checkForAllFinished` method will not work as you expect. If you make `allFinished` an instance field, it will need to be either synchronized or made `volatile` in order to prevent stale data. – scottb Oct 17 '15 at 20:26
  • @scottb what you said isn't incorrect. – Jason Oct 17 '15 at 20:33
  • If you're waiting for a set of tasks to complete, why aren't you using `Future`? – Jason Oct 17 '15 at 20:35
  • Possible duplicate of [Is iterating ConcurrentHashMap values thread safe?](http://stackoverflow.com/questions/3768554/is-iterating-concurrenthashmap-values-thread-safe) – gmcontessa Oct 17 '15 at 21:35
  • @Jason - I created this simplified example to focus on a specific point. In the real system each task has a workflow of states with overall batch status derived from the tasks, and each task can be rerun. Future might help but I'd still appreciate an answer to the original question. – barclar Oct 18 '15 at 19:30
  • @gmconte - thanks but I've updated the question to explain why I don't think so. – barclar Oct 18 '15 at 19:32

1 Answers1

2

Reading the documentation for ConcurrentHashMap...

Retrievals reflect the results of the most recently completed update operations holding upon their onset. (More formally, an update operation for a given key bears a happens-before relation with any (non-null) retrieval for that key reporting the updated value.)

and

For aggregate operations such as putAll and clear, concurrent retrievals may reflect insertion or removal of only some entries. Similarly, Iterators, Spliterators and Enumerations return elements reflecting the state of the hash table at some point at or since the creation of the iterator/enumeration.

It's not worded clearly, but what most recently completed and at or since are supposed to mean that map operations and iterator creation are sequentially consistent.

Using your example, if we call map put A, and value check B you have...

T1: A -> B

T2: A -> B

A happens before B, but T1 and T2 happen concurrently. What sequentially consistent means is some valid sequence between the two needs to occur as long as A happens before B. However, any sequencing between T1 and T2 is valid.

e.g.

T1:A -> T1:B -> T2:A -> T2:B

T1:A -> T2:A -> T2:B -> T1:B

So when the code actually runs, any valid ordering can happen but, T1:B or T2:B (the check) must happen last. So, fireAllFinished gets called either once or twice. Linear synchronization would restrict that even further to an explicit ordering between all the events.

Iterating over the entire map may be a bit expensive though, and it may be simpler to just use an AtomicInteger or another synchronization mechanism like a ConcurrentLinkedQueue.

Jason
  • 3,777
  • 14
  • 27
  • I agree with Jason. The iterator will guarantee the map snapshot for all the completed operations only and NOT GUARANTEE that you will see later changes within the same iterator. So to answer your example, it is possible that your program won't call fireAllFinished method on both threads because one of them could have updated the map after the iterator was created – gmcontessa Oct 19 '15 at 07:21
  • I'm only concerned about fireAllFinished getting called zero times. If it's called once or twice that's fine. – barclar Oct 19 '15 at 07:59
  • I think linearly synchronized means there never be a case where neither thread calls fireAllFinished(). – barclar Oct 19 '15 at 11:58
  • @barclar I read through the source, and iterator is only sequentially consistent, but this still ensures the method will be executed at least once. I'm updating the answer. – Jason Oct 19 '15 at 16:23