3

I'm having a bit of trouble concerning concurrency and maps in Java. Basically I have multiple threads using (reading and modifying) their own maps, however each of these maps is a part of a larger map which is being read and modified by a further thread:

My main method creates all threads, the threads create their respective maps which are then put into the "main" map:

Map<String, MyObject> mainMap = new HashMap<String, Integer>();
FirstThread t1 = new FirstThread();
mainMap.putAll(t1.getMap());
t1.start();
SecondThread t2 = new SecondThread();
mainMap.putAll(t2.getMap());
t2.start();
ThirdThread t3 = new ThirdThread(mainMap);
t3.start();

The problem I'm facing now is that the third (main) thread sees arbitrary values in the map, depending on when one or both of the other threads update "their" items. I must however guarantee that the third thread can iterate over - and use the values of - the map without having to fear that a part of what is being read is "old":

FirstThread (analogue to SecondThread):

for (MyObject o : map.values()) {
    o.setNewValue(getNewValue());
}

ThirdThread:

for (MyObject o : map.values()) {
    doSomethingWith(o.getNewValue());
}

Any ideas? I've considered using a globally accessible (static final Object through a static class) lock which will be synchronized in each thread when the map must be modified. Or are there specific Map implementations that assess this particular problem which I could use?

Thanks in advance!

Edit: As suggested by @Pyranja, it would be possible to synchronize the getNewValue() method. However I forgot to mention that I am in fact trying to do something along the lines of transactions, where t1 and t2 modify multiple values before/after t3 works with said values. t3 is implemented in such a way that doSomethingWith() will not actually do anything with the value if it hasn't changed.

phex
  • 104
  • 3
  • 10
  • Sounds like the third thread needs to wait for the first two. Note: you only really need one additional thread if this is the case. – Peter Lawrey Jan 30 '13 at 20:54
  • pretty sure you'll need to use volatile in there (along with synchronized) if you want to ensure that part of what is being read is the latest value. – Dhruv Gairola Jan 30 '13 at 20:54
  • 1
    The behavior you're seeing isn't a result of any synchronization problem. Modifications to t1's map and t2's map won't ever be visible in mainMap, since before t1 and t2 start you're _copying_ their initial contents into mainMap. (If you used a view of the two maps, then a synchronization issue would arise, but as it is there's no communication to synchronize.) – jacobm Jan 30 '13 at 21:00
  • 1
    @jacobm I think he's not modifying the map itself (put/remove), he's modifying the value objects inside the map. Those changes WOULD be visible in the mainMap, once the theads have been synched. – sharakan Jan 30 '13 at 21:29
  • @sharakan Ah, I was reading "setNewValue" as a shorthand for "replace the value in the map with a different value", but if you're imagining that "setNewValue" modifies the mapped value itself then I agree with you. – jacobm Jan 30 '13 at 21:50

3 Answers3

3

To synchronize at a higher level than the individual value objects, you need locks to handle the synchronization between the various threads. One way to do this, without changing your code too much, is a ReadWriteLock. Thread 1 and Thread 2 are writers, Thread 3 is a reader.

You can either do this with two locks, or one. I've sketched out below doing it with one lock, two writer threads, and one reader thread, without worrying about what happens with an exception during data update (ie, transaction rollback...).

All that said, this sounds like a classic producer-consumer scenario. You should consider using something like a BlockingQueue for communication between threads, as is outlined in this question.

There's other things you may want to consider changing as well, like using Runnable instead of extending Thread.

private static final class Value {

    public void update() {

    }

}

private static final class Key {

}

private final class MyReaderThread extends Thread {

    private final Map<Key, Value> allValues;

    public MyReaderThread(Map<Key, Value> allValues) {
        this.allValues = allValues;
    }

    @Override
    public void run() {
        while (!isInterrupted()) {
            readData();
        }
    }

    private void readData() {
        readLock.lock();
        try {
            for (Value value : allValues.values()) {
                // Do something
            }
        }
        finally {
            readLock.unlock();
        }

    }
}

private final class WriterThread extends Thread {

    private final Map<Key, Value> data = new HashMap<Key, Value>();

    @Override
    public void run() {
        while (!isInterrupted()) {
            writeData();
        }
    }

    private void writeData() {
        writeLock.lock();

        try {
            for (Value value : data.values()) {
                value.update();
            }
        }
        finally {
            writeLock.unlock();
        }
    }
}

private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();

private final ReadLock readLock;
private final WriteLock writeLock;

public Thing() {
    readLock = lock.readLock();
    writeLock = lock.writeLock();
}

public void doStuff() {
    WriterThread thread1 = new WriterThread();
    WriterThread thread2 = new WriterThread();

    Map<Key, Value> allValues = new HashMap<Key, Value>();
    allValues.putAll(thread1.data);
    allValues.putAll(thread2.data);
    MyReaderThread thread3 = new MyReaderThread(allValues);

    thread1.start();
    thread2.start();
    thread3.start();
}
Community
  • 1
  • 1
sharakan
  • 6,821
  • 1
  • 34
  • 61
  • Yes! This is exactly what I was looking for! I want to +1 you, but i lack the reputation :( – phex Jan 30 '13 at 23:35
  • @phex Well, get me back when you can! ;) – sharakan Jan 30 '13 at 23:40
  • will do :) A bit of feedback concerning the BlockingQueue: yes, that would very well have been the most reasonable choice instead of the HashMap. Unfortunately this particular piece of software is so far into development that I can't replace the Map anymore :( But I'll definitely keep the BlockingQueue in mind for another project! – phex Jan 30 '13 at 23:52
  • @phex Also, you should consider using Runnables instead of Thread subclasses. see my updated answer. – sharakan Jan 31 '13 at 00:33
  • I have to admit that I simplified the dummy-code in my question a bit. The real code does use Runnables :-) – phex Jan 31 '13 at 12:20
2

ConcurrentHashMap from java.util.concurrent - a thread-safe implementation of Map, which provides a much higher degree of concurrency than synchronizedMap. Just a lot of reads can almost always be performed in parallel, simultaneous reads and writes can usually be done in parallel, and multiple simultaneous recordings can often be done in parallel. (The class ConcurrentReaderHashMap offers a similar parallelism for multiple read operations, but allows only one active write operation.) ConcurrentHashMapis designed to optimize the retrieval operations.

  • 1
    +1 for java.util.concurrent. Most of the easy answers to concurrency problems can be found there. – Aurand Jan 30 '13 at 21:07
  • Yes, but I think the problem here is not so much synching the maps, as the (mutable) value objects within them. Based on the OP's code that's been posted, the maps themselves don't change, the value objects do. – sharakan Jan 30 '13 at 21:33
  • good thing here is that ConcurrentHashMap.get() is a volatile read. – Ralf H Jan 30 '13 at 21:37
1

Your example code may be misleading. In your first example you create a HashMap<String,Integer> but the second part iterates the map values which in this case are MyObject. The key to synchronization is to understand where and which mutable state is shared.

An Integer is immutable. It can be shared freely (but the reference to an Integer is mutable - it must be safely publicated and/or synchronized). But your code example suggests that the maps are populated with mutable MyObject instances.

Given that the map entries (key -> MyObject references) are not changed by any thread and all maps are created and safely publicated before any thread starts it would be in my opinion sufficient to synchronize the modification of MyObject. E.g.:

public class MyObject {
   private Object value;

   synchronized Object getNewValue() {
      return value;
   }

   synchronized void setNewValue(final Object newValue) {
      this.value = newValue;
   }
}

If my assumptions are not correct, clarify your question / code example and also consider @jacobm's comment and @Alex answer.

Pyranja
  • 3,529
  • 22
  • 24
  • +1 this sounds right to me. OP just needs to figure out whether he wants t1 and t2 to be completely done with an iteration before t3 accesses it. In which case he would need to do some synchronization at a higher level than MyObject. – sharakan Jan 30 '13 at 21:32
  • This was also something I considered. However the as @sharakan pointed out, I was missing an important part in the OP: What I'm doing within t1, t2 and t3 are something along the lines of transactions. I want to modify multiple values before I iterate over them with t3. – phex Jan 30 '13 at 22:14