10

I'm working with existing code that has an object store in the form of a ConcurrentHashMap. Within the map are stored mutable objects, use by multiple threads. No two threads try to modify an object at once by design. My concern is regarding the visibility of the modifications between the threads.

Currently the objects' code has synchronization on the "setters" (guarded by the object itself). There is no synchronization on the "getters" nor are the members volatile. This, to me, would mean that visibility is not guaranteed. However, when an object is modified it is re-put back into the map (the put() method is called again, same key). Does this mean that when another thread pulls the object out of the map, it will see the modifications?

I've researched this here on stackoverflow, in JCIP, and in the package description for java.util.concurrent. I've basically confused myself I think... but the final straw that caused me to ask this question was from the package description, it states:

Actions in a thread prior to placing an object into any concurrent collection happen-before actions subsequent to the access or removal of that element from the collection in another thread.

In relation to my question, do "actions" include the modifications to the objects stored in the map before the re-put()? If all this does result in visibility across threads, is this an efficient approach? I'm relatively new to threads and would appreciate your comments.

Edit:

Thank you all for you responses! This was my first question on StackOverflow and it has been very helpful to me.

I have to go with ptomli's answer because I think it most clearly addressed my confusion. To wit, establishing a "happens-before" relation doesn't necessarily affect modification visibility in this case. My "title question" is poorly constructed regarding my actual question described in the text. ptomli's answer now jives with what I read in JCIP: "To ensure all threads see the most up-to-date values of shared mutable variables, the reading and writing threads must synchronize on a common lock" (page 37). Re-putting the object back into the map doesn't provide this common lock for the modification to the inserted object's members.

I appreciate all the tips for change (immutable objects, etc), and I wholeheartedly concur. But for this case, as I mentioned there is no concurrent modification because of careful thread handling. One thread modifies an object, and another thread later reads the object (with the CHM being the object conveyer). I think the CHM is insufficient to ensure that the later executing thread will see the modifications from the first given the situation I provided. However, I think many of you correctly answered the title question.

Community
  • 1
  • 1
ryanlb1000
  • 198
  • 4
  • 8

6 Answers6

7

You call concurrHashMap.put after each write to an object. However you did not specified that you also call concurrHashMap.get before each read. This is necessary.

This is true of all forms of synchronization: you need to have some "checkpoints" in both threads. Synchronizing only one thread is useless.

I haven't checked the source code of ConcurrentHashMap to make sure that put and get trigger an happens-before, but it is only logical that they should.

There is still an issue with your method however, even if you use both put and get. The problem happens when you modify an object and it is used (in an inconsistent state) by the other thread before it is put. It's a subtle problem because you might think the old value would be read since it hasn't been put yet and it would not cause a problem. The problem is that when you don't synchronize, you are not guaranteed to get a consistent older object, but rather the behavior is undefined. The JVM can update whatever part of the object in the other threads, at any time. It's only when using some explicit synchronization that you are sure you are updating the values in a consistent way across threads.

What you could do:
(1) synchronize all accesses (getters and setters) to your objects everywhere in the code. Be careful with the setters: make sure that you can't set the object in an inconsistent state. For example, when setting first and last name, having two synchronized setters is not sufficient: you must get the object lock for both operations together.
or
(2) when you put an object in the map, put a deep copy instead of the object itself. That way the other threads will never read an object in an inconsistent state.

EDIT:
I just noticed

Currently the objects' code has synchronization on the "setters" (guarded by the object itself). There is no synchronization on the "getters" nor are the members volatile.

This is not good. As I said above synchronizing on only one thread is no synchronization at all. You might synchronize on all your writer threads, but who cares since the readers won't get the right values.

toto2
  • 5,306
  • 21
  • 24
5

I think this has been already said across a few answers but to sum it up

If your code goes

  • CHM#get
  • call various setters
  • CHM#put

then the "happens-before" provided by the put will guarantee that all the mutate calls are executed before the put. This means that any subsequent get will be guaranteed to see those changes.

Your problem is that the actual state of the object will not be deterministic because if the actual flow of events is

  • thread 1: CHM#get
  • thread 1: call setter
  • thread 2: CHM#get
  • thread 1: call setter
  • thread 1: call setter
  • thread 1: CHM#put

then there is no guarantee over what the state of the object will be in thread 2. It might see the object with the value provided by the first setter or it might not.

The immutable copy would be the best approach as then only completely consistent objects are published. Making the various setters synchronized (or the underlying references volatile) still doesn't let you publish consistent state, it just means that the object will always see the latest value for each getter on each call.

Matt
  • 8,367
  • 4
  • 31
  • 61
4

I think your question relates more to the objects you're storing in the map, and how they react to concurrent access, than the concurrent map itself.

If the instances you're storing in the map have synchronized mutators, but not synchronized accessors, then I don't see how they can be thread safe as described.

Take the Map out of the equation and determine if the instances you're storing are thread safe by themselves.

However, when an object is modified it is re-put back into the map (the put() method is called again, same key). Does this mean that when another thread pulls the object out of the map, it will see the modifications?

This exemplifies the confusion. The instance that is re-put into the Map will be retrieved from the Map by another thread. This is the guarantee of the concurrent map. That has nothing to do with visibility of the state of the stored instance itself.

ptomli
  • 11,730
  • 4
  • 40
  • 68
  • Perhaps I'm asking something stupid, but why the re-put? Is other thread removing that instance? – Mister Smith Oct 18 '11 at 15:26
  • @Mister Smith The re-put is to trigger an happens-before (flush the cache) so that all threads see the updated value. However, I don't think it quite works as intended (see my answer). – toto2 Oct 18 '11 at 15:31
  • @toto2 weird logic you're using there. The same instance is put back into the same key. The map can have no effect on the visibility of the state change of the stored instance. The concurrent guarantees made by the map are only that the correct instance is returned from a get (for example). As I point out, the OP question really relates to the thread safety of the stored instances. The map is only a diversion. As to why the authors *thought* they were re-putting? Who knows, but probably doesn't do what they think. – ptomli Oct 18 '11 at 15:38
  • 2
    @ptomli I disagree with your second sentence. In the most recent java memory model, when there is a "happens-before", everything is updated. Nonetheless the synchronization procedure is still flawed even if everything is updated, as I explain in my answer. – toto2 Oct 18 '11 at 15:49
3

My understanding is that it should work for all gets after the re-put, but this would be a very unsafe method of synchronization.

What happens to gets that happen before the re-put, but while modifications are happening. They may see only some of the changes, and the object would have an inconsistent state.

If you can, I'd recommend store immutable objects in the map. Then any get will retrieve a version of the object that was current when it did the get.

Michael Krussel
  • 2,586
  • 1
  • 14
  • 16
2

That's a code snippet from java.util.concurrent.ConcurrentHashMap (Open JDK 7):

  919       public V get(Object key) {
  920           Segment<K,V> s; // manually integrate access methods to reduce overhead
  921           HashEntry<K,V>[] tab;
  922           int h = hash(key.hashCode());
  923           long u = (((h >>> segmentShift) & segmentMask) << SSHIFT) + SBASE;
  924           if ((s = (Segment<K,V>)UNSAFE.getObjectVolatile(segments, u)) != null &&
  925               (tab = s.table) != null) {
  926               for (HashEntry<K,V> e = (HashEntry<K,V>) UNSAFE.getObjectVolatile
  927                        (tab, ((long)(((tab.length - 1) & h)) << TSHIFT) + TBASE);
  928                    e != null; e = e.next) {
  929                   K k;
  930                   if ((k = e.key) == key || (e.hash == h && key.equals(k)))
  931                       return e.value;
  932               }
  933           }
  934           return null;
  935       }

UNSAFE.getObjectVolatile() is documented as getter with internal volatile semantics, thus the memory barrier will be crossed when getting the reference.

b_erb
  • 20,932
  • 8
  • 55
  • 64
1

yes, put incurs a volatile write, even if key-value already exists in the map.

using ConcurrentHashMap to publish objects across thread is pretty effecient. Objects should not be modified further once they are in the map. (They don't have to be strictly immutable (with final fields))

irreputable
  • 44,725
  • 9
  • 65
  • 93
  • I think his code does modify the objects after they are initially put in the map. My answer was to put deep copies of the objects in the map, so those won't be modified. – toto2 Oct 18 '11 at 15:37