3

I'm writing an analogue of DatabaseConfiguration class which reads configuration from database and I need some advice regards synchronization. For example,

public class MyDBConfiguration{
   private Connection cn;
   private String table_name;
   private Map<String, String> key_values = new HashMap<String,String>();
   public MyDBConfiguration (Connection cn, String table_name) {
      this.cn = cn;
      this.table_name = table_name;
      reloadConfig();
   }
   public String getProperty(String key){
       return this.key_values.get(key);
   }
   public void reloadConfig() {
      Map<String, String> tmp_map = new HashMap<String,String> ();
      // read  data from database
      synchronized(this.key_values)
      {
          this.key_values = tmp_map;
      }
   }
}

So I have a couple questions.
1. Assuming properties are read only , do I have use synchronize in getProperty ?
2. Does it make sense to do this.key_values = Collections.synchronizedMap(tmp_map) in reloadConfig?

Thank you.

a1ex07
  • 36,826
  • 12
  • 90
  • 103

3 Answers3

5

If multiple threads are going to share an instance, you must use some kind of synchronization.

Synchronization is needed mainly for two reasons:

  • It can guarantee that some operations are atomic, so the system will keep consistent
  • It guarantees that every threads sees the same values in the memory

First of all, since you made reloadConfig() public, your object does not really look immutable. If the object is really immutable, that is, if after initialization of its values they cannot change (which is a desired property to have in objects that are shared).

For the above reason, you must synchronize all the access to the map: suppose a thread is trying to read from it while another thread is calling reloadConfig(). Bad things will happen.

If this is really the case (mutable settings), you must synchronize in both reads and writes (for obvious reasons). Threads must synchronize on a single object (otherwise there's no synchronization). The only way to guarantee that all the threads will synchronize on the same object is to synchronize on the object itself or in a properly published, shared, lock, like this:

// synchronizes on the in instance itself:
class MyDBConfig1 {
  // ...
  public synchronized String getProperty(...) { ... }
  public synchronized reloadConfig() { ... }
}

// synchronizes on a properly published, shared lock:
class MyDBConfig2 {
  private final Object lock = new Object();
  public String getProperty(...) { synchronized(lock) { ... } }
  public reloadConfig() { synchronized(lock) { ... } }
}

The properly publication here is guaranteed by the final keyword. It is subtle: it guarantees that the value of this field is visible to every thread after initialization (without it, a thread might see that lock == null, and bad things will happen).

You could improve the code above by using a (properly published) ReadWriteReentrantLock. It might improve concurrency a bit if that's a concern for you.

Supposing your intention was to make MyDBConfig immutable, you do not need to serialize access to the hash map (that is, you don't necessarily need to add the synchronized keyword). You might improve concurrency.

First of all, make reloadConfig() private (this will indicate that, for consumers of this object, it is indeed immutable: the only method they see is getProperty(...), which, by its name, should not modify the instance).

Then, you only need to guarantee that every thread will see the correct values in the hash map. To do so, you could use the same techniques presented above, or you could use a volatile field, like this:

class MyDBConfig {
  private volatile boolean initialized = false;
  public String getProperty(...) { if (initialized) { ... } else { throw ... } }
  private void reloadConfig() { ...; initialized = true; }
  public MyDBConfig(...) { ...; reloadConfig(); }
}

The volatile keyword is very subtle. Volatile writes and volatile reads have a happens-before relationship. A volatile write is said to happen-before a subsequent volatile read of the same (volatile) field. What this means is that all the memory locations that have been modified before (in program order) a volatile write are visible to every other thread after they have executed a subsequente volatile read of the same (volatile) field.

In the code above, you write true to the volatile field after all the values have been set. Then, the method reading values (getProperty(...)) begins by executing a volatile read of the same field. Then this method is guaranteed to see the correct values.

In the example above, if you don't publish the instance before the constructor finishes, it is guaranteed that the exception won't get thrown in the method getProperty(...) (because before the constructor finishes, you have written true to initialized).

Bruno Reis
  • 37,201
  • 11
  • 119
  • 156
  • Thanks for the answer. Will it also work if I just declare the underlying map `volatile` as proposed by John Vint ? – a1ex07 Nov 11 '11 at 15:38
  • No, not at all. Volatiles only guarantees that: (1) the value of the volatile field is visible at all times to all threads; and (2) if a thread modifies some other memory locations, then writes to a volatile field, and later on another thread reads from the same volatile field, then this second thread will see the current values for all the other modified memory locations. What Vint proposes won't work because the volatile field holding the hash map will be written *once*, and this before the values are written to it. These values might never be seen by other threads. – Bruno Reis Nov 11 '11 at 15:42
  • @Bruno If the map needs to be filled atomically then you would need to synchronize all the population and the assignment. But if the Map just needs to be published correctly like he is attempting in his example then declaring the field as volatile absolutely would work. "These values might never be seen by other threads. " Which values might not be seen, the ones in the Map? Because that isn't true. – John Vint Nov 11 '11 at 15:45
  • Vint, it won't work as you propose, everything is explained in my comment above. The reason is that you execute a volatile write *before* writing the values to the map. The subsequent access to the map, even if they perform a volatile read before accessing the map, the values stored in the map might not be available. To correct this, you would need to first instantiate a map in a local variable, then populate it, and only then assign this local map to the volatile field. In this case, the volatile write will happen AFTER the values have been set. Then, a volatile read of the map will work. – Bruno Reis Nov 11 '11 at 15:51
  • If you still don't agree that or understand why what you propose won't work, you must then study a bit about the memory model. A good starting point would be Java Concurrency in Practice, by Brian Goetz. – Bruno Reis Nov 11 '11 at 15:53
  • @Bruno that is what he is doing in his example. My answer relied on that assumption. `Map tmp_map = new HashMap (); // read data from database` Filling tmp_map and then assigning tmp_map to key_values is perfectly legal, and conforms with the happens-before relationship – John Vint Nov 11 '11 at 15:54
  • Vint, oh, now I see it in his example. I was talking specifically about you answer. Anyways, the OP doesn't seem to be very acquainted with threading problems, so to assume that he is aware that he needs to perform the volatile write only after he fills the map is a bit too extreme, I'd say. – Bruno Reis Nov 11 '11 at 15:58
  • That's a fair point. I updated my answer to clearly illustrate the importance of the assumption I made – John Vint Nov 11 '11 at 16:01
  • @Bruno Now that you've read the OP's example code the way that I did (i.e. tmp_map is filled before assigning to key_values), is my answer so very wrong? – ewan.chalmers Nov 11 '11 at 16:35
  • @Bruno Reis: I'm aware that inserting values into the map and reading from it in different threads is not thread-safe. This is why I create temporary map, populate it, and only then do a final assignment `this.key_values = tmp_map`. Anyway, thanks for your answer and discussion with John Vint. – a1ex07 Nov 11 '11 at 17:32
3
  1. Assuming that key_values will not be put to after reloadConfig you will need to synchronize access to both reads and writes of the map. You are violating this by only synchronizing on the assignment. You can solve this by removing the synchronized block and assigning the key_values as volatile.

  2. Since the HashMap is effectively read only I wouldn't assign Collections.synchronizedMap rather Collections.unmodifiableMap (this wouldn't effect the Map itself, just prohibit from accidental puts from someone else possible using this class).

Note: Also, you should never synchronize a field that will change. The results are very unpredictable.

Edit: In regards to the other answers. It is highly suggested that all shared mutable data must be synchronized as the effects are non-deterministic. The key_values field is a shared mutable field and assignments to it must be synchronized.

Edit: And to clear up any confusion with Bruno Reis. The volatilefield would be legal if you still fill the tmp_map and after its finished being filled assign it to this.key_values it would look like:

   private volatile Map<String, String> key_values = new HashMap<String,String>();

  ..rest of class 

   public void reloadConfig() {
      Map<String, String> tmp_map = new HashMap<String,String> ();
      // read  data from database

      this.key_values = tmp_map;
   }

You still need the same style or else as Bruno Reis noted it would not be thread-safe.

John Vint
  • 39,695
  • 7
  • 78
  • 108
  • Thank you for your answer. Just to make sure I understood you well, the reliable way is to : 1. Add a member, for instance, `Object key_value_lock = new Object()`. 2. Every time I access the underlying map(even for read), do it within `synchronize (key_value_lock)`block. Right? – a1ex07 Nov 11 '11 at 15:25
  • Yes that is a safe way of executing it. Again I would rather do `private volatile Map key_values=...` A volatile load is much faster then a synchronized load. – John Vint Nov 11 '11 at 15:28
  • Also, as a side note to be thread-safe. If the object can be declared final it is better to do so. `final Object key_value_lock`, `final Connection cn` and so forth. – John Vint Nov 11 '11 at 15:33
  • Thanks again, I'll try both approaches. I was a bit reluctant to use `volatile` in Java . – a1ex07 Nov 11 '11 at 15:36
  • The lock object **must** be final, otherwise it is broken (other threads might see the field null). – Bruno Reis Nov 11 '11 at 15:39
  • Why do you agree with a1ex07 that he should synchronize on an external lock for read access to the map? According to the javadoc of HashMap, external synchronization is required only if the map may be structurally changed (i.e. put or delete). – ewan.chalmers Nov 12 '11 at 15:25
-1

I would say that if you guarantee that no code will structurally modify your map, then there is no need to synchronize it.

If multiple threads access a hash map concurrently, and at least one of the threads modifies the map structurally, it must be synchronized externally. http://download.oracle.com/javase/6/docs/api/java/util/HashMap.html

The code you have shown provides only read access to the map. Client code cannot make a structural modification.

Since your reload method alters a temporary map and then changes key_values to point to the new map, again I'd say no synchronization is required. The worst that can happen is someone reads from old copy of the map.

I'm going to keep my head down and wait for the downvotes now ;)

EDIT

As suggested by Bruno, the fly in the ointment is inheritance. If you cannot guarantee that your class will not be sub-classed, then you should be more defensive.

EDIT

Just to refer back to the specific questions posed by the OP...

  1. Assuming properties are read only , do I have use synchronize in getProperty ?
  2. Does it make sense to do this.key_values = Collections.synchronizedMap(tmp_map) in reloadConfig?

... I am genuinely interested to know if my answers are wrong. So I won't give up and delete my answer for a while ;)

ewan.chalmers
  • 16,145
  • 43
  • 60
  • @sudocode. Your answer still would be wrong. The reason it is wrong isn't the mutability within the HashMap, rather the mutability of the HashMap field. It is possible for one thread to come in and see an outdated version of the key_values map even after another thread already wrote to that field. For the sake of correctness of multithreaded programming, you need to synchronize access to all reads and writes of shared mutable data. – John Vint Nov 11 '11 at 17:11
  • And also since the field is not `final` a thread can still initialize the MyDBConfiguration and possibly see a null key_values map. The double-checked locking idiom is a classic violator of this property. – John Vint Nov 11 '11 at 17:12
  • @John "The worst that can happen is someone reads from old copy of the map." Is that statement incorrect? – ewan.chalmers Nov 11 '11 at 17:59
  • Well the worst is actually a NullPointerException. But if you did `if(this.key_values != null)...` Then yes worst would be a read from the old map. Keep in mind though this reasoning doesn't hold true for all assignments. For example, `private long someLong = 10;` If you did `someLong = 5000000;` in one thread and `someLong = 10000000;` in another thread, a reading thread may see a combination of the two, you can read here for more information on that http://stackoverflow.com/questions/3463658/are-64-bit-assignments-in-java-atomic-on-a-32-bit-machine/3463673#3463673 – John Vint Nov 11 '11 at 18:03
  • How exactly can key_values ever be null in the code we are reviewing? – ewan.chalmers Nov 11 '11 at 20:43
  • If the field is neither final nor volatile (and is not written and read to with synchronized), the JVM is allowed to delay the assignment of a field so long as the write does not seem out of order for a single threaded environment. So if I were to initialize the MyDBConfiguration, and another thread see's the MyDBConfiguration is created and invokes getProperty(), the JVM is allowed to hold off the assignment of key_values until at the least the creating thread wants to read it. – John Vint Nov 11 '11 at 21:04
  • That being said, it is unlikely you would come across it, but it doesn't mean it will never happen. – John Vint Nov 11 '11 at 21:04