3

Take the following java code:

public class SomeClass {
  private boolean initialized = false;
  private final List<String> someList; 

  public SomeClass() {
    someList = new ConcurrentLinkedQueue<String>();
  }

  public void doSomeProcessing() {
    // do some stuff...
    // check if the list has been initialized
    if (!initialized) {
      synchronized(this) {
        if (!initialized) {
          // invoke a webservice that takes a lot of time
          final List<String> wsResult = invokeWebService();
          someList.addAll(wsResult);
          initialized = true;
        }
      } 
    }
    // list is initialized        
    for (final String s : someList) {
      // do more stuff...
    }
  }
}

The trick is that doSomeProcessing gets invoked only under certain conditions. Initializing the list is a very expensive procedure and it might not be needed at all.

I have read articles on why the double-check idiom is broken and I was a bit skeptic when I saw this code. However, the control variable in this example is a boolean, so a simple write instruction is needed, as far as I know.

Also, please notice that someList has been declared as final and keeps a reference to a concurrent list, whose writes happen-before reads; if instead of a ConcurrentLinkedQueue the list were a simple ArrayList or LinkedList, even though it has been declared as final, the writes don't require to happen-before the reads.

So, is the code given above free of data races?

chahuistle
  • 2,627
  • 22
  • 30
  • I tend to prefer simplest code and by avoiding the synchronized block you may save 2 micro-seconds. However using a ConcurrentLinkedQueue rather than an ArrayList you could lose more than this if the list is even a modest length. Simpler code often runs faster as well. ;) – Peter Lawrey Feb 19 '11 at 17:15
  • 3
    FYI, the double-check idiom was broken in Java versions prior to Java5. Its no longer broken as long as you make your test variable volatile. – KevinS Feb 19 '11 at 17:19
  • possible duplicate of [When is assignment operation atomic in Java?](http://stackoverflow.com/questions/4926681/when-is-assignment-operation-atomic-in-java) (Whether it's a boolean or a reference does not matter.) – meriton Feb 19 '11 at 17:35
  • @meriton: I saw the other question. The `helper = new Helper();` statement could be broken down to several statements (allocating memory, assigning the address of this new allocation to `helper`, then actually running the constructor code), while the `initialized = true;` statement is a simple write... i think... – chahuistle Feb 19 '11 at 18:27
  • Yes, assigning initialized is atomic, just like assigning helper. But adding to the list (just like constructing the Helper instance) is not, and the JVM is permitted to reorder these instructions, i.e. initialized may be assigned before the list has been (fully) appended to, just like helper may be assigned before the constructor of Helper has completed execution, and another thread may therefore use someList (or helper) before that has been fully initialized. – meriton Feb 19 '11 at 18:48

5 Answers5

5

Ok, let's get the Java Language Specification. Section 17.4.5 defines happens-before as follows:

Two actions can be ordered by a happens-before relationship. If one action happens-before another, then the first is visible to and ordered before the second. If we have two actions x and y, we write hb(x, y) to indicate that x happens-before y.

  • If x and y are actions of the same thread and x comes before y in program order, then hb(x, y).
  • There is a happens-before edge from the end of a constructor of an object to the start of a finalizer (§12.6) for that object.
  • If an action x synchronizes-with a following action y, then we also have hb(x, y).
  • If hb(x, y) and hb(y, z), then hb(x, z).

It should be noted that the presence of a happens-before relationship between two actions does not necessarily imply that they have to take place in that order in an implementation. If the reordering produces results consistent with a legal execution, it is not illegal.

It then goes on two discuss:

More specifically, if two actions share a happens-before relationship, they do not necessarily have to appear to have happened in that order to any code with which they do not share a happens-before relationship. Writes in one thread that are in a data race with reads in another thread may, for example, appear to occur out of order to those reads.

In your instance, the thread checking

if (!initialized)

may see the new value for initialized before it sees all writes that added to someList and hence work with a partially filled list.

Note that your argument

Also, please notice that someList has been declared as final and keeps a reference to a concurrent list, whose writes happen-before reads

is irrelavant. Yes, if the thread read a value from the list, we could conclude that he also sees anything that happens-before that the write of that value. But what if it doesn't read a value? What if the list appears empty? And even if it read a value, it doesn't mean that subsequent writes have been performed, and hence the list may appear incomplete.

meriton
  • 68,356
  • 14
  • 108
  • 175
  • yes. you're right... i've read the spec a couple of times, but your short explanation between the quoted spec made lot of sense... thanks – chahuistle Feb 19 '11 at 20:22
4

Wikipedia suggests that you should use the volatile keyword.

Bozho
  • 588,226
  • 146
  • 1,060
  • 1,140
  • I was thinking that too, though if synchronization was on the `initialized` object instead wouldn't that too guarantee _visibility_? – Johan Sjöberg Feb 19 '11 at 17:09
  • @Johan, Not for the first read outside the block, but it may not make any difference in this case. – Peter Lawrey Feb 19 '11 at 17:13
  • It does make a difference, as it allows the compiler to reorder the operations within the synchronized block. In particular, it may assign initialized before adding to someList. – meriton Feb 19 '11 at 17:39
  • yes, i thought about that too... however, the release of the lock on this would make changes to `initialized` visible, right? plus, once you `initialized` is `true`, it means that the last write on the queue was done, and the last write *happens-before* release and setting `initialized` to true *happens-before* the release, therefore the last write *happens-before* the release, so once a thread sees `initialized` set to `true`, it would mean that the list was completely initialized (this is my claim). – chahuistle Feb 19 '11 at 17:58
  • anyway, i know that adding `volatile` would work... the question was "is this code free of data races?"... i heavily suspect it is... – chahuistle Feb 19 '11 at 18:15
3

Use of ConcurrentLinkedQueue doesn't guarantee absence of data race in this case. Its javadoc says:

As with other concurrent collections, actions in a thread prior to placing an object into a ConcurrentLinkedQueue happen-before actions subsequent to the access or removal of that element from the ConcurrentLinkedQueue in another thread.

That is, it guarantees consistency in the following case:

// Thread 1
x = 42;
someList.add(someObject);

// Thread 2
if (someList.peek() == someObject) {
    System.out.println(x); // Guaranteed to be 42
}

So, in this case x = 42; can't be reordered with someList.add(...). However, this guarantee doesn't apply to the inverse situation:

// Thread 1
someList.addAll(wsResult);
initialized = true;

// Thread 2
if (!initialized) { ... }
for (final String s : someList) { ... }

In this case initialized = true; still can be reordered with someList.addAll(wsResult);.

So, you have a regular double-check idiom without any additional guarantees here, and therefore you need to use volatile, as suggested by Bozho.

axtavt
  • 239,438
  • 41
  • 511
  • 482
  • but, as the writer thread sees it, the last write on the queue *happens-before* setting `initialized` to `true`, which *happens-before* the *release* of `this`... given that *happens-before* are transitive, any thread that *acquires* a monitor on `this`, would see the latest value of both the list and `initialized`... right? – chahuistle Feb 19 '11 at 18:02
  • @chahuistle: Right, but that's the whole problem with double-check idiom: reading thread checks `initialized` without acquiring the monitor, therefore it's sensitive to reorder of operations inside the `synchronized` block. – axtavt Feb 19 '11 at 18:10
  • @axtavt: the thing here is that setting `initialized` to `true` is between two *releases* (the last write on the list and the release of the lock), therefore, my claim is that `initialized = true;` cannot be reordered... what do you think? – chahuistle Feb 19 '11 at 18:21
  • @chahuistle: The point is that actions prior to _release_ can't be reordered with that _release_, but actions following the _release_ can. – axtavt Feb 19 '11 at 18:32
  • @axtavt: But releasing the lock is a *release*, and the write to `initialize` *happens-before* it, therefore, I strongly suspect that this dubious code works... Furthermore, a write to a `boolean` variable is treated as atomic by java. – chahuistle Feb 19 '11 at 18:57
  • @chahuistle when in the synchronized block there is no happens-before ordering between the intialized field and sonmeList.addAll. The compiler has every right to be able to move the write of initialized above the addAll invocations (but of course not out of the synchronized block) – John Vint Feb 19 '11 at 19:10
  • @chahuistle: Releasing a lock has nothing to do with it, since reading thread doesn't acquire it. Atomicity is not related to reorders as well. – axtavt Feb 19 '11 at 19:16
  • 1
    @John V.: Actually, there is always a happens-before relationship between actions of the same thread. However, it doesn't matter since there are no happens-before relationships with the reading thread. From JLS: *if two actions share a happens-before relationship, they do not necessarily have to appear to have happened in that order to any code with which they do not share a happens-before relationship*. – axtavt Feb 19 '11 at 19:18
  • @xtavt youre right I was more of meaning there is no order guarantee (despite using the wrong phrase to describe it), the sequential consistency is of course ensured. – John Vint Feb 19 '11 at 19:24
0

Instead of having the initialized flag, can you just check someList.isEmpty()?

Tom Anderson
  • 46,189
  • 17
  • 92
  • 133
  • You can, but it wouldn't be correct, since the thread adding to someList may not yet have added all strings yet. – meriton Feb 19 '11 at 17:45
0

First, it's the wrong use of the concurrent queue. It's intended for the situation where multiple threads are putting to and polling from a queue. What you want is something that's initialized once, then remains readonly afterwards. A simple list impl would do.

volatile ArrayList<String> list = null;

public void doSomeProcessing() {
    // double checked locking on list
    ...

Suppose, for the sole purpose of brain exercise, we want to achieve thread safety through the concurrent queue:

static final String END_MARK = "some string that can never be a valid result";

final ConcurrentLinkedQueue<String> queue = new ...

public void doSomeProcessing() 
    if(!queue.contains(END_MARK)) // expensive to check!
         synchronized(this)
            if(!queue.contains(END_MARK))
                  result = ...
                  queue.addAll(result);
                  // happens-before contains(END_MARK)==true
                  queue.add( END_MARK );

     //when we are here, contains(END_MARK)==true

     for(String s : queue)
         // remember to ignore the last one, the END_MARK

Note, when declaring the variable, I used the full class type, instead of some interface. If someone argues that it should be declared interface List, so that "I can change it to any List impl, and I have only one place to change", he is too naive.

irreputable
  • 44,725
  • 9
  • 65
  • 93