23

My question is an extension to this one: Volatile guarantees and out-of-order execution

To make it more concrete, let's say we have a simple class which can be in two states after it is initialized:

class A {
    private /*volatile?*/ boolean state;
    private volatile boolean initialized = false;

    boolean getState(){
        if (!initialized){
            throw new IllegalStateException();
        }
        return state;
    }

    void setState(boolean newState){
        state = newState;
        initialized = true;
    }
}

The field initialized is declared volatile, so it introduces a happens-before 'barrier' which ensures that reordering can't take place. Since the state field is written only before the initialized field is written, and read only after the initialized field is read, I can remove the volatile keyword from the declaration of state and still never see a stale value. Questions are:

  1. Is this reasoning correct?
  2. Is it guaranteed that the write to initialized field won't be optimized away (since it changes only the first time) and the 'barrier' won't be lost?
  3. Suppose, instead of the flag, a CountDownLatch was used as an initializer like this:

    import java.util.concurrent.CountDownLatch;

    class A {
        private /*volatile?*/ boolean state;
        private final CountDownLatch initialized = new CountDownLatch(1);
    
        boolean getState() throws InterruptedException {
            initialized.await();
            return state;
        }
    
        void setState(boolean newState){
            state = newState;
            initialized.countDown();
        }
    }
    

    Would it still be alright?

Kovalsky
  • 5
    in the second case it will be visible the first time, however new updates after the first are *not* guaranteed to be visible. – ratchet freak Nov 09 '11 at 11:29

2 Answers

8

Your code is (mostly) correct and it is a common idiom.

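// Reproducing your code, with labels used in the analysis below
class A {
    // Implicit default writes (zeroing the fields), shown as pseudocode:
    //   state = false;                          //A
    //   initialized = false;                    //B

    boolean state;
    volatile boolean initialized = false;        //0

    void setState(boolean newState) {
        state = newState;                        //1
        initialized = true;                      //2
    }

    boolean getState() {
        if (!initialized)                        //3
            throw new IllegalStateException();
        return state;                            //4
    }
}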

Lines #A and #B are pseudocode for writing default values to the variables (that is, zeroing the fields). We need to include them in a strict analysis. Note that #B is different from #0; both are executed. Line #B is not considered a volatile write.

All volatile accesses (reads and writes) on all variables are in a total order. We want to establish that #2 is before #3 in this order, if #4 is reached.

There are 3 writes to initialized: #B, #0 and #2. Only #2 assigns true. Therefore if #2 is after #3, #3 cannot read true (this is probably due to the no out-of-thin-air guarantee which I don't fully understand), then #4 can't be reached.

Therefore if #4 is reached, #2 must be before #3 (in the total order of volatile accesses).

Therefore #2 happens-before #3 (a volatile write happens-before a subsequent volatile read).

By program order, #1 happens-before #2, and #3 happens-before #4.

By transitivity, therefore #1 happens-before #4.

Line #A, the default write, happens-before everything (except other default writes).

Therefore all accesses to the variable state are in a happens-before chain: #A -> #1 -> #4. There is no data race. The program is correctly synchronized. Read #4 must observe write #1.
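The chain #1 -> #2 -> #3 -> #4 can be exercised with a small two-thread sketch. The names `HappensBeforeDemo` and `writeThenRead` are hypothetical helpers introduced here for illustration; the inner class mirrors the code from the question. The reader polls until `getState()` stops throwing, at which point it has read `initialized == true` and must therefore see the value written by the writer.

```java
class HappensBeforeDemo {
    // Mirrors the class from the question; 'state' is deliberately not volatile.
    static class A {
        private boolean state;
        private volatile boolean initialized = false;

        boolean getState() {
            if (!initialized) {                // volatile read  (#3)
                throw new IllegalStateException();
            }
            return state;                      // plain read     (#4)
        }

        void setState(boolean newState) {
            state = newState;                  // plain write    (#1)
            initialized = true;                // volatile write (#2)
        }
    }

    // Hypothetical helper: starts a writer thread, then polls until
    // getState() succeeds and returns the value the reader observed.
    static boolean writeThenRead(boolean value) {
        A a = new A();
        Thread writer = new Thread(() -> a.setState(value));
        writer.start(); // Thread.start() safely hands 'a' to the writer
        while (true) {
            try {
                return a.getState();
            } catch (IllegalStateException notYetInitialized) {
                Thread.yield(); // writer has not reached #2 yet; retry
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(writeThenRead(true)); // prints "true"
    }
}
```

A passing run does not prove the guarantee (data races rarely show up on demand); the guarantee comes from the happens-before argument above, and the sketch only shows the idiom in use.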

There is a little problem though. Line #0 is apparently redundant, since #B already assigned false. In practice, a volatile write has a non-negligible performance cost, therefore we should avoid #0.

Even worse, the presence of #0 can cause undesired behavior: #0 can occur after #2! Therefore it may happen that setState() is called, yet subsequent getState() calls keep throwing exceptions.

This is possible if the object is not safely published. Suppose thread T1 creates the object and publishes it; thread T2 gets the object and calls setState() on it. If the publication is not safe, T2 can observe the reference to the object, before T1 has finished initializing the object.

You can ignore this problem if you require that all A objects are safely published. That is a reasonable requirement. It can be implicitly expected.

But if we don't have line #0, this won't be a problem at all. Default write #B must happen-before #2, therefore as long as setState() is called, all subsequent getState() calls will observe initialized==true.
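A minimal sketch of that variant, with the explicit initializer (#0) dropped so that only the default write (#B) remains. The class name `NoExplicitInitializerSketch`, the `shared` field and both helper methods are assumptions made for illustration; publishing through a volatile reference is just one of several ways to publish an object safely.

```java
class NoExplicitInitializerSketch {
    static class A {
        private boolean state;
        // No "= false": the field is only zeroed by the default write (#B),
        // so there is no volatile write #0 that could land after #2.
        private volatile boolean initialized;

        boolean getState() {
            if (!initialized) {
                throw new IllegalStateException();
            }
            return state;
        }

        void setState(boolean newState) {
            state = newState;
            initialized = true;
        }
    }

    // Assumption: a volatile reference used as the publication point.
    static volatile A shared;

    // Hypothetical helper: publish, set, then read back.
    static boolean publishSetAndGet(boolean value) {
        shared = new A();      // volatile write publishes the constructed object
        A seen = shared;       // volatile read, as another thread would do
        seen.setState(value);
        return seen.getState();
    }

    // Hypothetical helper: confirms getState() still throws before setState().
    static boolean throwsBeforeSet() {
        try {
            new A().getState();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }
}
```

Behavior is unchanged for callers: `getState()` still throws until the first `setState()` call completes.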

In the count down latch example, initialized is final; that is crucial in guaranteeing safe publication: all threads will observe a properly initialized latch.
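For comparison, here is a runnable sketch of the latch variant. `LatchDemo` and `awaitFirstState` are hypothetical names introduced for this example. The `CountDownLatch` Javadoc states that, until the count reaches zero, actions prior to `countDown()` happen-before actions following a successful return from `await()` in another thread, so the sketch relies on that guarantee only for the first `setState()` call.

```java
import java.util.concurrent.CountDownLatch;

class LatchDemo {
    static class A {
        private boolean state; // not volatile
        private final CountDownLatch initialized = new CountDownLatch(1);

        boolean getState() throws InterruptedException {
            initialized.await();   // blocks until the first setState() completes
            return state;
        }

        void setState(boolean newState) {
            state = newState;
            initialized.countDown(); // opens the latch; no-op once the count is zero
        }
    }

    // Hypothetical helper: a writer thread sets the state once,
    // and the calling thread waits on the latch and reads it.
    static boolean awaitFirstState(boolean value) {
        A a = new A();
        new Thread(() -> a.setState(value)).start();
        try {
            return a.getState();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(awaitFirstState(true)); // prints "true"
    }
}
```

Unlike the volatile flag, this version blocks instead of throwing, and it gives no visibility guarantee for a second or later `setState()` call.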

irreputable
  • 1
    I have a question. If the first thread calls `setState(true)` and after that the second thread calls `getState()` there is a big chance it will return `false`, because `state` is not `volatile`. The second thread may not observe the change at all. Am I right here? – Petar Minchev Nov 09 '11 at 17:30
  • 1
    @PetarMinchev that's addressed by #1 *happens-before* #2 *happens-before* #3 *happens-before* #4. Consequently, #1 *happens-before* #4. Volatility is only necessary for the #2 *happens-before* #3 part—the rest is established by normal execution order. – alf Nov 09 '11 at 17:33
  • @alf - That doesn't guarantee that #4 will observe the change at all. Without `volatile` there is a chance you won't observe anything. The second thread can get the value from its cache, even though `setState()` was executed before `getState()`. Am I right? – Petar Minchev Nov 09 '11 at 17:34
  • 3
    You are mistaken. A formal analysis yields that read#4 must observe write#1. In an informal explanation, a volatile read will first clear the cache. So if the 2nd thread cached `state`, the cache is cleared upon reading volatile `initialized`; the subsequent read of `state` will effectively fetch from main memory, and sees the last assigned value. – irreputable Nov 09 '11 at 17:36
  • @PetarMinchev it does indeed. See http://java.sun.com/docs/books/jls/third_edition/html/memory.html: **If one action *happens-before* another, then the first is visible to and ordered before the second.** – alf Nov 09 '11 at 17:37
  • 1
    @irreputable - Does a `volatile` read of one variable clear the cache of all other non-volatile variables? – Petar Minchev Nov 09 '11 at 17:37
  • @Petar It clears all cached variables. On the other side of the coin, a volatile write will "flush" all previous writes. Therefore #1 is flushed to main memory by #2, #3 clears caches, and #4 fetches from main memory and sees the result of #1. – irreputable Nov 09 '11 at 17:43
  • @irreputable - Where is this documented that a `volatile` read of one `variable` clears the cache of other `non-volatile` variables? I am just curious. And thank you for all your explanations. – Petar Minchev Nov 09 '11 at 17:46
  • 4
    @Petar That is implied by the latest memory model. The old memory model (before Java5) indeed has no such guarantee(see http://g.oswego.edu/dl/cpj/jmm.html) In the new model, volatile write/read are like lock release/acquire for visibility. see http://g.oswego.edu/dl/jmm/cookbook.html – irreputable Nov 09 '11 at 17:55
  • @irreputable - Wow, thanks for your help:) I learned something new:) – Petar Minchev Nov 09 '11 at 18:05
  • irreputable, thanks a lot for the explanations. And @Petar, thanks for your input. One point is still unclear: when we use the **volatile** boolean initialized field we always have this crucial 'barrier' upon which all guarantees rely. But when we switch to CountDownLatch, after it is open, do we still have this kind of 'barrier' hidden inside the latch? The Javadoc states that if you call *countDown()* when the current count equals zero **nothing happens**. I believe some synchronized reads are necessary to check that the count has reached zero, but that probably isn't reliable enough, is it? – Kovalsky Nov 10 '11 at 07:01
  • @Kovalsky It's not that it doesn't do the job—it might, perhaps, so your reasoning is not flawed—it's that you're not promised to get the result you want. For example, it can be that as soon as counter is zero, each thread saves this state; or, using a Strategy pattern, completely changes the implementation in order to minimize overhead. You cannot rely on what's not in the contract. – alf Nov 10 '11 at 09:23
  • @Kovalsky I agree with @alf. the latch won't have visibility guarantees if `setState()` is invoked more than once. – irreputable Nov 10 '11 at 14:54
-1

1. Is this reasoning correct?

No, state will be cached in the thread, so you cannot get the latest value.

2. Is it guaranteed that the write to initialized field won't be optimized away (since it changes only the first time) and the 'barrier' won't be lost?

Yes

3. Suppose, instead of the flag, a CountDownLatch was used as an initializer like this...

Just as @ratchet freak mentioned, CountDownLatch is a one-time latch, while volatile is a kind of reusable latch, so the answer to your third question should be: if you are going to set the state multiple times, you should use volatile.

James.Xu
  • 1
    except that there is no acquire happening when the latch is empty. (correct me if I'm wrong) the docs say the `await` and `countDown` methods are no-ops when `count() == 0` – ratchet freak Nov 09 '11 at 11:50
  • 5
    `setState(...)` first modifies a non-volatile variable, and then it performs a *volatile write* to `initialized`. After another thread performs a *volatile read* of `initialized`, it is guaranteed (as per chapter 17 of the Java Language Specification) that all modifications prior to the volatile write are visible to the thread after the volatile read, without need for further synchronization (or volatiles). – Bruno Reis Nov 09 '11 at 11:59
  • 1
    To make it clear: the second time setState is called, it will first change the state variable. If at this time, before the write to the volatile, another thread calls getState(), he will see the initialized variable as true, then read the state variable, where he might still see the old value of the state. – JB Nizet Nov 09 '11 at 12:00
  • 2
    JB Nizet, the second time setState is called, it will execute a new volatile write (writing the same value, but it is still a volatile write). Any volatile read after this one will guarantee that a read of `state` will reflect the current value in main memory. If a read of `state` by one thread occurs between the write to `state` and the volatile write by another thread, there are 2 possibilities: it will read the old value (which is exactly the same as if the call to getState() had happened before the call to setState(...) by the second thread), or it will read the new value. – Bruno Reis Nov 09 '11 at 12:14
  • 1
    @Bruno Reis: I agree that in this case, it won't probably make a difference. The problem is that it's very fragile. If the state becomes two variables instead of one, or a long or double instead of a boolean, you could have serious problems. – JB Nizet Nov 09 '11 at 12:53
  • I would add that `state` might get the latest value; there is no guarantee it will be cached (certainly not the first time), and there is no guarantee it will ever see the value change. – Peter Lawrey Nov 09 '11 at 14:45
  • writes can be optimised away when an object is eliminated. This should only happen when an object only appears in one method, re: Escape Analysis. – Peter Lawrey Nov 09 '11 at 14:47
  • 1
    @JBNizet It indeed is very fragile, and isn't the technique to be widely adopted. Nevertheless I want it clarified. I agree that there is a chance that, after the class is initialized, while one thread has already written to *state* but before it writes to *initialized*, other thread can see no changes. But can this do any harm or somehow deceive the user of this class? Upon exit from the *setState()* method it **is guaranteed** that the state is visible, isn't it? I can't imagine the situation when it can spoil the usability. – Kovalsky Nov 10 '11 at 07:20
  • 1
    Kovalsky, in this exact situation, there can be no harm at all. Your code is perfectly sound. And actually, this idiom is commonly used to implement highly-concurrent, non-blocking data structures. You can see this in [`ConcurrentHashMap`](http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/6-b14/java/util/concurrent/ConcurrentHashMap.java), search for "read-volatile": it is quite ingenious, `count` is volatile, and retrieval methods begin with `if (count != 0)`. Anyway, **this kind of code is hard to get right**, and should be left to experts (such as Doug Lea). Good luck! – Bruno Reis Nov 10 '11 at 21:20
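The fragility raised in the comments above (the state growing into two variables, or into a long or double) is often sidestepped with a different technique from the one in the question: keep all the state in an immutable object and publish it through a single volatile reference. The sketch below is only an illustration under that assumption; `SnapshotSketch`, `Snapshot`, and its `flag` and `counter` fields are hypothetical names.

```java
class SnapshotSketch {
    // Immutable value object: both fields are final and set once.
    static final class Snapshot {
        final boolean flag;
        final long counter;

        Snapshot(boolean flag, long counter) {
            this.flag = flag;
            this.counter = counter;
        }
    }

    static class A {
        // null means "not initialized yet"; a single volatile reference
        // replaces the separate 'state' and 'initialized' fields.
        private volatile Snapshot current;

        Snapshot getState() {
            Snapshot s = current;          // one volatile read
            if (s == null) {
                throw new IllegalStateException();
            }
            return s;                      // fields of s are consistent with each other
        }

        void setState(boolean flag, long counter) {
            current = new Snapshot(flag, counter); // one volatile write
        }
    }
}
```

Because a reader obtains both fields from the same `Snapshot` instance, it cannot see a new `flag` paired with an old `counter`, even when `setState` is called repeatedly.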