class Future
{
    private volatile boolean ready;
    private Object data;

    public Object get()
    {
        if (!ready) return null;
        return data;
    }

    public synchronized void setOnce(Object o)
    {
        if (ready) throw new IllegalStateException("already set");
        data = o;
        ready = true;
    }
}

It said: "if a thread reads data, there is a happens-before edge from the write to the read that guarantees visibility of data."

I know from my learning:

  1. volatile ensures that every read/write will be in the memory instead of only in cache or registers;
  2. volatile restricts reordering: that is, in the setOnce() method, data = o can only be scheduled after if (ready) throw ..., and before ready = true; this guarantees that if get() sees ready == true, data must be o.

My confusion is

  1. is it possible that thread 1 is in setOnce() and has reached the point after data = o; but before ready = true;? At the same time, thread 2 runs into get(), reads ready as false, and returns null. Then thread 1 continues with ready = true;. In this scenario, thread 2 didn't see the new data even though data has been assigned a new value in thread 1.

  2. get() isn't synchronized, which means the synchronized lock cannot protect setOnce(), since thread 1 can call get() without acquiring the lock to access the variables ready and data. So threads are not guaranteed to see the latest value of data. By this, I mean the lock only guarantees visibility between synchronized blocks. Even while one thread is running the synchronized block setOnce(), another thread can still go into get() and access ready and data without blocking, and may see the old values of these variables.

  3. in get(), if ready == true, data must be o? I mean, is this thread guaranteed to see data? I think data is not volatile, nor is get() synchronized. May this thread see an old value from the cache?

Thanks!

Christy Lin
  • What language is this? Java? – David Schwartz Nov 11 '15 at 00:12
  • Also, your `1` is mostly false. The `volatile` keyword has to do with memory visibility, not caches. Caches are handled by cache coherency hardware. And that would be an obviously awful design that nobody would use -- memory is way too slow to use that way. – David Schwartz Nov 11 '15 at 00:13
  • @DavidSchwartz in Java a variable can be stored in cache memory. L1 and L2 cache memory are invisible for distinct threads, using volatile the value is stored in main memory or L3 cache ( main memory and L3 cache memory are shared between threads ). [More info](http://tutorials.jenkov.com/java-concurrency/volatile.html) – Velko Georgiev Nov 11 '15 at 00:50
  • @VelkoGeorgiev That's totally and completely false. That's not how caches work. It's a common myth, but it's just that, a myth. The `volatile` keyword has nothing whatsoever to do with these caches. Access to a `volatile` can remain entirely in an L1 cache with no issues. (Sadly, the article that you linked to repeats the myth.) – David Schwartz Nov 11 '15 at 00:52
  • @VelkoGeorgiev I made some comments on the article. It's infuriating when someone who so thoroughly misunderstands an important issue tries to teach it to other people. – David Schwartz Nov 11 '15 at 00:59
  • @DavidSchwartz I disagree check this [link StackOverflow](http://stackoverflow.com/questions/2423622/volatile-vs-static-in-java) "Volatile variable: If two Threads(suppose t1 and t2) are accessing the same object and updating a variable which is declared as volatile then it means t1 and t2 can make their own local cache of the Object except the variable which is declared as a volatile . " – Velko Georgiev Nov 11 '15 at 01:00
  • @VelkoGeorgiev Unfortunately, that's subtly incorrect as well. I made a comment to that answer as well. This is a distressingly common misunderstanding. – David Schwartz Nov 11 '15 at 01:01
  • So if it's wrong, why, when you make a while loop with a flag, and the flag is NOT volatile, does the loop keep running when you update the flag from another thread? – Velko Georgiev Nov 11 '15 at 01:02
  • @VelkoGeorgiev It can be wrong for any reason. That is, there is nothing that guarantees it will work, so it can fail for any reason at all. The most common reason it will fail on typical, modern computers is that the JVM optimizes out the CPU instructions that would fetch the variable from cache, instead keeping it in a register. That is, it fails because of JVM optimizations (that `volatile` disables), nothing to do with CPU caches. (This really is something that very, very few people actually understand.) – David Schwartz Nov 11 '15 at 01:03
  • Ok check this quote from the book Java concurrency in Practice ( its a photo) [LINK HERE](http://postimg.org/image/6brwsv37b/) – Velko Georgiev Nov 11 '15 at 01:06
  • That quote is entirely true. But it's slightly misleading because the L1/L2/L3 caches on modern CPUs have hardware cache coherency and don't hide things from other processors, and so have nothing to do with `volatile`. Some CPUs do have prefetch buffers or write posting buffers that do hide things from other processors, and `volatile` does have to work around those. – David Schwartz Nov 11 '15 at 01:08
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/94768/discussion-between-velko-georgiev-and-david-schwartz). – Velko Georgiev Nov 11 '15 at 01:09
  • @VelkoGeorgiev Thanks for your answer. But could you help me with my three questions? I am really confused by these things and want to know what is right. – Christy Lin Nov 11 '15 at 01:27
  • @DavidSchwartz Thanks. Same here, asking for help with my three questions. – Christy Lin Nov 11 '15 at 01:27

1 Answer


volatile ensures that every read/write will be in the memory instead of only in cache or registers;

Nope. It just ensures it's visible to other threads. On modern hardware, that doesn't require accessing main memory. (Which is a good thing; main memory is slow.)

volatile restricts reordering: that is, in the setOnce() method, data = o can only be scheduled after if (ready) throw ..., and before ready = true; this guarantees that if get() sees ready == true, data must be o.

That's correct.

is it possible that thread 1 is in setOnce() and has reached the point after data = o; but before ready = true;? At the same time, thread 2 runs into get(), reads ready as false, and returns null. Then thread 1 continues with ready = true;. In this scenario, thread 2 didn't see the new data even though data has been assigned a new value in thread 1.

Yes, but if that's a problem, then you shouldn't be using code like this. Presumably, the API for this code is that get is guaranteed to see the result if called after setOnce returns. Obviously, you can't guarantee that get will see the result before we've finished setting it.
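A minimal sketch of that intended usage (the demo class and names are illustrative, not from the question): once setOnce() has fully returned in one thread, any get() that is ordered after it, here via Thread.join(), is guaranteed to see the value, because the volatile write to ready happens-before the volatile read.

```java
// Sketch: a get() ordered after setOnce() (via join) always sees the value.
class FutureDemo {
    static class Future {
        private volatile boolean ready;
        private Object data;

        public Object get() {
            if (!ready) return null;   // flag not seen yet: caller gets null
            return data;
        }

        public synchronized void setOnce(Object o) {
            if (ready) throw new IllegalStateException("already set");
            data = o;                  // plain write...
            ready = true;              // ...published by this volatile write
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Future f = new Future();
        Thread writer = new Thread(() -> f.setOnce("hello"));
        writer.start();
        writer.join();                 // setOnce() has fully completed here
        System.out.println(f.get());   // guaranteed to print "hello", never null
    }
}
```

A get() racing with an in-progress setOnce() may still legitimately return null; the contract only covers calls ordered after setOnce() returns.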

get() isn't synchronized, which means the synchronized lock cannot protect setOnce(), since thread 1 can call get() without acquiring the lock to access the variables ready and data. So threads are not guaranteed to see the latest value of data. By this, I mean the lock only guarantees visibility between synchronized blocks. Even while one thread is running the synchronized block setOnce(), another thread can still go into get() and access ready and data without blocking, and may see the old values of these variables.

No. And if this were true, synchronization would be almost impossible to use. For example, a common pattern is to create an object, then acquire the lock on a collection and add the object to the collection. This wouldn't work if acquiring the lock on the collection didn't guarantee that the writes involved in the creation of the object were visible.
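The collection pattern described above can be sketched as follows (the class and method names are illustrative): thread A performs plain writes while constructing an object, then adds it to a shared list while holding the list's lock; any thread that later reads from the list under the same lock is guaranteed to see the fully constructed object.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch: lock release/acquire on the same monitor makes the
// unsynchronized construction writes visible to the reader.
class SharedRegistry {
    private final List<int[]> items = new ArrayList<>();

    void publish() {
        int[] obj = new int[] {1, 2, 3};   // plain, unsynchronized writes
        synchronized (items) {             // releasing this lock happens-before...
            items.add(obj);
        }
    }

    int[] readFirst() {
        synchronized (items) {             // ...a later acquire of the same lock,
            return items.isEmpty() ? null : items.get(0);  // so obj's contents are visible
        }
    }
}
```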

in get(), if ready == true, data must be o? I mean, is this thread guaranteed to see data? I think data is not volatile, nor is get() synchronized. May this thread see an old value from the cache?

Java's volatile is defined such that a thread that sees a change to one is guaranteed to also see all other memory changes that the writing thread made before the change the reading thread saw. This is not true of volatile in other languages (such as C or C++). This may make Java's volatile more expensive on some platforms, but fortunately not on typical ones.
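This guarantee is the classic safe-publication idiom; here is a minimal sketch (the class and field names are illustrative). The write to the plain field payload is published by the later volatile write to flag, so a reader that observes flag == true is guaranteed to see 42, even though payload itself is not volatile.

```java
// Sketch: a volatile write publishes all earlier plain writes.
class Publisher {
    int payload;                 // deliberately NOT volatile
    volatile boolean flag;

    void write() {
        payload = 42;            // (1) plain write
        flag = true;             // (2) volatile write: publishes (1)
    }

    Integer read() {
        if (flag) {              // volatile read that sees (2)...
            return payload;      // ...is guaranteed to see (1): 42, never stale
        }
        return null;             // flag not seen yet; try again later
    }
}
```

This is exactly the structure of the Future in the question: data plays the role of payload and ready plays the role of flag.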

Also, please don't talk about "in the cache". This has nothing to do with caches. This is a common misunderstanding. It has to do with visibility, not caching. Most caches provide full visibility into the cache (punch "MESI protocol" into your favorite search engine to learn more) and don't require anything special to ensure visibility.

David Schwartz
  • First I really appreciate you take your time help me with this detailed answer! – Christy Lin Nov 11 '15 at 02:32
  • Question 1 is solved now. But for the 2nd question, I read this in the documentation: "When a thread releases an intrinsic lock, a happens-before relationship is established between that action and any subsequent acquisition of the same lock." This is what confuses me. I am thinking about your example with the collection. What will happen if t1 makes some change to the collection via an unsynchronized method change() while your synchronized add() is being run by t2? t1 won't block, right, since that's what a synchronized method does? Will t1 see the added element or not, or something else? – Christy Lin Nov 11 '15 at 02:55
  • @ChristyLin That happens-before relationship means that anything that thread did before it releases the lock will be visible to any thread that later acquires that same lock. You would need all accesses to the collection to be synchronized for this to work. In Java, `volatile` establishes this same relationship -- if a thread modifies any `volatile` variable, a thread that sees that modification will see all modifications made before that one too. Again, this is specific to Java. – David Schwartz Nov 11 '15 at 04:03
  • Now I understand the effects volatile brings. But let's say the collection class doesn't involve any volatile variable; all it has is a synchronized add() and an unsynchronized change(). Then add() is not guaranteed to be visible to change(), right? If I need these two methods to be visible to each other, I must make both of them synchronized, right? – Christy Lin Nov 11 '15 at 15:52
  • @ChristyLin Yes, that's correct. You have to do something in each operation such that those two things establish a "before/after" relationship. – David Schwartz Nov 11 '15 at 15:53
  • Thank you so much! Now all my confusion is gone. So happy! Have a lovely day! – Christy Lin Nov 11 '15 at 15:54
  • @DavidSchwartz I'm trying to understand your *It has to do with visibility, not caching. Most caches provide full visibility into the cache*. I posted a question [here](https://stackoverflow.com/questions/68427434/understanding-java-volatile-visibility/68427550#68427550). My confusion is about SINGLE variable visibility, not multiple variables' reordering issue. My question is: since cache coherence protocol (MESI) can guarantee visibility of a single variable, why must we need volatile to ensure visibility to other threads? – hangyuan Jul 18 '21 at 13:03
  • @hangyuan Because the relevant standards say so. The CPU is not required to have a cache coherence protocol, and since the developers of your compiler *know* that, they are permitted to make optimizations that are broken by your assumption that the CPU has one. There are upsides all around: the CPU having a cache coherency protocol allows the compiler to make very effective optimizations, and so does the fact that you are not allowed to rely on the cache coherency protocol (only the compiler is). – David Schwartz Jul 18 '21 at 22:17