
Might sound a little silly but I'm not proficient in Java so wanted to make sure:

If there are two code snippets:

I:

if (_myVar == null) 
{
    return; 
}

II:

synchronized (_myLock) 
{
    _myVar = MyVarFactory.create(/* real params */);
}

Edit: Assume that _myVar is a complex object (i.e. not a boolean, int, or long) but a fully fledged Java class that has some parent classes, etc.

Assuming that I and II can run on separate threads at the same time, I think that in C++ this would be a "data race"; however, I am not sure about the situation in Java.

Darius
  • The same definition applies to Java. If there are 2 conflicting actions (so there is at least 1 thread that writes) on the same memory location and these actions are not ordered by a happens-before relation, then there is a data race. Make _myVar volatile and the data race disappears. – pveentjer Apr 01 '21 at 04:56

2 Answers


No, not thread-safe.

A data race is only one problem. You likely also have an issue with visibility across CPU core caches.

If your variable is of type boolean, int, or long (or the object equivalent), use the appropriate Atomic… class. These classes wrap their content for thread-safe access.

If not of those types, use AtomicReference to contain your object in a thread-safe manner.
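For illustration, here is a minimal sketch of the AtomicReference approach applied to the snippets in your question. MyVar and MyVarFactory are just stand-ins for your real classes so the sketch compiles; the create call mirrors the one in your second snippet.

import java.util.concurrent.atomic.AtomicReference;

// Stand-ins for the asker's real classes, only so the sketch compiles.
class MyVar {}

class MyVarFactory {
    static MyVar create() { return new MyVar(); }
}

class Holder {
    // get() and set() are safe to call from any thread; a set() by one thread
    // is visible to every subsequent get() by any other thread.
    private final AtomicReference<MyVar> _myVar = new AtomicReference<>();

    void snippetOne() {                 // runs on thread A
        if (_myVar.get() == null) {
            return;
        }
        // ... use _myVar.get() ...
    }

    void snippetTwo() {                 // runs on thread B
        _myVar.set(MyVarFactory.create(/* real params */));
    }
}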

There are other techniques too, besides the Atomic… classes.

All this has been covered many times on Stack Overflow. So search to learn more.

And read the classic book, Java Concurrency in Practice by Brian Goetz, et al.

Basil Bourque
  • Caches are coherent on the X86 and ARM. So the cache of one CPU will always see the changes of any other CPU. – pveentjer Apr 01 '21 at 04:52
  • @pveentjer See [comment by rzwitserloot](https://stackoverflow.com/questions/66896086/is-it-ok-to-read-a-variable-that-could-potentially-be-written-at-the-same-time/66896171?noredirect=1#comment118254208_66896182) and that Answer. You absolutely *cannot* count on threads in Java seeing consistently the same value of an unprotected variable. Study the [*Java Memory Model*](https://en.wikipedia.org/wiki/Java_memory_model) and the `volatile` keyword in modern Java. – Basil Bourque Apr 01 '21 at 04:59
  • You are referring to CPU cache visibility as a potential cause of seeing the wrong value. Modern ISAs like X86 and ARM have coherent caches, so they can't be the source of seeing outdated values. The JMM explicitly isn't defined in terms of hardware; in your own answer you are combining both the JMM and the hardware. – pveentjer Apr 01 '21 at 05:07
  • If you want to judge the example based on JMM: there is a data race. If you want to provide some causes: out of order execution of loads/stores, store buffering, store coalescing and of course compiler optimizations. But caches normally are not the cause of seeing the wrong value because they are coherent (at least on X86 and ARM) – pveentjer Apr 01 '21 at 05:11
  • This is a very difficult question to answer. Under SC a volatile read needs to see the most recent volatile write before it in the memory order. But it doesn't need to respect the real time order. For more info see https://stackoverflow.com/questions/66653558/does-volatile-guarantee-that-any-thread-reads-the-most-recently-written-value/66655552#66655552 – pveentjer Apr 01 '21 at 11:53

TL;DR: No, not okay.

Explanation:

The relevant documentation is the Java Memory Model (JMM).

The JMM gives the JVM the freedom to make a local cached copy of every field of every object for each individual thread.

Then, it hands each thread a coin. Anytime the thread reads a field or writes a field, it flips this coin. On heads, it uses its local cache. On tails, it updates both its local cache as well as the 'real' copy.

Furthermore, the coin is evil. It is not actually random, but it is unreliable. It may flip tails every time today, every time on the test machine, and every time during the first week of the beta. And then just when you're giving a demo to that important potential customer it starts flipping heads on you, reliably, all day, every time. Just.. all of a sudden.

The name of the game is simple: If the behaviour of your program depends on the result of the evil coin flip, you lose.

Thus, either write code that doesn't care (hard), or write code that suppresses the flips (easier).

In general, the easiest thing to do is to never have any fields that you concurrently write to and read from. This sounds impossible but is, in fact, quite easy: top-down frameworks like fork/join do all communication via the stack (so, method parameter passing and method return values), and there is of course that old, tried, and true trick: do all comms via a channel that has excellent support for concurrent operations, such as a relational database like Postgres, or a message queue like RabbitMQ.

If you must use the same field from multiple threads in a concurrent fashion, the only way to ensure the evil coin is not flipped is to establish so-called 'Happens-Before/Happens-After' relationships ('happens-before' is the official terminology used in the JMM). There are certain specific ways to set up a relationship such that the JMM officially blesses two lines of code: one line definitely 'happens after' the other, which means the line that 'happens after' will definitely observe the changes caused by the line that 'happens before'. Without HBHA, the evil coin is flipped and you may or may not see the change, depending on the phase of the moon.

The list of things that establish HBHA is lengthy, but these are the common ways:

  • The natural: 2 bits of code running in the same thread have a natural HBHA relationship. The JVM/CPU is actually free to re-order code and run things simultaneously if it wants to, but the JVM guarantees that whatever any code observes is as if code within a single thread runs strictly sequentially.
  • Starting threads: A call to thread.start() happens-before the first line of code that runs within that thread.
  • synchronized: If a thread exits a synchronized block, then that happens-before any other thread entering a synchronized block that is synchronizing on the same object reference.
  • volatile: Reads/writes to volatile fields establish an order (which particular order is arbitrary, but it is reliable), and that order sets up HBHA; see the sketch after this list.
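As a rough illustration of the volatile bullet, here is a minimal sketch of how a volatile write/read pair sets up happens-before. The field names are made up for the example, not taken from the question.

class VolatileHandoff {
    private int payload;               // plain, non-volatile field
    private volatile boolean ready;    // volatile flag

    void writer() {                    // runs on thread A
        payload = 42;                  // (1) plain write
        ready = true;                  // (2) volatile write
    }

    void reader() {                    // runs on thread B
        if (ready) {                   // (3) volatile read that observes (2)
            // (2) happens-before (3), so this thread is also guaranteed
            // to see the plain write (1): payload is 42 here.
            System.out.println(payload);
        }
    }
}

Without volatile on ready, the reader could see ready as true while still seeing a stale payload, or might never see ready become true at all.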

In your code example, there is absolutely no HBHA going on, as I assume that the first snippet runs in one thread and the second snippet runs in another. Yes, the second snippet uses synchronized, but the first does not, and synchronized can only establish HBHA with other synchronized blocks (and only if they are synchronizing on the exact same object). Thus, you have no HBHA.

Therefore, the JMM gives the JVM the freedom to run your snippets such that you do not observe that update done by the second snippet (where _myVar is set up to some instance), even if it CAN observe other stuff that the second thread did change.

SOLUTION: Set up HBHA; use either an AtomicReference which does it for you, or toss a synchronized(_myLock) around the first snippet, or forget this and use a db or rabbitmq or fork/join or some other framework.
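For completeness, a rough sketch of the synchronized option applied to the question's snippets (MyVar and MyVarFactory again stand in for the real classes, as in the earlier sketch). The key point is that both the read and the write lock on the same _myLock, so exiting either block happens-before the next entry into the other.

class SynchronizedFix {
    private final Object _myLock = new Object();
    private MyVar _myVar;              // only touched while holding _myLock

    void snippetOne() {                // runs on thread A
        synchronized (_myLock) {
            if (_myVar == null) {
                return;
            }
            // ... use _myVar ...
        }
    }

    void snippetTwo() {                // runs on thread B
        synchronized (_myLock) {
            _myVar = MyVarFactory.create(/* real params */);
        }
    }
}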

NB: There is pretty much no way to write tests that confirm that evil coin flips are occurring. As a consequence, you should take seriously the advice to avoid sharing mutating fields between threads entirely, using e.g. fork/join, message queues, or databases: multithreaded code that shares fields has a tendency to be riddled with bugs that no tests can catch.

rzwitserloot
  • I'm not sure the analogy of the coin is valid. Caches are always coherent (at least on X86 and ARM). So it will not happen that one CPU still reads an old value from a cache line after a different CPU has written a change to the cache. – pveentjer Apr 01 '21 at 04:53
  • 1
    The JMM makes no such guarantee, @pveentjer. I was playing around with this a few weeks ago and was able to observe a cached value for half an hour before it mysteriously updated itself. Not sure how the evil coin would lead you astray here. – rzwitserloot Apr 01 '21 at 04:57
  • Caches are always coherent. So it can't be that your cache gets out of sync. Caches are the source of truth. – pveentjer Apr 01 '21 at 05:01
  • The JMM is the source of truth, no doubt. – rzwitserloot Apr 01 '21 at 06:31
  • I never claimed otherwise. But as soon as hardware is introduced as an example of why ordering can be violated, it should be a reasonably realistic example and incoherent caches are not. – pveentjer Apr 01 '21 at 06:35