
A data race happens when there are two memory accesses in a program that:

  • target the same location
  • are performed concurrently by two threads
  • are not both reads (at least one is a write)
  • are not synchronization operations

This definition is taken from another source, which borrows it from a research paper, so we can assume it to be correct.

Now consider this example:

import java.util.concurrent.*;

class DataRace {
    static boolean flag = false;

    static void raiseFlag() {
        flag = true;
    }

    public static void main(String[] args) {
        ForkJoinPool.commonPool().execute(DataRace::raiseFlag);
        System.out.println(flag);
    }
}

By my understanding, this satisfies the definition of a data race. Two instructions access the same location (flag), they are not both reads (one is a write), they are performed concurrently by two threads, and neither is a synchronization operation. So the output depends on how the threads interleave and can be either 'true' or 'false'.

If we assume this is a data race, then I could just add locks around the accesses and solve it. But even if I add locks in both threads, there is still a race on the lock itself: either thread can acquire the lock first, so the output can still be 'true' or 'false'.
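To make the confusion concrete, here is roughly what I mean by "adding locks" (the class and lock names are mine, a minimal sketch of the idea):

```java
import java.util.concurrent.*;

class LockedFlag {
    static final Object lock = new Object();
    static boolean flag = false;

    static void raiseFlag() {
        synchronized (lock) {   // the write is now guarded by the lock
            flag = true;
        }
    }

    public static void main(String[] args) {
        ForkJoinPool.commonPool().execute(LockedFlag::raiseFlag);
        boolean value;
        synchronized (lock) {   // the read is guarded by the same lock
            value = flag;
        }
        // The accesses are now mutually exclusive, but either thread may
        // acquire the lock first, so this still prints true or false.
        System.out.println(value);
    }
}
```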

So this is my confusion, and here are the two questions I would like to ask:

  1. Is this a data race? If not, why not?

  2. If it is a data race, why does the proposed solution not work?

user3834119

1 Answer


First of all, the arbitrary order of thread execution is not itself a data race, even though it can cause one. If you need two or more threads to execute their code in a specific order, you have to use a waiting mechanism such as a monitor. Monitors are constructs that provide both mutual exclusion (locking) and waiting. Monitors are also known as condition variables, and Java supports them.
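Applied to the flag example from the question, a monitor lets the reader wait until the writer has run. A minimal sketch (class and field names are mine), using Java's built-in wait/notify mechanism:

```java
class OrderedFlag {
    static final Object monitor = new Object();
    static boolean flag = false;

    static void raiseFlag() {
        synchronized (monitor) {
            flag = true;
            monitor.notifyAll();     // wake up any thread waiting on the flag
        }
    }

    public static void main(String[] args) throws InterruptedException {
        new Thread(OrderedFlag::raiseFlag).start();
        synchronized (monitor) {
            while (!flag) {          // loop guards against spurious wakeups
                monitor.wait();      // releases the monitor while waiting
            }
        }
        System.out.println(flag);    // always prints true
    }
}
```

The lock alone gives mutual exclusion; it is the wait/notify pair that enforces which thread's effect is observed first.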

Now the question is what a data race is. A data race occurs when two or more threads access the same memory location at the same time and at least one of the accesses is a write. This situation can leave the memory location holding an unpredictable value.

A classical example. Suppose we have a 32-bit platform and a shared variable that is 64 bits wide, such as a long or double. Let's use a long:

long SharedVariable;

And thread 1 that executes the following code.

SharedVariable=0;

And thread 2 that executes the following code.

SharedVariable=0x7FFF_FFFF_FFFF_FFFFL;

If an access to this variable is not protected by a lock, after the execution of both threads the SharedVariable can have one of the following values.

SharedVariable==0
SharedVariable==0x7FFF_FFFF_FFFF_FFFFL
SharedVariable==0x0000_0000_FFFF_FFFFL
SharedVariable==0x7FFF_FFFF_0000_0000L

The last two values are unexpected; they are caused by the data race.
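The two-writer setup above can be sketched in Java as follows (class and field names are mine; on a typical 64-bit JVM you will almost certainly observe only the two expected values, since tearing requires a platform that actually splits the write):

```java
class LongTearing {
    // Deliberately not volatile: the Java Language Specification (17.7)
    // permits a plain long write to be split into two 32-bit halves.
    static long sharedVariable;

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> sharedVariable = 0L);
        Thread t2 = new Thread(() -> sharedVariable = 0x7FFF_FFFF_FFFF_FFFFL);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        // The spec allows any of the four values listed above,
        // including the two "torn" results.
        System.out.println(Long.toHexString(sharedVariable));
    }
}
```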

The problem here is that on 32-bit platforms, accesses to 32-bit variables are guaranteed to be atomic: even if two or more threads access the same 32-bit memory location at the same time, each individual access happens as a single indivisible operation. But because our variable is 64 bits wide, at the CPU level a write to it translates into two CPU instructions. So the code SharedVariable=0; is translated into something like this:

mov SharedVariableHigh32bits,0
mov SharedVariableLow32bits,0

And code SharedVariable=0x7FFF_FFFF_FFFF_FFFFL; is translated into something like this:

mov SharedVariableHigh32bits,0x7FFFFFFF
mov SharedVariableLow32bits,0xFFFFFFFF

Without the lock, the CPU can execute these 4 instructions in the following orders.

Order 1.

mov SharedVariableHigh32bits,0 // T1
mov SharedVariableLow32bits,0 // T1
mov SharedVariableHigh32bits,0x7FFFFFFF // T2
mov SharedVariableLow32bits,0xFFFFFFFF // T2

The result is: 0x7FFF_FFFF_FFFF_FFFFL.

Order 2.

mov SharedVariableHigh32bits,0x7FFFFFFF // T2
mov SharedVariableLow32bits,0xFFFFFFFF // T2
mov SharedVariableHigh32bits,0  // T1
mov SharedVariableLow32bits,0  // T1

The result is: 0.

Order 3.

mov SharedVariableHigh32bits,0x7FFFFFFF // T2
mov SharedVariableHigh32bits,0 // T1
mov SharedVariableLow32bits,0 // T1
mov SharedVariableLow32bits,0xFFFFFFFF // T2

The result is: 0x0000_0000_FFFF_FFFFL.

Order 4.

mov SharedVariableHigh32bits,0 // T1
mov SharedVariableHigh32bits,0x7FFFFFFF // T2
mov SharedVariableLow32bits,0xFFFFFFFF // T2
mov SharedVariableLow32bits,0 // T1

The result is: 0x7FFF_FFFF_0000_0000L.

So the data race caused a serious problem: you can get a value that is completely unexpected and invalid. Using locks prevents this, but a lock alone does not determine the order of execution, i.e. which thread runs first. With a lock you get only two possible interleavings, order 1 and order 2, and never the unexpected values 0x0000_0000_FFFF_FFFFL and 0x7FFF_FFFF_0000_0000L. But if you also need to control which thread executes first and which second, you need not just a lock but also the waiting mechanism that monitors (condition variables) offer.
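Guarding the writes with a lock rules out the torn values; a minimal sketch of that fix (names are mine):

```java
class GuardedLong {
    static final Object lock = new Object();
    static long sharedVariable;

    static void set(long value) {
        synchronized (lock) {        // both 32-bit halves are written while
            sharedVariable = value;  // holding the lock, so no tearing
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> set(0L));
        Thread t2 = new Thread(() -> set(0x7FFF_FFFF_FFFF_FFFFL));
        t1.start(); t2.start();
        t1.join(); t2.join();
        long result;
        synchronized (lock) { result = sharedVariable; }
        // Only order 1 or order 2 remains possible:
        System.out.println(result == 0L || result == 0x7FFF_FFFF_FFFF_FFFFL);
    }
}
```

Note that which of the two final values you get still depends on which thread ran last; the lock removes the data race, not the nondeterminism.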

BTW, according to this article, Java guarantees atomic access to all primitive type variables except long and double. On 64-bit platforms even accesses to long and double should be atomic in practice, but the standard does not guarantee it for plain fields; it does guarantee it for fields declared volatile (JLS 17.7).
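In Java specifically, declaring the field volatile closes the tearing loophole: the JLS (17.7) guarantees that reads and writes of volatile long and double are atomic. A sketch (names are mine; note that volatile makes single reads and writes atomic but does not make compound operations like ++ atomic):

```java
class VolatileLong {
    // volatile: the JLS guarantees this 64-bit write cannot be torn
    static volatile long sharedVariable;

    public static void main(String[] args) throws InterruptedException {
        Thread t1 = new Thread(() -> sharedVariable = 0L);
        Thread t2 = new Thread(() -> sharedVariable = 0x7FFF_FFFF_FFFF_FFFFL);
        t1.start(); t2.start();
        t1.join(); t2.join();
        // Only the two fully written values are possible:
        System.out.println(sharedVariable == 0L
                || sharedVariable == 0x7FFF_FFFF_FFFF_FFFFL);
    }
}
```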

And even if the standard guaranteed atomic access, it would still be better to use locks. Locks establish memory barriers that prevent compiler and CPU reorderings of your code, reorderings that can cause problems if you use plain variables to control the order of execution.

So the simple advice here is: if you are not a concurrent programming expert (and I'm not one either), and you are not writing software that needs absolute maximum performance through lock-free techniques, always use locks, even when accessing variables that are guaranteed to have atomic access.

Timmy_A
  • Atomic access to a memory region is determined by 1) the CPU instruction generated by the compiler for a given high level language operation 2) alignment of the object 3) the specific guarantees of the CPU; the OS doesn't play any part in it – curiousguy Mar 02 '19 at 20:19
  • Even on platforms where one would expect data races to be harmless, optimizers may have other ideas. For example, on some ARM platforms, given `uint16_t a; uint32_t b;`, the most efficient code for `b = a*0x01000001;` would temporarily disturb the value of `b` *even if it already happens to equal `a*0x01000001`*. – supercat Jan 21 '20 at 22:26