5

Here is the original question, though mine differs from it in a few ways: "C++ memory model - does this example contain a data race?"

My question:

//CODE-1: initially, x == 0 and y == 0
if (x) y++; // pthread 1
if (y) x++; // pthread 2

Note: the code above is written in C, not C++ (without a memory model). So does it contain a data race?

From my point of view: under a sequentially consistent memory model there is no data race, because x and y will never both be non-zero at the same time. However, we can never assume a sequentially consistent memory model, so the compiler could apply a reordering that preserves intra-thread correctness, because the compiler isn't aware of the existence of threads... right?

So the code could be transformed to:

//CODE-2
y++; if (!x) y--; // pthread 1
x++; if (!y) x--; // pthread 2

The transformation above doesn't violate single-threaded correctness, so it is valid from the compiler's point of view. It's not the fault of the compiler, right? So I agree with the view that CODE-1 contains a data race. What about you?
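To make the single-threaded equivalence concrete, here is a minimal C sketch. It runs thread 1's statement from CODE-1 and from CODE-2 on private copies of x and y, compares the results, and counts stores to y. The function names and the `stores` counter are hypothetical helpers added only for illustration; they are not part of the original code.

```c
#include <stdbool.h>

/* Counts writes to y. Bookkeeping for this demo only (an assumption,
 * not something the original code contains). */
static int stores;

/* Thread 1 of CODE-1, run in isolation on private copies. */
static int code1_thread1(int x, int y) {
    if (x) { y++; stores++; }
    return y;
}

/* Thread 1 of CODE-2 (the hypothetical transformed form). */
static int code2_thread1(int x, int y) {
    y++; stores++;
    if (!x) { y--; stores++; }
    return y;
}

/* True when both forms produce the same y for a few sample inputs. */
bool same_single_threaded_result(void) {
    for (int x = 0; x <= 2; x++)
        for (int y = 0; y <= 2; y++)
            if (code1_thread1(x, y) != code2_thread1(x, y))
                return false;
    return true;
}

/* Number of stores to y performed by each form for a given x. */
int stores_in_code1(int x) { stores = 0; code1_thread1(x, 0); return stores; }
int stores_in_code2(int x) { stores = 0; code2_thread1(x, 0); return stores; }
```

With x == 0, the CODE-1 form performs no store to y, while the CODE-2 form performs two. The final value is the same in a single thread, but those extra stores are exactly what another thread could observe or collide with.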

I have an extra question: C++11, with its memory model, can solve this data race because compilers are aware of threads and will restrict their reordering according to the memory model, right?

Jongware
Gamer.Godot
  • In a multi-threaded environment, access to shared resources must be protected. This can be done by using a mutex. If `x` and/or `y` are not defined local to the thread (functions), then `x` and/or `y` are shared resources and there exists a race accessing them. – alk Jan 05 '18 at 07:55
  • 1
    Yes, there is a race condition unless both threads synchronise their access to `x` and `y`. And C++11's updated memory model does not change that. – Peter Jan 05 '18 at 07:55
  • @Peter Thx, so you mean that the CODE-1 written in C contains a data race? – Gamer.Godot Jan 05 '18 at 08:06
  • @alk Thank you. So my understanding is right? – Gamer.Godot Jan 05 '18 at 08:07
  • For snippet 1 it is not clear (to me) where and by whom `x` and `y` are initialised. – alk Jan 05 '18 at 08:10
  • If both `x` and `y` are initialised *before* the threads start, there is no race in snippet 1. Snippet 2 allows a race in any case. – alk Jan 05 '18 at 08:11
  • @Liu - definitely. Both threads access the values of `x` and/or `y`. And each thread can be preempted or interrupted by the other, while accessing `x` or `y`. There is therefore a race between accesses of the variables. – Peter Jan 05 '18 at 08:13
  • 1
    If the compiler is free to invent writes out of thin air, then anything and everything is potentially racy, and there's no point reasoning about races in such an implementation. Regardless of the synchronization you use, the compiler can invent a write to the shared resource outside the protected block, and boom, race. – T.C. Jan 05 '18 at 08:14
  • And C11 *does* have a memory model consistent with C++11's. – T.C. Jan 05 '18 at 08:16
  • @Peter, thank you very much. I assume that there are only two threads in the process. So in CODE-1, if the compiler does not reorder anything in the generated code, then x and y will never be written, so there isn't any data race. But we cannot assume that the compiler will never reorder code. So it is hard to say whether there is a data race. Everything depends on the generated code, I think. – Gamer.Godot Jan 05 '18 at 08:19
  • @T.C. Thank you, I couldn't agree with you more. I understood this question just a few minutes ago :-) – Gamer.Godot Jan 05 '18 at 08:23

2 Answers

2

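The C++ standard defines a data race (which triggers undefined behavior) in terms of conflicting evaluations. A data race occurs when two conflicting evaluations in different threads are not ordered by happens-before and at least one of them is not atomic. Conflict is defined as: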

§ 1.10.1-2 [intro.races]
Two expression evaluations conflict if one of them modifies a memory location (..) and the other one reads or modifies the same memory location.

Per the C++ memory model rules, your first code fragment contains no data race because the C++ standard forbids compiler transformations that would introduce such a race:

§ 1.10.1-21 [intro.races]
Compiler transformations that introduce assignments to a potentially shared memory location that would not be modified by the abstract machine are generally precluded by this International Standard, since such an assignment might overwrite another assignment by a different thread in cases in which an abstract machine execution would not have encountered a data race.

So it says that if the condition in the if-statement (x) yields false, no transformation is allowed that would modify y, even if the end result is that y appears unmodified.

The second example clearly contains a data race because two threads can write and read x at the same time (the same applies to y).

Note that both C++ and C have had a memory model since C++11 and C11 respectively. If you use a compiler that does not support C11, multithreaded behavior is not defined by the language standard.
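As a hedged illustration of what C11 makes available, the sketch below rewrites CODE-1 with `atomic_int` from `<stdatomic.h>` and POSIX threads. The helper names (`t1_body`, `t2_body`, `run_code1_atomic`) are hypothetical. Atomic operations cannot take part in a data race, and with the default sequentially consistent ordering neither load can observe a non-zero value, so both variables are expected to remain 0.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Shared variables, now atomic (an assumption: the original uses plain int). */
static atomic_int x, y;

/* pthread 1: if (x) y++; */
static void *t1_body(void *arg) {
    (void)arg;
    if (atomic_load(&x)) atomic_fetch_add(&y, 1);
    return NULL;
}

/* pthread 2: if (y) x++; */
static void *t2_body(void *arg) {
    (void)arg;
    if (atomic_load(&y)) atomic_fetch_add(&x, 1);
    return NULL;
}

/* Runs both threads once; returns true if x and y both ended at 0. */
bool run_code1_atomic(void) {
    pthread_t t1, t2;
    atomic_store(&x, 0);
    atomic_store(&y, 0);
    if (pthread_create(&t1, NULL, t1_body, NULL) != 0) return false;
    if (pthread_create(&t2, NULL, t2_body, NULL) != 0) {
        pthread_join(t1, NULL);
        return false;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    return atomic_load(&x) == 0 && atomic_load(&y) == 0;
}
```

On a typical toolchain this would be built with something like `cc -std=c11 -pthread`.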

Here is a question that shows an example of an illegal compiler transformation.

LWimsey
  • Thank you. "Per the C++ memory model rules, there is no data race in the first code fragment (it has nothing to do with a Sequentially Consistent memory model though)." Do you mean that the C++ code will not have a data race, and that this is unrelated to the sequentially consistent memory model? I don't quite follow. Also, what is the key difference between pre-C11 C and C11 for this question? I don't have any knowledge about it. – Gamer.Godot Jan 05 '18 at 08:32
  • 1
    @LiuShenming I removed the remarks about seq/cst memory model since it did not really add anything useful – LWimsey Jan 05 '18 at 09:13
  • I'd like to add a few remarks that seem to be unclear to many people: A data race is Undefined Behavior. The race as mentioned in [intro.races] refers to non-atomic variables. There cannot be a race when you use atomic variables. Of course it can be that depending on the kernel scheduler sometimes one thread runs faster and other times another and that as a result the output of the program differs from run to run, but that is not a "data race" and is not UB. Hence, if in the given code snippet x or y is non-atomic, we have a data-race (assuming no happens-before relationship, i.e. by using a – Carlo Wood Dec 08 '18 at 22:54
  • mutex around the given code). If x and y are atomic then we don't. Normally the given code is used to ask the question "Can the result of this be that both x and y are 1 at the end of execution?" This question is NOT about races though. The given code used the default sequential-consistent memory accesses to x and y, which makes it almost trivial to reason about: CODE-1 will always result in 0, 0. The compiler alteration of CODE-2 is never allowed, because x and y are atomic and that totally changes how x and y are manipulated. Note that if the reads in the if() conditionals are done relaxed – Carlo Wood Dec 08 '18 at 23:00
  • then the result x==1 and y==1 at the end *does* become a possibility. – Carlo Wood Dec 08 '18 at 23:01
  • @CarloWood The question was based on code from Stroustrup's [FAQ](http://www.stroustrup.com/C++11FAQ.html#memory-model) where both variables are declared `int` – LWimsey Dec 08 '18 at 23:38
1

//CODE-1: initially, x == 0 and y == 0
if (x) y++; // pthread 1
if (y) x++; // pthread 2

There is no undefined behavior because neither x nor y will ever change their value.
However, there is still a race condition, because there is no defined sequence between read access in one thread and write access in the other one.

//CODE-2
y++; if (!x) y--; // pthread 1
x++; if (!y) x--; // pthread 2

Now you have a data race and undefined behavior because there is no sequence between y++ in thread 1 and if(!y) in thread 2 and vice versa. So possible results for y are:

  • y = 0
    Thread 1 runs after thread 2. So x is still 0.
  • y = 1
    Thread 1 runs in parallel to thread 2, sees the change to x but not vice versa. So y is not decremented.

This has nothing to do with the memory model. It is just a race in any unsynchronized context.
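For contrast, here is a minimal sketch of CODE-2 with both threads serialized by a pthread mutex, assuming x and y are plain `int` globals. The helper names (`locked_t1`, `locked_t2`, `run_code2_locked`) are hypothetical. With the lock held around each statement pair, the accesses are ordered, and each thread undoes its own increment, so both variables should end at 0.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

static int x, y;                                   /* shared, non-atomic */
static pthread_mutex_t xy_lock = PTHREAD_MUTEX_INITIALIZER;

/* pthread 1: y++; if (!x) y--;  (under the lock) */
static void *locked_t1(void *arg) {
    (void)arg;
    pthread_mutex_lock(&xy_lock);
    y++;
    if (!x) y--;
    pthread_mutex_unlock(&xy_lock);
    return NULL;
}

/* pthread 2: x++; if (!y) x--;  (under the lock) */
static void *locked_t2(void *arg) {
    (void)arg;
    pthread_mutex_lock(&xy_lock);
    x++;
    if (!y) x--;
    pthread_mutex_unlock(&xy_lock);
    return NULL;
}

/* Runs both threads once; returns true if x and y both ended at 0. */
bool run_code2_locked(void) {
    pthread_t t1, t2;
    x = 0;
    y = 0;
    if (pthread_create(&t1, NULL, locked_t1, NULL) != 0) return false;
    if (pthread_create(&t2, NULL, locked_t2, NULL) != 0) {
        pthread_join(t1, NULL);
        return false;
    }
    pthread_join(t1, NULL);
    pthread_join(t2, NULL);
    /* Reading after join is safe: join orders the threads' writes before us. */
    return x == 0 && y == 0;
}
```

Whichever thread takes the lock first sees the other variable still at 0 and reverts its own increment, so the second thread observes the same starting state.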

Marcel
  • The definition of a race condition, in this case, is that any accessing of `x` or of `y` by one thread can be interrupted or preempted by another thread that also accesses `x` or `y`. Even if the net effect of the code is that nothing ever changes, there is a race condition. – Peter Jan 05 '18 at 08:02
  • @Peter, you are right, there is no defined happens before relation. I will change the wording. – Marcel Jan 05 '18 at 08:04
  • @Marcel But without a memory model specified at the language level, the compiler may perform a reordering that introduces a data race, so programmers cannot make any assumption about whether their code contains a data race (e.g. CODE-1). So I think it is related to the memory model: C doesn't have a memory model, so programmers and the compiler have no consensus on whether CODE-1 contains a data race. – Gamer.Godot Jan 05 '18 at 08:12
  • @Peter After thinking twice I am no longer that sure. We are talking about unreachable code here. A smart compiler might have analyzed all accesses to x and y and found out that no write access to either variable is possible. And without any write there is of course no race, is there? – Marcel Jan 05 '18 at 08:14
  • Imagine a third thread that also accesses `x` and `y` that is, for example, in a library function (so the compiler has no visibility of it). – Peter Jan 05 '18 at 08:15
  • @Peter I think we can't determine this from the given context. x and y could be private. They could also be local variables only captured by a closure. In this case no other access is possible. – Marcel Jan 05 '18 at 08:21
  • 4
    @Marcel Your comments about the second code fragment are incorrect. That is not a race condition, but conflicting access to shared data, i.e. a data race. It does not produce undefined results, but undefined behavior, which means reasoning about how threads are interleaved is not applicable. – LWimsey Jan 05 '18 at 08:33
  • 1
    As everyone seems not to say "There is no data race period, because x and y are atomic", I have to assume that x and y are non-atomic. In that case this code DOES have a data race and is Undefined Behavior period. It really doesn't matter (for the abstract machine of the memory model) that there are if statements here :/. – Carlo Wood Dec 08 '18 at 23:08