
The following program contains a data race:

#include <iostream>
#include <thread>
#include <chrono>
#include <string>

using namespace std::chrono_literals;

int sharedVar = 42;

void func(const std::string& threadName) {
    int input;

    while(true) {
        std::cout<<"["<<threadName<<"]Enter a number: "<<std::endl;
        std::cin>>input;

        if(input == 42) {
            sharedVar++;
        }
        else {
            std::this_thread::sleep_for(1s);
        }
    }
}


int main() {

    std::thread t(func, "thread");
    t.detach();

    func("main");
}

Both threads are accessing sharedVar without any synchronisation if they are given input == 42 from the user.

As per cppreference,

If a data race occurs, the behavior of the program is undefined.

Is the above program guaranteed to work fine (without any undefined behaviour) if input == 42 is never provided?

or

Does the scenario of execution (whether or not 42 is provided as input at runtime) not matter at all, so that the behaviour of the program is always undefined?

I ask this question because sometimes, during the discussion on data races, the discussion goes into the details of execution; e.g. in the above case whether the user will provide 42 or not in the input.

Edit: Just wanted to add this article I found online, which seemed relevant. As I understand it, the input that is given does not matter; the behaviour remains undefined.

Edit 2: Basically, the type 2 functions mentioned here are what I'm interested in.

Vishal Sharma
  • 6
    How do you provide `input == 42` when the program's not running? – Friedrich Apr 04 '23 at 06:44
  • 4
    Yes, it occurs at runtime. Yes, UB only happens for a specific input here. Compare this with `gets()`, which also always causes UB on *some* inputs, and works fine on others (or used to, before it was removed). – HolyBlackCat Apr 04 '23 at 06:45
  • 5
    Undefined behavior is not something that "occurs". It is a property of the source code; the language definition does not impose any requirements on what a program whose behavior is undefined should do. The program in the question has a data race, therefore its behavior is undefined. The language definition does not tell you what the program should do. You can certainly **guess** (and you'd probably be right) that the program will behave as you expect it to as long as nobody inputs 42, but if it doesn't you can't complain that the compiler has violated the language definition. – Pete Becker Apr 04 '23 at 13:32
  • 4
    Voted to reopen. I have no idea what prompted closing this clear, well-stated question. – Pete Becker Apr 04 '23 at 13:34
  • The behaviour is undefined if *both* threads have input of `42`, not just one. – Caleth Apr 05 '23 at 09:21
  • @PeteBecker `The program in the question has a data race, therefore its behavior is undefined` I do not understand that part. Does `int main(int argc, char *argv[]) { return !argv[1-argc]; }` _always_ have undefined behavior? Such a program, run with no arguments (argc=1), is guaranteed to return 0. That one code path has undefined behavior doesn't mean all program executions have undefined behavior. `you can't complain that the compiler has violated the language definition` I do not follow, why not? – KamilCuk Apr 05 '23 at 09:59

1 Answer

The following program contains a data race

Only for some input values: both threads need to see an input of 42, not just one of them.

A data race is a property of an execution, not of the program in the abstract. From [intro.races#21]:

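The execution of a program contains a data race if it contains two potentially concurrent conflicting actions, at least one of which is not atomic, and neither happens before the other, except for the special case for signal handlers described below. Any such data race results in undefined behavior.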

If yes, then is the above program guaranteed to work fine (without any undefined behaviour) if input == 42 is never provided?

Yes. It will continue to loop forever prompting for and accepting input.
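If the aim is for the program to be defined for every input, including 42 on both threads, one option is to make the shared variable atomic, so that the two increments are no longer conflicting non-atomic actions. The following is only a minimal sketch of that idea: `sharedCounter`, `bumpCounter` and `runBothThreads` are hypothetical names that do not appear in the question, and the increment loop stands in for the `input == 42` branch.

```cpp
#include <atomic>
#include <thread>

// Hypothetical replacement for the plain `int sharedVar` in the question.
std::atomic<int> sharedCounter{42};

// Stand-in for the `input == 42` branch: performs `times` atomic increments.
// Atomic read-modify-write operations from different threads do not race.
void bumpCounter(int times) {
    for (int i = 0; i < times; ++i) {
        sharedCounter.fetch_add(1, std::memory_order_relaxed);
    }
}

// Runs two threads that both take the "42" branch and returns the final value.
int runBothThreads(int timesPerThread) {
    sharedCounter.store(42);
    std::thread a(bumpCounter, timesPerThread);
    std::thread b(bumpCounter, timesPerThread);
    a.join();  // join() synchronizes with the completion of each thread,
    b.join();  // so the load below sees every increment.
    return sharedCounter.load();
}
```

Calling `runBothThreads(1000)` should return 2042, since no increment can be lost. A `std::mutex` around a plain `int` would be another way to get the same guarantee.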

Aside: I would be wary of supplying functions with reference parameters to std::thread::thread, although in this case it is fine, because you pass a const char[], not an rvalue std::string.
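One reason for caution with reference parameters is that `std::thread` copies (or moves) its arguments into storage owned by the new thread, so a `const std::string&` parameter normally binds to that copy rather than to the caller's object, unless the argument is wrapped in `std::ref` or `std::cref`. The sketch below illustrates this; `refersToCallerObject` is a hypothetical helper written for this illustration, not something from the question.

```cpp
#include <functional>
#include <string>
#include <thread>

// Returns true if the thread function's reference parameter refers to the
// caller's own std::string object, false if it refers to a separate copy.
bool refersToCallerObject(bool useRefWrapper) {
    std::string name = "worker";
    bool same = false;

    // The comparison runs inside the new thread, while both objects are alive.
    auto record = [&same, &name](const std::string& s) {
        same = (&s == &name);
    };

    if (useRefWrapper) {
        std::thread t(record, std::cref(name));  // parameter binds to `name` itself
        t.join();
    } else {
        std::thread t(record, name);             // parameter binds to the thread's copy
        t.join();
    }
    return same;  // safe to read: join() synchronizes with the thread's completion
}
```

Without the wrapper the function reports `false`; with `std::cref` it reports `true`, and in that case the caller has to keep `name` alive for as long as the thread uses it.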

Caleth
  • I was going through the following article: https://bartoszmilewski.com/2020/08/11/benign-data-races-considered-harmful/ Would you disagree with the following statement towards the end of it: "If the programmer doesn’t specifically mark shared variables as atomic, the compiler is free to optimize code as if it were single-threaded." As I understand it, the mere presence of a data race in any possible scenario can render the whole program undefined (irrespective of whether that scenario happens or not). – Vishal Sharma Apr 05 '23 at 12:29
  • 1
    @VishalSharma no, there is only a data race in *this particular* scenario (i.e. sequence of input values), and so only in *this particular* case does the standard place no requirements on the behaviour of the program – Caleth Apr 05 '23 at 12:33
  • @VishalSharma and, whilst the implementation is *permitted* to do wild and wacky things if both threads see `42`, what you will most likely see is as-if you had written `if(input != 42) std::this_thread::sleep_for(1s);`, because `sharedVar` doesn't participate in the observable behaviour of the program – Caleth Apr 05 '23 at 12:57
  • My understanding (mental model) until now has been that the compiler always expects a data-race-free program, and if the program is not data-race free then, because of the compiler's or processor's optimisations, the program can exhibit unexpected results (irrespective of whether the code path containing the data race is executed or not). I can see that you disagree with this, so I am waiting for other answers. – Vishal Sharma Apr 05 '23 at 13:03
  • @VishalSharma yes, the compiler can assume that `sharedVar++` is not encountered by both threads. It can't assume that `std::this_thread::sleep_for(1s);` is not encountered by both threads. – Caleth Apr 05 '23 at 13:27
  • @VishalSharma I have quoted the definition of a data race, it is particular to executing the program. See also [`[intro.abstract]`](https://timsong-cpp.github.io/cppwp/n4868/intro.abstract#5) – Caleth Apr 05 '23 at 13:36
  • "this document places no requirement on the implementation executing that program with that input (not even with regard to operations preceding the first undefined operation)." Won't the statement inside the brackets render the whole program undefined, as there are no requirements even on the operations preceding the undefined operation? – Vishal Sharma Apr 05 '23 at 13:50
  • 1
    @VishalSharma **executing that program with that input** If you don't execute it with that input, the behaviour is defined – Caleth Apr 05 '23 at 13:52
  • Since we are talking about the operations 'preceding' the first undefined operation, I would think that everything which happens before the user gives 42 as input to both threads is being called undefined in the excerpt. – Vishal Sharma Apr 05 '23 at 14:03
  • @VishalSharma it is only undefined if 42 is input. In that case, the observable behaviour of the whole program is undefined. In the other case, the observable behaviour of the whole program *is* defined. That puts a practical limit on the specific symptoms in the undefined behaviour case. – Caleth Apr 05 '23 at 14:09
  • https://blog.regehr.org/archives/213 I think the type 2 category(described here) is what I am interested in. The sense I got from here is that compiler might make assumptions about the program being UB free during compilation. Now what the results of such assumptions can be on the actual binary that's generated, I'm still not sure about that. – Vishal Sharma Apr 13 '23 at 05:34
  • @VishalSharma yes. See the part in Type 2 where that article says: "Case 1: `(b != 0) && (!((a == INT32_MIN) && (b == -1)))` Behavior of `/` operator is defined: Compiler is obligated to emit code computing `a / b`" – Caleth Apr 13 '23 at 08:32
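The precondition quoted in the last comment can be checked before dividing, so that the `/` operator is only ever evaluated with operands for which its behaviour is defined. The following is a minimal sketch under that assumption; `divisionIsDefined` and `checkedDivide` are hypothetical names, and `std::optional` requires C++17.

```cpp
#include <cstdint>
#include <optional>

// True exactly when the quoted precondition holds:
// (b != 0) && !((a == INT32_MIN) && (b == -1))
bool divisionIsDefined(std::int32_t a, std::int32_t b) {
    return (b != 0) && !((a == INT32_MIN) && (b == -1));
}

// Divides only when the behaviour of `/` is defined; otherwise returns an
// empty optional instead of evaluating the undefined expression.
std::optional<std::int32_t> checkedDivide(std::int32_t a, std::int32_t b) {
    if (!divisionIsDefined(a, b)) {
        return std::nullopt;
    }
    return a / b;
}
```

For inputs that satisfy the precondition, the compiler has to produce the quotient; the two excluded cases (division by zero and `INT32_MIN / -1`) are the ones for which the standard places no requirements.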