3

This question is a continuation of my previous question, from which I understood that 'A data race is a property of an execution, not of the program in the abstract'. I took this to mean that as long as 2 threads don't access the shared variable (with at least one being a write access), in reality and not just theoretically, the behaviour of the program will be well defined.

In light of the above understanding, I want to discuss the following program:

#include <cstdint>   // uintptr_t
#include <iostream>
#include <thread>
#include <unistd.h>  // sleep

constexpr int sleepTime = 10;

void func(int* ptr) {
    sleep(sleepTime);
    std::cout<<"going to delete ptr: "<<(uintptr_t)ptr<<"\n";
    delete ptr;
    std::cout<<"ptr has been deleted\n";
}

int main() {

    int* l_ptr = new int(5);

    std::thread t(func, l_ptr);

    t.detach();


    std::cout<<"We have passed ptr: "<<(uintptr_t)l_ptr<<" to thread for deletion. Val at ptr: "<<*l_ptr<<"\n";

    std::cin.get();

}

The above program contains a data race iff the 'main thread' and the 'child thread' happen to access the shared variable at the same time.

However, isn't it reasonable to say, at least, that this is highly unlikely to happen in reality when working with multi-core CPUs?

Vishal Sharma
  • `The above program contains a data race iff the 'main thread' and the 'child thread' happens to access the shared variable at the same time.` Not true. The data race occurs if the pointer is read from during or after the pointer is deleted. It does not need to be "at the same time." It is generally poor design to rely on `sleep()` or execution times to be the synchronization mechanism. This is what mutexes are for. – Matt Aug 03 '23 at 15:10
  • @Matt agreed that a read during/after the delete is undefined behaviour... but on any multi-core CPU that I've worked with (Intel/AMD), I've never seen a thread take multiple seconds to execute the next instruction... I just want to understand whether this is undefined behaviour only when, due to some issue, the 'main-thread' takes 'x' seconds to execute the 'cout' statement (x being the sleep time), or not? – Vishal Sharma Aug 03 '23 at 15:16
  • Remember that if your program contains undefined behaviour (like a data race) *anywhere* then the compiler is under no obligation to generate something meaningful for the *entire program*. UB in one corner can lead to broken code in an entirely different part of the program. UB *anywhere* == broken program. Full stop. – Jesper Juhl Aug 03 '23 at 15:21
  • Is sleeping for 1 nanosecond safe? 10 nanoseconds? Is there a `sleepTime` which is `safe` but `sleepTime-1ns` is "undefined behavior" as you call it? Also, why sleep 10 seconds when one can use mutexes and not "waste" so many CPU cycles? This is not directly answering your question, but these are the types of questions that are more useful/practical IMHO. – Matt Aug 03 '23 at 15:22
  • @JesperJuhl Only if it has UB on all paths, which this code does. UB that is avoidable doesn't invalidate the whole program, only the portion from which UB becomes unavoidable. – François Andrieux Aug 03 '23 at 15:24
  • A race condition is not something that might occur if timing is unfortunate. It is a property of the code, whether the timing of your threads is favorable or not. `sleep` helps make the timing favorable, but the code as-is has a race condition because it meets the definition of what a race condition is. – François Andrieux Aug 03 '23 at 15:26
  • "However, isn't it reasonable to say atleast that this is highly unlikely to happen in reality while working with multi-core CPUs." - Likely or unlikely doesn't matter. If the compiler sees that you have a data race, that may cause it to optimize your program in ways you did not anticipate and you are left with a program that doesn't do what you intended, and the compiler is entirely within its right to do that. – Jesper Juhl Aug 03 '23 at 15:54
  • @JesperJuhl I fail to understand why this same reasoning is not applied in the program I've linked. In that program, there is a data race in one condition(input==42). Why can't it be said that since the program has a data race in one condition therefore the whole program(in all scenarios) is undefined? – Vishal Sharma Aug 03 '23 at 16:15
  • When you use std::thread you should also learn about std::mutex, std::scoped_lock and std::atomic. You also cannot assume 10ms is really 10ms (on a non-realtime OS). And you intrinsically have data races when pieces of code are not atomic (CPU instructions). Highly unlikely is not good enough for production code; your program either has the potential for data races or it doesn't (there is no halfway) – Pepijn Kramer Aug 03 '23 at 17:07
  • @PepijnKramer if there is no half way then what about the other program that I mentioned in the question(in the link)? If I never give input==42, then can it ever be undefined? – Vishal Sharma Aug 03 '23 at 17:17
  • That example has a `std::thread::detach`, which is an immediate no-go. The thread will keep running while your program is shutting down. NEVER do that; always join your threads at shutdown (and yes, that might require cooperative cancellation). Read up on C++20's [std::jthread](https://en.cppreference.com/w/cpp/thread/jthread) to learn more. Multithreaded programming needs very careful design and implementation to get fully right (it is not a free lunch). Be prepared to spend time on mastering it. – Pepijn Kramer Aug 03 '23 at 17:23
  • @PepijnKramer you're not correct that 'program is shutting down'. The function called by the main thread at the end does not return. – Vishal Sharma Aug 03 '23 at 17:24
  • Ok I missed that, but still. One lesson I learned 25 years ago already, and take to heart daily. When doing multithreading shutting down correctly is just as important as starting up (and running without deadlocks/raceconditions). – Pepijn Kramer Aug 03 '23 at 17:28
  • @VishalSharma The difference is that the C++ standard defines the behaviour of the execution without the data race, but doesn't define the behaviour when there is one. So the compiler has to treat non-42 inputs correctly. This program has no conditions guarding the race, so the compiler can do *absolutely anything* – Caleth Aug 04 '23 at 09:43
  • One way of looking at it is that if the compiler can see that simultaneous access is theoretically possible (e.g., because one thread doesn't get scheduled for a long time, and eventually wakes up at the same time as the other thread that was sleeping) then it is allowed to treat your program as UB and you may end up with unpredictable behaviour. The only way to make it "not theoretically possible" is to ensure that you set up a true happens-before relationship as discussed by the answers. – Brian Bi Aug 04 '23 at 12:05

2 Answers

10

The above program contains a data race, full stop.

There are accesses to the same object on two threads that are unsequenced to one another. It isn't a question of "at the same time".

The standard defines a relation Happens-Before, which only barely relates to the wall-clock time when things happen.

Undefined behaviour doesn't mean "bad things are observed to happen". It means you can't reason about what you observe to happen from the rules of C++.
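
To illustrate what a real Happens-Before edge looks like, here is a minimal sketch (not the only way to do it) that replaces the `sleep` with a `std::promise`/`std::future` pair. The function name `read_then_delete` and the variable names are made up for this example. The call to `set_value()` synchronizes with the return of `wait()`, so the read in the calling thread happens before the `delete` in the worker, regardless of wall-clock timing.

```cpp
#include <future>
#include <thread>

// Hypothetical helper: reads *ptr in the calling thread, then lets a worker delete it.
int read_then_delete(int* ptr) {
    std::promise<void> done_reading;                        // signalled once the read is finished
    std::future<void> may_delete = done_reading.get_future();

    std::thread worker([ptr, &may_delete] {
        may_delete.wait();  // returns only after set_value() has been called
        delete ptr;         // ordered after the read below by the promise/future pair
    });

    int value = *ptr;          // sequenced before set_value() in this thread
    done_reading.set_value();  // synchronizes with the worker's wait() returning
    worker.join();             // joined rather than detached, so locals outlive the worker
    return value;
}
```

Calling `read_then_delete(new int(5))` returns 5, and no execution of it contains a data race, because the ordering comes from the synchronization and not from how long either thread happens to take.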

Caleth
  • So you disagree with the statement: "A data race is a property of an execution, not of the program in the abstract"? – Vishal Sharma Aug 03 '23 at 15:08
  • @VishalSharma You've supplied a program for which all possible executions have a data race – Caleth Aug 03 '23 at 15:11
  • how is it a data race in all possible executions if the main-thread can access the shared memory before the expiry of x seconds (x being the sleep time of the child thread)? – Vishal Sharma Aug 03 '23 at 15:18
  • @VishalSharma: Because a data race is not just accessing stuff "at the same time". – Nicol Bolas Aug 03 '23 at 15:20
  • @VishalSharma A "data race" has a strict definition, which boils down to "two threads access a non-atomic variable, at least one of them is writing to it, and neither access *happens before* the other", where "happens before" can only be caused by mutexes, some kinds of atomic operations, and so on. A sleep doesn't cause "happens before". – HolyBlackCat Aug 03 '23 at 15:22
  • @HolyBlackCat in your view, does the program in the link added to the question have defined behaviour even when input != 42? – Vishal Sharma Aug 03 '23 at 15:26
  • @VishalSharma There's an important difference between an `if` causing a branch to not be taken, and the implementation's thread scheduler *normally* running one thread before another – Caleth Aug 03 '23 at 15:29
  • @VishalSharma That code has no race condition as long as the input is never 42. `cin` and `cout` have special rules that require that concurrent access to them doesn't cause a race condition, though the input and output might appear mangled. – François Andrieux Aug 03 '23 at 15:31
  • @Caleth you agreed that data race is a property of execution. Which means that something happens at run time which causes undefined behaviour. In the other program that 'something' was user giving 42 as input. My question is this: what happens at runtime in this program which causes data race in 'those scenarios' where the wall-clock timing of access of the shared variable by the 2 threads is different. – Vishal Sharma Aug 03 '23 at 16:23
  • @HolyBlackCat you're not in agreement with Caleth in that case... Because as per him, that program is well defined as long as input 42 is not given – Vishal Sharma Aug 03 '23 at 17:15
  • @VishalSharma Ah, yep, my bad. He's right. – HolyBlackCat Aug 03 '23 at 17:17
  • @VishalSharma The execution of the program. In your previous example there was a condition, such that for non-42 input, there was no data race. In this example, there are no inputs that lead to defined behaviour. – Caleth Aug 03 '23 at 17:21
  • Understood. Thanks a lot – Vishal Sharma Aug 03 '23 at 17:21
  • As an aside, dereferencing a pointer to see whether it has been deleted elsewhere isn't a good way of detecting if one thread is faster than another, because that's also undefined behaviour in the case it has been deleted – Caleth Aug 03 '23 at 17:38
6

my understanding became that 'A data race is a property of an execution, not of the program in the abstract' which means that as long as 2 threads don't access the shared variable (with at least one being a write access), in reality and not just theoretically, then the behaviour of the program will be well defined.

Your understanding is correct. Your application of that understanding is not:

The above program contains a data race iff the 'main thread' and the 'child thread' happen to access the shared variable at the same time.

Data races are not about accessing "the shared variable at the same time". They are about accessing memory without proper synchronization, such that there is no "happens-before" relationship between the two accesses.

There is no such relationship between the deletion of the pointer in the other thread and its use in the main thread. Therefore, this code has a data race. That data race is a property of its execution, and the execution has a data race because that execution does not contain anything that would prevent the data race from happening. It has two accesses, one of which is a write, which have no happens-before ordering between them.

Therefore, there is a data race.

A conditional data race would look something like this:

#include <cstdint>   // uintptr_t
#include <iostream>
#include <thread>
#include <unistd.h>  // sleep
#include <atomic>    // std::atomic wait/notify_all require C++20

std::atomic<bool> flag = false;

constexpr int sleepTime = 10;

void func(int* ptr) {
    sleep(sleepTime);
    std::cout<<"going to delete ptr: "<<(uintptr_t)ptr<<"\n";
    delete ptr;
    flag = true;
    flag.notify_all();
    std::cout<<"ptr has been deleted\n";
}

int main() {

    int* l_ptr = new int(5);

    std::thread t(func, l_ptr);

    t.detach();

    int i = 0;
    std::cin >> i;
    if(i == 5)
    {
        flag.wait(false);  // block until the other thread sets flag to true
    }

    std::cout<<"We have passed ptr: "<<(uintptr_t)l_ptr<<" to thread for deletion. Val at ptr: "<<*l_ptr<<"\n";

    std::cin.get();

}

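This will perform proper synchronization, creating a happens-before relationship, but only if the user enters the value "5". If they enter anything else, then there is a data race. (Even when the user enters "5", dereferencing the pointer after it has been deleted is still undefined behaviour; it is a use-after-free rather than a data race.)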
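
For contrast, here is a minimal sketch of an execution with no data race and no use-after-free, using only the ordering that `std::thread` itself provides. The name `read_before_spawn` is made up for this example. The completion of the `std::thread` constructor synchronizes with the start of the thread function, so anything the calling thread did before constructing the thread happens before the `delete`.

```cpp
#include <thread>

// Hypothetical helper: reads *ptr before the deleting thread even exists.
int read_before_spawn(int* ptr) {
    int value = *ptr;  // sequenced before the thread is constructed

    // Completion of the constructor synchronizes with the start of the lambda,
    // so the read above happens before the delete below.
    std::thread worker([ptr] { delete ptr; });

    worker.join();  // the worker's completion synchronizes with join() returning
    return value;   // ptr is never touched again after the delete
}
```

Calling `read_before_spawn(new int(7))` returns 7. No `sleep` is involved; the guarantee comes from the happens-before edges alone.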

Nicol Bolas
  • @FrançoisAndrieux in general, it is. This specific example has an unconditional data race, but the linked question is an example of a value-conditional race – Caleth Aug 03 '23 at 15:31
  • @Caleth I see, I misunderstood what was meant. Thanks for clarifying. I thought it was referring to the incorrect notion of a race condition depending on the specific timing of one execution vs another. – François Andrieux Aug 03 '23 at 15:33