I have a program in which many threads perform some computations and each writes the boolean value true into a shared array to tag the corresponding item as "dirty". ThreadSanitizer reports this as a data race. However, the flags are never read by these threads, and every thread writes the same value, so I wonder whether this data race is actually problematic.

Here is a minimal working example:

```cpp
#include <array>
#include <cstdio>
#include <thread>
#include <vector>

int
main()
{
  constexpr int N = 64;

  std::array<bool, N> dirty{};
  std::vector<std::thread> threads;
  threads.reserve(3 * N);

  for (int j = 0; j != 3; ++j)
    for (int i = 0; i != N; ++i)
      threads.emplace_back([&dirty, i]() -> void {
        if (i % 2 == 0)
          dirty[i] = true; // data race here.
      });

  for (std::thread& t : threads)
    if (t.joinable())
      t.join();

  for (int i = 0; i != N; ++i)
    if (dirty[i])
      printf("%d\n", i);

  return 0;
}
```

When this is compiled with `g++ -fsanitize=thread`, ThreadSanitizer reports a data race on the marked line. Under which conditions can this be an actual problem, i.e. when could the dirty flag for an item i end up with an unexpected value?

Julien
  • _"...When an evaluation of an expression __writes__ to a memory location and another evaluation reads or __modifies__ the same memory location, ..."_ [Threads and data races](https://en.cppreference.com/w/cpp/language/memory_model#Threads_and_data_races). I am not sure whether a write of `true` followed, on a 2nd thread, by another write of `true` counts as "modifies" or not. If the language had said "writes", I would say data race and UB, but having read the section I am not sure. – Richard Critten Nov 02 '22 at 16:58
  • `dirty[i] = true;` is not atomic. You need something to protect against concurrent access. Have a look at std::mutex/std::scoped_lock – Pepijn Kramer Nov 02 '22 at 16:59
  • Side note: https://godbolt.org/z/frGhKznej says you've got something else to worry about. – user4581301 Nov 02 '22 at 17:00
  • *"Under which conditions can this be an actual problem"* It is automatically a problem. The language forbids writing code that would allow one thread to write a value to an object which may be read from or written to by another thread without synchronization. – François Andrieux Nov 02 '22 at 17:00
  • Also, it looks like you're creating way more threads than you have cores, which is a waste of resources and will not make things go faster in the end. Look at [std::thread::hardware_concurrency](https://en.cppreference.com/w/cpp/thread/thread/hardware_concurrency) and try to find a solution that uses at most that many threads. – Pepijn Kramer Nov 02 '22 at 17:02
  • Any data race is problematic and should be avoided if you want consistent results. – Pepijn Kramer Nov 02 '22 at 17:03
  • The C++ Standard uses "modifies", not "writes" (see above comment): [\[intro.races #2\]](https://eel.is/c++draft/intro.races#2) _"...Two expression evaluations conflict if one of them modifies a memory location ([intro.memory]) and the other one reads or modifies the same memory location...."_ So this is undefined behaviour. Having UB means that the standard imposes no requirements on the behaviour of the whole program. As there is no point reasoning about UB, all you can do is remove it. – Richard Critten Nov 02 '22 at 17:04
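
One way to follow the advice in the comments is to make each flag a `std::atomic<bool>` and to cap the number of threads at `std::thread::hardware_concurrency()`. The following is only a minimal sketch under those assumptions, not a definitive fix: the function name `mark_even_dirty` is hypothetical, and the choice of relaxed ordering assumes, as in the question, that the flags are read only after all threads have been joined.

```cpp
#include <algorithm>
#include <array>
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

constexpr std::size_t N = 64;

// Hypothetical helper (not from the question): several threads tag the even
// items as dirty, then the indices of all dirty items are returned.
std::vector<std::size_t> mark_even_dirty()
{
  // One atomic flag per item. Concurrent atomic stores to the same flag
  // are not a data race, even when several threads store the same value.
  std::array<std::atomic<bool>, N> dirty;
  for (auto& flag : dirty)
    flag.store(false, std::memory_order_relaxed); // explicit reset before any thread starts

  // hardware_concurrency() may return 0 if the value is unknown; fall back to 1.
  const unsigned worker_count = std::max(1u, std::thread::hardware_concurrency());

  std::vector<std::thread> threads;
  threads.reserve(worker_count);

  for (unsigned w = 0; w != worker_count; ++w)
    threads.emplace_back([&dirty] {
      // Every worker visits every item, so each even flag is written by
      // several threads, mirroring the overlap in the question.
      for (std::size_t i = 0; i != N; ++i)
        if (i % 2 == 0)
          dirty[i].store(true, std::memory_order_relaxed);
    });

  // join() synchronizes with the completion of each thread, so the loads
  // below observe every store made by the workers.
  for (std::thread& t : threads)
    t.join();

  std::vector<std::size_t> result;
  for (std::size_t i = 0; i != N; ++i)
    if (dirty[i].load(std::memory_order_relaxed))
      result.push_back(i);
  return result;
}
```

Calling `mark_even_dirty()` returns the even indices from 0 to 62. Relaxed ordering is enough here only because nothing else is published through the flags; if a reader thread had to see other data written before a flag was set, release/acquire ordering or a mutex would be the safer choice.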
