8

The below code produce a warning when running with thread sanitizer on macOS. I can't see where the race is. The control block of shared_ptr and weak_ptr is thread safe, and pushing and popping from the std::queue is done with a lock held.

#include <future>
#include <memory>
#include <queue>

class Foo {
public:
  Foo() {
    fut = std::async(std::launch::async, [this] {
      while (!shouldStop) {
        std::scoped_lock lock(mut);
        while (!requests.empty()) {
          std::weak_ptr<float> requestData = requests.front();
          requests.pop();
          (void)requestData;
        }
      }
    });
  }

  ~Foo() {
    shouldStop.store(true);
    fut.get();
  }

  void add(const std::weak_ptr<float> subscriber) {
    std::scoped_lock lock(mut);
    requests.push(subscriber);
  }

private:
  std::atomic<bool> shouldStop = false;
  std::future<void> fut;
  std::queue<std::weak_ptr<float>> requests;
  std::mutex mut;
};

int main() {
  Foo foo;

  int numIterations = 100000;

  while (--numIterations) {
    auto subscriber = std::make_shared<float>();

    foo.add(subscriber);
    subscriber.reset();
  }

  std::this_thread::sleep_for(std::chrono::seconds(1));
}

Warning with stacktrace:

WARNING: ThreadSanitizer: data race (pid=11176)
  Write of size 8 at 0x7b0800000368 by thread T1 (mutexes: write M16):
    #0 operator delete(void*) <null>:1062032 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x4f225)
    #1 std::__1::__shared_ptr_emplace<float, std::__1::allocator<float> >::__on_zero_shared_weak() new:272 (minimal:x86_64+0x1000113de)
    #2 std::__1::weak_ptr<float>::~weak_ptr() memory:5148 (minimal:x86_64+0x100010762)
    #3 std::__1::weak_ptr<float>::~weak_ptr() memory:5146 (minimal:x86_64+0x100002448)
    #4 Foo::Foo()::'lambda'()::operator()() const minimal_race.cpp:15 (minimal:x86_64+0x10000576e)
    #5 void std::__1::__async_func<Foo::Foo()::'lambda'()>::__execute<>(std::__1::__tuple_indices<>) type_traits:4345 (minimal:x86_64+0x1000052f0)
    #6 std::__1::__async_func<Foo::Foo()::'lambda'()>::operator()() future:2323 (minimal:x86_64+0x100005268)
    #7 std::__1::__async_assoc_state<void, std::__1::__async_func<Foo::Foo()::'lambda'()> >::__execute() future:1040 (minimal:x86_64+0x100005119)
    #8 void* std::__1::__thread_proxy<std::__1::tuple<std::__1::unique_ptr<std::__1::__thread_struct, std::__1::default_delete<std::__1::__thread_struct> >, void (std::__1::__async_assoc_state<void, std::__1::__async_func<Foo::Foo()::'lambda'()> >::*)(), std::__1::__async_assoc_state<void, std::__1::__async_func<Foo::Foo()::'lambda'()> >*> >(void*) type_traits:4286 (minimal:x86_64+0x10000717c)

  Previous atomic write of size 8 at 0x7b0800000368 by main thread:
    #0 __tsan_atomic64_fetch_add <null>:1062032 (libclang_rt.tsan_osx_dynamic.dylib:x86_64h+0x24cdd)
    #1 std::__1::shared_ptr<float>::~shared_ptr() memory:3472 (minimal:x86_64+0x1000114d4)
    #2 std::__1::shared_ptr<float>::~shared_ptr() memory:4502 (minimal:x86_64+0x100002488)
    #3 main memory:4639 (minimal:x86_64+0x10000210b)

SUMMARY: ThreadSanitizer: data race new:272 in std::__1::__shared_ptr_emplace<float, std::__1::allocator<float> >::__on_zero_shared_weak()

edit: I compile it with:

clang++ -std=c++17 -g -fsanitize=thread -o test  minimal_race.cpp

clang version:

$ clang++ --version
clang version 7.1.0 (tags/RELEASE_710/final)
Target: x86_64-apple-darwin18.7.0
Thread model: posix
InstalledDir: /usr/local/opt/llvm@7/bin

I am using macOS 10.14.6

tuple_cat
  • 1,165
  • 2
  • 7
  • 22
  • I think this belongs in [codereview](https://codereview.stackexchange.com/) – xception Nov 18 '19 at 11:43
  • 2
    No it's fine here. There's a problem with OP's code, they're not just looking to improve it. A race condition is a very serious problem – Tas Nov 18 '19 at 11:45
  • tried it a couple of times with thread sanitizer with gcc and clang on linux and could not reproduce the issue – xception Nov 18 '19 at 11:49
  • Wouldn't this cause a potential deadlock? With `std::launch::async`, it's up to `std::async` to determine how to schedule your requests according to [cppreference.com](https://en.cppreference.com/w/cpp/thread/async). This means that potentially when `Foo` is constructed, the `future` locks the mutex (which it doesn't unlock until `shouldStop` is true). If `Foo::Add` is then called, it will try to lock the mutex, waiting for the future to unlock it, which it never does. – xEric_xD Nov 18 '19 at 11:54
  • @xEric_xD it does, actually. look better – Federico Nov 18 '19 at 11:56
  • @Federico ah right, it does, my bad – xEric_xD Nov 18 '19 at 11:59
  • 1
    managed to get a better trace with libc++ on clang on linux ... don't know how to post it here so you can see it [in this link instead](https://pastebin.com/fBH1ZE6u) compiled with `clang++ -std=c++17 -O0 -ggdb -fsanitize=thread -stdlib=libc++ -o x x.cpp`, does not happen with libstdc++, – xception Nov 18 '19 at 12:00
  • I've run it on MacOS `10.14.6` with `Apple clang version 11.0.0 (clang-1100.0.33.12)`, built like this: `clang++ -std=c++17 -g -fsanitize=thread sanitizerWarning.cpp -o sanitizerWarning` and running code doesn't produce any sanitizer warnings. Same with `clang++ -std=c++17 -O2 -fsanitize=thread sanitizerWarning.cpp -o sanitizerWarning`. So voting to close as can't reproduce. – Marek R Nov 18 '19 at 12:08
  • add -stdlib=libc++ @MarekR – xception Nov 18 '19 at 12:09
  • @xception no changes. – Marek R Nov 18 '19 at 12:10
  • Maybe this is false positive with older version of clang? @tuple_cat provide OS version clang version and build flags. – Marek R Nov 18 '19 at 12:12
  • I'm confused... you make a `shared_ptr` that you put on the queue of non-owning `weak_ptr` and then immediately reset it.. after that it's destroyed... Trying to get a `nullptr` on the queue? Resetting/destroying something that is used in another thread without any mutex locking sounds to me like a potential race condition... The race seems to occur in the destructor of `shared_ptr` – JHBonarius Nov 18 '19 at 12:17
  • Yes you are right, this is weird (bad logic), but this should not trigger detection of race condition. On my machine it thread sanitizer doesn't find anything. – Marek R Nov 18 '19 at 12:19
  • I edited the post to include build flags, version of clang and os version. – tuple_cat Nov 18 '19 at 12:54
  • hi, did you try my answer and does it fix things? if not could you post the trace you get from it? – xception Nov 18 '19 at 13:19
  • I am almost sure that the suggestion using `std::shared_ptr` in all places will get rid of the warning since the object will not be deleted directly after pushing it on the queue. The example above is very minimal and I have removed all things to keep it as small as possible (like getting a `shared_ptr` from calling `lock()` on the `weak_ptr` after it is popped from the `queue`) – tuple_cat Nov 18 '19 at 13:37
  • as I said, I do believe it to be a bug, from what I can tell it should not happen – xception Nov 18 '19 at 13:42

1 Answers1

1

I think this might be a bug with libc++ and clang in their weak_ptr/shared_ptr implementation...

changing the weak_ptr queue to a shared_ptr one fixes the problem even with old clang versions.

changes are line 25:

  void add(const std::shared_ptr<float> subscriber) {

line 33:

  std::queue<std::shared_ptr<float>> requests;

and rewriting the while at line 42 as:

  while (--numIterations) {
    auto subscriber = std::make_shared<float>();

    foo.add(move(subscriber));
  }
xception
  • 4,241
  • 1
  • 17
  • 27
  • have you been able to reproduce it? Which clang version? – Marek R Nov 18 '19 at 12:20
  • to me this sounds logical, as this prevents the objects from being destroyed. The race condition seems to occur in the destructor of `shared_ptr`. – JHBonarius Nov 18 '19 at 12:20
  • clang++ -v clang version 9.0.0 (tags/RELEASE_900/final) Target: x86_64-pc-linux-gnu Thread model: posix InstalledDir: /usr/lib/llvm/9/bin Selected GCC installation: /usr/lib/gcc/x86_64-pc-linux-gnu/9.2.0 Candidate multilib: .;@m64 Candidate multilib: 32;@m32 Selected multilib: .;@m64 @MarekR – xception Nov 18 '19 at 12:22
  • @MarekR I even posted my own backtrace with a link to it in a comment on the question and build flags – xception Nov 18 '19 at 12:22
  • 1
    @JHBonarius if I understand correctly how weak_ptr/shared_ptr interactions should work using them from different threads should be safe (not the pointed data, just the wrapper classes) at least in how the OP uses them with a shared_ptr and a weak_ptr (being different objects) – xception Nov 18 '19 at 12:25