data race with std::shared_timed_mutex::try_lock_until

Question

I'm trying to write a small testcase to exercise std::shared_timed_mutex::try_lock_until. Documentation on cppreference.

Here is my code

#include <thread>
#include <iostream>
#include <chrono>
#include <shared_mutex>
#include <cassert>
 
std::shared_timed_mutex test_mutex;
int global;
 
void f()
{
    auto now=std::chrono::steady_clock::now();
    test_mutex.try_lock_until(now + std::chrono::seconds(100));
    //test_mutex.lock();
    --global;
    std::cout << "In lock, global=" << global << '\n';
    test_mutex.unlock();
}

void g()
{
    auto now=std::chrono::steady_clock::now();
    test_mutex.try_lock_shared_until(now + std::chrono::seconds(10));
    //test_mutex.lock_shared();
    std::cout << "In shared lock, global=" << global << '\n';
    test_mutex.unlock_shared();
}
 
int main()
{
    global = 1;
    test_mutex.lock_shared();
    std::thread t1(f);
    std::thread t2(g);
    test_mutex.unlock_shared();
    t1.join();
    t2.join();
    assert(global == 0);
}

What I'm expecting is

1. main gets a read lock, then starts f and g
2. f tries to get an exclusive lock and blocks
3. g gets a read lock, reads global, then unlocks the read lock
4. main unlocks the read lock
5. f unblocks, writes to global, unlocks and finishes
6. f and g join, the assert is true and main ends

(2 and 3 could be in any order).

This seems to work OK on its own. Under gdb, if I put breakpoints on the read of global in g and the write in f, then run, it stops on the read, as I would expect.

However, if I compile with -fsanitize=thread, ThreadSanitizer reports a data race:

WARNING: ThreadSanitizer: data race (pid=6780)
  Read of size 4 at 0x000000407298 by thread T2:
    #0 g() /home/paulf/scratch/valgrind/drd/tests/try_lock_shared_until14.cpp:25 (try_lock_shared_until14+0x402484)
[trimmed]
    #6 execute_native_thread_routine ../../../../../libstdc++-v3/src/c++11/thread.cc:82 (libstdc++.so.6+0xd9c83)

  Previous write of size 4 at 0x000000407298 by thread T1:
    #0 f() /home/paulf/scratch/valgrind/drd/tests/try_lock_shared_until14.cpp:15 [trimmed]
    #6 execute_native_thread_routine ../../../../../libstdc++-v3/src/c++11/thread.cc:82 (libstdc++.so.6+0xd9c83)

  Location is global 'global' of size 4 at 0x000000407298 (try_lock_shared_until14+0x000000407298)

Under gdb the tsan version does not block on the exclusive lock and reaches the write first.

I realize that my example is not good and I should check return values and not rely on timeouts.

Can anyone explain what tsan is changing? If I use the plain lock/lock_shared/unlock/unlock_shared functions then tsan no longer complains.

(Note that I can't use DRD or Helgrind for this - I'm writing the testcase for them and I know that they do not support this at the moment, at least not on the platform that I'm using, Fedora 34 / GCC 11.2.1 amd64).

Edit: Here is version 3, which now works. main waits on a condition variable for g() to finish, then releases the shared lock, after which f() can get the exclusive lock.

#include <thread>
#include <iostream>
#include <chrono>
#include <shared_mutex>
#include <mutex>
#include <cassert>
#include <condition_variable>

std::shared_timed_mutex test_mutex;
std::mutex cv_mutex;
std::condition_variable cv;
int global;
bool reads_done = false;
 
void f()
{
    auto now=std::chrono::steady_clock::now();
    std::cout << "In lock, trying to get mutex\n";
    if (test_mutex.try_lock_until(now + std::chrono::seconds(3)))
    {
        --global;
        std::cout << "In lock, global=" << global << '\n';
        test_mutex.unlock();
    }
    else
    {
        std::cerr << "Lock failed\n";
    }
}

void g()
{
    auto now=std::chrono::steady_clock::now();
    std::cout << "In shared lock, trying to get mutex\n";
    if (test_mutex.try_lock_shared_until(now + std::chrono::seconds(2)))
    {
        std::cout << "In shared lock, global=" << global << '\n';
        test_mutex.unlock_shared();
    }
    else
    {
        std::cerr << "Lock shared failed\n";
    }
    std::unique_lock<std::mutex> lock(cv_mutex);
    reads_done = true;
    cv.notify_all();
}
 
int main()
{
    global = 1;
    test_mutex.lock_shared();
    std::thread t1(f);
    std::thread t2(g);
    {
        std::unique_lock<std::mutex> lock(cv_mutex);
        while (!reads_done)
        {
            cv.wait(lock);
        }
    }
    std::cout << "Main, reader thread done\n";
    test_mutex.unlock_shared();
    std::cout << "Main, no more shared locks\n";
    t1.join();
    t2.join();
    assert(global == 0);
}

Isn't shared lock a read lock? https://stackoverflow.com/q/11837428/4688321 For some reason I think step (3) shouldn't happen until f has completed writing and released the lock. — kiner_shah, Nov 26 '21 at 09:16
Also, for some reason, I am not getting that warning: https://godbolt.org/z/nee4jKncb I tried with `-fsanitize=thread -std=c++14`. — kiner_shah, Nov 26 '21 at 09:24
As for your assumptions, 4 could happen before 2 and 3 even. Also, "1 and 3 could be in any order" -- what do you mean with that? — Ulrich Eckhardt, Nov 26 '21 at 09:38
@Mike van Dyke I did try checking the return values but that made no difference. I'll try in gdb putting a break on the write in f and then see where the main thread is. It looks like I need some synchronization between main unlocking and `f`. — Paul Floyd, Nov 26 '21 at 09:39
@kiner_shah this could be somewhat platform dependent. libstdc++ has started using pthread_rwlock_clockwrlock and pthread_rwlock_clockrdlock but only if the platform glibc has them. I don't know how to tell the platform / glibc used on compiler explorer. — Paul Floyd, Nov 26 '21 at 09:53
@UlrichEckhardt that was a typo, I meant 2 & 3, but I agree with the comments: 2, 3 and 4 could be in any order — Paul Floyd, Nov 26 '21 at 10:17

score 1 · Accepted Answer · answered Nov 26 '21 at 09:34

This is also a valid scheduling scenario:

1. main gets a read lock, then starts f and g
2. main releases the read lock, then joins
3. f starts execution, acquires the exclusive lock and then holds it for over 10 s, e.g. because it is preempted
4. g starts execution and blocks for 10 s, until its try_lock_shared_until times out
5. g unblocks without holding the lock and reads the shared variable

In step 5 a data race happens, so ThreadSanitizer is correct to point it out. Fixing it requires checking the return values of try_lock_until and try_lock_shared_until, and only touching global when the lock was actually acquired.

Mike van Dyke


When does f unblock? – kiner_shah Nov 26 '21 at 09:57
@kiner_shah `f` is not really blocked, but preempted, e.g. directly after acquiring the mutex. Any amount of time may pass before it runs again, since the scheduler decides when `f` is rescheduled. – Mike van Dyke Nov 26 '21 at 10:12
