0

I have a question about the way in which I am using boost::mutex objects in my Linux C++ application. I have a collection of convenience macros for performing various mutex operations, for example I have macros that return whether or not a mutex is locked by any thread, whether or not one is already locked specifically by the calling thread, and several others. The macros that need to know what thread (if any) currently has the mutex locked are the following:

// Determine whether or not a boost mutex is already locked
#define IsBoostMutexLocked(m) ((m.native_handle()->__data.__owner) != 0)

// Determine whether or not a boost mutex is already locked by the calling thread
#define IsBoostMutexLockedByCallingThread(m) ((m.native_handle()->__data.__owner) == (syscall(SYS_gettid)))

But I began to wonder whether or not it is safe to directly read the __owner int field. Could one thread not attempt to read the __owner field while another thread is busy locking or unlocking the mutex, thereby writing to the __owner field?

So, I devised a test that would attempt to expose any data race vulnerabilities and abort if such a situation was detected. So far, I have had 100 threads all concurrently locking, unlocking, and reading the __owner of one global mutex for tens of millions of loop iterations per thread and not once have I ever read an invalid __owner value. Below I've included the entirety of the test code. Frankly I am surprised that I've never read a bad __owner value.

Can anyone explain to me why it is (apparently) safe to directly read the __owner of a mutex even when other threads are trying to lock/unlock? Thanks in advance!

// The test mutex
boost::mutex g_TestMutex;

// The number of threads to launch for the test
#define NUM_THREADS_TO_LAUNCH 100

// The thread IDs of all test threads
long int g_AllSpecialThreadsTIDs[NUM_THREADS_TO_LAUNCH];

// Whether or not each test thread is ready to begin the test
std::atomic<bool> g_bEachTestThreadIsReadyToBegin[NUM_THREADS_TO_LAUNCH];

// Whether or not the test is ready to begin
std::atomic<bool> g_bTestReadyToBegin(false);

// A structure that encapsulates data to be passed to each test thread
typedef struct {
    long *pStoreTIDLoc; // A pointer to the variable at which to store the thread ID
    std::atomic<bool> *pTIDStoredLoc; // A pointer to the variable at which to store the status of whether or not the thread ID has been set
} TestThreadDataStructure;

// Ensure that a test thread ID is valid
void AssertIsValidTID(int iTID)
{
    // Whether or not this thread ID is valid
    bool bValid = false;

    // If the thread ID indicates that no-one has locked the mutex
    if (iTID == 0)
    {
        // A thread ID indicating that no-one has locked the mutex is always valid
        bValid = true;
    }

    // Or, if this is a non-zero thread ID
    else
    {
        // For each test thread
        for (int i = 0; i < NUM_THREADS_TO_LAUNCH; i++)
        {
            // If this is a thread ID match
            if (iTID == static_cast<int>(g_AllSpecialThreadsTIDs[i]))
            {
                // Set that the incoming thread ID is valid
                bValid = true;

                // Stop looking
                break;
            }
        }
    }

    // If the incoming thread ID is invalid
    if (!bValid)
    {
        // The test has failed
        abort();
    }
}

// Each test thread
void TestMutexTesterThread(void *pArg)
{
    // Each mutex owner thread ID
    int iOwner = 0;

    // Unpack the incoming data structure
    TestThreadDataStructure *pStruct = ((TestThreadDataStructure *)pArg);
    long int *pStoreHere = pStruct->pStoreTIDLoc;
    std::atomic<bool> *pTIDStoredLoc = pStruct->pTIDStoredLoc;

    // Clean up
    delete pStruct;
    pStruct = NULL;
    pArg = NULL;

    // Get this thread ID
    const long int lThisTID = syscall(SYS_gettid);

    // Store this thread ID
    (*pStoreHere) = lThisTID;

    // Set that we have finished storing the thread ID
    pTIDStoredLoc->store(true);

    // While we are waiting for everything to be ready so that we can begin the test
    while (true)
    {
        // If we are now ready to begin the test
        if (g_bTestReadyToBegin.load())
        {
            // Stop waiting
            break;
        }
    }

    // The loop iteration count
    uint64_t uCount = 0;

    // For the life of the test, i.e. forever
    while (true)
    {
        // Increment the count
        uCount++;

        // If we are about to go over the edge
        if (uCount >= (UINT64_MAX - 1))
        {
            // Reset the count
            uCount = 0;
        }

        // Every so often
        if ((uCount % 500000) == 0)
        {
            // Print our progress
            printf("Thread %05ld: uCount = %lu\n", lThisTID, uCount);
        }

        // Get the mutex owner's thread ID
        iOwner = g_TestMutex.native_handle()->__data.__owner;

        // Ensure that this is a valid thread ID
        AssertIsValidTID(iOwner);

        // Lock the mutex as part of the test
        g_TestMutex.lock();

        // Get the mutex owner's thread ID
        iOwner = g_TestMutex.native_handle()->__data.__owner;

        // Ensure that this is a valid thread ID
        AssertIsValidTID(iOwner);

        // Unlock the mutex as part of the test
        g_TestMutex.unlock();

        // Get the mutex owner's thread ID
        iOwner = g_TestMutex.native_handle()->__data.__owner;

        // Ensure that this is a valid thread ID
        AssertIsValidTID(iOwner);
    }
}

// Start the test
void StartTest()
{
    // For each thread to launch
    for (int i = 0; i < NUM_THREADS_TO_LAUNCH; i++)
    {
        // Initialize that we do not have a thread ID yet
        g_AllSpecialThreadsTIDs[i] = 0;
        g_bEachTestThreadIsReadyToBegin[i].store(false);
    }

    // For each thread to launch
    for (int i = 0; i < NUM_THREADS_TO_LAUNCH; i++)
    {
        // Allocate a data structure with which to pass data to each thread
        TestThreadDataStructure *pDataStruct = new TestThreadDataStructure;

        // Store the location at which the thread should place its thread ID
        pDataStruct->pStoreTIDLoc = ((long int *)((&(g_AllSpecialThreadsTIDs[i]))));

        // Store the location of the atomic variable that each thread should set to true when it has finished storing its thread ID
        pDataStruct->pTIDStoredLoc = ((std::atomic<bool> *)((&(g_bEachTestThreadIsReadyToBegin[i]))));

        // The thread to return
        boost::thread *pNewThread = NULL;

        // Launch the new thread
        try { pNewThread = new boost::thread(TestMutexTesterThread, pDataStruct); }

        // Catch errors
        catch (boost::thread_resource_error &ResourceError)
        {
            // Print this error
            printf("boost::thread construction error: '%s'", ResourceError.what());

            // This is a fatal error
            abort();
        }

        // Clean up
        delete pNewThread;
        pNewThread = NULL;
    }

    // Whether or not all threads are ready to begin
    bool bAllThreadsReadyToBegin = false;

    // While we are waiting for all threads to be ready to begin
    while (true)
    {
        // Reset to assuming all threads are ready to begin
        bAllThreadsReadyToBegin = true;

        // For each thread we launched
        for (int i = 0; i < NUM_THREADS_TO_LAUNCH; i++)
        {
            // If this thread has not yet stored its thread ID
            if (g_bEachTestThreadIsReadyToBegin[i].load() == false)
            {
                // We are not yet ready to begin
                bAllThreadsReadyToBegin = false;

                // Start over
                break;
            }
        }

        // If all threads are ready to begin
        if (bAllThreadsReadyToBegin)
        {
            // We are done waiting
            break;
        }
    }

    // Atomically store that all threads are ready to begin and that the test should proceed
    g_bTestReadyToBegin.store(true);
}
user2062604
  • 247
  • 3
  • 16
  • 5
    The question of whether a mutex is currently locked is pretty much meaningless, because it basically gives you no information about anything. – nwp Aug 23 '17 at 14:08
  • https://stackoverflow.com/q/21892934/841108 is a very similar question. – Basile Starynkevitch Aug 23 '17 at 14:11
  • What architecture are you running this on - x86? And how many sockets, and how many cores? – Useless Aug 23 '17 at 14:34
  • @Useless Thanks, I am running on x64 with an Intel Celeron with 1 socket with 4 cores. – user2062604 Aug 23 '17 at 14:43
  • @user2062604 Did you compile the code for those test runs with full optimization enabled ? - If not, you may want to try again – curiousguy12 Aug 23 '17 at 19:56
  • Doing as you do is highly incompatible. `pthread_t` not required to be a pointer, as it is on the implementation you use. `pthread_t` is to be taken as an [opaque data type](https://en.wikipedia.org/wiki/Opaque_data_type). – alk Aug 26 '17 at 05:46

1 Answers1

0

it is 'safe' because reads and writes up to 8 bytes, aligned to the respective number of bytes, are atomic. assuming x86_64

i don't know boost, but your macros look bad in 3 ways:

  • you go for internal data layout which makes the thingy prone to api and abi breakage
  • checks "do i have the lock" are almost always a design error (program logic should dictate whether you do, hence no need to check). besides accesses of the sort slow down the code under contention - another cpu could have dirtied the cacheline with the lock word, which you are now fetching for no good reason
  • checks "does someone have a lock" are inhererently racy and typically wrong. if you want to grab the lock if it is available but fail otherwise, you can trylock
  • Thanks for the answer. I did not know that reads/writes up to 8 bytes properly aligned would always happen atomically - this answers my question. – user2062604 Aug 23 '17 at 15:22
  • @user2062604 Unfortunately it is incorrect. Compilers optimize assuming such a race condition can never happen and break your code. It is safe in x86-assembly, but not in C++. That C++ is converted to x86-assembly doesn't save you, compilers will produce incorrect x86-code. – nwp Aug 23 '17 at 15:27
  • @employee of the month I would like to separately note, however, that beyond answering my question, your three points about why the macros look "bad" are both irrelevant and misguided. If you use the macros I provided as checks to ensure that all multithreaded code is written correctly, then they are extremely useful. Either way, the additional points do not answer my question. – user2062604 Aug 23 '17 at 15:28
  • @nwp If employee of the month's answer is incorrect, then what is the answer? Why is it apparently safe to do what I'm doing? – user2062604 Aug 23 '17 at 15:31
  • 1
    @user2062604 Because it is undefined behavior. Doing exactly what you want is allowed behavior for undefined behavior. It might just not do that all the time, depending on timing, architecture, compiler flags and position of the moon. – nwp Aug 23 '17 at 15:32
  • perhaps his point is that in principle the compiler can generate any reads it wants, e.g. byte by byte which in turn loses the 'atomicity' bit. this is part of why i said it is 'safe' in quotes. there is probably a dedicated way in c++ to safely provide an atomic read. –  Aug 23 '17 at 15:33
  • regarding 3 points, i strongly stand by them. for starters i mentioned that *typically* checking this stuff to decide the logic is wrong. i even gave an example. assertions are an excellent example of correct use of course, but you never said what you need these macros for –  Aug 23 '17 at 15:39