Do I need a QMutex for variable that is accessed by single statement?

Question

In this document, a QMutex is used to protect "number" from being modified by multiple threads at same time. I have a code in which a thread is instructed to do different work according to a flag set by another thread.

//In thread1
if(flag)
   dowork1;
else
   dowork2;

//In thread2
void setflag(bool f)
{
    flag=f;
}

I want to know if a QMutex is needed to protect flag, i.e.,

//In thread1
mutex.lock();
if(flag)
{
   mutex.unlock();
   dowork1;
}
else
{
   mutex.unlock();
   dowork2;
}

//In thread2
void setflag(bool f)
{
    mutex.lock();
    flag=f;
    mutex.unlock();
}

The code is different from the document in that flag is accessed(read/written) by single statement in both threads, and only one thread modifies the value of flag.

PS: I always see the example in multi-thread programming tutorials that one thread does "count++", the other thread does "count--", and the tutorials say you should use a Mutex to protect the variable "count". I cannot get the point of using a mutex. Does it mean the execution of single statement "count++" or "count--" can be interrupted in the middle and produce unexpected result? What unexpected results can be gotten?

Although i don't think it would be necessary to use QMutex ,but it would be advisable to use it or you can also use QMutexLocker. — vallabh, Sep 10 '20 at 09:30
If your code is limited by what I have shown here, I think, it's okay to get rid of synchronization routines such as mutexes etc., but it's hard to tell otherwise, if your code has more complex logic. — vahancho, Sep 10 '20 at 10:34
@vallabh I've added a PS section to the original question. I want to know the point of using a QMutex or QMutexLocker for single-statement operation on varaibles. — William, Sep 11 '20 at 06:08
@William (https://stackoverflow.com/a/34558/8970365) check this out for neat explanation of Mutex. **Using Mutex for single statement operations:-** if CPU and compiler does not support Atmoic Operations then it is advisable to use Mutex even on a single statement. — vallabh, Sep 11 '20 at 07:46
re: PS -- Yes, `count++` is best thought of as `ld (%r0), %r1; add $1, %r1, %r1; st %r1,(%r0)`. Even machines with natural atomic `inc` instructions might not be used consistently by the compiler. — mevets, Sep 16 '20 at 14:32

Benjamin T · Answer 1 · 2021-12-26T21:12:24.080

Does it mean the execution of single statement "count++" or "count--" can be interrupted in the middle and produce unexpected result? What unexpected results can be gotten?

Just answering to this part: Yes, the execution can be interrupted in the middle of a statement.

Let's imagine a simple case:

class A {
    void foo(){
        ++a;
    }
    int a = 0;
};

The single statement ++a is translated in assembly to

    mov     eax, DWORD PTR [rdi]
    add     eax, 1
    mov     DWORD PTR [rdi], eax

which can be seen as

eax = a;
eax += 1;
a = eax;

If foo() is called on the same instance of A in 2 different threads (be it on a single core, or multiple cores) you cannot predict what will be the result of the program.

It can behave nicely:

thread 1 > eax = a  // eax in thread 1 is equal to 0
thread 1 > eax += 1 // eax in thread 1 is equal to 1
thread 1 > a = eax  // a is set to 1
thread 2 > eax = a  // eax in thread 2 is equal to 1
thread 2 > eax += 1 // eax in thread 2 is equal to 2
thread 2 > a = eax  // a is set to 2

or not:

thread 1 > eax = a  // eax in thread 1 is equal to 0
thread 2 > eax = a  // eax in thread 2 is equal to 0
thread 2 > eax += 1 // eax in thread 2 is equal to 1
thread 2 > a = eax  // a is set to 1
thread 1 > eax += 1 // eax in thread 1 is equal to 1
thread 1 > a = eax  // a is set to 1

In a well defined program, N calls to foo() should result in a == N. But calling foo() on the same instance of A from multiple threads creates undefined behavior. There is no way to know the value of a after N calls to foo(). It will depend on how you compiled your program, what optimization flags were used, which compiler was used, what was the load of your CPU, the number of core of your CPU,...

NB

class A {
public:
     bool check() const { return a == b; }
     int get_a() const { return a; }
     int get_b() const { return b; }
     void foo(){
        ++a;
         ++b;
     }
 private:
     int a = 0;
     int b = 0;
 };

Now we have a class that, for an external observer, keeps a and b equal at all time. The optimizer could optimize this class into:

class A {
public:
     bool check() const { return true; }
     int get_a() const { return a; }
     int get_b() const { return b; }
     void foo(){
         ++a;
         ++b;
     }
 private:
     int a = 0;
     int b = 0;
 };

because it does not change the observable behavior of the program.

However if you invoke undefined behavior by calling foo() on the same instance of A from multiple threads, you could end up if a = 3, b = 2 and check() still returning true. Your code has lost its meaning, the program is not doing what it is supposed to and can be doing about anything.

From here you can imagine more complex cases, like if A manages network connections, you can end up sending the data for client #10 to client #6. If your program is running in a factory, you can end up activating the wrong tool.

If you want the definition of undefined behavior you can look here : https://en.cppreference.com/w/cpp/language/ub and in the C++ standard For a better understanding of UB you can look for CppCon talks on the topic.

I'm curious about the terminology "undefined behavior". C++ specification does not define what "behavior" is, nor what behavior is a defined behavior. For the count++/count-- example, we can say the defined behavior is if we know the number of calls of count++ and count--, we know the final result. But for other cases, what is undefined behavior? Even we know the number of operations each thread does, we are still unable to know the final result because the timing between threads are unknown even the operations of each thread are serialized. — William, Sep 19 '20 at 04:16
If only one thread writes to the shared data, the undefined behavior is more ambiguous. — William, Sep 19 '20 at 04:18
`undefined behavior` has its own definition in the C++ specification. It is not opposed to `behavior`, but to other expressions such as `unspecified behavior` or `implementation-defined behavior` or `well-formed program`. — Benjamin T, Sep 19 '20 at 07:49
If more than one thread access the data and a least one of them is writing to it, the behavior is undefined, because the standard says so. If only one thread writes to the data, there is no undefined behavior and the program is well-formed. There is no ambiguity there. — Benjamin T, Sep 19 '20 at 08:00
Having undefined or defined behavior is not just about the ability to know the final result even without knowing the threads timing. It's about the fact that the C++ does not guarantee anything about the behavior of your program. For instance if you write a program where you enforce that 2 values are always the same, with undefined behavior they might not hold the same values. I've updates my answer with more details. — Benjamin T, Sep 19 '20 at 08:28

t2solve · Answer 2 · 2020-09-19T13:59:29.377

For me, its seems to be more handy to use a mutex here. In general not using mutex when sharing references could lead to problems. The only downside of using mutex here seems to be, that you will slightly decrease the performance, because your threads have to wait for each other.

What kind of errors could happen ? Like somebody in the comments said its a different situation if your share fundamental datatype e.g. int, bool, float or a object references. I added some qt code example, which emphases 2 possible problems during NOT using mutex. The problem #3 is a fundamental one and pretty well described in details by Benjamin T and his nice answer.

Blockquote

main.cpp

#include <QCoreApplication>
#include <QThread>
#include <QtDebug>
#include <QTimer>
#include "countingthread.h"

int main(int argc, char *argv[])
{
    QCoreApplication a(argc, argv);

    int amountThread = 3;
    int counter = 0;
    QString *s = new QString("foo");
    QMutex *mutex = new QMutex();

    //we construct a lot of thread
    QList<CountingThread*> threadList;

    //we create all threads
   for(int i=0;i<amountThread;i++)
   {
    CountingThread *t = new CountingThread();

#ifdef TEST_ATOMIC_VAR_SHARE
        t->addCounterdRef(&counter);
#endif
#ifdef TEST_OBJECT_VAR_SHARE
        t->addStringRef(s);
        //we add a mutex, which is shared to read read write
        //just used with TEST_OBJECT_SHARE_FIX define uncommented
        t->addMutexRef(mutex);
#endif
    //t->moveToThread(t);
    threadList.append(t);
   }

   //we start all with low prio, otherwise we produce something like a fork bomb
   for(int i=0;i<amountThread;i++)
        threadList.at(i)->start(QThread::Priority::LowPriority);

    return a.exec();
}

countingthread.h

#ifndef COUNTINGTHREAD_H
#define COUNTINGTHREAD_H

#include <QThread>
#include <QtDebug>
#include <QTimer>
#include <QMutex>

//atomic var is shared
//#define TEST_ATOMIC_VAR_SHARE

//more complex object var is shared
#define TEST_OBJECT_VAR_SHARE

// we add the fix
#define TEST_OBJECT_SHARE_FIX

class CountingThread : public QThread
{
    Q_OBJECT
    int *m_counter;
    QString *m_string;
    QMutex *m_locker;
public :
    void addCounterdRef(int *r);
    void addStringRef(QString  *s);
    void addMutexRef(QMutex  *m);
    void run() override;
};

#endif // COUNTINGTHREAD_H

countingthread.cpp

#include "countingthread.h"

void CountingThread::run()
{
    //forever
    while(1)
    {
#ifdef TEST_ATOMIC_VAR_SHARE
        //first use of counter
        int counterUse1Copy=  (*m_counter);
        //some other operations, here sleep 10 ms
        this->msleep(10);
        //we will retry to use a second time
        int counterUse2Copy=  (*m_counter);
        if(counterUse1Copy != counterUse2Copy)
                  qDebug()<<this->thread()->currentThreadId()<<" problem #1 found, counter not like we expect";
        //we increment afterwards our counter
        (*m_counter) +=1; //this works for fundamental types, like float, int, ...
#endif

#ifdef TEST_OBJECT_VAR_SHARE

#ifdef TEST_OBJECT_SHARE_FIX
            m_locker->lock();
#endif
            m_string->replace("#","-");
            //this will crash here !!, with problem #2,
            //segmentation fault, is not handle by try catch
            m_string->append("foomaster");
            m_string->append("#");

            if(m_string->length()>10000)
                 qDebug()<<this->thread()->currentThreadId()<<" string is: " <<   m_string;

#ifdef TEST_OBJECT_SHARE_FIX
            m_locker->unlock();
#endif
#endif
    }//end forever
}

void CountingThread::addCounterdRef(int *r)
{
    m_counter = r;
    qDebug()<<this->thread()->currentThreadId()<<" add counter with value:  " << *m_counter << " and address : "<< m_counter ;
}
void CountingThread::addStringRef(QString *s)
{
    m_string = s;
    qDebug()<<this->thread()->currentThreadId()<<" add string with value:  " << *m_string << " and address : "<< m_string ;
}
void CountingThread::addMutexRef(QMutex *m)
{
    m_locker = m;
}

If you follow up the code you are able to perform 2 tests.

If you uncomment TEST_ATOMIC_VAR_SHARE and comment TEST_OBJECT_VAR_SHARE in countingthread.h your will see

problem #1 if you use your variable multiple times in your thread, it could be changes in the background from another thread, besides my expectation there was no app crash or weird exception in my build environment during execution using an int counter.

If you uncomment TEST_OBJECT_VAR_SHARE and comment TEST_OBJECT_SHARE_FIX and comment TEST_ATOMIC_VAR_SHARE in countingthread.h your will see

problem #2 you get a segmentation fault, which is not possible to handle via try catch. This appears because multiple threads are using string functions for editing on the same object.

If you uncomment TEST_OBJECT_SHARE_FIX too you see the right handling via mutex.

problem #3 see answer from Benjamin T

What is Mutex:

I really like the chicken explanation which vallabh suggested.

I also found an good explanation here

what undefined behavior or exception could happen if two threads write on the same variable? Why could app crash? — William, Sep 15 '20 at 10:31
your answer is about multiple statements accessing one variable in a critical section. In that case, mutex should be used. What I'm concerned is only one statement accessing the shared variable in (if any) critical section. The behavior in your example is not unexpected or undefined. counterUse1Copy is not the same as counterUse2Copy because it is changed in between by another thread. The m_string causes crash because the internal pointer or data of m_string is modified between replace and append by another thread so replace/append may use invalid memory address during execution. — William, Sep 16 '20 at 08:33
you are right, so for fundamental types like bool or int its seems possible to access the shared var without having some sides effects in the scenario you described, I also tried to force an error while increasing the amount of threads to 1000 for testing case TEST_ATOMIC_VAR_SHARE but no error occured, might this could be with different compiler and compiler flags an different scenario, but why risk possible errors over performance ? — t2solve, Sep 16 '20 at 10:16
so Benjamins Answer https://stackoverflow.com/a/63960765/14105642 emphases: PLEASE USE also mutex with fundamental types like bool's or int's, to prevent undefined behavior — t2solve, Sep 19 '20 at 14:09

Jeremy Friesner · Answer 3 · 2020-09-15T13:06:28.090

1

For any standard object (including bool) that is accessed from multiple threads, where at least one of the threads may modify the object's state, you need to protect access to that object using a mutex, otherwise you will invoke undefined behavior.

As a practical matter, for a bool that undefined behavior probably won't come in the form of a crash, but more likely in the form of thread B sometimes not "seeing" changes made to the bool by thread A, due to caching and/or optimization issues (e.g. the optimizer "knows" that the bool can't change during a function call, so it doesn't bother checking it more than once)

If you don't want to guard your accesses with a mutex, the other option is to change flag from a bool to a std::atomic<bool>; the std::atomic<bool> type has exactly the semantics you are looking for, i.e. it can be read and/or written from any thread without invoking undefined behavior.

edited Sep 15 '20 at 13:06

answered Sep 15 '20 at 12:58

Jeremy Friesner

70,199
15
131
234

please see my comments on ptr's answer about the compiler optimization and cache issue. I cannot see any undefined behavior. One thread may get the value written by another thread short time later because the cache issue, but that would not be a problem. Imagine a user clicks a button to set a flag to exit the app, and the thread function checks the flag to exit in a loop, I do not think the slight time delay matters. If the thread function does not check the flag in a loop, it may miss the flag to exit but that is an expected result. – William Sep 16 '20 at 08:21
Undefined behavior is undefined; that means it may appear to work fine for years, and then one day you upgrade your compiler or run on a new CPU and things go wrong. If there was some benefit to taking that risk, it might be worthwhile, but I can’t think what the benefit would be. – Jeremy Friesner Sep 16 '20 at 12:42
I cannot imagine what is undefined since every instruction executed by CPU is defined and clear. – William Sep 16 '20 at 14:52
What is undefined is the compiler's responsibility to make sure that unsynchronized cross-thread accesses to the boolean variable result in any particular behavior. The C++ language spec essentially just says "don't do that; but if you do that anyway, we're not responsible for what happens, you're on your own". If that level of (non)commitment to correct behavior is something you're comfortable with, then knock yourself out; but most programmers want to eliminate potential sources of bugs whenever they can. – Jeremy Friesner Sep 16 '20 at 16:42

ptr · Answer 4 · 2020-09-15T13:29:53.310

1

Look here for an explanation: Do I have to use atomic<bool> for "exit" bool variable?

To synchronize access to flag you can make it a std::atomic<bool>.

Or you can use a QReadWriteLock together with a QReadLocker and a QWriteLocker. Compared to using a QMutex this gives you the advantage that you do not need to care about the call to QMutex::unlock() if you use exceptions or early return statements.

Alternatively you can use a QMutexLocker if the QReadWriteLock does not match your use case.

QReadWriteLock lock;
...
//In thread1
{
  QReadLocker readLocker(&lock);
  if(flag)
    dowork1;
  else
    dowork2;
}
...
//In thread2
void setflag(bool f)
{
    QWriteLocker writeLocker(&lock);
    flag=f;
}

edited Sep 15 '20 at 13:29

answered Sep 15 '20 at 13:21

ptr

316
1
3

I read the post you mentioned but still cannot understand clearly. For the compiler optimization issue, I do not think a compiler will change "while(!exit)" to "while(1)" even if "exit" is not modified in the thread function, because how it could know the function is used as a thread function that's run in another thread. For the cache issue, there may be short time that thread A cannot see the change made by thread B because thread B writes the data to a cache NOT the same memory as thread B reads, but soon, the content in the two memories will be synchronized through some mechanism. – William Sep 16 '20 at 15:32
As described in the post I linked, actions in different threads that access and modify the same memory location at the same time are a data race. Per definition such a data race results in undefined behavior. Therefor you can not assume anything about the execution of your program. It might work, it might not work, the C++ standard gives no guarantees for undefined behavior. "...but soon, the content in the two memories will be synchronized...", yes, soon, but sometimes not soon enough. Actually you gave a good example for a data race. – ptr Sep 17 '20 at 11:54

mevets · Answer 5 · 2020-09-16T16:23:43.540

Keeping your program expressing its intent (ie. accessing shared vars under locks) is a big win for program maintenance and clarity. You need to have some pretty good reasons to abandon that clarity for obscure approaches like the atomics and devising consistent race conditions.

Good reasons include you have measured your program spending too much time toggling the mutex. In any decent implementation, the difference between a non-contested mutex and an atomic is minute -- the mutex lock and unlock typical employ an optimistic compare-and-swap, returning quickly. If your vendor doesn't provide a decent implementation, you might bring that up with them.

In your example, dowork1 and dowork2 are invoked with the mutex locked; so the mutex isn't just protecting flag, but also serializing these functions. If that is just an artifact of how you posed the question, then race conditions (variants of atomics travesty) are less scary.

In your PS (dup of comment above): Yes, count++ is best thought of as:

   mov $_count, %r1
   ld (%r1), %r0
   add $1, %r0, %r2
   st %r2,(%r1)

Even machines with natural atomic inc (x86,68k,370,dinosaurs) instructions might not be used consistently by the compiler. So, if two threads do count--; and count++; at close to the same time, the result could be -1, 0, 1. (ignoring the language weenies that say your house might burn down).

barriers: if CPU0 executes:

store $1 to b
store $2 to c

and CPU1 executes:

load barrier -- discard speculatively read values.
load  b to r0
load  c to r1

Then CPU1 could read r0,r1 as: (0,0), (1,0), (1,2), (0,2). This is because the observable order of the memory writes is weak; the processor may make them visible in an arbitrary fashion. So, we change CPU0 to execute:

 store $1 to b
 store barrier  -- stop storing until all previous stores are visible
 store $2 to c

Then, if CPU1 saw that r1 (c) was 2, then r0 (b) has to be 1. The store barrier enforces that.

Only when both threads modify the shared variable can it alter the intended semantics of the program, i.e., thread A does "count++", thread B does "count--", we expect after both statements are executed, count does not change its original value(say 0), but if we do not use mutex, and the machine instructions are interrupted in the middle, count may become 1 or -1, which is not what we expect. But if only one thread modifies count, the other just reads it, there would not be unexpected result even the machine instructions are interrupted in the middle. — William, Sep 16 '20 at 15:16
maybe I should write: mutex.lock(); if(flag) { mutex.unlock(); dowork1; } else { mutex.unlock(); dowork2; } to just protect flag. — William, Sep 16 '20 at 15:24
(inc, dec): yes, but it is possible that the readonly side sees a slightly *stale* version of count with respect to other variables. Often that is ok, until it isn't. Part of the mutex magic is that the mutex operations have an implicit barrier within them, so when your program has acquired the mutex, it has a consistent view of memory. — mevets, Sep 16 '20 at 15:36
In what situation is it not ok? I cannot understand well that "mutex operations have an implicit barrier within them...". Did you mean when a thread acquirs a mutex, all variables(regardless they are protected by the mutex or not) modified by other threads must fetch their content from registers to memory so the operations after acquiring the mutex see the fresh content of the variables? — William, Sep 16 '20 at 15:51
lower level consistency that than. See the bit I added about barriers. — mevets, Sep 16 '20 at 16:24

Do I need a QMutex for variable that is accessed by single statement?

5 Answers5

NB