
Suppose I have a queue of pointers to std::string and producer and consumer threads working on the queue. Let's say a producer appends to a string and puts the pointer in the queue. A consumer thread gets the pointer, appends more data to the string and puts the pointer into another queue.

  1. Will the consumer thread read an updated data from the producer thread?
  2. After the consumer updates the data and puts it in another queue, will consumers of that queue see the updates made by the producer and consumer threads from (1)?

EDIT: Added sample code. EDIT: Added a complete example.

#include <deque>
#include <thread>
#include <string>
#include <iostream>
#include <mutex>
#include <atomic>

class StringQueue {
public:

    // Returns the front element, or nullptr if the queue is empty (non-blocking).
    std::string* pop() {
        std::unique_lock<std::mutex> lock(_mutex);
        if (_queue.empty())
            return nullptr;
        std::string* s = _queue.front();
        _queue.pop_front();
        return s;
    }

    void push(std::string* s) {
        std::unique_lock<std::mutex> lock(_mutex);
        _queue.push_back(s);
    }


private:

    std::deque<std::string*> _queue;
    std::mutex _mutex;

};



int main(int argc, char** argv)
{

    StringQueue job_queue;
    StringQueue result_queue;
    std::atomic<bool> run(true);

    std::thread consumer([&job_queue, &result_queue, &run]{
            while (run.load()) {
                std::string* s = job_queue.pop();
                if (s != nullptr)
                    s->append("BAR");
                // Note: s may still be nullptr here; result_thread checks for that.
                result_queue.push(s);
            }
    });

    std::thread result_thread([&result_queue, &run]{
            while (run.load()) {
                std::string* s = result_queue.pop();
                if (s != nullptr) {
                    std::cout << "Result: " << *s << std::endl;
                    delete s;
                }
            }
    });


    std::string input;
    while (true)
    {
        std::cin >> input;
        if (input == "STOP")
            break;
        std::string* s = new std::string(input);
        job_queue.push(s);
    }

    run.store(false);

    result_thread.join();
    consumer.join();
}
ssemilla
  • That doesn't sound like "one" thread handling the data, like the title says. – deviantfan Aug 29 '17 at 05:39
  • (and, btw., this is not C) – deviantfan Aug 29 '17 at 05:45
  • Hi, "at a time" - I'm thinking do I still need to lock even though I'm sure that only one thread is handling the 'pointer' at a time. – ssemilla Aug 29 '17 at 05:46
  • Also, I'm only including C since the question really is about synchronization and locking in general. I can't find an answer if CPUs are able to cache whole objects if the program can only get pointers to those objects. – ssemilla Aug 29 '17 at 05:48
  • The part about the CPU: a) Yes, "if the compiler generates such statements". Meaning, the language does matter. b) ... but it doesn't really matter if the CPU can cache something. Caching data is no excuse for not locking. ... From your questions about the "updated" data and so on, I understand that you believe one thread could still read old data while another one has changed it already. Without any synchronization, this is *possible* (just possible), but nothing is guaranteed. With proper synchronization, it is not possible. – deviantfan Aug 29 '17 at 05:54
  • And about "at a time": How do you know that? ... Probably not at all. – deviantfan Aug 29 '17 at 05:55
  • We can't answer your question unless you show some code. Please include a minimal complete example to which your question applies. – Brian Bi Aug 29 '17 at 05:55
  • I've added a sample code, the simple question would be; without synchronization, is it possible that the string popped by the `result_thread` does not contain "HELLO THERE"? – ssemilla Aug 29 '17 at 06:42
  • Are you sure the sample is what you're doing? Because there is no multi-threading going on, there is just a sequence of unnecessary thread creations – Passer By Aug 29 '17 at 06:47
  • This is only an illustration of what I want to achieve. As you can see here, only one thread is manipulating the data "at a time" – ssemilla Aug 29 '17 at 06:56
  • It is crucial that you give *exactly* the code that you are using. Any unsynchronized manipulation is a data race, by definition. However, the calls to `join()` in your example insert synchronization points that make this legal. Without those synchronization points, it would not be okay to do this, even if there is no temporal overlap of the executions, due to memory ordering concerns. – ComicSansMS Aug 29 '17 at 08:22
  • I'll accept this "*Any* unsynchronized manipulation is a data race, by definition". Thank you. – ssemilla Aug 29 '17 at 08:45
  • @ComicSansMS, can you put an answer so I can accept it? – ssemilla Aug 29 '17 at 09:25

2 Answers


There are many answered questions on Stack Overflow about the memory ordering model of C++ std::mutex, e.g. "Does std::mutex create a fence?" and "Does `std::mutex` and `std::lock` guarantee memory synchronisation in inter-processor code?"

When a thread unlocks a mutex, all memory writes it made before the unlock are guaranteed to be visible to any thread that subsequently locks the same mutex. (In practice, locking and unlocking a std::mutex may act as a stronger barrier, e.g. making writes visible even to threads that do not synchronize on the same mutex, but that is not guaranteed, and providing stronger guarantees than required is undesirable for performance reasons.)
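To make that concrete, here is a minimal sketch of my own (not code from the question): a writer and a reader share one mutex, and once the reader acquires the lock after the writer has released it, the writer's store to the string is guaranteed to be visible.

#include <iostream>
#include <mutex>
#include <string>
#include <thread>

std::mutex m;
std::string data;
bool ready = false;

void writer() {
    std::lock_guard<std::mutex> lock(m);
    data = "FOO";    // this store happens before the unlock at end of scope
    ready = true;
}

void reader() {
    while (true) {
        std::lock_guard<std::mutex> lock(m);
        if (ready) {                    // lock acquired after the writer's unlock
            std::cout << data << '\n';  // guaranteed to print "FOO"
            return;
        }
    }
}

int main() {
    std::thread t1(writer), t2(reader);
    t1.join();
    t2.join();
}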

In the question's code, there are three threads and two mutexes. Call the threads "main", "consumer" and "result_thread". Call the two mutexes "job_queue_mutex" and "result_queue_mutex". We have two synchronization patterns:

main and consumer synchronize using job_queue_mutex

consumer and result_thread synchronize using result_queue_mutex

In both cases, all stores to memory by one thread and reads from that memory by another thread are separated by an unlock on a mutex by the thread doing the stores and a lock on the same mutex by the thread doing the reads. (One can prove this by listing all the stores and reads and mutex operations. I suggest doing this as an exercise.)

So yes, this is guaranteed correct. (The code above spins on its mutexes and the consumer thread pushes nullptrs to the result thread's queue, both of which are inefficient, but for the purposes of discussion here, it works.)
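As an aside, both inefficiencies could be removed by making pop() block on a std::condition_variable instead of returning nullptr. This is only a sketch under that assumption (BlockingStringQueue is a hypothetical name, not part of the question's code, and shutdown via the run flag would still need a way to wake blocked waiters):

#include <condition_variable>
#include <deque>
#include <mutex>
#include <string>

class BlockingStringQueue {
public:
    // Blocks until an element is available instead of spinning or returning nullptr.
    std::string* pop() {
        std::unique_lock<std::mutex> lock(_mutex);
        _cond.wait(lock, [this] { return !_queue.empty(); });
        std::string* s = _queue.front();
        _queue.pop_front();
        return s;
    }

    void push(std::string* s) {
        {
            std::lock_guard<std::mutex> lock(_mutex);
            _queue.push_back(s);
        }
        _cond.notify_one();  // wake one waiting consumer
    }

private:
    std::deque<std::string*> _queue;
    std::mutex _mutex;
    std::condition_variable _cond;
};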

I expect the real question is what happens when one is not using mutexes in strict patterns. That gets into lock-free programming, which is quite complicated. There are memory barriers of varying strength, and to use them one must start with a really solid understanding of the ordering primitives -- barriers, fences, and the like: which storage locations they apply to, which operations they establish an ordering on, and between which threads that ordering is imposed. Even getting a good working model of acquire/release semantics can be tricky.

Then one really has to sit down and do a combination of design and correctness proving on the algorithm, and finally write the code to the specification one has decided on. The result can be checked with tools ranging from formal verification (e.g. spin, http://spinroot.com/spin/whatispin.html) to execution-based ones like clang's thread sanitizer.
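As a rough illustration (my own sketch, not part of the question's code), the classic publication pattern with acquire/release atomics looks like this; the release store pairs with the acquire load, which is what makes the earlier plain store to the string visible:

#include <atomic>
#include <cassert>
#include <iostream>
#include <string>
#include <thread>

std::string payload;
std::atomic<bool> published{false};

void producer() {
    payload = "BAR";                                   // plain store
    published.store(true, std::memory_order_release);  // release: publishes the store above
}

void consumer() {
    while (!published.load(std::memory_order_acquire)) // acquire: pairs with the release store
        ;                                              // spin until published
    assert(payload == "BAR");                          // guaranteed by the release/acquire pairing
    std::cout << payload << '\n';
}

int main() {
    std::thread t1(producer), t2(consumer);
    t1.join();
    t2.join();
}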

Point being that getting lock free code correct requires significantly more rigor than most programming tasks. I often tell people you cannot substitute debugging for design in multithreaded code and this applies even more so to lock free mechanisms. (Many serious programmers consider lock free techniques to be so error prone as to be a terrible idea outside of incredibly narrow uses.)

Zalman Stern
  • Good answer.. but mutex lock/unlock is probably never a full fence – LWimsey Aug 30 '17 at 00:30
  • On further thinking, I agree. I edited the statement about memory fences. – Zalman Stern Aug 30 '17 at 00:38
  • @ZalmanStern Thank you so much, I'll accept this as the answer. The more specific question I have to ask is: Is the write `s->append("BAR")` visible to the result_thread? My understanding is that it is not guaranteed since there is no synchronization on the string object themselves. Is this correct? Again, thank you. – ssemilla Aug 30 '17 at 05:27
  • Unlike some languages, C++ does not associate synchronization with objects, so there isn't really an idea of "synchronization on the string object themselves." E.g. from the compiler's point of view, there is no connection between the mutex and deque in StringQueue other than that they happen to be in the same structure. The compiler does not know that the mutex protects the deque. That arises from the dynamic execution of the program. The mutex in StringQueue establishes ordering so *all previous stores* by the thread unlocking it are visible to any thread after it later locks the same mutex. – Zalman Stern Aug 30 '17 at 07:04

Your code is written such that you block on the producer being complete before starting the consumer, and so on. join() specifically stops the current thread until the thread you are joining has completed its work.

So as your code read, yes, it's thread safe.
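For example (a small sketch of my own, not from the question), join() alone is enough to make the joined thread's writes visible to the code that follows it:

#include <iostream>
#include <string>
#include <thread>

int main() {
    std::string s = "FOO";
    std::thread t([&s] { s.append("BAR"); });  // the only writer while the thread runs
    t.join();                                  // join() happens-before everything after it
    std::cout << s << '\n';                    // guaranteed to print "FOOBAR"
}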

Does the design make sense? Not really. Generally the reason you have producers/consumers with a queue of work to do is that you want to do some expensive operations while handling some kind of back pressure. This means the producers and consumers are working at the same time.

If that is your intent, then the answer is no: neither std::deque nor any other STL container is thread safe for use in this way. In your example you'd have to wrap locks around all deque accesses and make sure you remove an item from the queue completely before unlocking. You've got a bug in your code currently where you do a front() instead of a pop_front(), which means the string is left in the work queue. This would lead to issues where more than one consumer could end up working on that string, which is bad news bears.

Josh
  • Hi Josh, I've edited the sample code. For now, I'm accepting "Any unsynchronized manipulation is a data race, by definition" from @ComicSansMS. The question I really want to ask is - will `result_thread` always be able to see "BAR" appended to whatever input string? – ssemilla Aug 29 '17 at 09:16
  • @weavr now that you've added the mutex to your deque access and popped the string completely off the data structure before unlocking, code looks good to me. – Josh Aug 29 '17 at 17:47
  • I am still uncertain about the writes performed on the string objects themselves. Since only the queue is thread safe, I'm not quite sure if I need locking on the strings themselves so that writes performed by the consumer thread are visible to the result thread. My intuition tells me that it is not guaranteed and I should probably put locks on the string objects themselves. Thank you for your answer. – ssemilla Aug 30 '17 at 05:32
  • @weavr based on your code above, the strings in question don't get modified after they are added to the result_queue, correct? In other words, you are doing your ->append operation before pushing it. If so, then the individual strings are read-only after they enter this part of the state machine and they don't need to be protected. – Josh Aug 30 '17 at 05:49