
I posted something in a similar direction yesterday, but that question was specifically about mutexes and I did not find much of an answer in the referenced "duplicate" thread. I now want to ask more generally; I hope that's okay.

Look at this code:

#include <iostream>
#include <mutex>
#include <vector>
#include <thread>
#include <initializer_list>
using namespace std;

class Data {
public:

    void write_data(vector<float>& data) {
        datav = move(data);
    }

    vector<float>* read_data() {
        return(&datav);
    }

    Data(vector<float> in) : datav{ in } {};

private:
    vector<float> datav{};
};

void f1(vector<Data>& in) {
    for (Data& tupel : in) {
        vector<float>& in{ *(tupel.read_data()) };
        for (float& f : in) {
            f += (float)1.0;
        };
    };
}

void f2(vector<Data>& in) {
    for (Data& tupel : in) {
        vector<float>& in{ *(tupel.read_data()) };
        for (float& f : in) {
            cout << f << ",";
        };
    };
}
int main() {
    vector<Data> datastore{};
    datastore.emplace_back(initializer_list<float>{ 0.2, 0.4 });
    datastore.emplace_back(initializer_list<float>{ 0.6, 0.8 });
    vector<float> bigfv(50, 0.3);
    Data demo{ bigfv };
    datastore.push_back(demo);
    thread t1(f1, ref(datastore));
    thread t2(f2, ref(datastore));
    t1.join();
    t2.join();
};

I would have expected a wild mixture of output values depending on which thread got to a vector element first, so for the third "demo" vector with 50 x 0.3f I expected the output to be a mixture of 0.3 (t2 got there first) and 1.3 (t1 got there first). Even though I used pass-by-reference, direct pointers, etc. as much as possible to avoid copying (the original project handles quite large amounts of data), the code behaves deterministically (always t2 accesses first, then t1). Why? Don't I access the floats directly by reference in both thread functions?

How would you make this vector access well-defined? The only possible solutions I found in the other thread were:

- define a similarly sized array of unique_ptr to mutexes (feels bad, because I need to be able to add data containers to the datastore, so it would mean clearing and rebuilding that array every time the datastore changes size?), or

- make the access to the vector atomic (as an idea this would make my operation thread-safe, but there is no atomic specialization for a vector, or is there one in some non-STL library?), or

- write a wrapper for a mutex into the data class? (roughly sketched below)

For my project it is not important which thread accesses the data first; it is only important that a thread can reliably read/write the whole vector in a data tuple without another thread manipulating the dataset at the same time.
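
To make the third option more concrete, this is roughly what I have in mind (only a rough, untested sketch; LockedData and with_data are made-up names):

#include <mutex>
#include <utility>
#include <vector>
using namespace std;

// Rough sketch of option 3 (made-up names): the data class owns its own mutex
// and only exposes locked operations, so a whole-vector read or write can never
// interleave with another thread's access.
class LockedData {
public:
    void write_data(vector<float>& data) {
        lock_guard<mutex> guard(lock);   // held until the function returns
        datav = move(data);
    }

    // Run a caller-supplied operation over the whole vector while holding the
    // lock, instead of handing out a raw pointer to unprotected data.
    template <typename F>
    void with_data(F&& f) {
        lock_guard<mutex> guard(lock);
        f(datav);
    }

private:
    vector<float> datav{};
    mutex lock{};   // this member makes the class non-copyable/non-movable by default
};

A thread would then call something like obj.with_data([](vector<float>& v) { for (float& x : v) x += 1.0f; });, so the whole pass over the vector happens while the mutex is held.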

Sacharon
  • What does "well-defined" mean to you? If you want one thread to wait for another one to finish, you have to write the code, using mutexes and condition variables, to make it so. You have no guarantees, whatsoever, with regard to the order of code that gets executed by different threads. Tomorrow, the same program might decide to execute the code in different order. Just because one thread gets created first doesn't mean its code will always be executed first. Threads don't work this way. – Sam Varshavchik Apr 09 '20 at 14:44
  • "Well-defined" means for me here that a read or write access to the vector in Data is guaranteed to cover the whole vector, without having to copy it. I want to avoid thread 1 reading 0.2 from [0][0], then thread 2 writing +=1.0 into the array, and then thread 1 reading 1.4 from [0][1], but without copying the whole vector into the respective function, as the dataset is quite large in the original program. – Sacharon Apr 09 '20 at 14:55
  • Sam, my problem with my original mutex idea from yesterday is that I cannot put the mutex into the class, because the class then becomes non-copyable. So, in my understanding of https://stackoverflow.com/questions/16465633/how-can-i-use-something-like-stdvectorstdmutex , I could only create an additional parallel fixed-size array of unique_ptr, which would create a "huge" management overhead, because every time I add or remove an item from the datastore I would have to recreate the whole mutex-pointer array, no? So I was looking for other ideas. – Sacharon Apr 09 '20 at 16:05
  • Your understanding is incorrect. That six-year-old question talks about vectors of mutexes only, and not about vectors of classes that contain mutex class members. Such a class becomes non-copyable only if you do not provide your own copy constructor and/or assignment operator that implements whatever a copy/assign operation should mean for your class. – Sam Varshavchik Apr 09 '20 at 16:28
  • So you would say just put a mutex in the class and write a copy constructor which excludes copying of the mutex? Because AFAIK I also can't move() it. Could you be so kind as to show me a short example of how that should be implemented? – Sacharon Apr 09 '20 at 16:55
  • I am sorry, I still do not understand. When I try to copy an instance containing a mutex, I hit the deleted copy constructor of std::mutex, so I cannot copy it. In which way am I supposed to acquire the mutex in the copy constructor via assignment? ```mutex new_mutex = old_mutex;``` is also deleted in mutex, right? I worked through the Stroustrup C++ introduction and through the 4th edition of his C++ reference, but he only uses global mutexes in his examples and I am really unsure how to get around this mutex limitation. – Sacharon Apr 09 '20 at 18:35
  • My reference to "copying its contents" refers to the object itself, not the mutex. Since the access to the vector in the object is protected by the associated mutex, it logically follows that in order to access the vector in the existing object, for the purpose of making a copy of its contents, you must acquire its mutex, first, then "copy its contents", by that meaning the vector. – Sam Varshavchik Apr 09 '20 at 21:23
  • Your data set is so small that t1 probably finishes before t2 is even created. That is probably why the dataset was larger in the original problem. (to make sure that t1 couldn't complete before t2 starts colliding with it.) – ttemple Apr 17 '20 at 12:39
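
Putting Sam's comments together, the copy operations he describes would look roughly like this (a rough, untested sketch with made-up names; scoped_lock needs C++17):

#include <mutex>
#include <utility>
#include <vector>
using namespace std;

// Sketch of the suggestion from the comments: the mutex itself is never copied;
// copying an object means locking the source's mutex and copying the protected
// vector, while the new object starts with its own, freshly constructed mutex.
class GuardedData {
public:
    GuardedData() = default;
    explicit GuardedData(vector<float> in) : datav(move(in)) {}

    GuardedData(const GuardedData& other) {
        lock_guard<mutex> guard(other.lock);   // protect the source while reading it
        datav = other.datav;                   // this object's mutex stays default-constructed
    }

    GuardedData& operator=(const GuardedData& other) {
        if (this != &other) {
            scoped_lock guard(lock, other.lock);   // lock both sides without deadlock
            datav = other.datav;
        }
        return *this;
    }

private:
    vector<float> datav{};
    mutable mutex lock{};   // mutable so it can be locked on a const source object
};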

1 Answer


I believe I have now done this along the lines of Sam's comments, and it seems to work. Is this correct?

#include <iostream>
#include <mutex>
#include <vector>
#include <thread>
#include <initializer_list>
using namespace std;

class Data {
public:
    unique_ptr<mutex> lockptr{ new mutex };
    void write_data(vector<float>& data) {
        datav = move(data);
    }

    vector<float>* read_data() {
        return(&datav);
    }

    Data(vector<float> in) : datav{ in } {
    };
    Data(const Data&) = delete;
    Data& operator=(const Data&) = delete;
    Data(Data&& old) {
        datav = move(old.datav);
        // lockptr already holds a fresh mutex from its member initializer;
        // the mutex itself is never moved
    }
    Data& operator=(Data&& old) {
        datav = move(old.datav);
        // this object keeps its own, already existing mutex
        return *this;
    }
private:
    vector<float> datav{};
    //mutex lock{};

};

void f1(vector<Data>& in) {
    for (Data& tupel : in) {
        unique_lock<mutex> lock(*(tupel.lockptr));
        vector<float>& in{ *(tupel.read_data()) };
        for (float& f : in) {
            f += (float)1.0;
        };
    };
}

void f2(vector<Data>& in) {
    for (Data& tupel : in) {
        // lock unconditionally: try_lock() can fail, and then the vector would be
        // read (and the mutex unlocked) without actually owning the lock
        unique_lock<mutex> lock(*(tupel.lockptr));
        vector<float>& in{ *(tupel.read_data()) };
        for (float& f : in) {
            cout << f << ",";
        };
    };
}
int main() {
    vector<Data> datastore{};
    datastore.emplace_back(initializer_list<float>{ 0.2, 0.4 });
    datastore.emplace_back(initializer_list<float>{ 0.6, 0.8 });
    vector<float> bigfv(50, 0.3);
    Data demo{ bigfv };
    datastore.push_back(move(demo));
    thread t1(f1, ref(datastore));
    thread t2(f2, ref(datastore));
    t1.join();
    t2.join();
};

By using the unique_ptr, I shouldn't leak any memory when I move an instance, right?
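
For comparison, the same idea can also be written without the heap allocation: keep the mutex as a plain member and write the move operations by hand, so the mutex itself is never moved and each object simply keeps its own. Again only a rough, untested sketch (DataNoPtr is a made-up name):

#include <mutex>
#include <utility>
#include <vector>
using namespace std;

// Same idea without unique_ptr: the mutex is a plain member, so there is nothing
// on the heap at all; the move operations transfer only the vector, and every
// object keeps its own (never-moved) mutex.
class DataNoPtr {
public:
    explicit DataNoPtr(vector<float> in) : datav(move(in)) {}

    DataNoPtr(const DataNoPtr&) = delete;
    DataNoPtr& operator=(const DataNoPtr&) = delete;

    DataNoPtr(DataNoPtr&& old) noexcept : datav(move(old.datav)) {
        // the new object's mutex member is default-constructed
    }
    DataNoPtr& operator=(DataNoPtr&& old) noexcept {
        datav = move(old.datav);   // each object keeps its own mutex
        return *this;
    }

    mutex lock{};   // locked by the threads via lock_guard/unique_lock

private:
    vector<float> datav{};
};

As with the unique_ptr version, moving an object that another thread might currently be accessing would still need external synchronization; the hand-written move operations only make the class usable inside a vector<DataNoPtr>.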

Sacharon