3

I'm trying to take a (very) large vector, and reassign all of the values in it into a multidimensional (2D) vector>.

The multidimensional vector has both dimensions resized to the correct size prior to value population, to avoid reallocation.

Currently, I am doing it single-threaded, but it is something that needs to happen repeatedly, and is very slow due to the large size (~7 seconds). The question is whether it is thread-safe for me to use, for instance, a thread per 2D element.

Some pseudocode:

vector<string> source{/*assume that it is populated by 8,000,000 strings 
of varying length*/};
vector<vector<string>> destination;

destination.resize(8);
for(loop=0;loop<8;loop++)destination[loop].resize(1000000);

//current style
for(loop=0;loop<source.size();loop++)destination[loop/1000000][loop%1000000]=source[loop];

//desired style
void Populate(int index){
    for(loop=0;loop<destination[index].size();loop++)destination[index][loop]=source[index*1000000+loop];
}

for(loop=0;loop<8;loop++)boost::thread populator(populate,loop);

I would think that the threaded version should work, since they're writing to separate 2nd dimensional elements. However, I'm not sure whether writing the strings would break things, since they are being resized.

Eric Jones
  • 31
  • 2

1 Answers1

1

When considering only thread-safety, this is fine.

Writing concurrently to distinct objects is allowed. C++ considers objects distinct even if they are neighboring fields in a struct or elements in the same array. The data type of the object does not matter here, so this holds true for string just as well as it does for int. The only important thing is that you must ensure that the ranges that you operate on are really fully distinct. If there is any overlap, you have a data race on your hands.

There is however, another thing to take into consideration here and that is performance. This is highly platform dependent, so the language standard does not give you any rules here, but there are some effects to look out for. For instance, neighboring elements in an array might reside on the same cache line. So in order for the hardware to be able to fulfill the thread-safety guarantees of the language, it must synchronize access to such elements. For instance: Partitioning array access in a way that one thread works out all the elements with even indices, while another works on the odd indices is technically thread-safe, but puts a lot of stress on the hardware as both threads are likely to contend for data stored on the same cache line.

Similarly, your case there is contention on the memory bus. If your threads are able to complete calculation of the data much faster than you are able to write them to memory, you might not actually gain anything by using multiple threads, because all the threads will end up waiting for the memory in the end.

Keep these things in mind when deciding whether parallelism is really the right solution to your problem.

ComicSansMS
  • 51,484
  • 14
  • 155
  • 166
  • Thank you very much for a clear, in-depth explanation of all of the comments, and a few gotchas to look out for. In my case, each thread gets an entire branch of the 2D vector, so no worries about range collision, but it's good to know for other cases. For the memory and cache issues, I'll just have to test it to determine efficiency in my situation. But it's a good thing to point out that multi-threading isn't always a magical bullet. – Eric Jones Aug 01 '17 at 12:31