16

I know that is possible to read concurrently from a std::vector without "bad" consequences because this operation can be considered thread-safe.

But the same cannot be said for writing operations. But, I am wondering if this is not always true, for example considering my particular scenario.

I have a std::vector<bool>, where all the elements are initialized to false, and, given an array of indices, I need to change the value of these elements (vector[index] for each index) from false to true.

If I use a different thread for each index (and there is the possibility that some indices have the same value), can this operation be considered thread-safe?

If the vector is a std::vector<int> (or any primitive type) and the value assigned is always the same (for example 1) can this operation still be considered thread-safe?

Nick
  • 10,309
  • 21
  • 97
  • 201
  • Is the size of the vector constant while the concurrent access happens? – decltype_auto Nov 09 '15 at 20:19
  • @decltype_auto yes, the size of the vector remains constant. – Nick Nov 09 '15 at 20:20
  • This has been covered on SO before. See: http://stackoverflow.com/questions/4346742/stl-vector-and-thread-safety http://stackoverflow.com/questions/9042571/is-stdvector-or-boostvector-thread-safe http://stackoverflow.com/questions/9305315/stdvector-thread-safety-multi-threading – Craig Estey Nov 09 '15 at 20:26
  • then, given "an array of indices", std::valarray with a [indirect_array](http://en.cppreference.com/w/cpp/numeric/valarray/indirect_array) access may be sth for you. But that's not related to your concurrency topic. – decltype_auto Nov 09 '15 at 20:29
  • @CraigEstey Please note that this question is more specific. – Nick Nov 09 '15 at 20:31
  • @Nick You said different thread for each index yet some indexes have the same value? It may be thread safe, assuming the value written is the same, but if so why bother? There are cache coherency issues here for some arches. What's the larger context/problem--it can make a difference. Are you trying to do something "lockless" (vs. a mutex)? There was an entire 2 hour video at cppcon about this--not as easy as it seems. And the locking version can be faster in certain cases. I've done plenty of multithread and I'm a bit mystified here – Craig Estey Nov 09 '15 at 21:04

2 Answers2

26

Concurrent writes to vector<bool> are never ok, because the underlying implementation relies on a proxy object of type vector<bool>::reference which acts as if it was a reference to bool, but in reality will fetch and update the bitfield bytes as needed.

When using multiple threads without synchronization, the following might happen: Thread 1 is supposed to update a bit, and reads the byte containing it. Then thread 2 reads the same byte, then thread 1 updates a bit and writes the byte back, and then thread 2 updates another bit and writes the byte back, overwriting the edit of thread 1.

This is just one possible scenario, there are others that would lead to the same kind of data corruption.


In the vector<int> situation, if you are absolutely sure that all threads write the same value into the vector, then this operation will generally not lead to data corruption. However, the standard is of course always extra careful and defines all concurrent accesses to a memory location, of which at least one is a write access, to be undefined behavior:

Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location. – intro.races/2

Therefore, as soon as you do any modification operation on the same element from two different threads, you will have a race condition and need proper synchronization, e.g. by using std::atomic<int>.

Felix Dombek
  • 13,664
  • 17
  • 79
  • 131
  • 3
    "In the vector situation, if you are absolutely sure that all threads write the same value into the vector, then this operation will never lead to data corruption." – this is not correct, concurrent writes to the same memory location result in undefined behavior regardless of whether the value to be written is the same or not. To quote from [(intro.races/2)](https://timsong-cpp.github.io/cppwp/n4659/intro.races): "Two expression evaluations conflict if one of them modifies a memory location and the other one reads or modifies the same memory location." – Arne Vogel Jul 25 '18 at 11:55
  • To clarify… this applies to writing the same value at the same index. If all threads write to a different index, as was part of the actual question, the memory locations will be separate and there is no data race. – Arne Vogel Jul 25 '18 at 11:56
  • @ArneVogel ah well, the standard is of course always extra careful ... you'd certainly not expect an `int` to be written non-atomically on just about any system there is. I'm putting this quote in my answer nonetheless – Felix Dombek Jul 25 '18 at 14:32
  • @FelixDombek If you rely on expectations not found in the relevant standards, your code breaks horribly if those expectations turn out to be wrong in the future. Having gone through this pain enough times, you'd think we'd start learning to avoid it. – David Schwartz Jul 26 '18 at 23:02
  • @FelixDombek - will it be thread safe if I use a vector of "char" or "uint8_t' types which are one byte (and not one word) in size – User2332 Dec 16 '20 at 22:58
  • @KartikLakhotia These are written atomically, so at least any individual value cannot become corrupted by concurrent writes. But the quote from the standard still applies, so it is still UB. And therefore you'd risk to run into issues caused by compiler optimizations, etc. – Felix Dombek Dec 17 '20 at 13:00
24

[container.requirements.dataraces]/2 says:

Notwithstanding (17.6.5.9), implementations are required to avoid data races when the contents of the contained object in different elements in the same container, excepting vector<bool>, are modified concurrently.

So you can safely modify distinct elements of the same standard library container from distinct threads, except when that container is vector<bool>.

Casey
  • 41,449
  • 7
  • 95
  • 125