1

Wondering if it's safe to iterate over a STL container such as a vector in the following manner to avoid locking on reads/writes but allowing for push_back() operations only by any "writing" threads.

for (size_t i = 0; i < vec.size(); i++)
{
   const T& t = *vec[i];
   // do something with t
}

I understand that iterators can be invalidated by changes to the container but perhaps if we make sure the initial container size is large enough for any future additions, it should also be safe to iterate over the elements without locking reads or writes?

stgtscc
  • 970
  • 1
  • 7
  • 19

3 Answers3

3

Wondering if it's safe to iterate over a STL container such as a vector in the following manner to avoid locking on reads/writes but allowing for push_back() operations only by any "writing" threads.

Don't, this is not thread safe. Consider a thread that is trying to read a value and goes to the current buffer. At this time a second thread is growing the buffer, and after the first one obtained the pointer to the buffer, but before it actually read the value, the second thread releases the contents of the buffer. The first thread is reading dead objects, which is undefined behavior.

Restricting the problem to a reserve()-ed vector (i.e. avoiding the problem of growing the buffer) the approach is still not thread safe. The code in push_back() will do something similar to:

template <...>
class vector {
   T *begin,*end,*capacity;   // common implementation uses 3 pointers

   void push_back(T value) {
       if (end == capacity) { /* grow vector and insert */ }
       else {
          new (end) T(value);
          ++end;
       }
   }
};

The problem here is that without synchronization mechanisms the compiler can reorder the instructions, increment end and store that to memory, then call the constructor of T over the buffer element. If that reordering happens, then your reader thread might see a value of size() that includes the value that is currently being stored, but the value is not yet in the vector.

David Rodríguez - dribeas
  • 204,818
  • 23
  • 294
  • 489
  • I think the last sentence of the question is attempting to say that the capacity will be made large enough to never cause it to change (though obviously not said in correct terminology). – GManNickG Mar 12 '13 at 20:30
  • @GManNickG Precisely. I don't see how this answer qualifies if the capacity is sufficient and it shouldn't grow? And if the worry is that a particular implementation _might_ modify the buffer, then perhaps I should just use an array? – stgtscc Mar 12 '13 at 21:40
  • 1
    @stgtscc: Added another potential issue when the vector buffer need not grow. – David Rodríguez - dribeas Mar 12 '13 at 21:54
  • @DavidRodríguez-dribeas I liked your explanation and it seems to make sense. Thank you. – stgtscc Mar 13 '13 at 13:14
0

You should not write code to depend on this as its an implementation detail and is not part of the vector's exported API as far as I know. If its documented behavior, you can rely on it, if it's not then don't do it. Anything that is implementation dependent and not a documented part of the API is subject to change on different platforms and on different versions of your tools on the same platform.

Also, from @GManNickG's comment --> You're going to have a race condition on the invocation of size() as it will be modified and read without locking here.

John Humphreys
  • 37,047
  • 37
  • 155
  • 255
  • 2
    You'd still have a race condition on `size()`. – GManNickG Mar 12 '13 at 19:45
  • What are possible side effects of the race condition on size()? What if there could only be a single writer? – stgtscc Mar 12 '13 at 19:52
  • If it's only that a "stale" size could be returned, then it's not a big issue for my particular use-case – stgtscc Mar 12 '13 at 19:55
  • @stgtscc: It's undefined behavior. Just use a lock or find a concurrent vector implementation and use that. It's a waste of time to know something is not guaranteed to work and then try to justify using it anyway, when existing solutions are readily available. – GManNickG Mar 12 '13 at 19:57
  • @stgtscc: The worst race condition is not on the `size()` function, but on the buffers. See my answer: a thread might grow and release the old buffer while another thread is trying read out of there. – David Rodríguez - dribeas Mar 12 '13 at 20:24
  • Was gonna delete this but I'll leave it up for the comments. – John Humphreys Mar 12 '13 at 21:05
0

You cannot rely on suggestion about iterators will not be invalidated (cus there is many space left). And you need shared_mutex and shared_lock for this. (example of usage)

Community
  • 1
  • 1
Galimov Albert
  • 7,269
  • 1
  • 24
  • 50