7

An excerpt of the documentation from emplace_back():

  • Iterator validity

All iterators related to this container are invalidated, but pointers and references remain valid, referring to the same elements they were referring to before the call.

  • Data races

The container is modified.

No contained elements are accessed by the call: concurrently accessing or modifying them is safe (although see iterator validity above).

And an excerpt of the documentation from operator[]():

  • Data races

The container is accessed (neither the const nor the non-const versions modify the container).

Element n is potentially accessed or modified. Concurrently accessing or modifying other elements is safe.

So, given that some instance of a deque has at least one element, accessing it through operator[]() and calling emplace_back() on the container concurrently is indeed thread safe?

I'm inclined to say it is, but can't decide if "accessing" in emplace_back()'s docs comprises the use of operator[]() as in:

int access( std::deque< int > & q )
{
    return q[ 0 ];
}

void emplace( std::deque< int > & q , int i )
{
    q.emplace_back( i );
}

where both functions are called concurrently, or that "accessing" only applies to elements of which some reference or pointer has already been taken:

std::deque< int > q { 1 };

auto * ptr = & q[ 0 ]

std::thread t1 ( [ ptr  ]{ * ref = 0; } );
std::thread t2 ( [ & q ]{ q.emplace_back( 2 ); } );

edit: For further reference, here's what the C++ 14 standard (actually, the November 2014 Working Draft, N4296) states about insertions in deque with respect to references and iterators validity:

  • 23.3.3.4 deque modifiers

(...)

  1. Effects: An insertion in the middle of the deque invalidates all the iterators and references to elements of the deque. An insertion at either end of the deque invalidates all the iterators to the deque, but has no effect on the validity of references to elements of the deque.

(...)

Tarc
  • 3,214
  • 3
  • 29
  • 41
  • No. `deque` is not a thread safe container. Here `emplace_back()` can cause reallocation, which pretty much closes the book on any kind of thread safe access to the container. – Sam Varshavchik Dec 06 '16 at 17:22
  • 3
    @SamVarshavchik - `emplace_back` will never cause a reallocation in a `deque`. It *might* allocate a new page. Even if it did cause reallocation there's nothing that says a mutex lock could not be used to ensure proper concurrent access. The issue for the STL containers regarding concurrency is `back` and `pop_back`. Those cannot be made thread safe for pulling the back value off and keeping it. It requires an external lock. Same for `front`/`pop_front`. – Edward Strange Dec 06 '16 at 17:36
  • 1
    `emplace_back` changes things, `operator[]` read them (read a least a pointer to the data). These operations don't guaranteed to be atomic. Therefore, it is not thread safe. – Shmuel H. Dec 06 '16 at 18:17

2 Answers2

6

Calling any two methods on an object of a standard class concurrently is unsafe, unless both are const, or unless otherwise specified (e.g. as is the case for std::mutex::lock()). This is explored in more detail here

So, using emplace_back and operator[] concurrently is not safe. However, you may safely use a previously obtained reference to a deque element concurrently with a call to emplace_back/push_back due to the reference/pointer validity rules you quoted, e.g.:

int main()
{
    std::deque<int> d;
    d.push_back(5);
    auto &first = d[0];
    auto task = std::async(std::launch::async, [&] { first=3; });
    d.push_back(7);
    task.wait();
    for ( auto i : d )
        std::cout << i << '\n';
}

This will safely output 3 and 7. Note that the reference first is created before launching the asynchronous task.

Community
  • 1
  • 1
Arne Vogel
  • 6,346
  • 2
  • 18
  • 31
  • @SergeyA would be helpful to say what is wrong instead of "something's wrong". – Shmuel H. Dec 06 '16 at 18:33
  • 1
    Note though that the standard guarantees concurrent access of multiple calls to `[]` is guaranteed *on different elements* in all but associative containers and `vector` – Edward Strange Dec 06 '16 at 18:37
3

Edit note: This answer is incorrect in its conclusion that [] and emplace_back are safe to use concurrently. Arne's answer is correct. Leaving this here rather than deleting due to the comments being useful.

Edit2: Well, technically I didn't make that conclusion but it's sort of implied. Arne's is the better answer.


What this documentation appears to be saying, though I don't exactly trust the source, is that concurrently accessing other values is thread safe so long as you don't do it through an iterator.

The reason this is the case is that emplace_back is not going to touch the other values in any way. If the capacity is too low to add another element a new page is allocated. This doesn't affect the other elements. So it's safe to use those values through other threads. It won't ever cause a data race because the same data is not being accessed/modified.

The container does not need to be thread safe in any way for this to be the case. It's like accessing a[0] while you modify a[1]. So long as you access/modify that data correctly (don't cause UB) it's a safe operation. You don't need any locks to protect because you're not concurrently using the same data.

What I would be more concerned about is size. This is likely going to be a value in the deque that is modified by emplace and read by size. This would cause a data race if there's no protection. The documentation says nothing about this and only concerns the access of elements, which is of course OK to do concurrently with a call to size.

According to this answer the standard containers do not have any guarantees regarding thread safety excepting for the above. In other words you can concurrently access/modify different elements, but anything else can cause a data race. Yet in other words, the standard containers are not thread safe and don't offer any concurrency protection.

Community
  • 1
  • 1
Edward Strange
  • 40,307
  • 7
  • 73
  • 125
  • So what happens when two threads simultaneously try to allocate a new page? I would assume that at least the calls to emplace_back need to be guarded, even if other indexed or interator lookups don't. – IanM_Matrix1 Dec 06 '16 at 17:58
  • @IanM_Matrix1 When two threads simultaneously try to allocate a new page, you get UB. Yes, you do need to guard calls to `emplace_back`, hopefully with a synchronization primitive that's cheap in the uncontested case. – Kuba hasn't forgotten Monica Dec 06 '16 at 17:59
  • @IanM_Matrix1 - that would cause a data race. – Edward Strange Dec 06 '16 at 18:00
  • @IanM_Matrix1 - in fact you probably don't need the allocation to cause a data race. The size value needs to be incremented and it as well won't be protected. You'd get UB as soon as you called `emplace_back` from two threads concurrently no matter what condition the capacity is in. – Edward Strange Dec 06 '16 at 18:02
  • 2
    Note that there may be an instruction in `emplace_back` that non-atomically changes the data pointer. So, there may be a case when `emplace_back` will change a member pointer and `operator[]` would access an invalid pointer (a partial pointer), leading to UB. – Shmuel H. Dec 06 '16 at 18:11
  • 1
    @ShmuelH. that is one thing I'm concerned. In the STL implementation I'm using, `operator[](size_t pos)` is implemented by simply returning `* ( begin( ) + pos )`. As in the documentation it says that `emplace_back` would invalidate all iterators, it got me thinking that maybe the usage of `begin()` (which could return a stored iterator) would mess things up. – Tarc Dec 06 '16 at 18:21
  • 2
    @Tarc You are trusting this unofficial ref more than you should. Even if the implementation was only `return ptr[pos]` when `ptr`is a pointer to a simple array, It would be a problem. From the same reason I have said before. It *might* work on some computers, but nothing guarantees it. – Shmuel H. Dec 06 '16 at 18:29
  • 2
    @Tarc - it's worse than that. `begin` could return a new iterator and you could still have issues if things are modified between the call to `begin` and the call to `+` on that returned iterator...or `*` for that matter. `emplace_back` and `[]` cannot be used concurrently without causing UB. The only things guaranteed are concurrent uses of `[]` and that references are never invalidated. – Edward Strange Dec 06 '16 at 18:30
  • @ShmuelH. and CrazyEddie, thank you very much for your answers. I was hoping to avoid explicit synchronization in this case, but as you have explained there's no other way =/ – Tarc Dec 06 '16 at 18:40