Move-assignment slower than copy-assignment -- bug, feature, or unspecified?

Question

I recently realized that the addition of move semantics in C++11 (or at least my implementation of it, Visual C++) has actively (and quite dramatically) broken one of my optimizations.

Consider the following code:

#include <vector>
int main()
{
    typedef std::vector<std::vector<int> > LookupTable;
    LookupTable values(100);  // make a new table
    values[0].push_back(1);   // populate some entries

    // Now clear the table but keep its buffers allocated for later use
    values = LookupTable(values.size());

    return values[0].capacity();
}

I followed this kind of pattern to perform container recycling: I would re-use the same container instead of destroying and recreating it, to avoid unnecessary heap deallocation and (immediate) reallocation.

On C++03, this worked fine -- that means this code used to return 1, because the vectors were copied elementwise, while their underlying buffers were kept as-is. Consequently I could modify each inner vector knowing that it could use the same buffer as it had before.

On C++11, however, I noticed that this results in a move of the right-hand side onto the left-hand side, which performs an element-wise move-assignment to each vector on the left-hand side. This in turn causes the vector to discard its old buffer, suddenly reducing its capacity to zero. Consequently, my application now slows down considerably due to excess heap allocations/deallocations.

My question is: is this behavior a bug, or is it intentional? Is it even specified by the standard at all?

Update:

I just realized that correctness of this particular behavior may depend on whether or not a = A() can invalidate iterators that point to the elements of a. However, I don't know what the iterator invalidation rules for move-assignment are, so if you're aware of them it may be worth mentioning those in your answer.

It's unspecified what happens to `capacity` in a copy or a move. — M.M, Aug 03 '14 at 11:10
Why don't you do `for (auto& v : values) { v.clear(); }` ? which seems to be the intend anyway. — Jarod42, Aug 03 '14 at 11:10
@Jarod42: Because this was just a mini-example to illustrate the problem. In reality the outer vector is a massive class and has tons of members, some of which are other containers. And in that case it becomes a bad idea to clear each and every field separately; not only is it inconvenient but the code can get out-of-sync with the list of members fairly quickly. On the other hand it's pretty easy to say `instance = MyMassiveClass()` and expect the auto-generated memberwise assignment operator to take care of it. — user541686, Aug 03 '14 at 11:11
@Mehrdad this isn't really an answer, I don't address the larger issue that the performance of the same code is made worse by C++11. — M.M, Aug 03 '14 at 11:12
@MattMcNabb: I guess. If I don't see a better one it'd be fair enough for me to accept, but feel free to wait for a better one along with me. :) — user541686, Aug 03 '14 at 11:13
@Mehrdad: I don't see how the buffers were being reused in the first place. In both cases the elements in `values` were entirely reconstructed. The only difference I see is the choice of default vector capacity (which C++11 mandates to be 0, while C++03 makes no requirements). I'm very surprised that the code is faster in C++03. — Mankarse, Aug 03 '14 at 11:14
@Mankarse: It has nothing to do with the *default* vector capacity. It has to do with the fact that assigning one vector to another is not required to cause the two to have the same capacity; it can simply result in a memberwise assignment, keeping the original capacity intact. — user541686, Aug 03 '14 at 11:15
@Mankarse: It is faster when OP *repopulates* the vector which have non-zero capacity in C++03.. Not for the copy/move itself. — Jarod42, Aug 03 '14 at 11:16
@MattMcNabb: I just realized the answer to this may also depend on the iterator invalidation rules for a move assignment. I don't know what they are, but if you do, maybe it'll give you an idea as to whether or not the buffer may indeed be reallocated despite the fact that the capacity is unspecified. — user541686, Aug 03 '14 at 11:19
If I remember correctly, Visual C++ does not have a switch for standard revisions. How do you compile it in C++03 mode? — Siyuan Ren, Aug 03 '14 at 11:20
Avoiding move-assignment in your example is trivial: Just use a named object. — Deduplicator, Aug 03 '14 at 11:23
Disregard my comment, it was based on a misconception about the semantics of `vector::operator=`. — Mankarse, Aug 03 '14 at 11:25
I wonder if your optimization was guaranteed to work as expected in C++03.. — dyp, Aug 03 '14 at 11:30
@dyp: Yeah I'm not sure -- I think it depends on the iterator invalidation rules for assignment, but I'm not sure what they are. — user541686, Aug 03 '14 at 11:30
Move assignment can either move-assign+move-construct the individual elements or the whole container (depending on the allocator). It can therefore invalidate all iterators. I cannot find a decent quote in the Standard, though. — dyp, Aug 03 '14 at 11:37
@dyp: What happens for the case of the default allocator? (I suppose other allocators are not so important, because the user would be aware of their customized behavior at that point.) — user541686, Aug 03 '14 at 11:39
Well, as per @MattMcNabb , you were exploiting unspecified behaviour to start with, and got so got burned. *"it becomes a bad idea to clear each and every field separately; not only is it inconvenient but the code can get out-of-sync with the list of members fairly quickly"* -> sure, but that's the case with e.g. move and copy assignment operator definitions, etc. too. Methinks if you want to do this right for such case, you need a custom `.clear()`, and remember to maintain it. — CodeClown42, Aug 03 '14 at 11:43
@Mehrdad All instances of `default_allocator` compare equal. This guarantees that you can free the memory of the RHS of a move assignment with the allocator of the LHS (if both are `default_allocator`s). The implementation can therefore choose just to move the pointers. — dyp, Aug 03 '14 at 11:45
@dyp: Oh I see, that's really interesting. If you'd like to post an answer at some point please mention that! — user541686, Aug 03 '14 at 11:47
In my opinion, the container specification is rather convoluted. It also contains known defects, and I think it does not state the iterator invalidation rules explicit enough for non-implementers. For example, I cannot see any guarantee that move-assignment is O(1) if it's allowed, or when this is allowed. This seems to essentially rely on QoI, which I don't quite like for such an important distinction as O(1) vs O(N). — dyp, Aug 03 '14 at 11:47
@goldilocks: Sorry for the second comment, I misunderstood something (deleted it now). Yeah, that might be the only way then. — user541686, Aug 03 '14 at 11:47
@Mehrdad Okay, but it still sounds to me like the difference between something that is *guaranteed* to sabotage this vs. something that *might allow it*, yes? I.e., not something to recommend... — CodeClown42, Aug 03 '14 at 11:48
@dyp: That's a really good point, I never thought about the O(1)-ness of move assignment! :( Now I'm going to have trouble sleeping... — user541686, Aug 03 '14 at 11:48
@goldilocks: Yeah. Or at least that's how I see it now. It might change in 10 minutes. :-) I don't really know, that's why I asked the question. — user541686, Aug 03 '14 at 11:49
Maybe I should qualify my statement: Move-assignment has to be O(N) in terms of operations, since the existing elements of the LHS have to be destroyed. But it's not clear if it's guaranteed to only move the pointers if that's possible (i.e. the O(x) of element-assignment). — dyp, Aug 03 '14 at 11:50
Ahh ok, my fault, this is a known defect: http://www.open-std.org/jtc1/sc22/wg21/docs/lwg-active.html#2321 Should have checked >=N3797 and not the Standard. And the default allocator now is propagated on move assignment, too: http://www.open-std.org/JTC1/SC22/WG21/docs/lwg-defects.html#2103 — dyp, Aug 03 '14 at 11:59
@dyp: So that means it *must* move the pointers when the allocator allows for it, right? — user541686, Aug 03 '14 at 12:04
how about just apply std::ref to the rhs. i have not time to test that. — Cheers and hth. - Alf, Aug 03 '14 at 12:10
@Cheersandhth.-Alf `std::ref` cannot be applied to rvalues, if that's what you meant. Though some `as_lvalue` can be written, of course. — dyp, Aug 03 '14 at 12:11
sorry i had to dehydrate. i realized that just as i hit return. one probably has to write one's own `ref_to` (rvalue reference), or just declare a local variable. — Cheers and hth. - Alf, Aug 03 '14 at 12:16
@Jarod42: If different version of Visual C++ is used, then the compiler's optimizer and STL are both different. How can you be sure that the standard revision is the reason behind this? — Siyuan Ren, Aug 03 '14 at 13:06

dyp · Accepted Answer · 2014-08-03T22:30:13.850

C++11

The difference in the behaviours in the OP between C++03 and C++11 are due to the way move assignment is implemented. There are two main options:

Destroy all elements of the LHS. Deallocate the LHS's underlying storage. Move the underlying buffer (the pointers) from the RHS to the LHS.
Move-assign from the elements of the RHS to the elements of the LHS. Destroy any excess elements of the LHS or move-construct new elements in the LHS if the RHS has more.

_{I think it is possible to use option 2 with copies, if moving is not noexcept.}

Option 1 invalidates all references/pointers/iterators to the LHS, and preserves all iterators etc. of the RHS. It needs O(LHS.size()) destructions, but the buffer movement itself is O(1).

Option 2 invalidates only iterators to excess elements of the LHS which are destroyed, or all iterators if a reallocation of the LHS occurs. It is O(LHS.size() + RHS.size()) since all elements of both sides need to be taken care of (copied or destroyed).

As far as I can tell, there is no guarantee which one happens in C++11 (see next section).

In theory, you can use option 1 whenever you can deallocate the underlying buffer with the allocator that is stored in the LHS after the operation. This can be achieved in two ways:

If two allocators compare equal, one can be used to deallocate the storage allocated via the other one. Therefore, if the allocators of LHS and RHS compare equal before the move, you can use option 1. This is a run-time decision.
If the allocator can be propagated (moved or copied) from the RHS to the LHS, this new allocator in the LHS can be used to deallocate the storage of the RHS. Whether or not an allocator is propagated is determined by allocator_traits<your_allocator :: propagate_on_container_move_assignment. This is decided by type properties, i.e. a compile-time decision.

C++11 minus defects / C++1y

After LWG 2321 (which is still open), we have the guarantee that:

no move constructor (or move assignment operator when allocator_traits<allocator_type> :: propagate_on_container_move_assignment :: value is true) of a container (except for array) invalidates any references, pointers, or iterators referring to the elements of the source container. [ Note: The end() iterator does not refer to any element, so it may be invalidated. — end note ]

This requires that move-assignment for those allocators which are propagated on move assignment has to move the pointers of the vector object, but must not move the elements of the vector. (option 1)

The default allocator, after LWG defect 2103, is propagated during move-assignment of the container, hence the trick in the OP is forbidden to move the individual elements.

My question is: is this behavior a bug, or is it intentional? Is it even specified by the standard at all?

No, yes, no (arguably).

It seems the resolution of LWG 2321 will not be part of C++1y either; so it might even be C++1z. — dyp, Aug 03 '14 at 13:14

score 11 · Answer 2 · edited May 23 '17 at 10:33

See this answer for a detailed description of how vector move assignment must work. As you are using the std::allocator, C++11 puts you into case 2, which many on the committee considered a defect, and has been corrected to case 1 for C++14.

Both case 1 and case 2 have identical run time behavior, but case 2 has additional compile-time requirements on the vector::value_type. Both case 1 and case 2 result in memory ownership being transferred from the rhs to the lhs during the move assignment, giving you the results you observe.

This is not a bug. It is intentional. It is specified by C++11 and forward. Yes, there are some minor defects as dyp pointed out in his answer. But none of these defects will change the behavior you are seeing.

As was pointed out in the comments, the simplest fix for you is to create an as_lvalue helper and use that:

template <class T>
constexpr
inline
T const&
as_lvalue(T&& t)
{
    return t;
}

// ...

// Now clear the table but keep its buffers allocated for later use
values = as_lvalue(LookupTable(values.size()));

This is zero-cost, and returns you to precisely the C++03 behavior. It might not pass code review though. It would be clearer for you to iterate through and clear each element in the outer vector:

// Now clear the table but keep its buffers allocated for later use
for (auto& v : values)
    v.clear();

The latter is what I recommend. The former is (imho) obfuscated.

Wish I could accept two answers. Thanks for the great answer! — user541686, Aug 03 '14 at 21:14

Move-assignment slower than copy-assignment -- bug, feature, or unspecified?

Update:

2 Answers2

C++11

C++11 minus defects / C++1y

Linked