
I have code where I routinely fill a vector with between 0 and 5000 elements. I know the maximum never exceeds 5000. Instead of initializing the vector multiple times, I would like to do it just once:

std::vector<MyStruct> myvector;  // MyStruct stands in for the element type
myvector.reserve(5000);

However, to fill the vector again, I first have to clear it without altering its capacity, so usually I call `myvector.clear();`

This is an O(n) operation. Is there something simple I can do to increase the performance of this, or is this about the best it will get?
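(For context, the reuse pattern presumably looks something like this; `MyStruct`, `processBatch`, and the fill logic are placeholders, not from the question.)

```cpp
#include <vector>

struct MyStruct { int a; double b; };    // hypothetical element type

std::vector<MyStruct> myvector;

void processBatch(int count) {           // count is anywhere in 0..5000
    myvector.clear();                    // size -> 0, capacity unchanged
    for (int i = 0; i < count; ++i)
        myvector.push_back(MyStruct{i, i * 0.5});
    // ... consume myvector; size() == count, no stale elements ...
}

int main() {
    myvector.reserve(5000);              // allocate once, up front
    processBatch(5000);
    processBatch(3500);
}
```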

user788171
  • Is assigning to the existing elements a reasonable solution? – Joseph Mansfield May 07 '13 at 13:33
  • No, because I might have 5000 elements the first time, and 3500 the next time, and there would be 1500 old elements left at the end... – user788171 May 07 '13 at 13:37
  • Is the "destruction" of elements an issue? – Mats Petersson May 07 '13 at 13:38
  • So, later I will want to loop over myvector and presumably use myvector.size(). – user788171 May 07 '13 at 13:38
  • I forgot to mention, no destructor for the struct. – user788171 May 07 '13 at 13:39
  • So, resize to the current size, and overwrite existing elements. (or resize at the end, if you don't know how many elements) – Mats Petersson May 07 '13 at 13:39
  • I guess your best bet is to wrap `vector` and provide custom `resize` and `size` facilities which emulate the changes in size. In other words, your custom `resize` would simply change a new `size` member and not destroy any elements. Of course, you should think about how to make it consistent with the rest of `vector` so that the wrapper is seamless to the client, but it's not a big deal. (A sketch of this idea follows these comments.) – Alexander Shukaev May 07 '13 at 13:42
  • I'll give you the best answer ever here in the comments (if this is a real problem): run a profiler :) – NoSenseEtAl May 21 '13 at 07:35
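A minimal sketch of the wrapper Alexander Shukaev suggests above. All names (`ReusableBuffer`, etc.) are hypothetical, and it assumes the element type is trivially destructible:

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Elements are default-constructed once and overwritten on reuse;
// "clearing" just resets a logical size, so no destructors run per
// cycle. Only sensible for trivially destructible types; push_back
// assumes the caller never exceeds the fixed capacity.
template <typename T>
class ReusableBuffer {
    static_assert(std::is_trivially_destructible<T>::value,
                  "skipping destruction is only safe for trivial types");
    std::vector<T> storage_;   // fixed-capacity backing store
    std::size_t    size_ = 0;  // logical size
public:
    explicit ReusableBuffer(std::size_t cap) : storage_(cap) {}

    void clear() { size_ = 0; }                           // O(1)
    void push_back(const T& v) { storage_[size_++] = v; } // overwrite in place
    std::size_t size() const { return size_; }
    T&       operator[](std::size_t i)       { return storage_[i]; }
    const T& operator[](std::size_t i) const { return storage_[i]; }
};
```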

3 Answers


If your struct has a non-trivial destructor, then that destructor must be called for every element of the vector regardless of how it is emptied. If your struct has only a trivial destructor, the compiler or the standard library implementation is allowed to optimize the destruction away and give you an O(1) operation.
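If you want to verify that your struct qualifies, you can check the trait at compile time. A small sketch (`MyStruct` is a hypothetical stand-in for the question's struct):

```cpp
#include <type_traits>

struct MyStruct { int id; double value; };  // hypothetical stand-in

// Passes only when no destructor call is needed per element, which is
// the precondition for an implementation to make clear() O(1).
static_assert(std::is_trivially_destructible<MyStruct>::value,
              "non-trivial destructor: clear() must visit every element");
```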

Vaughn Cato
  • There is a difference between *"the compiler will likely optimize ..."* and *"the implementation can optimize the cost away ..."* (as @DavidRodríguez-dribeas said in his answer). The latter sounds more reasonable to me! – Nawaz May 08 '13 at 03:41
  • @Nawaz: I suppose that is true. My intent was something along the lines of "most compilers perform this optimization, so chances are you are using one of those compilers", but I can see how it could be interpreted as "sometimes the compiler will and sometimes the compiler won't optimize this, but usually it will". – Vaughn Cato May 08 '13 at 06:28
  • I think you didn't understand my comment. The statement *"the implementation can optimize the cost away ..."* is a superset, as it also includes the idea *"the compiler will likely optimize"*, in addition to the possibility of "optimized" library code. Meaning: even if the compiler itself doesn't do the optimization, the library could be written so as to emit `O(1)` code when the `value_type` has a trivial destructor (which is what @DavidRodríguez-dribeas also meant, see his comments). – Nawaz May 08 '13 at 06:50

The cost of clear() depends greatly on what the stored objects are, and in particular on whether they have a trivial destructor. If the type does not have a trivial destructor, then the call must destroy all stored objects and it is in fact an O(n) operation, but you cannot really do anything better.

Now, if the stored elements have trivial destructors, then the implementation can optimize the cost away and clear() becomes a cheap O(1) operation (it just resets the size, i.e. the internal *end* pointer).

Remember that to understand asymptotic complexity you need to know what it measures. In the case of clear() it is the number of destructors called; if the (hidden) cost of each call is zero, then the operation is a no-op.
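A quick sketch to convince yourself that clear() keeps the allocation around (not part of the original answer; the printed values assume a typical implementation):

```cpp
#include <iostream>
#include <vector>

int main() {
    std::vector<int> v;
    v.reserve(5000);
    v.assign(5000, 42);

    v.clear();                          // element count drops to zero...
    std::cout << v.size() << '\n';      // prints 0
    std::cout << v.capacity() << '\n';  // ...but the allocation is kept:
                                        // still at least 5000 in practice
}
```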

David Rodríguez - dribeas
  • Can you clarify what is meant by trivial destructor? I'm not familiar with this terminology. – user788171 May 07 '13 at 13:40
  • I think in this context `trivial destructor` is the same as `no destructor`. – Mats Petersson May 07 '13 at 13:41
  • @user788171: You may want to read [What are Aggregates and PODs and how/why are they special?](http://stackoverflow.com/a/7189821/2070725) in order to understand what this whole "trivial" stuff is about (scroll down a bit to reach the "trivial" part). – syam May 07 '13 at 13:47
  • @MatsPetersson: Well... a bit more complex, but that is the idea. Basically, the type cannot have a user-provided destructor (what you said), it cannot have any virtual functions or bases, and all of its members and bases must also have trivial destructors (i.e. no member may have virtual functions/bases or user-provided destructors). – David Rodríguez - dribeas May 07 '13 at 14:02
  • FWIW: It's probable that the compiler can optimize the destructor away in the case of some (formally) non-trivial, but simple destructors. The most obvious case is where the only thing that makes your destructor non-trivial is that you derive from an abstract base class (with no data itself). Technically, in such cases, the compiler has to change the `vptr` and call the base class destructor. Practically, if the base class destructor is empty and inline, the compiler won't bother. – James Kanze May 07 '13 at 14:05
  • @JamesKanze: Yes, that is possible and I would expect it to even be common (I haven't tested this, but it was mentioned in "The C++ Object Model" as an optimization back in 1995, so I'd expect it to be *common*). But I was considering what some libraries do at the library level (even without optimization): detect whether the type has a trivial destructor through traits and use an implementation that does not even call the destructor (i.e. nothing to be optimized in the first place). – David Rodríguez - dribeas May 07 '13 at 14:12
  • +1, because this answer sounds more reasonable to me: there is a difference between *"the compiler will likely optimize ..."* (as the other answer says) and *"the implementation can optimize the cost away ..."* (as this answer says). – Nawaz May 08 '13 at 03:42
  • @Nawaz that sounds like a distinction without a difference. It isn't particularly incorrect to refer to the whole implementation as "the compiler" – Caleth Jun 26 '20 at 09:34

Anything you do to remove the existing items from the vector needs to (potentially) invoke the destructor of each item being destroyed. Therefore, from the container's viewpoint, the best you can hope for is linear complexity.

That leaves only the question of what sort of items you store in the vector. If you store something like int, which the compiler can/will know ahead of time has no destructor to invoke, chances are at least pretty good that removal will end up with constant complexity.

I doubt, however, that changing the syntax (e.g., clear() vs. resize() vs. erase(begin(), end())) will make any significant difference at all. The syntax doesn't change the fact that (in the absence of threading) invoking N destructors is an O(N) operation.
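For completeness, a sketch of the three spellings mentioned above (the function name is made up):

```cpp
#include <vector>

// Three equivalent ways to empty a vector; none of them releases the
// allocation, so a subsequent refill will not reallocate.
void emptyIt(std::vector<int>& v) {
    v.clear();                       // idiomatic
    // v.resize(0);                  // same effect
    // v.erase(v.begin(), v.end());  // same effect, more verbose
}
```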

Jerry Coffin