2

In my application there's a part that requires me to make a copy of a container. At the moment I use std::vector (but might consider using something else). The application is very sensitive with regard to the latency. So, I was looking for a way to make a copy of a vector as fast as possible. I found that memcpy does it better than anything else. However, it does not change the internal size of the vector. I.e. vector.size() would still give me 0.

I know what slippery path I am on. I do not mind throwing away safety checks. I do know how many elements are being copied. I do not want to use vector.resize() to change the size of the destination vector (it is a slow operation).

Question:

std::vector<my_struct> destination_vector;
destination_vector.reserve(container_length);
std::memcpy(destination_vector.data(), original_vector.data(), sizeof(my_struct)*container_length);

After the code above I need to tell my destination_vector what size it is. How do I do this?

Thank you.

Aleksey
  • 35
  • 7
  • 1
    I would double check the documentation, vector has a method called resize – JVApen Jan 30 '19 at 16:47
  • 2
    _it is a slow operation_ How slow? Did you measure it? – Paul Sanders Jan 30 '19 at 16:50
  • Get a better compiler. A good one should be using memcpy already. It should also be able to remove whatever code resize includes to initialize the elements, that is killed by memcpy. – Marc Glisse Jan 30 '19 at 16:51
  • It's not possible to *only* change the size. Anything that would increase it's size will also at least initialize the extra elements in some way. Maybe an implementation could skip that part for trivially constructible types but it's not guaranteed. Complexity is only guarnteed to be at worst linear (excluding reallocation costs). – François Andrieux Jan 30 '19 at 16:52
  • @PaulSanders I tested it. Resize takes 30%-60% more time than my memcpy. Making the whole operation 2-3 times slower. – Aleksey Jan 30 '19 at 16:57
  • 2
    Compiler already uses `memcpy` if you do it right... https://godbolt.org/z/7Bp4BF – Baum mit Augen Jan 30 '19 at 16:58
  • @Aleksey Re the slowdown: Supply an allocator that doesn't value-initialize the contents. – Baum mit Augen Jan 30 '19 at 16:58
  • 2
    Psychic debugging: `my_struct` isn't a trivial type, so any use of `memcpy` is UB, and the compiler is trying to save OP from themselves. – Caleth Jan 30 '19 at 17:01
  • @BaummitAugen Thank you for the advice. Can you please provide me some reference on this? I am not a programmer myself. Cheers. – Aleksey Jan 30 '19 at 17:06
  • @Aleksey In the light of your other comments, you can probably scratch that remark, that only makes sense when your type is trivial (which I assumed because otherwise you can't `memcpy` is either). – Baum mit Augen Jan 30 '19 at 17:08
  • @Aleksey So was the vector already large enough for the destination string? Maybe it is 'downsizing' which would involve freeing and re-allocation its internal buffer. Not much you can do in that case, seems. – Paul Sanders Jan 30 '19 at 17:10
  • 1
    @PaulSanders a vector is not allowed to reallocate if resize is less than previous size. – eerorika Jan 30 '19 at 17:21
  • So, I had a second look at my struct and I can give up some things to make it trivially copyable. How can I create a proper allocator for it, so that resize does not value-initialize elements? – Aleksey Jan 30 '19 at 17:21
  • https://stackoverflow.com/a/41049640/3002139 Something like is also in boost if you are using that. – Baum mit Augen Jan 30 '19 at 17:26
  • @BaummitAugen Thank you. – Aleksey Jan 30 '19 at 17:27
  • But you should probably just use the range constructor or the range inserter as demonstrated in my link and the answers anyway. Note the lack of `memset` of similar in the generated assembly on godbolt. – Baum mit Augen Jan 30 '19 at 17:28
  • On my Linux 64, with "std::vector vecT;", a sizeof(vecT) reports 24 bytes, regardless of size of T, regardless of number of elements in vecT. When vecT has 1000 elements, sizeof(vecT) reports 24 bytes, while vecT.size() reports 1000 (elements). Since 24 bytes can not contain 1000 T, you can infer that std::vector must contain pointers and counters, etc, i.e. vecT by itself, regardless of contents, is not trivially copyable. – 2785528 Jan 30 '19 at 17:49

3 Answers3

3
  1. You must to actually resize() the vector before you copy stuff to it using memcpy():

    destination_vector.resize(container_length);

    But it would be better to avoid the use of memcpy() in the first place and use the mechanisms to copy vector content which is offered by vector, as suggested in the other answers:

    std::vector<my_struct> destination_vector(original_vector);

    or if the destination_vector instance already exists:

    destination_vector.insert(destination_vector.begin(), original_vector.begin(), original_vector.end);

    or, the fastest if you do not need the original content any more:

    destination_vector.swap(original_vector);

    All of these variants will be as fast or even faster than your memcpy()variant. If you experience slowness then see 2.:

  2. You probably have a non-trivial default constructor in my_struct. Remove it, or insert a trivial (empty) default constructor to speed things up (to avoid construction of many elements which you never use).

  3. If my_structcontains non-POD data members (like std::string) you cannot use memcpy() at all.

(Side note: You rarely want to call reserve(). The vector maintains its own internal storage in such a way that is always allocates more than is actually needed, exponentially, to avoid frequent resizes/copying when frequently appending elements.)

resize() is not a slow operation. It is as fast as any memory allocation.

Does my_struct have a non-trivial default constructor? Remove it and take care of initialization manually. This might be the reason why you say resize() is slow. It will actually construct your objects. But since you can apparently memcpy() your objects you can probably get away with a trivial (empty) default constructor.

Johannes Overmann
  • 4,914
  • 22
  • 38
  • I appreciate the promptness of your reply. However, I specifically mentioned that resize is not tolerable for me in terms of latency. – Aleksey Jan 30 '19 at 16:48
  • 1
    @Aleksey It's still required. Anything that would change the `size` will (one way or another) initialize the added elements. – François Andrieux Jan 30 '19 at 16:50
  • 3
    "*You generally do not need to call reserve(). "* This is not true. `reserve` is useful when you know how many elements you will insert while still using `push_back`, `emplace_back` or otherwise adding elements one at a time, notably with `std::back_inserter`. – François Andrieux Jan 30 '19 at 16:51
  • @Francois: Yes, it can be useful in special situations. However, in your examples it is not necessary to call `reserve()`beforehand since `vector` uses exponential growth. So adding elements, one at a time, still has linear complexity. – Johannes Overmann Jan 30 '19 at 16:59
  • It is possible that resize looks slow because of the of my constructor. I had to create it in order to emplace my structs into the vector. What are the criteria for a constructor to be called trivial? – Aleksey Jan 30 '19 at 17:03
  • @Aleksey what are the data members of `my_struct`? If they are all fundamental types, or arrays thereof, then the generated constructors are trivial – Caleth Jan 30 '19 at 17:06
  • @Aleksey https://stackoverflow.com/questions/4178175/what-are-aggregates-and-pods-and-how-why-are-they-special/ – eerorika Jan 30 '19 at 17:14
  • @JohannesOvermann Amortized linear complexity, which is not the same thing. Reserving has a big impact if you save even one reallocation. If you reserve enough, `push_back` goes straight to constant complexity. – François Andrieux Jan 30 '19 at 18:05
  • For relatively simple types the time to realloc and move all the elements will dominate the time for actually emplacing new values and incrementing the size even if it's amortized constant. – François Andrieux Jan 30 '19 at 18:20
  • @Francois: Nope. `resize()` usually does not re-allocate and copy at all, on average. Nor does appending a single element. This is the key point. The vector allocates ahead exponentially, so the re-allocations will get rarer and rarer until they are no longer measurable, even when appending single elements. – Johannes Overmann Jan 31 '19 at 07:40
  • @JohannesOvermann Well first, `resize()` is different from a series of individual of insertions at the back. From the context I'll assume that's what you mean. Second, no reallocation on average is not the same as at most one reallocation. If you reserve, you put a hard cap on the overhead at 1 allocation and `N` moves (where `N` is the previous size). If the series of `push_back` without `reserve` would have not had any such overhead `reserve` doesn't either. There's no down side (when you know the final size ahead of time) and you clamp the overhead to the minimum possible. – François Andrieux Jan 31 '19 at 14:31
  • @JohannesOvermann The problem may be over relying on `O` notation. Not all complexities of the same order are created equal. If it helps, you can think of using `reserve` as passing from an `O(N)` algorithm to another `O(N)` algorithm where the cost of each operation is less than or equal to the previous algorithm's. They might both converge to linear time, but one converges faster than the other. – François Andrieux Jan 31 '19 at 14:49
  • @Francois: When you really know the exact size in advance there is no point in using `vector`. Then you could rather use `std::array` and have less overhead and clearer semantics. I stay with it: There is, except for very exotic cases, no use case for `reserve()`. And yes, you can use `reserve()` to avoid the first couple of re-allocations. And you can use `reserver()` to make sure pointers to data do not get corrupted. But I consider both use cases _exotic_. – Johannes Overmann Jan 31 '19 at 23:14
  • @JohannesOvermann The size you will need can both be unknown at compile time and known before starting your insertions. It's not an unusual case. Immediately, data reception comes to mind. It seems you can only imagine scenarios where `std::vector` receives elements continuously. Furthermore, iterator and pointer preservation is *far* from an "exotic" case. It seems our past experiences are just too different to agree. Bottom line though, if you can know the total number of insertions you will perform, there is no reason not to `reserve`, and this is **not** as rare as you make it seem. – François Andrieux Feb 01 '19 at 15:35
2

Your snippet has undefined behaviour, you can't memcpy into an empty vector, even if you have reserved space. It may also be undefined behaviour to memcpy any my_struct objects, if it isn't a TriviallyCopyable type.

You can construct the vector as a copy of the source directly. Most likely your compiler will emit code identical (or faster) than your original snippet, if my_struct is TriviallyCopyable.

std::vector<my_struct> destination_vector(original_vector.begin(), original_vector.begin() + container_length);
Caleth
  • 52,200
  • 2
  • 44
  • 75
  • Apparently my struct is not trivially copyable. This would explain why there's such a big difference between memcpy and regular copy. Thank you. – Aleksey Jan 30 '19 at 17:14
2

How to manually assign vector's size?

You can't. You can only modify vector's size through the modification functions that add or remove elements such as insert, clear, resize etc.

After the code above I need to tell my destination_vector what size it is. How do I do this?

The mentioned code above has undefined behaviour, so it doesn't matter what you do after it.

A simple and efficient way to copy a vector is:

std::vector<my_struct> destination_vector(original_vector);
eerorika
  • 232,697
  • 12
  • 197
  • 326
  • 1
    Nitpick: this assumes that `container_length == original_vector.size()`. I *hope* it is the case, this is nicer than my variation. – Caleth Jan 30 '19 at 17:05
  • @Caleth good point. This is answer to how to copy a vector / container. If `container_length` has some special meaning other than `original_vector.size()`, then this example has different behaviour. – eerorika Jan 30 '19 at 17:10
  • container_length is indeed equal to original_vector.size() – Aleksey Jan 30 '19 at 17:16
  • I have to say that this answer provides the fastest approach out of all suggestions. Still not what I need though. But I appreciate all the help. Apparently, I will need to educate myself further. – Aleksey Jan 30 '19 at 17:30