7

How severe is the efficiency loss between using memcpy and std::copy?

I have a situation where the vector implementation on my system doesn't appear to use contiguous memory, which is making me have to std::copy its contents later on rather than doing memcpy(dest, &vec[0], size);. I'm not sure how badly this is likely to impact efficiency.

John Humphreys
  • 4
    What implementation are you using? (C++03 guarantees contiguous storage). – R. Martinho Fernandes Sep 02 '11 at 15:48
  • 4
    Doesn't the standard *require* vectors to use contiguous memory, so their addresses can be passed to functions that expect an array? – Frédéric Hamidi Sep 02 '11 at 15:48
  • 1
    If your vector's data isn't contiguous, then your implementation isn't standard compliant. – Kerrek SB Sep 02 '11 at 15:48
  • Your `std::vector` is not conforming if it doesn't use contiguous memory. Generally, on most implementations for types where `memcpy` is valid, `std::copy` performs about the same as `memmove`. – CB Bailey Sep 02 '11 at 15:49
  • @Sven: Oh, was that the thing they changed in C++03? – Kerrek SB Sep 02 '11 at 15:50
  • Measure the difference, in your application. There are too many variables for us (or you) to say that one is necessarily faster than the other. – Robᵩ Sep 02 '11 at 15:50
  • @Kerrek: I don't know about C++98, but C++03 says this in paragraph 23.2.4: "The elements of a vector are stored contiguously, meaning that if v is a vector<T, Allocator> where T is some type other than bool, then it obeys the identity &v[n] == &v[0] + n for all 0 <= n < v.size()." – Sven Sep 02 '11 at 15:51
  • 3
    @Kerrek : As I recall, the two major changes were that and creating the distinction between default-initialization and value-initialization. – ildjarn Sep 02 '11 at 15:53
  • 3
    @Kerrek - C++98 just didn't say anything about vector being contiguous or not. Most people assumed it would be that anyway. – Bo Persson Sep 02 '11 at 15:54
  • What evidence do you have that your vector isn't using contiguous memory? – TheJuice Sep 02 '11 at 16:02
  • Coming in late to my own discussion - but this is an RTOS Unix variant that's supposed to be "unix-like" as they call it. Memcpying the vector from address 0 yielded correct results for the first element and gibberish for the rest, so I know it's not contiguous, though I could look into the implementation. That wasn't really my question though - I just wanted to know if my solution was efficient enough :) – John Humphreys Sep 02 '11 at 17:21

3 Answers

14

A reasonably decent implementation will have std::copy compile to a call to memmove in the situations where this is possible (i.e. the element type is a POD).

If your implementation doesn't have contiguous storage (the C++03 standard requires it), memmove might be faster than std::copy, but probably not too much. I would start worrying only when you have measurements to show it is indeed an issue.
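As a rough sketch of the two approaches side by side (the helper names `is_contiguous`, `copy_with_std_copy` and `copy_with_memcpy` are made up for illustration, and `int` stands in for whatever POD element type is actually stored), something like the following could be used to check the contiguity identity and to confirm that both copies give the same result. Note that `memcpy` takes its size in bytes, not in elements.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: checks the C++03 identity &v[n] == &v[0] + n
// for every valid index. A conforming std::vector always passes.
bool is_contiguous(const std::vector<int>& v) {
    for (std::size_t n = 0; n < v.size(); ++n) {
        if (&v[n] != &v[0] + n) return false;
    }
    return true;
}

// Copy via std::copy. Works for any element type; the caller must
// provide room for src.size() elements at dest.
void copy_with_std_copy(const std::vector<int>& src, int* dest) {
    std::copy(src.begin(), src.end(), dest);
}

// Copy via memcpy. Only valid for POD element types and contiguous
// storage. The third argument is a byte count, hence the sizeof.
void copy_with_memcpy(const std::vector<int>& src, int* dest) {
    if (src.empty()) return;  // &src[0] must not be formed on an empty vector
    std::memcpy(dest, &src[0], src.size() * sizeof(int));
}
```

If `is_contiguous(vec)` returns true and both functions fill their destination buffers identically, the choice between them is a matter of measurement rather than correctness.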

R. Martinho Fernandes
14

While you've gotten a number of good answers, I feel obliged to add one more point: even if the code is theoretically less efficient, it's rarely likely to make any real difference.

The reason is pretty simple: the CPU is a lot faster than memory in any case. Even relatively crappy code will still easily saturate the bandwidth between the CPU and memory. Even if the data involved is in the cache, the same generally remains true -- and (again) even with crappy code, the move is going to be done far too quickly to care anyway.

Quite a few CPUs (e.g., Intel x86) have a special path in the hardware that will be used for most moves in any case, so there will often be literally no difference in speed between implementations that appear quite a bit different even at the assembly code level.

Ultimately, if you care about the speed of moving things around in memory, you should worry more about eliminating that than making it faster.
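For anyone who wants to check this on their own system, here is a minimal timing sketch. It assumes a C++11 compiler for `std::chrono` and lambdas; `elapsed_ms`, `CopyTimings` and `compare_copies` are hypothetical names, and `int` is just a stand-in element type. The numbers it produces depend heavily on optimization flags, buffer size and cache state, so treat them as a rough indication only.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: runs fn() reps times and returns the elapsed
// wall-clock time in milliseconds.
template <typename Fn>
double elapsed_ms(Fn fn, int reps) {
    assert(reps > 0);
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < reps; ++i) fn();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

// Timings for both approaches plus a sanity check on the results.
struct CopyTimings {
    double std_copy_ms;
    double memcpy_ms;
    bool same_result;
};

// Copies `count` ints `reps` times with each approach and times them.
CopyTimings compare_copies(std::size_t count, int reps) {
    std::vector<int> src(count, 42);
    std::vector<int> a(count), b(count);
    CopyTimings t;
    t.std_copy_ms = elapsed_ms(
        [&] { std::copy(src.begin(), src.end(), a.begin()); }, reps);
    t.memcpy_ms = elapsed_ms(
        [&] { if (count) std::memcpy(&b[0], &src[0], count * sizeof(int)); },
        reps);
    // Comparing the destinations keeps the copies observable and
    // confirms both approaches produced the same data.
    t.same_result = (a == b);
    return t;
}
```

Calling, say, `compare_copies(1 << 20, 100)` and comparing `std_copy_ms` with `memcpy_ms` in an optimized build gives a first impression of whether there is any difference worth caring about.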

Jerry Coffin
  • 1
    +1: *Exactly*. The only place where it will make a difference is if you aren't compiling optimized (in which case the call to `std::copy` is not optimized away, but also in which case why are you worried about performance?), if using `memcpy` would give the wrong results (in which case you should be using `memmove` anyhow), or if the little bit of gyrations that `memmove` does to determine if it is safe to use `memcpy` overwhelm the memory copy itself (in which case you are copying a tiny bit of memory). – David Hammen Sep 02 '11 at 16:57
  • *even if the code is theoretically less efficient, it's rarely likely to make any real difference* At least in 2001 this was not the case at all, and on many embedded architectures it's still not the case. See, for example [Mike Wall's "Using Block Prefetch for Optimized Memory Performance"](http://web.mit.edu/ehliu/Public/ProjectX/Meetings/AMD_block_prefetch_paper.pdf). The differences can be dramatic (say 3x faster than naive code). – Kuba hasn't forgotten Monica Dec 13 '13 at 20:50
  • @KubaOber: Do you have some good reason to believe that the standard library on those machines uses particularly naive code (or at least that its `memmove` uses substantially better code than its `std::copy`)? – Jerry Coffin Dec 13 '13 at 21:00
  • The way I understood your implication was that, essentially, any code would be "good enough". It probably will be good enough on modern "mainstream" Intel chips, where, for nontrivial-sized blocks, a simple K&R C implementation of `memcpy` saturates the memory, given a recent compiler. As soon as you're not on the latest and greatest, things get "interesting". – Kuba hasn't forgotten Monica Dec 13 '13 at 22:21
5

std::copy will use memcpy when it is appropriate, so you should just use std::copy and let it do the work for you.

Tony The Lion
  • 4
    `std::copy` is more likely to call `memmove` than `memcpy` because the ranges for `std::copy` are allowed to overlap and if they don't, this is not always statically verifiable. – CB Bailey Sep 02 '11 at 15:53
  • 2
    Checking: `printf '#include <algorithm>\nvoid DoCopy( char* o, const char* i, std::size_t count ) { std::copy( i, i + count, o ); }' | gcc -S -O3 -o - -std=c++98 -x c++ -` gives `jmp memmove` on my system. – CB Bailey Sep 02 '11 at 16:00
  • 1
    @Charles is right, the STL cannot decide to use `memcpy` at compile time as there is no guarantee that the ranges will not overlap. – David Rodríguez - dribeas Sep 02 '11 at 16:29
  • 1
    @Charles: same on VC2010. And I also checked, if you use two vectors and do `std::copy(src.begin(), src.end(), dest.begin())` it also ends up calling `memmove`. In particular it inlines to `call __imp__memmove`. – Sven Sep 02 '11 at 16:39
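To illustrate the overlap point from the comments above, here is a small sketch (the function names are hypothetical). `std::copy` is allowed to write into a range that overlaps its source as long as the start of the destination is not inside the source range, for example when shifting data towards the front of a buffer. `memcpy` gives no such guarantee, while `memmove` does, which is presumably why an implementation would pick `memmove`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Shifts `count` chars starting at buf + offset to the front of buf
// using std::copy. Source and destination may overlap; this is legal
// because the destination start (buf) lies before the source range.
void shift_left_std_copy(char* buf, std::size_t offset, std::size_t count) {
    assert(offset > 0);  // destination start must not be inside the source
    std::copy(buf + offset, buf + offset + count, buf);
}

// The same operation with memmove, which handles any overlap.
// memcpy would have undefined behavior here if the ranges overlap.
void shift_left_memmove(char* buf, std::size_t offset, std::size_t count) {
    std::memmove(buf, buf + offset, count);
}
```

With a buffer holding `"abcdef"`, calling either function with `offset = 2` and `count = 4` leaves `"cdefef"` in the buffer.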