memcpy vs for loop - What's the proper way to copy an array from a pointer?

Question

I have a function foo(int nums[]), which I understand is essentially equivalent to foo(int* nums). Inside foo I need to copy the contents of the array pointed to by nums into some int[10] declared within the scope of foo. I understand the following is invalid:

void foo(int nums[])
{
    myGlobalArray = *nums;
}

What is the proper way to copy the array? Should I use memcpy like so:

void foo(int nums[])
{
    memcpy(&myGlobalArray, nums, 10);
}

or should I use a for loop?

void foo(int nums[])
{
    for (int i = 0; i < 10; i++)
    {
        myGlobalArray[i] = nums[i];
    }
}

Is there a third option that I'm missing?

Possible duplicate of [Why is memcpy() and memmove() faster than pointer increments?](http://stackoverflow.com/questions/7776085/why-is-memcpy-and-memmove-faster-than-pointer-increments) Although this does not mention faster, both snippets are functionally correct, so it is going to come down to that. — Ciro Santilli OurBigBook.com, Mar 09 '16 at 09:41

Oliver Charlesworth · Answer 1 · 2016-02-14T11:03:39.247

72

Yes, the third option is to use a C++ construct:

std::copy(&nums[0], &nums[10], myGlobalArray);

With any sane compiler, it:

should be optimum in the majority of cases (will compile to memcpy() where possible),
is type-safe,
gracefully copes when you decide to change the data-type to a non-primitive (i.e. it calls copy constructors, etc.),
gracefully copes when you decide to change to a container class.
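
A minimal sketch of that one-liner in context, assuming `myGlobalArray` is a global `int[10]` as in the question (the constant `kCount` is a name introduced here for illustration). It uses `nums + kCount`, which is the same one-past-the-end pointer as `&nums[10]`:

```cpp
#include <algorithm>  // std::copy
#include <cassert>
#include <cstddef>

constexpr std::size_t kCount = 10;  // hypothetical name for the array length
int myGlobalArray[kCount];

// Copies kCount ints from nums into myGlobalArray.
// The caller must guarantee that nums points at kCount or more ints.
void foo(const int* nums)
{
    assert(nums != nullptr);
    // nums + kCount is the one-past-the-end pointer; it is never dereferenced.
    std::copy(nums, nums + kCount, myGlobalArray);
}
```

Because the range is expressed in elements rather than bytes, there is no `sizeof` arithmetic to get wrong.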

edited Feb 14 '16 at 11:03

answered Jan 18 '11 at 21:15

Oliver Charlesworth


It's also faster: http://stackoverflow.com/questions/4707012/c-memcpy-vs-stdcopy/9980859#9980859 – David Stone Apr 02 '12 at 18:25

I appear to have left an important word out of that comment: "possibly" – David Stone Oct 26 '13 at 15:20
@Oliver Charlesworth is `&nums[10]` allowed when the size of the array is 10? Would `nums + 10` work instead? – user2635088 Dec 19 '16 at 17:44
@user2635088 `nums[10]` is identical to `*(nums + 10)`, so `&nums[10]` would be `&(*(nums + 10))`, which the compiler will likely optimize to just `nums + 10`. But yes, in general, `&nums[10]` is valid for an array of size 10, as you are allowed to take the address of 1-past-the-end of an array for iteration purposes only. Just don't dereference that address, that would be undefined behavior. – Remy Lebeau Oct 16 '18 at 01:29

score 29 · Accepted Answer · edited Apr 21 '20 at 18:52


memcpy will probably be faster, but you are more likely to make a mistake using it. How much faster may depend on how smart your optimizing compiler is.

Your code is incorrect though. It should be:

memcpy(myGlobalArray, nums, 10 * sizeof(int) );
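
The reason is that the third argument of memcpy is a byte count, not an element count, so passing 10 copies only 10 bytes (two and a half ints when `sizeof(int)` is 4). A small sketch contrasting the two calls, assuming `myGlobalArray` is a real `int[10]`; the function names are made up for illustration:

```cpp
#include <cassert>
#include <cstring>  // std::memcpy

int myGlobalArray[10];

// Buggy: copies 10 bytes, not 10 ints.
void copyTenBytes(const int* nums)
{
    std::memcpy(myGlobalArray, nums, 10);
}

// Correct: sizeof(myGlobalArray) equals 10 * sizeof(int) because
// myGlobalArray is a true array here, not a pointer.
void copyTenInts(const int* nums)
{
    assert(nums != nullptr);
    std::memcpy(myGlobalArray, nums, sizeof(myGlobalArray));
}
```

Note that `sizeof(myGlobalArray)` only gives the full size while the name refers to the array itself; applied to a pointer parameter it would give the size of the pointer.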

edited Apr 21 '20 at 18:52

Gavin


answered Jan 18 '11 at 21:16

Jay


Hey! Why Memcpy would be faster?? – Swanand Jun 06 '12 at 12:41

The authors of the C library have spent a lot of time optimizing it. If you write code to do the same thing, then it depends on how well the compiler optimizes your code. – Jay Jun 06 '12 at 17:46
myGlobalArray is array. Should be `memcpy(myGlobalArray, nums, 10 * sizeof(int));` – MrHIDEn Feb 09 '19 at 11:23
It currently shows what you've written. Perhaps it was edited? – Jay Feb 09 '19 at 15:17

kfsone · Answer 3 · 2018-10-16T01:29:52.103

Generally speaking, the worst case scenario will be in an un-optimized debug build where memcpy is not inlined and may perform additional sanity/assert checks amounting to a small number of additional instructions vs a for loop.

However, memcpy is generally well implemented and takes advantage of things like compiler intrinsics, though this will vary with target architecture and compiler. It is unlikely that memcpy will ever be worse than a for-loop implementation.

People often trip over the fact that memcpy takes its size in bytes, and they write things like this:

// wrong unless we're copying bytes.
memcpy(myGlobalArray, nums, numNums);
// wrong if an int isn't 4 bytes or the type of nums changed.
memcpy(myGlobalArray, nums, numNums * 4);
// wrong if nums is no longer an int array.
memcpy(myGlobalArray, nums, numNums * sizeof(int));

You can protect yourself here by using language features that let you do some degree of reflection, that is: do things in terms of the data itself rather than what you know about the data, because in a generic function you generally don't know anything about the data:

void foo (int* nums, size_t numNums)
{
    memcpy(myGlobalArray, nums, numNums * sizeof(*nums));
}

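Note that you don't need the "&" in front of "myGlobalArray", because arrays automatically decay to pointers. For a true array, "&myGlobalArray" happens to yield the same address (with a pointer-to-array type), so the copy still lands in the right place. If myGlobalArray were a pointer, however, "&myGlobalArray" would be the address of the pointer variable itself, and memcpy would overwrite the pointer rather than the data it points to.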
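
A small sketch to check what the "&" actually yields in each case, assuming a global `int[10]` plus a pointer to it (`myGlobalPointer` and both function names are hypothetical, added only for comparison). For a real array the two addresses coincide, whereas for a pointer variable they differ, which is the case where a stray "&" would clobber the pointer itself:

```cpp
int myGlobalArray[10];
int* myGlobalPointer = myGlobalArray;  // hypothetical pointer for comparison

// True: the array and &array start at the same address (only the types differ).
bool arrayAddressMatches()
{
    return static_cast<void*>(myGlobalArray) ==
           static_cast<void*>(&myGlobalArray);
}

// False: &pointer is where the pointer variable lives, not where it points.
bool pointerAddressMatches()
{
    return static_cast<void*>(myGlobalPointer) ==
           static_cast<void*>(&myGlobalPointer);
}
```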

(Edit note: I'd typo'd int[] nums when I meant int nums[], but I decided that adding C array-pointer-equivalence chaos helped nobody, so now it's int *nums :))

Using memcpy on objects can be dangerous, consider:

struct Foo {
    std::string m_string;
    std::vector<int> m_vec;
};

Foo f1;
Foo f2;
f2.m_string = "hello";
f2.m_vec.push_back(42);
memcpy(&f1, &f2, sizeof(f2));

This is the WRONG way to copy objects that aren't POD (plain old data). Both f1 and f2 now have a std::string that thinks it owns "hello". One of them is going to crash when they destruct, and they both think they own the same vector of integers that contains 42.
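
One way to catch this mistake at compile time is to guard the memcpy with `std::is_trivially_copyable` (available since C++11). The following is only a sketch; `checkedMemcpy`, `Plain` and `Owning` are names invented for illustration:

```cpp
#include <cstddef>
#include <cstring>      // std::memcpy
#include <string>
#include <type_traits>  // std::is_trivially_copyable

// Hypothetical helper: refuses to compile for types that are unsafe to memcpy.
template <typename T>
void checkedMemcpy(T* dst, const T* src, std::size_t count)
{
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy is only safe for trivially copyable types");
    std::memcpy(dst, src, count * sizeof(T));
}

struct Plain  { int a; double b; };       // trivially copyable
struct Owning { std::string m_string; };  // owns memory, not trivially copyable

static_assert(std::is_trivially_copyable<Plain>::value, "Plain is fine");
static_assert(!std::is_trivially_copyable<Owning>::value, "Owning is not");
```

Calling `checkedMemcpy` on an array of `Owning` would fail to compile instead of silently duplicating ownership.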

The best practice for C++ programmers is to use std::copy:

std::copy(nums, nums + numNums, myGlobalArray);

Or, since C++11 (as Remy Lebeau points out):

std::copy_n(nums, numNums, myGlobalArray);

This can make compile time decisions about what to do, including using memcpy or memmove and potentially using SSE/vector instructions if possible. Another advantage is that if you write this:

struct Foo {
    int m_i;
};

Foo f1[10], f2[10];
memcpy(&f1, &f2, sizeof(f1));

and later on change Foo to include a std::string, your code will break. If you instead write:

struct Foo {
    int m_i;
};

enum { NumFoos = 10 };
Foo f1[NumFoos], f2[NumFoos];
std::copy(f2, f2 + NumFoos, f1);

the compiler will switch your code to do the right thing without any additional work for you, and your code is a little more readable.
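
As a rough illustration of that claim, here is the same copy after Foo has grown a std::string (the member `m_name` and the function `copyFoos` are hypothetical, and NumFoos is 3 here to keep the example small). std::copy assigns element by element, so each destination string gets its own copy of the data:

```cpp
#include <algorithm>  // std::copy
#include <cstddef>
#include <string>

struct Foo {
    int m_i;
    std::string m_name;  // hypothetical member added later
};

constexpr std::size_t NumFoos = 3;

// Element-wise copy: std::copy uses Foo's copy assignment for each element,
// so every std::string in dst owns its own buffer afterwards.
void copyFoos(const Foo (&src)[NumFoos], Foo (&dst)[NumFoos])
{
    std::copy(src, src + NumFoos, dst);
}
```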

in `void foo (int[] nums, size_t numNums)` using `int[]` throws an error in Visual studio 2017. It's fine if I use `int*`. However, in that case I can't use `sizeof` on the array - it will always return just the size of pointer. — Kari, Oct 14 '18 at 18:30
@Kari fixed, that was a typo, intended `int nums[]`, but to your second point, that's because of array-pointer equivalence. Putting `T x[]` in a function proto is equivalent to `T* x` because C arrays don't have runtime size information, so they decay to pointers. http://c-faq.com/aryptr/aryptrequiv.html — kfsone, Oct 16 '18 at 01:27
@kfsone: also note that in C++11 and later, there is `std::copy_n()` as well: `std::copy_n(nums, numNums, myGlobalArray);` — Remy Lebeau, Oct 16 '18 at 01:28
I would add some nuance to the answer. memcpy, with proper optimizations enabled, will get inlined *IF* the size param is known at compile time. In that case, memcpy() is ALWAYS the right choice. If size param isn't known, then you need to account for the function call cost vs the speed-up gain from memcpy fine-tuned implementation. That would need to be measured to find out but I suspect that for copies smaller than, lets say, 16 bytes, maybe the for loop is a better choice... https://stackoverflow.com/questions/11747891/when-builtin-memcpy-is-replaced-with-libcs-memcpy — lano1106, Mar 07 '20 at 17:23
@lano1106: On targets like ARM cores that don't support unaligned loads and stores, the generated code for `memcpy` will often make allowances for unaligned operands even in cases where the programmer might know that the objects in question will be aligned. — supercat, Apr 21 '20 at 18:58
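
Regarding the `sizeof` point raised in the comments above: if the caller always has a real array (not a pointer), one possible workaround is to take the array by reference so its length is deduced rather than lost to decay. This is a sketch under that assumption; the template form of `foo` is not from the original answer:

```cpp
#include <algorithm>  // std::copy
#include <cstddef>

int myGlobalArray[10];

// Hypothetical overload: N is deduced from the argument, so the array size
// survives the call instead of decaying to a bare pointer.
template <std::size_t N>
void foo(const int (&nums)[N])
{
    static_assert(N <= 10, "source does not fit in myGlobalArray");
    std::copy(nums, nums + N, myGlobalArray);
}
```

This only works when the size is known at compile time; for dynamically sized data, passing an explicit length as in the answer is still needed.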

score 2 · Answer 4 · answered Jan 18 '11 at 21:14

For performance, use memcpy (or equivalents). It's highly optimised platform-specific code for shunting lots of data around fast.

For maintainability, consider what you're doing - the for loop may be more readable and easier to understand. (Getting a memcpy wrong is a fast route to a crash or worse)

score 2 · Answer 5 · answered Jan 18 '11 at 21:17

Essentially, as long as you are dealing with POD types (Plain Ol' Data), such as int, unsigned int, pointers, data-only structs, etc... you are safe to use mem*.

If your array contains objects, use the for loop, as the = operator may be required to ensure proper assignment.

score 1 · Answer 6 · answered Apr 10 '20 at 11:29

A simple loop is slightly faster for about 10-20 bytes and less (It's a single compare+branch, see OP_T_THRES), but for larger sizes, memcpy is faster and portable.

Additionally, if the amount of memory you want to copy is constant, you can use memcpy to let the compiler decide what method to use.

Side note: the optimizations that memcpy uses may significantly slow your program down in a multithreaded environment when you're copying a lot of data above the OP_T_THRES size mark since the instructions this invokes are not atomic and the speculative execution and caching behavior for such instructions doesn't behave nicely when multiple threads are accessing the same memory. Easiest solution is to not share memory between threads and only merge the memory at the end. This is good multi-threading practice anyway.
