205

I am trying to understand the difference between memcpy() and memmove(). I have read that memcpy() does not handle overlapping source and destination regions, whereas memmove() does.

However, when I execute these two functions on overlapping memory blocks, they both give the same result. Take, for instance, the MSDN example from the memmove() help page, reproduced below.

Is there a better example to understand the drawbacks of memcpy and how memmove solves them?

// crt_memcpy.c
// Illustrate overlapping copy: memmove always handles it correctly; memcpy may handle
// it correctly.

#include <memory.h>
#include <string.h>
#include <stdio.h>

char str1[7] = "aabbcc";

int main( void )
{
    printf( "The string: %s\n", str1 );
    memcpy( str1 + 2, str1, 4 );
    printf( "New string: %s\n", str1 );

    strcpy_s( str1, sizeof(str1), "aabbcc" );   // reset string

    printf( "The string: %s\n", str1 );
    memmove( str1 + 2, str1, 4 );
    printf( "New string: %s\n", str1 );
}

Output:

memcpy():
The string: aabbcc
New string: aaaabb

memmove():
The string: aabbcc
New string: aaaabb
VC.One
user534785
  • 2
    The Microsoft CRT has had a safe memcpy() for quite a while. – Hans Passant Dec 11 '10 at 09:01
  • 44
    I don't think "safe" is the right word for it. A safe `memcpy` would `assert` that the regions don't overlap rather than intentionally covering up bugs in your code. – R.. GitHub STOP HELPING ICE Dec 11 '10 at 12:53
  • 9
    Depends on whether you mean "safe for the developer" or "safe for the end-user". I would argue that doing as told, even if it isn't standards-compliant is the safer choice for the end-user. – kusma Jan 26 '12 at 12:33
  • Since glibc 2.19 this no longer works; the output is `The string: aabbcc New string: aaaaaa The string: aabbcc New string: aaaabb` – askovpen Aug 02 '14 at 20:57
  • You can also see [here](http://www.tedunangst.com/flak/post/memcpy-vs-memmove). – Ren Nov 18 '15 at 08:26
  • 2
    Microsoft's "safe" memcpy() is a fallback to memmove() https://twitter.com/MalwareMinigun/status/737801492808142848 – vobject Jun 01 '16 at 01:33
  • 4
    A good example with pictures on the subject of "What can go wrong with `memcpy(...)` can be found here: [memcpy vs memmove](http://www.equestionanswers.com/c/memcpy-vs-memmove.php). – deralbert Oct 04 '20 at 14:18

11 Answers

156

I'm not entirely surprised that your example exhibits no strange behaviour. Try copying str1 to str1+2 instead and see what happens then. (May not actually make a difference, depends on compiler/libraries.)

In general, memcpy is implemented in a simple (but fast) manner. Simplistically, it just loops over the data (in order), copying from one location to the other. This can result in the source being overwritten while it's being read.

memmove does more work to ensure that it handles the overlap correctly.

EDIT:

(Unfortunately, I can't find decent examples, but these will do). Contrast the memcpy and memmove implementations shown here. memcpy just loops, while memmove performs a test to determine which direction to loop in to avoid corrupting the data. These implementations are rather simple. Most high-performance implementations are more complicated (involving copying word-size blocks at a time rather than bytes).
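Since the linked implementations are not reproduced here, the following is a minimal sketch of the same idea, not any particular library's code. The names naive_copy and overlap_safe_copy are hypothetical, and comparing the addresses as integers assumes a flat address space.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: a forward-only byte loop, like a simple memcpy. */
void *naive_copy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;               /* always low address to high address */
    return dest;
}

/* Hypothetical name: tests the pointers first, like a simple memmove. */
void *overlap_safe_copy(void *dest, const void *src, size_t n)
{
    unsigned char *d = dest;
    const unsigned char *s = src;

    if ((uintptr_t)d <= (uintptr_t)s) {
        while (n--)
            *d++ = *s++;           /* dest at or below src: copy forward */
    } else {
        d += n;
        s += n;
        while (n--)
            *--d = *--s;           /* dest above src: copy backward */
    }
    return dest;
}
```

On a buffer holding "aabbcc", naive_copy(buf + 2, buf, 4) reads bytes it has already overwritten, while overlap_safe_copy(buf + 2, buf, 4) copies backward and keeps the source intact.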

Ionic
developmentalinsanity
  • 2
    +1 Also, in the following implementation, `memmove` calls `memcpy` in one branch after testing the pointers: http://www.student.cs.uwaterloo.ca/~cs350/common/os161-src-html/memmove_8c-source.html – Pascal Cuoq Dec 11 '10 at 09:15
  • That sounds great. Seems like Visual Studio implements a "safe" memcpy (along with gcc 4.1.1, I tested on RHEL 5 as well). Writing the versions of these functions from clc-wiki.net gives a clear picture. Thanks. – user534785 Dec 11 '10 at 09:24
  • 3
    memcpy doesn't take care of the overlapping-issue, but memmove does. Then why not eliminate memcpy from the lib? – Alcott Sep 16 '11 at 12:11
  • 44
    @Alcott: Because `memcpy` can be faster. – Billy ONeal Oct 17 '11 at 17:23
  • 1
    Fixed/webarchive link from Pascal Cuoq above: https://web.archive.org/web/20130722203254/http://www.student.cs.uwaterloo.ca/~cs350/common/os161-src-html/memmove_8c-source.html – JWCS May 29 '20 at 16:43
130

The source and destination regions passed to memcpy must not overlap, or you risk undefined behaviour, while the regions passed to memmove may overlap.

char a[16];
char b[16];

memcpy(a,b,16);           // Valid.
memmove(a,b,16);          // Also valid, but may be slower than memcpy.
memcpy(&a[0], &a[1],10);  // Not valid: the regions overlap (undefined behaviour).
memmove(&a[0], &a[1],10); // Valid.

Some implementations of memcpy might still work for overlapping inputs, but you cannot count on that behaviour. However, memmove must allow for overlapping inputs.
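One way to avoid relying on that behaviour by accident is to check for overlap before calling memcpy. The sketch below is only an illustration: regions_overlap and checked_memcpy are hypothetical names, and the integer comparison of addresses assumes a flat address space.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: nonzero if [a, a+n) and [b, b+n) share any byte. */
int regions_overlap(const void *a, const void *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a;
    uintptr_t pb = (uintptr_t)b;
    return (pa <= pb) ? (pb - pa < n) : (pa - pb < n);
}

/* Hypothetical wrapper: a memcpy that fails loudly on overlap
   (the assert is compiled out when NDEBUG is defined). */
void *checked_memcpy(void *dest, const void *src, size_t n)
{
    assert(!regions_overlap(dest, src, n));
    return memcpy(dest, src, n);
}
```

With the arrays above, regions_overlap(&a[0], &a[1], 10) reports an overlap, so that call should go to memmove instead.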

Gabriel Staples
rxantos
39

Just because memcpy doesn't have to deal with overlapping regions, doesn't mean it doesn't deal with them correctly. The call with overlapping regions produces undefined behavior. Undefined behavior can work entirely as you expect on one platform; that doesn't mean it's correct or valid.

Billy ONeal
  • 13
    In particular, depending on the platform, it's possible that `memcpy` is implemented exactly the same way as `memmove`. That is, whoever wrote the compiler didn't bother writing a unique `memcpy` function. – Cam Dec 11 '10 at 08:46
20

Both memcpy and memmove do similar things.

But to point out one difference:

#include <memory.h>
#include <string.h>
#include <stdio.h>

char str1[7] = "abcdef";

int main()
{

   printf( "The string: %s\n", str1 );
   memcpy( (str1+6), str1, 10 );
   printf( "New string: %s\n", str1 );

   strcpy_s( str1, sizeof(str1), "aabbcc" );   // reset string


   printf("\nstr1: %s\n", str1);
   printf( "The string: %s\n", str1 );
   memmove( (str1+6), str1, 10 );
   printf( "New string: %s\n", str1 );

}

gives:

The string: abcdef
New string: abcdefabcdefabcd
The string: abcdef
New string: abcdefabcdef
Simson
Neilvert Noval
  • 3
    IMHO, this example program has some flaws, since the str1 buffer is accessed out of bounds (10 bytes to copy, buffer is 7 bytes in size). The out of bounds error results in undefined behavior. The differences in the shown results of the memcpy()/memmove() calls are implementation specific. And the example output doesn't exactly match the program above... Also, strcpy_s() is not part of standard C AFAIK (MS specific, see also: https://stackoverflow.com/questions/36723946/why-does-strcpy-s-not-exist-anywhere-on-my-system/36724095#36724095) - Please correct me if I'm wrong. – rel Feb 26 '20 at 20:19
14

Your demo didn't expose memcpy's drawbacks because of a "bad" compiler: it does you a favor in the Debug build. A Release build gives you the same output, but because of optimization.

    memcpy(str1 + 2, str1, 4);
00241013  mov         eax,dword ptr [str1 (243018h)]  // load 4 bytes from source string
    printf("New string: %s\n", str1);
00241018  push        offset str1 (243018h) 
0024101D  push        offset string "New string: %s\n" (242104h) 
00241022  mov         dword ptr [str1+2 (24301Ah)],eax  // put 4 bytes to destination
00241027  call        esi  

The register eax acts here as temporary storage, which "elegantly" sidesteps the overlap issue.

The drawback emerges, at least in part, when copying 6 bytes.

char str1[9] = "aabbccdd";

int main( void )
{
    printf("The string: %s\n", str1);
    memcpy(str1 + 2, str1, 6);
    printf("New string: %s\n", str1);

    strcpy_s(str1, sizeof(str1), "aabbccdd");   // reset string

    printf("The string: %s\n", str1);
    memmove(str1 + 2, str1, 6);
    printf("New string: %s\n", str1);
}

Output:

The string: aabbccdd
New string: aaaabbbb
The string: aabbccdd
New string: aaaabbcc

It looks weird, but this is caused by optimization, too.

    memcpy(str1 + 2, str1, 6);
00341013  mov         eax,dword ptr [str1 (343018h)]  // load the first 4 source bytes
00341018  mov         dword ptr [str1+2 (34301Ah)],eax // store those 4 bytes to the destination, earlier than in the example above
0034101D  mov         cx,word ptr [str1+4 (34301Ch)]  // a second register loads the remaining 2 source bytes, which the previous store has already overwritten
    printf("New string: %s\n", str1);
00341024  push        offset str1 (343018h) 
00341029  push        offset string "New string: %s\n" (342104h) 
0034102E  mov         word ptr [str1+6 (34301Eh)],cx  // store that word to the end of the destination
00341035  call        esi  
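The load/store order in the listing above can be mimicked in C to see where the stale bytes come from. This is only a sketch of that one instruction sequence, not of how memcpy works in general; chunked_copy_6 is a hypothetical name, and the two locals stand in for the registers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy 6 bytes as one 4-byte chunk, then one 2-byte chunk. */
void chunked_copy_6(char *dest, const char *src)
{
    uint32_t first4;   /* plays the role of eax */
    uint16_t last2;    /* plays the role of cx  */

    memcpy(&first4, src, 4);       /* load source bytes 0..3                      */
    memcpy(dest, &first4, 4);      /* store them; this may clobber src[4..5]      */
    memcpy(&last2, src + 4, 2);    /* load source bytes 4..5, possibly stale now  */
    memcpy(dest + 4, &last2, 2);   /* store the last two bytes                    */
}
```

Called as chunked_copy_6(str1 + 2, str1) on "aabbccdd", the second load sees "bb" instead of "cc", which gives "aaaabbbb".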

This is why I always choose memmove when copying between two overlapping memory blocks.

huubby
4

The difference between memcpy and memmove is that

  1. memmove behaves as if the source memory of the specified size were first copied into a temporary buffer and then moved to the destination. So if the memory regions overlap, there are no side effects.

  2. memcpy() gives no such guarantee. The copying may be done directly on the memory, so when the regions overlap we can get unexpected results.

These can be observed by the following code:

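#include <stdio.h>
#include <string.h>

int main(void)
{
  /* 32-byte buffers so that writing 20 bytes at offset 5 stays in bounds */
  char a[32] = "hare rama hare rama";
  char b[32] = "hare rama hare rama";

  memmove(a + 5, a, 20);   /* overlap is allowed */
  puts(a);

  memcpy(b + 5, b, 20);    /* overlap: undefined behaviour, the result varies */
  puts(b);
  return 0;
}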

Output is:

hare hare rama hare rama
hare hare hare hare hare hare rama hare rama
Pimgd
4

C11 standard draft

The C11 N1570 standard draft says:

7.24.2.1 "The memcpy function":

2 The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1. If copying takes place between objects that overlap, the behavior is undefined.

7.24.2.2 "The memmove function":

2 The memmove function copies n characters from the object pointed to by s2 into the object pointed to by s1. Copying takes place as if the n characters from the object pointed to by s2 are first copied into a temporary array of n characters that does not overlap the objects pointed to by s1 and s2, and then the n characters from the temporary array are copied into the object pointed to by s1

Therefore, any overlap on memcpy leads to undefined behavior, and anything can happen: bad, nothing or even good. Good is rare though :-)

memmove however clearly says that everything happens as if an intermediate buffer is used, so clearly overlaps are OK.
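To make the "as if" wording concrete, here is a sketch that takes it literally and really does go through a temporary array. It is an illustration of the specified behaviour, not how real libraries implement memmove; memmove_via_temp is a hypothetical name, and returning NULL on allocation failure is a choice made for this sketch only.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical name: the standard's "as if" description, done literally. */
void *memmove_via_temp(void *dest, const void *src, size_t n)
{
    if (n == 0)
        return dest;

    unsigned char *tmp = malloc(n);
    if (tmp == NULL)
        return NULL;            /* sketch-only convention, not in the standard */

    memcpy(tmp, src, n);        /* src -> temporary array: no overlap */
    memcpy(dest, tmp, n);       /* temporary array -> dest: no overlap */
    free(tmp);
    return dest;
}
```

Both memcpy calls are valid because the temporary array never overlaps either region, so the result should match what memmove produces for the same arguments.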

C++ std::copy is more forgiving however, and allows overlaps: Does std::copy handle overlapping ranges?

Ciro Santilli OurBigBook.com
  • 1
    `memmove` use an extra temporary array of n, so does it use extra memory? But how can it if we haven't given it access to any memory. (It's using 2x the memory). – clamentjohn Mar 27 '19 at 04:42
  • @clmno it allocates on stack or malloc like any other function I'd expect :-) – Ciro Santilli OurBigBook.com Mar 27 '19 at 18:30
  • 1
    I'd asked a question [here](https://stackoverflow.com/questions/55370165), got a good answer too. Thank you. Saw your hackernews [post](https://news.ycombinator.com/item?id=19428700) that went viral (the x86 one) :) – clamentjohn Mar 28 '19 at 04:11
3

As already pointed out in other answers, memmove is more sophisticated than memcpy in that it accounts for memory overlaps. The result of memmove is defined as if src were copied into a buffer and the buffer were then copied into dst. This does NOT mean that the actual implementation uses a buffer; more likely it does some pointer arithmetic.

1

The compiler can optimize memcpy, for example:

int x;
memcpy(&x, some_pointer, sizeof(int));

This memcpy may be optimized as: x = *(int*)some_pointer;
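A minimal sketch of the memcpy form of such a load, wrapped in a helper so it can be used on any address. The name load_u32 is hypothetical. Compilers commonly lower a fixed-size memcpy like this to a plain load where the target permits it, but that is an optimization, not a guarantee.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: read a 32-bit value from any address, aligned or not. */
uint32_t load_u32(const void *p)
{
    uint32_t value;
    memcpy(&value, p, sizeof value);   /* makes no alignment or aliasing assumptions */
    return value;
}
```

Unlike the pointer cast, this form stays valid C even when the address is not suitably aligned for a 32-bit access.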

rockeet
  • 3
    Such an optimization is permissible only on architectures which allow unaligned `int` accesses. On some architectures (e.g. Cortex-M0), attempting to fetch a 32-bit `int` from an address which is not a multiple of four will cause a crash (but `memcpy` would work). If one will either be using a CPU which allows unaligned access or using a compiler with a keyword that directs the compiler to assemble integers out of separately-fetched bytes when necessary, one could do something like `#define UNALIGNED __unaligned` and then `x=*(int UNALIGNED*)some_pointer;`. – supercat Jul 22 '13 at 21:57
  • 2
    Some processors do not allow unaligned int access and crash on code such as `char x[] = "12345"; int i; i = *(int *)(x + 1);`. But some do, because they fix up the copy during the fault. I worked on a system like this, and it took a bit of time to understand why performance was so poor. – user3431262 Mar 18 '14 at 01:48
  • `*(int *)some_pointer` is a strict aliasing violation, but you probably mean that the compiler would output assembly which copies an int – M.M Apr 20 '16 at 01:08
1

The memcpy code given at http://clc-wiki.net/wiki/memcpy confuses me a bit, as it does not give the same output as the library memcpy when I implemented it in the example below.

#include <memory.h>
#include <string.h>
#include <stdio.h>

char str1[11] = "abcdefghij";

void *memcpyCustom(void *dest, const void *src, size_t n)
{
    char *dp = (char *)dest;
    const char *sp = (char *)src;
    while (n--)
        *dp++ = *sp++;
    return dest;
}

void *memmoveCustom(void *dest, const void *src, size_t n)
{
    unsigned char *pd = (unsigned char *)dest;
    const unsigned char *ps = (unsigned char *)src;
    if ( ps < pd )
        for (pd += n, ps += n; n--;)
            *--pd = *--ps;
    else
        while(n--)
            *pd++ = *ps++;
    return dest;
}

int main( void )
{
    printf( "The string: %s\n", str1 );
    memcpy( str1 + 1, str1, 9 );
    printf( "Actual memcpy output: %s\n", str1 );

    strcpy_s( str1, sizeof(str1), "abcdefghij" );   // reset string

    memcpyCustom( str1 + 1, str1, 9 );
    printf( "Implemented memcpy output: %s\n", str1 );

    strcpy_s( str1, sizeof(str1), "abcdefghij" );   // reset string

    memmoveCustom( str1 + 1, str1, 9 );
    printf( "Implemented memmove output: %s\n", str1 );
    getchar();
}

Output :

The string: abcdefghij
Actual memcpy output: aabcdefghi
Implemented memcpy output: aaaaaaaaaa
Implemented memmove output: aabcdefghi

But you can now understand why memmove takes care of the overlapping issue.

singingsingh
-4

I ran the same program using Eclipse, and it shows a clear difference between memcpy and memmove. memcpy() doesn't care about overlapping memory locations, which results in corrupted data, while memmove() will copy the data to a temporary variable first and then copy it into the actual memory location.

When copying data from location str1 to str1+2, the output of memcpy is "aaaaaa". The question is how? memcpy() copies one byte at a time from left to right. With "aabbcc" as in your program, the copying takes place as below:

  1. aabbcc -> aaabcc

  2. aaabcc -> aaaacc

  3. aaaacc -> aaaaac

  4. aaaaac -> aaaaaa
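The steps above can be reproduced with a small traced loop. This is a sketch of a plain left-to-right byte copy, which is an assumption about the copy order and not what every memcpy does; traced_forward_copy is a hypothetical name.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: forward byte copy within one buffer,
   printing the buffer after every byte. */
void traced_forward_copy(char *buf, size_t dest_off, size_t src_off, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        buf[dest_off + i] = buf[src_off + i];   /* may read a byte written earlier */
        printf("%zu. %s\n", i + 1, buf);
    }
}
```

Calling traced_forward_copy(str1, 2, 0, 4) on "aabbcc" leaves "aaaaaa" in the buffer, because the third and fourth reads pick up bytes that the first two writes already replaced.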

memmove() will copy data to temporary variable first and then copy to actual memory location.

  1. aabbcc(actual) -> aabbcc(temp)

  2. aabbcc(temp) -> aaabcc(act)

  3. aabbcc(temp) -> aaaacc(act)

  4. aabbcc(temp) -> aaaabc(act)

  5. aabbcc(temp) -> aaaabb(act)

Output is

memcpy : aaaaaa

memmove : aaaabb

  • 2
    Welcome to Stack Overflow. Please read the [About] page soon. There are various issues to address. First and foremost, you added an answer to a question with multiple answers from 18 months or so ago. To warrant the addition, you would need to provide startling new information. Second, you specify Eclipse, but Eclipse is an IDE that uses a C compiler, but you don't identify the platform where your code is running or the C compiler Eclipse is using. I'd be interested to know how you ascertain that `memmove()` copies to an intermediate location. It should just copy in reverse when necessary. – Jonathan Leffler Jan 15 '15 at 22:41
  • Thanks. About the compiler: I am using the gcc compiler on Linux. There is a Linux man page for memmove which clearly specifies that memmove will copy data into a temporary variable to avoid overlapping of data. Here is the link to that man page: http://linux.die.net/man/3/memmove – Pratik Panchal Jan 30 '15 at 09:11
  • 3
    It actually says "as if", which does not mean that it is what actually happens. Granted it _could_ actually do it that way (though there'd be questions about where it gets the spare memory from), but I would be more than a little surprised if that was what it actually does. If the source address is greater than the target address, it is sufficient to copy from the start to the end (forwards copy); if the source address is less than the target address, it is sufficient to copy from the end to the start (backwards copy). No auxiliary memory is needed or used. – Jonathan Leffler Jan 30 '15 at 15:01
  • try to explain your answer with actual data in code, that would be more helpful. – Haseeb Mir Nov 06 '18 at 20:41