Is memcpy() usually faster than strcpy() (on most real platforms)? (I assume that the size of the string is known.)

If I remember i386 assembly correctly, there are instructions that copy a given number of bytes or words in a loop. That should be the fastest way, whereas an i386 assembly implementation of strcpy() would have to check for '\0' manually in a plain loop.

So I feel that on x86 memcpy() is faster than strcpy().

What about other architectures?

phuclv
porton
  • memcpy might be able to copy more than one character at a time (it knows the length), so I would imagine it would be faster (or rather, no slower), although they serve *different* purposes. See http://fossies.org/dox/glibc-2.19/string_2memcpy_8c_source.html, http://www.danielvik.com/2010/02/fast-memcpy-in-c.html, etc – user2864740 Jul 26 '14 at 01:45
  • Be careful: on some platforms memcpy only works with aligned pointers. Alignment is especially not guaranteed if you copy substrings (based on a variable index). So while memcpy is more efficient, it cannot be used in all cases (you won't notice the problem on Intel, as it allows unaligned access, slowly). – eckes Jul 26 '14 at 01:59
  • @eckes: What platforms would those be!? Some platforms may have memcpy-ish functions which are only useful with aligned pointers, but I don't believe any conforming implementation of memcpy could impose any restrictions on src other than that it identify a block of readable memory of sufficient size which does not alias any part of the destination block. – supercat Jul 26 '14 at 02:08
  • @supercat Yes, actually I was incorrect: it is not memcpy which has problems with unaligned access, but you have to be careful what pointer types you use with the results or input. In the strcpy case you always have char-aligned pointers, which are less efficient but more flexible. (But I am glad I don't have to use C anymore :) – eckes Jul 26 '14 at 02:15
  • @eckes Microsoft's code for `memcpy` from many years ago broke the copy down into three sections: an unaligned prefix, the main body, and an unaligned suffix. Which is to say that alignment issues are transparent to the user, and the bulk of the copy (the main body) is done at maximum aligned speed using full-sized (e.g. 32-bit) transfers. – user3386109 Jul 26 '14 at 02:26
  • Yes, for the same number of bytes moved, memcpy is likely to be several times faster than strcpy. The only exceptions would be very short operations where the complexity of the memcpy setup would swamp the actual copy. – Hot Licks Jul 26 '14 at 02:38
  • Your question is meaningless, or at best incomplete, since strcpy and memcpy aren't interchangeable. To copy a string with memcpy you will need to call strlen, so any savings are lost. – Jim Balter Jul 26 '14 at 07:03
  • There is probably some fancy SSE code that's much faster than a naive loop for strcpy. [Hashcat contains such code](http://hashcat.net/forum/thread-1912.html) for splitting a file into lines, which is a quite similar problem. – CodesInChaos Jul 26 '14 at 07:22
  • @CodesInChaos: The usefulness of a strcpy that's efficient with large strings is a bit limited compared with the value of a fast memcpy, since making memcpy efficient in the large-copy case only requires adding one integer comparison to the cost of any case where SSE setup won't generate a net "win", but code won't know whether setting up SSE for a strcpy would be worthwhile without knowing the length of a string, and code which does know the length of a string generally wouldn't be using strcpy. – supercat Apr 06 '15 at 03:50
  • [`loop` is one of the slowest ways](https://stackoverflow.com/q/35742570/995714); no one uses it anymore. Old libc implementations use [`rep movsb`](https://stackoverflow.com/q/33902068/995714), whereas newer ones use [SIMD](https://stackoverflow.com/q/18314523/995714) to [speed up](https://stackoverflow.com/q/7776085/995714) the copy. [Why are complicated memcpy/memset superior?](https://stackoverflow.com/q/8858778/995714) – phuclv Aug 29 '18 at 15:33

2 Answers

If you know the size of the data to be copied, then memcpy() should be as fast or faster than strcpy(). Otherwise, memcpy() can't be used alone, and strcpy() should be as fast or faster than strlen() followed by memcpy().
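The two cases above can be sketched roughly as follows. This is only an illustration: the function names are made up for this example, and the destination is assumed to be large enough for the string plus its terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Case 1: the length (excluding the terminator) is already known.
   One memcpy call copies len bytes plus the trailing '\0'. */
char *copy_known_length(char *dst, const char *src, size_t len)
{
    assert(src[len] == '\0');   /* the caller's claim about len must hold */
    memcpy(dst, src, len + 1);
    return dst;
}

/* Case 2a: the length is unknown; strcpy looks for the terminator
   while it copies, so the source is read once. */
char *copy_unknown_strcpy(char *dst, const char *src)
{
    return strcpy(dst, src);
}

/* Case 2b: the length is unknown; strlen reads the source once to find
   the terminator, then memcpy reads it a second time to copy it. */
char *copy_unknown_two_pass(char *dst, const char *src)
{
    size_t len = strlen(src);
    memcpy(dst, src, len + 1);
    return dst;
}
```

All three produce the same result; they differ only in how often the source is read and in whether the copy routine knows the size up front.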

However...

A lot of implementations of memcpy() and/or strcpy() and/or strlen() are designed to handle large amounts of data efficiently. This often means additional startup overhead (e.g. determining alignment, setting up SIMD, cache management, etc.), which makes these implementations slow for copying small amounts of data (which is far more likely in well-written code). Because of this, "should be as fast or faster" does not necessarily imply "is as fast or faster". For example, for small amounts of data a memcpy() optimised for large amounts of data may be significantly slower than a strcpy() that wasn't optimised for large amounts of data.

Also note that the main problem here is that generic code (e.g. memcpy() and strcpy()) can't be optimised for a specific case. The best solution would have been to have multiple functions, e.g. memcpy_small() that's optimised for copying small amounts of data and memcpy_large() that's optimised for bad code that failed to avoid copying a large amount of data.
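A minimal sketch of what such a split might look like. Neither memcpy_small() nor memcpy_large() exists in the standard library; the bodies, the copy_dispatch() helper and the 16-byte threshold are all assumptions made for illustration, not measured or taken from any real libc.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SMALL_COPY_LIMIT 16   /* hypothetical threshold, not measured */

/* Hypothetical small-copy routine: a plain byte loop with no setup cost.
   The caller promises that n is small. */
void *memcpy_small(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    assert(n <= SMALL_COPY_LIMIT);
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Hypothetical large-copy routine: here it simply defers to the
   library memcpy as a stand-in for a bulk-optimised implementation. */
void *memcpy_large(void *dst, const void *src, size_t n)
{
    return memcpy(dst, src, n);
}

/* A generic entry point pays one size comparison to pick a strategy. */
void *copy_dispatch(void *dst, const void *src, size_t n)
{
    return (n <= SMALL_COPY_LIMIT) ? memcpy_small(dst, src, n)
                                   : memcpy_large(dst, src, n);
}
```

A caller that already knows its copies are small could call memcpy_small() directly and skip even that comparison. Note that an optimising compiler may rewrite the byte loop itself, so any real benefit would have to be measured.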

Brendan
  • I would expect versions of memcpy() which are optimized for larger blocks would start by checking whether the size is large enough to make large-block optimizations worthwhile and, if not, simply use a small-block routine. It may be that when e.g. copying a two-character string the cost of strcpy ends up being less than that of memcpy, but I would not expect the break-even point to be much beyond that. – supercat Apr 06 '15 at 03:55
  • memcpy has a *much* easier time being efficient for both large and small sizes, because the size is known up front. `strcpy` has to avoid reading into another page past the end of the string, and has to load + examine some source data before it can even pick a small vs. large strategy. Glibc's implementations are optimized pretty well for both, with a fast path that works well for small. (A bit of extra startup overhead is less significant for large copies.) But yeah, a `memcpy_large` could avoid some conditional branches before jumping to the large-copy strategy. Helps more for strcpy! – Peter Cordes Aug 29 '18 at 15:33
  • Code that copies a large amount of data isn't necessarily badly written. For example, if you need to serialize a large amount of data, having to copy it in place to fit a specific format wouldn't surprise me (for example, serializing 100MB of data into JSON or something, with the occasional character needing a \u escape; then you probably have to copy the whole damn thing, possibly even one UTF-8 character at a time) – hanshenrik Jan 05 '20 at 14:51
  • @hanshenrik: So many options... create a "copy on write" clone of it (`fork()`?) so most isn't actually copied, don't modify it at all and only build a list of differences, combine both of those (split it into blocks of whatever size yourself and track which blocks were modified), ... – Brendan Jan 06 '20 at 15:39

On almost any platform, memcpy() is going to be faster than strcpy() when copying the same number of bytes. The only time strcpy() or any of its "safe" equivalents would outperform memcpy() would be when the maximum allowable size of a string would be much greater than its actual size. If one has a buffer which could hold a string of up to a million characters, and one wants to copy the string to another million-byte buffer, memcpy() would have to copy a million bytes even if the string in the first buffer was only two characters long. The strcpy() function or its equivalents, however, would be able to early-exit, and would probably take less time to copy two characters than memcpy() would require to copy a million.
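The scenario above can be sketched as follows. The function names are invented for this example, and both buffers are assumed to have the same capacity, with the source string terminated inside its buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies the whole buffer, however short the string inside it is:
   cap bytes are read from src and written to dst. */
void copy_whole_buffer(char *dst, const char *src, size_t cap)
{
    memcpy(dst, src, cap);
}

/* Copies up to and including the terminator, then stops early.
   Assumes dst is at least as large as the buffer holding src. */
void copy_string_only(char *dst, const char *src)
{
    strcpy(dst, src);
}

/* If the string length is tracked separately, memcpy can be told to
   copy just the string and its terminator instead of the full buffer. */
void copy_tracked_length(char *dst, const char *src, size_t len, size_t cap)
{
    assert(len < cap);          /* room for the terminator */
    memcpy(dst, src, len + 1);
}
```

With a two-character string in a million-byte buffer, copy_string_only() and copy_tracked_length() each write three bytes, while copy_whole_buffer() writes all million.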

Iharob Al Asimi
supercat
  • That depends on whether you're telling `memcpy` to copy the entire buffer, or you know the length of the string in advance (maybe it's stored separately in a variable) and you can tell `memcpy` to copy just that much. – Wyzard Jul 26 '14 at 02:33
  • And neither `memcpy()` nor `strcpy()` can be used safely unless you know the space available in the target area _and_ the space used in the source area. So, you should always use `memcpy()` because you should always know how long a string you are dealing with. – Jonathan Leffler Jul 26 '14 at 02:47
  • Well, `strcpy` is safe if you know that the destination buffer is at least as big as the source buffer, even if you don't know the length of the actual string. (Assuming the string doesn't extend past the end of the source buffer, in which case you have a problem even before you call `strcpy`.) – Wyzard Jul 26 '14 at 02:52
  • Indeed. There are plenty of real-world situations where you know a *bound* on the source string length without knowing the exact length. For example it could be the string obtained by `fgets`, the tail of a string of known length obtained via `strchr` or `strstr`, etc. – R.. GitHub STOP HELPING ICE Jul 26 '14 at 03:31
  • As long as the string is short, doing `strlen` followed by `memcpy` is probably comparable in performance to `strcpy`. For huge strings though, reading the string twice would be very expensive. – R.. GitHub STOP HELPING ICE Jul 26 '14 at 03:34
  • Any answer to the OP's question should note that strcpy and memcpy aren't interchangeable so the question is incomplete ... it depends on how the length is determined and/or whether the data contains a NUL. – Jim Balter Jul 26 '14 at 07:07
  • @JimBalter: The two would be interchangeable in cases where one had a known number of non-zero bytes followed by a zero byte; that's actually a fairly common situation, given that many length-tracked-string libraries add a gratuitous zero byte after every string. – supercat Apr 06 '15 at 03:58
  • I **said** that "it depends on ... whether the data contains a NUL". Sheesh. – Jim Balter Apr 06 '15 at 09:57