memcpy zero bytes into const variable - undefined behavior?

Question

In C and C++, is it undefined behavior to memcpy into a const variable when the number of bytes to be copied is zero?

int x = 0;
const int foo = 0;
memcpy( (void *)&foo, &x, 0 );

This question is not purely theoretical. I have a scenario in which memcpy is called and if the destination pointer points to const memory, then the size argument is guaranteed to be zero. So I'm wondering whether I need to handle it as a special case.

Why use memcpy in C++ at all? That's what std::copy is for. The whole (void*) cast will disregard any constness and type safety (that's so important in C++). Also make sure you ask your question specifically for "C" and "C++"; they're different languages with different rules – Pepijn Kramer, Oct 09 '22 at 14:38
Presumably, if the destination is a pointer to `const` memory, then it is an invalid pointer and the behaviour is undefined according to [cppreference](https://en.cppreference.com/w/cpp/string/byte/memcpy). — Adrian Mole, Oct 09 '22 at 14:41
Does this answer your question? [memcpy with destination pointer to const data](https://stackoverflow.com/questions/12309600/memcpy-with-destination-pointer-to-const-data) — possum, Oct 09 '22 at 14:42
If you are copying `0` bytes, then you are not writing to protected memory. You are not writing anything. — Weather Vane, Oct 09 '22 at 14:44
Why would this be undefined? The wonky pointer casts are usually legal, it's the dereferencing (or writing to the result of one) that's illegal. – HolyBlackCat, Oct 09 '22 at 14:45
@PepijnKramer The library is C but should also compile in/be compatible with C++. — Jackson Allan, Oct 09 '22 at 14:45
OK, I see. In that case you might want to have two overloaded C++ functions calling this library function, one for const and one for non-const foo, and raise an error for the const version, since it's not about whether 0 bytes are copied; it's about const correctness. – Pepijn Kramer, Oct 09 '22 at 14:48
@PepijnKramer the question isn't about copying to a `const` pointer. It's about passing a pointer that was originally `const` when there is nothing to copy anyway. — Weather Vane, Oct 09 '22 at 14:50
@WeatherVane IMO it is. You should not overwrite a const variable, not even with memcpy. Looking at the number of bytes being zero is just "working" around the problem. — Pepijn Kramer, Oct 09 '22 at 14:55
@PepijnKramer it *isn't* overwriting a `const` variable, as the question makes quite clear. `memcpy()` won't do anything. It won't dereference anything, or attempt to write anywhere. – Weather Vane, Oct 09 '22 at 14:56
@PepijnKramer You may have missed the part of the question that makes it clear that this is not a practical question. It is a hypothetical question designed to explore the details of the language's rules. — François Andrieux, Oct 09 '22 at 14:58
Whether or not this is undefined behavior this conundrum is easily solved simply by adding an `if` statement that checks the number of bytes to copy and calling memcpy only if it is not 0. I would expect modern C++ compilers to compile the whole thing away, making the whole thing a moot point without any worries of whether this is undefined behavior, or not. — Sam Varshavchik, Oct 09 '22 at 15:05
@FrançoisAndrieux Ok right. I think I just get hung up on the "C" style (void*) cast. Nothing in memcpy (C++ standard) seems to mention what happens if number of bytes is 0. Exploring a bit on godbolt, no code is emitted when copying 0 bytes (https://godbolt.org/z/9b34fPzrb) — Pepijn Kramer, Oct 09 '22 at 15:06
@HolyBlackCat The standard imposes some limitations concerning `memcpy` that make some things surprisingly undefined behavior. For example `memcpy( NULL, NULL, 0 )` is technically undefined behavior because the pointers passed in must be valid, even though no copy is actually occurring. As for my original question, I couldn't find anything in the standard covering this exact scenario, though there may be something in there. – Jackson Allan, Oct 09 '22 at 15:07
@PepijnKramer "Why use memcpy in C++ at all?" - there are several situations/corners in C++ where the only way to do type punning without [UB](https://en.cppreference.com/w/cpp/language/ub) is to go via `memcpy`, so it's not unreasonable to see it in C++ code. — Jesper Juhl, Oct 09 '22 at 15:09
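As an illustration of the type-punning use mentioned in the comment above, here is a minimal sketch. The function name `float_bits` is hypothetical, and the code assumes `float` and `uint32_t` have the same size. It is written in C to match the library in question; the same `memcpy` call is the form the comment describes for C++.

```c
#include <stdint.h>
#include <string.h>

/* Assumption: float is 32 bits wide on this platform. */
_Static_assert(sizeof(float) == sizeof(uint32_t),
               "float and uint32_t must have the same size");

/* Hypothetical helper: returns the object representation of a float as a
   32-bit integer by copying bytes rather than casting pointers. */
static uint32_t float_bits(float f)
{
    uint32_t bits = 0;
    memcpy(&bits, &f, sizeof bits);
    return bits;
}
```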
@PepijnKramer My actual call doesn't use 0 as a literal but another variable that will be 0 if the destination pointer points to `const` memory or non-zero if it points to writable memory. So it's doubtful that the compiler will simply omit the call altogether, as it would in my trivial example. – Jackson Allan, Oct 09 '22 at 15:10
@JesperJuhl: From C++20 onward, doesn't `std::bit_cast` take care of most or all of those situations? — Nate Eldredge, Oct 09 '22 at 18:27
@NateEldredge It might. I haven't researched in detail personally. — Jesper Juhl, Oct 09 '22 at 18:28
@JacksonAllan I checked on Godbolt and it seems compilers do not omit the zero check: https://godbolt.org/z/vK4Y4KKnh — Eric M Schmidt, Oct 09 '22 at 18:31
@EricMSchmidt: Since gcc calls an implementation of memmove which isn't bundled with gcc, it has no way of knowing whether that function might malfunction if passed invalid pointers with a size of zero. Since `ZeroCountCheckedMemcpy` would have defined behavior in that case but `memcpy` would not, omitting the check could adversely affect the behavior of what should be a Strictly Conforming C Program. – supercat, Oct 10 '22 at 20:20

Nate Eldredge · Answer 1 · 2022-10-10T14:17:28.377

38

c c17

The older question "Is it guaranteed to be safe to perform memcpy(0,0,0)?" points out 7.1.4p1:

Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow: If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after promotion) not expected by a function with variable number of arguments, the behavior is undefined.

The prototype for memcpy is

void *memcpy(void * restrict s1, const void * restrict s2, size_t n);

where the first parameter is not const-qualified, and &foo points to non-modifiable storage. So this code is UB unless the description of memcpy explicitly states otherwise, which it does not. It merely says:

The memcpy function copies n characters from the object pointed to by s2 into the object pointed to by s1.

This implies that memcpy with a count of 0 does not copy any characters (which is also confirmed by 7.24.1p2 "copies zero characters", thanks Lundin), but it does not exempt you from the requirement to pass valid arguments.
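Under that reading, the special case from the question does need handling. A minimal sketch of one way to do it, assuming a hypothetical helper name `copy_bytes` (not part of any library): the count is checked first, so a pointer to `const` storage paired with a zero size never reaches `memcpy`.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: forwards to memcpy only when there is something to
   copy, so dst is never handed to memcpy when n == 0. */
static void *copy_bytes(void *dst, const void *src, size_t n)
{
    if (n != 0)
        memcpy(dst, src, n);
    return dst;
}
```

With this in place, the call from the question would become `copy_bytes((void *)&foo, &x, 0)`, which returns without calling `memcpy` at all. The cast alone does not write to `foo`.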

edited Oct 10 '22 at 14:17

answered Oct 09 '22 at 18:36

Nate Eldredge

Since C++ doesn't relax the requirements of the C library, this is equally true for C++. – Passer By Oct 10 '22 at 05:51
I suspect this will be a Schrödinger's bug - there is absolutely no reason why calling it in the way you describe would cause any problems whatsoever; but as soon as you actually implement it, you will discover that there is exactly one library on the planet that crashes when you do this, and it's the one you are using. – Betty Crokker Oct 10 '22 at 12:23
@BettyCrokker Or worse, the library would be perfectly fine if you actually called it. But the compiler sees the call as a special cased memcpy, and infers it can't happen and removes the calling code path. – user1937198 Oct 10 '22 at 13:40
"This implies that memcpy with a count of 0 does nothing" C17 7.24.1/2 explicitly states that memcpy (a copy function) copies zero bytes in case the size parameter is zero. – Lundin Oct 10 '22 at 14:12
If the question has already been answered somewhere, it should be closed as a dup – Language Lawyer Oct 10 '22 at 15:58
I will gladly complain to any compiler vendor that makes `memcpy(NULL, NULL, 0);` not work. Requiring the argument to be readable (let alone writable) on zero bytes is a perverse reading of the standard. – Joshua Oct 10 '22 at 16:26
@Joshua: A major goal of the Standard is to allow compilers to be as useful as possible. If on some platform, an implementation that behaved oddly when given `memcpy(0,0,0)` would be genuinely more useful for some task than one that would treat it as a no-op, I think the Standard would be intended to allow such treatment. If there is no case where such treatment would be useful, then nobody should have any reason to care about whether the Standard would allow such treatment, and thus there would be no reason for the Standard to spend ink forbidding it. – supercat Oct 10 '22 at 18:14
@supercat: git had to wrap memmove already because of a similar bug where the function worked just fine but the compiler's optimizer would occasionally assume the argument to memmove was not null and make improper optimizations on that assumption. Unfortunately I don't know which compiler was the offender. – Joshua Oct 10 '22 at 18:18
@Joshua: Ironically, such "optimizations" actually end up forcing less efficient code generation, since a compiler that processes `if (size) memcpy(dest, src, size);` by calling an external library function can't omit the `if`, but of course the function will have to do its own size==0 check in addition to the one performed by the calling code. One of the design principles behind C was that if on some platform no machine code would be necessary to ensure acceptable behavior in some case, neither the programmer nor compiler should have to write code to handle that case. – supercat Oct 10 '22 at 18:25
@Joshua: Someone seeking to write a quality compiler, given a choice between having it process some programs in more useful ways or less useful ways, should make a bona fide effort to have it process the programs in the more useful way. The fact that clang and gcc use the Standard as an excuse to behave in gratuitously meaningless fashion doesn't imply the Standard is defective, but merely that the authors aren't making a bona fide effort to produce maximally-useful compilers. – supercat Oct 10 '22 at 18:29
@supercat: You'd expect that a 2022 compiler inlines `memmove` and thus eliminates any redundant checks. "Expect" as in "file a bug report if it doesn't" – MSalters Oct 10 '22 at 18:32
@MSalters: On many platforms, the per-byte cost of a simple in-line `memcpy` or `memmove` will be so much greater than that of an optimized one that the former would only be faster when copying fewer than about 16 bytes. A well-designed C implementation should have multiple different library functions which would be employed in different scenarios (e.g. where the destination is known to be word-aligned but the source isn't, and the size is known to be a multiple of 4), but neither clang nor gcc is designed to accommodate such things. – supercat Oct 10 '22 at 18:40
@MSalters: Have you *seen* the contents of a modern memmove? Inlining it is a mistake unless the compiler has a lot more info than usual. (It rarely knows the actual alignment of a char* pointer for example.) – Joshua Oct 10 '22 at 18:41
@MSalters: Also, take a look at https://godbolt.org/z/v8j8v49E5 and tell me what you think. – supercat Oct 10 '22 at 18:43
@supercat: I know that there are indeed quite some reasonable different implementations, optimal for different cases. That's why I expect `memmove` itself to be inlined, so your "n<16" check can be optimized for the many cases where that size is some `sizeof Foo`. – MSalters Oct 10 '22 at 18:49
@Joshua: I have - saw it on a SO question not that long ago. It's not uncommon to pass `&someStackObject` for instance, where the optimizer knows it's aligned and can reasonably guess the object is in L1 cache. – MSalters Oct 10 '22 at 18:51
@MSalters: In the linked godbolt example, a compiler would know that n is exactly 7, and would know that the start of the destination address is one byte before the start of the source, but would generate a machine code instruction to call library function `memmove`. – supercat Oct 10 '22 at 19:06
@supercat: I saw it. Tricky one - either `p` or `p+1` is misaligned, if not both. And the function is basically a special case of memmove. Experimenting a bit more, GCC is the exception. It also calls `memmove` on ARM. MSVC and clang inline the code for x64 and ARM, and ICC inlines it too. – MSalters Oct 11 '22 at 00:37
@MSalters: The "Gratuitously Clever Compiler" is exceptional in quite a few ways. What I find really funny with the example I quoted is that gcc takes code that's written as a loop, and converts it into a call to an outside `memmove` function. If gcc had come with a multiple-entry library function like `memcpyup8: ldr r2,[r0,#0] / str r2,[r1,#0] / memcpyup7: ldr r2,[r0,#1] / str r2,[r0,#1] / ...`, calling one of its entry points could offer better performance than using a loop, and be more compact in cases where one function could be called from multiple places in the program, but... – supercat Oct 11 '22 at 05:17
...I am doubtful that a memmove function could be written to be faster than a simple in-line loop, or that the slight space savings from the memcpy call would be appreciated by programmers who wrote the loop instead of a `memcpy` call. – supercat Oct 11 '22 at 05:18
@supercat: I've seen a memmove implementation that would at first call check whether or not the CPU had SIMD support and swap in an alternate version of itself that could move 16 bytes at a time. The compiler's not emitting code that does *that*. – Joshua Oct 11 '22 at 23:52
@Joshua: Probably not, though in the days before write-protected code segments, both the PC and IIRC the Macintosh had some conventions for code sequences that could be patched automatically if a numeric coprocessor was present. – supercat Oct 12 '22 at 04:50

supercat · Answer 2 · 2022-11-21T16:06:52.957

-2

It's clear that on the vast majority of platforms, an implementation which would process memcpy(anything, anything, 0) as a no-op, regardless of the validity of the source and destination pointers, would be in every way, in essentially every non-contrived scenario, as good or better than one that does anything else.

The way the Standard is written, however, could be interpreted as specifying that compilers are allowed to treat as UB any situation where the destination address is not associated with writable storage.

If one is using an implementation that seeks to process corner cases applying the philosophy documented in the Rationale document published by the authors of the Standard, without regard for whether the Standard unambiguously mandates such behavior, all memcpy and memmove operations where the size is zero will be reliably processed as no-ops. If the size will often be zero, there may be performance advantages to skipping a memcpy or memmove call in the zero-size case, but such a check would never be required for correctness.

If, however, one wishes to ensure reliable compatibility with compiler configurations that aggressively assume that code will never receive inputs that trigger corner cases that aren't 100% unambiguously mandated by the Standard, and are designed to generate nonsensical code if such inputs are received, then it will be necessary to add a size==0 check in any case where a zero size might be accompanied by anything other than a pointer to writable storage, recognizing that such a check may negatively affect performance in situations where the size is very seldom zero.

edited Nov 21 '22 at 16:06

answered Oct 10 '22 at 06:40

supercat

Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackoverflow.com/rooms/248719/discussion-on-answer-by-supercat-memcpy-zero-bytes-into-const-variable-undefin). – Samuel Liew Oct 11 '22 at 01:49
@supercat Every single time someone asks a question about undefined behavior in C on this site, you post a version of this rant, and I have to ask, why do you keep posting it _here_, where none of the people whose minds you need to change will read it? You could write a position paper for the C committee. You could bring it up on the GCC or LLVM development mailing lists. You could do some actual research to back up your assertions and then publish in PLDI or ASPLOS. You have _better options_ than getting ignored here. – zwol Oct 12 '22 at 13:41
@zwol: My answer directly ties in with the question asked, mentioning why guaranteeing defined behavior might potentially increase the cost of straightforward implementation on some platforms. There is no reason why anyone who isn't targeting such platforms *should* need to worry about such things, and 20 years ago I think pretty much everyone would have agreed that there was no need to worry about them, but a lot of code is processed with compilers that gratuitously *make* it necessary to worry about such things, lest compiler "optimizations" facilitate arbitrary remote code execution. – supercat Oct 12 '22 at 14:29
@zwol: Can someone be a good C programmer today without understanding both (1) the Standard was not written with the intention of making programmers jump through gratuitous hoops, and (2) some compilers are designed to behave in dangerously nonsensical fashion unless programmers jump through gratuitous hoops not intended by the Standard? While #2 may seem like an outrageous claim that I would not expect people to believe without evidence, since it's outrageous to think that supposedly-general-purpose compilers would be designed in such fashion, the only way one could say that they weren't... – supercat Oct 12 '22 at 14:54
...would be to argue either that the hoops required to avoid nonsensical behavior were required by the authors of the Standard, or be unaware of the hoops that some compilers require. Perhaps what's needed is a retronym to distinguish the family of dialects the C Standard was chartered to describe from the one gcc and clang seek to process, so as to allow most of these questions to be easily and non-controversially answered "defined in all commonplace dialects of [retronym] but not defined in gcc-c or clang-c." – supercat Oct 12 '22 at 14:59
Nope. _Nobody here cares._ Posters on this site only care about what you actually have to do to get your code to work today, which means accepting current-generation compilers' interpretations of the standard as what C is. Again, you'd be much better served to write up a N-document for the committee, or an academic paper on how much """legacy""" code has actually been broken by current compilers, or really _anything besides_ writing more or less the same 500 words over and over again on a site where nobody cares. – zwol Oct 12 '22 at 16:05
If you want to think that makes people here "bad C programmers", you go right ahead. – zwol Oct 12 '22 at 16:06
@zwol: Perhaps I wasn't quite accurate: people who use tools should be familiar with how those tools work, including potentially unexpected aspects. Further, the authors of clang and gcc have stated that in cases where the Standard doesn't mandate that a piece of code be processed meaningfully, they should not be expected to give notice if the next version of their compiler arbitrarily changes the behavior of such code. – supercat Oct 12 '22 at 16:16
@zwol: I've rewritten the answer to be more focused on practical aspects. Better? – supercat Oct 12 '22 at 16:30
@supercat Yeah, I'll retract my downvote on that basis, but _please_ consider transferring your efforts to a venue where it might actually bring about the change you want. – zwol Oct 12 '22 at 17:48
Two cents about UB. To my knowledge, the Linux kernel is built using `-fno-strict-aliasing`, `-fno-delete-null-pointer-checks`, and `-fno-strict-overflow`, meaning that the Linux kernel relies on a particular compiler's behavior in case of UB (e.g. that code like `if (i+1 > i)`, where `i` is a signed integer, is NOT folded to `if (1)`). Perhaps the Linux kernel code can be changed/revised/fixed, so these `no-` flags can be removed, leading to a perf. increase. – pmor Oct 26 '22 at 15:06
@pmor: There are many situations where machine code which was *agnostic* to possibilities of things like integer overflow, thus allowing e.g. `(x+y)>y` to be replaced with `x>0`, could be more efficient than code which rigidly defines behavior in all such cases. There is a huge difference between that, however, and the kinds of optimizations gcc performs around integer overflow, allowing even something as benign as `uint1 = (ushort1*ushort2) & 0xFFFF;` to arbitrarily corrupt memory if `ushort1` exceeds `INT_MAX/ushort2`. – supercat Oct 26 '22 at 15:16
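To make the arithmetic in that last example concrete: on a platform where `unsigned short` is 16 bits and `int` is 32 bits (an assumption, though a common one), both operands are promoted to signed `int` before the multiplication, so a product above `INT_MAX` is signed overflow. A minimal sketch of a version that sidesteps the promotion, using a hypothetical function name:

```c
/* Hypothetical helper: low 16 bits of the product of two unsigned shorts.
   Multiplying by 1u first converts the operands to unsigned int, so the
   multiplication wraps modulo UINT_MAX + 1 instead of overflowing a
   signed int. */
static unsigned mul_low16(unsigned short a, unsigned short b)
{
    return (1u * a * b) & 0xFFFFu;
}
```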
@pmor: BTW, I would hope Linux also includes a flag to prevent the "optimizations" that would cause side-effect-free infinite loops to arbitrarily corrupt memory even if every single operation performed with them would be defined if processed individually. – supercat Oct 26 '22 at 15:18
@supercat I guess that not all C programmers know that ["implementations ... may also treat pointers based on different origins as distinct even though they are bitwise identical"](https://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_260.htm) ([demo](https://godbolt.org/z/EsYMKWWWW), GCC bug [61502](https://gcc.gnu.org/bugzilla/show_bug.cgi?id=61502)). I don't know whether such implementations (e.g. GCC) provide an option to disable such treatments. Re: "side-effect-free infinite loops to arbitrarily corrupt memory": can you elaborate? – pmor Oct 27 '22 at 12:42
@pmor: If clang in either C or C++ mode, or gcc in C++ mode, is given something like `unsigned test(unsigned x) { unsigned i=1; while((i & 255) != x) i*=3; if (x < 256) arr[x] = 1; return i; }`, then in situations where the return value is unused the store to `arr[x]` will be performed unconditionally. If the Standard were written in a way that could accommodate behavior inconsistent with sequential program execution without defenestrating all requirements, then a good abstraction model would allow `x < 256` to be replaced with `((i & 255)==x) && (x < 256)`, which could... – supercat Oct 27 '22 at 15:35
...then be replaced with `((i & 255)==x)`, and that could in turn be consolidated with the previous check of that condition in the loop *if such check were actually performed*, but such consolidation should cause the check outside the loop to be replaced with an *artificial* data dependency on the computation of `(i & 255) == x` within the loop, rather than being dropped altogether. This should in turn preclude the elimination of the loop, unless a compiler recognizes that it would be more efficient to drop the loop and retain the post-loop comparison than vice versa. – supercat Oct 27 '22 at 15:43
@pmor: One of the goals of the Standard is to avoid forbidding optimizations that might, individually, be useful in some circumstances. It isn't designed to identify a set of optimizations that may be safely combined in arbitrary ways without limit. – supercat Oct 27 '22 at 15:46
@pmor: Clang and gcc use an abstraction model that allows for the possibility of program executions where pointers might *be* coincidentally bitwise identical without being able to identify the same object, but does not allow for the possibility that such pointers might be used to access objects after they are *observed* to be identical. – supercat Oct 27 '22 at 16:24
@supercat Consider [this](https://godbolt.org/z/sMaaoKGa3). There per abstract machine no `xxx` is printed. However, per ICC `xxx` is printed. Can a conforming C11 compiler eliminate infinite recursion per C11 5.1.2.3/4? – pmor Oct 31 '22 at 15:39
@pmor: Practical conforming *non-optimizing* compilers for some platforms might by chance output `xxx` if just the right amount of memory happened to be available when the program launched, and the code for `main()` happened to get placed in memory before the code for `f()`, such that the stack frames that were generated for calling `f()` happened to overwrite the code for `f()` with machine instructions to jump to a spot in `main()` just past the call to `f()`. The way the Standard is written, there are very few circumstances where *anything* an otherwise-conforming implementation might do... – supercat Oct 31 '22 at 16:01
...in response to any particular program would render it non-conforming, and provided that an implementation issues at least one diagnostic for all programs requiring one (a requirement that could be satisfied by unconditionally outputting a "Warning: This program's diagnostics are garbage" message), all such situations would either involve `#error` directives or source programs that exercise at least one translation limit given in N1570 5.2.4.1. Since the program you linked does neither of those things, an implementation that output at least one diagnostic need not meet any other requirements. – supercat Oct 31 '22 at 16:04
@pmor: The Standard tries to portray itself as a "contract" between programmers and implementations, but if one looks at the actual requirements imposed upon conforming programs and conforming implementations, it exercises almost no meaningful normative authority. – supercat Oct 31 '22 at 16:13
If you specified a bytecount of 0, wouldn't it decrement to 0xFFFFFFFF and have to reach zero again to stop? – puppydrum64 Nov 21 '22 at 15:59
@puppydrum64: The behavior of memcpy is specified as treating a zero length copy as a request to copy zero bytes. If a compiler were bundled with its own memory-copy function(s), and it could determine at a particular call site that the length could never be zero, the compiler could invoke a memory-copy function which copied the first byte unconditionally, and could thus be a tiny bit faster than one which had to exit early in the length-is-zero case. – supercat Nov 21 '22 at 16:04
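The two loop shapes being discussed in these last comments can be sketched as follows. This is only an illustration with hypothetical function names, not how any particular library implements `memcpy`: the first loop tests the count before every byte and so handles zero, while the second copies the first byte unconditionally and is only correct when the caller guarantees `n > 0`.

```c
#include <stddef.h>

/* Tests n before each byte, so n == 0 copies nothing. */
static void copy_any(unsigned char *dst, const unsigned char *src, size_t n)
{
    while (n != 0) {
        *dst++ = *src++;
        n--;
    }
}

/* Copies the first byte before testing n. Precondition: n > 0.
   With n == 0 the counter would wrap around to SIZE_MAX, which is the
   runaway loop the comment above asks about. */
static void copy_nonzero(unsigned char *dst, const unsigned char *src, size_t n)
{
    do {
        *dst++ = *src++;
    } while (--n != 0);
}
```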
