15

In particular, is the following well-defined, or does it exhibit undefined behavior?

memcmp(0, 0, 0);

Does this differ between C and C++? Ideally, please provide a quote from the standard(s).

avakar
  • Does this run? You will be reading memory from NULL, I don't see how that could work – Eric May 03 '13 at 15:53
  • 2
    Hit the close button too early. This is about `memcmp`, not `memcpy`. Voting to reopen. – Fred Foo May 03 '13 at 15:54
  • 4
    @Eric The pointers shouldn't get dereferenced if the `count` parameter is `0`. But that being said, the question asks whether the standard guarantees this behavior, which I don't know the answer to. – Praetorian May 03 '13 at 15:54
  • @larsmans Although the accepted answer there also answers this question as far as C is concerned. (Undefined behaviour, for those who don't want to follow the link.) – Daniel Fischer May 03 '13 at 15:56
  • 2
    @Eric, that wouldn't be a good way to test. It may run on his system with his compiler and fail miserably when compiled elsewhere. In cases like this one should resort to what the standard says to minimize the chance of failure. – Remo.D May 03 '13 at 16:02
  • @Remo.D Yeah, but if it fails on his own system, then we already have our answer. My question was how can this not fail, which Praetorian explained – Eric May 03 '13 at 16:03
  • @Eric, "but if it fails on his own system, then we already have our answer". But do we? I actually bumped into the problem on msvc with optimization; it does not immediately follow that the call is UB, it might as well be a bug in the optimizer. – avakar May 03 '13 at 17:00

2 Answers

33

In particular, is the following well-defined, or does it exhibit undefined behavior?

It's undefined. C99 7.21.1/2 says about all the string functions:

Unless explicitly stated otherwise in the description of a particular function in this subclause, pointer arguments on such a call shall still have valid values

and the description of memcmp in 7.21.4.1 doesn't explicitly state otherwise.

Does this differ between C and C++?

No, C++ defers to C for its definition of the C library functions, and doesn't have anything special to say about memcmp.
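For code that may legitimately end up with null pointers and a zero length, one way to stay within the rule quoted above is to skip the library call entirely when there is nothing to compare. The following is only a minimal sketch; `memcmp_or_zero` is a hypothetical helper name and not part of any standard library.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a standard function): compares n bytes, but
   never hands the pointers to memcmp when n is zero. */
static int memcmp_or_zero(const void *s1, const void *s2, size_t n)
{
    if (n == 0) {
        return 0;   /* empty ranges compare equal; pointers are not examined */
    }
    return memcmp(s1, s2, n);   /* here both pointers must be valid */
}
```

With this wrapper, `memcmp_or_zero(NULL, NULL, 0)` never reaches `memcmp`, so the question of whether a null pointer is a "valid value" does not arise for the zero-length case.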

Mike Seymour
  • The relevance of the second quote is made clearer by the leading context : `Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. Unless explicitly stated otherwise ...` – user295691 May 03 '13 at 18:17
  • 1
    Accepted and thank you. One more question though: does "valid value" for a pointer really mean "non-null"? Is there a quote to that effect? – avakar May 04 '13 at 12:12
  • 2
    @avakar: Yes, if you really want to follow the standard all the way to the bitter end, then my quote references 7.1.4 for the definition of "valid", and that specifically includes "a null pointer" in a list of examples of "invalid" values for library functions. – Mike Seymour May 05 '13 at 17:06
  • @avakar no worries, I did it on your behalf. Good stuff. – hari Mar 13 '15 at 17:23
  • @avakar: Good point about the meaning of "valid" value. I'd say for many programming tasks, NULL is a "valid" value for a pointer, in the sense that it's functionally meaningful. – Craig McQueen Mar 26 '20 at 01:15
3

It is amazing that although this appears to be an obvious bug in the standard, which neglects to say that a zero-length memcmp is fine (and always returns 0), mountains of theory were built to explain why this should be labeled "undefined behavior". The accepted answer above is a good example, and so is the later discussion here.

Sadly, after the theory that memcmp(0, 0, 0) is undefined took hold, GCC decided to strongly enforce this unfortunate decision by adding a __nonnull attribute to memcmp, causing possibly-wrong optimizations and UBSan warnings. Only at that point did this call really become undefined :-(
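To illustrate the kind of optimization meant here, consider the sketch below. `buffers_match` is a hypothetical function written only for this example; whether a given compiler actually drops the check depends on its version, its flags, and how memcmp is declared in the headers in use.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical function, for illustration only. */
static int buffers_match(const char *a, const char *b, size_t n)
{
    int equal = (memcmp(a, b, n) == 0);

    /* If memcmp's pointer parameters are declared nonnull, the compiler is
       allowed to assume that a and b are non-null from this point on, so it
       may remove this check, even for calls where n == 0. */
    if (a == NULL || b == NULL) {
        return -1;
    }
    return equal;
}
```

Called with valid pointers, this returns 1 for equal buffers and 0 otherwise. The point is that a caller passing null pointers with n == 0 cannot rely on ever getting -1 back.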

But if we look at it logically, memcmp(0, 0, 0) is well-defined and should always return 0 (equality). The functionality of memcmp() is described in POSIX as:

The memcmp() function shall compare the first n bytes (each interpreted as unsigned char) of the object pointed to by s1 to the first n bytes of the object pointed to by s2.

When n=0, this means no bytes will be compared. If no bytes are compared, no pointer should ever be dereferenced, and it doesn't matter what this pointer is. That should be obvious, and the fact that the C standard forgot to mention it is nothing more than a bug in the standard.

Interestingly, the Linux memcmp(3) and FreeBSD memcmp(3) manual pages disagree with GCC and claim that this case should be allowed:

The Linux manual page says:

If n is zero, the return value is zero.

While the BSD one says:

Zero-length strings are always identical.
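The part of this that nobody disputes is the zero-length case with valid pointers: the contents of the buffers are irrelevant and the result is 0. A small sketch, where `compare_zero_bytes` is a hypothetical helper used only for illustration:

```c
#include <string.h>

/* Hypothetical helper: compares zero bytes of two valid buffers whose
   contents differ. */
static int compare_zero_bytes(void)
{
    const char left[] = "abc";
    const char right[] = "xyz";

    /* No bytes are compared, so the differing contents do not matter. */
    return memcmp(left, right, 0);
}
```

The disagreement in this thread is only about whether the same guarantee extends to null pointers.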

Nadav Har'El