
Is it safe to pass a NULL pointer as a parameter of strncmp if the third parameter is zero? I.e., an invocation like:

strncmp(NULL, "foo", 0);
Sourav Ghosh
Marian

2 Answers


It's undefined behavior.

The C standard says you should not, in general, pass invalid pointers to library functions.

Quoting C11, chapter §7.24.1, "String function conventions", (emphasis mine)

Where an argument declared as size_t n specifies the length of the array for a function, n can have the value zero on a call to that function. Unless explicitly stated otherwise in the description of a particular function in this subclause, pointer arguments on such a call shall still have valid values, as described in 7.1.4. On such a call, a function that locates a character finds no occurrence, a function that compares two character sequences returns zero, and a function that copies characters copies zero characters.

and I don't see any specific mention (as an exception to the aforesaid constraint) in §7.24.4.4, the strncmp() function.


To add context for "invalid pointers", quoting §7.1.4/p1, Use of library functions

[...] If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after promotion) not expected by a function with variable number of arguments, the behavior is undefined. [...]

and regarding NULL, quoting §7.19, <stddef.h>

NULL
which expands to an implementation-defined null pointer constant; [...]

Sourav Ghosh
  • yes, it is (in this context) 7.1.4/1 "If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified)" – Lightness Races in Orbit Jun 23 '16 at 14:13
  • (to be clear, I think that needs adding to the answer in order to complete it) – Lightness Races in Orbit Jun 23 '16 at 14:15
  • @LightnessRacesinOrbit Sure, let me edit that info into the answer. It's too valuable to be sitting in comments. :) – Sourav Ghosh Jun 23 '16 at 14:16
  • @LightnessRacesinOrbit Done updating. Also added a bit about `NULL`, which is used in the question. – Sourav Ghosh Jun 23 '16 at 14:25
  • @LightnessRacesinOrbit: It is unlikely that any implementation which is not being deliberately obtuse would have any reason to regard reading or writing zero bytes from a null address as anything other than a no-op. Unfortunately, that does not make it unlikely that some implementations will find other more "interesting" behaviors. – supercat Jun 23 '16 at 18:33
  • @supercat: `char c = *start; while (c && i++ < n) { /*..*/; c = *start++; }` isn't a stretch (or something to that effect; example has bugs obvs :P) – Lightness Races in Orbit Jun 23 '16 at 20:00
  • @LightnessRacesinOrbit: What would authorize the function to read more characters than it is copying? There could be a justification if `char *end=start+len;` would do something wonky when adding a zero offset to a null pointer; while the C++ Standard mandates that adding zero to null is a no-op, the C Standard does not. Except on platforms which would trap such pointer arithmetic, however, I see no plausible real advantage to allowing zero-size copy operations to behave as anything other than a no-op [or, for that matter, of allowing `memcpy(x,x,n)` to do anything other than... – supercat Jun 23 '16 at 20:04
  • @supercat: What prevents the function from reading more characters than it is copying, when reading those characters is guaranteed to work? In my example, only _one_ character, and the definition of the function states that the inputs must be valid. To be valid, they must be dereferenceable. That's "authorization" enough. You seem to be inventing constraints where there are none. The point is that the implementation and compiler are free to make assumptions, that go beyond "how is the loop inside `strncpy` designed", based upon the stated preconditions of the function. – Lightness Races in Orbit Jun 23 '16 at 20:07
  • ...read and write any or all of the values in the range in arbitrary order, with the provisos that no element's value be altered by the operation unless it's *also* altered by some other simultaneous operation (which would invoke UB).] A lot of APIs take a pointer and length, with a proviso that the pointer may be null if the length is zero. It is necessary that the pointer be dereferenceable for as many items as the function is allowed to copy. If that number is zero, the pointer is allowed to be dereferenced zero times (not at all). – supercat Jun 23 '16 at 20:09
  • BTW, `strncpy` is specified to always write `n` bytes, and zero is a valid value; code will need to check for zero length before it stores the first byte, and doing such check as the very first operation will avoid the need to have to do it later. – supercat Jun 23 '16 at 20:14
  • @LightnessRacesinOrbit: There are real-world usage cases where regarding a zero-byte copy as a no-op would avoid the need for a boundary check in user code. What real-world scenarios can you offer where guaranteeing that size zero implies a no-op would make an implementation less efficient? While I can't fault those who want to guard against deliberately-obtuse implementations, is there any *other* reason why a zero-character operation shouldn't be a no-op? – supercat Jun 23 '16 at 20:19
  • @supercat: Nope, none. It's all moot though. If you wilfully break preconditions you deserve whatever you get. Don't forget you're not just guarding against whoever wrote the eight or nine lines of code that make up `strncpy`; you're guarding against an aggressively optimizing compiler. Don't even start to try to rationalise about what that will and will not lead to when you break preconditions. Speaking generally, read https://blogs.msdn.microsoft.com/oldnewthing/20140627-00/?p=633 – Lightness Races in Orbit Jun 23 '16 at 20:21
  • @LightnessRacesinOrbit: The comment about passing null doesn't mean that no library function may be passed null, but rather means that unless otherwise noted, e.g. in the documentation to `free`, no library function may be passed null *for a pointer it is going to actually use*, and copying zero bytes from a pointer would not fit most definitions of "use". I was including hyper-aggressive optimizers in "deliberately-obtuse implementations", since forcing programmers to include extra checks that an optimizer will likely not be able to eliminate is a phony "optimization". – supercat Jun 23 '16 at 20:32
  • @supercat: The standard says you shall not pass a null pointer to `strncpy`. It can't be much clearer about it. There is no "for a pointer it is going to actually use" caveat. Unless Sourav or I have missed a passage, in which case please feel free to point it out so we can fix the answer. – Lightness Races in Orbit Jun 23 '16 at 20:49
  • @LightnessRacesinOrbit: The C89 Standard was written in the days when compiler writers would implement useful behaviors without being ordered to do so (as evidenced by the fact that compiler writers had been implementing useful behaviors for over a decade before the Standard was written). The authors of the Standard wanted to avoid imposing mandates that would prevent a standard-conforming compiler for a platform from producing code that was as efficient as pre-existing compilers for the platform, which meant they would make many things UB if the alternative would be to risk adding cost... – supercat Jun 23 '16 at 21:39
  • ...to *any* existing implementations, on the premise that even if 99% of implementations could guarantee a behavior at no extra cost, making the action UB shouldn't prevent them from doing so. Unfortunately, implementation writers generally didn't bother to list all the ways in which the underlying platform logic would behave more consistently than the Standard required, since such things would have seemed too obvious to be worth mentioning. I know compiler writers who value speed over semantics have thrown that out the window, but the Standard wasn't written for such people. – supercat Jun 23 '16 at 21:48
  • @supercat: I know what the history of the standard is. None of that changes what the answer is to this question, or how one should program. – Lightness Races in Orbit Jun 24 '16 at 00:13

From the C strncmp documentation at cppreference.com:

The behavior is undefined when either lhs or rhs is the null pointer.

Simply read the documentation.

Lightness Races in Orbit
  • http://cppreference.com isn't "the" documentation, it's a public Wiki written by enthusiasts. Given the [tag:language-lawyer] tag, I don't think we should consider anything but the language standard to be an authoritative reference for this question. – Nate Eldredge Jun 23 '16 at 15:55
  • @NateEldredge: I didn't say that it was authoritative, nor "the" documentation (it's "the documentation at cppreference.com"). But doing _some_ basic research would have been a good start. The OP could then have asked for _confirmation_ on what they'd already seen stated. Enter the standard, and Sourav's excellent answer. I consider there to be value in reminding people to _read documentation_. – Lightness Races in Orbit Jun 23 '16 at 15:57
  • Thanks for the answer. I was actually interested in the kind of answer that Sourav has provided (with your valuable comment). I wanted to find arguments against suspicious lines of code like this one. – Marian Jun 23 '16 at 16:53
  • @Marian: Indeed, Sourav's answer is the one you _need_. But I ask you again to consider starting your research with a read-through of documentation in the future. – Lightness Races in Orbit Jun 23 '16 at 17:08