24

I have a function which returns a pointer and a length, and I want to call std::string::assign(pointer, length). Do I have to make a special case (calling clear) when length is zero and the pointer may be nullptr?

The C++ standard says:

21.4.6.3 basic_string::assign

basic_string& assign(const charT* s, size_type n);
Requires: s points to an array of at least n elements of charT.

So what if n is zero? What is an array of zero characters and how does one point to it? Is it valid to call

s.assign(nullptr, 0);

or is it undefined behavior?

The implementation of libstdc++ appears not to dereference the pointer s when the size n is zero, but that's hardly a guarantee.

erip
Bulletmagnet
  • On a related note, would `char a[4]; s.assign(a+4, 0);` be legal? That seems like a reasonable interpretation of "array of length 0", and could plausibly arise in practice. – Nate Eldredge Jan 21 '16 at 03:22

4 Answers

21

Pedantically, a nullptr does not meet the requirements of pointing to an array of size >=0, and therefore the standard does not guarantee the behaviour (it's UB).

On the other hand, the implementation wouldn't be allowed to dereference the pointer if n is zero, because the pointer could be to an array of size zero, and dereferencing such a pointer would have undefined behaviour. Besides, there wouldn't be any need to do so, because nothing is copied.

The above reasoning does not mean that it is OK to ignore the UB. But, if there is no reason to disallow s.assign(nullptr, 0) then it could be preferable to change the wording of the standard to "If n is greater than zero, then s points to ...". I don't know of any good reason to disallow it, but neither can I promise that a good reason doesn't exist.

Note that adding a check is hardly complicated:

s.assign(ptr ? ptr : "", n);
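If the check is needed in more than one place, it could be wrapped in a small helper. The sketch below is only an illustration: `assign_or_empty` is a hypothetical name, not part of the standard library, and it assumes that a null pointer is only ever paired with a length of zero.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard function): replaces a null pointer
// with a pointer to an empty string literal, so that assign() always
// receives a pointer to a real array, even when n is zero.
inline void assign_or_empty(std::string& s, const char* ptr, std::size_t n) {
    // Assumption: a null pointer combined with n > 0 is a caller bug.
    assert(ptr != nullptr || n == 0);
    s.assign(ptr ? ptr : "", n);
}
```

With this, `assign_or_empty(s, nullptr, 0)` leaves `s` empty without ever handing a null pointer to `assign`.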

What is an array of zero characters

This is: new char[0]. Arrays of automatic or static storage may not have a zero size.
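As a rough illustration, the following sketch passes two non-null pointers with a length of zero: one obtained from `new char[0]`, and a one-past-the-end pointer into an ordinary array. Whether the one-past-the-end pointer satisfies the "points to an array" wording is a matter of interpretation; the function name is made up for this example.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical demo function: returns true if assigning zero characters
// from two different non-null pointers leaves the string empty both times.
inline bool zero_length_sources_yield_empty() {
    std::string s = "old";

    // A dynamically allocated array of zero elements. The pointer is
    // non-null, but it must never be dereferenced.
    std::unique_ptr<char[]> empty(new char[0]);
    assert(empty.get() != nullptr);
    s.assign(empty.get(), 0);
    const bool first_ok = s.empty();

    // A one-past-the-end pointer into an ordinary array of four chars.
    char a[4] = {'a', 'b', 'c', 'd'};
    s = "old";
    s.assign(a + 4, 0);

    return first_ok && s.empty();
}
```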

eerorika
  • Implementations are allowed to check the validity of the pointer. A check that the pointer points to allocated memory, or even a simple `__debug_assert(cstr != nullptr)`, can exist in any conforming implementation – Revolver_Ocelot Jan 20 '16 at 14:30
  • @Revolver_Ocelot there can, and as I said, the standard does not guarantee the behaviour. I would expect any implementation that checks the pointer to do so only if the pointer would be dereferenced. – eerorika Jan 20 '16 at 14:33
  • I prefer to call `clear()` if I have to make a special case for zero length. – Bulletmagnet Jan 20 '16 at 14:41
  • Actually, if the length is zero, the pointer I pass to `assign` is not an explicit `nullptr` literal, it's an honest `char*` which just may happen to be NULL. – Bulletmagnet Jan 20 '16 at 14:45
  • @Bulletmagnet, well, both libc++ and libstdc++ do something like `assert(len == 0 || ptr != nullptr)`, so it is safe to use at least with those libraries. I don't know about VS. It is safer to use the suggested ternary operator approach if there is a danger of a null pointer slipping into your code. – Revolver_Ocelot Jan 20 '16 at 14:50
  • `std::string s(nullptr)` will fail with a debug assertion on `libstdc++`. Apparently the same check is not present in `string::assign`. – sbabbi Jan 20 '16 at 15:04
  • @sbabbi I would expect `std::string s(nullptr)` to fail assertion (or have any random UB) because it has to dereference the pointer to check if null terminator is reached. `assign(ptr, n)` will need to copy until `n` is decremented to zero and when zero is reached, no other check is required. – eerorika Jan 20 '16 at 15:11
  • Here is an analogous constructor call: `std::string s(nullptr, 0);` It fails no debug assertion, behaves as expected (empty string), but is also technically UB for the same reasons. – eerorika Jan 20 '16 at 15:23
  • @user2079303: I do not understand how you conclude that an implementation is of "poor quality" when all it does is make an assertion that will only fail in an ill-formed program. Should, by the same reasoning, std::copy((int*)0, (int*)0, (int*)0); be an acceptable call? – Arne Vogel Jan 20 '16 at 15:51
  • @ArneVogel I consider it poor quality, because it forces the user of the standard library to add a check that is otherwise unnecessary. Yes, the wording of the standard unfortunately allows anything but I believe the reason for the wording is simplicity - not that `(nullptr, 0)` arguments are explicitly considered bad. About the std::copy, yes, I do believe that is an acceptable call and appears to be well formed according to the standard too: http://stackoverflow.com/a/19483347/2079303 – eerorika Jan 20 '16 at 16:12
  • Fair enough, I didn't know (int*)0-(int*)0 was defined. What about std::copy((int*)0, (int*)0, vi); where vi is a singular std::vector::iterator? This is in my reading UB because you cannot even add 0 to a singular iterator (see §24.2.2/5, N3690). Should this be allowed? Just consider for a moment string::assign and imagine the optimizer can generate faster code for some architecture when it knows that the pointer is never null. But then adding an attribute to that effect would be in your opinion a sign of a "poor implementation", because it messes with a subset of ill-formed programs? – Arne Vogel Jan 20 '16 at 16:35
  • @ArneVogel This is getting a bit off topic but subtracting or adding to a pointer is only valid if the result points to the same array as the left hand operand, so your subtraction is not valid (Though, *I* think it should be allowed because that's actually another UB technicality that for example makes `offsetof` pretty useless if you weren't to simply ignore the UB). It's not very relevant to std::copy because it's not allowed to do the subtraction that you suggest because the iterators are required only to be `InputIterator`s and those don't need to define addition nor subtraction. – eerorika Jan 20 '16 at 17:00
  • If the limitation does allow better optimization, then I would consider it worthwhile for the penalty of additional corner case check by the programmer. If you can show me that's the case, then I shall adjust my opinion and consider the current wording of the standard good. – eerorika Jan 20 '16 at 17:05
  • It is UB to call `memcpy` or `memmove` with a null pointer (and somewhere within `assign` you're going to call it). Modern compilers are known to optimize away null pointer checks based on such calls. – T.C. Feb 01 '16 at 00:38
  • “Arrays of automatic or static storage may not have a zero size.” While this is true, the pointer to pass to `assign` is nowhere required to point to the beginning of such an array. As I understand it, it could very well be the one-past-the-end pointer. – Quirin F. Schroll Apr 03 '23 at 09:09
13

Well as you point out, the standard says "s points to an array...". A null pointer does not point to an array of any number of elements. Not even 0 elements. Also, note that s points to "an array of at least n elements...". So it's clear that if n is zero, you can still pass a legitimate pointer to an array.

Overall, std::string's API is not well-guarded against null pointers to charT. So you should always make sure that pointers you hand off to it are non-null.
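One way to follow that advice, sketched here under the assumption that a zero length simply means "make the string empty", is to branch on the length before the pointer ever reaches `std::string`. The name `assign_span` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical wrapper: never forwards a null pointer to std::string.
inline void assign_span(std::string& s, const char* ptr, std::size_t n) {
    if (n == 0) {
        s.clear();           // nothing to copy, so ptr is never inspected
        return;
    }
    assert(ptr != nullptr);  // a null pointer with n > 0 is a caller bug
    s.assign(ptr, n);
}
```

Compared with substituting an empty string literal, this keeps the zero-length case entirely away from `assign`, at the cost of one extra branch.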

Nicol Bolas
10

I am not sure why an implementation would dereference any pointer to an array whose length is provided as zero.

That said, I would err on the side of caution. You could argue that you are not meeting the standard's requirement:

21.4.6.3 basic_string::assign

8 Requires: s points to an array of at least n elements of charT

because nullptr is not pointing to an array.

So technically the behaviour is undefined.

Galik
6

From the Standard (2.14.7) [lex.nullptr]:

The pointer literal is the keyword nullptr. It is a prvalue of type std::nullptr_t. [ Note: std::nullptr_t is a distinct type that is neither a pointer type nor a pointer to member type ... ]

std::nullptr_t can be implicitly converted to any type of null pointer as per 4.10.1 [conv.ptr]. Regardless of the type of null pointer, the fact remains that it points at nothing.

Thus, it doesn't meet the requirement that s points to an array of at least n elements of charT.

It seems to be undefined behavior.

Interestingly, according to this answer, the C++11 Standard clearly stated that s must not be a null pointer in the basic_string constructor, but this wording has since been removed.

Community
erip
  • True but a nullptr_t can be implicitly converted to a null pointer of any type, including char*. The issue is that the resulting null pointer is not pointing to anything, not even 0 characters, but the type is not a problem. – rici Jan 20 '16 at 15:36
  • @rici Thank you. I've edited my post, but fear that it is quite similar to other provided answers. – erip Jan 20 '16 at 15:53