Summary
We cannot assume that this pointer is valid. Moreover, the pointer arithmetic vec.data() + 3 might be UB. Since nothing in the standard guarantees that it is not UB, the code is implementation-dependent even if it works.
Note: this answer was reworded to make a better distinction between UB and at risk of being UB.
Language-lawyer reasoning
Isn't this vector empty?
In your code snippet, you use reserve() and data() on an empty vector vec that has a size() of 0. We know two things from your own quote of the standard:
- first, data() returns a pointer such that [data(), data() + size()) is a valid range. In your case, we can therefore assume that [data(), data() + 0) is a valid range. But there is no such guarantee for [data(), data() + capacity()). Your standard library could provide such an implementation-dependent guarantee, but you cannot be certain in general. Without it, the expression vec.data() + 3 would definitely be UB (explained further down).
- second, for a non-empty vector, data() returns the address of the first element. Your code relies on this (otherwise you wouldn't add a fixed index). But since your vector is empty, you cannot be sure. An implementation is, for example, perfectly allowed to return nullptr for an empty vector, regardless of its capacity.
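To make the first point concrete, here is a minimal sketch of the only range the standard lets you rely on. The helper names guaranteed_range_length and guaranteed_length_after_reserve are hypothetical, chosen for illustration only; they are not part of any library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: length of the range the standard guarantees to be
// valid, i.e. [data(), data() + size()).
std::size_t guaranteed_range_length(const std::vector<int>& v) {
    const int* first = v.data();
    const int* last = v.data() + v.size();  // covered by the valid-range guarantee
    return static_cast<std::size_t>(last - first);
}

// Hypothetical helper reproducing the situation in the question:
// a vector that has reserved storage but still holds no elements.
std::size_t guaranteed_length_after_reserve(std::size_t n) {
    std::vector<int> vec;
    vec.reserve(n);
    // capacity() is now at least n, but size() is still 0, so the guaranteed
    // range is empty. vec.data() + 3 would step outside of it.
    return guaranteed_range_length(vec);
}
```

Calling guaranteed_length_after_reserve(16) yields 0: reserving capacity does not enlarge the range that data() is documented to cover.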
Why is the valid range so important?
A fundamental rule is often ignored: pointer arithmetic is UB if its result does not stay within an array object (the position one past the last element included). The standard expresses this more formally:
[expr.add]/4: When an expression J that has integral type is added to or subtracted from an expression P of pointer type, the result has the type of P.
- If P evaluates to a null pointer value and J evaluates to 0, the result is a null pointer value.
- Otherwise, if P points to an array element i of an array object x with n elements, the expressions P + J and J + P (where J has the value j) point to the (possibly-hypothetical) array element i + j of x if 0 ≤ i + j ≤ n and the expression P - J points to the (possibly-hypothetical) array element i − j of x if 0 ≤ i − j ≤ n.
- Otherwise, the behavior is undefined.
This means that if you add an integer to a pointer and the result would be out of the valid range, the expression itself is UB, even before the resulting pointer is used. It also means that adding anything other than 0 to nullptr is UB.
So if the vector implementation of your library strictly complies with the standard, without any additional guarantee, your code is UB because of this pointer arithmetic rule (out of range). But since we do not know what your implementation does, we cannot be sure of UB. The only thing we can be sure of is that UB cannot be excluded and that the code is not portable.
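The rule is easiest to see on a plain array, where the number of elements n is not in doubt. The following is a small sketch with hypothetical function names; the lines that would be UB are left as comments on purpose.

```cpp
#include <cassert>
#include <cstddef>

// Distance between the first element and the one-past-the-end position of a
// 4-element array. Every pointer formed here satisfies 0 <= i + j <= n.
std::ptrdiff_t one_past_end_distance() {
    int x[4] = {10, 20, 30, 40};
    int* p = x;           // points to element 0 (i == 0)
    int* end = p + 4;     // i + j == n: the hypothetical element one past the end, allowed
    // int* bad = p + 5;  // i + j > n: undefined behavior, even if never dereferenced
    return end - p;
}

// Adding 0 to a null pointer is the one null-pointer case that is defined.
bool null_plus_zero_is_null() {
    int* p = nullptr;
    // p + 3 would be undefined behavior
    return (p + 0) == nullptr;
}
```

With an empty vector, you do not know whether data() behaves like p in the first function or like the null pointer in the second, which is exactly why vec.data() + 3 cannot be trusted.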
Additional thoughts
It may be tempting to believe that reserve() guarantees that memory is allocated for the vector, and hence that the range [data(), data() + capacity()) is valid. But this is not at all the case: the pointer arithmetic rule is not about allocated memory but about array elements of an array object with n elements.
An implementation could well allocate memory and create the array object of the exact right size using placement new, to preserve the addresses of the existing elements. It would not be a super efficient implementation, but it'd be a legal one.
Moreover, the guarantees that the standard gives for reserve() and capacity() are about the absence of reallocation:
[vector.capacity]/4: A directive that informs a vector of a planned change in size, so that it can manage the storage allocation accordingly. After reserve(), capacity() is greater or equal to the argument of reserve if reallocation happens; and equal to the previous value of capacity() otherwise. Reallocation happens at this point if and only if the current capacity is less than the argument of reserve().
[vector.capacity]/1: Returns: The total number of elements that the vector can hold without requiring reallocation.
But as long as the vector stays empty, there is no element that could ever be reallocated. So a standard-compliant implementation does not have to worry about reallocation and could delay the first allocation until just before the first element is inserted into the vector. I would personally not implement it like that, but it would be legal and cannot be excluded. The fact that data() is not obliged to return a pointer to a first element when the vector is empty seems tailor-made to allow this kind of implementation.
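What the reallocation guarantee does give you can be sketched as follows: once the vector is non-empty, data() stays put as long as size() does not exceed the reserved capacity. The function name data_is_stable_within_capacity is hypothetical. Note that the sketch deliberately does not look at data() before the first push_back, since nothing is promised about that value.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if data() kept the same value while the vector grew from one
// element up to n elements, after reserve(n).
bool data_is_stable_within_capacity(std::size_t n) {
    std::vector<int> v;
    v.reserve(n);
    if (n == 0) {
        return true;                      // nothing to observe
    }
    v.push_back(0);                       // the vector is non-empty from here on
    const int* first = v.data();          // now the address of v[0]
    for (std::size_t i = 1; i < n; ++i) {
        v.push_back(static_cast<int>(i)); // size() never exceeds n <= capacity()
    }
    return v.data() == first && v.size() == n;
}
```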
Final word
Your code will work with mainstream implementations, since for practical reasons it is quite common for reserve() to trigger allocation/reallocation. But if you want portable code that also works perfectly on exotic microcontroller architectures, in mission-critical systems with lives at risk, then you'd better avoid such shortcuts.
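One portable way to avoid the shortcut, sketched here under the assumption that you are free to change the vector's size, is to create the elements first (for example with resize()) so that the pointer lands inside [data(), data() + size()). The helper names element_pointer and write_through_pointer are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a pointer to element `index`, growing the
// vector first if needed so that the element really exists.
int* element_pointer(std::vector<int>& v, std::size_t index) {
    if (v.size() <= index) {
        v.resize(index + 1);   // creates value-initialized elements
    }
    return v.data() + index;   // inside [data(), data() + size())
}

// Same intent as vec.data() + 3 after reserve(), but without relying on
// implementation-specific behavior.
int write_through_pointer() {
    std::vector<int> vec;
    vec.reserve(10);           // still useful: avoids reallocation while growing
    int* p = element_pointer(vec, 3);
    *p = 7;
    return vec[3];
}
```

The trade-off is that resize() value-initializes the new elements, which costs a little more than reserve() alone.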