
Is it well defined in C++ to dereference a one-past-the-end pointer to an array type?

Consider the following code:

#include <cassert>
#include <iterator>

int main()
{
    // An array of ints
    int my_array[] = { 1, 2, 3 };

    // Pointer to the array
    using array_ptr_t = int(*)[3];
    array_ptr_t my_array_ptr = &my_array;

    // Pointer one-past-the-end of the array
    array_ptr_t my_past_end = my_array_ptr + 1;

    // Is this valid?
    auto is_this_valid = *my_past_end;

    // Seems to yield one-past-the-end of my_array
    assert(is_this_valid == std::end(my_array));
}

Common wisdom is that it's undefined behavior to dereference a one-past-the-end pointer. However, does this hold true for pointers to array types?

It seems reasonable that this should be valid, since *my_past_end can be resolved purely with pointer arithmetic and yields a pointer to the first element of the array that would be there, which also happens to be a valid one-past-the-end int* for the original array my_array.

However, another way of looking at it is that *my_past_end is producing a reference to an array that doesn't exist, which implicitly converts to an int*. That reference seems problematic to me.

For context, my question was brought on by this question, specifically the comments to this answer.

Edit: This question is not a duplicate of Take the address of a one-past-the-end array element via subscript: legal by the C++ Standard or not? I'm asking whether the rule explained in that question also applies to pointers to an array type.

Edit 2: Removed auto to make explicit that my_array_ptr is not an int*.
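To make the distinction behind Edit 2 concrete, here is a minimal sketch that contrasts a pointer to the whole array with a pointer to its first element. The alias names whole_ptr_t and elem_ptr_t and the helper check_element_stride are hypothetical, introduced only for illustration; the type checks use unevaluated operands, so nothing past the end is dereferenced.

```cpp
#include <cassert>
#include <type_traits>

int my_array[] = { 1, 2, 3 };

// &my_array points to the whole array; its type is int(*)[3].
using whole_ptr_t = decltype(&my_array);
static_assert(std::is_same<whole_ptr_t, int(*)[3]>::value,
              "&my_array is a pointer to an array of 3 ints");

// my_array decays to a pointer to its first element; its type is int*.
using elem_ptr_t = std::decay<decltype(my_array)>::type;
static_assert(std::is_same<elem_ptr_t, int*>::value,
              "my_array decays to a pointer to int");

// Pointer arithmetic steps by the size of the pointed-to type.
static_assert(sizeof(std::remove_pointer<whole_ptr_t>::type) == 3 * sizeof(int),
              "stepping a pointer to the array skips all 3 ints");
static_assert(sizeof(std::remove_pointer<elem_ptr_t>::type) == sizeof(int),
              "stepping a pointer to an element skips 1 int");

// Hypothetical helper: the decayed pointer advances one int at a time.
void check_element_stride()
{
    int* first = my_array;              // array-to-pointer decay
    assert(first + 1 == &my_array[1]);  // one int further, not one array further
}
```

Calling check_element_stride() from main exercises the runtime part; the static_asserts are checked at compile time.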

François Andrieux
  • No, that would be UB – πάντα ῥεῖ Oct 09 '18 at 18:13
  • @πάνταῥεῖ I agree, but if you look at the answers in the linked question you'll find that not everyone does. – François Andrieux Oct 09 '18 at 18:15
  • You might use the pointer value for calculations, though dereferencing would definitely be UB. `// Is this valid? auto is_this_valid = *my_past_end;` Definitely not. `assert(is_this_valid == std::end(my_array));` That would still be valid though. – πάντα ῥεῖ Oct 09 '18 at 18:17
  • @πάνταῥεῖ many say in comments to that answer that it is OK unless you read or write from/into that lvalue – Slava Oct 09 '18 at 18:18
  • @Richard It's not a duplicate. This question is specifically about whether or not there is a difference if the pointer is a pointer to an array type. – François Andrieux Oct 09 '18 at 18:19
  • See also [CWG 232](https://wg21.link/cwg232) which is closely related. – Barry Oct 09 '18 at 18:20
  • @FrançoisAndrieux There's technically no difference. – πάντα ῥεῖ Oct 09 '18 at 18:20
  • @πάνταῥεῖ My question is asking if it's any different. If it isn't please make it an answer. That question is not a duplicate. – François Andrieux Oct 09 '18 at 18:21
  • @FrançoisAndrieux You're asking if the general rules for dereferencing a `T*` are different based on whether or not `T` is an array type? – Barry Oct 09 '18 at 18:22
  • @Barry Yes, I'm asking whether the same rules apply for pointers to array types. I've always assumed yes, but the discussion in the linked answer seems to show a few people disagreeing with that. – François Andrieux Oct 09 '18 at 18:25
  • @πάνταῥεῖ Is dereferencing a past-the-end pointer valid if it is only used for calculations? And this question exists because many people there claim that validity depends on context (no reading or writing is involved), not because of a lack of research – Slava Oct 09 '18 at 18:27
  • @Slava No, referencing the resulting _pointer value_ is valid for calculations, dereferencing it is never valid. – πάντα ῥεῖ Oct 09 '18 at 18:28
  • @πάνταῥεῖ so Brian's answer is incorrect? – Slava Oct 09 '18 at 18:28
  • @Slava Did I overlook something? I don't see Brian participating here. Questions need to be self-contained, in case you're referring to some of those links I didn't even bother to look at. OP could have at least cited the essential parts of such arguable answers. – πάντα ῥεῖ Oct 09 '18 at 18:32
  • @Slava What part of [this](https://stackoverflow.com/questions/988158/take-the-address-of-a-one-past-the-end-array-element-via-subscript-legal-by-the) doesn't actually answer the Q? – NathanOliver Oct 09 '18 at 18:33
  • @πάνταῥεῖ question explicitly says it is related to this answer https://stackoverflow.com/questions/52726320/calculate-array-length-via-pointer-arithmetic/52726432#52726432 – Slava Oct 09 '18 at 18:34
  • @NathanOliver There is no question whether or not `&my_array[3]` would be legal to write. The question is about the means used in the example to get that pointer, which involves dereferencing a one-past-the-end pointer. – François Andrieux Oct 09 '18 at 18:35
  • @NathanOliver I do not see where it says whether dereferencing a past-the-end pointer is legal when no conversion to a prvalue is involved. As you can see from Keith's answer, people are still confused. – Slava Oct 09 '18 at 18:35
  • Am I missing something? Isn't adding 1 to a pointer to the first element of an array of 3 elements just a pointer to the second element? – 463035818_is_not_an_ai Oct 09 '18 at 18:51
  • `my_array_ptr` is a pointer to the whole array, not the first element. It is an `int(*)[3]`, not an `int *`. – François Andrieux Oct 09 '18 at 18:52
  • @FrançoisAndrieux oh right, maybe one of the places where I wouldn't use `auto` – 463035818_is_not_an_ai Oct 09 '18 at 18:52
  • @user463035818 The syntax for a pointer to an array can be confusing. I already put a comment on that line. Though maybe I could use a `using` instead. – François Andrieux Oct 09 '18 at 18:53
  • @FrançoisAndrieux well it's a matter of style I guess, maybe one day we will be more used to `auto` than to arrays decaying to pointers ;) – 463035818_is_not_an_ai Oct 09 '18 at 18:56

2 Answers


This is CWG 232. That issue might seem like it's mainly about dereferencing a null pointer but it's fundamentally about what it means to simply dereference something that doesn't point to an object. There is no explicit language rule about this case.

One of the examples in the issue is:

Similarly, dereferencing a pointer to the end of an array should be allowed as long as the value is not used:

char a[10];
char *b = &a[10];   // equivalent to "char *b = &*(a+10);"

Both cases come up often enough in real code that they should be allowed.

This is basically the same situation as the OP's (the a[10] part of the above expression), except that the pointee is a char instead of an array type.

Common wisdom is that it's undefined behavior to dereference a one-past-the-end pointer. However, does this hold true for pointers to array types?

There is no difference in the rules based on what kind of pointer it is. my_past_end is a past-the-end pointer, so whether it's UB to dereference it or not is not a function of the fact that it points to an array as opposed to any other kind of type.


While the type of is_this_valid is an int* which gets initialized from an int(&)[3] (array-to-pointer decay), and thus nothing here actually reads from memory, that is immaterial to the way the language rules work. my_past_end is a pointer whose value is past the end of an object, and that's the only thing that matters.
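Independently of how CWG 232 is eventually resolved, the same one-past-the-end int* can be computed without dereferencing a past-the-end pointer to the array at all. The following is a minimal sketch, not something taken from the issue text: end_of and check_end_of are hypothetical helpers that rely only on pointer arithmetic on the decayed pointer to the first element, which is allowed to reach one past the last element.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>

// Hypothetical helper: one-past-the-end element pointer of a built-in array,
// computed from the decayed pointer to the first element (no dereference).
template <typename T, std::size_t N>
T* end_of(T (&arr)[N])
{
    return arr + N;
}

// Hypothetical check mirroring the array from the question.
void check_end_of()
{
    int my_array[] = { 1, 2, 3 };
    int* past_end = end_of(my_array);
    assert(past_end == std::end(my_array));  // same pointer std::end yields
    assert(past_end - my_array == 3);        // exactly 3 elements past the start
}
```

Calling check_end_of() from main runs the checks. In everyday code, std::end(my_array) already provides this pointer, so the helper is only there to show the arithmetic.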

Barry
  • So is it legal or not? – Slava Oct 09 '18 at 18:46
  • @Slava It's an active core language issue. – Barry Oct 09 '18 at 18:52
  • OK, so we can assume it is undefined behavior until the issue is resolved either way. – Slava Oct 09 '18 at 18:59
  • @Slava I don't know why you would say that. Given how much code exists that does something like this, I would expect that if the wording changes, it would change to allow this. – Barry Oct 09 '18 at 19:00
  • is it undecided behaviour then? – 463035818_is_not_an_ai Oct 09 '18 at 19:03
  • @Barry because whether code has UB is not determined by how much code uses it but by whether the behaviour is defined by the standard, and it is safer to assume that behaviour is undefined unless proven otherwise. – Slava Oct 09 '18 at 19:06
  • @Barry Do you know why there seems to be no activity on this issue for the last 10+ years? Is the committee still working on this? – Brian Bi Oct 09 '18 at 19:34
  • @Slava: It is safer for implementations to treat such behaviors as defined absent a good reason not to, and I can't see any good reasons an implementation might have in a case like this. One might want to configure a diagnostic implementation to trap such code, but one may also want diagnostic implementations to trap on some constructs that have defined behavior, but which parts of code are known not to use deliberately (e.g. unsigned wraparound involving operands of type `size_t`). Whether it's safer for programs to treat them as undefined depends upon the recklessness of implementations. – supercat Oct 10 '18 at 18:50

I believe it's well defined, because it doesn't dereference the one-past-the-end pointer.

auto is_this_valid = *my_past_end;

my_past_end is of type int(*)[3] (pointer to array of 3 int elements). The expression *my_past_end is therefore of type int[3], so like any array expression in this context, it "decays" to a pointer of type int*, pointing to the initial (zeroth) element of the array object. This "decay" is a compile-time operation. So the initialization simply initializes is_this_valid, a pointer of type int*, to point just past the end of my_array. No memory past the end of the array object is accessed.
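The type claims in this paragraph can be checked at compile time without evaluating the expression. This is a minimal sketch under that assumption: decltype does not evaluate its operand, so no dereference takes place, and the names deref_t, deref_is_array_lvalue and decays_to_int_ptr are hypothetical. It confirms the types only; it says nothing about whether evaluating *my_past_end at run time is defined.

```cpp
#include <type_traits>
#include <utility>

using array_ptr_t = int(*)[3];

// Type of the expression *p for p of type int(*)[3].
// decltype leaves its operand unevaluated, so nothing is dereferenced here.
using deref_t = decltype(*std::declval<array_ptr_t>());

// The dereference yields an lvalue of array type, i.e. int(&)[3].
constexpr bool deref_is_array_lvalue =
    std::is_same<deref_t, int(&)[3]>::value;

// Array-to-pointer conversion then turns that into int*.
constexpr bool decays_to_int_ptr =
    std::is_same<std::decay<deref_t>::type, int*>::value;

static_assert(deref_is_array_lvalue, "*p has type int(&)[3]");
static_assert(decays_to_int_ptr, "int(&)[3] decays to int*");
```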

Keith Thompson
  • Downvoter: It's entirely possible that I'm wrong. If so, can you explain why? – Keith Thompson Oct 09 '18 at 18:37
  • That's just, like, your opinion, man. If you want to show that it's well defined, you should cite the standard. – melpomene Oct 09 '18 at 18:37
  • @melpomene: I'm saying it's well defined because it doesn't dereference a one-past-the-end pointer. Where would the standard describe something that *doesn't* happen? Array-to-pointer conversion (which is implicit) is defined in [conv.array], 4.2 of the C++11 standard, but I assume that's common knowledge. – Keith Thompson Oct 09 '18 at 18:47
  • Things are undefined by default. For every piece of code with defined behavior, there is a part of the standard that defines what it does. `*my_past_end` is a use of the `*` operator, whose description states "*the result is an lvalue referring to the object or function to which the expression points*". `my_past_end` does not point to any object, so the behavior is undefined. – melpomene Oct 09 '18 at 18:53
  • `*x` dereferences `x` (if the expression is evaluated, which it is). You seem to be incorrectly conflating "dereferencing `x`" with "accessing memory where `x` points" – M.M Apr 30 '20 at 04:00