Inspired by this question: How does *(&arr + 1) - arr give the length in elements of array arr?

The following code will calculate the length of an array, albeit invoking undefined behavior:

int arr[] = {5, 8, 1, 3, 6};
size_t len = *(&arr + 1) - &arr[0]; // len is 5

I believe that by dereferencing (&arr + 1) we trigger undefined behavior. However, the only reason we do this is to immediately decay the result into int*, pointing one element past the last one in the original array. Since we never dereference this pointer, the rest of the expression is well within defined territory.

The question is thus the following: is there a way to decay into int* without dereferencing the non-dereferenceable pointer, while staying within defined behavior?

P.S. Mandatory disclaimer: yes, I can calculate the size of an array with sizeof operator and division. This is not the point of the question.
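For reference, the well-defined baselines mentioned in the P.S. can be sketched as below. This is only a minimal illustration, not an answer to the question: the helper names `length_via_sizeof` and `length_via_std_size` are hypothetical, and `std::size` assumes a C++17 compiler.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>  // std::size (C++17)

// Hypothetical helper: classic sizeof division, valid only while arr is still an array.
inline std::size_t length_via_sizeof() {
    int arr[] = {5, 8, 1, 3, 6};
    return sizeof arr / sizeof arr[0];
}

// Hypothetical helper: std::size deduces the bound from the array type itself.
inline std::size_t length_via_std_size() {
    int arr[] = {5, 8, 1, 3, 6};
    return std::size(arr);
}
```

Neither version forms or indirects through `&arr + 1`, so neither touches the undefined-behavior question raised above.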

EDIT: I am now less sure that the indirection itself is undefined. I have found http://www.open-std.org/jtc1/sc22/wg21/docs/cwg_active.html#232, and from it, it looks like there was an attempt to legalize the indirection per se, but I was not able to find any wording to this effect in the actual standard.

SergeyA
  • You could do it in C++14 using `reinterpret_cast<int*>(&arr + 1)`, but this isn't possible anymore in C++17. – Brian Bi Apr 15 '20 at 21:25
  • If you simply convert to `uintptr_t`, the conversion to `int*` wouldn't be needed and the size could be found with: `size_t len = (reinterpret_cast<uintptr_t>(&arr + 1) - reinterpret_cast<uintptr_t>(&arr[0])) / sizeof arr[0];`. This is obviously not quite "equivalent" to the original... – P.P Apr 15 '20 at 21:31
  • @usr I think that's not guaranteed to work by the standard, but in practice it most likely will. – Brian Bi Apr 15 '20 at 21:33
  • @Brian I believe the conversion to uintptr_t is implementation-defined. – P.P Apr 15 '20 at 21:39
  • @usr You're right. So it's guaranteed to work if your implementation says that it does something sane. – Brian Bi Apr 15 '20 at 21:41
  • Note: it is better to calculate array size with `::std::size` rather than with `sizeof` and division. – user7860670 Apr 15 '20 at 21:47
  • This doesn't mean much, but Clang diagnoses it as UB in `constexpr` context, while GCC and MSVC don't. See https://godbolt.org/z/9oWef6. But GCC and MSVC seem to not correctly diagnose illegal pointer arithmetic (e.g. adding `2` instead of `1`) either. – walnut Apr 15 '20 at 22:43
  • `size_t arr_len = &(&arr)[1][0] - &(&arr)[0][0];` What about this? – Strong will Apr 16 '20 at 03:58
  • @Strongwill `(&arr)[1]` is equivalent to `*(&arr + 1)`. – Language Lawyer Apr 16 '20 at 04:41
  • Would this be a crystal clear UB: `int (&end_ref)[5] = *(&arr + 1); auto len = end_ref - arr;` to not force the reference decaying into a pointer immediately? "_A reference is required to be initialized to refer to a valid object or function_" (possibly after the referred object's storage has been allocated but before it's actually constructed, if I understand it correctly). Is a reference to something before its storage has been allocated different than a pointer to something before its storage has been allocated? If it is, perhaps the answer can be found there? (i'm not a language-lawyer) – Ted Lyngmo Apr 16 '20 at 10:28

1 Answer

What is known as "decay" is "array-to-pointer conversion" in standardese, and it is clear that what you ask is impossible:

[conv.array]

An lvalue or rvalue of type “array of N T” or “array of unknown bound of T” can be converted to a prvalue of type “pointer to T”

There can be no lvalue referring to a non-existent array, and an rvalue (not converted from the lvalue) can't refer to the same array.


Now to clarify a few points:

  • `&arr + 1` is a pointer past the end of an object, and indirection is defined only on pointers to actual objects. The active issue is about null pointers, which isn't relevant to pointers past the end of an object.
  • Pointer arithmetic is only defined between elements of the same array. Even if you had `int arr[2][5];`, `*(&arr + 1)` decayed to `int*` is not considered part of the array `arr[0]`, so `&arr[1][0] - &arr[0][0]` is still undefined.
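For comparison, here is a minimal sketch of how the same one-past-the-end `int*` can be obtained without ever forming `*(&arr + 1)`. It sidesteps the question rather than answering it, since it starts from `arr` itself instead of from `&arr + 1`. The names `past_the_end` and `demo_length` are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>  // std::end

// Hypothetical helper: the bound N is deduced from the array type,
// so no indirection through a past-the-end array pointer is needed.
template <typename T, std::size_t N>
constexpr T* past_the_end(T (&a)[N]) noexcept {
    return a + N;  // a decays to &a[0]; &a[0] + N is a valid past-the-end pointer
}

inline std::ptrdiff_t demo_length() {
    int arr[] = {5, 8, 1, 3, 6};
    int* last = past_the_end(arr);
    assert(last == std::end(arr));  // std::end yields the same pointer for built-in arrays
    return last - arr;              // both pointers belong to the same array
}
```

Here the array-to-pointer conversion is applied to `arr`, an lvalue that refers to an existing array, and the subtraction stays within that one array, so both rules quoted above are respected.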
Passer By
  • @M.M Sorry, I meant that using them is undefined. I removed it since it's kinda hard to phrase in a short sentence. – Passer By May 25 '20 at 13:00