
Let's have a char array:

char arr[(size_t)PTRDIFF_MAX + (size_t)2];

And let's accept that we have a system with sufficient memory available.

Is accessing the array, in either array notation or pointer notation, defined, undefined, or implementation-defined behaviour?

char c = arr[(size_t)PTRDIFF_MAX + (size_t)1];
char d = *(&arr[0] + ((size_t)PTRDIFF_MAX + (size_t)1));

I'm concerned that the index may be converted to ptrdiff_t before the access, resulting in an invalid index.


Edit:

In case an array with that many elements cannot exist, consider this similar situation:

We have an array with fewer elements than PTRDIFF_MAX, but large ones (e.g. int64_t, or structs with many members), such that the raw size of the array in bytes still exceeds PTRDIFF_MAX. We then access the array through a char *, which is a valid cast (more or less what happens inside memcpy).

  • This looks relevant: https://stackoverflow.com/a/31864574/634919. I think the short answer is that you simply can't have an array that big. – Nate Eldredge Jul 24 '20 at 15:23
  • I assume this is on a 32 bit machine where `PTRDIFF_MAX` is `0x7FFFFFFF` (31 bits)? On a 64 bit machine, `PTRDIFF_MAX` is `0x7FFFFFFFFFFFFFFF`, so it's unlikely you'd get anywhere close [or care]. But this is a _signed_ quantity. `PTRDIFF_MIN` is `0x80000000`. It can depend on the environment. You _can_ address beyond this. For example, on a 32 bit machine, addressability is `0xFFFFFFFF` (32 bits). It's signed because you want to go backwards when you do: `&arr[3] - &arr[5]`. But this works even if unsigned. Pointer arithmetic is unsigned. So `PTRDIFF_MAX` may not be the best way to look at it. – Craig Estey Jul 24 '20 at 16:26
  • @CraigEstey re: 32 bits or 64: Yes, 32 or less. re: *Pointer arithmetic is unsigned*: That's good to know. I thought that because pointer arithmetic used `ptrdiff_t` it was signed. Where's that in the standard? – alx - recommends codidact Jul 24 '20 at 16:30
  • @CacahueteFrito: Where does the standard say that pointer arithmetic uses `ptrdiff_t`? The only statement is that the result of a pointer subtraction (*not* the computation) is returned as a `ptrdiff_t`. – rici Jul 24 '20 at 16:32
  • @rici That's what made me think in that direction. Maybe I supposed too much :) – alx - recommends codidact Jul 24 '20 at 16:33
  • @CacahueteFrito: You didn't read the next sentence: "if the expressions P and Q point to, respectively, the i-th and j-th elements of an array object, the expression (P)-(Q) has the value i−j provided the value fits in an object of type `ptrdiff_t`." – rici Jul 24 '20 at 16:34
  • It clearly says that it is `i-j` which must fit in a `ptrdiff_t`, not the difference between addresses. I'll add that quote language-lawyerly to my answer. – rici Jul 24 '20 at 16:35

1 Answer

The array index is not converted to ptrdiff_t.

Adding an integer to a pointer is done with the actual value (including signedness) of the integer addend, so array indexing is not a problem. (Of course, you have to ensure that the type of the index you provide is adequately wide and that the value is within the range of legal index values. But if that's the case, the compiler isn't going to change the type.)

The problem may arise when you are subtracting pointers; in that case, if the result of the subtraction is not representable as a ptrdiff_t, then the outcome is undefined. But remember that the result is the number of array elements, not the number of bytes. Of course, if it is an array of char, there's no difference, but for array element types whose size is greater than 1, overflow is not possible, assuming that ptrdiff_t is the signed version of size_t: an object of at most SIZE_MAX bytes then holds at most SIZE_MAX / 2 elements, which does not exceed PTRDIFF_MAX.

The standard is explicit about the result of pointer subtraction:

if the expressions P and Q point to, respectively, the i-th and j-th elements of an array object, the expression (P)-(Q) has the value i−j provided the value fits in an object of type ptrdiff_t. (§6.5.6p9).

That wording goes out of its way to emphasize that it is the difference between array indices, and not the difference of the addresses, which must fit in a ptrdiff_t.

rici
  • It's the count, but if you alias the array as a `char *` (even if the array is `int64_t[]`), the count you'll get is actually the size in bytes, which may overflow, right? – alx - recommends codidact Jul 24 '20 at 16:32
  • @CacahueteFrito: If you subtract two array pointers, then overflow is possible. If you add, it isn't. – rici Jul 24 '20 at 16:33
  • Furthermore, quality implementations recognize that the potential overflow of `ptrdiff_t`, both at the source level (technically UB but no reasonable way for programmers to avoid it) and in internal transformations made by the compiler (invalid but hard to avoid), is a serious problem, and *forbid objects larger than `PTRDIFF_MAX` from existing*. – R.. GitHub STOP HELPING ICE Jul 24 '20 at 20:01