Why is this code involving arrays and pointers behaving as it does?

Question

I was asked what the output of the following code is:

int a[5] = { 1, 3, 5, 7, 9 };
int *p = (int *)(&a + 1);
printf("%d, %d", *(a + 1), *(p - 1));

1) 3, 9
2) Error
3) 3, 1
4) 2, 1

The answer is option 1.

It is easy to see that *(a + 1) is 3.

But what about int *p = (int *)(&a + 1); and *(p - 1)?

If you remove the cast `(int *)` the warning explains what is going on. `initialization of ‘int *’ from incompatible pointer type ‘int (*)[5]’` – stark, Oct 07 '21 at 12:38
Looks like an exam question to me. I hope that one thing you learn from this question and how difficult you find it to answer is that it is not a good idea to do pointer-juggling like this in a real-world program. – Philipp, Oct 08 '21 at 09:06
Why would you do &a on an array in local scope? That's just plain silly. Maybe the specs account for silly, maybe they don't. – Captain Giraffe, Oct 08 '21 at 19:34
The key here is that in `&a`, the name `a` does **not** decay to a pointer to its first element, so it is the address of **the array**. Its type is pointer to array of 5 `int`. – Pete Becker, Oct 09 '21 at 09:05
Did any of the answers address your question? If so, consider [accepting one of them](https://stackoverflow.com/help/accepted-answer). – dbush, Nov 02 '21 at 13:25

score 39 · Answer 1 · edited Oct 07 '21 at 13:44

The answer to this could be either "1) 3,9" or "2) Error" (or more specifically undefined behavior) depending on how you read the C standard.

First, let's take this:

&a + 1

The & operator takes the address of the array a giving us an expression of type int(*)[5] i.e. a pointer to an array of int of size 5. Adding 1 to this treats the pointer as pointing to the first element of an array of int [5], with the resulting pointer pointing to just after a.

Also, even though &a points to a singular object (in this case an array of type int [5]) we can still add 1 to this address. This is valid because 1) a pointer to a singular object can be treated as a pointer to the first element of an array of size 1, and 2) a pointer may point to one element past the end of an array.

Section 6.5.6p7 of the C standard states the following regarding treating a pointer to an object as a pointer to the first element of an array of size 1:

For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type.

And section 6.5.6p8 says the following regarding allowing a pointer to point to just past the end of an array:

When an expression that has integer type is added to or subtracted from a pointer, the result has the type of the pointer operand. If the pointer operand points to an element of an array object, and the array is large enough, the result points to an element offset from the original element such that the difference of the subscripts of the resulting and original array elements equals the integer expression. In other words, if the expression P points to the i-th element of an array object, the expressions (P)+N (equivalently, N+(P)) and (P)-N (where N has the value n) point to, respectively, the i+n-th and i−n-th elements of the array object, provided they exist. Moreover, if the expression P points to the last element of an array object, the expression (P)+1 points one past the last element of the array object, and if the expression Q points one past the last element of an array object, the expression (Q)-1 points to the last element of the array object. If both the pointer operand and the result point to elements of the same array object, or one past the last element of the array object, the evaluation shall not produce an overflow; otherwise, the behavior is undefined. If the result points one past the last element of the array object, it shall not be used as the operand of a unary * operator that is evaluated.

Now comes the questionable part, which is the cast:

(int *)(&a + 1)

This converts the pointer of type int(*)[5] to type int *. The intent here is to change the pointer which points to the end of the 1-element array of int [5] to the end of the 5-element array of int.

However, the C standard isn't clear on whether this conversion and the subsequent operations on the result are allowed. It does allow converting a pointer to one object type into a pointer to another object type and back, provided the pointer is properly aligned for the target type. While alignment shouldn't be an issue here, using this pointer is iffy.

So this pointer is assigned to p:

int *p = (int *)(&a + 1)

Which is then used as follows:

*(p - 1)

If we assume that p validly points to one element past the end of the array a, subtracting 1 from it results in a pointer to the last element of the array. The * operator then dereferences this pointer to the last element, yielding the value 9.

So if we assume that (int *)(&a + 1) results in a valid pointer, then the answer is 1) 3,9; otherwise, the answer is 2) Error.

edited Oct 07 '21 at 13:44

Andreas Wenzel

answered Oct 07 '21 at 12:47

dbush

_using this pointer is iffy_ So, you also don't believe that `offsetof` is useful in C? – Language Lawyer Oct 08 '21 at 09:27
@LanguageLawyer `offsetof` is part of the standard, so how it does what it does doesn't matter from a user perspective. – dbush Oct 08 '21 at 11:47
How is your answer related to my comment? Did I write that `offsetof` is not implementable? Your analysis of what can be done to a casted pointer means that one can't cast to a pointer to `char`, add some offset to it and cast to a pointer to some object type to try to get a pointer to a struct member at the offset. – Language Lawyer Oct 08 '21 at 13:35
@LanguageLawyer You implied that I thought it wasn’t implementable, and in fact your most recent comment seems to back that up. Regarding what can be done with a casted pointer, casting to a `char *` (but not from) is given special treatment, but there’s no such pointer involved in this example. – dbush Oct 08 '21 at 13:44
_your most recent comment seems to back that up_ How? _Regarding what can be done with a casted pointer, casting to a char * (but not from) is given special treatment but there’s no such pointer involved in this example_ So, you won't generalize your argument that if the standard doesn't explicitly say that doing X to a casted pointer is allowed, then doing X to it results in UB? – Language Lawyer Oct 08 '21 at 17:06
is `sizeof(int[5])` guaranteed to be equal to `sizeof(int)*5` without padding? If not, that's another failure mode – thegreatemu Oct 08 '21 at 17:54
I suspect this is technically UB for the reasons you mention (I agree it is difficult to point out why). However, the output `3, 9` is far more likely than the *output* "Error". Even a SEGV will not "output Error" in any meaningful sense. What a poorly written question (in the test, not by the OP). – abligh Oct 09 '21 at 06:54
@dbush You said about incrementing a single-val. ptr., "This is valid because a pointer to a singular object can be treated as a pointer to an array of size 1", but an array of size 1 would include an extra byte for the null byte, wouldn't it? wouldn't this be the difference between `NULL` return & `SIGSEGV` error (segmentation fault)? I am new at C/Cpp as well. I am trying to understand by way of questioning, not trying to question your understanding. thx – Nate T Oct 09 '21 at 07:07
@NateT _an array of size 1 would include an extra byte for the null byte_ What is «null byte»? – Language Lawyer Oct 09 '21 at 08:21
@LanguageLawyer Are all C arrays not null-terminated as are c-string arrays? – Nate T Oct 09 '21 at 08:24
Nevermind answered my own question. I was assuming the answer was yes, but I was wrong. – Nate T Oct 09 '21 at 08:26
If we have undefined behavior, couldn't the answer be 1), 2), 3) and 4) then? Why only 1) and 2)? – Thomas Weller Oct 09 '21 at 16:29
@ThomasWeller: The answer already mentions in its first sentence that "Error" should be renamed to "undefined behavior". You are right that in the case of undefined behavior, the answers are not mutually exclusive. – Andreas Wenzel Oct 09 '21 at 16:31
It isn't clear to me that this leads to undefined behaviour. The first quote from the specs says that &a can be interpreted as a pointer to an array of length one of type int(*)[5], in which case (&a + 1) is a valid pointer to one off the end of that array. When converted to an int* it is still one off the end of the original array of five integers, and hence still valid. 6.5.2.1 (parts 3 and 4) on multidimensional arrays suggests that it is valid AFAICS. Definitely code you shouldn't write, but programmers sometimes need to maintain or refactor horrible code written by others. – Dikran Marsupial Oct 10 '21 at 09:06
... although the "conceptually" in "Then i is adjusted according to the type of x, which conceptually entails multiplying i by the size of the object to which the pointer points, namely an array of five int objects." leaves room for doubt. – Dikran Marsupial Oct 10 '21 at 09:11

Andreas Wenzel · Answer 2 · 2021-10-09T15:24:41.180

In the line

int *p = (int *)(&a + 1);

note that &a is being written, not a. This is important.

If simply a had been written, then the array would have decayed to a pointer to the first element, i.e. to &a[0]. However, since the expression &a was used instead, the result of this expression has the same value as if a or &a[0] had been used, but the type is different: The type is a pointer to an array of 5 int elements, instead of a pointer to a single int element.

According to the rules on pointer arithmetic, incrementing a pointer by 1 will increase the memory address by the size of the object that it is pointing to. Since the pointer is not pointing to a single element, but to an array of 5 elements, the memory address will be incremented by 5 * sizeof(int). Therefore, after incrementing the pointer, the value of (but not type of) the pointer will be equivalent to &a[5], i.e. one past the end of the array.

After casting this pointer to int * and assigning the result to p, the expression p is fully equivalent to &a[5] (both in value and in type).

Therefore, the expression *(p - 1) is equivalent to *(&a[5] - 1), which is equivalent to *(&a[4]), or simply a[4].

score 13 · Answer 3 · edited Oct 07 '21 at 13:50

This:

&a + 1;

is taking the address of a, an array, and adding 1, which adds the size of one a, i.e. 5 integers. Then the subtraction p - 1 "backs down" by one integer, ending up at the final element of a.

edited Oct 07 '21 at 13:50

Jonathan Leffler

answered Oct 07 '21 at 12:21

unwind

score 7 · Answer 4 · answered Oct 07 '21 at 12:45

Normally whenever arrays are used in expressions, they "decay" into a pointer to the first element. There are a few exceptions to this rule and one such exception is the & operator.

&a therefore yields a pointer to the array of type int (*)[5]. Then &a + 1 is pointer arithmetic on such a type, meaning the pointer address is increased by the size of one int [5]. We end up pointing just beyond the array, but C actually allows us to do that as long as we don't de-reference that location.

The pointer is then explicitly converted to int * by the cast, which we can also do: C allows pretty much any manner of wild pointer conversion as long as we don't de-reference the result or cause misalignment, etc.

p - 1 does pointer arithmetic on an int *, and the actual type of the data in the array is also int, so we are allowed to de-reference that location. We end up at the last item of the array.
