Explaining the Observed Results
The Results With Arrays
After int arr[] = { 1, 2, 3, 4, 5 };, arr is an array of five int.
Then &arr is a pointer to an array of five int, and (&arr)[1] would be the array of five int after arr, if there were one. For purposes of explaining the results you saw, let’s assume for the moment there is one. (Below, I will explain without this assumption.)
As an array, (&arr)[1] is automatically converted to a pointer to its first element (see footnote 1). So (&arr)[1] acts as a pointer to the first int in the array of five int that follows arr in memory.
Similarly, since arr is an array of five int, it is converted to a pointer to its first element. So arr acts as a pointer to the first int in it.
When you print these with %d, the program might print the memory address that is the value of the pointer, or part of it. (%d is the wrong conversion specifier to use. See below.) If so, you will see raw memory addresses, which are typically measured in bytes.
In (&arr)[1] - arr, you subtract these two pointers. When you subtract two pointers in C, the result is the number of array elements between the two locations. It is not the number of bytes. The C standard requires the C implementation to provide the result as a number of elements, even if it has to perform a division to convert from bytes to array elements.
Since (&arr)[1] (after automatic conversion) points to the first int in an array after the array of five int that is arr, and arr (after conversion) points to the first int in arr, they differ by five int, and so the result is five. This is what you saw printed, although you should use %td to print the result of pointer subtraction, not %d.
The Results With Pointers
After int *arr; arr = (int*)malloc(10*sizeof(int));, arr is a pointer to an int. Then &arr is a pointer to that pointer, and (&arr)[1] would be the pointer after arr, if there were one. When you print arr as a raw memory address, you will see the value returned by malloc. However, when you print (&arr)[1], we do not know what you will see: there is no pointer after arr, and your C implementation might print whatever value is in memory after arr, but we do not know what that is. And, since we do not know what the value of (&arr)[1] will be, we do not know what the value of (&arr)[1] - arr will be.
With your ptr = arr; case, the same as above is true: there is no proper (&ptr)[1], so we do not know what will be printed. A possible reason that “0” was printed when you tried it is that the compiler happened to put arr in memory just after ptr, so (&ptr)[1] was arr, and then (&ptr)[1] - ptr is arr - ptr, and that is zero since you set ptr equal to arr.
Explaining What the C Standard Says and Correcting the Code
Proper Use of Pointers and Referring to Objects
As stated above, (&arr)[1] refers to an array of five int after arr, but no such array has been defined. Because of this, the behavior of (&arr)[1] is not defined by the C standard. In consequence, the behavior of printf("%d - %d: %d\n",(&arr)[1], arr, (&arr)[1] - arr); is not defined by the C standard.
Instead, you could use (&arr + 1). This points “one beyond” the array arr. That is, it points to where the next array of five int would be if there were one. That is the same place (&arr)[1] would be, but (&arr+1) is defined because doing pointer arithmetic up to “just beyond” an object is defined by the C standard. (&arr)[1] is not defined because it does not just do pointer arithmetic: it is technically a use of an object that does not exist, even though it is immediately converted to a pointer. Pointer arithmetic just after an object is defined, but use of the hypothetical object just after a single object is not defined.
Another alternative is &(&arr)[1]. This takes the address of (&arr)[1], which would still be an improper reference to an object that does not exist except that the definition of & is such that it cancels the * that is implicit in the subscript operator. So &(&arr)[1] is defined to be (&arr + 1) even though (&arr)[1] is not defined.
Correct Printf Conversions
To print a pointer p, use printf("%p", (void *) p);.
To print the result of subtracting pointers p and q, use printf("%td", p-q);.
So, a correct printf for your first case can be:
printf("%p - %p: %td\n", (void *) (&arr+1), (void *) &arr, (&arr+1) - &arr);
or:
printf("%p - %p: %td\n", (void *) (arr+5), (void *) arr, (arr+5) - arr);
The first will print the addresses of the two arrays and the difference between them in units of arrays of five int. That difference will be one.
The second will print the address of the int just beyond the array arr and the address of the first int in arr and the difference between them in units of int. That difference will be five. The two addresses in this printf will be the same as the addresses in the first printf, because they are pointing to the same place. (Note: The C standard permits C implementations to have multiple ways of representing pointers, so it is possible the addresses could appear to be different when printed in this way. However, in most common C implementations, they will appear identical.)
Your second and third cases cannot readily be corrected, because they both rely on using the value of an object beyond a defined single object (a pointer). We could correct the first case because it only uses the address of an object beyond a defined object, and there are ways to use that address in a defined manner. Since the second and third cases attempt to use the value of an object that does not exist, not just its address, they are inherently not defined.
Footnote
1 When used in an expression, any array is automatically converted to a pointer to its first element except when it is the operand of sizeof, is the operand of unary &, or is a string literal used to initialize an array. This conversion occurs whether the array is directly named, as arr, or is the result of an expression, as (&arr)[1].