Difference between dynamically created array (int *arr) and statically created array (int arr[]) when finding array size using pointer arithmetic

Question

#include <stdio.h>

int main(void)
{
    int arr[] = { 1, 2, 3, 4, 5 };

    //printf("The size of the array is %d", n); //assuming int 4 bytes
    printf("%p - %p: %d\n",(&arr)[1], arr, (&arr)[1] - arr);

    return 0;
}

I know that (&arr)[1] gives the address of the next block of memory (the address just after the last element of the array) and that arr holds the address of the first element. Hence the difference should give the number of bytes between them. The code above prints the size of the array, which is what it is supposed to do.

But I don't know why the difference comes out divided by 4. I thought it was because we are using %d, so I tried different format specifiers, but nothing seemed to work. Answering this is secondary.

Now, I tried the same thing using dynamic memory allocation as shown below:

#include<stdio.h>
#include<stdlib.h>

int main(){
    int *arr, i;
    arr = (int*)malloc(10*sizeof(int));
    for (i = 0; i < 10; i++){
        // scanf("%d", &arr[i]);
        arr[i] = i;
    }

    printf("%p - %p: %d\n",(&arr)[1], arr, (&arr)[1] - arr);
    return 0;
}

But the result is different. It gave a random value.

Then I tried:

#include<stdio.h>
#include<stdlib.h>

int main(){
    int *arr, *ptr, i;
    arr = (int*)malloc(10*sizeof(int));
    ptr = arr;
    for (i = 0; i < 10; i++){
        // scanf("%d", &ptr[i]);
        ptr[i]=i;
    }

    printf("%p - %p: %d\n",(&ptr)[1], ptr, (&ptr)[1] - ptr);
    return 0;
}

The result is 0. Of course, I tried the different possible combinations of ptr and arr in the print statement and the for loop. Everything gave 0. The obvious reason is that both hold the same address.

My assumption is that this behavior is due to the difference between a dynamically created array and a statically created array.

Can someone explain the reason?

Thanks in advance. Please correct me if I am asking anything wrong.

Edit: To make the question more clear, I changed %d to %p wherever required as per the suggestions of the community. Cheers!

You need to learn what an array is and what a pointer is. It is one of the most common mistakes and misunderstandings in C. — 0___________, Apr 07 '20 at 20:32
@P__J__ yeah. I am currently learning. I came across this by the way. Hence asked for help. — Rohit Babu, Apr 07 '20 at 20:34
https://stackoverflow.com/questions/9855482/pointer-address-difference/9855559 — Eraklon, Apr 07 '20 at 20:35
The first tricks the compiler into working with an int[5] array. The dynamic sample always works in int*, which is why it cannot work, and a void* has a fixed size (4, I would guess) — Mario The Spoon, Apr 07 '20 at 21:05

Enzo Ferber · Accepted Answer · 2020-04-10T17:48:44.610

Arrays

You are mixing up arrays and pointers, which are two different kinds of objects. Let's do a thought experiment:

Imagine that every object in C has an address in memory. So a declaration like int a; reserves a memory location for the variable a. You can see its address with &a, as in:

int a;
printf("&a: %p\n", (void *)&a);

Arrays are objects that hold sequences of same-type data. Therefore, an int array is a sequence of ints in memory. The address of the array is that of its first element (so you can access the array and dereference it).

That's also why you access the first element of an array with array[0] instead of array[1]. The index is a displacement from the base address, which is given by the name array: in most expressions it converts to a pointer to the first element in memory.
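The displacement idea can be made concrete with a small sketch. The helper names nth and subscript_matches_displacement are made up for illustration; the point is only that subscripting and "base plus index" read the same element.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: read element i by pointer displacement.
 * The language defines base[i] as *(base + i). */
static int nth(const int *base, size_t i)
{
    assert(base != NULL);
    return *(base + i);
}

/* Returns 1 when subscripting and displacement agree for every element. */
static int subscript_matches_displacement(void)
{
    int arr[] = { 1, 2, 3, 4, 5 };
    size_t n = sizeof arr / sizeof arr[0];

    for (size_t i = 0; i < n; i++) {
        if (arr[i] != nth(arr, i))
            return 0;
    }
    return 1;
}
```

Calling subscript_matches_displacement() from main should return 1 on any conforming implementation.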

 arr[0]
 |       arr[2]
 |       |
 v       v
 +---+---+---+---+---+
 | 0 | 1 | 2 | 3 | 4 |
 +---+---+---+---+---+

Back to your problem: at compile time the compiler knows the size of your array, which is fixed. In your code, it is 5 elements long. The elements have a type, int in your case. Therefore, the size of your array is 5 elements * 4 bytes per element = 20 bytes (assuming int is 4 bytes in your implementation).

So you can do all sorts of tricks with it, because the compiler knows the size of the array. For example, you can write a macro like #define SZ(n) (sizeof(n)/sizeof(n[0])) to get the size of the array in elements. Code to test:

/* so1.c
 */

#include <stdio.h>

#define SZ(n)       (sizeof(n)/sizeof(n[0]))

int main(void)
{
    int arr[] = {1, 2, 3, 4, 5};

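    printf("Array size (bytes)   : %zu\n", sizeof(arr));
    printf("Array size (elements): %zu\n", SZ(arr));

    printf("Address of arr      : %p\n", (void *)arr);
    printf("Address of &arr     : %p\n", (void *)&arr);
    printf("Address of arr[0]   : %p\n", (void *)&arr[0]);
    printf("Address of arr[4]   : %p\n", (void *)&arr[4]);
    /* &arr + 1 is the same address as (&arr)[1], without naming a nonexistent array */
    printf("Address of &arr + 1 : %p\n", (void *)(&arr + 1));

    /* pointer subtraction counts elements, so this prints 4 */
    printf("&arr[4] - &arr[0]   : %td\n", &arr[4] - &arr[0]);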

    return 0;
}

Notice that the addresses of arr[4] and arr[0] are 16 bytes apart, which is 4 int elements of 4 bytes each (pointer subtraction reports the distance in elements, not in bytes). You may wonder why it is not 20, since the array is 20 bytes long: the last element starts at an offset of 16 bytes from the start of the array and occupies the final 4 bytes.
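To see the two units side by side, here is a minimal sketch. The function names are invented for this example, and both pointers are assumed to point into the same array, as pointer subtraction requires.

```c
#include <assert.h>
#include <stddef.h>

/* Distance between two int pointers, counted in int elements. */
static ptrdiff_t distance_in_elements(const int *first, const int *last)
{
    assert(first != NULL && last != NULL);
    return last - first;
}

/* The same distance counted in bytes: subtract as char pointers. */
static ptrdiff_t distance_in_bytes(const int *first, const int *last)
{
    assert(first != NULL && last != NULL);
    return (const char *)last - (const char *)first;
}
```

For int arr[5], distance_in_elements(&arr[0], &arr[4]) is 4, while distance_in_bytes(&arr[0], &arr[4]) is 4 * sizeof(int), which is 16 when int is 4 bytes.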

Pointer to arrays and Address-of Operator

There's also something very important to consider. Given the properties of the & operator, when you apply it to an array of type T [n], the result is a pointer of type T(*)[n], which behaves differently in pointer arithmetic, and you should read this answer to understand the difference.

arr + 1  (arr decays to T *)   => address of arr + sizeof(T) bytes
&arr + 1 (type T(*)[n])        => address of arr + n * sizeof(T) bytes

And this is the reason why tricks like (&arr)[1] - 1 will point to the last element in the array on most implementations. Forming (&arr)[1] is not strictly defined by the standard, though, so it is not portable and not recommended practice.
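A rough sketch of the two step sizes, measured in bytes. The constant N and both function names are assumptions made for this example only.

```c
#include <assert.h>
#include <stddef.h>

enum { N = 5 };  /* illustrative array length */

/* Bytes advanced by `arr + 1`: arr decays to int *, so the step is one int. */
static size_t element_stride(int (*pa)[N])
{
    assert(pa != NULL);
    int *first = *pa;  /* what arr becomes after decay */
    return (size_t)((char *)(first + 1) - (char *)first);
}

/* Bytes advanced by `&arr + 1`: the pointee is the whole array. */
static size_t array_stride(int (*pa)[N])
{
    assert(pa != NULL);
    return (size_t)((char *)(pa + 1) - (char *)pa);
}
```

Given int arr[N], element_stride(&arr) equals sizeof(int) and array_stride(&arr) equals sizeof arr.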

Pointers

Now consider the following program, which uses dynamic memory:

/* so2.c
 */

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    int *p = NULL;  /* initialized: reading an uninitialized pointer is undefined */
    int i;

    printf("p: %p\n", (void *)p);
    printf("&p: %p\n", (void *)&p);

    if(!(p = malloc(5 * sizeof *p))) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }

    printf("AFTER malloc()\n");
    printf("p : %p\n", (void *)p);
    printf("&p: %p\n", (void *)&p);

    for(i = 0; i < 5; i++) {
        p[i] = i + 1;
    }

    printf("&p[0]: %p\n", (void *)&p[0]);
    printf("&p[4]: %p\n", (void *)&p[4]);

    free(p);
    return 0;
}

Before malloc, p does not point anywhere useful. An automatic pointer that is never initialized has an indeterminate value, so reading it is undefined; initialize it to NULL if you want to print it, in which case you see a null pointer. However, the pointer itself has an address (&p), because it has to be stored somewhere. After the malloc call, you see that p now has a value. That value is the address of a chunk of memory that the system has given us.

The compiler has no way of knowing the size of that chunk of memory at compile time (how would it know the address of something the operating system just gave us on the fly?!). In fact, the size may differ from the one you requested. malloc(n) and its siblings only guarantee that the memory chunk returned will be sufficient to accommodate at least n bytes of stuff, but they do not guarantee the size of the chunk (or that the memory is available!). From the malloc(3) manual:

By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. In case it turns out that the system is out of memory, one or more processes will be killed by the OOM killer. For more information, see the description of /proc/sys/vm/overcommit_memory and /proc/sys/vm/oom_adj in proc(5), and the Linux kernel source file Documentation/vm/overcommit-accounting.rst.

Therefore, there is no standard way to recover the size of the chunk from the pointer alone; you have to keep track of it yourself.
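One common way to cope with this is to store the element count next to the pointer. The following is only a sketch: struct int_buffer and its two functions are hypothetical names, not part of any library, and overflow of the size multiplication is not checked.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical wrapper: the element count travels with the pointer. */
struct int_buffer {
    int   *data;
    size_t count;
};

/* Allocate `count` ints; on failure data is NULL and count is 0. */
static struct int_buffer int_buffer_new(size_t count)
{
    struct int_buffer buf = { NULL, 0 };

    buf.data = malloc(count * sizeof *buf.data);
    if (buf.data != NULL)
        buf.count = count;
    return buf;
}

/* Release the memory and reset the bookkeeping. */
static void int_buffer_free(struct int_buffer *buf)
{
    assert(buf != NULL);
    free(buf->data);
    buf->data = NULL;
    buf->count = 0;
}
```

After struct int_buffer b = int_buffer_new(10);, the length is available as b.count wherever b is passed, which is information that sizeof or pointer arithmetic on b.data cannot give you.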

Sizeof (Pointer)

When you apply sizeof to a pointer, it merely returns the size of a pointer on the given architecture, which is typically 4 bytes on a 32-bit machine and 8 bytes on a 64-bit machine. And when you apply the address-of operator & to a pointer of type T *, it yields a pointer-to-pointer (T **), which is a pointer nonetheless and typically has the same size as a regular pointer T *, because both hold a memory address.

Warning: I know that talking about memory addresses is an over-simplification, because the C standard does not talk about the specifics of implementations. However, it really helps to think at this low level first and then work your way up through the trivialities and idiosyncrasies of the standard. The chapter on pointers in K&R C is the best reference you can get - read it and do the exercises.
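A short sketch of the contrast, with function names invented for the example: sizeof applied to a real array yields the whole array's size, while sizeof applied to a pointer (including one obtained by passing an array to a function) yields only the pointer's size.

```c
#include <assert.h>
#include <stddef.h>

/* Inside a function that receives a pointer, sizeof sees only the pointer. */
static size_t size_seen_by_callee(int *param)
{
    assert(param != NULL);
    return sizeof param;  /* size of an int *, not of the caller's array */
}

/* Where the array itself is in scope, sizeof sees the whole array. */
static size_t size_seen_by_owner(void)
{
    int arr[5] = { 1, 2, 3, 4, 5 };

    (void)arr;            /* only its size is needed here */
    return sizeof arr;    /* 5 * sizeof(int) */
}
```

Passing a five-element array to size_seen_by_callee returns sizeof(int *), whatever the array length, whereas size_seen_by_owner returns 5 * sizeof(int).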

score 1 · Answer 2 · edited Jun 20 '20 at 09:12

Explaining the Observed Results

The Results With Arrays

After int arr[] = { 1, 2, 3, 4, 5 };, arr is an array of five int.

Then &arr is a pointer to an array of five int, and (&arr)[1] would be the array of five int after arr, if there were one. For purposes of explaining the results you saw, let’s assume for the moment there is one. (Below, I will explain without this assumption.)

As an array, (&arr)[1] is automatically converted to a pointer to its first element.¹ So (&arr)[1] acts as a pointer to the first int in the array of five int that follows arr in memory.

Similarly, since arr is an array of five int, it is converted to a pointer to its first element. So arr acts as a pointer to the first int in it.

When you print these with %d, the program might print the memory address that is the value of the pointer, or part of it. (%d is the wrong conversion specifier to use. See below.) If so, you will see the actual addresses as raw memory addresses, typically measured in bytes.

In (&arr)[1] - arr, you subtract these two pointers. When you subtract two pointers in C, the result is the number of array elements between the two locations. It is not the number of bytes. The C standard requires the C implementation to provide the result as a number of elements, even if it has to perform a division to convert from bytes to array elements.

Since (&arr)[1] (after automatic conversion) points to the first int in an array after the array of five int that is arr, and arr (after conversion) points to the first int in arr, they differ by five int, and so the result is five. This is what you saw printed, although you should use %td to print the result of pointer subtraction, not %d.

The Results With Pointers

After int *arr; arr = (int*)malloc(10*sizeof(int));, arr is a pointer to an int. Then &arr is a pointer to that pointer, and (&arr)[1] would be the pointer after arr, if there were one. When you print the raw memory address of arr, you will see the value returned by malloc. However, when you print (&arr)[1], we do not know what you will see—there is no pointer after arr, and your C implementation might print whatever value is in memory after arr, but we do not know what that is. And, since we do not know what the value of (&arr)[1] will be, we do not know what the value of (&arr)[1] - arr will be.

With your ptr = arr; case, the same as above is true—there is no proper (&ptr)[1], so we do not know what will be printed. A possible reason that “0” was printed when you tried it is that the compiler happened to put arr in memory just after ptr, so (&ptr)[1] was arr, and then (&ptr)[1] - ptr is arr - ptr, and that is zero since you set ptr equal to arr.
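As a hedged illustration of why the malloc size never enters the picture, the sketch below measures how far &p + 1 is from &p for a pointer object p. The function name is made up, and only addresses are computed; nothing beyond p is read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Bytes between &p and &p + 1 for a pointer object p.
 * Only the two addresses are formed; the memory after p is never read. */
static size_t step_past_pointer(int **pp)
{
    assert(pp != NULL);
    return (size_t)((char *)(pp + 1) - (char *)pp);
}
```

Whether p holds the result of malloc(10 * sizeof(int)) or NULL, step_past_pointer(&p) is sizeof(int *). The location (&p)[1] is just the next pointer-sized slot after p, unrelated to the allocated block.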

Explaining What the C Standard Says and Correcting the Code

Proper Use of Pointers and Referring to Objects

As stated above, (&arr)[1] refers to an array of five int after arr, but no such array has been defined. Because of this, the behavior of (&arr)[1] is not defined by the C standard. In consequence, the behavior of printf("%d - %d: %d\n",(&arr)[1], arr, (&arr)[1] - arr); is not defined by the C standard.

Instead, you could use (&arr + 1). This points “one beyond” the array arr. That is, it points to where the next array of five int would be if there were one. That is the same place (&arr)[1] would be, but (&arr+1) is defined because doing pointer arithmetic up to “just beyond” an object is defined by the C standard. (&arr)[1] is not defined because it does not just do pointer arithmetic but is technically a reference to an object that does not exist: it is technically a use of that object even though it is immediately converted to a pointer. Pointer arithmetic just after an object is defined, but use of the hypothetical object just after a single object is not defined.

Another alternative is &(&arr)[1]. This takes the address of (&arr)[1], which would still be an improper reference to an object that does not exist except that the definition of & is such that it cancels the * that is implicit in the subscript operator. So &(&arr)[1] is defined to be (&arr + 1) even though (&arr)[1] is not defined.
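For illustration, here is one way to get the element count from the defined one-past pointer. This is a sketch under assumptions: LEN and count_via_one_past are names chosen for the example, and the subtraction is done on char pointers so that only the byte distance across the array object is measured.

```c
#include <assert.h>
#include <stddef.h>

enum { LEN = 5 };  /* illustrative array length */

/* Element count computed from the one-past-the-end pointer pa + 1
 * (that is, &arr + 1). The byte distance is scaled by the element size. */
static size_t count_via_one_past(int (*pa)[LEN])
{
    assert(pa != NULL);
    const char *begin = (const char *)pa;
    const char *end   = (const char *)(pa + 1);

    return (size_t)(end - begin) / sizeof (*pa)[0];
}
```

For int arr[LEN], count_via_one_past(&arr) gives 5, the same value as sizeof arr / sizeof arr[0], which remains the simpler idiom.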

Correct Printf Conversions

To print a pointer p, use printf("%p", (void *) p);.

To print the result of subtracting pointers p and q, use printf("%td", p-q);.

So, a correct printf for your first case can be:

printf("%p - %p: %td\n", (void *) (&arr+1), (void *) &arr, (&arr+1) - &arr);

or:

printf("%p - %p: %td\n", (void *) (arr+5), (void *) arr, (arr+5) - arr);

The first will print the addresses of the two arrays and the difference between them in units of arrays of five int. That difference will be one.

The second will print the address of the int just beyond the array arr and the address of the first int in arr and the difference between them in units of int. That difference will be five. The two addresses in this printf will be the same as the addresses in the first printf, because they are pointing to the same place. (Note: The C standard permits C implementations to have multiple ways of representing pointers, so it is possible the addresses could appear to be different when printed in this way. However, in most common C implementations, they will appear identical.)

Your second and third cases cannot readily be corrected, because they both rely on using the value of an object beyond a defined single object (a pointer). We could correct the first case because it only uses the address of an object beyond a defined object, and there are ways to use that address in a defined manner. Since the second and third cases attempt to use the value of an object that does not exist, not just its address, they are inherently not defined.

Footnote

¹ When used in an expression, any array is automatically converted to a pointer to its first element except when it is the operand of sizeof, is the operand of unary &, or is a string literal used to initialize an array. This conversion occurs whether the array is directly named, as arr, or is the result of an expression, as (&arr)[1].

I think we defined five `ints` not six. You may have to edit wherever required. — Rohit Babu, Apr 08 '20 at 07:23
Based on your explanation, I tried `printf("%p - %p: %td\n", (void *) (arr+9), (void *) arr, (arr+9) - arr);` with the same declaration of five ints. It gave 9 as result which is conflicting with your explanation of ' you are allowed to do pointer arithmetic up to “just beyond” an object.' — Rohit Babu, Apr 08 '20 at 07:27
@RohitBabu: Thanks, I fixed the six/five error. By “allowed,” I meant that is what is defined by the C standard. I edited to be clear about that. As long as you do pointer arithmetic within the bounds of an array and one element just after it, the behavior is defined by the standard. (When doing arithmetic with a single object, it acts like an array of one element, so `&x+1` is defined for any object `x`.) When you go beyond that, the behavior is not defined by the standard. That does not mean it will not give the result you want. It means the standard does not say what will happen. — Eric Postpischil, Apr 08 '20 at 08:49
@RohitBabu: You cannot reliably test whether something is defined by the standard by trying it. When the C standard says something is not defined, that means the standard does not impose **any** requirements for the behavior. It might fail with an error message, it might print results different from what you think the expected behavior would be, it might print the results you expected, or it might do something else. So the fact a program prints what you expected is consistent with the fact that the C standard does not impose any requirements on its behavior. — Eric Postpischil, Apr 08 '20 at 08:56

score -1 · Answer 3 · answered Apr 07 '20 at 21:01

First we have to understand some concepts

The first one is malloc. The name comes from "memory allocation": you are reserving memory. But for what? For an array. An array is a contiguous chunk of memory, like PJ said. In your case you are reserving 10 memory slots (10 * sizeof(int)), that is, space for ten ints.

The second one is the dereference operator (*) and the address-of operator (&). A pointer can be used to access the variable it points to directly.

If you declare a variable, then:

myvar == 25; &myvar == (its memory address, some arbitrary number)

In your case you have the memory address, and you do not care what the address itself is; you want the contents of that piece of memory. With *(the variable that contains the address you want to access), you can reach the value:

foo == 1776 (an arbitrary address); *foo == 25 (the value stored at that memory address)
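In C, the same idea might look like the sketch below. The two helper names are invented for the example; myvar and foo in the usage note follow the notation above.

```c
#include <assert.h>
#include <stddef.h>

/* Store `value` in the int that `target` points to. */
static void store_through(int *target, int value)
{
    assert(target != NULL);
    *target = value;  /* dereference: write to the pointed-to object */
}

/* Read the int that `source` points to. */
static int load_through(const int *source)
{
    assert(source != NULL);
    return *source;   /* dereference: read the pointed-to object */
}
```

With int myvar = 25; int *foo = &myvar;, load_through(foo) yields 25, and store_through(foo, 30) changes myvar itself to 30.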

here's documentation that helps http://cplusplus.com/doc/tutorial/pointers/

happy coding! :)