The C programming language is defined in terms of an abstract machine. The behaviour of a program is described as if it were executed on an abstract machine that has the same characteristics as the target environment. The C standard specifies that in this abstract machine storage is guaranteed to be reserved for objects for their lifetime, so
int array[1000000];
will have sizeof (int) * 1000000 bytes of memory reserved for its lifetime (which lasts until the end of the scope in which the array was defined), and so does the object allocated with
int *array = malloc(sizeof (int) * 1000000);
where the lifetime ends at the corresponding free. That's the theory.
However, the standard also says that a compiler is conforming as long as the program it produces behaves, as far as its observable behaviour is concerned, as if it were run on the abstract machine according to its rules. This is called the as-if rule. So in fact if you write something like
for (int i = 0; i < 100; i++) {
    int *p = malloc(sizeof (int) * 1000000);
}
the compiler can produce an executable that does not call malloc at all, since the returned pointer is never used. Or if you use only p[0], it can notice that you could make do with a plain int p_0 instead and use that for all calculations. Or anything in between. See this program for an example:
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *array = malloc(1000000);
    int tmp;
    scanf("%d", &tmp);
    array[0] = tmp;
    array[1] = array[0] + tmp;
    printf("%d %d\n", array[0], array[1]);
}
Compiled with GCC 9.1 at -O3 for x86-64, it produces
.LC0:
        .string "%d"
.LC1:
        .string "%d %d\n"
main:
        sub     rsp, 24
        mov     edi, OFFSET FLAT:.LC0
        xor     eax, eax
        lea     rsi, [rsp+12]
        call    __isoc99_scanf
        mov     esi, DWORD PTR [rsp+12]
        mov     edi, OFFSET FLAT:.LC1
        xor     eax, eax
        lea     edx, [rsi+rsi]
        call    printf
        xor     eax, eax
        add     rsp, 24
        ret
which has 2 call instructions: one for scanf and one for printf, but none for malloc! And how about
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int array[1000000];
    int tmp;
    scanf("%d", &tmp);
    array[0] = tmp;
    array[1] = array[0] + tmp;
    printf("%d %d\n", array[0], array[1]);
}
The output is
.LC0:
        .string "%d"
.LC1:
        .string "%d %d\n"
main:
        sub     rsp, 24
        mov     edi, OFFSET FLAT:.LC0
        xor     eax, eax
        lea     rsi, [rsp+12]
        call    __isoc99_scanf
        mov     esi, DWORD PTR [rsp+12]
        mov     edi, OFFSET FLAT:.LC1
        xor     eax, eax
        lea     edx, [rsi+rsi]
        call    printf
        xor     eax, eax
        add     rsp, 24
        ret
which is identical.
In practice you cannot depend on any such behaviour, as none of it is guaranteed; it is merely a possibility that compilers are allowed to exploit when optimizing.
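Since the elision is not guaranteed, the code should still be written as if the allocation really happens. A minimal sketch of the same calculation under that assumption follows; the function names doubled_pair and doubled_pair_example are made up for illustration and do not come from the programs above. It checks the result of malloc and releases the memory with free, so it is correct whether or not the compiler removes the allocation.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: performs the same calculation as the example
   programs, but treats the allocation as real. Returns 0 on success
   and -1 if malloc fails. */
int doubled_pair(int tmp, int out[2])
{
    int *array = malloc(sizeof (int) * 1000000);
    if (array == NULL) {
        return -1;              /* the allocation may really fail */
    }

    array[0] = tmp;
    array[1] = array[0] + tmp;

    out[0] = array[0];
    out[1] = array[1];

    free(array);                /* lifetime of the allocated object ends here */
    return 0;
}

/* Hypothetical usage: if the allocation succeeded, the results are known. */
void doubled_pair_example(void)
{
    int out[2];
    if (doubled_pair(21, out) == 0) {
        assert(out[0] == 21 && out[1] == 42);
    }
}
```

An optimizing compiler is still free to drop the malloc/free pair here under the as-if rule, but nothing in the code relies on it doing so.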
Notice that for global objects with external linkage, the compiler cannot know whether other translation units to be linked depend on the array having its defined size, so it will often have to produce output that actually contains the array.
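The linkage distinction can be sketched as follows. The names shared_array, private_array, fill_and_sum and linkage_example are hypothetical and chosen only for this illustration. One would expect a compiler to have to keep shared_array, because another translation unit could declare it with extern, whereas private_array is visible only in this file, so the compiler sees every use of it and might be able to shrink or remove it. Neither outcome is guaranteed.

```c
#include <assert.h>

enum { ARRAY_LEN = 1000000 };

/* External linkage: another translation unit could contain
   `extern int shared_array[1000000];` and access any element. */
int shared_array[ARRAY_LEN];

/* Internal linkage: only this translation unit can name the array,
   so all of its uses are visible to the compiler. */
static int private_array[ARRAY_LEN];

/* Hypothetical helper that uses only the first two elements. */
int fill_and_sum(int value)
{
    private_array[0] = value;
    private_array[1] = private_array[0] + value;
    shared_array[0] = private_array[1];
    return private_array[0] + private_array[1];
}

/* Hypothetical usage: the observable results are the same whatever
   the compiler decides to do with the storage. */
void linkage_example(void)
{
    int sum = fill_and_sum(2);
    assert(sum == 6);
    assert(shared_array[0] == 4);
}
```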