8

Consider the following code:

int array[1000000];
array[1] = 1;
array[10] = 10;

We've statically allocated memory for this array with size 1000000. But is all of this used?

I mean if we did this instead:

int *array = malloc(sizeof(int) * 1000000);
array[1] = 1;
array[10] = 10;

Then it seems that we actually use all of those 1000000 spaces: since we have allocated them, they can't be used for anything else in memory. But for the statically allocated array, things are uninitialized, so things can still be stored in the remaining 999998 spots that do not have a set value.

Essentially what I'm asking is: will int array[num] use less memory than int array = malloc(sizeof(int) * num).

Aykhan Hagverdili
  • 28,141
  • 6
  • 41
  • 93
herophant
  • 642
  • 7
  • 16
  • [Here's](https://stackoverflow.com/a/15238783/1173390) a good explanation on what's the difference between `static` vs `malloc`. – ClydeFrog Aug 27 '19 at 05:32

5 Answers5

7

If you have int array[1000000]; and only use a few of its initial members, then in some circumstances (if array is either static or a local of if it's a global and you're linking statically with link time optimizations) your toolchain can shrink/eliminate the array under the as-if rule. (Note that global and static variables are not in effect uninitialized--the C standard mandates that they be zero-initialed.)

Gcc and clang do it, and clang does it even with mallocated arrays to the point that the entire malloc-free pair can be eliminated:

Example:

#include <stdio.h>
#include <stdlib.h>

//gcc and clang optimize out the array, only using
//the constants 1 and 10
int pr(void)
{
   int array[1000000];
   array[1] = 1;
   array[10] = 10;
   return printf("%d %d", array[1], array[10]);
}
//clang optimizes out the dynamic allocation of array,
//only using the constants 1 and 10
int pr1(void)
{
   int r;
   int *array = malloc(1000000);
   if(!array) return -1;
   array[1] = 1;
   array[10] = 10;
   r = printf("%d %d", array[1], array[10]);

   free(array);
   return r;
}

Example output assembly on x86-64 clang with -O3:

pr:                                     # @pr
        mov     edi, offset .L.str
        mov     esi, 1
        mov     edx, 10
        xor     eax, eax
        jmp     printf                  # TAILCALL
pr1:                                    # @pr1
        mov     edi, offset .L.str
        mov     esi, 1
        mov     edx, 10
        xor     eax, eax
        jmp     printf                  # TAILCALL
.L.str:
        .asciz  "%d %d"

Check it out at https://gcc.godbolt.org/z/UmiA34.

Such optimizations are nonportable and twitchy, however. The simplest things such passing a pointer to an array to a function defined in a different translation unit can turn them off. I would avoid relying on them.

Petr Skocik
  • 58,047
  • 6
  • 95
  • 142
  • @chqrlie Shame on me. I've edited it to make it less jarring to people easily offended by the comma operator. :D – Petr Skocik Aug 11 '19 at 08:36
  • "Example output assembly on x86-64 clang with -O3:" I don't understand what this output is saying.. could you explain? – herophant Aug 11 '19 at 09:46
  • @herophant Yes. The code is getting optimized to the equivalent of a direct `return printf("%d %d", 1, 10);`. The first 3 mov pass the 3 arguments: the string, and the 2 integers; the xor zeroes the eax register because that's part of the calling convention for calling variadic (=with an `...` argument) functions such as printf when no floating point arguments are passed and the `jmp printf` does a tail call to `printf`. – Petr Skocik Aug 11 '19 at 10:35
  • @herophant The tail call uses a `jmp` instruction rather than the `call` instruction because the result of the `printf` call is immediately returned with no adjustments to it and so control doesn't need to return to the caller -- it can be simply passed on to `printf`. – Petr Skocik Aug 11 '19 at 10:37
4

C programming language is defined in terms of abstract machine. The behaviour of a program is described as it would happen if executed in an abstract machine that has the same characteristics as the target environment. The C standard defines that in this abstract machine storage is guaranteed to be reserved for objects for their lifetime, so

int array[1000000];

will have sizeof (int) * 1000000 bytes memory reserved for its lifetime (which is until the end of the scope where the array was defined) and so does the object allocated with

int *array = malloc(sizeof (int) * 1000000);

where the lifetime ends at the corresponding free. That's the theory.


However the standard says that any compiler is conforming even if it produces a program that when run behaves as if it was run in the abstract machine according to its rules. This is called the as-if rule. So in fact if you write something like

for (int i = 0; i < 100; i++) {
     int *p = malloc(sizeof (int) * 1000000);
}

the compiler can produce an executable that does not call malloc at all since the return value is not used. Or if you just use p[0] it can notice that actually you could live with int p_0 instead and use it for all calculations. Or anything in between. See this program for an example:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int *array = malloc(1000000);
    int tmp;
    scanf("%d", &tmp);
    array[0] = tmp;
    array[1] = array[0] + tmp;
    printf("%d %d\n", array[0], array[1]);
}

Compiled with GCC 9.1 -O3 for x86-64 it produces

.LC0:
        .string "%d"
.LC1:
        .string "%d %d\n"
main:
        sub     rsp, 24
        mov     edi, OFFSET FLAT:.LC0
        xor     eax, eax
        lea     rsi, [rsp+12]
        call    __isoc99_scanf
        mov     esi, DWORD PTR [rsp+12]
        mov     edi, OFFSET FLAT:.LC1
        xor     eax, eax
        lea     edx, [rsi+rsi]
        call    printf
        xor     eax, eax
        add     rsp, 24
        ret

which has 2 call instructions: one for scanf and one for printf but none for malloc! And how about

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    int array[1000000];
    int tmp;
    scanf("%d", &tmp);
    array[0] = tmp;
    array[1] = array[0] + tmp;
    printf("%d %d\n", array[0], array[1]);
}

The output is

.LC0:
        .string "%d"
.LC1:
        .string "%d %d\n"
main:
        sub     rsp, 24
        mov     edi, OFFSET FLAT:.LC0
        xor     eax, eax
        lea     rsi, [rsp+12]
        call    __isoc99_scanf
        mov     esi, DWORD PTR [rsp+12]
        mov     edi, OFFSET FLAT:.LC1
        xor     eax, eax
        lea     edx, [rsi+rsi]
        call    printf
        xor     eax, eax
        add     rsp, 24
        ret

which is identical.

In practice you can not depend on any such behaviour, as none of it is guaranteed, it is just a possibility allowed for compilers to optimize.

Notice that in case of global objects with external linkage, the compiler wouldn't know if any other translation units to be linked could depend on the array having the defined size, it would often have to produce output that actually has the array in it.

1

No, it does not use less memory.

The compiler reserves the space for static variables during compilation and build of the executable, in the data segment of the executable file.

Note that there are two sections of the data segment - the initialised section, and the un-initialised section.

// example code in global scope 
// `n` stored in initialised section of data segment 
int n = 0; 

// `count` stored in un-initialised section of data segment   
int count;

Also, static (and global) variables that are explicitly initialised in the program are initialised. From the standard:

Objects with static storage duration (3.7.1) shall be zero-initialized (8.5) before any other initialization takes place. Zero-initialization and initialization with a constant expression are collectively called static initialization; all other initialization is dynamic initialization

suspectus
  • 16,548
  • 8
  • 49
  • 57
1

The program's behavior will be slightly different if you define the array as a global uninitialized array of int or if you allocate it with malloc() at run time, but in both cases, the OS will set aside 1000000 * sizeof(int) bytes of the process' address space for your program use.

Depending on the OS' implementation and that of malloc(), this address space may remain virtual until the program makes actual accesses to it. If you only access the first and eleventh elements of the array, it is possible that only a single page of RAM be mapped for the array. It is possible too that malloc() fail to allocate this memory if the process is configured to run with a lower limit on memory allocation.

Avoid relying on this behavior, and only allocate the amount of memory you need at any given time. If this amount is variable, then allocate it at runtime with malloc() or calloc().

chqrlie
  • 131,814
  • 10
  • 121
  • 189
0

Here

int array[1000000];

whether you use/access only few elements of it like array[1] and array[10], but at compile time itself 1000000*4 bytes of memory was reserved or fixed of array i.e compiler can't be used for other purpose. Uninitialized array doesn't make any difference in reserving memory for array.

Essentially what I'm asking is: will int array[num] use less memory than int array = malloc(sizeof(int) * num). ?

No, for both static and dynamic array same number of bytes gets reserved initially so there is no point of assuming that compiler will do some kind of optimization and use un-used bytes for some other purpose, unless you are not using realloc() in case of dynamically created buffer.

Achal
  • 11,821
  • 2
  • 15
  • 37