
Why is the value of

int array[10];

undefined when declared in a function but zero-initialized when declared as static?

I have been reading the answer to this question, and it is clear that

[the expression int array[10];] in a function means: take the ownership of 10-int-size area of memory without doing any initialization. If the array is declared as a global one or as static in a function, then all elements are initialized to zero if they aren't initialized already.

Question: why this behaviour? Did the compiler writers decide this for a particular reason? Can a particular compiler do things differently?

Why I am asking this: I would like my code to be portable across architectures and compilers. To ensure that, I know I can always initialize the declared array myself, but then I lose precious time on that operation. So, which is the right decision?

Thomas Mueller
Leos313
  • If the array is static/global - then it is zero initialized. If it is local/automatic - then it is not initialized. Because the standard says so. – Eugene Sh. Jul 03 '19 at 13:59
  • "Question: why this behaviour?" The rationale for leaving local variables uninitialized is execution speed. Setting all variables to zero before using them creates execution overhead. It's the same rationale as why malloc doesn't zero-initialize memory. – Lundin Jul 03 '19 at 14:07
  • I don't think it's specified as a requirement but most compilers would put static data into the `.bss` data segment that will be initialized to zero when the application is loaded into memory. – PeterT Jul 03 '19 at 14:07
  • The why is for speed. Static data is generally all clumped together and declared in one big .bss lump. Most object loaders have an efficient way to zero large chunks of memory at program start. On the other hand, zeroing every array allocation every time a function is entered would cost some processing time if you didn't need it. – Gem Taylor Jul 03 '19 at 14:07
  • @Leos313 No. The static data has to be initialized as well, but it is done by the run-time library/startup code happening before `main`. But it is happening only once as opposed to functions which can be executed numerous times per program run. – Eugene Sh. Jul 03 '19 at 14:26

4 Answers


An automatic int array[10]; isn't implicitly zeroed because the zeroing takes time and you might not need it zeroed. Additionally, you'd pay the cost not just once but each time control passed the declaration.

A static/global int array[10]; is implicitly zeroed because statics/globals are allocated at load time. The memory will be fresh from the OS, and if the OS is security-conscious at all, the memory will have been zeroed already. Otherwise the loading code (the OS, a dynamic linker, or the C runtime's startup code) has to zero it (because the C standard requires it), but it should be able to do so with a single memset-style pass over all globals/statics, which is considerably more efficient than zeroing each static/global variable one at a time.
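
The difference between the two storage durations can be sketched as follows. This is a minimal illustration, not part of the original answer; the names global_array, all_zero, static_local_is_zero and automatic_sum are made up for the example.

```c
#include <assert.h>
#include <stddef.h>

/* File-scope array: static storage duration, so C guarantees it starts as all zeros. */
int global_array[10];

/* Returns 1 if every element of a[0..n-1] is zero, otherwise 0. */
int all_zero(const int *a, size_t n)
{
    assert(a != NULL);
    for (size_t i = 0; i < n; ++i)
        if (a[i] != 0)
            return 0;
    return 1;
}

/* Returns 1 if a function-local static array is zero without an explicit initializer. */
int static_local_is_zero(void)
{
    static int local_static[10]; /* zeroed once, before the program starts */
    return all_zero(local_static, 10);
}

/* An automatic array has to be written before it is read. */
int automatic_sum(void)
{
    int array[10];               /* contents are indeterminate at this point */
    int sum = 0;
    for (int i = 0; i < 10; ++i)
        array[i] = i;            /* fill first ... */
    for (int i = 0; i < 10; ++i)
        sum += array[i];         /* ... then read */
    return sum;                  /* 0 + 1 + ... + 9 */
}
```

Calling all_zero(global_array, 10) and static_local_is_zero() should both return 1 on any conforming implementation, while automatic_sum() only works because every element is assigned before it is read.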

This initialization is done once. Even statics inside functions are initialized just once, even if they have nonzero initializers (e.g., static int x = 42;). This is why C requires that the initializer of a static be a constant expression.
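
A small sketch of the "initialized just once" behaviour, using the static int x = 42; from above. The function names next_ticket, fresh_each_call and demo_static_init are hypothetical and exist only for this illustration.

```c
#include <assert.h>

/* Hypothetical counter: `x` keeps its value between calls. */
int next_ticket(void)
{
    static int x = 42; /* set once before the program starts, not on every call */
    return ++x;
}

/* For contrast: an automatic variable is initialized each time the function runs. */
int fresh_each_call(void)
{
    int x = 42;
    return ++x;
}

/* The assertions spell out the behaviour described above. */
void demo_static_init(void)
{
    assert(next_ticket() == 43);
    assert(next_ticket() == 44);     /* the initializer did not run again */
    assert(fresh_each_call() == 43);
    assert(fresh_each_call() == 43); /* re-initialized on every call */
}
```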

Since the load-time zeroing of all globals/statics is either OS-guaranteed or efficiently implementable, it might as well be standard-guaranteed, which makes programmers' lives easier.

Petr Skocik

The values are not undefined but indeterminate, and it behaves this way because the standard says so.

Section 6.7.9p10 of the C standard regarding initialization states:

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate. If an object that has static or thread storage duration is not initialized explicitly, then:

  • if it has pointer type, it is initialized to a null pointer;
  • if it has arithmetic type, it is initialized to (positive or unsigned) zero;
  • if it is an aggregate, every member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;
  • if it is a union, the first named member is initialized (recursively) according to these rules, and any padding is initialized to zero bits;

So for any variable defined at file scope or declared static, you can safely assume it is zero-initialized. For variables declared inside a function or block without static, you cannot make any assumptions about their values until you initialize them.
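
The rules quoted above can be illustrated with a short sketch. The type struct point and the variables origin, ratio, cursor and u are invented for this example; none of them has an explicit initializer.

```c
#include <assert.h>
#include <stddef.h>

struct point { int x; double y; const char *label; };

static struct point origin;          /* aggregate: each member follows the rules below */
static double ratio;                 /* arithmetic type: initialized to zero */
static int *cursor;                  /* pointer type: initialized to a null pointer */
static union { int i; float f; } u;  /* union: the first named member (i) is zeroed */

/* Asserts the defaults that the standard guarantees for static storage duration. */
void check_static_defaults(void)
{
    assert(origin.x == 0);
    assert(origin.y == 0.0);
    assert(origin.label == NULL);
    assert(ratio == 0.0);
    assert(cursor == NULL);
    assert(u.i == 0);
}
```

Moving any of these declarations into a function body without static would make the same checks unreliable, since the values would then be indeterminate.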

As for why: global/static variables are initialized at program startup or even at compile time, while locals would have to be initialized each time they come into scope, and doing so would take time.

dbush
  • Thank you. So I can safely omit the zero-initialization when the array is static: this behaviour will never change. – Leos313 Jul 03 '19 at 14:08
  • @Leos313 Correct. – dbush Jul 03 '19 at 14:08
  • Correct, though generally the compiler/linker will optimise away a = {0} initializer on a static array, so there is no harm in using that form. – Gem Taylor Jul 03 '19 at 14:10
  • Be careful in a freestanding environment, though. Without a runtime library (AKA crt.o), it is *your* responsibility to provide startup code that makes sure .bss is zeroed out before entering `main`. – Eugene Sh. Jul 03 '19 at 17:17
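
As a rough sketch of what such startup code has to do, assuming the linker script exports symbols marking the start and end of .bss (their names vary between toolchains, so none are used here), the zeroing step could look like this. The function zero_region is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Zeroes the bytes in [start, end). Startup code would call this with the
   .bss boundaries provided by the linker script before jumping to main. */
void zero_region(unsigned char *start, unsigned char *end)
{
    assert(start != NULL && end != NULL && start <= end);
    for (unsigned char *p = start; p < end; ++p)
        *p = 0;
}
```

In a hosted environment the runtime library already performs this step, so the sketch only matters when you supply the startup code yourself.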

The reason for not defining the initial value of stack-allocated/local variables is efficiency. The C standard assumes that a typical program declares an array and fills it later:

int array[10];
for (int i = 0; i < 10; ++i)
    array[i] = i * 42; // every element is written before it is ever read

In this case, any implicit initialization would be wasted work, so the C standard does not require it.

If your program needs these values initialized to zero, you can do it explicitly:

int array[10] = {0}; // initialize to zero so the accumulation below works
while (condition)
{
    ... // some code
    for (int i = 0; i < 10; ++i)
        array[i] += other_array[i];
}
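
For reference, here is a self-contained sketch of the same accumulation pattern, together with memset as a way to re-zero an array that already exists. The names N, accumulate_total and reset_accumulator are assumptions made for the example, and the rounds parameter stands in for the unspecified loop condition above.

```c
#include <assert.h>
#include <string.h>

#define N 10

/* Adds `src` into a zeroed accumulator `rounds` times; returns the sum of all slots. */
int accumulate_total(const int src[N], int rounds)
{
    assert(src != NULL && rounds >= 0);
    int acc[N] = {0};                  /* first element 0 explicitly, the rest implicitly */
    for (int r = 0; r < rounds; ++r)
        for (int i = 0; i < N; ++i)
            acc[i] += src[i];

    int total = 0;
    for (int i = 0; i < N; ++i)
        total += acc[i];
    return total;
}

/* Re-zeroes an array that already exists, e.g. before reusing it. */
void reset_accumulator(int acc[N])
{
    assert(acc != NULL);
    memset(acc, 0, N * sizeof acc[0]); /* all-bits-zero represents 0 for int */
}
```

The = {0} form only works at the point of declaration; memset (or a plain loop) is the usual choice when the array has to be cleared again later.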

It is your decision whether to initialize or not, because you are supposed to know how your program behaves. This decision will be different for different arrays.

However, this decision does not depend on the compiler: every standard-conforming compiler follows the same rules. One detail regarding portability: if you don't initialize your array and still see all zeros in it with a particular compiler, don't be fooled; the values are still indeterminate, and you cannot rely on them being 0.

Some other languages decided that zero initialization is cheap enough to do even when it's superfluous, and that its advantage (safety) outweighs its disadvantage (performance). C prioritizes performance, so its designers decided otherwise.

anatolyg

The C philosophy is to a) always trust the programmer and b) prioritize execution speed over programmer convenience. C assumes that the programmer is in the best position to know whether an array (or any other auto variable) needs to be initialized to a specific value, and if so, is smart enough to write the code to do it themselves. Otherwise it won't waste the CPU cycles.

Same thing for bounds checking on array accesses, same thing for NULL checks on pointer dereferences, etc.
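
To make the point concrete, here is a sketch of what writing those checks yourself might look like. The helpers checked_get and get_or_abort are hypothetical; C provides nothing like them automatically.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Reads a[index] into *out only if the access is valid; returns false otherwise. */
bool checked_get(const int *a, size_t len, size_t index, int *out)
{
    if (a == NULL || out == NULL)  /* C does not check for null pointers */
        return false;
    if (index >= len)              /* nor for out-of-range indexes */
        return false;
    *out = a[index];
    return true;
}

/* Debug-only variant: the check disappears when compiled with -DNDEBUG. */
int get_or_abort(const int *a, size_t len, size_t index)
{
    assert(a != NULL && index < len);
    return a[index];
}
```

The second variant reflects the trade-off described here: the check costs cycles only in debug builds, and release builds trust the caller.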

This is simultaneously C's greatest strength (fast code with a small footprint) and greatest weakness (lots of manual labor to make code safe and secure).

John Bode