C++ declaration and initialization / variables aren't initialized to default value

Question

Case 1:

int a;
std::cout << a << std::endl; // prints 0

Case 2:

int a;
std::cout << &a << " " << a << std::endl; // 0x7ffc057370f4 32764

Whenever I also print the address of the variable, it no longer seems to be initialized to the default value. Why is that? I thought the value of a in Case 2 was junk, but every time I run the code it shows 32764, 32765, 32766 or 32767. Are these still junk values?

Both of those samples invoke *undefined behavior* because `a` is not initialized. It can print *any* value - UnholySheep, May 14 '21 at 13:28
possible dupe of [Default variable value](https://stackoverflow.com/questions/6032638/default-variable-value) or plenty others. I hope your C++ book or other resource explains _very_ early on that basic types without explicit initialisation are undefined to read until assigned. - underscore_d, May 14 '21 at 13:30
The object has an address as soon as it's declared. That's fine to read. You just can't read the value until it has been explicitly given one. - underscore_d, May 14 '21 at 13:31
"right?" no, if you see the value of `a` differ depending on whether or not you printed its address, that's just Undefined Behaviour manifesting. It's still undefined. Don't rely on it or speculate over its causes. Just ensure all your objects have been initialised or assigned before you read them. - underscore_d, May 14 '21 at 13:33
There's nothing I can say about undefined behaviour that hasn't already been explained better elsewhere, such as at the link to "possible" dupe I posted above or Ashok posted below, or other resources you can easily find by researching _undefined behaviour_. - underscore_d, May 14 '21 at 13:37

score 1 · Accepted Answer · answered May 14 '21 at 13:29

Local variables of built-in types such as int are not initialized to a default value in C++, so their value is indeterminate until you assign one, and reading it before then is undefined behavior. You can read more about it here.

Ashok Arora

score 0 · Answer 2 · answered May 14 '21 at 22:31

I'm afraid the accepted answer does not touch the main point of the question: why

int a;
std::cout << a << std::endl; // prints 0

always prints 0, as if a were initialized to a default value, whereas in

int a;
std::cout << &a << " " << a << std::endl; // 0x7ffc057370f4 32764

the program prints some junk value for a.

Yes, both cases are examples of undefined behavior and ANY value of a is possible, so why is it always 0 in Case 1?

First of all, remember that a C/C++ compiler is free to transform your program in any way it likes as long as the observable behavior of a well-defined program stays the same. So, if you write

int a;
std::cout << a << std::endl; // prints 0

the compiler is free to assume that a need not be associated with any real RAM cells. You never take its address, nor do you write to it. So the compiler is free to keep a in one of the CPU registers. In that case a has no address and is functionally equivalent to something as weird as a "named, addressless temporary". In Case 2, however, you ask the program to print the address of a. The compiler cannot ignore that request, so it reserves memory for a and generates code that reads it, even though the value stored there may be junk.

The next factor is optimization. You can switch it off completely (the usual Debug configuration) or turn on aggressive optimization (the usual Release configuration), so you can expect your simple code to behave differently depending on which one you compile with. Moreover, since this is undefined behavior, your code may behave differently when compiled with different compilers or even different versions of the same compiler.

I prepared a version of your program that is a bit easier to analyze:

#include <iostream>

int f()
{
    int a;
    return a;  // reads uninitialized a: undefined behavior
}

int g()
{
    int a;
    return reinterpret_cast<long long int>(&a) + a;  // uses both &a and the uninitialized value of a
}

int main() { std::cout << f() << " " << g() << "\n"; }

Function g differs from f in that it uses the address of the uninitialized variable a. I tested it in Godbolt Compiler Explorer: https://godbolt.org/z/os8b583ss You can switch there between various compilers and optimization options. Please do experiment yourself. With gcc or clang, use -O0 for a Debug-like build (-g only adds debug information and does not change the optimization level) and -O3 for a Release-like build.

For the newest (trunk) gcc in Release, we get the following assembly:

f():
        xorl    %eax, %eax
        ret
g():
        leaq    -4(%rsp), %rax
        addl    -4(%rsp), %eax
        ret
main:
        subq    $24, %rsp
        xorl    %esi, %esi
        movl    $_ZSt4cout, %edi
        call    std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
        leaq    12(%rsp), %rsi
        movl    $_ZSt4cout, %edi
        addl    12(%rsp), %esi
        call    std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
        xorl    %eax, %eax
        addq    $24, %rsp
        ret

Notice that f() was reduced to trivially setting the eax register to zero (xorl %eax, %eax is the usual idiom for that, since x xor x equals 0 for any x). eax is the register in which this function returns its value. Hence 0 in Release. Well, actually, no, the compiler is even smarter: it never calls f()! Instead, it zeroes the esi register that carries the argument in the call to operator<<. Similarly, the call to g() is replaced by inline code that uses 12(%rsp) twice: leaq computes its address and addl adds the value stored there. This yields a random value for a and rather similar values for &a from run to run. AFAIK, stack addresses are slightly randomized (address space layout randomization) to make life harder for hackers attacking our code.

Now the same code in Debug:

f():
        pushq   %rbp
        movq    %rsp, %rbp
        movl    -4(%rbp), %eax
        popq    %rbp
        ret
g():
        pushq   %rbp
        movq    %rsp, %rbp
        leaq    -4(%rbp), %rax
        movl    %eax, %edx
        movl    -4(%rbp), %eax
        addl    %edx, %eax
        popq    %rbp
        ret
main:
        pushq   %rbp
        movq    %rsp, %rbp
        call    f()
        movl    %eax, %esi
        movl    $_ZSt4cout, %edi
        call    std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
        call    g()
        movl    %eax, %esi
        movl    $_ZSt4cout, %edi
        call    std::basic_ostream<char, std::char_traits<char> >::operator<<(int)
        movl    $0, %eax
        popq    %rbp
        ret

You can now clearly see, even without knowing x86-64 assembly (I don't know it either), that in Debug mode (-O0) the compiler performs no optimization at all. In f() it reads a (4 bytes below the frame pointer, -4(%rbp)) and moves it to the "result register" eax. In g() the same happens, except that a is used twice: leaq takes its address and movl reads its value, and the two are then added. Moreover, both f() and g() are actually called in main(). In this mode the program produces "random" results for a (try it yourself!).

To make things even more interesting, here are f() and g() as compiled by clang (trunk) in Release:

f():                                  # @f()
        retq
g():                                  # @g()
        retq

Can you see? These functions are so trivial to clang that it generated no code for them beyond the return instruction. Moreover, it did not zero the registers corresponding to a, so, unlike g++, clang produces a random value for a (in both Release and Debug).

You can take your experiments even further and find that what clang produces for f depends on whether f or g is called first in main.

Now you should have a better understanding of what Undefined Behavior is.
