141

According to the accepted (and only) answer for this Stack Overflow question,

Defining the constructor with

MyTest() = default;

will instead zero-initialize the object.

Then why does the following,

#include <iostream>

struct foo {
    foo() = default;
    int a;
};

struct bar {
    bar();
    int b;
};

bar::bar() = default;

int main() {
    foo a{};
    bar b{};
    std::cout << a.a << ' ' << b.b;
}

produce this output:

0 32766

Both constructors defined are default? Right? And for POD types, the default initialization is zero-initialization.

And according to the accepted answer for this question,

  1. If a POD member is not initialized in the constructor nor via C++11 in-class initialization, it is default-initialized.

  2. The answer is the same regardless of stack or heap.

  3. In C++98 (and not afterward), new int() was specified as performing zero initialization.

Despite trying to wrap my (albeit tiny) head around default constructors and default initialization, I couldn't come up with an explanation.

Duck Dodgers
  • 3
    Interestingly, I even get a warning for b: main.cpp:18:34: warning: 'b.bar::b' is used uninitialized in this function [-Wuninitialized] http://coliru.stacked-crooked.com/a/d1b08a4d6fb4ca7e – tkausl Jan 24 '19 at 15:32
  • It being zero doesn't necessarily mean it's initialized... that may be random. – The Quantum Physicist Jan 24 '19 at 15:32
  • 1
    @TheQuantumPhysicist, for that I ran the program maybe 5~6 times before posting and about 10 times now, `a` is always zero. `b` changes around a little. – Duck Dodgers Jan 24 '19 at 15:33
  • 9
    `bar`'s constructor is user provided whereas `foo`'s constructor is the defaulted one. – Jarod42 Jan 24 '19 at 15:33
  • @Jarod42, and how is `bar`'s constructor user provided? – Duck Dodgers Jan 24 '19 at 15:35
  • I think `foo` is eligible for aggregate initialization, which will value-initialize the member, while `bar` invokes the no-op default constructor – KABoissonneault Jan 24 '19 at 15:35
  • Jarod42 is right: http://eel.is/c++draft/dcl.fct.def#default-5.sentence-2 – YSC Jan 24 '19 at 15:36
  • what standard is this? behaviour changed between cpp14 and cpp17 – bartop Jan 24 '19 at 15:36
  • @JoeyMallone -- "random" here doesn't mean "random", it means "whatever...". The value you see for an uninitialized variable is whatever value happened to be in that memory location. If something previously set that memory location to 0, then you'll always see 0 for the uninitialized variable. (Note that this is a behavioral observation, not a requirement) So seeing the same result multiple times does not mean that the result is not "random", just that "random" doesn't accurately describe the result. – Pete Becker Jan 24 '19 at 15:36
  • @bartop, I didn't realize that mattered. I have `gcc 7.3.0`. My CMake file has explicitly `set(CMAKE_CXX_STANDARD 17)`. I also tried with `set(CMAKE_CXX_STANDARD 14)`. The result was the same. – Duck Dodgers Jan 24 '19 at 15:38
  • @JoeyMallone well, now I'm also unsure, but rules about aggregate and default init priority changed between standards – bartop Jan 24 '19 at 15:39
  • 2
    @PeteBecker, I understand that. How could I somehow shake my RAM a little so that if there was zero there, it should now be something else. ;) p.s. I ran the program a dozen times. It is not a big program. You could run it and test it on your system. `a` is zero. `b` is not. Seems `a` is initialized. – Duck Dodgers Jan 24 '19 at 15:41
  • 3
    @JoeyMallone Regarding "how is it user-provided": There is no guarantee that the definition of `bar::bar()` is visible in `main()` - it might be defined in a separate compilation unit and do something very non-trivial while in `main()` only the declaration is visible. I think you'll agree that this behavior shouldn't change depending on whether you place `bar::bar()`'s definition in a separate compilation unit or not (even if the whole situation is unintuitive). – Max Langhof Jan 24 '19 at 15:47
  • @tkausl, I don't and I have everything on `set(CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -Wall -pedantic -Wextra -Wsign-conversion -Wshadow -Wuninitialized -Wconversion"` – Duck Dodgers Jan 24 '19 at 15:48
  • I am not sure what I am missing. But even in the `foo` case, I am not getting zero initialization. https://gcc.godbolt.org/z/EWaL_7 – balki Jan 24 '19 at 16:04
  • @balki You're comparing apples to oranges. Your `foo2()` needs to have `S2 s2{};` to match what is going on here. https://gcc.godbolt.org/z/CCPUZa – NathanOliver Jan 24 '19 at 16:10
  • 1
    @NathanOliver, yes you are right. I created all the possible initialization ways possible. https://gcc.godbolt.org/z/0EO1Ei The best way to guarantee zero initialization is `struct foo { int a{}; };` – balki Jan 24 '19 at 16:26
  • 3
    @balki Or `int a = 0;` is you want to be really explicit. – NathanOliver Jan 24 '19 at 16:27
  • 3
    Great example for the idiosyncrasies that a language should **not** contain... – cmaster - reinstate monica Jan 28 '19 at 09:57
  • 1
    Yes, this is just crazy – Paul Sanders Jan 29 '19 at 18:03

4 Answers

112

The issue here is pretty subtle. You would think that

bar::bar() = default;

would give you a compiler generated default constructor, and it does, but it is now considered user provided. [dcl.fct.def.default]/5 states:

Explicitly-defaulted functions and implicitly-declared functions are collectively called defaulted functions, and the implementation shall provide implicit definitions for them ([class.ctor] [class.dtor], [class.copy.ctor], [class.copy.assign]), which might mean defining them as deleted. A function is user-provided if it is user-declared and not explicitly defaulted or deleted on its first declaration. A user-provided explicitly-defaulted function (i.e., explicitly defaulted after its first declaration) is defined at the point where it is explicitly defaulted; if such a function is implicitly defined as deleted, the program is ill-formed. [ Note: Declaring a function as defaulted after its first declaration can provide efficient execution and concise definition while enabling a stable binary interface to an evolving code base. — end note ]

emphasis mine (on the second sentence, which defines "user-provided")

So we can see that since you did not default bar() when you first declared it, it is now considered user provided. Because of that [dcl.init]/8.2

if T is a (possibly cv-qualified) class type without a user-provided or deleted default constructor, then the object is zero-initialized and the semantic constraints for default-initialization are checked, and if T has a non-trivial default constructor, the object is default-initialized;

no longer applies and we are not value initializing b but instead default initializing it per [dcl.init]/8.1

if T is a (possibly cv-qualified) class type ([class]) with either no default constructor ([class.default.ctor]) or a default constructor that is user-provided or deleted, then the object is default-initialized;
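The distinction quoted above can be observed at compile time. The following is only a sketch, not part of the original answer: it relies on the rule that a user-provided default constructor is never trivial, so `std::is_trivially_default_constructible` gives different answers for the two structs from the question.

```cpp
#include <cassert>
#include <type_traits>

struct foo {
    foo() = default;   // defaulted on its first declaration: not user-provided
    int a;
};

struct bar {
    bar();             // first declaration is not defaulted
    int b;
};

bar::bar() = default;  // defaulted after the first declaration: user-provided

// A user-provided default constructor is never trivial, so the trait
// exposes the difference without reading any uninitialized member.
static_assert(std::is_trivially_default_constructible<foo>::value,
              "foo's default constructor is not user-provided");
static_assert(!std::is_trivially_default_constructible<bar>::value,
              "bar's default constructor is user-provided");
```

Both assertions are expected to hold in C++11 and later, since neither depends on whether `foo` counts as an aggregate.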

NathanOliver
  • 53
    I mean `(*_*)` .... If to even use the basic constructs of the language, I need to read the fine print of the language draft, then Hallelujah! But it probably seems to be what you say. – Duck Dodgers Jan 24 '19 at 15:46
  • So, Is out of line `bar::bar() = default;`, same as `bar::bar(){}`? – balki Jan 24 '19 at 15:46
  • 14
    @balki Yes, doing `bar::bar() = default` out of line is the same as doing `bar::bar(){}` inline. – NathanOliver Jan 24 '19 at 15:47
  • 15
    @JoeyMallone Yeah, C++ can be pretty complicated. I'm not sure what the reason for this is. – NathanOliver Jan 24 '19 at 15:49
  • (and by this I means this exemption making it user provided) – NathanOliver Jan 24 '19 at 16:00
  • 1
    Ok, then before I accept this answer, I just wanted to say, that I think that the answer to the question I referred to in my question, ( [this one](https://stackoverflow.com/questions/48809894/c-zero-initialization) ) is not exactly correct. Or is it? It is not definition but declaration that matters when using the keyword `default`. If there is a previous declaration, then a subsequent definition with the `default` keyword will NOT zero initialize the members. Right? – Duck Dodgers Jan 24 '19 at 16:12
  • @JoeyMallone The answer is correct. Remember a definition is also a declaration. That means `MyTest() {}` both defines and declares a default constructor. The answer says to replace that with `MyTest() = default;` to declare a non user provided constructor. The declaration then cause the compiler to define a default constructor. I'm not certain if it is specified where it puts that definition. – NathanOliver Jan 24 '19 at 16:15
  • 3
    *If there is a previous declaration, then a subsequent definition with the default keyword will NOT zero initialize the members. Right?* This is correct. It is what is happening here. – NathanOliver Jan 24 '19 at 16:16
  • 6
    The reason is right there in your quote: the point of an out-of-line default is to "provide efficient execution and concise definition while enabling a stable binary interface to an evolving code base", in other words, enable you to switch to a user-written body later if necessary without breaking ABI. Note that the out-of-line definition is not implicitly inline and so can only appear in one TU by default; another TU seeing the class definition alone has no way to know if it's explicitly defined as defaulted. – T.C. Jan 24 '19 at 20:33
  • I am wondering why a.a is initialized? Does compiler generated constructor initialize primitive types? – uuu777 Jan 25 '19 at 01:15
  • @zzz777 Normally no but in this case with list initialization it gets aggregate initialized and that zero initializes it. – NathanOliver Jan 25 '19 at 01:49
  • What is so special about class A, that it makes default constructor to initialize primitive types? – uuu777 Jan 25 '19 at 18:17
  • @T.C. Except, what valid reason is not having the out-of-line `=default` ctor not have the *same behaviour* as the non-user-provided inline `=default` ctor? Other than a possible oversight. As a plus, as the state of the data is undefined currently, it means it can be fixed in a future version of the standard without backwards incompatibility. – Yakk - Adam Nevraumont Jan 25 '19 at 18:26
  • @zzz777 `A` is an aggregate and because of that when you do `A a{}` you don't actually call `A` constructor but instead do aggregate initialization. Since no value(s) are specified all members get zero initialized. – NathanOliver Jan 25 '19 at 18:27
  • 1
    @Yakk-AdamNevraumont They have the exact same behavior themselves. The user-providedness of the constructor cannot depend on the kind of the out-of-line definition because the definition could be in a different TU that the compiler can't see. – T.C. Jan 25 '19 at 18:52
  • @t.c. I agree it must be user provided. I am simply stating that out of line `=default` user-provided constructors should zero initialize. – Yakk - Adam Nevraumont Jan 25 '19 at 19:32
  • @NathanOliver Oh, the reason for C++'s complexity lies in the apparent mindset of "Oh, we just include about every feature possible, and give a ***** for orthogonality of features. Our users will love us for all the great features!" that seems to be en-vogue among the leading minds of the C++ standard. Unless you dedicate yourself to a) keep your syntax simple, and b) keep your features fully orthogonal, your language is doomed to become overly complex over time. Every single exception that you make will force more exceptions when you add the next feature. Just my 2 cents. – cmaster - reinstate monica Jan 28 '19 at 09:55
  • This is absolutely terrible. – Lightness Races in Orbit Jan 30 '19 at 11:07
24

The difference in behaviour comes from the fact that, according to [dcl.fct.def.default]/5, bar::bar is user-provided where foo::foo is not (1). As a consequence, foo a{} zero-initializes foo::a, but bar b{} only calls bar::bar, so bar::b stays uninitialized (2).


1) [dcl.fct.def.default]/5

A function is user-provided if it is user-declared and not explicitly defaulted or deleted on its first declaration.

2)

From [dcl.init#6]:

To value-initialize an object of type T means:

  • if T is a (possibly cv-qualified) class type with either no default constructor ([class.ctor]) or a default constructor that is user-provided or deleted, then the object is default-initialized;

  • if T is a (possibly cv-qualified) class type without a user-provided or deleted default constructor, then the object is zero-initialized and the semantic constraints for default-initialization are checked, and if T has a non-trivial default constructor, the object is default-initialized;

  • ...

From [dcl.init.list]:

List-initialization of an object or reference of type T is defined as follows:

  • ...

  • Otherwise, if the initializer list has no elements and T is a class type with a default constructor, the object is value-initialized.

From Vittorio Romeo's answer

Community
YSC
10

From cppreference:

Aggregate initialization initializes aggregates. It is a form of list-initialization.

An aggregate is one of the following types:

[snip]

  • class type [snip], that has

    • [snip] (there are variations for different standard versions)

    • no user-provided, inherited, or explicit constructors (explicitly defaulted or deleted constructors are allowed)

    • [snip] (there are more rules, which apply to both classes)

Given this definition, foo is an aggregate, while bar is not (its constructor is user-provided: it is defaulted, but not on its first declaration).

Therefore for foo, T object {arg1, arg2, ...}; is syntax for aggregate initialisation.

The effects of aggregate initialization are:

  • [snip] (some details irrelevant to this case)

  • If the number of initializer clauses is less than the number of members or initializer list is completely empty, the remaining members are value-initialized.

Therefore a.a is value initialised, which for int means zero initialisation.

For bar, on the other hand, T object {}; is value initialisation (of the class instance, not value initialisation of members!). Since it is a class type with a user-provided default constructor, the default constructor is called. The default constructor that you defined default initialises the members (by virtue of not having member initialisers), which in the case of an int (with non-static storage duration) leaves b.b with an indeterminate value.
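If the out-of-line defaulted constructor has to stay, one way to get a defined value anyway is a default member initializer, as suggested in the comments on the question (`int a{};` or `int a = 0;`). A minimal sketch, where `safe_bar` and the two helper functions are hypothetical names introduced only for this illustration:

```cpp
#include <cassert>

struct safe_bar {      // hypothetical variant of bar
    safe_bar();        // still user-provided, as in the question
    int b = 0;         // default member initializer
};

// The defaulted constructor applies the default member initializer.
safe_bar::safe_bar() = default;

int read_b_after_brace_init() {
    safe_bar x{};      // value-initialization: calls safe_bar::safe_bar()
    return x.b;
}

int read_b_after_default_init() {
    safe_bar x;        // default-initialization: same constructor
    return x.b;
}
```

Both helpers return 0, because the member is initialized by the constructor itself and no longer depends on which form of initialization the caller writes.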

And for pod-types, the default initialization is zero-initialization.

No. This is wrong.


P.S. A word about your experiment and your conclusion: Seeing that the output is zero does not necessarily mean that the variable was zero initialised. Zero is a perfectly possible number for a garbage value.

for that I ran the program maybe 5~6 times before posting and about 10 times now, a is always zero. b changes around a little.

The fact that the value was the same multiple times does not necessarily mean that it was initialised either.

I also tried with set(CMAKE_CXX_STANDARD 14). The result was the same.

The fact that the result is the same with multiple compiler options doesn't mean that the variable is initialised. (Although in some cases, changing the standard version can change whether it is initialised).

How could I somehow shake my RAM a little so that if there was zero there, it should now be something else

There is no guaranteed way in C++ to make an uninitialised value appear nonzero.

The only way to know that a variable is initialised is to compare the program to the rules of the language and verify that the rules say that it is initialised. In this case a.a is indeed initialised.
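For the initialised case there is a way to "shake the RAM" without relying on luck. The sketch below is an illustration under stated assumptions, not a general test for initialisation: it fills a buffer with nonzero bytes and then constructs a foo in it with placement new and empty braces. The function name `value_init_over_garbage` is hypothetical. The same experiment must not be done for bar, because reading b.b there would read an indeterminate value.

```cpp
#include <cassert>
#include <cstring>
#include <new>

struct foo {
    foo() = default;
    int a;
};

int value_init_over_garbage() {
    // Raw storage deliberately filled with nonzero bytes.
    alignas(foo) unsigned char storage[sizeof(foo)];
    std::memset(storage, 0xFF, sizeof storage);

    // Empty braces: the language rules require foo::a to be zero here,
    // whatever the bytes were before.
    foo* p = new (storage) foo{};
    int result = p->a;
    p->~foo();
    return result;
}
```

The function returns 0 even though the storage held 0xFF bytes, which is what the rules predict for foo{}.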

eerorika
  • *"The default constructor that you defined default initialises the members (by virtue of not having member initialisers), which in case of int leaves it with an indeterminate value."* --> eh! "for pod-types, the default initialization is zero-initialization." or am I wrong? – Duck Dodgers Jan 24 '19 at 15:49
  • 2
    @JoeyMallone Default initialization of POD types is no initialization. – NathanOliver Jan 24 '19 at 15:53
  • @NathanOliver, Then I am even more confused. Then how come `a` is initialized. I was thinking `a` is default initialized and default initialization for a member POD is, zero-initialization. Is `a` then just luckily always coming up zero, no matter how many times I run this program. – Duck Dodgers Jan 24 '19 at 16:00
  • @JoeyMallone `Then how come a is initialized.` Because it is value initialised. `I was thinking a is default initialized` It is not. – eerorika Jan 24 '19 at 16:00
  • @JoeyMallone *default* initialization of a POD is no initialization, but that isn't what is going on here. Here you are *value* initializing which does zero initialize. – NathanOliver Jan 24 '19 at 16:03
  • Ouch. Ok (both of you) guys, thanks for clearing that up. Apparently, I don't know my value-initialization from my default-initialization. Thanks. – Duck Dodgers Jan 24 '19 at 16:04
  • 3
    @JoeyMallone Don't worry about it. You could make a book out of initialization in C++. If you get a chance CppCon on youtube has a few videos on initialization with the most disappointing (as in pointing out how bad it is) being https://www.youtube.com/watch?v=7DTlWPgX6zs – NathanOliver Jan 24 '19 at 16:08
  • I will go read on [default init vs value init](https://stackoverflow.com/questions/8106016/c-default-initialization-and-value-initialization-which-is-which-which-is-ca) later. – Duck Dodgers Jan 24 '19 at 16:08
  • I saw that [CppCon talk](https://youtube.com/watch?v=7DTlWPgX6zs) you suggested, last night. Heavens have mercy on us. I cracked when he suggested that *some of the 19 different ways could/would be 'fixed' in `C++23`.* `¯\_(ツ)_/¯` Amen! – Duck Dodgers Jan 25 '19 at 08:27
0

Meh, I tried running the snippet you provided as test.cpp, through gcc & clang and multiple optimization levels:

steve@steve-pc /tmp> g++ -o test.gcc.O0 test.cpp
                                                                              [ 0s828 | Jan 27 01:16PM ]
steve@steve-pc /tmp> g++ -o test.gcc.O2 -O2 test.cpp
                                                                              [ 0s901 | Jan 27 01:16PM ]
steve@steve-pc /tmp> g++ -o test.gcc.Os -Os test.cpp
                                                                              [ 0s875 | Jan 27 01:16PM ]
steve@steve-pc /tmp> ./test.gcc.O0
0 32764                                                                       [ 0s004 | Jan 27 01:16PM ]
steve@steve-pc /tmp> ./test.gcc.O2
0 0                                                                           [ 0s004 | Jan 27 01:16PM ]
steve@steve-pc /tmp> ./test.gcc.Os
0 0                                                                           [ 0s003 | Jan 27 01:16PM ]
steve@steve-pc /tmp> clang++ -o test.clang.O0 test.cpp
                                                                              [ 1s089 | Jan 27 01:17PM ]
steve@steve-pc /tmp> clang++ -o test.clang.Os -Os test.cpp
                                                                              [ 1s058 | Jan 27 01:17PM ]
steve@steve-pc /tmp> clang++ -o test.clang.O2 -O2 test.cpp
                                                                              [ 1s109 | Jan 27 01:17PM ]
steve@steve-pc /tmp> ./test.clang.O0
0 274247888                                                                   [ 0s004 | Jan 27 01:17PM ]
steve@steve-pc /tmp> ./test.clang.Os
0 0                                                                           [ 0s004 | Jan 27 01:17PM ]
steve@steve-pc /tmp> ./test.clang.O2
0 0                                                                           [ 0s004 | Jan 27 01:17PM ]
steve@steve-pc /tmp> ./test.clang.O0
0 2127532240                                                                  [ 0s002 | Jan 27 01:18PM ]
steve@steve-pc /tmp> ./test.clang.O0
0 344211664                                                                   [ 0s004 | Jan 27 01:18PM ]
steve@steve-pc /tmp> ./test.clang.O0
0 1694408912                                                                  [ 0s004 | Jan 27 01:18PM ]

This is where it gets interesting: the clang -O0 build is clearly reading varying values, presumably leftover stack contents.

I quickly turned up my IDA to see what's happening:

int __cdecl main(int argc, const char **argv, const char **envp)
{
  __int64 v3; // rax
  __int64 v4; // rax
  int result; // eax
  unsigned int v6; // [rsp+8h] [rbp-18h]
  unsigned int v7; // [rsp+10h] [rbp-10h]
  unsigned __int64 v8; // [rsp+18h] [rbp-8h]

  v8 = __readfsqword(0x28u); // load the stack protector canary from fs:0x28
  v7 = 0; // this is foo a{}
  bar::bar((bar *)&v6); // this is bar b{}
  v3 = std::ostream::operator<<(&std::cout, v7); // this is clearly 0
  v4 = std::operator<<<std::char_traits<char>>(v3, 32LL); // 32 = 0x20 = ' '
  result = std::ostream::operator<<(v4, v6); // joined as cout << a.a << ' ' << b.b, so this is reading random values!!
  if ( __readfsqword(0x28u) == v8 ) // stack canary check
    result = 0;
  return result;
}

Now, what does bar::bar(bar *this) do?

void __fastcall bar::bar(bar *this)
{
  ;
}

Hmm, nothing. We had to resort to using assembly:

.text:00000000000011D0                               ; __int64 __fastcall bar::bar(bar *__hidden this)
.text:00000000000011D0                                               public _ZN3barC2Ev
.text:00000000000011D0                               _ZN3barC2Ev     proc near               ; CODE XREF: main+20↓p
.text:00000000000011D0
.text:00000000000011D0                               var_8           = qword ptr -8
.text:00000000000011D0
.text:00000000000011D0                               ; __unwind {
.text:00000000000011D0 55                                            push    rbp
.text:00000000000011D1 48 89 E5                                      mov     rbp, rsp
.text:00000000000011D4 48 89 7D F8                                   mov     [rbp+var_8], rdi
.text:00000000000011D8 5D                                            pop     rbp
.text:00000000000011D9 C3                                            retn
.text:00000000000011D9                               ; } // starts at 11D0
.text:00000000000011D9                               _ZN3barC2Ev     endp

So yeah, it does nothing: the constructor merely spills this to its own stack frame and returns. The member is never written, so main ends up printing whatever happened to be in that uninitialized stack slot.

What if we explicitly provide values for the two structs?

#include <iostream>

struct foo {
    foo() = default;
    int a;
};

struct bar {
    bar();
    int b;
};

bar::bar() = default;

int main() {
    foo a{0};
    bar b{0};
    std::cout << a.a << ' ' << b.b;
}

Hit up clang, oopsie:

steve@steve-pc /tmp> clang++ -o test.clang.O0 test.cpp
test.cpp:17:9: error: no matching constructor for initialization of 'bar'
    bar b{0};
        ^~~~
test.cpp:8:8: note: candidate constructor (the implicit copy constructor) not viable: no known conversion
      from 'int' to 'const bar' for 1st argument
struct bar {
       ^
test.cpp:8:8: note: candidate constructor (the implicit move constructor) not viable: no known conversion
      from 'int' to 'bar' for 1st argument
struct bar {
       ^
test.cpp:13:6: note: candidate constructor not viable: requires 0 arguments, but 1 was provided
bar::bar() = default;
     ^
1 error generated.
                                                                              [ 0s930 | Jan 27 01:35PM ]

Similar fate with g++ as well:

steve@steve-pc /tmp> g++ test.cpp
test.cpp: In function ‘int main()’:
test.cpp:17:12: error: no matching function for call to ‘bar::bar(<brace-enclosed initializer list>)’
     bar b{0};
            ^
test.cpp:8:8: note: candidate: ‘bar::bar()’
 struct bar {
        ^~~
test.cpp:8:8: note:   candidate expects 0 arguments, 1 provided
test.cpp:8:8: note: candidate: ‘constexpr bar::bar(const bar&)’
test.cpp:8:8: note:   no known conversion for argument 1 from ‘int’ to ‘const bar&’
test.cpp:8:8: note: candidate: ‘constexpr bar::bar(bar&&)’
test.cpp:8:8: note:   no known conversion for argument 1 from ‘int’ to ‘bar&&’
                                                                              [ 0s718 | Jan 27 01:35PM ]

So for bar the braces go through constructor overload resolution, effectively like the direct initialization bar b(0), rather than aggregate initialization.

This is probably because a constructor that is declared in the class but not defined there could be defined anywhere, potentially as an external symbol, for example:

bar::bar() {
  this->b = 1337; // whoa
}

In a non-optimized build, the compiler does not inline the call or deduce that it is a no-op.
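To make the "external symbol" point concrete, here is a small sketch. The type `widget` and the helper function are hypothetical stand-ins for bar, and the out-of-line definition is kept in the same file only so that the example is self-contained; the idea is that it could just as well live in another translation unit.

```cpp
#include <cassert>

struct widget {        // hypothetical stand-in for bar
    widget();          // declaration only: all that a caller is sure to see
    int b;
};

// Imagine this definition living in a different .cpp file.
widget::widget() : b(1337) {}

int brace_init_calls_constructor() {
    widget w{};        // value-initialization: calls widget::widget()
    return w.b;
}
```

The helper returns 1337: with a user-provided constructor, the empty braces call that constructor, and whatever it does or does not do decides the value of the member.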

Steve Fan