14

I know that, in C++, when you write

int i;

you cannot make any assumptions about the value the variable holds until you actually assign it one. However, if you write

int i = int();

then you have the guarantee that i will be 0. So my question is, isn't this actually an inconsistency in the behavior of the language? I mean, if I have defined a class MyClass and write

MyClass myInstance;

I can rest assured that the default constructor of the class will be called to initialize myInstance (and compilation will fail if there is none), because that's how the RAII principle goes. However, it seems that when it comes to primitive types, resource acquisition is not initialization anymore. Why is that?

I don't think that changing this behavior inherited from C would break any existing code (is there any code in the world that works on the assumption that no assumption can be made about the value of a variable?), so the main possible reason that comes to my mind is performance, for example when creating big arrays of primitive types; but still, I'd like to know if there is some official explanation to this.

Thanks.

jdehesa
  • "is there any code in the world that works on the assumption that no assumption can be made about the value of a variable?" Yes, for entropy pools. – Étienne Apr 24 '14 at 08:43
  • 5
    I *think* you're looking at it from the wrong angle. Historically, variables were uninitialised, so the question should be why that changed for (some of) C++'s new class types, and that question is easier to answer. You're asking why the old types deviate from the norm, but the old types *were* the norm. –  Apr 24 '14 at 08:46
  • You could try to ask why MyClass(); is not required for the constructor to be called because it had resulted in more consistency in your view. – user2672165 Apr 24 '14 at 09:22
  • @Étienne Touché, cool example (although I hope they don't rely on uninitialized variables). – jdehesa Apr 24 '14 at 10:40
  • @user2672165 You are right, but that's just because that way the compiler would confuse the object construction with a function declaration. – jdehesa Apr 24 '14 at 10:42
  • @javidcf Some implementations do rely on them as one source of entropy among several, see for instance http://lwn.net/Articles/586427/ – Étienne Apr 24 '14 at 11:33
  • @Étienne: This is a bad idea. This is considered undefined behavior by C standard, and [some implementations optimize this](http://kqueue.org/blog/2012/06/25/more-randomness-or-less/). – Konrad Borowski Apr 24 '14 at 12:49

4 Answers

8

No. It is not inconsistency.

What if your class is defined as:

struct MyClass
{
    int x;
    float y;
    char *z;
};

then this line does NOT do what you think it does:

MyClass myInstance; 

Assuming the above is declared inside a function, its members are left uninitialized, the same as with:

int x; //assuming declared inside a function

In C++, types are broadly divided into three kinds (POD, non-POD, and aggregates), and there is a clear distinction between them. Please read about them and their initialization rules; there are many topics on them, so search on this site. Also read about static initialization and dynamic initialization.
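To make the difference concrete, here is a minimal sketch, assuming C++11 or later. Only `MyClass` comes from the answer; the helper names `make_zeroed`, `make_partial` and `check_initialization` are made up for illustration.

```cpp
#include <cassert>

struct MyClass
{
    int x;
    float y;
    char *z;
};

// Value-initialization: the empty braces zero every member of the aggregate.
MyClass make_zeroed()
{
    MyClass m{};
    return m;
}

// Aggregate initialization: x gets the listed value,
// the members that are not listed (y and z) are zeroed.
MyClass make_partial()
{
    MyClass m{42};
    return m;
}

void check_initialization()
{
    // MyClass junk;  // default-initialized inside a function: members hold
    //                // indeterminate values, and reading them is undefined behavior
    MyClass a = make_zeroed();
    assert(a.x == 0 && a.y == 0.0f && a.z == nullptr);

    MyClass b = make_partial();
    assert(b.x == 42 && b.y == 0.0f && b.z == nullptr);
}
```

The commented-out `junk` line is the case from the answer; it is left as a comment because its members cannot be read safely.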

Nawaz
  • shades of gray -- standard layout, aggregate... ;-) – Cheers and hth. - Alf Apr 24 '14 at 09:12
  • I'm not sure aggregates are treated specially; aggregate initialization follows the rules of its members. And it's not POD vs. non-POD; a class with virtual functions but no user defined constructors is not a POD, but it will be initialized as one. And you've missed one of the most important differences: static lifetime vs. other lifetimes. – James Kanze Apr 24 '14 at 09:25
  • @JamesKanze: value initialization was introduced in C++03 (as the only new language feature, AFAIK) in order to clean up the initialization behavior for aggregates. in particular the non-POD ones. so there's special treatment. – Cheers and hth. - Alf Apr 24 '14 at 10:46
  • 1
    @Nawaz Thanks. I've just found this reference covering several aspects about PODs and aggregates here http://stackoverflow.com/questions/4178175/what-are-aggregates-and-pods-and-how-why-are-they-special and another about initialization here http://stackoverflow.com/questions/620137/do-the-parentheses-after-the-type-name-make-a-difference-with-new/620402#620402 – jdehesa Apr 24 '14 at 11:03
  • @Alf The concept of value initialization was introduced in order to clarify issues that weren't clear in C++98. It didn't actually change anything with regards to what had been the _intent_ (nor what compilers were actually doing in practice). – James Kanze Apr 24 '14 at 11:06
7

The real reason, at least initially, was that C++ wanted all objects which are compatible with C to behave exactly as they would in C. The reason in C was (and still is) performance: zero initialization of objects with static lifetime was free (because the OS must initialize all memory that it gives the process anyway, for security reasons), whereas zero initialization otherwise costs runtime. (The performance rationale is less strong today than it was originally, because compilers are a lot better at determining that the variable will be initialized later and suppressing the zero initialization in such cases; but the cost does still exist, in particular in cases like:

char buffer[1000];
strcpy( buffer, something );

If zero initialization were required, I don't know of any compiler which would be able to skip it here, even though it won't be necessary.)
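As a rough sketch of the two alternatives being compared (the function names are hypothetical, and both assume `something` is a null-terminated string shorter than 1000 characters):

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// What the language does today: the buffer is not touched until strcpy runs.
std::size_t copy_without_zeroing(const char *something)
{
    char buffer[1000];               // no initialization, nothing is written here
    std::strcpy(buffer, something);  // writes only strlen(something) + 1 bytes
    return std::strlen(buffer);
}

// What mandatory zero initialization would amount to: all 1000 bytes are
// cleared first, although strcpy overwrites the part that is actually used.
std::size_t copy_with_zeroing(const char *something)
{
    char buffer[1000] = {};          // every element is set to '\0'
    std::strcpy(buffer, something);
    return std::strlen(buffer);
}

void check_copies()
{
    assert(copy_without_zeroing("hello") == 5);
    assert(copy_with_zeroing("hello") == 5);
}
```

Both functions return the same result; the only difference is the extra clearing work in the second one, which is the cost the answer is talking about. Whether a given compiler can remove that work is not something this sketch shows.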

James Kanze
  • Also performance penalty for static initialization is likely to be smaller than for local variables as static initialization only happens once. – user2672165 Apr 24 '14 at 09:27
  • @user2672165 The performance penalty for zero initialization of statics is 0, at least on classical general purpose machines. – James Kanze Apr 24 '14 at 09:45
  • If strcpy() is inlined in your particular variant of C++ a good compiler may be able to optimize the initialization away. Otherwise: little chance that would happen. – Tonny Apr 24 '14 at 14:08
  • @Tonny It would also have to prove that you never access beyond the trailing `'\0'`, because `strcpy` won't initialize these. – James Kanze Apr 24 '14 at 15:39
3

If you write

int i;

then the initialization or not depends on the context.

  • Namespace scope → zero-initialized.
  • Local function scope → uninitialized.
  • Class member: depends on the constructors, if any.

The lack of initialization for a local variable is just for efficiency. For a very simple function that is called repeatedly at the lowest levels, this can matter. And C and C++ are languages used to construct the bottom levels of things.
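A small sketch of the cases listed above, plus the static local variable raised in the comments. All names here are invented for illustration.

```cpp
#include <cassert>

int at_namespace_scope;        // static storage duration: zero-initialized

int next_count()
{
    static int counter;        // local scope but static storage duration: starts at 0
    return counter++;
}

int value_initialized_local()
{
    // int j;                  // automatic storage, default-initialized:
    //                         // indeterminate value, must not be read
    int i = int();             // value-initialized: 0
    return i;
}

struct WithEmptyCtor
{
    int i;
    WithEmptyCtor() {}         // does not set i
};

// Objects with static storage duration are zero-initialized before their
// constructor runs, so i is 0 here even though the constructor ignores it.
WithEmptyCtor static_instance;

void check_storage_rules()
{
    assert(at_namespace_scope == 0);
    assert(next_count() == 0);
    assert(next_count() == 1);
    assert(value_initialized_local() == 0);
    assert(static_instance.i == 0);
}
```

An automatic `WithEmptyCtor` object declared inside a function would leave `i` indeterminate, which is why that case is not exercised here.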

Cheers and hth. - Alf
  • 4
    The initialization depends on object lifetime, _not_ scope. Objects with static lifetime (even if they have local scope) are zero initialized before anything else (and this includes class members, even if the class has a constructor). – James Kanze Apr 24 '14 at 09:14
  • 1
    @JamesKanze: regarding the declaration `int i;` from the question and discussed in the answer, you're **wrong**. it can't be static unless it's in namespace scope , or in a static class type object (in which case it depends on constructors, if any). i understand that you're referring or *meant to refer* to variables in general, but by placing your remark as a comment on this answer it looks to readers like it contradicts something in the answer. – Cheers and hth. - Alf Apr 24 '14 at 10:37
  • `void f() { static int i; /*...*/ }`. The initialization does _not_ depend on the context; it depends on the object lifetime and the type of the object. And of course, even in `class MyClass { int i; MyClass() {} };`, `MyClass:i` will be zero initialized in a static instance of `MyClass`. – James Kanze Apr 24 '14 at 11:01
  • @JamesKanze: note that you have introduced the keyword `static`, not present in the question or the answer. similarly, by introducing the keyword `unsigned` you can change some properties of the variable. and so on. generally, declarations change meaning as you add various keywords. – Cheers and hth. - Alf Apr 24 '14 at 11:22
1

When you set a local variable in a function to some value, then every time the function is called, the assignment takes place and the value is written to the stack.

For example:

void func()
{
    int i = 0; // Every time `func` is called, 0 is written to the stack
    // ...
}

This is something that you might want to avoid, in particular since the C and C++ languages are also designed for real-time systems, where every operation matters.

And by the way, when you declare MyClass myInstance, you can indeed rest assured that the default constructor is called, but you can choose whether or not you want to do anything in that constructor.

So the C and C++ languages allow you to make the same choice for primitive-type variables as well.
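A minimal sketch of that choice, using two hypothetical classes (`Lazy` and `Eager` are not from the question):

```cpp
#include <cassert>

// The default constructor is called, but it chooses to do nothing.
struct Lazy
{
    int value;
    Lazy() {}
    void set(int v) { value = v; }
};

// The default constructor is called and chooses to initialize the member.
struct Eager
{
    int value;
    Eager() : value(0) {}
};

int use_lazy()
{
    Lazy l;          // constructor runs, value is still indeterminate
    l.set(7);        // so it has to be assigned before it is read
    return l.value;
}

int use_eager()
{
    Eager e;         // constructor runs and sets value to 0
    return e.value;
}

void check_constructors()
{
    assert(use_lazy() == 7);
    assert(use_eager() == 0);
}
```

For a local `int` the same two options exist: `int i;` corresponds to `Lazy`, and `int i = 0;` or `int i = int();` corresponds to `Eager`.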

barak manos
  • It's not the real time systems where it matters; it's ones which are CPU intensive (and which take a lot of time). – James Kanze Apr 24 '14 at 09:23
  • @JamesKanze I take it you never programmed on embedded real-time stuff ? Under tight conditions a couple of these might just make all the difference between being able to use a high level programming language and having to reach for assembler. Sometimes a few clock-cycles is all the leeway you have. You don't want any unneeded things happening behind your back. Granted: C++ isn't usually the language of choice for real-time stuff. It is usually C with additional tweaks and optimizations for the specific platform. – Tonny Apr 24 '14 at 14:13
  • @Tonny Most of my early career was on embedded real-time stuff. If things were that tight, we'd use a faster processor. (There are exceptions, where I've had to juggle, but when I got to that point, I was writing in assembler.) The important thing about real-time is that the response time have a deterministic upper bound. So we'd use heap sort, instead of quick sort (supposing we had to sort, which I've never seen), despite the fact that heap sort is typically a lot slower. – James Kanze Apr 24 '14 at 15:42
  • @JamesKanze Seems we both started in the same business then. But where I was working a faster CPU usually didn't exist. So we often ended up looking at the assembler generated by the C compiler and then hand-optimizing that. (I did a lot of Hitachi H8 stuff. The compiler was very bad at optimizing.) Or we went asm all the way for the really critical parts. I agree that determinism (and predictability) is everything in embedded real-time. Never had to sort either :-) – Tonny Apr 24 '14 at 21:05