-1

I recently found out that in C++, declaring ints outside of a function will initialize them to 0, but inside a function the behavior is undefined, and reading one will return a nonsense value or potentially crash the compiler. However, I found that if I declare an int (let's call it j) in a function but then print it by writing cout << j;, it prints 0, and from there on within the function it equals 0 and acts as such.

Firstly, why does printing do this? Why does this cause an uninitialized int inside a function to stop acting like a nonsense value and start acting like it was implicitly initialized to 0, as it is outside functions?

Secondly, why does C++ work this way to begin with? Which genius decided that uninitialized ints should act in a well-defined manner outside functions, but not within them? Wouldn't this be trivial to specify and implement in a compiler? Why on God's Green Earth is this how it works?

nvoigt
user6873235
  • You have just discovered Undefined Behavior (UB): you can't predict how it will be executed. Maybe the local `int` is initialized to 0, maybe not (it depends on the compiler and its options), maybe it works well, maybe it can destroy your computer – Garf365 Dec 20 '16 at 12:59
  • 6
    `0` is a perfectly valid nonsense value. – Quentin Dec 20 '16 at 13:00
  • 5
    The language standard dictates that any global and/or static variable should be initialized to (filled with) zeros. It doesn't dictate the same for non-static local variables (and if it did, then that would yield an immediate increase in both code-size and running-time, since each local variable will have to be initialized every time the corresponding function is invoked). – barak manos Dec 20 '16 at 13:00
  • Learn the lingo, kiddo. "Outside of functions" are variables with static storage duration. They are always zero-initialized by default, because it's easy to do efficiently. It's slightly more costly with variables that have automatic storage duration, so it's not done by default. The "genius" that designed the language didn't want you to have to pay in performance for something you may not need, however minuscule the price is. – StoryTeller - Unslander Monica Dec 20 '16 at 13:02
  • Values outside of functions are presumably in your case "global", which means they get put in a special section of the executable called the bss. These values are initialized to zero because they are part of the binary itself, whereas values declared in functions are just assigned a reference to memory on the stack and will therefore typically have whatever value was already at that location until they are initialized. – jhbh Dec 20 '16 at 13:03
  • 4
    With global and/or static variables, you simply "apply" zeros at their (constant) addresses within the executable image, so there is no additional runtime overhead required in order to initialize them. They attain their (zero) values as soon as the program is loaded into memory. – barak manos Dec 20 '16 at 13:05
  • @StoryTeller: Although it is indeed easy to initialize global and/or static variables efficiently (by "embedding" zeros into the executable image), I doubt that the language-standard committee took that as one of the considerations when they wrote this rule. If they did, then it means that they were thinking of (compiler) implementation benefits, which, being a standard committee, they are not supposed to... But maybe I'm wrong on that one... – barak manos Dec 20 '16 at 13:09
  • @barakmanos - That little tidbit predates standardization, and is inherited from C. It wasn't adopted by the standard committee so much as stated and kept to avoid breaking existing code, methinks. – StoryTeller - Unslander Monica Dec 20 '16 at 13:10
  • @StoryTeller: So you're saying that the language (and its compilers) has evolved before any standard was established? – barak manos Dec 20 '16 at 13:12
  • 1
    Look at it this way. A global or static variable is only initialized once, not much overhead there. On the other hand function local, block local, variables can be initialized many, many, many times. Not very smart to waste resources on that unless the programmer explicitly states they want it. – NathanOliver Dec 20 '16 at 13:12
  • 1
    Possible duplicate of [When are static and global variables initialized?](http://stackoverflow.com/questions/17783210/when-are-static-and-global-variables-initialized) – doctorlove Dec 20 '16 at 13:12
  • @barakmanos - Of course. C++ was in use before the move to standardization had even begun. Compilers existed since 1983, long before the first standard. – StoryTeller - Unslander Monica Dec 20 '16 at 13:13
  • @StoryTeller: So this means that the standard itself is somewhat inclined to implementation details... – barak manos Dec 20 '16 at 13:14
  • @doctorlove I disagree with the dupe closure. The question is more a why is there a difference in initialization between global/static objects and everything else. – NathanOliver Dec 20 '16 at 13:20
  • True - I guess the answer is "pay for what you use" in that case – doctorlove Dec 20 '16 at 13:22

1 Answer

0
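  1. Globals / statics (outside functions) have static storage duration. They live in a fixed data region of the program (not on the heap, which is for dynamic allocation), so each one has a single address for the whole run. C and its derivatives guarantee that these objects are zero-initialized before the program starts, unless you give them another initializer.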

  2. Local variables (inside functions). All the compiler does is add the size of that type to the function's 'stack' space. When the function is called, that stack space is simply a chunk of memory holding whatever was there before, so the variable's value is indeterminate and could be nonsense.

  3. This will not crash the compiler. Compilation fails only when you use the uninitialized local in a context where the compiler needs to know its value at compile time (printing is not one of those), or when you have told the compiler to treat the relevant warnings as errors.

Look up 'program heap' and 'function stack' memory for more info. Here is one link: http://gribblelab.org/CBootcamp/7_Memory_Stack_vs_Heap.html

Hope this helps you find the resources you need. This is a perfectly valid question that a beginner SHOULD have in order to develop an understanding of the mechanics of C-based languages when it comes to creating a PHYSICAL assembly with real memory management at the assembly level.

EDIT: It is also worth noting that not all stack variables end up being used. Imagine you had two control paths (an if/else pattern) and declared ints in each. Only one of the ints is needed, and the compiler does not want to store an initial value for both if only one will be used.

It's expensive (in CPU cycles) to zero all kinds of memory every time a function's stack frame is set up by a call at run time. You would never want that to happen when most variables (SHOULD) end up getting set to something useful later anyway!

sbail95
  • This is all useful and relevant, but doesn't quite answer my main question. Why does printing the variable (by way of `cout << j;`) cause it to seemingly zero-initialise (it prints `0` and subsequently acts as such) whereas it does not do that when I don't print it? And in your control flow example with the if-else structure, couldn't you simply have the compiler surreptitiously insert a zero-initialisation in the flow just before an uninitialised variable is used, hence not requiring it be done if that path is not taken? Is this, as mentioned above, a pre-standardisation holdover for back-compat? – user6873235 Dec 21 '16 at 08:20