You are experiencing this because you most likely didn't compile your code with optimizations turned on. When you do, your application's output becomes unpredictable, because the code violates the scoping semantics of C or C++.
If you don't use compile-time optimizations, you can still have some predictability even if you break the semantic rules. This is because the compiler limits itself to generating code in the order and with the logic that was written.
Once optimizations kick in, only the semantic rules of your programming language continue to give you control and predictability over the resulting machine code. That's why in production code (where you almost always want optimizations turned on in release binaries), you should never rely on these academic hacks.
The longer explanation
The way the compiler manages the stack follows two types of contracts:
- a strong contract: as with function calls between different binaries (such as shared libraries), which are governed by what is called a calling convention. Roughly speaking, the calling convention defines how the stack frame is managed when a function is called. This is a strong contract because it does not change with optimization settings, other compiler settings, or even different versions of the compiler. Otherwise, the ABI would break.
- a weak contract: as with local variables within a function, a statement, or a compound statement, or with calls to functions that are only visible within a single compile unit. There is no standard for how the compiler manages the stack here. It can do whatever it wants as long as it follows the semantics of the programming language, and this code is a target for compile-time optimization algorithms.
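To make the two contracts a bit more concrete, here is a minimal sketch. The function names square and sum_of_squares are hypothetical and only serve as illustration. It assumes a typical toolchain where a static function has internal linkage and may be inlined away, while a function with external linkage must remain callable from other binaries through the platform calling convention.

```c
/* Weak contract: internal linkage, visible only in this compile unit.
   The compiler may inline it, so it might never get a stack frame. */
static int square(int x) {
    return x * x;
}

/* Strong contract: external linkage. Callers in other binaries rely on
   the platform calling convention to pass a and b and get the result. */
int sum_of_squares(int a, int b) {
    return square(a) + square(b);
}
```

Whatever the compiler does with square internally, the observable result of sum_of_squares stays the same, because that part is covered by the language semantics.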
In your example and mine (see below), the semantics are broken: we define a compound statement, exit its scope, but still keep (or use) references to the memory used within that scope.
For example
Let's extend your example with this one and save it to a file named local.c:
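#include <stdio.h>

int main(int argc, char * argv[]) {
    int *ptr1, *ptr2;
    {
        int ch = 5;
        ptr1 = &ch;
    }   /* the first ch goes out of scope here; ptr1 now dangles */
    {
        int ch = 10;
        ptr2 = &ch;
    }   /* the second ch goes out of scope here; ptr2 now dangles */
    printf(
        "pointer1: %d\n"
        "pointer2: %d\n",
        *ptr1, *ptr2
    );
    return 0;
}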
Now, let's use gcc and compile this in two different ways to see what happens:
- with optimizations disabled
- with optimizations enabled
1. With optimizations disabled
# gcc local.c -O0 -o local; ./local
pointer1: 10
pointer2: 10
Well, we see that both ptr1 and ptr2 point to the exact same location. This makes sense because, after the first compound statement closes, the compiler reuses its reserved space for the second statement. This is the behavior we expect once we define the scopes of those compound statements with the { and } braces.
This is what you are experiencing with your example too. You are saving an address that points to a stack location the compiler knows is free for reuse as soon as it hits the closing brace }. Your example, however, doesn't have a subsequent statement that would show the effect in action.
2. With optimizations enabled
# gcc local.c -O1 -o local; ./local
pointer1: 0
pointer2: 0
Wait, what?
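Yes, the same code produces two different outputs. With optimizations turned on, the behavior changes: the compiler is free to replace your code with something that is faster or smaller, and since reading through a pointer to a variable whose scope has ended is undefined behavior, it has no obligation to preserve the values you expected.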
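For contrast, here is a minimal sketch of a variant that stays within the language semantics. The helper name copy_scoped_values is hypothetical. Instead of keeping the address of each ch, it copies the value out while the variable is still alive, so the result should no longer depend on the optimization level.

```c
/* Hypothetical helper: copies each value out of its inner scope
   while the variable is still alive, instead of keeping its address. */
void copy_scoped_values(int *first, int *second) {
    {
        int ch = 5;
        *first = ch;    /* copy the value, not the address */
    }
    {
        int ch = 10;
        *second = ch;   /* same here: only the value leaves the scope */
    }
}
```

The caller owns the storage behind first and second, so nothing refers to a stack slot that has already been released.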
Experimenting with function stack frames
For fun, let's try the same with functions:
#include <stdio.h>

void fn_set() { char a = 5; printf("fn_set: a=%d\n", a); }
void fn_get() { char a    ; printf("fn_get: a=%d\n", a); }  /* a is deliberately left uninitialized */

int main(int argc, char * argv[]) {
    fn_set();
    fn_get();
    return 0;
}
We expect fn_get to print 5, because its a should land on the same stack slot that fn_set just used, like the reused slot in our previous example.
And let's test this again:
# gcc local.c -O0 -o local; ./local # without optimizations
fn_set: a=5
fn_get: a=5
# gcc local.c -O1 -o local; ./local # with optimizations enabled
fn_set: a=5
fn_get: a=0
The result follows the same pattern. In theory, fn_get and fn_set have the same stack frame layout, so their frames should overlap nicely. In practice, no language rule guarantees that, so the compiler optimizations remove the unnecessary code (such as the stack slot for the uninitialized variable a in fn_get) and go for the simplest/fastest version.
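If the goal is for the second function to see the value from the first one, a sketch that stays within the language rules could look like this. The names fn_set_value and fn_get_value are hypothetical variants of the functions above. The value travels through a return value and a parameter, so no stale stack slot is involved.

```c
#include <stdio.h>

/* Hypothetical variant of fn_set: hands the value back to the caller. */
char fn_set_value(void) {
    char a = 5;
    printf("fn_set_value: a=%d\n", a);
    return a;
}

/* Hypothetical variant of fn_get: receives the value as a parameter
   instead of reading an uninitialized local. */
char fn_get_value(char a) {
    printf("fn_get_value: a=%d\n", a);
    return a;
}
```

Calling fn_get_value(fn_set_value()) passes the 5 along explicitly, which is something the optimizer has to preserve.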