Address of uninitialized variable in C?

Question

int main() {

    int num;
    int a = num;
    int b = num;

    printf("%d", a);
    printf("%d", b);

    return 0;
}

I know that the value of a and b could be different and that it will be a garbage value, as num is not initialized. But will the address of num remain the same?

The argument is: does the compiler assign an address when there is an uninitialized variable? So, if I create an application with, let's say, 1,000 int variables that are uninitialized, does that mean the program will use 4 byte * 1,000 when it's executed?

Seeing as nothing gets the address of the variables, they could very well be optimized away entirely. So no, it doesn't mean that it will use 4000 bytes. — ikegami, Apr 06 '21 at 02:11
@ikegami I have edited the code to make it more interesting and have `num` as part of the function stack. Will `num` still get an address, and does the function use `4K` bytes in case there are 1000 such variables? — Vishrant, Apr 06 '21 at 02:13
Nothing about the stack is defined in the C standard, since it doesn't need one. It's possible that certain implementations of the C standard, along with the machine and OS, may always give the same result, but I don't think anyone gives this guarantee. I could be wrong, but maybe the first edition of C (the K&R C) may give some guarantees about stack usage for local variables. — Karthik Sriram, Apr 06 '21 at 02:16
Re "*I have edited the code to make it more interesting*", No, you didn't. I mean, yes, you edited the code, but in no way did it make any difference (interesting or otherwise). — ikegami, Apr 06 '21 at 02:17
@ikegami it was based on your earlier comment that if that's part of function then it will make a difference (which I see got edited). — Vishrant, Apr 06 '21 at 02:21
Actually, it's the opposite. If it wasn't in a function (or more precisely, if it was in static storage), then it might make a difference. Or maybe not. It probably could still be optimized away. — ikegami, Apr 06 '21 at 02:22
"will the address of `num` remain the same?" Same as what? It seems to me that you're asking about size, not address. The simple fact that you read from an uninitialized variable is undefined behavior, so anything could happen from there. If you instead want to ask what happens if you have 1000 *unused* variables, then that would depend on compiler optimizations. — jamesdlin, Apr 06 '21 at 02:22
@jamesdlin _The simple fact that you read from an uninitialized variable is undefined behavior_. I never knew using an uninitialized value is undefined behavior. I thought it's just indeterminate. — Raman, Apr 06 '21 at 02:46
The behaviour of the program is undefined, so the compiler could take any action (including rejecting the program) and still be a conforming compiler. — M.M, Apr 06 '21 at 03:32
@Raman See https://stackoverflow.com/questions/25074180/ . In this program the indeterminate value is passed to a library function, which is unambiguously undefined. — M.M, Apr 06 '21 at 03:33
You should read [n1570](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) or a newer C standard and a good textbook about compilation, e.g. [the Dragon book](https://en.wikipedia.org/wiki/Dragon_Book_(computer_science)). If you are allowed to use [GCC](http://gcc.gnu.org/), read carefully its documentation. You can invoke it to show internal representations, and add comments in the generated assembler — Basile Starynkevitch, Apr 06 '21 at 06:41

ikegami · Answer 1 · 2021-04-06T06:13:11.813

2

Seeing as nothing gets the address of the variables, they could very well be optimized away entirely. So no, it doesn't mean that declaring 1,000 32-bit ints will use 4,000 bytes.

Compare this, this and this. As you can see, for that compiler and settings, exactly the same binary is generated for all three of the following:

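/* Variant 1: print num directly. */
int num;

printf("%d", num);
printf("%d", num);

/* Variant 2: copy num into a and b first. */
int num;
int a = num;
int b = num;

printf("%d", a);
printf("%d", b);

/* Variant 3: add a third copy, c, that is never printed. */
int num;
int a = num;
int b = num;
int c = num;

printf("%d", a);
printf("%d", b);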

While 4,000 bytes could be reserved, it's by no means required.

edited Apr 06 '21 at 06:13

answered Apr 06 '21 at 02:16

ikegami


The question is not about if the code will be optimized or not, the question is if the `num` will have an address, and if it does what's making the value of that address different based on "I know that the value of a and b could be different" – Vishrant Apr 06 '21 at 02:24
@Vishrant My answer addresses: "does that mean the program will use 4 byte * 1,000 when it's executed?" But it also answers what you claim it doesn't: If a variable is optimized away, it doesn't have an address, so asking if its address is the same as something else makes no sense. – ikegami Apr 06 '21 at 02:24
yeah, in order to answer the first part of the question (without considering the compiler optimization), i.e., will the compiler assign an address, to me the answer is "no", as it would not make sense to assign memory to a variable until it's initialized. – Vishrant Apr 06 '21 at 02:28
Re "*without considering the compiler optimization*", uh, why would I do that? The question didn't ask "without optimizations". (If it did, I wouldn't have answered such a silly question.) The question was "will it?" And the answer is "no, not necessarily". – ikegami Apr 06 '21 at 02:29
Considering the compiler optimization, sure, the program will not use 4K bytes but an optimized amount of memory. – Vishrant Apr 06 '21 at 02:29
@Vishrant Variables are optimized away by virtue of being *unused*, not by being uninitialized. `int num; int* p = &num; printf("%s", p ? "not null" : "null");` is perfectly legal. – jamesdlin Apr 06 '21 at 02:29
@jamesdlin, Except they are being used here... Unused vars are very likely to be optimized away. But used variables can be optimized away, as is the case in my examples. – ikegami Apr 06 '21 at 02:30
Sure, but it seems pointless to discuss behavior of uninitialized variables since virtue of being uninitialized is irrelevant (other than leading to UB, which makes discussion even more pointless). – jamesdlin Apr 06 '21 at 02:32
well to me it's more of a curiosity question than it should be done in the production env or not. – Vishrant Apr 06 '21 at 02:33
@jamesdlin The same thing would have happened with initialized variables and no UB, though. [See here](https://godbolt.org/z/sjTqP5sef). What's different in your example is that you took the address of the variable. That changes things. That was mentioned in the answer. – ikegami Apr 06 '21 at 02:37
@Vishrant optimization is inherent to the definition of the language. C is defined in terms of an abstract machine, and the compiler is only required to produce the same output that the abstract machine would output. In the abstract machine every variable has a unique address; in the real machine anything at all can go on as long as the same output is produced as the abstract machine would output. And there are no requirements on real memory usage either: you could print the address of 1000 unused variables and the compiler might allocate no space and print 1000 values of some sort. – M.M Apr 06 '21 at 03:36

Chris Dodd · Answer 2 · 2021-04-06T06:38:37.557

It actually has nothing to do with initialization -- the compiler only needs to assign memory (and an address) to a variable if its address is taken (with the unary & operator) and used (if the address is not used, it might be dead-code eliminated away.)

If the address is never taken, then the variable might be optimized away completely. As long as the compiler can generate code that operates as if the variable existed, it is allowed to optimize it away.

In your example, even if you add an initialization for num (e.g., make it int num = 2;), it might be optimized away and the code turned into just fputs("22", stdout); getting rid of all the variables.

score 1 · Answer 3 · answered Apr 08 '21 at 19:08

Strictly speaking, the code in the question has undefined behavior because it accesses uninitialized objects. It also has a constraint violation in C99 or later, because it calls printf with no visible declaration.

Because of the undefined behavior, the C standard says literally nothing about how the program will behave. Undefined behavior is "behavior, upon use of a nonportable or erroneous program construct or of erroneous data, for which this International Standard imposes no requirements". The standard joke is that it could make demons fly out of your nose. Of course it can't, but the point is that if it did, that would only violate physics and common sense, not the standard.

So let's get rid of the undefined behavior:

#include <stdio.h>
int main(void) {
    int num = 42;
    int a = num;
    int b = num;

    printf("%d\n", a);
    printf("%d\n", b);
}

Section 6.2.4 of the ISO C standard (I'm using the N1570 draft) says:

The lifetime of an object is the portion of program execution during which storage is guaranteed to be reserved for it. An object exists, has a constant address, and retains its last-stored value throughout its lifetime.

So num, a, and b all have unchanging addresses throughout their lifetimes (which is the execution of the main function), and all three addresses are distinct.
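A minimal sketch of that guarantee, assuming a hosted C99-or-later compiler. The function name is made up for illustration, and the addresses it prints will differ between systems and runs:

```c
#include <stdio.h>

/* Returns 1 when num, a, and b keep distinct, unchanging addresses. */
int addresses_are_stable_and_distinct(void) {
    int num = 42;
    int a = num;
    int b = num;

    /* Record each address once, early in the objects' lifetimes. */
    int *p_num = &num, *p_a = &a, *p_b = &b;

    /* Storing new values changes the contents, not the locations. */
    num = 1;
    a = 2;
    b = 3;

    printf("%p %p %p\n", (void *)p_num, (void *)p_a, (void *)p_b);

    int stable = (p_num == &num) && (p_a == &a) && (p_b == &b);
    int distinct = (p_num != p_a) && (p_num != p_b) && (p_a != p_b);
    return stable && distinct;
}
```

Because the addresses are taken and printed here, the compiler has to give each object real storage for this function.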

That applies to the "abstract machine". An implementation is required to produce behavior as if all that were the case. If it can do that by generating code equivalent to just puts("42\n42"), that's a perfectly valid optimization. Or, less drastically, it could store num, a, and b in the same location (perhaps a CPU register) because it can prove that they always have the same value and their addresses are irrelevant.
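To make the "as if" idea concrete, here is a rough sketch that formats the output both ways and compares the results. All three function names are hypothetical, and the folded version only imitates by hand what an optimizer might do; it is not the output of any particular compiler:

```c
#include <stdio.h>
#include <string.h>

/* Formats the output the way the program above does, variables and all. */
static void with_variables(char *buf, size_t size) {
    int num = 42;
    int a = num;
    int b = num;
    snprintf(buf, size, "%d\n%d\n", a, b);
}

/* What an optimizer may reduce it to: the constant text and nothing else. */
static void folded(char *buf, size_t size) {
    snprintf(buf, size, "%s", "42\n42\n");
}

/* Returns 1 when both versions produce identical output. */
int same_observable_output(void) {
    char x[16], y[16];
    with_variables(x, sizeof x);
    folded(y, sizeof y);
    return strcmp(x, y) == 0;
}
```

Since the two versions cannot be told apart by their output, an implementation is free to emit either one.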

If the behavior of the program actually depends on the addresses of num, a, and b, for example if you print the addresses using printf("%p\n", &a), then that restricts some optimizations. (Incidentally, taking the address of an uninitialized variable is well defined; you might pass that address to a function that initializes it, for example.)
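A small sketch of that last point, with `init_int` and `init_through_pointer` as made-up names. The variable is never read before the helper stores into it, so there is no undefined behavior:

```c
#include <stdio.h>

/* Hypothetical helper: gives the pointed-to int its first value. */
static void init_int(int *out, int value) {
    *out = value;
}

/* Returns the value stored through the address of a not-yet-initialized local. */
int init_through_pointer(void) {
    int num;                 /* no value yet; &num is still valid */
    init_int(&num, 42);      /* first store happens via the pointer */
    printf("%p holds %d\n", (void *)&num, num);
    return num;
}
```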

So, if I create an application with, let's say, 1,000 int variables that are uninitialized, does that mean the program will use 4 byte * 1,000 when it's executed?

If you define 1000 int variables, the compiler will generate code that will allocate sizeof (int) bytes for each of them -- unless it can prove that it doesn't need to. If it can generate code that behaves as required without allocating that memory, it can do that. And if the behavior is undefined, then "behaves as required" isn't a requirement at all.
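For the size itself, a quick sketch that uses one array as a stand-in for 1000 separate variables (that substitution and the function name are my assumptions). It reports the size the abstract machine assigns, which is not necessarily what the generated code reserves:

```c
#include <stddef.h>

/* Size, in the abstract machine, of 1000 ints declared as one array. */
size_t bytes_for_1000_ints(void) {
    int values[1000];        /* never read, so leaving it uninitialized is harmless */
    return sizeof values;    /* sizeof does not evaluate its operand */
}
```

On an implementation where sizeof (int) is 4 this returns 4000, yet a compiler may well reserve no stack space for the array, since only its size is ever used.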

Re "*it calls `printf` with no visible declaration.*", Obvious includes are often omitted from SO snippets for readability. — ikegami, Apr 08 '21 at 21:08
@ikegami That's unfortunately true. Note that in C90, a program that calls `printf` with no visible declaration would be legal, but would have undefined behavior (it would very likely work "correctly"). Without the `#include <stdio.h>` it's not a [mre]. I thought that was worth pointing out. — Keith Thompson, Apr 08 '21 at 21:23
