
There are many claims that any use of uninitialised variables invokes undefined behavior (UB).
Perusing the docs, I could not verify that claim, so I would like a convincing argument clarifying this for both C and C++.
I expect the same semantics for both, but am prepared to be surprised by subtle or not so subtle differences.

Some examples of using uninitialised variables to get started. Please add others as needed to explain any corner-cases they don't cover.

void test1() {
    int x;
    printf("%d", x);
}

void test2() {
    int x;
    for(int i = 0; i < CHAR_BIT * sizeof x; i++)
        x = x << 1;
    printf("%d", x);
}

void test3() {
    unsigned x;
    printf("%u", x); /* was format "%d" */
}

void test4() {
    unsigned x;
    for(int i = 0; i < CHAR_BIT * sizeof x; i++)
        x = x << 1;
    printf("%u", x); /* was format "%d" */
}
Deduplicator
  • Sure that C and C++ don't match for such a subtle issue. – Jens Gustedt Apr 03 '14 at 13:47
  • Also you have UB for the wrong `printf` specifiers, but I guess that is not what you are asking? – Jens Gustedt Apr 03 '14 at 14:00
  • @Jens: Are you really sure that interpreting an unsigned > INT_MAX as signed is UB? I concede I should have used "%u" for test3 and test4, but where is the convincing rationale that either is UB now? – Deduplicator Apr 03 '14 at 14:04
  • 3
    Please don't play games. If you are asking a question about UB, please try to be a bit more specific in the things you want to know about it. For each of the functions this would have cost you one little commenting phrase. Or state otherwise, don't expect precise answers when your question is fuzzy. – Jens Gustedt Apr 03 '14 at 14:33
  • @Jens: As I said, I'm sorry for throwing that one in there too; it was not intended. AFAICT, it invokes at most no more UB than the question I really wanted to ask. Correcting it, as it distracts. – Deduplicator Apr 03 '14 at 14:36
  • 9
    Excuse me, but what is your question? What exactly are the assertions that you're asking about? You can't just dump a block of code on us and say "here you go, is any of this UB?" – Lightness Races in Orbit Apr 03 '14 at 14:36
  • The question is what the standard says about use of x in the examples. I included all 4, because they are strongly related, but there might be subtleties which can only be appreciated if one has them all. – Deduplicator Apr 03 '14 at 14:39
  • I don't understand the 3 downvotes. This is a good conceptual question, well-posed and is not obvious. – Bathsheba Apr 03 '14 at 14:41
  • 4
    @Deduplicator: What do you mean "the use of `x`"? You "use" `x` in at least four different ways. Ask a simple, up-front, concrete question instead of being so hopelessly vague about what it is that you don't understand. – Lightness Races in Orbit Apr 03 '14 at 14:43
  • 6
    @Bathsheba: The downvotes are there because, no, it is indeed not obvious at all what the OP is asking. Is he concerned about overflowing during shifts? About printing the value of uninitialised variables? About shifting uninitialised variables? About overflowing uninitialised variables? Just signed, or unsigned too? Something else? _What?_ – Lightness Races in Orbit Apr 03 '14 at 14:44
  • Simply said: reading / using /modifying uninitialised variables. And both signed and unsigned, in case there are differences. – Deduplicator Apr 03 '14 at 14:59
  • Similar to http://stackoverflow.com/q/6824488/560648 then, but with a UB-specific twist. Please edit your question to include your clarification. – Lightness Races in Orbit Apr 03 '14 at 15:18
  • @Light: So completely different. Not asking for why I get a specific value, but if it is valid at all. – Deduplicator Apr 03 '14 at 22:29

4 Answers

10

In C, all of them have undefined behavior, but for a reason that probably does not come directly to mind. Accessing an object with an indeterminate value has undefined behavior if the object is "memoryless", that is, if its address is never taken (6.3.2.1 p2):

If the lvalue designates an object of automatic storage duration that could have been declared with the register storage class (never had its address taken), and that object is uninitialized (not declared with an initializer and no assignment to it has been performed prior to use), the behavior is undefined.

Otherwise, if the address is taken, there is no consensus on what "indeterminate" means concretely. Some people expect such a value to be fixed once it is first read; others speak of "wobbly" values that can be different at each access.
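
To make the distinction concrete, here is a minimal sketch of the two situations. The function names are hypothetical and chosen only for illustration. Both functions assign a value before the read, so the code as written is well defined; the comments mark where a read would fall under 6.3.2.1 p2.

```c
#include <stdio.h>

/* x never has its address taken, so it could have been declared
   `register`. Reading it before the assignment would be undefined
   behavior under 6.3.2.1 p2. */
int register_eligible(void) {
    int x;        /* indeterminate */
    /* return x;     <- undefined behavior if uncommented */
    x = 42;       /* first assignment */
    return x;     /* well defined: x now holds 42 */
}

/* y has its address taken, so 6.3.2.1 p2 no longer applies to it.
   A read before the store would still see an indeterminate value,
   which may be a trap representation, so it is still best avoided. */
int address_taken(void) {
    int y;        /* indeterminate */
    int *p = &y;  /* address taken */
    *p = 7;       /* store through the pointer */
    return y;     /* well defined: y now holds 7 */
}
```

Calling `register_eligible()` returns 42 and `address_taken()` returns 7. Neither function ever reads an indeterminate value, which is the only portable way to stay clear of the rule.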

In summary, don't do it. (But that you probably knew already.)

(And not talking about the error using "%d" for an unsigned.)

Jens Gustedt
  • Could you give the references, best would be relating to the openly available drafts? – Deduplicator Apr 03 '14 at 14:09
  • 1
    @Deduplicator, see my edit for the normative part. For the ongoing discussion, maybe the mailing of the committee is available online, I am not sure. What I know is that there is still a point on the agenda of the upcoming Parma meeting concerning interpretation of these things. – Jens Gustedt Apr 03 '14 at 14:26
  • That certainly put paid to the C side of things. Only C++ to go, which will probably resolve similarly. – Deduplicator Apr 03 '14 at 14:51
  • @Deduplicator: C++ wording: "Otherwise, if `T` is a (possibly cv-qualified) unsigned character type (3.9.1), and the object to which the glvalue refers contains an indeterminate value (5.3.4, 8.5, 12.6.2), and that object does not have automatic storage duration or the glvalue was the operand of a unary `&` operator or it was bound to a reference, the result is an unspecified value." followed by "Otherwise, if the object to which the glvalue refers contains an indeterminate value, the behavior is undefined." – Ben Voigt Jul 10 '14 at 05:39
  • So like C, C++ contains a narrow exception for variables which cannot be enregistered... but only for unsigned character types (likely related to the mention of character types in John's answer) – Ben Voigt Jul 10 '14 at 05:41
8

C

C11 6.7.9/10

If an object that has automatic storage duration is not initialized explicitly, its value is indeterminate.

Indeterminate values are handled as follows:

C11 6.2.6.1/5

Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined.50) Such a representation is called a trap representation.

There's a comment to the above normative text:

50) Thus, an automatic variable can be initialized to a trap representation without causing undefined behavior, but the value of the variable cannot be used until a proper value is stored in it.

(emphasis mine)

Furthermore, left-shifting a signed int variable that contains an indeterminate value can also lead to undefined behavior, namely if the value happens to be negative or the result is not representable:

C11 6.5.7/4

The result of E1 << E2 is E1 left-shifted E2 bit positions; vacated bits are filled with zeros. If E1 has an unsigned type, the value of the result is E1 × 2^E2, reduced modulo one more than the maximum value representable in the result type. If E1 has a signed type and nonnegative value, and E1 × 2^E2 is representable in the result type, then that is the resulting value; otherwise, the behavior is undefined.
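
The conditions in 6.5.7/4 can be sketched as a pre-check for initialized operands. The helper names `checked_shl` and `wrapping_shl` are hypothetical, and the width computation assumes `int` and `unsigned` have no padding bits. The count check reflects the separate rule that shifting by the operand's width or more is also undefined.

```c
#include <limits.h>
#include <stdbool.h>

/* Signed case: returns false whenever value << count would be
   undefined behavior, otherwise stores the result in *out.
   Assumption: int has no padding bits. */
bool checked_shl(int value, unsigned count, int *out) {
    if (value < 0)                       /* negative E1: undefined */
        return false;
    if (count >= sizeof(int) * CHAR_BIT) /* count too large: undefined */
        return false;
    if (value > (INT_MAX >> count))      /* E1 * 2^E2 not representable */
        return false;
    *out = value << count;
    return true;
}

/* Unsigned case: the result is reduced modulo UINT_MAX + 1, so only
   the shift count needs to be in range (caller's responsibility). */
unsigned wrapping_shl(unsigned value, unsigned count) {
    return value << count;
}
```

For example, `checked_shl(1, 3, &r)` succeeds and stores 8, while `checked_shl(-1, 1, &r)` and `checked_shl(INT_MAX, 1, &r)` are rejected. None of this helps with an uninitialized operand, since the read itself is already the problem.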

Lundin
  • Ok, so left shifting an indeterminate signed value is UB. Still leaves the unsigned case open. – Deduplicator Apr 03 '14 at 14:44
  • Hm, Jens Gustedt's answer used a different tack to show all examples invoke UB in C, due to the automatic variable x never having its address taken. – Deduplicator Apr 03 '14 at 16:03
  • @Deduplicator They are all different examples. The key is that when you start using the uninitialized variable, you're on your own; you invoke UB, so anything might happen. – Lundin Apr 03 '14 at 16:44
  • All your quotes but the last one only allow an implementation to have trap representation for int/unsigned int, but whether they have them is implementation defined, and only if they have them do those quotes mandate UB. The last one, all alone, puts paid for left-shifting indeterminate signed variables, but does not do so for any of the other tests. – Deduplicator Apr 03 '14 at 17:04
4

All four cases invoke undefined behavior in C, since the uninitialized automatic variable never has its address taken. See the other answer that cites 6.3.2.1 p2.

By the way, sizeof x is well defined, since its operand is not evaluated: the result is computed at compile time from the operand's type.
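
A small sketch of that point, with hypothetical function names. It assumes the operand is not a variable length array, which is the one case where `sizeof` does evaluate its operand.

```c
#include <stddef.h>

/* x is never read: sizeof only looks at the type of its operand. */
size_t size_of_uninitialized(void) {
    int x;               /* indeterminate, and never read below */
    return sizeof x;     /* same as sizeof(int) */
}

/* The side effect inside sizeof never happens, which shows that
   the operand is not evaluated. */
int operand_not_evaluated(void) {
    int i = 0;
    size_t n = sizeof(i++);  /* i++ is not executed */
    (void)n;                 /* silence unused-variable warnings */
    return i;                /* still 0 */
}
```

Here `size_of_uninitialized()` equals `sizeof(int)` and `operand_not_evaluated()` returns 0. Some compilers warn about the unevaluated `i++`, which is the expected diagnostic for this pattern.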

In the latest C++1y draft (N3936), this is clearly undefined behavior, since the language on indeterminate values and undefined behavior has been clarified. Section 8.5 now says:

[...]If an indeterminate value is produced by an evaluation, the behavior is undefined except in the following cases;

and goes on to list exceptions that apply only to unsigned narrow character types.

Previously in C++ we had to rely on the underspecified lvalue-to-rvalue conversion to prove undefined behavior, which is problematic in the general case. In this case we do have an lvalue-to-rvalue conversion, as shown by section 5.2.2 Function call, paragraph 7, which says (emphasis mine):

When there is no parameter for a given argument, the argument is passed in such a way that the receiving function can obtain the value of the argument by invoking va_arg (18.10). [...] The lvalue-to-rvalue (4.1), array-to-pointer (4.2), and function-to-pointer (4.3) standard conversions are performed on the argument expression.

Bathsheba
  • It actually does not invoke *undefined behavior*. Nowhere in the standard is anything like that written. – Shoe Apr 03 '14 at 13:39
  • Really? Is it my loose terminology that bothers? – Bathsheba Apr 03 '14 at 13:39
  • The standard simply specifies that the value is *indeterminate*. – Shoe Apr 03 '14 at 13:40
  • @Jefffrey, Using an uninitialized variable has always been UB AFAIK. – chris Apr 03 '14 at 13:40
  • The standard only says (C++ at §8.5/12): "If no initializer is specified for an object, the object is default-initialized. When storage for an object with automatic or dynamic storage duration is obtained, the object has an indeterminate value, and if no initialization is performed for the object, that object retains an indeterminate value until that value is replaced (5.17)." – Shoe Apr 03 '14 at 13:41
  • 3
    @Jefffrey, In C++11 § 4.1 [conv.lval]/1, *if the object is uninitialized, a program that necessitates this conversion has undefined behavior*. I imagine something along those lines is in other standards. – chris Apr 03 '14 at 13:41
  • Undefined behavior is only associated with an uninitialized pointer, AFAIK. – Shoe Apr 03 '14 at 13:41
  • @chris, can you please tell me which section is it in? – Shoe Apr 03 '14 at 13:42
  • @chris, that apparently changed in C++14, since what I have (with the latest draft - N3936) is: "If T is an incomplete type, a program that necessitates this conversion is ill-formed." – Shoe Apr 03 '14 at 13:46
  • 3
    @Jefffrey dyp pointed me to [defect report 616](http://www.open-std.org/JTC1/SC22/WG21/docs/cwg_defects.html#616) when I made a similar statement in this [answer here](http://stackoverflow.com/a/21213706/1708801). I am guessing this is part of the latest draft, but I have not checked yet. – Shafik Yaghmour Apr 03 '14 at 13:46
  • @ShafikYaghmour, Hmm, thanks, and I didn't realize that disappeared in the C++14 draft. – chris Apr 03 '14 at 13:48
  • @Jefffrey: C99, Appendix J.2 (a list of causes of UB): "The value of an object with automatic storage duration is used while it is indeterminate". – Oliver Charlesworth Apr 03 '14 at 13:48
  • @OliCharlesworth, yeah, I don't know the C standard. But the C++ one, as you can see, talks about *indeterminate value*, not undefined behavior, if I understood correctly. – Shoe Apr 03 '14 at 13:50
  • @OliCharlesworth, appendix J is not normative, and should perhaps be seen as a list of points that *may* lead to UB, not that they *must*. There is still ongoing discussion in the standards committee about all that. – Jens Gustedt Apr 03 '14 at 13:54
  • 1
    @TonyD: Took me six seconds to fix. You could have done that yourself. – Lightness Races in Orbit Apr 03 '14 at 14:39
  • @Jefffrey I asked because the github version says a lot more see [Has C++1y changed with respect to the use of indeterminate values and undefined behavior?](http://stackoverflow.com/questions/23415661/has-c1y-changed-with-respect-to-the-use-of-indeterminate-values-and-undefined). I am guessing the initial release after the meeting was missing some edits. – Shafik Yaghmour May 02 '14 at 14:15
  • @Bathsheba I am not sure if you see comments since this is CW now but you may be interested in the edits I made. – Shafik Yaghmour May 06 '14 at 01:08
  • @chris I clarified the post ... previously I was trying to tie down details but the language has changed a lot in C++1y. – Shafik Yaghmour May 06 '14 at 01:09
3

With respect to C, the behavior of all the examples may be undefined:

Chapter and verse

3.19.2
1 indeterminate value
either an unspecified value or a trap representation
...
6.2.6 Representations of types
6.2.6.1 General
...
5 Certain object representations need not represent a value of the object type. If the stored value of an object has such a representation and is read by an lvalue expression that does not have character type, the behavior is undefined. If such a representation is produced by a side effect that modifies all or any part of the object by an lvalue expression that does not have character type, the behavior is undefined.50) Such a representation is called a trap representation.
...
50) Thus, an automatic variable can be initialized to a trap representation without causing undefined behavior, but the value of the variable cannot be used until a proper value is stored in it.

In all four cases, x has automatic storage duration and is not explicitly initialized, meaning its value is indeterminate; if this indeterminate value is a trap representation, then the behavior is undefined.
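
As a sketch of what "until a proper value is stored in it" can look like in practice, the hypothetical function below writes the object's bytes before any read of type `int`. It relies on the C11 guarantee that, for any integer type, the all-bits-zero representation represents the value zero.

```c
#include <string.h>

/* Stores a valid representation into x before reading it as an int.
   memset writes through character-type accesses, which 6.2.6.1/5
   exempts from the trap-representation rule. */
int store_then_read(void) {
    int x;                    /* indeterminate, possibly a trap representation */
    memset(&x, 0, sizeof x);  /* all bits zero: a representation of 0 */
    return x;                 /* well defined: reads the value 0 */
}
```

`store_then_read()` returns 0 on any conforming implementation. A plain `int x = 0;` achieves the same thing more directly; the `memset` form is used here only to show that the store does not have to go through an lvalue of type `int`.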

EDIT

Removed reference to appendix J, as it is non-normative.

John Bode
  • 1
    That appendix is non-normative, as Jens Gustedt pointed out in the comments to another answer. – Deduplicator Apr 03 '14 at 14:11
  • 2
    When quoting ISO standards, have in mind that appendices are either explicitly normative or explicitly informative. This is stated on top of the appendix. Also, foot notes in ISO standards are never normative. – Lundin Apr 03 '14 at 14:29
  • 1
    @Deduplicator: Yeah, but it provides cross-references to normative text... Did you try following them? – Lightness Races in Orbit Apr 03 '14 at 14:37
  • @Light: Other answers dug out normative text, which covers the examples in the question, though not everything covered by the non-normative text cited here. – Deduplicator Apr 03 '14 at 16:08