7

Is the following code valid C++, according to the standard (discounting the ...s)?

bool f(T& r)
{
    if(...)
    {
        r = ...;
        return true;
    }
    return false;
}

T x = (f(x) ? x : T());

It is known to compile in the GCC versions this project uses (4.1.2 and 3.2.3... don't even get me started...), but should it?

Edit: I added some details, for example what f() conceptually looks like in the original code. Basically, it's meant to initialize x under certain conditions.

Shoe
ShdNx
  • Whether syntactically valid or not, it is logically meaningless since you'd be accessing an uninitialized variable. What did you *expect* it to do? – Cody Gray - on strike May 20 '14 at 15:16
  • @CodyGray Nothing from this code tells us that `x` is left uninitialized in `f()`. – πάντα ῥεῖ May 20 '14 at 15:18
  • @CodyGray It could just be an output parameter: `bool f( int& x ){ x = 10; }` – clcto May 20 '14 at 15:19
  • Define valid. It could mean many different things. – yizzlez May 20 '14 at 15:22
  • This situation is very similar to something like `struct Foo { Foo() { f(*this); } };` It's essentially UB to pretend something is an object before the object's life time has begun. – Kerrek SB May 20 '14 at 15:33
  • @awesomeyi: I mean by the C++ standard. I've updated the question to reflect this. – ShdNx May 20 '14 at 16:12
  • @twalberg: Thank you for the productive comment. – ShdNx May 20 '14 at 16:12
  • @CodyGray: I've updated the question to reflect the semantics of the function in the original code. – ShdNx May 20 '14 at 16:13
  • Possible duplicate of: http://stackoverflow.com/questions/9820027/using-newly-declared-variable-in-initialization-int-x-x1 – Shoe Jun 05 '14 at 18:57
  • @Jeffrey: Not a duplicate, that question doesn't consider writes to the object before its initialization takes place. I even put an answer on that question, with a link to this highlighting the difference, before your comment – Ben Voigt Jun 05 '14 at 19:21
  • @KerrekSB Are you sure that that code is UB? Because you are allowed to call other members of `Foo` within its constructor, which is equivalent to `this->someMember();`. And that member, again, is perfectly entitled to do `f(*this);`. Afaik, the only surprising thing that may happen is, that `*this` is not yet an object of the subclass that is supposed to be constructed, allowing failure when pure virtual functions are called, but not undefined behavior. – cmaster - reinstate monica Jun 05 '14 at 20:23
  • Related to [Can initializing expression use the variable itself?](http://stackoverflow.com/q/33649370/1708801) and [Is passing a C++ object into its own constructor legal?](http://stackoverflow.com/q/32608458/1708801) – Shafik Yaghmour Dec 25 '15 at 03:47
  • I added an answer since both of the accepted ones miss several points and neither covers whether binding a reference to an object before its lifetime has begun is valid. – Shafik Yaghmour Dec 27 '15 at 05:57

3 Answers

2

Syntactically it is. However, if you try this:

#include <iostream>
using namespace std;

typedef int T;
bool f(T& x)
{
    return true;
}
int main()
{
    T x = (f(x) ? x : T());
    cout << x;
}

it outputs some random junk. However, if you modify f to

bool f(T& x)
{
    x = 10;
    return true;
}

then it outputs 10. In the first case, x is declared but nothing ever writes to it, so it holds some pseudo-arbitrary value (you never give it one) and the initializer reads that value. In the second case, f assigns 10 to x before the initializer reads it, so x is initialized from a well-defined value.

I think your question is similar to this one: Using newly declared variable in initialization (int x = x+1)?

vsoftco
  • I've updated the question. Thank you for the related SO thread link! So basically unless f() uses the (unspecified) value of x, the code should be valid? – ShdNx May 20 '14 at 15:45
  • Whenever `f()` returns `true` and it is not initializing `x` inside, you get an undefined value for `x`. If `f()` returns `false` then `x` is value-initialized with 0. In your updated code, I see that `f` returns `true` on the branch in which `x` is initialized, so in this case you shouldn't get any undefined value. – vsoftco May 20 '14 at 16:15
  • I have accepted your answer because of the link to the other SO thread, which clears up the question: the variable x is defined at the point of the = symbol, so referencing it is legal, even though the value of x is unspecified at that point. Thank you! – ShdNx May 21 '14 at 10:08
  • This answer confuses assignment and initialization. The `x = 10;` in the latter function is not initialization. – Ben Voigt Jun 05 '14 at 19:45
  • Umm, the value `T()` in `T x = (f(x)? x : T());` is used for initialization. I was talking about `x = 10;` inside the second `f()`, that's assignment. – Ben Voigt Jun 05 '14 at 20:00
  • Ahhh ok, are you talking about the comment? I cannot modify it unfortunately. – vsoftco Jun 05 '14 at 20:48
  • Note that although your code works since you are using *int*, if the object had non-trivial initialization it would invoke undefined behavior. See my answer for all the details. Also, a key point is whether it is valid to bind a reference to an object before its lifetime has begun. – Shafik Yaghmour Dec 27 '15 at 05:55
1

It undoubtedly should compile, but may conditionally lead to undefined behavior.

  • If T is a non-primitive type, assigning to it is undefined behavior.
  • If T is a primitive type, the behavior is well-defined if x is non-local and, for a local x, undefined if x is read before it is assigned (except for character types, where the read is defined to give an unspecified value).

The relevant part of the Standard is this rule from 3.8, Object lifetime:

The lifetime of an object of type T begins when:

  • storage with the proper alignment and size for type T is obtained, and
  • if the object has non-trivial initialization, its initialization is complete.

So the lifetime of x hasn't started yet. In the same section, we find the rule that governs using x:

Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. For an object under construction or destruction, see 12.7. Otherwise, such a glvalue refers to allocated storage (3.7.4.2), and using the properties of the glvalue that do not depend on its value is well-defined. The program has undefined behavior if:

  • an lvalue-to-rvalue conversion (4.1) is applied to such a glvalue,
  • the glvalue is used to access a non-static data member or call a non-static member function of the object, or
  • the glvalue is bound to a reference to a virtual base class (8.5.3), or
  • the glvalue is used as the operand of a dynamic_cast (5.2.7) or as the operand of typeid.

If your type is non-primitive, then trying to assign it is actually a call to T::operator=, a non-static member function. Full-stop, that is undefined behavior according to case 2.

Primitive types are assigned without invoking a member function, so let's now take a closer look at section 4.1, Lvalue-to-rvalue conversion, to see when exactly that lvalue-to-rvalue conversion will be undefined behavior:

When an lvalue-to-rvalue conversion occurs in an unevaluated operand or a subexpression thereof (Clause 5) the value contained in the referenced object is not accessed. In all other cases, the result of the conversion is determined according to the following rules:

  • If T is (possibly cv-qualified) std::nullptr_t, the result is a null pointer constant (4.10).
  • Otherwise, if T has a class type, the conversion copy-initializes a temporary of type T from the glvalue and the result of the conversion is a prvalue for the temporary.
  • Otherwise, if the object to which the glvalue refers contains an invalid pointer value (3.7.4.2, 3.7.4.3), the behavior is implementation-defined.
  • Otherwise, if T is a (possibly cv-qualified) unsigned character type (3.9.1), and the object to which the glvalue refers contains an indeterminate value (5.3.4, 8.5, 12.6.2), and that object does not have automatic storage duration or the glvalue was the operand of a unary & operator or it was bound to a reference, the result is an unspecified value.
  • Otherwise, if the object to which the glvalue refers contains an indeterminate value, the behavior is undefined.
  • Otherwise, the value contained in the object indicated by the glvalue is the prvalue result.

(Note that these rules reflect a rewrite for the upcoming C++14 standard, intended to make them easier to understand, but I don't think there's an actual change in the behavior here.)

Your variable x has[1] an indeterminate value at the time an lvalue-reference is made and passed to f(). As long as that variable has primitive type and its value is assigned before it is read (a read is lvalue-to-rvalue conversion), the code is fine.

If the variable isn't assigned before being read, the effect depends on T. Character types will cause code that executes and uses an arbitrary but legal character value. All other types cause undefined behavior.
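To make the primitive-type case concrete, here is a minimal sketch of the question's pattern with T fixed to int. The `ok` parameter and the name `init_via_f` are made up for illustration and stand in for the elided condition in the original f(); x is read only on the branch where f() has already assigned it.

```cpp
#include <cassert>

typedef int T;  // assumption: T is a primitive type

// Hypothetical stand-in for the question's f(): writes to r only when ok is true.
bool f(T& r, bool ok)
{
    if (ok)
    {
        r = 10;      // plain assignment through the reference; r is never read here
        return true;
    }
    return false;
}

// Hypothetical wrapper so the pattern can be exercised for both branches.
T init_via_f(bool ok)
{
    // x is read only when f() returned true, i.e. after it was assigned;
    // otherwise the initializer is the value-initialized T().
    T x = (f(x, ok) ? x : T());
    return x;
}
```

A compiler may still warn about x being used in its own initializer; that is a diagnostic about the pattern, not a sign that this particular sketch reads an indeterminate value.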


[1] Unless x has static storage duration, for example a global variable. In that case it is zero-initialized before execution, according to section 3.6.2 Initialization of non-local variables:

Variables with static storage duration (3.7.1) or thread storage duration (3.7.2) shall be zero-initialized (8.5) before any other initialization takes place.

In this case of static storage duration it is not possible to run into lvalue-to-rvalue conversion of an indeterminate value. But zero-initialization does not produce a valid state for every type, so still be careful of that.
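As a rough illustration of the static storage duration case, the sketch below reads the variable inside its own initializer. The names `bump`, `counter` and `read_counter` are invented for this example. Because `counter` sits at namespace scope, it is zero-initialized before its dynamic initializer runs, so the read inside `bump` sees 0.

```cpp
#include <cassert>

// Hypothetical helper: reads the current value of r, then stores r + 1.
bool bump(int& r)
{
    r = r + 1;   // the read sees the zero-initialized value, not an indeterminate one
    return true;
}

// Namespace-scope variable: static storage duration, so it is zero-initialized
// before this (dynamic) initializer is evaluated.
int counter = (bump(counter) ? counter : -1);

// Small accessor so the result can be checked from elsewhere.
int read_counter()
{
    return counter;
}
```

The same initializer on a local int would read an indeterminate value inside `bump`, which is the undefined case described above.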

Ben Voigt
  • *"The relevant part of the Standard"* Has C++1y already been accepted? Otherwise, is it considered a defect that either needs no change from implementers or will be implemented (consistently) in the respective C++11 modes? (see http://www.open-std.org/JTC1/SC22/WG21/docs/cwg_defects.html#616 ) – dyp Jun 05 '14 at 18:53
  • This isn't really true. If T is a non-trivially-initializable type, then this is guaranteed UB, regardless of whether it's read before assignment or not. – Puppy Jun 05 '14 at 18:56
  • @DeadMG: Good point. Actually it's not that rvalue conversion is still UB, but that calling `operator=` on such a type prior to it becoming alive is not allowed. – Ben Voigt Jun 05 '14 at 19:04
  • @DeadMG: Ok, covered the non-primitive case, I think. – Ben Voigt Jun 05 '14 at 19:11
  • @dyp: Does a primitive object remain uninitialized after it has been assigned? I thought that assignment met the requirements for creation of a primitive object (after rework, it's the first quote in my answer). The new wording is considerably better, though. – Ben Voigt Jun 05 '14 at 19:18
  • @BenVoigt note that you can start its lifetime prematurely by calling placement `new` on it; in that case there's only UB if you depend on the side effects of the destructor (3.8p4). – ecatmur Jun 05 '14 at 19:19
  • @ecatmur: Yeah. But for primitive objects, aren't assignment and placement new equivalent? – Ben Voigt Jun 05 '14 at 19:20
  • @BenVoigt yup. In assignment the "value of the expression replaces that of the object referred to by the left operand"; placement new invokes direct-initialization, by which the "initial value of the object being initialized is the (possibly converted) value of the initializer expression". In both cases there is only one object during the entire procedure, but its (possibly initially indeterminate) value is replaced. – ecatmur Jun 05 '14 at 19:33
  • Hm. I may be wrong on that last part; 3.8p4 seems to indicate that the original primitive object's lifetime is terminated by placement new. I guess 3.8p7 licenses us to use the original object's name. – ecatmur Jun 05 '14 at 19:43
1

Although scope plays a role, the real issue is object lifetime and, more precisely, when the lifetime of an object with non-trivial initialization begins.

This is closely related to Can initializing expression use the variable itself? and Is passing a C++ object into its own constructor legal?. My answers to those questions do not neatly answer this one, though, so it does not seem like a duplicate.

The key portion of the draft C++ standard we are concerned with here is section 3.8 [basic.life] which says:

The lifetime of an object is a runtime property of the object. An object is said to have non-trivial initialization if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial default constructor. [ Note: initialization by a trivial copy/move constructor is non-trivial initialization. — end note ] The lifetime of an object of type T begins when:

  • storage with the proper alignment and size for type T is obtained, and
  • if the object has non-trivial initialization, its initialization is complete.

So in this case we satisfy the first bullet, storage has been obtained.

The second bullet is where we find trouble:

  • do we have non-trivial initialization?
  • and if so, is the initialization complete?
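For a given T, the first of these two questions can be approximated at compile time. The sketch below assumes C++11 and uses `std::is_trivially_default_constructible` as a stand-in; that trait is not the standard's exact definition of "non-trivial initialization", only a close proxy for default-initialized objects. `Plain`, `Logged` and `has_trivial_default_init` are hypothetical names.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

struct Plain  { int a; double b; };    // hypothetical: no user-provided constructor
struct Logged { Logged() {} int a; };  // hypothetical: user-provided default constructor

// Hypothetical helper: true when default-initializing T runs no non-trivial constructor.
template <typename T>
bool has_trivial_default_init()
{
    return std::is_trivially_default_constructible<T>::value;
}

// Compile-time checks of the same property.
static_assert(std::is_trivially_default_constructible<int>::value,
              "int is expected to have trivial initialization");
static_assert(std::is_trivially_default_constructible<Plain>::value,
              "Plain is expected to have trivial initialization");
static_assert(!std::is_trivially_default_constructible<Logged>::value,
              "Logged is expected to have non-trivial initialization");
static_assert(!std::is_trivially_default_constructible<std::string>::value,
              "std::string is expected to have non-trivial initialization");
```

Types for which the trait is false are the ones the non-trivial case below applies to.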

Non-trivial initialization case

We can get a basis for our reasoning from defect report 363, which asks:

And if so, what is the semantics of the self-initialization of UDT? For example

 #include <stdio.h>

 struct A {
        A()           { printf("A::A() %p\n",            this);     }
        A(const A& a) { printf("A::A(const A&) %p %p\n", this, &a); }
        ~A()          { printf("A::~A() %p\n",           this);     }
 };

 int main()
 {
  A a=a;
 }

can be compiled and prints:

A::A(const A&) 0253FDD8 0253FDD8
A::~A() 0253FDD8

and the proposed resolution was:

3.8 [basic.life] paragraph 6 indicates that the references here are valid. It's permitted to take the address of a class object before it is fully initialized, and it's permitted to pass it as an argument to a reference parameter as long as the reference can bind directly. [...]

So before the lifetime of an object begins, we are limited in what we can do with it. We can see from the defect report that binding a reference to x is valid as long as it binds directly.

What we can do is covered in section 3.8 (the same section and paragraph the defect report quotes), which says (emphasis mine):

Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. For an object under construction or destruction, see 12.7. Otherwise, such a glvalue refers to allocated storage (3.7.4.2), and using the properties of the glvalue that do not depend on its value is well-defined. The program has undefined behavior if:

  • an lvalue-to-rvalue conversion (4.1) is applied to such a glvalue,

  • the glvalue is used to access a non-static data member or call a non-static member function of the object, or

  • the glvalue is bound to a reference to a virtual base class (8.5.3), or

  • the glvalue is used as the operand of a dynamic_cast (5.2.7) or as the operand of typeid.

In your case we are accessing a non-static member of the object here (the second bullet above):

r = ...;

So if T has non-trivial initialization, then this line invokes undefined behavior, and so would reading from r, which would also be an access; that is covered in defect report 1531.

If x has static storage duration, it will be zero-initialized, but as far as I can tell this does not count as its initialization being complete, since the constructor would be called during dynamic initialization.
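One possible way to sidestep the problem for class types is to construct a complete object first and only then hand it to f(). This is a sketch, not something taken from the question: `make_or_default` and the `ok` parameter are invented names, and std::string is used as an example of a type with non-trivial initialization.

```cpp
#include <cassert>
#include <string>

typedef std::string T;  // assumption: a class type with non-trivial initialization

// Hypothetical stand-in for the question's f(): assigns to r only when ok is true.
bool f(T& r, bool ok)
{
    if (ok)
    {
        r = "assigned";  // calls T::operator= on an object whose lifetime has begun
        return true;
    }
    return false;
}

// Hypothetical helper: builds a fully constructed T, then lets f() assign to it.
T make_or_default(bool ok)
{
    T tmp = T();         // tmp's initialization is complete before f() sees it
    if (!f(tmp, ok))
        tmp = T();       // fall back to a value-initialized T
    return tmp;
}
```

With this shape, the declaration becomes `T x = make_or_default(...);` and f() never touches x before its lifetime begins, at the cost of one extra default construction.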

Trivial Initialization case

If T has trivial initialization, then the lifetime begins once storage is obtained, and writing to r is well-defined behavior. Note, though, that reading r before it has been initialized will invoke undefined behavior, since it would produce an indeterminate value. If x has static storage duration, then it is zero-initialized and we don't have this issue.

As for whether it should compile: in either case, whether or not you are invoking undefined behavior, the code is allowed to compile. The compiler is not obligated to produce a diagnostic for undefined behavior, although it may. It is only obligated to produce a diagnostic for ill-formed code, which none of the troublesome cases here are.

Shafik Yaghmour