18

Consider the following code:

#include <iostream>

struct Data
{
    int x, y;
};

Data fill(Data& data)
{
    data.x=3;
    data.y=6;
    return data;
}

int main()
{
    Data d=fill(d);
    std::cout << "x=" << d.x << ", y=" << d.y << "\n";
}

Here d is copy-initialized from the return value of fill(), but fill() writes to d itself before returning its result. What I'm concerned about is that d is non-trivially used before being initialized, and use of uninitialized variables in some(all?) cases leads to undefined behavior.

So is this code valid, or does it have undefined behavior? If it's valid, will the behavior become undefined once Data stops being POD or in some other case?

Shafik Yaghmour
  • 154,301
  • 39
  • 440
  • 740
Ruslan
  • 18,162
  • 8
  • 67
  • 136
  • 7
    This is uggly, who do you do this ?` – Jabberwocky Nov 11 '15 at 11:20
  • 2
    Why not just give Data a proper constructor? Also, see here: http://stackoverflow.com/questions/4782757/rule-of-three-becomes-rule-of-five-with-c11 – George Nov 11 '15 at 11:27
  • 2
    Related question: [Is passing a C++ object into its own constructor legal?](http://stackoverflow.com/q/32608458/1708801) this is not a constructor but the arguments there are all similar for this case as well. – Shafik Yaghmour Nov 11 '15 at 12:31
  • 2
    @MichaelWalz I agree this is ugly, but it's a much simplified version of a thing I only recently understood how to implement properly. So now it's just a matter of curiosity whether it was valid. – Ruslan Nov 11 '15 at 13:04
  • It is a shame this question did not get more attention, it is an interesting corner of the standard and the answer is not obvious. While I am pretty sure my answer is correct, it could use more eyes. – Shafik Yaghmour Nov 16 '15 at 13:02
  • 2
    @ShafikYaghmour that's an exact phrase among those in bounty reasons "this question didn't receive enough attention" or something like this ;) You can consider offering a bounty if you really feel so. – Ruslan Nov 16 '15 at 13:05
  • @Ruslan indeed, I have thought about this is several cases but I always felt weird offering a bounty on a question I have answered. The math is also discouraging since I have an answer the minimum is `100`. If I am correct, I may break even but the bounty is itself is wasted since I can not award it to myself. It makes sense this is prevent gaming but is also discourages scenarios like this where I would probably do it if it was only `50`. – Shafik Yaghmour Nov 16 '15 at 14:39
  • I received a 100 point bounty today, so I added on here. I doubt I will get it back but it would be good to get more eyes on this one, I would like to know what the correct answer is. – Shafik Yaghmour Dec 08 '15 at 21:38
  • The specific code seems legit since `Data` is a trivial type and thus its lifetime begins when storage has been allocated. It is then initialized from a temporary. The fact that this temporary has been copied from the same storage location shouldn't matter. – Kerrek SB Dec 13 '15 at 23:30
  • Well, the bounty worked out well and I received some pretty good comments from T.C. and M.M and so I am pretty confident my answer is correct now. The bounty will go to waste but I can live with that. – Shafik Yaghmour Dec 16 '15 at 13:11

3 Answers3

13

This does not seem like valid code. It is similar to the case outlined in the question: Is passing a C++ object into its own constructor legal?, although in that case the code was valid. The mechanics are not identical but the base reasoning can at least get us started.

We start with defect report 363 which asks:

And if so, what is the semantics of the self-initialization of UDT? For example

 #include <stdio.h>

 struct A {
        A()           { printf("A::A() %p\n",            this);     }
        A(const A& a) { printf("A::A(const A&) %p %p\n", this, &a); }
        ~A()          { printf("A::~A() %p\n",           this);     }
 };

 int main()
 {
  A a=a;
 }

can be compiled and prints:

A::A(const A&) 0253FDD8 0253FDD8
A::~A() 0253FDD8

and the proposed resolution was:

3.8 [basic.life] paragraph 6 indicates that the references here are valid. It's permitted to take the address of a class object before it is fully initialized, and it's permitted to pass it as an argument to a reference parameter as long as the reference can bind directly. [...]

So although d is not fully initialized we can pass it as a reference.

Where we start to get into trouble is here:

data.x=3;

The draft C++ standard section 3.8(The same section and paragraph the defect report quotes) says (emphasis mine):

Similarly, before the lifetime of an object has started but after the storage which the object will occupy has been allocated or, after the lifetime of an object has ended and before the storage which the object occupied is reused or released, any glvalue that refers to the original object may be used but only in limited ways. For an object under construction or destruction, see 12.7. Otherwise, such a glvalue refers to allocated storage (3.7.4.2), and using the properties of the glvalue that do not depend on its value is well-defined. The program has undefined behavior if:

  • an lvalue-to-rvalue conversion (4.1) is applied to such a glvalue,

  • the glvalue is used to access a non-static data member or call a non-static member function of the object, or

  • the glvalue is bound to a reference to a virtual base class (8.5.3), or

  • the glvalue is used as the operand of a dynamic_cast (5.2.7) or as the operand of typeid.

So what does access mean? That was clarified with defect report 1531 which defines access as:

access

to read or modify the value of an object

So fill accesses a non-static data member and hence we have undefined behavior.

This also agrees with section 12.7 which says:

[...]To form a pointer to (or access the value of) a direct non-static member of an object obj, the construction of obj shall have started and its destruction shall not have completed, otherwise the computation of the pointer value (or accessing the member value) results in undefined behavior.

Since you are using a copy anyway you might as well create an instance of Data inside of fill and initialize that. The you avoid having to pass d.

As pointed out by T.C. it is important to explicitly quote the details on when lifetime starts. From section 3.8:

The lifetime of an object is a runtime property of the object. An object is said to have non-trivial initialization if it is of a class or aggregate type and it or one of its members is initialized by a constructor other than a trivial default constructor. [ Note: initialization by a trivial copy/move constructor is non-trivial initialization. — end note ] The lifetime of an object of type T begins when:

  • storage with the proper alignment and size for type T is obtained, and

  • if the object has non-trivial initialization, its initialization is complete.

The initialization is non-trivial since we are initializing via the copy constructor.

Community
  • 1
  • 1
Shafik Yaghmour
  • 154,301
  • 39
  • 440
  • 740
  • This is missing a quote from 3.8/1 on when `d`'s lifetime starts. – T.C. Dec 08 '15 at 22:02
  • @T.C. Fair point, I kind of left it implied by reference to the defect report but I should make it explicit. I won't be able to edit it in for a while though. – Shafik Yaghmour Dec 08 '15 at 22:28
  • @M.M see T.C.'s comment on deleted answer below. – Shafik Yaghmour Dec 09 '15 at 04:35
  • @ShafikYaghmour If we say that this is non-trivial initialization because we are calling a copy-constructor, then perhaps "For an object under construction or destruction, see 12.7. " should actually apply, since `d` is under copy-construction? – M.M Dec 09 '15 at 04:37
  • 1
    @M.M it is unclear to me, I think they both apply. The defect report with a similar situation cites `3.8` but `3.8` cites `12.7` but it seems like they are saying the same thing in a different way. It seems like some fixing is needed in the wording. – Shafik Yaghmour Dec 09 '15 at 04:41
  • 2
    @M.M As I see it, the object isn't under construction until its constructor begins execution. The access here happens before that, while evaluating arguments to be passed to the constructor. – T.C. Dec 09 '15 at 13:34
  • 2
    I concur with your conclusion. I would also point out that if x and y were of some non-POD type with an assignment operator that depended on the existing value, then calling that assignment operator (as you do in fill) is bound to fail (because x and y haven't been constructed yet). Given it can't work with complex classes, I'd feel very nervous about using it with POD classes. (Even if the standard *were* to allow it, this sort of edge case is where compiler bugs are likely to lurk.) – Martin Bonner supports Monica Dec 13 '15 at 11:40
2

I don't see a problem. Accessing the uninitialized integer members is valid, because you're accessing for the purpose of writing. Reading them would cause UB.

MSalters
  • 173,980
  • 10
  • 155
  • 350
  • 2
    Could you comment on why Shafik's answer is wrong, as it seems to contradict your answer? – Ruslan Dec 15 '15 at 18:04
  • @Ruslan: Omitting Standardese, `this->x = 3` would be legal in a constructor of `Data`, even if `x` is not initialized by the initializer list. This hinges on `x` not having a (default) constructor, as it's a plain `int`, because for members with a constructor will have been initialized by that point. Fundamentally, "lifetime" is a fuzzy problem for POD's. `*(int*) malloc(sizeof(int)) = 5` has to be valid for C compatibility, which shows that storage is sufficient. This mirrors Shaik's last quote from 3.8. `Data::x` has the right storage and trivial initialization, so the write is OK. – MSalters Dec 16 '15 at 13:41
-7

I think it is valid ( crazy, but valid ).

This would be both legal and logically acceptable :

Data d ;

d = fill( d ) ;

and the fact is that this form is the same :

Data d = fill( d ) ;

As far as the logical structure of the language is concerned those two versions are equivalent.

So it's legal and logically correct for the language.

However, as we normally expect people to initialize variables to a default when we created them ( for safety ), it is bad programming practice.

It is interesting that g++ -Wall compiles this code without a blurp.

  • If you're going to down vote, at least have the common courtesy to say *why* in a comment. People can't mind read. – StephenG - Help Ukraine Dec 09 '15 at 15:47
  • 2
    Those forms are not the same, and they are not equivalent "as far as the logical structure of the language is concerned". One is initialization and one is assignment. Also, an answer to a language-lawyer question should include references to the language standard to support the argument presented. – M.M Dec 09 '15 at 21:22
  • I think that this is not an initialization. This operation must be resolved as a creation and assignment, because the assigned value cannot be constructed without a runtime call to a function and only *after* the variable is created. It thus *must* be viewed as a creation of an uninitialized variable prior to the assignment of a value to it. This is distinct from initializing with a function call involving only variables that existed *before* the variable is created ( e.g. `Date d=func(c) ;` ). – StephenG - Help Ukraine Dec 09 '15 at 21:28
  • 1
    Well, you're mistaken. – M.M Dec 09 '15 at 22:15
  • Why am I mistaken ? Put another way, why do you think it does not resolve into a creation followed by an assignment. I do not see how it can be otherwise. – StephenG - Help Ukraine Dec 09 '15 at 22:18
  • 1
    Because the C++ standard says that it is initialization. This is really basic stuff. [Here is an intro](http://en.cppreference.com/w/cpp/language/copy_initialization), read the C++ standard section 8.5.4 for a full description. You're out of your depth here. – M.M Dec 09 '15 at 22:51
  • You're not getting it. For the expression on the RHS of this "=" sign ( whatever you want to think of it as ) to be evaluated, we *must* first create the 'd' variable. It has to do that to pass it to the function. If you wish to think of the return being copied into the d variable as an initialization you can, but in fact the variable d *had* to be initialized ( even it it was with garbage ) *before* that functions result is produced. It is logically impossible to resolve this operation without doing that . Object must be created and initialized before passing to a function. – StephenG - Help Ukraine Dec 10 '15 at 00:31
  • 1
    "in fact the variable d had to be initialized ( even it it was with garbage ) before that functions result is produced." - no it didn't. That's the whole point of this question, `fill(d)` uses `d` before its initialization even starts. See Shafik's answer for supporting evidence. – M.M Dec 10 '15 at 01:31
  • 1
    " . Object must be created and initialized before passing to a function." - in fact you can pass an object by reference before it was initialized (see the question linked in the first paragraph of Shafik's answer) and do certain things; but writing to `data.x` is not one of the things you're allowed to do at that time. – M.M Dec 10 '15 at 01:32
  • You seem to be making up your own ideas of what "initialized", "created" etc. mean . In fact we have a standards document that defines all of these things, you'd do well to read it (especially before answering language-lawyer questions). I'm done here, the comments box isn't the place for a tutorial. – M.M Dec 10 '15 at 01:34
  • 3
    BTW if you're going to say things like "at least have the common courtesy to say why [the downvote]", then you should accept explanations. Probably people don't explain their downvotes because they don't want to get into lame "arguments" like this where you just insist you were right all along and don't listen. – M.M Dec 10 '15 at 01:37