Extending temporary's lifetime through rvalue data-member works with aggregate, but not with constructor, why?

Question

I've found that the following scheme for extending a temporary's lifetime works. I don't know if it should, but it does.

#include <iostream>
#include <vector>

struct S {
    std::vector<int>&& vec;
};

int main() {
    S s1{std::vector<int>(5)};      // construct with temporary
    std::cout << s1.vec[0] << '\n'; // fine, temporary is alive
}

However, when S is given a user-provided constructor it is no longer an aggregate, and this scheme fails with an invalid read on s1.vec[0]:

struct S {
    std::vector<int>&& vec;
    S(std::vector<int>&& v)
        : vec{std::move(v)}         // bind to the temporary provided
    { }
};

int main() {
    S s1{std::vector<int>(5)};      // construct with temporary
    std::cout << s1.vec[0] << '\n'; // not OK: invalid read of freed memory
}

Why is this valid with an aggregate? I'm thinking it has to do with the constructor being an actual function call, based on what I've read with const lvalue refs. Additionally, is there any way to make the latter case work?

There are a great many questions on SO dealing with a similar situation using lvalue references. I see that if I had used a const lvalue ref it wouldn't help to extend the lifetime of the temporary; are the rules for rvalue refs the same?

"`const lvalue refs` can't extend the lifetime of temporaries" - huh? — Yakk - Adam Nevraumont, May 27 '14 at 14:31
This looks like an abuse of C++ (It works does not mean it is right, it might be undefined behavior) — , May 27 '14 at 14:34
But the temporary is not alive, therefore this is UB. In my understanding it dies after `;` where is it created. Or am I missing something? — BЈовић, May 27 '14 at 14:44
The rules for lifetime extension of temporaries have to do with binding the temporary to a reference, they don't discriminate between lvalue and rvalue references. — Casey, May 27 '14 at 15:01
The claim in the first sentence is dubious and the test provided is too simplistic. — R. Martinho Fernandes, May 27 '14 at 15:03
If I remember rightly, there are interesting clauses in the standard that make this case distinct from other cases. Like "reference binding in a constructor does not extend lifetime". The lack of a constructor here means the reference binding is done directly by the compiler, so ... it might work. I looked into the lifetime extension rules when I was thinking of proposing "lifetime chaining" to get around some needless copies when forwarding temporaries through helper functions. — Yakk - Adam Nevraumont, May 27 '14 at 15:47
@Yakk sorry about that first screw up, I was kinda rushing when I typed the last bit this morning. — Ryan Haining, May 28 '14 at 01:46
@Yakk yes, you are correct in the difference between *aggregate initialization* and initialization through a *user-defined constructor*: see my answer for further information. — Filip Roséen - refp, May 31 '14 at 00:39

Filip Roséen - refp · Accepted Answer · 2014-05-31T10:53:30.077

TL;DR

Aggregate initialization can extend the lifetime of a temporary; a user-defined constructor cannot, since it is effectively a function call.

Note: Both T const& and T&& members extend the lifetime of a temporary bound to them through aggregate initialization.

What is an Aggregate?

struct S {                // (1)
  std::vector<int>&& vec;
};

To answer this question we will have to dive into the difference between initialization of an aggregate and initialization of a class type, but first we must establish what an aggregate is:

8.5.1p1 Aggregates [dcl.init.aggr]

An aggregate is an array or a class (Clause 9) with no user-provided constructors (12.1), no private or protected non-static data members (Clause 11), no base classes (Clause 10), and no virtual functions (10.3)

Note: The above means that (1) is an aggregate.

How are Aggregates initialized?

Initialization of an aggregate differs greatly from initialization of a "non-aggregate". Here is another section straight from the Standard:

8.5.1p2 Aggregates [dcl.init.aggr]

When an aggregate is initialized by an initializer list, as specified in 8.5.4, the elements of the initializer list are taken as initializers for the members of the aggregate, in increasing subscript or member order. Each member is copy-initialized from the corresponding initializer-clause.

The above quotation states that the members of our aggregate are initialized directly from the initializer-clauses in the initializer list; there is no step in between.

struct A { std::string a; int b; };

A x { std::string {"abc"}, 2 };

Semantically, the above is equivalent to initializing our members as shown below, except that A::a and A::b are in this case only accessible through x.a and x.b.

std::string A::a { std::string {"abc"} };
int         A::b { 2 };

If we change the type of A::a to an rvalue reference, or a const lvalue reference, we will directly bind the temporary used for initialization to x.a.

The rules for rvalue references and const lvalue references say that the temporary's lifetime is extended to that of the reference. The reference here lives as long as its host object x, so the temporary does too, which is exactly what happens.
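The effect can be made visible with a small sketch. The names Noisy, Agg, events and aggregate_demo are hypothetical and exist only for illustration; the event log records when the temporary is constructed and destroyed. On a conforming compiler the destructor should run only when the aggregate goes out of scope.

```cpp
#include <string>
#include <vector>

// Hypothetical event log, used only to make the order of events observable.
std::vector<std::string> events;

// Hypothetical type that records its own construction and destruction.
struct Noisy {
    Noisy()  { events.push_back("ctor"); }
    ~Noisy() { events.push_back("dtor"); }
};

// An aggregate: no user-provided constructor, one public reference member.
struct Agg {
    Noisy&& ref;
};

std::vector<std::string> aggregate_demo() {
    events.clear();
    {
        Agg a{Noisy{}};             // the temporary is bound directly to a.ref
        events.push_back("use");    // the temporary is still alive here
        (void)a;                    // silence unused-variable warnings
    }                               // a.ref ends here, and so does the temporary
    events.push_back("after scope");
    return events;
}
```

Calling aggregate_demo() from main and printing the returned strings shows "dtor" appearing after "use", not before it.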

How does initialization using a user-declared constructor differ?

struct S {                    // (2)
    std::vector<int>&& vec;
    S(std::vector<int>&& v)
        : vec{std::move(v)}   // bind to the temporary provided
    { }
};

A constructor is really nothing more than a fancy function, used to initialize a class instance. The same rules that apply to functions, apply to them.

When it comes to extending the life-time of temporaries there is no difference.

std::string&& func (std::string&& ref) {
  return std::move (ref);
}

A temporary passed to func will not have its lifetime extended just because the parameter is declared as an rvalue or lvalue reference. Even if we return the "same" reference so that it's available outside of func, it just won't happen.

This is what happens in the constructor of (2), after all a constructor is just a "fancy function" used to initialize an object.
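A minimal sketch of the constructor case, using the same kind of logging type as a stand-in for std::vector<int>. Noisy, WithCtor, events and constructor_demo are hypothetical names chosen for illustration. The dangling reference is deliberately never read, so the sketch itself stays well-defined.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical event log, used only to make the order of events observable.
std::vector<std::string> events;

// Hypothetical type that records its own construction and destruction.
struct Noisy {
    Noisy()  { events.push_back("ctor"); }
    ~Noisy() { events.push_back("dtor"); }
};

// Not an aggregate: the reference member is bound inside a constructor.
struct WithCtor {
    Noisy&& ref;
    WithCtor(Noisy&& n) : ref(std::move(n)) {}
};

std::vector<std::string> constructor_demo() {
    events.clear();
    {
        WithCtor w{Noisy{}};        // the temporary is bound to the parameter n
                                    // and dies at the end of this full-expression
        events.push_back("use");    // w.ref is already dangling; do not read it
        (void)w;                    // silence unused-variable warnings
    }
    return events;
}
```

Here "dtor" is logged before "use": the temporary only lives until the end of the full-expression containing the constructor call, exactly as it would for an ordinary function like func.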

12.2p5 Temporary objects [class.temporary]

The temporary to which the reference is bound or the temporary that is the complete object of a subobject to which the reference is bound persists for the lifetime of the reference except:

A temporary bound to a reference member in a constructor's ctor-initializer (12.6.2) persists until the constructor exits.

A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full-expression containing the call.

The lifetime of a temporary bound to the returned value in a function return statement (6.6.3) is not extended; the temporary is destroyed at the end of the full-expression in the return statement.

A temporary bound to a reference in a new-initializer (5.3.4) persists until the completion of the full-expression containing the new-initializer.

Note: Aggregate initialization through new T { ... } falls under the last exception above, so it does not extend the temporary's lifetime the way the aggregate initialization described earlier does.
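The new-initializer exception can be sketched the same way. Noisy, Agg, events and new_demo are hypothetical names used only for illustration, and the dangling reference is never read.

```cpp
#include <string>
#include <vector>

// Hypothetical event log, used only to make the order of events observable.
std::vector<std::string> events;

// Hypothetical type that records its own construction and destruction.
struct Noisy {
    Noisy()  { events.push_back("ctor"); }
    ~Noisy() { events.push_back("dtor"); }
};

// The same kind of aggregate as before, this time allocated with new.
struct Agg {
    Noisy&& ref;
};

std::vector<std::string> new_demo() {
    events.clear();
    Agg* p = new Agg{Noisy{}};  // the temporary persists only until the end of
                                // the full-expression containing the new-initializer
    events.push_back("use");    // p->ref is already dangling; do not read it
    delete p;                   // frees the Agg; the temporary is long gone
    return events;
}
```

Even though Agg is an aggregate, "dtor" is logged before "use" here, which matches the wording of the last exception quoted from 12.2p5.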

edited May 31 '14 at 10:53

answered May 31 '14 at 00:30


this all makes sense thank you. If I'm reading your last note correctly, if you use aggregate initialization through new, the lifetime of the temporaries doesn't get extended (I assume because the object is allocated nondeterministically) is that right? – Ryan Haining May 31 '14 at 00:48
@RyanHaining that is correct. I'm considering answering why that is in a more elaborate *Q&A*. Is that of interest? – Filip Roséen - refp May 31 '14 at 00:50
I'm thinking it'd have to do with the compiler not knowing when a temporary's lifetime is over. If one were to have a non-constexpr number of `new`s with the same rvalue ref, the lifetime of that rvalue is now extended non-deterministically. So that would imply 2 options: one is to move the rvalue ref into the `new`'d object, but then the addresses of the rvalues in the two locations would differ, not to mention that's a move constructor the compiler isn't necessarily allowed to insert. The other would be to leave it on the stack, but the `new`'d object would likely outlive the temporary. – Ryan Haining May 31 '14 at 00:57
but, I could be way off on this and if you were to elaborate then yeah I'd be interested. – Ryan Haining May 31 '14 at 00:58
@RyanHaining: I'd say, it really is a feature & limitation at once of the parser(!) of C++. So, this is a purely semantical issue. In the initialization (no constructor) case, we just have a "full expression" and according rules (see http://stackoverflow.com/questions/4214153/lifetime-of-temporaries). In the constructor case we have a different set of rules. So, we have two completely different cases for the compiler, each being correct on its own, but each doing different things. – Frunsi May 31 '14 at 01:18
@Frunsi I'm talking about new vs non-new allocation here. – Ryan Haining May 31 '14 at 01:19
@RyanHaining: Alright. I think this is a real design flaw of C++. The whole concept of extended lifetimes of objects works fine for automatic memory, but not so well for heap allocation. Just my 2 cents... – Frunsi May 31 '14 at 01:30
@Frunsi -- Well, temporaries are really unnamed variables on the stack. If we have a temporary directly bound to a reference, we can extend its lifetime to be the same as the reference since we know they must live in the same scope. If we allowed a temporary's lifetime to be extended in a `new` expression, the destruction of the temporary is no longer deterministic. But the temporary lives on the stack and it might be gone by the time we want to access it. – mpark May 31 '14 at 05:22
@mpark -- exactly. But because of all that I would call the "extended lifetimes" thing a design flaw. It is not a bug, it is defined how the compiler will (or should) deal with the different cases, and I do not know better, it is like it is. But it is hard to grasp for beginners. Understanding such cases requires a deep knowledge of the whole language. – Frunsi May 31 '14 at 09:30
@Frunsi -- I agree that C++ in general is too complicated and perhaps getting worse. I imagine the feature came in even before return value optimization was introduced never mind move semantics. So I can see how it might have been seen as a good idea at the time to save copies. Andrei Alexandrescu also used the feature when developing ScopeGuard to save virtual destructor calls which was a neat trick. Anyway, just a few thoughts on why it may have been a good idea at the time that we can't fix due to backwards compatibility. – mpark May 31 '14 at 09:54
@mpark Here's a [Q&A](http://stackoverflow.com/q/23970565/1090079) explaining the issue in detail. – Filip Roséen - refp May 31 '14 at 13:47
@Frunsi Here's a [Q&A](http://stackoverflow.com/q/23970565/1090079) explaining the issue in detail. – Filip Roséen - refp May 31 '14 at 13:48
`Even if we return the "same" reference so that it's available outside of func, it just won't happen.` Aha! I've been trying to determine why it's considered "a very bad thing to do" to return rvalue references (specifically, an rvalue reference that was received as a parameter). This makes sense. The temporary is automatically bound to the parameter, but that reference itself is destroyed at the end of the function even if you return *another* reference to the same object. That's why the returned reference would be left dangling! – monkey0506 Sep 29 '15 at 22:37
This is just like how `A&& rref=static_cast<A&&>(A());` is different from `A&& rref=std::move(A());`. The former extends the lifetime of the temporary, while the second doesn't, due to the extra function call. Even though what `std::move` does is just `static_cast`ing to `A&&`. – Weijun Zhou Oct 12 '21 at 07:47
