What is the rationale for extending the lifetime of temporaries?

Question

In C++, the lifetime of a temporary value can be extended by binding it to a reference:

Foo make_foo();

{
    Foo const & r1 = make_foo();
    Foo && r2 = make_foo();

    // ...
}             // both objects are destroyed here

Why is this allowed? What problem does this solve?

I couldn't find an explanation for this in Design and Evolution (e.g. 6.3.2: Lifetime of Temporaries). Nor could I find any previous questions about this (this one came closest).

This feature is somewhat unintuitive and has subtle failure modes. For example:

Foo const & id(Foo const & x) { return x; }  // looks like a fine function...

Foo const & r3 = id(make_foo());             // ... but causes a terrible error!

Why is something that can be so easily and silently abused part of the language?

Update: the point may be subtle enough to warrant some clarification: I do not dispute the use of the rule that "references bind to temporaries". That is all fine and well, and allows us to use implicit conversions when binding to references. What I am asking about is why the lifetime of the temporary is affected. To play the devil's advocate, I could claim that the existing rules of "lifetime until end of full expression" already cover the common use cases of calling functions with temporary arguments.
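
To make the distinction concrete, here is a minimal sketch of the two lifetimes involved. The Tracer type and the function names are hypothetical and exist only for this illustration. It records when each temporary is destroyed: one passed as a function argument, and one bound to a local reference.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical tracing type: records its construction and destruction.
struct Tracer {
    std::vector<std::string>* events;
    std::string name;
    Tracer(std::vector<std::string>& e, std::string n)
        : events(&e), name(std::move(n)) {
        events->push_back("ctor " + name);
    }
    ~Tracer() { events->push_back("dtor " + name); }
};

void use(Tracer const& t) { t.events->push_back("use " + t.name); }

std::vector<std::string> trace_lifetimes() {
    std::vector<std::string> events;
    {
        use(Tracer(events, "arg"));      // destroyed at the end of this full expression
        events.push_back("after call");
        Tracer const& r = Tracer(events, "local");  // lifetime extended to the end of the block
        events.push_back("after binding");
        use(r);
    }
    events.push_back("after scope");
    return events;
}
```

Calling trace_lifetimes() should record "dtor arg" before "after call", but "dtor local" only just before "after scope". The first behaviour is the ordinary full-expression rule; the second is the extension I am asking about.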

I assume if you had a function that returns `std::string` and one that returns `std::string const&`, you don't have to care which you're getting back; you can just write `std::string const& x = foo()`. — Simple, Dec 12 '13 at 15:09
Note: a very interesting example: `long const& l = std::min(0, 1);` is unsafe because there is an implicit conversion from `int` to `long const&` and `min` then returns this reference. *sigh* – Matthieu M., Dec 12 '13 at 15:41
@Simple You don't have to care either way: you just write `std::string x = foo();`. — James Kanze, Dec 12 '13 at 16:25
@JamesKanze which makes a copy when `std::string const&` is returned. — Simple, Dec 12 '13 at 16:25
@Simple Exactly. It doesn't extend the lifetime if a reference is returned. Which means that it isn't safe. (Obviously, if the function returns a reference because its semantics require it, and you need the reference to the original, and not a copy, you use a reference. But otherwise: the function should return by value, and you should capture by value.) – James Kanze, Dec 12 '13 at 16:30
@JamesKanze I don't know what you're getting at. My point was you can always write `std::string const& x = foo()` and not care whether you're getting a reference to some internal string or getting a temporary; if you're not going to modify the string then this is a bit of an optimisation technique. But I'm done in this topic now. – Simple, Dec 12 '13 at 16:34
@Simple: it matters if you have `std::string const& identity(std::string const& s) { return s; }` and use it as `std::string const& x = identity("Hello, World!");` because then `x` references a temporary that is destroyed right after `x` was initialized to the reference... — Matthieu M., Dec 12 '13 at 16:37
@MatthieuM. so do we disallow local variables of type `std::string const&` and only allow it in a return type? This isn't a binding-to-temporary problem. — Simple, Dec 12 '13 at 16:39
@Simple: well, whether you consider it a binding-to-temporary or a more general dangling-reference, it still is *undefined behavior*; and the fact is that it involves a temporary variable so in all the discussions I have had about it (notably with Argyrios, one of Clang core developers) it was lumped in the binding-to-temporary category... and no-one figured how to detect it :( — Matthieu M., Dec 12 '13 at 16:45
@Simple: The error in that train of thought is *you can always [...] and not care whether you are getting a reference [...] or a temporary*. This is one of the things you want to *care* about. If it is a reference it can change as you use the object; if it is a value you get your own copy, which is semantically very different. If you think that you can "not care", C++ is probably not your language, and I particularly would consider that *sloppy*. Personally I reject binding const references to temporaries in code reviews (twice this year) as it *hides* what is going on and does not provide any benefit – David Rodríguez - dribeas, Dec 12 '13 at 16:59
@DavidRodríguez-dribeas That corresponds to our policy as well. We had a couple of programmers who did this a lot; we're only gradually getting rid of it. (In addition to all of the other potential problems, it actually results in slightly slower code, at least in some of the builds, because of the extra indirection.) — James Kanze, Dec 12 '13 at 17:22
It seems that copy elision can *always* be used in the last step of copying the return value into a local variable, so I find it really hard to see why you could not always, unconditionally write `T x = f();` rather than `T const & x = f();`. I guess copy elision allows too much wild behaviour so that a guaranteed way to not have that last copy may be desirable, but I don't find that convincing. – Kerrek SB, Dec 12 '13 at 17:24
@KerrekSB If `f` returns an `std::string`, you'll find that `T const& x = f();` actually generates worse code than `T x = f();`, at least with VS. — James Kanze, Dec 12 '13 at 17:30
@KerrekSB: The difference is when the reference is not bound directly to the complete object, which is the case in `ScopeGuard` for example where you might not even know the name of the real type being returned. Then again, in C++11 you would use `auto` there — David Rodríguez - dribeas, Dec 12 '13 at 17:35
@DavidRodríguez-dribeas: Yeah, that's a sort of interesting use case, but then again, returning by value requires complete types already. It's a point, I'm just not sure how big a point it is... — Kerrek SB, Dec 12 '13 at 18:05
@KerrekSB: Don't mix the completeness of the type with the ability of the programmer to type it. To provide an example from the standard, the result of `std::bind` is a complete type, but you don't know the name, `auto` solves that in C++11, but that was not available in C++98. That is also the case in the `ScopeGuard`, where a tag type is created so that the caller can pin down the temporary of a type whose name is not known. — David Rodríguez - dribeas, Dec 12 '13 at 18:11
That being said, I have never encountered another use of that... `ScopeGuard` is special in that the only *interface* of the object is really that it will be destroyed when the scope exits. If you need to do something more special, then using this approach would require dynamic polymorphism (potentially optimized out) — David Rodríguez - dribeas, Dec 12 '13 at 18:13
@DavidRodríguez-dribeas Conditional polymorphism is the one case where the prolongation of lifetime would make sense. But try as I might, I can't come up with an initialization expression where it would be relevant. Something like `Base const& obj = cond ? Base() : Derived();` maybe. But as soon as it becomes `cond ? Derived1() : Derived2()` (which will almost always be the case), you need a `static_cast` somewhere, and initializing a reference with the result of `static_cast` does _not_ prolong the lifetime of the object of the cast. — James Kanze, Dec 13 '13 at 09:11
@JamesKanze: (+1, I was surprised by that the other day -- two derived types don't have a common type. Makes sense, though.) With non-abstract leaf classes only, you'll never get to take advantage of that... — Kerrek SB, Dec 13 '13 at 09:14
@KerrekSB The rationale is simple: what do you do in cases where many conversions are possible: `unsigned` and `long`, for example, or even two derived classes which have multiple common bases? (One could, of course, come up with a set of rules, but the rules aren't exactly simple at present.) – James Kanze, Dec 13 '13 at 09:26

David Rodríguez - dribeas · Answer 1 · 2013-12-12T17:34:05.557

15

The simple answer is that you need to be able to bind a temporary to a const reference. Without that feature you would need a good amount of code duplication, with functions taking const& for lvalue arguments and by-value counterparts for rvalue arguments. Once you need that, the language has to define semantics that guarantee the lifetime of the temporary is at least as long as that of the reference.

Once you accept that a reference can bind to an rvalue in one context, just for consistency you may want to extend the rule to allow the same binding in other contexts, and the semantics are really the same. The temporary lifetime is extended until the reference goes away (be it a function parameter, or a local variable).

The alternative would be rules that allow binding in some contexts (function call) but not all (local reference) or rules that allow both and always create a dangling reference in the latter case.
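
As a rough illustration of the consistency argument, the sketch below uses a single const& parameter to accept an lvalue, a temporary, and a temporary created by implicit conversion, and then performs the same binding for a local reference. The names length_of and demo_binding are made up for this example.

```cpp
#include <cstddef>
#include <string>

// One overload serves lvalues and temporaries alike.
std::size_t length_of(std::string const& s) { return s.size(); }

std::size_t demo_binding() {
    std::string named = "lvalue";
    std::size_t total = 0;
    total += length_of(named);        // binds to an lvalue
    total += length_of(named + "!");  // binds to a temporary std::string
    total += length_of("literal");    // binds to a temporary created by conversion

    // The same binding as a local reference: the temporary now lives
    // until the end of this function instead of the end of the statement.
    std::string const& local = named + "?";
    total += length_of(local);
    return total;
}
```

Without the binding rule, length_of would need a second by-value overload; without the extension rule, the last two lines would be either ill-formed or a dangling reference.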

Removed the quote from the answer, left here so that comments would still make sense:

~~If you look at the wording in the standard there are some hints as of this intended usage:~~

12.2/5 [middle of the paragraph] [...] A temporary bound to a reference parameter in a function call (5.2.2) persists until the completion of the full expression containing the call. [...]

edited Dec 12 '13 at 17:34

answered Dec 12 '13 at 15:19

David Rodríguez - dribeas

"until the reference goes away"... well, in a function call, that's not a sharp bound, since the temporary lives *longer* than the reference. – Kerrek SB Dec 12 '13 at 15:26
After your edit: "always dangle" seems like the better rule to me. It makes my `r1`, `r2` and `r3` examples all behave the same way. Easy to learn, easier to teach and very easy to diagnose statically. – Kerrek SB Dec 12 '13 at 15:35
@KerrekSB: The edit was initially an answer to one comment you added in a different question that has been deleted – David Rodríguez - dribeas Dec 12 '13 at 16:03
This doesn't seem to address the issue at all. The lifetime of a temporary is until the end of the full expression, _not_ until the reference it is bound to in a function call ends. Initializing a function parameter is one time when binding to a reference does _not_ increase the lifetime of the temporary. – James Kanze Dec 12 '13 at 17:29
@JamesKanze: That particular quote is actually the inverse, it is an exception to the rule that the temporary and the reference have the same lifetime [in this case the lifetime of the temporary is *longer* than that of the reference]. I am getting more and more convinced that I should have avoided this particular quote. – David Rodríguez - dribeas Dec 12 '13 at 17:32
Feel free to delete the text from the answer; you can always flag comments as "obsolete" and have them removed... – Kerrek SB Dec 12 '13 at 20:11

Cheers and hth. - Alf · Answer 2 · 2013-12-14T03:32:27.817

As Bjarne Stroustrup (the original designer) explained it in a clc++ posting in 2005, it was for uniform rules.

The rules for references are simply the most general and uniform I could find. In the cases of arguments and local references, the temporary lives as long as the reference to which it is bound. One obvious use is as a shorthand for a complicated expression in a deeply nested loop. For example:
for (int i = 0; i<xmax; ++i)
    for (int j = 0; j< ymax; ++j) { 
        double& r = a[i][j]; 
        for (int k = 0; k < zmax; ++k) { 
           // do something with a[i][j] and a[i][j][k] 
        }
    } 
This can improve readability as well as run-time performance.

And it turned out to be useful for storing an object of a class derived from the reference type, e.g. as in the original Scopeguard implementation.

In a clc++ posting in 2008, James Kanze supplied some more details:

The standard says exactly when the destructor must be called. Before the standard, however, the ARM (and earlier language specifications) were considerably looser: the destructor could be called anytime after the temporary was "used" and before the next closing brace.

(The “ARM” is the Annotated Reference Manual by (IIRC) Bjarne Stroustrup and Margaret Ellis, which served as a de-facto standard in the last decade before the first ISO standard. Unfortunately my copy is buried in a box, under a lot of other boxes, in the outhouse. So I can't verify, but I believe this is correct.)

Thus, as with much else, the details of lifetime extension were honed and perfected in the standardization process.

Since James has raised this point in comments to this answer: that perfection could not reach back in time to affect Bjarne's rationale for the lifetime extension.

Example of Scopeguard-like code, where the temporary bound to the reference is the full object of derived type, with its derived type destructor executed at the end:

struct Base {};

template< class T >
struct Derived: Base {};

template< class T >
auto foo( T ) -> Derived<T> { return Derived<T>(); }

int main()
{
    Base const& guard = foo( 42 );
}
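
For a version whose effect can be observed, here is a minimal sketch of a ScopeGuard-like helper. The names GuardBase, Guard, make_guard and run_guard_demo are hypothetical, and this is not the original ScopeGuard code. It assumes C++11 and adds a disarming move constructor so that the action runs exactly once even if the compiler does not elide the move out of make_guard.

```cpp
#include <string>
#include <utility>
#include <vector>

// Note: no virtual destructor in the base.
struct GuardBase {};

template <class F>
struct Guard : GuardBase {
    F action;
    bool armed;
    explicit Guard(F f) : action(std::move(f)), armed(true) {}
    // Disarm the source so the action is not run by a moved-from guard.
    Guard(Guard&& other) : action(std::move(other.action)), armed(other.armed) {
        other.armed = false;
    }
    ~Guard() { if (armed) action(); }
};

template <class F>
Guard<F> make_guard(F f) { return Guard<F>(std::move(f)); }

std::vector<std::string> run_guard_demo() {
    std::vector<std::string> events;
    {
        // The caller never names Guard<F>; the complete derived temporary
        // is kept alive and ~Guard<F> runs at the closing brace.
        GuardBase const& guard = make_guard([&events] { events.push_back("cleanup"); });
        (void)guard;  // silence unused-variable warnings
        events.push_back("body");
    }
    events.push_back("after scope");
    return events;
}
```

The recorded order should be "body", "cleanup", "after scope": the derived destructor runs at scope exit even though the reference has the base type and the base has no virtual destructor.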

score 2 · Answer 3 · answered Jul 22 '14 at 18:41

I discovered an interesting application for lifetime extension somewhere here on SO. (I forget where, I'll add a reference when I find it.)

Lifetime extension allows us to use prvalues of immobile types.

For example:

struct Foo
{
    Foo(int, bool, char);
    Foo(Foo &&) = delete;
};

The type Foo can be neither copied nor moved. Yet, we can have a function that returns a prvalue of type Foo:

Foo make_foo()
{
    return {10, false, 'x'};
}

But we cannot construct a local variable initialized with the return value of make_foo (at least not before C++17, whose guaranteed copy elision makes Foo foo = make_foo(); valid), so in general, calling the function creates a temporary object that is immediately destroyed. Lifetime extension allows us to use the temporary object throughout an entire scope:

auto && foo = make_foo();
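
Put together as a self-contained sketch (Pinned, make_pinned and use_pinned are hypothetical names mirroring Foo and make_foo above, assuming C++11 or later):

```cpp
// Hypothetical immobile type, mirroring Foo above.
struct Pinned {
    int id;
    char tag;
    Pinned(int i, char t) : id(i), tag(t) {}
    Pinned(Pinned&&) = delete;  // no move, and therefore no implicit copy either
};

// Copy-list-initialization constructs the return object in place,
// so no copy or move constructor is required here.
Pinned make_pinned() { return {41, 'x'}; }

int use_pinned() {
    auto&& p = make_pinned();  // the temporary lives until the end of this function
    p.id += 1;                 // auto&& deduces Pinned&&, so the object is mutable
    return p.id;
}
```

Because the reference is a non-const rvalue reference, the extended temporary can be modified like an ordinary local variable; use_pinned() returns 42 here.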
