14

For class types it is possible to assign to temporary objects which is actually not allowed for built-in types. Further, the assignment operator generated by default even yields an lvalue:

int() = int();    // illegal: "expression is not assignable"

struct B {};
B& b = B() = B(); // compiles OK: yields an lvalue! ... but is wrong! (see below)

For the last statement the result of the assignment operator is actually used to initialize a non-const reference which will become stale immediately after the statement: the reference isn't bound to the temporary object directly (it can't as temporary objects can only be bound to a const or rvalue references) but to the result of the assignment whose life-time isn't extended.

Another problem is that the lvalue returned from the assignment operator doesn't look as if it can be moved although it actually refers to a temporary. If anything is using the result of the assignment to get hold of the value it will be copied rather than moved although it would be entirely viable to move. At this point it is worth noting that the problem is described in terms of the assignment operator because this operator is typically available for value types and returns an lvalue reference. The same problem exists for any function returning a reference to the objects, i.e., *this.

A potential fix is to overload the assignment operator (or other functions returning a reference to the object) to consider the kind of object, e.g.:

class G {
public:
    // other members
    G& operator=(G) &  { /*...*/ return *this; }
    G  operator=(G) && { /*...*/ return std::move(*this); }
};

The possibility to overload the assignment operators as above has come with C++11 and would prevent the subtle object invalidation noted above and simultaneously allow moving the result of an assignment to a temporary. The implementation of the these two operators is probably identical. Although the implementation is likely to be rather simple (essentially just a swap() of the two objects) it still means extra work raising the question:

Should functions returning a reference to the object (e.g., the assignment operator) observe the rvalueness of the object being assigned to?

An alternatively (mentioned by Simple in a comment) is to not overload the assignment operator but to qualify it explicitly with a & to restrict its use to lvalues:

class GG {
public:
    // other members
    GG& operator=(GG) &  { /*...*/ return *this; }
};
GG g;
g = GG();    // OK
GG() = GG(); // ERROR
Community
  • 1
  • 1
Dietmar Kühl
  • 150,225
  • 13
  • 225
  • 380
  • 2
    I agree with the `&` ref-qualifier but I don't like the `&&` overload. Slightly related, an insight I had is that a member function with a ref-qualifier has the same "semantics" as a non-member function that takes a reference in that `foo(T())` and `T().foo()` are both disallowed whereas without one the first is disallowed but the second is allowed. – Simple Jan 02 '14 at 15:54
  • 3
    @Dietmar: if the correct answer is "yes" then presumably there's an awkward conversation coming up for the committee, in which they decide whether or not to fix compiler-generated copy/move assignments? So there might be two levels to the question: (1) should the operator do it in an ideal world, and (2) should the standard operator be changed to do it. – Steve Jessop Jan 02 '14 at 15:57
  • @Simple: yes, I agree that another option to address the problem is not to overload the assignment operator (or similar function) but explicitly qualify it to take lvalues only (I meant to mention this in the question but forgot). – Dietmar Kühl Jan 02 '14 at 15:58
  • @SteveJessop: it may be something to be raised in the committee but I doubt that there would be a lengthy discussion: it will be one of these cases where we would need to stick with the inferior choices made in the past. C++ already has plenty of these. However, it doesn't mean that shiny new libraries could improve on the current situation. – Dietmar Kühl Jan 02 '14 at 16:00
  • @DyP: yes, I already mentioned that the problem is bigger although not as big as you (correctly) made it. I used the assignment operator as an example because it is well-known to return an lvalue reference. You have a valid point, though: the problem is actually much bigger than I stated and the same fix could apply to the other situations, too. – Dietmar Kühl Jan 02 '14 at 16:12
  • 1
    (reposted from a deleted comment) I think the issue isn't restricted to assignment. *Any* member function that returns an lvalue reference can cause a dangling reference, for example `auto& x = std::vector{1,2,3}.front();`. – dyp Jan 02 '14 at 16:13
  • 2
    Worth noting that you can still use `= default` with a ref-qualifier e.g `widget& operator=(widget const&) & = default`. – Simple Jan 02 '14 at 16:18
  • FWIW the line `B& b = B() = B();` doesn't compile in gcc 4.8.1 (I'm not saying Dietmar is wrong, I belive gcc 4.8.1 is) even if one adds `B& operator =(const B&) = default;`. Funilly enough, it compiles if one adds `B& operator =(const B&) {return *this;}` (which is another indication of a bug: why would the latter be different from the compiler's implementation?) Unfortunately, I can't try now with a more recent version or with clang. VS is not a reference because of an ["extension"](http://tinyurl.com/njsrf22) which allows rvalues to be bound to non-`const` lvalue references. – Cassio Neri Jan 02 '14 at 17:04
  • @CassioNeri: interesting. I just tried with a more recent gcc (gcc version 4.9.0 20131222 (experimental)) and it still doesn't compile. It does compile with clang (3.5 (trunk 198287)). – Dietmar Kühl Jan 02 '14 at 17:14
  • Huh. Given `B& operator =(const B&) = default;`, GCC even rejects an explicit `B& b = B().operator=(B());`, but accepts `B& b = (B().*&B::operator=)(B());`. –  Jan 02 '14 at 17:20
  • @hvd: with an explicit definition of `B& operator=(B const&)`, whether defaulted or not, it seems the code should compile although it doesn't do anything useful. Even using `void f(B&){}; /*...*/ f(B() = B());` fails to compile with gcc when the operator is defaulted and this _may_ do something useful. – Dietmar Kühl Jan 02 '14 at 17:26
  • @DyP we need explicit dependency descriptions for functions/methods in C++. If `T& std::vector::front() dependent` meant "the return value's lifetime is dependent on `this`'s lifetime, or `T& do_stuff(U& u dependent)` meant "the return value of `do_stuff` has a lifetime dependency on `u`", we could (A) get great diagnostics, and (B) actually do lifetime extension in some indirect cases to do away with the need for diagnostics. Then your `auto& x = std::vector{1,2,3}.front();` would either be marked as "dead" and warnings could ensue, or the `vector` lifetime could match `x`s lifetime! – Yakk - Adam Nevraumont Jan 02 '14 at 18:30
  • On gcc rejecting `B& b = B() = B();`: This is [bug 54319](http://gcc.gnu.org/bugzilla/show_bug.cgi?id=54319). – Cassio Neri Jan 02 '14 at 20:11

2 Answers2

5

IMHO, the original suggestion by Dietmar Kühl (providing overloads for & and && ref-qualifiers) is superior than Simple's one (providing it only for &). The original idea is:

class G {
public:
    // other members
    G& operator=(G) &  { /*...*/ return *this; }
    G  operator=(G) && { /*...*/ return std::move(*this); }
};

and Simple has suggested to remove the second overload. Both solutions invalidate this line

G& g = G() = G();

(as wanted) but if the second overload is removed, then these lines also fail to compile:

const G& g1 = G() = G();
G&& g2 = G() = G();

and I see no reason why they shouldn't (there's no lifetime issue as explained in Yakk's post).

I can see only one situation where Simple's suggestion is preferable: when G doesn't have an accessible copy/move constructor. Since most types for which the copy/move assignment operator is accessible also have an accessible copy/move constructor, this situation is quite rare.

Both overloads take the argument by value and there are good reasons for that if G has an accessible copy/move constructor. Suppose for now that G does not have one. In this case the operators should take the argument by const G&.

Unfortunately the second overload (which, as it is, returns by value) should not return a reference (of any type) to *this because the expression to which *this binds to is an rvalue and thus, it's likely to be a temporary whose lifetime is about to expiry. (Recall that forbidding this from happening was one of the OP's motivation.)

In this case, you should remove the second overload (as per Simple's suggestion) otherwise the class doesn't compile (unless the second overload is a template that's never instantiated). Alternatively, we can keep the second overload and define it as deleted. (But why bother since the existence of the overload for & alone is already enough?)

A peripheral point.

What should be the definition of operator = for &&? (We assume again that G has an accessible copy/move constructor.)

As Dietmar Kühl has pointed out and Yakk has explored, the code of the both overloads should be very similar and, in this case, it's better to implement the one for && in terms of the one for &. Since the performance of a move is expected to be no worse than a copy (and since RVO doesn't apply when returning *this) we should return std::move(*this). In summary, a possible one-line definition is:

G operator =(G o) && { return std::move(*this = std::move(o)); }

This is good enough if only G can be assigned to another G or if G has (non-explicit) converting constructors. Otherwise, you should instead consider giving G a (template) forwarding copy/move assignment operator taking an universal reference:

template <typename T>
G operator =(T&& o) && { return std::move(*this = std::forward<T>(o)); }

Although this is not a lot of boiler plate code it's still an annoyance if we have to do that for many classes. To decrease the amount of boiler plate code we can define a macro:

#define ASSIGNMENT_FOR_RVALUE(type) \
    template <typename T> \
    type operator =(T&& b) && { return std::move(*this = std::forward<T>(b)); }

Then inside G's definition one adds ASSIGNMENT_FOR_RVALUE(G).

(Notice that the relevant type appears only as the return type. In C++14 it can be automatically deduced by the compiler and thus, G and type in the last two code snippets can be replaced by auto. It follows that the macro can become an object-like macro instead of a function-like macro.)

Another way of reducing the amount of boiler plate code is defining a CRTP base class that implements operator = for &&:

template <typename Derived>
struct assignment_for_rvalue {

    template <typename T>
    Derived operator =(T&& o) && {
        return std::move(static_cast<Derived&>(*this) = std::forward<T>(o));
    }

};

The boiler plate becomes the inheritance and the using declaration as shown below:

class G : public assignment_for_rvalue<G> {
public:
    // other members, possibly including assignment operator overloads for `&`
    // but taking arguments of different types and/or value category.
    G& operator=(G) & { /*...*/ return *this; }

    using assignment_for_rvalue::operator =;
};

Recall that, for some types and contrarily to using ASSIGNMENT_FOR_RVALUE, inheriting from assignment_for_rvalue might have some unwanted consequences on the class layout.

Community
  • 1
  • 1
Cassio Neri
  • 19,583
  • 7
  • 46
  • 68
  • *"but if the second overload is removed, then these lines also fail to compile: [...] and I see no reason why they shouldn't"* OTOH, what would you want to express with `G&& g2 = G() = G();` etc? I don't see where that could be useful. As a default, I'd think that having no viable `operator=` for rvalues is acceptable; you can still add it in the rare cases where you need one. – dyp Jan 03 '14 at 14:14
  • @DyP I was talking about the safety point of view but I completely agree that `G&& g2 = G() = G();` is weird (and probably there's no code out there with such a line). The assignment operator was used as an example (following the OP) but I wonder if another example closer (but not exaclty the same) to yours `auto& x = std::vector{1,2,3}.front();` (actually, `auto&& x = ...`) could appear somewhere. (Which I still doubt but who knows?) Never mind: I might be spending too much energy solving a problem that in practice doesn't exist. – Cassio Neri Jan 03 '14 at 15:13
3

The first problem is that this is not actually ok in C++03:

B& b = B() = B();

in that b is bound to an expired temporary once the line is finished.

The only "safe" way to use this is in a function call:

void foo(B&);
foo( B()=B() );

or something similar, where the line-lifetime of the temporaries is sufficient for the lifetime of what we bind it to.

We can replace the probably inefficient B()=B() syntax with:

template<typename T>
typename std::decay<T>::type& to_lvalue( T&& t ) { return t; }

and now the call looks clearer:

foo( to_lvalue(B()) );

which does it via pure casting. Lifetime is still not extended (I cannot think of a way to manage that), but we don't construct to objects then pointlessly assign one to the other.

So now we sit down and examine these two options:

G operator=(G o) && { return std::move(o); }
G&&  operator=(G o) && { *this = std::move(o); return std::move(*this); }
G  operator=(G o) && { *this = std::move(o); return std::move(*this); }

which are, as an aside, complete implementations, assuming G& operator=(G o)& exists and is written properly. (Why duplicate code when you don't need to?)

The first and third allows for lifetime extension of the return value, the second uses the lifetime of *this. The second and third modify *this, while the first one does not.

I would claim that the first one is the right answer. Because *this is bound to an rvalue, the caller has stated that it will not be reused, and its state does not matter: changing it is pointless.

The lifetime of first and third means that whomever uses it can extend the lifetime of the returned value, and not be tied to whatever *this's lifetime is.

About the only use the B operator=(B)&& has is that it allows you to treat rvalue and lvalue code relatively uniformly. As a downside, it lets you treat it relatively uniformly in situations where the result may be surprising.

std::forward<T>(t) = std::forward<U>(u);

should probably fail to compile instead of doing something surprising like "not modifying t" when T&& is an rvalue reference. And modifying t when T&& is an rvalue reference is equally wrong.

Yakk - Adam Nevraumont
  • 262,606
  • 27
  • 330
  • 524
  • For your first paragraph, by his use of OK I believe he meant that it compiles, not that he believes it's good code. – Simple Jan 02 '14 at 16:26
  • @Simple It is about as "ok" as `B& b = *(B*)nullptr;` is. Sure, it compiles, but the resulting variable is not usable. – Yakk - Adam Nevraumont Jan 02 '14 at 16:29
  • @Yakk: I indeed meant that it compiles but as explained in the paragraph below I describe that it actually doesn't do the right thing. The comment is confusing and I will clarify it... – Dietmar Kühl Jan 02 '14 at 16:30
  • @Yakk: Your example in comments is even worse than Dietmar's code, though. It's undefined behavior to dereference a null pointer, which means that `B& b = *(B*)nullptr;` has UB. It's not UB to allow a reference to become dangling, it's merely UB to access it once it is. – Steve Jessop Jan 03 '14 at 11:26