Consider these three initializations:
my_type x = func_returning_my_type_byvalue();
my_type & y = func_returning_my_type_byvalue();
my_type && z = func_returning_my_type_byvalue();
The first: x is a local variable initialized from the result of a function call (an rvalue), so the move constructor can be used, or the construction of x can be elided entirely (skipped, with x constructed in place by func_returning_my_type_byvalue when it produces its result). Since C++17 this elision is guaranteed when the initializer is a prvalue of the same type.
Note that x is an lvalue: it has a name and you can take its address. Loosely speaking, you can think of every variable that is not a reference as a reference to itself. In that respect an lvalue is a binding site for assignments to, and reads from, memory with a known storage duration.
The second will not compile: a non-const lvalue reference cannot bind to the temporary result of a function call (a const my_type & could). This syntax is for aliasing an existing lvalue. It is perfectly fine to do this, however:
my_type & y = func_returning_my_type_byreference();
// `y` will never use constructors or destructors
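The binding rules behind these three forms can be checked at compile time. The following is a minimal sketch, assuming an empty stand-in for my_type; the constant names are made up for illustration, and std::is_constructible is used here only as a convenient way to ask "would this initialization compile?".

```cpp
#include <type_traits>

struct my_type {}; // hypothetical stand-in for the type used in the text

// my_type x = f();    an object can be initialized from an rvalue
constexpr bool value_from_rvalue =
    std::is_constructible<my_type, my_type>::value;

// my_type & y = f();  a non-const lvalue reference cannot bind to an rvalue
constexpr bool lvalue_ref_from_rvalue =
    std::is_constructible<my_type&, my_type>::value;

// my_type && z = f(); an rvalue reference can bind to an rvalue
constexpr bool rvalue_ref_from_rvalue =
    std::is_constructible<my_type&&, my_type>::value;

// my_type & y = g();  where g returns my_type &, so the initializer is an lvalue
constexpr bool lvalue_ref_from_lvalue =
    std::is_constructible<my_type&, my_type&>::value;

static_assert(value_from_rvalue, "first form compiles");
static_assert(!lvalue_ref_from_rvalue, "second form is rejected");
static_assert(rvalue_ref_from_rvalue, "third form compiles");
static_assert(lvalue_ref_from_lvalue, "by-reference variant compiles");
```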
This is why the third exists: we need a reference to something we cannot create a reference to using the conventional syntax. Within something like func in the original question, the lifetime of arg is not immediately obvious. For example, we can't do this without an explicit move:
void func( my_type && arg ) {
    my_type && save_arg = arg; // error: the expression `arg` is an lvalue
}
The reason this is not allowed is that arg is first and foremost a reference to an object that lives somewhere else, and the expression arg is an lvalue. If the object arg refers to had a shorter lifetime than save_arg (a true temporary), binding it to save_arg would extend its lifetime, and it would be destroyed when save_arg goes out of scope: save_arg would in effect capture it. That is not the case here. save_arg will disappear first, so it makes no sense to hand it an lvalue that we can potentially still refer to after func returns!
Consider what happens even if you use std::move to force this to compile: the destructor will still not be called within func, because you haven't created a new object, just a new reference, and that reference goes away before the original object itself goes out of scope.
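To make the "new reference, not a new object" point concrete, here is a small sketch. It assumes a stand-in my_type whose destructor bumps a counter; the counter and the helper destroyed_inside_func_for_local are hypothetical names added only for this illustration.

```cpp
#include <utility>

// Hypothetical stand-in for my_type that counts destructor calls.
struct my_type {
    static int destroyed;
    ~my_type() { ++destroyed; }
};
int my_type::destroyed = 0;

// Returns how many destructors ran while the body of func was executing.
int func(my_type&& arg) {
    const int before = my_type::destroyed;
    {
        my_type&& save_arg = std::move(arg); // a second reference; nothing is constructed
        (void)save_arg;
    } // save_arg ends here; the object it referred to is untouched
    return my_type::destroyed - before;
}

// The caller owns a named object and passes it with std::move.
int destroyed_inside_func_for_local() {
    my_type local;
    return func(std::move(local)); // local is destroyed only after func has returned
}
```

In both the named-object case and the temporary case (func(my_type{})), the count observed inside func stays at zero; the object is destroyed by whoever owns it, after the call.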
For all intents and purposes arg behaves as if it were my_type&, as does any named rvalue reference. The trick is storage duration and the semantics of lifetime extension by reference binding. It's all regular references under the hood: being an rvalue is a property of an expression, not a separate kind of object.
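One way to see this is to let overload resolution report how the expression arg is treated. The sketch below assumes an empty stand-in for my_type; the value_category enum, the category overloads and func_moving are hypothetical helpers, not anything from the original question.

```cpp
#include <type_traits>
#include <utility>

struct my_type {}; // hypothetical stand-in for the type used in the text

enum class value_category { lvalue, rvalue };

// Overload resolution picks one of these based on the argument expression.
value_category category(my_type&)  { return value_category::lvalue; }
value_category category(my_type&&) { return value_category::rvalue; }

value_category func(my_type&& arg) {
    // The declared type of arg is my_type&& ...
    static_assert(std::is_same<decltype(arg), my_type&&>::value, "declared type");
    // ... but the expression `arg` is an lvalue of type my_type.
    static_assert(std::is_same<decltype((arg)), my_type&>::value, "expression category");
    return category(arg); // selects the my_type& overload
}

value_category func_moving(my_type&& arg) {
    return category(std::move(arg)); // the explicit cast selects the my_type&& overload
}
```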
If it helps, recall the increment / decrement operators. There are two overloads, not two operators: operator++() (pre) and operator++(int) (post). The int carries no information; it is there only so the compiler has different signatures for different situations / contexts / agreements about value treatment. This is sort of the same deal with references.
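As a reminder of what those two overloads look like in practice, here is a minimal sketch using a hypothetical counter type (not part of the original question).

```cpp
// Hypothetical counter type used only for illustration.
struct counter {
    int value;

    // Pre-increment: no parameter, returns the updated object by reference.
    counter& operator++() {
        ++value;
        return *this;
    }

    // Post-increment: the unnamed int parameter only selects this overload.
    counter operator++(int) {
        counter old = *this; // remember the state before incrementing
        ++value;
        return old;          // hand back the old state by value
    }
};
```

Writing c++ calls the second overload and ++c the first; the dummy argument is never inspected by the function body.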
If rvalue and lvalue references are both always referred to like an lvalue, what's the difference?
In a word: object lifetime.
An lvalue reference must always be bound to something with a longer storage duration, something that is already constructed. That way there is no need to call constructors or destructors within the scope of the lvalue reference variable: by definition we are handed a ready-made object and forget about it before it is due to be destroyed.
It's also relevant that local objects are implicitly destroyed in the reverse order of their definition:
int a; // created first, destroyed last
int b; // created second, destroyed 2nd-last
int & c = b; // fine, `c` goes out of scope before `b` per above
int && d = std::move(a); // fine, `a` outlives `d`, same situation as `c`
If we bind an rvalue reference to something that is an lvalue, the same rule applies: the lvalue must by definition have longer storage, so we don't need to call constructors or destructors for c or even d. You can't trick the compiler with std::move on this, because it knows the scope of the object being moved: d is unambiguously shorter-lived than the object it refers to. We're just forcing the compiler to use the rvalue type check / context, and that's all we've achieved.
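The same four declarations can be replayed with a type that records its own destruction. This is a sketch under the assumption of a hypothetical noisy type that appends its id to a log string; it shows that only a and b are ever destroyed, in reverse order, while c and d contribute nothing.

```cpp
#include <string>
#include <utility>

// Hypothetical type that appends its id to a log when it is destroyed.
struct noisy {
    std::string* log;
    char id;
    noisy(std::string& l, char i) : log(&l), id(i) {}
    ~noisy() { log->push_back(id); }
};

// Returns the ids of the destroyed objects, in order of destruction.
std::string destruction_order() {
    std::string log;
    {
        noisy a(log, 'a');        // created first, destroyed last
        noisy b(log, 'b');        // created second, destroyed first
        noisy& c = b;             // alias only: no constructor, no destructor
        noisy&& d = std::move(a); // also just an alias for a
        (void)c;
        (void)d;
    }
    return log;
}
```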
The difference is with things that are not lvalues: results of expressions, which can be referred to, but only through references that are definitely short-lived, perhaps shorter than the duration of a local variable. Hint, hint.
When we bind the result of a function call or an expression to an rvalue reference, we are creating a reference to a temporary object that otherwise could not be referred to. The temporary's lifetime is extended to match that of the reference, so in effect we are forcing in-place construction of a variable from the result of an expression. This is a variation on copy/move elision where the compiler has no choice but to construct the result in place:
int a = 2, b = 3; // lvalues
int && temp = a + b; // temp refers to a temporary holding a + b, kept alive as long as temp
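The lifetime extension can be observed directly. The sketch below assumes a hypothetical probe type that counts how many of its objects are alive, plus a helper make_probe returning one by value; neither comes from the original question. The second function repeats the int example to show that temp names a real, modifiable object.

```cpp
// Hypothetical type that counts how many of its objects are currently alive.
struct probe {
    static int alive;
    probe() { ++alive; }
    probe(const probe&) { ++alive; }
    ~probe() { --alive; }
};
int probe::alive = 0;

probe make_probe() { return probe(); }

// Number of live probe objects while an rvalue reference is bound to the result.
int alive_while_bound() {
    probe&& temp = make_probe(); // the temporary now lives as long as temp does
    (void)temp;
    return probe::alive;         // read before temp goes out of scope
}

// The int example from the text, encoded so each value can be inspected.
int sum_via_reference() {
    int a = 2, b = 3;
    int&& temp = a + b; // temporary holding 5
    temp += 1;          // modifies the temporary, not a or b
    return temp * 100 + a * 10 + b;
}
```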
The case with func
It boils down to an lvalue binding: reference parameters refer to objects that may exist for longer than the function call, and as such a named parameter is an lvalue even when its type is an rvalue reference.
The two cases are:
func( std::move( variable ) ); // case 1
func( my_type() + my_type() ); // case 2
func is not allowed to guess ahead of time which situation it will be used in (optimizations aside). If case 1 were not allowed, there would be a legitimate reason to treat an rvalue reference parameter as having a shorter storage duration than the function call. That would make no sense either, though: the object has to be cleaned up either always inside func or always outside of it, and an "unknown" storage duration at compile time is not satisfactory.
The compiler has no choice but to assume the worst, that case 1 might happen eventually, so in the general case it must guarantee that the object arg refers to lasts longer than the call to func. As a consequence (arg has to be treated as outliving the call to func some of the time, and func's generated code must work in both cases), arg's allowable usage and assumed storage duration meet the requirements of my_type& and not my_type&&.
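Both cases can be run against the same func to see who cleans up. This is only a sketch: it assumes a stand-in my_type that counts its live objects, a hypothetical operator+ returning a fresh object, and an observation struct invented for reporting. func has one body and destroys nothing in either case.

```cpp
#include <utility>

// Hypothetical stand-in for my_type that counts its live objects.
struct my_type {
    static int alive;
    my_type() { ++alive; }
    my_type(const my_type&) { ++alive; }
    ~my_type() { --alive; }
};
int my_type::alive = 0;

// Assumed operator+ for illustration: ignores its operands, returns a new object.
my_type operator+(const my_type&, const my_type&) { return my_type(); }

// One body for both callers; reports how many objects exist during the call.
int func(my_type&& arg) {
    (void)arg;
    return my_type::alive;
}

struct observation { int during; int after; };

observation case_1() {
    my_type variable;
    int during = func(std::move(variable)); // only variable exists
    return { during, my_type::alive };      // variable is still alive here
}

observation case_2() {
    int during = func(my_type() + my_type()); // two operands plus the result
    return { during, my_type::alive };        // all temporaries are gone
}
```

In case 1 the object outlives the call and is destroyed by its owner; in case 2 the temporaries are destroyed by the caller at the end of the full expression, again after func has returned.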