Foreword
Having to duplicate code when you add move-semantics support to an interface is very annoying. For each function you have to write two almost identical implementations: one that copies from the argument and one that moves from it. If a function has two parameters, it is not even code duplication, it is code quadruplication:
void Func(const TArg1 &arg1, const TArg2 &arg2); // copies from both arguments
void Func(const TArg1 &arg1, TArg2 &&arg2); // copies from the first, moves from the second
void Func( TArg1 &&arg1, const TArg2 &arg2); // moves from the first, copies from the second
void Func( TArg1 &&arg1, TArg2 &&arg2); // moves from both
In the general case you need up to 2^N overloads for a function with N parameters. In my opinion this makes move semantics practically unusable. It is the most disappointing feature of C++11.
A similar problem could have arisen even earlier, with temporaries. Let's take a look at the following piece of code:
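To make the quadruplication concrete, here is a minimal sketch of the four overloads with bodies. `Probe` and `Target` are hypothetical names invented for this illustration; `Probe` only counts how it was assigned to, so you can observe which overload was selected.

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe type: counts how many times it was copy- or move-assigned to.
struct Probe {
    int copied = 0;
    int moved = 0;
    Probe() = default;
    Probe(const Probe &) = default;
    Probe(Probe &&) = default;
    Probe &operator=(const Probe &) { ++copied; return *this; }
    Probe &operator=(Probe &&) { ++moved; return *this; }
};

// Hypothetical class carrying the four overloads from the text.
// The bodies differ only in where std::move is applied.
struct Target {
    Probe a, b;
    void Func(const Probe &x, const Probe &y) { a = x;            b = y; }
    void Func(const Probe &x, Probe &&y)      { a = x;            b = std::move(y); }
    void Func(Probe &&x, const Probe &y)      { a = std::move(x); b = y; }
    void Func(Probe &&x, Probe &&y)           { a = std::move(x); b = std::move(y); }
};
```

Calling `t.Func(p, Probe())` on a `Target t` with an lvalue `p` picks the second overload: `t.a` is copy-assigned and `t.b` is move-assigned.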
void Func1(const T &arg);
T Func2();
int main()
{
Func1(Func2());
return 0;
}
It's quite strange that a temporary object can be passed to a function that takes a reference. The temporary object may not even have an address; it can be cached in a register, for example. But C++ allows passing temporaries where a const (and only const) lvalue reference is accepted. In that case the temporary is kept alive while the reference is in use: for a function argument, until the end of the full expression that contains the call. Without this rule, we would have to write two implementations even here:
void Func1(const T& arg);
void Func1(T arg);
I don't know why the rule that allows passing temporaries where a reference is accepted was created (well, without this rule we would not be able to call the copy constructor to copy a temporary object, so Func1(Func2()) where Func1 is void Func1(T arg) would not work anyway :) ), but thanks to this rule we don't have to write two overloads of the function.
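The binding rule can be tried out with a small sketch. `Func1` and `Func2` here are stand-ins for the declarations above, with `std::string` chosen arbitrarily as `T`; `Demo` is a hypothetical helper added only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Stand-ins for Func1 and Func2 from the text, with T = std::string (an assumption).
std::size_t Func1(const std::string &arg) { return arg.size(); }
std::string Func2() { return "hello"; }

// Hypothetical helper that exercises both forms of binding.
std::size_t Demo() {
    // The temporary returned by Func2() binds to the const reference parameter
    // and stays alive until the whole call expression has been evaluated.
    std::size_t n = Func1(Func2());

    // A local const reference extends the temporary's lifetime to its own scope.
    const std::string &ref = Func2();
    return n + ref.size();
}
```

A single `Func1(const T&)` overload therefore accepts both named objects and temporaries.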
Solution #1: Perfect forwarding
Unfortunately, there is no equally simple rule that would make it unnecessary to implement two overloads of the same function: one that takes a const lvalue reference and one that takes an rvalue reference. Instead, perfect forwarding was devised:
template <typename U>
void Func(U &&param) // despite the fact the parameter has "U&&" type at declaration,
// it actually can be just "U&" or even "const U&", it’s due to
// the template type deducing rules
{
value = std::forward<U>(param); // use move or copy semantic depending on the
// real type of param
}
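Below is a self-contained sketch of the same technique. `Probe` and `Holder` are hypothetical types made up for the example; `Probe` counts copy and move assignments so the effect of `std::forward` becomes visible.

```cpp
#include <cassert>
#include <utility>

// Hypothetical probe type: counts how many times it was copy- or move-assigned to.
struct Probe {
    int copied = 0;
    int moved = 0;
    Probe() = default;
    Probe(const Probe &) = default;
    Probe(Probe &&) = default;
    Probe &operator=(const Probe &) { ++copied; return *this; }
    Probe &operator=(Probe &&) { ++moved; return *this; }
};

// Hypothetical owner of the `value` assigned to in the snippet above.
struct Holder {
    Probe value;

    template <typename U>
    void Func(U &&param) {
        // For an lvalue argument U is deduced as "Probe&" or "const Probe&",
        // and std::forward yields an lvalue: copy assignment.
        // For an rvalue argument U is deduced as "Probe",
        // and std::forward yields an rvalue: move assignment.
        value = std::forward<U>(param);
    }
};
```

With a `Holder h` and a `Probe p`, `h.Func(p)` copies and `h.Func(std::move(p))` moves, although only one function body was written.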
It may look like the simple rule that lets us avoid duplication. But it is not simple: it relies on non-obvious template "magic", and it has some disadvantages that follow from the fact that a function using perfect forwarding must be a template:
- The implementation of the function must be located in a header.
- It blows up the binary size, because a separate implementation is generated for each combination of parameter types (copy/move) that is actually used: you have a single implementation in the source code but up to 2^N implementations in the binary.
- There is no type checking for the argument. You can pass a value of any type into the function (since the function accepts a template type). The actual checking is done at the points where the parameter is used. This may produce hard-to-understand error messages and lead to unexpected consequences.
The last problem can be solved by creating non-template wrappers for perfect-forwarding functions:
public:
void push( T &&data) { push_fwd(std::move(data)); } // a named rvalue reference is an lvalue, so move explicitly
void push(const T &data) { push_fwd(data); }
private:
template <typename U>
void push_fwd(U &&data)
{
// actual implementation
}
Of course, this is practical only if the function has few parameters (one or two). Otherwise you have to write too many wrappers (up to 2^N, as before).
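Here is how the wrapper approach might look as a complete class. `Stack`, `top` and `Item` are hypothetical names introduced for this sketch, and storing the elements in a `std::vector` is an assumption. Note that the rvalue overload has to call `std::move(data)` explicitly, because a named rvalue reference is itself an lvalue.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical element type that remembers whether it was move-constructed.
struct Item {
    bool was_moved = false;
    Item() = default;
    Item(const Item &) : was_moved(false) {}
    Item(Item &&) noexcept : was_moved(true) {}
};

// Hypothetical container: non-template wrappers around one forwarding function.
template <typename T>
class Stack {
public:
    void push(T &&data)      { push_fwd(std::move(data)); }
    void push(const T &data) { push_fwd(data); }
    const T &top() const     { return items_.back(); }

private:
    // The single real implementation; only the two wrappers can reach it,
    // so the argument type is checked at the public interface.
    template <typename U>
    void push_fwd(U &&data) {
        items_.push_back(std::forward<U>(data));
    }

    std::vector<T> items_;
};
```

Passing an unrelated type to `push` is now rejected at the call site, because neither public overload accepts it.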
Solution #2: Runtime check for movability
Eventually I came to the idea that checking arguments for movability should be done at runtime rather than at compile time. I created a reference-wrapper class with constructors that accept both kinds of references (rvalue and const lvalue). The class stores the reference passed to the constructor as a const lvalue reference and additionally stores a flag that records whether the original reference was an rvalue.
At runtime you can then check whether the original reference was an rvalue and, if so, cast the stored reference back to an rvalue reference.
Unsurprisingly, someone else had come up with this idea before me. He named it the "in" idiom (I called mine "pmp", for "possibly movable param"). You can read about this idiom in detail here and here (the original page about the "in" idiom; I recommend reading all 3 parts of the article if you are really interested in the problem, since it reviews the problem in depth).
In short, the implementation of the idiom looks like this:
template <typename T>
class in
{
public:
in (const T& l): v_ (l), rv_ (false) {}
in (T&& r): v_ (r), rv_ (true) {}
bool rvalue () const {return rv_;}
const T& get () const {return v_;}
T&& rget () const {return std::move (const_cast<T&> (v_));}
private:
const T& v_; // original reference
bool rv_; // whether it is rvalue-reference
};
(The full implementation also handles the special case of types that are implicitly convertible to T.)
Example of usage:
class A
{
public:
void set_vec(in<std::vector<int>> param1, in<std::vector<int>> param2)
{
if (param1.rvalue()) vec1 = param1.rget(); // move if param1 is rvalue
else vec1 = param1.get(); // just copy otherwise
if (param2.rvalue()) vec2 = param2.rget(); // move if param2 is rvalue
else vec2 = param2.get(); // just copy otherwise
}
private:
std::vector<int> vec1, vec2;
};
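The pieces above can be combined into one compilable sketch. The `in` class is the one shown earlier; `Probe` and the member function `set` are hypothetical and replace `std::vector<int>` and `set_vec` only so that copies and moves can be counted.

```cpp
#include <cassert>
#include <utility>

// The "in" wrapper from the text: remembers whether it was built from an rvalue.
template <typename T>
class in {
public:
    in(const T &l) : v_(l), rv_(false) {}
    in(T &&r) : v_(r), rv_(true) {}
    bool rvalue() const { return rv_; }
    const T &get() const { return v_; }
    T &&rget() const { return std::move(const_cast<T &>(v_)); }
private:
    const T &v_;  // original reference
    bool rv_;     // whether it was an rvalue reference
};

// Hypothetical probe type: counts how many times it was copy- or move-assigned to.
struct Probe {
    int copied = 0;
    int moved = 0;
    Probe() = default;
    Probe(const Probe &) = default;
    Probe(Probe &&) = default;
    Probe &operator=(const Probe &) { ++copied; return *this; }
    Probe &operator=(Probe &&) { ++moved; return *this; }
};

class A {
public:
    // One non-template function covers all four copy/move combinations.
    void set(in<Probe> param1, in<Probe> param2) {
        if (param1.rvalue()) p1 = param1.rget();  // move if param1 is an rvalue
        else                 p1 = param1.get();   // copy otherwise
        if (param2.rvalue()) p2 = param2.rget();
        else                 p2 = param2.get();
    }
    Probe p1, p2;
};
```

The decision between copying and moving is made by an `if` at runtime, so only one copy of `set` ends up in the binary.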
The implementation of "in" shown above lacks user-defined copy and move constructors. We can add them:
class in
{
...
in(const in &other): v_(other.v_), rv_(false) {} // always makes parameter not movable
// even if the original reference
// is movable
in( in &&other): v_(other.v_), rv_(other.rv_) {} // makes parameter movable if the
// original reference was movable
...
};
Now we can use it in this way:
void func1(in<std::vector<int>> param);
void func2(in<std::vector<int>> param);
void func3(in<std::vector<int>> param)
{
func1(param); // don't move param into func1 even if original reference
// is rvalue. func1 will always use copy of param, since we
// still need param in this function
// some usage of param
// now we don’t need param
func2(std::move(param)); // move param into func2 if original reference
// is rvalue, or copy param into func2 if original
// reference is const lvalue
}
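The propagation of the flag through nested calls can be checked with a short sketch. `Probe`, `store` and the two output parameters of `func3` are hypothetical; `store` plays the role of both `func1` and `func2`.

```cpp
#include <cassert>
#include <utility>

// The "in" wrapper with the copy and move constructors added.
template <typename T>
class in {
public:
    in(const T &l) : v_(l), rv_(false) {}
    in(T &&r) : v_(r), rv_(true) {}
    in(const in &other) : v_(other.v_), rv_(false) {}   // a copy is never movable
    in(in &&other) : v_(other.v_), rv_(other.rv_) {}    // a move keeps the flag
    bool rvalue() const { return rv_; }
    const T &get() const { return v_; }
    T &&rget() const { return std::move(const_cast<T &>(v_)); }
private:
    const T &v_;
    bool rv_;
};

// Hypothetical probe type: counts how many times it was copy- or move-assigned to.
struct Probe {
    int copied = 0;
    int moved = 0;
    Probe() = default;
    Probe(const Probe &) = default;
    Probe(Probe &&) = default;
    Probe &operator=(const Probe &) { ++copied; return *this; }
    Probe &operator=(Probe &&) { ++moved; return *this; }
};

// Hypothetical consumer standing in for func1 and func2.
void store(Probe &dst, in<Probe> src) {
    if (src.rvalue()) dst = src.rget();
    else              dst = src.get();
}

void func3(Probe &first, Probe &second, in<Probe> param) {
    store(first, param);              // passed as an lvalue: always copied
    store(second, std::move(param));  // last use: moved if the caller passed an rvalue
}
```

When the caller passes a temporary, `first` receives a copy and `second` receives the moved value; when the caller passes an lvalue, both are copies.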
We could also try to overload the assignment operator:
template<typename T>
T& operator=(T &lhs, in<T> rhs)
{
if (rhs.rvalue()) lhs = rhs.rget();
else lhs = rhs.get();
return lhs;
}
After that we would not need to check for rvalue-ness each time; we could just write:
vec1 = std::move(param1); // moves or copies depending on whether param1 is movable
vec2 = std::move(param2); // moves or copies depending on whether param2 is movable
But unfortunately C++ doesn't allow operator= to be overloaded as a global function (https://stackoverflow.com/a/871290/5447906). We can, however, rename the function to assign:
template<typename T>
void assign(T &lhs, in<T> rhs)
{
if (rhs.rvalue()) lhs = rhs.rget();
else lhs = rhs.get();
}
and use it like this:
assign(vec1, std::move(param1)); // moves or copies depending on whether param1 is movable
assign(vec2, std::move(param2)); // moves or copies depending on whether param2 is movable
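For completeness, here is a compilable sketch of `assign` in use. As before, `Probe` and the member function `set` are hypothetical and exist only to make the copies and moves countable. Keep in mind that `T` is deduced from both parameters, so the second argument has to be an `in<T>` object already; passing a plain `T` there would fail template argument deduction.

```cpp
#include <cassert>
#include <utility>

template <typename T>
class in {
public:
    in(const T &l) : v_(l), rv_(false) {}
    in(T &&r) : v_(r), rv_(true) {}
    in(const in &other) : v_(other.v_), rv_(false) {}
    in(in &&other) : v_(other.v_), rv_(other.rv_) {}
    bool rvalue() const { return rv_; }
    const T &get() const { return v_; }
    T &&rget() const { return std::move(const_cast<T &>(v_)); }
private:
    const T &v_;
    bool rv_;
};

// Hypothetical probe type: counts how many times it was copy- or move-assigned to.
struct Probe {
    int copied = 0;
    int moved = 0;
    Probe() = default;
    Probe(const Probe &) = default;
    Probe(Probe &&) = default;
    Probe &operator=(const Probe &) { ++copied; return *this; }
    Probe &operator=(Probe &&) { ++moved; return *this; }
};

// The assign function from the text: hides the runtime check in one place.
template <typename T>
void assign(T &lhs, in<T> rhs) {
    if (rhs.rvalue()) lhs = rhs.rget();
    else              lhs = rhs.get();
}

class A {
public:
    void set(in<Probe> param1, in<Probe> param2) {
        assign(p1, std::move(param1));  // moves or copies, decided at runtime
        assign(p2, std::move(param2));
    }
    Probe p1, p2;
};
```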
This also won't work with constructors. We can't just write:
std::vector<int> vec(std::move(param));
That would require the standard library to support the feature:
class vector
{
...
public:
vector(std::in<vector> other); // copy and move constructor
...
};
But the standard library doesn't know anything about our "in" class. And here we can't make a workaround similar to assign, so the usefulness of the "in" class is limited.
Afterword
T, const T&, T&& for parameters: that is too many for me. Stop introducing things that do the same (well, almost the same). T is just enough!
I would prefer to write just this:
// The function in ++++C language:
func(std::vector<int> param) // no need to specify const & or &&, param is just parameter.
// it is always reference for complex types (or for types with
// special qualifier that says that arguments of this type
// must be always passed by reference).
{
another_vec = std::move(param); // move parameter if it's movable.
// compiler hides actual rvalue-ness
// of the arguments in its ABI
}
I don't know whether the standard committee considered this kind of move semantics implementation, but it is probably too late to make such changes in C++, because they would make compilers' ABIs incompatible with previous versions. It would also add some runtime overhead, and there may be other problems that we don't know about.