Due to Return Value Optimization, the second form (passing a reference and modifying it) is almost certainly slower and less amenable to optimization, as well as less legible.
Let us consider a simple example function:
return_value foo( void );
Here are the possibilities that may occur:
- Return Value Optimization (RVO)
- Named Return Value Optimization (NRVO)
- Move semantic return
- Copy semantic return
What is Return Value Optimization? Consider this function:
return_value foo( void ) { return return_value(); }
In this example, an unnamed temporary variable is returned from a single exit point. Because of this, the compiler can easily (and is free to) completely remove any traces of this temporary value, and instead construct it directly in place, in the calling function:
void call_foo( void )
{
    return_value tmp = foo();
}
In this example, tmp is actually directly used in foo as if foo defined it, removing all copies. This is a HUGE optimization if return_value is a non-trivial type.
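When can RVO be used? Before C++17 that's up to the compiler, but in general, with a single return point, it is applied in practice. Since C++17, eliding the copy when a function returns an unnamed temporary is mandatory. Multiple return points make it less certain on older compilers, but if they all return unnamed temporaries, your chances increase.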
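To see whether RVO kicks in on a given compiler, one option is to count constructor calls. The following is only a minimal sketch: `Tracked` and `make_tracked` are hypothetical names invented for this illustration, not part of any library.

```cpp
#include <string>
#include <utility>

// Hypothetical type that counts how often it is copied or moved.
struct Tracked {
    static int copies;
    static int moves;
    std::string payload;

    Tracked() : payload("data") {}
    Tracked(const Tracked& other) : payload(other.payload) { ++copies; }
    Tracked(Tracked&& other) noexcept : payload(std::move(other.payload)) { ++moves; }
};

int Tracked::copies = 0;
int Tracked::moves = 0;

// Single return point, unnamed temporary: the classic RVO candidate.
Tracked make_tracked() { return Tracked(); }
```

After `Tracked t = make_tracked();`, the copy counter stays at zero in any C++11 or later build, because the worst case is a move. With mainstream compilers the move counter typically stays at zero as well, and under C++17 that is required for this particular pattern.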
What about Named Return Value Optimization?
This one is a bit trickier; if you name the variable before you return it, it's now an l-value. This means the compiler has to do more work to prove that the in place construction will be possible:
return_value foo( void )
{
    return_value bar;
    // do stuff
    return bar;
}
In general, this optimization is still possible, but less likely with multiple code paths, unless each code path returns the same object; returning different objects from different code paths tends to be difficult to optimize out:
return_value foo( void )
{
    if(some_condition)
    {
        return_value bar = value;
        return bar;
    }
    else
    {
        return_value bar2 = val2;
        return bar2;
    }
}
This is not going to be as well received. It's still possible NRVO could kick in, but it's getting less and less likely. If at all possible, construct a single return_value and tweak it in different code paths, rather than returning wholly different ones.
If NRVO is possible, this will get rid of any overhead; it will be as if it was constructed directly in the calling function.
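The "single object, tweaked in different code paths" advice could look roughly like this. It is a sketch under stated assumptions: `Report` and `build_report` are hypothetical names, and the copy counter is there only so the behaviour can be observed.

```cpp
#include <string>
#include <utility>

// Hypothetical type that counts copies so the effect is observable.
struct Report {
    static int copies;
    std::string text;

    Report() = default;
    Report(const Report& other) : text(other.text) { ++copies; }
    Report(Report&& other) noexcept : text(std::move(other.text)) {}
};

int Report::copies = 0;

Report build_report(bool verbose) {
    Report result;  // the only object this function ever returns
    if (verbose) {
        result.text = "verbose report";
    } else {
        result.text = "short report";
    }
    return result;  // every code path returns the same named object
}
```

Because every path returns `result`, the compiler has one obvious candidate to construct in the caller's storage. Even if NRVO is not applied, a returned local is treated as an rvalue first, so the fallback here is a move rather than a copy.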
If neither form of return value optimization is possible, Move return may be possible.
C++11 supports move semantics natively, and C++03 can emulate them: rather than copying the information out of one object into another, move semantics allow one object to steal the data in another, leaving the source in some default state. For C++03 move semantics you need Boost.Move, but the concept is still sound.
Move return isn't as fast as RVO, but it's drastically faster than a copy. With a compliant C++11 compiler, of which there are many today, all standard library containers and strings should support move semantics. Your own objects may not get a default move constructor/assignment operator (MSVC, for example, does not currently generate default move operations for user-defined types), but adding move semantics is not hard: just use the copy-and-swap idiom to add it!
See: "What is the copy-and-swap idiom?"
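As a rough sketch of how copy-and-swap gives a type move semantics, consider the following. `Buffer` is a hypothetical resource-owning class made up for this example; a real type would likely just hold a `std::vector`.

```cpp
#include <algorithm>  // std::copy
#include <cstddef>    // std::size_t
#include <utility>    // std::swap, std::move

// Hypothetical resource-owning type, used only for illustration.
class Buffer {
public:
    explicit Buffer(std::size_t size = 0)
        : size_(size), data_(size ? new int[size]() : nullptr) {}

    // Copy constructor: allocate fresh storage and deep-copy the elements.
    Buffer(const Buffer& other)
        : size_(other.size_),
          data_(other.size_ ? new int[other.size_] : nullptr) {
        std::copy(other.data_, other.data_ + size_, data_);
    }

    ~Buffer() { delete[] data_; }

    // Member-wise swap; never throws.
    friend void swap(Buffer& a, Buffer& b) noexcept {
        using std::swap;
        swap(a.size_, b.size_);
        swap(a.data_, b.data_);
    }

    // Move constructor: start empty, then steal the source's contents.
    Buffer(Buffer&& other) noexcept : size_(0), data_(nullptr) {
        swap(*this, other);
    }

    // Taking the argument by value lets this single operator act as both
    // copy assignment and move assignment.
    Buffer& operator=(Buffer other) noexcept {
        swap(*this, other);
        return *this;
    }

    std::size_t size() const { return size_; }
    int* data() { return data_; }
    const int* data() const { return data_; }

private:
    std::size_t size_;
    int* data_;
};
```

With this in place, `Buffer b(std::move(a));` leaves `a` empty and hands its storage to `b`, and returning a `Buffer` from a function falls back to the move constructor whenever the copy cannot be elided.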
Finally, if your return_value does not support move and your function is too hard to RVO, you will default to copy semantics, which is what your friend said to avoid.
However, in a large number of cases, this will not be significantly slower!
For primitive types such as float, int, or bool, copying is a single assignment or move, hardly the sort of thing to complain about. Passing such things by reference without a really good reason is likely to make your code slower, as references are typically implemented as pointers. For something like your bool example, there's no reason to waste time or energy passing a bool by reference; returning it is the fastest possible way.
When you return something that fits in a register, it's usually returned in a register for exactly that reason; it's fast, and as noted, easiest to maintain.
If your type is a POD type, such as a simple struct, this can often be passed through registers via a fastcall mechanism, or optimized away into direct assignments.
If your type is a large and imposing type, such as std::string or something with a lot of data behind it, requiring lots of deep copies, and your code is sufficiently complex as to make RVO unlikely, then perhaps passing by reference is a better idea.
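For comparison, here is a sketch of both forms side by side for a larger type. The function names `split_words_into` and `split_words` are hypothetical, and the splitting logic is deliberately simplistic (spaces only).

```cpp
#include <string>
#include <vector>

// Out-parameter form: the caller owns the vector and can reuse it across calls.
void split_words_into(const std::string& text, std::vector<std::string>& out) {
    out.clear();  // drop old contents; existing capacity is typically retained
    std::string current;
    for (char c : text) {
        if (c != ' ') {
            current += c;
        } else if (!current.empty()) {
            out.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        out.push_back(current);
    }
}

// Return-by-value form: a single named object returned from a single point,
// so NRVO or a cheap move applies.
std::vector<std::string> split_words(const std::string& text) {
    std::vector<std::string> words;
    split_words_into(text, words);
    return words;
}
```

The by-value form reads better at the call site (`auto words = split_words(line);`). The out-parameter form mainly pays off when the same vector is refilled many times in a hot loop, and that is something to confirm with a profiler rather than assume.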
Summary
- Anonymous (rvalue) values of any kind should be returned by value.
- Small or primitive types should be returned by value.
- Any type supporting move semantics (standard library containers, strings, etc.) should be returned by value.
- Named (lvalue) values that are easy to reason about should be returned by value.
- Large data types in complex functions should be profiled or passed by reference.
Always return by value when possible, if you are using C++11. It's more legible, and faster.