Consider this example:

    #include <utility>

    // runtime dominated by argument passing
    template <class T>
    void foo(T t) {}

    int main() {
        int i(0);
        foo<int>(i);              // fast -- int is scalar type
        foo<int&>(i);             // slow -- lvalue reference overhead
        foo<int&&>(std::move(i)); // ???
    }

Is foo<int&&>(std::move(i)) as fast as foo<int>(i), or does it involve pointer overhead like foo<int&>(i)?
EDIT: As suggested, running g++ -S gave me the same 51-line assembly file for foo<int>(i) and foo<int&>(i), but foo<int&&>(std::move(i)) resulted in 71 lines of assembly code (it looks like the difference came from std::move).
EDIT: Thanks to those who recommended g++ -S with different optimization levels. Using -O3 (and making foo noinline), I was able to get output which looks like xaxxon's solution.
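
For anyone who wants to repeat the experiment from the last edit, here is a minimal sketch of the kind of file I mean. It is not my exact test: foo returns t + 1 here (my own change, so the optimizer cannot throw the parameter away), the wrapper names by_value, by_lvalue_ref and by_rvalue_ref are made up for illustration, and `__attribute__((noinline))` is a GCC/Clang extension rather than standard C++.

```cpp
#include <utility>

// Hypothetical variant of foo from the question: it reads its argument so
// the parameter is actually used, and noinline keeps the call visible in
// the generated assembly even at -O3.
template <class T>
__attribute__((noinline)) int foo(T t) {
    return t + 1;
}

// One wrapper per calling form, so each call gets its own labelled
// function in the assembly listing.
int by_value(int i)      { return foo<int>(i); }
int by_lvalue_ref(int i) { return foo<int&>(i); }
int by_rvalue_ref(int i) { return foo<int&&>(std::move(i)); }
```

Compiling this with g++ -O3 -S gives one short function per wrapper to compare. On common ABIs I would expect both reference versions to be handed an address and the by-value version to get the int directly, but that is an expectation to check against the actual output, not a guarantee.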