The short answer is that none of the alternatives is faster: they all produce identical code when compiler optimizations are enabled.
When the value is known at compile time, the assembly is this: Compiler Explorer link
movabs rsi, -353255926290448386
call std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<unsigned long long>(unsigned long long)
It loads the precomputed constant into a register and calls operator<<(). All three alternatives produce this optimized assembly.
When the value is read at runtime, the assembly is this: Compiler Explorer link
mov rax, QWORD PTR [rsp+8]
mov edi, OFFSET FLAT:_ZSt4cout
lea rsi, [rax+rax]
call std::basic_ostream<char, std::char_traits<char> >& std::basic_ostream<char, std::char_traits<char> >::_M_insert<unsigned long long>(unsigned long long)
It uses addition (rax+rax, computed with a single lea) and calls operator<<(). Again, all three alternatives produce the same assembly.
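For reference, here is a minimal sketch of the kind of code being compared. The three alternatives are not reproduced in this answer, so the sketch assumes they are `x * 2`, `x << 1` and `x + x`. The function names are hypothetical and chosen only for illustration. The `unsigned long long` type matches the `_M_insert<unsigned long long>` call in the assembly above.

```cpp
#include <cassert>

// Hypothetical names: three assumed ways of doubling an unsigned value.
constexpr unsigned long long by_multiply(unsigned long long x) { return x * 2; }
constexpr unsigned long long by_shift(unsigned long long x)    { return x << 1; }
constexpr unsigned long long by_add(unsigned long long x)      { return x + x; }

// Compile-time case: each call folds to the same constant.
static_assert(by_multiply(21) == 42, "multiply doubles");
static_assert(by_shift(21) == 42, "shift doubles");
static_assert(by_add(21) == 42, "add doubles");

// Runtime case: unsigned arithmetic wraps modulo 2^64, so the three
// forms agree for every input, including values that overflow.
inline void check_agree(unsigned long long x) {
    assert(by_multiply(x) == by_shift(x));
    assert(by_shift(x) == by_add(x));
}
```

Because the three functions are equivalent for every unsigned input, an optimizing compiler is free to emit the same instruction for all of them. Compiling something like this with optimizations enabled (for example `g++ -O2 -S`) and comparing the output is one way to check this for your own compiler and target.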