Consider the following declaration of local variables:
bool a{false};
bool b{false};
bool c{false};
bool d{false};
bool e{false};
bool f{false};
bool g{false};
bool h{false};
On x86-64, I'd expect the optimizer to reduce the initialization of those variables to something like mov qword ptr [rsp], 0. But what I get instead, with every compiler I've been able to try (regardless of optimization level), is some form of:
mov byte ptr [rsp + 7], 0
mov byte ptr [rsp + 6], 0
mov byte ptr [rsp + 5], 0
mov byte ptr [rsp + 4], 0
mov byte ptr [rsp + 3], 0
mov byte ptr [rsp + 2], 0
mov byte ptr [rsp + 1], 0
mov byte ptr [rsp], 0
That seems like a waste of CPU cycles. Using copy-initialization or value-initialization, or replacing the braces with parentheses, made no difference.
But wait, that's not all. Suppose that I have this instead:
struct
{
    bool a{false};
    bool b{false};
    bool c{false};
    bool d{false};
    bool e{false};
    bool f{false};
    bool g{false};
    bool h{false};
} bools;
Then the initialization of bools generates exactly what I'd expect: mov qword ptr [rsp], 0. What gives?
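For reference, here is a minimal, self-contained sketch of the two cases side by side. It is only an illustration, not necessarily the exact code behind the listings above: the names escape, Bools, separate_locals and grouped_in_struct are hypothetical, and escape relies on GCC/Clang inline-assembly syntax (an assumption; it will not build with MSVC). Its only purpose is to keep the variables observable so the optimizer cannot delete them entirely.

```cpp
// Hypothetical helper: tells the optimizer that the object at p may be
// read or written elsewhere, so its stores cannot be discarded.
// Assumes GCC/Clang extended inline assembly.
inline void escape(void* p) {
    asm volatile("" : : "g"(p) : "memory");
}

// Case 1: eight independent local variables.
int separate_locals() {
    bool a{false};
    bool b{false};
    bool c{false};
    bool d{false};
    bool e{false};
    bool f{false};
    bool g{false};
    bool h{false};
    escape(&a); escape(&b); escape(&c); escape(&d);
    escape(&e); escape(&f); escape(&g); escape(&h);
    // Each bool promotes to int, so this counts how many are true.
    return a + b + c + d + e + f + g + h;
}

// Case 2: the same eight flags grouped into one aggregate
// (named here only so it can be referred to; the question uses an
// unnamed struct).
struct Bools {
    bool a{false};
    bool b{false};
    bool c{false};
    bool d{false};
    bool e{false};
    bool f{false};
    bool g{false};
    bool h{false};
};

int grouped_in_struct() {
    Bools bools;
    escape(&bools);
    return bools.a + bools.b + bools.c + bools.d +
           bools.e + bools.f + bools.g + bools.h;
}
```

Both functions return 0 at run time; the interesting part is the stores each one emits before the escape calls, which will depend on the compiler and flags used.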
You can try the code above yourself in this Compiler Explorer link.
The behavior of the different compilers is so consistent that I am forced to think there is some reason for the above inefficiency, but I have not been able to find it. Do you know why?