Consider the following function:
void func(bool& flag)
{
    if(!flag) flag=true;
}
It seems to me that if flag has a valid boolean value, this would be equivalent to unconditionally setting it to true, like this:
void func(bool& flag)
{
    flag=true;
}
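Just to make the equivalence claim concrete, here is a minimal check (func1 and func2 are simply my names for the two versions above):

#include <cassert>

void func1(bool& flag) { if(!flag) flag = true; }
void func2(bool& flag) { flag = true; }

int main()
{
    for (bool init : {false, true}) {
        bool a = init, b = init;
        func1(a);
        func2(b);
        assert(a == b && a == true); // both versions leave the flag set
    }
}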
Yet neither gcc nor clang optimizes it this way; both generate the following at the -O3 optimization level:
_Z4funcRb:
.LFB0:
.cfi_startproc
cmp BYTE PTR [rdi], 0   # load *flag and compare it against false
jne .L1                 # already true: skip the store
mov BYTE PTR [rdi], 1   # flag was false: store true
.L1:
rep ret
My question is: is it just that this code is too special a case to bother optimizing, or are there good reasons why such an optimization would be undesired, given that flag is not a reference to volatile? It seems the only possible reason is that flag could somehow hold a value other than true or false without undefined behavior at the point of reading it, but I'm not sure whether that is possible.
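For concreteness, the kind of scenario I have in mind is a sketch like this, where a raw byte that is neither 0 nor 1 is copied into a bool's storage (whether reading such a bool is well-defined is exactly what I'm unsure about):

#include <cstring>

void func(bool& flag)
{
    if(!flag) flag = true;
}

int main()
{
    unsigned char byte = 2;     // a bit pattern that is neither 0 nor 1
    bool b;
    std::memcpy(&b, &byte, 1);  // copy the raw byte into b's storage
    func(b);                    // is reading b here undefined behavior?
    return b;
}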