Here's my code:
int f(double x)
{
return isnan(x);
}
If I #include <cmath>, I get this assembly:
xorl %eax, %eax
ucomisd %xmm0, %xmm0
setp %al
This is reasonably clever: ucomisd sets the parity flag if the comparison of x with itself is unordered, meaning x is NaN. Then setp copies the parity flag into the result (only a single byte, hence the initial clear of %eax).
But if I #include <math.h>, I get this assembly:
jmp __isnan
Now the code is not inlined, and the __isnan function is certainly no faster than the ucomisd instruction, so we have incurred a jump for no benefit. I get the same thing if I compile the code as C.
Now if I change the isnan() call to __builtin_isnan(), I get the simple ucomisd instruction regardless of which header I include, and it works in C too. Likewise if I just return x != x.
So my question is: why does the C <math.h> header provide a less efficient implementation of isnan() than the C++ <cmath> header? Are people really expected to use __builtin_isnan(), and if so, why?
I tested GCC 4.7.2 and 4.9.0 on x86-64 with -O2 and -O3 optimization.