47

Here's my code:

int f(double x)
{
  return isnan(x);
}

If I #include <cmath> I get this assembly:

xorl    %eax, %eax
ucomisd %xmm0, %xmm0
setp    %al

This is reasonably clever: ucomisd sets the parity flag if the comparison of x with itself is unordered, meaning x is NaN. Then setp copies the parity flag into the result (only a single byte, hence the initial clear of %eax).

But if I #include <math.h> I get this assembly:

jmp     __isnan

Now the code is not inlined, and the __isnan function is certainly no faster than the ucomisd instruction, so we have incurred a jump for no benefit. I get the same thing if I compile the code as C.

Now if I change the isnan() call to __builtin_isnan(), I get the simple ucomisd instruction regardless of which header I include, and it works in C too. Likewise if I just return x != x.
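For reference, the three variants discussed above can be put side by side. This is only a minimal sketch, assuming GCC or Clang (which provide `__builtin_isnan`); the function names `nan_via_macro`, `nan_via_builtin`, and `nan_via_compare` are hypothetical and not part of my original code.

```c
#include <math.h>

/* Uses whatever the C library's <math.h> isnan() macro expands to.
   isnan() only promises a nonzero value for NaN, so normalize to 0/1. */
int nan_via_macro(double x)
{
  return isnan(x) != 0;
}

/* Uses the compiler builtin directly (assumes GCC or Clang). */
int nan_via_builtin(double x)
{
  return __builtin_isnan(x) != 0;
}

/* Relies on IEEE 754 semantics: NaN is the only value that compares
   unequal to itself. May be optimized away under -ffast-math. */
int nan_via_compare(double x)
{
  return x != x;
}
```

Compiling this with something like `gcc -O2 -S` should make it easy to compare the generated code for each function on a given toolchain; the exact output will depend on the compiler and C library versions.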

So my question is, why does the C <math.h> header provide a less efficient implementation of isnan() than the C++ <cmath> header? Are people really expected to use __builtin_isnan(), and if so, why?

I tested GCC 4.7.2 and 4.9.0 on x86-64 with -O2 and -O3 optimization.

igauravsehrawat
John Zwinck
  • 2
    here is my speculation: pre c99, there is no inline function in c. no inline function means that functions have to be invoked by jmp/call (or some sort of branching). __builtin_isnan is not part of c. it's probably a platform specific intrinsic. – thang Sep 26 '14 at 05:58
  • 2
    But surely a system header like `<math.h>` can use platform-specific built-ins. – John Kugelman Sep 26 '14 at 06:01
  • 1
    I'm pretty sure `isnan` would use `__builtin_isnan` if possible. I see no reason why you would have to call it manually. – Rapptz Sep 26 '14 at 06:01
  • yes it surely can, but things that can be done aren't always done... – thang Sep 26 '14 at 06:01
  • @thang: I tried now with `-std=c99` and it did not help, `jmp __isnan` remains. Also, I don't think that GCC is required to (or does) actually call all functions naively even before C99; it can implement various builtins to help with performance or otherwise. – John Zwinck Sep 26 '14 at 06:02
  • 2
    Maybe when C99 came along, no one thought to go back and update isnan – thang Sep 26 '14 at 06:04
  • 4
    https://sourceware.org/bugzilla/show_bug.cgi?id=15367 – Marc Glisse Sep 26 '14 at 06:08
  • Can't reproduce your findings with `gcc (MacPorts gcc48 4.8.3_0) 4.8.3` as well as `gcc (MacPorts gcc49 4.9.1_0) 4.9.1` on OS X. I always get the sequence `ucomisd %xmm0, %xmm0 ; setp %al` for all cases you have outlined here. – Michael Foukarakis Sep 26 '14 at 06:54
  • does anyone think this is related with constantfolding ? – igauravsehrawat Sep 26 '14 at 08:19
  • Interesting. A few weeks ago, I was trying to optimize NumPy's quicksort (in C) by first moving the NaNs out of the way, so that comparisons become cheaper: NumPy defines a custom order where NaN is > any number including inf. Doing this with `(x) != (x)` made quicksort faster. Using `isnan` made it *slower*. – Fred Foo Sep 26 '14 at 09:07
  • @larsmans: from my testing it seems that if you say `__builtin_isnan()` you will get the same generated code as `(x) != (x)`...only less portable of course! – John Zwinck Sep 26 '14 at 09:16
  • @JohnZwinck NumPy wants to be portable. Strangely, it has its own macro that should become `__builtin_isnan` for GCC, but this still compiled to a function call, even though in a simple test function it *is* inlined. – Fred Foo Sep 26 '14 at 09:18
  • This might have something to do with legacy code generation for the FPU. An expression like (x) != (x) requires a "floating point assist" if x is NaN. Microcode, 90x slower than the SSE2 version. – Hans Passant Sep 26 '14 at 09:51
  • 1
    Note that this isn't a gcc issue(gcc doesn't ship a C library), but an issue with the standard C library, glibc in this case. – nos Sep 27 '14 at 18:06
  • 1
    @nos: thanks for pointing that out. I have submitted this as a bug in glibc: https://sourceware.org/bugzilla/show_bug.cgi?id=17441 – John Zwinck Sep 28 '14 at 04:04

1 Answer

19

Looking at <cmath> for libstdc++ shipped with gcc 4.9 you get this:

  constexpr bool
  isnan(double __x)
  { return __builtin_isnan(__x); }

A constexpr function could be aggressively inlined and, of course, the function just delegates the work over to __builtin_isnan.

The <math.h> header doesn't use __builtin_isnan; rather, it uses an __isnan implementation which is kind of long to paste here, but it's at line 430 of math.h on my machine™. Since the C99 standard requires using a macro for isnan et al. (section 7.12 of the C99 standard), the 'function' is defined as follows:

#define isnan(x) (sizeof (x) == sizeof (float) ? __isnanf (x)   \
  : sizeof (x) == sizeof (double) ? __isnan (x) \
  : __isnanl (x))

However, I see no reason why it can't use __builtin_isnan instead of __isnan so I suspect it's an oversight. As Marc Glisse points out in the comments, there is a relevant bug report for a similar issue using isinf instead of isnan.

Rapptz
  • actually, the bug is about isinf. it's a similar problem with a different function, but it's not strictly the same issue. – thang Sep 26 '14 at 06:19
  • 2
    Don't forget to include [this](http://chat.stackoverflow.com/transcript/message/19109073#19109073) saying that the standard *requires* that they be macros. – Mysticial Sep 26 '14 at 06:20
  • 1
    Do you think it would be legitimate to change `<math.h>` to simply say `#define isnan(x) __builtin_isnan(x)`? – John Zwinck Sep 26 '14 at 06:38
  • 1
    @JohnZwinck Yeah. I can't think of any reasons why it shouldn't be valid. – Rapptz Sep 26 '14 at 06:39
  • 1
    For gcc, it would be valid. __builtin_isnan isn't there on every compiler. – thang Sep 26 '14 at 06:51
  • But in C++98 mode, `isnan` is inlined too, without a `constexpr`, and GCC has `__inline` in C90 mode. – Fred Foo Sep 26 '14 at 08:51
  • @thang: According to the C standard, `<math.h>` is part of the implementation (i.e. part of GCC) so it's technically OK. – MSalters Sep 26 '14 at 09:12
  • @MSalters: But "the implementation", from the language standard's standpoint, is anything that's *either* the compiler *or* the library. And `math.h` is part of glibc, not gcc... or am I missing something? – DevSolar Sep 26 '14 at 10:04
  • @DevSolar Nothing stops an implementation from using compiler assisted help. For example, in C++'s case there are traits such as `std::is_enum` that require the use of compiler intrinsics to properly implement them. This is okay and there's nothing wrong with it. The only thing the standard dictates is what the implementation interface should look like and what its results are expected to be. How it does it is just a detail. – Rapptz Sep 26 '14 at 10:17
  • @thang the glibc PR mentions isnan as well, further down. Nobody to copy the obvious workaround `(isnan)(x)`? – Marc Glisse Sep 26 '14 at 15:45
  • @Rapptz: I've submitted this suggestion as a glibc bug here: https://sourceware.org/bugzilla/show_bug.cgi?id=17441 – John Zwinck Sep 28 '14 at 04:04