After compiling and running my program with clang 14, I found a bug that boils down to the following code:

#include <iostream>
#include <limits>

int main() {
    double a = std::numeric_limits<double>::max();
    double b = -0.1;
    double c = a * b - a * b;
    std::cout << c << std::endl;
}

Compiling and running it with

clang++-14 main.cpp -O2 && ./a.out

results in

-4.9896e+290

godbolt demo

Why has clang, since version 14, stopped optimizing an expression like this to zero? The non-zero value is already computed at compile time.

  • Interestingly, if you make `a` and `b` `constexpr` you get `0` as the value of `c`. – NathanOliver Mar 24 '23 at 13:18
  • clang14 did optimise: the result was calculated at compile time and stored in `.LCPI0_0`. But I guess it would take a language lawyer to tell which behaviour is correct (or maybe it's implementation-defined behaviour and one should look into the clang documentation somewhere). – Yksisarvinen Mar 24 '23 at 13:18
  • I assume this gets "optimized" at compile time to `fma(a, b, -a*b)`, which is not zero. Play around with different architectures and optimization options here: https://godbolt.org/z/7Ka5s8s59 – chtz Mar 24 '23 at 13:19
  • @Yksisarvinen (I'm not a lawyer, but) I think compilers are allowed to calculate expressions with an overall higher precision than the input arguments. This allows them to use `fma` (and it allowed them to use x87 `long double` before SSE math was standard). – chtz Mar 24 '23 at 13:21
  • Does this answer your question? [clang 14.0.0 floating point optimizations](https://stackoverflow.com/questions/73985098/clang-14-0-0-floating-point-optimizations) - tl;dr: `-ffp-contract=off` can be used to turn off FMA, which then produces the same result as clang 13: [godbolt](https://godbolt.org/z/7KjhT6EhK) – Turtlefight Mar 24 '23 at 13:29
  • If you're going to link to Godbolt, you have the asm right there. If you write a function that returns a `double`, it compiles to one instruction, so it is very easy to see that it's not doing any FP math at run time with either clang version. https://godbolt.org/z/f89dE66nc Unlike with `std::cout << c << '\n';`, or much worse your `std::endl`, where there's a whole bunch of asm burying the instruction that sets XMM0. [How to remove "noise" from GCC/clang assembly output?](https://stackoverflow.com/q/38552116) – Peter Cordes Mar 24 '23 at 13:49
  • Fun fact: `-ffp-contract=` `off` or `fast` both return 0; only the clang14 default of `on` returns non-zero here. @chtz: IIRC, the standard says temporaries within a single expression can be elided if computed with infinite precision, as in FMA, if `#pragma STDC FP_CONTRACT ON` is set ([Is floating point expression contraction allowed in C++?](https://stackoverflow.com/q/49278125)). Being allowed to use higher precision temporaries is a separate part of the standard, `FLT_EVAL_METHOD == 2` https://en.cppreference.com/w/cpp/types/climits/FLT_EVAL_METHOD . – Peter Cordes Mar 24 '23 at 13:52
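
The contraction described in the comments above can be reproduced by hand. What follows is only a minimal sketch, not clang's actual code path: the helper names `contracted_difference` and `separate_difference` are made up for illustration, and `volatile` is used purely on the assumption that it keeps the compiler from folding or fusing the operations. `std::fma(a, b, c)` computes `a * b + c` with a single rounding at the end, so when `c` is the negated, already-rounded product, the result is the rounding error of that product rather than zero.

```cpp
#include <cmath>

// Hypothetical helper: mimics what a contracted `a * b - a * b` computes.
// One product is rounded to double first; the other stays exact inside
// the fused multiply-add and is only rounded after the subtraction.
double contracted_difference(double a, double b) {
    volatile double rounded = a * b;  // rounded product (volatile is an assumption to block folding)
    double p = rounded;
    return std::fma(a, b, -p);        // exact a*b minus rounded a*b, rounded once
}

// Hypothetical helper: both products are rounded to double separately
// before the subtraction, as with -ffp-contract=off.
double separate_difference(double a, double b) {
    volatile double lhs = a * b;
    volatile double rhs = a * b;
    return lhs - rhs;                 // two identical rounded values
}
```

With the values from the question (`a = std::numeric_limits<double>::max()`, `b = -0.1`), the first helper should return a non-zero value, because the product cannot be represented exactly in a `double`, while the second returns `0`.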
