Why does -fno-signed-zeros alone enable optimization, for which seemingly also -ffinite-math-only is needed (gcc)

Question

Nothing in the man pages suggests that -fno-signed-zeros implies -ffinite-math-only:

-fno-signed-zeros

Allow optimizations for floating point arithmetic that ignore the signedness of zero. IEEE arithmetic specifies the behavior of distinct +0.0 and -0.0 values, which then prohibits simplification of expressions such as x+0.0 or 0.0*x (even with -ffinite-math-only). This option implies that the sign of a zero result isn't significant.

The default is -fsigned-zeros.

However, there are observations which could be explained if it were the case. Problems in my code boil down to the following somewhat silly example:

#include <complex>

std::complex<double> mult(std::complex<double> c, double im){
    std::complex<double> jomega(0.0, im);
    return c*jomega;
}

The compiler would be tempted to optimize the multiplication c*jomega to something like c={-omega*c.imag(), omega*c.real()} (writing omega for the parameter im). However, IEEE 754 compliance and at least the following corner cases prevent it:

A) signed zeros, e.g. omega=-0.0, c={0.0, -0.0}:

 (c*jomega).real() = 0.0*0.0-(-0.0)*(-0.0) =  0.0
 -c.imag()*omega   = -(-0.0)*(-0.0)        = -0.0  //different!

B) infinities, e.g. omega=0.0, c={inf, 0.0}:

 (c*jomega).real() = inf*0.0-0.0*0.0 =  nan
 -c.imag()*omega   = -(0.0)*(0.0)    = -0.0     //different!

C) nans, e.g. omega=0.0, c={nan, 0.0}:

 (c*jomega).real() = nan*0.0-0.0*0.0 =  nan
 -c.imag()*omega   = -(0.0)*(0.0)    = -0.0    //different!

That means we have to use both -ffinite-math-only (for B and C) and -fno-signed-zeros (for A) in order to allow the above optimization.
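The three corner cases can be checked with plain double arithmetic. The following is a minimal sketch, not the code gcc generates: the helper names real_full, real_simplified and check_corner_cases are hypothetical, and it assumes compilation without -ffast-math, -fno-signed-zeros or -ffinite-math-only, so that IEEE 754 semantics apply.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Real part of c * (0 + i*omega) via the textbook formula:
// re = c_re*0.0 - c_im*omega
double real_full(double c_re, double c_im, double omega) {
    return c_re * 0.0 - c_im * omega;
}

// Real part after dropping the c_re*0.0 term (the simplified form)
double real_simplified(double c_im, double omega) {
    return -c_im * omega;
}

// Hypothetical self-check covering cases A, B and C from the text
void check_corner_cases() {
    const double inf = std::numeric_limits<double>::infinity();
    const double nan = std::numeric_limits<double>::quiet_NaN();

    // A) signed zeros: the full formula gives +0.0, the simplified one -0.0
    assert(!std::signbit(real_full(0.0, -0.0, -0.0)));
    assert(std::signbit(real_simplified(-0.0, -0.0)));

    // B) infinity: inf*0.0 is nan, which the simplified form never computes
    assert(std::isnan(real_full(inf, 0.0, 0.0)));
    assert(!std::isnan(real_simplified(0.0, 0.0)));

    // C) nan: nan*0.0 is nan and propagates only in the full formula
    assert(std::isnan(real_full(nan, 0.0, 0.0)));
    assert(!std::isnan(real_simplified(0.0, 0.0)));
}
```

Calling check_corner_cases() from main should pass silently under default flags. The std::signbit test is used for case A because 0.0 == -0.0 compares equal.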

However, even with only -fno-signed-zeros on, gcc performs the above optimization, if I read the resulting assembly correctly (or see the listings below for the effects):

mult(std::complex<double>, double):
        mulsd   %xmm2, %xmm1
        movapd  %xmm0, %xmm3
        mulsd   %xmm2, %xmm3
        movapd  %xmm1, %xmm0
        movapd  %xmm3, %xmm1
        xorpd   .LC0(%rip), %xmm0
        ret
.LC0:
        .long   0
        .long   -2147483648
        .long   0
        .long   0

My first thought was that this could be a bug, but all recent gcc versions I have at hand produce the same result, so I'm probably missing something.

Thus my question: why does gcc perform the above optimization with only -fno-signed-zeros on and without -ffinite-math-only?

Listings:

mult.cpp (a separate translation unit, to avoid precalculation at compile time):

#include <complex>

std::complex<double> mult(std::complex<double> c, double im){
       std::complex<double> jomega(0.0, im);
       return c*jomega;
}

main.cpp:

#include <complex>
#include <iostream>
#include <cmath>

std::complex<double> mult(std::complex<double> c, double im);


int main(){
     //(-nan,-nan) expected:
     std::cout<<"case INF: "<<mult(std::complex<double>(INFINITY,0.0), 0.0)<<"\n";

     //(nan,nan) expected:
     std::cout<<"case NAN: "<<mult(std::complex<double>(NAN,0.0),  0.0)<<"\n"; 
}

Compile and run:

>>> g++ main.cpp mult.cpp -O2 -fno-signed-zeros -o mult_test
>>> ./mult_test
case INF: (-0,-nan)   //unexpected!
case NAN: (-0,nan)    //unexpected!

I know IEEE754 is a large standard, but does it cover complex numbers? Or are you inferring the behavior of complex numbers from an assumed implementation as a pair of doubles? Because "the sign of zero" is very much a property of real numbers; with complex numbers the corresponding property would be the phase of 0. — MSalters, Mar 13 '18 at 10:13
@Msalters If `a` and `b` are complex, then `(a*b).real()=a.real()*b.real()-a.imag()*b.imag()` and every operation is a double-operation, thus IEEE754 applies. — ead, Mar 13 '18 at 10:17
That's what I mean by "assuming a pair of doubles". You are assuming a specific implementation. And I just checked, IEEE754-2008 does not define complex arithmetic rules. Now C++11 does define storage rules, the values must be stored as a pair of doubles, but storage does not strictly define arithmetic. — MSalters, Mar 13 '18 at 10:49
@MarcGlisse I don't know, whether this is a bug. And the example is minimal, with `double` the behavior is as expected: https://godbolt.org/g/AJmWao — ead, Mar 15 '18 at 07:58
(Using the C _Complex would be more minimal) The transformation happens in the complex lowering pass. It lowers complex multiplication to scalar operations, and has special code for the imaginary-only case instead of relying on scalar optimizations to clean it up. — Marc Glisse, Mar 15 '18 at 08:10
@MarcGlisse I don't understand exactly what you are saying, but no matter what I use (c++-complex or gcc _Complex) the result is the same: https://godbolt.org/g/WjHJn8 (as I would expect) — ead, Mar 15 '18 at 08:17

score 4 · Accepted Answer · answered Mar 29 '18 at 20:26

It was a misconception on my side that complex multiplication is defined the same way it is taught in school.

Basically, the C++ standard isn't concerned with the details of complex multiplication, so the C standard probably has to be consulted. Complex numbers have been part of the C standard only since C99 (Annex G), which still does not define all results of complex multiplication uniquely.

The most important definitions are:

a complex number is zero when both parts are zero (0.0 or -0.0).
a complex number is finite when both parts are finite and not nans.
a complex number is infinite when real or imaginary (or both) parts are inf or -inf (even if the other one is nan).

It is not defined what a complex nan is, so if one part is nan, we can consider the complex number to be nan (as long as there is no infinite part).
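These definitions can be written down as small predicates. This is a sketch with hypothetical helper names (is_zero, is_finite, is_infinite, is_nan_like); the last one encodes the convention suggested above, not a definition from the standard.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <limits>

// Zero: both parts are zero (0.0 == -0.0, so either sign qualifies)
bool is_zero(std::complex<double> z) {
    return z.real() == 0.0 && z.imag() == 0.0;
}

// Finite: both parts are finite (std::isfinite rejects inf and nan)
bool is_finite(std::complex<double> z) {
    return std::isfinite(z.real()) && std::isfinite(z.imag());
}

// Infinite: at least one part is +-inf, even if the other is nan
bool is_infinite(std::complex<double> z) {
    return std::isinf(z.real()) || std::isinf(z.imag());
}

// Convention from the text (not from the standard):
// some part is nan and no part is infinite
bool is_nan_like(std::complex<double> z) {
    return !is_infinite(z) && (std::isnan(z.real()) || std::isnan(z.imag()));
}

// Hypothetical self-check of the classification
void check_classification() {
    const double inf = std::numeric_limits<double>::infinity();
    const double nan = std::numeric_limits<double>::quiet_NaN();
    assert(is_zero(std::complex<double>(-0.0, 0.0)));
    assert(is_infinite(std::complex<double>(inf, nan)));
    assert(!is_nan_like(std::complex<double>(inf, nan)));
    assert(is_nan_like(std::complex<double>(nan, 1.0)));
}
```

Note that (inf, nan) counts as infinite rather than nan, which is the case the school formula tends to get wrong.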

The standard goes on to say that the school multiplication should hold for most cases, but also that

if one operand is an infinity and the other operand is a nonzero finite number or an infinity, then the result of the operator is an infinity;

That means for example, that (1.0+0j)*(inf+inf*j) should be infinite (inf+inf*j would probably make most sense), but not nan+nan*j as it would be the case for the usual formula.
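To see why the usual formula falls short here, it can be evaluated by hand. The sketch below uses a hypothetical helper school_mult that applies the school formula directly on the two parts; it assumes default floating-point flags (no -ffast-math or -ffinite-math-only).

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <limits>

// School formula: (a+bi)*(c+di) = (ac - bd) + (ad + bc)i,
// evaluated part by part with no special-case handling
std::complex<double> school_mult(std::complex<double> x, std::complex<double> y) {
    return { x.real() * y.real() - x.imag() * y.imag(),
             x.real() * y.imag() + x.imag() * y.real() };
}

// Hypothetical self-check for the example (1.0+0j)*(inf+inf*j)
void check_school_formula() {
    const double inf = std::numeric_limits<double>::infinity();
    const std::complex<double> a(1.0, 0.0);
    const std::complex<double> b(inf, inf);
    const std::complex<double> r = school_mult(a, b);

    // re = 1*inf - 0*inf = inf - nan = nan
    assert(std::isnan(r.real()));
    // im = 1*inf + 0*inf = inf + nan = nan
    assert(std::isnan(r.imag()));
}
```

The 0*inf terms are what poison both parts. What the library operator a*b returns for the same inputs depends on the implementation and the compiler flags, so it is not asserted here.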

There is more on this topic in my following SO-question.

Given that the compiler has some freedom in producing results, we can see that the implementation used, via __muldc3, differs from the simplified school formula only when signed zeros are taken into account, i.e. (-0,-0) vs. (0,-0) and so on (see the listing of the test program further below or see it here live).

That means that gcc's behavior is OK, because it only exploits results that the standard leaves unspecified. One could argue that this is a missed optimization in clang.

NB: There is also a "bug-report": https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84891

#include <complex>
#include <iostream>
#include <cmath>
#include <cfloat>
#include <vector>


// classify a complex number: 2 = infinite, 1 = nan (no infinite part), 0 = ordinary
int get_type(std::complex<double> c){
  if(std::isinf(c.real()) || std::isinf(c.imag()))
       return 2;
  if(std::isnan(c.real()) || std::isnan(c.imag()))
       return 1;
  return 0;
}

void do_mult(double b, double c, double d){
     std::complex<double> school(-b*d, b*c);
     std::complex<double> f(0.0,b);
     std::complex<double> s(c,d);
     auto cstd=f*s;

     int type1=get_type(school);
     int type2=get_type(cstd);

     #ifdef INFINITE_MATH
                        //not special,    usual            
     if(type1!=type2 || (type1==0 &&  (cstd!=school))){
               std::cout<<"(0.0,"<<b<<")*("<<c<<","<<d<<")="<<school<<"vs."<<cstd<<"\n";
     }

     #endif

     #ifdef SIGNED_ZERO_MATH
                                                //       signed zero
     if(type1!=type2 || (type1==0 &&  (1.0/cstd.real()!=1.0/school.real() || 1.0/cstd.imag()!=1.0/school.imag() ))){
               std::cout<<"(0.0,"<<b<<")*("<<c<<","<<d<<")="<<school<<"vs."<<cstd<<"\n";
     }
     #endif
}

int main(){
       std::vector<double> numbers{0.0, -0.0, 1.0, INFINITY, -INFINITY, NAN, DBL_MAX, -DBL_MAX};
       for(double b: numbers)
         for(double c: numbers)
           for(double d: numbers)
               do_mult(b,c,d);
}

To build/run use:

g++ main.cpp -o main -std=c++11 -DINFINITE_MATH -DSIGNED_ZERO_MATH && ./main
