2

I've found an interesting floating point problem. I have to calculate several square roots in my code, and the expression is like this:

sqrt(1.0 - pow(pos,2))

where pos goes from -1.0 to 1.0 in a loop. The -1.0 is fine for pow, but when pos=1.0, I get an -nan. Doing some tests, using gcc 4.4.5 and icc 12.0, the output of

1.0 - pow(pos,2) = -1.33226763e-15

and

1.0 - pow(1.0,2) = 0

or

poss = 1.0
1.0 - pow(poss,2) = 0

Where clearly the first one is going to give problems, being negative. Anyone knows why pow is returning a number smaller than 0? The full offending code is below:

int main() {
  double n_max = 10;
  double a = -1.0;
  double b = 1.0;
  int divisions = int(5 * n_max);
  assert (!(b == a));

  double interval = b - a;
  double delta_theta = interval / divisions;
  double delta_thetaover2 = delta_theta / 2.0;
  double pos = a;
  //for (int i = 0; i < divisions - 1; i++) {
   for (int i = 0; i < divisions+1; i++) {

    cout<<sqrt(1.0 - pow(pos, 2)) <<setw(20)<<pos<<endl;

     if(isnan(sqrt(1.0 - pow(pos, 2)))){
      cout<<"Danger Will Robinson!"<<endl;
      cout<< sqrt(1.0 - pow(pos,2))<<endl;
      cout<<"pos "<<setprecision(9)<<pos<<endl;
      cout<<"pow(pos,2) "<<setprecision(9)<<pow(pos, 2)<<endl;
      cout<<"delta_theta "<<delta_theta<<endl;
      cout<<"1 - pow "<< 1.0 - pow(pos,2)<<endl;
      double poss = 1.0;
      cout<<"1- poss "<<1.0 - pow(poss,2)<<endl;


  }

  pos += delta_theta;

}

 return 0;
 }
Ivan
  • 19,560
  • 31
  • 97
  • 141

4 Answers4

8

When you keep incrementing pos in a loop, rounding errors accumulate and in your case the final value > 1.0. Instead of that, calculate pos by multiplication on each round to only get minimal amount of rounding error.

Tronic
  • 10,250
  • 2
  • 41
  • 53
  • This answer is good. @Ivan: Try printing the value of pos throughout the loop, as this answer states it will go above 1 resulting in your issue. – DeusAduro Dec 15 '10 at 18:18
  • +1. One of the first things my teachers did to demonstrate floating point inaccuracy was to have this loop: `for(float x = 0.0; x != 1.0; x += .1);`. I'll leave it as an exercise to the reader as to whether this loop terminates. – jdmichal Dec 15 '10 at 18:20
  • What do you mean calculating pos by multiplication? – Ivan Dec 15 '10 at 18:22
  • I see, it is indeed getting above the top. Subtle is the lord of floating point numbers ;-) – Ivan Dec 15 '10 at 18:23
  • 2
    Even with multiplication there is still the risk that 1 - pow(1, 2) will be negative. – Alexandre C. Dec 15 '10 at 18:28
5

The problem is that floating point calculations are not exact, and that 1 - 1^2 may be giving small negative results, yielding an invalid sqrt computation.

Consider capping your result:

double x = 1. - pow(pos, 2.);
result = sqrt(x < 0 ? 0 : x);

or

result = sqrt(abs(x) < 1e-12 ? 0 : x);
Alexandre C.
  • 55,948
  • 11
  • 128
  • 197
  • 1
    +1 -- should probably use `std::numeric_limits::epsilon` instead of the constant `1e-12`. – Billy ONeal Dec 15 '10 at 19:38
  • yeah... but the calculation should be exact. All values and temporaries are exactly representable by `double`. – Inverse Dec 15 '10 at 19:56
  • 1
    @Billy: To the contrary. I should use "a few times" the machine epsilon, because I have no control on the roundoff error (this would involve ad hoc subtle analysis and knowledge of the details of `pow`). I chose 1e-12, but this is largely problem dependent, and I never feel ashamed of using such magic constants in real production code (including "magic * machine_epsilon"). Too much generality kills you when you do numerics, and you should tailor code on what you do exactly. An alternative would have been `abs(x) < 1e-12 * some_problem_variable`, it depends on what you want. – Alexandre C. Dec 15 '10 at 19:57
  • @Aelxandre: Good point -- probably need a few times that. Just would be nice if you're unlucky enough to port to a platform where `double` isn't 80 bits wide like it is on the x86 to not have to go around modifying code. A few times epsilon should do the trick. – Billy ONeal Dec 15 '10 at 19:59
  • @inverse: **never** rely on that. – Alexandre C. Dec 15 '10 at 19:59
  • I'm having a hard time picturing how pow could be coded such that 1^2 isn't equal to 1 (produces the expected result: http://ideone.com/kIzEq ) – Inverse Dec 16 '10 at 03:20
  • @inverse: for `pow(1, 2)` surely, but this is so particular – Alexandre C. Dec 16 '10 at 08:10
2

setprecision(9) is going to cause rounding. Use a debugger to see what the value really is. Short of that, at least set the precision beyond the possible size of the type you're using.

Edward Strange
  • 40,307
  • 7
  • 73
  • 125
0

You will almost always have rounding errors when calculating with doubles, because the double type has only 15 significant decimal digits (52 bits) and a lot of decimal numbers are not convertible to binary floating point numbers without rounding. The IEEE standard contains a lot of effort to keep those errors low, but by principle it cannot always succeed. For a thorough introduction see this document

In your case, you should calculate pos on each loop and round to 14 or less digits. That should give you a clean 0 for the sqrt.

You can calc pos inside the loop as

pos = round(a + interval * i / divisions, 14);

with round defined as

double round(double r, int digits) 
{
    double multiplier = pow(digits,10);
    return floor(r*multiplier + 0.5)/multiplier;
}
TToni
  • 9,145
  • 1
  • 28
  • 42
  • That round function doesn't work for negative numbers. – Billy ONeal Dec 15 '10 at 19:40
  • false. Rounding error are not limited to 1e-15 (even relative error). Consider 10000001 - 10000000 for instance. You have then 1e-8 relative error (assuming doubles) – Alexandre C. Dec 15 '10 at 19:59
  • Right, that e-15 is true only for values around 1. I'll correct that. The rest still holds though. – TToni Dec 15 '10 at 20:43
  • @Billy: Sure it does. Depends on what kind of rounding you want. In this special case its insignificant if you use symmetric rounding, banking style rounding or whatever. See the discussion here: http://stackoverflow.com/questions/485525/round-for-float-in-c – TToni Dec 15 '10 at 20:45
  • @TToni: -0.5 should be rounded to -1, not 0, as it is here. At least with what mathematicians usually refer to as rounding. Simple matter to fix -- store the sign of the number, convert to positive, use this round, add the sign back in. Should be doable without any loss of precision. – Billy ONeal Dec 15 '10 at 21:02
  • @Billy: Is there a special reason why you think that -0.5 should be -1 instead of 0? Yours is the symmetric rounding case, but that's not always best depending on what you want and in this particular case its irrelevant anyway. – TToni Dec 15 '10 at 22:52