5

I read this in a popular book about computer graphics,

There are many numeric computations that become much simpler if the programmer takes advantage of the IEEE rules. For example, consider the expression:

a = 1 / (1/b + 1/c)

Such expressions arise with resistors and lenses. If divide-by-zero resulted in a program crash (as was true in many systems before IEEE floating-point), then two if statements would be required to check for small or zero values of b or c. Instead, with IEEE floating-point, if b or c is zero, we will get a zero value for a as desired.

But what about the case where b=+0 and c=-0? Then a=1/inf-inf=nan. Is the book wrong about this optimizations, or have I misunderstood something? It seems like we'll still need at least one check for the signs of b & c rather than no checks at all.

Edit One suggestion in the comments was just to do a post-check for NaN. Is that the idiomatic way to "take advantage" of these number type?

bool is_nan(float x) { return x != x; }

float optic_thing(float b, float c) {
  float result = 1.0f / (1.0f/b + 1.0f/c);
  if (is_nan(result)) result = 0;
  return result;
}
dune.rocks
  • 497
  • 3
  • 11

1 Answers1

2

But what about the case where b=+0 and c=-0?

Convert -0.0 to +0.0 without branching by adding 0.0.

int main(void) {
  double b = +0.0;
  double c = -0.0;
  printf("%e\n", b);
  printf("%e\n", c);
  printf("%e\n", 1.0/(1.0/b + 1.0/c));
  b += 0.0;
  c += 0.0;
  printf("%e\n", 1.0/(1.0/b + 1.0/c));
  return 0;
}

Output

0.000000e+00
-0.000000e+00
nan
0.000000e+00

[Edit] On the other hand, any values near 0.0, but not 0.0, are likely numeric artifacts and not accurate resistance values. The above still has trouble with tiny value pairs like b = DBL_TRUE_MIN; c = -b; The issue is 1.0/tiny --> Infinity and +Infinity + -Infinity --> NAN. Could resort to either using wider floating point for the quotient calculation or narrow the operands.

  b += 0.0;
  c += 0.0;
  printf("%Le\n", 1.0/(1.0L/b + 1.0L/c));

  // Lose some precision, but we are talking about real resistors/lenses here.
  b = (float) b + 0.0f;
  c = (float) c + 0.0f;
  printf("%e\n", 1.0/(1.0/b + 1.0/c));
chux - Reinstate Monica
  • 143,097
  • 13
  • 135
  • 256
  • 1
    Adding +zero is indeed the canonical way of "cleaning up" undesired instances of -zero. It can also be combined with multiplication if necessary, e.g. `fma (a, b, +0.0)`. Of course this all assumes that one works with a compiler (and compiler switches) that respect proper IEEE-754 behavior in the first place (i.e. doesn't optimize out additions involving zero). – njuffa Sep 02 '16 at 02:40
  • @njuffa True about "respect proper IEEE-754". Fortunately OP's post is tagged and titled IEEE 754. BTW: You may enjoy [+0.0 and -0.0 give different arithmetic results](http://stackoverflow.com/q/25332133/2410359) which is somewhat the opposite of what OP is trying to do. – chux - Reinstate Monica Sep 02 '16 at 02:44
  • I am apparently aware of the question and your answer (now community wiki), because I see that I upvoted the answer. Since I didn't comment or edit the answer I assume that I was unable to think of an example not already mentioned. – njuffa Sep 02 '16 at 02:47