Is floating point division sufficient to determine bin number accurately?

Question

I'm trying to account for crossings of the prime meridian accurately and I ran into the following question about IEEE floating point arithmetic (round to nearest):

Let n be an integer, and d a small positive number. Does

y = n * 360 - d < n * 360

guarantee that floor(y/360) < n? Here all the operations (* - < / floor) are to be understood as floating operations (using e.g., double precision IEEE).

What about if 360 in this question is replaced by some other positive floating point number. (The same question arises whenever a floating point quantity is being assigned to evenly spaced bins.)

chux - Reinstate Monica · Answer 1 · 2014-11-04T21:26:08.637

0

n * 360 - d < n * 360 --> 0 - d < 0 --> d > 0 is true because "d (is) a ~~small~~ positive number".

The value of n is irrelevant so far.

y = n * 360 - d --> y/360 = n - d/360 -->

With 0.0 <= q < 1.0,
floor(y/360) + q = n - d/360 --> floor(y/360) - n = -q - d/360

For all values of q and d, -q - d/360 < 0 -->

floor(y/360) - n < 0 --> floor(y/360) < n. Q.E.D.

If 360 was replaced by x as any integer greater than 0, the answer is still the same. I think it is also true if x is replaced by any number >= 1.0. Have to think about 0 < x < 1.

The smallest of d is irrelevant so far - just that it is a positive number (d > 0).

edited Nov 04 '14 at 21:26

answered Nov 04 '14 at 21:20

chux - Reinstate Monica

143,097
13
135
256

Sorry, the inequality n * 360 - d < n * 360 was supposed to be interpreted as "what you would get if you did all the operations using floating point arithmetic". So for sufficiently small d, e.g., 1.0e-30, is inequality is only satisfied for n = 0 (using double). I will edit to question to clarify. – cffk Nov 04 '14 at 22:36
Let's jump to the higher level problem: "to account for crossings of the prime meridian accurately" In `C`, rather than use `y = n * 360 - d;` for some `n`, use `longitude = fmod(longitude, 360.0);` and suffer _no_ loss of precision regardless of `longitude`. [ref](http://stackoverflow.com/questions/20928253/is-fmod-exact-when-y-is-an-integer) – chux - Reinstate Monica Nov 04 '14 at 22:44
Yes, I already use this (and I have to deal with the pesky problem that the result can lie in (-360,360)). In my current application, I need to determine which period of longitude I'm in, i.e., floor(longitude/360). – cffk Nov 04 '14 at 23:26
@cffk `remainder((longitude, 360.0)` will give answer `-180 <= longitude <= 180`. Also check `remquo()` will give the last 3 bits of the integer period. Perhaps those may help. – chux - Reinstate Monica Nov 05 '14 at 05:03
thanks, noted. But this still leaves my original question open. – cffk Nov 05 '14 at 14:17

cffk · Answer 2 · 2014-11-09T15:22:16.577

After some experimentation, I think I can provide a partial answer. Let me rephrase the question: Write a function

int bin(double x, double m)

which computes

int(floor(x/m))

exactly. Assume m is positive and that the result is in the range of int.

The first attempt is

int bin0(double x, double m) {
  return int(std::floor(x / m));
}

but this fails for the case m = 360.0 and x = -denorm_min (0 is returned instead of -1).

Since this failure is just for x close to zero, the second attempt is

int bin1(double x, double m) {
  int n = int(std::floor(x / m));
  return n == 0 && x < 0 ? -1 : n;
}

I believe this returns the exact answer provided that n * m is exactly representable as a double. For m = 360.0, this includes all n representable as a 32-bit integer. Am I right? A proof would be nice!

If this condition doesn't hold, e.g., m = 0.1, then the best I can come up with is

int bin2(double x, double m) {
  double z = std::fmod(x, m);
  return int(std::floor((x - z)/m + 0.5)) + (z < 0 ? -1 : 0);
}

Does this always return the correct result? Is there some "cleaner" solution?

ADDENDUM: In my application I only needed to get the parity of the bin number (even or odd). (My application is measuring the area of a geodesic polygon and I need to keep track of whether an edge encircles the pole an even or odd number of times.) So chux's suggestion to use remquo is a good one. Unfortunately (1) std::remquo requires C++11 and (2) more seriously, the glibc implementation of remquo is buggy; see this bug report. So I end up doing essentially

int binparity(real x, real m) {
  // return the parity of int(floor(x/m))
  x = std::fmod(x, 2 * m);
  return (x >= 0 && x < m) || x < -m ? 0 : 1
}

(re: 2nd attempt) Let's assume `m >= 1.0`: If `x/m` does not underflow to 0.0, `return int(std::floor(x / m)` obviously works. If `x >= +0.0`, is works too. The only case left is when `x < 0.0` and `x/m` underflows. Code's `n == 0 && x < 0 ? -1 : n;` takes care of that. The issues becomes more complicated if `m < 1.0`. Suggest stating the range of `m`. Do you care about `-0.0`? Your approach returns 0. Alternative when `m >= 1.0`: `double q = x/m; return floor(q ? q, x);` — chux - Reinstate Monica, Nov 05 '14 at 16:18
Have doubt `int(std::floor((x - z)/m + 0.5)) ...` works in corner cases due to inexact quotient in `(x - z)/m + 0.5)`. — chux - Reinstate Monica, Nov 05 '14 at 16:20
Maybe `int bin1x(double x, double m) { double q = x/m; return (int) floor(q ? q, -(x<0.0)); }` for any `m > 0`. — chux - Reinstate Monica, Nov 05 '14 at 16:30

Is floating point division sufficient to determine bin number accurately?

2 Answers2