
Goal

I want to apply "relative" rounding to the elements of a numpy array. Relative rounding here means rounding to a given number of significant figures, where I do not care whether these are decimal or binary figures.

Suppose we are given two arrays a and b so that some elements are close to each other. That is,

np.isclose(a, b, tolerance) 

has some True entries for a given relative tolerance. Suppose that we know that all entries that are not equal within the tolerance differ by a relative difference of at least 100*tolerance. I want to obtain some arrays a2 and b2 so that

np.all(np.isclose(a, b, tolerance) == (a2 == b2))

My idea is to round the arrays to an appropriate significant digit:

a2 = relative_rounding(a, precision)
b2 = relative_rounding(b, precision)

However, it does not matter whether the numbers are rounded or floored, as long as the goal is achieved.

An example:

a = np.array([1.234567891234, 2234.56789123, 32.3456789123])
b = np.array([1.234567895678, 2234.56789456, 42.3456789456])

# desired output
a2 = np.array([1.2345679, 2234.5679, 32.345679])
b2 = np.array([1.2345679, 2234.5679, 42.345679])
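For reference, a naive decimal implementation of such a `relative_rounding` helper (the name is mine, not a numpy function) that produces the desired output could look like this; it works, but relies on comparatively expensive log/power operations:

```python
import numpy as np

def relative_rounding(x, digits):
    """Round each element of x to `digits` significant decimal figures.

    Caveat: does not handle zeros (np.log10(0) is -inf).
    """
    x = np.asarray(x, dtype=np.float64)
    # Scale each element so the desired digits sit left of the decimal point,
    # round, then scale back.
    mags = 10.0 ** (digits - 1 - np.floor(np.log10(np.abs(x))))
    return np.round(x * mags) / mags

a = np.array([1.234567891234, 2234.56789123, 32.3456789123])
print(relative_rounding(a, 8))
```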

Motivation

The purpose of this exercise is to allow me to work with clearly defined results of binary operations so that little errors do not matter. For example, I want that the result of np.unique is not affected by imprecisions of floating point operations.

You may suppose that the error introduced by the floating point operations is known/can be bounded.

Question

I am aware of similar questions concerning rounding to a given number of significant figures with numpy and the respective solutions. Though those answers may be sufficient for my purposes, I think there should be a simpler and more efficient solution to this problem: since floating point numbers have "relative precision" built in, it should be possible to just set the n least significant bits of the mantissa to 0. This should be even more efficient than the usual rounding procedure. However, I do not know how to implement that with numpy. It is essential that the solution is vectorized and more efficient than the naive way. Is there a direct way of manipulating the binary representation of an array in numpy?
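To make the bit-manipulation idea concrete: assuming IEEE-754 float64, the low mantissa bits can be cleared vectorized through an integer view of the array. The helper name `clear_mantissa_bits` is invented here for illustration:

```python
import numpy as np

def clear_mantissa_bits(x, n):
    """Zero out the n least significant bits of each float64 mantissa.

    This truncates the magnitude of each element (a form of relative
    flooring); it does not round to nearest.
    """
    x = np.ascontiguousarray(x, dtype=np.float64)
    bits = x.view(np.int64)
    mask = np.int64(-1) << np.int64(n)  # all-ones with the low n bits cleared
    return (bits & mask).view(np.float64)

a = np.array([1.234567891234, 2234.56789123, 32.3456789123])
print(clear_mantissa_bits(a, 26))
```

Clearing n of the 52 mantissa bits perturbs each value by a relative amount below 2**-(52-n). As the answer below explains, however, no such per-array operation can map all `isclose` pairs onto exactly equal values.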

Samufi
  • *"...it should be possible to just set the n least significant binary values in the mantissa to 0."* Yes, that is possible, and easy. But doing that does not satisfy your requirement that "if two elements a[i] and b[i] are close together, the rounded versions a2[i] and b2[i] shall be equal" for all possible a[i] and b[i]. – Warren Weckesser Jul 09 '19 at 02:21
  • For example, suppose you are rounding to just one significant digit. There is a set of floating point values that round to 1, and another set that round to 2. The boundary between these is at 1.5. By almost any definition of *close*, the values 1.5 - eps and 1.5 + eps, where eps is the machine precision (i.e. the floating point spacing), are *close*. But they round to different values. – Warren Weckesser Jul 09 '19 at 02:21
  • @WarrenWeckesser I have updated the question to be more precise. I know that different numbers differ by far more than the rounding radius. – Samufi Jul 09 '19 at 04:03
  • It looks like your example rounds to 8 digits. Suppose, in addition to the values that you show, `a` contains `12345678.499999`, and `b` contains `12345678.500001`. What should the corresponding values in `a2` and `b2` be? (If you use floor instead of round, then the same question can be asked about `12345678.99999` and `12345679.00000`.) – Warren Weckesser Jul 09 '19 at 06:04
  • Instead of trying to define a "relative round" function that acts on one array at a time, perhaps something like this would work: `a2 = a.copy(); b2 = b.copy(); a2[np.isclose(a, b, tolerance)] = b2[np.isclose(a, b, tolerance)]`. No rounding is done, but for the pairs in `a` and `b` that were close, the corresponding pairs in `a2` and `b2` are equal. – Warren Weckesser Jul 09 '19 at 06:15
  • I get your point. Your suggestion is nice, but I would need that for `np.unique`. However, others have had this problem before: https://stackoverflow.com/questions/5426908/find-unique-elements-of-floating-point-array-in-numpy-with-comparison-using-a-d – Samufi Jul 09 '19 at 07:44

1 Answer


This is impossible, except for special cases such as a precision of zero (isclose becomes equivalent to ==) or infinity (all numbers are close to each other).

numpy.isclose is not transitive. We may have np.isclose(x, y, precision) and np.isclose(y, z, precision) but not np.isclose(x, z, precision). (For example, 10 and 11 are within 10% of each other, and 11 and 12 are within 10% of each other, but 10 and 12 are not within 10% of each other.)
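The 10/11/12 example can be checked directly:

```python
import numpy as np

# At 10 % relative tolerance, each neighboring pair is close,
# but the outer pair is not: isclose is not transitive.
print(np.isclose(10, 11, rtol=0.1))  # True
print(np.isclose(11, 12, rtol=0.1))  # True
print(np.isclose(10, 12, rtol=0.1))  # False
```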

Given the above isclose relations for x, y, and z, the requested property would require that x2 == y2 and y2 == z2 be true but that x2 == z2 be false. However, == is transitive, so x2 == y2 and y2 == z2 implies x2 == z2. Thus, the requested function would require x2 == z2 to be both true and false, and hence it is impossible.

Eric Postpischil
  • (a) I did not critique the motivation. (b) The requested function is impossible on a practical level, except for special situations with particular designs. With just a few operations, the rounding errors in floating-point arithmetic can accumulate to very large relative errors, producing results that fall within the scenario described in this answer. Thus, the impossibility described is not just one that appears only in theory or only in rare circumstances but will appear in practical situations. – Eric Postpischil Jul 09 '19 at 01:54