I have two NumPy arrays of (x, y) coordinates. I want to find all points in the first array that are NOT in the second array. The coordinates are floating-point numbers. They should have only a few decimal digits (e.g. 1.25, not 1.123456), but they're the result of calculations, so floating-point imprecision is a factor.
Comments to this question state that an answer suitable to floating-point numbers is found here. But after inspecting the answers, it's not clear to me that any of them account for floating-point imprecision.
Right now my solution is this:
import numpy as np

a1 = np.array([[1.2, 2.3], [1.0, 1.1]])
a2 = np.array([[1.0, 1.1], [5.2, 2.2]])

a1_not_a2 = []
a2_set = set(tuple(point) for point in a2.round(decimals=5).tolist())
for point in a1.round(decimals=5).tolist():
    if tuple(point) not in a2_set:
        a1_not_a2.append(point)
But I'm not sure if my solution always works, and it's slow. I have two questions:
(1) Is comparing floats after round(decimals=5) guaranteed to produce correct output?
(2) Is there a better way to get my result? My arrays are huge, so using nested for loops with np.allclose is slow.
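To illustrate what worries me in (1), here is a minimal sketch with made-up values (x and y are hypothetical, not taken from my data, and the tolerance of 1e-8 is an arbitrary choice). Two values that differ by a tiny amount can sit on opposite sides of a rounding boundary, so I suspect rounding alone might not treat them as equal, while a tolerance-based check would:

```python
import numpy as np

# Hypothetical values: they differ by only 2e-9, but they sit on
# opposite sides of the rounding boundary at 5 decimals (0.000005).
x = 0.000004999
y = 0.000005001

rounded_x = np.round(x, decimals=5)
rounded_y = np.round(y, decimals=5)

# After rounding, the two values fall into different buckets, so a set
# lookup on the rounded tuples would treat them as different points.
rounded_equal = bool(rounded_x == rounded_y)

# A tolerance-based comparison (absolute tolerance chosen arbitrarily)
# treats them as the same value.
close = bool(np.isclose(x, y, rtol=0.0, atol=1e-8))

print(rounded_x, rounded_y, rounded_equal, close)
```

If this is right, rounding only hides imprecision when values are not near a rounding boundary, which is why I'm unsure whether my approach is safe in general.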