I have a 2-dimensional NumPy array. Each row is sorted and contains a small number of elements (like 10), but there are a large number of rows (like 1e6). It might look something like this:
haystacks = [
[1, 4, 7, 8],
[2, 5, 5, 7],
[10, 11, 25, 30],
...
]
I also have a one-dimensional array. This array has as many elements as the first array has rows. So, maybe:
needles = [10, 6, 15 ...]
I want to perform binary search for each element in the 1d array on the corresponding row in the 2d array. I would use np.searchsorted
, but it doesn't seem to support this use-case.
I am using this in a large simulation of a physical system. So, performance is extremely important. The following code works, but it is too slow.
positions = []
for needle, haystack in zip(needles, haystacks):
positions.append(np.searchsorted(haystack, needle))
print(positions)
NumPy solution is preferred. Other libraries are okay, but I am having trouble getting Numba working.
Anyone have any ideas?