I am interested in efficiently searching two very long sorted lists to find the closest pair. I could of course write two for-loops, but that is very slow for long lists. From a bit of reading, it seems that NumPy arrays are particularly fast and are similar to MATLAB (where I have experience "vectorizing" code that uses for-loops).

One of the answers suggests a fast way of finding the nearest value in a NumPy array. So to do this for two lists, all I need to do is loop through the values of one list:

import math
import numpy as np

def find_nearest(array, value):
    # Return the element of the sorted array that is closest to value.
    idx = np.searchsorted(array, value, side="left")
    if idx > 0 and (idx == len(array) or math.fabs(value - array[idx-1]) < math.fabs(value - array[idx])):
        return array[idx-1]
    else:
        return array[idx]

x1 = np.array([1, 3, 4, 5, 19])
x2 = np.array([6, 18, 24, 36, 37])
dtarray = np.array([])
# Loop over x2 (not x1): each value of x2 is looked up in x1.
for i in range(x2.size):
    dtarray = np.append(dtarray, math.fabs(x2[i] - find_nearest(x1, x2[i])))

print(dtarray)

I've eliminated one for-loop, and now I'm interested in speeding this up further. It seems like list comprehensions will be useful for this task (and maybe I can figure out how to parallelize them), but I'm having trouble getting them to work:

dtarray2 = [math.fabs(x2[i] - find_nearest(x1, x2[i])) for i in range(x2.size)]

Is this syntax incorrect? I get the following error:

---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<timed exec> in <module>

<timed exec> in <listcomp>(.0)

NameError: name 'x2' is not defined
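
For reference, `np.searchsorted` also accepts an array of query values, so the per-element loop above can be replaced by a single vectorized call. The following is only a minimal sketch, not the method from the answer mentioned earlier: the helper name `nearest_distances` is hypothetical, and it assumes `sorted_ref` is non-empty, sorted in ascending order, and has a signed integer or float dtype (unsigned subtraction would wrap around).

```python
import numpy as np

def nearest_distances(sorted_ref, queries):
    """Distance from each query to its nearest value in sorted_ref.

    Hypothetical helper; assumes sorted_ref is non-empty and ascending.
    """
    sorted_ref = np.asarray(sorted_ref)
    queries = np.asarray(queries)
    # Insertion index for every query at once.
    idx = np.searchsorted(sorted_ref, queries, side="left")
    # Candidate neighbours on each side, clipped to valid indices.
    right = np.clip(idx, 0, sorted_ref.size - 1)
    left = np.clip(idx - 1, 0, sorted_ref.size - 1)
    d_left = np.abs(queries - sorted_ref[left])
    d_right = np.abs(queries - sorted_ref[right])
    # Keep the smaller of the two candidate distances.
    return np.minimum(d_left, d_right)

x1 = np.array([1, 3, 4, 5, 19])
x2 = np.array([6, 18, 24, 36, 37])
dt = nearest_distances(x1, x2)
print(dt)             # distances 1, 1, 5, 17, 18
closest = int(np.argmin(dt))
print(x2[closest])    # the x2 value belonging to the closest pair (6)
```

With this sketch, `np.argmin(dt)` gives the index in `x2` of the element that belongs to the closest pair, and the whole lookup costs O(m log n) for m queries against n reference values, with no Python-level loop.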
Steven Sagona
  • Please provide the full error traceback, as that will tell where exactly the error is occurring – G. Anderson Oct 21 '19 at 20:06
  • Cannot reproduce – Dani Mesejo Oct 21 '19 at 20:08
  • I posted the error message...but when I reran the code it appears to be gone. I'm not really sure what happened... Additionally, I see that there are duplicate questions that cover what I am trying to solve, so maybe I should delete this question? – Steven Sagona Oct 21 '19 at 20:17
  • @StevenSagona If you think that the question has some content that's new on top of what's been discussed in the linked Q&As and could be useful to future readers, keep it. Just a piece of advice. – Divakar Oct 21 '19 at 20:26
