I have a large number of coordinates in two arrays (HLat22 and HLong22), and I also have a LineString. The output is a boolean array of True/False values marking which coordinates in HLat22/HLong22 lie within a certain threshold (0.005 in my example) of the coordinates on my LineString. In the example below, the coordinates at the 4th position are near my LineString.
For the filtering I've used the function from this post: Selecting close matches from one array based on another reference array
def searchsorted_filter(a, b, thresh):
    choices = np.sort(b) # if b is already sorted, skip it
    lidx = np.searchsorted(choices, a, 'left').clip(max=choices.size-1)
    ridx = (np.searchsorted(choices, a, 'right')-1).clip(min=0)
    cl = np.take(choices,lidx) # Or choices[lidx]
    cr = np.take(choices,ridx) # Or choices[ridx]
    return a[np.minimum(np.abs(a - cl), np.abs(a - cr)) < thresh]
from shapely.geometry import LineString, Point, LinearRing
import time
import numpy as np
start_time = time.time()
HLat22 = np.asarray([100,200,300,32.47156,500,600,700,800,900,1000])
HLong22 = np.asarray([-100,-200,-300,-86.79192,-500,-600,-700,-800,-900,-1000])
polygon2 = LineString ([Point(-86.79191,32.47155), Point(-86.78679699999999,32.47005)])
#Getting lat and long coordinates
numpy_x = np.array(polygon2.coords.xy[0])
numpy_y = np.array(polygon2.coords.xy[1])
#Filtering so that only the coordinates near the LineString remain
The_X = searchsorted_filter(HLong22,numpy_x,thresh=0.005)
The_Y = searchsorted_filter(HLat22,numpy_y,thresh=0.005)
print("Secsfilter: %s",time.time()-start_time)
start_time = time.time()
indices = np.in1d(HLong22, The_X) & np.in1d(HLat22, The_Y)
print("Secsin1d: %s",time.time()-start_time)
Output:
Secsfilter: %s 0.002005338668823242
Secsin1d: %s 0.0
array([False, False, False, True, False, False, False, False, False, False], dtype=bool)
This works fine. However, with bigger inputs it gets slower. If HLat22/HLong22 have 1413917 elements each (same LineString), these are the timings:
Secsfilter: %s 0.20999622344970703
Secsin1d: %s 0.49498486518859863
The length of The_X and The_Y would be 15249.
My question is: is there any way to optimize this code and make it faster?
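One direction that might be worth trying, as a rough sketch that has not been benchmarked on the full data: have the filter return the boolean mask itself instead of the filtered values, so the two masks can be combined directly and the np.in1d step is no longer needed. The name searchsorted_mask is hypothetical, and the LineString vertices are written out as plain arrays here so the snippet runs without shapely.

```python
import numpy as np

def searchsorted_mask(a, b, thresh):
    """Return a boolean mask: True where a[i] is within thresh of some value in b."""
    choices = np.sort(b)  # if b is already sorted, this can be skipped
    lidx = np.searchsorted(choices, a, 'left').clip(max=choices.size - 1)
    ridx = (np.searchsorted(choices, a, 'right') - 1).clip(min=0)
    cl = choices[lidx]  # nearest candidate at or above a[i]
    cr = choices[ridx]  # nearest candidate at or below a[i]
    # Same test as searchsorted_filter, but the mask is returned instead of a[mask]
    return np.minimum(np.abs(a - cl), np.abs(a - cr)) < thresh

HLat22 = np.asarray([100, 200, 300, 32.47156, 500, 600, 700, 800, 900, 1000])
HLong22 = np.asarray([-100, -200, -300, -86.79192, -500, -600, -700, -800, -900, -1000])

# Vertices of the LineString from the question (x = longitude, y = latitude)
line_x = np.array([-86.79191, -86.78679699999999])
line_y = np.array([32.47155, 32.47005])

# Combine the two masks directly; no np.in1d lookup is required
indices = (searchsorted_mask(HLong22, line_x, 0.005)
           & searchsorted_mask(HLat22, line_y, 0.005))
print(indices)
```

On the small example this marks only the 4th position as True, the same as the np.in1d version. Whether it actually removes the bottleneck on the 1413917-element arrays would still need to be timed.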