I have a list of latitude/longitude pairs, approximately 5 million rows. I tried the code below to create a 25 ft buffer around each point and assign a new Location Id to all the points that fall within that buffer. The only issue is performance: the nested loop compares each point against every later point, so it does not scale to a dataset this size. I am new to Python and to working with data this large, so any help is much appreciated!
import geopy.distance

# Sample_Data is a DataFrame with a Lat_Long column of (lat, long) pairs
Coord_List = Sample_Data.Lat_Long.values.tolist()
Coord_List_E = [""] * len(Coord_List)  # Location Id per point, "" = not assigned yet
k = 1
for i in range(len(Coord_List)):
    if Coord_List_E[i] == "":
        # Point i starts a new location; j == i is at distance 0, so i gets k as well
        for j in range(i, len(Coord_List)):
            if Coord_List_E[j] == "" and geopy.distance.distance(Coord_List[i], Coord_List[j]).ft <= 25:
                Coord_List_E[j] = k
        k += 1
print(Coord_List_E)
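
One direction that might avoid the all-pairs comparison is sketched below. It is not tested on the real data. It buckets the points into a grid whose cells are at least 25 ft wide, so each point is only compared against points in its own and the 8 neighbouring cells. Assumptions: each entry is a (lat, long) tuple in decimal degrees, the data does not cross the antimeridian or sit near the poles, and a haversine (spherical) distance is an acceptable stand-in for the geodesic distance from geopy (the two can differ by a fraction of a percent). The names haversine_ft and assign_location_ids are made up for this sketch.

```python
import math
from collections import defaultdict

EARTH_RADIUS_FT = 6_371_008.8 / 0.3048  # mean Earth radius, metres converted to feet


def haversine_ft(p, q):
    """Great-circle distance in feet between two (lat, long) pairs in degrees."""
    lat1, lon1 = math.radians(p[0]), math.radians(p[1])
    lat2, lon2 = math.radians(q[0]), math.radians(q[1])
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_FT * math.asin(math.sqrt(min(1.0, a)))


def assign_location_ids(coords, radius_ft=25.0):
    """Greedy grouping in input order, same idea as the nested loop above."""
    if not coords:
        return []
    # Cell height in degrees of latitude, padded by 1% as a safety margin.
    lat_cell = radius_ft * math.degrees(1.0 / EARTH_RADIUS_FT) * 1.01
    # Longitude degrees shrink with latitude, so size the cell width for the
    # highest latitude in the data (clamped away from the poles).
    max_abs_lat = min(max(abs(lat) for lat, _ in coords), 89.0)
    lon_cell = lat_cell / math.cos(math.radians(max_abs_lat))

    def cell(p):
        return (math.floor(p[0] / lat_cell), math.floor(p[1] / lon_cell))

    # Bucket point indices by grid cell.
    grid = defaultdict(list)
    for idx, p in enumerate(coords):
        grid[cell(p)].append(idx)

    ids = [None] * len(coords)
    next_id = 1
    for i, p in enumerate(coords):
        if ids[i] is not None:
            continue  # already absorbed into an earlier point's buffer
        ci, cj = cell(p)
        # Only the 3x3 block of cells around point i can hold points in range.
        for di in (-1, 0, 1):
            for dj in (-1, 0, 1):
                for j in grid.get((ci + di, cj + dj), ()):
                    if ids[j] is None and haversine_ft(p, coords[j]) <= radius_ft:
                        ids[j] = next_id
        next_id += 1
    return ids
```

With the data above this would be called as assign_location_ids(Coord_List). The work per point then depends on how many points share its neighbourhood rather than on the total row count, so it should help most when the points are spread out.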