I have a data frame of points. The first two columns are positions. I am filtering the data based on each point's proximity to the other points. I calculate the distances between all the points with cdist and then filter the result to find the indices of the points that are less than 0.5 apart. I also have to apply two small filters to these indices first. The first removes the indices that compare a point with itself: distance [n,n] is always zero, and I don't want to remove all of my points. The second removes mirrored comparisons, since distance [n,m] = distance [m,n]. That leaves roughly double the number of indices I need, so I use unique to filter out the duplicates.
My question: loc_find is a numpy array of indices of the rows that should be removed. How do I use this array to remove those rows from my pandas dataframe without iterating over the dataframe?
from scipy.spatial.distance import cdist
import numpy as np
import pandas as pd
# make points and calculate distances
east=data['easting'].values
north=data['northing'].values
points=np.vstack((east,north)).T
distances=cdist(points,points) # big row x row matrix
zzzz=np.where(distances<0.5)
# array of index pairs for points that are too close together and will be
# filtered; it still contains unwanted comparisons, such as data[1,1] with
# data[1,1] (always zero, since it is the same point), and distance [1,2]
# is the same as distance [2,1]
loc_dist=np.vstack((zzzz[0],zzzz[1])).T
# my code for filtering the indices
loc_dist=loc_dist.astype('int')
# find the index pairs that compare a point with itself: distance [n,n] is always zero
diff_loc=zzzz[0]-zzzz[1]
diff_zero=np.where(diff_loc==0)
loc_dist_s=np.delete(loc_dist, diff_zero[0],axis=0)
# remove duplicate indices from mirrored comparisons: distance [n,m] = distance [m,n]
loc_find=np.unique(loc_dist_s)
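To make this reproducible, here is a minimal sketch of the same filtering on a small made-up data frame. The sample values and the `value` column are hypothetical and not from my real data. It uses the strict upper triangle of the distance matrix (`np.triu` with `k=1`) as a shortcut for both filters, and the last line shows one approach that seems to do what I am after, assuming `loc_find` holds row positions rather than index labels.

```python
import numpy as np
import pandas as pd
from scipy.spatial.distance import cdist

# Made-up sample data (hypothetical values, only for illustration).
data = pd.DataFrame({
    "easting":  [0.0, 0.1, 5.0, 5.2, 9.0],
    "northing": [0.0, 0.1, 5.0, 5.1, 9.0],
    "value":    [10, 11, 12, 13, 14],
})

points = data[["easting", "northing"]].to_numpy()
distances = cdist(points, points)  # row x row distance matrix

# Keep only the strict upper triangle (k=1): this skips the diagonal [n, n]
# and the mirrored pairs [m, n] in a single step.
close = np.triu(distances < 0.5, k=1)
rows, cols = np.nonzero(close)

# Every row position that is part of at least one close pair.
loc_find = np.unique(np.concatenate((rows, cols)))

# One possible way to drop rows by position: translate the positions into
# index labels first, then drop those labels.
filtered = data.drop(data.index[loc_find])
```

As in my code above, both points of each close pair end up in `loc_find`, so both are dropped. Translating through `data.index[loc_find]` is meant to keep this working even when the data frame's index is not a plain 0..n-1 range.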