I have a pandas DataFrame as follows:
name dist
a 10
b 11
c 5
d 3
I want to iterate through the DataFrame and, for each row, check a condition using the dist column of all other rows; if the result falls below a threshold, I need to delete that row. The threshold itself is computed as a function of the dist values of the other rows. How can I use iterrows() efficiently to drop the rows without iterating through all the rows in a nested loop?
Here is how I am currently doing it:
ind_to_drop = []
for idx1, row1 in df.iterrows():
    for idx2, row2 in df.iterrows():
        if idx1 != idx2:
            val = myfunc(row1.dist, row2.dist)  # this is the function to compute that value
            if val > 0:
                ind_to_drop.append(idx2)  # here we want to drop the row with index idx2
Instead of appending the indices to ind_to_drop, can we remove the rows dynamically so that the number of iterations is reduced?
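To make "removing dynamically" concrete, here is a rough sketch of the behaviour I have in mind: rows that have already been marked as dropped are skipped in both the outer and the inner loop. The myfunc body below is only a hypothetical placeholder (my real function is different), drop_dynamic is a name made up for this illustration, and the sketch assumes the index labels are unique. Note that this may not give the same result as the nested loop above if a dropped row is still supposed to eliminate other rows.

```python
import pandas as pd


def myfunc(d1, d2):
    # Hypothetical placeholder for the real function:
    # positive when d2 is less than half of d1.
    return d1 / 2 - d2


def drop_dynamic(df):
    # Map index label -> dist once, instead of building row objects repeatedly.
    dist = df["dist"].to_dict()
    alive = set(df.index)  # labels of rows that have not been dropped yet
    for idx1 in df.index:
        if idx1 not in alive:
            continue  # already dropped: skip it as the outer row
        for idx2 in df.index:
            if idx2 == idx1 or idx2 not in alive:
                continue  # skip self-comparison and already dropped rows
            if myfunc(dist[idx1], dist[idx2]) > 0:
                alive.discard(idx2)  # "drop" idx2 immediately
    # Keep the surviving rows in their original order.
    return df.loc[[i for i in df.index if i in alive]]


df = pd.DataFrame({"name": list("abcd"), "dist": [10, 11, 5, 3]})
result = drop_dynamic(df)
```

With the placeholder myfunc, rows c and d are dropped and rows a and b remain. The loop is still quadratic in the worst case; it only gets shorter as rows are eliminated.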