Remove rows data frame Python

Asked Mar 14 '17 at 13:03

Active Mar 14 '17 at 13:07

Viewed 33 times

I have a data frame which basically looks like this:

The values in column 'A' are integers. I also have a list containing a subset of the values of 'A', say l = [12,676,854], and I want to remove ALL rows for which df['A'] equals any value in the list, i.e. df['A']==12 or df['A']==676 or df['A']==854. In this case the desired output would be

Code like

for el in l:
    df = df[df['A'] != el]

would work under 'normal' circumstances, but my data frame is fairly large (~4 million rows) and my list l has 40k elements. Each pass through the loop scans the whole data frame, which is extremely slow. Do you have any idea how to do this more efficiently? Ideally, I am looking for code along the lines of df=df[df['A']!=el for el in l], which is of course incorrect.

asked Mar 14 '17 at 13:03

rado


You need `print (df[~df.A.isin(l)])` – jezrael Mar 14 '17 at 13:04
It is a duplicate, so no answer. – jezrael Mar 14 '17 at 13:04
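
For readers who want to try the `isin` suggestion from the comment above, here is a minimal sketch. The sample frame, the column 'B', and the name `filtered` are assumptions made up for illustration, since the original data frame is not shown in the question. The idea is to build a single boolean mask over column 'A' and invert it with `~`, instead of filtering the frame once per list element.

```python
import pandas as pd

# Hypothetical sample data; the real frame from the question is not shown.
df = pd.DataFrame({"A": [12, 5, 676, 7, 854, 12],
                   "B": ["a", "b", "c", "d", "e", "f"]})
l = [12, 676, 854]

# One membership test over the whole column: True where 'A' is in l.
mask = df["A"].isin(l)

# ~ inverts the mask, so only rows whose 'A' is NOT in l are kept.
filtered = df[~mask]
print(filtered)
```

This replaces the 40k separate passes over the frame with a single membership test, which is likely to be much faster on a frame of a few million rows, though the actual gain depends on the data.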
