I have a data frame that basically looks like this:

      'A'
0     12
1     542
2     676
3     854
4     922
5     972

The values in column 'A' are integers. I also have a list containing a subset of the values of 'A', say l = [12,676,854], and I want to remove ALL the rows for which df['A'] is equal to any of the values in that list, i.e. df['A']==12 or df['A']==676 or df['A']==854. In this case the desired output would be

      'A'
1     542
4     922
5     972

Code like

for el in l:
   df = df[df['A'] != el]

would work under 'normal' circumstances, but my data frame is fairly large (~4 million rows) and my list l has 40k elements. Each pass through the loop scans the whole data frame, which is extremely slow. Is there a more efficient way to do this? Ideally, I am looking for something like df=df[df['A']!=el for el in l], which is of course not valid code.
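For reference, here is a minimal sketch of the kind of one-pass filter I have in mind. It assumes that pandas' `Series.isin` is suitable here, and I have not benchmarked it at the full size. The small frame and the name `filtered` are only for illustration, reusing the sample data from above.

```python
import pandas as pd

# Toy version of the data frame and list from the question
df = pd.DataFrame({"A": [12, 542, 676, 854, 922, 972]})
l = [12, 676, 854]

# isin() builds a single boolean mask marking rows whose 'A' value is in l;
# ~ inverts the mask so that only rows NOT in l are kept
filtered = df[~df["A"].isin(l)]
print(filtered)
```

This replaces the 40k separate comparisons over the frame with a single membership test per row, and the original index labels (1, 4, 5) are preserved.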

rado
