
How can I drop some values from a dataframe using another dataframe as parameter?

df1
code | reply
A1   | yes
A2   | yes
A3   | no

df2
code |
A1   |
A1   |
A3   |

df_new = df1.drop(df1['code'] == df2['code'] & df1['reply'] != 'yes')

df_new
code | reply
A1   | yes

Is there a simple way to do this using .drop()?

null92
  • There are several ways; have a look at these answers: https://stackoverflow.com/questions/33282119/pandas-filter-dataframe-by-another-dataframe-by-row-elements – tomasborrella Jan 10 '23 at 14:25

2 Answers


Use boolean indexing:

out = df1[df1['code'].isin(df2['code']) & df1['reply'].eq('yes')]

Output:

  code reply
0   A1   yes
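
For anyone who wants to run this, here is a minimal, self-contained sketch. The DataFrame constructors are my reconstruction of the tables in the question (an assumption, not the OP's actual code), and the names in_df2 and is_yes are only illustrative:

```python
import pandas as pd

# Sample data reconstructed from the question's tables (assumed, not the OP's code).
df1 = pd.DataFrame({'code': ['A1', 'A2', 'A3'],
                    'reply': ['yes', 'yes', 'no']})
df2 = pd.DataFrame({'code': ['A1', 'A1', 'A3']})

# True where df1's code appears anywhere in df2 (duplicates in df2 do not matter).
in_df2 = df1['code'].isin(df2['code'])
# True where the reply is 'yes'.
is_yes = df1['reply'].eq('yes')

# Keep only the rows that satisfy both conditions.
out = df1[in_df2 & is_yes]
print(out)
```

Note that isin matches on values, whereas df1['code'] == df2['code'] compares the two columns position by position. Using .eq() and .isin() also avoids a precedence trap: & binds more tightly than == and !=, so comparisons written with those operators need parentheses around each condition.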
mozway

The logic is unclear but you can accomplish what you want without the use of drop:

>>> df1[df1['reply'] == 'yes'].merge(df2.drop_duplicates('code'))

  code reply
0   A1   yes
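
A runnable sketch of the same idea, which also shows why drop_duplicates matters here. The sample frames are rebuilt from the question's tables (my assumption), and the variable names are illustrative only:

```python
import pandas as pd

# Sample data reconstructed from the question's tables (assumed, not the OP's code).
df1 = pd.DataFrame({'code': ['A1', 'A2', 'A3'],
                    'reply': ['yes', 'yes', 'no']})
df2 = pd.DataFrame({'code': ['A1', 'A1', 'A3']})

yes_rows = df1[df1['reply'] == 'yes']

# Inner merge on the shared 'code' column, with df2 de-duplicated first.
merged = yes_rows.merge(df2.drop_duplicates('code'))

# Without de-duplication, the repeated A1 in df2 yields a repeated A1 row.
merged_with_dupes = yes_rows.merge(df2)

print(merged)
print(len(merged_with_dupes))
```

The merge defaults to an inner join on the columns the two frames have in common, which is only 'code' in this data.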
Corralien
  • Actually, your drop and my indexing are equivalent by [De Morgan's law](https://en.wikipedia.org/wiki/De_Morgan%27s_laws), so yes, I always try to keep the simplest variant in such cases ;) – mozway Jan 10 '23 at 14:34
  • That's why I'm pointing it out to the OP. – Corralien Jan 10 '23 at 14:36
  • I was about to add this variant and the Wikipedia link to my answer, so I commented instead, as you had just edited ;) – mozway Jan 10 '23 at 14:38
  • OK. I removed this part from my answer. – Corralien Jan 10 '23 at 14:39
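
To illustrate the De Morgan's law point from the comments, here is a sketch of what a .drop() version could look like. It is not the variant that was removed from the answer (that text is not shown here), only one possible reconstruction; the sample frames and the names keep, kept and dropped are assumptions:

```python
import pandas as pd

# Sample data reconstructed from the question's tables (assumed, not the OP's code).
df1 = pd.DataFrame({'code': ['A1', 'A2', 'A3'],
                    'reply': ['yes', 'yes', 'no']})
df2 = pd.DataFrame({'code': ['A1', 'A1', 'A3']})

# Rows to keep: code is in df2 AND reply is 'yes'.
keep = df1['code'].isin(df2['code']) & df1['reply'].eq('yes')

# De Morgan: not (A and B) == (not A) or (not B), so the rows to drop are
# those whose code is NOT in df2 OR whose reply is NOT 'yes'.
to_drop = ~df1['code'].isin(df2['code']) | df1['reply'].ne('yes')

kept = df1[keep]
# .drop() takes index labels, so pass the labels of the unwanted rows.
dropped = df1.drop(df1[to_drop].index)

print(kept.equals(dropped))
```

Both routes give the same frame; .drop() just needs the extra step of turning the mask into index labels, which is why plain boolean indexing is usually the shorter option.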