I have a dataframe df like this:

    x   
1   paris   
2   paris  
3   lyon  
4   lyon   
5   toulouse 

I would like to keep only the rows that are not duplicated. For example, in the data above I would like to keep only the row 'toulouse'.

I tried the pandas drop_duplicates function, but it doesn't work:

df.drop_duplicates(subset=['x'], inplace=True)

Expected output:

      x   
 5 toulouse

How can I do this?

jos97
  • Does this answer your question? [Drop all duplicate rows across multiple columns in Python Pandas](https://stackoverflow.com/questions/23667369/drop-all-duplicate-rows-across-multiple-columns-in-python-pandas) – Kermit Oct 16 '22 at 20:21

1 Answer

From documentation:

keep : {‘first’, ‘last’, False}, default ‘first’
    Determines which duplicates (if any) to keep.
    - first : Drop duplicates except for the first occurrence.
    - last : Drop duplicates except for the last occurrence.
    - False : Drop all duplicates.

With the default keep='first', the first 'paris' and the first 'lyon' are kept, which is why your call did not give the expected result. The documentation says keep=False drops all duplicates, so you can do:

df.drop_duplicates(subset=['x'], keep=False, inplace=True)
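
As a quick check, here is a minimal sketch that rebuilds the example frame from the question and applies keep=False. The index values 1 to 5 and the variable names unique_rows and mask_rows are assumptions made for illustration. It returns a new frame instead of using inplace=True, and also shows what should be an equivalent boolean-mask form using duplicated.

```python
import pandas as pd

# Rebuild the example frame from the question (index 1..5 is assumed).
df = pd.DataFrame(
    {"x": ["paris", "paris", "lyon", "lyon", "toulouse"]},
    index=[1, 2, 3, 4, 5],
)

# keep=False drops every row whose "x" value occurs more than once.
unique_rows = df.drop_duplicates(subset=["x"], keep=False)

# Boolean-mask form: duplicated(keep=False) flags all members of a
# duplicate group, so negating it keeps only values that occur once.
mask_rows = df[~df.duplicated(subset=["x"], keep=False)]

print(unique_rows)
```

Only the row with index 5 ('toulouse') should remain in both results. The mask form can be handy if you want to combine the condition with other filters.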

Related Post: Drop all duplicate rows across multiple columns in Python Pandas

anky