I want to drop records that have duplicates, along with their duplicates, from a pandas DataFrame based on a column
- Please post sample data, expected output, and what you have tried so far. – nitin3685 Nov 14 '19 at 08:24
- Possible duplicate of [How to drop a list of rows from Pandas dataframe?](https://stackoverflow.com/questions/14661701/how-to-drop-a-list-of-rows-from-pandas-dataframe) – abhilb Nov 14 '19 at 08:31
- @abhilb: I believe the user is asking how to remove duplicate columns, not rows, and also how to drop them based on a condition. I assume he is not aware of which columns are duplicates, so he does not know the indexes of the columns to be dropped. – nitin3685 Nov 14 '19 at 08:37
- @nitin3685 I think abhilb is actually quite on track. However, I am not looking only to drop the duplicates; I want to drop both the duplicate and the first instance of the record (drop both the record and its duplicate). Keep that in mind. – herbert ichama Nov 14 '19 at 08:42
- @herbert: use `drop_duplicates` with `keep=False` as mentioned in my answer. If you want to drop duplicate rows, don't use `.T`. – nitin3685 Nov 14 '19 at 09:31
- @herbert: Please check the updated answer. It allows you to drop duplicate records based on a subset of columns and drops both the record and its duplicates. – nitin3685 Nov 14 '19 at 10:08
1 Answer
df.drop_duplicates(subset='column_name', keep=False)

`drop_duplicates` will drop duplicated rows.

`subset` lets you specify which column(s) are used to determine whether a row is a duplicate.

`keep` lets you specify which record to keep or drop; `keep=False` drops every row in a set of duplicates, so both the record and its duplicates are removed.

`drop_duplicates`: please check the documentation for more info.
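As a quick illustration, here is a minimal sketch using a made-up DataFrame with a hypothetical `id` column (not from the question) to show how `keep=False` removes both a record and its duplicates, whereas the default `keep='first'` would retain one copy:

```python
import pandas as pd

# Hypothetical sample data: 'id' is the column used to detect duplicates.
df = pd.DataFrame({
    'id':    [1, 2, 2, 3, 3, 4],
    'value': ['a', 'b', 'c', 'd', 'e', 'f'],
})

# keep=False drops every row whose 'id' appears more than once,
# i.e. both the first occurrence and its duplicates.
deduped = df.drop_duplicates(subset='id', keep=False)
print(deduped)
#    id value
# 0   1     a
# 5   4     f
```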

nitin3685