I have the DataFrame below, which contains duplicate rows. I want to remove these duplicates from the DataFrame.
import pandas as pd

# 'test_id' holds a dict per row; rows 0 and 3 are identical
df = pd.DataFrame({
    'test_id': [
        {'user_id': 2, 'insert_date': '2020-12-23', 'is_admin': "true"},
        {'user_id': 4, 'insert_date': '2020-12-23', 'is_admin': "true"},
        {'user_id': 3, 'insert_date': '2020-12-21', 'is_admin': "false"},
        {'user_id': 2, 'insert_date': '2020-12-23', 'is_admin': "true"},
    ],
    'contact_id': [1, 4, 2, 1],
})
print(df)
                                             test_id  contact_id
0  {'user_id': 2, 'insert_date': '2020-12-23', 'i...           1
1  {'user_id': 4, 'insert_date': '2020-12-23', 'i...           4
2  {'user_id': 3, 'insert_date': '2020-12-21', 'i...           2
3  {'user_id': 2, 'insert_date': '2020-12-23', 'i...           1
I tried the following to remove the duplicates:
df = df.drop_duplicates(subset=['test_id', 'contact_id'], keep='first')
print(df)
But I get the following error:
TypeError: unhashable type: 'dict'
How can I delete duplicate rows based on the combination of 'test_id' and 'contact_id'?
This is the output I want:
                                             test_id  contact_id
0  {'user_id': 2, 'insert_date': '2020-12-23', 'i...           1
1  {'user_id': 4, 'insert_date': '2020-12-23', 'i...           4
2  {'user_id': 3, 'insert_date': '2020-12-21', 'i...           2
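
For reference, here is a minimal sketch of one possible way to get that output. The error occurs because drop_duplicates hashes the values and a dict is not hashable, so the sketch builds a hashable stand-in for each dict and deduplicates on that. It assumes the dicts contain only JSON-serializable values. The helper name dict_key and the column name test_id_key are hypothetical and not part of pandas.

```python
import json
import pandas as pd

df = pd.DataFrame({
    'test_id': [
        {'user_id': 2, 'insert_date': '2020-12-23', 'is_admin': "true"},
        {'user_id': 4, 'insert_date': '2020-12-23', 'is_admin': "true"},
        {'user_id': 3, 'insert_date': '2020-12-21', 'is_admin': "false"},
        {'user_id': 2, 'insert_date': '2020-12-23', 'is_admin': "true"},
    ],
    'contact_id': [1, 4, 2, 1],
})

def dict_key(d):
    # Hypothetical helper: serialize a dict to a canonical string.
    # sort_keys=True makes the result independent of key order.
    return json.dumps(d, sort_keys=True)

# Build a temporary frame of hashable keys with the same index as df.
key_frame = pd.DataFrame({
    'test_id_key': df['test_id'].map(dict_key),
    'contact_id': df['contact_id'],
})

# True for every row that repeats an earlier (test_id, contact_id) pair.
mask = key_frame.duplicated(keep='first')

# Keep only first occurrences; the original dict column is left untouched.
deduped = df[~mask]
print(deduped)
```

Because the mask is computed on a separate frame, the test_id column still holds the original dicts afterwards. If the dicts could hold values that json.dumps cannot serialize, a different key function would be needed.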