I am trying to use the DataFrame.drop_duplicates parameters, but without luck: the duplicates are not being removed.
I want to remove duplicates based on the column "inc_id". If duplicates are found in that column, only the last row should be kept.
My df is:
    inc_id  inc_cr_date
0  1049670          121
1  1049670           55
2  1049667          121
3  1049640           89
4  1049666           12
5  1049666           25
Output should be:
    inc_id  inc_cr_date
0  1049670           55
1  1049667          121
2  1049640           89
3  1049666           25
Code is:
df = df.drop_duplicates(subset='inc_id', keep="last")
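For anyone who wants to reproduce this, here is a minimal self-contained sketch that rebuilds the sample frame and applies the same call. Two things in it are assumptions rather than facts about my real data: both columns are built as plain integers (if inc_id is actually stored as strings with stray whitespace or mixed types, values that look identical may not compare equal), and the reset_index(drop=True) step is added only to get the 0..3 index shown in the expected output.

```python
import pandas as pd

# Sample data from the question; integer dtypes are an assumption.
df = pd.DataFrame({
    "inc_id": [1049670, 1049670, 1049667, 1049640, 1049666, 1049666],
    "inc_cr_date": [121, 55, 121, 89, 12, 25],
})

# Keep only the last row for each inc_id. drop_duplicates returns a new
# frame, so the result has to be assigned to a variable.
deduped = df.drop_duplicates(subset="inc_id", keep="last")

# Renumber the index from 0 (added to match the expected output above).
deduped = deduped.reset_index(drop=True)

print(deduped)
print(df.dtypes)  # useful for checking how inc_id is really stored
```

Checking df.dtypes on the real frame should show whether inc_id is numeric or object, which may matter for how duplicates are compared.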
Any idea what I am missing here? Thanks.