Pandas get unique values from a column and only latest date efficiently

Asked Jul 23 '20 at 08:51

Active Jul 23 '20 at 08:51

Viewed 6 times

I have a dataframe df:

NonUniqueKey       Date
1111              01-01-2020  12:00
4444              22-09-2020  14:00
1111              23-07-2020  04:00
2222              08-03-2020  08:15
2222              08-03-2020  08:16 
2222              08-03-2020  08:17
3333              11-11-2019  00:00

Expected output:

NonUniqueKey       Date
4444              22-09-2020  14:00
1111              23-07-2020  04:00
2222              08-03-2020  08:17
3333              11-11-2019  00:00

I want to keep one row per NonUniqueKey: the row with the latest Date. I'm using drop_duplicates(subset='NonUniqueKey'), which removes the duplicates but keeps the wrong row (the first occurrence, not the latest date).

asked Jul 23 '20 at 08:51

xixi

First convert the values to datetimes and then use any of the solutions. – jezrael Jul 23 '20 at 08:52
But how do I make sure the other column is unique?? HOW IS THIS A DUPLICATE?? Your comment isn't useful either – xixi Jul 23 '20 at 08:59
OK, use `df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)` and then `df.loc[df.groupby('NonUniqueKey')['Date'].idxmax()]` – jezrael Jul 23 '20 at 09:01
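
For reference, here is a minimal sketch of the approach from the last comment, run against the sample data in the question. It assumes Date is stored as day-first strings with a single space between date and time; the variable names `latest_idx` and `latest_sorted` are made up for illustration. The second variant (sort, then drop_duplicates) is an alternative not mentioned in the comments. On this sample it returns the rows in the same order as the expected output.

```python
import pandas as pd

# Sample data from the question (Date assumed to be day-first strings).
df = pd.DataFrame({
    "NonUniqueKey": [1111, 4444, 1111, 2222, 2222, 2222, 3333],
    "Date": [
        "01-01-2020 12:00", "22-09-2020 14:00", "23-07-2020 04:00",
        "08-03-2020 08:15", "08-03-2020 08:16", "08-03-2020 08:17",
        "11-11-2019 00:00",
    ],
})

# Convert to real datetimes first; comparing the raw strings would order
# them by day-of-month rather than chronologically.
df["Date"] = pd.to_datetime(df["Date"], dayfirst=True)

# Variant 1 (from the comment): pick the row holding the max Date per key.
# The result is ordered by NonUniqueKey.
latest_idx = df.loc[df.groupby("NonUniqueKey")["Date"].idxmax()]

# Variant 2 (alternative): sort newest-first, then keep the first row per key.
# The result is ordered by Date, newest first.
latest_sorted = (
    df.sort_values("Date", ascending=False)
      .drop_duplicates(subset="NonUniqueKey")
)

print(latest_idx)
print(latest_sorted)
```

Both variants keep the 08:17 row for key 2222. drop_duplicates on its own keeps the first occurrence in the existing row order, which is why sorting by Date beforehand matters.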
