I have a dataframe df:

NonUniqueKey       Date
1111              01-01-2020  12:00
4444              22-09-2020  14:00
1111              23-07-2020  04:00
2222              08-03-2020  08:15
2222              08-03-2020  08:16 
2222              08-03-2020  08:17
3333              11-11-2019  00:00

Expected output :

NonUniqueKey       Date
4444              22-09-2020  14:00
1111              23-07-2020  04:00
2222              08-03-2020  08:17
3333              11-11-2019  00:00

For each NonUniqueKey, I want to keep only the row with the latest Date. I'm using drop_duplicates(subset='NonUniqueKey'), which removes the duplicates but keeps the wrong row.

xixi
  • First convert the values to datetimes and then use any of the solutions. – jezrael Jul 23 '20 at 08:52
  • But how do I make sure the other column is unique?? HOW IS THIS A DUPLICATE?? Your comment isn't useful either – xixi Jul 23 '20 at 08:59
  • OK, use `df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)` and then `df.loc[df.groupby('NonUniqueKey')['Date'].idxmax()]` – jezrael Jul 23 '20 at 09:01
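
The approach suggested in the last comment can be sketched end to end as follows. This is a minimal sketch, not a definitive answer: the frame is rebuilt by hand from the sample in the question, the dates are assumed to be day-first strings with a single space before the time, and the explicit `format` string, the `sort_index()` calls, and the names `latest` and `latest_alt` are illustrative choices rather than part of the original comment.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: Date is stored as text).
df = pd.DataFrame({
    "NonUniqueKey": [1111, 4444, 1111, 2222, 2222, 2222, 3333],
    "Date": [
        "01-01-2020 12:00", "22-09-2020 14:00", "23-07-2020 04:00",
        "08-03-2020 08:15", "08-03-2020 08:16", "08-03-2020 08:17",
        "11-11-2019 00:00",
    ],
})

# Parse the day-first strings into real datetimes so they compare chronologically.
df["Date"] = pd.to_datetime(df["Date"], format="%d-%m-%Y %H:%M")

# For each key, idxmax() returns the index label of the row with the latest Date;
# .loc then selects those rows. sort_index() restores the original row order.
latest = df.loc[df.groupby("NonUniqueKey")["Date"].idxmax()].sort_index()

# Alternative: sort by Date, then keep the last (latest) row per key.
latest_alt = (
    df.sort_values("Date")
      .drop_duplicates(subset="NonUniqueKey", keep="last")
      .sort_index()
)

print(latest)
```

A plain `drop_duplicates(subset='NonUniqueKey')` defaults to `keep='first'`, so it keeps whichever row appears first in the frame regardless of its date. Sorting by `Date` first and passing `keep='last'` is what makes the surviving row the latest one.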
