remain multiple rows with latest maintenance date - sort values and drop duplicate in Python

Asked Mar 14 '22 at 11:06

Active Mar 14 '22 at 11:29

Viewed 31 times

I have dataframe

df = {'id':['A','A','A','A','B','B'], 'type':['PO','PO','PO','MB','MB','FX'], 'desc':['01234','66666','77890','qwer','asdf','123abc'], 'maint_date':["2018-10-03","2018-10-03","2005-08-23", "2017-09-19","2008-08-13","2016-08-20"]}

id	type	desc	maint_date
A	PO	01234	2018-10-03
A	PO	66666	2018-10-03
A	PO	77890	2005-08-23
A	MB	QWER	2017-09-19
B	MB	ASDF	2008-08-13
B	FX	123ABC	2016-08-20

I would like to sort based on latest 'id', 'type' and keeping the latest desc based on latest'maint_date'. I would like to keep multiple rows with the latest maintenance date even though having same 'id','type','maint_date'

To simplify, I would like to obtain result shown below:

id	type	desc	maint_date
A	PO	01234	2018-10-03
A	PO	66666	2018-10-03
A	MB	QWER	2017-09-19
B	FX	123ABC	2016-08-20

edited Mar 14 '22 at 11:29

asked Mar 14 '22 at 11:06

Emma She

@mkrieger1 df.sort_values(['id','type','maint_date'],ascending = [True,False,False]).drop_duplicates(['id','type','maint_date'],keep='first'), I tried to run this line of code, but it did not remain the second row with desc '6666' eventhough the maintenance date is the latest – Emma She Mar 14 '22 at 11:19
Check [this](https://stackoverflow.com/a/31185210/2901002) solution – jezrael Mar 14 '22 at 11:28
thanks @jezrael, u led me straight to the right source! It finally solved my problem – Emma She Mar 14 '22 at 11:34

remain multiple rows with latest maintenance date - sort values and drop duplicate in Python

0 Answers0