I have a pandas dataframe formatted like this:
(df)
School ID Num Status Modified Date
School 1 6000 Active 2020-07-18
School 1 6000 InActive 2020-10-05
School 2 9-999 Active 2020-03-30
School 2 9-999 Active 2020-10-14
School 2 9-999 InActive 2020-07-21
School 3 7000 Active 2020-07-18
School 3 7000 InActive 2020-09-05
....
I am trying to write a function using sort() that sorts the rows and keeps only the row with the most recent Modified Date for each school. This would be the result:
(df)
School ID Num Status Modified Date
School 1 6000 InActive 2020-10-05
School 2 9-999 Active 2020-10-14
School 3 7000 InActive 2020-09-05
....
I would like to use the sort() function and then maybe drop duplicates based on the Num and Status columns, but I am a bit stuck. Thanks.
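
For reference, here is a minimal sketch of the sort-then-drop-duplicates idea. It rests on a few assumptions: the columns are split as "School ID", "Num", "Status" and "Modified Date", "Num" identifies a school, and the dates are stored as strings. The function name keep_most_recent is made up for illustration. It uses sort_values(), since recent pandas versions provide that rather than DataFrame.sort(), and it drops duplicates on "Num" only, because including "Status" would keep one row per status instead of one row per school.

```python
import pandas as pd

# Sample rows copied from the question. The column split below is an
# assumption, since the original table does not show column boundaries.
df = pd.DataFrame({
    "School ID": ["School 1", "School 1", "School 2", "School 2",
                  "School 2", "School 3", "School 3"],
    "Num": ["6000", "6000", "9-999", "9-999", "9-999", "7000", "7000"],
    "Status": ["Active", "InActive", "Active", "Active",
               "InActive", "Active", "InActive"],
    "Modified Date": ["2020-07-18", "2020-10-05", "2020-03-30", "2020-10-14",
                      "2020-07-21", "2020-07-18", "2020-09-05"],
})


def keep_most_recent(frame):
    """Return one row per Num: the one with the latest Modified Date.

    keep_most_recent is a hypothetical helper name, not a pandas function.
    """
    out = frame.copy()
    # Parse the dates so the sort is chronological rather than lexical.
    out["Modified Date"] = pd.to_datetime(out["Modified Date"])
    # Oldest first, so the last row in each Num group is the newest.
    out = out.sort_values("Modified Date")
    # Keep only the newest row for each Num.
    out = out.drop_duplicates(subset="Num", keep="last")
    # Restore the original row order, then renumber the index.
    return out.sort_index().reset_index(drop=True)


result = keep_most_recent(df)
print(result)
```

If "Num" is not unique per school in the real data, passing subset=["School ID", "Num"] to drop_duplicates() may be the safer choice.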