I am attempting to create a sub-table from a larger dataset which lists out the unique ID, the name of the person and the date they attended an appointment.
For example,
df = pd.DataFrame({'ID': ['abc', 'def', 'abc', 'abc'],
'name':['Alex','Bertie','Alex','Alex'],
'date_attended':['01/01/2021','05/01/2021','11/01/2021','20/01/2021']
What I would like is a dataframe, that shows the last time Alex and Bertie attended a class. So my dataframe would like:
name date_attended
Alex 20/01/2021
Bertie 05/01/2021
I'm really struggling with this! So far I have tried (based off a previous question I saw here):
max_date_list = ['ID','date_attended']
df = df.groupby(['ID'])[max_date_list].transform('max').size()
but I keep getting an error. I know this would involve a groupby
but I can't figure out how to get the maximum date. Would anyone know how to do this?