I have two dataframes. Dataframe X has columns:
Index(['studentid', 'Year', 'MCR_NAME', 'QUAL_DETAILS'])
Dataframe Y has columns:
Index(['studentid', 'total'])
I want to merge X and Y on 'studentid' using:
Z = X.merge(Y, on="studentid")
However, the same "studentid" field can occur in different years in X. If this happens, I want to only keep the earliest record after the merge. That is I don't want the same studentid to occur twice in the merged dataframe. The years from the Year field are written "20/21", "21/22", "22/23".
A given studentid can only occur once per year.
How can I do that?