Apologies for the confusing title; the problem is explained more clearly below. I have a pandas DataFrame that looks something like this:
user_id  year  grade_id
      1  2005        47
      1  2003        70
      1  2004        70
      2  2011        50
      2  2003        43
      2  2009        60
I want to group by user_id and, for each user, return the row with the maximum grade_id. If several rows share that maximum, I want the one with the earliest (minimum) year. So the output for the above data frame would look like this:
user_id  year  grade_id
      1  2003        70
      2  2009        60
Is there a simple/elegant way to do this? I have tried things like the following:
tmp_df = df.groupby(["user_id", "year"])["grade_id"].agg(np.max)
However, this does not return the correct year. I have already looked at a few Stack Overflow posts, but none of them seem to cover the same issue. Any help would be much appreciated.
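For reference, here is a minimal sketch of one way the desired result might be produced. It is not necessarily the most elegant option. The attempt above groups by both user_id and year, so every (user_id, year) pair becomes its own group and the maximum is taken within each pair rather than within each user. The sketch instead sorts so that each user's highest grade_id comes first, with ties broken by the earliest year, and then keeps the first row per user. The sample data is rebuilt from the table in the question; the variable name `result` is just an illustrative choice.

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
df = pd.DataFrame({
    "user_id": [1, 1, 1, 2, 2, 2],
    "year": [2005, 2003, 2004, 2011, 2003, 2009],
    "grade_id": [47, 70, 70, 50, 43, 60],
})

# Within each user: highest grade_id first, then earliest year first.
# Keeping the first row per user_id then yields the row with the
# maximum grade_id and, among ties, the minimum year.
result = (
    df.sort_values(["user_id", "grade_id", "year"],
                   ascending=[True, False, True])
      .drop_duplicates("user_id")
      .reset_index(drop=True)
)

print(result)
```

With the sample data this should give year 2003 / grade_id 70 for user 1 and year 2009 / grade_id 60 for user 2, matching the expected output shown in the question. Sorting plus drop_duplicates returns whole rows, so any other columns in the real data frame would be carried along too.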