I have the following dataframe:
d = {'item_code': ['Item1', 'Item2', 'Item3', 'Item1', 'Item2', 'Item2', 'Item2', 'Item1', 'Item3'], 'dollar_amount': ['200.25', '350.00', '120.00', '400.50', '1231.25', '700.00', '350.00', '200.25', '2340.00'], 'year': ['2010','2010','2010','2011','2011','2012','2012','2012','2012']}
df = pd.DataFrame(data=d)
df
dollar_amount item_code year
0 200.25 Item1 2010
1 350.00 Item2 2010
2 120.00 Item3 2010
3 400.50 Item1 2011
4 1231.25 Item1 2011
5 700.00 Item2 2012
6 350.00 Item2 2012
7 200.25 Item1 2012
8 2340.00 Item3 2012
This is what I currently have:
df['dollar_amount'] = df['dollar_amount'].astype(float)
df.groupby('year', as_index=False).agg({'dollar_amount':'max'})
It gives me the following result:
year dollar_amount
0 2010 350.0000
1 2011 1,231.2500
2 2012 2,340.0000
I want to groupby year and find the maximum dollar_amount in that year and also get the corresponding item_code for that maximum dollar_amount.
So the expected outcome is something like this:
year dollar_amount item_code
0 2010 350.0000 item2
1 2011 1,231.2500 item2
2 2012 2,340.0000 item3