I have a DataFrame and want to find the sum of the two largest values in each group.
I can find the two largest values per group using groupby followed by nlargest. However, the result does not seem to keep the 'group' value in a form that lets me sum those top two values, so the code below raises an error.
import pandas as pd
df = pd.DataFrame({'group': list('aaabccaaaccbbbc'),
                   'vals': [12, 341, 3, 2, 45, 24, 4, 51, 54, 21, 231, 21, 51, 31, 87]})
# Two largest values within each group
top_2 = df.groupby('group')['vals'].nlargest(2)
# This is the line that raises the error
top_2.groupby('group')['vals'].sum()
My expected output would be:
a 395
b 82
c 318
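
For reference, here is a minimal sketch of one way this could work. It assumes that the result of nlargest on a grouped column is a Series whose index has the group as its first level, so it can be grouped by index level without selecting a column. The names top_2_sum and top_2_sum_alt are just illustrative.

```python
import pandas as pd

df = pd.DataFrame({'group': list('aaabccaaaccbbbc'),
                   'vals': [12, 341, 3, 2, 45, 24, 4, 51, 54, 21, 231, 21, 51, 31, 87]})

# nlargest on a grouped column returns a Series indexed by
# (group, original row label), not a DataFrame with a 'vals' column
top_2 = df.groupby('group')['vals'].nlargest(2)

# Group on the first index level (the group key) and sum directly
top_2_sum = top_2.groupby(level=0).sum()
print(top_2_sum)

# Alternative sketch: take the top two and sum them in a single step per group
top_2_sum_alt = df.groupby('group')['vals'].apply(lambda s: s.nlargest(2).sum())
print(top_2_sum_alt)
```

Both variants avoid indexing the intermediate Series with ['vals'], which is probably where the original attempt fails.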