I have a DataFrame and want to find the sum of the two largest values in each group.

I can find the two largest values per group using groupby followed by nlargest. However, the result does not seem to keep the 'group' value in a form that lets me sum each group's top two, so the code below raises an error.

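import pandas as pd

df = pd.DataFrame({'group': list('aaabccaaaccbbbc'),
                   'vals': [12,341,3,2,45,24,4,51,54,21,231,21,51,31,87]})

# Two largest values per group; the result is a Series indexed by (group, original row index)
top_2 = df.groupby('group')['vals'].nlargest(2)
# This line raises the error: top_2 is a Series, so there is no 'vals' column to select
top_2.groupby('group')['vals'].sum()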

My expected output would be:

a    395
b     82
c    318
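
For reference, here is a minimal sketch of one way this output might be produced. It assumes that nlargest on a grouped column returns a Series whose index has a level named 'group', so the second groupby can target that index level instead of selecting a column. The names top_2_sum and alt are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'group': list('aaabccaaaccbbbc'),
                   'vals': [12, 341, 3, 2, 45, 24, 4, 51, 54, 21, 231, 21, 51, 31, 87]})

# Two largest values per group; the group label ends up in the index, not in a column
top_2 = df.groupby('group')['vals'].nlargest(2)

# Group on the 'group' index level and sum; no column selection on a Series
top_2_sum = top_2.groupby(level='group').sum()

# Possible alternative: take the top two and sum them inside a single apply
alt = df.groupby('group')['vals'].apply(lambda s: s.nlargest(2).sum())

print(top_2_sum)
```

The apply variant is shorter, but it calls a Python function once per group, so it may be slower on data with many groups.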
oli5679
