
I'm having trouble with a pandas groupby operation.

I have this dataframe:

Letter  name   num_exercises
A       carl   1
A       Lenna  2
A       Harry  3
A       Joe    4
B       Carl   5
B       Lenna  3
B       Harry  3
B       Joe    6
C       Carl   6
C       Lenna  3
C       Harry  4
C       Joe    7

I want to add a column called num_exercises_total that contains the sum of num_exercises for each letter. Note that this value must be repeated on every row of that letter's group.

The output would be as follows:

Letter  name   num_exercises  num_exercises_total
A       carl   1              10
A       Lenna  2              10
A       Harry  3              10
A       Joe    4              10
B       Carl   5              17
B       Lenna  3              17
B       Harry  3              17
B       Joe    6              17
C       Carl   6              20
C       Lenna  3              20
C       Harry  4              20
C       Joe    7              20

I've tried adding the new column like this:

df['num_exercises_total'] = df.groupby(['Letter'])['num_exercises'].sum()

But it returns the value NaN for all the rows.

Any help would be highly appreciated.

Thank you very much in advance!

HRDSL

2 Answers

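Use transform. Unlike sum(), which returns one row per group indexed by Letter, transform('sum') returns a result aligned to the original index, so every row receives its group's total: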

df.groupby(['Letter'])['num_exercises'].transform('sum')
0     10
1     10
2     10
3     10
4     17
5     17
6     17
7     17
8     20
9     20
10    20
11    20
Name: num_exercises, dtype: int64

df['num_of_total'] = df.groupby(['Letter'])['num_exercises'].transform('sum')
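
To show why the original assignment produced NaN, here is a minimal sketch. It assumes the frame from the question, rebuilt by hand below; the names per_letter and misaligned are only illustrative and do not come from the question.

```python
import pandas as pd

# Rebuild the question's frame (values copied from the question).
df = pd.DataFrame({
    "Letter": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
    "name": ["carl", "Lenna", "Harry", "Joe",
             "Carl", "Lenna", "Harry", "Joe",
             "Carl", "Lenna", "Harry", "Joe"],
    "num_exercises": [1, 2, 3, 4, 5, 3, 3, 6, 6, 3, 4, 7],
})

# sum() collapses each group to a single row, indexed by Letter (A, B, C).
per_letter = df.groupby("Letter")["num_exercises"].sum()

# Assigning that Series aligns on index labels; the frame's labels 0..11
# never match A/B/C, so every row ends up NaN.
misaligned = df.assign(num_exercises_total=per_letter)

# transform("sum") keeps the original index, one value per original row.
df["num_exercises_total"] = (
    df.groupby("Letter")["num_exercises"].transform("sum")
)
```

The aggregation is the same in both cases; what differs is the index of the result, and that index is what pandas uses when the column is assigned.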
BENY

transform works perfectly for this question; WenYoBen is right. I am just adding a slightly different version, which selects the column first and then groups it by the Letter column:

df['num_of_total'] = df['num_exercises'].groupby(df['Letter']).transform('sum')
>>> df
   Letter   name  num_exercises  num_of_total
0       A   carl              1            10
1       A  Lenna              2            10
2       A  Harry              3            10
3       A    Joe              4            10
4       B   Carl              5            17
5       B  Lenna              3            17
6       B  Harry              3            17
7       B    Joe              6            17
8       C   Carl              6            20
9       C  Lenna              3            20
10      C  Harry              4            20
11      C    Joe              7            20
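
As a quick check that the two spellings agree, here is a small sketch, again assuming the question's frame rebuilt by hand. The last variant, which looks up per-letter totals with map, is not from either answer; it is included only as a possible alternative, and the variable names are illustrative.

```python
import pandas as pd

# The question's frame, rebuilt by hand (name column omitted for brevity).
df = pd.DataFrame({
    "Letter": ["A"] * 4 + ["B"] * 4 + ["C"] * 4,
    "num_exercises": [1, 2, 3, 4, 5, 3, 3, 6, 6, 3, 4, 7],
})

# First answer: group the frame, then select the column.
via_frame = df.groupby("Letter")["num_exercises"].transform("sum")

# This answer: select the column, then group it by the Letter Series.
via_series = df["num_exercises"].groupby(df["Letter"]).transform("sum")

# Alternative (an assumption, not from either answer): aggregate once,
# then look up each row's letter in the per-letter totals.
totals = df.groupby("Letter")["num_exercises"].sum()
via_map = df["Letter"].map(totals)
```

All three give one total per original row on the original index, so any of them can be assigned straight to a new column.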
shinchaan