How to find sum and count of a column based on a grouping condition on a Pandas dataset?

Question

I have a Pandas dataset with 3 columns. I need to group by the ID column while finding the sum and count of the other two columns. Also, I have to ignore the zeroes in the columsn 'A' and 'B'.

The dataset looks like -

ID   A    B  

1    0    5  
2   10    0   
2   20    0  
3    0   30

What I need -

ID   A_Count   A_Sum   B_Count   B_Sum  

1      0         0        1        5  
2      2        30        0        0  
3      0         0        1       30

I have tried this using one column but wasn't able to get both the aggregations in the final dataset.

(df.groupby('ID').agg({'A':'sum', 'A':'count'}).reset_index().rename(columns = {'A':'A_sum', 'A': 'A_count'}))

Probable Duplicate : https://stackoverflow.com/questions/38174155/group-dataframe-and-get-sum-and-count — min2bro, May 02 '19 at 09:13
Possible duplicate of [Group dataframe and get sum AND count?](https://stackoverflow.com/questions/38174155/group-dataframe-and-get-sum-and-count) — Georgy, May 02 '19 at 09:57

Erfan · Accepted Answer · 2019-05-02T13:29:38.140

If you don't pass it columns specifically, it will aggregate the numeric columns by itself.

Since your don't want to count 0, replace them with NaN first:

df.replace(0, np.NaN, inplace=True)

print(df)
   ID     A     B
0   1   NaN   5.0
1   2  10.0   NaN
2   2  20.0   NaN
3   3   NaN  30.0

df = df.groupby('ID').agg(['count', 'sum'])

print(df)
       A           B      
   count   sum count   sum
ID                        
1      0   0.0     1   5.0
2      2  30.0     0   0.0
3      0   0.0     1  30.0

Remove MultiIndex columns

You can use list comprehension:

df.columns = ['_'.join(col) for col in df.columns]

print(df)
    A_count  A_sum  B_count  B_sum
ID                                
1         0    0.0        1    5.0
2         2   30.0        0    0.0
3         0    0.0        1   30.0

In the output I do not want to count or sum the zeroes though. — Vjain, May 02 '19 at 10:25
I see, edited for correct answer. See explanation about replacing `0` with `NaN` @Vjain — Erfan, May 02 '19 at 13:30

How to find sum and count of a column based on a grouping condition on a Pandas dataset?

1 Answers1