pandas - aggregate count per month and calculate total sum of counts afterwards

Question

For my dataset:

df = pd.DataFrame({'ID': [1, 1, 2, 2, 2, 2, 3],
                   'Month': ['Feb', 'March', 'Feb', 'Feb', 'March', 'March', 'May']})

I want to aggregate a data set by ID and its month so I get the counts per month and the total sum over all periods:

ID Month
1   Feb
1   March
2   Feb
2   Feb
2   March
2   March
3   May

I need following output:

ID Month  count total_sum
1  Feb    1     2
1  Marc   1     2
2  Feb    2     4
2  March  2     4
3  May    1     1

It looks like you want a groupby (in your case you would groupby ["ID","Month"] with a couple functions. See https://stackoverflow.com/questions/12589481/multiple-aggregations-of-the-same-column-using-pandas-groupby-agg — Leo, Jan 13 '22 at 17:41

score 1 · Answer 1 · answered Jan 13 '22 at 18:38

You can groupby twice, once by ['ID', 'Month'] and then by 'ID' to count per month per ID and total count per ID, respectively.

Note that in the first groupby is the case where you use count method but in the second groupby you use sum method because you're aggregating the counts.

out = df.groupby(['ID', 'Month']).agg(count=('ID','count')).reset_index()
out['total count'] = out.groupby('ID')['count'].transform('sum')

Output:

   ID  Month  count  total count
0   1    Feb      1            2
1   1  March      1            2
2   2    Feb      2            4
3   2  March      2            4
4   3    May      1            1

@KostaS. was your question not resolved? – Jan 14 '22 at 09:44 — , Jan 14 '22 at 09:44

pandas - aggregate count per month and calculate total sum of counts afterwards

1 Answers1