Adding a 'count' column to the result of a groupby in pandas?

Question

I think this is a fairly basic question, but I can't seem to find the solution.

I have a pandas dataframe similar to the following:

import pandas as pd

df = pd.DataFrame({'A' : ['x','x','y','z','z'],
                   'B' : ['p','p','q','r','r']})
df

which creates a table like this:

    A   B
0   x   p
1   x   p
2   y   q
3   z   r
4   z   r

I'm trying to create a table that represents the number of distinct values in that dataframe. So my goal is something like this:

    A   B   c
0   x   p   2
1   y   q   1
2   z   r   2

I can't find the correct functions to achieve this, though. I've tried:

df.groupby(['A','B']).agg('count')

This produces a table with 3 rows (as expected) but without a 'count' column. I don't know how to add in that count column. Could someone point me in the right direction?

Reopened the question since this is more specific ('get just size') versus 'get me these random stats I asked for') — cs95, Jul 08 '20 at 21:36

score 53 · Accepted Answer · answered Feb 13 '18 at 15:20

53

You can using size

df.groupby(['A','B']).size()
Out[590]: 
A  B
x  p    2
y  q    1
z  r    2
dtype: int64

For your solution adding one of the columns

df.groupby(['A','B']).B.agg('count')
Out[591]: 
A  B
x  p    2
y  q    1
z  r    2
Name: B, dtype: int64

Update :

df.groupby(['A','B']).B.agg('count').to_frame('c').reset_index()

#df.groupby(['A','B']).size().to_frame('c').reset_index()
Out[593]: 
   A  B  c
0  x  p  2
1  y  q  1
2  z  r  2

answered Feb 13 '18 at 15:20

BENY

317,841
20
164
234

2

this is a hell of a workaround to simply add a new count column but apparently is the "pythonic" way to do things. – Seymour May 24 '20 at 14:27
If you write `df.groupby(['A','B'])[["B"]].agg('count')` you don't need to do the `.to_frame` - This will return a DataFrame instead of a series. – Markus May 11 '21 at 09:37
How to group by B only and keep A values? – Peter.k Feb 20 '23 at 16:59

score 16 · Answer 2 · answered Jul 08 '20 at 19:59

pandas >= 1.1: `DataFrame.value_counts`

This is an identical replacement for df.groupby(['A', 'B']).size().

df.value_counts(['A', 'B'])

A  B
z  r    2
x  p    2
y  q    1
dtype: int64

df.value_counts(['A', 'B']).reset_index(name='c')

   A  B  c
0  z  r  2
1  x  p  2
2  y  q  1

Adding a 'count' column to the result of a groupby in pandas?

2 Answers2

pandas >= 1.1: `DataFrame.value_counts`

Linked

Related

Adding a 'count' column to the result of a groupby in pandas?

2 Answers2

pandas >= 1.1: DataFrame.value_counts

Linked

Related

pandas >= 1.1: `DataFrame.value_counts`