I'm using the DataFrame below.

import pandas as pd
df = pd.DataFrame({'A' : ['aa','bb','aa','dd','ff','dd','aa','bb','dd','cc'], 'B' : ['xx','xx','yy','zz','xx','xx','yy','zz','zz','yy']})

which creates a table like this:

    A   B
0   aa  xx
1   bb  xx
2   aa  yy
3   dd  zz
4   ff  xx
5   dd  xx
6   aa  yy
7   bb  zz
8   dd  zz
9   cc  yy

I'm able to get a count for each (A, B) pair with

df.groupby(['A','B']).size()

which gives me the table below:

A   B 
aa  xx    1
    yy    2
bb  xx    1
    zz    1
cc  yy    1
dd  xx    1
    zz    2
ff  xx    1
dtype: int64

I want to get the below output:

A      Count
aa     3   
bb     2   
cc     1  
dd     3    
ff     1  

I'm not able to get the above output. I have also tried

df.groupby(['A','B']).B.agg('count').to_frame('Count').reset_index()

But this still counts per (A, B) pair, so it does not give the desired output either. Any help is deeply appreciated.

DirtyBit

2 Answers

Change this:

df.groupby(['A','B']).size()

to this:

df.groupby(['A']).size()

Or just:

df['A'].value_counts()

Hence:

import pandas as pd
df = pd.DataFrame({'A' : ['aa','bb','aa','dd','ff','dd','aa','bb','dd','cc'], 'B' : ['xx','xx','yy','zz','xx','xx','yy','zz','zz','yy']})
print(df.groupby(['A']).size())

OUTPUT:

A
aa    3
bb    2
cc    1
dd    3
ff    1
dtype: int64
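
If you need the exact two-column layout from the question (a regular column A plus a column named Count) rather than a Series, something along these lines should work. This is only a sketch: the variable names counts and counts_vc are mine, not from the question. Note that value_counts() sorts by frequency rather than by label, so the second variant sorts the index first to match the groupby ordering.

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['aa', 'bb', 'aa', 'dd', 'ff', 'dd', 'aa', 'bb', 'dd', 'cc'],
    'B': ['xx', 'xx', 'yy', 'zz', 'xx', 'xx', 'yy', 'zz', 'zz', 'yy'],
})

# size() returns a Series indexed by 'A'; reset_index(name=...) turns it
# into a DataFrame with 'A' as a normal column and the counts in 'Count'.
counts = df.groupby('A').size().reset_index(name='Count')
print(counts)

# Same result via value_counts(): sort by label, name the index 'A',
# then move it into a column next to the counts.
counts_vc = (
    df['A'].value_counts()
    .sort_index()
    .rename_axis('A')
    .reset_index(name='Count')
)
print(counts_vc)
```

Both variants should give one row per distinct value of A, with the totals 3, 2, 1, 3, 1 for aa, bb, cc, dd, ff.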
DirtyBit

You were taking more steps than necessary. There is no need to group by both 'A' and 'B' before counting. Just group by 'A' and count.

df.groupby(['A']).count()
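
One thing to keep in mind: count() returns a DataFrame with one column per remaining column (here B) and only counts non-missing values, whereas size() counts rows. For the data in the question the numbers are the same, but they can differ when B has missing entries. A small sketch with made-up data (the None value is my own addition for illustration, not part of the question):

```python
import pandas as pd

# Hypothetical data: one missing value in 'B' for group 'aa'
df = pd.DataFrame({
    'A': ['aa', 'aa', 'bb'],
    'B': ['xx', None, 'yy'],
})

# count() tallies non-missing values of each remaining column per group
by_count = df.groupby('A').count()
print(by_count)

# size() tallies rows per group, regardless of missing values
by_size = df.groupby('A').size()
print(by_size)
```

Here count() should report 1 for aa (the missing B is skipped) while size() reports 2, so size() is the safer choice if you want the number of rows per value of A.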
luis.galdo