2

I group a pandas dataframe using groupby() function with multiple columns.

df_tr_mod = df_tr.groupby(['Col1','Col2']).aCol.agg(['count'])

Now I want to access this count values (I want to multiply this all count values by 10) How i can do this?

Paul Rooney
  • 20,879
  • 9
  • 40
  • 61
GihanDB
  • 591
  • 2
  • 6
  • 23
  • That's not a groupby object. You have already done some aggregation so it should return a DataFrame. You can do regular operations (`df_tr_mod * 10`) on that DataFrame? – ayhan Jun 29 '17 at 04:10
  • can you provide me dataframe – ammy Jun 29 '17 at 04:26

2 Answers2

1

apply groupby on both fields 'Col1', 'Col2' with agg function for count, here new 'count' field added at the same time count value multiply with 10.

df_tr_mod = df_tr.groupby(['Col1','Col2']).aCol.agg(['count'])*10
ammy
  • 618
  • 1
  • 5
  • 13
1

I think you need GroupBy.size, agg is better if aggregate by multiple functions:

What is the difference between size and count in pandas?

df_tr = pd.DataFrame({'Col1':[1,2,1,2,2],
                      'Col2':[5,5,5,6,6],
                      'aCol':[1,8,9,6,4]})
print(df_tr)
   Col1  Col2  aCol
0     1     5     1
1     2     5     8
2     1     5     9
3     2     6     6
4     2     6     4

#your solution, only multiple 10 
df_tr_mod = df_tr.groupby(['Col1','Col2']).aCol.agg(['count']) * 10
print (df_tr_mod)
           count
Col1 Col2       
1    5        20
2    5        10
     6        20

print (type(df_tr_mod))
<class 'pandas.core.frame.DataFrame'>

#for MultiIndex add to_frame
df_tr_mod = df_tr.groupby(['Col1','Col2']).size().to_frame(name='count') * 10
print (df_tr_mod)
           count
Col1 Col2       
1    5        20
2    5        10
     6        20

#for all columns from index add reset_index() 
df_tr_mod = df_tr.groupby(['Col1','Col2']).size().reset_index(name='count') 
df_tr_mod["count"]= df_tr_mod["count"]*10
print (df_tr_mod)
   Col1  Col2  count
0     1     5     20
1     2     5     10
2     2     6     20

Better using agg function:

df_tr_mod = df_tr.groupby(['Col1','Col2']).aCol.agg(['size', 'sum', 'mean'])
print (df_tr_mod)
           size  sum  mean
Col1 Col2                 
1    5        2   10     5
2    5        1    8     8
     6        2   10     5
Graham
  • 7,431
  • 18
  • 59
  • 84
jezrael
  • 822,522
  • 95
  • 1,334
  • 1,252