How can I use both sum() and count() for a groupby in pandas?

Question

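import pandas
df = pandas.DataFrame(processed_data_format, columns=["file_name", "innings", "over", "ball", "individual ball", "runs", "batsman", "wicket_status", "bowler_name", "fielder_name"])
# Select several columns with a list; the bare 'runs','ball' tuple form fails in recent pandas
df.groupby(['batsman'])[['runs', 'ball']].sum()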

Using this, I get a result like:

a 30 29
b 4  1
c 10 15

I would also like to get the count of the file_name column alongside the result of the code above. The final result should look like:

a 30 29 2
b 4  1  1
c 10 15 2

http://stackoverflow.com/questions/29127376/pandas-using-multiple-functions-in-a-group-by Possible duplicate, not to mention covered exactly here http://pandas.pydata.org/pandas-docs/stable/groupby.html#applying-multiple-functions-at-once — Chris, Jan 28 '16 at 08:17
@jezrael processed_data_format is a list of lists with no default column names, which is why I gave each column a name. processed_data_format looks like [[102032 , 1 , 1 , 1 , 0.3 , 4 , Sachin , Caught , Akthar , Malik] ,[102032 , 1 , 1 , 1 , 0.3 , 4 , Sachin , Caught , Akthar , Malik] .....................] — Edwin Baby, Jan 28 '16 at 08:19
Possible duplicate of [Apply multiple functions to multiple groupby columns](http://stackoverflow.com/questions/14529838/apply-multiple-functions-to-multiple-groupby-columns) — Kartik, Jan 28 '16 at 08:21

score 0 · Answer 1 · answered Jan 28 '16 at 10:11

df = pandas.DataFrame(processed_data_format, columns=["file_name", "innings", "over", "ball", "individual ball", "runs", "batsman", "wicket_status", "bowler_name", "fielder_name"])
a = {'runs': ['sum'], 'ball': ['sum'], 'file_name': ['nunique']}
t = df.groupby('batsman').agg(a)

There is no need to use count() here. Use nunique instead of count to get the number of unique file_name values for each batsman.
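To make the difference between count and nunique concrete, here is a minimal sketch on made-up data. The sample rows and the output column names (runs, balls, files, rows) are hypothetical, and the keyword form of agg (named aggregation) assumes pandas 0.25 or later. The dict form shown above should work the same way but produces two-level column labels.

```python
import pandas as pd

# Hypothetical sample rows; only the columns needed for the aggregation.
sample = pd.DataFrame({
    "file_name": [102032, 102032, 102040, 102032, 102032, 102040],
    "batsman": ["a", "a", "a", "b", "c", "c"],
    "runs": [4, 6, 20, 4, 4, 6],
    "ball": [1, 1, 1, 1, 1, 1],
})

# Named aggregation: output_column=(input_column, function)
summary = sample.groupby("batsman").agg(
    runs=("runs", "sum"),
    balls=("ball", "sum"),
    files=("file_name", "nunique"),  # distinct file_name values per batsman
    rows=("file_name", "count"),     # non-null file_name entries per batsman
)
print(summary)
```

In this sample, batsman a appears in three rows spread over two files, so count gives 3 while nunique gives 2. Which one is wanted depends on whether rows or distinct files should be counted.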
