I have a DataFrame like this:

Team      Player      Goals   YellowCards   RedCards
Team1     Player1     2       1             1
Team1     Player2     3       1             0
Team2     Player3     2       2             1

I'm trying to calculate the sum of Goals, YellowCards and RedCards for each team and create a new DataFrame with the result. I have tried:

pd.crosstab(df['Team'],[df['Goals'],df['YellowCards'],df['RedCards']], aggfunc='sum')

But it's not working. Preferably I would like to do this with either the crosstab or the pivot_table function. Any advice is highly appreciated.

Mr. Engineer

2 Answers

Since you need DataFrame.pivot_table, the simplest solution is:

df = df.pivot_table(index='Team',aggfunc='sum')
print (df)
       Goals  RedCards  YellowCards
Team                               
Team1      5         1            2
Team2      2         1            2

This works like an aggregate sum with groupby:

df = df.groupby('Team').sum()
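
To check that the two calls agree, here is a minimal sketch using the sample data from the question. Dropping the text column `Player` and passing `numeric_only=True` are my own additions, not part of the original answer; depending on the pandas version, a text column may be carried along (concatenated) rather than silently dropped, so the sketch excludes it explicitly.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Team': ['Team1', 'Team1', 'Team2'],
    'Player': ['Player1', 'Player2', 'Player3'],
    'Goals': [2, 3, 2],
    'YellowCards': [1, 1, 2],
    'RedCards': [1, 0, 1],
})

# pivot_table with only an index: one row per team, summed values.
# Player is dropped first so that only numeric columns are aggregated.
by_pivot = df.drop(columns='Player').pivot_table(index='Team', aggfunc='sum')

# The groupby equivalent, restricted to numeric columns.
by_groupby = df.groupby('Team').sum(numeric_only=True)

# pivot_table sorts its columns alphabetically, so align the column
# order before comparing the two results.
print(by_pivot.equals(by_groupby[by_pivot.columns]))
```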

EDIT: If you need to specify the columns:

df = df.pivot_table(index='Team',aggfunc='sum',values=['Goals','RedCards','YellowCards'])
print (df)
       Goals  RedCards  YellowCards
Team                               
Team1      5         1            2
Team2      2         1            2

This works like:

df = df.groupby('Team')[['Goals','RedCards','YellowCards']].sum()
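
If the frame has many columns and only the numeric ones should be summed, the list passed to `values` does not have to be typed by hand. The following is a sketch under my own assumptions: `select_dtypes` is used to collect the numeric column names, and the variable names `numeric_cols` and `result` are purely illustrative.

```python
import pandas as pd

# Sample data copied from the question; Player is a text column.
df = pd.DataFrame({
    'Team': ['Team1', 'Team1', 'Team2'],
    'Player': ['Player1', 'Player2', 'Player3'],
    'Goals': [2, 3, 2],
    'YellowCards': [1, 1, 2],
    'RedCards': [1, 0, 1],
})

# Collect the names of all numeric columns (Team and Player are excluded).
numeric_cols = df.select_dtypes(include='number').columns.tolist()

# Sum only those columns per team.
result = df.pivot_table(index='Team', values=numeric_cols, aggfunc='sum')
print(result)
```

The same list can be used with `df.groupby('Team')[numeric_cols].sum()`.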
jezrael
    This seems to work, but what if I have multiple columns containing non-integer values? Then I would need to select only certain columns. Giving an index list to @Anurag's values-attribute unfortunately doesn't work. – Mr. Engineer Mar 02 '21 at 11:36
I added a Totals column (the sum across each row) and a Grand Total row (the sum of each column):

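import pandas as pd

data = [('Team1', 'Player1', 2, 1, 1),
        ('Team1', 'Player2', 3, 1, 0),
        ('Team2', 'Player3', 2, 2, 1)]

df = pd.DataFrame(data=data, columns=['Team', 'Player', 'Goals', 'YellowCards', 'RedCards'])

# Sum only the numeric columns; Player is text and should not be aggregated.
fp = df.pivot_table(index='Team', values=['Goals', 'YellowCards', 'RedCards'], aggfunc='sum')
fp['Totals'] = fp.sum(axis='columns')  # per-team total across the three columns
fp.loc['Grand Total', :] = fp.sum()    # append a row holding the column sums
print(fp)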

Output:

             Goals  RedCards  YellowCards  Totals
Team
Team1          5.0       1.0          2.0     8.0
Team2          2.0       1.0          2.0     5.0
Grand Total    7.0       2.0          4.0    13.0
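
The output above shows the counts as floats (5.0, 1.0, ...). One possible way to keep them as integers is to build the total row as its own one-row frame and append it with `pd.concat` instead of assigning through `.loc`. This is a sketch of that alternative, not the method used in the answer; the name `grand` is illustrative.

```python
import pandas as pd

# Sample data copied from the question.
data = [('Team1', 'Player1', 2, 1, 1),
        ('Team1', 'Player2', 3, 1, 0),
        ('Team2', 'Player3', 2, 2, 1)]
df = pd.DataFrame(data, columns=['Team', 'Player', 'Goals', 'YellowCards', 'RedCards'])

cols = ['Goals', 'YellowCards', 'RedCards']
fp = df.pivot_table(index='Team', values=cols, aggfunc='sum')
fp['Totals'] = fp.sum(axis='columns')  # per-team total

# Column sums as a one-row DataFrame labelled 'Grand Total'.
grand = fp.sum().to_frame('Grand Total').T

# Appending an integer row to an integer frame keeps the integer dtype.
fp = pd.concat([fp, grand])
print(fp)
```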
Golden Lion