1

I have a DataFrame d of games played of the game Go. The important columns for these are the player name, and if they won or not that particular game. How can I make a new DataFrame with a column that is Player name (without repeating names), the total games they played and the total number they won. Example:

import pandas as pd

d = {'Nombres': ['pepe','tito','pepe','pepe'], 'col2': ['Win','Lost','Win','Lost']}
df2 = pd.DataFrame(data=d)

This is what I currently do, counting the number of games played per user:

df3 = df2.groupby(['Jugador']).size().reset_index(name='count')

How can I add the column of how many games each player won?

smci
  • 32,567
  • 20
  • 113
  • 146
mmazz
  • 51
  • 4
  • 1
    Do you mean `df2.Nombres.value_counts()` – piRSquared Jan 24 '20 at 18:31
  • This would be called *"aggregate Wins by Player"* or *"tabulate Wins and Matches by Player"*. Also, you could name 'col2' ('Win'/'Lost' column) 'Result'. – smci Jan 24 '20 at 22:53

2 Answers2

3

Few ways (feel free to add more)

1

df2.groupby(['Nombres','col2'])['col2'].count().to_frame()
              col2
Nombres col2      
pepe    Lost     1
        Win      2
tito    Lost     1

2

pd.crosstab(df2.Nombres,df2.col2,df2.col2,aggfunc='count').fillna(0)
col2     Lost  Win
Nombres           
pepe      1.0  2.0
tito      1.0  0.0

3

df2.loc[df2['col2'] == 'Win']['Nombres'].value_counts()
pepe    2
Name: Nombres, dtype: int64

4

df2.loc[df2['col2'] == 'Win'].groupby('Nombres').size()
Nombres
pepe    2
dtype: int64

5

df2['win_counter'] = df2.groupby('Nombres')['col2'].apply(lambda x : x.eq('Win').cumsum())
print(df2)
     Nombres  col2    win_counter
0    pepe     Win              1
1    tito     Lost             0
2    pepe     Win              2
3    pepe     Lost             2
Umar.H
  • 22,559
  • 7
  • 39
  • 74
2

IIUC, DataFrame.pivot_table

new_df = (df2.pivot_table(index = 'Nombres',columns = 'col2',
                          values = 'col2',
                          aggfunc = 'size',
                          fill_value = 0)
            #.loc[:,['Win']] #if you want drop the column of Lost
             .assign(Total_Match=lambda x: x.sum(axis=1))
             .rename_axis(columns = None)
             .reset_index())
print(new_df)

Output

  Nombres  Lost  Win  Total_Match
0    pepe     1    2            3
1    tito     1    0            1
Community
  • 1
  • 1
ansev
  • 30,322
  • 5
  • 17
  • 31