
I have a CSV file, "players.csv", containing player attributes: "Name, Age, Nationality, Overall, Potential, Club, Value ...".

My task is to compute the total value of each club by adding up the values of all its players.

So far I get the desired outcome with the following solution. My problem is that it takes a very long time to run because of the two nested for loops.

Is there a more efficient way to solve the problem? (The DataFrame has 14,700 players.)

import pandas as pd
pd.options.mode.chained_assignment = None  # default='warn'

# Load the data with only the Club Name and Value of the player
df = pd.read_csv('./players.csv',usecols=['Club','Value'])

# Create a new DataFrame that will hold the value of each club:
# drop all duplicate clubs so that every available club appears exactly once,
# then drop the column 'Value' and add 'Club_Value'
df_Clubs = df.drop_duplicates('Club').drop('Value',axis=1)
df_Clubs['Club_Value']=0

df = df.sort_values(by=["Club"])

# Iterate through the players DataFrame, getting each row's index and contents
for rowdf, valuedf in df.iterrows():
    # Iterate through the new DataFrame that holds only the unique clubs
    for row, value in df_Clubs.iterrows():
        if valuedf["Club"] == value["Club"]:
            # When the player's club matches a row in the unique-clubs DataFrame,
            # add the player's value to that club's value
            ValueClub_old = df_Clubs["Club_Value"][row]
            ValuePlayer = df["Value"][rowdf]
            ValueClub_new = ValueClub_old + ValuePlayer
            df_Clubs["Club_Value"][row] = ValueClub_new

# save the new dataframe
df_Clubs.to_csv(r'Players_Value.csv', index = False)
df.head()
print(df_Clubs)
maja95

1 Answer


Use groupby on 'Club' and sum 'Value'. This replaces both loops with a single vectorized operation.

df_new = df.groupby('Club')['Value'].sum().reset_index()
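
To make the one-liner concrete, here is a minimal, self-contained sketch. The sample rows are made up for illustration and stand in for players.csv, and it assumes the 'Value' column is already numeric. The rename to 'Club_Value' is only there to match the column name used in the question.

```python
import pandas as pd

# Hypothetical sample data standing in for players.csv (not real player values)
df = pd.DataFrame({
    "Club": ["FC Barcelona", "Juventus", "FC Barcelona", "Juventus", "Ajax"],
    "Value": [100.0, 80.0, 50.0, 20.0, 10.0],
})

# Group the rows by club and sum each group's 'Value' column.
# as_index=False keeps 'Club' as a regular column instead of the index.
df_clubs = (
    df.groupby("Club", as_index=False)["Value"]
      .sum()
      .rename(columns={"Value": "Club_Value"})
)

print(df_clubs)
```

With the real file, the same chain can be applied to the DataFrame returned by read_csv, and the result written out with to_csv as in the question. Note that groupby skips rows whose 'Club' is missing by default, so such players do not contribute to any total.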
Suhas Mucherla