
Hello, I would like to sort the Bundesliga table by each team's points rather than by team name. Can someone help me? Currently I can only sort the output by team name. I don't know how to sort by points after the for loop.

import urllib.request

import pandas as pd

def push2(self):
    # Download the 2018-19 Bundesliga results
    urllib.request.urlretrieve(
        "http://www.football-data.co.uk/mmz4281/1819/d1.csv",
        "2018-19.csv")
    # Read the results
    df = pd.read_csv("2018-19.csv")
    teams = df["HomeTeam"].drop_duplicates()
    # Sorted by team name (descending); this is the only sort so far
    teams = teams.sort_values(ascending=False).tolist()
    for name in teams:
        # All matches the team played, home or away
        team = df[(df["HomeTeam"] == name) | (df["AwayTeam"] == name)]

        # Points from wins (3 each)
        teamWin = team[((team["FTR"] == "H") & (team["HomeTeam"] == name)) |
                       ((team["FTR"] == "A") & (team["AwayTeam"] == name))]
        teamTotalPoints = len(teamWin.index) * 3

        # Points from draws (1 each)
        teamD = team[team["FTR"] == "D"]
        teamDTotal = len(teamD.index)

        # Total points from wins and draws
        teamT = teamTotalPoints + teamDTotal

        print(str(teamT) + ": " + name)
KeyP9
  • Possible duplicate of [How to sort a dataFrame in python pandas by two or more columns?](https://stackoverflow.com/questions/17141558/how-to-sort-a-dataframe-in-python-pandas-by-two-or-more-columns) – stovfl Sep 28 '18 at 12:30
  • I really can't follow what this code is doing. Why do you have `df = pd.read_csv("2018-19.csv")` inside a `for` loop? – roganjosh Sep 28 '18 at 12:34
  • I want to sort the print line by points – KeyP9 Sep 28 '18 at 13:06

1 Answer


I would recommend taking a closer look at the pandas documentation, particularly the groupby and merge functions. Their examples cover pieces of the answer you are looking for. For more on what merge can do, take a look at this article, which has plenty of examples of the different join combinations.
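As a rough illustration of those two functions, here is a minimal sketch. The match data is invented for the example (it is not real Bundesliga data), and `how="outer"` is my choice to show how teams missing from one side are kept.

```python
import pandas as pd

# Toy match results, invented for illustration only.
matches = pd.DataFrame({
    "HomeTeam": ["A", "B", "A"],
    "AwayTeam": ["B", "C", "C"],
    "FTHG": [2, 0, 1],
    "FTAG": [1, 0, 3],
})

# groupby: one row per home team with its summed home goals.
home = matches.groupby("HomeTeam", as_index=False)["FTHG"].sum()
home = home.rename(columns={"HomeTeam": "Team"})

# Same idea for away goals.
away = matches.groupby("AwayTeam", as_index=False)["FTAG"].sum()
away = away.rename(columns={"AwayTeam": "Team"})

# merge: join both tables on the shared Team column.
# how="outer" keeps teams that appear on only one side; fill their gaps with 0.
merged = pd.merge(home, away, on="Team", how="outer").fillna(0)
print(merged)
```

With the default `how="inner"`, a team that never played at home (or never away) in the data would be dropped from the result.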

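That being said, below is a way to use just pandas to get each club's total goals in this league, ordered from highest to lowest. Note that it ranks clubs by goals scored (FTHG plus FTAG), not by points; the same groupby-and-sort pattern applies to points.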

Some assumptions I made:

  • The Python version used is 2.7
  • The Pandas version used is at least 0.19.
  • After doing some grouping and sorting, the end result is a new dataframe which can be manipulated.
  • The file is already loaded into your python script.
  • The columns are based off of http://www.football-data.co.uk/notes.txt
  • Assumes no errors in the HomeTeam column.
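If you want to check the column assumption before running anything, a small sketch like the following could help. The helper name `missing_columns` and the `REQUIRED` list are hypothetical names I made up; the column names themselves are the ones used in this thread.

```python
import pandas as pd

# Columns used in this thread (see notes.txt for their meaning):
# FTHG/FTAG = full-time home/away goals, FTR = full-time result (H, D or A).
REQUIRED = ["HomeTeam", "AwayTeam", "FTHG", "FTAG", "FTR"]

def missing_columns(df, required=REQUIRED):
    """Return the required column names that df does not have."""
    return [col for col in required if col not in df.columns]

# Example with an empty frame that lacks two of the columns.
example = pd.DataFrame(columns=["HomeTeam", "AwayTeam", "FTHG"])
print(missing_columns(example))
```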

The function below should be reusable as long as your dataframe has the same structure as the file you linked.

import pandas as pd

def sortTable(df):
    """
    Returns a pd.DataFrame with goals scored by teams, sorted by total goals.

    df: pd.DataFrame, raw dataframe taken from .csv.
    """
    # groups by the Home team name, sums the FTHG, resets the grouping object indexing
    home_goals = df.groupby(['HomeTeam'])[['FTHG']].sum().reset_index()

    # rename the HomeTeam column to 'Team', column shared by tables to be merged
    home_goals.rename(columns = {'HomeTeam': 'Team'}, inplace = True)

    # groups by the away team name, sums the FTAG, resets the grouping object indexing
    away_goals = df.groupby(['AwayTeam'])[['FTAG']].sum().reset_index()
    away_goals.rename(columns = {'AwayTeam': 'Team'}, inplace = True)

    # merge the 2 tables by the team name
    goals_table = pd.merge(home_goals, away_goals, on='Team')
    goals_table['FTG'] = goals_table['FTHG'] + goals_table['FTAG']

    return goals_table.sort_values('FTG', ascending=False)


# ------ Run Function Example ------
df_old = pd.read_csv('path_to_csv')
df_new = sortTable(df_old)
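
Since the question asks for points rather than goals, here is a sketch of the same idea applied to points. It assumes the usual 3 points for a win and 1 for a draw, and that FTR holds "H", "D" or "A" as described in notes.txt. The function name `points_table` and the toy data are mine, for illustration only.

```python
import pandas as pd

def points_table(df):
    """Return a DataFrame of teams and their points, highest points first."""
    # Points earned by the home side and the away side of each match.
    home_pts = df["FTR"].map({"H": 3, "D": 1, "A": 0})
    away_pts = df["FTR"].map({"H": 0, "D": 1, "A": 3})

    # Stack both sides so every match contributes one row per team.
    per_match = pd.concat([
        pd.DataFrame({"Team": df["HomeTeam"], "Points": home_pts}),
        pd.DataFrame({"Team": df["AwayTeam"], "Points": away_pts}),
    ])

    # Sum per team, then sort by points instead of by name.
    table = per_match.groupby("Team", as_index=False)["Points"].sum()
    return table.sort_values("Points", ascending=False).reset_index(drop=True)

# Toy results, invented for illustration (not real Bundesliga data).
toy = pd.DataFrame({
    "HomeTeam": ["A", "B", "C"],
    "AwayTeam": ["B", "C", "A"],
    "FTR": ["H", "D", "A"],
})
table = points_table(toy)
print(table)
```

With the real file, `points_table(pd.read_csv("2018-19.csv"))` should give the table sorted by points; looping over its rows replaces the name-sorted loop in the question.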
Kai