
I'm trying to create a dataframe with the columns 'team', 'games', 'wins', 'losses' and 'ties'.

Here's a snippet of the data:

[{'away_games': {'games': 4, 'losses': 2, 'ties': 0, 'wins': 2},
  'conference': 'Mountain West',
  'conference_games': {'games': 8, 'losses': 3, 'ties': 0, 'wins': 5},
  'division': 'Mountain',
  'expected_wins': 9.9,
  'home_games': {'games': 7, 'losses': 1, 'ties': 0, 'wins': 6},
  'team': 'Air Force',
  'total': {'games': 13, 'losses': 3, 'ties': 0, 'wins': 10},
  'year': 2022},
 {'away_games': {'games': 8, 'losses': 6, 'ties': 0, 'wins': 1},
  'conference': 'Mid-American',
  'conference_games': {'games': 9, 'losses': 7, 'ties': 0, 'wins': 1},
  'division': 'East',
  'expected_wins': 1.5,
  'home_games': {'games': 5, 'losses': 4, 'ties': 0, 'wins': 1},
  'team': 'Akron',
  'total': {'games': 13, 'losses': 10, 'ties': 0, 'wins': 2},
  'year': 2022},

Here's the code I tried:

# Create an empty DataFrame
df = pd.DataFrame(columns=['team', 'games', 'wins', 'losses', 'ties'])

# Loop through each record in the data
for record in data:
    try:
        # Extract the desired values
        team = record['team']
        games = record['total'].get['games']
        wins = record['total'].get['wins']
        losses = record['total'].get['losses']
        ties = record['total'].get['ties']
        
        # Create a new row with the extracted values
        new_row = {'team': team, 'games': games, 'wins': wins, 'losses': losses, 'ties': ties}
        
        # Append the new row to the DataFrame
        df = df.append(new_row, ignore_index=True)
    
    except KeyError as e:
        print(f"Skipping record due to missing key: {e}")

# Print the resulting DataFrame
print(df)

I'm getting an error saying that the 'TeamRecord' object is not subscriptable.

I'm sure there's a better / easier way to do this. Any advice would be much appreciated.

1 Answer

That's how it's supposed to look:

import pandas as pd

data=[{'away_games': {'games': 4, 'losses': 2, 'ties': 0, 'wins': 2},
  'conference': 'Mountain West',
  'conference_games': {'games': 8, 'losses': 3, 'ties': 0, 'wins': 5},
  'division': 'Mountain',
  'expected_wins': 9.9,
  'home_games': {'games': 7, 'losses': 1, 'ties': 0, 'wins': 6},
  'team': 'Air Force',
  'total': {'games': 13, 'losses': 3, 'ties': 0, 'wins': 10},
  'year': 2022},
 {'away_games': {'games': 8, 'losses': 6, 'ties': 0, 'wins': 1},
  'conference': 'Mid-American',
  'conference_games': {'games': 9, 'losses': 7, 'ties': 0, 'wins': 1},
  'division': 'East',
  'expected_wins': 1.5,
  'home_games': {'games': 5, 'losses': 4, 'ties': 0, 'wins': 1},
  'team': 'Akron',
  'total': {'games': 13, 'losses': 10, 'ties': 0, 'wins': 2},
  'year': 2022}]

rows = []
# Loop through each record in the data
for record in data:
    try:
        # Extract the desired values
        team = record['team']
        games = record['total']['games']
        wins = record['total']['wins']
        losses = record['total']['losses']
        ties = record['total']['ties']

        # Create a new row with the extracted values
        new_row = {'team': team, 'games': games, 'wins': wins, 'losses': losses, 'ties': ties}
        rows.append(new_row)

    except KeyError as e:
        print(f"Skipping record due to missing key: {e}")

# Build the DataFrame from the collected rows, then print it
df = pd.DataFrame(rows, columns=['team', 'games', 'wins', 'losses', 'ties'])
print(df)

Output:

        team  games  wins  losses  ties
0  Air Force     13    10       3     0
1      Akron     13     2      10     0

It also looks like your data is borked: wins, losses, and ties should add up to the number of games played. That's not the case for Akron, whose 'total' lists 2 wins + 10 losses + 0 ties = 12, but 13 games.
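A quick way to spot such records is to compare the sum against 'games' for each team. This is only a sketch on the two 'total' entries from the question; the helper name inconsistent_teams is made up for illustration.

```python
# 'total' entries copied from the question's sample data
data = [
    {'team': 'Air Force', 'total': {'games': 13, 'losses': 3, 'ties': 0, 'wins': 10}},
    {'team': 'Akron', 'total': {'games': 13, 'losses': 10, 'ties': 0, 'wins': 2}},
]


def inconsistent_teams(records):
    """Return the teams whose wins + losses + ties differ from games."""
    bad = []
    for record in records:
        total = record['total']
        if total['wins'] + total['losses'] + total['ties'] != total['games']:
            bad.append(record['team'])
    return bad


print(inconsistent_teams(data))  # ['Akron']
```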

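Don't write .get['games']: get is a method, so it has to be called with parentheses, as in record['total'].get('games'), or you can simply index the dict directly with record['total']['games']. See also "Create a Pandas Dataframe by appending one row at a time" regarding DataFrame.append, which was deprecated in pandas 1.4 and removed in pandas 2.0.0. Appending to a DataFrame in a loop is bad practice in most cases, because every call copies the whole frame; collect the rows in a list and build the DataFrame once.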
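If the records really are plain dicts, the loop can probably be skipped altogether. The following is a minimal sketch, assuming every record has a 'team' key and a nested 'total' dict; it uses pd.json_normalize to flatten the nesting and then keeps only the wanted columns.

```python
import pandas as pd

# Trimmed copy of the question's sample data
data = [
    {'team': 'Air Force', 'year': 2022,
     'total': {'games': 13, 'losses': 3, 'ties': 0, 'wins': 10}},
    {'team': 'Akron', 'year': 2022,
     'total': {'games': 13, 'losses': 10, 'ties': 0, 'wins': 2}},
]

# Flatten nested dicts into dotted column names such as 'total.games'
flat = pd.json_normalize(data)

# Keep the team name plus the 'total' columns, then drop the 'total.' prefix
wanted = ['team', 'total.games', 'total.wins', 'total.losses', 'total.ties']
df = flat[wanted].rename(columns=lambda name: name.split('.')[-1])
print(df)
```

Unlike the loop with try/except, this raises a KeyError if a wanted column is missing from every record, so it fits best when the data is known to be complete.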

Bracula
  • Thanks for the input! The data matches the source, so I'll let the maintainer know about it - https://collegefootballdata.com/exporter/records?year=2022&team=Akron. After re-running, I'm still getting the same TypeError: 'TeamRecord' object is not subscriptable – WillyTheWalrus_123 Jul 09 '23 at 23:35
  • @WillyTheWalrus_123 Does it solve your problem now? Eh, I didn't see your edit. – Bracula Jul 09 '23 at 23:37
  • @WillyTheWalrus_123 I've edited the answer, could you try and run the code in a clear Python instance? – Bracula Jul 09 '23 at 23:41
  • When 'data' is explicitly passed as shown above, your code works perfectly. Thanks for that. I updated my full list to be referenced like this, and now it works for the whole thing and I avoid the TypeError message. – WillyTheWalrus_123 Jul 10 '23 at 00:05
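
For anyone hitting the same TypeError: "not subscriptable" means the records are objects rather than dicts, so record['team'] fails while attribute access works. Below is a rough sketch of that situation. The TeamRecord and Totals classes are hypothetical stand-ins written as dataclasses, not the real classes from the API client, which may offer its own conversion method; check its documentation.

```python
from dataclasses import dataclass, asdict

import pandas as pd


# Hypothetical stand-ins for the API's record objects (not the real classes)
@dataclass
class Totals:
    games: int
    wins: int
    losses: int
    ties: int


@dataclass
class TeamRecord:
    team: str
    total: Totals


records = [
    TeamRecord('Air Force', Totals(games=13, wins=10, losses=3, ties=0)),
    TeamRecord('Akron', Totals(games=13, wins=2, losses=10, ties=0)),
]

# records[0]['team'] would raise:
#   TypeError: 'TeamRecord' object is not subscriptable
# Attribute access works on objects; asdict() turns a dataclass into a dict
rows = [{'team': r.team, **asdict(r.total)} for r in records]

df = pd.DataFrame(rows, columns=['team', 'games', 'wins', 'losses', 'ties'])
print(df)
```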