Efficient and more pandas way of doing this
On pandas >= 0.25.1
teamgamedataresp = requests.get(teamgamedata_url)
d = teamgamedataresp.json()
# errors='ignore' used because some records may not have id, in that case it will throw error, I'm ignoring it here
teams_df = pd.io.json.json_normalize(d, ['teams'], ['id'], errors='ignore')
print(teams_df)
teams_df = teams_df.explode('stats')
print(teams_df)
stats_df = pd.io.json.json_normalize(teams_df['stats'])
print(stats_df)
teams_df.drop(columns=['stats'], inplace=True)
data = pd.concat([teams_df.reset_index(drop=True), stats_df.reset_index(drop=True)], axis=1)
print(data)
TL;DR (Data displayed for understanding purposes)
Normalizing first level of records i.e. teams
school conference homeAway points stats id
0 Vanderbilt SEC home 6 [{'category': 'rushingTDs', 'stat': '0'}, {'ca... 401110732
1 Georgia SEC away 30 [{'category': 'rushingTDs', 'stat': '2'}, {'ca... 401110732
2 Miami ACC home 20 [{'category': 'rushingTDs', 'stat': '1'}, {'ca... 401110723
3 Florida SEC away 24 [{'category': 'rushingTDs', 'stat': '1'}, {'ca... 401110723
4 Georgia State Sun Belt away 38 [{'category': 'rushingTDs', 'stat': '3'}, {'ca... 401110730
.. ... ... ... ... ... ...
163 Navy American Athletic home 45 [{'category': 'rushingTDs', 'stat': '6'}, {'ca... 401117857
164 Gardner-Webb None away 28 [{'category': 'rushingTDs', 'stat': '3'}, {'ca... 401135910
165 Charlotte Conference USA home 49 [{'category': 'rushingTDs', 'stat': '4'}, {'ca... 401135910
166 Alabama State None away 19 [{'category': 'rushingTDs', 'stat': '1'}, {'ca... 401114237
167 UAB Conference USA home 24 [{'category': 'rushingTDs', 'stat': '1'}, {'ca... 401114237
[168 rows x 6 columns]
Exploding list in stats column into rows
school conference homeAway points stats id
0 Vanderbilt SEC home 6 {'category': 'rushingTDs', 'stat': '0'} 401110732
0 Vanderbilt SEC home 6 {'category': 'passingTDs', 'stat': '0'} 401110732
0 Vanderbilt SEC home 6 {'category': 'kickReturnYards', 'stat': '35'} 401110732
0 Vanderbilt SEC home 6 {'category': 'kickReturnTDs', 'stat': '0'} 401110732
0 Vanderbilt SEC home 6 {'category': 'kickReturns', 'stat': '2'} 401110732
.. ... ... ... ... ... ...
167 UAB Conference USA home 24 {'category': 'netPassingYards', 'stat': '114'} 401114237
167 UAB Conference USA home 24 {'category': 'totalYards', 'stat': '290'} 401114237
167 UAB Conference USA home 24 {'category': 'fourthDownEff', 'stat': '0-1'} 401114237
167 UAB Conference USA home 24 {'category': 'thirdDownEff', 'stat': '1-13'} 401114237
167 UAB Conference USA home 24 {'category': 'firstDowns', 'stat': '16'} 401114237
[3927 rows x 6 columns]
Normalize the stats column to get a dataframe
category stat
0 rushingTDs 0
1 passingTDs 0
2 kickReturnYards 35
3 kickReturnTDs 0
4 kickReturns 2
... ... ...
3922 netPassingYards 114
3923 totalYards 290
3924 fourthDownEff 0-1
3925 thirdDownEff 1-13
3926 firstDowns 16
[3927 rows x 2 columns]
Finally, Merge both dataframes.
school conference homeAway points id category stat
0 Vanderbilt SEC home 6 401110732 rushingTDs 0
1 Vanderbilt SEC home 6 401110732 passingTDs 0
2 Vanderbilt SEC home 6 401110732 kickReturnYards 35
3 Vanderbilt SEC home 6 401110732 kickReturnTDs 0
4 Vanderbilt SEC home 6 401110732 kickReturns 2
... ... ... ... ... ... ... ...
3922 UAB Conference USA home 24 401114237 netPassingYards 114
3923 UAB Conference USA home 24 401114237 totalYards 290
3924 UAB Conference USA home 24 401114237 fourthDownEff 0-1
3925 UAB Conference USA home 24 401114237 thirdDownEff 1-13
3926 UAB Conference USA home 24 401114237 firstDowns 16
[3927 rows x 7 columns]