- Create two dataframes selected by the unique values in the
'homeAway'
column, 'home'
and 'away'
, using Boolean indexing.
- Drop the obsolete
'homeAway'
column
- Rename the appropriate columns with a
'home_'
, and 'away_'
prefix.
- This can be done in a for-loop, with each dataframe added to a list, which can be consolidated into a simple list-comprehension.
- Use
pd.merge
to combine the two dataframes on the common 'game_id'
column.
import pandas as pd
# test dataframe
data = {'game_id': [332410041, 332410041, 662410041, 662410041, 772410041, 772410041],
'school': ['Connecticut', 'Towson', 'NY', 'CA', 'FL', 'AL'],
'conference': ['American Athletic', 'CAA', 'a', 'b', 'c', 'd'],
'homeAway': ['home', 'away', 'home', 'away', 'home', 'away'], 'points': [18, 33, 1, 2, 3, 4]}
df = pd.DataFrame(data)
# create list of dataframes
dfl = [(df[df.homeAway.eq(loc)]
.drop('homeAway', axis=1)
.rename({'school': f'{loc}_school',
'conference': f'{loc}_conference',
'points': f'{loc}_points'}, axis=1))
for loc in df.homeAway.unique()]
# combine the dataframes
df_new = pd.merge(dfl[0], dfl[1])
# display(df_new)
game_id home_school home_conference home_points away_school away_conference away_points
0 332410041 Connecticut American Athletic 18 Towson CAA 33
1 662410041 NY a 1 CA b 2
2 772410041 FL c 3 AL d 4