Here is one way that relies on the melt()
method to first make a long table out of the original and then on the pivot()
method to transform it to the new wide format:
import pandas as pd
import numpy as np
# Code that creates your input dataframe (replace column names as needed)
df = pd.DataFrame(
{'school':['school1', 'school2', 'school3', 'school4'],
'g1':['game1', 'game1', 'game2', None],
'g2':['game2', None, 'game3', 'game2'],
'g3':['game3', None, None, None],
}
)
# Convert to long format (one row per school-game)
long_df = df.set_index('school').melt(ignore_index=False).reset_index()
# Remove null (non-existing) school-game combinations
# Also, convert index to column for next step
long_df = long_df[long_df.value.notnull()].reset_index(drop=True).reset_index()
# Convert to dataframe with one row per game ID
by_game_df = long_df.pivot(index='value',columns='index',values='school')
At this point, the dataframe will look like this:
index value 0 1 2 3 4 5 6
0 game1 school1 school2 NaN NaN NaN NaN NaN
1 game2 NaN NaN school3 school1 NaN school4 NaN
2 game3 NaN NaN NaN NaN school3 NaN school1
You can perform these additional steps to shift non-null school values to the left and to remove columns with only NaN's remaining:
# per https://stackoverflow.com/a/65596853:
idx = pd.isnull(by_game_df.values).argsort(axis=1)
squeezed_df = pd.DataFrame(
by_game_df.values[np.arange(by_game_df.shape[0])[:,None], idx],
index=by_game_df.index,
columns=by_game_df.columns
)
result = squeezed_df.dropna(axis=1, how='all')
result
# index value 0 1 2
# 0 game1 school1 school2 NaN
# 1 game2 school3 school1 school4
# 2 game3 school3 school1 NaN