I want to expand the columns of the following (toy example) pandas DataFrame,
df = pd.DataFrame({'col1': ["A", "A", "A", "B", "B", "B"],
'col2': [1, 7, 3, 2, 9, 4],
'col3': [3, -1, 0, 5, -2, -3],})
col1 col2 col3
0 A 1 3
1 A 7 -1
2 A 3 0
3 B 2 5
4 B 9 -2
5 B 4 -3
such that it will become row-wise,
col1 col2_1 col2_2 col2_3 col3_1 col3_2 col3_3
0 A 1 7 3 3 -1 0
1 B 2 9 4 5 -2 -3
I know that I shall use groupby('col1')
but do not know how to achieve the desired DataFrame.
Note: The number of elements in each group when we perform groupby('col1')
are all equal (in this case we have three A's and three B's)
Edit: I managed to do it by the following code, but it is not efficient,
import pandas as pd
from functools import partial
def func(x, exclude_list):
for col in x.columns:
if col in exclude_list:
continue
for i, value in enumerate(x[col].values):
x[f'{col}_{i+1}'] = value
return x
df = pd.DataFrame({'col1': ["A", "A", "A", "B", "B", "B"],
'col2': [1, 7, 3, 2, 9, 4],
'col3': [3, -1, 0, 5, -2, -3],})
exclude_list = ['col1']
columns_to_expand = ['col2', 'col3']
func2 = partial(func, exclude_list=exclude_list)
df2 = df.groupby(exclude_list).apply(func2)
df2.drop(columns_to_expand, axis=1, inplace=True)
df3 = df2.groupby(exclude_list).tail(1).reset_index()
df3.drop('index', axis=1, inplace=True)
print(df3)
which results in,
col1 col2_1 col2_2 col2_3 col3_1 col3_2 col3_3
0 A 1 7 3 3 -1 0
1 B 2 9 4 5 -2 -3
Edit2: This code, based on ouroboros1
answer works efficiently,
df_pivot = None
for col in columns_to_expand:
df['index'] = [f'{col}_{i}' for i in range(1,4)]*len(np.unique(df[exclude_list].values))
if df_pivot is None:
df_pivot = df.pivot(index=exclude_list, values=col, columns='index').reset_index(drop=False)
else:
df_pivot = df_pivot.merge(df.pivot(index=exclude_list, values=col, columns='index').reset_index(drop=False))