How to reshape dataframe by creating a grouping multiheader from specific column?

Question

I like to reshape a dataframe thats first column should be used to group the other columns by an additional header row.

Initial dataframe

df = pd.DataFrame(
    {
        'col1':['A','A','A','B','B','B'],
        'col2':[1,2,3,4,5,6],
        'col3':[1,2,3,4,5,6],
        'col4':[1,2,3,4,5,6],
        'colx':[1,2,3,4,5,6]
    }
)

Trial: Using pd.pivot() I can create an example, but this do not fit my expected one, it seems to be flipped in grouping:

df.pivot(columns='col1', values=['col2','col3','col4','colx'])

     col2      col3      col4      colx     
col1    A    B    A    B    A    B    A    B
0     1.0  NaN  1.0  NaN  1.0  NaN  1.0  NaN
1     2.0  NaN  2.0  NaN  2.0  NaN  2.0  NaN
2     3.0  NaN  3.0  NaN  3.0  NaN  3.0  NaN
3     NaN  4.0  NaN  4.0  NaN  4.0  NaN  4.0
4     NaN  5.0  NaN  5.0  NaN  5.0  NaN  5.0
5     NaN  6.0  NaN  6.0  NaN  6.0  NaN  6.0

Expected output:

        A                               B
col1    col2    col3    col4    colx    col2    col3    col4    colx
0       1       1       1       1       4       4       4       4
1       2       2       2       2       5       5       5       5
2       3       3       3       3       6       6       6       6

jezrael · Accepted Answer · 2022-09-20T11:28:13.453

3

Create counter column by GroupBy.cumcount, then use DataFrame.pivot with swapping level of MultiIndex in columns by DataFrame.swaplevel, sorting it and last remove index and columns names by DataFrame.rename_axis:

df = (df.assign(g = df.groupby('col1').cumcount())
        .pivot(index='g', columns='col1')
        .swaplevel(0,1,axis=1)
        .sort_index(axis=1)
        .rename_axis(index=None, columns=[None, None]))

print(df)
     A                   B               
  col2 col3 col4 colx col2 col3 col4 colx
0    1    1    1    1    4    4    4    4
1    2    2    2    2    5    5    5    5
2    3    3    3    3    6    6    6    6

edited Sep 20 '22 at 11:28

answered Sep 20 '22 at 11:22

jezrael

822,522
95
1,334
1,252

1

Great and on the point as usal. Played around with `assign()` too, but I had no idea how to proceed with the grouping. So `cumcount()` here is the key experience for me + topping of `swaplevel`. Thanks – HedgeHog Sep 20 '22 at 11:31

score 1 · Answer 2 · answered Sep 20 '22 at 11:35

As an alternative to the classical pivot, you can concat the output of groupby with a dictionary comprehension, ensuring alignment with reset_index:

out = pd.concat({k: d.drop(columns='col1').reset_index(drop=True)
                 for k,d in df.groupby('col1')}, axis=1)

output:

     A                   B               
  col2 col3 col4 colx col2 col3 col4 colx
0    1    1    1    1    4    4    4    4
1    2    2    2    2    5    5    5    5
2    3    3    3    3    6    6    6    6

How to reshape dataframe by creating a grouping multiheader from specific column?

2 Answers2