1

I have a Pandas data frame that I am trying transpose from long to wide.

Here is the data frame:

enter image description here

Here is the desired output:

enter image description here

Below is reproducible code to make the data frame and also my attempt called test; I am part way there, but I am not sure how to unmelt again and concatenate the column names.

mydf = pd.DataFrame(
    {
        "x": [1, 1, 1, 1],
        "cat": ["Cat100", "Cat100", "Cat200", "Cat200"],
        "n": ["N1", "N2", "N1", "N2"],
        "y": [3, 4.2, 2.1, 2.4],
    }
)

test = mydf.set_index(["x", "cat", "n"])["y"].unstack().reset_index()
test.columns = test.columns.tolist()
display(test)

enter image description here

a11
  • 3,122
  • 4
  • 27
  • 66

2 Answers2

2

try:

mydf['cat'] = mydf.cat +'_'+ mydf.n
mydf.drop('n', axis=1, inplace=True)


res = mydf.set_index(['x', 'cat']).rename(columns={'y':''}).unstack().reset_index()
res.columns = res.columns.map(''.join)

res

    x   Cat100_N1   Cat100_N2   Cat200_N1   Cat200_N2
0   1   3.0         4.2         2.1         2.4
Pygirl
  • 12,969
  • 5
  • 30
  • 43
  • Thanks, that is really close; I am struggling with flattening out the multilevel though, so that the result is just like the desired output in the OP – a11 Jul 26 '21 at 17:41
  • Actually `res.columns = [''.join(col) for col in res.columns.values]` and `res.columns = res.columns.map(''.join)` both give the desired output in the OP; but `res.columns = res.columns.map('_'.join)` adds extra "_" bits – a11 Jul 26 '21 at 17:49
  • @a11: Updated thanks for pointing this out. Didn't check that. – Pygirl Jul 26 '21 at 17:50
2

We can also use pivot and Index.map to flatten the MultiIndex then reset_index to return x to a column:

new_df = mydf.pivot(index='x', columns=['cat', 'n'], values='y')
new_df.columns = new_df.columns.map('_'.join)
new_df = new_df.reset_index()

new_df:

   x  Cat100_N1  Cat100_N2  Cat200_N1  Cat200_N2
0  1        3.0        4.2        2.1        2.4
Henry Ecker
  • 34,399
  • 18
  • 41
  • 57