unstacking a data frame with repeated value in a column

Question

This is a part of data frame I have:

index value category
    1   ff  a
    2   ss  a
    3   hl  a
    4   dn  a
    5   fs  b
    6   lm  b
    7   fds b
    8   dn  b
    9   hs  b
    10  ho  c
    11  ycs c
    12  dl  c

I want to convert it to this format:

a   b   c
ff  fs  ho
ss  lm  ycs
hl  fds dl 
dn  dn  
    hs

I know that I can filter the data based on a category and create a new data frame by adding a value of each category as column. Is there any simpler way? I tried stacking methods but they did not work.

See Question 10 in the dup link. – Quang Hoang Aug 07 '20 at 02:45 — Quang Hoang, Aug 07 '20 at 02:45

score 1 · Answer 1 · answered Aug 07 '20 at 01:37

You have a hidden key here create by cumcount

s = df.assign(key=df.groupby('category').cumcount()).pivot(index='key',columns='category',values='value')
Out[91]: 
category    a    b    c
key                    
0          ff   fs   ho
1          ss   lm  ycs
2          hl  fds   dl
3          dn   dn  NaN
4         NaN   hs  NaN

score 1 · Accepted Answer · answered Aug 07 '20 at 02:09

You can spread the dataframe by using the pivot() function from pandas like:

df=df.pivot(columns='category', values='value').

After that you can remove the NAs from the individual series using the lambda function like:

df=df.apply(lambda x:pd.Series(x.dropna().values))

unstacking a data frame with repeated value in a column

2 Answers2