0

This is a part of data frame I have:

index value category
    1   ff  a
    2   ss  a
    3   hl  a
    4   dn  a
    5   fs  b
    6   lm  b
    7   fds b
    8   dn  b
    9   hs  b
    10  ho  c
    11  ycs c
    12  dl  c

I want to convert it to this format:

a   b   c
ff  fs  ho
ss  lm  ycs
hl  fds dl 
dn  dn  
    hs  

I know that I can filter the data based on a category and create a new data frame by adding a value of each category as column. Is there any simpler way? I tried stacking methods but they did not work.

Mary
  • 1,142
  • 1
  • 16
  • 37

2 Answers2

1

You have a hidden key here create by cumcount

s = df.assign(key=df.groupby('category').cumcount()).pivot(index='key',columns='category',values='value')
Out[91]: 
category    a    b    c
key                    
0          ff   fs   ho
1          ss   lm  ycs
2          hl  fds   dl
3          dn   dn  NaN
4         NaN   hs  NaN
BENY
  • 317,841
  • 20
  • 164
  • 234
1

You can spread the dataframe by using the pivot() function from pandas like:

df=df.pivot(columns='category', values='value').

After that you can remove the NAs from the individual series using the lambda function like:

df=df.apply(lambda x:pd.Series(x.dropna().values))

sharmajee499
  • 116
  • 5