1

I have a Pandas Dataframe as follows:

   user   start  end  label    item
0   1      0    3     food     burger
1   1      4    6     food     cola
0   2      0    3     drink    burger   
1   2      4    6     food     cola
0   3      0    3     food     burger
1   3      7    9     drink    fries  

I would like to "re-structure" the matrix so that label-values that share the same values in the columns "start" and "end" are moved to separate columns instead. For the matrix above, I'd like to get the matrix:

      start  end    item   label1  label2  label3
0       0    3      burger  food    drink   food   
1       4    6      cola    food    food    nan  
2       7    9      burger  nan     nan     drink 

Does anyone have any idea how to solve this?

coder
  • 538
  • 7
  • 17

1 Answers1

3

This is pivot:

(df.pivot_table(index=['start','end','item'], columns='user', 
               values='label', aggfunc='first')
   .add_prefix('label')
   .reset_index()
)

Output:

user  start  end    item label1 label2 label3
0         0    3  burger   food  drink   food
1         4    6    cola   food   food    NaN
2         7    9   fries    NaN    NaN  drink
Quang Hoang
  • 146,074
  • 10
  • 56
  • 74
  • Nice solution. Thank you so much! It worked fine, except the resulting dataframe still have the old indices as well (user, label1, label2, label3 are all aligned, but the start, end and item labels are placed in a separate row on top of the others. I also get a label that is called 'labellabel' that is placed on top of label1) which results in multi indices – coder Feb 22 '21 at 16:09
  • @Anna not sure what you mean. The output in my answer is what it shows in my terminal. Did you forget `reset_index`? – Quang Hoang Feb 22 '21 at 16:14
  • Sorry, my fault. I accidentally put list brackets [ ] around label. Thanks again for the help :) – coder Feb 22 '21 at 16:20