1

I have the following pandas data frame:

user_id   label      avg_price
--------------------------------
11         A         217.3
11         B         312.1
11         C        1079.8
14         A         453.1
14         B         125.4

and here is my desired output:

   user_id     A_avg_price      B_avg_price     C_avg_price
   ---------------------------------------------------------
    11          217.3            312.1            1079.8 
    14          453.1            125.4            na    

My current approach is to loop over the pandas dataframe, collecting data for each user and create a new dataset from it. However, I am wondering if there is a much elegant way to solve this problem? Thanks a lot!

Edamame
  • 23,718
  • 73
  • 186
  • 320
  • Does this answer your question? [How to pivot a dataframe](https://stackoverflow.com/questions/47152691/how-to-pivot-a-dataframe) – Andy L. Dec 21 '19 at 00:27

1 Answers1

2

IIUC,

new_df = pd.crosstab(df.user_id,df.label,df.avg_price,aggfunc='mean')

new_df.columns = new_df.columns.map(lambda x : f'{x}_avg_price')

print(new_df)

label    A_avg_price  B_avg_price  C_avg_price
user_id                                       
11             217.3        312.1       1079.8
14             453.1        125.4          NaN

to match your desired input you can reset the index and rename the column labels to None

new_df = pd.crosstab(df.user_id,df.label,df.avg_price,aggfunc='mean')
new_df.columns = new_df.columns.map(lambda x : f'{x}_avg_price')
new_df.reset_index(inplace=True)
new_df.columns.name = None
print(new_df)
       user_id  A_avg_price  B_avg_price  C_avg_price
0       11        217.3        312.1       1079.8
1       14        453.1        125.4          NaN
Umar.H
  • 22,559
  • 7
  • 39
  • 74