0

If I have a data frame like and want to create new columns (or a new table) based on data in other columns

data = {'uid':[1,1,2,2,3,3],
       'product':['abc','pqr','abc','rst','pqr','rst'],
       'rank':[1,2,2,1,1,2]}
df = pd.DataFrame(data)
df


   uid  product rank
0   1   abc     1
1   1   pqr     2 
2   2   abc     2
3   2   rst     1
4   3   pqr     1
5   3   rst     2

I want

    uid  rank_abc rank_pqr rank_rst
0.  1      1        2       nan
1.  2      2        nan     1
2.  3      nan      1       2

I tried below

  newdf['rank_abc'] = np.where(
        df['product']=='abc', df['rank'], np.nan
    )
  newdf['rank_pqr'] = np.where(
        df['product']=='pqr', df['rank'], np.nan
    )    
   newdf['rank_rst'] = np.where(
        df['product']=='1st', df['rank'], np.nan
    )

    uid  rank_abc rank_pqr rank_rst
0.  1      1        nan       nan
0.  1      nan        2       nan
1.  2      2        nan       nan
1.  2      nan      nan       1
2.  3      nan      1         nan
2.  3      nan      nan       2

What is the better way to do this?

pranav nerurkar
  • 596
  • 7
  • 19
  • 1
    `df.pivot(index='uid', columns='product', values='rank').add_prefix('rank_').reset_index()` – mozway Jan 13 '23 at 10:17

0 Answers0