I want to merge rows that share a specific value, but I want the merged row to hold the extra values in new columns.

Example

import pandas as pd


df = pd.DataFrame([{'Day': "Monday", 'Item_1':   "Shirt", 'Item_2': "Mug",   'Item_3': "Pen"},
                   {'Day': "Monday", 'Item_1':   "Shoes", 'Item_2': "Tea",   'Item_3': "Book"},
                   {'Day': "Tuesday", 'Item_1':"Charger", 'Item_2': "Router",'Item_3': "Phone"},
                   {'Day': "Tuesday", 'Item_1':"Monitor", 'Item_2': "Toy",   'Item_3': "Chair"},
                   {'Day': "Friday", 'Item_1':   "Shirt", 'Item_2': "TV",    'Item_3': "Desk"}])
df


       Day   Item_1  Item_2 Item_3
0   Monday    Shirt     Mug    Pen
1   Monday    Shoes     Tea   Book
2  Tuesday  Charger  Router  Phone
3  Tuesday  Monitor     Toy  Chair
4   Friday    Shirt      TV   Desk

I want all rows that share the same day to be merged like this:


Day      Item_1   Item_2  Item_3  Item_1_1  Item_2_1  Item_3_1
Monday   Shirt    Mug     Pen     Shoes     Tea       Book
Tuesday  Charger  Router  Phone   Monitor   Toy       Chair
Friday   Shirt    TV      Desk    NaN       NaN       NaN

Is there a way to do this?

Josuke

1 Answer

I think you can use groupby here:

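import numpy as np

df = (df
      .groupby('Day', sort=False)       # keep days in order of first appearance
      .apply(lambda x: x.to_numpy())    # each group -> 2-D array of its rows
      .apply(np.concatenate)            # join a group's rows into one flat array
      .apply(pd.Series)                 # expand each flat array into columns
      .reset_index(drop=True)
      )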

# fix col names
df.columns = ['Day'] + [f'Item_{x}' for x in range(1, df.shape[1])]

print(df)

       Day   Item_1  Item_2 Item_3   Item_4   Item_5 Item_6 Item_7
0   Monday    Shirt     Mug    Pen   Monday    Shoes    Tea   Book
1  Tuesday  Charger  Router  Phone  Tuesday  Monitor    Toy  Chair
2   Friday    Shirt      TV   Desk      NaN      NaN    NaN    NaN
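
Note that the output above repeats the day inside each merged row (Item_4) and numbers the columns sequentially, so it does not quite match the layout asked for in the question. If you need the exact names (Item_1_1, Item_2_1, ...), here is a minimal sketch using a plain loop over the groups. It starts from the original df in the question; widen_by_day is a hypothetical helper name I made up for illustration, and it assumes the key column is called Day as in the example.

```python
import pandas as pd

# Same data as in the question.
df = pd.DataFrame([
    {'Day': "Monday",  'Item_1': "Shirt",   'Item_2': "Mug",    'Item_3': "Pen"},
    {'Day': "Monday",  'Item_1': "Shoes",   'Item_2': "Tea",    'Item_3': "Book"},
    {'Day': "Tuesday", 'Item_1': "Charger", 'Item_2': "Router", 'Item_3': "Phone"},
    {'Day': "Tuesday", 'Item_1': "Monitor", 'Item_2': "Toy",    'Item_3': "Chair"},
    {'Day': "Friday",  'Item_1': "Shirt",   'Item_2': "TV",     'Item_3': "Desk"},
])


def widen_by_day(frame, key='Day'):
    """Hypothetical helper: one output row per key value.

    The first row of each group keeps the original column names; the k-th
    extra row (k = 1, 2, ...) gets the suffix _k on every column name.
    """
    records = []
    # sort=False keeps the groups in order of first appearance.
    for day, group in frame.groupby(key, sort=False):
        record = {key: day}
        items = group.drop(columns=key)
        for k, (_, row) in enumerate(items.iterrows()):
            for name, value in row.items():
                record[name if k == 0 else f'{name}_{k}'] = value
        records.append(record)
    # Keys missing from a record (days with fewer rows) become NaN.
    return pd.DataFrame(records)


wide = widen_by_day(df)
print(wide)
```

The loop is not the fastest option on large frames, but it keeps the row order and the column naming explicit, and days with fewer rows (Friday here) are padded with NaN automatically.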

YOLO