0

I have below database format stored in a pandas dataframe

ID          Block
MGKfdkldr   Product 1
MGKfdkldr   Product 2
MGKfdkldr   Product 3
GLOsdasd    Product 2
GLOsdasd    Product 3
NewNew      Product 1
OldOld      Product 4
OldOld      Product 8

Here is the sample dataframe code

df1 = pd.DataFrame({'ID':['MGKfdkldr','MGKfdkldr','MGKfdkldr','GLOsdasd','GLOsdasd','NewNew','OldOld','OldOld'],'Block':['Product 1','Product 2','Product 3','Product 2','Product 3','Product 1','Product 4','Product 8']})

I am looking for below data format from this(Expected output):

ID          Block-1     Block-2     Block-3
MGKfdkldr   Product 1   Product 2   Product 3
GLOsdasd    Product 2   Product 3   
NewNew      Product 1       
OldOld      Product 4   Product 8   

I have tried to melt it with pd.melt function but it just transposing data to column header but I am looking for bit difference. Is there any other method through which I can get my Expected output?

Can anyone help me on this? Please

1 Answers1

0

The function you're looking for is pivot not melt. You'll also need to provide a "counter" column that simply counts the repeated "ID"s to get everything to align properly.

df1["Block_id"] = df1.groupby("ID").cumcount() + 1

new_df = (df1.pivot("ID", "Block_id", "Block") # reshapes our data
          .add_prefix("Block-")                # adds "Block-" to our column names
          .rename_axis(columns=None)           # fixes funky column index name
          .reset_index())                      # inserts "ID" as a regular column instead of an Index

print(new_df)
          ID    Block-1    Block-2    Block-3
0   GLOsdasd  Product 2  Product 3        NaN
1  MGKfdkldr  Product 1  Product 2  Product 3
2     NewNew  Product 1        NaN        NaN
3     OldOld  Product 4  Product 8        NaN

If you want actual blanks (e.g. empty string "") instead of a NaN, you can use new_df.fillna("")

Cameron Riddell
  • 10,942
  • 9
  • 19