Concatenate strings from multiple rows using Pandas groupby and remove duplicates from the comma separated cell

Question

I have the following data and trying to aggregate by unique id and need to get unique names, unique products, unique prices in one cell in their respective columns

Unique_id   Name    Product Price
101         ABC Ltd A       100
102         JKL Ltd B       200
101         ABC Ltd B       200
102         JKL US  B       200
103         IHJ Ltd A       100
102         JKL UK  C       300
103         IHJ US  A       100
101         ABC US  A       100

Could get the below code but its not meeting my requirement.

df1 = df.groupby(' Unique_ID'').agg({'Product'  : 'first', ‘Price’  : 'first', 
          'Name’:'first'.join}).reset_index()

Below is the output i am trying to get but i am not able to get correct code.

Unique_id   Name               Product   Price
101         ABC Ltd, ABC US    A, B      100
102         JKL Ltd, JKL UK    B, C      200
103         IHJ Ltd, IHJ US    A         100

Thanks.

score 5 · Answer 1 · answered Mar 09 '20 at 09:08

Slight modification to your current approach by using lambda:

print (df.groupby('Unique_id').agg({'Product': lambda d: ", ".join(set(d)),
                                    'Price' : 'first',
                                    'Name':lambda d: ", ".join(set(d))}).reset_index())

   Unique_id Product  Price                   Name
0        101     A,B    100         ABC Ltd,ABC US
1        102     C,B    200  JKL Ltd,JKL UK,JKL US
2        103       A    100         IHJ US,IHJ Ltd

Concatenate strings from multiple rows using Pandas groupby and remove duplicates from the comma separated cell

1 Answers1

Linked