Pandas merging rows with same values based on multiple columns

Question

I have a sample dataset like this:

Col1  Col2   Col3
A     1,2,3  A123
A     4,5    A456
A     1,2,3  A456
A     4,5    A123

I want to merge Col2 and Col3 into a single row for each unique value of Col1.

Expected Result:

Col1  Col2       Col3
A     1,2,3,4,5  A123,A456

I referred to some existing solutions and tried the following, but it only combines a single column:

df.groupby(df.columns.difference(['Col3']).tolist())\
                 .Col3.apply(pd.Series.unique).reset_index()

post data, not pics https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples — sammywemmy, Feb 20 '20 at 10:26
@sammywemmy thought the image will explain better. edited the question with the data now. — Arun Karthick, Feb 20 '20 at 10:34

score 1 · Accepted Answer · answered Feb 20 '20 at 10:50

Drop duplicates on the subset Col1 and Col3
Group by Col1
Then aggregate, joining the strings in each column with str.cat

# Keep one row per (Col1, Col3) pair, then join the remaining strings per group
(df.drop_duplicates(['Col1', 'Col3'])
   .groupby('Col1')
   .agg(Col2=('Col2', lambda x: x.str.cat(sep=',')),
        Col3=('Col3', lambda x: x.str.cat(sep=',')))
   .reset_index()
)

  Col1       Col2       Col3
0    A  1,2,3,4,5  A123,A456
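
Note that dropping duplicates on Col1 and Col3 gives the expected result here because the rows that survive happen to carry each distinct Col2 value once. If that may not hold for your data, a possible variation is to deduplicate each column independently inside the aggregation. The following is only a sketch under that assumption: the DataFrame is rebuilt from the sample in the question with every column treated as a string, and `join_unique` is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

# Sample data from the question; every column is assumed to hold strings.
df = pd.DataFrame({
    "Col1": ["A", "A", "A", "A"],
    "Col2": ["1,2,3", "4,5", "1,2,3", "4,5"],
    "Col3": ["A123", "A456", "A456", "A123"],
})


def join_unique(values: pd.Series) -> str:
    """Join the distinct cell values of a group, keeping first-seen order."""
    # dict.fromkeys drops repeats while preserving insertion order.
    return ",".join(dict.fromkeys(values))


# Named aggregation: one output column per (source column, function) pair.
result = (
    df.groupby("Col1", as_index=False)
      .agg(Col2=("Col2", join_unique), Col3=("Col3", join_unique))
)

print(result)
```

This compares whole cell values, so "1,2,3" and "1,2" would both be kept as they are; it does not split cells into individual comma-separated items.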
