0

I have a pandas dataframe like this:

Column1 Column2 Column3
a k x
a l y
b k z

I want to transform this dataframe to this:

Column1 Column2 Column3
a "k,l" "x,y"
b k z

I found similar examples but couldn't find an exact solution to my problem. Thank you so much for your help!

BigBen
  • 46,229
  • 7
  • 24
  • 40
1dll
  • 39
  • 5

3 Answers3

4

Try groupby then agg

df_ = (df.groupby(['Column1'])
       .agg({'Column2': lambda x: ','.join(x), 'Column3': lambda x: ','.join(x)})
       .reset_index()
)
print(df_)

  Column1 Column2 Column3
0       a     k,l     x,y
1       b       k       z

If you need the quote mark

df_ = (df.groupby(['Column1'])
       .agg({'Column2': lambda x: f'"{",".join(x)}"' if len(x)>1 else x,
             'Column3': lambda x: f'"{",".join(x)}"' if len(x)>1 else x})
       .reset_index()
)
  Column1 Column2 Column3
0       a   "k,l"   "x,y"
1       b       k       z
Ynjxsjmh
  • 28,441
  • 6
  • 34
  • 52
3

You can do it with groupby and agg:

df.groupby(["Column1"], as_index=False).agg(lambda x: ",".join(x))

EDIT Just found out lambda is not even needed here:

df.groupby(["Column1"], as_index=False).agg(",".join)

Output:

    Column1     Column2     Column3
0   a           k,l         x,y
1   b           k           z
Rabinzel
  • 7,757
  • 3
  • 10
  • 30
  • I believe that lambda is needed, without lambda it concatenates column names. Thank you – 1dll Apr 12 '22 at 16:35
  • I tested it and it worked well for me ( don't have a deeper explanation why :P ) – Rabinzel Apr 12 '22 at 16:37
  • @1dll see this [post](https://stackoverflow.com/a/27298308/15521392). seeing his reputation I think we can trust his statement. That's where I saw the idea and tried it out myself in the first place – Rabinzel Apr 12 '22 at 16:41
0

You may want to use a custom function that joins the strings exploiting pandas.DataFrame.transform:

df.groupby(['Column1']).transform(lambda val: ','.join(val))
lemon
  • 14,875
  • 6
  • 18
  • 38