I have a dataset like this:

ID    Name
 1       a
 1       b
 1       2
 1       3
 2      er
 2     get
 2  better
 3     123
 3    cold
 3    warm
 3   sweet
 3    heat

I want to group this data so that the "Name" values sharing the same "ID" are merged into a single string, separated by a delimiter. Something like this:

ID                      Name
 1                   a,b,2,3
 2             er,get,better
 3  123,cold,warm,sweet,heat

and so on.

Can anyone suggest a Pythonic way of doing this?

James Z
Kshitij Yadav
  • Possible duplicate of [pandas - Merge nearly duplicate rows based on column value](https://stackoverflow.com/questions/36271413/pandas-merge-nearly-duplicate-rows-based-on-column-value) – Sheldore Sep 20 '18 at 21:04
  • I tried doing that but I always get this error "sequence item 6: expected str instance, float found" – Kshitij Yadav Sep 20 '18 at 21:08

1 Answer

Use ','.join in a groupby:

df.groupby('ID').Name.apply(','.join)

ID
1                     a,b,c,d
2               er,get,better
3    hot,cold,warm,sweet,heat
Name: Name, dtype: object

Reset the index if you need the same two columns back:

df.groupby('ID').Name.apply(','.join).reset_index()

   ID                      Name
0   1                   a,b,c,d
1   2             er,get,better
2   3  hot,cold,warm,sweet,heat
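To try this end to end, here is a minimal self-contained sketch. It rebuilds the question's sample with every Name stored as a string, which is an assumption, since the question does not show how the frame was created. It uses agg rather than apply; either should give the same result for this case.

```python
import pandas as pd

# Assumption: the question's sample data, with every Name stored as a string.
df = pd.DataFrame({
    "ID": [1, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3],
    "Name": ["a", "b", "2", "3", "er", "get", "better",
             "123", "cold", "warm", "sweet", "heat"],
})

# Join the Name values within each ID group, then turn ID back into a column.
merged = df.groupby("ID")["Name"].agg(",".join).reset_index()
print(merged)
```

Any other separator works the same way, for example "; ".join in place of ",".join.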

If for some reason you have non-string items, cast the column to str before joining:

df.assign(Name=df.Name.astype(str)).groupby('ID').Name.apply(','.join).reset_index()

   ID                      Name
0   1                   a,b,c,d
1   2             er,get,better
2   3  hot,cold,warm,sweet,heat
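The "expected str instance, float found" error from the comments often means the column holds missing values, since pandas stores a missing entry in an object column as a float NaN. That is a guess about the asker's data, not something the thread confirms. If so, one option is to drop those rows before casting and joining, so they do not end up in the merged string. A minimal sketch on hypothetical data:

```python
import numpy as np
import pandas as pd

# Hypothetical data: mixed strings, numbers and one missing value (NaN).
df = pd.DataFrame({
    "ID": [1, 1, 2, 2, 3],
    "Name": ["a", 2, np.nan, "get", 3.5],
})

# Drop rows whose Name is missing, so NaN never reaches str.join.
clean = df.dropna(subset=["Name"])

# Cast the remaining values to str, then join them per ID.
without_nan = (
    clean.assign(Name=clean["Name"].astype(str))
    .groupby("ID")["Name"]
    .agg(",".join)
    .reset_index()
)
print(without_nan)
```

Note that an ID whose Name values are all missing would disappear from the result with this approach.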
piRSquared