1

I have this dataframe :

ID x y z
1 new york california
2 vermont; new york carolina n florida
2 oregon texas florida; colorado
3 new jersey
1 hawai utah

I'm trying to group by ID, and merge column x, y, and z into one column.

Output should be :

ID geo
1 new york; california; hawai; utah
2 vermont; new york; carolina n; oregon; texas; florida; colorado; florida
3 new jersey

I tried with the following piece of code :

save = df.groupby('ID').agg(lambda x: ';'.join(set(x)))

But it doesn't work.

DouxDoux
  • 53
  • 4
  • Does this answer your question? [Concatenate strings from several rows using Pandas groupby](https://stackoverflow.com/questions/27298178/concatenate-strings-from-several-rows-using-pandas-groupby) –  Aug 04 '23 at 17:35

1 Answers1

1

Try stack the data before join:

(df.groupby('ID',group_keys=False)
   .apply(lambda x: '; '.join(x.stack().dropna()) )
   .reset_index(name='geo')
)

Output:

   ID                                                geo
0   1               new york; california; hawai; utah 
1   2  vermont; new york; carolina n; florida; oreg...
2   3                                        new jersey 
Quang Hoang
  • 146,074
  • 10
  • 56
  • 74