
I am dealing with a data set which has the following fields:

ID  Person_Name Person_Country
110 Marc    CA
110 Sean    CN
111 Matt    IN
111 Rob     AU
112 Mike    US

I intend to group the data by ID, joining the values in the other columns, like this:

ID  Person_Name Person_Country
110 Marc; Sean  CA; CN
111 Matt; Rob   IN; AU
112 Mike        US

I tried built-in pandas methods such as `.pivot_table()` and `.unstack()`, but they weren't helpful because I am dealing with non-numeric data.

Mazahir Bhagat
  • Small note: it is usually a bad idea to give your columns names with spaces. It makes them hard to read: is Name the third column? Oh no, it is part of the second column's name. Rather, use dots or underscores as separators. – Bram Vanroy Jun 04 '18 at 15:09
  • Either `df.groupby('ID').agg('; '.join)` or if you want to explicitly state the column names: `df.groupby('ID')[['Person Name', 'Person Country']].agg('; '.join)`. – ayhan Jun 04 '18 at 15:11
  • This example cannot take advantage of `apply` and it needs the `agg` to accomplish the desired result. – zipa Jun 04 '18 at 15:12
  • @BramVanroy - Thanks, implemented your advice! – Mazahir Bhagat Jun 04 '18 at 17:20
  • @user2285236 - I was trying this approach by referring to similar questions, but it returns the column names instead of the names concatenated together. – Mazahir Bhagat Jun 04 '18 at 17:32
  • Are you trying with `agg`? It works fine when I try it: https://imgur.com/a/1hExpHF – ayhan Jun 04 '18 at 17:37
  • @user2285236 - Yes, I am using `.agg`. – Mazahir Bhagat Jun 06 '18 at 20:23
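
The `agg`-based suggestion from the comments can be sketched as follows. This is a minimal sketch, assuming pandas is installed and that the columns are named `ID`, `Person_Name`, and `Person_Country` as in the question; the variable names `df`, `grouped`, and `labels_only` are illustrative only. The last lines show one possible reason for getting column names back instead of the values: joining over a whole DataFrame iterates its column labels. That is a guess, since the asker's actual code is not shown.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "ID": [110, 110, 111, 111, 112],
    "Person_Name": ["Marc", "Sean", "Matt", "Rob", "Mike"],
    "Person_Country": ["CA", "CN", "IN", "AU", "US"],
})

# Group by ID and join the strings of each column within a group.
# agg passes each column of each group to "; ".join as a Series of strings.
grouped = (
    df.groupby("ID")[["Person_Name", "Person_Country"]]
      .agg("; ".join)
      .reset_index()
)
print(grouped)

# Possible pitfall (assumption): iterating a DataFrame yields its column
# labels, so joining over the frame itself concatenates the column names.
labels_only = "; ".join(df)
print(labels_only)
```

Note that `"; ".join` only works when every value in the selected columns is a string; numeric or missing values would need converting first, for example with `astype(str)`.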

0 Answers