I have an input DataFrame like this:

Input DF:

Id  Col1  Col2  Col3    Comp      Paired_Id
 1   a     NaN   z     Public         A
 2   b     NaN   x     Public         B
 A  NaN     b    z     Hybrid         1
 B  NaN     d    x     Hybrid         2

How do I merge each pair of rows (a row's Paired_Id points to its partner's Id) to get the result below, using frozenset and groupby.first()?

Expected Output:

Id  Col1  Col2  Col3    Comp         Paired_Id
1    a    b       z  Public,Hybrid         A
2    b    d       x  Public,Hybrid         B
mozway
Bharath

1 Answer

Looks like a custom groupby.aggregate. In your example the paired rows share the same Col3, so you can group on it:

(df.groupby('Col3', as_index=False, sort=False)
   .agg({'Id': 'first', 'Col1': 'first', 'Col2': 'first',
         'Comp': ','.join, 'Paired_Id': 'first'
        })
   [df.columns]
)

Output:

  Id Col1 Col2 Col3           Comp Paired_Id
0  1    a    b    z  Public,Hybrid         A
1  2    b    d    x  Public,Hybrid         B
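
If Col3 cannot be relied on to identify the pairs, the frozenset approach mentioned in the question could look roughly like the sketch below. It is only an illustration under a few assumptions: the sample frame is rebuilt by hand, Id and Paired_Id are both stored as strings so that '1' in one column matches '1' in the other, and the names key, agg_map and out are made up for this example.

```python
import numpy as np
import pandas as pd

# Sample data rebuilt from the question; Id and Paired_Id are assumed to be strings.
df = pd.DataFrame({
    'Id': ['1', '2', 'A', 'B'],
    'Col1': ['a', 'b', np.nan, np.nan],
    'Col2': [np.nan, np.nan, 'b', 'd'],
    'Col3': ['z', 'x', 'z', 'x'],
    'Comp': ['Public', 'Public', 'Hybrid', 'Hybrid'],
    'Paired_Id': ['A', 'B', '1', '2'],
})

# Order-independent key: (Id='1', Paired_Id='A') and (Id='A', Paired_Id='1')
# both become frozenset({'1', 'A'}), so paired rows land in the same group.
key = pd.Series(
    [frozenset(pair) for pair in zip(df['Id'], df['Paired_Id'])],
    index=df.index,
)

# 'first' takes the first non-null value per group; Comp is joined instead.
agg_map = {col: 'first' for col in df.columns}
agg_map['Comp'] = ','.join

# sort=False keeps groups in order of first appearance and avoids sorting frozensets.
out = df.groupby(key, sort=False).agg(agg_map).reset_index(drop=True)
print(out)
```

The key depends only on the two Id columns, so it does not matter which row of a pair comes first or whether the other columns happen to match.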
mozway