I have a DataFrame that looks like the one below, where col4 holds a unique ID. There are existing posts on concatenating strings, but I need to concatenate integers, and that throws an error when I use str(int), unlike the usual string case.
col1 | col2 | col3 | col4 | col5 |
---|---|---|---|---|
1999 | ABC | ggg | 1 | kogyk |
1999 | ABC | ggg | 2 | hfu |
1989 | CAT | ppp | 3 | gl |
1999 | ABC | uyt | 4 | klyif |
1989 | CAT | ppp | 5 | gil |
I want to merge the contents of col4 wherever the col1, col2, and col3 values match, and add a count of the merged rows. The output should look like this:
col1 | col2 | col3 | col4 | count |
---|---|---|---|---|
1999 | ABC | ggg | 1,2 | 2 |
1989 | CAT | ppp | 3,5 | 2 |
1999 | ABC | uyt | 4 | 1 |
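I got the necessary output with: `df.groupby(['col1', 'col2', 'col3']).agg(col4=('col4', lambda x: ','.join([str(v) for v in x])), count=('col4', 'size')).reset_index()`, which works as expected. Each integer has to be converted with `str()` individually inside the join, because `','.join` only accepts strings.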
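For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same approach. The DataFrame is rebuilt from the sample table above. `sort=False` is my own addition (not part of the line I originally ran); I am assuming it is wanted so that groups come out in order of first appearance, as in the desired output, rather than sorted by key.

```python
import pandas as pd

# Sample data copied from the table in the question.
df = pd.DataFrame({
    "col1": [1999, 1999, 1989, 1999, 1989],
    "col2": ["ABC", "ABC", "CAT", "ABC", "CAT"],
    "col3": ["ggg", "ggg", "ppp", "uyt", "ppp"],
    "col4": [1, 2, 3, 4, 5],
    "col5": ["kogyk", "hfu", "gl", "klyif", "gil"],
})

result = (
    # sort=False (an assumption, see above) keeps groups in first-appearance order.
    df.groupby(["col1", "col2", "col3"], sort=False)
      .agg(
          # Convert each integer to str before joining; join() rejects ints.
          col4=("col4", lambda s: ",".join(str(v) for v in s)),
          # Number of rows that were merged into each group.
          count=("col4", "size"),
      )
      .reset_index()
)

print(result)
# col4 column  -> "1,2", "3,5", "4"
# count column -> 2, 2, 1
```

Without `sort=False`, `groupby` sorts by the group keys by default, so the 1989 row would come first. The values themselves are the same either way.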