I am wondering whether there is a neat way to 'collapse' a pandas DataFrame when several rows share the same value in one column. For example:
df =
col_a col_b
a 1
b 2
b 3
c 4
d 5
d 6
d 7
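For reproducibility, the frame above can be built roughly like this (a minimal sketch; I am assuming col_a holds strings and col_b holds integers):

```python
import pandas as pd

# Example input: col_a is the key column, col_b holds the values to collect.
# Assumption: col_a contains strings and col_b contains integers.
df = pd.DataFrame({
    "col_a": ["a", "b", "b", "c", "d", "d", "d"],
    "col_b": [1, 2, 3, 4, 5, 6, 7],
})
```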
What I need is one row per col_a value, with the matching col_b values gathered into a list:
df_new =
col_a col_b
a 1
b [2, 3]
c 4
d [5, 6, 7]
It should definitely involve groupby, something like
df_new = df.groupby('col_a').apply(....)
but I'm puzzled about how to implement the part in the parentheses efficiently.
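One direction that might work, sketched under the assumption that groups with a single row should keep a plain scalar while larger groups become lists (as in the desired output above). It uses agg(list) rather than apply, and I am not sure it is the most efficient or idiomatic way:

```python
import pandas as pd

# Same example frame as in the question (dtypes are an assumption).
df = pd.DataFrame({
    "col_a": ["a", "b", "b", "c", "d", "d", "d"],
    "col_b": [1, 2, 3, 4, 5, 6, 7],
})

# Step 1: collect the col_b values of every col_a group into a list.
lists = df.groupby("col_a")["col_b"].agg(list)

# Step 2: unwrap single-element lists so lone values stay scalars.
collapsed = lists.map(lambda v: v[0] if len(v) == 1 else v)

# Step 3: turn the group keys back into an ordinary column.
df_new = collapsed.reset_index()
print(df_new)
```

If a list in every row is acceptable, step 2 can be dropped. Mixing scalars and lists leaves col_b with object dtype, which may make later numeric operations awkward.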