My question is similar to this one, but I have many (more than 10) columns. One answer says:
If you have many columns in a df it makes sense to use df.groupby(['foo']).agg(...), see here. The .agg() function allows you to choose what to do with the columns you don't want to apply operations on. If you just want to keep them, use .agg({'col1': 'first', 'col2': 'first', ...}).
Again, specifying that many columns by hand isn't easy. My own solution uses merge; however, I didn't see this simple solution in any related question, so I thought maybe I am missing something.
Is this solution correct, or does it have a problem I am overlooking?
df = df.merge(df.groupby(['prefix', 'input_text'],
                         as_index=False)['target'].agg('<br />'.join))
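One way to check the merge above is to run it on a tiny frame. The sketch below is only an illustration: the data and the `other` column (standing in for the many extra columns) are made up, and the two variants shown are possible alternatives, not necessarily the right choice for the real data. It relies on the documented default of `DataFrame.merge`, which joins on every column name the two frames share when `on` is not given.

```python
import pandas as pd

# Toy data, made up for illustration; "other" stands in for the many extra columns.
df = pd.DataFrame({
    "prefix": ["a", "a", "b"],
    "input_text": ["x", "x", "y"],
    "target": ["t1", "t2", "t3"],
    "other": [1, 2, 3],
})
keys = ["prefix", "input_text"]

# One row per group, with the group's targets joined into a single string.
joined = df.groupby(keys, as_index=False)["target"].agg("<br />".join)

# Without `on`, merge joins on every shared column name, here including
# "target", so only rows whose original target equals the joined string match.
merged_default = df.merge(joined)

# Variant 1: drop the old target and merge on the group keys only.
merged_on = df.drop(columns="target").merge(joined, on=keys)

# Variant 2: transform broadcasts the joined string back to every row,
# so no merge is needed and all other columns are kept untouched.
transformed = df.assign(
    target=df.groupby(keys)["target"].transform("<br />".join)
)

print(len(merged_default), len(merged_on), len(transformed))
```

On this toy frame the default merge keeps only the group that has a single row, while both variants keep every row. If one row per group is wanted afterwards, `drop_duplicates(keys)` on either variant would be one option.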
'.join)` would work fine (?) – Henry Ecker Oct 23 '21 at 19:09