Take rows that share a value in one column and combine values from another column in pandas dataframe

Question

I have a pandas dataframe with multiple rows that can share an ID. Each row also has a value for the "label" column. What I would like is to combine all the labels that share the same ID.

For example, say this is what I have:

id | label 
-----------
 1    a
 1    b
 2    a
 2    c
 2    d
 3    e

What I would like is something like this:

id | label_list
----------------
1      [a,b]
2      [a,c,d]
3      [e]

So the labels that shared the same ID were combined and made into a list. What would be the most efficient way to do this?

Possible duplicate of [grouping rows in list in pandas groupby](https://stackoverflow.com/questions/22219004/grouping-rows-in-list-in-pandas-groupby) — cmaher, Aug 30 '17 at 18:25

score 1 · Accepted Answer · answered Aug 30 '17 at 18:24

You need to group by id and collect each group's labels into a list:

df.groupby('id').label.apply(list).reset_index()

   id      label
0   1     [a, b]
1   2  [a, c, d]
2   3        [e]
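For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the sample data from the question; the output column name label_list is taken from the question's desired result and is not part of the one-liner above.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 2, 2, 2, 3],
    "label": ["a", "b", "a", "c", "d", "e"],
})

# Collect the labels of each id into a Python list, then turn the "id"
# index back into a column. reset_index(name=...) renames the value column;
# "label_list" is simply the name the question asked for.
result = (
    df.groupby("id")["label"]
      .apply(list)
      .reset_index(name="label_list")
)

print(result)
```

By default groupby keeps the original row order within each group, so the labels should appear in each list in the order they occur in df.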

Vaishali

score 0 · Answer 2 · answered Aug 30 '17 at 18:41

This solution is very similar to @Vaishali's, but it uses the .agg() method instead of .apply():

In [110]: df.groupby('id', as_index=False)['label'].agg(lambda x: x.tolist())
Out[110]:
   id      label
0   1     [a, b]
1   2  [a, c, d]
2   3        [e]
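As a rough sketch of how the two approaches relate, the snippet below passes the built-in list directly to .agg() instead of a lambda and checks that it produces the same lists as .apply(list). The data is the sample from the question; the variable names via_apply and via_agg are only illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "id": [1, 1, 2, 2, 2, 3],
    "label": ["a", "b", "a", "c", "d", "e"],
})

# Variant from the accepted answer: apply list to each group.
via_apply = df.groupby("id")["label"].apply(list).reset_index()

# Variant in the spirit of this answer: aggregate each group with list.
# Passing the built-in list is equivalent here to lambda x: x.tolist().
via_agg = df.groupby("id")["label"].agg(list).reset_index()

# Compare the resulting lists group by group.
same = via_apply["label"].tolist() == via_agg["label"].tolist()
print(via_agg)
print("identical lists:", same)
```

Which of the two is faster on a large frame is not something this sketch measures; timing both on your own data is the safer way to decide.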

MaxU - stand with Ukraine
