Pandas: how to collapse df rows by a common key

Asked Sep 20 '17 at 11:01

Active Sep 20 '17 at 11:14

Viewed 876 times

Example dataset:

import pandas as pd
df_test = pd.DataFrame({
    'a': ['orange', 'lemon', 'banana', 'orange'],
    'b': ['person_a', 'person_a', 'person_b', 'person_b']
})

This gives:

        a         b
0  orange  person_a
1   lemon  person_a
2  banana  person_b
3  orange  person_b

I want to collapse this so that each of person_a and person_b is just one row, and the fruits form a list for each person:

                      a         b
0   ['orange', 'lemon']  person_a
1  ['banana', 'orange']  person_b

How can I do this? I can get an equivalent result crudely with for loops, but that feels hacky and is very slow. My gut says there should be something more native to pandas.

EDIT: the answer is in this question: "grouping rows in list in pandas groupby"
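
For anyone landing here first, a minimal sketch of the groupby approach that question describes, applied to the example data above. The variable name `collapsed` is just illustrative, and the final column reorder is only there to match the layout I asked for; treat this as one way to do it rather than the definitive one.

```python
import pandas as pd

df_test = pd.DataFrame({
    'a': ['orange', 'lemon', 'banana', 'orange'],
    'b': ['person_a', 'person_a', 'person_b', 'person_b'],
})

# Group by person, gather each group's fruits into a Python list,
# then turn the group key back into a regular column.
collapsed = df_test.groupby('b')['a'].apply(list).reset_index()

# reset_index() puts the key column 'b' first; reorder to match
# the layout shown in the question (hypothetical preference).
collapsed = collapsed[['a', 'b']]

print(collapsed)
```

By default `groupby` sorts the group keys, so person_a comes before person_b here; within each group the fruits keep their original row order.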

edited Sep 20 '17 at 11:14

Petter Friberg

asked Sep 20 '17 at 11:01

MobiusStriptease

Please do not add "Solved" to the title. If you would like to indicate which duplicate helped you, just leave a comment. @Zero FYI, the OP thanks you and says that "grouping rows in list in pandas groupby" is the correct post (not sure whether it is worth keeping only that one in the dup list or not, your choice). Nice work, Zero! – Petter Friberg Sep 20 '17 at 11:17
