List comprehension over the same rows in pandas

Question

I am wondering whether there is a neat way to 'collapse' a pandas DataFrame when several rows share the same value in one column. For example:

df =

col_a  col_b
    a     1
    b     2
    b     3
    c     4
    d     5
    d     6
    d     7

What I need is:

df_new = 

col_a     col_b
    a         1
    b    [2, 3]
    c         4
    d [5, 6, 7]

It should definitely involve groupby:

df_new = df.groupby('col_a').apply(....)

But I'm puzzled about how to implement the bit in the brackets effectively.

`df.groupby('col_a').col_b.apply(list)` should work. I'm sure this question is a dup though – Haleemur Ali Sep 13 '18 at 20:37

score 2 · Answer 1 · answered Sep 13 '18 at 20:37


You can apply list to col_b:

df.groupby('col_a')['col_b'].apply(list)

col_a
a          [1]
b       [2, 3]
c          [4]
d    [5, 6, 7]
Name: col_b, dtype: object
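
Note that `apply(list)` wraps every group in a list, including single-row groups, whereas the question shows plain scalars for `a` and `c`. The following is a minimal, self-contained sketch of both variants. The sample frame is rebuilt from the question; the helper name `collapse` is hypothetical, and unwrapping single-element groups is an assumption about what the asker wants, not something `apply(list)` does on its own.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "col_a": ["a", "b", "b", "c", "d", "d", "d"],
    "col_b": [1, 2, 3, 4, 5, 6, 7],
})

def collapse(values):
    """Hypothetical helper: scalar for a single-row group, list otherwise."""
    items = list(values)
    return items[0] if len(items) == 1 else items

# Every group becomes a list, as in the answer above.
as_lists = df.groupby("col_a")["col_b"].apply(list).reset_index()

# Single-row groups stay scalar, matching the df_new layout in the question.
df_new = df.groupby("col_a")["col_b"].apply(collapse).reset_index()
print(df_new)
```

Mixing scalars and lists leaves `col_b` with `object` dtype, so keeping lists everywhere is usually easier to work with downstream.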

answered Sep 13 '18 at 20:37

sacuL


OMG, I've never felt so silly :) – Arnold Klein Sep 13 '18 at 20:38

score 1 · Answer 2 · answered Sep 13 '18 at 20:41


s = df.groupby('col_a')['col_b'].apply(list)
df['col_c'] = df['col_a'].map(s)

print(df)

  col_a  col_b      col_c
0     a      1        [1]
1     b      2     [2, 3]
2     b      3     [2, 3]
3     c      4        [4]
4     d      5  [5, 6, 7]
5     d      6  [5, 6, 7]
6     d      7  [5, 6, 7]
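
Unlike the first answer, this keeps the original row count and attaches each group's list to every row of that group. A self-contained sketch of the same idea, with the sample frame rebuilt from the question; the `n_in_group` column is a hypothetical addition to show one use of the broadcast lists.

```python
import pandas as pd

# Sample data rebuilt from the question.
df = pd.DataFrame({
    "col_a": ["a", "b", "b", "c", "d", "d", "d"],
    "col_b": [1, 2, 3, 4, 5, 6, 7],
})

# One list per distinct col_a value, indexed by col_a.
s = df.groupby("col_a")["col_b"].apply(list)

# map() looks up each row's col_a value in the index of s,
# so every row receives the full list for its group.
df["col_c"] = df["col_a"].map(s)

# Hypothetical extra column: size of the group each row belongs to.
df["n_in_group"] = df["col_c"].map(len)
print(df)
```

This form is handy when later steps still need one row per original observation; if only one row per `col_a` value is wanted, the grouped Series `s` is already the result.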

answered Sep 13 '18 at 20:41

Khalil Al Hooti

