pandas Concatenate strings based on column values

Question

I have a dataframe:

import pandas as pd

df = pd.DataFrame({
        'Names': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
        'Value': ['A1','A2','A3','B1','B2','C1','C2','C3']})

#  Names Value
#0     A    A1
#1     A    A2
#2     A    A3
#3     B    B1
#4     B    B2
#5     C    C1
#6     C    C2
#7     C    C3

I wish to get it into the following state:

#  Names Values
#0     A    [A1, A2, A3]
#1     B    [B1, B2]
#2     C    [C1, C2, C3]

Are there any built-in functions in pandas or NumPy that simplify this, or do I have to iterate over the rows in plain Python?

Yup, I thought about `groupby`, but using `apply` is not really what I was looking for. I suppose it's close. – ycx, Oct 31 '19 at 04:45

score 1 · Answer 1 · answered Oct 31 '19 at 04:33

Try this out:

df.groupby('Names')['Value'].apply(list).reset_index(name='Values')
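
For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the dataframe from the question and applies the line above. The variable name `result` is only illustrative and does not come from the original answer.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Names': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
    'Value': ['A1', 'A2', 'A3', 'B1', 'B2', 'C1', 'C2', 'C3']})

# Collect each group's 'Value' entries into a Python list, then turn the
# group keys back into a regular column and name the list column 'Values'.
result = df.groupby('Names')['Value'].apply(list).reset_index(name='Values')
print(result)
```

Each cell of `Values` holds an ordinary Python list, so this is convenient for display or export but the column is no longer vectorised.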

Ha Bom

I was thinking about `groupby` and `apply` too, but I didn't want to do the cop-out answer of `apply`. I think the link provided by another user gave me the answer in `df.groupby('Names').agg(lambda x: list(x))` – ycx Oct 31 '19 at 04:54

@ycx Actually, you could skip the `lambda`. `df.groupby('Names')['Value'].agg(list)` is all you need. – Henry Yik Oct 31 '19 at 04:56
Ah, that's nice. Do you have a way to shorten `lambda x: ', '.join(list(x))`? – ycx Oct 31 '19 at 05:04
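
One possible way to shorten that lambda, sketched here under the assumption that every entry in `Value` is already a string: the bound method `', '.join` accepts any iterable of strings, so it can be passed to `agg` directly, just like `list`. The names `as_lists` and `as_strings` are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Names': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
    'Value': ['A1', 'A2', 'A3', 'B1', 'B2', 'C1', 'C2', 'C3']})

# Passing the built-in `list` directly, with no lambda wrapper.
as_lists = df.groupby('Names')['Value'].agg(list)

# Passing the bound str method directly; each group becomes one
# comma-separated string. Assumes all values are strings.
as_strings = df.groupby('Names')['Value'].agg(', '.join)

print(as_lists)
print(as_strings)
```

Both results are Series indexed by `Names`; chaining `.reset_index(name='Values')` would turn either one back into a two-column dataframe.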

score 1 · Answer 2 · answered Oct 31 '19 at 04:33

It's very simple:

df.groupby('Names')['Value'].apply(list).reset_index(name='Values')

pissall

Thanks for the link. I was thinking about `groupby` and `apply` too, but I didn't want to do the cop-out answer of `apply`. I think your link provided the answer in `df.groupby('Names').agg(lambda x: list(x))` – ycx Oct 31 '19 at 04:52
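
A short sketch of how the dataframe-level `agg` mentioned in this comment differs from the answer's output: it aggregates every non-key column and leaves `Names` as the index. If the renamed `Values` column from the question is wanted, named aggregation (available in pandas 0.25 and later) is one option. The names `by_index` and `named` are illustrative only.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    'Names': ['A', 'A', 'A', 'B', 'B', 'C', 'C', 'C'],
    'Value': ['A1', 'A2', 'A3', 'B1', 'B2', 'C1', 'C2', 'C3']})

# Dataframe-level aggregation, as written in the comment: the result keeps
# the original column name 'Value' and uses 'Names' as its index.
by_index = df.groupby('Names').agg(lambda x: list(x))

# Named aggregation: output column 'Values' is built from input column
# 'Value'; as_index=False keeps 'Names' as a regular column.
named = df.groupby('Names', as_index=False).agg(Values=('Value', list))

print(by_index)
print(named)
```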
