How do I group the values of a CSV file using pandas based on row values?

Question

I have a CSV file with data in the format below, and I want to group the rows by the values of two columns.

col1  col2  col3
abc   001    mango
abc   001    apple
abc   001    orange
abc   002    potato
xyz   003    cabbage
xyz   003    peas
xyz   004    ladyfinger

I want the output in this format:

col1  col2  col3
abc   001   [mango,apple,orange]
abc   002   [potato]
xyz   003   [cabbage,peas]
xyz   004   [ladyfinger]

I want to group the values of col3 based on the col2 values.

Does this answer your question? [How to group dataframe rows into list in pandas groupby?](https://stackoverflow.com/questions/22219004/how-to-group-dataframe-rows-into-list-in-pandas-groupby) – sushanth, Jul 17 '20 at 09:01
df.groupby(["col1", "col2"])["col3"].apply(list).reset_index() should do it, although it groups by both col1 and col2, so it only works for data shaped like the provided example – saph_top, Jul 17 '20 at 09:02

Phạm Ngọc Quý · Accepted Answer · 2020-07-17T09:42:29.387

Try this:

df.groupby(['col1','col2']).agg({'col3': lambda col: list(col)}).reset_index()

Or, to group by col2 only and keep the first col1 value of each group (this needs `import pandas`):

df.groupby('col2').agg(col1=pandas.NamedAgg(column='col1', aggfunc='first'), col3=pandas.NamedAgg(column='col3', aggfunc=lambda col: list(col))).reset_index()
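For anyone who wants to run both variants, here is a minimal sketch. It assumes the sample data from the question is built in memory rather than read from a file. The dataframe construction and the names `by_both` and `by_col2` are illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data from the question; col2 is stored as strings so "001" keeps its zeros.
df = pd.DataFrame({
    "col1": ["abc", "abc", "abc", "abc", "xyz", "xyz", "xyz"],
    "col2": ["001", "001", "001", "002", "003", "003", "004"],
    "col3": ["mango", "apple", "orange", "potato", "cabbage", "peas", "ladyfinger"],
})

# Variant 1: group by both columns and collect col3 into one list per group.
by_both = df.groupby(["col1", "col2"]).agg({"col3": lambda col: list(col)}).reset_index()

# Variant 2: group by col2 only and keep the first col1 value seen in each group.
by_col2 = df.groupby("col2").agg(
    col1=pd.NamedAgg(column="col1", aggfunc="first"),
    col3=pd.NamedAgg(column="col3", aggfunc=lambda col: list(col)),
).reset_index()

print(by_both)
# Variant 2 returns the columns as col2, col1, col3; reorder to match the question.
print(by_col2[["col1", "col2", "col3"]])
```

Both variants give the same rows for this sample because every col2 value belongs to exactly one col1 value. If a col2 value could appear under several col1 values, variant 2 would merge those rows and report only the first col1.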

edited Jul 17 '20 at 09:42

answered Jul 17 '20 at 09:01

Phạm Ngọc Quý


This will not yield what OP needs. – sushanth Jul 17 '20 at 09:11
It worked: the col2 and col3 values are showing, but col1 is not. – Ashwin Singh Jul 17 '20 at 09:22
Try this to keep column 1 in the output: `df.groupby(['col1','col2']).agg({'col3': lambda col: list(col)}).reset_index()` – Raghul Raj Jul 17 '20 at 09:29
Sorry, my mistake. Another way: df.groupby('col2').agg(c1=pd.NamedAgg(column='col1', aggfunc='first'), c3=pd.NamedAgg(column='col3', aggfunc=lambda x: list(x))).reset_index() – Phạm Ngọc Quý Jul 17 '20 at 09:39
It's working. I want to know why providing the column name as 'Column 1' gives a keyword error. – Ashwin Singh Jul 17 '20 at 09:52
Because the dataframe has no column named 'Column 1'. – Phạm Ngọc Quý Jul 17 '20 at 09:57
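The comments do not show the exact code or traceback, so the following is only a sketch under two assumptions about what went wrong. If 'Column 1' is used as an input column that the dataframe does not have, pandas raises a KeyError. If it is meant as an output name in named aggregation, it cannot be written as a keyword argument because it contains a space, but it can be passed by unpacking a dictionary. The output names 'Column 1' and 'Column 3' below are hypothetical.

```python
import pandas as pd

# Sample data from the question (col2 kept as strings).
df = pd.DataFrame({
    "col1": ["abc", "abc", "abc", "abc", "xyz", "xyz", "xyz"],
    "col2": ["001", "001", "001", "002", "003", "003", "004"],
    "col3": ["mango", "apple", "orange", "potato", "cabbage", "peas", "ladyfinger"],
})

# Case 1: referring to an input column that does not exist raises KeyError.
try:
    df.groupby("Column 1")
except KeyError as exc:
    print("KeyError:", exc)

# Case 2: an output name with a space is not a valid Python identifier,
# so it is supplied through a dict unpacked with ** instead of name=...
result = df.groupby("col2").agg(**{
    "Column 1": pd.NamedAgg(column="col1", aggfunc="first"),
    "Column 3": pd.NamedAgg(column="col3", aggfunc=lambda s: list(s)),
}).reset_index()

print(result)
```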

score 1 · Answer 2 · edited Jul 17 '20 at 12:15

df.groupby(['col1','col2']).agg({'col3': lambda col: list(col)}).reset_index()

You can achieve this with the code above.
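Since the question starts from a CSV file, here is a hedged end-to-end sketch. It assumes the file is comma-separated with a header row, which the question does not confirm. The `csv_text` string is a hypothetical stand-in for the real file. Reading col2 as a string is also an assumption, made so that values like 001 keep their leading zeros.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for the file described in the question.
csv_text = """col1,col2,col3
abc,001,mango
abc,001,apple
abc,001,orange
abc,002,potato
xyz,003,cabbage
xyz,003,peas
xyz,004,ladyfinger
"""

# dtype keeps col2 as text; without it "001" would be parsed as the integer 1.
df = pd.read_csv(io.StringIO(csv_text), dtype={"col2": str})

# Same grouping as in the answer: one list of col3 values per (col1, col2) pair.
grouped = df.groupby(["col1", "col2"]).agg({"col3": lambda col: list(col)}).reset_index()

print(grouped)
```

With a real file, the `io.StringIO(csv_text)` argument would be replaced by the file path.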

edited Jul 17 '20 at 12:15

Shaheen Zahedi


answered Jul 17 '20 at 09:48

julius

