I have a CSV file with data in the format below. I want to group the rows using the values of two columns.

col1  col2  col3
abc   001    mango
abc   001    apple
abc   001    orange
abc   002    potato
xyz   003    cabbage
xyz   003    peas
xyz   004    ladyfinger

I want the output in this format:

col1  col2  col3
abc   001   [mango,apple,orange]
abc   002   [potato]
xyz   003   [cabbage,peas]
xyz   004   [ladyfinger]

I want to group the values of col3 based on the col2 values.

  • Does this answer your question? [How to group dataframe rows into list in pandas groupby?](https://stackoverflow.com/questions/22219004/how-to-group-dataframe-rows-into-list-in-pandas-groupby) – sushanth Jul 17 '20 at 09:01
  • df.groupby(["col1", "col2"])["col3"].apply(list).reset_index() , although groups by both col1 and col2 so only work in provided example – saph_top Jul 17 '20 at 09:02

2 Answers

1

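Try this, which groups by both col1 and col2 and collects col3 into a list:

df.groupby(['col1','col2']).agg({'col3': list}).reset_index()

Or, to group by col2 alone and keep the first col1 value in each group (this needs import pandas):

df.groupby('col2').agg(col1=pandas.NamedAgg(column='col1', aggfunc='first'), col3=pandas.NamedAgg(column='col3', aggfunc=list)).reset_index()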
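For reference, here is a minimal end-to-end sketch of the two-column grouping on the sample data from the question. It makes a few assumptions that are not in the original post: the file is comma-separated, the data is inlined through io.StringIO instead of read from disk (the real file name is not given), and col2 is read as a string so that values such as 001 keep their leading zeros.

```python
import io

import pandas as pd

# Sample rows from the question, inlined as CSV text.
# Assumption: the real file is comma-separated; adjust `sep` if it is not.
csv_text = """col1,col2,col3
abc,001,mango
abc,001,apple
abc,001,orange
abc,002,potato
xyz,003,cabbage
xyz,003,peas
xyz,004,ladyfinger
"""

# dtype=str keeps col2 as "001" instead of parsing it as the integer 1.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Group on both key columns and collect the col3 values of each group into a list.
result = df.groupby(["col1", "col2"])["col3"].agg(list).reset_index()

print(result)
```

Without dtype=str, pandas would likely infer col2 as an integer column, so the grouping would still work but the keys would show as 1, 2, 3, 4.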
1
df.groupby(['col1','col2']).agg({'col3': lambda col: list(col)}).reset_index()

You can achieve this with the code above.

Shaheen Zahedi
julius