I have a .csv file with thousands of rows like the following:

year,name
1992,Alex
1992,Anna
1993,Max
1993,Bob
1993,Tom

and so on.

I want my output to be:

   year           name
   1992     Alex, Anna
   1993  Max, Bob, Tom

This looks simple, but I can't work out how to combine the rows that share a year into a single row, with the names separated by a comma (',').

talatccan

3 Answers

You can do this with groupby and an aggregation that joins the names within each year. Try the code below:

# as_index=False keeps "year" as a regular column in the result
df = df.groupby("year", as_index=False).agg({"name": ", ".join})

You can then save the DataFrame to CSV without writing the index:

df.to_csv("output.csv",index=False)
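
For reference, here is a minimal end-to-end sketch of the same idea. It assumes the sample rows from the question and reads them from an in-memory string (via io.StringIO) instead of a real file, so the names csv_text, grouped and csv_out are only illustrative:

```python
import io

import pandas as pd

# Sample rows from the question, inlined so the sketch runs without a file.
csv_text = """year,name
1992,Alex
1992,Anna
1993,Max
1993,Bob
1993,Tom
"""

df = pd.read_csv(io.StringIO(csv_text))

# Group by year and join each group's names with ", ".
# as_index=False keeps "year" as a regular column instead of the index.
grouped = df.groupby("year", as_index=False).agg({"name": ", ".join})
print(grouped)

# With no path given, to_csv returns the CSV text instead of writing a file.
csv_out = grouped.to_csv(index=False)
print(csv_out)
```

Because the joined names contain commas, to_csv wraps them in double quotes in the output, which is what keeps the file readable as a two-column CSV.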
Varsha

This may help. It collects the unique names for each year and then joins them with a comma:

df = df.groupby('year')['name'].unique().reset_index()
df['name'] = df['name'].apply(lambda x: ', '.join(x))

Output:

   year           name
0  1992     Alex, Anna
1  1993  Max, Bob, Tom
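
One thing to be aware of: unique() drops repeated names within a year. A small sketch of the difference, using made-up data in which "Anna" appears twice (the variable names deduped and kept are just for illustration):

```python
import pandas as pd

# Hypothetical data: "Anna" appears twice in 1992.
df = pd.DataFrame({
    "year": [1992, 1992, 1992, 1993],
    "name": ["Alex", "Anna", "Anna", "Max"],
})

# unique() keeps one copy of each name per year, in order of first appearance.
deduped = df.groupby("year")["name"].unique().apply(", ".join).reset_index()

# Joining the group directly keeps every row, duplicates included.
kept = df.groupby("year")["name"].agg(", ".join).reset_index()

print(deduped)
print(kept)
```

Which one you want depends on whether a repeated name in the input should show up once or every time.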
talatccan

How about this one?

import pandas as pd

x = pd.DataFrame.from_dict({'year': ['1992', '1992', '1993', '1993', '1993'],
                            'name': ['ALEX', 'ANNA', 'MAX', 'BOB', 'TOM'],
                            'col': range(5)})
print(x)

# Collect the distinct names per year as a tuple (set order is arbitrary) and sum 'col'
a = x.groupby('year').agg({'name': lambda names: tuple(set(names)), 'col': 'sum'})
print(a)

Result:

                 name  col
year                      
1992     (ALEX, ANNA)    1
1993  (BOB, TOM, MAX)    9
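
If you want the comma-separated string from the question rather than a tuple, a possible variant is sketched below. It is only one way to do it: the helper join_unique is a made-up name, and it uses dict.fromkeys instead of set() so the names stay in order of first appearance.

```python
import pandas as pd

frame = pd.DataFrame({
    "year": ["1992", "1992", "1993", "1993", "1993"],
    "name": ["ALEX", "ANNA", "MAX", "BOB", "TOM"],
    "col": range(5),
})


def join_unique(names):
    # dict.fromkeys drops duplicates but keeps first-seen order, unlike set().
    return ", ".join(dict.fromkeys(names))


# Join the names per year and sum the extra column in the same aggregation.
result = frame.groupby("year").agg({"name": join_unique, "col": "sum"})
print(result)
```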
ASH