How to split a row with to multiple rows in csv file?

Question

I need an application to work on CSV files where I need to split one single row into multiple rows based on multiple value in any particular column. It would be better if anyone can help me with python code but any other solution is welcomed. Please help me to solve this. Thanks

Here is the input file: [![enter image description here][1]][1]`

And I want to transfer this file into like below: enter code here

If I open the csv file in Notepad++, I can see these as below:
`Name,ID Club,Position,Inactive,Attendance Date

Arnold Simon,xxxxx,Volleyball - Women's,Player,No,"04/07/2021 04/05/2021"

Arnold Simon,xxxxx,Volleyball - Women's,Safety Officer,No,

Mike Anderson,yyyyy,Volleyball - Women's,Player,No,"04/07/2021 04/05/2021"

Thomas Engle,AAAAA,Volleyball - Women's,Player,No,4/5/2021

Chris Haris,BBBBB,Baseball,Player,No,"07/12/2021 07/11/2021 06/29/2021 06/25/2021 06/16/2021"`

Is the CSV well-formed regarding multiline fields? I.e. is the value wrapped in quotes or some sort of open/close marker? — mojo, Aug 02 '21 at 14:06
Depending on the answer to that question, [csv.reader](https://docs.python.org/3/library/csv.html#csv.reader) takes an optional `newline` parameter that you can use to allow newlines embedded in a record/field. — mojo, Aug 02 '21 at 14:09
Pandas and itertools give an example here: https://stackoverflow.com/questions/50731229/split-cell-into-multiple-rows-in-pandas-dataframe — SNygard, Aug 02 '21 at 14:11
@mojo if I open the csv file into notepad++ I see the file as below: Name,ID Club,Position,Inactive,Attendance Date Arnold Simon,xxxxx,Volleyball - Women's,Player,No,"04/07/2021 04/05/2021" Arnold Simon,xxxxx,Volleyball - Women's,Safety Officer,No, Mike Anderson,yyyyy,Volleyball - Women's,Player,No,"04/07/2021 04/05/2021" Thomas Engle,AAAAA,Volleyball - Women's,Player,No,4/5/2021 Chris Haris,BBBBB,Baseball,Player,No,"07/12/2021 07/11/2021 06/29/2021 06/25/2021 06/16/2021" — Sajjad, Aug 02 '21 at 14:24
@Sajjad, I can't see newlines or anything in the comment, but I could believe that there are newlines in the "Attendance Date" field data you posted. See what happens with csv.reader. — mojo, Aug 02 '21 at 18:41

score 1 · Accepted Answer · answered Aug 03 '21 at 21:41

1

I got it done with this below 1 line code!

`import pandas as pd

df = pd.read_csv('testfile.csv')

res = df.set_index(['Name', 'ID', 'Club','Position','Inactive']).apply(lambda x: x.str.split('\n').explode()).reset_index()

res.to_csv('output.csv')`

answered Aug 03 '21 at 21:41

Sajjad

57
1
7

Pi Marillion · Answer 2 · 2021-08-06T00:45:02.647

1

You could do this short one:

import csv
with open('input_file.csv') as file:
    new = [row[:-1] + [i] for row in csv.reader(file) for i in row[-1].split()]
with open('output_file.csv', 'w') as file:
    csv.writer(file).writerows(new)

Or this equivalent longer one:

import csv
with open('input_file.csv') as f_in, open('output_file.csv', 'w') as f_out:
    csv_out = csv.writer(f_out)
    for row in csv.reader(f_in):
        for i in row[-1].split():
            csv_out.writerow(row[:-1] + [i])

edited Aug 06 '21 at 00:45

answered Aug 03 '21 at 21:57

Pi Marillion

4,465
1
19
20

just a little correction for the 2nd one: import csv with open('testfile.csv','r') as f_in, open('output_file2.csv','w',newline='') as f_out: csv_out = csv.writer(f_out) for row in csv.reader(f_in): for i in row[-1].split(): csv_out.writerow(row[:-1] + [i]) – Sajjad Aug 05 '21 at 18:30

How to split a row with to multiple rows in csv file?

2 Answers2