1

I have a csv file with, say, 50 rows of data, and I would like to split it into separate csv files for each row, which includes first row (header) and the the relevant row.

E.g. file 1 contains: row1, row2, file 2 contains: row1, row3, file 3 contains: row1, row4

And so on.

Currently working with:

import csv

counter = 1

with open('mock_data.csv', 'r', newline='') as csvfile:
    reader = csv.reader(csvfile)
    for row in reader:
        filename = "file_%s" % str(counter)
        with open(filename, 'w') as csvfile_out:
            writer = csv.writer(csvfile_out)
            writer.writerow(row)
            counter = counter + 1

I'm currently getting 'csvfile_out' not defined.

a) Am I even approaching this correctly b) Any ideas why csvfile_out isn't being defined?

WSC
  • 903
  • 10
  • 30

3 Answers3

1

I have tried this and it works fine for your purpose. Unfortunately, I didn't get any csvfile_out error and your with statement works correctly in my Python 2.7.12 console.

import csv

counter = 1

with open('mock_data.csv', 'r') as csvfile:
    reader = csv.reader(csvfile)
    row1 = next(reader) # here you save your first line of the .csv file
    for row in reader:
        if row: # if row is not empty, write a file with this row
            filename = "file_%s" % str(counter)
            with open(filename, 'w') as csvfile_out:
                writer = csv.writer(csvfile_out)
                writer.writerow(row1) #here you write your row1 as first row of csvfile_out
                writer.writerow(row)
                counter = counter + 1
titiro89
  • 2,058
  • 1
  • 19
  • 31
  • Thank you for this. The undefined error was me being stupid. Which is good, because I definitely didn't think there was anything wrong with the code! – WSC May 13 '17 at 13:11
  • Follow up: I'm having a weird issue with this. The `writer.writerow(row)` is being put onto the 3rd row with a blank row between. Any ideas? – WSC May 13 '17 at 13:17
  • I think that this issue can come up if you have blank rows in your 'mock_data.csv'. You have to remove blank rows from your 'mock_data.csv', [see here](http://stackoverflow.com/questions/4521426/delete-blank-rows-from-csv) , or check in your code if a row is blank or not (I put this second option in my answer, I hope that can fix your issue). – titiro89 May 13 '17 at 13:29
  • Turns out I actually forgot to add `newline=''` to `with open(filename, 'w', newline='') as csvfile_out:` - adding that removed the blank row. Truthfully, I don't know why that is. – WSC May 13 '17 at 16:55
1

Here is a solution with pandas. Assume the content of csv as follows:

Name, Age, Gender
John, 20, Male
Jack, 22, Male
Jill, 18, Female

And my code is as follows:

import pandas as pd
df = pd.read_csv("mock_data.csv")

for index, row in df.iterrows():
    file_name = row['Name']+".csv"  #Change the column name accordingly
    pd.DataFrame(row).T.to_csv(file_name, index=None)

This will create filenames based on the values of the column "Name" (i.e. Jack, John and Jill) to produce three files John.csv, Jack.csv and Jill.csv. Content of John.csv is as follows:

Name    | Age   |  Gender |
---------------------------
John    | 20    |  Male   |

Content of Jack.csv is as follows:

Name    | Age   |  Gender |
---------------------------
Jack    | 22    |  Male   |

Content of Jill.csv is as follows:

Name    | Age   |  Gender |
---------------------------
Jill    | 20    |  Female   |

P.S: If you don't want the header, just add header = None when calling .to_csv() function. For example:

pd.DataFrame(row).T.to_csv(file_name, index=None, Header=None)
kingmakerking
  • 2,017
  • 2
  • 28
  • 44
0

You can use DictReader too...

import csv

counter = 1

with open('mock_data.csv', 'r') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        filename = "file_%s" % str(counter)
        with open(filename, 'w') as csvfile_out:
        writer = csv.DictWriter(csvfile_out, fieldnames=reader.fieldnames)
        headers = dict((n, n) for n in reader.fieldnames)
        writer.writerow(headers)
        writer.writerow(row)
        counter = counter + 1
abhinav kumar
  • 222
  • 2
  • 10