I'm trying to split a single CSV file into multiple smaller CSV files, grouped by the value in one column. I got the code below from Jon Clements's answer (Splitting csv file based on a particular column using Python), but the resulting smaller CSV files have a blank row after every data row. Why does the code produce blank rows? There were no blank rows in the original CSV file.

I read the docs for the csv module and printed each row inside the loop, and none of the rows were blank.

import csv
splitbycoln='STATE'
with open(file_path) as fin:    
    csvin = csv.DictReader(fin)
    # Category -> open file lookup
    outputs = {}
    for row in csvin:
        cat = row[splitbycoln]
        # Open a new file and write the header
        if cat not in outputs:
            fout = open('{}.csv'.format(cat), 'w')
            dw = csv.DictWriter(fout, fieldnames=csvin.fieldnames)
            dw.writeheader()
            outputs[cat] = fout, dw
        # Always write the row
        outputs[cat][1].writerow(row)
        # print(row)
    # Close all the files
    for fout, _ in outputs.values():
        fout.close()

I expect the smaller CSV files to have no blank rows.


Edit: found the solution via Erick Stone's reply to the duplicate question. I just needed to add newline='' to the fout = open(...) call. [When using DictWriter, you will have a new line from the open function and a new line from the writerow function. You can use newline='' within the open function to remove the extra newline.]
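
For anyone landing here later, this is a minimal sketch of the fixed version, not Jon Clements's original code. The function name `split_csv_by_column` and the `out_dir` parameter are my own placeholders. The likely mechanism, assuming the script runs on Windows: the csv writer ends each row with `\r\n`, and a file opened in text mode without `newline=''` translates the `\n` again, so every row ends in `\r\r\n` and shows up as an extra blank row.

```python
import csv
import os
from contextlib import ExitStack


def split_csv_by_column(src_path, column, out_dir):
    """Write one CSV per distinct value of `column`; return {value: path}.

    Hypothetical helper for illustration only.
    """
    paths = {}    # category -> output file path
    writers = {}  # category -> DictWriter for that category's file
    # ExitStack closes the input and every output file, even on error.
    with ExitStack() as stack:
        # newline='' is what the csv docs recommend for files used with csv.
        fin = stack.enter_context(open(src_path, newline=''))
        reader = csv.DictReader(fin)
        for row in reader:
            cat = row[column]
            # First time we see this category: open its file, write the header.
            if cat not in writers:
                path = os.path.join(out_dir, '{}.csv'.format(cat))
                # newline='' stops text mode from translating the writer's \r\n.
                fout = stack.enter_context(open(path, 'w', newline=''))
                writer = csv.DictWriter(fout, fieldnames=reader.fieldnames)
                writer.writeheader()
                writers[cat] = writer
                paths[cat] = path
            # Always write the row to its category's file.
            writers[cat].writerow(row)
    return paths
```

Calling `split_csv_by_column(file_path, 'STATE', '.')` should behave like the snippet above, minus the blank rows.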

Cuezy
  • His code does not produce blank rows. Is it possible the original file has blank data rows? – Jab Aug 26 '19 at 19:08
  • Is [this](https://stackoverflow.com/questions/3191528/csv-in-python-adding-an-extra-carriage-return-on-windows) your problem? You aren't opening your file in binary mode. – holdenweb Aug 26 '19 at 19:16
  • I double-checked and the original file doesn't have blanks. I even made a new dummy file with all columns filled (in addition to the 'splitbycoln' column) and I still end up with blanks every other row in the smaller CSV files. – Cuezy Aug 26 '19 at 19:18
