I'm trying to write a Python script that batch converts all tab-separated .txt files in a source directory to comma-separated .csv files, keeping the original file names in the output.

I'm fairly new to this. At the moment my script creates the new .csv files, but an empty row appears after every data row in the output files. How can I resolve this problem?

import csv
import os

source_path = r"file location"
dest_path = r"file location"

for file in os.listdir(source_path):

    # Get filename without file extension
    filename_no_extension = os.path.splitext(file)[0]

    # Concatenate filename and paths
    dest_csv_file = str(filename_no_extension) + ".csv"
    dest_file = os.path.join(dest_path,dest_csv_file)
    source_file = os.path.join(source_path,file)

    # Open the original file and create a reader object
    with open(source_file, "r") as infile:
        reader = csv.reader(infile, dialect="excel-tab")
        with open(dest_file, "w") as outfile:
            writer = csv.writer(outfile, delimiter = ',')
            for row in reader:
                writer.writerow(row)
Peter Mortensen
G Suarez

1 Answer

The extra blank rows are usually the result of opening the files without the newline="" parameter. The csv writer emits its own \r\n line endings, and without that parameter Windows text mode translates the \n again, producing \r\r\n. In Python 2.x, open the files with rb and wb as file modes instead of passing newline. For Python 3:

import csv
import os

source_path = r"file location"
dest_path = r"file location"

for file in os.listdir(source_path):

    # Get filename without file extension
    filename_no_extension = os.path.splitext(file)[0]

    # Concatenate filename and paths
    dest_csv_file = str(filename_no_extension) + ".csv"
    dest_file = os.path.join(dest_path,dest_csv_file)
    source_file = os.path.join(source_path,file)

    # Open the original file and create a reader object
    with open(source_file, "r", newline="") as infile, open(dest_file, "w", newline="") as outfile:
        reader = csv.reader(infile, dialect="excel-tab")
        writer = csv.writer(outfile)
        writer.writerows(reader)

You can also avoid the explicit row loop by passing the reader straight to the writer's writerows() method, as done above.
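For a script that may run under either Python 2.x or Python 3, the two ways of opening the files can be wrapped in one helper. The following is only a sketch: open_for_csv and convert_folder are hypothetical names, not part of the csv module, and the .txt filter is an assumption based on the question's goal of converting only the .txt files.

```python
import csv
import os
import sys


def open_for_csv(path, mode):
    """Open path for use with the csv module; mode is "r" or "w".

    Hypothetical helper, not part of the standard library.
    """
    if sys.version_info[0] < 3:
        # Python 2: binary mode keeps the platform from rewriting line endings.
        return open(path, mode + "b")
    # Python 3: text mode with newline="" leaves line endings to the csv module.
    return open(path, mode, newline="")


def convert_folder(source_path, dest_path):
    """Convert each tab-separated .txt file in source_path to a .csv in dest_path."""
    converted = []
    for name in sorted(os.listdir(source_path)):
        base, extension = os.path.splitext(name)
        if extension.lower() != ".txt":
            continue  # assumption: only .txt files should be converted
        source_file = os.path.join(source_path, name)
        dest_file = os.path.join(dest_path, base + ".csv")
        with open_for_csv(source_file, "r") as infile:
            with open_for_csv(dest_file, "w") as outfile:
                reader = csv.reader(infile, dialect="excel-tab")
                csv.writer(outfile).writerows(reader)
        converted.append(dest_file)
    return converted
```

Calling convert_folder(source_path, dest_path) with the two paths from the script above returns the list of .csv files it wrote.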

Martin Evans
  • I'm using Python 2.x, so modifying the previous script to use rb and wb as file modes has resolved the issue. Many thanks! – G Suarez Oct 29 '18 at 10:58