I am trying to combine over 100,000 CSV files (all same formats) in a folder using below script. Each CSV file is on average 3-6KB of size. When I run this script, it only opens exact 47 .csv files and combines. When I re-run it only combines same .csv files, not all of them. I don't understand why it is doing that?
import os
import glob
os.chdir("D:\Users\Bop\csv")
want_header = True
out_filename = "combined.files.csv"
if os.path.exists(out_filename):
os.remove(out_filename)
read_files = glob.glob("*.csv")
with open(out_filename, "w") as outfile:
for filename in read_files:
with open(filename) as infile:
if want_header:
outfile.write('{},Filename\n'.format(next(infile).strip()))
want_header = False
else:
next(infile)
for line in infile:
outfile.write('{},{}\n'.format(line.strip(), filename))