I have a very big text file (4+ GB) and a script which splits it into small files based on the characters before the first comma, e.g. a line starting with 16,.... goes to 16.csv and a line starting with 61,.... goes to 61.csv. Unfortunately this script runs for ages, I guess because of the write-out method. Is there any way to speed it up?
import pandas as pd
import csv

with open(r"updates//merged_lst.csv", encoding="utf8", errors='ignore') as f:
    r = f.readlines()
    for i in range(len(r)):
        row = r[i]
        letter = r[i].split(',')[0]
        filename = r"import//" + letter.upper() + ".csv"
        with open(filename, 'a', encoding="utf8", errors='ignore') as f:
            f.write(row)
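
For what it's worth, here is a rough, untested sketch of the kind of change I have in mind: stream the input line by line instead of readlines(), and keep one handle per output file in a dict so each target file is opened only once. This assumes the set of prefixes (16, 61, ...) is small enough to keep all output files open at the same time; the paths just mirror the ones above.

handles = {}
try:
    with open(r"updates//merged_lst.csv", encoding="utf8", errors="ignore") as src:
        for row in src:  # stream the file; no full readlines() copy in memory
            key = row.split(",", 1)[0].upper()
            out = handles.get(key)
            if out is None:  # open each target file only once
                out = open(r"import//" + key + ".csv", "a",
                           encoding="utf8", errors="ignore")
                handles[key] = out
            out.write(row)
finally:
    for out in handles.values():
        out.close()

Is this the right direction, or is there a faster approach?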