I'm having trouble editting a csv file using Python. I am receiving a csv file from a supplier and have no flexibility on how the data is being provided. The issue I'm running into, is that one of the columns - the last column in the file - contains very long strings (sometimes upwards of 32000 characters) and when opening the file in excel, the rows with these very long strings in a column will automatically overflow onto the next line. To work around this, I've tried taking the approach of: if there is a very long string in the last column, break that string at a certain point and put a comma (delimiter) there so the very long string is now broken across two columns and will not overflow into a new row. I have achieved this and when opening in a text editor, can see the comma delimiter in the right place. But when opening in excel, I still run into the text overflowing and not getting populated in a second column.
To me, this is suggesting an issue with excel and how it is interpreting the csv file and maybe it is not recognising the comma I have inserted as a delimiter?
The code below is how I've tackled the problem so far:
from tempfile import NamedTemporaryFile
import shutil
import csv
filename = input("File Name in current folder: ")
tempfile = NamedTemporaryFile("w+t", newline='', delete=False)
with open(filename, "rt") as csvFile, tempfile:
reader = csv.reader(csvFile, delimiter=",", quotechar='"')
writer = csv.writer(tempfile, delimiter=",", quotechar='"')
for row in reader:
if len(row[21]) > 30000:
row[21] = row[21][:len(row[21])-20004] + "," + row[21][len(row[21])-20003:]
writer.writerow(row)
shutil.move(tempfile.name, filename)
Thanks and any suggestions would be greatly appreciated!