Using Python 3.6 on Windows 10.

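import csv
import string

def to_csv():
    with open('data_set.csv', 'w', newline='', encoding='utf-8') as csvfile:
        # Translation table that deletes every ASCII punctuation character.
        translator = str.maketrans('', '', string.punctuation)
        writer = csv.writer(csvfile, delimiter=',')
        rows = []
        for i in range(1, 190):
            try:
                # Read raw bytes and decode as UTF-8; the with-block closes the file.
                with open("false_text_files/" + str(i) + ".txt", "rb") as file:
                    text = file.read().decode().translate(translator)
            except (OSError, UnicodeDecodeError):
                # Skip files that are missing or cannot be decoded.
                continue
            row = ['no', text]
            rows.append(row)
        for u in rows:
            print(u[1])
            writer.writerow(u)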

For several entries, the text field is split and its remainder appears on the next line of the CSV file, e.g.:

Example image

The text contains no punctuation so I cannot work out why it is being split between two lines. Any help or advice as to what could be going wrong would be greatly appreciated.

Juan Antonio
  • Possible duplicate of [CSV in Python adding an extra carriage return](https://stackoverflow.com/questions/3191528/csv-in-python-adding-an-extra-carriage-return) – r.ook Jan 16 '18 at 17:08
  • Check the file where it occurs. It's possible that the text file contains a BOM (byte order mark): 3 strange characters at the start of the file that tell a text editor it is UTF-encoded. – Jean-François Fabre Jan 16 '18 at 20:32
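
The BOM check suggested in the comment above could be sketched roughly as follows. This is only an illustration: `has_utf8_bom` and `read_text_without_bom` are hypothetical helper names, and it assumes the input files are UTF-8, as the question's code does.

```python
import codecs


def has_utf8_bom(path):
    """Return True if the file at `path` starts with the UTF-8 byte order mark."""
    with open(path, "rb") as f:
        # The UTF-8 BOM is the three bytes EF BB BF.
        return f.read(len(codecs.BOM_UTF8)) == codecs.BOM_UTF8


def read_text_without_bom(path):
    """Read a UTF-8 text file, dropping a leading BOM if one is present."""
    # The "utf-8-sig" codec strips a leading BOM and otherwise decodes as UTF-8.
    with open(path, "r", encoding="utf-8-sig") as f:
        return f.read()
```

For example, `has_utf8_bom("false_text_files/1.txt")` would report whether the first input file carries a BOM.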

1 Answer

In case anyone finds this and is having the same issue, I worked it out. I was using Microsoft Excel to view the CSV file, and Excel continues a cell's contents on the next line if the size of the cell goes beyond a certain cutoff.

Simply viewing the file in a text editor showed that the data had been stored properly all along. So don't make my mistake.
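
If you would rather check this programmatically than by eye, something like the following sketch might help. It reads the file back with `csv.reader` and flags rows that do not have the expected two fields, as well as fields longer than 32,767 characters, which is Excel's documented per-cell limit and quite possibly the cutoff I ran into. `check_csv` and `EXCEL_CELL_LIMIT` are names made up for this example.

```python
import csv

EXCEL_CELL_LIMIT = 32767  # maximum number of characters Excel documents for one cell


def check_csv(path, expected_columns=2):
    """Return (row_count, problems); problems is a list of (row_number, description)."""
    row_count = 0
    problems = []
    # newline="" lets the csv module handle line endings inside quoted fields.
    with open(path, newline="", encoding="utf-8") as f:
        for row_number, row in enumerate(csv.reader(f), start=1):
            row_count += 1
            if len(row) != expected_columns:
                problems.append((row_number, "has %d fields" % len(row)))
            for field in row:
                if len(field) > EXCEL_CELL_LIMIT:
                    problems.append(
                        (row_number, "field of %d characters exceeds Excel's cell limit" % len(field))
                    )
    return row_count, problems
```

Calling `check_csv('data_set.csv')` should return as many rows as text files were read. Any over-limit fields it reports are the ones Excel is likely to display across more than one line, even though the file itself is fine.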