I have converted a PDF file to a text file, and then converted that text file to a CSV file. My problem is that the contents of the CSV file are spread across multiple columns (A, B, C, D, E), whereas I want everything in a single column, i.e. column A. How can I write the contents of these columns into only one column?
I've tried using merge, concatenate, and join functions, but none of them helped.
Here's my code:
import os.path
import csv
import pdftotext
# Load your PDF
with open("crimestory.pdf", "rb") as f:
    pdf = pdftotext.PDF(f)
# Save all text to a txt file.
with open('crimestory.txt', 'w') as f:
    f.write("\n\n".join(pdf))
save_path = "/home/mayureshk/PycharmProjects/NLP/"
completeName_in = os.path.join(save_path, 'crimestory' + '.txt')
completeName_out = os.path.join(save_path, 'crimestoryycsv' + '.csv')
file1 = open(completeName_in)
In_text = csv.reader(file1, delimiter=',')
file2 = open(completeName_out, 'w')
out_csv = csv.writer(file2)
file3 = out_csv.writerows(In_text)
file1.close()
file2.close()
The expected output is a CSV file with all of the text in column A and the remaining columns empty.
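A note on why the columns appear: `csv.reader(file1, delimiter=',')` splits every line of the text file at each comma, so a sentence containing four commas ends up in columns A to E. A minimal sketch of one way around this is to skip `csv.reader` and pass each line to `csv.writer` as a single-field row, which makes the writer quote any commas inside it. The helper name `lines_to_single_column` and the sample text are made up for illustration, and dropping blank lines is an assumption about the desired output.

```python
import csv
import io


def lines_to_single_column(text_stream, csv_stream):
    """Write each non-blank line of text_stream as a one-field CSV row.

    Hypothetical helper for illustration; returns the number of rows written.
    """
    writer = csv.writer(csv_stream)
    count = 0
    for line in text_stream:
        line = line.rstrip("\n")
        if line.strip():  # assumption: blank lines are not wanted in the CSV
            # A one-element list means one column; commas in the text get quoted.
            writer.writerow([line])
            count += 1
    return count


# In-memory demo with made-up text; real files would be opened with
# open(path) for reading and open(path, "w", newline="") for writing.
sample = io.StringIO("He ran, fell, and got up.\n\nThen, silence.\n")
out = io.StringIO()
lines_to_single_column(sample, out)
print(out.getvalue())
```

With real files, the same function could be called with the two file objects opened from `completeName_in` and `completeName_out`.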