I have this script from my professor that prints metadata from a folder containing PDF files. I need to be able to export this data to a newly created CSV file.
I can't figure out where in the script the CSV writing should go, or what code it needs.
Here is the script for the PDF data extraction:
#!/usr/bin/env python
import csv
import os
import pyPdf
from pyPdf import PdfFileReader
print "Please enter the path containing your PDF files for analysis."
print '-' * 61
targ_dir = raw_input("Path: ")
file_names = os.listdir(targ_dir)
pdfMetadata = open(r'E:\CVF\Python\Python Class\PDF_metadata.csv','w')
def getPDFdata(PDFFile):
    pdf = PdfFileReader(file(PDFFile, 'rb'))
    if pdf.isEncrypted:
        pdf.decrypt('')
    metadata = pdf.getDocumentInfo()
    print PDFFile
    for info in metadata:
        try:
            print info+"::"+metadata[info]
        except UnicodeEncodeError:
            print "BAD CHARACTER ERROR"
    print "__________________________________________"

for item in file_names:
    getPDFdata(targ_dir+"\\"+item)
end = raw_input("Press Enter to Finish: ")
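
For reference, here is a minimal sketch of what the CSV export step might look like. It is not part of the professor's script. It assumes each PDF's metadata is available as a dictionary (which is how the script iterates over the result of getDocumentInfo()), and it is written for Python 3, whereas the script above is Python 2. The helper name write_metadata_csv, the file/field/value column layout, and the sample data are all hypothetical.

```python
import csv
import io


def write_metadata_csv(rows, out_file):
    """Write one CSV row per (file name, field, value) triple.

    rows is an iterable of (file_name, metadata_dict) pairs;
    out_file is any open text-mode file object.
    """
    writer = csv.writer(out_file)
    writer.writerow(["file", "field", "value"])  # header row (assumed layout)
    for file_name, metadata in rows:
        for field, value in metadata.items():
            writer.writerow([file_name, field, value])


# Stand-in data shaped like the metadata the script prints; values are made up.
sample = [("example.pdf", {"/Title": "Demo", "/Author": "Someone"})]

buffer = io.StringIO()  # in-memory file so the sketch runs without touching disk
write_metadata_csv(sample, buffer)
csv_text = buffer.getvalue()
```

With a real file in Python 3, open(path, 'w', newline='') would take the place of the StringIO buffer. Under Python 2, the csv module expects the file to be opened in binary mode ('wb') instead.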