I am looking for a way to convert .doc files to .pdf in Python 2.7.x. Handling .doc files in Python seems far less straightforward than handling .docx or .pdf. The most suitable working solution I have found so far drives Word through comtypes, but when I extend it to loop over the .doc files in a given directory I get this error:
_ctypes.COMError: (-2146823114, None, (u"Sorry, we couldn't find your file. Was it moved, renamed, or deleted?\r (C:\\windows\\system32\\PrivateCourse_AR.doc)", u'Microsoft Word', u'wdmain11.chm', 24654, None))
Here is the code:
import os
import comtypes.client
os.chdir('C:\Users\Domi\PycharmProjects\STStransl-auto\doc')
path = os.getcwd()
print path
input = os.listdir(path)
print input
print len(input)
wdFormatPDF = 17 #pdf
i=0
output = '.\doc2txt_{}'.format(i)
word = comtypes.client.CreateObject('Word.Application')
for file in input:
    if file.endswith('.doc'):
        print file
        doc = word.Documents.Open(file)
        doc.SaveAs(output, FileFormat=wdFormatPDF)
        i += 1
        doc.Close()
word.Quit()
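One thing I noticed: the path in the error message is under C:\windows\system32, not my doc folder, so Word appears to resolve the bare file name against its own working directory rather than the one I set with os.chdir. Below is a minimal sketch that passes absolute paths instead. I have not confirmed that this is the fix, and the helper names build_jobs and convert_all are made up for illustration. It also names each PDF after its source file, whereas my code above formats output only once, before the loop, so every file would be saved under the same name.

```python
import os

WD_FORMAT_PDF = 17  # same value as wdFormatPDF in the code above


def build_jobs(src_dir, dst_dir=None):
    """Return (absolute .doc path, absolute .pdf path) pairs for src_dir.

    Hypothetical helper: it only builds paths and does not touch Word.
    """
    src_dir = os.path.abspath(src_dir)
    dst_dir = os.path.abspath(dst_dir) if dst_dir else src_dir
    jobs = []
    for name in sorted(os.listdir(src_dir)):
        # Keep .doc only (case-insensitive); .docx and other files are skipped.
        if not name.lower().endswith('.doc'):
            continue
        stem = os.path.splitext(name)[0]
        jobs.append((os.path.join(src_dir, name),
                     os.path.join(dst_dir, stem + '.pdf')))
    return jobs


def convert_all(src_dir, dst_dir=None):
    """Convert every .doc in src_dir to PDF (needs Windows, Word and comtypes)."""
    import comtypes.client  # imported here so build_jobs works without comtypes

    word = comtypes.client.CreateObject('Word.Application')
    try:
        for doc_path, pdf_path in build_jobs(src_dir, dst_dir):
            print('%s -> %s' % (doc_path, pdf_path))
            doc = word.Documents.Open(doc_path)  # absolute path, not a bare name
            try:
                doc.SaveAs(pdf_path, FileFormat=WD_FORMAT_PDF)
            finally:
                doc.Close()  # close each document even if SaveAs fails
    finally:
        word.Quit()  # always shut Word down, even after an error
```

Calling convert_all(r'C:\Users\Domi\PycharmProjects\STStransl-auto\doc') would then replace the loop above; build_jobs can be run on its own to check the paths before Word is involved.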
Any advice on the code, or on how to handle .doc files efficiently in Python, is welcome and much appreciated. I am working on an automation script for .docx and .pdf files (merging, extracting text, and splitting text into multiple files), and those cause no problems. Unfortunately, I also have a lot of .doc files. Thanks a lot.