Problem:

I need to batch-process some Word files with Python to:

  1. check whether they are .doc files
  2. if so, change their name
  3. save them as .docx files

The goal is to then extract some information from the tables contained in the documents with the docx library.
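The first two steps (filtering on the extension and choosing a new name) can be sketched without Word at all. This is only a minimal illustration: the function name `plan_conversions` and the `_converted` suffix are assumptions made for the example, not something taken from the question.

```python
from pathlib import PureWindowsPath


def plan_conversions(paths, suffix="_converted"):
    """Return (source, target) path pairs for every .doc file in `paths`.

    Files with any other extension (including .docx) are skipped.
    `suffix` is a hypothetical naming convention used for the new file name.
    """
    plan = []
    for raw in paths:
        src = PureWindowsPath(raw)
        # Compare case-insensitively so that "NOTES.DOC" is also picked up.
        if src.suffix.lower() != ".doc":
            continue
        target = src.with_name(src.stem + suffix + ".docx")
        plan.append((str(src), str(target)))
    return plan
```

Each resulting pair can then be handed to the Word automation code for the actual save step. Absolute paths are the safer choice there, since Word resolves relative paths against its own working directory.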

I run into an issue when saving files that contain comments: Word shows a popup asking me to confirm that I want to save the file with comments. Execution pauses until an operator manually clicks OK in the popup, so the script cannot run unattended.

Note: The comments don't need to be kept in the .docx files since I won't use them for further computation.

What I do:

Here's the code I have right now. If the .doc file contains comments, it stops before the end of execution until you confirm in Word that you accept keeping them:

import os
import win32com.client

doc_file = "path\\of\\document.doc"
docx_file = "path\\of\\new_document.docx"

word = win32com.client.Dispatch("Word.Application")

# Get the file extension, including the leading dot
file_extension = os.path.splitext(doc_file)[1]

# Convert to .docx only if the original document is a .doc
if file_extension.lower() == '.doc':
    wordDoc = word.Documents.Open(doc_file, False, False, False)
    wordDoc.SaveAs2(docx_file, FileFormat=12)  # 12 = wdFormatXMLDocument
    wordDoc.Close()
else:
    # Otherwise print a message in the console
    print('Extension of document {0} is not .doc, will not be treated'.format(doc_file))
word.Quit()

What I've tried:

I looked for ways to remove the comments before saving, since I do not use them later in the .docx file I create, but I didn't find any satisfactory solution.

Maybe I'm just using the wrong approach and there's a very simple way to dismiss the dialog box, but somehow I didn't find it.

Thanks!

EGI

2 Answers

This seems to do the job, but removes all comments:

import os
import win32com.client

doc_file = "path\\of\\document.doc"
docx_file = "path\\of\\new_document.docx"

word = win32com.client.Dispatch("Word.Application")

# Get the file extension, including the leading dot
file_extension = os.path.splitext(doc_file)[1]

# Convert to .docx only if the original document is a .doc
if file_extension.lower() == '.doc':
    wordDoc = word.Documents.Open(doc_file, False, False, False)
    # Accept all revisions
    word.ActiveDocument.Revisions.AcceptAll()
    # Delete all comments
    if word.ActiveDocument.Comments.Count >= 1:
        word.ActiveDocument.DeleteAllComments()
    wordDoc.SaveAs2(docx_file, FileFormat=12)  # 12 = wdFormatXMLDocument
    wordDoc.Close()
else:
    # Otherwise print a message in the console
    print('Extension of document {0} is not .doc, will not be treated'.format(doc_file))
word.Quit()

Compared to the original code, I only added the part below, which accepts all tracked changes and removes the comments:

    # Accept all revisions
    word.ActiveDocument.Revisions.AcceptAll()
    # Delete all comments
    if word.ActiveDocument.Comments.Count >= 1:
        word.ActiveDocument.DeleteAllComments()

I found the solution here: Python - Using win32com.client to accept all changes in Word Documents

This still doesn't fully answer the initial question, though: it simply gets rid of the comments, which works in my situation because I don't need them. If you need to keep the comments, I still don't know how to proceed.

EGI
  • Following up, I found something that may be relevant here: https://stackoverflow.com/questions/21081870/how-to-dismiss-a-dialog-box-displayed-by-ms-word-when-openning-document-in-pytho but unfortunately the link to the solution is broken, and the solution itself is not reproduced in the discussion thread. – EGI Sep 04 '20 at 09:06
  • Fortunately it has been archived here: https://www.betaarchive.com/wiki/index.php/Microsoft_KB_Archive/259971 Anyway, I don't see how to implement this in my code right now. – EGI Sep 04 '20 at 09:30

I stumbled upon this today:

import os
import win32com.client

doc_file = "path\\of\\document.doc"
docx_file = "path\\of\\new_document.docx"

word = win32com.client.Dispatch("Word.Application")
# Disable the "save with comments" warning
word.Options.WarnBeforeSavingPrintingSendingMarkup = False

# Get the file extension, including the leading dot
file_extension = os.path.splitext(doc_file)[1]

# Convert to .docx only if the original document is a .doc
if file_extension.lower() == '.doc':
    wordDoc = word.Documents.Open(doc_file, False, False, False)
    wordDoc.SaveAs2(docx_file, FileFormat=12)  # 12 = wdFormatXMLDocument
    wordDoc.Close()
else:
    # Otherwise print a message in the console
    print('Extension of document {0} is not .doc, will not be treated'.format(doc_file))
word.Quit()
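For a whole batch, the same option can be set once and Word reused for every file. The sketch below is one possible arrangement, not a tested recipe: the helper names `convert_to_docx` and `convert_many` are made up for this example, and it assumes pywin32 and a local Word installation are available when `convert_many` is called.

```python
WD_FORMAT_XML_DOCUMENT = 12  # the FileFormat value passed to SaveAs2 above


def convert_to_docx(word, doc_path, docx_path):
    """Save `doc_path` as .docx using an already started Word Application object."""
    # Same option as above: suppress the markup warning for this Word session.
    word.Options.WarnBeforeSavingPrintingSendingMarkup = False
    doc = word.Documents.Open(doc_path, False, False, False)
    try:
        doc.SaveAs2(docx_path, FileFormat=WD_FORMAT_XML_DOCUMENT)
    finally:
        # Close the document even if SaveAs2 raises; False means "do not save changes".
        doc.Close(False)


def convert_many(pairs):
    """Start Word once, convert every (source, target) pair, then quit Word."""
    import win32com.client  # imported here so the helper above has no Windows dependency

    word = win32com.client.Dispatch("Word.Application")
    try:
        for doc_path, docx_path in pairs:
            convert_to_docx(word, doc_path, docx_path)
    finally:
        # Always quit, otherwise a hidden WINWORD.EXE process can stay behind.
        word.Quit()
```

Passing the Word object in as a parameter keeps the conversion step separate from starting and quitting Word, and the `try`/`finally` blocks are there so that a failure on one file does not leave a document or the application open.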

An even easier solution is to use wordconv.exe, which is located in your Office installation folder next to WinWord.exe.

The command line looks like this:

wordconv.exe -oice -nme inputfilePath outputFilePath 
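To stay in Python, that command line could be launched with `subprocess`. This is a rough sketch only: the helper names are hypothetical, the location of wordconv.exe has to be supplied by the caller, and I am assuming (not verifying) that the tool signals failure through a non-zero exit code.

```python
import subprocess


def build_wordconv_command(wordconv_exe, input_path, output_path):
    """Assemble the argument list for the command line shown above."""
    return [str(wordconv_exe), "-oice", "-nme", str(input_path), str(output_path)]


def run_wordconv(wordconv_exe, input_path, output_path):
    """Run the converter; raises CalledProcessError on a non-zero exit code."""
    command = build_wordconv_command(wordconv_exe, input_path, output_path)
    # Passing a list avoids shell quoting problems with spaces in paths.
    subprocess.run(command, check=True)
```

Whether the output file actually exists afterwards is worth checking separately, since the exit-code behaviour is only an assumption here.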
AngelM1981