I'm working on a Python program that is supposed to read incoming MS-Word documents in a client/server fashion, i.e. the client sends a request (one or multiple MS-Word documents) and the server reads specific content from those requests using pythoncom and win32com.

Because I want to minimize waiting time for the client (the client needs a status message from the server), I do not want to open an MS-Word instance for every request. Hence, I intend to keep a pool of running MS-Word instances from which the server can pick. This, in turn, means I have to reuse those instances from the pool in different threads, and this is what causes trouble right now. After fixing an error I previously asked about on Stack Overflow, my code now looks like this:

import pythoncom, win32com.client, threading, psutil, os, queue, time, datetime

class WordInstance:
    def __init__(self,app):
        self.app = app
        self.flag = True

appPool = {'WINWORD.EXE': queue.Queue()}

def initAppPool():
    global appPool
    wordApp = win32com.client.DispatchEx('Word.Application')
    appPool["WINWORD.EXE"].put(WordInstance(wordApp))  # for testing, only one MS-Word instance is used for now

def run_in_thread(instance, appid, path):
    print(f"[{datetime.datetime.now()}] open doc ... {threading.current_thread().name}")
    pythoncom.CoInitialize()  # initialize COM for this thread
    # unmarshal the Word application interface passed in from the main thread
    wordApp = win32com.client.Dispatch(pythoncom.CoGetInterfaceAndReleaseStream(appid, pythoncom.IID_IDispatch))
    doc = wordApp.Documents.Open(path)
    doc.SaveAs(rf'{path}.FB.pdf', FileFormat=17)  # 17 = wdFormatPDF
    doc.Close()
    print(f"[{datetime.datetime.now()}] close doc ... {threading.current_thread().name}")
    instance.flag = True  # mark the instance as free again

if __name__ == '__main__':
    initAppPool()

    pathOfFile2BeRead1 = r'C:\Temp\file4.docx'
    pathOfFile2BeRead2 = r'C:\Temp\file5.docx'

    #treat first request
    wordApp = appPool["WINWORD.EXE"].get(True, 10)
    wordApp.flag = False 
    pythoncom.CoInitialize()
    wordApp_id = pythoncom.CoMarshalInterThreadInterfaceInStream(pythoncom.IID_IDispatch, wordApp.app) 
    readDocjob1 = threading.Thread(target=run_in_thread,args=(wordApp,wordApp_id,pathOfFile2BeRead1), daemon=True)
    readDocjob1.start() 
    appPool["WINWORD.EXE"].put(wordApp)


    #wait here until readDocjob1 is done 
    wait = True
    while wait:
        try:
            wordApp = appPool["WINWORD.EXE"].get(True, 1)
            if wordApp.flag:
                print(f"[{datetime.datetime.now()}] ok appPool extracted")
                wait = False
            else:
                appPool["WINWORD.EXE"].put(wordApp)
        except queue.Empty:
            print(f"[{datetime.datetime.now()}] error: appPool empty")
        except BaseException as err:
            print(f"[{datetime.datetime.now()}] error: {err}")

    wordApp.flag = False
    openDocjob2 = threading.Thread(target=run_in_thread,args=(wordApp,wordApp_id,pathOfFile2BeRead2), daemon=True)
    openDocjob2.start()
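
To make the intent of the pool clearer, the checkout/return behaviour I am after, stripped of all COM calls, is roughly the following. This is only a sketch under assumptions: `InstancePool`, `lease` and `convert` are names made up for illustration and are not part of my script, and a plain string stands in for a real MS-Word instance.

```python
import queue
import threading
from contextlib import contextmanager


class InstancePool:
    """Hypothetical pool that hands each instance to one job at a time."""

    def __init__(self, instances):
        self._free = queue.Queue()
        for inst in instances:
            self._free.put(inst)

    @contextmanager
    def lease(self, timeout=10):
        # Block until an instance is free; queue.Empty is raised on timeout.
        inst = self._free.get(True, timeout)
        try:
            yield inst
        finally:
            # The instance goes back only after the job is done with it.
            self._free.put(inst)


def convert(pool, path, results):
    # Stand-in for the real job (open document, save as PDF, close).
    with pool.lease() as inst:
        results.append((inst, path))


pool = InstancePool(["word-1"])  # a string stands in for a Word instance
results = []
jobs = [threading.Thread(target=convert, args=(pool, p, results))
        for p in ("a.docx", "b.docx")]
for job in jobs:
    job.start()
for job in jobs:
    job.join()
```

With a single instance in the pool, the second job simply waits in `lease` until the first one has returned the instance. The script above is what I actually run.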

When I run the script, I receive the following output in the terminal:

[2022-03-29 11:41:08.217678] open doc ... Thread-1
[2022-03-29 11:41:10.085999] close doc ... Thread-1
[2022-03-29 11:41:10.085999] ok appPool extracted
[2022-03-29 11:41:10.085999] open doc ... Thread-2

Process finished with exit code 0

Only the first Word file is converted to a PDF. It seems that run_in_thread terminates after the print statement and before or during pythoncom.CoInitialize(). Sadly, I do not receive any error message, which makes it hard to understand the cause of this behavior.
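
One thing I can still do to narrow this down: both threads are started as daemon threads and the main thread never joins them, and Python stops daemon threads abruptly when the main program exits. A minimal sketch of how a failure inside a worker could be made visible, assuming nothing about COM (`guarded` and `failing_job` are made-up names, and `failing_job` only stands in for run_in_thread):

```python
import threading
import traceback

errors = []  # exceptions collected from worker threads


def guarded(target, *args):
    """Run target(*args) and record any exception instead of losing it."""
    try:
        target(*args)
    except Exception as err:
        errors.append(err)
        traceback.print_exc()  # print the full traceback to stderr


def failing_job(path):
    # Stand-in for the real worker; it fails on purpose.
    raise RuntimeError(f"could not process {path}")


job = threading.Thread(target=guarded, args=(failing_job, "demo.docx"), daemon=True)
job.start()
# Wait for the worker; otherwise the main thread may exit first
# and the daemon thread is stopped without finishing.
job.join(timeout=30)
```

I have not confirmed that the missing join is the cause in my case; the sketch only shows how to rule it out and how to surface an exception that would otherwise go unnoticed.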

After reading Microsoft's documentation, I tried using pythoncom.CoInitializeEx(pythoncom.APARTMENTTHREADED) instead of pythoncom.CoInitialize(), since my COM object needs to be called by multiple threads. However, this changed nothing.

NB149
  • Add more *print*s in *run\_in\_thread*. Something tells me that it's *pythoncom.CoInitialize()* from *run\_in\_thread* (as *Python* threads are not real threads). – CristiFati Apr 04 '22 at 07:45
  • Yes it is. When I add a print statement directly after `pythoncom.CoInitialize()` it does not get printed. – NB149 Apr 10 '22 at 15:23
