I'm writing a script that processes several instances of a class, each of which has a number of attributes and methods. The objects are all placed in a single list (myobjects = [myClass(IDnumber=1), myClass(IDnumber=2), myClass(IDnumber=3)]) and then modified by fairly simple for loops that call specific methods on the objects, of the form
for x in myobjects:
    x.myfunction()
This script uses logging to forward all output to a logfile that I can check later. I'm attempting to parallelize this script, because it's fairly straightforward to do so (example below), and I need a queue to organize the logging output from each Process. This aspect works flawlessly: I can define a new logfile for each process and then pass the object-specific logfile back to my main script, which organizes the main logfile by appending each minor logfile in turn.
from multiprocessing import Process, Queue

queue = Queue()
threads = []
mainlog = 'mylogs.log'  # this is set up in my __init__.py but included here as demonstration
for x in myobjects:
    logfile = str(x.IDnumber) + '.log'  # IDnumber is an int, so convert it before concatenating
    # pass the bound method itself; writing x.myfunction() would call it in the main process
    thread = Process(target=x.myfunction, args=(logfile, queue))
    threads.append(thread)
    thread.start()
for thread in threads:
    if thread.is_alive():
        thread.join()
# append each per-process logfile to the main logfile
with open(mainlog, 'a+') as mainlog_open:
    while not queue.empty():
        minilog = queue.get()
        with open(minilog, 'r') as minilog_open:
            mainlog_open.write(minilog_open.read())
My problem now is that I also need these objects to update a specific attribute, x.success, as True or False. Normally, in serial, x.success is updated at the end of x.myfunction() and is sent down the script where it needs to go, and everything works great. However, in this parallel implementation, x.myfunction populates x.success in the Process, but that information never makes it back to the main script. If I add print(success) inside myfunction(), I see True or False, but if I add for x in myobjects: print(x.success) after the queue.get() block, I just see None. I realize that I can just use queue.put(success) in myfunction() the same way I use queue.put(logfile), but what happens when two or more processes finish simultaneously? There's no guarantee (that I know of) that my queue will be organized like
- logfile (for myobjects[0])
- success = True (for myobjects[0])
- logfile (for myobjects[1])
- success = False (for myobjects[1])
(etc etc)
How can I organize object-specific outputs from a queue, if this queue contains both logfiles and variables? I need to know the content of x.success for each x.myfunction(), so that information has to come back to the main process somehow.
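One possible shape for this, as a minimal sketch rather than a definitive answer: have each process put a single tuple on the queue that carries the object's IDnumber together with its logfile and success value. The order in which processes finish then no longer matters, because every message identifies the object it belongs to. MyClass, collect_results and run_parallel are hypothetical names used only for illustration, and the success value is a placeholder for whatever the real myfunction() computes.

```python
from multiprocessing import Process, Queue


class MyClass:
    """Hypothetical stand-in for the real class in the question."""

    def __init__(self, IDnumber):
        self.IDnumber = IDnumber
        self.success = None

    def myfunction(self, logfile, queue):
        # ... the real work and logging to `logfile` would happen here ...
        self.success = (self.IDnumber % 2 == 1)  # placeholder outcome
        # One message per process: the ID tags both the logfile and the flag.
        queue.put((self.IDnumber, logfile, self.success))


def collect_results(queue, expected):
    """Read `expected` messages and return {IDnumber: (logfile, success)}."""
    results = {}
    for _ in range(expected):
        IDnumber, logfile, success = queue.get()  # blocks until a message arrives
        results[IDnumber] = (logfile, success)
    return results


def run_parallel(myobjects):
    """Start one Process per object and copy the results back onto the objects."""
    queue = Queue()
    processes = []
    for x in myobjects:
        logfile = str(x.IDnumber) + '.log'
        p = Process(target=x.myfunction, args=(logfile, queue))
        processes.append(p)
        p.start()
    # Read one message per process before joining, then wait for the children.
    results = collect_results(queue, len(myobjects))
    for p in processes:
        p.join()
    # The children modified copies, so set success on the main-process objects.
    for x in myobjects:
        logfile, x.success = results[x.IDnumber]
    return results
```

Calling run_parallel(myobjects) from under an if __name__ == '__main__': guard would leave x.success populated on each object in the main process, and the returned dictionary gives the logfile for each IDnumber so the main logfile can still be assembled in whatever order is wanted.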