I know about the limitations of Twisted for multiprocess applications, but my question is different. I am not trying to run a server or client using multiple processes. I already have a working application that takes a number of directories and performs some operations on them. I want to divide the work into chunks, spawning a process that runs the same application for each subdirectory. I can already do this by hand by running the application multiple times from the shell and passing a different subdirectory as an argument each time.
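For reference, the manual shell approach can be sketched in Python roughly as follows. This is only an illustration of launching fully independent interpreters, not the actual application: `run_per_directory` and `base_command` are hypothetical names, and the sketch starts all processes at once without any cap on their number.

```python
import subprocess
import sys


def run_per_directory(base_command, subdirectories):
    """Start one independent process per subdirectory and wait for all of them.

    base_command is the command line of the application as a list
    (hypothetical, e.g. [sys.executable, "app.py"]); the subdirectory is
    appended as the last argument, as when running it by hand from the shell.
    """
    # Each Popen starts a brand-new interpreter, so nothing (including any
    # module-level state such as a reactor) is inherited from this process.
    procs = [subprocess.Popen(base_command + [subdir]) for subdir in subdirectories]
    # Wait for every child and collect the exit codes in the same order.
    return [proc.wait() for proc in procs]
```

A call such as `run_per_directory([sys.executable, "app.py"], ["dir_a", "dir_b"])` would then correspond to two separate shell invocations, where `app.py` stands in for the real entry point.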
In the main I have something like:
from multiprocessing import Pool
...
p = Pool(num_procs)
work_chunks = [work_chunk] * len(configs)
p.map(run_work_chunk, zip(work_chunks, configs))
p.close()
p.join()
where:
def run_work_chunk((work_chunk, config)):
    from twisted.internet import reactor
    d = work_chunk.configure(config)
    d.addCallback(lambda _: work_chunk.run())
    d.addErrback(handleLog)
    print "pid=", getpid(), "reactor=", id(reactor)
    reactor.run()
    return

class WorkChunk(object):
    ...
    def run(self):
        # do stuff
        ...
        reactor.stop()
If num_procs is 2, the output looks something like:
pid=2 reactor=140612692700304
pid=6 reactor=140612692700304
No output ever appears for the workers that should be handling the remaining chunks.
The problem is that when reactor.stop() is called, it stops all the reactors, because each process uses the same reactor. I thought that spawning a new process copied the whole stack, but in this case only the reference to the reactor seems to be copied, so all processes end up using the same reactor object.
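One caveat about the evidence above: in CPython, id() is the object's memory address, and a forked child gets its own copy of the parent's memory at the same virtual addresses, so equal id(reactor) values may not prove that the object is shared. The sketch below is a Twisted-free way to check this. The names `state`, `_child`, and `check_isolation` are hypothetical, and it assumes Python 3 on a POSIX system with the fork start method.

```python
import multiprocessing

# Stand-in for a module-level object such as the reactor (hypothetical).
state = {"stopped": False}


def _child(queue):
    # Mutate the object inside the child, then report its id and flag.
    state["stopped"] = True
    queue.put((id(state), state["stopped"]))


def check_isolation():
    """Fork one child, let it mutate `state`, and compare with the parent."""
    ctx = multiprocessing.get_context("fork")  # assumption: POSIX, Python 3
    queue = ctx.Queue()
    child = ctx.Process(target=_child, args=(queue,))
    child.start()
    child_id, child_flag = queue.get()  # read before join to avoid blocking
    child.join()
    return {
        "same_id": child_id == id(state),   # same address in both processes
        "child_flag": child_flag,           # the child saw its own change
        "parent_flag": state["stopped"],    # the parent's copy is untouched
    }
```

If `check_isolation()` reports `same_id` as true while `parent_flag` stays false, matching ids are not by themselves a sign of sharing, and the cause of the missing output may have to be looked for elsewhere.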
Is there a way to instantiate a different reactor object for each process, as if each one were a completely separate process rather than a child process?