
In a multiprocessing application, a main process spawns multiple subprocesses, and each process is meant to run its own Tornado IOLoop. However, I noticed that when the processes are started, all the instances of IOLoop.current() (in the main process and in all the subprocesses) are the same. Wouldn't that mean that ioloop.spawn_callback(my_func) runs everything in one IOLoop context (in the main process)?

Here's a minimal example that I could extract:

from tornado.ioloop import IOLoop
import time
from multiprocessing import Process

def sub(i):
    # Report the identity of this process's "current" IOLoop.
    print('sub %d: %s' % (i, hex(id(IOLoop.current(True)))))
    for _ in range(10):
        time.sleep(1)


def main():
    print('main  ', hex(id(IOLoop.current(True))))

    for i in range(2):
        sub_process = Process(target=sub, args=(i,))
        sub_process.daemon = True
        sub_process.start()

    time.sleep(5)

main()

Output:

main   0x7f14a09cf750
sub 0: 0x7f14a09cf750
sub 1: 0x7f14a09cf750

Are the processes created correctly, and isn't the expected behaviour that each process would have its own IOLoop instance?

orange

1 Answer


This is mentioned in Tornado's docs:

    it is important that nothing touches the global IOLoop instance (even indirectly) before the fork

You can get the behavior you want using a slightly modified main function:

def main():
    processes = []
    for i in range(2):
        process = Process(target=sub, args=(i,))
        process.daemon = True
        process.start()
        processes.append(process)
    # Touch the IOLoop only *after* the fork, so each child builds its own.
    print('main  ', hex(id(IOLoop.current(True))))
    time.sleep(5)

Output:

main   0x7fbd4ca0da30
sub 0: 0x7fbd4ca0db50
sub 1: 0x7fbd4ca0dc40
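
Tornado also ships a helper that encapsulates this fork-before-loop pattern, tornado.process.fork_processes (Unix-only). A minimal sketch of how it could be used; the callback and the process count here are just illustrative:

from tornado import process
from tornado.ioloop import IOLoop

def hello():
    print('hello from task %d' % task_id)
    IOLoop.current().stop()

# Fork first: fork_processes returns the child's task id in each child,
# while the parent supervises the children and does not return here.
task_id = process.fork_processes(2)

# Only now is an IOLoop created, so each child gets its own instance.
loop = IOLoop.current()
loop.spawn_callback(hello)
loop.start()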

Edit

As for the explanation: the sharing is due to how fork is implemented on Linux, using COW (copy-on-write). This means that unless you write to the shared object in the child process, both parent and child will share the same object (hence the identical addresses). As soon as the child modifies the shared object, it is copied and changed; these changes won't be visible in the parent.
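
As a minimal sketch of COW semantics with a plain module-level object (assuming a fork-based start method, i.e. Linux), note that the child's write affects only its own copy:

from multiprocessing import Process

shared = {'value': 0}  # created before the fork, so initially shared via COW

def child():
    shared['value'] = 42          # this write triggers a copy in the child
    print('child sees:', shared)  # {'value': 42}

if __name__ == '__main__':
    p = Process(target=child)
    p.start()
    p.join()
    print('parent sees:', shared)  # still {'value': 0}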

Ionut Ticus
  • The pertinent difference between our snippets is that in yours the access to `IOLoop.current()` occurs after the creation of the subprocesses. What I don't understand is how an object reference can be shared like this across processes and why it's done; it's actually quite difficult to do that, so I guess it's intended behaviour (but why?). Other than what the docs (your link) exemplify, I don't want to share the IOLoop (in their example, network ports are shared, which may require some synchronisation effort, I guess). This is all a bit magic to me, so I guess I'm missing something here... – orange Apr 20 '20 at 23:57
  • Is the edited explanation regarding COW satisfactory? – Ionut Ticus May 04 '20 at 14:03
  • I didn't see the edit - thanks. Does this mean that `IOLoop.current().add_handler(...)` in the child process would constitute a **write** which copies the object and thus these event loops will be absolutely separate? – orange May 05 '20 at 01:10
  • Yes, but you should test it. On another note, the `hex(id(object))` testing methodology is not correct across separate processes, because objects with the same id in different processes do not necessarily represent the same object. So for checking you would need to find another method (perhaps checking the IOLoop's `handlers` dict if you are using `add_handler`). – Ionut Ticus May 05 '20 at 10:18
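
For instance, a more direct check than comparing ids (a sketch, assuming Tornado 5+): create each child's loop only after the fork and let it actually run a callback, which only works if each process has its own independent loop:

import os
from multiprocessing import Process
from tornado.ioloop import IOLoop

def sub(i):
    # Each child creates its IOLoop after the fork, runs one callback
    # on it, and then stops it; this exercises the loop instead of
    # merely comparing object ids across processes.
    loop = IOLoop.current(True)
    loop.spawn_callback(lambda: print('loop %d running in pid %d' % (i, os.getpid())))
    loop.call_later(0.1, loop.stop)
    loop.start()

if __name__ == '__main__':
    processes = [Process(target=sub, args=(i,)) for i in range(2)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()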