
Edit: For those finding this later, this is explicitly not a duplicate of the other topic that somebody linked to. The problem is unrelated to stdout buffering; rather, it was a misunderstanding of how imap_unordered calls do_something (lazily, only when its results are consumed).

I'm trying to debug a separate issue, which requires some print statements in my Python multiprocessing code. I can print() fine in the main process, but when I spawn new processes, I can't successfully print anything.

Here is a bare bones example of my issue:

import argparse
from multiprocessing import Pool, get_context


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--foo', required=True)
    global args # need this if I'm reading from threads / forks
    args = parser.parse_args()

    print('This prints fine')
    with get_context('spawn').Pool(processes=4) as pool:
        pool.imap_unordered(do_something, [0, 1, 2, 3, 4 ])

    return

def do_something(i):
    print(args.foo * 2)
    print("this doesn't print either")

if __name__=="__main__":
    main()

What is the correct way to print from the do_something function?

Please note that I am running Python from Bash on an Ubuntu 18.04 machine. I'm not using IDLE or any IDE, which I see similar questions about.

Edit: Please note that the issue is also not that the printing is delayed; it does not happen at all, even if I flush the stdout buffer in the do_something function.

Final edit: It looks like my code was not actually calling the do_something function. When I force it to, through the methods described in the answers below, I do see the expected prints. This code produces the expected output:

import collections
from multiprocessing import Pool, get_context


def main():
    print('This prints fine')
    with get_context('spawn').Pool(processes=4) as pool:
        results = collections.deque(pool.imap_unordered(do_something, [0, 1, 2, 3, 4 ]), 0)

    print(results)
    return

def do_something(i):
    print("this doesn't print either") # Actually does print
    return i


if __name__=="__main__":
    main()
A. P.
    Why are you trying to call `imap_unordered` with a function that takes no arguments? – user2357112 Jul 20 '20 at 23:48
  • @user2357112 supports Monica Just a typo. This is based on code that does something different which I'm trying to debug. I'll add an argument to avoid confusion. Thanks. – A. P. Jul 20 '20 at 23:50
    Also, your workers don't have an `args` global. Despite how hard `multiprocessing` tries to pretend to be `threading`, it's nothing like `threading`, and one of the ways that manifests is that processes don't share variables. – user2357112 Jul 20 '20 at 23:50
  • @user2357112 supports Monica that's actually what I'm *trying* to debug (trying to initialize the spawned processes with the args of the parent), but I thought it was too complicated to put in a question. Wanted to take it one step at a time. Perhaps I'll post this differently with the larger problem. – A. P. Jul 20 '20 at 23:52
  • @JohnGordon unfortunately that doesn't answer it. Even if I import sys and flush stdout all within that `do_something` method, still nothing prints. – A. P. Jul 20 '20 at 23:54
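(For reference, a minimal sketch, not from the thread, of one way to make the parent's parsed args visible in spawned workers: pass them through the Pool's `initializer`. The `init_worker` name and the `--foo` default are illustrative assumptions:)

```python
import argparse
from multiprocessing import get_context


def init_worker(parsed_args):
    # Runs once in each spawned worker process; stashes the parent's
    # parsed args in a module-level global that workers can read.
    global args
    args = parsed_args


def do_something(i):
    print(args.foo * 2, flush=True)  # now visible in spawned workers
    return i


def main():
    parser = argparse.ArgumentParser()
    # a default instead of required=True, so this sketch runs without flags
    parser.add_argument('--foo', type=int, default=2)
    parsed = parser.parse_args()
    with get_context('spawn').Pool(
            processes=4, initializer=init_worker,
            initargs=(parsed,)) as pool:
        results = list(pool.imap_unordered(do_something, range(5)))
    print(sorted(results))


if __name__ == "__main__":
    main()
```

The key point is that spawned processes do not inherit the parent's globals, so anything the workers need must be sent explicitly, e.g. via `initargs` (which must be picklable).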

2 Answers


Edit: I said that imap_unordered is lazy, and while that's true, it's not actually the reason your code doesn't run. Your code doesn't run because nothing causes the main process to wait for the pool children to finish before terminating the pool. The pool is terminated when the with block exits. A straightforward way to wait is to iterate through the results of imap_unordered.

Original answer: imap_unordered is lazy. So lazy, in fact, that it isn't actually running your code yet. You'll need to iterate it to retrieve its return values. That's not all that's wrong with your program, but that should at least get you moving forward!
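For example, a minimal sketch of that fix, iterating inside the `with` block so the pool isn't terminated before the tasks run:

```python
from multiprocessing import get_context


def do_something(i):
    print(i * 2, flush=True)
    return i


def main():
    with get_context('spawn').Pool(processes=4) as pool:
        # Iterating forces the lazy imap_unordered to dispatch the tasks,
        # and the loop doesn't end until every result has arrived.
        for result in pool.imap_unordered(do_something, [0, 1, 2, 3, 4]):
            print('got', result)


if __name__ == '__main__':
    main()
```

The prints from the workers arrive in whatever order the tasks finish, which is exactly what imap_unordered promises.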

Weeble
    FYI, the `itertools` module provides a convenient recipe for consuming an iterator when you don't care about the results. It can be inlined pretty easily in this case via `collections.deque(pool.imap_unordered(do_something, [0, 1, 2, 3, 4 ]), 0)`, and it will be enough to ensure the entire imapping process is finished before you exit the `with` and terminate the pool (but without storing any results from the tasks; they're pulled and discarded as quickly as possible). – ShadowRanger Jul 21 '20 at 00:10
  • I updated my answer because I realised after looking more closely that I wasn't correct. I ran the program and saw that this fixed it, but initially misunderstood the cause. – Weeble Jul 21 '20 at 00:10
  • Heh, that's pretty cute! I had to look at it a while to understand why it works! It always seems weird that the itertools has all these recipes in the docs and doesn't just make them functions. Batteries not included. Some assembly required. – Weeble Jul 21 '20 at 00:19
  • Thanks all. I will update the code above with code that definitely calls `do_something`, but still doesn't print. – A. P. Jul 21 '20 at 00:28
  • Just tried the collections.deque() method above, and it does print now. Thanks for your help everyone. Will edit my question. – A. P. Jul 21 '20 at 00:35

It may be a little bit of a hack, but I put the code in another script and launched it with

import os
os.system('python "the_code_name" &')

and they both run in two processes.
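If you do take the separate-script route, `subprocess` is a more robust way to launch it than `os.system` (a minimal sketch; `the_code_name.py` is a placeholder filename, not a real file):

```python
import subprocess
import sys

# Run the other script with the same interpreter, without going through
# a shell, and capture whatever it prints.
proc = subprocess.run(
    [sys.executable, 'the_code_name.py'],
    capture_output=True, text=True,
)
print(proc.stdout)
```

Unlike `os.system(... &)`, `subprocess.run` waits for the child and hands you its output; for a fire-and-forget background launch, `subprocess.Popen` with no wait would be the closer equivalent.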