I am relatively new to Python and new to Python multiprocessing. I am trying to execute a function repeatedly (10 times in the example below) as separate processes. Within the function I have a loop of multiple iterations (again, 10 times in the example below). In this example, and for simplicity, the function just sets a random number. Judging by the logging I've added, I have at least two problems. Firstly, while I see iteration 1 of each process instance starting and setting the random number, iteration 2 of each process instance starts but appears to do nothing beyond that and iterations 3-10 of each process don't appear to start at all. Secondly, there appears to be cross-contamination between process instances, with multiple processes apparently setting the same random number.

My code

import concurrent.futures
import numpy as np

def do_it(n):
    for i in range(1,11):
        print("do_it instance" + " " + str(n) + ", iteration " + str(i) + " starting")
        try:
            rnd
        except NameError:
            print("do_it instance" + " " + str(n) + ", iteration " + str(i) + ": Variable is not defined")
        else:
            print("do_it instance" + " " + str(n) + ", iteration " + str(i)+ ": Variable is already defined as " + str(random))
        rnd = np.random.randint(1,1000)
        print("do_it instance" + " " + str(n) + ", iteration " + str(i) + ": rnd set to " + str(rnd))
        return rnd
def main():

    with concurrent.futures.ProcessPoolExecutor() as executor:
        results = executor.map(do_it, range(1,11))
    print(results)

My aim is to end up with a list of 100 random numbers, one from each of the 10 process instances and 10 iterations therein. There's undoubtedly a simple explanation and solution to this but searching online for answers has not helped me. Are there any Python experts who can help?

Log

do_it instance 1, iteration 1 starting
do_it instance 1, iteration 1: Variable is not defined
do_it instance 2, iteration 1 starting
do_it instance 2, iteration 1: Variable is not defined
do_it instance 3, iteration 1 starting
do_it instance 3, iteration 1: Variable is not defined
do_it instance 1, iteration 1: rnd set to 807
do_it instance 1, iteration 2 starting
do_it instance 2, iteration 1: rnd set to 807
do_it instance 2, iteration 2 starting
do_it instance 3, iteration 1: rnd set to 807
do_it instance 3, iteration 2 starting
do_it instance 4, iteration 1 starting
do_it instance 4, iteration 1: Variable is not defined
do_it instance 4, iteration 1: rnd set to 666
do_it instance 4, iteration 2 starting
do_it instance 5, iteration 1 starting
do_it instance 5, iteration 1: Variable is not defined
do_it instance 5, iteration 1: rnd set to 666
do_it instance 5, iteration 2 starting
do_it instance 6, iteration 1 starting
do_it instance 6, iteration 1: Variable is not defined
do_it instance 6, iteration 1: rnd set to 807
do_it instance 6, iteration 2 starting
do_it instance 7, iteration 1 starting
do_it instance 8, iteration 1 startingdo_it instance 7, iteration 1: Variable is not defined

do_it instance 8, iteration 1: Variable is not defined
do_it instance 8, iteration 1: rnd set to 666
do_it instance 8, iteration 2 startingdo_it instance 7, iteration 1: rnd set to 666

do_it instance 9, iteration 1 startingdo_it instance 7, iteration 2 starting

do_it instance 9, iteration 1: Variable is not defined
do_it instance 10, iteration 1 startingdo_it instance 9, iteration 1: rnd set to 947

do_it instance 9, iteration 2 startingdo_it instance 10, iteration 1: Variable is not defined

do_it instance 10, iteration 1: rnd set to 947
do_it instance 10, iteration 2 starting
cpjen
  • [How to debug small programs](https://ericlippert.com/2014/03/05/how-to-debug-small-programs/) – wwii Jul 11 '20 at 18:26
  • `there appears to be cross-contamination between process instances, with multiple processes apparently setting the same random number.` - why do you think the processes are *communicating* with each other? I don't see any place where you provided for that. – wwii Jul 11 '20 at 18:28
  • `My aim is to end up with a list of 100 random numbers,` - It doesn't look like you tried to implement that. – wwii Jul 11 '20 at 18:32
  • It's these lines in the log (sorry, no idea why they don't format properly), where three processes apparently set the same random number: `do_it instance 1, iteration 1: rnd set to 807`, `do_it instance 2, iteration 1: rnd set to 807`, `do_it instance 3, iteration 1: rnd set to 807` – cpjen Jul 11 '20 at 18:48
  • My code is intended to spawn 10 processes, each of which will create 10 random numbers. It appears to be spawning the processes, and each process goes through the loop, but apparently only twice. – cpjen Jul 11 '20 at 18:54
  • Unless you seed the generator, the separate processes should not be producing **exactly the same** series of numbers, although occasional repeats/duplicates are possible. After fixing minor issues (a NameError, the indentation of the return statement, and accumulating the generated numbers before returning), the example you provided does not produce *identical* results from the separate processes. Cannot reproduce. – wwii Jul 13 '20 at 14:37
  • May I ask what version of Python you are using? I am on 3.8, and there appear to have been changes in this release relating to shared memory (some of which are a bit beyond me). https://docs.python.org/3/whatsnew/3.8.html#multiprocessing Is it possible that NumPy now defaults to using shared memory where it did not in previous Python versions? If so, is there any way I can switch this behaviour off so that each process uses its own memory space? – cpjen Jul 14 '20 at 14:25
  • The code that you provided previously, which has since been removed from this page, perfectly illustrates the problem when I run it: no duplicated lists of 10 numbers among the lists generated by Python's random function, but several duplicates among the NumPy-generated lists. – cpjen Jul 14 '20 at 14:35
  • I deleted that answer because it **does NOT** produce duplicate random lists from separate processes, whether using the NumPy or the Python random generators, so it neither solved nor replicated your problem. That is why it was deleted, and I voted to close because I could not replicate your problem. – wwii Jul 14 '20 at 15:50
  • It replicated precisely my problem when I ran it. There is a difference in our results. – cpjen Jul 14 '20 at 16:18
  • Running the same code here also produces dupes: https://www.onlinegdb.com/online_python_interpreter – cpjen Jul 14 '20 at 16:55
  • There are caveats in the documentation regarding running multiprocessing code in an interactive shell/interpreter. Make sure `main` is called in an `if __name__ == '__main__':` suite, then run it from a shell prompt: `python -m tmp` or `py -m tmp`. Look through search results for `python multiprocessing interactive site:stackoverflow.com` – wwii Jul 14 '20 at 18:45
  • I get the same when I run it from a shell prompt. – cpjen Jul 14 '20 at 21:38
  • [Same output in different workers in multiprocessing](https://stackoverflow.com/questions/12915177/same-output-in-different-workers-in-multiprocessing) – wwii Jul 15 '20 at 02:27
  • https://stackoverflow.com/questions/6914240/multiprocessing-pool-seems-to-work-in-windows-but-not-in-ubuntu – wwii Jul 15 '20 at 02:34
  • https://github.com/numpy/numpy/issues/9650 – wwii Jul 15 '20 at 02:41
  • Superb. Adding "np.random.seed()" has fixed it. Thanks for your help with this. – cpjen Jul 16 '20 at 15:47
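
For anyone landing here later, this is a minimal sketch of how the fixes discussed in the comments might fit together. It is not the asker's final code. It assumes the legacy `np.random` global generator used in the question, and the `iterations` parameter is a hypothetical addition for illustration. Three things change relative to the posted code: each call reseeds with `np.random.seed()` (the fix reported in the last comment), the numbers are accumulated and returned after the loop, and `main` runs under an `if __name__ == "__main__":` guard. The posted log stopping at "iteration 2 starting" is also consistent with the undefined name `random` raising a `NameError` that never surfaces because `results` is never iterated, though that is an inference from the code as shown.

```python
import concurrent.futures

import numpy as np


def do_it(n, iterations=10):
    """Return `iterations` random integers drawn by call number `n`.

    `iterations` is a hypothetical parameter added for this sketch.
    """
    # Reseed from fresh OS entropy so this call does not reuse generator
    # state inherited from the parent process.
    np.random.seed()
    numbers = []
    for _ in range(iterations):
        numbers.append(int(np.random.randint(1, 1000)))
    # Return after the loop; a return inside the loop would end the call
    # on the first iteration.
    return numbers


def main():
    with concurrent.futures.ProcessPoolExecutor() as executor:
        # executor.map returns a lazy iterator. Consuming it collects the
        # values and re-raises any exception from a worker.
        per_call = executor.map(do_it, range(1, 11))
        return [value for numbers in per_call for value in numbers]


if __name__ == "__main__":
    results = main()
    print(len(results))  # 10 calls x 10 numbers = 100
    print(results)
```

Individual values can still repeat by chance, since only 999 distinct values are possible; what should disappear is whole sequences being identical across processes.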
