I have the following snippet from a main.py script:
------------------------------------------------------------------------------------------------
a bunch of code above that has a write-to-disk step which is meant to run only once
------------------------------------------------------------------------------------------------
import concurrent.futures
from multiprocessing import get_context

import util

context = get_context('spawn')
someInitData1 = context.Value('i', -1)
someInitData2 = context.Value('i', 0)

# Only util.multi_process_job should run in the worker processes;
# util.init_func just hands each worker the shared Values.
with concurrent.futures.ProcessPoolExecutor(
        max_workers=4,
        mp_context=context,
        initializer=util.init_func,
        initargs=(someInitData1, someInitData2),
) as executor:
    multiProcessResults = list(executor.map(util.multi_process_job,
                                            someArguments1,
                                            someArguments2))
I intend for only util.multi_process_job to be parallelized with multiprocessing. However, with this snippet, all of the code in my main.py is re-executed from the beginning, in parallel, by each worker process.
What is strange to me is that this same snippet works fine for my needs when I run it in a Jupyter notebook: only the specified function runs in the workers. The problem only appears when I convert the .ipynb file to a .py file and run it as a regular Python script on a Linux machine.
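For reference, here is a stripped-down, self-contained sketch of the structure I am describing. The init_func and multi_process_job below are trivial stand-ins for my real util helpers, and the write-to-disk step is reduced to a print, so treat it as a rough reproduction rather than my actual code:

import concurrent.futures
from multiprocessing import get_context

# Stand-in for the write-to-disk step that is only meant to run once.
print("top-level setup running")

_val1 = None
_val2 = None

def init_func(v1, v2):
    # Stand-in for util.init_func: store the shared Values in worker globals.
    global _val1, _val2
    _val1, _val2 = v1, v2

def multi_process_job(a, b):
    # Stand-in for util.multi_process_job.
    return a + b + _val1.value + _val2.value

context = get_context('spawn')
someInitData1 = context.Value('i', -1)
someInitData2 = context.Value('i', 0)

with concurrent.futures.ProcessPoolExecutor(
        max_workers=4,
        mp_context=context,
        initializer=init_func,
        initargs=(someInitData1, someInitData2),
) as executor:
    results = list(executor.map(multi_process_job, range(4), range(4)))

print(results)

This mirrors what I see with the real script: when run as a plain .py file, the top-level section (the part standing in for the write-to-disk step) runs again in the worker processes instead of only once.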