289

I am trying my very first formal python program using Threading and Multiprocessing on a windows machine. I am unable to launch the processes though, with python giving the following message. The thing is, I am not launching my threads in the main module. The threads are handled in a separate module inside a class.

EDIT: By the way this code runs fine on ubuntu. Not quite on windows

RuntimeError: 
            Attempt to start a new process before the current process
            has finished its bootstrapping phase.
            This probably means that you are on Windows and you have
            forgotten to use the proper idiom in the main module:
                if __name__ == '__main__':
                    freeze_support()
                    ...
            The "freeze_support()" line can be omitted if the program
            is not going to be frozen to produce a Windows executable.

My original code is pretty long, but I was able to reproduce the error in an abridged version of the code. It is split in two files, the first is the main module and does very little other than import the module which handles processes/threads and calls a method. The second module is where the meat of the code is.


testMain.py:

import parallelTestModule

extractor = parallelTestModule.ParallelExtractor()
extractor.runInParallel(numProcesses=2, numThreads=4)

parallelTestModule.py:

import multiprocessing
from multiprocessing import Process
import threading

class ThreadRunner(threading.Thread):
    """ This class represents a single instance of a running thread"""
    def __init__(self, name):
        threading.Thread.__init__(self)
        self.name = name
    def run(self):
        print self.name,'\n'

class ProcessRunner:
    """ This class represents a single instance of a running process """
    def runp(self, pid, numThreads):
        mythreads = []
        for tid in range(numThreads):
            name = "Proc-"+str(pid)+"-Thread-"+str(tid)
            th = ThreadRunner(name)
            mythreads.append(th) 
        for i in mythreads:
            i.start()
        for i in mythreads:
            i.join()

class ParallelExtractor:    
    def runInParallel(self, numProcesses, numThreads):
        myprocs = []
        prunner = ProcessRunner()
        for pid in range(numProcesses):
            pr = Process(target=prunner.runp, args=(pid, numThreads)) 
            myprocs.append(pr) 
#        if __name__ == 'parallelTestModule':    #This didnt work
#        if __name__ == '__main__':              #This obviously doesnt work
#        multiprocessing.freeze_support()        #added after seeing error to no avail
        for i in myprocs:
            i.start()

        for i in myprocs:
            i.join()
NG Algo
  • 3,570
  • 2
  • 18
  • 27

10 Answers10

414

On Windows the subprocesses will import (i.e. execute) the main module at start. You need to insert an if __name__ == '__main__': guard in the main module to avoid creating subprocesses recursively.

Modified testMain.py:

import parallelTestModule

if __name__ == '__main__':    
    extractor = parallelTestModule.ParallelExtractor()
    extractor.runInParallel(numProcesses=2, numThreads=4)
Janne Karila
  • 24,266
  • 6
  • 53
  • 94
  • 12
    (smacks his palm against his forehead) Doh! It works!!!! Thank you so much! I was missing the fact that it is the original main module that gets re-imported! All this time I was trying the "__name__ ==" check right before where I launched my processes. – NG Algo Aug 13 '13 at 09:17
  • 1
    I cannot seem to import 'parallelTestModule'. I'm using Python 2.7. Should it work out of the box? – Jonny Jan 28 '16 at 14:42
  • 2
    @Jonny The code for parallelTestModule.py is part of the question. – Janne Karila Jan 29 '16 at 06:44
  • Thank you. I ran into the "creating subprocesses recursively" problem. Fairly uncomfortable to watch a box go down in flames (understatement) – Doo Dah Feb 11 '16 at 14:42
  • i cant find how to install this package it says unavailable – DeshDeep Singh Jul 06 '18 at 16:09
  • 1
    @DeshDeepSingh The code snippet is not a stand-alone example; it is a modification of OP's code – Janne Karila Jul 20 '18 at 12:56
  • 1
    @DeshDeepSingh That module is part of the question. – Janne Karila Jul 20 '18 at 13:05
  • @JanneKarila, I have a question. I want to create a module to implement multiprocessing operations such as reading and writting files. If I create a class, could I implement the "if __name__ == '__main__' " guard on the "__init__" ? This way when I instantiate an object, it would run the guard and everything under it Or I have to put the guard on the main script and run everything under it? – Ipa Sep 04 '20 at 10:16
  • @Ipa You could post a question on that; add a code example and explain what you are really trying to accomplish. – Janne Karila Sep 09 '20 at 05:37
  • @JanneKarila, sure I didn't want to duplicate the question. But I will explain my specific case on a different question. – Ipa Sep 09 '20 at 05:45
  • Thank you. This error of not using multiprocessing inside `if __name__ == "__main__":` came as a surprise to me. I was unaware of the multiprocessing requirement though I always used multiprocessing functions inside the `if` clause. This was the first time I used the code outside of the `if` clause and it gave the error `RuntimeError: Attempt to start a new process before the current process has finished its bootstrapping phase` – Gary Nov 18 '21 at 04:31
  • This work, however I did not need this before and now I do, ideally a solution which does not require me to change previously working code would have been preferable – Greg7000 Nov 09 '22 at 15:13
49

Try putting your code inside a main function in testMain.py

import parallelTestModule

if __name__ ==  '__main__':
  extractor = parallelTestModule.ParallelExtractor()
  extractor.runInParallel(numProcesses=2, numThreads=4)

See the docs:

"For an explanation of why (on Windows) the if __name__ == '__main__' 
part is necessary, see Programming guidelines."

which say

"Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such a starting a new process)."

... by using if __name__ == '__main__'

B8vrede
  • 4,432
  • 3
  • 27
  • 40
doctorlove
  • 18,872
  • 2
  • 46
  • 62
17

Though the earlier answers are correct, there's a small complication it would help to remark on.

In case your main module imports another module in which global variables or class member variables are defined and initialized to (or using) some new objects, you may have to condition that import in the same way:

if __name__ ==  '__main__':
  import my_module
linusg
  • 6,289
  • 4
  • 28
  • 78
Ofer
  • 423
  • 6
  • 11
16

hello here is my structure for multi process

from multiprocessing import Process
import time


start = time.perf_counter()


def do_something(time_for_sleep):
    print(f'Sleeping {time_for_sleep} second...')
    time.sleep(time_for_sleep)
    print('Done Sleeping...')



p1 = Process(target=do_something, args=[1])
p2 = Process(target=do_something, args=[2])


if __name__ == '__main__':
    p1.start()
    p2.start()

    p1.join()
    p2.join()

    finish = time.perf_counter()
    print(f'Finished in {round(finish-start,2 )} second(s)')

you don't have to put imports in the if __name__ == '__main__':, just running the program you wish to running inside

Community
  • 1
  • 1
Sway Wu
  • 379
  • 3
  • 8
  • I do similar. However, I notice that doing this results in the process running a bit slower (in windows). In particular in `time_for_sleep`=1 second, the process takes 2 seconds whereas on linux it takes 1.02 seconds. For `time_for_sleep`=10, it takes 11 seconds. So wondows has ~1 second overhead. – D.L Dec 31 '21 at 10:52
  • ```if __name__ == '__main__': with Pool(3) as p: p.map(task, items)``` works for me! – Alex Zubkov Jul 09 '23 at 17:42
13

As @Ofer said, when you are using another libraries or modules, you should import all of them inside the if __name__ == '__main__':

So, in my case, ended like this:

if __name__ == '__main__':       
    import librosa
    import os
    import pandas as pd
    run_my_program()
Luis Abdi
  • 139
  • 2
  • 6
1

In my case it was a simple bug in the code, using a variable before it was created. Worth checking that out before trying the above solutions. Why I got this particular error message, Lord knows.

arame3333
  • 9,887
  • 26
  • 122
  • 205
1

In yolo v5 with python 3.8.5

if __name__ == '__main__':
    from yolov5 import train
    train.run()
Biman Pal
  • 391
  • 4
  • 9
0

The below solution should work for both python multiprocessing and pytorch multiprocessing.

As other answers mentioned that the fix is to have if __name__ == '__main__': but I faced several issues in identifying where to start because I am using several scripts and modules. When I can call my first function inside main then everything before it started to create multiple processes (not sure why).

Putting it at the very first line (even before the import) worked. Only calling the first function return timeout error. The below is the first file of my code and multiprocessing is used after calling several functions but putting main in the first seems to be the only fix here.

if __name__ == '__main__':
    from mjrl.utils.gym_env import GymEnv
    from mjrl.policies.gaussian_mlp import MLP
    from mjrl.baselines.quadratic_baseline import QuadraticBaseline
    from mjrl.baselines.mlp_baseline import MLPBaseline
    from mjrl.algos.npg_cg import NPG
    from mjrl.algos.dapg import DAPG
    from mjrl.algos.behavior_cloning import BC
    from mjrl.utils.train_agent import train_agent
    from mjrl.samplers.core import sample_paths
    import os
    import json
    import mjrl.envs
    import mj_envs
    import time as timer
    import pickle
    import argparse

    import numpy as np 

    # ===============================================================================
    # Get command line arguments
    # ===============================================================================

    parser = argparse.ArgumentParser(description='Policy gradient algorithms with demonstration data.')
    parser.add_argument('--output', type=str, required=True, help='location to store results')
    parser.add_argument('--config', type=str, required=True, help='path to config file with exp params')
    args = parser.parse_args()
    JOB_DIR = args.output
    if not os.path.exists(JOB_DIR):
        os.mkdir(JOB_DIR)
    with open(args.config, 'r') as f:
        job_data = eval(f.read())
    assert 'algorithm' in job_data.keys()
    assert any([job_data['algorithm'] == a for a in ['NPG', 'BCRL', 'DAPG']])
    job_data['lam_0'] = 0.0 if 'lam_0' not in job_data.keys() else job_data['lam_0']
    job_data['lam_1'] = 0.0 if 'lam_1' not in job_data.keys() else job_data['lam_1']
    EXP_FILE = JOB_DIR + '/job_config.json'
    with open(EXP_FILE, 'w') as f:
        json.dump(job_data, f, indent=4)

    # ===============================================================================
    # Train Loop
    # ===============================================================================

    e = GymEnv(job_data['env'])
    policy = MLP(e.spec, hidden_sizes=job_data['policy_size'], seed=job_data['seed'])
    baseline = MLPBaseline(e.spec, reg_coef=1e-3, batch_size=job_data['vf_batch_size'],
                           epochs=job_data['vf_epochs'], learn_rate=job_data['vf_learn_rate'])

    # Get demonstration data if necessary and behavior clone
    if job_data['algorithm'] != 'NPG':
        print("========================================")
        print("Collecting expert demonstrations")
        print("========================================")
        demo_paths = pickle.load(open(job_data['demo_file'], 'rb'))

        ########################################################################################
        demo_paths = demo_paths[0:3]
        print (job_data['demo_file'], len(demo_paths))
        for d in range(len(demo_paths)):
            feats = demo_paths[d]['features']
            feats = np.vstack(feats)
            demo_paths[d]['observations'] = feats

        ########################################################################################

        bc_agent = BC(demo_paths, policy=policy, epochs=job_data['bc_epochs'], batch_size=job_data['bc_batch_size'],
                      lr=job_data['bc_learn_rate'], loss_type='MSE', set_transforms=False)

        in_shift, in_scale, out_shift, out_scale = bc_agent.compute_transformations()
        bc_agent.set_transformations(in_shift, in_scale, out_shift, out_scale)
        bc_agent.set_variance_with_data(out_scale)

        ts = timer.time()
        print("========================================")
        print("Running BC with expert demonstrations")
        print("========================================")
        bc_agent.train()
        print("========================================")
        print("BC training complete !!!")
        print("time taken = %f" % (timer.time() - ts))
        print("========================================")

        # if job_data['eval_rollouts'] >= 1:
        #     score = e.evaluate_policy(policy, num_episodes=job_data['eval_rollouts'], mean_action=True)
        #     print("Score with behavior cloning = %f" % score[0][0])

    if job_data['algorithm'] != 'DAPG':
        # We throw away the demo data when training from scratch or fine-tuning with RL without explicit augmentation
        demo_paths = None

    # ===============================================================================
    # RL Loop
    # ===============================================================================

    rl_agent = DAPG(e, policy, baseline, demo_paths,
                    normalized_step_size=job_data['rl_step_size'],
                    lam_0=job_data['lam_0'], lam_1=job_data['lam_1'],
                    seed=job_data['seed'], save_logs=True
                    )

    print("========================================")
    print("Starting reinforcement learning phase")
    print("========================================")


    ts = timer.time()
    train_agent(job_name=JOB_DIR,
                agent=rl_agent,
                seed=job_data['seed'],
                niter=job_data['rl_num_iter'],
                gamma=job_data['rl_gamma'],
                gae_lambda=job_data['rl_gae'],
                num_cpu=job_data['num_cpu'],
                sample_mode='trajectories',
                num_traj=job_data['rl_num_traj'],
                num_samples= job_data['rl_num_samples'],
                save_freq=job_data['save_freq'],
                evaluation_rollouts=job_data['eval_rollouts'])
    print("time taken = %f" % (timer.time()-ts))
devil in the detail
  • 2,905
  • 17
  • 15
0

I ran into the same problem. @ofter method is correct because there are some details to pay attention to. The following is the successful debugging code I modified for your reference:


if __name__ == '__main__':
    import matplotlib.pyplot as plt
    import numpy as np
    def imgshow(img):
        img = img / 2 + 0.5
        np_img = img.numpy()
        plt.imshow(np.transpose(np_img, (1, 2, 0)))
        plt.show()

    dataiter = iter(train_loader)
    images, labels = dataiter.next()

    imgshow(torchvision.utils.make_grid(images))
    print(' '.join('%5s' % classes[labels[i]] for i in range(4)))

For the record, I don't have a subroutine, I just have a main program, but I have the same problem as you. This demonstrates that when importing a Python library file in the middle of a program segment, we should add:

if __name__ == '__main__':
Leon Brant
  • 11
  • 3
0

I tried the tricks mentioned above on the following very simple code. but I still cannot stop it from resetting on any of my Window machines with Python 3.8/3.10. I would very much appreciate it if you could tell me where I am wrong.

print('script reset')

def do_something(inp):
    print('Done!')

if __name__ == '__main__':
    from multiprocessing import Process, get_start_method
    print('main reset')
    print(get_start_method())
    Process(target=do_something, args=[1]).start()
    print('Finished')

output displays:

script reset
main reset
spawn
Finished
script reset
Done!

Update:

As far as I understand, you guys are not preventing either the script containing the __main__ or the .start() from resetting (which doesn't happen in Linux), rather you are suggesting workarounds so that we don't see the reset. One has to make all imports minimal and put them in each function separately, but it is still, relative to Linux, slow.

Alijoonami
  • 21
  • 3