
Is there a way I can write a Python script that emulates the use of GNU Screen and Bash? I was originally trying to write a simple Bash script, but I suspect that learning the multiprocessing module will give me a little bit of flexibility down the road, not to mention that Python modules are very well documented.

So, I have seen in the tutorials and documentation the use of a single function run in parallel, but I am a little bit lost on how to make use of this. Any reference would be extremely helpful.

Below is basically what I want:

If I have a bunch of experiments in different python files, then in Bash:

$ python experiment1.py &
$ python experiment2.py &
...

In Python, if I have a bunch of functions in the same script, would a main block like the one below emulate the above? (This is really just a guess, and I don't mean to offend anyone other than myself with my ignorance.)

import multiprocessing as mp

def experiment1():
    """run collection of simulations and collect relevant statistics"""
    ....
def experiment2():
    """run different collection of simulations and collect relevant statistics""" 
    ....

if __name__ == '__main__':
    one = mp.process(target = experiment1)
    two = mp.process(target = experiment2)
    ...
    one.start()
    two.start()
    ...
    one.join()
    two.join()

I am not sure how I would test this except maybe with Activity Monitor on OS X, which doesn't seem to tell me how the work is distributed across the cores, so suggestions on how to check this Pythonically, rather than just by runtime, would be helpful. This last question might be too general, but I thought I would throw it in. Thank you for your help!
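
For what it's worth, my best guess at a Pythonic check is something like the snippet below, using multiprocessing.cpu_count() and os.getpid() from the standard library to see how many CPUs there are and which process a piece of code is running in, but I don't know if that is the right approach:

import multiprocessing as mp, os

print 'CPUs available:', mp.cpu_count()          # number of cores Python can see
print 'this code is running in pid:', os.getpid()
# my thinking: a print like the one above inside experiment1/experiment2
# should show a different pid for each if they really run as separate processes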

Charlie
  • What exactly do you mean by "emulates `screen`"? If you really want to do what you're saying, that's going to be a whole lot of `curses` code… but there doesn't seem to be any `screen`-related stuff in the bash session, just basic job control. – abarnert Aug 23 '14 at 00:54
  • Also, when you say "I am not sure how I would test this"… test what, exactly? Since your functions don't do anything but return a value that you ignore, there's nothing to test. Write something that at least does something trivial, and it's easy to test. (However… have you tried running this? Because I'm pretty sure you're just going to get an exception on that `mp.process`, because there is no such name in that module…) – abarnert Aug 23 '14 at 00:56
  • The script you provided is similar to the two bash commands you included in that the functions `experiment1` and `experiment2` will run concurrently in two separate processes, the same way `experiment1.py` and `experiment2.py` will run concurrently as two separate processes. I'm not sure if that's all you're looking for in terms of using `multiprocessing` as a replacement or not. – dano Aug 23 '14 at 00:57
  • @abarnert I just mean run a lot of different functions asynchronously by detaching. I had this [question](http://stackoverflow.com/questions/24619330/multiprocessing-with-screen-and-bash) and GNU parallel looked fine, but the machines on our school system don't have it. The basic Python 2.7 distribution comes with `multiprocessing`. – Charlie Aug 23 '14 at 00:58
  • @Charlie On Linux, it's running a bunch of functions asynchronously by forking, but yes, `multiprocessing` does that. – dano Aug 23 '14 at 01:00
  • Thank you! I will carefully read the join() and start() documentation then and implement the above. – Charlie Aug 23 '14 at 01:07

1 Answer


The following program runs a bunch of scripts in parallel. For each one it prints a message when it starts and when it ends. If a script exits with an error, the error code and command line are printed, and the program continues.

It runs at most one script per CPU in the system at a time.

source

import multiprocessing as mp, subprocess

def run_script(script_name):
    curproc = mp.current_process()
    cmd = ['python', script_name]
    print curproc, 'start:', cmd
    try:
        return subprocess.check_output(
            cmd, shell=False)
    except subprocess.CalledProcessError as err:
        print '{} error: {}'.format(
            curproc, dict(
                status=err.returncode,
                command=cmd,
            )
        )
    finally:
        print curproc, "done"

scripts = ['zhello.py', 'blam']

pool = mp.Pool()   # default: num of CPUs
print pool.map(
    run_script, scripts,
)
pool.close()
pool.join()

output

python: can't open file 'blam': [Errno 2] No such file or directory
<Process(PoolWorker-2, started daemon)> start: ['python', 'blam']
<Process(PoolWorker-2, started daemon)> error: {'status': 2, 'command': ['python', 'blam']}
<Process(PoolWorker-2, started daemon)> done
<Process(PoolWorker-1, started daemon)> start: ['python', 'zhello.py']
<Process(PoolWorker-1, started daemon)> done
['howdy\n', None]
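
If you would rather keep the experiments as functions in a single script, as in your question, the same idea works with Process objects directly. This is only a rough sketch (note that the class is mp.Process with a capital P, and the os.getpid() prints are just there so you can see each function land in its own process):

import multiprocessing as mp, os

def experiment1():
    """run collection of simulations and collect relevant statistics"""
    print 'experiment1 in pid', os.getpid()

def experiment2():
    """run different collection of simulations and collect relevant statistics"""
    print 'experiment2 in pid', os.getpid()

if __name__ == '__main__':
    procs = [mp.Process(target=f) for f in (experiment1, experiment2)]
    for p in procs:
        p.start()   # launch each experiment in its own process
    for p in procs:
        p.join()    # wait for all of them to finish
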
johntellsall
  • Holy Moly! Thank you, shavenwarthog! Really appreciate this! Especially the printouts and Python's testing of the CPUs! Have a wonderful Sunday and I hope karma rewards you! – Charlie Aug 24 '14 at 16:36
  • Hey, Sir Shavenwarthog, I ran into this [problem](http://stackoverflow.com/questions/25924397/python-multiprocessing-and-serializing-data) and was wondering if you had any idea how to solve it. Admittedly, I don't use `pool` or `subprocess` and was wondering if you knew whether or not these implementations actually help solve this problem. I really don't understand it and am trying to desperately fix it. Thank you! – Charlie Oct 03 '14 at 02:00