
I have a shell script that uses GNU Parallel to run a function in parallel. Now I am rewriting the script in Python and I don't know how to do this correctly.

In the script, I have:

parallel --jobs 5 --linebuffer run {1} ::: "${files[@]}"

How can I convert this to Python code? In the shell script, `files` is an array of files, and `run` calls an external program that processes each file.

In Python, I have a method `def run(file)` that runs several Python commands to prepare data and, at the end, calls the external program with `os.system`.

def run(file):
    # do something with the input file
    os.system(...)
Martin Perry
  • @mkrieger1 not exactly. I don't want to use `Popen`. I have a `run` method in Python, and inside this method is the `os.system` call. Threading seems more reasonable, but I don't know whether to use it or multiprocessing. Also, I don't know how to pass the array of files. – Martin Perry Oct 09 '21 at 11:07
  • Why do you not want to use `Popen`? – mkrieger1 Oct 09 '21 at 11:17
  • How can I call Python method with Popen? – Martin Perry Oct 09 '21 at 11:20
  • Note that processes will be created anyway by the `os.system` call. Creating processes for calling internal Python methods using Popen is not great though, because it is often too low level. Using [processing pools (e.g. with map)](https://docs.python.org/3/library/multiprocessing.html) is often simpler and better. If you know that your processing is IO bound and written in Python, then threads are better (due to the GIL). – Jérôme Richard Oct 09 '21 at 11:55
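Since the last comment suggests threads for IO-bound work, here is a minimal sketch using `concurrent.futures.ThreadPoolExecutor` instead of `multiprocessing`. The file names are placeholders, and `echo` stands in for the real external program:

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def run(file):
    # prepare data here, then call the external program;
    # "echo" is a stand-in for the real tool
    result = subprocess.run(["echo", file], capture_output=True, text=True)
    return result.stdout.strip()

files = ["a.txt", "b.txt", "c.txt"]  # placeholder file list
with ThreadPoolExecutor(max_workers=5) as ex:
    # 5 concurrent workers, like --jobs 5; map preserves input order
    outputs = list(ex.map(run, files))
```

Because `run` mostly waits on a subprocess, the GIL is released while it blocks, so threads give real concurrency here without the pickling overhead of `multiprocessing`.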

1 Answer


I would use `multiprocessing`:

import os
import sys
from multiprocessing import Pool

def run(file):
    # do something with the input file
    os.system(...)

if __name__ == '__main__':
    with Pool(5) as p:
        p.map(run, sys.argv[1:])

Call it with:

python test.py "${files[@]}"
Philippe