Why does my python script get frequently terminated when using subprocess?

Question

I have this code. Basically, I am using subprocess to execute a program several times in a while loop. It works fine, but after several iterations (5, to be precise) my Python script just terminates, even though it still has a long way to go before finishing.

        while x < 50:

            # ///////////I am doing things here/////////////////////

            cmdline = 'gmx mdrun -ntomp 1 -v -deffnm my_sim'
            args = shlex.split(cmdline)
            proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
            output = proc.communicate()[0].decode()

            # ///////////I am doing things here/////////////////////
            x += 1

Each call to the program takes approximately one hour to finish. In the meantime, subprocess should wait, because which parts of my code I execute next depends on the output (that is why I am using .communicate()).

Why is this happening?

Thanks in advance for the help!

Does your script actually wait on anything after calling `Popen`? I think you need a call to `proc.wait()` or `proc.communicate()`, unless I am misunderstanding your goal. — 0x5453, Mar 11 '19 at 21:08
Could you check the system monitor for memory usage while this script is running? It looks like a memory overflow situation. — Paandittya, Mar 11 '19 at 21:10
For this particular case I need to know whether the output text contains the word 'error'. That is why I am using .decode(). Then I look for the string 'error'. If there is one I take some measurements, and if there is none I take other measurements. If I use proc.wait(), should it go just after proc = .....? — ananvodo, Mar 11 '19 at 21:13
Also, is there a big difference between using proc.stdout.read() and proc.communicate()? — ananvodo, Mar 11 '19 at 21:14
@Paandittya my problem still persists after using communicate(). How can I check if I have memory leaks? I am running my script on a CentOS 7.0 host machine — ananvodo, Mar 12 '19 at 04:24
@ananvodo Why do you think the process is killed by the OOM killer? That's the only way in which a memory leak could end up killing your process. Unfortunately without a minimal **reproducible** example we have no way to tell you what's going on with your code. I suggest you comment out all the code that is not needed to reproduce this problem, and then provide a minimal script which demonstrates it that we can run on our machines. You should try to replace the `gmx` subprocesses with other long running processes (e.g. `sleep 1m` or whatever) — Bakuriu, Mar 12 '19 at 20:11
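
A minimal sketch of the kind of reproducible script requested in the last comment. Everything here is an assumption for illustration: the stand-in command (a short Python child instead of `gmx mdrun`) and the helper names `run_once` and `run_loop` are hypothetical and not from the original post. Recording the exit code of each iteration may help tell apart a child that was killed from a problem in the parent script.

```python
import subprocess
import sys

# Hypothetical stand-in for `gmx mdrun ...`: a child Python process that
# prints one line and exits. Swap in something like ["sleep", "60"] to mimic a long run.
STAND_IN = [sys.executable, "-c", "print('step done')"]


def run_once(args):
    """Run one child process, wait for it, and return (exit code, merged output)."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    output = proc.communicate()[0].decode()
    return proc.returncode, output


def run_loop(n, args=STAND_IN):
    """Repeat the call n times and record the exit code of every iteration."""
    codes = []
    for i in range(n):
        code, output = run_once(args)
        # On POSIX, a negative code -N means the child was terminated by signal N.
        print(f"iteration {i}: exit code {code}, {len(output)} characters of output")
        codes.append(code)
    return codes


if __name__ == "__main__":
    run_loop(3)
```

If the loop survives with the stand-in but not with the real program, the amount of output or the resources used by the real program become the more likely suspects.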

Kristopher Ives · Answer 1 · 2019-03-11T21:17:27.117

1

A subprocess runs asynchronously in the background (since it's a different process), and you need to call wait() on the Popen object to wait for it to finish. Since you have multiple subprocesses you'll likely want to wait on all of them, like this:

    exit_codes = [p.wait() for p in (p1, p2)]

edited Mar 11 '19 at 21:17

answered Mar 11 '19 at 21:07

Kristopher Ives


Are you suggesting I put that line at the end of my while loop? – ananvodo Mar 11 '19 at 21:09
Thanks Kristopher. Could you please tell me where to put it? – ananvodo Mar 11 '19 at 21:15
You probably want to keep all your subprocess objects in an array and then `wait()` on them outside the loop – Kristopher Ives Mar 11 '19 at 21:17
To be clear, I must wait for the subprocess to finish EVERY time before I go on to the next iteration. Can I just put proc.wait() right after proc = ........? – ananvodo Mar 11 '19 at 21:21
Yes in that case you can just use `proc.wait()` inside the loop – Kristopher Ives Mar 11 '19 at 21:24
thanks! Additional question. If I change proc.stdout.read() to proc.communicate(), do I still need to use proc.wait()? – ananvodo Mar 11 '19 at 21:28
`communicate()` mostly replaces the need for `read()` and `wait()` – Kristopher Ives Mar 11 '19 at 21:40
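
To illustrate the last comment, here is a small sketch showing that `communicate()` both reads the output and waits for the child, so `returncode` is already set when it returns. The child command and the helper name `run_and_wait` are hypothetical stand-ins, assuming a short Python child behaves like the real program in this respect.

```python
import subprocess
import sys

# Hypothetical child: sleeps briefly, then prints, standing in for a slow program.
CHILD = [sys.executable, "-c", "import time; time.sleep(0.2); print('finished')"]


def run_and_wait(args):
    """Start the child, collect its merged output, and return (exit code, text)."""
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # communicate() reads until end-of-file and then waits for the process to exit.
    out, _ = proc.communicate()
    # returncode is set at this point, so no separate proc.wait() call is needed.
    return proc.returncode, out.decode()


if __name__ == "__main__":
    print(run_and_wait(CHILD))
```

Inside a loop, this means the next iteration cannot start until the current child has exited.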

score 0 · Answer 2 · answered Mar 12 '19 at 20:09

To solve this problem I suggest doing the following:

    while x < 50:

        # ///////////I am doing things here/////////////////////

        cmdline = 'gmx mdrun -ntomp 1 -v -deffnm my_sim 2>&1 | tee output.txt'
        # The pipe and the redirection are shell features, so pass the string itself with shell=True
        subprocess.check_output(cmdline, shell=True)
        with open('output.txt', 'r') as fin:
            out_file = fin.read()

        # ///////////Do whatever you need with the out_file/////////////


        # ///////////I am doing things here/////////////////////
        x += 1

I know it is not recommended to use shell=True, so if you do not want to use it, pass the command as a list of arguments separated by commas instead. Be aware that you might get an error when passing it that way. I do not want to go into details, but in that case you can use shell=True and your problem will be gone.

With the code above, your Python script will not terminate abruptly when calling subprocess many times with programs that produce a lot of stdout and stderr messages.

It took some time to discover this, and I hope it helps someone out there.

ananvodo


I highly doubt your answer provides any help. For one, the OP is correctly using `shlex.split` to launch `Popen`. Using `shell=True` will **not** fix anything, it will just open a security hole and add some overhead to the subprocesses. Also, the OP is already using `communicate`, which blocks and waits for the subprocess. – Bakuriu Mar 12 '19 at 20:11
It depends. I am telling you that with shlex.split the command line cmdline did not work. It only worked when I used shell=True. I am using a CentOS 7.0 host machine (superPC). What you say is valid, communicate() should work, but it did not for me. The only way I was able to solve my problem was the script provided – ananvodo Mar 12 '19 at 20:15
How much output does the subprocess produce? How much RAM does your computer have available? As mentioned [in the documentation](https://docs.python.org/3/library/subprocess.html#subprocess.Popen.communicate) `communicate` will buffer all output in memory which might be an issue. Also, I repeat, you do *not* need `shell=True`. What you want to do is open the file and pass it to `stdout`: `with open('output.txt', 'w') as out: Popen(..., stdout=out)`. – Bakuriu Mar 12 '19 at 20:21
Command line: The error I have when using shlex is: `gmx grompp -f em_ini.mdp -c cnc1_ions.gro -p topol.top -o cnc1_em.tpr -maxwarn 2 2>&1 | tee outGromppEmInitial.txt Program: gmx grompp, version 2018.3 Source file: src/gromacs/commandline/cmdlineparser.cpp (line 276) Function: void gmx::CommandLineParser::parse(int *, char **) Error in user input: Invalid command-line options In command-line option -maxwarn Invalid values: '2>&1', '|', 'tee', 'outGromppEmInitial.txt'; expected an integer` Using shell=True I do not have that problem. – ananvodo Mar 12 '19 at 20:38
@Bakuriu For your suggestion above, about using : stdout: with open('output.txt', 'w') as out: Popen(..., stdout=out), could you please provide the example as a formal answer so I can get the way you are telling me it should be done? Thanks – ananvodo Mar 12 '19 at 20:43
I thought it was obvious: you are using a pipeline and `tee` to redirect the output to a file. My suggestion is to do this redirection using the `stdout` parameter instead, so you have to remove any shell features from your commandline. [here's an example using `stdout`](https://stackoverflow.com/a/6482200/510937). The equivalent of `2>&1` (i.e. redirecting stderr to stdout) is to use `stderr=subprocess.STDOUT`. – Bakuriu Mar 12 '19 at 20:48
@Bakuriu If I do as you say, I think I will end up using the same command lines and subprocess lines that I posted in my original question. I do not know why, but every single time I try to use Popen and communicate() the script gets killed. It seems that stdout and stderr are too big to handle. – ananvodo Mar 12 '19 at 21:04
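
A sketch of the approach suggested in the comments above: drop the `2>&1 | tee output.txt` part from the command line, pass an open file as `stdout`, and use `stderr=subprocess.STDOUT` in place of `2>&1`. The helper names `run_to_file` and `log_mentions_error`, the demo command, and the case-insensitive match on 'error' are assumptions for illustration, not something from the thread. Because the output goes straight to the file, nothing is buffered in the parent's memory, and the file can be scanned line by line afterwards.

```python
import os
import shlex
import subprocess
import sys
import tempfile


def run_to_file(cmdline, log_path):
    """Run cmdline without a shell, sending stdout and stderr to log_path."""
    args = shlex.split(cmdline)
    with open(log_path, "w") as out:
        # stderr=subprocess.STDOUT plays the role of the shell's 2>&1.
        proc = subprocess.Popen(args, stdout=out, stderr=subprocess.STDOUT)
        # wait() blocks until the child exits and returns its exit code.
        return proc.wait()


def log_mentions_error(log_path):
    """Scan the log line by line so the whole file never sits in memory."""
    with open(log_path, "r", errors="replace") as fin:
        # Case-insensitive match is an assumption; adjust to the real messages.
        return any("error" in line.lower() for line in fin)


if __name__ == "__main__":
    # Demo with a hypothetical stand-in command instead of `gmx mdrun ...`.
    demo_cmd = shlex.quote(sys.executable) + " -c " + shlex.quote("print('no problems')")
    demo_log = os.path.join(tempfile.mkdtemp(), "output.txt")
    print(run_to_file(demo_cmd, demo_log), log_mentions_error(demo_log))
```

In the loop from the question, the call would be something like `run_to_file('gmx mdrun -ntomp 1 -v -deffnm my_sim', 'output.txt')`, followed by a branch on `log_mentions_error('output.txt')`.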
