I am starting a Python script called test.py from a main script called main.py. In test.py I track some machine learning metrics. When these metrics reach a certain threshold, I want to terminate the subprocess in which test.py was started.

Is there a way to achieve this in Python if I have started the script with:
proc = subprocess.Popen("python test.py", shell=True)

I haven't found anything in the documentation which would allow me to trigger this event on my own.

Reallu
  • `proc.terminate()`? It is not elegant: I would use `proc.communicate` so you have a graceful shutdown. – azelcer Jan 13 '22 at 14:33
  • I know that you can terminate it with `proc.terminate()` but how can I know **when** to do that? I am talking about triggering some event before the script has finished executing and when that event is triggered to forcefully terminate the process. – Reallu Jan 13 '22 at 14:53
  • I also want to be able to send that event from the script which was run inside of the `Popen` process. – Reallu Jan 13 '22 at 14:56
  • You would be better off using `multiprocessing`. `subprocess` buffering issues are a nightmare. – azelcer Jan 13 '22 at 15:59
  • `subprocess.run` and `p.communicate()` support a `timeout` keyword which lets you shut them down when enough time has passed. But running Python as a subprocess of itself is often an antipattern. – tripleee Jan 14 '22 at 07:48
  • As an aside, the `shell=True` is easy to avoid, and you probably should. See further [Actual meaning of `shell=True` in subprocess](https://stackoverflow.com/questions/3172470/actual-meaning-of-shell-true-in-subprocess) – tripleee Jan 14 '22 at 07:48
  • Can you modify `test.py`? Do you need a portable solution, or is Linux-only OK? – azelcer Jan 14 '22 at 15:57

1 Answer

UPDATED: The easiest way to do this is to pass the termination condition as a parameter to test.py, so that the script can stop itself once the condition is met.
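A minimal sketch of the parameter approach, assuming test.py can be modified. The `--threshold` flag and the `primes_up_to` helper are hypothetical names chosen for illustration; they are not part of the original scripts:

```python
import argparse


def primes_up_to(threshold):
    """Yield primes in increasing order, stopping after the first one above threshold."""
    primes = []
    candidate = 2
    while True:
        # candidate is prime if no smaller prime divides it
        if all(candidate % p for p in primes):
            primes.append(candidate)
            yield candidate
            if candidate > threshold:
                return  # termination condition reached: stop on our own
        candidate += 1


def main(argv=None):
    parser = argparse.ArgumentParser()
    # --threshold is a hypothetical flag name used only in this sketch
    parser.add_argument("--threshold", type=int, default=1000)
    args = parser.parse_args(argv)
    for prime in primes_up_to(args.threshold):
        print(prime, flush=True)
    return 0

# In test.py, call main() under the usual `if __name__ == "__main__":` guard.
```

The main script could then start it with something like `subprocess.run([sys.executable, "test.py", "--threshold", "1000"])` and simply wait for it to exit, with no need to terminate anything from outside.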

Otherwise, you can communicate through stdout and stdin: test.py prints its results and the main script reads them. If you want to preserve the output on screen and still use Popen, see the os.mkfifo version further down. As an example, consider a simple test.py that calculates (in a very inefficient way) some primes:

test.py

import time

primes = [2, 3]

if __name__ == "__main__":
    for p in primes:
        print(p, flush=True)

    i = 5
    while True:
        for p in primes:
            if i % p == 0:
                break
        if i % p:
            primes.append(i)
            print(i, flush=True)
        i += 2
        time.sleep(.005)

You can read the output and choose to terminate the process when you achieve the desired output. As an example, I want to get primes up to 1000.

import subprocess

# text=True already implies universal_newlines=True, so only one is needed.
# bufsize=1 (line buffering) only works in text mode.
proc = subprocess.Popen("python test.py",
                        stdout=subprocess.PIPE, stdin=subprocess.PIPE,
                        bufsize=1,
                        shell=True, text=True)
primes = []
while proc.poll() is None:
    line = proc.stdout.readline()
    if line:
        new_prime = int(line)
        primes.append(new_prime)
        if new_prime > 1000:
            print("Threshold achieved", line)
            proc.terminate()
        else:
            print("new prime:", new_prime)
print(primes)

Please note that, because of the delay in processing and communication, you might get one or two more primes than desired. To avoid that, you would need bi-directional communication, and test.py would be more complicated. If you want to see the output of test.py on screen, you can print it from the main script and then parse it to check whether the condition is fulfilled. Other options include os.mkfifo (Unix-like systems only, not very difficult), which provides an easy communication path between two processes:

os.mkfifo version

test.py

import time
import sys

primes = [2, 3]

if __name__ == "__main__":
    outfile = sys.stdout
    if len(sys.argv) > 1:
        try:
            outfile = open(sys.argv[1], "w")
        except OSError:
            print("Could not open file")  # outfile stays sys.stdout
    for p in primes:
        print(p, file=outfile, flush=True)
    i = 5
    while True:
        for p in primes:
            if i % p == 0:
                break
        if i % p:
            primes.append(i)
            print("This will be printed to screen:", i, flush=True)
            print(i, file=outfile, flush=True) # this will go to the main process
        i += 2
        time.sleep(.005)

main file

import subprocess
import os
import tempfile


tmpdir = tempfile.mkdtemp()
filename = os.path.join(tmpdir, 'fifo')  # Temporary filename
os.mkfifo(filename)  # Create FIFO
proc = subprocess.Popen(["python3", "test.py", filename], shell=False)
with open(filename, 'rt', 1) as fifo:
    primes = []
    while proc.poll() is None:
        line = fifo.readline()
        if line:
            new_prime = int(line)
            primes.append(new_prime)
            if new_prime > 1000:
                print("Threshold achieved", line)
                proc.terminate()
            else:
                print("new prime:", new_prime)
    print(primes)

os.remove(filename)
os.rmdir(tmpdir)
azelcer
  • Is it possible to do that without collecting the `stdout`? What if inside of that script I have some prints that should be displayed as the console output when I run `python main.py`? – Reallu Jan 14 '22 at 14:06
  • Well... you can use stderr. Is that OK for you? If you are bound to use `Popen` the options are limited. – azelcer Jan 14 '22 at 14:14
  • Tbh the only reason I used `Popen` is that it allows me to start a script from the `main.py` and pass arguments to the argument parser (which is inside that script which I wish to run) – Reallu Jan 14 '22 at 15:02
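The stderr suggestion from the comments above could look roughly like the sketch below. The child script is inlined as a string (`CHILD`) only to keep the example self-contained, and `run_until_threshold` is a hypothetical helper name. It assumes the child writes nothing but one integer metric per line to stderr; stdout is not captured, so the child's ordinary prints still appear on the console.

```python
import subprocess
import sys

# Stand-in for test.py: normal output on stdout, metrics on stderr.
CHILD = r"""
import sys, time
value = 0
while True:
    value += 1
    print("working on step", value, flush=True)   # shown on the console
    print(value, file=sys.stderr, flush=True)     # read by the parent
    time.sleep(0.005)
"""


def run_until_threshold(threshold):
    """Start the child and terminate it once a metric reaches threshold."""
    proc = subprocess.Popen([sys.executable, "-c", CHILD],
                            stderr=subprocess.PIPE, text=True)
    seen = []
    try:
        for line in proc.stderr:
            try:
                value = int(line)
            except ValueError:
                # Not a metric (for example a traceback): pass it through.
                print(line, end="", file=sys.stderr)
                continue
            seen.append(value)
            if value >= threshold:
                proc.terminate()
                break
    finally:
        proc.wait()
        proc.stderr.close()
    return seen
```

Calling `run_until_threshold(5)` returns the metrics read before termination. The trade-off is that genuine error messages from the child share the same pipe, which is why non-numeric lines are forwarded rather than parsed.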