I need to execute a CLI binary with arguments, keep the process alive, and send it multiple commands over the course of a Python script. I am using subprocess.Popen() in the following way:

    from subprocess import Popen, PIPE

    cmd = ["/full/path/to/binary", "--arg1"]
    process = Popen(cmd, stdin=PIPE, stdout=None)
    process.stdin.write("command-for-the-CLI-tool".encode())
    process.stdin.flush()

However, no matter how I call Popen(), the returned process object is None.

If I run process = Popen(cmd), without specifying stdin and stdout, I can see the process running correctly in the output console, meaning that the binary path and args are correct, but the process object is still None, meaning that I cannot issue other commands afterwards.

EDIT: The point of this is that I want to execute the following:

    command = (
        f"cat << EOF | {cmd}\n"
        f"use {dbname};\n"
        "set optimizer_switch='hypergraph_optimizer=on';\n"
        f"SET forced_plan='{forced_plan}';\n"
        f"{query_text}\n"
        "EOF"
    )
    runtimes = []
    for _ in trange(runs):
        start = time.time()
        subprocess.run(command, shell=True, stdout=sys.stdout)
        runtimes.append(time.time() - start)

But this clearly measures the time of all the commands, whereas I am only interested in measuring the "query_text" command. This is why I am looking for a solution where I can send the commands separately and time only the one I am interested in. If I use multiple subprocess.run(), then the process instances will be different. I want the instance to be the same because the query depends on the previous commands.

  • You can specify subprocess.PIPE as the value for stdout to capture the output of the process. Also, it's better to use communicate() method of Popen class to write and read from stdin and stdout of the process. – Gihan Jan 26 '23 at 11:18
  • I do not care about capturing the output, I just need to measure the runtime of commands issued inside the CLI. If the process object is None, I cannot call communicate() either. – enrico_steez Jan 26 '23 at 11:22
  • `process = Popen(cmd, stdin=PIPE, stdout=PIPE)` also returns None – enrico_steez Jan 26 '23 at 11:23
  • I'm not really sure, but maybe you can try to use the subprocess.run() function instead of subprocess.Popen()? – Gihan Jan 26 '23 at 11:27
  • Another issue with communicate() is that it is not possible to communicate more than once, since after the call, the process' stdin will be closed, as explained [here](https://stackoverflow.com/questions/28616018/multiple-inputs-and-outputs-in-python-subprocess-communicate) – enrico_steez Jan 26 '23 at 11:28
  • Strange. `Popen` is a class, it can raise, but it cannot return None. Is there any other `subprocess` module? – VPfB Jan 26 '23 at 11:35
  • What do you mean by "the returned process object is None"? What code do you run, and what error (traceback) do you get specifically? Also, as far as I am aware, terminal tty mode is different from just feeding commands through stdin, so you cannot really pass each command separately as you think you can; you should probably use the `time` command instead. – Ahmed AEK Jan 26 '23 at 13:15
  • Also, I think you are trying to reinvent the wheel: there are already profiling tools for SQL databases, and there are ways to time each command in the script (https://unix.stackexchange.com/a/52347), but you aren't going to get anything useful from that. – Ahmed AEK Jan 26 '23 at 13:35

2 Answers

1

With subprocess.run you can pass the entire input as ... input.

    command = f"""\
use {dbname};
set optimizer_switch='hypergraph_optimizer=on';
SET forced_plan='{forced_plan}';
{query_text}
"""
    runtimes = []
    for _ in trange(runs):
        start = time.time()
        subprocess.run([cmd], text=True, input=command, stdout=sys.stdout)
        runtimes.append(time.time() - start)

I took out shell=True; see also "Actual meaning of shell=True in subprocess" and "Running Bash commands in Python", which elaborates on several of the changes here.

tripleee
  • This times the first 3 commands, whereas I only want to time the query itself. – enrico_steez Jan 26 '23 at 14:09
  • Uh, I don't understand what you mean by "the query itself". Python has no insight into the internals of `cmd`; if it does something more than run the commands you pass as `input`, you need some way to instrument it or retrieve debugging output (which probably affects timings significantly, too) to isolate the part which is interesting to you. – tripleee Jan 26 '23 at 14:14
  • So if you see the command contains multiple instructions, the last of which being "query_text", which is what I want to time. Timing the call of subprocess.run() would include the first 3 lines of the command, which is what I want to avoid. The use of subprocess.run() I wrote in the EDIT: is running just fine for now. The goal is to time only the last line of the command – enrico_steez Jan 26 '23 at 14:22
  • The simple solution would be to run the commands without the query a few times and estimate a reasonable number to subtract from the total. I'm not saying it can't be done with bare `Popen`, just that the complexities of getting it exactly right might not be worth the effort. Do you really expect these simple static commands to take significant time to execute? – tripleee Jan 26 '23 at 17:25
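The subtraction idea from the last comment could be sketched roughly as follows. This is only an illustration under assumptions: the real CLI binary is replaced by a hypothetical stand-in child (a Python one-liner that sleeps on any line containing "slow"), and `time_run`, `setup` and `query` are made-up names, not part of the original code.

```python
import statistics
import subprocess
import sys
import time


def time_run(cmd, script, runs=3):
    """Return wall-clock times for feeding `script` to `cmd` on stdin."""
    times = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(cmd, input=script, text=True,
                       stdout=subprocess.DEVNULL, check=True)
        times.append(time.perf_counter() - start)
    return times


# Hypothetical stand-in for the CLI binary: sleeps 0.3 s per "slow" line.
child = (
    "import sys, time\n"
    "for line in sys.stdin:\n"
    "    if 'slow' in line:\n"
    "        time.sleep(0.3)\n"
)
cmd = [sys.executable, "-c", child]

setup = "use db;\nset x=1;\n"   # placeholder setup statements
query = "slow query;\n"         # placeholder for the statement of interest

# Time the setup alone, then setup plus query, and subtract the medians.
baseline = statistics.median(time_run(cmd, setup))
total = statistics.median(time_run(cmd, setup + query))
query_estimate = total - baseline
print(f"estimated query time: {query_estimate:.3f} s")
```

The estimate still contains noise from process startup, so it is only meaningful when the query takes clearly longer than the run-to-run variation of the baseline.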
-1

Try using subprocess.run() instead of subprocess.Popen().

If you still use subprocess.Popen(), you can check on the process with its .poll() method.

Note that it is .poll(), not subprocess.Popen() itself, that returns None while the command has not yet completed, or the exit code once the command has finished.
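A small check of what `Popen` and `.poll()` actually give back may help here. It is a sketch that assumes a stand-in child process (a Python one-liner that sleeps briefly) in place of the real binary.

```python
import subprocess
import sys

# Stand-in child (assumption): a short-lived Python process that just sleeps.
cmd = [sys.executable, "-c", "import time; time.sleep(0.3)"]

process = subprocess.Popen(cmd)       # returns a Popen object right away
is_popen = isinstance(process, subprocess.Popen)

first_poll = process.poll()           # None while the child is still running
exit_code = process.wait()            # blocks until the child exits
last_poll = process.poll()            # now the exit code

print(is_popen, first_poll, exit_code, last_poll)
```

If `process` really is `None` in your script, the name `Popen` is probably bound to something other than `subprocess.Popen`, which is worth checking with `print(Popen)`.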

  • Can I run multiple commands to the same subprocess afterwards? I started using subprocess but I realised that if I want to measure the runtime of a command, I think I have to "squeeze" the setup commands with the actual command that I want to type and therefore I cannot time just the command itself but I need to time everything – enrico_steez Jan 26 '23 at 12:44
  • To measure the time of a command, you can add `time` for example "time date" – Abra_Kadabra Jan 26 '23 at 12:47
  • `Popen` does not return `None`; `Popen` is a class, and calling it produces an instance of that class. The instance represents a process that runs *in parallel* with the current process. – chepner Jan 26 '23 at 14:07
  • You may be thinking of `subprocess.check_call`, which blocks, returns `None` on success and raises a `CalledProcessError` otherwise. – chepner Jan 26 '23 at 14:08
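Since the `Popen` instance stays alive, sending several commands to the same process and timing only one of them might look like the sketch below. It rests on assumptions that may not hold for the real binary: the child here is a hypothetical stand-in that reads one command per line and prints and flushes exactly one reply line per command, and `send` is a made-up helper name.

```python
import subprocess
import sys
import time

# Hypothetical stand-in for the CLI binary: replies "ok" to every input line,
# sleeping 0.3 s first when the line contains "slow".
child = (
    "import sys, time\n"
    "for line in sys.stdin:\n"
    "    if 'slow' in line:\n"
    "        time.sleep(0.3)\n"
    "    print('ok', flush=True)\n"
)
process = subprocess.Popen(
    [sys.executable, "-c", child],
    stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True, bufsize=1,
)


def send(command):
    """Send one command, wait for its reply line, return elapsed seconds."""
    start = time.perf_counter()
    process.stdin.write(command + "\n")
    process.stdin.flush()
    process.stdout.readline()  # blocks until the child has answered
    return time.perf_counter() - start


# Setup commands go to the same process; their time is kept separate.
setup_time = send("use db;") + send("set x=1;")
query_time = send("slow query;")  # only this measurement is of interest

process.stdin.close()             # end of input lets the child exit
exit_code = process.wait()
print(f"setup: {setup_time:.3f} s, query: {query_time:.3f} s")
```

With a real CLI this only works if the tool emits and flushes something recognisable after each statement when its stdout is a pipe; many tools buffer output in that situation, which is the complexity the comments above warn about.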