
I am writing a pipeline script that chains several Python scripts, passing the output of one script to the next as input. The scripts are invoked as command-line programs through the subprocess module. The application must run on Windows, so I cannot use the pexpect module.

I have a Python script, pipeline.py, which is the main entry point for my application, and a script, first.py, which is called first and expects input produced by second.py.

Here, first.py will prompt for input for a fixed number of iterations:

# first.py
import sys

total = 0   # running sum (named total to avoid shadowing the built-in sum())
for i in range(10):
    num = int(input('value: '))   # wait for input from pipeline.py, given by second.py
    # TODO: sleep to avoid EOFError?
    # while not num:
    #    time.sleep(5) until value is given?
    total += num
sys.stdout.write(str(total))   # write() needs a str; return the answer to pipeline.py

Each number that first.py prompts for is generated by second.py:

# second.py
import random

rand_num = random.randint(1, 10)
print(rand_num)   # print() converts the int to str and appends a newline

In pipeline.py I call first.py, wait until it asks for a value, call second.py to generate that value, then pass it back to first.py.

# pipeline.py
import subprocess as sp

cmd1 = "python first.py"
cmd2 = "python second.py"

prc1 = sp.Popen(cmd1, shell=True, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE)
# TODO: put prc1 on hold until value is passed

# do this for each iteration in first.py (i.e. range(0,10))
while True:
    # readline() returns bytes and only returns once it sees a newline or EOF
    line = prc1.stdout.readline()  # TODO: generates EOFError
    if line == b"value: ":     # input prompt
        prc2 = sp.Popen(cmd2, shell=True, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE)
        (val, err) = prc2.communicate()
        prc1.stdin.write(val)   # pass value back to first.py
        prc1.stdin.flush()      # push the value through the pipe buffer
    if line.strip().isdigit(): # iterations finished, first.py returns sum
        break

It seems my issue arises because prc1 asks for input but does not wait for the input to be provided. When running first.py directly from the command line, however, it does not crash and actually waits for input. Any thoughts? Help would be greatly appreciated. Thanks in advance!
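
One way the exchange described above might be made to work is sketched below. It is not a confirmed fix for the real application. It assumes CPython 3, where `input()` flushes its prompt even when stdout is a pipe, and it reads the child's output one byte at a time because the prompt does not end with a newline, so `readline()` would never return on it. The names `FIRST_SRC`, `SECOND_SRC`, and `run_pipeline` are hypothetical, and the two scripts are written to a temporary directory only to keep the example self-contained.

```python
import subprocess as sp
import sys
import tempfile
from pathlib import Path

# Hypothetical stand-ins for first.py and second.py.
FIRST_SRC = """\
total = 0
for _ in range(10):
    total += int(input('value: '))
print(total)
"""

SECOND_SRC = """\
import random
print(random.randint(1, 10))
"""

PROMPT = b"value: "


def run_pipeline(workdir):
    """Feed ten values from second.py to first.py; return (values, total)."""
    first = Path(workdir, "first.py")
    second = Path(workdir, "second.py")
    first.write_text(FIRST_SRC)
    second.write_text(SECOND_SRC)

    # stderr is not piped, so an unread stderr pipe cannot fill up and stall the child.
    prc1 = sp.Popen([sys.executable, str(first)], stdin=sp.PIPE, stdout=sp.PIPE)
    values = []
    buf = b""
    while True:
        ch = prc1.stdout.read(1)      # blocks until one byte arrives or EOF
        if not ch:                    # EOF: first.py has exited
            break
        buf += ch
        if buf.endswith(PROMPT):      # first.py is now waiting on its stdin
            out = sp.run([sys.executable, str(second)],
                         stdout=sp.PIPE, check=True).stdout
            values.append(int(out))
            prc1.stdin.write(out.strip() + b"\n")  # input() needs the newline
            prc1.stdin.flush()                     # push it through the pipe
            buf = b""
    prc1.stdin.close()
    prc1.wait()
    return values, int(buf)           # whatever followed the last prompt is the sum


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        values, total = run_pipeline(tmp)
        print(values, total)
```

The two details that matter here are the trailing newline plus `flush()` on the parent's side, and not waiting for a newline that the prompt never sends.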

jdpena
  • From what I can gather from your question, this could be due to `Popen` not blocking. See [Blocking and Non Blocking subprocess calls](http://stackoverflow.com/questions/21936597/blocking-and-non-blocking-subprocess-calls). Then again, I'm not really clear on why you're spawning subprocesses for this... – roganjosh Dec 19 '16 at 18:56
  • Using `call(cmd)` or `wait()` will make `prc1` hang since it will continue to block until the process is finished. In this case, the process can never finish because it never receives any input. The code I showed is representative of my actual problem. I'm not actually trying to compute a sum with subprocesses. – jdpena Dec 19 '16 at 19:08
  • I was just typing another question when you replied :) Why are you doing this? This is making a very convoluted setup with inter-process communication, why can't you just create functions in `first.py` and `second.py` and just import them into `pipeline.py`? – roganjosh Dec 19 '16 at 19:12
  • Also, your last response has confused me more. You have `# TODO: put prc1 on hold until value is passed` in your code, but now you say that it would hang indefinitely. Since `first.py` never calls `second.py` itself, you either have it as a blocking call in `pipeline.py` so that when `second.py` is called, it has an argument, or you have a broken system. I don't get it. – roganjosh Dec 19 '16 at 19:14
  • Not sure what's confusing. You suggested looking into blocking subprocess calls, such as `call()` and `wait()` from your suggested link. Using either of those functions will hang the entire application because no input will ever be sent to prc1. Currently, I can send input but the application crashes with `EOFError` exceptions. The `TODO`s I listed are my thoughts on how to solve the issues. As I said, this code is representative. I would import modules if possible. My question is specific to my task. – jdpena Dec 19 '16 at 19:23
  • "I am writing a pipeline script that incorporates different Python scripts which take the output of one script as the input to another, sequentially". The first part of `first.py` is a blocking call to ask the user for input in `num = int(input('value: '))`. The input is never supplied because "no input will ever be sent to prc1". So that script stalls. It will never end. So how could `pipeline.py` _ever_ execute the subprocesses sequentially? – roganjosh Dec 19 '16 at 19:29
  • To reinforce what I think @roganjosh is saying, it seems like you should launch `second.py` first and pipe its output into `first.py`. Also, I think you should `print()` the output in both of them so each line is terminated with a newline, which `input()` waits to see before it returns something. – martineau Dec 19 '16 at 21:57
  • @martineau I think something is wonky in the main setup. We can't see OP's code, but if this is representative then for sure I would not be calling subprocesses for this. The fact that one subprocess uses `input` and yet expects no input makes me think that the logic here is flawed. – roganjosh Dec 19 '16 at 22:09
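
The ordering suggested in the comments (run `second.py` first, then pipe its newline-terminated output into `first.py`) could look roughly like the sketch below. It assumes the values do not depend on anything `first.py` prints, which may not hold for the real application. `FIRST_SRC`, `SECOND_SRC`, `generate_values`, and `sum_with_first` are hypothetical names, and the scripts are passed with `-c` only to keep the example self-contained.

```python
import subprocess as sp
import sys

# Hypothetical one-line stand-ins for first.py and second.py.
FIRST_SRC = "print(sum(int(input('value: ')) for _ in range(10)))"
SECOND_SRC = "import random; print(random.randint(1, 10))"


def generate_values(count):
    """Run the generator script `count` times and collect its output lines."""
    lines = []
    for _ in range(count):
        done = sp.run([sys.executable, "-c", SECOND_SRC],
                      stdout=sp.PIPE, universal_newlines=True, check=True)
        lines.append(done.stdout.strip())
    return lines


def sum_with_first(lines):
    """Send all values to the summing script at once and parse its answer."""
    done = sp.run([sys.executable, "-c", FIRST_SRC],
                  input="\n".join(lines) + "\n",   # one newline-terminated line per input()
                  stdout=sp.PIPE, universal_newlines=True, check=True)
    # stdout holds the ten "value: " prompts followed by the sum.
    return int(done.stdout.replace("value: ", ""))


if __name__ == "__main__":
    values = generate_values(10)
    print(values, sum_with_first(values))
```

Because every input is ready before the summing script starts, nothing has to be read and written interactively, so the parent never waits on a prompt.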
