3

I'm trying to understand how Python's subprocess module works and have begun by setting myself some problems that turned out to be less simple than I thought. Specifically, I'm trying to interact with a Python interpreter that has been created as a subprocess.

I've created a test module, dummy.py, that is structured as follows:

def hi():
    print "Hi Earth"


hi()

Then, to test my ability to use the subprocess module, I've written a module called pyrun.py that is structured as follows:

import subprocess

def subprocess_cmd1():
    outFile = open("tempy1.tmp",'w')
    proc = subprocess.Popen("pwd", stdin=subprocess.PIPE, stdout=outFile, stderr=outFile, shell=True)
    outFile.close()

def subprocess_cmd2():
    outFile = open("tempy2.tmp",'w')
    proc = subprocess.Popen('python dummy.py', stdin=subprocess.PIPE, stdout=outFile, stderr=outFile, shell=True)
    outFile.close()

def subprocess_cmd3():
    outFile = open("tempy3.tmp",'w')
    proc = subprocess.Popen('python', stdin=subprocess.PIPE, stdout=outFile, stderr=outFile, shell=True)
    proc.communicate('import dummy')
    outFile.close()

def subprocess_cmd4():
    outFile = open("tempy4.tmp",'w')
    proc = subprocess.Popen('python', stdin=subprocess.PIPE, stdout=outFile, stderr=outFile, shell=True)
    proc.communicate('import dummy')
    proc.communicate('dummy.hi()')
    outFile.close()

print "Start"
subprocess_cmd1()
subprocess_cmd2()
subprocess_cmd3()
subprocess_cmd4()
print "Stop"

The idea is to send input to the subprocess from the calling process and to have all output sent to a text file.

When I attempt to run pyrun from the command line, I get the following results:

me@Bedrock1:~/Projects/LushProjects/newCode$ python pyrun.py
Start
Traceback (most recent call last):
  File "pyrun.py", line 42, in <module>
    subprocess_cmd4()
  File "pyrun.py", line 35, in subprocess_cmd4
    proc.communicate('dummy.hi()')
  File "/usr/lib/python2.7/subprocess.py", line 785, in communicate
    self.stdin.write(input)
ValueError: I/O operation on closed file

subprocess_cmd1 through subprocess_cmd3 run without crashing. The error comes in subprocess_cmd4(), when trying to execute the statement:

proc.communicate('dummy.hi()')

This seems to be because the communicate method closes the pipe to stdin after it's first used. Why does it do that? Is there any advantage to assuming the pipe should close?

Also, when I look at the contents of tempy3.tmp (my output file for subprocess_cmd3), it's missing the 'start' text of the Python interpreter - i.e.

Python 2.7.6 (default, Mar 22 2014, 22:59:56) 
[GCC 4.8.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.

Why is that? I redirected both stdout & stderr to outFile.

Finally, why is tempy4.tmp completely empty? Shouldn't it contain, at least, the text that was sent to it before it crashed? (i.e. it should look a lot like tempy3.tmp)

DNA
user1245262
  • unrelated: why do you use subprocess to run Python code? – jfs Dec 22 '14 at 06:33
  • @J.F.Sebastien - I am really just experimenting with subprocess, to see if I understood how to use it & how it worked. I suppose a more realistic example would be using it to run code in an interpreter for some other language. – user1245262 Dec 22 '14 at 13:08
  • if it is a learning exercise then here are some hints: 1. avoid `shell=True`, use a list argument to pass the command 2. Know that a child process may behave differently if its stdin/stdout/stderr are redirected (e.g., to suppress color (ansi codes) in the output) or print no header, as in `python`'s case. – jfs Dec 22 '14 at 14:56
  • 3. Understand buffering issues (here's [a simple case where the parent only reads child's output](http://stackoverflow.com/q/20503671/4279)): there are several buffers before `p.stdin.write("import dummy\n")` data is seen by the child. 4. child process may [read/write directly to terminal](http://stackoverflow.com/q/20980965/4279). Given points 3 and 4 (for dialog-based interaction) [`pexpect` can be more convenient](http://stackoverflow.com/q/20185353/4279). 5. [you don't need multiple threads or multiprocessing to run several subprocesses in parallel](http://stackoverflow.com/a/23616229/4279) – jfs Dec 22 '14 at 15:02

3 Answers

4

Define your interpreter:

import sys

interpreter = sys.executable

and pass a list as the first argument:

fproc=subprocess.Popen([interpreter,script,'-f',datafile], stdout=subprocess.PIPE)
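
For illustration, here is a minimal sketch of the list-argument form, written for Python 3 (the question itself uses Python 2.7). The helper name run_script and the temporary stand-in for dummy.py are assumptions made for this example, not part of the original answer.

```python
import os
import subprocess
import sys
import tempfile


def run_script(script_path, *args):
    """Run script_path with the current interpreter and return its stdout as text."""
    # A list argument means no shell is involved, so no shell=True and no quoting issues.
    cmd = [sys.executable, script_path] + list(args)
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    out, _ = proc.communicate()  # waits for the child to exit
    return out.decode()


# Hypothetical stand-in for the question's dummy.py, written to a temp directory.
with tempfile.TemporaryDirectory() as tmp:
    script = os.path.join(tmp, "dummy.py")
    with open(script, "w") as f:
        f.write('def hi():\n    print("Hi Earth")\n\nhi()\n')
    output = run_script(script)

print(output)
```

Using sys.executable should ensure the child runs under the same interpreter as the parent, rather than whatever `python` happens to resolve to on the PATH.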
Nathaniel Ford
zeotrope
1

The problem is how you're using Popen.communicate(), which expects all of the input as a single string and can only be called once per process. From the docs:

https://docs.python.org/2/library/subprocess.html

Interact with process: Send data to stdin. Read data from stdout and stderr, until end-of-file is reached. Wait for process to terminate. The optional input argument should be a string to be sent to the child process, or None, if no data should be sent to the child.

Try this:

def subprocess_cmd4():
    outFile = open("tempy4.tmp",'w')
    proc = subprocess.Popen('python', stdin=subprocess.PIPE, stdout=outFile, stderr=outFile, shell=True)
    proc.communicate('import dummy\ndummy.hi()\n')
    outFile.close()
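
A rough Python 3 equivalent of the same idea, as a sketch only: the helper name run_lines is hypothetical, the hi() function is defined inline so the example does not depend on dummy.py being importable, and the output goes to a temporary path rather than the working directory.

```python
import os
import subprocess
import sys
import tempfile


def run_lines(lines, out_path):
    """Feed all lines to a child interpreter in a single communicate() call."""
    source = "".join(line + "\n" for line in lines)
    with open(out_path, "w") as out_file:
        proc = subprocess.Popen(
            [sys.executable],           # list argument, no shell
            stdin=subprocess.PIPE,
            stdout=out_file,
            stderr=subprocess.STDOUT,   # send stderr to the same file
            universal_newlines=True,    # exchange str instead of bytes
        )
        # communicate() writes the input, closes stdin, and waits for exit.
        proc.communicate(source)
    return proc.returncode


out_path = os.path.join(tempfile.mkdtemp(), "tempy4.tmp")
code = run_lines(
    ["def hi():", "    print('Hi Earth')", "", "hi()"],
    out_path,
)
with open(out_path) as f:
    contents = f.read()

print(code, contents)
```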
user590028
  • Thanks. I came across a different solution, where I just used proc.stdin.write('....\n') for each line I fed to the interpreter. But I'm still wondering what's the advantage of having communicate close the pipe, and where did the 'intro' text from the Python interpreter go? – user1245262 Dec 21 '14 at 21:38
  • @user1245262: quote from the docs: *"Wait for process to terminate."* -- there is no point to call `communicate()` -- the child process is already dead. *"where did the 'intro' text from the Python interpreter go?"* -- python doesn't print it in non-interactive mode (if stdin is not connected to tty). – jfs Dec 22 '14 at 06:33
  • @user1245262: ^^^^^ "there is no point to call `communicate()`" **more than once** – jfs Dec 22 '14 at 14:30
0

Answering your question about communicate().

communicate() can only be used once, because the outFile being called has been closed after the first communicate. Calling communicate() again will never produce anything because you have already read all the output in the previous one. An advantage of using this is that you do not need to terminate after you use it.

Answering your question about where the header of the python command went in tempy3.tmp.

This is merely a header for python, and is not an 'answer' to a 'question'. You are simply entering python mode, and not requesting anything back. However, if you try:

proc.communicate('1+1')

Then that should write 2 to the file tempy4.tmp, right?

No. This is because communicate() can only get output from the Unix command line, not the python. e.g.

proc = subprocess.Popen('ls', stdin=subprocess.PIPE, stdout=outFile, stderr=outFile, shell=True)
proc.communicate('-l')

Outputs:

student@ubuntu:~/Desktop/Testing$ pg tempy4.tmp                                
dummy.py
dummy.pyc
pyrun.py
tempy1.tmp
tempy2.tmp
tempy3.tmp
tempy4.tmp

I ran your program as is, and tempy4.tmp actually shows Hi Earth once, along with the same error you posted. However, if you get rid of the second communicate() and keep only one, you can do as @user590028 pointed out:

proc.communicate('import dummy\ndummy.hi()\n')

However, instead of having all the commands bunched into one line, subprocess allows you to do this with stdin.write:

proc.stdin.write('import dummy\n')
proc.communicate('dummy.hi()')

Make sure you put \n after each command to end the line. Both versions output:

Hi Earth
Hi Earth
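
The write-then-communicate pattern can be sketched in Python 3 as follows. This is an illustration under assumptions: the statements sent to the child are inline stand-ins for the dummy module, and output is captured through a pipe instead of a file so it can be inspected directly.

```python
import subprocess
import sys

proc = subprocess.Popen(
    [sys.executable],
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    universal_newlines=True,  # exchange str instead of bytes
)

# Write the statements one at a time; each needs its own trailing newline.
proc.stdin.write("greeting = 'Hi Earth'\n")
proc.stdin.write("print(greeting)\n")
proc.stdin.flush()

# A single communicate() call at the end closes stdin and waits for the child.
out, _ = proc.communicate()
print(out)
```

Note that with stdin connected to a pipe, the child interpreter most likely does not execute anything until stdin is closed, so this is not a line-by-line dialog. For real back-and-forth interaction, a tool such as pexpect (mentioned in the comments on the question) is probably a better fit.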
Jonathan Davies
  • 1. you can close `outFile` even before `.communicate()` is called -- immediately after `Popen()` (the subprocess has its own copy) 2. `.communicate()` waits for the subprocess to finish i.e., the child process has been reaped by the time `.communicate()` returns. 3. the explanation about `tempy3.py` header is incorrect. python doesn't print header in non-interactive mode (stdin is not a tty). `-i` option can force interactive mode. – jfs Dec 22 '14 at 06:26
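
The last point in the comment above can be checked with a small Python 3 sketch. The helper name child_stderr is hypothetical; the sketch assumes the interpreter writes its startup banner to stderr, which is where Python 3 is expected to put it.

```python
import subprocess
import sys


def child_stderr(extra_args):
    """Run a child interpreter with piped stdin and return what it wrote to stderr."""
    proc = subprocess.Popen(
        [sys.executable] + extra_args,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    _, err = proc.communicate("x = 1\n")
    return err


plain = child_stderr([])       # stdin is a pipe, not a tty: no banner expected
forced = child_stderr(["-i"])  # -i forces interactive mode: banner and prompts expected

print(repr(plain))
print(forced)
```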