
I have a routine that frequently executes a shell command and parses its output. I use subprocess.Popen to execute the command. I have tested it with commands such as ls and systemctl status xxxx.service.

Platform - Linux 3.14.4-yocto-standard x86_64


status_call = subprocess.Popen(shlex.split(<shell_command>),
                                   stdout=subprocess.PIPE,
                                   stderr=subprocess.PIPE)

I use communicate() to read the output and error streams from the child process above. I treat the call as failed if the output length is 0 or the error length is nonzero, and in that case I print the error and the return code.

import subprocess
import shlex

def loop_process():
  count = 0
  iter = 0
  print 'Test Case: Starting the script'
  while True:
    status_call = subprocess.Popen(shlex.split('ls'),
                                   stdout=subprocess.PIPE,
                                   stderr=subprocess.PIPE)
    output, error = status_call.communicate()
    if len(error) != 0 or len(output) == 0:
        print 'Test Case : Unable to make the systemctl status call\n'
        print 'error is : ' + str(error) + 'with length of output as ' + str(len(output))
        print 'return_code is ' + str(status_call.returncode)
        count = 0
        iter = 0
    else:
        count = count + 1
        if count > 1000:
           iter = iter + 1
           print iter  
           count = 0
if __name__ == '__main__':
  loop_process()

The above logic fails at random times. When it does, the output is empty, there is no error message, and the return code from the child process is zero, as shown below.

I see the issue roughly once every 200,000 executions. I am running the script on embedded equipment; I do not see the issue when I run it on QEMU.

114
115
Unable to make the systemctl status call
error is : with length of output as 0
return_code is 0
1
2

Other occurrence

387
388
389
Test Case : Unable to make the systemctl status call
error is : with length of output as 0
return_code is 0
1
2

Can anyone guide me on how to resolve this? Where should I concentrate to find the root cause and, possibly, a fix? I strongly suspect a problem with the PIPE.
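
As a possible first step in narrowing this down, the failure check could be split so that each condition is reported separately, instead of being folded into one `if`. The following is only a sketch: `run_once` and `classify` are hypothetical helper names that are not part of the script above, and the labels they return are arbitrary.

```python
import shlex
import subprocess


def run_once(command):
    """Run the command once and return (returncode, output, error)."""
    proc = subprocess.Popen(shlex.split(command),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.PIPE)
    output, error = proc.communicate()
    return proc.returncode, output, error


def classify(returncode, output, error):
    """Label a result so the different failure modes can be counted apart."""
    if returncode != 0:
        return 'command_failed'   # the child itself reported a failure
    if len(error) != 0:
        return 'stderr_output'    # exit code 0, but something on stderr
    if len(output) == 0:
        return 'silent_empty'     # the case in question: rc 0, nothing on either pipe
    return 'ok'
```

Logging the label for every non-`ok` result (for example `classify(*run_once('ls'))`) would show whether the only failure ever seen is `silent_empty`, which is what the logs above suggest.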

  • The problem should be with the command you're running, which may not reliably produce output. Can you show us the command you run? – blhsing Jun 29 '18 at 16:03
  • As mentioned in the question, I tested it with two different commands 1) 'ls' 2) 'systemctl status xxxx.service' – Nani Anudeep Jun 29 '18 at 16:09
  • Please post enough code to reproduce the problem. – martineau Jun 29 '18 at 16:17
  • The code you have written does not print output until the command has finished executing. Please check your command. Try running the `netstat` command. See this for live output from a subprocess: [link](https://stackoverflow.com/questions/18421757/live-output-from-subprocess-command) – Yajana N Rao Jun 29 '18 at 16:29
  • @Yajana, thanks for your answer, but my question is not about printing output. I am asking for guidance on the scenario where the communicate() call randomly fails to return any output or error. – Nani Anudeep Jun 29 '18 at 16:36
  • try closing the unused fds in the subprocess by passing the parameter: `close_fds=True` to `subprocess.Popen` – nosklo Jun 29 '18 at 17:17
  • @nosklo I tried close_fds=True. The issue still reproduced, this time after only 22,000 executions. Can you give me some insight into why you suggested close_fds=True? – Nani Anudeep Jun 29 '18 at 17:26
  • @NaniAnudeep executing a subprocess works by forking the current process first. That means the open file descriptors get shared and they can accumulate. – nosklo Jun 29 '18 at 17:32
  • @NaniAnudeep, I ran your script for > 1,000,000 iterations without error (Python 2.7.15, Linux x86_64). – OregonJim Jun 29 '18 at 19:07
  • @OregonJim I see the issue roughly once every 200,000 executions. I am running on embedded equipment, and I do not see the issue when I run on QEMU. Do you think hardware factors could affect subprocess.PIPE? – Nani Anudeep Jun 29 '18 at 19:28
  • @NaniAnudeep, that is an important clue that should have been disclosed clearly in your question. – OregonJim Jun 29 '18 at 20:12
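
The file-descriptor accumulation raised in the comments can be checked directly. The sketch below assumes a Linux system with `/proc` mounted. `count_open_fds` and `fd_growth` are hypothetical helper names, and the iteration count is arbitrary.

```python
import os
import shlex
import subprocess


def count_open_fds():
    """Return the number of fds open in this process, or None without /proc."""
    try:
        return len(os.listdir('/proc/self/fd'))
    except OSError:
        return None


def fd_growth(command, iterations=100):
    """Run the command repeatedly and return how many fds the parent gained."""
    before = count_open_fds()
    for _ in range(iterations):
        proc = subprocess.Popen(shlex.split(command),
                                stdout=subprocess.PIPE,
                                stderr=subprocess.PIPE,
                                close_fds=True)
        proc.communicate()  # reads both pipes to EOF and closes them
    after = count_open_fds()
    if before is None or after is None:
        return None
    return after - before
```

A result near zero from `fd_growth('ls')` on the target hardware would suggest that descriptors are not accumulating in the parent, which would point the investigation elsewhere.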

0 Answers