
I'm working on an automated framework for a bioinformatics tool. Since most of the software my program will use is written for Linux and not in Python, I use subprocess to invoke it. The problem is that many steps in the pipeline take a very long time, and I want to see the live output so I know that a step is still working and has not hung. I also need to capture the output so I can log any unexpected errors after the process is done.

I found that subprocess.Popen() is what I need for this.

This is the code I use (found here: https://fabianlee.org/2019/09/15/python-getting-live-output-from-subprocess-using-poll/):

    # invoke process
    process = subprocess.Popen("./test.sh", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

    # print stdout while process is still working
    while True:
        output = process.stdout.readline()
        if process.poll() is not None:
            break
        if output:
            print("out:", output.strip())
    rc = process.poll()

    if rc == 0:
        print("Process ended with rc:", rc, "output:", output)

    else:
        print("Process ended with rc:", rc, "error:", process.stderr.readline())

It works like a charm when I use this simple bash script as argument:

    #!/bin/bash

    for i in $(seq 1 5); do
        echo "iteration" $i
        sleep 1
    done

which gives the output:

    out: iteration 1
    out: iteration 2
    out: iteration 3
    out: iteration 4
    out: iteration 5
    Process ended with rc: 0 output:

or this if I deliberately insert an error into the script, e.g.:

    Process ended with rc: 2 error: ./test.sh: line 7: syntax error: unexpected end of file

However, when I try it with a real tool (in this case picard ValidateSamFile), I don't get any live output, no matter what I have tried:

    # invoke process
    process = subprocess.Popen("picard ValidateSamFile -I dna_seq/aligned/2064-01/AHWM2NCCXY.RJ-1967-2064-01.6.bam", shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)

    # print stdout while process is still working
    while True:
        output = process.stdout.readline()
        if process.poll() is not None:
            break
        if output:
            print("out:", output.strip())
    rc = process.poll()

    if rc == 0:
        print("Process ended with rc:", rc, "output:", output)

    else:
        print("Process ended with rc:", rc, "error:", process.stderr.readline())

I get this after the process is completed:

    out: No errors found
    Process ended with rc: 0 output:

Any ideas?

Vadim Kotov
biomedswe
  • Just a guess, perhaps the stdout of the subprocess isn't getting flushed, so by the time the stdout would finally get written, process.poll() already returns and causes it to break. – William Jun 28 '21 at 08:12
  • From some googling it looks like picard is a Java package normally distributed as a JAR, so what is the `picard` executable? Is it a shell script wrapping `java`? – Iguananaut Jun 28 '21 at 08:29
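
One way to act on the flushing guess in the comments is to read the pipe until end-of-file instead of breaking as soon as process.poll() returns, so that lines arriving just before the process exits are not dropped. The following is only a minimal sketch, not a confirmed fix: the function name `run_with_live_output` is hypothetical, and merging stderr into stdout rests on the unconfirmed assumption that the tool writes its progress messages to stderr. If the child process buffers its own output internally, nothing on the Python side can make those lines appear sooner.

```python
import subprocess
import sys


def run_with_live_output(command):
    """Run command, echo each output line as it arrives, return (rc, lines).

    Hypothetical helper for illustration only.
    """
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # assumption: progress may be on stderr, so merge it
        text=True,
        bufsize=1,  # line-buffered on the Python side of the pipe
    )
    captured = []
    # Iterating the pipe reads until EOF, so output written right before
    # the process exits is still seen (unlike breaking on poll()).
    for line in process.stdout:
        line = line.rstrip("\n")
        print("out:", line, flush=True)
        captured.append(line)
    process.stdout.close()
    rc = process.wait()  # the pipe is drained, so this only collects the exit code
    return rc, captured


if __name__ == "__main__":
    # Stand-in child process that writes one line to stdout and one to stderr.
    demo = [
        sys.executable,
        "-c",
        "import sys; print('iteration 1'); print('oops', file=sys.stderr)",
    ]
    rc, lines = run_with_live_output(demo)
    print("Process ended with rc:", rc, "captured lines:", len(lines))
```

The captured list can be written to a log after the process ends. Because stdout and stderr are merged into one pipe, the relative order of lines from the two streams depends on how the child buffers each of them.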
