Maybe what I need is a general explanation of what STDOUT is, but here's my problem. I need to run a shell command from within Python on a bunch of pairs of files and parse the output. If I run:
from itertools import combinations
from subprocess import Popen

for pair in combinations(all_ssu, 2):
    Popen(
        ['blastn',
         '-query', 'tmp/{0}.fna'.format(pair[0]),
         '-subject', 'tmp/{0}.fna'.format(pair[1]),
         '-outfmt', '6 qseqid sseqid pident'
        ],
    )
... it seems to work great (note: all_ssu is essentially a list of file names). The shell prints a bunch of lines of data that I'd like to compare. So how do I get that printed data into a list or a dataframe or something so that I can use it?
After looking around the docs and some other questions here, it looks like the stdout flag is looking for a file object, so I tried:
from itertools import combinations
from subprocess import Popen

for pair in combinations(all_ssu, 2):
    out_file = open('tmp.txt', 'rw')
    Popen(
        ['blastn',
         '-query', 'tmp/{0}.fna'.format(pair[0]),
         '-subject', 'tmp/{0}.fna'.format(pair[1]),
         '-outfmt', '6 qseqid sseqid pident'
        ],
        stdout=out_file
    )
    for line in out_file.readlines():
        print line
    out_file.close()
And this also seems to work, except that I create that temporary file that I don't need. I tried just setting a variable captured to None and then putting stdout=captured, but in this case it's just setting captured to 0. I also tried out = Popen(...) without the stdout flag, but again, out is just int(0). I also tried playing around with PIPE, but couldn't make heads or tails of it.
So the question is: how do I just capture the output from Popen directly?
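For reference, here is a minimal sketch of what capturing through PIPE might look like, without a temporary file. The helper name run_and_capture and the stand-in command fake_blastn are hypothetical: the stand-in is only there so the sketch runs without BLAST installed, and it prints two made-up rows shaped like the 'qseqid sseqid pident' output.

```python
import sys
from subprocess import Popen, PIPE


def run_and_capture(cmd):
    """Run cmd, wait for it to finish, and return its stdout as a list of rows."""
    # stdout=PIPE connects the child's stdout to a pipe that we can read from.
    # universal_newlines=True gives us text instead of bytes.
    proc = Popen(cmd, stdout=PIPE, universal_newlines=True)
    out, _ = proc.communicate()  # reads all output and waits for the process to exit
    if proc.returncode != 0:
        raise RuntimeError('command failed with exit code %d' % proc.returncode)
    # Tabular output is assumed to be tab-separated, one record per line.
    return [line.split('\t') for line in out.splitlines() if line]


# Hypothetical stand-in for the blastn call (assumption, not real BLAST output).
fake_blastn = [sys.executable, '-c',
               "print('seqA\\tseqB\\t98.5'); print('seqA\\tseqC\\t91.2')"]
rows = run_and_capture(fake_blastn)
```

Inside the combinations loop, the blastn argument list from above would presumably take the place of fake_blastn, and the rows from each pair could be appended to one list for later comparison.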