Let's say I have the following command running from the shell:
{
samtools view -HS header.sam; # command1
samtools view input.bam 1:1-50000000; # command2
} | samtools view -bS - > output.bam # command3
For those who aren't familiar with samtools view (since this is Stack Overflow): this essentially creates a new BAM file that has a new header. BAM files are typically large compressed files, so even a single pass through the file can be time consuming. One alternative would be to run command2 and then use samtools reheader to switch the header, but that passes through the large file twice. The command above passes through the BAM only once, which matters for larger BAM files (WGS files can be larger than 20 GB even when compressed).
My question is how to implement commands of this type in Python using subprocess.
I have the following:
import subprocess

fh_bam = open('output.bam', 'wb')  # binary mode, since BAM output is compressed binary data
params_0 = [ "samtools", "view", "-HS", "header.sam" ]
params_1 = [ "samtools", "view", "input.bam", "1:1-50000000"]
params_2 = [ "samtools", "view", "-bS", "-" ]
sub_0 = subprocess.Popen(params_0, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
sub_1 = subprocess.Popen(params_1, stderr=subprocess.PIPE, stdout=subprocess.PIPE)
### SOMEHOW APPEND sub_1.stdout to sub_0.stdout
sub_2 = subprocess.Popen(params_2, stdin=appended.stdout, stdout=fh_bam)
Any help is greatly appreciated. Thanks.
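One possible approach, sketched here rather than offered as a definitive answer: start the consuming process first with stdin=subprocess.PIPE, then run the producing commands one after another with their stdout pointed at that pipe. Running them sequentially preserves the header-then-records order of the shell group, and the data never passes through Python. The helper name run_concatenated and its parameters are hypothetical, introduced only for illustration, and the sketch has not been tested against samtools itself.

```python
import subprocess


def run_concatenated(producers, consumer, out_path):
    """Feed the stdout of each producer, in order, into one consumer's stdin.

    producers: list of argument lists (e.g. params_0 and params_1)
    consumer:  argument list of the reading command (e.g. params_2)
    out_path:  file that receives the consumer's stdout
    """
    with open(out_path, "wb") as out_fh:
        # Start the consumer first so there is a pipe to write into.
        sink = subprocess.Popen(consumer, stdin=subprocess.PIPE, stdout=out_fh)
        try:
            for cmd in producers:
                # Each producer writes directly into the consumer's stdin.
                # check=True raises CalledProcessError on a non-zero exit.
                subprocess.run(cmd, stdout=sink.stdin, check=True)
        finally:
            # Closing our copy of the pipe lets the consumer see end-of-file.
            sink.stdin.close()
        returncode = sink.wait()
    if returncode != 0:
        raise subprocess.CalledProcessError(returncode, consumer)
```

With the lists from the question, the call would presumably be run_concatenated([params_0, params_1], params_2, "output.bam"). Leaving stderr unpiped avoids the risk of a child blocking on a full stderr pipe that nothing reads.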