I am trying to write a pipeline in Python that makes proper use of subprocess and does not rely on shell=True. A common task in bioinformatics is to align sequences with a program such as bwa and pass the SAM-formatted results on to samtools for downstream processing. The Python package pysam lets me do everything samtools does, but from within Python.

I would like to pass the results from a call to the aligner bwa to pysam without having to write to a file.

Pysam lets you open a Samfile object with the input file set to "-", in which case it reads from stdin. Similarly, bwa writes its results to stdout.

The way I have written it thus far is:

import subprocess
import pysam

bwa_call = ["bwa", "mem", "-v", "1", "-t", str(cores), index, fwd, rev]
bwa = subprocess.Popen(bwa_call, stdout=subprocess.PIPE)  # bwa's stdout goes to a pipe
samfile = pysam.Samfile("-", "r")  # "-" means read from stdin

This seems to work, in that I no longer see bwa's output on stdout, but the problem is that pysam does not know when the input is finished and so just keeps waiting.
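To isolate the pipe behaviour from bwa and pysam, here is a minimal sketch of reading a child process's stdout through a pipe until end-of-file. It is not the bwa/pysam solution itself: the child is a stand-in Python one-liner that prints three SAM-like lines, and `child_code`, `producer_call` and `read_records` are hypothetical names made up for the illustration. As far as I can tell, it also points at what is missing in the code above: the data ends up in the pipe held by `bwa.stdout`, while "-" refers to the stdin of the Python process itself, so nothing connects the two.

```python
import subprocess
import sys

# Stand-in for the bwa command (an assumption for illustration): a child
# Python process that writes three tab-separated, SAM-like lines and exits.
child_code = "for i in range(3): print('read%d\\tchr1\\t%d' % (i, 100 + i))"
producer_call = [sys.executable, "-c", child_code]


def read_records(call):
    """Run `call` without a shell and return its stdout lines split on tabs."""
    records = []
    with subprocess.Popen(call, stdout=subprocess.PIPE, text=True) as proc:
        # Iterating over proc.stdout stops at EOF, which is reached once
        # the child has closed its end of the pipe (normally when it exits).
        for line in proc.stdout:
            records.append(line.rstrip("\n").split("\t"))
    # Leaving the with-block closes the pipe and waits for the child.
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, call)
    return records


records = read_records(producer_call)
for record in records:
    print(record)
```

The loop ends on its own here because the reader consumes the pipe that the child writes to, and that pipe reports end-of-file when the child finishes. No shell and no temporary file are involved.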

Is there a way in which I can pass the stdout from bwa directly to pysam without writing to a file?

Ian Fiddes
