I am using Bio::DB::Sam on CentOS 7 with samtools 0.1.17. I installed it with the following procedure:
wget http://sourceforge.net/projects/samtools/files/samtools/0.1.17/samtools-0.1.17.tar.bz2
tar xjf samtools-0.1.17.tar.bz2 && cd samtools-0.1.17
make CFLAGS=-fPIC
export SAMTOOLS=`pwd`
cpanm Bio::DB::Sam
I found this procedure here (note that I changed the samtools version).
The segfault occurs intermittently, sometimes even on the same input files. My general procedure is as follows:
- Use bowtie to generate a .sam file from a .fastq file, using a custom bowtie index
- Use samtools to convert my .sam to a .bam, sorting and indexing the file along the way
- Issue the following Perl commands:
my $sortbam = align_and_sort_and_index($reads_file); # steps 1 and 2
my @all_gene_ids = qw(gene_id1 gene_id2 gene_id3); # really lots more
for (my $worker=0; $worker <= $n_threads; $worker++) {
    my $pid = fork;
    die "fork error: $!" unless defined $pid;
    next if $pid; # parent
    my @gene_ids = get_unique_subset(@all_gene_ids, $worker);
    my $sam = Bio::DB::Sam->new(-bam=>$sortbam, -fasta=>$ampl_seqfile, -autoindex=>0);
    foreach my $gene_id (@gene_ids) {
        # THIS NEXT LINE IS THE ONE THAT SEGFAULTS (SOMETIMES):
        my @alignments = $sam->get_features_by_location(-seq_id => $gene_id);
        # do something interesting with @alignments...
    }
    exit;
}
while ((my $pid=wait()) != -1) {
    print "reaped $pid\n";
}
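For debugging, the reaping loop could also report how each worker ended, since Perl stores the child's wait status in $? after wait(). Below is a minimal sketch of that idea, not part of my actual script: describe_status is a hypothetical helper name, and the two throwaway children only imitate a normal worker and a crashing one. On Linux a segfault should show up as signal 11.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the original script): decode a raw wait
# status, as found in $? after wait(), using the bit layout documented
# in perlvar: low 7 bits = signal, bit 8 = core dump, high byte = exit code.
sub describe_status {
    my ($status) = @_;
    my $signal = $status & 127;
    if ($signal) {
        my $core = ($status & 128) ? ' (core dumped)' : '';
        return "killed by signal $signal$core";
    }
    return 'exited with code ' . ($status >> 8);
}

# Demonstration only: one child exits normally, the other sends itself
# SIGSEGV to imitate a worker that crashes.
for my $mode (qw(ok segv)) {
    my $pid = fork;
    die "fork error: $!" unless defined $pid;
    next if $pid;    # parent
    kill 'SEGV', $$ if $mode eq 'segv';
    exit 0;
}

# Same reaping loop as above, with the status added to the message.
while ((my $pid = wait()) != -1) {
    print "reaped $pid: ", describe_status($?), "\n";
}
```

Recording the pid at fork time alongside the worker number would then make it possible to tell which subset of gene ids a crashed worker was handling.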
To date I have tried the following:
- Increased the number of allowed open files (ulimit -n)
- Increased the number of allowed subprocesses
- Increased the limit of pipe buffers
- Increased the swap space
Any and all suggestions would be greatly appreciated. Thank you!