
I need to run the software souporcell using Apptainer (or Singularity 3.9.2) and a bash script. The script takes .bam, .tsv, .fa, and .vcf files as input, and also needs the files souporcell_latest.sif and souporcell_pipeline.py. I connected the host and container environments by bind mounting (--bind) the full host path to the container's /tmp directory (as suggested elsewhere). When I test with all of these files located in the same directory, the command runs well:

apptainer exec --bind /mypath/souporcell:/tmp souporcell_latest.sif souporcell_pipeline.py -i /tmp/A.merged.bam -b /tmp/GSM2560245_barcodes.tsv -f /tmp/refdata-cellranger-GRCh38-3.0.0/fasta/genome.fa -t 16 -o /tmp -k 3

It writes many output files to that same directory.

However, I have thousands of files and subdirectories (each one representing one sample), so I need to read inputs from, and write outputs to, different directories. I wrote a bash script, souporcell_skipRemap_k_3.sh, located in:

cd /mypath/scripts

This is the script:

#!/bin/bash
module load apptainer/1.1.4
                                             
SOUPORCELL_DIR=/mypath/data/test/souporcell
BATCH=${1}
SAMPLE=$(echo "$BATCH" | cut -d"_" -f1)

BAM_DIR=/mypath/data/test/fastqs/${BATCH}_OUT/outs
BARCODES_DIR=/mypath/data/test/fastqs/${BATCH}_OUT/outs/filtered_feature_bc_matrix
REF_DIR=/mypath/data/test/souporcell/references
                                                                       
apptainer exec --bind $SOUPORCELL_DIR:/tmp souporcell_latest.sif souporcell_pipeline.py \
        -i /tmp${BAM_DIR}/possorted_genome_bam.bam \
        -b /tmp${BARCODES_DIR}/barcodes.tsv \
        --fasta /tmp${REF_DIR}/refdata-gex-GRCh38-2020-A/fasta/genome.fa \
        --common_variants /tmp${REF_DIR}/common_variants_grch38.vcf \
        --skip_remap SKIP_REMAP \
        -t 16 \
        -o /tmp${SOUPORCELL_DIR}/skip_remap_OUTS/${SAMPLE}_souporcell \
        -k 3

I run it with one sample (B8-c2-10X; the script appends _OUT to build the directory name):

bash souporcell_skipRemap_k_3.sh B8-c2-10X

but I get:

[me@server scripts]$ bash souporcell_skipRemap_k_3.sh B8-c2-10X
checking modules
imports done
checking bam for expected tags
Traceback (most recent call last):
  File "/opt/souporcell/souporcell_pipeline.py", line 64, in <module>
    with open_function(args.barcodes) as barcodes:
  File "/opt/souporcell/souporcell_pipeline.py", line 57, in <lambda>
    open_function = lambda f: gzip.open(f,"rt") if f[-3:] == ".gz" else open(f)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/mypath/data/test/fastqs/B8-c2-10X_OUT/outs/filtered_feature_bc_matrix/barcodes.tsv'

I know the problem is not related to the Python script, but to how Apptainer containers work. However, even after reading the manual and trying different options, I have not been able to find a solution.
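
For anyone trying to reproduce this, here is a minimal sketch of how I understand a single --bind SRC:DEST mapping to translate paths. It is plain bash and does not start a container. The function to_container_path is a hypothetical helper I wrote for illustration (it is not part of Apptainer or souporcell), and it assumes only the one bind from my script, ignoring any directories Apptainer mounts by default.

```shell
#!/bin/bash
# Hypothetical helper (not part of Apptainer or souporcell): translate a host
# path into the path it would have inside the container for ONE
# `--bind SRC:DEST` mapping. Returns 1 when the host path lies outside SRC.
to_container_path() {
    local host_path=$1 bind_src=${2%/} bind_dest=${3%/}
    case "$host_path" in
        "$bind_src")   printf '%s\n' "${bind_dest:-/}" ;;
        "$bind_src"/*) printf '%s\n' "${bind_dest}${host_path#"$bind_src"}" ;;
        *)             return 1 ;;
    esac
}

# Paths taken from the script in this question.
SOUPORCELL_DIR=/mypath/data/test/souporcell
BARCODES=/mypath/data/test/fastqs/B8-c2-10X_OUT/outs/filtered_feature_bc_matrix/barcodes.tsv
VCF=$SOUPORCELL_DIR/references/common_variants_grch38.vcf

for p in "$BARCODES" "$VCF"; do
    if mapped=$(to_container_path "$p" "$SOUPORCELL_DIR" /tmp); then
        echo "$p -> $mapped"
    else
        echo "$p -> not under $SOUPORCELL_DIR, so not reachable through this bind"
    fi
done
```

If this reading is right, only files below /mypath/data/test/souporcell appear under /tmp in the container, and they appear without the /mypath/data/test/souporcell prefix, whereas my script prepends /tmp to the complete host path.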

Lucas