I am trying to run gcloud beta lifesciences because the Genomics API is deprecated. There have been many changes between the Genomics API and the Life Sciences API.
I ran one of my analysis steps on Google Cloud using beta lifesciences. Here is what I found: (1) wildcards do not work in the command-line options; (2) it is not easy to set the target directory in the command-line options, so I used env-vars for the copy.
I am now trying to convert the command-line options into a JSON pipeline file, but the Google Cloud help page is not easy to understand. Do you have any idea how to convert the following options into a JSON file, so that I could run it with simpler options?
I used a YAML-formatted pipeline file with the Genomics API, but beta lifesciences is totally different.
$ more step03_bwa_mem_genome1.run
#SMALL=
SMALL=chr21.
LIFESCIENCESPATH=/gcloud-shared
#LIFESCIENCESPATH=/mnt
SCRIPTFILENAME=step03_bwa_mem_genome.sh
COHORTID=2_C_222
gcloud beta lifesciences pipelines run \
--logging gs://${BUCKETID}/ExomeSeq/hResults/step03_bwa_mem_genome.${COHORTID}.log \
--regions=asia-northeast1,asia-northeast2,asia-northeast3,asia-east1,asia-east2,asia-south1 \
--boot-disk-size 20 \
--preemptible \
--machine-type n1-standard-1 \
--disk-size "gcloud-shared:10" \
--docker-image asia.gcr.io/thermal-shuttle-199104/centos8-essential-software-genomics-custom-python3:0.4 \
--inputs REFERENCE1=gs://${BUCKETID}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.amb \
--inputs REFERENCE2=gs://${BUCKETID}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.ann \
--inputs REFERENCE3=gs://${BUCKETID}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.bwt \
--inputs REFERENCE4=gs://${BUCKETID}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.fai \
--inputs REFERENCE5=gs://${BUCKETID}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.intervals \
--inputs REFERENCE6=gs://${BUCKETID}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.pac \
--inputs REFERENCE7=gs://${BUCKETID}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.sa \
--inputs SCRIPTFILE=gs://${BUCKETID}/ExomeSeq/${SCRIPTFILENAME} \
--inputs COHORTID=${COHORTID} \
--inputs SAMPLELIST=gs://${BUCKETID}/ExomeSeq/SAMPLELIST.${COHORTID}.lst \
--inputs INPUTFILE1=gs://${BUCKETID}/ExomeSeq/hReads/${COHORTID}_01_1.chr21.fastq.gz \
--inputs INPUTFILE2=gs://${BUCKETID}/ExomeSeq/hReads/${COHORTID}_01_2.chr21.fastq.gz \
--inputs INPUTFILE3=gs://${BUCKETID}/ExomeSeq/hReads/${COHORTID}_02_1.chr21.fastq.gz \
--inputs INPUTFILE4=gs://${BUCKETID}/ExomeSeq/hReads/${COHORTID}_02_2.chr21.fastq.gz \
--inputs INPUTFILE5=gs://${BUCKETID}/ExomeSeq/hReads/${COHORTID}_03_1.chr21.fastq.gz \
--inputs INPUTFILE6=gs://${BUCKETID}/ExomeSeq/hReads/${COHORTID}_03_2.chr21.fastq.gz \
--outputs OUTPUTFILE1=gs://${BUCKETID}/ExomeSeq/hResults/${COHORTID}_01.bam \
--outputs OUTPUTFILE2=gs://${BUCKETID}/ExomeSeq/hResults/${COHORTID}_02.bam \
--outputs OUTPUTFILE3=gs://${BUCKETID}/ExomeSeq/hResults/${COHORTID}_03.bam \
--env-vars REFERENCE1=${LIFESCIENCESPATH}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.amb,REFERENCE2=${LIFESCIENCESPATH}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.ann,REFERENCE3=${LIFESCIENCESPATH}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.bwt,REFERENCE4=${LIFESCIENCESPATH}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.fai,REFERENCE5=${LIFESCIENCESPATH}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.intervals,REFERENCE6=${LIFESCIENCESPATH}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.pac,REFERENCE7=${LIFESCIENCESPATH}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa.sa,SCRIPTFILE=${LIFESCIENCESPATH}/ExomeSeq/${SCRIPTFILENAME},SAMPLELIST=${LIFESCIENCESPATH}/ExomeSeq/SAMPLELIST.${COHORTID}.lst,INPUTFILE1=${LIFESCIENCESPATH}/ExomeSeq/hReads/${COHORTID}_01_1.chr21.fastq.gz,INPUTFILE2=${LIFESCIENCESPATH}/ExomeSeq/hReads/${COHORTID}_01_2.chr21.fastq.gz,INPUTFILE3=${LIFESCIENCESPATH}/ExomeSeq/hReads/${COHORTID}_02_1.chr21.fastq.gz,INPUTFILE4=${LIFESCIENCESPATH}/ExomeSeq/hReads/${COHORTID}_02_2.chr21.fastq.gz,INPUTFILE5=${LIFESCIENCESPATH}/ExomeSeq/hReads/${COHORTID}_03_1.chr21.fastq.gz,INPUTFILE6=${LIFESCIENCESPATH}/ExomeSeq/hReads/${COHORTID}_03_2.chr21.fastq.gz,OUTPUTFILE1=${LIFESCIENCESPATH}/ExomeSeq/hResults/${COHORTID}_01.bam,OUTPUTFILE2=${LIFESCIENCESPATH}/ExomeSeq/hResults/${COHORTID}_02.bam,OUTPUTFILE3=${LIFESCIENCESPATH}/ExomeSeq/hResults/${COHORTID}_03.bam \
--command-line="find ${LIFESCIENCESPATH}; /bin/bash ${LIFESCIENCESPATH}/ExomeSeq/${SCRIPTFILENAME} ${COHORTID} 4"
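
The long --env-vars value above is hard to maintain by hand. As a possible simplification (only a sketch, not something I have run against the API), the same comma-separated string could be built in the shell before calling gcloud. The helper name add_var and the variables ENVVARS and REFBASE are made up for this illustration; the paths and variable names are the ones from my script.

```shell
#!/bin/bash
# Sketch: build the comma-separated --env-vars value in a loop instead of by hand.
# add_var, ENVVARS and REFBASE are hypothetical names used only in this example.
SMALL=chr21.
LIFESCIENCESPATH=/gcloud-shared
SCRIPTFILENAME=step03_bwa_mem_genome.sh
COHORTID=2_C_222
REFBASE=${LIFESCIENCESPATH}/ExomeSeq/hReference/GRCh38.primary_assembly.genome.${SMALL}fa

ENVVARS=""
add_var() {
  # Append NAME=VALUE, inserting a comma only when ENVVARS is already non-empty.
  ENVVARS="${ENVVARS:+${ENVVARS},}$1=$2"
}

# REFERENCE1..REFERENCE7: one entry per BWA index / reference file extension.
i=1
for ext in amb ann bwt fai intervals pac sa; do
  add_var "REFERENCE${i}" "${REFBASE}.${ext}"
  i=$((i + 1))
done

add_var SCRIPTFILE "${LIFESCIENCESPATH}/ExomeSeq/${SCRIPTFILENAME}"
add_var SAMPLELIST "${LIFESCIENCESPATH}/ExomeSeq/SAMPLELIST.${COHORTID}.lst"

# INPUTFILE1..INPUTFILE6: paired reads (1 and 2) for samples 01..03.
i=1
for sample in 01 02 03; do
  for read in 1 2; do
    add_var "INPUTFILE${i}" "${LIFESCIENCESPATH}/ExomeSeq/hReads/${COHORTID}_${sample}_${read}.chr21.fastq.gz"
    i=$((i + 1))
  done
done

# OUTPUTFILE1..OUTPUTFILE3: one BAM per sample.
i=1
for sample in 01 02 03; do
  add_var "OUTPUTFILE${i}" "${LIFESCIENCESPATH}/ExomeSeq/hResults/${COHORTID}_${sample}.bam"
  i=$((i + 1))
done

echo "${ENVVARS}"
```

With this, the gcloud call would take --env-vars "${ENVVARS}" in place of the long literal, and the entries come out in the same order as in the script above.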