#!/bin/bash

set -o errexit
set -o nounset

#VAF_and_IGV_TAG

paste <(grep -v "^#" output/"$1"/"$1"_Variant_Filtering/"$1"_GATK_filtered.vcf | cut -f-5) \
      <(grep -v "^#" output/"$1"/"$1"_Variant_Filtering/"$1"_GATK_filtered.vcf | cut -f10-| cut -d ":" -f2,3) |
sed 's/:/\t/g' |
sed '1i chr\tstart\tend\tref\talt\tNormal_DP_VCF\tTumor_DP_VCF\tDP'|
awk 'BEGIN{FS=OFS="\t"}{sub(/,/,"\t",$6);print}' \
  > output/"$1"/"$1"_Variant_Annotation/"$1"_VAF.tsv

My code above fails with a syntax error when I run it as a script. If I run the same pipeline directly in the terminal, with the variable replaced by its value, there is no syntax error.

sh Test.sh S1
Test.sh: 6: Test.sh: Syntax error: "(" unexpected

paste <(grep -v "^#" output/S1/S1_Variant_Filtering/S1_GATK_filtered.vcf | cut -f-5) \
      <(grep -v "^#" output/S1/S1_Variant_Filtering/S1_GATK_filtered.vcf | cut -f10-| cut -d ":" -f2,3) |
sed 's/:/\t/g' |
sed '1i chr\tstart\tend\tref\talt\tNormal_DP_VCF\tTumor_DP_VCF\tDP'|
awk 'BEGIN{FS=OFS="\t"}{sub(/,/,"\t",$6);print}' \
  > output/S1/S1_Variant_Annotation/S1_VAF.ts

My vcf file looks like this: https://drive.google.com/file/d/1HaGx1-3o1VLCrL8fV0swqZTviWpBTGds/view?usp=sharing

tripleee
  • Please [edit] to provide a (small) text-only sample of the input in the question itself. – tripleee Dec 09 '21 at 09:37
  • The error message indicates that you are using `sh` to run the script. You need to use `bash` if you want to use Bash features. In particular, the process substitution `<(command)` is not portable to POSIX `sh`. – tripleee Dec 09 '21 at 09:38
  • It's not clear whether you are trying to translate this to `sh` script syntax or just running the code incorrectly. I have posted an answer which assumes the former, but I have no way to test it (refuse to connect to Google Drive). – tripleee Dec 09 '21 at 09:58
  • @tripleee I just used `chmod 777`, which made my program executable. Then I simply ran it using `bash test.sh` or `./test.sh`. – Manojkumar K Dec 10 '21 at 03:58
  • Whatever you are hoping to accomplish, **`chmod 777` is *wrong* and *dangerous.*** You absolutely do not want to grant write access to executable or system files to all users under any circumstances. You will want to revert to sane permissions ASAP (for your use case, probably `chmod 755`) and learn about the Unix permissions model before you try to use it again. If this happened on a system with Internet access, check whether an intruder could have exploited this to escalate their privileges. – tripleee Dec 10 '21 at 06:36
  • @tripleee I gave it execute permission only. Thanks for your explanation. – Manojkumar K Dec 10 '21 at 09:32
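
One way to make the `sh` versus `bash` mix-up discussed in these comments harder to repeat is to let the script check its interpreter before it reaches any Bash-only syntax. The following is only a sketch: the function name `check_interpreter` is made up for illustration, and it assumes the script should stop with a message rather than re-execute itself under Bash.

```shell
#!/bin/bash
# Hypothetical guard, not part of the original script.
# $1: the value of BASH_VERSION as seen by the running shell.
# Bash sets this variable itself; dash and other sh implementations leave it unset.
check_interpreter() {
    if [ -z "$1" ]; then
        echo "This script uses process substitution; run it with bash, not sh." >&2
        return 1
    fi
}

# Intended use near the top of the script, before the paste <(...) line:
#   check_interpreter "${BASH_VERSION:-}" || exit 1
```

The check has to appear before the first line that uses `<(...)`, so that it runs before a non-Bash shell gets as far as parsing that line.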

2 Answers


You cannot use <(command) process substitution if you are trying to run this code under sh. Unfortunately, there is no elegant way to avoid a temporary file (or something even more horrid) but your paste command - and indeed the entire pipeline - seems to be reasonably easy to refactor into an Awk script instead.

#!/bin/sh

set -eu

awk -F '\t' 'BEGIN { OFS=FS;
        print "chr\tstart\tend\tref\talt\tNormal_DP_VCF\tTumor_DP_VCF\tDP" }
    !/^#/ { p=$0; sub(/^([^\t]*\t){9}/, "", p);  # drop the first nine columns
           sub(/^[^:]*:/, "", p);                # drop the first colon-separated subfield
           split(p, f, ":"); p = f[1] ":" f[2];  # keep the next two, like cut -d: -f2,3
           sub(/,/, "\t", p);
           s = sprintf("%s\t%s\t%s\t%s\t%s\t%s", $1, $2, $3, $4, $5, p);
           gsub(/:/, "\t", s);
           print s
    }' output/"$1"/"$1"_Variant_Filtering/"$1"_GATK_filtered.vcf \
  > output/"$1"/"$1"_Variant_Annotation/"$1"_VAF.tsv

Without access to the VCF file, I have been unable to test this, but at the very least it should suggest a general direction for how to proceed.
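
To make the intended transformation concrete, here is a small self-contained sketch that runs the same kind of extraction on made-up input. The sample lines and the function names `sample_vcf` and `extract_depths` are invented for illustration and are not taken from the asker's file. It assumes that column 10 holds colon-separated subfields in which the second is a comma-separated pair and the third is a single number. It uses field splitting rather than the `sub()` chain, and it leaves out the header line.

```shell
#!/bin/sh
# Made-up VCF lines for illustration only; not taken from the real input file.
sample_vcf() {
    printf '##fileformat=VCFv4.2\n'
    printf '#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\n'
    printf 'chr1\t100\t.\tA\tG\t50\tPASS\t.\tGT:AD:DP\t0/1:12,5:17\n'
}

# Read VCF text on stdin; print columns 1-5 followed by the two halves of
# the second subfield of column 10 and then its third subfield.
extract_depths() {
    awk -F '\t' 'BEGIN { OFS = FS }
        !/^#/ {
            split($10, parts, ":")      # e.g. 0/1, 12,5, 17
            split(parts[2], pair, ",")  # e.g. 12, 5
            print $1, $2, $3, $4, $5, pair[1], pair[2], parts[3]
        }'
}

sample_vcf | extract_depths
```

For the single made-up record this prints the tab-separated fields chr1, 100, ., A, G, 12, 5 and 17.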

tripleee

sh does not support bash process substitution <(). The easiest way to port it is to write out two temporary files and remove them via a trap when done. The better option is to use a tool that is sufficiently powerful (e.g. sed) to do the filtering and manipulation required:

#!/bin/sh
# Note: \t in patterns and the one-line form of the i command are GNU sed extensions.
header="chr\tstart\tend\tref\talt\tNormal_DP_VCF\tTumor_DP_VCF\tDP"
field_1_to_5='\(\([^\t]*\t\)\{5\}\)' # \1 to \2
field_6_to_8='\([^\t]*\t\)\{4\}[^:]*:\([^,]*\),\([^:]*\):\([^:]*\).*' # \3 to \6
src="output/${1}/${1}_Variant_Filtering/${1}_GATK_filtered.vcf"
dst="output/${1}/${1}_Variant_Annotation/${1}_VAF.tsv"
sed -n \
  -e '1i '"$header" \
  -e '/^#/!s/'"${field_1_to_5}${field_6_to_8}"'/\1\4\t\5\t\6/p' \
  "$src" > "$dst"

If you are already using awk (or Perl, Python, etc.), just port the whole script to that language instead.

As an aside, all those repeated $1 suggest you should rework your file naming standard.
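
For completeness, the temporary-file route mentioned at the top of this answer could look roughly like the sketch below. The function name `vcf_depth_columns` is made up for illustration, and `mktemp -d` is assumed to be available (it is common on GNU, BSD and BusyBox systems but is not required by POSIX). The trap is a safety net for early exits; the normal path removes the directory itself.

```shell
#!/bin/sh
set -eu

# vcf_depth_columns SRC (hypothetical helper): print columns 1-5 of SRC next to
# the second and third colon-separated subfields of columns 10 onward,
# which is what the paste with two process substitutions produces in bash.
vcf_depth_columns() {
    src=$1
    tmpdir=$(mktemp -d) || return 1
    # Remove the temporary directory if the shell exits before we get to do it.
    trap 'rm -rf "$tmpdir"' EXIT

    grep -v '^#' "$src" | cut -f-5 > "$tmpdir/left"
    grep -v '^#' "$src" | cut -f10- | cut -d ':' -f2,3 > "$tmpdir/right"
    paste "$tmpdir/left" "$tmpdir/right"

    # Normal path: clean up and clear the trap again.
    rm -rf "$tmpdir"
    trap - EXIT
}
```

The remaining sed and awk stages of the original pipeline can then read from `vcf_depth_columns "$src"` through a pipe, unchanged.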

Allan Wind