I am trying to pass each line of a data file through a function. In addition to the data in the file, a number of other parameters are also passed to the function. For each line, a total of 11 parameters are passed. For some reason, the last two parameters are ignored by the function.

My code is below, as is a sample of the input data and the result of running the code. Any suggestions?

The Code:

function exon_parse {
    data=$1
    CHROM=$(awk ' {print $1}' <<< $data )
    CHROM_LENGTH=$(awk ' {print $2}' <<< $data )
    EXON_LENGTH=$(awk ' {print $3}' <<< $data )
    STRAND=$(awk ' {print $4}' <<< $data )
    START=$(awk ' {print $5}' <<< $data )
    STOP=$(awk ' {print $6}' <<< $data )
    POLY_SITES=$(awk ' {print $7}' <<< $data )
    Av_Cov_Min=$2
    Min_SNPs=$3
    REF=$4
    BAM1=$5
    BAM2=$6
    BAM3=$7
    BAM4=$8
    BAM5=$9
    BAM6=$10
    OUTPUT_FILE=$11
    echo $1
    echo $2
    echo $3
    echo $4
    echo $5
    echo $6
    echo $7
    echo $8
    echo $9
    echo $10
    echo $11
    exit 0
}
INPUT_FILE="/filepath/confused_reads.txt_"
OUTPUT_FILE="/filepath/filtered_recovered_reads.txt"
Av_Cov_Min=40
Min_SNPs=10
REF="/filepath/Renamed_pmin.scaf.fa"
BAM1="/filepath/SRR573675.realigned.bam"
BAM2="/filepath/SRR573705.realigned.bam"
BAM3="/filepath/SRR573706.realigned.bam"
BAM4="/filepath/SRR573707.realigned.bam"
BAM5="/filepath/SRR573708.realigned.bam"
BAM6="/filepath/SRR573709.realigned.bam"
count=1
while read line; do
    if [[ $count == 1 ]]; then
        count=$(( count + 1 ))
    else
        data=$line
        exon_parse  "$data" $Av_Cov_Min $Min_SNPs $REF $BAM1 $BAM2 $BAM3 $BAM4 $BAM5 $BAM6 $OUTPUT_FILE 
    fi

done < ${INPUT_FILE}

Rather than print out all the parameters, I get the following:

$> ./exonTables_recoverLostReads.bsh
Scaffold10026 154793 6043 . 1 6043 93
40
10
/filepath/Renamed_pmin.scaf.fa
/filepath/SRR573675.realigned.bam
/filepath/SRR573705.realigned.bam
/filepath/SRR573706.realigned.bam
/filepath/SRR573707.realigned.bam
/filepath/SRR573708.realigned.bam
Scaffold10026 154793 6043 . 1 6043 930
Scaffold10026 154793 6043 . 1 6043 931

What happened to the last two parameters?

The first few lines of the input file are as follows (I built my code so that it would not parse the header line):

scaffold    scaff_length    exon_length strand  start   stop    total_polymorphic_sites
Scaffold10026   154793  6043    .   1   6043    93
Scaffold10026   154793  6043    .   1   6043    93
Scaffold10026   154793  6043    .   1   6043    93
Scaffold10575   154793  5235    .   22299   27533   103
Scaffold10575   154793  5235    .   22299   27533   103
gwilymh

1 Answer

Try using ${10} and ${11} instead.

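Bash variables can optionally be surrounded by {} to avoid ambiguity, and this is one of those cases where it is necessary: without braces, $10 is parsed as $1 followed by a literal 0, and $11 as $1 followed by a literal 1.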
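A minimal sketch of the difference, using a hypothetical `show_tenth` function that is not part of the question's script:

```shell
# Hypothetical demo function: prints its tenth argument with and without braces.
show_tenth() {
    echo "unbraced: $10"    # parsed as ${1} followed by a literal 0
    echo "braced:   ${10}"  # the actual tenth positional parameter
}

show_tenth a b c d e f g h i j k
```

With the arguments shown, the first line should end in `a0` and the second in `j`, which matches the pattern in the question's output, where the data line came back with 0 and 1 appended.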

However, there is another, idiomatic (and IMO cleaner) way to handle this problem. Reassign the argument variables to named variables, like so:

function exon_parse {
    data=$1;shift
    Av_Cov_Min=$1;shift
    Min_SNPs=$1;shift
    ....

The shift builtin causes the contents of $1 to be dropped, and the contents of $2 to be "shifted down" into $1; $3 and all other parameters are shifted down as well.

This allows you to access all of the parameters without referring to any positional variable higher than $1, thus avoiding this problem altogether. This isn't necessary, of course, but I find that it is all too easy to forget and accidentally type $11 instead of ${11}. By always shifting, I never need to worry about that.
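As a rough sketch of how the whole pattern might look for the eleven parameters in the question: the function name `exon_parse_shifted` and the short file names in the call are made up for illustration, and the variable names are taken from the question.

```shell
# Hypothetical variant of exon_parse that only ever reads $1.
# Each "shift" drops the current $1 and moves the next argument into its place.
exon_parse_shifted() {
    data=$1;        shift
    Av_Cov_Min=$1;  shift
    Min_SNPs=$1;    shift
    REF=$1;         shift
    BAM1=$1;        shift
    BAM2=$1;        shift
    BAM3=$1;        shift
    BAM4=$1;        shift
    BAM5=$1;        shift
    BAM6=$1;        shift
    OUTPUT_FILE=$1; shift
    # Print the two values that were lost in the original script.
    echo "BAM6=$BAM6"
    echo "OUTPUT_FILE=$OUTPUT_FILE"
}

# Placeholder arguments, assumed for the demo only.
exon_parse_shifted "Scaffold10026 154793 6043 . 1 6043 93" 40 10 ref.fa \
    1.bam 2.bam 3.bam 4.bam 5.bam 6.bam out.txt
```

Note that the variables here are global, as in the question's code, and that the function assumes it is called with all eleven arguments.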

jpaugh
  • I wouldn't say it's more idiomatic to use `shift` unnecessarily like this. – chepner May 13 '15 at 02:40
  • Hmm, well, it's my idiom, at least. It *does* become necessary for more advanced argument processing, such as when dealing with switches and/or optional arguments, and I consider it "defensive coding", to avoid the very mistake the OP made. – jpaugh May 13 '15 at 02:43
  • Oh, yes, `shift` is very useful for iterating over the parameters in a non-uniform fashion, such as parsing a series of options, some of which may take one or more arguments themselves. In this case, though, a simple sequence of fixed size and known order, just using the numbered parameters is simpler. – chepner May 13 '15 at 02:50
  • Enclosing the within-function variables in {} worked. It has occurred to me that the shell was likely interpreting $10 and $11 as $1 + "0" and $1 + "1". Seems obvious, in retrospect. – gwilymh May 13 '15 at 16:53
  • Tis an easy mistake to make, even when you know to check for it! – jpaugh May 13 '15 at 19:15
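
The option-parsing use of `shift` mentioned in the comments might look something like the sketch below. The function `parse_opts` and its `-o` and `-v` options are invented for illustration and do not come from the question's script.

```shell
# Hypothetical option parser: -o takes a value, -v is a flag,
# and everything after the options is left as positional arguments.
parse_opts() {
    out=""
    verbose=0
    while [ $# -gt 0 ]; do
        case $1 in
            -o)
                # Guard against a missing value so "shift 2" cannot fail.
                [ $# -ge 2 ] || { echo "missing value for -o" >&2; return 1; }
                out=$2
                shift 2
                ;;
            -v) verbose=1; shift ;;
            --) shift; break ;;
            *)  break ;;
        esac
    done
    echo "out=$out verbose=$verbose remaining=$*"
}

parse_opts -v -o result.txt a.bam b.bam
```

Here the number of arguments consumed per step varies with the option, which is the situation where shifting is hard to replace with fixed positional numbers.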