I have a bash script that reads a CSV file, runs a command for each row, and uses the command's output to write a new CSV file, but the new file isn't written as expected.
The script looks roughly like this:

echo "Strain,Type,SeqTec,Reads,SamclipReads,PercTrimmedReads" > $OUTPUT
while IFS="," read -r strain type state seqtec
do
    OUTRM=$(samtools view -c $RMDIR$strain$RMEND)
    OUTBAM=$(samtools view -c $BAMDIR$strain$BAMEND)
    PERC=$(awk "BEGIN {print ($OUTBAM - $OUTRM)*100 / $OUTBAM}")
    echo $strain,$type,$seqtec,$OUTBAM,$OUTRM,$PERC >> $OUTPUT
done < $INPUT

The output is the following (using head outfile.csv):

Strain,Type,SeqTec,Reads,SamclipReads,PercTrimmedReads
,17539382,16123818,8.07077
,17597797,16078099,8.63573
,16617830,15385883,7.4134
,16431966,15144807,7.83326

And the expected output is:

Strain,Type,SeqTec,Reads,SamclipReads,PercTrimmedReads
strain1,type1,tec1,17539382,16123818,8.07077
strain2,type2,tec2,17597797,16078099,8.63573
strain3,type3,tec3,16617830,15385883,7.4134
strain4,type4,tec4,16431966,15144807,7.83326

I've also tried (without success):

echo "${strain},${type},${seqtec},${OUTBAM},${OUTRM},${PERC}" >> $OUTPUT

I also tried using printf, which didn't work either:

printf "%s,%s,%s,%s,%s,%s\n" "$strain" "$type" "$seqtec" "$OUTBAM" "$OUTRM" "$PERC" >> $OUTPUT

The output using printf was:

Strain,Type,SeqTec,Reads,SamclipReads,PercTrimmedReads
,17539382,16123818,8.07077
,17597797,16078099,8.63573
,16617830,15385883,7.4134
,16431966,15144807,7.83326

Does anyone know why it behaves like this? I've tried many other things without success. Can anyone think of a solution?

Iseez

1 Answer

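$INPUT has Windows (DOS) line endings, so every line ends in \r\n. read strips only the \n, which leaves a carriage return at the end of seqtec. When a row is displayed, that \r moves the cursor back to the start of the line and the remaining fields overwrite the first three.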
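To see the stray character for yourself, here is a minimal sketch that feeds one CRLF-terminated row through the same read call. The row contents and the `cr` and `line` variables are made up for illustration; they are not from the original script.

```shell
# One CRLF-terminated row; the sample values are made up.
cr=$(printf '\r')                      # a literal carriage return
line="strain1,type1,state1,tec1${cr}"

IFS="," read -r strain type state seqtec <<EOF
$line
EOF

# The last field is one character longer than it looks.
echo "length of seqtec: ${#seqtec}"

# od -c prints the hidden character as \r.
printf '%s' "$seqtec" | od -c
```

Running something like `head -n 2 "$INPUT" | od -c` on the real file should show the same \r before each \n if the file does have DOS line endings.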

You can either convert it to Unix line endings with dos2unix, or add \r to the input field separator (IFS) so that read strips it:

Using dos2unix with process substitution:

while IFS="," read -r strain type state seqtec
do
    OUTRM=$(samtools view -c "$RMDIR$strain$RMEND")
    OUTBAM=$(samtools view -c "$BAMDIR$strain$BAMEND")
    PERC=$(awk "BEGIN {print ($OUTBAM - $OUTRM)*100 / $OUTBAM}")
    echo "$strain,$type,$seqtec,$OUTBAM,$OUTRM,$PERC" >> "$OUTPUT"
done < <(dos2unix < "$INPUT")   # dos2unix strips the \r before read sees it

Adding \r to the field separator:

# \r in IFS makes read treat the trailing carriage return as a delimiter
while IFS=$',\r' read -r strain type state seqtec
do
    OUTRM=$(samtools view -c "$RMDIR$strain$RMEND")
    OUTBAM=$(samtools view -c "$BAMDIR$strain$BAMEND")
    PERC=$(awk "BEGIN {print ($OUTBAM - $OUTRM)*100 / $OUTBAM}")
    echo "$strain,$type,$seqtec,$OUTBAM,$OUTRM,$PERC" >> "$OUTPUT"
done < "$INPUT"
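
If you want to try the field-separator approach without samtools, the following is a small self-contained sketch. The temporary file, its two rows, and the `result` variable are invented for the demonstration, and `",$cr"` is just a portable spelling of bash's `$',\r'`.

```shell
# Build a small CRLF-terminated sample file (contents are made up).
tmp=$(mktemp)
printf 'strain1,type1,state1,tec1\r\nstrain2,type2,state2,tec2\r\n' > "$tmp"

cr=$(printf '\r')   # same character that $'\r' produces in bash
result=""
while IFS=",$cr" read -r strain type state seqtec
do
    # Collect the three fields the output CSV needs, one row per ';'.
    result="${result}${strain},${type},${seqtec};"
done < "$tmp"
rm -f "$tmp"

printf '%s\n' "$result"
```

In bash, the rows should come out with all three leading fields intact and no carriage return in the last one.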
Ted Lyngmo