Assume an input table (intable.csv) that contains ID numbers in its second column, and a fresh output table (outtable.csv) into which the input file, extended by one column, is to be written line by line.
echo -ne "foo,NC_045043\nbar,NC_045193\nbaz,n.a.\nqux,NC_045054\n" > intable.csv
echo -n "" > outtable.csv
Further assume that one or more third-party commands (here: esearch, efetch; both part of Entrez Direct) are employed to retrieve additional information for each ID number. This additional info is to form the third column of the output table.
while IFS="" read -r line || [[ -n "$line" ]]
do
    echo -n "$line" >> outtable.csv
    NCNUM=$(echo "$line" | awk -F"," '{print $2}')
    if [[ $NCNUM == NC_* ]]
    then
        echo "$NCNUM"
        RECORD=$(esearch -db nucleotide -query "$NCNUM" | efetch -format gb)
        echo "$RECORD" | grep "^LOCUS" | awk '{print ","$3}' | \
            tr -d "\n" >> outtable.csv
    else
        echo ",n.a." >> outtable.csv
    fi
done < intable.csv
Why does the while loop process only the first entry of the input table with the code above, whereas it processes all entries if the lines starting with RECORD and echo "$RECORD" are commented out? How can I correct this behavior?
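One plausible explanation, assuming that esearch (or efetch) reads from standard input: any command inside the loop body inherits the loop's stdin, which is intable.csv, so it can swallow the remaining lines before read sees them. The sketch below reproduces that effect without Entrez Direct. The function greedy is a hypothetical stand-in for a stdin-consuming command, and count_lines is an illustrative helper, not part of the original script.

```bash
#!/usr/bin/env bash
# Hypothetical stand-in for a third-party command that drains stdin
# (assumption: esearch behaves similarly when stdin is not a terminal).
greedy() { cat > /dev/null; echo "info"; }

demo=$(mktemp)
printf 'foo,NC_045043\nbar,NC_045193\nbaz,n.a.\nqux,NC_045054\n' > "$demo"

# Count how many lines the loop actually processes.
# $1 = "shared":   greedy inherits the loop's stdin (the input file)
# $1 = "isolated": greedy gets /dev/null as stdin instead
count_lines() {
    local mode=$1 n=0 line
    while IFS="" read -r line || [[ -n "$line" ]]
    do
        n=$((n + 1))
        if [[ $mode == shared ]]
        then
            greedy > /dev/null
        else
            greedy < /dev/null > /dev/null
        fi
    done < "$demo"
    echo "$n"
}

count_lines shared      # prints 1: greedy consumed the remaining lines
count_lines isolated    # prints 4: every line reaches read
```

If the assumption holds, the corresponding change in the original loop would be to redirect the first command of the pipeline, for example `esearch -db nucleotide -query "$NCNUM" < /dev/null | efetch -format gb`. An alternative that avoids touching the third-party commands is to feed the loop through a separate file descriptor (`while IFS="" read -r line <&3 ... done 3< intable.csv`), so that nothing in the loop body shares the descriptor that read uses.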