I have this file:

NC_003037.1     453555  454448
NC_007493.2     2279220 2278345
NC_007952.1     1763831 1762950
NC_005791.1     844089  844916

I want to iterate over each line to obtain the variables "id", "column1", and "column2", which I will use in this command:

efetch -db nuccore -id $id -chr_start $column1 -chr_stop $column2 -format fasta > file.txt

Could you guide me with a way to do this in a shell script?

codeforester
David López

2 Answers

When you see multiple columns, think awk.

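# Start from an empty output file, because the loop below appends to it.
rm -f file.txt
# awk builds one command line per row; $cmd is left unquoted so the shell splits it into words.
awk '{ print "efetch -db nuccore -id " $1 " -chr_start " $2 " -chr_stop " $3 " -format fasta" }' inputfile.txt |
while read -r cmd; do $cmd >> file.txt; done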

(EDIT: moved the re-direct into the execution)

awk builds one command line per input row, and the while read cmd ...; done loop then executes each line and appends its output to file.txt.
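The build-then-execute idea can be checked in two steps. This is a minimal sketch, not part of the original answer: build_commands is a hypothetical helper name, and the NF == 3 filter (skip rows that do not have exactly three fields) is an added assumption about the input.

```shell
#!/usr/bin/env bash
# build_commands is a hypothetical helper: it only prints the efetch
# command lines, one per 3-field row, and executes nothing itself.
build_commands() {
  awk 'NF == 3 {
    printf "efetch -db nuccore -id %s -chr_start %s -chr_stop %s -format fasta\n", $1, $2, $3
  }' "$1"
}
```

Running build_commands inputfile.txt shows the generated commands for inspection. Once they look right, build_commands inputfile.txt | sh > file.txt would run them. Piping to sh executes whatever the input file contains, so this is only reasonable for input you trust.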

Kingsley
  • Hi @Kingsley. Thank you for your answer. What I expected when running the code was the creation of a file (or several files, one per row) with the information I queried, but your code does not create this file. – David López Oct 17 '18 at 17:26
  • @DavidLópez - Ah yes, that `> file.txt` should be in the execution. – Kingsley Oct 17 '18 at 23:26

Use a read loop:

while read -r id column1 column2; do
  efetch -db nuccore -id "$id" -chr_start "$column1" -chr_stop "$column2" -format fasta
done < file.input > file.txt
  • Use double quotes around variable expansions to prevent word splitting and globbing.
  • Put > file.txt at the end of the loop so the output file is opened once rather than on every iteration, which is more I/O-efficient.
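The loop can also be wrapped in a function so it can be dry-run before contacting NCBI. This is a sketch under stated assumptions: fetch_regions and the FETCH override are hypothetical names, not part of EDirect, and the blank-line check and the /dev/null redirect are defensive additions rather than something the answer requires.

```shell
#!/usr/bin/env bash
# fetch_regions INPUT OUTPUT: run one fetch command per input row.
# FETCH is a hypothetical override (not an EDirect feature); it
# defaults to efetch and can be set to echo for a dry run.
fetch_regions() {
  local input="$1" output="$2"
  local id start stop
  # "|| [ -n "$id" ]" still processes a final line that lacks a newline.
  while read -r id start stop || [ -n "$id" ]; do
    # Skip blank lines and rows with missing fields.
    [ -n "$id" ] && [ -n "$start" ] && [ -n "$stop" ] || continue
    # Reading from /dev/null stops the command from consuming the
    # remaining lines of the loop's own input.
    "${FETCH:-efetch}" -db nuccore -id "$id" \
      -chr_start "$start" -chr_stop "$stop" -format fasta < /dev/null
  done < "$input" > "$output"
}
```

For example, FETCH=echo fetch_regions file.input file.txt writes the arguments that would be passed for each row instead of running efetch, and fetch_regions file.input file.txt performs the real queries.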

codeforester
  • Thanks, @codeforester, it works. It also appends each query result to the end of the file, so I don't have to deal with an incremental variable to produce several files and then concatenate them. – David López Oct 18 '18 at 02:45