How do I set up a command-line program to run through all my input files in a folder? Assume my input files are named like below and vary in the part between _ and .txt:

IsomiR_377G.txt,
IsomiR_377R.txt,
IsomiR_379G.txt,
....

and my program is named bowtie and takes two options, an input file (-i) and an output file (-o):

bowtie -i IsomiR_377G.txt -o IsomiR_377G.sam

Was thinking of something like:

for f in IsomiR_*.txt
do
    bowtie -i "$f"  -o "Output${f#IsomiR}" 
done

I have a similar problem with my awk command:

for f in IsomiR_*.txt 
do 
awk '{printf ">%s_%s\n %s\n",$1,$2,$1;}' "$f" > "Header${f#IsomiR}" 
done

-bash: syntax error near unexpected token `>'
user2300940
  • Please take care when writing your question. You've used a mixture of switches (`-q/-s` vs. `-i/-o`) and shown an example of the command which is different to the files you've listed. What would the exact command be for the first file in your list? It seems like you're on the right lines with bash's string manipulation; where exactly are you stuck? – Tom Fenech Sep 22 '15 at 06:45
  • updated. Does it look correct? Maybe just a typing error.. – user2300940 Sep 22 '15 at 06:48
  • OK so you've changed the switches so that they match. What about the file names? How do you get from `IsomiR_377G.txt` to `input1.txt`? – Tom Fenech Sep 22 '15 at 06:50
  • The problem that you're having with your awk code seems completely unrelated - I would suggest that you remove it and focus on making your original question clearer. – Tom Fenech Sep 22 '15 at 06:59

1 Answer

Assuming that you have these three input files:

IsomiR_377G.txt
IsomiR_377R.txt
IsomiR_379G.txt

and would like to run the corresponding commands:

bowtie -i IsomiR_377G.txt -o output377G.sam
bowtie -i IsomiR_377R.txt -o output377R.sam
bowtie -i IsomiR_379G.txt -o output379G.sam

then you could use a loop similar to the one in your question:

for f in IsomiR_*.txt; do 
    base_name=${f%.txt}
    id=${base_name#*_}
    bowtie -i "$f" -o "output${id}.sam"
done

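Here ${f%.txt} removes the .txt suffix, and ${base_name#*_} removes the shortest leading match of *_, that is, everything up to and including the first _. What remains is the ID (for example 377G), which is used in the output file name.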
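
If you want to check the generated names before running bowtie on real data, here is a minimal sketch that applies only the two parameter expansions to the three example names. The function name output_name is a hypothetical helper for illustration, not part of bowtie or bash, and the file names are passed as plain strings, so no files need to exist.

```shell
# Hypothetical helper: print the output name derived from one input name.
output_name() {
    on_base=${1%.txt}      # drop the trailing ".txt"
    on_id=${on_base#*_}    # drop everything up to and including the first "_"
    printf 'output%s.sam\n' "$on_id"
}

# Dry run over the example names from the question; bowtie is not called.
for f in IsomiR_377G.txt IsomiR_377R.txt IsomiR_379G.txt; do
    printf '%s -> %s\n' "$f" "$(output_name "$f")"
done
```

Once the printed names look right, the same expansions can be used in the loop that calls bowtie.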

Tom Fenech