
I'm new to Bash scripting. My script is meant to take a given path and run some RTG (Real Time Genomics) commands on the data in that path. However, when I run the script from the command line, it gives me the following error:

ERROR:There were invalid input file paths

The path I have provided in the script is accurate. That is, in the original directory, where the program 'RTG' resides, I have made folders accordingly like /data/reads/NA19240 and placed both *_1.fastq and *_2.fastq files inside NA19240.

Here is the script:

#!/bin/bash
for left_fastq in /data/reads/NA19240/*_1.fastq; do
     right_fastq=${left_fastq/_1.fastq/_2.fastq}
     lane_id=$(basename ${left_fastq/_1.fastq})
     rtg format -f fastq -q sanger -o ${lane_id} -l ${left_fastq} -r ${right_fastq} --sam-rg "@RG\tID:${lane_id}\tSM:NA19240\tPL:ILLUMINA"
done

I have tried many workarounds but still cannot get past this error. I would be really grateful if you could help me fix this problem. Thanks.

After adding set -aux to the script for debugging, I now get the following output:

adnan@adnan-VirtualBox[Linux] ./format.sh                           
+ for left_fastq in '/data/reads/NA19240/*_1.fastq'
+ right_fastq='/data/reads/NA19240/*_2.fastq'
++ basename '/data/reads/NA19240/*'
+ lane_id='*'
+ ./rtg format -f fastq -q sanger -o '*' -l '/data/reads/NA19240/*_1.fastq' -r '/data/reads/NA19240/*_2.fastq' --sam-rg '@RG\tID:*\tSM:NA19240\tPL:ILLUMINA'
Error: File not found: "/data/reads/NA19240/*_1.fastq"
Error: File not found: "/data/reads/NA19240/*_2.fastq"
Error: There were 2 invalid input file paths
miken32
  • Try to debug your script - [How to debug a bash script?](http://stackoverflow.com/questions/951336/how-to-debug-a-bash-script) and revise your question according to the output, mentioning which line is throwing the error. – Sabir Khan Dec 24 '15 at 07:18
  • Is there a stray `}` in line #3? – sjsam Dec 24 '15 at 07:19
  • Please take a look: http://www.shellcheck.net/ – Cyrus Dec 24 '15 at 07:41
  • `and placed both *_1.fastq and *_2.fastq files inside NA19240` - If, as you have said, `*_1.fastq` and `*_2.fastq` represent files, then `$left_fastq/_1.fastq/_2.fastq` in line #3 is possibly wrong. Please check that. – sjsam Dec 24 '15 at 07:44
  • Your script has multiple issues. Edit the question and add the output from `ls /data/reads/NA19240/*_1.fastq | head -5` and `ls /data/reads/NA19240/*_2.fastq | head -5`, and also tell us *what you want to achieve - exactly*, e.g. what args you need to run `rtg`. – clt60 Dec 24 '15 at 08:40
  • Guys, to debug the script I added set -aux to the bash script, and I have added the output I now get to the question above. – Adnan Haider Dec 24 '15 at 09:25
  • Please, **please** try to choose a title that couldn't literally be used by every single bash-related question in the knowledge base. Your question's title should distinguish it, so other people with the same problem can find it and an answer there. (If you haven't yet made your question generic enough that its answer is likely to help anyone but you... well, back when this site was new, we'd simply close such questions as "too localized", and they're still not exceptionally welcome). – Charles Duffy Mar 02 '16 at 23:51
  • Err. Re: "placed both `*_1.fastq` and `*_2.fastq` files inside NA19240" -- do you mean you **literally** have filenames with `*`s in their names? (That _is_ possible and allowed at the filesystem level, but would also be a bit unusual). – Charles Duffy Mar 02 '16 at 23:54

2 Answers


You need to set the nullglob option in the script, like so:

shopt -s nullglob

By default, a glob that matches nothing is left as the literal pattern. The output you got with set -aux shows that /data/reads/NA19240/*_1.fastq is being passed through literally, which only happens when no files match and nullglob is disabled. With nullglob set, the pattern expands to nothing and the loop body never runs.
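To see the difference for yourself, here is a small sketch that compares the two behaviours. It uses an empty throwaway directory from mktemp rather than your real data path, and the variable names are made up for the demonstration. Note that nullglob only stops rtg from being called with a literal pattern; it does not make missing files appear.

```shell
#!/bin/bash
# Demonstration only: an empty temporary directory stands in for a
# directory that contains no *_1.fastq files.
demo_dir=$(mktemp -d)

# Default behaviour: the non-matching pattern is kept as a literal string,
# so the array holds one element (the pattern itself).
shopt -u nullglob
default_matches=("$demo_dir"/*_1.fastq)
echo "nullglob off: ${#default_matches[@]} item(s): ${default_matches[*]}"

# With nullglob, the same pattern expands to nothing, so the array is empty
# and a for loop over it would not execute at all.
shopt -s nullglob
nullglob_matches=("$demo_dir"/*_1.fastq)
echo "nullglob on: ${#nullglob_matches[@]} item(s)"

rmdir "$demo_dir"
```

If the second line reports 0 items for your real directory, the pattern or the path is wrong, and that is the thing to fix.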

miken32
  • ...but can `-l` be passed without a filename after it? If not, usage would need to use `${foo+-l "$left_fastq"}` or such to leave out the argument if no matches exist. – Charles Duffy Mar 02 '16 at 23:51
  • The call to `rtg` wouldn't happen at all if there was no match because it's inside the `for` loop. – miken32 Mar 02 '16 at 23:54

In the original directory, where the program 'RTG' resides, I have made folders accordingly like /data/reads/NA19240 and placed both *_1.fastq and *_2.fastq files inside NA19240.

So you say your data folders are in the original directory (whatever that may be), but in the script you wrongly specify them as being in the root directory (because of the leading /).
Since you start the script in the original directory, just drop the leading / and use a relative path:

for left_fastq in data/reads/NA19240/*_1.fastq
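Before running rtg again, it may help to check which read pairs the relative path actually picks up. The following is only a sketch: list_pairs is a hypothetical helper, not part of RTG, and it assumes the script is started from the directory that contains data/. It prints the lane ID and both file names instead of calling rtg, and it quotes every expansion so paths with spaces survive.

```shell
#!/bin/bash
# Sketch: list the read pairs found under a directory.
# list_pairs is a hypothetical helper name used for illustration.
shopt -s nullglob   # a non-matching pattern yields zero loop iterations

list_pairs() {
    local reads_dir=$1 left_fastq right_fastq lane_id
    for left_fastq in "$reads_dir"/*_1.fastq; do
        # Swap the _1.fastq suffix for _2.fastq to get the mate file.
        right_fastq=${left_fastq%_1.fastq}_2.fastq
        # Strip the directory and the suffix to get the lane ID.
        lane_id=$(basename "$left_fastq" _1.fastq)
        if [[ -f $right_fastq ]]; then
            printf '%s\t%s\t%s\n' "$lane_id" "$left_fastq" "$right_fastq"
        else
            printf 'missing mate for %s\n' "$left_fastq" >&2
        fi
    done
}

# Relative path (no leading slash), resolved against the current directory.
list_pairs data/reads/NA19240
```

If this prints nothing, the glob matched no files from the current directory, so check where you are with pwd. Once the listing looks right, the printf line can be replaced with the rtg format call from the question, keeping the quotes around the variables.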
Armali