
I have one R script that I want to run on 100 files in a directory, and I am writing a .sh file in bash to do this. For a single file, the call is:

Rscript run_data.R <inputfile> <outputroot>

Rather than passing one input file, I need a script that loops the R script over the 100 files I have in a directory.

I have tried this:

#!/bin/bash
#
#
#PBS -t 0-99
#PBS -l nodes=1:ppn=2,walltime=200:00:00
# Change into working directory
# Execute code
# set -e makes the script exit as soon as any command fails
set -e

echo "Running on ${HOSTNAME}"
if [ -n "${1}" ]; then
  echo "${1}"
  PBS_ARRAYID=${1}
fi

i=${PBS_ARRAYID}

files=("/mydirectory/")
echo ${files[$i]}

output=${output[$i]}

Rscript run_data.R ${files[$i]} ${output[$i]} 

But this doesn't work: it prints "Execution halted" and the R script does not run.

Any suggestions would be amazing.

Thanks!

  • How do you run this script? Specifically, what's in `$1`? – Benjamin W. Dec 10 '20 at 15:34
  • `Rscript run_data.R` is what I type in bash to run the R script; the `$1` is just there to test the script. I need to set the input so that it covers all files in the directory, and the output so that each file gets its own name. – chazhatchet Dec 10 '20 at 15:37
  • I meant, how do you run this Bash script? – Benjamin W. Dec 10 '20 at 15:38
  • Oh sorry! I make a .sh file and then run it with `bash file.sh`. – chazhatchet Dec 10 '20 at 15:39
  • That means `$1` is unset, so `PBS_ARRAYID` is unset, so `i` becomes the empty string, `${files[$i]}` is empty, and `output` isn't declared anywhere in the first place. – Benjamin W. Dec 10 '20 at 15:42
  • Can you describe your exact setup with directory tree, a few example files, and which exact commands you'd like your Bash script to run? – Benjamin W. Dec 10 '20 at 15:43
  • Hi Ben, how do I make it so that `${files[$i]}` isn't empty? At the moment I am just giving the path to the directory that holds the files I want to run the script on. I also need to specify an output root, but I am not sure how best to assign it. The R script itself runs several models on genetic data and works fine when I read in a single file, so I don't think the problem is there. I just don't know how to scale it up so that it runs on all the files in my directory without my having to run them manually. – chazhatchet Dec 10 '20 at 15:51
  • I understand that, but without knowing what your directories etc. look like, it's difficult to see what your script is supposed to do exactly (see my previous comment for a suggestion how to update your question). – Benjamin W. Dec 10 '20 at 16:35
  • Where and how are you initializing the arrays `files` and `output`? It looks like they are both empty, meaning the R script will not receive any arguments at all. (That's a bug on its own; you should [wrap quotes around your shell variables](https://stackoverflow.com/questions/10067266/when-to-wrap-quotes-around-a-shell-variable).) – tripleee Dec 10 '20 at 16:41
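
Pulling the points from the comments together, here is a minimal sketch rather than a tested answer. It assumes the inputs are the only entries in the directory and that the output root can simply be the input file name without its extension; neither is confirmed by the question. `INPUT_DIR`, `OUTPUT_DIR`, `output_root_for` and `run_one` are hypothetical names introduced for illustration. The `#PBS` lines, `PBS_ARRAYID`, `/mydirectory` and the `Rscript run_data.R <inputfile> <outputroot>` call are taken from the question.

```shell
#!/bin/bash
#PBS -t 0-99
#PBS -l nodes=1:ppn=2,walltime=200:00:00

# Assumed layout (hypothetical): inputs live in INPUT_DIR, results go to OUTPUT_DIR.
INPUT_DIR="${INPUT_DIR:-/mydirectory}"
OUTPUT_DIR="${OUTPUT_DIR:-results}"

# Print the output root for one input file:
# OUTPUT_DIR/<file name without its last extension> (an assumed naming scheme).
output_root_for() {
  local name
  name="$(basename "$1")"
  printf '%s/%s\n' "$OUTPUT_DIR" "${name%.*}"
}

# Run the R script on the i-th entry (0-based) of INPUT_DIR.
run_one() {
  local i="$1"
  # The glob fills the array with one element per directory entry;
  # files=("/mydirectory/") would hold only the directory path itself.
  local files=("$INPUT_DIR"/*)
  if [ ! -e "${files[$i]:-}" ]; then
    echo "No input file at index $i in $INPUT_DIR" >&2
    return 1
  fi
  mkdir -p "$OUTPUT_DIR"
  # Quoted so that paths containing spaces reach R as single arguments.
  Rscript run_data.R "${files[$i]}" "$(output_root_for "${files[$i]}")"
}

# The index comes from $1 when testing by hand, otherwise from the PBS array ID.
index="${1:-${PBS_ARRAYID:-}}"
if [ -n "$index" ]; then
  echo "Running on ${HOSTNAME}"
  run_one "$index"
fi
```

Run by hand as `bash file.sh 3`, this would process the fourth file in the directory. Without a scheduler, a plain loop such as `for f in "$INPUT_DIR"/*; do ...; done` over the same glob would cover all files sequentially.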

0 Answers