How to submit an R script to a job scheduler: bash, compiler, Rscript, or other?

Question

My Problem:

I have an R script myscript.R that uses a configuration file, e.g. config.xml. What is the best way to submit such a script to a job scheduler (e.g., using qsub)?

I would like to use the script and its configuration file in the same way that I would use, e.g., a C or Fortran executable that is called from a bash script.

How I currently use Fortran:

Here is the approach that I use with a compiled Fortran executable fex, wrapped in a script that I will call fscript.sh:

#!/bin/bash
mpirun [arguments] "fex" -f $1

The above fscript.sh can be sent to a cluster with instructions to read the config file like this:

qsub [arguments] fscript.sh 1 config.xml

How I currently use R in an analogous way:

To run R in an analogous way, I am using a bash script rscript.sh:

#!/bin/bash
CONFIG=$1
env CONFIG="$CONFIG" R --vanilla < myscript.R

This can be run at the command line, e.g.

qsub [arguments] rscript.sh config.xml

where myscript.R contains something like:

library(XML)
config.file <- Sys.getenv("CONFIG")
config <- xmlToList(xmlParse(config.file))
myfunction(config)
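
For Sys.getenv("CONFIG") to return the file name, the variable has to be present in the environment of the R process, which with env means the NAME=value form. A minimal sketch of that mechanism, using sh -c as a stand-in for the R process so that it runs without R installed (the stand-in command and the variable names are assumptions for illustration, not part of the original script):

```shell
#!/bin/bash
# Path that would normally arrive as $1 from the scheduler.
CONFIG_FILE="config.xml"

# env NAME=value COMMAND runs COMMAND with NAME added to its environment.
# Here `sh -c` stands in for `R --vanilla < myscript.R`.
seen_by_child=$(env CONFIG="$CONFIG_FILE" sh -c 'printf "%s" "$CONFIG"')

echo "child process sees CONFIG=$seen_by_child"
```

Inside R, the same value would be what Sys.getenv("CONFIG") returns.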

My Questions

Would Rscript or compiler provide a more robust approach than my current use of bash?
Under which conditions would one be more appropriate than the other (what are the pros and cons)?
How would I pass a configuration file in either case?

What I have done so far

In addition to coming up with the bash script rscript.sh described above, I have read through tutorials and some documentation for Rscript and compiler, but it is not clear to me in which contexts one would be preferred over the other, or what the best way to pass a configuration file is in either case.

This question is related to others, e.g., "What are the ways to create an executable from R program" and "Does an R compiler exist?". However, I do not think that it is essential to use compiled code.

score 5 · Accepted Answer · answered Aug 29 '12 at 17:58

What does compiler have to do with anything? It compiles R code into byte-code for the R interpreter so it may not do what you suspect.

For scripting, use Rscript (available everywhere), or littler (which predates Rscript).

We actually wrote littler explicitly for this scripting purpose, and my "Intro to HPC with R" talks (see the presentations page) have examples of submitting such scripts to the slurm scheduler / resource manager (as I never had access to qsub).

There are many other questions here relating to Rscript and command-line parsing. That should get you started.
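
As a minimal sketch of how Rscript hands arguments to R: anything after the expression (or script name) on the command line can be read with commandArgs(trailingOnly = TRUE). The example assumes Rscript is on the PATH and simply skips the demo otherwise; the variable name args_seen is hypothetical:

```shell
#!/bin/bash
# Trailing arguments are passed through to R, where
# commandArgs(trailingOnly = TRUE) returns them as a character vector.
if command -v Rscript >/dev/null 2>&1; then
    args_seen=$(Rscript -e 'cat(commandArgs(trailingOnly = TRUE))' config.xml)
    echo "R received: $args_seen"
else
    # No R on this machine: leave the result empty and say so.
    args_seen=""
    echo "Rscript not found on PATH; skipping demo"
fi
```

With R installed, this should echo the file name back, which is the same route a config file name takes into a script run as Rscript myscript.R config.xml.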

Dirk Eddelbuettel


I referenced `compiler` because the [linked question](http://stackoverflow.com/questions/1452235/does-an-r-compiler-exist) suggested that compiler would create an executable that can be run using something like mpi in the same way that a fortran executable does, but I gather from your definition that the R interpreter is still required. – David LeBauer Aug 29 '12 at 18:19
Can you be more specific and quote the sentence or paragraph that misled you? – Dirk Eddelbuettel Aug 29 '12 at 18:20
It wasn't clear, based on the edits in your [answer](http://stackoverflow.com/a/1452330/199217), if your comment "there is no way to have what you desire, _specific ways to compile and deploy R code without installing R in advance_" was negated by your edit when `compiler` was released. Understand that my primary understanding of the term "compiler" is in the context of compiling Fortran and C code. – David LeBauer Aug 29 '12 at 18:30
One other question ... does `littler` provide any functionality beyond what `Rscript` provides? I only ask because some people (server admins, developers) are reluctant to add additional dependencies when a function found in base R is suitable. – David LeBauer Aug 29 '12 at 18:31
I think that a bit of time spent with the documentation for the base R package `compiler` will help you. As for Rscript vs littler, use whichever suits you but for Pete's sake do not write bash scripts to launch R jobs. We have been able to do **much** better than that ever since littler came out. When Rscript followed it was initially broken but it works now. Pick one, and be merry. – Dirk Eddelbuettel Aug 29 '12 at 18:35

score 1 · Answer 2 · edited May 23 '17 at 11:53

Following from Dirk's answer and another question, "Parsing command line arguments in R scripts", I have come up with the following solution, which lets me create an executable R script that accepts the name of a configuration file.

The rscript.sh and myscript.R from the question can be merged into the following newrscript.R:

#!/usr/bin/Rscript
library(XML)
# The first trailing argument is the path to the configuration file
config.file <- commandArgs(trailingOnly = TRUE)[1]
config <- xmlParse(config.file)
myfunction(config)

This can then be called from the command line, passing the name of the config file in a way that is very similar to the original use of rscript.sh:

./newrscript.R config.xml
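
Calling the script as ./newrscript.R relies on two things: the file must have its executable bit set, and the shebang line must point at an interpreter that exists on the machine running the job (#!/usr/bin/env Rscript avoids hard-coding the path). The sketch below shows the same mechanism with a throwaway sh script standing in for the R script, so that it runs without R; the file name demo_script and its contents are hypothetical:

```shell
#!/bin/bash
# Work in a scratch directory so nothing is left behind.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT

# Stand-in for newrscript.R: the shebang picks the interpreter,
# and "$1" plays the role of commandArgs(trailingOnly = TRUE)[1].
cat > "$workdir/demo_script" <<'EOF'
#!/bin/sh
printf 'config file: %s' "$1"
EOF

# Without the executable bit, running the file directly fails
# with "Permission denied".
chmod +x "$workdir/demo_script"

result=$("$workdir/demo_script" config.xml)
echo "$result"
```

For the real script, the equivalent one-time step would be chmod +x newrscript.R.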