I suspect this is a simple problem, but I searched extensively and couldn't find anyone else asking this question. My university has a Condor set-up, and I want to run several repetitions of the same code (e.g. 100 times). My R code has a routine that stores the results in a file:

write.csv(res, file = paste(format(Sys.time(), '%y%m%d'), '_res_', queue, '.csv', sep = ''))

res holds my results (a data.frame). The 'res' in the file name marks the file as containing results, and finally I want to append the queue number of the calculation (otherwise the files would overwrite each other, wouldn't they?). The names should look like: 140109_res_1.csv, 140109_res_2.csv, ...

My Condor submit file looks like this:

universe = vanilla
executable = /usr/bin/R 
arguments =  --vanilla 
log =  testR.log
error = testR.err
input = run_condor.r
output = testR$(Process).txt
requirements = (opsys == "LINUX") && (arch == "X86_64") && (HAS_R_2_13 =?= True)
request_memory = 1000
should_transfer_files = YES
transfer_executable = FALSE
when_to_transfer_output = ON_EXIT
queue 3

How do I get the 'queue' number into my R code? I tried a simple example with

print(queue)
print(Queue)

But there is no object found called queue or Queue. Any suggestions? Best wishes, Marco

Marco Smolla
  • How do you repeat the code? If it is with a for loop, add the `i` to the name. Another option I use is to add the time to the file name; that way I know which file was created before which (although that is also in the file's properties) and I get separate files that are not overwritten. – llrs Jan 09 '14 at 10:20
  • Okay, maybe I got the idea wrong. I have a calculation with loops and everything internally (which cannot be subset) and I want to repeat this calculation N times. Therefore, I thought I would just tell Condor to execute it on N machines (using `queue N`). Is that the wrong way? – Marco Smolla Jan 09 '14 at 10:27
  • Sorry, I think I misunderstood the question: it is about `condor`, not R, although it runs an R program. I can't help with that, but I am also interested in how input gets from another language into R... – llrs Jan 09 '14 at 10:36
  • Ah yes, sorry. I changed the title of the question to make it more obvious. Thanks anyway. – Marco Smolla Jan 09 '14 at 10:39
  • I have no idea about condor either, but passing arguments into an R script is usually straightforward. Maybe this helps: http://stackoverflow.com/questions/14167178/passing-command-line-arguments-to-r-cmd-batch – Matt Jan 09 '14 at 10:46
  • Hi Matt. Thanks for the suggestion. The post refers to a further post, which is interesting when you pass arguments from your terminal to R. However, Condor is something of a magic box to me. Somehow the queue must be named and numbered. I've seen it working for MATLAB, e.g. [here at the far end](http://www.ncl.ac.uk/media/sites/servicesites/itservice/communicationcollaborationandresearch/condor/Worked_Condor_Example_Simple_MATLAB_Add.pdf). – Marco Smolla Jan 09 '14 at 11:01

1 Answer

Okay, I solved the problem. This is how it goes:

  1. I had to change my submit file. I changed the arguments line to:

    arguments = --vanilla --args $(Process)

  2. Now the process number is forwarded to the R code, where you retrieve it with the following line. The value is stored as a character string, so you should convert it to a numeric value. Each argument arrives as a single string, so a number like 10 comes through as "10" rather than as '1' and '0'; collapsing the values with paste() is harmless but not required.

    run <- commandArgs(TRUE)

Here is the output of an example I ran:

> run <- commandArgs(TRUE)
> run
[1] "0"
> class(run)
[1] "character"
> try(as.numeric(run))
[1] 0
> try(run <- as.numeric(paste(run, collapse='')) )
> try(print(run))
[1] 0
> try(write(run, paste(run,'csv', sep='.')))
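Putting this together with the file name from the question, a minimal sketch could look like the following. The function names `get_process_id` and `result_file` are made up for illustration, and the `+ 1` assumes you want files numbered from 1 even though `$(Process)` starts at 0. It avoids `paste0` on the assumption that the pool still runs an older R.

```r
# Read the HTCondor process number from the trailing command-line arguments.
# `args` defaults to whatever follows --args on the R command line.
get_process_id <- function(args = commandArgs(trailingOnly = TRUE)) {
  if (length(args) < 1) stop("no process number supplied after --args")
  id <- suppressWarnings(as.integer(args[1]))
  if (is.na(id)) stop("process number is not an integer: ", args[1])
  id
}

# Build a name such as 140109_res_1.csv.
# $(Process) counts from 0, hence the + 1 (an assumption about the desired numbering).
result_file <- function(id, date = Sys.time()) {
  paste(format(date, "%y%m%d"), "_res_", id + 1, ".csv", sep = "")
}

# Hypothetical usage inside run_condor.r, where `res` is the results data.frame:
# write.csv(res, file = result_file(get_process_id()))
```

Because each job gets a different process number, the jobs write to different files and do not overwrite each other.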

You can also find information on how to pass variables/arguments to your code here: http://research.cs.wisc.edu/htcondor/manual/v7.6/condor_submit.html
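For reference, this is the submit file from the question with only the arguments line changed. Treat it as a sketch: everything else is copied unchanged, and the HAS_R_2_13 requirement is specific to that pool.

```
universe = vanilla
executable = /usr/bin/R
# $(Process) expands to 0, 1, 2 for the three queued jobs
arguments = --vanilla --args $(Process)
log = testR.log
error = testR.err
input = run_condor.r
output = testR$(Process).txt
requirements = (opsys == "LINUX") && (arch == "X86_64") && (HAS_R_2_13 =?= True)
request_memory = 1000
should_transfer_files = YES
transfer_executable = FALSE
when_to_transfer_output = ON_EXIT
queue 3
```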

I hope this helps someone. Cheers, and thanks to all the commenters! Marco

Marco Smolla