17

I am using Rscript more and more where I would have normally used bash scripts. One small annoyance is that many of these scripts loop over some system() call that leaves basically no time for R to catch my control-c if I try to interrupt it. Instead it just aborts the system command that was running and continues on to the next loop iteration. For example, when I try to interrupt the following by holding down control-c, it still makes it through all the iterations:

for(i in 1:10) {
  cat(i)
  system('sleep 3')
}

So far I have always just hacked around this by inserting a small pause in each loop like

for(i in 1:10) {
  Sys.sleep(0.25)
  cat(i)
  system('sleep 3')
}

that will let me abort within an iteration or two if I hold down control-c, but I'm wondering, is there a more efficient way to accomplish this behavior?

John Colby
  • Great question. I have read some of the guRus say that any well-written program will be listening for a break keystroke, but it seems many (and certainly all of mine) are not written with that feature. – IRTFM Nov 11 '11 at 16:40
  • Yea same here! It bugs me because these housekeeping scripts are usually for setting up experiment directory areas, so use lots of very fast `mkdir`, `chmod`, `ln`, etc., commands. But when I have to do it for 1000 subjects, even just having a 0.25 second pause in there wastes close to 5 minutes each time I run one. – John Colby Nov 11 '11 at 17:00
  • AFAIK an R loop is not a separate process, so `% kill -9 $PID` won't work. But let me ask: what exactly do you need to kill? If it's a loop over, say, 1:10000, you could put in a prompt() that runs every 1000 cycles. If you're running a script that's simply very long and you changed your mind after initiating it, well, that's pretty much your own fault :-) . – Carl Witthoft Nov 11 '11 at 17:06
  • @JohnColby Could you not do this all from R? `file.create()`, `dir.create()`, `Sys.chmod()`, `file.link()` and `file.symlink()`? It might help if everything is running in R rather than dropping out to the system. – Gavin Simpson Nov 11 '11 at 17:13
  • @CarlWitthoft I like the prompt kind of idea...and it would be even better if it was just automatically checking for some keypress behind the scenes like DWin was talking about. It is definitely *always* my fault, but still annoying when I have to sit there holding control-c until all the loops finish, or else kill the R session altogether. – John Colby Nov 11 '11 at 17:17
  • @GavinSimpson I try to use the R equivalents for those types of commands (although I admit I actually didn't know about `file.*` before, so that in itself will come in *very* handy. Many thanks!!) However, often I just need to link together random command line tools from my research field, so I end up using `system` for those. – John Colby Nov 11 '11 at 17:21
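
The pure-R route suggested in the comments above might look roughly like the sketch below. The directory layout, subject IDs, and file names are hypothetical, chosen only for illustration. Since no external process is spawned, the loop runs entirely inside R, which should be able to act on an interrupt between calls.

```r
# Hypothetical layout: one directory per subject under a temporary root.
root <- file.path(tempdir(), "study")
subjects <- sprintf("subj%03d", 1:3)

for (s in subjects) {
  d <- file.path(root, s)
  dir.create(d, recursive = TRUE, showWarnings = FALSE)  # roughly: mkdir -p
  f <- file.path(d, "notes.txt")
  file.create(f)                                         # roughly: touch
  Sys.chmod(f, mode = "0644")                            # roughly: chmod 644
  file.symlink(f, file.path(d, "notes_link.txt"))        # roughly: ln -s
}
```

This only covers the housekeeping commands; external tools with no R equivalent would still need `system()`.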

2 Answers

4

John, I'm not sure if this will help, but from investigating setTimeLimit, I learned that it can halt execution whenever a user is able to execute an interrupt, like Ctrl-C. See this question for some of the references.

In particular, callbacks may be the way to go, and I'd check out addTaskCallback and this guide on developer.r-project.org.

Here are four other suggestions:

  1. Although it's a hack, a very different approach may be to invoke two R sessions: a master session, and a second session that exists only to execute shell commands passed to it by the master. The master simply waits for confirmation that each job is done before starting the next one.

  2. If you can use foreach instead of for (either in parallel via %dopar%, or serially via %do% or with only one registered worker), this may be more amenable to interruptions, as it may be equivalent to the first suggestion (since it forks R).

  3. If you can retrieve the exit code for the external command, then that could be passed to a loop conditional. This previous Q&A will be helpful in that regard.

  4. If you want to have everything run in a bash script, then R could just write one long script (i.e. output a string or series of strings to a file). This could then be executed, and the interrupt is guaranteed not to affect a loop, as you've unrolled the loop. Alternatively, you could write the loops in bash. Here are examples. Personally, I like to apply commands to files using find (e.g. find .... -exec doStuff {} ';') or as inputs via backquotes. Unfortunately, I can't easily give well-formatted code on SO, since it embeds backquotes inside of backquotes; see this page for examples. So it may be the case that you could have one command, no looping, and apply a function to all files meeting a particular set of criteria. Using command substitution via backquotes is a very handy trick for a bash user.
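
As a rough illustration of suggestion 4, the sketch below has R write the unrolled commands to a script file and run it with a single `system()` call. The subject IDs, base directory, and `mkdir -p` command are hypothetical stand-ins for whatever the real loop does, and it assumes `bash` is on the PATH.

```r
# Hypothetical inputs (not from the original post): subject IDs and a base directory.
subjects <- sprintf("subj%03d", 1:3)
base_dir <- file.path(tempdir(), "experiment")

# Build one shell command per subject; shQuote() protects paths with odd characters.
cmds <- sprintf("mkdir -p %s", shQuote(file.path(base_dir, subjects)))

# "set -e" makes bash stop at the first command that fails.
script <- tempfile(fileext = ".sh")
writeLines(c("#!/bin/bash", "set -e", cmds), script)

# One system() call runs the whole batch; it returns the script's exit status.
status <- system(paste("bash", shQuote(script)))
```

With this arrangement there is only one external call for Ctrl-C to interrupt, so there is no R loop left to carry on to the next iteration.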

Iterator
  • Thanks for this, @Iterator. I've been chewing on it some more in the back of my mind the last couple of days. The callback infrastructure does sound promising. I'm worried though that it will come down to the same problem that prevents me from using DWin's idea above: I don't know of a function that "checks for a key press" or "checks for a button down" or something like that. I definitely need to read through some of that documentation more closely though, since there are quite a few details... – John Colby Nov 14 '11 at 18:00
  • The other suggestions will all be useful too. #1 & 2 especially for longer scripts, where I already use our cluster scheduler, which makes it easy to kill things. For short/simple little scripts though, I still wish there was a simpler way. #3 I think will be especially useful, as many of these tools do output an abnormal status if they are interrupted. – John Colby Nov 14 '11 at 18:04
  • After all this, though, I'm wondering if I'm trying too hard to fit a square peg in a round hole. It's so easy to do `for i in 1...n; do ...; done` in bash that for situations that really *are* that simple, maybe I should just stick with it? I dunno... – John Colby Nov 14 '11 at 18:07
  • I'm not really sure why the interrupt applies only to the lowest level program, rather than R, but trying to trap keystrokes using additional code may be setting yourself up for very context-dependent behavior. Another possibility, though, is to write an R function that creates the bash script you're referring to. I'll update the answer. – Iterator Nov 14 '11 at 20:45
  • Thanks again, this discussion really helped out! – John Colby Dec 12 '11 at 23:22
0

You should be checking the exit status of every call to system() and treating a non-zero status as a failure. If there's an unexpected error, your program should stop(); otherwise you can get unexpected results. The programs you call should be returning a non-zero exit status when killed with Ctrl+C, and if they don't, that is a bug or flaw in those programs.
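
A minimal sketch of that check, assuming a Unix-like shell. `run_checked` is a hypothetical helper name, not part of R, and the `sleep 1` loop is just a shortened version of the one in the question.

```r
# Hypothetical helper: run a shell command and abort the script if it fails.
run_checked <- function(cmd) {
  # With the default intern = FALSE, system() returns the command's exit status.
  status <- system(cmd)
  if (status != 0) {
    stop(sprintf("command failed with status %d: %s", status, cmd))
  }
  invisible(status)
}

for (i in 1:2) {
  cat(i, "\n")
  run_checked("sleep 1")  # stop() here ends the loop instead of moving on
}
```

Whether an interrupted command actually reports a non-zero status through `system()` can depend on the tool and platform, so it is worth testing with the specific commands you call.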

Michael Hoffman
  • Actually I just tested this with `(system("sleep", "40"))` which doesn't indicate an error, even though doing it within Bash does, so maybe my above statement is wrong. – Michael Hoffman Nov 14 '11 at 21:16