2

I have a continuous process that collects data, and I want to write out the data collected each hour. Simply put, how can I conditionally save data every hour to an .Rdata file?

For context, I collect data in a list, wish to save the list object to an hourly file, remove the list, and rebuild it.

I tried the code below but it did not work:

 if (identical(format(Sys.time(), "%M:%S"), "00:00")) {
      save(twt, file=fname_r)
 }

Any help will be much appreciated.

Btibert3

2 Answers

6

You may be going about this in the wrong way. Not everything is a job for R (given that R really is single-threaded), and scheduling has always been a key operating system task. Use cron, or if you're on that market-leading OS from the Northwest, look into its scheduling options. Then set up a trivial Rscript file.

Have the continuous collection process run to collect, and to dump results somewhere, either in ASCII or binary. Then have an hourly job collect the most recent dumps. That can well be done in R once you have figured out the scheduling.
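To make the hourly job concrete, here is a rough sketch of what such an Rscript file might look like. Everything named here is an assumption for illustration, not something from the question: the `dump_dir` directory, the `.rds` dump format, the `collect_recent()` helper, and the output file name. It simply gathers every dump modified within the last hour and saves them as one list.

```r
# Sketch of an hourly job script (e.g. a hypothetical hourly_save.R run via Rscript).
# Assumption: the collector writes each batch as an .rds file into `dump_dir`.

# Hypothetical helper: read all dumps modified within the last `window_mins` minutes.
collect_recent <- function(dump_dir, now = Sys.time(), window_mins = 60) {
  files <- list.files(dump_dir, pattern = "\\.rds$", full.names = TRUE)
  if (length(files) == 0) return(list())
  age_mins <- as.numeric(difftime(now, file.mtime(files), units = "mins"))
  lapply(files[age_mins <= window_mins], readRDS)
}

# Demo with a throwaway directory standing in for the real dump location.
dump_dir <- file.path(tempdir(), "dumps")
dir.create(dump_dir, showWarnings = FALSE)
saveRDS(list(id = 1), file.path(dump_dir, "batch1.rds"))
saveRDS(list(id = 2), file.path(dump_dir, "batch2.rds"))

# Combine the recent dumps and write one hourly .Rdata file.
twt <- collect_recent(dump_dir)
fname_r <- file.path(dump_dir, format(Sys.time(), "twt_%Y%m%d_%H.Rdata"))
save(twt, file = fname_r)
```

On Ubuntu, a crontab entry along the lines of `0 * * * * Rscript /path/to/hourly_save.R` (path hypothetical) would run this at the top of every hour.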

As for the narrower question of deciding whether an hour has passed, use something like

then <- Sys.time()
# ... stuff happens ...
now <- Sys.time()
if (as.numeric(difftime(now, then, units="mins")) > 60) {
   # .. do stuff
}
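
If you do want to keep everything in one long-running script, the elapsed-time check above could be wrapped into the collection loop roughly as follows. This is only a sketch: `hour_elapsed()` and `save_chunk()` are made-up helper names, the file naming scheme is an assumption, and the short `for` loop stands in for the real collection process. Comparing against an anchor time avoids the problem with testing for exactly "00:00", which is only true for a single second and is easily missed.

```r
# Hypothetical helper: TRUE once more than `interval_mins` minutes have passed.
hour_elapsed <- function(last_save, now = Sys.time(), interval_mins = 60) {
  as.numeric(difftime(now, last_save, units = "mins")) > interval_mins
}

# Hypothetical helper: save the list under the name `twt` in a timestamped file.
save_chunk <- function(x, dir = tempdir(), now = Sys.time()) {
  fname <- file.path(dir, format(now, "twt_%Y%m%d_%H%M%S.Rdata"))
  twt <- x
  save(twt, file = fname)
  fname
}

twt <- list()
last_save <- Sys.time()

# Stand-in for the continuous collection loop.
for (i in 1:3) {
  twt[[length(twt) + 1]] <- i          # placeholder for one collected item
  if (hour_elapsed(last_save)) {
    save_chunk(twt)                    # write the hourly chunk
    twt <- list()                      # remove the list and rebuild it
    last_save <- Sys.time()            # reset the anchor
  }
}
```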
Dirk Eddelbuettel
  • Hi Dirk, thanks for chiming in. I eventually want to get into scheduling the script for sure, but being new to scripting, R, and Ubuntu, I am trying to attack this in a hacky way that I can debug. I am interested in keeping the script running at all times, and saving the data objects out in hourly chunks. Any code will be very much appreciated. Best. – Btibert3 Jan 12 '12 at 20:17
  • Great, I didn't think about anchoring it against a date, thanks. – Btibert3 Jan 12 '12 at 20:28
5

To do the scheduling in R you could use the tclTaskSchedule function in the tcltk2 package. You tell it how long to wait between runs, the task to run (an expression/function), and whether to redo the task; it will then run the task in the background on a regular basis. Just be careful that you don't have two processes interfering with each other. If your task to save the object runs at the same time as something else is updating the same object, there is a chance that only part of the object will be saved or that what is saved is gibberish. So you need some way to check that the data object is complete before saving it.
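
One way to reduce the risk of a half-written file is sketched below in base R. Note that this is a different, complementary technique rather than anything tcltk2 provides: the save goes to a temporary file first and is renamed into place only once it has finished, so a reader never sees a partial file under the final name. `safe_save()` and the `.tmp` suffix are hypothetical, and the rename is assumed to replace an existing file, which holds on Linux but may not on every platform. It does not by itself guarantee that the list was fully updated when the save started.

```r
# Hypothetical helper: write to a temp file, then rename into place.
safe_save <- function(x, final_path) {
  twt <- x                                  # value of the list at call time
  tmp_path <- paste0(final_path, ".tmp")    # assumed temp-file naming
  save(twt, file = tmp_path)                # full write happens here
  if (!file.rename(tmp_path, final_path)) {
    stop("could not move ", tmp_path, " to ", final_path)
  }
  invisible(final_path)
}

# Example usage with a throwaway path.
out <- file.path(tempdir(), "twt_hourly.Rdata")
safe_save(list(a = 1, b = "x"), out)
```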

Greg Snow