6

For simple, home-made benchmarking I'd like to add a timer to my R script so that I know how long it has been running. The script is unleashed on loads of data, so it can take over an hour to complete. I'm therefore looking for a method that tells me exactly how long the script has been running.

The idea that I got was:

old = getCurrentTime()

# Do the rest of my script

(new = getCurrentTime() - old)

I don't know if this makes sense, but it seems the best way to do this, without having a counter running in the background, is to compare the start time of the script with the time at the end and print the difference. However, I'm not sure how to get the time in R, compute the difference, and format it as hh:mm:ss.

Bram Vanroy
  • Your last paragraph makes it sound like you want the answers in the linked question. If this is not correct, please edit to show *exactly* what you want and we will reopen the question. – Rich Scriven Aug 19 '15 at 16:45

2 Answers

13

You can use Sys.time():

old <- Sys.time() # get start time

# some code
#...

# print elapsed time
new <- Sys.time() - old # calculate difference
print(new) # print in nice format

However, there's also the microbenchmark package for more sophisticated / accurate timing using multiple trials etc.
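As a rough sketch of timing a single piece of code rather than a whole script: base R's system.time() reports the elapsed wall-clock seconds for one expression, and microbenchmark::microbenchmark() repeats the expression several times. The slow_sum() function below is a hypothetical workload invented for illustration, and the microbenchmark part only runs if that package happens to be installed.

```r
# Hypothetical workload, used only for illustration.
slow_sum <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + sqrt(i)
  total
}

# Base R: time a single run; the "elapsed" entry is wall-clock seconds.
timing <- system.time(result <- slow_sum(1e5))
print(timing[["elapsed"]])

# microbenchmark (only if installed): repeat the call and summarise the timings.
if (requireNamespace("microbenchmark", quietly = TRUE)) {
  print(microbenchmark::microbenchmark(slow_sum(1e5), times = 10L))
}
```

For a script that runs for an hour, the plain Sys.time() difference above is usually enough; repeated trials mainly pay off for short snippets.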

arvi1000
  • I think he wants to know during some code, not after some code. – nathanesau Aug 19 '15 at 15:53
  • Not sure what you mean, but given that OP asks *"I'm not sure how to get the time in R, get the difference, and format it in hh:mm:ss"*, I feel like this answer directly addresses all the questions. – arvi1000 Aug 19 '15 at 16:36
  • Suppose you source a .R file and you want the running time to be printed intermittently (e.g. every 10 seconds) while the program is running. For example, while you are installing a package on Linux, the download speed and time elapsed are printed while the package is downloading. – nathanesau Aug 19 '15 at 16:39
  • @nathanesau No, arvi's answer was what I wanted. By declaring old at the top of the script and new at the end, you get the time the script needed to run. – Bram Vanroy Aug 19 '15 at 16:41
6

Your general approach was correct and is probably the most conventional way to accomplish this, regardless of the programming language. However, subtracting two POSIXt objects will give you an object of class difftime, where the unit of measurement is selected automatically (depending on the size of the difference), rather than the "HH:MM:SS" format you are looking for. Writing a function to return this format is pretty straightforward, e.g. something like

hms_span <- function(start, end) {
  # whole seconds only, so the seconds field can never be displayed as "60"
  dsec <- floor(as.numeric(difftime(end, start, units = "secs")))
  hours <- floor(dsec / 3600)
  minutes <- floor((dsec - 3600 * hours) / 60)
  seconds <- dsec - 3600*hours - 60*minutes
  paste0(
    sapply(c(hours, minutes, seconds), function(x) {
      formatC(x, width = 2, format = "d", flag = "0")
    }), collapse = ":")
}

(t0 <- Sys.time() - 3600 * 8.543)
#[1] "2015-08-19 03:48:36 EDT"
(t1 <- Sys.time())
#[1] "2015-08-19 12:19:24 EDT"
hms_span(t0, t1)
#[1] "08:32:34"
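A more compact variant, offered only as a sketch: the same formatting can be done with integer division and sprintf(). The helper name fmt_hms() is hypothetical (it is not part of the answer above), and Sys.sleep() merely stands in for the real script body.

```r
# Hypothetical helper (not from the answer above): format seconds as "HH:MM:SS".
fmt_hms <- function(secs) {
  secs <- as.integer(floor(secs))        # drop fractional seconds
  sprintf("%02d:%02d:%02d",
          secs %/% 3600L,                # whole hours
          (secs %% 3600L) %/% 60L,       # remaining whole minutes
          secs %% 60L)                   # remaining seconds
}

start <- Sys.time()                      # top of the script
Sys.sleep(0.1)                           # stand-in for the real script body
elapsed <- as.numeric(difftime(Sys.time(), start, units = "secs"))
cat("Elapsed:", fmt_hms(elapsed), "\n")  # bottom of the script

fmt_hms(30754.8)  # 8.543 hours, as in the example above
#[1] "08:32:34"
```

Asking difftime() for units = "secs" explicitly avoids the automatic unit selection, so the arithmetic does not depend on how long the script ran.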
nrussell