4

EDIT: Updated thanks to @daroczig's lovely answer below. However, test 2 still feels like it takes longer than test 1 which is what I'm wondering about.

UPDATE: On second reading, @daroczig's answer does explain away my confusion -- the issue was due to me not properly thinking out the system.time(expr) line of code.

I wanted to make a version of the system.time function which would be a bit more informative for myself in terms of understanding run-to-run time fluctuations:

system.time.summary <- function(N, expr) {
  t.mat <- replicate(N, system.time(expr))
  as.data.frame(apply(t.mat[1:3,], 1, summary))
}

However the problem is, in the self contained code below, test.2 feels like it is taking longer to do than test.1 (and I've run them several times to check), even though the code is pretty much exactly the same (test.1 uses a wrapper function, whereas test.2 is just the raw code)

# set up number of runs
N <- 100

# test 1
system.time.summary(N, (1:1e8)^2 + 1)

        user.self sys.self elapsed
Min.        0.000    0.000   0.000
1st Qu.     0.000    0.000   0.000
Median      0.000    0.000   0.000
Mean        0.058    0.031   0.089
3rd Qu.     0.000    0.000   0.000
Max.        0.580    0.310   0.890

# test 2
t.mat = replicate(N, system.time((1:1e8)^2 + 1))
as.data.frame(apply(t.mat[1:3,], 1, summary))

        user.self sys.self elapsed
Min.        0.630    0.120   0.860
1st Qu.     0.665    0.170   0.880
Median      0.695    0.195   0.880
Mean        0.692    0.196   0.882
3rd Qu.     0.715    0.225   0.890
Max.        0.760    0.260   0.900

I hope I explained that OK! It could be that it's currently Monday morning, but this is confusing me...

My system:

# Windows Server 2008 R2
> sessionInfo()
R version 2.12.0 (2010-10-15)
Platform: x86_64-pc-mingw32/x64 (64-bit)
Tony Breyal
  • 5,338
  • 3
  • 29
  • 49

2 Answers2

7

You are running system.time(system.time()) in the fist test, and also having system.time(1:1e8)^2 + 1) as an expression in the function which not a good idea, see:

> expr <- system.time((1:1e8)^2 + 1)
> system.time(expr)
   user  system elapsed 
      0       0       0 

But anyway: use the microbenchmark package from CRAN for such purposes, you will not regret. Set up your distinct functions and you can easily roll out your simulations with 100, 1000 or any runs. You can get a neat summary and boxplots at the end of benchmarking.

For example:

> test1 <- function() (1:1e8)^2 + 1
> (results <- microbenchmark(test1(), times=10))
Unit: nanoeconds
               min         lq     median         uq        max
test1() 3565386356 3703142531 3856450582 3931033077 3986309085
> boxplot(results)

enter image description here

Gavin Simpson
  • 170,508
  • 25
  • 396
  • 453
daroczig
  • 28,004
  • 7
  • 90
  • 124
  • 1
    My goodness, I didn't know there was a microbenchmark package, thanks for that! You are quite correct about the system.time(system.time()) in the first test, but even when I remove that, test 2 still feels longer than test 1. – Tony Breyal Feb 28 '11 at 13:35
  • Re-reading you answer, I believe it does answer my question because I didn't properly think about the system.time(expr) line of code. Answer accepted, many thanks for your time and help. – Tony Breyal Feb 28 '11 at 13:51
  • 2
    In your function expr is only getting evaluated once, outside the function the expression is evaluated n times. – hadley Feb 28 '11 at 13:53
  • I've accepted @Joris Meys answer because it fully explains everything and corrects my initial code, but your answer was great too, especially for finding out about the microbenchmark package! :) – Tony Breyal Feb 28 '11 at 14:04
  • 1
    @Tony Breyal: that is right, the checkmark should definitely go there! :) I have learned a lot from @Joris Meys answer also! – daroczig Feb 28 '11 at 21:46
2

As daroczig said, you have an extra system.time. But there's something else :

If you put a browser() in your function, you'd see what happens. In fact, the expression you make, is evaluated only once and then kept in memory. This is how R optimizes internally. So if you do :

system.time.summary(N,(1:1e8)^2 +1)

t.mat is internally :

           [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
user.self  0.61    0    0    0    0    0    0    0    0     0
sys.self   0.36    0    0    0    0    0    0    0    0     0
elapsed    0.97    0    0    0    0    0    0    0    0     0
user.child   NA   NA   NA   NA   NA   NA   NA   NA   NA    NA
sys.child    NA   NA   NA   NA   NA   NA   NA   NA   NA    NA

and expr is :

Browse[2]> str(expr)
 num [1:100000000] 2 5 10 17 26 37 50 65 82 101 ...

It's a bit difficult to change this, as R will evaluate any static expression only once and then retrieve the result another 99 times from the memory. If you want this not to happen, you have to pass an expression explicitly, and add the eval() function. :

system.time.summary <- function(N, expr) {
  t.mat <- replicate(N, system.time(eval(expr)))
  as.data.frame(apply(t.mat[1:3,], 1, summary))
}

system.time.summary(N, expression((1:1e8)^2 + 1))

Now expr gets evaluated every time, and remains an expression in the function :

Browse[2]> expr
expression((1:1e+08)^2 + 1)

This gives you the correct timings.

        user.self sys.self elapsed
Min.       0.6400   0.2000   0.970
1st Qu.    0.6850   0.2375   0.980
Median     0.7150   0.2700   0.985
Mean       0.7130   0.2700   0.985
3rd Qu.    0.7425   0.2975   0.990
Max.       0.7800   0.3500   1.000
Joris Meys
  • 106,551
  • 31
  • 221
  • 263
  • Hang on, you mean to say I can basically debug inside of R? I didn't know that! How much of R is there that I don't know about... – Tony Breyal Feb 28 '11 at 14:08
  • 3
    @Tony : If you didn't know that, then it is time to look at : ?traceback, ?debug, ?debugonce, ?browser, and http://stackoverflow.com/questions/1882734/what-is-your-favorite-r-debugging-trick (plus all the others you find on the site using `[r] debug` as search terms ) Enjoy :-) – Joris Meys Feb 28 '11 at 14:11
  • Cheers for that mate, looks like I'm going to have to find time to read up on all of these very interesting and very useful functions I've never heard of before! :) – Tony Breyal Feb 28 '11 at 14:27
  • 2
    fantastic, thanks for this post! I have learned a lot from your code above. – daroczig Feb 28 '11 at 21:48