3

I want to use arms() to get one sample each time and make a loop like the following one in my function. It runs very slowly. How could I make it run faster? Thanks.

library(HI)    
dmat <- matrix(0, nrow=100,ncol=30)
system.time(
    for (d in 1:100){
        for (j in 1:30){
            y <- rep(0, 101)
            for (i in 2:100){

                y[i] <- arms(0.3, function(x) (3.5+0.000001*d*j*y[i-1])*log(x)-x,
                    function(x) (x>1e-4)*(x<20), 1)       
            }
        dmat[d, j] <- sum(y)
        }
    }
) 
moli
  • 129
  • 2
  • 8

3 Answers3

3

This is a version based on Tommy's answer but avoiding all loops:

library(multicore) # or library(parallel) in 2.14.x
set.seed(42)
m = 100
n = 30
system.time({
    arms.C <- getNativeSymbolInfo("arms")$address
    bounds <- 0.3 + convex.bounds(0.3, dir = 1, function(x) (x>1e-4)*(x<20))
    if (diff(bounds) < 1e-07) stop("pointless!")
    # create the vector of z values
    zval <- 0.00001 * rep(seq.int(n), m) * rep(seq.int(m), each = n)
    # apply the inner function to each grid point and return the matrix
    dmat <- matrix(unlist(mclapply(zval, function(z)
            sum(unlist(lapply(seq.int(100), function(i)
                .Call(arms.C, bounds, function(x) (3.5 + z * i) * log(x) - x,
                      0.3, 1L, parent.frame())
            )))
        )), m, byrow=TRUE)
}) 

On a multicore machine this will be really fast since it spreads the loads across cores. On a single-core machine (or for poor Windows users) you can replace mclapply above with lapply and get only a slight speedup compared to Tommy's answer. But note that the result will be different for parallel versions since it will use different RNG sequences.

Note that any C code that needs to evaluate R functions will be inherently slow (because interpreted code is slow). I have added the arms.C just to remove all R->C overhead to make moli happy ;), but it doesn't make any difference.

You could squeeze out a few more milliseconds by using column-major processing (the question code was row-major which requires re-copying as R matrices are always column-major).

Edit: I noticed that moli changed the question slightly since Tommy answered - so instead of the sum(...) part you have to use a loop since y[i] are dependent, so the function(z) would look like

function(z) { y <- 0
    for (i in seq.int(99))
         y <- y + .Call(arms.C, bounds, function(x) (3.5 + z * y) * log(x) - x,
                        0.3, 1L, parent.frame())
    y }
Simon Urbanek
  • 13,842
  • 45
  • 45
  • Nope - it is not because calls for all values are independent. It can be implemented as a loop but doesn't have to (which is the whole point if you read the above). – Simon Urbanek Feb 12 '12 at 03:31
2

Well, one effective way is to get rid of the overhead inside arms. It does some checks and calls the indFunc every time even though the result is always the same in your case. Some other evaluations can be also be done outside the loop. These optimizations bring down the time from 54 secs to around 6.3 secs on my machine. ...and the answer is identical.

set.seed(42)
#dmat2 <- ##RUN ORIGINAL CODE HERE##

# Now try this:
set.seed(42)
dmat <- matrix(0, nrow=100,ncol=30)
system.time({
    e <- new.env()
    bounds <- 0.3 + convex.bounds(0.3, dir = 1, function(x) (x>1e-4)*(x<20))
    f <- function(x) (3.5+z*i)*log(x)-x
    if (diff(bounds) < 1e-07) stop("pointless!")
    for (d in seq_len(nrow(dmat))) {
        for (j in seq_len(ncol(dmat))) {
            y <- 0
            z <- 0.00001*d*j
            for (i in 1:100) {
                y <- y + .Call("arms", bounds, f, 0.3, 1L, e)
            }
            dmat[d, j] <- y
        }
    }
}) 

all.equal(dmat, dmat2) # TRUE
Tommy
  • 39,997
  • 12
  • 90
  • 85
  • Thanks for your help. My logdens is very complicated and will be updated with the each sample point from arms(). So It takes very long time to run. Maybe I need go back to write c function. – moli Feb 09 '12 at 23:23
  • So you have to change logdens based on the sample from arms? If that's the case then the question definitely needs to be rewritten – John Feb 09 '12 at 23:36
0

why not like this?

dat <- expand.grid(d=1:10, j=1:3, i=1:10)

arms.func <- function(vec) {
  require(HI)
  dji <- vec[1]*vec[2]*vec[3]
  arms.out <- arms(0.3, 
                   function(x,params) (3.5 + 0.00001*params)*log(x) - x,
                   function(x,params) (x>1e-4)*(x<20),
                   n.sample=1,
                   params=dji)

  return(arms.out)
}

dat$arms <- apply(dat,1,arms.func)

library(plyr)
out <- ddply(dat,.(d,j),summarise, arms=sum(arms))

matrix(out$arms,nrow=length(unique(out$d)),ncol=length(unique(out$j)))

However, its still single core and time consuming. But that isn't R being slow, its the arms function.

Justin
  • 42,475
  • 9
  • 93
  • 111
  • system.time(`y <- arms(runif(1,1e-4,20), function(x) (3.5)*log(x)-x, function(x) (x>1e-4)*(x<20), 100*30*100))` user system elapsed 2.739 0.010 2.766 So I guess communication between R and c take too much time. – moli Feb 09 '12 at 23:26