Disclaimer
Hello everyone! I recently started programming in R. My codes are working just fine, but in terms of speed some of them are taking way too long to put to good use. I hope someone can help me making this code run faster either by optimising the code, or with the use of one of the multicore packages.
About the code
I have large datasets containing about 15000 numeric data each. The code takes two parameters (p, n) where p >= n, and make subsets of the data. It applies the zyp.yuepilon function (from the zyp package) to each row of the subsets. Then the parameter n is used to apply the same function on an n sized subset.
Problem is I run this code in a nested for loop: p in 10:40 and n in 10:40 so it takes an eternity to get the results, and it's just one dataset among many others.
sp <- function(p, n){
library(zyp)
data <- runif(15000, 1, 4)
lower <- seq(80 - p + 1, by=1, length.out=length(data)-81)
upper <- lower + p - 1
subsets <- matrix(nrow=length(lower), ncol=p)
for(j in 1:length(lower)){
subsets[j, ] = data[lower[j] : upper[j]]
}
ret <- apply(subsets, 1, zyp.yuepilon)
subset_n <- subsets[, 1:n]
ret2 <- apply(subset_n, 1, zyp.yuepilon)
return(list(ret, ret2))
}
Benchmark results in seconds:
expr min lq median uq max neval
sp(7, 6) 92.77266 94.24901 94.53346 95.10363 95.64914 10