Since the provided sample function is not complete (fun1 is not defined), I am using a pair of trivial functions, with a sleep call added to simulate some heavy computation:
s <- 0.1  # sleep time per call, to simulate heavy computation

## Each function takes "its own" variable as argument and reads the other
## variable from the global environment, so uniroot() can solve for one
## variable at a time while the other is held fixed.
fun_x1 <- function(x1) {
  Sys.sleep(s)
  2 + 0.5 * x2 - x1
}

fun_x2 <- function(x2) {
  Sys.sleep(s)
  3 + 0.25 * x1 - x2
}
As a baseline, we then run your original code:
eps <- 0.000001

t1 <- Sys.time()
x1 <- 0
x2 <- 0
counter <- 0
repeat {
  x1_old <- x1
  x2_old <- x2
  counter <- counter + 1
  # Find x1 for the current x2, then immediately use the new x1 to find x2.
  res <- uniroot(fun_x1, c(0, 5), tol = 0.000001)
  x1 <- res$root
  res_o <- uniroot(fun_x2, c(0, 5), tol = 0.000001)
  x2 <- res_o$root
  if (abs(sum(c(x1, x2) - c(x1_old, x2_old))) < eps) {
    print(c(counter, x1, x2, x1_old, x2_old))
    break
  }
}
#> [1] 10 4 4 4 4
t2 <- Sys.time()
print(t2 -t1)
#> Time difference of 8.089114 secs
Here it takes 10 iterations in about 8 s to find the common root. However, this loop cannot be parallelized, since each step depends on the result of the previous step. We can untangle this at one point by first finding both individual roots and only then updating x1 and x2. The problem is that convergence is slower this way, because x2 is now found using the old value of x1 rather than the freshly updated one:
t1 <- Sys.time()
x1 <- 0
x2 <- 0
counter <- 0
repeat {
  x1_old <- x1
  x2_old <- x2
  counter <- counter + 1
  # Find both roots using the old values, then update x1 and x2 together.
  res <- uniroot(fun_x1, c(0, 5), tol = 0.000001)
  res_o <- uniroot(fun_x2, c(0, 5), tol = 0.000001)
  x1 <- res$root
  x2 <- res_o$root
  if (abs(sum(c(x1, x2) - c(x1_old, x2_old))) < eps) {
    print(c(counter, x1, x2, x1_old, x2_old))
    break
  }
}
#> [1] 16.000000 4.000000 4.000000 3.999999 4.000000
t2 <- Sys.time()
print(t2 -t1)
#> Time difference of 12.91926 secs
For my sample functions it now takes 16 iterations and almost 13 s. However, this form can be parallelized, since the two roots can be computed concurrently using the future package:
library(future)
plan(multisession)  # plan("multiprocess") is deprecated in current versions of future
t1 <- Sys.time()
x1 <- 0
x2 <- 0
counter <- 0
repeat {
  x1_old <- x1
  x2_old <- x2
  counter <- counter + 1
  # The first root is computed asynchronously in a worker process,
  # while the main process computes the second root.
  res %<-% uniroot(fun_x1, c(0, 5), tol = 0.000001)
  res_o <- uniroot(fun_x2, c(0, 5), tol = 0.000001)
  x1 <- res$root   # blocks until the future has been resolved
  x2 <- res_o$root
  if (abs(sum(c(x1, x2) - c(x1_old, x2_old))) < eps) {
    print(c(counter, x1, x2, x1_old, x2_old))
    break
  }
}
#> [1] 16.000000 4.000000 4.000000 3.999999 4.000000
t2 <- Sys.time()
print(t2 -t1)
#> Time difference of 7.139439 secs
It still takes 16 iterations, but this time they are done in about 7 s. That is almost a factor of two faster than the previous non-parallel version, i.e. there is little overhead from the parallelization. However, the original sequential version is nearly as fast, because it converges in fewer iterations.
You will have to check with your real functions whether the speed-up from parallel execution outweighs the slower convergence.
BTW, have you checked whether there is a better algorithm for finding this common root?
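For the toy linear functions used here, for instance, no iteration is needed at all. The sketch below (not part of the original answer, just an illustration) solves the corresponding 2x2 linear system directly, and also shows a multivariate root finder, rootSolve::multiroot(), which may be worth comparing against the nested uniroot() iteration for your real, presumably nonlinear, functions:
## Hedged sketch: at the common root the toy system reads
##    x1 - 0.50 * x2 = 2
## -0.25 * x1 +   x2 = 3
## so it can be solved directly with base R.
A <- rbind(c( 1.00, -0.50),
           c(-0.25,  1.00))
b <- c(2, 3)
solve(A, b)
#> [1] 4 4

## For nonlinear systems, a multivariate root finder (here rootSolve::multiroot,
## assuming that package is available) solves for both variables at once.
library(rootSolve)
f <- function(x) c(2 + 0.5  * x[2] - x[1],
                   3 + 0.25 * x[1] - x[2])
multiroot(f, start = c(0, 0))$root  # also finds the common root (4, 4)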