Is there a way to configure furrr::future_map so that nested calls also run in parallel? Consider the following code:
library(furrr)
library(tictoc)

# The problem is easier to reason about if you take N
# smaller than your number of cores, and M big.
N <- 2
M <- 100

# Baseline: everything runs sequentially.
plan(sequential)
tic()
x <- future_map(1:N, function(i) {
  furrr::future_map(1:M, function(j) {
    Sys.sleep(1 / M)
    return(1)
  })
})
toc() # 2 sec + overhead

# Parallel backend. multiprocess is deprecated in recent versions of
# future; multisession is its portable replacement.
plan(multisession)
tic()
x <- future_map(1:N, function(i) {
  furrr::future_map(1:M, function(j) {
    Sys.sleep(1 / M)
    return(1)
  })
})
toc() # 1 sec + overhead !!
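The first one should take a little more than 2 seconds, which is expected. But the second one still takes about 1 second: only the outer loop is parallelized, and each inner loop runs sequentially on its own worker. Even on a thousand-core machine, is there a way to make the second one take less than 1 second?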
My use case is the following: some sub-tasks take longer than others to complete, and once the shorter ones have finished, the freed cores could be used to dispatch the inner tasks of the longer ones.
But furrr does not do that by default, and each longer-running task ends up on a single core. The problem is equivalent to the one shown in the code above: is there a way to have furrr re-dispatch inner tasks when some cores are free?
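To make the desired behavior concrete, here is a minimal sketch of one workaround: flatten the two loops into a single future_map2 call over every (i, j) pair, so that each inner task is visible to the scheduler as a top-level task, then regroup the results. The worker count of 4 and the names grid and flat are assumptions made for illustration, and this only applies when the inner tasks do not depend on work done in the outer function.

```r
library(future)
library(furrr)

N <- 2
M <- 100

# Assumption: 4 local workers; set this to the number of cores you have.
plan(multisession, workers = 4)

# One row per (i, j) pair, so every inner task becomes a top-level task.
# j varies fastest, so rows are ordered by i first, then by j.
grid <- expand.grid(j = seq_len(M), i = seq_len(N))

flat <- future_map2(grid$i, grid$j, function(i, j) {
  Sys.sleep(1 / M)
  1
})

# Regroup the flat results by outer index to recover the nested shape.
x <- unname(split(flat, grid$i))

# Reset the plan so the background workers are shut down.
plan(sequential)
```

If the task durations are very uneven, the scheduling and chunk_size arguments of furrr_options() control how many tasks are bundled per worker, which may help balance the load.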
Is it simply impossible to do, or did I miss a parameter to the furrr/future calls?
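For reference, the future package runs nested futures sequentially unless the plan is given as a list with one strategy per nesting level. The sketch below assumes that such a nested plan is an acceptable answer to the question above; the worker counts (2 outer, 2 inner) are hypothetical, and wrapping the inner count in I() is an assumption based on the future topologies documentation, where it is used to bypass the safeguard that limits nested workers to one core.

```r
library(future)
library(furrr)

N <- 2
M <- 100

# Assumed worker counts; keep outer * inner within your available cores.
plan(list(
  tweak(multisession, workers = 2),    # outer level: one worker per i
  tweak(multisession, workers = I(2))  # inner level: workers per outer task
))

x <- future_map(1:N, function(i) {
  furrr::future_map(1:M, function(j) {
    Sys.sleep(1 / M)
    1
  })
})

# Reset the plan so the background workers are shut down.
plan(sequential)
```

Note that this splits cores statically between the two levels: it parallelizes the inner loops, but it does not hand cores freed by a short outer task to a longer one.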