
Is there a way to configure furrr::future_map so that it supports a nested use case? Consider the following code:

library(furrr)
library(tictoc)

# The problem is easier to reason about if you take N
# smaller than your number of cores, and M big.
N = 2 
M = 100

plan(sequential)
tic()
x = future_map(1:N, function(i){
  furrr::future_map(1:M,function(j){
    Sys.sleep(1/M)
    return(1)
  })
})
toc() # 2sec + overhead

plan(multisession) # multiprocess has been removed from the future package
tic()
x = future_map(1:N, function(i){
  furrr::future_map(1:M,function(j){
    Sys.sleep(1/M)
    return(1)
  })
})
toc() # one sec + overhead !!

The first one should take a little more than 2 seconds, which is fine. But, even on a thousand-core machine, is there a way to make the second one take less than 1 second?

My use case is the following: some sub-tasks take longer to complete than others, and once the shorter ones have finished, the freed cores could be used to further dispatch the longer tasks.

But furrr does not do that by default, and each longer-running task ends up on a single core. The problem is equivalent to the one shown in the code above: is there a way to have furrr re-dispatch inner tasks when some cores are free?

Is it simply impossible, or did I miss a parameter in the furrr/future calls?

lrnv
  • See [this vignette](https://cran.r-project.org/web/packages/future/vignettes/future-3-topologies.html). – Axeman May 05 '20 at 22:37

1 Answer


Edit: Following the comment from HenrikB, I replaced multiprocess with multisession, because multiprocess was deprecated and then removed from the future package in July 2023.

As described in the vignette "A Future for R: Future Topologies" mentioned by Axeman, you can pass a list of strategies to future::plan and configure each one with future::tweak. Each element of the list corresponds to one nesting level. So if you provide two plans, the nested furrr::future_map also runs in parallel, e.g.:

future::plan(
  list(
    # Outer level: 2 workers
    future::tweak(future::multisession, workers = 2),
    # Inner level: 4 workers for each outer worker
    future::tweak(future::multisession, workers = 4)
  )
)

This example is sized for 8 cores, since each of the two outer workers gets 4 additional workers of its own.
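For completeness, here is a minimal sketch of the question's benchmark run under such a two-level plan. The inner worker count of 4 is just an illustrative choice. Wrapping it in I() is an assumption on my side: as far as I know, recent versions of future/parallelly want this to confirm that the extra inner workers are intentional, so drop the I() if your version complains. I have not included a timing, since it depends on the machine and on the startup overhead of the workers.

```r
library(furrr)   # also attaches future
library(tictoc)

N <- 2
M <- 100

# Two-level topology: N outer workers, each with its own inner workers.
# workers = I(4) is an illustrative assumption, see the note above.
plan(list(
  tweak(multisession, workers = N),
  tweak(multisession, workers = I(4))
))

tic()
x <- future_map(1:N, function(i) {
  # The inner call is resolved with the second element of the plan.
  furrr::future_map(1:M, function(j) {
    Sys.sleep(1 / M)
    1
  })
})
toc()

# Go back to sequential processing, which shuts down the workers.
plan(sequential)
```

Note that this splits the cores statically between the two levels. It does not hand idle outer workers over to the inner tasks on the fly.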

Domingo
  • Please replace `multiprocess` with `multisession`. The former has been deprecated for a long time (since 2020) and has now been fully removed from the **future** package (July 2023). – HenrikB Jul 02 '23 at 07:32