I am facing some issues parallelizing processes with furrr::future_walk
(see Optimizing memory usage in applying a furrr function to a large list of tibbles: experiencing unexpected memory increase, and memory not being released in future with nested plan(multisession) ).
I reduced my setup to a minimal example:
rm(list=ls(all=TRUE))
require(future)
require(furrr)
require(dplyr)
require(readr)
require(parallel)
set.seed(123)
# fake data
my_list <- replicate(1000000, rnorm(1000), simplify = FALSE)
# function to parallelize
f_to_parallelize <- function(x){
  y <- sum(x)
  return(y)
}
# plans to test
plan(sequential)
#plan(multisession, workers=2)
#plan(multisession, workers=6)
#plan(multisession, workers=15)
l <- future_walk(my_list, f_to_parallelize)
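Note that the fake data alone is already large: one million numeric vectors of length 1,000 is roughly 7.5 GiB of doubles before any copies are made.

```r
# back-of-the-envelope size of my_list:
# 1e6 vectors x 1000 doubles x 8 bytes each
1e6 * 1000 * 8 / 1024^3  # ~7.45 GiB (object.size() reports a bit more
                         # due to per-vector overhead)
```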
When I profile memory and elapsed time for these 4 plans, this is what I get:
I launched 4 different jobs from RStudio Server, while profiling the total memory used by my user's processes in a separate job to collect the data for the graph.
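The profiling job was along these lines (a sketch, not the exact script; the `ps` invocation and log file name are assumptions, Linux only):

```r
# hypothetical profiling loop run in a separate job:
# sums the resident set size (RSS) of all of my user's processes
# once per second and appends it to a log
repeat {
  rss_kb <- sum(as.numeric(
    system("ps -u $USER -o rss=", intern = TRUE)
  ))
  cat(format(Sys.time()), rss_kb / 1024, "MB\n",
      file = "mem_profile.log", append = TRUE)
  Sys.sleep(1)
}
```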
This is the output of sessionInfo() for the parallelization jobs:
R version 4.2.2 Patched (2022-11-10 r83330)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.6 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] readr_2.1.2   dplyr_1.1.0   furrr_0.2.3   future_1.24.0

loaded via a namespace (and not attached):
 [1] rstudioapi_0.13   parallelly_1.30.0 magrittr_2.0.2    hms_1.1.1
 [5] tidyselect_1.2.0  R6_2.5.1          rlang_1.1.1       fansi_1.0.2
 [9] globals_0.14.0    tools_4.2.2       utf8_1.2.2        cli_3.6.0
[13] ellipsis_0.3.2    digest_0.6.29     tibble_3.1.6      lifecycle_1.0.3
[17] crayon_1.5.0      tzdb_0.2.0        purrr_1.0.1       vctrs_0.5.2
[21] codetools_0.2-18  glue_1.6.2        compiler_4.2.2    pillar_1.7.0
[25] generics_0.1.2    listenv_0.8.0     pkgconfig_2.0.3
Is this behavior normal? I did not expect the steep increase in memory for all the plans, on top of the increase in run time as I add workers.
I also tested Sys.sleep(1) as the function to run in parallel, and there I got the result I expected: time decreases as I increase the number of workers.
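That Sys.sleep() test was roughly the following (a minimal sketch; the iteration count is an assumption):

```r
require(future)
require(furrr)

plan(multisession, workers = 2)

# 20 one-second sleeps: elapsed time should scale roughly as
# 20 / number_of_workers, which is what I observed
t <- system.time(
  future_walk(1:20, ~ Sys.sleep(1))
)
print(t[["elapsed"]])
```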
What I am actually trying to parallelize is far more complex than this: a series of nested wrapper functions that train some time-series models, run inference, write a CSV, and return nothing.
I feel like I am missing something very simple, yet I cannot wrap my head around it. What concerns me the most is the memory increase, since my real function is very memory intensive.
Also, the production machine will run Windows, so I will not be able to use mclapply or other fork-based methods.
I would much appreciate it if anyone could clarify this for me.