This is a follow-up to my earlier question (Why is R self-reported memory usage much inferior to MAC-reported memory usage?).

I have identified that the source of the problem (skyrocketing memory usage) lies in newer versions of some R libraries.

In short, I have an R script that aggregates a dozen or so tables into one large table (approximately 10k rows by 2k columns). The script loads the following libraries:

library(argparse)
library(dplyr)
library(readr)
library(readxl)
library(stringr)
library(tidyr)
library(tibble)

Running the script in this "good" environment produces no error: the code finishes in under 1 minute and uses about 500 MB of memory.

name: env_good
channels:
  - conda-forge
  - bioconda
  - defaults
  - r
dependencies:
  - r-argparse=2.1.1
  - r-base=4.1.1
  - r-base64enc=0.1_3
  - r-cpp11=0.3.1
  - r-dplyr=1.0.7
  - r-generics=0.1.0
  - r-glue=1.4.2
  - r-lifecycle=1.0.0
  - r-magrittr=2.0.1
  - r-pillar=1.6.2
  - r-r6=2.5.1
  - r-readr=2.0.1
  - r-readxl=1.3.1
  - r-rlang=0.4.11
  - r-stringr=1.4.0
  - r-tibble=3.1.4
  - r-tidyr=1.1.3
  - r-tidyselect=1.1.1
  - r-vctrs=0.3.8
  - r-vroom=1.5.5

Running the script in this "bad" environment, it runs for several minutes before being killed by the OS for consuming too much memory (the virtual memory size reported by Activity Monitor climbs to 40 GB before the process is killed).

name: env_bad
channels:
  - conda-forge
  - bioconda
  - defaults
  - r
dependencies:
  - r-argparse=2.1.1
  - r-base=4.1.1
  - r-base64enc=0.1_3
  - r-cpp11=0.4.2
  - r-dplyr=1.0.7
  - r-generics=0.1.3
  - r-glue=1.6.2
  - r-lifecycle=1.0.1
  - r-magrittr=2.0.3
  - r-pillar=1.8.0
  - r-r6=2.5.1
  - r-readr=2.0.1
  - r-readxl=1.3.1
  - r-rlang=1.0.4
  - r-stringr=1.4.0
  - r-tibble=3.1.4
  - r-tidyr=1.1.3
  - r-tidyselect=1.1.2
  - r-vctrs=0.4.1
  - r-vroom=1.5.7
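
As a rough sketch, the two listings can be compared in R to narrow down the candidates. The version strings below are copied by hand from the two environments above, and the names `good`, `bad` and `changed` are only illustrative.

```r
# Package versions copied from env_good and env_bad (the "r-" prefix is dropped).
good <- c(argparse = "2.1.1", base = "4.1.1", base64enc = "0.1_3",
          cpp11 = "0.3.1", dplyr = "1.0.7", generics = "0.1.0",
          glue = "1.4.2", lifecycle = "1.0.0", magrittr = "2.0.1",
          pillar = "1.6.2", r6 = "2.5.1", readr = "2.0.1",
          readxl = "1.3.1", rlang = "0.4.11", stringr = "1.4.0",
          tibble = "3.1.4", tidyr = "1.1.3", tidyselect = "1.1.1",
          vctrs = "0.3.8", vroom = "1.5.5")

bad <- c(argparse = "2.1.1", base = "4.1.1", base64enc = "0.1_3",
         cpp11 = "0.4.2", dplyr = "1.0.7", generics = "0.1.3",
         glue = "1.6.2", lifecycle = "1.0.1", magrittr = "2.0.3",
         pillar = "1.8.0", r6 = "2.5.1", readr = "2.0.1",
         readxl = "1.3.1", rlang = "1.0.4", stringr = "1.4.0",
         tibble = "3.1.4", tidyr = "1.1.3", tidyselect = "1.1.2",
         vctrs = "0.4.1", vroom = "1.5.7")

# Keep only the packages whose version string differs between the two listings.
changed <- names(good)[good != bad[names(good)]]

# Side-by-side table of the differing versions.
diff_table <- data.frame(package = changed,
                         good = unname(good[changed]),
                         bad = unname(bad[changed]))
print(diff_table)
```

Ten packages differ (cpp11, generics, glue, lifecycle, magrittr, pillar, rlang, tidyselect, vctrs and vroom), while dplyr, readr, readxl, stringr, tibble and tidyr are identical in both environments. This only lists the candidates; it does not show which one is responsible.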

Does anyone have an idea which package update is causing the issue, so that I can report it to the package maintainer?
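
For a report, it may also help to confirm from inside R which versions are actually loaded in each environment. The snippet below is a minimal sketch, assuming it is run once per environment; `pkgs` and `installed` are illustrative names, and the package list is simply the set of versions that differ between the two listings above.

```r
# Packages whose versions differ between the "good" and "bad" environments.
pkgs <- c("cpp11", "generics", "glue", "lifecycle", "magrittr",
          "pillar", "rlang", "tidyselect", "vctrs", "vroom")

# Look up the installed version of each package; NA if it is not installed.
installed <- vapply(pkgs, function(p) {
  if (requireNamespace(p, quietly = TRUE)) {
    as.character(utils::packageVersion(p))
  } else {
    NA_character_
  }
}, character(1))

print(installed)
```

Pasting this output (or that of `sessionInfo()`) from both environments into a bug report gives maintainers the exact versions involved.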

Best,

Yoann

PS: I am working on macOS (Mojave, Memory: 16 GB 2133 MHz LPDDR3)

Durzot
  • Without a minimal [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) it's impossible to know what your code is even doing to see how it might be affected by changes in these packages. You could read the release notes for packages that have been updated as a start. – MrFlick Aug 04 '22 at 15:26
  • @MrFlick Thank you for your message. I am working on writing an MRE, but I am short of time these days. I made this post because I believe this may be an important issue affecting many other people. I hope to post it soon. – Durzot Aug 05 '22 at 07:41