I want to run a single function (namely, calculating the Gini coefficient with a confidence interval using the DescTools library) on a vector of length 40,000.
set.seed(42)
my_vec <- sample(1:100000, 40000, replace = TRUE)

# the function to get the Gini coefficient with a confidence interval
DescTools::Gini(my_vec, conf.level = 0.99)
Calculating just the Gini coefficient without the confidence interval finishes in no time, but requesting the confidence interval runs into memory issues on my machine (64-bit R, 8 GB RAM) and returns
Error: vector memory exhausted (limit reached?)
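If I read the DescTools documentation right, the confidence interval is obtained by bootstrapping (R = 1000 replicates and a BCa interval by default), and I suspect that is where the memory goes. For comparison, here is a sketch of a manual percentile bootstrap with a hand-rolled Gini function (my own names; it will not reproduce DescTools' BCa interval exactly), which keeps only one resample in memory at a time:

```r
set.seed(42)
my_vec <- sample(1:100000, 40000, replace = TRUE)

# hand-rolled Gini coefficient via the sorted-vector formula,
# with the small-sample ("unbiased") correction n / (n - 1)
gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  (2 * sum(seq_len(n) * x) / (n * sum(x)) - (n + 1) / n) * n / (n - 1)
}

# plain percentile bootstrap: one resample at a time, so nothing
# accumulates beyond the vector of replicate statistics
boot_gini_ci <- function(x, reps = 200, conf = 0.99) {
  stats <- vapply(seq_len(reps),
                  function(i) gini(sample(x, replace = TRUE)),
                  numeric(1))
  alpha <- (1 - conf) / 2
  c(estimate = gini(x), quantile(stats, c(alpha, 1 - alpha)))
}

boot_gini_ci(my_vec)
```

This is only a percentile interval, so it is cruder than the BCa interval DescTools would give, but the peak memory use stays close to a single copy of the vector.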
To solve this, I looked into these options:
- increase the memory available to R, but I have not found an option for that on the Mac (memory.limit() seems to be Windows-only)
- run the function in parallel using the parallel R library
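On the first option: as far as I can tell, the cap behind "vector memory exhausted" on macOS is controlled by the R_MAX_VSIZE environment variable (mentioned in ?Memory), not by memory.limit(). If that is right, it can be raised in ~/.Renviron; the 16Gb value below is just a placeholder, not a recommendation:

```
# in ~/.Renviron: raise R's vector heap limit on macOS (example value)
R_MAX_VSIZE=16Gb
```

R reads ~/.Renviron at startup, so the session has to be restarted for this to take effect.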
I'm struggling with the latter because the function does not iterate over multiple columns or chunks, so I would not expect parallelization to work out of the box:

mclapply(my_vec, function(x) Gini(x, unbiased = TRUE, conf.level = 0.99), mc.cores = 3) # does not work: applies Gini to each single element
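Since there is only one vector, what could be split across cores is presumably the bootstrap replicates rather than the data. A sketch along those lines (again with a hand-rolled Gini and a plain percentile interval, not DescTools' BCa; the names gini and reps are my own):

```r
library(parallel)

set.seed(42)
my_vec <- sample(1:100000, 40000, replace = TRUE)

# hand-rolled Gini (sorted-vector formula with the unbiased correction)
gini <- function(x) {
  x <- sort(x)
  n <- length(x)
  (2 * sum(seq_len(n) * x) / (n * sum(x)) - (n + 1) / n) * n / (n - 1)
}

# parallelize over bootstrap replicates: each worker resamples the
# full vector and returns a single Gini per replicate
reps <- 200
stats <- unlist(mclapply(seq_len(reps),
                         function(i) gini(sample(my_vec, replace = TRUE)),
                         mc.cores = 3))

quantile(stats, c(0.005, 0.995))  # 99% percentile interval
```

Two caveats: mclapply relies on forking, which works on macOS/Linux but not on Windows, and for reproducible resampling across workers one would additionally need RNGkind("L'Ecuyer-CMRG").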
Is there a way to avoid the memory issue, and if parallelization is a solution, how could I implement it for the one vector? Thanks a lot!