This question is related to this one, which helped me improve R's performance when computing the mean of a big array. Unfortunately, in this case I'm trying to apply something more complex (a quantile calculation).
I have a 4-D array with more than 40 million elements and I want to calculate the 66th percentile along a specific dimension. Here is the MATLAB code:
> n = randn(100, 50, 100, 20);
> tic; q = quantile(n, 0.66, 4); toc
Elapsed time is 0.440824 seconds.
Let's do something similar in R.
> n = array(rnorm(100*50*100*20), dim = c(100,50,100,20))
> start = Sys.time(); q = apply(n, 1:3, quantile, .66); print(Sys.time() - start)
Time difference of 1.600693 mins
I was aware that MATLAB often outperforms R, but in this case I don't know what to do. Maybe I just have to wait about two minutes instead of under a second. I hope someone can suggest a way to improve the running time. Thank you in advance.
UPDATE: I've applied some of the suggestions from the comments and reduced the running time:
> start = Sys.time(); q = apply(n, 1:3, quantile, .66, names = FALSE); print(Sys.time() - start)
Time difference of 33.42773 secs
We're still far from MATLAB's performance, but at least I've learned something.
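As far as I understand, the gain comes from skipping the label that `quantile` attaches to each result, which otherwise has to be built once per call (500,000 times here). A small illustration of the difference, where the vector `x` is just made-up example data:

```r
# Made-up example data, only to show what names = FALSE changes.
x <- c(0.3, -1.2, 0.8, 2.1, -0.5)

named   <- quantile(x, 0.66)                 # result carries a "66%" label
unnamed <- quantile(x, 0.66, names = FALSE)  # bare numeric value, no label

names(named)                 # "66%"
names(unnamed)               # NULL
unname(named) == unnamed     # TRUE: same value, only the label differs
```

The computed quantile is the same in both cases, so dropping the names should be safe whenever only the numeric value is needed.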
UPDATE: I'm adding some progress related to the `quantile` function, discussed here. The running time of the same code shown above has dropped from 33 to 5 seconds.
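For anyone who wants to experiment, here is a minimal sketch of one way to avoid most of the overhead of `quantile`: compute R's default (type 7) quantile by hand with a partial sort. This is not necessarily the approach from the linked discussion. The function name `fast_quantile_last_dim` is my own, and the sketch assumes a single probability, no `NA` values, and that the quantile is taken along the last dimension.

```r
# Sketch: type-7 quantile (R's default) along the last dimension of an array.
# Assumptions: one probability p in [0, 1], no NA values in `a`.
# `fast_quantile_last_dim` is a hypothetical helper name, not a base R function.
fast_quantile_last_dim <- function(a, p) {
  d  <- dim(a)
  k  <- d[length(d)]          # length of the dimension being reduced
  h  <- (k - 1) * p + 1       # fractional position of the quantile
  lo <- floor(h)
  hi <- ceiling(h)
  w  <- h - lo                # linear interpolation weight

  # Arrays are stored column-major, so each row of this matrix is one
  # "fibre" running along the last dimension.
  m <- matrix(a, ncol = k)

  q <- apply(m, 1, function(x) {
    # Partial sort: only positions lo and hi need to be in their final place.
    x <- sort.int(x, partial = unique(c(lo, hi)))
    x[lo] + w * (x[hi] - x[lo])
  })

  array(q, dim = d[-length(d)])
}

# Check against quantile() on a small array before trusting it on the big one.
set.seed(1)
a   <- array(rnorm(4 * 3 * 2 * 20), dim = c(4, 3, 2, 20))
ref <- apply(a, 1:3, quantile, 0.66, names = FALSE)
all.equal(fast_quantile_last_dim(a, 0.66), ref)
```

On the original array this would be called as `q = fast_quantile_last_dim(n, 0.66)`. I have not benchmarked this version, so treat any speed-up as something to measure rather than assume.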