
UPDATE: I actually tried Rcpp to see whether I could speed up that particular line in R, but I decided not to pursue it: a single attempt took a long time, and I also saw Dirk's comment on the post How can I translate a function in R to Rcpp?, which basically says that an R function called from C++ will not be sped up.

I have a script containing a function that creates word2vec embeddings according to the arguments provided to it, then runs some R code (embedded via rpy2) to do statistical computations, and finally uses the output of that R code.

def run_everything(*args): 
    
    #I create word embeddings, calculate cosine similarities here and write to csv
    

    #I call an R function through the rpy2 embedding and apply some computations to identify outliers, then write them to .csv again (sketched just after this block)

    #Use the output of R function to do final operations and write another file to csv 
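
Inside run_everything, the rpy2 embedding looks roughly like this (a simplified sketch: detect_outliers, sim_df, the conversion code and the assumption that compBagplot comes from the mrfDepth package are placeholders I am adding here for illustration; the compBagplot call itself is the actual line quoted further below):

import pandas as pd
import rpy2.robjects as ro
from rpy2.robjects import pandas2ri
from rpy2.robjects.conversion import localconverter

# define the R side once; only the compBagplot call is taken from my code,
# the surrounding wrapper is a simplified placeholder
r_detect_outliers = ro.r("""
function(dat_filt) {
  library(mrfDepth)
  outly <- compBagplot(dat_filt, "sprojdepth", options = list(maxiter = 500))
  outly
}
""")

def detect_outliers(sim_df: pd.DataFrame):
    # convert the pandas DataFrame of cosine similarities to an R data.frame
    with localconverter(ro.default_converter + pandas2ri.converter):
        dat_filt = ro.conversion.py2rpy(sim_df)
    return r_detect_outliers(dat_filt)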

Since I am trying a wide range of hyperparameters for my embeddings, I call this function from a different script and apply multiprocessing.Pool.

from build_model_and_run_r_code import run_everything
from multiprocessing import Pool

#I specify args_for_pool (one tuple of hyperparameter values per run) here,
#roughly as sketched below

if __name__ == '__main__':
    with Pool(8) as pool:
        pool.starmap(run_everything, args_for_pool)
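
args_for_pool is the list of hyperparameter combinations to try, built roughly like this (an illustrative sketch: the names vector_sizes, windows and min_counts and their values are placeholders, not my actual grid):

from itertools import product

# placeholder word2vec hyperparameter grid; the real ranges are wider
vector_sizes = [100, 200, 300]
windows = [5, 10]
min_counts = [1, 5]

# one tuple of positional arguments per call of run_everything,
# which is why pool.starmap is used above
args_for_pool = list(product(vector_sizes, windows, min_counts))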

Two things could be better:

  1. Despite having 12 CPUs and only using 8 workers with multiprocessing.Pool, my computer significantly slows down while I run the code.
  2. If I provide 48 different combinations of hyperparameters, everything finishes in 3 hours. While this is already less than half the time it would take to run each combination one by one, I was wondering whether I am neglecting some aspect that could make it even more efficient. The steps before/after the R code run fairly quickly; it is the computation in R that takes most of the time.

The specific line in R that takes the most time is:

outly = compBagplot(dat_filt, "sprojdepth", options=list(maxiter = 500))
  • Neither R nor Python has been designed for multicore architectures. They both suffer from a centralized global lock (aka GIL) preventing multithreaded code from being faster except in a few cases like IO operations (mainly for devices like SSDs). Both require multiprocessing to scale, and using it introduces significant overheads and is quite a pain for any non-trivial code (i.e. non-embarrassingly-parallel ones). Moreover, if a module uses multiple threads internally (e.g. Numpy BLAS operations), then using multiple processes can be slower (e.g. because of too many threads to schedule, etc.). – Jérôme Richard Feb 14 '23 at 17:30
  • While R code can be executed in parallel, R tends to be insanely slow for many tasks. 8 snails are still slower to move objects than 1 human. It might be a good idea to optimize the hot part of the R code in the first place, typically using things like Rpp if this is possible. I tried to re-implement basic statistical functions manually using C++ with multiple cores on a 6-core machine a few years ago and got stupid speedups like x10000 faster code. – Jérôme Richard Feb 14 '23 at 17:35
  • I think we need more details about what you actually execute to really help you. That being said, ML libraries tend to already be optimized, so they possibly already run in parallel. You need to check to be sure. – Jérôme Richard Feb 14 '23 at 17:36
  • Thanks for the lengthy explanation! When you said Rpp, I am guessing you meant Rcpp, right? @JérômeRichard Would providing the R code help to see whether Rcpp can be useful? Or do you need me to provide details about the Python part of it? – newbie Feb 15 '23 at 07:05
  • I included the line that was taking the most time and tried Rcpp, but it seems like an R function cannot be sped up by Rcpp @JérômeRichard. I will try to find a different approach. – newbie Feb 16 '23 at 13:21
