How can I parallelize the apply function in pandas?

Question

I've been using the `apply` function to run functions on each row of my data. It works well and seems faster than the `for` loops I was using. Now I'd like to speed it up further, so I'm wondering how I can run `apply` in parallel.

In [49]: df
Out[49]: 
          0         1
0  1.000000  0.000000
1 -0.494375  0.570994
2  1.000000  0.000000
3  1.876360 -0.229738
4  1.000000  0.000000

In [50]: def f(x):    
   ....:  return x[0] + x[1]  
   ....:  

In [51]: df.apply(f, axis=1) #passes a Series object, row-wise
Out[51]: 
0    1.000000
1    0.076619
2    1.000000
3    1.646622
4    1.000000

With this example, do I need to wrap my function or the `apply` method with `concurrent.futures` or something similar?
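
One way to do what the question describes is to split the frame into row chunks and run an ordinary `apply` on each chunk in a worker from `concurrent.futures`. The sketch below is only an illustration under stated assumptions: `parallel_apply`, `_apply_chunk`, `n_workers`, and `executor_cls` are hypothetical names, not part of pandas, and `f` is the function from the session above.

```python
from concurrent.futures import ProcessPoolExecutor

import numpy as np
import pandas as pd


def f(x):
    # Same row function as in the question: add the two columns.
    return x[0] + x[1]


def _apply_chunk(args):
    # Runs inside a worker: a plain row-wise apply on one slice of the frame.
    chunk, func = args
    return chunk.apply(func, axis=1)


def parallel_apply(df, func, n_workers=4, executor_cls=ProcessPoolExecutor):
    """Hypothetical helper: apply func row-wise, one chunk per worker task."""
    # Contiguous row boundaries; empty slices are dropped.
    bounds = np.linspace(0, len(df), n_workers + 1, dtype=int)
    chunks = [df.iloc[a:b] for a, b in zip(bounds[:-1], bounds[1:]) if b > a]
    if not chunks:
        # Nothing to distribute for an empty frame.
        return df.apply(func, axis=1)
    with executor_cls(max_workers=n_workers) as pool:
        # map() yields results in submission order, so row order is preserved.
        parts = list(pool.map(_apply_chunk, [(c, func) for c in chunks]))
    return pd.concat(parts)
```

With a process pool, `func` has to be picklable (a module-level function rather than a lambda), and the call `parallel_apply(df, f)` should sit under an `if __name__ == "__main__":` guard. For a function as cheap as `f`, the cost of shipping chunks to the workers will likely outweigh any gain; this only tends to pay off when the per-row work is expensive.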

`apply` is not really faster than a `for` loop; if anything, it's slower because of the overhead. Your question is almost identical to asking how to vectorize an arbitrary function, which is far too broad, if it is possible at all, for an SO question. — Quang Hoang, Oct 20 '20 at 18:23
There are quite a few options, such as `dask`, `pandarallel`, and `swifter`. Have a look at [this question](https://stackoverflow.com/questions/45545110/make-pandas-dataframe-apply-use-all-cores). — Mayank Porwal, Oct 20 '20 at 18:24
Does this answer your question? [pandas multiprocessing apply](https://stackoverflow.com/questions/26784164/pandas-multiprocessing-apply) — Michael Szczesny, Oct 20 '20 at 18:25
Also [this question](https://stackoverflow.com/questions/16476924/how-to-iterate-over-rows-in-a-dataframe-in-pandas/55557758#55557758) on iterrows. — Quang Hoang, Oct 20 '20 at 18:27
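
To illustrate the comment above about vectorizing: for the specific `f` in the question, the row-wise `apply` can be replaced by a column-wise expression, which avoids calling a Python function once per row. This is a minimal sketch that rebuilds the example frame from the printed values; it says nothing about functions that cannot be written as column operations.

```python
import pandas as pd

# The example frame from the question, rebuilt from the printed values.
df = pd.DataFrame(
    [
        [1.000000, 0.000000],
        [-0.494375, 0.570994],
        [1.000000, 0.000000],
        [1.876360, -0.229738],
        [1.000000, 0.000000],
    ]
)

# Row-wise: f is called once per row with a Series.
row_wise = df.apply(lambda x: x[0] + x[1], axis=1)

# Vectorized: one addition over whole columns, no per-row Python call.
vectorized = df[0] + df[1]
```

Both expressions produce the same Series here, so when a function can be expressed this way there is usually no need to parallelize it.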

