Pandarallel not progressing and at deadlock

Question

I am running an apply function on a pandas DataFrame using the pandarallel package, initialized with 4 cores. Unfortunately, the process does not process even a single record, whereas the same code without pandarallel's parallel functionality takes 3 minutes to complete.

I am running the experiment on a DataFrame of 1000 records. My actual dataset has 2 million records, which is why I am looking into pandarallel.

A screenshot of this is attached.

The dataset is 6 MB in size and the machine has 16 GB of RAM. What could be causing this deadlock?

What happens if you run it in the console instead of jupyter? — Eric Truett, Apr 20 '20 at 12:13
I recall some issues with multiprocessing in jupyter. I think I got around it by putting my multiprocessing code in a file and then importing the function, so you might want to give that a try. — Eric Truett, Apr 20 '20 at 13:06

score 1 · Answer 1 · answered Apr 20 '20 at 15:55

There are known issues with multiprocessing in Jupyter. Try running your code as a script or in the IPython console. If it works there, you can place the code in a separate file and import the function into your Jupyter notebook.

# separatefile.py

def multiprocessing_function(params):
    ...  # your multiprocessing code goes here

In Jupyter:

from separatefile import multiprocessing_function

multiprocessing_function(params)
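To make the workaround concrete, here is a minimal, self-contained sketch that uses only the standard library. It is an illustration under assumptions, not the asker's actual code: the function body (squaring a number) is a placeholder, `run_in_workers` is a hypothetical helper, and the module is written to a temporary directory only so the example runs on its own. In practice you would create separatefile.py by hand next to the notebook. The point it demonstrates is that the worker function lives in an importable module rather than in the notebook's `__main__`, which is what lets worker processes locate it.

```python
# Sketch only: the function body and helper names are hypothetical.
import sys
import tempfile
from concurrent.futures import ProcessPoolExecutor
from pathlib import Path

# Stand-in for a hand-written separatefile.py sitting next to the notebook.
MODULE_SOURCE = '''
def multiprocessing_function(value):
    """Placeholder per-record work: square the input."""
    return value * value
'''

module_dir = Path(tempfile.mkdtemp())
(module_dir / "separatefile.py").write_text(MODULE_SOURCE)
sys.path.insert(0, str(module_dir))  # make the module importable

# The function now belongs to the module "separatefile", not to __main__.
from separatefile import multiprocessing_function


def run_in_workers(values, workers=4):
    """Apply the imported function to each value in a pool of worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(multiprocessing_function, values))


if __name__ == "__main__":
    print(run_in_workers(range(5)))  # [0, 1, 4, 9, 16]
```

With pandarallel, the same imported function could presumably be passed to `df.parallel_apply(...)` after calling `pandarallel.initialize(nb_workers=4)`. Whether that resolves the hang depends on the platform and the pandarallel version, so treat it as something to try rather than a guaranteed fix.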
