I am running an apply function on a pandas DataFrame using the pandarallel package, initialized with 4 cores. Unfortunately, the process does not process even a single record, whereas the same apply without pandarallel's parallel functionality takes 3 minutes to complete.

I am running the experiment on a 1,000-record DataFrame. My actual dataset has 2 million records, which is why I am looking into pandarallel.

A screenshot of the run was attached (image not available).

The dataset is 6 MB and the machine has 16 GB of RAM. What could be causing this apparent deadlock?

Jack Daniel

1 Answer

There are known issues with multiprocessing in Jupyter. Try running your code as a script or in the IPython console. If it works there, you can place the code in a separate file and import the function into your Jupyter notebook.

# separatefile.py

def multiprocessing_function(params):
    ...  # function body omitted in the original answer

In Jupyter:

from separatefile import multiprocessing_function

multiprocessing_function(params)
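
As a rough sketch of the layout described above, the worker function can live in a plain module so that child processes are able to import it by name. This example uses the standard library's `multiprocessing.Pool` rather than pandarallel, and the names `process_record` and `run_in_parallel` are hypothetical stand-ins, not code from the original post.

```python
# separatefile.py (module name from the answer; the contents are illustrative)
from multiprocessing import Pool


def process_record(record):
    """Hypothetical per-record work: sum the numeric fields of one record."""
    return sum(record.values())


def run_in_parallel(records, workers=4):
    """Apply process_record to every record using a pool of worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(process_record, records)


if __name__ == "__main__":
    # The guard keeps worker processes from re-running this block
    # if they re-import the module on start-up.
    sample = [{"a": 1, "b": 2}, {"a": 3, "b": 4}]
    print(run_in_parallel(sample, workers=2))
```

In the notebook, `from separatefile import run_in_parallel` followed by a call to it should then work, since the workers look up `process_record` in an importable module instead of in the notebook's own `__main__` namespace.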
Eric Truett