I recently left my job and had to make a new Stack Overflow account. I hope it is fine that this is my first post once more :)

I am trying to parallelize a function:

@Parallelizer
def square(x):
    return x ** 2

Using a wrapper I built in the following way:

import multiprocessing
from functools import wraps

def Parallelizer(func):
    @wraps(func)
    def wrapper(*args, **kwargs):

        # Step 1: Start a pool of worker processes

        with multiprocessing.Pool(processes=4) as pool:
            # Step 2: Extract the iterable argument
            iterables = args[0]

            # Step 3: Generate tasks for each worker by pairing the function
            # and the individual items from the iterable
            tasks = [(func, (iterable, *args[1:]), kwargs) for iterable in iterables]

            # Step 4: Asynchronously apply each task using the pool
            async_results = [pool.apply_async(task[0], task[1], task[2]) for task in tasks]

            # Step 5: Wait for the results to be ready and retrieve them
            output = [async_result.get() for async_result in async_results]

        return output

    return wrapper

I try to run the function in parallel for 10 different numbers using the following code:

# Create a list of integers
numbers = range(10)

# Square each number in the list using the parallelized function
serial_output = square(numbers)

However, I keep encountering an error saying that the function square cannot be pickled.

Full traceback:

Traceback (most recent call last):
  File "C:\Users\\OneDrive\Bureaublad\Work\Development\Projects\pythonProject\KFolder.py", line 57, in <module>
    serial_output = square(numbers)
  File "C:\Users\\OneDrive\Bureaublad\Work\Development\Projects\pythonProject\KFolder.py", line 38, in wrapper
    output = [async_result.get() for async_result in async_results]
  File "C:\Users\\OneDrive\Bureaublad\Work\Development\Projects\pythonProject\KFolder.py", line 38, in <listcomp>
    output = [async_result.get() for async_result in async_results]
  File "C:\Users\\anaconda3\lib\multiprocessing\pool.py", line 774, in get
    raise self._value
  File "C:\Users\\anaconda3\lib\multiprocessing\pool.py", line 540, in _handle_tasks
    put(task)
  File "C:\Users\\anaconda3\lib\multiprocessing\connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "C:\Users\\anaconda3\lib\multiprocessing\reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function square at 0x00000258FACD12D0>: it's not the same object as __main__.square

Process finished with exit code 1

I ran some checks and found that the function square can be pickled fine on its own. I am not yet familiar enough with the multiprocessing module to pinpoint exactly where it is going wrong. Hopefully someone can help me out!
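The message "it's not the same object as __main__.square" can be reproduced with pickle alone, without any worker processes. The following is a minimal sketch, not the code from the question: `passthrough` is a hypothetical stand-in for Parallelizer that only wraps the function, and `original` and `original_pickles` are names introduced for illustration.

```python
import pickle
from functools import wraps


def passthrough(func):
    # Hypothetical stand-in for Parallelizer: wraps func without using a pool.
    @wraps(func)
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs)
    return wrapper


@passthrough
def square(x):
    return x ** 2


# functools.wraps stores the undecorated function on the wrapper.
original = square.__wrapped__

# pickle serializes a function as a reference (module name plus qualified name).
# The name "square" now points at the wrapper, so the lookup for the original
# function finds a different object and pickling is refused.
try:
    pickle.dumps(original)
    original_pickles = True
except pickle.PicklingError as exc:
    original_pickles = False
    print(exc)
```

In the code from the question, the undecorated function is what pool.apply_async receives as func, which is consistent with the error being raised while the task is sent to a worker.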

I tried to create a wrapper that parallelizes a function by calling it with every element of an iterable given as input. I expected it to create several workers that run the function for different elements in parallel. Instead, I get an error related to pickling.

  • Using decorators with multiprocessing requires special handling. See also https://stackoverflow.com/questions/9336646/python-decorator-with-multiprocessing-fails – Nick ODell Jul 16 '23 at 23:53
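One possible form of the special handling mentioned in that comment, given as a sketch under assumptions rather than a definitive fix: leave the function undecorated, so its module-level name still refers to the original object, and bind the parallel version to a different name. The names `parallelize` and `square_parallel` are hypothetical. The `if __name__ == "__main__":` guard is needed on Windows, where worker processes re-import the main module.

```python
import multiprocessing
from functools import wraps


def parallelize(func):
    """Return a version of func that is applied to each item of its first argument."""
    @wraps(func)
    def wrapper(items, *args, **kwargs):
        with multiprocessing.Pool(processes=4) as pool:
            # Submit one task per item; func itself is what gets pickled.
            pending = [pool.apply_async(func, (item, *args), kwargs) for item in items]
            # Collect results while the pool is still open.
            return [result.get() for result in pending]
    return wrapper


def square(x):
    return x ** 2


# Not applied with @: the name "square" still refers to the original function,
# so the reference pickled for the workers resolves to the same object.
square_parallel = parallelize(square)


if __name__ == "__main__":
    print(square_parallel(range(10)))
```

The trade-off is that callers have to use the second name for the parallel version. The linked question discusses other approaches.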
