I have written some code to parallelize the processing of some data in a Jupyter notebook.

It consists of a function that takes some data as input, transforms it, and writes the result to a file:

%%writefile my_functions.py
import pickle

def my_function(f):

    d = f*10

    with open(f"{f}.p", "wb") as out:  # v is not defined in this version, so the file is named after f
        pickle.dump(d, out, pickle.HIGHEST_PROTOCOL)

The function is called in the main:

from multiprocess import Pool
from my_functions import my_function
from tqdm import tqdm

values_list = [0, 1, 2, 3, 4, 5, 6]

max_pool = 5

factor=10

with Pool(max_pool) as p:
    pool_outputs = list(
        tqdm(
            p.imap(my_function,
                   values_list),
            total=len(values_list)
        )
    )    

How can I modify the code in order to pass some variables to my_function? For example, let's suppose I want to pass the value of a variable v:

%%writefile my_functions.py
import pickle

def my_function(f,v):

    d = f*v

    with open(f"{f}.p", "wb") as out:  # one file per input; naming it after v would make every task overwrite the same file
        pickle.dump(d, out, pickle.HIGHEST_PROTOCOL)

How can I modify the call to p.imap accordingly?

Following other solutions for multiprocessing (e.g. this one), I tried p.imap(my_function, zip(values_list, repeat(factor))) and p.imap(my_function(factor), values_list), but neither worked.

Note: I am not bound to using multiprocess. If you know of solutions using other packages, I am open to them.

shamalaia

1 Answer

To run many tasks in parallel, I usually use ThreadPoolExecutor. Here is a small example based on your source code.

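from concurrent.futures import ThreadPoolExecutor
from functools import partial
import pickle


def my_function(f, v):
    d = f * v
    # One output file per input value; the handle is named `out` so it does not shadow f.
    with open(f"{f}.p", "wb") as out:
        pickle.dump(d, out, pickle.HIGHEST_PROTOCOL)


if __name__ == "__main__":
    values_list = [1, 2, 3, 4, 5, 6, 7, 8]  # I assume the parameter f is a number.
    factor = 10
    with ThreadPoolExecutor() as executor:
        # partial() fixes v, so map() only has to supply f.
        fn = partial(my_function, v=factor)
        # list() consumes the results so that any worker exception is raised here.
        list(executor.map(fn, values_list))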
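The same idea should carry over to the p.imap call from the question. imap hands each item of the iterable to the function as a single argument, so the zip(...) attempt delivers one tuple instead of two arguments, and my_function(factor) calls the function immediately instead of passing it. Below is a minimal sketch, assuming a hypothetical stand-in named scale that returns the product instead of pickling it; run_with_partial and pool_cls are also illustrative names, not part of the original code.

```python
from functools import partial
from multiprocessing.pool import ThreadPool  # exposes the same imap API as multiprocessing.Pool


def scale(f, v):
    """Hypothetical stand-in for my_function: returns f * v instead of writing a file."""
    return f * v


def run_with_partial(values, factor, workers=5, pool_cls=ThreadPool):
    # partial() freezes v, leaving a one-argument callable, which is what imap expects.
    fn = partial(scale, v=factor)
    with pool_cls(workers) as p:
        # imap yields results in input order; list() consumes them before the pool closes.
        return list(p.imap(fn, values))


if __name__ == "__main__":
    print(run_with_partial([0, 1, 2, 3, 4, 5, 6], factor=10))
```

A thread pool is used here only to keep the sketch self-contained. Passing a process-based Pool as pool_cls is expected to behave the same way, provided the function lives in an importable module such as my_functions.py, and the result can still be wrapped in tqdm as in the question.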

For more detail, refer to the documentation linked below:

concurrent.futures
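Executor.map in concurrent.futures also accepts several iterables and calls the function with one item from each, so the zip/repeat idea from the question can be written without a tuple. A minimal sketch, again assuming a hypothetical scale function in place of my_function and an illustrative helper name run_with_repeat:

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import repeat


def scale(f, v):
    """Hypothetical stand-in for my_function: returns f * v instead of writing a file."""
    return f * v


def run_with_repeat(values, factor):
    with ThreadPoolExecutor() as executor:
        # map() pairs the iterables element by element, so each call is scale(value, factor).
        # repeat(factor) is infinite; iteration stops when `values` is exhausted.
        return list(executor.map(scale, values, repeat(factor)))


if __name__ == "__main__":
    print(run_with_repeat([1, 2, 3, 4, 5, 6, 7, 8], 10))
```

This avoids functools.partial entirely, at the cost of being specific to the Executor.map signature; Pool.imap takes a single iterable, so it would still need partial or a small wrapper.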

Dharman