You did not say what type of lists a
, b
, c
and d
are. The elements in these lists must be able to be serializable using the pickle
module because they need to be passed to a function that will be executed by a process running in a different address space. For the sake of argument let's assume they are lists of integers of at least length 100.
You also did not state what platform you are running under (Windows? MacOS? Linux?). When you tag a question with multiprocessing
you are supposed to also tag the question with the platform. How you organize your code is somewhat dependent on the platform. In the code below, I have chosen the most efficient arrangement for those platforms that use spawn
to create new processes, namely Windows. But this will also be efficient on MacOS and Linux, which by default use fork
to create new processes. You can research what spawn
and fork
mean in connection with creating new processes. Ultimately to be memory and CPU efficient, you only want as global variables outside of a if __name__ == '__main__':
block those variables which have to be global. This is why I have the declaration of the lists local to a function.
Then using the concurrent.futures
module we have:
from concurrent.futures import ProcessPoolExecutor
def searching_algorithm(a, b, c, d):
...
return a * b * c * d
def main():
# We assume a, b, c and d each have 100 or more elements:
a = list(range(1, 101))
b = list(range(2, 102))
c = list(range(3, 103))
d = list(range(4, 104))
# Use all CPU cores:
with ProcessPoolExecutor() as executor:
result = list(executor.map(searching_algorithm, a[0:100], b[0:100], c[0:100], d[0:100]))
print(result[0], result[-1])
# Required for Windows:
if __name__ == '__main__':
main()
Prints:
24 106110600
To use the multiprocessing
module instead:
from multiprocessing import Pool
def searching_algorithm(a, b, c, d):
...
return a * b * c * d
def main():
# We assume a, b, c and d each have 100 or more elements:
a = list(range(1, 101))
b = list(range(2, 102))
c = list(range(3, 103))
d = list(range(4, 104))
# Use all CPU cores:
with Pool() as pool:
result = pool.starmap(searching_algorithm, zip(a[0:100], b[0:100], c[0:100], d[0:100]))
print(result[0], result[-1])
# Required for Windows:
if __name__ == '__main__':
main()
In both coding examples if the lists a
, b
, c
and d
contain exactly 100 elements, then there is no need to take slices of them such as a[0:100]
; just pass the lists themselves, e.g.:
result = list(executor.map(searching_algorithm, a, b, c, d))