
I have the following code:

a=[3.5,7.6,8.6,1.3]
y=[0,0,0,0]
x=12

for i in range(4):
    y[i]=x**2*a[i]

print(y[0],a[0])
print(y[1],a[1])
print(y[2],a[2])
print(y[3],a[3])

I have a multi-core processor and I would like to distribute the calculations of the equation y[i]=x**2*a[i] for each parameter a[i] to a separate processor.

How can I do parallel calculations for this example in Python?
I can easily do this in Wolfram Mathematica using ParallelTable, but I don't know how to do it in Python.

Mam Mam
  • This is gonna be *slower* because creating threads/processes is expensive. You need far more work for this to be useful. In fact, CPython multiprocessing will be much slower for such a task and threading useless because of the GIL. Why do you want to do parallel computation in this case? – Jérôme Richard Feb 02 '23 at 22:01
  • @Jérôme Richard, thanks for the comment! I would like to understand how to do process parallelization, because in my real task I need to solve a certain sequence of calculations for a large number of parameter values. – Mam Mam Feb 03 '23 at 07:08
  • For large inputs, you need to check how computationally intensive the task is compared to the input size because transferring data is slow between processes in Python (especially if the output is big too). For such operations on arrays, please consider using Numpy instead. – Jérôme Richard Feb 03 '23 at 09:15
  • @Jérôme Richard, my task contains modules, numpy, scipy and sympy. Could you show how to do parallel calculations in numpy? – Mam Mam Feb 03 '23 at 10:12
  • Numpy is meant to be fast on native data types (ie. integer, floats and fixed-size bounded strings). For generic objects coming from other modules, it will not help. It is not clear what your code does. Note that Numpy does not use multiple cores (except for some matrix operations), but it can be fast due to vectorization on native types (thanks to vectorization and mainly SIMD instructions). – Jérôme Richard Feb 03 '23 at 10:17
  • @Jérôme Richard, thanks! In this question, I have given the complete code of my task, could you please take a look https://stackoverflow.com/questions/75336850/is-it-possible-to-solve-this-problem-in-parallel-for-several-parameter-values-in – Mam Mam Feb 03 '23 at 14:08

1 Answer


The easiest option is to use numpy instead, which is far more efficient here because its vectorized array operations are implemented in C.
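For the arrays in the question, the whole loop collapses into a single vectorized expression:

```python
import numpy as np

a = [3.5, 7.6, 8.6, 1.3]
x = 12

# One vectorized operation replaces the Python-level loop:
# numpy multiplies x**2 into every element of `a` in C.
y = x ** 2 * np.array(a)
print(y)  # close to [504., 1094.4, 1238.4, 187.2]
```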

But if you need to explicitly use multiple cores, then here is another approach:

This is a CPU-bound task, so you should use the multiprocessing module (threads won't help because of the GIL). For an input this small, the overhead of starting processes dominates and it will actually be slower, but on larger arrays it can give a significant speedup, at the cost of extra system resources.

import multiprocessing as mp

x = 12

def your_func(elem):
    # y[i] = x**2 * a[i], as in the question
    return x ** 2 * elem

if __name__ == '__main__':
    a = [1, 2, 4, 5, 6]
    NUMBER_OF_CORES = 4
    # The __main__ guard is required (especially on Windows/macOS, which
    # spawn workers by re-importing this module); the context manager
    # closes the pool when done.
    with mp.Pool(NUMBER_OF_CORES) as pool:
        print(pool.map(your_func, a))
        # [144, 288, 576, 720, 864]
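As a sketch of an alternative with the same pool pattern, the standard-library concurrent.futures module can be used instead of multiprocessing (the function and parameter names here are illustrative):

```python
from concurrent.futures import ProcessPoolExecutor

x = 12

def your_func(elem):
    # same y[i] = x**2 * a[i] computation as in the question
    return x ** 2 * elem

if __name__ == "__main__":
    a = [3.5, 7.6, 8.6, 1.3]
    # executor.map distributes the calls across worker processes
    # and yields the results in input order.
    with ProcessPoolExecutor(max_workers=4) as executor:
        y = list(executor.map(your_func, a))
    print(y)
```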
vanerk
  • Thank you very much! Could you show how to do it using numpy too, please? – Mam Mam Feb 02 '23 at 21:51
  • **Error occurs when running the code:** RuntimeError: An attempt has been made to start a new process before the current process has finished its bootstrapping phase. This probably means that you are not using fork to start your child processes and you have forgotten to use the proper idiom in the main module: if __name__ == '__main__': freeze_support() ... The "freeze_support()" line can be omitted if the program is not going to be frozen to produce an executable. – Mam Mam Feb 02 '23 at 22:30