I have to evaluate every element of a matrix using a function that involves a numerical integral (scipy.integrate.quad). The elements of the matrix are the pixels of a 5202x3465 grayscale image.
I have access to a GPU and I would like to evaluate as many elements as possible in parallel, because right now, with sequential code, the entire computation takes more than 24 hours.
Here is the sample code:
from scipy.integrate import quad

def myFunc(constant_args, i, j):
    new_pixel = quad(integrand, constant_args, i, j)
    # ... other calculations ...
    return new_pixel

for i in range(0, rows):
    for j in range(0, columns):
        img[i, j] = myFunc(constant_args, i, j)
I tried to use multiprocessing (as mp) like this:
arows = list(range(0, rows))
acolumns = list(range(0, columns))
with mp.Pool() as pool:
    img = pool.map(myFunc, (constant_args, arows, acolumns))
or with img = pool.map(myFunc(constant_args), (arows, acolumns))
but it gives me:
TypeError: myFunc() missing 2 required positional arguments: 'j' and 'i'
I could not work out how this is supposed to work from other examples, and I am not familiar with the terminology used in the documentation.
I only want to divide that nested loop among parallel workers. If someone can suggest a different approach, I'm all ears.
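To make the goal concrete, here is a rough sketch of the kind of split I have in mind, with one task per image row. It is only an illustration under assumptions: the integrand, the integration limits 0 to 1, and the helper names my_func, compute_row and compute_image are hypothetical stand-ins, not my real code. It also runs on CPU processes rather than on the GPU. As far as I can tell, pool.map calls the function with exactly one argument per item, so the constant arguments are bound beforehand with functools.partial.

```python
import multiprocessing as mp
from functools import partial

import numpy as np
from scipy.integrate import quad


def integrand(x, scale, i, j):
    # Hypothetical integrand; it only stands in for the real one.
    return scale * (i + j) * x


def my_func(constant_args, i, j):
    # quad returns (value, error_estimate); keep only the value.
    # The limits 0.0 and 1.0 are assumed for illustration.
    value, _ = quad(integrand, 0.0, 1.0, args=(constant_args, i, j))
    return value


def compute_row(constant_args, columns, i):
    # One task per row, so each worker call does a sizeable chunk of work.
    return [my_func(constant_args, i, j) for j in range(columns)]


def compute_image(constant_args, rows, columns, processes=None):
    # Bind the arguments that are the same for every row.
    worker = partial(compute_row, constant_args, columns)
    with mp.Pool(processes) as pool:
        # pool.map passes a single argument (the row index) per call
        # and returns the rows in their original order.
        row_values = pool.map(worker, range(rows))
    return np.array(row_values)


if __name__ == "__main__":
    img = compute_image(2.0, rows=4, columns=3)
    print(img.shape)
```

The worker functions are defined at module level so that they can be pickled and sent to the worker processes, and the pool is created under the `if __name__ == "__main__":` guard, which is needed on platforms that start workers by spawning a fresh interpreter.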
P.S. I tried Numba, but it raises errors when interacting with some SciPy functions.
Thank you in advance for your help!