
I'm applying an integration function using `scipy.integrate` to two 2D arrays. Here's an example:

from scipy import integrate
import numpy as np

def integrate_lno2(top, bottom, peak_height, peak_width):
    # integrate a Gaussian-shaped integrand between the two limits
    return integrate.quad(lambda x: np.exp(-np.power(x - peak_height, 2)/(2*np.power(peak_width, 2))), top, bottom)[0]

# change row and col to test speed
row = 100; col = 100; peak_height=300; peak_width=60

top = np.linspace(100, 200, row*col).reshape(row, col)
bottom = np.linspace(800, 900, row*col).reshape(row, col)

res = np.zeros((row, col))

for i in range(row):
    for j in range(col):
        res[i, j] = integrate_lno2(top[i, j], bottom[i, j], peak_height, peak_width)

If the shape of the 2D arrays increases, the for loop becomes slow. I have found the Numba integrand example; however, it doesn't accept upper and lower limits.

zxdawn
  • Did you look for an equivalent module, if one exists, in the scikit-learn library? If so, it might be faster for huge data; I briefly explain why in a section of [my previous answer (4th-5th paragraphs)](https://stackoverflow.com/a/71332351/13394817). – Ali_Sh May 16 '22 at 18:53
  • @Ali_Sh Thanks. So, could you point out which module in the scikit-learn library is equivalent? And how to apply it in detail? – zxdawn May 16 '22 at 20:07
  • I am not sure whether there is an equivalent module in scikit-learn (I did not find one in my quick search). I just mentioned it for further searching, in case it matters; the probable causes of SciPy's slowness on huge data are in the link. – Ali_Sh May 17 '22 at 11:50

1 Answer


Like in this previous answer, you can use Numba to speed up the lambda calls, which are very slow due to big NumPy overheads (NumPy is not optimized to operate on scalars and is very slow at doing so). Even better: you can tell Numba to generate a C function that can be called directly from SciPy with very small overhead (since this almost completely removes the overhead of the slow CPython interpreter). You can also pre-compute the division by a variable and convert it to a multiplication (which is faster).

Here is the resulting code:

import numba as nb
import numpy as np
from scipy import integrate, LowLevelCallable

# change row and col to test speed
row = 100; col = 100; peak_height = 300; peak_width = 60

# pre-compute the constant factor so the division becomes a multiplication
# (the constants must be defined before the Numba function so it can capture them)
factor = -1.0 / (2 * np.power(peak_width, 2))

@nb.cfunc('float64(float64)')
def compute_numba(x):
    return np.exp(np.power(x - peak_height, 2) * factor)

# wrap the compiled C function so SciPy can call it with minimal overhead
compute_c = LowLevelCallable(compute_numba.ctypes)

def integrate_lno2(top, bottom):
    return integrate.quad(compute_c, top, bottom)[0]

top = np.linspace(100, 200, row*col).reshape(row, col)
bottom = np.linspace(800, 900, row*col).reshape(row, col)

res = np.zeros((row, col))

for i in range(row):
    for j in range(col):
        res[i, j] = integrate_lno2(top[i, j], bottom[i, j])

The computing loop is roughly 100 times faster on my machine.
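For reference, here is a minimal sketch of how one might verify the speedup and the results on their own machine. It is not part of the original answer: it assumes the snippet above has already been run (so `integrate`, `np`, `row`, `col`, `top`, `bottom`, `res`, and `integrate_lno2` are in scope), and `integrate_lno2_scipy` is just a hypothetical name for the question's original lambda-based function.

import time

# hypothetical reference: the question's original lambda-based version
def integrate_lno2_scipy(top, bottom, peak_height, peak_width):
    return integrate.quad(lambda x: np.exp(-np.power(x - peak_height, 2)
                                           / (2 * np.power(peak_width, 2))), top, bottom)[0]

start = time.perf_counter()
res_ref = np.zeros((row, col))
for i in range(row):
    for j in range(col):
        res_ref[i, j] = integrate_lno2_scipy(top[i, j], bottom[i, j], peak_height, peak_width)
print(f"pure SciPy: {time.perf_counter() - start:.3f} s")

start = time.perf_counter()
for i in range(row):
    for j in range(col):
        res[i, j] = integrate_lno2(top[i, j], bottom[i, j])
print(f"LowLevelCallable: {time.perf_counter() - start:.3f} s")

# both versions should agree to floating-point tolerance
assert np.allclose(res, res_ref)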

Jérôme Richard
  • Thanks a lot! It works well. I suppose `peak_height, peak_width` can be deleted from the input of `integrate_lno2`. – zxdawn May 16 '22 at 20:35
  • Yes. Note that for Numba to be able to use the constants, they should be defined before the function. I edited the answer to take that into account. – Jérôme Richard May 16 '22 at 20:42