Reducing redundancy for calculating large number of integrals numerically

Question

I need to calculate the following integral on a 2D-grid (x,y positions):

integral

with r = sqrt(x^2 + y^2) and the 2D-grid centered at x=y=0. The implementation is straightforward:

import numpy as np
from scipy import integrate

def integralFunction(x):
    def squareSaturation(y):
        return np.sqrt(1-np.exp(-y**2))
    return integrate.quad(squareSaturation,0,x)[0]

#vectorize function to apply function with integrals on np-array
integralFunctionVec = np.vectorize(integralFunction)

xmax = ymax = 5
Nx = Ny = 1024

X, Y = np.linspace(-xmax, xmax, Nx), np.linspace(-ymax, ymax, Ny)
X, Y = np.meshgrid(X, Y)

R = np.sqrt(X**2+Y**2)
Z = integralFunctionVec(R)

However, I'm currently working on a 1024x1024 grid and the calculation takes ~1.5 minutes. Now there is some redundancy in those calculations that I want to reduce to speed up the calculation. Namely:

As the grid is centered around r = 0, many values for r on the grid are the same. Due to symmetry only ~1/8 of all values are unique (for a square grid). One idea was to calculate the integral only for the unique values (found via np.unique) and then save them in a look-up table (hashmap?) Or I could cache the function values so that only new values are calculated (via @lru_cache). But does that actually work when I vectorize the function afterwards?
As the integral goes from 0 to r, the integral is often calculating integrals over intervals it has already calculated. E.g. if you calculate from 0 to 1 and afterwards from 0 to 2, only the interval from 1 to 2 is "new". But what would be the best way to utilize that? And would that even be a real performance boost using scipy.integrate.quad?

Do you have any feedback or other ideas to optimize this calculation?

Jérôme Richard · Accepted Answer · 2022-02-25T12:01:52.687

You can use Numba to speed up the computation of quad. Here is an example:

import numpy as np
import numba as nb
from scipy import integrate

@nb.cfunc('float64(float64)')
def numbaSquareSaturation(y):
    return np.sqrt(1-np.exp(-y**2))

squareSaturation = scipy.LowLevelCallable(numbaSquareSaturation.ctypes)

def integralFunction(x):
    return integrate.quad(squareSaturation,0,x)[0]

integralFunctionVec = np.vectorize(integralFunction)

xmax = ymax = 5
Nx = Ny = 1024

X, Y = np.linspace(-xmax, xmax, Nx), np.linspace(-ymax, ymax, Ny)
X, Y = np.meshgrid(X, Y)

R = np.sqrt(X**2+Y**2)
Z = integralFunctionVec(R)

This is about 25 times faster on my machine. The code is still suboptimal since squareSaturation calls introduces a big overhead but is seems SciPy does not provide a way to vectorize quad efficiently for your case. Note that using nb.cfunc+scipy.LowLevelCallable significantly speed up the execution as pointed out by @max9111.

As the grid is centered around r = 0, many values for r on the grid are the same. Due to symmetry only ~1/8 of all values are unique (for a square grid). One idea was to calculate the integral only for the unique values (found via np.unique) and then save them in a look-up table (hashmap?) Or I could cache the function values so that only new values are calculated (via @lru_cache). But does that actually work when I vectorize the function afterwards?

I do not expect this approach to be significantly faster although not recomputing the values is indeed a good idea. Note that hashmap are pretty slow as well as np.unique. I suggest you to just select the quarter of the input array R. Something like R[0:R.shape[0]//2, 0:R.shape[1]//2]. Be careful if the shape is odd.

As the integral goes from 0 to r, the integral is often calculating integrals over intervals it has already calculated. E.g. if you calculate from 0 to 1 and afterwards from 0 to 2, only the interval from 1 to 2 is "new". But what would be the best way to utilize that? And would that even be a real performance boost using scipy.integrate.quad?

This could help since the domain of a integral is smaller and the function should be smoother. This means Scipy should be faster to compute it. Even if it would not do that automatically, you can reduce the precision of the computed sub-intervals using optional parameters of quad.

If you use a LowLevelCallable you can get another factor of two by avoiding callbacks to the interpreter. (Decorating squareSaturation with `@nb.cfunc('float64(float64)')` and `squareSaturation=scipy.LowLevelCallable(squareSaturation.ctypes)` — max9111, Feb 25 '22 at 10:02
Very good improvement indeed! I did not know `scipy.LowLevelCallable`. Thank you. — Jérôme Richard, Feb 25 '22 at 12:04
It is quite straight forward if there are no additional arguments, for scalar arguments it's getting a bit more complicated: https://stackoverflow.com/a/61089817/4045774 and even more for arrays: https://stackoverflow.com/a/58561573/4045774 — max9111, Feb 25 '22 at 12:43

Reducing redundancy for calculating large number of integrals numerically

1 Answers1

Linked