
I need to numerically solve a convolution. I would like to feed an array of points at which the convolution should be evaluated into scipy.integrate.quad in one go:

# Python 2.7.12
import numpy as np
from scipy import integrate


def gauss(x, sig, mu):
    return(1 / np.sqrt(2*np.pi*sig**2) * np.exp(-(x-mu)**2/(2.*sig**2)))


def PDF_log(x, sig, mu):
    mu = np.log(mu)
    x = np.asarray(x)  # get nans instead of errors for 1/x
    a = 1/(x * sig * np.sqrt(2*np.pi)) * np.exp(-(np.log(x)-mu)**2/(2*sig**2))
    # map the NaNs produced for negative 'x' back to 0., as the function is
    # only defined for positive values.
    a = np.nan_to_num(a)
    return a


def gauss_conv(t, x, sig, mu, sig0, mu0):
    a = PDF_log(t, sig, mu) * gauss(x-t, sig0, mu0)
    return a


def gauss_log_num(x, sig, mu, sig0, mu0):
    return integrate.quad(
        gauss_conv,
        a=-np.inf, b=np.inf, args=(x, sig, mu, sig0, mu0)
        )


mu = 0.3
sig = 0.12
mu0 = 0.
sig0 = 0.05

x = np.array([
    0.06581838,
    0.11165416,
    0.15748993,
    0.20332571,
    0.24916149,
    0.29499726,
    0.34083304,
    0.38666882,
    0.43250459,
    0.47834037,
    0.52417615,
    ])

print(gauss_log_num(x, sig, mu, sig0, mu0))

this raises:

Traceback (most recent call last):
  File "num_gauss_lognormal.py", line 327, in <module>
    test()
  File "num_gauss_lognormal.py", line 325, in test
    print(gauss_log_num(x, sig, mu, sig0, mu0))
  File "num_gauss_lognormal.py", line 163, in gauss_log_num
    return ( integrate.quad(gauss_conv, a = -np.inf, b = np.inf, args=(x,sig,mu,sig0,mu0)) )
  File "/usr/local/lib/python2.7/dist-packages/scipy/integrate/quadpack.py", line 323, in quad
    points)
  File "/usr/local/lib/python2.7/dist-packages/scipy/integrate/quadpack.py", line 390, in _quad
    return _quadpack._qagie(func,bound,infbounds,args,full_output,epsabs,epsrel,limit)
TypeError: only length-1 arrays can be converted to Python scalars

If I only evaluate x at a single position, e.g. x[0], the convolution works and I get a value. Obviously I could now run a for loop over x, but this feels like the slowest possible way of doing it. What do I have to do to evaluate the convolution at every value of x in one go?

  • Please post the Traceback. – wwii Nov 10 '17 at 17:43
  • Might not be able to, ...scipy.integrate.quad uses `a technique from the Fortran library QUADPACK` and `QUADPACK is a FORTRAN 77 library for numerical integration of one-dimensional functions.` You might need to roll your own if you want a *vectorized* solution. How fine a granularity do you need? – wwii Nov 10 '17 at 18:13
  • Does it need to be an indefinite integral? -infinty to +infinity? – wwii Nov 10 '17 at 18:31
  • `a = PDF_log(t,sig,mu) ...` is this correct? You aren't using your input vector in `PDF_log()`? – wwii Nov 10 '17 at 18:47
  • People ask for a vectorized form of `quad` about once a month. There isn't one, there cannot be one. An adaptive routine cannot be vectorized. See [NumPy vectorization with integration](https://stackoverflow.com/q/41223186#41226624), for example. –  Nov 11 '17 at 18:21
  • @6'whitemale thank you, I had overlooked that post. However, I am wondering: when I try `scipy.integrate.simps` I need to create an array, i.e. I need to feed a sample range of `t` - which needs to be the same length as `x` (does it?). As a consequence I only evaluate the integral at certain points - i.e. the advantage of numerical integration over a discrete evaluation as in `scipy.signal.convolve` is lost - am I right? – Sebastiano1991 Nov 16 '17 at 10:35
  • Yes, integration of sampled data (trapz, simps, romb) is not going to be as accurate as `quad`, which strategically chooses evaluation points based on the function itself. I ran some [experiments](https://calculus7.org/2017/01/07/vectorization-of-integration/) which show that using `quad` in a for loop is a clear winner over vectorized non-adaptive methods (a minimal sketch of that approach follows these comments). –  Nov 16 '17 at 15:14
  • The above code contains syntax errors. Please fix. – Nico Schlömer Nov 17 '17 at 08:55
  • Never mind now, I did it myself. – Nico Schlömer Nov 17 '17 at 14:09
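
For reference, here is a minimal sketch (not part of the original question) of the loop-based approach recommended in the comments: call integrate.quad once per element of x, reusing gauss_conv from the code above. The helper name gauss_log_num_loop is made up for illustration.

def gauss_log_num_loop(x, sig, mu, sig0, mu0):
    # evaluate the convolution integral separately for each point in 'x';
    # quad() returns (value, abserr), so [0] keeps only the value
    return np.array([
        integrate.quad(gauss_conv, -np.inf, np.inf,
                       args=(xi, sig, mu, sig0, mu0))[0]
        for xi in np.asarray(x).ravel()
        ])

# e.g. print(gauss_log_num_loop(x, sig, mu, sig0, mu0))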

1 Answer


One way to perform the calculation across all of the elements in x is to wrap the scalar version with np.vectorize:

from scipy import integrate
import numpy as np


def gauss(x, sig, mu):
    return 1/np.sqrt(2*np.pi*sig**2) * np.exp(-(x-mu)**2/(2.*sig**2))


def PDF_log(x, sig, mu):
    mu = np.log(mu)
    x = np.asarray(x)  # get nans instead of errors for 1/x
    a = 1/(x * sig * np.sqrt(2*np.pi)) * np.exp(-(np.log(x)-mu)**2/(2*sig**2))
    # map the NaNs produced for negative 'x' back to 0., as the function is
    # only defined for positive values.
    a = np.nan_to_num(a)
    return a


def gauss_conv(t, x, sig, mu, sig0, mu0):
    return PDF_log(t, sig, mu) * gauss(x-t, sig0, mu0)


def gauss_log_num(x, sig, mu, sig0, mu0):
    return integrate.quad(
        gauss_conv,
        a=-np.inf, b=np.inf, args=(x, sig, mu, sig0, mu0)
        )


x = np.array([[
    0.06581838, 0.11165416, 0.15748993, 0.20332571, 0.24916149, 0.29499726,
    0.34083304, 0.38666882, 0.43250459, 0.47834037, 0.52417615,
    ]])

mu = 0.3
sig = 0.12
mu0 = 0.
sig0 = 0.05

# wrap the scalar-argument version so np.vectorize can broadcast it over x
convolver = lambda t: gauss_log_num(t, sig, mu, sig0, mu0)
vfunc = np.vectorize(convolver)
ans = vfunc(x)

Which returns:

print ans
(array([[  2.64327555e-03,   4.42748593e-02,   3.87454290e-01,
          1.80492291e+00,   4.57171773e+00,   6.44923191e+00,
          5.20617751e+00,   2.47941776e+00,   7.20733704e-01,
          1.32773639e-01,   1.61483270e-02]]), array([[  8.75523521e-09,   3.90932482e-09,   9.90265796e-09,
          6.87900177e-09,   9.93674832e-10,   5.72760020e-08,
          3.17433287e-09,   2.29346039e-10,   6.21327924e-09,
          5.81321976e-09,   1.33339787e-08]]))

See the docs for np.vectorize.
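
As an aside (not part of the original answer), np.vectorize can also wrap gauss_log_num directly; its excluded argument keeps the scalar parameters out of the broadcasting, so the lambda is not needed:

# alternative wiring (assumption, not from the answer): fix the scalar
# parameters via 'excluded' and broadcast only over x
vfunc = np.vectorize(gauss_log_num, excluded=['sig', 'mu', 'sig0', 'mu0'])
ans = vfunc(x, sig=sig, mu=mu, sig0=sig0, mu0=mu0)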

  • posted what `ans` returns – dubbbdan Nov 10 '17 at 18:51
  • What are the advantages of this over a for loop? – wwii Nov 10 '17 at 18:55
  • @wwii From the docs: "The vectorize function is provided primarily for convenience, not for performance. The implementation is essentially a for loop." – dubbbdan Nov 10 '17 at 19:41
  • Right, so the remark about scalability doesn't apply. The `vectorize` method is misleadingly named; its correct name would be `pretend_it_is_vectorized`. –  Nov 11 '17 at 18:22