0
import scipy.stats as st
import numpy as np  # generic math functions


# https://scicomp.stackexchange.com/q/1658
class LorentzGen(st.rv_continuous):
    """Lorentz distribution"""

    def _pdf(self, x):
        gamma = 0.27
        return 2 * gamma / (np.pi * (gamma ** 2 + x ** 2))


transverse_fields = LorentzGen(a=0)
gaussian_gen = st.norm()

L = 2

list_of_temps = np.linspace(1, 10, 40)

for T, temp in enumerate(list_of_temps):
    print(f"Run {T}")
    for t in range(5000):
        if t%500==0:
            print(f"Trial {t}")
        h_x = [[-transverse_fields.rvs(), xx] for xx in range(L)]  # OverflowError: (34, 'Result too large')
        # h_y = [[-gaussian_gen.rvs(), xx] for xx in range(L)]  # Works

In the above code, I have implemented my own probability distribution (essentially a half Lorentzian, x∈[0,∞]), modified from the answer from scicomp.SE, which I call transverse_fields.

I need to generate a whole bunch of values from this transverse_fields, using them in a nested For loop. The issue is that beyond a certain number of runs, here "Run 1 Trial ~3500", I get a bunch of errors:

C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\quadpack.py:385: IntegrationWarning: The integral is probably divergent, or slowly convergent.
  warnings.warn(msg, IntegrationWarning)
C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py:2831: RuntimeWarning: overflow encountered in ? (vectorized)
  outputs = ufunc(*inputs)
Traceback (most recent call last):
  File "C:/<redacted>/stackoverflow.py", line 26, in <module>
    h_x = [[-transverse_fields.rvs(), xx] for xx in range(L)]  # Result Overflow
  File "C:/<redacted>/stackoverflow.py", line 26, in <listcomp>
    h_x = [[-transverse_fields.rvs(), xx] for xx in range(L)]  # Result Overflow
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 954, in rvs
    vals = self._rvs(*args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 889, in _rvs
    Y = self._ppf(U, *args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 902, in _ppf
    return self._ppfvec(q, *args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2755, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2831, in _vectorize_call
    outputs = ufunc(*inputs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1587, in _ppf_single
    while self._ppf_to_solve(right, q, *args) < 0.:
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1569, in _ppf_to_solve
    return self.cdf(*(x, )+args)-q
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1745, in cdf
    place(output, cond, self._cdf(*goodargs))
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1621, in _cdf
    return self._cdfvec(x, *args)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2755, in __call__
    return self._vectorize_call(func=func, args=vargs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2831, in _vectorize_call
    outputs = ufunc(*inputs)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\stats\_distn_infrastructure.py", line 1618, in _cdf_single
    return integrate.quad(self._pdf, self.a, x, args=args)[0]
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\quadpack.py", line 341, in quad
    points)
  File "C:\ProgramData\Anaconda3\lib\site-packages\scipy\integrate\quadpack.py", line 448, in _quad
    return _quadpack._qagse(func,a,b,args,full_output,epsabs,epsrel,limit)
  File "C:/<redacted>/stackoverflow.py", line 11, in _pdf
    return 2 * gamma / (np.pi * (gamma ** 2 + x ** 2))
OverflowError: (34, 'Result too large')

Process finished with exit code 1

Note that the error does not occur if I bump the number of trials, here t to a smaller number like 50, nor does it occur if list_of_temps has less values, e.g. np.linspace(1,10,4). Even though in the original problem, with np.linspace(1,10,40), the error popped up during Run 1.

With the original setup, there is also no overflow error when I use the standard Gaussian distribution function from scipy.stats.

Similar issues on SO that I've seen attribute this to the range over which the for loop is run is too big. But I don't quite see that here? And I don't quite understand how to implement Decimal as suggested in the answer in the linked question, in any case.

How can I fix this?

I'm running Python 3.6.5 with Anaconda on 64bit Windows 10.

Troy
  • 518
  • 5
  • 11

1 Answers1

0

It seems to be an issue with the way I'm declaring my probability distribution with transverse_fields/LorentzGen.

My solution was to use the in-built cauchy distribution in scipy.stats, with a modified scale. Also since I wanted a half-Lorentzian, I just took the absolute np.abs(...) when drawing a random number from transverse_fields.

import scipy.stats as st
import numpy as np  # generic math functions

transverse_fields = st.cauchy(scale=0.27)
L = 2

list_of_temps = np.linspace(1, 10, 40)

for T, temp in enumerate(list_of_temps):
    print(f"Run {T}")
    for t in range(5000):
        if t%500==0:
            print(f"Trial {t}")
        h_x = [[-np.abs(transverse_fields.rvs()), xx] for xx in range(L)]  # Now works

This is satisfactory enough for me now, but I would still appreciate someone explaining why my way of subclassing rv_continuous gave me the aforementioned errors.

Troy
  • 518
  • 5
  • 11