
I am trying to fit some data to a curve in Python using scipy.optimize.curve_fit. I am running into the error ValueError: array must not contain infs or NaNs.

I don't believe either my x or y data contain infs or NaNs:

>>> x_array = np.asarray_chkfinite(x_array)
>>> y_array = np.asarray_chkfinite(y_array)
>>>

To give some idea of what my x_array and y_array look like at either end (x_array is counts and y_array is quantiles):

>>> type(x_array)
<type 'numpy.ndarray'>
>>> type(y_array)
<type 'numpy.ndarray'>
>>> x_array[:5]
array([0, 0, 0, 0, 0])
>>> x_array[-5:]
array([2919, 2965, 3154, 3218, 3461])
>>> y_array[:5]
array([ 0.9999582,  0.9999163,  0.9998745,  0.9998326,  0.9997908])
>>> y_array[-5:]
array([  1.67399000e-04,   1.25549300e-04,   8.36995200e-05,
     4.18497600e-05,  -2.22044600e-16])

And my function:

>>> def func(x,alpha,beta,b):
...    return ((x/1)**(-alpha) * ((x+1*b)/(1+1*b))**(alpha-beta))
...

Which I am executing with:

>>> popt, pcov = curve_fit(func, x_array, y_array)

resulting in the error stack trace:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/python2.7/dist-packages/scipy/optimize/minpack.py", line 426, in curve_fit
    res = leastsq(func, p0, args=args, full_output=1, **kw)
  File "/usr/lib/python2.7/dist-packages/scipy/optimize/minpack.py", line 338, in leastsq
    cov_x = inv(dot(transpose(R),R))
  File "/usr/lib/python2.7/dist-packages/scipy/linalg/basic.py", line 285, in inv
    a1 = asarray_chkfinite(a)
  File "/usr/lib/python2.7/dist-packages/numpy/lib/function_base.py", line 590, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs

I'm guessing the error might not be with respect to my arrays, but rather an array created by scipy in an intermediate step? I've had a bit of a dig through the relevant scipy source files, but things get hairy pretty quickly when debugging the problem that way. Is there something obvious I'm doing wrong here? I've seen it casually mentioned in other questions that certain initial parameter guesses (of which I currently don't supply any explicitly) can sometimes result in these kinds of errors, but even if that is the case, it would be good to know a) why that is and b) how to avoid it.

Bryce Thomas

3 Answers


Why it is failing

It is not your input arrays that contain nans or infs; rather, evaluating your objective function at some x points, for some values of the parameters, produces nans or infs. In other words, the array of values func(x, alpha, beta, b) contains nans or infs for some x, alpha, beta and b encountered during the optimization routine.

scipy.optimize's curve fitting uses the Levenberg-Marquardt algorithm, also called damped least squares. It is an iterative procedure: a new estimate of the optimal parameters is computed at each iteration. At some point during the optimization, the algorithm explores a region of parameter space where your function is not defined.

How to fix

1/Initial guess

The initial guess for the parameters is decisive for convergence. If the initial guess is far from the optimal solution, you are more likely to explore regions where the objective function is undefined. So if you have a better idea of what your optimal parameters are, feeding the algorithm that initial guess may avoid the error, as in the sketch below.
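
For instance (a minimal sketch; p0 here is an arbitrary assumption, not a value derived from the data, and func, x_array and y_array are those from the question):

from scipy.optimize import curve_fit

# Hypothetical starting point for (alpha, beta, b) -- replace with values
# that are plausible for your data and keep func defined over its domain.
p0 = [1.0, 2.0, 0.5]
popt, pcov = curve_fit(func, x_array, y_array, p0=p0)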

2/Model

Also, you could modify your model so that it does not return nans. For those parameter values params where the original function func is not defined, you want the objective function to take huge values, or in other words you want func(params) to be far from the Y values to be fitted.

So at points where your objective function is not defined, you may return a big float, for instance AVG(Y)*10e5 with AVG the average of the Y values (so that you are sure to be much bigger than the Y values to be fitted).
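
One possible sketch of such a guarded model (the penalty constant and the use of np.where are my own assumptions about how to realize the suggestion above; x_array and y_array are those from the question):

import numpy as np
from scipy.optimize import curve_fit

PENALTY = np.mean(y_array) * 10e5   # "huge" relative to the Y values

def safe_func(x, alpha, beta, b):
    # Evaluate the original model while silencing divide/invalid warnings,
    # then replace any nan/inf entries with the large penalty value.
    with np.errstate(all='ignore'):
        y = (x/1)**(-alpha) * ((x + 1*b)/(1 + 1*b))**(alpha - beta)
    return np.where(np.isfinite(y), y, PENALTY)

popt, pcov = curve_fit(safe_func, x_array, y_array)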

Link

You could have a look at this post: Fitting data to an equation in python vs gnuplot

kiriloff
  • The same argument applies to scipy.optimize.minimize. Setting a realistic initial guess fixed it for me. – Rexcirus Apr 05 '22 at 22:13

Your function has a negative power, x^(-alpha), which is the same as (1/x)^alpha. If x is ever 0 your function will return inf, and your curve fit operation will break. I'm surprised a warning/error isn't thrown earlier informing you of the divide by zero.
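
You can check this directly by evaluating the question's func at the leading zeros of x_array; the parameter values below are arbitrary, chosen just to trigger the evaluation:

import numpy as np

def func(x, alpha, beta, b):
    return (x/1)**(-alpha) * ((x + 1*b)/(1 + 1*b))**(alpha - beta)

x = np.asarray([0, 0, 0, 2919, 2965], dtype=float)
print(func(x, 1.0, 2.0, 0.5))   # the first three entries are inf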

BTW why are you multiplying and dividing by 1?

Adam Cadien
  • Interesting. So is there a way around this? Regarding why I'm multiplying/dividing by 1, 1 just happens to be the known value of another parameter not shown. You're right that it could be removed without affecting the answer, it's just a cue for myself. – Bryce Thomas Jan 23 '13 at 06:02

I was able to reproduce this error in Python 2.7 like so:

from sklearn.decomposition import FastICA
X = load_data.load("stuff")    #this sets X to a 2d numpy array containing 
                               #large positive and negative numbers.
ica = FastICA(whiten=False)

print(np.isnan(X).any())   #this prints False
print(np.isinf(X).any())   #this prints False

ica.fit(X)                 #this produces the error:

This always produces the error:

/usr/lib64/python2.7/site-packages/sklearn/decomposition/fastica_.py:58: RuntimeWarning: invalid value encountered in sqrt
  return np.dot(np.dot(u * (1. / np.sqrt(s)), u.T), W)
Traceback (most recent call last):
  File "main.py", line 43, in <module>
    ica()
  File "main.py", line 18, in ica
    ica.fit(X)
  File "/usr/lib64/python2.7/site-packages/sklearn/decomposition/fastica_.py", line 523, in fit
    self._fit(X, compute_sources=False)
  File "/usr/lib64/python2.7/site-packages/sklearn/decomposition/fastica_.py", line 479, in _fit
    compute_sources=compute_sources, return_n_iter=True)
  File "/usr/lib64/python2.7/site-packages/sklearn/decomposition/fastica_.py", line 335, in fastica
    W, n_iter = _ica_par(X1, **kwargs)
  File "/usr/lib64/python2.7/site-packages/sklearn/decomposition/fastica_.py", line 108, in _ica_par
    - g_wtx[:, np.newaxis] * W)
  File "/usr/lib64/python2.7/site-packages/sklearn/decomposition/fastica_.py", line 55, in _sym_decorrelation
    s, u = linalg.eigh(np.dot(W, W.T))
  File "/usr/lib64/python2.7/site-packages/scipy/linalg/decomp.py", line 297, in eigh
    a1 = asarray_chkfinite(a)
  File "/usr/lib64/python2.7/site-packages/numpy/lib/function_base.py", line 613, in asarray_chkfinite
    "array must not contain infs or NaNs")
ValueError: array must not contain infs or NaNs

Solution:

from sklearn.decomposition import FastICA
X = load_data.load("stuff")    #this sets X to a 2d numpy array containing 
                               #large positive and negative numbers.
ica = FastICA(whiten=False)

#this is a column wise normalization function which flattens the
#two dimensional array from very large and very small numbers to 
#reasonably sized numbers between roughly -1 and 1
X = (X - np.mean(X, axis=0)) / np.std(X, axis=0)

print(np.isnan(X).any())   #this prints False
print(np.isinf(X).any())   #this prints False

ica.fit(X)                 #this works correctly.

Why does that normalization step fix the error?

I found the eureka moment here: sklearn's PLSRegression: "ValueError: array must not contain infs or NaNs"

What I think is happening is that numpy is being fed gigantic numbers and very tiny numbers, and inside its tiny brain it's creating NaNs and Infs. So it's a bug in sklearn. The workaround is to flatten your input data to the algorithm so that there are no very large or very small numbers.
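
Incidentally, that column-wise normalization is a standard z-score; the same step can be written with scikit-learn's own preprocessing utility (a sketch, assuming X is the 2-D array from the example above):

from sklearn.preprocessing import StandardScaler

# Same column-wise (X - mean) / std transform as the manual version above.
X = StandardScaler().fit_transform(X)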

Bad sklearn! NO biscuit!

Eric Leschinski
  • For the FastICA algorithm from sklearn the problem is not always solved by scaling the data. The error arises because of the `whitening` process: during it, the components_ vectors are multiplied by the square root of n_samples, which can lead to extremely high and low values (even after scaling). Despite that, I do not recommend setting `whiten=False`, because whitening makes FastICA faster and it is a preprocessing step before analyzing the mutually independent variables of the new feature space it is trying to create. – Álvaro H.G Nov 11 '21 at 11:44