2

I'm trying to fit a set of data with a function (see the example below) using scipy.optimize.curvefit, but when I use bounds (documentation) the fit fails and I simply get the initial guess parameters as output. As soon as I substitute -np.inf ad np.inf as bounds for the second parameter (dt in the function), the fit works. What am I doing wrong?

import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt

#Generate data
crc=np.array([-1.4e-14, 7.3e-14, 1.9e-13, 3.9e-13, 6.e-13, 8.0e-13, 9.2e-13, 9.9e-13, 
              1.e-12, 1.e-12, 1.e-12, 1.0e-12, 1.1e-12, 1.1e-12, 1.1e-12, 1.0e-12, 1.1e-12])
time=np.array([0., 368., 648., 960., 1520.,1864., 2248., 2655., 3031., 
                3384., 3688., 4048., 4680., 5343., 6055.,  6928.,  8120.])

#Define the function for the fit
def testcurve(x, Dp, dt):
    k = -Dp*(x+dt)*2e11
    curve = 1e-12 * (1+2*(-np.exp(k) + np.exp(4*k) - np.exp(9*k) + np.exp(16*k)))
    curve[0]= 0
    return curve

#Set fit bounds 
dtmax=time[2]
param_bounds = ((-np.inf, -dtmax),(np.inf, dtmax))

#Perform fit
(par, par_cov) = opt.curve_fit(testcurve, time, crc, p0 = (5e-15, 0), bounds = param_bounds)

#Print and plot output
print(par)       
plt.plot(time, crc, 'o')
plt.plot(time, testcurve(time, par[0], par[1]), 'r-')
plt.show()
David
  • 51
  • 8
  • What are your bounds when not inf? –  Apr 22 '17 at 09:35
  • They're the ones in the example, +/- `time[2]`, i.e. `-648.` and `648.` But I also tried manually entering different values, it still doesn't work. – David Apr 22 '17 at 09:40
  • I have the same problem as well, without bounds, all my fits fail (with curve_fit, least_squares, and even minimize and a directly defined loss function), but with bounds, they all fail. – Marses Nov 30 '21 at 17:09

1 Answers1

1

I encountered the same behavior today in a different fitting problem. After some searching online, I found this link quite helpful: Why does scipy.optimize.curve_fit not fit to the data?

The short answer is that: using extremely small (or large) numbers in numerical fitting is not robust and scale them leads to a much better fitting.


In your case, both crc and Dp are extremely small numbers which could be scaled up. You could play with the scale factors and within certain range the fitting looks quite robust. Full example:

import numpy as np
import matplotlib.pyplot as plt
import scipy.optimize as opt

#Generate data
crc=np.array([-1.4e-14, 7.3e-14, 1.9e-13, 3.9e-13, 6.e-13, 8.0e-13, 9.2e-13, 9.9e-13, 
              1.e-12, 1.e-12, 1.e-12, 1.0e-12, 1.1e-12, 1.1e-12, 1.1e-12, 1.0e-12, 1.1e-12])
time=np.array([0., 368., 648., 960., 1520.,1864., 2248., 2655., 3031., 
                3384., 3688., 4048., 4680., 5343., 6055.,  6928.,  8120.])

# add scale factors to the data as well as the fitting parameter
scale_factor_1 = 1e12 # 1./np.mean(crc) also works if you don't want to set the scale factor manually
scale_factor_2 = 1./2e11

#Define the function for the fit
def testcurve(x, Dp, dt):
    k = -Dp*(x+dt)*2e11 * scale_factor_2
    curve = 1e-12 * (1+2*(-np.exp(k) + np.exp(4*k) - np.exp(9*k) + np.exp(16*k))) * scale_factor_1
    curve[0]= 0
    return curve

#Set fit bounds 
dtmax=time[2]
param_bounds = ((-np.inf, -dtmax),(np.inf, dtmax))

#Perform fit
(par, par_cov) = opt.curve_fit(testcurve, time, crc*scale_factor_1, p0 = (5e-15/scale_factor_2, 0), bounds = param_bounds)

#Print and plot output
print(par[0]*scale_factor_2, par[1])

plt.plot(time, crc*scale_factor_1, 'o')
plt.plot(time, testcurve(time, par[0], par[1]), 'r-')
plt.show()

Fitting results: [6.273102923176595e-15, -21.12202697564494], which gives a reasonable fitting and also is very close to the result without any bounds: [6.27312512e-15, -2.11307470e+01]

zhxywzy
  • 53
  • 5
  • Shouldn't this be independent of whether there are bounds or not however? In my case, I have the same problem as the poster; without bounds I get a normal fit as expected, with bounds it fails. Do you know what it is about bounds that would make these "small number problem" suddenly show up? – Marses Nov 30 '21 at 17:10
  • Also in your solution, in `func`, `scale_factor_2` cancels out with the `2e11` and same with `scale_factor_1`. This doesn't seem right? What's supposed to be going on there? – Marses Nov 30 '21 at 17:40