33

I have two NumPy arrays x and y. When I try to fit my data using exponential function and curve_fit (SciPy) with this simple code

#!/usr/bin/env python
from pylab import *
from scipy.optimize import curve_fit

x = np.array([399.75, 989.25, 1578.75, 2168.25, 2757.75, 3347.25, 3936.75, 4526.25, 5115.75, 5705.25])
y = np.array([109,62,39,13,10,4,2,0,1,2])

def func(x, a, b, c, d):
    return a*np.exp(b-c*x)+d

popt, pcov = curve_fit(func, x, y)

I get wrong coefficients popt

[a,b,c,d] = [1., 1., 1., 24.19999988]

What is the problem?

Trenton McKinney
  • 56,955
  • 33
  • 144
  • 158
drastega
  • 1,581
  • 5
  • 30
  • 42
  • similar question http://stackoverflow.com/questions/17527869/curve-fit-fails-with-exponential-but-zunzun-gets-it-right – Josef Jan 29 '14 at 02:54

2 Answers2

51

First comment: since a*exp(b - c*x) = (a*exp(b))*exp(-c*x) = A*exp(-c*x), a or b is redundant. I'll drop b and use:

import matplotlib.pyplot as plt

def func(x, a, c, d):
    return a*np.exp(-c*x)+d

That isn't the main issue. The problem is simply that curve_fit fails to converge to a solution to this problem when you use the default initial guess (which is all 1s). Check pcov; you'll see that it is inf. This is not surprising, because if c is 1, most of the values of exp(-c*x) underflow to 0:

In [32]: np.exp(-x)
Out[32]: 
array([  2.45912644e-174,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000,   0.00000000e+000,   0.00000000e+000,
         0.00000000e+000])

This suggests that c should be small. A better initial guess is, say, p0 = (1, 1e-6, 1). Then I get:

In [36]: popt, pcov = curve_fit(func, x, y, p0=(1, 1e-6, 1))

In [37]: popt
Out[37]: array([  1.63561656e+02,   9.71142196e-04,  -1.16854450e+00])

This looks reasonable:

In [42]: xx = np.linspace(300, 6000, 1000)

In [43]: yy = func(xx, *popt)

In [44]: plt.plot(x, y, 'ko')
Out[44]: [<matplotlib.lines.Line2D at 0x41c5ad0>]

In [45]: plt.plot(xx, yy)
Out[45]: [<matplotlib.lines.Line2D at 0x41c5c10>]

enter image description here

Trenton McKinney
  • 56,955
  • 33
  • 144
  • 158
Warren Weckesser
  • 110,654
  • 19
  • 194
  • 214
8

Firstly I would recommend modifying your equation to a*np.exp(-c*(x-b))+d, otherwise the exponential will always be centered on x=0 which may not always be the case. You also need to specify reasonable initial conditions (the 4th argument to curve_fit specifies initial conditions for [a,b,c,d]).

This code fits nicely:

from pylab import *
from scipy.optimize import curve_fit

x = np.array([399.75, 989.25, 1578.75, 2168.25, 2757.75, 3347.25, 3936.75, 4526.25, 5115.75, 5705.25])
y = np.array([109,62,39,13,10,4,2,0,1,2])

def func(x, a, b, c, d):
    return a*np.exp(-c*(x-b))+d

popt, pcov = curve_fit(func, x, y, [100,400,0.001,0])
print popt

plot(x,y)
x=linspace(400,6000,10000)
plot(x,func(x,*popt))
show()
three_pineapples
  • 11,579
  • 5
  • 38
  • 75
  • 1
    Where do initial conditions come from? – Marcin Zdunek Jan 02 '16 at 17:30
  • @MarcinZdunek this was a while ago so I don't remember exactly. The amplitude will have been estimated from the graph. The others may have been determined via trial and error, although the value for c can be estimated too (see the accepted answer of this question) – three_pineapples Jan 02 '16 at 22:08
  • @MarcinZdunek The default initial values are fine if you normalize both data ranges and afterwards denormalize the estimated parameters... – gboffi Nov 08 '18 at 11:39
  • I'll just add that looking over this again, I think the initial conditions for `a` and `b` came from the first `y` and `x` values (assuming values are in order), `c` can be estimated as in the accepted answer, and the estimate for `d` came from the final `y` values which are `~0`. If you're having trouble with initial conditions, this can be a good starting point. – three_pineapples Nov 21 '20 at 01:01