
I am trying to fit an exponential decay in Python. I've tried using scipy.optimize.curve_fit, but it completely fails...

x
Out[18]: 
array([  1.06001000e+04,   1.18721000e+04,   1.32966000e+04,
         1.48926000e+04,   1.66801000e+04,   1.86816000e+04,
         2.09236000e+04,   2.34351000e+04,   2.62481000e+04,
         2.93981000e+04,   3.29261000e+04,   3.68781000e+04,
         4.13041000e+04,   4.62611000e+04,   5.18136000e+04,
         5.80321000e+04,   6.49966000e+04,   7.27971000e+04,
         8.15341000e+04,   9.13196000e+04,   1.02279100e+05,
         1.14554100e+05,   1.28302600e+05,   1.43701100e+05,
         1.60947600e+05,   1.80264100e+05,   2.01898600e+05,
         2.26129600e+05,   2.53268600e+05,   2.83664600e+05,
         3.17709100e+05,   3.55839100e+05,   3.98545100e+05,
         4.46377100e+05,   4.99949600e+05,   5.59951100e+05,
         6.27154100e+05,   7.02422600e+05,   7.86724100e+05,
         8.81143100e+05,   9.86894100e+05,   1.10533660e+06,
         1.23799410e+06,   1.38657310e+06,   1.55298310e+06,
         1.73936510e+06,   1.94811610e+06,   2.18192010e+06,
         2.44378460e+06,   2.73707660e+06,   3.06556810e+06,
         3.43348410e+06,   3.84555560e+06,   4.30708210e+06,
         4.82399910e+06,   5.40295410e+06,   6.05139210e+06,
         6.77765310e+06,   7.59107710e+06,   8.50212410e+06,
         9.52251060e+06,   1.06653596e+07,   1.19453686e+07,
         1.33789981e+07,   1.49846856e+07,   1.67830806e+07,
         1.87973106e+07,   2.10532796e+07,   2.35800001e+07,
         2.64099671e+07])

y
Out[19]: 
array([  7.21779435e-06,   6.88096911e-06,   6.44766520e-06,
         6.06220818e-06,   5.59156825e-06,   5.27746585e-06,
         4.90419458e-06,   4.57028098e-06,   4.19594740e-06,
         3.87213247e-06,   3.53253198e-06,   3.21746863e-06,
         2.96593379e-06,   2.69902818e-06,   2.45720479e-06,
         2.22894945e-06,   2.00554860e-06,   1.78755768e-06,
         1.60389345e-06,   1.43594942e-06,   1.27660849e-06,
         1.12632772e-06,   9.93404773e-07,   8.78887840e-07,
         7.68431386e-07,   6.69981141e-07,   5.88274963e-07,
         5.12602683e-07,   4.47113130e-07,   3.91898528e-07,
         3.42875999e-07,   3.00697454e-07,   2.63373855e-07,
         2.35082385e-07,   2.06185600e-07,   1.81771840e-07,
         1.60044617e-07,   1.42299315e-07,   1.26392523e-07,
         1.12661361e-07,   1.01275721e-07,   9.01458593e-08,
         8.09207343e-08,   7.38619000e-08,   6.76745276e-08,
         6.17079129e-08,   5.68279252e-08,   5.34049900e-08,
         5.05521909e-08,   4.76524243e-08,   4.36574532e-08,
         4.05941897e-08,   3.78241485e-08,   3.51867595e-08,
         3.34753821e-08,   3.13213498e-08,   2.96139649e-08,
         2.74616096e-08,   2.49946165e-08,   2.23428677e-08,
         2.04127328e-08,   1.84783950e-08,   1.65030587e-08,
         1.47845483e-08,   1.35851162e-08,   1.15353701e-08,
         9.18553778e-09,   7.01208306e-09,   5.04006337e-09,
                    nan])

def exp_func(x, a, b, c):
    return a * np.exp(-b * x) + c

curve_fit(exp_func, x, y)
Out[21]: (array([ 1.,  1.,  1.]), inf)

I know that the initial guess for the parameters is very important, but I can't guess them (I only know that b is close to 1). So, if someone could point out either a way of guessing the parameters or a method that does not require guessing, I would be very grateful. This is the loglog plot of the data:

[log-log plot of the data: more or less linear, and definitely with a linear part]

EDIT:

So, actually I realized that all the problems arise from the data not being exponential but rather a power function; with a power function instead of an exponential, it works reasonably well out of the box.
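Since the data is linear in log-log space, a power-law fit y = a * x**b reduces to a straight-line fit on the logged data, which needs no initial guess at all. A minimal sketch (the amplitude 5.0 and exponent -0.8 are made-up illustrative values, not the question's actual parameters):

```python
import numpy as np

# Synthetic power-law data on roughly the question's x-range.
# Exponent -0.8 and amplitude 5.0 are hypothetical, for illustration only.
x = np.logspace(4, 7.4, 69)
y = 5.0 * x ** -0.8

# y = a * x**b  =>  log(y) = log(a) + b * log(x),
# so a degree-1 polynomial fit in log-log space recovers b and log(a).
b, log_a = np.polyfit(np.log(x), np.log(y), 1)
a = np.exp(log_a)
```

Any nan entries (like the last point of the question's y) must be dropped before taking logs, since np.polyfit propagates them.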

Phlya
  • possible duplicate of [fitting exponential decay with no initial guessing](http://stackoverflow.com/questions/3938042/fitting-exponential-decay-with-no-initial-guessing) – runDOSrun Feb 23 '15 at 13:48
  • @runDOSrun, I've seen that question, but I didn't find a satisfactory answer there. The accepted answer suggested either linearizing it so losing the y-offset data or guessing the p0. The other answer about estimating the coefficients requires a lot of hand-work which I am not a fan of :) – Phlya Feb 23 '15 at 13:55
  • @Phlya but it's already obvious that your data underflowed the minimum. Look at the last result of your `y` it's `nan`. You don't need to calculate a lot by hand, but you need to adjust your scale and perhaps act with a `log` function because otherwise with these numbers you will never be able to get the numerical precision to get the result you need. There's a reason math exists ;) – ljetibo Feb 23 '15 at 13:58
  • @Phlya Please include resources/other answers you've encountered next time, so your question isn't marked as duplicate. – runDOSrun Feb 23 '15 at 15:06

1 Answer


You might want to ignore the nan:

valid = np.logical_not(np.isnan(x + y))
optimize.curve_fit(exp_func, x[valid], y[valid])

To avoid numerical instabilities, you should normalize the input data, e.g. by multiplying x by 1e6. Of course, you will then need to rescale the resulting parameters accordingly.
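Putting both steps together, here is a sketch using synthetic data on the same scales as the question's arrays (the parameter values 7.2e-6, 3e-7, 1e-8 are hypothetical stand-ins, and the data-driven p0 heuristic is my own addition, not part of the answer above):

```python
import numpy as np
from scipy.optimize import curve_fit

def exp_func(x, a, b, c):
    return a * np.exp(-b * x) + c

# Synthetic data on the question's scales (x up to ~2.6e7, y down to ~1e-8),
# with a trailing nan. The true parameters here are hypothetical.
x = np.linspace(1.06e4, 2.64e7, 70)
y = exp_func(x, 7.2e-6, 3e-7, 1e-8)
y[-1] = np.nan

# 1) Drop entries where either array is nan.
valid = ~np.isnan(x) & ~np.isnan(y)
xv, yv = x[valid], y[valid]

# 2) Rescale both axes so all fitted quantities are O(1).
xs = xv * 1e-6
ys = yv * 1e6

# 3) Crude initial guess straight from the data:
#    amplitude ~ first y minus last y, offset ~ last y,
#    decay rate ~ 1 / (mid-range x).
p0 = [ys[0] - ys[-1], 1.0 / xs[len(xs) // 2], ys[-1]]

popt, pcov = curve_fit(exp_func, xs, ys, p0=p0)
a, b, c = popt

# Undo the scaling to express the parameters in the original units:
# y = (a * 1e-6) * exp(-(b * 1e-6) * x) + c * 1e-6
```

Note how the scaling factors propagate: because both x and y were multiplied by 1e6 (and 1e-6 respectively), a, b and c all pick up a factor of 1e-6 when converted back to the original units.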

Falko
  • Thanks, Falko, that's a good idea, but doesn't help, this is what I get: (array([ 1.00000000e+00, 1.00000000e+00, 1.30153974e-06]), inf) – Phlya Feb 23 '15 at 14:30
  • That's what I get as well. But what is your expectation? (I haven't further analyzed and plotted the data...) – Falko Feb 23 '15 at 14:33
  • I think this is the computational precision problem, but with these coefficients I get a constant y value. And it's hard to believe both a and b would be equal to exactly 1 here... – Phlya Feb 23 '15 at 14:55
  • Yes, the problem is numerically unstable. If you multiply `x` with, e.g., `1e6`, you already get a more reasonable result. It still does not fit the data very well, when displayed in loglog space. But in the original space this might be an optimal fit! – Falko Feb 23 '15 at 15:03