Fitting two distinct equations to a function (curve_fit)

Question

I have a problem: I have two distinct equations, one is a linear equation, the other one is an exponential equation. However not both equations should be valid at the same time, meaning that there are two distinct regimes.

Equation 1 (x < a): E*x
Equation 2 (x >=a): a+b*x+c*(1-np.exp(-d*np.array(x)))

Meaning the first part of the data should just be fit with a linear equation and the rest should be fit with the before mentioned equation 2.

The data I'm trying to fit looks like this (I have also added some sample data, if people wanna have a go):

I have tried several thing already, from just defining one fit function with a heaviside function:

def fit_fun(x,a,b,c,d,E):
    
    funktion1=E*np.array(x)
    
    funktion2=a+b*x+c*(1-np.exp(-d*np.array(x)))
           
    return np.heaviside(x+a,0)*funktion2+(1-np.heaviside(x+a,0))*funktion1

defining a piecewise function:

def fit_fun(x,a,b,c,d,E):
    return np.piecewise(x, [x <= a, x > a], [lambda x: E*np.array(x), lambda x: a+b*x+c*(1-np.exp(-d*np.array(x)))])

to lastly (which unforunattly yields me some form function error?):

def plast_fun(x,a,b,c,d,E):
   
    out = E*x
    out [np.where(x >= a)] = a+b*x+c*(1-np.exp(-d+x))
    
    return out

Don't get me wrong I do get "some" fits, but they do seem to either take one or the other equation and not really use both. I also tried using several bounds and inital guesses, but it never changes.

Any input would be greatly appreciated!

Data:

0.000000     -1.570670 
0.000434     83.292677 
0.000867     108.909402 
0.001301     124.121676 
0.001734     138.187659 
0.002168     151.278839 
0.002601     163.160478 
0.003035     174.255626 
0.003468     185.035092 
0.003902     195.629820 
0.004336     205.887161 
0.004769     215.611995 
0.005203     224.752083 
0.005636     233.436680 
0.006070     241.897851 
0.006503     250.352697 
0.006937     258.915168 
0.007370     267.569337 
0.007804     276.199005 
0.008237     284.646778 
0.008671     292.772349 
0.009105     300.489611 
0.009538     307.776858 
0.009972     314.666291 
0.010405     321.224211 
0.010839     327.531594 
0.011272     333.669261 
0.011706     339.706420 
0.012139     345.689265 
0.012573     351.628362 
0.013007     357.488150 
0.013440     363.185771 
0.013874     368.606298 
0.014307     373.635696 
0.014741     378.203192 
0.015174     382.315634 
0.015608     386.064126 
0.016041     389.592120 
0.016475     393.033854 
0.016908     396.454226 
0.017342     399.831519 
0.017776     403.107084 
0.018209     406.277016 
0.018643     409.441119 
0.019076     412.710982 
0.019510     415.987331 
0.019943     418.873140 
0.020377     421.178098 
0.020810     423.756827

So far I have found these two questions, but I could't figure it out: Fit of two different functions with boarder as fit parameter Fit a curve for data made up of two distinct regimes

Why are you trying to fit this data? What do you want to do once you've fit the data? Using more recent approaches like neural networks, you could get a good fit with just a single model... — duhaime, Nov 23 '20 at 16:06
@duhaime: I want to use the fit parameters for a different optimization. So these two equations basically define a model I want to use in another calculation. If you want more details, I could answer in a PM ^^ — M.Pow, Nov 24 '20 at 09:21

score 2 · Accepted Answer · answered Nov 23 '20 at 16:35

2

I suspect you are making a mistake in the second equation, where you do a+b*x+c*(1-np.exp(-d+x)). where a is the value of x where you change from one curve to the other. I think you should use the value of y instead which is a*E. Also it is very important to define initial parameters to the fit. I've ran the following code with your data in .txt file and the fit seems pretty good as you can see bellow:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from scipy import optimize, stats

def fit_fun(x,a,b,c,d,E):
    return np.piecewise(x, [x <= a, x > a], [lambda x: E*x, lambda x: a*E+b*x+c*(1-np.exp(-d*x))])

df = pd.read_csv('teste.txt', delimiter='\s+', header=None)
df.columns = ['x','y']

xdata = df['x']
ydata = df['y']

p0 = [0.001,1,1,1,100000]
popt, pcov = optimize.curve_fit(fit_fun, xdata.values, ydata.values, p0=p0, maxfev=10000, absolute_sigma=True, method='trf')
print(popt)

plt.plot(xdata, ydata,'*')
plt.plot(xdata, fit_fun(xdata.values, *popt), 'r')
plt.show()

answered Nov 23 '20 at 16:35

Flavio Moraes

1,331
1
5
16

1

To avoid a strange discontinuity you should set the second equation as `lambda x: a * E + b * ( x - a ) + c * ( 1 - np.exp( -d * ( x - a ) ) )`. Plotting on a finer grid will make the difference obvious. ..... unless it is desired to be non-continuous of course. – mikuszefski Nov 24 '20 at 08:33
@mikuszefski, thank you for your contribution. However, the discontinuation comes from the fact that there are two different behaviors of the curve and its being fitted by a piece-wise function. If you use `x-a` you say the second part starts with in (a,E) and if you don't you say it starts at (0,E) and both can be valid. Normally I would prefer to start at (0,0) and to use `y0` instead of `a*E`. See here that if you do this, then doesn't matter if you use `x` or `x-a` because the difference can be adjusted by the parameters `c` and `y0`. – Flavio Moraes Nov 24 '20 at 15:16
I am sorry, this explanation makes no sense. Both functions are continuous functions on R, There is nothing that starts at `0` or `a`. And the fact that a function is defined piecewise doesn't mean it has to be discontinuous; it could even be continuous in the first derivative. The only difference is that version 1 forces curve 2 to pass through `(0, 0)` while the second choice forces it through `(a, E a)` making the total graph continuous. Assuming, e.g., physical data one could probably find examples for both. From that point of view my first comment would have been better stating "could" – mikuszefski Nov 25 '20 at 06:30
Just a mathematical add on. A piecewise function with infinite continuous derivatives, i.e. $C^infty$ would be `Exp( 1/(x^2-1) )` for `-1 – mikuszefski Nov 25 '20 at 06:35

Fitting two distinct equations to a function (curve_fit)

1 Answers1