3

I use this code for smoothing data by fitting exponent with scipy.optimize.curve_fit:

def smooth_data_v1(x_arr,y_arr):
    def func(x, a, b, c):
        return a*np.exp(-b*x)+c

    #Scale data
    y = y_orig / 10000.0
    x = 500.0 * x_orig

    popt, pcov = curve_fit(func, x, y, p0=(1, 0.01, 1))

    y_smooth = func(x, *popt) # Calcaulate smoothed values for same points

    #Undo scaling
    y_final = y_smooth * 10000.0

    return y_final

Howewer I want estimated exponent curve to go through 1st point.

Bad case:

enter image description here

Good case:

enter image description here

I have tried to remove last parameter using first point x0,y0:

def smooth_data_v2(x_orig,y_orig):
    x0 = x_orig[0]
    y0 = y_orig[0]

    def func(x, a, b):
        return a*np.exp(-b*x)+y0-a*np.exp(-b*x0)

    #Scale data
    y = y_orig / 10000.0
    x = 500.0 * x_orig

    popt, pcov = curve_fit(func, x, y, p0=(1, 0.01))

    y_smooth = func(x, *popt) # Calcaulate smoothed values for same points

    #Undo scaling
    y_final = y_smooth * 10000.0

    return y_final

Howewer something go wrong and I get:

enter image description here

a param is really large popt [ 4.45028144e+05 2.74698863e+01]

Any ideas?

Update:

Example of data

x_orig [  0.   1.   2.   3.   4.   5.   6.   7.   8.   9.  10.  11.  12.  13.  14.]
y_orig [ 445057.  447635.  450213.  425089.  391746.  350725.  285433.  269027.
  243835.  230587.  216757.  202927.  189097.  175267.  161437.]
mrgloom
  • 20,061
  • 36
  • 171
  • 301

2 Answers2

2

Scipy curve_fit allows for passing the parameter sigma, which is designed to be the standard deviation for weighting the fit. But this array can be filled with arbitrary data:

from scipy.optimize import curve_fit

def smooth_data_v1(x_arr,y_arr):
    def func(x, a, b, c):
        return a*np.exp(-b*x)+c

    #create the weighting array
    y_weight = np.empty(len(y_arr))
    #high pseudo-sd values, meaning less weighting in the fit
    y_weight.fill(10)
    #low values for point 0 and the last points, meaning more weighting during the fit procedure 
    y_weight[0] = y_weight[-5:-1] = 0.1

    popt, pcov = curve_fit(func, x_arr, y_arr, p0=(y_arr[0], 1, 1), sigma = y_weight, absolute_sigma = True)
    print("a, b, c:", *popt)
    y_smooth = func(x_arr, *popt)

    return y_smooth


x_orig = np.asarray([  0,   1,   2,   3,   4,   5,   6,   7,   8,   9,  10,  11,  12,  13,  14])
y_orig = np.asarray([ 445057,  447635,  450213,  425089,  391746,  350725,  285433,  269027,
  243835,  230587,  216757,  202927,  189097,  175267,  161437])

print(smooth_data_v1(x_orig, y_orig))

As you can see, now the first and last point are close to the original values, but this "clamping of values" comes sometimes at a price for the rest of the data points.
You have probably also noticed, that I removed your rescaling part. Imho, one shouldn't do this before curve fitting procedures. It is usually better to use raw data. Additionally, your data are not really well represented by an exponential function, hence the tiny b value.

Mr. T
  • 11,960
  • 10
  • 32
  • 54
  • You have to attribute a value to each data point anyhow - `sigma` is an array of the same length as `x_arr/y_arr`. So you either tell `curve_fit`, that the last points are as reliable as `x[0]`, as unreliable as `x[1]` or something in between. Give it a try, to see the effect, when you don't attribute a low pseudo-sd value to the last points. – Mr. T Mar 07 '18 at 18:12
0

How about changing the definition of the function you try to fit?

def func(x, a, b, c):
    return (y_arr[0] - c)*np.exp(-b*(x - x_arr[0]))+c

This function always fits the first point perfectly by definition, but otherwise can do anything your current function does.

Jonas Adler
  • 10,365
  • 5
  • 46
  • 73
  • 1
    You mean `(y_arr[0] - c)*np.exp(-b*(x - x_arr[0]))+c`? one bracket is missed. For this func I get same result as my v2. – mrgloom Mar 07 '18 at 12:15