
My problem is simple: I am fitting a regression model with `curve_fit`. The input X has shape (95, 3), where 95 is the number of samples and 3 is the number of features. I want to predict a real-valued target y of shape (95,).

X has three features: a1, a2 and a3. The fitted function, with parameters k0 and k1, is:

y = a3 * np.exp(a2 * (k1 * a1 + k0))

This gives good results, with an MAE of around 0.23.

However, when I changed the fitted function to:

y = a3 * np.exp((-1) * a2 * (k1 * a1 + k0))

I expected the newly learned k0 and k1 to simply be the negatives of the previously learned k0 and k1, with the MAE unchanged. However, the MAE rises to about 0.95, which is much larger, and the learned k0 and k1 are not the negated values.
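
The expectation itself can be checked directly: for any (k0, k1), the second form evaluated at (-k0, -k1) gives the same predictions as the first form at (k0, k1), so both forms share the same best achievable error. A minimal sketch on made-up inputs (the random `X_demo`, its value ranges, and the parameter values are assumptions for illustration, not the data from this question):

```python
import numpy as np


def func_exp_linear(X, k0, k1):
    T, t, res_cl = X[:, 0], X[:, 1], X[:, 2]
    return res_cl * np.exp(t * (k0 + k1 * T))


def func_exp_linear_minus(X, k0, k1):
    T, t, res_cl = X[:, 0], X[:, 1], X[:, 2]
    return res_cl * np.exp(-t * (k0 + k1 * T))


# Synthetic stand-in for X (assumed ranges, not the real X.npy).
rng = np.random.default_rng(0)
X_demo = np.column_stack([
    rng.uniform(10.0, 30.0, 95),  # T
    rng.uniform(0.5, 5.0, 95),    # t
    rng.uniform(0.5, 2.0, 95),    # res_cl
])

k0, k1 = -0.05, -0.01  # arbitrary illustrative parameters

# Compare the first form at (k0, k1) with the second form at (-k0, -k1).
same = np.allclose(func_exp_linear(X_demo, k0, k1),
                   func_exp_linear_minus(X_demo, -k0, -k1))
print(same)
```

If the two forms agree like this, a different MAE would point at the optimizer (for example, where it starts) rather than at the model.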

My code together with the data is here: https://www.dropbox.com/scl/fo/k5ie12lr8l3xz4o8g086v/h?dl=0&rlkey=7uomgmlpwxle4oobwvq5bp3yc.

For reference, here is my code as a minimal demo:

import numpy as np
from scipy.optimize import curve_fit
from sklearn.metrics import mean_squared_error, mean_absolute_error

np.random.seed(42)


def func_exp_linear(X, k0, k1):
    T = X[:, 0]  # Temp
    t = X[:, 1]  # time
    res_cl = X[:, 2]  # res cl
    return res_cl * np.exp(t * (k0 + k1 * T))


def func_exp_linear_minus(X, k0, k1):
    T = X[:, 0]  # Temp
    t = X[:, 1]  # time
    res_cl = X[:, 2]  # res cl
    return res_cl * np.exp((-1) * t * (k0 + k1 * T))

with open('X.npy', 'rb') as f:
    X = np.load(f)
with open('y.npy', 'rb') as f:
    y = np.load(f)

popt, pcov = curve_fit(func_exp_linear_minus, X, y)
y_hat = func_exp_linear_minus(X, *popt)
MAE = mean_absolute_error(y, y_hat)
print('when adding -1 the MAE is large', MAE)

popt, pcov = curve_fit(func_exp_linear, X, y)
y_hat = func_exp_linear(X, *popt)
MAE = mean_absolute_error(y, y_hat)
print('when not adding -1 the MAE is small', MAE)

My output is:

when adding -1 the MAE is large 0.9278041290464241
when not adding -1 the MAE is small 0.23378209064718086
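
One possible contributor to this gap: when `p0` is omitted, `curve_fit` starts every parameter at 1, so both calls above begin at the same numeric point (1, 1). Because of the sign flip, that point corresponds to opposite exponents in the two forms, so the two fits do not start from mirrored positions. The sketch below uses synthetic, noise-free data (assumed ranges and assumed true parameters, not the real X.npy / y.npy) and starts the two fits from mirrored guesses instead:

```python
import numpy as np
from scipy.optimize import curve_fit


def func_exp_linear(X, k0, k1):
    T, t, res_cl = X[:, 0], X[:, 1], X[:, 2]
    return res_cl * np.exp(t * (k0 + k1 * T))


def func_exp_linear_minus(X, k0, k1):
    T, t, res_cl = X[:, 0], X[:, 1], X[:, 2]
    return res_cl * np.exp(-t * (k0 + k1 * T))


# Synthetic stand-in data (assumptions for illustration only).
rng = np.random.default_rng(0)
X_demo = np.column_stack([
    rng.uniform(10.0, 30.0, 95),  # T
    rng.uniform(0.5, 5.0, 95),    # t
    rng.uniform(0.5, 2.0, 95),    # res_cl
])
k_true = (-0.05, -0.01)                    # assumed "true" parameters
y_demo = func_exp_linear(X_demo, *k_true)  # noise-free targets

# Mirrored starting points: p0 for the minus form is the negation of the other.
p_plus, _ = curve_fit(func_exp_linear, X_demo, y_demo, p0=[1e-3, 1e-3])
p_minus, _ = curve_fit(func_exp_linear_minus, X_demo, y_demo, p0=[-1e-3, -1e-3])

print("plus form :", p_plus)
print("minus form:", p_minus)
```

Under these assumptions the two fitted parameter vectors should be negatives of each other. Whether this fully explains the behaviour on the real data is not something this sketch can confirm.
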
  • Give the function `func_exp_linear_minus` an initial guess, e.g. `popt, pcov = curve_fit(func_exp_linear_minus, X, y, p0=[0, 0])` – HMH1013 Feb 16 '23 at 10:03
  • @HMH1013 Yes, thanks a lot. If I set the initial guess to `p0=[1e-3, 1e-3]`, the problem is solved. However, `curve_fit` seems sensitive to the initial guess: with `p0=[1, 1]` the problem is still there. Is there any insight into this? `curve_fit` seems less stable than I expected. – Zizhuo Meng Feb 16 '23 at 20:57
  • Maybe this helps: https://stackoverflow.com/questions/52356128/how-to-set-up-the-initial-value-for-curve-fit-to-find-the-best-optimizing-not-j – HMH1013 Feb 17 '23 at 10:53
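
On choosing a starting point: for this particular model, taking logs gives ln(y / res_cl) / t = k0 + k1 * T, which is linear in T, so an ordinary straight-line fit can supply a rough `p0`. This is only a sketch and rests on assumptions that may not hold for the real data: y and res_cl strictly positive and t nonzero. The helper name `initial_guess`, the synthetic data, and the 1% noise level are made up for illustration.

```python
import numpy as np
from scipy.optimize import curve_fit


def func_exp_linear(X, k0, k1):
    T, t, res_cl = X[:, 0], X[:, 1], X[:, 2]
    return res_cl * np.exp(t * (k0 + k1 * T))


def initial_guess(X, y):
    """Rough (k0, k1) from a straight-line fit of ln(y / res_cl) / t against T."""
    T, t, res_cl = X[:, 0], X[:, 1], X[:, 2]
    ok = (y > 0) & (res_cl > 0) & (t != 0)  # rows where the log transform is defined
    rate = np.log(y[ok] / res_cl[ok]) / t[ok]
    k1, k0 = np.polyfit(T[ok], rate, 1)     # polyfit returns [slope, intercept]
    return [k0, k1]


# Synthetic stand-in data (assumptions for illustration only).
rng = np.random.default_rng(0)
X_demo = np.column_stack([
    rng.uniform(10.0, 30.0, 95),  # T
    rng.uniform(0.5, 5.0, 95),    # t
    rng.uniform(0.5, 2.0, 95),    # res_cl
])
# Assumed true parameters (-0.05, -0.01) with 1% multiplicative noise.
y_demo = func_exp_linear(X_demo, -0.05, -0.01) * (1 + 0.01 * rng.standard_normal(95))

p0 = initial_guess(X_demo, y_demo)
popt, _ = curve_fit(func_exp_linear, X_demo, y_demo, p0=p0)
print("initial guess:", p0)
print("refined fit  :", popt)
```

For the negated form of the model, the same guess with both signs flipped would be the corresponding starting point.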
