Piecewise Fit not working - large dataset

Question

I have been using a solution found in several places on stack overflow for fitting a piecewise function:

from scipy import optimize
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ,11, 12, 13, 14, 15], dtype=float)
y = np.array([5, 7, 9, 11, 13, 15, 28.92, 42.81, 56.7, 70.59, 84.47, 98.36, 112.25, 126.14, 140.03])

def piecewise_linear(x, x0, y0, k1, k2):
    return np.piecewise(x, [x < x0], [lambda x:k1*x + y0-k1*x0, lambda x:k2*x + y0-k2*x0])


p, e = optimize.curve_fit(piecewise_linear, x, y)
xd = np.linspace(-5, 30, 100)
plt.plot(x, y, ".")
plt.plot(xd, piecewise_linear(xd, *p))
plt.show()

(for example, here: How to apply piecewise linear fit in Python?)

The first time I try it in the console I get an OptimizeWarning.

OptimizeWarning: Covariance of the parameters could not be estimated
  category=OptimizeWarning)

After that I just get a straight line for my fit. It seems as though there is clearly a bend in the data that the fit isn't following, although I cannot figure out why.

For the dataset I am using there are about 3200 points in each x and y, is this part of the problem?

Here are some fake data that kind of simulate mine (same problem occurs where fit is not piecewise):

x = np.append(np.random.uniform(low=10.0, high=40.2, size=(1500,)), np.random.uniform(low=-10.0, high=20.2, size=(1500,)))
y = np.append(np.random.uniform(low=-3000, high=0, size=(1500,)), np.random.uniform(low=-2000, high=1000, size=(1500,)))

I've never used this routine. However, I see that the examples in the doc for it such as `np.piecewise(x, [x < 0, x >= 0], [lambda x: -x, lambda x: x])` include two boolean conditions but your invocation of `piecewise` has only one. Could that be it? — Bill Bell, Mar 27 '18 at 18:12
Try giving `curve_fit()` a reasonable initial guess for the parameters, e.g. `p, e = optimize.curve_fit(piecewise_linear, x, y, p0=(10, -2500, 0, -500))`. — Warren Weckesser, Mar 27 '18 at 18:24

score 0 · Accepted Answer · answered Apr 19 '18 at 15:19

Just to complete the question with the answer provided in the comment above:

The issue was not due to the large number of points, but the fact that I had such large values on my y axis. Since the default initial values are 1, my values of around 1000 were too large. To fix that an initial guess for the line fit was used for parameter p0. From the docs for scipy.optimize.curve_fit it looks like:

p0 : None, scalar, or N-length sequence, optional Initial guess for the parameters. If None, then the initial values will all be 1 (if the number of parameters for the function can be determined using introspection, otherwise a ValueError is raised).

So my final code ended up looking like this:

from scipy import optimize
import matplotlib.pyplot as plt
import numpy as np

x = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ,11, 12, 13, 14, 15], dtype=float)
y = np.array([500, 700, 900, 1100, 1300, 1500, 2892, 4281, 5670, 7059, 8447, 9836, 11225, 12614, 14003])

def piecewise_linear(x, x0, y0, k1, k2):
    return np.piecewise(x, [x < x0], [lambda x:k1*x + y0-k1*x0, lambda x:k2*x + y0-k2*x0])

p, e = optimize.curve_fit(piecewise_linear, x, y, p0=(10, -2500, 0, -500))
xd = np.linspace(-5, 30, 100)
plt.plot(x, y, ".")
plt.plot(xd, piecewise_linear(xd, *p))
plt.show()

score 0 · Answer 2 · answered Jun 10 '18 at 05:47

Just for fun (very scattered case) :

Since the original data was not available, the coordinates of the points are obtained from the figure published in the Rachel W's question, thanks to a graphical scan and the record of the blue pixels. They are some artefact due to the straight line and the grid which, after scanning, appear in white.

The result of a piecewise regression (two segments) is drawn in red on the above figure.

The equation of the fitted function is :

The regression method used is not iterative and don't require initial guess. The code is very simple : pp.12-13 in this paper https://fr.scribd.com/document/380941024/Regression-par-morceaux-Piecewise-Regression-pdf

Piecewise Fit not working - large dataset

2 Answers2