I'm trying to analyse and plot piecewise regressions for daily temperature and gas use. I have six columns (two for each year) in a CSV, which I read in with pandas, assigning each column to a separate variable.
I found one of the answers to "How to apply piecewise linear fit in Python?" extremely helpful and was able to use the following code to run a breakpoint analysis and plot a graph:
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pwlf
# Importing the csv and defining columns as variables
df = pd.read_csv(PATH)
Y_A = df.Column1
X_A = df.Column2
Y_B = df.Column3
X_B = df.Column4
# Analysing breakpoints
my_pwlf_a = pwlf.PiecewiseLinFit(X_A, Y_A)
breaks_a = my_pwlf_a.fit(2)
print(breaks_a)
# Graphing
x_hat = np.linspace(X_A.min(), X_A.max(), 100)
y_hat = my_pwlf_a.predict(x_hat)
plt.figure()
plt.plot(X_A, Y_A, 'o')
plt.plot(x_hat, y_hat, '-')
plt.xlabel('X')
plt.ylabel('Y')
plt.show()
This runs with no problems and gives the desired results.
When I try to repurpose the code using my next pair of variables (Y_B and X_B) I run into problems:
my_pwlf_b = pwlf.PiecewiseLinFit(X_B, Y_B)
breaks_b = my_pwlf_b.fit(2)
print(breaks_b)
The error returned is:
ValueError: bounds should be a sequence containing real valued (min, max) pairs for each value in x
All variables are float64 and each column contains 366 rows. Thanks for any help in spotting what I'm missing!
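One thing that may be worth checking (an assumption, not something the data above confirms): a float64 column with 366 rows can still contain NaN, for example if a non-leap year only fills 365 of the rows. As far as I understand, pwlf takes its search bounds from the minimum and maximum of x, so a NaN there could plausibly lead to this error. Below is a minimal sketch of that check using made-up stand-in data; `finite_pairs`, `x_demo` and `y_demo` are hypothetical names, not part of pwlf or of my actual script.

```python
import numpy as np


def finite_pairs(x, y):
    """Hypothetical helper: drop (x, y) pairs where either value is NaN or inf."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = np.isfinite(x) & np.isfinite(y)
    return x[mask], y[mask]


# Made-up stand-in for X_B / Y_B: dtype is float64, but the last row is missing.
x_demo = np.array([1.0, 2.0, 3.0, np.nan])
y_demo = np.array([10.0, 8.0, 7.0, np.nan])

# Count missing values; anything above zero is suspicious.
print("NaNs in x:", int(np.isnan(x_demo).sum()))

# np.min propagates NaN, unlike pandas' Series.min(), which skips it by default.
print("np.min(x):", np.min(x_demo))

# Keep only the rows where both x and y are finite.
x_clean, y_clean = finite_pairs(x_demo, y_demo)
print("cleaned range:", x_clean.min(), x_clean.max())
```

If the same check on X_B and Y_B does turn up NaNs, passing the cleaned arrays to pwlf.PiecewiseLinFit in place of the raw columns would be the next thing to try.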