I'm trying to analyse and plot piecewise regressions for daily temperature and gas use. I have six columns (two for each year) in a CSV, which I read in with pandas, then assign each column to a separate variable.

I found one of the answers on "How to apply piecewise linear fit in Python?" extremely helpful and was able to use the following code to run a breakpoint analysis and plot a graph:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import pwlf

# Importing the csv and defining columns as variables
df = pd.read_csv(PATH)

Y_A = df.Column1 
X_A = df.Column2 
Y_B = df.Column3
X_B = df.Column4

# Analysing breakpoints
my_pwlf_a = pwlf.PiecewiseLinFit(X_A, Y_A)
breaks_a = my_pwlf_a.fit(2)
print(breaks_a)

# Graphing
x_hat = np.linspace(X_A.min(), X_A.max(), 100)
y_hat = my_pwlf_a.predict(x_hat)

plt.figure()
plt.plot(X_A, Y_A, 'o')
plt.plot(x_hat, y_hat, '-')
plt.xlabel('X')
plt.ylabel('Y')
plt.show()

This runs with no problems and gives the desired results.

When I try to repurpose the code using my next pair of variables (Y_B and X_B) I run into problems:

my_pwlf_b = pwlf.PiecewiseLinFit(X_B, Y_B)
breaks_b = my_pwlf_b.fit(2)
print(breaks_b) 

The error returned is:

ValueError: bounds should be a sequence containing real valued (min, max) pairs for each value in x

All variables are float64 and each column contains 366 rows. Thanks for any help in spotting what I'm missing!

danwri
    There's no way we can help without seeing what your data is composed of, so please upload and link to it – Zionsof Jul 25 '19 at 13:26

1 Answer

Thanks to Zionsof for the nudge back towards the data!

Further testing showed that the problem was the unequal lengths of the column pairs (Columns 1 & 2 contained 366 values while Columns 3 & 4 contained 365, leaving a blank in the shorter pair). I had foolishly thought that separating the columns into separate variables would fix this, but I was incorrect. Here is what I used to fix it (numpy.isfinite):

# Remove any blanks by ensuring the values are finite
Y_A = df.Column1[np.isfinite(df['Column1'])]
X_A = df.Column2[np.isfinite(df['Column2'])]
Y_B = df.Column3[np.isfinite(df['Column3'])]
X_B = df.Column4[np.isfinite(df['Column4'])]
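
A variation on the same idea, as a minimal sketch rather than a tested drop-in: count the blanks first, then drop a row when either value of the pair is missing, so that X and Y keep the same length and stay matched row for row. The small DataFrame below is made up for illustration and only stands in for the real CSV; the column names follow the ones used above.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV: the second pair is one row shorter,
# so it carries a NaN once all columns share a single DataFrame.
df = pd.DataFrame({
    "Column3": [10.0, 8.5, 7.0, 5.5, np.nan],
    "Column4": [1.0, 2.0, 3.0, 4.0, np.nan],
})

# Count the blanks in each column to confirm the diagnosis.
print(df[["Column3", "Column4"]].isna().sum())

# Drop a row if either value of the pair is missing, keeping x and y aligned.
pair_b = df[["Column4", "Column3"]].dropna()
X_B = pair_b["Column4"].to_numpy()
Y_B = pair_b["Column3"].to_numpy()

# Both arrays now have the same length and contain only finite values.
assert len(X_B) == len(Y_B)
assert np.isfinite(X_B).all() and np.isfinite(Y_B).all()
```

Filtering each column on its own, as in the lines above, works when the blanks sit on the same rows in both columns. Dropping rows pairwise also covers the case where they do not.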
danwri