I am experimenting the fit of 1-3 degree polynomial transformation to the original data using 100 predicted values each. I first 1) reshaped the original data, 2) applied fit_transform on the test set and prediction space (of data features), 3) obtained linear prediction on the prediction space, and 4) exported them into an array, using the following code:
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.model_selection import train_test_split
np.random.seed(0)
n = 100
x = np.linspace(0,10,n) + np.random.randn(n)/5
y = np.sin(x)+n/6 + np.random.randn(n)/10
x = x.reshape(-1, 1)
y = y.reshape(-1, 1)
pred_data = np.linspace(0,10,100).reshape(-1,1)
results = []
for i in [1, 2, 3] :
poly = PolynomialFeatures(degree = i)
x_train, x_test, y_train, y_test = train_test_split(x, y, random_state=0)
x_poly1 = poly.fit_transform(x_train)
pred_data = poly.fit_transform(pred_data)
linreg1 = LinearRegression().fit(x_poly1, y_train)
pred = linreg1.predict(pred_data)
results.append(pred)
results
However, I did not get what I wanted, Python did not return an array of (3, 100) shape as I was expecting and, in fact, I received an error message
ValueError: shapes (100,10) and (4,1) not aligned: 10 (dim 1) != 4 (dim 0)
Seems to be a dimensional problem resulting either from "reshape" or from the "fit_transform" step. I got confused as this was supposed to be straightforward test. Would anyone enlighten me on this? It will be much appreciated.
Thank you.
Sincerely,