I am using sklearn
modules to find the best fitting models and model parameters. However, I have an unexpected Index error down below:
> IndexError Traceback (most recent call
> last) <ipython-input-38-ea3f99e30226> in <module>
> 22 s = mean_squared_error(y[ts], best_m.predict(X[ts]))
> 23 cv[i].append(s)
> ---> 24 print(np.mean(cv, 1))
> IndexError: tuple index out of range
what I want to do is to find best fitting regressor and its parameters, but I got above error. I looked into SO
and tried this solution but still, same error bumps up. any idea to fix this bug? can anyone point me out why this error happening? any thought?
my code:
from sklearn.model_selection import KFold
from sklearn.metrics import mean_squared_error
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from xgboost.sklearn import XGBRegressor
from sklearn.datasets import make_regression
models = [SVR(), RandomForestRegressor(), LinearRegression(), Ridge(), Lasso(), XGBRegressor()]
params = [{'C': [0.01, 1]}, {'n_estimators': [10, 20]}]
X, y = make_regression(n_samples=10000, n_features=20)
with warnings.catch_warnings():
warnings.filterwarnings("ignore")
cv = [[] for _ in range(len(models))]
fold = KFold(5,shuffle=False)
for tr, ts in fold.split(X):
for i, (model, param) in enumerate(zip(models, params)):
best_m = GridSearchCV(model, param)
best_m.fit(X[tr], y[tr])
s = mean_squared_error(y[ts], best_m.predict(X[ts]))
cv[i].append(s)
print(np.mean(cv, 1))
desired output:
if there is a way to fix up above error, I am expecting to pick up best-fitted models with parameters, then use it for estimation. Any idea to improve the above attempt? Thanks