Background

I am trying to model the relationship between two DataFrame columns. If the columns are named x and y, what I need is a function f such that f(x) = y. I tried polynomial features followed by linear regression, like so:

model = Pipeline([('poly', PolynomialFeatures(degree=deg)),
                  ('linear', LinearRegression(fit_intercept=False))])

Iterating over deg with a for loop, I found that the best value is deg=9 (although deg=3 gives close results), with

MSE: 7.282878279669111    MAE: 1.704843791514593
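For reference, the degree loop described above might look roughly like the following. This is a minimal sketch, not the original code: the synthetic `x`/`y` data, the train/test split, and the names `scores` and `best_deg` are assumptions made for illustration.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the two DataFrame columns (assumption, not the real data)
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 2.0 * x**3 - x + rng.normal(scale=0.1, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    x.reshape(-1, 1), y, test_size=0.25, random_state=0
)

# Fit one pipeline per degree and record (MSE, MAE) on the held-out split
scores = {}
for deg in range(1, 10):
    model = Pipeline([('poly', PolynomialFeatures(degree=deg)),
                      ('linear', LinearRegression(fit_intercept=False))])
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    scores[deg] = (mean_squared_error(y_test, pred),
                   mean_absolute_error(y_test, pred))

# Pick the degree with the lowest MSE
best_deg = min(scores, key=lambda d: scores[d][0])
print(best_deg, scores[best_deg])
```

Choosing the degree on the same split that is used to report the final error tends to give an optimistic estimate, so a separate validation split or cross-validation is probably safer.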

The problem

Now I am trying to build a neural network (here I'm using MLPRegressor) but am unsure which parameters to choose. Changing parameters by hand sometimes gave good results (MSE ~ 7.1) and sometimes terrible ones (MSE ~ 20). Results improved after switching the activation from relu to logistic, and I also experimented with the number of layers and the number of nodes in each layer.

My question is: how can I efficiently and automatically search for the best hyperparameters, including the layer architecture (number of layers and number of nodes in each layer) and the degree of the polynomial features? So far the code I have is:

scaler = MinMaxScaler()
# Fit the scaler on the training data only, then reuse it for the test data
X_train_norm = scaler.fit_transform(X_train.values.reshape(-1, 1))
X_test_norm = scaler.transform(X_test.values.reshape(-1, 1))

model = Pipeline([('poly', PolynomialFeatures(degree=deg)),
                ('mlpr', MLPRegressor(hidden_layer_sizes=(deg, 6, 8, 1),
                                      activation='relu',
                                      solver='adam', max_iter=10000))])

I did find out that for a random forest you can use

rf_random = RandomizedSearchCV(estimator=rf,
                               param_distributions=random_grid,
                               n_iter=100, cv=3, verbose=2,
                               random_state=42, n_jobs=-1)

But I wasn't able to implement a search like this one in the case of a neural network.

Ariel Yael
  • Take a look at sklearn's [grid search](https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.GridSearchCV.html). Edit: A more [in-depth](https://scikit-learn.org/stable/modules/grid_search.html) explanation. – Scratch'N'Purr Jun 22 '21 at 07:02
  • There's literally the -almost- same question in SO: https://stackoverflow.com/questions/61163759/tuning-mlpregressor-hyper-parameters Do some research beforehand, it helps keeping the place clean and lean :) – Aleix CC Jun 22 '21 at 07:06
  • @Scratch'N'Purr Hi! How can I use this to change the degree as well? – Ariel Yael Jun 22 '21 at 07:31
  • @AleixCC I came across that question, but it did not show how to control the degree in the pipeline – Ariel Yael Jun 22 '21 at 07:32
  • @ArielYael Take a look at this [example](https://scikit-learn.org/stable/auto_examples/model_selection/grid_search_text_feature_extraction.html#sphx-glr-auto-examples-model-selection-grid-search-text-feature-extraction-py). They also use a pipeline, and they address each parameter in the grid search with the pattern `<step name>__<parameter name>`. Therefore, in your case, it would be `poly__degree`. – Scratch'N'Purr Jun 22 '21 at 07:36
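
The `poly__degree` naming suggested in the comments can be combined with `RandomizedSearchCV` so that the polynomial degree and the MLP architecture are searched together. The following is a minimal sketch under stated assumptions: the data is synthetic, the candidate values in `param_distributions` are arbitrary illustrations rather than recommendations, and the scaler is moved into the pipeline (step name `scale`) so that it is refit on each cross-validation fold.

```python
import numpy as np
from sklearn.model_selection import RandomizedSearchCV
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import MinMaxScaler, PolynomialFeatures

# Synthetic stand-in data (assumption, not the real DataFrame columns)
rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 1))
y = 2.0 * X[:, 0]**3 - X[:, 0] + rng.normal(scale=0.1, size=200)

pipe = Pipeline([('scale', MinMaxScaler()),
                 ('poly', PolynomialFeatures()),
                 ('mlpr', MLPRegressor(solver='adam', max_iter=2000,
                                       random_state=0))])

# Keys follow the <step name>__<parameter name> convention.
# Each tuple is one candidate architecture: one entry per hidden layer.
param_distributions = {
    'poly__degree': [1, 2, 3],
    'mlpr__hidden_layer_sizes': [(8,), (16,), (8, 8), (16, 8)],
    'mlpr__activation': ['relu', 'logistic', 'tanh'],
    'mlpr__alpha': [1e-5, 1e-4, 1e-3],
}

search = RandomizedSearchCV(estimator=pipe,
                            param_distributions=param_distributions,
                            n_iter=5, cv=3,
                            scoring='neg_mean_squared_error',
                            random_state=42)
search.fit(X, y)

print(search.best_params_)
print(-search.best_score_)  # mean cross-validated MSE of the best candidate
```

Note that `hidden_layer_sizes` lists only the hidden layers; the output layer is added by `MLPRegressor` itself. The same parameter dictionary should also work with `GridSearchCV` (passed as `param_grid`) if an exhaustive search is affordable.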
