I am doing a comparative study of simple regression (one independent variable and one target variable) using two approaches: LinearRegression and a neural network (NN, Keras API). My sample data is as follows:
x1 y
121.9114 121.856
121.856 121.4011
121.4011 121.3222
121.3222 121.9502
121.9502 122.0644
LinearRegression Code:
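from sklearn.linear_model import LinearRegression
# Ordinary least squares fit on the fixed training split.
lr = LinearRegression()
lr.fit(X_train, y_train)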
Note: The LR model gives me an RMSE of 0.22 consistently on every run.
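To make the metric explicit, here is a minimal sketch of how the RMSE could be computed, assuming it is the square root of the mean squared error between predictions and targets. It uses NumPy's least-squares fit (np.polyfit) in place of scikit-learn's LinearRegression purely for illustration, and only the five sample rows above; the rmse helper is a hypothetical name. A least-squares fit involves no random initialization, so refitting gives the same line each time, which is consistent with the stable RMSE.

```python
import numpy as np

# The five sample rows shown above (x1, y).
x = np.array([121.9114, 121.856, 121.4011, 121.3222, 121.9502])
y = np.array([121.856, 121.4011, 121.3222, 121.9502, 122.0644])

def rmse(y_true, y_pred):
    """Root mean squared error (hypothetical helper)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Degree-1 least-squares fit: y ~ slope * x + intercept.
slope, intercept = np.polyfit(x, y, deg=1)
y_pred = slope * x + intercept

# Fitting again gives the same coefficients: nothing here is random.
slope2, intercept2 = np.polyfit(x, y, deg=1)

print("RMSE on the sample rows:", rmse(y, y_pred))
```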
NN Code:
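from tensorflow.keras import layers, models

# One hidden layer with 2 ReLU units, followed by a single linear output unit.
nn_model = models.Sequential()
nn_model.add(layers.Dense(2, input_dim=1, activation='relu'))
nn_model.add(layers.Dense(1))
# Adam optimizer with default settings, MSE loss, MAE as an extra metric.
nn_model.compile(optimizer='adam', loss='mse', metrics=['mae'])
nn_model.fit(X_train, y_train, epochs=40, batch_size=32)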
Training Loss:
Epoch 1/40 539/539 [==============================] - 0s 808us/sample - loss: 16835.0895 - mean_absolute_error: 129.5276
Epoch 2/40 539/539 [==============================] - 0s 163us/sample - loss: 16830.6868 - mean_absolute_error: 129.5106
Epoch 3/40 539/539 [==============================] - 0s 204us/sample - loss: 16826.2856 - mean_absolute_error: 129.4935
...........................................
...........................................
Epoch 39/40 539/539 [==============================] - 0s 187us/sample - loss: 16668.3582 - mean_absolute_error: 128.8823
Epoch 40/40 539/539 [==============================] - 0s 168us/sample - loss: 16663.9828 - mean_absolute_error: 128.8654
The NN-based solution gives me RMSE = 136.7476.
Interestingly, the NN-based solution gives me a different RMSE on each run, because the training loss differs from run to run.
For example, in the first run shown above, the loss starts at 16835 and the final loss at the 40th epoch is 16663. In this case the model gives me RMSE = 136.74.
If I run the same code a second time, the loss starts at 16144 and the final loss at the 40th epoch is 5. In this case the RMSE comes to 7.3.
Sometimes I also see an RMSE of 0.22, when the training loss starts at 400 and ends (40th epoch) at 0.06.
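One explanation worth checking, offered as an assumption rather than something confirmed above: with unscaled inputs around 121 and only two ReLU hidden units, a random initialization in which both first-layer weights are negative makes every hidden activation zero, so those units receive no gradient and only the output bias can learn. The NumPy sketch below mimics the hidden layer of the 1-2-1 network with hypothetical weights; it is not Keras code, and the sampling range assumes Keras's default Glorot-uniform kernel initializer with zero biases.

```python
import numpy as np

def hidden_activations(x, w1, b1):
    """ReLU hidden layer of a 1-input, 2-unit network (hypothetical helper)."""
    return np.maximum(0.0, np.outer(x, w1) + b1)

x = np.array([121.9114, 121.856, 121.4011, 121.3222, 121.9502])
b1 = np.zeros(2)  # assumption: biases start at zero (the Keras Dense default)

# Case 1: both first-layer weights negative -> every activation is 0,
# so the gradient w.r.t. these weights is 0 as well ("dead" units).
h_dead = hidden_activations(x, np.array([-0.7, -0.3]), b1)

# Case 2: at least one positive weight -> that unit passes the signal on.
h_alive = hidden_activations(x, np.array([0.9, -0.3]), b1)

print("all hidden units dead:", bool(np.all(h_dead == 0.0)))
print("some hidden unit alive:", bool(np.any(h_alive > 0.0)))

# Rough estimate of how often this happens across random initializations.
# Assumption: weights drawn uniformly from [-limit, limit], where
# limit = sqrt(6 / (fan_in + fan_out)) is the Glorot-uniform bound.
rng = np.random.default_rng(0)
limit = np.sqrt(6.0 / (1 + 2))
w_samples = rng.uniform(-limit, limit, size=(10_000, 2))
dead_fraction = float(np.mean(np.all(w_samples < 0.0, axis=1)))
print("fraction of inits with both units dead:", dead_fraction)
```

If this is what is happening, the runs that stall at a loss above 16000 would be the ones where both units start dead. Printing nn_model.get_weights() right after building the model, before calling fit, should confirm or rule this out.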
This Keras behavior makes it hard for me to tell whether there is a problem with the Keras API, whether I am doing something wrong, or whether this problem is simply not suitable for Keras.
Could you please help me understand the issue, and what would be the best way to stabilize the NN-based solution?
Some additional info:
- My training and test data are always fixed, so no data is shuffled.
- Number of records in train data = 539
- Number of records in test data = 154
- I also tried MinMax scaling on the train and test data, but it doesn't make the predictions stable.
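For reference, a minimal sketch of min-max scaling as it is commonly applied: the minimum and maximum are taken from the training data only and then reused for the test data. It is written in plain NumPy with hypothetical helper names and an illustrative split of the sample rows, so it is not necessarily how the scaling was done in the runs described above.

```python
import numpy as np

def fit_minmax(values):
    """Return (min, max) learned from the training data only."""
    values = np.asarray(values, dtype=float)
    return float(values.min()), float(values.max())

def scale(values, lo, hi):
    """Map values to [0, 1] using the training min/max."""
    return (np.asarray(values, dtype=float) - lo) / (hi - lo)

def unscale(scaled, lo, hi):
    """Invert scale(): map scaled values back to the original units."""
    return np.asarray(scaled, dtype=float) * (hi - lo) + lo

# Illustrative split of the sample rows: first four as train, last as test.
x_train = np.array([121.9114, 121.856, 121.4011, 121.3222])
x_test = np.array([121.9502])

lo, hi = fit_minmax(x_train)
x_train_scaled = scale(x_train, lo, hi)
# Test values beyond the training range land outside [0, 1].
x_test_scaled = scale(x_test, lo, hi)

print(x_train_scaled.min(), x_train_scaled.max())
print(x_test_scaled)
```

If the target is scaled in the same way, predictions would need to go through unscale before the RMSE is computed, so that the result is comparable with the 0.22 from linear regression.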