Let me start by saying that I am not at all familiar with neural networks, and this is the first time I have tried to develop one. The problem is predicting a week's pollution forecast based on the previous month. The unstructured data has 15 features:

Start data

The target to predict is 'gas', for a total of 168 hours, the number of hours in the next week.
MinMaxScaler(feature_range=(0, 1)) is applied to the data, which is then split into train and test sets. Since only one year of hourly measurements is available, the data is resampled into series of 672 hourly samples, each starting at midnight of a day of the year. From roughly 8000 hourly readings, about 600 series of 672 samples are obtained.
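For context, here is a minimal sketch of the preprocessing just described; the file name, the 80/20 split ratio, and the variable names are my assumptions, since that part of the original code is not shown:

import pandas as pd
from sklearn.preprocessing import MinMaxScaler

# Hypothetical file name; 15 columns including 'date' and the target 'gas'.
df = pd.read_csv('pollution.csv', parse_dates=['date'])
values = df.drop(columns=['date']).values.astype(float)

# Scale every remaining column into [0, 1].
scaler = MinMaxScaler(feature_range=(0, 1))
scaled = scaler.fit_transform(values)

# Chronological train/test split (assumed 80/20; no shuffling for time series).
split = int(len(scaled) * 0.8)
train, test = scaled[:split], scaled[split:]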
The 'date' column is removed from the initial data, and the shapes of train_x and train_y are:

TRAIN X SHAPE = (631, 672, 14)
TRAIN Y SHAPE = (631, 168, 1)

train_x[0] contains 672 hourly readings, covering the first 4 weeks of the dataset, and consists of all the features including 'gas'. train_y[0], on the other hand, contains the 168 hourly readings for the week that begins where the month in train_x[0] ends.

train_x[0], where column 0 is 'gas', and train_y[0], with only the 'gas' column for the week after train_x[0]
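The to_supervised function called in the code below is not shown in the post; a minimal sketch that yields windows of exactly these shapes (the one-window-per-day stride of 24 hours is an assumption) might look like this:

import numpy as np

def to_supervised(data, n_input, n_out=168, stride=24):
    # data: 2-D array of scaled hourly readings, with 'gas' in column 0.
    # n_input: hours of history per sample (672 = 4 weeks).
    X, y = [], []
    for start in range(0, len(data) - n_input - n_out + 1, stride):
        end = start + n_input
        X.append(data[start:end, :])           # all 14 features, 672 hours
        y.append(data[end:end + n_out, 0:1])   # 'gas' only, following 168 hours
    return np.array(X), np.array(y)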
After organizing the data in this way (if it's wrong, please let me know), I built the neural network as follows:
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers, optimizers
from tensorflow.keras.callbacks import EarlyStopping, TensorBoard, ReduceLROnPlateau

n_input = 672  # 4 weeks of hourly history
train_x, train_y = to_supervised(train, n_input)
train_x = train_x.astype(float)
train_y = train_y.astype(float)
# define parameters
verbose, epochs, batch_size = 1, 200, 50
n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]
# define model
model = Sequential()
opt = optimizers.RMSprop(learning_rate=1e-3)
model.add(layers.GRU(14, activation='relu', input_shape=(n_timesteps, n_features),
                     return_sequences=False, stateful=False))
model.add(layers.Dense(1, activation='relu'))
#model.add(layers.Dense(14, activation='linear'))
model.add(layers.Dense(n_outputs, activation='sigmoid'))
model.summary()
model.compile(loss='mse', optimizer=opt, metrics=['accuracy'])
# flatten the targets from (n_series, 168, 1) to (n_series, 168)
train_y = np.concatenate(train_y).reshape(len(train_y), 168)
callback_early_stopping = EarlyStopping(monitor='val_loss',
                                        patience=5, verbose=1)
callback_tensorboard = TensorBoard(log_dir='./23_logs/',
                                   histogram_freq=0,
                                   write_graph=False)
callback_reduce_lr = ReduceLROnPlateau(monitor='val_loss',
                                       factor=0.1,
                                       min_lr=1e-4,
                                       patience=0,
                                       verbose=1)
callbacks = [callback_early_stopping,
             callback_tensorboard,
             callback_reduce_lr]
history = model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size,
                    verbose=verbose, shuffle=False,
                    validation_split=0.2, callbacks=callbacks)
When I fit the network I get:
11/11 [==============================] - 5s 305ms/step - loss: 0.1625 - accuracy: 0.0207 - val_loss: 0.1905 - val_accuracy: 0.0157
Epoch 2/200
11/11 [==============================] - 2s 179ms/step - loss: 0.1594 - accuracy: 0.0037 - val_loss: 0.1879 - val_accuracy: 0.0157
Epoch 3/200
11/11 [==============================] - 2s 169ms/step - loss: 0.1571 - accuracy: 0.0040 - val_loss: 0.1855 - val_accuracy: 0.0079
Epoch 4/200
11/11 [==============================] - 2s 165ms/step - loss: 0.1550 - accuracy: 0.0092 - val_loss: 0.1832 - val_accuracy: 0.0079
Epoch 5/200
11/11 [==============================] - 2s 162ms/step - loss: 0.1529 - accuracy: 0.0102 - val_loss: 0.1809 - val_accuracy: 0.0079
Epoch 6/200
11/11 [==============================] - 2s 160ms/step - loss: 0.1508 - accuracy: 0.0085 - val_loss: 0.1786 - val_accuracy: 0.0079
Epoch 7/200
11/11 [==============================] - 2s 160ms/step - loss: 0.1487 - accuracy: 0.0023 - val_loss: 0.1763 - val_accuracy: 0.0079
Epoch 8/200
11/11 [==============================] - 2s 158ms/step - loss: 0.1467 - accuracy: 0.0023 - val_loss: 0.1740 - val_accuracy: 0.0079
Epoch 9/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1446 - accuracy: 0.0034 - val_loss: 0.1718 - val_accuracy: 0.0000e+00
Epoch 10/200
11/11 [==============================] - 2s 160ms/step - loss: 0.1426 - accuracy: 0.0034 - val_loss: 0.1695 - val_accuracy: 0.0000e+00
Epoch 11/200
11/11 [==============================] - 2s 162ms/step - loss: 0.1406 - accuracy: 0.0034 - val_loss: 0.1673 - val_accuracy: 0.0000e+00
Epoch 12/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1387 - accuracy: 0.0034 - val_loss: 0.1651 - val_accuracy: 0.0000e+00
Epoch 13/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1367 - accuracy: 0.0052 - val_loss: 0.1629 - val_accuracy: 0.0000e+00
Epoch 14/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1348 - accuracy: 0.0052 - val_loss: 0.1608 - val_accuracy: 0.0000e+00
Epoch 15/200
11/11 [==============================] - 2s 161ms/step - loss: 0.1328 - accuracy: 0.0052 - val_loss: 0.1586 - val_accuracy: 0.0000e+00
Epoch 16/200
11/11 [==============================] - 2s 162ms/step - loss: 0.1309 - accuracy: 0.0052 - val_loss: 0.1565 - val_accuracy: 0.0000e+00
Epoch 17/200
11/11 [==============================] - 2s 171ms/step - loss: 0.1290 - accuracy: 0.0052 - val_loss: 0.1544 - val_accuracy: 0.0000e+00
Epoch 18/200
11/11 [==============================] - 2s 174ms/step - loss: 0.1271 - accuracy: 0.0052 - val_loss: 0.1523 - val_accuracy: 0.0000e+00
Epoch 19/200
11/11 [==============================] - 2s 161ms/step - loss: 0.1253 - accuracy: 0.0052 - val_loss: 0.1502 - val_accuracy: 0.0000e+00
Epoch 20/200
11/11 [==============================] - 2s 161ms/step - loss: 0.1234 - accuracy: 0.0052 - val_loss: 0.1482 - val_accuracy: 0.0000e+00
Epoch 21/200
11/11 [==============================] - 2s 159ms/step - loss: 0.1216 - accuracy: 0.0052 - val_loss: 0.1461 - val_accuracy: 0.0000e+00
Epoch 22/200
11/11 [==============================] - 2s 164ms/step - loss: 0.1198 - accuracy: 0.0052 - val_loss: 0.1441 - val_accuracy: 0.0000e+00
Epoch 23/200
11/11 [==============================] - 2s 164ms/step - loss: 0.1180 - accuracy: 0.0052 - val_loss: 0.1421 - val_accuracy: 0.0000e+00
Epoch 24/200
11/11 [==============================] - 2s 163ms/step - loss: 0.1162 - accuracy: 0.0052 - val_loss: 0.1401 - val_accuracy: 0.0000e+00
Epoch 25/200
11/11 [==============================] - 2s 167ms/step - loss: 0.1145 - accuracy: 0.0052 - val_loss: 0.1381 - val_accuracy: 0.0000e+00
Epoch 26/200
11/11 [==============================] - 2s 188ms/step - loss: 0.1127 - accuracy: 0.0052 - val_loss: 0.1361 - val_accuracy: 0.0000e+00
Epoch 27/200
11/11 [==============================] - 2s 169ms/step - loss: 0.1110 - accuracy: 0.0052 - val_loss: 0.1342 - val_accuracy: 0.0000e+00
Epoch 28/200
11/11 [==============================] - 2s 189ms/step - loss: 0.1093 - accuracy: 0.0052 - val_loss: 0.1323 - val_accuracy: 0.0000e+00
Epoch 29/200
11/11 [==============================] - 2s 183ms/step - loss: 0.1076 - accuracy: 0.0079 - val_loss: 0.1304 - val_accuracy: 0.0000e+00
Epoch 30/200
11/11 [==============================] - 2s 172ms/step - loss: 0.1059 - accuracy: 0.0079 - val_loss: 0.1285 - val_accuracy: 0.0000e+00
Epoch 31/200
11/11 [==============================] - 2s 164ms/step - loss: 0.1042 - accuracy: 0.0079 - val_loss: 0.1266 - val_accuracy: 0.0000e+00
Epoch 32/200
Accuracy always remains very low, and sometimes (as in this case) val_accuracy drops to 0 and never changes, while loss and val_loss decrease but do not converge well. I realize that I am certainly doing many things wrong, and I cannot figure out how to fix it. I have of course tried other hyperparameters and also other networks such as LSTM, but I did not get satisfactory results.
How can I improve the model so that the accuracy is at least decent? Any advice is welcome, thank you very much!