I should say up front that I am not at all familiar with neural networks, and this is the first time I have tried to build one. The problem is to forecast one week of pollution from the data of the previous month. The raw data has 15 features (image: Start data).

The value to predict is 'gas', over the 168 hours of the following week. `MinMaxScaler(feature_range=(0, 1))` is applied to the data, which is then split into train and test sets. Since only one year of hourly measurements is available, the data is resampled into series of 672 hourly samples, each starting at midnight on a different day of the year. From about 8000 initial hourly readings, this yields about 600 series of 672 samples. The 'date' column is removed from the initial data, leaving 14 features (image: Shape of train_x and train_y).

train_x[0] holds the 672 hourly readings for the first 4 weeks of the data set, with all features including 'gas'. train_y[0] holds the 168 hourly 'gas' readings for the following week, which begins where the month in train_x[0] ends. (image: Train_X[0], where column 0 is 'gas', and Train_y[0], with only the gas column for the week after train_x[0])

TRAIN X SHAPE = (631, 672, 14)

TRAIN Y SHAPE = (631, 168, 1)
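
The windowing described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the `to_supervised` function used in the question: the helper name `make_windows`, the 24-hour stride, the choice of column 0 as 'gas', and the random stand-in data are all hypothetical.

```python
import numpy as np

def make_windows(data, n_input=672, n_output=168, stride=24, target_col=0):
    """Slice a (n_hours, n_features) array into input and target windows.

    Each input window covers n_input hours of all features; the matching
    target covers the next n_output hours of the target column only.
    """
    xs, ys = [], []
    last_start = len(data) - (n_input + n_output)
    for start in range(0, last_start + 1, stride):
        split = start + n_input
        xs.append(data[start:split, :])
        # Keep a trailing axis of size 1 so the target is (n_output, 1).
        ys.append(data[split:split + n_output, target_col:target_col + 1])
    if not xs:
        raise ValueError("data is too short for a single window")
    return np.stack(xs), np.stack(ys)

# Synthetic stand-in for the scaled data: 60 days of hourly rows, 14 features.
rng = np.random.default_rng(0)
data = rng.random((60 * 24, 14))

train_x, train_y = make_windows(data)
print(train_x.shape, train_y.shape)
```

Because consecutive windows start only one stride apart, neighbouring series share most of their hours, which is worth keeping in mind when holding out a validation set.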

After organizing the data in this way (if it's wrong, please let me know), I built the neural network as follows:

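    # Imports assumed for this snippet (not shown in the original post; tf.keras is a guess).
    import numpy as np
    from tensorflow.keras import Sequential, layers, optimizers
    from tensorflow.keras.callbacks import EarlyStopping, TensorBoard, ReduceLROnPlateau

    # to_supervised, train and n_input are defined elsewhere (not shown).
    train_x, train_y = to_supervised(train, n_input)
    train_x = train_x.astype(float)
    train_y = train_y.astype(float)
    # define parameters
    verbose, epochs, batch_size = 1, 200, 50
    n_timesteps, n_features, n_outputs = train_x.shape[1], train_x.shape[2], train_y.shape[1]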
    # define model
    model = Sequential()
    opt = optimizers.RMSprop(learning_rate=1e-3)
    model.add(layers.GRU(14, activation='relu', input_shape=(n_timesteps, n_features),
                         return_sequences=False, stateful=False))
    model.add(layers.Dense(1, activation='relu'))
    #model.add(layers.Dense(14, activation='linear'))
    model.add(layers.Dense(n_outputs, activation='sigmoid'))
    model.summary()
    model.compile(loss='mse', optimizer=opt, metrics=['accuracy'])

    train_y = np.concatenate(train_y).reshape(len(train_y), 168)

    callback_early_stopping = EarlyStopping(monitor='val_loss',
                                            patience=5, verbose=1)
    callback_tensorboard = TensorBoard(log_dir='./23_logs/',
                                       histogram_freq=0,
                                       write_graph=False)
    callback_reduce_lr = ReduceLROnPlateau(monitor='val_loss',
                                           factor=0.1,
                                           min_lr=1e-4,
                                           patience=0,
                                           verbose=1)
    callbacks = [callback_early_stopping,
                 callback_tensorboard,
                 callback_reduce_lr]
    history = model.fit(train_x, train_y, epochs=epochs, batch_size=batch_size,
                        verbose=verbose, shuffle=False, validation_split=0.2,
                        callbacks=callbacks)

When I fit the network I get:

    11/11 [==============================] - 5s 305ms/step - loss: 0.1625 - accuracy: 0.0207 - val_loss: 0.1905 - val_accuracy: 0.0157
    Epoch 2/200
    11/11 [==============================] - 2s 179ms/step - loss: 0.1594 - accuracy: 0.0037 - val_loss: 0.1879 - val_accuracy: 0.0157
    Epoch 3/200
    11/11 [==============================] - 2s 169ms/step - loss: 0.1571 - accuracy: 0.0040 - val_loss: 0.1855 - val_accuracy: 0.0079
    Epoch 4/200
    11/11 [==============================] - 2s 165ms/step - loss: 0.1550 - accuracy: 0.0092 - val_loss: 0.1832 - val_accuracy: 0.0079
    Epoch 5/200
    11/11 [==============================] - 2s 162ms/step - loss: 0.1529 - accuracy: 0.0102 - val_loss: 0.1809 - val_accuracy: 0.0079
    Epoch 6/200
    11/11 [==============================] - 2s 160ms/step - loss: 0.1508 - accuracy: 0.0085 - val_loss: 0.1786 - val_accuracy: 0.0079
    Epoch 7/200
    11/11 [==============================] - 2s 160ms/step - loss: 0.1487 - accuracy: 0.0023 - val_loss: 0.1763 - val_accuracy: 0.0079
    Epoch 8/200
    11/11 [==============================] - 2s 158ms/step - loss: 0.1467 - accuracy: 0.0023 - val_loss: 0.1740 - val_accuracy: 0.0079
    Epoch 9/200
    11/11 [==============================] - 2s 159ms/step - loss: 0.1446 - accuracy: 0.0034 - val_loss: 0.1718 - val_accuracy: 0.0000e+00
    Epoch 10/200
    11/11 [==============================] - 2s 160ms/step - loss: 0.1426 - accuracy: 0.0034 - val_loss: 0.1695 - val_accuracy: 0.0000e+00
    Epoch 11/200
    11/11 [==============================] - 2s 162ms/step - loss: 0.1406 - accuracy: 0.0034 - val_loss: 0.1673 - val_accuracy: 0.0000e+00
    Epoch 12/200
    11/11 [==============================] - 2s 159ms/step - loss: 0.1387 - accuracy: 0.0034 - val_loss: 0.1651 - val_accuracy: 0.0000e+00
    Epoch 13/200
    11/11 [==============================] - 2s 159ms/step - loss: 0.1367 - accuracy: 0.0052 - val_loss: 0.1629 - val_accuracy: 0.0000e+00
    Epoch 14/200
    11/11 [==============================] - 2s 159ms/step - loss: 0.1348 - accuracy: 0.0052 - val_loss: 0.1608 - val_accuracy: 0.0000e+00
    Epoch 15/200
    11/11 [==============================] - 2s 161ms/step - loss: 0.1328 - accuracy: 0.0052 - val_loss: 0.1586 - val_accuracy: 0.0000e+00
    Epoch 16/200
    11/11 [==============================] - 2s 162ms/step - loss: 0.1309 - accuracy: 0.0052 - val_loss: 0.1565 - val_accuracy: 0.0000e+00
    Epoch 17/200
    11/11 [==============================] - 2s 171ms/step - loss: 0.1290 - accuracy: 0.0052 - val_loss: 0.1544 - val_accuracy: 0.0000e+00
    Epoch 18/200
    11/11 [==============================] - 2s 174ms/step - loss: 0.1271 - accuracy: 0.0052 - val_loss: 0.1523 - val_accuracy: 0.0000e+00
    Epoch 19/200
    11/11 [==============================] - 2s 161ms/step - loss: 0.1253 - accuracy: 0.0052 - val_loss: 0.1502 - val_accuracy: 0.0000e+00
    Epoch 20/200
    11/11 [==============================] - 2s 161ms/step - loss: 0.1234 - accuracy: 0.0052 - val_loss: 0.1482 - val_accuracy: 0.0000e+00
    Epoch 21/200
    11/11 [==============================] - 2s 159ms/step - loss: 0.1216 - accuracy: 0.0052 - val_loss: 0.1461 - val_accuracy: 0.0000e+00
    Epoch 22/200
    11/11 [==============================] - 2s 164ms/step - loss: 0.1198 - accuracy: 0.0052 - val_loss: 0.1441 - val_accuracy: 0.0000e+00
    Epoch 23/200
    11/11 [==============================] - 2s 164ms/step - loss: 0.1180 - accuracy: 0.0052 - val_loss: 0.1421 - val_accuracy: 0.0000e+00
    Epoch 24/200
    11/11 [==============================] - 2s 163ms/step - loss: 0.1162 - accuracy: 0.0052 - val_loss: 0.1401 - val_accuracy: 0.0000e+00
    Epoch 25/200
    11/11 [==============================] - 2s 167ms/step - loss: 0.1145 - accuracy: 0.0052 - val_loss: 0.1381 - val_accuracy: 0.0000e+00
    Epoch 26/200
    11/11 [==============================] - 2s 188ms/step - loss: 0.1127 - accuracy: 0.0052 - val_loss: 0.1361 - val_accuracy: 0.0000e+00
    Epoch 27/200
    11/11 [==============================] - 2s 169ms/step - loss: 0.1110 - accuracy: 0.0052 - val_loss: 0.1342 - val_accuracy: 0.0000e+00
    Epoch 28/200
    11/11 [==============================] - 2s 189ms/step - loss: 0.1093 - accuracy: 0.0052 - val_loss: 0.1323 - val_accuracy: 0.0000e+00
    Epoch 29/200
    11/11 [==============================] - 2s 183ms/step - loss: 0.1076 - accuracy: 0.0079 - val_loss: 0.1304 - val_accuracy: 0.0000e+00
    Epoch 30/200
    11/11 [==============================] - 2s 172ms/step - loss: 0.1059 - accuracy: 0.0079 - val_loss: 0.1285 - val_accuracy: 0.0000e+00
    Epoch 31/200
    11/11 [==============================] - 2s 164ms/step - loss: 0.1042 - accuracy: 0.0079 - val_loss: 0.1266 - val_accuracy: 0.0000e+00
    Epoch 32/200

Accuracy always stays very low, and sometimes (as in this case) val_accuracy drops to 0 and never changes, while loss and val_loss decrease but do not converge well. I realize that I am certainly doing many things wrong, but I cannot work out how to fix them. I have tried other hyperparameters and other networks such as LSTM, without satisfactory results.

How can I improve the model so that the accuracy is at least decent? Any advice is welcome, thank you very much!

  • Accuracy is meaningless in regression settings (see [What function defines accuracy in Keras when the loss is mean squared error (MSE)?](https://stackoverflow.com/q/48775305/4685471)), as is the `sigmoid` activation in the final layer (should be `linear`). – desertnaut Jan 15 '21 at 12:24
  • OK, perfect, so there is no reason to use accuracy in regression. How can I get a metric that gives me a measure of prediction quality? Thank you so much for your reply anyway, you have already helped me a lot! – Brodino Jan 15 '21 at 12:33
  • You do not necessarily need an (extra) metric in regression problems - the loss itself can be the metric, too. – desertnaut Jan 15 '21 at 12:34
  • I'm confused: you have the "target" ('gas', which you're trying to predict) included in your training data (`train_x`)? And what does your `train_y` contain, then? The basic ML procedure is to have only the features (the characteristics of the "target") in `train_x`, while `train_y` holds the "targets" (the thing you're trying to predict), so that each row of `train_x` is associated with a row of `train_y`. I hope you can clear this up... – Khalid Saifullah Jan 15 '21 at 12:46
  • The gas field in train_x relates to the 672 hourly readings belonging to the month to be used as a basis for the prediction. With the data present in the 672 measurements of the month we want to predict the 168 gas values (i.e. a week) belonging to train_y. The week to be predicted is the week immediately following the month. So the gas values are different, I hope I was able to clarify – Brodino Jan 15 '21 at 13:14
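
To make the point in the comments about metrics concrete, here is a minimal NumPy sketch of regression-style quality measures (MAE and RMSE), reported both in scaled units and in the original units of 'gas'. The function name `regression_report`, the values of `gas_min` and `gas_max`, and the random data are hypothetical. The `peak_hour_match` entry mimics what Keras appears to compute when `'accuracy'` is requested for a multi-column output (whether the largest value in each row sits at the same index), which would explain why it stays near zero here. In Keras itself, the corresponding changes would presumably be `activation='linear'` on the last layer and something like `metrics=['mae']` in `compile`.

```python
import numpy as np

def regression_report(y_true, y_pred, gas_min, gas_max):
    """Summarise forecast error for min-max scaled targets of shape (n_series, n_hours)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = float(np.mean(np.abs(err)))
    rmse = float(np.sqrt(np.mean(err ** 2)))
    # Undoing (0, 1) min-max scaling multiplies every error by the original range.
    span = gas_max - gas_min
    # Classification-style "accuracy": does the weekly peak land on the same hour?
    peak_match = float(np.mean(y_true.argmax(axis=1) == y_pred.argmax(axis=1)))
    return {
        "mae_scaled": mae,
        "rmse_scaled": rmse,
        "mae_original": mae * span,
        "rmse_original": rmse * span,
        "peak_hour_match": peak_match,
    }

# Synthetic stand-in: 100 weekly targets in [0, 1] and a forecast with small noise.
rng = np.random.default_rng(0)
y_true = rng.random((100, 168))
y_pred = y_true + rng.normal(0.0, 0.05, size=y_true.shape)

# gas_min and gas_max are made up here; in practice they come from the fitted scaler.
report = regression_report(y_true, y_pred, gas_min=10.0, gas_max=210.0)
for name, value in report.items():
    print(f"{name}: {value:.4f}")
```

Even a forecast that tracks the target closely can score poorly on the peak-hour match, since a small error is enough to move the maximum to a different hour, whereas MAE and RMSE directly reflect how far the predictions are from the measurements.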
