
I'm struggling with optimizing an LSTM neural network; let me explain what I'm trying to do :)

--> I have a data set with, let's say, the daily temperature at my location since 2015.

--> I want to predict tomorrow's temperature based on the temperatures of the last 30 days.

So basically what I did is a pandas table with 31 columns and 2k rows. Each row represents the temperatures over a 31-day period:

[[18.5, 19.6, 15.2, 16.3 ... 12.4, 13.2]
[19.6, 15.2, 16.3, 12.6 ... 13.2, 15.5]
[......]]

Then I created the same table, but with each day's temperature variation compared to the day before, in %.

I then isolated the first 30 columns of my table as the input and the last column as the target. So I try to predict tomorrow's % temperature variation based on the % variations of the previous 30 days.
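In other words, for each row of the table I slice one window like this (just to illustrate the indexing, the full code is below):

# 31 consecutive % variations: the first 30 are the inputs,
# the last one is the value I want to predict
window = data_delta[j - 31:j]
inputs, target = window[:30], window[30]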

So I wrote this code:

import pandas as pd
import tensorflow as tf
from tensorflow.keras import layers
import numpy as np
import matplotlib.pyplot as plt


   
def visualize_training_results(results):
    history = results.history
    plt.figure(figsize=(12, 4))
    plt.plot(history['loss'])
    plt.title('Loss')
    plt.xlabel('Epochs')
    plt.ylabel('Loss')
    plt.show()

data_delta = []
data_base = pd.read_csv('data.csv')

length_data_base = len(data_base)
# day-over-day variation in %, relative to the previous day (column 5 holds the temperature)
for i in range(1, (length_data_base - 1)):
    data_delta.append(round((data_base.iloc[i, 5] - data_base.iloc[i - 1, 5]) / data_base.iloc[i - 1, 5] * 100, 4))

training_set = pd.DataFrame([], columns=[str(i) for i in range(30)] + ['outputs'])

for j in range(31, (length_data_base - 1)):    
    data_train = pd.Series(data_delta[j - 31:j], index = training_set.columns)
    training_set = training_set.append(data_train, ignore_index = True)

training_data = training_set.drop(training_set.columns[[30]], axis='columns')
training_labels = training_set.pop('outputs')

training_data_model = np.array(training_data)
training_labels_model = np.array(training_labels)

training_data_model = training_data_model.reshape(len(training_data_model), 30, 1)

data_model = tf.keras.Sequential([

    layers.LSTM(30, return_sequences=True, activation= 'relu' , input_shape=(30,1)),
    layers.Dense(12, activation= 'relu'),
    layers.Dense(12, activation= 'relu'),
    layers.LSTM(10, activation= 'relu'),
    layers.Dense(1) ])

data_model.compile(loss = tf.losses.MeanSquaredError(),
                   optimizer = tf.optimizers.Adam())
data_model.summary()

results = data_model.fit(training_data_model, training_labels_model, batch_size = 300, epochs=10000)

visualize_training_results(results)

Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm (LSTM)                  (None, 30, 30)            3840      
_________________________________________________________________
dense (Dense)                (None, 30, 12)            372       
_________________________________________________________________
dense_1 (Dense)              (None, 30, 12)            156       
_________________________________________________________________
lstm_1 (LSTM)                (None, 10)                920       
_________________________________________________________________
dense_2 (Dense)              (None, 1)                 11        
=================================================================
Total params: 5,299
Trainable params: 5,299
Non-trainable params: 0

At first it works great, but after about 5000 epochs the loss has a huge spike and never comes back down to the low levels it had before.

Here is a picture of my loss vs. epochs: [plot: spike in loss after ~5000 epochs]

The % values in my data set range from -37 to +42, with a lot of values around 0. I tried to normalize them, but using MinMaxScaler makes me lose too much granularity in my data; I want to be able to predict a 40% increase even if most days the change is only 0-3%.
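
For reference, my scaling attempt was roughly this (a sketch from memory, using scikit-learn's MinMaxScaler on the % variations):

from sklearn.preprocessing import MinMaxScaler

# scale the % variations to [0, 1]
scaler = MinMaxScaler(feature_range=(0, 1))
deltas_scaled = scaler.fit_transform(np.array(data_delta).reshape(-1, 1))
# with outliers at -37 and +42, the usual 0-3% days all get squeezed
# into a narrow band around 0.5, which is what I mean by losing granularity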

What am I doing wrong here? Is the architecture of the NN even suitable for what I am trying to do? Should I set a different learning rate?
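
For example, would something like this be a reasonable thing to try? (Just a guess on my part: a smaller learning rate plus gradient clipping.)

data_model.compile(loss = tf.losses.MeanSquaredError(),
                   # guessed values, not sure these are right
                   optimizer = tf.optimizers.Adam(learning_rate = 1e-4, clipnorm = 1.0))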

PS: I'm a beginner, so I may have done some very wrong stuff :D

Thank you in advance !
