I am trying to understand how to correctly feed data into my Keras model to classify multivariate time series data into three classes using an LSTM neural network.
I have already looked at different resources - mainly these three excellent blog posts by Jason Brownlee (post1, post2, post3), other SO questions and various papers - but none of the information given there exactly fits my problem, and I was not able to figure out whether my data preprocessing and the way I feed it into the model are correct, so I guessed I might get some help if I specify my exact conditions here.
What I am trying to do is classify multivariate time series data, which in its original form is structured as follows:
I have 200 samples
One sample is one csv file.
A sample can have 1 to 50 features (i.e. the csv file has 1 to 50 columns).
Each feature has its value "tracked" over a fixed number of time steps, let's say 100 (i.e. each csv file has exactly 100 rows).
Each csv file is labeled with one of three classes ("good", "too small", "too big").
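For context, this is roughly how I load the csv files into memory (a simplified sketch; the directory layout, file pattern and the use of pandas are just placeholders for illustration):

import glob
import numpy as np
import pandas as pd

samples = []
# one csv file per sample; the path pattern is a placeholder
for csv_path in sorted(glob.glob("data/sample_*.csv")):
    df = pd.read_csv(csv_path)        # 100 rows, 1 to 50 columns
    # transpose so that each sample becomes a list of features,
    # each feature holding its 100 time step values
    samples.append(df.values.T.tolist())

# with varying column counts this presumably ends up as an object array
samples = np.array(samples)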
So what my current status looks like is the following:
I have a numpy array "samples" with the following structure:
# array holding all samples
[
# sample 1
[
# feature 1 of sample 1
[ 0.1, 0.2, 0.3, 0.2, 0.3, 0.1, 0.2, 0.4, 0.5, 0.1, ... ], # "time series" of feature 1
# feature 2 of sample 1
[ 0.5, 0.6, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1, -0.1, -0.2, ... ], # "time series" of feature 2
... # up to 50 features
],
# sample 2
[
# feature 1 of sample 2
[ 0.1, 0.2, 0.3, 0.2, 0.3, 0.1, 0.2, 0.4, 0.5, 0.1, ... ], # "time series" of feature 1
# feature 2 of sample 2
[ 0.5, 0.6, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1, -0.1, -0.2, ... ], # "time series" of feature 2
... # up to 50 features
],
... # up to sample no. 200
]
I also have a numpy array "labels" with the same length as the "samples" array (i.e. 200). The labels are encoded in the following way:
- "good" = 0
- "too small" = 1
- "too big" = 2
[0, 2, 2, 1, 0, 1, 2, 0, 0, 0, 1, 2, ... ] # up to label no. 200
This "labels" array is then encoded with keras' to_categorical
function
to_categorical(labels, len(np.unique(labels)))
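Written out in full, that encoding step is (the result should then have shape (200, 3), i.e. one one-hot row per sample):

import numpy as np
from keras.utils import to_categorical

# labels is the integer-encoded array of length 200 shown above
labels = to_categorical(labels, len(np.unique(labels)))
print(labels.shape)  # expecting (200, 3)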
My model definition currently looks like this:
from keras.models import Sequential
from keras.layers import LSTM, Dense

max_nb_features = 50
nb_time_steps = 100

model = Sequential()
model.add(LSTM(5, input_shape=(max_nb_features, nb_time_steps)))
model.add(Dense(3, activation='softmax'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
- The 5 units in the LSTM layer are just picked arbitrarily for now
- The 3 output neurons in the dense layer are for my three classes
I then split the data into training and test sets:
from sklearn.model_selection import train_test_split

samples_train, samples_test, labels_train, labels_test = train_test_split(samples, labels, test_size=0.33)
This leaves us with 134 samples for training and 66 samples for testing.
The problem I'm currently running into is that the following call does not work:
model.fit(samples_train, labels_train, epochs=1, batch_size=1)
The error is the following:
Traceback (most recent call last):
File "lstm_test.py", line 152, in <module>
model.fit(samples_train, labels_train, epochs=1, batch_size=1)
File "C:\Program Files\Python36\lib\site-packages\keras\models.py", line 1002, in fit
validation_steps=validation_steps)
File "C:\Program Files\Python36\lib\site-packages\keras\engine\training.py", line 1630, in fit
batch_size=batch_size)
File "C:\Program Files\Python36\lib\site-packages\keras\engine\training.py", line 1476, in _standardize_user_data
exception_prefix='input')
File "C:\Program Files\Python36\lib\site-packages\keras\engine\training.py", line 113, in _standardize_input_data
'with shape ' + str(data_shape))
ValueError: Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (134, 1)
To me, it seems to fail because of the variable number of features my samples can have. If I use "fake" (generated) data in which everything else is the same but every sample has exactly the same number of features (50), the code works.
Now what I'm trying to understand is:
- Are my general assumptions on how I structured my data for the LSTM input correct? Are the parameters (batch_size, input_shape) correct / sensible?
- Is the Keras LSTM model in general able to handle samples with a different number of features?
- If yes, how do I have to adapt my code for it to work with a different number of features?
- If no, would "zero-padding" (filling) the columns of the samples with fewer than 50 features work (roughly what I have in mind is sketched below)? Are there other, preferred methods of achieving my goal?