I am assuming that you will need to share the parameters for each array that you stack.
If you were stacking entirely new features, then there wouldn't be an associated target with each one.
If you were stacking completely different examples, then you would not be using 3D arrays, and would just be appending them to the end like normal.
Solution
To solve this problem, I would leverage the TimeDistributed wrapper from Keras.
LSTM layers expect an input shape of (j, k), where j is the number of time steps and k is the number of features. Since you want to keep your array as 3D for the input and output, you will want to stack on a different dimension than the feature dimension.
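For reference, a batch of such inputs has shape (batch, j, k); a minimal NumPy illustration with hypothetical data:

```python
import numpy as np

# One univariate series: j = 40 time steps, k = 1 feature
x = np.arange(40, dtype="float32").reshape(40, 1)

# A batch of 8 such series, as an LSTM layer receives it: (batch, j, k)
batch = np.stack([x] * 8)
assert batch.shape == (8, 40, 1)
```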
Quick side note:
I think it’s important to note the difference between the approaches. Stacking on the feature dimension gives you multiple features for the same time steps; in that case you would feed them to ordinary LSTM layers directly and not go this route. Because you want a 3D input and a 3D output, I am proposing that you create a new dimension to stack on, which allows you to apply the same LSTM layers independently to each array.
TimeDistributed:
This wrapper applies the same layer to each slice along axis 1. By stacking your X1 and X2 arrays on axis 1 and using the TimeDistributed wrapper, you apply the LSTM layers independently to each array that you stack. Notice below that the original and updated model summaries have exactly the same number of parameters.
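Conceptually, TimeDistributed applies the wrapped layer to each slice along axis 1 and re-stacks the results. A NumPy sketch of those semantics (not Keras's actual implementation — the stand-in "layer" below is a hypothetical placeholder):

```python
import numpy as np

# A batch of 8 samples, each containing 2 stacked series of shape (40, 1)
x = np.zeros((8, 2, 40, 1))

# TimeDistributed(layer) conceptually applies the same layer to x[:, i]
# for every i along axis 1, then re-stacks the results on axis 1
def time_distributed(layer, x):
    return np.stack([layer(x[:, i]) for i in range(x.shape[1])], axis=1)

def fake_lstm(batch):
    # Stand-in mapping (batch, 40, 1) -> (batch, 6), like the
    # return_sequences=False LSTM in the model below
    return np.tile(batch.mean(axis=(1, 2))[:, None], (1, 6))

out = time_distributed(fake_lstm, x)
assert out.shape == (8, 2, 6)  # one (batch, 6) result per stacked array
```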
Implementation Steps:
The first step is to reshape the input of (40, 2) into (2, 40, 1). This gives you the equivalent of two (40, 1) array inputs. You can either do this in the model like I’ve done, or when building your dataset, and update the input shape accordingly.
- By adding the extra dimension (..., 1) to the end, we keep the data in a format that the LSTM would understand if it were looking at just one of the stacked arrays at a time. Notice that your original input_shape is (40, 1), for instance.
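One thing worth checking with this step: a plain reshape preserves flat element order, so (40, 2) → (2, 40, 1) only separates the two series if each series is laid out contiguously; if they were stacked on the feature axis, you would want to transpose first. A quick NumPy sanity check with hypothetical arrays:

```python
import numpy as np

# Two series stacked on the last (feature) axis: shape (40, 2)
a = np.stack([np.zeros(40), np.ones(40)], axis=-1)

# A plain reshape keeps flat element order, so it interleaves the series
r = a.reshape(2, 40, 1)

# Transposing first keeps each series contiguous
t = a.T.reshape(2, 40, 1)

assert not np.array_equal(r, t)
assert np.array_equal(t[0].ravel(), np.zeros(40))
assert np.array_equal(t[1].ravel(), np.ones(40))
```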
Then wrap each layer in the TimeDistributed wrapper.
And finally, reshape the y output to match your data by swapping (2, 10) to (10, 2).
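As mentioned above, you can also do the reshaping when building the dataset instead of in the model. A sketch with hypothetical random data (the sample count of 8 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 8 samples, two series of 40 time steps each
X1 = rng.random((8, 40))
X2 = rng.random((8, 40))

# Stack on a new axis 1 and add the trailing feature dimension:
# (8, 2, 40, 1); with this layout you can drop the in-model Reshape
# and use InputLayer(input_shape=(2, 40, 1)) instead
X = np.stack([X1, X2], axis=1)[..., np.newaxis]
assert X.shape == (8, 2, 40, 1)

# Matching targets, built the same way the model reshapes its output:
# stack per series to (8, 2, 10), then reshape to (8, 10, 2) so the
# element order matches the model's final Reshape layer
y1 = rng.random((8, 10))
y2 = rng.random((8, 10))
y = np.stack([y1, y2], axis=1).reshape(8, 10, 2)
assert y.shape == (8, 10, 2)
```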
Code
from tensorflow.keras import Sequential
from tensorflow.keras.layers import LSTM, Dense, TimeDistributed, InputLayer, Reshape
# Original Model
model = Sequential()
model.add(LSTM(12, input_shape=(40, 1), return_sequences=True))
model.add(LSTM(12, return_sequences=True))
model.add(LSTM(6, return_sequences=False))
model.add(Dense(10))
model.summary()
Original Model Summary
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
lstm (LSTM)                  (None, 40, 12)            672
_________________________________________________________________
lstm_1 (LSTM)                (None, 40, 12)            1200
_________________________________________________________________
lstm_2 (LSTM)                (None, 6)                 456
_________________________________________________________________
dense (Dense)                (None, 10)                70
=================================================================
Total params: 2,398
Trainable params: 2,398
Non-trainable params: 0
_________________________________________________________________
Apply TimeDistributed Wrapper
model = Sequential()
model.add(InputLayer(input_shape=(40, 2)))
model.add(Reshape(target_shape=(2, 40, 1)))
model.add(TimeDistributed(LSTM(12, return_sequences=True)))
model.add(TimeDistributed(LSTM(12, return_sequences=True)))
model.add(TimeDistributed(LSTM(6, return_sequences=False)))
model.add(TimeDistributed(Dense(10)))
model.add(Reshape(target_shape=(10, 2)))
model.summary()
Updated Model Summary
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
reshape (Reshape)            (None, 2, 40, 1)          0
_________________________________________________________________
time_distributed (TimeDistri (None, 2, 40, 12)         672
_________________________________________________________________
time_distributed_1 (TimeDist (None, 2, 40, 12)         1200
_________________________________________________________________
time_distributed_2 (TimeDist (None, 2, 6)              456
_________________________________________________________________
time_distributed_3 (TimeDist (None, 2, 10)             70
_________________________________________________________________
reshape_1 (Reshape)          (None, 10, 2)             0
=================================================================
Total params: 2,398
Trainable params: 2,398
Non-trainable params: 0
_________________________________________________________________