I have an input shape of 64x60x4 for training a reinforcement learning agent to play Mario. The problem is that the learned behaviour seems very much "if the screen looks like this, then do that", which isn't very good for this problem.
I want to add an LSTM layer after three Conv2D layers in Keras (TensorFlow), but it complains that it expected 5 dimensions and received 4. When I play with the layers, the numbers in the error become 6 and 5.
So how do I get an LSTM layer into the following model with input_shape 64x60x4 (the 4 being the last 4 frames, to help it learn the acceleration and direction of objects):
image_input = Input(shape=input_shape)
out = Conv2D(filters=32, kernel_size=8, strides=(4, 4), padding=padding, activation='relu')(image_input)
out = Conv2D(filters=64, kernel_size=4, strides=(2, 2), padding=padding, activation='relu')(out)
out = Conv2D(filters=64, kernel_size=4, strides=(1, 1), padding=padding, activation='relu')(out)
out = MaxPooling2D(pool_size=(2, 2))(out)
out = Flatten()(out)
out = Dense(256, activation='relu')(out)
### LSTM should go here ###
q_value = Dense(num_actions, activation='linear')(out)
Any other suggestions/pointers for this would be welcome.
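For reference, here is a minimal sketch of one common arrangement, not necessarily the best one for this agent. Keras's LSTM layer expects input shaped (batch, timesteps, features), while the output of Flatten/Dense in the model above is (batch, features), so there is no time axis left for the LSTM to run over. The sketch assumes the 4 stacked frames are moved out of the channel axis into an explicit time axis, giving an input of (4, 64, 60, 1), and wraps each per-frame layer in TimeDistributed. The values of time_steps, padding and num_actions, and the LSTM width of 128, are assumptions for illustration and are not taken from the model above.

```python
from tensorflow.keras.layers import (Input, Conv2D, MaxPooling2D, Flatten,
                                     Dense, LSTM, TimeDistributed)
from tensorflow.keras.models import Model

# Assumed values for illustration (not given in the question).
time_steps = 4                       # the last 4 frames, now an explicit time axis
height, width, channels = 64, 60, 1  # one grayscale frame per time step
num_actions = 6                      # hypothetical size of the action space
padding = 'same'                     # hypothetical; the question leaves this unspecified

# Input is 5-D including the batch axis: (batch, time, height, width, channels).
image_input = Input(shape=(time_steps, height, width, channels))

# TimeDistributed applies the same wrapped layer (shared weights) to every frame.
out = TimeDistributed(Conv2D(filters=32, kernel_size=8, strides=(4, 4),
                             padding=padding, activation='relu'))(image_input)
out = TimeDistributed(Conv2D(filters=64, kernel_size=4, strides=(2, 2),
                             padding=padding, activation='relu'))(out)
out = TimeDistributed(Conv2D(filters=64, kernel_size=4, strides=(1, 1),
                             padding=padding, activation='relu'))(out)
out = TimeDistributed(MaxPooling2D(pool_size=(2, 2)))(out)
out = TimeDistributed(Flatten())(out)                       # (batch, time, features)
out = TimeDistributed(Dense(256, activation='relu'))(out)   # (batch, time, 256)

# The LSTM consumes the per-frame feature sequence and returns its final state.
out = LSTM(128)(out)                                        # (batch, 128)

q_value = Dense(num_actions, activation='linear')(out)
model = Model(inputs=image_input, outputs=q_value)
model.summary()
```

With this layout, each state fed to the network would need to be shaped (4, 64, 60, 1) rather than (64, 60, 4), for example by moving the frame axis to the front and adding a trailing channel axis.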