
I have a dueling double deep Q network model that works with two dense layers, and I am trying to convert them into two LSTM layers because my model deals with time series. When I change the dense layers in the code, the error below appears, and I have been unable to deal with it. I know this problem has been solved many times here, but those solutions aren't working for me.

The code that works with two dense layers is written as follows:

import tensorflow as tf
from tensorflow import keras

class DuelingDeepQNetwork(keras.Model):
    def __init__(self, n_actions, fc1_dims, fc2_dims):
        super(DuelingDeepQNetwork, self).__init__()
        self.dense1 = keras.layers.Dense(fc1_dims, activation='relu')
        self.dense2 = keras.layers.Dense(fc2_dims, activation='relu')
        self.V = keras.layers.Dense(1, activation=None)
        self.A = keras.layers.Dense(n_actions, activation=None)
        
    def call(self, state):
        x = self.dense1(state)
        x = self.dense2(x)
        V = self.V(x)
        A = self.A(x)

        Q = (V + (A - tf.math.reduce_mean(A, axis=1, keepdims=True)))

        return Q

    def advantage(self, state):
        x = self.dense1(state)
        x = self.dense2(x)
        A = self.A(x)

        return A
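
For reference, this version accepts a plain 2-D batch of states without complaint; a minimal sketch with my parameters (64 states of 8 features each):

import numpy as np

model = DuelingDeepQNetwork(n_actions=2, fc1_dims=64, fc2_dims=32)
state = np.zeros((64, 8), dtype=np.float32)  # (batch, features)
Q = model(state)                             # shape (64, 2)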

It works without error, but when I turn the first two dense layers into LSTM layers, as follows:

class DuelingDeepQNetwork(keras.Model):
    def __init__(self, n_actions, fc1_dims, fc2_dims):
        super(DuelingDeepQNetwork, self).__init__()
        self.dense1 = keras.layers.LSTM(fc1_dims, activation='relu')
        self.dense2 = keras.layers.LSTM(fc2_dims, activation='relu')
        self.V = keras.layers.Dense(1, activation=None)
        self.A = keras.layers.Dense(n_actions, activation=None)

This error appears:

Input 0 of layer lstm_24 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [64, 8]
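
As I understand it, an LSTM layer expects a 3-D input of shape (batch, timesteps, features), while a plain state batch is 2-D. A minimal check of this requirement (assuming a batch of 64 states with 8 features each):

import numpy as np
from tensorflow import keras

lstm = keras.layers.LSTM(64)
state = np.zeros((64, 8), dtype=np.float32)  # (batch, features): 2-D, raises the ndim error
state3d = state.reshape(64, 1, 8)            # (batch, timesteps, features): 3-D
out = lstm(state3d)                          # works, output shape (64, 64)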

Following this question, "expected ndim=3, found ndim=2", I already tried to set the input shape using "state = state.reshape(64, 1, 8)" before running the neural network, as follows:

    def choose_action(self, observation):
        if np.random.random() < self.epsilon:
            action = np.random.choice(self.action_space)
        else:
            state = np.array([observation])
            state = state.reshape(64, 1, 8)  # <--------
            actions = self.q_eval.advantage(state)
            action = tf.math.argmax(actions, axis=1).numpy()[0,0]

        return action

But I get the exact same error. I also tried adding the argument "return_sequences=True" to both layers, but that didn't work either.

I don't know what to do, and I have to hand this in within a week. Can someone enlighten me?

EDIT

I'm using fc1_dims = 64, fc2_dims = 32, and n_actions = 2. The model uses 8 variables and has a batch size of 64. I uploaded the code to GitHub so you can execute it if you want. The project is not finished, so I won't write a proper read-me for now.

[github with code][2]

  • Could you kindly share the parameters with which you are building the network? I would like to debug it to give you a proper answer. Also, what input shape do you use? – pratsbhatt Aug 01 '20 at 10:17
  • @PrateekBhatt I'm using fc1_dims = 64, fc2_dims = 32, and n_actions = 2. The model uses 8 variables and has a batch size of 64. I edited the question and added my GitHub link with the code. – Luan Cézari Aug 01 '20 at 13:05

1 Answer


The code below works for me without any issues. The key change is return_sequences=True on the first LSTM layer, so that it emits a 3-D sequence the second LSTM can consume.

import tensorflow as tf
from tensorflow import keras

class DuelingDeepQNetwork(keras.Model):
    def __init__(self, n_actions, fc1_dims, fc2_dims):
        super(DuelingDeepQNetwork, self).__init__()
        # return_sequences=True keeps the 3-D (batch, timesteps, units)
        # output so the second LSTM receives the sequence it expects
        self.dense1 = keras.layers.LSTM(fc1_dims, activation='relu', return_sequences=True)
        self.dense2 = keras.layers.LSTM(fc2_dims, activation='relu')
        self.V = keras.layers.Dense(1, activation=None)
        self.A = keras.layers.Dense(n_actions, activation=None)
        
    def call(self, state):
        x = self.dense1(state)
        x = self.dense2(x)
        V = self.V(x)
        A = self.A(x)

        Q = (V + (A - tf.math.reduce_mean(A, axis=1, keepdims=True)))

        return Q

    def advantage(self, state):
        x = self.dense1(state)
        x = self.dense2(x)
        A = self.A(x)

        return A

And then building the model as shown below:

LSTMModel = DuelingDeepQNetwork(2, 64, 32)
LSTMModel.build(input_shape=(None,1,8))
LSTMModel.summary()

The result is as shown below:

Model: "dueling_deep_q_network_7"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
lstm_12 (LSTM)               multiple                  18688     
_________________________________________________________________
lstm_13 (LSTM)               multiple                  12416     
_________________________________________________________________
dense_16 (Dense)             multiple                  33        
_________________________________________________________________
dense_17 (Dense)             multiple                  66        
=================================================================
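
As a quick sanity check, here is a minimal sketch of a forward pass (assuming each state is treated as a single timestep with 8 features), including how a single observation in choose_action would need to be reshaped:

import numpy as np
import tensorflow as tf

# Batch forward pass: (batch, timesteps, features) = (64, 1, 8)
states = np.zeros((64, 1, 8), dtype=np.float32)
Q = LSTMModel(states)                       # shape (64, 2)

# A single observation must also be 3-D: (1, 1, 8), not (64, 1, 8)
observation = np.zeros(8, dtype=np.float32)
state = observation.reshape(1, 1, 8)
advantages = LSTMModel.advantage(state)     # shape (1, 2)
action = tf.math.argmax(advantages, axis=1).numpy()[0]

Note that with a 2-D advantage output, the argmax along axis 1 is indexed with a single [0] rather than [0,0].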
– pratsbhatt