TensorFlow 2.0 DQN Agent Issue with Custom Environment

Question

I've been following the DQN agent tutorial and set everything up as in the example. The only difference is that I built my own custom Python environment, which I then wrapped in a TensorFlow environment. However, no matter how I shape my observation and action specs, I can't get it to work: whenever I give the agent an observation and request an action, I get this error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: In[0] is not a matrix. Instead it has shape [10] [Op:MatMul]

Here's how I'm setting up my agent:

import tensorflow as tf
from tf_agents.agents.dqn import dqn_agent
from tf_agents.environments import tf_py_environment
from tf_agents.networks import q_network
from tf_agents.utils import common

# fc_layer_params lists the units per hidden layer: here, one hidden layer of 10 units
layer_parameters = (10,)

# hyperparameters
learning_rate = 1e-3  # @param {type:"number"}
train_step_counter = tf.Variable(0)

# instantiate agent
optimizer = tf.compat.v1.train.AdamOptimizer(learning_rate=learning_rate)

env = SumoEnvironment(self._num_actions, self._num_states)
env2 = tf_py_environment.TFPyEnvironment(env)
q_net = q_network.QNetwork(
    env2.observation_spec(),
    env2.action_spec(),
    fc_layer_params=layer_parameters)

print("Time step spec")
print(env2.time_step_spec())

agent = dqn_agent.DqnAgent(
    env2.time_step_spec(),
    env2.action_spec(),
    q_network=q_net,
    optimizer=optimizer,
    td_errors_loss_fn=common.element_wise_squared_loss,
    train_step_counter=train_step_counter)

And here's how I'm setting up my environment:

import numpy as np
from tf_agents import specs
from tf_agents.environments import py_environment


class SumoEnvironment(py_environment.PyEnvironment):

    def __init__(self, no_of_Actions, no_of_Observations):
        # the observation is a float32 vector; its length (16 here) is meant to be the number of states
        self._observation_spec = specs.TensorSpec(shape=(16,), dtype=np.float32, name='observation')
        # action spec: shape (1,), minimum 0, maximum is the number of actions minus 1
        self._action_spec = specs.BoundedArraySpec(
            shape=(1,), dtype=np.int32, minimum=0, maximum=no_of_Actions - 1, name='action')

        self._state = 0
        self._episode_ended = False

And here is what my input / observations look like:

tf.Tensor([ 0. 0. 0. 0. 0. 0. 0. 0. -1. -1. -1. -1. 0. 0. 0. -1.], shape=(16,), dtype=float32)

I've experimented with the shape and depth of my Q-network, and it seems that the [10] in the error is related to the network's layer parameters. Setting them to (4,) yields this error instead:

tensorflow.python.framework.errors_impl.InvalidArgumentError: In[0] is not a matrix. Instead it has shape [4] [Op:MatMul]

Do you have any link to a notebook, or a whole minimal example to share? — AlexisBRENON, Dec 02 '19 at 15:10
@AlexisBRENON unfortunately not. Though my code is mostly similar to the example provided here https://github.com/tensorflow/agents/blob/master/tf_agents/colabs/1_dqn_tutorial.ipynb and here for the environment https://github.com/tensorflow/agents/blob/master/tf_agents/colabs/2_environments_tutorial.ipynb — Ibraheem Nofal, Dec 02 '19 at 15:14

score 1 · Answer 1 · answered Sep 15 '20 at 07:30

In your Python environment, define self._observation_spec as a BoundedArraySpec rather than a TensorSpec. tf_py_environment.TFPyEnvironment(env) then converts the Python environment into a TensorFlow environment.

I'm not sure this causes the error, but it is at least a problem in the code.

score 1 · Answer 2 · answered Sep 28 '20 at 07:37

You could try setting the layer parameters as follows:

layer_parameters = (16,)

The Q-network tries to predict the next action from the current state. The shape of the state should match the input of the Q-network's underlying fully connected net.

Chris Chang

AlexisBRENON · Answer 3 · 2019-12-02T15:10:46.530

From the keyword "matrix" in the error message, I suppose that TF expects a two-dimensional tensor rather than a one-dimensional one.

I would suggest setting the layer parameters to (4, 1) (or (1, 4)).

I will try to play a bit with it to validate my answer.
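
To picture the rank requirement, here is a dependency-free sketch that only tracks shapes. It assumes, without confirmation from the thread, that the problem is an input arriving without a batch dimension. It also treats fc_layer_params as a list of units per hidden layer, so (10,) means one hidden layer of 10 units. The helper names and num_actions=4 are hypothetical.

```python
from typing import Sequence, Tuple

Shape = Tuple[int, ...]


def dense_output_shape(input_shape: Shape, units: int) -> Shape:
    """Shape produced by a dense layer computing `inputs @ kernel`.

    Mimics the strictness of a plain matrix multiply: the input must be
    rank 2, i.e. (batch_size, features).
    """
    if len(input_shape) != 2:
        raise ValueError(
            f"In[0] is not a matrix. Instead it has shape {list(input_shape)}")
    batch_size, _features = input_shape
    return (batch_size, units)


def network_output_shape(input_shape: Shape,
                         fc_layer_params: Sequence[int],
                         num_actions: int) -> Shape:
    """Pushes a shape through the hidden layers and a final Q-value layer."""
    shape = input_shape
    for units in fc_layer_params:
        shape = dense_output_shape(shape, units)
    return dense_output_shape(shape, num_actions)


def add_batch_dim(shape: Shape) -> Shape:
    """Prepends a batch axis of size 1, e.g. (16,) -> (1, 16)."""
    return (1,) + tuple(shape)


single_observation = (16,)  # observation shape from the question
batched = add_batch_dim(single_observation)

# prints (1, 4): one row of Q-values, one column per action
print(network_output_shape(batched, fc_layer_params=(10,), num_actions=4))
```

Under this reading, the layer sizes are not what needs to become two-dimensional; the input is. In TensorFlow the batch axis can be added with tf.expand_dims(observation, axis=0), and time steps returned by a TFPyEnvironment's reset() and step() already carry one.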

edited Dec 02 '19 at 15:10

answered Dec 02 '19 at 15:04


That was one of the first things I tried. But no matter how I shape my q_net layers, I can't seem to get it to work. Tried (4,4) and (10,10), but the issue still persists – Ibraheem Nofal Dec 02 '19 at 15:16
