
I built a custom environment with OpenAI Gym's spaces.Tuple because my observation is made up of: hour (0-23), day (1-7), and month (1-12), which are discrete; four continuous numbers, which come from a CSV file; and an array of shape (4, 24), which also comes from a CSV file.

self.observation_space = spaces.Tuple(spaces=(
                                             spaces.Box(low=-high, high=high, shape=(4,), dtype=np.float16),
                                             spaces.Box(low=-high, high=high, shape=(4, 24), dtype=np.float16),
                                             spaces.Discrete(24),
                                             spaces.Discrete(7),
                                             spaces.Discrete(12)))

This is my reset() function to read data from the csv file:

    def reset(self):
        index = 0
        hour = 0
        day = 1
        month = 6
        array1 = np.array([
            self.df.loc[index, 'A'],
            self.df.loc[index, 'B'],
            self.df.loc[index, 'C'],
            self.df.loc[index, 'D'],
        ], dtype=np.float16)
        array2 = np.array([
            self.df.loc[index: index+23, 'A'],
            self.df.loc[index: index+23, 'B'],
            self.df.loc[index: index+23, 'C'],
            self.df.loc[index: index+23, 'D'],
        ], dtype=np.float16)
        tup = (array1, array2, hour, day, month)
        return tup 

To train the agent, I want to use the DQN algorithm, specifically the DQNAgent from the keras-rl library. Here is my code to build the neural network model:

model = Sequential()
model.add(Flatten(input_shape=(1,) + env.observation_space.shape))
model.add(Dense(16))
model.add(Activation('relu'))
model.add(Dense(nb_actions))
model.add(Activation('linear'))

As far as I understand, spaces.Tuple instances don't provide a usable shape, and len() returns the number of spaces in the tuple, e.g. 5 in my environment:

state = env.reset()
n_spaces = len(state)  # 5; avoids shadowing the built-in len

But to build the neural network, it seems that I need 4 + 4*24 + 3 = 103 input neurons. I tried to hard-code the input dimension as:

model.add(Flatten(input_shape=(1,) + (103,)))

But I got the following error:

ValueError: Error when checking input: expected flatten_1_input to have shape (1, 103) but got array with shape (1, 5).
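The shape mismatch above suggests the model receives the 5-element tuple itself rather than a 103-element vector. A minimal sketch of flattening one observation by hand, assuming the tuple layout returned by reset() (the helper name flatten_observation is hypothetical, not part of Gym or keras-rl):

```python
import numpy as np

def flatten_observation(obs):
    """Concatenate (array1, array2, hour, day, month) into one 1-D float vector."""
    array1, array2, hour, day, month = obs
    return np.concatenate([
        np.asarray(array1, dtype=np.float32).ravel(),    # 4 values
        np.asarray(array2, dtype=np.float32).ravel(),    # 4 * 24 = 96 values
        np.array([hour, day, month], dtype=np.float32),  # 3 values
    ])

# Dummy observation with the same layout as the tuple built in reset()
obs = (np.zeros(4, dtype=np.float16), np.zeros((4, 24), dtype=np.float16), 0, 1, 6)
flat = flatten_observation(obs)
print(flat.shape)  # (103,)
```

Here the three discrete values are kept as plain numbers, which matches the 4 + 4*24 + 3 = 103 count in the question.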

So I then tried:

model.add(Flatten(input_shape=(1,) + (env.observation_space.__len__(),)))

But I also got an error:

TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:/Users/yuche/Dropbox/risk hedging/rl-project/DqnDAMarketAgent.py", line 37, in
    dqn.fit(env, nb_steps=1440, visualize=True, verbose=2)
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\rl\core.py", line 169, in fit
    action = self.forward(observation)
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\rl\agents\dqn.py", line 228, in forward
    q_values = self.compute_q_values(state)
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\rl\agents\dqn.py", line 69, in compute_q_values
    q_values = self.compute_batch_q_values([state]).flatten()
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\rl\agents\dqn.py", line 64, in compute_batch_q_values
    q_values = self.model.predict_on_batch(batch)
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\keras\engine\training.py", line 1580, in predict_on_batch
    outputs = self.predict_function(ins)
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\tensorflow\python\keras\backend.py", line 3277, in __call__
    dtype=tensor_type.as_numpy_dtype))
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\numpy\core\_asarray.py", line 83, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.

I googled this error and found a possible reason:

This happens while the function that you hold defined or have built is expecting any single parameter yet gets an array rather.

It seems that I still need 103 input neurons instead of 5, but the Tuple feeds the two arrays directly into the network. What is the typical usage of Tuple in DQN?
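One place such a conversion could live is a keras-rl processor: to my knowledge, rl.core.Processor has a process_observation hook that the agent applies to every observation, and DQNAgent accepts a processor argument. The sketch below only mimics that interface with a plain class so it runs without keras-rl installed; the class name is hypothetical, and in real use it would presumably subclass rl.core.Processor.

```python
import numpy as np

class TupleObservationProcessor:
    """Hypothetical helper mirroring the process_observation hook of a keras-rl processor."""

    def process_observation(self, observation):
        # Unpack the Tuple observation and join all parts into one flat vector
        array1, array2, hour, day, month = observation
        scalars = np.array([hour, day, month], dtype=np.float32)
        flat = np.concatenate([np.ravel(array1), np.ravel(array2), scalars])
        return flat.astype(np.float32)

processor = TupleObservationProcessor()
raw = (np.ones(4, dtype=np.float16), np.ones((4, 24), dtype=np.float16), 23, 7, 12)
vec = processor.process_observation(raw)

# With a window length of 1, a single state stacks one observation: shape (1, 103),
# which is the shape that Flatten(input_shape=(1, 103)) asks for.
state = np.stack([vec])
print(state.shape)  # (1, 103)
```

With this kind of hook the environment can keep its Tuple observation space while the network only ever sees a flat array.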

BTW, I came up with an approach that uses spaces.Box instead of spaces.Tuple:

self.observation_space = spaces.Box(low=-high, high=high, shape=(103,), dtype=np.float16) 

But this doesn't seem to be the ideal way.
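One reason a plain 103-element Box may feel unsatisfying is that hour, day, and month end up as ordinary numbers, so the network treats them as ordered magnitudes. A common alternative is to one-hot encode the discrete parts, which gives 4 + 96 + 24 + 7 + 12 = 143 inputs instead of 103. The sketch below assumes the ranges stated in the question (hour 0-23, day 1-7, month 1-12); the helper names are hypothetical.

```python
import numpy as np

def one_hot(index, size):
    """Return a float vector of length `size` with a single 1 at `index`."""
    vec = np.zeros(size, dtype=np.float32)
    vec[index] = 1.0
    return vec

def encode_observation(obs):
    """Flatten the continuous parts and one-hot encode hour, day, and month."""
    array1, array2, hour, day, month = obs
    return np.concatenate([
        np.asarray(array1, dtype=np.float32).ravel(),  # 4 values
        np.asarray(array2, dtype=np.float32).ravel(),  # 96 values
        one_hot(hour, 24),       # hour is already 0-based (0-23)
        one_hot(day - 1, 7),     # day 1-7 shifted to index 0-6
        one_hot(month - 1, 12),  # month 1-12 shifted to index 0-11
    ])

obs = (np.zeros(4), np.zeros((4, 24)), 0, 1, 6)
encoded = encode_observation(obs)
print(encoded.shape)  # (143,)
```

The index shift matters because spaces.Discrete(7) and spaces.Discrete(12) cover 0-6 and 0-11, while the day and month values in the question start at 1.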

Thanks in advance!

Yuchen
  • If you are just using the observation_space attribute for determining the number of input neurons, then maybe you could just hard-code that value while building the neural network. – nsidn98 Jan 23 '21 at 16:43
  • Hi, thanks for your suggestion. I tried to hard-code the input dimension, but then I got other errors. I've updated the question accordingly. – Yuchen Jan 26 '21 at 08:10
