I built a custom environment with OpenAI Gym's spaces.Tuple because my observation is made up of: hour (0-23), day (1-7), and month (1-12), which are discrete; four continuous numbers, which come from a csv file; and an array of shape (4, 24), which also comes from a csv file.
self.observation_space = spaces.Tuple(spaces=(
    spaces.Box(low=-high, high=high, shape=(4,), dtype=np.float16),
    spaces.Box(low=-high, high=high, shape=(4, 24), dtype=np.float16),
    spaces.Discrete(24),
    spaces.Discrete(7),
    spaces.Discrete(12)))
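One detail worth keeping in mind with this definition: spaces.Discrete(n) contains the integers 0 to n-1, so a day in 1-7 or a month in 1-12 would have to be shifted before it fits Discrete(7) or Discrete(12). A minimal sketch of that shift, using a hypothetical helper `to_zero_based` that is not part of the environment above:

```python
def to_zero_based(hour, day, month):
    """Map calendar values to the index ranges of Discrete(24), Discrete(7), Discrete(12).

    Hypothetical helper: hour is assumed to be 0-23 already, while
    day (1-7) and month (1-12) are shifted down by one.
    """
    if not (0 <= hour <= 23 and 1 <= day <= 7 and 1 <= month <= 12):
        raise ValueError("calendar value out of range")
    return hour, day - 1, month - 1


# The starting values used in reset(): hour 0, day 1, month 6.
print(to_zero_based(0, 1, 6))  # (0, 0, 5)
```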
This is my reset() function to read data from the csv file:
def reset(self):
    index = 0
    hour = 0
    day = 1
    month = 6
    array1 = np.array([
        self.df.loc[index, 'A'],
        self.df.loc[index, 'B'],
        self.df.loc[index, 'C'],
        self.df.loc[index, 'D'],
    ], dtype=np.float16)
    array2 = np.array([
        self.df.loc[index: index+23, 'A'],
        self.df.loc[index: index+23, 'B'],
        self.df.loc[index: index+23, 'C'],
        self.df.loc[index: index+23, 'D'],
    ], dtype=np.float16)
    tup = (array1, array2, hour, day, month)
    return tup
To train the agent, I want to use the DQN algorithm, specifically the DQNAgent from the keras-rl library. Here is my code to build the neural network model:
model = Sequential()
model.add(Flatten(input_shape=(1,) + env.observation_space.shape))
model.add(Dense(16))
model.add(Activation('relu'))
model.add(Dense(nb_actions))
model.add(Activation('linear'))
As far as I understand, spaces.Tuple instances don't have a usable shape, and len() returns the number of spaces in the tuple, e.g. 5 in my environment:
state = env.reset()
num_components = len(state)  # 5
But to build the neural network, it seems that I need 4 + 4*24 + 3 = 103 input neurons. I tried to hard-code the input dimension as:
model.add(Flatten(input_shape=(1,) + (103,)))
But I got the following error:
ValueError: Error when checking input: expected flatten_1_input to have shape (1, 103) but got array with shape (1, 5).
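The two numbers in that message appear to correspond to two ways of counting the observation: 5 is the number of entries in the tuple, while 103 is the number of individual values. A minimal sketch of both counts, assuming placeholder zeros in place of the real csv data:

```python
import numpy as np

# Stand-in observation with the same layout as the tuple returned by reset();
# the zeros are placeholders, not real csv data.
obs = (
    np.zeros(4, dtype=np.float16),        # four continuous values
    np.zeros((4, 24), dtype=np.float16),  # 24 values for each of the 4 columns
    0,                                    # hour
    1,                                    # day
    6,                                    # month
)

n_components = len(obs)                         # entries in the tuple
n_scalars = sum(np.size(part) for part in obs)  # individual numbers overall

print(n_components, n_scalars)  # 5 103
```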
So I then tried:
model.add(Flatten(input_shape=(1,) + (env.observation_space.__len__(),)))
But I also got an error:
TypeError: only size-1 arrays can be converted to Python scalars

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "C:/Users/yuche/Dropbox/risk hedging/rl-project/DqnDAMarketAgent.py", line 37, in <module>
    dqn.fit(env, nb_steps=1440, visualize=True, verbose=2)
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\rl\core.py", line 169, in fit
    action = self.forward(observation)
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\rl\agents\dqn.py", line 228, in forward
    q_values = self.compute_q_values(state)
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\rl\agents\dqn.py", line 69, in compute_q_values
    q_values = self.compute_batch_q_values([state]).flatten()
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\rl\agents\dqn.py", line 64, in compute_batch_q_values
    q_values = self.model.predict_on_batch(batch)
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\keras\engine\training.py", line 1580, in predict_on_batch
    outputs = self.predict_function(ins)
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\tensorflow\python\keras\backend.py", line 3277, in __call__
    dtype=tensor_type.as_numpy_dtype))
  File "C:\Users\yuche\anaconda3\envs\py37\lib\site-packages\numpy\core\_asarray.py", line 83, in asarray
    return array(a, dtype, copy=False, order=order)
ValueError: setting an array element with a sequence.
I googled this error and found a possible explanation:
This happens while the function that you hold defined or have built is expecting any single parameter yet gets an array rather.
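The final ValueError can be reproduced outside keras-rl, which suggests it comes from converting the raw tuple into a single NumPy array. A minimal sketch with placeholder data, assuming the conversion is requested with a float dtype (the exact call inside TensorFlow may differ):

```python
import numpy as np

# Placeholder observation: two arrays of different shapes plus three scalars.
obs = (
    np.zeros(4, dtype=np.float16),
    np.zeros((4, 24), dtype=np.float16),
    0, 1, 6,
)

try:
    # The five parts do not share a common shape, so they cannot be
    # packed into one regular float array.
    np.asarray(obs, dtype=np.float32)
    converted = True
except ValueError as exc:
    converted = False
    print("conversion failed:", exc)
```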
It seems that I still need 103 input neurons instead of 5, but the Tuple feeds the two arrays directly into the network. What is the typical way to use Tuple with DQN?
BTW, I came up with an approach that uses spaces.Box instead of spaces.Tuple:
self.observation_space = spaces.Box(low=-high, high=high, shape=(103,), dtype=np.float16)
But this does not seem to be the ideal way.
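For the Box variant, the tuple built in reset() would have to be packed into one vector before it is returned. A minimal sketch, assuming a hypothetical helper `flatten_observation`, placeholder data, and float32 instead of float16; the calendar fields are passed as raw numbers here, which is only one possible encoding (one-hot being another):

```python
import numpy as np


def flatten_observation(array1, array2, hour, day, month):
    """Concatenate all observation parts into one 1-D float32 vector.

    Hypothetical helper, not part of gym or keras-rl.
    """
    return np.concatenate([
        np.asarray(array1, dtype=np.float32).ravel(),     # 4 values
        np.asarray(array2, dtype=np.float32).ravel(),     # 4 * 24 = 96 values
        np.array([hour, day, month], dtype=np.float32),   # 3 calendar values
    ])


# Placeholder zeros stand in for the csv columns A-D.
flat = flatten_observation(np.zeros(4), np.zeros((4, 24)), hour=0, day=1, month=6)
print(flat.shape)  # (103,)
```

With a vector like this, the 103 in the Box shape and in the Flatten input shape would refer to the same quantity.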
Thanks in advance!