
I am trying to implement the DDPG algorithm from a paper.

In the image below, gk[n] and rk[n] are KxM matrices of real values, and Theta[n] and v[n] are arrays of size M.

I want to write correct code to specify the state/observation space in my custom environment.

Since the data type input to the neural network needs to be unified, the state array can be expressed as

[Image: the state space definition as given in the paper]

observation_space = spaces.Box(low=0, high=1, shape=(K, M), dtype=np.float16......)

I am stuck.

1 Answer


If you use stable-baselines3, you can use a Dict observation space made up of Box spaces, each with meaningful limits for one of your vectors or matrices (if the limits are unknown, you can always use -inf/+inf). The code could look something like:

import numpy as np
from gym import Env
from gym.spaces import Box, Dict

class MySuperGymEnv(Env):
  def __init__(self):
    ...
    spaces = {
       'theta': Box(low=0, high=1, shape=(99,), dtype=np.float32),
       'g': Box(low=0, high=255, shape=(100,200), dtype=np.float32),
       ...
    }
    self.observation_space = Dict(spaces)
    ...
gehirndienst
  • Thank you for your answer. How should I specify limits on g, v, and r, given that they have complex values? Also, as shown in the screenshot, the state space looks something like {real(theta, g, v, r), imag(theta, g, v, r)}. Should I then create a Dict (a dict of arrays)? – Sukhamjot Singh Jan 20 '23 at 10:03
  • You can treat the real/imaginary distinction as a 2nd dimension for vectors and as a 3rd dimension for matrices in your `shape` parameter. `low` and `high` can also be arrays, so you can specify limits for every element. – gehirndienst Jan 24 '23 at 16:37
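The suggestion in the last comment could be sketched roughly as follows. This is only an illustration: the sizes K and M, the helper name to_real_imag, and the random sample data are all made up here, and the limits are left at -inf/+inf because the thread does not state the real ones. The Box import falls back from gymnasium to gym, and the Box part is skipped if neither is installed.

```python
import numpy as np

try:  # the answer imports from `gym`; `gymnasium` provides the same Box class
    from gymnasium.spaces import Box
except ImportError:
    try:
        from gym.spaces import Box
    except ImportError:
        Box = None  # only the NumPy packing below runs without a Gym install

K, M = 4, 8  # hypothetical sizes; take the real ones from the paper


def to_real_imag(x):
    """Append a trailing axis of size 2: [..., 0] is the real part, [..., 1] the imaginary part."""
    x = np.asarray(x)
    return np.stack([x.real, x.imag], axis=-1).astype(np.float32)


# Hypothetical complex samples standing in for g (K x M) and v (M,)
rng = np.random.default_rng(0)
g = rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))
v = rng.standard_normal(M) + 1j * rng.standard_normal(M)

g_obs = to_real_imag(g)  # shape (K, M, 2): real/imag as a 3rd dimension
v_obs = to_real_imag(v)  # shape (M, 2): real/imag as a 2nd dimension

# `low` and `high` may be arrays with the same shape as the space, which gives
# per-element limits. Unknown limits are left at -inf/+inf here (an assumption).
g_low = np.full((K, M, 2), -np.inf, dtype=np.float32)
g_high = np.full((K, M, 2), np.inf, dtype=np.float32)

if Box is not None:
    g_space = Box(low=g_low, high=g_high, dtype=np.float32)  # shape inferred from low/high
    v_space = Box(low=-np.inf, high=np.inf, shape=(M, 2), dtype=np.float32)
    assert g_space.contains(g_obs) and v_space.contains(v_obs)
```

Spaces built this way can be placed in the Dict from the answer under keys such as 'g' and 'v', and the environment would return observations packed with the same helper so that shapes and dtypes match the declared spaces.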