I am working on a deep reinforcement learning problem and I am new to this. Below is a snippet of my code and the errors I am getting.
Broker_Node_Map is a list of values present at different positions in a machine. I don't know how to represent these values as integers. I use it as my state, since it changes over time accordingly. Please suggest what I should do. Please be kind, I am pretty new and trying to get the hang of things.
def __init__(self):
    super(BrokerEnv2, self).__init__()
    reward = 0
    self.action_space = DiscreteActions.get_action_space()
    self.observation_space = DiscreteObservations.get_observation_space()

def reset(self):
    observed_State = self.Broker_Node_Map
    return observed_State
When I check the environment with Stable Baselines using check_env(env), I get this **Error** - AssertionError: The observation returned by reset() method must be an int
EDIT 1 - Very careless of me. I changed the observation space to a Box space, but now another error has emerged: AssertionError: The observation returned by the reset() method does not match the given observation space
This is what my reset() is returning -
<class 'numpy.ndarray'> (46,) [26 33 0 50 0 0 73 26 0 29 0 34 27 67 0 0 0 0 35 60 0 0 24 22
0 0 0 0 25 0 17 0 0 0 21 0 0 53 68 40 51 0 62 0 56 0]
This is how I have defined my observation space -
self.observation_space = spaces.Box(low=1, high=73, shape=(46,), dtype=np.int64)
Please help me understand why this error occurs.
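To narrow this down, one quick sanity check is to compare the printed observation against the bounds and length given in the Box definition. The sketch below is plain Python, not the check that gym or Stable Baselines actually runs; the helper find_out_of_bounds is a hypothetical name written only for illustration, and the values are copied by hand from the reset() output above. It does not look at the dtype, which the Box space also specifies.

```python
# Values copied from the reset() output printed in the question.
observation = [
    26, 33, 0, 50, 0, 0, 73, 26, 0, 29, 0, 34, 27, 67, 0, 0, 0, 0, 35, 60, 0, 0, 24, 22,
    0, 0, 0, 0, 25, 0, 17, 0, 0, 0, 21, 0, 0, 53, 68, 40, 51, 0, 62, 0, 56, 0,
]

# Bounds and length taken from the Box definition in the question.
LOW, HIGH, EXPECTED_LENGTH = 1, 73, 46


def find_out_of_bounds(values, low, high):
    """Return (index, value) pairs for entries outside the closed range [low, high]."""
    return [(i, v) for i, v in enumerate(values) if not low <= v <= high]


# Hand-written stand-in for a "does the space contain this observation" check.
length_matches = len(observation) == EXPECTED_LENGTH
violations = find_out_of_bounds(observation, LOW, HIGH)

print("length matches:", length_matches)
print("min / max:", min(observation), max(observation))
print("entries outside [low, high]:", len(violations))
```

If the last line reports any entries, the observation has values that the declared low/high range does not allow, so it is worth comparing the smallest value in the array with the low argument of the Box.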