I have an observation space of type Box whose bounds are defined with NumPy arrays.
For example:
    Box(low=np.array([0, 0, 0]), high=np.array([15, 10, 150]))
Now I want to get the Q-values for a single observation. Because the observation space is a Box, the Stable-Baselines3 code that handles the observation is:
    if isinstance(observation_space, spaces.Box):
        return obs.float()
However, my input observation does not have a float attribute, so this call fails. In this case, how can I access the Q-values of all the actions?
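A minimal sketch of the mismatch, assuming the observation is passed in as a NumPy array: `.float()` is a method of PyTorch tensors, not of NumPy arrays, so the observation would need to be converted to a tensor before it reaches that line. The `q_net` below is a hypothetical stand-in (a single linear layer with an assumed 4 actions), not the Stable-Baselines3 network; with a trained DQN model, the equivalent pieces are, as far as I know, `model.policy.obs_to_tensor(obs)` and `model.q_net`, which is worth checking against the installed version.

```python
import numpy as np
import torch as th

# One observation inside the Box bounds from the example above.
obs = np.array([3, 7, 120], dtype=np.float32)

# NumPy arrays have no .float() method, which is why obs.float() fails.
has_float = hasattr(obs, "float")

# Convert to a float tensor and add a batch dimension: shape (1, 3).
obs_tensor = th.as_tensor(obs, dtype=th.float32).unsqueeze(0)

# Hypothetical stand-in for the Q-network: 3 inputs, 4 actions (assumed).
q_net = th.nn.Linear(3, 4)

# Forward pass without gradients gives one Q-value per action.
with th.no_grad():
    q_values = q_net(obs_tensor)

print(q_values.shape)
```

The result has one row per observation and one column per action, so `q_values[0]` holds the Q-values of all actions for the single observation.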