Gym (OpenAI) environment action space depends on the current state

Question

I'm using the gym toolkit to create my own env, and keras-rl to use that env with an agent. The problem is that my action space changes: it depends on the current state. For example, I have 46 possible actions, but in a given state only 7 of them are available, and I can't find a way to model that.

I've read this question: open-ai-enviroment-with-changing-action-space-after-each-step

but it did not resolve my problem.

The Gym documentation has no instructions for this; there is only an issue on their GitHub repo (still open). I also can't understand how the agent (keras-rl, DQN agent) picks an action. Is it chosen randomly? And from where?

Can somebody help me? Any ideas?

score 0 · Answer 1 · answered Aug 27 '19 at 06:14

I've handled this by just ignoring any invalid actions in the environment and letting the agent's exploration mechanics keep it from getting stuck. It's quick and simple, but there are likely better ways to do it.
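As a rough sketch of what "ignoring invalid actions" could look like: the toy class below only mimics gym's reset/step interface in plain Python, so it runs without gym installed. The class name `ToyEnv`, the `valid_actions` helper, the rule that picks which 7 of the 46 actions are valid, and the zero reward for an invalid action are all made up for illustration.

```python
class ToyEnv:
    """Hypothetical toy environment with a gym-like reset/step interface."""

    N_ACTIONS = 46  # size of the full, fixed action space

    def __init__(self):
        self.state = 0

    def valid_actions(self):
        # Hypothetical rule: only 7 of the 46 actions are valid in each state.
        return {(self.state + k) % self.N_ACTIONS for k in range(7)}

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        if action not in self.valid_actions():
            # Invalid action: ignore it. The state does not change and the
            # reward is zero (an assumption; a small penalty is another option).
            return self.state, 0.0, False, {"invalid": True}
        # Valid action: advance the toy state and give a reward.
        self.state = (self.state + 1) % self.N_ACTIONS
        return self.state, 1.0, False, {"invalid": False}


env = ToyEnv()
env.reset()
print(env.step(20))  # action 20 is not valid in state 0, so nothing happens
print(env.step(3))   # action 3 is valid in state 0, so the state advances
```

The agent still sees a fixed space of 46 actions; it just wastes a step whenever it picks one that is not available.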

I think the better option is to somehow set the probability of selecting an invalid action to zero, but I've had trouble figuring out how to do that.
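One way that idea is often sketched is action masking on top of epsilon-greedy selection, which is the kind of rule a DQN agent typically uses: with probability epsilon pick a random action, otherwise pick the action with the highest Q-value. The function below is a standalone NumPy illustration, not keras-rl code; the name `masked_epsilon_greedy` and the `valid_mask` argument are assumptions, and wiring it into keras-rl would still need a custom policy that receives the mask from the environment.

```python
import numpy as np


def masked_epsilon_greedy(q_values, valid_mask, epsilon, rng):
    """Pick an action among the valid ones only (hypothetical helper).

    q_values:   1-D array with one Q-value per action (e.g. length 46)
    valid_mask: boolean array of the same length, True where the action is valid
    epsilon:    probability of exploring with a random valid action
    rng:        a numpy.random.Generator
    """
    valid_mask = np.asarray(valid_mask, dtype=bool)
    if not valid_mask.any():
        raise ValueError("at least one action must be valid")
    if rng.random() < epsilon:
        # Explore: sample uniformly from the valid actions only.
        return int(rng.choice(np.flatnonzero(valid_mask)))
    # Exploit: invalid actions get -inf so argmax can never select them.
    masked_q = np.where(valid_mask, q_values, -np.inf)
    return int(np.argmax(masked_q))


rng = np.random.default_rng(0)
q = np.arange(46, dtype=float)           # toy Q-values, highest for action 45
mask = np.zeros(46, dtype=bool)
mask[[1, 5, 9, 13, 17, 21, 25]] = True   # 7 valid actions in this state
print(masked_epsilon_greedy(q, mask, epsilon=0.0, rng=rng))  # best valid action
```

With this approach an invalid action has zero probability of being chosen in both the exploring and the greedy branch, so the environment never has to ignore anything.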

