
I am learning about Markov decision processes, and I don't know where to mark the terminal states.

In a 4x3 grid world, I marked the states that I think are terminal (I might be wrong) with T (picture).

I saw an instruction that marks the terminal states as follows.

terminals=[(3, 2), (3, 1)]

Can someone explain how this works?

  • [Artificial Intelligence Stack Exchange](https://ai.stackexchange.com/) is probably a better place to ask theoretical questions related to reinforcement learning, so I suggest that you ask your question there (and, definitely, next time you have a question about RL, you should ask it there). If you ask it there, please, delete it from here (to avoid cross-posting, which is generally discouraged). – nbro Nov 02 '20 at 12:13

1 Answer


In the given grid world, you start at "start", which is (0,0), and move around from that point. If you reach "end +1" at (3,2), the reward is +1 and the game ends. Likewise, if you reach "end -1" at (3,1), the reward is -1 and the game ends. While you are moving around, you can't move to (1,1), as it is an invalid state. Also, if you reach either of the terminal states marked "T", at (2,0) and (2,1), the game ends with zero reward.
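To make the role of the `terminals` list concrete, here is a minimal sketch of how such a grid world could be coded. It is not the code the instruction came from. It assumes coordinates are `(x, y)` with `(0, 0)` at the bottom-left, deterministic moves, and a reward of 0 on every cell without a listed reward. The names `step`, `run`, `is_terminal`, `WALL`, and `rewards` are made up for illustration. It uses only the two terminals from the question's list; any other cell where the game should end could be added to that list in the same way.

```python
# Hypothetical 4x3 grid world; (x, y) = (column, row), (0, 0) at the bottom-left (assumed).
WIDTH, HEIGHT = 4, 3
START = (0, 0)
WALL = (1, 1)                        # invalid state: it can never be entered
terminals = [(3, 2), (3, 1)]         # the episode ends on reaching either cell
rewards = {(3, 2): +1, (3, 1): -1}   # reward for entering a cell (0 elsewhere, assumed)

MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


def is_terminal(state):
    """A state is terminal simply because it appears in the terminals list."""
    return state in terminals


def step(state, action):
    """Return (next_state, reward, done) for one deterministic move."""
    if is_terminal(state):
        return state, 0, True        # nothing happens after the game has ended
    dx, dy = MOVES[action]
    nxt = (state[0] + dx, state[1] + dy)
    off_grid = not (0 <= nxt[0] < WIDTH and 0 <= nxt[1] < HEIGHT)
    if off_grid or nxt == WALL:
        nxt = state                  # blocked moves leave the agent where it was
    return nxt, rewards.get(nxt, 0), is_terminal(nxt)


def run(actions, state=START):
    """Follow a list of actions, stopping as soon as a terminal state is reached."""
    total = 0
    for action in actions:
        state, reward, done = step(state, action)
        total += reward
        if done:
            break
    return state, total
```

For example, `run(["up", "up", "right", "right", "right"])` walks from the start to (3, 2) and stops there, while any path that enters (3, 1) stops with a total of -1. Marking a state as terminal amounts to listing it, so that the episode loop stops when the agent arrives.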

Dharman
Lawhatre