I am trying to understand how to use mdptoolbox and have a few questions.
What does 20 mean in the following statement?
P, R = mdptoolbox.example.forest(10, 20, is_sparse=False)
I understand that 10 here denotes the number of possible states. What does 20 mean here? Does it represent the total number of actions per state? I want to restrict the MDP to exactly 2 actions per state. How could I do this?
The shape of P returned above is (2, 10, 10). What does 2 represent here? No matter what values I use for total states and actions, it is always 2.
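To make clear what I am after, here is a hand-built sketch in plain NumPy of a 10-state MDP with exactly 2 actions per state. It assumes the transition array is laid out action-first as (A, S, S), which is my guess from the (2, 10, 10) shape, and that rewards are stored as (S, A). The variable names and the random probabilities are purely illustrative and do not come from mdptoolbox.

```python
import numpy as np

n_states, n_actions = 10, 2  # illustrative sizes, not taken from mdptoolbox

rng = np.random.default_rng(0)

# Assumed layout: one (S, S) transition matrix per action, stacked as (A, S, S).
P = rng.random((n_actions, n_states, n_states))
# Normalise every row so that it is a valid probability distribution.
P /= P.sum(axis=2, keepdims=True)

# Assumed layout: one reward per (state, action) pair, shape (S, A).
R = rng.random((n_states, n_actions))

print(P.shape)  # (2, 10, 10)
print(R.shape)  # (10, 2)
```

If that layout is right, is there a way to get arrays of this form from the toolbox for an arbitrary number of states while keeping the number of actions fixed at 2?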