I'm doing research on a finite-horizon decision problem with periods t = 1, ..., 40. In every time step t, the (only) agent has to choose an action a(t) ∈ A(t) while in state s(t) ∈ S(t). The chosen action a(t) in state s(t) affects the transition to the following state s(t+1), so this is a finite-horizon Markov decision problem.
In my case, A(t) = A and S(t) = S for all t, where the size of A is 6 000 000 and the size of S is 10^8. Furthermore, the transition function is stochastic.
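To make the setup concrete, here is a minimal sketch of the problem as a generative model (a simulator that returns one sampled next state per call), which is the only interface MCTS needs. The sizes and horizon are taken from the description above; the `step` function, its dynamics, and its reward are made-up placeholders, since the actual transition and reward functions are not specified here.

```python
import random

T = 40                     # number of periods, t = 1, ..., 40
NUM_ACTIONS = 6_000_000    # |A|
NUM_STATES = 10**8         # |S|

def step(state, action, rng):
    """Placeholder stochastic transition: returns (next_state, reward).

    The dynamics and the reward below are invented for illustration only.
    """
    noise = rng.randrange(1000)  # source of randomness in the transition
    next_state = (state * 31 + action + noise) % NUM_STATES
    reward = -abs(next_state - NUM_STATES // 2) / NUM_STATES  # lies in [-0.5, 0]
    return next_state, reward

def random_rollout(start_state, rng):
    """Simulate all T periods with uniformly random actions; return the total reward."""
    state, total = start_state, 0.0
    for _ in range(T):
        action = rng.randrange(NUM_ACTIONS)
        state, reward = step(state, action, rng)
        total += reward
    return total

# Number of entries in a full tabular Q-function over all periods,
# i.e. what exact enumeration of every (t, s, a) triple would have to cover.
q_table_entries = T * NUM_STATES * NUM_ACTIONS
```

A single rollout only touches 40 state-action pairs, whereas `q_table_entries` is 2.4 × 10^16, which is why a sampling-based method looks attractive at this scale.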
Since I'm relatively new to the theory of Monte Carlo Tree Search (MCTS), I am wondering: is MCTS an appropriate method for my problem, in particular given the large sizes of A and S and the stochastic transition function?
I have already read a lot of papers about MCTS (e.g. on progressive widening and double progressive widening, which sound quite promising), but maybe someone can share their experience applying MCTS to similar problems, or suggest other appropriate methods for this problem (with large state/action spaces and a stochastic transition function).
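For readers unfamiliar with the widening idea mentioned above, here is a minimal sketch of one common form of the rule: a node may receive a new child only while its number of children is at most k · N^alpha, where N is the node's visit count. Double progressive widening applies the same test twice, once to the actions tried in a state and once to the successor states sampled for an action. The constants `k` and `alpha`, the function names, and the `simulate` callback are assumptions chosen for illustration, not taken from any particular paper or library.

```python
import random

def may_widen(num_children, num_visits, k=1.0, alpha=0.5):
    """Widening test: allow a new child while num_children <= k * num_visits**alpha.

    k and alpha are tuning constants; the defaults here are arbitrary.
    """
    return num_children <= k * num_visits ** alpha

def sample_next_state(next_state_counts, action_visits, simulate, rng,
                      k=1.0, alpha=0.5):
    """State-side widening for a stochastic transition.

    next_state_counts maps already-sampled successor states to visit counts.
    simulate() is a hypothetical callback that draws one successor state
    from the transition function.
    """
    if may_widen(len(next_state_counts), action_visits, k, alpha):
        s_next = simulate()  # draw a fresh successor from the simulator
    else:
        # Reuse an existing successor, weighted by how often it has been seen.
        states = list(next_state_counts)
        weights = [next_state_counts[s] for s in states]
        s_next = rng.choices(states, weights=weights)[0]
    next_state_counts[s_next] = next_state_counts.get(s_next, 0) + 1
    return s_next

def count_children_after(num_visits, k=1.0, alpha=0.5):
    """Count how many children the rule admits over num_visits visits."""
    children = 0
    for n in range(num_visits):
        if may_widen(children, n, k, alpha):
            children += 1
    return children
```

With `alpha = 0.5` the number of children grows roughly like the square root of the visit count, so even with 6 000 000 available actions a node visited 10 000 times would only have expanded on the order of a hundred of them. Which actions get added when the test passes (uniformly at random, or guided by a heuristic) is a separate design choice that this sketch leaves open.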