Is Monte Carlo Tree Search an appropriate method for this problem size (large action/state space)?

Question

I'm doing research on a finite-horizon decision problem with periods t = 1, ..., 40. In every time step t, the (only) agent has to choose an action a(t) ∈ A(t) while it is in state s(t) ∈ S(t). The action a(t) chosen in state s(t) affects the transition to the following state s(t+1). So this is a finite-horizon Markov decision problem.

In my case, A(t) = A and S(t) = S for all t, where the size of A is 6 000 000 and the size of S is 10^8. Furthermore, the transition function is stochastic.

Since I'm relatively new to the theory of Monte Carlo Tree Search (MCTS), I am wondering: is MCTS an appropriate method for my problem, in particular given the large size of A and S and the stochastic transition function?

I have already read a lot of papers about MCTS (e.g. on progressive widening and double progressive widening, which sound quite promising), but maybe someone can share their experience applying MCTS to similar problems, or suggest appropriate methods for this problem (with a large state/action space and a stochastic transition function).

Not to say this doesn't belong on Stack Overflow, but perhaps there would be a better chance for a good answer on https://cs.stackexchange.com/ ? — Jolta, Jan 09 '19 at 09:25

score 1 · Answer 1 · answered Jan 12 '19 at 17:27

With 6 million stochastic actions per state, I don't think any kind of simulation is realistically going to differentiate between those moves without running essentially forever.
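To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes plain UCT without progressive widening, where every untried action at a node is sampled at least once before any action is revisited. The function name `rollouts_to_cover` and the sample counts are illustrative only and do not come from the question.

```python
NUM_ACTIONS = 6_000_000  # |A| from the question


def rollouts_to_cover(depth, samples_per_action=1, num_actions=NUM_ACTIONS):
    """Rollouts needed to try every action sequence of length `depth`
    `samples_per_action` times (hypothetical helper for this estimate)."""
    return samples_per_action * num_actions ** depth


# Trying each root action once, ten times, and each two-step sequence once.
root_once = rollouts_to_cover(1)
root_ten = rollouts_to_cover(1, samples_per_action=10)
two_ply = rollouts_to_cover(2)

print(root_once, root_ten, two_ply)
```

With stochastic transitions a single sample per action says little about its expected value, so the ten-samples figure is still optimistic, and the count grows by another factor of 6 million for each additional level of the tree.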

100 million states isn't a lot, however. You can store a value for every one of them in less than a gigabyte of memory, and something like value iteration or policy iteration would solve this optimally much faster.
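For the memory estimate: one array of 10^8 32-bit floats takes 4 × 10^8 bytes, about 400 MB. For a finite horizon, value iteration takes the form of backward induction, which needs only the value array for the next period while it computes the current one. Below is a minimal sketch on a toy problem with 2 states and 2 actions. The tabular `transitions` layout, the `reward` function and the zero terminal value are assumptions made for illustration, not details from the question.

```python
def backward_induction(num_states, num_actions, horizon, transitions, reward):
    """Finite-horizon backward induction on a tabular MDP.

    transitions[s][a] is a list of (probability, next_state) pairs and
    reward(s, a) returns the immediate reward (both are assumed layouts).
    Returns the period-1 values and a policy indexed as policy[t - 1][s].
    """
    value = [0.0] * num_states  # assumed terminal value of 0 after the last period
    policy = []
    for _ in range(horizon):  # walk backwards from the last period to the first
        new_value = [0.0] * num_states
        best_action = [0] * num_states
        for s in range(num_states):
            best_q = float("-inf")
            for a in range(num_actions):
                # Expected value of taking action a in state s this period.
                q = reward(s, a) + sum(p * value[s2] for p, s2 in transitions[s][a])
                if q > best_q:
                    best_q, best_action[s] = q, a
            new_value[s] = best_q
        value = new_value
        policy.append(best_action)
    policy.reverse()  # periods were filled last-to-first
    return value, policy


# Toy example: being in state 1 pays 1 per period; action 0 stays put,
# action 1 switches state with probability 0.5.
transitions = [
    [[(1.0, 0)], [(0.5, 1), (0.5, 0)]],  # from state 0
    [[(1.0, 1)], [(0.5, 0), (0.5, 1)]],  # from state 1
]
value, policy = backward_induction(
    num_states=2,
    num_actions=2,
    horizon=3,
    transitions=transitions,
    reward=lambda s, a: 1.0 if s == 1 else 0.0,
)
print(value, policy[0])
```

Each backward step evaluates every state-action pair, so the work per period scales with |S| · |A| times the number of successor states per pair.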
