2

From many blogs and this one https://web.archive.org/web/20160308070346/http://mcts.ai/about/index.html We know that the process of MCTS algorithm has 4 steps.

  1. Selection: Starting at root node R, recursively select optimal child nodes until a leaf node L is reached.

What does leaf node L mean here? I thought it should be a node representing the terminal state of the game, or another word which ends the game. If L is not a terminal node (one end state of the game), how do we decide that the selection step stops on node L?

  1. Expansion: If L is a not a terminal node (i.e. it does not end the game) then create one or more child nodes and select one C.

From this description I realise that obviously my previous thought incorrect. Then if L is not a terminal node, it implies that L should have children, why not continue finding a child from L at the "Selection" step? Do we have the children list of L at this step?
From the description of this step itself, when do we create one child node, and when do we need to create more than one child nodes? Based on what rule/policy do we select node C?

  1. Simulation: Run a simulated playout from C until a result is achieved.

Because of the confusion of the 1st question, I totally cannot understand why we need to simulate the game. I thought from the selection step, we can reach the terminal node and the game should be ended on node L in this path. We even do not need to do "Expansion" because node L is the terminal node.

  1. Backpropagation: Update the current move sequence with the simulation result.

Fine. Last question, from where did you get the answer to these questions?

Thank you

BTW, I also post the same question https://ai.stackexchange.com/questions/16606/how-to-understand-the-4-steps-of-monte-carlo-tree-search

manlio
  • 18,345
  • 14
  • 76
  • 126
Wenbo
  • 1,212
  • 8
  • 17
  • This is not a question about a specific programming problem, thus offtopic. I recommend to post that question on www.stats.stackexchange.com . – Florian H Nov 18 '19 at 14:00
  • Thanks. I will post it there. – Wenbo Nov 18 '19 at 20:03
  • 1
    @Wenbo This is more suitable for ai.stackexchange.com than for stats, as the approach is widely used in AI. See also: https://stackoverflow.com/questions/44230911/how-does-monte-carlo-search-tree-work – Nathan S. Nov 18 '19 at 20:17
  • @NathanS. you are right. Then you know I am a newbie. – Wenbo Nov 18 '19 at 20:22
  • This is ontopic. The help center gives [software algorithm](https://stackoverflow.com/help/on-topic) as an example of something you could ask about. In general it's perfectly ok to [ask algorithms questions on SO even if the question doesn't contain any code](https://meta.stackoverflow.com/a/342740/3235496). The OP is asking many questions but answering the first one is enough to clarify the others (it's just the misunderstanding of a specific step of an algorithm). – manlio Nov 20 '19 at 13:59

1 Answers1

2

What does leaf node L mean here?

For the sake of explanation I'm assuming that all the children of a selected node are added during the expansion phase of the algorithm.

When the algorithm starts, the tree is formed only by the root node (a leaf node).

The Expansion phase adds all the states reachable from the root to the tree. Now you have a bigger tree where the leaves are the last added nodes (the root node isn't a leaf anymore).

MCTS policy

At any given iteration of the algorithm, the tree (the gray area of the picture) grows. Some of its leaves could be terminal states (according to the rules of the game/problem) but it's not granted.

If you expand too much, you could run out of memory. So the typical implementation of the expansion phase only adds a single node to the existing tree.

In this scenario you could change the word leaf with not fully expanded:

Starting at root node R, recursively select optimal child nodes until a not fully expanded node L is reached


Based on what rule/policy do we select node C?

It's domain-dependent. Usually you randomly choose a move/state.


NOTES

  1. Image from Multi-Objective Monte Carlo Tree Search for Real-Time Games (Diego Perez, Sanaz Mostaghim, Spyridon Samothrakis, Simon M. Lucas).
manlio
  • 18,345
  • 14
  • 76
  • 126
  • Thank you @manlio. I got the answer from youtube.com/watch?v=UXW2yZndl7U introduced by Philip from https://ai.stackexchange.com/questions/16606/how-to-understand-the-4-steps-of-monte-carlo-tree-search. I also answered my question there. – Wenbo Nov 20 '19 at 21:40
  • About the definition of "leaf" node, The key point is what tree is the host/owner of a "leaf" node to this question. Through "Expansion" step, we are actually creating a tree with MCTS. The tree, the owner of a "leaf" node, should be the one that we are building, not the tree of the game state in our head (perhaps it is too big to fill in our head, the tree of the game state actually does not exist). Then we know that a "leaf" node is the one, which does not have any child, in the tree that we are building. Hope we thought it same. – Wenbo Nov 20 '19 at 21:41
  • What does "Thanks for the feedback! Votes cast by those with less than 15 reputation are recorded, but do not change the publicly displayed post score." mean.... – Wenbo Nov 20 '19 at 21:42
  • @Wenbo *Hope we thought it same*... yes, same about the tree that MCTS builds. In the actual implementation the algorithm may *expand* (add a child to) a node that isn't a leaf in the traditional Computer Science meaning: it could be an internal node that hasn't already been fully expanded. – manlio Nov 21 '19 at 10:55