Trees grow from seeds and so do forests ;-) (scnr)
There are different ways to build a random forest, but what they all have in common is that multiple trees are built. To improve classification accuracy over a single decision tree, the individual trees in a random forest need to differ; otherwise you would just have the same tree nTree times. This difference is achieved by introducing randomness into the generation of the trees. The randomness is controlled by the seed, and the most important thing about the seed is that using the same seed should always generate the same result.
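To make the "same seed, same result" point concrete, here is a minimal sketch in plain Python. It is not taken from any particular random forest library; the helper name `draw_bootstrap_indices` and its parameters are made up for illustration, and it only shows the random sampling step, not the tree building itself.

```python
import random

def draw_bootstrap_indices(seed, n_examples, n_trees):
    """Hypothetical helper: one list of sampled row indices per tree."""
    # A private generator whose whole sequence is determined by the seed.
    rng = random.Random(seed)
    return [
        [rng.randrange(n_examples) for _ in range(n_examples)]
        for _ in range(n_trees)
    ]

run_a = draw_bootstrap_indices(seed=42, n_examples=10, n_trees=3)
run_b = draw_bootstrap_indices(seed=42, n_examples=10, n_trees=3)
run_c = draw_bootstrap_indices(seed=7, n_examples=10, n_trees=3)

print(run_a == run_b)  # same seed: identical samples, hence identical trees
print(run_a == run_c)  # different seed: different samples
```

Since every random decision is drawn from the seeded generator, rerunning with the same seed reproduces the same samples and therefore the same forest.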
How does the randomness influence the way the trees are built? There are multiple ways:
- Build each tree on a random subset: for each individual tree of the forest, a subset of the training examples is drawn, and the tree is then built on this subset.
- At each decision point in the tree, the decision attribute is selected randomly, typically by choosing the best attribute from a random subset of the attributes.
Often these two elements are combined.
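A rough sketch of how the two elements could be combined, again in plain Python. The function `plan_forest` and all of its parameter names are hypothetical, and drawing the examples with replacement (a bootstrap sample) is an assumption on my part. For brevity it only records the random choices for the first decision point of each tree; a real implementation would redraw the candidate attributes at every decision point.

```python
import random

def plan_forest(seed, n_examples, attributes, n_trees, n_candidates):
    """Hypothetical helper: record the random choices made for each tree."""
    rng = random.Random(seed)  # all randomness flows from this one seed
    plans = []
    for _ in range(n_trees):
        # Element 1: a random sample of training rows for this tree
        # (assumption: drawn with replacement, i.e. a bootstrap sample).
        rows = [rng.randrange(n_examples) for _ in range(n_examples)]
        # Element 2: a random subset of attributes offered to the first
        # decision point; the tree would then pick the best one among them.
        candidates = rng.sample(attributes, n_candidates)
        plans.append({"rows": rows, "candidates": candidates})
    return plans

attributes = ["age", "income", "height", "weight"]  # made-up attribute names
plans = plan_forest(seed=1, n_examples=8, attributes=attributes,
                    n_trees=3, n_candidates=2)
for plan in plans:
    print(plan["rows"], plan["candidates"])
```

Each tree sees different rows and different candidate attributes, which is what makes the trees differ, while the single seed keeps the whole forest reproducible.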
http://link.springer.com/article/10.1023%2FA%3A1010933404324#page-1