Reduced Error Pruning Algorithm

Question

I have a question about this algorithm:

Partition training data in “grow” and “validation” sets.
Build a complete tree from the “grow” data.
Until accuracy on validation set decreases do:
    For each non-leaf node, n, in the tree do:
        Temporarily prune the subtree below n and replace it with a leaf labeled with the current majority class at that node.
        Measure and record the accuracy of the pruned tree on the validation set.
    Permanently prune the node that results in the greatest increase in accuracy on the validation set.

I don't understand the step "Permanently prune the node that results in the greatest increase in accuracy on the validation set." Aren't we supposed to keep the nodes that increase accuracy and prune the ones that increase the error rate? Am I wrong?

Walter Tross · Accepted Answer · 2015-11-26T20:49:45.457

I don't even know which field this algorithm comes from, but as I understand it, the nodes that increase accuracy are the ones that are not pruned, so there is no contradiction in the phrase you quote: the node that gets pruned is the one whose removal raises validation accuracy the most. Maybe it could be rephrased as

permanently prune the node that, when pruned, causes the greatest increase in accuracy on the validation set

to make it clearer.
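
To make that reading concrete, here is a minimal Python sketch of the loop from the question. It is only an illustration under my own assumptions, not a reference implementation: the `Node` class, the helper names, the hand-built tree, and the validation points are all hypothetical, and the tree is written out by hand instead of being grown from "grow" data. I also read "until accuracy on validation set decreases" as "stop as soon as the best available prune would lower validation accuracy", so a prune that leaves accuracy unchanged is still applied.

```python
class Node:
    """Binary decision node (hypothetical structure for this sketch).

    `majority` is the majority class of the grow data reaching the node,
    which is the label the node takes if it is turned into a leaf.
    """
    def __init__(self, majority, feature=None, threshold=None, left=None, right=None):
        self.majority = majority
        self.feature = feature
        self.threshold = threshold
        self.left = left
        self.right = right

    def is_leaf(self):
        # Nodes are built with either both children or none.
        return self.left is None


def predict(node, x):
    # Walk down until a leaf is reached, then return its majority class.
    while not node.is_leaf():
        node = node.left if x[node.feature] <= node.threshold else node.right
    return node.majority


def accuracy(root, data):
    return sum(predict(root, x) == y for x, y in data) / len(data)


def internal_nodes(node):
    if node.is_leaf():
        return []
    return [node] + internal_nodes(node.left) + internal_nodes(node.right)


def reduced_error_prune(root, validation):
    """Prune `root` in place; return the final validation accuracy."""
    best_acc = accuracy(root, validation)
    while True:
        candidates = []
        for n in internal_nodes(root):
            saved = (n.left, n.right)
            n.left = n.right = None            # temporarily prune below n
            candidates.append((accuracy(root, validation), n))
            n.left, n.right = saved            # undo the temporary prune
        if not candidates:
            break                              # tree is a single leaf
        acc, node = max(candidates, key=lambda c: c[0])
        if acc < best_acc:
            break                              # every prune would hurt accuracy
        node.left = node.right = None          # permanently prune the best node
        best_acc = acc
    return best_acc


# Hand-built tree: the right subtree has an overfit leaf predicting "a".
tree = Node("a", feature=0, threshold=0.5,
            left=Node("a"),
            right=Node("b", feature=1, threshold=0.5,
                       left=Node("b"), right=Node("a")))

validation = [((0.2, 0.3), "a"), ((0.8, 0.2), "b"), ((0.9, 0.9), "b"),
              ((0.7, 0.8), "b"), ((0.1, 0.9), "a")]

before = accuracy(tree, validation)            # 0.6 on this toy data
after = reduced_error_prune(tree, validation)  # 1.0 on this toy data
print(before, after)
```

On this toy data, collapsing the right subtree into a "b" leaf raises validation accuracy, so that node is pruned. Collapsing the root would lower accuracy, so the root is kept. In other words, the nodes that survive are the ones whose presence helps accuracy, which is what you expected.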

edited Nov 26 '15 at 20:49

answered Nov 26 '15 at 20:30
