I have searched on Google for this and can't find anything that explains this algorithm in a simple yet detailed way.
For instance, I know the ID3 algorithm doesn't use pruning at all, so if you have a continuous attribute, the prediction accuracy will be very low.
So C4.5 uses pruning in order to support continuous attributes, but is this the only reason?
Also, I can't really understand how exactly the confidence factor in the WEKA application affects the accuracy of the predictions. The smaller the confidence factor, the more pruning the algorithm does, but what is the relationship between pruning and prediction accuracy? Does more pruning make the predictions better or worse?
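To make the confidence factor part concrete, here is a minimal sketch of how I understand it to enter the pruning decision. It assumes the textbook normal-approximation formula for the pessimistic (upper-bound) error of a leaf; the function name pessimistic_error is my own, and WEKA's actual J48 code may compute the bound differently in detail.

```python
from math import sqrt
from statistics import NormalDist


def pessimistic_error(errors, n, cf=0.25):
    """Upper confidence limit on a leaf's true error rate.

    errors: training instances misclassified at the leaf
    n:      training instances reaching the leaf
    cf:     confidence factor (0.25 is the J48 default)

    Assumption: normal approximation to the binomial, as described in
    textbook accounts of C4.5 pruning; not copied from WEKA's source.
    """
    f = errors / n                      # observed error rate
    z = NormalDist().inv_cdf(1 - cf)    # smaller cf gives a larger z
    numerator = f + z * z / (2 * n) + z * sqrt(
        f / n - f * f / n + z * z / (4 * n * n)
    )
    return numerator / (1 + z * z / n)


if __name__ == "__main__":
    # Same leaf (2 errors out of 6), different confidence factors.
    for cf in (0.5, 0.25, 0.1, 0.01):
        print(cf, round(pessimistic_error(2, 6, cf), 3))
```

If this is right, a smaller confidence factor inflates the estimated error of every leaf, so a subtree is more likely to look no better than a single leaf and get replaced by one, which would explain why lowering it produces more pruning.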
Thanks