I don't have access to the original texts,1,2 but using some secondary sources, the key differences between these recursive ("greedy") partitioning ("tree") algorithms seem to be:
Type of learning:
- ID3, the "Iterative Dichotomiser," handles classification only (no regression); it splits on categorical attributes, creating one branch per attribute value
- CART, or "Classification And Regression Trees," is a family of algorithms (including, but not limited to, binary classification tree learning). With rpart(), you can specify method='class' or method='anova' explicitly, but rpart can infer this from the type of the dependent variable (i.e., factor or numeric).
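For intuition, here is a minimal Python sketch of that dispatch, i.e., picking the task from the response type. This is not rpart's actual source, and infer_method is a hypothetical helper; it just mirrors the factor-vs-numeric rule described above.

```python
def infer_method(y):
    """Hypothetical sketch of rpart-style method inference:
    numeric responses -> regression ('anova'), otherwise
    factor-like (categorical) responses -> classification ('class')."""
    # bool is excluded because True/False labels behave like a 2-level factor
    if all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in y):
        return "anova"
    return "class"

print(infer_method([3.2, 1.5, 2.8]))   # numeric response -> "anova"
print(infer_method(["spam", "ham"]))   # categorical response -> "class"
```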
Loss function used for split selection:
- ID3, as other comments have mentioned, selects its splits based on Information Gain: the parent node's entropy minus the size-weighted sum of the child nodes' entropies
- CART, when used for classification, selects the split whose resulting subsets minimize the (size-weighted) Gini impurity
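Both criteria are easy to compute by hand. A small, self-contained Python sketch (toy labels; the function names are mine, not from either paper):

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    """Gini impurity: chance of mislabeling a random draw using the
    node's own class frequencies."""
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the children."""
    n = len(parent)
    return entropy(parent) - sum(len(ch) / n * entropy(ch) for ch in children)

parent = ["yes"] * 5 + ["no"] * 5
left, right = ["yes"] * 4 + ["no"], ["yes"] + ["no"] * 4

print(entropy(parent))                           # 1.0 bit for a 50/50 node
print(gini(parent))                              # 0.5 for a 50/50 node
print(information_gain(parent, [left, right]))   # ≈ 0.278 bits
```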
Anecdotally, as a practitioner, I hardly ever hear the term ID3 used, whereas CART is often used as a catch-all term for decision trees. CART has a very popular implementation in R's rpart package. ?rpart notes that "In most details it follows Breiman et al. (1984) quite closely." However, you can pass rpart(..., parms=list(split='information')) to override the default behavior and split on information gain instead.
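Mechanically, either criterion plugs into the same greedy search: score every candidate split, keep the one with the lowest weighted child impurity. A minimal Python sketch (toy data; best_split is my own illustrative helper, not library code):

```python
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def gini(labels):
    n = len(labels)
    return 1 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, ys, impurity):
    """Greedy 1-feature split: return the threshold on xs that minimizes
    the size-weighted impurity of the two resulting children."""
    n = len(ys)
    best = None
    for t in sorted(set(xs))[:-1]:          # candidate thresholds
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = len(left) / n * impurity(left) + len(right) / n * impurity(right)
        if best is None or score < best[0]:
            best = (score, t)
    return best[1]

xs = [1, 2, 3, 4, 5, 6]
ys = ["a", "a", "a", "b", "b", "b"]
print(best_split(xs, ys, gini))     # 3: the clean cut wins under Gini
print(best_split(xs, ys, entropy))  # 3: and under entropy here as well
```

On clean toy data like this the two criteria agree; in practice they usually produce very similar trees, which is why the choice of split parameter rarely changes rpart's output much.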
1 Quinlan, J. R. (1986). Induction of decision trees. Machine Learning, 1(1), 81–106.
2 Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and Regression Trees. Monterey, CA: Wadsworth & Brooks/Cole Advanced Books & Software.