The variable in question is the most informative one (sorry if my wording is off, I'm a newbie), and the tree is 90% accurate (the base rate would be around 86%). However, I want the algorithm to use more than one attribute. I built a CART tree on the same data; it used a few more of the available variables and reached an accuracy of about 92% (all evaluated on a holdout set). Is there any way to force the C5.0 tree to make more splits? Here is the code that I am using:
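library(C50)

# Drop column 10 (the outcome, g3) from the predictors
predictors <- subset(student_train, select = -c(10))
dependant <- as.factor(student_train$g3)

# Boosted C5.0 model (10 trials) with global pruning turned off
c50fit <- C5.0(x = predictors, y = dependant, trials = 10,
               control = C5.0Control(noGlobalPruning = TRUE))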
As the code shows, I tried some of the control options I found online (such as noGlobalPruning), but none of them seemed to make a difference.
Here's a picture of the output:
Here's the head of the data, produced with dput(). I hope I provided it correctly:
structure(list(school = structure(c(2L, 2L, 1L, 1L, 1L, 1L, 2L,
1L, 1L, 1L), levels = c("GP", "MS"), class = "factor"), address =
structure(c(2L,
1L, 2L, 2L, 2L, 2L, 1L, 2L, 2L, 2L), levels = c("R", "U"), class
= "factor"),
mjob = structure(c(1L, 3L, 3L, 3L, 3L, 1L, 4L, 1L, 3L, 3L
), levels = c("at_home", "health", "other", "services", "teacher"
), class = "factor"), fjob = structure(c(4L, 3L, 3L, 3L,
3L, 3L, 3L, 4L, 3L, 3L), levels = c("at_home", "health",
"other", "services", "teacher"), class = "factor"), reason =
structure(c(1L,
1L, 1L, 1L, 2L, 2L, 1L, 1L, 2L, 1L), levels = c("course",
"home", "other", "reputation"), class = "factor"), internet =
structure(c(2L,
1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), levels = c("no", "yes"
), class = "factor"), dalc = c(2L, 3L, 1L, 2L, 2L, 1L, 1L,
1L, 1L, 2L), walc = c(3L, 3L, 1L, 2L, 3L, 1L, 2L, 4L, 2L,
4L), g2 = c(10L, 11L, 12L, 14L, 12L, 10L, 11L, 8L, 12L, 11L
), g3 = c("Fail", "Pass", "Pass", "Pass", "Pass", "Pass",
"Pass", "Fail", "Pass", "Pass")), row.names = c(580L, 545L,
98L, 378L, 85L, 113L, 645L, 178L, 27L, 384L), class =
"data.frame")
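
For context on the control options mentioned above, here is a minimal sketch of the other `C5.0Control` arguments that influence how far a tree is grown and pruned: `CF` (the pruning confidence factor, default 0.25) and `minCases` (the minimum number of cases in a split's branches, default 2). It runs on the built-in `iris` data rather than `student_train`, and the values `CF = 0.75` and `minCases = 1` are illustrative assumptions, not a confirmed fix for the data above.

```r
library(C50)

data(iris)

# Illustrative (assumed) settings: a higher CF prunes less aggressively,
# and a smaller minCases allows splits on smaller groups of cases.
loose <- C5.0Control(CF = 0.75, minCases = 1, noGlobalPruning = TRUE)

# Same data, default controls vs. the looser controls
fit_default <- C5.0(x = iris[, 1:4], y = iris$Species)
fit_loose   <- C5.0(x = iris[, 1:4], y = iris$Species, control = loose)

# Tree sizes (number of leaves) for each fit
print(fit_default$size)
print(fit_loose$size)

# Which predictors the looser tree actually uses
print(C5imp(fit_loose, metric = "usage"))
```

Comparing `size` and the `C5imp()` output between the two fits is one way to see whether a control setting changed how many attributes the tree splits on.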