I generated a random dataset online and need to apply the C4.5 algorithm to it.
I installed the RWeka package and all its dependencies, but I do not know how to run it.
Can somebody point me to tutorials (anything apart from the RWeka documentation), or share sample C4.5 code in R so I can understand how it works?
Thank you
Saksham Arora
1 Answer
I think it would be worth your time to check out the `caret` package. It standardizes the syntax for most machine learning packages in R, including `RWeka`.
It also has a ton of really useful helper functions and a great tutorial on their website.
Here's the syntax for predicting `Species` on the `iris` dataset using the `RWeka` package with C4.5-like trees:
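```r
library(caret)  # train() uses RWeka for method = 'J48', so RWeka must be installed

# Stratified split on the outcome (createDataPartition defaults to p = 0.5)
train_rows <- createDataPartition(iris$Species, list = FALSE)
train_set <- iris[train_rows, ]
test_set <- iris[-train_rows, ]

# Fit a J48 (C4.5-like) tree through caret, then predict on the held-out rows
fit.rweka <- train(Species ~ ., data = train_set, method = 'J48')
pred <- predict(fit.rweka, newdata = test_set)
```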
Then, if you want to try a gradient boosting machine or some other algorithm, just change the `method` argument, e.g. `method='gbm'`.
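
Since the question asks for sample C4.5 code, here is a minimal sketch of calling `RWeka` directly, without `caret`. It assumes a working Java installation (which `RWeka` needs) and uses a plain 70/30 random split; the seed, the split ratio, and the names `train_idx`, `conf_tab`, and `accuracy` are illustrative choices, not part of the answer above. `J48` is Weka's implementation of C4.5.

```r
library(RWeka)  # needs Java; J48 is Weka's C4.5 implementation

set.seed(42)  # arbitrary seed, only so the split is reproducible
train_idx <- sample(nrow(iris), size = floor(0.7 * nrow(iris)))
train_set <- iris[train_idx, ]
test_set  <- iris[-train_idx, ]

# Fit the tree on the training rows
fit <- J48(Species ~ ., data = train_set)
print(fit)  # text rendering of the learned tree

# Predict classes for the held-out rows and compare with the true labels
pred <- predict(fit, newdata = test_set)
print(head(pred))
conf_tab <- table(predicted = pred, actual = test_set$Species)
print(conf_tab)

# Share of held-out rows classified correctly
accuracy <- mean(pred == test_set$Species)
print(accuracy)
```

Printing `fit` shows the split rules, and `conf_tab` gives a quick view of which classes the tree confuses on the held-out rows.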

DunderChief
- Hi @DunderChief. Thanks for the help. I am still facing one issue though. When I ran this on my dataset, which I have structured based on the Iris dataset, I got an error at the fit.rweka step. This is the error shown: **Error in train.default(x, y, weights = w, ...) : One or more factor levels in the outcome has no data: ''** – Saksham Arora Nov 15 '15 at 10:11
- Also, how can I view the output we get from **predict**? Sorry for the novice questions. I am completely new to this. – Saksham Arora Nov 15 '15 at 10:13
- @SakshamArora It seems that one of the levels of your outcome variable is not present in the training data. In the case of the iris dataset, it would be as if the training set contained only setosa and versicolor flowers, but no virginica. The output from predict is saved in the pred object in the code above. – DunderChief Nov 15 '15 at 18:52
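
The `''` in the error message suggests, though this is only a guess without seeing the data, that the outcome factor carries an empty-string level with no rows behind it. The sketch below reproduces that situation on a small made-up data frame (`raw`, `train`, and the column names are hypothetical) and shows one way to check for and drop unused levels with base R's `table()` and `droplevels()`.

```r
# Made-up data standing in for the asker's dataset; one label is blank
raw <- data.frame(
  x = c(1.2, 3.4, 2.2, 5.1, 4.0),
  label = c("a", "b", "", "a", "b"),
  stringsAsFactors = TRUE
)

# Filtering out the blank rows does not remove the '' level itself
train <- raw[raw$label != "", ]
print(table(train$label))  # the '' level is still listed, with a count of 0

# Drop levels that no longer have any rows before calling train()
train$label <- droplevels(train$label)
print(levels(train$label))
```

Running `table()` on the outcome column before fitting is a cheap way to spot any level with a count of zero.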
- What about the MLR (Machine Learning in R) package [here](https://github.com/mlr-org/mlr)? How does it compare with caret? How do you know in caret that you are using RWeka, and is it possible to use the preprocessing steps of WEKA with RWeka? – hhh Jul 17 '17 at 11:30