Questions tagged [cart-analysis]

Classification and Regression Tree (CART) analysis

42 questions
8 votes, 4 answers

Search for corresponding node in a regression tree using rpart

I'm pretty new to R and I'm stuck with a pretty dumb problem. I'm calibrating a regression tree using the rpart package in order to do some classification and some forecasting. Thanks to R the calibration part is easy to do and easy to control. #the…
antoine
7 votes, 1 answer

How do I interpret rpart splits on factor variables when building classification trees in R?

If the factor variable is Climate, with 4 possible values: Tropical, Arid, Temperate, Snow, and a node in my rpart tree is labeled as "Climate:ab", what is the split?
user281537
7 votes, 1 answer

Running regression tree on large dataset in R

I am working with a dataset of roughly 1.5 million observations. I am finding that running a regression tree (I am using the mob()* function from the party package) on more than a small subset of my data is taking extremely long (I can't run on a…
Rob Donnelly
5 votes, 1 answer

R packages/models that can handle NA's

I'm looking for R packages or machine learning models/algos like randomForest, glmnet, gbdt, etc that can handle NA's, as opposed to ignoring the row or column that has any instances of NA's. I'm not looking to impute. Any suggestions?
screechOwl
5 votes, 3 answers

Can someone explain the difference between the ID3 and CART algorithms?

I have to create decision trees with R and the rpart package. In my paper I should first define the ID3 algorithm and then implement various decision trees. I found out that the rpart package does not work with the ID3 algorithm. It…
user2988757
4 votes, 2 answers

Decision trees with forced structure

I have been using decision trees (CART) in R using the rpart package to look at the relationship between SST (predictor variables) and climate (predictand variable). I would like to "force" the tree into a particular structure - i.e. split on…
kiriwhan
4 votes, 1 answer

Difference Between Sklearn's DecisionTreeClassifier and CART

Understand the difference between CART and DecisionTreeClassifier of Sklearn. In Sklearn's documentation, it says that "scikit-learn uses an optimised version of the CART algorithm". However, I couldn't find what this optimisation was anywhere! It…
Sanchez_P
4 votes, 0 answers

Add indval values on a multivariate regression tree

I used the mvpart package in R to generate a multivariate regression tree. To determine the indicator species responsible for the split, I used the indval command in the labdsv package. My question is, how do I insert my indicator species into my…
Tim Quimpo
4 votes, 1 answer

Adding information to a tree - rpart

I want to add some information to my tree. Let's say, for instance, I have a database like this: library(rpart) library(rpart.plot) set.seed(1) mydb<-data.frame(results=rnorm(1000,0,1),expo=runif(1000),var1=sample(LETTERS[1:4],1000,replace=T), …
Rhesous
4 votes, 2 answers

What does the rpart "Error in as.character(x) : cannot coerce type 'builtin' to vector of type 'character' " message mean?

I've been banging my head against rpart for a few days now (trying to make classification trees for this dataset that I have), and I think it's time to ask for a lifeline at this point :-) I'm sure it's something silly that I'm not seeing, but here's…
user281537
4 votes, 1 answer

Splits and root node of a binary decision tree (CART)

How do I find the splits and the root node in a regression tree? I built a regression tree from multiple vectors, and now I have to extract the root node of the rpart tree. The file contains numeric values for multiple vectors A, B, C, D, E, F, G, H, e.g. vector A contains…
Aashu
3 votes, 1 answer

What impurity index (Gini, entropy?) is used in TensorFlow Random Forests with CART trees?

I was looking for this information in the tensorflow_decision_forests docs (https://github.com/tensorflow/decision-forests) (https://www.tensorflow.org/decision_forests/api_docs/python/tfdf/keras/wrappers/CartModel) and yggdrasil_decision_forests…
3 votes, 0 answers

R randomForest - how to predict with a "getTree" tree

Background: I can make a random forest in R: set.seed(1) library(randomForest) data(iris) model.rf <- randomForest(Species ~ ., data=iris, importance=TRUE, ntree=20, mtry = 2) I can predict values using the randomForest object that I just…
EngrStudent
3 votes, 1 answer

R ctree strange error

I have a strange problem using ctree inside for loops. If I write this code in a loop, R freezes. data = read.csv("train.csv") #data description https://www.kaggle.com/c/titanic-gettingStarted/data treet = ctree(Survived ~ ., data =…
2xP
2 votes, 1 answer

How to get classification probabilities of each tree in the random forest using R

I want to get classification probabilities of each class by each tree in the randomForest. (1) This outputs individual outputs but its type is response, not probabilities: predict(rf_cl, newdata, predict.all=TRUE)$individual (2) This outputs…
Whitney